Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial
This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit m...
Haji Ali Afzali, Hossein; Gray, Jodi; Karnon, Jonathan
2013-04-01
Decision analytic models play an increasingly important role in the economic evaluation of health technologies. Given uncertainties around the assumptions used to develop such models, several guidelines have been published to identify and assess 'best practice' in the model development process, including general modelling approach (e.g., time horizon), model structure, input data and model performance evaluation. This paper focuses on model performance evaluation. In the absence of a sufficient level of detail around model performance evaluation, concerns regarding the accuracy of model outputs, and hence the credibility of such models, are frequently raised. Following presentation of its components, a review of the application and reporting of model performance evaluation is presented. Taking cardiovascular disease as an illustrative example, the review investigates the use of face validity, internal validity, external validity, and cross-model validity. As a part of the performance evaluation process, model calibration is also discussed and its use in applied studies investigated. The review found that the application and reporting of model performance evaluation across 81 studies of treatment for cardiovascular disease was variable. Cross-model validation was reported in 55 % of the reviewed studies, though the level of detail provided varied considerably. We found that very few studies documented other types of validity, and only 6 % of the reviewed articles reported a calibration process. Considering the above findings, we propose a comprehensive model performance evaluation framework (checklist), informed by a review of best-practice guidelines. This framework provides a basis for more accurate and consistent documentation of model performance evaluation. This will improve the peer review process and the comparability of modelling studies.
Recognising the fundamental role of decision analytic models in informing public funding decisions, the proposed framework should usefully inform guidelines for preparing submissions to reimbursement bodies.
Evaluating Models of Human Performance: Safety-Critical Systems Applications
NASA Technical Reports Server (NTRS)
Feary, Michael S.
2012-01-01
This presentation is part of panel discussion on Evaluating Models of Human Performance. The purpose of this panel is to discuss the increasing use of models in the world today and specifically focus on how to describe and evaluate models of human performance. My presentation will focus on discussions of generating distributions of performance, and the evaluation of different strategies for humans performing tasks with mixed initiative (Human-Automation) systems. I will also discuss issues with how to provide Human Performance modeling data to support decisions on acceptability and tradeoffs in the design of safety critical systems. I will conclude with challenges for the future.
NASA Astrophysics Data System (ADS)
Mao, Chao; Chen, Shou
2017-01-01
Because the traditional entropy value method still has low accuracy when evaluating the performance of mining projects, a performance evaluation model for mineral projects founded on an improved entropy method is proposed. First, a new weight assignment model is established, founded on compatibility matrix analysis from the analytic hierarchy process (AHP) and the entropy value method; when the compatibility matrix analysis meets the consistency requirements, any differences between the subjective and objective weights are resolved by moderately adjusting their respective proportions. On this basis, a fuzzy evaluation matrix is then constructed for performance evaluation. Simulation experiments show that, compared with the traditional entropy and compatibility matrix analysis methods, the proposed performance evaluation model of mining projects based on the improved entropy value method achieves higher assessment accuracy.
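As an illustration of the entropy value method this abstract builds on, the following Python sketch computes objective criterion weights from a decision matrix and blends them with subjective (e.g., AHP-derived) weights. The function names, the example inputs, and the blending parameter `alpha` are illustrative assumptions, not the authors' implementation.

```python
import math

def entropy_weights(matrix):
    """Objective criterion weights via the entropy value method.

    matrix: rows = alternatives, columns = criteria (non-negative scores).
    """
    n_alt = len(matrix)
    n_crit = len(matrix[0])
    entropies = []
    for j in range(n_crit):
        col = [row[j] for row in matrix]
        total = sum(col)
        # Normalize the column to proportions p_ij
        p = [x / total for x in col]
        # Shannon entropy scaled to [0, 1]; a zero proportion contributes 0
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(n_alt)
        entropies.append(e)
    # Degree of divergence: criteria with lower entropy discriminate more
    d = [1 - e for e in entropies]
    return [di / sum(d) for di in d]

def combined_weights(subjective, objective, alpha=0.5):
    """Blend subjective (e.g., AHP) and objective (entropy) weights.

    alpha plays the role of the adjustable proportion the abstract describes.
    """
    w = [alpha * s + (1 - alpha) * o for s, o in zip(subjective, objective)]
    total = sum(w)
    return [wi / total for wi in w]
```

Both returned weight vectors are normalized, so they can be applied directly to score alternatives in a subsequent fuzzy evaluation step.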
Model Performance Evaluation and Scenario Analysis (MPESA)
Model Performance Evaluation and Scenario Analysis (MPESA) assesses the performance with which models predict time series data. The tool was developed for the Hydrological Simulation Program-Fortran (HSPF) and the Stormwater Management Model (SWMM).
Models for evaluating the performability of degradable computing systems
NASA Technical Reports Server (NTRS)
Wu, L. T.
1982-01-01
Recent advances in multiprocessor technology have established the need for unified methods to evaluate computing systems performance and reliability. In response to this modeling need, a general modeling framework that permits the modeling, analysis, and evaluation of degradable computing systems is considered. Within this framework, several user-oriented performance variables are identified and shown to be proper generalizations of the traditional notions of system performance and reliability. Furthermore, a time-varying version of the model is developed to generalize the traditional fault-tree reliability evaluation methods of phased missions.
Multi-objective optimization for generating a weighted multi-model ensemble
NASA Astrophysics Data System (ADS)
Lee, H.
2017-12-01
Many studies have demonstrated that multi-model ensembles generally show better skill than each ensemble member. When generating weighted multi-model ensembles, the first step is measuring the performance of individual model simulations using observations. There is a consensus on the assignment of weighting factors based on a single evaluation metric. When considering only one evaluation metric, the weighting factor for each model is proportional to a performance score or inversely proportional to an error for the model. While this conventional approach can provide appropriate combinations of multiple models, the approach confronts a big challenge when there are multiple metrics under consideration. When considering multiple evaluation metrics, it is obvious that a simple averaging of multiple performance scores or model ranks does not address the trade-off problem between conflicting metrics. So far, there seems to be no best method to generate weighted multi-model ensembles based on multiple performance metrics. The current study applies the multi-objective optimization, a mathematical process that provides a set of optimal trade-off solutions based on a range of evaluation metrics, to combining multiple performance metrics for the global climate models and their dynamically downscaled regional climate simulations over North America and generating a weighted multi-model ensemble. NASA satellite data and the Regional Climate Model Evaluation System (RCMES) software toolkit are used for assessment of the climate simulations. Overall, the performance of each model differs markedly with strong seasonal dependence. Because of the considerable variability across the climate simulations, it is important to evaluate models systematically and make future projections by assigning optimized weighting factors to the models with relatively good performance. 
Our results indicate that the optimally weighted multi-model ensemble always shows better performance than an arithmetic ensemble mean and may provide reliable future projections.
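The weighting step described above — scoring each model against observations, then weighting better-performing models more heavily — can be sketched with the common inverse-error convention. This is a generic illustration of a skill-weighted ensemble, not the multi-objective optimization the study itself applies.

```python
import math

def rmse(obs, sim):
    """Root-mean-square error of a simulation against observations."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def skill_weights(obs, simulations):
    """Weight each model inversely proportional to its error (normalized)."""
    inv_err = [1.0 / rmse(obs, sim) for sim in simulations]
    total = sum(inv_err)
    return [w / total for w in inv_err]

def weighted_ensemble(simulations, weights):
    """Weighted multi-model ensemble mean at each time step."""
    return [sum(w * sim[t] for w, sim in zip(weights, simulations))
            for t in range(len(simulations[0]))]
```

With one accurate and one biased model, the weighted ensemble tracks observations more closely than the arithmetic ensemble mean, consistent with the abstract's conclusion.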
Formal implementation of a performance evaluation model for the face recognition system.
Shin, Yong-Nyuo; Kim, Jason; Lee, Yong-Jun; Shin, Woochang; Choi, Jin-Young
2008-01-01
Due to its usability features, practical applications, and lack of intrusiveness, face recognition technology, based on information derived from individuals' facial features, has been attracting considerable attention recently. Reported recognition rates of commercialized face recognition systems cannot be accepted as official recognition rates, as they are based on assumptions that favor the specific system and face database. Therefore, performance evaluation methods and tools are necessary to objectively measure the accuracy and performance of any face recognition system. In this paper, we propose and formalize a performance evaluation model for the biometric recognition system, implementing an evaluation tool for face recognition systems based on the proposed model. Furthermore, we performed evaluations objectively by providing guidelines for the design and implementation of a performance evaluation system, formalizing the performance test process.
Performance of the SEAPROG prognosis variant of the forest vegetation simulator.
Michael H. McClellan; Frances E. Biles
2003-01-01
This paper reports the first phase of a recent effort to evaluate the performance and use of the FVS-SEAPROG vegetation growth model. In this paper, we present our evaluation of SEAPROG's performance in modeling the growth of even-aged stands regenerated by clearcutting, windthrow, or fire. We evaluated the model by comparing model predictions to observed values from...
2012-06-01
Sleep and Performance Study: Evaluating the SAFTE Model for Maritime Workplace Application. Master's thesis by Stephanie A. T. Brown, June 2012.
Reyes, Jeanette M; Xu, Yadong; Vizuete, William; Serre, Marc L
2017-01-01
The regulatory Community Multiscale Air Quality (CMAQ) model is a means to understanding the sources, concentrations and regulatory attainment of air pollutants within a model's domain. Substantial resources are allocated to the evaluation of model performance. The Regionalized Air quality Model Performance (RAMP) method introduced here explores novel ways of visualizing and evaluating CMAQ model performance and errors for daily Particulate Matter ≤ 2.5 micrometers (PM2.5) concentrations across the continental United States. The RAMP method performs a non-homogeneous, non-linear, non-homoscedastic model performance evaluation at each CMAQ grid. This work demonstrates that CMAQ model performance, for a well-documented 2001 regulatory episode, is non-homogeneous across space/time. The RAMP correction of systematic errors outperforms other model evaluation methods as demonstrated by a 22.1% reduction in Mean Square Error compared to a constant domain-wide correction. The RAMP method is able to accurately reproduce simulated performance with a correlation of r = 76.1%. Most of the error coming from CMAQ is random error, with only a minority of error being systematic. Areas of high systematic error are collocated with areas of high random error, implying both error types originate from similar sources. Therefore, addressing underlying causes of systematic error will have the added benefit of also addressing underlying causes of random error.
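The distinction the abstract draws between systematic and random error can be illustrated with the standard decomposition MSE = bias² + error variance. This is a generic sketch of that decomposition, not the RAMP method itself.

```python
def error_decomposition(obs, sim):
    """Split mean square error into systematic and random components.

    MSE = bias^2 + var(error), where error = sim - obs:
    the squared mean error is the systematic part; the remaining
    (population) variance of the errors is the random part.
    """
    errors = [s - o for o, s in zip(obs, sim)]
    n = len(errors)
    bias = sum(errors) / n
    mse = sum(e ** 2 for e in errors) / n
    systematic = bias ** 2
    random_part = mse - systematic
    return mse, systematic, random_part
```

A constant offset shows up entirely as systematic error (and would be removed by a domain-wide correction), while scatter with zero mean shows up entirely as random error.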
Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial
The model performance evaluation consists of metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude-only, sequence-only, and combined magnitude and sequence errors.
Models and techniques for evaluating the effectiveness of aircraft computing systems
NASA Technical Reports Server (NTRS)
Meyer, J. F.
1978-01-01
Progress in the development of system models and techniques for the formulation and evaluation of aircraft computer system effectiveness is reported. Topics covered include: analysis of functional dependence; a prototype software package, METAPHOR, developed to aid the evaluation of performability; and a comprehensive performability modeling and evaluation exercise involving the SIFT computer.
Metrics for Performance Evaluation of Patient Exercises during Physical Therapy.
Vakanski, Aleksandar; Ferguson, Jake M; Lee, Stephen
2017-06-01
The article proposes a set of metrics for evaluation of patient performance in physical therapy exercises. A taxonomy is employed that classifies the metrics into quantitative and qualitative categories, based on the level of abstraction of the captured motion sequences. Further, the quantitative metrics are classified into model-less and model-based metrics, according to whether the evaluation employs the raw measurements of patient-performed motions or is based on a mathematical model of the motions. The reviewed metrics include root-mean-square distance, Kullback-Leibler divergence, log-likelihood, heuristic consistency, Fugl-Meyer Assessment, and similar measures. The metrics are evaluated for a set of five human motions captured with a Kinect sensor. The metrics can potentially be integrated into a system that employs machine learning for modelling and assessment of the consistency of patient performance in a home-based therapy setting. Automated performance evaluation can overcome the inherent subjectivity of human-performed therapy assessment, increase adherence to prescribed therapy plans, and reduce healthcare costs.
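Two of the model-less metrics named in the abstract, root-mean-square distance and Kullback-Leibler divergence, can be sketched as follows. The sequence layout (a list of joint-coordinate frames) and the histogram inputs are assumptions for illustration, not the article's data format.

```python
import math

def rms_distance(seq_a, seq_b):
    """Root-mean-square distance between two equal-length motion
    sequences, each a list of joint-coordinate vectors (frames)."""
    n = len(seq_a)
    total = sum(sum((a - b) ** 2 for a, b in zip(fa, fb))
                for fa, fb in zip(seq_a, seq_b))
    return math.sqrt(total / n)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) between two discrete
    distributions, e.g., histograms of motion features."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

A patient repetition identical to the reference motion scores zero on both metrics; larger values indicate larger deviations from the prescribed exercise.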
Performability modeling with continuous accomplishment sets
NASA Technical Reports Server (NTRS)
Meyer, J. F.
1979-01-01
A general modeling framework that permits the definition, formulation, and evaluation of performability is described. It is shown that performability relates directly to system effectiveness, and is a proper generalization of both performance and reliability. A hierarchical modeling scheme is used to formulate the capability function used to evaluate performability. The case in which performance variables take values in a continuous accomplishment set is treated explicitly.
Performance Evaluation Model for Application Layer Firewalls.
Xuan, Shichang; Yang, Wu; Dong, Hui; Zhang, Jiangchuan
2016-01-01
Application layer firewalls protect the trusted area network against information security risks. However, firewall performance may affect user experience. Therefore, performance analysis plays a significant role in the evaluation of application layer firewalls. This paper presents an analytic model of the application layer firewall, based on a system analysis to evaluate the capability of the firewall. In order to enable users to improve the performance of the application layer firewall with limited resources, resource allocation was evaluated to obtain the optimal resource allocation scheme in terms of throughput, delay, and packet loss rate. The proposed model employs the Erlangian queuing model to analyze the performance parameters of the system with regard to the three layers (network, transport, and application layers). Then, the analysis results of all the layers are combined to obtain the overall system performance indicators. A discrete event simulation method was used to evaluate the proposed model. Finally, limited service desk resources were allocated to obtain the values of the performance indicators under different resource allocation scenarios in order to determine the optimal allocation scheme. Under limited resource allocation, this scheme enables users to maximize the performance of the application layer firewall.
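The Erlangian queueing analysis mentioned above can be illustrated with the classic Erlang C formula for an M/M/c service center, which yields queueing probability and mean delay for a given number of service desks. This is a textbook sketch, not the paper's three-layer firewall model.

```python
import math

def erlang_c(arrival_rate, service_rate, servers):
    """Erlang C: probability an arriving request must queue in an
    M/M/c system, plus the mean queueing delay.

    arrival_rate and service_rate are per-unit-time rates; servers is c.
    """
    a = arrival_rate / service_rate  # offered load in Erlangs
    assert a < servers, "system must be stable (offered load < servers)"
    top = (a ** servers / math.factorial(servers)) * servers / (servers - a)
    bottom = sum(a ** k / math.factorial(k) for k in range(servers)) + top
    p_wait = top / bottom
    mean_delay = p_wait / (servers * service_rate - arrival_rate)
    return p_wait, mean_delay
```

Sweeping `servers` (the service-desk resources) against a fixed traffic load gives exactly the kind of delay-versus-resources trade-off the abstract describes for choosing an allocation scheme.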
Model Performance Evaluation and Scenario Analysis ...
This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude-only, sequence-only, and combined magnitude and sequence errors. The performance measures include error analysis, coefficient of determination, Nash-Sutcliffe efficiency, and a new weighted rank method. These performance metrics only provide useful information about the overall model performance. Note that MPESA is based on the separation of observed and simulated time series into magnitude and sequence components. The separation of time series into magnitude and sequence components and the reconstruction back to time series provides diagnostic insights to modelers. For example, traditional approaches lack the capability to identify whether the source of uncertainty in the simulated data is the quality of the input data or the way the analyst adjusted the model parameters. This report presents a suite of model diagnostics that identify whether mismatches between observed and simulated data result from magnitude- or sequence-related errors. MPESA offers graphical and statistical options that allow HSPF users to compare observed and simulated time series and identify the parameter values to adjust or the input data to modify. The scenario analysis part of the too
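Two of the goodness-of-fit measures MPESA reports, Nash-Sutcliffe efficiency and the coefficient of determination, can be computed for an observed/simulated pair of time series as follows. This is a generic sketch of the standard formulas, not MPESA's own code.

```python
def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit; values <= 0 mean
    the simulation is no better than the mean of the observations."""
    mean_obs = sum(obs) / len(obs)
    ss_err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - ss_err / ss_tot

def coefficient_of_determination(obs, sim):
    """Squared Pearson correlation between observed and simulated series."""
    n = len(obs)
    mo = sum(obs) / n
    ms = sum(sim) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov * cov / (var_o * var_s)
```

The two metrics diverge in instructive ways: a simulation with a constant bias can still have a perfect coefficient of determination (it correlates perfectly with the observations) while its Nash-Sutcliffe efficiency is penalized for the magnitude error.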
We present an application of the online coupled WRF-CMAQ modeling system to two annual simulations over North America performed under Phase 2 of the Air Quality Model Evaluation International Initiative (AQMEII). Operational evaluation shows that model performance is comparable t...
ATAMM enhancement and multiprocessor performance evaluation
NASA Technical Reports Server (NTRS)
Stoughton, John W.; Mielke, Roland R.; Som, Sukhamoy; Obando, Rodrigo; Malekpour, Mahyar R.; Jones, Robert L., III; Mandala, Brij Mohan V.
1991-01-01
ATAMM (Algorithm To Architecture Mapping Model) enhancement and multiprocessor performance evaluation is discussed. The following topics are included: the ATAMM model; ATAMM enhancement; ADM (Advanced Development Model) implementation of ATAMM; and ATAMM support tools.
USDA-ARS?s Scientific Manuscript database
Experimental and simulation uncertainties have not been included in many of the statistics used in assessing agricultural model performance. The objectives of this study were to develop an F-test that can be used to evaluate model performance considering experimental and simulation uncertainties, an...
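The excerpt does not give the exact form of the authors' uncertainty-aware F-test, so the following shows only the standard building block: a two-sample variance ratio, which an F-test compares against a critical value for the relevant degrees of freedom.

```python
def f_statistic(sample_a, sample_b):
    """Ratio of sample variances (larger over smaller), the F statistic
    used to test whether two variances differ, e.g., simulation error
    variance versus experimental uncertainty."""
    def sample_var(x):
        m = sum(x) / len(x)
        return sum((xi - m) ** 2 for xi in x) / (len(x) - 1)
    va, vb = sample_var(sample_a), sample_var(sample_b)
    return max(va, vb) / min(va, vb)
```

A ratio near 1 suggests the simulation errors are comparable in spread to the experimental uncertainty; large ratios point to a genuine performance deficit rather than measurement noise.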
NASA Astrophysics Data System (ADS)
Odman, M. T.; Hu, Y.; Russell, A.; Chai, T.; Lee, P.; Shankar, U.; Boylan, J.
2012-12-01
Regulatory air quality modeling, such as State Implementation Plan (SIP) modeling, requires that model performance meets recommended criteria in the base-year simulations using period-specific, estimated emissions. The goal of the performance evaluation is to assure that the base-year modeling accurately captures the observed chemical reality of the lower troposphere. Any significant deficiencies found in the performance evaluation must be corrected before any base-case (with typical emissions) and future-year modeling is conducted. Corrections are usually made to model inputs such as emission-rate estimates or meteorology and/or to the air quality model itself, in modules that describe specific processes. Use of ground-level measurements that follow approved protocols is recommended for evaluating model performance. However, ground-level monitoring networks are spatially sparse, especially for particulate matter. Satellite retrievals of atmospheric chemical properties such as aerosol optical depth (AOD) provide spatial coverage that can compensate for the sparseness of ground-level measurements. Satellite retrievals can also help diagnose potential model or data problems in the upper troposphere. It is possible to achieve good model performance near the ground, but have, for example, erroneous sources or sinks in the upper troposphere that may result in misleading and unrealistic responses to emission reductions. Despite these advantages, satellite retrievals are rarely used in model performance evaluation, especially for regulatory modeling purposes, due to the high uncertainty in retrievals associated with various contaminations, for example by clouds. In this study, 2007 was selected as the base year for SIP modeling in the southeastern U.S. 
Performance of the Community Multiscale Air Quality (CMAQ) model, at a 12-km horizontal resolution, for this annual simulation is evaluated using both recommended ground-level measurements and non-traditional satellite retrievals. Evaluation results are assessed against recommended criteria and peer studies in the literature. Further analysis is conducted, based upon these assessments, to discover likely errors in model inputs and potential deficiencies in the model itself. Correlations as well as differences in input errors and model deficiencies revealed by ground-level measurements versus satellite observations are discussed. Additionally, sensitivity analyses are employed to investigate errors in emission-rate estimates using either ground-level measurements or satellite retrievals, and the results are compared against each other considering observational uncertainties. Recommendations are made for how to effectively utilize satellite retrievals in regulatory air quality modeling.
NASA Technical Reports Server (NTRS)
Trachta, G.
1976-01-01
A model of Univac 1108 work flow has been developed to assist in performance evaluation studies and configuration planning. Workload profiles and system configurations are parameterized for ease of experimental modification. Outputs include capacity estimates and performance evaluation functions. The U1108 system is conceptualized as a service network; classical queueing theory is used to evaluate network dynamics.
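The classical queueing relationships underlying such capacity estimates can be illustrated with an M/M/1 service center. This is a textbook sketch under assumed exponential arrivals and service, not the Univac 1108 network model itself.

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean response time (queueing + service) of an M/M/1 center."""
    assert arrival_rate < service_rate, "center must be stable"
    return 1.0 / (service_rate - arrival_rate)

def capacity_at_target(service_rate, target_response):
    """Largest arrival rate that keeps mean response time at or below
    the target: a simple capacity estimate of the kind such workload
    models produce."""
    return service_rate - 1.0 / target_response
```

Parameterizing `arrival_rate` from a workload profile and `service_rate` from a configuration, then solving for capacity, mirrors the model's use for configuration planning.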
NASA Technical Reports Server (NTRS)
Foyle, David C.
1993-01-01
Based on existing integration models in the psychological literature, an evaluation framework is developed to assess sensor fusion displays as might be implemented in an enhanced/synthetic vision system. The proposed evaluation framework for evaluating the operator's ability to use such systems is a normative approach: The pilot's performance with the sensor fusion image is compared to models' predictions based on the pilot's performance when viewing the original component sensor images prior to fusion. This allows for the determination as to when a sensor fusion system leads to: poorer performance than one of the original sensor displays, clearly an undesirable system in which the fused sensor system causes some distortion or interference; better performance than with either single sensor system alone, but at a sub-optimal level compared to model predictions; optimal performance compared to model predictions; or, super-optimal performance, which may occur if the operator were able to use some highly diagnostic 'emergent features' in the sensor fusion display, which were unavailable in the original sensor displays.
THE ATMOSPHERIC MODEL EVALUATION TOOL
This poster describes a model evaluation tool that is currently being developed and applied for meteorological and air quality model evaluation. The poster outlines the framework and provides examples of statistical evaluations that can be performed with the model evaluation tool...
Metrics for evaluating performance and uncertainty of Bayesian network models
Bruce G. Marcot
2012-01-01
This paper presents a selected set of existing and new metrics for gauging Bayesian network model performance and uncertainty. Selected existing and new metrics are discussed for conducting model sensitivity analysis (variance reduction, entropy reduction, case file simulation); evaluating scenarios (influence analysis); depicting model complexity (numbers of model...
Proposed evaluation framework for assessing operator performance with multisensor displays
NASA Technical Reports Server (NTRS)
Foyle, David C.
1992-01-01
Despite aggressive work on the development of sensor fusion algorithms and techniques, no formal evaluation procedures have been proposed. Based on existing integration models in the literature, an evaluation framework is developed to assess an operator's ability to use multisensor, or sensor fusion, displays. The proposed evaluation framework for evaluating the operator's ability to use such systems is a normative approach: The operator's performance with the sensor fusion display can be compared to the models' predictions based on the operator's performance when viewing the original sensor displays prior to fusion. This allows for the determination as to when a sensor fusion system leads to: 1) poorer performance than one of the original sensor displays (clearly an undesirable system in which the fused sensor system causes some distortion or interference); 2) better performance than with either single sensor system alone, but at a sub-optimal (compared to the model predictions) level; 3) optimal performance (compared to model predictions); or, 4) super-optimal performance, which may occur if the operator were able to use some highly diagnostic 'emergent features' in the sensor fusion display, which were unavailable in the original sensor displays. An experiment demonstrating the usefulness of the proposed evaluation framework is discussed.
Switching performance of OBS network model under prefetched real traffic
NASA Astrophysics Data System (ADS)
Huang, Zhenhua; Xu, Du; Lei, Wen
2005-11-01
Optical Burst Switching (OBS) [1] is now widely considered an efficient switching technique for building the next-generation optical Internet, so it is important to evaluate the performance of the OBS network model precisely. The performance of the OBS network model varies under different conditions, but what matters most is how it works under real traffic load. In traditional simulation models, uniform traffic is usually generated by simulation software to imitate the data source of the edge node in the OBS network model, and the performance of the OBS network is evaluated through it. Unfortunately, without being driven by real traffic, the traditional simulation models have several problems and their results are questionable. To deal with this problem, we present a new simulation model for analysis and performance evaluation of the OBS network, which uses prefetched IP traffic as the data source of the OBS network model. The prefetched IP traffic can be considered a real IP source for the OBS edge node, and the OBS network model has the same clock rate as a real OBS system. It is therefore reasonable to conclude that this model is closer to the real OBS system than the traditional ones. The simulation results also indicate that this model evaluates the performance of the OBS network system more accurately, and its results are closer to the actual situation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Breuker, M.S.; Braun, J.E.
This paper presents a detailed evaluation of the performance of a statistical, rule-based fault detection and diagnostic (FDD) technique presented by Rossi and Braun (1997). Steady-state and transient tests were performed on a simple rooftop air conditioner over a range of conditions and fault levels. The steady-state data without faults were used to train models that predict outputs for normal operation. The transient data with faults were used to evaluate FDD performance. The effect of a number of design variables on FDD sensitivity for different faults was evaluated and two prototype systems were specified for more complete evaluation. Good performance was achieved in detecting and diagnosing five faults using only six temperatures (two input and four output) and linear models. The performance improved by about a factor of two when ten measurements (three input and seven output) and higher-order models were used. This approach for evaluating and optimizing the performance of the statistical, rule-based FDD technique could be used as a design and evaluation tool when applying this FDD method to other packaged air-conditioning systems. Furthermore, the approach could also be modified to evaluate the performance of other FDD methods.
Devriendt, Floris; Moldovan, Darie; Verbeke, Wouter
2018-03-01
Prescriptive analytics extends predictive analytics by allowing an outcome to be estimated as a function of control variables, and thus establishing the level of control variables required to realize a desired outcome. Uplift modeling is at the heart of prescriptive analytics and aims at estimating the net difference in an outcome resulting from a specific action or treatment that is applied. In this article, a structured and detailed literature survey on uplift modeling is provided by identifying and contrasting various groups of approaches. In addition, evaluation metrics for assessing the performance of uplift models are reviewed. An experimental evaluation on four real-world data sets provides further insight into their use. Uplift random forests are found to be consistently among the best-performing techniques in terms of the Qini and Gini measures, although considerable variability in performance across the various data sets of the experiments is observed. In addition, uplift models are frequently observed to be unstable and display strong variability in performance across different folds in the cross-validation experimental setup. This potentially threatens their actual use for business applications. Moreover, it is found that the available evaluation metrics do not provide an intuitively understandable indication of the actual use and performance of a model. Specifically, existing evaluation metrics do not facilitate a comparison of uplift models and predictive models, and they evaluate performance either at an arbitrary cutoff or over the full spectrum of potential cutoffs. In conclusion, we highlight the instability of uplift models and the need for an application-oriented approach to assessing uplift models as prime topics for further research.
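The Qini measure mentioned above is built from cumulative incremental gains along a ranking by uplift score. A minimal sketch, assuming binary outcomes and the common convention of scaling control responses by the treated/control group sizes (one of several variants in the literature):

```python
def qini_curve(scores, treated, outcome):
    """Cumulative incremental gains, ranked by descending uplift score.

    scores:  model-predicted uplift per individual
    treated: 1 if the individual received the treatment, else 0
    outcome: 1 for a positive response, else 0
    Returns the incremental gain after each ranked individual; the Qini
    coefficient is the area between this curve and the random-targeting
    diagonal.
    """
    ranked = sorted(zip(scores, treated, outcome), key=lambda r: -r[0])
    n_t = n_c = y_t = y_c = 0
    gains = []
    for _, t, y in ranked:
        if t:
            n_t += 1
            y_t += y
        else:
            n_c += 1
            y_c += y
        # Treated successes minus control successes scaled to treated size
        gains.append(y_t - y_c * n_t / n_c if n_c else float(y_t))
    return gains
```

Note that the curve depends on the chosen cutoff along the ranking, which is exactly the arbitrariness in uplift evaluation metrics that the abstract criticizes.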
NASA Technical Reports Server (NTRS)
Wickens, Christopher; Sebok, Angelia; Keller, John; Peters, Steve; Small, Ronald; Hutchins, Shaun; Algarin, Liana; Gore, Brian Francis; Hooey, Becky Lee; Foyle, David C.
2013-01-01
NextGen operations are associated with a variety of changes to the national airspace system (NAS) including changes to the allocation of roles and responsibilities among operators and automation, the use of new technologies and automation, additional information presented on the flight deck, and the entire concept of operations (ConOps). In the transition to NextGen airspace, aviation and air operations designers need to consider the implications of design or system changes on human performance and the potential for error. To ensure continued safety of the NAS, it will be necessary for researchers to evaluate design concepts and potential NextGen scenarios well before implementation. One approach for such evaluations is through human performance modeling. Human performance models (HPMs) provide effective tools for predicting and evaluating operator performance in systems. HPMs offer significant advantages over empirical, human-in-the-loop testing in that (1) they allow detailed analyses of systems that have not yet been built, (2) they offer great flexibility for extensive data collection, (3) they do not require experimental participants, and thus can offer cost and time savings. HPMs differ in their ability to predict performance and safety with NextGen procedures, equipment and ConOps. Models also vary in terms of how they approach human performance (e.g., some focus on cognitive processing, others focus on discrete tasks performed by a human, while others consider perceptual processes), and in terms of their associated validation efforts. The objectives of this research effort were to support the Federal Aviation Administration (FAA) in identifying HPMs that are appropriate for predicting pilot performance in NextGen operations, to provide guidance on how to evaluate the quality of different models, and to identify gaps in pilot performance modeling research, that could guide future research opportunities. 
This research effort is intended to help the FAA evaluate pilot modeling efforts and select the appropriate tools for future modeling efforts to predict pilot performance in NextGen operations.
Performability evaluation of the SIFT computer
NASA Technical Reports Server (NTRS)
Meyer, J. F.; Furchtgott, D. G.; Wu, L. T.
1979-01-01
Performability modeling and evaluation techniques are applied to the SIFT computer as it might operate in the computational environment of an air transport mission. User-visible performance of the total system (SIFT plus its environment) is modeled as a random variable taking values in a set of levels of accomplishment. These levels are defined in terms of four attributes of total system behavior: safety, no change in mission profile, no operational penalties, and no economic penalties. The base model is a stochastic process whose states describe the internal structure of SIFT as well as relevant conditions of the environment. Base model state trajectories are related to accomplishment levels via a capability function which is formulated in terms of a 3-level model hierarchy. Performability evaluation algorithms are then applied to determine the performability of the total system for various choices of computer and environment parameter values. Numerical results of those evaluations are presented and, in conclusion, some implications of this effort are discussed.
Evaluation methodologies for an advanced information processing system
NASA Technical Reports Server (NTRS)
Schabowsky, R. S., Jr.; Gai, E.; Walker, B. K.; Lala, J. H.; Motyka, P.
1984-01-01
The system concept and requirements for an Advanced Information Processing System (AIPS) are briefly described, but the emphasis of this paper is on the evaluation methodologies being developed and utilized in the AIPS program. The evaluation tasks include hardware reliability, maintainability and availability, software reliability, performance, and performability. Hardware RMA and software reliability are addressed with Markov modeling techniques. The performance analysis for AIPS is based on queueing theory. Performability is a measure of merit which combines system reliability and performance measures. The probability laws of the performance measures are obtained from the Markov reliability models. Scalar functions of this law such as the mean and variance provide measures of merit in the AIPS performability evaluations.
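The performability approach described above derives the probability law of a performance measure from a Markov reliability model. A minimal sketch of that idea follows; the three states, transition probabilities and per-state reward values are invented for illustration and are not taken from the AIPS study.

```python
import numpy as np

# Hypothetical 3-state Markov reliability model:
# state 0 = fully operational, 1 = degraded, 2 = failed.
P = np.array([[0.995, 0.004, 0.001],   # per-step transition probabilities
              [0.000, 0.990, 0.010],
              [0.000, 0.000, 1.000]])
reward = np.array([1.0, 0.6, 0.0])     # relative throughput in each state

p = np.array([1.0, 0.0, 0.0])          # start fully operational
steps = 100
total = 0.0
for _ in range(steps):
    total += p @ reward                # expected performance this step
    p = p @ P                          # propagate state probabilities

mean_perf = total / steps              # time-averaged expected performance
# Mean and variance of the performance measure at mission end,
# i.e. scalar functions of its probability law:
end_mean = float(p @ reward)
end_var = float(p @ reward**2 - end_mean**2)
print(mean_perf, end_mean, end_var)
```

The same state-probability vector yields both reliability figures (probability of the failed state) and performance figures, which is what makes the combined "performability" measure possible.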
A Conceptual Framework for Evaluating Higher Education Institutions
ERIC Educational Resources Information Center
Chinta, Ravi; Kebritchi, Mansureh; Ellias, Janelle
2016-01-01
Purpose: Performance evaluation is a topic that has been researched and practiced extensively in business organizations but has received scant attention in higher education institutions. A review of literature revealed that context, input, process, product (CIPP) model is an appropriate performance evaluation model for higher education…
ERIC Educational Resources Information Center
Zantal-Wiener, Kathy; Horwood, Thomas J.
2010-01-01
The authors propose a comprehensive evaluation framework to prepare for evaluating school emergency management programs. This framework involves a logic model that incorporates Government Performance and Results Act (GPRA) measures as a foundation for comprehensive evaluation that complements performance monitoring used by the U.S. Department of…
A model for evaluating the social performance of construction waste management.
Yuan, Hongping
2012-06-01
Existing literature shows that considerable research effort has been devoted to the economic performance of construction waste management (CWM), but much less attention has been paid to its social performance. This study therefore attempts to develop a model for quantitatively evaluating the social performance of CWM by using a system dynamics (SD) approach. Firstly, major variables affecting the social performance of CWM are identified and a holistic system for assessing the social performance of CWM is formulated in line with the feedback relationships underlying these variables. The developed system is then converted into an SD model using the software iThink. An empirical case study is finally conducted to demonstrate application of the model. Results of model validation indicate that the model is robust and reasonable in reflecting the situation of the real system under study. Findings of the case study offer helpful insights into effectively promoting the social performance of CWM for the project investigated. Furthermore, the model exhibits great potential to function as an experimental platform for dynamically evaluating the effects of management measures on improving the social performance of CWM in construction projects. Copyright © 2012 Elsevier Ltd. All rights reserved.
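The stock-and-flow logic of an SD model of this kind (normally built graphically in iThink) can be sketched in a few lines. The variable names, coefficients and scenarios below are invented for illustration only; they are not the paper's model.

```python
# Toy stock-and-flow simulation: a "social performance" stock (0-100
# index) raised by management-measure effectiveness and eroded by waste
# generation. All numbers are illustrative assumptions.
def simulate(weeks, waste_rate, measure_effort, dt=1.0):
    social_performance = 50.0              # initial stock level
    history = []
    for _ in range(weeks):
        improvement = 2.0 * measure_effort  # inflow
        degradation = 0.03 * waste_rate     # outflow
        social_performance += dt * (improvement - degradation)
        social_performance = max(0.0, min(100.0, social_performance))
        history.append(social_performance)
    return history

baseline = simulate(52, waste_rate=100, measure_effort=1.0)
improved = simulate(52, waste_rate=60, measure_effort=1.5)
print(baseline[-1], improved[-1])
```

Running two scenarios and comparing trajectories is exactly the "experimental platform" use the abstract describes: the model becomes a cheap test bed for management measures.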
ERIC Educational Resources Information Center
Louzada, Alexandre Neves; Elia, Marcos da Fonseca; Sampaio, Fábio Ferrentini; Vidal, Andre Luiz Pestana
2014-01-01
The aim of this work is to adapt and test, in a Brazilian public school, the ACE model proposed by Borkulo for evaluating student performance as a teaching-learning process based on computational modeling systems. The ACE model is based on different types of reasoning involving three dimensions. In addition to adapting the model and introducing…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Snyder, Abigail C.; Link, Robert P.; Calvin, Katherine V.
Hindcasting experiments (conducting a model forecast for a time period in which observational data are available) are being undertaken increasingly often by the integrated assessment model (IAM) community, across many scales of models. When they are undertaken, the results are often evaluated using global aggregates or otherwise highly aggregated skill scores that mask deficiencies. We select a set of deviation-based measures that can be applied on different spatial scales (regional versus global) to make evaluating the large number of variable–region combinations in IAMs more tractable. We also identify performance benchmarks for these measures, based on the statistics of the observational dataset, that allow a model to be evaluated in absolute terms rather than relative to the performance of other models at similar tasks. An ideal evaluation method for hindcast experiments in IAMs would feature both absolute measures for evaluation of a single experiment for a single model and relative measures to compare the results of multiple experiments for a single model or the same experiment repeated across multiple models, such as in community intercomparison studies. The performance benchmarks highlight the use of this scheme for model evaluation in absolute terms, providing information about the reasons a model may perform poorly on a given measure and therefore identifying opportunities for improvement. To demonstrate the use of and types of results possible with the evaluation method, the measures are applied to the results of a past hindcast experiment focusing on land allocation in the Global Change Assessment Model (GCAM) version 3.0. The question of how to more holistically evaluate models as complex as IAMs is an area for future research. We find quantitative evidence that global aggregates alone are not sufficient for evaluating IAMs that require global supply to equal global demand at each time period, such as GCAM.
The results of this work indicate it is unlikely that a single evaluation measure for all variables in an IAM exists, and therefore sector-by-sector evaluation may be necessary.
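The core idea above, evaluating each region with a deviation-based measure against a benchmark derived from the observational statistics rather than against other models, can be sketched as follows. The data are synthetic and the specific measure (RMSE) and benchmark (standard deviation of the observations) are illustrative choices, not necessarily the ones used in the paper.

```python
import numpy as np

# Synthetic observations and model output for two regions.
rng = np.random.default_rng(0)
obs = {"region_A": rng.normal(10, 2, 30), "region_B": rng.normal(50, 5, 30)}
model = {k: v + rng.normal(0, 1, 30) for k, v in obs.items()}

def rmse(a, b):
    """Root-mean-square error, a simple deviation-based measure."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Absolute evaluation: the model should beat a benchmark computed
# purely from the statistics of the observational dataset.
for region in obs:
    score = rmse(model[region], obs[region])
    benchmark = float(np.std(obs[region]))
    print(region, score, benchmark, score < benchmark)

# A single global aggregate can mask a failing region:
global_rmse = rmse(np.concatenate(list(model.values())),
                   np.concatenate(list(obs.values())))
print("global", global_rmse)
```

Scoring each variable–region combination separately, rather than only the global concatenation, is what makes regional deficiencies visible.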
The use of neural network technology to model swimming performance.
Silva, António José; Costa, Aldo Manuel; Oliveira, Paulo Moura; Reis, Victor Machado; Saavedra, José; Perl, Jurgen; Rouboa, Abel; Marinho, Daniel Almeida
2007-01-01
The aims of this study were to identify the factors that explain performance in the 200 m individual medley and 400 m front crawl events in young swimmers, to model performance in those events using non-linear mathematical methods through artificial neural networks (multi-layer perceptrons), and to assess the precision of the neural network models in predicting performance. A sample of 138 young swimmers (65 males and 73 females) of national level was submitted to a test battery comprising four different domains: kinanthropometric evaluation, dry-land functional evaluation (strength and flexibility), swimming functional evaluation (hydrodynamic, hydrostatic and bioenergetic characteristics) and swimming technique evaluation. To establish a profile of the young swimmer, non-linear combinations between preponderant variables for each gender and swim performance in the 200 m individual medley and 400 m front crawl events were developed. For this purpose a feed-forward neural network (multilayer perceptron) with three neurons in a single hidden layer was used. The prognostic precision of the model (error lower than 0.8% between true and estimated performances) is supported by recent evidence. Therefore, we consider that the neural network tool can be a good approach to the resolution of complex problems such as performance modelling and talent identification in swimming and, possibly, in a wide variety of sports.
Key points: The non-linear analysis resulting from the use of a feed-forward neural network allowed the development of four performance models. The mean difference between the true and estimated results produced by each of the four neural network models was low. The neural network tool can be a good approach to performance modelling, as an alternative to standard statistical models that presume well-defined distributions and independence among all inputs. The use of neural networks in sports science allowed us to create very realistic models for swimming performance prediction based on previously selected criteria related to the dependent variable (performance).
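The architecture described above, a feed-forward multilayer perceptron with a single hidden layer of three neurons, can be sketched from scratch. The data here are synthetic stand-ins for the kinanthropometric and functional predictors, and all shapes, learning rates and iteration counts are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(138, 6))                 # 138 swimmers, 6 predictors
true_w = rng.normal(size=(6, 1))
y = np.tanh(X @ true_w) + 0.05 * rng.normal(size=(138, 1))  # synthetic "times"

# One hidden layer with exactly three neurons, as in the abstract.
W1 = rng.normal(scale=0.5, size=(6, 3)); b1 = np.zeros(3)
W2 = rng.normal(scale=0.5, size=(3, 1)); b2 = np.zeros(1)
lr = 0.05
for _ in range(2000):
    h = np.tanh(X @ W1 + b1)                  # hidden layer activations
    pred = h @ W2 + b2                        # linear output
    err = pred - y
    # Backpropagation of the mean-squared-error gradient.
    gW2 = h.T @ err / len(X); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)          # tanh derivative
    gW1 = X.T @ dh / len(X); gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

mse = float(np.mean((pred - y) ** 2))
print(mse)
```

Even with only three hidden units, the non-linear combination of predictors fits a non-linear target far better than predicting the mean would, which is the property the study exploits.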
VI-G, Sec. 661, P.L. 91-230. Final Performance Report.
ERIC Educational Resources Information Center
1976
Presented is the final performance report of the CSDC model which is designed to provide services for learning disabled high school students. Sections cover the following program aspects: organizational structure, inservice sessions, identification of students, materials and equipment, evaluation of student performance, evaluation of the model,…
A model for evaluating the social performance of construction waste management
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yuan Hongping, E-mail: hpyuan2005@gmail.com
Highlights: Scant attention is paid to the social performance of construction waste management (CWM). We develop a model for assessing the social performance of CWM. With the model, the social performance of CWM can be quantitatively simulated. - Abstract: It has been determined by existing literature that a lot of research efforts have been made to the economic performance of construction waste management (CWM), but less attention is paid to investigation of the social performance of CWM. This study therefore attempts to develop a model for quantitatively evaluating the social performance of CWM by using a system dynamics (SD) approach. Firstly, major variables affecting the social performance of CWM are identified and a holistic system for assessing the social performance of CWM is formulated in line with feedback relationships underlying these variables. The developed system is then converted into a SD model through the software iThink. An empirical case study is finally conducted to demonstrate application of the model. Results of model validation indicate that the model is robust and reasonable to reflect the situation of the real system under study. Findings of the case study offer helpful insights into effectively promoting the social performance of CWM of the project investigated. Furthermore, the model exhibits great potential to function as an experimental platform for dynamically evaluating effects of management measures on improving the social performance of CWM of construction projects.
Reliability and performance evaluation of systems containing embedded rule-based expert systems
NASA Technical Reports Server (NTRS)
Beaton, Robert M.; Adams, Milton B.; Harrison, James V. A.
1989-01-01
A method for evaluating the reliability of real-time systems containing embedded rule-based expert systems is proposed and investigated. It is a three-stage technique that addresses the impact of knowledge-base uncertainties on the performance of expert systems. In the first stage, a Markov reliability model of the system is developed which identifies the key performance parameters of the expert system. In the second stage, the evaluation method is used to determine the values of the expert system's key performance parameters. The performance parameters can be evaluated directly by using a probabilistic model of uncertainties in the knowledge-base or by using sensitivity analyses. In the third and final stage, the performance parameters of the expert system are combined with performance parameters for other system components and subsystems to evaluate the reliability and performance of the complete system. The evaluation method is demonstrated in the context of a simple expert system used to supervise the performance of an FDI algorithm associated with an aircraft longitudinal flight-control system.
VERIFICATION OF THE HYDROLOGIC EVALUATION OF LANDFILL PERFORMANCE (HELP) MODEL USING FIELD DATA
The report describes a study conducted to verify the Hydrologic Evaluation of Landfill Performance (HELP) computer model using existing field data from a total of 20 landfill cells at 7 sites in the United States. Simulations using the HELP model were run to compare the predicted...
NASA Astrophysics Data System (ADS)
Zheng, Feifei; Maier, Holger R.; Wu, Wenyan; Dandy, Graeme C.; Gupta, Hoshin V.; Zhang, Tuqiao
2018-02-01
Hydrological models are used for a wide variety of engineering purposes, including streamflow forecasting and flood-risk estimation. To develop such models, it is common to allocate the available data to calibration and evaluation data subsets. Surprisingly, the issue of how this allocation can affect model evaluation performance has been largely ignored in the research literature. This paper discusses the evaluation performance bias that can arise from how available data are allocated to calibration and evaluation subsets. As a first step to assessing this issue in a statistically rigorous fashion, we present a comprehensive investigation of the influence of data allocation on the development of data-driven artificial neural network (ANN) models of streamflow. Four well-known formal data splitting methods are applied to 754 catchments from Australia and the U.S. to develop 902,483 ANN models. Results clearly show that the choice of the method used for data allocation has a significant impact on model performance, particularly for runoff data that are more highly skewed, highlighting the importance of considering the impact of data splitting when developing hydrological models. The statistical behavior of the data splitting methods investigated is discussed and guidance is offered on the selection of the most appropriate data splitting methods to achieve representative evaluation performance for streamflow data with different statistical properties. Although our results are obtained for data-driven models, they highlight the fact that this issue is likely to have a significant impact on all types of hydrological models, especially conceptual rainfall-runoff models.
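The issue above, that how records are allocated to calibration and evaluation subsets biases the reported evaluation performance, can be made concrete with a simple distribution-preserving split: sort records by the output variable and assign every k-th record to evaluation, so both subsets span the full range including the skewed high-flow tail. This is a basic stand-in for the formal splitting methods the paper compares, not any specific one of them.

```python
import numpy as np

def systematic_split(y, eval_fraction=0.25):
    """Sort by output value, send every k-th record to evaluation."""
    order = np.argsort(y)
    k = int(round(1 / eval_fraction))
    eval_idx = order[::k]
    calib_idx = np.setdiff1d(order, eval_idx)
    return calib_idx, eval_idx

rng = np.random.default_rng(2)
flows = rng.lognormal(mean=1.0, sigma=1.0, size=400)  # skewed, runoff-like
calib, ev = systematic_split(flows)
# Both subsets now sample the whole distribution, unlike a random split
# that may by chance put all the extreme floods in one subset.
print(float(flows[calib].mean()), float(flows[ev].mean()))
```

With highly skewed data such as runoff, a purely random split can leave the evaluation subset without any extreme events, making the model look better (or worse) than it is; range-covering splits reduce that bias.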
Stochastic performance modeling and evaluation of obstacle detectability with imaging range sensors
NASA Technical Reports Server (NTRS)
Matthies, Larry; Grandjean, Pierrick
1993-01-01
Statistical modeling and evaluation of the performance of obstacle detection systems for Unmanned Ground Vehicles (UGVs) is essential for the design, evaluation, and comparison of sensor systems. In this report, we address this issue for imaging range sensors by dividing the evaluation problem into two levels: quality of the range data itself and quality of the obstacle detection algorithms applied to the range data. We review existing models of the quality of range data from stereo vision and AM-CW LADAR, then use these to derive a new model for the quality of a simple obstacle detection algorithm. This model predicts the probability of detecting obstacles and the probability of false alarms, as a function of the size and distance of the obstacle, the resolution of the sensor, and the level of noise in the range data. We evaluate these models experimentally using range data from stereo image pairs of a gravel road with known obstacles at several distances. The results show that the approach is a promising tool for predicting and evaluating the performance of obstacle detection with imaging range sensors.
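The kind of closed-form prediction described above, detection and false-alarm probability as a function of obstacle size, distance, sensor resolution and range noise, can be sketched as follows. The functional forms and all parameter values (IFOV, noise growth, threshold) are illustrative assumptions in the spirit of the report, not its actual model.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def range_sigma(dist_m, sigma_ref=0.02, ref_dist=5.0):
    # Stereo range noise grows roughly with the square of distance.
    return sigma_ref * (dist_m / ref_dist) ** 2

def pixels_on_target(height_m, dist_m, ifov_rad=0.005):
    # Vertical pixels the obstacle subtends at this distance.
    return max(1, int(height_m / (dist_m * ifov_rad)))

def p_detect(height_m, dist_m, threshold_m=0.10):
    # Average n noisy height estimates, then threshold: detection is the
    # probability the averaged estimate exceeds the height threshold.
    n = pixels_on_target(height_m, dist_m)
    sigma = range_sigma(dist_m) / math.sqrt(n)
    return 1.0 - phi((threshold_m - height_m) / sigma)

print(p_detect(0.3, 5.0), p_detect(0.3, 40.0))
```

Because the obstacle subtends fewer pixels and the range noise grows with distance, detection probability falls off with range, which is the qualitative behavior the experiments with known obstacles at several distances were designed to check.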
Solid rocket booster performance evaluation model. Volume 4: Program listing
NASA Technical Reports Server (NTRS)
1974-01-01
All subprograms or routines associated with the solid rocket booster performance evaluation model are indexed in this computer listing. An alphanumeric list of each routine in the index is provided in a table of contents.
NASA Astrophysics Data System (ADS)
Mohammadi, Mousa; Rai, Piyush; Gupta, Suprakash
2017-03-01
Overall Equipment Effectiveness (OEE) has been used for over two decades as a measure of performance in manufacturing industries. Unfortunately, OEE has not been duly adopted in the mining and excavation industry. In this paper an effort has been made to identify the OEE for performance evaluation of Bucket based Excavating, Loading and Transport (BELT) equipment. The conceptual model of OEE, as used in the manufacturing industries, has been revised to adapt it to BELT equipment. The revised and adapted model considers operational time, speed and bucket capacity utilization losses as the key OEE components for evaluating the performance of BELT equipment. To illustrate the efficacy of the devised model on a real-time basis, a case study was undertaken on the biggest single-bucket excavating equipment, the dragline, in a large surface coal mine. One year of data was collected in order to evaluate the proposed OEE model.
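The revised OEE described above multiplies three utilization ratios, one per loss category (time, speed, bucket capacity). A minimal sketch; the field numbers below are invented for illustration, not taken from the dragline case study.

```python
def oee_belt(operating_hours, scheduled_hours,
             actual_cycles_per_h, rated_cycles_per_h,
             actual_bucket_fill_m3, rated_bucket_m3):
    """OEE for bucket-based equipment as a product of three ratios."""
    availability = operating_hours / scheduled_hours       # time losses
    speed = actual_cycles_per_h / rated_cycles_per_h       # speed losses
    capacity = actual_bucket_fill_m3 / rated_bucket_m3     # fill losses
    return availability * speed * capacity

# Hypothetical year of dragline data.
oee = oee_belt(6000, 8000, 48, 60, 40.5, 47.0)
print(round(oee, 3))
```

As in the manufacturing formulation, the multiplicative form means a modest shortfall in each category compounds: 75% availability, 80% speed and 86% bucket fill already pull overall effectiveness near 50%.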
NASA Astrophysics Data System (ADS)
Lee, H.
2016-12-01
Precipitation is one of the most important climate variables that are taken into account in studying regional climate. Nevertheless, how precipitation will respond to a changing climate and even its mean state in the current climate are not well represented in regional climate models (RCMs). Hence, comprehensive and mathematically rigorous methodologies to evaluate precipitation and related variables in multiple RCMs are required. The main objective of the current study is to evaluate the joint variability of climate variables related to model performance in simulating precipitation and condense multiple evaluation metrics into a single summary score. We use multi-objective optimization, a mathematical process that provides a set of optimal tradeoff solutions based on a range of evaluation metrics, to characterize the joint representation of precipitation, cloudiness and insolation in RCMs participating in the North American Regional Climate Change Assessment Program (NARCCAP) and Coordinated Regional Climate Downscaling Experiment-North America (CORDEX-NA). We also leverage ground observations, NASA satellite data and the Regional Climate Model Evaluation System (RCMES). Overall, the quantitative comparison of joint probability density functions between the three variables indicates that performance of each model differs markedly between sub-regions and also shows strong seasonal dependence. Because of the large variability across the models, it is important to evaluate models systematically and make future projections using only models showing relatively good performance. Our results indicate that the optimized multi-model ensemble always shows better performance than the arithmetic ensemble mean and may guide reliable future projections.
Regime-based evaluation of cloudiness in CMIP5 models
NASA Astrophysics Data System (ADS)
Jin, Daeho; Oreopoulos, Lazaros; Lee, Dongmin
2017-01-01
The concept of cloud regimes (CRs) is used to develop a framework for evaluating the cloudiness of 12 models from phase 5 of the Coupled Model Intercomparison Project (CMIP5). Reference CRs come from existing global International Satellite Cloud Climatology Project (ISCCP) weather states. The evaluation is made possible by the implementation in several CMIP5 models of the ISCCP simulator generating in each grid cell daily joint histograms of cloud optical thickness and cloud top pressure. Model performance is assessed with several metrics such as CR global cloud fraction (CF), CR relative frequency of occurrence (RFO), their product [long-term average total cloud amount (TCA)], cross-correlations of CR RFO maps, and a metric of resemblance between model and ISCCP CRs. In terms of CR global RFO, arguably the most fundamental metric, the models perform unsatisfactorily overall, except for CRs representing thick storm clouds. Because model CR CF is internally constrained by our method, RFO discrepancies also yield substantial TCA errors. Our results support previous findings that CMIP5 models underestimate cloudiness. The multi-model mean performs well in matching observed RFO maps for many CRs, but is still not the best for this or other metrics. When overall performance across all CRs is assessed, some models, despite shortcomings, apparently outperform Moderate Resolution Imaging Spectroradiometer cloud observations evaluated against ISCCP in the same way as model output. Lastly, contrasting cloud simulation performance against each model's equilibrium climate sensitivity, in order to gain insight into whether good cloud simulation pairs with particular values of this parameter, yields no clear conclusions.
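Two of the metrics above are simple enough to sketch directly: long-term total cloud amount (TCA) as the RFO-weighted sum of per-regime cloud fraction, and the spatial cross-correlation of a model's RFO map against the observed one. The regime values and toy grid below are invented for illustration.

```python
import numpy as np

# Three hypothetical cloud regimes.
cf  = np.array([0.9, 0.6, 0.3])   # cloud fraction within each regime
rfo = np.array([0.2, 0.3, 0.5])   # relative frequency of occurrence
tca = float(np.sum(cf * rfo))     # long-term average total cloud amount

# Cross-correlation of RFO maps on a toy 2x2 grid.
obs_rfo_map = np.array([[0.1, 0.4], [0.6, 0.2]])
mod_rfo_map = np.array([[0.2, 0.5], [0.5, 0.1]])
r = float(np.corrcoef(obs_rfo_map.ravel(), mod_rfo_map.ravel())[0, 1])
print(tca, r)
```

The decomposition makes error attribution possible: because TCA is the product-sum of CF and RFO, a model can get TCA right for the wrong reason (compensating CF and RFO biases), which aggregate cloud-amount metrics alone would miss.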
The Third Phase of AQMEII: Evaluation Strategy and Multi-Model Performance Analysis
AQMEII (Air Quality Model Evaluation International Initiative) is an extraordinary effort promoting policy-relevant research on regional air quality model evaluation across the European and North American atmospheric modelling communities, providing the ideal platform for advanci...
A Perspective on Computational Human Performance Models as Design Tools
NASA Technical Reports Server (NTRS)
Jones, Patricia M.
2010-01-01
The design of interactive systems, including levels of automation, displays, and controls, is usually based on design guidelines and iterative empirical prototyping. A complementary approach is to use computational human performance models to evaluate designs. An integrated strategy of model-based and empirical test and evaluation activities is particularly attractive as a methodology for verification and validation of human-rated systems for commercial space. This talk will review several computational human performance modeling approaches and their applicability to design of display and control requirements.
Performance evaluation of an agent-based occupancy simulation model
Luo, Xuan; Lam, Khee Poh; Chen, Yixing; ...
2017-01-17
Occupancy is an important factor driving building performance. Static and homogeneous occupant schedules, commonly used in building performance simulation, contribute to issues such as performance gaps between simulated and measured energy use in buildings. Stochastic occupancy models have been recently developed and applied to better represent spatial and temporal diversity of occupants in buildings. However, there is very limited evaluation of the usability and accuracy of these models. This study used measured occupancy data from a real office building to evaluate the performance of an agent-based occupancy simulation model: the Occupancy Simulator. The occupancy patterns of various occupant types were first derived from the measured occupant schedule data using statistical analysis. Then the performance of the simulation model was evaluated and verified based on (1) whether the distribution of observed occupancy behavior patterns follows the theoretical ones included in the Occupancy Simulator, and (2) whether the simulator can reproduce a variety of occupancy patterns accurately. Results demonstrated the feasibility of applying the Occupancy Simulator to simulate a range of occupancy presence and movement behaviors for regular types of occupants in office buildings, and to generate stochastic occupant schedules at the room and individual occupant levels for building performance simulation. For future work, model validation is recommended, which includes collecting and using detailed interval occupancy data of all spaces in an office building to validate the simulated occupant schedules from the Occupancy Simulator.
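The core mechanism of stochastic occupancy models of this kind can be sketched as a two-state (absent/present) Markov chain per occupant with time-varying transition probabilities, so every simulation run yields a different, non-homogeneous schedule. This is a generic illustration of the approach, not the Occupancy Simulator's actual model; all probabilities are invented.

```python
import random

def arrive_prob(hour):
    """Chance an absent occupant arrives during this hour (illustrative)."""
    return 0.6 if 8 <= hour < 10 else 0.05

def depart_prob(hour):
    """Chance a present occupant leaves during this hour (illustrative)."""
    return 0.7 if 17 <= hour < 19 else 0.05

def simulate_day(rng):
    present, schedule = False, []
    for hour in range(24):
        p = depart_prob(hour) if present else arrive_prob(hour)
        if rng.random() < p:
            present = not present       # state transition
        schedule.append(int(present))
    return schedule

rng = random.Random(42)
day = simulate_day(rng)
print(day)
```

Averaging many such simulated days recovers the theoretical presence profile, which is essentially the check the study performs: comparing the distribution of simulated patterns against the observed ones.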
Walsh, Matthew M; Gluck, Kevin A; Gunzelmann, Glenn; Jastrzembski, Tiffany; Krusmark, Michael
2018-06-01
The spacing effect is among the most widely replicated empirical phenomena in the learning sciences, and its relevance to education and training is readily apparent. Yet successful applications of spacing effect research to education and training are rare. Computational modeling can provide the crucial link between a century of accumulated experimental data on the spacing effect and the emerging interest in using that research to enable adaptive instruction. In this paper, we review relevant literature and identify 10 criteria for rigorously evaluating computational models of the spacing effect. Five relate to evaluating the theoretic adequacy of a model, and five relate to evaluating its application potential. We use these criteria to evaluate a novel computational model of the spacing effect called the Predictive Performance Equation (PPE). The Predictive Performance Equation combines elements of earlier models of learning and memory including the General Performance Equation, Adaptive Control of Thought-Rational, and the New Theory of Disuse, giving rise to a novel computational account of the spacing effect that performs favorably across the complete sets of theoretic and applied criteria. We implemented two other previously published computational models of the spacing effect and compared them to PPE using the theoretic and applied criteria as guides. Copyright © 2018 Cognitive Science Society, Inc.
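The mechanism that PPE-family models formalize can be illustrated with a toy power-law learning-and-forgetting model in which wider spacing between practice sessions lowers the decay rate, so spaced practice retains more at a delayed test. The functional forms and constants below are generic illustrations, not the published PPE equations.

```python
import math

def predicted_strength(practice_times, now, b=1.0):
    """Toy memory-strength prediction: practice builds strength,
    time decays it, and wider spacing slows the decay."""
    n = len(practice_times)
    gaps = [t2 - t1 for t1, t2 in zip(practice_times, practice_times[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else 1.0
    decay = 0.4 / (1.0 + math.log(1.0 + mean_gap))   # spacing slows decay
    elapsed = now - practice_times[-1]
    return b * n ** 0.3 * (1.0 + elapsed) ** (-decay)

massed = [0, 1, 2, 3]        # four back-to-back sessions (hours)
spaced = [0, 24, 48, 72]     # four sessions a day apart
print(predicted_strength(massed, 168), predicted_strength(spaced, 168))
```

At a test one week in, the spaced schedule is predicted to outperform the massed one even though both involve four practice events, which is the qualitative signature any adequate spacing-effect model must reproduce.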
METAPHOR: Programmer's guide, Version 1
NASA Technical Reports Server (NTRS)
Furchtgott, D. G.
1979-01-01
The internal structure of the Michigan Evaluation Aid for Performability (METAPHOR), an interactive software package to facilitate performability modeling and evaluation, is described. Revised and supplemented guides are prepared in order to maintain up-to-date documentation of the system. Programmed tools to facilitate each step of performability model construction and model solution are given.
ERIC Educational Resources Information Center
Brady, Michael P.; Heiser, Lawrence A.; McCormick, Jazarae K.; Forgan, James
2016-01-01
High-stakes standardized student assessments are increasingly used in value-added evaluation models to connect teacher performance to P-12 student learning. These assessments are also being used to evaluate teacher preparation programs, despite validity and reliability threats. A more rational model linking student performance to candidates who…
An Evaluation Model for Competency Based Teacher Preparatory Programs.
ERIC Educational Resources Information Center
Denton, Jon J.
This discussion describes an evaluation model designed to complement a curriculum development project, the primary goal of which is to structure a performance based program for preservice teachers. Data collected from the implementation of this four-phase model can be used to make decisions for developing and changing performance objectives and…
Application of Support Vector Machine to Forex Monitoring
NASA Astrophysics Data System (ADS)
Kamruzzaman, Joarder; Sarker, Ruhul A.
Previous studies have demonstrated superior performance of artificial neural network (ANN) based forex forecasting models over traditional regression models. This paper applies support vector machines to build a forecasting model from the historical data using six simple technical indicators and presents a comparison with an ANN based model trained by the scaled conjugate gradient (SCG) learning algorithm. The models are evaluated and compared on the basis of five commonly used performance metrics that measure closeness of prediction as well as correctness in directional change. Forecasting results for six different currencies against the Australian dollar reveal superior performance of the SVM model using a simple linear kernel over the ANN-SCG model in terms of all the evaluation metrics. The effect of SVM parameter selection on prediction performance is also investigated and analyzed.
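The two families of metrics mentioned — closeness of prediction and correctness of directional change — can be sketched as follows. The abstract does not name the five metrics, so this is an assumption using three common choices: MAE and RMSE for closeness, and directional symmetry (DS, the percentage of steps where the predicted change has the same sign as the actual change) for direction:

```python
import math

def forecast_metrics(actual, predicted):
    """MAE and RMSE measure closeness of prediction; directional
    symmetry (DS) measures correctness of the predicted direction
    of change, as a percentage of steps."""
    n = len(actual)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    correct = sum(
        1 for i in range(1, n)
        if (actual[i] - actual[i - 1]) * (predicted[i] - predicted[i - 1]) > 0
    )
    ds = 100.0 * correct / (n - 1)
    return mae, rmse, ds

# Hypothetical daily exchange rates and model forecasts.
actual = [1.00, 1.02, 1.01, 1.03, 1.05]
pred = [1.01, 1.03, 1.00, 1.02, 1.06]
mae, rmse, ds = forecast_metrics(actual, pred)
```

A model can score well on closeness while missing turning points, which is why direction-of-change metrics are reported alongside error metrics in forex evaluation.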
Conversations, Not Evaluations: An Alternative Model of Performance Management
ERIC Educational Resources Information Center
Lee, Christopher D.
2003-01-01
Traditional appraisal and evaluation systems focus almost exclusively on an employee's past performance. The desired result in each of these systems is better work performance. The very nature of most appraisals or evaluations, however, may inhibit performance unintentionally by focusing energy, attention and effort on past shortcomings rather…
Fan, Yunzhou; Wang, Ying; Jiang, Hongbo; Yang, Wenwen; Yu, Miao; Yan, Weirong; Diwan, Vinod K; Xu, Biao; Dong, Hengjin; Palm, Lars; Nie, Shaofa
2014-01-01
Syndromic surveillance promotes the early detection of disease outbreaks. Although syndromic surveillance has increased in developing countries, performance on outbreak detection, particularly in cases of multi-stream surveillance, has scarcely been evaluated in rural areas. This study introduces a temporal simulation model based on healthcare-seeking behaviors to evaluate the performance of multi-stream syndromic surveillance for influenza-like illness. Data were obtained in six towns of rural Hubei Province, China, from April 2012 to June 2013. A Susceptible-Exposed-Infectious-Recovered model generated 27 scenarios of simulated influenza A (H1N1) outbreaks, which were converted into corresponding simulated syndromic datasets through the healthcare-behaviors model. We then superimposed the converted syndromic datasets onto the baselines obtained to create the testing datasets. Outbreak-detection performance of single-stream surveillance (clinic visits, frequency of over-the-counter drug purchases, and school absenteeism) and of multi-stream surveillance of their combinations was evaluated using receiver operating characteristic curves and activity monitoring operation curves. In the six towns examined, clinic visit surveillance and school absenteeism surveillance exhibited better outbreak-detection performance than over-the-counter drug purchase frequency surveillance; the performance of multi-stream surveillance was preferable to single-stream surveillance, particularly at low specificity (Sp <90%). The temporal simulation model based on healthcare-seeking behaviors offers an accessible method for evaluating the performance of multi-stream surveillance.
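A minimal discrete-time SEIR sketch of the kind used to generate simulated outbreaks is shown below; the parameter values (transmission rate beta, incubation rate sigma, recovery rate gamma, population size) are illustrative, not those of the study:

```python
def seir_step(s, e, i, r, beta, sigma, gamma, n):
    """One discrete-time SEIR update (daily time step)."""
    new_exposed = beta * s * i / n      # S -> E
    new_infectious = sigma * e          # E -> I
    new_recovered = gamma * i           # I -> R
    return (s - new_exposed,
            e + new_exposed - new_infectious,
            i + new_infectious - new_recovered,
            r + new_recovered)

def simulate(days, n=10000, i0=10, beta=0.4, sigma=0.2, gamma=0.1):
    """Return the daily infectious-count curve for one simulated outbreak."""
    s, e, i, r = n - i0, 0.0, float(i0), 0.0
    curve = []
    for _ in range(days):
        s, e, i, r = seir_step(s, e, i, r, beta, sigma, gamma, n)
        curve.append(i)
    return curve
```

Each simulated epidemic curve would then be passed through a healthcare-seeking-behavior model to produce the syndromic counts (visits, purchases, absences) that get superimposed on observed baselines.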
NASA Astrophysics Data System (ADS)
Schmidt, J. B.
1985-09-01
This thesis investigates ways of improving the real-time performance of the Stockpoint Logistics Integrated Communication Environment (SPLICE). Performance evaluation through continuous monitoring activities and performance studies are the principal vehicles discussed. The method for implementing this performance evaluation process is the measurement of predefined performance indexes, and performance indexes that would measure the areas of interest are offered for SPLICE. Existing SPLICE capability to carry out performance evaluation is explored, and recommendations are made to enhance that capability.
Zhao, Wei; Kaguelidou, Florentia; Biran, Valérie; Zhang, Daolun; Allegaert, Karel; Capparelli, Edmund V; Holford, Nick; Kimura, Toshimi; Lo, Yoke-Lin; Peris, José-Esteban; Thomson, Alison; Anker, John N; Fakhoury, May; Jacqz-Aigrain, Evelyne
2013-01-01
Aims Vancomycin is one of the antibiotics most evaluated in neonates using modeling and simulation approaches. However, no clear consensus on optimal dosing has been achieved. The objective of the present study was to perform an external evaluation of published models, in order to test their predictive performance in an independent dataset and to identify the possible study-related factors influencing the transferability of pharmacokinetic models to different clinical settings. Methods Published neonatal vancomycin pharmacokinetic models were screened from the literature. The predictive performance of six models was evaluated using an independent dataset (112 concentrations from 78 neonates). The evaluation procedures used simulation-based diagnostics [visual predictive check (VPC) and normalized prediction distribution errors (NPDE)]. Results Differences in the predictive performance of the models for vancomycin pharmacokinetics in neonates were found. The means of the NPDE for the six evaluated models were 1.35, −0.22, −0.36, 0.24, 0.66 and 0.48, respectively. These differences were explained, at least partly, by taking into account the method used to measure serum creatinine concentrations. The adult conversion factor of 1.3 (enzymatic to Jaffé) was tested with an improvement in the VPC and NPDE, but it still needs to be evaluated and validated in neonates. Differences were also identified between analytical methods for vancomycin. Conclusion The importance of the analytical techniques for serum creatinine and vancomycin as predictors of vancomycin concentrations in neonates has been confirmed. Dosage individualization of vancomycin in neonates should consider not only patients' characteristics and clinical conditions, but also the methods used to measure serum creatinine and vancomycin. PMID:23148919
Regime-Based Evaluation of Cloudiness in CMIP5 Models
NASA Technical Reports Server (NTRS)
Jin, Daeho; Oraiopoulos, Lazaros; Lee, Dong Min
2016-01-01
The concept of Cloud Regimes (CRs) is used to develop a framework for evaluating the cloudiness of 12 models from phase 5 of the Coupled Model Intercomparison Project (CMIP5). Reference CRs come from existing global International Satellite Cloud Climatology Project (ISCCP) weather states. The evaluation is made possible by the implementation in several CMIP5 models of the ISCCP simulator, which generates for each gridcell daily joint histograms of cloud optical thickness and cloud top pressure. Model performance is assessed with several metrics such as CR global cloud fraction (CF), CR relative frequency of occurrence (RFO), their product (long-term average total cloud amount [TCA]), cross-correlations of CR RFO maps, and a metric of resemblance between model and ISCCP CRs. In terms of CR global RFO, arguably the most fundamental metric, the models perform unsatisfactorily overall, except for CRs representing thick storm clouds. Because model CR CF is internally constrained by our method, RFO discrepancies also yield substantial TCA errors. Our findings support previous studies showing that CMIP5 models underestimate cloudiness. The multi-model mean performs well in matching observed RFO maps for many CRs, but is not the best for this or other metrics. When overall performance across all CRs is assessed, some models, despite their shortcomings, apparently outperform Moderate Resolution Imaging Spectroradiometer (MODIS) cloud observations evaluated against ISCCP as if they were another model output. Lastly, cloud simulation performance is contrasted with each model's equilibrium climate sensitivity (ECS) in order to gain insight on whether good cloud simulation pairs with particular values of this parameter.
The Use of Neural Network Technology to Model Swimming Performance
Silva, António José; Costa, Aldo Manuel; Oliveira, Paulo Moura; Reis, Victor Machado; Saavedra, José; Perl, Jurgen; Rouboa, Abel; Marinho, Daniel Almeida
2007-01-01
The aims of the present study were: to identify the factors which are able to explain the performance in the 200 meters individual medley and 400 meters front crawl events in young swimmers, to model the performance in those events using non-linear mathematical methods through artificial neural networks (multi-layer perceptrons), and to assess the precision of the neural network models in predicting performance. A sample of 138 young swimmers (65 males and 73 females) of national level was submitted to a test battery comprising four different domains: kinanthropometric evaluation, dry land functional evaluation (strength and flexibility), swimming functional evaluation (hydrodynamic, hydrostatic and bioenergetic characteristics) and swimming technique evaluation. To establish a profile of the young swimmer, non-linear combinations between preponderant variables for each gender and swim performance in the 200 meters medley and 400 meters front crawl events were developed. For this purpose a feed forward neural network (multilayer perceptron) with three neurons in a single hidden layer was used. The prognostic precision of the model (error lower than 0.8% between true and estimated performances) is supported by recent evidence. Therefore, we consider that the neural network tool can be a good approach to the resolution of complex problems such as performance modeling and talent identification in swimming and, possibly, in a wide variety of sports.
Key points:
- The non-linear analysis resulting from the use of a feed forward neural network allowed the development of four performance models.
- The mean difference between the true and estimated results produced by each of the four neural network models constructed was low.
- The neural network tool can be a good approach to performance modeling, as an alternative to standard statistical models that presume well-defined distributions and independence among all inputs.
- The use of neural networks for sports science applications allowed us to create very realistic models for swimming performance prediction based on previously selected criteria related to the dependent variable (performance).
PMID:24149233
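The topology described above (a feed-forward multilayer perceptron with a single hidden layer of three neurons) can be sketched as a forward pass. The weights below are toy values for a two-input network, not the study's fitted parameters:

```python
import math

def mlp_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a multilayer perceptron with one hidden layer
    (tanh activation) and a single linear output unit."""
    hidden = [math.tanh(sum(w * x for w, x in zip(ws, inputs)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Toy weights: 2 inputs -> 3 hidden neurons -> 1 output.
w_h = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]]
b_h = [0.0, 0.1, -0.1]
w_o = [1.0, -0.5, 0.3]
prediction = mlp_forward([0.4, 0.7], w_h, b_h, w_o, 0.05)
```

In the study, the inputs would be the preponderant anthropometric, functional, and technique variables for each gender, and the weights would be fitted by backpropagation on the swimmers' measured performances.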
Solid rocket booster performance evaluation model. Volume 2: Users manual
NASA Technical Reports Server (NTRS)
1974-01-01
This users manual for the solid rocket booster performance evaluation model (SRB-II) contains descriptions of the model, the program options, the required program inputs, the program output format and the program error messages. SRB-II is written in FORTRAN and is operational on both the IBM 370/155 and the MSFC UNIVAC 1108 computers.
The Hydrologic Evaluation of Landfill Performance (HELP) computer program is a quasi-two-dimensional hydrologic model of water movement across, into, through and out of landfills. The model accepts weather, soil and design data. Landfill systems including various combinations o...
Modeling and Performance Evaluation of Backoff Misbehaving Nodes in CSMA/CA Networks
2012-08-01
Modeling and Performance Evaluation of Backoff Misbehaving Nodes in CSMA/CA Networks Zhuo Lu, Student Member, IEEE, Wenye Wang, Senior Member, IEEE... misbehaving nodes can obtain, we define and study two general classes of backoff misbehavior: continuous misbehavior, which keeps manipulating the backoff...misbehavior sporadically. Our approach is to introduce a new performance metric, namely order gain, to characterize the performance benefits of misbehaving
Models and techniques for evaluating the effectiveness of aircraft computing systems
NASA Technical Reports Server (NTRS)
Meyer, J. F.
1977-01-01
Models, measures and techniques were developed for evaluating the effectiveness of aircraft computing systems. The concept of effectiveness involves aspects of system performance, reliability and worth. Specifically, a model hierarchy was developed in detail at the mission, functional task, and computational task levels. An appropriate class of stochastic models was investigated to serve as bottom-level models in the hierarchical scheme. A unified measure of effectiveness called 'performability' was defined and formulated.
Minimum resolvable power contrast model
NASA Astrophysics Data System (ADS)
Qian, Shuai; Wang, Xia; Zhou, Jingjing
2018-01-01
Signal-to-noise ratio and MTF are important indices for evaluating the performance of optical systems. However, neither used alone nor assessed jointly can they intuitively describe the overall performance of the system. Therefore, an index reflecting comprehensive system performance is proposed: the Minimum Resolvable Radiation Performance Contrast (MRP) model. MRP is an evaluation model that does not involve the human eye. It starts from the radiance of the target and the background, transforms the target and background into equivalent strips, and considers attenuation by the atmosphere, the optical imaging system, and the detector. Combining the signal-to-noise ratio and the MTF, the Minimum Resolvable Radiation Performance Contrast is obtained. Finally, the detection probability model of MRP is given.
NASA Astrophysics Data System (ADS)
He, Zhibin; Wen, Xiaohu; Liu, Hu; Du, Jun
2014-02-01
Data driven models are very useful for river flow forecasting when the underlying physical relationships are not fully understood, but it is not clear whether these data driven models still perform well in the small river basins of semiarid mountain regions, which have complicated topography. In this study, the potential of three different data driven methods, artificial neural network (ANN), adaptive neuro fuzzy inference system (ANFIS) and support vector machine (SVM), was assessed for forecasting river flow in the semiarid mountain region of northwestern China. The models analyzed different combinations of antecedent river flow values, and the appropriate input vector was selected based on the analysis of residuals. The performance of the ANN, ANFIS and SVM models in training and validation sets was compared with the observed data. The model which consists of three antecedent values of flow was selected as the best fit model for river flow forecasting. To obtain a more accurate evaluation of the results of the ANN, ANFIS and SVM models, four standard quantitative statistical performance evaluation measures, the coefficient of correlation (R), root mean squared error (RMSE), Nash-Sutcliffe efficiency coefficient (NS) and mean absolute relative error (MARE), were employed to evaluate the performances of the various models developed. The results indicate that the performance obtained by ANN, ANFIS and SVM in terms of different evaluation criteria during the training and validation period does not vary substantially; the performance of the ANN, ANFIS and SVM models in river flow forecasting was satisfactory. A detailed comparison of the overall performance indicated that the SVM model performed better than ANN and ANFIS in river flow forecasting for the validation data sets. The results also suggest that the ANN, ANFIS and SVM methods can be successfully applied to establish river flow forecasting models in semiarid mountain regions with complicated topography.
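The four goodness-of-fit measures named (R, RMSE, NS, MARE) have standard definitions and can be computed directly from observed and simulated series; the code below is a generic sketch, not the study's implementation:

```python
import math

def evaluate(obs, sim):
    """Compute four standard goodness-of-fit measures:
    correlation coefficient (R), root mean squared error (RMSE),
    Nash-Sutcliffe efficiency (NS), mean absolute relative error (MARE).
    obs values must be nonzero (MARE divides by them)."""
    n = len(obs)
    mo = sum(obs) / n
    ms = sum(sim) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    r = cov / math.sqrt(var_o * var_s)
    rmse = math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / n)
    ns = 1 - sum((o - s) ** 2 for o, s in zip(obs, sim)) / var_o
    mare = sum(abs(o - s) / o for o, s in zip(obs, sim)) / n
    return r, rmse, ns, mare

# Hypothetical observed and simulated flows.
obs = [10.0, 12.0, 9.0, 14.0, 11.0]
sim = [10.5, 11.5, 9.5, 13.0, 11.0]
r, rmse, ns, mare = evaluate(obs, sim)
```

NS equals 1 for a perfect model and falls below 0 when the model is worse than simply predicting the observed mean, which is why it is often reported alongside R and RMSE in flow forecasting.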
Sakieh, Yousef; Salmanmahiny, Abdolrassoul
2016-03-01
Performance evaluation is a critical step when developing land-use and cover change (LUCC) models. The present study proposes a spatially explicit model performance evaluation method, adopting a landscape metric-based approach. To quantify GEOMOD model performance, a set of composition- and configuration-based landscape metrics including number of patches, edge density, mean Euclidean nearest neighbor distance, largest patch index, class area, landscape shape index, and splitting index were employed. The model takes advantage of three decision rules including neighborhood effect, persistence of change direction, and urbanization suitability values. According to the results, while class area, largest patch index, and splitting indices demonstrated insignificant differences between spatial pattern of ground truth and simulated layers, there was a considerable inconsistency between simulation results and real dataset in terms of the remaining metrics. Specifically, simulation outputs were simplistic and the model tended to underestimate number of developed patches by producing a more compact landscape. Landscape-metric-based performance evaluation produces more detailed information (compared to conventional indices such as the Kappa index and overall accuracy) on the model's behavior in replicating spatial heterogeneity features of a landscape such as frequency, fragmentation, isolation, and density. Finally, as the main characteristic of the proposed method, landscape metrics employ the maximum potential of observed and simulated layers for a performance evaluation procedure, provide a basis for more robust interpretation of a calibration process, and also deepen modeler insight into the main strengths and pitfalls of a specific land-use change model when simulating a spatiotemporal phenomenon.
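The simplest of the listed landscape metrics, number of patches, can be sketched as 4-connected component counting on a land-cover grid; this is an illustrative implementation, not GEOMOD's or a landscape-ecology package's:

```python
def count_patches(grid, cls=1):
    """Count 4-connected patches of a given class in a land-cover grid
    (a list of equal-length rows of class codes)."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    patches = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if grid[r0][c0] == cls and (r0, c0) not in seen:
                patches += 1
                stack = [(r0, c0)]  # flood-fill this patch
                while stack:
                    r, c = stack.pop()
                    if (r, c) in seen:
                        continue
                    seen.add((r, c))
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = r + dr, c + dc
                        if 0 <= nr < rows and 0 <= nc < cols \
                                and grid[nr][nc] == cls and (nr, nc) not in seen:
                            stack.append((nr, nc))
    return patches

# Hypothetical developed/undeveloped map with three developed patches.
developed = [[1, 1, 0, 0],
             [0, 0, 0, 1],
             [0, 1, 0, 1]]
```

Comparing such configuration metrics between the simulated and observed maps reveals the kind of underestimation of patch count (over-compact landscapes) that the study reports, which a cell-by-cell Kappa index can miss.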
AERMOD performance evaluation for three coal-fired electrical generating units in Southwest Indiana.
Frost, Kali D
2014-03-01
An evaluation of the steady-state dispersion model AERMOD was conducted to determine its accuracy at predicting hourly ground-level concentrations of sulfur dioxide (SO2) by comparing model-predicted concentrations to a full year of monitored SO2 data. The two study sites comprise three coal-fired electrical generating units (EGUs) located in southwest Indiana. The sites are characterized by tall, buoyant stacks, flat terrain, multiple SO2 monitors, and relatively isolated locations. AERMOD v12060 and AERMOD v12345 with BETA options were evaluated at each study site. For the six monitor-receptor pairs evaluated, AERMOD showed generally good agreement with monitor values for the hourly 99th percentile SO2 design value, with design value ratios that ranged from 0.92 to 1.99. AERMOD was within acceptable performance limits for the Robust Highest Concentration (RHC) statistic (RHC ratios ranged from 0.54 to 1.71) at all six monitors. Analysis of the top 5% of hourly concentrations at the six monitor-receptor sites, paired in time and space, indicated poor model performance in the upper concentration range. The amount of hourly model-predicted data that was within a factor of 2 of observations at these higher concentrations ranged from 14 to 43% over the six sites. Analysis of subsets of data showed consistent overprediction during low wind speed and unstable meteorological conditions, and underprediction during stable, low wind conditions. Hourly paired comparisons represent a stringent measure of model performance; however, given the potential for application of hourly model predictions to the SO2 NAAQS design value, this may be appropriate. At these two sites, AERMOD v12345 BETA options do not improve model performance.
A regulatory evaluation of AERMOD utilizing quantile-quantile (Q-Q) plots, the RHC statistic, and 99th percentile design value concentrations indicates that model performance is acceptable according to widely accepted regulatory performance limits. However, a scientific evaluation examining hourly paired monitor and model values at concentrations of interest indicates overprediction and underprediction bias that is outside of acceptable model performance measures. Overprediction of 1-hr SO2 concentrations by AERMOD presents major ramifications for state and local permitting authorities when establishing emission limits.
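The RHC statistic used in this evaluation is commonly given by an exponential-tail fit over the top of the concentration distribution; a sketch under that assumption (the formula and the conventional default r = 26 are as usually stated in the dispersion-model-evaluation literature):

```python
import math

def rhc(values, r=26):
    """Robust Highest Concentration:
    RHC = x(r) + (xbar - x(r)) * ln((3r - 1) / 2),
    where x(r) is the r-th highest value and xbar is the mean of the
    r - 1 highest values; r = 26 is the conventional default."""
    top = sorted(values, reverse=True)[:r]
    x_r = top[-1]
    xbar = sum(top[:-1]) / (r - 1)
    return x_r + (xbar - x_r) * math.log((3 * r - 1) / 2)

# The evaluation compares the ratio of modeled to monitored RHC against
# acceptance limits; both hourly series here are hypothetical.
modeled = [float(v) for v in range(1, 101)]
monitored = [float(v) * 0.9 for v in range(1, 101)]
rhc_ratio = rhc(modeled) / rhc(monitored)
```

Because RHC smooths over the single highest values, it is considered more robust than comparing raw maxima, which is why regulatory protocols quote RHC ratios rather than peak ratios.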
Performance and evaluation of real-time multicomputer control systems
NASA Technical Reports Server (NTRS)
Shin, K. G.
1983-01-01
New performance measures, detailed examples, modeling of error detection process, performance evaluation of rollback recovery methods, experiments on FTMP, and optimal size of an NMR cluster are discussed.
Margolin, Adam A.; Bilal, Erhan; Huang, Erich; Norman, Thea C.; Ottestad, Lars; Mecham, Brigham H.; Sauerwine, Ben; Kellen, Michael R.; Mangravite, Lara M.; Furia, Matthew D.; Vollan, Hans Kristian Moen; Rueda, Oscar M.; Guinney, Justin; Deflaux, Nicole A.; Hoff, Bruce; Schildwachter, Xavier; Russnes, Hege G.; Park, Daehoon; Vang, Veronica O.; Pirtle, Tyler; Youseff, Lamia; Citro, Craig; Curtis, Christina; Kristensen, Vessela N.; Hellerstein, Joseph; Friend, Stephen H.; Stolovitzky, Gustavo; Aparicio, Samuel; Caldas, Carlos; Børresen-Dale, Anne-Lise
2013-01-01
Although molecular prognostics in breast cancer are among the most successful examples of translating genomic analysis to clinical applications, optimal approaches to breast cancer clinical risk prediction remain controversial. The Sage Bionetworks–DREAM Breast Cancer Prognosis Challenge (BCC) is a crowdsourced research study for breast cancer prognostic modeling using genome-scale data. The BCC provided a community of data analysts with a common platform for data access and blinded evaluation of model accuracy in predicting breast cancer survival on the basis of gene expression data, copy number data, and clinical covariates. This approach offered the opportunity to assess whether a crowdsourced community Challenge would generate models of breast cancer prognosis commensurate with or exceeding current best-in-class approaches. The BCC comprised multiple rounds of blinded evaluations on held-out portions of data on 1981 patients, resulting in more than 1400 models submitted as open source code. Participants then retrained their models on the full data set of 1981 samples and submitted up to five models for validation in a newly generated data set of 184 breast cancer patients. Analysis of the BCC results suggests that the best-performing modeling strategy outperformed previously reported methods in blinded evaluations; model performance was consistent across several independent evaluations; and aggregating community-developed models achieved performance on par with the best-performing individual models. PMID:23596205
Evaluating Organic Aerosol Model Performance: Impact of two Embedded Assumptions
NASA Astrophysics Data System (ADS)
Jiang, W.; Giroux, E.; Roth, H.; Yin, D.
2004-05-01
Organic aerosols are important due to their abundance in the polluted lower atmosphere and their impact on human health and vegetation. However, modeling organic aerosols is a very challenging task because of the complexity of aerosol composition, structure, and formation processes. Assumptions and their associated uncertainties in both models and measurement data make model performance evaluation a truly demanding job. Although some assumptions are obvious, others are hidden and embedded, and can significantly impact modeling results, possibly even changing conclusions about model performance. This paper focuses on analyzing the impact of two embedded assumptions on evaluation of organic aerosol model performance. One assumption is about the enthalpy of vaporization widely used in various secondary organic aerosol (SOA) algorithms. The other is about the conversion factor used to obtain ambient organic aerosol concentrations from measured organic carbon. These two assumptions reflect uncertainties in the model and in the ambient measurement data, respectively. For illustration purposes, various choices of the assumed values are implemented in the evaluation process for an air quality model based on CMAQ (the Community Multiscale Air Quality Model). Model simulations are conducted for the Lower Fraser Valley covering Southwest British Columbia, Canada, and Northwest Washington, United States, for a historical pollution episode in 1993. To understand the impact of the assumed enthalpy of vaporization on modeling results, its impact on instantaneous organic aerosol yields (IAY) through partitioning coefficients is analysed first. The analysis shows that utilizing different enthalpy of vaporization values causes changes in the shapes of IAY curves and in the response of SOA formation capability of reactive organic gases to temperature variations. 
These changes are then carried into the air quality model and cause substantial changes in the organic aerosol modeling results. In another respect, using different assumed factors to convert measured organic carbon to organic aerosol concentrations causes substantial variations in the processed ambient data themselves, which are normally used as performance targets for model evaluations. The combination of uncertainties in the modeling results and in the moving performance targets causes major uncertainties in the final conclusion about the model performance. Without further information, the best a modeler can do is to choose a combination of the assumed values from the sensible parameter ranges available in the literature, based on the best match of the modeling results with the processed measurement data. However, the best match of the modeling results with the processed measurement data does not necessarily guarantee that the model itself is rigorous and the model performance is robust. Conclusions on the model performance can only be reached with sufficient understanding of the uncertainties and their impact.
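The enthalpy-of-vaporization assumption enters the partitioning coefficients through their temperature dependence. A sketch of the Clausius-Clapeyron form used in many SOA partitioning modules is below; the 42 and 156 kJ/mol values are illustrative endpoints of the range found in the literature, not the values used by this study:

```python
import math

R_GAS = 8.314  # J mol^-1 K^-1

def kp_at_temperature(kp_ref, t, t_ref=298.0, dh_vap=42000.0):
    """Temperature-correct a gas-particle partitioning coefficient:
    Kp(T) = Kp(Tref) * (T/Tref) * exp[(dHvap/R) * (1/T - 1/Tref)].
    dh_vap (J/mol) is the assumed enthalpy of vaporization -- the very
    parameter whose choice the abstract flags as consequential."""
    return kp_ref * (t / t_ref) * math.exp(dh_vap / R_GAS * (1.0 / t - 1.0 / t_ref))

# At 288 K, the partitioning shift differs substantially between a low
# and a high enthalpy assumption, changing predicted SOA yields.
shift_low = kp_at_temperature(1.0, 288.0, dh_vap=42000.0)
shift_high = kp_at_temperature(1.0, 288.0, dh_vap=156000.0)
```

Because Kp feeds directly into the instantaneous aerosol yield, a several-fold difference in this temperature correction propagates into the yield curves discussed above.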
A mixed integer bi-level DEA model for bank branch performance evaluation by Stackelberg approach
NASA Astrophysics Data System (ADS)
Shafiee, Morteza; Lotfi, Farhad Hosseinzadeh; Saleh, Hilda; Ghaderi, Mehdi
2016-03-01
One of the most complicated decision-making problems for managers is the evaluation of bank performance, which involves various criteria. There are many studies in the literature on bank efficiency evaluation by network DEA. These studies do not focus on multi-level networks. Wu (Eur J Oper Res 207:856-864, 2010) proposed a bi-level structure for cost efficiency for the first time. In that model, multi-level programming and cost efficiency were used, and a nonlinear program was solved. In this paper, we have focused on the multi-level structure and proposed a bi-level DEA model, which we solve by linear programming. Moreover, we significantly improved the way the optimal solution is reached in comparison with the work of Wu (2010) by converting the NP-hard nonlinear program into a mixed-integer linear program. This study uses a bi-level programming data envelopment analysis model that embodies an internal structure with Stackelberg-game relationships to evaluate the performance of a banking chain. The perspective of decentralized decisions is taken in this paper to cope with complex interactions in the banking chain. The results derived from bi-level programming DEA can provide valuable insights and detailed information for managers to help them evaluate the performance of the banking chain as a whole using Stackelberg-game relationships. Finally, the model was applied to an Iranian bank to evaluate cost efficiency.
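For intuition about what DEA efficiency scores measure (the plain single-level case, not the paper's bi-level Stackelberg model): with one input and one output, constant-returns-to-scale (CCR) efficiency reduces to normalizing each unit's output/input ratio by the best observed ratio:

```python
def ccr_efficiency_single(inputs, outputs):
    """CCR DEA efficiency for the special case of one input and one
    output per decision-making unit: each unit's output/input ratio
    divided by the best ratio observed. General DEA with multiple
    inputs/outputs instead solves a linear program per unit."""
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]

# Three hypothetical bank branches: staff count in, loans issued out.
eff = ccr_efficiency_single([2, 4, 8], [10, 20, 20])
```

The bi-level model in the paper extends this idea so that a leader stage and a follower stage of the banking chain are evaluated jointly under Stackelberg-game constraints rather than independently.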
10 CFR 431.445 - Determination of small electric motor efficiency.
Code of Federal Regulations, 2012 CFR
2012-01-01
... statistical analysis, computer simulation or modeling, or other analytic evaluation of performance data. (3... statistical analysis, computer simulation or modeling, and other analytic evaluation of performance data on.... (ii) If requested by the Department, the manufacturer shall conduct simulations to predict the...
Development of task network models of human performance in microgravity
NASA Technical Reports Server (NTRS)
Diaz, Manuel F.; Adam, Susan
1992-01-01
This paper discusses the utility of task-network modeling for quantifying human performance variability in microgravity. The data are gathered for: (1) improving current methodologies for assessing human performance and workload in the operational space environment; (2) developing tools for assessing alternative system designs; and (3) developing an integrated set of methodologies for the evaluation of performance degradation during extended duration spaceflight. The evaluation entailed an analysis of the Remote Manipulator System payload-grapple task performed on many shuttle missions. Task-network modeling can be used as a tool for assessing and enhancing human performance in man-machine systems, particularly for modeling long-duration manned spaceflight. Task-network modeling can be directed toward improving system efficiency by increasing the understanding of basic capabilities of the human component in the system and the factors that influence these capabilities.
Gokhale, Sharad; Raokhande, Namita
2008-05-01
There are several models that can be used to evaluate roadside air quality. Comparison of the operational performance of different models under local conditions is desirable so that the model that performs best can be identified. Three air quality models, namely the 'modified General Finite Line Source Model' (M-GFLSM) of particulates, the 'California Line Source' (CALINE3) model, and the 'California Line Source for Queuing & Hot Spot Calculations' (CAL3QHC) model, have been identified for evaluating the air quality at one of the busiest traffic intersections in the city of Guwahati. These models have been evaluated statistically with the vehicle-derived airborne particulate mass emissions in two sizes, i.e. PM10 and PM2.5, the prevailing meteorology, and the temporal distribution of the measured daily average PM10 and PM2.5 concentrations in wintertime. The study has shown that the CAL3QHC model makes better predictions than the other models for varied meteorology and traffic conditions. The detailed study reveals that agreement between the measured and the modeled PM10 and PM2.5 concentrations has been reasonably good for the CALINE3 and CAL3QHC models. Further detailed analysis shows that the CAL3QHC model performed better than the CALINE3 model. The monthly performance measures have also led to similar results. These two models also performed well across wind-speed classes except for low winds (<1 m s(-1)), for which the M-GFLSM model showed a tendency toward better performance for PM10. Nevertheless, the CAL3QHC model outperformed the others for both particulate sizes and all wind classes, and can therefore be an option for air quality assessment at urban traffic intersections.
An urban energy performance evaluation system and its computer implementation.
Wang, Lei; Yuan, Guan; Long, Ruyin; Chen, Hong
2017-12-15
To improve the urban environment and effectively reflect and promote urban energy performance, an urban energy performance evaluation system was constructed, thereby strengthening urban environmental management capabilities. From the perspectives of internalization and externalization, a framework of evaluation indicators and key factors that determine urban energy performance and explain the differences in performance was proposed according to established theory and previous studies. Using the improved stochastic frontier analysis method, an urban energy performance evaluation and factor analysis model was built that brings performance evaluation and factor analysis into the same stage of analysis. According to data obtained for the Chinese provincial capitals from 2004 to 2013, the coefficients of the evaluation indicators and key factors were calculated by the urban energy performance evaluation and factor analysis model. These coefficients were then used to compile the program file. The urban energy performance evaluation system developed in this study was designed in three parts: a database, a distributed component server, and a human-machine interface. Its functions were designed as login, addition, edit, input, calculation, analysis, comparison, inquiry, and export. On the basis of these contents, an urban energy performance evaluation system was developed using Microsoft Visual Studio .NET 2015. The system can effectively reflect the status of and any changes in urban energy performance. Beijing was considered as an example to conduct an empirical study, which further verified the applicability and convenience of this evaluation system. Copyright © 2017 Elsevier Ltd. All rights reserved.
Modelling invasion for a habitat generalist and a specialist plant species
Evangelista, P.H.; Kumar, S.; Stohlgren, T.J.; Jarnevich, C.S.; Crall, A.W.; Norman, J. B.; Barnett, D.T.
2008-01-01
Predicting suitable habitat and the potential distribution of invasive species is a high priority for resource managers and systems ecologists. Most models are designed to identify habitat characteristics that define the ecological niche of a species with little consideration to individual species' traits. We tested five commonly used modelling methods on two invasive plant species, the habitat generalist Bromus tectorum and habitat specialist Tamarix chinensis, to compare model performances, evaluate predictability, and relate results to distribution traits associated with each species. Most of the tested models performed similarly for each species; however, the generalist species proved to be more difficult to predict than the specialist species. The highest area under the receiver-operating characteristic curve values with independent validation data sets of B. tectorum and T. chinensis were 0.503 and 0.885, respectively. Similarly, a confusion matrix for B. tectorum had the highest overall accuracy of 55%, while the overall accuracy for T. chinensis was 85%. Models for the generalist species had varying performances, poor evaluations, and inconsistent results. This may be a result of a generalist's capability to persist in a wide range of environmental conditions that are not easily defined by the data, independent variables or model design. Models for the specialist species had consistently strong performances, high evaluations, and similar results among different model applications. This is likely a consequence of the specialist's requirement for explicit environmental resources and ecological barriers that are easily defined by predictive models. Although defining new invaders as generalist or specialist species can be challenging, model performances and evaluations may provide valuable information on a species' potential invasiveness.
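The two headline numbers in this abstract come from standard presence/absence evaluation: area under the ROC curve on an independent validation set, and overall accuracy from a confusion matrix at a chosen threshold. A minimal NumPy sketch, with invented validation data and the rank (Mann-Whitney) formulation of AUC assumed rather than taken from the paper:

```python
import numpy as np

def auc_score(y_true, y_prob):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    pos, neg = y_prob[y_true == 1], y_prob[y_true == 0]
    # fraction of (positive, negative) pairs ranked correctly; ties count half
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 1, 0, 0, 0, 1, 0]                  # presence/absence at validation plots
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]  # modelled habitat-suitability scores

auc = auc_score(y_true, y_prob)                    # 0.5 would mean no better than chance
acc = np.mean((np.asarray(y_prob) >= 0.5) == np.asarray(y_true))  # accuracy at 0.5 cutoff
```

An AUC near 0.5, as reported for B. tectorum, means the model ranks presences above absences no better than a coin flip, which is why the generalist was judged hard to predict.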
Towards systematic evaluation of crop model outputs for global land-use models
NASA Astrophysics Data System (ADS)
Leclere, David; Azevedo, Ligia B.; Skalský, Rastislav; Balkovič, Juraj; Havlík, Petr
2016-04-01
Land provides vital socioeconomic resources to society, however at the cost of large environmental degradation. Global integrated models combining high-resolution global gridded crop models (GGCMs) and global economic models (GEMs) are increasingly being used to inform sustainable solutions for agricultural land use. However, little effort has yet been made to evaluate and compare the accuracy of GGCM outputs. In addition, GGCM datasets require a large number of parameters whose values and variability across space are weakly constrained, and increasing the accuracy of such datasets has a very high computing cost. Innovative evaluation methods are required both to lend credibility to the global integrated models and to allow efficient parameter specification of GGCMs. We propose an evaluation strategy for GGCM datasets from the perspective of use in GEMs, illustrated with preliminary results from a novel dataset (the Hypercube) generated by the EPIC GGCM and used in the GLOBIOM land use GEM to inform on present-day crop yield, water and nutrient input needs for 16 crops x 15 management intensities, at a spatial resolution of 5 arc-minutes. We adopt the following principle: evaluation should provide a transparent diagnosis of model adequacy for its intended use. We briefly describe how the Hypercube data is generated and how it articulates with GLOBIOM in order to transparently identify the performances to be evaluated, as well as the main assumptions and data processing involved. Expected performances include adequately representing the sub-national heterogeneity in crop yield and input needs: i) in space, ii) across crop species, and iii) across management intensities. We will present and discuss measures of these expected performances and weigh the relative contributions of the crop model, input data and data processing steps. We will also compare obtained yield gaps and main yield-limiting factors against the M3 dataset.
Next steps include iterative improvement of parameter assumptions and evaluation of implications of GGCM performances for intended use in the IIASA EPIC-GLOBIOM model cluster. Our approach helps targeting future efforts at improving GGCM accuracy and would achieve highest efficiency if combined with traditional field-scale evaluation and sensitivity analysis.
Katoh, Masakazu; Hamajima, Fumiyasu; Ogasawara, Takahiro; Hata, Ken-Ichiro
2009-06-01
A validation study of an in vitro skin irritation testing method using a reconstructed human skin model has been conducted by the European Centre for the Validation of Alternative Methods (ECVAM), and a protocol using EpiSkin (SkinEthic, France) has been approved. The structural and performance criteria of skin models for testing are defined in the ECVAM Performance Standards announced along with the approval. We have performed several evaluations of the new reconstructed human epidermal model LabCyte EPI-MODEL, and confirmed that it is applicable to skin irritation testing as defined in the ECVAM Performance Standards. We selected 19 materials (nine irritants and ten non-irritants) available in Japan as test chemicals among the 20 reference chemicals described in the ECVAM Performance Standards. A test chemical was applied to the surface of the LabCyte EPI-MODEL for 15 min, after which it was completely removed and the model then post-incubated for 42 hr. Cell viability was measured by MTT assay and the skin irritancy of the test chemical evaluated. In addition, interleukin-1 alpha (IL-1alpha) concentration in the culture supernatant after post-incubation was measured to provide a complementary evaluation of skin irritation. Evaluation of the 19 test chemicals resulted in 79% accuracy, 78% sensitivity and 80% specificity, confirming that the in vitro skin irritancy of the LabCyte EPI-MODEL correlates highly with in vivo skin irritation. These results suggest that the LabCyte EPI-MODEL is applicable to the skin irritation testing protocol set out in the ECVAM Performance Standards.
NASA Astrophysics Data System (ADS)
Wang, Wen-Chuan; Chau, Kwok-Wing; Cheng, Chun-Tian; Qiu, Lin
2009-08-01
Developing a hydrological forecasting model based on past records is crucial to effective hydropower reservoir management and scheduling. Traditionally, time series analysis and modeling are used to build mathematical models for generating hydrologic records in hydrology and water resources. Artificial intelligence (AI), as a branch of computer science, is capable of analyzing long-series and large-scale hydrological data, and applying AI techniques to hydrological forecasting has become a prominent research topic in recent years. In this paper, autoregressive moving-average (ARMA) models, artificial neural network (ANN) approaches, adaptive neural-based fuzzy inference system (ANFIS) techniques, genetic programming (GP) models and the support vector machine (SVM) method are examined using long-term observations of monthly river flow discharges. Four standard quantitative statistical performance evaluation measures, the coefficient of correlation (R), the Nash-Sutcliffe efficiency coefficient (E), the root mean squared error (RMSE), and the mean absolute percentage error (MAPE), are employed to evaluate the performances of the various models developed. Two case study river sites are also provided to illustrate their respective performances. The results indicate that the best performance can be obtained by ANFIS, GP and SVM, in terms of different evaluation criteria, during the training and validation phases.
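The four goodness-of-fit measures named in this abstract are computed directly from paired observed and simulated flow series. A minimal sketch (NumPy only; the function name and data are illustrative, not from the study):

```python
import numpy as np

def evaluate(obs, sim):
    """R, Nash-Sutcliffe E, RMSE and MAPE for a pair of flow series."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]                 # coefficient of correlation
    e = 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)  # Nash-Sutcliffe
    rmse = np.sqrt(np.mean((obs - sim) ** 2))
    mape = 100.0 * np.mean(np.abs((obs - sim) / obs))  # requires non-zero observations
    return r, e, rmse, mape

# Illustrative monthly discharges (m3/s):
r, e, rmse, mape = evaluate([10.0, 12.0, 15.0, 11.0], [10.5, 11.5, 14.0, 11.2])
```

Note the measures disagree by design: R rewards linear association even with bias, E penalizes any departure from the 1:1 line, and RMSE/MAPE weight errors in absolute versus relative terms, which is why the paper reports all four.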
NASA Astrophysics Data System (ADS)
Ajami, H.; Sharma, A.; Lakshmi, V.
2017-12-01
Application of semi-distributed hydrologic modeling frameworks is a viable alternative to fully distributed hyper-resolution hydrologic models because they are computationally efficient while still resolving the fine-scale spatial structure of hydrologic fluxes and states. However, the fidelity of semi-distributed model simulations is impacted by (1) the formulation of hydrologic response units (HRUs), and (2) the aggregation of catchment properties for formulating simulation elements. Here, we evaluate the performance of a recently developed Soil Moisture and Runoff simulation Toolkit (SMART) for large catchment scale simulations. In SMART, topologically connected HRUs are delineated using thresholds obtained from topographic and geomorphic analysis of a catchment, and simulation elements are equivalent cross sections (ECSs) representative of a hillslope in first order sub-basins. Earlier investigations have shown that formulation of ECSs at the scale of a first order sub-basin reduces computational time significantly without compromising simulation accuracy. However, this approach has not been fully explored for catchment scale simulations. To assess SMART performance, we set up the model over the Little Washita watershed in Oklahoma. Model evaluations using in-situ soil moisture observations show satisfactory model performance. In addition, we evaluated the performance of a number of soil moisture disaggregation schemes recently developed to provide spatially explicit soil moisture outputs at fine scale resolution. Our results illustrate that the statistical disaggregation scheme performs significantly better than the methods based on topographic data. Future work is focused on assessing the performance of SMART using remotely sensed soil moisture observations and spatially based model evaluation metrics.
NASA Technical Reports Server (NTRS)
Al-Jaar, Robert Y.; Desrochers, Alan A.
1989-01-01
The main objective of this research is to develop a generic modeling methodology with a flexible and modular framework to aid in the design and performance evaluation of integrated manufacturing systems using a unified model. After a thorough examination of the available modeling methods, the Petri Net approach was adopted. The concurrent and asynchronous nature of manufacturing systems is easily captured by Petri Net models. Three basic modules were developed: machine, buffer, and Decision Making Unit. The machine and buffer modules are used for modeling transfer lines and production networks. The Decision Making Unit models the functions of a computer node in a complex Decision Making Unit architecture. The underlying model is a Generalized Stochastic Petri Net (GSPN) that can be used for performance evaluation and structural analysis. GSPNs were chosen because they help manage the complexity of modeling large manufacturing systems. There is no need to enumerate all the possible states of the Markov Chain since they are automatically generated from the GSPN model.
Four-dimensional evaluation of regional air quality models
We present highlights of the results obtained in the third phase of the Air Quality Model Evaluation International Initiative (AQMEII3). Activities in AQMEII3 were focused on evaluating the performance of global, hemispheric and regional modeling systems over Europe and North Ame...
Nagy, Christopher J; Fitzgerald, Brian M; Kraus, Gregory P
2014-01-01
Anesthesiology residency programs will be expected to have Milestones-based evaluation systems in place by July 2014 as part of the Next Accreditation System. The San Antonio Uniformed Services Health Education Consortium (SAUSHEC) anesthesiology residency program developed and implemented a Milestones-based feedback and evaluation system a year ahead of schedule. It has been named the Milestone-specific, Observed Data points for Evaluating Levels of performance (MODEL) assessment strategy. The "MODEL Menu" and the "MODEL Blueprint" are tools that other anesthesiology residency programs can use in developing their own Milestones-based feedback and evaluation systems prior to ACGME-required implementation. Data from our early experience with the streamlined MODEL blueprint assessment strategy showed substantially improved faculty compliance with reporting requirements. The MODEL assessment strategy provides programs with a workable assessment method for residents, and important Milestones data points to programs for ACGME reporting.
Evaluation of performance of distributed delay model for chemotherapy-induced myelosuppression.
Krzyzanski, Wojciech; Hu, Shuhua; Dunlavey, Michael
2018-04-01
A distributed delay model has been introduced that replaces the transit compartments in the classic model of chemotherapy-induced myelosuppression with a convolution integral. The maturation of granulocyte precursors in the bone marrow is described by the gamma probability density function with shape parameter ν. If ν is a positive integer, the distributed delay model coincides with the classic model with ν transit compartments. The purpose of this work was to evaluate the performance of the distributed delay model, with particular focus on deterministic model identifiability in the presence of the shape parameter. The classic model served as a reference for comparison. Previously published white blood cell (WBC) count data in rats receiving bolus doses of 5-fluorouracil were fitted by both models. The negative two log-likelihood objective function (-2LL) and running times were used as major markers of performance. A local sensitivity analysis was done to evaluate the impact of ν on the pharmacodynamic response (WBC count). The ν estimate was 1.46 with a CV% of 16.1%, compared to ν = 3 for the classic model. The difference of 6.78 in -2LL between the classic model and the distributed delay model implied that the latter performed significantly better than the former according to the log-likelihood ratio test (P = 0.009), although the overall performance was only modestly better. The running times were 1 s and 66.2 min, respectively. The long running time of the distributed delay model was attributed to the computationally intensive evaluation of the convolution integral. The sensitivity analysis revealed that ν strongly influences the WBC response by controlling cell proliferation and the elimination of WBCs from the circulation. In conclusion, the distributed delay model was deterministically identifiable from typical cytotoxic data. Its performance was modestly better than the classic model, with a significantly longer running time.
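Because the classic model (integer ν) is nested within the distributed delay model, the quoted P value follows from the reported -2LL difference via a chi-squared test on one degree of freedom; for df = 1 the survival function reduces to erfc(sqrt(x/2)), so the check needs only the standard library:

```python
import math

delta_neg2ll = 6.78                            # reported drop in -2LL for one extra parameter
p = math.erfc(math.sqrt(delta_neg2ll / 2.0))   # chi-squared survival function, df = 1
print(round(p, 3))                             # → 0.009
```

This reproduces the reported P = 0.009, confirming the two figures in the abstract are consistent.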
Model-centric distribution automation: Capacity, reliability, and efficiency
DOE Office of Scientific and Technical Information (OSTI.GOV)
Onen, Ahmet; Jung, Jaesung; Dilek, Murat; ...
2016-02-26
A series of analyses along with field validations that evaluate efficiency, reliability, and capacity improvements of model-centric distribution automation are presented. With model-centric distribution automation, the same model is used from design to real-time control calculations. A 14-feeder system with 7 substations is considered. The analyses involve hourly time-varying loads and annual load growth factors. Phase balancing and capacitor redesign modifications are used to better prepare the system for distribution automation, where the designs are performed considering time-varying loads. Coordinated control of load tap changing transformers, line regulators, and switched capacitor banks is considered. In evaluating distribution automation versus traditional system design and operation, quasi-steady-state power flow analysis is used. In evaluating distribution automation performance for substation transformer failures, reconfiguration for restoration analysis is performed. In evaluating distribution automation for storm conditions, Monte Carlo simulations coupled with reconfiguration for restoration calculations are used. As a result, the evaluations demonstrate that model-centric distribution automation has positive effects on system efficiency, capacity, and reliability.
An IPA-Embedded Model for Evaluating Creativity Curricula
ERIC Educational Resources Information Center
Chang, Chi-Cheng
2014-01-01
How to diagnose the effectiveness of creativity-related curricula is a crucial concern in the pursuit of educational excellence. This paper introduces an importance-performance analysis (IPA)-embedded model for curriculum evaluation, using the example of an IT project implementation course to assess the creativity performance deduced from student…
AN ANNUAL EVALUATION OF THE 2005 RELEASE OF MODELS-3 CMAQ
An annual operational performance evaluation of the 2005 release of Models-3 CMAQ v4.5 has been performed. The poster presented results from the winter and summer seasons for sulfate, nitrate, ammonium, elemental carbon, organic carbon, PM2.5 mass and AQS 8-hr maximum ozone. Stati...
The purpose of this report is to develop a database of physiological parameters needed for understanding and evaluating performance of the APEX and SHEDS exposure/intake dose rate model used by the Environmental Protection Agency (EPA) as part of its regulatory activities. The A...
Global Gridded Crop Model Evaluation: Benchmarking, Skills, Deficiencies and Implications.
NASA Technical Reports Server (NTRS)
Muller, Christoph; Elliott, Joshua; Chryssanthacopoulos, James; Arneth, Almut; Balkovic, Juraj; Ciais, Philippe; Deryng, Delphine; Folberth, Christian; Glotter, Michael; Hoek, Steven;
2017-01-01
Crop models are increasingly used to simulate crop yields at the global scale, but so far there is no general framework on how to assess model performance. Here we evaluate the simulation results of 14 global gridded crop modeling groups that have contributed historic crop yield simulations for maize, wheat, rice and soybean to the Global Gridded Crop Model Intercomparison (GGCMI) of the Agricultural Model Intercomparison and Improvement Project (AgMIP). Simulation results are compared to reference data at global, national and grid cell scales and we evaluate model performance with respect to time series correlation, spatial correlation and mean bias. We find that global gridded crop models (GGCMs) show mixed skill in reproducing time series correlations or spatial patterns at the different spatial scales. Generally, maize, wheat and soybean simulations of many GGCMs are capable of reproducing larger parts of observed temporal variability (time series correlation coefficients (r) of up to 0.888 for maize, 0.673 for wheat and 0.643 for soybean at the global scale) but rice yield variability cannot be well reproduced by most models. Yield variability can be well reproduced for most major producing countries by many GGCMs and for all countries by at least some. A comparison with gridded yield data and a statistical analysis of the effects of weather variability on yield variability shows that the ensemble of GGCMs can explain more of the yield variability than an ensemble of regression models for maize and soybean, but not for wheat and rice. We identify future research needs in global gridded crop modeling and for all individual crop modeling groups. In the absence of a purely observation-based benchmark for model evaluation, we propose that the best performing crop model per crop and region establishes the benchmark for all others, and modelers are encouraged to investigate how crop model performance can be increased. 
We make our evaluation system accessible to all crop modelers so that other modeling groups can also test their model performance against the reference data and the GGCMI benchmark.
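The three headline measures used in this kind of evaluation (time-series correlation, spatial correlation, mean bias) can be sketched for gridded yields; the array layout (years × grid cells), function name, and demo numbers are illustrative assumptions, not the GGCMI code:

```python
import numpy as np

def ggcm_skill(sim, ref):
    """sim, ref: arrays of shape (n_years, n_cells) of crop yields."""
    # time-series correlation per grid cell, averaged over cells
    ts_r = np.mean([np.corrcoef(sim[:, c], ref[:, c])[0, 1]
                    for c in range(sim.shape[1])])
    # spatial correlation of the time-mean yield pattern
    sp_r = np.corrcoef(sim.mean(axis=0), ref.mean(axis=0))[0, 1]
    bias = np.mean(sim - ref)          # mean bias, same units as yield
    return ts_r, sp_r, bias

# Identical patterns shifted by a constant: perfect correlations, bias = shift
ref = np.array([[1., 2., 3.], [2., 3., 4.], [3., 4., 5.], [4., 5., 6.]])
ts_r, sp_r, bias = ggcm_skill(ref + 0.5, ref)
```

Separating the temporal and spatial components matters: a model can reproduce the spatial yield pattern well (high sp_r) while missing year-to-year variability (low ts_r), as the abstract reports for rice.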
Confidence in the application of models for forecasting and regulatory assessments is furthered by conducting four types of model evaluation: operational, dynamic, diagnostic, and probabilistic. Operational model evaluation alone does not reveal the confidence limits that can be ...
Monte Carlo simulation of Ray-Scan 64 PET system and performance evaluation using GATE toolkit
NASA Astrophysics Data System (ADS)
Li, Suying; Zhang, Qiushi; Vuletic, Ivan; Xie, Zhaoheng; Yang, Kun; Ren, Qiushi
2017-02-01
In this study, we aimed to develop a GATE model for simulation of the Ray-Scan 64 PET scanner and to model its performance characteristics. A detailed implementation of the system geometry and physical processes was included in the simulation model. We then modeled the performance characteristics of the Ray-Scan 64 PET system for the first time, based on National Electrical Manufacturers Association (NEMA) NU-2 2007 protocols, and validated the model against experimental measurements, including spatial resolution, sensitivity, counting rates and noise equivalent count rate (NECR). Moreover, an accurate dead time module was investigated to simulate the counting rate performance. Overall, the results showed reasonable agreement between simulation and experimental data. The validation results showed the reliability and feasibility of the GATE model for evaluating the major performance characteristics of the Ray-Scan 64 PET system. It provides a useful tool for a wide range of research applications.
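The NECR validated above is conventionally defined from the true (T), scattered (S) and random (R) coincidence rates. The standard NEMA NU-2 form NECR = T²/(T + S + kR), with k = 2 when randoms are estimated from a delayed coincidence window, is assumed here rather than taken from the paper; the rates are invented for illustration:

```python
def necr(trues, scatters, randoms, delayed_randoms=False):
    """Noise equivalent count rate (counts/s) from coincidence rates."""
    k = 2.0 if delayed_randoms else 1.0
    return trues ** 2 / (trues + scatters + k * randoms)

# e.g. 200 kcps trues, 80 kcps scatter, 50 kcps randoms:
peak = necr(200e3, 80e3, 50e3)   # ≈ 121 kcps
```

NECR is useful precisely because it folds the noise contribution of scatter and randoms into a single effective count rate, which is why it is one of the headline NU-2 figures of merit.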
Information technology model for evaluating emergency medicine teaching
NASA Astrophysics Data System (ADS)
Vorbach, James; Ryan, James
1996-02-01
This paper describes work in progress to develop an Information Technology (IT) model and supporting information system for the evaluation of clinical teaching in the Emergency Medicine (EM) Department of North Shore University Hospital. In the academic hospital setting student physicians, i.e. residents, and faculty function daily in their dual roles as teachers and students respectively, and as health care providers. Databases exist that are used to evaluate both groups in either academic or clinical performance, but rarely has this information been integrated to analyze the relationship between academic performance and the ability to care for patients. The goal of the IT model is to improve the quality of teaching of EM physicians by enabling the development of integrable metrics for faculty and resident evaluation. The IT model will include (1) methods for tracking residents in order to develop experimental databases; (2) methods to integrate lecture evaluation, clinical performance, resident evaluation, and quality assurance databases; and (3) a patient flow system to monitor patient rooms and the waiting area in the Emergency Medicine Department, to record and display status of medical orders, and to collect data for analyses.
Multi-Fidelity Framework for Modeling Combustion Instability
2016-07-27
…generated from the reduced-domain dataset. Evaluations of the framework are performed based on simplified test problems for a model rocket combustor.
Implementation of an Integrated On-Board Aircraft Engine Diagnostic Architecture
NASA Technical Reports Server (NTRS)
Armstrong, Jeffrey B.; Simon, Donald L.
2012-01-01
An on-board diagnostic architecture for aircraft turbofan engine performance trending, parameter estimation, and gas-path fault detection and isolation has been developed and evaluated in a simulation environment. The architecture incorporates two independent models: a real-time self-tuning performance model providing parameter estimates and a performance baseline model for diagnostic purposes reflecting long-term engine degradation trends. This architecture was evaluated using flight profiles generated from a nonlinear model with realistic fleet engine health degradation distributions and sensor noise. The architecture was found to produce acceptable estimates of engine health and unmeasured parameters, and the integrated diagnostic algorithms were able to perform correct fault isolation in approximately 70 percent of the tested cases.
Quality of protection evaluation of security mechanisms.
Ksiezopolski, Bogdan; Zurek, Tomasz; Mokkas, Michail
2014-01-01
Recent research indicates that during the design of a teleinformatic system, a tradeoff between system performance and system protection should be made. The traditional approach assumes that the best way is to apply the strongest possible security measures. Unfortunately, overestimation of security measures can lead to an unreasonable increase in system load. This is especially important in multimedia systems, where performance is critical. In many cases, determining the required level of protection and adjusting security measures to these requirements increases system efficiency. Such an approach is achieved by means of quality of protection (QoP) models, in which security measures are evaluated according to their influence on system security. In this paper, we propose a model for QoP evaluation of security mechanisms. With this model, one can quantify the influence of particular security mechanisms on ensuring security attributes. The methodology of our model preparation is described, and based on it a case study analysis is presented. We support our method with a tool in which models can be defined and QoP evaluation can be performed. Finally, we have modelled the TLS cryptographic protocol and present the QoP evaluation of security mechanisms for selected versions of this protocol.
Scoring annual earthquake predictions in China
NASA Astrophysics Data System (ADS)
Zhuang, Jiancang; Jiang, Changsheng
2012-02-01
The Annual Consultation Meeting on Earthquake Tendency in China is held by the China Earthquake Administration (CEA) in order to provide one-year earthquake predictions over most of China. In these predictions, regions of concern are denoted together with the corresponding magnitude range of the largest earthquake expected during the next year. Evaluating the performance of these earthquake predictions is rather difficult, especially for regions that are of no concern, because the predictions are made on arbitrary regions with flexible magnitude ranges. In the present study, the gambling score is used to evaluate the performance of these earthquake predictions. Based on a reference model, this scoring method rewards successful predictions and penalizes failures according to the risk (probability of failure) that the predictors have taken. Using the Poisson model, which is spatially inhomogeneous and temporally stationary, with the Gutenberg-Richter law for earthquake magnitudes as the reference model, we evaluate the CEA predictions based on (1) a partial score that evaluates whether the alarmed regions are issued based on information that differs from the reference model (knowledge of the average seismicity level), and (2) a complete score that evaluates whether the overall performance of the prediction is better than the reference model. The predictions made by the Annual Consultation Meetings on Earthquake Tendency from 1990 to 2003 are found to include significant precursory information, but the overall performance is close to that of the reference model.
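In the gambling-score scheme described above, a predictor effectively bets against the reference model: an alarm that succeeds where the reference probability of the event is p earns (1 - p)/p points, while a failed alarm loses 1 point, so riskier successes on low-probability events are rewarded more. A minimal sketch under that convention (variable names and data are illustrative):

```python
def gambling_score(alarms, outcomes, ref_probs):
    """Total gambling score for a set of binary alarms.

    alarms[i]    : True if region i was alarmed
    outcomes[i]  : True if the target earthquake occurred in region i
    ref_probs[i] : reference-model probability of the event in region i
    """
    score = 0.0
    for a, o, p in zip(alarms, outcomes, ref_probs):
        if not a:
            continue                      # no bet placed, no reward or penalty
        score += (1.0 - p) / p if o else -1.0
    return score

# A successful alarm on a rare event (p = 0.1) outweighs two failed alarms:
s = gambling_score([True, True, True], [True, False, False], [0.1, 0.5, 0.5])
# s ≈ 9.0 - 1.0 - 1.0 = 7.0
```

A score near zero means the predictor has done no better than the reference seismicity model, which is essentially the study's finding for the overall CEA predictions.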
Kramer, Andrew A; Higgins, Thomas L; Zimmerman, Jack E
2015-02-01
To compare ICU performance using standardized mortality ratios generated by the Acute Physiology and Chronic Health Evaluation IVa and a National Quality Forum-endorsed methodology and examine potential reasons for model-based standardized mortality ratio differences. Retrospective analysis of day 1 hospital mortality predictions at the ICU level using Acute Physiology and Chronic Health Evaluation IVa and National Quality Forum models on the same patient cohort. Forty-seven ICUs at 36 U.S. hospitals from January 2008 to May 2013. Eighty-nine thousand three hundred fifty-three consecutive unselected ICU admissions. None. We assessed standardized mortality ratios for each ICU using data for patients eligible for Acute Physiology and Chronic Health Evaluation IVa and National Quality Forum predictions in order to compare unit-level model performance, differences in ICU rankings, and how case-mix adjustment might explain standardized mortality ratio differences. Hospital mortality was 11.5%. Overall standardized mortality ratio was 0.89 using Acute Physiology and Chronic Health Evaluation IVa and 1.07 using National Quality Forum, the latter having a widely dispersed and multimodal standardized mortality ratio distribution. Model exclusion criteria eliminated mortality predictions for 10.6% of patients for Acute Physiology and Chronic Health Evaluation IVa and 27.9% for National Quality Forum. The two models agreed on the significance and direction of standardized mortality ratio only 45% of the time. Four ICUs had standardized mortality ratios significantly less than 1.0 using Acute Physiology and Chronic Health Evaluation IVa, but significantly greater than 1.0 using National Quality Forum. Two ICUs had standardized mortality ratios exceeding 1.75 using National Quality Forum, but nonsignificant performance using Acute Physiology and Chronic Health Evaluation IVa. 
Stratification by patient and institutional characteristics indicated that units caring for more severely ill patients, and those with a higher percentage of patients on mechanical ventilation, had the most discordant standardized mortality ratios between the two predictive models. The Acute Physiology and Chronic Health Evaluation IVa and National Quality Forum models yield different ICU performance assessments due to differences in case-mix adjustment. Given the growing role of outcomes in driving prospective payment, patient referral, and public reporting, performance should be assessed by models with fewer exclusions, superior accuracy, and better case-mix adjustment.
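A standardized mortality ratio of the kind compared above is simply observed deaths over model-expected deaths; the significance statements rest on its confidence interval. A minimal sketch, assuming the common Poisson/log-normal approximation for the interval (the counts below are invented, not from the study):

```python
import math

def smr_with_ci(observed_deaths, expected_deaths, z=1.96):
    """Standardized mortality ratio with an approximate 95% CI.

    Uses the common approximation SMR = O / E with
    CI = SMR * exp(+/- z / sqrt(O)); other interval methods exist."""
    smr = observed_deaths / expected_deaths
    half = z / math.sqrt(observed_deaths)
    return smr, smr * math.exp(-half), smr * math.exp(half)

# An ICU with 120 observed deaths against 135 predicted by the severity model
smr, lo, hi = smr_with_ci(120, 135)
print(f"SMR {smr:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An SMR is "significantly less than 1.0" in the sense used above when the whole interval lies below 1; two models disagree when their exclusion rules and case-mix adjustment move O and E differently for the same unit.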
Measures of GCM Performance as Functions of Model Parameters Affecting Clouds and Radiation
NASA Astrophysics Data System (ADS)
Jackson, C.; Mu, Q.; Sen, M.; Stoffa, P.
2002-05-01
This abstract is one of three related presentations at this meeting dealing with several issues surrounding optimal parameter and uncertainty estimation of model predictions of climate. Uncertainty in model predictions of climate depends in part on the uncertainty produced by model approximations or parameterizations of unresolved physics. Evaluating these uncertainties is computationally expensive because one needs to evaluate how arbitrary choices for any given combination of model parameters affect model performance. Because the computational effort grows exponentially with the number of parameters being investigated, it is important to choose parameters carefully. Evaluating whether a parameter is worth investigating depends on two considerations: 1) do reasonable choices of parameter values produce a large range in model response relative to observational uncertainty? and 2) does the model response depend non-linearly on various combinations of model parameters? We have decided to narrow our attention to parameters that affect clouds and radiation, as it is likely that these parameters will dominate uncertainties in model predictions of future climate. We present preliminary results of ~20-30 AMIP II-style climate model integrations using NCAR's CCM3.10 that show model performance as functions of individual parameters controlling 1) the critical relative humidity for cloud formation (RHMIN) and 2) the boundary-layer critical Richardson number (RICR). We also explore various definitions of model performance that include some or all observational data sources (surface air temperature and pressure, meridional and zonal winds, clouds, long- and short-wave cloud forcings, etc.) and evaluate in a few select cases whether the model's response depends non-linearly on the parameter values we have selected.
Dynamic Evaluation of Long-Term Air Quality Model Simulations Over the Northeastern U.S.
Dynamic model evaluation assesses a modeling system's ability to reproduce changes in air quality induced by changes in meteorology and/or emissions. In this paper, we illustrate various approaches to dynamic model evaluation utilizing 18 years of air quality simulations perform...
NASA Astrophysics Data System (ADS)
Ko, P.; Kurosawa, S.
2014-03-01
The understanding and accurate prediction of the flow behaviour related to cavitation and pressure fluctuation in a Kaplan turbine are important to design work enhancing turbine performance, including extending the operational life span and improving turbine efficiency. In this paper, a high-accuracy turbine and cavitation performance prediction method based on the entire flow passage of a Kaplan turbine is presented and evaluated. The two-phase flow field is predicted by solving the Reynolds-averaged Navier-Stokes equations, using the volume-of-fluid method to track the free surface, combined with a Reynolds stress turbulence model. The growth and collapse of cavitation bubbles are modelled by the modified Rayleigh-Plesset equation. The prediction accuracy is evaluated by comparison with model test results for an Ns 400 Kaplan model turbine; the experimentally measured data, including turbine efficiency, cavitation performance, and pressure fluctuation, are accurately predicted. Furthermore, the occurrence of cavitation on the runner blade surface and its influence on the hydraulic loss of the flow passage are discussed. The evaluated prediction method for turbine flow and performance is introduced to facilitate future design and research work on Kaplan-type turbines.
NASA Astrophysics Data System (ADS)
Strader, Anne; Schorlemmer, Danijel; Beutin, Thomas
2017-04-01
The Global Earthquake Activity Rate Model (GEAR1) is a hybrid seismicity model, constructed from a log-linear combination of smoothed seismicity from the Global Centroid Moment Tensor (CMT) earthquake catalog and geodetic strain rates (Global Strain Rate Map, version 2.1). For the 2005-2012 retrospective evaluation period, GEAR1 outperformed both parent forecasts (strain rate and smoothed seismicity). Since 1 October 2015, GEAR1 has been prospectively evaluated by the Collaboratory for the Study of Earthquake Predictability (CSEP) testing center. Here, we present initial one-year test results for GEAR1, GSRM and GSRM2.1, as well as a localized evaluation of GEAR1 performance. The models were evaluated on the consistency in number (N-test), spatial (S-test) and magnitude (M-test) distribution of forecasted and observed earthquakes, as well as on overall data consistency (CL- and L-tests). Performance at target earthquake locations was compared between models using the classical paired T-test and its non-parametric equivalent, the W-test, to determine whether one model could be rejected in favor of another at the 0.05 significance level. For the evaluation period from 1 October 2015 to 1 October 2016, the GEAR1, GSRM and GSRM2.1 forecasts pass all CSEP likelihood tests. Comparative test results show statistically significant improvement of GEAR1 performance over both strain-rate-based forecasts, both of which can be rejected in favor of GEAR1. Using point-process residual analysis, we investigate the spatial distribution of differences in GEAR1, GSRM and GSRM2.1 model performance to identify regions where the GEAR1 model should be adjusted that could not be inferred from CSEP test results. Furthermore, we investigate whether the optimal combination of smoothed seismicity and strain rates remains stable over space and time.
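The N-test named above checks whether the observed earthquake count is consistent with the forecast's expected count under a Poisson assumption. A minimal sketch of the two-quantile formulation commonly used in CSEP testing (the forecast rate and count in the example are invented):

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson random variable with mean lam."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

def n_test(n_observed, n_forecast):
    """CSEP-style N-test quantile pair (Zechar-style formulation).

    delta1 probes whether the forecast rate is too low, delta2 whether
    it is too high; the forecast fails the one-sided pair at the 0.05
    level if either quantile falls below 0.025."""
    delta1 = 1.0 - (poisson_cdf(n_observed - 1, n_forecast) if n_observed > 0 else 0.0)
    delta2 = poisson_cdf(n_observed, n_forecast)
    return delta1, delta2

d1, d2 = n_test(9, 12.4)   # 9 observed target events vs. a forecast of 12.4
print(round(d1, 3), round(d2, 3))
```

The S- and M-tests follow the same pattern but score the spatial and magnitude distributions of the forecast via simulated likelihoods rather than the raw count.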
NASA Astrophysics Data System (ADS)
Stisen, S.; Demirel, C.; Koch, J.
2017-12-01
Evaluation of performance is an integral part of model development and calibration, and it is of paramount importance when communicating modelling results to stakeholders and the scientific community. The hydrological modelling community has a comprehensive and well-tested toolbox of metrics to assess temporal model performance. By contrast, experience in evaluating spatial performance has not kept pace with the wide availability of spatial observations or with the sophistication of model codes simulating the spatial variability of complex hydrological processes. This study aims to make a contribution towards advancing spatial-pattern-oriented model evaluation for distributed hydrological models, by introducing a novel spatial performance metric that provides robust pattern performance during model calibration. The promoted SPAtial EFficiency (SPAEF) metric reflects three equally weighted components: correlation, coefficient of variation, and histogram overlap. This multi-component approach is necessary in order to adequately compare spatial patterns. SPAEF, its three components individually, and two alternative spatial performance metrics (connectivity analysis and the fractions skill score) are tested in a spatial-pattern-oriented calibration of a catchment model in Denmark. The calibration is constrained by a remote-sensing-based spatial pattern of evapotranspiration and by discharge time series at two stations. Our results stress that stand-alone metrics tend to fail to provide holistic pattern information to the optimizer, which underlines the importance of multi-component metrics. The three SPAEF components are independent, which allows them to complement each other in a meaningful way.
This study promotes the use of bias-insensitive metrics, which allow the comparison of variables that are related but may differ in units, in order to optimally exploit the spatial observations made available by remote-sensing platforms. We see great potential for SPAEF across environmental disciplines dealing with spatially distributed modelling.
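The three-component metric described above can be sketched directly. This follows the published SPAEF form, 1 - sqrt((alpha-1)^2 + (beta-1)^2 + (gamma-1)^2), with alpha the Pearson correlation, beta the ratio of coefficients of variation, and gamma the histogram overlap of the z-scored fields; the synthetic fields and bin count below are assumptions for illustration:

```python
import numpy as np

def spaef(obs, sim, bins=100):
    """SPAtial EFficiency metric; 1 indicates a perfect pattern match."""
    obs, sim = np.ravel(obs), np.ravel(sim)
    alpha = np.corrcoef(obs, sim)[0, 1]                       # correlation
    beta = (np.std(sim) / np.mean(sim)) / (np.std(obs) / np.mean(obs))
    z_obs = (obs - obs.mean()) / obs.std()                    # z-scoring makes
    z_sim = (sim - sim.mean()) / sim.std()                    # units comparable
    lo, hi = min(z_obs.min(), z_sim.min()), max(z_obs.max(), z_sim.max())
    k, _ = np.histogram(z_obs, bins=bins, range=(lo, hi))
    l, _ = np.histogram(z_sim, bins=bins, range=(lo, hi))
    gamma = np.minimum(k, l).sum() / k.sum()                  # histogram overlap
    return 1.0 - np.sqrt((alpha - 1)**2 + (beta - 1)**2 + (gamma - 1)**2)

rng = np.random.default_rng(0)
et_obs = rng.gamma(4.0, 0.5, size=(50, 50))           # synthetic ET pattern
et_sim = et_obs * 1.1 + rng.normal(0, 0.2, (50, 50))  # biased, noisy simulation
print(round(spaef(et_obs, et_sim), 3))
```

Note that a purely multiplicative bias leaves all three components at 1: this is exactly the bias insensitivity the study promotes for comparing related variables that differ in units.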
NASA Astrophysics Data System (ADS)
Imran, H. M.; Kala, J.; Ng, A. W. M.; Muthukumaran, S.
2018-04-01
Appropriate choice of physics options among many physics parameterizations is important when using the Weather Research and Forecasting (WRF) model. The responses of different physics parameterizations of the WRF model may vary with geographical location, the application of interest, and the temporal and spatial scales being investigated. Several studies have evaluated the performance of the WRF model in simulating the mean climate and extreme rainfall events for various regions in Australia. However, no study has explicitly evaluated the sensitivity of the WRF model in simulating heatwaves. Therefore, this study evaluates the performance of a WRF multi-physics ensemble comprising 27 model configurations for a series of heatwave events in Melbourne, Australia. Unlike most previous studies, we evaluate not only temperature but also wind speed and relative humidity, which are key factors influencing heatwave dynamics. No single ensemble member explicitly showed the best performance across all events, all variables, and all evaluation metrics. This study also found that the choice of planetary boundary layer (PBL) scheme had the largest influence, the radiation scheme a moderate influence, and the microphysics scheme the least influence on temperature simulations. The PBL and microphysics schemes were found to be more sensitive than the radiation scheme for wind speed and relative humidity. Additionally, the study tested the role of the Urban Canopy Model (UCM) and three Land Surface Models (LSMs). Although the UCM did not play a significant role, the Noah LSM showed better performance than the CLM4 and Noah-MP LSMs in simulating the heatwave events. The study finally identifies an optimal configuration of WRF that will be a useful modelling tool for further investigations of heatwaves in Melbourne. Although our results are inevitably region-specific, they will be useful to WRF users investigating heatwave dynamics elsewhere.
The Evaluation of Hospital Performance in Iran: A Systematic Review Article
BAHADORI, Mohammadkarim; IZADI, Ahmad Reza; GHARDASHI, Fatemeh; RAVANGARD, Ramin; HOSSEINI, Seyed Mojtaba
2016-01-01
Background: This research aimed to systematically study and outline the methods of hospital performance evaluation used in Iran. Methods: In this systematic review, all Persian- and English-language articles published in Iranian and non-Iranian scientific journals indexed from Sep 2004 to Sep 2014 were studied. To find the related articles, the researchers searched the Iranian electronic databases, including SID, IranMedex, IranDoc and Magiran, as well as the non-Iranian electronic databases, including Medline, Embase, Scopus, and Google Scholar. For reviewing the selected articles, a data extraction form developed by the researchers was used. Results: The entire review process led to the selection of 51 articles. The publication of articles on hospital performance evaluation in Iran has increased considerably in recent years. Among these 51 articles, 38 (74.51%) had been published in Persian and 13 (25.49%) in English. Eight models were recognized as evaluation models for Iranian hospitals. In 15 studies, the data envelopment analysis model had been used to evaluate hospital performance. Conclusion: Using a combination of models to integrate indicators in the hospital evaluation process is unavoidable. Therefore, the Ministry of Health and Medical Education should use a set of indicators, such as the balanced scorecard, in the process of hospital evaluation and accreditation, and encourage hospital managers to use them. PMID:27516991
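Data envelopment analysis, the most frequently used model among the reviewed studies, rates each hospital against a best-practice frontier built from its peers. A minimal sketch of the input-oriented CCR formulation using SciPy's linear-programming routine; the hospitals, inputs and outputs below are invented for illustration:

```python
import numpy as np
from scipy.optimize import linprog

def dea_ccr_efficiency(inputs, outputs, unit):
    """Input-oriented CCR efficiency of one decision-making unit.

    inputs: (n_units, n_inputs), outputs: (n_units, n_outputs).
    Returns theta in (0, 1]; 1 means the unit lies on the frontier."""
    x, y = np.asarray(inputs, float), np.asarray(outputs, float)
    n = x.shape[0]
    c = np.r_[1.0, np.zeros(n)]                   # minimize theta
    # inputs: sum_j lam_j * x_j <= theta * x_unit
    a_in = np.c_[-x[unit][:, None], x.T]
    # outputs: sum_j lam_j * y_j >= y_unit
    a_out = np.c_[np.zeros((y.shape[1], 1)), -y.T]
    res = linprog(c, A_ub=np.vstack([a_in, a_out]),
                  b_ub=np.r_[np.zeros(x.shape[1]), -y[unit]],
                  bounds=[(0, None)] * (n + 1))
    return res.x[0]

# Hospitals: inputs = [beds, staff], output = [inpatient admissions]
beds_staff = [[100, 300], [150, 400], [120, 500]]
admissions = [[5000], [6000], [4500]]
print([round(dea_ccr_efficiency(beds_staff, admissions, u), 3) for u in range(3)])
```

Each hospital's score answers: by what factor could its inputs be shrunk while a convex-cone combination of peer hospitals still matches its outputs? This is why DEA handles multiple indicators without requiring explicit weights.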
A modular method for evaluating the performance of picture archiving and communication systems.
Sanders, W H; Kant, L A; Kudrimoti, A
1993-08-01
Modeling can be used to predict the performance of picture archiving and communication system (PACS) configurations under various load conditions at an early design stage. This is important because choices made early in the design of a system can have a significant impact on the performance of the resulting implementation. Because PACS consist of many types of components, it is important to do such evaluations in a modular manner, so that alternative configurations and designs can be easily investigated. Stochastic activity networks (SANs) and reduced base model construction methods can aid in doing this. SANs are a model type particularly suited to the evaluation of systems in which several activities may be in progress concurrently, and each activity may affect the others through the results of its completion. Together with SANs, reduced base model construction methods provide a means to build highly modular models, in which models of particular components can be easily reused. In this article, we investigate the use of SANs and reduced base model construction techniques in evaluating PACS. Construction and solution of the models is done using UltraSAN, a graphic-oriented software tool for model specification, analysis, and simulation. The method is illustrated via the evaluation of a realistically sized PACS for a typical United States hospital of 300 to 400 beds, and the derivation of system response times and component utilizations.
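A full stochastic activity network requires a tool such as UltraSAN, but the kind of quantity such models produce, e.g. the response time of one PACS component under load, can be illustrated with a minimal single-server queue simulation. The exponential assumptions and rates below are invented for illustration, not taken from the article:

```python
import random

def simulate_archive_queue(arrival_rate, service_rate, n_jobs, seed=1):
    """FIFO single-server queue approximating one PACS component
    (e.g. the image archive): exponential inter-arrivals and service.
    Returns the mean response time (waiting plus service)."""
    rng = random.Random(seed)
    t_arrive, free_at, total = 0.0, 0.0, 0.0
    for _ in range(n_jobs):
        t_arrive += rng.expovariate(arrival_rate)   # next request arrives
        start = max(t_arrive, free_at)              # wait if server busy
        free_at = start + rng.expovariate(service_rate)
        total += free_at - t_arrive                 # this job's response time
    return total / n_jobs

# At 50% utilization, M/M/1 theory predicts a mean response of
# 1 / (mu - lambda) = 2.0 time units.
print(round(simulate_archive_queue(0.5, 1.0, 200_000), 2))
```

In a modular evaluation, submodels like this one are composed so that alternative configurations (more archive servers, faster networks) can be swapped in without rebuilding the whole model.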
ERIC Educational Resources Information Center
Harlow, Lisa L.; Burkholder, Gary J.; Morrow, Jennifer A.
2002-01-01
Used a structural modeling approach to evaluate relations among attitudes, initial skills, and performance in a Quantitative Methods course that involved students in active learning. Results largely confirmed hypotheses offering support for educational reform efforts that propose actively involving students in the learning process, especially in…
DeltaSA tool for source apportionment benchmarking, description and sensitivity analysis
NASA Astrophysics Data System (ADS)
Pernigotti, D.; Belis, C. A.
2018-05-01
DeltaSA is an R package and a Java online tool developed at the EC Joint Research Centre to assist and benchmark source apportionment applications. Its key functionalities support two critical tasks in such studies: the assignment of a factor to a source in factor-analytical models (source identification) and model performance evaluation. Source identification is based on the similarity between a given factor and source chemical profiles from public databases. Model performance evaluation is based on statistical indicators used to compare model output with reference values generated in intercomparison exercises. The reference values are calculated as the ensemble average of the results reported by participants that have passed a set of testing criteria based on chemical-profile and time-series similarity. In this study, a sensitivity analysis of the model performance criteria is accomplished using the results of a synthetic dataset where "a priori" references are available. The consensus-modulated standard deviation punc proves to be the best choice for model performance evaluation when a conservative approach is adopted.
Distributed multi-criteria model evaluation and spatial association analysis
NASA Astrophysics Data System (ADS)
Scherer, Laura; Pfister, Stephan
2015-04-01
Model performance, if evaluated at all, is often communicated by a single indicator and at an aggregated level; this practice, however, does not embrace the trade-offs between different indicators or the inherent spatial heterogeneity of model efficiency. In this study, we simulated the water balance of the Mississippi watershed using the Soil and Water Assessment Tool (SWAT). The model was calibrated against monthly river discharge at 131 measurement stations. Its time series were bisected to allow for subsequent validation at the same gauges. Furthermore, the model was validated against evapotranspiration, which was available as a continuous raster based on remote sensing. The model performance was evaluated for each of the 451 sub-watersheds using four different criteria: 1) Nash-Sutcliffe efficiency (NSE), 2) percent bias (PBIAS), 3) root mean square error (RMSE) normalized to standard deviation (RSR), as well as 4) a combined indicator of the squared correlation coefficient and the linear regression slope (bR2). Conditions that might lead to poor model performance include aridity, very flat or very steep relief, snowfall, and dams, as indicated by previous research. In an attempt to explain spatial differences in model efficiency, the goodness of the model was spatially compared to these four phenomena by means of a bivariate spatial association measure which combines Pearson's correlation coefficient and Moran's index for spatial autocorrelation. In order to assess the model performance of the Mississippi watershed as a whole, three different averages of the sub-watershed results were computed by 1) applying equal weights, 2) weighting by the mean observed river discharge, 3) weighting by the upstream catchment area and the square root of the time series length. Ratings of model performance differed significantly in space and according to the efficiency criterion used.
The model performed much better in the humid eastern region than in the arid western region, which was confirmed by the high spatial association with the aridity index (ratio of mean annual precipitation to mean annual potential evapotranspiration). This association was still significant when controlling for slope, which manifested the second-highest spatial association. In line with these findings, the overall model efficiency of the entire Mississippi watershed appeared better when weighted by mean observed river discharge. Furthermore, the model received the highest rating with regard to PBIAS and was judged worst when considering NSE as the most comprehensive indicator. No universal performance indicator exists that considers all aspects of a hydrograph; therefore, sound model evaluation must take into account multiple criteria. Since model efficiency varies in space, which is masked by aggregated ratings, spatially explicit model goodness should be communicated as standard practice, at least as a measure of the spatial variability of the indicators. Furthermore, transparent documentation of the evaluation procedure, including the weighting used for aggregated model performance, is crucial but often lacking in published research. Finally, the high spatial association between model performance and aridity highlights the need to improve modelling schemes for arid conditions as a priority over other aspects that might weaken model goodness.
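The four efficiency criteria named above are short formulas. A minimal sketch of the standard definitions (NSE, PBIAS, RSR, and a common slope-weighted bR2 variant); the discharge series below is invented:

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, < 0 is worse than the mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim)**2) / np.sum((obs - obs.mean())**2)

def pbias(obs, sim):
    """Percent bias; positive values indicate model underestimation."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 100.0 * np.sum(obs - sim) / np.sum(obs)

def rsr(obs, sim):
    """RMSE normalized by the standard deviation of the observations."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return np.sqrt(np.mean((obs - sim)**2)) / obs.std()

def br2(obs, sim):
    """Squared correlation weighted by the regression slope b:
    |b| * R^2 for |b| <= 1, else R^2 / |b|."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r2 = np.corrcoef(obs, sim)[0, 1]**2
    b = np.polyfit(obs, sim, 1)[0]
    return abs(b) * r2 if abs(b) <= 1 else r2 / abs(b)

q_obs = np.array([10., 14., 30., 22., 12., 9., 8., 15.])   # monthly discharge
q_sim = np.array([11., 13., 26., 24., 13., 8., 9., 14.])
print(nse(q_obs, q_sim), pbias(q_obs, q_sim), rsr(q_obs, q_sim))
```

With these population statistics, RSR^2 = 1 - NSE, so the two criteria rank models identically, while PBIAS measures only the volume error, which is one reason ratings can disagree so strongly across criteria.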
NASA Technical Reports Server (NTRS)
Donohue, Paul F.
1987-01-01
The results of an aerodynamic performance evaluation of the National Aeronautics and Space Administration (NASA)/Ames Research Center Advanced Concepts Flight Simulator (ACFS), conducted in association with the Navy-NASA Joint Institute of Aeronautics, are presented. The ACFS is a full-mission flight simulator which provides an excellent platform for the critical evaluation of emerging flight systems and aircrew performance. The propulsion and flight dynamics models were evaluated using classical flight test techniques. The aerodynamic performance model of the ACFS was found to realistically represent that of current-day, medium-range transport aircraft. Recommendations are provided to enhance the capabilities of the ACFS to the level forecast for 1995 transport aircraft. The graphical and tabular results of this study will establish a performance section of the ACFS Operations Manual.
A computer program for condensing heat exchanger performance in the presence of noncondensable gases
NASA Technical Reports Server (NTRS)
Yendler, Boris
1994-01-01
A computer model has been developed which evaluates the performance of a heat exchanger. This model is general enough to be used to evaluate many heat exchanger geometries and a number of different operating conditions. The film approach is used to describe condensation in the presence of noncondensables. The model is also easily expanded to include other effects like fog formation or suction.
Pearce, J; Ferrier, S; Scotts, D
2001-06-01
To use models of species distributions effectively in conservation planning, it is important to determine the predictive accuracy of such models. Extensive modelling of the distribution of vascular plant and vertebrate fauna species within north-east New South Wales has been undertaken by linking field survey data to environmental and geographical predictors using logistic regression. These models have been used in the development of a comprehensive and adequate reserve system within the region. We evaluate the predictive accuracy of models for 153 small reptile, arboreal marsupial, diurnal bird and vascular plant species for which independent evaluation data were available. The predictive performance of each model was evaluated using the relative operating characteristic curve to measure discrimination capacity. Good discrimination ability implies that a model's predictions provide an acceptable index of species occurrence. The discrimination capacity of 89% of the models was significantly better than random, with 70% of the models providing high levels of discrimination. Predictions generated by this type of modelling therefore provide a reasonably sound basis for regional conservation planning. The discrimination ability of models was highest for the less mobile biological groups, particularly the vascular plants and small reptiles. In the case of diurnal birds, poor performing models tended to be for species which occur mainly within specific habitats not well sampled by either the model development or evaluation data, highly mobile species, species that are locally nomadic or those that display very broad habitat requirements. Particular care needs to be exercised when employing models for these types of species in conservation planning.
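The discrimination capacity reported above is the area under the operating characteristic curve. A minimal sketch via the rank (Mann-Whitney) formulation, with invented presence/absence records and predicted occurrence probabilities:

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney rank formulation.

    labels: 1 where the species was recorded present, 0 where absent;
    scores: the model's predicted probability of occurrence."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Fraction of presence/absence pairs the model ranks correctly
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Independent evaluation records vs. logistic-model predictions
present = [1, 1, 1, 0, 0, 0, 0]
predicted = [0.8, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1]
print(roc_auc(present, predicted))  # 1.0 = perfect discrimination, 0.5 = random
```

Because the AUC depends only on ranks, it measures whether predictions form an acceptable index of occurrence without requiring the probabilities themselves to be calibrated.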
Evaluating Internal Model Strength and Performance of Myoelectric Prosthesis Control Strategies.
Shehata, Ahmed W; Scheme, Erik J; Sensinger, Jonathon W
2018-05-01
On-going developments in myoelectric prosthesis control have provided prosthesis users with an assortment of control strategies that vary in reliability and performance. Many studies have focused on improving performance by providing feedback to the user, but have overlooked the effect of this feedback on internal model development, which is key to improving long-term performance. In this paper, the strength of the internal models developed for two commonly used myoelectric control strategies, raw control with raw feedback (using a regression-based approach) and filtered control with filtered feedback (using a classifier-based approach), was evaluated using two psychometric measures: trial-by-trial adaptation and just-noticeable difference. The performance of both strategies was also evaluated using a Schmidt-style target acquisition task. Results obtained from 24 able-bodied subjects showed that although filtered control with filtered feedback yielded better short-term performance in path efficiency, raw control with raw feedback resulted in stronger internal model development, which may lead to better long-term performance. Despite the inherent noise in the control signals of the regression controller, these findings suggest that the rich feedback associated with regression control may be used to improve human understanding of the myoelectric control system.
Systematic evaluation of atmospheric chemistry-transport model CHIMERE
NASA Astrophysics Data System (ADS)
Khvorostyanov, Dmitry; Menut, Laurent; Mailler, Sylvain; Siour, Guillaume; Couvidat, Florian; Bessagnet, Bertrand; Turquety, Solene
2017-04-01
Regional-scale atmospheric chemistry-transport models (CTMs) are used to develop air quality regulatory measures, to support environmentally sensitive decisions in industry, and to address a variety of scientific questions involving atmospheric composition. Model performance evaluation against measurement data is critical to understanding the models' limits and the degree of confidence in their results. The CHIMERE CTM (http://www.lmd.polytechnique.fr/chimere/) is a French national tool for operational forecasting and decision support and is widely used in the international research community in various areas of atmospheric chemistry and physics, climate, and environment (http://www.lmd.polytechnique.fr/chimere/CW-articles.php). This work presents the model evaluation framework applied systematically to new CHIMERE versions in the course of continuous model development. The framework uses three of the four CTM evaluation types identified by the Environmental Protection Agency (EPA) and the American Meteorological Society (AMS): operational, diagnostic, and dynamic. It makes it possible to compare overall model performance across subsequent model versions (operational evaluation), identify specific processes and/or model inputs that could be improved (diagnostic evaluation), and test the model's sensitivity to changes in air quality, such as emission reductions and meteorological events (dynamic evaluation). The observation datasets currently used for the evaluation are EMEP (surface concentrations), AERONET (optical depths), and WOUDC (ozone sounding profiles). The framework is implemented as an automated processing chain and allows interactive exploration of the results via a web interface.
Evaluation of Generation Alternation Models in Evolutionary Robotics
NASA Astrophysics Data System (ADS)
Oiso, Masashi; Matsumura, Yoshiyuki; Yasuda, Toshiyuki; Ohkura, Kazuhiro
For efficient implementation of Evolutionary Algorithms (EAs) in a desktop grid computing environment, we propose a new generation alternation model called Grid-Oriented-Deletion (GOD) and compare it with conventional techniques. In previous research, generation alternation models have generally been evaluated using test functions; their exploration performance on real problems such as Evolutionary Robotics (ER) has not been made clear. We therefore investigate the relationship between the exploration performance of an EA on an ER problem and its generation alternation model. We applied four generation alternation models to Evolutionary Multi-Robotics (EMR), a package-pushing problem, to investigate their exploration performance. The results show that GOD is more effective than the other conventional models.
Wave and Wind Model Performance Metrics Tools
NASA Astrophysics Data System (ADS)
Choi, J. K.; Wang, D. W.
2016-02-01
Continual improvements and upgrades of Navy ocean wave and wind models are essential to assuring battlespace-environment predictability of ocean surface wave and surf conditions in support of Naval global operations. Constant verification and validation of model performance is thus equally essential to assure the progress of model developments and maintain confidence in the predictions. Global- and regional-scale model evaluations may require large areas and long periods of time. As observational data to compare against, altimeter winds and waves along the tracks of past and current operational satellites, as well as moored/drifting buoys, can be used for global and regional coverage. Using data and model runs from previous trials, such as the planned Dynamics of the Adriatic in Real Time (DART) experiment, we demonstrated the use of altimeter wind and wave data accumulated over several years to obtain an objective evaluation of the performance of the SWAN (Simulating Waves Nearshore) model running in the Adriatic Sea. The assessment provided a detailed picture of wind- and wave-model performance using maps of cell-averaged statistical variables, with spatial statistics including slope, correlation, and scatter index to summarize model performance. Such a methodology is easily generalized to other regions and to global scales. Operational technology currently used by subject-matter experts evaluating the Navy Coastal Ocean Model and the Hybrid Coordinate Ocean Model can be expanded to evaluate wave and wind models using tools developed for ArcMAP, a GIS application developed by ESRI. The recent inclusion of altimeter and buoy data in a format passed through the Naval Oceanographic Office's (NAVOCEANO) quality-control system, together with netCDF standards applicable to all model output, makes the fusion of these data and direct model verification possible.
Also, procedures were developed for the accumulation of match-ups of modelled and observed parameters to form a database with which statistics are readily calculated, for the short or long term. Such a system has the potential for a quick transition to operations at NAVOCEANO.
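The spatial statistics listed above (slope, correlation, scatter index) are bulk statistics over collocated model/observation match-ups. A minimal sketch with invented altimeter/SWAN match-ups, taking the scatter index as RMSE normalized by the observed mean (one common convention) and the slope as a regression through the origin:

```python
import math

def wave_stats(obs, mod):
    """Bulk comparison statistics for collocated match-ups: bias, RMSE,
    linear correlation, regression slope through the origin, and the
    scatter index (RMSE normalized by the observed mean)."""
    n = len(obs)
    mo, mm = sum(obs) / n, sum(mod) / n
    bias = mm - mo
    rmse = math.sqrt(sum((m - o)**2 for o, m in zip(obs, mod)) / n)
    cov = sum((o - mo) * (m - mm) for o, m in zip(obs, mod)) / n
    so = math.sqrt(sum((o - mo)**2 for o in obs) / n)
    sm = math.sqrt(sum((m - mm)**2 for m in mod) / n)
    corr = cov / (so * sm)
    slope = sum(o * m for o, m in zip(obs, mod)) / sum(o * o for o in obs)
    si = rmse / mo
    return {"bias": bias, "rmse": rmse, "corr": corr, "slope": slope, "si": si}

# Altimeter significant wave heights vs. collocated SWAN output (metres)
hs_alt = [1.2, 2.5, 3.1, 0.8, 1.9]
hs_swan = [1.1, 2.7, 2.9, 1.0, 2.0]
print(wave_stats(hs_alt, hs_swan))
```

Computing these per grid cell and mapping them is what turns a single skill number into the cell-averaged performance maps described above.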
NASA Technical Reports Server (NTRS)
Taylor, B. K.; Casasent, D. P.
1989-01-01
The use of simplified error models to accurately simulate and evaluate the performance of an optical linear-algebra processor is described. The optical architecture used to perform banded matrix-vector products is reviewed, along with a linear dynamic finite-element case study. The laboratory hardware and ac-modulation technique used are presented. The individual processor error-source models and their simulator implementation are detailed. Several significant simplifications are introduced to ease the computational requirements and complexity of the simulations. The error models are verified with a laboratory implementation of the processor, and are used to evaluate its potential performance.
Closed-form solutions of performability. [in computer systems
NASA Technical Reports Server (NTRS)
Meyer, J. F.
1982-01-01
It is noted that if computing system performance is degradable, then system evaluation must deal simultaneously with aspects of both performance and reliability. One approach is the evaluation of a system's performability which, relative to a specified performance variable Y, generally requires solution of the probability distribution function of Y. The feasibility of closed-form solutions of performability when Y is continuous is examined. In particular, the modeling of a degradable buffer/multiprocessor system is considered whose performance Y is the (normalized) average throughput rate realized during a bounded interval of time. Employing an approximate decomposition of the model, it is shown that a closed-form solution can indeed be obtained.
Optimization of a reversible hood for protecting a pedestrian's head during car collisions.
Huang, Sunan; Yang, Jikuang
2010-07-01
This study evaluated and optimized the performance of a reversible hood (RH) for preventing head injuries to adult pedestrians in car collisions. The FE model of a production car front was introduced and validated. The baseline RH was developed from the original hood in the validated car-front model. To evaluate the protective performance of the baseline RH, FE models of an adult headform and a 50th-percentile human head were used in parallel to impact the baseline RH. Based on this evaluation, the response surface method was applied to optimize the RH in terms of material stiffness, lifting speed, and lifted height. Finally, the headform model and the human head model were again used to evaluate the protective performance of the optimized RH. The lifted baseline RH clearly reduced the impact responses of the headform model and the human head model compared with the retracted and lifting states of the baseline RH. When the optimized RH was lifted, the HIC values of the headform model and the human head model were further reduced to well below 1000, so that the risk of pedestrian head injury meets the requirements of EEVC WG17. Copyright 2009 Elsevier Ltd. All rights reserved.
This paper presents an analysis of the CMAQ v4.5 model performance for particulate matter and its chemical components for the simulated year 2001. This is the second paper in a two-part series examining the model performance of CMAQ v4.5.
A critical evaluation of various turbulence models as applied to internal fluid flows
NASA Technical Reports Server (NTRS)
Nallasamy, M.
1985-01-01
Models employed in the computation of turbulent flows are described, and their application to internal flows is evaluated by examining the predictions of various turbulence models in selected flow configurations. The main conclusions are: (1) the k-epsilon model is used in the majority of the two-dimensional flow calculations reported in the literature; (2) modified forms of the k-epsilon model improve the performance for flows with streamline curvature and heat transfer; (3) for flows with swirl, the k-epsilon model performs rather poorly, and the algebraic stress model performs better in this case; and (4) for flows with regions of secondary flow (noncircular duct flows), the algebraic stress model performs fairly well for fully developed flow, but for developing flow its performance is not good and a Reynolds stress model should be used. False diffusion and inlet boundary conditions are discussed. Countergradient transport and its implications for turbulence modeling are mentioned. Two examples of recirculating flow predictions obtained using the PHOENICS code are discussed. The vortex method, large eddy simulation (modeling of subgrid-scale Reynolds stresses), and direct simulation are considered. Some recommendations for improving model performance are made. The need for detailed experimental data in flows with strong curvature is emphasized.
A Spectral Evaluation of Models Performances in Mediterranean Oak Woodlands
NASA Astrophysics Data System (ADS)
Vargas, R.; Baldocchi, D. D.; Abramowitz, G.; Carrara, A.; Correia, A.; Kobayashi, H.; Papale, D.; Pearson, D.; Pereira, J.; Piao, S.; Rambal, S.; Sonnentag, O.
2009-12-01
Ecosystem processes are influenced by climatic trends at multiple temporal scales, including diel patterns and other mid-term climatic modes such as interannual and seasonal variability. Because interactions between the biophysical components of ecosystem processes are complex, it is important to test how models perform in the frequency domain (e.g. hours, days, weeks, months, years) and in the time domain (i.e. day of the year), in addition to traditional tests of annual or monthly sums. Here we present a spectral evaluation, using wavelet time series analysis, of model performance at seven Mediterranean oak woodland sites comprising three deciduous and four evergreen sites. We tested the performance of five models (CABLE, ORCHIDEE, BEPS, Biome-BGC, and JULES) against measured gross primary production (GPP) and evapotranspiration (ET). In general, model performance fails at intermediate periods (e.g. weeks to months), likely because these models do not represent the water-pulse dynamics that influence GPP and ET in these Mediterranean systems. To improve a model's performance it is critical first to identify where and when the model fails. Only by identifying where a model fails can we improve its performance, use it as a prognostic tool, and generate further hypotheses that can be tested by new experiments and measurements.
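The idea of evaluating a model separately per timescale can be sketched without a full wavelet transform. Below, a simple difference-of-moving-averages decomposition (a crude stand-in for the wavelet analysis used in the study; the window lengths are arbitrary choices) splits a series into additive timescale bands whose model-observation correlations can then be compared band by band:

```python
def moving_average(x, w):
    """Centered moving average with odd window w; edges use partial windows."""
    h, n = w // 2, len(x)
    return [sum(x[max(0, i - h):min(n, i + h + 1)]) /
            len(x[max(0, i - h):min(n, i + h + 1)]) for i in range(n)]

def band_components(x, windows=(3, 9, 27)):
    """Split a series into timescale bands: fastest = x - MA(3),
    intermediate bands = MA(w_k) - MA(w_{k+1}), slowest = MA(27).
    The bands sum back to the original series."""
    smooths = [moving_average(x, w) for w in windows]
    bands = [[a - b for a, b in zip(x, smooths[0])]]
    for s1, s2 in zip(smooths, smooths[1:]):
        bands.append([a - b for a, b in zip(s1, s2)])
    bands.append(smooths[-1])
    return bands

def corr(a, b):
    """Pearson correlation, applied band by band to model vs. observation."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

# Invented daily series; the decomposition is exact (bands sum to x)
x = [float(i % 7) for i in range(40)]
bands = band_components(x)
recon = [sum(vals) for vals in zip(*bands)]
```

A model may correlate well with observations in the slowest band (seasonal cycle) yet poorly in an intermediate band, which is exactly the weeks-to-months failure mode the abstract identifies.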
PERFORMANCE EVALUATION OF TYPE I MARINE SANITATION DEVICES
This performance test was designed to evaluate the effectiveness of two Type I Marine Sanitation Devices (MSDs): the Electro Scan Model EST 12, manufactured by Raritan Engineering Company, Inc., and the Thermopure-2, manufactured by Gross Mechanical Laboratories, Inc. Performance...
Performance evaluation and placement analysis of w-beam guardrails behind curbs.
DOT National Transportation Integrated Search
2014-12-15
This report summarizes the research efforts of using finite element modeling and simulations to evaluate the performance of NCDOT W-beam guardrails behind curbs under MASH TL-2 impact conditions. A literature review is included on performance eva...
NASA Astrophysics Data System (ADS)
Posselt, D.; L'Ecuyer, T.; Matsui, T.
2009-05-01
Cloud resolving models are typically used to examine the characteristics of clouds and precipitation and their relationship to radiation and the large-scale circulation. As such, they are not required to reproduce the exact location of each observed convective system, much less each individual cloud. Some of the most relevant information about clouds and precipitation is provided by instruments on polar-orbiting satellite platforms, but these observations are intermittent "snapshots" in time, making assessment of model performance challenging. In contrast to direct comparison, model results can be evaluated statistically. This avoids the requirement for the model to reproduce the observed systems, while returning valuable information on the performance of the model in a climate-relevant sense. The focus of this talk is a model evaluation study in which updates to the microphysics scheme used in a three-dimensional version of the Goddard Cumulus Ensemble (GCE) model are evaluated using statistics of observed clouds, precipitation, and radiation. We present the results of multiday (non-equilibrium) simulations of organized deep convection using single- and double-moment versions of the model's cloud microphysical scheme. Statistics of TRMM multi-sensor derived clouds, precipitation, and radiative fluxes are used to evaluate the GCE results, as are simulated TRMM measurements obtained using a sophisticated instrument simulator suite. We present advantages and disadvantages of performing model comparisons in retrieval and measurement space, and conclude by motivating the use of data assimilation techniques for analyzing and improving model parameterizations.
Pohjola, Mikko V; Pohjola, Pasi; Tainio, Marko; Tuomisto, Jouni T
2013-06-26
The calls for knowledge-based policy and policy-relevant research invoke a need to evaluate and manage environment and health assessments and models according to their societal outcomes. This review explores how well the existing approaches to assessment and model performance serve this need. The perspectives on assessment and model performance in the scientific literature can be called: (1) quality assurance/control, (2) uncertainty analysis, (3) technical assessment of models, (4) effectiveness, and (5) other perspectives, according to what is primarily seen to constitute the goodness of assessments and models. The categorization is not strict, and methods, tools and frameworks in different perspectives may overlap. Altogether, however, it seems that most approaches to assessment and model performance are relatively narrow in scope. The focus of most approaches is on the outputs and the making of assessments and models; practical application of the outputs and the consequent outcomes is often left unaddressed. More comprehensive approaches that combine the essential characteristics of the different perspectives appear to be needed. This necessitates a better account of the mechanisms of collective knowledge creation and of the relations between knowledge and practical action. Some new approaches to assessment, modeling, and their evaluation and management span the chain from knowledge creation to societal outcomes, but the complexity of evaluating societal outcomes remains a challenge.
Properties of the Multiple Measures in Arizona's Teacher Evaluation Model. REL 2015-050
ERIC Educational Resources Information Center
Lazarev, Valeriy; Newman, Denis; Sharp, Alyssa
2014-01-01
This study explored the relationships among the components of the Arizona Department of Education's new teacher evaluation model, with a particular focus on the extent to which ratings from the state model's teacher observation instrument differentiated higher and lower performance. The study used teacher-level evaluation data collected by the…
Seismic performance evaluation of RC frame-shear wall structures using nonlinear analysis methods
NASA Astrophysics Data System (ADS)
Shi, Jialiang; Wang, Qiuwei
To further understand the seismic performance of reinforced concrete (RC) frame-shear wall structures, a 1/8-scale model structure was derived from a main factory structure with seven stories and seven bays. The model, with four stories and two bays, was pseudo-dynamically tested under six earthquake actions whose peak ground accelerations (PGA) varied from 50 gal to 400 gal. The damage process and failure patterns were investigated. Furthermore, nonlinear dynamic analysis (NDA) and the capacity spectrum method (CSM) were adopted to evaluate the seismic behavior of the model structure. The top displacement curve, story drift curve and distribution of hinges were obtained and discussed. It is shown that the model structure exhibited a beam-hinge failure mechanism. Both methods can be used to evaluate the seismic behavior of RC frame-shear wall structures well. Moreover, CSM can to some extent substitute for NDA in the seismic performance evaluation of RC structures.
The Isprs Benchmark on Indoor Modelling
NASA Astrophysics Data System (ADS)
Khoshelham, K.; Díaz Vilariño, L.; Peter, M.; Kang, Z.; Acharya, D.
2017-09-01
Automated generation of 3D indoor models from point cloud data has been a topic of intensive research in recent years. While results on various datasets have been reported in literature, a comparison of the performance of different methods has not been possible due to the lack of benchmark datasets and a common evaluation framework. The ISPRS benchmark on indoor modelling aims to address this issue by providing a public benchmark dataset and an evaluation framework for performance comparison of indoor modelling methods. In this paper, we present the benchmark dataset comprising several point clouds of indoor environments captured by different sensors. We also discuss the evaluation and comparison of indoor modelling methods based on manually created reference models and appropriate quality evaluation criteria. The benchmark dataset is available for download at: http://www2.isprs.org/commissions/comm4/wg5/benchmark-on-indoor-modelling.html.
Quality of Protection Evaluation of Security Mechanisms
Ksiezopolski, Bogdan; Zurek, Tomasz; Mokkas, Michail
2014-01-01
Recent research indicates that during the design of a teleinformatic system a tradeoff between system performance and system protection must be made. The traditional approach assumes that the best way is to apply the strongest possible security measures. Unfortunately, overestimating the required security can lead to an unreasonable increase in system load. This is especially important in multimedia systems, where performance is critical. In many cases, determining the required level of protection and adjusting security measures to those requirements increases system efficiency. Such an approach is achieved by means of quality of protection (QoP) models, in which security measures are evaluated according to their influence on system security. In this paper, we propose a model for QoP evaluation of security mechanisms. Owing to this model, one can quantify the influence of particular security mechanisms on ensuring security attributes. The methodology of the model's preparation is described, and a case study analysis based on it is presented. We support our method with a tool in which the models can be defined and QoP evaluation performed. Finally, we have modelled the TLS cryptographic protocol and present the QoP evaluation of security mechanisms for selected versions of this protocol. PMID:25136683
Development Of A Parallel Performance Model For The THOR Neutral Particle Transport Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yessayan, Raffi; Azmy, Yousry; Schunert, Sebastian
The THOR neutral particle transport code enables simulation of complex geometries for various problems, from reactor simulations to nuclear non-proliferation. It is undergoing thorough verification and validation, which requires computational efficiency. This has motivated various improvements, including angular parallelization, outer iteration acceleration, and development of peripheral tools. To guide future improvements to the code's efficiency, better characterization of its parallel performance is useful. A parallel performance model (PPM) can be used to evaluate the benefits of modifications and to identify performance bottlenecks. Using INL's Falcon HPC, the PPM development incorporates an evaluation of network communication behavior over heterogeneous links and a functional characterization of the per-cell/angle/group runtime of each major code component. After evaluating several possible sources of variability, this resulted in a communication model and a parallel portion model. The former's accuracy is bounded by the variability of communication on Falcon, while the latter has an error on the order of 1%.
Evaluation of Supply Chain Efficiency Based on a Novel Network of Data Envelopment Analysis Model
NASA Astrophysics Data System (ADS)
Fu, Li Fang; Meng, Jun; Liu, Ying
2015-12-01
Performance evaluation of supply chain (SC) is a vital topic in SC management and an inherently complex problem with multilayered internal linkages and activities of multiple entities. Recently, various Network Data Envelopment Analysis (NDEA) models, which opened the "black box" of conventional DEA, were developed and applied to evaluate complex SCs with a multilayer network structure. However, most of them are input- or output-oriented models which cannot take into consideration nonproportional changes of inputs and outputs simultaneously. This paper extends the slack-based measure (SBM) model to a nonradial, nonoriented network model, named U-NSBM, with the presence of undesirable outputs in the SC. A numerical example is presented to demonstrate the applicability of the model in quantifying efficiency and ranking supply chain performance. By comparing with the CCR and U-SBM models, it is shown that the proposed model has higher distinguishing ability and gives feasible solutions in the presence of undesirable outputs. Meanwhile, it provides more insights for decision makers about the sources of inefficiency, as well as guidance to improve SC performance.
POCO-MOEA: Using Evolutionary Algorithms to Solve the Controller Placement Problem
2016-03-24
to gather data on POCO-MOEA performance on a series of model networks. The algorithm's behavior is then evaluated and compared to exhaustive... evaluation of a third heuristic based on a Multi-Objective Evolutionary Algorithm (MOEA). This heuristic is modeled after one of the most well-known MOEAs... researchers to extend into more realistic evaluations of the performance characteristics of SDN controllers, such as the use of simulators or live
Performance of an Automated-Mixed-Traffic-Vehicle /AMTV/ System. [urban people mover
NASA Technical Reports Server (NTRS)
Peng, T. K. C.; Chon, K.
1978-01-01
This study analyzes the operation and evaluates the expected performance of a proposed automatic guideway transit system which uses low-speed Automated Mixed Traffic Vehicles (AMTV's). Vehicle scheduling and headway control policies are evaluated with a transit system simulation model. The effect of mixed-traffic interference on the average vehicle speed is examined with a vehicle-pedestrian interface model. Control parameters regulating vehicle speed are evaluated for safe stopping and passenger comfort.
Evaluation of Multiclass Model Observers in PET LROC Studies
NASA Astrophysics Data System (ADS)
Gifford, H. C.; Kinahan, P. E.; Lartizien, C.; King, M. A.
2007-02-01
A localization ROC (LROC) study was conducted to evaluate nonprewhitening matched-filter (NPW) and channelized NPW (CNPW) versions of a multiclass model observer as predictors of human tumor-detection performance with PET images. Target localization is explicitly performed by these model observers. Tumors were placed in the liver, lungs, and background soft tissue of a mathematical phantom, and the data simulation modeled a full-3D acquisition mode. Reconstructions were performed with the FORE+AWOSEM algorithm. The LROC study measured observer performance with 2D images consisting of either coronal, sagittal, or transverse views of the same set of cases. Versions of the CNPW observer based on two previously published difference-of-Gaussian channel models demonstrated good quantitative agreement with human observers. One interpretation of these results treats the CNPW observer as a channelized Hotelling observer with implicit internal noise.
A PERFORMANCE EVALUATION OF THE 2004 RELEASE OF MODELS-3 CMAQ
This performance evaluation compares a full annual simulation (2001) of CMAQ (Version4.4) covering the contiguous United States against monitoring data from four nationwide networks. This effort, which represents one of the most spatially and temporally comprehensive performance...
DOT National Transportation Integrated Search
2009-06-30
This report summarizes the finite element modeling and simulation efforts on evaluating the performance of cable median barriers including the current and several proposed retrofit designs. It also synthesizes a literature review of the performance e...
An evaluation of Dynamic TOPMODEL for low flow simulation
NASA Astrophysics Data System (ADS)
Coxon, G.; Freer, J. E.; Quinn, N.; Woods, R. A.; Wagener, T.; Howden, N. J. K.
2015-12-01
Hydrological models are essential tools for drought risk management, often providing input to water resource system models, aiding our understanding of low-flow processes within catchments and providing low-flow predictions. However, simulating low flows and droughts is challenging, as hydrological systems often demonstrate threshold effects in connectivity, non-linear groundwater contributions and a greater influence of water resource system elements during low-flow periods. These dynamic processes are typically not well represented in commonly used hydrological models due to data and model limitations. Furthermore, calibrated or behavioural models may not be effectively evaluated during more extreme drought periods. A better understanding of the processes that occur during low flows, and of how these are represented within models, is thus required if we want to provide robust and reliable predictions of future drought events. In this study, we assess the performance of Dynamic TOPMODEL for low-flow simulation. Dynamic TOPMODEL was applied to a number of UK catchments in the Thames region using time series of observed rainfall and potential evapotranspiration data that captured multiple historic droughts over a period of several years. The model performance was assessed against the observed discharge time series using a limits-of-acceptability framework, which included uncertainty in the discharge time series. We evaluate the models against multiple signatures of catchment low-flow behaviour and investigate differences in model performance between catchments, between model diagnostics, and across different low-flow periods. We also considered the impact of surface water and groundwater abstractions and discharges on the observed discharge time series, and how this affected the model evaluation. From analysing the model performance, we suggest future improvements to Dynamic TOPMODEL to improve the representation of low-flow processes within the model structure.
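At its simplest, a limits-of-acceptability evaluation of the kind described reduces to counting the time steps at which the simulation falls inside the observation-uncertainty bounds, plus a measure of how badly the worst violation misses. The sketch below uses hypothetical discharge values, not data from the study:

```python
def limits_of_acceptability_score(sim, lower, upper):
    """Fraction of time steps at which simulated discharge lies within the
    observation-uncertainty limits, and the worst exceedance normalized by
    the half-width of the acceptable band at that time step."""
    inside, worst = 0, 0.0
    for q, lo, hi in zip(sim, lower, upper):
        if lo <= q <= hi:
            inside += 1
        else:
            half_width = (hi - lo) / 2.0
            center = (hi + lo) / 2.0
            worst = max(worst, abs(q - center) / half_width)
    return inside / len(sim), worst

# Hypothetical recession-period discharges (m3/s) with uncertainty bounds
sim   = [1.0, 0.8, 0.50, 0.30, 0.20]
lower = [0.9, 0.7, 0.45, 0.25, 0.25]
upper = [1.1, 0.9, 0.60, 0.40, 0.35]
frac, worst = limits_of_acceptability_score(sim, lower, upper)
```

A behavioural model would be expected to keep the fraction high and the worst normalized exceedance small across all evaluation periods, including droughts.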
Zhang, Xu; Jin, Weiqi; Li, Jiakun; Wang, Xia; Li, Shuo
2017-04-01
Thermal imaging technology is an effective means of detecting hazardous gas leaks. Much attention has been paid to evaluating the performance of gas leak infrared imaging detection systems because of their several potential applications. The minimum resolvable temperature difference (MRTD) and the minimum detectable temperature difference (MDTD) are commonly used as the main indicators of thermal imaging system performance. This paper establishes a minimum detectable gas concentration (MDGC) performance evaluation model based on the definition and derivation of MDTD. We propose direct-calculation and equivalent-calculation methods for MDGC based on the MDTD measurement system. We built an experimental MDGC measurement system, which indicates that the MDGC model can describe the detection performance of a thermal imaging system for typical gases. The direct calculation, equivalent calculation, and direct measurement results are consistent. The MDGC and the minimum resolvable gas concentration (MRGC) models effectively describe the "detection" and "spatial detail resolution" performance, respectively, of thermal imaging systems for gas leaks, and constitute the main performance indicators of gas leak detection systems.
Performance of ANFIS versus MLP-NN dissolved oxygen prediction models in water quality monitoring.
Najah, A; El-Shafie, A; Karim, O A; El-Shafie, Amr H
2014-02-01
We discuss the accuracy and performance of the adaptive neuro-fuzzy inference system (ANFIS) in training and prediction of dissolved oxygen (DO) concentrations. The model was used to analyze historical data generated through continuous monitoring of water quality parameters at several stations on the Johor River to predict DO concentrations. Four water quality parameters were selected for ANFIS modeling: temperature, pH, nitrate (NO3) concentration, and ammoniacal nitrogen concentration (NH3-N). Sensitivity analysis was performed to evaluate the effects of the input parameters. The inputs with the greatest effect were those related to oxygen content (NO3) or oxygen demand (NH3-N). Temperature was the parameter with the least effect, whereas pH provided the lowest contribution to the proposed model. To evaluate the performance of the model, three statistical indices were used: the coefficient of determination (R2), the mean absolute prediction error, and the correlation coefficient. The performance of the ANFIS model was compared with an artificial neural network model. The ANFIS model was capable of providing greater accuracy, particularly in the case of extreme events.
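The three statistical indices used in this study are standard and easy to reproduce. The sketch below uses hypothetical DO values (not the Johor River data):

```python
import math

def evaluate_predictions(obs, pred):
    """Three indices from the abstract: coefficient of determination (R2),
    mean absolute prediction error (MAE), and Pearson correlation (r)."""
    n = len(obs)
    mean_o = sum(obs) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_o) ** 2 for o in obs)
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(o - p) for o, p in zip(obs, pred)) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    r = cov / (sd_o * sd_p)
    return r2, mae, r

obs  = [6.1, 5.8, 4.9, 6.5, 7.0]  # hypothetical measured DO, mg/L
pred = [6.0, 5.9, 5.1, 6.3, 6.8]  # hypothetical model predictions
r2, mae, r = evaluate_predictions(obs, pred)
```

Note that R2 penalizes bias while r does not, which is why reporting both (as this study does) is more informative than either alone.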
ERIC Educational Resources Information Center
Harwood, Henrick; Bazron, Barbara; Fountain, Douglas
This paper presents state-of-the-art models addressing issues related to coordination of treatment and evaluation activities, and integration of clinical, performance, and evaluation information. Specifically, this concept paper contains a discussion of the need for and types of cost analyses for CSAT treatment evaluation and knowledge-generating…
Review of Airport Ground Traffic Models Including an Evaluation of the ASTS Computer Program
DOT National Transportation Integrated Search
1972-12-01
The report covers an evaluation of Airport Ground Traffic models for the purpose of simulating an Autonomous Local Intersection Controller. All known models were reviewed, and a detailed study was performed on the two in-house models, the ASTS and ROSS...
Application of Wavelet Filters in an Evaluation of Photochemical Model Performance
Air quality model evaluation can be enhanced with time-scale specific comparisons of outputs and observations. For example, high-frequency (hours to one day) time scale information in observed ozone is not well captured by deterministic models and its incorporation into model pe...
Towards Systematic Benchmarking of Climate Model Performance
NASA Astrophysics Data System (ADS)
Gleckler, P. J.
2014-12-01
The process by which climate models are evaluated has evolved substantially over the past decade, with the Coupled Model Intercomparison Project (CMIP) serving as a centralizing activity for coordinating model experimentation and enabling research. Scientists with a broad spectrum of expertise have contributed to the CMIP model evaluation process, resulting in many hundreds of publications that have served as a key resource for the IPCC process. For several reasons, efforts are now underway to further systematize some aspects of the model evaluation process. First, some model evaluation can now be considered routine and should not require "re-inventing the wheel" or a journal publication simply to update results with newer models. Second, the benefit of CMIP research to model development has not been optimal because the publication of results generally takes several years and is usually not reproducible for benchmarking newer model versions. And third, there are now hundreds of model versions and many thousands of simulations, but there is no community-based mechanism for routinely monitoring model performance changes. An important change in the design of CMIP6 can help address these limitations. CMIP6 will include a small set of standardized experiments as an ongoing exercise (CMIP "DECK": ongoing Diagnostic, Evaluation and Characterization of Klima), so that modeling groups can submit them at any time and not be overly constrained by deadlines. In this presentation, efforts to establish routine benchmarking of existing and future CMIP simulations will be described. To date, some benchmarking tools have been made available to all CMIP modeling groups to enable them to readily compare with CMIP5 simulations during the model development process. A natural extension of this effort is to make results from all CMIP simulations widely available, including the results from newer models as soon as the simulations become available for research.
Making the results from routine performance tests readily accessible will help advance a more transparent model evaluation process.
da Fonseca Neto, João Viana; Abreu, Ivanildo Silva; da Silva, Fábio Nogueira
2010-04-01
Toward the synthesis of state-space controllers, a neural-genetic model based on the linear quadratic regulator design for the eigenstructure assignment of multivariable dynamic systems is presented. The neural-genetic model represents a fusion of a genetic algorithm and a recurrent neural network (RNN) to perform the selection of the weighting matrices and the algebraic Riccati equation solution, respectively. A fourth-order electric circuit model is used to evaluate the convergence of the computational intelligence paradigms and the control design method performance. The genetic search convergence evaluation is performed in terms of the fitness function statistics and the RNN convergence, which is evaluated by landscapes of the energy and norm, as a function of the parameter deviations. The control problem solution is evaluated in the time and frequency domains by the impulse response, singular values, and modal analysis.
Ren, Huazhong; Liu, Rongyuan; Yan, Guangjian; Li, Zhao-Liang; Qin, Qiming; Liu, Qiang; Nerry, Françoise
2015-04-06
Land surface emissivity is a crucial parameter in the surface status monitoring. This study aims at the evaluation of four directional emissivity models, including two bi-directional reflectance distribution function (BRDF) models and two gap-frequency-based models. Results showed that the kernel-driven BRDF model could well represent directional emissivity with an error less than 0.002, and was consequently used to retrieve emissivity with an accuracy of about 0.012 from an airborne multi-angular thermal infrared data set. Furthermore, we updated the cavity effect factor relating to multiple scattering inside canopy, which improved the performance of the gap-frequency-based models.
NASA Astrophysics Data System (ADS)
Zaherpour, Jamal; Gosling, Simon N.; Mount, Nick; Müller Schmied, Hannes; Veldkamp, Ted I. E.; Dankers, Rutger; Eisner, Stephanie; Gerten, Dieter; Gudmundsson, Lukas; Haddeland, Ingjerd; Hanasaki, Naota; Kim, Hyungjun; Leng, Guoyong; Liu, Junguo; Masaki, Yoshimitsu; Oki, Taikan; Pokhrel, Yadu; Satoh, Yusuke; Schewe, Jacob; Wada, Yoshihide
2018-06-01
Global-scale hydrological models are routinely used to assess water scarcity, flood hazards and droughts worldwide. Recent efforts to incorporate anthropogenic activities in these models have enabled more realistic comparisons with observations. Here we evaluate simulations from an ensemble of six models participating in the second phase of the Inter-Sectoral Impact Model Inter-comparison Project (ISIMIP2a). We simulate monthly runoff in 40 catchments, spatially distributed across eight global hydrobelts. The performance of each model and the ensemble mean is examined with respect to their ability to replicate observed mean and extreme runoff under human-influenced conditions. Application of a novel integrated evaluation metric to quantify the models' ability to simulate time series of monthly runoff suggests that the models generally perform better in the wetter equatorial and northern hydrobelts than in drier southern hydrobelts. When model outputs are temporally aggregated to assess mean annual and extreme runoff, the models perform better. Nevertheless, we find a general trend in the majority of models towards the overestimation of mean annual runoff and all indicators of upper and lower extreme runoff. The models struggle to capture the timing of the seasonal cycle, particularly in northern hydrobelts, while in southern hydrobelts the models struggle to reproduce the magnitude of the seasonal cycle. It is noteworthy that over all hydrological indicators, the ensemble mean fails to perform better than any individual model, a finding that challenges the commonly held perception that model ensemble estimates deliver superior performance over individual models. The study highlights the need for continued model development and improvement. It also suggests that caution should be taken when summarising the simulations from a model ensemble based upon its mean output.
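The finding that the ensemble mean need not outperform individual members is easy to reproduce with a standard skill score such as the Nash-Sutcliffe efficiency. The study uses its own integrated metric; NSE here is a stand-in, and the runoff numbers are invented for illustration:

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect; 0 means no better than
    predicting the observed mean at every time step."""
    mean_o = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - num / den

obs = [10.0, 30.0, 80.0, 40.0, 15.0, 8.0]       # invented monthly runoff
models = {
    "A": [12.0, 28.0, 75.0, 42.0, 14.0, 9.0],   # tracks the seasonal cycle
    "B": [20.0, 20.0, 50.0, 50.0, 25.0, 20.0],  # damped seasonal cycle
}
# Averaging a good member with a damped one drags the ensemble mean down
ensemble = [sum(v) / len(models) for v in zip(*models.values())]
scores = {name: nse(obs, sim) for name, sim in models.items()}
scores["ensemble"] = nse(obs, ensemble)
```

Here the ensemble mean scores between the two members: averaging smooths out the seasonal peak that model A captures, so the mean cannot beat the best individual model, mirroring the abstract's conclusion.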
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Xiaodong, E-mail: eastdawn@tsinghua.edu.cn; Su, Shu, E-mail: sushuqh@163.com; Zhang, Zhihui, E-mail: zhzhg@tsinghua.edu.cn
To comprehensively pre-evaluate the damages to both the environment and human health due to construction activities in China, this paper presents an integrated building environmental and health performance (EHP) assessment model based on the Building Environmental Performance Analysis System (BEPAS) and the Building Health Impact Analysis System (BHIAS) models and offers a new inventory data estimation method. The new model follows the life cycle assessment (LCA) framework, and the inventory analysis step involves bill of quantity (BOQ) data collection, consumption data formation, and environmental profile transformation. The consumption data are derived from engineering drawings and quotas to conduct the assessment before construction for pre-evaluation. The new model classifies building impacts into three safeguard areas: ecosystems, natural resources and human health. Thus, this model considers environmental impacts as well as damage to human wellbeing. The monetization approach, distance-to-target method and panel method are considered as optional weighting approaches. Finally, nine residential buildings of different structural types are taken as case studies to test the operability of the integrated model through application. The results indicate that the new model can effectively pre-evaluate building EHP and that the structure type significantly affects the performance of residential buildings.
ERIC Educational Resources Information Center
Buttram, Joan L.; Covert, Robert W.
The Discrepancy Evaluation Model (DEM), developed in 1966 by Malcolm Provus, provides information for program assessment and program improvement. Under the DEM, evaluation is defined as the comparison of an actual performance to a desired standard. The DEM embodies five stages of evaluation based upon a program's natural development: program…
ERIC Educational Resources Information Center
California School Boards Association, Sacramento.
This publication is intended to aid local school board members in establishing procedures and priorities for evaluating the performance of their district superintendent. Except for a brief introductory section, the entire publication consists of a model comprehensive evaluation instrument. The evaluation model is organized in two main sections,…
NASA Astrophysics Data System (ADS)
Koch, Julian; Cüneyd Demirel, Mehmet; Stisen, Simon
2018-05-01
The process of model evaluation is not only an integral part of model development and calibration but also of paramount importance when communicating modelling results to the scientific community and stakeholders. The modelling community has a large and well-tested toolbox of metrics to evaluate temporal model performance. In contrast, spatial performance evaluation has not kept pace with the widespread availability of spatial observations or with the sophisticated model codes that simulate the spatial variability of complex hydrological processes. This study makes a contribution towards advancing spatial-pattern-oriented model calibration by rigorously testing a multiple-component performance metric. The promoted SPAtial EFficiency (SPAEF) metric reflects three equally weighted components: correlation, coefficient of variation and histogram overlap. This multiple-component approach is found to be advantageous for the complex task of comparing spatial patterns. SPAEF, its three components individually and two alternative spatial performance metrics, i.e. connectivity analysis and fractions skill score, are applied in a spatial-pattern-oriented model calibration of a catchment model in Denmark. Results suggest the importance of multiple-component metrics because stand-alone metrics tend to fail to provide holistic pattern information. The three SPAEF components are found to be independent, which allows them to complement each other in a meaningful way. In order to optimally exploit spatial observations made available by remote sensing platforms, this study suggests applying bias-insensitive metrics which further allow for a comparison of variables which are related but may differ in unit. This study applies SPAEF in the hydrological context using the mesoscale Hydrologic Model (mHM; version 5.8), but we see great potential across disciplines related to spatially distributed earth system modelling.
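The three-component structure of SPAEF described above can be sketched directly. The formulation below follows the published definition (one minus the Euclidean distance of the three components from their ideal value of 1, with the histogram overlap computed on z-score-normalised patterns); the bin count is an illustrative assumption:

```python
import numpy as np

def spaef(obs, sim, bins=100):
    """Sketch of the SPAtial EFficiency metric (Koch et al., 2018).

    Combines three equally weighted components:
      alpha -- Pearson correlation between the two patterns
      beta  -- ratio of coefficients of variation (bias-insensitive spread)
      gamma -- histogram overlap of the z-score-normalised patterns
    """
    obs, sim = np.ravel(obs).astype(float), np.ravel(sim).astype(float)
    alpha = np.corrcoef(obs, sim)[0, 1]
    beta = (np.std(sim) / np.mean(sim)) / (np.std(obs) / np.mean(obs))
    # z-score normalisation makes the histogram comparison unit-free
    zo = (obs - obs.mean()) / obs.std()
    zs = (sim - sim.mean()) / sim.std()
    lo, hi = min(zo.min(), zs.min()), max(zo.max(), zs.max())
    ho, _ = np.histogram(zo, bins=bins, range=(lo, hi))
    hs, _ = np.histogram(zs, bins=bins, range=(lo, hi))
    gamma = np.minimum(ho, hs).sum() / ho.sum()
    return 1.0 - np.sqrt((alpha - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
```

For identical patterns all three components equal 1 and SPAEF attains its ideal value of 1; any disagreement in correlation, spread or distribution shape pulls the score below 1.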
Thrust stand evaluation of engine performance improvement algorithms in an F-15 airplane
NASA Technical Reports Server (NTRS)
Conners, Timothy R.
1992-01-01
Results are presented from the evaluation of the performance seeking control (PSC) optimization algorithm developed by Smith et al. (1990) for F-15 aircraft, which optimizes the quasi-steady-state performance of an F100 derivative turbofan engine for several modes of operation. The PSC algorithm uses an onboard software engine model that calculates thrust, stall margin, and other unmeasured variables for use in the optimization. Comparisons are presented between the load cell measurements, PSC onboard model thrust calculations, and posttest state variable model computations. Actual performance improvements using the PSC algorithm are presented for its various modes. The results of using the PSC algorithm are compared with similar test case results using the HIDEC algorithm.
Evaluation of Student Performance through a Multidimensional Finite Mixture IRT Model.
Bacci, Silvia; Bartolucci, Francesco; Grilli, Leonardo; Rampichini, Carla
2017-01-01
In the Italian academic system, a student can enroll for an exam immediately after the end of the teaching period or can postpone it; in this second case the exam result is missing. We propose an approach for the evaluation of student performance throughout the course of study, accounting also for nonattempted exams. The approach is based on an item response theory model that includes two discrete latent variables representing student performance and priority in selecting the exams to take. We explicitly account for nonignorable missing observations, as the indicators of attempted exams also contribute to measuring the performance (within-item multidimensionality). The model also allows for individual covariates in its structural part.
The work here complements the overview analysis of the modelling systems participating in the third phase of the Air Quality Model Evaluation International Initiative (AQMEII3) by focusing on the performance for hourly surface ozone by two modelling systems, Chimere for Europe an...
NASA Technical Reports Server (NTRS)
Tranter, W. H.; Ziemer, R. E.; Fashano, M. J.
1975-01-01
This paper reviews the SYSTID technique for performance evaluation of communication systems using time-domain computer simulation. An example program illustrates the language. The inclusion of both Gaussian and impulse noise models makes accurate simulation possible in a wide variety of environments. A very flexible postprocessor makes possible accurate and efficient performance evaluation.
Models and techniques for evaluating the effectiveness of aircraft computing systems
NASA Technical Reports Server (NTRS)
Meyer, J. F.
1978-01-01
The development of system models that can provide a basis for the formulation and evaluation of aircraft computer system effectiveness, the formulation of quantitative measures of system effectiveness, and the development of analytic and simulation techniques for evaluating the effectiveness of a proposed or existing aircraft computer are described. Specific topics covered include: system models; performability evaluation; capability and functional dependence; computation of trajectory set probabilities; and hierarchical modeling of an air transport mission.
Evaluating bacterial gene-finding HMM structures as probabilistic logic programs.
Mørk, Søren; Holmes, Ian
2012-03-01
Probabilistic logic programming offers a powerful way to describe and evaluate structured statistical models. To investigate the practicality of probabilistic logic programming for structure learning in bioinformatics, we undertook a simplified bacterial gene-finding benchmark in PRISM, a probabilistic dialect of Prolog. We evaluate Hidden Markov Model structures for bacterial protein-coding gene potential, including a simple null model structure, three structures based on existing bacterial gene finders and two novel model structures. We test standard versions as well as ADPH length modeling and three-state versions of the five model structures. The models are all represented as probabilistic logic programs and evaluated using the PRISM machine learning system in terms of statistical information criteria and gene-finding prediction accuracy, in two bacterial genomes. Neither of our implementations of the two most widely used model structures performs best in terms of statistical information criteria or prediction accuracy, suggesting that better-fitting models might be achievable. The source code of all PRISM models, data and additional scripts are freely available for download at: http://github.com/somork/codonhmm. Supplementary data are available at Bioinformatics online.
ERIC Educational Resources Information Center
Darch, Craig; And Others
Several evaluation formats were used to examine the impact that the Direct Instruction Model had on 600 selected students in Williamsburg County, South Carolina, over a 7-year period. The performance of students in the Direct Instruction Model was contrasted with the performance of similar students (on the basis of family income, ethnicity,…
Evaluation of natural language processing systems: Issues and approaches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guida, G.; Mauri, G.
This paper encompasses two main topics: a broad and general analysis of the issue of performance evaluation of NLP systems and a report on a specific approach developed by the authors and experimented on a sample test case. More precisely, it first presents a brief survey of the major works in the area of NLP systems evaluation. Then, after introducing the notion of the life cycle of an NLP system, it focuses on the concept of performance evaluation and analyzes the scope and the major problems of the investigation. The tools generally used within computer science to assess the quality of a software system are briefly reviewed, and their applicability to the task of evaluation of NLP systems is discussed. Particular attention is devoted to the concepts of efficiency, correctness, reliability, and adequacy, and how all of them basically fail in capturing the peculiar features of performance evaluation of an NLP system is discussed. Two main approaches to performance evaluation are later introduced; namely, black-box- and model-based, and their most important characteristics are presented. Finally, a specific model for performance evaluation proposed by the authors is illustrated, and the results of an experiment with a sample application are reported. The paper concludes with a discussion on research perspective, open problems, and importance of performance evaluation to industrial applications.
An evaluative model of system performance in manned teleoperational systems
NASA Technical Reports Server (NTRS)
Haines, Richard F.
1989-01-01
Manned teleoperational systems are used in aerospace operations in which humans must interact with machines remotely. Manual guidance of remotely piloted vehicles, controlling a wind tunnel, and carrying out a scientific procedure remotely are examples of teleoperations. A four-input-parameter throughput (Tp) model is presented which can be used to evaluate complex, manned, teleoperations-based systems and make critical comparisons among candidate control systems. The first two parameters of this model deal with nominal (A) and off-nominal (B) predicted events while the last two focus on measured events of two types, human performance (C) and system performance (D). Digital simulations showed that the expression A(1-B)/(C+D) produced the greatest homogeneity of variance and distribution symmetry. Results from a recently completed manned life science telescience experiment will be used to further validate the model. Complex, interacting teleoperational systems may be systematically evaluated using this expression much like a computer benchmark is used.
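Reading the quoted expression as Tp = A(1 - B)/(C + D), a minimal sketch follows; the assumption that all four parameters are already scaled to comparable units is ours, not the abstract's:

```python
def throughput(a_nominal, b_offnominal, c_human, d_system):
    """Sketch of the four-parameter throughput expression Tp = A(1-B)/(C+D),
    where A and B are predicted nominal/off-nominal event scores and C and D
    are measured human- and system-performance scores. The parameter scaling
    (all inputs on comparable units) is an illustrative assumption."""
    return a_nominal * (1.0 - b_offnominal) / (c_human + d_system)
```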
USDA-ARS?s Scientific Manuscript database
Previous publications have outlined recommended practices for hydrologic and water quality (H/WQ) modeling, but none have formulated comprehensive guidelines for the final stage of modeling applications, namely evaluation, interpretation, and communication of model results and the consideration of t...
NASA Astrophysics Data System (ADS)
Bessagnet, Bertrand; Pirovano, Guido; Mircea, Mihaela; Cuvelier, Cornelius; Aulinger, Armin; Calori, Giuseppe; Ciarelli, Giancarlo; Manders, Astrid; Stern, Rainer; Tsyro, Svetlana; García Vivanco, Marta; Thunis, Philippe; Pay, Maria-Teresa; Colette, Augustin; Couvidat, Florian; Meleux, Frédérik; Rouïl, Laurence; Ung, Anthony; Aksoyoglu, Sebnem; María Baldasano, José; Bieser, Johannes; Briganti, Gino; Cappelletti, Andrea; D'Isidoro, Massimo; Finardi, Sandro; Kranenburg, Richard; Silibello, Camillo; Carnevale, Claudio; Aas, Wenche; Dupont, Jean-Charles; Fagerli, Hilde; Gonzalez, Lucia; Menut, Laurent; Prévôt, André S. H.; Roberts, Pete; White, Les
2016-10-01
The EURODELTA III exercise has facilitated a comprehensive intercomparison and evaluation of chemistry transport model performances. Participating models performed calculations for four 1-month periods in different seasons in the years 2006 to 2009, allowing the influence of different meteorological conditions on model performances to be evaluated. The exercise was performed with strict requirements for the input data, with few exceptions. As a consequence, most of the differences in the outputs can be attributed to the differences in model formulations of chemical and physical processes. The models were evaluated mainly for background rural stations in Europe. The performance was assessed in terms of bias, root mean square error and correlation with respect to the concentrations of air pollutants (NO2, O3, SO2, PM10 and PM2.5), as well as key meteorological variables. Though most meteorological parameters were prescribed, some variables like the planetary boundary layer (PBL) height and the vertical diffusion coefficient were derived in the model preprocessors and can partly explain the spread in model results. In general, the daytime PBL height is underestimated by all models. The largest variability of predicted PBL is observed over the ocean and seas. For ozone, this study shows the importance of proper boundary conditions for accurate model calculations and, consequently, for the regime of the gas and particle chemistry. The models show similar and quite good performance for nitrogen dioxide, whereas they struggle to accurately reproduce measured sulfur dioxide concentrations (for which the agreement with observations is the poorest). In general, the models provide a close-to-observations map of particulate matter (PM2.5 and PM10) concentrations over Europe, with correlations in the range 0.4-0.7 and a systematic underestimation reaching -10 µg m-3 for PM10. The highest concentrations are much more underestimated, particularly in wintertime.
Further evaluation of the mean diurnal cycles of PM reveals a general model tendency to overestimate the effect of the PBL height rise on PM levels in the morning, while the formation of secondary species by afternoon chemistry is underestimated. This results in larger modelled PM diurnal variations than observed for all seasons. The models tend to be too sensitive to the daily variation of the PBL. All in all, in most cases model performances are influenced more by the model setup than by the season. The good representation of the temporal evolution of wind speed is largely responsible for the models' skill in reproducing the daily variability of pollutant concentrations (e.g. the development of peak episodes), while the reconstruction of the PBL diurnal cycle seems to play a larger role in driving the corresponding pollutant diurnal cycle and hence determines the presence of systematic positive and negative biases detectable on a daily basis.
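The three scores used in the evaluation above (bias, root mean square error and Pearson correlation) are standard; a minimal sketch:

```python
import numpy as np

def evaluation_metrics(obs, sim):
    """Minimal sketch of three standard model-evaluation scores:
    mean bias, root mean square error and Pearson correlation."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    bias = np.mean(sim - obs)                      # systematic over/underestimation
    rmse = np.sqrt(np.mean((sim - obs) ** 2))      # overall error magnitude
    corr = np.corrcoef(obs, sim)[0, 1]             # temporal co-variability
    return bias, rmse, corr
```

A systematic underestimation, as reported here for PM10, shows up as a negative bias even when the correlation is good, which is why the three scores are reported together.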
Evaluation of annual, global seismicity forecasts, including ensemble models
NASA Astrophysics Data System (ADS)
Taroni, Matteo; Zechar, Jeremy; Marzocchi, Warner
2013-04-01
In 2009, the Collaboratory for the Study of Earthquake Predictability (CSEP) initiated a prototype global earthquake forecast experiment. Three models participated in this experiment for 2009, 2010 and 2011—each model forecast the number of earthquakes above magnitude 6 in 1x1 degree cells that span the globe. Here we use likelihood-based metrics to evaluate the consistency of the forecasts with the observed seismicity. We compare model performance with statistical tests and a new method based on the peer-to-peer gambling score. The results of the comparisons are used to build ensemble models that are a weighted combination of the individual models. Notably, in these experiments the ensemble model always performs significantly better than the single best-performing model. Our results indicate the following: i) time-varying forecasts, if not updated after each major shock, may not provide significant advantages with respect to time-invariant models in 1-year forecast experiments; ii) the spatial distribution seems to be the most important feature to characterize the different forecasting performances of the models; iii) the interpretation of consistency tests may be misleading because some good models may be rejected while trivial models may pass consistency tests; iv) a proper ensemble modeling seems to be a valuable procedure to get the best performing model for practical purposes.
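Likelihood-based evaluation of gridded rate forecasts, as in the CSEP experiment above, typically assumes independent Poisson counts in each cell. The following sketch computes the joint log-likelihood on which such consistency tests are built; the function name is ours:

```python
import numpy as np
from math import lgamma

def joint_log_likelihood(forecast_rates, observed_counts):
    """Sketch of the joint Poisson log-likelihood of a gridded forecast:
    sum over cells of n*log(f) - f - log(n!), assuming independent Poisson
    counts with expected rate f in each cell."""
    f = np.asarray(forecast_rates, float)
    n = np.asarray(observed_counts, float)
    log_fact = np.array([lgamma(k + 1.0) for k in n])  # log(n!) per cell
    return float(np.sum(n * np.log(f) - f - log_fact))
```

Consistency tests then compare this score for the observed catalog against its distribution under catalogs simulated from the forecast itself.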
Cheng, F T; Yang, H C; Luo, T L; Feng, C; Jeng, M
2000-01-01
Equipment Managers (EMs) play a major role in a Manufacturing Execution System (MES). They serve as the communication bridge between the components of an MES and the equipment. The purpose of this paper is to propose a novel methodology for developing analytical and simulation models for the EM such that the validity and performance of the EM can be evaluated. Domain knowledge and requirements are collected from a real semiconductor packaging factory. By using IDEF0 and state diagrams, a static functional model and a dynamic state model of the EM are built. Next, these two models are translated into a Petri net model. This allows qualitative and quantitative analyses of the system. The EM net model is then expanded into the MES net model. Therefore, the performance of an EM in the MES environment can be evaluated. These evaluation results are good references for design and decision making.
Performance Indicators in Education.
ERIC Educational Resources Information Center
Irvine, David J.
Evaluation of education involves assessing the effectiveness of schools and trying to determine how best to improve them. Since evaluation often deals only with the question of effectiveness, performance indicators in education are designed to make evaluation more complete. They are a set of statistical models which relate several important…
The Role of Multimodel Combination in Improving Streamflow Prediction
NASA Astrophysics Data System (ADS)
Arumugam, S.; Li, W.
2008-12-01
Model errors are an inevitable part of any prediction exercise. One approach that is currently gaining attention for reducing model errors is to optimally combine multiple models to develop improved predictions. The rationale behind this approach primarily lies in the premise that optimal weights could be derived for each model so that the developed multimodel predictions will result in improved predictability. In this study, we present a new approach to combine multiple hydrological models by evaluating their predictability contingent on the predictor state. We combine two hydrological models, the 'abcd' model and the Variable Infiltration Capacity (VIC) model, with each model's parameters being estimated by two different objective functions, to develop multimodel streamflow predictions. The performance of multimodel predictions is compared with individual model predictions using correlation, root mean square error and the Nash-Sutcliffe coefficient. To quantify precisely under what conditions the multimodel predictions result in improved predictions, we evaluate the proposed algorithm by testing it against streamflow generated from a known model (the 'abcd' model or the VIC model) with errors being homoscedastic or heteroscedastic. Results from the study show that streamflow simulated from individual models performed better than multimodels under almost no model error. Under increased model error, the multimodel consistently performed better than the single model prediction in terms of all performance measures. The study also evaluates the proposed algorithm for streamflow predictions in two humid river basins in North Carolina as well as in two arid basins in Arizona. Through detailed validation at these four sites, the study shows that the multimodel approach better predicts the observed streamflow in comparison to the single model predictions.
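As a simplified illustration of multimodel combination (a plain least-squares weighting over a training period, not the predictor-state-contingent scheme of the study), weights can be fitted by minimising the squared error of the combined prediction:

```python
import numpy as np

def optimal_weights(obs, preds):
    """Illustrative least-squares multimodel combination: solve for weights
    w minimising ||obs - P @ w||^2, where the columns of P are the
    individual model predictions over a training period."""
    P = np.column_stack(preds)
    w, *_ = np.linalg.lstsq(P, np.asarray(obs, float), rcond=None)
    return w
```

With the fitted weights, the multimodel prediction is simply `np.column_stack(preds) @ w`; more elaborate schemes constrain the weights to be non-negative or to sum to one.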
Yu, Lei; Kang, Jian
2009-09-01
This research aims to explore the feasibility of using computer-based models to predict the soundscape quality evaluation of potential users in urban open spaces at the design stage. With the data from large scale field surveys in 19 urban open spaces across Europe and China, the importance of various physical, behavioral, social, demographical, and psychological factors for the soundscape evaluation has been statistically analyzed. Artificial neural network (ANN) models have then been explored at three levels. It has been shown that for both subjective sound level and acoustic comfort evaluation, a general model for all the case study sites is less feasible due to the complex physical and social environments in urban open spaces; models based on individual case study sites perform well but the application range is limited; and specific models for certain types of location/function would be reliable and practical. The performance of acoustic comfort models is considerably better than that of sound level models. Based on the ANN models, soundscape quality maps can be produced and this has been demonstrated with an example.
NASA Technical Reports Server (NTRS)
Gupta, Hoshin V.; Kling, Harald; Yilmaz, Koray K.; Martinez-Baquero, Guillermo F.
2009-01-01
The mean squared error (MSE) and the related normalization, the Nash-Sutcliffe efficiency (NSE), are the two criteria most widely used for calibration and evaluation of hydrological models with observed data. Here, we present a diagnostically interesting decomposition of NSE (and hence MSE), which facilitates analysis of the relative importance of its different components in the context of hydrological modelling, and show how model calibration problems can arise due to interactions among these components. The analysis is illustrated by calibrating a simple conceptual precipitation-runoff model to daily data for a number of Austrian basins having a broad range of hydro-meteorological characteristics. Evaluation of the results clearly demonstrates the problems that can be associated with any calibration based on the NSE (or MSE) criterion. While we propose and test an alternative criterion that can help to reduce model calibration problems, the primary purpose of this study is not to present an improved measure of model performance. Instead, we seek to show that there are systematic problems inherent with any optimization based on formulations related to the MSE. The analysis and results have implications for the manner in which we calibrate and evaluate environmental models; we discuss these and suggest possible ways forward that may move us towards an improved and diagnostically meaningful approach to model performance evaluation and identification.
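The decomposition referred to above writes NSE in terms of the linear correlation r, the ratio of standard deviations alpha, and the bias normalised by the observed standard deviation beta_n, as NSE = 2*alpha*r - alpha^2 - beta_n^2 (an exact algebraic identity when population standard deviations are used). A sketch that returns the decomposition together with its components:

```python
import numpy as np

def nse_decomposition(obs, sim):
    """Sketch of the NSE decomposition NSE = 2*alpha*r - alpha**2 - beta_n**2:
      r      -- linear correlation between sim and obs
      alpha  -- ratio of standard deviations (sigma_sim / sigma_obs)
      beta_n -- bias normalised by sigma_obs
    Population standard deviations (ddof=0) are required for the identity.
    """
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()
    beta_n = (sim.mean() - obs.mean()) / obs.std()
    nse = 2.0 * alpha * r - alpha ** 2 - beta_n ** 2
    return nse, r, alpha, beta_n
```

The identity makes explicit that NSE mixes correlation, variability and bias errors into one number, which is the root of the calibration problems the abstract discusses.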
Evaluation of advanced geopotential models for operational orbit determination
NASA Technical Reports Server (NTRS)
Radomski, M. S.; Davis, B. E.; Samii, M. V.; Engel, C. J.; Doll, C. E.
1988-01-01
To meet future orbit determination accuracy requirements for different NASA projects, analyses are performed using Tracking and Data Relay Satellite System (TDRSS) tracking measurements and orbit determination improvements in areas such as the modeling of the Earth's gravitational field. Current operational requirements are satisfied using the Goddard Earth Model-9 (GEM-9) geopotential model with the harmonic expansion truncated at order and degree 21 (21-by-21). This study evaluates the performance of 36-by-36 geopotential models, such as the GEM-10B and Preliminary Goddard Solution-3117 (PGS-3117) models. The Earth Radiation Budget Satellite (ERBS) and LANDSAT-5 are the spacecraft considered in this study.
NASA Astrophysics Data System (ADS)
Walker, Ernest; Chen, Xinjia; Cooper, Reginald L.
2010-04-01
An arbitrarily accurate approach is used to determine the bit-error rate (BER) performance for generalized asynchronous DS-CDMA systems, in Gaussian noise with Rayleigh fading. In this paper, and the sequel, new theoretical work has been contributed which substantially enhances existing performance analysis formulations. Major contributions include: substantial computational complexity reduction, including a priori BER accuracy bounding; and an analytical approach that facilitates performance evaluation for systems with arbitrary spectral spreading distributions and non-uniform transmission delay distributions. Using prior results, augmented by these enhancements, a generalized DS-CDMA system model is constructed and used to evaluate the BER performance in a variety of scenarios. In this paper, the generalized system modeling was used to evaluate the performance of both Walsh-Hadamard (WH) and Walsh-Hadamard-seeded zero-correlation-zone (WH-ZCZ) coding. The selection of these codes was informed by the observation that WH codes contain N spectral spreading values (0 to N - 1), one for each code sequence, while WH-ZCZ codes contain only two spectral spreading values (N/2 - 1, N/2), where N is the sequence length in chips. Since these codes span the spectral spreading range for DS-CDMA coding, by invoking an induction argument, the generalization of the system model is sufficiently supported. The results in this paper, and the sequel, support the claim that an arbitrarily accurate performance analysis for DS-CDMA systems can be evaluated over the full range of binary coding, with minimal computational complexity.
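For orientation only: the textbook average BER for BPSK over flat Rayleigh fading, Pb = (1/2)(1 - sqrt(g/(1+g))) with g the average SNR, is a far simpler closed form than the asynchronous DS-CDMA analysis described above, but it illustrates the kind of quantity being evaluated:

```python
from math import sqrt

def bpsk_rayleigh_ber(snr_linear):
    """Textbook average BER for BPSK over flat Rayleigh fading:
    Pb = 0.5 * (1 - sqrt(g / (1 + g))), g = average SNR (linear scale).
    A baseline only; not the generalized DS-CDMA formulation of the paper."""
    g = float(snr_linear)
    return 0.5 * (1.0 - sqrt(g / (1.0 + g)))
```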
Abstract: Two physically based and deterministic models, CASC2-D and KINEROS, are evaluated and compared for their performances on modeling sediment movement on a small agricultural watershed over several events. Each model has a different conceptualization of a watershed. CASC...
Two physically based watershed models, GSSHA and KINEROS-2 are evaluated and compared for their performances on modeling flow and sediment movement. Each model has a different watershed conceptualization. GSSHA divides the watershed into cells, and flow and sediments are routed t...
Conceptual Modeling Framework for E-Area PA HELP Infiltration Model Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dyer, J. A.
A conceptual modeling framework based on the proposed E-Area Low-Level Waste Facility (LLWF) closure cap design is presented for conducting Hydrologic Evaluation of Landfill Performance (HELP) model simulations of intact and subsided cap infiltration scenarios for the next E-Area Performance Assessment (PA).
Brief Lags in Interrupted Sequential Performance: Evaluating a Model and Model Evaluation Method
2015-01-05
rehearsal mechanism in the model. To evaluate the model we developed a simple new goodness-of-fit test based on analysis of variance that offers an...repeated step). Sequential constraints are common in medicine, equipment maintenance, computer programming and technical support, data analysis...legal analysis, accounting, and many other home and workplace environments. Sequential constraints also play a role in such basic cognitive processes
Within the context of the Air Quality Model Evaluation International Initiative phase 2 (AQMEII2) project, this part II paper performs a multi-model assessment of major column abundances of gases, radiation, aerosol, and cloud variables for 2006 and 2010 simulations with three on...
Evaluating Organizational Performance: Rational, Natural, and Open System Models
ERIC Educational Resources Information Center
Martz, Wes
2013-01-01
As the definition of organization has evolved, so have the approaches used to evaluate organizational performance. During the past 60 years, organizational theorists and management scholars have developed a comprehensive line of thinking with respect to organizational assessment that serves to inform and be informed by the evaluation discipline.…
NASA Astrophysics Data System (ADS)
Bouaziz, Laurène; de Boer-Euser, Tanja; Brauer, Claudia; Drogue, Gilles; Fenicia, Fabrizio; Grelier, Benjamin; de Niel, Jan; Nossent, Jiri; Pereira, Fernando; Savenije, Hubert; Thirel, Guillaume; Willems, Patrick
2016-04-01
International collaboration between institutes and universities is a promising way to reach consensus on hydrological model development. Education, experience and expert knowledge of the hydrological community have resulted in the development of a great variety of model concepts, calibration methods and analysis techniques. Although comparison studies are very valuable for international cooperation, they often do not lead to very clear new insights regarding the relevance of the modelled processes. We hypothesise that this is partly caused by model complexity and the comparison methods used, which focus on a good overall performance instead of on specific events. We propose an approach that focuses on the evaluation of specific events. Eight international research groups calibrated their model for the Ourthe catchment in Belgium (1607 km2) and carried out a validation in time for the Ourthe (i.e. on two different periods, one of them in blind mode for the modellers) and a validation in space for nested and neighbouring catchments of the Meuse in a completely blind mode. For each model, the same protocol was followed and an ensemble of best performing parameter sets was selected. Signatures were first used to assess model performances in the different catchments during validation. Comparison of the models was then followed by evaluation of selected events, which include low flows, high flows and the transition from low to high flows. While the models show rather similar performances based on general metrics (i.e. Nash-Sutcliffe Efficiency), clear differences can be observed for specific events. While most models are able to simulate high flows well, large differences are observed during low flows and in the ability to capture the first peaks after drier months. The transferability of model parameters to neighbouring and nested catchments is assessed as an additional measure in the model evaluation.
This suggested approach helps to select, among competing model alternatives, the most suitable model for a specific purpose.
Global evaluation of runoff from 10 state-of-the-art hydrological models
NASA Astrophysics Data System (ADS)
Beck, Hylke E.; van Dijk, Albert I. J. M.; de Roo, Ad; Dutra, Emanuel; Fink, Gabriel; Orth, Rene; Schellekens, Jaap
2017-06-01
Observed streamflow data from 966 medium-sized catchments (1000-5000 km2) around the globe were used to comprehensively evaluate the daily runoff estimates (1979-2012) of six global hydrological models (GHMs) and four land surface models (LSMs) produced as part of tier-1 of the eartH2Observe project. The models were all driven by the WATCH Forcing Data ERA-Interim (WFDEI) meteorological dataset, but used different datasets for non-meteorological inputs and were run at various spatial and temporal resolutions, although all data were re-sampled to a common 0.5° spatial and daily temporal resolution. For the evaluation, we used a broad range of performance metrics related to important aspects of the hydrograph. We found pronounced inter-model performance differences, underscoring the importance of hydrological model uncertainty in addition to climate input uncertainty, for example in studies assessing the hydrological impacts of climate change. The uncalibrated GHMs were found to perform, on average, better than the uncalibrated LSMs in snow-dominated regions, while the ensemble mean was found to perform only slightly worse than the best (calibrated) model. The inclusion of less-accurate models did not appreciably degrade the ensemble performance. Overall, we argue that more effort should be devoted to calibrating and regionalizing the parameters of macro-scale models. We further found that, despite adjustments using gauge observations, the WFDEI precipitation data still contain substantial biases that propagate into the simulated runoff. The early bias in the spring snowmelt peak exhibited by most models is probably primarily due to the widespread precipitation underestimation at high northern latitudes.
NASA Technical Reports Server (NTRS)
1977-01-01
The development of a framework and structure for Shuttle-era unmanned spacecraft projects and the development of a commonality evaluation model are documented. The methodology developed for model utilization in performing cost trades and comparative evaluations for commonality studies is discussed. The model framework consists of categories of activities associated with the spacecraft system's development process. The model structure describes the physical elements to be treated as separate identifiable entities. Cost estimating relationships for subsystem and program-level components were calculated.
Model for Predicting the Performance of Planetary Suit Hip Bearing Designs
NASA Technical Reports Server (NTRS)
Cowley, Matthew S.; Margerum, Sarah; Harvill, Lauren; Rajulu, Sudhakar
2012-01-01
Designing a space suit is very complex and often requires difficult trade-offs between performance, cost, mass, and system complexity. During the development period of the suit, numerous design iterations need to occur before the hardware meets human performance requirements. Using computer models early in the design phase of hardware development is advantageous because it allows virtual prototyping to take place. A virtual design environment allows designers to think creatively, exhaust design possibilities, and study design impacts on suit and human performance. A model of the rigid components of the Mark III Technology Demonstrator Suit (planetary-type space suit) and a human manikin were created and tested in a virtual environment. The performance of the Mark III hip bearing model was first developed and evaluated virtually by comparing the differences in mobility performance between the nominal bearing configurations and modified bearing configurations. Suited human performance was then simulated with the model and compared to actual suited human performance data using the same bearing configurations. The Mark III hip bearing model was able to visually represent complex bearing rotations and the theoretical volumetric ranges of motion in three dimensions. The model was also able to predict suited human hip flexion and abduction maximums to within 10% of the actual suited human subject data, except for one modified bearing condition in hip flexion, which was off by 24%. Differences between the model predictions and the human subject performance data were attributed to the lack of joint moment limits in the model, human subject fitting issues, and the limited suit experience of some of the subjects. The results demonstrate that modeling space suit rigid segments is a feasible design tool for evaluating and optimizing suited human performance. Keywords: space suit, design, modeling, performance
Bio-SimVerb and Bio-SimLex: wide-coverage evaluation sets of word similarity in biomedicine.
Chiu, Billy; Pyysalo, Sampo; Vulić, Ivan; Korhonen, Anna
2018-02-05
Word representations support a variety of Natural Language Processing (NLP) tasks. The quality of these representations is typically assessed by comparing the distances in the induced vector spaces against human similarity judgements. Whereas comprehensive evaluation resources have recently been developed for the general domain, similar resources for biomedicine currently suffer from a lack of coverage, both in terms of the word types included and the semantic distinctions covered. Notably, verbs have been excluded, although they are essential for the interpretation of biomedical language. Further, current resources do not discern between semantic similarity and semantic relatedness, although this distinction has been shown to be an important predictor of the usefulness of word representations and of their performance in downstream applications. We present two novel comprehensive resources targeting the evaluation of word representations in biomedicine. These resources, Bio-SimVerb and Bio-SimLex, address the previously mentioned problems, and can be used for evaluations of verb and noun representations, respectively. In our experiments, we computed the Pearson correlation between performances on intrinsic and extrinsic tasks using twelve popular state-of-the-art representation models (e.g. word2vec models). The intrinsic-extrinsic correlations using our datasets are notably higher than with previous intrinsic evaluation benchmarks such as UMNSRS and MayoSRS. In addition, when evaluating representation models for their abilities to capture verb and noun semantics individually, we show considerable variation between performances across all models. Bio-SimVerb and Bio-SimLex enable intrinsic evaluation of word representations. This evaluation can serve as a predictor of performance on various downstream tasks in the biomedical domain. 
The results on Bio-SimVerb and Bio-SimLex using standard word representation models highlight the importance of developing dedicated evaluation resources for NLP in biomedicine for particular word classes (e.g. verbs). These are needed to identify the most accurate methods for learning class-specific representations. Bio-SimVerb and Bio-SimLex are publicly available.
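Intrinsic evaluation of the kind these resources support scores each word pair by vector similarity and correlates the result with human ratings. A minimal sketch, with toy three-dimensional vectors and hypothetical ratings standing in for trained biomedical embeddings and the Bio-SimVerb/Bio-SimLex annotations:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two word vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def intrinsic_score(embeddings, pairs, human_scores):
    """Pearson correlation between model similarities and human judgements."""
    model_scores = [cosine(embeddings[a], embeddings[b]) for a, b in pairs]
    return float(np.corrcoef(model_scores, human_scores)[0, 1])

# Toy vectors and ratings -- entirely hypothetical, for illustration only.
emb = {
    "inhibit":  np.array([0.9, 0.1, 0.0]),
    "suppress": np.array([0.8, 0.2, 0.1]),
    "cell":     np.array([0.0, 0.1, 0.9]),
    "tissue":   np.array([0.1, 0.0, 0.8]),
}
pairs = [("inhibit", "suppress"), ("inhibit", "cell"), ("cell", "tissue")]
human = [9.0, 1.0, 8.5]   # hypothetical 0-10 similarity ratings
r = intrinsic_score(emb, pairs, human)
```

A higher correlation indicates that the embedding space agrees more closely with human similarity judgements.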
Pimperl, A; Schreyögg, J; Rothgang, H; Busse, R; Glaeske, G; Hildebrandt, H
2015-12-01
Transparency of the economic performance of integrated care systems (IV) is a basic requirement for the acceptance and further development of integrated care. Diverse evaluation methods are used but are seldom openly discussed because of the proprietary nature of the different business models. The aim of this article is to develop a generic model for measuring the economic performance of IV interventions. A catalogue of five quality criteria is used to discuss different evaluation methods (uncontrolled before-after studies, control group-based approaches, regression models). On this basis a best practice model is proposed. A regression model based on the German morbidity-based risk structure equalisation scheme (MorbiRSA) has some benefits in comparison to the other methods mentioned. In particular, it requires fewer resources to implement and offers advantages concerning the reliability and the transparency of the method (important for acceptance). Its validity is also sound. Although RCTs and, to a lesser extent, complex difference-in-difference matching approaches can lead to higher validity of the results, their feasibility in real-life settings is limited for economic and practical reasons. That is why central criticisms of a MorbiRSA-based model were addressed, and adaptations were proposed and incorporated in a best practice model: the population-oriented morbidity-adjusted margin improvement model (P-DBV(MRSA)). The P-DBV(MRSA) approach may be used as a standardised best practice model for the economic evaluation of IV. Parallel to the proposed approach for measuring economic performance, a balanced, quality-oriented performance measurement system should be introduced. This should prevent incentivising IV players to undertake short-term cost cutting at the expense of quality. © Georg Thieme Verlag KG Stuttgart · New York.
Evaluating synoptic systems in the CMIP5 climate models over the Australian region
NASA Astrophysics Data System (ADS)
Gibson, Peter B.; Uotila, Petteri; Perkins-Kirkpatrick, Sarah E.; Alexander, Lisa V.; Pitman, Andrew J.
2016-10-01
Climate models are our principal tool for generating the projections used to inform climate change policy. Our confidence in projections depends, in part, on how realistically they simulate present day climate and associated variability over a range of time scales. Climate models have traditionally been assessed less often at time scales relevant to daily weather systems. Here we explore the utility of a self-organizing maps (SOMs) procedure for evaluating the frequency, persistence and transitions of daily synoptic systems in the Australian region simulated by state-of-the-art global climate models. In terms of skill in simulating the climatological frequency of synoptic systems, large spread was observed between models. A positive association between all metrics was found, implying that relative skill in simulating the persistence and transitions of systems is related to skill in simulating the climatological frequency. Considering all models and metrics collectively, model performance was found to be related to model horizontal resolution but unrelated to vertical resolution or representation of the stratosphere. In terms of the SOM procedure, the timespan over which evaluation was performed had some influence on model performance skill measures, as did the number of circulation types examined. These findings have implications for selecting models most useful for future projections over the Australian region, particularly for projections related to synoptic scale processes and phenomena. More broadly, this study has demonstrated the utility of the SOMs procedure in providing a process-based evaluation of climate models.
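A SOM classifies each day's circulation field onto a small grid of representative synoptic types; model skill can then be measured by how well simulated type frequencies match observed ones. A minimal, self-contained sketch of the idea (illustrative only; published studies use dedicated SOM packages, larger grids, and gridded circulation fields):

```python
import numpy as np

def train_som(data, rows=2, cols=2, iters=500, seed=0):
    """Fit a small self-organizing map; each node becomes a circulation 'type'."""
    rng = np.random.default_rng(seed)
    nodes = rng.normal(size=(rows * cols, data.shape[1]))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    for t in range(iters):
        lr = 0.5 * (1 - t / iters)               # decaying learning rate
        sigma = max(1.0 * (1 - t / iters), 0.3)  # decaying neighbourhood width
        x = data[rng.integers(len(data))]
        bmu = np.argmin(((nodes - x) ** 2).sum(axis=1))  # best-matching unit
        h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1) / (2 * sigma ** 2))
        nodes += lr * h[:, None] * (x - nodes)   # pull nodes toward the sample
    return nodes

def node_frequencies(data, nodes):
    """Relative frequency with which each synoptic type occurs in a dataset."""
    bmus = np.argmin(((data[:, None, :] - nodes[None]) ** 2).sum(axis=2), axis=1)
    return np.bincount(bmus, minlength=len(nodes)) / len(data)
```

Evaluating a climate model then amounts to training the SOM on a reference dataset and comparing `node_frequencies` for the reference against the same quantity computed from model output.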
NASA Astrophysics Data System (ADS)
Giordano, Lea; Brunner, Dominik; Im, Ulas; Galmarini, Stefano
2014-05-01
The Air Quality Model Evaluation International Initiative (AQMEII), coordinated by the EC-JRC and US-EPA, has since 2008 promoted research on regional air quality model evaluation across the atmospheric modelling communities of Europe and North America. AQMEII has now reached its Phase 2, which is dedicated to the evaluation of on-line coupled chemistry-meteorology models, as opposed to Phase 1, where only off-line models were considered. At the European level, AQMEII collaborates with the COST Action "European framework for on-line integrated air quality and meteorology modelling" (EuMetChem). All European groups participating in AQMEII performed simulations over the same spatial domain (Europe at a resolution of about 20 km), using the same simulation strategy (e.g. no nudging allowed) and, as far as possible, the same input data. The initial and boundary conditions (IC/BC) were shared between all groups. Emissions were provided by the TNO-MACC database for anthropogenic emissions and the FMI database for biomass burning emissions. Chemical IC/BC data were taken from IFS-MOZART output, and meteorological IC/BC from the ECMWF global model. Evaluation data sets were collected by the Joint Research Center (JRC) and include measurements from surface in situ networks (AirBase and EMEP), vertical profiles from ozone sondes and aircraft (MOZAIC), and remote sensing (AERONET, satellites). Since Phase 2 focuses on on-line coupled models, a special effort is devoted to the detailed speciation of particulate matter components, with the goal of studying feedback processes. For the AQMEII exercise, COSMO-ART has been run with 40 levels of vertical resolution and a chemical scheme that includes the SCAV module of Knote and Brunner (ACP 2013) for wet-phase chemistry and the SOA treatment according to the VBS (volatility basis set) approach (Athanasopoulou et al., ACP 2013). 
The COSMO-ART evaluation shows that, alongside good performance in the meteorology, the gas-phase chemistry is well captured throughout the year; the few cases showing a systematic underestimation of chemical concentrations arise as a consequence of the boundary conditions. Through this exercise we have identified the main critical issues in the COSMO-ART performance: the sea salt and dust particulate matter components. The AQMEII exercise has provided an excellent platform to evaluate the COSMO-ART performance against both measurement data and other European regional on-line coupled models. From the analysis we have been able to identify specific model deficiencies and situations where the model cannot satisfactorily reproduce the data. Our future work will focus on improving the modelling of these components.
WRF/CMAQ AQMEII3 Simulations of US Regional-Scale ...
Chemical boundary conditions are a key input to regional-scale photochemical models. In this study, performed during the third phase of the Air Quality Model Evaluation International Initiative (AQMEII3), we perform annual simulations over North America with chemical boundary conditions prepared from four different global models. Results indicate that the impacts of different boundary conditions are significant for ozone throughout the year and most pronounced outside the summer season. The National Exposure Research Laboratory (NERL) Computational Exposure Division (CED) develops and evaluates data, decision-support tools, and models to be applied to media-specific or receptor-specific problem areas. CED uses modeling-based approaches to characterize exposures, evaluate fate and transport, and support environmental diagnostics/forensics with input from multiple data sources. It also develops media- and receptor-specific models, process models, and decision support tools for use both within and outside of EPA.
Effects of distributed database modeling on evaluation of transaction rollbacks
NASA Technical Reports Server (NTRS)
Mukkamala, Ravi
1991-01-01
Data distribution, degree of data replication, and transaction access patterns are key factors in determining the performance of distributed database systems. In order to simplify the evaluation of performance measures, database designers and researchers tend to make simplistic assumptions about the system. The effect of modeling assumptions is studied on the evaluation of one such measure, the number of transaction rollbacks, in a partitioned distributed database system. Six probabilistic models are developed, along with expressions for the number of rollbacks under each of these models. Essentially, the models differ in terms of the available system information. The analytical results so obtained are compared to results from simulation. It is concluded that most of the probabilistic models yield overly conservative estimates of the number of rollbacks. The effect of transaction commutativity on system throughput is also grossly underestimated when such models are employed.
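The comparison the abstract describes, analytical approximation versus simulation, can be illustrated with a toy Monte Carlo model in which a transaction rolls back whenever it touches a data item locked by the currently running transaction. This is a deliberately simplistic sketch (a single in-flight transaction and a uniform access pattern), not one of the paper's six probabilistic models:

```python
import random

def estimate_rollbacks(n_txn=10_000, n_items=100, items_per_txn=3, seed=42):
    """Monte Carlo estimate of the rollback rate: a transaction rolls back
    when its item set conflicts with the locks of the running transaction."""
    rng = random.Random(seed)
    rollbacks = 0
    locked = set()                # items held by the single in-flight transaction
    for _ in range(n_txn):
        txn = set(rng.sample(range(n_items), items_per_txn))
        if txn & locked:          # conflict with held locks -> rollback
            rollbacks += 1
        else:
            locked = txn          # this transaction now holds the locks
    return rollbacks / n_txn
```

An analytical expression can then be validated against this simulated rate; in the degenerate case where every transaction touches every item, all but the first transaction roll back.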
Information and complexity measures for hydrologic model evaluation
USDA-ARS?s Scientific Manuscript database
Hydrological models are commonly evaluated through residual-based performance measures such as the root-mean-square error or efficiency criteria. Such measures, however, do not evaluate the degree of similarity of patterns in simulated and measured time series. The objective of this study was to...
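Information-theoretic measures complement residual-based ones by quantifying shared pattern rather than pointwise error. The abstract above is truncated, so as a general illustration (not this manuscript's specific measures), here is a minimal sketch of discretized Shannon entropy and mutual information between measured and simulated series:

```python
import numpy as np

def entropy(x, bins=10):
    """Shannon entropy (bits) of a discretized hydrologic time series."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(obs, sim, bins=10):
    """Information shared between measured and simulated series:
    I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    joint, _, _ = np.histogram2d(obs, sim, bins=bins)
    p = joint / joint.sum()
    p = p[p > 0]
    h_joint = float(-np.sum(p * np.log2(p)))
    return entropy(obs, bins) + entropy(sim, bins) - h_joint
```

A simulation that reproduces the observed pattern shares high mutual information with the measurements even if a constant offset inflates its residual-based error.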
Evaluations of high-resolution dynamically downscaled ensembles over the contiguous United States
NASA Astrophysics Data System (ADS)
Zobel, Zachary; Wang, Jiali; Wuebbles, Donald J.; Kotamarthi, V. Rao
2018-02-01
This study uses the Weather Research and Forecasting (WRF) model to evaluate the performance of six dynamically downscaled decadal historical simulations with 12-km resolution for a large domain (7200 × 6180 km) that covers most of North America. The initial and boundary conditions are from three global climate models (GCMs) and one reanalysis dataset. The GCMs employed in this study are the Geophysical Fluid Dynamics Laboratory Earth System Model with Generalized Ocean Layer Dynamics component; the Community Climate System Model, version 4; and the Hadley Centre Global Environment Model, version 2-Earth System. The reanalysis data are from the National Centers for Environmental Prediction-U.S. Department of Energy Reanalysis II. We analyze the effects of bias-correcting the lateral boundary conditions and the effects of spectral nudging. We evaluate the model performance for seven surface variables and four upper atmospheric variables based on their climatology and extremes for seven subregions across the United States. The results indicate that a simulation's performance depends on both the location and the features/variables being tested. We find that the use of bias correction and/or nudging is beneficial in many situations, but employing these when running the RCM is not always an improvement when compared to the reference data. The use of an ensemble mean and median leads to better performance in measuring the climatology, while it is significantly biased for the extremes, showing much larger differences from the reference data than the individual GCM-driven model simulations. This study provides a comprehensive evaluation of these historical model runs in order to support informed decisions when making future projections.
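The finding that an ensemble mean tracks climatology well yet is biased for extremes follows directly from averaging: smoothing across members damps the tails of the distribution. A small sketch of both diagnostics (illustrative function names, not this study's code):

```python
import numpy as np

def climatology_bias(members, reference):
    """Mean bias of the ensemble mean and ensemble median against a reference."""
    return {
        "mean": float(members.mean(axis=0).mean() - reference.mean()),
        "median": float(np.median(members, axis=0).mean() - reference.mean()),
    }

def extreme_bias(members, reference, q=0.99):
    """Upper-tail bias: averaging members smooths the tails, so the
    ensemble-mean series tends to underestimate extreme quantiles."""
    ref_q = np.quantile(reference, q)
    return {
        "members": np.quantile(members, q, axis=1) - ref_q,   # per-member bias
        "mean": float(np.quantile(members.mean(axis=0), q) - ref_q),
    }
```

With synthetic members whose marginal distribution matches the reference, each member's 99th percentile is nearly unbiased while the ensemble-mean series substantially underestimates it.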
NASA Technical Reports Server (NTRS)
Zaychik, Kirill; Cardullo, Frank; George, Gary; Kelly, Lon C.
2009-01-01
In order to use the Hess Structural Model to predict the need for certain cueing systems, George and Cardullo significantly expanded it by adding motion feedback to the model and incorporating models of the motion system dynamics, the motion cueing algorithm, and a vestibular system. This paper proposes a methodology to evaluate the effectiveness of these innovations by performing a comparison analysis of the model performance with and without the expanded motion feedback. The proposed methodology is composed of two stages. The first stage involves fine-tuning parameters of the original Hess structural model in order to match the actual control behavior recorded during the experiments at the NASA Visual Motion Simulator (VMS) facility. The parameter tuning procedure utilizes a new automated parameter identification technique, which was developed at the Man-Machine Systems Lab at SUNY Binghamton. In the second stage of the proposed methodology, the expanded motion feedback is added to the structural model. The resulting performance of the model is then compared to that of the original one. As proposed by Hess, metrics to evaluate the performance of the models include comparison against the crossover-model standards imposed on the crossover frequency and phase margin of the overall man-machine system. Preliminary results indicate the advantage of having the models of the motion system and motion cueing incorporated into the model of the human operator. It is also demonstrated that the crossover frequency and the phase margin of the expanded model are well within the limits imposed by the crossover model.
Business School's Performance Management System Standards Design
ERIC Educational Resources Information Center
Azis, Anton Mulyono; Simatupang, Togar M.; Wibisono, Dermawan; Basri, Mursyid Hasan
2014-01-01
This paper aims to compare various Performance Management Systems (PMS) for business schools in order to find the strengths of each standard as inputs to design a new model of PMS. Many critical aspects and gaps are identified for the new model to improve performance, and it is even recognized that self-evaluation performance management is not well…
Adaptation of Mesoscale Weather Models to Local Forecasting
NASA Technical Reports Server (NTRS)
Manobianco, John T.; Taylor, Gregory E.; Case, Jonathan L.; Dianic, Allan V.; Wheeler, Mark W.; Zack, John W.; Nutter, Paul A.
2003-01-01
Methodologies have been developed for (1) configuring mesoscale numerical weather-prediction models for execution on high-performance computer workstations to make short-range weather forecasts for the vicinity of the Kennedy Space Center (KSC) and the Cape Canaveral Air Force Station (CCAFS) and (2) evaluating the performances of the models as configured. These methodologies have been implemented as part of a continuing effort to improve weather forecasting in support of operations of the U.S. space program. The models, methodologies, and results of the evaluations also have potential value for commercial users who could benefit from tailoring their operations and/or marketing strategies based on accurate predictions of local weather. More specifically, the purpose of developing the methodologies for configuring the models to run on computers at KSC and CCAFS is to provide accurate forecasts of winds, temperature, and such specific thunderstorm-related phenomena as lightning and precipitation. The purpose of developing the evaluation methodologies is to maximize the utility of the models by providing users with assessments of the capabilities and limitations of the models. The models used in this effort thus far include the Mesoscale Atmospheric Simulation System (MASS), the Regional Atmospheric Modeling System (RAMS), and the National Centers for Environmental Prediction Eta Model (Eta for short). The configuration of the MASS and RAMS is designed to run the models at very high spatial resolution and incorporate local data to resolve fine-scale weather features. Model preprocessors were modified to incorporate surface, ship, buoy, and rawinsonde data as well as data from local wind towers, wind profilers, and conventional or Doppler radars. The overall evaluation of the MASS, Eta, and RAMS was designed to assess the utility of these mesoscale models for satisfying the weather-forecasting needs of the U.S. space program. 
The evaluation methodology includes both objective and subjective verification components. Objective (e.g., statistical) verification of point forecasts is a stringent measure of model performance, but when used alone, it is not usually sufficient for quantifying the value of the overall contribution of the model to the weather-forecasting process. This is especially true for mesoscale models with enhanced spatial and temporal resolution that may be capable of predicting meteorologically consistent, though not necessarily accurate, fine-scale weather phenomena. Therefore, subjective (phenomenological) evaluation, focusing on selected case studies and specific weather features, such as sea breezes and precipitation, has been performed to help quantify the added value that cannot be inferred solely from objective evaluation.
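Objective point-forecast verification of event-type phenomena such as precipitation is commonly summarized with a 2x2 contingency table. A minimal sketch of three standard scores, probability of detection, false alarm ratio, and critical success index (the function name is illustrative):

```python
def contingency_scores(forecast, observed, threshold):
    """Verify an event forecast (e.g. precipitation >= threshold) against
    observations using standard contingency-table skill scores."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f >= threshold, o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
    pod = hits / (hits + misses) if hits + misses else float("nan")
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else 0.0
    denom = hits + misses + false_alarms
    csi = hits / denom if denom else float("nan")
    return {"POD": pod, "FAR": far, "CSI": csi}
```

POD rewards detection, FAR penalizes over-forecasting, and CSI combines both; as the abstract notes, such scores alone cannot capture the value of meteorologically consistent but displaced fine-scale features.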
NASA Technical Reports Server (NTRS)
Hooey, Becky Lee; Gore, Brian Francis; Mahlstedt, Eric; Foyle, David C.
2013-01-01
The objectives of the current research were to develop valid human performance models (HPMs) of approach and land operations; use these models to evaluate the impact of NextGen Closely Spaced Parallel Operations (CSPO) on pilot performance; and draw conclusions regarding flight deck display design and pilot-ATC roles and responsibilities for NextGen CSPO concepts. This document presents guidelines and implications for flight deck display designs and candidate roles and responsibilities. A companion document (Gore, Hooey, Mahlstedt, & Foyle, 2013) provides complete scenario descriptions and results including predictions of pilot workload, visual attention and time to detect off-nominal events.
Uncertainty Evaluation and Appropriate Distribution for the RDHM in the Rockies
NASA Astrophysics Data System (ADS)
Kim, J.; Bastidas, L. A.; Clark, E. P.
2010-12-01
The problems that hydrologic models have in properly reproducing the processes involved in mountainous areas, and in particular the Rocky Mountains, are widely acknowledged. Herein, we present an application of the National Weather Service RDHM distributed model over the Durango River basin in Colorado. We focus primarily on the assessment of the model prediction uncertainty associated with the parameter estimation and on the comparison of the model performance using parameters obtained with a priori estimation following the procedure of Koren et al. and those obtained via inverse modeling using a variety of Markov chain Monte Carlo based optimization algorithms. The model evaluation is based on traditional procedures as well as non-traditional ones based on the use of shape-matching functions, which are more appropriate for the evaluation of distributed information (e.g. the Hausdorff distance and the earth mover's distance). The variables used for the model performance evaluation are discharge (with internal nodes), snow cover, and snow water equivalent. An attempt to establish the proper degree of distribution for the Durango basin with the RDHM model is also presented.
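Shape-matching measures such as the Hausdorff distance compare spatial fields as point sets rather than cell by cell, which suits distributed outputs like snow-cover maps. A minimal numpy sketch for two small sets of grid coordinates (an illustration, not the RDHM evaluation code):

```python
import numpy as np

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets, e.g. the grid
    cells flagged as snow-covered in simulated and observed fields."""
    # Pairwise Euclidean distances between every point in a and every point in b.
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2))
    # Largest nearest-neighbour distance, taken in both directions.
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))
```

Unlike a cell-by-cell error, this measure stays small when the simulated pattern is merely shifted slightly relative to the observed one.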
Experimental and analytical studies of advanced air cushion landing systems
NASA Technical Reports Server (NTRS)
Lee, E. G. S.; Boghani, A. B.; Captain, K. M.; Rutishauser, H. J.; Farley, H. L.; Fish, R. B.; Jeffcoat, R. L.
1981-01-01
Several concepts are developed for air cushion landing systems (ACLS) which have the potential for improving performance characteristics (roll stiffness, heave damping, and trunk flutter), and reducing fabrication cost and complexity. After an initial screening, the following five concepts were evaluated in detail: damped trunk, filled trunk, compartmented trunk, segmented trunk, and roll feedback control. The evaluation was based on tests performed on scale models. An ACLS dynamic simulation developed earlier is updated so that it can be used to predict the performance of full-scale ACLS incorporating these refinements. The simulation was validated through scale-model tests. A full-scale ACLS based on the segmented trunk concept was fabricated and installed on the NASA ACLS test vehicle, where it is used to support advanced system development. A geometrically-scaled model (one third full scale) of the NASA test vehicle was fabricated and tested. This model, evaluated by means of a series of static and dynamic tests, is used to investigate scaling relationships between reduced and full-scale models. The analytical model developed earlier is applied to simulate both the one third scale and the full scale response.
Wind Energy Conversion System Analysis Model (WECSAM) computer program documentation
NASA Astrophysics Data System (ADS)
Downey, W. T.; Hendrick, P. L.
1982-07-01
Described is a computer-based wind energy conversion system analysis model (WECSAM) developed to predict the technical and economic performance of wind energy conversion systems (WECS). The model is written in CDC FORTRAN V. The version described accesses a data base containing wind resource data, application loads, WECS performance characteristics, utility rates, state taxes, and state subsidies for a six state region (Minnesota, Michigan, Wisconsin, Illinois, Ohio, and Indiana). The model is designed for analysis at the county level. The computer model includes a technical performance module and an economic evaluation module. The modules can be run separately or together. The model can be run for any single user-selected county within the region or looped automatically through all counties within the region. In addition, the model has a restart capability that allows the user to modify any data-base value written to a scratch file prior to the technical or economic evaluation.
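The split between a technical performance module and an economic evaluation module can be illustrated with a toy power-curve calculation and a simple payback formula. All parameter values below are hypothetical placeholders, not WECSAM's actual data or relationships:

```python
def power_output(wind_speed, cut_in=3.5, rated=13.0, cut_out=25.0, rated_kw=50.0):
    """Technical module sketch: idealized turbine power curve with a cubic
    ramp between cut-in and rated wind speed (speeds in m/s, output in kW)."""
    if wind_speed < cut_in or wind_speed > cut_out:
        return 0.0
    if wind_speed >= rated:
        return rated_kw
    return rated_kw * ((wind_speed - cut_in) / (rated - cut_in)) ** 3

def simple_payback(annual_kwh, rate_per_kwh, capital_cost, subsidy=0.0):
    """Economic module sketch: years to recover the net capital cost
    from the value of the energy produced (taxes and escalation ignored)."""
    return (capital_cost - subsidy) / (annual_kwh * rate_per_kwh)
```

Running the technical module over a wind resource time series yields annual energy, which the economic module then converts into a payback period under local utility rates and subsidies.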
NASA Technical Reports Server (NTRS)
Daiker, Ron; Schnell, Thomas
2010-01-01
A human motor model was developed on the basis of performance data that was collected in a flight simulator. The motor model is under consideration as one component of a virtual pilot model for the evaluation of NextGen crew alerting and notification systems in flight decks. This model may be used in a digital Monte Carlo simulation to compare flight deck layout design alternatives. The virtual pilot model is being developed as part of a NASA project to evaluate multiple crew alerting and notification flight deck configurations. Model parameters were derived from empirical distributions of pilot data collected in a flight simulator experiment. The goal of this model is to simulate pilot motor performance in the approach-to-landing task. The unique challenges associated with modeling the complex dynamics of humans interacting with the cockpit environment are discussed, along with the current state and future direction of the model.
Choi, Wona; Rho, Mi Jung; Park, Jiyun; Kim, Kwang-Jum; Kwon, Young Dae; Choi, In Young
2013-06-01
Intensified competitiveness in the healthcare industry has increased the number of healthcare centers and propelled the introduction of customer relationship management (CRM) systems to meet diverse customer demands. This study aimed to develop the information system success model of the CRM system by investigating previously proposed indicators within the model. The evaluation of the CRM system covers three areas: the system characteristics area (system quality, information quality, and service quality), the user area (perceived usefulness and user satisfaction), and the performance area (personal performance and organizational performance). Detailed evaluation criteria for the three areas were developed, and their validity was verified by a survey administered to CRM system users in 13 nationwide health promotion centers. The survey data were analyzed by the structural equation modeling method, and the results confirmed that the model is feasible. Information quality and service quality showed a statistically significant relationship with perceived usefulness and user satisfaction. Consequently, perceived usefulness and user satisfaction had a significant influence on individual performance as well as an indirect influence on organizational performance. This study extends the research area on information system success from general information systems to CRM systems in health promotion centers, applying a previous information success model. This lays a foundation for evaluating health promotion center systems and provides a useful guide for successful implementation of hospital CRM systems.
Snyder, Jon J; Salkowski, Nicholas; Kim, S Joseph; Zaun, David; Xiong, Hui; Israni, Ajay K; Kasiske, Bertram L
2016-02-01
Created by the US National Organ Transplant Act in 1984, the Scientific Registry of Transplant Recipients (SRTR) is obligated to publicly report data on transplant program and organ procurement organization performance in the United States. These reports include risk-adjusted assessments of graft and patient survival, and programs performing worse or better than expected are identified. The SRTR currently maintains 43 risk adjustment models for assessing posttransplant patient and graft survival and, in collaboration with the SRTR Technical Advisory Committee, has developed and implemented a new systematic process for model evaluation and revision. Patient cohorts for the risk adjustment models are identified, and single-organ and multiorgan transplants are defined, then each risk adjustment model is developed following a prespecified set of steps. Model performance is assessed, the model is refit to a more recent cohort before each evaluation cycle, and then it is applied to the evaluation cohort. The field of solid organ transplantation is unique in the breadth of the standardized data that are collected. These data allow for quality assessment across all transplant providers in the United States. A standardized process of risk model development using data from national registries may enhance the field.
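The "performing worse or better than expected" assessments described above rest on comparing a program's observed outcomes with the expectation produced by a risk-adjustment model. The sketch below is a deliberately simplified illustration of that observed-to-expected idea, not SRTR's actual methodology: the per-patient risks, the program size, the observed death count, and the two-standard-deviation flagging rule are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def evaluate_program(predicted_risks, observed_deaths):
    """Observed-to-expected (O/E) comparison for one transplant program."""
    expected = predicted_risks.sum()          # model-expected deaths
    oe_ratio = observed_deaths / expected
    # Crude illustrative flag: "worse than expected" if observed deaths
    # exceed the expectation by more than two Poisson standard deviations.
    flagged = observed_deaths > expected + 2.0 * np.sqrt(expected)
    return oe_ratio, flagged

# Hypothetical risk-adjusted mortality probabilities for 500 transplants
# at one program (values invented; a fitted model would produce these).
risks = rng.uniform(0.01, 0.20, size=500)
oe, worse = evaluate_program(risks, observed_deaths=80)
print(round(float(oe), 2), bool(worse))
```

A real evaluation would, as the abstract notes, refit the risk model to a recent cohort before applying it, and would use a proper statistical test rather than this crude threshold.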
Evaluation of Troxler model 3411 nuclear gage.
DOT National Transportation Integrated Search
1978-01-01
The performance of the Troxler Electronics Laboratory Model 3411 nuclear gage was evaluated through laboratory tests on the Department's density and moisture standards and field tests on various soils, base courses, and bituminous concrete overlays t...
CMAQ Model Evaluation Framework
CMAQ is tested to establish the modeling system’s credibility in predicting pollutants such as ozone and particulate matter. Evaluation of CMAQ has been designed to assess the model’s performance for specific time periods and for specific uses.
Herd-of-origin effect on the post-weaning performance of centrally tested Nellore beef cattle.
de Rezende Neves, Haroldo Henrique; Polin dos Reis, Felipe; Motta Paterno, Flávia; Rocha Guarini, Aline; Carvalheiro, Roberto; da Silva, Lilian Regina; de Oliveira, João Ademir; Aidar de Queiroz, Sandra
2014-10-01
The objective of a performance test station is to evaluate the performance of potential breeding bulls earlier, in order to decrease the generation interval and thereby increase genetic gain. This study evaluates the herd-of-origin influence on end-of-test weight (ETW), average daily weight gain during testing (ADG), average daily weight gain during the adjustment period (ADGadj), rib eye area (REA), marbling (MARB), subcutaneous fat thickness (SFT), conformation (C), early finishing (EF), muscling (M), navel (N) and temperament (T) scores, and scrotal circumference (SC) of Nellore cattle that underwent a performance test. We evaluated 664 animals that participated in the performance tests conducted at the Center for Performance CRV Lagoa between 2007 and 2012. Components of variance for each trait were estimated by an animal model (model 1), using the restricted maximum likelihood method. An alternative animal model (model 2) included, in addition to the fixed effects present in model 1, the non-correlated random effect of herd-year (HY). A significant HY effect was observed on ETW, REA, SFT, ADGadj, C, and Cw (p < 0.05). The estimated heritability of all traits decreased when the HY effect was included in the model; also, the bull rank, in deciles, changed significantly for the traits ETW, REA, SFT, and C. The adjustment period did not completely remove the environmental effect of herd of origin on ETW, REA, SFT, and C. It is recommended that the herd-of-origin effect be included in the statistical models used to predict the breeding values of the participants of these performance tests.
Evaluation of the Community Multiscale Air Quality (CMAQ) Model Version 5.1
The AMAD will perform two CMAQ model simulations, one with the current publicly available version of the CMAQ model (v5.0.2) and the other with the new version of the CMAQ model (v5.1). The results of each model simulation are compared to observations and the performance of t...
Evaluation of weighted regression and sample size in developing a taper model for loblolly pine
Kenneth L. Cormier; Robin M. Reich; Raymond L. Czaplewski; William A. Bechtold
1992-01-01
A stem profile model, fit using pseudo-likelihood weighted regression, was used to estimate merchantable volume of loblolly pine (Pinus taeda L.) in the southeast. The weighted regression increased model fit marginally, but did not substantially increase model performance. In all cases, the unweighted regression models performed as well as the...
Performance Model and Sensitivity Analysis for a Solar Thermoelectric Generator
NASA Astrophysics Data System (ADS)
Rehman, Naveed Ur; Siddiqui, Mubashir Ali
2017-03-01
In this paper, a regression model for evaluating the performance of solar concentrated thermoelectric generators (SCTEGs) is established and the significance of contributing parameters is discussed in detail. The model is based on several natural, design and operational parameters of the system, including the thermoelectric generator (TEG) module and its intrinsic material properties, the connected electrical load, concentrator attributes, heat transfer coefficients, solar flux, and ambient temperature. The model is developed by fitting a response curve, using the least-squares method, to the results. The sample points for the model were obtained by simulating a thermodynamic model, also developed in this paper, over a range of values of input variables. These samples were generated employing the Latin hypercube sampling (LHS) technique using a realistic distribution of parameters. The coefficient of determination was found to be 99.2%. The proposed model is validated by comparing the predicted results with those in the published literature. In addition, based on the elasticity for parameters in the model, sensitivity analysis was performed and the effects of parameters on the performance of SCTEGs are discussed in detail. This research will contribute to the design and performance evaluation of any SCTEG system for a variety of applications.
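The workflow this abstract describes, sampling the simulator over a Latin hypercube of inputs, fitting a response surface by least squares, and checking the coefficient of determination, can be sketched roughly as follows. The "simulator" here is a made-up stand-in for the thermodynamic SCTEG model, and the hand-rolled Latin hypercube is a minimal version of the LHS technique; none of the functions or coefficients come from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def latin_hypercube(n_samples, n_dims, rng):
    """Minimal LHS: one random point in each of n_samples equal strata."""
    strata = np.tile(np.arange(n_samples), (n_dims, 1))
    strata = rng.permuted(strata, axis=1).T          # shuffle per dimension
    return (strata + rng.uniform(size=(n_samples, n_dims))) / n_samples

# Made-up stand-in for the thermodynamic simulator being emulated.
def simulator(x):
    return 3.0 + 2.0 * x[:, 0] - 1.5 * x[:, 1] + 0.5 * x[:, 0] * x[:, 1]

X = latin_hypercube(200, 2, rng)
y = simulator(X)

# Least-squares response surface with an interaction term.
A = np.column_stack([np.ones(len(X)), X[:, 0], X[:, 1], X[:, 0] * X[:, 1]])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Coefficient of determination of the fitted surface.
r2 = 1.0 - ((y - A @ coef) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(coef.round(2), round(float(r2), 3))
```

Because the toy simulator lies exactly in the span of the regression basis, the fit here is essentially perfect; the paper's 99.2% coefficient of determination reflects a response surface approximating a far richer physical model.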
Temporal Cyber Attack Detection.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ingram, Joey Burton; Draelos, Timothy J.; Galiardi, Meghan
Rigorous characterization of the performance and generalization ability of cyber defense systems is extremely difficult, making it hard to gauge uncertainty, and thus, confidence. This difficulty largely stems from a lack of labeled attack data that fully explores the potential adversarial space. Currently, performance of cyber defense systems is typically evaluated in a qualitative manner by manually inspecting the results of the system on live data and adjusting as needed. Additionally, machine learning has shown promise in deriving models that automatically learn indicators of compromise that are more robust than analyst-derived detectors. However, to generate these models, most algorithms require large amounts of labeled data (i.e., examples of attacks). Algorithms that do not require annotated data to derive models are similarly at a disadvantage, because labeled data is still necessary when evaluating performance. In this work, we explore the use of temporal generative models to learn cyber attack graph representations and automatically generate data for experimentation and evaluation. Training and evaluating cyber systems and machine learning models requires significant, annotated data, which is typically collected and labeled by hand for one-off experiments. Automatically generating such data helps derive/evaluate detection models and ensures reproducibility of results. Experimentally, we demonstrate the efficacy of generative sequence analysis techniques on learning the structure of attack graphs, based on a realistic example. These derived models can then be used to generate more data. Additionally, we provide a roadmap for future research efforts in this area.
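One simple instance of a temporal generative model for attack sequences is a first-order Markov chain fitted to labeled traces and then sampled to produce synthetic data. The event names and traces below are invented for illustration; the paper's models are more sophisticated, so treat this only as a sketch of the learn-structure-then-generate idea.

```python
import random
from collections import defaultdict

random.seed(42)

# Toy labeled attack traces (event names invented for illustration).
traces = [
    ["recon", "exploit", "install", "c2", "exfil"],
    ["recon", "recon", "exploit", "c2", "exfil"],
    ["recon", "exploit", "install", "c2", "c2", "exfil"],
]

# Learn first-order transition counts, with start/end markers.
counts = defaultdict(lambda: defaultdict(int))
for t in traces:
    seq = ["<start>"] + t + ["<end>"]
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1

def sample_trace(max_len=20):
    """Generate a synthetic attack sequence from the learned chain."""
    state, out = "<start>", []
    while len(out) < max_len:
        nxt = counts[state]
        state = random.choices(list(nxt), weights=list(nxt.values()))[0]
        if state == "<end>":
            break
        out.append(state)
    return out

synthetic = sample_trace()
print(synthetic)
```

Generated traces like this one can then pad out evaluation datasets, which is the reproducibility benefit the abstract emphasizes.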
Adam, Asrul; Ibrahim, Zuwairie; Mokhtar, Norrima; Shapiai, Mohd Ibrahim; Cumming, Paul; Mubin, Marizan
2016-01-01
Various peak models have been introduced to detect and analyze peaks in the time domain analysis of electroencephalogram (EEG) signals. In general, a peak model in the time domain analysis consists of a set of signal parameters, such as amplitude, width, and slope. Models including those proposed by Dumpala, Acir, Liu, and Dingle are routinely used to detect peaks in EEG signals acquired in clinical studies of epilepsy or eye blink. The optimal peak model is the one that gives the most reliable peak detection performance in a particular application. A fair measure of the performance of different models requires a common and unbiased platform. In this study, we evaluate the performance of the four different peak models using the extreme learning machine (ELM)-based peak detection algorithm. We found that the Dingle model gave the best performance, with 72 % accuracy in the analysis of real EEG data. Statistical analysis confirmed that the Dingle model afforded significantly better mean testing accuracy than did the Acir and Liu models, which were in the range 37-52 %, while the Dingle model did not differ significantly from the Dumpala model.
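For readers unfamiliar with the classifier used in this study, an extreme learning machine is a single-hidden-layer network whose input weights are random and fixed, with only the output weights solved by least squares. The sketch below shows that core recipe on synthetic stand-ins for peak-model features (amplitude, width, slope); it is not the study's implementation, data, or feature set.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, y, n_hidden=50):
    """ELM: random fixed hidden layer, output weights by least squares."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                  # random feature expansion
    beta = np.linalg.pinv(H) @ y            # Moore-Penrose least squares
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Synthetic stand-ins for peak features (amplitude, width, slope).
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(float)   # toy peak/non-peak label

W, b, beta = elm_train(X, y)
acc = float(((elm_predict(X, W, b, beta) > 0.5) == (y > 0.5)).mean())
print(round(acc, 2))
```

Because only `beta` is trained, an ELM is fast to fit, which makes it a convenient common platform for comparing feature sets from different peak models, as the study does.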
DOT National Transportation Integrated Search
1973-02-01
The volume presents the models used to analyze basic features of the system, establish feasibility of techniques, and evaluate system performance. The models use analytical expressions and computer simulations to represent the relationship between sy...
NASA Technical Reports Server (NTRS)
Putnam, Jacob P.; Untaroiu, Costin; Somers, Jeffrey
2014-01-01
In an effort to develop occupant protection standards for future multipurpose crew vehicles, the National Aeronautics and Space Administration (NASA) has looked to evaluate the test device for human occupant restraint with the modification kit (THOR-K) anthropomorphic test device (ATD) in relevant impact test scenarios. With the allowance and support of the National Highway Traffic Safety Administration, NASA has performed a series of sled impact tests on the latest developed THOR-K ATD. These tests were performed to match test conditions from human volunteer data previously collected by the U.S. Air Force. The objective of this study was to evaluate the THOR-K finite element (FE) model and the Total HUman Model for Safety (THUMS) FE model with respect to the tests performed. These models were evaluated in spinal and frontal impacts against kinematic and kinetic data recorded in ATD and human testing. Methods: The FE simulations were developed based on recorded pretest ATD/human position and sled acceleration pulses measured during testing. Predicted responses by both human and ATD models were compared to test data recorded under the same impact conditions. The kinematic responses of the models were quantitatively evaluated using the ISO-metric curve rating system. In addition, ATD injury criteria and human stress/strain data were calculated to evaluate the risk of injury predicted by the ATD and human model, respectively. Results: Preliminary results show well-correlated response between both FE models and their physical counterparts. In addition, predicted ATD injury criteria and human model stress/strain values are shown to positively relate. Kinematic comparison between human and ATD models indicates promising biofidelic response, although a slightly stiffer response is observed within the ATD. 
Conclusion: As a complement to ATD testing, numerical simulation provides an efficient means to assess vehicle safety throughout the design process and further improve the design of physical ATDs. The assessment of the THOR-K and THUMS FE models in a spaceflight testing condition is an essential first step to implementing these models in the computational evaluation of spacecraft occupant safety. Promising results suggest future use of these models in the aerospace field.
Gale, C P; Manda, S O M; Weston, C F; Birkhead, J S; Batin, P D; Hall, A S
2009-03-01
To compare the discriminative performance of the PURSUIT, GUSTO-1, GRACE, SRI and EMMACE risk models, assess their performance among risk supergroups and evaluate the EMMACE risk model over the wider spectrum of acute coronary syndrome (ACS). Observational study of a national registry. All acute hospitals in England and Wales. 100 686 cases of ACS between 2003 and 2005. Model performance (C-index) in predicting the likelihood of death over the time period for which they were designed. The C-index, or area under the receiver-operating curve, range 0-1, is a measure of the discriminative performance of a model. The C-indexes were: PURSUIT C-index 0.79 (95% confidence interval 0.78 to 0.80); GUSTO-1 0.80 (0.79 to 0.81); GRACE in-hospital 0.80 (0.80 to 0.81); GRACE 6-month 0.80 (0.79 to 0.80); SRI 0.79 (0.78 to 0.80); and EMMACE 0.78 (0.77 to 0.78). EMMACE maintained its ability to discriminate 30-day mortality throughout different ACS diagnoses. Recalibration of the model offered no notable improvement in performance over the original risk equation. For all models the discriminative performance was reduced in patients with diabetes, chronic renal failure or angina. The five ACS risk models maintained their discriminative performance in a large unselected English and Welsh ACS population, but performed less well in higher-risk supergroups. Simpler risk models had comparable performance to more complex risk models. The EMMACE risk score performed well across the wider spectrum of ACS diagnoses.
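The C-index reported throughout this abstract is, for a binary outcome such as 30-day mortality, equivalent to the area under the ROC curve: the probability that a randomly chosen patient who died was assigned a higher risk score than a randomly chosen survivor. A minimal pairwise implementation, with made-up scores and outcomes:

```python
import numpy as np

def c_index(risk_scores, died):
    """Fraction of (death, survivor) pairs where the death scored higher."""
    risk_scores = np.asarray(risk_scores, dtype=float)
    died = np.asarray(died, dtype=bool)
    cases, controls = risk_scores[died], risk_scores[~died]
    # Pairwise comparisons; ties count as half (standard AUC convention).
    greater = (cases[:, None] > controls[None, :]).sum()
    ties = (cases[:, None] == controls[None, :]).sum()
    return (greater + 0.5 * ties) / (len(cases) * len(controls))

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]   # hypothetical model risk scores
deaths = [1,   1,   0,   1,   0,   0  ]   # hypothetical 30-day outcomes
print(round(c_index(scores, deaths), 2))  # → 0.89
```

On this scale, the reviewed models' C-indexes of 0.78 to 0.80 mean roughly four out of five such patient pairs are ranked correctly; 0.5 would be chance.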
Evaluation of CASL boiling model for DNB performance in full scale 5x5 fuel bundle with spacer grids
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Seung Jun
As one of the main tasks of the FY17 CASL-THM activity, an evaluation study on the applicability of the CASL baseline boiling model for 5x5 DNB applications was conducted, and the predictive capability of the DNB analysis is reported here. While the baseline CASL boiling model (GEN-1A) approach was successfully implemented and validated with a single-pipe application in the previous year's task, extended DNB validation for realistic sub-channels with detailed spacer grid configurations was tasked in FY17. The focus of the current study is to demonstrate the robustness and feasibility of the CASL baseline boiling model for DNB performance in a full 5x5 fuel bundle application. A quantitative evaluation of the DNB predictive capability is performed by comparing with corresponding experimental measurements (i.e., the reference for model validation). The reference data are provided by the Westinghouse Electric Company (WEC). The two grid configurations tested here are the Non-Mixing Vane Grid (NMVG) and the Mixing Vane Grid (MVG). Thorough validation studies with the two sub-channel configurations are performed over a wide range of realistic PWR operational conditions.
NASA Technical Reports Server (NTRS)
Corker, Kevin; Pisanich, Gregory; Condon, Gregory W. (Technical Monitor)
1995-01-01
A predictive model of human operator performance (flight crew and air traffic control (ATC)) has been developed and applied in order to evaluate the impact of automation developments in flight management and air traffic control. The model is used to predict the performance of a two person flight crew and the ATC operators generating and responding to clearances aided by the Center TRACON Automation System (CTAS). The purpose of the modeling is to support evaluation and design of automated aids for flight management and airspace management and to predict required changes in procedure both air and ground in response to advancing automation in both domains. Additional information is contained in the original extended abstract.
ERIC Educational Resources Information Center
McCall, James P.
2011-01-01
The evaluation, improvement, and accountability of teachers has been the topic of the nation throughout the era of No Child Left Behind. Where some critics point to a business model of measuring outputs (i.e., student achievement scores on standardized tests) to evaluate teacher performance, others will advocate for a fair evaluation system that…
Pohjola, Mikko V.; Pohjola, Pasi; Tainio, Marko; Tuomisto, Jouni T.
2013-01-01
The calls for knowledge-based policy and policy-relevant research invoke a need to evaluate and manage environment and health assessments and models according to their societal outcomes. This review explores how well the existing approaches to assessment and model performance serve this need. The perspectives on assessment and model performance in the scientific literature can be called: (1) quality assurance/control, (2) uncertainty analysis, (3) technical assessment of models, (4) effectiveness and (5) other perspectives, according to what is primarily seen to constitute the goodness of assessments and models. The categorization is not strict, and methods, tools and frameworks in different perspectives may overlap. However, altogether it seems that most approaches to assessment and model performance are relatively narrow in their scope. The focus in most approaches is on the outputs and making of assessments and models. Practical application of the outputs and the consequential outcomes are often left unaddressed. It appears that more comprehensive approaches that combine the essential characteristics of different perspectives are needed. This necessitates a better account of the mechanisms of collective knowledge creation and the relations between knowledge and practical action. Some new approaches to assessment, modeling and their evaluation and management span the chain from knowledge creation to societal outcomes, but the complexity of evaluating societal outcomes remains a challenge. PMID:23803642
Although strong collaborations in the air pollution field have existed among the North American (NA) and European (EU) countries over the past five decades, regional-scale air quality model developments and model performance evaluations have been carried out independently unlike ...
Zhang, Jing; Liang, Lichen; Anderson, Jon R; Gatewood, Lael; Rottenberg, David A; Strother, Stephen C
2008-01-01
As functional magnetic resonance imaging (fMRI) becomes widely used, the demand for evaluation of fMRI processing pipelines and validation of fMRI analysis results is increasing rapidly. The current NPAIRS package, an IDL-based fMRI processing pipeline evaluation framework, lacks system interoperability and the ability to evaluate general linear model (GLM)-based pipelines using prediction metrics. Thus, it cannot fully evaluate fMRI analytical software modules such as FSL.FEAT and NPAIRS.GLM. In order to overcome these limitations, a Java-based fMRI processing pipeline evaluation system was developed. It integrated YALE (a machine learning environment) into Fiswidgets (an fMRI software environment) to obtain system interoperability and applied an algorithm to measure GLM prediction accuracy. The results demonstrated that the system can evaluate fMRI processing pipelines with univariate GLM and multivariate canonical variates analysis (CVA)-based models on real fMRI data based on prediction accuracy (classification accuracy) and statistical parametric image (SPI) reproducibility. In addition, a preliminary study was performed in which four fMRI processing pipelines with GLM and CVA modules such as FSL.FEAT and NPAIRS.CVA were evaluated with the system. The results indicated that (1) the system can compare different fMRI processing pipelines with heterogeneous models (NPAIRS.GLM, NPAIRS.CVA and FSL.FEAT) and rank their performance by automatic performance scoring, and (2) the rank of pipeline performance is highly dependent on the preprocessing operations. These results suggest that the system will be of value for the comparison, validation, standardization and optimization of functional neuroimaging software packages and fMRI processing pipelines.
Metrics for Evaluation of Student Models
ERIC Educational Resources Information Center
Pelanek, Radek
2015-01-01
Researchers use many different metrics for evaluation of performance of student models. The aim of this paper is to provide an overview of commonly used metrics, to discuss properties, advantages, and disadvantages of different metrics, to summarize current practice in educational data mining, and to provide guidance for evaluation of student…
Modeling and Performance Simulation of the Mass Storage Network Environment
NASA Technical Reports Server (NTRS)
Kim, Chan M.; Sang, Janche
2000-01-01
This paper describes the application of modeling and simulation in evaluating and predicting the performance of the mass storage network environment. Network traffic is generated to mimic the realistic pattern of file transfer, electronic mail, and web browsing. The behavior and performance of the mass storage network and a typical client-server Local Area Network (LAN) are investigated by modeling and simulation. Performance characteristics in throughput and delay demonstrate the important role of modeling and simulation in network engineering and capacity planning.
Performance modeling of automated manufacturing systems
NASA Astrophysics Data System (ADS)
Viswanadham, N.; Narahari, Y.
A unified and systematic treatment is presented of modeling methodologies and analysis techniques for performance evaluation of automated manufacturing systems. The book is the first treatment of the mathematical modeling of manufacturing systems. Automated manufacturing systems are surveyed and three principal analytical modeling paradigms are discussed: Markov chains, queues and queueing networks, and Petri nets.
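As a concrete taste of the queueing paradigm this book surveys, the classic M/M/1 queue (Poisson arrivals, exponential service, one server) has closed-form performance measures derived via Little's law. The arrival and service rates below are arbitrary example values:

```python
def mm1_metrics(arrival_rate, service_rate):
    """Analytic performance measures for a stable M/M/1 queue."""
    rho = arrival_rate / service_rate          # server utilization
    assert rho < 1, "queue is unstable when rho >= 1"
    L = rho / (1 - rho)                        # mean number in system
    W = L / arrival_rate                       # mean time in system (Little's law)
    Lq = rho ** 2 / (1 - rho)                  # mean number waiting in queue
    return {"utilization": rho, "L": L, "W": W, "Lq": Lq}

# Example workload: 4 jobs arrive per time unit, server completes 5.
m = mm1_metrics(arrival_rate=4.0, service_rate=5.0)
print(m)  # utilization 0.8, L = 4 jobs, W = 1 time unit
```

Such analytic models give quick capacity estimates for a single machine or workstation; the networked and synchronized behavior of full manufacturing systems is where the book's queueing-network and Petri-net formalisms take over.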
ERIC Educational Resources Information Center
Longo, Paul J.
This study explored the mechanics of using an enhanced, comprehensive multipurpose logic model, the Performance Blueprint, as a means of building evaluation capacity, referred to in this paper as performance measurement literacy, to facilitate the attainment of both service-delivery oriented and community-oriented outcomes. The application of this…
Maurer, Max; Lienert, Judit
2017-01-01
We compare the use of multi-criteria decision analysis (MCDA), or more precisely models used in multi-attribute value theory (MAVT), to integrated assessment (IA) models for supporting long-term water supply planning in a small-town case study in Switzerland. They are used to evaluate thirteen system-scale water supply alternatives in four future scenarios regarding forty-four objectives, covering technical, social, environmental, and economic aspects. The alternatives encompass both conventional and unconventional solutions and differ regarding technical, spatial and organizational characteristics. This paper focuses on the impact assessment and final evaluation step of the structured MCDA decision support process. We analyze the performance of the alternatives for ten stakeholders. We demonstrate the implications of model assumptions by comparing two IA and three MAVT evaluation model layouts of different complexity. For this comparison, we focus on the validity (ranking stability), desirability (value), and distinguishability (value range) of the alternatives given the five model layouts. These layouts exclude or include stakeholder preferences and uncertainties. Even though all five led us to identify the same best alternatives, they did not produce identical rankings. We found that the MAVT-type models provide higher distinguishability and a more robust basis for discussion than the IA-type models. The needed complexity of the model, however, should be determined based on the intended use of the model within the decision support process. The best-performing alternatives had consistently strong performance for all stakeholders and future scenarios, whereas the current water supply system was outperformed in all evaluation layouts. The best-performing alternatives comprise proactive pipe rehabilitation, adapted firefighting provisions, and decentralized water storage and/or treatment.
We present recommendations for possible ways of improving water supply planning in the case study and beyond. PMID:28481881
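The MAVT evaluation described above, in its simplest additive form, scores each alternative as a weighted sum of single-objective value scores. The toy numbers below (three alternatives, three objectives, one stakeholder's weights) are invented and only illustrate the aggregation and ranking step, not the study's forty-four-objective, ten-stakeholder model.

```python
import numpy as np

# Rows = water-supply alternatives, columns = objectives; scores are
# already normalized to a 0-1 value scale. All numbers are invented.
alternatives = ["status quo", "proactive rehab", "decentral storage"]
values = np.array([
    [0.3, 0.5, 0.4],    # status quo
    [0.8, 0.6, 0.7],    # proactive pipe rehabilitation
    [0.6, 0.9, 0.5],    # decentralized storage/treatment
])
weights = np.array([0.5, 0.3, 0.2])   # one stakeholder's trade-offs

# Additive MAVT aggregation: overall value = weighted sum of partial values.
overall = values @ weights
ranking = [alternatives[i] for i in np.argsort(-overall)]
print(dict(zip(alternatives, overall.round(2))), ranking)
```

Repeating this aggregation per stakeholder (different `weights`) and per scenario (different `values`) is what lets the study check whether the best alternatives stay on top, which they report they do.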
Source term model evaluations for the low-level waste facility performance assessment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yim, M.S.; Su, S.I.
1995-12-31
The estimation of release of radionuclides from various waste forms to the bottom boundary of the waste disposal facility (source term) is one of the most important aspects of LLW facility performance assessment. In this work, several currently used source term models are comparatively evaluated for the release of carbon-14 based on a test case problem. The models compared include PRESTO-EPA-CPG, IMPACTS, DUST and NEFTRAN-II. Major differences in assumptions and approaches between the models are described and key parameters are identified through sensitivity analysis. The source term results from different models are compared and other concerns or suggestions are discussed.
Rule based design of conceptual models for formative evaluation
NASA Technical Reports Server (NTRS)
Moore, Loretta A.; Chang, Kai; Hale, Joseph P.; Bester, Terri; Rix, Thomas; Wang, Yaowen
1994-01-01
A Human-Computer Interface (HCI) Prototyping Environment with embedded evaluation capability has been investigated. This environment will be valuable in developing and refining HCI standards and evaluating program/project interface development, especially Space Station Freedom on-board displays for payload operations. This environment, which allows for rapid prototyping and evaluation of graphical interfaces, includes the following four components: (1) an HCI development tool; (2) a low fidelity simulator development tool; (3) a dynamic, interactive interface between the HCI and the simulator; and (4) an embedded evaluator that evaluates the adequacy of an HCI based on a user's performance. The embedded evaluation tool collects data while the user is interacting with the system and evaluates the adequacy of an interface based on a user's performance. This paper describes the design of conceptual models for the embedded evaluation system using a rule-based approach.
Samuel A. Cushman; Jesse S. Lewis; Erin L. Landguth
2014-01-01
There have been few assessments of the performance of alternative resistance surfaces, and little is known about how connectivity modeling approaches differ in their ability to predict organism movements. In this paper, we evaluate the performance of four connectivity modeling approaches applied to two resistance surfaces in predicting the locations of highway...
The TRIM.FaTE Evaluation Report is composed of three volumes. Volume I presents conceptual, mechanistic, and structural complexity evaluations of various aspects of the model. Volumes II and III present performance evaluation.
Comprehensive system models: Strategies for evaluation
NASA Technical Reports Server (NTRS)
Field, Christopher; Kutzbach, John E.; Ramanathan, V.; Maccracken, Michael C.
1992-01-01
The task of evaluating comprehensive earth system models is vast, involving validation of every model component at every scale of organization, as well as tests of all the individual linkages. Even the most detailed evaluation of each of the component processes and the individual links among them should not, however, engender confidence in the performance of the whole. The integrated earth system is so rich with complex feedback loops, often involving components of the atmosphere, oceans, biosphere, and cryosphere, that it is certain to exhibit emergent properties very difficult to predict from the perspective of a narrow focus on any individual component of the system. Therefore, a substantial share of the task of evaluating comprehensive earth system models must reside at the level of whole-system evaluations. Since complete, integrated atmosphere/ocean/biosphere/hydrology models are not yet operational, questions of evaluation must be addressed at the level of the kinds of earth system processes that the models should be competent to simulate, rather than at the level of specific performance criteria. Here, we have tried to identify examples of earth system processes that are difficult to simulate with existing models and that involve a rich enough suite of feedbacks that they are unlikely to be satisfactorily described by highly simplified or toy models. Our purpose is not to specify a checklist of evaluation criteria but to introduce characteristics of the earth system that may present useful opportunities for model testing and, of course, improvement.
NASA Astrophysics Data System (ADS)
Dobson, B.; Pianosi, F.; Reed, P. M.; Wagener, T.
2017-12-01
In previous work, we have found that water supply companies are typically hesitant to use reservoir operation tools to inform their release decisions. We believe that this is, in part, due to a lack of faith in the fidelity of the optimization exercise with regard to its ability to represent the real world. In an attempt to quantify this, recent literature has studied the impact on performance of uncertainty arising in: forcing (e.g. reservoir inflows), parameters (e.g. parameters for the estimation of evaporation rate) and objectives (e.g. worst first percentile or worst case). We suggest that there is also epistemic uncertainty in the choices made during model creation, for example in the formulation of an evaporation model or in aggregating regional storages. We create 'rival framings' (a methodology originally developed to demonstrate the impact of uncertainty arising from alternate objective formulations), each with different modelling choices, and determine their performance impacts. We identify the Pareto approximate set of policies for several candidate formulations and then make them compete with one another in a large ensemble re-evaluation in each other's modelled spaces. This enables us to distinguish the impacts of different structural changes in the model used to evaluate system performance, in an effort to generalize the validity of the optimized performance expectations.
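Identifying a "Pareto approximate set of policies" as described above amounts to filtering out dominated solutions. A minimal nondominated-filter sketch, with invented (cost, supply deficit) outcomes for five hypothetical release policies, both objectives minimized:

```python
import numpy as np

def pareto_front(objs):
    """Return indices of nondominated points (all objectives minimized)."""
    objs = np.asarray(objs, dtype=float)
    keep = []
    for i, p in enumerate(objs):
        # p is dominated if some other point is no worse everywhere
        # and strictly better somewhere.
        dominated = np.any(np.all(objs <= p, axis=1) & np.any(objs < p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Invented (cost, deficit) outcomes for five candidate release policies.
policies = [(5.0, 2.0), (3.0, 4.0), (4.0, 3.0), (6.0, 1.0), (5.0, 3.0)]
print(pareto_front(policies))  # → [0, 1, 2, 3]; policy 4 is dominated by 0
```

The "rival framings" step then re-scores each front's policies inside the other model formulations, which only requires re-running the evaluation model, not the optimizer.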
Performance Evaluation and Parameter Identification on DROID III
NASA Technical Reports Server (NTRS)
Plumb, Julianna J.
2011-01-01
The DROID III project consisted of two main parts. The former, performance evaluation, focused on the performance characteristics of the aircraft, such as lift-to-drag ratio, thrust required for level flight, and rate of climb. The latter, parameter identification, focused on finding the aerodynamic coefficients for the aircraft using a system that creates a mathematical model to match the flight data of doublet maneuvers and the aircraft's response. Both portions of the project called for flight testing, and that data is now available as a result of this project. The conclusion of the project is that the performance evaluation data are well within desired standards but could be improved with a thrust model, and that parameter identification is still in need of more data processing but seems to produce reasonable results thus far.
On-Board Propulsion System Analysis of High Density Propellants
NASA Technical Reports Server (NTRS)
Schneider, Steven J.
1998-01-01
The impact of the performance and density of on-board propellants on the science payload mass of Discovery Program class missions is evaluated. A propulsion system dry mass model, anchored on flight-weight system data from the Near Earth Asteroid Rendezvous mission, is used. This model is used to evaluate the performance of liquid oxygen, hydrogen peroxide, hydroxylammonium nitrate, and oxygen difluoride oxidizers with hydrocarbon and metal hydride fuels. Results for the propellants evaluated indicate that state-of-the-art, Earth-storable propellants with high-performance rhenium engine technology in both the axial and attitude control systems have performance capabilities that can only be exceeded by liquid oxygen/hydrazine, liquid oxygen/diborane, and oxygen difluoride/diborane propellant combinations. Potentially lower ground operations costs are the incentive for working with nontoxic propellant combinations.
NASA Astrophysics Data System (ADS)
Javernick, Luke; Redolfi, Marco; Bertoldi, Walter
2018-05-01
New data collection techniques offer numerical modelers the ability to gather and utilize high quality data sets with high spatial and temporal resolution. Such data sets are currently needed for calibration, verification, and to fuel future model development, particularly morphological simulations. This study explores the use of high quality spatial and temporal data sets of observed bed load transport in braided river flume experiments to evaluate the ability of a two-dimensional model, Delft3D, to predict bed load transport. This study uses a fixed bed model configuration and examines the model's shear stress calculations, which are the foundation for predicting the sediment fluxes necessary for morphological simulations. The evaluation was conducted for three flow rates, and the model setup used highly accurate Structure-from-Motion (SfM) topography and discharge boundary conditions. The model was hydraulically calibrated using bed roughness, and performance was evaluated based on depth and inundation agreement. Model bed load performance was evaluated in terms of critical shear stress exceedance area compared to maps of observed bed mobility in the flume. Following the standard hydraulic calibration, bed load performance was tested for sensitivity to horizontal eddy viscosity parameterization and bed morphology updating. Simulations produced depth errors equal to the SfM inherent errors, inundation agreement of 77-85%, and critical shear stress exceedance in agreement with 49-68% of the observed active area. This study provides insight into the ability of physically based, two-dimensional simulations to accurately predict bed load, as well as the effects of horizontal eddy viscosity and bed updating. Further, this study highlights how using high spatial and temporal resolution data to capture the physical processes at work during flume experiments can help to improve morphological modeling.
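As an editorial illustration (not code from the study), the critical shear stress exceedance agreement described above can be sketched as a simple overlap measure between a modelled exceedance mask and an observed mobility map; all array values here are hypothetical:

```python
import numpy as np

def exceedance_agreement(tau, tau_crit, observed_active):
    """Fraction of observed-active cells where modelled shear stress
    exceeds the critical value (one simple agreement measure)."""
    predicted_active = tau > tau_crit
    overlap = np.logical_and(predicted_active, observed_active)
    return overlap.sum() / observed_active.sum()

# Toy 2x2 grid: modelled shear stress (Pa), critical value, observed mobility
tau = np.array([[1.5, 0.5], [2.0, 0.8]])
observed = np.array([[True, False], [True, True]])
print(exceedance_agreement(tau, 1.0, observed))  # 2 of 3 observed-active cells
```

The study's 49-68% agreement figures are percentages of the observed active area, which is what this ratio expresses.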
Marsot, Amélie; Michel, Fabrice; Chasseloup, Estelle; Paut, Olivier; Guilhaumou, Romain; Blin, Olivier
2017-10-01
An external evaluation of the phenobarbital population pharmacokinetic model described by Marsot et al. was performed in a pediatric intensive care unit. Model evaluation is an important issue for dose adjustment. This external evaluation should allow confirmation of the proposed dosage adaptation and extension of these recommendations to the entire pediatric intensive care population. The external evaluation of the published population pharmacokinetic model of Marsot et al. was performed on a new retrospective dataset of 35 patients hospitalized in a pediatric intensive care unit. The published population pharmacokinetic model was implemented in NONMEM 7.3. Predictive performance was assessed by quantifying the bias and inaccuracy of model predictions. Normalized prediction distribution errors (NPDE) and visual predictive checks (VPC) were also evaluated. A total of 35 infants were studied, with a mean age of 33.5 weeks (range: 12 days-16 years) and a mean weight of 12.6 kg (range: 2.7-70.0 kg). The model predicted the observed phenobarbital concentrations with reasonable bias and inaccuracy. The median prediction error was 3.03% (95% CI: -8.52 to 58.12%), and the median absolute prediction error was 26.20% (95% CI: 13.07-75.59%). No trends in the NPDE or VPC were observed. The model previously proposed by Marsot et al. in neonates hospitalized in the intensive care unit was externally validated for IV infusion administration. The model-based dosing regimen was extended to the entire pediatric intensive care unit population to optimize treatment. Because of inter- and intraindividual pharmacokinetic variability, this dosing regimen should be combined with therapeutic drug monitoring. © 2017 Société Française de Pharmacologie et de Thérapeutique.
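The median prediction error and median absolute prediction error reported in this record are standard external-evaluation statistics; a minimal sketch of how they are computed, with hypothetical concentrations (not the study's data):

```python
import numpy as np

def prediction_errors(observed, predicted):
    """Percentage prediction errors with the median bias and median
    inaccuracy summaries used in external PK model evaluation."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    pe = (pred - obs) / obs * 100.0        # prediction error, %
    return np.median(pe), np.median(np.abs(pe))

# Hypothetical concentrations (mg/L): observed vs model-predicted
mdpe, mdape = prediction_errors([10.0, 20.0, 15.0], [11.0, 18.0, 15.0])
print(mdpe, mdape)  # bias ~0%, inaccuracy ~10%
```

Bias (median PE) near zero with nonzero inaccuracy (median |PE|), as in this toy example, is the pattern the abstract describes as "reasonable bias and inaccuracy".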
A Universal Model for Evaluating Basic Electronic Courses in Terms of Field Utilization of Training.
ERIC Educational Resources Information Center
Air Force Occupational Measurement Center, Lackland AFB, TX.
The main purpose of the Air Force project was to develop a universal model to evaluate usage of basic electronic principles training. The criterion used by the model to evaluate electronic theory training is a determination of the usefulness of the training vis-a-vis the performance of assigned tasks in the various electronic career fields. Data…
ERIC Educational Resources Information Center
Kaliski, Pamela; Wind, Stefanie A.; Engelhard, George, Jr.; Morgan, Deanna; Plake, Barbara; Reshetar, Rosemary
2012-01-01
The Many-Facet Rasch (MFR) Model is traditionally used to evaluate the quality of ratings on constructed response assessments; however, it can also be used to evaluate the quality of judgments from panel-based standard setting procedures. The current study illustrates the use of the MFR Model by examining the quality of ratings obtained from a…
NASA Astrophysics Data System (ADS)
Yahya, Khairunnisa; Wang, Kai; Campbell, Patrick; Chen, Ying; Glotfelty, Timothy; He, Jian; Pirhalla, Michael; Zhang, Yang
2017-03-01
An advanced online-coupled meteorology-chemistry model, i.e., the Weather Research and Forecasting Model with Chemistry (WRF/Chem), is applied for the current (2001-2010) and future (2046-2055) decades under the representative concentration pathway (RCP) 4.5 and 8.5 scenarios to examine changes in future climate, air quality, and their interactions. In this Part I paper, a comprehensive model evaluation is carried out for the current decade to assess the performance of WRF/Chem and WRF under both scenarios and the benefits of downscaling the North Carolina State University (NCSU) version of the Community Earth System Model (CESM_NCSU) using WRF/Chem. The evaluation of WRF/Chem shows an overall good performance for most meteorological and chemical variables on a decadal scale. The 2-m temperature is overpredicted by WRF (by ∼0.2-0.3 °C) but underpredicted by WRF/Chem (by ∼0.3-0.4 °C), due to higher radiation from WRF. Both WRF and WRF/Chem show large overpredictions of precipitation, indicating limitations in their microphysics or convective parameterizations. WRF/Chem with prognostic chemical concentrations, however, performs much better than WRF with prescribed chemical concentrations for radiation variables, illustrating the benefit of predicting gases and aerosols and representing their feedbacks into meteorology in WRF/Chem. WRF/Chem performs much better than CESM_NCSU for most surface meteorological variables and hourly O3 mixing ratios. In addition, WRF/Chem better captures observed temporal and spatial variations than CESM_NCSU. CESM_NCSU performance for radiation variables is comparable to or better than WRF/Chem performance because of the model tuning in CESM_NCSU that is routinely made in global models.
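The over/underprediction statements in this record rest on simple paired model-observation statistics; a sketch of the two most common ones, mean bias and RMSE, with hypothetical temperatures (not the study's data):

```python
import math

def mean_bias(model, obs):
    """Mean bias (model minus observation); positive = overprediction."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)

def rmse(model, obs):
    """Root-mean-square error of the paired series."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))

# Hypothetical 2-m temperatures (degrees C): model vs observations
obs = [10.0, 12.0, 14.0]
model = [10.3, 12.3, 14.3]
print(mean_bias(model, obs), rmse(model, obs))  # ~0.3 degree warm bias
```

A uniform offset like this gives equal bias and RMSE; scatter around the observations raises RMSE above the absolute bias.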
Solid rocket booster performance evaluation model. Volume 1: Engineering description
NASA Technical Reports Server (NTRS)
1974-01-01
The space shuttle solid rocket booster performance evaluation model (SRB-II) is made up of analytical and functional simulation techniques linked together so that a single pass through the model will predict the performance of the propulsion elements of a space shuttle solid rocket booster. The available options allow the user to predict static test performance, predict nominal and off nominal flight performance, and reconstruct actual flight and static test performance. Options selected by the user are dependent on the data available. These can include data derived from theoretical analysis, small scale motor test data, large motor test data and motor configuration data. The user has several options for output format that include print, cards, tape and plots. Output includes all major performance parameters (Isp, thrust, flowrate, mass accounting and operating pressures) as a function of time as well as calculated single point performance data. The engineering description of SRB-II discusses the engineering and programming fundamentals used, the function of each module, and the limitations of each module.
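The major performance parameters the SRB-II model outputs (Isp, thrust, flowrate) are linked by the standard relation F = Isp · g0 · ṁ; a minimal sketch with illustrative numbers only (not SRB data):

```python
G0 = 9.80665  # standard gravity, m/s^2

def thrust_n(isp_s, mdot_kg_per_s):
    """Thrust from specific impulse and propellant mass flow rate:
    F = Isp * g0 * mdot."""
    return isp_s * G0 * mdot_kg_per_s

# Illustrative values: Isp = 250 s, mass flow rate = 100 kg/s
print(thrust_n(250.0, 100.0))  # ~245166 N
```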
Air quality models are used to predict changes in pollutant concentrations resulting from envisioned emission control policies. Recognizing the need to assess the credibility of air quality models in a policy-relevant context, we perform a dynamic evaluation of the community Mult...
EVALUATION OF THE AGDISP AERIAL SPRAY ALGORITHMS IN THE AGDRIFT MODEL
A systematic evaluation of the AgDISP algorithms, which simulate off-site drift and deposition of aerially applied pesticides, contained in the AgDRIFT model was performed by comparing model simulations to field-trial data collected by the Spray Drift Task Force. Field-trial data...
ERIC Educational Resources Information Center
Connelly, Edward A.; And Others
A new approach to deriving human performance measures and criteria for use in automatically evaluating trainee performance is documented in this report. The ultimate application of the research is to provide methods for automatically measuring pilot performance in a flight simulator or from recorded in-flight data. An efficient method of…
NASA Astrophysics Data System (ADS)
Walker, Ernest L.
1994-05-01
This paper presents results of a theoretical investigation to evaluate the performance of code division multiple access communications over multimode optical fiber channels in an asynchronous, multiuser communication network environment. The system is evaluated using Gold sequences for spectral spreading of the baseband signal from each user employing direct-sequence biphase shift keying and intensity modulation techniques. The transmission channel model employed is a lossless linear system approximation of the field transfer function for the alpha -profile multimode optical fiber. Due to channel model complexity, a correlation receiver model employing a suboptimal receive filter was used in calculating the peak output signal at the ith receiver. In Part 1, the performance measures for the system, i.e., signal-to-noise ratio and bit error probability for the ith receiver, are derived as functions of channel characteristics, spectral spreading, number of active users, and the bit energy to noise (white) spectral density ratio. In Part 2, the overall system performance is evaluated.
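The paper derives multiuser SNR and bit error probability for its specific optical channel; as a generic single-user baseline only (an assumption, not the paper's multiuser result), BPSK over an AWGN channel has bit error probability Q(sqrt(2·Eb/N0)):

```python
import math

def q_function(x):
    """Gaussian tail probability: Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bpsk_ber(eb_n0_db):
    """Single-user BPSK bit error probability, P_b = Q(sqrt(2 * Eb/N0))."""
    eb_n0 = 10.0 ** (eb_n0_db / 10.0)
    return q_function(math.sqrt(2.0 * eb_n0))

print(bpsk_ber(0.0))  # ~0.0786 at Eb/N0 = 0 dB
```

Multiple-access interference from other active users' spreading sequences degrades performance below this single-user bound, which is why the paper's BER depends on the number of active users.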
Goodness of fit of probability distributions for sightings as species approach extinction.
Vogel, Richard M; Hosking, Jonathan R M; Elphick, Chris S; Roberts, David L; Reed, J Michael
2009-04-01
Estimating the probability that a species is extinct and the timing of extinctions is useful in biological fields ranging from paleoecology to conservation biology. Various statistical methods have been introduced to infer the time of extinction and extinction probability from a series of individual sightings. There is little evidence, however, as to which of these models provide adequate fit to actual sighting records. We use L-moment diagrams and probability plot correlation coefficient (PPCC) hypothesis tests to evaluate the goodness of fit of various probabilistic models to sighting data collected for a set of North American and Hawaiian bird populations that have either gone extinct, or are suspected of having gone extinct, during the past 150 years. For our data, the uniform, truncated exponential, and generalized Pareto models performed moderately well, but the Weibull model performed poorly. Of the acceptable models, the uniform distribution performed best based on PPCC goodness of fit comparisons and sequential Bonferroni-type tests. Further analyses using field significance tests suggest that although the uniform distribution is the best of those considered, additional work remains to evaluate the truncated exponential model more fully. The methods we present here provide a framework for evaluating subsequent models.
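The PPCC test used above correlates ordered observations with the quantiles a candidate distribution predicts for them; a minimal sketch for the uniform model, using (i + 0.5)/n plotting positions (one common convention, assumed here):

```python
import math

def ppcc_uniform(data):
    """Probability plot correlation coefficient against a uniform model
    on (0, 1), using (i + 0.5) / n plotting positions."""
    x = sorted(data)
    n = len(x)
    q = [(i + 0.5) / n for i in range(n)]   # uniform-model quantiles
    mx, mq = sum(x) / n, sum(q) / n
    num = sum((a - mx) * (b - mq) for a, b in zip(x, q))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - mq) ** 2 for b in q))
    return num / den

# Evenly spread "sighting times" fit the uniform model perfectly
print(ppcc_uniform([0.1, 0.3, 0.5, 0.7, 0.9]))  # ~1.0
```

Values near 1 indicate good fit; the hypothesis test rejects the candidate distribution when the coefficient falls below a critical value.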
External Evaluation of Two Fluconazole Infant Population Pharmacokinetic Models
Hwang, Michael F.; Beechinor, Ryan J.; Wade, Kelly C.; Benjamin, Daniel K.; Smith, P. Brian; Hornik, Christoph P.; Capparelli, Edmund V.; Duara, Shahnaz; Kennedy, Kathleen A.; Cohen-Wolkowiez, Michael
2017-01-01
Fluconazole is an antifungal agent used for the treatment of invasive candidiasis, a leading cause of morbidity and mortality in premature infants. Population pharmacokinetic (PK) models of fluconazole in infants have been previously published by Wade et al. (Antimicrob Agents Chemother 52:4043–4049, 2008, https://doi.org/10.1128/AAC.00569-08) and Momper et al. (Antimicrob Agents Chemother 60:5539–5545, 2016, https://doi.org/10.1128/AAC.00963-16). Here we report the results of the first external evaluation of the predictive performance of both models. We used patient-level data from both studies to externally evaluate both PK models. The predictive performance of each model was evaluated using the model prediction error (PE), mean prediction error (MPE), mean absolute prediction error (MAPE), prediction-corrected visual predictive check (pcVPC), and normalized prediction distribution errors (NPDE). The values of the parameters of each model were reestimated using both the external and merged data sets. When evaluated with the external data set, the model proposed by Wade et al. showed lower median PE, MPE, and MAPE (0.429 μg/ml, 41.9%, and 57.6%, respectively) than the model proposed by Momper et al. (2.45 μg/ml, 188%, and 195%, respectively). The values of the majority of reestimated parameters were within 20% of their respective original parameter values for all model evaluations. Our analysis determined that though both models are robust, the model proposed by Wade et al. had greater accuracy and precision than the model proposed by Momper et al., likely because it was derived from a patient population with a wider age range. This study highlights the importance of the external evaluation of infant population PK models. PMID:28893774
Evaluation of Rainfall-Runoff Models for Mediterranean Subcatchments
NASA Astrophysics Data System (ADS)
Cilek, A.; Berberoglu, S.; Donmez, C.
2016-06-01
The development and application of rainfall-runoff models have been a cornerstone of hydrological research for many decades. The amount of rainfall and its intensity and variability control the generation of runoff and the erosional processes operating at different scales. These interactions can be greatly variable in Mediterranean catchments with marked hydrological fluctuations. The aim of the study was to evaluate the performance of a rainfall-runoff model for rainfall-runoff simulation in a Mediterranean subcatchment. The Pan-European Soil Erosion Risk Assessment (PESERA), a simplified hydrological process-based approach, was used in this study to combine hydrological surface runoff factors. In total, 128 input layers derived from a data set that includes climate, topography, land use, crop type, planting date, and soil characteristics are required to run the model. Initial ground cover was estimated from the Landsat ETM data provided by ESA. This hydrological model was evaluated in terms of its performance in the Goksu River Watershed, Turkey, located in the Central Eastern Mediterranean Basin of Turkey. The area is approximately 2000 km2. The landscape is dominated by bare ground, agriculture, and forests. The average annual rainfall is 636.4 mm. This study is of significant importance for evaluating different model performances in a complex Mediterranean basin. The results provided comprehensive insight, including the advantages and limitations of modelling approaches in the Mediterranean environment.
Cheng, Jieyao; Hou, Jinlin; Ding, Huiguo; Chen, Guofeng; Xie, Qing; Wang, Yuming; Zeng, Minde; Ou, Xiaojuan; Ma, Hong; Jia, Jidong
2015-01-01
Background and Aims: Noninvasive models have been developed for fibrosis assessment in patients with chronic hepatitis B (CHB). However, the sensitivity, specificity, and diagnostic accuracy of these methods in evaluating liver fibrosis have not been validated and compared in the same group of patients. The aim of this study was to verify the diagnostic performance and reproducibility of ten reported noninvasive models in a large cohort of Asian CHB patients. Methods: The diagnostic performance of ten noninvasive models (HALF index, FibroScan, S index, Zeng model, Youyi model, Hui model, APAG, APRI, FIB-4, and FibroTest) was assessed against liver histology by ROC curve analysis in CHB patients. The reproducibility of the ten models was evaluated by recalculating the diagnostic values at the given cut-off values defined by the original studies. Results: Six models (HALF index, FibroScan, Zeng model, Youyi model, S index, and FibroTest) had AUROCs higher than 0.70 in predicting any fibrosis stage, and two of them had the best diagnostic performance, with AUROCs for predicting F≥2, F≥3, and F4 of 0.83, 0.89, and 0.89 for the HALF index and 0.82, 0.87, and 0.87 for FibroScan, respectively. Four models (HALF index, FibroScan, Zeng model, and Youyi model) showed good diagnostic values at the given cut-offs. Conclusions: HALF index, FibroScan, Zeng model, Youyi model, S index, and FibroTest show good diagnostic performance, and all of them, except S index and FibroTest, have good reproducibility for evaluating liver fibrosis in CHB patients. Registration Number: ChiCTR-DCS-07000039. PMID:26709706
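The AUROC values above summarize how well each index separates fibrosis stages; the statistic equals the Mann-Whitney probability that a positive case outscores a negative one. A minimal sketch with hypothetical index values (not the study's data):

```python
def auroc(scores_pos, scores_neg):
    """AUROC via the Mann-Whitney statistic: the probability that a randomly
    chosen positive case scores above a negative one; ties count one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical index values for fibrotic vs non-fibrotic patients
print(auroc([0.9, 0.8, 0.4], [0.3, 0.2, 0.5]))  # 8/9, i.e. ~0.889
```

An AUROC of 0.5 is chance-level discrimination; values above roughly 0.70, as the abstract uses, indicate a useful index.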
Editorial: Cognitive Architectures, Model Comparison and AGI
NASA Astrophysics Data System (ADS)
Lebiere, Christian; Gonzalez, Cleotilde; Warwick, Walter
2010-12-01
Cognitive Science and Artificial Intelligence share compatible goals of understanding and possibly generating broadly intelligent behavior. In order to determine if progress is made, it is essential to be able to evaluate the behavior of complex computational models, especially those built on general cognitive architectures, and compare it to benchmarks of intelligent behavior such as human performance. Significant methodological challenges arise, however, when trying to extend approaches used to compare model and human performance from tightly controlled laboratory tasks to complex tasks involving more open-ended behavior. This paper describes a model comparison challenge built around a dynamic control task, the Dynamic Stocks and Flows. We present and discuss distinct approaches to evaluating performance and comparing models. Lessons drawn from this challenge are discussed in light of the challenge of using cognitive architectures to achieve Artificial General Intelligence.
[Evaluation of national prevention campaigns against AIDS: analysis model].
Hausser, D; Lehmann, P; Dubois, F; Gutzwiller, F
1987-01-01
The evaluation of the "Stop-Aids" campaign is based upon a model of behaviour modification (McAlister) which includes the communication theory of McGuire and the social learning theory of Bandura. Using this model, it is possible to define key variables that are used to measure the impact of the campaign. Process evaluation allows identification of multipliers that reinforce and confirm the initial message of prevention (source) thereby encouraging behaviour modifications that are likely to reduce the transmission of HIV (condom use, no sharing of injection material, monogamous relationship, etc.). Twelve studies performed by seven teams in the three linguistic areas contribute to the project. A synthesis of these results will be performed by the IUMSP.
75 FR 42760 - Statement of Organization, Functions, and Delegations of Authority
Federal Register 2010, 2011, 2012, 2013, 2014
2010-07-22
... accounting reports and invoices, and monitoring all spending. The Team develops, defends and executes the... results; performance measurement; research and evaluation methodologies; demonstration testing and model... ACF programs; strategic planning; performance measurement; program and policy evaluation; research and...
visCOS: An R-package to evaluate model performance of hydrological models
NASA Astrophysics Data System (ADS)
Klotz, Daniel; Herrnegger, Mathew; Wesemann, Johannes; Schulz, Karsten
2016-04-01
The evaluation of model performance is a central part of (hydrological) modelling. Much attention has been given to the development of evaluation criteria and diagnostic frameworks (Klemeš, 1986; Gupta et al., 2008; among many others). Nevertheless, many applications exist for which objective functions do not yet provide satisfying summaries. Thus, the necessity to visualize results arises in order to explore a wider range of model capacities, be it strengths or deficiencies. Visualizations are usually devised for specific projects and these efforts are often not distributed to a broader community (e.g. via open source software packages). Hence, the opportunity to explicitly discuss a state-of-the-art presentation technique is often missed. We therefore present a comprehensive R-package for evaluating model performance by visualizing and exploring different aspects of hydrological time-series. The presented package comprises a set of useful plots and visualization methods, which complement existing packages, such as hydroGOF (Zambrano-Bigiarini et al., 2012). It is derived from practical applications of the hydrological models COSERO and COSEROreg (Kling et al., 2014). visCOS, providing an interface in R, represents an easy-to-use software package for visualizing and assessing model performance and can be implemented in the process of model calibration or model development. The package provides functions to load hydrological data into R, clean the data, process, visualize, explore and finally save the results in a consistent way. Together with an interactive zoom function of the time series, an online calculation of the objective functions for variable time-windows is included. Common hydrological objective functions, such as the Nash-Sutcliffe Efficiency and the Kling-Gupta Efficiency, can also be evaluated and visualized in different ways for defined sub-periods like hydrological years or seasonal sections.
Many hydrologists use long-term water-balances as a pivotal tool in model evaluation. They allow inferences about different systematic model-shortcomings and are an efficient way for communicating these in practice (Schulz et al., 2015). The evaluation and construction of such water balances is implemented with the presented package. During the (manual) calibration of a model or in the scope of model development, many model runs and iterations are necessary. Thus, users are often interested in comparing different model results in a visual way in order to learn about the model and to analyse parameter-changes on the output. A method to illuminate these differences and the evolution of changes is also included. References: • Gupta, H.V.; Wagener, T.; Liu, Y. (2008): Reconciling theory with observations: elements of a diagnostic approach to model evaluation, Hydrol. Process. 22, doi: 10.1002/hyp.6989. • Klemeš, V. (1986): Operational testing of hydrological simulation models, Hydrolog. Sci. J., doi: 10.1080/02626668609491024. • Kling, H.; Stanzel, P.; Fuchs, M.; and Nachtnebel, H. P. (2014): Performance of the COSERO precipitation-runoff model under non-stationary conditions in basins with different climates, Hydrolog. Sci. J., doi: 10.1080/02626667.2014.959956. • Schulz, K., Herrnegger, M., Wesemann, J., Klotz, D. Senoner, T. (2015): Kalibrierung COSERO - Mur für Pro Vis, Verbund Trading GmbH (Abteilung STG), final report, Institute of Water Management, Hydrology and Hydraulic Engineering, University of Natural Resources and Applied Life Sciences, Vienna, Austria, 217pp. • Zambrano-Bigiarini, M; Bellin, A. (2010): Comparing Goodness-of-fit Measures for Calibration of Models Focused on Extreme Events. European Geosciences Union (EGU), Geophysical Research Abstracts 14, EGU2012-11549-1.
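The Nash-Sutcliffe and Kling-Gupta efficiencies named in this record have compact definitions; a Python sketch of both (visCOS itself is an R package, so this is an illustration, not its code), using the 2009 KGE form with the mean-based bias ratio:

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is perfect, 0 matches the mean benchmark."""
    mo = sum(obs) / len(obs)
    return 1.0 - (sum((s - o) ** 2 for o, s in zip(obs, sim))
                  / sum((o - mo) ** 2 for o in obs))

def kge(obs, sim):
    """Kling-Gupta Efficiency from correlation (r), variability ratio
    (alpha = sigma_sim / sigma_obs), and bias ratio (beta = mu_sim / mu_obs)."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    so = math.sqrt(sum((o - mo) ** 2 for o in obs) / n)
    ss = math.sqrt(sum((s - ms) ** 2 for s in sim) / n)
    r = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / (n * so * ss)
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (ss / so - 1.0) ** 2
                           + (ms / mo - 1.0) ** 2)

obs = [1.0, 2.0, 3.0, 4.0]
print(round(nse(obs, obs), 6), round(kge(obs, obs), 6))  # 1.0 1.0
```

Evaluating these over sub-periods (hydrological years, seasons), as the package does, simply means applying the same functions to slices of the series.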
The MSFC UNIVAC 1108 EXEC 8 simulation model
NASA Technical Reports Server (NTRS)
Williams, T. G.; Richards, F. M.; Weatherbee, J. E.; Paul, L. K.
1972-01-01
A model is presented which simulates the MSFC Univac 1108 multiprocessor system. The hardware/operating system is described in sufficient detail to enable good statistical measurement of system behavior. The performance of the 1108 is evaluated by performing twenty-four different experiments designed to locate system bottlenecks and to test the sensitivity of system throughput to perturbations of the various Exec 8 scheduling algorithms. The model is implemented in the general purpose system simulation language, and the techniques described can be used to assist in the design, development, and evaluation of multiprocessor systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zobel, Zachary; Wang, Jiali; Wuebbles, Donald J.
This study uses the Weather Research and Forecasting (WRF) model to evaluate the performance of six dynamically downscaled decadal historical simulations with 12-km resolution for a large domain (7200 x 6180 km) that covers most of North America. The initial and boundary conditions are from three global climate models (GCMs) and one reanalysis dataset. The GCMs employed in this study are the Geophysical Fluid Dynamics Laboratory Earth System Model with Generalized Ocean Layer Dynamics component; the Community Climate System Model, version 4; and the Hadley Centre Global Environment Model, version 2-Earth System. The reanalysis data are from the National Centers for Environmental Prediction-U.S. Department of Energy Reanalysis II. We analyze the effects of bias-correcting the lateral boundary conditions and the effects of spectral nudging. We evaluate the model performance for seven surface variables and four upper atmospheric variables based on their climatology and extremes for seven subregions across the United States. The results indicate that a simulation's performance depends on both the location and the features/variables being tested. We find that the use of bias correction and/or nudging is beneficial in many situations, but employing these when running the RCM is not always an improvement when compared to the reference data. The use of an ensemble mean and median leads to better performance in measuring the climatology, while it is significantly biased for the extremes, showing much larger differences from the reference data than individual GCM-driven model simulations. This study provides a comprehensive evaluation of these historical model runs in order to inform decisions when making future projections.
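Bias correction of GCM output is often implemented as some form of quantile mapping; the sketch below shows the simplest empirical variant with equal-length climatologies (an assumed, generic approach, not necessarily this study's exact method, and all numbers are hypothetical):

```python
import bisect

def quantile_map(value, model_hist, obs_hist):
    """Empirical quantile mapping with equal-length climatologies: find the
    value's rank in the sorted model distribution and return the observed
    value at the same rank."""
    m, o = sorted(model_hist), sorted(obs_hist)
    rank = min(bisect.bisect_left(m, value), len(o) - 1)
    return o[rank]

# A model that runs 2 degrees warm: mapping recovers the observed scale
model_hist = [12.0, 14.0, 16.0, 18.0]
obs_hist = [10.0, 12.0, 14.0, 16.0]
print(quantile_map(16.0, model_hist, obs_hist))  # 14.0
```

Production schemes interpolate between quantiles and handle values outside the historical range; this sketch keeps only the core rank-matching idea.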
System analysis tools for an ELT at ESO
NASA Astrophysics Data System (ADS)
Mueller, Michael; Koch, Franz
2006-06-01
Engineering of complex, large-scale systems like the ELT designs currently investigated and developed in Europe and North America requires powerful and sophisticated tools within specific technical disciplines such as mechanics, optics, and control engineering. However, even analyzing a single component of the telescope, like the telescope structure, necessitates a system approach to evaluate the structural effects on the optical performance. This paper shows several software tools developed by the European Southern Observatory (ESO) which focus on the system approach in the analyses: Using modal results of a finite element analysis, the SMI-toolbox allows easy generation of structural models with different sizes and levels of accuracy for control design and closed-loop simulations. The optical modeling code BeamWarrior was developed by ESO and Astrium GmbH (Germany) especially for integrated modeling and interfacing with a structural model. Within BeamWarrior, displacements and deformations can be applied in an arbitrary coordinate system, and hence also in the global coordinates of the FE model, avoiding error-prone transformations. In addition to this, a sparse state-space model object was developed for Matlab to gain computational efficiency and reduce memory requirements due to the sparsity pattern of both the structural models and the control architecture. As one result, these tools allow building an integrated model in order to reliably simulate interactions, cross-coupling effects, and system responses, and to evaluate global performance. In order to evaluate disturbance effects on the optical performance in open loop more efficiently, an optical evaluation toolbox was built in the FE software ANSYS which performs Zernike decomposition and best-fit computation of the deformations directly in the FE analysis.
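The Zernike decomposition mentioned at the end is, at its core, a least-squares fit of orthogonal polynomials to a deformation sampled on the unit disk; a minimal sketch with the four lowest-order terms (an illustration of the decomposition step, not ESO's ANSYS toolbox):

```python
import numpy as np

def zernike_fit(rho, theta, w):
    """Least-squares fit of piston, x/y tilt, and defocus Zernike terms to a
    deformation w sampled at polar coordinates (rho, theta) on the unit disk."""
    basis = np.column_stack([
        np.ones_like(rho),        # piston
        rho * np.cos(theta),      # tilt (x)
        rho * np.sin(theta),      # tilt (y)
        2.0 * rho ** 2 - 1.0,     # defocus
    ])
    coeffs, *_ = np.linalg.lstsq(basis, w, rcond=None)
    return coeffs

# Recover known coefficients from a synthetic, noise-free deformation
rng = np.random.default_rng(0)
rho = rng.uniform(0.0, 1.0, 200)
theta = rng.uniform(0.0, 2.0 * np.pi, 200)
w = 0.5 + 0.2 * rho * np.cos(theta) + 0.1 * (2.0 * rho ** 2 - 1.0)
print(zernike_fit(rho, theta, w))  # coefficients ~ [0.5, 0.2, 0.0, 0.1]
```

Higher-order aberrations (astigmatism, coma, spherical) extend the basis matrix with further Zernike polynomials in the same way.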
Evaluating Curriculum-Based Measurement from a Behavioral Assessment Perspective
ERIC Educational Resources Information Center
Ardoin, Scott P.; Roof, Claire M.; Klubnick, Cynthia; Carfolite, Jessica
2008-01-01
Curriculum-based measurement in Reading (CBM-R) is an assessment procedure used to evaluate students' relative performance compared to peers and to evaluate their growth in reading. Within the response to intervention (RtI) model, CBM-R data are plotted in time-series fashion as a means of modeling individual students' response to varying levels of…
ERIC Educational Resources Information Center
Martins, Jorge Tiago; Martins, Rosa Maria
2012-01-01
This paper reports the implementation results of the Portuguese School Libraries Evaluation Model, more specifically the results of primary schools self-evaluation of their libraries' reading promotion and information literacy development activities. School libraries that rated their performance as either "Excellent" or "Poor"…
Wang, Chia-Nan; Nguyen, Nhu-Ty; Tran, Thanh-Tuyen
2015-01-01
The growth of the economy and population, together with higher demand for energy, has created many concerns for the Indian electricity industry, whose capacity is at 211 gigawatts, mostly in coal-fired plants. Due to insufficient fuel supply, India suffers from a shortage of electricity generation, leading to rolling blackouts; thus, performance evaluation and ranking of the industry become significant issues. In this study, we evaluate the rankings of these companies under the control of the Ministry of Power. This research also tests whether there are any significant differences between two DEA models: Malmquist nonradial and Malmquist radial. One advanced MPI model is then chosen to examine these companies' performance in recent years and the next few years, using forecasting results from Grey system theory. In total, realistic data from 14 companies, strictly selected from the whole industry, are considered in this evaluation. The results found that none of the companies showed many abrupt changes in their scores, and none is consistently good or consistently outstanding, which demonstrates the high applicability of the integrated methods. This integrated numerical research gives better "past-present-future" insight into performance evaluation in the Indian electricity industry.
Wang, Chia-Nan; Tran, Thanh-Tuyen
2015-01-01
The growth of the economy and population, together with higher demand for energy, has created many concerns for the Indian electricity industry, whose capacity is at 211 gigawatts, mostly in coal-fired plants. Due to insufficient fuel supply, India suffers from a shortage of electricity generation, leading to rolling blackouts; thus, performance evaluation and ranking of the industry become significant issues. In this study, we evaluate the rankings of these companies under the control of the Ministry of Power. This research also tests whether there are any significant differences between two DEA models: Malmquist nonradial and Malmquist radial. One advanced MPI model is then chosen to examine these companies' performance in recent years and the next few years, using forecasting results from Grey system theory. In total, realistic data from 14 companies, strictly selected from the whole industry, are considered in this evaluation. The results found that none of the companies showed many abrupt changes in their scores, and none is consistently good or consistently outstanding, which demonstrates the high applicability of the integrated methods. This integrated numerical research gives better "past-present-future" insight into performance evaluation in the Indian electricity industry. PMID:25821854
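The Grey system forecasting cited in these two records is commonly the GM(1,1) model; a textbook sketch (assumed to be the variant used, with a hypothetical toy series, not the study's company data):

```python
import numpy as np

def gm11_forecast(x0, steps):
    """GM(1,1) grey model: fit the accumulated series x1 with
    dx1/dt + a*x1 = b, then difference the fitted x1 back to
    forecast `steps` values of the original series."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                       # accumulated generating operation
    z1 = 0.5 * (x1[1:] + x1[:-1])            # background (mean) series
    B = np.column_stack([-z1, np.ones_like(z1)])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    n = len(x0)
    k = np.arange(n + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    return np.diff(x1_hat)[n - 1:]           # the `steps` forecast values

# Toy monotone performance series; forecast the next two periods
print(gm11_forecast([10.0, 10.5, 11.0, 11.6], 2))
```

GM(1,1) suits short, monotone series like annual efficiency scores; the forecast continues the fitted exponential trend.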
Simulation and performance of brushless dc motor actuators
NASA Astrophysics Data System (ADS)
Gerba, A., Jr.
1985-12-01
The simulation model for a brushless D.C. motor and the associated commutation power conditioner transistor model are presented. The necessary conditions for maximum power output while operating at steady-state speed with sinusoidally distributed air-gap flux are developed. Comparisons of the simulated model with the measured performance of a typical motor are made, both on time-response waveforms and on average performance characteristics. These preliminary results indicate good agreement. Plans for model improvement and for testing a motor-driven positioning device for model evaluation are outlined.
The CMAQ modeling system has been used to simulate the air quality for North America and Europe for the entire year of 2006 as part of the Air Quality Model Evaluation International Initiative (AQMEII) and the operational model performance of O3, fine particulate matte...
NASA Technical Reports Server (NTRS)
Jeracki, R. J.; Mitchell, G. A.
1981-01-01
The performance of lower-speed, 5-foot-diameter model general aviation propellers was tested in the Lewis wind tunnel. Performance was evaluated for various levels of airfoil technology and activity factor. The difference was associated with inadequate modeling of blade and spinner losses for propellers with round-shank blade designs. Suggested concepts for improvement are: (1) advanced blade shapes (airfoils and sweep); (2) tip devices (proplets); (3) integrated propeller/nacelles; and (4) composites. Several advanced aerodynamic concepts were evaluated in the Lewis wind tunnel. Results show that high propeller performance can be obtained to at least Mach 0.8.
NASA Astrophysics Data System (ADS)
Solazzo, Efisio; Bianconi, Roberto; Pirovano, Guido; Matthias, Volker; Vautard, Robert; Moran, Michael D.; Wyat Appel, K.; Bessagnet, Bertrand; Brandt, Jørgen; Christensen, Jesper H.; Chemel, Charles; Coll, Isabelle; Ferreira, Joana; Forkel, Renate; Francis, Xavier V.; Grell, Georg; Grossi, Paola; Hansen, Ayoe B.; Miranda, Ana Isabel; Nopmongcol, Uarporn; Prank, Marje; Sartelet, Karine N.; Schaap, Martijn; Silver, Jeremy D.; Sokhi, Ranjeet S.; Vira, Julius; Werhahn, Johannes; Wolke, Ralf; Yarwood, Greg; Zhang, Junhua; Rao, S. Trivikrama; Galmarini, Stefano
2012-06-01
Ten state-of-the-science regional air quality (AQ) modeling systems have been applied to continental-scale domains in North America and Europe for full-year simulations of 2006 in the context of Air Quality Model Evaluation International Initiative (AQMEII), whose main goals are model inter-comparison and evaluation. Standardised modeling outputs from each group have been shared on the web-distributed ENSEMBLE system, which allows statistical and ensemble analyses to be performed. In this study, the one-year model simulations are inter-compared and evaluated with a large set of observations for ground-level particulate matter (PM10 and PM2.5) and its chemical components. Modeled concentrations of gaseous PM precursors, SO2 and NO2, have also been evaluated against observational data for both continents. Furthermore, modeled deposition (dry and wet) and emissions of several species relevant to PM are also inter-compared. The unprecedented scale of the exercise (two continents, one full year, fifteen modeling groups) allows for a detailed description of AQ model skill and uncertainty with respect to PM. Analyses of PM10 yearly time series and mean diurnal cycle show a large underestimation throughout the year for the AQ models included in AQMEII. The possible causes of PM bias, including errors in the emissions and meteorological inputs (e.g., wind speed and precipitation), and the calculated deposition are investigated. Further analysis of the coarse PM components, PM2.5 and its major components (SO4, NH4, NO3, elemental carbon), have also been performed, and the model performance for each component evaluated against measurements. Finally, the ability of the models to capture high PM concentrations has been evaluated by examining two separate PM2.5 episodes in Europe and North America. A large variability among models in predicting emissions, deposition, and concentration of PM and its precursors during the episodes has been found. 
Major challenges still remain with regards to identifying and eliminating the sources of PM bias in the models. Although PM2.5 was found to be much better estimated by the models than PM10, no model was found to consistently match the observations for all locations throughout the entire year.
Park, Seong Ho; Han, Kyunghwa
2018-03-01
The use of artificial intelligence in medicine is currently an issue of great interest, especially with regard to the diagnostic or predictive analysis of medical images. Adoption of an artificial intelligence tool in clinical practice requires careful confirmation of its clinical utility. Herein, the authors explain key methodology points involved in a clinical evaluation of artificial intelligence technology for use in medicine, especially high-dimensional or overparameterized diagnostic or predictive models in which artificial deep neural networks are used, mainly from the standpoints of clinical epidemiology and biostatistics. First, statistical methods for assessing the discrimination and calibration performance of a diagnostic or predictive model are summarized. Next, the effects of disease manifestation spectrum and disease prevalence on the performance results are explained, followed by a discussion of the difference between evaluating performance with internal versus external datasets; the importance of using an adequate external dataset obtained from a well-defined clinical cohort to avoid overestimating clinical performance as a result of overfitting in high-dimensional or overparameterized classification models and spectrum bias; and the essentials for achieving a more robust clinical evaluation. Finally, the authors review the role of clinical trials and observational outcome studies for ultimate clinical verification of diagnostic or predictive artificial intelligence tools through patient outcomes, beyond performance metrics, and how to design such studies. © RSNA, 2018.
Phased models for evaluating the performability of computing systems
NASA Technical Reports Server (NTRS)
Wu, L. T.; Meyer, J. F.
1979-01-01
A phase-by-phase modelling technique is introduced to evaluate a fault tolerant system's ability to execute different sets of computational tasks during different phases of the control process. Intraphase processes are allowed to differ from phase to phase. The probabilities of interphase state transitions are specified by interphase transition matrices. Based on constraints imposed on the intraphase and interphase transition probabilities, various iterative solution methods are developed for calculating system performability.
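The phase-by-phase idea above can be illustrated by propagating a state-probability vector through alternating intraphase and interphase transition matrices. The matrices and state labels below are invented for illustration, not taken from the paper.

```python
import numpy as np

# Illustrative two-phase model with 3 system states: fully up, degraded, failed.
# Each intraphase matrix propagates state probabilities within a phase;
# the interphase matrix maps end-of-phase states to start-of-next-phase states.
intraphase = [
    np.array([[0.90, 0.08, 0.02],
              [0.00, 0.85, 0.15],
              [0.00, 0.00, 1.00]]),
    np.array([[0.95, 0.04, 0.01],
              [0.00, 0.90, 0.10],
              [0.00, 0.00, 1.00]]),
]
interphase = np.array([[1.0, 0.0, 0.0],
                       [0.1, 0.9, 0.0],   # some degraded units recovered between phases
                       [0.0, 0.0, 1.0]])

p = np.array([1.0, 0.0, 0.0])             # start fully operational
p = p @ intraphase[0] @ interphase @ intraphase[1]
print(p)                                   # end-of-mission state probabilities
```

Performability measures are then obtained by weighting the final state probabilities by the reward (level of accomplishment) associated with each state.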
Display/control requirements for automated VTOL aircraft
NASA Technical Reports Server (NTRS)
Hoffman, W. C.; Kleinman, D. L.; Young, L. R.
1976-01-01
A systematic design methodology for pilot displays in advanced commercial VTOL aircraft was developed and refined. The analyst is provided with a step-by-step procedure for conducting conceptual display/control configurations evaluations for simultaneous monitoring and control pilot tasks. The approach consists of three phases: formulation of information requirements, configuration evaluation, and system selection. Both the monitoring and control performance models are based upon the optimal control model of the human operator. Extensions to the conventional optimal control model required in the display design methodology include explicit optimization of control/monitoring attention; simultaneous monitoring and control performance predictions; and indifference threshold effects. The methodology was applied to NASA's experimental CH-47 helicopter in support of the VALT program. The CH-47 application examined the system performance of six flight conditions. Four candidate configurations are suggested for evaluation in pilot-in-the-loop simulations and eventual flight tests.
Online Deviation Detection for Medical Processes
Christov, Stefan C.; Avrunin, George S.; Clarke, Lori A.
2014-01-01
Human errors are a major concern in many medical processes. To help address this problem, we are investigating an approach for automatically detecting when performers of a medical process deviate from the acceptable ways of performing that process as specified by a detailed process model. Such deviations could represent errors and, thus, detecting and reporting deviations as they occur could help catch errors before harm is done. In this paper, we identify important issues related to the feasibility of the proposed approach and empirically evaluate the approach for two medical procedures, chemotherapy and blood transfusion. For the evaluation, we use the process models to generate sample process executions that we then seed with synthetic errors. The process models describe the coordination of activities of different process performers in normal, as well as in exceptional situations. The evaluation results suggest that the proposed approach could be applied in clinical settings to help catch errors before harm is done. PMID:25954343
Teratologic Evaluation of a Model Perfluorinated Acid, NDFDA
1981-01-01
AFAMRL-TR-81-14. Teratologic Evaluation of a Model Perfluorinated Acid, NDFDA. Inez R. Bacon, University of the District of Columbia. Cited references include Olson, C. T. and K. C. Back (1978), and work on perfluorocarboxylic and perfluorosulfonic acids in I & G Product Research and Development, Vol. 1, No. 3, 165-169.
Šiljić Tomić, Aleksandra N; Antanasijević, Davor Z; Ristić, Mirjana Đ; Perić-Grujić, Aleksandra A; Pocajt, Viktor V
2016-05-01
This paper describes the application of artificial neural network models for the prediction of biological oxygen demand (BOD) levels in the Danube River. Eighteen regularly monitored water quality parameters at 17 stations on the river stretch passing through Serbia were used as input variables. The optimization of the model was performed in three consecutive steps: firstly, the spatial influence of a monitoring station was examined; secondly, the monitoring period necessary to reach satisfactory performance was determined; and lastly, correlation analysis was applied to evaluate the relationship among water quality parameters. Root-mean-square error (RMSE) was used to evaluate model performance in the first two steps, whereas in the last step, multiple statistical indicators of performance were utilized. As a result, two optimized models were developed, a general regression neural network model (labeled GRNN-1) that covers the monitoring stations from the Danube inflow to the city of Novi Sad and a GRNN model (labeled GRNN-2) that covers the stations from the city of Novi Sad to the border with Romania. Both models demonstrated good agreement between the predicted and actually observed BOD values.
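The RMSE-based screening followed by multiple indicators, as described above, can be sketched in a few lines; the BOD values below are hypothetical and the indicator set is an assumption.

```python
import numpy as np

def evaluation_stats(obs, pred):
    """RMSE plus complementary indicators for comparing predicted
    and observed BOD values (illustrative sketch)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    err = pred - obs
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),   # penalizes large errors
        "MAE": float(np.mean(np.abs(err))),          # average absolute error
        "r": float(np.corrcoef(obs, pred)[0, 1]),    # linear agreement
    }

# Hypothetical observed vs. GRNN-predicted BOD (mg/L)
print(evaluation_stats([3.1, 4.0, 5.2, 2.8], [3.0, 4.4, 5.0, 3.1]))
```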
Travtek Evaluation Modeling Study
DOT National Transportation Integrated Search
1996-03-01
The following report describes a modeling study that was performed to extrapolate, from the TravTek operational test data, a set of system-wide benefits and performance values for a wider-scale deployment of a TravTek-like system. In the first part o...
NASA Astrophysics Data System (ADS)
Goodman, A.; Lee, H.; Waliser, D. E.; Guttowski, W.
2017-12-01
Observation-based evaluations of global climate models (GCMs) have been a key element for identifying systematic model biases that can be targeted for model improvements and for establishing the uncertainty associated with projections of global climate change. However, GCMs are limited in their ability to represent physical phenomena that occur on smaller, regional scales, including many types of extreme weather events. To help facilitate projections of changes in such phenomena, simulations from regional climate models (RCMs) for 14 different domains around the world are being provided by the Coordinated Regional Climate Downscaling Experiment (CORDEX; www.cordex.org). Although CORDEX specifies standard simulation and archiving protocols, these simulations are conducted independently by individual research and modeling groups representing each of these domains, often with different output requirements and data archiving and exchange capabilities. Thus, compared with similar efforts using GCMs (e.g., the Coupled Model Intercomparison Project, CMIP), it is more difficult to achieve a standardized, systematic evaluation of the RCMs for each domain and across all the CORDEX domains. Using the Regional Climate Model Evaluation System (RCMES; rcmes.jpl.nasa.gov) developed at JPL, we are developing easy-to-use templates for performing systematic evaluations of CORDEX simulations. Results from the application of a number of evaluation metrics (e.g., biases, centered RMS, and pattern correlations) will be shown for a variety of physical quantities and CORDEX domains. These evaluations are performed using products from obs4MIPs, an activity initiated by DOE and NASA and now shepherded by the World Climate Research Programme's Data Advisory Council.
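The evaluation metrics named above (bias, centered RMS, pattern correlation) have standard definitions that fit in a few lines. This is a generic sketch, not RCMES code, and the temperature values are hypothetical.

```python
import numpy as np

def centered_rms(model, ref):
    """Centered (bias-removed) RMS difference, as used in Taylor-style evaluation."""
    m, r = np.asarray(model, float), np.asarray(ref, float)
    return float(np.sqrt(np.mean(((m - m.mean()) - (r - r.mean())) ** 2)))

def pattern_correlation(model, ref):
    """Pattern correlation between a simulated and an observed field."""
    m, r = np.ravel(model).astype(float), np.ravel(ref).astype(float)
    return float(np.corrcoef(m, r)[0, 1])

obs = np.array([290.0, 292.5, 288.1, 291.3])   # hypothetical observed temperatures (K)
sim = np.array([291.2, 293.0, 289.0, 292.1])   # hypothetical RCM output
print(sim.mean() - obs.mean(),                  # mean bias
      centered_rms(sim, obs),
      pattern_correlation(sim, obs))
```

Bias and centered RMS decompose the total RMS error, which is why the two are usually reported together.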
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tessum, C. W.; Hill, J. D.; Marshall, J. D.
2015-04-07
We present results from and evaluate the performance of a 12-month, 12 km horizontal resolution year 2005 air pollution simulation for the contiguous United States using the WRF-Chem (Weather Research and Forecasting with Chemistry) meteorology and chemical transport model (CTM). We employ the 2005 US National Emissions Inventory, the Regional Atmospheric Chemistry Mechanism (RACM), and the Modal Aerosol Dynamics Model for Europe (MADE) with a volatility basis set (VBS) secondary aerosol module. Overall, model performance is comparable to contemporary modeling efforts used for regulatory and health-effects analysis, with an annual average daytime ozone (O3) mean fractional bias (MFB) of 12% and an annual average fine particulate matter (PM2.5) MFB of −1%. WRF-Chem, as configured here, tends to overpredict total PM2.5 at some high concentration locations and generally overpredicts average 24 h O3 concentrations. Performance is better at predicting daytime-average and daily peak O3 concentrations, which are more relevant for regulatory and health effects analyses relative to annual average values. Predictive performance for PM2.5 subspecies is mixed: the model overpredicts particulate sulfate (MFB = 36%), underpredicts particulate nitrate (MFB = −110%) and organic carbon (MFB = −29%), and relatively accurately predicts particulate ammonium (MFB = 3%) and elemental carbon (MFB = 3%), so that the accuracy in total PM2.5 predictions is to some extent a function of offsetting over- and underpredictions of PM2.5 subspecies. Model predictive performance for PM2.5 and its subspecies is in general worse in winter and in the western US than in other seasons and regions, suggesting spatial and temporal opportunities for future WRF-Chem model development and evaluation.
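The mean fractional bias reported throughout this abstract is conventionally defined as MFB = (2/N) Σ (M − O) / (M + O). A minimal sketch, with made-up PM2.5 pairs rather than the study's data:

```python
import numpy as np

def mean_fractional_bias(model, obs):
    """MFB = (2/N) * sum((M - O) / (M + O)), expressed in percent.

    Bounded in [-200%, 200%]; 0% indicates no overall bias."""
    m, o = np.asarray(model, float), np.asarray(obs, float)
    return float(100.0 * 2.0 * np.mean((m - o) / (m + o)))

# Hypothetical PM2.5 pairs (ug/m3): model vs. observation
print(mean_fractional_bias([12.0, 8.0, 15.0], [10.0, 9.0, 14.0]))
```

MFB is preferred over simple percent bias in air quality evaluation because its symmetric normalization weights over- and underpredictions evenly.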
USDA-ARS?s Scientific Manuscript database
Coupled Model Intercomparison Project 3 simulations of surface temperature were evaluated over the period 1902-1999 to assess their ability to reproduce historical temperature variability at 211 global locations. Model performance was evaluated using the running Mann Whitney-Z method, a technique th...
In this study, the concept of scale analysis is applied to evaluate two state-of-science meteorological models, namely MM5 and RAMS3b, currently being used to drive regional-scale air quality models. To this end, seasonal time series of observations and predictions for temperatur...
An investigation of the concentrations of nitrogen oxides (NOx) from an air quality model and observations at monitoring sites was performed to assess the changes in NOx levels attributable to changes in mobile emissions. This evaluation effort focused on weekday morning rush hou...
Simulating forage crop production in a northern climate with the Integrated Farm System Model
USDA-ARS?s Scientific Manuscript database
Whole-farm simulation models are useful tools for evaluating the effect of management practices and climate variability on the agro-environmental and economic performance of farms. A few process-based farm-scale models have been developed, but none have been evaluated in a northern region with a sho...
Integrated modeling tool for performance engineering of complex computer systems
NASA Technical Reports Server (NTRS)
Wright, Gary; Ball, Duane; Hoyt, Susan; Steele, Oscar
1989-01-01
This report summarizes Advanced System Technologies' accomplishments on the Phase 2 SBIR contract NAS7-995. The technical objectives of the report are: (1) to develop an evaluation version of a graphical, integrated modeling language according to the specification resulting from the Phase 2 research; and (2) to determine the degree to which the language meets its objectives by evaluating ease of use, utility of two sets of performance predictions, and the power of the language constructs. The technical approach followed to meet these objectives was to design, develop, and test an evaluation prototype of a graphical, performance prediction tool. The utility of the prototype was then evaluated by applying it to a variety of test cases found in the literature and in AST case histories. Numerous models were constructed and successfully tested. The major conclusion of this Phase 2 SBIR research and development effort is that complex, real-time computer systems can be specified in a non-procedural manner using combinations of icons, windows, menus, and dialogs. Such a specification technique provides an interface that system designers and architects find natural and easy to use. In addition, PEDESTAL's multiview approach provides system engineers with the capability to perform the trade-offs necessary to produce a design that meets timing performance requirements. Sample system designs analyzed during the development effort showed that models could be constructed in a fraction of the time required by non-visual system design capture tools.
The Application of FIA-based Data to Wildlife Habitat Modeling: A Comparative Study
Edwards, Thomas C., Jr.; Moisen, Gretchen G.; Frescino, Tracey S.; Schultz, Randall J.
2005-01-01
We evaluated the capability of two types of models, one based on spatially explicit variables derived from FIA data and one using so-called traditional habitat evaluation methods, for predicting the presence of cavity-nesting bird habitat in Fishlake National Forest, Utah. Both models performed equally well in measures of predictive accuracy, with the FIA-based model...
NASA Astrophysics Data System (ADS)
Pappenberger, F.; Beven, K. J.; Frodsham, K.; Matgen, P.
2005-12-01
Flood inundation models play an increasingly important role in assessing flood risk. The growth of 2D inundation models that are intimately related to raster maps of floodplains is occurring at the same time as an increase in the availability of 2D remote-sensing data (e.g. SAR images and aerial photographs), against which model performance can be evaluated. This requires new techniques to be explored in order to evaluate model performance in two-dimensional space. In this paper we present a fuzzified pattern-matching algorithm which compares favorably to a set of traditional measures. However, we further argue that model calibration has to go beyond the comparison of physical properties and should demonstrate how a weighting towards consequences, such as loss of property, can enhance model focus and prediction. Indeed, it will be necessary to abandon a fully spatial comparison in many scenarios to concentrate the model calibration exercise on specific points such as hospitals, police stations or emergency response centers. It can be shown that such point evaluations lead to significantly different flood hazard maps due to the averaging effect of a spatial performance measure. A strategy to balance the different needs (accuracy at certain spatial points and acceptable spatial performance) has to be based in a public and political decision-making process.
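The core idea of fuzzified pattern matching is to relax strict cell-by-cell overlap with a spatial tolerance. The sketch below is a deliberate simplification of that idea, not the paper's algorithm; the maps are toy binary wet/dry grids.

```python
import numpy as np

def fuzzy_agreement(map_a, map_b, halo=1):
    """Fuzzy cell-by-cell agreement of two binary inundation maps.

    A wet cell of map_a counts as matched if map_b is wet anywhere within
    `halo` cells, relaxing the strict overlap measure."""
    a, b = np.asarray(map_a, int), np.asarray(map_b, int)
    pad_b = np.pad(b, halo)
    rows, cols = a.shape
    hits = total = 0
    for i in range(rows):
        for j in range(cols):
            if a[i, j]:
                total += 1
                # any wet cell of b in the (2*halo+1)^2 neighborhood?
                if pad_b[i:i + 2 * halo + 1, j:j + 2 * halo + 1].any():
                    hits += 1
    return hits / total if total else 1.0

sim = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]])   # simulated flood extent
obs = np.array([[1, 0, 0], [0, 0, 1], [0, 0, 0]])   # observed flood extent
print(fuzzy_agreement(sim, obs), fuzzy_agreement(sim, obs, halo=0))
```

With `halo=0` the measure reduces to strict overlap, which shows why small georeferencing offsets penalize traditional scores so heavily.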
Modelling the photochemical pollution over the metropolitan area of Porto Alegre, Brazil
NASA Astrophysics Data System (ADS)
Borrego, C.; Monteiro, A.; Ferreira, J.; Moraes, M. R.; Carvalho, A.; Ribeiro, I.; Miranda, A. I.; Moreira, D. M.
2010-01-01
The main purpose of this study is to evaluate the photochemical pollution over the Metropolitan Area of Porto Alegre (MAPA), Brazil, where high concentrations of ozone have been registered in recent years. Due to the restricted spatial coverage of the air quality monitoring network, a numerical modelling technique was selected and applied for this assessment exercise. Two different chemistry-transport models, CAMx and CALGRID, were applied for a summer period, driven by the MM5 meteorological model. The meteorological model's performance was evaluated by comparing its results to available monitoring data measured at the Porto Alegre airport; the validation results indicate good model performance. It was not possible to evaluate the chemistry models' performance due to the lack of adequate monitoring data. Nevertheless, the intercomparison between CAMx and CALGRID shows similar behaviour in the simulation of nitrogen dioxide, but some discrepancies for ozone. Regarding the fulfilment of the Brazilian air quality targets, the simulated ozone concentrations surpass the legislated value in specific periods, mainly outside the urban area of Porto Alegre. Ozone formation is influenced by the emission of precursor pollutants (such as the nitrogen oxides emitted in the Porto Alegre urban area and from a large refinery complex) and by the meteorological conditions.
Estimation and prediction under local volatility jump-diffusion model
NASA Astrophysics Data System (ADS)
Kim, Namhyoung; Lee, Younhee
2018-02-01
Volatility is an important factor in operating a company and managing risk. In the portfolio optimization and risk hedging using the option, the value of the option is evaluated using the volatility model. Various attempts have been made to predict option value. Recent studies have shown that stochastic volatility models and jump-diffusion models reflect stock price movements accurately. However, these models have practical limitations. Combining them with the local volatility model, which is widely used among practitioners, may lead to better performance. In this study, we propose a more effective and efficient method of estimating option prices by combining the local volatility model with the jump-diffusion model and apply it using both artificial and actual market data to evaluate its performance. The calibration process for estimating the jump parameters and local volatility surfaces is divided into three stages. We apply the local volatility model, stochastic volatility model, and local volatility jump-diffusion model estimated by the proposed method to KOSPI 200 index option pricing. The proposed method displays good estimation and prediction performance.
Solar power plant performance evaluation: simulation and experimental validation
NASA Astrophysics Data System (ADS)
Natsheh, E. M.; Albarbar, A.
2012-05-01
In this work the performance of a solar power plant is evaluated based on a developed model comprising a photovoltaic array, battery storage, a controller and converters. The model is implemented using the MATLAB/SIMULINK software package. A perturb and observe (P&O) algorithm is used to maximize the generated power through a maximum power point tracker (MPPT) implementation. The outcomes of the developed model are validated and supported by a case study carried out on an operational 28.8 kW grid-connected solar power plant located in central Manchester. Measurements were taken over a 21-month period, using hourly average irradiance and cell temperature. It was found that system degradation could be clearly monitored by determining the residual (the difference) between the output power predicted by the model and the actual measured power. The residual exceeded the healthy threshold of 1.7 kW due to heavy snow in Manchester during the previous winter. More importantly, the developed performance evaluation technique could be adopted to detect other factors that may degrade the performance of the PV panels, such as shading and dirt. Repeatability and reliability of the developed system were validated during this period. Good agreement was achieved between the theoretical simulation and the real-time measurements taken from the online grid-connected solar power plant.
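The perturb and observe logic mentioned above is simple enough to sketch directly: nudge the operating voltage, keep the direction while power rises, reverse when it falls. The PV power curve below is a hypothetical stand-in for a panel model, not the plant in the study.

```python
def perturb_and_observe(measure, v0=30.0, dv=0.5, steps=50):
    """P&O MPPT sketch: perturb the operating voltage and keep the
    perturbation direction while measured power keeps increasing.

    `measure(v)` returns PV power at voltage v (panel model is assumed)."""
    v, direction = v0, 1.0
    p_prev = measure(v)
    for _ in range(steps):
        v += direction * dv
        p = measure(v)
        if p < p_prev:          # power fell: reverse the perturbation
            direction = -direction
        p_prev = p
    return v

# Hypothetical PV power curve with a maximum power point near 35 V
pv_power = lambda v: max(0.0, -0.8 * (v - 35.0) ** 2 + 980.0)
print(perturb_and_observe(pv_power))
```

In steady state the operating point oscillates around the maximum power point by about one perturbation step, which is the well-known trade-off of P&O between tracking speed and ripple.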
Weather model performance in simulating extreme rainfall events over the Western Iberian Peninsula
NASA Astrophysics Data System (ADS)
Pereira, S. C.; Carvalho, A. C.; Ferreira, J.; Nunes, J. P.; Kaiser, J. J.; Rocha, A.
2012-08-01
This study evaluates the performance of the WRF-ARW numerical weather model in simulating the spatial and temporal patterns of an extreme rainfall period over a complex orographic region in north-central Portugal. The analysis was performed for December 2009, during the Portuguese mainland rainy season. The periods of heavy to extremely heavy rainfall were due to several low surface-pressure systems associated with frontal surfaces. The total amount of precipitation for December exceeded the 1971-2000 climatological mean by 89 mm on average, with totals varying from 190 mm (southern part of the country) to 1175 mm (northern part of the country). Three model runs were conducted to assess possible improvements in model performance: (1) the WRF-ARW forced with initial fields from a global model (RunRef); (2) a run including data assimilation for a specific location (RunObsN); and (3) a run using nudging to adjust the analysis field (RunGridN). Model performance was evaluated against an observed hourly precipitation dataset of 15 rainfall stations using several statistical parameters. The WRF-ARW model reproduced the temporal rainfall patterns well but tended to overestimate precipitation amounts. The RunGridN simulation provided the best results, but the performance of the other two runs was also good, so the selected extreme rainfall episode was successfully reproduced.
NASA Technical Reports Server (NTRS)
Koch, S. E.; Skillman, W. C.; Kocin, P. J.; Wetzel, P. J.; Brill, K.; Keyser, D. A.; Mccumber, M. C.
1983-01-01
The overall performance characteristics of a limited-area, hydrostatic, fine-mesh (52 km), primitive equation numerical weather prediction model are determined in anticipation of satellite data assimilation with the model. The synoptic and mesoscale predictive capabilities of version 2.0 of this model, the Mesoscale Atmospheric Simulation System (MASS 2.0), were evaluated. The two-part study is based on a sample of approximately thirty 12 h and 24 h forecasts of atmospheric flow patterns during spring and early summer. The synoptic-scale evaluation results benchmark the performance of MASS 2.0 against that of an operational, synoptic-scale weather prediction model, the Limited-area Fine Mesh (LFM). The large sample allows for the calculation of statistically significant measures of forecast accuracy and the determination of systematic model errors. The synoptic-scale benchmark is required before unsmoothed mesoscale forecast fields can be seriously considered.
Kaneko, Hiromasa; Funatsu, Kimito
2013-09-23
We propose predictive performance criteria for nonlinear regression models without cross-validation. The proposed criteria are the determination coefficient and the root-mean-square error for the midpoints between k-nearest-neighbor data points. These criteria can be used to evaluate predictive ability after the regression models are updated, whereas cross-validation cannot be performed in such a situation. The proposed method is effective and helpful in handling big data when cross-validation cannot be applied. By analyzing data from numerical simulations and quantitative structural relationships, we confirm that the proposed criteria enable the predictive ability of the nonlinear regression models to be appropriately quantified.
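The midpoint criteria described above can be sketched as follows: form midpoints between each point and its k nearest neighbors, treat the interpolated response as a reference, and score the fitted model there. This is a simplified illustration under assumed details (Euclidean distance, a user-supplied `predict` function), not the authors' exact procedure.

```python
import numpy as np

def midpoint_criteria(X, y, predict, k=3):
    """r^2 and RMSE evaluated at midpoints of k-nearest-neighbor pairs,
    a cross-validation-free check of a fitted model's predictive ability."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    mids_X, mids_y = [], []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        for j in np.argsort(d)[1:k + 1]:            # k nearest neighbors of i
            mids_X.append((X[i] + X[j]) / 2.0)      # midpoint in input space
            mids_y.append((y[i] + y[j]) / 2.0)      # interpolated reference value
    mids_X, mids_y = np.array(mids_X), np.array(mids_y)
    pred = predict(mids_X)
    rmse = float(np.sqrt(np.mean((pred - mids_y) ** 2)))
    r2 = float(1.0 - np.sum((pred - mids_y) ** 2)
               / np.sum((mids_y - mids_y.mean()) ** 2))
    return r2, rmse

# Toy check: a model that is exact on a linear relationship scores perfectly
X = np.linspace(0, 1, 20).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0
r2, rmse = midpoint_criteria(X, y, lambda Z: 2.0 * Z.ravel() + 1.0)
print(r2, rmse)
```

Because the reference values are interpolated rather than held out, the criteria can be recomputed every time the model is updated, which is the point of avoiding cross-validation.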
An Overview and Evaluation of Recent Machine Learning Imputation Methods Using Cardiac Imaging Data.
Liu, Yuzhe; Gopalakrishnan, Vanathi
2017-03-01
Many clinical research datasets have a large percentage of missing values that directly impacts their usefulness in yielding high accuracy classifiers when used for training in supervised machine learning. While missing value imputation methods have been shown to work well with smaller percentages of missing values, their ability to impute sparse clinical research data can be problem specific. We previously attempted to learn quantitative guidelines for ordering cardiac magnetic resonance imaging during the evaluation for pediatric cardiomyopathy, but missing data significantly reduced our usable sample size. In this work, we sought to determine if increasing the usable sample size through imputation would allow us to learn better guidelines. We first review several machine learning methods for estimating missing data. Then, we apply four popular methods (mean imputation, decision tree, k-nearest neighbors, and self-organizing maps) to a clinical research dataset of pediatric patients undergoing evaluation for cardiomyopathy. Using Bayesian Rule Learning (BRL) to learn ruleset models, we compared the performance of imputation-augmented models versus unaugmented models. We found that all four imputation-augmented models performed similarly to unaugmented models. While imputation did not improve performance, it did provide evidence for the robustness of our learned models.
Research on the performance evaluation of agricultural products supply chain integrated operation
NASA Astrophysics Data System (ADS)
Jiang, Jiake; Wang, Xifu; Liu, Yang
2017-04-01
The agricultural product supply chain integrated operation can ensure the quality and efficiency of agricultural products, and achieve the optimal goal of low cost and high service. This paper establishes a performance evaluation index system of agricultural products supply chain integration operation based on the development status of agricultural products and SCOR, BSC and KPI model. And then, we constructing rough set theory and BP neural network comprehensive evaluation model with the aid of Rosetta and MATLAB tools and the case study is about the development of agricultural products integrated supply chain in Jing-Jin-Ji region. And finally, we obtain the corresponding performance results, and give some improvement measures and management recommendations to the managers.
Self Evaluation of Organizations.
ERIC Educational Resources Information Center
Pooley, Richard C.
Evaluation within human service organizations is defined in terms of accepted evaluation criteria, with reasonable expectations shown and structured into a model of systematic evaluation practice. The evaluation criteria of program effort, performance, adequacy, efficiency and process mechanisms are discussed, along with measurement information…
Performance comparison of LUR and OK in PM2.5 concentration mapping: a multidimensional perspective
Zou, Bin; Luo, Yanqing; Wan, Neng; Zheng, Zhong; Sternberg, Troy; Liao, Yilan
2015-01-01
Methods of Land Use Regression (LUR) modeling and Ordinary Kriging (OK) interpolation have been widely used to offset the shortcomings of PM2.5 data observed at sparse monitoring sites. However, the traditional point-based strategy for evaluating the performance of these methods has remained unchanged, and can produce misleading mapping results. To address this challenge, this study employs ‘information entropy’, an area-based statistic, along with traditional point-based statistics (e.g. error rate, RMSE) to evaluate the performance of the LUR model and OK interpolation in mapping PM2.5 concentrations in Houston from a multidimensional perspective. The point-based validation reveals significant differences between LUR and OK at different test sites despite similar end-result accuracy (e.g. error rate 6.13% vs. 7.01%). Meanwhile, the area-based validation demonstrates that the PM2.5 concentrations simulated by the LUR model exhibit more detailed variations than those interpolated by the OK method (i.e. information entropy, 7.79 vs. 3.63). The results suggest that LUR modeling can better resolve the spatial distribution of PM2.5 concentrations than OK interpolation. The significance of this study lies primarily in promoting the integration of point- and area-based statistics for model performance evaluation in air pollution mapping. PMID:25731103
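The area-based 'information entropy' statistic used above can be sketched as the Shannon entropy of the mapped value distribution over a shared set of concentration bins. The bin edges and the two synthetic surfaces (one smooth, kriging-like; one variable, LUR-like) are illustrative assumptions.

```python
import numpy as np

def surface_entropy(values, edges):
    """Shannon entropy (bits) of a mapped surface's value distribution."""
    hist, _ = np.histogram(values, bins=edges)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(2)
edges = np.linspace(0, 20, 17)                     # shared concentration bins
smooth = 10.0 + 0.1 * rng.standard_normal(1000)    # OK-like: little spatial detail
detailed = 10.0 + 3.0 * rng.standard_normal(1000)  # LUR-like: more variation
print(surface_entropy(smooth, edges), surface_entropy(detailed, edges))
```

A surface that resolves more spatial variation spreads its values over more bins and so scores a higher entropy, mirroring the 7.79 vs. 3.63 contrast reported in the abstract.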
Rapid performance modeling and parameter regression of geodynamic models
NASA Astrophysics Data System (ADS)
Brown, J.; Duplyakin, D.
2016-12-01
Geodynamic models run in a parallel environment have many parameters with complicated effects on performance and scientifically-relevant functionals. Manually choosing an efficient machine configuration and mapping out the parameter space requires a great deal of expert knowledge and time-consuming experiments. We propose an active learning technique based on Gaussian Process Regression to automatically select experiments to map out the performance landscape with respect to scientific and machine parameters. The resulting performance model is then used to select optimal experiments for improving the accuracy of a reduced order model per unit of computational cost. We present the framework and evaluate its quality and capability using popular lithospheric dynamics models.
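The core active-learning loop can be sketched as below: fit a Gaussian process to the experiments run so far, then run the candidate configuration with the largest predictive uncertainty. The one-dimensional `runtime` functional and all parameter values are hypothetical; the paper's real setting has many machine and scientific parameters.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def runtime(x):
    # Hypothetical performance functional of a single machine/model parameter
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(3)
X = rng.uniform(0, 2, size=(4, 1))            # initial experiments
y = runtime(X).ravel()
candidates = np.linspace(0, 2, 200).reshape(-1, 1)

for _ in range(10):
    gp = GaussianProcessRegressor(kernel=RBF(0.5), alpha=1e-6,
                                  normalize_y=True).fit(X, y)
    _, std = gp.predict(candidates, return_std=True)
    x_next = candidates[[np.argmax(std)]]     # run the most uncertain configuration
    X = np.vstack([X, x_next])
    y = np.append(y, runtime(x_next).ravel())

# Final surrogate model of the performance landscape
gp = GaussianProcessRegressor(kernel=RBF(0.5), alpha=1e-6,
                              normalize_y=True).fit(X, y)
```

Selecting by maximum posterior standard deviation concentrates expensive runs where the surrogate is least informed, which is the basic mechanism behind mapping a performance landscape cheaply.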
Closed-loop, pilot/vehicle analysis of the approach and landing task
NASA Technical Reports Server (NTRS)
Schmidt, D. K.; Anderson, M. R.
1985-01-01
Optimal-control-theoretic modeling and frequency-domain analysis is the methodology proposed to evaluate analytically the handling qualities of higher-order manually controlled dynamic systems. Fundamental to the methodology is evaluating the interplay between pilot workload and closed-loop pilot/vehicle performance and stability robustness. The model-based metric for pilot workload is the required pilot phase compensation. Pilot/vehicle performance and loop stability is then evaluated using frequency-domain techniques. When these techniques were applied to the flight-test data for thirty-two highly-augmented fighter configurations, strong correlation was obtained between the analytical and experimental results.
NASA Technical Reports Server (NTRS)
Campbell, Stefan F.; Kaneshige, John T.; Nguyen, Nhan T.; Krishnakumar, Kalmanje S.
2010-01-01
Presented here is the evaluation of multiple adaptive control technologies for a generic transport aircraft simulation. For this study, seven model reference adaptive control (MRAC) based technologies were considered. Each technology was integrated into an identical dynamic-inversion control architecture and tuned using a methodology based on metrics and specific design requirements. Simulation tests were then performed to evaluate each technology's sensitivity to time-delay, flight condition, model uncertainty, and artificially induced cross-coupling. The resulting robustness and performance characteristics were used to identify potential strengths, weaknesses, and integration challenges of the individual adaptive control technologies.
Risk assessment model for development of advanced age-related macular degeneration.
Klein, Michael L; Francis, Peter J; Ferris, Frederick L; Hamon, Sara C; Clemons, Traci E
2011-12-01
To design a risk assessment model for development of advanced age-related macular degeneration (AMD) incorporating phenotypic, demographic, environmental, and genetic risk factors. We evaluated longitudinal data from 2846 participants in the Age-Related Eye Disease Study. At baseline, these individuals had all levels of AMD, ranging from none to unilateral advanced AMD (neovascular or geographic atrophy). Follow-up averaged 9.3 years. We performed a Cox proportional hazards analysis with demographic, environmental, phenotypic, and genetic covariates and constructed a risk assessment model for development of advanced AMD. Performance of the model was evaluated using the C statistic and the Brier score and externally validated in participants in the Complications of Age-Related Macular Degeneration Prevention Trial. The final model included the following independent variables: age, smoking history, family history of AMD (first-degree member), phenotype based on a modified Age-Related Eye Disease Study simple scale score, and genetic variants CFH Y402H and ARMS2 A69S. The model did well on performance measures, with very good discrimination (C statistic = 0.872) and excellent calibration and overall performance (Brier score at 5 years = 0.08). Successful external validation was performed, and a risk assessment tool was designed for use with or without the genetic component. We constructed a risk assessment model for development of advanced AMD. The model performed well on measures of discrimination, calibration, and overall performance and was successfully externally validated. This risk assessment tool is available for online use.
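The two performance measures named above, the C statistic (discrimination) and the Brier score (calibration and overall performance), can be computed directly from predicted risks and observed outcomes. The toy outcome and risk vectors below are illustrative, not AREDS data.

```python
import numpy as np
from itertools import combinations

def c_statistic(y, p):
    """Fraction of event/non-event pairs in which the event received the
    higher predicted risk (ties count 1/2); equivalent to the AUC."""
    conc = ties = n = 0
    for i, j in combinations(range(len(y)), 2):
        if y[i] != y[j]:
            n += 1
            p_event = p[i] if y[i] == 1 else p[j]
            p_none = p[j] if y[i] == 1 else p[i]
            conc += p_event > p_none
            ties += p_event == p_none
    return (conc + 0.5 * ties) / n

def brier(y, p):
    """Mean squared difference between predicted risk and observed outcome."""
    return float(np.mean((np.asarray(p) - np.asarray(y)) ** 2))

y = [0, 0, 1, 1, 0, 1]              # observed outcomes (toy data)
p = [0.1, 0.3, 0.8, 0.6, 0.4, 0.9]  # predicted risks
print(c_statistic(y, p), brier(y, p))
```

A C statistic near 0.87, as reported for the AMD model, means roughly 87% of event/non-event pairs are ranked correctly; a Brier score near 0.08 indicates small average squared error between predicted risk and outcome.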
Kramer, Andrew A; Higgins, Thomas L; Zimmerman, Jack E
2014-03-01
To examine the accuracy of the original Mortality Probability Admission Model III, ICU Outcomes Model/National Quality Forum modification of Mortality Probability Admission Model III, and Acute Physiology and Chronic Health Evaluation IVa models for comparing observed and risk-adjusted hospital mortality predictions. Retrospective paired analyses of day 1 hospital mortality predictions using three prognostic models. Fifty-five ICUs at 38 U.S. hospitals from January 2008 to December 2012. Among 174,001 intensive care admissions, 109,926 met model inclusion criteria and 55,304 had data for mortality prediction using all three models. None. We compared patient exclusions and the discrimination, calibration, and accuracy for each model. Acute Physiology and Chronic Health Evaluation IVa excluded 10.7% of all patients, ICU Outcomes Model/National Quality Forum 20.1%, and Mortality Probability Admission Model III 24.1%. Discrimination of Acute Physiology and Chronic Health Evaluation IVa was superior with area under receiver operating curve (0.88) compared with Mortality Probability Admission Model III (0.81) and ICU Outcomes Model/National Quality Forum (0.80). Acute Physiology and Chronic Health Evaluation IVa was better calibrated (lowest Hosmer-Lemeshow statistic). The accuracy of Acute Physiology and Chronic Health Evaluation IVa was superior (adjusted Brier score = 31.0%) to that for Mortality Probability Admission Model III (16.1%) and ICU Outcomes Model/National Quality Forum (17.8%). Compared with observed mortality, Acute Physiology and Chronic Health Evaluation IVa overpredicted mortality by 1.5% and Mortality Probability Admission Model III by 3.1%; ICU Outcomes Model/National Quality Forum underpredicted mortality by 1.2%. Calibration curves showed that Acute Physiology and Chronic Health Evaluation performed well over the entire risk range, unlike the Mortality Probability Admission Model and ICU Outcomes Model/National Quality Forum models. 
Acute Physiology and Chronic Health Evaluation IVa had better accuracy within patient subgroups and for specific admission diagnoses. Acute Physiology and Chronic Health Evaluation IVa offered the best discrimination and calibration on a large common dataset and excluded fewer patients than Mortality Probability Admission Model III or ICU Outcomes Model/National Quality Forum. The choice of ICU performance benchmarks should be based on a comparison of model accuracy using data for identical patients.
Performance evaluation of NCDOT w-beam guardrails under MASH TL-2 conditions.
DOT National Transportation Integrated Search
2013-11-01
This report summarizes the research efforts of using finite element modeling and simulations to evaluate the performance of W-beam guardrails of different heights under MASH Test Level 2 (TL-2) and Test Level 3 (TL-3) impact conditions. A litera...
Trajectory tracking in quadrotor platform by using PD controller and LQR control approach
NASA Astrophysics Data System (ADS)
Islam, Maidul; Okasha, Mohamed; Idres, Moumen Mohammad
2017-11-01
The purpose of this paper is to present a comparative evaluation of the performance of two different controllers, a Proportional-Derivative (PD) controller and a Linear Quadratic Regulator (LQR), on a Quadrotor dynamic system that is under-actuated and highly nonlinear. As only four states can be controlled at the same time in the Quadrotor, the trajectories are designed on the basis of four states: the three-dimensional position and the rotation about the vertical axis, known as yaw. In this work, both the PD controller and the LQR approach are used to track trajectories with the nonlinear Quadrotor model. The LQR controller is designed on the basis of a linearized model of the Quadrotor, because the behavior of the linear and nonlinear models is almost identical around a nominal operating point. Simulink and MATLAB are used to design the controllers and to evaluate the performance of both controllers.
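The LQR design step described above can be sketched for a single channel. This assumes a double-integrator altitude model with a hypothetical mass and illustrative weighting matrices; it is not the paper's full Quadrotor model.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Double-integrator altitude channel of a quadrotor (hypothetical mass)
m = 1.2
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0 / m]])
Q = np.diag([10.0, 1.0])   # state weights: position error, velocity
R = np.array([[0.1]])      # control effort weight

P = solve_continuous_are(A, B, Q, R)   # algebraic Riccati equation
K = np.linalg.solve(R, B.T @ P)        # LQR gain, u = -K x
poles = np.linalg.eigvals(A - B @ K)   # closed-loop poles
print(K, poles)
```

The Riccati solution guarantees the closed-loop poles lie in the left half plane; tuning Q and R trades tracking error against control effort, which is where LQR differs from hand-tuned PD gains.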
Managing for efficiency in health care: the case of Greek public hospitals.
Mitropoulos, Panagiotis; Mitropoulos, Ioannis; Sissouras, Aris
2013-12-01
This paper evaluates the efficiency of public hospitals with two alternative conceptual models. One model targets resource usage directly to assess production efficiency, while the other model incorporates financial results to assess economic efficiency. Performance analysis of these models was conducted in two stages. In stage one, we utilized data envelopment analysis to obtain the efficiency score of each hospital, while in stage two we took into account the influence of the operational environment on efficiency by regressing those scores on explanatory variables that concern the performance of hospital services. We applied these methods to evaluate 96 general hospitals in the Greek national health system. The results indicate that, although the average efficiency scores in both models have remained relatively stable compared to past assessments, internal changes in hospital performances do exist. This study provides a clear framework for policy implications to increase the overall efficiency of general hospitals.
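The first stage of the analysis above, obtaining a data envelopment analysis efficiency score per hospital, can be sketched with the input-oriented CCR model in multiplier form. The four toy hospitals, their inputs, and outputs are invented for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Toy hospital data: two inputs (beds, staff), one output (admissions)
X = np.array([[20.0, 50.0], [30.0, 40.0], [40.0, 80.0], [25.0, 30.0]])
Y = np.array([[100.0], [120.0], [150.0], [90.0]])

def ccr_efficiency(o):
    """Input-oriented CCR efficiency of unit o (multiplier form)."""
    s, m = Y.shape[1], X.shape[1]
    c = np.concatenate([-Y[o], np.zeros(m)])      # maximise u . y_o
    A_eq = [np.concatenate([np.zeros(s), X[o]])]  # v . x_o = 1
    A_ub = np.hstack([Y, -X])                     # u . y_j - v . x_j <= 0 for all j
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(len(X)),
                  A_eq=A_eq, b_eq=[1.0], bounds=(0, None))
    return float(-res.fun)

scores = [round(ccr_efficiency(o), 3) for o in range(len(X))]
print(scores)
```

In the paper's second stage these scores would then be regressed on environmental explanatory variables; here the sketch stops at the scores themselves, which always place at least one unit on the efficient frontier (score 1).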
Performance Evaluation of Nano-JASMINE
NASA Astrophysics Data System (ADS)
Hatsutori, Y.; Kobayashi, Y.; Gouda, N.; Yano, T.; Murooka, J.; Niwa, Y.; Yamada, Y.
We report the results of performance evaluation of the first Japanese astrometry satellite, Nano-JASMINE. It is a very small satellite, weighing only 35 kg, which aims to carry out astrometric measurements of nearby bright stars (z ≤ 7.5 mag) with an accuracy of 3 milli-arcseconds. Nano-JASMINE will be launched by a Cyclone-4 rocket in August 2011 from Brazil. Its performance is currently being evaluated through a series of performance tests and numerical analyses. As a result, the engineering model (EM) of the telescope was measured to achieve diffraction-limited performance, confirming that it is sufficient for scientific astrometry.
Least-Squares Models to Correct for Rater Effects in Performance Assessment.
ERIC Educational Resources Information Center
Raymond, Mark R.; Viswesvaran, Chockalingam
This study illustrates the use of three least-squares models to control for rater effects in performance evaluation: (1) ordinary least squares (OLS); (2) weighted least squares (WLS); and (3) OLS subsequent to applying a logistic transformation to observed ratings (LOG-OLS). The three models were applied to ratings obtained from four…
Implementation of a WRF-CMAQ Air Quality Modeling System in Bogotá, Colombia
NASA Astrophysics Data System (ADS)
Nedbor-Gross, R.; Henderson, B. H.; Pachon, J. E.; Davis, J. R.; Baublitz, C. B.; Rincón, A.
2014-12-01
Due to continuous economic growth, Bogotá, Colombia has experienced air pollution issues in recent years. The local environmental authority has implemented several strategies to curb air pollution that have resulted in the decrease of PM10 concentrations since 2010. However, more activities are necessary in order to meet international air quality standards in the city. The University of Florida Air Quality and Climate group is collaborating with the Universidad de La Salle to prioritize regulatory strategies for Bogotá using air pollution simulations. To simulate pollution, we developed a modeling platform that combines the Weather Research and Forecasting Model (WRF), local emissions, and the Community Multi-scale Air Quality model (CMAQ). This platform is the first of its kind to be implemented in the megacity of Bogotá, Colombia. The presentation will discuss development and evaluation of the air quality modeling system, highlight initial results characterizing photochemical conditions in Bogotá, and characterize air pollution under proposed regulatory strategies. The WRF model has been configured and applied to Bogotá, which resides in a tropical climate with complex mountainous topography. Developing the configuration included incorporation of local topography and land-use data, a physics sensitivity analysis, review, and systematic evaluation. The threshold, however, was set based on synthesis of model performance under less mountainous conditions. We will evaluate the impact that differences in autocorrelation contribute to the non-ideal performance. Air pollution predictions are currently under way. CMAQ has been configured with WRF meteorology, global boundary conditions from GEOS-Chem, and a locally produced emission inventory. Preliminary results from simulations show promising performance of CMAQ in Bogotá.
Anticipated results include a systematic performance evaluation of ozone and PM10, characterization of photochemical sensitivity, and air quality predictions under proposed regulatory scenarios.
Forecasting biodiversity in breeding birds using best practices
Taylor, Shawn D.; White, Ethan P.
2018-01-01
Biodiversity forecasts are important for conservation, management, and evaluating how well current models characterize natural systems. While the number of forecasts for biodiversity is increasing, there is little information available on how well these forecasts work. Most biodiversity forecasts are not evaluated to determine how well they predict future diversity, fail to account for uncertainty, and do not use time-series data that captures the actual dynamics being studied. We addressed these limitations by using best practices to explore our ability to forecast the species richness of breeding birds in North America. We used hindcasting to evaluate six different modeling approaches for predicting richness. Hindcasts for each method were evaluated annually for a decade at 1,237 sites distributed throughout the continental United States. All models explained more than 50% of the variance in richness, but none of them consistently outperformed a baseline model that predicted constant richness at each site. The best practices implemented in this study directly influenced the forecasts and evaluations. Stacked species distribution models and “naive” forecasts produced poor estimates of uncertainty and accounting for this resulted in these models dropping in the relative performance compared to other models. Accounting for observer effects improved model performance overall, but also changed the rank ordering of models because it did not improve the accuracy of the “naive” model. Considering the forecast horizon revealed that the prediction accuracy decreased across all models as the time horizon of the forecast increased. To facilitate the rapid improvement of biodiversity forecasts, we emphasize the value of specific best practices in making forecasts and evaluating forecasting methods. PMID:29441230
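The role of the constant-richness baseline above can be sketched with synthetic data: a per-site historical mean is itself a forecast, and any richer model must beat it. Site counts and richness levels below are invented.

```python
import numpy as np

rng = np.random.default_rng(4)
n_sites, n_train, n_test = 50, 10, 5
site_mean = rng.uniform(40, 90, n_sites)  # hypothetical per-site richness levels
obs = rng.poisson(site_mean[:, None], (n_sites, n_train + n_test))
train, test = obs[:, :n_train], obs[:, n_train:]

site_baseline = train.mean(axis=1, keepdims=True)  # constant richness per site
grand_baseline = train.mean()                      # one value for every site

mae_site = float(np.abs(test - site_baseline).mean())
mae_grand = float(np.abs(test - grand_baseline).mean())
print(round(mae_site, 1), round(mae_grand, 1))
```

The per-site baseline is hard to beat precisely because richness at a site is often stable over a decade, which is why none of the six evaluated models consistently outperformed it.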
2011-09-01
a quality evaluation with limited data, a model-based assessment must be... that affect system performance, a multistage approach to system validation, a modeling and experimental methodology for efficiently addressing a wide range
ERIC Educational Resources Information Center
Gong, Yue; Beck, Joseph E.; Heffernan, Neil T.
2011-01-01
Student modeling is a fundamental concept applicable to a variety of intelligent tutoring systems (ITS). However, there is not a lot of practical guidance on how to construct and train such models. This paper compares two approaches for student modeling, Knowledge Tracing (KT) and Performance Factors Analysis (PFA), by evaluating their predictive…
Spatiotemporal Variation in Distance Dependent Animal Movement Contacts: One Size Doesn’t Fit All
Brommesson, Peter; Wennergren, Uno; Lindström, Tom
2016-01-01
The structure of contacts that mediate transmission has a pronounced effect on the outbreak dynamics of infectious disease and simulation models are powerful tools to inform policy decisions. Most simulation models of livestock disease spread rely to some degree on predictions of animal movement between holdings. Typically, movements are more common between nearby farms than between those located far away from each other. Here, we assessed spatiotemporal variation in such distance dependence of animal movement contacts from an epidemiological perspective. We evaluated and compared nine statistical models, applied to Swedish movement data from 2008. The models differed in at what level (if at all), they accounted for regional and/or seasonal heterogeneities in the distance dependence of the contacts. Using a kernel approach to describe how probability of contacts between farms changes with distance, we developed a hierarchical Bayesian framework and estimated parameters by using Markov Chain Monte Carlo techniques. We evaluated models by three different approaches of model selection. First, we used Deviance Information Criterion to evaluate their performance relative to each other. Secondly, we estimated the log predictive posterior distribution, this was also used to evaluate their relative performance. Thirdly, we performed posterior predictive checks by simulating movements with each of the parameterized models and evaluated their ability to recapture relevant summary statistics. Independent of selection criteria, we found that accounting for regional heterogeneity improved model accuracy. We also found that accounting for seasonal heterogeneity was beneficial, in terms of model accuracy, according to two of three methods used for model selection. Our results have important implications for livestock disease spread models where movement is an important risk factor for between farm transmission. 
We argue that modelers should refrain from using methods to simulate animal movements that assume the same pattern across all regions and seasons without explicitly testing for spatiotemporal variation. PMID:27760155
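The distance-dependence idea underlying the kernel models above can be sketched with one common generalised-kernel form, where contact probability decays with distance according to region- and season-specific parameters. The functional form and all parameter values here are illustrative assumptions, not the fitted Swedish estimates.

```python
import numpy as np

def contact_kernel(d, scale, shape):
    """Relative probability of a movement contact at distance d
    (one common generalised-kernel form; parameters illustrative)."""
    return 1.0 / (1.0 + (d / scale) ** shape)

d = np.linspace(1, 200, 100)  # distances between holdings, km
# Hypothetical region/season-specific parameter sets, reflecting the
# paper's finding that one kernel does not fit all strata
south_summer = contact_kernel(d, scale=30.0, shape=2.5)
north_winter = contact_kernel(d, scale=60.0, shape=1.8)
print(south_summer[:3], north_winter[:3])
```

Letting `scale` and `shape` vary by region and season is precisely the heterogeneity the nine compared models did or did not allow; in a simulation, these relative probabilities would be normalized over candidate destination holdings.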
Campbell, William; Ganna, Andrea; Ingelsson, Erik; Janssens, A Cecile J W
2016-01-01
We propose a new measure of assessing the performance of risk models, the area under the prediction impact curve (auPIC), which quantifies the performance of risk models in terms of their average health impact in the population. Using simulated data, we explain how the prediction impact curve (PIC) estimates the percentage of events prevented when a risk model is used to assign high-risk individuals to an intervention. We apply the PIC to the Atherosclerosis Risk in Communities (ARIC) Study to illustrate its application toward prevention of coronary heart disease. We estimated that if the ARIC cohort received statins at baseline, 5% of events would be prevented when the risk model was evaluated at a cutoff threshold of 20% predicted risk compared to 1% when individuals were assigned to the intervention without the use of a model. By calculating the auPIC, we estimated that an average of 15% of events would be prevented when considering performance across the entire interval. We conclude that the PIC is a clinically meaningful measure for quantifying the expected health impact of risk models that supplements existing measures of model performance. Copyright © 2016 Elsevier Inc. All rights reserved.
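The PIC logic described above can be sketched under simple assumptions: predicted risks are well calibrated, and the intervention removes a fixed fraction (the relative risk reduction) of events among those treated. The risk distribution and the 25% relative risk reduction below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
p = rng.beta(1, 8, size=10_000)  # hypothetical predicted risks for a cohort
rrr = 0.25                       # assumed relative risk reduction of the intervention

def events_prevented(cutoff):
    """Expected fraction of all events prevented when everyone with
    predicted risk above the cutoff receives the intervention."""
    return rrr * p[p > cutoff].sum() / p.sum()

cutoffs = np.linspace(0.0, 1.0, 101)
pic = np.array([events_prevented(c) for c in cutoffs])
auPIC = float(np.sum(np.diff(cutoffs) * (pic[:-1] + pic[1:]) / 2))  # trapezoid rule
print(round(float(pic[20]), 3), round(auPIC, 3))
```

Reading the curve at a single cutoff (e.g. 20% predicted risk) mirrors the ARIC example, while integrating it over all cutoffs gives the auPIC summary of average health impact.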
Evaluation of a black-footed ferret resource utilization function model
Eads, D.A.; Millspaugh, J.J.; Biggins, D.E.; Jachowski, D.S.; Livieri, T.M.
2011-01-01
Resource utilization function (RUF) models permit evaluation of potential habitat for endangered species; ideally such models should be evaluated before use in management decision-making. We evaluated the predictive capabilities of a previously developed black-footed ferret (Mustela nigripes) RUF. Using the population-level RUF, generated from ferret observations at an adjacent yet distinct colony, we predicted the distribution of ferrets within a black-tailed prairie dog (Cynomys ludovicianus) colony in the Conata Basin, South Dakota, USA. We evaluated model performance, using data collected during post-breeding spotlight surveys (2007-2008) by assessing model agreement via weighted compositional analysis and count-metrics. Compositional analysis of home range use and colony-level availability, and core area use and home range availability, demonstrated ferret selection of the predicted Very high and High occurrence categories in 2007 and 2008. Simple count-metrics corroborated these findings and suggested selection of the Very high category in 2007 and the Very high and High categories in 2008. Collectively, these results suggested that the RUF was useful in predicting occurrence and intensity of space use of ferrets at our study site, the 2 objectives of the RUF. Application of this validated RUF would increase the resolution of habitat evaluations, permitting prediction of the distribution of ferrets within distinct colonies. Additional model evaluation at other sites, on other black-tailed prairie dog colonies of varying resource configuration and size, would increase understanding of influences upon model performance and the general utility of the RUF. © 2011 The Wildlife Society.
Under the Air Quality Model Evaluation International Initiative, Phase 2 (AQMEII-2), three online coupled air quality model simulations, with six different configurations, are analyzed for their performance, inter-model agreement, and responses to emission and meteorological chan...
Evaluating Process Improvement Courses of Action Through Modeling and Simulation
2017-09-16
changes to a process is time consuming and has potential to overlook stochastic effects. By modeling a process as a Numerical Design Structure Matrix...
Improving Learner Handovers in Medical Education.
Warm, Eric J; Englander, Robert; Pereira, Anne; Barach, Paul
2017-07-01
Multiple studies have demonstrated that the information included in the Medical Student Performance Evaluation fails to reliably predict medical students' future performance. This faulty transfer of information can lead to harm when poorly prepared students fail out of residency or, worse, are shuttled through the medical education system without an honest accounting of their performance. Such poor learner handovers likely arise from two root causes: (1) the absence of agreed-on outcomes of training and/or accepted assessments of those outcomes, and (2) the lack of standardized ways to communicate the results of those assessments. To improve the current learner handover situation, an authentic, shared mental model of competency is needed; high-quality tools to assess that competency must be developed and tested; and transparent, reliable, and safe ways to communicate this information must be created.To achieve these goals, the authors propose using a learner handover process modeled after a patient handover process. The CLASS model includes a description of the learner's Competency attainment, a summary of the Learner's performance, an Action list and statement of Situational awareness, and Synthesis by the receiving program. This model also includes coaching oriented towards improvement along the continuum of education and care. Just as studies have evaluated patient handover models using metrics that matter most to patients, studies must evaluate this learner handover model using metrics that matter most to providers, patients, and learners.
Evaluation of ceramics for stator application: Gas turbine engine report
NASA Technical Reports Server (NTRS)
Trela, W.; Havstad, P. H.
1978-01-01
Current ceramic materials, component fabrication processes, and reliability prediction capability for ceramic stators in an automotive gas turbine engine environment are assessed. Simulated engine duty cycle testing of stators conducted at temperatures up to 1093 C is discussed. The materials evaluated are SiC and Si3N4 fabricated by two near-net-shape processes: slip casting and injection molding. Stators for durability cycle evaluation, test specimens for material property characterization, and a reliability prediction model prepared to predict stator performance in the simulated engine environment are considered. The status and description of the work performed on reliability prediction modeling, stator fabrication, material property characterization, and ceramic stator evaluation are reported.
Evaluation of CNN as anthropomorphic model observer
NASA Astrophysics Data System (ADS)
Massanes, Francesc; Brankov, Jovan G.
2017-03-01
Model observers (MO) are widely used in medical imaging as surrogates for human observers in task-based image quality evaluation, frequently towards optimization of reconstruction algorithms. In this paper, we explore the use of convolutional neural networks (CNN) as MO. We compare the CNN MO to alternative MO currently being proposed and used, such as the relevance vector machine based MO and the channelized Hotelling observer (CHO). As the success of CNN and other deep learning approaches is rooted in the availability of large data sets, which is rarely the case in task-based performance evaluation of medical imaging systems, we evaluate CNN performance on both large and small training data sets.
A new method based on fuzzy logic to evaluate the contract service provider performance.
Miguel, C A; Barr, C; Moreno, M J L
2008-01-01
This paper puts forward a fuzzy inference system for evaluating the service quality performance of service contract providers. An application service provider (ASP) model for computerized maintenance management was used in establishing common performance indicators of the quality of service. This model was implemented in 10 separate hospitals. As a result, inference produced a service cost/acquisition cost (SC/AC) ratio reduction from 16.14% to 6.09%, an increase of 20.9% in availability, with a maintained repair quality (NRR) in the period of December 2001 to January 2003.
A condition metric for Eucalyptus woodland derived from expert evaluations.
Sinclair, Steve J; Bruce, Matthew J; Griffioen, Peter; Dodd, Amanda; White, Matthew D
2018-02-01
The evaluation of ecosystem quality is important for land-management and land-use planning. Evaluation is unavoidably subjective, and robust metrics must be based on consensus and the structured use of observations. We devised a transparent and repeatable process for building and testing ecosystem metrics based on expert data. We gathered quantitative evaluation data on the quality of hypothetical grassy woodland sites from experts. We used these data to train a model (an ensemble of 30 bagged regression trees) capable of predicting the perceived quality of similar hypothetical woodlands based on a set of 13 site variables as inputs (e.g., cover of shrubs, richness of native forbs). These variables can be measured at any site and the model implemented in a spreadsheet as a metric of woodland quality. We also investigated the number of experts required to produce an opinion data set sufficient for the construction of a metric. The model produced evaluations similar to those provided by experts, as shown by assessing the model's quality scores of expert-evaluated test sites not used to train the model. We applied the metric to 13 woodland conservation reserves and asked managers of these sites to independently evaluate their quality. To assess metric performance, we compared the model's evaluation of site quality with the managers' evaluations through multidimensional scaling. The metric performed relatively well, plotting close to the center of the space defined by the evaluators. Given the method provides data-driven consensus and repeatability, which no single human evaluator can provide, we suggest it is a valuable tool for evaluating ecosystem quality in real-world contexts. We believe our approach is applicable to any ecosystem. © 2017 State of Victoria.
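The metric construction above (an ensemble of 30 bagged regression trees trained on expert scores) can be sketched with scikit-learn. The two predictor variables and the synthetic "expert quality" function below stand in for the paper's 13 site variables and real expert evaluations.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(6)
# Hypothetical expert scores: quality rises with native forb richness and
# peaks at intermediate shrub cover (stand-ins for the paper's site variables)
X = rng.uniform(0, 1, size=(300, 2))
quality = 10 * X[:, 0] + 5 * X[:, 1] * (1 - X[:, 1]) + rng.normal(0, 0.5, 300)

# Train the metric on expert-scored sites; hold some out for testing
metric = BaggingRegressor(DecisionTreeRegressor(), n_estimators=30,
                          random_state=0).fit(X[:200], quality[:200])
print(round(float(metric.score(X[200:], quality[200:])), 2))
```

Once trained, such an ensemble can score any site from its measured variables, which is what allows the metric to be implemented in a spreadsheet and applied to new reserves.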
Kalvāns, Andis; Bitāne, Māra; Kalvāne, Gunta
2015-02-01
A historical phenological record and meteorological data for the period 1960-2009 are used to analyse the ability of seven phenological models to predict leaf unfolding and the beginning of flowering for two tree species, silver birch (Betula pendula) and bird cherry (Padus racemosa), in Latvia. Model stability is estimated by performing multiple model fitting runs, using half of the data for model training and the other half for evaluation. The correlation coefficient, mean absolute error and mean squared error are used to evaluate model performance. UniChill (a model using a sigmoidal development rate-temperature relationship and accounting for the necessity of dormancy release) and DDcos (a simple degree-day model considering diurnal temperature fluctuations) are found to be the best models for describing the considered spring phases. A strong collinearity between the base temperature and the required heat sum is found in several model fitting runs of the simple degree-day models. Large variation of the model parameters between different fitting runs of the more complex models indicates similar collinearity and over-parameterization. It is suggested that model performance can be improved by incorporating the resolved daily temperature fluctuations of the DDcos model into the framework of the more complex models (e.g. UniChill). The average base temperature found by the DDcos model for B. pendula leaf unfolding is 5.6 °C and for the start of flowering 6.7 °C; for P. racemosa, the respective base temperatures are 3.2 °C and 3.4 °C.
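The degree-day mechanism shared by the simpler models above can be sketched as follows. The synthetic temperature series and the heat sum of 80 degree-days are illustrative assumptions; only the 5.6 °C base temperature is taken from the abstract's B. pendula result.

```python
import numpy as np

def degree_day_onset(tmean, base_temp, heat_sum, start_doy=1):
    """Day of year on which degrees accumulated above base_temp reach heat_sum."""
    forcing = np.cumsum(np.maximum(tmean[start_doy - 1:] - base_temp, 0.0))
    hit = int(np.argmax(forcing >= heat_sum))
    return start_doy + hit if forcing[hit] >= heat_sum else None

doy = np.arange(1, 183)
# Synthetic daily mean temperatures for a spring warming curve
tmean = -5.0 + 20.0 * np.sin((doy - 30) / 365.0 * 2.0 * np.pi)
print(degree_day_onset(tmean, base_temp=5.6, heat_sum=80.0))
```

The collinearity the abstract reports arises here directly: raising `base_temp` while lowering `heat_sum` can leave the predicted onset day almost unchanged, so the two parameters are hard to identify separately.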
An Object-Based Approach to Evaluation of Climate Variability Projections and Predictions
NASA Astrophysics Data System (ADS)
Ammann, C. M.; Brown, B.; Kalb, C. P.; Bullock, R.
2017-12-01
Evaluations of the performance of earth system model predictions and projections are of critical importance to enhance usefulness of these products. Such evaluations need to address specific concerns depending on the system and decisions of interest; hence, evaluation tools must be tailored to inform about specific issues. Traditional approaches that summarize grid-based comparisons of analyses and models, or between current and future climate, often do not reveal important information about the models' performance (e.g., spatial or temporal displacements; the reason behind a poor score) and are unable to accommodate these specific information needs. For example, summary statistics such as the correlation coefficient or the mean-squared error provide minimal information to developers, users, and decision makers regarding what is "right" and "wrong" with a model. New spatial and temporal-spatial object-based tools from the field of weather forecast verification (where comparisons typically focus on much finer temporal and spatial scales) have been adapted to more completely answer some of the important earth system model evaluation questions. In particular, the Method for Object-based Diagnostic Evaluation (MODE) tool and its temporal (three-dimensional) extension (MODE-TD) have been adapted for these evaluations. More specifically, these tools can be used to address spatial and temporal displacements in projections of El Nino-related precipitation and/or temperature anomalies, ITCZ-associated precipitation areas, atmospheric rivers, seasonal sea-ice extent, and other features of interest. Examples of several applications of these tools in a climate context will be presented, using output of the CESM large ensemble. In general, these tools provide diagnostic information about model performance - accounting for spatial, temporal, and intensity differences - that cannot be achieved using traditional (scalar) model comparison approaches. 
Thus, they can provide more meaningful information that can be used in decision-making and planning. Future extensions and applications of these tools in a climate context will be considered.
Liu, Yaoming; Cohen, Mark E; Hall, Bruce L; Ko, Clifford Y; Bilimoria, Karl Y
2016-08-01
The American College of Surgeons (ACS) NSQIP Surgical Risk Calculator has been widely adopted as a decision aid and informed consent tool by surgeons and patients. Previous evaluations showed excellent discrimination and combined discrimination and calibration, but model calibration alone, and potential benefits of recalibration, were not explored. Because lack of calibration can lead to systematic errors in assessing surgical risk, our objective was to assess calibration and determine whether spline-based adjustments could improve it. We evaluated Surgical Risk Calculator model calibration, as well as discrimination, for each of 11 outcomes modeled from nearly 3 million patients (2010 to 2014). Using independent random subsets of data, we evaluated model performance for the Development (60% of records), Validation (20%), and Test (20%) datasets, where prediction equations from the Development dataset were recalibrated using restricted cubic splines estimated from the Validation dataset. We also evaluated performance on data subsets composed of higher-risk operations. The nonrecalibrated Surgical Risk Calculator performed well, but there was a slight tendency for predicted risk to be overestimated for lowest- and highest-risk patients and underestimated for moderate-risk patients. After recalibration, this distortion was eliminated, and p values for miscalibration were most often nonsignificant. Calibration was also excellent for subsets of higher-risk operations, though observed calibration was reduced due to instability associated with smaller sample sizes. Performance of NSQIP Surgical Risk Calculator models was shown to be excellent and improved with recalibration. Surgeons and patients can rely on the calculator to provide accurate estimates of surgical risk. Copyright © 2016 American College of Surgeons. Published by Elsevier Inc. All rights reserved.
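The calibration diagnostic underlying this kind of evaluation can be sketched simply: group patients into bins by predicted risk and compare mean predicted risk with the observed event rate in each bin. This is a minimal illustration of the idea, not the NSQIP spline-based recalibration itself; the function name is hypothetical.

```python
def calibration_table(pred, obs, n_bins=10):
    """Group (predicted risk, 0/1 outcome) pairs into equal-count bins
    ordered by predicted risk; return (mean predicted risk, observed
    event rate) per bin. Well-calibrated models give matching pairs."""
    pairs = sorted(zip(pred, obs))
    size = len(pairs) // n_bins
    rows = []
    for b in range(n_bins):
        end = (b + 1) * size if b < n_bins - 1 else len(pairs)
        chunk = pairs[b * size:end]
        ps = [p for p, _ in chunk]
        ys = [y for _, y in chunk]
        rows.append((sum(ps) / len(ps), sum(ys) / len(ys)))
    return rows
```

The distortion the abstract describes (overestimation at the extremes, underestimation in the middle) would show up here as bins whose observed rate sits below the mean predicted risk at both ends of the table.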
NASA Astrophysics Data System (ADS)
Rahmati, Omid; Tahmasebipour, Nasser; Haghizadeh, Ali; Pourghasemi, Hamid Reza; Feizizadeh, Bakhtiar
2017-12-01
Gully erosion constitutes a serious problem for land degradation in a wide range of environments. The main objective of this research was to compare the performance of seven state-of-the-art machine learning models (SVM with four kernel types, BP-ANN, RF, and BRT) to model the occurrence of gully erosion in the Kashkan-Poldokhtar Watershed, Iran. In the first step, a gully inventory map consisting of 65 gully polygons was prepared through field surveys. Three different sample data sets (S1, S2, and S3), including both positive and negative cells (70% for training and 30% for validation), were randomly prepared to evaluate the robustness of the models. To model the gully erosion susceptibility, 12 geo-environmental factors were selected as predictors. Finally, the goodness-of-fit and prediction skill of the models were evaluated by different criteria, including efficiency percent, kappa coefficient, and the area under the ROC curves (AUC). In terms of accuracy, the RF, RBF-SVM, BRT, and P-SVM models performed excellently both in the degree of fitting and in predictive performance (AUC values well above 0.9), which resulted in accurate predictions. Therefore, these models can be used in other gully erosion studies, as they are capable of rapidly producing accurate and robust gully erosion susceptibility maps (GESMs) for decision-making and soil and water management practices. Furthermore, it was found that performance of RF and RBF-SVM for modelling gully erosion occurrence is quite stable when the learning and validation samples are changed.
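The AUC criterion used to compare these gully erosion models can be computed directly from scores and labels via the Mann-Whitney statistic. A minimal stdlib sketch (quadratic in sample size, fine for illustration; the function name is ours):

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney U statistic: the
    probability that a randomly chosen positive case receives a higher
    score than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC "well above 0.9", as reported for the RF, RBF-SVM, BRT, and P-SVM models, means the model ranks a random gully cell above a random non-gully cell more than 90% of the time.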
ERIC Educational Resources Information Center
Harrison, Christopher; Cohen-Vogel, Lora
2012-01-01
Following a multiyear debate, Florida lawmakers passed the "Student Success Act" in March 2011, introducing some of the most sweeping educational reforms in the state's history--the introduction of teacher evaluation systems based on value-added modeling, mandatory "performance pay" for teachers, and the elimination of…
Development and Integration of Control System Models
NASA Technical Reports Server (NTRS)
Kim, Young K.
1998-01-01
The computer simulation tool, TREETOPS, has been upgraded and used at NASA/MSFC to model various complicated mechanical systems and to perform their dynamics and control analysis with pointing control systems. A TREETOPS model of Advanced X-ray Astrophysics Facility - Imaging (AXAF-1) dynamics and control system was developed to evaluate the AXAF-I pointing performance for Normal Pointing Mode. An optical model of Shooting Star Experiment (SSE) was also developed and its optical performance analysis was done using the MACOS software.
2012-01-01
Background We introduce the linguistic annotation of a corpus of 97 full-text biomedical publications, known as the Colorado Richly Annotated Full Text (CRAFT) corpus. We further assess the performance of existing tools for performing sentence splitting, tokenization, syntactic parsing, and named entity recognition on this corpus. Results Many biomedical natural language processing systems demonstrated large differences between their previously published results and their performance on the CRAFT corpus when tested with the publicly available models or rule sets. Trainable systems differed widely with respect to their ability to build high-performing models based on this data. Conclusions The finding that some systems were able to train high-performing models based on this corpus is additional evidence, beyond high inter-annotator agreement, that the quality of the CRAFT corpus is high. The overall poor performance of various systems indicates that considerable work needs to be done to enable natural language processing systems to work well when the input is full-text journal articles. The CRAFT corpus provides a valuable resource to the biomedical natural language processing community for evaluation and training of new models for biomedical full text publications. PMID:22901054
NASA Astrophysics Data System (ADS)
Collins, Jarrod A.; Brown, Daniel; Kingham, T. Peter; Jarnagin, William R.; Miga, Michael I.; Clements, Logan W.
2015-03-01
Development of a clinically accurate predictive model of microwave ablation (MWA) procedures would represent a significant advancement and facilitate an implementation of patient-specific treatment planning to achieve optimal probe placement and ablation outcomes. While studies have been performed to evaluate predictive models of MWA, the ability to quantify the performance of predictive models via clinical data has been limited to comparing geometric measurements of the predicted and actual ablation zones. The accuracy of placement, as determined by the degree of spatial overlap between ablation zones, has not been evaluated. In order to overcome this limitation, a method of evaluation is proposed where the actual location of the MWA antenna is tracked and recorded during the procedure via a surgical navigation system. Predictive models of the MWA are then computed using the known position of the antenna within the preoperative image space. Two different predictive MWA models were used for the preliminary evaluation of the proposed method: (1) a geometric model based on the labeling associated with the ablation antenna and (2) a 3-D finite element method based computational model of MWA using COMSOL. Given the follow-up tomographic images that are acquired at approximately 30 days after the procedure, a 3-D surface model of the necrotic zone was generated to represent the true ablation zone. A quantification of the overlap between the predicted ablation zones and the true ablation zone was performed after a rigid registration was computed between the pre- and post-procedural tomograms. While both models showed significant overlap with the true ablation zone, these preliminary results suggest a slightly higher degree of overlap with the geometric model.
Evaluation of CAMEL - comprehensive areal model of earthquake-induced landslides
Miles, S.B.; Keefer, D.K.
2009-01-01
A new comprehensive areal model of earthquake-induced landslides (CAMEL) has been developed to assist in planning decisions related to disaster risk reduction. CAMEL provides an integrated framework for modeling all types of earthquake-induced landslides using fuzzy logic systems and geographic information systems. CAMEL is designed to facilitate quantitative and qualitative representation of terrain conditions and knowledge about these conditions on the likely areal concentration of each landslide type. CAMEL has been empirically evaluated with respect to disrupted landslides (Category I) using a case study of the 1989 M = 6.9 Loma Prieta, CA earthquake. In this case, CAMEL performs best in comparison to disrupted slides and falls in soil. For disrupted rock fall and slides, CAMEL's performance was slightly poorer. The model predicted a low occurrence of rock avalanches, when none in fact occurred. A similar comparison with the Loma Prieta case study was also conducted using a simplified Newmark displacement model. The area under the curve method of evaluation was used in order to draw comparisons between both models, revealing improved performance with CAMEL. CAMEL should not, however, be viewed as a strict alternative to Newmark displacement models. CAMEL can be used to integrate Newmark displacements with other, previously incompatible, types of knowledge. © 2008 Elsevier B.V.
NASA Astrophysics Data System (ADS)
Azadeh, A.; Salehi, V.; Salehi, R.
2017-10-01
Information systems (IS) are strongly influenced by changes in new technology and should react swiftly in response to external conditions. Resilience engineering is a new method that can enable these systems to absorb changes. In this study, a new framework is presented for performance evaluation of IS that includes DeLone and McLean's factors of success in addition to resilience. Hence, this study is an attempt to evaluate the impact of resilience on IS by the proposed model in the Iranian Gas Engineering and Development Company, using data obtained from questionnaires and the Fuzzy Data Envelopment Analysis (FDEA) approach. First, the FDEA model with α-cut = 0.05 was identified as the most suitable model for this application by running both the Banker, Charnes and Cooper (BCC) and the Charnes, Cooper and Rhodes (CCR) forms of FDEA and selecting the appropriate model based on maximum mean efficiency. Then, the factors were ranked based on the results of sensitivity analysis, which showed resilience had a significantly higher impact on the proposed model relative to other factors. The results of this study were then verified by conducting the related ANOVA test. This is the first study that examines the impact of resilience on IS by statistical and mathematical approaches.
NASA Astrophysics Data System (ADS)
Triantafyllou, A. G.; Kalogiros, J.; Krestou, A.; Leivaditou, E.; Zoumakis, N.; Bouris, D.; Garas, S.; Konstantinidis, E.; Wang, Q.
2018-03-01
This paper provides the performance evaluation of the meteorological component of The Air Pollution Model (TAPM), a nestable prognostic model, in predicting meteorological variables in urban areas, for both its surface layer and atmospheric boundary layer (ABL) turbulence parameterizations. The model was modified by incorporating four urban land surface types, replacing the existing single urban surface. Control runs were carried out over the wider area of Kozani, an urban area in NW Greece. The model was evaluated for both surface and ABL meteorological variables by using measurements of near-surface and vertical profiles of wind and temperature. The data were collected by using monitoring surface stations in selected sites as well as an acoustic sounder (SOnic Detection And Ranging (SODAR), up to 300 m above ground) and a radiometer profiler (up to 600 m above ground). The results showed that the model predicted the near-surface meteorology in the Kozani region well for both a winter and a summer month. In the ABL, the comparison showed that the model's forecasts generally performed well with respect to the thermal structure (temperature profiles and ABL height) but overestimated wind speed at the heights of comparison (mostly below 200 m) by up to 3-4 m s-1.
Norm-Referenced Tests. Summary. REL 2014-004
ERIC Educational Resources Information Center
Stuit, David; Austin, Megan J.; Berends, Mark; Gerdeman, R. Dean
2014-01-01
Recent changes to state laws on accountability have prompted school districts to design teacher performance evaluation systems that incorporate student achievement (student growth) as a major component. As a consequence, some states and districts are considering teacher value- added models as part of teacher performance evaluations. Value-added…
Norm-Referenced Tests. REL 2014-004
ERIC Educational Resources Information Center
Stuit, David; Austin, Megan J.; Berends, Mark; Gerdeman, R. Dean
2014-01-01
Recent changes to state laws on accountability have prompted school districts to design teacher performance evaluation systems that incorporate student achievement (student growth) as a major component. As a consequence, some states and districts are considering teacher value-added models as part of teacher performance evaluations. Value-added…
WRF-Cordex simulations for Europe: mean and extreme precipitation for present and future climates
NASA Astrophysics Data System (ADS)
Cardoso, Rita M.; Soares, Pedro M. M.; Miranda, Pedro M. A.
2013-04-01
The Weather Research and Forecast (WRF-ARW) model, version 3.3.1, was used to perform the European domain Cordex simulations, at 50 km resolution. A first simulation, forced by ERA-Interim (1989-2009), was carried out to evaluate the model's ability to represent the mean and extreme precipitation in the present European climate. This evaluation is based on the comparison of WRF results against the ECAD regular gridded dataset of daily precipitation. Results are comparable to recent studies with other models for the European region, at this resolution. For the same domain, a control and a future scenario (RCP8.5) simulation were performed to assess the climate change impact on the mean and extreme precipitation. These regional simulations were forced by EC-EARTH model results and encompass the periods 1960-2006 and 2006-2100, respectively.
Predicting mining activity with parallel genetic algorithms
Talaie, S.; Leigh, R.; Louis, S.J.; Raines, G.L.; Beyer, H.G.; O'Reilly, U.M.; Banzhaf, Arnold D.; Blum, W.; Bonabeau, C.; Cantu-Paz, E.W.
2005-01-01
We explore several different techniques in our quest to improve the overall model performance of a genetic algorithm calibrated probabilistic cellular automata. We use the Kappa statistic to measure correlation between ground truth data and data predicted by the model. Within the genetic algorithm, we introduce a new evaluation function sensitive to spatial correctness and we explore the idea of evolving different rule parameters for different subregions of the land. We reduce the time required to run a simulation from 6 hours to 10 minutes by parallelizing the code and employing a 10-node cluster. Our empirical results suggest that using the spatially sensitive evaluation function does indeed improve the performance of the model and our preliminary results also show that evolving different rule parameters for different regions tends to improve overall model performance. Copyright 2005 ACM.
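The Kappa statistic used here to score agreement between ground truth and model-predicted land-use maps is Cohen's kappa: observed agreement corrected for the agreement expected by chance. A stdlib sketch (the function name is ours; the paper's spatially sensitive variant would weight cells differently):

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length categorical label sequences:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(a)
    cats = set(a) | set(b)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    # chance agreement: product of each rater's marginal category rates
    p_exp = sum((a.count(c) / n) * (b.count(c) / n) for c in cats)
    return (p_obs - p_exp) / (1 - p_exp)
```

Kappa of 1 means perfect agreement; 0 means no better than chance, which is why it is preferred over raw cell-by-cell accuracy for maps dominated by one class.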
In this paper, the concept of scale analysis is applied to evaluate ozone predictions from two regional-scale air quality models. To this end, seasonal time series of observations and predictions from the RAMS3b/UAM-V and MM5/MAQSIP (SMRAQ) modeling systems for ozone were spectra...
USDA-ARS?s Scientific Manuscript database
Representing the performance of cattle finished on an all forage diet in process-based whole farm system models has presented a challenge. To address this challenge, a study was done to evaluate average daily gain (ADG) predictions of the Integrated Farm System Model (IFSM) for steers consuming all-...
Field evaluations of a forestry version of DRAINMOD-NII model
S. Tian; M. A. Youssef; R.W. Skaggs; D.M. Amatya; G.M. Chescheir
2010-01-01
This study evaluated the performance of the newly developed forestry version of DRAINMOD-NII model using a long term (21-year) data set collected from an artificially drained loblolly pine (Pinus taeda L.) plantation in eastern North Carolina, U.S.A. The model simulates the main hydrological and biogeochemical processes in drained forested lands. The...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Platania, P., E-mail: platania@ifp.cnr.it; Figini, L.; Farina, D.
The purpose of this work is the optical modeling and physical performance evaluation of the JT-60SA ECRF launcher system. The beams have been simulated with the electromagnetic code GRASP® and used as input for ECCD calculations performed with the beam tracing code GRAY, capable of modeling propagation, absorption and current drive of an EC Gaussian beam with general astigmatism. Full details of the optical analysis have been taken into account to model the launched beams. Inductive and advanced reference scenarios have been analysed for physical evaluations in the full poloidal and toroidal steering ranges for two slightly different layouts of the launcher system.
Catchment area-based evaluation of the AMC-dependent SCS-CN-based rainfall-runoff models
NASA Astrophysics Data System (ADS)
Mishra, S. K.; Jain, M. K.; Pandey, R. P.; Singh, V. P.
2005-09-01
Using a large set of rainfall-runoff data from 234 watersheds in the USA, a catchment area-based evaluation of the modified version of the Mishra and Singh (2002a) model was performed. The model is based on the Soil Conservation Service Curve Number (SCS-CN) methodology and incorporates the antecedent moisture in computation of direct surface runoff. Comparison with the existing SCS-CN method showed that the modified version performed better than the existing one on the data of all seven area-based groups of watersheds, ranging from 0.01 to 310.3 km².
Integrated Model for Performance Analysis of All-Optical Multihop Packet Switches
NASA Astrophysics Data System (ADS)
Jeong, Han-You; Seo, Seung-Woo
2000-09-01
The overall performance of an all-optical packet switching system is usually determined by two criteria, i.e., switching latency and packet loss rate. In some real-time applications, however, in which packets arriving later than a timeout period are discarded as loss, the packet loss rate becomes the most dominant criterion for system performance. Here we focus on evaluating the performance of all-optical packet switches in terms of the packet loss rate, which normally arises from the insufficient hardware or the degradation of an optical signal. Considering both aspects, we propose what we believe is a new analysis model for the packet loss rate that reflects the complicated interactions between physical impairments and system-level parameters. On the basis of the estimation model for signal quality degradation in a multihop path we construct an equivalent analysis model of a switching network for evaluating an average bit error rate. With the model constructed we then propose an integrated model for estimating the packet loss rate in three architectural examples of multihop packet switches, each of which is based on a different switching concept. We also derive the bounds on the packet loss rate induced by bit errors. Finally, it is verified through simulation studies that our analysis model accurately predicts system performance.
Performance analysis of 60-min to 1-min integration time rain rate conversion models in Malaysia
NASA Astrophysics Data System (ADS)
Ng, Yun-Yann; Singh, Mandeep Singh Jit; Thiruchelvam, Vinesh
2018-01-01
Utilizing the frequency band above 10 GHz is in focus nowadays as a result of the fast expansion of radio communication systems in Malaysia. However, rain fade is the critical factor in attenuation of signal propagation for frequencies above 10 GHz. Malaysia is located in a tropical and equatorial region with high rain intensity throughout the year, and this study reviews rain distribution and evaluates the performance of 60-min to 1-min integration time rain rate conversion methods for Malaysia. Several conversion methods, such as Segal, Chebil & Rahman, Burgeono, Emiliani, Lavergnat and Gole (LG), Simplified Moupfouma, Joo et al., a fourth-order polynomial fit, and a logarithmic model, were chosen and their ability to predict the 1-min rain rate was evaluated for 10 sites in Malaysia. The results show that the Chebil & Rahman model, the Lavergnat & Gole model, the fourth-order polynomial fit, and the logarithmic model performed best in 60-min to 1-min rain rate conversion over the 10 sites. No single model performs best across all 10 sites; however, averaging RMSE and SC-RMSE over the 10 sites, the Chebil and Rahman model is the best method.
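The model-ranking step this study performs, scoring each conversion method's 1-min rain rate predictions against measurements by RMSE and averaging over sites, can be sketched as follows (function names are ours; SC-RMSE, the second criterion the abstract mentions, is not reproduced here):

```python
def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed values."""
    n = len(observed)
    return (sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n) ** 0.5

def rank_models(models, observed):
    """Rank candidate conversion models by RMSE against observations.
    `models` maps model name -> list of predicted 1-min rain rates."""
    return sorted((rmse(pred, observed), name)
                  for name, pred in models.items())
```

Averaging each model's RMSE over all 10 sites before ranking, as the study does, prevents a method that excels at one site but fails elsewhere from appearing best overall.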
Towards improved and more routine Earth system model evaluation in CMIP
Eyring, Veronika; Gleckler, Peter J.; Heinze, Christoph; ...
2016-11-01
The Coupled Model Intercomparison Project (CMIP) has successfully provided the climate community with a rich collection of simulation output from Earth system models (ESMs) that can be used to understand past climate changes and make projections and uncertainty estimates of the future. Confidence in ESMs can be gained because the models are based on physical principles and reproduce many important aspects of observed climate. More research is required to identify the processes that are most responsible for systematic biases and the magnitude and uncertainty of future projections so that more relevant performance tests can be developed. At the same time, there are many aspects of ESM evaluation that are well established and considered an essential part of systematic evaluation but have been implemented ad hoc with little community coordination. Given the diversity and complexity of ESM analysis, we argue that the CMIP community has reached a critical juncture at which many baseline aspects of model evaluation need to be performed much more efficiently and consistently. We provide a perspective and viewpoint on how a more systematic, open, and rapid performance assessment of the large and diverse number of models that will participate in current and future phases of CMIP can be achieved, and announce our intention to implement such a system for CMIP6. Accomplishing this could also free up valuable resources as many scientists are frequently "re-inventing the wheel" by re-writing analysis routines for well-established analysis methods. A more systematic approach for the community would be to develop and apply evaluation tools that are based on the latest scientific knowledge and observational references, are well suited for routine use, and provide a wide range of diagnostics and performance metrics that comprehensively characterize model behaviour as soon as the output is published to the Earth System Grid Federation (ESGF).
The CMIP infrastructure enforces data standards and conventions for model output and documentation accessible via the ESGF, additionally publishing observations (obs4MIPs) and reanalyses (ana4MIPs) for model intercomparison projects using the same data structure and organization as the ESM output. This largely facilitates routine evaluation of the ESMs, but to be able to process the data automatically alongside the ESGF, the infrastructure needs to be extended with processing capabilities at the ESGF data nodes where the evaluation tools can be executed on a routine basis. Efforts are already underway to develop community-based evaluation tools, and we encourage experts to provide additional diagnostic codes that would enhance this capability for CMIP. And, at the same time, we encourage the community to contribute observations and reanalyses for model evaluation to the obs4MIPs and ana4MIPs archives. The intention is to produce through the ESGF a widely accepted quasi-operational evaluation framework for CMIP6 that would routinely execute a series of standardized evaluation tasks. Over time, as this capability matures, we expect to produce an increasingly systematic characterization of models which, compared with early phases of CMIP, will more quickly and openly identify the strengths and weaknesses of the simulations. This will also reveal whether long-standing model errors remain evident in newer models and will assist modelling groups in improving their models. Finally, this framework will be designed to readily incorporate updates, including new observations and additional diagnostics and metrics as they become available from the research community.
Assessment of tools for modeling aircraft noise in national parks
DOT National Transportation Integrated Search
2005-03-18
The first objective of this study was to evaluate the series of model enhancements that were included in : INM as a result of the recommendations from the GCNP MVS. Specifically, there was a desire to : evaluate the performance of the latest versions...
Evaluating the Predictive Value of Growth Prediction Models
ERIC Educational Resources Information Center
Murphy, Daniel L.; Gaertner, Matthew N.
2014-01-01
This study evaluates four growth prediction models--projection, student growth percentile, trajectory, and transition table--commonly used to forecast (and give schools credit for) middle school students' future proficiency. Analyses focused on vertically scaled summative mathematics assessments, and two performance standards conditions (high…
The evaluation system of city's smart growth success rates
NASA Astrophysics Data System (ADS)
Huang, Yifan
2018-04-01
"Smart growth" pursues the best integrated performance across the three E's: an economically prosperous, socially equitable, and environmentally sustainable city. Firstly, we establish the smart growth evaluation system (SGI) and the sustainable development evaluation system (SDI), based on the ten principles of smart growth and the definition of the three E's of sustainability. By using the Z-score method and principal component analysis, we evaluate and quantify the indexes synthetically. Then we define the success of smart growth (SSG) as the ratio of the SDI composite score growth rate to the SGI composite score growth rate. After that we select two cities, Canberra and Durres, as the objects of our model. Based on the development plans and key data of these two cities, we can figure out the success of smart growth. According to our model, we adjust some of the growth indicators for both cities, observe the results before and after adjustment, and finally verify the accuracy of the model.
Evaluating the Value of High Spatial Resolution in National Capacity Expansion Models using ReEDS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Krishnan, Venkat; Cole, Wesley
2016-11-14
Power sector capacity expansion models (CEMs) have a broad range of spatial resolutions. This paper uses the Regional Energy Deployment System (ReEDS) model, a long-term national scale electric sector CEM, to evaluate the value of high spatial resolution for CEMs. ReEDS models the United States with 134 load balancing areas (BAs) and captures the variability in existing generation parameters, future technology costs, performance, and resource availability using very high spatial resolution data, especially for wind and solar modeled at 356 resource regions. In this paper we perform planning studies at three different spatial resolutions--native resolution (134 BAs), state-level, and NERC region level--and evaluate how results change under different levels of spatial aggregation in terms of renewable capacity deployment and location, associated transmission builds, and system costs. The results are used to ascertain the value of high geographically resolved models in terms of their impact on relative competitiveness among renewable energy resources.
Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam
2017-01-01
The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT) obtained images over plaster models for the assessment of mixed dentition analysis. Thirty CBCT-derived images and thirty plaster models were derived from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analysis along with Student's t-test was performed to evaluate the data, and P < 0.05 was considered statistically significant. Statistically significant results were obtained on data comparison between CBCT-derived images and plaster models; the mean for Moyer's analysis in the left and right lower arch for CBCT and plaster model was 21.2 mm, 21.1 mm and 22.5 mm, 22.5 mm, respectively. CBCT-derived images were less reliable as compared to data obtained directly from plaster models for mixed dentition analysis.
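The Tanaka-Johnston analysis applied in this study is a simple linear prediction: the combined mesiodistal width of the unerupted canine and premolars in one quadrant is estimated as half the summed widths of the four mandibular incisors plus a constant (10.5 mm for the mandibular arch, 11.0 mm for the maxillary arch). A sketch (the function name is ours):

```python
def tanaka_johnston(incisor_sum_mm, arch):
    """Predicted combined mesiodistal width (mm) of the unerupted canine
    and two premolars in one quadrant, from the summed widths of the
    four mandibular incisors."""
    constant = {"mandibular": 10.5, "maxillary": 11.0}[arch]
    return incisor_sum_mm / 2.0 + constant
```

The study's comparison amounts to feeding incisor widths measured on CBCT images versus plaster models into the same formula and testing whether the resulting predictions differ significantly.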
GANViz: A Visual Analytics Approach to Understand the Adversarial Game.
Wang, Junpeng; Gou, Liang; Yang, Hao; Shen, Han-Wei
2018-06-01
Generative models bear promising implications to learn data representations in an unsupervised fashion with deep learning. Generative Adversarial Nets (GAN) is one of the most popular frameworks in this arena. Despite the promising results from different types of GANs, in-depth understanding on the adversarial training process of the models remains a challenge to domain experts. The complexity and the potential long-time training process of the models make it hard to evaluate, interpret, and optimize them. In this work, guided by practical needs from domain experts, we design and develop a visual analytics system, GANViz, aiming to help experts understand the adversarial process of GANs in-depth. Specifically, GANViz evaluates the model performance of two subnetworks of GANs, provides evidence and interpretations of the models' performance, and empowers comparative analysis with the evidence. Through our case studies with two real-world datasets, we demonstrate that GANViz can provide useful insight into helping domain experts understand, interpret, evaluate, and potentially improve GAN models.
Benchmark simulation model no 2: general protocol and exploratory case studies.
Jeppsson, U; Pons, M-N; Nopens, I; Alex, J; Copp, J B; Gernaey, K V; Rosen, C; Steyer, J-P; Vanrolleghem, P A
2007-01-01
Over a decade ago, the concept of objectively evaluating the performance of control strategies by simulating them using a standard model implementation was introduced for activated sludge wastewater treatment plants. The resulting Benchmark Simulation Model No 1 (BSM1) has been the basis for a significant new development that is reported on here: Rather than only evaluating control strategies at the level of the activated sludge unit (bioreactors and secondary clarifier) the new BSM2 now allows the evaluation of control strategies at the level of the whole plant, including primary clarifier and sludge treatment with anaerobic sludge digestion. In this contribution, the decisions that have been made over the past three years regarding the models used within the BSM2 are presented and argued, with particular emphasis on the ADM1 description of the digester, the interfaces between activated sludge and digester models, the included temperature dependencies and the reject water storage. BSM2-implementations are now available in a wide range of simulation platforms and a ring test has verified their proper implementation, consistent with the BSM2 definition. This guarantees that users can focus on the control strategy evaluation rather than on modelling issues. Finally, for illustration, twelve simple operational strategies have been implemented in BSM2 and their performance evaluated. Results show that it is an interesting control engineering challenge to further improve the performance of the BSM2 plant (which is the whole idea behind benchmarking) and that integrated control (i.e. acting at different places in the whole plant) is certainly worthwhile to achieve overall improvement.
Modular biowaste monitoring system
NASA Technical Reports Server (NTRS)
Fogal, G. L.
1975-01-01
The objective of the Modular Biowaste Monitoring System Program was to generate and evaluate hardware for supporting shuttle life science experimental and diagnostic programs. An initial conceptual design effort established requirements and defined an overall modular system for the collection, measurement, sampling and storage of urine and feces biowastes. This conceptual design effort was followed by the design, fabrication and performance evaluation of a flight prototype model urine collection, volume measurement and sampling capability. No operational or performance deficiencies were uncovered as a result of the performance evaluation tests.
NASA Astrophysics Data System (ADS)
Beck, Hylke; de Roo, Ad; van Dijk, Albert; McVicar, Tim; Miralles, Diego; Schellekens, Jaap; Bruijnzeel, Sampurno; de Jeu, Richard
2015-04-01
Motivated by the lack of large-scale model parameter regionalization studies, a large set of 3328 small catchments (< 10000 km2) around the globe was used to set up and evaluate five model parameterization schemes at global scale. The HBV-light model was chosen because of its parsimony and flexibility to test the schemes. The catchments were calibrated against observed streamflow (Q) using an objective function incorporating both behavioral and goodness-of-fit measures, after which the catchment set was split into subsets of 1215 donor and 2113 evaluation catchments based on the calibration performance. The donor catchments were subsequently used to derive parameter sets that were transferred to similar grid cells based on a similarity measure incorporating climatic and physiographic characteristics, thereby producing parameter maps with global coverage. Overall, there was a lack of suitable donor catchments for mountainous and tropical environments. The schemes with spatially-uniform parameter sets (EXP2 and EXP3) achieved the worst Q estimation performance in the evaluation catchments, emphasizing the importance of parameter regionalization. The direct transfer of calibrated parameter sets from donor catchments to similar grid cells (scheme EXP1) performed best, although there was still a large performance gap between EXP1 and HBV-light calibrated against observed Q. The schemes with parameter sets obtained by simultaneously calibrating clusters of similar donor catchments (NC10 and NC58) performed worse than EXP1. The relatively poor Q estimation performance achieved by two (uncalibrated) macro-scale hydrological models suggests there is considerable merit in regionalizing the parameters of such models. The global HBV-light parameter maps and ancillary data are freely available via http://water.jrc.ec.europa.eu.
NASA Technical Reports Server (NTRS)
Pavel, M.
1993-01-01
This presentation outlines in viewgraph format a general approach to the evaluation of display system quality for aviation applications. This approach is based on the assumption that it is possible to develop a model of the display which captures most of the significant properties of the display. The display characteristics should include spatial and temporal resolution, intensity quantizing effects, spatial sampling, delays, etc. The model must be sufficiently well specified to permit generation of stimuli that simulate the output of the display system. The first step in the evaluation of display quality is an analysis of the tasks to be performed using the display. Thus, for example, if a display is used by a pilot during a final approach, the aesthetic aspects of the display may be less relevant than its dynamic characteristics. The opposite task requirements may apply to imaging systems used for displaying navigation charts. Thus, display quality is defined with regard to one or more tasks. Given a set of relevant tasks, there are many ways to approach display evaluation. The range of evaluation approaches includes visual inspection, rapid evaluation, part-task simulation, and full mission simulation. The work described is focused on two complementary approaches to rapid evaluation. The first approach is based on a model of the human visual system. A model of the human visual system is used to predict the performance of the selected tasks. The model-based evaluation approach permits very rapid and inexpensive evaluation of various design decisions. The second rapid evaluation approach employs specifically designed critical tests that embody many important characteristics of actual tasks. These are used in situations where a validated model is not available. These rapid evaluation tests are being implemented in a workstation environment.
Observational uncertainty and regional climate model evaluation: A pan-European perspective
NASA Astrophysics Data System (ADS)
Kotlarski, Sven; Szabó, Péter; Herrera, Sixto; Räty, Olle; Keuler, Klaus; Soares, Pedro M.; Cardoso, Rita M.; Bosshard, Thomas; Pagé, Christian; Boberg, Fredrik; Gutiérrez, José M.; Jaczewski, Adam; Kreienkamp, Frank; Liniger, Mark. A.; Lussana, Cristian; Szepszo, Gabriella
2017-04-01
Local and regional climate change assessments based on downscaling methods crucially depend on the existence of accurate and reliable observational reference data. In dynamical downscaling via regional climate models (RCMs), observational data can influence model development itself and, later on, model evaluation, parameter calibration and added value assessment. In empirical-statistical downscaling, observations serve as predictand data and directly influence model calibration, with corresponding effects on downscaled climate change projections. Focusing on the evaluation of RCMs, we here analyze the influence of uncertainties in observational reference data on evaluation results in a well-defined performance assessment framework and on a European scale. For this purpose we employ three different gridded observational reference datasets, namely (1) the well-established EOBS dataset, (2) the recently developed EURO4M-MESAN regional re-analysis, and (3) several national high-resolution and quality-controlled gridded datasets that recently became available. In terms of climate models, five reanalysis-driven experiments carried out by five different RCMs within the EURO-CORDEX framework are used. Two variables (temperature and precipitation) and a range of evaluation metrics that reflect different aspects of RCM performance are considered. We furthermore include an illustrative model ranking exercise and relate observational spread to RCM spread. The results obtained indicate a varying influence of observational uncertainty on model evaluation depending on the variable, the season, the region and the specific performance metric considered. Over most parts of the continent, the influence of the choice of the reference dataset for temperature is rather small for seasonal mean values and inter-annual variability. Here, model uncertainty (as measured by the spread between the five RCM simulations considered) is typically much larger than reference data uncertainty.
For parameters of the daily temperature distribution and for the spatial pattern correlation, however, important dependencies on the reference dataset can arise. The related evaluation uncertainties can be as large or even larger than model uncertainty. For precipitation the influence of observational uncertainty is, in general, larger than for temperature. It often dominates model uncertainty especially for the evaluation of the wet day frequency, the spatial correlation and the shape and location of the distribution of daily values. But even the evaluation of large-scale seasonal mean values can be considerably affected by the choice of the reference. When employing a simple and illustrative model ranking scheme on these results it is found that RCM ranking in many cases depends on the reference dataset employed.
Do gender and directness of trauma exposure moderate PTSD's latent structure?
Frankfurt, Sheila B; Armour, Cherie; Contractor, Ateka A; Elhai, Jon D
2016-11-30
The PTSD diagnosis and latent structure were substantially revised in the transition from DSM-IV to DSM-5. However, three alternative models (i.e., anhedonia model, externalizing behavior model, and hybrid model) of PTSD fit the DSM-5 symptom criteria better than the DSM-5 factor model. Thus, the psychometric performance of the DSM-5 and alternative models' PTSD factor structure needs to be critically evaluated. The current study examined whether gender or trauma directness (i.e., direct or indirect trauma exposure) moderates the PTSD latent structure when using the DSM-5 or alternative models. Model performance was evaluated with measurement invariance testing procedures on a large undergraduate sample (n=455). Gender and trauma directness moderated the DSM-5 PTSD and externalizing behavior model and did not moderate the anhedonia and hybrid models' latent structure. Clinical implications and directions for future research are discussed. Published by Elsevier Ireland Ltd.
Critical evaluation of mechanistic two-phase flow pipeline and well simulation models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dhulesia, H.; Lopez, D.
1996-12-31
Mechanistic steady-state simulation models, rather than empirical correlations, are used for the design of multiphase production systems including wells, pipelines and downstream installations. Among the available models, PEPITE, WELLSIM, OLGA, TACITE and TUFFP are widely used for this purpose, and consequently a critical evaluation of these models is needed. An extensive validation methodology is proposed which consists of two distinct steps: first to validate the hydrodynamic point model using the test loop data, and then to validate the overall simulation model using real pipeline and well data. The test loop databank used in this analysis contains about 5952 data sets originated from four different test loops, and a majority of these data are obtained at high pressures (up to 90 bars) with real hydrocarbon fluids. Before performing the model evaluation, physical analysis of the test loop data is required to eliminate non-coherent data. The evaluation of these point models demonstrates that the TACITE and OLGA models can be applied to any configuration of pipes. The TACITE model performs better than the OLGA model because it uses the most appropriate closure laws from the literature validated on a large number of data. The comparison of predicted and measured pressure drop for various real pipelines and wells demonstrates that the TACITE model is a reliable tool.
Pressman, Alice R; Lo, Joan C; Chandra, Malini; Ettinger, Bruce
2011-01-01
Area under the receiver operating characteristic (AUROC) curve is often used to evaluate risk models. However, reclassification tests provide an alternative assessment of model performance. We performed both evaluations on results from FRAX (World Health Organization Collaborating Centre for Metabolic Bone Diseases, University of Sheffield, UK), a fracture risk tool, using Kaiser Permanente Northern California women older than 50 yr with bone mineral density (BMD) measured during 1997-2003. We compared FRAX performance with and without BMD in the model. Among 94,489 women with mean follow-up of 6.6 yr, 1579 (1.7%) sustained a hip fracture. Overall, AUROCs were 0.83 and 0.84 for FRAX without and with BMD, suggesting that BMD did not contribute to model performance. AUROC decreased with increasing age, and BMD contributed significantly to higher AUROC among those aged 70 yr and older. Using an 81% sensitivity threshold (optimum level from receiver operating characteristic curve, corresponding to 1.2% cutoff), 35% of those categorized above were reassigned below when BMD was added. In contrast, only 10% of those categorized below were reassigned to the higher risk category when BMD was added. The net reclassification improvement was 5.5% (p<0.01). Two versions of this risk tool have similar AUROCs, but alternative assessments indicate that addition of BMD improves performance. Multiple methods should be used to evaluate risk tool performance with less reliance on AUROC alone. Copyright © 2011 The International Society for Clinical Densitometry. Published by Elsevier Inc. All rights reserved.
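The two assessments contrasted in this abstract, AUROC and reclassification, can be sketched in a few lines of stdlib Python. This is not the study's code: the function names and the single-cutoff, two-category form of the net reclassification improvement (NRI) are illustrative assumptions.

```python
# Sketch: AUROC as a rank statistic, and a two-category NRI at one risk cutoff.

def auroc(scores, labels):
    """Probability that a random event outranks a random non-event (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def nri(old_risk, new_risk, labels, cutoff):
    """NRI = net upward moves among events minus net upward moves among
    non-events, where a 'move' is crossing the cutoff after adding a predictor
    (here, BMD added to the base risk model)."""
    up_e = down_e = up_n = down_n = events = nonevents = 0
    for old, new, y in zip(old_risk, new_risk, labels):
        moved_up = old < cutoff <= new
        moved_down = new < cutoff <= old
        if y == 1:
            events += 1
            up_e += moved_up
            down_e += moved_down
        else:
            nonevents += 1
            up_n += moved_up
            down_n += moved_down
    return (up_e - down_e) / events - (up_n - down_n) / nonevents
```

The point of the abstract follows directly from these definitions: two models can have nearly identical AUROC (a global rank statistic) while the NRI at a clinically chosen cutoff still favors one of them.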
Evaluating Air-Quality Models: Review and Outlook.
NASA Astrophysics Data System (ADS)
Weil, J. C.; Sykes, R. I.; Venkatram, A.
1992-10-01
Over the past decade, much attention has been devoted to the evaluation of air-quality models, with emphasis on model performance in predicting the high concentrations that are important in air-quality regulations. This paper stems from our belief that this practice needs to be expanded to 1) evaluate model physics and 2) deal with the large natural or stochastic variability in concentration. The variability is represented by the root-mean-square fluctuating concentration (σc) about the mean concentration (C) over an ensemble, i.e., a given set of meteorological, source, and other conditions. Most air-quality models used in applications predict C, whereas observations are individual realizations drawn from an ensemble. When σc is comparable to or larger than C, large residuals exist between predicted and observed concentrations, which confound model evaluations. This paper addresses ways of evaluating model physics in light of the large σc; the focus is on elevated point-source models. Evaluation of model physics requires the separation of the mean model error, the difference between the predicted and observed C, from the natural variability. A residual analysis is shown to be an effective way of doing this. Several examples demonstrate the usefulness of residuals as well as correlation analyses and laboratory data in judging model physics. In general, σc models and predictions of the probability distribution of the fluctuating concentration (c), p(c), are in the developmental stage, with laboratory data playing an important role. Laboratory data from point-source plumes in a convection tank show that p(c) approximates a self-similar distribution along the plume center plane, a useful result in a residual analysis. At present, there is one model, ARAP, that predicts C, σc, and p(c) for point-source plumes. This model is more computationally demanding than other dispersion models (for C only) and must be demonstrated as a practical tool.
However, it predicts an important quantity for applications: the uncertainty in the very high and infrequent concentrations. The uncertainty is large and is needed in evaluating operational performance and in predicting the attainment of air-quality standards.
ERIC Educational Resources Information Center
Bauch, Jerold P.
This paper presents guidelines for the evaluation of candidate performance, the basic function of the evaluation component of the Georgia program model for the preparation of elementary school teachers. The three steps in the evaluation procedure are outlined: (1) proficiency module (PM) entry appraisal (pretest); (2) self evaluation and the…
Evaluation of computing systems using functionals of a Stochastic process
NASA Technical Reports Server (NTRS)
Meyer, J. F.; Wu, L. T.
1980-01-01
An intermediate model was used to represent the probabilistic nature of a total system at a level which is higher than the base model and thus closer to the performance variable. A class of intermediate models, which are generally referred to as functionals of a Markov process, were considered. A closed form solution of performability for the case where performance is identified with the minimum value of a functional was developed.
Agent-based modeling as a tool for program design and evaluation.
Lawlor, Jennifer A; McGirr, Sara
2017-12-01
Recently, systems thinking and systems science approaches have gained popularity in the field of evaluation; however, there has been relatively little exploration of how evaluators could use quantitative tools to assist in the implementation of systems approaches therein. The purpose of this paper is to explore potential uses of one such quantitative tool, agent-based modeling, in evaluation practice. To this end, we define agent-based modeling and offer potential uses for it in typical evaluation activities, including: engaging stakeholders, selecting an intervention, modeling program theory, setting performance targets, and interpreting evaluation results. We provide demonstrative examples from published agent-based modeling efforts both inside and outside the field of evaluation for each of the evaluative activities discussed. We further describe potential pitfalls of this tool and offer cautions for evaluators who may choose to implement it in their practice. Finally, the article concludes with a discussion of the future of agent-based modeling in evaluation practice and a call for more formal exploration of this tool as well as other approaches to simulation modeling in the field. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Reyes, J.; Vizuete, W.; Serre, M. L.; Xu, Y.
2015-12-01
The EPA employs a vast monitoring network to measure ambient PM2.5 concentrations across the United States, with one of its goals being to quantify exposure within the population. However, there are several areas of the country with sparse monitoring, spatially and temporally. One means to fill in these monitoring gaps is to use PM2.5 modeled estimates from Chemical Transport Models (CTMs), specifically the Community Multi-scale Air Quality (CMAQ) model. CMAQ is able to provide complete spatial coverage but is subject to systematic and random error due to model uncertainty. Due to the deterministic nature of CMAQ, these uncertainties are often not quantified. Much effort is employed to quantify the efficacy of these models through different metrics of model performance. Currently, evaluation is specific to only locations with observed data. Multiyear studies across the United States are challenging because the error and model performance of CMAQ are not uniform over such large space/time domains: error changes regionally and temporally. Because of the complex mix of species that constitute PM2.5, CMAQ error is also a function of increasing PM2.5 concentration. To address this issue we introduce a model performance evaluation for PM2.5 CMAQ that is regionalized and non-linear, leading to error quantification for each CMAQ grid cell, so that areas and time periods of error are better characterized. The regionalized error correction approach is non-linear and is therefore more flexible at characterizing model performance than approaches that rely on linearity assumptions and assume homoscedasticity of CMAQ prediction errors. Corrected CMAQ data are then incorporated into the modern geostatistical framework of Bayesian Maximum Entropy (BME). Through cross-validation it is shown that incorporating error-corrected CMAQ data leads to more accurate estimates than using observed data by themselves.
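One simple way a concentration-dependent (non-linear) error correction can be realized is by binning model predictions by magnitude within a region and removing the mean model-minus-observation error per bin. The sketch below is an assumed, simplified stand-in for the regionalized correction the abstract describes (the actual method feeds corrected CMAQ fields into BME); function names and bin edges are illustrative.

```python
# Sketch (assumed form): a binned, non-linear bias correction for one region.
# The correction varies with predicted concentration, unlike a single global
# linear adjustment.
import bisect

def fit_binned_correction(pred, obs, edges):
    """Return per-bin mean error (pred - obs); `edges` are bin boundaries."""
    sums = [0.0] * (len(edges) + 1)
    counts = [0] * (len(edges) + 1)
    for p, o in zip(pred, obs):
        i = bisect.bisect_right(edges, p)  # which concentration bin p falls in
        sums[i] += p - o
        counts[i] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def apply_correction(pred, edges, bias):
    """Subtract the bin-specific mean error from each new prediction."""
    return [p - bias[bisect.bisect_right(edges, p)] for p in pred]
```

Fitting one such correction per region captures the "regionalized" aspect; the per-bin means capture the dependence of error on PM2.5 level.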
ERIC Educational Resources Information Center
Hansen, Michael
2013-01-01
The growing prominence of value-added models for measuring teacher effectiveness has prompted a recent surge in policies that consider students' classroom performance part of a teacher's evaluation. Yet, in light of the criticism and limitations of the current models, whether and how evaluation systems will adapt over time is unclear. This paper…
An evaluation of attention models for use in SLAM
NASA Astrophysics Data System (ADS)
Dodge, Samuel; Karam, Lina
2013-12-01
In this paper we study the application of visual saliency models for the simultaneous localization and mapping (SLAM) problem. We consider visual SLAM, where the location of the camera and a map of the environment can be generated using images from a single moving camera. In visual SLAM, the interest point detector is of key importance. This detector must be invariant to certain image transformations so that features can be matched across different frames. Recent work has used a model of human visual attention to detect interest points; however, it is unclear what is the best attention model for this purpose. To this end, we compare the performance of interest points from four saliency models (Itti, GBVS, RARE, and AWS) with the performance of four traditional interest point detectors (Harris, Shi-Tomasi, SIFT, and FAST). We evaluate these detectors under several different types of image transformation and find that the Itti saliency model, in general, achieves the best performance in terms of keypoint repeatability.
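Keypoint repeatability, the evaluation criterion named here, is commonly computed as the fraction of detections in one image whose mapped positions land within a pixel tolerance of a detection in the transformed image. The sketch below assumes that standard definition with illustrative names; it is not the authors' implementation.

```python
# Sketch (standard repeatability criterion, assumed): fraction of keypoints
# from image A whose position under a known transform lies within `tol`
# pixels of some keypoint detected in image B.
import math

def repeatability(kps_a, kps_b, transform, tol=2.0):
    if not kps_a:
        return 0.0
    hits = 0
    for x, y in kps_a:
        tx, ty = transform(x, y)  # map the keypoint into image B's frame
        if any(math.hypot(tx - bx, ty - by) <= tol for bx, by in kps_b):
            hits += 1
    return hits / len(kps_a)
```

Running this over a family of transforms (rotation, scale, blur, and so on) yields the per-detector repeatability curves such a comparison rests on.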
Johansson, Michael A; Reich, Nicholas G; Hota, Aditi; Brownstein, John S; Santillana, Mauricio
2016-09-26
Dengue viruses, which infect millions of people per year worldwide, cause large epidemics that strain healthcare systems. Despite diverse efforts to develop forecasting tools including autoregressive time series, climate-driven statistical, and mechanistic biological models, little work has been done to understand the contribution of different components to improved prediction. We developed a framework to assess and compare dengue forecasts produced from different types of models and evaluated the performance of seasonal autoregressive models with and without climate variables for forecasting dengue incidence in Mexico. Climate data did not significantly improve the predictive power of seasonal autoregressive models. Short-term and seasonal autocorrelation were key to improving short-term and long-term forecasts, respectively. Seasonal autoregressive models captured a substantial amount of dengue variability, but better models are needed to improve dengue forecasting. This framework contributes to the sparse literature of infectious disease prediction model evaluation, using state-of-the-art validation techniques such as out-of-sample testing and comparison to an appropriate reference model.
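The out-of-sample, reference-model comparison described in this forecasting framework can be sketched as a rolling one-step evaluation. The persistence and historical-mean forecasters below are illustrative stand-ins, not the study's seasonal autoregressive or reference models; a positive skill score means the candidate beats the reference in mean absolute error.

```python
# Sketch (assumed setup): rolling out-of-sample evaluation of one-step
# forecasters, with skill measured relative to a reference model.

def rolling_mae(series, forecaster, start):
    """At each step t >= start the forecaster sees series[:t] and predicts t."""
    errs = [abs(forecaster(series[:t]) - series[t])
            for t in range(start, len(series))]
    return sum(errs) / len(errs)

def skill(series, model, reference, start):
    """1 - MAE(model)/MAE(reference); > 0 means the model adds value."""
    return 1.0 - rolling_mae(series, model, start) / rolling_mae(series, reference, start)

# Illustrative forecasters: persistence vs. historical mean.
persistence = lambda history: history[-1]
hist_mean = lambda history: sum(history) / len(history)
```

This is the essence of out-of-sample testing: every prediction uses only data available before the target time, and improvement is always stated against an explicit reference.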
Luo, Mei; Wang, Hao; Lyu, Zhi
2017-12-01
Species distribution models (SDMs) are widely used by researchers and conservationists. Predictions from different models vary significantly, which makes model selection difficult for users. In this study, we evaluated the performance of two commonly used SDMs, Biomod2 and Maximum Entropy (MaxEnt), with real presence/absence data for the giant panda, and used three indicators, i.e., area under the ROC curve (AUC), true skill statistic (TSS), and Cohen's kappa, to evaluate the accuracy of the two models' predictions. The results showed that both models could produce accurate predictions with adequate occurrence inputs and simulation repeats. Compared to MaxEnt, Biomod2 made more accurate predictions, especially when occurrence inputs were few. However, Biomod2 was more difficult to apply, required longer running time, and had less data processing capability. To choose the right model, users should refer to the error requirements of their objectives. MaxEnt should be considered if the error requirement is clear and both models can achieve it; otherwise, we recommend the use of Biomod2 as much as possible.
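TSS and Cohen's kappa, two of the three accuracy indicators used in such SDM evaluations, have standard confusion-matrix definitions that can be sketched directly. This is generic metric code, not Biomod2 or MaxEnt internals.

```python
# Sketch: TSS and Cohen's kappa from a 2x2 confusion matrix of predicted vs
# observed presence (1) / absence (0).

def confusion(pred, obs):
    tp = sum(p and o for p, o in zip(pred, obs))
    tn = sum((not p) and (not o) for p, o in zip(pred, obs))
    fp = sum(p and (not o) for p, o in zip(pred, obs))
    fn = sum((not p) and o for p, o in zip(pred, obs))
    return tp, tn, fp, fn

def tss(pred, obs):
    """True skill statistic = sensitivity + specificity - 1."""
    tp, tn, fp, fn = confusion(pred, obs)
    return tp / (tp + fn) + tn / (tn + fp) - 1.0

def kappa(pred, obs):
    """Cohen's kappa: agreement corrected for chance agreement."""
    tp, tn, fp, fn = confusion(pred, obs)
    n = tp + tn + fp + fn
    po = (tp + tn) / n                                   # observed agreement
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2  # chance
    return (po - pe) / (1.0 - pe)
```

Unlike AUC, both metrics require a thresholded (binary) prediction, which is why SDM comparisons typically report all three.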
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sansourekidou, P; Allen, C
2015-06-15
Purpose: To evaluate the Raystation v4.51 Electron Monte Carlo algorithm for Varian Trilogy, IX and 2100 series linear accelerators and commission it for clinical use. Methods: Seventy-two water and forty air scans were acquired with a water tank in the form of profiles and depth doses, as requested by the vendor. Data were imported into the Rayphysics beam modeling module. The energy spectrum was modeled using seven parameters, contamination photons using five parameters, and the source phase space using six parameters. Calculations were performed in clinical version 4.51, and percent depth dose curves and profiles were extracted to be compared to water tank measurements. Sensitivity tests were performed for all parameters. Grid size and particle histories were evaluated per energy for statistical uncertainty performance. Results: Model accuracy for air profiles is poor in the shoulder and penumbra region. However, model accuracy for water scans is acceptable: all energies and cones are within 2%/2mm for 90% of the points evaluated. Source phase space parameters have a cumulative effect. To achieve distributions with a satisfactory smoothness level, a 0.1 cm grid and 3,000,000 particle histories were used for commissioning calculations. Calculation time was approximately 3 hours per energy. Conclusion: Raystation electron Monte Carlo is acceptable for clinical use for the Varian accelerators listed. Results are inferior to Elekta Electron Monte Carlo modeling. Known issues were reported to Raysearch and will be resolved in upcoming releases. Auto-modeling is limited to open cone depth dose curves and needs expansion.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-02-08
... Coordinator; (2) applies research methodologies to perform evaluation studies of health information technology grant programs; and, (3) applies advanced mathematical or quantitative modeling to the U.S. health care... remaining items in the paragraph accordingly: ``(1) Applying research methodologies to perform evaluation...
The Rasch Model for Evaluating Italian Student Performance
ERIC Educational Resources Information Center
Camminatiello, Ida; Gallo, Michele; Menini, Tullio
2010-01-01
In 1997 the Organisation for Economic Co-operation and Development (OECD) launched the OECD Programme for International Student Assessment (PISA) for collecting information about 15-year-old students in participating countries. Our study analyses the PISA 2006 cognitive test for evaluating the Italian student performance in mathematics, reading…
NASA Astrophysics Data System (ADS)
Morita, Yukinori; Mori, Takahiro; Migita, Shinji; Mizubayashi, Wataru; Tanabe, Akihito; Fukuda, Koichi; Matsukawa, Takashi; Endo, Kazuhiko; O'uchi, Shin-ichi; Liu, Yongxun; Masahara, Meishoku; Ota, Hiroyuki
2014-12-01
The performance of parallel electric field tunnel field-effect transistors (TFETs), in which band-to-band tunneling (BTBT) is initiated in line with the gate electric field, was evaluated. The TFET was fabricated by inserting an epitaxially grown parallel-plate tunnel capacitor between heavily doped source wells and gate insulators. Analysis using a distributed-element circuit model indicated that the drain current should be limited by the self-voltage-drop effect in the ultrathin channel layer.
Leveraging simulation to evaluate system performance in presence of fixed pattern noise
NASA Astrophysics Data System (ADS)
Teaney, Brian P.
2017-05-01
The development of image simulation techniques which map the effects of a notional, modeled sensor system onto an existing image can be used to evaluate the image quality of camera systems prior to the development of prototype systems. In addition, image simulation or 'virtual prototyping' can be utilized to reduce the time and expense associated with conducting extensive field trials. In this paper we examine the development of a perception study designed to assess the performance of the NVESD imager performance metrics as a function of fixed pattern noise. This paper discusses the development of the model theory and the implementation and execution of the perception study. In addition, other applications of the image simulation component, including the evaluation of limiting resolution and other test targets, are discussed.
Cho, Kyoung Won; Bae, Sung-Kwon; Ryu, Ji-Hye; Kim, Kyeong Na; An, Chang-Ho; Chae, Young Moon
2015-01-01
This study evaluated the performance of the newly developed information system (IS) implemented on July 1, 2014 at three public hospitals in Korea. User satisfaction scores for twelve key performance indicators of six IS success factors, based on the DeLone and McLean IS Success Model, were utilized to evaluate IS performance before and after the newly developed system was introduced. All scores increased after system introduction except for the completeness of medical records and impact on the clinical environment. The relationships among the six IS factors were also analyzed to identify the important factors influencing three IS success factors (Intention to Use, User Satisfaction, and Net Benefits). All relationships were significant except for the relationships among Service Quality, Intention to Use, and Net Benefits. The results suggest that hospitals should not only focus on system and information quality; rather, they should also continuously improve service quality to improve user satisfaction and eventually reach the full potential of IS performance.
Morris, Ralph E; McNally, Dennis E; Tesche, Thomas W; Tonnesen, Gail; Boylan, James W; Brewer, Patricia
2005-11-01
The Visibility Improvement State and Tribal Association of the Southeast (VISTAS) is one of five Regional Planning Organizations that is charged with the management of haze, visibility, and other regional air quality issues in the United States. The VISTAS Phase I work effort modeled three episodes (January 2002, July 1999, and July 2001) to identify the optimal model configuration(s) to be used for the 2002 annual modeling in Phase II. Using model configurations recommended in the Phase I analysis, 2002 annual meteorological (Mesoscale Meteorological Model [MM5]), emissions (Sparse Matrix Operator Kernel Emissions [SMOKE]), and air quality (Community Multiscale Air Quality [CMAQ]) simulations were performed on a 36-km grid covering the continental United States and a 12-km grid covering the Eastern United States. Model estimates were then compared against observations. This paper presents the results of the preliminary CMAQ model performance evaluation for the initial 2002 annual base case simulation. Model performance is presented for the Eastern United States using speciated fine particle concentration and wet deposition measurements from several monitoring networks. Initial results indicate fairly good performance for sulfate with fractional bias values generally within +/-20%. Nitrate is overestimated in the winter by approximately +50% and underestimated in the summer by more than -100%. Organic carbon exhibits a large summer underestimation bias of approximately -100% with much improved performance seen in the winter with a bias near zero. Performance for elemental carbon is reasonable with fractional bias values within +/- 40%. Other fine particulate (soil) and coarse particulate matter exhibit large (80-150%) overestimation in the winter but improved performance in the summer. 
The preliminary 2002 CMAQ runs identified several areas of enhancements to improve model performance, including revised temporal allocation factors for ammonia emissions to improve nitrate performance and addressing missing processes in the secondary organic aerosol module to improve OC performance.
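The fractional bias statistic used in the evaluation above is a standard air-quality model performance metric: the modeled-minus-observed difference normalized by the pair mean, which bounds the result to roughly +/-200%. A minimal sketch, using made-up monitor concentrations rather than the study's data:

```python
import numpy as np

def fractional_bias(modeled, observed):
    """Mean fractional bias in percent: mean of 2*(M - O)/(M + O).

    Positive values indicate model overestimation, negative values
    underestimation; values are bounded to [-200%, +200%].
    """
    m = np.asarray(modeled, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.mean(2.0 * (m - o) / (m + o)) * 100.0)

# Hypothetical sulfate concentrations (ug/m3) at four monitors
obs = [2.0, 3.5, 1.8, 4.2]
mod = [2.2, 3.1, 2.0, 4.6]
print(round(fractional_bias(mod, obs), 1))  # a small positive (over-)bias
```

A fractional bias "within +/-20%", as reported for sulfate, means this statistic falls between -20 and +20.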
NASA Astrophysics Data System (ADS)
Amrina, E.; Yulianto, A.
2018-03-01
Sustainable maintenance is a new challenge for manufacturing companies seeking to realize sustainable development. In this paper, an interpretive structural model is developed to evaluate sustainable maintenance in the rubber industry. Initial key performance indicators (KPIs) are identified from the literature and then validated by academic and industry experts. As a result, three factors (economic, social, and environmental), divided into a total of thirteen indicators, are proposed as the KPIs for sustainable maintenance evaluation in the rubber industry. Interpretive structural modeling (ISM) is applied to develop a network structure model of the KPIs consisting of three levels. The results show that the economic factor is regarded as the basic factor, the social factor as the intermediate factor, and the environmental factor as the leading factor. Two indicators of the social factor, labor relationship and training and education, have both high driver and high dependence power, and are thus categorized as unstable indicators needing further attention. All indicators of the environmental factor and one indicator of the social factor are identified as the most influential indicators. The interpretive structural model is intended to aid rubber companies in evaluating sustainable maintenance performance.
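In ISM, the driver/dependence classification mentioned above comes from row and column sums of the final reachability matrix. The sketch below uses a hypothetical 5-indicator matrix (not the paper's thirteen KPIs) and an illustrative threshold:

```python
import numpy as np

# Hypothetical final reachability matrix for 5 indicators: entry (i, j) = 1
# means indicator i reaches (influences) indicator j, transitive links included.
R = np.array([
    [1, 1, 1, 0, 1],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 0, 1],
])

driving = R.sum(axis=1)     # row sums: how many indicators each one influences
dependence = R.sum(axis=0)  # column sums: how many indicators influence it

for i, (d, p) in enumerate(zip(driving, dependence)):
    # Indicators with both high driving and high dependence power are the
    # "linkage" (unstable) indicators flagged in ISM analyses.
    quadrant = ("linkage/unstable" if d >= 3 and p >= 3
                else "driver" if d >= 3
                else "dependent" if p >= 3
                else "autonomous")
    print(f"indicator {i}: driving={d}, dependence={p} -> {quadrant}")
```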
Evolution of the Marine Officer Fitness Report: A Multivariate Analysis
This thesis explores the evaluation behavior of United States Marine Corps (USMC) Reporting Seniors (RSs) from 2010 to 2017. Using fitness report...RSs evaluate the performance of subordinate active component unrestricted officer MROs over time. I estimate logistic regression models of the...lowest. However, these correlations indicating the effects of race matching on FITREP evaluations narrow in significance when performance-based factors
ERIC Educational Resources Information Center
Mechling, Linda C.; Ayres, Kevin M.; Foster, Ashley L.; Bryant, Kathryn J.
2015-01-01
The purpose of this study was to evaluate the ability of four high school-aged students with a diagnosis of autism spectrum disorder and moderate intellectual disability to generalize performance of skills when using materials different from those presented through video models. An adapted alternating treatments design was used to evaluate student…
Application of single-step genomic evaluation for crossbred performance in pig.
Xiang, T; Nielsen, B; Su, G; Legarra, A; Christensen, O F
2016-03-01
Crossbreeding is predominant and intensively used in commercial meat production systems, especially in poultry and swine. Genomic evaluation has been successfully applied for breeding within purebreds but also offers opportunities for selecting purebreds for crossbred performance by combining information from purebreds with information from crossbreds. However, it generally requires that all relevant animals are genotyped, which is costly and presently does not seem to be feasible in practice. Recently, a novel single-step BLUP method for genomic evaluation of both purebred and crossbred performance has been developed that can incorporate marker genotypes into a traditional animal model. This new method has not been validated in real data sets. In this study, we applied this single-step method to analyze data for the maternal trait of total number of piglets born in Danish Landrace, Yorkshire, and two-way crossbred pigs in different scenarios. The genetic correlation between purebred and crossbred performances was investigated first, and then the impact of (crossbred) genomic information on prediction reliability for crossbred performance was explored. The results confirm the existence of a moderate genetic correlation, and it was seen that the standard errors on the estimates were reduced when including genomic information. Models with marker information, especially crossbred genomic information, improved model-based reliabilities for crossbred performance of purebred boars and also improved the predictive ability for crossbred animals and, to some extent, reduced the bias of prediction. We conclude that the new single-step BLUP method is a good tool in the genetic evaluation for crossbred performance in purebred animals.
Koeppen Bioclimatic Metrics for Evaluating CMIP5 Simulations of Historical Climate
NASA Astrophysics Data System (ADS)
Phillips, T. J.; Bonfils, C.
2012-12-01
The classic Koeppen bioclimatic classification scheme associates generic vegetation types (e.g. grassland, tundra, broadleaf or evergreen forests, etc.) with regional climate zones defined by the observed amplitude and phase of the annual cycles of continental temperature (T) and precipitation (P). Koeppen classification thus can provide concise, multivariate metrics for evaluating climate model performance in simulating the regional magnitudes and seasonalities of climate variables that are of critical importance for living organisms. In this study, 14 Koeppen vegetation types are derived from annual-cycle climatologies of T and P in some 3 dozen CMIP5 simulations of 1980-1999 climate, a period when observational data provides a reliable global validation standard. Metrics for evaluating the ability of the CMIP5 models to simulate the correct locations and areas of the vegetation types, as well as measures of overall model performance, also are developed. It is found that the CMIP5 models are most deficient in simulating 1) the climates of the drier zones (e.g. desert, savanna, grassland, steppe vegetation types) that are located in the Southwestern U.S. and Mexico, Eastern Europe, Southern Africa, and Central Australia, as well as 2) the climate of regions such as Central Asia and Western South America where topography plays a central role. (Detailed analysis of regional biases in the annual cycles of T and P of selected simulations exemplifying general model performance problems also are to be presented.) The more encouraging results include evidence for a general improvement in CMIP5 performance relative to that of older CMIP3 models. Within CMIP5 also, the more complex Earth Systems Models (ESMs) with prognostic biogeochemistry perform comparably to the corresponding global models that simulate only the "physical" climate. Acknowledgments This work was funded by the U.S. 
Department of Energy Office of Science and was performed at the Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
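A Koeppen-style classification such as the one above can be derived from monthly temperature and precipitation climatologies. The rules and thresholds below are a heavily simplified illustration of the idea, not the full 14-type scheme used in the study:

```python
def simple_koeppen(t_monthly_c, p_monthly_mm):
    """Grossly simplified Koeppen-style classification from 12 monthly
    temperatures (deg C) and precipitations (mm). The real scheme uses
    additional seasonality rules and different aridity adjustments."""
    t_mean = sum(t_monthly_c) / 12.0
    p_annual = sum(p_monthly_mm)
    if max(t_monthly_c) < 10:                # no warm month: polar zone
        return "polar"
    if p_annual < 10.0 * (t_mean + 10.0):    # crude aridity threshold (mm)
        return "arid"
    if min(t_monthly_c) > 18:                # all months warm: tropical
        return "tropical"
    if min(t_monthly_c) > -3:                # mild winters: temperate
        return "temperate"
    return "continental"

print(simple_koeppen([25] * 12, [200] * 12))  # tropical
print(simple_koeppen([20] * 12, [10] * 12))   # arid
```

Applying such a classifier to both observed and simulated T and P climatologies, and comparing the resulting vegetation-zone maps, is the essence of the metric described.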
NASA Astrophysics Data System (ADS)
Uysal, Selcuk Can
In this research, MATLAB Simulink® was used to develop a cooled engine model for industrial gas turbines and aero-engines. The model consists of uncooled on-design, mean-line turbomachinery design and a cooled off-design analysis in order to evaluate the engine performance parameters by using operating conditions, polytropic efficiencies, material information and cooling system details. The cooling analysis algorithm involves a second-law analysis to calculate losses from the cooling technique applied. The model is used in a sensitivity analysis that evaluates the impacts of variations in metal Biot number, thermal barrier coating Biot number, film cooling effectiveness, internal cooling effectiveness and maximum allowable blade temperature on main engine performance parameters of aero and industrial gas turbine engines. The model is subsequently used to analyze the relative performance impact of employing Anti-Vortex Film Cooling holes (AVH) by means of data obtained for these holes by Detached Eddy Simulation-CFD Techniques that are valid for engine-like turbulence intensity conditions. Cooled blade configurations with AVH and other different external cooling techniques were used in a performance comparison study. (Abstract shortened by ProQuest.).
Alsharif, Naser Z; Galt, Kimberly A
2008-04-15
To evaluate an instructional model for teaching clinically relevant medicinal chemistry. An instructional model that uses Bloom's cognitive and Krathwohl's affective taxonomy, published and tested concepts in teaching medicinal chemistry, and active learning strategies, was introduced in the medicinal chemistry courses for second-professional year (P2) doctor of pharmacy (PharmD) students (campus and distance) in the 2005-2006 academic year. Student learning and the overall effectiveness of the instructional model were assessed. Student performance after introducing the instructional model was compared to that in prior years. Student performance on course examinations improved compared to previous years. Students expressed overall enthusiasm about the course and better understood the value of medicinal chemistry to clinical practice. The explicit integration of the cognitive and affective learning objectives improved student performance, student ability to apply medicinal chemistry to clinical practice, and student attitude towards the discipline. Testing this instructional model provided validation of its theoretical framework. The model is effective for both our campus and distance students. This instructional model may also have broad-based applications to other science courses.
Rahaman, Md Saifur; Mavinic, Donald S; Meikleham, Alexandra; Ellis, Naoko
2014-03-15
The cost associated with the disposal of phosphate-rich sludge, the stringent regulations to limit phosphate discharge into aquatic environments, and resource shortages resulting from limited phosphorus rock reserves, have diverted attention to phosphorus recovery in the form of struvite (MAP: MgNH4PO4·6H2O) crystals, which can essentially be used as a slow release fertilizer. Fluidized-bed crystallization is one of the most efficient unit processes used in struvite crystallization from wastewater. In this study, a comprehensive mathematical model, incorporating solution thermodynamics, struvite precipitation kinetics and reactor hydrodynamics, was developed to illustrate phosphorus depletion through struvite crystal growth in a continuous, fluidized-bed crystallizer. A thermodynamic equilibrium model for struvite precipitation was linked to the fluidized-bed reactor model. While the equilibrium model provided information on supersaturation generation, the reactor model captured the dynamic behavior of the crystal growth processes, as well as the effect of the reactor hydrodynamics on the overall process performance. The model was then used for performance evaluation of the reactor, in terms of removal efficiencies of struvite constituent species (Mg, NH4 and PO4), and the average product crystal sizes. The model also determined the variation of species concentration of struvite within the crystal bed height. The species concentrations at two extreme ends (inlet and outlet) were used to evaluate the reactor performance. The model predictions provided a reasonably good fit with the experimental results for PO4-P, NH4-N and Mg removals. Predicted average crystal sizes also matched fairly well with the experimental observations. Therefore, this model can be used as a tool for performance evaluation and process optimization of struvite crystallization in a fluidized-bed reactor. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
High dimensional biological data retrieval optimization with NoSQL technology.
Wang, Shicai; Pandis, Ioannis; Wu, Chao; He, Sijin; Johnson, David; Emam, Ibrahim; Guitton, Florian; Guo, Yike
2014-01-01
High-throughput transcriptomic data generated by microarray experiments is the most abundant and frequently stored kind of data currently used in translational medicine studies. Although microarray data is supported in data warehouses such as tranSMART, when querying relational databases for hundreds of different patient gene expression records queries are slow due to poor performance. Non-relational data models, such as the key-value model implemented in NoSQL databases, hold promise to be more performant solutions. Our motivation is to improve the performance of the tranSMART data warehouse with a view to supporting Next Generation Sequencing data. In this paper we introduce a new data model better suited for high-dimensional data storage and querying, optimized for database scalability and performance. We have designed a key-value pair data model to support faster queries over large-scale microarray data and implemented the model using HBase, an implementation of Google's BigTable storage system. An experimental performance comparison was carried out against the traditional relational data model implemented in both MySQL Cluster and MongoDB, using a large publicly available transcriptomic data set taken from NCBI GEO concerning Multiple Myeloma. Our new key-value data model implemented on HBase exhibits an average 5.24-fold increase in high-dimensional biological data query performance compared to the relational model implemented on MySQL Cluster, and an average 6.47-fold increase on query performance on MongoDB. The performance evaluation found that the new key-value data model, in particular its implementation in HBase, outperforms the relational model currently implemented in tranSMART. We propose that NoSQL technology holds great promise for large-scale data management, in particular for high-dimensional biological data such as that demonstrated in the performance evaluation described in this paper. 
We aim to use this new data model as a basis for migrating tranSMART's implementation to a more scalable solution for Big Data.
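The key-value layout described can be illustrated with a plain dictionary standing in for an HBase table; the composite row key and scan pattern below are illustrative assumptions, not the actual tranSMART/HBase schema:

```python
# Illustrative sketch: gene expression values stored under composite row
# keys, as one might lay them out in an HBase-style key-value store.
expression_kv = {}

def put(patient_id, probe_id, value):
    # Leading the key with patient_id keeps one patient's rows contiguous
    # in key order, so a full expression profile is a single range scan
    # rather than a multi-table relational join.
    expression_kv[f"{patient_id}:{probe_id}"] = value

def scan_patient(patient_id):
    prefix = f"{patient_id}:"
    return {k[len(prefix):]: v for k, v in sorted(expression_kv.items())
            if k.startswith(prefix)}

put("P001", "GENE_A", 7.2)
put("P001", "GENE_B", 5.9)
put("P002", "GENE_A", 6.4)
print(scan_patient("P001"))  # {'GENE_A': 7.2, 'GENE_B': 5.9}
```

In a real HBase deployment the same idea is expressed through row-key design and column families, but the access pattern (prefix scan instead of join) is what yields the query speedups reported.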
Correlation tracking study for meter-class solar telescope on space shuttle. [solar granulation
NASA Technical Reports Server (NTRS)
Smithson, R. C.; Tarbell, T. D.
1977-01-01
The theory and expected performance level of correlation trackers used to control the pointing of a solar telescope in space using white light granulation as a target were studied. Three specific trackers were modeled and their performance levels predicted for telescopes of various apertures. The performance of the computer model trackers on computer enhanced granulation photographs was evaluated. Parametric equations for predicting tracker performance are presented.
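A correlation tracker of the kind modeled can be sketched as an FFT-based cross-correlation whose peak gives the pointing offset between a reference image and a live frame. The synthetic random texture below is a stand-in for real white-light granulation imagery, and the circular-shift assumption is a simplification:

```python
import numpy as np

def track_offset(ref, frame):
    """Estimate the integer (dy, dx) shift of `frame` relative to `ref`
    from the peak of an FFT-based circular cross-correlation."""
    corr = np.fft.ifft2(np.conj(np.fft.fft2(ref)) * np.fft.fft2(frame)).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Fold indices in the upper half back to negative shifts
    if dy > ref.shape[0] // 2:
        dy -= ref.shape[0]
    if dx > ref.shape[1] // 2:
        dx -= ref.shape[1]
    return int(dy), int(dx)

rng = np.random.default_rng(0)
scene = rng.standard_normal((64, 64))        # stand-in for granulation texture
shifted = np.roll(scene, shift=(3, -2), axis=(0, 1))
print(track_offset(scene, shifted))          # (3, -2)
```

Tracker performance in practice then depends on contrast, noise, and aperture, which is what the parametric equations in the study predict.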
Hofman, Jelle; Samson, Roeland
2014-09-01
Biomagnetic monitoring of tree leaf deposited particles has proven to be a good indicator of the ambient particulate concentration. The objective of this study is to apply this method to validate a local-scale air quality model (ENVI-met), using 96 tree crown sampling locations in a typical urban street canyon. To the best of our knowledge, the application of biomagnetic monitoring for the validation of pollutant dispersion modeling is hereby presented for the first time. Quantitative ENVI-met validation showed significant correlations between modeled and measured results throughout the entire in-leaf period. ENVI-met performed much better at the first half of the street canyon close to the ring road (r=0.58-0.79, RMSE=44-49%), compared to the second part (r=0.58-0.64, RMSE=74-102%). The spatial model behavior was evaluated by testing effects of height, azimuthal position, tree position and distance from the main pollution source on the obtained model results and magnetic measurements. Our results demonstrate that biomagnetic monitoring is a valuable method for evaluating the performance of air quality models. Due to the high spatial and temporal resolution of this technique, biomagnetic monitoring can be applied anywhere in the city (where urban green is present) to evaluate model performance at different spatial scales. Copyright © 2014 Elsevier Ltd. All rights reserved.
Performance and life evaluation of advanced battery technologies for electric vehicle applications
NASA Astrophysics Data System (ADS)
Deluca, W. H.; Gillie, K. R.; Kulaga, J. E.; Smaga, J. A.; Tummillo, A. F.; Webster, C. E.
Advanced battery technology evaluations are performed under simulated electric vehicle (EV) operating conditions at the Argonne Analysis and Diagnostic Laboratory (ADL). The ADL provides a common basis for both performance characterization and life evaluation with unbiased application of tests and analyses. This paper summarizes the performance characterizations and life evaluations conducted in 1990 on nine single cells and fifteen 3- to 360-cell modules that encompass six technologies: Na/S, Zn/Br, Ni/Fe, Ni/Cd, Ni-metal hydride, and lead-acid. These evaluations were performed for the Department of Energy and Electric Power Research Institute. The results provide battery users, developers, and program managers an interim measure of the progress being made in battery R and D programs, a comparison of battery technologies, and a source of basic data for modelling and continuing R and D.
Note on evaluating safety performance of road infrastructure to motivate safety competition.
Han, Sangjin
2016-01-01
Road infrastructures are usually developed and maintained by governments or public sectors. There is no competitor in the market of their jurisdiction. This monopolistic feature discourages road authorities from proactively improving the level of safety. This study suggests how to apply a principle of competition to roads, in particular by means of performance evaluation. It first discusses why road infrastructure has been slow in safety-oriented development and management with respect to its business model. It then suggests some practical ways to promote road safety competition between road authorities, particularly by evaluating the safety performance of road infrastructure. These are summarized as selection of safety performance indicators, classification of spatial boundaries, data collection, evaluation, and reporting. Some points of consideration are also discussed so that safety performance evaluation of road infrastructure leads to better road safety management.
Evaluation methodology for query-based scene understanding systems
NASA Astrophysics Data System (ADS)
Huster, Todd P.; Ross, Timothy D.; Culbertson, Jared L.
2015-05-01
In this paper, we are proposing a method for the principled evaluation of scene understanding systems in a query-based framework. We can think of a query-based scene understanding system as a generalization of typical sensor exploitation systems where instead of performing a narrowly defined task (e.g., detect, track, classify, etc.), the system can perform general user-defined tasks specified in a query language. Examples of this type of system have been developed as part of DARPA's Mathematics of Sensing, Exploitation, and Execution (MSEE) program. There is a body of literature on the evaluation of typical sensor exploitation systems, but the open-ended nature of the query interface introduces new aspects to the evaluation problem that have not been widely considered before. In this paper, we state the evaluation problem and propose an approach to efficiently learn about the quality of the system under test. We consider the objective of the evaluation to be to build a performance model of the system under test, and we rely on the principles of Bayesian experiment design to help construct and select optimal queries for learning about the parameters of that model.
NASA Astrophysics Data System (ADS)
Adnan, F. A.; Romlay, F. R. M.; Shafiq, M.
2018-04-01
Owing to the advent of Industry 4.0, further evaluation of the computational processes applied in additive manufacturing, particularly slicing, is non-trivial. This paper evaluates a real-time algorithm for slicing an STL-formatted computer-aided design (CAD) model. A line-plane intersection equation is applied to perform the slicing procedure at any given height. This algorithm has been found to provide better computational time regardless of the number of facets in the STL model. Its performance is evaluated by comparing computational times for different geometries.
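The line-plane intersection step of such a slicer can be sketched as follows. The triangle coordinates and slice height are illustrative, and degenerate cases (a vertex lying exactly on the plane) are ignored for brevity:

```python
def slice_triangle(tri, z):
    """Return the segment where a triangle crosses the plane Z = z,
    or None if it does not. `tri` is three (x, y, z) vertices.
    Vertices exactly on the plane are not handled in this sketch."""
    pts = []
    for i in range(3):
        (x1, y1, z1), (x2, y2, z2) = tri[i], tri[(i + 1) % 3]
        if (z1 - z) * (z2 - z) < 0:           # edge straddles the plane
            t = (z - z1) / (z2 - z1)          # line-plane intersection parameter
            pts.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return tuple(pts) if len(pts) == 2 else None

tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0))
print(slice_triangle(tri, 1.0))  # ((0.5, 0.0), (0.0, 0.5))
```

Iterating this over every facet at a given height yields one layer contour; the algorithm's cost is linear in the number of facets, which is consistent with the reported insensitivity to facet count per layer.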
Rosato, Stefano; D'Errigo, Paola; Badoni, Gabriella; Fusco, Danilo; Perucci, Carlo A; Seccareccia, Fulvia
2008-08-01
The availability of two contemporary sources of information about coronary artery bypass graft (CABG) interventions made it possible 1) to verify the feasibility of performing outcome evaluation studies using administrative data sources, and 2) to compare hospital performance obtained using the CABG Project clinical database with hospital performance derived from current administrative data. Interventions recorded in the CABG Project were linked to the hospital discharge record (HDR) administrative database. Only the linked records were considered for subsequent analyses (46% of the total CABG Project). A new selected population, "clinical card-HDR", was then defined. Two independent risk-adjustment models were applied, each using information derived from one of the two sources. HDR information was then supplemented with some patient preoperative conditions from the CABG clinical database. The two models were compared in terms of their adaptability to the data. Hospital performances identified by the two models as significantly different from the mean were compared. In only 4 of the 13 hospitals considered for analysis did the results obtained using the HDR model not completely overlap with those obtained with the CABG model. When comparing statistical parameters of the HDR model and the HDR model plus patient preoperative conditions, the latter showed the best adaptability to the data. In this "clinical card-HDR" population, hospital performance assessment obtained using information from the clinical database is similar to that derived from current administrative data. However, when risk-adjustment models built on administrative databases are supplemented with a few clinical variables, their statistical parameters improve and hospital performance assessment becomes more accurate.
Application of structured analysis to a telerobotic system
NASA Technical Reports Server (NTRS)
Dashman, Eric; Mclin, David; Harrison, F. W.; Soloway, Donald; Young, Steven
1990-01-01
The analysis and evaluation of a multiple arm telerobotic research and demonstration system developed by the NASA Intelligent Systems Research Laboratory (ISRL) is described. Structured analysis techniques were used to develop a detailed requirements model of an existing telerobotic testbed. Performance models generated during this process were used to further evaluate the total system. A commercial CASE tool called Teamwork was used to carry out the structured analysis and development of the functional requirements model. A structured analysis and design process using the ISRL telerobotic system as a model is described. Evaluation of this system focused on the identification of bottlenecks in this implementation. The results demonstrate that the use of structured methods and analysis tools can give useful performance information early in a design cycle. This information can be used to ensure that the proposed system meets its design requirements before it is built.
Deriving the expected utility of a predictive model when the utilities are uncertain.
Cooper, Gregory F; Visweswaran, Shyam
2005-01-01
Predictive models are often constructed from clinical databases with the goal of eventually helping make better clinical decisions. Evaluating models using decision theory is therefore natural. When constructing a model using statistical and machine learning methods, however, we are often uncertain about precisely how the model will be used. Thus, decision-independent measures of classification performance, such as the area under an ROC curve, are popular. As a complementary method of evaluation, we investigate techniques for deriving the expected utility of a model under uncertainty about the model's utilities. We demonstrate an example of the application of this approach to the evaluation of two models that diagnose coronary artery disease.
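The idea of averaging a model's expected utility over a distribution on the outcome utilities can be sketched by Monte Carlo. The outcome probabilities and utility ranges below are purely illustrative assumptions, not those of the coronary artery disease models evaluated:

```python
import random

random.seed(1)

# Hypothetical outcome probabilities of a diagnostic model on a test set
p = {"TP": 0.30, "FN": 0.05, "TN": 0.55, "FP": 0.10}

# Uncertain utilities: rather than fixed numbers, each outcome's utility
# is a distribution (uniform ranges chosen purely for illustration).
utility_ranges = {"TP": (0.8, 1.0), "FN": (0.0, 0.2),
                  "TN": (0.9, 1.0), "FP": (0.4, 0.7)}

def eu_sample():
    # Draw one utility assignment, then compute the expected utility
    # of the model under that assignment.
    u = {k: random.uniform(*r) for k, r in utility_ranges.items()}
    return sum(p[k] * u[k] for k in p)

samples = [eu_sample() for _ in range(10_000)]
mean_eu = sum(samples) / len(samples)
print(round(mean_eu, 3))  # expected utility, averaged over utility uncertainty
```

Unlike a single AUC number, this summary directly reflects how the model would fare in decisions while acknowledging that the decision context is not yet pinned down.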
ERIC Educational Resources Information Center
Colvin, Julanne; Lee, Mingun; Magnano, Julienne; Smith, Valerie
2008-01-01
This article reports on the further development of the task-centered model for difficulties in school performance. We used Bailey-Dempsey and Reid's (1996) application of Rothman and Thomas's (1994) design and development framework and annual evaluations of the Partners in Prevention (PIP) Program to refine the task-centered case management model.…
An evaluation of the STEMS tree growth projection system.
Margaret R. Holdaway; Gary J. Brand
1983-01-01
STEMS (Stand and Tree Evaluation and Modeling System) is a tree growth projection system. This paper (1) compares the performance of the current version of STEMS developed for the Lake States with that of the original model and (2) reports the results of an analysis of the current model over a wide range of conditions and identifies its main strengths and weaknesses...
Evaluation of Disaster Preparedness Based on Simulation Exercises: A Comparison of Two Models.
Rüter, Andres; Kurland, Lisa; Gryth, Dan; Murphy, Jason; Rådestad, Monica; Djalali, Ahmadreza
2016-08-01
The objective of this study was to highlight 2 models, the Hospital Incident Command System (HICS) and the Disaster Management Indicator model (DiMI), for evaluating the in-hospital management of a disaster situation through simulation exercises. Two disaster exercises, A and B, with similar scenarios were performed. Both exercises were evaluated with regard to actions, processes, and structures. After the exercises, the results were calculated and compared. In exercise A the HICS model indicated that 32% of the required positions for the immediate phase were taken under consideration with an average performance of 70%. For exercise B, the corresponding scores were 42% and 68%, respectively. According to the DiMI model, the results for exercise A were a score of 68% for management processes and 63% for management structure (staff skills). In B the results were 77% and 86%, respectively. Both models demonstrated acceptable results in relation to previous studies. More research in this area is needed to validate which of these methods best evaluates disaster preparedness based on simulation exercises or whether the methods are complementary and should therefore be used together. (Disaster Med Public Health Preparedness. 2016;10:544-548).
Sakr, Sherif; Elshawi, Radwa; Ahmed, Amjad M; Qureshi, Waqas T; Brawner, Clinton A; Keteyian, Steven J; Blaha, Michael J; Al-Mallah, Mouaz H
2017-12-19
Prior studies have demonstrated that cardiorespiratory fitness (CRF) is a strong marker of cardiovascular health. Machine learning (ML) can enhance the prediction of outcomes through classification techniques that classify the data into predetermined categories. The aim of this study is to present an evaluation and comparison of how machine learning techniques can be applied on medical records of cardiorespiratory fitness and how the various techniques differ in terms of their capabilities of predicting medical outcomes (e.g. mortality). We use data of 34,212 patients free of known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 10-year follow-up. Seven machine learning classification techniques were evaluated: Decision Tree (DT), Support Vector Machine (SVM), Artificial Neural Networks (ANN), Naïve Bayesian Classifier (BC), Bayesian Network (BN), K-Nearest Neighbor (KNN) and Random Forest (RF). In order to handle the imbalanced dataset, the Synthetic Minority Over-Sampling Technique (SMOTE) was used. Two sets of experiments were conducted, with and without the SMOTE sampling technique. On average over the different evaluation metrics, the SVM classifier showed the lowest performance, while other models such as BN, BC and DT performed better. The RF classifier showed the best performance (AUC = 0.97) among all models trained using SMOTE sampling. The results show that the various ML techniques can vary significantly in performance across the different evaluation metrics. Greater model complexity also does not necessarily yield greater prediction accuracy. The prediction performance of all models trained with SMOTE is much better than that of models trained without SMOTE. The study shows the potential of machine learning methods for predicting all-cause mortality using cardiorespiratory fitness data.
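SMOTE's core idea, generating synthetic minority samples by interpolating between minority-class neighbors, can be sketched in a few lines. This is an illustrative simplification with made-up data, not the reference implementation used in the study:

```python
import numpy as np

def smote_like(X_min, n_new, k=3, rng=None):
    """Minimal SMOTE-style oversampling: each synthetic point is a random
    interpolation between a minority sample and one of its k nearest
    minority-class neighbors."""
    rng = rng or np.random.default_rng(0)
    X_min = np.asarray(X_min, dtype=float)
    # Pairwise distances within the minority class (diagonal excluded)
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    neighbors = np.argsort(d, axis=1)[:, :k]
    new = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))            # pick a minority sample
        j = neighbors[i, rng.integers(k)]       # pick one of its neighbors
        gap = rng.random()                      # interpolation fraction
        new.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(new)

# Hypothetical 2-feature minority class (e.g. patients with the rare outcome)
minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like(minority, n_new=6)
print(synthetic.shape)  # (6, 2)
```

Appending such synthetic rows to the training set balances the classes before fitting each classifier, which is the experimental contrast (with vs. without SMOTE) reported above.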
Ortega, Jesus; Khivsara, Sagar; Christian, Joshua; ...
2016-05-30
Its single-phase performance and appealing thermo-physical properties make supercritical carbon dioxide (s-CO2) a good heat transfer fluid candidate for concentrating solar power (CSP) technologies. The development of a solar receiver capable of delivering s-CO2 at outlet temperatures of ~973 K is required in order to merge CSP and s-CO2 Brayton cycle technologies. A coupled optical and thermal-fluid modeling effort for a tubular receiver is undertaken to evaluate the direct tubular s-CO2 receiver's thermal performance when exposed to a concentrated solar power input of ~0.3–0.5 MW. Ray tracing, using SolTrace, is performed to determine the heat flux profiles on the receiver, and computational fluid dynamics (CFD) determines the thermal performance of the receiver under the specified heating conditions. Moreover, an in-house MATLAB code is developed to couple SolTrace and ANSYS Fluent. CFD modeling is performed using ANSYS Fluent to predict the thermal performance of the receiver by evaluating radiation and convection heat loss mechanisms. Parametric analyses provided an understanding of the effects of variation in heliostat aiming strategy and flow configurations on the thermal performance of the receiver. Finally, a receiver thermal efficiency of ~85% was predicted, and the surface temperatures were observed to be within the allowable limit for the materials under consideration.
Embedded measures of performance validity using verbal fluency tests in a clinical sample.
Sugarman, Michael A; Axelrod, Bradley N
2015-01-01
The objective of this study was to determine to what extent verbal fluency measures can be used as performance validity indicators during neuropsychological evaluation. Participants were clinically referred for neuropsychological evaluation in an urban-based Veterans Affairs hospital. Participants were placed into 2 groups based on their objectively evaluated effort on performance validity tests (PVTs). Individuals who exhibited credible performance (n = 431) failed 0 PVTs, and those with poor effort (n = 192) failed 2 or more PVTs. All participants completed the Controlled Oral Word Association Test (COWAT) and Animals verbal fluency measures. We evaluated how well verbal fluency scores could discriminate between the 2 groups. Raw scores and T scores for Animals discriminated between the credible performance and poor-effort groups with 90% specificity and greater than 40% sensitivity. COWAT scores had lower sensitivity for detecting poor effort. A combination of FAS and Animals scores into logistic regression models yielded acceptable group classification, with 90% specificity and greater than 44% sensitivity. Verbal fluency measures can yield adequate detection of poor effort during neuropsychological evaluation. We provide suggested cut points and logistic regression models for predicting the probability of poor effort in our clinical setting and offer suggested cutoff scores to optimize sensitivity and specificity.
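A cut-point analysis of the kind reported (fixing specificity near 90% on a fitted logistic regression and reading off the resulting sensitivity) can be sketched like this. The score distributions and group means below are invented for illustration; they are not the clinical sample's values or the paper's published cut points.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical fluency score distributions (not the study's data).
n_valid, n_poor = 431, 192
animals_valid = rng.normal(18, 4, n_valid)
animals_poor  = rng.normal(11, 4, n_poor)
fas_valid = rng.normal(35, 9, n_valid)
fas_poor  = rng.normal(26, 9, n_poor)

X = np.column_stack([np.r_[animals_valid, animals_poor],
                     np.r_[fas_valid, fas_poor]])
y = np.r_[np.zeros(n_valid), np.ones(n_poor)]  # 1 = poor effort

model = LogisticRegression().fit(X, y)
p = model.predict_proba(X)[:, 1]

# Choose the probability cut-off that holds specificity at ~90%,
# then read off the resulting sensitivity.
cut = np.quantile(p[y == 0], 0.90)
specificity = np.mean(p[y == 0] < cut)
sensitivity = np.mean(p[y == 1] >= cut)
```

Holding specificity high is the usual design choice for validity indicators: falsely flagging a credible patient is costlier than missing a poor-effort one.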
NASA Astrophysics Data System (ADS)
Akinsanola, A. A.; Ajayi, V. O.; Adejare, A. T.; Adeyeri, O. E.; Gbode, I. E.; Ogunjobi, K. O.; Nikulin, G.; Abolude, A. T.
2018-04-01
This study presents an evaluation of the ability of the Rossby Centre Regional Climate Model (RCA4), driven by nine global circulation models (GCMs), to skilfully reproduce the key features of rainfall climatology over West Africa for the period 1980-2005. The seasonal climatology and annual cycle of the RCA4 simulations were assessed over three homogeneous subregions of West Africa (Guinea coast, Savannah, and Sahel) and evaluated using observed precipitation data from the Global Precipitation Climatology Project (GPCP). Furthermore, the model output was evaluated using a wide range of statistical measures. The interseasonal and interannual variability of the RCA4 were further assessed over the subregions and the whole West Africa domain. Results indicate that the RCA4 captures the spatial and interseasonal rainfall pattern adequately but exhibits weak performance over the Guinea coast. Findings on interannual rainfall variability indicate that model performance is better over the larger West Africa domain than over the subregions. The largest difference across the RCA4 simulated annual rainfall was found in the Sahel. Results from the Mann-Kendall test showed no significant trend in annual rainfall for the 1980-2005 period, either in the GPCP observation data or in the model simulations over West Africa. In many aspects, the RCA4 simulation driven by HadGEM2-ES performs best over the region. The use of the multimodel ensemble mean resulted in an improved representation of rainfall characteristics over the study domain.
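The Mann-Kendall trend test applied to the annual-rainfall series is straightforward to implement. Here is a minimal sketch, without the tie correction a production implementation would include, run on an invented 26-value "annual rainfall" series (the real test used GPCP and RCA4 data):

```python
import numpy as np
from math import erf, sqrt

def mann_kendall(x):
    """Mann-Kendall trend test (no tie correction).
    Returns the S statistic, the normal-approximation z score,
    and the two-sided p-value."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # S counts concordant minus discordant pairs.
    s = sum(np.sign(x[j] - x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal p
    return s, z, p

# 26 synthetic "annual rainfall" totals (1980-2005 analogue), no imposed trend.
rain = [912, 875, 940, 901, 888, 930, 907, 880, 921, 899, 915, 884, 926,
        893, 909, 935, 872, 918, 896, 904, 911, 887, 928, 902, 890, 919]
s, z, p = mann_kendall(rain)
```

A p-value above 0.05 here corresponds to the "no significant trend" finding reported in the abstract.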
Small area population forecasting: some experience with British models.
Openshaw, S; Van Der Knaap, G A
1983-01-01
This study is concerned with the evaluation of the various models including time-series forecasts, extrapolation, and projection procedures, that have been developed to prepare population forecasts for planning purposes. These models are evaluated using data for the Netherlands. "As part of a research project at the Erasmus University, space-time population data has been assembled in a geographically consistent way for the period 1950-1979. These population time series are of sufficient length for the first 20 years to be used to build models and then evaluate the performance of the model for the next 10 years. Some 154 different forecasting models for 832 municipalities have been evaluated. It would appear that the best forecasts are likely to be provided by either a Holt-Winters model, or a ratio-correction model, or a low order exponential-smoothing model." excerpt
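One of the better-performing model families named above, low-order exponential smoothing with a trend term (Holt's linear method, the non-seasonal core of Holt-Winters), can be sketched as follows. The municipal population series and the smoothing constants are hypothetical; the study's 20-year-fit / 10-year-test design is mirrored in the example.

```python
def holt(series, alpha=0.3, beta=0.1, horizon=10):
    """Holt's linear (two-parameter) exponential smoothing:
    fit level and trend on the observed series, then
    extrapolate `horizon` steps ahead."""
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]

# Hypothetical municipal population, 20 annual values growing ~1%/yr
# (the "1950-1969 fit" half of the design); forecast the next 10 years.
pop = [10_000 * 1.01 ** t for t in range(20)]
forecast = holt(pop, horizon=10)
```

Evaluating many such models per municipality, as the study does for 154 models across 832 municipalities, then reduces to comparing each model's 10-step forecasts against the held-back observations.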
NASA Astrophysics Data System (ADS)
Safeeq, Mohammad; Fares, Ali
2011-12-01
Daily and sub-daily weather data are often required for hydrological and environmental modeling. Various weather generator programs have been used to generate synthetic climate data where observed climate data are limited. In this study, a weather data generator, ClimGen, was evaluated for generating information on daily precipitation, temperature, and wind speed at four tropical watersheds located in Hawai`i, USA. We also evaluated different daily to sub-daily weather data disaggregation methods for precipitation, air temperature, dew point temperature, and wind speed at Mākaha watershed. The hydrologic significance of the different disaggregation methods was evaluated using the Distributed Hydrology Soil Vegetation Model. The MuDRain and diurnal methods outperformed uniform distribution in disaggregating daily precipitation; however, the diurnal method is more consistent when accurate estimates of hourly precipitation intensities are desired. All of the air temperature disaggregation methods performed reasonably well, but goodness-of-fit statistics were slightly better for the sine curve model with a 2 h lag. The cosine model performed better than the random model in disaggregating daily wind speed. The largest differences in annual water balance were related to wind speed, followed by precipitation and dew point temperature. Simulated hourly streamflow, evapotranspiration, and groundwater recharge were less sensitive to the method of disaggregating daily air temperature. ClimGen performed well in generating the minimum and maximum temperature and wind speed. However, for precipitation, it clearly underestimated the number of extreme rainfall events with an intensity of >100 mm/day at all four locations. ClimGen was unable to replicate the distribution of observed precipitation at three locations (Honolulu, Kahului, and Hilo). ClimGen was able to reproduce the distributions of observed minimum temperature at Kahului and wind speed at Kahului and Hilo.
Although the weather data generation and disaggregation methods were tested in only a few Hawaiian watersheds, the results presented can be applied to similar mountainous settings, as well as to site-specific performance evaluations of the tested models at other locations.
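A sine-curve disaggregation of daily temperature extremes into hourly values, of the kind compared in the study, can be sketched as below. The fixed 15:00 peak and 03:00 minimum are illustrative assumptions, not the calibrated 2 h lag the authors report.

```python
import math

def diurnal_temps(tmin, tmax):
    """Sine-curve disaggregation of daily Tmin/Tmax into 24 hourly
    temperatures: a 24 h sine with its peak near 15:00 local time
    and its minimum near 03:00 (assumed timing, not calibrated)."""
    mean = (tmax + tmin) / 2.0
    amp = (tmax - tmin) / 2.0
    # sin(pi*(h-9)/12) = +1 at h=15 and -1 at h=3.
    return [mean + amp * math.sin(math.pi * (h - 9) / 12) for h in range(24)]

hourly = diurnal_temps(tmin=20.0, tmax=30.0)
```

By construction the hourly series hits the observed daily extremes exactly and averages to their midpoint, which is why such simple curves can still be "hydrologically significant" when fed to a model like DHSVM.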
Local spatio-temporal analysis in vision systems
NASA Astrophysics Data System (ADS)
Geisler, Wilson S.; Bovik, Alan; Cormack, Lawrence; Ghosh, Joydeep; Gildeen, David
1994-07-01
The aims of this project are the following: (1) develop a physiologically and psychophysically based model of low-level human visual processing (a key component of which are local frequency coding mechanisms); (2) develop image models and image-processing methods based upon local frequency coding; (3) develop algorithms for performing certain complex visual tasks based upon local frequency representations; (4) develop models of human performance in certain complex tasks based upon our understanding of low-level processing; and (5) develop a computational testbed for implementing, evaluating and visualizing the proposed models and algorithms, using a massively parallel computer. Progress has been substantial on all aims. The highlights include the following: (1) completion of a number of psychophysical and physiological experiments revealing new, systematic and exciting properties of the primate (human and monkey) visual system; (2) further development of image models that can accurately represent the local frequency structure in complex images; (3) near completion in the construction of the Texas Active Vision Testbed; (4) development and testing of several new computer vision algorithms dealing with shape-from-texture, shape-from-stereo, and depth-from-focus; (5) implementation and evaluation of several new models of human visual performance; and (6) evaluation, purchase and installation of a MasPar parallel computer.
Results from the VALUE perfect predictor experiment: process-based evaluation
NASA Astrophysics Data System (ADS)
Maraun, Douglas; Soares, Pedro; Hertig, Elke; Brands, Swen; Huth, Radan; Cardoso, Rita; Kotlarski, Sven; Casado, Maria; Pongracz, Rita; Bartholy, Judit
2016-04-01
Until recently, the evaluation of downscaled climate model simulations has typically been limited to surface climatologies, including long term means, spatial variability and extremes. But these aspects are often, at least partly, tuned in regional climate models to match observed climate. The tuning issue is of course particularly relevant for bias corrected regional climate models. In general, a good performance of a model for these aspects in present climate does therefore not imply a good performance in simulating climate change. It is now widely accepted that, to increase our confidence in climate change simulations, it is necessary to evaluate how climate models simulate relevant underlying processes. In other words, it is important to assess whether downscaling does the right thing for the right reason. Therefore, VALUE has carried out a broad process-based evaluation study based on its perfect predictor experiment simulations: the downscaling methods are driven by ERA-Interim data over the period 1979-2008, and reference observations are given by a network of 85 meteorological stations covering all European climates. More than 30 methods participated in the evaluation. In order to compare statistical and dynamical methods, only variables provided by both types of approaches could be considered. This limited the analysis to conditioning local surface variables on variables from driving processes that are simulated by ERA-Interim. We considered the following types of processes: at the continental scale, we evaluated the performance of downscaling methods for positive and negative North Atlantic Oscillation, Atlantic ridge and blocking situations. At synoptic scales, we considered Lamb weather types for selected European regions such as Scandinavia, the United Kingdom, the Iberian Peninsula or the Alps. At regional scales we considered phenomena such as the Mistral, the Bora or the Iberian coastal jet.
Such process-based evaluation helps to attribute biases in surface variables to underlying processes and ultimately to improve climate models.
NASA Astrophysics Data System (ADS)
Du, Xiaorong
2017-04-01
Water is a basic condition for human survival and development. As China is the most populous country, rural drinking water safety problems are especially conspicuous. The Chinese government has therefore kept increasing investment and has built a large number of rural drinking water safety projects. Scientific evaluation of project performance is of great significance for promoting the sustainable operation of the projects and the sustainable development of the rural economy. Previous studies mainly focus on the economic benefits of such projects, ignoring the fact that a rural drinking water safety project is a quasi-public good with economic, social and ecological benefits. This paper establishes a comprehensive evaluation model for rural drinking water safety performance, which adopts the "5E" rules (economy, efficiency, effectiveness, equity and environment) as its value orientation, and takes a rural drinking water safety project in K District, in the north of Jiangsu Province, China, as a case study. The results show: 1) the comprehensive performance of the K project is in good condition; 2) the scores of the criteria "efficiency", "environment" and "effect" are higher than the mean performance, while "economy" is slightly lower than the mean and "equity" is the lowest; 3) at the indicator layer, the planned completion rate of the project, the reduction rate of project cost and the penetration rate of the water-use population are significantly lower than the other indicators. Based on the achievements of previous studies and the characteristics of rural drinking water safety projects, this study integrates the evaluation dimensions of equity and environment, which can contribute to a more comprehensive and systematic assessment of project performance and provide empirical data for performance evaluation and management of rural drinking water safety projects.
Key Words: Rural drinking water safety project; Performance evaluation; 5E rules; Comprehensive evaluation model
Evaluating and Improving Cloud Processes in the Multi-Scale Modeling Framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ackerman, Thomas P.
2015-03-01
The research performed under this grant was intended to improve the embedded cloud model in the Multi-scale Modeling Framework (MMF) for convective clouds by using a 2-moment microphysics scheme rather than the single-moment scheme used in all the MMF runs to date. The technical report and associated documents describe the results of testing the cloud resolving model with fixed boundary conditions and evaluation of model results with data. The overarching conclusion is that such model evaluations are problematic because errors in the forcing fields control the results so strongly that variations in parameterization values cannot be usefully constrained.
ERIC Educational Resources Information Center
Darabi, A. Aubteen
2005-01-01
This article reports a case study describing how the principles of a cognitive apprenticeship (CA) model developed by Collins, Brown, and Holum (1991) were applied to a graduate course on performance systems analysis (PSA), and the differences this application made in student performance and evaluation of the course compared to the previous…
DOT National Transportation Integrated Search
2012-10-01
This project conducted a thorough review of the existing Pavement Management Information System (PMIS) database, : performance models, needs estimates, utility curves, and scores calculations, as well as a review of District practices : concerning th...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-04-10
... must be one which, if proven, would entitle the requestor/petitioner to relief. A requestor/ petitioner..., and fire modeling calculations, have been performed to demonstrate that the performance-based... may include engineering evaluations, probabilistic safety assessments, and fire modeling calculations...
Evaluating hydrological model performance using information theory-based metrics
USDA-ARS?s Scientific Manuscript database
The accuracy-based model performance metrics do not necessarily reflect the qualitative correspondence between simulated and measured streamflow time series. The objective of this work was to use the information theory-based metrics to see whether they can be used as complementary tool for hydrologic m...
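One information theory-based metric that could serve as such a complementary tool is the Kullback-Leibler divergence between the binned distributions of simulated and observed flows. The sketch below uses synthetic lognormal "streamflow" and a simple histogram estimator; it is an assumed illustration, not the manuscript's actual metric set.

```python
import numpy as np

def kl_divergence(sim, obs, bins=20):
    """KL divergence D(obs || sim) between binned flow distributions --
    an information-theoretic complement to accuracy metrics like NSE."""
    lo, hi = min(sim.min(), obs.min()), max(sim.max(), obs.max())
    p, _ = np.histogram(obs, bins=bins, range=(lo, hi))
    q, _ = np.histogram(sim, bins=bins, range=(lo, hi))
    p = (p + 1e-9) / (p + 1e-9).sum()  # smooth to avoid log(0)
    q = (q + 1e-9) / (q + 1e-9).sum()
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(3)
obs = rng.lognormal(1.0, 0.8, 1000)        # synthetic "observed" streamflow
good = obs * rng.normal(1.0, 0.05, 1000)   # nearly unbiased simulation
poor = rng.lognormal(1.6, 0.8, 1000)       # biased simulation
```

A distributional metric like this can flag a simulation whose flow-duration behaviour is wrong even when its time-step accuracy score looks acceptable.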
Conceptual design and analysis of a dynamic scale model of the Space Station Freedom
NASA Technical Reports Server (NTRS)
Davis, D. A.; Gronet, M. J.; Tan, M. K.; Thorne, J.
1994-01-01
This report documents the conceptual design study performed to evaluate design options for a subscale dynamic test model which could be used to investigate the expected on-orbit structural dynamic characteristics of the Space Station Freedom early build configurations. The baseline option was a 'near-replica' model of the SSF SC-7 pre-integrated truss configuration. The approach used to develop conceptual design options involved three sets of studies: evaluation of the full-scale design and analysis databases, conducting scale factor trade studies, and performing design sensitivity studies. The scale factor trade study was conducted to develop a fundamental understanding of the key scaling parameters that drive the design, performance and cost of a SSF dynamic scale model. Four scale model options were considered: 1/4, 1/5, 1/7, and 1/10 scale. Prototype hardware was fabricated to assess producibility issues. Based on the results of the study, a 1/4-scale size is recommended because of the increased model fidelity associated with a larger scale factor. A design sensitivity study was performed to identify critical hardware component properties that drive dynamic performance. A total of 118 component properties were identified which require high-fidelity replication. Lower fidelity dynamic similarity scaling can be used for non-critical components.
3 Lectures: "Lagrangian Models", "Numerical Transport Schemes", and "Chemical and Transport Models"
NASA Technical Reports Server (NTRS)
Douglass, A.
2005-01-01
The topics for the three lectures for the Canadian Summer School are Lagrangian Models, numerical transport schemes, and chemical and transport models. In the first lecture I will explain the basic components of the Lagrangian model (a trajectory code and a photochemical code), the difficulties in using such a model (initialization) and show some applications in interpretation of aircraft and satellite data. If time permits I will show some results concerning inverse modeling which is being used to evaluate sources of tropospheric pollutants. In the second lecture I will discuss one of the core components of any grid point model, the numerical transport scheme. I will explain the basics of shock capturing schemes, and performance criteria. I will include an example of the importance of horizontal resolution to polar processes. We have learned from NASA's global modeling initiative that horizontal resolution matters for predictions of the future evolution of the ozone hole. The numerical scheme will be evaluated using performance metrics based on satellite observations of long-lived tracers. The final lecture will discuss the evolution of chemical transport models over the last decade. Some of the problems with assimilated winds will be demonstrated, using satellite data to evaluate the simulations.
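As an illustration of the kind of numerical transport scheme discussed in the second lecture, here is a minimal first-order upwind advection step on a periodic 1-D grid. This toy (not any production transport code) shows the two properties such schemes are judged on: it conserves the tracer total and creates no new extrema, at the cost of numerical diffusion that smears the pulse.

```python
import numpy as np

def upwind_step(q, c):
    """One step of first-order upwind advection on a periodic grid
    for rightward flow; c = u*dt/dx is the Courant number
    (the scheme is stable and monotone for 0 <= c <= 1)."""
    return q - c * (q - np.roll(q, 1))

# Advect a square pulse once around a periodic domain of 100 cells.
n, c = 100, 0.5
q = np.zeros(n)
q[40:60] = 1.0
total0 = q.sum()
for _ in range(int(n / c)):  # 200 steps moves the pulse one full circuit
    q = upwind_step(q, c)
```

After a full circuit the pulse returns to its starting position with its integral intact but its sharp edges diffused, which is exactly the trade-off that motivates the higher-order shock-capturing schemes mentioned in the lecture.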
COBRA ATD minefield detection model initial performance analysis
NASA Astrophysics Data System (ADS)
Holmes, V. Todd; Kenton, Arthur C.; Hilton, Russell J.; Witherspoon, Ned H.; Holloway, John H., Jr.
2000-08-01
A statistical performance analysis of the USMC Coastal Battlefield Reconnaissance and Analysis (COBRA) Minefield Detection (MFD) Model has been performed in support of the COBRA ATD Program under execution by the Naval Surface Warfare Center/Dahlgren Division/Coastal Systems Station. This analysis uses the Veridian ERIM International MFD model from the COBRA Sensor Performance Evaluation and Computational Tools for Research Analysis modeling toolbox and a collection of multispectral mine detection algorithm response distributions for mines and minelike clutter objects. These mine detection response distributions were generated from actual COBRA ATD test missions over littoral zone minefields. This analysis serves to validate both the utility and effectiveness of the COBRA MFD Model as a predictive MFD performance tool. COBRA ATD minefield detection model algorithm performance results based on a simulated baseline minefield detection scenario are presented, as well as results of an MFD model algorithm parametric sensitivity study.
An Empirical Study of Kirkpatrick's Evaluation Model in the Hospitality Industry
ERIC Educational Resources Information Center
Chang, Ya-Hui Elegance
2010-01-01
This study examined Kirkpatrick's training evaluation model (Kirkpatrick & Kirkpatrick, 2006) by assessing a sales training program conducted at an organization in the hospitality industry. The study assessed the employees' training outcomes of knowledge and skills, job performance, and the impact of the training upon the organization. By…
An Objective Evaluation of a Behavior Modeling Training Program.
ERIC Educational Resources Information Center
Meyer, Herbert H.; Raich, Michael S.
1983-01-01
Evaluated a behavior modeling training program for sales representatives (N=58) in relation to effects on their sales performance. Results showed participants increased their sales by an average of 7 percent during the ensuing six-month period, while the control group showed a 3 percent decrease. (JAC)
The National Ambient Air Quality Standards for particulate matter (PM) and the federal regional haze regulations place some emphasis on the assessment of fine particle (PM2.5) concentrations. Current air quality models need to be improved and evaluated against observations to a...
Evaluating the mitigation of greenhouse gas emissions and adaptation in dairy production.
USDA-ARS?s Scientific Manuscript database
Process-level modeling at the farm scale provides a tool for evaluating strategies for both mitigating greenhouse gas emissions and adapting to climate change. The Integrated Farm System Model (IFSM) simulates representative crop, beef or dairy farms over many years of weather to predict performance...
A diagnostic model evaluation effort has been performed to focus on photochemical ozone formation and the horizontal transport process since they strongly impact the temporal evolution and spatial distribution of ozone (O3) within the lower troposphere. Results from th...
Integrated Main Propulsion System Performance Reconstruction Process/Models
NASA Technical Reports Server (NTRS)
Lopez, Eduardo; Elliott, Katie; Snell, Steven; Evans, Michael
2013-01-01
The Integrated Main Propulsion System (MPS) Performance Reconstruction process provides the MPS post-flight data files needed for postflight reporting to the project integration management and key customers to verify flight performance. This process/model was used as the baseline for the currently ongoing Space Launch System (SLS) work. The process utilizes several methodologies, including multiple software programs, to model integrated propulsion system performance through space shuttle ascent. It is used to evaluate integrated propulsion systems, including propellant tanks, feed systems, rocket engine, and pressurization systems performance throughout ascent based on flight pressure and temperature data. The latest revision incorporates new methods based on main engine power balance model updates to model higher mixture ratio operation at lower engine power levels.
Moore, Lynne; Turgeon, Alexis F; Sirois, Marie-Josée; Murat, Valérie; Lavoie, André
2011-09-01
Trauma center performance evaluations generally include adjustment for injury severity, age, and comorbidity. However, disparities across trauma centers may be due to other differences in source populations that are not accounted for, such as socioeconomic status (SES). We aimed to evaluate whether SES influences trauma center performance evaluations in an inclusive trauma system with universal access to health care. The study was based on data collected between 1999 and 2006 in a Canadian trauma system. Patient SES was quantified using an ecologic index of social and material deprivation. Performance evaluations were based on mortality adjusted using the Trauma Risk Adjustment Model. Agreement between performance results with and without additional adjustment for SES was evaluated with correlation coefficients. The study sample comprised a total of 71,784 patients from 48 trauma centers, including 3,828 deaths within 30 days (4.5%) and 5,549 deaths within 6 months (7.7%). The proportion of patients in the highest quintile of social and material deprivation varied from 3% to 43% and from 11% to 90% across hospitals, respectively. The correlation between performance results with or without adjustment for SES was almost perfect (r = 0.997; 95% CI 0.995-0.998) and the same hospital outliers were identified. We observed an important variation in SES across trauma centers but no change in risk-adjusted mortality estimates when SES was added to adjustment models. Results suggest that after adjustment for injury severity, age, comorbidity, and transfer status, disparities in SES across trauma center source populations do not influence trauma center performance evaluations in a system offering universal health coverage. Copyright © 2011 American College of Surgeons. Published by Elsevier Inc. All rights reserved.
Land Ice Verification and Validation Kit
DOE Office of Scientific and Technical Information (OSTI.GOV)
2015-07-15
To address a pressing need to better understand the behavior and complex interaction of ice sheets within the global Earth system, significant development of continental-scale, dynamical ice-sheet models is underway. The associated verification and validation process of these models is being coordinated through a new, robust, python-based extensible software package, the Land Ice Verification and Validation toolkit (LIVV). This release provides robust and automated verification and a performance evaluation on LCF platforms. The performance V&V involves a comprehensive comparison of model performance relative to expected behavior on a given computing platform. LIVV operates on a set of benchmark and test data, and provides comparisons for a suite of community prioritized tests, including configuration and parameter variations, bit-for-bit evaluation, and plots of tests where differences occur.
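The bit-for-bit evaluation mentioned above can be illustrated with a tiny sketch (this is the idea only, not the LIVV API): two result arrays pass only if they match exactly, so even ordinary floating-point round-off differences between platforms are flagged.

```python
import numpy as np

def bit4bit(a, b):
    """Bit-for-bit check as used in regression-style verification:
    arrays must match exactly (same shape, identical values),
    not merely agree to within a tolerance."""
    a, b = np.asarray(a), np.asarray(b)
    return a.shape == b.shape and bool(np.all(a == b))
```

For example, `bit4bit(0.1 + 0.2, 0.3)` fails even though the two values agree to 16 decimal places, which is precisely why bit-for-bit testing isolates compiler, library, and platform differences that tolerance-based checks hide.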
A photosynthesis-based two-leaf canopy stomatal ...
A coupled photosynthesis-stomatal conductance model with single-layer sunlit and shaded leaf canopy scaling is implemented and evaluated in a diagnostic box model with the Pleim-Xiu land surface model (PX LSM) and ozone deposition model components taken directly from the meteorology and air quality modeling system—WRF/CMAQ (Weather Research and Forecast model and Community Multiscale Air Quality model). The photosynthesis-based model for PX LSM (PX PSN) is evaluated at a FLUXNET site for implementation against different parameterizations and the current PX LSM approach with a simple Jarvis function (PX Jarvis). Latent heat flux (LH) from PX PSN is further evaluated at five FLUXNET sites with different vegetation types and landscape characteristics. Simulated ozone deposition and flux from PX PSN are evaluated at one of the sites with ozone flux measurements. Overall, the PX PSN simulates LH as well as the PX Jarvis approach. The PX PSN, however, shows distinct advantages over the PX Jarvis approach for grassland that likely result from its treatment of C3 and C4 plants for CO2 assimilation. Simulations using Moderate Resolution Imaging Spectroradiometer (MODIS) leaf area index (LAI) rather than LAI measured at each site assess how the model would perform with grid averaged data used in WRF/CMAQ. MODIS LAI estimates degrade model performance at all sites but one, which has exceptionally old and tall trees. Ozone deposition velocity and ozone flux along with LH
Evaluation of the Community Multi-scale Air Quality (CMAQ) ...
The Community Multiscale Air Quality (CMAQ) model is a state-of-the-science air quality model that simulates the emission, transport and fate of numerous air pollutants, including ozone and particulate matter. The Computational Exposure Division (CED) of the U.S. Environmental Protection Agency develops the CMAQ model and periodically releases new versions of the model that include bug fixes and various other improvements to the modeling system. In the fall of 2015, CMAQ version 5.1 was released. This new version of CMAQ will contain important bug fixes to several issues that were identified in CMAQv5.0.2 and additionally include updates to other portions of the code. Several annual, and numerous episodic, CMAQv5.1 simulations were performed to assess the impact of these improvements on the model results. These results will be presented, along with a base evaluation of the performance of the CMAQv5.1 modeling system against available surface and upper-air measurements available during the time period simulated. The National Exposure Research Laboratory (NERL) Computational Exposure Division (CED) develops and evaluates data, decision-support tools, and models to be applied to media-specific or receptor-specific problem areas. CED uses modeling-based approaches to characterize exposures, evaluate fate and transport, and support environmental diagnostics/forensics with input from multiple data sources. It also develops media- and receptor-specific models, proces
Measuring the performance of Internet companies using a two-stage data envelopment analysis model
NASA Astrophysics Data System (ADS)
Cao, Xiongfei; Yang, Feng
2011-05-01
In exploring the business operation of Internet companies, few researchers have used data envelopment analysis (DEA) to evaluate their performance. Since Internet companies have a two-stage production process, marketability and profitability, this study employs a relational two-stage DEA model to assess the efficiency of 40 dot-com firms. The results show that our model performs better in measuring efficiency and is able to discriminate the causes of inefficiency, thus helping management become more effective by providing guidance for business performance improvement.
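For illustration, here is a minimal single-stage, input-oriented CCR DEA efficiency score solved as a linear program; the relational two-stage model used in the study chains such programs so that the first stage's outputs (marketability) become the second stage's inputs (for profitability). The three-DMU dataset is invented, and this sketch is not the paper's formulation.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, k):
    """Input-oriented CCR efficiency of DMU k (envelopment form):
    minimise theta s.t. sum_j lam_j x_j <= theta * x_k,
                        sum_j lam_j y_j >= y_k, lam >= 0.
    X: (n_dmu, n_inputs), Y: (n_dmu, n_outputs)."""
    n, m = X.shape
    s = Y.shape[1]
    # Decision variables: [theta, lambda_1 .. lambda_n]; minimise theta.
    c = np.r_[1.0, np.zeros(n)]
    # Input rows:  sum_j lam_j x_ij - theta * x_ik <= 0
    A_in = np.hstack([-X[[k]].T, X.T])
    # Output rows: -sum_j lam_j y_rj <= -y_rk
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])
    res = linprog(c,
                  A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.r_[np.zeros(m), -Y[k]],
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.fun

# Toy data: one input, one output, three firms; firm 2 wastes half its input.
X = np.array([[2.0], [4.0], [4.0]])
Y = np.array([[2.0], [4.0], [2.0]])
effs = [ccr_efficiency(X, Y, k) for k in range(3)]
```

An efficiency of 1.0 places a firm on the empirical frontier; in a two-stage variant, a firm can then be diagnosed as inefficient in marketability, profitability, or both, which is the "discriminating the causes of inefficiency" the abstract refers to.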
NASA Astrophysics Data System (ADS)
Hosseinalipour, S. M.; Raja, A.; Hajikhani, S.
2012-06-01
A full three-dimensional Navier-Stokes numerical simulation was performed for the performance analysis of a Kaplan turbine installed in one of Iran's southern dams. No simplifications were enforced in the simulation. The numerical results were evaluated using integral parameters such as turbine efficiency, by comparing the results with existing experimental data from the prototype Hill chart. In part of this study, the numerical simulations were performed to calculate the prototype turbine efficiencies at specific points obtained by scaling up the model efficiencies available in the model's experimental Hill chart. The results are very promising and demonstrate the ability of numerical techniques to resolve the flow characteristics in this kind of complex geometry. A parametric study evaluating turbine performance at three different runner angles of the prototype was also performed, and the results are reported in this paper.
On testing models for the pressure-strain correlation of turbulence using direct simulations
NASA Technical Reports Server (NTRS)
Speziale, Charles G.; Gatski, Thomas B.; Sarkar, Sutanu
1992-01-01
Direct simulations of homogeneous turbulence have, in recent years, come into widespread use for the evaluation of models for the pressure-strain correlation of turbulence. While work in this area has been beneficial, the increasingly common practice of testing the slow and rapid parts of these models separately in uniformly strained turbulent flows is shown in this paper to be unsound. For such flows, the decomposition of models for the total pressure-strain correlation into slow and rapid parts is ambiguous. Consequently, when tested in this manner, misleading conclusions can be drawn about the performance of pressure-strain models. This point is amplified by illustrative calculations of homogeneous shear flow where other pitfalls in the evaluation of models are also uncovered. More meaningful measures for testing the performance of pressure-strain models in uniformly strained turbulent flows are proposed and the implications for turbulence modeling are discussed.
A Stirling engine computer model for performance calculations
NASA Technical Reports Server (NTRS)
Tew, R.; Jefferies, K.; Miao, D.
1978-01-01
To support the development of the Stirling engine as a possible alternative to the automobile spark-ignition engine, the thermodynamic characteristics of the Stirling engine were analyzed and modeled on a computer. The modeling techniques used are presented. The performance of an existing rhombic-drive Stirling engine was simulated by use of this computer program, and some typical results are presented. Engine tests are planned in order to evaluate this model.
Vakanski, A; Ferguson, JM; Lee, S
2016-01-01
Objective: The objective of the proposed research is to develop a methodology for modeling and evaluation of human motions, which will potentially benefit patients undertaking a physical rehabilitation therapy (e.g., following a stroke or due to other medical conditions). The ultimate aim is to allow patients to perform home-based rehabilitation exercises using a sensory system for capturing the motions, where an algorithm will retrieve the trajectories of a patient’s exercises, will perform data analysis by comparing the performed motions to a reference model of prescribed motions, and will send the analysis results to the patient’s physician with recommendations for improvement. Methods: The modeling approach employs an artificial neural network, consisting of layers of recurrent neuron units and layers of neuron units for estimating a mixture density function over the spatio-temporal dependencies within the human motion sequences. Input data are sequences of motions related to a prescribed exercise by a physiotherapist to a patient, and recorded with a motion capture system. An autoencoder subnet is employed for reducing the dimensionality of captured sequences of human motions, complemented with a mixture density subnet for probabilistic modeling of the motion data using a mixture of Gaussian distributions. Results: The proposed neural network architecture produced a model for sets of human motions represented with a mixture of Gaussian density functions. The mean log-likelihood of observed sequences was employed as a performance metric in evaluating the consistency of a subject’s performance relative to the reference dataset of motions. A publicly available dataset of human motions captured with Microsoft Kinect was used for validation of the proposed method. Conclusion: The article presents a novel approach for modeling and evaluation of human motions with a potential application in home-based physical therapy and rehabilitation. 
The described approach employs the recent progress in the field of machine learning and neural networks in developing a parametric model of human motions, by exploiting the representational power of these algorithms to encode nonlinear input-output dependencies over long temporal horizons. PMID:28111643
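The consistency metric described above, the mean log-likelihood of a captured sequence under a Gaussian mixture, can be sketched with plain NumPy. The mixture parameters and motion samples below are invented for illustration; in the paper they would come from the trained mixture density network:

```python
import numpy as np

def gmm_log_likelihood(x, weights, means, stds):
    """Log-likelihood of 1-D points x under a Gaussian mixture."""
    x = np.asarray(x)[:, None]                                  # (N, 1)
    log_norm = -0.5 * np.log(2 * np.pi * np.asarray(stds) ** 2)
    log_comp = log_norm - 0.5 * ((x - means) / stds) ** 2       # (N, K)
    log_comp = log_comp + np.log(weights)
    # log-sum-exp over mixture components, then sum over points
    m = log_comp.max(axis=1, keepdims=True)
    ll = m.squeeze(1) + np.log(np.exp(log_comp - m).sum(axis=1))
    return ll.sum()

# Score a "performed motion" sample against a reference mixture model:
# mean log-likelihood per frame acts as the consistency metric.
ref = dict(weights=[0.5, 0.5], means=np.array([0.0, 5.0]), stds=np.array([1.0, 1.0]))
good = np.array([0.1, -0.2, 5.1, 4.9])   # close to the reference modes
bad = np.array([2.5, 2.4, 2.6, 2.5])     # far from both modes
print(gmm_log_likelihood(good, **ref) / len(good) >
      gmm_log_likelihood(bad, **ref) / len(bad))   # -> True
```

In the described system the mixture is conditioned on time within the exercise; the unconditional 1-D mixture here only illustrates how the metric ranks consistent versus inconsistent performances.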
EMDS 3.0: A modeling framework for coping with complexity in environmental assessment and planning.
K.M. Reynolds
2006-01-01
EMDS 3.0 is implemented as an ArcMap® extension and integrates the logic engine of NetWeaver® to perform landscape evaluations, and the decision modeling engine of Criterium DecisionPlus® for evaluating management priorities. Key features of the system's evaluation component include abilities to (1) reason about large, abstract, multifaceted ecosystem management...
Evaluation For Intelligent Transportation Systems, Evaluation Methodologies
DOT National Transportation Integrated Search
1996-03-01
The briefing also presents thoughts on evaluation in light of the recent launch of Operation TimeSaver, the model deployment initiative for four different cities, and the implications of the recent "Government Performance and Results Act" that requir...
Recent evaluations of crack-opening-area in circumferentially cracked pipes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rahman, S.; Brust, F.; Ghadiali, N.
1997-04-01
Leak-before-break (LBB) analyses for circumferentially cracked pipes are currently being conducted in the nuclear industry to justify elimination of pipe whip restraints and jet shields, which are present because of the expected dynamic effects from pipe rupture. The application of the LBB methodology frequently requires calculation of leak rates. The leak rates depend on the crack-opening area of the through-wall crack in the pipe. In addition to LBB analyses, which assume a hypothetical flaw size, there is also interest in the integrity of actual leaking cracks corresponding to current leakage detection requirements in NRC Regulatory Guide 1.45, or for assessing temporary repair of Class 2 and 3 pipes that have leaks, as are being evaluated in ASME Section XI. The objectives of this study were to review, evaluate, and refine current predictive models for performing crack-opening-area analyses of circumferentially cracked pipes. The results from twenty-five full-scale pipe fracture experiments, conducted in the Degraded Piping Program, the International Piping Integrity Research Group Program, and the Short Cracks in Piping and Piping Welds Program, were used to verify the analytical models. Standard statistical analyses were performed to assess quantitatively the accuracy of the predictive models. The evaluation also involved finite element analyses for determining the crack-opening profile often needed to perform leak-rate calculations.
Using the 360 degrees multisource feedback model to evaluate teaching and professionalism.
Berk, Ronald A
2009-12-01
Student ratings have dominated as the primary and, frequently, only measure of teaching performance at colleges and universities for the past 50 years. Recently, there has been a trend toward augmenting those ratings with other data sources to broaden and deepen the evidence base. The 360 degrees multisource feedback (MSF) model used in management and industry for half a century and in clinical medicine for the last decade seemed like a best fit to evaluate teaching performance and professionalism. To adapt the 360 degrees MSF model to the assessment of teaching performance and professionalism of medical school faculty. The salient characteristics of the MSF models in industry and medicine were extracted from the literature. These characteristics along with 14 sources of evidence from eight possible raters, including students, self, peers, outside experts, mentors, alumni, employers, and administrators, based on the research in higher education were adapted to formative and summative decisions. Three 360 degrees MSF models were generated for three different decisions: (1) formative decisions and feedback about teaching improvement; (2) summative decisions and feedback for merit pay and contract renewal; and (3) formative decisions and feedback about professional behaviors in the academic setting. The characteristics of each model were listed. Finally, a top-10 list of the most persistent and, perhaps, intractable psychometric issues in executing these models was suggested to guide future research. The 360 degrees MSF model appears to be a useful framework for implementing a multisource evaluation of faculty teaching performance and professionalism in medical schools. This model can provide more accurate, reliable, fair, and equitable decisions than the one based on just a single source.
The CMAQ modeling system has been used to simulate the CONUS using 12-km by 12-km horizontal grid spacing for the entire year of 2006 as part of the Air Quality Model Evaluation International initiative (AQMEII). The operational model performance for O3 and PM2.5<...
Hill, Mary C.; L. Foglia,; S. W. Mehl,; P. Burlando,
2013-01-01
Model adequacy is evaluated with alternative models rated using model selection criteria (AICc, BIC, and KIC) and three other statistics. Model selection criteria are tested with cross-validation experiments, and insights for using alternative models to evaluate model structural adequacy are provided. The study is conducted using the computer codes UCODE_2005 and MMA (MultiModel Analysis). One recharge alternative is simulated using the TOPKAPI hydrological model. The predictions evaluated include eight heads and three flows located where ecological consequences and model precision are of concern. Cross-validation is used to obtain measures of prediction accuracy. Sixty-four models were designed deterministically and differ in their representation of the river, recharge, bedrock topography, and hydraulic conductivity. Results include: (1) What may seem like inconsequential choices in model construction may be important to predictions; analysis of predictions from alternative models is advised. (2) None of the model selection criteria consistently identified models with more accurate predictions. This is a disturbing result that suggests reconsidering the utility of model selection criteria, and/or the cross-validation measures used in this work to measure model accuracy. (3) KIC displayed poor performance for the present regression problems; theoretical considerations suggest that the difficulties are associated with wide variations in the sensitivity term of KIC, resulting from the models being nonlinear and the problems being ill-posed due to parameter correlations and insensitivity. The other criteria performed somewhat better, and similarly to each other. (4) Quantities with high leverage are more difficult to predict. The results are expected to be generally applicable to models of environmental systems.
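For reference, the least-squares forms of two of the criteria compared in the study can be written in a few lines; KIC is omitted because its sensitivity (Fisher information) term depends on the specific regression. The RSS values and parameter counts below are made up for illustration:

```python
import math

def aic_bic(rss, n, k):
    """Least-squares forms of AIC, AICc, and BIC.
    rss: residual sum of squares, n: observations, k: fitted parameters."""
    aic = n * math.log(rss / n) + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)   # small-sample correction
    bic = n * math.log(rss / n) + k * math.log(n)
    return aic, aicc, bic

# Two hypothetical candidate models fit to the same 20 observations:
# model b halves the RSS but uses three extra parameters.
a = aic_bic(rss=10.0, n=20, k=3)
b = aic_bic(rss=5.0, n=20, k=6)
# For n >= 8, log(n) > 2, so BIC penalizes the extra parameters
# more heavily than AIC does.
```

Lower values indicate the preferred model under each criterion; as the study's results warn, criterion rankings need not coincide with cross-validated prediction accuracy.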
Vandenplas, J; Janssens, S; Buys, N; Gengler, N
2013-06-01
The aim of this study was to test the integration of external information, i.e. foreign estimated breeding values (EBV) and the associated reliabilities (REL), for stallions into the Belgian genetic evaluation for jumping horses. The Belgian model is a bivariate repeatability Best Linear Unbiased Prediction animal model based only on Belgian performances, while Belgian breeders import horses from neighbouring countries. Hence, external information is needed as a prior to achieve more accurate EBV. Pedigree and performance data contained 101382 horses and 712212 performances, respectively. After conversion to the Belgian trait, external information on 98 French and 67 Dutch stallions was integrated into the Belgian evaluation. The resulting Belgian rankings of the foreign stallions were more similar to the foreign rankings, as shown by an increase in rank correlations of at least 12%. The REL of their EBV improved by at least 2% on average. External information was partially to totally equivalent to four years of contemporary horses' performances, or to all of the stallions' own performances. All these results demonstrate the value of integrating external information into the Belgian evaluation. © 2012 Blackwell Verlag GmbH.
Lim, Sandy; Tai, Kenneth
2014-03-01
This study extends the stress literature by exploring the relationship between family incivility and job performance. We examine whether psychological distress mediates the link between family incivility and job performance. We also investigate how core self-evaluation might moderate this mediated relationship. Data from a 2-wave study indicate that psychological distress mediates the relationship between family incivility and job performance. In addition, core self-evaluation moderates the relationship between family incivility and psychological distress but not the relationship between psychological distress and job performance. The results hold while controlling for general job stress, family-to-work conflict, and work-to-family conflict. The findings suggest that family incivility is linked to poor performance at work, and psychological distress and core self-evaluation are key mechanisms in the relationship.
Management systems research study
NASA Technical Reports Server (NTRS)
Bruno, A. V.
1975-01-01
The development of a Monte Carlo simulation of procurement activities at the NASA Ames Research Center is described. The work covered simulation of the procurement cycle, construction of a performance evaluation model, examination of employee development procedures, and review of evaluation criteria for divisional and individual performance evaluation. It also included determination of the influences and apparent impact of contract type and structure, and development of a management control system for planning and controlling manpower requirements.
Model evaluation using a community benchmarking system for land surface models
NASA Astrophysics Data System (ADS)
Mu, M.; Hoffman, F. M.; Lawrence, D. M.; Riley, W. J.; Keppel-Aleks, G.; Kluzek, E. B.; Koven, C. D.; Randerson, J. T.
2014-12-01
Evaluation of atmosphere, ocean, sea ice, and land surface models is an important step in identifying deficiencies in Earth system models and developing improved estimates of future change. For the land surface and carbon cycle, the design of an open-source system has been an important objective of the International Land Model Benchmarking (ILAMB) project. Here we evaluated CMIP5 and CLM models using a benchmarking system that enables users to specify models, data sets, and scoring systems so that results can be tailored to specific model intercomparison projects. Our scoring system used information from four different aspects of global datasets, including climatological mean spatial patterns, seasonal cycle dynamics, interannual variability, and long-term trends. Variable-to-variable comparisons enable investigation of the mechanistic underpinnings of model behavior, and allow for some control of biases in model drivers. Graphics modules allow users to evaluate model performance at local, regional, and global scales. Use of modular structures makes it relatively easy for users to add new variables, diagnostic metrics, benchmarking datasets, or model simulations. Diagnostic results are automatically organized into HTML files, so users can conveniently share results with colleagues. We used this system to evaluate atmospheric carbon dioxide, burned area, global biomass and soil carbon stocks, net ecosystem exchange, gross primary production, ecosystem respiration, terrestrial water storage, evapotranspiration, and surface radiation from CMIP5 historical and ESM historical simulations. We found that the multi-model mean often performed better than many of the individual models for most variables. We plan to publicly release a stable version of the software during fall of 2014 that has land surface, carbon cycle, hydrology, radiation and energy cycle components.
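ILAMB-style benchmarking maps each statistic (mean-state bias, seasonal-cycle phase, interannual variability, trend) onto a unitless score so that models can be compared across variables. As a minimal sketch of the idea, not the actual ILAMB scoring code, a relative-bias score might look like:

```python
import numpy as np

def bias_score(model, obs):
    """Map a relative bias into a unitless (0, 1] score, 1 = perfect.
    Relative-error scoring in the spirit of ILAMB-style benchmarking."""
    rel_bias = np.abs(model.mean() - obs.mean()) / np.abs(obs.mean())
    return float(np.exp(-rel_bias))

# Illustrative "observations" (e.g. a benchmark GPP climatology) and models.
obs = np.array([1.0, 2.0, 3.0, 4.0])
print(bias_score(obs.copy(), obs))        # -> 1.0 (identical means)
print(bias_score(obs + 1.0, obs) < 1.0)   # -> True (biased model scores lower)
```

The real system computes such scores per grid cell and region and then aggregates them across the four aspects listed above; this sketch shows only the exponential score transform.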
NASA Astrophysics Data System (ADS)
Amran, T. G.; Janitra Yose, Mindy
2018-03-01
As the free-trade ASEAN Economic Community (AEC) brings tougher competition, it is important that Indonesia's automotive industry be highly competitive as well. A logistics performance measurement model was designed as an evaluation tool for automotive component companies to improve their logistics performance in order to compete in the AEC. The design of the model was based on the Logistics Scorecard perspectives and proceeded in two stages: identifying the logistics business strategy to derive the KPIs, and arranging the model. 23 KPIs were obtained. The measurement results can inform policies for improving logistics performance and competitiveness.
ERIC Educational Resources Information Center
Christensen, William Howard
2013-01-01
In 2010, the federal government increased accountability expectations by placing more emphasis on monitoring teacher performance. Using a model that focuses on the New York State teacher evaluation system, which comprises a rubric for observation, local student assessment scores, and state student assessment scores, this…
Personality and Student Performance on Evaluation Methods Used in Business Administration Courses
ERIC Educational Resources Information Center
Lakhal, Sawsen; Sévigny, Serge; Frenette, Éric
2015-01-01
The objective of this study was to verify whether personality (Big Five model) influences performance on the evaluation methods used in business administration courses. A sample of 169 students enrolled in two compulsory undergraduate business courses responded to an online questionnaire. As it is difficult within the same course to assess…
The Evaluation of Teachers' Job Performance Based on Total Quality Management (TQM)
ERIC Educational Resources Information Center
Shahmohammadi, Nayereh
2017-01-01
This study aimed to evaluate teachers' job performance based on total quality management (TQM) model. This was a descriptive survey study. The target population consisted of all primary school teachers in Karaj (N = 2917). Using Cochran formula and simple random sampling, 340 participants were selected as sample. A total quality management…
DOT National Transportation Integrated Search
2010-09-01
This project focused on the evaluation of traffic sign sheeting performance in terms of meeting the nighttime : driver needs. The goal was to develop a nighttime driver needs specification for traffic signs. The : researchers used nighttime sign legi...
Remote control circuit breaker evaluation testing. [for space shuttles
NASA Technical Reports Server (NTRS)
Bemko, L. M.
1974-01-01
Engineering evaluation tests were performed on several models/types of remote control circuit breakers marketed in an attempt to gain some insight into their potential suitability for use on the space shuttle vehicle. Tests included the measurement of several electrical and operational performance parameters under laboratory ambient, space simulation, acceleration and vibration environmental conditions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ascough, II, James Clifford
1992-05-01
The capability to objectively evaluate the design performance of shallow landfill burial (SLB) systems is of great interest across diverse scientific disciplines, including hydrology, engineering, environmental science, and SLB regulation. The goal of this work was to develop and validate a procedure for the nonsubjective evaluation of SLB designs under actual or simulated environmental conditions. A multiobjective decision module (MDM) based on scoring functions (Wymore, 1988) was implemented to evaluate SLB design performance. Input values to the MDM are provided by hydrologic models. The MDM assigns a total score to each SLB design alternative, thereby allowing rapid and repeatable design performance evaluation. The MDM was validated for a wide range of SLB designs under different climatic conditions. Rigorous assessment of SLB performance also requires incorporation of hydrologic probabilistic analysis and hydrologic risk into the overall design. This was accomplished through the development of a frequency analysis module, which allows SLB design event magnitudes to be calculated based on the hydrologic return period. The multiobjective decision and frequency analysis modules were integrated in a decision support system (DSS) framework, SLEUTH (Shallow Landfill Evaluation Using Transport and Hydrology). SLEUTH is a Microsoft Windows™ application written in the Knowledge Pro Windows (Knowledge Garden, Inc., 1991) development language.
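The scoring-function idea behind the MDM can be sketched compactly. Wymore's standard scoring functions are S-shaped curves mapping a raw criterion value onto [0, 1]; the logistic form, criterion values, and weights below are illustrative stand-ins, not SLEUTH's actual functions or data:

```python
import math

def logistic_score(value, baseline, slope=1.0):
    """Map a raw criterion value onto (0, 1); `baseline` scores 0.5.
    An illustrative stand-in for Wymore's S-shaped scoring functions."""
    return 1.0 / (1.0 + math.exp(-slope * (value - baseline)))

def total_score(values, baselines, weights):
    """Weighted average of per-criterion scores for one design alternative."""
    scores = [logistic_score(v, b) for v, b in zip(values, baselines)]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# One hypothetical SLB cover alternative scored on two criteria
# (e.g. drainage performance and erosion resistance, higher = better).
print(round(total_score([2.0, 3.0], [1.0, 1.0], [0.6, 0.4]), 3))  # -> 0.791
```

Ranking alternatives by such a total score is what makes the evaluation rapid and repeatable: the same hydrologic model outputs always map to the same score.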
Evaluation of the PV energy production after 12-years of operating
NASA Astrophysics Data System (ADS)
Bouchakour, Salim; Arab, Amar Hadj; Abdeladim, Kamel; Boulahchiche, Saliha; Amrouche, Said Ould; Razagui, Abdelhak
2018-05-01
This paper presents a simple way to approximately evaluate photovoltaic (PV) array performance degradation. The studied PV arrays have been connected to the local electric grid at the Centre de Developpement des Energies Renouvelables (CDER) in Algiers, Algeria, since June 2004. The PV module model used takes into consideration the module temperature, the effective solar irradiance, the electrical characteristics provided by the manufacturer's data sheet, and the evaluation of the performance coefficient. For the dynamic behavior, we use the Linear Reoriented Coordinates Method (LRCM) to estimate the maximum power point (MPP). The performance coefficient is evaluated, on the one hand, under STC conditions to estimate the dc energy according to the manufacturer's data and, on the other hand, under real conditions using both the monitored data and the LM optimization algorithm, allowing a good degree of accuracy in the estimated dc energy. Applying the developed modeling procedure to the analysis of the monitored data is expected to improve understanding and assessment of the PV performance degradation of the arrays after 12 years of operation.
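A common first-pass degradation indicator, the performance ratio (PR), compares measured energy with what the STC nameplate rating would deliver for the same in-plane irradiation. The figures below are invented for illustration and are not CDER's monitored data:

```python
def performance_ratio(e_ac_kwh, p_stc_kw, irradiation_kwh_m2, g_stc=1.0):
    """PV performance ratio: measured energy divided by the energy a
    loss-free array at its STC rating would produce for the same
    in-plane irradiation (g_stc = 1 kW/m2)."""
    reference_yield = irradiation_kwh_m2 / g_stc   # equivalent sun hours
    array_yield = e_ac_kwh / p_stc_kw              # kWh per kWp
    return array_yield / reference_yield

# Hypothetical monthly figures for a 9.5 kWp array.
print(round(performance_ratio(e_ac_kwh=1050.0, p_stc_kw=9.5,
                              irradiation_kwh_m2=160.0), 3))  # -> 0.691
```

Tracking PR year over year separates genuine array degradation from weather variability, since the irradiation term normalizes out the resource.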
NASA Astrophysics Data System (ADS)
Field, Robert; Kim, Daehyun; Kelley, Max; LeGrande, Allegra; Worden, John; Schmidt, Gavin
2014-05-01
Observational and theoretical arguments suggest that satellite retrievals of the stable isotope composition of water vapor could be useful for climate model evaluation. The isotopic composition of water vapor is controlled by the same processes that control water vapor amount, but the observed distribution of isotopic composition is distinct from amount itself. This is due to the fractionation that occurs between the abundant H₂¹⁶O isotopes (isotopologues) and the rare and heavy H₂¹⁸O and HDO isotopes during evaporation and condensation. The fractionation physics are much simpler than the underlying moist physics; discrepancies between observed and modeled isotopic fields are more likely due to problems in the latter. Isotopic measurements therefore have the potential for identifying problems that might not be apparent from more conventional measurements. Isotopic tracers have existed in climate models since the 1980s but it is only since the mid 2000s that there have been enough data for meaningful model evaluation in this sense, in the troposphere at least. We have evaluated the NASA GISS ModelE2 general circulation model over the tropics against water isotope (HDO/H₂O) retrievals from the Aura Tropospheric Emission Spectrometer (TES), alongside more conventional measurements. A small ensemble of experiments was performed with physics perturbations to the cumulus and planetary boundary layer schemes, done in the context of the normal model development process. We examined the degree to which model-data agreement could be used to constrain a select group of internal processes in the model, namely condensate evaporation, entrainment strength, and moist convective air mass flux. All are difficult to parameterize, but exert strong influence over model performance. We found that the water isotope composition was significantly more sensitive to physics changes than precipitation, temperature or relative humidity through the depth of the tropical troposphere. 
Among the processes considered, this was most closely, and fairly exclusively, related to mid-tropospheric entrainment strength. This demonstrates that water isotope retrievals have considerable potential alongside more conventional measurements for climate model evaluation and development.
WRF/CMAQ AQMEII3 Simulations of U.S. Regional-Scale Ozone: Sensitivity to Processes and Inputs
Chemical boundary conditions are a key input to regional-scale photochemical models. In this study, performed during the third phase of the Air Quality Model Evaluation International Initiative (AQMEII3), we perform annual simulations over North America with chemical boundary con...
USDA-ARS's Scientific Manuscript database
Watershed models typically are evaluated solely through comparison of in-stream water and nutrient fluxes with measured data using established performance criteria, whereas processes and responses within the interior of the watershed that govern these global fluxes often are neglected. Due to the l...
Key results of battery performance and life tests at Argonne National Laboratory
NASA Astrophysics Data System (ADS)
Deluca, W. H.; Gillie, K. R.; Kulaga, J. E.; Smaga, J. A.; Tummillo, A. F.; Webster, C. E.
1991-12-01
Advanced battery technology evaluations are performed under simulated electric vehicle operating conditions at Argonne National Laboratory's Analysis & Diagnostic Laboratory (ADL). The ADL provides a common basis for both performance characterization and life evaluation, with unbiased application of tests and analyses. This paper summarizes the performance characterizations and life evaluations conducted in 1991 on twelve single cells and eight 3- to 360-cell modules encompassing six battery technologies (Na/S, Li/MS, Ni/MH, Zn/Br, Ni/Fe, and Pb-Acid). These evaluations were performed for the Department of Energy, Office of Transportation Technologies, Electric and Hybrid Propulsion Division. The results measure progress in battery R&D programs, compare battery technologies, and provide basic data for modeling and continuing R&D to battery users, developers, and program managers.
Santos, Rafael D; Boote, Kenneth J; Sollenberger, Lynn E; Neves, Andre L A; Pereira, Luiz G R; Scherer, Carolina B; Gonçalves, Lucio C
2017-01-01
Forage production is primarily limited by weather conditions under dryland production systems in Brazilian semi-arid regions, therefore sowing at the appropriate time is critical. The objectives of this study were to evaluate the CSM-CERES-Pearl Millet model from the DSSAT software suite for its ability to simulate growth, development, and forage accumulation of pearl millet [Pennisetum glaucum (L.) R.] at three Brazilian semi-arid locations, and to use the model to study the impact of different sowing dates on pearl millet performance for forage. Four pearl millet cultivars were grown during the 2011 rainy season in field experiments conducted at three Brazilian semi-arid locations, under rainfed conditions. The genetic coefficients of the four pearl millet cultivars were calibrated for the model, and the model performance was evaluated with experimental data. The model was run for 14 sowing dates using long-term historical weather data from three locations, to determine the optimum sowing window. Results showed that performance of the model was satisfactory, as indicated by accurate simulation of crop phenology and forage accumulation against measured data. The optimum sowing window varied among locations depending on rainfall patterns, although showing the same trend for cultivars within a site. The best sowing windows were from 15 April to 15 May for the Bom Conselho location; 12 April to 02 May for Nossa Senhora da Gloria; and 17 April to 25 May for Sao Bento do Una. The model can be used as a tool to evaluate the effect of sowing date on forage pearl millet performance in Brazilian semi-arid conditions.
Bondi, Robert W; Igne, Benoît; Drennen, James K; Anderson, Carl A
2012-12-01
Near-infrared spectroscopy (NIRS) is a valuable tool in the pharmaceutical industry, presenting opportunities for online analyses to achieve real-time assessment of intermediates and finished dosage forms. The purpose of this work was to investigate the effect of experimental designs on prediction performance of quantitative models based on NIRS using a five-component formulation as a model system. The following experimental designs were evaluated: five-level, full factorial (5-L FF); three-level, full factorial (3-L FF); central composite; I-optimal; and D-optimal. The factors for all designs were acetaminophen content and the ratio of microcrystalline cellulose to lactose monohydrate. Other constituents included croscarmellose sodium and magnesium stearate (content remained constant). Partial least squares-based models were generated using data from individual experimental designs that related acetaminophen content to spectral data. The effect of each experimental design was evaluated by determining the statistical significance of the difference in bias and standard error of the prediction for that model's prediction performance. The calibration model derived from the I-optimal design had similar prediction performance as did the model derived from the 5-L FF design, despite containing 16 fewer design points. It also outperformed all other models estimated from designs with similar or fewer numbers of samples. This suggested that experimental-design selection for calibration-model development is critical, and optimum performance can be achieved with efficient experimental designs (i.e., optimal designs).
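The two statistics used to compare the designs, prediction bias and standard error of prediction (SEP), are simple to compute once a calibration model has produced predictions for an independent test set. The observed and predicted values below are hypothetical, not the paper's data:

```python
import numpy as np

def bias_and_sep(observed, predicted):
    """Prediction bias and standard error of prediction (SEP),
    the two statistics compared across the experimental designs."""
    e = np.asarray(predicted) - np.asarray(observed)
    bias = e.mean()
    sep = np.sqrt(((e - bias) ** 2).sum() / (len(e) - 1))  # bias-corrected
    return float(bias), float(sep)

obs = np.array([10.0, 12.0, 14.0, 16.0])    # % acetaminophen, reference assay
pred = np.array([10.2, 12.1, 14.3, 16.2])   # hypothetical NIR predictions
b, s = bias_and_sep(obs, pred)
print(round(b, 3), round(s, 3))   # -> 0.2 0.082
```

Comparing designs then amounts to testing whether the bias and SEP of one design's calibration model differ significantly from another's on the same test set.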
Comparison of in silico models for prediction of mutagenicity.
Bakhtyari, Nazanin G; Raitano, Giuseppa; Benfenati, Emilio; Martin, Todd; Young, Douglas
2013-01-01
Using a dataset of more than 6000 compounds, the performance of eight quantitative structure-activity relationship (QSAR) tools was evaluated: ACD/Tox Suite; ADMET (Absorption, Distribution, Metabolism, Elimination, and Toxicity of chemical substances) Predictor; Derek; Toxicity Estimation Software Tool (T.E.S.T.); TOxicity Prediction by Komputer Assisted Technology (TOPKAT); Toxtree; CAESAR; and SARpy (SAR in python). In general, the results showed a high level of performance. To obtain a realistic estimate of predictive ability, results for chemicals inside and outside the training set of each model were considered separately. The effect of applicability domain tools (when available) on prediction accuracy was also evaluated. The predictive tools included QSAR models, knowledge-based systems, and combinations of both methods. Models based on statistical QSAR methods gave the better results.
Comparison of AERMOD and CALPUFF models for simulating SO2 concentrations in a gas refinery.
Atabi, Farideh; Jafarigol, Farzaneh; Moattar, Faramarz; Nouri, Jafar
2016-09-01
In this study, SO2 concentrations from a gas refinery located in complex terrain were calculated with the steady-state AERMOD model and the non-steady-state CALPUFF model. First, SO2 concentrations emitted from 16 refinery stacks were obtained by field measurements at nine receptors across four seasons. The simulated SO2 ambient concentrations from each model were then compared with the observed concentrations, and the model results were compared with each other. The evaluation of the two models was based on statistical analysis and Q-Q plots. The statistical parameters and Q-Q plots show that the performance of both models in simulating SO2 concentrations in the region can be considered acceptable. The composite ratio of simulated to observed values across receptors, for all four averaging times, is 0.72 for AERMOD and 0.89 for CALPUFF. In the complex topographic conditions, CALPUFF thus offers better agreement with the observed concentrations.
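The composite ratio and Q-Q comparison described above can be sketched as follows. The receptor values are invented for illustration, not the refinery measurements:

```python
def composite_ratio(simulated, observed):
    """Mean ratio of simulated to observed concentrations (closer to 1 is better)."""
    ratios = [s / o for s, o in zip(simulated, observed)]
    return sum(ratios) / len(ratios)

def qq_pairs(simulated, observed):
    """Quantile-quantile pairs: each series sorted independently, as in a Q-Q plot."""
    return sorted(simulated), sorted(observed)

# Hypothetical SO2 concentrations (ug/m3) at nine receptors
obs = [12.0, 8.0, 15.0, 20.0, 5.0, 9.0, 11.0, 14.0, 7.0]
sim = [10.0, 7.5, 13.0, 16.0, 4.5, 8.0, 9.5, 12.0, 6.5]
ratio = composite_ratio(sim, obs)
```

A ratio below 1, as here, indicates overall underprediction; plotting the `qq_pairs` output shows where in the concentration distribution the bias occurs.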
Testability of evolutionary game dynamics based on experimental economics data
NASA Astrophysics Data System (ADS)
Wang, Yijia; Chen, Xiaojie; Wang, Zhijian
In order to better understand the dynamic processes of a real game system, an appropriate dynamics model is needed, so evaluating the validity of a model is not a trivial task. Here, we demonstrate an approach that uses the macroscopic dynamic patterns of angular momentum and speed as measurement variables to evaluate the validity of various dynamics models. Using data from real-time Rock-Paper-Scissors (RPS) game experiments, we obtain the experimental dynamic patterns and then derive the corresponding theoretical dynamic patterns from a series of typical dynamics models. By testing the goodness of fit between the experimental and theoretical patterns, the validity of the models can be evaluated. One result of our case study is that, among all the nonparametric models tested, the well-known Replicator dynamics model performs almost worst, while the Projection dynamics model performs best. Besides providing new empirical macroscopic patterns of social dynamics, we demonstrate that the approach can be an effective and rigorous tool for testing game dynamics models. Funding: Fundamental Research Funds for the Central Universities (SSEYI2014Z) and the National Natural Science Foundation of China (Grant No. 61503062).
Performance and Simulation of a Stand-alone Parabolic Trough Solar Thermal Power Plant
NASA Astrophysics Data System (ADS)
Mohammad, S. T.; Al-Kayiem, H. H.; Assadi, M. K.; Gilani, S. I. U. H.; Khlief, A. K.
2018-05-01
In this paper, a Simulink® Thermolib model is established to simulate and evaluate the performance of a stand-alone parabolic trough solar thermal power plant at Universiti Teknologi PETRONAS, Malaysia. The paper proposes a design for a 1.2 kW parabolic trough power plant. The model can predict the temperature at any system outlet in the plant, as well as the power output produced. The inputs to the model are the local solar radiation and ambient temperature, measured throughout the year, together with the collector dimensions and the site's latitude and altitude. The results are presented graphically to describe the variations of the various solar-field outputs and to help predict the performance of the plant. The developed model allows an initial evaluation of the viability and technical feasibility of any similar solar thermal power plant.
Yang, Yan; Onishi, Takeo; Hiramatsu, Ken
2014-01-01
Simulation results of the widely used temperature-index snowmelt model are strongly influenced by the input air temperature data. Spatially sparse air temperature data remain the main source of uncertainty and error in that model, which limits its applications. To address this problem, we created new air temperature data using linear regression relationships formulated from MODIS land surface temperature data. The Soil and Water Assessment Tool model, which includes an improved temperature-index snowmelt module, was chosen to test the newly created data by evaluating simulation performance for daily snowmelt in three test basins of the Amur River. The coefficient of determination (R²) and the Nash-Sutcliffe efficiency (NSE) were used for evaluation. The results indicate that MODIS land surface temperature data can be used as a new source for air temperature data creation, which will improve snow simulation using the temperature-index model in areas with sparse air temperature observations. PMID:25165746
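The Nash-Sutcliffe efficiency (NSE) used above can be computed as in this minimal sketch. The daily snowmelt values are hypothetical, not the Amur basin data:

```python
def nash_sutcliffe(observed, simulated):
    """NSE: 1 is a perfect fit; values <= 0 mean the model is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))  # model error
    sst = sum((o - mean_obs) ** 2 for o in observed)              # variance about the mean
    return 1.0 - sse / sst

# Hypothetical daily snowmelt series (mm)
obs = [0.0, 1.2, 3.5, 5.0, 2.1, 0.4]
sim = [0.1, 1.0, 3.0, 5.5, 2.5, 0.3]
nse = nash_sutcliffe(obs, sim)
```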
DOE Office of Scientific and Technical Information (OSTI.GOV)
Patton, A.D.; Ayoub, A.K.; Singh, C.
1982-07-01
Existing methods for generating-capacity reliability evaluation do not explicitly recognize a number of operating considerations that may have important effects on system reliability performance. Thus, current methods may yield estimates of system reliability that differ appreciably from actual observed reliability. Further, current methods offer no means of accurately studying or evaluating alternatives that differ in one or more operating considerations. Operating considerations important in generating-capacity reliability evaluation include: unit duty cycles as influenced by load cycle shape, reliability performance of other units, unit commitment policy, and operating reserve policy; unit start-up failures distinct from unit running failures; unit start-up times; and unit outage postponability and the management of postponable outages. A detailed Monte Carlo simulation computer model called GENESIS and two analytical models called OPCON and OPPLAN have been developed that are capable of incorporating the effects of many operating considerations, including those noted above. These computer models have been used to study a variety of actual and synthetic systems and are available from EPRI. The new models are shown to produce system reliability indices that differ appreciably from index values computed using traditional models that do not recognize operating considerations.
Quantifying Parkinson's disease progression by simulating gait patterns
NASA Astrophysics Data System (ADS)
Cárdenas, Luisa; Martínez, Fabio; Atehortúa, Angélica; Romero, Eduardo
2015-12-01
Modern rehabilitation protocols for most neurodegenerative diseases, in particular Parkinson's disease, rely on clinical analysis of gait patterns. Currently, such analysis is highly dependent on both the examiner's expertise and the type of evaluation, so the development of evaluation methods with objective measures is crucial. Physical models are a powerful alternative for quantifying movement patterns and for emulating disease progression and the performance of specific treatments. This work introduces a novel quantification of Parkinson's disease progression using a physical model that accurately represents the main gait biomarker, the body's center of gravity (CoG). The model tracks the whole gait cycle: a coupled double inverted pendulum emulates the leg swing during the single-support phase, and a spring-damper system (SDP) recreates both legs in contact with the ground during the double-support phase. The patterns generated by the proposed model are compared with actual ones learned from 24 subjects in stages 2, 3, and 4. The evaluation demonstrates better performance of the proposed model compared with a baseline model (SP) composed of a coupled double pendulum and a mass-spring system. The Fréchet distance measured differences between model estimations and real trajectories, showing for stages 2, 3, and 4 distances of 0.137, 0.155, and 0.38 for the baseline, and 0.07, 0.09, and 0.29 for the proposed method.
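The trajectory comparison above uses the Fréchet distance; a standard discrete Fréchet distance between two polylines can be sketched as follows. The trajectories are illustrative, not the study's CoG data:

```python
from functools import lru_cache
from math import dist

def discrete_frechet(p, q):
    """Discrete Fréchet distance between two polyline trajectories p and q."""
    @lru_cache(maxsize=None)
    def c(i, j):
        # Coupling measure: max pointwise distance along the best simultaneous walk.
        d_ij = dist(p[i], q[j])
        if i == 0 and j == 0:
            return d_ij
        if i == 0:
            return max(c(0, j - 1), d_ij)
        if j == 0:
            return max(c(i - 1, 0), d_ij)
        return max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d_ij)

    return c(len(p) - 1, len(q) - 1)

# Hypothetical CoG trajectories: model estimate vs. measured gait path (x, y)
model_traj = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0)]
measured   = [(0.0, 0.1), (1.0, 0.0), (2.0, 0.1)]
d = discrete_frechet(model_traj, measured)
```

A smaller distance means the model trajectory stays closer to the measured one over the whole gait cycle, which is how the stage-wise values in the abstract are compared.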
NASA Astrophysics Data System (ADS)
Krishnan, Govindarajapuram Subramaniam
1997-12-01
The National Aeronautics and Space Administration (NASA), the European Space Agency (ESA), and the Canadian Space Agency (CSA) missions involve the performance of scientific experiments in space. Instruments used in such experiments are fabricated using electronic parts such as microcircuits, inductors, capacitors, diodes, and transistors. For instruments to perform reliably, the selection of commercial parts must be monitored and strictly controlled. The process currently used to achieve this goal is a manual review and approval of every part used to build the instrument. This system for selecting and approving parts for space applications is inefficient, inconsistent, slow, tedious, and very costly. In this dissertation, a computer-based decision support model is developed to implement this process using artificial intelligence concepts based on current expert knowledge. Such a model results in greater consistency, accuracy, and timeliness of evaluation. This study presents the development methodology and features of the model, and analyzes data on the model's performance in the field. The model was evaluated for three different part types by experts from three different space agencies. The results show that the model was more consistent than manual evaluation for all part types considered. The study concludes with a cost-benefit analysis showing that implementing the model will result in significant cost savings. Other implementation details are highlighted.
Evaluation and intercomparison of air quality forecasts over Korea during the KORUS-AQ campaign
NASA Astrophysics Data System (ADS)
Lee, Seungun; Park, Rokjin J.; Kim, Soontae; Song, Chul H.; Kim, Cheol-Hee; Woo, Jung-Hun
2017-04-01
We evaluate and intercompare ozone and aerosol simulations over Korea during the KORUS-AQ campaign, which was conducted in May-June 2016. Four global and regional air quality models participated in the campaign and provided daily air quality forecasts over Korea to guide aircraft flight paths for detecting air pollution events over the Korean Peninsula and its nearby oceans. We first evaluate model performance by comparing simulated and observed hourly surface ozone and PM2.5 concentrations at ground sites in Korea and find that the models successfully capture intermittent air pollution events and reproduce the daily variation of ozone and PM2.5 concentrations. However, significant underestimates of peak afternoon ozone concentrations are also found in most models. Among the chemical constituents of PM2.5, the models typically overestimate observed nitrate aerosol concentrations and underestimate organic aerosol concentrations, although the observed total PM2.5 mass concentrations are seemingly reproduced by the models. In particular, all models used the same anthropogenic emission inventory (KU-CREATE) for the daily air quality forecast, yet they show a considerable discrepancy for ozone and aerosols. Compared to the individual model results, the ensemble mean of all models shows the best performance, with correlation coefficients of 0.73 for ozone and 0.57 for PM2.5. We investigate the factors contributing to the discrepancy, which will serve as guidance to improve the performance of the air quality forecast.
Decision-relevant evaluation of climate models: A case study of chill hours in California
NASA Astrophysics Data System (ADS)
Jagannathan, K. A.; Jones, A. D.; Kerr, A. C.
2017-12-01
The past decade has seen a proliferation of climate datasets, with over 60 climate models currently in use. Comparative evaluation and validation of models can help practitioners choose the most appropriate models for adaptation planning. However, such assessments are usually conducted for `climate metrics' such as seasonal temperature, while sectoral decisions are often based on `decision-relevant outcome metrics' such as growing degree days or chill hours. Since climate models predict different metrics with varying skill, the goal of this research is to conduct a bottom-up evaluation of model skill for `outcome-based' metrics. Using chill hours (the number of winter hours with temperature below 45°F) in Fresno, CA as a case, we assess how well different GCMs predict the historical mean and slope of chill hours, and whether and to what extent projections differ based on model selection. We then compare our results with other climate-based evaluations of the region to identify similarities and differences. For the model skill evaluation, historically observed chill hours were compared with simulations from 27 GCMs (and multiple ensembles). Model skill scores were generated based on a statistical hypothesis test of the comparative assessment. Future projections from RCP 8.5 runs were evaluated, and a simple bias correction was also conducted. Our analysis indicates that model skill in predicting the chill hour slope depends on skill in predicting the mean chill hours, a consequence of the non-linear nature of the chill metric. However, there was no clear relationship between the models that performed well for the chill hour metric and those that performed well in other temperature-based evaluations (such as winter minimum temperature or diurnal temperature range).
Further, contrary to conclusions from other studies, we also found that the multi-model mean or large ensemble mean results may not always be most appropriate for this outcome metric. Our assessment sheds light on key differences between global versus local skill, and broad versus specific skill of climate models, highlighting that decision-relevant model evaluation may be crucial for providing practitioners with the best available climate information for their specific needs.
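The chill hours metric defined above (winter hours below 45°F) is straightforward to compute from an hourly temperature series; a minimal sketch with made-up temperatures:

```python
def chill_hours(hourly_temps_f, threshold_f=45.0):
    """Count hours with temperature strictly below the threshold (deg F)."""
    return sum(1 for t in hourly_temps_f if t < threshold_f)

# Hypothetical hourly winter temperatures (deg F)
temps = [40.0, 42.5, 44.9, 45.0, 47.3, 50.1, 43.0]
hours = chill_hours(temps)
```

Because the count is thresholded, small biases in a model's simulated temperatures translate non-linearly into chill hour errors, which is the non-linearity the abstract points to.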
NASA Astrophysics Data System (ADS)
Maulana, I.; Sumarto; Nurafiati, P.; Puspita, R. H.
2018-02-01
This research aims to evaluate the industrial apprenticeship (Prakerin) programme in electrical engineering. The evaluation covers the four CIPP variables: (1) Context: (a) programme planning, (b) design; (2) Input: (a) readiness of students, (b) performance of vocational education teachers, (c) facilities and infrastructure; (3) Process: (a) student performance, (b) mentor performance; (4) Product: (a) students' readiness for work. This is a programme evaluation study using the Stake model approach. Data were collected using questionnaires with closed questions and guided question lists.
NASA Astrophysics Data System (ADS)
Couvidat, Florian; Bessagnet, Bertrand; Garcia-Vivanco, Marta; Real, Elsa; Menut, Laurent; Colette, Augustin
2018-01-01
A new aerosol module was developed and integrated into the air quality model CHIMERE. Developments include the use of the Model of Emissions of Gases and Aerosols from Nature (MEGAN) 2.1 for biogenic emissions, the implementation of the inorganic thermodynamic model ISORROPIA 2.1, a revision of the wet deposition processes and of the condensation/evaporation and coagulation algorithms, and the implementation of the secondary organic aerosol (SOA) mechanism H2O and the thermodynamic model SOAP. Particle concentrations over Europe were simulated by the model for the year 2013 and compared to European Monitoring and Evaluation Programme (EMEP) observations and other observations available in the EBAS database to evaluate model performance. Performance was determined for several particle components (sea salt, sulfate, ammonium, nitrate, organic aerosol) with a seasonal and regional analysis of the results. The model gives satisfactory performance in general. For sea salt, the model succeeds in reproducing the seasonal evolution of concentrations for western and central Europe. For sulfate, except for an overestimation in northern Europe, modeled concentrations are close to observations and the model reproduces the seasonal evolution of concentrations. For organic aerosol, the model satisfactorily reproduces concentrations at stations with strong modeled biogenic SOA concentrations. However, the model strongly overestimates ammonium nitrate concentrations during late autumn (possibly due to problems in the temporal evolution of emissions) and strongly underestimates summer organic aerosol concentrations at most stations (especially in the northern half of Europe). This underestimation could be due to missing anthropogenic SOA or biogenic emissions in northern Europe. A list of recommended tests and developments to improve the model is also given.
Detailed performance and environmental monitoring of aquifer heating and cooling systems
NASA Astrophysics Data System (ADS)
Acuna, José; Ahlkrona, Malva; Zandin, Hanna; Singh, Ashutosh
2016-04-01
The project aims to quantify the performance and environmental impact of large-scale aquifer thermal energy storage, and to provide recommendations for operating future systems and estimating their environmental footprint. Field measurements, tests of innovative equipment, and advanced modelling work and analysis will be performed. The following aspects are introduced and covered in the presentation: -Thermal, chemical, and microbiological influence of aquifer thermal energy storage systems: measurement and evaluation of real conditions and of the influence of one system in operation. -Follow-up of energy extraction from the aquifer compared to projected values, with recommendations for improvements. -Evaluation of the most widely used thermal modelling tool for design and calculation of groundwater temperatures, with calculations using MODFLOW/MT3DMS. -Test and evaluation of optical fiber cables as a way to measure temperatures in aquifer thermal energy storages.
Performance evaluation of wireless communications through capsule endoscope.
Takizawa, Kenichi; Aoyagi, Takahiro; Hamaguchi, Kiyoshi; Kohno, Ryuji
2009-01-01
This paper presents a performance evaluation of wireless communications applicable to a capsule endoscope. A numerical model describing the received signal strength (RSS) radiated from a capsule-sized signal generator is derived through measurements using a liquid phantom with equivalent electrical constants. By introducing this model and taking into account the directional pattern of the capsule and the propagation distance between the implanted capsule and the on-body antenna, a cumulative distribution function (CDF) of the received SNR is evaluated. Simulation results for the error ratio in the wireless channel are then obtained. These results show that frequencies of 611 MHz or lower would be useful for capsule endoscope applications from the viewpoint of error-rate performance. Furthermore, we show that the use of antenna diversity brings additional gain to this application.
Evaluation as a critical factor of success in local public health accreditation programs.
Tremain, Beverly; Davis, Mary; Joly, Brenda; Edgar, Mark; Kushion, Mary L; Schmidt, Rita
2007-01-01
This article presents the variety of approaches used to conduct evaluations of performance improvement or accreditation systems, while illustrating the complexity of conducting evaluations to inform local public health practice. In addition, we hope to inform the Exploring Accreditation Program about relevant experiences involving accreditation and performance assessment processes, specifically evaluation, as it debates and discusses a national voluntary model. A background of each state is given. To explore these issues further, interviews were conducted with each state's evaluator to gain more in-depth information on the many different evaluation strategies and approaches used. On the basis of the interviews, the authors identify several overall themes, which suggest that evaluation is a critical tool and success factor for performance assessment or accreditation programs.
NASA Technical Reports Server (NTRS)
Layland, J. W.
1974-01-01
An approximate analysis of the effect of a noisy carrier reference on the performance of sequential decoding is presented. The analysis uses previously developed techniques for evaluating noisy reference performance for medium-rate uncoded communications, adapted to sequential decoding for data rates of 8 to 2048 bits/s. In estimating the 10⁻⁴ deletion-probability thresholds for Helios, the model agrees with experimental data to within the experimental tolerances. The computational problem involved in sequential decoding, carrier loop effects, the main characteristics of the medium-rate model, modeled decoding performance, and perspectives on future work are discussed.
DECIDE: a software for computer-assisted evaluation of diagnostic test performance.
Chiecchio, A; Bo, A; Manzone, P; Giglioli, F
1993-05-01
The evaluation of the performance of clinical tests is a complex problem involving many steps and many statistical tools, not always structured into an organic and rational system. This paper presents software that provides an organic system of statistical tools to help evaluate clinical test performance. The program allows (a) the building and organization of a working database, (b) the selection of the minimal set of tests with the maximum information content, (c) the search for the model that best fits the distribution of test values, (d) the selection of the optimal diagnostic cut-off value of the test for every positive/negative situation, and (e) the evaluation of the performance of combinations of correlated and uncorrelated tests. The uncertainty associated with all the variables involved is evaluated. The program runs in an MS-DOS environment with an EGA or better graphics card.
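Step (d), selecting an optimal diagnostic cut-off, is commonly done by maximizing Youden's J (sensitivity + specificity − 1). The abstract does not state which criterion DECIDE uses, so this sketch is one common assumption, with invented test values:

```python
def youden_optimal_cutoff(values, labels, candidate_cutoffs):
    """Pick the cutoff maximizing Youden's J = sensitivity + specificity - 1.

    values: test results; labels: 1 = diseased, 0 = healthy.
    Assumes higher values indicate disease.
    """
    best_cutoff, best_j = None, -1.0
    for c in candidate_cutoffs:
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= c)
        fn = sum(1 for v, y in zip(values, labels) if y == 1 and v < c)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < c)
        fp = sum(1 for v, y in zip(values, labels) if y == 0 and v >= c)
        j = tp / (tp + fn) + tn / (tn + fp) - 1.0
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j

# Hypothetical test values with known disease status
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
labs = [0,   0,   0,   1,   0,   1,   1,   1]
cutoff, j = youden_optimal_cutoff(vals, labs, [2.5, 3.5, 4.5, 5.5])
```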
The “Dry-Run” Analysis: A Method for Evaluating Risk Scores for Confounding Control
Wyss, Richard; Hansen, Ben B.; Ellis, Alan R.; Gagne, Joshua J.; Desai, Rishi J.; Glynn, Robert J.; Stürmer, Til
2017-01-01
A propensity score (PS) model's ability to control confounding can be assessed by evaluating covariate balance across exposure groups after PS adjustment. The optimal strategy for evaluating a disease risk score (DRS) model's ability to control confounding is less clear. DRS models cannot be evaluated through balance checks within the full population, and they are usually assessed through prediction diagnostics and goodness-of-fit tests. A proposed alternative is the “dry-run” analysis, which divides the unexposed population into “pseudo-exposed” and “pseudo-unexposed” groups so that differences on observed covariates resemble differences between the actual exposed and unexposed populations. With no exposure effect separating the pseudo-exposed and pseudo-unexposed groups, a DRS model is evaluated by its ability to retrieve an unconfounded null estimate after adjustment in this pseudo-population. We used simulations and an empirical example to compare traditional DRS performance metrics with the dry-run validation. In simulations, the dry run often improved assessment of confounding control, compared with the C statistic and goodness-of-fit tests. In the empirical example, PS and DRS matching gave similar results and showed good performance in terms of covariate balance (PS matching) and controlling confounding in the dry-run analysis (DRS matching). The dry-run analysis may prove useful in evaluating confounding control through DRS models. PMID:28338910
Performance bounds on parallel self-initiating discrete-event simulations
NASA Technical Reports Server (NTRS)
Nicol, David M.
1990-01-01
We consider the use of massively parallel architectures to execute discrete-event simulations of what we term self-initiating models. A logical process in a self-initiating model schedules its own state re-evaluation times, independently of any other logical process, and sends its new state to other logical processes following the re-evaluation. Our interest is in the effects of that communication on synchronization. We consider the performance of various synchronization protocols by deriving upper and lower bounds on optimal performance, upper bounds on Time Warp's performance, and lower bounds on the performance of a new conservative protocol. The analysis of Time Warp includes the overhead costs of state-saving and rollback, and identifies sufficient conditions for the conservative protocol to outperform Time Warp. The analysis also quantifies the sensitivity of performance to message fan-out, lookahead ability, and the probability distributions underlying the simulation.
MODELING AND PERFORMANCE EVALUATION FOR AVIATION SECURITY CARGO INSPECTION QUEUING SYSTEM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Allgood, Glenn O; Olama, Mohammed M; Rose, Terri A
Beginning in 2010, the U.S. will require that all cargo loaded on passenger aircraft be inspected. This will require more efficient processing of cargo and will have a significant impact on the inspection protocols and business practices of government agencies and the airlines. In this paper, we conduct a performance evaluation study of an aviation security cargo inspection queuing system for material flow and accountability. The overall performance of the aviation security cargo inspection system is computed, analyzed, and optimized for different system dynamics. Various performance measures are considered, such as system capacity, residual capacity, and throughput. These metrics are performance indicators of the system's ability to service current needs and its capacity to respond to additional requests. The increased physical understanding resulting from execution of the queuing model with these vetted performance measures will reduce the overall cost and shipping delays associated with the new inspection requirements.
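The abstract does not specify the queuing formulation used; as a hedged stand-in, a single-lane M/M/1 model illustrates how utilization, time in system, and residual (spare) capacity relate for an inspection lane:

```python
def mm1_metrics(arrival_rate, service_rate):
    """Basic M/M/1 queue metrics, an illustrative stand-in for one inspection lane.

    arrival_rate, service_rate: jobs per hour (hypothetical units).
    """
    assert arrival_rate < service_rate, "queue must be stable"
    rho = arrival_rate / service_rate       # utilization
    l = rho / (1 - rho)                     # mean number of jobs in the system
    w = 1 / (service_rate - arrival_rate)   # mean time in system (hours)
    residual = service_rate - arrival_rate  # spare service capacity (jobs/hour)
    return rho, l, w, residual

# Hypothetical lane: 8 pallets/hour arriving, inspection capacity 10 pallets/hour
rho, l, w, residual = mm1_metrics(arrival_rate=8.0, service_rate=10.0)
```

As utilization approaches 1, time in system grows without bound, which is why residual capacity is a useful indicator of the ability to absorb additional inspection demand.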
Kasthurirathne, Suranga N; Dixon, Brian E; Gichoya, Judy; Xu, Huiping; Xia, Yuni; Mamlin, Burke; Grannis, Shaun J
2017-05-01
Existing approaches to derive decision models from plaintext clinical data frequently depend on medical dictionaries as the sources of potential features. Prior research suggests that decision models developed using non-dictionary-based feature sourcing approaches and "off the shelf" tools can predict cancer with performance metrics between 80% and 90%. We sought to compare non-dictionary-based models to models built using features derived from medical dictionaries. We evaluated the detection of cancer cases from free-text pathology reports using decision models built with combinations of dictionary or non-dictionary feature sourcing approaches, 4 feature subset sizes, and 5 classification algorithms. Each decision model was evaluated using the following performance metrics: sensitivity, specificity, accuracy, positive predictive value, and area under the receiver operating characteristic (ROC) curve. Decision models parameterized using dictionary and non-dictionary feature sourcing approaches produced performance metrics between 70% and 90%. The source of features and the feature subset size had no impact on the performance of a decision model. Our study suggests there is little value in leveraging medical dictionaries to extract features for decision model building; decision models built using features extracted from the plaintext reports themselves achieve results comparable to those built using medical dictionaries. Overall, this suggests that existing "off the shelf" approaches can be leveraged to perform accurate cancer detection using less complex Named Entity Recognition (NER) based feature extraction, automated feature selection, and modeling approaches.
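The performance metrics listed above can all be computed from a binary confusion matrix, as in this sketch. The labels are invented, not the pathology corpus:

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy, and PPV from binary predictions (1 = cancer)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),          # recall on true cancer cases
        "specificity": tn / (tn + fp),          # recall on non-cancer cases
        "accuracy": (tp + tn) / len(y_true),
        "ppv": tp / (tp + fp),                  # positive predictive value
    }

# Hypothetical cancer/non-cancer labels for ten pathology reports
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
m = classification_metrics(y_true, y_pred)
```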
Teaching project: a low-cost swine model for chest tube insertion training.
Netto, Fernando Antonio Campelo Spencer; Sommer, Camila Garcia; Constantino, Michael de Mello; Cardoso, Michel; Cipriani, Raphael Flávio Fachini; Pereira, Renan Augusto
2016-02-01
To describe and evaluate the acceptance of a low-cost chest tube insertion porcine model in a medical education project in southwestern Paraná, Brazil. We developed a low-cost, low-technology porcine model for teaching chest tube insertion and used it in a teaching project. Medical trainees (students and residents) received theoretical instruction about the procedure and performed thoracic drainage on the porcine model. After performing the procedure, participants completed a feedback questionnaire about the proposed experimental model. This study presents the model and analyzes the questionnaire responses. Seventy-nine medical trainees used and evaluated the model. The anatomical correlation between the porcine model and human anatomy was rated high, averaging 8.1±1.0 among trainees. All study participants approved the low-cost porcine model for chest tube insertion. The presented low-cost porcine model for chest tube insertion training was feasible and well accepted among trainees, and has potential as a teaching tool in medical education.
NASA Astrophysics Data System (ADS)
Song, Chi; Zhang, Xuejun; Zhang, Xin; Hu, Haifei; Zeng, Xuefeng
2017-06-01
A rigid conformal (RC) lap can smooth mid-spatial-frequency (MSF) errors, which are naturally smaller than the tool size, while still removing large-scale errors in a short time. However, RC-lap smoothing efficiency is poorer than expected, and existing smoothing models cannot explicitly specify methods to improve it. We present an explicit time-dependent smoothing evaluation model containing specific smoothing parameters derived directly from the parametric smoothing model and the Preston equation. Based on the time-dependent model, we propose a strategy to improve RC-lap smoothing efficiency, incorporating the theoretical model, tool optimization, and efficiency-limit determination. Two sets of smoothing experiments were performed to demonstrate the smoothing efficiency achieved using the time-dependent smoothing model. A high, theory-like tool influence function and a limiting tool speed of 300 RPM were obtained.
Network Performance Evaluation Model for assessing the impacts of high-occupancy vehicle facilities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Janson, B.N.; Zozaya-Gorostiza, C.; Southworth, F.
1986-09-01
A model to assess the impacts of major high-occupancy vehicle (HOV) facilities on regional energy consumption and vehicle air pollution emissions in urban areas is developed and applied. This model can be used to forecast and compare the impacts of alternative HOV facility design and operation plans on traffic patterns, travel costs, mode choice, travel demand, energy consumption, and vehicle emissions. The model is designed to show differences in the overall impacts of alternative HOV facility types, locations, and operation plans rather than to serve as a tool for detailed engineering design and traffic planning studies. The Network Performance Evaluation Model (NETPEM) combines several urban transportation planning models within a multi-modal network equilibrium framework, including modules with which to define the type, location, and use policy of the HOV facility to be tested, and to assess the impacts of this facility.
A comprehensive mechanistic model for upward two-phase flow in wellbores
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sylvester, N.D.; Sarica, C.; Shoham, O.
1994-05-01
A comprehensive model is formulated to predict the flow behavior for upward two-phase flow. This model is composed of a model for flow-pattern prediction and a set of independent mechanistic models for predicting such flow characteristics as holdup and pressure drop in bubble, slug, and annular flow. The comprehensive model is evaluated by using a well data bank made up of 1,712 well cases covering a wide variety of field data. Model performance is also compared with six commonly used empirical correlations and the Hasan-Kabir mechanistic model. Overall model performance is in good agreement with the data. In comparison with other methods, the comprehensive model performed the best.
A new global and comprehensive model for ICU ventilator performances evaluation.
Marjanovic, Nicolas S; De Simone, Agathe; Jegou, Guillaume; L'Her, Erwan
2017-12-01
This study aimed to provide a new global and comprehensive evaluation of recent ICU ventilators, taking into account both technical performance and ergonomics. Six recent ICU ventilators were evaluated. Technical performance was assessed under two FIO2 levels (100%, 50%), three respiratory mechanics combinations (normal: compliance [C] = 70 mL/cmH2O, resistance [R] = 5 cmH2O/(L/s); restrictive: C = 30, R = 10; obstructive: C = 120, R = 20), four exponentially increasing leak levels (from 0 to 12.5 L/min) and three levels of inspiratory effort (P0.1 = 2, 4 and 8 cmH2O), using an automated test lung. Ergonomics were evaluated by 20 ICU physicians using a global and comprehensive model involving physiological responses to stress (heart rate, respiratory rate, tidal volume variability and eye tracking), psycho-cognitive scales (SUS and NASA-TLX) and objective task completion. Few differences in technical performance were observed between devices. Non-invasive ventilation modes had a large influence on asynchrony occurrence. Using our global model, objective task completion, psycho-cognitive scales and/or physiological measurements each revealed significant differences in device usability. The level of failure observed with some devices reflected a lack of adaptation of device development to end users' needs. Despite similar technical performance, some ICU ventilators exhibit poor ergonomic performance and a high risk of misuse.
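The three respiratory mechanics combinations above imply very different expiratory time constants (tau = R x C), which is one way to see why obstructive settings stress a ventilator's triggering and cycling. A minimal sketch; the unit conversion and the tau formula are standard respiratory mechanics, not taken from the abstract:

```python
# Expiratory time constant tau = R * C for each test-lung mechanics setting.
# Units: C in mL/cmH2O (converted to L/cmH2O), R in cmH2O/(L/s) -> tau in s.
settings = {
    "normal":      {"C_mL_per_cmH2O": 70,  "R_cmH2O_per_L_s": 5},
    "restrictive": {"C_mL_per_cmH2O": 30,  "R_cmH2O_per_L_s": 10},
    "obstructive": {"C_mL_per_cmH2O": 120, "R_cmH2O_per_L_s": 20},
}

def time_constant_s(c_ml, r):
    # convert mL -> L, then tau = C * R
    return (c_ml / 1000.0) * r

for name, p in settings.items():
    tau = time_constant_s(p["C_mL_per_cmH2O"], p["R_cmH2O_per_L_s"])
    print(f"{name}: tau = {tau:.2f} s")
```

The obstructive combination yields a time constant roughly seven times the normal one, consistent with its role as the hardest bench condition.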
Evaluation of Lithofacies Up-Scaling Methods for Probabilistic Prediction of Carbon Dioxide Behavior
NASA Astrophysics Data System (ADS)
Park, J. Y.; Lee, S.; Lee, Y. I.; Kihm, J. H.; Kim, J. M.
2017-12-01
Behavior of carbon dioxide injected into target reservoir (storage) formations is highly dependent on heterogeneities in geologic lithofacies and properties. These heterogeneous lithofacies and properties are inherently probabilistic, so their probabilistic evaluation has to be incorporated properly into predictions of the behavior of injected carbon dioxide in heterogeneous storage formations. In this study, three-dimensional geologic modeling is first performed using SKUA-GOCAD (ASGA and Paradigm) to establish lithofacies models of the Janggi Conglomerate in the Janggi Basin, Korea within a modeling domain. The Janggi Conglomerate is composed of mudstone, sandstone, and conglomerate, and it has been identified as a potential reservoir rock (clastic saline formation) for geologic carbon dioxide storage. Its lithofacies information is obtained from four boreholes and used in lithofacies modeling. Three different up-scaling methods (i.e., nearest to cell center, largest proportion, and random) are applied, and lithofacies modeling is performed 100 times for each up-scaling method. The lithofacies models are then compared and analyzed against the borehole data to evaluate the relative suitability of the three up-scaling methods. Finally, the lithofacies models are converted into coarser lithofacies models within the same modeling domain with larger grid blocks using the three up-scaling methods, and a series of multiphase thermo-hydrological numerical simulations is performed using TOUGH2-MP (Zhang et al., 2008) to probabilistically predict the behavior of injected carbon dioxide. The coarser lithofacies models are also compared and analyzed against the borehole data and finer lithofacies models to evaluate the relative suitability of the three up-scaling methods.
Three-dimensional geologic modeling, up-scaling, and multiphase thermo-hydrological numerical simulation, as the linked methodologies presented in this study, can be utilized as a practical probabilistic evaluation tool to predict the behavior of injected carbon dioxide and even to analyze its leakage risk. This work was supported by the Korea CCS 2020 Project of the Korea Carbon Capture and Sequestration R&D Center (KCRC) funded by the National Research Foundation (NRF), Ministry of Science and ICT (MSIT), Korea.
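The "largest proportion" up-scaling method named above can be sketched as picking, for each coarse grid block, the lithofacies that occupies the most fine cells inside it. A minimal illustration; the facies codes and block contents are hypothetical:

```python
from collections import Counter

def upscale_largest_proportion(fine_cells):
    """Coarse-block lithofacies = the facies occupying the largest
    proportion of the fine cells inside the block (i.e., the mode).
    fine_cells: list of facies codes, e.g. 'mud', 'sand', 'congl'."""
    return Counter(fine_cells).most_common(1)[0][0]

# One coarse block covering six fine cells of a hypothetical model
block = ["sand", "sand", "mud", "congl", "sand", "mud"]
print(upscale_largest_proportion(block))  # prints 'sand' (3 of 6 cells)
```

The "nearest to cell center" method would instead keep the facies of the single fine cell at the block center, and the "random" method would sample one fine cell at random.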
An ARM data-oriented diagnostics package to evaluate the climate model simulation
NASA Astrophysics Data System (ADS)
Zhang, C.; Xie, S.
2016-12-01
A set of diagnostics that utilize long-term high frequency measurements from the DOE Atmospheric Radiation Measurement (ARM) program is developed for evaluating the regional simulation of clouds, radiation and precipitation in climate models. The diagnostics results are computed and visualized automatically in a python-based package that aims to serve as an easy entry point for evaluating climate simulations using the ARM data, as well as the CMIP5 multi-model simulations. Basic performance metrics are computed to measure the accuracy of mean state and variability of simulated regional climate. The evaluated physical quantities include vertical profiles of clouds, temperature, relative humidity, cloud liquid water path, total column water vapor, precipitation, sensible and latent heat fluxes, radiative fluxes, aerosol and cloud microphysical properties. Process-oriented diagnostics focusing on individual cloud and precipitation-related phenomena are developed for the evaluation and development of specific model physical parameterizations. Application of the ARM diagnostics package will be presented in the AGU session. This work is performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, IM release number is: LLNL-ABS-698645.
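Basic performance metrics of the kind described (accuracy of mean state and variability) commonly reduce to bias, RMSE and correlation between simulated and observed series. A hedged sketch with toy numbers; the package's actual metric definitions may differ:

```python
import math

def bias(model, obs):
    """Mean difference between simulated and observed values."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)

def rmse(model, obs):
    """Root-mean-square error of the simulation against observations."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))

def corr(model, obs):
    """Pearson correlation between simulated and observed series."""
    n = len(obs)
    mm, mo = sum(model) / n, sum(obs) / n
    cov = sum((m - mm) * (o - mo) for m, o in zip(model, obs))
    sm = math.sqrt(sum((m - mm) ** 2 for m in model))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    return cov / (sm * so)

obs   = [1.0, 2.0, 3.0, 4.0]   # e.g. observed ARM precipitation samples
model = [1.2, 1.9, 3.4, 3.9]   # simulated values at the same times
print(bias(model, obs), rmse(model, obs), corr(model, obs))
```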
NASA Astrophysics Data System (ADS)
Sivavaraprasad, G.; Venkata Ratnam, D.
2017-07-01
Ionospheric delay is one of the major atmospheric effects on the performance of satellite-based radio navigation systems. It limits the accuracy and availability of Global Positioning System (GPS) measurements used in critical societal and safety applications. The temporal and spatial gradients of ionospheric total electron content (TEC) are driven by several a priori unknown geophysical conditions and solar-terrestrial phenomena, making the prediction of ionospheric delay particularly challenging over the Indian subcontinent. An appropriate short/long-term ionospheric delay forecasting model is therefore necessary. The intent of this paper is to forecast ionospheric delays by considering day-to-day, monthly and seasonal ionospheric TEC variations. GPS-TEC data (January 2013-December 2013) are extracted from a multi-frequency GPS receiver established at K L University, Vaddeswaram, Guntur station (geographic: 16.37°N, 80.37°E; geomagnetic: 7.44°N, 153.75°E), India. An evaluation of the forecasting capabilities of three ionospheric time-delay models - an Auto Regressive Moving Average (ARMA) model, an Auto Regressive Integrated Moving Average (ARIMA) model, and a Holt-Winters model - is presented. The performance of these models is evaluated through error-measurement analysis during both geomagnetically quiet and disturbed days. The ARMA model is found to forecast the ionospheric delay with an accuracy of 82-94%, about 10% better than the ARIMA and Holt-Winters models. Moreover, modeled VTEC derived from the International Reference Ionosphere (IRI-2012) model and the new global TEC model, the Neustrelitz TEC Model (NTCM-GL), is compared with the VTEC forecast by the ARMA, ARIMA and Holt-Winters models during geomagnetically quiet days. The forecast results indicate that the ARMA model would be useful for setting up an early warning system for ionospheric disturbances at low latitudes.
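The error-measurement comparison described above can be sketched with a toy one-step-ahead forecast. Here simple exponential smoothing stands in for the fitted ARMA/ARIMA/Holt-Winters models, and the VTEC values are hypothetical, so this is a sketch of the accuracy calculation rather than the paper's method:

```python
def ses_forecasts(series, alpha=0.5):
    """One-step-ahead simple exponential smoothing forecasts
    (a toy stand-in for the fitted time-series models in the study)."""
    level, preds = series[0], []
    for x in series[1:]:
        preds.append(level)                 # forecast next value = current level
        level = alpha * x + (1 - alpha) * level
    return preds

def accuracy_percent(preds, actual):
    """Forecast accuracy as 100 * (1 - mean absolute percentage error)."""
    mape = sum(abs(p - a) / abs(a) for p, a in zip(preds, actual)) / len(actual)
    return 100.0 * (1.0 - mape)

tec = [12.0, 14.0, 15.0, 13.0, 12.5, 14.5]  # hypothetical VTEC samples (TECU)
preds = ses_forecasts(tec)
print(round(accuracy_percent(preds, tec[1:]), 1))
```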
Weykamp, Cas; Siebelder, Carla
2017-11-01
HbA1c is a key parameter in diabetes management. For years the test has been used exclusively for monitoring of long-term diabetic control. However, due to improvement of the performance, HbA1c is considered more and more for diagnosis and screening. With this new application, quality demands further increase. A task force of the International Federation of Clinical Chemistry and Laboratory Medicine developed a model to set and evaluate quality targets for HbA1c. The model is based on the concept of total error and takes into account the major sources of analytical errors in the medical laboratory: bias and imprecision. Performance criteria are derived from sigma-metrics and biological variation. This review shows 2 examples of the application of the model: at the level of single laboratories, and at the level of a group of laboratories. In the first example data of 125 individual laboratories of a recent external quality assessment program in the Netherlands are evaluated. Differences between laboratories as well as their relation to method principles are shown. The second example uses recent and 3-year-old data of the proficiency test of the College of American Pathologists. The differences in performance between 26 manufacturer-related groups of laboratories are shown. Over time these differences are quite consistent although some manufacturers improved substantially either by better standardization or by replacing a test. The IFCC model serves all who are involved in HbA1c testing in the ongoing process of better performance and better patient care.
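The total-error and sigma-metric concepts underlying the IFCC model can be sketched as follows. The formulas sigma = (TEa - |bias|) / CV and TE = |bias| + k * CV are the standard forms from the quality-control literature, and the HbA1c assay figures are hypothetical:

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma-metric from allowable total error (TEa), bias and
    imprecision (CV), all in percent: sigma = (TEa - |bias|) / CV."""
    return (tea_pct - abs(bias_pct)) / cv_pct

def total_error(bias_pct, cv_pct, k=2.0):
    """Total analytical error estimate TE = |bias| + k * CV
    (k = 2 corresponds to roughly 95% coverage)."""
    return abs(bias_pct) + k * cv_pct

# Hypothetical HbA1c assay: TEa = 6 %, bias = 1 %, CV = 1.5 %
print(sigma_metric(6.0, 1.0, 1.5))   # about 3.3 sigma
print(total_error(1.0, 1.5))         # 4.0 % total error
```

A laboratory whose total error stays inside TEa at the chosen sigma level would meet the quality target; the worked numbers here are illustrative only.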
NASA Technical Reports Server (NTRS)
Abbott, Mark R.
1996-01-01
Our first activity is based on delivery of code to Bob Evans (University of Miami) for integration and eventual delivery to the MODIS Science Data Support Team. As we noted in our previous semi-annual report, coding required the development and analysis of an end-to-end model of fluorescence line height (FLH) errors and sensitivity. This model is described in a paper in press at Remote Sensing of Environment. Once the code was delivered to Miami, we continued to use this error analysis to evaluate proposed changes in MODIS sensor specifications and performance. Simply evaluating such changes band by band may obscure the true impacts of changes in sensor performance that are manifested in the complete algorithm. This is especially true of FLH, which is sensitive to band placement and width. The error model will be used by Howard Gordon (Miami) to evaluate the effects of absorbing aerosols on FLH algorithm performance. Presently, FLH relies only on simple corrections for atmospheric effects (viewing geometry, Rayleigh scattering) without correcting for aerosols. Our analysis suggests that aerosols should have a small impact relative to changes in the quantum yield of fluorescence in phytoplankton. However, absorbing aerosol is a new process and will be evaluated by Gordon.
NASA Astrophysics Data System (ADS)
Camp, H. A.; Moyer, Steven; Moore, Richard K.
2010-04-01
The Night Vision and Electronic Sensors Directorate's (NVESD's) current time-limited search (TLS) model, which uses the targeting task performance (TTP) metric to describe image quality, does not explicitly account for the effects of visual clutter on observer performance. The TLS model is currently based on empirical fits describing human performance for a given time of day, spectrum and environment. Incorporating a clutter metric into the TLS model may reduce the number of empirical fits needed. The masked target transform volume (MTTV) clutter metric has previously been presented and compared with other clutter metrics. Using real infrared imagery of rural scenes with varying levels of clutter, NVESD is currently evaluating the appropriateness of the MTTV metric. NVESD had twenty subject matter experts (SMEs) rank the amount of clutter in each scene in a series of pair-wise comparisons. MTTV metric values were calculated and then compared with the SME rankings. The MTTV metric ranked the clutter similarly to the SME evaluation, suggesting that the MTTV metric may emulate SME responses. This paper is a first step toward quantifying clutter and measuring agreement with subjective human evaluation.
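Agreement between metric values and SME rankings of the kind described can be quantified with a rank correlation. A minimal sketch using Spearman's rho on hypothetical values; the paper does not state which agreement statistic was used:

```python
def ranks(values):
    """Rank positions (1 = smallest); assumes no ties for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman_rho(x, y):
    """Spearman rank correlation from squared rank differences."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

mttv = [0.12, 0.45, 0.33, 0.80, 0.51]  # hypothetical MTTV clutter values
sme  = [1, 3, 2, 5, 4]                 # hypothetical SME clutter rankings
print(spearman_rho(mttv, sme))         # 1.0: perfect rank agreement
```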
Rahman, M Shafiqur; Ambler, Gareth; Choodari-Oskooei, Babak; Omar, Rumana Z
2017-04-18
When developing a prediction model for survival data, it is essential to validate its performance in external validation settings using appropriate performance measures. Although a number of such measures have been proposed, there is only limited guidance regarding their use in the context of model validation. This paper reviewed and evaluated a wide range of performance measures to provide some guidelines for their use in practice. An extensive simulation study based on two clinical datasets was conducted to investigate the performance of the measures in external validation settings. Measures were selected from categories that assess the overall performance, discrimination and calibration of a survival prediction model. Some of these have been modified to allow their use with validation data, and a case study is provided to describe how these measures can be estimated in practice. The measures were evaluated with respect to their robustness to censoring and ease of interpretation. All measures are implemented, or are straightforward to implement, in statistical software. Most of the performance measures were reasonably robust to moderate levels of censoring. One exception was Harrell's concordance measure, which tended to increase as censoring increased. We recommend that Uno's concordance measure be used to quantify concordance when there are moderate levels of censoring. Alternatively, Gönen and Heller's measure could be considered, especially if censoring is very high, but we suggest that the prediction model be re-calibrated first. We also recommend that Royston's D be routinely reported to assess discrimination, since it has an appealing interpretation. The calibration slope is useful for both internal and external validation settings, and we recommend reporting it routinely. We also recommend using any of the predictive accuracy measures and providing the corresponding predictive accuracy curves.
In addition, we recommend investigating the characteristics of the validation data, such as the level of censoring and the distribution of the prognostic index derived in the validation setting, before choosing the performance measures.
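Harrell's concordance measure discussed above counts concordant comparable pairs under right censoring. A minimal pure-Python sketch with hypothetical data; pairs with tied times are simply skipped here for brevity:

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance for right-censored survival data.
    A pair (i, j) is comparable if the earlier time is an observed event;
    it is concordant if the shorter survival has the higher risk score."""
    concordant = tied = comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(i + 1, n):
            # order the pair so that `a` has the earlier time
            a, b = (i, j) if times[i] < times[j] else (j, i)
            if times[a] == times[b] or not events[a]:
                continue  # not comparable: tied times, or censored first
            comparable += 1
            if risk_scores[a] > risk_scores[b]:
                concordant += 1
            elif risk_scores[a] == risk_scores[b]:
                tied += 1
    return (concordant + 0.5 * tied) / comparable

times  = [2, 4, 6, 8]          # survival or censoring times
events = [1, 1, 0, 1]          # 1 = event observed, 0 = censored
risks  = [0.9, 0.7, 0.4, 0.1]  # higher score = higher predicted risk
print(harrell_c(times, events, risks))  # 1.0: ranking matches survival order
```

The censoring sensitivity noted in the abstract arises because censored-first pairs drop out of the comparable set, which is what Uno's weighted variant corrects for.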
NASA Astrophysics Data System (ADS)
Odera, Patroba Achola; Fukuda, Yoichi
2017-09-01
The performance of Gravity field and steady-state Ocean Circulation Explorer (GOCE) global gravity field models (GGMs), at the end of the GOCE mission covering 42 months, is evaluated using geoid undulations and free-air gravity anomalies over Japan, including six sub-regions (Hokkaido, north Honshu, central Honshu, west Honshu, Shikoku and Kyushu). Seventeen GOCE-based GGMs are evaluated and compared with EGM2008. The evaluations are carried out at spherical harmonic degrees 150, 180, 210, 240 and 270. Results show that EGM2008 performs better than GOCE and related GGMs in Japan and three sub-regions (Hokkaido, central Honshu and Kyushu). However, GOCE and related GGMs perform better than EGM2008 in north Honshu, west Honshu and Shikoku up to degree 240. This means that GOCE data can improve the geoid model over half of Japan. The improvement is only evident between degrees 150 and 240, beyond which EGM2008 performs better than GOCE GGMs in all six regions. In general, the latest GOCE GGMs (releases 4 and 5) perform better than the earlier GOCE GGMs (releases 1, 2 and 3), indicating the contribution of data collected by GOCE in the last months before the mission ended on 11 November 2013. The results indicate that a more accurate geoid model over Japan is achievable based on a combination of GOCE, EGM2008 and terrestrial gravity data sets. [Figure not available: see full text. Caption: Standard deviations of the differences between observed and GGM-implied (a) free-air gravity anomalies over Japan, (b) geoid undulations over Japan; n represents the spherical harmonic degree.]
NASA Technical Reports Server (NTRS)
Mathur, F. P.
1972-01-01
Description of an on-line interactive computer program called CARE (Computer-Aided Reliability Estimation) which can model self-repairing and fault-tolerant organizations and perform certain other functions. Essentially, CARE consists of a repository of mathematical equations defining the various basic redundancy schemes. Under program control, these equations are interrelated to generate the desired mathematical model to fit the architecture of the system under evaluation. The mathematical model is then supplied with ground instances of its variables and evaluated to generate values for the reliability-theoretic functions applied to the model.
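The kind of redundancy-scheme equations CARE stores can be illustrated with module reliability r as the ground instance. Triple modular redundancy (TMR) and general m-of-n majority voting are standard textbook forms, shown here as a sketch rather than CARE's actual code:

```python
from math import comb

def r_simplex(r):
    """Reliability of a single, non-redundant module."""
    return r

def r_tmr(r):
    """Triple modular redundancy with a perfect voter:
    the system survives if at least 2 of 3 modules work."""
    return 3 * r**2 - 2 * r**3

def r_nmr(r, n, m):
    """N-modular redundancy: system works if at least m of n modules work."""
    return sum(comb(n, k) * r**k * (1 - r) ** (n - k) for k in range(m, n + 1))

r = 0.95  # ground instance of module reliability, as CARE would be supplied
print(r_simplex(r), round(r_tmr(r), 5), round(r_nmr(r, 5, 3), 5))
# TMR beats the simplex module; 3-of-5 voting beats TMR
```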
NASA Astrophysics Data System (ADS)
El Akbar, R. Reza; Anshary, Muhammad Adi Khairul; Hariadi, Dennis
2018-02-01
Model MACP for HE ver.1 describes how to measure and monitor performance in higher education. A review of research related to the model identified several components to develop in further research, so this research has four main objectives: first, to differentiate the CSF (critical success factor) components of the previous model; second, to explore the KPIs (key performance indicators) in the previous model; third, building on the previous objectives, to design a new and more detailed model; and fourth, to design a prototype application for performance measurement in higher education based on the new model. The methods used are exploratory research and application design using prototyping. The first result is a new, more detailed model for measuring and monitoring performance in higher education, obtained by differentiating and exploring Model MACP for HE ver.1. The second result is a dictionary of higher-education performance measures, compiled by re-evaluating the existing indicators. The third result is the design of a prototype application for performance measurement in higher education.
Kentel, Behzat B; King, Mark A; Mitchell, Sean R
2011-11-01
A torque-driven, subject-specific 3-D computer simulation model of the impact phase of one-handed tennis backhand strokes was evaluated by comparing performance and simulation results. Backhand strokes of an elite subject were recorded on an artificial tennis court. Over the 50-ms period after impact, good agreement was found with an overall RMS difference of 3.3° between matching simulation and performance in terms of joint and racket angles. Consistent with previous experimental research, the evaluation process showed that grip tightness and ball impact location are important factors that affect postimpact racket and arm kinematics. Associated with these factors, the model can be used for a better understanding of the eccentric contraction of the wrist extensors during one-handed backhand ground strokes, a hypothesized mechanism of tennis elbow.
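The 3.3° figure above is an RMS difference between matching simulation and performance angle time histories. A minimal sketch of that comparison with hypothetical wrist-angle samples:

```python
import math

def rms_difference(sim_angles, measured_angles):
    """RMS difference (degrees) between matching simulation and
    performance time histories of a joint or racket angle."""
    diffs = [(s - m) ** 2 for s, m in zip(sim_angles, measured_angles)]
    return math.sqrt(sum(diffs) / len(diffs))

# Hypothetical wrist-angle samples over the 50 ms after impact (degrees)
measured  = [10.0, 12.5, 15.0, 16.0]
simulated = [10.5, 12.0, 15.5, 16.5]
print(rms_difference(simulated, measured))  # 0.5
```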
USDA-ARS?s Scientific Manuscript database
1. To evaluate the performance of visible and near-infrared (Vis/NIR) spectroscopic models for discriminating true pale, soft and exudative (PSE), normal and dark, firm and dry (DFD) broiler breast meat in different conditions of preprocessing methods, spectral ranges, characteristic wavelength sele...
Evaluating Rater Accuracy in Rater-Mediated Assessments Using an Unfolding Model
ERIC Educational Resources Information Center
Wang, Jue; Engelhard, George, Jr.; Wolfe, Edward W.
2016-01-01
The number of performance assessments continues to increase around the world, and it is important to explore new methods for evaluating the quality of ratings obtained from raters. This study describes an unfolding model for examining rater accuracy. Accuracy is defined as the difference between observed and expert ratings. Dichotomous accuracy…
USDA-ARS?s Scientific Manuscript database
Originally developed for simulating soybean growth and development, the CROPGRO model was recently re-parameterized for cotton. However, further efforts are necessary to evaluate the model's performance against field measurements for new environments and management options. The objective of this stu...
Source emission and model evaluation of formaldehyde from baby furniture in the full scale chamber
This paper describes the measurement and model evaluation of formaldehyde source emissions from composite and solid wood furniture in a full-scale chamber at different ventilation rates for up to 4000 h using ASTM D 6670-01 (2007). Tests were performed on four types of furniture ...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-01-30
... tool. The PBP analysis tool is a cash-flow model for evaluating alternative financing arrangements, and is... that reflects adequate consideration to the Government for the improved contractor cash flow...
Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam
2017-01-01
Aims and Objectives: The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT)-derived images relative to plaster models for mixed dentition analysis. Materials and Methods: Thirty CBCT-derived images and thirty plaster models were obtained from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analyses along with Student's t-test were performed to evaluate the data, and P < 0.05 was considered statistically significant. Results: Statistically significant differences were found between CBCT-derived images and plaster models; the mean for Moyer's analysis in the left and right lower arch was 21.2 mm and 21.1 mm for CBCT and 22.5 mm and 22.5 mm for the plaster models, respectively. Conclusion: CBCT-derived images were less reliable than data obtained directly from plaster models for mixed dentition analysis. PMID:28852639
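The Tanaka-Johnston analysis used in the study predicts the combined width of the unerupted canine and premolars from the four mandibular incisors. A sketch of the commonly cited prediction equation (half the incisor sum plus 11.0 mm maxillary or 10.5 mm mandibular); the patient values are hypothetical:

```python
def tanaka_johnston(sum_lower_incisors_mm):
    """Tanaka-Johnston prediction of the combined mesiodistal width of the
    unerupted canine and premolars in one quadrant (mm): half the sum of
    the four mandibular incisor widths, plus 11.0 mm for the maxillary
    arch or 10.5 mm for the mandibular arch."""
    half = sum_lower_incisors_mm / 2.0
    return {"maxillary": half + 11.0, "mandibular": half + 10.5}

# Hypothetical patient: lower incisors sum to 23.0 mm
print(tanaka_johnston(23.0))  # {'maxillary': 22.5, 'mandibular': 22.0}
```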
Data envelopment analysis in service quality evaluation: an empirical study
NASA Astrophysics Data System (ADS)
Najafi, Seyedvahid; Saati, Saber; Tavana, Madjid
2015-09-01
Service quality is often conceptualized as the comparison between service expectations and actual performance perceptions. It enhances customer satisfaction, decreases customer defection, and promotes customer loyalty. Substantial literature has examined the concept of service quality, its dimensions, and its measurement methods. We introduce the perceived service quality index (PSQI) as a single measure for evaluating the multiple-item service quality construct based on the SERVQUAL model. A slack-based measure (SBM) of efficiency with constant inputs is used to calculate the PSQI. In addition, a non-linear programming model based on the SBM is proposed to delineate an improvement guideline and improve service quality. An empirical study is conducted to assess the applicability of the proposed method. A large number of studies have used data envelopment analysis (DEA) as a benchmarking tool to measure service quality, but these models do not propose a coherent performance evaluation construct and consequently fail to deliver guidelines for improving service quality. The DEA models proposed in this study are designed to evaluate and improve service quality within a comprehensive framework and without any dependency on external data.
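The raw input to a SERVQUAL-based index like the PSQI is the per-item gap between perception and expectation ratings. A minimal sketch with hypothetical 7-point ratings; the paper's SBM formulation itself is not reproduced here:

```python
def servqual_gaps(expectations, perceptions):
    """Per-item SERVQUAL gap scores (perception - expectation);
    negative gaps indicate service quality below expectation."""
    return [p - e for e, p in zip(expectations, perceptions)]

def mean_gap(expectations, perceptions):
    """Average gap over the items of one service quality dimension."""
    gaps = servqual_gaps(expectations, perceptions)
    return sum(gaps) / len(gaps)

# Hypothetical 7-point ratings for five items of one dimension
expect   = [6, 7, 6, 5, 6]
perceive = [5, 6, 6, 6, 5]
print(servqual_gaps(expect, perceive), mean_gap(expect, perceive))
# [-1, -1, 0, 1, -1] -0.4
```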
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gastelum, Zoe N.; Harvey, Julia B.
In The International Atomic Energy Agency State Evaluation Process: The Role of Information Analysis in Reaching Safeguards Conclusions (Mathews et al. 2008), several examples of nonproliferation models using analytical software were developed that may assist the IAEA with collecting, visualizing, analyzing, and reporting information in support of the State Evaluation Process. This paper focuses on one of those examples: a set of models developed in the Proactive Scenario Production, Evidence Collection, and Testing (ProSPECT) software that evaluates the status and nature of a state's nuclear activities. The models use three distinct subject areas to perform this assessment: the presence of nuclear activities, the consistency of those nuclear activities with national nuclear energy goals, and the geopolitical context in which those nuclear activities are taking place. As a proof of concept for the models, a crude case study was performed. The study, which attempted to evaluate the nuclear activities taking place in Syria prior to September 2007, yielded illustrative yet inconclusive results. Given the inconclusive nature of the case study results, changes that may improve the models' efficiency and accuracy are proposed.
Development of Ku-band rendezvous radar tracking and acquisition simulation programs
NASA Technical Reports Server (NTRS)
1986-01-01
The fidelity of the Space Shuttle Radar tracking simulation model was improved. The data from the Shuttle Orbiter Radar Test and Evaluation (SORTE) program experiments performed at the White Sands Missile Range (WSMR) were reviewed and analyzed. The selected flight rendezvous radar data was evaluated. Problems with the Inertial Line-of-Sight (ILOS) angle rate tracker were evaluated using the improved fidelity angle rate tracker simulation model.
Simulation modeling of route guidance concept
DOT National Transportation Integrated Search
1997-01-01
The methodology of a simulation model developed at the University of New South Wales, Australia, for the evaluation of performance of Dynamic Route Guidance Systems (DRGS) is described. The microscopic simulation model adopts the event update simulat...