Sample records for statistical model evaluation

  1. Linearised and non-linearised isotherm models optimization analysis by error functions and statistical means

    PubMed Central

    2014-01-01

    In adsorption studies, describing the sorption process and identifying the best-fitting isotherm model is a key analysis for investigating the theoretical hypothesis. Hence, numerous statistical analyses have been used to assess the agreement between experimental equilibrium adsorption values and the predicted equilibrium values. Several statistical error analyses were carried out. In the present study, the following statistical analyses were used to evaluate adsorption isotherm model fitness: the Pearson correlation, the coefficient of determination, and the Chi-square test. An ANOVA test was carried out to evaluate the significance of the various error functions, and the coefficient of dispersion was also evaluated for the linearised and non-linearised models. The adsorption of phenol onto natural soil (local name Kalathur soil) was carried out in batch mode at 30 ± 2 °C. To obtain a holistic view of the analysis, the isotherm parameter estimates from the linear and non-linear isotherm models were compared. The results revealed which of the above-mentioned error functions and statistical measures best determined the best-fitting isotherm. PMID:25018878
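
    As an illustration of the kind of error-function comparison described above, the following minimal Python sketch fits a Langmuir isotherm in both its linearised and non-linear forms to synthetic equilibrium data and scores each fit with the coefficient of determination and the non-linear Chi-square error function. The Langmuir form, the synthetic data, and the starting parameters are assumptions for illustration only, not values taken from the cited study.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    # Synthetic equilibrium data (Ce: equilibrium concentration, qe: amount adsorbed).
    Ce = np.array([5.0, 10.0, 20.0, 40.0, 80.0, 160.0])
    qe = np.array([1.8, 3.1, 4.9, 6.6, 7.9, 8.7])

    def langmuir(C, qmax, KL):
        """Non-linear Langmuir isotherm: q = qmax*KL*C / (1 + KL*C)."""
        return qmax * KL * C / (1.0 + KL * C)

    # Non-linear fit.
    (qmax_nl, KL_nl), _ = curve_fit(langmuir, Ce, qe, p0=[10.0, 0.05])
    q_pred_nl = langmuir(Ce, qmax_nl, KL_nl)

    # Linearised Langmuir fit: Ce/qe = Ce/qmax + 1/(qmax*KL), by ordinary least squares.
    slope, intercept = np.polyfit(Ce, Ce / qe, 1)
    qmax_lin, KL_lin = 1.0 / slope, slope / intercept
    q_pred_lin = langmuir(Ce, qmax_lin, KL_lin)

    def r_squared(obs, pred):
        ss_res = np.sum((obs - pred) ** 2)
        ss_tot = np.sum((obs - obs.mean()) ** 2)
        return 1.0 - ss_res / ss_tot

    def chi_square(obs, pred):
        """Non-linear Chi-square error function: sum((qe - qm)^2 / qm)."""
        return np.sum((obs - pred) ** 2 / pred)

    for label, pred in [("non-linear", q_pred_nl), ("linearised", q_pred_lin)]:
        print(f"{label:11s}  R^2 = {r_squared(qe, pred):.4f}  chi^2 = {chi_square(qe, pred):.4f}")
    ```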

  2. Evaluating Item Fit for Multidimensional Item Response Models

    ERIC Educational Resources Information Center

    Zhang, Bo; Stone, Clement A.

    2008-01-01

    This research examines the utility of the s-x[superscript 2] statistic proposed by Orlando and Thissen (2000) in evaluating item fit for multidimensional item response models. Monte Carlo simulation was conducted to investigate both the Type I error and statistical power of this fit statistic in analyzing two kinds of multidimensional test…
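
    Not tied to the s-x² statistic itself, the sketch below shows the generic Monte Carlo recipe referred to in the abstract: simulate data under the null model many times, recompute a fit statistic each time, and record how often it rejects at the nominal level (Type I error); power is estimated the same way with data generated under a misfitting model. The chi-square goodness-of-fit statistic and the simulation settings here are illustrative assumptions only.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n_items, n_reps, alpha = 5, 2000, 0.05
    null_probs = np.full(n_items, 1.0 / n_items)          # null model: uniform categories
    alt_probs = np.array([0.30, 0.25, 0.20, 0.15, 0.10])  # misfitting model for power

    def rejection_rate(true_probs, n_obs=200):
        """Fraction of replications in which the chi-square fit test rejects the null model."""
        rejections = 0
        for _ in range(n_reps):
            counts = rng.multinomial(n_obs, true_probs)
            stat, p_value = stats.chisquare(counts, f_exp=n_obs * null_probs)
            rejections += p_value < alpha
        return rejections / n_reps

    print("Type I error (data from null model):", rejection_rate(null_probs))
    print("Power (data from misfitting model): ", rejection_rate(alt_probs))
    ```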

  3. Statistical and Hydrological evaluation of precipitation forecasts from IMD MME and ECMWF numerical weather forecasts for Indian River basins

    NASA Astrophysics Data System (ADS)

    Mohite, A. R.; Beria, H.; Behera, A. K.; Chatterjee, C.; Singh, R.

    2016-12-01

    Flood forecasting using hydrological models is an important and cost-effective non-structural flood management measure. For forecasting at short lead times, empirical models using real-time precipitation estimates have proven to be reliable. However, their skill depreciates with increasing lead time. Coupling a hydrologic model with real-time rainfall forecasts issued from numerical weather prediction (NWP) systems could increase the lead time substantially. In this study, we compared 1-5 day precipitation forecasts from the India Meteorological Department (IMD) Multi-Model Ensemble (MME) with European Centre for Medium-Range Weather Forecasts (ECMWF) NWP forecasts over 86 major river basins in India. We then evaluated the hydrologic utility of these forecasts over the Basantpur catchment (approx. 59,000 km2) of the Mahanadi River basin. Coupled MIKE 11 RR (NAM) and MIKE 11 hydrodynamic (HD) models were used for the development of the flood forecast system (FFS). The RR model was calibrated using IMD station rainfall data. Cross-sections extracted from SRTM 30 were used as input to the MIKE 11 HD model. IMD started issuing operational MME forecasts in 2008, and hence both the statistical and the hydrologic evaluation were carried out for 2008-2014. The performance of the FFS was evaluated using both NWP datasets separately for the year 2011, which was a large flood year in the Mahanadi River basin. We will present figures and metrics for the statistical (threshold-based statistics, skill in terms of correlation and bias) and hydrologic (Nash-Sutcliffe efficiency, mean and peak error statistics) evaluation. The statistical evaluation will be at pan-India scale for all the major river basins, and the hydrologic evaluation will be for the Basantpur catchment of the Mahanadi River basin.
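
    For readers unfamiliar with the evaluation metrics named above, this short sketch computes the Nash-Sutcliffe efficiency, correlation, and percent bias of a forecast series against observations. The toy arrays are placeholders, not data from the study.

    ```python
    import numpy as np

    obs = np.array([120.0, 340.0, 560.0, 410.0, 280.0, 190.0])   # observed streamflow (m^3/s)
    sim = np.array([100.0, 310.0, 600.0, 450.0, 240.0, 210.0])   # forecast streamflow (m^3/s)

    def nash_sutcliffe(obs, sim):
        """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2); 1 is a perfect fit."""
        return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

    def percent_bias(obs, sim):
        """PBIAS > 0 means the forecasts underestimate the observations on average."""
        return 100.0 * np.sum(obs - sim) / np.sum(obs)

    print("NSE        :", round(nash_sutcliffe(obs, sim), 3))
    print("Correlation:", round(np.corrcoef(obs, sim)[0, 1], 3))
    print("PBIAS (%)  :", round(percent_bias(obs, sim), 2))
    ```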

  4. A BAYESIAN STATISTICAL APPROACH FOR THE EVALUATION OF CMAQ

    EPA Science Inventory

    This research focuses on the application of spatial statistical techniques for the evaluation of the Community Multiscale Air Quality (CMAQ) model. The upcoming release version of the CMAQ model was run for the calendar year 2001 and is in the process of being evaluated by EPA an...

  5. Evaluation of a New Mean Scaled and Moment Adjusted Test Statistic for SEM

    ERIC Educational Resources Information Center

    Tong, Xiaoxiao; Bentler, Peter M.

    2013-01-01

    Recently a new mean scaled and skewness adjusted test statistic was developed for evaluating structural equation models in small samples and with potentially nonnormal data, but this statistic has received only limited evaluation. The performance of this statistic is compared to normal theory maximum likelihood and 2 well-known robust test…

  6. Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial

    EPA Science Inventory

    This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit m...

  7. THE ATMOSPHERIC MODEL EVALUATION TOOL

    EPA Science Inventory

    This poster describes a model evaluation tool that is currently being developed and applied for meteorological and air quality model evaluation. The poster outlines the framework and provides examples of statistical evaluations that can be performed with the model evaluation tool...

  8. 10 CFR 431.445 - Determination of small electric motor efficiency.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... statistical analysis, computer simulation or modeling, or other analytic evaluation of performance data. (3... statistical analysis, computer simulation or modeling, and other analytic evaluation of performance data on.... (ii) If requested by the Department, the manufacturer shall conduct simulations to predict the...

  9. Numerical and Qualitative Contrasts of Two Statistical Models for Water Quality Change in Tidal Waters

    EPA Science Inventory

    Two statistical approaches, weighted regression on time, discharge, and season and generalized additive models, have recently been used to evaluate water quality trends in estuaries. Both models have been used in similar contexts despite differences in statistical foundations and...

  10. Evaluation of airborne lidar data to predict vegetation Presence/Absence

    USGS Publications Warehouse

    Palaseanu-Lovejoy, M.; Nayegandhi, A.; Brock, J.; Woodman, R.; Wright, C.W.

    2009-01-01

    This study evaluates the capabilities of the Experimental Advanced Airborne Research Lidar (EAARL) in delineating vegetation assemblages in Jean Lafitte National Park, Louisiana. Five-meter-resolution grids of bare earth, canopy height, canopy-reflection ratio, and height of median energy were derived from EAARL data acquired in September 2006. Ground-truth data were collected along transects to assess species composition, canopy cover, and ground cover. To decide which model is more accurate, comparisons of general linear models and generalized additive models were conducted using conventional evaluation methods (i.e., sensitivity, specificity, Kappa statistics, and area under the curve) and two new indexes, net reclassification improvement and integrated discrimination improvement. Generalized additive models were superior to general linear models in modeling presence/absence in training vegetation categories, but no statistically significant differences between the two models were achieved in determining the classification accuracy at validation locations using conventional evaluation methods, although statistically significant improvements in net reclassifications were observed. ?? 2009 Coastal Education and Research Foundation.

  11. Heads Up! a Calculation- & Jargon-Free Approach to Statistics

    ERIC Educational Resources Information Center

    Giese, Alan R.

    2012-01-01

    Evaluating the strength of evidence in noisy data is a critical step in scientific thinking that typically relies on statistics. Students without statistical training will benefit from heuristic models that highlight the logic of statistical analysis. The likelihood associated with various coin-tossing outcomes gives students such a model. There…

  12. Evaluating bacterial gene-finding HMM structures as probabilistic logic programs.

    PubMed

    Mørk, Søren; Holmes, Ian

    2012-03-01

    Probabilistic logic programming offers a powerful way to describe and evaluate structured statistical models. To investigate the practicality of probabilistic logic programming for structure learning in bioinformatics, we undertook a simplified bacterial gene-finding benchmark in PRISM, a probabilistic dialect of Prolog. We evaluate Hidden Markov Model structures for bacterial protein-coding gene potential, including a simple null model structure, three structures based on existing bacterial gene finders and two novel model structures. We test standard versions as well as ADPH length modeling and three-state versions of the five model structures. The models are all represented as probabilistic logic programs and evaluated using the PRISM machine learning system in terms of statistical information criteria and gene-finding prediction accuracy, in two bacterial genomes. Neither of our implementations of the two currently most used model structures is the best performing in terms of statistical information criteria or prediction performance, suggesting that better-fitting models might be achievable. The source code of all PRISM models, data and additional scripts are freely available for download at: http://github.com/somork/codonhmm. Supplementary data are available at Bioinformatics online.
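
    As a reminder of how the "statistical information criteria" used to rank the model structures work, the following sketch computes AIC and BIC from a model's maximised log-likelihood, its number of free parameters, and the number of observations. The numbers are made up for illustration and are unrelated to the PRISM gene-finding models.

    ```python
    import math

    def aic(log_likelihood, n_params):
        """Akaike information criterion: lower is better."""
        return 2 * n_params - 2 * log_likelihood

    def bic(log_likelihood, n_params, n_obs):
        """Bayesian information criterion: penalises parameters more strongly for large n."""
        return n_params * math.log(n_obs) - 2 * log_likelihood

    # Hypothetical fits of two alternative model structures to the same data.
    models = {"null structure": (-15230.0, 12), "three-state structure": (-15110.0, 36)}
    n_obs = 50000
    for name, (ll, k) in models.items():
        print(f"{name:22s}  AIC = {aic(ll, k):10.1f}  BIC = {bic(ll, k, n_obs):10.1f}")
    ```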

  13. Bayesian Posterior Odds Ratios: Statistical Tools for Collaborative Evaluations

    ERIC Educational Resources Information Center

    Hicks, Tyler; Rodríguez-Campos, Liliana; Choi, Jeong Hoon

    2018-01-01

    To begin statistical analysis, Bayesians quantify their confidence in modeling hypotheses with priors. A prior describes the probability of a certain modeling hypothesis apart from the data. Bayesians should be able to defend their choice of prior to a skeptical audience. Collaboration between evaluators and stakeholders could make their choices…

  14. Statistical Evaluation of CRM-Simulated Cloud and Precipitation Structures Using Multi- sensor TRMM Measurements and Retrievals

    NASA Astrophysics Data System (ADS)

    Posselt, D.; L'Ecuyer, T.; Matsui, T.

    2009-05-01

    Cloud resolving models are typically used to examine the characteristics of clouds and precipitation and their relationship to radiation and the large-scale circulation. As such, they are not required to reproduce the exact location of each observed convective system, much less each individual cloud. Some of the most relevant information about clouds and precipitation is provided by instruments located on polar-orbiting satellite platforms, but these observations are intermittent "snapshots" in time, making assessment of model performance challenging. In contrast to direct comparison, model results can be evaluated statistically. This avoids the requirement for the model to reproduce the observed systems, while returning valuable information on the performance of the model in a climate-relevant sense. The focus of this talk is a model evaluation study, in which updates to the microphysics scheme used in a three-dimensional version of the Goddard Cumulus Ensemble (GCE) model are evaluated using statistics of observed clouds, precipitation, and radiation. We present the results of multiday (non-equilibrium) simulations of organized deep convection using single- and double-moment versions of the model's cloud microphysical scheme. Statistics of TRMM multi-sensor derived clouds, precipitation, and radiative fluxes are used to evaluate the GCE results, as are simulated TRMM measurements obtained using a sophisticated instrument simulator suite. We present advantages and disadvantages of performing model comparisons in retrieval and measurement space and conclude by motivating the use of data assimilation techniques for analyzing and improving model parameterizations.

  15. An Application of M[subscript 2] Statistic to Evaluate the Fit of Cognitive Diagnostic Models

    ERIC Educational Resources Information Center

    Liu, Yanlou; Tian, Wei; Xin, Tao

    2016-01-01

    The fit of cognitive diagnostic models (CDMs) to response data needs to be evaluated, since CDMs might yield misleading results when they do not fit the data well. Limited-information statistic M[subscript 2] and the associated root mean square error of approximation (RMSEA[subscript 2]) in item factor analysis were extended to evaluate the fit of…

  16. A New Statistic for Evaluating Item Response Theory Models for Ordinal Data. CRESST Report 839

    ERIC Educational Resources Information Center

    Cai, Li; Monroe, Scott

    2014-01-01

    We propose a new limited-information goodness of fit test statistic C[subscript 2] for ordinal IRT models. The construction of the new statistic lies formally between the M[subscript 2] statistic of Maydeu-Olivares and Joe (2006), which utilizes first and second order marginal probabilities, and the M*[subscript 2] statistic of Cai and Hansen…

  17. The epistemology of mathematical and statistical modeling: a quiet methodological revolution.

    PubMed

    Rodgers, Joseph Lee

    2010-01-01

    A quiet methodological revolution, a modeling revolution, has occurred over the past several decades, almost without discussion. In contrast, the 20th century ended with contentious argument over the utility of null hypothesis significance testing (NHST). The NHST controversy may have been at least partially irrelevant, because in certain ways the modeling revolution obviated the NHST argument. I begin with a history of NHST and modeling and their relation to one another. Next, I define and illustrate principles involved in developing and evaluating mathematical models. Following that, I discuss the difference between using statistical procedures within a rule-based framework and building mathematical models from a scientific epistemology. Only the former is treated carefully in most psychology graduate training. The pedagogical implications of this imbalance and the revised pedagogy required to account for the modeling revolution are described. To conclude, I discuss how attention to modeling implies shifting statistical practice in certain progressive ways. The epistemological basis of statistics has moved away from being a set of procedures, applied mechanistically, and moved toward building and evaluating statistical and scientific models. Copyright 2009 APA, all rights reserved.

  18. Best practices for evaluating the capability of nondestructive evaluation (NDE) and structural health monitoring (SHM) techniques for damage characterization

    NASA Astrophysics Data System (ADS)

    Aldrin, John C.; Annis, Charles; Sabbagh, Harold A.; Lindgren, Eric A.

    2016-02-01

    A comprehensive approach to NDE and SHM characterization error (CE) evaluation is presented that follows the framework of the `ahat-versus-a' regression analysis for POD assessment. Characterization capability evaluation is typically more complex than current POD evaluations and thus requires engineering and statistical expertise in the model-building process to ensure all key effects and interactions are addressed. Justifying the statistical model choice with underlying assumptions is key. Several sizing case studies are presented with detailed evaluations of the most appropriate statistical model for each data set. The use of a model-assisted approach is introduced to help assess the reliability of NDE and SHM characterization capability under a wide range of part, environmental and damage conditions. Best practices for using models are presented for both eddy current NDE sizing and vibration-based SHM case studies. The results of these studies highlight the general protocol feasibility, emphasize the importance of evaluating key application characteristics prior to the study, and demonstrate an approach to quantify the role of varying SHM sensor durability and environmental conditions on characterization performance.

  19. Evaluation of Models of the Reading Process.

    ERIC Educational Resources Information Center

    Balajthy, Ernest

    A variety of reading process models have been proposed and evaluated in reading research. Traditional approaches to model evaluation specify the workings of a system in a simplified fashion to enable organized, systematic study of the system's components. Following are several statistical methods of model evaluation: (1) empirical research on…

  20. Evaluating statistical consistency in the ocean model component of the Community Earth System Model (pyCECT v2.0)

    NASA Astrophysics Data System (ADS)

    Baker, Allison H.; Hu, Yong; Hammerling, Dorit M.; Tseng, Yu-heng; Xu, Haiying; Huang, Xiaomeng; Bryan, Frank O.; Yang, Guangwen

    2016-07-01

    The Parallel Ocean Program (POP), the ocean model component of the Community Earth System Model (CESM), is widely used in climate research. Most current work in CESM-POP focuses on improving the model's efficiency or accuracy, such as improving numerical methods, advancing parameterization, porting to new architectures, or increasing parallelism. Since ocean dynamics are chaotic in nature, achieving bit-for-bit (BFB) identical results in ocean solutions cannot be guaranteed for even tiny code modifications, and determining whether modifications are admissible (i.e., statistically consistent with the original results) is non-trivial. In recent work, an ensemble-based statistical approach was shown to work well for software verification (i.e., quality assurance) on atmospheric model data. The general idea of the ensemble-based statistical consistency testing is to use a qualitative measurement of the variability of the ensemble of simulations as a metric with which to compare future simulations and make a determination of statistical distinguishability. The capability to determine consistency without BFB results boosts model confidence and provides the flexibility needed, for example, for more aggressive code optimizations and the use of heterogeneous execution environments. Since ocean and atmosphere models have differing characteristics in terms of dynamics, spatial variability, and timescales, we present a new statistical method to evaluate ocean model simulation data that requires the evaluation of ensemble means and deviations in a spatial manner. In particular, the statistical distribution from an ensemble of CESM-POP simulations is used to determine the standard score of any new model solution at each grid point. Then the percentage of points that have scores greater than a specified threshold indicates whether the new model simulation is statistically distinguishable from the ensemble simulations. Both ensemble size and composition are important. Our experiments indicate that the new POP ensemble consistency test (POP-ECT) tool is capable of distinguishing cases that should be statistically consistent with the ensemble and those that should not, as well as providing a simple, objective, and systematic way to detect errors in CESM-POP due to the hardware or software stack, positively contributing to quality assurance for the CESM-POP code.
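
    A minimal sketch of the gridpoint standard-score test described above: an ensemble of control simulations defines a mean and standard deviation at every grid point, a new run is converted to z-scores against that distribution, and the fraction of points whose |z| exceeds a threshold is compared with a failure tolerance. The ensemble size, threshold, and tolerance below are illustrative assumptions, not the POP-ECT settings.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_ens, ny, nx = 30, 40, 60                            # ensemble members and grid size (toy values)
    ensemble = rng.normal(15.0, 2.0, (n_ens, ny, nx))     # e.g., a sea-surface temperature field

    mean = ensemble.mean(axis=0)
    std = ensemble.std(axis=0, ddof=1)

    def fraction_failing(new_run, z_threshold=3.0):
        """Fraction of grid points where the new run falls outside the ensemble spread."""
        z = np.abs((new_run - mean) / std)
        return float(np.mean(z > z_threshold))

    consistent_run = rng.normal(15.0, 2.0, (ny, nx))          # drawn from the same population
    perturbed_run = rng.normal(15.0, 2.0, (ny, nx)) + 1.5     # systematically shifted field

    for label, run in [("consistent", consistent_run), ("perturbed", perturbed_run)]:
        frac = fraction_failing(run)
        verdict = "pass" if frac < 0.01 else "FAIL"
        print(f"{label:10s}  {100*frac:5.2f}% of points exceed |z|>3  ->  {verdict}")
    ```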

  1. Evaluating model accuracy for model-based reasoning

    NASA Technical Reports Server (NTRS)

    Chien, Steve; Roden, Joseph

    1992-01-01

    Described here is an approach to automatically assessing the accuracy of various components of a model. In this approach, actual data from the operation of a target system are used to derive statistical measures that evaluate the prediction accuracy of various portions of the model. We describe how these statistical measures of model accuracy can be used in model-based reasoning for monitoring and design. We then describe the application of these techniques to the monitoring and design of the water recovery system of the Environmental Control and Life Support System (ECLSS) of Space Station Freedom.

  2. Development and evaluation of statistical shape modeling for principal inner organs on torso CT images.

    PubMed

    Zhou, Xiangrong; Xu, Rui; Hara, Takeshi; Hirano, Yasushi; Yokoyama, Ryujiro; Kanematsu, Masayuki; Hoshi, Hiroaki; Kido, Shoji; Fujita, Hiroshi

    2014-07-01

    The shapes of the inner organs are important information for medical image analysis. Statistical shape modeling provides a way of quantifying and measuring shape variations of the inner organs in different patients. In this study, we developed a universal scheme that can be used for building the statistical shape models for different inner organs efficiently. This scheme combines the traditional point distribution modeling with a group-wise optimization method based on a measure called minimum description length to provide a practical means for 3D organ shape modeling. In experiments, the proposed scheme was applied to the building of five statistical shape models for hearts, livers, spleens, and right and left kidneys by use of 50 cases of 3D torso CT images. The performance of these models was evaluated by three measures: model compactness, model generalization, and model specificity. The experimental results showed that the constructed shape models had good "compactness" and satisfactory "generalization" performance for different organ shape representations; however, the "specificity" of these models should be improved in the future.

  3. Model Performance Evaluation and Scenario Analysis (MPESA) Tutorial

    EPA Pesticide Factsheets

    The model performance evaluation consists of metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude-only, sequence-only, and combined magnitude-and-sequence errors.

  4. A new test statistic for climate models that includes field and spatial dependencies using Gaussian Markov random fields

    DOE PAGES

    Nosedal-Sanchez, Alvaro; Jackson, Charles S.; Huerta, Gabriel

    2016-07-20

    A new test statistic for climate model evaluation has been developed that potentially mitigates some of the limitations that exist for observing and representing field and space dependencies of climate phenomena. Traditionally such dependencies have been ignored when climate models have been evaluated against observational data, which makes it difficult to assess whether any given model is simulating observed climate for the right reasons. The new statistic uses Gaussian Markov random fields for estimating field and space dependencies within a first-order grid point neighborhood structure. We illustrate the ability of Gaussian Markov random fields to represent empirical estimates of field and space covariances using "witch hat" graphs. We further use the new statistic to evaluate the tropical response of a climate model (CAM3.1) to changes in two parameters important to its representation of cloud and precipitation physics. Overall, the inclusion of dependency information did not alter significantly the recognition of those regions of parameter space that best approximated observations. However, there were some qualitative differences in the shape of the response surface that suggest how such a measure could affect estimates of model uncertainty.

  5. A new test statistic for climate models that includes field and spatial dependencies using Gaussian Markov random fields

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nosedal-Sanchez, Alvaro; Jackson, Charles S.; Huerta, Gabriel

    A new test statistic for climate model evaluation has been developed that potentially mitigates some of the limitations that exist for observing and representing field and space dependencies of climate phenomena. Traditionally such dependencies have been ignored when climate models have been evaluated against observational data, which makes it difficult to assess whether any given model is simulating observed climate for the right reasons. The new statistic uses Gaussian Markov random fields for estimating field and space dependencies within a first-order grid point neighborhood structure. We illustrate the ability of Gaussian Markov random fields to represent empirical estimates of field and space covariances using "witch hat" graphs. We further use the new statistic to evaluate the tropical response of a climate model (CAM3.1) to changes in two parameters important to its representation of cloud and precipitation physics. Overall, the inclusion of dependency information did not alter significantly the recognition of those regions of parameter space that best approximated observations. However, there were some qualitative differences in the shape of the response surface that suggest how such a measure could affect estimates of model uncertainty.

  6. Relevance of the c-statistic when evaluating risk-adjustment models in surgery.

    PubMed

    Merkow, Ryan P; Hall, Bruce L; Cohen, Mark E; Dimick, Justin B; Wang, Edward; Chow, Warren B; Ko, Clifford Y; Bilimoria, Karl Y

    2012-05-01

    The measurement of hospital quality based on outcomes requires risk adjustment. The c-statistic is a popular tool used to judge model performance, but can be limited, particularly when evaluating specific operations in focused populations. Our objectives were to examine the interpretation and relevance of the c-statistic when used in models with increasingly similar case mix and to consider an alternative perspective on model calibration based on a graphical depiction of model fit. From the American College of Surgeons National Surgical Quality Improvement Program (2008-2009), patients were identified who underwent a general surgery procedure, and procedure groups were increasingly restricted: colorectal-all, colorectal-elective cases only, and colorectal-elective cancer cases only. Mortality and serious morbidity outcomes were evaluated using logistic regression-based risk adjustment, and model c-statistics and calibration curves were used to compare model performance. During the study period, 323,427 general, 47,605 colorectal-all, 39,860 colorectal-elective, and 21,680 colorectal cancer patients were studied. Mortality ranged from 1.0% in general surgery to 4.1% in the colorectal-all group, and serious morbidity ranged from 3.9% in general surgery to 12.4% in the colorectal-all procedural group. As case mix was restricted, c-statistics progressively declined from the general to the colorectal cancer surgery cohorts for both mortality and serious morbidity (mortality: 0.949 to 0.866; serious morbidity: 0.861 to 0.668). Calibration was evaluated graphically by examining predicted vs observed number of events over risk deciles. For both mortality and serious morbidity, there was no qualitative difference in calibration identified between the procedure groups. In the present study, we demonstrate how the c-statistic can become less informative and, in certain circumstances, can lead to incorrect model-based conclusions, as case mix is restricted and patients become more homogeneous. Although it remains an important tool, caution is advised when the c-statistic is advanced as the sole measure of model performance. Copyright © 2012 American College of Surgeons. All rights reserved.
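
    For reference, the c-statistic discussed above is the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The brute-force pairwise sketch below makes that definition explicit (it is equivalent to the area under the ROC curve); the toy risks and outcomes are invented for illustration.

    ```python
    import numpy as np

    predicted_risk = np.array([0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.80])
    outcome = np.array([0, 0, 0, 1, 0, 1, 1])   # 1 = event (e.g., mortality)

    def c_statistic(risk, y):
        """Concordance: P(risk_event > risk_nonevent), ties counted as 1/2."""
        events, nonevents = risk[y == 1], risk[y == 0]
        concordant = ties = 0
        for event_risk in events:
            concordant += np.sum(event_risk > nonevents)
            ties += np.sum(event_risk == nonevents)
        return (concordant + 0.5 * ties) / (len(events) * len(nonevents))

    print("c-statistic:", round(c_statistic(predicted_risk, outcome), 3))
    ```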

  7. Evaluation of the Williams-type spring wheat model in North Dakota and Minnesota

    NASA Technical Reports Server (NTRS)

    Leduc, S. (Principal Investigator)

    1982-01-01

    The Williams-type model, developed similarly to previous models of C.V.D. Williams, uses monthly temperature and precipitation data as well as soil and topological variables to predict the yield of the spring wheat crop. The models are statistically developed using the regression technique. Eight model characteristics are examined in the evaluation of the model. Evaluation is at the crop reporting district level, the state level and for the entire region. A ten-year bootstrap test was the basis of the statistical evaluation. The accuracy and the current indication of modeled yield reliability could be improved. There is great variability in the bias measured over the districts, but there is a slight overall positive bias. The model estimates for the east central crop reporting district in Minnesota are not accurate. The estimates of yield for 1974 were inaccurate for all of the models.

  8. Clinical study of the Erlanger silver catheter--data management and biometry.

    PubMed

    Martus, P; Geis, C; Lugauer, S; Böswald, M; Guggenbichler, J P

    1999-01-01

    The clinical evaluation of venous catheters for catheter-induced infections must conform to a strict biometric methodology. The statistical planning of the study (target population, design, degree of blinding), data management (database design, definition of variables, coding), quality assurance (data inspection at several levels) and the biometric evaluation of the Erlanger silver catheter project are described. The three-step data flow included: 1) primary data from the hospital, 2) relational database, 3) files accessible for statistical evaluation. Two different statistical models were compared: analyzing only the first catheter of each patient (independent data) and analyzing several catheters from the same patient (dependent data) by means of the generalized estimating equations (GEE) method. The main result of the study was based on the comparison of both statistical models.

  9. A Comparative Evaluation of Mixed Dentition Analysis on Reliability of Cone Beam Computed Tomography Image Compared to Plaster Model.

    PubMed

    Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam

    2017-01-01

    The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT) obtained image over plaster model for the assessment of mixed dentition analysis. Thirty CBCT-derived images and thirty plaster models were derived from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analysis along with Student's t-test was performed to qualitatively evaluate the data and P < 0.05 was considered statistically significant. Statistically significant results were obtained on data comparison between CBCT-derived images and plaster model; the mean for Moyer's analysis in the left and right lower arch for CBCT and plaster model was 21.2 mm, 21.1 mm and 22.5 mm, 22.5 mm, respectively. CBCT-derived images were less reliable as compared to data obtained directly from plaster model for mixed dentition analysis.
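
    A minimal sketch of the kind of Student's t-test comparison reported above, using invented tooth-width sums for the two measurement sources. The abstract does not state whether a paired or an independent-samples design applies, so an independent two-sample test is shown here as an assumption.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    cbct = rng.normal(21.2, 0.8, 30)     # Moyer's sums from CBCT-derived images (mm)
    plaster = rng.normal(22.5, 0.8, 30)  # Moyer's sums from plaster models (mm)

    t_stat, p_value = stats.ttest_ind(cbct, plaster)
    print(f"mean CBCT = {cbct.mean():.2f} mm, mean plaster = {plaster.mean():.2f} mm")
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}",
          "(significant at 0.05)" if p_value < 0.05 else "(not significant)")
    ```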

  10. Uranium resource assessment through statistical analysis of exploration geochemical and other data. Final report. [Codes EVAL, SURE

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Koch, G.S. Jr.; Howarth, R.J.; Schuenemeyer, J.H.

    1981-02-01

    We have developed a procedure that can help quadrangle evaluators to systematically summarize and use hydrogeochemical and stream sediment reconnaissance (HSSR) and occurrence data. Although we have not provided an independent estimate of uranium endowment, we have devised a methodology that will provide this independent estimate when additional calibration is done by enlarging the study area. Our statistical model for evaluation (system EVAL) ranks uranium endowment for each quadrangle. Because using this model requires experience in geology, statistics, and data analysis, we have also devised a simplified model, presented in the package SURE, a System for Uranium Resource Evaluation. We have developed and tested these models for the four quadrangles in southern Colorado that comprise the study area; to investigate their generality, the models should be applied to other quadrangles. Once they are calibrated with accepted uranium endowments for several well-known quadrangles, the models can be used to give independent estimates for less-known quadrangles. The point-oriented models structure the objective comparison of the quadrangles on the bases of: (1) Anomalies (a) derived from stream sediments, (b) derived from waters (stream, well, pond, etc.), (2) Geology (a) source rocks, as defined by the evaluator, (b) host rocks, as defined by the evaluator, and (3) Aerial radiometric anomalies.

  11. Seasonal Atmospheric and Oceanic Predictions

    NASA Technical Reports Server (NTRS)

    Roads, John; Rienecker, Michele (Technical Monitor)

    2003-01-01

    Several projects associated with dynamical, statistical, single column, and ocean models are presented. The projects include: 1) Regional Climate Modeling; 2) Statistical Downscaling; 3) Evaluation of SCM and NSIPP AGCM Results at the ARM Program Sites; and 4) Ocean Forecasts.

  12. An astronomer's guide to period searching

    NASA Astrophysics Data System (ADS)

    Schwarzenberg-Czerny, A.

    2003-03-01

    We concentrate on analysis of unevenly sampled time series, interrupted by periodic gaps, as often encountered in astronomy. While some of our conclusions may appear surprising, all are based on classical statistical principles of Fisher & successors. Except for discussion of the resolution issues, it is best for the reader to forget temporarily about Fourier transforms and to concentrate on problems of fitting of a time series with a model curve. According to their statistical content we divide the issues into several sections, consisting of: (ii) statistical numerical aspects of model fitting, (iii) evaluation of fitted models as hypotheses testing, (iv) the role of the orthogonal models in signal detection (v) conditions for equivalence of periodograms (vi) rating sensitivity by test power. An experienced observer working with individual objects would benefit little from formalized statistical approach. However, we demonstrate the usefulness of this approach in evaluation of performance of periodograms and in quantitative design of large variability surveys.

  13. On Lack of Robustness in Hydrological Model Development Due to Absence of Guidelines for Selecting Calibration and Evaluation Data: Demonstration for Data-Driven Models

    NASA Astrophysics Data System (ADS)

    Zheng, Feifei; Maier, Holger R.; Wu, Wenyan; Dandy, Graeme C.; Gupta, Hoshin V.; Zhang, Tuqiao

    2018-02-01

    Hydrological models are used for a wide variety of engineering purposes, including streamflow forecasting and flood-risk estimation. To develop such models, it is common to allocate the available data to calibration and evaluation data subsets. Surprisingly, the issue of how this allocation can affect model evaluation performance has been largely ignored in the research literature. This paper discusses the evaluation performance bias that can arise from how available data are allocated to calibration and evaluation subsets. As a first step to assessing this issue in a statistically rigorous fashion, we present a comprehensive investigation of the influence of data allocation on the development of data-driven artificial neural network (ANN) models of streamflow. Four well-known formal data splitting methods are applied to 754 catchments from Australia and the U.S. to develop 902,483 ANN models. Results clearly show that the choice of the method used for data allocation has a significant impact on model performance, particularly for runoff data that are more highly skewed, highlighting the importance of considering the impact of data splitting when developing hydrological models. The statistical behavior of the data splitting methods investigated is discussed and guidance is offered on the selection of the most appropriate data splitting methods to achieve representative evaluation performance for streamflow data with different statistical properties. Although our results are obtained for data-driven models, they highlight the fact that this issue is likely to have a significant impact on all types of hydrological models, especially conceptual rainfall-runoff models.
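
    The allocation effect discussed above can be made concrete with a small experiment: fit the same simple model under many different random calibration/evaluation splits of a skewed series and look at the spread of the evaluation-period error. The linear model and synthetic skewed data are stand-ins, not the ANN models or catchment data of the study.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    n = 400
    rainfall = rng.gamma(shape=2.0, scale=10.0, size=n)          # skewed driver
    runoff = 0.6 * rainfall + rng.normal(0.0, 3.0, size=n)       # synthetic response

    rmses = []
    for _ in range(200):                                          # 200 random allocations
        idx = rng.permutation(n)
        cal, ev = idx[: int(0.7 * n)], idx[int(0.7 * n):]         # 70/30 calibration/evaluation
        slope, intercept = np.polyfit(rainfall[cal], runoff[cal], 1)
        pred = slope * rainfall[ev] + intercept
        rmses.append(np.sqrt(np.mean((runoff[ev] - pred) ** 2)))

    rmses = np.array(rmses)
    print(f"evaluation RMSE across splits: min {rmses.min():.2f}, "
          f"median {np.median(rmses):.2f}, max {rmses.max():.2f}")
    ```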

  14. A nonparametric spatial scan statistic for continuous data.

    PubMed

    Jung, Inkyung; Cho, Ho Jin

    2015-10-20

    Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of the model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compare the performance of the method with parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases under consideration in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.
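
    The Wilcoxon rank-sum statistic at the core of the proposed nonparametric scan can be illustrated outside the spatial-scan machinery: for a candidate cluster, compare the values inside the scanning window against those outside. The two skewed samples below are invented and serve only to show the call.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    inside = rng.lognormal(mean=1.0, sigma=0.6, size=25)    # values inside a candidate cluster
    outside = rng.lognormal(mean=0.6, sigma=0.6, size=200)  # values outside the cluster

    # Wilcoxon rank-sum test (normal approximation); a large positive statistic suggests
    # the inside-window distribution is shifted upward relative to the outside.
    statistic, p_value = stats.ranksums(inside, outside)
    print(f"rank-sum z = {statistic:.2f}, p = {p_value:.4f}")
    ```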

  15. Evaluating statistical cloud schemes: What can we gain from ground-based remote sensing?

    NASA Astrophysics Data System (ADS)

    Grützun, V.; Quaas, J.; Morcrette, C. J.; Ament, F.

    2013-09-01

    Statistical cloud schemes with prognostic probability distribution functions have become more important in atmospheric modeling, especially since they are in principle scale adaptive and capture cloud physics in more detail. While in theory the schemes have a great potential, their accuracy is still questionable. High-resolution three-dimensional observational data of water vapor and cloud water, which could be used for testing them, are missing. We explore the potential of ground-based remote sensing such as lidar, microwave, and radar to evaluate prognostic distribution moments using the "perfect model approach." This means that we employ a high-resolution weather model as virtual reality and retrieve full three-dimensional atmospheric quantities and virtual ground-based observations. We then use statistics from the virtual observation to validate the modeled 3-D statistics. Since the data are entirely consistent, any discrepancy occurring is due to the method. Focusing on total water mixing ratio, we find that the mean ratio can be evaluated decently but that it strongly depends on the meteorological conditions as to whether the variance and skewness are reliable. Using some simple schematic description of different synoptic conditions, we show how statistics obtained from point or line measurements can be poor at representing the full three-dimensional distribution of water in the atmosphere. We argue that a careful analysis of measurement data and detailed knowledge of the meteorological situation is necessary to judge whether we can use the data for an evaluation of higher moments of the humidity distribution used by a statistical cloud scheme.
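
    A small sketch of the kind of moment comparison the perfect-model approach relies on: the mean, variance, and skewness of total-water values sampled along a single "virtual instrument" column versus the full three-dimensional field. The random field below is synthetic; no model output is implied.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)
    nz, ny, nx = 40, 60, 60
    # Synthetic total-water mixing ratio field (g/kg), positively skewed.
    q_total = rng.gamma(shape=2.0, scale=1.5, size=(nz, ny, nx))

    full_field = q_total.ravel()
    column = q_total[:, ny // 2, nx // 2]        # a single ground-based "profile"

    for label, sample in [("full 3-D field", full_field), ("single column", column)]:
        print(f"{label:14s}  mean={sample.mean():.2f}  var={sample.var():.2f}  "
              f"skew={stats.skew(sample):.2f}")
    ```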

  16. Evaluating measurement models in clinical research: covariance structure analysis of latent variable models of self-conception.

    PubMed

    Hoyle, R H

    1991-02-01

    Indirect measures of psychological constructs are vital to clinical research. On occasion, however, the meaning of indirect measures of psychological constructs is obfuscated by statistical procedures that do not account for the complex relations between items and latent variables and among latent variables. Covariance structure analysis (CSA) is a statistical procedure for testing hypotheses about the relations among items that indirectly measure a psychological construct and relations among psychological constructs. This article introduces clinical researchers to the strengths and limitations of CSA as a statistical procedure for conceiving and testing structural hypotheses that are not tested adequately with other statistical procedures. The article is organized around two empirical examples that illustrate the use of CSA for evaluating measurement models with correlated error terms, higher-order factors, and measured and latent variables.

  17. Performance of Bootstrapping Approaches To Model Test Statistics and Parameter Standard Error Estimation in Structural Equation Modeling.

    ERIC Educational Resources Information Center

    Nevitt, Jonathan; Hancock, Gregory R.

    2001-01-01

    Evaluated the bootstrap method under varying conditions of nonnormality, sample size, model specification, and number of bootstrap samples drawn from the resampling space. Results for the bootstrap suggest the resampling-based method may be conservative in its control over model rejections, thus having an impact on the statistical power associated…

  18. Evaluation of the ecological relevance of mysid toxicity tests using population modeling techniques

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kuhn-Hines, A.; Munns, W.R. Jr.; Lussier, S.

    1995-12-31

    A number of acute and chronic bioassay statistics are used to evaluate the toxicity and risks of chemical stressors to the mysid shrimp, Mysidopsis bahia. These include LC50s from acute tests, NOECs from 7-day and life-cycle tests, and the US EPA Water Quality Criteria Criterion Continuous Concentrations (CCC). Because these statistics are generated from endpoints which focus upon the responses of individual organisms, their relationships to significant effects at higher levels of ecological organization are unknown. This study was conducted to evaluate the quantitative relationships between toxicity test statistics and a concentration-based statistic derived from exposure-response models relating population growth rate (λ) to stressor concentration. This statistic, C* (the concentration at which λ = 1, zero population growth), describes the concentration above which mysid populations are projected to decline in abundance, as determined using population modeling techniques. An analysis of M. bahia responses to 9 metals and 9 organic contaminants indicated the NOEC from life-cycle tests to be the best predictor of C*, although the acute LC50 predicted population-level response surprisingly well. These analyses provide useful information regarding uncertainties of extrapolation among test statistics in assessments of ecological risk.
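
    The concentration-based statistic C* described above is the exposure level at which the modeled population growth rate λ falls to 1. Under an assumed (purely illustrative) exponential exposure-response model for λ, C* can be found with a root solver, as in the sketch below; the functional form and parameter values are not taken from the study.

    ```python
    import numpy as np
    from scipy.optimize import brentq

    lambda_0 = 1.25   # assumed population growth rate at zero exposure (per generation)
    k = 0.015         # assumed decline rate of growth with concentration (per ug/L)

    def growth_rate(conc):
        """Illustrative exposure-response model: lambda(c) = lambda_0 * exp(-k * c)."""
        return lambda_0 * np.exp(-k * conc)

    # C* is the concentration where lambda(c) = 1 (zero population growth).
    c_star = brentq(lambda c: growth_rate(c) - 1.0, 0.0, 1000.0)
    print(f"C* = {c_star:.1f} ug/L  (populations are projected to decline above this level)")
    ```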

  19. A Comparative Evaluation of Mixed Dentition Analysis on Reliability of Cone Beam Computed Tomography Image Compared to Plaster Model

    PubMed Central

    Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam

    2017-01-01

    Aims and Objective: The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT) obtained image over plaster model for the assessment of mixed dentition analysis. Materials and Methods: Thirty CBCT-derived images and thirty plaster models were derived from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analysis along with Student's t-test was performed to qualitatively evaluate the data and P < 0.05 was considered statistically significant. Results: Statistically significant results were obtained on data comparison between CBCT-derived images and plaster model; the mean for Moyer's analysis in the left and right lower arch for CBCT and plaster model was 21.2 mm, 21.1 mm and 22.5 mm, 22.5 mm, respectively. Conclusion: CBCT-derived images were less reliable as compared to data obtained directly from plaster model for mixed dentition analysis. PMID:28852639

  20. A Management Information System Model for Program Management. Ph.D. Thesis - Oklahoma State Univ.; [Computerized Systems Analysis

    NASA Technical Reports Server (NTRS)

    Shipman, D. L.

    1972-01-01

    The development of a model to simulate the information system of a program management type of organization is reported. The model statistically determines the following parameters: type of messages, destinations, delivery durations, type processing, processing durations, communication channels, outgoing messages, and priorities. The total management information system of the program management organization is considered, including formal and informal information flows and both facilities and equipment. The model is written in General Purpose System Simulation 2 computer programming language for use on the Univac 1108, Executive 8 computer. The model is simulated on a daily basis and collects queue and resource utilization statistics for each decision point. The statistics are then used by management to evaluate proposed resource allocations, to evaluate proposed changes to the system, and to identify potential problem areas. The model employs both empirical and theoretical distributions which are adjusted to simulate the information flow being studied.

  1. EVALUATION OF THE REAL-TIME AIR-QUALITY MODEL USING THE RAPS (REGIONAL AIR POLLUTION STUDY) DATA BASE. VOLUME 4. EVALUATION GUIDE

    EPA Science Inventory

    The theory and programming of statistical tests for evaluating the Real-Time Air-Quality Model (RAM) using the Regional Air Pollution Study (RAPS) data base are fully documented in four volumes. Moreover, the tests are generally applicable to other model evaluation problems. Volu...

  2. EVALUATION OF A NEW MEAN SCALED AND MOMENT ADJUSTED TEST STATISTIC FOR SEM.

    PubMed

    Tong, Xiaoxiao; Bentler, Peter M

    2013-01-01

    Recently a new mean scaled and skewness adjusted test statistic was developed for evaluating structural equation models in small samples and with potentially nonnormal data, but this statistic has received only limited evaluation. The performance of this statistic is compared to normal theory maximum likelihood and two well-known robust test statistics. A modification to the Satorra-Bentler scaled statistic is developed for the condition that sample size is smaller than degrees of freedom. The behavior of the four test statistics is evaluated with a Monte Carlo confirmatory factor analysis study that varies seven sample sizes and three distributional conditions obtained using Headrick's fifth-order transformation to nonnormality. The new statistic performs badly in most conditions except under the normal distribution. The goodness-of-fit χ(2) test based on maximum-likelihood estimation performed well under normal distributions as well as under a condition of asymptotic robustness. The Satorra-Bentler scaled test statistic performed best overall, while the mean scaled and variance adjusted test statistic outperformed the others at small and moderate sample sizes under certain distributional conditions.

  3. Addressing issues associated with evaluating prediction models for survival endpoints based on the concordance statistic.

    PubMed

    Wang, Ming; Long, Qi

    2016-09-01

    Prediction models for disease risk and prognosis play an important role in biomedical research, and evaluating their predictive accuracy in the presence of censored data is of substantial interest. The standard concordance (c) statistic has been extended to provide a summary measure of predictive accuracy for survival models. Motivated by a prostate cancer study, we address several issues associated with evaluating survival prediction models based on c-statistic with a focus on estimators using the technique of inverse probability of censoring weighting (IPCW). Compared to the existing work, we provide complete results on the asymptotic properties of the IPCW estimators under the assumption of coarsening at random (CAR), and propose a sensitivity analysis under the mechanism of noncoarsening at random (NCAR). In addition, we extend the IPCW approach as well as the sensitivity analysis to high-dimensional settings. The predictive accuracy of prediction models for cancer recurrence after prostatectomy is assessed by applying the proposed approaches. We find that the estimated predictive accuracy for the models in consideration is sensitive to NCAR assumption, and thus identify the best predictive model. Finally, we further evaluate the performance of the proposed methods in both settings of low-dimensional and high-dimensional data under CAR and NCAR through simulations. © 2016, The International Biometric Society.
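
    As background for the c-statistic extensions discussed above, the sketch below computes a plain concordance index for censored survival data by scoring only the usable pairs (those in which the shorter follow-up time ends in an event); the IPCW-weighted and high-dimensional versions treated in the paper build on this idea but are not reproduced here. The toy data and risk scores are invented.

    ```python
    import numpy as np

    time = np.array([5.0, 8.0, 12.0, 15.0, 20.0, 25.0])   # follow-up times
    event = np.array([1, 1, 0, 1, 0, 1])                  # 1 = event, 0 = censored
    risk = np.array([0.9, 0.7, 0.4, 0.6, 0.2, 0.1])       # higher = predicted earlier event

    def concordance_index(time, event, risk):
        """Harrell-type c-index over usable pairs; ties in risk count as 1/2."""
        concordant = tied = usable = 0
        n = len(time)
        for i in range(n):
            for j in range(n):
                # A pair is usable if subject i has the earlier time and it is an event.
                if event[i] == 1 and time[i] < time[j]:
                    usable += 1
                    if risk[i] > risk[j]:
                        concordant += 1
                    elif risk[i] == risk[j]:
                        tied += 1
        return (concordant + 0.5 * tied) / usable

    print("c-index:", round(concordance_index(time, event, risk), 3))
    ```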

  4. Statistical procedures for evaluating daily and monthly hydrologic model predictions

    USGS Publications Warehouse

    Coffey, M.E.; Workman, S.R.; Taraba, J.L.; Fogle, A.W.

    2004-01-01

    The overall study objective was to evaluate the applicability of different qualitative and quantitative methods for comparing daily and monthly SWAT computer model hydrologic streamflow predictions to observed data, and to recommend statistical methods for use in future model evaluations. Statistical methods were tested using daily streamflows and monthly equivalent runoff depths. The statistical techniques included linear regression, Nash-Sutcliffe efficiency, nonparametric tests, t-test, objective functions, autocorrelation, and cross-correlation. None of the methods specifically applied to the non-normal distribution and dependence between data points for the daily predicted and observed data. Of the tested methods, median objective functions, sign test, autocorrelation, and cross-correlation were most applicable for the daily data. The robust coefficient of determination (CD*) and robust modeling efficiency (EF*) objective functions were the preferred methods for daily model results due to the ease of comparing these values with a fixed ideal reference value of one. Predicted and observed monthly totals were more normally distributed, and there was less dependence between individual monthly totals than was observed for the corresponding predicted and observed daily values. More statistical methods were available for comparing SWAT model-predicted and observed monthly totals. The 1995 monthly SWAT model predictions and observed data had a regression r2 of 0.70, a Nash-Sutcliffe efficiency of 0.41, and the t-test failed to reject the equal data means hypothesis. The Nash-Sutcliffe coefficient and the r2 coefficient were the preferred methods for monthly results due to the ability to compare these coefficients to a set ideal value of one.

  5. Using Computational Modeling to Assess the Impact of Clinical Decision Support on Cancer Screening within Community Health Centers

    PubMed Central

    Carney, Timothy Jay; Morgan, Geoffrey P.; Jones, Josette; McDaniel, Anna M.; Weaver, Michael; Weiner, Bryan; Haggstrom, David A.

    2014-01-01

    Our conceptual model demonstrates our goal to investigate the impact of clinical decision support (CDS) utilization on cancer screening improvement strategies in the community health care (CHC) setting. We employed a dual modeling technique using both statistical and computational modeling to evaluate impact. Our statistical model used the Spearman's Rho test to evaluate the strength of relationship between our proximal outcome measures (CDS utilization) and our distal outcome measure (provider self-reported cancer screening improvement). Our computational model relied on network evolution theory and made use of a tool called Construct-TM to model the use of CDS measured by the rate of organizational learning. We used previously collected survey data from the community health centers Cancer Health Disparities Collaborative (HDCC). Our intent is to demonstrate the added value gained by using a computational modeling tool in conjunction with a statistical analysis when evaluating the impact of a health information technology, in the form of CDS, on health care quality process outcomes such as facility-level screening improvement. Significant simulated disparities in organizational learning over time were observed between community health centers beginning the simulation with high and low clinical decision support capability. PMID:24953241
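
    The Spearman's rho step in the statistical arm of the dual-modeling approach can be reduced to a single call; the CDS-utilization scores and self-reported screening-improvement ratings below are invented placeholders.

    ```python
    import numpy as np
    from scipy import stats

    cds_utilization = np.array([2, 5, 3, 8, 7, 4, 9, 6])        # proximal measure (toy scores)
    screening_improvement = np.array([1, 3, 2, 5, 4, 2, 5, 3])  # distal measure (toy ratings)

    rho, p_value = stats.spearmanr(cds_utilization, screening_improvement)
    print(f"Spearman's rho = {rho:.2f}, p = {p_value:.4f}")
    ```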

  6. A BAYESIAN STATISTICAL APPROACH FOR THE EVALUATION OF CMAQ

    EPA Science Inventory

    Bayesian statistical methods are used to evaluate Community Multiscale Air Quality (CMAQ) model simulations of sulfate aerosol over a section of the eastern US for 4-week periods in summer and winter 2001. The observed data come from two U.S. Environmental Protection Agency data ...

  7. Progress of statistical analysis in biomedical research through the historical review of the development of the Framingham score.

    PubMed

    Ignjatović, Aleksandra; Stojanović, Miodrag; Milošević, Zoran; Anđelković Apostolović, Marija

    2017-12-02

    Developing risk models in medicine is not only appealing but also associated with many obstacles across the different aspects of predictive model development. Initially, the association of biomarkers, or of several markers, with a specific outcome was established by statistical significance, but novel and demanding questions required the development of new and more complex statistical techniques. The progress of statistical analysis in biomedical research can best be observed through the history of the Framingham study and the development of the Framingham score. Evaluation of predictive models rests on a combination of findings from several metrics. When logistic regression or Cox proportional hazards regression is used, the calibration test and ROC curve analysis should be mandatory and eliminatory, while a central place should be taken by some newer statistical techniques. In order to obtain complete information about a new marker in the model, it has recently been recommended to use reclassification tables, calculating the net reclassification index and the integrated discrimination improvement. Decision curve analysis is a novel method for evaluating the clinical usefulness of a predictive model. It may be noted that customizing and fine-tuning of the Framingham risk score initiated this development of statistical analysis. A clinically applicable predictive model should be a trade-off between all the abovementioned statistical metrics: a trade-off between calibration and discrimination, accuracy and decision-making, costs and benefits, and quality and quantity of the patient's life.
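
    The net reclassification improvement mentioned above compares how an old and a new risk model assign people to risk categories, crediting upward moves for those who had the event and downward moves for those who did not. The category assignments below are invented to show the arithmetic only.

    ```python
    import numpy as np

    # Risk categories (0 = low, 1 = intermediate, 2 = high) under the old and new models.
    old_cat = np.array([0, 1, 1, 2, 0, 1, 2, 0, 1, 2])
    new_cat = np.array([1, 2, 1, 2, 0, 0, 2, 0, 2, 1])
    event   = np.array([1, 1, 0, 1, 0, 0, 1, 0, 1, 0])   # 1 = developed the outcome

    def net_reclassification_improvement(old_cat, new_cat, event):
        move = np.sign(new_cat - old_cat)                 # +1 moved up, -1 moved down
        ev, ne = move[event == 1], move[event == 0]
        nri_events = np.mean(ev == 1) - np.mean(ev == -1)       # events should move up
        nri_nonevents = np.mean(ne == -1) - np.mean(ne == 1)    # non-events should move down
        return nri_events + nri_nonevents

    print("NRI:", round(net_reclassification_improvement(old_cat, new_cat, event), 3))
    ```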

  8. Statistics on continuous IBD data: Exact distribution evaluation for a pair of full(half)-sibs and a pair of a (great-) grandchild with a (great-) grandparent

    PubMed Central

    Stefanov, Valeri T

    2002-01-01

    Background: Pairs of related individuals are widely used in linkage analysis. Most of the tests for linkage analysis are based on statistics associated with identity by descent (IBD) data. The current biotechnology provides data on very densely packed loci, and therefore, it may provide almost continuous IBD data for pairs of closely related individuals. Therefore, the distribution theory for statistics on continuous IBD data is of interest. In particular, distributional results which allow the evaluation of p-values for relevant tests are of importance. Results: A technology is provided for numerical evaluation, with any given accuracy, of the cumulative probabilities of some statistics on continuous genome data for pairs of closely related individuals. In the case of a pair of full-sibs, the following statistics are considered: (i) the proportion of genome with 2 (at least 1) haplotypes shared identical-by-descent (IBD) on a chromosomal segment, (ii) the number of distinct pieces (subsegments) of a chromosomal segment, on each of which exactly 2 (at least 1) haplotypes are shared IBD. The natural counterparts of these statistics for the other relationships are also considered. Relevant Maple codes are provided for a rapid evaluation of the cumulative probabilities of such statistics. The genomic continuum model, with Haldane's model for the crossover process, is assumed. Conclusions: A technology, together with relevant software codes for its automated implementation, are provided for exact evaluation of the distributions of relevant statistics associated with continuous genome data on closely related individuals. PMID:11996673

  9. An Exploration of Student Attitudes and Satisfaction in a GAISE-Influenced Introductory Statistics Course

    ERIC Educational Resources Information Center

    Paul, Warren; Cunnington, R. Clare

    2017-01-01

    We used the Survey of Attitudes Toward Statistics to (1) evaluate using presemester data the Students' Attitudes Toward Statistics Model (SATS-M), and (2) test the effect on attitudes of an introductory statistics course redesigned according to the Guidelines for Assessment and Instruction in Statistics Education (GAISE) by examining the change in…

  10. EVALUATION OF THE REAL-TIME AIR-QUALITY MODEL USING THE RAPS (REGIONAL AIR POLLUTION STUDY) DATA BASE. VOLUME 1. OVERVIEW

    EPA Science Inventory

    The theory and programming of statistical tests for evaluating the Real-Time Air-Quality Model (RAM) using the Regional Air Pollution Study (RAPS) data base are fully documented in four report volumes. Moreover, the tests are generally applicable to other model evaluation problem...

  11. Understanding Broadscale Wildfire Risks in a Human-Dominated Landscape

    Treesearch

    Jeffrey P. Prestemon; John M. Pye; David T. Butry; Thomas P. Holmes; D. Evan Mercer

    2002-01-01

    Broadscale statistical evaluations of wildfire incidence can answer policy relevant questions about the effectiveness of microlevel vegetation management and can identify subjects needing further study. A dynamic time series cross-sectional model was used to evaluate the statistical links between forest wildfire and vegetation management, human land use, and climatic...

  12. Subject-enabled analytics model on measurement statistics in health risk expert system for public health informatics.

    PubMed

    Chung, Chi-Jung; Kuo, Yu-Chen; Hsieh, Yun-Yu; Li, Tsai-Chung; Lin, Cheng-Chieh; Liang, Wen-Miin; Liao, Li-Na; Li, Chia-Ing; Lin, Hsueh-Chun

    2017-11-01

    This study applied open source technology to establish a subject-enabled analytics model that can enhance measurement statistics of case studies with the public health data in cloud computing. The infrastructure of the proposed model comprises three domains: 1) the health measurement data warehouse (HMDW) for the case study repository, 2) the self-developed modules of online health risk information statistics (HRIStat) for cloud computing, and 3) the prototype of a Web-based process automation system in statistics (PASIS) for the health risk assessment of case studies with subject-enabled evaluation. The system design employed freeware including Java applications, MySQL, and R packages to drive a health risk expert system (HRES). In the design, the HRIStat modules implement the typical analytics methods for biomedical statistics, and the PASIS interfaces enable process automation of the HRES for cloud computing. The Web-based model supports two modes, step-by-step analysis and an auto-computing process, for preliminary evaluation and real-time computation, respectively. The proposed model was evaluated by recomputing prior studies on the epidemiological measurement of diseases caused either by heavy metal exposure in the environment or by clinical complications in hospitals. The simulation validity was verified against commercial statistics software. The model was installed on a stand-alone computer and on a cloud-server workstation to verify computing performance for a data volume of more than 230K sets. Both setups reached an efficiency of about 10^5 sets per second. The Web-based PASIS interface can be used for cloud computing, and the HRIStat module can be flexibly expanded with advanced subjects for measurement statistics. The analytics procedure of the HRES prototype is capable of providing assessment criteria prior to estimating the potential risk to public health. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Probabilistic Evaluation of Competing Climate Models

    NASA Astrophysics Data System (ADS)

    Braverman, A. J.; Chatterjee, S.; Heyman, M.; Cressie, N.

    2017-12-01

    A standard paradigm for assessing the quality of climate model simulations is to compare what these models produce for past and present time periods, to observations of the past and present. Many of these comparisons are based on simple summary statistics called metrics. Here, we propose an alternative: evaluation of competing climate models through probabilities derived from tests of the hypothesis that climate-model-simulated and observed time sequences share common climate-scale signals. The probabilities are based on the behavior of summary statistics of climate model output and observational data, over ensembles of pseudo-realizations. These are obtained by partitioning the original time sequences into signal and noise components, and using a parametric bootstrap to create pseudo-realizations of the noise sequences. The statistics we choose come from working in the space of decorrelated and dimension-reduced wavelet coefficients. We compare monthly sequences of CMIP5 model output of average global near-surface temperature anomalies to similar sequences obtained from the well-known HadCRUT4 data set, as an illustration.
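    As a hedged illustration of the pseudo-realization scheme sketched above (not the authors' wavelet-based implementation), the code below separates a time series into a smooth signal and a residual noise component, fits an AR(1) model to the noise, and uses a parametric bootstrap to build the null distribution of a summary statistic. The moving-average signal estimate, the AR(1) noise model, and the synthetic series are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def moving_average(x, w=25):
    """Crude climate-scale signal estimate: centered moving average."""
    return np.convolve(x, np.ones(w) / w, mode="same")

def fit_ar1(noise):
    """Estimate the AR(1) coefficient and innovation standard deviation of a noise series."""
    phi = np.corrcoef(noise[:-1], noise[1:])[0, 1]
    sigma = np.std(noise[1:] - phi * noise[:-1])
    return phi, sigma

def simulate_ar1(phi, sigma, n):
    """One parametric-bootstrap pseudo-realization of the noise component."""
    sim = np.zeros(n)
    for t in range(1, n):
        sim[t] = phi * sim[t - 1] + rng.normal(0.0, sigma)
    return sim

# Illustrative monthly anomaly series: a weak trend plus AR(1) noise, standing in
# for a climate-model or observational temperature record.
n = 360
series = 0.002 * np.arange(n) + simulate_ar1(0.6, 0.1, n)

signal = moving_average(series)          # "climate-scale" component
noise = series - signal                  # residual "weather" component
phi, sigma = fit_ar1(noise)

# Null distribution of a summary statistic (here, correlation with the signal)
# over pseudo-realizations built from signal + bootstrapped noise.
stat_null = np.array([
    np.corrcoef(signal + simulate_ar1(phi, sigma, n), signal)[0, 1]
    for _ in range(500)
])
stat_obs = np.corrcoef(series, signal)[0, 1]
print("observed statistic:", round(stat_obs, 3))
print("bootstrap 95% interval:", np.round(np.percentile(stat_null, [2.5, 97.5]), 3))
```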

  14. Statistical error model for a solar electric propulsion thrust subsystem

    NASA Technical Reports Server (NTRS)

    Bantell, M. H.

    1973-01-01

    The solar electric propulsion thrust subsystem statistical error model was developed as a tool for investigating the effects of thrust subsystem parameter uncertainties on navigation accuracy. The model is currently being used to evaluate the impact of electric engine parameter uncertainties on navigation system performance for a baseline mission to Encke's Comet in the 1980s. The data given represent the next generation in statistical error modeling for low-thrust applications. Principal improvements include the representation of thrust uncertainties and random process modeling in terms of random parametric variations in the thrust vector process for a multi-engine configuration.

  15. A Monte Carlo Simulation Comparing the Statistical Precision of Two High-Stakes Teacher Evaluation Methods: A Value-Added Model and a Composite Measure

    ERIC Educational Resources Information Center

    Spencer, Bryden

    2016-01-01

    Value-added models are a class of growth models used in education to assign responsibility for student growth to teachers or schools. For value-added models to be used fairly, sufficient statistical precision is necessary for accurate teacher classification. Previous research indicated precision below practical limits. An alternative approach has…

  16. Statistical Power of Alternative Structural Models for Comparative Effectiveness Research: Advantages of Modeling Unreliability.

    PubMed

    Coman, Emil N; Iordache, Eugen; Dierker, Lisa; Fifield, Judith; Schensul, Jean J; Suggs, Suzanne; Barbour, Russell

    2014-05-01

    The advantages of modeling the unreliability of outcomes when evaluating the comparative effectiveness of health interventions are illustrated. Adding an action-research intervention component to a regular summer job program for youth was expected to help in preventing risk behaviors. A series of simple two-group alternative structural equation models are compared to test the effect of the intervention on one key attitudinal outcome in terms of model fit and statistical power with Monte Carlo simulations. Some models presuming parameters equal across the intervention and comparison groups were underpowered to detect the intervention effect, yet modeling the unreliability of the outcome measure increased their statistical power and helped in the detection of the hypothesized effect. Comparative Effectiveness Research (CER) could benefit from flexible multi-group alternative structural models organized in decision trees, and modeling unreliability of measures can be of tremendous help for both the fit of statistical models to the data and their statistical power.

  17. The statistical evaluation and comparison of ADMS-Urban model for the prediction of nitrogen dioxide with air quality monitoring network.

    PubMed

    Dėdelė, Audrius; Miškinytė, Auksė

    2015-09-01

    In many countries, road traffic is one of the main sources of air pollution associated with adverse effects on human health and the environment. Nitrogen dioxide (NO2) is considered to be a measure of traffic-related air pollution, with concentrations tending to be higher near highways, along busy roads, and in city centers, and exceedances are mainly observed at measurement stations located close to traffic. In order to assess the air quality in the city and the impact of air pollution on public health, air quality models are used. However, before a model can be used for these purposes, it is important to evaluate the accuracy of dispersion modelling, one of the most widely used methods. Monitoring and dispersion modelling are two components of an air quality monitoring system (AQMS), and this research makes a statistical comparison between them. The evaluation of the Atmospheric Dispersion Modelling System (ADMS-Urban) was carried out by comparing monthly modelled NO2 concentrations with the data of continuous air quality monitoring stations in Kaunas city. The statistical measures of model performance were calculated for annual and monthly concentrations of NO2 for each monitoring station site. The spatial analysis was made using geographic information systems (GIS). The calculated statistical parameters indicated good ADMS-Urban model performance for the prediction of NO2. The results of this study showed that the agreement between modelled values and observations was better for traffic monitoring stations than for the background and residential stations.
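    Model-to-observation comparisons of this kind are usually summarized with a handful of standard performance measures. The sketch below computes several of them (mean bias, fractional bias, normalized mean square error, the fraction of predictions within a factor of two, and the Pearson correlation) for paired monthly NO2 values; the arrays are illustrative placeholders, not the Kaunas monitoring data.

```python
import numpy as np

def performance_stats(obs, mod):
    """Common air-quality model evaluation statistics for paired observations/predictions."""
    obs, mod = np.asarray(obs, float), np.asarray(mod, float)
    bias = np.mean(mod - obs)                                    # mean bias
    fb = 2 * np.mean(mod - obs) / (np.mean(mod) + np.mean(obs))  # fractional bias
    nmse = np.mean((mod - obs) ** 2) / (np.mean(mod) * np.mean(obs))
    fac2 = np.mean((mod / obs >= 0.5) & (mod / obs <= 2.0))      # fraction within a factor of 2
    r = np.corrcoef(obs, mod)[0, 1]
    return {"bias": bias, "FB": fb, "NMSE": nmse, "FAC2": fac2, "r": r}

# Illustrative monthly NO2 concentrations (ug/m3) at one station.
obs = np.array([28.1, 25.4, 22.0, 18.3, 15.9, 14.2, 13.8, 15.1, 18.7, 22.9, 26.3, 29.5])
mod = np.array([25.9, 24.8, 20.5, 19.1, 14.7, 13.0, 14.9, 16.0, 17.5, 24.1, 27.8, 27.2])
for name, value in performance_stats(obs, mod).items():
    print(f"{name}: {value:.3f}")
```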

  18. Comments on statistical issues in numerical modeling for underground nuclear test monitoring

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nicholson, W.L.; Anderson, K.K.

    1993-11-01

    The Symposium concluded with prepared summaries by four experts in the involved disciplines. These experts made no mention of statistics and/or the statistical content of issues. The first author contributed an extemporaneous statement at the Symposium because there are important issues associated with conducting and evaluating numerical modeling that are familiar to statisticians and often treated successfully by them. This note expands upon these extemporaneous remarks.

  19. EVALUATION OF THE REAL-TIME AIR-QUALITY MODEL USING THE RAPS (REGIONAL AIR POLLUTION STUDY) DATA BASE. VOLUME 3. PROGRAM USER'S GUIDE

    EPA Science Inventory

    The theory and programming of statistical tests for evaluating the Real-Time Air-Quality Model (RAM) using the Regional Air Pollution Study (RAPS) data base are fully documented in four volumes. Moreover, the tests are generally applicable to other model evaluation problems. Volu...

  20. INLAND DISSOLVED SALT CHEMISTRY: STATISTICAL EVALUATION OF BIVARIATE AND TERNARY DIAGRAM MODELS FOR SURFACE AND SUBSURFACE WATERS

    EPA Science Inventory

    We compared the use of ternary and bivariate diagrams to distinguish the effects of atmospheric precipitation, rock weathering, and evaporation on inland surface and subsurface water chemistry. The three processes could not be statistically differentiated using bivariate models e...

  1. A Statistical Decision Model for Periodical Selection for a Specialized Information Center

    ERIC Educational Resources Information Center

    Dym, Eleanor D.; Shirey, Donald L.

    1973-01-01

    An experiment is described which attempts to define a quantitative methodology for the identification and evaluation of all possibly relevant periodical titles containing toxicological-biological information. A statistical decision model was designed and employed, along with yes/no criteria questions, a training technique and a quality control…

  2. On prognostic models, artificial intelligence and censored observations.

    PubMed

    Anand, S S; Hamilton, P W; Hughes, J G; Bell, D A

    2001-03-01

    The development of prognostic models for assisting medical practitioners with decision making is not a trivial task. Models need to possess a number of desirable characteristics, and few, if any, current modelling approaches based on statistics or artificial intelligence can produce models that display all these characteristics. The inability of modelling techniques to provide truly useful models has meant that interest in them has remained largely academic, which in turn has resulted in only a very small percentage of the models that have been developed being deployed in practice. On the other hand, new modelling paradigms are continuously being proposed within the machine learning and statistical communities, with claims of superiority over traditional modelling methods that are often based on inadequate evaluation. We believe that for new modelling approaches to deliver true net benefits over traditional techniques, an evaluation-centric approach to their development is essential. In this paper we present such an evaluation-centric approach to developing extensions to the basic k-nearest neighbour (k-NN) paradigm. We use standard statistical techniques to enhance the distance metric used and a framework based on evidence theory to obtain a prediction for the target example from the outcome of the retrieved exemplars. We refer to this new k-NN algorithm as Censored k-NN (Ck-NN). This reflects the enhancements made to k-NN that are aimed at providing a means for handling censored observations within k-NN.

  3. A Statistical Approach For Modeling Tropical Cyclones. Synthetic Hurricanes Generator Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pasqualini, Donatella

    This manuscript briefly describes a statistical approach to generate synthetic tropical cyclone tracks to be used in risk evaluations. The Synthetic Hurricane Generator (SynHurG) model allows modeling hurricane risk in the United States, supporting decision makers and implementations of adaptation strategies to extreme weather. In the literature there are mainly two approaches to model hurricane hazard for risk prediction: deterministic-statistical approaches, where the storm's key physical parameters are calculated using complex physical climate models and the tracks are usually determined statistically from historical data; and statistical approaches, where both variables and tracks are estimated stochastically using historical records. SynHurG falls in the second category, adopting a pure stochastic approach.

  4. Grain-Size Based Additivity Models for Scaling Multi-rate Uranyl Surface Complexation in Subsurface Sediments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Xiaoying; Liu, Chongxuan; Hu, Bill X.

    This study statistically analyzed a grain-size based additivity model that has been proposed to scale reaction rates and parameters from laboratory to field. The additivity model assumed that reaction properties in a sediment, including surface area, reactive site concentration, reaction rate, and extent, can be predicted from the field-scale grain size distribution by linearly adding the reaction properties of individual grain size fractions. This study focused on the statistical analysis of the additivity model with respect to reaction rate constants, using multi-rate uranyl (U(VI)) surface complexation reactions in a contaminated sediment as an example. Experimental data of rate-limited U(VI) desorption in a stirred flow-cell reactor were used to estimate the statistical properties of multi-rate parameters for individual grain size fractions. The statistical properties of the rate constants for the individual grain size fractions were then used to analyze the statistical properties of the additivity model to predict rate-limited U(VI) desorption in the composite sediment, and to evaluate the relative importance of individual grain size fractions to the overall U(VI) desorption. The result indicated that the additivity model provided a good prediction of the U(VI) desorption in the composite sediment. However, the rate constants were not directly scalable using the additivity model, and U(VI) desorption in individual grain size fractions has to be simulated in order to apply the additivity model. An approximate additivity model for directly scaling rate constants was subsequently proposed and evaluated. The result showed that the approximate model provided a good prediction of the experimental results within statistical uncertainty. This study also found that a gravel size fraction (2-8 mm), which is often ignored in modeling U(VI) sorption and desorption, is statistically significant to the U(VI) desorption in the sediment.

  5. Modifying climate change habitat models using tree species-specific assessments of model uncertainty and life history-factors

    Treesearch

    Stephen N. Matthews; Louis R. Iverson; Anantha M. Prasad; Matthew P. Peters; Paul G. Rodewald

    2011-01-01

    Species distribution models (SDMs) to evaluate trees' potential responses to climate change are essential for developing appropriate forest management strategies. However, there is a great need to better understand these models' limitations and evaluate their uncertainties. We have previously developed statistical models of suitable habitat, based on both...

  6. Performance Analysis of Live-Virtual-Constructive and Distributed Virtual Simulations: Defining Requirements in Terms of Temporal Consistency

    DTIC Science & Technology

    2009-12-01

    events. Work associated with aperiodic tasks has the same statistical behavior and the same timing requirements. The timing deadlines are soft. • Sporadic...answers, but it is possible to calculate how precise the estimates are. Simulation-based performance analysis of a model includes a statistical ...to evaluate all possible states in a timely manner. This is the principal reason for resorting to simulation and statistical analysis to evaluate

  7. Evaluation of Fast-Time Wake Vortex Prediction Models

    NASA Technical Reports Server (NTRS)

    Proctor, Fred H.; Hamilton, David W.

    2009-01-01

    Current fast-time wake models are reviewed and three basic types are defined. Predictions from several of the fast-time models are compared. Previous statistical evaluations of the APA-Sarpkaya and D2P fast-time models are discussed. Root Mean Square errors between fast-time model predictions and Lidar wake measurements are examined for a 24 hr period at Denver International Airport. Shortcomings in current methodology for evaluating wake errors are also discussed.

  8. The Effects of Selection Strategies for Bivariate Loglinear Smoothing Models on NEAT Equating Functions

    ERIC Educational Resources Information Center

    Moses, Tim; Holland, Paul W.

    2010-01-01

    In this study, eight statistical strategies were evaluated for selecting the parameterizations of loglinear models for smoothing the bivariate test score distributions used in nonequivalent groups with anchor test (NEAT) equating. Four of the strategies were based on significance tests of chi-square statistics (Likelihood Ratio, Pearson,…

  9. The Evaluation and Selection of Adequate Causal Models: A Compensatory Education Example.

    ERIC Educational Resources Information Center

    Tanaka, Jeffrey S.

    1982-01-01

    Implications of model evaluation (using traditional chi square goodness of fit statistics, incremental fit indices for covariance structure models, and latent variable coefficients of determination) on substantive conclusions are illustrated with an example examining the effects of participation in a compensatory education program on posttreatment…

  10. A modified F-test for evaluating model performance by including both experimental and simulation uncertainties

    USDA-ARS?s Scientific Manuscript database

    Experimental and simulation uncertainties have not been included in many of the statistics used in assessing agricultural model performance. The objectives of this study were to develop an F-test that can be used to evaluate model performance considering experimental and simulation uncertainties, an...

  11. Incorporating principal component analysis into air quality model evaluation

    EPA Science Inventory

    The efficacy of standard air quality model evaluation techniques is becoming compromised as the simulation periods continue to lengthen in response to ever increasing computing capacity. Accordingly, the purpose of this paper is to demonstrate a statistical approach called Princi...

  12. Modeling Soot Oxidation and Gasification with Bayesian Statistics

    DOE PAGES

    Josephson, Alexander J.; Gaffin, Neal D.; Smith, Sean T.; ...

    2017-08-22

    This paper presents a statistical method for model calibration using data collected from the literature. The method is used to calibrate parameters for global models of soot consumption in combustion systems. This consumption is broken into two different submodels: first for oxidation, where soot particles are attacked by certain oxidizing agents; second for gasification, where soot particles are attacked by H2O or CO2 molecules. Rate data were collected from 19 studies in the literature and evaluated using Bayesian statistics to calibrate the model parameters. Bayesian statistics are valued for their ability to quantify uncertainty in modeling. The calibrated consumption model with quantified uncertainty is presented here along with a discussion of associated implications. The oxidation results are found to be consistent with previous studies. Significant variation is found in the CO2 gasification rates.

  13. Modeling Soot Oxidation and Gasification with Bayesian Statistics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Josephson, Alexander J.; Gaffin, Neal D.; Smith, Sean T.

    This paper presents a statistical method for model calibration using data collected from the literature. The method is used to calibrate parameters for global models of soot consumption in combustion systems. This consumption is broken into two different submodels: first for oxidation, where soot particles are attacked by certain oxidizing agents; second for gasification, where soot particles are attacked by H2O or CO2 molecules. Rate data were collected from 19 studies in the literature and evaluated using Bayesian statistics to calibrate the model parameters. Bayesian statistics are valued for their ability to quantify uncertainty in modeling. The calibrated consumption model with quantified uncertainty is presented here along with a discussion of associated implications. The oxidation results are found to be consistent with previous studies. Significant variation is found in the CO2 gasification rates.

  14. Evidential evaluation of DNA profiles using a discrete statistical model implemented in the DNA LiRa software.

    PubMed

    Puch-Solis, Roberto; Clayton, Tim

    2014-07-01

    The high sensitivity of the technology for producing profiles means that it has become routine to produce profiles from relatively small quantities of DNA. The profiles obtained from low template DNA (LTDNA) are affected by several phenomena which must be taken into consideration when interpreting and evaluating this evidence. Furthermore, many of the same phenomena affect profiles from higher amounts of DNA (e.g. where complex mixtures have been revealed). In this article we present a statistical model, which forms the basis of the DNA LiRa software, that is able to calculate likelihood ratios where one to four donors are postulated and for any number of replicates. The model can take into account drop-in and allelic dropout for different contributors, template degradation and uncertain allele designations. In this statistical model, unknown parameters are treated following the Empirical Bayesian paradigm. The performance of LiRa is tested using examples and the outputs are compared with those generated using two other statistical software packages, likeLTD and LRmix. The concept of ban efficiency is introduced as a measure for assessing model sensitivity. Copyright © 2014. Published by Elsevier Ireland Ltd.

  15. Nondestructive Evaluation (NDE) Technology Initiatives (NTIP). Delivery Order 0039: Statistical Comparison of Competing Material Models

    DTIC Science & Technology

    2003-01-01

    adapted from Kass and Raftery (1995) and Congdon (2001). ...density adjusted for resin content, z, since resin contributes to the density... (c.f. Congdon, 2001). How to Download the WinBUGS Software Package: BUGS was originally a statistical research project at the Medical Research...Likelihood Estimation," July 2002, working paper to be published. 18) Congdon, Peter, Bayesian Statistical Modeling, Wiley, 2001. 19) Cox, D. R. and

  16. Evaluation of high-resolution sea ice models on the basis of statistical and scaling properties of Arctic sea ice drift and deformation

    NASA Astrophysics Data System (ADS)

    Girard, L.; Weiss, J.; Molines, J. M.; Barnier, B.; Bouillon, S.

    2009-08-01

    Sea ice drift and deformation from models are evaluated on the basis of statistical and scaling properties. These properties are derived from two observation data sets: the RADARSAT Geophysical Processor System (RGPS) and buoy trajectories from the International Arctic Buoy Program (IABP). Two simulations obtained with the Louvain-la-Neuve Ice Model (LIM) coupled to a high-resolution ocean model and a simulation obtained with the Los Alamos Sea Ice Model (CICE) were analyzed. Model ice drift compares well with observations in terms of large-scale velocity field and distributions of velocity fluctuations although a significant bias on the mean ice speed is noted. On the other hand, the statistical properties of ice deformation are not well simulated by the models: (1) The distributions of strain rates are incorrect: RGPS distributions of strain rates are power law tailed, i.e., exhibit "wild randomness," whereas models distributions remain in the Gaussian attraction basin, i.e., exhibit "mild randomness." (2) The models are unable to reproduce the spatial and temporal correlations of the deformation fields: In the observations, ice deformation follows spatial and temporal scaling laws that express the heterogeneity and the intermittency of deformation. These relations do not appear in simulated ice deformation. Mean deformation in models is almost scale independent. The statistical properties of ice deformation are a signature of the ice mechanical behavior. The present work therefore suggests that the mechanical framework currently used by models is inappropriate. A different modeling framework based on elastic interactions could improve the representation of the statistical and scaling properties of ice deformation.

  17. Statistical colour models: an automated digital image analysis method for quantification of histological biomarkers.

    PubMed

    Shu, Jie; Dolman, G E; Duan, Jiang; Qiu, Guoping; Ilyas, Mohammad

    2016-04-27

    Colour is the most important feature used in quantitative immunohistochemistry (IHC) image analysis; IHC is used to provide information relating to aetiology and to confirm malignancy. Statistical modelling is a technique widely used for colour detection in computer vision. We have developed a statistical model of colour detection applicable to the detection of stain colour in digital IHC images. The model was first trained on a large set of colour pixels collected semi-automatically. To speed up the training and detection processes, we removed the luminance (Y) channel of the YCbCr colour space and chose 128 histogram bins, which was found to be the optimal number. A maximum likelihood classifier is used to classify pixels in digital slides into positively or negatively stained pixels automatically. The model-based tool was developed within ImageJ to quantify targets identified using IHC and histochemistry. The purpose of the evaluation was to compare the computer model with human evaluation. Several large datasets were prepared and obtained from human oesophageal cancer, colon cancer and liver cirrhosis with different colour stains. Experimental results have demonstrated that the model-based tool achieves more accurate results than colour deconvolution and the CMYK model in the detection of brown colour, and is comparable to colour deconvolution in the detection of pink colour. We have also demonstrated that the proposed model has little inter-dataset variation. A robust and effective statistical model is introduced in this paper. The model-based interactive tool in ImageJ, which can create a visual representation of the statistical model and detect a specified colour automatically, is easy to use and available freely at http://rsb.info.nih.gov/ij/plugins/ihc-toolbox/index.html . Testing of the tool by different users showed only minor inter-observer variation in results.
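    A minimal sketch of the kind of histogram-based maximum likelihood colour classifier described above is given below, assuming pixels are supplied as RGB triplets. The BT.601 conversion coefficients, 32 bins per chroma channel (rather than the 128 bins reported in the record), and the training arrays are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def rgb_to_cbcr(rgb):
    """Convert RGB (0-255) to the CbCr chroma channels, discarding luminance Y."""
    rgb = np.asarray(rgb, float)
    cb = 128 - 0.168736 * rgb[:, 0] - 0.331264 * rgb[:, 1] + 0.5 * rgb[:, 2]
    cr = 128 + 0.5 * rgb[:, 0] - 0.418688 * rgb[:, 1] - 0.081312 * rgb[:, 2]
    return np.stack([cb, cr], axis=1)

class ColourMLClassifier:
    """Maximum likelihood classifier over a 2-D CbCr histogram per class."""
    def __init__(self, bins=32):
        self.bins = bins
        self.hists = {}

    def fit(self, pixels_by_class):
        self.edges = np.linspace(0, 256, self.bins + 1)
        for label, rgb in pixels_by_class.items():
            cbcr = rgb_to_cbcr(rgb)
            h, _, _ = np.histogram2d(cbcr[:, 0], cbcr[:, 1], bins=[self.edges, self.edges])
            self.hists[label] = (h + 1) / (h.sum() + self.bins ** 2)  # Laplace smoothing
        return self

    def predict(self, rgb):
        cbcr = rgb_to_cbcr(rgb)
        i = np.clip(np.digitize(cbcr[:, 0], self.edges) - 1, 0, self.bins - 1)
        j = np.clip(np.digitize(cbcr[:, 1], self.edges) - 1, 0, self.bins - 1)
        labels = list(self.hists)
        likes = np.stack([self.hists[l][i, j] for l in labels], axis=1)
        return [labels[k] for k in likes.argmax(axis=1)]

# Illustrative training pixels: brown (positive DAB stain) vs blue (negative counterstain).
rng = np.random.default_rng(3)
brown = np.clip(rng.normal([150, 100, 60], 15, (500, 3)), 0, 255)
blue = np.clip(rng.normal([90, 110, 180], 15, (500, 3)), 0, 255)
clf = ColourMLClassifier().fit({"positive": brown, "negative": blue})
print(clf.predict(np.array([[160, 105, 65], [85, 115, 185]])))  # ['positive', 'negative']
```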

  18. Evaluation of "e-rater"® for the "Praxis I"®Writing Test. Research Report. ETS RR-15-03

    ERIC Educational Resources Information Center

    Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.

    2015-01-01

    Automated scoring models were trained and evaluated for the essay task in the "Praxis I"® writing test. Prompt-specific and generic "e-rater"® scoring models were built, and evaluation statistics, such as quadratic weighted kappa, Pearson correlation, and standardized differences in mean scores, were examined to evaluate the…

  19. Multi-region statistical shape model for cochlear implantation

    NASA Astrophysics Data System (ADS)

    Romera, Jordi; Kjer, H. Martin; Piella, Gemma; Ceresa, Mario; González Ballester, Miguel A.

    2016-03-01

    Statistical shape models are commonly used to analyze the variability between similar anatomical structures, and their use is established as a tool for the analysis and segmentation of medical images. However, using a global model to capture the variability of complex structures is not enough to achieve the best results. The complexity of a proper global model increases even more when the amount of data available is limited to a small number of datasets. Typically, the anatomical variability between structures is associated with the variability of their physiological regions. In this paper, a complete pipeline is proposed for building a multi-region statistical shape model to study the entire variability from locally identified physiological regions of the inner ear. The proposed model, which is based on an extension of the Point Distribution Model (PDM), is built from a training set of 17 high-resolution images (24.5 μm voxels) of the inner ear. The model is evaluated according to its generalization ability and specificity. The results are compared with those of a global model built directly using the standard PDM approach. The evaluation results suggest that better accuracy can be achieved using a regional modeling of the inner ear.
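    The Point Distribution Model that this record extends can be summarized compactly: aligned landmark shapes are stacked as vectors, and principal component analysis of their covariance gives a mean shape plus modes of variation. The sketch below, with synthetic 2-D landmark contours in place of inner-ear surfaces, is a hedged illustration of the standard global PDM rather than the multi-region model proposed in the paper.

```python
import numpy as np

def build_pdm(shapes, var_kept=0.95):
    """Build a Point Distribution Model from aligned shapes of form (n_samples, n_points, dim)."""
    n, p, d = shapes.shape
    X = shapes.reshape(n, p * d)                 # each shape flattened to one vector
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    var = s ** 2 / (n - 1)
    k = np.searchsorted(np.cumsum(var) / var.sum(), var_kept) + 1
    return mean, Vt[:k], var[:k]                 # mean shape, modes, mode variances

def synthesize(mean, modes, variances, b):
    """Generate a new shape from mode coefficients b (in units of standard deviations)."""
    return mean + (b * np.sqrt(variances)) @ modes

# Illustrative data: noisy ellipse-like contours with varying aspect ratio.
rng = np.random.default_rng(4)
theta = np.linspace(0, 2 * np.pi, 40, endpoint=False)
shapes = np.stack([
    np.stack([np.cos(theta) * (1 + 0.2 * a), np.sin(theta) * (1 - 0.2 * a)], axis=1)
    + rng.normal(0, 0.01, (40, 2))
    for a in rng.normal(0, 1, 30)
])
mean, modes, variances = build_pdm(shapes)
print("modes retained:", modes.shape[0])
new_shape = synthesize(mean, modes, variances,
                       b=np.array([2.0] + [0.0] * (modes.shape[0] - 1)))
print("first two synthesized landmarks:", new_shape.reshape(40, 2)[:2])
```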

  20. Using meta-regression models to systematically evaluate data in the published literature: relative contributions of agricultural drift, para-occupational, and residential use exposure pathways to house dust pesticide concentrations

    EPA Science Inventory

    Background: Data reported in the published literature have been used qualitatively to aid exposure assessment activities in epidemiologic studies. Analyzing these data in computational models presents statistical challenges because these data are often reported as summary statist...

  1. ANALYSIS OF MERCURY IN VERMONT AND NEW HAMPSHIRE LAKES: EVALUATION OF THE REGIONAL MERCURY CYCLING MODEL

    EPA Science Inventory

    An evaluation of the Regional Mercury Cycling Model (R-MCM, a steady-state fate and transport model used to simulate mercury concentrations in lakes) is presented based on its application to a series of 91 lakes in Vermont and New Hampshire. Visual and statistical analyses are pr...

  2. On the Use of Principal Component and Spectral Density Analysis to Evaluate the Community Multiscale Air Quality (CMAQ) Model

    EPA Science Inventory

    A 5 year (2002-2006) simulation of CMAQ covering the eastern United States is evaluated using principal component analysis in order to identify and characterize statistically significant patterns of model bias. Such analysis is useful in that it can identify areas of poor model ...

  3. Development of a statistical model for cervical cancer cell death with irreversible electroporation in vitro.

    PubMed

    Yang, Yongji; Moser, Michael A J; Zhang, Edwin; Zhang, Wenjun; Zhang, Bing

    2018-01-01

    The aim of this study was to develop a statistical model for cell death by irreversible electroporation (IRE) and to show that the statistical model is more accurate than the electric field threshold model in the literature, using cervical cancer cells in vitro. The HeLa cell line was cultured and treated with different IRE protocols in order to obtain data for modeling the statistical relationship between cell death and pulse-setting parameters. In total, 340 in vitro experiments were performed with a commercial IRE pulse system, including a pulse generator and an electric cuvette. The trypan blue staining technique was used to evaluate cell death after 4 hours of incubation following IRE treatment. The Peleg-Fermi model was used in the study to build the statistical relationship using the cell viability data obtained from the in vitro experiments. A finite element model of IRE for the electric field distribution was also built. Comparison of ablation zones between the statistical model and the electric threshold model (drawn from the finite element model) was used to show the accuracy of the proposed statistical model in describing the ablation zone and its applicability across different pulse-setting parameters. Statistical models describing the relationships between HeLa cell death and pulse length and between cell death and the number of pulses were built. The values of the curve-fitting parameters were obtained using the Peleg-Fermi model for the treatment of cervical cancer with IRE. The difference in the ablation zone between the statistical model and the electric threshold model was also illustrated to show the accuracy of the proposed statistical model in representing the ablation zone in IRE. This study concluded that: (1) the proposed statistical model accurately described the ablation zone of IRE with cervical cancer cells, and was more accurate than the electric field threshold model; (2) the proposed statistical model was able to estimate the value of the electric field threshold for computer simulation of IRE in the treatment of cervical cancer; and (3) the proposed statistical model was able to express the change in ablation zone with the change in pulse-setting parameters.
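    The Peleg-Fermi model referred to above expresses cell survival as a sigmoid in the applied electric field, S(E) = 1 / (1 + exp((E - Ec(n)) / A(n))), with the critical field Ec and slope A depending on the number of pulses n. The sketch below fits those two parameters to viability data for one fixed pulse protocol; the field strengths and viability fractions are made-up placeholders, not the HeLa measurements reported in the record.

```python
import numpy as np
from scipy.optimize import curve_fit

def peleg_fermi(E, Ec, A):
    """Surviving fraction as a function of electric field strength E (V/cm)."""
    return 1.0 / (1.0 + np.exp((E - Ec) / A))

# Illustrative viability data for one pulse protocol (hypothetical values).
field = np.array([0, 250, 500, 750, 1000, 1250, 1500, 2000], float)   # V/cm
viability = np.array([1.00, 0.97, 0.85, 0.55, 0.25, 0.10, 0.04, 0.01])

(Ec_hat, A_hat), cov = curve_fit(peleg_fermi, field, viability, p0=[800.0, 150.0])
print(f"critical field Ec ~ {Ec_hat:.0f} V/cm, slope A ~ {A_hat:.0f} V/cm")

# The fitted curve can then stand in for an 'electric field threshold': the field at
# which the predicted surviving fraction drops to, say, 10%.
threshold = Ec_hat + A_hat * np.log(1 / 0.10 - 1)   # solve peleg_fermi(E) = 0.10
print(f"field for 10% survival ~ {threshold:.0f} V/cm")
```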

  4. The application of the statistical classifying models for signal evaluation of the gas sensors analyzing mold contamination of the building materials

    NASA Astrophysics Data System (ADS)

    Majerek, Dariusz; Guz, Łukasz; Suchorab, Zbigniew; Łagód, Grzegorz; Sobczuk, Henryk

    2017-07-01

    Mold that develops on moistened building barriers is a major cause of the Sick Building Syndrome (SBS). Fungal contamination is normally evaluated using standard biological methods, which are time-consuming and require a lot of manual labor. Fungi emit Volatile Organic Compounds (VOCs) that can be detected in the indoor air using several detection techniques, e.g. chromatography. VOCs can also be detected using gas sensor arrays. Each sensor in the array generates a particular voltage signal that ought to be analyzed using properly selected statistical methods of interpretation. This work focuses on applying statistical classification models to the evaluation of signals from a gas sensor matrix analyzing air sampled from the headspace of various types of building materials at different levels of contamination, as well as from clean reference materials.

  5. System Analysis for the Huntsville Operation Support Center, Distributed Computer System

    NASA Technical Reports Server (NTRS)

    Ingels, F. M.; Massey, D.

    1985-01-01

    HOSC, as a distributed computing system, is responsible for data acquisition and analysis during Space Shuttle operations. HOSC also provides computing services for Marshall Space Flight Center's nonmission activities. As mission and nonmission activities change, so do the support functions of HOSC, demonstrating the need for some method of simulating activity at HOSC in various configurations. The simulation developed in this work primarily models the HYPERchannel network. The model simulates the activity of a steady state network, reporting statistics such as transmitted bits, collision statistics, frame sequences transmitted, and average message delay. These statistics are used to evaluate performance indicators such as throughput, utilization, and delay. Thus the overall performance of the network is evaluated, and possible overload conditions can be predicted.

  6. Comparative evaluation of statistical and mechanistic models of Escherichia coli at beaches in southern Lake Michigan

    USGS Publications Warehouse

    Safaie, Ammar; Wendzel, Aaron; Ge, Zhongfu; Nevers, Meredith; Whitman, Richard L.; Corsi, Steven R.; Phanikumar, Mantha S.

    2016-01-01

    Statistical and mechanistic models are popular tools for predicting the levels of indicator bacteria at recreational beaches. Researchers tend to use one class of model or the other, and it is difficult to generalize statements about their relative performance due to differences in how the models are developed, tested, and used. We describe a cooperative modeling approach for freshwater beaches impacted by point sources in which insights derived from mechanistic modeling were used to further improve the statistical models and vice versa. The statistical models provided a basis for assessing the mechanistic models which were further improved using probability distributions to generate high-resolution time series data at the source, long-term “tracer” transport modeling based on observed electrical conductivity, better assimilation of meteorological data, and the use of unstructured-grids to better resolve nearshore features. This approach resulted in improved models of comparable performance for both classes including a parsimonious statistical model suitable for real-time predictions based on an easily measurable environmental variable (turbidity). The modeling approach outlined here can be used at other sites impacted by point sources and has the potential to improve water quality predictions resulting in more accurate estimates of beach closures.

  7. Incorporating big data into treatment plan evaluation: Development of statistical DVH metrics and visualization dashboards.

    PubMed

    Mayo, Charles S; Yao, John; Eisbruch, Avraham; Balter, James M; Litzenberg, Dale W; Matuszak, Martha M; Kessler, Marc L; Weyburn, Grant; Anderson, Carlos J; Owen, Dawn; Jackson, William C; Haken, Randall Ten

    2017-01-01

    To develop statistical dose-volume histogram (DVH)-based metrics and a visualization method to quantify the comparison of treatment plans with historical experience and among different institutions. The descriptive statistical summary (ie, median, first and third quartiles, and 95% confidence intervals) of volume-normalized DVH curve sets of past experiences was visualized through the creation of statistical DVH plots. Detailed distribution parameters were calculated and stored in JavaScript Object Notation files to facilitate management, including transfer and potential multi-institutional comparisons. In the treatment plan evaluation, structure DVH curves were scored against computed statistical DVHs and weighted experience scores (WESs). Individual, clinically used, DVH-based metrics were integrated into a generalized evaluation metric (GEM) as a priority-weighted sum of normalized incomplete gamma functions. Historical treatment plans for 351 patients with head and neck cancer, 104 with prostate cancer who were treated with conventional fractionation, and 94 with liver cancer who were treated with stereotactic body radiation therapy were analyzed to demonstrate the usage of statistical DVH, WES, and GEM in a plan evaluation. A shareable dashboard plugin was created to display statistical DVHs and integrate GEM and WES scores into a clinical plan evaluation within the treatment planning system. Benchmarking with normal tissue complication probability scores was carried out to compare the behavior of GEM and WES scores. DVH curves from historical treatment plans were characterized and presented, with difficult-to-spare structures (ie, frequently compromised organs at risk) identified. Quantitative evaluations by GEM and/or WES compared favorably with the normal tissue complication probability Lyman-Kutcher-Burman model, transforming a set of discrete threshold-priority limits into a continuous model reflecting physician objectives and historical experience. Statistical DVH offers an easy-to-read, detailed, and comprehensive way to visualize the quantitative comparison with historical experiences and among institutions. WES and GEM metrics offer a flexible means of incorporating discrete threshold-prioritizations and historic context into a set of standardized scoring metrics. Together, they provide a practical approach for incorporating big data into clinical practice for treatment plan evaluations.
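    The "statistical DVH" idea described above amounts to computing percentile bands across a library of historical DVH curves evaluated on a common dose grid, and scoring a new plan's curve against them. The sketch below shows that summary step only (not the WES or GEM scoring, whose exact functional forms are specific to the paper); the synthetic DVH curves and dose grid are placeholders.

```python
import numpy as np

rng = np.random.default_rng(5)
dose_grid = np.linspace(0, 70, 141)                      # Gy, common evaluation grid

def synthetic_dvh(d50, slope):
    """A cumulative DVH: fraction of structure volume receiving at least each dose."""
    return 1.0 / (1.0 + np.exp((dose_grid - d50) / slope))

# Library of historical DVH curves for one organ at risk (rows = past plans).
library = np.stack([synthetic_dvh(rng.normal(30, 5), rng.uniform(3, 6)) for _ in range(200)])

# Statistical DVH: median, quartiles, and a 95% band at every dose level.
stat_dvh = {
    "median": np.percentile(library, 50, axis=0),
    "q1": np.percentile(library, 25, axis=0),
    "q3": np.percentile(library, 75, axis=0),
    "lo95": np.percentile(library, 2.5, axis=0),
    "hi95": np.percentile(library, 97.5, axis=0),
}

# Score a new plan's curve by its average percentile rank within the historical set
# (lower is better for an organ at risk, since less volume is irradiated).
new_plan = synthetic_dvh(27, 4)
ranks = (library <= new_plan).mean(axis=0)               # percentile rank per dose level
print("mean percentile rank of new plan:", round(ranks.mean(), 3))
print("median V30Gy (library):", round(stat_dvh["median"][np.searchsorted(dose_grid, 30)], 3))
```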

  8. Design of a testing strategy using non-animal based test methods: lessons learnt from the ACuteTox project.

    PubMed

    Kopp-Schneider, Annette; Prieto, Pilar; Kinsner-Ovaskainen, Agnieszka; Stanzel, Sven

    2013-06-01

    In the framework of toxicology, a testing strategy can be viewed as a series of steps which are taken to come to a final prediction about a characteristic of a compound under study. The testing strategy is performed either as a single-step procedure, usually called a test battery, using simultaneously all information collected on different endpoints, or as a tiered approach in which a decision tree is followed. Design of a testing strategy involves statistical considerations, such as the development of a statistical prediction model. During the EU FP6 ACuteTox project, several prediction models were proposed on the basis of statistical classification algorithms, which we illustrate here. The final choice of testing strategies was not based on statistical considerations alone. However, without thorough statistical evaluations a testing strategy cannot be identified. We present here a number of observations made from the statistical viewpoint which relate to the development of testing strategies. The points we make were derived from problems we had to deal with during the evaluation of this large research project. A central issue during the development of a prediction model is the danger of overfitting. Procedures are presented to deal with this challenge. Copyright © 2012 Elsevier Ltd. All rights reserved.

  9. Using statistical and machine learning to help institutions detect suspicious access to electronic health records.

    PubMed

    Boxwala, Aziz A; Kim, Jihoon; Grillo, Janice M; Ohno-Machado, Lucila

    2011-01-01

    To determine whether statistical and machine-learning methods, when applied to electronic health record (EHR) access data, could help identify suspicious (ie, potentially inappropriate) access to EHRs. From EHR access logs and other organizational data collected over a 2-month period, the authors extracted 26 features likely to be useful in detecting suspicious accesses. Selected events were marked as either suspicious or appropriate by privacy officers, and served as the gold standard set for model evaluation. The authors trained logistic regression (LR) and support vector machine (SVM) models on 10-fold cross-validation sets of 1291 labeled events. The authors evaluated the sensitivity of final models on an external set of 58 events that were identified as truly inappropriate and investigated independently from this study using standard operating procedures. The area under the receiver operating characteristic curve of the models on the whole data set of 1291 events was 0.91 for LR, and 0.95 for SVM. The sensitivity of the baseline model on this set was 0.8. When the final models were evaluated on the set of 58 investigated events, all of which were determined as truly inappropriate, the sensitivity was 0 for the baseline method, 0.76 for LR, and 0.79 for SVM. The LR and SVM models may not generalize because of interinstitutional differences in organizational structures, applications, and workflows. Nevertheless, our approach for constructing the models using statistical and machine-learning techniques can be generalized. An important limitation is the relatively small sample used for the training set due to the effort required for its construction. The results suggest that statistical and machine-learning methods can play an important role in helping privacy officers detect suspicious accesses to EHRs.
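    A hedged sketch of the modeling workflow described here (logistic regression and an SVM evaluated with 10-fold cross-validated ROC AUC) is given below using scikit-learn; the synthetic feature matrix stands in for the 26 access-log features, which are not reproduced in the record.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for ~1300 labeled access events with 26 extracted features.
X, y = make_classification(n_samples=1291, n_features=26, n_informative=8,
                           weights=[0.9, 0.1], random_state=0)

models = {
    "logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM (RBF)": make_pipeline(StandardScaler(), SVC()),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    print(f"{name}: mean 10-fold ROC AUC = {auc.mean():.3f} (+/- {auc.std():.3f})")
```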

  10. Using statistical and machine learning to help institutions detect suspicious access to electronic health records

    PubMed Central

    Kim, Jihoon; Grillo, Janice M; Ohno-Machado, Lucila

    2011-01-01

    Objective To determine whether statistical and machine-learning methods, when applied to electronic health record (EHR) access data, could help identify suspicious (ie, potentially inappropriate) access to EHRs. Methods From EHR access logs and other organizational data collected over a 2-month period, the authors extracted 26 features likely to be useful in detecting suspicious accesses. Selected events were marked as either suspicious or appropriate by privacy officers, and served as the gold standard set for model evaluation. The authors trained logistic regression (LR) and support vector machine (SVM) models on 10-fold cross-validation sets of 1291 labeled events. The authors evaluated the sensitivity of final models on an external set of 58 events that were identified as truly inappropriate and investigated independently from this study using standard operating procedures. Results The area under the receiver operating characteristic curve of the models on the whole data set of 1291 events was 0.91 for LR, and 0.95 for SVM. The sensitivity of the baseline model on this set was 0.8. When the final models were evaluated on the set of 58 investigated events, all of which were determined as truly inappropriate, the sensitivity was 0 for the baseline method, 0.76 for LR, and 0.79 for SVM. Limitations The LR and SVM models may not generalize because of interinstitutional differences in organizational structures, applications, and workflows. Nevertheless, our approach for constructing the models using statistical and machine-learning techniques can be generalized. An important limitation is the relatively small sample used for the training set due to the effort required for its construction. Conclusion The results suggest that statistical and machine-learning methods can play an important role in helping privacy officers detect suspicious accesses to EHRs. PMID:21672912

  11. Seismic activity prediction using computational intelligence techniques in northern Pakistan

    NASA Astrophysics Data System (ADS)

    Asim, Khawaja M.; Awais, Muhammad; Martínez-Álvarez, F.; Iqbal, Talat

    2017-10-01

    An earthquake prediction study is carried out for the region of northern Pakistan. The prediction methodology involves an interdisciplinary interaction of seismology and computational intelligence. Eight seismic parameters are computed based upon past earthquakes. The predictive ability of these eight seismic parameters is evaluated in terms of information gain, which leads to the selection of six parameters to be used in prediction. Multiple computationally intelligent models have been developed for earthquake prediction using the selected seismic parameters. These models include a feed-forward neural network, recurrent neural network, random forest, multilayer perceptron, radial basis neural network, and support vector machine. The performance of every prediction model is evaluated, and McNemar's statistical test is applied to assess the statistical significance of the computational methodologies. The feed-forward neural network shows statistically significant predictions, with an accuracy of 75% and a positive predictive value of 78%, in the context of northern Pakistan.
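    The feature-screening step described above (ranking seismic parameters by information gain and keeping the most informative ones) can be sketched with scikit-learn's mutual information estimator standing in for information gain; the eight synthetic columns below are placeholders for the actual seismic parameters.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(6)
n = 400
# Placeholder matrix of eight seismic parameters and a binary "event occurred" label.
X = rng.normal(size=(n, 8))
y = (X[:, 0] + 0.8 * X[:, 2] - 0.6 * X[:, 5] + rng.normal(0, 1.0, n) > 0).astype(int)

gain = mutual_info_classif(X, y, random_state=0)     # information-gain-like ranking
order = np.argsort(gain)[::-1]
print("parameter ranking (best first):", order)
print("six selected parameters:", sorted(order[:6]))
```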

  12. Improving Our Ability to Evaluate Underlying Mechanisms of Behavioral Onset and Other Event Occurrence Outcomes: A Discrete-Time Survival Mediation Model

    PubMed Central

    Fairchild, Amanda J.; Abara, Winston E.; Gottschall, Amanda C.; Tein, Jenn-Yun; Prinz, Ronald J.

    2015-01-01

    The purpose of this article is to introduce and describe a statistical model that researchers can use to evaluate underlying mechanisms of behavioral onset and other event occurrence outcomes. Specifically, the article develops a framework for estimating mediation effects with outcomes measured in discrete-time epochs by integrating the statistical mediation model with discrete-time survival analysis. The methodology has the potential to help strengthen health research by targeting prevention and intervention work more effectively as well as by improving our understanding of discretized periods of risk. The model is applied to an existing longitudinal data set to demonstrate its use, and programming code is provided to facilitate its implementation. PMID:24296470

  13. ACCELERATED FAILURE TIME MODELS PROVIDE A USEFUL STATISTICAL FRAMEWORK FOR AGING RESEARCH

    PubMed Central

    Swindell, William R.

    2009-01-01

    Survivorship experiments play a central role in aging research and are performed to evaluate whether interventions alter the rate of aging and increase lifespan. The accelerated failure time (AFT) model is seldom used to analyze survivorship data, but offers a potentially useful statistical approach that is based upon the survival curve rather than the hazard function. In this study, AFT models were used to analyze data from 16 survivorship experiments that evaluated the effects of one or more genetic manipulations on mouse lifespan. Most genetic manipulations were found to have a multiplicative effect on survivorship that is independent of age and well-characterized by the AFT model “deceleration factor”. AFT model deceleration factors also provided a more intuitive measure of treatment effect than the hazard ratio, and were robust to departures from modeling assumptions. Age-dependent treatment effects, when present, were investigated using quantile regression modeling. These results provide an informative and quantitative summary of survivorship data associated with currently known long-lived mouse models. In addition, from the standpoint of aging research, these statistical approaches have appealing properties and provide valuable tools for the analysis of survivorship data. PMID:19007875
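    The "deceleration factor" summarized above can be made concrete with a small, hedged sketch: under a Weibull accelerated failure time model, a binary genetic manipulation multiplies survival times by exp(b1), which is the deceleration factor. The simulated lifetimes, censoring time, and parameter values below are illustrative, not data from the 16 experiments analyzed in the paper.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)

# Simulated survivorship data: a control and a "long-lived" group whose lifetimes
# are multiplied by a true deceleration factor of 1.3, with right-censoring at 1500 days.
n = 300
group = rng.integers(0, 2, n)                       # 0 = control, 1 = manipulated
scale = 800 * 1.3 ** group                          # Weibull characteristic life (days)
lifetimes = scale * rng.weibull(3.0, n)
censor_time = 1500
observed = (lifetimes <= censor_time).astype(float)
time = np.minimum(lifetimes, censor_time)

def neg_loglik(params):
    """Weibull AFT: log(scale_i) = b0 + b1 * group_i, with a shared shape k."""
    b0, b1, log_k = params
    k = np.exp(log_k)
    lam = np.exp(b0 + b1 * group)
    z = (time / lam) ** k
    ll = observed * (np.log(k) - np.log(lam) + (k - 1) * (np.log(time) - np.log(lam)) - z)
    ll += (1 - observed) * (-z)                     # censored observations contribute log S(t)
    return -ll.sum()

fit = minimize(neg_loglik, x0=[np.log(700), 0.0, 0.0], method="Nelder-Mead")
b0, b1, log_k = fit.x
print(f"estimated deceleration factor exp(b1) = {np.exp(b1):.2f} (true 1.30)")
print(f"estimated Weibull shape k = {np.exp(log_k):.2f} (true 3.00)")
```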

  14. Accelerated failure time models provide a useful statistical framework for aging research.

    PubMed

    Swindell, William R

    2009-03-01

    Survivorship experiments play a central role in aging research and are performed to evaluate whether interventions alter the rate of aging and increase lifespan. The accelerated failure time (AFT) model is seldom used to analyze survivorship data, but offers a potentially useful statistical approach that is based upon the survival curve rather than the hazard function. In this study, AFT models were used to analyze data from 16 survivorship experiments that evaluated the effects of one or more genetic manipulations on mouse lifespan. Most genetic manipulations were found to have a multiplicative effect on survivorship that is independent of age and well-characterized by the AFT model "deceleration factor". AFT model deceleration factors also provided a more intuitive measure of treatment effect than the hazard ratio, and were robust to departures from modeling assumptions. Age-dependent treatment effects, when present, were investigated using quantile regression modeling. These results provide an informative and quantitative summary of survivorship data associated with currently known long-lived mouse models. In addition, from the standpoint of aging research, these statistical approaches have appealing properties and provide valuable tools for the analysis of survivorship data.

  15. Evaluation Statistics Computed for the Wave Information Studies (WIS)

    DTIC Science & Technology

    2016-07-01

    Studies (WIS) by Mary A. Bryant, Tyler J. Hesser, and Robert E. Jensen. PURPOSE: This Coastal and Hydraulics Engineering Technical Note (CHETN)...describes the statistical metrics used by the Wave Information Studies (WIS) and produced as part of the model evaluation process. INTRODUCTION: The...gauge locations along the Pacific, Great Lakes, Gulf of Mexico, Atlantic, and Western Alaska coasts. Estimates of wave climatology produced by ocean

  16. Evaluation of The Operational Benefits Versus Costs of An Automated Cargo Mover

    DTIC Science & Technology

    2016-12-01

    logistics footprint and life-cycle cost are presented as part of this report. Analysis of modeling and simulation results identified statistically significant differences...

  17. Statistical robustness of machine-learning estimates for characterizing a groundwater-surface water system, Southland, New Zealand

    NASA Astrophysics Data System (ADS)

    Friedel, M. J.; Daughney, C.

    2016-12-01

    The development of a successful surface-groundwater management strategy depends on the quality of data provided for analysis. This study evaluates the statistical robustness when using a modified self-organizing map (MSOM) technique to estimate missing values for three hypersurface models: synoptic groundwater-surface water hydrochemistry, time-series of groundwater-surface water hydrochemistry, and mixed-survey (combination of groundwater-surface water hydrochemistry and lithologies) hydrostratigraphic unit data. These models of increasing complexity are developed and validated based on observations from the Southland region of New Zealand. In each case, the estimation method is sufficiently robust to cope with groundwater-surface water hydrochemistry vagaries due to sample size and extreme data insufficiency, even when >80% of the data are missing. The estimation of surface water hydrochemistry time series values enabled the evaluation of seasonal variation, and the imputation of lithologies facilitated the evaluation of hydrostratigraphic controls on groundwater-surface water interaction. The robust statistical results for groundwater-surface water models of increasing data complexity provide justification to apply the MSOM technique in other regions of New Zealand and abroad.

  18. Evolution of Natural Attenuation Evaluation Protocols

    EPA Science Inventory

    Traditionally the evaluation of the efficacy of natural attenuation was based on changes in contaminant concentrations and mass reduction. Statistical tools and models such as Bioscreen provided evaluation protocols which now are being approached via other vehicles including m...

  19. The trend of changes in the evaluation scores of faculty members from administrators' and students' perspectives at the medical school over 10 years.

    PubMed

    Yamani, Nikoo; Changiz, Tahereh; Feizi, Awat; Kamali, Farahnaz

    2018-01-01

    To assess the trend of changes in the evaluation scores of faculty members and the discrepancy between administrators' and students' perspectives in a medical school from 2006 to 2015. This repeated cross-sectional study was conducted on the 10-year evaluation scores of all faculty members of a medical school (n=579) in an urban area of Iran. Data on evaluation scores given by students and administrators, and the total of these scores, were evaluated. Data were analyzed using descriptive and inferential statistics, including linear mixed-effects models for repeated measures, via the SPSS software. There were statistically significant differences between the students' and administrators' perspectives over time (p < 0.001). The mean of the total evaluation scores also showed a statistically significant change over time (p < 0.001). Furthermore, the mean change over time in the total evaluation score differed significantly between departments (p < 0.001). The trend of changes in the students' evaluations was clear and positive, but the trend of the administrators' evaluations was unclear. Since the evaluation of faculty members is affected by many other factors, there is a need for more future studies.

  20. Performance comparison of LUR and OK in PM2.5 concentration mapping: a multidimensional perspective

    PubMed Central

    Zou, Bin; Luo, Yanqing; Wan, Neng; Zheng, Zhong; Sternberg, Troy; Liao, Yilan

    2015-01-01

    Methods of Land Use Regression (LUR) modeling and Ordinary Kriging (OK) interpolation have been widely used to offset the shortcomings of PM2.5 data observed at sparse monitoring sites. However, the traditional point-based strategy for evaluating these methods has remained largely unchanged, which can lead to unreasonable mapping results. To address this challenge, this study employs ‘information entropy’, an area-based statistic, along with traditional point-based statistics (e.g. error rate, RMSE) to evaluate the performance of the LUR model and OK interpolation in mapping PM2.5 concentrations in Houston from a multidimensional perspective. The point-based validation reveals significant differences between LUR and OK at different test sites despite similar end-result accuracy (e.g. error rate 6.13% vs. 7.01%). Meanwhile, the area-based validation demonstrates that the PM2.5 concentrations simulated by the LUR model exhibit more detailed variations than those interpolated by the OK method (i.e. information entropy, 7.79 vs. 3.63). Results suggest that LUR modeling can better capture the spatial distribution of PM2.5 concentrations than OK interpolation. The significance of this study primarily lies in promoting the integration of point- and area-based statistics for model performance evaluation in air pollution mapping. PMID:25731103
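
    Because the comparison above hinges on 'information entropy' as an area-based statistic, a minimal sketch of how such an entropy value might be computed from a gridded concentration surface follows; the 10-bin histogram and the random stand-in surface are illustrative assumptions, not the authors' exact procedure (the reported values depend on the binning used).

    ```python
    # Minimal sketch: Shannon entropy of a gridded PM2.5 surface as an area-based statistic.
    # The bin count and the synthetic surface are illustrative assumptions only.
    import numpy as np

    def surface_entropy(grid, bins=10):
        """Shannon entropy (bits) of the value distribution over all grid cells."""
        counts, _ = np.histogram(grid[~np.isnan(grid)], bins=bins)
        p = counts / counts.sum()
        p = p[p > 0]                      # drop empty bins to avoid log(0)
        return float(-(p * np.log2(p)).sum())

    lur_surface = np.random.default_rng(0).gamma(2.0, 5.0, size=(200, 200))  # stand-in for a LUR map
    print(surface_entropy(lur_surface))   # higher values indicate more detailed spatial variation
    ```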

  1. Using Patient Demographics and Statistical Modeling to Predict Knee Tibia Component Sizing in Total Knee Arthroplasty.

    PubMed

    Ren, Anna N; Neher, Robert E; Bell, Tyler; Grimm, James

    2018-06-01

    Preoperative planning is important to achieve successful implantation in primary total knee arthroplasty (TKA). However, traditional TKA templating techniques are not accurate enough to predict component size within a narrow range. With the goal of developing a general predictive statistical model using patient demographic information, ordinal logistic regression was applied to build a proportional odds model to predict the tibia component size. The study retrospectively collected data from 1992 primary Persona Knee System TKA procedures. Of these, 199 procedures were randomly selected as testing data, and the remaining data were randomly partitioned into model training and model evaluation sets at a ratio of 7:3. Different models were trained and evaluated on the training and validation data sets after data exploration. The final model had patient gender, age, weight, and height as independent variables and predicted the tibia size within one size 96% of the time on the validation data, 94% of the time on the testing data, and 92% on a prospective cadaver data set. The study results indicated that the statistical model built by ordinal logistic regression can increase the accuracy of tibia sizing information for Persona Knee preoperative templating. This research shows statistical modeling may be used with radiographs to dramatically enhance templating accuracy, efficiency, and quality. In general, this methodology can be applied to other TKA products when the data are applicable. Copyright © 2018 Elsevier Inc. All rights reserved.
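
    As a rough illustration of the proportional-odds approach described above, the sketch below fits an ordinal logistic regression with statsmodels and checks how often the predicted size falls within one size of the true value. The file name, column names, and the evaluation step are assumptions for illustration, not the study's actual pipeline.

    ```python
    # Minimal sketch of a proportional-odds (ordinal logistic regression) model for component sizing.
    # Column names (gender, age, weight, height, tibia_size) are hypothetical.
    import numpy as np
    import pandas as pd
    from statsmodels.miscmodels.ordinal_model import OrderedModel

    df = pd.read_csv("tka_cases.csv")                                  # hypothetical data set
    df["tibia_size"] = df["tibia_size"].astype(pd.CategoricalDtype(ordered=True))

    X = pd.get_dummies(df[["gender", "age", "weight", "height"]], drop_first=True).astype(float)
    result = OrderedModel(df["tibia_size"], X, distr="logit").fit(method="bfgs", disp=False)

    pred_code = np.asarray(result.predict(X)).argmax(axis=1)           # most probable size category
    true_code = df["tibia_size"].cat.codes.to_numpy()
    within_one = (np.abs(pred_code - true_code) <= 1).mean()           # share predicted within one size
    print(f"Predicted within one size: {within_one:.1%}")
    ```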

  2. Statistical appearance models based on probabilistic correspondences.

    PubMed

    Krüger, Julia; Ehrhardt, Jan; Handels, Heinz

    2017-04-01

    Model-based image analysis is indispensable in medical image processing. One key aspect of building statistical shape and appearance models is the determination of one-to-one correspondences in the training data set. At the same time, the identification of these correspondences is the most challenging part of such methods. In our earlier work, we developed an alternative method using correspondence probabilities instead of exact one-to-one correspondences for a statistical shape model (Hufnagel et al., 2008). In this work, a new approach for statistical appearance models without one-to-one correspondences is proposed. A sparse image representation is used to build a model that combines point position and appearance information at the same time. Probabilistic correspondences between the derived multi-dimensional feature vectors are used to eliminate the need for extensive preprocessing to find landmarks and correspondences, as well as to reduce the dependence of the generated model on the landmark positions. Model generation and model fitting can now be expressed by optimizing a single global criterion derived from a maximum a-posteriori (MAP) approach with respect to model parameters that directly affect both shape and appearance of the considered objects inside the images. The proposed approach describes statistical appearance modeling in a concise and flexible mathematical framework. Besides eliminating the demand for costly correspondence determination, the method allows for additional constraints, such as topological regularity, in the modeling process. In the evaluation, the model was applied for segmentation and landmark identification in hand X-ray images. The results demonstrate the feasibility of the model to detect hand contours as well as the positions of the joints between finger bones for unseen test images. Further, we evaluated the model on brain data of stroke patients to show the ability of the proposed model to handle partially corrupted data and to demonstrate a possible employment of the correspondence probabilities to indicate these corrupted/pathological areas. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. Small Sample Statistics for Incomplete Nonnormal Data: Extensions of Complete Data Formulae and a Monte Carlo Comparison

    ERIC Educational Resources Information Center

    Savalei, Victoria

    2010-01-01

    Incomplete nonnormal data are common occurrences in applied research. Although these two problems are often dealt with separately by methodologists, they often co-occur. Very little has been written about statistics appropriate for evaluating models with such data. This article extends several existing statistics for complete nonnormal data to…

  4. Guidelines 13 and 14—Prediction uncertainty

    USGS Publications Warehouse

    Hill, Mary C.; Tiedeman, Claire

    2005-01-01

    An advantage of using optimization for model development and calibration is that optimization provides methods for evaluating and quantifying prediction uncertainty. Both deterministic and statistical methods can be used. Guideline 13 discusses using regression and post-audits, which we classify as deterministic methods. Guideline 14 discusses inferential statistics and Monte Carlo methods, which we classify as statistical methods.

  5. Statistical analysis of water-quality data containing multiple detection limits: S-language software for regression on order statistics

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2005-01-01

    Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. © 2005 Elsevier Ltd. All rights reserved.
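
    The ROS procedure implemented in the described R library handles multiple detection limits via Helsel-Cohn plotting positions. As a much-simplified illustration of the core idea only (regress log concentrations of detected values on normal quantiles, then impute censored values from the fitted line), a single-detection-limit sketch is given below; the data are made up and the simplification is an assumption, not the published algorithm.

    ```python
    # Simplified sketch of regression on order statistics (ROS) for left-censored data with a
    # single detection limit; the full ROS method handles multiple detection limits.
    import numpy as np
    from scipy import stats

    conc = np.array([0.5, 0.5, 0.5, 0.8, 1.2, 1.9, 2.4, 3.1, 5.6, 9.0])  # reported values
    censored = np.array([True, True, True] + [False] * 7)                # True means "<0.5"

    n = len(conc)
    pp = (np.arange(1, n + 1) - 0.375) / (n + 0.25)     # Blom plotting positions for ranked data
    z = stats.norm.ppf(pp)                              # corresponding normal quantiles

    order = np.argsort(conc, kind="stable")
    logc, cens = np.log(conc[order]), censored[order]

    # Fit the line through detected (uncensored) observations only
    slope, intercept, *_ = stats.linregress(z[~cens], logc[~cens])

    filled = logc.copy()
    filled[cens] = intercept + slope * z[cens]          # impute censored values from the fit
    est = np.exp(filled)
    print(est.mean(), est.std(ddof=1))                  # summary statistics of the filled-in data
    ```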

  6. Evaluating climate models: Should we use weather or climate observations?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oglesby, Robert J; Erickson III, David J

    2009-12-01

    Calling the numerical models that we use for simulations of climate change 'climate models' is a bit of a misnomer. These 'general circulation models' (GCMs, AKA global climate models) and their cousins the 'regional climate models' (RCMs) are actually physically-based weather simulators. That is, these models simulate, either globally or locally, daily weather patterns in response to some change in forcing or boundary condition. These simulated weather patterns are then aggregated into climate statistics, very much as we aggregate observations into 'real climate statistics'. Traditionally, the output of GCMs has been evaluated using climate statistics, as opposed to their ability to simulate realistic daily weather observations. At the coarse global scale this may be a reasonable approach; however, as RCMs downscale to increasingly higher resolutions, the conjunction between weather and climate becomes more problematic. We present results from a series of present-day climate simulations using the WRF ARW for domains that cover North America, much of Latin America, and South Asia. The basic domains are at a 12 km resolution, but several inner domains at 4 km have also been simulated. These include regions of complex topography in Mexico, Colombia, Peru, and Sri Lanka, as well as a region of low topography and fairly homogeneous land surface type (the U.S. Great Plains). Model evaluations are performed using standard climate analyses (e.g., reanalyses; NCDC data) but also using time series of daily station observations. Preliminary results suggest little difference in the assessment of long-term mean quantities, but the variability on seasonal and interannual timescales is better described. Furthermore, the value added by using daily weather observations as an evaluation tool increases with the model resolution.

  7. Model Performance Evaluation and Scenario Analysis ...

    EPA Pesticide Factsheets

    This tool consists of two parts: model performance evaluation and scenario analysis (MPESA). The model performance evaluation consists of two components: model performance evaluation metrics and model diagnostics. These metrics provide modelers with statistical goodness-of-fit measures that capture magnitude-only, sequence-only, and combined magnitude and sequence errors. The performance measures include error analysis, the coefficient of determination, Nash-Sutcliffe efficiency, and a new weighted rank method. These performance metrics only provide useful information about the overall model performance. Note that MPESA is based on the separation of observed and simulated time series into magnitude and sequence components. The separation of time series into magnitude and sequence components and the reconstruction back to time series provides diagnostic insights to modelers. For example, traditional approaches lack the capability to identify whether the source of uncertainty in the simulated data is the quality of the input data or the way the analyst adjusted the model parameters. This report presents a suite of model diagnostics that identify whether mismatches between observed and simulated data result from magnitude- or sequence-related errors. MPESA offers graphical and statistical options that allow HSPF users to compare observed and simulated time series and identify the parameter values to adjust or the input data to modify. The scenario analysis part of the tool...
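
    Two of the goodness-of-fit measures named above (Nash-Sutcliffe efficiency and a bias-type error statistic) reduce to short formulas; a minimal sketch with synthetic observed and simulated series is shown below. It illustrates the standard definitions, not MPESA's internal implementation.

    ```python
    # Minimal sketch of two common goodness-of-fit metrics: Nash-Sutcliffe efficiency (NSE)
    # and percent bias (PBIAS). The observed/simulated series are synthetic.
    import numpy as np

    def nse(obs, sim):
        obs, sim = np.asarray(obs, float), np.asarray(sim, float)
        return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

    def pbias(obs, sim):
        obs, sim = np.asarray(obs, float), np.asarray(sim, float)
        return 100.0 * np.sum(sim - obs) / np.sum(obs)

    observed  = [3.1, 4.5, 2.2, 6.8, 5.0, 4.1]
    simulated = [2.9, 4.9, 2.5, 6.1, 5.4, 3.8]
    print(f"NSE = {nse(observed, simulated):.3f}, PBIAS = {pbias(observed, simulated):.1f}%")
    ```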

  8. Ten Years of Cloud Properties from MODIS: Global Statistics and Use in Climate Model Evaluation

    NASA Technical Reports Server (NTRS)

    Platnick, Steven E.

    2011-01-01

    The NASA Moderate Resolution Imaging Spectroradiometer (MODIS), launched onboard the Terra and Aqua spacecraft, began Earth observations on February 24, 2000 and June 24, 2002, respectively. Among the algorithms developed and applied to this sensor, a suite of cloud products includes cloud masking/detection, cloud-top properties (temperature, pressure), and optical properties (optical thickness, effective particle radius, water path, and thermodynamic phase). All cloud algorithms underwent numerous changes and enhancements for the latest Collection 5 production version; this process continues with the current Collection 6 development. We will show example MODIS Collection 5 cloud climatologies derived from global spatial and temporal aggregations provided in the archived gridded Level-3 MODIS atmosphere team product (product names MOD08 and MYD08 for MODIS Terra and Aqua, respectively). Data sets in this Level-3 product include scalar statistics as well as 1- and 2-D histograms of many cloud properties, allowing for higher order information and correlation studies. In addition to these statistics, we will show trends and statistical significance in annual and seasonal means for a variety of the MODIS cloud properties, as well as the time required for detection given assumed trends. To assist in climate model evaluation, we have developed a MODIS cloud simulator with an accompanying netCDF file containing subsetted monthly Level-3 statistical data sets that correspond to the simulator output. Correlations of cloud properties with ENSO offer the potential to evaluate model cloud sensitivity; initial results will be discussed.

  9. Evaluation of Regression Models of Balance Calibration Data Using an Empirical Criterion

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert; Volden, Thomas R.

    2012-01-01

    An empirical criterion for assessing the significance of individual terms of regression models of wind tunnel strain gage balance outputs is evaluated. The criterion is based on the percent contribution of a regression model term. It considers a term to be significant if its percent contribution exceeds the empirical threshold of 0.05%. The criterion has the advantage that it can easily be computed using the regression coefficients of the gage outputs and the load capacities of the balance. First, a definition of the empirical criterion is provided. Then, it is compared with an alternate statistical criterion that is widely used in regression analysis. Finally, calibration data sets from a variety of balances are used to illustrate the connection between the empirical and the statistical criterion. A review of these results indicated that the empirical criterion seems to be suitable for a crude assessment of the significance of a regression model term as the boundary between a significant and an insignificant term cannot be defined very well. Therefore, regression model term reduction should only be performed by using the more universally applicable statistical criterion.
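
    The abstract states that the percent contribution can be computed from the regression coefficients and the load capacities. One possible reading, used purely for illustration, is that a term's contribution is the magnitude of its coefficient times the term evaluated at capacity, expressed as a percentage of the gage output at capacity; the sketch below encodes that assumption with made-up numbers and is not NASA's exact formula.

    ```python
    # Hedged sketch of a percent-contribution screen for regression model terms.
    # Assumption: contribution = |coefficient * term value at load capacity|, as a
    # percentage of the predicted gage output at capacity. Numbers are hypothetical.
    capacities = {"N1": 2500.0, "N2": 1800.0}
    terms = {
        "N1":    {"coef": 1.2e-3,  "value": capacities["N1"]},
        "N2":    {"coef": 4.0e-4,  "value": capacities["N2"]},
        "N1*N2": {"coef": 2.0e-10, "value": capacities["N1"] * capacities["N2"]},
        "N1**2": {"coef": 5.0e-12, "value": capacities["N1"] ** 2},
    }

    output_at_capacity = sum(t["coef"] * t["value"] for t in terms.values())
    for name, t in terms.items():
        pct = abs(t["coef"] * t["value"]) / abs(output_at_capacity) * 100.0
        flag = "keep" if pct > 0.05 else "drop"          # empirical 0.05% threshold
        print(f"{name:>6}: {pct:8.4f}%  -> {flag}")
    ```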

  10. A new in silico classification model for ready biodegradability, based on molecular fragments.

    PubMed

    Lombardo, Anna; Pizzo, Fabiola; Benfenati, Emilio; Manganaro, Alberto; Ferrari, Thomas; Gini, Giuseppina

    2014-08-01

    Regulations such as the European REACH (Registration, Evaluation, Authorization and restriction of Chemicals) often require chemicals to be evaluated for ready biodegradability, to assess the potential risk for environmental and human health. Because not all chemicals can be tested, there is an increasing demand for tools for quick and inexpensive biodegradability screening, such as computer-based (in silico) theoretical models. We developed an in silico model starting from a dataset of 728 chemicals with ready biodegradability data (MITI test, Ministry of International Trade and Industry). We used the novel software SARpy to automatically extract, through a structural fragmentation process, a set of substructures statistically related to ready biodegradability. Then, we analysed these substructures in order to build some general rules. The model consists of a rule set made up of the combination of the statistically relevant fragments and of the expert-based rules. The model gives good statistical performance, with 92%, 82%, and 76% accuracy on the training, test, and external sets, respectively. These results are comparable with other in silico models such as BIOWIN, developed by the United States Environmental Protection Agency (EPA); moreover, this new model includes an easily understandable explanation. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. Validation of the measure automobile emissions model : a statistical analysis

    DOT National Transportation Integrated Search

    2000-09-01

    The Mobile Emissions Assessment System for Urban and Regional Evaluation (MEASURE) model provides an external validation capability for the hot stabilized option; the model is one of several new modal emissions models designed to predict hot stabilized e...

  12. Transfer Student Success: Educationally Purposeful Activities Predictive of Undergraduate GPA

    ERIC Educational Resources Information Center

    Fauria, Renee M.; Fuller, Matthew B.

    2015-01-01

    Researchers evaluated the effects of Educationally Purposeful Activities (EPAs) on transfer and nontransfer students' cumulative GPAs. Hierarchical, linear, and multiple regression models yielded seven statistically significant educationally purposeful items that influenced undergraduate student GPAs. Statistically significant positive EPAs for…

  13. Evaluation of the "e-rater"® Scoring Engine for the "TOEFL"® Independent and Integrated Prompts. Research Report. ETS RR-12-06

    ERIC Educational Resources Information Center

    Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent

    2012-01-01

    Scoring models for the "e-rater"® system were built and evaluated for the "TOEFL"® exam's independent and integrated writing prompts. Prompt-specific and generic scoring models were built, and evaluation statistics, such as weighted kappas, Pearson correlations, standardized differences in mean scores, and correlations with…

  14. Evaluation of the "e-rater"® Scoring Engine for the "GRE"® Issue and Argument Prompts. Research Report. ETS RR-12-02

    ERIC Educational Resources Information Center

    Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent

    2012-01-01

    Automated scoring models for the "e-rater"® scoring engine were built and evaluated for the "GRE"® argument and issue-writing tasks. Prompt-specific, generic, and generic with prompt-specific intercept scoring models were built and evaluation statistics such as weighted kappas, Pearson correlations, standardized difference in…

  15. Wave and Wind Model Performance Metrics Tools

    NASA Astrophysics Data System (ADS)

    Choi, J. K.; Wang, D. W.

    2016-02-01

    Continual improvements and upgrades of Navy ocean wave and wind models are essential to the assurance of battlespace environment predictability of ocean surface wave and surf conditions in support of Naval global operations. Thus, constant verification and validation of model performance is equally essential to assure the progress of model developments and maintain confidence in the predictions. Global and regional scale model evaluations may require large areas and long periods of time. For observational data to compare against, altimeter winds and waves along the tracks of past and current operational satellites as well as moored/drifting buoys can be used for global and regional coverage. Using data and model runs from previous trials, such as the Dynamics of the Adriatic in Real Time (DART) experiment, we demonstrated the use of accumulated altimeter wind and wave data over several years to obtain an objective evaluation of the performance of the SWAN (Simulating Waves Nearshore) model running in the Adriatic Sea. The assessment provided detailed performance of the wind and wave models by using maps of cell-averaged statistical variables with spatial statistics, including slope, correlation, and scatter index, to summarize model performance. Such a methodology is easily generalized to other regions and to global scales. Operational technology currently used by subject matter experts evaluating the Navy Coastal Ocean Model and the Hybrid Coordinate Ocean Model can be expanded to evaluate wave and wind models using tools developed for ArcMAP, a GIS application developed by ESRI. Recent inclusion of altimeter and buoy data into a common format through the Naval Oceanographic Office's (NAVOCEANO) quality control system, together with netCDF standards applicable to all model output, makes the fusion of these data and direct model verification possible. Also, procedures were developed for the accumulation of match-ups of modelled and observed parameters to form a database from which statistics are readily calculated, for the short or long term. Such a system has potential for a quick transition to operations at NAVOCEANO.
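
    The cell-averaged verification statistics mentioned above (bias/slope, correlation, and scatter index) reduce to a few lines of arithmetic; a minimal sketch with synthetic model-versus-altimeter match-ups is given below, where scatter index is taken as RMSE normalized by the observed mean.

    ```python
    # Minimal sketch of basic wave/wind verification statistics for model-vs-altimeter match-ups:
    # bias, RMSE, scatter index (RMSE / mean of observations), and linear correlation.
    import numpy as np

    obs   = np.array([1.2, 2.4, 0.9, 3.1, 2.0, 1.6])   # synthetic observed wave heights (m)
    model = np.array([1.0, 2.7, 1.1, 2.8, 2.2, 1.5])   # synthetic modeled values (m)

    bias = np.mean(model - obs)
    rmse = np.sqrt(np.mean((model - obs) ** 2))
    scatter_index = rmse / np.mean(obs)
    corr = np.corrcoef(obs, model)[0, 1]

    print(f"bias={bias:+.2f} m, RMSE={rmse:.2f} m, SI={scatter_index:.2f}, r={corr:.2f}")
    ```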

  16. Meteorological models for estimating phenology of corn

    NASA Technical Reports Server (NTRS)

    Daughtry, C. S. T.; Cochran, J. C.; Hollinger, S. E.

    1984-01-01

    Knowledge of when critical crop stages occur and how the environment affects them should provide useful information for crop management decisions and crop production models. Two sources of data were evaluated for predicting dates of silking and physiological maturity of corn (Zea mays L.). Initial evaluations were conducted using data of an adapted corn hybrid grown on a Typic Agriaquoll at the Purdue University Agronomy Farm. The second phase extended the analyses to large areas using data acquired by the Statistical Reporting Service of USDA for crop reporting districts (CRD) in Indiana and Iowa. Several thermal models were compared to calendar days for predicting dates of silking and physiological maturity. Mixed models which used a combination of thermal units to predict silking and days after silking to predict physiological maturity were also evaluated. At the Agronomy Farm the models were calibrated and tested on the same data. The thermal models were significantly less biased and more accurate than calendar days for predicting dates of silking. Differences among the thermal models were small. Significant improvements in both bias and accuracy were observed when the mixed models were used to predict dates of physiological maturity. The results indicate that statistical data for CRD can be used to evaluate models developed at agricultural experiment stations.
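
    The thermal models referenced above accumulate heat units from daily temperatures; a minimal sketch of the common growing-degree-day form, with the 10 °C base and 30 °C cap often used for corn, is shown below. The thresholds, the synthetic temperatures, and the silking threshold are illustrative assumptions, not the calibrated models of the study.

    ```python
    # Minimal sketch of thermal-unit (growing degree day) accumulation for corn phenology,
    # using an assumed 10 C base and 30 C cap; all numbers are illustrative.
    import numpy as np

    def daily_gdd(tmax, tmin, base=10.0, cap=30.0):
        tmax = min(max(tmax, base), cap)        # cap high temperatures, floor at base
        tmin = min(max(tmin, base), cap)
        return (tmax + tmin) / 2.0 - base

    tmax_series = [24, 28, 31, 27, 33, 29, 26]  # synthetic daily maxima (C)
    tmin_series = [12, 15, 18, 16, 20, 17, 14]  # synthetic daily minima (C)

    gdd = np.cumsum([daily_gdd(a, b) for a, b in zip(tmax_series, tmin_series)])
    silking_threshold = 750.0                   # hypothetical accumulated GDD needed for silking
    status = "silking predicted" if gdd[-1] >= silking_threshold else "silking not yet predicted"
    print(f"{gdd[-1]:.1f} GDD accumulated; {status}")
    ```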

  17. Inservice trainings for Shiraz University of Medical Sciences employees: Effectiveness assessment by using the CIPP model

    PubMed Central

    MOKHTARZADEGAN, MARYAM; AMINI, MITRA; TAKMIL, FARNAZ; ADAMIAT, MOHAMMAD; SARVERAVAN, POONEH

    2015-01-01

    Introduction Nowadays, the employees' in-service training has become one of the core components in the survival and success of any organization. Unfortunately, despite the importance of training evaluation, a small portion of resources is allocated to this matter. Among many evaluation models, the CIPP model, or Context, Input, Process, Product model, is a very useful approach to educational evaluation. So far, the evaluation of training courses has mostly provided information for learners, but this investigation aims at evaluating the effectiveness of the experts' training programs in SUMS and identifying its pros and cons based on the 4 stages of the CIPP model. Method In this descriptive analytical study, done in 2013, 250 employees of SUMS who had participated in in-service training courses were randomly selected. The evaluated variables were designed using the CIPP model, and a researcher-made questionnaire was used for data collection; the questionnaire was validated using expert opinion and its reliability was confirmed by Cronbach's alpha (0.89). Quantitative data were analyzed using SPSS 14, and statistical tests were done as needed. Results In the context phase, the mean score was highest in solving work problems (4.07±0.88) and lowest in focusing on learners' learning styles in the training courses (2.68±0.91). There was a statistically significant difference between the employees' education level and the product phase evaluation (p<0.001). The necessary effectiveness was not statistically significant at the context and input levels (p>0.001), in contrast with the process and product phases, which showed a significant difference (p<0.001). Conclusion Considering our results, although the in-service training given to SUMS employees has been effective in many ways, it has some weaknesses as well. Therefore, improving these weaknesses and reinforcing strong points within the identified fields in this study should be taken into account by decision makers and administrators. PMID:25927072

  18. Inservice trainings for Shiraz University of Medical Sciences employees: Effectiveness assessment by using the CIPP model.

    PubMed

    Mokhtarzadegan, Maryam; Amini, Mitra; Takmil, Farnaz; Adamiat, Mohammad; Sarveravan, Pooneh

    2015-04-01

    Nowadays, the employees' in-service training has become one of the core components in the survival and success of any organization. Unfortunately, despite the importance of training evaluation, a small portion of resources is allocated to this matter. Among many evaluation models, the CIPP model, or Context, Input, Process, Product model, is a very useful approach to educational evaluation. So far, the evaluation of training courses has mostly provided information for learners, but this investigation aims at evaluating the effectiveness of the experts' training programs in SUMS and identifying its pros and cons based on the 4 stages of the CIPP model. In this descriptive analytical study, done in 2013, 250 employees of SUMS who had participated in in-service training courses were randomly selected. The evaluated variables were designed using the CIPP model, and a researcher-made questionnaire was used for data collection; the questionnaire was validated using expert opinion and its reliability was confirmed by Cronbach's alpha (0.89). Quantitative data were analyzed using SPSS 14, and statistical tests were done as needed. In the context phase, the mean score was highest in solving work problems (4.07±0.88) and lowest in focusing on learners' learning styles in the training courses (2.68±0.91). There was a statistically significant difference between the employees' education level and the product phase evaluation (p<0.001). The necessary effectiveness was not statistically significant at the context and input levels (p>0.001), in contrast with the process and product phases, which showed a significant difference (p<0.001). Considering our results, although the in-service training given to SUMS employees has been effective in many ways, it has some weaknesses as well. Therefore, improving these weaknesses and reinforcing strong points within the identified fields in this study should be taken into account by decision makers and administrators.

  19. Evolution in Cloud Population Statistics of the MJO: From AMIE Field Observations to Global Cloud-Permitting Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Chidong

    Motivated by the success of the AMIE/DYNAMO field campaign, which collected unprecedented observations of cloud and precipitation from the tropical Indian Ocean in October 2011 – March 2012, this project explored how such observations can be applied to assist the development of global cloud-permitting models through evaluating and correcting model biases in cloud statistics. The main accomplishments of this project were made in four categories: generating observational products for model evaluation, using AMIE/DYNAMO observations to validate global model simulations, using AMIE/DYNAMO observations in numerical studies of cloud-permitting models, and providing leadership in the field. Results from this project provide valuable information for building a seamless bridge between the DOE ASR program's component on process-level understanding of cloud processes in the tropics and the RGCM focus on global variability and regional extremes. In particular, experience gained from this project would be directly applicable to evaluation and improvements of ACME, especially as it transitions to a non-hydrostatic variable-resolution model.

  20. External Validation of a Case-Mix Adjustment Model for the Standardized Reporting of 30-Day Stroke Mortality Rates in China.

    PubMed

    Yu, Ping; Pan, Yuesong; Wang, Yongjun; Wang, Xianwei; Liu, Liping; Ji, Ruijun; Meng, Xia; Jing, Jing; Tong, Xu; Guo, Li; Wang, Yilong

    2016-01-01

    A case-mix adjustment model has been developed and externally validated, demonstrating promise. However, the model has not been thoroughly tested among populations in China. In our study, we evaluated the performance of the model in Chinese patients with acute stroke. The case-mix adjustment model A includes items on age, presence of atrial fibrillation on admission, National Institutes of Health Stroke Severity Scale (NIHSS) score on admission, and stroke type. Model B is similar to Model A but includes only the consciousness component of the NIHSS score. Both model A and B were evaluated to predict 30-day mortality rates in 13,948 patients with acute stroke from the China National Stroke Registry. The discrimination of the models was quantified by c-statistic. Calibration was assessed using Pearson's correlation coefficient. The c-statistic of model A in our external validation cohort was 0.80 (95% confidence interval, 0.79-0.82), and the c-statistic of model B was 0.82 (95% confidence interval, 0.81-0.84). Excellent calibration was reported in the two models with Pearson's correlation coefficient (0.892 for model A, p<0.001; 0.927 for model B, p = 0.008). The case-mix adjustment model could be used to effectively predict 30-day mortality rates in Chinese patients with acute stroke.
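
    The c-statistic used above for discrimination is equivalent to the area under the ROC curve; a minimal sketch of computing it, together with a crude decile-based calibration correlation, is given below with synthetic predictions. The decile grouping is an illustrative choice, not the validation study's exact calibration procedure.

    ```python
    # Minimal sketch: c-statistic (ROC AUC) for discrimination plus a decile-based calibration
    # correlation, using synthetic 30-day mortality risk predictions.
    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(1)
    pred_risk = rng.uniform(0, 1, 2000)                     # synthetic predicted probabilities
    died = rng.uniform(0, 1, 2000) < pred_risk              # synthetic outcomes consistent with risk

    c_statistic = roc_auc_score(died, pred_risk)

    deciles = np.digitize(pred_risk, np.quantile(pred_risk, np.linspace(0.1, 0.9, 9)))
    pred_mean = np.array([pred_risk[deciles == d].mean() for d in range(10)])
    obs_rate  = np.array([died[deciles == d].mean() for d in range(10)])
    calibration_r = np.corrcoef(pred_mean, obs_rate)[0, 1]  # Pearson r, predicted vs. observed

    print(f"c-statistic = {c_statistic:.3f}, calibration r = {calibration_r:.3f}")
    ```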

  1. Developing statistical wildlife habitat relationships for assessing cumulative effects of fuels treatments: Final Report for Joint Fire Science Program Project

    Treesearch

    Samuel A. Cushman; Kevin S. McKelvey

    2006-01-01

    The primary weakness in our current ability to evaluate future landscapes in terms of wildlife lies in the lack of quantitative models linking wildlife to forest stand conditions, including fuels treatments. This project focuses on 1) developing statistical wildlife habitat relationships models (WHR) utilizing Forest Inventory and Analysis (FIA) and National Vegetation...

  2. Using "Excel" for White's Test--An Important Technique for Evaluating the Equality of Variance Assumption and Model Specification in a Regression Analysis

    ERIC Educational Resources Information Center

    Berenson, Mark L.

    2013-01-01

    There is consensus in the statistical literature that severe departures from its assumptions invalidate the use of regression modeling for purposes of inference. The assumptions of regression modeling are usually evaluated subjectively through visual, graphic displays in a residual analysis but such an approach, taken alone, may be insufficient…
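
    For readers working outside a spreadsheet, the same White's test is available in statsmodels; the sketch below runs it on simulated data whose error variance grows with the predictor, so the test should flag heteroscedasticity. The data and model are illustrative only.

    ```python
    # Minimal sketch of White's test for heteroscedasticity via statsmodels; the data are
    # simulated so that the error spread increases with x (violating equal variance).
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.diagnostic import het_white

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, 200)
    y = 2.0 + 0.5 * x + rng.normal(scale=0.2 * x)   # heteroscedastic errors

    X = sm.add_constant(x)
    fit = sm.OLS(y, X).fit()

    lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(fit.resid, X)
    print(f"White's LM statistic = {lm_stat:.2f}, p-value = {lm_pvalue:.4f}")
    # A small p-value suggests the equal-variance assumption is violated.
    ```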

  3. Maximum-likelihood curve-fitting scheme for experiments with pulsed lasers subject to intensity fluctuations.

    PubMed

    Metz, Thomas; Walewski, Joachim; Kaminski, Clemens F

    2003-03-20

    Evaluation schemes, e.g., least-squares fitting, are not generally applicable to all types of experiments. If the evaluation scheme is not derived from a measurement model that properly describes the experiment to be evaluated, poorer precision or accuracy than attainable from the measured data can result. We outline ways in which statistical data evaluation schemes should be derived for all types of experiment, and we demonstrate them for laser-spectroscopic experiments, in which pulse-to-pulse fluctuations of the laser power cause correlated variations of laser intensity and generated signal intensity. The method of maximum likelihood is demonstrated in the derivation of an appropriate fitting scheme for this type of experiment. Statistical data evaluation contains the following steps. First, one has to provide a measurement model that considers the statistical variation of all enclosed variables. Second, an evaluation scheme applicable to this particular model has to be derived or provided. Third, the scheme has to be characterized in terms of accuracy and precision. A criterion for accepting an evaluation scheme is that it have accuracy and precision as close as possible to the theoretical limit. The fitting scheme derived for experiments with pulsed lasers is compared to well-established schemes in terms of fitting power and rational functions. The precision is found to be as much as three times better than for simple least-squares fitting. Our scheme also suppresses the bias on the estimated model parameters that other methods may exhibit if they are applied in an uncritical fashion. We focus on experiments in nonlinear spectroscopy, but the fitting scheme derived is applicable in many scientific disciplines.

  4. Evaluating the performance of a fault detection and diagnostic system for vapor compression equipment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Breuker, M.S.; Braun, J.E.

    This paper presents a detailed evaluation of the performance of a statistical, rule-based fault detection and diagnostic (FDD) technique presented by Rossi and Braun (1997). Steady-state and transient tests were performed on a simple rooftop air conditioner over a range of conditions and fault levels. The steady-state data without faults were used to train models that predict outputs for normal operation. The transient data with faults were used to evaluate FDD performance. The effect of a number of design variables on FDD sensitivity for different faults was evaluated and two prototype systems were specified for more complete evaluation. Good performance was achieved in detecting and diagnosing five faults using only six temperatures (2 input and 4 output) and linear models. The performance improved by about a factor of two when ten measurements (three input and seven output) and higher order models were used. This approach for evaluating and optimizing the performance of the statistical, rule-based FDD technique could be used as a design and evaluation tool when applying this FDD method to other packaged air-conditioning systems. Furthermore, the approach could also be modified to evaluate the performance of other FDD methods.

  5. Illustrating the practice of statistics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hamada, Christina A; Hamada, Michael S

    2009-01-01

    The practice of statistics involves analyzing data and planning data collection schemes to answer scientific questions. Issues often arise with the data that must be dealt with and can lead to new procedures. In analyzing data, these issues can sometimes be addressed through the statistical models that are developed. Simulation can also be helpful in evaluating a new procedure. Moreover, simulation coupled with optimization can be used to plan a data collection scheme. The practice of statistics as just described is much more than just using a statistical package. In analyzing the data, it involves understanding the scientific problem and incorporating the scientist's knowledge. In modeling the data, it involves understanding how the data were collected and accounting for limitations of the data where possible. Moreover, the modeling is likely to be iterative by considering a series of models and evaluating the fit of these models. Designing a data collection scheme involves understanding the scientist's goal and staying within his/her budget in terms of time and the available resources. Consequently, a practicing statistician is faced with such tasks and requires skills and tools to do them quickly. We have written this article for students to provide a glimpse of the practice of statistics. To illustrate the practice of statistics, we consider a problem motivated by some precipitation data that our relative, Masaru Hamada, collected some years ago. We describe his rain gauge observational study in Section 2. We describe modeling and an initial analysis of the precipitation data in Section 3. In Section 4, we consider alternative analyses that address potential issues with the precipitation data. In Section 5, we consider the impact of incorporating additional information. We design a data collection scheme to illustrate the use of simulation and optimization in Section 6. We conclude this article in Section 7 with a discussion.

  6. An evaluation of the variable-resolution CESM for modeling California's climate: Evaluation of VR-CESM for Modeling California's Climate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, Xingying; Rhoades, Alan M.; Ullrich, Paul A.

    In this paper, the recently developed variable-resolution option within the Community Earth System Model (VR-CESM) is assessed for long-term regional climate modeling of California at 0.25° (~ 28 km) and 0.125° (~ 14 km) horizontal resolutions. The mean climatology of near-surface temperature and precipitation is analyzed and contrasted with reanalysis, gridded observational data sets, and a traditional regional climate model (RCM)—the Weather Research and Forecasting (WRF) model. Statistical metrics for model evaluation and tests for differential significance have been extensively applied. VR-CESM tended to produce a warmer summer (by about 1–3°C) and overestimated overall winter precipitation (about 25%–35%) compared to reference data sets when sea surface temperatures were prescribed. Increasing resolution from 0.25° to 0.125° did not produce a statistically significant improvement in the model results. By comparison, the analogous WRF climatology (constrained laterally and at the sea surface by ERA-Interim reanalysis) was ~1–3°C colder than the reference data sets, underestimated precipitation by ~20%–30% at 27 km resolution, and overestimated precipitation by ~ 65–85% at 9 km. Overall, VR-CESM produced comparable statistical biases to WRF in key climatological quantities. Moreover, this assessment highlights the value of variable-resolution global climate models (VRGCMs) in capturing fine-scale atmospheric processes, projecting future regional climate, and addressing the computational expense of uniform-resolution global climate models.

  7. An evaluation of the variable-resolution CESM for modeling California's climate: Evaluation of VR-CESM for Modeling California's Climate

    DOE PAGES

    Huang, Xingying; Rhoades, Alan M.; Ullrich, Paul A.; ...

    2016-03-01

    In this paper, the recently developed variable-resolution option within the Community Earth System Model (VR-CESM) is assessed for long-term regional climate modeling of California at 0.25° (~ 28 km) and 0.125° (~ 14 km) horizontal resolutions. The mean climatology of near-surface temperature and precipitation is analyzed and contrasted with reanalysis, gridded observational data sets, and a traditional regional climate model (RCM)—the Weather Research and Forecasting (WRF) model. Statistical metrics for model evaluation and tests for differential significance have been extensively applied. VR-CESM tended to produce a warmer summer (by about 1–3°C) and overestimated overall winter precipitation (about 25%–35%) compared to reference data sets when sea surface temperatures were prescribed. Increasing resolution from 0.25° to 0.125° did not produce a statistically significant improvement in the model results. By comparison, the analogous WRF climatology (constrained laterally and at the sea surface by ERA-Interim reanalysis) was ~1–3°C colder than the reference data sets, underestimated precipitation by ~20%–30% at 27 km resolution, and overestimated precipitation by ~ 65–85% at 9 km. Overall, VR-CESM produced comparable statistical biases to WRF in key climatological quantities. Moreover, this assessment highlights the value of variable-resolution global climate models (VRGCMs) in capturing fine-scale atmospheric processes, projecting future regional climate, and addressing the computational expense of uniform-resolution global climate models.

  8. An efficient soil water balance model based on hybrid numerical and statistical methods

    NASA Astrophysics Data System (ADS)

    Mao, Wei; Yang, Jinzhong; Zhu, Yan; Ye, Ming; Liu, Zhao; Wu, Jingwei

    2018-04-01

    Most soil water balance models only consider downward soil water movement driven by gravitational potential, and thus cannot simulate upward soil water movement driven by evapotranspiration especially in agricultural areas. In addition, the models cannot be used for simulating soil water movement in heterogeneous soils, and usually require many empirical parameters. To resolve these problems, this study derives a new one-dimensional water balance model for simulating both downward and upward soil water movement in heterogeneous unsaturated zones. The new model is based on a hybrid of numerical and statistical methods, and only requires four physical parameters. The model uses three governing equations to consider three terms that impact soil water movement, including the advective term driven by gravitational potential, the source/sink term driven by external forces (e.g., evapotranspiration), and the diffusive term driven by matric potential. The three governing equations are solved separately by using the hybrid numerical and statistical methods (e.g., linear regression method) that consider soil heterogeneity. The four soil hydraulic parameters required by the new models are as follows: saturated hydraulic conductivity, saturated water content, field capacity, and residual water content. The strength and weakness of the new model are evaluated by using two published studies, three hypothetical examples and a real-world application. The evaluation is performed by comparing the simulation results of the new model with corresponding results presented in the published studies, obtained using HYDRUS-1D and observation data. The evaluation indicates that the new model is accurate and efficient for simulating upward soil water flow in heterogeneous soils with complex boundary conditions. The new model is used for evaluating different drainage functions, and the square drainage function and the power drainage function are recommended. Computational efficiency of the new model makes it particularly suitable for large-scale simulation of soil water movement, because the new model can be used with coarse discretization in space and time.

  9. Statistical analysis of modeling error in structural dynamic systems

    NASA Technical Reports Server (NTRS)

    Hasselman, T. K.; Chrostowski, J. D.

    1990-01-01

    The paper presents a generic statistical model of the (total) modeling error for conventional space structures in their launch configuration. Modeling error is defined as the difference between analytical prediction and experimental measurement. It is represented by the differences between predicted and measured real eigenvalues and eigenvectors. Comparisons are made between pre-test and post-test models. Total modeling error is then subdivided into measurement error, experimental error and 'pure' modeling error, and comparisons made between measurement error and total modeling error. The generic statistical model presented in this paper is based on the first four global (primary structure) modes of four different structures belonging to the generic category of Conventional Space Structures (specifically excluding large truss-type space structures). As such, it may be used to evaluate the uncertainty of predicted mode shapes and frequencies, sinusoidal response, or the transient response of other structures belonging to the same generic category.

  10. Comparisons between physics-based, engineering, and statistical learning models for outdoor sound propagation.

    PubMed

    Hart, Carl R; Reznicek, Nathan J; Wilson, D Keith; Pettit, Chris L; Nykaza, Edward T

    2016-05-01

    Many outdoor sound propagation models exist, ranging from highly complex physics-based simulations to simplified engineering calculations, and more recently, highly flexible statistical learning methods. Several engineering and statistical learning models are evaluated by using a particular physics-based model, namely, a Crank-Nicolson parabolic equation (CNPE), as a benchmark. Narrowband transmission loss values predicted with the CNPE, based upon a simulated data set of meteorological, boundary, and source conditions, act as simulated observations. In the simulated data set sound propagation conditions span from downward refracting to upward refracting, for acoustically hard and soft boundaries, and low frequencies. Engineering models used in the comparisons include the ISO 9613-2 method, Harmonoise, and Nord2000 propagation models. Statistical learning methods used in the comparisons include bagged decision tree regression, random forest regression, boosting regression, and artificial neural network models. Computed skill scores are relative to sound propagation in a homogeneous atmosphere over a rigid ground. Overall skill scores for the engineering noise models are 0.6%, -7.1%, and 83.8% for the ISO 9613-2, Harmonoise, and Nord2000 models, respectively. Overall skill scores for the statistical learning models are 99.5%, 99.5%, 99.6%, and 99.6% for bagged decision tree, random forest, boosting, and artificial neural network regression models, respectively.
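
    The skill scores above are computed relative to a reference prediction (homogeneous atmosphere over rigid ground). The abstract does not spell out the exact formula, so the sketch below assumes the common mean-squared-error form, skill = 1 - MSE(model)/MSE(reference), with made-up transmission-loss values.

    ```python
    # Minimal sketch of an MSE-based skill score relative to a reference model:
    # skill = 1 - MSE(candidate) / MSE(reference). Transmission-loss values (dB) are synthetic.
    import numpy as np

    benchmark = np.array([62.0, 75.5, 81.2, 68.4, 90.1])   # "observations" from the physics-based benchmark
    candidate = np.array([63.1, 74.0, 83.0, 67.5, 88.9])   # e.g., a statistical learning model
    reference = np.array([60.0, 70.0, 78.0, 66.0, 84.0])   # homogeneous-atmosphere reference prediction

    mse = lambda a, b: np.mean((a - b) ** 2)
    skill = 1.0 - mse(candidate, benchmark) / mse(reference, benchmark)
    print(f"skill score = {skill:.1%}")                    # 100% means perfect agreement with the benchmark
    ```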

  11. An empirical comparison of statistical tests for assessing the proportional hazards assumption of Cox's model.

    PubMed

    Ng'andu, N H

    1997-03-30

    In the analysis of survival data using the Cox proportional hazard (PH) model, it is important to verify that the explanatory variables analysed satisfy the proportional hazard assumption of the model. This paper presents results of a simulation study that compares five test statistics to check the proportional hazard assumption of Cox's model. The test statistics were evaluated under proportional hazards and the following types of departures from the proportional hazard assumption: increasing relative hazards; decreasing relative hazards; crossing hazards; diverging hazards, and non-monotonic hazards. The test statistics compared include those based on partitioning of failure time and those that do not require partitioning of failure time. The simulation results demonstrate that the time-dependent covariate test, the weighted residuals score test and the linear correlation test have equally good power for detection of non-proportionality in the varieties of non-proportional hazards studied. Using illustrative data from the literature, these test statistics performed similarly.
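
    As a present-day practical illustration of checking the proportional hazards assumption (not one of the five statistics compared in the study above), the sketch below uses the lifelines package's Schoenfeld-residual-based test on its bundled Rossi recidivism data set.

    ```python
    # Illustrative proportional-hazards check with lifelines (a modern tool, shown only as a
    # practical counterpart to the tests compared in the paper above).
    from lifelines import CoxPHFitter
    from lifelines.datasets import load_rossi
    from lifelines.statistics import proportional_hazard_test

    rossi = load_rossi()                                   # bundled recidivism data set
    cph = CoxPHFitter().fit(rossi, duration_col="week", event_col="arrest")

    results = proportional_hazard_test(cph, rossi, time_transform="rank")
    results.print_summary()                                # small p-values flag covariates violating PH
    ```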

  12. Evaluating Structural Equation Models for Categorical Outcomes: A New Test Statistic and a Practical Challenge of Interpretation.

    PubMed

    Monroe, Scott; Cai, Li

    2015-01-01

    This research is concerned with two topics in assessing model fit for categorical data analysis. The first topic involves the application of a limited-information overall test, introduced in the item response theory literature, to structural equation modeling (SEM) of categorical outcome variables. Most popular SEM test statistics assess how well the model reproduces estimated polychoric correlations. In contrast, limited-information test statistics assess how well the underlying categorical data are reproduced. Here, the recently introduced C2 statistic of Cai and Monroe (2014) is applied. The second topic concerns how the root mean square error of approximation (RMSEA) fit index can be affected by the number of categories in the outcome variable. This relationship creates challenges for interpreting RMSEA. While the two topics initially appear unrelated, they may conveniently be studied in tandem since RMSEA is based on an overall test statistic, such as C2. The results are illustrated with an empirical application to data from a large-scale educational survey.
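
    For reference, RMSEA is commonly written in terms of an overall test statistic T (such as C2), its degrees of freedom df, and the sample size N; the textbook form below is shown for orientation and may differ in detail from the variant used by the authors.

    ```latex
    % Common textbook form of the RMSEA fit index based on an overall test statistic T
    % with df degrees of freedom and sample size N.
    \[
      \mathrm{RMSEA} \;=\; \sqrt{\max\!\left(\frac{T - df}{df\,(N-1)},\; 0\right)}
    \]
    ```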

  13. Tropical geometry of statistical models.

    PubMed

    Pachter, Lior; Sturmfels, Bernd

    2004-11-16

    This article presents a unified mathematical framework for inference in graphical models, building on the observation that graphical models are algebraic varieties. From this geometric viewpoint, observations generated from a model are coordinates of a point in the variety, and the sum-product algorithm is an efficient tool for evaluating specific coordinates. Here, we address the question of how the solutions to various inference problems depend on the model parameters. The proposed answer is expressed in terms of tropical algebraic geometry. The Newton polytope of a statistical model plays a key role. Our results are applied to the hidden Markov model and the general Markov model on a binary tree.

  14. Statistical metrology—measurement and modeling of variation for advanced process development and design rule generation

    NASA Astrophysics Data System (ADS)

    Boning, Duane S.; Chung, James E.

    1998-11-01

    Advanced process technology will require more detailed understanding and tighter control of variation in devices and interconnects. The purpose of statistical metrology is to provide methods to measure and characterize variation, to model systematic and random components of that variation, and to understand the impact of variation on both yield and performance of advanced circuits. Of particular concern are spatial or pattern-dependencies within individual chips; such systematic variation within the chip can have a much larger impact on performance than wafer-level random variation. Statistical metrology methods will play an important role in the creation of design rules for advanced technologies. For example, a key issue in multilayer interconnect is the uniformity of interlevel dielectric (ILD) thickness within the chip. For the case of ILD thickness, we describe phases of statistical metrology development and application to understanding and modeling thickness variation arising from chemical-mechanical polishing (CMP). These phases include screening experiments including design of test structures and test masks to gather electrical or optical data, techniques for statistical decomposition and analysis of the data, and approaches to calibrating empirical and physical variation models. These models can be integrated with circuit CAD tools to evaluate different process integration or design rule strategies. One focus for the generation of interconnect design rules is guidelines for the use of "dummy fill" or "metal fill" to improve the uniformity of underlying metal density and thus improve the uniformity of oxide thickness within the die. Trade-offs that can be evaluated via statistical metrology include the improvements to uniformity possible versus the effect of increased capacitance due to additional metal.

  15. Hydrologic consistency as a basis for assessing complexity of monthly water balance models for the continental United States

    NASA Astrophysics Data System (ADS)

    Martinez, Guillermo F.; Gupta, Hoshin V.

    2011-12-01

    Methods to select parsimonious and hydrologically consistent model structures are useful for evaluating dominance of hydrologic processes and representativeness of data. While information criteria (appropriately constrained to obey underlying statistical assumptions) can provide a basis for evaluating appropriate model complexity, it is not sufficient to rely upon the principle of maximum likelihood (ML) alone. We suggest that one must also call upon a "principle of hydrologic consistency," meaning that selected ML structures and parameter estimates must be constrained (as well as possible) to reproduce desired hydrological characteristics of the processes under investigation. This argument is demonstrated in the context of evaluating the suitability of candidate model structures for lumped water balance modeling across the continental United States, using data from 307 snow-free catchments. The models are constrained to satisfy several tests of hydrologic consistency, a flow space transformation is used to ensure better consistency with underlying statistical assumptions, and information criteria are used to evaluate model complexity relative to the data. The results clearly demonstrate that the principle of consistency provides a sensible basis for guiding selection of model structures and indicate strong spatial persistence of certain model structures across the continental United States. Further work to untangle reasons for model structure predominance can help to relate conceptual model structures to physical characteristics of the catchments, facilitating the task of prediction in ungaged basins.

  16. Comparison of AERMOD and CALPUFF models for simulating SO2 concentrations in a gas refinery.

    PubMed

    Atabi, Farideh; Jafarigol, Farzaneh; Moattar, Faramarz; Nouri, Jafar

    2016-09-01

    In this study, SO2 concentrations from a gas refinery located in complex terrain were calculated with the steady-state AERMOD model and the non-steady-state CALPUFF model. First, SO2 concentrations emitted from 16 refinery stacks were obtained by field measurements at nine receptors in four seasons, and the performance of both models was evaluated. The simulated ambient SO2 concentrations from each model were then compared with the observed concentrations, and the model results were compared with each other. The evaluation of the two models in simulating SO2 concentrations was based on statistical analysis and Q-Q plots. Review of the statistical parameters and Q-Q plots showed that the performance of both models in simulating SO2 concentrations in the region can be considered acceptable. The composite ratio of simulated to observed values across the receptors and all four averaging times was 0.72 for AERMOD and 0.89 for CALPUFF. However, under the complex topographic conditions, CALPUFF showed better agreement with the observed concentrations.

  17. EVALUATION OF SEVERAL PM 2.5 FORECAST MODELS USING DATA COLLECTED DURING THE ICARTT/NEAQS 2004 FIELD STUDY

    EPA Science Inventory

    Real-time forecasts of PM2.5 aerosol mass from seven air-quality forecast models (AQFMs) are statistically evaluated against observations collected in the northeastern U.S. and southeastern Canada from two surface networks and aircraft data during the summer of 2004 IC...

  18. Peer Review Documents Related to the Evaluation of ...

    EPA Pesticide Factsheets

    BMDS is one of the Agency's premier risk assessment tools, so the validity and reliability of its statistical models are of paramount importance. This page provides links to peer reviews and expert summaries of the BMDS application and its models as they were developed and eventually released, documenting the rigorous review process taken to provide the best science tools available for statistical modeling.

  19. scoringRules - A software package for probabilistic model evaluation

    NASA Astrophysics Data System (ADS)

    Lerch, Sebastian; Jordan, Alexander; Krüger, Fabian

    2016-04-01

    Models in the geosciences are generally surrounded by uncertainty, and being able to quantify this uncertainty is key to good decision making. Accordingly, probabilistic forecasts in the form of predictive distributions have become popular over the last decades. With the proliferation of probabilistic models arises the need for decision-theoretically principled tools to evaluate the appropriateness of models and forecasts in a generalized way. Various scoring rules have been developed over the past decades to address this demand. Proper scoring rules are functions S(F, y) that evaluate the accuracy of a forecast distribution F, given that an outcome y was observed. As such, they allow alternative models to be compared, a crucial ability given the variety of theories, data sources and statistical specifications that is available in many situations. This poster presents the software package scoringRules for the statistical programming language R, which contains functions to compute popular scoring rules such as the continuous ranked probability score for a variety of distributions F that come up in applied work. Two main classes are parametric distributions like normal, t, or gamma distributions, and distributions that are not known analytically, but are indirectly described through a sample of simulation draws. For example, Bayesian forecasts produced via Markov Chain Monte Carlo take this form. Thereby, the scoringRules package provides a framework for generalized model evaluation that includes both Bayesian and classical parametric models. The scoringRules package aims to be a convenient dictionary-like reference for computing scoring rules. We offer state-of-the-art implementations of several known (but not routinely applied) formulas, and implement closed-form expressions that were previously unavailable. Whenever more than one implementation variant exists, we offer statistically principled default choices.
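
    scoringRules itself is an R package; as a language-neutral illustration of what one of its scoring rules computes, the sketch below evaluates the closed-form continuous ranked probability score (CRPS) of a Gaussian forecast in Python. The forecast parameters are made up; only the standard Gaussian closed form is used.

    ```python
    # Minimal sketch: closed-form CRPS of a Gaussian forecast N(mu, sigma) for an observation y.
    # (scoringRules is an R package; this is only a Python illustration of the standard formula.)
    import numpy as np
    from scipy.stats import norm

    def crps_gaussian(y, mu, sigma):
        z = (y - mu) / sigma
        return sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))

    print(crps_gaussian(y=1.3, mu=1.0, sigma=0.5))   # lower CRPS indicates a better probabilistic forecast
    ```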

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Fangyan; Zhang, Song; Chung Wong, Pak

    Effectively visualizing large graphs and capturing the statistical properties are two challenging tasks. To aid in these two tasks, many sampling approaches for graph simplification have been proposed, falling into three categories: node sampling, edge sampling, and traversal-based sampling. It is still unknown which approach is the best. We evaluate commonly used graph sampling methods through a combined visual and statistical comparison of graphs sampled at various rates. We conduct our evaluation on three graph models: random graphs, small-world graphs, and scale-free graphs. Initial results indicate that the effectiveness of a sampling method is dependent on the graph model, the size of the graph, and the desired statistical property. This benchmark study can be used as a guideline in choosing the appropriate method for a particular graph sampling task, and the results presented can be incorporated into graph visualization and analysis tools.

  1. Multiple Versus Single Set Validation of Multivariate Models to Avoid Mistakes.

    PubMed

    Harrington, Peter de Boves

    2018-01-02

    Validation of multivariate models is of current importance for a wide range of chemical applications. Although important, it is often neglected. The common practice is to use a single external validation set for evaluation. This approach is deficient and may mislead investigators with results that are specific to the single validation set of data. In addition, no statistics are available regarding the precision of a derived figure of merit (FOM). A statistical approach using bootstrapped Latin partitions is advocated. This validation method makes efficient use of the data because each object is used once for validation. It was reviewed a decade earlier, but primarily for the optimization of chemometric models; this review presents the reasons it should be used for generalized statistical validation. Average FOMs with confidence intervals are reported, and powerful matched-sample statistics may be applied for comparing models and methods. Examples demonstrate the problems with single validation sets.
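
    The sketch below illustrates the general idea of reporting an average figure of merit with a confidence interval from repeated partitioning. It uses plain repeated stratified cross-validation with stand-in data and model, not Harrington's exact bootstrapped Latin partition algorithm.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Repeated stratified partitioning: every object is used once for validation in
# each repetition, yielding an average figure of merit (FOM) with a CI.
X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

scores = []
for seed in range(20):  # 20 repetitions with different random partitions
    cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=seed)
    scores.extend(cross_val_score(model, X, y, cv=cv, scoring="accuracy"))

scores = np.array(scores)
half_width = 1.96 * scores.std(ddof=1) / np.sqrt(len(scores))
print(f"accuracy = {scores.mean():.3f} +/- {half_width:.3f} (approx. 95% CI)")
```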

  2. Introducing Multisensor Satellite Radiance-Based Evaluation for Regional Earth System Modeling

    NASA Technical Reports Server (NTRS)

    Matsui, T.; Santanello, J.; Shi, J. J.; Tao, W.-K.; Wu, D.; Peters-Lidard, C.; Kemp, E.; Chin, M.; Starr, D.; Sekiguchi, M.; hide

    2014-01-01

    Earth System modeling has become more complex, and its evaluation using satellite data has also become more difficult due to model and data diversity. Therefore, the fundamental methodology of using satellite direct measurements with instrumental simulators should be addressed especially for modeling community members lacking a solid background of radiative transfer and scattering theory. This manuscript introduces principles of multisatellite, multisensor radiance-based evaluation methods for a fully coupled regional Earth System model: NASA-Unified Weather Research and Forecasting (NU-WRF) model. We use a NU-WRF case study simulation over West Africa as an example of evaluating aerosol-cloud-precipitation-land processes with various satellite observations. NU-WRF-simulated geophysical parameters are converted to the satellite-observable raw radiance and backscatter under nearly consistent physics assumptions via the multisensor satellite simulator, the Goddard Satellite Data Simulator Unit. We present varied examples of simple yet robust methods that characterize forecast errors and model physics biases through the spatial and statistical interpretation of various satellite raw signals: infrared brightness temperature (Tb) for surface skin temperature and cloud top temperature, microwave Tb for precipitation ice and surface flooding, and radar and lidar backscatter for aerosol-cloud profiling simultaneously. Because raw satellite signals integrate many sources of geophysical information, we demonstrate user-defined thresholds and a simple statistical process to facilitate evaluations, including the infrared-microwave-based cloud types and lidar/radar-based profile classifications.

  3. Effect of Internet-Based Cognitive Apprenticeship Model (i-CAM) on Statistics Learning among Postgraduate Students.

    PubMed

    Saadati, Farzaneh; Ahmad Tarmizi, Rohani; Mohd Ayub, Ahmad Fauzi; Abu Bakar, Kamariah

    2015-01-01

    Students' ability to use statistics, which is mathematical in nature, is one of the concerns of educators; embedding the pedagogical characteristics of learning within an e-learning system adds value because it complements the conventional method of learning mathematics. Many researchers emphasize the effectiveness of cognitive apprenticeship in learning and problem solving in the workplace. In a cognitive apprenticeship learning model, skills are learned within a community of practitioners through observation of modelling and then practice plus coaching. This study utilized an internet-based Cognitive Apprenticeship Model (i-CAM) in three phases and evaluated its effectiveness for improving statistics problem-solving performance among postgraduate students. The results showed that, when compared to the conventional mathematics learning model, i-CAM could significantly promote students' problem-solving performance at the end of each phase. In addition, the differences in students' test scores remained statistically significant after controlling for pre-test scores. The findings conveyed in this paper confirm the considerable value of i-CAM in improving statistics learning for non-specialized postgraduate students.

  4. Mourning dove hunting regulation strategy based on annual harvest statistics and banding data

    USGS Publications Warehouse

    Otis, D.L.

    2006-01-01

    Although managers should strive to base game bird harvest management strategies on mechanistic population models, the monitoring programs required to build and continuously update these models may not be in place. Alternatively, if estimates of total harvest and harvest rates are available, then population estimates derived from these harvest data can serve as the basis for making hunting regulation decisions based on population growth rates derived from these estimates. I present a statistically rigorous approach for regulation decision-making using a hypothesis-testing framework and an assumed set of 3 hunting regulation alternatives. I illustrate and evaluate the technique with historical data on the mid-continent mallard (Anas platyrhynchos) population. I evaluate the statistical properties of the hypothesis-testing framework using the best available data on mourning doves (Zenaida macroura). I use these results to discuss practical implementation of the technique as an interim harvest strategy for mourning doves until reliable mechanistic population models and associated monitoring programs are developed.

  5. Predicting Outcomes After Chemo-Embolization in Patients with Advanced-Stage Hepatocellular Carcinoma: An Evaluation of Different Radiologic Response Criteria

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gunn, Andrew J., E-mail: agunn@uabmc.edu; Sheth, Rahul A.; Luber, Brandon

    2017-01-15

    Purpose: The purpose of this study was to evaluate the ability of various radiologic response criteria to predict patient outcomes after trans-arterial chemo-embolization with drug-eluting beads (DEB-TACE) in patients with advanced-stage (BCLC C) hepatocellular carcinoma (HCC). Materials and methods: Hospital records from 2005 to 2011 were retrospectively reviewed. Non-infiltrative lesions were measured at baseline and on follow-up scans after DEB-TACE according to various common radiologic response criteria, including guidelines of the World Health Organization (WHO), Response Evaluation Criteria in Solid Tumors (RECIST), the European Association for the Study of the Liver (EASL), and modified RECIST (mRECIST). Statistical analysis was performed to see which, if any, of the response criteria could be used as a predictor of overall survival (OS) or time-to-progression (TTP). Results: 75 patients met inclusion criteria. Median OS and TTP were 22.6 months (95 % CI 11.6–24.8) and 9.8 months (95 % CI 7.1–21.6), respectively. Univariate and multivariate Cox analyses revealed that none of the evaluated criteria could be used as a predictor of OS or TTP. Analysis of the C index in both univariate and multivariate models showed that the evaluated criteria were not accurate predictors of either OS (C-statistic range: 0.51–0.58 in the univariate model; 0.54–0.58 in the multivariate model) or TTP (C-statistic range: 0.55–0.59 in the univariate model; 0.57–0.61 in the multivariate model). Conclusion: Current response criteria are not accurate predictors of OS or TTP in patients with advanced-stage HCC after DEB-TACE.

  6. Predicting Outcomes After Chemo-Embolization in Patients with Advanced-Stage Hepatocellular Carcinoma: An Evaluation of Different Radiologic Response Criteria.

    PubMed

    Gunn, Andrew J; Sheth, Rahul A; Luber, Brandon; Huynh, Minh-Huy; Rachamreddy, Niranjan R; Kalva, Sanjeeva P

    2017-01-01

    The purpose of this study was to evaluate the ability of various radiologic response criteria to predict patient outcomes after trans-arterial chemo-embolization with drug-eluting beads (DEB-TACE) in patients with advanced-stage (BCLC C) hepatocellular carcinoma (HCC). Hospital records from 2005 to 2011 were retrospectively reviewed. Non-infiltrative lesions were measured at baseline and on follow-up scans after DEB-TACE according to various common radiologic response criteria, including guidelines of the World Health Organization (WHO), Response Evaluation Criteria in Solid Tumors (RECIST), the European Association for the Study of the Liver (EASL), and modified RECIST (mRECIST). Statistical analysis was performed to see which, if any, of the response criteria could be used as a predictor of overall survival (OS) or time-to-progression (TTP). 75 patients met inclusion criteria. Median OS and TTP were 22.6 months (95 % CI 11.6-24.8) and 9.8 months (95 % CI 7.1-21.6), respectively. Univariate and multivariate Cox analyses revealed that none of the evaluated criteria had the ability to be used as a predictor for OS or TTP. Analysis of the C index in both univariate and multivariate models showed that the evaluated criteria were not accurate predictors of either OS (C-statistic range: 0.51-0.58 in the univariate model; range: 0.54-0.58 in the multivariate model) or TTP (C-statistic range: 0.55-0.59 in the univariate model; range: 0.57-0.61 in the multivariate model). Current response criteria are not accurate predictors of OS or TTP in patients with advanced-stage HCC after DEB-TACE.

  7. Survival modeling for the estimation of transition probabilities in model-based economic evaluations in the absence of individual patient data: a tutorial.

    PubMed

    Diaby, Vakaramoko; Adunlin, Georges; Montero, Alberto J

    2014-02-01

    Survival modeling techniques are increasingly being used as part of decision modeling for health economic evaluations. As many models are available, it is imperative for interested readers to know about the steps in selecting and using the most suitable ones. The objective of this paper is to propose a tutorial for the application of appropriate survival modeling techniques to estimate transition probabilities, for use in model-based economic evaluations, in the absence of individual patient data (IPD). An illustration of the use of the tutorial is provided based on the final progression-free survival (PFS) analysis of the BOLERO-2 trial in metastatic breast cancer (mBC). An algorithm was adopted from Guyot and colleagues, and was then run in the statistical package R to reconstruct IPD, based on the final PFS analysis of the BOLERO-2 trial. It should be emphasized that the reconstructed IPD represent an approximation of the original data. Afterwards, we fitted parametric models to the reconstructed IPD in the statistical package Stata. Both statistical and graphical tests were conducted to verify the relative and absolute validity of the findings. Finally, the equations for transition probabilities were derived using the general equation for transition probabilities used in model-based economic evaluations, and the parameters were estimated from fitted distributions. The results of the application of the tutorial suggest that the log-logistic model best fits the reconstructed data from the latest published Kaplan-Meier (KM) curves of the BOLERO-2 trial. Results from the regression analyses were confirmed graphically. An equation for transition probabilities was obtained for each arm of the BOLERO-2 trial. In this paper, a tutorial was proposed and used to estimate the transition probabilities for model-based economic evaluation, based on the results of the final PFS analysis of the BOLERO-2 trial in mBC. The results of our study can serve as a basis for any model (Markov) that needs the parameterization of transition probabilities, and only has summary KM plots available.
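
    A minimal Python sketch of the final step described above, deriving per-cycle transition probabilities from a fitted parametric survival curve via the general relation tp(t) = 1 - S(t)/S(t - u). The log-logistic form and the parameter values are illustrative assumptions, not the BOLERO-2 fit.

```python
import numpy as np

# Log-logistic survival function S(t) = 1 / (1 + (t / alpha)**beta); in practice
# alpha and beta would come from the parametric fit to the reconstructed IPD.
def loglogistic_survival(t, alpha, beta):
    return 1.0 / (1.0 + (t / alpha) ** beta)

def transition_probabilities(cycle_length, n_cycles, alpha, beta):
    """Per-cycle transition probability: tp(t_i) = 1 - S(t_i) / S(t_i - u)."""
    times = cycle_length * np.arange(1, n_cycles + 1)
    s_now = loglogistic_survival(times, alpha, beta)
    s_prev = loglogistic_survival(times - cycle_length, alpha, beta)
    return 1.0 - s_now / s_prev

# Hypothetical parameters: cycle length of 1 month, alpha = 10, beta = 1.5.
print(transition_probabilities(cycle_length=1.0, n_cycles=6, alpha=10.0, beta=1.5))
```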

  8. A Bayesian Approach to Evaluating Consistency between Climate Model Output and Observations

    NASA Astrophysics Data System (ADS)

    Braverman, A. J.; Cressie, N.; Teixeira, J.

    2010-12-01

    Like other scientific and engineering problems that involve physical modeling of complex systems, climate models can be evaluated and diagnosed by comparing their output to observations of similar quantities. Though the global remote sensing data record is relatively short by climate research standards, these data offer opportunities to evaluate model predictions in new ways. For example, remote sensing data are spatially and temporally dense enough to provide distributional information that goes beyond simple moments to allow quantification of temporal and spatial dependence structures. In this talk, we propose a new method for exploiting these rich data sets using a Bayesian paradigm. For a collection of climate models, we calculate posterior probabilities that its members best represent the physical system each seeks to reproduce. The posterior probability is based on the likelihood that a chosen summary statistic, computed from observations, would be obtained when the model's output is considered as a realization from a stochastic process. By exploring how posterior probabilities change with different statistics, we may paint a more quantitative and complete picture of the strengths and weaknesses of the models relative to the observations. We demonstrate our method using model output from the CMIP archive and observations from NASA's Atmospheric Infrared Sounder.

  9. Bayesian Nonparametric Prediction and Statistical Inference

    DTIC Science & Technology

    1989-09-07

    Kadane, J. (1980), "Bayesian decision theory and the simplification of models," in Evaluation of Econometric Models, J. Kmenta and J. Ramsey, eds. ...the random model and weighted least squares regression," in Evaluation of Econometric Models, ed. by J. Kmenta and J. Ramsey, Academic Press, 197-217 ...likelihood function. On the other hand, H. Jeffreys's theory of hypothesis testing covers the most important situations in which the prior is not diffuse. See

  10. Plurality of Type A evaluations of uncertainty

    NASA Astrophysics Data System (ADS)

    Possolo, Antonio; Pintar, Adam L.

    2017-10-01

    The evaluations of measurement uncertainty involving the application of statistical methods to measurement data (Type A evaluations as specified in the Guide to the Expression of Uncertainty in Measurement, GUM) comprise the following three main steps: (i) developing a statistical model that captures the pattern of dispersion or variability in the experimental data, and that relates the data either to the measurand directly or to some intermediate quantity (input quantity) that the measurand depends on; (ii) selecting a procedure for data reduction that is consistent with this model and that is fit for the purpose that the results are intended to serve; (iii) producing estimates of the model parameters, or predictions based on the fitted model, and evaluations of uncertainty that qualify either those estimates or these predictions, and that are suitable for use in subsequent uncertainty propagation exercises. We illustrate these steps in uncertainty evaluations related to the measurement of the mass fraction of vanadium in a bituminous coal reference material, including the assessment of the homogeneity of the material, and to the calibration and measurement of the amount-of-substance fraction of a hydrochlorofluorocarbon in air, and of the age of a meteorite. Our goal is to expose the plurality of choices that can reasonably be made when taking each of the three steps outlined above, and to show that different choices typically lead to different estimates of the quantities of interest, and to different evaluations of the associated uncertainty. In all the examples, the several alternatives considered represent choices that comparably competent statisticians might make, depending on the assumptions they are prepared to rely on and on their chosen approach to statistical inference. They also represent alternative treatments that the same statistician might give to the same data when the results are intended for different purposes.
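
    As a minimal sketch of one of the simplest model choices discussed above (independent repeated observations of a single quantity), the standard GUM Type A evaluation takes the mean as the estimate and s/sqrt(n) as its standard uncertainty. The readings below are hypothetical.

```python
import numpy as np

# Type A evaluation for repeated independent observations of the same quantity.
x = np.array([10.12, 10.15, 10.11, 10.14, 10.13, 10.16])  # hypothetical readings
n = x.size
estimate = x.mean()
u_type_a = x.std(ddof=1) / np.sqrt(n)  # standard uncertainty of the mean
print(f"estimate = {estimate:.4f}, standard uncertainty = {u_type_a:.4f} "
      f"({n - 1} degrees of freedom)")
```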

  11. Evaluation of the 29-km Eta Model. Part 1; Objective Verification at Three Selected Stations

    NASA Technical Reports Server (NTRS)

    Nutter, Paul A.; Manobianco, John; Merceret, Francis J. (Technical Monitor)

    1998-01-01

    This paper describes an objective verification of the National Centers for Environmental Prediction (NCEP) 29-km eta model from May 1996 through January 1998. The evaluation was designed to assess the model's surface and upper-air point forecast accuracy at three selected locations during separate warm (May - August) and cool (October - January) season periods. In order to enhance sample sizes available for statistical calculations, the objective verification includes two consecutive warm and cool season periods. Systematic model deficiencies comprise the larger portion of the total error in most of the surface forecast variables that were evaluated. The error characteristics for both surface and upper-air forecasts vary widely by parameter, season, and station location. At upper levels, a few characteristic biases are identified. Overall however, the upper-level errors are more nonsystematic in nature and could be explained partly by observational measurement uncertainty. With a few exceptions, the upper-air results also indicate that 24-h model error growth is not statistically significant. In February and August 1997, NCEP implemented upgrades to the eta model's physical parameterizations that were designed to change some of the model's error characteristics near the surface. The results shown in this paper indicate that these upgrades led to identifiable and statistically significant changes in forecast accuracy for selected surface parameters. While some of the changes were expected, others were not consistent with the intent of the model updates and further emphasize the need for ongoing sensitivity studies and localized statistical verification efforts. Objective verification of point forecasts is a stringent measure of model performance, but when used alone, is not enough to quantify the overall value that model guidance may add to the forecast process. Therefore, results from a subjective verification of the meso-eta model over the Florida peninsula are discussed in the companion paper by Manobianco and Nutter. Overall verification results presented here and in part two should establish a reasonable benchmark from which model users and developers may pursue the ongoing eta model verification strategies in the future.

  12. Enhancing predictive accuracy and reproducibility in clinical evaluation research: Commentary on the special section of the Journal of Evaluation in Clinical Practice.

    PubMed

    Bryant, Fred B

    2016-12-01

    This paper introduces a special section of the current issue of the Journal of Evaluation in Clinical Practice that includes a set of 6 empirical articles showcasing a versatile, new machine-learning statistical method, known as optimal data (or discriminant) analysis (ODA), specifically designed to produce statistical models that maximize predictive accuracy. As this set of papers clearly illustrates, ODA offers numerous important advantages over traditional statistical methods-advantages that enhance the validity and reproducibility of statistical conclusions in empirical research. This issue of the journal also includes a review of a recently published book that provides a comprehensive introduction to the logic, theory, and application of ODA in empirical research. It is argued that researchers have much to gain by using ODA to analyze their data. © 2016 John Wiley & Sons, Ltd.

  13. Bridging the gap between habitat-modeling research and bird conservation with dynamic landscape and population models

    Treesearch

    Frank R., III Thompson

    2009-01-01

    Habitat models are widely used in bird conservation planning to assess current habitat or populations and to evaluate management alternatives. These models include species-habitat matrix or database models, habitat suitability models, and statistical models that predict abundance. While extremely useful, these approaches have some limitations.

  14. Application of Hierarchy Theory to Cross-Scale Hydrologic Modeling of Nutrient Loads

    EPA Science Inventory

    We describe a model called Regional Hydrologic Modeling for Environmental Evaluation (RHyME2) for quantifying annual nutrient loads in stream networks and watersheds. RHyME2 is a cross-scale statistical and process-based water-quality model. The model ...

  15. A full year evaluation of the CALIOPE-EU air quality modeling system over Europe for 2004

    NASA Astrophysics Data System (ADS)

    Pay, M. T.; Piot, M.; Jorba, O.; Gassó, S.; Gonçalves, M.; Basart, S.; Dabdub, D.; Jiménez-Guerrero, P.; Baldasano, J. M.

    The CALIOPE-EU high-resolution air quality modeling system, namely WRF-ARW/HERMES-EMEP/CMAQ/BSC-DREAM8b, is developed and applied to Europe (12 km × 12 km, 1 h). The model's performance is tested, on a yearly basis, in terms of air quality levels and the reproducibility of their dynamics. The present work describes a quantitative evaluation of gas-phase species (O3, NO2 and SO2) and particulate matter (PM2.5 and PM10) against ground-based measurements from the EMEP (European Monitoring and Evaluation Programme) network for the year 2004. The evaluation is based on standard statistical metrics. Simulated O3 achieves satisfactory performance for both daily mean and daily maximum concentrations, especially in summer, with annual mean correlations of 0.66 and 0.69, respectively. Mean normalized errors fall within the recommendations proposed by the United States Environmental Protection Agency (US-EPA). The general trends and daily variations of primary pollutants (NO2 and SO2) are satisfactory. Daily mean concentrations of NO2 correlate well with observations (annual correlation r = 0.67) but tend to be underestimated. For SO2, mean concentrations are well simulated (mean bias = 0.5 μg m-3) with a relatively high annual mean correlation (r = 0.60), although peaks are generally overestimated. The dynamics of PM2.5 and PM10 are well reproduced (0.49 < r < 0.62), but mean concentrations remain systematically underestimated. Deficiencies in particulate matter source characterization are discussed. The spatially distributed statistics and the general patterns for each pollutant over Europe are also examined. The model's performance is compared with other European studies. While the O3 statistics generally remain lower than those obtained by the other considered studies, the statistics for NO2, SO2, PM2.5 and PM10 present higher scores than most models.
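
    For illustration, a short Python sketch of the kind of paired model/observation statistics cited above (correlation, mean bias, mean normalized error). The normalized-error convention shown is one common choice and is not necessarily the exact formula used in the paper; all values are placeholders.

```python
import numpy as np

def evaluation_stats(obs, sim):
    """Correlation, mean bias, and mean normalized error for paired series."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]           # correlation coefficient
    mb = np.mean(sim - obs)                   # mean bias
    mne = np.mean(np.abs(sim - obs) / obs)    # mean normalized error (obs > 0)
    return r, mb, mne

obs = [40.0, 55.0, 60.0, 35.0, 80.0]   # placeholder daily O3 concentrations
sim = [38.0, 60.0, 52.0, 40.0, 70.0]   # placeholder modeled values
print(evaluation_stats(obs, sim))
```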

  16. External Validation of a Case-Mix Adjustment Model for the Standardized Reporting of 30-Day Stroke Mortality Rates in China

    PubMed Central

    Yu, Ping; Pan, Yuesong; Wang, Yongjun; Wang, Xianwei; Liu, Liping; Ji, Ruijun; Meng, Xia; Jing, Jing; Tong, Xu; Guo, Li; Wang, Yilong

    2016-01-01

    Background and Purpose: A case-mix adjustment model has been developed and externally validated, demonstrating promise. However, the model has not been thoroughly tested among populations in China. In our study, we evaluated the performance of the model in Chinese patients with acute stroke. Methods: Case-mix adjustment model A includes items on age, presence of atrial fibrillation on admission, National Institutes of Health Stroke Scale (NIHSS) score on admission, and stroke type. Model B is similar to model A but includes only the consciousness component of the NIHSS score. Both models A and B were evaluated for predicting 30-day mortality rates in 13,948 patients with acute stroke from the China National Stroke Registry. The discrimination of the models was quantified by the c-statistic. Calibration was assessed using Pearson's correlation coefficient. Results: The c-statistic of model A in our external validation cohort was 0.80 (95% confidence interval, 0.79–0.82), and the c-statistic of model B was 0.82 (95% confidence interval, 0.81–0.84). Excellent calibration was reported for the two models with Pearson's correlation coefficients (0.892 for model A, p<0.001; 0.927 for model B, p = 0.008). Conclusions: The case-mix adjustment model could be used to effectively predict 30-day mortality rates in Chinese patients with acute stroke. PMID:27846282
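
    A hedged Python sketch of the two evaluation measures used above: the c-statistic (area under the ROC curve of the predicted risks) and a group-level calibration correlation. The data are simulated placeholders, not the China National Stroke Registry, and the grouping scheme is an assumption.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Simulated predicted risks and outcomes consistent with those risks.
rng = np.random.default_rng(1)
risk = rng.uniform(0.01, 0.60, size=2000)   # hypothetical predicted 30-day mortality
y = rng.binomial(1, risk)                   # hypothetical observed outcomes

# Discrimination: c-statistic = area under the ROC curve.
print("c-statistic:", roc_auc_score(y, risk))

# Calibration: Pearson correlation between mean predicted and observed rates
# across risk groups (here, ten equal-size groups ordered by predicted risk).
groups = np.array_split(np.argsort(risk), 10)
pred_rate = [risk[g].mean() for g in groups]
obs_rate = [y[g].mean() for g in groups]
print("calibration correlation:", np.corrcoef(pred_rate, obs_rate)[0, 1])
```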

  17. Modelling unsupervised online-learning of artificial grammars: linking implicit and statistical learning.

    PubMed

    Rohrmeier, Martin A; Cross, Ian

    2014-07-01

    Humans rapidly learn complex structures in various domains. Findings of above-chance performance of some untrained control groups in artificial grammar learning studies raise questions about the extent to which learning can occur in an untrained, unsupervised testing situation with both correct and incorrect structures. The plausibility of unsupervised online-learning effects was modelled with n-gram, chunking and simple recurrent network models. A novel evaluation framework was applied, which alternates forced binary grammaticality judgments and subsequent learning of the same stimulus. Our results indicate a strong online learning effect for n-gram and chunking models and a weaker effect for simple recurrent network models. Such findings suggest that online learning is a plausible effect of statistical chunk learning that is possible when ungrammatical sequences contain a large proportion of grammatical chunks. Such common effects of continuous statistical learning may underlie statistical and implicit learning paradigms and raise implications for study design and testing methodologies. Copyright © 2014 Elsevier Inc. All rights reserved.

  18. Metrological traceability in education: A practical online system for measuring and managing middle school mathematics instruction

    NASA Astrophysics Data System (ADS)

    Torres Irribarra, D.; Freund, R.; Fisher, W.; Wilson, M.

    2015-02-01

    Computer-based, online assessments modelled, designed, and evaluated for adaptively administered invariant measurement are uniquely suited to defining and maintaining traceability to standardized units in education. An assessment of this kind is embedded in the Assessing Data Modeling and Statistical Reasoning (ADM) middle school mathematics curriculum. Diagnostic information about middle school students' learning of statistics and modeling is provided via computer-based formative assessments for seven constructs that comprise a learning progression for statistics and modeling from late elementary through the middle school grades. The seven constructs are: Data Display, Meta-Representational Competence, Conceptions of Statistics, Chance, Modeling Variability, Theory of Measurement, and Informal Inference. The end product is a web-delivered system built with Ruby on Rails for use by curriculum development teams working with classroom teachers in designing, developing, and delivering formative assessments. The online accessible system allows teachers to accurately diagnose students' unique comprehension and learning needs in a common language of real-time assessment, logging, analysis, feedback, and reporting.

  19. Pitfalls in statistical landslide susceptibility modelling

    NASA Astrophysics Data System (ADS)

    Schröder, Boris; Vorpahl, Peter; Märker, Michael; Elsenbeer, Helmut

    2010-05-01

    The use of statistical methods is a well-established approach to predict landslide occurrence probabilities and to assess landslide susceptibility. This is achieved by applying statistical methods relating historical landslide inventories to topographic indices as predictor variables. In our contribution, we compare several new and powerful methods developed in machine learning and well-established in landscape ecology and macroecology for predicting the distribution of shallow landslides in tropical mountain rainforests in southern Ecuador (among others: boosted regression trees, multivariate adaptive regression splines, maximum entropy). Although these methods are powerful, we think it is necessary to follow a basic set of guidelines to avoid some pitfalls regarding data sampling, predictor selection, and model quality assessment, especially if a comparison of different models is contemplated. We therefore suggest applying a novel toolbox to evaluate approaches to the statistical modelling of landslide susceptibility. Additionally, we propose some methods to open the "black box" inherent in machine learning methods in order to achieve further explanatory insights into the preparatory factors that control landslides. Sampling of training data should be guided by hypotheses regarding the processes that lead to slope failure, taking into account their respective spatial scales. This approach leads to the selection of a set of candidate predictor variables considered on adequate spatial scales. This set should be checked for multicollinearity in order to facilitate the interpretation of model response curves. Model quality assessment evaluates how well a model is able to reproduce independent observations of its response variable. This includes criteria to evaluate different aspects of model performance, i.e. model discrimination, model calibration, and model refinement. In order to assess a possible violation of the assumption of independence in the training samples, or a possible lack of explanatory information in the chosen set of predictor variables, the model residuals need to be checked for spatial autocorrelation; we therefore calculate spline correlograms. In addition, we investigate partial dependency plots and bivariate interaction plots, considering possible interactions between predictors, to improve model interpretation. In presenting this toolbox for model quality assessment, we investigate how strategies for constructing training datasets influence the quality of statistical landslide susceptibility models.
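
    A toy Python sketch of one step in such a workflow: fitting a boosted regression tree model to landslide presence/absence data with topographic predictors and assessing discrimination on held-out observations. The predictors, sample sizes, and data-generating process are invented placeholders, and this is not the authors' toolbox.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Simulated topographic predictors and landslide occurrences (placeholders).
rng = np.random.default_rng(42)
n = 1000
slope = rng.uniform(0, 45, n)      # hypothetical slope angle (degrees)
curvature = rng.normal(0, 1, n)    # hypothetical plan curvature
wetness = rng.normal(8, 2, n)      # hypothetical topographic wetness index
p = 1 / (1 + np.exp(-(0.08 * slope + 0.5 * curvature - 0.2 * wetness)))
landslide = rng.binomial(1, p)

X = np.column_stack([slope, curvature, wetness])
X_tr, X_te, y_tr, y_te = train_test_split(X, landslide, test_size=0.3, random_state=0)

# Boosted regression trees; discrimination assessed by hold-out AUC.
brt = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
print("hold-out AUC (discrimination):", roc_auc_score(y_te, brt.predict_proba(X_te)[:, 1]))
```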

  20. FAST COGNITIVE AND TASK ORIENTED, ITERATIVE DATA DISPLAY (FACTOID)

    DTIC Science & Technology

    2017-06-01

    approaches. As a result, the following assumptions guided our efforts in developing modeling and descriptive metrics for evaluation purposes... Application Evaluation. Our analytic workflow for evaluation is to first provide descriptive statistics about applications across metrics (performance... distributions for evaluation purposes because the goal of evaluation is accurate description, not inference (e.g., prediction). Outliers depicted

  1. Evaluating a linearized Euler equations model for strong turbulence effects on sound propagation.

    PubMed

    Ehrhardt, Loïc; Cheinet, Sylvain; Juvé, Daniel; Blanc-Benon, Philippe

    2013-04-01

    Sound propagation outdoors is strongly affected by atmospheric turbulence. Under strongly perturbed conditions or long propagation paths, the sound fluctuations reach their asymptotic behavior, e.g., the intensity variance progressively saturates. The present study evaluates the ability of a numerical propagation model, based on finite-difference time-domain solving of the linearized Euler equations, to quantitatively reproduce the wave statistics under strong and saturated intensity fluctuations. It is the continuation of a previous study where weak intensity fluctuations were considered. The numerical propagation model is presented and tested with two-dimensional harmonic sound propagation over long paths and strong atmospheric perturbations. The results are compared to quantitative theoretical or numerical predictions available on the wave statistics, including the log-amplitude variance and the probability density functions of the complex acoustic pressure. The match is excellent for the evaluated source frequencies and all sound fluctuation strengths. Hence, this model captures these many aspects of strong atmospheric turbulence effects on sound propagation. Finally, the model results for the intensity probability density function are compared with a standard fit by a generalized gamma function.

  2. Performance Indicators in Education.

    ERIC Educational Resources Information Center

    Irvine, David J.

    Evaluation of education involves assessing the effectiveness of schools and trying to determine how best to improve them. Since evaluation often deals only with the question of effectiveness, performance indicators in education are designed to make evaluation more complete. They are a set of statistical models which relate several important…

  3. DEVELOPMENT OF THE VIRTUAL BEACH MODEL, PHASE 1: AN EMPIRICAL MODEL

    EPA Science Inventory

    With increasing attention focused on the use of multiple linear regression (MLR) modeling of beach fecal bacteria concentration, the validity of the entire statistical process should be carefully evaluated to assure satisfactory predictions. This work aims to identify pitfalls an...

  4. Alpha1 LASSO data bundles Lamont, OK

    DOE Data Explorer

    Gustafson, William Jr; Vogelmann, Andrew; Endo, Satoshi; Toto, Tami; Xiao, Heng; Li, Zhijin; Cheng, Xiaoping; Krishna, Bhargavi (ORCID:000000018828528X)

    2016-08-03

    A data bundle is a unified package consisting of LASSO LES input and output, observations, evaluation diagnostics, and model skill scores. LES input includes model configuration information and forcing data. LES output includes profile statistics and full domain fields of cloud and environmental variables. Model evaluation data consists of LES output and ARM observations co-registered on the same grid and sampling frequency. Model performance is quantified by skill scores and diagnostics in terms of cloud and environmental variables.

  5. A Numerical Simulation and Statistical Modeling of High Intensity Radiated Fields Experiment Data

    NASA Technical Reports Server (NTRS)

    Smith, Laura J.

    2004-01-01

    Tests are conducted on a quad-redundant fault-tolerant flight control computer to establish the upset characteristics of an avionics system in an electromagnetic field. A numerical simulation and a statistical model are described in this work to analyze the open-loop experiment data collected in the reverberation chamber at NASA LaRC as part of an effort to examine the effects of electromagnetic interference on fly-by-wire aircraft control systems. By comparing thousands of simulation and model outputs, the models that best describe the data are first identified, and then a systematic statistical analysis is performed on the data. These efforts culminate in an extrapolation of values that are in turn used to support previous efforts to evaluate the data.

  6. Primal/dual linear programming and statistical atlases for cartilage segmentation.

    PubMed

    Glocker, Ben; Komodakis, Nikos; Paragios, Nikos; Glaser, Christian; Tziritas, Georgios; Navab, Nassir

    2007-01-01

    In this paper we propose a novel approach for automatic segmentation of cartilage using a statistical atlas and efficient primal/dual linear programming. To this end, a novel statistical atlas construction is considered from registered training examples. Segmentation is then solved through registration which aims at deforming the atlas such that the conditional posterior of the learned (atlas) density is maximized with respect to the image. Such a task is reformulated using a discrete set of deformations and segmentation becomes equivalent to finding the set of local deformations which optimally match the model to the image. We evaluate our method on 56 MRI data sets (28 used for the model and 28 used for evaluation) and obtain a fully automatic segmentation of patella cartilage volume with an overlap ratio of 0.84 with a sensitivity and specificity of 94.06% and 99.92%, respectively.
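
    The evaluation metrics reported above (overlap ratio, sensitivity, specificity) can be computed from binary masks as in the short Python sketch below. Whether the reported overlap ratio is Dice or Jaccard is not stated here, so Dice is shown as one common choice; the tiny masks are placeholders.

```python
import numpy as np

def segmentation_metrics(pred, ref):
    """Dice overlap, sensitivity, and specificity for binary masks."""
    pred, ref = np.asarray(pred, bool), np.asarray(ref, bool)
    tp = np.sum(pred & ref)
    fp = np.sum(pred & ~ref)
    fn = np.sum(~pred & ref)
    tn = np.sum(~pred & ~ref)
    dice = 2 * tp / (2 * tp + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return dice, sensitivity, specificity

ref  = np.array([[0, 1, 1], [0, 1, 1], [0, 0, 0]])   # reference mask (placeholder)
pred = np.array([[0, 1, 1], [0, 1, 0], [0, 0, 0]])   # predicted mask (placeholder)
print(segmentation_metrics(pred, ref))
```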

  7. Application of a Fuzzy Verification Technique for Assessment of the Weather Running Estimate-Nowcast (WRE-N) Model

    DTIC Science & Technology

    2016-10-01

    comes when considering numerous scores and statistics during a preliminary evaluation of the applicability of the fuzzy-verification minimum coverage... The selection of thresholds with which to generate categorical-verification scores and statistics from the application of both traditional and... of statistically significant numbers of cases; the latter presents a challenge of limited application for assessment of the forecast models' ability

  8. Evaluating observations in the context of predictions for the death valley regional groundwater system

    USGS Publications Warehouse

    Ely, D.M.; Hill, M.C.; Tiedeman, C.R.; O'Brien, G. M.

    2004-01-01

    When a model is calibrated by nonlinear regression, calculated diagnostic and inferential statistics provide a wealth of information about many aspects of the system. This work uses linear inferential statistics that are measures of prediction uncertainty to investigate the likely importance of continued monitoring of hydraulic head to the accuracy of model predictions. The measurements evaluated are hydraulic heads; the predictions of interest are subsurface transport from 15 locations. The advective component of transport is considered because it is the component most affected by the system dynamics represented by the regional-scale model being used. The problem is addressed using the capabilities of the U.S. Geological Survey computer program MODFLOW-2000, with its Advective Travel Observation (ADV) Package. Copyright ASCE 2004.

  9. Evaluation of neutron total and capture cross sections on 99Tc in the unresolved resonance region

    NASA Astrophysics Data System (ADS)

    Iwamoto, Nobuyuki; Katabuchi, Tatsuya

    2017-09-01

    The long-lived fission product technetium-99 is one of the most important radioisotopes for nuclear transmutation. Reliable nuclear data are indispensable over a wide energy range, up to a few MeV, in order to develop environmental-load-reducing technology. Statistical analyses of the resolved resonances were performed by using the truncated Porter-Thomas distribution, a coupled-channels optical model, a nuclear level density model, and Bayes' theorem on conditional probability. The total and capture cross sections were calculated with the nuclear reaction model code CCONE. The resulting cross sections are statistically consistent between the resolved and unresolved resonance regions. The evaluated capture data reproduce those recently measured at ANNRI of J-PARC/MLF above the resolved resonance region up to 800 keV.

  10. Deriving the expected utility of a predictive model when the utilities are uncertain.

    PubMed

    Cooper, Gregory F; Visweswaran, Shyam

    2005-01-01

    Predictive models are often constructed from clinical databases with the goal of eventually helping make better clinical decisions. Evaluating models using decision theory is therefore natural. When constructing a model using statistical and machine learning methods, however, we are often uncertain about precisely how the model will be used. Thus, decision-independent measures of classification performance, such as the area under an ROC curve, are popular. As a complementary method of evaluation, we investigate techniques for deriving the expected utility of a model under uncertainty about the model's utilities. We demonstrate an example of the application of this approach to the evaluation of two models that diagnose coronary artery disease.
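
    As a rough sketch of the idea of averaging expected utility over uncertainty in the utilities, the Python snippet below uses simple Monte Carlo. The outcome probabilities and utility ranges are invented placeholders, not the evaluation in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Probabilities of (true positive, false positive, false negative, true negative)
# implied by a model's predictions at some decision threshold (placeholders).
p_outcomes = np.array([0.15, 0.05, 0.10, 0.70])

def sample_utilities():
    """Draw one plausible utility vector (TP, FP, FN, TN) from assumed ranges."""
    return np.array([rng.uniform(0.8, 1.0),   # treat a diseased patient
                     rng.uniform(0.3, 0.7),   # treat a healthy patient
                     rng.uniform(0.0, 0.3),   # miss a diseased patient
                     1.0])                    # correctly leave a healthy patient alone

# Expected utility averaged over the distribution of uncertain utilities.
draws = np.array([p_outcomes @ sample_utilities() for _ in range(10000)])
print("expected utility:", draws.mean(), "+/-", draws.std(ddof=1))
```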

  11. Characterizing and Addressing the Need for Statistical Adjustment of Global Climate Model Data

    NASA Astrophysics Data System (ADS)

    White, K. D.; Baker, B.; Mueller, C.; Villarini, G.; Foley, P.; Friedman, D.

    2017-12-01

    As part of its mission to research and measure the effects of the changing climate, the U. S. Army Corps of Engineers (USACE) regularly uses the World Climate Research Programme's Coupled Model Intercomparison Project Phase 5 (CMIP5) multi-model dataset. However, these data are generated at a global level and are not fine-tuned for specific watersheds. This often causes CMIP5 output to vary from locally observed patterns in the climate. Several downscaling methods have been developed to increase the resolution of the CMIP5 data and decrease systemic differences to support decision-makers as they evaluate results at the watershed scale. Preliminary comparisons of observed and projected flow frequency curves over the US revealed a simple framework for water resources decision makers to plan and design water resources management measures under changing conditions using standard tools. Using this framework as a basis, USACE has begun to explore the use of statistical adjustment to alter global climate model data to better match the locally observed patterns while preserving the general structure and behavior of the model data. When paired with careful measurement and hypothesis testing, statistical adjustment can be particularly effective at navigating the compromise between the locally observed patterns and the global climate model structures for decision makers.

  12. Nakagami-based total variation method for speckle reduction in thyroid ultrasound images.

    PubMed

    Koundal, Deepika; Gupta, Savita; Singh, Sukhwinder

    2016-02-01

    A good statistical model is necessary for the reduction in speckle noise. The Nakagami model is more general than the Rayleigh distribution for statistical modeling of speckle in ultrasound images. In this article, the Nakagami-based noise removal method is presented to enhance thyroid ultrasound images and to improve clinical diagnosis. The statistics of log-compressed image are derived from the Nakagami distribution following a maximum a posteriori estimation framework. The minimization problem is solved by optimizing an augmented Lagrange and Chambolle's projection method. The proposed method is evaluated on both artificial speckle-simulated and real ultrasound images. The experimental findings reveal the superiority of the proposed method both quantitatively and qualitatively in comparison with other speckle reduction methods reported in the literature. The proposed method yields an average signal-to-noise ratio gain of more than 2.16 dB over the non-convex regularizer-based speckle noise removal method, 3.83 dB over the Aubert-Aujol model, 1.71 dB over the Shi-Osher model and 3.21 dB over the Rudin-Lions-Osher model on speckle-simulated synthetic images. Furthermore, visual evaluation of the despeckled images shows that the proposed method suppresses speckle noise well while preserving the textures and fine details. © IMechE 2015.
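
    To illustrate the statistical model underlying the method (not the despeckling algorithm itself), the Python sketch below simulates Nakagami-distributed envelope samples with scipy and recovers the parameters by maximum likelihood; the parameter values are arbitrary.

```python
import numpy as np
from scipy import stats

# The Nakagami distribution generalizes the Rayleigh model of speckle amplitude
# (Rayleigh corresponds to shape parameter nu = 1).
nu_true, scale_true = 0.7, 1.5
samples = stats.nakagami.rvs(nu_true, scale=scale_true, size=5000, random_state=3)

# Maximum-likelihood fit with the location fixed at zero.
nu_hat, loc_hat, scale_hat = stats.nakagami.fit(samples, floc=0)
print(f"estimated nu = {nu_hat:.2f}, scale = {scale_hat:.2f}")
```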

  13. Multiplicity Control in Structural Equation Modeling

    ERIC Educational Resources Information Center

    Cribbie, Robert A.

    2007-01-01

    Researchers conducting structural equation modeling analyses rarely, if ever, control for the inflated probability of Type I errors when evaluating the statistical significance of multiple parameters in a model. In this study, the Type I error control, power and true model rates of familywise and false discovery rate controlling procedures were…

  14. Evaluating Teachers and Schools Using Student Growth Models

    ERIC Educational Resources Information Center

    Schafer, William D.; Lissitz, Robert W.; Zhu, Xiaoshu; Zhang, Yuan; Hou, Xiaodong; Li, Ying

    2012-01-01

    Interest in Student Growth Modeling (SGM) and Value Added Modeling (VAM) arises from educators concerned with measuring the effectiveness of teaching and other school activities through changes in student performance as a companion and perhaps even an alternative to status. Several formal statistical models have been proposed for year-to-year…

  15. Evaluating simulations of daily discharge from large watersheds using autoregression and an index of flashiness

    USDA-ARS?s Scientific Manuscript database

    Watershed models are calibrated to simulate stream discharge as accurately as possible. Modelers will often calculate model validation statistics on aggregate (often monthly) time periods, rather than the daily step at which models typically operate. This is because daily hydrologic data exhibit lar...

  16. Developing Risk Prediction Models for Kidney Injury and Assessing Incremental Value for Novel Biomarkers

    PubMed Central

    Kerr, Kathleen F.; Meisner, Allison; Thiessen-Philbrook, Heather; Coca, Steven G.

    2014-01-01

    The field of nephrology is actively involved in developing biomarkers and improving models for predicting patients’ risks of AKI and CKD and their outcomes. However, some important aspects of evaluating biomarkers and risk models are not widely appreciated, and statistical methods are still evolving. This review describes some of the most important statistical concepts for this area of research and identifies common pitfalls. Particular attention is paid to metrics proposed within the last 5 years for quantifying the incremental predictive value of a new biomarker. PMID:24855282

  17. Evaluation of Model Fit in Cognitive Diagnosis Models

    ERIC Educational Resources Information Center

    Hu, Jinxiang; Miller, M. David; Huggins-Manley, Anne Corinne; Chen, Yi-Hsin

    2016-01-01

    Cognitive diagnosis models (CDMs) estimate student ability profiles using latent attributes. Model fit to the data needs to be ascertained in order to determine whether inferences from CDMs are valid. This study investigated the usefulness of some popular model fit statistics to detect CDM fit including relative fit indices (AIC, BIC, and CAIC),…

  18. GIA Model Statistics for GRACE Hydrology, Cryosphere, and Ocean Science

    NASA Astrophysics Data System (ADS)

    Caron, L.; Ivins, E. R.; Larour, E.; Adhikari, S.; Nilsson, J.; Blewitt, G.

    2018-03-01

    We provide a new analysis of glacial isostatic adjustment (GIA) with the goal of assembling the model uncertainty statistics required for rigorously extracting trends in surface mass from the Gravity Recovery and Climate Experiment (GRACE) mission. Such statistics are essential for deciphering sea level, ocean mass, and hydrological changes because the latter signals can be relatively small (≤2 mm/yr water height equivalent) over very large regions, such as major ocean basins and watersheds. With abundant new >7 year continuous measurements of vertical land motion (VLM) reported by Global Positioning System stations on bedrock and new relative sea level records, our new statistical evaluation of GIA uncertainties incorporates Bayesian methodologies. A unique aspect of the method is that both the ice history and 1-D Earth structure vary through a total of 128,000 forward models. We find that best fit models poorly capture the statistical inferences needed to correctly invert for lower mantle viscosity and that GIA uncertainty exceeds the uncertainty ascribed to trends from 14 years of GRACE data in polar regions.

  19. Cross-Participant EEG-Based Assessment of Cognitive Workload Using Multi-Path Convolutional Recurrent Neural Networks.

    PubMed

    Hefron, Ryan; Borghetti, Brett; Schubert Kabban, Christine; Christensen, James; Estepp, Justin

    2018-04-26

    Applying deep learning methods to electroencephalograph (EEG) data for cognitive state assessment has yielded improvements over previous modeling methods. However, research focused on cross-participant cognitive workload modeling using these techniques is underrepresented. We study the problem of cross-participant state estimation in a non-stimulus-locked task environment, where a trained model is used to make workload estimates on a new participant who is not represented in the training set. Using experimental data from the Multi-Attribute Task Battery (MATB) environment, a variety of deep neural network models are evaluated in the trade-space of computational efficiency, model accuracy, variance and temporal specificity yielding three important contributions: (1) The performance of ensembles of individually-trained models is statistically indistinguishable from group-trained methods at most sequence lengths. These ensembles can be trained for a fraction of the computational cost compared to group-trained methods and enable simpler model updates. (2) While increasing temporal sequence length improves mean accuracy, it is not sufficient to overcome distributional dissimilarities between individuals’ EEG data, as it results in statistically significant increases in cross-participant variance. (3) Compared to all other networks evaluated, a novel convolutional-recurrent model using multi-path subnetworks and bi-directional, residual recurrent layers resulted in statistically significant increases in predictive accuracy and decreases in cross-participant variance.

  20. Grain-Size Based Additivity Models for Scaling Multi-rate Uranyl Surface Complexation in Subsurface Sediments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Xiaoying; Liu, Chongxuan; Hu, Bill X.

    The additivity model assumes that field-scale reaction properties in a sediment, including surface area, reactive site concentration, and reaction rate, can be predicted from the field-scale grain-size distribution by linearly adding reaction properties estimated in the laboratory for individual grain-size fractions. This study evaluated the additivity model in scaling mass transfer-limited, multi-rate uranyl (U(VI)) surface complexation reactions in a contaminated sediment. Experimental data of rate-limited U(VI) desorption in a stirred flow-cell reactor were used to estimate the statistical properties of the rate constants for individual grain-size fractions, which were then used to predict rate-limited U(VI) desorption in the composite sediment. The result indicated that the additivity model with respect to the rate of U(VI) desorption provided a good prediction of U(VI) desorption in the composite sediment. However, the rate constants were not directly scalable using the additivity model. An approximate additivity model for directly scaling rate constants was subsequently proposed and evaluated. The result showed that the approximate model provided a good prediction of the experimental results within statistical uncertainty. This study also found that a gravel-size fraction (2 to 8 mm), which is often ignored in modeling U(VI) sorption and desorption, makes a statistically significant contribution to U(VI) desorption in the sediment.
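
    A highly simplified Python sketch of the additivity idea: a composite property is predicted as the mass-fraction-weighted sum of properties measured on individual grain-size fractions. The fractions, values, and the rate-weighting shown are hypothetical illustrations, not the paper's calibrated parameters.

```python
import numpy as np

# Mass fractions and per-fraction properties (all values are placeholders).
mass_fraction = np.array([0.35, 0.40, 0.25])   # e.g., <2 mm, 2-8 mm gravel, >8 mm
site_conc     = np.array([12.0, 3.0, 0.5])     # reactive site concentration (umol/g)
rate_const    = np.array([0.08, 0.02, 0.01])   # fraction-specific rate constants (1/h)

# Linear additivity for an extensive property such as site concentration.
composite_sites = np.sum(mass_fraction * site_conc)

# One simple site-weighted form of an approximate composite rate constant.
composite_rate = np.sum(mass_fraction * site_conc * rate_const) / composite_sites

print(composite_sites, composite_rate)
```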

  1. Cross-Participant EEG-Based Assessment of Cognitive Workload Using Multi-Path Convolutional Recurrent Neural Networks

    PubMed Central

    Hefron, Ryan; Borghetti, Brett; Schubert Kabban, Christine; Christensen, James; Estepp, Justin

    2018-01-01

    Applying deep learning methods to electroencephalograph (EEG) data for cognitive state assessment has yielded improvements over previous modeling methods. However, research focused on cross-participant cognitive workload modeling using these techniques is underrepresented. We study the problem of cross-participant state estimation in a non-stimulus-locked task environment, where a trained model is used to make workload estimates on a new participant who is not represented in the training set. Using experimental data from the Multi-Attribute Task Battery (MATB) environment, a variety of deep neural network models are evaluated in the trade-space of computational efficiency, model accuracy, variance and temporal specificity yielding three important contributions: (1) The performance of ensembles of individually-trained models is statistically indistinguishable from group-trained methods at most sequence lengths. These ensembles can be trained for a fraction of the computational cost compared to group-trained methods and enable simpler model updates. (2) While increasing temporal sequence length improves mean accuracy, it is not sufficient to overcome distributional dissimilarities between individuals’ EEG data, as it results in statistically significant increases in cross-participant variance. (3) Compared to all other networks evaluated, a novel convolutional-recurrent model using multi-path subnetworks and bi-directional, residual recurrent layers resulted in statistically significant increases in predictive accuracy and decreases in cross-participant variance. PMID:29701668

  2. Statistical Models for the Analysis and Design of Digital Polymerase Chain Reaction (dPCR) Experiments.

    PubMed

    Dorazio, Robert M; Hunter, Margaret E

    2015-11-03

    Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary, log-log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model's parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.
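
    For a single dPCR sample with no covariates, the binomial model with a complementary log-log link and a log(partition volume) offset reduces to the familiar Poisson-based estimator of concentration, sketched below in Python with placeholder numbers.

```python
import numpy as np

n_partitions = 20000     # total partitions read (placeholder)
k_positive = 4200        # partitions showing amplification (placeholder)
v = 0.85e-3              # partition volume in microliters (placeholder)

p_hat = k_positive / n_partitions
lam_hat = -np.log(1.0 - p_hat) / v                               # copies per microliter
se_lam = np.sqrt(p_hat / (n_partitions * (1.0 - p_hat))) / v     # delta-method standard error

print(f"concentration = {lam_hat:.1f} copies/uL (SE {se_lam:.1f})")
```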

  3. Performance of Reclassification Statistics in Comparing Risk Prediction Models

    PubMed Central

    Paynter, Nina P.

    2012-01-01

    Concerns have been raised about the use of traditional measures of model fit in evaluating risk prediction models for clinical use, and reclassification tables have been suggested as an alternative means of assessing the clinical utility of a model. Several measures based on the table have been proposed, including the reclassification calibration (RC) statistic, the net reclassification improvement (NRI), and the integrated discrimination improvement (IDI), but the performance of these in practical settings has not been fully examined. We used simulations to estimate the type I error and power for these statistics in a number of scenarios, as well as the impact of the number and type of categories, when adding a new marker to an established or reference model. The type I error was found to be reasonable in most settings, and power was highest for the IDI, which was similar to the test of association. The relative power of the RC statistic, a test of calibration, and the NRI, a test of discrimination, varied depending on the model assumptions. These tools provide unique but complementary information. PMID:21294152
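
    A short Python sketch of the category-free versions of two of the measures discussed above, the NRI and the IDI, computed from predicted risks under a reference and a new model. The data are simulated placeholders, and categorical variants would require the category boundaries used in a given study.

```python
import numpy as np

def nri_idi(y, p_old, p_new):
    """Category-free net reclassification improvement and integrated
    discrimination improvement for binary outcome y and predicted risks."""
    y, p_old, p_new = map(np.asarray, (y, p_old, p_new))
    events, nonevents = y == 1, y == 0
    up, down = p_new > p_old, p_new < p_old
    nri = ((up[events].mean() - down[events].mean())
           + (down[nonevents].mean() - up[nonevents].mean()))
    idi = ((p_new[events].mean() - p_old[events].mean())
           - (p_new[nonevents].mean() - p_old[nonevents].mean()))
    return nri, idi

rng = np.random.default_rng(7)
p_old = rng.uniform(0.05, 0.6, 1000)                       # reference-model risks
p_new = np.clip(p_old + rng.normal(0.0, 0.05, 1000), 0, 1) # new-model risks
y = rng.binomial(1, p_old)                                 # simulated outcomes
print(nri_idi(y, p_old, p_new))
```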

  4. Multi-criterion model ensemble of CMIP5 surface air temperature over China

    NASA Astrophysics Data System (ADS)

    Yang, Tiantian; Tao, Yumeng; Li, Jingjing; Zhu, Qian; Su, Lu; He, Xiaojia; Zhang, Xiaoming

    2018-05-01

    The global circulation models (GCMs) are useful tools for simulating climate change, projecting future temperature changes, and therefore, supporting the preparation of national climate adaptation plans. However, different GCMs are not always in agreement with each other over various regions, because their configurations, module characteristics, and dynamic forcings vary from one to another. Model ensemble techniques are extensively used to post-process the outputs from GCMs and improve the variability of model outputs. Root-mean-square error (RMSE), correlation coefficient (CC, or R) and uncertainty are commonly used statistics for evaluating the performances of GCMs. However, the simultaneous achievement of all satisfactory statistics cannot be guaranteed when using many model ensemble techniques. In this paper, we propose a multi-model ensemble framework, using a state-of-the-art evolutionary multi-objective optimization algorithm (termed MOSPD), to evaluate different characteristics of ensemble candidates and to provide comprehensive trade-off information for different model ensemble solutions. A case study of optimizing the surface air temperature (SAT) ensemble solutions over different geographical regions of China is carried out. The data cover the period from 1900 to 2100, and the projections of SAT are analyzed with regard to three statistical indices (i.e., RMSE, CC, and uncertainty). Among the derived ensemble solutions, the trade-off information is further analyzed with a robust Pareto front with respect to the different statistics. The comparison results over the historical period (1900-2005) show that the optimized solutions are superior to those obtained from a simple model average, as well as to any single GCM output. The improvements in the statistics vary across the climatic regions of China. Future projections (2006-2100) with the proposed ensemble method identify that the largest (smallest) temperature changes will happen in the South Central China (the Inner Mongolia), the North Eastern China (the South Central China), and the North Western China (the South Central China), under RCP 2.6, RCP 4.5, and RCP 8.5 scenarios, respectively.

  5. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  6. Multivariate Strategies in Functional Magnetic Resonance Imaging

    ERIC Educational Resources Information Center

    Hansen, Lars Kai

    2007-01-01

    We discuss aspects of multivariate fMRI modeling, including the statistical evaluation of multivariate models and means for dimensional reduction. In a case study we analyze linear and non-linear dimensional reduction tools in the context of a "mind reading" predictive multivariate fMRI model.

  7. Hedonic approaches based on spatial econometrics and spatial statistics: application to evaluation of project benefits

    NASA Astrophysics Data System (ADS)

    Tsutsumi, Morito; Seya, Hajime

    2009-12-01

    This study discusses the theoretical foundation of the application of spatial hedonic approaches—the hedonic approach employing spatial econometrics or/and spatial statistics—to benefits evaluation. The study highlights the limitations of the spatial econometrics approach since it uses a spatial weight matrix that is not employed by the spatial statistics approach. Further, the study presents empirical analyses by applying the Spatial Autoregressive Error Model (SAEM), which is based on the spatial econometrics approach, and the Spatial Process Model (SPM), which is based on the spatial statistics approach. SPMs are conducted based on both isotropy and anisotropy and applied to different mesh sizes. The empirical analysis reveals that the estimated benefits are quite different, especially between isotropic and anisotropic SPM and between isotropic SPM and SAEM; the estimated benefits are similar for SAEM and anisotropic SPM. The study demonstrates that the mesh size does not affect the estimated amount of benefits. Finally, the study provides a confidence interval for the estimated benefits and raises an issue with regard to benefit evaluation.

  8. An Investigation of Item Fit Statistics for Mixed IRT Models

    ERIC Educational Resources Information Center

    Chon, Kyong Hee

    2009-01-01

    The purpose of this study was to investigate procedures for assessing model fit of IRT models for mixed format data. In this study, various IRT model combinations were fitted to data containing both dichotomous and polytomous item responses, and the suitability of the chosen model mixtures was evaluated based on a number of model fit procedures.…

  9. Improving UWB-Based Localization in IoT Scenarios with Statistical Models of Distance Error.

    PubMed

    Monica, Stefania; Ferrari, Gianluigi

    2018-05-17

    Interest in the Internet of Things (IoT) is rapidly increasing, as the number of connected devices is exponentially growing. One of the application scenarios envisaged for IoT technologies involves indoor localization and context awareness. In this paper, we focus on a localization approach that relies on a particular type of communication technology, namely Ultra Wide Band (UWB). UWB technology is an attractive choice for indoor localization, owing to its high accuracy. Since localization algorithms typically rely on estimated inter-node distances, the goal of this paper is to evaluate the improvement brought by a simple (linear) statistical model of the distance error. On the basis of an extensive experimental measurement campaign, we propose a general analytical framework, based on a Least Square (LS) method, to derive a novel statistical model for the range estimation error between a pair of UWB nodes. The proposed statistical model is then applied to improve the performance of a few illustrative localization algorithms in various realistic scenarios. The obtained experimental results show that the use of the proposed statistical model improves the accuracy of the considered localization algorithms with a reduction of the localization error up to 66%.
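    The idea of a simple linear statistical model of the range error, fitted by least squares and then used to correct raw distance estimates, can be sketched as below. The calibration measurements and fitted coefficients are invented for illustration and are not those from the paper's measurement campaign.

    ```python
    # Sketch: fit a linear model of UWB ranging error versus estimated distance by
    # least squares, then subtract the predicted bias from new raw range estimates.
    # All measurement values are hypothetical.
    import numpy as np

    # Calibration data: raw estimated distances and the corresponding true distances (m)
    d_est  = np.array([1.1, 2.3, 3.2, 4.4, 5.1, 6.3, 7.4, 8.2])
    d_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

    # Model the error as e = a*d_est + b (a linear statistical model of the distance error)
    A = np.column_stack([d_est, np.ones_like(d_est)])
    (a, b), *_ = np.linalg.lstsq(A, d_est - d_true, rcond=None)

    def correct_range(d_raw):
        """Remove the predicted systematic bias from a raw UWB range estimate."""
        return d_raw - (a * d_raw + b)

    print("fitted error model: e =", round(a, 3), "* d +", round(b, 3))
    print("corrected 5.1 m estimate:", round(correct_range(5.1), 3))
    ```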

  10. Effect of Internet-Based Cognitive Apprenticeship Model (i-CAM) on Statistics Learning among Postgraduate Students

    PubMed Central

    Saadati, Farzaneh; Ahmad Tarmizi, Rohani

    2015-01-01

    Because students’ ability to use statistics, which is mathematical in nature, is one of the concerns of educators, embedding the pedagogical characteristics of learning within an e-learning system is ‘value added’ because it facilitates the conventional method of learning mathematics. Many researchers emphasize the effectiveness of cognitive apprenticeship in learning and problem solving in the workplace. In a cognitive apprenticeship learning model, skills are learned within a community of practitioners through observation of modelling and then practice plus coaching. This study utilized an internet-based Cognitive Apprenticeship Model (i-CAM) in three phases and evaluated its effectiveness for improving statistics problem-solving performance among postgraduate students. The results showed that, when compared to the conventional mathematics learning model, i-CAM could significantly promote students’ problem-solving performance at the end of each phase. In addition, the differences in students' test scores were statistically significant after controlling for the pre-test scores. The findings conveyed in this paper confirm the considerable value of i-CAM in the improvement of statistics learning for non-specialized postgraduate students. PMID:26132553

  11. On the Land-Ocean Contrast of Tropical Convection and Microphysics Statistics Derived from TRMM Satellite Signals and Global Storm-Resolving Models

    NASA Technical Reports Server (NTRS)

    Matsui, Toshihisa; Chern, Jiun-Dar; Tao, Wei-Kuo; Lang, Stephen E.; Satoh, Masaki; Hashino, Tempei; Kubota, Takuji

    2016-01-01

    A 14-year climatology of Tropical Rainfall Measuring Mission (TRMM) collocated multi-sensor signal statistics reveals a distinct land-ocean contrast as well as geographical variability of precipitation type, intensity, and microphysics. Microphysics information inferred from the TRMM precipitation radar and Microwave Imager (TMI) shows a large land-ocean contrast for the deep category, suggesting continental convective vigor. Over land, TRMM shows higher echo-top heights and larger maximum echoes, suggesting taller storms and more intense precipitation, as well as larger microwave scattering, suggesting the presence of more and larger frozen convective hydrometeors. This strong land-ocean contrast in deep convection is invariant over seasonal and multi-year time-scales. Consequently, relatively short-term simulations from two global storm-resolving models can be evaluated in terms of their land-ocean statistics using the TRMM Triple-sensor Three-step Evaluation via a satellite simulator. The models evaluated are the NASA Multi-scale Modeling Framework (MMF) and the Non-hydrostatic Icosahedral Cloud Atmospheric Model (NICAM). While both simulations can represent convective land-ocean contrasts in warm precipitation to some extent, near-surface conditions over land are relatively moister in NICAM than in the MMF, which appears to be the key driver of the divergent warm precipitation results between the two models. Both the MMF and NICAM produced similar frequencies of large CAPE between land and ocean. The dry MMF boundary layer enhanced microwave scattering signals over land, but only NICAM had an enhanced deep convection frequency over land. Neither model could reproduce a realistic land-ocean contrast in deep convective precipitation microphysics. A realistic contrast between land and ocean remains an issue in global storm-resolving modeling.

  12. Gene-Based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions.

    PubMed

    Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y; Chen, Wei

    2016-02-01

    Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, here we develop Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT), which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than, or power similar to, Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example. © 2016 WILEY PERIODICALS, INC.

  13. Gene-based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions

    PubMed Central

    Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E.; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y.; Chen, Wei

    2015-01-01

    Summary Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, we develop here Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT), which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than, or power similar to, Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example. PMID:26782979

  14. Statistical framework for evaluation of climate model simulations by use of climate proxy data from the last millennium - Part 1: Theory

    NASA Astrophysics Data System (ADS)

    Sundberg, R.; Moberg, A.; Hind, A.

    2012-08-01

    A statistical framework for comparing the output of ensemble simulations from global climate models with networks of climate proxy and instrumental records has been developed, focusing on near-surface temperatures for the last millennium. This framework includes the formulation of a joint statistical model for proxy data, instrumental data and simulation data, which is used to optimize a quadratic distance measure for ranking climate model simulations. An essential underlying assumption is that the simulations and the proxy/instrumental series have a shared component of variability that is due to temporal changes in external forcing, such as volcanic aerosol load, solar irradiance or greenhouse gas concentrations. Two statistical tests have been formulated. Firstly, a preliminary test establishes whether a significant temporal correlation exists between instrumental/proxy and simulation data. Secondly, the distance measure is expressed in the form of a test statistic of whether a forced simulation is closer to the instrumental/proxy series than unforced simulations. The proposed framework allows any number of proxy locations to be used jointly, with different seasons, record lengths and statistical precision. The goal is to objectively rank several competing climate model simulations (e.g. with alternative model parameterizations or alternative forcing histories) by means of their goodness of fit to the unobservable true past climate variations, as estimated from noisy proxy data and instrumental observations.
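    The two-step logic described above (a preliminary correlation test for a shared forced signal, followed by a distance-based comparison of forced and unforced simulations) can be caricatured with a small Monte Carlo sketch. The series lengths, noise levels, and the simple mean-squared distance below are assumptions for illustration, not the framework's exact statistics.

    ```python
    # Toy sketch of the two-step idea: (1) test for a shared forced signal via the
    # correlation between a proxy series and the forced-ensemble mean, (2) compare a
    # distance measure for forced vs. unforced simulations. Synthetic data throughout.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n_years = 500
    forcing = np.cumsum(rng.normal(0, 0.05, n_years))          # common external forcing
    proxy = forcing + rng.normal(0, 0.6, n_years)               # noisy proxy record
    forced_runs = forcing + rng.normal(0, 0.3, (5, n_years))    # forced ensemble
    unforced_runs = rng.normal(0, 0.3, (5, n_years))            # control ensemble

    # Step 1: preliminary test of a shared temporal signal
    r, p = stats.pearsonr(proxy, forced_runs.mean(axis=0))
    print(f"proxy vs forced ensemble mean: r={r:.2f}, p={p:.1e}")

    # Step 2: simple quadratic distance of each simulation to the proxy series
    dist = lambda sim: np.mean((sim - proxy) ** 2)
    print("mean distance, forced  :", round(np.mean([dist(s) for s in forced_runs]), 2))
    print("mean distance, unforced:", round(np.mean([dist(s) for s in unforced_runs]), 2))
    ```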

  15. Evaluation of different models to estimate the global solar radiation on inclined surface

    NASA Astrophysics Data System (ADS)

    Demain, C.; Journée, M.; Bertrand, C.

    2012-04-01

    Global and diffuse solar radiation intensities are, in general, measured on horizontal surfaces, whereas stationary solar conversion systems (both flat plate solar collectors and solar photovoltaics) are mounted on inclined surfaces to maximize the amount of solar radiation incident on the collector surface. Consequently, the solar radiation incident on a tilted surface has to be determined by converting the solar radiation measured on the horizontal surface to the tilted surface of interest. This study evaluates the performance of 14 models transposing 10-minute, hourly, and daily diffuse solar irradiation from horizontal to inclined surfaces. Solar radiation data from 8 months (April to November 2011), which include diverse atmospheric conditions and solar altitudes, measured on the roof of the radiation tower of the Royal Meteorological Institute of Belgium in Uccle (Longitude 4.35°, Latitude 50.79°), were used for validation purposes. The individual model performance is assessed by an inter-comparison between the calculated and measured global solar radiation on the south-oriented surface tilted at 50.79°, using statistical methods. The relative performance of the different models under different sky conditions has been studied. Comparison of the statistical errors of the different radiation models as a function of the clearness index shows that some models perform better under one type of sky condition. Combining different models for different sky conditions can reduce the statistical error between the measured and estimated global solar radiation. As the models described in this paper have been developed for hourly data inputs, the statistical error indices are lowest for hourly data and increase for the 10-minute and daily frequency data.

  16. Use of Selected Goodness-of-Fit Statistics to Assess the Accuracy of a Model of Henry Hagg Lake, Oregon

    NASA Astrophysics Data System (ADS)

    Rounds, S. A.; Sullivan, A. B.

    2004-12-01

    Assessing a model's ability to reproduce field data is a critical step in the modeling process. For any model, some method of determining goodness-of-fit to measured data is needed to aid in calibration and to evaluate model performance. Visualizations and graphical comparisons of model output are an excellent way to begin that assessment. At some point, however, model performance must be quantified. Goodness-of-fit statistics, including the mean error (ME), mean absolute error (MAE), root mean square error, and coefficient of determination, typically are used to measure model accuracy. Statistical tools such as the sign test or Wilcoxon test can be used to test for model bias. The runs test can detect phase errors in simulated time series. Each statistic is useful, but each has its limitations. None provides a complete quantification of model accuracy. In this study, a suite of goodness-of-fit statistics was applied to a model of Henry Hagg Lake in northwest Oregon. Hagg Lake is a man-made reservoir on Scoggins Creek, a tributary to the Tualatin River. Located on the west side of the Portland metropolitan area, the Tualatin Basin is home to more than 450,000 people. Stored water in Hagg Lake helps to meet the agricultural and municipal water needs of that population. Future water demands have caused water managers to plan for a potential expansion of Hagg Lake, doubling its storage to roughly 115,000 acre-feet. A model of the lake was constructed to evaluate the lake's water quality and estimate how that quality might change after raising the dam. The laterally averaged, two-dimensional, U.S. Army Corps of Engineers model CE-QUAL-W2 was used to construct the Hagg Lake model. Calibrated for the years 2000 and 2001 and confirmed with data from 2002 and 2003, modeled parameters included water temperature, ammonia, nitrate, phosphorus, algae, zooplankton, and dissolved oxygen. Several goodness-of-fit statistics were used to quantify model accuracy and bias. Model performance was judged to be excellent for water temperature (annual ME: -0.22 to 0.05 ° C; annual MAE: 0.62 to 0.68 ° C) and dissolved oxygen (annual ME: -0.28 to 0.18 mg/L; annual MAE: 0.43 to 0.92 mg/L), showing that the model is sufficiently accurate for future water resources planning and management.
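    The basic goodness-of-fit statistics named above (ME, MAE, RMSE) are straightforward to compute directly; the short sketch below uses invented water temperature values rather than the Hagg Lake data.

    ```python
    # Sketch of the basic goodness-of-fit statistics used for model assessment:
    # mean error (bias), mean absolute error, and root mean square error.
    import numpy as np

    def fit_statistics(observed, simulated):
        resid = np.asarray(simulated) - np.asarray(observed)
        return {
            "ME":   resid.mean(),                  # bias: sign shows over/under-prediction
            "MAE":  np.abs(resid).mean(),          # typical magnitude of error
            "RMSE": np.sqrt((resid ** 2).mean()),  # penalizes large errors more strongly
        }

    # Hypothetical water temperature comparison (deg C)
    obs = [12.1, 14.3, 16.8, 18.2, 17.5]
    sim = [11.9, 14.6, 16.5, 18.9, 17.2]
    print(fit_statistics(obs, sim))
    ```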

  17. Analysis and modeling of wafer-level process variability in 28 nm FD-SOI using split C-V measurements

    NASA Astrophysics Data System (ADS)

    Pradeep, Krishna; Poiroux, Thierry; Scheer, Patrick; Juge, André; Gouget, Gilles; Ghibaudo, Gérard

    2018-07-01

    This work details the analysis of wafer level global process variability in 28 nm FD-SOI using split C-V measurements. The proposed approach initially evaluates the native on wafer process variability using efficient extraction methods on split C-V measurements. The on-wafer threshold voltage (VT) variability is first studied and modeled using a simple analytical model. Then, a statistical model based on the Leti-UTSOI compact model is proposed to describe the total C-V variability in different bias conditions. This statistical model is finally used to study the contribution of each process parameter to the total C-V variability.

  18. Selecting the "Best" Factor Structure and Moving Measurement Validation Forward: An Illustration.

    PubMed

    Schmitt, Thomas A; Sass, Daniel A; Chappelle, Wayne; Thompson, William

    2018-04-09

    Despite the broad literature base on factor analysis best practices, research seeking to evaluate a measure's psychometric properties frequently fails to consider or follow these recommendations. This leads to incorrect factor structures, numerous and often overly complex competing factor models and, perhaps most harmful, biased model results. Our goal is to demonstrate a practical and actionable process for factor analysis through (a) an overview of six statistical and psychometric issues and approaches to be aware of, investigate, and report when engaging in factor structure validation, along with a flowchart for recommended procedures to understand latent factor structures; (b) demonstrating these issues to provide a summary of the updated Posttraumatic Stress Disorder Checklist (PCL-5) factor models and a rationale for validation; and (c) conducting a comprehensive statistical and psychometric validation of the PCL-5 factor structure to demonstrate all the issues we described earlier. Considering previous research, the PCL-5 was evaluated using a sample of 1,403 U.S. Air Force remotely piloted aircraft operators with high levels of battlefield exposure. Previously proposed PCL-5 factor structures were not supported by the data, but instead a bifactor model is arguably more statistically appropriate.

  19. Statistical and dynamical forecast of regional precipitation after mature phase of ENSO

    NASA Astrophysics Data System (ADS)

    Sohn, S.; Min, Y.; Lee, J.; Tam, C.; Ahn, J.

    2010-12-01

    While the seasonal predictability of general circulation models (GCMs) has improved, the current model atmosphere in the mid-latitudes does not respond correctly to external forcing such as tropical sea surface temperature (SST), particularly over the East Asia and western North Pacific summer monsoon regions. In addition, the time-scale of the prediction scope is considerably limited, and model forecast skill is still very poor beyond two weeks. Although recent studies indicate that coupled-model-based multi-model ensemble (MME) forecasts show better performance, long-lead forecasts exceeding 9 months still show a dramatic decrease in seasonal predictability. This study aims at diagnosing the dynamical MME forecasts comprised of state-of-the-art 1-tier models, as well as comparing them with statistical model forecasts, focusing on East Asian summer precipitation predictions after the mature phase of ENSO. The lagged impact of El Nino, as a major climate contributor, on the summer monsoon in model environments is also evaluated in terms of conditional probabilities. To evaluate the probability forecast skills, the reliability (attributes) diagram and the relative operating characteristics, following the recommendations of the World Meteorological Organization (WMO) Standardized Verification System for Long-Range Forecasts, are used in this study. The results should shed light on the prediction skill of the dynamical models, and also of the statistical model, in forecasting East Asian summer monsoon rainfall at long lead times.

  20. Simulation and assimilation of satellite altimeter data at the oceanic mesoscale

    NASA Technical Reports Server (NTRS)

    Demay, P.; Robinson, A. R.

    1984-01-01

    An improved "objective analysis" technique is used along with an altimeter signal statistical model, an altimeter noise statistical model, an orbital model, and synoptic surface current maps in the POLYMODE-SDE area, to evaluate the performance of various observational strategies in capturing the mesoscale variability at mid-latitudes. In particular, simulated repetitive nominal orbits of ERS-1, TOPEX, and SPOT/POSEIDON are examined. Results show the critical importance of the existence of a subcycle, scanning in either direction. Moreover, long repeat cycles ( 20 days) and short cross-track distances ( 300 km) seem preferable, since they match mesoscale statistics. Another goal of the study is to prepare and discuss sea-surface height (SSH) assimilation in quasigeostrophic models. Restored SSH maps are shown to meet that purpose, if an efficient extrapolation method or deep in-situ data (floats) are used on the vertical to start and update the model.

  1. Development of a new eyellipse and seating accommodation model for trucks and buses.

    DOT National Transportation Integrated Search

    2005-11-01

    Driver posture data from a laboratory study and an in-vehicle test-track study were used to develop and to evaluate a new seating accommodation model and eyellipse for SAE Class-B vehicles. The new statistical models are configurable for populati...

  2. Applications of explicitly-incorporated/post-processing measurement uncertainty in watershed modeling

    USDA-ARS?s Scientific Manuscript database

    The importance of measurement uncertainty in terms of calculation of model evaluation error statistics has been recently stated in the literature. The impact of measurement uncertainty on calibration results indicates the potential vague zone in the field of watershed modeling where the assumption ...

  3. Automated finite element modeling of the lumbar spine: Using a statistical shape model to generate a virtual population of models.

    PubMed

    Campbell, J Q; Petrella, A J

    2016-09-06

    Population-based modeling of the lumbar spine has the potential to be a powerful clinical tool. However, developing a fully parameterized model of the lumbar spine with accurate geometry has remained a challenge. The current study used automated methods for landmark identification to create a statistical shape model of the lumbar spine. The shape model was evaluated using compactness, generalization ability, and specificity. The primary shape modes were analyzed visually, quantitatively, and biomechanically. The biomechanical analysis was performed by using the statistical shape model with an automated method for finite element model generation to create a fully parameterized finite element model of the lumbar spine. Functional finite element models of the mean shape and the extreme shapes (±3 standard deviations) of all 17 shape modes were created demonstrating the robust nature of the methods. This study represents an advancement in finite element modeling of the lumbar spine and will allow population-based modeling in the future. Copyright © 2016 Elsevier Ltd. All rights reserved.
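    A statistical shape model of the kind described above is, at its core, a principal component analysis of aligned landmark coordinates. The sketch below builds shape modes from hypothetical landmark data and generates shape instances at ±3 standard deviations, mirroring the study's use of extreme shapes; the alignment (Procrustes) and mesh-generation steps are omitted, and all data are invented.

    ```python
    # Minimal PCA-based statistical shape model: mean shape plus weighted shape modes.
    # Landmarks are hypothetical; real use would first rigidly align (Procrustes) them.
    import numpy as np

    rng = np.random.default_rng(0)
    n_subjects, n_landmarks = 40, 60
    shapes = rng.normal(0, 1.0, (n_subjects, n_landmarks * 3))   # flattened x,y,z landmarks

    mean_shape = shapes.mean(axis=0)
    centered = shapes - mean_shape
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    modes = Vt                                    # shape modes (principal directions)
    std_per_mode = S / np.sqrt(n_subjects - 1)    # standard deviation along each mode

    def generate_shape(weights):
        """New shape instance from mode weights expressed in standard deviations."""
        weights = np.asarray(weights)
        k = len(weights)
        return mean_shape + (weights * std_per_mode[:k]) @ modes[:k]

    extreme_plus  = generate_shape([+3.0])        # +3 SD along the primary mode
    extreme_minus = generate_shape([-3.0])        # -3 SD along the primary mode
    print(extreme_plus.shape, extreme_minus.shape)
    ```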

  4. Developing Statistical Physics Course Handout on Distribution Function Materials Based on Science, Technology, Engineering, and Mathematics

    NASA Astrophysics Data System (ADS)

    Riandry, M. A.; Ismet, I.; Akhsan, H.

    2017-09-01

    This study aims to produce a valid and practical statistical physics course handout on distribution function materials based on STEM. The Rowntree development model is used to produce this handout. The model consists of three stages: planning, development, and evaluation. In this study, the evaluation stage used Tessmer formative evaluation, which consists of 5 stages: self-evaluation, expert review, one-to-one evaluation, small group evaluation, and field test. However, the handout was tested only for validity and practicality, so the field test stage was not implemented. The data collection techniques were walkthroughs and questionnaires. Subjects of this study were students of the 6th and 8th semesters of academic year 2016/2017 in the Physics Education Study Program of Sriwijaya University. The average result of the expert review is 87.31% (very valid category). The one-to-one evaluation obtained an average result of 89.42%. The result of the small group evaluation is 85.92%. From the one-to-one and small group evaluation stages, the average student response to this handout is 87.67% (very practical category). Based on the results of the study, it can be concluded that the handout is valid and practical.

  5. Evaluating and implementing temporal, spatial, and spatio-temporal methods for outbreak detection in a local syndromic surveillance system

    PubMed Central

    Lall, Ramona; Levin-Rector, Alison; Sell, Jessica; Paladini, Marc; Konty, Kevin J.; Olson, Don; Weiss, Don

    2017-01-01

    The New York City Department of Health and Mental Hygiene has operated an emergency department syndromic surveillance system since 2001, using temporal and spatial scan statistics run on a daily basis for cluster detection. Since the system was originally implemented, a number of new methods have been proposed for use in cluster detection. We evaluated six temporal and four spatial/spatio-temporal detection methods using syndromic surveillance data spiked with simulated injections. The algorithms were compared on several metrics, including sensitivity, specificity, positive predictive value, coherence, and timeliness. We also evaluated each method’s implementation, programming time, run time, and the ease of use. Among the temporal methods, at a set specificity of 95%, a Holt-Winters exponential smoother performed the best, detecting 19% of the simulated injects across all shapes and sizes, followed by an autoregressive moving average model (16%), a generalized linear model (15%), a modified version of the Early Aberration Reporting System’s C2 algorithm (13%), a temporal scan statistic (11%), and a cumulative sum control chart (<2%). Of the spatial/spatio-temporal methods we tested, a spatial scan statistic detected 3% of all injects, a Bayes regression found 2%, and a generalized linear mixed model and a space-time permutation scan statistic detected none at a specificity of 95%. Positive predictive value was low (<7%) for all methods. Overall, the detection methods we tested did not perform well in identifying the temporal and spatial clusters of cases in the inject dataset. The spatial scan statistic, our current method for spatial cluster detection, performed slightly better than the other tested methods across different inject magnitudes and types. Furthermore, we found the scan statistics, as applied in the SaTScan software package, to be the easiest to program and implement for daily data analysis. PMID:28886112

  6. Evaluating and implementing temporal, spatial, and spatio-temporal methods for outbreak detection in a local syndromic surveillance system.

    PubMed

    Mathes, Robert W; Lall, Ramona; Levin-Rector, Alison; Sell, Jessica; Paladini, Marc; Konty, Kevin J; Olson, Don; Weiss, Don

    2017-01-01

    The New York City Department of Health and Mental Hygiene has operated an emergency department syndromic surveillance system since 2001, using temporal and spatial scan statistics run on a daily basis for cluster detection. Since the system was originally implemented, a number of new methods have been proposed for use in cluster detection. We evaluated six temporal and four spatial/spatio-temporal detection methods using syndromic surveillance data spiked with simulated injections. The algorithms were compared on several metrics, including sensitivity, specificity, positive predictive value, coherence, and timeliness. We also evaluated each method's implementation, programming time, run time, and the ease of use. Among the temporal methods, at a set specificity of 95%, a Holt-Winters exponential smoother performed the best, detecting 19% of the simulated injects across all shapes and sizes, followed by an autoregressive moving average model (16%), a generalized linear model (15%), a modified version of the Early Aberration Reporting System's C2 algorithm (13%), a temporal scan statistic (11%), and a cumulative sum control chart (<2%). Of the spatial/spatio-temporal methods we tested, a spatial scan statistic detected 3% of all injects, a Bayes regression found 2%, and a generalized linear mixed model and a space-time permutation scan statistic detected none at a specificity of 95%. Positive predictive value was low (<7%) for all methods. Overall, the detection methods we tested did not perform well in identifying the temporal and spatial clusters of cases in the inject dataset. The spatial scan statistic, our current method for spatial cluster detection, performed slightly better than the other tested methods across different inject magnitudes and types. Furthermore, we found the scan statistics, as applied in the SaTScan software package, to be the easiest to program and implement for daily data analysis.

  7. Time-to-event methodology improved statistical evaluation in register-based health services research.

    PubMed

    Bluhmki, Tobias; Bramlage, Peter; Volk, Michael; Kaltheuner, Matthias; Danne, Thomas; Rathmann, Wolfgang; Beyersmann, Jan

    2017-02-01

    Complex longitudinal sampling and the observational structure of patient registers in health services research are associated with methodological challenges regarding data management and statistical evaluation. We exemplify common pitfalls and want to stimulate discussions on the design, development, and deployment of future longitudinal patient registers and register-based studies. For illustrative purposes, we use data from the prospective, observational, German DIabetes Versorgungs-Evaluation register. One aim was to explore predictors for the initiation of a basal insulin supported therapy in patients with type 2 diabetes initially prescribed to glucose-lowering drugs alone. Major challenges are missing mortality information, time-dependent outcomes, delayed study entries, different follow-up times, and competing events. We show that time-to-event methodology is a valuable tool for improved statistical evaluation of register data and should be preferred to simple case-control approaches. Patient registers provide rich data sources for health services research. Analyses are accompanied with the trade-off between data availability, clinical plausibility, and statistical feasibility. Cox' proportional hazards model allows for the evaluation of the outcome-specific hazards, but prediction of outcome probabilities is compromised by missing mortality information. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. An evaluation of three statistical estimation methods for assessing health policy effects on prescription drug claims.

    PubMed

    Mittal, Manish; Harrison, Donald L; Thompson, David M; Miller, Michael J; Farmer, Kevin C; Ng, Yu-Tze

    2016-01-01

    While the choice of analytical approach affects study results and their interpretation, there is no consensus to guide the choice of statistical approaches to evaluate public health policy change. This study compared and contrasted three statistical estimation procedures in the assessment of a U.S. Food and Drug Administration (FDA) suicidality warning, communicated in January 2008 and implemented in May 2009, on antiepileptic drug (AED) prescription claims. Longitudinal designs were utilized to evaluate Oklahoma (U.S. State) Medicaid claim data from January 2006 through December 2009. The study included 9289 continuously eligible individuals with prevalent diagnoses of epilepsy and/or psychiatric disorder. Segmented regression models using three estimation procedures [i.e., generalized linear models (GLM), generalized estimation equations (GEE), and generalized linear mixed models (GLMM)] were used to estimate trends of AED prescription claims across three time periods: before (January 2006-January 2008); during (February 2008-May 2009); and after (June 2009-December 2009) the FDA warning. All three statistical procedures estimated an increasing trend (P < 0.0001) in AED prescription claims before the FDA warning period. No procedures detected a significant change in trend during (GLM: -30.0%, 99% CI: -60.0% to 10.0%; GEE: -20.0%, 99% CI: -70.0% to 30.0%; GLMM: -23.5%, 99% CI: -58.8% to 1.2%) and after (GLM: 50.0%, 99% CI: -70.0% to 160.0%; GEE: 80.0%, 99% CI: -20.0% to 200.0%; GLMM: 47.1%, 99% CI: -41.2% to 135.3%) the FDA warning when compared to pre-warning period. Although the three procedures provided consistent inferences, the GEE and GLMM approaches accounted appropriately for correlation. Further, marginal models estimated using GEE produced more robust and valid population-level estimations. Copyright © 2016 Elsevier Inc. All rights reserved.
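    A segmented (interrupted time series) regression estimated with GEE, one of the three procedures compared above, might look roughly like the sketch below. The simulated monthly claim counts, the change-point months, and the exchangeable working correlation are assumptions for illustration, not the study's actual specification or data.

    ```python
    # Sketch of a segmented regression for monthly prescription claims with GEE,
    # using level/slope-change terms at two hypothetical policy time points.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n_patients, n_months = 200, 48
    t = np.tile(np.arange(n_months), n_patients)
    pid = np.repeat(np.arange(n_patients), n_months)
    warn, impl = 25, 41                              # hypothetical warning / implementation months

    df = pd.DataFrame({
        "patient": pid,
        "time": t,
        "during": (t >= warn).astype(int),
        "after": (t >= impl).astype(int),
    })
    df["t_during"] = np.clip(df.time - warn, 0, None) * df.during   # slope change after warning
    df["t_after"] = np.clip(df.time - impl, 0, None) * df.after     # slope change after implementation
    lam = np.exp(-1.0 + 0.01 * df.time)                              # simulated pre-trend only
    df["claims"] = rng.poisson(lam)

    exog = sm.add_constant(df[["time", "during", "t_during", "after", "t_after"]])
    gee = sm.GEE(df["claims"], exog, groups=df["patient"],
                 family=sm.families.Poisson(),
                 cov_struct=sm.cov_struct.Exchangeable())
    print(gee.fit().summary())
    ```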

  9. Match statistics related to winning in the group stage of 2014 Brazil FIFA World Cup.

    PubMed

    Liu, Hongyou; Gomez, Miguel-Ángel; Lago-Peñas, Carlos; Sampaio, Jaime

    2015-01-01

    Identifying match statistics that strongly contribute to winning in football matches is a very important step towards a more predictive and prescriptive performance analysis. The current study aimed to determine relationships between 24 match statistics and the match outcome (win, loss and draw) in all games and close games of the group stage of FIFA World Cup (2014, Brazil) by employing the generalised linear model. The cumulative logistic regression was run in the model taking the value of each match statistic as independent variable to predict the logarithm of the odds of winning. Relationships were assessed as effects of a two-standard-deviation increase in the value of each variable on the change in the probability of a team winning a match. Non-clinical magnitude-based inferences were employed and were evaluated by using the smallest worthwhile change. Results showed that for all the games, nine match statistics had clearly positive effects on the probability of winning (Shot, Shot on Target, Shot from Counter Attack, Shot from Inside Area, Ball Possession, Short Pass, Average Pass Streak, Aerial Advantage and Tackle), four had clearly negative effects (Shot Blocked, Cross, Dribble and Red Card), other 12 statistics had either trivial or unclear effects. While for the close games, the effects of Aerial Advantage and Yellow Card turned to trivial and clearly negative, respectively. Information from the tactical modelling can provide a more thorough and objective match understanding to coaches and performance analysts for evaluating post-match performances and for scouting upcoming oppositions.

  10. Statistical models for the analysis and design of digital polymerase chain (dPCR) experiments

    USGS Publications Warehouse

    Dorazio, Robert; Hunter, Margaret

    2015-01-01

    Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary, log–log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model’s parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.

  11. Comparison of Artificial Neural Networks and ARIMA statistical models in simulations of target wind time series

    NASA Astrophysics Data System (ADS)

    Kolokythas, Kostantinos; Vasileios, Salamalikis; Athanassios, Argiriou; Kazantzidis, Andreas

    2015-04-01

    The wind is the result of complex interactions among numerous mechanisms taking place at small or large scales, so better knowledge of its behavior is essential in a variety of applications, especially in the field of power production from wind turbines. In the literature there is a considerable number of models, either physical or statistical, dealing with the problem of simulation and prediction of wind speed. Among others, Artificial Neural Networks (ANNs) are widely used for the purpose of wind forecasting and, in the great majority of cases, outperform other conventional statistical models. In this study, a number of ANNs with different architectures, created and applied to a dataset of wind time series, are compared to Auto Regressive Integrated Moving Average (ARIMA) statistical models. The data consist of mean hourly wind speeds from a wind farm in a hilly Greek region and cover a period of one year (2013). The main goal is to evaluate the models' ability to successfully simulate the wind speed at a significant point (target). Goodness-of-fit statistics are computed for the comparison of the different methods. In general, the ANNs showed the best performance in estimating wind speed, prevailing over the ARIMA models.
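    For the ARIMA side of such a comparison, a baseline fit on an hourly wind speed series might look like the sketch below; the synthetic series and the (2,0,1) order are placeholders, not the configurations evaluated in the study.

    ```python
    # Sketch of an ARIMA baseline for hourly wind speed: fit on a training window,
    # forecast the next 24 hours, and score with RMSE. Synthetic data only.
    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(3)
    n = 24 * 60
    noise = rng.normal(0, 0.8, n)
    wind = 6 + 2 * np.sin(2 * np.pi * np.arange(n) / 24) + np.convolve(noise, [0.5, 0.3, 0.2], "same")

    train, test = wind[:-24], wind[-24:]
    fit = ARIMA(train, order=(2, 0, 1)).fit()
    forecast = fit.forecast(steps=24)
    rmse = np.sqrt(np.mean((forecast - test) ** 2))
    print(f"24 h ahead RMSE: {rmse:.2f} m/s")
    ```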

  12. The Structure of Human Intelligence: It Is Verbal, Perceptual, and Image Rotation (VPR), Not Fluid and Crystallized

    ERIC Educational Resources Information Center

    Johnson, W.; Bouchard, T.J.

    2005-01-01

    In a heterogeneous sample of 436 adult individuals who completed 42 mental ability tests, we evaluated the relative statistical performance of three major psychometric models of human intelligence-the Cattell-Horn fluid-crystallized model, Vernon's verbal-perceptual model, and Carroll's three-strata model. The verbal-perceptual model fit…

  13. A Bayesian approach for parameter estimation and prediction using a computationally intensive model

    DOE PAGES

    Higdon, Dave; McDonnell, Jordan D.; Schunck, Nicolas; ...

    2015-02-05

    Bayesian methods have been successful in quantifying uncertainty in physics-based problems in parameter estimation and prediction. In these cases, physical measurements y are modeled as the best fit of a physics-based model $\eta(\theta)$, where $\theta$ denotes the uncertain, best input setting. Hence the statistical model is of the form $y = \eta(\theta) + \epsilon$, where $\epsilon$ accounts for measurement, and possibly other, error sources. When nonlinearity is present in $\eta(\cdot)$, the resulting posterior distribution for the unknown parameters in the Bayesian formulation is typically complex and nonstandard, requiring computationally demanding approaches such as Markov chain Monte Carlo (MCMC) to produce multivariate draws from the posterior. Although generally applicable, MCMC requires thousands (or even millions) of evaluations of the physics model $\eta(\cdot)$. This requirement is problematic if the model takes hours or days to evaluate. To overcome this computational bottleneck, we present an approach adapted from Bayesian model calibration. This approach combines output from an ensemble of computational model runs with physical measurements, within a statistical formulation, to carry out inference. A key component of this approach is a statistical response surface, or emulator, estimated from the ensemble of model runs. We demonstrate this approach with a case study in estimating parameters for a density functional theory model, using experimental mass/binding energy measurements from a collection of atomic nuclei. Lastly, we also demonstrate how this approach produces uncertainties in predictions for recent mass measurements obtained at Argonne National Laboratory.
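    The emulator-plus-calibration idea can be caricatured in a few dozen lines: train a Gaussian process surrogate on an ensemble of model runs and then run a simple Metropolis sampler against the surrogate instead of the expensive model. The toy "physics model", prior, and noise level below are assumptions for illustration only, not the authors' formulation.

    ```python
    # Sketch of Bayesian calibration with an emulator: a GP surrogate replaces the
    # expensive model inside a Metropolis sampler. Toy 1-parameter problem.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def expensive_model(theta):            # stand-in for a slow physics code
        return np.sin(3 * theta) + 0.5 * theta

    # Ensemble of model runs at design points (the only place the model is called)
    design = np.linspace(0, 2, 15).reshape(-1, 1)
    runs = expensive_model(design).ravel()
    emulator = GaussianProcessRegressor(kernel=RBF(length_scale=0.5)).fit(design, runs)

    # Synthetic "measurement" y = eta(theta_true) + eps
    theta_true, sigma = 1.3, 0.05
    y_obs = expensive_model(theta_true) + 0.02

    def log_post(theta):
        if not 0 <= theta <= 2:            # flat prior on [0, 2]
            return -np.inf
        pred = emulator.predict(np.array([[theta]]))[0]
        return -0.5 * ((y_obs - pred) / sigma) ** 2

    # Plain Metropolis random walk using only emulator evaluations
    rng = np.random.default_rng(4)
    theta, lp = 1.0, log_post(1.0)
    samples = []
    for _ in range(5000):
        prop = theta + rng.normal(0, 0.1)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        samples.append(theta)
    print("posterior mean of theta:", np.round(np.mean(samples[1000:]), 3))
    ```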

  14. Exposure time independent summary statistics for assessment of drug dependent cell line growth inhibition.

    PubMed

    Falgreen, Steffen; Laursen, Maria Bach; Bødker, Julie Støve; Kjeldsen, Malene Krag; Schmitz, Alexander; Nyegaard, Mette; Johnsen, Hans Erik; Dybkær, Karen; Bøgsted, Martin

    2014-06-05

    In vitro generated dose-response curves of human cancer cell lines are widely used to develop new therapeutics. The curves are summarised by simplified statistics that ignore the conventionally used dose-response curves' dependency on drug exposure time and growth kinetics. This may lead to suboptimal exploitation of data and biased conclusions on the potential of the drug in question. Therefore we set out to improve the dose-response assessments by eliminating the impact of time dependency. First, a mathematical model for drug induced cell growth inhibition was formulated and used to derive novel dose-response curves and improved summary statistics that are independent of time under the proposed model. Next, a statistical analysis workflow for estimating the improved statistics was suggested consisting of 1) nonlinear regression models for estimation of cell counts and doubling times, 2) isotonic regression for modelling the suggested dose-response curves, and 3) resampling based method for assessing variation of the novel summary statistics. We document that conventionally used summary statistics for dose-response experiments depend on time so that fast growing cell lines compared to slowly growing ones are considered overly sensitive. The adequacy of the mathematical model is tested for doxorubicin and found to fit real data to an acceptable degree. Dose-response data from the NCI60 drug screen were used to illustrate the time dependency and demonstrate an adjustment correcting for it. The applicability of the workflow was illustrated by simulation and application on a doxorubicin growth inhibition screen. The simulations show that under the proposed mathematical model the suggested statistical workflow results in unbiased estimates of the time independent summary statistics. Variance estimates of the novel summary statistics are used to conclude that the doxorubicin screen covers a significant diverse range of responses ensuring it is useful for biological interpretations. Time independent summary statistics may aid the understanding of drugs' action mechanism on tumour cells and potentially renew previous drug sensitivity evaluation studies.
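    Step 2 of the workflow above, modelling a monotonically decreasing dose-response curve with isotonic regression, can be sketched as below; the toy viability values and the use of scikit-learn's IsotonicRegression are illustrative assumptions rather than the authors' implementation.

    ```python
    # Sketch of the isotonic-regression step: fit a monotonically decreasing
    # dose-response curve to (log-dose, relative growth) points. Toy data only.
    import numpy as np
    from sklearn.isotonic import IsotonicRegression

    log_dose = np.log10([0.001, 0.01, 0.1, 1.0, 10.0, 100.0])     # hypothetical doses (uM)
    growth = np.array([0.98, 0.95, 0.90, 0.55, 0.20, 0.18])        # relative growth vs control

    iso = IsotonicRegression(increasing=False, out_of_bounds="clip")
    fitted = iso.fit_transform(log_dose, growth)
    print("fitted curve:", np.round(fitted, 2))
    print("predicted growth at 0.5 uM:", round(float(iso.predict([np.log10(0.5)])[0]), 2))
    ```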

  15. Exposure time independent summary statistics for assessment of drug dependent cell line growth inhibition

    PubMed Central

    2014-01-01

    Background In vitro generated dose-response curves of human cancer cell lines are widely used to develop new therapeutics. The curves are summarised by simplified statistics that ignore the conventionally used dose-response curves’ dependency on drug exposure time and growth kinetics. This may lead to suboptimal exploitation of data and biased conclusions on the potential of the drug in question. Therefore we set out to improve the dose-response assessments by eliminating the impact of time dependency. Results First, a mathematical model for drug induced cell growth inhibition was formulated and used to derive novel dose-response curves and improved summary statistics that are independent of time under the proposed model. Next, a statistical analysis workflow for estimating the improved statistics was suggested consisting of 1) nonlinear regression models for estimation of cell counts and doubling times, 2) isotonic regression for modelling the suggested dose-response curves, and 3) resampling based method for assessing variation of the novel summary statistics. We document that conventionally used summary statistics for dose-response experiments depend on time so that fast growing cell lines compared to slowly growing ones are considered overly sensitive. The adequacy of the mathematical model is tested for doxorubicin and found to fit real data to an acceptable degree. Dose-response data from the NCI60 drug screen were used to illustrate the time dependency and demonstrate an adjustment correcting for it. The applicability of the workflow was illustrated by simulation and application on a doxorubicin growth inhibition screen. The simulations show that under the proposed mathematical model the suggested statistical workflow results in unbiased estimates of the time independent summary statistics. Variance estimates of the novel summary statistics are used to conclude that the doxorubicin screen covers a significant diverse range of responses ensuring it is useful for biological interpretations. Conclusion Time independent summary statistics may aid the understanding of drugs’ action mechanism on tumour cells and potentially renew previous drug sensitivity evaluation studies. PMID:24902483

  16. An alternative way to evaluate chemistry-transport model variability

    NASA Astrophysics Data System (ADS)

    Menut, Laurent; Mailler, Sylvain; Bessagnet, Bertrand; Siour, Guillaume; Colette, Augustin; Couvidat, Florian; Meleux, Frédérik

    2017-03-01

    A simple and complementary model evaluation technique for regional chemistry transport is discussed. The methodology is based on the concept that we can learn about model performance by comparing the simulation results with observational data available for time periods other than the period originally targeted. First, the statistical indicators selected in this study (spatial and temporal correlations) are computed for a given time period, using colocated observation and simulation data in time and space. Second, the same indicators are used to calculate scores for several other years while conserving the spatial locations and Julian days of the year. The difference between the results provides useful insights on the model capability to reproduce the observed day-to-day and spatial variability. In order to synthesize the large amount of results, a new indicator is proposed, designed to compare several error statistics between all the years of validation and to quantify whether the period and area being studied were well captured by the model for the correct reasons.
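    The core of this technique, recomputing the same colocated scores for non-target years while keeping the stations and Julian days fixed, reduces to a small loop; the synthetic arrays and the correlation-only scoring below are a simplified, hypothetical stand-in for the full set of indicators.

    ```python
    # Sketch: compare the temporal correlation between simulation and observations for
    # the target year against the same score recomputed with observations from other
    # years (same stations, same Julian days). Synthetic data for illustration.
    import numpy as np

    rng = np.random.default_rng(5)
    n_stations, n_days = 30, 365
    signal = rng.normal(0, 1, (n_stations, n_days))

    sim_target = signal + rng.normal(0, 0.5, signal.shape)           # simulation of target year
    obs = {year: (signal if year == 2014 else rng.normal(0, 1, signal.shape))
           + rng.normal(0, 0.5, signal.shape)
           for year in range(2010, 2016)}                             # 2014 is the target year

    def temporal_corr(sim, ob):
        """Mean per-station correlation of the daily time series."""
        return np.mean([np.corrcoef(sim[i], ob[i])[0, 1] for i in range(sim.shape[0])])

    for year, ob in obs.items():
        flag = "<- target year" if year == 2014 else ""
        print(year, round(temporal_corr(sim_target, ob), 2), flag)
    ```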

  17. Numerical and Qualitative Contrasts of Two Statistical Models ...

    EPA Pesticide Factsheets

    Two statistical approaches, weighted regression on time, discharge, and season and generalized additive models, have recently been used to evaluate water quality trends in estuaries. Both models have been used in similar contexts despite differences in statistical foundations and products. This study provided an empirical and qualitative comparison of both models using 29 years of data for two discrete time series of chlorophyll-a (chl-a) in the Patuxent River estuary. Empirical descriptions of each model were based on predictive performance against the observed data, ability to reproduce flow-normalized trends with simulated data, and comparisons of performance with validation datasets. Between-model differences were apparent but minor and both models had comparable abilities to remove flow effects from simulated time series. Both models similarly predicted observations for missing data with different characteristics. Trends from each model revealed distinct mainstem influences of the Chesapeake Bay with both models predicting a roughly 65% increase in chl-a over time in the lower estuary, whereas flow-normalized predictions for the upper estuary showed a more dynamic pattern, with a nearly 100% increase in chl-a in the last 10 years. Qualitative comparisons highlighted important differences in the statistical structure, available products, and characteristics of the data and desired analysis. This manuscript describes a quantitative comparison of two recently-

  18. A Decision Model for Evaluating Potential Change in Instructional Programs.

    ERIC Educational Resources Information Center

    Amor, J. P.; Dyer, J. S.

    A statistical model designed to assist elementary school principals in the process of selection educational areas which should receive additional emphasis is presented. For each educational area, the model produces an index number which represents the expected "value" per dollar spent on an instructional program appropriate for strengthening that…

  19. An NCME Instructional Module on Item-Fit Statistics for Item Response Theory Models

    ERIC Educational Resources Information Center

    Ames, Allison J.; Penfield, Randall D.

    2015-01-01

    Drawing valid inferences from item response theory (IRT) models is contingent upon a good fit of the data to the model. Violations of model-data fit have numerous consequences, limiting the usefulness and applicability of the model. This instructional module provides an overview of methods used for evaluating the fit of IRT models. Upon completing…

  20. Global Sensitivity Analysis of Environmental Systems via Multiple Indices based on Statistical Moments of Model Outputs

    NASA Astrophysics Data System (ADS)

    Guadagnini, A.; Riva, M.; Dell'Oca, A.

    2017-12-01

    We propose to ground sensitivity of uncertain parameters of environmental models on a set of indices based on the main (statistical) moments, i.e., mean, variance, skewness and kurtosis, of the probability density function (pdf) of a target model output. This enables us to perform Global Sensitivity Analysis (GSA) of a model in terms of multiple statistical moments and yields a quantification of the impact of model parameters on features driving the shape of the pdf of model output. Our GSA approach includes the possibility of being coupled with the construction of a reduced complexity model that allows approximating the full model response at a reduced computational cost. We demonstrate our approach through a variety of test cases. These include a commonly used analytical benchmark, a simplified model representing pumping in a coastal aquifer, a laboratory-scale tracer experiment, and the migration of fracturing fluid through a naturally fractured reservoir (source) to reach an overlying formation (target). Our strategy allows discriminating the relative importance of model parameters to the four statistical moments considered. We also provide an appraisal of the error associated with the evaluation of our sensitivity metrics by replacing the original system model through the selected surrogate model. Our results suggest that one might need to construct a surrogate model with increasing level of accuracy depending on the statistical moment considered in the GSA. The methodological framework we propose can assist the development of analysis techniques targeted to model calibration, design of experiment, uncertainty quantification and risk assessment.
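    A minimal Monte Carlo version of moment-based sensitivity indices, under the assumption that a parameter's influence is summarized by the shift it induces in the output's mean, variance, skewness, and kurtosis when that parameter is fixed (the exact estimators in the paper may differ), is sketched below for a toy model.

    ```python
    # Sketch of moment-based global sensitivity: for each parameter, compare the
    # conditional moments of the model output (parameter fixed at a few values) with
    # the unconditional moments. Toy model and estimator choices are illustrative.
    import numpy as np
    from scipy import stats

    def model(x1, x2, x3):                       # toy model in place of the environmental model
        return x1 ** 2 + 2 * x1 * x2 + 0.1 * np.exp(x3)

    rng = np.random.default_rng(6)
    n, n_fix = 20000, 15
    X = rng.uniform(-1, 1, (n, 3))
    y = model(*X.T)

    def moments(v):
        return np.array([v.mean(), v.var(), stats.skew(v), stats.kurtosis(v)])

    base = moments(y)
    for j, name in enumerate(["x1", "x2", "x3"]):
        shifts = []
        for xf in np.linspace(-1, 1, n_fix):     # condition on a grid of fixed values
            Xc = X.copy()
            Xc[:, j] = xf
            shifts.append(np.abs(moments(model(*Xc.T)) - base))
        # Average absolute shift of each moment, normalized by the unconditional value
        idx = np.mean(shifts, axis=0) / np.abs(base)
        print(name, "indices [mean, var, skew, kurt]:", np.round(idx, 2))
    ```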

  1. United States Census 2000 Population with Bridged Race Categories. Vital and Health Statistics. Data Evaluation and Methods Research.

    ERIC Educational Resources Information Center

    Ingram, Deborah D.; Parker, Jennifer D.; Schenker, Nathaniel; Weed, James A.; Hamilton, Brady; Arias, Elizabeth; Madans, Jennifer H.

    This report documents the National Center for Health Statistics' (NCHS) methods for bridging the Census 2000 multiple-race resident population to single-race categories and describing bridged race resident population estimates. Data came from the pooled 1997-2000 National Health Interview Surveys. The bridging models included demographic and…

  2. An Empirical Investigation of Methods for Assessing Item Fit for Mixed Format Tests

    ERIC Educational Resources Information Center

    Chon, Kyong Hee; Lee, Won-Chan; Ansley, Timothy N.

    2013-01-01

    Empirical information regarding performance of model-fit procedures has been a persistent need in measurement practice. Statistical procedures for evaluating item fit were applied to real test examples that consist of both dichotomously and polytomously scored items. The item fit statistics used in this study included PARSCALE's G[squared],…

  3. Evaluating Two Models of Collaborative Tests in an Online Introductory Statistics Course

    ERIC Educational Resources Information Center

    Björnsdóttir, Auðbjörg; Garfield, Joan; Everson, Michelle

    2015-01-01

    This study explored the use of two different types of collaborative tests in an online introductory statistics course. A study was designed and carried out to investigate three research questions: (1) What is the difference in students' learning between using consensus and non-consensus collaborative tests in the online environment?, (2) What is…

  4. Developing risk prediction models for kidney injury and assessing incremental value for novel biomarkers.

    PubMed

    Kerr, Kathleen F; Meisner, Allison; Thiessen-Philbrook, Heather; Coca, Steven G; Parikh, Chirag R

    2014-08-07

    The field of nephrology is actively involved in developing biomarkers and improving models for predicting patients' risks of AKI and CKD and their outcomes. However, some important aspects of evaluating biomarkers and risk models are not widely appreciated, and statistical methods are still evolving. This review describes some of the most important statistical concepts for this area of research and identifies common pitfalls. Particular attention is paid to metrics proposed within the last 5 years for quantifying the incremental predictive value of a new biomarker. Copyright © 2014 by the American Society of Nephrology.
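
    One standard way to quantify the incremental predictive value of a new biomarker is to compare the discrimination of a risk model with and without the marker. The sketch below is a generic illustration on simulated data, not the review's own analysis; the variable names and coefficients are assumptions.

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_auc_score
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(0)
      n = 2000
      age = rng.normal(60, 10, n)
      creatinine = rng.normal(1.0, 0.3, n)
      biomarker = rng.normal(0, 1, n)
      logit = -8 + 0.08 * age + 1.5 * creatinine + 0.9 * biomarker
      y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

      X_clin = np.column_stack([age, creatinine])
      X_full = np.column_stack([age, creatinine, biomarker])
      idx_train, idx_test = train_test_split(np.arange(n), test_size=0.3, random_state=1)

      for name, X in [("clinical only", X_clin), ("clinical + biomarker", X_full)]:
          model = LogisticRegression(max_iter=1000).fit(X[idx_train], y[idx_train])
          auc = roc_auc_score(y[idx_test], model.predict_proba(X[idx_test])[:, 1])
          print(f"{name}: AUC = {auc:.3f}")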

  5. Economic Impacts of Infrastructure Damages on Industrial Sector

    NASA Astrophysics Data System (ADS)

    Kajitani, Yoshio

    This paper proposes a basic model for evaluating economic impacts on industrial sectors when multiple infrastructures are simultaneously damaged during an earthquake disaster. Focusing on the economic data available at the smallest spatial scale in Japan (small area statistics), an economic loss estimation model based on these statistics and its applicability are investigated. In detail, a loss estimation framework that combines survey results on firms' activities under electricity, water, and gas disruptions with route choice models from transportation engineering is applied to the case of the 2004 Mid-Niigata Earthquake.

  6. Specialized data analysis of SSME and advanced propulsion system vibration measurements

    NASA Technical Reports Server (NTRS)

    Coffin, Thomas; Swanson, Wayne L.; Jong, Yen-Yi

    1993-01-01

    The basic objectives of this contract were to perform detailed analysis and evaluation of dynamic data obtained during Space Shuttle Main Engine (SSME) test and flight operations, including analytical/statistical assessment of component dynamic performance, and to continue the development and implementation of analytical/statistical models to effectively define nominal component dynamic characteristics, detect anomalous behavior, and assess machinery operational conditions. This study was to provide timely assessment of engine component operational status, identify probable causes of malfunction, and define feasible engineering solutions. The work was performed under three broad tasks: (1) Analysis, Evaluation, and Documentation of SSME Dynamic Test Results; (2) Data Base and Analytical Model Development and Application; and (3) Development and Application of Vibration Signature Analysis Techniques.

  7. Using statistical and artificial neural network models to forecast potentiometric levels at a deep well in South Texas

    NASA Astrophysics Data System (ADS)

    Uddameri, V.

    2007-01-01

    Reliable forecasts of monthly and quarterly fluctuations in groundwater levels are necessary for short- and medium-term planning and management of aquifers to ensure proper service of seasonal demands within a region. Development of physically based transient mathematical models at this time scale poses considerable challenges due to lack of suitable data and other uncertainties. Artificial neural networks (ANN) possess flexible mathematical structures and are capable of mapping highly nonlinear relationships. Feed-forward neural network models were constructed and trained using the back-propagation algorithm to forecast monthly and quarterly time-series water levels at a well that taps into the deeper Evangeline formation of the Gulf Coast aquifer in Victoria, TX. Unlike unconfined formations, no causal relationships exist between water levels and hydro-meteorological variables measured near the vicinity of the well. As such, an endogenous forecasting model using dummy variables to capture short-term seasonal fluctuations and longer-term (decadal) trends was constructed. The root mean square error, mean absolute deviation, and correlation coefficient (R) were 1.40 m, 0.33 m, and 0.77, respectively, for an evaluation dataset of quarterly measurements, and 1.17 m, 0.46 m, and 0.88 for a monthly dataset not used to train or test the model. These statistics were better for the ANN model than for models developed using statistical regression techniques.
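
    The three evaluation statistics reported above (RMSE, mean absolute deviation, and correlation coefficient) are straightforward to compute for any forecast series. The arrays below are placeholders for illustration, not the study's water-level data.

      import numpy as np

      def forecast_skill(observed, predicted):
          err = predicted - observed
          rmse = np.sqrt(np.mean(err ** 2))       # root mean square error
          mad = np.mean(np.abs(err))              # mean absolute deviation
          r = np.corrcoef(observed, predicted)[0, 1]
          return rmse, mad, r

      obs = np.array([12.1, 12.4, 12.9, 13.0, 12.7, 12.3])   # water levels, m
      pred = np.array([12.0, 12.6, 12.7, 13.2, 12.9, 12.2])
      print("RMSE = %.2f m, MAD = %.2f m, R = %.2f" % forecast_skill(obs, pred))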

  8. Evaluation of Surface Flux Parameterizations with Long-Term ARM Observations

    DOE PAGES

    Liu, Gang; Liu, Yangang; Endo, Satoshi

    2013-02-01

    Surface momentum, sensible heat, and latent heat fluxes are critical for atmospheric processes such as clouds and precipitation, and are parameterized in a variety of models ranging from cloud-resolving models to large-scale weather and climate models. However, direct evaluation of the parameterization schemes for these surface fluxes is rare due to limited observations. This study takes advantage of the long-term observations of surface fluxes collected at the Southern Great Plains site by the Department of Energy Atmospheric Radiation Measurement program to evaluate the six surface flux parameterization schemes commonly used in the Weather Research and Forecasting (WRF) model and three U.S. general circulation models (GCMs). The unprecedented 7-yr-long measurements by the eddy correlation (EC) and energy balance Bowen ratio (EBBR) methods permit statistical evaluation of all six parameterizations under a variety of stability conditions, diurnal cycles, and seasonal variations. The statistical analyses show that the momentum flux parameterization agrees best with the EC observations, followed by latent heat flux, sensible heat flux, and evaporation ratio/Bowen ratio. The overall performance of the parameterizations depends on atmospheric stability, being best under neutral stratification and deteriorating toward both more stable and more unstable conditions. Further diagnostic analysis reveals that in addition to the parameterization schemes themselves, the discrepancies between observed and parameterized sensible and latent heat fluxes may stem from inadequate use of input variables such as surface temperature, moisture availability, and roughness length. The results demonstrate the need for improving the land surface models and measurements of surface properties, which would permit the evaluation of full land surface models.

  9. Statistical modeling for visualization evaluation through data fusion.

    PubMed

    Chen, Xiaoyu; Jin, Ran

    2017-11-01

    There is high demand for data visualization that provides insights to users in various applications. However, a consistent, online visualization evaluation method to quantify mental workload or user preference is lacking, which leads to an inefficient visualization and user interface design process. Recently, advances in interactive and sensing technologies have made electroencephalogram (EEG) signals, eye movements, and visualization logs available for user-centered evaluation. This paper proposes a data fusion model and an application procedure for quantitative and online visualization evaluation. Fifteen participants joined the study, which was based on three different visualization designs. The results provide a regularized regression model that can accurately predict the user's evaluation of task complexity, and indicate the significance of all three types of sensing data for visualization evaluation. The model can be widely applied to data visualization evaluation and to other user-centered design evaluations and data analyses in human factors and ergonomics. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. Summary of hydrologic modeling for the Delaware River Basin using the Water Availability Tool for Environmental Resources (WATER)

    USGS Publications Warehouse

    Williamson, Tanja N.; Lant, Jeremiah G.; Claggett, Peter; Nystrom, Elizabeth A.; Milly, Paul C.D.; Nelson, Hugh L.; Hoffman, Scott A.; Colarullo, Susan J.; Fischer, Jeffrey M.

    2015-11-18

    The Water Availability Tool for Environmental Resources (WATER) is a decision support system for the nontidal part of the Delaware River Basin that provides a consistent and objective method of simulating streamflow under historical, forecasted, and managed conditions. In order to quantify the uncertainty associated with these simulations, however, streamflow and the associated hydroclimatic variables of potential evapotranspiration, actual evapotranspiration, and snow accumulation and snowmelt must be simulated and compared to long-term, daily observations from sites. This report details model development and optimization, statistical evaluation of simulations for 57 basins ranging from 2 to 930 km2 and 11.0 to 99.5 percent forested cover, and how this statistical evaluation of daily streamflow relates to simulating environmental changes and management decisions that are best examined at monthly time steps normalized over multiple decades. The decision support system provides a database of historical spatial and climatic data for simulating streamflow for 2001–11, in addition to land-cover and general circulation model forecasts that focus on 2030 and 2060. WATER integrates geospatial sampling of landscape characteristics, including topographic and soil properties, with a regionally calibrated hillslope-hydrology model, an impervious-surface model, and hydroclimatic models that were parameterized by using three hydrologic response units: forested, agricultural, and developed land cover. This integration enables the regional hydrologic modeling approach used in WATER without requiring site-specific optimization or those stationary conditions inferred when using a statistical model.

  11. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
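
    The core modelling step described above is a multivariate logistic regression of a binary outcome (debris flow / no debris flow) on basin predictors. The sketch below is a generic illustration on simulated basins; the predictor names, coefficients, and data are assumptions, not the study's dataset.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(7)
      n = 399
      pct_steep = rng.uniform(0, 80, n)        # % of basin with slope > 30%
      pct_burned = rng.uniform(0, 100, n)      # % burned at medium/high severity
      storm_intensity = rng.uniform(1, 40, n)  # average storm intensity, mm/h
      logit = -6 + 0.04 * pct_steep + 0.03 * pct_burned + 0.07 * storm_intensity
      debris_flow = rng.binomial(1, 1 / (1 + np.exp(-logit)))

      X = sm.add_constant(np.column_stack([pct_steep, pct_burned, storm_intensity]))
      fit = sm.Logit(debris_flow, X).fit(disp=0)
      print(fit.params)           # fitted coefficients
      print(fit.predict(X[:5]))   # predicted debris-flow probabilities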

  12. Modeling Cross-Situational Word–Referent Learning: Prior Questions

    PubMed Central

    Yu, Chen; Smith, Linda B.

    2013-01-01

    Both adults and young children possess powerful statistical computation capabilities—they can infer the referent of a word from highly ambiguous contexts involving many words and many referents by aggregating cross-situational statistical information across contexts. This ability has been explained by models of hypothesis testing and by models of associative learning. This article describes a series of simulation studies and analyses designed to understand the different learning mechanisms posited by the 2 classes of models and their relation to each other. Variants of a hypothesis-testing model and a simple or dumb associative mechanism were examined under different specifications of information selection, computation, and decision. Critically, these 3 components of the models interact in complex ways. The models illustrate a fundamental tradeoff between amount of data input and powerful computations: With the selection of more information, dumb associative models can mimic the powerful learning that is accomplished by hypothesis-testing models with fewer data. However, because of the interactions among the component parts of the models, the associative model can mimic various hypothesis-testing models, producing the same learning patterns but through different internal components. The simulations argue for the importance of a compositional approach to human statistical learning: the experimental decomposition of the processes that contribute to statistical learning in human learners and models with the internal components that can be evaluated independently and together. PMID:22229490

  13. Integrated Model for E-Learning Acceptance

    NASA Astrophysics Data System (ADS)

    Ramadiani; Rodziah, A.; Hasan, S. M.; Rusli, A.; Noraini, C.

    2016-01-01

    E-learning will not work if the system is not used in accordance with user needs, and the user interface is critical to encouraging use of the application. Many theories discuss user interface usability evaluation and technology acceptance separately; correlating the two could enhance the e-learning process. Therefore, an evaluation model for e-learning interface acceptance is worth investigating. The aim of this study is to propose an integrated evaluation model for e-learning user interface acceptance. The model combines several strands of e-learning interface measurement, such as user learning style, usability evaluation, and user benefit. These were formulated as structured questionnaires administered to 125 English Language School (ELS) students. The statistical analysis used structural equation modeling in LISREL v8.80 and MANOVA.

  14. Statistical Model of Dynamic Markers of the Alzheimer's Pathological Cascade.

    PubMed

    Balsis, Steve; Geraci, Lisa; Benge, Jared; Lowe, Deborah A; Choudhury, Tabina K; Tirso, Robert; Doody, Rachelle S

    2018-05-05

    Alzheimer's disease (AD) is a progressive disease reflected in markers across assessment modalities, including neuroimaging, cognitive testing, and evaluation of adaptive function. Identifying a single continuum of decline across assessment modalities in a single sample is statistically challenging because of the multivariate nature of the data. To address this challenge, we implemented advanced statistical analyses designed specifically to model complex data across a single continuum. We analyzed data from the Alzheimer's Disease Neuroimaging Initiative (ADNI; N = 1,056), focusing on indicators from the assessments of magnetic resonance imaging (MRI) volume, fluorodeoxyglucose positron emission tomography (FDG-PET) metabolic activity, cognitive performance, and adaptive function. Item response theory was used to identify the continuum of decline. Then, through a process of statistical scaling, indicators across all modalities were linked to that continuum and analyzed. Findings revealed that measures of MRI volume, FDG-PET metabolic activity, and adaptive function added measurement precision beyond that provided by cognitive measures, particularly in the relatively mild range of disease severity. More specifically, MRI volume and FDG-PET metabolic activity become compromised in the very mild range of severity, followed by cognitive performance and finally adaptive function. Our statistically derived models of the AD pathological cascade are consistent with existing theoretical models.

  15. Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison.

    PubMed

    Dai, Qi; Yang, Yanchun; Wang, Tianming

    2008-10-15

    Many proposed statistical measures can efficiently compare biological sequences to further infer their structures, functions, and evolutionary information. They are related in spirit because all approaches to sequence comparison try to use information on the k-word distributions, the Markov model, or both. Motivated by adding k-word distributions directly to the Markov model, we investigated two novel statistical measures for sequence comparison, called wre.k.r and S2.k.r. The proposed measures were tested by similarity search, evaluation on functionally related regulatory sequences, and phylogenetic analysis, which offers a systematic and quantitative experimental assessment. Moreover, we compared our results with those of alignment-based and alignment-free methods. We grouped our experiments into two sets. The first, performed via ROC (receiver operating characteristic) analysis, assesses the intrinsic ability of our statistical measures to search for similar sequences in a database and to discriminate functionally related regulatory sequences from unrelated sequences. The second assesses how well our statistical measures perform in phylogenetic analysis. The experimental assessment demonstrates that our similarity measures, which incorporate k-word distributions into the Markov model, are more efficient.

  16. Correcting evaluation bias of relational classifiers with network cross validation

    DOE PAGES

    Neville, Jennifer; Gallagher, Brian; Eliassi-Rad, Tina; ...

    2011-01-04

    Recently, a number of modeling techniques have been developed for data mining and machine learning in relational and network domains where the instances are not independent and identically distributed (i.i.d.). These methods specifically exploit the statistical dependencies among instances in order to improve classification accuracy. However, there has been little focus on how these same dependencies affect our ability to draw accurate conclusions about the performance of the models. More specifically, the complex link structure and attribute dependencies in relational data violate the assumptions of many conventional statistical tests and make it difficult to use these tests to assess the models in an unbiased manner. In this work, we examine the task of within-network classification and the question of whether two algorithms will learn models that will result in significantly different levels of performance. We show that the commonly used form of evaluation (paired t-test on overlapping network samples) can result in an unacceptable level of Type I error. Furthermore, we show that Type I error increases as (1) the correlation among instances increases and (2) the size of the evaluation set increases (i.e., the proportion of labeled nodes in the network decreases). Lastly, we propose a method for network cross-validation that, combined with paired t-tests, produces more acceptable levels of Type I error while still providing reasonable levels of statistical power (i.e., 1–Type II error).
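
    For reference, the paired comparison at issue is a standard paired t-test over per-evaluation-set accuracies of two classifiers, as in the sketch below with made-up accuracy values. The paper's point is that the sampling of those evaluation sets (overlapping network samples versus network cross-validation) is what drives the Type I error of this test.

      import numpy as np
      from scipy import stats

      acc_model_a = np.array([0.81, 0.79, 0.83, 0.80, 0.82])  # per-fold accuracy
      acc_model_b = np.array([0.78, 0.80, 0.79, 0.77, 0.81])
      t_stat, p_value = stats.ttest_rel(acc_model_a, acc_model_b)
      print(f"paired t = {t_stat:.2f}, p = {p_value:.3f}")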

  17. Strengthen forensic entomology in court--the need for data exploration and the validation of a generalised additive mixed model.

    PubMed

    Baqué, Michèle; Amendt, Jens

    2013-01-01

    Developmental data of juvenile blow flies (Diptera: Calliphoridae) are typically used to calculate the age of immature stages found on or around a corpse and thus to estimate a minimum post-mortem interval (PMI(min)). However, many of those data sets do not take into account that immature blow flies grow in a non-linear fashion. Linear models do not provide sufficiently reliable age estimates and may even lead to an erroneous determination of the PMI(min). In line with the Daubert standard and the need for improvements in forensic science, newer statistical tools such as smoothing methods and mixed models allow the modelling of non-linear relationships and expand the field of statistical analyses. The present study introduces the background and application of these statistical techniques by analysing a model which describes the development of the forensically important blow fly Calliphora vicina at different temperatures. The comparison of three statistical methods (linear regression, generalised additive modelling, and generalised additive mixed modelling) clearly demonstrates that only the latter provides regression parameters that reflect the data adequately. We focus explicitly on both the exploration of the data--to assure their quality and to show the importance of checking them carefully prior to conducting the statistical tests--and the validation of the resulting models. Hence, we present a common method for evaluating and testing forensic entomological data sets by using, for the first time, generalised additive mixed models.

  18. Thermodynamic Model of Spatial Memory

    NASA Astrophysics Data System (ADS)

    Kaufman, Miron; Allen, P.

    1998-03-01

    We develop and test a thermodynamic model of spatial memory. Our model is an application of statistical thermodynamics to cognitive science. It is related to applications of the statistical mechanics framework in parallel distributed processes research. Our macroscopic model allows us to evaluate an entropy associated with spatial memory tasks. We find that older adults exhibit higher levels of entropy than younger adults. Thurstone's Law of Categorical Judgment, according to which the discriminal processes along the psychological continuum produced by presentations of a single stimulus are normally distributed, is explained by using a Hooke spring model of spatial memory. We have also analyzed a nonlinear modification of the ideal spring model of spatial memory. This work is supported by NIH/NIA grant AG09282-06.

  19. Human Thermal Model Evaluation Using the JSC Human Thermal Database

    NASA Technical Reports Server (NTRS)

    Bue, Grant; Makinen, Janice; Cognata, Thomas

    2012-01-01

    Human thermal modeling has considerable long-term utility to human space flight. Such models provide a tool to predict crew survivability in support of vehicle design and to evaluate crew response in untested space environments. It is to the benefit of any such model not only to collect relevant experimental data to correlate it against, but also to maintain an experimental standard or benchmark for future development in a readily and rapidly searchable and software accessible format. The Human thermal database project is intended to do just that: to collect relevant data from literature and experimentation and to store the data in a database structure for immediate and future use as a benchmark to judge human thermal models against, in identifying model strengths and weaknesses, to support model development and improve correlation, and to statistically quantify a model's predictive quality. The human thermal database developed at the Johnson Space Center (JSC) is intended to evaluate a set of widely used human thermal models. This set includes the Wissler human thermal model, a model that has been widely used to predict the human thermoregulatory response to a variety of cold and hot environments. These models are statistically compared to the current database, which contains experiments of human subjects primarily in air from a literature survey ranging between 1953 and 2004 and from a suited experiment recently performed by the authors, for a quantitative study of relative strength and predictive quality of the models.

  20. Models of Pilot Behavior and Their Use to Evaluate the State of Pilot Training

    NASA Astrophysics Data System (ADS)

    Jirgl, Miroslav; Jalovecky, Rudolf; Bradac, Zdenek

    2016-07-01

    This article discusses the possibilities of obtaining new information related to human behavior, namely the changes or progressive development of pilots' abilities during training. The main assumption is that a pilot's ability can be evaluated based on a corresponding behavioral model whose parameters are estimated using mathematical identification procedures. The mean values of the identified parameters are obtained via statistical methods. These parameters are then monitored and their changes evaluated. In this context, the paper introduces and examines relevant mathematical models of human (pilot) behavior, the pilot-aircraft interaction, and an example of the mathematical analysis.

  1. Referenceless perceptual fog density prediction model

    NASA Astrophysics Data System (ADS)

    Choi, Lark Kwon; You, Jaehee; Bovik, Alan C.

    2014-02-01

    We propose a perceptual fog density prediction model based on natural scene statistics (NSS) and "fog aware" statistical features, which can predict the visibility in a foggy scene from a single image without reference to a corresponding fogless image, without side geographical camera information, without training on human-rated judgments, and without dependency on salient objects such as lane markings or traffic signs. The proposed fog density predictor only makes use of measurable deviations from statistical regularities observed in natural foggy and fog-free images. A fog aware collection of statistical features is derived from a corpus of foggy and fog-free images by using a space domain NSS model and observed characteristics of foggy images such as low contrast, faint color, and shifted intensity. The proposed model not only predicts perceptual fog density for the entire image but also provides a local fog density index for each patch. The predicted fog density of the model correlates well with the measured visibility in a foggy scene as measured by judgments taken in a human subjective study on a large foggy image database. As one application, the proposed model accurately evaluates the performance of defog algorithms designed to enhance the visibility of foggy images.

  2. Cure Models as a Useful Statistical Tool for Analyzing Survival

    PubMed Central

    Othus, Megan; Barlogie, Bart; LeBlanc, Michael L.; Crowley, John J.

    2013-01-01

    Cure models are a popular topic within statistical literature but are not as widely known in the clinical literature. Many patients with cancer can be long-term survivors of their disease, and cure models can be a useful tool to analyze and describe cancer survival data. The goal of this article is to review what a cure model is, explain when cure models can be used, and use cure models to describe multiple myeloma survival trends. Multiple myeloma is generally considered an incurable disease, and this article shows that by using cure models, rather than the standard Cox proportional hazards model, we can evaluate whether there is evidence that therapies at the University of Arkansas for Medical Sciences induce a proportion of patients to be long-term survivors. PMID:22675175

  3. Forecast and virtual weather driven plant disease risk modeling system

    USDA-ARS?s Scientific Manuscript database

    We describe a system in use and development that leverages public weather station data, several spatialized weather forecast types, leaf wetness estimation, generic plant disease models, and online statistical evaluation. Convergent technological developments in all these areas allow, with funding f...

  4. Experimental design matters for statistical analysis: how to handle blocking.

    PubMed

    Jensen, Signe M; Schaarschmidt, Frank; Onofri, Andrea; Ritz, Christian

    2018-03-01

    Nowadays, evaluation of the effects of pesticides often relies on experimental designs that involve multiple concentrations of the pesticide of interest or multiple pesticides at specific comparable concentrations and, possibly, secondary factors of interest. Unfortunately, the experimental design is often more or less neglected when analysing data. Two data examples were analysed using different modelling strategies. First, in a randomized complete block design, mean heights of maize treated with a herbicide and one of several adjuvants were compared. Second, translocation of an insecticide applied to maize as a seed treatment was evaluated using incomplete data from an unbalanced design with several layers of hierarchical sampling. Extensive simulations were carried out to further substantiate the effects of different modelling strategies. It was shown that results from suboptimal approaches (two-sample t-tests and ordinary ANOVA assuming independent observations) may be both quantitatively and qualitatively different from the results obtained using an appropriate linear mixed model. The simulations demonstrated that the different approaches may lead to differences in coverage percentages of confidence intervals and type 1 error rates, confirming that misleading conclusions can easily happen when an inappropriate statistical approach is chosen. To ensure that experimental data are summarized appropriately, avoiding misleading conclusions, the experimental design should duly be reflected in the choice of statistical approaches and models. We recommend that author guidelines should explicitly point out that authors need to indicate how the statistical analysis reflects the experimental design. © 2017 Society of Chemical Industry. © 2017 Society of Chemical Industry.
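
    The recommended approach for the first example, a randomized complete block design, is a linear mixed model with treatment as a fixed effect and block as a random effect. The sketch below is a generic statsmodels illustration on simulated plots; the column names, treatment effects, and data are assumptions, not the study's maize data.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(3)
      blocks = np.repeat(np.arange(6), 4)                      # 6 blocks x 4 plots
      treatments = np.tile(["control", "A", "B", "C"], 6)
      effect = {"control": 0.0, "A": 4.0, "B": 6.0, "C": 5.0}
      block_effect = rng.normal(0, 3, 6)
      height = 100 + np.array([effect[t] for t in treatments]) \
                   + block_effect[blocks] + rng.normal(0, 2, 24)
      df = pd.DataFrame({"height": height, "treatment": treatments, "block": blocks})

      # Random intercept per block, rather than treating all 24 plots as independent.
      mixed = smf.mixedlm("height ~ treatment", df, groups=df["block"]).fit()
      print(mixed.summary())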

  5. Estimation of Mouse Organ Locations Through Registration of a Statistical Mouse Atlas With Micro-CT Images

    PubMed Central

    Stout, David B.; Chatziioannou, Arion F.

    2012-01-01

    Micro-CT is widely used in preclinical studies of small animals. Due to the low soft-tissue contrast in typical studies, segmentation of soft tissue organs from noncontrast enhanced micro-CT images is a challenging problem. Here, we propose an atlas-based approach for estimating the major organs in mouse micro-CT images. A statistical atlas of major trunk organs was constructed based on 45 training subjects. The statistical shape model technique was used to include inter-subject anatomical variations. The shape correlations between different organs were described using a conditional Gaussian model. For registration, first the high-contrast organs in micro-CT images were registered by fitting the statistical shape model, while the low-contrast organs were subsequently estimated from the high-contrast organs using the conditional Gaussian model. The registration accuracy was validated based on 23 noncontrast-enhanced and 45 contrast-enhanced micro-CT images. Three different accuracy metrics (Dice coefficient, organ volume recovery coefficient, and surface distance) were used for evaluation. The Dice coefficients vary from 0.45 ± 0.18 for the spleen to 0.90 ± 0.02 for the lungs, the volume recovery coefficients vary from for the liver to 1.30 ± 0.75 for the spleen, the surface distances vary from 0.18 ± 0.01 mm for the lungs to 0.72 ± 0.42 mm for the spleen. The registration accuracy of the statistical atlas was compared with two publicly available single-subject mouse atlases, i.e., the MOBY phantom and the DIGIMOUSE atlas, and the results proved that the statistical atlas is more accurate than the single atlases. To evaluate the influence of the training subject size, different numbers of training subjects were used for atlas construction and registration. The results showed an improvement of the registration accuracy when more training subjects were used for the atlas construction. The statistical atlas-based registration was also compared with the thin-plate spline based deformable registration, commonly used in mouse atlas registration. The results revealed that the statistical atlas has the advantage of improving the estimation of low-contrast organs. PMID:21859613
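
    The conditional Gaussian step described above reduces to the standard conditional mean of a joint Gaussian: low-contrast organ parameters are predicted from the registered high-contrast ones. The sketch below illustrates that formula on made-up training data; the dimensions and the split into "high-contrast" and "low-contrast" columns are assumptions, not the atlas itself.

      import numpy as np

      rng = np.random.default_rng(1)
      # Training shape parameters: columns 0-2 "high-contrast", columns 3-4 "low-contrast".
      train = rng.normal(size=(45, 5)) @ rng.normal(size=(5, 5))  # correlated features
      mu = train.mean(axis=0)
      cov = np.cov(train, rowvar=False)
      hi, lo = slice(0, 3), slice(3, 5)

      def estimate_low_contrast(x_high):
          # E[x_lo | x_hi] = mu_lo + Cov_lo,hi Cov_hi,hi^-1 (x_hi - mu_hi)
          gain = cov[lo, hi] @ np.linalg.inv(cov[hi, hi])
          return mu[lo] + gain @ (x_high - mu[hi])

      print(estimate_low_contrast(train[0, hi]))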

  6. Ecological Momentary Assessments and Automated Time Series Analysis to Promote Tailored Health Care: A Proof-of-Principle Study.

    PubMed

    van der Krieke, Lian; Emerencia, Ando C; Bos, Elisabeth H; Rosmalen, Judith Gm; Riese, Harriëtte; Aiello, Marco; Sytema, Sjoerd; de Jonge, Peter

    2015-08-07

    Health promotion can be tailored by combining ecological momentary assessments (EMA) with time series analysis. This combined method allows for studying the temporal order of dynamic relationships among variables, which may provide concrete indications for intervention. However, application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required. This study aims to show how this limitation can be overcome by introducing automated vector autoregressive modeling (VAR) of EMA data and to evaluate its feasibility through comparisons with results of previously published manual analyses. We developed a Web-based open source application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. The statistical technique we used was VAR. AutoVAR tests and evaluates all possible VAR models within a given combinatorial search space and summarizes their results, thereby replacing the researcher's tasks of conducting the analysis, making an informed selection of models, and choosing the best model. We compared the output of AutoVAR to the output of a previously published manual analysis (n=4). An illustrative example consisting of 4 analyses was provided. Compared to the manual output, the AutoVAR output presents similar model characteristics and statistical results in terms of the Akaike information criterion, the Bayesian information criterion, and the test statistic of the Granger causality test. Results suggest that automated analysis and interpretation of time series is feasible. Compared to a manual procedure, the automated procedure is more robust and can save days of time. These findings may pave the way for using time series analysis for health promotion on a larger scale. AutoVAR was evaluated using the results of a previously conducted manual analysis. Analysis of additional datasets is needed in order to validate and refine the application for general use.
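
    The core automation idea (fit VAR models over a search space and let an information criterion choose) can be sketched with generic statsmodels usage on simulated EMA-like series; this is not the AutoVAR code, and the variable names and lag search space are assumptions.

      import numpy as np
      import pandas as pd
      from statsmodels.tsa.api import VAR

      rng = np.random.default_rng(0)
      n = 90                                   # e.g., 90 daily EMA measurements
      mood = np.zeros(n)
      activity = np.zeros(n)
      for t in range(1, n):                    # simple cross-lagged process
          mood[t] = 0.5 * mood[t - 1] + 0.3 * activity[t - 1] + rng.normal(0, 1)
          activity[t] = 0.4 * activity[t - 1] + rng.normal(0, 1)
      data = pd.DataFrame({"mood": mood, "activity": activity})

      results = VAR(data).fit(maxlags=7, ic="aic")   # AIC picks the lag order
      print(results.k_ar, results.aic, results.bic)
      print(results.test_causality("mood", ["activity"]).summary())  # Granger test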

  7. Ecological Momentary Assessments and Automated Time Series Analysis to Promote Tailored Health Care: A Proof-of-Principle Study

    PubMed Central

    Emerencia, Ando C; Bos, Elisabeth H; Rosmalen, Judith GM; Riese, Harriëtte; Aiello, Marco; Sytema, Sjoerd; de Jonge, Peter

    2015-01-01

    Background Health promotion can be tailored by combining ecological momentary assessments (EMA) with time series analysis. This combined method allows for studying the temporal order of dynamic relationships among variables, which may provide concrete indications for intervention. However, application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required. Objective This study aims to show how this limitation can be overcome by introducing automated vector autoregressive modeling (VAR) of EMA data and to evaluate its feasibility through comparisons with results of previously published manual analyses. Methods We developed a Web-based open source application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. The statistical technique we used was VAR. AutoVAR tests and evaluates all possible VAR models within a given combinatorial search space and summarizes their results, thereby replacing the researcher’s tasks of conducting the analysis, making an informed selection of models, and choosing the best model. We compared the output of AutoVAR to the output of a previously published manual analysis (n=4). Results An illustrative example consisting of 4 analyses was provided. Compared to the manual output, the AutoVAR output presents similar model characteristics and statistical results in terms of the Akaike information criterion, the Bayesian information criterion, and the test statistic of the Granger causality test. Conclusions Results suggest that automated analysis and interpretation of time series is feasible. Compared to a manual procedure, the automated procedure is more robust and can save days of time. These findings may pave the way for using time series analysis for health promotion on a larger scale. AutoVAR was evaluated using the results of a previously conducted manual analysis. Analysis of additional datasets is needed in order to validate and refine the application for general use. PMID:26254160

  8. A model for indexing medical documents combining statistical and symbolic knowledge.

    PubMed

    Avillach, Paul; Joubert, Michel; Fieschi, Marius

    2007-10-11

    To develop and evaluate an information processing method based on terminologies, in order to index medical documents in any given documentary context. We designed a model using both symbolic general knowledge extracted from the Unified Medical Language System (UMLS) and statistical knowledge extracted from a domain of application. Using statistical knowledge allowed us to contextualize the general knowledge for every particular situation. For each document studied, the extracted terms are ranked to highlight the most significant ones. The model was tested on a set of 17,079 French standardized discharge summaries (SDSs). The most important ICD-10 term of each SDS was ranked 1st or 2nd by the method in nearly 90% of the cases. The use of several terminologies leads to more precise indexing. The improvement in the model's performance achieved by using semantic relationships is encouraging.

  9. A comparison of different statistical methods analyzing hypoglycemia data using bootstrap simulations.

    PubMed

    Jiang, Honghua; Ni, Xiao; Huster, William; Heilmann, Cory

    2015-01-01

    Hypoglycemia has long been recognized as a major barrier to achieving normoglycemia with intensive diabetic therapies. It is a common safety concern for diabetes patients. Therefore, it is important to apply appropriate statistical methods when analyzing hypoglycemia data. Here, we carried out bootstrap simulations to investigate the performance of the four commonly used statistical models (Poisson, negative binomial, analysis of covariance [ANCOVA], and rank ANCOVA) based on data from a diabetes clinical trial. The zero-inflated Poisson (ZIP) model and the zero-inflated negative binomial (ZINB) model were also evaluated. Simulation results showed that the Poisson model inflated type I error, while the negative binomial model was overly conservative. However, after adjusting for dispersion, both Poisson and negative binomial models yielded slightly inflated type I errors, which were close to the nominal level, and reasonable power. Reasonable control of type I error was associated with the ANCOVA model. The rank ANCOVA model was associated with the greatest power and with reasonable control of type I error. Inflated type I error was observed with ZIP and ZINB models.
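
    Two of the count models compared above (Poisson and negative binomial) can be fit with generic statsmodels GLM calls, as in the sketch below on simulated, overdispersed event counts; the treatment coding, dispersion parameter, and data are assumptions, not the trial data.

      import numpy as np
      import pandas as pd
      import statsmodels.api as sm
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(11)
      n = 300
      treatment = rng.integers(0, 2, n)
      # Overdispersed counts: a gamma-mixed Poisson, i.e., negative binomial data.
      mu = np.exp(0.8 + 0.3 * treatment) * rng.gamma(shape=1.5, scale=1 / 1.5, size=n)
      events = rng.poisson(mu)
      df = pd.DataFrame({"events": events, "treatment": treatment})

      poisson_fit = smf.glm("events ~ treatment", df, family=sm.families.Poisson()).fit()
      nb_fit = smf.glm("events ~ treatment", df,
                       family=sm.families.NegativeBinomial(alpha=1.0)).fit()
      for name, fit in [("Poisson", poisson_fit), ("Negative binomial", nb_fit)]:
          print(name, fit.params["treatment"], fit.bse["treatment"])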

  10. A Model for Post Hoc Evaluation.

    ERIC Educational Resources Information Center

    Theimer, William C., Jr.

    Often a research department in a school system is called on to make an after the fact evaluation of a program or project. Although the department is operating under a handicap, it can still provide some data useful for evaluative purposes. It is suggested that all the classical methods of descriptive statistics be brought into play. The use of…

  11. Quantifying networks complexity from information geometry viewpoint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Felice, Domenico, E-mail: domenico.felice@unicam.it; Mancini, Stefano; INFN-Sezione di Perugia, Via A. Pascoli, I-06123 Perugia

    We consider a Gaussian statistical model whose parameter space is given by the variances of random variables. Underlying this model we identify networks by interpreting random variables as sitting on vertices and their correlations as weighted edges among vertices. We then associate to the parameter space a statistical manifold endowed with a Riemannian metric structure (that of Fisher-Rao). Going on, in analogy with the microcanonical definition of entropy in Statistical Mechanics, we introduce an entropic measure of networks complexity. We prove that it is invariant under networks isomorphism. Above all, considering networks as simplicial complexes, we evaluate this entropy on simplexes and find that it monotonically increases with their dimension.

  12. Statistical properties of a cloud ensemble - A numerical study

    NASA Technical Reports Server (NTRS)

    Tao, Wei-Kuo; Simpson, Joanne; Soong, Su-Tzai

    1987-01-01

    The statistical properties of cloud ensembles under a specified large-scale environment, such as mass flux by cloud drafts and vertical velocity as well as the condensation and evaporation associated with these cloud drafts, are examined using a three-dimensional numerical cloud ensemble model described by Soong and Ogura (1980) and Tao and Soong (1986). The cloud drafts are classified as active and inactive, and separate contributions to cloud statistics in areas of different cloud activity are then evaluated. The model results compare well with results obtained from aircraft measurements of a well-organized ITCZ rainband that occurred on August 12, 1974, during the Global Atmospheric Research Program's Atlantic Tropical Experiment.

  13. Photons Revisited

    NASA Astrophysics Data System (ADS)

    Batic, Matej; Begalli, Marcia; Han, Min Cheol; Hauf, Steffen; Hoff, Gabriela; Kim, Chan Hyeong; Kim, Han Sung; Grazia Pia, Maria; Saracco, Paolo; Weidenspointner, Georg

    2014-06-01

    A systematic review of methods and data for the Monte Carlo simulation of photon interactions is in progress: it concerns a wide set of theoretical modeling approaches and data libraries available for this purpose. Models and data libraries are assessed quantitatively with respect to an extensive collection of experimental measurements documented in the literature to determine their accuracy; this evaluation exploits rigorous statistical analysis methods. The computational performance of the associated modeling algorithms is evaluated as well. An overview of the assessment of photon interaction models and results of the experimental validation are presented.

  14. Human Thermal Model Evaluation Using the JSC Human Thermal Database

    NASA Technical Reports Server (NTRS)

    Cognata, T.; Bue, G.; Makinen, J.

    2011-01-01

    The human thermal database developed at the Johnson Space Center (JSC) is used to evaluate a set of widely used human thermal models. This database will facilitate a more accurate evaluation of human thermoregulatory response in a variety of situations, including those situations that might otherwise prove too dangerous for actual testing--such as extreme hot or cold splashdown conditions. This set includes the Wissler human thermal model, a model that has been widely used to predict the human thermoregulatory response to a variety of cold and hot environments. These models are statistically compared to the current database, which contains experiments of human subjects primarily in air from a literature survey ranging between 1953 and 2004 and from a suited experiment recently performed by the authors, for a quantitative study of relative strength and predictive quality of the models. Human thermal modeling has considerable long-term utility to human space flight. Such models provide a tool to predict crew survivability in support of vehicle design and to evaluate crew response in untested environments. It is to the benefit of any such model not only to collect relevant experimental data to correlate it against, but also to maintain an experimental standard or benchmark for future development in a readily and rapidly searchable and software accessible format. The Human thermal database project is intended to do just that: to collect relevant data from literature and experimentation and to store the data in a database structure for immediate and future use as a benchmark to judge human thermal models against, in identifying model strengths and weaknesses, to support model development and improve correlation, and to statistically quantify a model's predictive quality.

  15. Optimal experimental designs for fMRI when the model matrix is uncertain.

    PubMed

    Kao, Ming-Hung; Zhou, Lin

    2017-07-15

    This study concerns optimal designs for functional magnetic resonance imaging (fMRI) experiments when the model matrix of the statistical model depends on both the selected stimulus sequence (fMRI design) and the subject's uncertain feedback (e.g. answer) to each mental stimulus (e.g. question) presented to her/him. While practically important, this design issue is challenging. This is mainly because the information matrix cannot be fully determined at the design stage, making it difficult to evaluate the quality of the selected designs. To tackle this challenging issue, we propose an easy-to-use optimality criterion for evaluating the quality of designs, and an efficient approach for obtaining designs optimizing this criterion. Compared with a previously proposed method, our approach requires much less computing time to achieve designs with high statistical efficiencies. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. A Course Model for Teaching Research Evaluation in Colleges of Pharmacy.

    ERIC Educational Resources Information Center

    Draugalis, JoLaine R.; Slack, Marion K.

    1992-01-01

    A University of Arizona undergraduate pharmacy course designed to develop student skills in evaluation of research has five parts: introduction to the scientific method; statistical techniques/data analysis review; research design; fundamentals of clinical studies; and practical applications. Prerequisites include biostatistics and drug…

  17. Evaluation of Satellite and Model Precipitation Products Over Turkey

    NASA Astrophysics Data System (ADS)

    Yilmaz, M. T.; Amjad, M.

    2017-12-01

    Satellite-based remote sensing, gauge stations, and models are the three major platforms for acquiring precipitation datasets. Among them, satellites and models have the advantage of retrieving spatially and temporally continuous and consistent datasets, while the uncertainty estimates of these retrievals are often required for many hydrological studies to understand the source and the magnitude of the uncertainty in hydrological response parameters. In this study, satellite and model precipitation data products are validated over various temporal scales (daily, 3-daily, 7-daily, 10-daily and monthly) using in-situ measured precipitation observations from a network of 733 gauges from all over Turkey. Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) 3B42 version 7 and European Centre for Medium-Range Weather Forecasts (ECMWF) model estimates (daily, 3-daily, 7-daily and 10-daily accumulated forecast) are used in this study. Retrievals are evaluated for their mean and standard deviation and their accuracies are evaluated via bias, root mean square error, error standard deviation and correlation coefficient statistics. Intensity vs frequency analysis and some contingency table statistics like percent correct, probability of detection, false alarm ratio and critical success index are determined using daily time-series. Both ECMWF forecasts and TRMM observations, on average, overestimate the precipitation compared to gauge estimates; wet biases are 10.26 mm/month and 8.65 mm/month, respectively for ECMWF and TRMM. RMSE values of ECMWF forecasts and TRMM estimates are 39.69 mm/month and 41.55 mm/month, respectively. Monthly correlations between Gauges-ECMWF, Gauges-TRMM and ECMWF-TRMM are 0.76, 0.73 and 0.81, respectively. The model and the satellite error statistics are further compared against the gauge error statistics based on inverse distance weighting (IDW) analysis. Both the model and satellite data have smaller IDW errors (14.72 mm/month and 10.75 mm/month, respectively) compared to the gauge IDW error (21.58 mm/month). These results show that, on average, ECMWF forecast data have higher skill than TRMM observations. Overall, both ECMWF forecast data and TRMM observations show good potential for catchment scale hydrological analysis.
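
    The continuous and contingency-table verification statistics used above (bias, RMSE, correlation, probability of detection, false alarm ratio, critical success index) are standard and easy to reproduce. The sketch below computes them on placeholder daily precipitation arrays, not the TRMM/ECMWF/gauge data, and assumes a 1 mm wet-day threshold.

      import numpy as np

      def continuous_scores(obs, est):
          return {"bias": np.mean(est - obs),
                  "rmse": np.sqrt(np.mean((est - obs) ** 2)),
                  "corr": np.corrcoef(obs, est)[0, 1]}

      def categorical_scores(obs, est, threshold=1.0):   # wet day if >= 1 mm
          o, e = obs >= threshold, est >= threshold
          hits = np.sum(o & e); misses = np.sum(o & ~e); false = np.sum(~o & e)
          return {"POD": hits / (hits + misses),
                  "FAR": false / (hits + false),
                  "CSI": hits / (hits + misses + false)}

      obs = np.array([0.0, 2.5, 0.3, 10.2, 0.0, 4.1, 0.0, 1.2])   # mm/day
      est = np.array([0.2, 1.8, 0.0, 12.5, 0.4, 2.9, 1.1, 0.8])
      print(continuous_scores(obs, est))
      print(categorical_scores(obs, est))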

  18. Introduction to Multilevel Item Response Theory Analysis: Descriptive and Explanatory Models

    ERIC Educational Resources Information Center

    Sulis, Isabella; Toland, Michael D.

    2017-01-01

    Item response theory (IRT) models are the main psychometric approach for the development, evaluation, and refinement of multi-item instruments and scaling of latent traits, whereas multilevel models are the primary statistical method when considering the dependence between person responses when primary units (e.g., students) are nested within…

  19. 10 CFR 431.445 - Determination of small electric motor efficiency.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... determined either by testing in accordance with § 431.444 of this subpart, or by application of an... method. An AEDM applied to a basic model must be: (i) Derived from a mathematical model that represents... statistical analysis, computer simulation or modeling, or other analytic evaluation of performance data. (3...

  20. Do Different Mental Models Influence Cybersecurity Behavior? Evaluations via Statistical Reasoning Performance.

    PubMed

    Brase, Gary L; Vasserman, Eugene Y; Hsu, William

    2017-01-01

    Cybersecurity research often describes people as understanding internet security in terms of metaphorical mental models (e.g., disease risk, physical security risk, or criminal behavior risk). However, little research has directly evaluated if this is an accurate or productive framework. To assess this question, two experiments asked participants to respond to a statistical reasoning task framed in one of four different contexts (cybersecurity, plus the above alternative models). Each context was also presented using either percentages or natural frequencies, and these tasks were followed by a behavioral likelihood rating. As in previous research, consistent use of natural frequencies promoted correct Bayesian reasoning. There was little indication, however, that any of the alternative mental models generated consistently better understanding or reasoning over the actual cybersecurity context. There was some evidence that different models had some effects on patterns of responses, including the behavioral likelihood ratings, but these effects were small, as compared to the effect of the numerical format manipulation. This points to a need to improve the content of actual internet security warnings, rather than working to change the models users have of warnings.
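
    The statistical reasoning task is a Bayesian update that can be expressed either with percentages or with natural frequencies; the worked sketch below shows both framings giving the same answer. The base rate, hit rate, and false-alarm rate are illustrative numbers, not the experiment's materials.

      base_rate = 0.01        # P(threat)
      hit_rate = 0.90         # P(warning | threat)
      false_alarm = 0.05      # P(warning | no threat)

      # Percentage format: Bayes' theorem applied directly.
      p_threat_given_warning = (hit_rate * base_rate) / (
          hit_rate * base_rate + false_alarm * (1 - base_rate))

      # Natural-frequency format: of 10,000 users, 100 face a real threat and
      # 90 of them get a warning; of the 9,900 others, 495 get a false warning.
      p_natural = 90 / (90 + 495)

      print(round(p_threat_given_warning, 3), round(p_natural, 3))   # both ~0.154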

  1. Do Different Mental Models Influence Cybersecurity Behavior? Evaluations via Statistical Reasoning Performance

    PubMed Central

    Brase, Gary L.; Vasserman, Eugene Y.; Hsu, William

    2017-01-01

    Cybersecurity research often describes people as understanding internet security in terms of metaphorical mental models (e.g., disease risk, physical security risk, or criminal behavior risk). However, little research has directly evaluated if this is an accurate or productive framework. To assess this question, two experiments asked participants to respond to a statistical reasoning task framed in one of four different contexts (cybersecurity, plus the above alternative models). Each context was also presented using either percentages or natural frequencies, and these tasks were followed by a behavioral likelihood rating. As in previous research, consistent use of natural frequencies promoted correct Bayesian reasoning. There was little indication, however, that any of the alternative mental models generated consistently better understanding or reasoning over the actual cybersecurity context. There was some evidence that different models had some effects on patterns of responses, including the behavioral likelihood ratings, but these effects were small, as compared to the effect of the numerical format manipulation. This points to a need to improve the content of actual internet security warnings, rather than working to change the models users have of warnings. PMID:29163304

  2. An Intercomparison and Evaluation of Aircraft-Derived and Simulated CO from Seven Chemical Transport Models During the TRACE-P Experiment

    NASA Technical Reports Server (NTRS)

    Kiley, C. M.; Fuelberg, Henry E.; Palmer, P. I.; Allen, D. J.; Carmichael, G. R.; Jacob, D. J.; Mari, C.; Pierce, R. B.; Pickering, K. E.; Tang, Y.

    2002-01-01

    Four global scale and three regional scale chemical transport models are intercompared and evaluated during NASA's TRACE-P experiment. Model simulated and measured CO are statistically analyzed along aircraft flight tracks. Results for the combination of eleven flights show an overall negative bias in simulated CO. Biases are most pronounced during large CO events. Statistical agreements vary greatly among the individual flights. Those flights with the greatest range of CO values tend to be the worst simulated. However, for each given flight, the models generally provide similar relative results. The models exhibit difficulties simulating intense CO plumes. CO error is found to be greatest in the lower troposphere. Convective mass flux is shown to be very important, particularly near emissions source regions. Occasionally meteorological lift associated with excessive model-calculated mass fluxes leads to an overestimation of mid- and upper- tropospheric mixing ratios. Planetary Boundary Layer (PBL) depth is found to play an important role in simulating intense CO plumes. PBL depth is shown to cap plumes, confining heavy pollution to the very lowest levels.

  3. Evaluation of probabilistic forecasts with the scoringRules package

    NASA Astrophysics Data System (ADS)

    Jordan, Alexander; Krüger, Fabian; Lerch, Sebastian

    2017-04-01

    Over the last decades probabilistic forecasts in the form of predictive distributions have become popular in many scientific disciplines. With the proliferation of probabilistic models arises the need for decision-theoretically principled tools to evaluate the appropriateness of models and forecasts in a generalized way in order to better understand sources of prediction errors and to improve the models. Proper scoring rules are functions S(F,y) which evaluate the accuracy of a forecast distribution F , given that an outcome y was observed. In coherence with decision-theoretical principles they allow to compare alternative models, a crucial ability given the variety of theories, data sources and statistical specifications that is available in many situations. This contribution presents the software package scoringRules for the statistical programming language R, which provides functions to compute popular scoring rules such as the continuous ranked probability score for a variety of distributions F that come up in applied work. For univariate variables, two main classes are parametric distributions like normal, t, or gamma distributions, and distributions that are not known analytically, but are indirectly described through a sample of simulation draws. For example, ensemble weather forecasts take this form. The scoringRules package aims to be a convenient dictionary-like reference for computing scoring rules. We offer state of the art implementations of several known (but not routinely applied) formulas, and implement closed-form expressions that were previously unavailable. Whenever more than one implementation variant exists, we offer statistically principled default choices. Recent developments include the addition of scoring rules to evaluate multivariate forecast distributions. The use of the scoringRules package is illustrated in an example on post-processing ensemble forecasts of temperature.
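
    The continuous ranked probability score mentioned above has a closed form for a Gaussian forecast distribution. Since scoringRules is an R package, the sketch below shows that closed-form expression in Python on assumed forecast parameters; it is an illustration of the score, not the package's code.

      import numpy as np
      from scipy.stats import norm

      def crps_normal(mu, sigma, y):
          # CRPS(N(mu, sigma^2), y) = sigma * ( z*(2*Phi(z)-1) + 2*phi(z) - 1/sqrt(pi) )
          z = (y - mu) / sigma
          return sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))

      # A sharper, well-calibrated forecast earns a lower (better) score.
      print(crps_normal(mu=20.0, sigma=1.0, y=20.5))   # ~0.33
      print(crps_normal(mu=20.0, sigma=3.0, y=20.5))   # ~0.73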

  4. Statistical modelling for recurrent events: an application to sports injuries

    PubMed Central

    Ullah, Shahid; Gabbett, Tim J; Finch, Caroline F

    2014-01-01

    Background Injuries are often recurrent, with subsequent injuries influenced by previous occurrences and hence correlation between events needs to be taken into account when analysing such data. Objective This paper compares five different survival models (Cox proportional hazards (CoxPH) model and the following generalisations to recurrent event data: Andersen-Gill (A-G), frailty, Wei-Lin-Weissfeld total time (WLW-TT) marginal, Prentice-Williams-Peterson gap time (PWP-GT) conditional models) for the analysis of recurrent injury data. Methods Empirical evaluation and comparison of different models were performed using model selection criteria and goodness-of-fit statistics. Simulation studies assessed the size and power of each model fit. Results The modelling approach is demonstrated through direct application to Australian National Rugby League recurrent injury data collected over the 2008 playing season. Of the 35 players analysed, 14 (40%) players had more than 1 injury and 47 contact injuries were sustained over 29 matches. The CoxPH model provided the poorest fit to the recurrent sports injury data. The fit was improved with the A-G and frailty models, compared to WLW-TT and PWP-GT models. Conclusions Despite little difference in model fit between the A-G and frailty models, in the interest of fewer statistical assumptions it is recommended that, where relevant, future studies involving modelling of recurrent sports injury data use the frailty model in preference to the CoxPH model or its other generalisations. The paper provides a rationale for future statistical modelling approaches for recurrent sports injury. PMID:22872683

  5. An automated process for building reliable and optimal in vitro/in vivo correlation models based on Monte Carlo simulations.

    PubMed

    Sutton, Steven C; Hu, Mingxiu

    2006-05-05

    Many mathematical models have been proposed for establishing an in vitro/in vivo correlation (IVIVC). The traditional IVIVC model building process consists of 5 steps: deconvolution, model fitting, convolution, prediction error evaluation, and cross-validation. This is a time-consuming process, and typically only a few models are tested for any given data set. The objectives of this work were to (1) propose a statistical tool to screen models for further development of an IVIVC, (2) evaluate the performance of each model under different circumstances, and (3) investigate the effectiveness of common statistical model selection criteria for choosing IVIVC models. A computer program was developed to explore which model(s) would be most likely to work well with a random variation from the original formulation. The process used Monte Carlo simulation techniques to build IVIVC models. Data-based model selection criteria (Akaike Information Criterion [AIC], R²) and the probability of passing the Food and Drug Administration "prediction error" requirement were calculated. Several real data sets representing a broad range of release profiles are used to illustrate the process and to demonstrate the advantages of this automated process over the traditional approach. The Hixson-Crowell and Weibull models were often preferred over the linear model. When evaluating whether a Level A IVIVC model was possible, the AIC model selection criterion generally selected the best model. We believe that the approach we proposed may be a rapid tool to determine which IVIVC model (if any) is the most applicable.
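
    A minimal sketch of the screening idea, assuming hypothetical dissolution data and simple candidate release models: fit each model by least squares and rank the fits by an AIC computed from the residual sum of squares. This illustrates the model-selection step only, not the authors' full deconvolution/convolution and prediction-error workflow.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    # candidate release models (fraction dissolved vs. time); parameter values are illustrative
    def first_order(t, k):
        return 1 - np.exp(-k * t)

    def weibull(t, a, b):
        return 1 - np.exp(-(t / a) ** b)

    def hixson_crowell(t, k):
        return 1 - np.clip(1 - k * t, 0, None) ** 3

    def aic(y, yhat, n_params):
        rss = np.sum((y - yhat) ** 2)
        n = len(y)
        return n * np.log(rss / n) + 2 * n_params

    t = np.array([0.5, 1, 2, 4, 6, 8, 12])                     # h (hypothetical profile)
    f = np.array([0.18, 0.31, 0.52, 0.75, 0.86, 0.92, 0.97])   # fraction dissolved

    for name, model, p0 in [("first-order", first_order, [0.2]),
                            ("Weibull", weibull, [3.0, 1.0]),
                            ("Hixson-Crowell", hixson_crowell, [0.05])]:
        popt, _ = curve_fit(model, t, f, p0=p0, maxfev=10000)
        print(name, aic(f, model(t, *popt), len(popt)))
    ```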

  6. An overview of the mathematical and statistical analysis component of RICIS

    NASA Technical Reports Server (NTRS)

    Hallum, Cecil R.

    1987-01-01

    Mathematical and statistical analysis components of RICIS (Research Institute for Computing and Information Systems) can be used in the following problem areas: (1) quantification and measurement of software reliability; (2) assessment of changes in software reliability over time (reliability growth); (3) analysis of software-failure data; and (4) decision logic for whether to continue or stop testing software. Other areas of interest to NASA/JSC where mathematical and statistical analysis can be successfully employed include: math modeling of physical systems, simulation, statistical data reduction, evaluation methods, optimization, algorithm development, and mathematical methods in signal processing.

  7. Evaluating the decision accuracy and speed of clinical data visualizations.

    PubMed

    Pieczkiewicz, David S; Finkelstein, Stanley M

    2010-01-01

    Clinicians face an increasing volume of biomedical data. Assessing the efficacy of systems that enable accurate and timely clinical decision making merits corresponding attention. This paper discusses the multiple-reader multiple-case (MRMC) experimental design and linear mixed models as means of assessing and comparing decision accuracy and latency (time) for decision tasks in which clinician readers must interpret visual displays of data. These experimental and statistical techniques, used extensively in radiology imaging studies, offer a number of practical and analytic advantages over more traditional quantitative methods such as percent-correct measurements and ANOVAs, and are recommended for their statistical efficiency and generalizability. An example analysis using readily available, free, and commercial statistical software is provided as an appendix. While these techniques are not appropriate for all evaluation questions, they can provide a valuable addition to the evaluative toolkit of medical informatics research.
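
    A minimal sketch of a linear mixed model for reader latency, assuming hypothetical long-format data and using statsmodels; it fits only a random intercept per reader, whereas a full MRMC analysis also accounts for case variability.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # hypothetical long-format data: one row per reader x case, two display conditions
    rng = np.random.default_rng(1)
    readers = np.repeat(np.arange(8), 30)
    display = np.tile(np.repeat(["table", "graph"], 15), 8)
    latency = (12 + (display == "graph") * -2.0              # fixed display effect
               + rng.normal(0, 1.5, readers.size)            # residual noise
               + rng.normal(0, 1.0, 8)[readers])             # reader-specific offsets
    df = pd.DataFrame({"reader": readers, "display": display, "latency": latency})

    # random intercept for each reader; fixed effect of display type
    model = smf.mixedlm("latency ~ display", df, groups=df["reader"])
    print(model.fit().summary())
    ```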

  8. Evaluation of the statistical evidence for Characteristic Earthquakes in the frequency-magnitude distributions of Sumatra and other subduction zone regions

    NASA Astrophysics Data System (ADS)

    Naylor, M.; Main, I. G.; Greenhough, J.; Bell, A. F.; McCloskey, J.

    2009-04-01

    The Sumatran Boxing Day earthquake and subsequent large events provide an opportunity to re-evaluate the statistical evidence for characteristic earthquake events in frequency-magnitude distributions. Our aims are to (i) improve intuition regarding the properties of samples drawn from power laws, (ii) illustrate using random samples how appropriate Poisson confidence intervals can both aid the eye and provide an appropriate statistical evaluation of data drawn from power-law distributions, and (iii) apply these confidence intervals to test for evidence of characteristic earthquakes in subduction-zone frequency-magnitude distributions. We find no need for a characteristic model to describe the frequency-magnitude distributions of any of the investigated subduction zones, including Sumatra; the apparent evidence is attributable to an emergent skew in the residuals of power-law count data at high magnitudes, combined with a sampling bias toward examining large earthquakes as candidate characteristic events.
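
    A sketch of the confidence-interval idea under stated assumptions: hypothetical Gutenberg-Richter parameters define expected counts per magnitude bin, and 95% Poisson intervals around those expectations flag bins whose observed counts fall outside them.

    ```python
    import numpy as np
    from scipy.stats import poisson

    # hypothetical binned catalogue: counts per 0.1-magnitude bin above completeness
    mags = np.arange(5.0, 8.1, 0.1)
    a, b = 6.5, 1.0                                              # illustrative G-R parameters
    expected = 10 ** (a - b * mags) * (1 - 10 ** (-b * 0.1))     # expected count per bin
    observed = poisson.rvs(expected, random_state=0)             # synthetic "observed" counts

    lo, hi = poisson.interval(0.95, expected)    # 95% Poisson bounds around the model counts
    outside = (observed < lo) | (observed > hi)
    print("bins outside the 95% interval:", np.round(mags[outside], 1))
    ```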

  9. A multibody knee model with discrete cartilage prediction of tibio-femoral contact mechanics.

    PubMed

    Guess, Trent M; Liu, Hongzeng; Bhashyam, Sampath; Thiagarajan, Ganesh

    2013-01-01

    Combining musculoskeletal simulations with anatomical joint models capable of predicting cartilage contact mechanics would provide a valuable tool for studying the relationships between muscle force and cartilage loading. As a step towards producing multibody musculoskeletal models that include representation of cartilage tissue mechanics, this research developed a subject-specific multibody knee model that represented the tibia plateau cartilage as discrete rigid bodies that interacted with the femur through deformable contacts. Parameters for the compliant contact law were derived using three methods: (1) simplified Hertzian contact theory, (2) simplified elastic foundation contact theory and (3) parameter optimisation from a finite element (FE) solution. The contact parameters and contact friction were evaluated during a simulated walk in a virtual dynamic knee simulator, and the resulting kinematics were compared with measured in vitro kinematics. The effects on predicted contact pressures and cartilage-bone interface shear forces during the simulated walk were also evaluated. The compliant contact stiffness parameters had a statistically significant effect on predicted contact pressures as well as all tibio-femoral motions except flexion-extension. The contact friction was not statistically significant to contact pressures, but was statistically significant to medial-lateral translation and all rotations except flexion-extension. The magnitude of kinematic differences between model formulations was relatively small, but contact pressure predictions were sensitive to model formulation. The developed multibody knee model was computationally efficient and had a computation time 283 times faster than a FE simulation using the same geometries and boundary conditions.

  10. A Tutorial for Analyzing Human Reaction Times: How to Filter Data, Manage Missing Values, and Choose a Statistical Model

    ERIC Educational Resources Information Center

    Lachaud, Christian Michel; Renaud, Olivier

    2011-01-01

    This tutorial for the statistical processing of reaction times collected through a repeated-measure design is addressed to researchers in psychology. It aims at making explicit some important methodological issues, at orienting researchers to the existing solutions, and at providing them some evaluation tools for choosing the most robust and…

  11. Evaluating sufficient similarity for drinking-water disinfection by-product (DBP) mixtures with bootstrap hypothesis test procedures.

    PubMed

    Feder, Paul I; Ma, Zhenxu J; Bull, Richard J; Teuschler, Linda K; Rice, Glenn

    2009-01-01

    In chemical mixtures risk assessment, the use of dose-response data developed for one mixture to estimate risk posed by a second mixture depends on whether the two mixtures are sufficiently similar. While evaluations of similarity may be made using qualitative judgments, this article uses nonparametric statistical methods based on the "bootstrap" resampling technique to address the question of similarity among mixtures of chemical disinfectant by-products (DBP) in drinking water. The bootstrap resampling technique is a general-purpose, computer-intensive approach to statistical inference that substitutes empirical sampling for theoretically based parametric mathematical modeling. Nonparametric, bootstrap-based inference involves fewer assumptions than parametric normal theory based inference. The bootstrap procedure is appropriate, at least in an asymptotic sense, whether or not the parametric, distributional assumptions hold, even approximately. The statistical analysis procedures in this article are initially illustrated with data from 5 water treatment plants (Schenck et al., 2009), and then extended using data developed from a study of 35 drinking-water utilities (U.S. EPA/AMWA, 1989), which permits inclusion of a greater number of water constituents and increased structure in the statistical models.
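
    A generic sketch of a bootstrap hypothesis test (a simple two-sample difference-in-means version, not the sufficient-similarity statistics used in the article); the plant data are hypothetical.

    ```python
    import numpy as np

    rng = np.random.default_rng(42)

    def boot_test(x, y, n_boot=5000):
        """Two-sample bootstrap test for a difference in means (null: same distribution)."""
        observed = abs(x.mean() - y.mean())
        pooled = np.concatenate([x, y])
        count = 0
        for _ in range(n_boot):
            xs = rng.choice(pooled, size=len(x), replace=True)
            ys = rng.choice(pooled, size=len(y), replace=True)
            if abs(xs.mean() - ys.mean()) >= observed:
                count += 1
        return count / n_boot

    # hypothetical constituent concentrations from two treatment plants
    plant_a = rng.normal(30, 5, size=40)
    plant_b = rng.normal(33, 5, size=40)
    print("approximate p-value:", boot_test(plant_a, plant_b))
    ```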

  12. Efficient statistical tests to compare Youden index: accounting for contingency correlation.

    PubMed

    Chen, Fangyao; Xue, Yuqiang; Tan, Ming T; Chen, Pingyan

    2015-04-30

    Youden index is widely utilized in studies evaluating accuracy of diagnostic tests and performance of predictive, prognostic, or risk models. However, both one and two independent sample tests on the Youden index have been derived ignoring the dependence (association) between sensitivity and specificity, resulting in potentially misleading findings. In addition, a paired-sample test on the Youden index is currently unavailable. This article develops efficient statistical inference procedures for one-sample, independent-sample, and paired-sample tests on the Youden index by accounting for contingency correlation, namely associations between sensitivity and specificity and paired samples typically represented in contingency tables. For the one-sample and two independent-sample tests, the variances are estimated by the Delta method and the statistical inference is based on the central limit theorem; the results are then verified by bootstrap estimates. For the paired-sample test, we show that the estimated covariance of the two sensitivities and specificities can be represented as a function of the kappa statistic, so the test can be readily carried out. We then show the remarkable accuracy of the estimated variance using a constrained optimization approach. Simulation is performed to evaluate the statistical properties of the derived tests. The proposed approaches yield more stable type I errors at the nominal level and substantially higher power (efficiency) than the original Youden's approach. Therefore, the simple explicit large sample solution performs very well. Because we can readily implement the asymptotic and exact bootstrap computation with common software like R, the method is broadly applicable to the evaluation of diagnostic tests and model performance. Copyright © 2015 John Wiley & Sons, Ltd.
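
    A minimal sketch of the Youden index point estimate with a simple percentile-bootstrap interval, assuming a hypothetical 2 x 2 table; the article's Delta-method and kappa-based variance formulas are not reproduced here.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def youden(tp, fn, tn, fp):
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        return sens + spec - 1

    # hypothetical 2x2 diagnostic table
    tp, fn, tn, fp = 85, 15, 140, 60
    print("J =", youden(tp, fn, tn, fp))

    # percentile bootstrap CI by resampling the diseased and non-diseased groups separately
    diseased = np.array([1] * tp + [0] * fn)   # 1 = test positive
    healthy = np.array([0] * tn + [1] * fp)    # 1 = test positive
    js = []
    for _ in range(2000):
        d = rng.choice(diseased, size=diseased.size, replace=True)
        h = rng.choice(healthy, size=healthy.size, replace=True)
        js.append(d.mean() + (1 - h.mean()) - 1)   # sensitivity + specificity - 1
    print("95% CI:", np.percentile(js, [2.5, 97.5]))
    ```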

  13. Applying the multivariate time-rescaling theorem to neural population models

    PubMed Central

    Gerhard, Felipe; Haslinger, Robert; Pipa, Gordon

    2011-01-01

    Statistical models of neural activity are integral to modern neuroscience. Recently, interest has grown in modeling the spiking activity of populations of simultaneously recorded neurons to study the effects of correlations and functional connectivity on neural information processing. However, any statistical model must be validated by an appropriate goodness-of-fit test. Kolmogorov-Smirnov tests based upon the time-rescaling theorem have proven to be useful for evaluating point-process-based statistical models of single-neuron spike trains. Here we discuss the extension of the time-rescaling theorem to the multivariate (neural population) case. We show that even in the presence of strong correlations between spike trains, models which neglect couplings between neurons can be erroneously passed by the univariate time-rescaling test. We present the multivariate version of the time-rescaling theorem, and provide a practical step-by-step procedure for applying it towards testing the sufficiency of neural population models. Using several simple analytically tractable models and also more complex simulated and real data sets, we demonstrate that important features of the population activity can only be detected using the multivariate extension of the test. PMID:21395436
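
    A sketch of the univariate time-rescaling check under a simplifying assumption of a constant-rate model intensity: rescaled inter-spike intervals are mapped to values that should be Uniform(0, 1) if the model is correct, and a Kolmogorov-Smirnov test is applied.

    ```python
    import numpy as np
    from scipy.stats import kstest

    rng = np.random.default_rng(3)

    # simulate a homogeneous Poisson spike train and assume a (possibly wrong) model rate
    true_rate, model_rate = 10.0, 10.0      # Hz; set model_rate wrong to see the test fail
    spikes = np.cumsum(rng.exponential(1.0 / true_rate, size=500))

    # time-rescaling: integrated model intensity between spikes; constant rate -> rate * ISI
    rescaled = model_rate * np.diff(spikes)
    u = 1.0 - np.exp(-rescaled)             # should be Uniform(0, 1) if the model is correct
    print(kstest(u, "uniform"))
    ```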

  14. Improving the Validity of Activity of Daily Living Dependency Risk Assessment

    PubMed Central

    Clark, Daniel O.; Stump, Timothy E.; Tu, Wanzhu; Miller, Douglas K.

    2015-01-01

    Objectives Efforts to prevent activity of daily living (ADL) dependency may be improved through models that assess older adults’ dependency risk. We evaluated whether cognition and gait speed measures improve the predictive validity of interview-based models. Method Participants were 8,095 self-respondents in the 2006 Health and Retirement Survey who were aged 65 years or over and independent in five ADLs. Incident ADL dependency was determined from the 2008 interview. Models were developed using random 2/3rd cohorts and validated in the remaining 1/3rd. Results Compared to a c-statistic of 0.79 in the best interview model, the model including cognitive measures had c-statistics of 0.82 and 0.80 while the best fitting gait speed model had c-statistics of 0.83 and 0.79 in the development and validation cohorts, respectively. Conclusion Two relatively brief models, one that requires an in-person assessment and one that does not, had excellent validity for predicting incident ADL dependency but did not significantly improve the predictive validity of the best fitting interview-based models. PMID:24652867

  15. Statistical variation in progressive scrambling

    NASA Astrophysics Data System (ADS)

    Clark, Robert D.; Fox, Peter C.

    2004-07-01

    The two methods most often used to evaluate the robustness and predictivity of partial least squares (PLS) models are cross-validation and response randomization. Both methods may be overly optimistic for data sets that contain redundant observations, however. The kinds of perturbation analysis widely used for evaluating model stability in the context of ordinary least squares regression are only applicable when the descriptors are independent of each other and errors are independent and normally distributed; neither assumption holds for QSAR in general and for PLS in particular. Progressive scrambling is a novel, non-parametric approach to perturbing models in the response space in a way that does not disturb the underlying covariance structure of the data. Here, we introduce adjustments for two of the characteristic values produced by a progressive scrambling analysis - the deprecated predictivity (Q*_s²) and standard error of prediction (SDEP*_s) - that correct for the effect of introduced perturbation. We also explore the statistical behavior of the adjusted values (Q*_0² and SDEP*_0) and the sensitivity to perturbation (dq²/dr²_yy'). It is shown that the three statistics are all robust for stable PLS models, in terms of the stochastic component of their determination and of their variation due to sampling effects involved in training set selection.

  16. An improved approach for flight readiness certification: Methodology for failure risk assessment and application examples. Volume 2: Software documentation

    NASA Technical Reports Server (NTRS)

    Moore, N. R.; Ebbeler, D. H.; Newlin, L. E.; Sutharshana, S.; Creager, M.

    1992-01-01

    An improved methodology for quantitatively evaluating failure risk of spaceflight systems to assess flight readiness and identify risk control measures is presented. This methodology, called Probabilistic Failure Assessment (PFA), combines operating experience from tests and flights with engineering analysis to estimate failure risk. The PFA methodology is of particular value when information on which to base an assessment of failure risk, including test experience and knowledge of parameters used in engineering analyses of failure phenomena, is expensive or difficult to acquire. The PFA methodology is a prescribed statistical structure in which engineering analysis models that characterize failure phenomena are used conjointly with uncertainties about analysis parameters and/or modeling accuracy to estimate failure probability distributions for specific failure modes. These distributions can then be modified, by means of statistical procedures of the PFA methodology, to reflect any test or flight experience. Conventional engineering analysis models currently employed for design of failure prediction are used in this methodology. The PFA methodology is described and examples of its application are presented. Conventional approaches to failure risk evaluation for spaceflight systems are discussed, and the rationale for the approach taken in the PFA methodology is presented. The statistical methods, engineering models, and computer software used in fatigue failure mode applications are thoroughly documented.

  17. Developing Statistical Evaluation Model of Introduction Effect of MSW Thermal Recycling

    NASA Astrophysics Data System (ADS)

    Aoyama, Makoto; Kato, Takeyoshi; Suzuoki, Yasuo

    For the effective utilization of municipal solid waste (MSW) through thermal recycling, new technologies, such as an incineration plant using a Molten Carbonate Fuel Cell (MCFC), are being developed. The impact of new technologies should be evaluated statistically for various municipalities, so that the target of technological development or the potential cost reduction due to the increased cumulative number of installed systems can be discussed. For this purpose, we developed a model for discussing the impact of new technologies, in which a statistical mesh data set was utilized to estimate the heat demand around the incineration plant. This paper examines a case study using the developed model, in which conventional and MCFC-type MSW incineration plants are compared in terms of the reduction in primary energy and the revenue from both electricity and heat supply. Based on the difference in annual revenue, we calculate the allowable additional investment in an MCFC-type MSW incineration plant relative to a conventional plant. The results suggest that the allowable investment can be about 30 million yen/(t/day) in small municipalities, while it is only 10 million yen/(t/day) in large municipalities. The sensitivity analysis shows the model can be useful for discussing the impact of material recycling of plastics on thermal recycling technologies.

  18. An improved approach for flight readiness certification: Methodology for failure risk assessment and application examples, volume 1

    NASA Technical Reports Server (NTRS)

    Moore, N. R.; Ebbeler, D. H.; Newlin, L. E.; Sutharshana, S.; Creager, M.

    1992-01-01

    An improved methodology for quantitatively evaluating failure risk of spaceflight systems to assess flight readiness and identify risk control measures is presented. This methodology, called Probabilistic Failure Assessment (PFA), combines operating experience from tests and flights with engineering analysis to estimate failure risk. The PFA methodology is of particular value when information on which to base an assessment of failure risk, including test experience and knowledge of parameters used in engineering analyses of failure phenomena, is expensive or difficult to acquire. The PFA methodology is a prescribed statistical structure in which engineering analysis models that characterize failure phenomena are used conjointly with uncertainties about analysis parameters and/or modeling accuracy to estimate failure probability distributions for specific failure modes. These distributions can then be modified, by means of statistical procedures of the PFA methodology, to reflect any test or flight experience. Conventional engineering analysis models currently employed for design of failure prediction are used in this methodology. The PFA methodology is described and examples of its application are presented. Conventional approaches to failure risk evaluation for spaceflight systems are discussed, and the rationale for the approach taken in the PFA methodology is presented. The statistical methods, engineering models, and computer software used in fatigue failure mode applications are thoroughly documented.

  19. Newer classification and regression tree techniques: Bagging and Random Forests for ecological prediction

    Treesearch

    Anantha M. Prasad; Louis R. Iverson; Andy Liaw; Andy Liaw

    2006-01-01

    We evaluated four statistical models - Regression Tree Analysis (RTA), Bagging Trees (BT), Random Forests (RF), and Multivariate Adaptive Regression Splines (MARS) - for predictive vegetation mapping under current and future climate scenarios according to the Canadian Climate Centre global circulation model.
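
    A minimal sketch comparing bagged trees and a random forest on synthetic regression data with scikit-learn, purely to illustrate two of the ensemble methods named above; it does not use the authors' vegetation or climate data.

    ```python
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.ensemble import BaggingRegressor, RandomForestRegressor
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeRegressor

    X, y = make_regression(n_samples=500, n_features=12, noise=10.0, random_state=0)

    models = {
        "bagged trees": BaggingRegressor(DecisionTreeRegressor(), n_estimators=200, random_state=0),
        "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5, scoring="r2")
        print(f"{name}: mean CV R^2 = {scores.mean():.3f}")
    ```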

  20. A Comparison of the Forecast Skills among Three Numerical Models

    NASA Astrophysics Data System (ADS)

    Lu, D.; Reddy, S. R.; White, L. J.

    2003-12-01

    Three numerical weather forecast models, MM5, COAMPS and WRF, operated in a joint effort of NOAA HU-NCAS and Jackson State University (JSU) during summer 2003, have been chosen to study their forecast skills against observations. The models forecast over the same region with the same initialization, boundary conditions, forecast length and spatial resolution. The AVN global dataset has been ingested to provide initial conditions. A grid resolution of 27 km is chosen, representative of current mesoscale models. Forecasts 36 h in length are performed, with output at 12-h intervals. The key parameters used to evaluate forecast skill include 12-h accumulated precipitation, sea level pressure, wind, surface temperature and dew point. Precipitation is evaluated statistically using conventional skill scores, the Threat Score (TS) and Bias Score (BS), for different threshold values based on 12-h rainfall observations, whereas other statistical measures such as Mean Error (ME), Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are applied to the other forecast parameters.
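
    A sketch of the verification metrics named above, using their standard definitions and hypothetical forecast/observation pairs: the Threat Score and Bias Score from a yes/no contingency table at a precipitation threshold, plus ME, MAE, and RMSE for continuous fields.

    ```python
    import numpy as np

    def threat_and_bias(forecast_mm, observed_mm, threshold):
        """Threat score and bias score for 12-h precipitation at a given threshold (mm)."""
        f = forecast_mm >= threshold
        o = observed_mm >= threshold
        hits = np.sum(f & o)
        misses = np.sum(~f & o)
        false_alarms = np.sum(f & ~o)
        ts = hits / (hits + misses + false_alarms)
        bs = (hits + false_alarms) / (hits + misses)
        return ts, bs

    def me_mae_rmse(forecast, observed):
        err = forecast - observed
        return err.mean(), np.abs(err).mean(), np.sqrt((err ** 2).mean())

    # hypothetical station values
    rng = np.random.default_rng(7)
    obs = rng.gamma(2.0, 3.0, size=200)
    fcst = obs + rng.normal(0.5, 2.0, size=200)
    print(threat_and_bias(fcst, obs, threshold=5.0))
    print(me_mae_rmse(fcst, obs))
    ```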

  1. Accuracy of topographic index models at identifying ephemeral gully trajectories on agricultural fields

    NASA Astrophysics Data System (ADS)

    Sheshukov, Aleksey Y.; Sekaluvu, Lawrence; Hutchinson, Stacy L.

    2018-04-01

    Topographic index (TI) models have been widely used to predict trajectories and initiation points of ephemeral gullies (EGs) in agricultural landscapes. Prediction of EGs strongly relies on the selected value of the critical TI threshold, and the accuracy depends on topographic features, agricultural management, and datasets of observed EGs. This study statistically evaluated the predictions by TI models in two paired watersheds in Central Kansas that had different levels of structural disturbance due to implemented conservation practices. Four TI models with sole dependency on the topographic factors of slope, contributing area, and planform curvature were used in this study. The observed EGs were obtained by field reconnaissance and through the process of hydrological reconditioning of digital elevation models (DEMs). Kernel Density Estimation analysis was used to evaluate the TI distribution within a 10-m buffer of the observed EG trajectories. EG occurrence within catchments was analyzed using the kappa statistic of the error matrix approach, while the lengths of predicted EGs were compared with the observed dataset using the Nash-Sutcliffe Efficiency (NSE) statistic. The TI frequency analysis produced a bi-modal distribution of topographic indices, with pixels within the EG trajectories peaking at higher TI values. The graphs of kappa and NSE versus critical TI threshold showed similar profiles for all four TI models and both watersheds, with the maximum value indicating the best agreement with the observed data. The Compound Topographic Index (CTI) model showed the best overall accuracy, with an NSE of 0.55 and a kappa of 0.32. The statistics for the disturbed watershed showed higher best critical TI threshold values than for the undisturbed watershed. Structural conservation practices implemented in the disturbed watershed reduced ephemeral channels in headwater catchments, thus producing less variability in catchments with EGs. The variation in critical thresholds for all TI models suggested that TI models tend to predict EG occurrence and length over a range of thresholds rather than at a single best value.
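
    A minimal sketch of the two evaluation statistics used above, with hypothetical catchment data: Nash-Sutcliffe Efficiency for predicted EG lengths and Cohen's kappa for EG presence/absence.

    ```python
    import numpy as np
    from sklearn.metrics import cohen_kappa_score

    def nash_sutcliffe(observed, simulated):
        observed, simulated = np.asarray(observed, float), np.asarray(simulated, float)
        return 1 - np.sum((observed - simulated) ** 2) / np.sum((observed - observed.mean()) ** 2)

    # hypothetical EG lengths (m) per catchment and derived presence/absence labels
    obs_len = np.array([120, 0, 240, 85, 310, 0, 150])
    sim_len = np.array([100, 30, 210, 60, 360, 0, 140])
    print("NSE:", round(nash_sutcliffe(obs_len, sim_len), 2))

    obs_presence = (obs_len > 0).astype(int)
    sim_presence = (sim_len > 0).astype(int)
    print("kappa:", round(cohen_kappa_score(obs_presence, sim_presence), 2))
    ```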

  2. Estimation of absorption rate constant (ka) following oral administration by Wagner-Nelson, Loo-Riegelman, and statistical moments in the presence of a secondary peak.

    PubMed

    Mahmood, Iftekhar

    2004-01-01

    The objective of this study was to evaluate the performance of the Wagner-Nelson, Loo-Riegelman, and statistical moments methods in determining the absorption rate constant(s) in the presence of a secondary peak. These methods were also evaluated when there were two absorption rates without a secondary peak. Different sets of plasma concentration versus time data for a hypothetical drug following one- or two-compartment models were generated by simulation. The true ka was compared with the ka estimated by the Wagner-Nelson, Loo-Riegelman and statistical moments methods. The results of this study indicate that the Wagner-Nelson, Loo-Riegelman and statistical moments methods may not be suitable for the estimation of absorption rate constants in the presence of a secondary peak or when absorption takes place with two absorption rates.
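
    A sketch of the textbook one-compartment Wagner-Nelson calculation (not the paper's simulation study): the fraction absorbed is F(t) = (C(t) + ke*AUC(0-t)) / (ke*AUC(0-inf)), and ka is taken from the slope of ln(1 - F) during the absorption phase. The plasma profile below is simulated with assumed parameters.

    ```python
    import numpy as np

    def wagner_nelson_ka(t, c, ke):
        """One-compartment Wagner-Nelson estimate of ka from plasma data (t in h, c in mg/L)."""
        auc_t = np.concatenate(([0.0], np.cumsum(np.diff(t) * (c[1:] + c[:-1]) / 2)))  # trapezoid
        auc_inf = auc_t[-1] + c[-1] / ke                       # extrapolate the terminal tail
        frac_abs = (c + ke * auc_t) / (ke * auc_inf)
        # ka from the slope of ln(1 - F) over the absorption phase (here: points with F < 0.9)
        mask = frac_abs < 0.9
        slope, _ = np.polyfit(t[mask], np.log(1 - frac_abs[mask]), 1)
        return -slope

    t = np.array([0, 0.25, 0.5, 1, 1.5, 2, 3, 4, 6, 8, 12])
    ke, ka_true = 0.2, 1.2
    c = 10 * ka_true / (ka_true - ke) * (np.exp(-ke * t) - np.exp(-ka_true * t))  # simulated profile
    print("estimated ka:", round(wagner_nelson_ka(t, c, ke), 2))
    ```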

  3. Evaluating the capabilities of watershed-scale models in estimating sediment yield at field-scale.

    PubMed

    Sommerlot, Andrew R; Nejadhashemi, A Pouyan; Woznicki, Sean A; Giri, Subhasis; Prohaska, Michael D

    2013-09-30

    Many watershed model interfaces have been developed in recent years for predicting field-scale sediment loads. They share the goal of providing data for decisions aimed at improving watershed health and the effectiveness of water quality conservation efforts. The objectives of this study were to: 1) compare three watershed-scale models (the Soil and Water Assessment Tool (SWAT), Field_SWAT, and the High Impact Targeting (HIT) model) against a calibrated field-scale model (RUSLE2) in estimating sediment yield from 41 randomly selected agricultural fields within the River Raisin watershed; 2) evaluate the statistical significance of differences among models; 3) assess the watershed models' capabilities in identifying areas of concern at the field level; 4) evaluate the reliability of the watershed-scale models for field-scale analysis. The SWAT model produced the estimates most similar to RUSLE2, providing the closest median and the lowest absolute error in sediment yield predictions, while the HIT model estimates were the worst. Concerning statistically significant differences between models, SWAT was the only model found to be not significantly different from the calibrated RUSLE2 at α = 0.05. Meanwhile, all models were incapable of identifying priority areas similar to the RUSLE2 model. Overall, SWAT provided the most correct estimates (51%) within the uncertainty bounds of RUSLE2 and is the most reliable among the studied models, while HIT is the least reliable. The results of this study suggest that caution should be exercised when using watershed-scale models for field-level decision-making, and that field-specific data are of paramount importance. Copyright © 2013 Elsevier Ltd. All rights reserved.

  4. QSAR study of curcumine derivatives as HIV-1 integrase inhibitors.

    PubMed

    Gupta, Pawan; Sharma, Anju; Garg, Prabha; Roy, Nilanjan

    2013-03-01

    A QSAR study was performed on curcumine derivatives as HIV-1 integrase inhibitors using multiple linear regression. A statistically significant model was developed with a squared correlation coefficient (r²) of 0.891 and a cross-validated r² (r²cv) of 0.825. The developed model revealed that electronic, shape, size, geometry, substitution information and hydrophilicity were important atomic properties for determining the inhibitory activity of these molecules. The model was also tested successfully for external validation (r²pred = 0.849) as well as Tropsha's test for model predictability. Furthermore, the domain analysis was carried out to evaluate the prediction reliability of external set molecules. The model was statistically robust and had good predictive power which can be successfully utilized for screening of new molecules.
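
    A generic sketch of the MLR-with-cross-validation workflow (hypothetical descriptors, not the curcumine data set): fit a multiple linear regression, report r², and compute a leave-one-out q².

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_predict

    rng = np.random.default_rng(0)

    # hypothetical descriptor matrix (e.g., electronic, shape, hydrophilicity terms) and activities
    X = rng.normal(size=(30, 5))
    y = X @ np.array([0.8, -0.5, 0.3, 0.0, 0.6]) + rng.normal(0, 0.3, 30)

    model = LinearRegression().fit(X, y)
    r2 = model.score(X, y)

    y_loo = cross_val_predict(LinearRegression(), X, y, cv=LeaveOneOut())
    q2 = 1 - np.sum((y - y_loo) ** 2) / np.sum((y - y.mean()) ** 2)
    print(f"r2 = {r2:.3f}, LOO q2 = {q2:.3f}")
    ```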

  5. Statistical validation and an empirical model of hydrogen production enhancement found by utilizing passive flow disturbance in the steam-reformation process

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Erickson, Paul A.; Liao, Chang-hsien

    2007-11-15

    A passive flow disturbance has been proven to enhance the conversion of fuel in a methanol-steam reformer. This study presents a statistical validation of the experiment based on a standard 2^k factorial experiment design and the resulting empirical model of the enhanced hydrogen-producing process. A factorial experiment design was used to statistically analyze the effects and interactions of various input factors in the experiment. Three input factors, including the number of flow disturbers, catalyst size, and reactant flow rate, were investigated for their effects on fuel conversion in the steam-reformation process. Based on the experimental results, an empirical model was developed and further evaluated with an uncertainty analysis and interior point data. (author)
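
    A minimal sketch of effect estimation in a coded 2^k factorial design, using a hypothetical 2^3 design and hypothetical conversion responses rather than the authors' reformer data.

    ```python
    import itertools
    import numpy as np

    # coded (+1/-1) 2^3 design: flow disturbers, catalyst size, reactant flow rate (hypothetical)
    design = np.array(list(itertools.product([-1, 1], repeat=3)))
    conversion = np.array([61, 64, 63, 69, 70, 74, 72, 80], dtype=float)  # hypothetical responses

    labels = ["disturbers", "catalyst size", "flow rate"]
    for j, name in enumerate(labels):
        effect = conversion[design[:, j] == 1].mean() - conversion[design[:, j] == -1].mean()
        print(f"main effect of {name}: {effect:+.2f}")

    # two-factor interaction example: disturbers x flow rate
    ab = design[:, 0] * design[:, 2]
    print("disturbers x flow rate interaction:",
          conversion[ab == 1].mean() - conversion[ab == -1].mean())
    ```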

  6. LES/PDF studies of joint statistics of mixture fraction and progress variable in piloted methane jet flames with inhomogeneous inlet flows

    NASA Astrophysics Data System (ADS)

    Zhang, Pei; Barlow, Robert; Masri, Assaad; Wang, Haifeng

    2016-11-01

    The mixture fraction and progress variable are often used as independent variables for describing turbulent premixed and non-premixed flames. There is a growing interest in using these two variables for describing partially premixed flames. The joint statistical distribution of the mixture fraction and progress variable is of great interest in developing models for partially premixed flames. In this work, we conduct predictive studies of the joint statistics of mixture fraction and progress variable in a series of piloted methane jet flames with inhomogeneous inlet flows. The employed models combine large eddy simulations with the Monte Carlo probability density function (PDF) method. The joint PDFs and marginal PDFs are examined in detail by comparing the model predictions and the measurements. Different presumed shapes of the joint PDFs are also evaluated.

  7. Spillover in the Academy: Marriage Stability and Faculty Evaluations.

    ERIC Educational Resources Information Center

    Ludlow, Larry H.; Alvarez-Salvat, Rose M.

    2001-01-01

    Studied the spillover between family and work by examining the link between marital status and work performance across marriage, divorce, and remarriage. A polynomial regression model was fit to the data from 78 evaluations of an individual professor, and a cubic curve through the 3 periods was statistically significant. (SLD)

  8. Slice simulation from a model of the parenchymous vascularization to evaluate texture features: work in progress.

    PubMed

    Rolland, Y; Bézy-Wendling, J; Duvauferrier, R; Coatrieux, J L

    1999-03-01

    To demonstrate the usefulness of a model of the parenchymous vascularization for evaluating texture analysis methods. Slices with thickness varying from 1 to 4 mm were reformatted from a 3D vascular model corresponding to either normal tissue perfusion or local hypervascularization. Parameters of statistical methods were measured on sixteen 128 x 128 regions of interest, and mean values and standard deviations were calculated. For each parameter, the performance (discrimination power and stability) was evaluated. Among the 11 calculated statistical parameters, three (homogeneity, entropy, mean of gradients) were found to have a good discriminating power to differentiate normal perfusion from hypervascularization, but only the gradient mean was found to have good stability with respect to slice thickness. Five parameters (run percentage, run length distribution, long run emphasis, contrast, and gray level distribution) gave intermediate results. Of the remaining three, kurtosis and correlation were found to have little discrimination power, and skewness none. This 3D vascular model, which allows the generation of various examples of vascular textures, is a powerful tool to assess the performance of texture analysis methods. This improves our knowledge of the methods and should contribute to their a priori choice when designing clinical studies.

  9. Quantifying discrimination of Framingham risk functions with different survival C statistics.

    PubMed

    Pencina, Michael J; D'Agostino, Ralph B; Song, Linye

    2012-07-10

    Cardiovascular risk prediction functions offer an important diagnostic tool for clinicians and patients themselves. They are usually constructed with the use of parametric or semi-parametric survival regression models. It is essential to be able to evaluate the performance of these models, preferably with summaries that offer natural and intuitive interpretations. The concept of discrimination, popular in the logistic regression context, has been extended to survival analysis. However, the extension is not unique. In this paper, we define discrimination in survival analysis as the model's ability to separate those with longer event-free survival from those with shorter event-free survival within some time horizon of interest. This definition remains consistent with that used in logistic regression, in the sense that it assesses how well the model-based predictions match the observed data. Practical and conceptual examples and numerical simulations are employed to examine four C statistics proposed in the literature to evaluate the performance of survival models. We observe that they differ in the numerical values and aspects of discrimination that they capture. We conclude that the index proposed by Harrell is the most appropriate to capture discrimination described by the above definition. We suggest researchers report which C statistic they are using, provide a rationale for their selection, and be aware that comparing different indices across studies may not be meaningful. Copyright © 2012 John Wiley & Sons, Ltd.
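
    A sketch of Harrell's concordance index as characterized above, implemented directly over usable pairs (the pair's shorter follow-up time must end in an observed event); the survival times, event indicators, and risk scores are hypothetical.

    ```python
    import numpy as np

    def harrell_c(time, event, risk):
        """Harrell's C: among usable pairs, the fraction where the higher-risk subject fails earlier.
        time: follow-up time, event: 1 = event observed, risk: model risk score."""
        n, concordant, tied, usable = len(time), 0.0, 0.0, 0
        for i in range(n):
            for j in range(i + 1, n):
                if time[i] == time[j]:
                    continue                      # skip tied times in this simple version
                short, long_ = (i, j) if time[i] < time[j] else (j, i)
                if not event[short]:
                    continue                      # pair not usable: shorter time is censored
                usable += 1
                if risk[short] > risk[long_]:
                    concordant += 1
                elif risk[short] == risk[long_]:
                    tied += 1
        return (concordant + 0.5 * tied) / usable

    time = np.array([5, 8, 12, 3, 9, 15])
    event = np.array([1, 1, 0, 1, 0, 1])
    risk = np.array([2.1, 1.0, 0.4, 2.8, 0.9, 0.2])
    print("C =", round(harrell_c(time, event, risk), 3))
    ```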

  10. Probabilistic/Fracture-Mechanics Model For Service Life

    NASA Technical Reports Server (NTRS)

    Watkins, T., Jr.; Annis, C. G., Jr.

    1991-01-01

    Computer program makes probabilistic estimates of lifetime of engine and components thereof. Developed to fill need for more accurate life-assessment technique that avoids errors in estimated lives and provides for statistical assessment of levels of risk created by engineering decisions in designing system. Implements mathematical model combining techniques of statistics, fatigue, fracture mechanics, nondestructive analysis, life-cycle cost analysis, and management of engine parts. Used to investigate effects of such engine-component life-controlling parameters as return-to-service intervals, stresses, capabilities for nondestructive evaluation, and qualities of materials.

  11. Simulation Study Using a New Type of Sample Variance

    NASA Technical Reports Server (NTRS)

    Howe, D. A.; Lainson, K. J.

    1996-01-01

    We evaluate with simulated data a new type of sample variance for the characterization of frequency stability. The new statistic (referred to as TOTALVAR and its square root TOTALDEV) is a better predictor of long-term frequency variations than the present sample Allan deviation. The statistical model uses the assumption that a time series of phase or frequency differences is wrapped (periodic) with overall frequency difference removed. We find that the variability at long averaging times is reduced considerably for the five models of power-law noise commonly encountered with frequency standards and oscillators.
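
    For orientation, a sketch of the classical (non-overlapped) Allan deviation computed from fractional-frequency data; TOTALDEV modifies this baseline by extending the series (wrapping/reflection), which is not implemented here.

    ```python
    import numpy as np

    def allan_deviation(y, m):
        """Non-overlapped Allan deviation of fractional-frequency data y at averaging factor m."""
        n_blocks = len(y) // m
        y_avg = y[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)  # block averages over m samples
        return np.sqrt(0.5 * np.mean(np.diff(y_avg) ** 2))

    rng = np.random.default_rng(0)
    y = rng.normal(0.0, 1e-12, size=100_000)   # white frequency noise, basic interval tau0 = 1 s
    for m in (1, 10, 100, 1000):
        print(f"tau = {m:5d} s   ADEV = {allan_deviation(y, m):.2e}")
    ```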

  12. Climate Considerations Of The Electricity Supply Systems In Industries

    NASA Astrophysics Data System (ADS)

    Asset, Khabdullin; Zauresh, Khabdullina

    2014-12-01

    The study focuses on the climate considerations of electricity supply systems in the pellet industry. The developed analysis model consists of two modules: a statistical module for evaluating active power losses and a module for evaluating climate aspects. The statistical module is presented as a universal mathematical model of electrical systems and components of industrial load. It forms a basis for detailed accounting of power losses across voltage levels. On the basis of the universal model, a set of programs is designed to perform the calculations and experimental research. It helps to obtain the statistical characteristics of the power losses and loads of the electricity supply systems and to define the nature of changes in these characteristics. Within the module, several methods and algorithms are developed for calculating parameters of equivalent circuits of low- and high-voltage ADC and SD with a massive smooth rotor with laminated poles. The climate aspects module includes an analysis of the experimental data of the power supply system in pellet production. It allows identification of GHG emission reduction parameters: operating hours, type of electrical motors, load factor values and deviation from the standard voltage value.

  13. Validating the simulation of large-scale parallel applications using statistical characteristics

    DOE PAGES

    Zhang, Deli; Wilke, Jeremiah; Hendry, Gilbert; ...

    2016-03-01

    Simulation is a widely adopted method to analyze and predict the performance of large-scale parallel applications. Validating the hardware model is highly important for complex simulations with a large number of parameters. Common practice involves calculating the percent error between the projected and the real execution time of a benchmark program. However, in a high-dimensional parameter space, this coarse-grained approach often suffers from parameter insensitivity, which may not be known a priori. Moreover, the traditional approach cannot be applied to the validation of software models, such as application skeletons used in online simulations. In this work, we present a methodology and a toolset for validating both hardware and software models by quantitatively comparing fine-grained statistical characteristics obtained from execution traces. Although statistical information has been used in tasks like performance optimization, this is the first attempt to apply it to simulation validation. Lastly, our experimental results show that the proposed evaluation approach offers significant improvement in fidelity when compared to evaluation using total execution time, and the proposed metrics serve as reliable criteria that progress toward automating the simulation tuning process.

  14. Landslide susceptibility mapping using GIS-based statistical models and Remote sensing data in tropical environment

    PubMed Central

    Hashim, Mazlan

    2015-01-01

    This research presents the results of the GIS-based statistical models for generation of landslide susceptibility mapping using geographic information system (GIS) and remote-sensing data for Cameron Highlands area in Malaysia. Ten factors including slope, aspect, soil, lithology, NDVI, land cover, distance to drainage, precipitation, distance to fault, and distance to road were extracted from SAR data, SPOT 5 and WorldView-1 images. The relationships between the detected landslide locations and these ten related factors were identified by using GIS-based statistical models including analytical hierarchy process (AHP), weighted linear combination (WLC) and spatial multi-criteria evaluation (SMCE) models. The landslide inventory map which has a total of 92 landslide locations was created based on numerous resources such as digital aerial photographs, AIRSAR data, WorldView-1 images, and field surveys. Then, 80% of the landslide inventory was used for training the statistical models and the remaining 20% was used for validation purpose. The validation results using the Relative landslide density index (R-index) and Receiver operating characteristic (ROC) demonstrated that the SMCE model (accuracy is 96%) is better in prediction than AHP (accuracy is 91%) and WLC (accuracy is 89%) models. These landslide susceptibility maps would be useful for hazard mitigation purpose and regional planning. PMID:25898919

  15. Landslide susceptibility mapping using GIS-based statistical models and Remote sensing data in tropical environment.

    PubMed

    Shahabi, Himan; Hashim, Mazlan

    2015-04-22

    This research presents the results of the GIS-based statistical models for generation of landslide susceptibility mapping using geographic information system (GIS) and remote-sensing data for Cameron Highlands area in Malaysia. Ten factors including slope, aspect, soil, lithology, NDVI, land cover, distance to drainage, precipitation, distance to fault, and distance to road were extracted from SAR data, SPOT 5 and WorldView-1 images. The relationships between the detected landslide locations and these ten related factors were identified by using GIS-based statistical models including analytical hierarchy process (AHP), weighted linear combination (WLC) and spatial multi-criteria evaluation (SMCE) models. The landslide inventory map which has a total of 92 landslide locations was created based on numerous resources such as digital aerial photographs, AIRSAR data, WorldView-1 images, and field surveys. Then, 80% of the landslide inventory was used for training the statistical models and the remaining 20% was used for validation purpose. The validation results using the Relative landslide density index (R-index) and Receiver operating characteristic (ROC) demonstrated that the SMCE model (accuracy is 96%) is better in prediction than AHP (accuracy is 91%) and WLC (accuracy is 89%) models. These landslide susceptibility maps would be useful for hazard mitigation purpose and regional planning.

  16. Population models for passerine birds: structure, parameterization, and analysis

    USGS Publications Warehouse

    Noon, B.R.; Sauer, J.R.; McCullough, D.R.; Barrett, R.H.

    1992-01-01

    Population models have great potential as management tools, as they use information about the life history of a species to summarize estimates of fecundity and survival into a description of population change. Models provide a framework for projecting future populations, determining the effects of management decisions on future population dynamics, evaluating extinction probabilities, and addressing a variety of questions of ecological and evolutionary interest. Even when insufficient information exists to allow complete identification of the model, the modelling procedure is useful because it forces the investigator to consider the life history of the species when determining what parameters should be estimated from field studies and provides a context for evaluating the relative importance of demographic parameters. Models have been little used in the study of the population dynamics of passerine birds because of: (1) widespread misunderstandings of the model structures and parameterizations, (2) a lack of knowledge of life histories of many species, (3) difficulties in obtaining statistically reliable estimates of demographic parameters for most passerine species, and (4) confusion about functional relationships among demographic parameters. As a result, studies of passerine demography are often designed inappropriately and fail to provide essential data. We review appropriate models for passerine bird populations and illustrate their possible uses in evaluating the effects of management or other environmental influences on population dynamics. We identify parameters that must be estimated from field data, briefly review existing statistical methods for obtaining valid estimates, and evaluate the present status of knowledge of these parameters.
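
    As an illustration of the kind of matrix projection model discussed above, a simplified two-stage (juvenile/adult) model with hypothetical vital rates: the dominant eigenvalue gives the asymptotic growth rate and its eigenvector the stable stage distribution.

    ```python
    import numpy as np

    # simplified two-stage projection matrix for a hypothetical passerine:
    # first row holds per-capita fecundities, second row holds survival/transition rates
    f_juv, f_adult = 0.9, 1.8      # female fledglings per female surviving to census
    s_juv, s_adult = 0.3, 0.55     # first-year and adult survival

    A = np.array([[f_juv, f_adult],
                  [s_juv, s_adult]])

    eigvals, eigvecs = np.linalg.eig(A)
    lam = eigvals.real.max()                               # asymptotic growth rate
    stable_stage = np.abs(eigvecs[:, eigvals.real.argmax()])
    stable_stage /= stable_stage.sum()
    print("lambda =", round(lam, 3), " stable stage distribution =", np.round(stable_stage, 3))

    # project 10 years forward from an initial population vector
    n = np.array([50.0, 100.0])
    for _ in range(10):
        n = A @ n
    print("population after 10 years:", np.round(n, 1))
    ```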

  17. Sample Invariance of the Structural Equation Model and the Item Response Model: A Case Study.

    ERIC Educational Resources Information Center

    Breithaupt, Krista; Zumbo, Bruno D.

    2002-01-01

    Evaluated the sample invariance of item discrimination statistics in a case study using real data, responses of 10 random samples of 500 people to a depression scale. Results lend some support to the hypothesized superiority of a two-parameter item response model over the common form of structural equation modeling, at least when responses are…

  18. Fire and Smoke Model Evaluation Experiment (FASMEE): Modeling gaps and data needs

    Treesearch

    Yongqiang Liu; Adam Kochanski; Kirk Baker; Ruddy Mell; Rodman Linn; Ronan Paugam; Jan Mandel; Aime Fournier; Mary Ann Jenkins; Scott Goodrick; Gary Achtemeier; Andrew Hudak; Matthew Dickson; Brian Potter; Craig Clements; Shawn Urbanski; Roger Ottmar; Narasimhan Larkin; Timothy Brown; Nancy French; Susan Prichard; Adam Watts; Derek McNamara

    2017-01-01

    Fire and smoke models are numerical tools for simulating fire behavior, smoke dynamics, and air quality impacts of wildland fires. Fire models are developed based on the fundamental chemistry and physics of combustion and fire spread or statistical analysis of experimental data (Sullivan 2009). They provide information on fire spread and fuel consumption for safe and...

  19. Predictive analysis of beer quality by correlating sensory evaluation with higher alcohol and ester production using multivariate statistics methods.

    PubMed

    Dong, Jian-Jun; Li, Qing-Liang; Yin, Hua; Zhong, Cheng; Hao, Jun-Guang; Yang, Pan-Fei; Tian, Yu-Hong; Jia, Shi-Ru

    2014-10-15

    Sensory evaluation is regarded as a necessary procedure to ensure a reproducible quality of beer. Meanwhile, high-throughput analytical methods provide a powerful tool to analyse various flavour compounds, such as higher alcohols and esters. In this study, the relationship between flavour compounds and sensory evaluation was established by non-linear models such as partial least squares (PLS), genetic algorithm back-propagation neural network (GA-BP), and support vector machine (SVM). It was shown that SVM with a radial basis function (RBF) kernel achieved better prediction accuracy for both the calibration set (94.3%) and the validation set (96.2%) than the other models. Relatively lower prediction abilities were observed for GA-BP (52.1%) and PLS (31.7%). In addition, the choice of kernel function played an essential role in model training: the prediction accuracy of SVM with a polynomial kernel function was only 32.9%. As a powerful multivariate statistical method, SVM holds great potential to assess beer quality. Copyright © 2014 Elsevier Ltd. All rights reserved.
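
    A minimal sketch of an RBF-kernel SVM classifier with feature standardization and cross-validation in scikit-learn, run on synthetic data standing in for the flavour-compound matrix; the hyperparameters are illustrative, not those tuned in the study.

    ```python
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    # hypothetical flavour-compound matrix (e.g., higher alcohols, esters) and sensory class labels
    X = rng.normal(size=(120, 8))
    y = (X[:, 0] + 0.5 * X[:, 3] ** 2 + rng.normal(0, 0.5, 120) > 0.8).astype(int)

    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
    scores = cross_val_score(clf, X, y, cv=5)
    print("cross-validated accuracy:", scores.mean().round(3))
    ```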

  20. Comparative evaluation of spectroscopic models using different multivariate statistical tools in a multicancer scenario

    NASA Astrophysics Data System (ADS)

    Ghanate, A. D.; Kothiwale, S.; Singh, S. P.; Bertrand, Dominique; Krishna, C. Murali

    2011-02-01

    Cancer is now recognized as one of the major causes of morbidity and mortality. Histopathological diagnosis, the gold standard, is shown to be subjective, time consuming, prone to interobserver disagreement, and often fails to predict prognosis. Optical spectroscopic methods are being contemplated as adjuncts or alternatives to conventional cancer diagnostics. The most important aspect of these approaches is their objectivity, and multivariate statistical tools play a major role in realizing it. However, rigorous evaluation of the robustness of spectral models is a prerequisite. The utility of Raman spectroscopy in the diagnosis of cancers has been well established. Until now, the specificity and applicability of spectral models have been evaluated for specific cancer types. In this study, we have evaluated the utility of spectroscopic models representing normal and malignant tissues of the breast, cervix, colon, larynx, and oral cavity in a broader perspective, using different multivariate tests. The limit test, which was used in our earlier study, gave high sensitivity but suffered from poor specificity. The performance of other methods such as factorial discriminant analysis and partial least squares discriminant analysis is on par with more complex nonlinear methods such as decision trees, but they provide very little information about the classification model. This comparative study thus demonstrates not just the efficacy of Raman spectroscopic models but also the applicability and limitations of different multivariate tools for discrimination under complex conditions such as the multicancer scenario.

  1. An operational GLS model for hydrologic regression

    USGS Publications Warehouse

    Tasker, Gary D.; Stedinger, J.R.

    1989-01-01

    Recent Monte Carlo studies have documented the value of generalized least squares (GLS) procedures to estimate empirical relationships between streamflow statistics and physiographic basin characteristics. This paper presents a number of extensions of the GLS method that deal with realities and complexities of regional hydrologic data sets that were not addressed in the simulation studies. These extensions include: (1) a more realistic model of the underlying model errors; (2) smoothed estimates of cross correlation of flows; (3) procedures for including historical flow data; (4) diagnostic statistics describing leverage and influence for GLS regression; and (5) the formulation of a mathematical program for evaluating future gaging activities. ?? 1989.
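
    A sketch of a GLS regional regression in the spirit described above, using statsmodels with an assumed error covariance matrix (in the paper this matrix is built from record lengths and smoothed cross-correlations of flows); all basin characteristics and parameters here are hypothetical.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n_sites = 25

    # log-linear model: log(flood quantile) ~ log(drainage area) + log(basin slope), hypothetical data
    X = np.column_stack([np.log(rng.uniform(10, 2000, n_sites)),
                         np.log(rng.uniform(0.5, 20, n_sites))])
    X = sm.add_constant(X)
    beta = np.array([1.0, 0.75, 0.3])

    # assumed error covariance: sampling variance on the diagonal plus cross-correlated flows
    sampling_var = rng.uniform(0.02, 0.15, n_sites)
    Sigma = 0.3 * np.outer(np.sqrt(sampling_var), np.sqrt(sampling_var))
    np.fill_diagonal(Sigma, sampling_var + 0.05)          # add model-error variance on the diagonal
    y = X @ beta + rng.multivariate_normal(np.zeros(n_sites), Sigma)

    gls = sm.GLS(y, X, sigma=Sigma).fit()
    print(gls.params)
    ```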

  2. A Dynamic Intrusion Detection System Based on Multivariate Hotelling's T2 Statistics Approach for Network Environments

    PubMed Central

    Avalappampatty Sivasamy, Aneetha; Sundan, Bose

    2015-01-01

    The ever expanding communication requirements in today's world demand extensive and efficient network systems with equally efficient and reliable security features integrated for safe, confident, and secured communication and data transfer. Providing effective security protocols for any network environment, therefore, assumes paramount importance. Attempts are made continuously for designing more efficient and dynamic network intrusion detection models. In this work, an approach based on Hotelling's T2 method, a multivariate statistical analysis technique, has been employed for intrusion detection, especially in network environments. Components such as preprocessing, multivariate statistical analysis, and attack detection have been incorporated in developing the multivariate Hotelling's T2 statistical model and necessary profiles have been generated based on the T-square distance metrics. With a threshold range obtained using the central limit theorem, observed traffic profiles have been classified either as normal or attack types. Performance of the model, as evaluated through validation and testing using KDD Cup'99 dataset, has shown very high detection rates for all classes with low false alarm rates. Accuracy of the model presented in this work, in comparison with the existing models, has been found to be much better. PMID:26357668

  3. A Dynamic Intrusion Detection System Based on Multivariate Hotelling's T2 Statistics Approach for Network Environments.

    PubMed

    Sivasamy, Aneetha Avalappampatty; Sundan, Bose

    2015-01-01

    The ever expanding communication requirements in today's world demand extensive and efficient network systems with equally efficient and reliable security features integrated for safe, confident, and secured communication and data transfer. Providing effective security protocols for any network environment, therefore, assumes paramount importance. Attempts are made continuously for designing more efficient and dynamic network intrusion detection models. In this work, an approach based on Hotelling's T² method, a multivariate statistical analysis technique, has been employed for intrusion detection, especially in network environments. Components such as preprocessing, multivariate statistical analysis, and attack detection have been incorporated in developing the multivariate Hotelling's T² statistical model and necessary profiles have been generated based on the T-square distance metrics. With a threshold range obtained using the central limit theorem, observed traffic profiles have been classified either as normal or attack types. Performance of the model, as evaluated through validation and testing using KDD Cup'99 dataset, has shown very high detection rates for all classes with low false alarm rates. Accuracy of the model presented in this work, in comparison with the existing models, has been found to be much better.
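
    A minimal sketch of Hotelling's T² anomaly scoring: baseline traffic features define a mean and covariance, each new observation gets a T² (Mahalanobis-type) distance, and a large-sample chi-square control limit stands in for the paper's central-limit-theorem threshold range. The feature values are hypothetical.

    ```python
    import numpy as np
    from scipy.stats import chi2

    rng = np.random.default_rng(0)

    # baseline (normal) traffic features, e.g. packet rate, mean size, connection count (hypothetical)
    baseline = rng.normal([100, 500, 20], [10, 50, 4], size=(2000, 3))
    mu = baseline.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(baseline, rowvar=False))

    def t_square(x):
        d = x - mu
        return float(d @ S_inv @ d)

    threshold = chi2.ppf(0.999, df=3)          # large-sample control limit
    normal_obs = np.array([102, 510, 21])
    attack_obs = np.array([160, 300, 45])
    for x in (normal_obs, attack_obs):
        t2 = t_square(x)
        print(round(t2, 2), "ATTACK" if t2 > threshold else "normal")
    ```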

  4. Ultra-low-dose computed tomographic angiography with model-based iterative reconstruction compared with standard-dose imaging after endovascular aneurysm repair: a prospective pilot study.

    PubMed

    Naidu, Sailen G; Kriegshauser, J Scott; Paden, Robert G; He, Miao; Wu, Qing; Hara, Amy K

    2014-12-01

    An ultra-low-dose radiation protocol reconstructed with model-based iterative reconstruction was compared with our standard-dose protocol. This prospective study evaluated 20 men undergoing surveillance-enhanced computed tomography after endovascular aneurysm repair. All patients underwent standard-dose and ultra-low-dose venous phase imaging; images were compared after reconstruction with filtered back projection, adaptive statistical iterative reconstruction, and model-based iterative reconstruction. Objective measures of aortic contrast attenuation and image noise were averaged. Images were subjectively assessed (1 = worst, 5 = best) for diagnostic confidence, image noise, and vessel sharpness. Aneurysm sac diameter and endoleak detection were compared. Quantitative image noise was 26% less with ultra-low-dose model-based iterative reconstruction than with standard-dose adaptive statistical iterative reconstruction and 58% less than with ultra-low-dose adaptive statistical iterative reconstruction. Average subjective noise scores were not different between ultra-low-dose model-based iterative reconstruction and standard-dose adaptive statistical iterative reconstruction (3.8 vs. 4.0, P = .25). Subjective scores for diagnostic confidence were better with standard-dose adaptive statistical iterative reconstruction than with ultra-low-dose model-based iterative reconstruction (4.4 vs. 4.0, P = .002). Vessel sharpness was decreased with ultra-low-dose model-based iterative reconstruction compared with standard-dose adaptive statistical iterative reconstruction (3.3 vs. 4.1, P < .0001). Ultra-low-dose model-based iterative reconstruction and standard-dose adaptive statistical iterative reconstruction aneurysm sac diameters were not significantly different (4.9 vs. 4.9 cm); concordance for the presence of endoleak was 100% (P < .001). Compared with a standard-dose technique, an ultra-low-dose model-based iterative reconstruction protocol provides comparable image quality and diagnostic assessment at a 73% lower radiation dose.

  5. Scaling up the evaluation of psychotherapy: evaluating motivational interviewing fidelity via statistical text classification

    PubMed Central

    2014-01-01

    Background Behavioral interventions such as psychotherapy are leading, evidence-based practices for a variety of problems (e.g., substance abuse), but the evaluation of provider fidelity to behavioral interventions is limited by the need for human judgment. The current study evaluated the accuracy of statistical text classification in replicating human-based judgments of provider fidelity in one specific psychotherapy—motivational interviewing (MI). Method Participants (n = 148) came from five previously conducted randomized trials and were either primary care patients at a safety-net hospital or university students. To be eligible for the original studies, participants met criteria for either problematic drug or alcohol use. All participants received a type of brief motivational interview, an evidence-based intervention for alcohol and substance use disorders. The Motivational Interviewing Skills Code is a standard measure of MI provider fidelity based on human ratings that was used to evaluate all therapy sessions. A text classification approach called a labeled topic model was used to learn associations between human-based fidelity ratings and MI session transcripts. It was then used to generate codes for new sessions. The primary comparison was the accuracy of model-based codes with human-based codes. Results Receiver operating characteristic (ROC) analyses of model-based codes showed reasonably strong sensitivity and specificity with those from human raters (range of area under ROC curve (AUC) scores: 0.62 – 0.81; average AUC: 0.72). Agreement with human raters was evaluated based on talk turns as well as code tallies for an entire session. Generated codes had higher reliability with human codes for session tallies and also varied strongly by individual code. Conclusion To scale up the evaluation of behavioral interventions, technological solutions will be required. The current study demonstrated preliminary, encouraging findings regarding the utility of statistical text classification in bridging this methodological gap. PMID:24758152
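
    The ROC/AUC comparison above can be reproduced in outline with standard tooling. The snippet below is only a sketch under assumed inputs: binary human-assigned codes per talk turn and model-derived probabilities for the same turns (both arrays here are made up, not study data).

    ```python
    import numpy as np
    from sklearn.metrics import roc_auc_score, roc_curve

    human_codes = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])                        # hypothetical human ratings
    model_probs = np.array([0.8, 0.3, 0.6, 0.9, 0.4, 0.2, 0.7, 0.5, 0.45, 0.1])   # hypothetical model outputs

    auc = roc_auc_score(human_codes, model_probs)          # area under the ROC curve
    fpr, tpr, thresholds = roc_curve(human_codes, model_probs)
    print(f"AUC = {auc:.2f}")
    ```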

  6. Statistical modeling of 4D respiratory lung motion using diffeomorphic image registration.

    PubMed

    Ehrhardt, Jan; Werner, René; Schmidt-Richberg, Alexander; Handels, Heinz

    2011-02-01

    Modeling of respiratory motion has become increasingly important in various applications of medical imaging (e.g., radiation therapy of lung cancer). Current modeling approaches are usually confined to intra-patient registration of 3D image data representing the individual patient's anatomy at different breathing phases. We propose an approach to generate a mean motion model of the lung based on thoracic 4D computed tomography (CT) data of different patients to extend the motion modeling capabilities. Our modeling process consists of three steps: an intra-subject registration to generate subject-specific motion models, the generation of an average shape and intensity atlas of the lung as anatomical reference frame, and the registration of the subject-specific motion models to the atlas in order to build a statistical 4D mean motion model (4D-MMM). Furthermore, we present methods to adapt the 4D mean motion model to a patient-specific lung geometry. In all steps, a symmetric diffeomorphic nonlinear intensity-based registration method was employed. The Log-Euclidean framework was used to compute statistics on the diffeomorphic transformations. The presented methods are then used to build a mean motion model of respiratory lung motion using thoracic 4D CT data sets of 17 patients. We evaluate the model by applying it for estimating respiratory motion of ten lung cancer patients. The prediction is evaluated with respect to landmark and tumor motion, and the quantitative analysis results in a mean target registration error (TRE) of 3.3 ±1.6 mm if lung dynamics are not impaired by large lung tumors or other lung disorders (e.g., emphysema). With regard to lung tumor motion, we show that prediction accuracy is independent of tumor size and tumor motion amplitude in the considered data set. However, tumors adhering to non-lung structures degrade local lung dynamics significantly and the model-based prediction accuracy is lower in these cases. The statistical respiratory motion model is capable of providing valuable prior knowledge in many fields of applications. We present two examples of possible applications in radiation therapy and image guided diagnosis.
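
    The Log-Euclidean statistics mentioned above can be illustrated in a simplified form: take matrix logarithms of invertible transformations, average them, and map back with the matrix exponential. This is only a sketch on plain affine matrices; the paper works with diffeomorphic transformations (velocity fields), and the matrices below are hypothetical.

    ```python
    import numpy as np
    from scipy.linalg import logm, expm

    def log_euclidean_mean(transforms):
        """Log-Euclidean mean: average the matrix logarithms, then exponentiate."""
        logs = [logm(T) for T in transforms]
        return expm(np.mean(logs, axis=0)).real

    # Hypothetical homogeneous 2D affine transforms standing in for registered motion fields.
    T1 = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [0.0, 0.0, 1.0]])
    T2 = np.array([[1.1, 0.0, 1.0], [0.0, 0.9,  0.5], [0.0, 0.0, 1.0]])
    mean_T = log_euclidean_mean([T1, T2])
    ```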

  7. 10 CFR 431.17 - Determination of efficiency.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... characteristics of that basic model, and (ii) Based on engineering or statistical analysis, computer simulation or... simulation or modeling, and other analytic evaluation of performance data on which the AEDM is based... applied. (iii) If requested by the Department, the manufacturer shall conduct simulations to predict the...

  8. 10 CFR 431.17 - Determination of efficiency.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... characteristics of that basic model, and (ii) Based on engineering or statistical analysis, computer simulation or... simulation or modeling, and other analytic evaluation of performance data on which the AEDM is based... applied. (iii) If requested by the Department, the manufacturer shall conduct simulations to predict the...

  9. Spatiotemporal Variation in Distance Dependent Animal Movement Contacts: One Size Doesn’t Fit All

    PubMed Central

    Brommesson, Peter; Wennergren, Uno; Lindström, Tom

    2016-01-01

    The structure of contacts that mediate transmission has a pronounced effect on the outbreak dynamics of infectious disease, and simulation models are powerful tools to inform policy decisions. Most simulation models of livestock disease spread rely to some degree on predictions of animal movement between holdings. Typically, movements are more common between nearby farms than between those located far away from each other. Here, we assessed spatiotemporal variation in such distance dependence of animal movement contacts from an epidemiological perspective. We evaluated and compared nine statistical models, applied to Swedish movement data from 2008. The models differed in at what level (if at all) they accounted for regional and/or seasonal heterogeneities in the distance dependence of the contacts. Using a kernel approach to describe how the probability of contacts between farms changes with distance, we developed a hierarchical Bayesian framework and estimated parameters by using Markov Chain Monte Carlo techniques. We evaluated models by three different approaches to model selection. First, we used the Deviance Information Criterion to evaluate their performance relative to each other. Secondly, we estimated the log predictive posterior distribution, which was also used to evaluate their relative performance. Thirdly, we performed posterior predictive checks by simulating movements with each of the parameterized models and evaluated their ability to recapture relevant summary statistics. Independent of selection criteria, we found that accounting for regional heterogeneity improved model accuracy. We also found that accounting for seasonal heterogeneity was beneficial, in terms of model accuracy, according to two of three methods used for model selection. Our results have important implications for livestock disease spread models where movement is an important risk factor for between-farm transmission. We argue that modelers should refrain from using methods to simulate animal movements that assume the same pattern across all regions and seasons without explicitly testing for spatiotemporal variation. PMID:27760155
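
    The kernel idea described above can be sketched with a generic distance-decay function: the probability that a movement ends at a given farm is proportional to a kernel evaluated at the distance to that farm. The functional form and parameter values below are illustrative, not the fitted Swedish values.

    ```python
    import numpy as np

    def contact_kernel(d_km, d0=20.0, alpha=2.5):
        """Generalised distance kernel: contact weight decays with distance (km)."""
        return 1.0 / (1.0 + (d_km / d0) ** alpha)

    def destination_probabilities(distances_km):
        """Normalise kernel weights over candidate destination farms."""
        w = contact_kernel(np.asarray(distances_km, dtype=float))
        return w / w.sum()

    # Probability that a movement from one farm ends at each of four candidate farms.
    print(destination_probabilities([5.0, 15.0, 40.0, 120.0]))
    ```

    Regional or seasonal heterogeneity would then correspond to letting the kernel parameters (here d0 and alpha) vary by region or season.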

  10. Comparison of two landslide susceptibility assessments in the Champagne-Ardenne region (France)

    NASA Astrophysics Data System (ADS)

    Den Eeckhaut, M. Van; Marre, A.; Poesen, J.

    2010-02-01

    The vineyards of the Montagne de Reims are mostly planted on steep south-oriented cuesta fronts receiving a maximum of sun radiation. Due to the location of the vineyards on steep hillslopes, the viticultural activity is threatened by slope failures. This study attempts to better understand the spatial patterns of landslide susceptibility in the Champagne-Ardenne region by comparing a heuristic (qualitative) and a statistical (quantitative) model in a 1120 km² study area. The heuristic landslide susceptibility model was adopted from the Bureau de Recherches Géologiques et Minières, the GEGEAA - Reims University and the Comité Interprofessionnel du Vin de Champagne. In this model, expert knowledge of the region was used to assign weights to all slope classes and lithologies present in the area, but the final susceptibility map was never evaluated with the location of mapped landslides. For the statistical landslide susceptibility assessment, logistic regression was applied to a dataset of 291 'old' (Holocene) landslides. The robustness of the logistic regression model was evaluated and ROC curves were used for model calibration and validation. With regard to the variables assumed to be important environmental factors controlling landslides, the two models are in agreement. They both indicate that present and future landslides are mainly controlled by slope gradient and lithology. However, the comparison of the two landslide susceptibility maps through (1) an evaluation with the location of mapped 'old' landslides and through (2) a temporal validation with spatial data of 'recent' (1960-1999; n = 48) and 'very recent' (2000-2008; n = 46) landslides showed a better prediction capacity for the statistical model produced in this study compared to the heuristic model. In total, the statistically-derived landslide susceptibility map succeeded in correctly classifying 81.0% of the 'old' and 91.6% of the 'recent' and 'very recent' landslides. On the susceptibility map derived from the heuristic model, on the other hand, only 54.6% of the 'old' and 64.0% of the 'recent' and 'very recent' landslides were correctly classified as unstable. Hence, the landslide susceptibility map obtained from logistic regression is a better tool for regional landslide susceptibility analysis in the study area of the Montagne de Reims. The accurate classification of zones with very high and high susceptibility allows delineating zones where viticulturists should be informed and where implementation of precaution measures is needed to secure slope stability.

  11. Empirical Correction to the Likelihood Ratio Statistic for Structural Equation Modeling with Many Variables.

    PubMed

    Yuan, Ke-Hai; Tian, Yubin; Yanagihara, Hirokazu

    2015-06-01

    Survey data typically contain many variables. Structural equation modeling (SEM) is commonly used in analyzing such data. The most widely used statistic for evaluating the adequacy of an SEM model is T_ML, a slight modification to the likelihood ratio statistic. Under the normality assumption, T_ML approximately follows a chi-square distribution when the number of observations (N) is large and the number of items or variables (p) is small. However, in practice, p can be rather large while N is always limited due to not having enough participants. Even with a relatively large N, empirical results show that T_ML rejects the correct model too often when p is not too small. Various corrections to T_ML have been proposed, but they are mostly heuristic. Following the principle of the Bartlett correction, this paper proposes an empirical approach to correct T_ML so that the mean of the resulting statistic approximately equals the degrees of freedom of the nominal chi-square distribution. Results show that empirically corrected statistics follow the nominal chi-square distribution much more closely than previously proposed corrections to T_ML, and they control type I errors reasonably well whenever N ≥ max(50, 2p). The formulations of the empirically corrected statistics are further used to predict type I errors of T_ML as reported in the literature, and they perform well.
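
    The correction principle is simple to state: find a scaling factor so that the mean of the corrected statistic equals the nominal degrees of freedom. The sketch below illustrates that principle on simulated placeholder values of T_ML (inflated chi-square draws); it is not the paper's empirical formulation, which predicts the factor from N and p.

    ```python
    import numpy as np

    def empirical_correction_factor(t_ml_values, df):
        """Scale factor c such that mean(c * T_ML) equals the nominal degrees of freedom."""
        return df / np.mean(t_ml_values)

    rng = np.random.default_rng(1)
    df = 54
    t_ml_sim = 1.15 * rng.chisquare(df, size=2000)   # placeholder: inflated T_ML values under a correct model
    c = empirical_correction_factor(t_ml_sim, df)
    t_corrected = c * t_ml_sim                       # referred to the chi-square distribution with df degrees of freedom
    print(round(np.mean(t_corrected), 1))            # approximately equal to df
    ```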

  12. Collaborative Project: The problem of bias in defining uncertainty in computationally enabled strategies for data-driven climate model development. Final Technical Report.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huerta, Gabriel

    The objective of the project is to develop strategies for better representing scientific sensibilities within statistical measures of model skill that then can be used within a Bayesian statistical framework for data-driven climate model development and improved measures of model scientific uncertainty. One of the thorny issues in model evaluation is quantifying the effect of biases on climate projections. While no bias is desirable, only those biases that affect feedbacks affect scatter in climate projections. The effort at the University of Texas is to analyze previously calculated ensembles of CAM3.1 with perturbed parameters to discover how biases affect projections of global warming. The hypothesis is that compensating errors in the control model can be identified by their effect on a combination of processes and that developing metrics that are sensitive to dependencies among state variables would provide a way to select versions of climate models that may reduce scatter in climate projections. Gabriel Huerta at the University of New Mexico is responsible for developing statistical methods for evaluating these field dependencies. The UT effort will incorporate these developments into MECS, which is a set of Python scripts being developed at the University of Texas for managing the workflow associated with data-driven climate model development over HPC resources. This report reflects the main activities at the University of New Mexico, where the PI (Huerta) and the Postdocs (Nosedal, Hattab and Karki) worked on the project.

  13. Uncertainty in eddy covariance measurements and its application to physiological models

    Treesearch

    D.Y. Hollinger; A.D. Richardson; A.D. Richardson

    2005-01-01

    Flux data are noisy, and this uncertainty is largely due to random measurement error. Knowledge of uncertainty is essential for the statistical evaluation of modeled and measured fluxes, for comparison of parameters derived by fitting models to measured fluxes and in formal data-assimilation efforts. We used the difference between simultaneous measurements from two...

  14. An evaluation of 20th century climate for the Southeastern United States as simulated by Coupled Model Intercomparison Project Phase 5 (CMIP5) global climate models

    USGS Publications Warehouse

    David E. Rupp,

    2016-05-05

    The 20th century climate for the Southeastern United States and surrounding areas as simulated by global climate models used in the Coupled Model Intercomparison Project Phase 5 (CMIP5) was evaluated. A suite of statistics that characterize various aspects of the regional climate was calculated from both model simulations and observation-based datasets. CMIP5 global climate models were ranked by their ability to reproduce the observed climate. Differences in the performance of the models between regions of the United States (the Southeastern and Northwestern United States) warrant a regional-scale assessment of CMIP5 models.

  15. Model-based economic evaluation in Alzheimer's disease: a review of the methods available to model Alzheimer's disease progression.

    PubMed

    Green, Colin; Shearer, James; Ritchie, Craig W; Zajicek, John P

    2011-01-01

    To consider the methods available to model Alzheimer's disease (AD) progression over time to inform on the structure and development of model-based evaluations, and the future direction of modelling methods in AD. A systematic search of the health care literature was undertaken to identify methods to model disease progression in AD. Modelling methods are presented in a descriptive review. The literature search identified 42 studies presenting methods or applications of methods to model AD progression over time. The review identified 10 general modelling frameworks available to empirically model the progression of AD as part of a model-based evaluation. Seven of these general models are statistical models predicting progression of AD using a measure of cognitive function. The main concerns with the models relate to model structure, to the limited characterization of disease progression, and to the use of a limited number of health states to capture events related to disease progression over time. None of the available models have been able to present a comprehensive model of the natural history of AD. Although helpful, there are serious limitations in the methods available to model progression of AD over time. Advances are needed to better model the progression of AD and the effects of the disease on people's lives. Recent evidence supports the need for a multivariable approach to the modelling of AD progression, and indicates that a latent variable analytic approach to characterising AD progression is a promising avenue for advances in the statistical development of modelling methods. Copyright © 2011 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  16. A risk-based approach to management of leachables utilizing statistical analysis of extractables.

    PubMed

    Stults, Cheryl L M; Mikl, Jaromir; Whelehan, Oliver; Morrical, Bradley; Duffield, William; Nagao, Lee M

    2015-04-01

    To incorporate quality by design concepts into the management of leachables, an emphasis is often put on understanding the extractable profile for the materials of construction for manufacturing disposables, container-closure, or delivery systems. Component manufacturing processes may also impact the extractable profile. An approach was developed to (1) identify critical components that may be sources of leachables, (2) enable an understanding of manufacturing process factors that affect extractable profiles, (3) determine if quantitative models can be developed that predict the effect of those key factors, and (4) evaluate the practical impact of the key factors on the product. A risk evaluation for an inhalation product identified injection molding as a key process. Designed experiments were performed to evaluate the impact of molding process parameters on the extractable profile from an ABS inhaler component. Statistical analysis of the resulting GC chromatographic profiles identified processing factors that were correlated with peak levels in the extractable profiles. The combination of statistically significant molding process parameters was different for different types of extractable compounds. ANOVA models were used to obtain optimal process settings and predict extractable levels for a selected number of compounds. The proposed paradigm may be applied to evaluate the impact of material composition and processing parameters on extractable profiles and utilized to manage product leachables early in the development process and throughout the product lifecycle.
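
    As a sketch of the designed-experiment analysis described above, the snippet below fits a two-factor ANOVA model relating hypothetical molding parameters to a single extractable peak level. Factor names, levels, and responses are invented for illustration only.

    ```python
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # Hypothetical design: two molding factors, one GC peak-area response.
    df = pd.DataFrame({
        "melt_temp": ["low", "low", "high", "high"] * 2,
        "hold_time": ["short", "long", "short", "long"] * 2,
        "peak_area": [10.2, 11.5, 14.1, 16.3, 9.8, 11.9, 13.7, 16.8],
    })

    model = ols("peak_area ~ C(melt_temp) * C(hold_time)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))   # which process factors are statistically significant
    ```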

  17. A method for evaluating the importance of system state observations to model predictions, with application to the Death Valley regional groundwater flow system

    USGS Publications Warehouse

    Tiedeman, Claire; Ely, D. Matthew; Hill, Mary C.; O'Brien, Grady M.

    2004-01-01

    We develop a new observation‐prediction (OPR) statistic for evaluating the importance of system state observations to model predictions. The OPR statistic measures the change in prediction uncertainty produced when an observation is added to or removed from an existing monitoring network, and it can be used to guide refinement and enhancement of the network. Prediction uncertainty is approximated using a first‐order second‐moment method. We apply the OPR statistic to a model of the Death Valley regional groundwater flow system (DVRFS) to evaluate the importance of existing and potential hydraulic head observations to predicted advective transport paths in the saturated zone underlying Yucca Mountain and underground testing areas on the Nevada Test Site. Important existing observations tend to be far from the predicted paths, and many unimportant observations are in areas of high observation density. These results can be used to select locations at which increased observation accuracy would be beneficial and locations that could be removed from the network. Important potential observations are mostly in areas of high hydraulic gradient far from the paths. Results for both existing and potential observations are related to the flow system dynamics and coarse parameter zonation in the DVRFS model. If system properties in different locations are as similar as the zonation assumes, then the OPR results illustrate a data collection opportunity whereby observations in distant, high‐gradient areas can provide information about properties in flatter‐gradient areas near the paths. If this similarity is suspect, then the analysis produces a different type of data collection opportunity involving testing of model assumptions critical to the OPR results.
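
    Under a first-order (linearised) approximation, prediction variance can be written as z'(JᵀωJ)⁻¹z, where J holds observation sensitivities, ω the observation weights, and z the prediction sensitivities. The sketch below computes the percent change in prediction standard deviation when one observation is omitted, which is the flavour of statistic described above; all matrices are illustrative stand-ins, not DVRFS model quantities.

    ```python
    import numpy as np

    def prediction_sd(J, w, z, sigma2=1.0):
        """First-order prediction standard deviation: sqrt(z' (J' W J)^-1 z)."""
        JtWJ = J.T @ (w[:, None] * J)
        cov_params = sigma2 * np.linalg.inv(JtWJ)
        return float(np.sqrt(z @ cov_params @ z))

    def opr_omit(J, w, z, i):
        """Percent increase in prediction standard deviation when observation i is removed."""
        base = prediction_sd(J, w, z)
        keep = np.arange(J.shape[0]) != i
        return 100.0 * (prediction_sd(J[keep], w[keep], z) - base) / base

    rng = np.random.default_rng(2)
    J = rng.normal(size=(6, 3))        # illustrative Jacobian: 6 head observations, 3 parameters
    w = np.ones(6)                     # observation weights
    z = np.array([0.5, -1.0, 0.2])     # sensitivity of an advective-path prediction to the parameters
    print([round(opr_omit(J, w, z, i), 1) for i in range(6)])
    ```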

  18. A simulator for evaluating methods for the detection of lesion-deficit associations

    NASA Technical Reports Server (NTRS)

    Megalooikonomou, V.; Davatzikos, C.; Herskovits, E. H.

    2000-01-01

    Although much has been learned about the functional organization of the human brain through lesion-deficit analysis, the variety of statistical and image-processing methods developed for this purpose precludes a closed-form analysis of the statistical power of these systems. Therefore, we developed a lesion-deficit simulator (LDS), which generates artificial subjects, each of which consists of a set of functional deficits, and a brain image with lesions; the deficits and lesions conform to predefined distributions. We used probability distributions to model the number, sizes, and spatial distribution of lesions, to model the structure-function associations, and to model registration error. We used the LDS to evaluate, as examples, the effects of the complexities and strengths of lesion-deficit associations, and of registration error, on the power of lesion-deficit analysis. We measured the numbers of recovered associations from these simulated data, as a function of the number of subjects analyzed, the strengths and number of associations in the statistical model, the number of structures associated with a particular function, and the prior probabilities of structures being abnormal. The number of subjects required to recover the simulated lesion-deficit associations was found to have an inverse relationship to the strength of associations, and to the smallest probability in the structure-function model. The number of structures associated with a particular function (i.e., the complexity of associations) had a much greater effect on the performance of the analysis method than did the total number of associations. We also found that registration error of 5 mm or less reduces the number of associations discovered by approximately 13% compared to perfect registration. The LDS provides a flexible framework for evaluating many aspects of lesion-deficit analysis.

  19. Evaluation of statistically downscaled GCM output as input for hydrological and stream temperature simulation in the Apalachicola–Chattahoochee–Flint River Basin (1961–99)

    USGS Publications Warehouse

    Hay, Lauren E.; LaFontaine, Jacob H.; Markstrom, Steven

    2014-01-01

    The accuracy of statistically downscaled general circulation model (GCM) simulations of daily surface climate for historical conditions (1961–99) and the implications when they are used to drive hydrologic and stream temperature models were assessed for the Apalachicola–Chattahoochee–Flint River basin (ACFB). The ACFB is a 50 000 km² basin located in the southeastern United States. Three GCMs were statistically downscaled, using an asynchronous regional regression model (ARRM), to ⅛° grids of daily precipitation and minimum and maximum air temperature. These ARRM-based climate datasets were used as input to the Precipitation-Runoff Modeling System (PRMS), a deterministic, distributed-parameter, physical-process watershed model used to simulate and evaluate the effects of various combinations of climate and land use on watershed response. The ACFB was divided into 258 hydrologic response units (HRUs) in which the components of flow (groundwater, subsurface, and surface) are computed in response to climate, land surface, and subsurface characteristics of the basin. Daily simulations of flow components from PRMS were used with the climate to simulate in-stream water temperatures using the Stream Network Temperature (SNTemp) model, a mechanistic, one-dimensional heat transport model for branched stream networks. The climate, hydrology, and stream temperature for historical conditions were evaluated by comparing model outputs produced from historical climate forcings developed from gridded station data (GSD) versus those produced from the three statistically downscaled GCMs using the ARRM methodology. The PRMS and SNTemp models were forced with the GSD and the outputs produced were treated as “truth.” This allowed for a spatial comparison by HRU of the GSD-based output with ARRM-based output. Distributional similarities between GSD- and ARRM-based model outputs were compared using the two-sample Kolmogorov–Smirnov (KS) test in combination with descriptive metrics such as the mean and variance and an evaluation of rare and sustained events. In general, precipitation and streamflow quantities were negatively biased in the downscaled GCM outputs, and results indicate that the downscaled GCM simulations consistently underestimate the largest precipitation events relative to the GSD. The KS test results indicate that ARRM-based air temperatures are similar to GSD at the daily time step for the majority of the ACFB, with perhaps subweekly averaging for stream temperature. Depending on GCM and spatial location, ARRM-based precipitation and streamflow require averaging of up to 30 days to become similar to the GSD-based output. Evaluation of the model skill for historical conditions suggests some guidelines for use of future projections; while it seems correct to place greater confidence in evaluation metrics which perform well historically, this does not necessarily mean those metrics will accurately reflect model outputs for future climatic conditions. Results from this study indicate no “best” overall model, but the breadth of analysis can be used to give the product users an indication of the applicability of the results to address their particular problem. Since results for historical conditions indicate that model outputs can have significant biases associated with them, the range in future projections examined in terms of change relative to historical conditions for each individual GCM may be more appropriate.
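
    The distributional comparison described above reduces, per HRU and variable, to a two-sample Kolmogorov–Smirnov test between GSD-driven and downscaled-GCM-driven model output. The sketch below uses synthetic daily flow series as placeholders.

    ```python
    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(3)
    gsd_flow = rng.gamma(shape=2.0, scale=10.0, size=365)    # stand-in for GSD-driven daily streamflow
    arrm_flow = rng.gamma(shape=2.0, scale=8.5, size=365)    # stand-in for ARRM-driven daily streamflow

    stat, p_value = ks_2samp(gsd_flow, arrm_flow)
    print(f"KS statistic = {stat:.3f}, p = {p_value:.3f}")   # large p suggests similar distributions
    ```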

  20. An analysis of the cognitive deficit of schizophrenia based on the Piaget developmental theory.

    PubMed

    Torres, Alejandro; Olivares, Jose M; Rodriguez, Angel; Vaamonde, Antonio; Berrios, German E

    2007-01-01

    The objective of the study was to evaluate from the perspective of the Piaget developmental model the cognitive functioning of a sample of patients diagnosed with schizophrenia. Fifty patients with schizophrenia (Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition) and 40 healthy matched controls were evaluated by means of the Longeot Logical Thought Evaluation Scale. Only 6% of the subjects with schizophrenia reached the "formal period," and 70% remained at the "concrete operations" stage. The corresponding figures for the control sample were 25% and 15%, respectively. These differences were statistically significant. The samples were specifically differentiable on the permutation, probabilities, and pendulum tests of the scale. The Longeot Logical Thought Evaluation Scale can discriminate between subjects with schizophrenia and healthy controls.

  1. Detecting temporal change in freshwater fisheries surveys: statistical power and the important linkages between management questions and monitoring objectives

    USGS Publications Warehouse

    Wagner, Tyler; Irwin, Brian J.; James R. Bence,; Daniel B. Hayes,

    2016-01-01

    Monitoring to detect temporal trends in biological and habitat indices is a critical component of fisheries management. Thus, it is important that management objectives are linked to monitoring objectives. This linkage requires a definition of what constitutes a management-relevant “temporal trend.” It is also important to develop expectations for the amount of time required to detect a trend (i.e., statistical power) and for choosing an appropriate statistical model for analysis. We provide an overview of temporal trends commonly encountered in fisheries management, review published studies that evaluated statistical power of long-term trend detection, and illustrate dynamic linear models in a Bayesian context, as an additional analytical approach focused on shorter term change. We show that monitoring programs generally have low statistical power for detecting linear temporal trends and argue that often management should be focused on different definitions of trends, some of which can be better addressed by alternative analytical approaches.
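
    The power calculations reviewed above can be approximated by simulation: generate an index with a known linear trend plus noise, fit a regression, and count how often the trend is detected. The slope, noise level, and monitoring durations below are illustrative.

    ```python
    import numpy as np
    from scipy.stats import linregress

    def trend_power(slope, sigma, n_years, n_sims=2000, alpha=0.05, seed=4):
        """Fraction of simulated surveys in which a linear trend is detected (p < alpha)."""
        rng = np.random.default_rng(seed)
        years = np.arange(n_years)
        detected = 0
        for _ in range(n_sims):
            index = 10.0 + slope * years + rng.normal(0.0, sigma, n_years)
            if linregress(years, index).pvalue < alpha:
                detected += 1
        return detected / n_sims

    print(trend_power(slope=0.1, sigma=1.0, n_years=10))   # short series: typically low power
    print(trend_power(slope=0.1, sigma=1.0, n_years=25))   # power increases with series length
    ```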

  2. Multiple-Point statistics for stochastic modeling of aquifers, where do we stand?

    NASA Astrophysics Data System (ADS)

    Renard, P.; Julien, S.

    2017-12-01

    In the last 20 years, multiple-point statistics have been the focus of much research, with both successes and disappointments. The aim of this geostatistical approach was to integrate geological information into stochastic models of aquifer heterogeneity to better represent the connectivity of high- or low-permeability structures in the subsurface. Many different algorithms (ENESIM, SNESIM, SIMPAT, CCSIM, QUILTING, IMPALA, DEESSE, FILTERSIM, HYPPS, etc.) have been and are still proposed. They are all based on the concept of a training data set from which spatial statistics are derived and used in a further step to generate conditional realizations. Some of these algorithms evaluate the statistics of the spatial patterns for every pixel, while other techniques consider the statistics at the scale of a patch or a tile. While the method clearly succeeded in enabling modelers to generate realistic models, several issues are still the topic of debate both from a practical and theoretical point of view, and some issues such as training data set availability often hinder the application of the method in practical situations. In this talk, the aim is to present a review of the status of these approaches both from a theoretical and practical point of view using several examples at different scales (from pore network to regional aquifer).
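
    The common core of the pixel-based algorithms listed above can be sketched in a few lines: scan a categorical training image with a small template and tabulate the frequency of each pattern, which then informs conditional simulation. The training image below is a toy binary array, not a real geological analogue.

    ```python
    import numpy as np
    from collections import Counter

    def pattern_counts(training_image, size=2):
        """Count every size x size categorical pattern occurring in a training image."""
        counts = Counter()
        rows, cols = training_image.shape
        for i in range(rows - size + 1):
            for j in range(cols - size + 1):
                counts[tuple(training_image[i:i + size, j:j + size].ravel())] += 1
        return counts

    ti = np.array([
        [0, 0, 1, 1, 0],
        [0, 1, 1, 0, 0],
        [1, 1, 0, 0, 0],
        [1, 0, 0, 0, 1],
    ])  # toy training image (1 = channel facies, 0 = background)
    print(pattern_counts(ti).most_common(3))
    ```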

  3. Nested Structural Equation Models: Noncentrality and Power of Restriction Test.

    ERIC Educational Resources Information Center

    Raykov, Tenko; Penev, Spiridon

    1998-01-01

    Discusses the difference in noncentrality parameters of nested structural equation models and their utility in evaluating statistical power associated with the pertinent restriction test. Asymptotic confidence intervals for that difference are presented. These intervals represent a useful adjunct to goodness-of-fit indexes in assessing constraints…

  4. Mapping the Classroom Emotional Environment

    ERIC Educational Resources Information Center

    Harvey, Shane T.; Bimler, David; Evans, Ian M.; Kirkland, John; Pechtel, Pia

    2012-01-01

    Harvey and Evans (2003) have proposed that teachers' emotional skills, as required in the classroom, can be organized into a five-dimensional model. Further research is necessary to validate this model and evaluate the importance of each dimension of teacher emotion competence for educational practice. Using a statistical method for mapping…

  5. Investigating the American Time Use Survey from an Exposure Modeling Perspective

    EPA Science Inventory

    This paper describes an evaluation of the U.S. Bureau of Labor Statistics' American Time Use Survey (ATUS) for potential use in modeling human exposures to environmental pollutants. The ATUS is a large, on-going, cross-sectional survey of where Americans spend time and what activ...

  6. Statistically significant relational data mining :

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berry, Jonathan W.; Leung, Vitus Joseph; Phillips, Cynthia Ann

    This report summarizes the work performed under the project "Statistically significant relational data mining." The goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concentrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second is a set of statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor these models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.

  7. PET image reconstruction: a robust state space approach.

    PubMed

    Liu, Huafeng; Tian, Yi; Shi, Pengcheng

    2005-01-01

    Statistical iterative reconstruction algorithms have shown improved image quality over conventional nonstatistical methods in PET by using accurate system response models and measurement noise models. Strictly speaking, however, PET measurements, pre-corrected for accidental coincidences, are neither Poisson nor Gaussian distributed and thus do not meet basic assumptions of these algorithms. In addition, the difficulty in determining the proper system response model also greatly affects the quality of the reconstructed images. In this paper, we explore the use of state space principles for the estimation of the activity map in tomographic PET imaging. The proposed strategy formulates the organ activity distribution through tracer kinetics models, and the photon-counting measurements through observation equations, thus making it possible to unify the dynamic and static reconstruction problems into a general framework. Further, it coherently treats the uncertainties of the statistical model of the imaging system and the noisy nature of measurement data. Since the H∞ filter seeks minimum–maximum-error (minimax) estimates without any assumptions on the system and data noise statistics, it is particularly suited for PET image reconstruction, where the statistical properties of measurement data and the system model are very complicated. The performance of the proposed framework is evaluated using Shepp-Logan simulated phantom data and real phantom data with favorable results.

  8. Statistical Systems with Z

    NASA Astrophysics Data System (ADS)

    William, Peter

    In this dissertation, several two-dimensional statistical systems exhibiting discrete Z(n) symmetries are studied. For this purpose, a newly developed algorithm to compute the partition function of these models exactly is utilized. The zeros of the partition function are examined in order to obtain information about the observable quantities at the critical point. This occurs in the form of critical exponents of the order parameters which characterize phenomena at the critical point. The correlation length exponent is found to agree very well with those computed from strong coupling expansions for the mass gap and with Monte Carlo results. In Feynman's path integral formalism, the partition function of a statistical system can be related to the vacuum expectation value of the time ordered product of the observable quantities of the corresponding field theoretic model. Hence a generalization of ordinary scale invariance in the form of conformal invariance is focussed upon. This principle is particularly applicable to two-dimensional statistical models undergoing second-order phase transitions at criticality. The conformal anomaly specifies the universality class to which these models belong. From an evaluation of the partition function, the free energy at criticality is computed to determine the conformal anomaly of these models. The conformal anomaly for all the models considered here is in good agreement with the predicted values.

  9. Statistical Mechanics of Coherent Ising Machine — The Case of Ferromagnetic and Finite-Loading Hopfield Models —

    NASA Astrophysics Data System (ADS)

    Aonishi, Toru; Mimura, Kazushi; Utsunomiya, Shoko; Okada, Masato; Yamamoto, Yoshihisa

    2017-10-01

    The coherent Ising machine (CIM) has attracted attention as one of the most effective Ising computing architectures for solving large scale optimization problems because of its scalability and high-speed computational ability. However, it is difficult to implement the Ising computation in the CIM because the theories and techniques of classical thermodynamic equilibrium Ising spin systems cannot be directly applied to the CIM. This means we have to adapt these theories and techniques to the CIM. Here we focus on a ferromagnetic model and a finite loading Hopfield model, which are canonical models sharing a common mathematical structure with almost all other Ising models. We derive macroscopic equations to capture nonequilibrium phase transitions in these models. The statistical mechanical methods developed here constitute a basis for constructing evaluation methods for other Ising computation models.

  10. Comments of statistical issue in numerical modeling for underground nuclear test monitoring

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nicholson, W.L.; Anderson, K.K.

    1993-03-01

    The Symposium concluded with prepared summaries by four experts in the involved disciplines. These experts made no mention of statistics and/or the statistical content of issues. The first author contributed an extemporaneous statement at the Symposium because there are important issues associated with conducting and evaluating numerical modeling that are familiar to statisticians and often treated successfully by them. This note expands upon these extemporaneous remarks. Statistical ideas may be helpful in resolving some numerical modeling issues. Specifically, we comment first on the role of statistical design/analysis in the quantification process to answer the question "what do we know about the numerical modeling of underground nuclear tests?" and second on the peculiar nature of uncertainty analysis for situations involving numerical modeling. The simulations described in the workshop, though associated with topic areas, were basically sets of examples. Each simulation was tuned towards agreeing with either empirical evidence or an expert's opinion of what empirical evidence would be. While the discussions were reasonable, whether the embellishments were correct or a forced fitting of reality is unclear and illustrates that "simulation is easy." We also suggest that these examples of simulation are typical and the questions concerning the legitimacy and the role of knowing the reality are fair, in general, with respect to simulation. The answers will help us understand why "prediction is difficult."

  11. The MSFC UNIVAC 1108 EXEC 8 simulation model

    NASA Technical Reports Server (NTRS)

    Williams, T. G.; Richards, F. M.; Weatherbee, J. E.; Paul, L. K.

    1972-01-01

    A model is presented which simulates the MSFC Univac 1108 multiprocessor system. The hardware/operating system is described to enable a good statistical measurement of the system behavior. The performance of the 1108 is evaluated by performing twenty-four different experiments designed to locate system bottlenecks and also to test the sensitivity of system throughput with respect to perturbation of the various Exec 8 scheduling algorithms. The model is implemented in the general purpose system simulation language and the techniques described can be used to assist in the design, development, and evaluation of multiprocessor systems.

  12. Causal modelling applied to the risk assessment of a wastewater discharge.

    PubMed

    Paul, Warren L; Rokahr, Pat A; Webb, Jeff M; Rees, Gavin N; Clune, Tim S

    2016-03-01

    Bayesian networks (BNs), or causal Bayesian networks, have become quite popular in ecological risk assessment and natural resource management because of their utility as a communication and decision-support tool. Since their development in the field of artificial intelligence in the 1980s, however, Bayesian networks have evolved and merged with structural equation modelling (SEM). Unlike BNs, which are constrained to encode causal knowledge in conditional probability tables, SEMs encode this knowledge in structural equations, which is thought to be a more natural language for expressing causal information. This merger has clarified the causal content of SEMs and generalised the method such that it can now be performed using standard statistical techniques. As it was with BNs, the utility of this new generation of SEM in ecological risk assessment will need to be demonstrated with examples to foster an understanding and acceptance of the method. Here, we applied SEM to the risk assessment of a wastewater discharge to a stream, with a particular focus on the process of translating a causal diagram (conceptual model) into a statistical model which might then be used in the decision-making and evaluation stages of the risk assessment. The process of building and testing a spatial causal model is demonstrated using data from a spatial sampling design, and the implications of the resulting model are discussed in terms of the risk assessment. It is argued that a spatiotemporal causal model would have greater external validity than the spatial model, enabling broader generalisations to be made regarding the impact of a discharge, and greater value as a tool for evaluating the effects of potential treatment plant upgrades. Suggestions are made on how the causal model could be augmented to include temporal as well as spatial information, including suggestions for appropriate statistical models and analyses.

  13. Variational stereo imaging of oceanic waves with statistical constraints.

    PubMed

    Gallego, Guillermo; Yezzi, Anthony; Fedele, Francesco; Benetazzo, Alvise

    2013-11-01

    An image processing observational technique for the stereoscopic reconstruction of the waveform of oceanic sea states is developed. The technique incorporates the enforcement of any given statistical wave law modeling the quasi-Gaussianity of oceanic waves observed in nature. The problem is posed in a variational optimization framework, where the desired waveform is obtained as the minimizer of a cost functional that combines image observations, smoothness priors and a weak statistical constraint. The minimizer is obtained by combining gradient descent and multigrid methods on the necessary optimality equations of the cost functional. Robust photometric error criteria and a spatial intensity compensation model are also developed to improve the performance of the presented image matching strategy. The weak statistical constraint is thoroughly evaluated in combination with other elements presented to reconstruct and enforce constraints on experimental stereo data, demonstrating the improvement in the estimation of the observed ocean surface.

  14. Local sensitivity analysis for inverse problems solved by singular value decomposition

    USGS Publications Warehouse

    Hill, M.C.; Nolan, B.T.

    2010-01-01

    Local sensitivity analysis provides computationally frugal ways to evaluate models commonly used for resource management, risk assessment, and so on. This includes diagnosing inverse model convergence problems caused by parameter insensitivity and(or) parameter interdependence (correlation), understanding what aspects of the model and data contribute to measures of uncertainty, and identifying new data likely to reduce model uncertainty. Here, we consider sensitivity statistics relevant to models in which the process model parameters are transformed using singular value decomposition (SVD) to create SVD parameters for model calibration. The statistics considered include the PEST identifiability statistic, and combined use of the process-model parameter statistics composite scaled sensitivities and parameter correlation coefficients (CSS and PCC). The statistics are complementary in that the identifiability statistic integrates the effects of parameter sensitivity and interdependence, while CSS and PCC provide individual measures of sensitivity and interdependence. PCC quantifies correlations between pairs or larger sets of parameters; when a set of parameters is intercorrelated, the absolute value of PCC is close to 1.00 for all pairs in the set. The number of singular vectors to include in the calculation of the identifiability statistic is somewhat subjective and influences the statistic. To demonstrate the statistics, we use the USDA’s Root Zone Water Quality Model to simulate nitrogen fate and transport in the unsaturated zone of the Merced River Basin, CA. There are 16 log-transformed process-model parameters, including water content at field capacity (WFC) and bulk density (BD) for each of five soil layers. Calibration data consisted of 1,670 observations comprising soil moisture, soil water tension, aqueous nitrate and bromide concentrations, soil nitrate concentration, and organic matter content. All 16 of the SVD parameters could be estimated by regression based on the range of singular values. Identifiability statistic results varied based on the number of SVD parameters included. Identifiability statistics calculated for four SVD parameters indicate the same three most important process-model parameters as CSS/PCC (WFC1, WFC2, and BD2), but the order differed. Additionally, the identifiability statistic showed that BD1 was almost as dominant as WFC1. The CSS/PCC analysis showed that this results from its high correlation with WFC1 (-0.94), and not its individual sensitivity. Such distinctions, combined with analysis of how high correlations and(or) sensitivities result from the constructed model, can produce important insights into, for example, the use of sensitivity analysis to design monitoring networks. In conclusion, the statistics considered identified similar important parameters. They differ because (1) CSS/PCC can be more awkward to use because sensitivity and interdependence are considered separately, and (2) the identifiability statistic requires consideration of how many SVD parameters to include. A continuing challenge is to understand how these computationally efficient methods compare with computationally demanding global methods like Markov-Chain Monte Carlo given common nonlinear processes and the often even more nonlinear models.
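
    The two process-model statistics can be written compactly from a weighted Jacobian. A minimal sketch, assuming illustrative sensitivities rather than the Merced River Basin model's: composite scaled sensitivity is CSS_j = sqrt((1/n) Σ_i (J_ij b_j √w_i)²), and parameter correlation coefficients come from the (JᵀωJ)⁻¹ parameter covariance matrix.

    ```python
    import numpy as np

    def composite_scaled_sensitivity(J, params, weights):
        """CSS_j = sqrt( (1/n) * sum_i (J_ij * b_j * sqrt(w_i))^2 )."""
        n = J.shape[0]
        scaled = J * params[None, :] * np.sqrt(weights)[:, None]
        return np.sqrt((scaled ** 2).sum(axis=0) / n)

    def parameter_correlations(J, weights):
        """Correlation matrix derived from the (J' W J)^-1 parameter covariance."""
        cov = np.linalg.inv(J.T @ (weights[:, None] * J))
        sd = np.sqrt(np.diag(cov))
        return cov / np.outer(sd, sd)

    rng = np.random.default_rng(5)
    J = rng.normal(size=(30, 4))              # illustrative Jacobian: 30 observations, 4 parameters
    params = np.array([0.3, 1.4, 0.8, 2.0])   # illustrative parameter values
    weights = np.ones(30)
    print(composite_scaled_sensitivity(J, params, weights))
    print(np.round(parameter_correlations(J, weights), 2))
    ```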

  15. A statistical learning framework for groundwater nitrate models of the Central Valley, California, USA

    USGS Publications Warehouse

    Nolan, Bernard T.; Fienen, Michael N.; Lorenz, David L.

    2015-01-01

    We used a statistical learning framework to evaluate the ability of three machine-learning methods to predict nitrate concentration in shallow groundwater of the Central Valley, California: boosted regression trees (BRT), artificial neural networks (ANN), and Bayesian networks (BN). Machine learning methods can learn complex patterns in the data but because of overfitting may not generalize well to new data. The statistical learning framework involves cross-validation (CV) training and testing data and a separate hold-out data set for model evaluation, with the goal of optimizing predictive performance by controlling for model overfit. The order of prediction performance according to both CV testing R2 and that for the hold-out data set was BRT > BN > ANN. For each method we identified two models based on CV testing results: that with maximum testing R2 and a version with R2 within one standard error of the maximum (the 1SE model). The former yielded CV training R2 values of 0.94–1.0. Cross-validation testing R2 values indicate predictive performance, and these were 0.22–0.39 for the maximum R2 models and 0.19–0.36 for the 1SE models. Evaluation with hold-out data suggested that the 1SE BRT and ANN models predicted better for an independent data set compared with the maximum R2 versions, which is relevant to extrapolation by mapping. Scatterplots of predicted vs. observed hold-out data obtained for final models helped identify prediction bias, which was fairly pronounced for ANN and BN. Lastly, the models were compared with multiple linear regression (MLR) and a previous random forest regression (RFR) model. Whereas BRT results were comparable to RFR, MLR had low hold-out R2 (0.07) and explained less than half the variation in the training data. Spatial patterns of predictions by the final, 1SE BRT model agreed reasonably well with previously observed patterns of nitrate occurrence in groundwater of the Central Valley.
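
    The "1SE" selection described above keeps the simplest model whose cross-validated score is within one standard error of the best score. The sketch below applies that rule to a boosted-tree regressor over an illustrative tuning grid; the data are synthetic stand-ins, not the Central Valley dataset.

    ```python
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(6)
    X = rng.normal(size=(300, 5))                          # stand-in predictor variables
    y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 1, 300)    # stand-in nitrate response

    grid = [50, 100, 200, 400]                             # candidate numbers of trees (simplest first)
    means, ses = [], []
    for n_trees in grid:
        scores = cross_val_score(GradientBoostingRegressor(n_estimators=n_trees, random_state=0),
                                 X, y, cv=5, scoring="r2")
        means.append(scores.mean())
        ses.append(scores.std(ddof=1) / np.sqrt(len(scores)))

    best = int(np.argmax(means))
    # 1SE rule: simplest model whose mean CV R^2 is within one SE of the best model's.
    one_se = next(i for i in range(len(grid)) if means[i] >= means[best] - ses[best])
    print("max-R2 model:", grid[best], " 1SE model:", grid[one_se])
    ```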

  16. Dynamic evaluation of two decades of WRF-CMAQ ozone ...

    EPA Pesticide Factsheets

    Dynamic evaluation of the fully coupled Weather Research and Forecasting (WRF)– Community Multi-scale Air Quality (CMAQ) model ozone simulations over the contiguous United States (CONUS) using two decades of simulations covering the period from 1990 to 2010 is conducted to assess how well the changes in observed ozone air quality are simulated by the model. The changes induced by variations in meteorology and/or emissions are also evaluated during the same timeframe using spectral decomposition of observed and modeled ozone time series with the aim of identifying the underlying forcing mechanisms that control ozone exceedances and making informed recommendations for the optimal use of regional-scale air quality models. The evaluation is focused on the warm season's (i.e., May–September) daily maximum 8-hr (DM8HR) ozone concentrations, the 4th highest (4th) and average of top 10 DM8HR ozone values (top10), as well as the spectrally-decomposed components of the DM8HR ozone time series using the Kolmogorov-Zurbenko (KZ) filter. Results of the dynamic evaluation are presented for six regions in the U.S., consistent with the National Oceanic and Atmospheric Administration (NOAA) climatic regions. During the earlier 11-yr period (1990–2000), the simulated and observed trends are not statistically significant. During the more recent 2000–2010 period, all trends are statistically significant and WRF-CMAQ captures the observed trend in most regions. Given large n
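
    The spectral decomposition mentioned above rests on the Kolmogorov-Zurbenko (KZ) filter, an iterated centred moving average KZ(m, k). The sketch below separates a synthetic daily ozone series into a baseline and a short-term component; the window and iteration counts are a commonly used choice, and the data are synthetic.

    ```python
    import numpy as np

    def kz_filter(series, window, iterations):
        """KZ(m, k): apply a centred moving average of length `window`, `iterations` times."""
        x = np.asarray(series, dtype=float)
        kernel = np.ones(window) / window
        for _ in range(iterations):
            x = np.convolve(x, kernel, mode="same")   # simple edge handling, adequate for a sketch
        return x

    rng = np.random.default_rng(7)
    days = np.arange(365)
    ozone = 45 + 10 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 8, days.size)  # synthetic DM8HR ozone
    baseline = kz_filter(ozone, window=15, iterations=5)   # long-term plus seasonal component
    short_term = ozone - baseline                          # synoptic/weather-driven component
    ```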

  17. Using Algal Metrics and Biomass to Evaluate Multiple Ways of Defining Concentration-Based Nutrient Criteria in Streams and their Ecological Relevance

    EPA Science Inventory

    We examined the utility of nutrient criteria derived solely from total phosphorus (TP) concentrations in streams (regression models and percentile distributions) and evaluated their ecological relevance to diatom and algal biomass responses. We used a variety of statistics to cha...

  18. Rage against the Machine: Evaluation Metrics in the 21st Century

    ERIC Educational Resources Information Center

    Yang, Charles

    2017-01-01

    I review the classic literature in generative grammar and Marr's three-level program for cognitive science to defend the Evaluation Metric as a psychological theory of language learning. Focusing on well-established facts of language variation, change, and use, I argue that optimal statistical principles embodied in Bayesian inference models are…

  19. Using Rasch Analysis to Identify Uncharacteristic Responses to Undergraduate Assessments

    ERIC Educational Resources Information Center

    Edwards, Antony; Alcock, Lara

    2010-01-01

    Rasch Analysis is a statistical technique that is commonly used to analyse both test data and Likert survey data, to construct and evaluate question item banks, and to evaluate change in longitudinal studies. In this article, we introduce the dichotomous Rasch model, briefly discussing its assumptions. Then, using data collected in an…
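
    For reference, the dichotomous Rasch model named above gives the probability of a correct response as a logistic function of the difference between person ability θ_n and item difficulty b_i (this is the standard formulation, not details from the truncated study description):

    ```latex
    P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}
    ```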

  20. Meta-analysis of diagnostic test data: a bivariate Bayesian modeling approach.

    PubMed

    Verde, Pablo E

    2010-12-30

    In recent decades, the amount of published results on clinical diagnostic tests has expanded very rapidly. The counterpart to this development has been the formal evaluation and synthesis of diagnostic results. However, published results present substantial heterogeneity and can be regarded as so far removed from the classical domain of meta-analysis that they provide a rather severe test of classical statistical methods. Recently, bivariate random effects meta-analytic methods, which model the pairs of sensitivities and specificities, have been presented from the classical point of view. In this work, a bivariate Bayesian modeling approach is presented. This approach substantially extends the scope of classical bivariate methods by allowing the structural distribution of the random effects to depend on multiple sources of variability. Meta-analysis is summarized by the predictive posterior distributions for sensitivity and specificity. This new approach also allows substantial model checking, model diagnostics, and model selection. Statistical computations are implemented in the public domain statistical software (WinBUGS and R) and illustrated with real data examples. Copyright © 2010 John Wiley & Sons, Ltd.
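
    As a reference point, the bivariate random-effects structure that such methods build on places a joint normal distribution on the study-specific logit sensitivities and specificities; a typical simplified form (the Bayesian approach adds priors and further variance components) is:

    ```latex
    \begin{pmatrix} \operatorname{logit}(Se_i) \\ \operatorname{logit}(Sp_i) \end{pmatrix}
    \sim \mathcal{N}\!\left(
    \begin{pmatrix} \mu_{Se} \\ \mu_{Sp} \end{pmatrix},
    \begin{pmatrix} \sigma_{Se}^2 & \rho\,\sigma_{Se}\sigma_{Sp} \\ \rho\,\sigma_{Se}\sigma_{Sp} & \sigma_{Sp}^2 \end{pmatrix}
    \right)
    ```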

  1. Effect of chitosan-N-acetylcysteine conjugate in a mouse model of botulinum toxin B-induced dry eye.

    PubMed

    Hongyok, Teeravee; Chae, Jemin J; Shin, Young Joo; Na, Daero; Li, Li; Chuck, Roy S

    2009-04-01

    To evaluate the effect of a thiolated polymer lubricant, chitosan-N-acetylcysteine conjugate (C-NAC), in a mouse model of dry eye. Eye drops containing 0.5% C-NAC, 0.3% C-NAC, a vehicle (control group), artificial tears, or fluorometholone were applied in a masked fashion in a mouse model of induced dry eye from 3 days to 4 weeks after botulinum toxin B injection. Corneal fluorescein staining was periodically recorded. Real-time reverse transcriptase-polymerase chain reaction and immunofluorescence staining were performed at the end of the study to evaluate inflammatory cytokine expression. Mice treated with C-NAC, 0.5%, and fluorometholone showed a downward trend in corneal staining compared with the other groups, although this trend was not statistically significant. Chitosan-NAC formulations, fluorometholone, and artificial tears significantly decreased IL-1beta (interleukin 1beta), IL-10, IL-12alpha, and tumor necrosis factor alpha expression in ocular surface tissues. The botulinum toxin B-induced dry eye mouse model is potentially useful in evaluating new dry eye treatments. Evaluation of important molecular biomarkers suggests that C-NAC may impart some protective ocular surface properties. However, clinical data did not indicate statistically significant improvement of tear production and corneal staining in any of the groups tested. Topically applied C-NAC might protect the ocular surface in dry eye syndrome, as evidenced by decreased inflammatory cytokine expression.

  2. A Model for Indexing Medical Documents Combining Statistical and Symbolic Knowledge.

    PubMed Central

    Avillach, Paul; Joubert, Michel; Fieschi, Marius

    2007-01-01

    OBJECTIVES: To develop and evaluate an information processing method based on terminologies, in order to index medical documents in any given documentary context. METHODS: We designed a model using both symbolic general knowledge extracted from the Unified Medical Language System (UMLS) and statistical knowledge extracted from a domain of application. Using statistical knowledge allowed us to contextualize the general knowledge for every particular situation. For each document studied, the extracted terms are ranked to highlight the most significant ones. The model was tested on a set of 17,079 French standardized discharge summaries (SDSs). RESULTS: The most important ICD-10 term of each SDS was ranked 1st or 2nd by the method in nearly 90% of the cases. CONCLUSIONS: The use of several terminologies leads to more precise indexing. The improvement achieved in the model’s implementation performance as a result of using semantic relationships is encouraging. PMID:18693792

  3. Promoting the Development of Preschool Children's Emergent Literacy Skills: A Randomized Evaluation of a Literacy-Focused Curriculum and Two Professional Development Models

    ERIC Educational Resources Information Center

    Lonigan, Christopher J.; Farver, JoAnn M.; Phillips, Beth M.; Clancy-Menchetti, Jeanine

    2011-01-01

    To date, there have been few causally interpretable evaluations of the impacts of preschool curricula on the skills of children at-risk for academic difficulties, and even fewer studies have demonstrated statistically significant or educationally meaningful effects. In this cluster-randomized study, we evaluated the impacts of a literacy-focused…

  4. A statistical model of brittle fracture by transgranular cleavage

    NASA Astrophysics Data System (ADS)

    Lin, Tsann; Evans, A. G.; Ritchie, R. O.

    A model for brittle fracture by transgranular cleavage cracking is presented based on the application of weakest link statistics to the critical microstructural fracture mechanisms. The model permits prediction of the macroscopic fracture toughness, K_Ic, in single phase microstructures containing a known distribution of particles, and defines the critical distance from the crack tip at which the initial cracking event is most probable. The model is developed for unstable fracture ahead of a sharp crack considering both linear elastic and nonlinear elastic ("elastic/plastic") crack tip stress fields. Predictions are evaluated by comparison with experimental results on the low temperature flow and fracture behavior of a low carbon mild steel with a simple ferrite/grain boundary carbide microstructure.
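
    A toy Monte Carlo reading of the weakest-link idea, with entirely hypothetical constants: cracked carbide sizes are sampled in the process zone, each is converted to a Griffith-type local fracture stress, and each microstructural realization is assumed to fail at its weakest site. The authors' model additionally couples this to the crack-tip stress field, which is omitted here.

        import numpy as np

        rng = np.random.default_rng(4)

        # Hypothetical material constants and particle statistics (illustrative only).
        E = 210e9             # Young's modulus, Pa
        gamma = 14.0          # effective surface energy of the carbide/ferrite system, J/m^2
        n_particles = 500     # cracked carbides sampled in the process zone
        n_samples = 5000      # Monte Carlo realizations of the microstructure

        # Carbide thicknesses (m) drawn from a log-normal size distribution.
        a = rng.lognormal(mean=np.log(0.5e-6), sigma=0.5, size=(n_samples, n_particles))

        # Griffith-type local fracture stress for each cracked particle.
        sigma_local = np.sqrt(2.0 * E * gamma / (np.pi * a))

        # Weakest-link assumption: each realization fails at its weakest (largest-particle) site.
        sigma_f = sigma_local.min(axis=1)
        print("median cleavage fracture stress: %.2f GPa" % (np.median(sigma_f) / 1e9))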

  5. Classical Statistics and Statistical Learning in Imaging Neuroscience

    PubMed Central

    Bzdok, Danilo

    2017-01-01

    Brain-imaging research has predominantly generated insight by means of classical statistics, including regression-type analyses and null-hypothesis testing using t-test and ANOVA. In recent years, statistical learning methods have enjoyed increasing popularity, especially for applications to rich and complex data, including cross-validated out-of-sample prediction using pattern classification and sparsity-inducing regression. This concept paper discusses the implications of inferential justifications and algorithmic methodologies in common data analysis scenarios in neuroimaging. It is retraced how classical statistics and statistical learning originated from different historical contexts, build on different theoretical foundations, make different assumptions, and evaluate different outcome metrics to permit differently nuanced conclusions. The present considerations should help reduce current confusion between model-driven classical hypothesis testing and data-driven learning algorithms for investigating the brain with imaging techniques. PMID:29056896
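
    A minimal, synthetic-data contrast between the two regimes discussed above: mass-univariate two-sample t-tests (classical inference) versus cross-validated out-of-sample accuracy of a logistic-regression classifier (statistical learning). The arrays below are random numbers standing in for imaging features; nothing here reproduces the paper's analyses.

        import numpy as np
        from scipy import stats
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        n, p = 80, 50                              # subjects per group, synthetic "voxels"
        groupA = rng.normal(0.0, 1.0, (n, p))
        groupB = rng.normal(0.3, 1.0, (n, p))      # small mean shift in every feature

        # Classical inference: mass-univariate two-sample t-tests on each feature.
        t, pvals = stats.ttest_ind(groupA, groupB, axis=0)
        print("features significant at p<0.05 (uncorrected):", int((pvals < 0.05).sum()))

        # Statistical learning: cross-validated out-of-sample classification accuracy.
        X = np.vstack([groupA, groupB])
        y = np.repeat([0, 1], n)
        acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
        print("5-fold CV accuracy: %.2f +/- %.2f" % (acc.mean(), acc.std()))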

  6. Determination of optimal imaging settings for urolithiasis CT using filtered back projection (FBP), statistical iterative reconstruction (IR) and knowledge-based iterative model reconstruction (IMR): a physical human phantom study

    PubMed Central

    Choi, Se Y; Ahn, Seung H; Choi, Jae D; Kim, Jung H; Lee, Byoung-Il; Kim, Jeong-In

    2016-01-01

    Objective: The purpose of this study was to compare CT image quality for evaluating urolithiasis using filtered back projection (FBP), statistical iterative reconstruction (IR) and knowledge-based iterative model reconstruction (IMR) according to various scan parameters and radiation doses. Methods: A 5 × 5 × 5 mm³ uric acid stone was placed in a physical human phantom at the level of the pelvis. Three tube voltages (120, 100 and 80 kV) and four tube current–time products (100, 70, 30 and 15 mAs) were implemented in 12 scans. Each scan was reconstructed with FBP, statistical IR (Levels 5–7) and knowledge-based IMR (soft-tissue Levels 1–3). The radiation dose, objective image quality and signal-to-noise ratio (SNR) were evaluated, and subjective assessments were performed. Results: The effective doses ranged from 0.095 to 2.621 mSv. Knowledge-based IMR showed lower objective image noise and higher SNR than did FBP and statistical IR. The subjective image noise of FBP was worse than that of statistical IR and knowledge-based IMR. The subjective assessment scores deteriorated after a break point of 100 kV and 30 mAs. Conclusion: At the setting of 100 kV and 30 mAs, the radiation dose can be decreased by approximately 84% while maintaining the subjective image assessment scores. Advances in knowledge: Patients with urolithiasis can be evaluated with ultralow-dose non-enhanced CT using a knowledge-based IMR algorithm at a substantially reduced radiation dose with the imaging quality preserved, thereby minimizing the risks of radiation exposure while providing clinically relevant diagnostic benefits for patients. PMID:26577542

  7. Functional annotation of regulatory pathways.

    PubMed

    Pandey, Jayesh; Koyutürk, Mehmet; Kim, Yohan; Szpankowski, Wojciech; Subramaniam, Shankar; Grama, Ananth

    2007-07-01

    Standardized annotations of biomolecules in interaction networks (e.g. Gene Ontology) provide comprehensive understanding of the function of individual molecules. Extending such annotations to pathways is a critical component of functional characterization of cellular signaling at the systems level. We propose a framework for projecting gene regulatory networks onto the space of functional attributes using multigraph models, with the objective of deriving statistically significant pathway annotations. We first demonstrate that annotations of pairwise interactions do not generalize to indirect relationships between processes. Motivated by this result, we formalize the problem of identifying statistically overrepresented pathways of functional attributes. We establish the hardness of this problem by demonstrating the non-monotonicity of common statistical significance measures. We propose a statistical model that emphasizes the modularity of a pathway, evaluating its significance based on the coupling of its building blocks. We complement the statistical model by an efficient algorithm and software, Narada, for computing significant pathways in large regulatory networks. Comprehensive results from our methods applied to the Escherichia coli transcription network demonstrate that our approach is effective in identifying known, as well as novel biological pathway annotations. Narada is implemented in Java and is available at http://www.cs.purdue.edu/homes/jpandey/narada/.
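
    Narada's modularity-based significance model is not reproduced here; as a generic illustration of what "overrepresentation" of a functional-attribute pair means, the sketch below applies a hypergeometric enrichment test with made-up counts.

        from scipy.stats import hypergeom

        # Hypothetical counts (not from the paper): in a regulatory network with
        # 1000 annotated interactions, 60 connect attribute A to attribute B,
        # while a sampled pathway of 40 interactions contains 8 such edges.
        M, K, N, k = 1000, 60, 40, 8

        # P(X >= k) under sampling without replacement: the enrichment p-value.
        p_value = hypergeom.sf(k - 1, M, K, N)
        print("overrepresentation p-value: %.4f" % p_value)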

  8. Evaluation of prompt gamma-ray data and nuclear structure of niobium-94 with statistical model calculations

    NASA Astrophysics Data System (ADS)

    Turkoglu, Danyal

    Precise knowledge of prompt gamma-ray intensities following neutron capture is critical for elemental and isotopic analyses, homeland security, modeling nuclear reactors, etc. A recently-developed database of prompt gamma-ray production cross sections and nuclear structure information in the form of a decay scheme, called the Evaluated Gamma-ray Activation File (EGAF), is under revision. Statistical model calculations are useful for checking the consistency of the decay scheme, providing insight on its completeness and accuracy. Furthermore, these statistical model calculations are necessary to estimate the contribution of continuum gamma-rays, which cannot be experimentally resolved due to the high density of excited states in medium- and heavy-mass nuclei. Decay-scheme improvements in EGAF lead to improvements to other databases (Evaluated Nuclear Structure Data File, Reference Input Parameter Library) that are ultimately used in nuclear-reaction models to generate the Evaluated Nuclear Data File (ENDF). Gamma-ray transitions following neutron capture in 93Nb have been studied at the cold-neutron beam facility at the Budapest Research Reactor. Measurements have been performed using a coaxial HPGe detector with Compton suppression. Partial gamma-ray production capture cross sections at a neutron velocity of 2200 m/s have been deduced relative to that of the 255.9-keV transition after cold-neutron capture by 93Nb. With the measurement of a niobium chloride target, this partial cross section was internally standardized to the cross section for the 1951-keV transition after cold-neutron capture by 35Cl. The resulting (0.1377 +/- 0.0018) barn (b) partial cross section produced a calibration factor that was 23% lower than previously measured for the EGAF database. The thermal-neutron cross sections were deduced for the 93Nb(n,gamma)94mNb and 93Nb(n,gamma)94gNb reactions by summing the experimentally-measured partial gamma-ray production cross sections associated with the ground-state transitions below the 396-keV level and combining that summation with the contribution to the ground state from the quasi-continuum above 396 keV, determined with Monte Carlo statistical model calculations using the DICEBOX computer code. These values, sigma_m and sigma_0, were (0.83 +/- 0.05) b and (1.16 +/- 0.11) b, respectively, and found to be in agreement with literature values. Comparison of the modeled population and experimental depopulation of individual levels confirmed tentative spin assignments and suggested changes where imbalances existed.

  9. The proposed 'concordance-statistic for benefit' provided a useful metric when modeling heterogeneous treatment effects.

    PubMed

    van Klaveren, David; Steyerberg, Ewout W; Serruys, Patrick W; Kent, David M

    2018-02-01

    Clinical prediction models that support treatment decisions are usually evaluated for their ability to predict the risk of an outcome rather than treatment benefit-the difference between outcome risk with vs. without therapy. We aimed to define performance metrics for a model's ability to predict treatment benefit. We analyzed data of the Synergy between Percutaneous Coronary Intervention with Taxus and Cardiac Surgery (SYNTAX) trial and of three recombinant tissue plasminogen activator trials. We assessed alternative prediction models with a conventional risk concordance-statistic (c-statistic) and a novel c-statistic for benefit. We defined observed treatment benefit by the outcomes in pairs of patients matched on predicted benefit but discordant for treatment assignment. The 'c-for-benefit' represents the probability that from two randomly chosen matched patient pairs with unequal observed benefit, the pair with greater observed benefit also has a higher predicted benefit. Compared to a model without treatment interactions, the SYNTAX score II had improved ability to discriminate treatment benefit (c-for-benefit 0.590 vs. 0.552), despite having similar risk discrimination (c-statistic 0.725 vs. 0.719). However, for the simplified stroke-thrombolytic predictive instrument (TPI) vs. the original stroke-TPI, the c-for-benefit (0.584 vs. 0.578) was similar. The proposed methodology has the potential to measure a model's ability to predict treatment benefit not captured with conventional performance metrics. Copyright © 2017 Elsevier Inc. All rights reserved.
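
    A sketch of the c-for-benefit computation as it is defined above, assuming patients have already been matched into treated/control pairs on predicted benefit and that observed benefit is the control outcome minus the treated outcome within a pair; the numbers are hypothetical and the authors' exact tie handling is not reproduced.

        import itertools
        import numpy as np

        def c_for_benefit(pred_benefit, obs_treated, obs_control):
            """pred_benefit[i]: predicted benefit for matched pair i (treated vs. control);
            obs_treated/obs_control: binary outcomes (1 = event) of the two pair members."""
            observed = np.asarray(obs_control) - np.asarray(obs_treated)   # -1, 0 or +1
            concordant = comparable = 0
            for i, j in itertools.combinations(range(len(pred_benefit)), 2):
                if observed[i] == observed[j]:
                    continue                     # pairs with equal observed benefit are skipped
                comparable += 1
                hi, lo = (i, j) if observed[i] > observed[j] else (j, i)
                concordant += pred_benefit[hi] > pred_benefit[lo]
            return concordant / comparable

        # Toy example with 6 matched pairs (hypothetical numbers).
        print(c_for_benefit([0.10, 0.02, 0.15, 0.05, 0.08, 0.01],
                            [0, 1, 0, 1, 0, 1],
                            [1, 1, 1, 1, 0, 0]))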

  10. Approximate Bayesian Computation Using Markov Chain Monte Carlo Simulation: Theory, Concepts, and Applications

    NASA Astrophysics Data System (ADS)

    Sadegh, M.; Vrugt, J. A.

    2013-12-01

    The ever-increasing pace of growth in computational power, along with continued advances in measurement technologies and improvements in process understanding, has stimulated the development of increasingly complex hydrologic models that simulate soil moisture flow, groundwater recharge, surface runoff, root water uptake, and river discharge at increasingly finer spatial and temporal scales. Reconciling these system models with field and remote sensing data is a difficult task, particularly because average measures of model/data similarity inherently lack the power to provide a meaningful comparative evaluation of the consistency in model form and function. The very construction of the likelihood function - as a summary variable of the (usually averaged) properties of the error residuals - dilutes and mixes the available information into an index having little remaining correspondence to specific behaviors of the system (Gupta et al., 2008). The quest for a more powerful method for model evaluation has inspired Vrugt and Sadegh [2013] to introduce "likelihood-free" inference as a vehicle for diagnostic model evaluation. This class of methods is also referred to as Approximate Bayesian Computation (ABC) and relaxes the need for an explicit likelihood function in favor of one or multiple different summary statistics rooted in hydrologic theory that together have a much stronger and compelling diagnostic power than some aggregated measure of the size of the error residuals. Here, we will introduce an efficient ABC sampling method that is orders of magnitude faster in exploring the posterior parameter distribution than commonly used rejection and Population Monte Carlo (PMC) samplers. Our methodology uses Markov Chain Monte Carlo simulation with DREAM, and takes advantage of a simple computational trick to resolve discontinuity problems with the application of set-theoretic summary statistics. We will also demonstrate a set of summary statistics that are rather insensitive to errors in the forcing data. This enhances prospects of detecting model structural deficiencies.
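
    The DREAM-based ABC sampler itself is not reproduced here; the sketch below only shows the underlying likelihood-free idea with plain ABC rejection on a toy one-parameter runoff model and two summary statistics. The model, prior, and tolerance are all hypothetical.

        import numpy as np

        rng = np.random.default_rng(42)

        def toy_model(k, rain):
            """Toy linear-reservoir runoff model: a fraction k of storage drains each step."""
            q, state = [], 0.0
            for r in rain:
                state += r
                out = k * state
                state -= out
                q.append(out)
            return np.array(q)

        def summary(q):
            """Two summary statistics of the discharge series."""
            return np.array([q.mean(), q.std()])

        rain = rng.gamma(2.0, 2.0, 200)
        q_obs = toy_model(0.35, rain) + rng.normal(0, 0.1, 200)   # synthetic "observations"
        s_obs = summary(q_obs)

        # ABC rejection: keep prior draws whose simulated summaries fall within epsilon.
        eps, accepted = 0.05, []
        for _ in range(5000):
            k = rng.uniform(0.05, 0.95)                           # prior draw
            if np.max(np.abs(summary(toy_model(k, rain)) - s_obs)) < eps:
                accepted.append(k)

        print("posterior mean of k: %.3f from %d accepted draws" %
              (np.mean(accepted), len(accepted)))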

  11. Assessment of the quality of primary care for the elderly according to the Chronic Care Model

    PubMed Central

    Silva, Líliam Barbosa; Soares, Sônia Maria; Silva, Patrícia Aparecida Barbosa; Santos, Joseph Fabiano Guimarães; Miranda, Lívia Carvalho Viana; Santos, Raquel Melgaço

    2018-01-01

    Objective: to evaluate the quality of care provided to older people with diabetes mellitus and/or hypertension in the Primary Health Care (PHC) according to the Chronic Care Model (CCM) and identify associations with care outcomes. Method: cross-sectional study involving 105 older people with diabetes mellitus and/or hypertension. The Patient Assessment of Chronic Illness Care (PACIC) questionnaire was used to evaluate the quality of care. The total score was compared with care outcomes that included biochemical parameters, body mass index, pressure levels and quality of life. Data analysis was based on descriptive statistics and multiple logistic regression. Results: there was a predominance of females and a median age of 72 years. The median PACIC score was 1.55 (IQ 1.30-2.20). Among the PACIC dimensions, the “delivery system design/decision support” was the one that presented the best result. There was no statistical difference between the medians of the overall PACIC score and individual care outcomes. However, when the quality of life and health satisfaction were simultaneously evaluated, a statistical difference between the medians was observed. Conclusion: the low PACIC scores found indicate that chronic care according to the CCM in the PHC still seems to fall short of its assumptions. PMID:29538582

  12. Assessment of the quality of primary care for the elderly according to the Chronic Care Model.

    PubMed

    Silva, Líliam Barbosa; Soares, Sônia Maria; Silva, Patrícia Aparecida Barbosa; Santos, Joseph Fabiano Guimarães; Miranda, Lívia Carvalho Viana; Santos, Raquel Melgaço

    2018-03-08

    to evaluate the quality of care provided to older people with diabetes mellitus and/or hypertension in the Primary Health Care (PHC) according to the Chronic Care Model (CCM) and identify associations with care outcomes. cross-sectional study involving 105 older people with diabetes mellitus and/or hypertension. The Patient Assessment of Chronic Illness Care (PACIC) questionnaire was used to evaluate the quality of care. The total score was compared with care outcomes that included biochemical parameters, body mass index, pressure levels and quality of life. Data analysis was based on descriptive statistics and multiple logistic regression. there was a predominance of females and a median age of 72 years. The median PACIC score was 1.55 (IQ 1.30-2.20). Among the PACIC dimensions, the "delivery system design/decision support" was the one that presented the best result. There was no statistical difference between the medians of the overall PACIC score and individual care outcomes. However, when the quality of life and health satisfaction were simultaneously evaluated, a statistical difference between the medians was observed. the low PACIC scores found indicate that chronic care according to the CCM in the PHC seems still to fall short of its assumptions.

  13. Prediction of In Vivo Knee Joint Kinematics Using a Combined Dual Fluoroscopy Imaging and Statistical Shape Modeling Technique

    PubMed Central

    Li, Jing-Sheng; Tsai, Tsung-Yuan; Wang, Shaobai; Li, Pingyue; Kwon, Young-Min; Freiberg, Andrew; Rubash, Harry E.; Li, Guoan

    2014-01-01

    Using computed tomography (CT) or magnetic resonance (MR) images to construct 3D knee models has been widely used in biomedical engineering research. The statistical shape modeling (SSM) method is an alternative that provides a fast, cost-efficient, and subject-specific knee modeling technique. This study aimed to evaluate the feasibility of using a combined dual-fluoroscopic imaging system (DFIS) and SSM method to investigate in vivo knee kinematics. Three subjects were studied during treadmill walking. The data were compared with the kinematics obtained using a CT-based modeling technique. Geometric root-mean-square (RMS) errors between the knee models constructed using the SSM and CT-based modeling techniques were 1.16 mm and 1.40 mm for the femur and tibia, respectively. For the kinematics of the knee during the treadmill gait, the SSM model can predict the knee kinematics with RMS errors within 3.3 deg for rotation and within 2.4 mm for translation throughout the stance phase of the gait cycle compared with those obtained using the CT-based knee models. The data indicated that the combined DFIS and SSM technique could be used for quick evaluation of knee joint kinematics. PMID:25320846

  14. A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more.

    PubMed

    Rivas, Elena; Lang, Raymond; Eddy, Sean R

    2012-02-01

    The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases.

  15. A statistical shape model of the human second cervical vertebra.

    PubMed

    Clogenson, Marine; Duff, John M; Luethi, Marcel; Levivier, Marc; Meuli, Reto; Baur, Charles; Henein, Simon

    2015-07-01

    Statistical shape and appearance models play an important role in reducing the segmentation processing time of a vertebra and in improving results for 3D model development. Here, we describe the different steps in generating a statistical shape model (SSM) of the second cervical vertebra (C2) and provide the shape model for general use by the scientific community. The main difficulties in its construction are the morphological complexity of the C2 and its variability in the population. The input dataset is composed of manually segmented anonymized patient computerized tomography (CT) scans. The alignment of the different datasets is performed with Procrustes alignment on the surface models, and the registration is then cast as a model-fitting problem using a Gaussian process. A principal component analysis (PCA)-based model is generated that captures the variability of the C2. The SSM was generated using 92 CT scans. The resulting SSM was evaluated for specificity, compactness and generalization ability. The SSM of the C2 is freely available to the scientific community in Slicer (an open-source software platform for image analysis and scientific visualization) with a module created to visualize the SSM using Statismo, a framework for statistical shape modeling. The SSM of the vertebra allows the shape variability of the C2 to be represented. Moreover, the SSM will enable semi-automatic segmentation and 3D model generation of the vertebra, which would greatly benefit surgery planning.
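
    A compact sketch of the PCA step of such a statistical shape model, assuming the C2 surfaces have already been segmented, brought into point correspondence, and flattened into coordinate vectors; the arrays below are random placeholders for real registered meshes.

        import numpy as np

        # Hypothetical training data: n_shapes registered C2 surfaces, each with n_points
        # corresponding landmarks flattened to a 3*n_points coordinate vector.
        rng = np.random.default_rng(0)
        n_shapes, n_points = 92, 500
        shapes = rng.normal(size=(n_shapes, 3 * n_points))     # placeholder for real meshes

        mean_shape = shapes.mean(axis=0)
        centered = shapes - mean_shape

        # PCA via SVD: the rows of Vt are the modes of shape variation.
        _, S, Vt = np.linalg.svd(centered, full_matrices=False)
        explained = S**2 / np.sum(S**2)
        n_modes = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)   # keep 95% variance

        # Generate a new plausible shape: mean plus a combination of the leading modes,
        # with coefficients within +/- 3 standard deviations of the training data.
        coeffs = rng.uniform(-3, 3, n_modes) * (S[:n_modes] / np.sqrt(n_shapes - 1))
        new_shape = mean_shape + coeffs @ Vt[:n_modes]
        print("modes kept:", n_modes, "| new shape vector length:", new_shape.size)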

  16. A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more

    PubMed Central

    Rivas, Elena; Lang, Raymond; Eddy, Sean R.

    2012-01-01

    The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases. PMID:22194308

  17. Summary goodness-of-fit statistics for binary generalized linear models with noncanonical link functions.

    PubMed

    Canary, Jana D; Blizzard, Leigh; Barry, Ronald P; Hosmer, David W; Quinn, Stephen J

    2016-05-01

    Generalized linear models (GLM) with a canonical logit link function are the primary modeling technique used to relate a binary outcome to predictor variables. However, noncanonical links can offer more flexibility, producing convenient analytical quantities (e.g., probit GLMs in toxicology) and desired measures of effect (e.g., relative risk from log GLMs). Many summary goodness-of-fit (GOF) statistics exist for logistic GLM. Their properties make the development of GOF statistics relatively straightforward, but it can be more difficult under noncanonical links. Although GOF tests for logistic GLM with continuous covariates (GLMCC) have been applied to GLMCCs with log links, we know of no GOF tests in the literature specifically developed for GLMCCs that can be applied regardless of link function chosen. We generalize the Tsiatis GOF statistic (TG), originally developed for logistic GLMCCs, so that it can be applied under any link function. Further, we show that the algebraically related Hosmer-Lemeshow (HL) and Pigeon-Heyse (J2) statistics can be applied directly. In a simulation study, TG, HL, and J2 were used to evaluate the fit of probit, log-log, complementary log-log, and log models, all calculated with a common grouping method. The TG statistic consistently maintained Type I error rates, while those of HL and J2 were often lower than expected if terms with little influence were included. Generally, the statistics had similar power to detect an incorrect model. An exception occurred when a log GLMCC was incorrectly fit to data generated from a logistic GLMCC. In this case, TG had more power than HL or J2. © 2015 John Wiley & Sons Ltd/London School of Economics.
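
    The generalized Tsiatis statistic is not reproduced here; as a hedged illustration of the summary GOF idea it extends, the sketch below computes the ordinary Hosmer-Lemeshow statistic with equal-size groups of ordered risk on simulated data.

        import numpy as np
        from scipy.stats import chi2

        def hosmer_lemeshow(y, p_hat, n_groups=10):
            """Hosmer-Lemeshow summary GOF statistic with equal-size groups of ordered risk."""
            order = np.argsort(p_hat)
            groups = np.array_split(order, n_groups)
            hl = 0.0
            for g in groups:
                obs, exp, n = y[g].sum(), p_hat[g].sum(), len(g)
                hl += (obs - exp) ** 2 / (exp * (1 - exp / n) + 1e-12)
            return hl, chi2.sf(hl, df=n_groups - 2)   # conventional df choice

        # Simulate a logistic model; here the true probabilities stand in for fitted values.
        rng = np.random.default_rng(3)
        x = rng.normal(size=2000)
        p_true = 1 / (1 + np.exp(-(-0.5 + 1.2 * x)))
        y = rng.binomial(1, p_true)
        print("HL statistic, p-value:", hosmer_lemeshow(y, p_true))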

  18. Effect of heating rate and kinetic model selection on activation energy of nonisothermal crystallization of amorphous felodipine.

    PubMed

    Chattoraj, Sayantan; Bhugra, Chandan; Li, Zheng Jane; Sun, Changquan Calvin

    2014-12-01

    The nonisothermal crystallization kinetics of amorphous materials is routinely analyzed by statistically fitting the crystallization data to kinetic models. In this work, we systematically evaluate how the model-dependent crystallization kinetics is impacted by variations in the heating rate and the selection of the kinetic model, two key factors that can lead to significant differences in the crystallization activation energy (Ea) of an amorphous material. Using amorphous felodipine, we show that the Ea decreases with increasing heating rate, irrespective of the kinetic model evaluated in this work. The model that best describes the crystallization phenomenon cannot be identified readily through the statistical fitting approach because several kinetic models yield comparable R² values. Here, we propose an alternate paired model-fitting model-free (PMFMF) approach for identifying the most suitable kinetic model, where Ea obtained from model-dependent kinetics is compared with those obtained from model-free kinetics. The most suitable kinetic model is identified as the one that yields Ea values comparable with the model-free kinetics. Through this PMFMF approach, nucleation and growth is identified as the main mechanism that controls the crystallization kinetics of felodipine. Using this PMFMF approach, we further demonstrate that the crystallization mechanism from the amorphous phase varies with heating rate. © 2014 Wiley Periodicals, Inc. and the American Pharmacists Association.
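
    The model-free half of the PMFMF comparison requires an estimate of Ea that does not assume a kinetic model; one common choice (shown here with invented heating-rate/peak-temperature data, and not necessarily the method the authors used) is the Kissinger regression of ln(β/Tp²) on 1/Tp.

        import numpy as np

        R = 8.314  # gas constant, J/(mol K)

        # Hypothetical DSC data: heating rates (K/min) and crystallization peak temperatures (K).
        beta = np.array([2.0, 5.0, 10.0, 20.0])
        Tp = np.array([358.0, 365.0, 371.0, 378.0])

        # Kissinger plot: ln(beta / Tp^2) versus 1/Tp has slope -Ea/R.
        y = np.log(beta / Tp**2)
        x = 1.0 / Tp
        slope, intercept = np.polyfit(x, y, 1)
        Ea = -slope * R
        print("model-free (Kissinger) activation energy: %.1f kJ/mol" % (Ea / 1000))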

  19. Application of Bayesian methods to habitat selection modeling of the northern spotted owl in California: new statistical methods for wildlife research

    Treesearch

    Howard B. Stauffer; Cynthia J. Zabel; Jeffrey R. Dunk

    2005-01-01

    We compared a set of competing logistic regression habitat selection models for Northern Spotted Owls (Strix occidentalis caurina) in California. The habitat selection models were estimated, compared, evaluated, and tested using multiple sample datasets collected on federal forestlands in northern California. We used Bayesian methods in interpreting...

  20. Can air temperature be used to project influences of climate change on stream temperature?

    Treesearch

    Ivan Arismendi; Mohammad Safeeq; Jason B Dunham; Sherri L Johnson

    2014-01-01

    Worldwide, lack of data on stream temperature has motivated the use of regression-based statistical models to predict stream temperatures based on more widely available data on air temperatures. Such models have been widely applied to project responses of stream temperatures under climate change, but the performance of these models has not been fully evaluated. To...

  1. Laser ektacytometry and evaluation of statistical characteristics of inhomogeneous ensembles of red blood cells

    NASA Astrophysics Data System (ADS)

    Nikitin, S. Yu.; Priezzhev, A. V.; Lugovtsov, A. E.; Ustinov, V. D.; Razgulin, A. V.

    2014-10-01

    The paper is devoted to the development of the laser ektacytometry technique for evaluation of the statistical characteristics of inhomogeneous ensembles of red blood cells (RBCs). We have theoretically analyzed laser beam scattering by inhomogeneous ensembles of elliptical discs modeling red blood cells in the ektacytometer. The analysis shows that the laser ektacytometry technique allows quantitative evaluation of such population characteristics of RBCs as the mean cell shape, the variance of cell deformability, and the asymmetry of the cell deformability distribution. Moreover, we show that the deformability distribution itself can be retrieved by solving a specific Fredholm integral equation of the first kind. At this stage we do not take into account the scatter in the RBC sizes.
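
    The ektacytometry kernel relating the diffraction pattern to the deformability distribution is derived in the paper and is not reproduced here; the sketch below only illustrates, for a generic discretized Fredholm equation of the first kind g = K f with a hypothetical smooth kernel, how a Tikhonov-regularized least-squares solution recovers an asymmetric distribution from noisy data.

        import numpy as np

        rng = np.random.default_rng(5)
        n = 100
        s = np.linspace(0, 1, n)                      # deformability axis (arbitrary units)

        # True (unknown) deformability distribution: an asymmetric mixture of two peaks.
        f_true = 0.7*np.exp(-(s-0.4)**2/0.005) + 0.3*np.exp(-(s-0.65)**2/0.01)

        # Hypothetical smooth measurement kernel (stand-in for the ektacytometry kernel).
        K = np.exp(-(s[:, None] - s[None, :])**2 / 0.02)
        g = K @ f_true + rng.normal(0, 0.01, n)       # measured signal with noise

        # Tikhonov-regularized solution of the ill-posed first-kind equation g = K f.
        lam = 1e-2
        f_rec = np.linalg.solve(K.T @ K + lam * np.eye(n), K.T @ g)
        print("relative reconstruction error: %.2f" %
              (np.linalg.norm(f_rec - f_true) / np.linalg.norm(f_true)))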

  2. Two Methods of Automatic Evaluation of Speech Signal Enhancement Recorded in the Open-Air MRI Environment

    NASA Astrophysics Data System (ADS)

    Přibil, Jiří; Přibilová, Anna; Frollo, Ivan

    2017-12-01

    The paper focuses on two methods for evaluating the success of enhancement of speech signals recorded in an open-air magnetic resonance imager during phonation for 3D human vocal tract modeling. The first approach provides a comparison based on statistical analysis with ANOVA and hypothesis tests. The second method is based on classification by Gaussian mixture models (GMM). The experiments confirmed that the proposed ANOVA and GMM classifiers for automatic evaluation of speech quality are functional and produce results fully comparable with the standard evaluation based on the listening test method.
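
    A sketch of the GMM-classifier idea under stated assumptions: hypothetical low-dimensional spectral feature vectors for "enhanced" and "noisy" recordings, one Gaussian mixture fitted per class, and held-out frames assigned to the class with the higher log-likelihood. The features and mixture sizes are invented for illustration.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(7)

        # Hypothetical 3-D spectral feature vectors for two classes of recordings.
        enhanced = rng.normal([0.0, 1.0, -0.5], 0.6, (300, 3))
        noisy = rng.normal([1.0, 0.0, 0.5], 0.6, (300, 3))

        gmm_enh = GaussianMixture(n_components=4, random_state=0).fit(enhanced)
        gmm_noi = GaussianMixture(n_components=4, random_state=0).fit(noisy)

        # Classify held-out frames by comparing per-class log-likelihoods.
        test = rng.normal([0.0, 1.0, -0.5], 0.6, (50, 3))        # frames from an enhanced file
        scores = np.c_[gmm_enh.score_samples(test), gmm_noi.score_samples(test)]
        predicted = np.argmax(scores, axis=1)                    # 0 = enhanced, 1 = noisy
        print("fraction classified as enhanced: %.2f" % np.mean(predicted == 0))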

  3. Generalized linear and generalized additive models in studies of species distributions: Setting the scene

    USGS Publications Warehouse

    Guisan, Antoine; Edwards, T.C.; Hastie, T.

    2002-01-01

    An important statistical development of the last 30 years has been the advance in regression analysis provided by generalized linear models (GLMs) and generalized additive models (GAMs). Here we introduce a series of papers prepared within the framework of an international workshop entitled: Advances in GLMs/GAMs modeling: from species distribution to environmental management, held in Riederalp, Switzerland, 6-11 August 2001. We first discuss some general uses of statistical models in ecology, as well as provide a short review of several key examples of the use of GLMs and GAMs in ecological modeling efforts. We next present an overview of GLMs and GAMs, and discuss some of their related statistics used for predictor selection, model diagnostics, and evaluation. Included is a discussion of several new approaches applicable to GLMs and GAMs, such as ridge regression, an alternative to stepwise selection of predictors, and methods for the identification of interactions by a combined use of regression trees and several other approaches. We close with an overview of the papers and how we feel they advance our understanding of their application to ecological modeling. © 2002 Elsevier Science B.V. All rights reserved.

  4. An improved approach for flight readiness certification: Methodology for failure risk assessment and application examples. Volume 3: Structure and listing of programs

    NASA Technical Reports Server (NTRS)

    Moore, N. R.; Ebbeler, D. H.; Newlin, L. E.; Sutharshana, S.; Creager, M.

    1992-01-01

    An improved methodology for quantitatively evaluating failure risk of spaceflight systems to assess flight readiness and identify risk control measures is presented. This methodology, called Probabilistic Failure Assessment (PFA), combines operating experience from tests and flights with engineering analysis to estimate failure risk. The PFA methodology is of particular value when information on which to base an assessment of failure risk, including test experience and knowledge of parameters used in engineering analyses of failure phenomena, is expensive or difficult to acquire. The PFA methodology is a prescribed statistical structure in which engineering analysis models that characterize failure phenomena are used conjointly with uncertainties about analysis parameters and/or modeling accuracy to estimate failure probability distributions for specific failure modes. These distributions can then be modified, by means of statistical procedures of the PFA methodology, to reflect any test or flight experience. Conventional engineering analysis models currently employed for design and failure prediction are used in this methodology. The PFA methodology is described and examples of its application are presented. Conventional approaches to failure risk evaluation for spaceflight systems are discussed, and the rationale for the approach taken in the PFA methodology is presented. The statistical methods, engineering models, and computer software used in fatigue failure mode applications are thoroughly documented.
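
    A schematic Monte Carlo illustration of the probabilistic idea behind PFA, not the documented PFA statistical structure: uncertainty in the parameters of a hypothetical power-law S-N fatigue-life relation is propagated to a distribution of life, and the failure probability is read off at an assumed required service life. All numbers are invented.

        import numpy as np

        rng = np.random.default_rng(11)
        n_mc = 200_000

        # Hypothetical uncertain inputs of a power-law fatigue-life model N = A * S**(-b).
        A = rng.lognormal(mean=np.log(1e12), sigma=0.4, size=n_mc)   # material coefficient
        b = rng.normal(3.0, 0.15, n_mc)                              # fatigue exponent
        S = rng.normal(300.0, 20.0, n_mc)                            # stress amplitude (MPa)

        life = A * S ** (-b)                  # cycles to failure for each Monte Carlo draw
        required_life = 1e4                   # mission requirement (cycles), assumed

        p_fail = np.mean(life < required_life)
        print("estimated failure probability: %.4f" % p_fail)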

  5. Evaluation of statistical and rainfall-runoff models for predicting historical daily streamflow time series in the Des Moines and Iowa River watersheds

    USGS Publications Warehouse

    Farmer, William H.; Knight, Rodney R.; Eash, David A.; Hutchinson, Kasey J.; Linhart, S. Mike; Christiansen, Daniel E.; Archfield, Stacey A.; Over, Thomas M.; Kiang, Julie E.

    2015-08-24

    Daily records of streamflow are essential to understanding hydrologic systems and managing the interactions between human and natural systems. Many watersheds and locations lack streamgages to provide accurate and reliable records of daily streamflow. In such ungaged watersheds, statistical tools and rainfall-runoff models are used to estimate daily streamflow. Previous work compared 19 different techniques for predicting daily streamflow records in the southeastern United States. Here, five of the better-performing methods are compared in a different hydroclimatic region of the United States, in Iowa. The methods fall into three classes: (1) drainage-area ratio methods, (2) nonlinear spatial interpolations using flow duration curves, and (3) mechanistic rainfall-runoff models. The first two classes are each applied with nearest-neighbor and map-correlated index streamgages. Using a threefold validation and robust rank-based evaluation, the methods are assessed for overall goodness of fit of the hydrograph of daily streamflow, the ability to reproduce a daily, no-fail storage-yield curve, and the ability to reproduce key streamflow statistics. As in the Southeast study, a nonlinear spatial interpolation of daily streamflow using flow duration curves is found to be a method with the best predictive accuracy. Comparisons with previous work in Iowa show that the accuracy of mechanistic models with at-site calibration is substantially degraded in the ungaged framework.

  6. Statistical evaluation of the metallurgical test data in the ORR-PSF-PVS irradiation experiment.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stallmann, F.W.

    1984-08-01

    A statistical analysis of Charpy test results from the two-year Pressure Vessel Simulation metallurgical irradiation experiment was performed. Transition temperatures and upper-shelf energies determined from computer fits compare well with eyeball fits. Uncertainties for all results can be obtained with the computer fits. The results were compared with predictions in Regulatory Guide 1.99 and other irradiation damage models.

  7. Conditional statistical inference with multistage testing designs.

    PubMed

    Zwitser, Robert J; Maris, Gunter

    2015-03-01

    In this paper it is demonstrated how statistical inference from multistage test designs can be made based on the conditional likelihood. Special attention is given to parameter estimation, as well as the evaluation of model fit. Two reasons are provided why the fit of simple measurement models is expected to be better in adaptive designs, compared to linear designs: more parameters are available for the same number of observations; and undesirable response behavior, like slipping and guessing, might be avoided owing to a better match between item difficulty and examinee proficiency. The results are illustrated with simulated data, as well as with real data.

  8. Estimation of diagnostic test accuracy without full verification: a review of latent class methods

    PubMed Central

    Collins, John; Huynh, Minh

    2014-01-01

    The performance of a diagnostic test is best evaluated against a reference test that is without error. For many diseases, this is not possible, and an imperfect reference test must be used. However, diagnostic accuracy estimates may be biased if inaccurately verified status is used as the truth. Statistical models have been developed to handle this situation by treating disease as a latent variable. In this paper, we conduct a systematized review of statistical methods using latent class models for estimating test accuracy and disease prevalence in the absence of complete verification. PMID:24910172

  9. The Influence of Vacuum Circuit Breakers and Different Motor Models on Switching Overvoltages in Motor Circuits

    NASA Astrophysics Data System (ADS)

    Wong, Cat S. M.; Snider, L. A.; Lo, Edward W. C.; Chung, T. S.

    Switching of induction motors with vacuum circuit breakers continues to be a concern. In this paper, the influence on statistical overvoltages of the stochastic characteristics of vacuum circuit breakers, of high-frequency models of motors and transformers, and of network characteristics, including cable lengths and network topology, is evaluated, and a general view of the overvoltage phenomena is presented. Finally, a real case study on the statistical voltage levels and risk of failure resulting from switching of a vacuum circuit breaker in an industrial installation in Hong Kong is presented.

  10. DECIDE: a software for computer-assisted evaluation of diagnostic test performance.

    PubMed

    Chiecchio, A; Bo, A; Manzone, P; Giglioli, F

    1993-05-01

    The evaluation of the performance of clinical tests is a complex problem involving different steps and many statistical tools, not always structured in an organic and rational system. This paper presents a software package that provides an organic system of statistical tools to assist in the evaluation of clinical test performance. The program allows (a) the building and organization of a working database, (b) the selection of the minimal set of tests with the maximum information content, (c) the search for the model best fitting the distribution of the test values, (d) the selection of the optimal diagnostic cut-off value of the test for each positive/negative situation, and (e) the evaluation of the performance of combinations of correlated and uncorrelated tests. The uncertainty associated with all the variables involved is evaluated. The program works in an MS-DOS environment with an EGA or higher-performance graphics card.
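
    Step (d), selecting an optimal diagnostic cut-off, can be illustrated with one common criterion, the Youden index (used here only as a stand-in; the abstract does not state which criterion DECIDE implements): maximize sensitivity + specificity - 1 along the ROC curve. The test-value distributions below are hypothetical.

        import numpy as np
        from sklearn.metrics import roc_curve

        rng = np.random.default_rng(2)

        # Hypothetical test values: diseased patients score higher on average.
        healthy = rng.normal(10.0, 2.0, 500)
        diseased = rng.normal(14.0, 3.0, 200)
        values = np.concatenate([healthy, diseased])
        status = np.concatenate([np.zeros(500), np.ones(200)])

        fpr, tpr, thresholds = roc_curve(status, values)
        youden = tpr - fpr                         # sensitivity + specificity - 1
        best = np.argmax(youden)
        print("optimal cut-off: %.2f (sens %.2f, spec %.2f)" %
              (thresholds[best], tpr[best], 1 - fpr[best]))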

  11. Linking Statistically- and Physically-Based Models for Improved Streamflow Simulation in Gaged and Ungaged Areas

    NASA Astrophysics Data System (ADS)

    Lafontaine, J.; Hay, L.; Archfield, S. A.; Farmer, W. H.; Kiang, J. E.

    2014-12-01

    The U.S. Geological Survey (USGS) has developed a National Hydrologic Model (NHM) to support coordinated, comprehensive and consistent hydrologic model development, and facilitate the application of hydrologic simulations within the continental US. The portion of the NHM located within the Gulf Coastal Plains and Ozarks Landscape Conservation Cooperative (GCPO LCC) is being used to test the feasibility of improving streamflow simulations in gaged and ungaged watersheds by linking statistically- and physically-based hydrologic models. The GCPO LCC covers part or all of 12 states and 5 sub-geographies, totaling approximately 726,000 km², and is centered on the lower Mississippi Alluvial Valley. A total of 346 USGS streamgages in the GCPO LCC region were selected to evaluate the performance of this new calibration methodology for the period 1980 to 2013. Initially, the physically-based models are calibrated to measured streamflow data to provide a baseline for comparison. An enhanced calibration procedure then is used to calibrate the physically-based models in the gaged and ungaged areas of the GCPO LCC using statistically-based estimates of streamflow. For this application, the calibration procedure is adjusted to address the limitations of the statistically generated time series to reproduce measured streamflow in gaged basins, primarily by incorporating error and bias estimates. As part of this effort, estimates of uncertainty in the model simulations are also computed for the gaged and ungaged watersheds.

  12. A statistical assessment of seismic models of the U.S. continental crust using Bayesian inversion of ambient noise surface wave dispersion data

    NASA Astrophysics Data System (ADS)

    Olugboji, T. M.; Lekic, V.; McDonough, W.

    2017-07-01

    We present a new approach for evaluating existing crustal models using ambient noise data sets and their associated uncertainties. We use a transdimensional hierarchical Bayesian inversion approach to invert ambient noise surface wave phase dispersion maps for Love and Rayleigh waves using measurements obtained from Ekström (2014). Spatiospectral analysis shows that our results are comparable to a linear least squares inverse approach (except at higher harmonic degrees), but the procedure has additional advantages: (1) it yields an autoadaptive parameterization that follows Earth structure without making restricting assumptions on model resolution (regularization or damping) and data errors; (2) it can recover non-Gaussian phase velocity probability distributions while quantifying the sources of uncertainties in the data measurements and modeling procedure; and (3) it enables statistical assessments of different crustal models (e.g., CRUST1.0, LITHO1.0, and NACr14) using variable resolution residual and standard deviation maps estimated from the ensemble. These assessments show that in the stable old crust of the Archean, the misfits are statistically negligible, requiring no significant update to crustal models from the ambient noise data set. In other regions of the U.S., significant updates to regionalization and crustal structure are expected, especially in the shallow sedimentary basins and the tectonically active regions, where the differences between model predictions and data are statistically significant.

  13. An Asynchronous Many-Task Implementation of In-Situ Statistical Analysis using Legion.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2015-11-01

    In this report, we propose a framework for the design and implementation of in-situ analyses using an asynchronous many-task (AMT) model, using the Legion programming model together with the MiniAero mini-application as a surrogate for full-scale parallel scientific computing applications. The bulk of this work consists of converting the Learn/Derive/Assess model, which we had initially developed for parallel statistical analysis using MPI [PTBM11], from an SPMD to an AMT model. To this end, we propose an original use of the concept of Legion logical regions as a replacement for the parallel communication schemes used for the only operation of the statistics engines that requires explicit communication. We then evaluate this proposed scheme in a shared-memory environment, using the Legion port of MiniAero as a proxy for a full-scale scientific application, as a means to provide input data sets of variable size for the in-situ statistical analyses in an AMT context. We demonstrate in particular that the approach has merit, and warrants further investigation, in collaboration with ongoing efforts to improve the overall parallel performance of the Legion system.

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hagos, Samson M.; Feng, Zhe; Burleyson, Casey D.

    Regional cloud-permitting model simulations of cloud populations observed during the 2011 ARM Madden-Julian Oscillation Investigation Experiment/Dynamics of Madden-Julian Experiment (AMIE/DYNAMO) field campaign are evaluated against radar and ship-based measurements. The sensitivity of model-simulated surface rain rate statistics to parameters and parameterization of hydrometeor sizes in five commonly used WRF microphysics schemes is examined. It is shown that at 2 km grid spacing, the model generally overestimates rain rate from large and deep convective cores. Sensitivity runs involving variation of parameters that affect rain drop or ice particle size distributions (a more aggressive break-up process, etc.) generally reduce the bias in rain-rate and boundary layer temperature statistics as the smaller particles become more vulnerable to evaporation. Furthermore, significant improvement in the convective rain-rate statistics is observed when the horizontal grid spacing is reduced to 1 km and 0.5 km, while the statistics are worsened at 4 km grid spacing as increased turbulence enhances evaporation. The results suggest that modulation of evaporation processes, through parameterization of turbulent mixing and break-up of hydrometeors, may provide a potential avenue for correcting cloud statistics and associated boundary layer temperature biases in regional and global cloud-permitting model simulations.

  15. Evaluation of WRF Parameterizations for Air Quality Applications over the Midwest USA

    NASA Astrophysics Data System (ADS)

    Zheng, Z.; Fu, K.; Balasubramanian, S.; Koloutsou-Vakakis, S.; McFarland, D. M.; Rood, M. J.

    2017-12-01

    Reliable predictions from Chemical Transport Models (CTMs) for air quality research require accurate gridded weather inputs. In this study, a sensitivity analysis of 17 Weather Research and Forecast (WRF) model runs was conducted to explore the optimum configuration in six physics categories (i.e., cumulus, surface layer, microphysics, land surface model, planetary boundary layer, and longwave/shortwave radiation) for the Midwest USA. WRF runs were initially conducted over four days in May 2011 for a 12 km x 12 km domain over contiguous USA and a nested 4 km x 4 km domain over the Midwest USA (i.e., Illinois and adjacent areas including Iowa, Indiana, and Missouri). Model outputs were evaluated statistically by comparison with meteorological observations (DS337.0, METAR data, and the Water and Atmospheric Resources Monitoring Network) and resulting statistics were compared to benchmark values from the literature. Identified optimum configurations of physics parametrizations were then evaluated for the whole months of May and October 2011 to evaluate WRF model performance for Midwestern spring and fall seasons. This study demonstrated that, for the chosen physics options, WRF predicted temperature (Index of Agreement (IOA) = 0.99), pressure (IOA = 0.99), relative humidity (IOA = 0.93), wind speed (IOA = 0.85), and wind direction (IOA = 0.97) well. However, WRF did not predict daily precipitation satisfactorily (IOA = 0.16). Developed gridded weather fields will be used as inputs to a CTM ensemble consisting of the Comprehensive Air Quality Model with Extensions to study impacts of chemical fertilizer usage on regional air quality in the Midwest USA.
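
    The evaluation above leans on Willmott's Index of Agreement; a minimal implementation, with hypothetical hourly temperatures standing in for the WRF output and station observations, is sketched below.

        import numpy as np

        def index_of_agreement(pred, obs):
            """Willmott's index of agreement d (0 = no agreement, 1 = perfect)."""
            pred, obs = np.asarray(pred, float), np.asarray(obs, float)
            obar = obs.mean()
            return 1.0 - np.sum((pred - obs) ** 2) / np.sum(
                (np.abs(pred - obar) + np.abs(obs - obar)) ** 2)

        # Hypothetical hourly 2-m temperatures (deg C): model output vs. station observations.
        obs = [21.0, 22.5, 24.1, 26.0, 27.2, 26.5, 24.8, 23.0]
        pred = [20.6, 22.9, 24.5, 25.4, 27.8, 26.9, 24.1, 22.5]
        print("IOA = %.2f" % index_of_agreement(pred, obs))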

  16. Sensitivity analysis, calibration, and testing of a distributed hydrological model using error‐based weighting and one objective function

    USGS Publications Warehouse

    Foglia, L.; Hill, Mary C.; Mehl, Steffen W.; Burlando, P.

    2009-01-01

    We evaluate the utility of three interrelated means of using data to calibrate the fully distributed rainfall‐runoff model TOPKAPI as applied to the Maggia Valley drainage area in Switzerland. The use of error‐based weighting of observation and prior information data, local sensitivity analysis, and single‐objective function nonlinear regression provides quantitative evaluation of sensitivity of the 35 model parameters to the data, identification of data types most important to the calibration, and identification of correlations among parameters that contribute to nonuniqueness. Sensitivity analysis required only 71 model runs, and regression required about 50 model runs. The approach presented appears to be ideal for evaluation of models with long run times or as a preliminary step to more computationally demanding methods. The statistics used include composite scaled sensitivities, parameter correlation coefficients, leverage, Cook's D, and DFBETAS. Tests suggest predictive ability of the calibrated model typical of hydrologic models.

  17. Evaluation of a Local Anesthesia Simulation Model with Dental Students as Novice Clinicians.

    PubMed

    Lee, Jessica S; Graham, Roseanna; Bassiur, Jennifer P; Lichtenthal, Richard M

    2015-12-01

    The aim of this study was to evaluate the use of a local anesthesia (LA) simulation model in a facilitated small group setting before dental students administered an inferior alveolar nerve block (IANB) for the first time. For this pilot study, 60 dental students transitioning from preclinical to clinical education were randomly assigned to either an experimental group (N=30) that participated in a small group session using the simulation model or a control group (N=30). After administering local anesthesia for the first time, students in both groups were given questionnaires regarding levels of preparedness and confidence when administering an IANB and level of anesthesia effectiveness and pain when receiving an IANB. Students in the experimental group exhibited a positive difference on all six questions regarding preparedness and confidence when administering LA to another student. One of these six questions ("I was prepared in administering local anesthesia for the first time") showed a statistically significant difference (p<0.05). Students who received LA from students who practiced on the simulation model also experienced fewer post-injection complications one day after receiving the IANB, including a statistically significant reduction in trismus. No statistically significant difference was found in level of effectiveness of the IANB or perceived levels of pain between the two groups. The results of this pilot study suggest that using a local anesthesia simulation model may be beneficial in increasing a dental student's level of comfort prior to administering local anesthesia for the first time.

  18. Right-Sizing Statistical Models for Longitudinal Data

    PubMed Central

    Wood, Phillip K.; Steinley, Douglas; Jackson, Kristina M.

    2015-01-01

    Arguments are proposed that researchers using longitudinal data should consider more and less complex statistical model alternatives to their initially chosen techniques in an effort to “right-size” the model to the data at hand. Such model comparisons may alert researchers who use poorly fitting overly parsimonious models to more complex better fitting alternatives, and, alternatively, may identify more parsimonious alternatives to overly complex (and perhaps empirically under-identified and/or less powerful) statistical models. A general framework is proposed for considering (often nested) relationships between a variety of psychometric and growth curve models. A three-step approach is proposed in which models are evaluated based on the number and patterning of variance components prior to selection of better-fitting growth models that explain both mean and variation/covariation patterns. The orthogonal, free-curve slope-intercept (FCSI) growth model is considered as a general model which includes, as special cases, many models including the Factor Mean model (FM, McArdle & Epstein, 1987), McDonald's (1967) linearly constrained factor model, Hierarchical Linear Models (HLM), Repeated Measures MANOVA, and the Linear Slope Intercept (LinearSI) Growth Model. The FCSI model, in turn, is nested within the Tuckerized factor model. The approach is illustrated by comparing alternative models in a longitudinal study of children's vocabulary and by comparison of several candidate parametric growth and chronometric models in a Monte Carlo study. PMID:26237507

  19. Right-sizing statistical models for longitudinal data.

    PubMed

    Wood, Phillip K; Steinley, Douglas; Jackson, Kristina M

    2015-12-01

    Arguments are proposed that researchers using longitudinal data should consider more and less complex statistical model alternatives to their initially chosen techniques in an effort to "right-size" the model to the data at hand. Such model comparisons may alert researchers who use poorly fitting, overly parsimonious models to more complex, better-fitting alternatives and, alternatively, may identify more parsimonious alternatives to overly complex (and perhaps empirically underidentified and/or less powerful) statistical models. A general framework is proposed for considering (often nested) relationships between a variety of psychometric and growth curve models. A 3-step approach is proposed in which models are evaluated based on the number and patterning of variance components prior to selection of better-fitting growth models that explain both mean and variation-covariation patterns. The orthogonal free curve slope intercept (FCSI) growth model is considered a general model that includes, as special cases, many models, including the factor mean (FM) model (McArdle & Epstein, 1987), McDonald's (1967) linearly constrained factor model, hierarchical linear models (HLMs), repeated-measures multivariate analysis of variance (MANOVA), and the linear slope intercept (linearSI) growth model. The FCSI model, in turn, is nested within the Tuckerized factor model. The approach is illustrated by comparing alternative models in a longitudinal study of children's vocabulary and by comparing several candidate parametric growth and chronometric models in a Monte Carlo study. (c) 2015 APA, all rights reserved.

  20. Cognitive Complaints After Breast Cancer Treatments: Examining the Relationship With Neuropsychological Test Performance

    PubMed Central

    2013-01-01

    Background Cognitive complaints are reported frequently after breast cancer treatments. Their association with neuropsychological (NP) test performance is not well-established. Methods Early-stage, posttreatment breast cancer patients were enrolled in a prospective, longitudinal, cohort study prior to starting endocrine therapy. Evaluation included an NP test battery and self-report questionnaires assessing symptoms, including cognitive complaints. Multivariable regression models assessed associations among cognitive complaints, mood, treatment exposures, and NP test performance. Results One hundred eighty-nine breast cancer patients, aged 21–65 years, completed the evaluation; 23.3% endorsed higher memory complaints and 19.0% reported higher executive function complaints (>1 SD above the mean for healthy control sample). Regression modeling demonstrated a statistically significant association of higher memory complaints with combined chemotherapy and radiation treatments (P = .01), poorer NP verbal memory performance (P = .02), and higher depressive symptoms (P < .001), controlling for age and IQ. For executive functioning complaints, multivariable modeling controlling for age, IQ, and other confounds demonstrated statistically significant associations with better NP visual memory performance (P = .03) and higher depressive symptoms (P < .001), whereas combined chemotherapy and radiation treatment (P = .05) approached statistical significance. Conclusions About one in five post–adjuvant treatment breast cancer patients had elevated memory and/or executive function complaints that were statistically significantly associated with domain-specific NP test performances and depressive symptoms; combined chemotherapy and radiation treatment was also statistically significantly associated with memory complaints. These results and other emerging studies suggest that subjective cognitive complaints in part reflect objective NP performance, although their etiology and biology appear to be multifactorial, motivating further transdisciplinary research. PMID:23606729

  1. The NBS Energy Model Assessment project: Summary and overview

    NASA Astrophysics Data System (ADS)

    Gass, S. I.; Hoffman, K. L.; Jackson, R. H. F.; Joel, L. S.; Saunders, P. B.

    1980-09-01

    The activities and technical reports for the project are summarized. The reports cover: assessment of the documentation of the Midterm Oil and Gas Supply Modeling System; analysis of the model methodology; characteristics of the input and other supporting data; statistical procedures undergirding construction of the model and the sensitivity of the outputs to variations in input; as well as guidelines and recommendations for the role of these elements in model building and for developing procedures for their evaluation.

  2. Evaluating pictogram prediction in a location-aware augmentative and alternative communication system.

    PubMed

    Garcia, Luís Filipe; de Oliveira, Luís Caldas; de Matos, David Martins

    2016-01-01

    This study compared the performance of two statistical location-aware pictogram prediction mechanisms, with an all-purpose (All) pictogram prediction mechanism, having no location knowledge. The All approach had a unique language model under all locations. One of the location-aware alternatives, the location-specific (Spec) approach, made use of specific language models for pictogram prediction in each location of interest. The other location-aware approach resulted from combining the Spec and the All approaches, and was designated the mixed approach (Mix). In this approach, the language models acquired knowledge from all locations, but a higher relevance was assigned to the vocabulary from the associated location. Results from simulations showed that the Mix and Spec approaches could only outperform the baseline in a statistically significant way if pictogram users reuse more than 50% and 75% of their sentences, respectively. Under low sentence reuse conditions there were no statistically significant differences between the location-aware approaches and the All approach. Under these conditions, the Mix approach performed better than the Spec approach in a statistically significant way.
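
    A minimal sketch of the Mix idea (interpolating a location-specific model with an all-purpose model, with higher weight on the local vocabulary) is shown below; the pictogram counts, location names, and interpolation weight are invented for illustration and are not the study's data or implementation:

    ```python
    # Hedged sketch: mix a location-specific unigram pictogram model with an
    # all-purpose model, then rank pictograms for prediction in that location.
    from collections import Counter

    all_counts = Counter({"eat": 50, "drink": 40, "play": 30, "sleep": 20})   # all locations
    kitchen_counts = Counter({"eat": 20, "drink": 15, "cook": 10})            # one location

    def probs(counts):
        total = sum(counts.values())
        return {w: c / total for w, c in counts.items()}

    def mix(local, general, weight_local=0.7):
        """Interpolated model: higher relevance for the location's own vocabulary."""
        vocab = set(local) | set(general)
        return {w: weight_local * local.get(w, 0.0) + (1 - weight_local) * general.get(w, 0.0)
                for w in vocab}

    mixed = mix(probs(kitchen_counts), probs(all_counts))
    for picto, p in sorted(mixed.items(), key=lambda kv: -kv[1])[:5]:
        print(f"{picto}: {p:.3f}")
    ```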

  3. The impact of statistical adjustment on conditional standard errors of measurement in the assessment of physician communication skills.

    PubMed

    Raymond, Mark R; Clauser, Brian E; Furman, Gail E

    2010-10-01

    The use of standardized patients to assess communication skills is now an essential part of assessing a physician's readiness for practice. To improve the reliability of communication scores, it has become increasingly common in recent years to use statistical models to adjust ratings provided by standardized patients. This study employed ordinary least squares regression to adjust ratings, and then used generalizability theory to evaluate the impact of these adjustments on score reliability and the overall standard error of measurement. In addition, conditional standard errors of measurement were computed for both observed and adjusted scores to determine whether the improvements in measurement precision were uniform across the score distribution. Results indicated that measurement was generally less precise for communication ratings toward the lower end of the score distribution; and the improvement in measurement precision afforded by statistical modeling varied slightly across the score distribution such that the most improvement occurred in the upper-middle range of the score scale. Possible reasons for these patterns in measurement precision are discussed, as are the limitations of the statistical models used for adjusting performance ratings.
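
    The adjustment step described here can be sketched as a simple residual-based correction for rater effects; the rater-severity structure, variable names, and data below are toy assumptions rather than the study's actual model:

    ```python
    # Hedged sketch: ordinary least squares adjustment of communication ratings for
    # rater (standardized patient) effects, keeping each examinee's residual plus
    # the grand mean as the adjusted score.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(5)
    n = 300
    df = pd.DataFrame({
        "rater": rng.choice([f"SP{i}" for i in range(10)], size=n),
        "raw_score": rng.normal(70, 8, size=n),
    })
    # Inject hypothetical rater severity so there is something to adjust for.
    severity = {f"SP{i}": rng.normal(0, 3) for i in range(10)}
    df["raw_score"] += df["rater"].map(severity)

    fit = smf.ols("raw_score ~ C(rater)", data=df).fit()
    df["adjusted_score"] = df["raw_score"].mean() + fit.resid
    print(df.groupby("rater")[["raw_score", "adjusted_score"]].mean().round(1).head())
    ```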

  4. Liquid water breakthrough location distances on a gas diffusion layer of polymer electrolyte membrane fuel cells

    NASA Astrophysics Data System (ADS)

    Yu, Junliang; Froning, Dieter; Reimer, Uwe; Lehnert, Werner

    2018-06-01

    The lattice Boltzmann method is adopted to simulate the three-dimensional dynamic process of liquid water breaking through the gas diffusion layer (GDL) in the polymer electrolyte membrane fuel cell. Twenty-two micro-structures of Toray GDL are built based on a stochastic geometry model. It is found that more than one breakthrough location forms randomly on the GDL surface. Breakthrough location distances (BLD) are analyzed statistically in two ways, and the distribution is evaluated with the Lilliefors test. It is concluded that the BLD can be described by a normal distribution with certain statistical characteristics. Information on the shortest-neighbor breakthrough location distance can serve as input for cell-scale modeling setups in the field of fuel cell simulation.
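
    The statistical treatment described here (shortest-neighbor distances plus a Lilliefors normality test) can be illustrated with a short sketch; the coordinates, sample size, and units below are hypothetical placeholders, not the paper's data:

    ```python
    # Hedged sketch: given hypothetical (x, y) breakthrough coordinates on a GDL
    # surface, compute each location's shortest-neighbor distance and test the
    # sample against a normal distribution (Lilliefors test).
    import numpy as np
    from scipy.spatial.distance import cdist
    from statsmodels.stats.diagnostic import lilliefors

    rng = np.random.default_rng(0)
    points = rng.uniform(0.0, 1.0e-3, size=(30, 2))    # assumed breakthrough coordinates [m]

    d = cdist(points, points)                          # pairwise distances
    np.fill_diagonal(d, np.inf)                        # ignore self-distances
    shortest_neighbor = d.min(axis=1)                  # shortest-neighbor BLD per location

    ks_stat, p_value = lilliefors(shortest_neighbor, dist='norm')
    print(f"Lilliefors KS statistic = {ks_stat:.3f}, p = {p_value:.3f}")
    # A large p-value is consistent with the normal-distribution description of the BLD.
    ```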

  5. Statistical Examination of the Resolution of a Block-Scale Urban Drainage Model

    NASA Astrophysics Data System (ADS)

    Goldstein, A.; Montalto, F. A.; Digiovanni, K. A.

    2009-12-01

    Stormwater drainage models are utilized by cities in order to plan retention systems to prevent combined sewage overflows and to design for development. These models aggregate subcatchments and ignore small pipelines, providing a coarse representation of a sewage network. This study evaluates the importance of resolution by comparing two models developed on a neighborhood scale for predicting the total quantity and peak flow of runoff against observed runoff measured at the site. The low and high resolution models were designed for a 2.6 ha block in the Bronx, NYC in the EPA Stormwater Management Model (SWMM) using a single catchment and separate subcatchments based on surface cover, respectively. The surface covers represented included sidewalks, street, buildings, and backyards. Characteristics for physical surfaces and the infrastructure in the high resolution model were determined from site visits, sewer pipe maps, aerial photographs, and GIS data-sets provided by the NYC Department of City Planning. Since the low resolution model was depicted at a coarser scale, generalizations were assumed about the overall average characteristics of the catchment. Rainfall and runoff data were monitored over a four month period during the summer rainy season. A total of 53 rainfall events was recorded, but only 29 storms produced a sufficient amount of runoff to be evaluated in the simulations. To determine which model was more accurate at predicting the observed runoff, three characteristics for each storm were compared: peak runoff, total runoff, and time to peak. Two statistical tests were used to determine the significance of the results: the percent difference for each storm and the overall chi-squared goodness-of-fit test for both the low and high resolution models. These tests evaluate whether there is a statistically significant difference depending on the resolution of the stormwater model. The scale of representation is being evaluated because it could have a profound impact on how low-impact development strategies are assessed. Rerouting flows to delay the time of entry into the combined sewer is the primary goal of stormwater source controls, which may be better differentiated in a high resolution as opposed to a low resolution model. The preliminary hypothesis is that the low resolution model simplifies the watershed by defining attributes uniformly across it. In the high resolution model, the physical flow paths can be depicted more accurately by connecting the various subcatchments; for example, the runoff from buildings can be routed directly to the backyards. The main drawback to the high resolution model is the risk of adding uncertainty due to the number of parameters.
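
    The two event-level comparisons described above can be outlined in a few lines; the observed and modeled storm totals below are hypothetical, and the study's exact test construction may differ:

    ```python
    # Hedged sketch of a per-storm percent difference and a chi-squared
    # goodness-of-fit comparison of modeled versus observed runoff totals.
    import numpy as np
    from scipy.stats import chisquare

    observed = np.array([12.0, 4.5, 30.2, 8.8, 15.1])   # e.g., total runoff per storm [m^3]
    modeled  = np.array([10.5, 5.2, 27.9, 9.6, 13.0])   # one model resolution's predictions

    # Percent difference for each storm
    percent_diff = 100.0 * (modeled - observed) / observed
    print("percent difference per storm:", np.round(percent_diff, 1))

    # Chi-squared goodness-of-fit of modeled totals against the observations.
    # scipy.stats.chisquare expects the expected values to sum to the observed
    # total, so the modeled values are rescaled before testing.
    expected = modeled * observed.sum() / modeled.sum()
    stat, p = chisquare(f_obs=observed, f_exp=expected)
    print(f"chi-squared = {stat:.2f}, p = {p:.3f}")
    ```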

  6. Two statistics for evaluating parameter identifiability and error reduction

    USGS Publications Warehouse

    Doherty, John; Hunt, Randall J.

    2009-01-01

    Two statistics are presented that can be used to rank input parameters utilized by a model in terms of their relative identifiability based on a given or possible future calibration dataset. Identifiability is defined here as the capability of model calibration to constrain parameters used by a model. Both statistics require that the sensitivity of each model parameter be calculated for each model output for which there are actual or presumed field measurements. Singular value decomposition (SVD) of the weighted sensitivity matrix is then undertaken to quantify the relation between the parameters and observations that, in turn, allows selection of calibration solution and null spaces spanned by unit orthogonal vectors. The first statistic presented, "parameter identifiability", is quantitatively defined as the direction cosine between a parameter and its projection onto the calibration solution space. This varies between zero and one, with zero indicating complete non-identifiability and one indicating complete identifiability. The second statistic, "relative error reduction", indicates the extent to which the calibration process reduces error in estimation of a parameter from its pre-calibration level where its value must be assigned purely on the basis of prior expert knowledge. This is more sophisticated than identifiability, in that it takes greater account of the noise associated with the calibration dataset. Like identifiability, it has a maximum value of one (which can only be achieved if there is no measurement noise). Conceptually it can fall to zero; and even below zero if a calibration problem is poorly posed. An example, based on a coupled groundwater/surface-water model, is included that demonstrates the utility of the statistics. © 2009 Elsevier B.V.
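
    A minimal numerical sketch of the first statistic is given below, under the simplifying assumptions of a synthetic weighted sensitivity matrix and an arbitrarily chosen solution-space dimension (both hypothetical, not drawn from the paper):

    ```python
    # Hedged sketch: parameter identifiability as the direction cosine between each
    # parameter and its projection onto the calibration solution space, computed
    # from an SVD of a (hypothetical) weighted sensitivity matrix J.
    import numpy as np

    rng = np.random.default_rng(1)
    J = rng.normal(size=(8, 5))          # 8 observations x 5 parameters (synthetic)

    U, s, Vt = np.linalg.svd(J, full_matrices=False)
    k = 3                                # assumed dimension of the calibration solution space
    V_sol = Vt[:k, :]                    # right singular vectors spanning the solution space

    # Each parameter axis is a unit vector, so the length of its projection onto the
    # solution space equals sqrt(sum of its squared components in the first k right
    # singular vectors); this is the direction cosine (0 = non-identifiable, 1 = identifiable).
    identifiability = np.sqrt((V_sol ** 2).sum(axis=0))
    print(np.round(identifiability, 3))
    ```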

  7. Exploring Modeling Options and Conversion of Average Response to Appropriate Vibration Envelopes for a Typical Cylindrical Vehicle Panel with Rib-stiffened Design

    NASA Technical Reports Server (NTRS)

    Harrison, Phil; LaVerde, Bruce; Teague, David

    2009-01-01

    Although Statistical Energy Analysis (SEA) techniques are more widely used in the aerospace industry today, opportunities to anchor the response predictions using measured data from a flight-like launch vehicle structure are still quite valuable. Response and excitation data from a ground acoustic test at the Marshall Space Flight Center permitted the authors to compare and evaluate several modeling techniques available in the SEA module of the commercial code VA One. This paper provides an example of vibration response estimates developed using different modeling approaches to both approximate and bound the response of a flight-like vehicle panel. Since both vibration response and acoustic levels near the panel were available from the ground test, the evaluation provided an opportunity to learn how well the different modeling options can match band-averaged spectra developed from the test data. Additional work was performed to understand the spatial averaging of the measurements across the panel from measured data. Finally, an evaluation and comparison was performed of two approaches for converting the statistical average response results output from an SEA analysis into a more useful envelope of response spectra appropriate for specifying design and test vibration levels for a new vehicle.

  8. Synthetic data sets for the identification of key ingredients for RNA-seq differential analysis.

    PubMed

    Rigaill, Guillem; Balzergue, Sandrine; Brunaud, Véronique; Blondet, Eddy; Rau, Andrea; Rogier, Odile; Caius, José; Maugis-Rabusseau, Cathy; Soubigou-Taconnat, Ludivine; Aubourg, Sébastien; Lurin, Claire; Martin-Magniette, Marie-Laure; Delannoy, Etienne

    2018-01-01

    Numerous statistical pipelines are now available for the differential analysis of gene expression measured with RNA-sequencing technology. Most of them are based on similar statistical frameworks after normalization, differing primarily in the choice of data distribution, mean and variance estimation strategy and data filtering. We propose an evaluation of the impact of these choices when few biological replicates are available through the use of synthetic data sets. This framework is based on real data sets and allows the exploration of various scenarios differing in the proportion of non-differentially expressed genes. Hence, it provides an evaluation of the key ingredients of the differential analysis, free of the biases associated with the simulation of data using parametric models. Our results show the relevance of a proper modeling of the mean by using linear or generalized linear modeling. Once the mean is properly modeled, the impact of the other parameters on the performance of the test is much less important. Finally, we propose to use the simple visualization of the raw P-value histogram as a practical evaluation criterion of the performance of differential analysis methods on real data sets. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
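
    As a simple illustration of the suggested diagnostic, a raw P-value histogram can be drawn in a few lines; the P-values below are simulated placeholders, not the article's results:

    ```python
    # Hedged sketch: visualizing a raw P-value histogram as a quick diagnostic for a
    # differential-analysis run. Under the null, P-values should look uniform; an
    # excess of small P-values suggests differentially expressed genes (or miscalibration).
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(2)
    p_null = rng.uniform(size=9000)              # non-differentially expressed genes
    p_alt = rng.beta(0.3, 4.0, size=1000)        # a spike of small P-values
    p_values = np.concatenate([p_null, p_alt])

    plt.hist(p_values, bins=40, edgecolor="black")
    plt.xlabel("raw P-value")
    plt.ylabel("number of genes")
    plt.title("Raw P-value histogram diagnostic")
    plt.show()
    ```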

  9. Rasch Model Based Analysis of the Force Concept Inventory

    ERIC Educational Resources Information Center

    Planinic, Maja; Ivanjek, Lana; Susac, Ana

    2010-01-01

    The Force Concept Inventory (FCI) is an important diagnostic instrument which is widely used in the field of physics education research. It is therefore very important to evaluate and monitor its functioning using different tools for statistical analysis. One of such tools is the stochastic Rasch model, which enables construction of linear…

  10. On the Power of Multivariate Latent Growth Curve Models to Detect Correlated Change

    ERIC Educational Resources Information Center

    Hertzog, Christopher; Lindenberger, Ulman; Ghisletta, Paolo; Oertzen, Timo von

    2006-01-01

    We evaluated the statistical power of single-indicator latent growth curve models (LGCMs) to detect correlated change between two variables (covariance of slopes) as a function of sample size, number of longitudinal measurement occasions, and reliability (measurement error variance). Power approximations following the method of Satorra and Saris…

  11. Rethinking Teacher Evaluation: A Conversation about Statistical Inferences and Value-Added Models

    ERIC Educational Resources Information Center

    Callister Everson, Kimberlee; Feinauer, Erika; Sudweeks, Richard R.

    2013-01-01

    In this article, the authors provide a methodological critique of the current standard of value-added modeling forwarded in educational policy contexts as a means of measuring teacher effectiveness. Conventional value-added estimates of teacher quality are attempts to determine to what degree a teacher would theoretically contribute, on average,…

  12. A Comparison of Latent Growth Models for Constructs Measured by Multiple Items

    ERIC Educational Resources Information Center

    Leite, Walter L.

    2007-01-01

    Univariate latent growth modeling (LGM) of composites of multiple items (e.g., item means or sums) has been frequently used to analyze the growth of latent constructs. This study evaluated whether LGM of composites yields unbiased parameter estimates, standard errors, chi-square statistics, and adequate fit indexes. Furthermore, LGM was compared…

  13. Evaluating Video Self-Modeling Treatment Outcomes: Differentiating between Statistically and Clinically Significant Change

    ERIC Educational Resources Information Center

    La Spata, Michelle G.; Carter, Christopher W.; Johnson, Wendi L.; McGill, Ryan J.

    2016-01-01

    The present study examined the utility of video self-modeling (VSM) for reducing externalizing behaviors (e.g., aggression, conduct problems, hyperactivity, and impulsivity) observed within the classroom environment. After identification of relevant target behaviors, VSM interventions were developed for first and second grade students (N = 4),…

  14. The Concentric Support Model: A Model for the Planning and Evaluation of Distance Learning Programs

    ERIC Educational Resources Information Center

    Osika, Elizabeth

    2006-01-01

    Each year, the number of institutions offering distance learning courses continues to grow significantly (Green, 2002; National Center for Educational Statistics, 2003; Wagner, 2000). Broskoske and Harvey (2000) explained that "many institutions begin a distance education initiative encouraged by the potential benefits, influenced by their…

  15. EVALUATION OF INTERSPECIES DIFFERENCES IN PHARMACOKINETICS (PK) USING A PBPK MODEL FOR THE PESTICIDE DIMETHYLARSINIC ACID (DMAV)

    EPA Science Inventory

    DMAV is an organoarsenical pesticide registered for use on certain citrus crops and as a cotton defoliant. In lifetime oral route studies in rodents, DMAV causes statistically significant increases in bladder tumors in rats, but not in mice. We have developed a PBPK model for D...

  16. Incorporating Biological, Chemical and Toxicological Knowledge into Predictive Models of Toxicity: Letter to the Editor

    EPA Science Inventory

    Thomas et al. (2012) recently published an evaluation of statistical models for classifying in vivo toxicity endpoints from ToxRefDB (Knudsen et al. 2009; Martin et al. 2009a and 2009b) using ToxCast in vitro bioactivity data (Judson et al. 2010) and chemical structure descriptor...

  17. Applications of a New England stream temperature model to evaluate distribution of thermal regimes and sensitivity to change in riparian condition

    EPA Science Inventory

    We have applied a statistical stream network (SSN) model to predict stream thermal metrics (summer monthly medians, growing season maximum magnitude and timing, and daily rates of change) across New England nontidal streams and rivers, excluding northern Maine watersheds that ext...

  18. A Flipped Classroom Model for a Biostatistics Short Course

    ERIC Educational Resources Information Center

    McLaughlin, Jacqueline E.; Kang, Isabell

    2017-01-01

    Effective pedagogical strategies are needed to improve statistical literacy within health sciences education. This paper describes the design, implementation, and evaluation of a highly interactive two-week biostatistics short course using the flipped classroom model in the United States. The course was required for all students at the start of a…

  19. Data Analysis & Statistical Methods for Command File Errors

    NASA Technical Reports Server (NTRS)

    Meshkat, Leila; Waggoner, Bruce; Bryant, Larry

    2014-01-01

    This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors and the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve-fitting and distribution-fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates can be explained by these variables. We have also used goodness-of-fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics bore out as critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as to anticipate future error rates.
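
    One way to sketch this kind of rate model is a count regression of errors against candidate drivers; the variable names, data, and model family below are hypothetical illustrations, and the mission's actual model may differ:

    ```python
    # Hedged sketch: Poisson regression of command file error counts against
    # hypothetical drivers (workload, novelty), with the number of files radiated
    # used as an exposure so the model describes an error *rate*.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n = 60
    files_radiated = rng.integers(5, 50, size=n)
    workload = rng.uniform(1, 10, size=n)            # subjective workload score
    novelty = rng.uniform(0, 1, size=n)              # operational novelty score
    true_rate = 0.02 * np.exp(0.15 * workload + 0.8 * novelty)
    errors = rng.poisson(true_rate * files_radiated)

    X = sm.add_constant(np.column_stack([workload, novelty]))
    model = sm.GLM(errors, X, family=sm.families.Poisson(),
                   exposure=files_radiated).fit()
    print(model.summary())
    ```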

  20. Statistical Evaluation of Utilization of the ISS

    NASA Technical Reports Server (NTRS)

    Andrews, Ross; Andrews, Alida

    2006-01-01

    PayLoad Utilization Modeler (PLUM) is a statistical-modeling computer program used to evaluate the effectiveness of utilization of the International Space Station (ISS) in terms of the number of research facilities that can be operated within a specified interval of time. PLUM is designed to balance the requirements of research facilities aboard the ISS against the resources available on the ISS. PLUM comprises three parts: an interface for the entry of data on constraints and on required and available resources, a database that stores these data as well as the program output, and a modeler. The modeler comprises two subparts: one that generates tens of thousands of random combinations of research facilities and another that calculates the usage of resources for each of those combinations. The results of these calculations are used to generate graphical and tabular reports to determine which facilities are most likely to be operable on the ISS, to identify which ISS resources are inadequate to satisfy the demands upon them, and to generate other data useful in allocation of and planning of resources.
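
    The modeler's core loop (random facility combinations checked against resource limits) can be sketched as follows; the facility names, resource categories, and limits are invented for illustration and do not reflect PLUM's actual inputs:

    ```python
    # Hedged sketch of a PLUM-like sampler: draw random combinations of facilities
    # and count how often each facility appears in combinations that fit within a
    # (hypothetical) resource budget.
    import random
    from collections import Counter

    facilities = {            # facility -> (power_kW, crew_hours_per_week), all assumed
        "FacilityA": (1.2, 5.0),
        "FacilityB": (0.8, 3.5),
        "FacilityC": (2.0, 7.0),
        "FacilityD": (0.5, 2.0),
        "FacilityE": (1.5, 4.0),
    }
    limits = (4.0, 12.0)      # assumed available power and crew time

    feasible_counts = Counter()
    trials = 20_000
    for _ in range(trials):
        k = random.randint(1, len(facilities))
        combo = random.sample(list(facilities), k)
        power = sum(facilities[f][0] for f in combo)
        crew = sum(facilities[f][1] for f in combo)
        if power <= limits[0] and crew <= limits[1]:
            feasible_counts.update(combo)

    for name, count in feasible_counts.most_common():
        print(f"{name}: operable in {count / trials:.1%} of sampled combinations")
    ```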

  1. On statistical analysis of factors affecting anthocyanin extraction from Ixora siamensis

    NASA Astrophysics Data System (ADS)

    Mat Nor, N. A.; Arof, A. K.

    2016-10-01

    This study focused on designing an experimental model in order to evaluate the influence of operative extraction parameters employed for anthocyanin extraction from Ixora siamensis on CIE color measurements (a*, b* and color saturation). Extractions were conducted at temperatures of 30, 55 and 80°C and soaking times of 60, 120 and 180 min, using acidified methanol solvent with different trifluoroacetic acid (TFA) contents of 0.5, 1.75 and 3% (v/v). The statistical evaluation was performed by running analysis of variance (ANOVA) and regression calculations to investigate the significance of the generated model. Results show that the generated regression models adequately explain the data variation and significantly represent the actual relationship between the independent variables and the responses. Analysis of variance (ANOVA) showed high coefficient of determination values (R2) of 0.9687 for a*, 0.9621 for b* and 0.9758 for color saturation, thus ensuring a satisfactory fit of the developed models with the experimental data. The interaction between TFA content and extraction temperature exhibited the highest significant influence on the CIE color parameters.

  2. Optimizing the maximum reported cluster size in the spatial scan statistic for ordinal data.

    PubMed

    Kim, Sehwi; Jung, Inkyung

    2017-01-01

    The spatial scan statistic is an important tool for spatial cluster detection. There have been numerous studies on scanning window shapes. However, little research has been done on the maximum scanning window size or maximum reported cluster size. Recently, Han et al. proposed to use the Gini coefficient to optimize the maximum reported cluster size. However, the method has been developed and evaluated only for the Poisson model. We adopt the Gini coefficient to be applicable to the spatial scan statistic for ordinal data to determine the optimal maximum reported cluster size. Through a simulation study and application to a real data example, we evaluate the performance of the proposed approach. With some sophisticated modification, the Gini coefficient can be effectively employed for the ordinal model. The Gini coefficient most often picked the optimal maximum reported cluster sizes that were the same as or smaller than the true cluster sizes with very high accuracy. It seems that we can obtain a more refined collection of clusters by using the Gini coefficient. The Gini coefficient developed specifically for the ordinal model can be useful for optimizing the maximum reported cluster size for ordinal data and helpful for properly and informatively discovering cluster patterns.
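
    The Gini coefficient that drives this choice can be computed from the cluster-level quantities reported at each candidate maximum size; the short sketch below uses a generic Gini formula and invented case counts, not the authors' implementation:

    ```python
    # Hedged sketch: a generic Gini coefficient, as might be evaluated for the
    # collection of clusters reported at each candidate maximum cluster size, with
    # the candidate giving the largest Gini taken as the optimal maximum size.
    import numpy as np

    def gini(values):
        """Gini coefficient of a non-negative 1-D array (0 = equal, ->1 = concentrated)."""
        x = np.sort(np.asarray(values, dtype=float))
        n = x.size
        cum = np.cumsum(x)
        return (n + 1 - 2 * np.sum(cum) / cum[-1]) / n

    # Hypothetical observed case counts of reported clusters for three candidate
    # maximum reported cluster sizes (e.g., 10%, 25%, 50% of the population at risk).
    candidates = {"10%": [40, 12, 9], "25%": [55, 20, 9, 7], "50%": [90, 15]}
    scores = {size: gini(counts) for size, counts in candidates.items()}
    best = max(scores, key=scores.get)
    print(scores, "-> choose", best)
    ```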

  4. Stochastic performance modeling and evaluation of obstacle detectability with imaging range sensors

    NASA Technical Reports Server (NTRS)

    Matthies, Larry; Grandjean, Pierrick

    1993-01-01

    Statistical modeling and evaluation of the performance of obstacle detection systems for Unmanned Ground Vehicles (UGVs) is essential for the design, evaluation, and comparison of sensor systems. In this report, we address this issue for imaging range sensors by dividing the evaluation problem into two levels: quality of the range data itself and quality of the obstacle detection algorithms applied to the range data. We review existing models of the quality of range data from stereo vision and AM-CW LADAR, then use these to derive a new model for the quality of a simple obstacle detection algorithm. This model predicts the probability of detecting obstacles and the probability of false alarms, as a function of the size and distance of the obstacle, the resolution of the sensor, and the level of noise in the range data. We evaluate these models experimentally using range data from stereo image pairs of a gravel road with known obstacles at several distances. The results show that the approach is a promising tool for predicting and evaluating the performance of obstacle detection with imaging range sensors.

  5. Bayesian uncertainty analysis for complex systems biology models: emulation, global parameter searches and evaluation of gene functions.

    PubMed

    Vernon, Ian; Liu, Junli; Goldstein, Michael; Rowe, James; Topping, Jen; Lindsey, Keith

    2018-01-02

    Many mathematical models have now been employed across every area of systems biology. These models increasingly involve large numbers of unknown parameters, have complex structure which can result in substantial evaluation time relative to the needs of the analysis, and need to be compared to observed data of various forms. The correct analysis of such models usually requires a global parameter search, over a high dimensional parameter space, that incorporates and respects the most important sources of uncertainty. This can be an extremely difficult task, but it is essential for any meaningful inference or prediction to be made about any biological system. It hence represents a fundamental challenge for the whole of systems biology. Bayesian statistical methodology for the uncertainty analysis of complex models is introduced, which is designed to address the high dimensional global parameter search problem. Bayesian emulators that mimic the systems biology model but which are extremely fast to evaluate are embedded within an iterative history match: an efficient method to search high dimensional spaces within a more formal statistical setting, while incorporating major sources of uncertainty. The approach is demonstrated via application to a model of hormonal crosstalk in Arabidopsis root development, which has 32 rate parameters, for which we identify the sets of rate parameter values that lead to acceptable matches between model output and observed trend data. The multiple insights into the model's structure that this analysis provides are discussed. The methodology is applied to a second related model, and the biological consequences of the resulting comparison, including the evaluation of gene functions, are described. Bayesian uncertainty analysis for complex models using both emulators and history matching is shown to be a powerful technique that can greatly aid the study of a large class of systems biology models. It both provides insight into model behaviour and identifies the sets of rate parameters of interest.
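
    A compressed illustration of the emulation-plus-history-matching idea is sketched below; it is not the authors' code, and the stand-in model, observation, noise level, and implausibility cutoff are toy assumptions:

    ```python
    # Hedged sketch: emulate an expensive model with a Gaussian process, then rule
    # out parameter settings whose implausibility I(x) exceeds a cutoff, where
    # I(x) = |z - E[f(x)]| / sqrt(Var_emulator(x) + Var_obs).
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def slow_model(x):                  # stand-in for an expensive systems-biology model
        return np.sin(3 * x) + 0.5 * x

    X_train = np.linspace(0, 2, 8).reshape(-1, 1)      # a few expensive model runs
    y_train = slow_model(X_train).ravel()

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-6)
    gp.fit(X_train, y_train)

    z, var_obs = 1.2, 0.05              # observed value and its (assumed) variance
    X_search = np.linspace(0, 2, 200).reshape(-1, 1)   # cheap global parameter search
    mean, std = gp.predict(X_search, return_std=True)
    implausibility = np.abs(z - mean) / np.sqrt(std**2 + var_obs)

    not_ruled_out = X_search[implausibility < 3.0]     # conventional 3-sigma cutoff
    print(f"{not_ruled_out.size} of {X_search.size} candidate settings remain")
    ```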

  6. Hunting Solomonoff's Swans: Exploring the Boundary Between Physics and Statistics in Hydrological Modeling

    NASA Astrophysics Data System (ADS)

    Nearing, G. S.

    2014-12-01

    Statistical models consistently out-perform conceptual models in the short term, however to account for a nonstationary future (or an unobserved past) scientists prefer to base predictions on unchanging and commutable properties of the universe - i.e., physics. The problem with physically-based hydrology models is, of course, that they aren't really based on physics - they are based on statistical approximations of physical interactions, and we almost uniformly lack an understanding of the entropy associated with these approximations. Thermodynamics is successful precisely because entropy statistics are computable for homogeneous (well-mixed) systems, and ergodic arguments explain the success of Newton's laws to describe systems that are fundamentally quantum in nature. Unfortunately, similar arguments do not hold for systems like watersheds that are heterogeneous at a wide range of scales. Ray Solomonoff formalized the situation in 1968 by showing that given infinite evidence, simultaneously minimizing model complexity and entropy in predictions always leads to the best possible model. The open question in hydrology is about what happens when we don't have infinite evidence - for example, when the future will not look like the past, or when one watershed does not behave like another. How do we isolate stationary and commutable components of watershed behavior? I propose that one possible answer to this dilemma lies in a formal combination of physics and statistics. In this talk I outline my recent analogue (Solomonoff's theorem was digital) of Solomonoff's idea that allows us to quantify the complexity/entropy tradeoff in a way that is intuitive to physical scientists. I show how to formally combine "physical" and statistical methods for model development in a way that allows us to derive the theoretically best possible model given any given physics approximation(s) and available observations. Finally, I apply an analogue of Solomonoff's theorem to evaluate the tradeoff between model complexity and prediction power.

  7. Discharge destination following lower limb fracture: development of a prediction model to assist with decision making.

    PubMed

    Kimmel, Lara A; Holland, Anne E; Edwards, Elton R; Cameron, Peter A; De Steiger, Richard; Page, Richard S; Gabbe, Belinda

    2012-06-01

    Accurate prediction of the likelihood of discharge to inpatient rehabilitation following lower limb fracture, made on admission to hospital, may assist patient discharge planning and decrease the burden on the hospital system caused by delays in decision making. The aim was to develop a prognostic model for discharge to inpatient rehabilitation. Isolated lower extremity fracture cases (excluding fractured neck of femur), captured by the Victorian Orthopaedic Trauma Outcomes Registry (VOTOR), were extracted for analysis. A training data set was created for model development and a validation data set for evaluation. A multivariable logistic regression model was developed based on patient and injury characteristics. Models were assessed using measures of discrimination (C-statistic) and calibration (Hosmer-Lemeshow (H-L) statistic). A total of 1429 patients met the inclusion criteria and were randomly split into training and test data sets. Increasing age, more proximal fracture type, compensation or private fund source for the admission, metropolitan location of residence, not working prior to injury and having a self-reported pre-injury disability were included in the final prediction model. The C-statistic for the model was 0.92 (95% confidence interval (CI) 0.88, 0.95) with an H-L statistic of χ² = 11.62, p = 0.17. For the test data set, the C-statistic was 0.86 (95% CI 0.83, 0.90) with an H-L statistic of χ² = 37.98, p < 0.001. A model to predict discharge to inpatient rehabilitation following lower limb fracture was developed with excellent discrimination, although calibration was reduced in the test data set. This model requires prospective testing but could form an integral part of decision making in regard to discharge disposition, to facilitate timely and accurate referral to rehabilitation and optimise resource allocation. Copyright © 2011 Elsevier Ltd. All rights reserved.
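
    For readers unfamiliar with the two reported metrics, the sketch below shows how a C-statistic and a decile-based Hosmer-Lemeshow statistic are commonly computed for a logistic model; the data are simulated and the decile grouping is one common convention, not necessarily the registry study's exact procedure:

    ```python
    # Hedged sketch: fit a logistic model, then compute discrimination (C-statistic,
    # i.e. ROC AUC) and calibration (a decile-based Hosmer-Lemeshow chi-squared).
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from scipy.stats import chi2

    rng = np.random.default_rng(4)
    n = 800
    age = rng.uniform(18, 90, n)
    proximal = rng.integers(0, 2, n)                 # hypothetical fracture-type flag
    logit = -6 + 0.06 * age + 1.0 * proximal
    y = rng.binomial(1, 1 / (1 + np.exp(-logit)))
    X = np.column_stack([age, proximal])

    p = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)[:, 1]
    print("C-statistic:", round(roc_auc_score(y, p), 3))

    # Hosmer-Lemeshow: group by risk deciles, compare observed vs expected events.
    deciles = np.quantile(p, np.linspace(0, 1, 11))
    groups = np.clip(np.digitize(p, deciles[1:-1]), 0, 9)
    hl = 0.0
    for g in range(10):
        idx = groups == g
        obs, exp, n_g = y[idx].sum(), p[idx].sum(), idx.sum()
        hl += (obs - exp) ** 2 / (exp * (1 - exp / n_g) + 1e-12)
    print("H-L chi-squared:", round(hl, 2), "p =", round(1 - chi2.cdf(hl, df=8), 3))
    ```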

  8. Micro-CT evaluation of the marginal fit of CAD/CAM all ceramic crowns

    NASA Astrophysics Data System (ADS)

    Brenes, Christian

    Objectives: Evaluate the marginal fit of CAD/CAM all ceramic crowns made from lithium disilicate and zirconia using two different fabrication protocols (model and model-less). Methods: Forty anterior all ceramic restorations (20 lithium disilicate, 20 zirconia) were fabricated using a CEREC Bluecam scanner. Two different fabrication methods were used: a full digital approach and a printed model. Completed crowns were cemented and the marginal gap was evaluated using Micro-CT. Each specimen was analyzed in sagittal and trans-axial orientations, allowing a 360° evaluation of the vertical and horizontal fit. Results: Vertical measurements in the lingual, distal and mesial views had an estimated marginal gap from 101.9 to 133.9 microns for E-max crowns and 126.4 to 165.4 microns for zirconia. No significant differences were found between model and model-less techniques. Conclusion: Lithium disilicate restorations exhibited a more accurate and consistent marginal adaptation when compared to zirconia crowns. No statistically significant differences were observed when comparing model or model-less approaches.

  9. Bayesian Sensitivity Analysis of Statistical Models with Missing Data

    PubMed Central

    ZHU, HONGTU; IBRAHIM, JOSEPH G.; TANG, NIANSHENG

    2013-01-01

    Methods for handling missing data depend strongly on the mechanism that generated the missing values, such as missing completely at random (MCAR) or missing at random (MAR), as well as other distributional and modeling assumptions at various stages. It is well known that the resulting estimates and tests may be sensitive to these assumptions as well as to outlying observations. In this paper, we introduce various perturbations to modeling assumptions and individual observations, and then develop a formal sensitivity analysis to assess these perturbations in the Bayesian analysis of statistical models with missing data. We develop a geometric framework, called the Bayesian perturbation manifold, to characterize the intrinsic structure of these perturbations. We propose several intrinsic influence measures to perform sensitivity analysis and quantify the effect of various perturbations to statistical models. We use the proposed sensitivity analysis procedure to systematically investigate the tenability of the non-ignorable missing at random (NMAR) assumption. Simulation studies are conducted to evaluate our methods, and a dataset is analyzed to illustrate the use of our diagnostic measures. PMID:24753718

  10. Teacher Evaluation: Alternate Measures of Student Growth. Q&A with Brian Gill. REL Mid-Atlantic Webinar

    ERIC Educational Resources Information Center

    Regional Educational Laboratory Mid-Atlantic, 2013

    2013-01-01

    This webinar described the findings of our literature review on alternative measures of student growth that are used in teacher evaluation. The review focused on two types of alternative growth measures: statistical growth/value-added models and teacher-developed student learning objectives. This Q&A addressed the questions participants had…

  11. Metrics, The Measure of Your Future: Evaluation Report, 1977.

    ERIC Educational Resources Information Center

    North Carolina State Dept. of Public Instruction, Raleigh. Div. of Development.

    The primary goal of the Metric Education Project was the systematic development of a replicable educational model to facilitate the system-wide conversion to the metric system during the next five to ten years. This document is an evaluation of that project. Three sets of statistical evidence exist to support the fact that the project has been…

  12. Evaluation of a weighted test in the analysis of ordinal gait scores in an additivity model for five OP pesticides.

    EPA Science Inventory

    Appropriate statistical analyses are critical for evaluating interactions of mixtures with a common mode of action, as is often the case for cumulative risk assessments. Our objective is to develop analyses for use when a response variable is ordinal, and to test for interaction...

  13. The National Evaluation of School Nutrition Programs. Final Report - Executive Summary.

    ERIC Educational Resources Information Center

    Radzikowski, Jack

    This is a summary of the final report of a study (begun in 1979) of the National School Lunch, School Breakfast, and Special Milk Programs. The major objectives of the evaluation were to (1) identify existing information on the school nutrition programs; (2) identify determinants of participation in the programs and develop statistical models for…

  14. The Integration of Environmental Education and Communicative English Based on Multiple Intelligence Theory for Students in Extended Schools

    ERIC Educational Resources Information Center

    Sangsongfa, Chalothorn; Rawang, Wee

    2016-01-01

    Research and Development (R&D) was used with 364 students, 44 teachers and 3 school directors before designing innovation, and evaluating the model efficiency with 30 voluntary students by Action Research (AR). The research used questionnaire, interview form and innovation efficiency evaluation form, and statistically analyzed by percentage,…

  15. Estimation of Total Nitrogen and Phosphorus in New England Streams Using Spatially Referenced Regression Models

    USGS Publications Warehouse

    Moore, Richard Bridge; Johnston, Craig M.; Robinson, Keith W.; Deacon, Jeffrey R.

    2004-01-01

    The U.S. Geological Survey (USGS), in cooperation with the U.S. Environmental Protection Agency (USEPA) and the New England Interstate Water Pollution Control Commission (NEIWPCC), has developed a water-quality model, called SPARROW (Spatially Referenced Regressions on Watershed Attributes), to assist in regional total maximum daily load (TMDL) and nutrient-criteria activities in New England. SPARROW is a spatially detailed, statistical model that uses regression equations to relate total nitrogen and phosphorus (nutrient) stream loads to nutrient sources and watershed characteristics. The statistical relations in these equations are then used to predict nutrient loads in unmonitored streams. The New England SPARROW models are built using a hydrologic network of 42,000 stream reaches and associated watersheds. Watershed boundaries are defined for each stream reach in the network through the use of a digital elevation model and existing digitized watershed divides. Nutrient source data is from permitted wastewater discharge data from USEPA's Permit Compliance System (PCS), various land-use sources, and atmospheric deposition. Physical watershed characteristics include drainage area, land use, streamflow, time-of-travel, stream density, percent wetlands, slope of the land surface, and soil permeability. The New England SPARROW models for total nitrogen and total phosphorus have R-squared values of 0.95 and 0.94, with mean square errors of 0.16 and 0.23, respectively. Variables that were statistically significant in the total nitrogen model include permitted municipal-wastewater discharges, atmospheric deposition, agricultural area, and developed land area. Total nitrogen stream-loss rates were significant only in streams with average annual flows less than or equal to 2.83 cubic meters per second. In streams larger than this, there is nondetectable in-stream loss of annual total nitrogen in New England. Variables that were statistically significant in the total phosphorus model include discharges for municipal wastewater-treatment facilities and pulp and paper facilities, developed land area, agricultural area, and forested area. For total phosphorus, loss rates were significant for reservoirs with surface areas of 10 square kilometers or less, and in streams with flows less than or equal to 2.83 cubic meters per second. Applications of SPARROW for evaluating nutrient loading in New England waters include estimates of the spatial distributions of total nitrogen and phosphorus yields, sources of the nutrients, and the potential for delivery of those yields to receiving waters. This information can be used to (1) predict ranges in nutrient levels in surface waters, (2) identify the environmental variables that are statistically significant predictors of nutrient levels in streams, (3) evaluate monitoring efforts for better determination of nutrient loads, and (4) evaluate management options for reducing nutrient loads to achieve water-quality goals.

  16. Numerical 3D flow simulation of attached cavitation structures at ultrasonic horn tips and statistical evaluation of flow aggressiveness via load collectives

    NASA Astrophysics Data System (ADS)

    Mottyll, S.; Skoda, R.

    2015-12-01

    A compressible inviscid flow solver with a barotropic cavitation model is applied to two different ultrasonic horn set-ups and compared to hydrophone, shadowgraphy as well as erosion test data. The statistical analysis of single collapse events in wall-adjacent flow regions allows the determination of the flow aggressiveness via load collectives (cumulative event rate vs collapse pressure), which show an exponential decrease in agreement with studies on hydrodynamic cavitation [1]. A post-processing projection of event rate and collapse pressure on a reference grid reduces the grid dependency significantly. In order to evaluate the erosion-sensitive areas, a statistical analysis of transient wall loads is utilised. Predicted erosion-sensitive areas as well as temporal pressure and vapour volume evolution are in good agreement with the experimental data.

  17. Estimation of M 1 scissors mode strength for deformed nuclei in the medium- to heavy-mass region by statistical Hauser-Feshbach model calculations

    NASA Astrophysics Data System (ADS)

    Mumpower, M. R.; Kawano, T.; Ullmann, J. L.; Krtička, M.; Sprouse, T. M.

    2017-08-01

    Radiative neutron capture is an important nuclear reaction whose accurate description is needed for many applications ranging from nuclear technology to nuclear astrophysics. The description of such a process relies on the Hauser-Feshbach theory which requires the nuclear optical potential, level density, and γ-strength function as model inputs. It has recently been suggested that the M1 scissors mode may explain discrepancies between theoretical calculations and evaluated data. We explore statistical model calculations with the strength of the M1 scissors mode estimated to be dependent on the nuclear deformation of the compound system. We show that the form of the M1 scissors mode improves the theoretical description of evaluated data and the match to experiment in both the fission product and actinide regions. Since the scissors mode occurs in the range of a few keV to a few MeV, it may also impact the neutron capture cross sections of neutron-rich nuclei that participate in the rapid neutron capture process of nucleosynthesis. We comment on the possible impact to nucleosynthesis by evaluating neutron capture rates for neutron-rich nuclei with the M1 scissors mode active.

  18. Comparison of climate envelope models developed using expert-selected variables versus statistical selection

    USGS Publications Warehouse

    Brandt, Laura A.; Benscoter, Allison; Harvey, Rebecca G.; Speroterra, Carolina; Bucklin, David N.; Romañach, Stephanie; Watling, James I.; Mazzotti, Frank J.

    2017-01-01

    Climate envelope models are widely used to describe the potential future distribution of species under different climate change scenarios. It is broadly recognized that there are both strengths and limitations to using climate envelope models and that outcomes are sensitive to initial assumptions, inputs, and modeling methods. Selection of predictor variables, a central step in modeling, is one of the areas where different techniques can yield varying results. Selection of climate variables to use as predictors is often done using statistical approaches that develop correlations between occurrences and climate data. These approaches have received criticism in that they rely on the statistical properties of the data rather than directly incorporating biological information about species responses to temperature and precipitation. We evaluated and compared models and prediction maps for 15 threatened or endangered species in Florida based on two variable selection techniques: expert opinion and a statistical method. We compared model performance between these two approaches for contemporary predictions, and the spatial correlation, spatial overlap and area predicted for contemporary and future climate predictions. In general, experts identified more variables as being important than the statistical method, and there was low overlap in the variable sets (<40%) between the two methods. Despite these differences in variable sets (expert versus statistical), models had high performance metrics (>0.9 for area under the curve (AUC) and >0.7 for the true skill statistic (TSS)). Spatial overlap, which compares the spatial configuration between maps constructed using the different variable selection techniques, was only moderate overall (about 60%), with a great deal of variability across species. The difference in spatial overlap was even greater under future climate projections, indicating additional divergence of model outputs from different variable selection techniques. Our work is in agreement with other studies which have found that for broad-scale species distribution modeling, using statistical methods of variable selection is a useful first step, especially when there is a need to model a large number of species or expert knowledge of the species is limited. Expert input can then be used to refine models that seem unrealistic or for species that experts believe are particularly sensitive to change. It also emphasizes the importance of using multiple models to reduce uncertainty and improve map outputs for conservation planning. Where outputs overlap or show the same direction of change there is greater certainty in the predictions. Areas of disagreement can be used for learning by asking why the models do not agree, and may highlight areas where additional on-the-ground data collection could improve the models.

  19. Flow Chamber System for the Statistical Evaluation of Bacterial Colonization on Materials

    PubMed Central

    Menzel, Friederike; Conradi, Bianca; Rodenacker, Karsten; Gorbushina, Anna A.; Schwibbert, Karin

    2016-01-01

    Biofilm formation on materials leads to high costs in industrial processes, as well as in medical applications. This fact has stimulated interest in the development of new materials with improved surfaces to reduce bacterial colonization. Standardized tests relying on statistical evidence are indispensable to evaluate the quality and safety of these new materials. We describe here a flow chamber system for biofilm cultivation under controlled conditions with a total capacity for testing up to 32 samples in parallel. In order to quantify the surface colonization, bacterial cells were DAPI (4`,6-diamidino-2-phenylindole)-stained and examined with epifluorescence microscopy. More than 100 images of each sample were automatically taken and the surface coverage was estimated using the free open source software g’mic, followed by a precise statistical evaluation. Overview images of all gathered pictures were generated to dissect the colonization characteristics of the selected model organism Escherichia coli W3310 on different materials (glass and implant steel). With our approach, differences in bacterial colonization on different materials can be quantified in a statistically validated manner. This reliable test procedure will support the design of improved materials for medical, industrial, and environmental (subaquatic or subaerial) applications. PMID:28773891

  20. Assessment of Evidence-based Management Training Program: Application of a Logic Model.

    PubMed

    Guo, Ruiling; Farnsworth, Tracy J; Hermanson, Patrick M

    2016-06-01

    The purposes of this study were to apply a logic model to plan and implement an evidence-based management (EBMgt) educational training program for healthcare administrators and to examine whether a logic model is a useful tool for evaluating the outcomes of the educational program. The logic model was used as a conceptual framework to guide the investigators in developing an EBMgt educational training program and evaluating the outcomes of the program. The major components of the logic model were constructed as inputs, outputs, and outcomes/impacts. The investigators delineated the logic model based on the results of the needs assessment survey. Two 3-hour training workshops were delivered to 30 participants. To assess the outcomes of the EBMgt educational program, pre- and post-tests and self-reflection surveys were conducted. The data were collected and analyzed descriptively and inferentially, using the IBM Statistical Package for the Social Sciences (SPSS) 22.0. A paired sample t-test was performed to compare the differences in participants' EBMgt knowledge and skills prior to and after the training. The assessment results showed that there was a statistically significant difference in participants' EBMgt knowledge and information searching skills before and after the training (p< 0.001). Participants' confidence in using the EBMgt approach for decision-making was significantly increased after the training workshops (p< 0.001). Eighty-three percent of participants indicated that the knowledge and skills they gained through the training program could be used for future management decision-making in their healthcare organizations. The overall evaluation results of the program were positive. It is suggested that the logic model is a useful tool for program planning, implementation, and evaluation, and it also improves the outcomes of the educational program.
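
    The pre/post comparison reported here corresponds to a paired-samples t-test, which can be reproduced in outline as follows (the scores are simulated placeholders, not the workshop data):

    ```python
    # Hedged sketch: paired-samples t-test on pre- and post-training knowledge scores.
    import numpy as np
    from scipy.stats import ttest_rel

    rng = np.random.default_rng(6)
    pre = rng.normal(60, 10, size=30)             # 30 hypothetical participants
    post = pre + rng.normal(12, 6, size=30)       # assumed improvement after training

    t_stat, p_value = ttest_rel(post, pre)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}, mean gain = {np.mean(post - pre):.1f}")
    ```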

  1. Evaluation of the learning curve of non-penetrating glaucoma surgery.

    PubMed

    Aslan, Fatih; Yuce, Berna; Oztas, Zafer; Ates, Halil

    2017-08-11

    To evaluate the learning curve of non-penetrating glaucoma surgery (NPGS). The study included 32 eyes of 27 patients (20 male and 7 female) with medically uncontrolled glaucoma. Non-penetrating glaucoma surgeries performed by trainees under the control of an experienced surgeon between 2005 and 2007 at our tertiary referral hospital were evaluated. Residents were separated into two groups. A humanistic training model was applied to the resident in the first group, who practiced with experimental models before performing NPGS. Two residents in the second group performed NPGS after a conventional training model. Surgeries of the residents were recorded on video and intraoperative parameters were scored by the experienced surgeon at the end of the study. Postoperative intraocular pressure and absolute and total success rates were analyzed. In the first group 19 eyes of 16 patients and in the second group 13 eyes of 11 patients had been operated on by residents. Differences in intraoperative parameters and complication rates between groups were not statistically significant (p > 0.05, Chi-square). The duration of surgery was 32.7 ± 5.6 min in the first group and 45 ± 3.8 min in the second group. The difference was statistically significant (p < 0.001, Student's t test). Absolute and total success was 68.8 and 93.8% in the first group and 62.5 and 87.5% in the second group, respectively. The difference was not statistically significant. Humanistic and conventional training models under the control of an experienced surgeon are safe and effective for senior residents who manage phacoemulsification surgery in routine cataract cases. Senior residents can practice these surgical techniques with reasonable complication rates.

  2. Two approaches to incorporate clinical data uncertainty into multiple criteria decision analysis for benefit-risk assessment of medicinal products.

    PubMed

    Wen, Shihua; Zhang, Lanju; Yang, Bo

    2014-07-01

    The Problem formulation, Objectives, Alternatives, Consequences, Trade-offs, Uncertainties, Risk attitude, and Linked decisions (PrOACT-URL) framework and multiple criteria decision analysis (MCDA) have been recommended by the European Medicines Agency for structured benefit-risk assessment of medicinal products undergoing regulatory review. The objective of this article was to provide solutions to incorporate the uncertainty from clinical data into the MCDA model when evaluating the overall benefit-risk profiles among different treatment options. Two statistical approaches, the δ-method approach and the Monte-Carlo approach, were proposed to construct the confidence interval of the overall benefit-risk score from the MCDA model as well as other probabilistic measures for comparing the benefit-risk profiles between treatment options. Both approaches can incorporate the correlation structure between clinical parameters (criteria) in the MCDA model and are straightforward to implement. The two proposed approaches were applied to a case study to evaluate the benefit-risk profile of an add-on therapy for rheumatoid arthritis (drug X) relative to placebo. It demonstrated a straightforward way to quantify the impact of the uncertainty from clinical data to the benefit-risk assessment and enabled statistical inference on evaluating the overall benefit-risk profiles among different treatment options. The δ-method approach provides a closed form to quantify the variability of the overall benefit-risk score in the MCDA model, whereas the Monte-Carlo approach is more computationally intensive but can yield its true sampling distribution for statistical inference. The obtained confidence intervals and other probabilistic measures from the two approaches enhance the benefit-risk decision making of medicinal products. Copyright © 2014 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
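
    The Monte-Carlo approach can be outlined as repeatedly sampling the clinical criteria from their estimated (possibly correlated) distributions and recomputing the weighted MCDA score; the weights, means, value scale, and covariance below are illustrative assumptions, not the case study's inputs:

    ```python
    # Hedged sketch of the Monte-Carlo approach: propagate clinical-data uncertainty
    # (with correlation) through a weighted MCDA benefit-risk score and summarize
    # the drug-versus-placebo difference with a percentile confidence interval.
    import numpy as np

    rng = np.random.default_rng(7)
    weights = np.array([0.5, 0.3, 0.2])           # criterion weights (benefit, risk1, risk2)

    # Estimated criterion means (already on a common 0-1 value scale) and covariance.
    mean_drug = np.array([0.70, 0.60, 0.55])
    mean_plac = np.array([0.40, 0.80, 0.75])
    cov = 0.002 * np.array([[1.0, 0.3, 0.2], [0.3, 1.0, 0.4], [0.2, 0.4, 1.0]])

    draws_drug = rng.multivariate_normal(mean_drug, cov, size=10_000)
    draws_plac = rng.multivariate_normal(mean_plac, cov, size=10_000)
    diff = (draws_drug - draws_plac) @ weights    # overall benefit-risk score difference

    lo, hi = np.percentile(diff, [2.5, 97.5])
    print(f"score difference: {diff.mean():.3f} (95% CI {lo:.3f} to {hi:.3f}); "
          f"P(drug > placebo) = {(diff > 0).mean():.3f}")
    ```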

  3. Studies regarding the quality of numerical weather forecasts of the WRF model integrated at high-resolutions for the Romanian territory

    DOE PAGES

    Iriza, Amalia; Dumitrache, Rodica C.; Lupascu, Aurelia; ...

    2016-01-01

    Our paper aims to evaluate the quality of high-resolution weather forecasts from the Weather Research and Forecasting (WRF) numerical weather prediction model. The lateral and boundary conditions were obtained from the numerical output of the Consortium for Small-scale Modeling (COSMO) model at 7 km horizontal resolution. Furthermore, the WRF model was run for January and July 2013 at two horizontal resolutions (3 and 1 km). The numerical forecasts of the WRF model were evaluated using different statistical scores for 2 m temperature and 10 m wind speed. Our results showed a tendency of the WRF model to overestimate the values of the analyzed parameters in comparison to observations.
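
    Typical point-verification scores for such forecasts (bias, MAE, and RMSE of 2 m temperature against station observations) can be computed as in the sketch below; the arrays are placeholders, not the study's data:

    ```python
    # Hedged sketch: simple verification statistics (bias, MAE, RMSE) for a
    # forecast series of 2 m temperature against station observations.
    import numpy as np

    observed = np.array([270.1, 271.4, 273.0, 275.2, 276.8])   # hypothetical obs [K]
    forecast = np.array([271.0, 272.3, 273.4, 276.5, 278.0])   # hypothetical WRF values [K]

    error = forecast - observed
    bias = error.mean()
    mae = np.abs(error).mean()
    rmse = np.sqrt((error ** 2).mean())
    print(f"bias = {bias:+.2f} K, MAE = {mae:.2f} K, RMSE = {rmse:.2f} K")
    # A positive bias would correspond to the reported tendency to overestimate.
    ```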

  5. Statistical performance and information content of time lag analysis and redundancy analysis in time series modeling.

    PubMed

    Angeler, David G; Viedma, Olga; Moreno, José M

    2009-11-01

    Time lag analysis (TLA) is a distance-based approach used to study temporal dynamics of ecological communities by measuring community dissimilarity over increasing time lags. Despite its increased use in recent years, its performance in comparison with other more direct methods (i.e., canonical ordination) has not been evaluated. This study fills this gap using extensive simulations and real data sets from experimental temporary ponds (true zooplankton communities) and landscape studies (landscape categories as pseudo-communities) that differ in community structure and anthropogenic stress history. Modeling time with a principal coordinate of neighborhood matrices (PCNM) approach, the canonical ordination technique (redundancy analysis; RDA) consistently outperformed the other statistical tests (i.e., TLAs, Mantel test, and RDA based on linear time trends) using all real data. In addition, the RDA-PCNM revealed different patterns of temporal change, and the strength of each individual time pattern, in terms of adjusted variance explained, could be evaluated. It also identified species contributions to these patterns of temporal change. This additional information is not provided by distance-based methods. The simulation study revealed better Type I error properties of the canonical ordination techniques compared with the distance-based approaches when no deterministic component of change was imposed on the communities. The simulation also revealed that strong emphasis on uniform deterministic change and low variability at other temporal scales is needed to result in decreased statistical power of the RDA-PCNM approach relative to the other methods. Based on the statistical performance of and information content provided by RDA-PCNM models, this technique serves ecologists as a powerful tool for modeling temporal change of ecological (pseudo-) communities.
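
    A compact sketch of the time lag analysis idea described above: community dissimilarity is regressed on the (square-root of the) time lag separating sampling dates. The community matrix below is random placeholder data, not one of the study data sets:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import linregress

rng = np.random.default_rng(1)
community = rng.poisson(5, size=(20, 10)).astype(float)   # 20 dates x 10 taxa

diss = squareform(pdist(community, metric="braycurtis"))  # pairwise dissimilarity
n = community.shape[0]
lags, values = [], []
for lag in range(1, n):
    for t in range(n - lag):
        lags.append(lag)
        values.append(diss[t, t + lag])

slope, intercept, r, p, se = linregress(np.sqrt(lags), values)
print(f"TLA slope = {slope:.3f} (p = {p:.3f})")  # a positive slope suggests directional change
```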

  6. SPATIO-TEMPORAL MODELING OF AGRICULTURAL YIELD DATA WITH AN APPLICATION TO PRICING CROP INSURANCE CONTRACTS

    PubMed Central

    Ozaki, Vitor A.; Ghosh, Sujit K.; Goodwin, Barry K.; Shirota, Ricardo

    2009-01-01

    This article presents a statistical model of agricultural yield data based on a set of hierarchical Bayesian models that allows joint modeling of temporal and spatial autocorrelation. This method captures a comprehensive range of the various uncertainties involved in predicting crop insurance premium rates as opposed to the more traditional ad hoc, two-stage methods that are typically based on independent estimation and prediction. A panel data set of county-average yield data was analyzed for 290 counties in the State of Paraná (Brazil) for the period of 1990 through 2002. Posterior predictive criteria are used to evaluate different model specifications. This article provides substantial improvements in the statistical and actuarial methods often applied to the calculation of insurance premium rates. These improvements are especially relevant to situations where data are limited. PMID:19890450

  7. Molecular Modeling in Drug Design for the Development of Organophosphorus Antidotes/Prophylactics.

    DTIC Science & Technology

    1986-06-01

    multidimensional statistical QSAR analysis techniques to suggest new structures for synthesis and evaluation. C. Application of quantum chemical techniques to...compounds for synthesis and testing for antidotal potency. E. Use of computer-assisted methods to determine the steric constraints at the active site...modeling techniques to model the enzyme acetylcholinester-se. H. Suggestion of some novel compounds for synthesis and testing for reactivating

  8. A Comparison of Four Estimators of a Population Measure of Model Fit in Covariance Structure Analysis

    ERIC Educational Resources Information Center

    Zhang, Wei

    2008-01-01

    A major issue in the utilization of covariance structure analysis is model fit evaluation. Recent years have witnessed increasing interest in various test statistics and so-called fit indexes, most of which are actually based on or closely related to F[subscript 0], a measure of model fit in the population. This study aims to provide a systematic…

  9. Plan Recognition using Statistical Relational Models

    DTIC Science & Technology

    2014-08-25

    arguments. Section 4 describes several variants of MLNs for plan recognition. All MLN models were implemented using Alchemy (Kok et al., 2010), an...For both MLN approaches, we used MC-SAT (Poon and Domingos, 2006) as implemented in the Alchemy system on both Monroe and Linux. Evaluation Metric We...Singla P, Poon H, Lowd D, Wang J, Nath A, Domingos P. The Alchemy System for Statistical Relational AI. Technical Report; Department of Computer Science

  10. Comparison of statistical models for analyzing wheat yield time series.

    PubMed

    Michel, Lucie; Makowski, David

    2013-01-01

    The world's population is predicted to exceed nine billion by 2050 and there is increasing concern about the capability of agriculture to feed such a large population. Foresight studies on food security are frequently based on crop yield trends estimated from yield time series provided by national and regional statistical agencies. Various types of statistical models have been proposed for the analysis of yield time series, but the predictive performances of these models have not yet been evaluated in detail. In this study, we present eight statistical models for analyzing yield time series and compare their ability to predict wheat yield at the national and regional scales, using data provided by the Food and Agriculture Organization of the United Nations and by the French Ministry of Agriculture. The Holt-Winters and dynamic linear models performed equally well, giving the most accurate predictions of wheat yield. However, dynamic linear models have two advantages over Holt-Winters models: they can be used to reconstruct past yield trends retrospectively and to analyze uncertainty. The results obtained with dynamic linear models indicated a stagnation of wheat yields in many countries, but the estimated rate of increase of wheat yield remained above 0.06 t ha⁻¹ year⁻¹ in several countries in Europe, Asia, Africa and America, and the estimated values were highly uncertain for several major wheat producing countries. The rate of yield increase differed considerably between French regions, suggesting that efforts to identify the main causes of yield stagnation should focus on a subnational scale.
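
    A minimal sketch of fitting a Holt-Winters (exponential smoothing with additive trend) model to a yield series, assuming the statsmodels package is available; the series below is synthetic, not FAO or French Ministry data:

```python
import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing

rng = np.random.default_rng(2)
yield_t = 2.0 + 0.05 * np.arange(50) + rng.normal(0, 0.2, 50)   # synthetic yield series (t/ha)

fit = ExponentialSmoothing(yield_t, trend="add", seasonal=None).fit()
print(fit.forecast(5))   # five-step-ahead yield forecast
```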

  11. Parallel algorithm for solving Kepler’s equation on Graphics Processing Units: Application to analysis of Doppler exoplanet searches

    NASA Astrophysics Data System (ADS)

    Ford, Eric B.

    2009-05-01

    We present the results of a highly parallel Kepler equation solver using the Graphics Processing Unit (GPU) on a commercial nVidia GeForce 280GTX and the "Compute Unified Device Architecture" (CUDA) programming environment. We apply this to evaluate a goodness-of-fit statistic (e.g., χ²) for Doppler observations of stars potentially harboring multiple planetary companions (assuming negligible planet-planet interactions). Given the high-dimensionality of the model parameter space (at least five dimensions per planet), a global search is extremely computationally demanding. We expect that the underlying Kepler solver and model evaluator will be combined with a wide variety of more sophisticated algorithms to provide efficient global search, parameter estimation, model comparison, and adaptive experimental design for radial velocity and/or astrometric planet searches. We tested multiple implementations using single precision, double precision, pairs of single precision, and mixed precision arithmetic. We find that the vast majority of computations can be performed using single precision arithmetic, with selective use of compensated summation for increased precision. However, standard single precision is not adequate for calculating the mean anomaly from the time of observation and orbital period when evaluating the goodness-of-fit for real planetary systems and observational data sets. Using all double precision, our GPU code outperforms a similar code using a modern CPU by a factor of over 60. Using mixed precision, our GPU code provides a speed-up factor of over 600 when evaluating n_sys > 1024 model planetary systems, each containing n_pl = 4 planets, and assuming n_obs = 256 observations of each system. We conclude that modern GPUs also offer a powerful tool for repeatedly evaluating Kepler's equation and a goodness-of-fit statistic for orbital models when presented with a large parameter space.
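
    A plain-CPU sketch of the core computation the record above accelerates on a GPU: solving Kepler's equation M = E − e·sin(E) for the eccentric anomaly E by Newton iteration, vectorized over many mean anomalies:

```python
import numpy as np

def solve_kepler(M, e, tol=1e-12, max_iter=50):
    """Return eccentric anomaly E for mean anomaly M (radians) and eccentricity e."""
    E = np.where(e < 0.8, M, np.pi * np.ones_like(M))   # common starting guess
    for _ in range(max_iter):
        dE = (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))   # Newton step
        E -= dE
        if np.max(np.abs(dE)) < tol:
            break
    return E

M = np.linspace(0.0, 2.0 * np.pi, 7)
E = solve_kepler(M, e=0.3)
print(np.max(np.abs(E - 0.3 * np.sin(E) - M)))   # residual is near machine precision
```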

  12. Evaluation of three statistical prediction models for forensic age prediction based on DNA methylation.

    PubMed

    Smeers, Inge; Decorte, Ronny; Van de Voorde, Wim; Bekaert, Bram

    2018-05-01

    DNA methylation is a promising biomarker for forensic age prediction. A challenge that has emerged in recent studies is the fact that prediction errors become larger with increasing age due to interindividual differences in epigenetic ageing rates. This phenomenon of non-constant variance or heteroscedasticity violates an assumption of the often used method of ordinary least squares (OLS) regression. The aim of this study was to evaluate alternative statistical methods that do take heteroscedasticity into account in order to provide more accurate, age-dependent prediction intervals. A weighted least squares (WLS) regression is proposed as well as a quantile regression model. Their performances were compared against an OLS regression model based on the same dataset. Both models provided age-dependent prediction intervals which account for the increasing variance with age, but WLS regression performed better in terms of success rate in the current dataset. However, quantile regression might be a preferred method when dealing with a variance that is not only non-constant, but also not normally distributed. Ultimately the choice of which model to use should depend on the observed characteristics of the data. Copyright © 2018 Elsevier B.V. All rights reserved.
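
    A small sketch of the weighted least squares idea described above, assuming synthetic data in which prediction variance grows with the predictor (heteroscedasticity) and a known variance structure for the weights; it is not the study's methylation panel:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
methylation = rng.uniform(0.2, 0.9, 200)                            # proxy marker (beta values)
age = 15 + 70 * methylation + rng.normal(0, 2 + 10 * methylation)   # noisier at higher values

X = sm.add_constant(methylation)
ols = sm.OLS(age, X).fit()
weights = 1.0 / (2 + 10 * methylation) ** 2    # assumed inverse-variance weights
wls = sm.WLS(age, X, weights=weights).fit()

print(ols.params, wls.params)   # WLS down-weights the high-variance samples
```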

  13. Routine Discovery of Complex Genetic Models using Genetic Algorithms

    PubMed Central

    Moore, Jason H.; Hahn, Lance W.; Ritchie, Marylyn D.; Thornton, Tricia A.; White, Bill C.

    2010-01-01

    Simulation studies are useful in various disciplines for a number of reasons including the development and evaluation of new computational and statistical methods. This is particularly true in human genetics and genetic epidemiology where new analytical methods are needed for the detection and characterization of disease susceptibility genes whose effects are complex, nonlinear, and partially or solely dependent on the effects of other genes (i.e. epistasis or gene-gene interaction). Despite this need, the development of complex genetic models that can be used to simulate data is not always intuitive. In fact, only a few such models have been published. We have previously developed a genetic algorithm approach to discovering complex genetic models in which two single nucleotide polymorphisms (SNPs) influence disease risk solely through nonlinear interactions. In this paper, we extend this approach for the discovery of high-order epistasis models involving three to five SNPs. We demonstrate that the genetic algorithm is capable of routinely discovering interesting high-order epistasis models in which each SNP influences risk of disease only through interactions with the other SNPs in the model. This study opens the door for routine simulation of complex gene-gene interactions among SNPs for the development and evaluation of new statistical and computational approaches for identifying common, complex multifactorial disease susceptibility genes. PMID:20948983

  14. Performance evaluation of a hybrid-passive landfill leachate treatment system using multivariate statistical techniques

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wallace, Jack, E-mail: jack.wallace@ce.queensu.ca; Champagne, Pascale, E-mail: champagne@civil.queensu.ca; Monnier, Anne-Charlotte, E-mail: anne-charlotte.monnier@insa-lyon.fr

    Highlights: • Performance of a hybrid passive landfill leachate treatment system was evaluated. • 33 water chemistry parameters were sampled for 21 months and statistically analyzed. • Parameters were strongly linked and explained most (>40%) of the variation in data. • Alkalinity, ammonia, COD, heavy metals, and iron were criteria for performance. • Eight other parameters were key in modeling system dynamics and criteria. - Abstract: A pilot-scale hybrid-passive treatment system operated at the Merrick Landfill in North Bay, Ontario, Canada, treats municipal landfill leachate and provides for subsequent natural attenuation. Collected leachate is directed to a hybrid-passive treatment system, followed by controlled release to a natural attenuation zone before entering the nearby Little Sturgeon River. The study presents a comprehensive evaluation of the performance of the system using multivariate statistical techniques to determine the interactions between parameters, major pollutants in the leachate, and the biological and chemical processes occurring in the system. Five parameters (ammonia, alkalinity, chemical oxygen demand (COD), "heavy" metals of interest, with atomic weights above calcium, and iron) were set as criteria for the evaluation of system performance based on their toxicity to aquatic ecosystems and importance in treatment with respect to discharge regulations. System data for a full range of water quality parameters over a 21-month period were analyzed using principal components analysis (PCA), as well as principal components (PC) and partial least squares (PLS) regressions. PCA indicated a high degree of association for most parameters with the first PC, which explained a high percentage (>40%) of the variation in the data, suggesting strong statistical relationships among most of the parameters in the system. Regression analyses identified 8 parameters (set as independent variables) that were most frequently retained for modeling the five criteria parameters (set as dependent variables), on a statistically significant level: conductivity, dissolved oxygen (DO), nitrite (NO₂⁻), organic nitrogen (N), oxidation reduction potential (ORP), pH, sulfate and total volatile solids (TVS). The criteria parameters and the significant explanatory parameters were most important in modeling the dynamics of the passive treatment system during the study period. Such techniques and procedures were found to be highly valuable and could be applied to other sites to determine parameters of interest in similar naturalized engineered systems.
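
    A brief sketch of the PCA step described above: standardize the water-chemistry parameters and inspect how much variation the first principal component explains. The data are random placeholders, not the Merrick Landfill record:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
X = rng.normal(size=(90, 33))            # 90 sampling events x 33 parameters (placeholder)

Z = StandardScaler().fit_transform(X)    # standardize before PCA
pca = PCA(n_components=5).fit(Z)
print(pca.explained_variance_ratio_)     # a first component above 0.4 would mirror the reported result
```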

  15. Performance analysis of different tuning rules for an isothermal CSTR using integrated EPC and SPC

    NASA Astrophysics Data System (ADS)

    Roslan, A. H.; Karim, S. F. Abd; Hamzah, N.

    2018-03-01

    This paper demonstrates the integration of Engineering Process Control (EPC) and Statistical Process Control (SPC) for the control of product concentration of an isothermal CSTR. The objectives of this study are to evaluate the performance of Ziegler-Nichols (Z-N), Direct Synthesis (DS), and Internal Model Control (IMC) tuning methods and determine the most effective method for this process. The simulation model was obtained from past literature and reconstructed using SIMULINK (MATLAB) to evaluate the process response. Additionally, the process stability, capability and normality were analyzed using Process Capability Sixpack reports in Minitab. Based on the results, DS displays the best response, having the smallest rise time, settling time, overshoot, undershoot, Integral Time Absolute Error (ITAE) and Integral Square Error (ISE). Also, based on the statistical analysis, DS emerges as the best tuning method, as it exhibits the highest process stability and capability.
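
    A minimal sketch of the ITAE and ISE performance indices used above to rank the tuning methods, computed from a placeholder error signal e(t) rather than the simulated CSTR response:

```python
import numpy as np
from scipy.integrate import trapezoid

t = np.linspace(0.0, 50.0, 2001)
error = np.exp(-0.2 * t) * np.cos(0.5 * t)        # stand-in for setpoint minus output

itae = trapezoid(t * np.abs(error), t)            # Integral of Time-weighted Absolute Error
ise = trapezoid(error ** 2, t)                    # Integral of Squared Error
print(f"ITAE = {itae:.3f}, ISE = {ise:.3f}")      # smaller values indicate better tuning
```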

  16. Graphical tools for network meta-analysis in STATA.

    PubMed

    Chaimani, Anna; Higgins, Julian P T; Mavridis, Dimitris; Spyridonos, Panagiota; Salanti, Georgia

    2013-01-01

    Network meta-analysis synthesizes direct and indirect evidence in a network of trials that compare multiple interventions and has the potential to rank the competing treatments according to the studied outcome. Despite its usefulness network meta-analysis is often criticized for its complexity and for being accessible only to researchers with strong statistical and computational skills. The evaluation of the underlying model assumptions, the statistical technicalities and presentation of the results in a concise and understandable way are all challenging aspects in the network meta-analysis methodology. In this paper we aim to make the methodology accessible to non-statisticians by presenting and explaining a series of graphical tools via worked examples. To this end, we provide a set of STATA routines that can be easily employed to present the evidence base, evaluate the assumptions, fit the network meta-analysis model and interpret its results.

  18. Computer Administering of the Psychological Investigations: Set-Relational Representation

    NASA Astrophysics Data System (ADS)

    Yordzhev, Krasimir

    Computer administration of a psychological investigation is the computer representation of the entire assessment procedure - test construction, test implementation, results evaluation, storage and maintenance of the resulting database, and its statistical processing, analysis and interpretation. A mathematical description of psychological assessment with the aid of personality tests is discussed in this article. Set theory and relational algebra are used in this description. A relational data model, needed to design a computer system for automating certain psychological assessments, is given. The finite sets and relations on them that are necessary for creating a personality test are described. The described model could be used to develop software for computer administration of any psychological test, with full automation of the whole process: test construction, test implementation, results evaluation, database storage, statistical processing, analysis and interpretation. A software project for computer administration of personality psychological tests is proposed.

  19. Modeling the Test-Retest Statistics of a Localization Experiment in the Full Horizontal Plane.

    PubMed

    Morsnowski, André; Maune, Steffen

    2016-10-01

    Two approaches to modeling the test-retest statistics of a localization experiment, based on a Gaussian distribution and on surrogate data, are introduced. Their efficiency is investigated using different measures describing directional hearing ability. A localization experiment in the full horizontal plane is a challenging task for hearing-impaired patients. In clinical routine, we use this experiment to evaluate the progress of our cochlear implant (CI) recipients. Listening and time effort limit the reproducibility. The localization experiment consists of a circle of 12 loudspeakers placed in an anechoic room (a "camera silens"). In darkness, HSM sentences are presented at 65 dB pseudo-erratically from all 12 directions with five repetitions. This experiment is modeled by a set of Gaussian distributions with different standard deviations added to a perfect estimator, as well as by surrogate data. Five repetitions per direction are used to produce surrogate data distributions for the sensation directions. To investigate the statistics, we retrospectively use the data of 33 CI patients with 92 pairs of test-retest measurements from the same day. The first model does not take inversions into account (i.e., permutations of the direction from back to front and vice versa are not considered), although these are common for hearing-impaired persons, particularly in the rear hemisphere. The second model considers these inversions but does not work with all measures. The introduced models successfully describe test-retest statistics of directional hearing. However, since their performance differs across the investigated measures, no general recommendation can be provided. The presented test-retest statistics enable pair test comparisons for localization experiments.

  20. Filter Tuning Using the Chi-Squared Statistic

    NASA Technical Reports Server (NTRS)

    Lilly-Salkowski, Tyler B.

    2017-01-01

    This paper examines the use of the Chi-square statistic as a means of evaluating filter performance. The goal of the process is to characterize the filter performance in the metric of covariance realism. The Chi-squared statistic is the value calculated to determine the realism of a covariance based on the prediction accuracy and the covariance values at a given point in time. Once calculated, it is the distribution of this statistic that provides insight on the accuracy of the covariance. The process of tuning an Extended Kalman Filter (EKF) for Aqua and Aura support is described, including examination of the measurement errors of available observation types, and methods of dealing with potentially volatile atmospheric drag modeling. Predictive accuracy and the distribution of the Chi-squared statistic, calculated from EKF solutions, are assessed.
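
    A compact sketch of the covariance-realism check described above: at each epoch the squared Mahalanobis distance of the prediction error with respect to the filter covariance should follow a chi-squared distribution with as many degrees of freedom as error components. The errors and covariance below are simulated placeholders, not EKF solutions:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(5)
n, dof = 500, 3
P = np.diag([4.0, 9.0, 1.0])                        # assumed (realistic) covariance
errors = rng.multivariate_normal(np.zeros(dof), P, size=n)

stat = np.einsum("ij,jk,ik->i", errors, np.linalg.inv(P), errors)  # chi-squared statistic per epoch
print(np.mean(stat), dof)                           # mean ≈ dof when the covariance is realistic
print(chi2.cdf(np.percentile(stat, 95), dof))       # ≈ 0.95 for a well-tuned filter
```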

  1. Predicting adsorptive removal of chlorophenol from aqueous solution using artificial intelligence based modeling approaches.

    PubMed

    Singh, Kunwar P; Gupta, Shikha; Ojha, Priyanka; Rai, Premanjali

    2013-04-01

    The research aims to develop artificial intelligence (AI)-based models to predict the adsorptive removal of 2-chlorophenol (CP) in aqueous solution by coconut shell carbon (CSC) using four operational variables (pH of solution, adsorbate concentration, temperature, and contact time), and to investigate their effects on the adsorption process. Accordingly, based on a factorial design, 640 batch experiments were conducted. Nonlinearities in experimental data were checked using Brock-Dechert-Scheinkman (BDS) statistics. Five nonlinear models were constructed to predict the adsorptive removal of CP in aqueous solution by CSC using four variables as input. Performances of the constructed models were evaluated and compared using statistical criteria. BDS statistics revealed strong nonlinearity in experimental data. Performance of all the models constructed here was satisfactory. Radial basis function network (RBFN) and multilayer perceptron network (MLPN) models performed better than generalized regression neural network, support vector machines, and gene expression programming models. Sensitivity analysis revealed that the contact time had the highest effect on adsorption, followed by the solution pH, temperature, and CP concentration. The study concluded that all the models constructed here were capable of capturing the nonlinearity in data. The better generalization and predictive performance of the RBFN and MLPN models suggested that these can be used to predict the adsorption of CP in aqueous solution using CSC.
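
    A small sketch of an MLP-type regression model with the four operational variables named above (pH, adsorbate concentration, temperature, contact time) as inputs; the training data are synthetic placeholders, not the 640 batch experiments:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(6)
X = rng.uniform([2, 50, 20, 10], [10, 300, 50, 240], size=(640, 4))   # pH, conc., temp., time
removal = 90 / (1 + np.exp(-(0.02 * X[:, 3] - 0.3 * (X[:, 0] - 6)))) + rng.normal(0, 2, 640)

X_tr, X_te, y_tr, y_te = train_test_split(X, removal, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=5000, random_state=0).fit(X_tr, y_tr)
print(f"test R^2 = {r2_score(y_te, model.predict(X_te)):.2f}")
```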

  2. Evaluating the sources of water to wells: Three techniques for metamodeling of a groundwater flow model

    USGS Publications Warehouse

    Fienen, Michael N.; Nolan, Bernard T.; Feinstein, Daniel T.

    2016-01-01

    For decision support, the insights and predictive power of numerical process models can be hampered by insufficient expertise and computational resources required to evaluate system response to new stresses. An alternative is to emulate the process model with a statistical "metamodel." Built on a dataset of collocated numerical model input and output, a groundwater flow model was emulated using a Bayesian network, an artificial neural network, and a gradient boosted regression tree. The response of interest was surface water depletion expressed as the source of water to wells. The results have application for managing allocation of groundwater. Each technique was tuned using cross-validation and further evaluated using a held-out dataset. A numerical MODFLOW-USG model of the Lake Michigan Basin, USA, was used for the evaluation. The performance and interpretability of the techniques were compared, pointing to the advantages of each. The metamodel can also be extended to unmodeled areas.
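
    A brief sketch of the metamodeling idea above: train a gradient boosted regression tree on collocated process-model inputs and outputs, tune it by cross-validation, and check it on a held-out set. The data are synthetic stand-ins for the MODFLOW-USG input/output pairs:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(7)
X = rng.uniform(size=(2000, 6))     # placeholder predictors (e.g., well depth, distance to stream)
y = 1 / (1 + np.exp(-(3 * X[:, 1] - 4 * X[:, 0]))) + rng.normal(0, 0.05, 2000)  # response fraction

X_tr, X_ho, y_tr, y_ho = train_test_split(X, y, test_size=0.25, random_state=0)
gbr = GradientBoostingRegressor(random_state=0)
print(cross_val_score(gbr, X_tr, y_tr, cv=5).mean())   # cross-validated skill used for tuning
print(gbr.fit(X_tr, y_tr).score(X_ho, y_ho))           # held-out evaluation
```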

  3. Correcting Too Much or Too Little? The Performance of Three Chi-Square Corrections.

    PubMed

    Foldnes, Njål; Olsson, Ulf Henning

    2015-01-01

    This simulation study investigates the performance of three test statistics, T1, T2, and T3, used to evaluate structural equation model fit under non-normal data conditions. T1 is the well-known mean-adjusted statistic of Satorra and Bentler. T2 is the mean-and-variance-adjusted statistic of Satterthwaite type, in which the degrees of freedom are manipulated. T3 is a recently proposed version of T2 that does not manipulate degrees of freedom. Discrepancies between these statistics and their nominal chi-square distribution in terms of errors of Type I and Type II are investigated. All statistics are shown to be sensitive to increasing kurtosis in the data, with Type I error rates often far off the nominal level. Under excess kurtosis, true models are generally over-rejected by T1 and under-rejected by T2 and T3, which have similar performance in all conditions. Under misspecification, there is a loss of power with increasing kurtosis, especially for T2 and T3. The coefficient of variation of the nonzero eigenvalues of a certain matrix is shown to be a reliable indicator for the adequacy of these statistics.

  4. Recent evaluations of crack-opening-area in circumferentially cracked pipes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rahman, S.; Brust, F.; Ghadiali, N.

    1997-04-01

    Leak-before-break (LBB) analyses for circumferentially cracked pipes are currently being conducted in the nuclear industry to justify elimination of pipe whip restraints and jet shields which are present because of the expected dynamic effects from pipe rupture. The application of the LBB methodology frequently requires calculation of leak rates. The leak rates depend on the crack-opening area of the through-wall crack in the pipe. In addition to LBB analyses which assume a hypothetical flaw size, there is also interest in the integrity of actual leaking cracks corresponding to current leakage detection requirements in NRC Regulatory Guide 1.45, or for assessing temporary repair of Class 2 and 3 pipes that have leaks as are being evaluated in ASME Section XI. The objectives of this study were to review, evaluate, and refine current predictive models for performing crack-opening-area analyses of circumferentially cracked pipes. The results from twenty-five full-scale pipe fracture experiments, conducted in the Degraded Piping Program, the International Piping Integrity Research Group Program, and the Short Cracks in Piping and Piping Welds Program, were used to verify the analytical models. Standard statistical analyses were performed to assess quantitatively the accuracy of the predictive models. The evaluation also involved finite element analyses for determining the crack-opening profile often needed to perform leak-rate calculations.

  5. Non-linear scaling of a musculoskeletal model of the lower limb using statistical shape models.

    PubMed

    Nolte, Daniel; Tsang, Chui Kit; Zhang, Kai Yu; Ding, Ziyun; Kedgley, Angela E; Bull, Anthony M J

    2016-10-03

    Accurate muscle geometry for musculoskeletal models is important to enable accurate subject-specific simulations. Commonly, linear scaling is used to obtain individualised muscle geometry. More advanced methods include non-linear scaling using segmented bone surfaces and manual or semi-automatic digitisation of muscle paths from medical images. In this study, a new scaling method combining non-linear scaling with reconstructions of bone surfaces using statistical shape modelling is presented. Statistical Shape Models (SSMs) of femur and tibia/fibula were used to reconstruct bone surfaces of nine subjects. Reference models were created by morphing manually digitised muscle paths to mean shapes of the SSMs using non-linear transformations, and inter-subject variability was calculated. Subject-specific models of muscle attachment and via points were created from three reference models. The accuracy was evaluated by calculating the differences between the scaled and manually digitised models. The points defining the muscle paths showed large inter-subject variability at the thigh and shank - up to 26 mm; this was found to limit the accuracy of all studied scaling methods. Errors for the subject-specific muscle point reconstructions of the thigh could be decreased by 9% to 20% by using the non-linear scaling compared to a typical linear scaling method. We conclude that the proposed non-linear scaling method is more accurate than linear scaling methods. Thus, when combined with the ability to reconstruct bone surfaces from incomplete or scattered geometry data using statistical shape models, our proposed method is an alternative to linear scaling methods. Copyright © 2016 The Author. Published by Elsevier Ltd. All rights reserved.

  6. Statistical Approaches to Interpretation of Local, Regional, and National Highway-Runoff and Urban-Stormwater Data

    USGS Publications Warehouse

    Tasker, Gary D.; Granato, Gregory E.

    2000-01-01

    Decision makers need viable methods for the interpretation of local, regional, and national-highway runoff and urban-stormwater data including flows, concentrations and loads of chemical constituents and sediment, potential effects on receiving waters, and the potential effectiveness of various best management practices (BMPs). Valid (useful for intended purposes), current, and technically defensible stormwater-runoff models are needed to interpret data collected in field studies, to support existing highway and urban-runoffplanning processes, to meet National Pollutant Discharge Elimination System (NPDES) requirements, and to provide methods for computation of Total Maximum Daily Loads (TMDLs) systematically and economically. Historically, conceptual, simulation, empirical, and statistical models of varying levels of detail, complexity, and uncertainty have been used to meet various data-quality objectives in the decision-making processes necessary for the planning, design, construction, and maintenance of highways and for other land-use applications. Water-quality simulation models attempt a detailed representation of the physical processes and mechanisms at a given site. Empirical and statistical regional water-quality assessment models provide a more general picture of water quality or changes in water quality over a region. All these modeling techniques share one common aspect-their predictive ability is poor without suitable site-specific data for calibration. To properly apply the correct model, one must understand the classification of variables, the unique characteristics of water-resources data, and the concept of population structure and analysis. Classifying variables being used to analyze data may determine which statistical methods are appropriate for data analysis. An understanding of the characteristics of water-resources data is necessary to evaluate the applicability of different statistical methods, to interpret the results of these techniques, and to use tools and techniques that account for the unique nature of water-resources data sets. Populations of data on stormwater-runoff quantity and quality are often best modeled as logarithmic transformations. Therefore, these factors need to be considered to form valid, current, and technically defensible stormwater-runoff models. Regression analysis is an accepted method for interpretation of water-resources data and for prediction of current or future conditions at sites that fit the input data model. Regression analysis is designed to provide an estimate of the average response of a system as it relates to variation in one or more known variables. To produce valid models, however, regression analysis should include visual analysis of scatterplots, an examination of the regression equation, evaluation of the method design assumptions, and regression diagnostics. A number of statistical techniques are described in the text and in the appendixes to provide information necessary to interpret data by use of appropriate methods. Uncertainty is an important part of any decisionmaking process. In order to deal with uncertainty problems, the analyst needs to know the severity of the statistical uncertainty of the methods used to predict water quality. Statistical models need to be based on information that is meaningful, representative, complete, precise, accurate, and comparable to be deemed valid, up to date, and technically supportable. 
To assess uncertainty in the analytical tools, the modeling methods, and the underlying data set, all of these components need to be documented and communicated in an accessible format within project publications.

  7. Modeling subjective evaluation of soundscape quality in urban open spaces: An artificial neural network approach.

    PubMed

    Yu, Lei; Kang, Jian

    2009-09-01

    This research aims to explore the feasibility of using computer-based models to predict the soundscape quality evaluation of potential users in urban open spaces at the design stage. With the data from large scale field surveys in 19 urban open spaces across Europe and China, the importance of various physical, behavioral, social, demographical, and psychological factors for the soundscape evaluation has been statistically analyzed. Artificial neural network (ANN) models have then been explored at three levels. It has been shown that for both subjective sound level and acoustic comfort evaluation, a general model for all the case study sites is less feasible due to the complex physical and social environments in urban open spaces; models based on individual case study sites perform well but the application range is limited; and specific models for certain types of location/function would be reliable and practical. The performance of acoustic comfort models is considerably better than that of sound level models. Based on the ANN models, soundscape quality maps can be produced and this has been demonstrated with an example.

  8. On Using Surrogates with Genetic Programming.

    PubMed

    Hildebrandt, Torsten; Branke, Jürgen

    2015-01-01

    One way to accelerate evolutionary algorithms with expensive fitness evaluations is to combine them with surrogate models. Surrogate models are efficiently computable approximations of the fitness function, derived by means of statistical or machine learning techniques from samples of fully evaluated solutions. But these models usually require a numerical representation, and therefore cannot be used with the tree representation of genetic programming (GP). In this paper, we present a new way to use surrogate models with GP. Rather than using the genotype directly as input to the surrogate model, we propose using a phenotypic characterization. This phenotypic characterization can be computed efficiently and allows us to define approximate measures of equivalence and similarity. Using a stochastic, dynamic job shop scenario as an example of simulation-based GP with an expensive fitness evaluation, we show how these ideas can be used to construct surrogate models and improve the convergence speed and solution quality of GP.

  9. Bridging the Gap between Theory and Model: A Reflection on the Balance Scale Task.

    ERIC Educational Resources Information Center

    Turner, Geoffrey F. W.; Thomas, Hoben

    2002-01-01

    Focuses on individual strengths of articles by Jensen and van der Maas, and Halford et al., and the power of their combined perspectives. Suggests a performance model that can both evaluate specific theoretical claims and reveal important data features that had been previously obscured using conventional statistical analyses. Maintains that the…

  10. Limited-information goodness-of-fit testing of diagnostic classification item response models.

    PubMed

    Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen

    2016-11-01

    Despite the growing popularity of diagnostic classification models (e.g., Rupp et al., 2010, Diagnostic measurement: theory, methods, and applications, Guilford Press, New York, NY) in educational and psychological measurement, methods for testing their absolute goodness of fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full-information test statistics such as Pearson's X² and the likelihood ratio statistic G² suffer from sparseness in the underlying contingency table from which they are computed. Recently, limited-information fit statistics such as Maydeu-Olivares and Joe's (2006, Psychometrika, 71, 713) M₂ have been found to be quite useful in testing the overall goodness of fit of item response theory models. In this study, we applied Maydeu-Olivares and Joe's (2006, Psychometrika, 71, 713) M₂ statistic to diagnostic classification models. Through a series of simulation studies, we found that M₂ is well calibrated across a wide range of diagnostic model structures and was sensitive to certain misspecifications of the item model (e.g., fitting disjunctive models to data generated according to a conjunctive model), errors in the Q-matrix (adding or omitting paths, omitting a latent variable), and violations of local item independence due to unmodelled testlet effects. On the other hand, M₂ was largely insensitive to misspecifications in the distribution of higher-order latent dimensions and to the specification of an extraneous attribute. To complement the analyses of the overall model goodness of fit using M₂, we investigated the utility of the Chen and Thissen (1997, J. Educ. Behav. Stat., 22, 265) local dependence statistic X²_LD for characterizing sources of misfit, an important aspect of model appraisal often overlooked in favour of overall statements. The X²_LD statistic was found to be slightly conservative (with Type I error rates consistently below the nominal level) but still useful in pinpointing the sources of misfit. Patterns of local dependence arising due to specific model misspecifications are illustrated. Finally, we used the M₂ and X²_LD statistics to evaluate a diagnostic model fit to data from the Trends in Mathematics and Science Study, drawing upon analyses previously conducted by Lee et al. (2011, IJT, 11, 144). © 2016 The British Psychological Society.

  11. Relative mass distributions of neutron-rich thermally fissile nuclei within a statistical model

    NASA Astrophysics Data System (ADS)

    Kumar, Bharat; Kannan, M. T. Senthil; Balasubramaniam, M.; Agrawal, B. K.; Patra, S. K.

    2017-09-01

    We study the binary mass distribution for the recently predicted thermally fissile neutron-rich uranium and thorium nuclei using a statistical model. The level density parameters needed for the study are evaluated from the excitation energies of the temperature-dependent relativistic mean field formalism. The excitation energy and the level density parameter for a given temperature are employed in the convolution integral method to obtain the probability of the particular fragmentation. As representative cases, we present the results for the binary yields of ²⁵⁰U and ²⁵⁴Th. The relative yields are presented for three different temperatures: T = 1, 2, and 3 MeV.

  12. Pharmaceutical solid-state kinetic stability investigation by using moisture-modified Arrhenius equation and JMP statistical software.

    PubMed

    Fu, Mingkun; Perlman, Michael; Lu, Qing; Varga, Csanad

    2015-03-25

    An accelerated stress approach utilizing the moisture-modified Arrhenius equation and JMP statistical software was used to quantitatively assess the solid-state stability of an investigational oncology drug, MLNA, under the influence of temperature (1/T) and humidity (%RH). Physical stability of MLNA under stress conditions was evaluated by using XRPD, DSC, TGA, and DVS, while chemical stability was evaluated by using HPLC. The major chemical degradation product was identified as a hydrolysis product of MLNA drug substance, and was subsequently subjected to an investigation of kinetics based on the isoconversion concept. A mathematical model (ln k = −11,991 × (1/T) + 0.0298 × (%RH) + 29.8823) based on the initial linear kinetics observed for the formation of this degradant at all seven stress conditions was built by using the moisture-modified Arrhenius equation and JMP statistical software. Comparison of the predicted versus experimental ln k values gave a mean deviation value of 5.8%, an R² value of 0.94, a p-value of 0.0038, and a coefficient of variation of the root mean square error CV(RMSE) of 7.9%. These statistics all indicated a good fit to the model for the stress data of MLNA. Both temperature and humidity were shown to have a statistically significant impact on stability by using effect leverage plots (p-value < 0.05 for both 1/T and %RH). Inclusion of a term representing the interaction of relative humidity and temperature (%RH × 1/T) was shown not to be justified by using Analysis of Covariance (ANCOVA), which supported the use of the moisture-corrected Arrhenius equation modeling theory. The model was found to be of value to aid the setting of specifications and retest period, and storage condition selection. A model was also generated using only four conditions, as an example from a resource-saving perspective, which was found to provide a good fit to the entire set of data. Copyright © 2015 Elsevier B.V. All rights reserved.
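
    A short sketch that evaluates the fitted moisture-modified Arrhenius model quoted above (ln k = −11,991 × (1/T) + 0.0298 × (%RH) + 29.8823) at a few storage conditions; the conditions chosen here are illustrative, not from the study:

```python
import numpy as np

def ln_k(temp_c, rh):
    """Moisture-modified Arrhenius model with the coefficients reported above."""
    T = temp_c + 273.15                       # absolute temperature, K
    return -11991.0 / T + 0.0298 * rh + 29.8823

for temp_c, rh in [(25, 60), (30, 65), (40, 75)]:   # hypothetical storage conditions
    print(f"{temp_c} °C / {rh}% RH: k = {np.exp(ln_k(temp_c, rh)):.2e}")
```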

  13. The functional basis of face evaluation

    PubMed Central

    Oosterhof, Nikolaas N.; Todorov, Alexander

    2008-01-01

    People automatically evaluate faces on multiple trait dimensions, and these evaluations predict important social outcomes, ranging from electoral success to sentencing decisions. Based on behavioral studies and computer modeling, we develop a 2D model of face evaluation. First, using a principal components analysis of trait judgments of emotionally neutral faces, we identify two orthogonal dimensions, valence and dominance, that are sufficient to describe face evaluation and show that these dimensions can be approximated by judgments of trustworthiness and dominance. Second, using a data-driven statistical model for face representation, we build and validate models for representing face trustworthiness and face dominance. Third, using these models, we show that, whereas valence evaluation is more sensitive to features resembling expressions signaling whether the person should be avoided or approached, dominance evaluation is more sensitive to features signaling physical strength/weakness. Fourth, we show that important social judgments, such as threat, can be reproduced as a function of the two orthogonal dimensions of valence and dominance. The findings suggest that face evaluation involves an overgeneralization of adaptive mechanisms for inferring harmful intentions and the ability to cause harm and can account for rapid, yet not necessarily accurate, judgments from faces. PMID:18685089

  14. MyPMFs: a simple tool for creating statistical potentials to assess protein structural models.

    PubMed

    Postic, Guillaume; Hamelryck, Thomas; Chomilier, Jacques; Stratmann, Dirk

    2018-05-29

    Evaluating the model quality of protein structures that evolve in environments with particular physicochemical properties requires scoring functions that are adapted to their specific residue compositions and/or structural characteristics. Thus, computational methods developed for structures from the cytosol cannot work properly on membrane or secreted proteins. Here, we present MyPMFs, an easy-to-use tool that allows users to train statistical potentials of mean force (PMFs) on the protein structures of their choice, with all parameters being adjustable. We demonstrate its use by creating an accurate statistical potential for transmembrane protein domains. We also show its usefulness to study the influence of the physical environment on residue interactions within protein structures. Our open-source software is freely available for download at https://github.com/bibip-impmc/mypmfs. Copyright © 2018. Published by Elsevier B.V.

  15. Revised Perturbation Statistics for the Global Scale Atmospheric Model

    NASA Technical Reports Server (NTRS)

    Justus, C. G.; Woodrum, A.

    1975-01-01

    Magnitudes and scales of atmospheric perturbations about the monthly mean for the thermodynamic variables and wind components are presented by month at various latitudes. These perturbation statistics are a revision of the random perturbation data required for the global scale atmospheric model program and are from meteorological rocket network statistical summaries in the 22 to 65 km height range and NASA grenade and pitot tube data summaries in the region up to 90 km. The observed perturbations in the thermodynamic variables were adjusted to make them consistent with constraints required by the perfect gas law and the hydrostatic equation. Vertical scales were evaluated by Buell's depth of pressure system equation and from vertical structure function analysis. Tables of magnitudes and vertical scales are presented for each month at latitude 10, 30, 50, 70, and 90 degrees.

  16. Validation of periodontitis screening model using sociodemographic, systemic, and molecular information in a Korean population.

    PubMed

    Kim, Hyun-Duck; Sukhbaatar, Munkhzaya; Shin, Myungseop; Ahn, Yoo-Been; Yoo, Wook-Sung

    2014-12-01

    This study aims to evaluate and validate a periodontitis screening model that includes sociodemographic, metabolic syndrome (MetS), and molecular information, including gingival crevicular fluid (GCF), matrix metalloproteinase (MMP), and blood cytokines. The authors selected 506 participants from the Shiwha-Banwol cohort: 322 participants from the 2005 cohort for deriving the screening model and 184 participants from the 2007 cohort for its validation. Periodontitis was assessed by dentists using the community periodontal index. Interleukin (IL)-6, IL-8, and tumor necrosis factor-α in blood and MMP-8, -9, and -13 in GCF were assayed using enzyme-linked immunosorbent assay. MetS was assessed by physicians using physical examination and blood laboratory data. Information about age, sex, income, smoking, and drinking was obtained by interview. Logistic regression analysis was applied to finalize the best-fitting model and validate the model using sensitivity, specificity, and c-statistics. The derived model for periodontitis screening had a sensitivity of 0.73, specificity of 0.85, and c-statistic of 0.86 (P <0.001); those of the validated model were 0.64, 0.91, and 0.83 (P <0.001), respectively. The model that included age, sex, income, smoking, drinking, and blood and GCF biomarkers could be useful in screening for periodontitis. A future prospective study is indicated for evaluating this model's ability to predict the occurrence of periodontitis.
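
    A minimal sketch of deriving a logistic screening model and summarizing it by sensitivity, specificity and the c-statistic (ROC AUC); the predictors and outcome below are simulated, not the Shiwha-Banwol cohort:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, confusion_matrix

rng = np.random.default_rng(8)
X = rng.normal(size=(322, 6))    # placeholder predictors (e.g., age, sex, smoking, MMP-8, IL-6, ...)
y = (X @ np.array([0.8, 0.3, 0.5, 0.9, 0.6, 0.2]) + rng.normal(0, 1, 322) > 0).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
prob = model.predict_proba(X)[:, 1]
tn, fp, fn, tp = confusion_matrix(y, (prob >= 0.5).astype(int)).ravel()
print(f"sensitivity={tp/(tp+fn):.2f}, specificity={tn/(tn+fp):.2f}, "
      f"c-statistic={roc_auc_score(y, prob):.2f}")
```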

  17. Prognostic value of coronary computed tomographic angiography findings in asymptomatic individuals: a 6-year follow-up from the prospective multicentre international CONFIRM study.

    PubMed

    Cho, Iksung; Al'Aref, Subhi J; Berger, Adam; Ó Hartaigh, Bríain; Gransar, Heidi; Valenti, Valentina; Lin, Fay Y; Achenbach, Stephan; Berman, Daniel S; Budoff, Matthew J; Callister, Tracy Q; Al-Mallah, Mouaz H; Cademartiri, Filippo; Chinnaiyan, Kavitha; Chow, Benjamin J W; DeLago, Augustin; Villines, Todd C; Hadamitzky, Martin; Hausleiter, Joerg; Leipsic, Jonathon; Shaw, Leslee J; Kaufmann, Philipp A; Feuchtner, Gudrun; Kim, Yong-Jin; Maffei, Erica; Raff, Gilbert; Pontone, Gianluca; Andreini, Daniele; Marques, Hugo; Rubinshtein, Ronen; Chang, Hyuk-Jae; Min, James K

    2018-03-14

    The long-term prognostic benefit of coronary computed tomographic angiography (CCTA) findings of coronary artery disease (CAD) in asymptomatic populations is unknown. From the prospective multicentre international CONFIRM long-term study, we evaluated asymptomatic subjects without known CAD who underwent both coronary artery calcium scoring (CACS) and CCTA (n = 1226). Coronary computed tomographic angiography findings included the severity of coronary artery stenosis, plaque composition, and coronary segment location. Using the C-statistic and likelihood ratio tests, we evaluated the incremental prognostic utility of CCTA findings over a base model that included a panel of traditional risk factors (RFs) as well as CACS to predict long-term all-cause mortality. During a mean follow-up of 5.9 ± 1.2 years, 78 deaths occurred. Compared with the traditional RF alone (C-statistic 0.64), CCTA findings including coronary stenosis severity, plaque composition, and coronary segment location demonstrated improved incremental prognostic utility beyond traditional RF alone (C-statistics range 0.71-0.73, all P < 0.05; incremental χ2 range 20.7-25.5, all P < 0.001). However, no added prognostic benefit was offered by CCTA findings when added to a base model containing both traditional RF and CACS (C-statistics P > 0.05, for all). Coronary computed tomographic angiography improved prognostication of 6-year all-cause mortality beyond a set of conventional RF alone, although, no further incremental value was offered by CCTA when CCTA findings were added to a model incorporating RF and CACS.

  18. OPR-PPR, a Computer Program for Assessing Data Importance to Model Predictions Using Linear Statistics

    USGS Publications Warehouse

    Tonkin, Matthew J.; Tiedeman, Claire; Ely, D. Matthew; Hill, Mary C.

    2007-01-01

    The OPR-PPR program calculates the Observation-Prediction (OPR) and Parameter-Prediction (PPR) statistics that can be used to evaluate the relative importance of various kinds of data to simulated predictions. The data considered fall into three categories: (1) existing observations, (2) potential observations, and (3) potential information about parameters. The first two are addressed by the OPR statistic; the third is addressed by the PPR statistic. The statistics are based on linear theory and measure the leverage of the data, which depends on the location, the type, and possibly the time of the data being considered. For example, in a ground-water system the type of data might be a head measurement at a particular location and time. As a measure of leverage, the statistics do not take into account the value of the measurement. As linear measures, the OPR and PPR statistics require minimal computational effort once sensitivities have been calculated. Sensitivities need to be calculated for only one set of parameter values; commonly these are the values estimated through model calibration. OPR-PPR can calculate the OPR and PPR statistics for any mathematical model that produces the necessary OPR-PPR input files. In this report, OPR-PPR capabilities are presented in the context of using the ground-water model MODFLOW-2000 and the universal inverse program UCODE_2005. The method used to calculate the OPR and PPR statistics is based on the linear equation for prediction standard deviation. Using sensitivities and other information, OPR-PPR calculates (a) the percent increase in the prediction standard deviation that results when one or more existing observations are omitted from the calibration data set; (b) the percent decrease in the prediction standard deviation that results when one or more potential observations are added to the calibration data set; or (c) the percent decrease in the prediction standard deviation that results when potential information on one or more parameters is added.

  19. Evaluation of Statistical Downscaling Skill at Reproducing Extreme Events

    NASA Astrophysics Data System (ADS)

    McGinnis, S. A.; Tye, M. R.; Nychka, D. W.; Mearns, L. O.

    2015-12-01

    Climate model outputs usually have much coarser spatial resolution than is needed by impacts models. Although higher resolution can be achieved using regional climate models for dynamical downscaling, further downscaling is often required. The final resolution gap is often closed with a combination of spatial interpolation and bias correction, which constitutes a form of statistical downscaling. We use this technique to downscale regional climate model data and evaluate its skill in reproducing extreme events. We downscale output from the North American Regional Climate Change Assessment Program (NARCCAP) dataset from its native 50-km spatial resolution to the 4-km resolution of the University of Idaho's METDATA gridded surface meteorological dataset, which derives from the PRISM and NLDAS-2 observational datasets. We operate on the major variables used in impacts analysis at a daily timescale: daily minimum and maximum temperature, precipitation, humidity, pressure, solar radiation, and winds. To interpolate the data, we use the patch recovery method from the Earth System Modeling Framework (ESMF) regridding package. We then bias correct the data using Kernel Density Distribution Mapping (KDDM), which has been shown to exhibit superior overall performance across multiple metrics. Finally, we evaluate the skill of this technique in reproducing extreme events by comparing raw and downscaled output with meteorological station data in different bioclimatic regions according to the skill scores defined by Perkins et al. in 2013 for evaluation of AR4 climate models. We also investigate techniques for improving bias correction of values in the tails of the distributions. These techniques include binned kernel density estimation, logspline kernel density estimation, and transfer functions constructed by fitting the tails with a generalized Pareto distribution.
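
    A simplified sketch of distribution-mapping bias correction in the spirit of the KDDM step above, using empirical quantile mapping instead of kernel density estimates; all series are synthetic placeholders:

```python
import numpy as np

rng = np.random.default_rng(9)
obs = rng.gamma(shape=2.0, scale=3.0, size=5000)          # "observed" daily series
model_hist = rng.gamma(shape=2.0, scale=4.0, size=5000)   # biased model output (reference period)
model_out = rng.gamma(shape=2.2, scale=4.0, size=5000)    # model series to be corrected

def quantile_map(x, model_ref, obs_ref, probs=np.linspace(0.01, 0.99, 99)):
    """Map model values onto the observed distribution via an empirical transfer function."""
    return np.interp(x, np.quantile(model_ref, probs), np.quantile(obs_ref, probs))

corrected = quantile_map(model_out, model_hist, obs)
print(obs.mean(), model_out.mean(), corrected.mean())     # corrected mean moves toward observations
```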

  20. Evaluating the efficiency of environmental monitoring programs

    USGS Publications Warehouse

    Levine, Carrie R.; Yanai, Ruth D.; Lampman, Gregory G.; Burns, Douglas A.; Driscoll, Charles T.; Lawrence, Gregory B.; Lynch, Jason; Schoch, Nina

    2014-01-01

    Statistical uncertainty analyses can be used to improve the efficiency of environmental monitoring, allowing sampling designs to maximize information gained relative to resources required for data collection and analysis. In this paper, we illustrate four methods of data analysis appropriate to four types of environmental monitoring designs. To analyze a long-term record from a single site, we applied a general linear model to weekly stream chemistry data at Biscuit Brook, NY, to simulate the effects of reducing sampling effort and to evaluate statistical confidence in the detection of change over time. To illustrate a detectable difference analysis, we analyzed a one-time survey of mercury concentrations in loon tissues in lakes in the Adirondack Park, NY, demonstrating the effects of sampling intensity on statistical power and the selection of a resampling interval. To illustrate a bootstrapping method, we analyzed the plot-level sampling intensity of forest inventory at the Hubbard Brook Experimental Forest, NH, to quantify the sampling regime needed to achieve a desired confidence interval. Finally, to analyze time-series data from multiple sites, we assessed the number of lakes and the number of samples per year needed to monitor change over time in Adirondack lake chemistry using a repeated-measures mixed-effects model. Evaluations of time series and synoptic long-term monitoring data can help determine whether sampling should be re-allocated in space or time to optimize the use of financial and human resources.
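
    A short sketch of the bootstrap analysis described above: resample inventory plots to see how the confidence interval for the mean narrows as the number of plots increases. The plot values are simulated placeholders, not Hubbard Brook data:

```python
import numpy as np

rng = np.random.default_rng(10)
plots = rng.lognormal(mean=5.0, sigma=0.4, size=400)      # plot-level biomass (placeholder units)

for n in (25, 100, 400):
    boot_means = [rng.choice(plots, size=n, replace=True).mean() for _ in range(2000)]
    lo, hi = np.percentile(boot_means, [2.5, 97.5])
    print(f"n={n:3d}: 95% bootstrap CI width = {hi - lo:.1f}")
```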

  1. Comparative Research Productivity Measures for Economic Departments.

    ERIC Educational Resources Information Center

    Huettner, David A.; Clark, William

    1997-01-01

    Develops a simple theoretical model to evaluate interdisciplinary differences in research productivity between economics departments and related subjects. Compares the research publishing statistics of economics, finance, psychology, geology, physics, oceanography, chemistry, and geophysics. Considers a number of factors including journal…

  2. Statistical Evaluation of Time Series Analysis Techniques

    NASA Technical Reports Server (NTRS)

    Benignus, V. A.

    1973-01-01

    The performance of a modified version of NASA's multivariate spectrum analysis program is discussed. A multiple regression model was used to make the revisions. Performance improvements were documented and compared to the standard fast Fourier transform by Monte Carlo techniques.

  3. Forecasting volatility with neural regression: a contribution to model adequacy.

    PubMed

    Refenes, A N; Holt, W T

    2001-01-01

    Neural nets' usefulness for forecasting is limited by problems of overfitting and the lack of rigorous procedures for model identification, selection and adequacy testing. This paper describes a methodology for neural model misspecification testing. We introduce a generalization of the Durbin-Watson statistic for neural regression and discuss the general issues of misspecification testing using residual analysis. We derive a generalized influence matrix for neural estimators which enables us to evaluate the distribution of the statistic. We deploy Monte Carlo simulation to compare the power of the test for neural and linear regressors. While residual testing is not a sufficient condition for model adequacy, it is nevertheless a necessary condition to demonstrate that the model is a good approximation to the data generating process, particularly as neural-network estimation procedures are susceptible to partial convergence. The work is also an important step toward developing rigorous procedures for neural model identification, selection and adequacy testing which have started to appear in the literature. We demonstrate its applicability in the nontrivial problem of forecasting implied volatility innovations using high-frequency stock index options. Each step of the model building process is validated using statistical tests to verify variable significance and model adequacy with the results confirming the presence of nonlinear relationships in implied volatility innovations.
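
    For readers unfamiliar with residual-based misspecification testing, the snippet below computes the classic Durbin-Watson statistic on the residuals of a deliberately misspecified linear fit; it illustrates the underlying idea only and is not the generalized statistic derived in the paper.

```python
# The classic Durbin-Watson statistic on model residuals, shown here only to
# illustrate the residual-autocorrelation idea that the paper generalizes to
# neural regression; values far from 2 signal leftover structure.
import numpy as np

def durbin_watson(residuals):
    diff = np.diff(residuals)
    return np.sum(diff ** 2) / np.sum(residuals ** 2)

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 200)
y = np.sin(x) + rng.normal(0, 0.1, x.size)

# A deliberately misspecified linear fit leaves structure in the residuals
coeffs = np.polyfit(x, y, 1)
resid = y - np.polyval(coeffs, x)
print(round(durbin_watson(resid), 2))
```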

  4. Regression Models for Identifying Noise Sources in Magnetic Resonance Images

    PubMed Central

    Zhu, Hongtu; Li, Yimei; Ibrahim, Joseph G.; Shi, Xiaoyan; An, Hongyu; Chen, Yashen; Gao, Wei; Lin, Weili; Rowe, Daniel B.; Peterson, Bradley S.

    2009-01-01

    Stochastic noise, susceptibility artifacts, magnetic field and radiofrequency inhomogeneities, and other noise components in magnetic resonance images (MRIs) can introduce serious bias into any measurements made with those images. We formally introduce three regression models including a Rician regression model and two associated normal models to characterize stochastic noise in various magnetic resonance imaging modalities, including diffusion-weighted imaging (DWI) and functional MRI (fMRI). Estimation algorithms are introduced to maximize the likelihood function of the three regression models. We also develop a diagnostic procedure for systematically exploring MR images to identify noise components other than simple stochastic noise, and to detect discrepancies between the fitted regression models and MRI data. The diagnostic procedure includes goodness-of-fit statistics, measures of influence, and tools for graphical display. The goodness-of-fit statistics can assess the key assumptions of the three regression models, whereas measures of influence can isolate outliers caused by certain noise components, including motion artifacts. The tools for graphical display permit graphical visualization of the values for the goodness-of-fit statistic and influence measures. Finally, we conduct simulation studies to evaluate performance of these methods, and we analyze a real dataset to illustrate how our diagnostic procedure localizes subtle image artifacts by detecting intravoxel variability that is not captured by the regression models. PMID:19890478

  5. Benefit-cost evaluation of an intra-regional air service in the Bay area

    NASA Technical Reports Server (NTRS)

    Haefner, L. E.

    1977-01-01

    An iterative statistical model is presented and used to evaluate combinations of commuter airport sites and surface transportation facilities in conjunction with service by a given commuter aircraft type, in light of Bay Area regional growth alternatives and peak and off-peak regional travel patterns. The model evaluates such transportation options with respect to criteria of airline profitability, public acceptance, and public and private nonuser costs. It incorporates information on modal split, peak and off-peak use of the air commuter fleet, terminal and airport costs, development costs and uses of land in proximity to the airport sites, regional population shifts, and induced zonal shifts in travel demand. The model is multimodal in its analytical capability and performs exhaustive sensitivity analysis.

  6. A reliability study on brain activation during active and passive arm movements supported by an MRI-compatible robot.

    PubMed

    Estévez, Natalia; Yu, Ningbo; Brügger, Mike; Villiger, Michael; Hepp-Reymond, Marie-Claude; Riener, Robert; Kollias, Spyros

    2014-11-01

    In neurorehabilitation, longitudinal assessment of arm movement related brain function in patients with motor disability is challenging due to variability in task performance. MRI-compatible robots monitor and control task performance, yielding more reliable evaluation of brain function over time. The main goals of the present study were first to define the brain network activated while performing active and passive elbow movements with an MRI-compatible arm robot (MaRIA) in healthy subjects, and second to test the reproducibility of this activation over time. For the fMRI analysis two models were compared. In model 1 movement onset and duration were included, whereas in model 2 force and range of motion were added to the analysis. Reliability of brain activation was tested with several statistical approaches applied on individual and group activation maps and on summary statistics. The activated network included mainly the primary motor cortex, primary and secondary somatosensory cortex, superior and inferior parietal cortex, medial and lateral premotor regions, and subcortical structures. Reliability analyses revealed robust activation for active movements with both fMRI models and all the statistical methods used. Imposed passive movements also elicited mainly robust brain activation for individual and group activation maps, and reliability was improved by including additional force and range of motion using model 2. These findings demonstrate that the use of robotic devices, such as MaRIA, can be useful to reliably assess arm movement related brain activation in longitudinal studies and may contribute in studies evaluating therapies and brain plasticity following injury in the nervous system.

  7. Application of multivariable statistical techniques in plant-wide WWTP control strategies analysis.

    PubMed

    Flores, X; Comas, J; Roda, I R; Jiménez, L; Gernaey, K V

    2007-01-01

    The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques allow one to i) determine natural groups or clusters of control strategies with similar behaviour, ii) find and interpret hidden, complex and causal relations in the data set and iii) identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation of complex multicriteria data sets and allows an improved use of information for effective evaluation of control strategies.
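
    A minimal sketch of the same workflow (standardize an evaluation matrix, project it with PCA, and cluster the strategies) is shown below using scikit-learn; the matrix is random stand-in data rather than BSM2 simulation output.

```python
# A small illustration of clustering and PCA on a control-strategy evaluation
# matrix; rows are strategies, columns are evaluation criteria (all synthetic).
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
X = rng.normal(size=(30, 8))                          # 30 strategies x 8 criteria

Xs = StandardScaler().fit_transform(X)                # put criteria on a common scale
scores = PCA(n_components=2).fit_transform(Xs)        # low-dimensional view of strategies
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Xs)
print(scores[:3].round(2), labels[:10])
```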

  8. Space market model development project, phase 2

    NASA Technical Reports Server (NTRS)

    Bishop, Peter C.

    1988-01-01

    The results of the prototype operations of the Space Business Information Center are presented. A clearinghouse for space business information, serving members of the U.S. space industry from the public, private, and academic sectors, was operated. Behavioral and evaluation statistics were recorded from the clearinghouse, and the conclusions from these statistics are presented. Business guidebooks on major markets in space business are discussed. Proprietary research and briefings for firms and agencies in the space industry are also discussed.

  9. Computer-Based Model Calibration and Uncertainty Analysis: Terms and Concepts

    DTIC Science & Technology

    2015-07-01

    uncertainty analyses throughout the lifecycle of planning, designing, and operating of Civil Works flood risk management projects as described in...value 95% of the time. In the frequentist approach to PE, model parameters are regarded as having true values, and their estimate is based on the...in catchment models. 1. Evaluating parameter uncertainty. Water Resources Research 19(5):1151–1172. Lee, P. M. 2012. Bayesian statistics: An

  10. Evaluating Site-Specific and Generic Spatial Models of Aboveground Forest Biomass Based on Landsat Time-Series and LiDAR Strip Samples in the Eastern USA

    Treesearch

    Ram Deo; Matthew Russell; Grant Domke; Hans-Erik Andersen; Warren Cohen; Christopher Woodall

    2017-01-01

    Large-area assessment of aboveground tree biomass (AGB) to inform regional or national forest monitoring programs can be efficiently carried out by combining remotely sensed data and field sample measurements through a generic statistical model, in contrast to site-specific models. We integrated forest inventory plot data with spatial predictors from Landsat time-...

  11. Treatment effects model for assessing disease management: measuring outcomes and strengthening program management.

    PubMed

    Wendel, Jeanne; Dumitras, Diana

    2005-06-01

    This paper describes an analytical methodology for obtaining statistically unbiased outcomes estimates for programs in which participation decisions may be correlated with variables that impact outcomes. This methodology is particularly useful for intraorganizational program evaluations conducted for business purposes. In this situation, data is likely to be available for a population of managed care members who are eligible to participate in a disease management (DM) program, with some electing to participate while others eschew the opportunity. The most pragmatic analytical strategy for in-house evaluation of such programs is likely to be the pre-intervention/post-intervention design in which the control group consists of people who were invited to participate in the DM program, but declined the invitation. Regression estimates of program impacts may be statistically biased if factors that impact participation decisions are correlated with outcomes measures. This paper describes an econometric procedure, the Treatment Effects model, developed to produce statistically unbiased estimates of program impacts in this type of situation. Two equations are estimated to (a) estimate the impacts of patient characteristics on decisions to participate in the program, and then (b) use this information to produce a statistically unbiased estimate of the impact of program participation on outcomes. This methodology is well-established in economics and econometrics, but has not been widely applied in the DM outcomes measurement literature; hence, this paper focuses on one illustrative application.
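
    The two-equation idea can be sketched with the textbook control-function recipe: a probit model for participation followed by an outcome regression that includes an inverse-Mills-ratio correction. The synthetic data and variable names below are assumptions for illustration, not the paper's implementation.

```python
# Two-stage sketch of the treatment-effects idea: probit participation model,
# then an outcome regression with an inverse-Mills-ratio correction term.
import numpy as np
from scipy.stats import norm
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 2000
z = rng.normal(size=n)                       # drives the decision to participate
u = rng.normal(size=n)                       # unobserved factor shared with outcomes
participate = (0.8 * z + u > 0).astype(float)
y = 1.0 + 0.5 * participate + 0.7 * u + rng.normal(size=n)  # selection on u biases naive OLS

# Stage 1: probit for participation; fittedvalues is the linear predictor
probit = sm.Probit(participate, sm.add_constant(z)).fit(disp=0)
xb = probit.fittedvalues
imr = np.where(participate == 1, norm.pdf(xb) / norm.cdf(xb),
               -norm.pdf(xb) / (1 - norm.cdf(xb)))

# Stage 2: outcome regression including the correction term
X = sm.add_constant(np.column_stack([participate, imr]))
print(sm.OLS(y, X).fit().params.round(2))    # second entry: participation effect
```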

  12. Evaluation of Aerosol-cloud Interaction in the GISS Model E Using ARM Observations

    NASA Technical Reports Server (NTRS)

    DeBoer, G.; Bauer, S. E.; Toto, T.; Menon, Surabi; Vogelmann, A. M.

    2013-01-01

    Observations from the US Department of Energy's Atmospheric Radiation Measurement (ARM) program are used to evaluate the ability of the NASA GISS ModelE global climate model to reproduce observed interactions between aerosols and clouds. Included in the evaluation are comparisons of basic meteorology and aerosol properties, droplet activation, effective radius parameterizations, and surface-based evaluations of aerosol-cloud interactions (ACI). Differences between the simulated and observed ACI are generally large, but these differences may result partially from the vertical distribution of aerosol in the model, rather than the representation of physical processes governing the interactions between aerosols and clouds. Compared to the observations, ModelE often features elevated droplet concentrations for a given aerosol concentration, indicating that the activation parameterizations used may be too aggressive. Additionally, parameterizations for effective radius commonly used in models were tested using ARM observations, and there was no clear superior parameterization for the cases reviewed here. This lack of consensus is demonstrated to result in potentially large, statistically significant differences to surface radiative budgets, should one parameterization be chosen over another.

  13. A flexible, interpretable framework for assessing sensitivity to unmeasured confounding.

    PubMed

    Dorie, Vincent; Harada, Masataka; Carnegie, Nicole Bohme; Hill, Jennifer

    2016-09-10

    When estimating causal effects, unmeasured confounding and model misspecification are both potential sources of bias. We propose a method to simultaneously address both issues in the form of a semi-parametric sensitivity analysis. In particular, our approach incorporates Bayesian Additive Regression Trees into a two-parameter sensitivity analysis strategy that assesses sensitivity of posterior distributions of treatment effects to choices of sensitivity parameters. This results in an easily interpretable framework for testing for the impact of an unmeasured confounder that also limits the number of modeling assumptions. We evaluate our approach in a large-scale simulation setting and with high blood pressure data taken from the Third National Health and Nutrition Examination Survey. The model is implemented as open-source software, integrated into the treatSens package for the R statistical programming language. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  14. Statistical Method to Overcome Overfitting Issue in Rational Function Models

    NASA Astrophysics Data System (ADS)

    Alizadeh Moghaddam, S. H.; Mokhtarzade, M.; Alizadeh Naeini, A.; Alizadeh Moghaddam, S. A.

    2017-09-01

    Rational function models (RFMs) are known as one of the most appealing models and are extensively applied in geometric correction of satellite images and map production. Overfitting is a common issue in the case of terrain-dependent RFMs that degrades the accuracy of RFM-derived geospatial products. This issue, resulting from the high number of RFM parameters, leads to ill-posedness of the RFMs. To tackle this problem, in this study, a fast and robust statistical approach is proposed and compared to the Tikhonov regularization (TR) method, a frequently used solution to RFM overfitting. In the proposed method, a statistical test, namely a significance test, is applied to search for the RFM parameters that are resistant to the overfitting issue. The performance of the proposed method was evaluated for two real data sets of Cartosat-1 satellite images. The obtained results demonstrate the efficiency of the proposed method in terms of the achievable level of accuracy. This technique shows an improvement of 50-80% over TR.

  15. Systematic Review and Meta-Analysis of Studies Evaluating Diagnostic Test Accuracy: A Practical Review for Clinical Researchers-Part II. Statistical Methods of Meta-Analysis

    PubMed Central

    Lee, Juneyoung; Kim, Kyung Won; Choi, Sang Hyun; Huh, Jimi

    2015-01-01

    Meta-analysis of diagnostic test accuracy studies differs from the usual meta-analysis of therapeutic/interventional studies in that it requires the simultaneous analysis of a pair of outcome measures, such as sensitivity and specificity, instead of a single outcome. Since sensitivity and specificity are generally inversely correlated and can be affected by a threshold effect, more sophisticated statistical methods are required for the meta-analysis of diagnostic test accuracy. Hierarchical models, including the bivariate model and the hierarchical summary receiver operating characteristic model, are increasingly being accepted as standard methods for meta-analysis of diagnostic test accuracy studies. We provide a conceptual review of statistical methods currently used and recommended for meta-analysis of diagnostic test accuracy studies. This article could serve as a methodological reference for those who perform systematic review and meta-analysis of diagnostic test accuracy studies. PMID:26576107

  16. An information-theoretic approach to the modeling and analysis of whole-genome bisulfite sequencing data.

    PubMed

    Jenkinson, Garrett; Abante, Jordi; Feinberg, Andrew P; Goutsias, John

    2018-03-07

    DNA methylation is a stable form of epigenetic memory used by cells to control gene expression. Whole genome bisulfite sequencing (WGBS) has emerged as a gold-standard experimental technique for studying DNA methylation by producing high resolution genome-wide methylation profiles. Statistical modeling and analysis are employed to computationally extract and quantify information from these profiles in an effort to identify regions of the genome that demonstrate crucial or aberrant epigenetic behavior. However, the performance of most currently available methods for methylation analysis is hampered by their inability to directly account for statistical dependencies between neighboring methylation sites, thus ignoring significant information available in WGBS reads. We present a powerful information-theoretic approach for genome-wide modeling and analysis of WGBS data based on the 1D Ising model of statistical physics. This approach takes into account correlations in methylation by utilizing a joint probability model that encapsulates all information available in WGBS methylation reads and produces accurate results even when applied on single WGBS samples with low coverage. Using the Shannon entropy, our approach provides a rigorous quantification of methylation stochasticity in individual WGBS samples genome-wide. Furthermore, it utilizes the Jensen-Shannon distance to evaluate differences in methylation distributions between a test and a reference sample. Differential performance assessment using simulated and real human lung normal/cancer data demonstrates a clear superiority of our approach over DSS, a recently proposed method for WGBS data analysis. Critically, these results demonstrate that marginal methods become statistically invalid when correlations are present in the data. This contribution demonstrates clear benefits and the necessity of modeling joint probability distributions of methylation using the 1D Ising model of statistical physics and of quantifying methylation stochasticity using concepts from information theory. By employing this methodology, substantial improvement of DNA methylation analysis can be achieved by effectively taking into account the massive amount of statistical information available in WGBS data, which is largely ignored by existing methods.
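
    As a small illustration of the distributional comparison step, the snippet below computes the Jensen-Shannon distance between two synthetic methylation-level histograms with SciPy; it is not the authors' genome-wide pipeline.

```python
# Illustrative use of the Jensen-Shannon distance to compare two methylation-
# level distributions; the histograms here are synthetic, not WGBS output.
import numpy as np
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(11)
test = rng.beta(0.5, 0.5, 10000)        # stand-in: bimodal methylation levels
ref = rng.beta(2.0, 2.0, 10000)         # stand-in: intermediate methylation

bins = np.linspace(0, 1, 51)
p, _ = np.histogram(test, bins=bins, density=True)
q, _ = np.histogram(ref, bins=bins, density=True)
print(round(jensenshannon(p, q, base=2), 3))   # 0 = identical, 1 = maximally different
```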

  17. A statistical approach to quasi-extinction forecasting.

    PubMed

    Holmes, Elizabeth Eli; Sabo, John L; Viscido, Steven Vincent; Fagan, William Fredric

    2007-12-01

    Forecasting population decline to a certain critical threshold (the quasi-extinction risk) is one of the central objectives of population viability analysis (PVA), and such predictions figure prominently in the decisions of major conservation organizations. In this paper, we argue that accurate forecasting of a population's quasi-extinction risk does not necessarily require knowledge of the underlying biological mechanisms. Because of the stochastic and multiplicative nature of population growth, the ensemble behaviour of population trajectories converges to common statistical forms across a wide variety of stochastic population processes. This paper provides a theoretical basis for this argument. We show that the quasi-extinction surfaces of a variety of complex stochastic population processes (including age-structured, density-dependent and spatially structured populations) can be modelled by a simple stochastic approximation: the stochastic exponential growth process overlaid with Gaussian errors. Using simulated and real data, we show that this model can be estimated with 20-30 years of data and can provide relatively unbiased quasi-extinction risk estimates with confidence intervals considerably smaller than (0,1). This was found to be true even for simulated data derived from some of the noisiest population processes (density-dependent feedback, species interactions and strong age-structure cycling). A key advantage of statistical models is that their parameters and the uncertainty of those parameters can be estimated from time series data using standard statistical methods. In contrast, for most species of conservation concern, biologically realistic models must often be specified rather than estimated because of the limited data available for all the various parameters. Biologically realistic models will always have a prominent place in PVA for evaluating specific management options which affect a single segment of a population, a single demographic rate, or different geographic areas. However, for forecasting quasi-extinction risk, statistical models that are based on the convergent statistical properties of population processes offer many advantages over biologically realistic models.
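
    A minimal sketch of the diffusion-approximation workflow described here: estimate the drift and variance of the stochastic exponential growth model from log counts, then simulate forward to approximate the probability of crossing a quasi-extinction threshold. The counts, horizon, and threshold below are invented for illustration.

```python
# Estimate mu and sigma from log counts, then Monte Carlo the stochastic
# exponential growth model to approximate quasi-extinction risk (synthetic data).
import numpy as np

rng = np.random.default_rng(5)
years = 25
true_mu, true_sigma = -0.02, 0.15
logN = np.cumsum(np.concatenate([[np.log(500)],
                                 true_mu + true_sigma * rng.normal(size=years)]))

# Parameter estimates from the observed log-count time series
diffs = np.diff(logN)
mu_hat, sigma_hat = diffs.mean(), diffs.std(ddof=1)

# Monte Carlo quasi-extinction risk over a 50-year horizon
threshold, horizon, n_sim = np.log(50), 50, 20000
steps = mu_hat + sigma_hat * rng.normal(size=(n_sim, horizon))
trajectories = logN[-1] + np.cumsum(steps, axis=1)
risk = (trajectories.min(axis=1) < threshold).mean()
print(round(mu_hat, 3), round(sigma_hat, 3), round(risk, 3))
```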

  18. COLLABORATIVE RESEARCH:USING ARM OBSERVATIONS & ADVANCED STATISTICAL TECHNIQUES TO EVALUATE CAM3 CLOUDS FOR DEVELOPMENT OF STOCHASTIC CLOUD-RADIATION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Somerville, Richard

    2013-08-22

    The long-range goal of several past and current projects in our DOE-supported research has been the development of new and improved parameterizations of cloud-radiation effects and related processes, using ARM data, and the implementation and testing of these parameterizations in global models. The main objective of the present project being reported on here has been to develop and apply advanced statistical techniques, including Bayesian posterior estimates, to diagnose and evaluate features of both observed and simulated clouds. The research carried out under this project has been novel in two important ways. The first is that it is a key step in the development of practical stochastic cloud-radiation parameterizations, a new category of parameterizations that offers great promise for overcoming many shortcomings of conventional schemes. The second is that this work has brought powerful new tools to bear on the problem, because it has been a collaboration between a meteorologist with long experience in ARM research (Somerville) and a mathematician who is an expert on a class of advanced statistical techniques that are well-suited for diagnosing model cloud simulations using ARM observations (Shen).

  19. Rank score and permutation testing alternatives for regression quantile estimates

    USGS Publications Warehouse

    Cade, B.S.; Richards, J.D.; Mielke, P.W.

    2006-01-01

    Performance of quantile rank score tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1) was evaluated by simulation for models with p = 2 and 6 predictors, moderate collinearity among predictors, homogeneous and heterogeneous errors, small to moderate samples (n = 20–300), and central to upper quantiles (0.50–0.99). Test statistics evaluated were the conventional quantile rank score T statistic, distributed as a χ2 random variable with q degrees of freedom (where q parameters are constrained by H0), and an F statistic with its sampling distribution approximated by permutation. The permutation F-test maintained better Type I errors than the T-test for homogeneous error models with smaller n and more extreme quantiles τ. An F distributional approximation of the F statistic provided some improvements in Type I errors over the T-test for models with > 2 parameters, smaller n, and more extreme quantiles, but not as much improvement as the permutation approximation. Both rank score tests required weighting to maintain correct Type I errors when heterogeneity under the alternative model increased to 5 standard deviations across the domain of X. A double permutation procedure was developed to provide valid Type I errors for the permutation F-test when null models were forced through the origin. Power was similar for conditions where both T- and F-tests maintained correct Type I errors, but the F-test provided some power at smaller n and extreme quantiles when the T-test had no power because of excessively conservative Type I errors. When the double permutation scheme was required for the permutation F-test to maintain valid Type I errors, power was less than for the T-test with decreasing sample size and increasing quantiles. Confidence intervals on parameters and tolerance intervals for future predictions were constructed based on test inversion for an example application relating trout densities to stream channel width:depth.
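
    To illustrate the general permutation idea (though not the rank-score or double-permutation procedures evaluated in the paper), the toy example below permutes the predictor to build a null distribution for an upper-quantile regression slope; the width:depth and density values are hypothetical.

```python
# A toy permutation test on a quantile-regression slope (tau = 0.9); this shows
# the permutation idea only, with made-up width:depth and density data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(12)
n = 80
width_depth = rng.uniform(5, 50, n)                     # hypothetical predictor
density = 20 - 0.2 * width_depth + rng.gumbel(0, 3, n)  # hypothetical response

def upper_quantile_slope(x, y, q=0.9):
    return sm.QuantReg(y, sm.add_constant(x)).fit(q=q).params[1]

observed = upper_quantile_slope(width_depth, density)
null = np.array([upper_quantile_slope(rng.permutation(width_depth), density)
                 for _ in range(500)])
p_value = (np.abs(null) >= abs(observed)).mean()
print(round(observed, 3), round(p_value, 3))
```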

  20. Improving satellite-based PM2.5 estimates in China using Gaussian processes modeling in a Bayesian hierarchical setting.

    PubMed

    Yu, Wenxi; Liu, Yang; Ma, Zongwei; Bi, Jun

    2017-08-01

    Using satellite-based aerosol optical depth (AOD) measurements and statistical models to estimate ground-level PM2.5 is a promising way to fill in areas that are not covered by ground PM2.5 monitors. The statistical models used in previous studies are primarily Linear Mixed Effects (LME) and Geographically Weighted Regression (GWR) models. In this study, we developed a new regression model between PM2.5 and AOD using Gaussian processes in a Bayesian hierarchical setting. Gaussian processes model the stochastic nature of the spatial random effects, where the mean surface and the covariance function are specified. The spatial stochastic process is incorporated under the Bayesian hierarchical framework to explain the variation of PM2.5 concentrations together with other factors, such as AOD and spatial and non-spatial random effects. We evaluate the results of our model and compare them with those of other, conventional statistical models (GWR and LME) by within-sample model fitting and out-of-sample validation (cross validation, CV). The results show that our model possesses a CV result (R² = 0.81) that reflects higher accuracy than that of GWR and LME (0.74 and 0.48, respectively). Our results indicate that Gaussian process models have the potential to improve the accuracy of satellite-based PM2.5 estimates.
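
    A compact Gaussian-process regression sketch in the same spirit (a spatial kernel over coordinates plus an AOD covariate) is given below using scikit-learn rather than the authors' Bayesian hierarchical implementation; the coordinates, AOD values, and PM2.5 responses are synthetic.

```python
# Gaussian-process regression with a spatial + AOD kernel on synthetic data;
# this is a scikit-learn stand-in, not the paper's Bayesian hierarchical model.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel

rng = np.random.default_rng(8)
n = 300
coords = rng.uniform(0, 10, size=(n, 2))                  # hypothetical lon/lat
aod = rng.gamma(2.0, 0.3, size=n)
spatial = np.sin(coords[:, 0]) + 0.5 * np.cos(coords[:, 1])
pm25 = 20 + 30 * aod + 5 * spatial + rng.normal(0, 2, n)  # synthetic PM2.5

X = np.column_stack([coords, aod])
kernel = ConstantKernel() * RBF(length_scale=[2.0, 2.0, 1.0]) + WhiteKernel()
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X[:250], pm25[:250])
pred = gp.predict(X[250:])
print(round(np.corrcoef(pred, pm25[250:])[0, 1] ** 2, 2))  # out-of-sample check
```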

  1. Season-ahead water quality forecasts for the Schuylkill River, Pennsylvania

    NASA Astrophysics Data System (ADS)

    Block, P. J.; Leung, K.

    2013-12-01

    Anticipating and preparing for elevated water quality parameter levels in critical water sources, using weather forecasts, is not uncommon. In this study, we explore the feasibility of extending this prediction scale to a season-ahead for the Schuylkill River in Philadelphia, utilizing both statistical and dynamical prediction models, to characterize the season. This advance information has relevance for recreational activities, ecosystem health, and water treatment, as the Schuylkill provides 40% of Philadelphia's water supply. The statistical model associates large-scale climate drivers with streamflow and water quality parameter levels; numerous variables from NOAA's CFSv2 model are evaluated for the dynamical approach. A multi-model combination is also assessed. Results indicate moderately skillful prediction of average summertime total coliform and wintertime turbidity, using season-ahead oceanic and atmospheric variables, predominantly from the North Atlantic Ocean. Models predicting the number of elevated turbidity events across the wintertime season are also explored.

  2. Alpha 2 LASSO Data Bundles

    DOE Data Explorer

    Gustafson, William Jr; Vogelmann, Andrew; Endo, Satoshi; Toto, Tami; Xiao, Heng; Li, Zhijin; Cheng, Xiaoping; Kim, Jinwon; Krishna, Bhargavi

    2015-08-31

    The Alpha 2 release is the second release from the LASSO Pilot Phase that builds upon the Alpha 1 release. Alpha 2 contains additional diagnostics in the data bundles and focuses on cases from spring-summer 2016. A data bundle is a unified package consisting of LASSO LES input and output, observations, evaluation diagnostics, and model skill scores. LES input include model configuration information and forcing data. LES output includes profile statistics and full domain fields of cloud and environmental variables. Model evaluation data consists of LES output and ARM observations co-registered on the same grid and sampling frequency. Model performance is quantified by skill scores and diagnostics in terms of cloud and environmental variables.

  3. Probabilistic evaluation of damage potential in earthquake-induced liquefaction in a 3-D soil deposit

    NASA Astrophysics Data System (ADS)

    Halder, A.; Miller, F. J.

    1982-03-01

    A probabilistic model to evaluate the risk of liquefaction at a site and to limit or eliminate damage during earthquake-induced liquefaction is proposed. The model is extended to consider three-dimensional nonhomogeneous soil properties. The parameters relevant to the liquefaction phenomenon are identified, including: (1) soil parameters; (2) parameters required to consider laboratory test and sampling effects; and (3) loading parameters. The fundamentals of risk-based design concepts pertinent to liquefaction are reviewed. A detailed statistical evaluation of the soil parameters in the proposed liquefaction model is provided, and the uncertainty associated with the estimation of in situ relative density is evaluated for both direct and indirect methods. It is found that, for the liquefaction potential, the uncertainties in the load parameters could be higher than those in the resistance parameters.

  4. Predicting and evaluation the severity in acute pancreatitis using a new modeling built on body mass index and intra-abdominal pressure.

    PubMed

    Fei, Yang; Gao, Kun; Tu, Jianfeng; Wang, Wei; Zong, Guang-Quan; Li, Wei-Qin

    2017-06-03

    Acute pancreatitis (AP) remains a serious medical diagnosis and treatment problem. Early evaluation of severity and risk stratification in patients with AP is very important. Scoring systems such as the acute physiology and chronic health evaluation-II (APACHE-II), the computed tomography severity index (CTSI), Ranson's score and the bedside index of severity of AP (BISAP) have been used; nevertheless, these methods have shortcomings. The aim of this study was to construct a new model, including intra-abdominal pressure (IAP) and body mass index (BMI), to evaluate the severity of AP. The study comprised two independent cohorts of patients with AP: one set, of 1073 patients from Jinling Hospital treated between January 2013 and October 2016, was used to develop the model; another set, of 326 patients from the 81st Hospital treated between January 2012 and December 2016, was used to validate it. The association between risk factors and severity of AP was assessed by univariable analysis; multivariable modeling was explored through stepwise selection regression. The changes in IAP and BMI were combined to generate a regression equation as the new model. Statistical indexes were used to evaluate the predictive value of the new model. Univariable analysis confirmed that changes in IAP and BMI were significantly associated with severity of AP. The sensitivity, specificity, positive predictive value, negative predictive value and accuracy of the new model for predicting severity of AP were 77.6%, 82.6%, 71.9%, 87.5% and 74.9%, respectively, in the development dataset. There were significant differences between the new model and other scoring systems in these parameters (P < 0.05). In addition, a comparison of the areas under their receiver operating characteristic curves showed a statistically significant difference (P < 0.05). The same results were found in the validation dataset. A new model based on IAP and BMI is more likely to predict the severity of AP. Copyright © 2017. Published by Elsevier Inc.

  5. Relating crash frequency and severity: evaluating the effectiveness of shoulder rumble strips on reducing fatal and major injury crashes.

    PubMed

    Wu, Kun-Feng; Donnell, Eric T; Aguero-Valverde, Jonathan

    2014-06-01

    To approach the goal of "Toward Zero Deaths," there is a need to develop an analysis paradigm to better understand the effects of a countermeasure on reducing the number of severe crashes. One of the goals in traffic safety research is to search for an effective treatment to reduce fatal and major injury crashes, referred to as severe crashes. To achieve this goal, the selection of promising countermeasures is of utmost importance, and relies on the effectiveness of candidate countermeasures in reducing severe crashes. Although it is important to precisely evaluate the effectiveness of candidate countermeasures in reducing the number of severe crashes at a site, the current state-of-the-practice often leads to biased estimates. While there have been a few advanced statistical models developed to mitigate the problem in practice, these models are computationally difficult to estimate because severe crashes are dispersed spatially and temporally, and cannot be integrated into the Highway Safety Manual framework, which develops a series of safety performance functions and crash modification factors to predict the number of crashes. Crash severity outcomes are generally integrated into the Highway Safety Manual using deterministic distributions rather than statistical models. Accounting for the variability in crash severity as a function of geometric design, traffic flow, and other roadway and roadside features is afforded by estimating statistical models. Therefore, there is a need to develop a new analysis paradigm to resolve the limitations in the current Highway Safety Manual methods. We propose an approach which decomposes the severe crash frequency into a function of the change in the total number of crashes and the probability of a crash becoming a severe crash before and after a countermeasure is implemented. We tested this approach by evaluating the effectiveness of shoulder rumble strips on reducing the number of severe crashes. A total of 310 segments that had shoulder rumble strips installed during 2002-2009 were included in the analysis. It was found that shoulder rumble strips reduce the total number of crashes, but have no statistically significant effect on reducing the probability of a severe crash outcome. Copyright © 2014 Elsevier Ltd. All rights reserved.

  6. Multi-objective calibration and uncertainty analysis of hydrologic models; A comparative study between formal and informal methods

    NASA Astrophysics Data System (ADS)

    Shafii, M.; Tolson, B.; Matott, L. S.

    2012-04-01

    Hydrologic modeling has benefited from significant developments over the past two decades. This has resulted in higher levels of complexity being built into hydrologic models, which makes the model evaluation process (parameter estimation via calibration and uncertainty analysis) more challenging. In order to avoid unreasonable parameter estimates, many researchers have suggested implementation of multi-criteria calibration schemes. Furthermore, for predictive hydrologic models to be useful, proper consideration of uncertainty is essential. Consequently, recent research has emphasized comprehensive model assessment procedures in which multi-criteria parameter estimation is combined with statistically based uncertainty analysis routines such as Bayesian inference using Markov Chain Monte Carlo (MCMC) sampling. Such a procedure relies on the use of formal likelihood functions based on statistical assumptions, and moreover, the Bayesian inference structured on MCMC samplers requires a considerably large number of simulations. Due to these issues, especially in complex non-linear hydrological models, a variety of alternative informal approaches have been proposed for uncertainty analysis in the multi-criteria context. This study aims at exploring a number of such informal uncertainty analysis techniques in multi-criteria calibration of hydrological models. The informal methods addressed in this study are (i) Pareto optimality, which quantifies the parameter uncertainty using the Pareto solutions, (ii) DDS-AU, which uses the weighted sum of objective functions to derive the prediction limits, and (iii) GLUE, which describes the total uncertainty through identification of behavioral solutions. The main objective is to compare such methods with MCMC-based Bayesian inference with respect to factors such as computational burden and predictive capacity, which are evaluated based on multiple comparative measures. The measures for comparison are calculated for both calibration and evaluation periods. The uncertainty analysis methodologies are applied to a simple 5-parameter rainfall-runoff model, called HYMOD.
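
    A compact GLUE-style sketch of the informal approach is shown below: sample a parameter, keep "behavioral" sets whose Nash-Sutcliffe efficiency exceeds a threshold, and form prediction limits from the behavioral ensemble. The toy model is a one-parameter linear reservoir, not HYMOD, and the 0.7 threshold is an arbitrary choice.

```python
# GLUE-style sketch: uniform parameter sampling, a behavioral NSE threshold,
# and 5-95% prediction limits from the behavioral ensemble (toy model).
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(100)
rain = rng.gamma(0.8, 2.0, size=t.size)

def linear_reservoir(k, rain):
    q = np.zeros_like(rain)
    store = 0.0
    for i, p in enumerate(rain):
        store += p
        q[i] = store / k
        store -= q[i]
    return q

obs = linear_reservoir(6.0, rain) + rng.normal(0, 0.1, t.size)   # synthetic "observations"

def nse(sim, obs):
    return 1 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

samples = rng.uniform(2.0, 15.0, size=2000)                      # prior on k
sims = np.array([linear_reservoir(k, rain) for k in samples])
scores = np.array([nse(s, obs) for s in sims])
behavioral = sims[scores > 0.7]                                  # GLUE threshold
lower, upper = np.percentile(behavioral, [5, 95], axis=0)
print(behavioral.shape[0], round(lower.mean(), 2), round(upper.mean(), 2))
```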

  7. Evaluating performances of simplified physically based landslide susceptibility models.

    NASA Astrophysics Data System (ADS)

    Capparelli, Giovanna; Formetta, Giuseppe; Versace, Pasquale

    2015-04-01

    Rainfall-induced shallow landslides cause significant damage involving loss of life and property. Predicting locations susceptible to shallow landslides is a complex task that involves many disciplines: hydrology, geotechnical science, geomorphology, and statistics. Usually, two main approaches are used to accomplish this task: statistical or physically based models. This paper presents a package of GIS-based models for landslide susceptibility analysis. It was integrated in the NewAge-JGrass hydrological model using the Object Modeling System (OMS) modeling framework. The package includes three simplified physically based models for landslide susceptibility analysis (M1, M2, and M3) and a component for model verification. It computes eight goodness-of-fit (GOF) indices by comparing pixel-by-pixel model results and measurement data. Moreover, the package integration in NewAge-JGrass allows the use of other components, such as geographic information system tools to manage input-output processes, and automatic calibration algorithms to estimate model parameters. The system offers the possibility to investigate and fairly compare the quality and robustness of models and model parameters, according to a procedure that includes: i) model parameter estimation by optimizing each GOF index separately, ii) model evaluation in the ROC plane using each optimal parameter set, and iii) GOF robustness evaluation by assessing sensitivity to input parameter variation. This procedure was repeated for all three models. The system was applied to a case study in Calabria (Italy) along the Salerno-Reggio Calabria highway, between Cosenza and the Altilia municipality. The analysis showed that, among all the optimized indices and all three models, Average Index (AI) optimization coupled with model M3 is the best modeling solution for our test case. This research was funded by PON Project No. 01_01503 "Integrated Systems for Hydrogeological Risk Monitoring, Early Warning and Mitigation Along the Main Lifelines", CUP B31H11000370005, in the framework of the National Operational Program for "Research and Competitiveness" 2007-2013.

  8. Power analysis on the time effect for the longitudinal Rasch model.

    PubMed

    Feddag, M L; Blanchin, M; Hardouin, J B; Sebille, V

    2014-01-01

    The statistics literature in the social, behavioral, and biomedical sciences typically stresses the importance of power analysis. Patient Reported Outcomes (PRO) such as quality of life and other perceived health measures (pain, fatigue, stress, ...) are increasingly used as important health outcomes in clinical trials and epidemiological studies. They cannot be directly observed or measured like other clinical or biological data, and they are often collected through questionnaires with binary or polytomous items. The Rasch model is a well-known item response theory (IRT) model for binary data. This article proposes an approach to evaluate the statistical power of the time effect for the longitudinal Rasch model with two time points. The performance of this method is compared to that obtained by a simulation study. Finally, the proposed approach is illustrated on one subscale of the SF-36 questionnaire.
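
    A crude simulation-based counterpart to the proposed power calculation is sketched below: binary responses are generated from a Rasch model at two time points and the time effect is tested with a paired t-test on sum scores. This is a rough stand-in for illustration, not the analytical approach of the paper.

```python
# Simulation-based power check for a time effect with binary Rasch-model items;
# the item difficulties, effect size, and test choice are illustrative.
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(10)

def simulate_power(n_subjects=200, n_items=10, time_effect=0.3, n_rep=500, alpha=0.05):
    difficulties = np.linspace(-1.5, 1.5, n_items)
    hits = 0
    for _ in range(n_rep):
        theta = rng.normal(0, 1, n_subjects)
        p1 = 1 / (1 + np.exp(-(theta[:, None] - difficulties)))
        p2 = 1 / (1 + np.exp(-(theta[:, None] + time_effect - difficulties)))
        s1 = (rng.uniform(size=p1.shape) < p1).sum(axis=1)   # time-1 sum scores
        s2 = (rng.uniform(size=p2.shape) < p2).sum(axis=1)   # time-2 sum scores
        if ttest_rel(s2, s1).pvalue < alpha:
            hits += 1
    return hits / n_rep

print(simulate_power())
```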

  9. AA9int: SNP Interaction Pattern Search Using Non-Hierarchical Additive Model Set.

    PubMed

    Lin, Hui-Yi; Huang, Po-Yu; Chen, Dung-Tsa; Tung, Heng-Yuan; Sellers, Thomas A; Pow-Sang, Julio; Eeles, Rosalind; Easton, Doug; Kote-Jarai, Zsofia; Amin Al Olama, Ali; Benlloch, Sara; Muir, Kenneth; Giles, Graham G; Wiklund, Fredrik; Gronberg, Henrik; Haiman, Christopher A; Schleutker, Johanna; Nordestgaard, Børge G; Travis, Ruth C; Hamdy, Freddie; Neal, David E; Pashayan, Nora; Khaw, Kay-Tee; Stanford, Janet L; Blot, William J; Thibodeau, Stephen N; Maier, Christiane; Kibel, Adam S; Cybulski, Cezary; Cannon-Albright, Lisa; Brenner, Hermann; Kaneva, Radka; Batra, Jyotsna; Teixeira, Manuel R; Pandha, Hardev; Lu, Yong-Jie; Park, Jong Y

    2018-06-07

    The use of single nucleotide polymorphism (SNP) interactions to predict complex diseases has received more attention during the past decade, but related statistical methods are still immature. We previously proposed the SNP Interaction Pattern Identifier (SIPI) approach to evaluate 45 SNP interaction patterns. SIPI is statistically powerful but suffers from a large computation burden. For large-scale studies, it is necessary to use a powerful and computation-efficient method. The objective of this study is to develop an evidence-based mini-version of SIPI as a screening tool or for solitary use, and to evaluate the impact of inheritance mode and model structure on detecting SNP-SNP interactions. We tested two candidate approaches: the 'Five-Full' and 'AA9int' methods. The Five-Full approach is composed of the five full interaction models considering three inheritance modes (additive, dominant and recessive). The AA9int approach is composed of nine interaction models by considering non-hierarchical model structure and the additive mode. Our simulation results show that AA9int has similar statistical power compared to SIPI and is superior to the Five-Full approach, and that the impact of the non-hierarchical model structure is greater than that of the inheritance mode in detecting SNP-SNP interactions. In summary, AA9int is recommended as a powerful tool to be used either alone or as the screening stage of a two-stage approach (AA9int+SIPI) for detecting SNP-SNP interactions in large-scale studies. The 'AA9int' and 'parAA9int' functions (standard and parallel computing versions) are added in the SIPI R package, which is freely available at https://linhuiyi.github.io/LinHY_Software/. hlin1@lsuhsc.edu. Supplementary data are available at Bioinformatics online.

  10. Impact of IRT item misfit on score estimates and severity classifications: an examination of PROMIS depression and pain interference item banks.

    PubMed

    Zhao, Yue

    2017-03-01

    In patient-reported outcome research that utilizes item response theory (IRT), using statistical significance tests to detect misfit is usually the focus of IRT model-data fit evaluations. However, such evaluations rarely address the impact/consequence of using misfitting items on the intended clinical applications. This study was designed to evaluate the impact of IRT item misfit on score estimates and severity classifications and to demonstrate a recommended process of model-fit evaluation. Using secondary data sources collected from the Patient-Reported Outcome Measurement Information System (PROMIS) wave 1 testing phase, analyses were conducted based on PROMIS depression (28 items; 782 cases) and pain interference (41 items; 845 cases) item banks. The identification of misfitting items was assessed using Orlando and Thissen's summed-score item-fit statistics and graphical displays. The impact of misfit was evaluated according to the agreement of both IRT-derived T-scores and severity classifications between inclusion and exclusion of misfitting items. The examination of the presence and impact of misfit suggested that item misfit had a negligible impact on the T-score estimates and severity classifications with the general population sample in the PROMIS depression and pain interference item banks, implying that the impact of item misfit was insignificant. Findings support the T-score estimates in the two item banks as robust against item misfit at both the group and individual levels and add confidence to the use of T-scores for severity diagnosis in the studied sample. Recommendations on approaches for identifying item misfit (statistical significance) and assessing the misfit impact (practical significance) are given.

  11. Methods for evaluating the predictive accuracy of structural dynamic models

    NASA Technical Reports Server (NTRS)

    Hasselman, Timothy K.; Chrostowski, Jon D.

    1991-01-01

    Modeling uncertainty is defined in terms of the difference between predicted and measured eigenvalues and eigenvectors. Data compiled from 22 sets of analysis/test results was used to create statistical databases for large truss-type space structures and both pretest and posttest models of conventional satellite-type space structures. Modeling uncertainty is propagated through the model to produce intervals of uncertainty on frequency response functions, both amplitude and phase. This methodology was used successfully to evaluate the predictive accuracy of several structures, including the NASA CSI Evolutionary Structure tested at Langley Research Center. Test measurements for this structure were within ± one-sigma intervals of predicted accuracy for the most part, demonstrating the validity of the methodology and computer code.

  12. Forecasting in foodservice: model development, testing, and evaluation.

    PubMed

    Miller, J L; Thompson, P A; Orabella, M M

    1991-05-01

    This study was designed to develop, test, and evaluate mathematical models appropriate for forecasting menu-item production demand in foodservice. Data were collected from residence and dining hall foodservices at Ohio State University. Objectives of the study were to collect, code, and analyze the data; develop and test models using actual operation data; and compare forecasting results with current methods in use. Customer count was forecast using deseasonalized simple exponential smoothing. Menu-item demand was forecast by multiplying the count forecast by a predicted preference statistic. Forecasting models were evaluated using mean squared error, mean absolute deviation, and mean absolute percentage error techniques. All models were more accurate than current methods. A broad spectrum of forecasting techniques could be used by foodservice managers with access to a personal computer and spreadsheet and database-management software. The findings indicate that mathematical forecasting techniques may be effective in foodservice operations to control costs, increase productivity, and maximize profits.
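
    The forecasting recipe described here (deseasonalize, apply simple exponential smoothing, then restore the seasonal factor) can be sketched as follows with statsmodels; the weekday pattern and customer counts are made up.

```python
# Deseasonalized simple exponential smoothing of a customer-count series;
# the weekday factors and counts are synthetic stand-ins.
import numpy as np
from statsmodels.tsa.holtwinters import SimpleExpSmoothing

rng = np.random.default_rng(9)
weeks, days = 12, 7
seasonal = np.array([1.1, 1.0, 0.9, 0.95, 1.05, 1.3, 0.7])       # weekday pattern
counts = 400 * np.tile(seasonal, weeks) + rng.normal(0, 15, weeks * days)

# Deseasonalize, smooth, then restore the seasonal factor for the forecast day
factors = np.tile(seasonal, weeks)
deseason = counts / factors
fit = SimpleExpSmoothing(deseason).fit(smoothing_level=0.3, optimized=False)
next_day_forecast = fit.forecast(1)[0] * seasonal[0]              # next weekday is index 0
print(round(next_day_forecast, 1))
```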

  13. Evaluation of the synoptic and mesoscale predictive capabilities of a mesoscale atmospheric simulation system

    NASA Technical Reports Server (NTRS)

    Koch, S. E.; Skillman, W. C.; Kocin, P. J.; Wetzel, P. J.; Brill, K.; Keyser, D. A.; Mccumber, M. C.

    1983-01-01

    The overall performance characteristics of a limited area, hydrostatic, fine (52 km) mesh, primitive equation, numerical weather prediction model are determined in anticipation of satellite data assimilations with the model. The synoptic and mesoscale predictive capabilities of version 2.0 of this model, the Mesoscale Atmospheric Simulation System (MASS 2.0), were evaluated. The two-part study is based on a sample of approximately thirty 12h and 24h forecasts of atmospheric flow patterns during spring and early summer. The synoptic scale evaluation results benchmark the performance of MASS 2.0 against that of an operational, synoptic scale weather prediction model, the Limited area Fine Mesh (LFM). The large sample allows for the calculation of statistically significant measures of forecast accuracy and the determination of systematic model errors. The synoptic scale benchmark is required before unsmoothed mesoscale forecast fields can be seriously considered.

  14. Speckle noise in satellite based lidar systems

    NASA Technical Reports Server (NTRS)

    Gardner, C. S.

    1977-01-01

    The lidar system model was described, and the statistics of the signal and noise at the receiver output were derived. Scattering media effects were discussed along with polarization and atmospheric turbulence. The major equations were summarized and evaluated for some typical parameters.

  15. Model of Auctioneer Estimation of Swordtip Squid (Loligo edulis) Quality

    NASA Astrophysics Data System (ADS)

    Nakamura, Makoto; Matsumoto, Keisuke; Morimoto, Eiji; Ezoe, Satoru; Maeda, Toshimichi; Hirano, Takayuki

    The knowledge of experienced auctioneers regarding the circulation of marine products is an essential skill and is necessary for evaluating product quality and managing aspects such as freshness. In the present study, the ability of an auctioneer to quickly evaluate the freshness of swordtip squid (Loligo edulis) at fish markets was analyzed. Evaluation characteristics used by an auctioneer were analyzed and developed using a fuzzy logic model. Forty boxes containing 247 swordtip squid with mantles measuring 220 mm that had been evaluated and assigned to one of five quality categories by an auctioneer were used for the analysis and the modeling. The relationships between the evaluations of appearance, body color, and muscle freshness were statistically analyzed. It was found that a total of four indexes of the epidermis color strongly reflected evaluations of appearance: dispersion ratio of the head, chroma on the head-end mantle and the difference in the chroma and brightness of the mantle. The fuzzy logic model used these indexes for the antecedent-part of the linguistic rules. The results of both simulation and evaluations demonstrate that the model is robust, with the predicted results corresponding with more than 96% of the quality assignments of the auctioneers.

  16. Prediction of the presence of insulin resistance using general health checkup data in Japanese employees with metabolic risk factors.

    PubMed

    Takahara, Mitsuyoshi; Katakami, Naoto; Kaneto, Hideaki; Noguchi, Midori; Shimomura, Iichiro

    2014-01-01

    The aim of the current study was to develop a predictive model of insulin resistance using general health checkup data in Japanese employees with one or more metabolic risk factors. We used a database of 846 Japanese employees with one or more metabolic risk factors who underwent general health checkup and a 75-g oral glucose tolerance test (OGTT). Logistic regression models were developed to predict existing insulin resistance evaluated using the Matsuda index. The predictive performance of these models was assessed using the C statistic. The C statistics of body mass index (BMI), waist circumference and their combined use were 0.743, 0.732 and 0.749, with no significant differences. The multivariate backward selection model, in which BMI, the levels of plasma glucose, high-density lipoprotein (HDL) cholesterol, log-transformed triglycerides and log-transformed alanine aminotransferase and hypertension under treatment remained, had a C statistic of 0.816, with a significant difference compared to the combined use of BMI and waist circumference (p<0.01). The C statistic was not significantly reduced when the levels of log-transformed triglycerides and log-transformed alanine aminotransferase and hypertension under treatment were simultaneously excluded from the multivariate model (p=0.14). On the other hand, further exclusion of any of the remaining three variables significantly reduced the C statistic (all p<0.01). When predicting the presence of insulin resistance using general health checkup data in Japanese employees with metabolic risk factors, it is important to take into consideration the BMI and fasting plasma glucose and HDL cholesterol levels.
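
    As an illustration of comparing C statistics between a simple and a multivariable model, the snippet below fits two logistic regressions on synthetic data and reports their ROC AUCs; the coefficients and variables are invented and do not reproduce the study's models.

```python
# Compare C statistics (ROC AUC) of a BMI-only model and a multivariable
# logistic model on synthetic data; coefficients are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(4)
n = 800
bmi = rng.normal(24, 3, n)
fpg = rng.normal(100, 12, n)      # fasting plasma glucose
hdl = rng.normal(55, 12, n)
logit = -14 + 0.45 * bmi + 0.06 * fpg - 0.05 * hdl
insulin_resistant = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

simple_X = bmi.reshape(-1, 1)
full_X = np.column_stack([bmi, fpg, hdl])
simple = LogisticRegression(max_iter=1000).fit(simple_X, insulin_resistant)
full = LogisticRegression(max_iter=1000).fit(full_X, insulin_resistant)
print(round(roc_auc_score(insulin_resistant, simple.predict_proba(simple_X)[:, 1]), 3),
      round(roc_auc_score(insulin_resistant, full.predict_proba(full_X)[:, 1]), 3))
```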

  17. Direction dependence analysis: A framework to test the direction of effects in linear models with an implementation in SPSS.

    PubMed

    Wiedermann, Wolfgang; Li, Xintong

    2018-04-16

    In nonexperimental data, at least three possible explanations exist for the association of two variables x and y: (1) x is the cause of y, (2) y is the cause of x, or (3) an unmeasured confounder is present. Statistical tests that identify which of the three explanatory models fits best would be a useful adjunct to the use of theory alone. The present article introduces one such statistical method, direction dependence analysis (DDA), which assesses the relative plausibility of the three explanatory models on the basis of higher-moment information about the variables (i.e., skewness and kurtosis). DDA involves the evaluation of three properties of the data: (1) the observed distributions of the variables, (2) the residual distributions of the competing models, and (3) the independence properties of the predictors and residuals of the competing models. When the observed variables are nonnormally distributed, we show that DDA components can be used to uniquely identify each explanatory model. Statistical inference methods for model selection are presented, and macros to implement DDA in SPSS are provided. An empirical example is given to illustrate the approach. Conceptual and empirical considerations are discussed for best-practice applications in psychological data, and sample size recommendations based on previous simulation studies are provided.
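
    A rough sketch of the direction-dependence intuition is given below: when the true cause is non-normally distributed, the residuals of the correctly oriented regression are closer to symmetric than those of the reversed regression. Real DDA relies on formal tests of these higher moments; this is only an eyeball comparison on synthetic data.

```python
# Compare residual skewness of the two candidate causal directions; the
# exponential "cause" and the effect size are arbitrary illustrative choices.
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(6)
x = rng.exponential(scale=1.0, size=2000)          # skewed "cause"
y = 0.8 * x + rng.normal(0, 0.5, x.size)           # linear effect with normal noise

def residual_skew(pred, resp):
    b1, b0 = np.polyfit(pred, resp, 1)
    return skew(resp - (b0 + b1 * pred))

print(round(residual_skew(x, y), 2),    # x -> y: residuals roughly symmetric
      round(residual_skew(y, x), 2))    # y -> x: residuals inherit the skewness
```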

  18. An improved approach for flight readiness certification: Probabilistic models for flaw propagation and turbine blade failure. Volume 1: Methodology and applications

    NASA Technical Reports Server (NTRS)

    Moore, N. R.; Ebbeler, D. H.; Newlin, L. E.; Sutharshana, S.; Creager, M.

    1992-01-01

    An improved methodology for quantitatively evaluating failure risk of spaceflight systems to assess flight readiness and identify risk control measures is presented. This methodology, called Probabilistic Failure Assessment (PFA), combines operating experience from tests and flights with analytical modeling of failure phenomena to estimate failure risk. The PFA methodology is of particular value when information on which to base an assessment of failure risk, including test experience and knowledge of parameters used in analytical modeling, is expensive or difficult to acquire. The PFA methodology is a prescribed statistical structure in which analytical models that characterize failure phenomena are used conjointly with uncertainties about analysis parameters and/or modeling accuracy to estimate failure probability distributions for specific failure modes. These distributions can then be modified, by means of statistical procedures of the PFA methodology, to reflect any test or flight experience. State-of-the-art analytical models currently employed for design, failure prediction, or performance analysis are used in this methodology. The rationale for the statistical approach taken in the PFA methodology is discussed, the PFA methodology is described, and examples of its application to structural failure modes are presented. The engineering models and computer software used in fatigue crack growth and fatigue crack initiation applications are thoroughly documented.

  19. An improved approach for flight readiness certification: Probabilistic models for flaw propagation and turbine blade failure. Volume 2: Software documentation

    NASA Technical Reports Server (NTRS)

    Moore, N. R.; Ebbeler, D. H.; Newlin, L. E.; Sutharshana, S.; Creager, M.

    1992-01-01

    An improved methodology for quantitatively evaluating failure risk of spaceflight systems to assess flight readiness and identify risk control measures is presented. This methodology, called Probabilistic Failure Assessment (PFA), combines operating experience from tests and flights with analytical modeling of failure phenomena to estimate failure risk. The PFA methodology is of particular value when information on which to base an assessment of failure risk, including test experience and knowledge of parameters used in analytical modeling, is expensive or difficult to acquire. The PFA methodology is a prescribed statistical structure in which analytical models that characterize failure phenomena are used conjointly with uncertainties about analysis parameters and/or modeling accuracy to estimate failure probability distributions for specific failure modes. These distributions can then be modified, by means of statistical procedures of the PFA methodology, to reflect any test or flight experience. State-of-the-art analytical models currently employed for design, failure prediction, or performance analysis are used in this methodology. The rationale for the statistical approach taken in the PFA methodology is discussed, the PFA methodology is described, and examples of its application to structural failure modes are presented. The engineering models and computer software used in fatigue crack growth and fatigue crack initiation applications are thoroughly documented.

  20. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    PubMed

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the comparable applicability of the orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation during the first week of admission and again six months later. All data were primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
