Sample records for predictive modeling approaches

  1. Risk prediction model: Statistical and artificial neural network approach

    NASA Astrophysics Data System (ADS)

    Paiman, Nuur Azreen; Hariri, Azian; Masood, Ibrahim

    2017-04-01

    Prediction models are gaining popularity and have been used in numerous areas of study to complement clinical reasoning and decision making. The adoption of such models assists physicians' decision making and individuals' behavior, and consequently improves individual outcomes and the cost-effectiveness of care. The objective of this paper is to review articles related to risk prediction models in order to understand the suitable approach, development, and validation process for such models. A qualitative review was carried out of the aims, methods, and significant main outcomes of nineteen published articles that developed risk prediction models in numerous fields. This paper also reviews how researchers develop and validate risk prediction models based on statistical and artificial neural network approaches. From the review, some methodological recommendations for developing and validating prediction models are highlighted. According to the studies reviewed, the artificial neural network approach to developing prediction models was more accurate than the statistical approach. However, only limited published literature currently discusses which approach is more accurate for risk prediction model development.

  2. Development of a noise prediction model based on advanced fuzzy approaches in typical industrial workrooms.

    PubMed

    Aliabadi, Mohsen; Golmohammadi, Rostam; Khotanlou, Hassan; Mansoorizadeh, Muharram; Salarpour, Amir

    2014-01-01

    Noise prediction is considered to be the best method for evaluating cost-preventative noise controls in industrial workrooms. One of the most important issues is the development of accurate models for analysis of the complex relationships among acoustic features affecting noise level in workrooms. In this study, advanced fuzzy approaches were employed to develop relatively accurate models for predicting noise in noisy industrial workrooms. The data were collected from 60 industrial embroidery workrooms in the Khorasan Province, East of Iran. The main acoustic and embroidery process features that influence the noise were used to develop prediction models using MATLAB software. Multiple regression technique was also employed and its results were compared with those of fuzzy approaches. Prediction errors of all prediction models based on fuzzy approaches were within the acceptable level (lower than one dB). However, Neuro-fuzzy model (RMSE=0.53dB and R2=0.88) could slightly improve the accuracy of noise prediction compared with generate fuzzy model. Moreover, fuzzy approaches provided more accurate predictions than did regression technique. The developed models based on fuzzy approaches as useful prediction tools give professionals the opportunity to have an optimum decision about the effectiveness of acoustic treatment scenarios in embroidery workrooms.
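
    The two accuracy figures quoted in this abstract (RMSE and R2) can be illustrated with a minimal sketch. The sound levels below are hypothetical values chosen for easy arithmetic, not data from the study, and R2 is computed here as 1 - SS_res/SS_tot, which is one common definition and may differ from the one the authors used.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs (dB here)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical sound levels in dB (assumed for illustration only).
observed_db = [85.0, 87.5, 90.0, 92.5]
predicted_db = [85.5, 87.0, 90.5, 92.0]

model_rmse = rmse(observed_db, predicted_db)
model_r2 = r_squared(observed_db, predicted_db)
```

    With every prediction off by exactly 0.5 dB, the RMSE is 0.5 dB, which would fall inside the "lower than one dB" acceptance level mentioned in the abstract.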

  3. Developing a local least-squares support vector machines-based neuro-fuzzy model for nonlinear and chaotic time series prediction.

    PubMed

    Miranian, A; Abdollahzade, M

    2013-02-01

    Local modeling approaches, owing to their ability to model different operating regimes of nonlinear systems and processes by independent local models, seem appealing for modeling, identification, and prediction applications. In this paper, we propose a local neuro-fuzzy (LNF) approach based on the least-squares support vector machines (LSSVMs). The proposed LNF approach employs LSSVMs, which are powerful in modeling and predicting time series, as local models and uses hierarchical binary tree (HBT) learning algorithm for fast and efficient estimation of its parameters. The HBT algorithm heuristically partitions the input space into smaller subdomains by axis-orthogonal splits. In each partitioning, the validity functions automatically form a unity partition and therefore normalization side effects, e.g., reactivation, are prevented. Integration of LSSVMs into the LNF network as local models, along with the HBT learning algorithm, yields a high-performance approach for modeling and prediction of complex nonlinear time series. The proposed approach is applied to modeling and predictions of different nonlinear and chaotic real-world and hand-designed systems and time series. Analysis of the prediction results and comparisons with recent and old studies demonstrate the promising performance of the proposed LNF approach with the HBT learning algorithm for modeling and prediction of nonlinear and chaotic systems and time series.

  4. The Role of Multimodel Combination in Improving Streamflow Prediction

    NASA Astrophysics Data System (ADS)

    Arumugam, S.; Li, W.

    2008-12-01

    Model errors are an inevitable part of any prediction exercise. One approach that is currently gaining attention for reducing model errors is to optimally combine multiple models to develop improved predictions. The rationale behind this approach rests primarily on the premise that optimal weights can be derived for each model so that the resulting multimodel predictions have improved predictability. In this study, we present a new approach to combining multiple hydrological models by evaluating their predictability contingent on the predictor state. We combine two hydrological models, the 'abcd' model and the Variable Infiltration Capacity (VIC) model, with each model's parameters being estimated by two different objective functions to develop multimodel streamflow predictions. The performance of multimodel predictions is compared with individual model predictions using correlation, root mean square error, and the Nash-Sutcliffe coefficient. To quantify precisely under what conditions the multimodel predictions result in improved predictions, we evaluate the proposed algorithm by testing it against streamflow generated from a known model ('abcd' model or VIC model) with errors being homoscedastic or heteroscedastic. Results from the study show that streamflow simulated from individual models performed better than multimodels under almost no model error. Under increased model error, the multimodel consistently performed better than the single model prediction in terms of all performance measures. The study also evaluates the proposed algorithm for streamflow predictions in two humid river basins in NC as well as in two arid basins in Arizona. Through detailed validation at these four sites, the study shows that the multimodel approach predicts the observed streamflow better than the single model predictions.
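
    The general idea of weighting models and scoring the combination can be sketched as follows. This is not the state-contingent algorithm the abstract proposes: it substitutes a simple static inverse-MSE weighting, and the flows and the two "models" are hypothetical. The weights are also fitted and evaluated on the same data, which a real study would avoid.

```python
def mean_squared_error(observed, simulated):
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)


def inverse_mse_weights(observed, simulations):
    """Weights proportional to 1/MSE, normalized to sum to 1.

    Assumes no simulation matches the observations exactly (MSE > 0).
    """
    inverse_errors = [1.0 / mean_squared_error(observed, sim) for sim in simulations]
    total = sum(inverse_errors)
    return [value / total for value in inverse_errors]


def combine(simulations, weights):
    """Weighted average of the simulations at each time step."""
    n_steps = len(simulations[0])
    return [sum(w * sim[t] for w, sim in zip(weights, simulations))
            for t in range(n_steps)]


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_observed = sum(observed) / len(observed)
    numerator = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    denominator = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - numerator / denominator


# Hypothetical streamflow series (arbitrary units); model_a overpredicts by 2,
# model_b underpredicts by 1.
observed_flow = [10.0, 20.0, 30.0, 40.0]
model_a = [12.0, 22.0, 32.0, 42.0]
model_b = [9.0, 19.0, 29.0, 39.0]

weights = inverse_mse_weights(observed_flow, [model_a, model_b])
multimodel = combine([model_a, model_b], weights)
```

    Because the two hypothetical models err in opposite directions, the weighted combination partly cancels their biases and scores a higher Nash-Sutcliffe coefficient than either model alone.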

  5. Posterior Predictive Bayesian Phylogenetic Model Selection

    PubMed Central

    Lewis, Paul O.; Xie, Wangang; Chen, Ming-Hui; Fan, Yu; Kuo, Lynn

    2014-01-01

    We present two distinctly different posterior predictive approaches to Bayesian phylogenetic model selection and illustrate these methods using examples from green algal protein-coding cpDNA sequences and flowering plant rDNA sequences. The Gelfand–Ghosh (GG) approach allows dissection of an overall measure of model fit into components due to posterior predictive variance (GGp) and goodness-of-fit (GGg), which distinguishes this method from the posterior predictive P-value approach. The conditional predictive ordinate (CPO) method provides a site-specific measure of model fit useful for exploratory analyses and can be combined over sites yielding the log pseudomarginal likelihood (LPML) which is useful as an overall measure of model fit. CPO provides a useful cross-validation approach that is computationally efficient, requiring only a sample from the posterior distribution (no additional simulation is required). Both GG and CPO add new perspectives to Bayesian phylogenetic model selection based on the predictive abilities of models and complement the perspective provided by the marginal likelihood (including Bayes Factor comparisons) based solely on the fit of competing models to observed data. [Bayesian; conditional predictive ordinate; CPO; L-measure; LPML; model selection; phylogenetics; posterior predictive.] PMID:24193892
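
    A minimal sketch of how CPO and LPML can be obtained from a posterior sample alone, assuming the commonly used Monte Carlo estimator in which each site's CPO is the harmonic mean of that site's likelihood over the posterior samples. The abstract does not state the exact estimator, and the likelihood values below are hypothetical. Practical implementations usually work in log space to avoid underflow.

```python
import math


def conditional_predictive_ordinates(site_likelihoods):
    """Estimate the CPO of each site.

    site_likelihoods[s][i] is the likelihood of site i under posterior
    sample s. The CPO of site i is the harmonic mean over samples.
    """
    n_samples = len(site_likelihoods)
    n_sites = len(site_likelihoods[0])
    cpo = []
    for i in range(n_sites):
        mean_inverse = sum(1.0 / sample[i] for sample in site_likelihoods) / n_samples
        cpo.append(1.0 / mean_inverse)
    return cpo


def log_pseudomarginal_likelihood(site_likelihoods):
    """LPML: the sum over sites of log(CPO)."""
    return sum(math.log(value)
               for value in conditional_predictive_ordinates(site_likelihoods))


# Hypothetical site likelihoods: 2 posterior samples (rows) x 2 sites (columns).
site_likelihoods = [
    [0.50, 0.20],
    [0.25, 0.20],
]

cpo = conditional_predictive_ordinates(site_likelihoods)
lpml = log_pseudomarginal_likelihood(site_likelihoods)
```

    The per-site values in `cpo` correspond to the site-specific measure of fit described in the abstract, while `lpml` aggregates them into a single overall score.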

  6. Predicting Football Matches Results using Bayesian Networks for English Premier League (EPL)

    NASA Astrophysics Data System (ADS)

    Razali, Nazim; Mustapha, Aida; Yatim, Faiz Ahmad; Aziz, Ruhaya Ab

    2017-08-01

    Modeling association football results has become increasingly popular in the last few years, and many different prediction models have been proposed with the aim of evaluating the attributes that lead a football team to lose, draw, or win a match. Three types of approaches have been considered for predicting football match results: statistical approaches, machine learning approaches, and Bayesian approaches. Lately, many studies on football prediction models have used Bayesian approaches. This paper proposes a Bayesian Network (BN) to predict the results of football matches in terms of home win (H), away win (A), and draw (D). Three seasons of the English Premier League (EPL), 2010-2011, 2011-2012, and 2012-2013, were selected and reviewed. K-fold cross validation was used to test the accuracy of the prediction model. The required football data were sourced from a legitimate site at http://www.football-data.co.uk. The BNs achieved an average predictive accuracy of 75.09% across the three seasons. It is hoped that the results can be used as a benchmark for future research on predicting football match results.
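
    The K-fold cross validation step mentioned above can be sketched as a plain index split. The function name and the choice of contiguous, unshuffled folds are assumptions for illustration; the abstract does not say how the folds were formed or what value of K was used.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds.

    Returns a list of (train_indices, test_indices) pairs. Each sample
    appears in exactly one test fold. No shuffling is performed.
    """
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_indices = list(range(start, start + size))
        train_indices = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_indices, test_indices))
        start += size
    return folds


# Hypothetical example: 10 matches split into 3 folds.
folds = k_fold_indices(10, 3)
```

    A model would be fitted on each `train_indices` set and scored on the matching `test_indices` set, with the K accuracies averaged at the end.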

  7. A Simplified Micromechanical Modeling Approach to Predict the Tensile Flow Curve Behavior of Dual-Phase Steels

    NASA Astrophysics Data System (ADS)

    Nanda, Tarun; Kumar, B. Ravi; Singh, Vishal

    2017-11-01

    Micromechanical modeling is used to predict a material's tensile flow curve behavior based on microstructural characteristics. This research develops a simplified micromechanical modeling approach for predicting the flow curve behavior of dual-phase steels. The existing literature reports two broad approaches for determining the tensile flow curve of these steels. The modeling approach developed in this work attempts to overcome specific limitations of the two existing approaches. This approach combines a dislocation-based strain-hardening method with the rule of mixtures. In the first step of modeling, the 'dislocation-based strain-hardening method' was employed to predict the tensile behavior of the individual phases, ferrite and martensite. In the second step, the individual flow curves were combined using the 'rule of mixtures' to obtain the composite dual-phase flow behavior. To check the accuracy of the proposed model, four distinct dual-phase microstructures comprising different ferrite grain sizes, martensite fractions, and carbon contents in martensite were processed by annealing experiments. The true stress-strain curves for the various microstructures were predicted with the newly developed micromechanical model. The results of the micromechanical model matched closely with those of actual tensile tests. Thus, this micromechanical modeling approach can be used to predict and optimize the tensile flow behavior of dual-phase steels.
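
    The second modeling step can be sketched under the assumption of a linear, volume-fraction-weighted (iso-strain) rule of mixtures. The abstract does not state which form of the rule was used, and the stress values below are hypothetical placeholders rather than outputs of the dislocation-based method.

```python
def rule_of_mixtures(stress_ferrite, stress_martensite, martensite_fraction):
    """Combine two phase flow curves sampled at the same strain values.

    Assumes the linear iso-strain form:
        sigma = (1 - f) * sigma_ferrite + f * sigma_martensite
    where f is the martensite volume fraction.
    """
    if not 0.0 <= martensite_fraction <= 1.0:
        raise ValueError("martensite_fraction must lie in [0, 1]")
    f = martensite_fraction
    return [(1.0 - f) * s_f + f * s_m
            for s_f, s_m in zip(stress_ferrite, stress_martensite)]


# Hypothetical true-stress values in MPa at three common strain levels.
ferrite_mpa = [300.0, 400.0, 450.0]
martensite_mpa = [1000.0, 1400.0, 1600.0]

dual_phase_mpa = rule_of_mixtures(ferrite_mpa, martensite_mpa, martensite_fraction=0.25)
```

    With a martensite fraction of 0.25, the composite curve lies a quarter of the way from the ferrite curve to the martensite curve at every strain level.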

  8. Personalized Modeling for Prediction with Decision-Path Models

    PubMed Central

    Visweswaran, Shyam; Ferreira, Antonio; Ribeiro, Guilherme A.; Oliveira, Alexandre C.; Cooper, Gregory F.

    2015-01-01

    Deriving predictive models in medicine typically relies on a population approach where a single model is developed from a dataset of individuals. In this paper we describe and evaluate a personalized approach in which we construct a new type of decision tree model called decision-path model that takes advantage of the particular features of a given person of interest. We introduce three personalized methods that derive personalized decision-path models. We compared the performance of these methods to that of Classification And Regression Tree (CART) that is a population decision tree to predict seven different outcomes in five medical datasets. Two of the three personalized methods performed statistically significantly better on area under the ROC curve (AUC) and Brier skill score compared to CART. The personalized approach of learning decision path models is a new approach for predictive modeling that can perform better than a population approach. PMID:26098570
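
    The two evaluation measures named in this abstract can be sketched as follows. The AUC is computed by direct pairwise comparison, and the Brier skill score is taken relative to a reference forecast that always predicts the observed base rate. That choice of reference is an assumption, since the abstract does not specify it, and the labels and probabilities are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def brier_score(labels, probabilities):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probabilities)) / len(labels)


def brier_skill_score(labels, probabilities):
    """1 - BS / BS_ref, with the base-rate forecast as the reference."""
    base_rate = sum(labels) / len(labels)
    reference = brier_score(labels, [base_rate] * len(labels))
    return 1.0 - brier_score(labels, probabilities) / reference


# Hypothetical outcomes and predicted probabilities.
labels = [0, 0, 1, 1]
probabilities = [0.10, 0.40, 0.35, 0.80]

model_auc = auc(labels, probabilities)
model_bss = brier_skill_score(labels, probabilities)
```

    Higher is better for both measures: an AUC of 0.5 and a skill score of 0 each correspond to an uninformative model.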

  9. Comparison of the Predictive Performance and Interpretability of Random Forest and Linear Models on Benchmark Data Sets.

    PubMed

    Marchese Robinson, Richard L; Palczewska, Anna; Palczewski, Jan; Kidley, Nathan

    2017-08-28

    The ability to interpret the predictions made by quantitative structure-activity relationships (QSARs) offers a number of advantages. While QSARs built using nonlinear modeling approaches, such as the popular Random Forest algorithm, might sometimes be more predictive than those built using linear modeling approaches, their predictions have been perceived as difficult to interpret. However, a growing number of approaches have been proposed for interpreting nonlinear QSAR models in general and Random Forest in particular. In the current work, we compare the performance of Random Forest to those of two widely used linear modeling approaches: linear Support Vector Machines (SVMs) (or Support Vector Regression (SVR)) and partial least-squares (PLS). We compare their performance in terms of their predictivity as well as the chemical interpretability of the predictions using novel scoring schemes for assessing heat map images of substructural contributions. We critically assess different approaches for interpreting Random Forest models as well as for obtaining predictions from the forest. We assess the models on a large number of widely employed public-domain benchmark data sets corresponding to regression and binary classification problems of relevance to hit identification and toxicology. We conclude that Random Forest typically yields comparable or possibly better predictive performance than the linear modeling approaches and that its predictions may also be interpreted in a chemically and biologically meaningful way. In contrast to earlier work looking at interpretation of nonlinear QSAR models, we directly compare two methodologically distinct approaches for interpreting Random Forest models. The approaches for interpreting Random Forest assessed in our article were implemented using open-source programs that we have made available to the community. 
    These programs are the rfFC package (https://r-forge.r-project.org/R/?group_id=1725) for the R statistical programming language and the Python program HeatMapWrapper (https://doi.org/10.5281/zenodo.495163) for heat map generation.

  10. Prediction using patient comparison vs. modeling: a case study for mortality prediction.

    PubMed

    Hoogendoorn, Mark; El Hassouni, Ali; Mok, Kwongyen; Ghassemi, Marzyeh; Szolovits, Peter

    2016-08-01

    Information in Electronic Medical Records (EMRs) can be used to generate accurate predictions for the occurrence of a variety of health states, which can contribute to more pro-active interventions. The very nature of EMRs does make the application of off-the-shelf machine learning techniques difficult. In this paper, we study two approaches to making predictions that have hardly been compared in the past: (1) extracting high-level (temporal) features from EMRs and building a predictive model, and (2) defining a patient similarity metric and predicting based on the outcome observed for similar patients. We analyze and compare both approaches on the MIMIC-II ICU dataset to predict patient mortality and find that the patient similarity approach does not scale well and results in a less accurate model (AUC of 0.68) compared to the modeling approach (0.84). We also show that mortality can be predicted within a median of 72 hours.
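
    Approach (2) above, predicting from the outcomes of similar patients, can be sketched as a nearest-neighbour average. The Euclidean distance, the two-feature vectors, and the value of k are all hypothetical; the abstract does not describe the similarity metric actually used.

```python
import math


def predict_from_similar_patients(query, patients, k):
    """Average the outcomes of the k patients most similar to `query`.

    `patients` is a list of (feature_vector, outcome) pairs, where outcome
    is 1 for death and 0 for survival. Similarity is assumed to be plain
    Euclidean distance on the feature vectors.
    """
    ranked = sorted(patients, key=lambda item: math.dist(query, item[0]))
    nearest = ranked[:k]
    return sum(outcome for _, outcome in nearest) / len(nearest)


# Hypothetical, already-normalized features for five past patients.
patients = [
    ([0.0, 0.0], 0),
    ([0.1, 0.2], 0),
    ([0.2, 0.1], 0),
    ([1.0, 1.0], 1),
    ([0.9, 1.1], 1),
]

risk = predict_from_similar_patients([0.95, 1.0], patients, k=3)
```

    Every prediction requires a distance computation against all stored patients, which is consistent with the abstract's remark that the similarity approach does not scale well.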

  11. Multi-model comparison highlights consistency in predicted effect of warming on a semi-arid shrub

    USGS Publications Warehouse

    Renwick, Katherine M.; Curtis, Caroline; Kleinhesselink, Andrew R.; Schlaepfer, Daniel R.; Bradley, Bethany A.; Aldridge, Cameron L.; Poulter, Benjamin; Adler, Peter B.

    2018-01-01

    A number of modeling approaches have been developed to predict the impacts of climate change on species distributions, performance, and abundance. The stronger the agreement from models that represent different processes and are based on distinct and independent sources of information, the greater the confidence we can have in their predictions. Evaluating the level of confidence is particularly important when predictions are used to guide conservation or restoration decisions. We used a multi-model approach to predict climate change impacts on big sagebrush (Artemisia tridentata), the dominant plant species on roughly 43 million hectares in the western United States and a key resource for many endemic wildlife species. To evaluate the climate sensitivity of A. tridentata, we developed four predictive models, two based on empirically derived spatial and temporal relationships, and two that applied mechanistic approaches to simulate sagebrush recruitment and growth. This approach enabled us to produce an aggregate index of climate change vulnerability and uncertainty based on the level of agreement between models. Despite large differences in model structure, predictions of sagebrush response to climate change were largely consistent. Performance, as measured by change in cover, growth, or recruitment, was predicted to decrease at the warmest sites, but increase throughout the cooler portions of sagebrush's range. A sensitivity analysis indicated that sagebrush performance responds more strongly to changes in temperature than precipitation. Most of the uncertainty in model predictions reflected variation among the ecological models, raising questions about the reliability of forecasts based on a single modeling approach. Our results highlight the value of a multi-model approach in forecasting climate change impacts and uncertainties and should help land managers to maximize the value of conservation investments.

  12. Microarray-based cancer prediction using soft computing approach.

    PubMed

    Wang, Xiaosheng; Gotoh, Osamu

    2009-05-26

    One of the difficulties in using gene expression profiles to predict cancer is how to effectively select a few informative genes from thousands or tens of thousands of genes to construct accurate prediction models. We screen highly discriminative genes and gene pairs to create simple prediction models involving single genes or gene pairs on the basis of a soft computing approach and rough set theory. Accurate cancer prediction is obtained when we apply the simple prediction models to four cancer gene expression datasets: CNS tumor, colon tumor, lung cancer, and DLBCL. Some genes closely correlated with the pathogenesis of specific or general cancers are identified. In contrast with other models, our models are simple, effective, and robust. Moreover, our models are interpretable because they are based on decision rules. Our results demonstrate that very simple models may perform well in molecular cancer prediction and that important gene markers of cancer can be detected if the gene selection approach is chosen reasonably.

  13. A review of statistical updating methods for clinical prediction models.

    PubMed

    Su, Ting-Li; Jaki, Thomas; Hickey, Graeme L; Buchan, Iain; Sperrin, Matthew

    2018-01-01

    A clinical prediction model is a tool for predicting healthcare outcomes, usually within a specific population and context. A common approach is to develop a new clinical prediction model for each population and context; however, this wastes potentially useful historical information. A better approach is to update or incorporate the existing clinical prediction models already developed for use in similar contexts or populations. In addition, clinical prediction models commonly become miscalibrated over time and need replacing or updating. In this article, we review a range of approaches for re-using and updating clinical prediction models; these fall into three main categories: simple coefficient updating, combining multiple previous clinical prediction models in a meta-model, and dynamic updating of models. We evaluated the performance (discrimination and calibration) of the different strategies using data on mortality following cardiac surgery in the United Kingdom. We found that no single strategy performed sufficiently well to be used to the exclusion of the others. In conclusion, useful tools exist for updating existing clinical prediction models to a new population or context, and these should be implemented rather than developing a new clinical prediction model from scratch, using a breadth of complementary statistical methods.
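
    One of the simplest forms of coefficient updating is to re-estimate only the intercept of an existing logistic model so that the mean predicted risk matches the event rate in the new population. The sketch below illustrates that idea under stated assumptions; it is not claimed to be one of the specific strategies the article evaluated, and the linear predictors and outcomes are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def update_intercept(linear_predictors, outcomes, tolerance=1e-10):
    """Find the intercept shift that matches mean predicted risk to the
    observed event rate, using bisection.

    Assumes the observed event rate lies strictly between 0 and 1.
    """
    target = sum(outcomes) / len(outcomes)

    def mean_risk(shift):
        return sum(sigmoid(lp + shift) for lp in linear_predictors) / len(linear_predictors)

    low, high = -20.0, 20.0  # mean_risk increases monotonically with shift
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if mean_risk(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical linear predictors from an existing model, and new outcomes.
linear_predictors = [-2.0, -1.0, 0.0, 1.0]
outcomes = [0, 1, 1, 1]

shift = update_intercept(linear_predictors, outcomes)
updated_risks = [sigmoid(lp + shift) for lp in linear_predictors]
```

    Here the original model underpredicts risk in the new data, so the fitted shift is positive. The ranking of patients, and hence discrimination, is unchanged by an intercept-only update.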

  14. Approximating prediction uncertainty for random forest regression models

    Treesearch

    John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne

    2016-01-01

    Machine learning approaches such as random forest have seen increased use for the spatial modeling and mapping of continuous variables. Random forest is a non-parametric ensemble approach, and unlike traditional regression approaches there is no direct quantification of prediction error. Understanding prediction uncertainty is important when using model-based continuous maps as...

  15. THE FUTURE OF COMPUTER-BASED TOXICITY PREDICTION: MECHANISM-BASED MODELS VS. INFORMATION MINING APPROACHES

    EPA Science Inventory


    When we speak of computer-based toxicity prediction, we are generally referring to a broad array of approaches which rely primarily upon chemical structure ...

  16. Predicting climate-induced range shifts: model differences and model reliability.

    Treesearch

    Joshua J. Lawler; Denis White; Ronald P. Neilson; Andrew R. Blaustein

    2006-01-01

    Predicted changes in the global climate are likely to cause large shifts in the geographic ranges of many plant and animal species. To date, predictions of future range shifts have relied on a variety of modeling approaches with different levels of model accuracy. Using a common data set, we investigated the potential implications of alternative modeling approaches for...

  17. TOXICO-CHEMINFORMATICS AND QSAR MODELING OF ...

    EPA Pesticide Factsheets

    This abstract concludes that QSAR approaches combined with toxico-chemoinformatics descriptors can enhance predictive toxicology models.

  18. Prediction of biochar yield from cattle manure pyrolysis via least squares support vector machine intelligent approach.

    PubMed

    Cao, Hongliang; Xin, Ya; Yuan, Qiaoxia

    2016-02-01

    To conveniently predict the biochar yield from cattle manure pyrolysis, an intelligent modeling approach was introduced in this research. A traditional artificial neural network (ANN) model and a novel least squares support vector machine (LS-SVM) model were developed. For the identification and prediction evaluation of the models, a data set of 33 experimental data points was used, obtained using a laboratory-scale fixed bed reaction system. The results demonstrated that the intelligent modeling approach is convenient and effective for the prediction of the biochar yield. In particular, the novel LS-SVM model has more satisfactory predictive performance and better robustness than the traditional ANN model. The introduction and application of the LS-SVM modeling method provides a successful example, which is a good reference for modeling studies of the cattle manure pyrolysis process and other similar processes.

  19. Supermodeling With A Global Atmospheric Model

    NASA Astrophysics Data System (ADS)

    Wiegerinck, Wim; Burgers, Willem; Selten, Frank

    2013-04-01

    In weather and climate prediction studies it often turns out to be the case that the multi-model ensemble mean prediction has the best prediction skill scores. One possible explanation is that the major part of the model error is random and is averaged out in the ensemble mean. In the standard multi-model ensemble approach, the models are integrated in time independently and the predicted states are combined a posteriori. Recently an alternative ensemble prediction approach has been proposed in which the models exchange information during the simulation and synchronize on a common solution that is closer to the truth than any of the individual model solutions in the standard multi-model ensemble approach or a weighted average of these. This approach is called the super modeling approach (SUMO). The potential of the SUMO approach has been demonstrated in the context of simple, low-order, chaotic dynamical systems. The information exchange takes the form of linear nudging terms in the dynamical equations that nudge the solution of each model to the solution of all other models in the ensemble. With a suitable choice of the connection strengths the models synchronize on a common solution that is indeed closer to the true system than any of the individual model solutions without nudging. This approach is called connected SUMO. An alternative approach is to integrate a weighted averaged model, weighted SUMO. At each time step all models in the ensemble calculate the tendency, these tendencies are weighted averaged and the state is integrated one time step into the future with this weighted averaged tendency. It was shown that in case the connected SUMO synchronizes perfectly, the connected SUMO follows the weighted averaged trajectory and both approaches yield the same solution. 
    In this study we pioneer both approaches in the context of a global, quasi-geostrophic, three-level atmosphere model that is capable of simulating quite realistically the extra-tropical circulation in the Northern Hemisphere winter.
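
    The weighted SUMO procedure described above (each model computes a tendency, the tendencies are weight-averaged, and the state is advanced one time step) can be sketched on a toy problem. The scalar decay equation, the two imperfect models, the equal weights, and the forward Euler scheme are all assumptions chosen so the result is easy to check; the abstract concerns chaotic systems and an atmosphere model, not this linear example.

```python
def weighted_supermodel_step(state, tendencies, weights, dt):
    """Advance one forward Euler step using the weighted-average tendency."""
    combined = sum(w * f(state) for w, f in zip(weights, tendencies))
    return state + dt * combined


def integrate(step, state, n_steps):
    """Apply `step` repeatedly and return the full trajectory."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step(state)
        trajectory.append(state)
    return trajectory


# Hypothetical "true" system dx/dt = -x and two imperfect models of it.
def truth(x):
    return -1.0 * x


def model_slow(x):
    return -0.5 * x  # decays too slowly


def model_fast(x):
    return -1.5 * x  # decays too quickly


dt = 0.01
weights = [0.5, 0.5]

true_trajectory = integrate(lambda x: x + dt * truth(x), 1.0, 100)
slow_trajectory = integrate(lambda x: x + dt * model_slow(x), 1.0, 100)
sumo_trajectory = integrate(
    lambda x: weighted_supermodel_step(x, [model_slow, model_fast], weights, dt),
    1.0,
    100,
)
```

    In this constructed case the equal-weight average of the two tendencies reproduces the true tendency, so the supermodel tracks the truth while each individual model drifts away. In general the weights would have to be learned from data.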

  20. A simplified approach to quasi-linear viscoelastic modeling

    PubMed Central

    Nekouzadeh, Ali; Pryse, Kenneth M.; Elson, Elliot L.; Genin, Guy M.

    2007-01-01

    The fitting of quasi-linear viscoelastic (QLV) constitutive models to material data often involves somewhat cumbersome numerical convolution. A new approach to treating quasi-linearity in one dimension is described and applied to characterize the behavior of reconstituted collagen. This approach is based on a new principle for including nonlinearity and requires considerably less computation than other comparable models for both model calibration and response prediction, especially for smoothly applied stretching. Additionally, the approach allows relaxation to adapt with the strain history. The modeling approach is demonstrated through tests on pure reconstituted collagen. Sequences of “ramp-and-hold” stretching tests were applied to rectangular collagen specimens. The relaxation force data from the “hold” was used to calibrate a new “adaptive QLV model” and several models from literature, and the force data from the “ramp” was used to check the accuracy of model predictions. Additionally, the ability of the models to predict the force response on a reloading of the specimen was assessed. The “adaptive QLV model” based on this new approach predicts collagen behavior comparably to or better than existing models, with much less computation. PMID:17499254

  1. A geostatistical approach to the change-of-support problem and variable-support data fusion in spatial analysis

    NASA Astrophysics Data System (ADS)

    Wang, Jun; Wang, Yang; Zeng, Hui

    2016-01-01

    A key issue to address in synthesizing spatial data with variable-support in spatial analysis and modeling is the change-of-support problem. We present an approach for solving the change-of-support and variable-support data fusion problems. This approach is based on geostatistical inverse modeling that explicitly accounts for differences in spatial support. The inverse model is applied here to produce both the best predictions of a target support and prediction uncertainties, based on one or more measurements, while honoring measurements. Spatial data covering large geographic areas often exhibit spatial nonstationarity and can lead to computational challenge due to the large data size. We developed a local-window geostatistical inverse modeling approach to accommodate these issues of spatial nonstationarity and alleviate computational burden. We conducted experiments using synthetic and real-world raster data. Synthetic data were generated and aggregated to multiple supports and downscaled back to the original support to analyze the accuracy of spatial predictions and the correctness of prediction uncertainties. Similar experiments were conducted for real-world raster data. Real-world data with variable-support were statistically fused to produce single-support predictions and associated uncertainties. The modeling results demonstrate that geostatistical inverse modeling can produce accurate predictions and associated prediction uncertainties. It is shown that the local-window geostatistical inverse modeling approach suggested offers a practical way to solve the well-known change-of-support problem and variable-support data fusion problem in spatial analysis and modeling.

  2. A Final Approach Trajectory Model for Current Operations

    NASA Technical Reports Server (NTRS)

    Gong, Chester; Sadovsky, Alexander

    2010-01-01

    Predicting accurate trajectories with limited intent information is a challenge faced by air traffic management decision support tools in operation today. One such tool is the FAA's Terminal Proximity Alert system which is intended to assist controllers in maintaining safe separation of arrival aircraft during final approach. In an effort to improve the performance of such tools, two final approach trajectory models are proposed; one based on polynomial interpolation, the other on the Fourier transform. These models were tested against actual traffic data and used to study effects of the key final approach trajectory modeling parameters of wind, aircraft type, and weight class, on trajectory prediction accuracy. Using only the limited intent data available to today's ATM system, both the polynomial interpolation and Fourier transform models showed improved trajectory prediction accuracy over a baseline dead reckoning model. Analysis of actual arrival traffic showed that this improved trajectory prediction accuracy leads to improved inter-arrival separation prediction accuracy for longer look ahead times. The difference in mean inter-arrival separation prediction error between the Fourier transform and dead reckoning models was 0.2 nmi for a look ahead time of 120 sec, a 33 percent improvement, with a corresponding 32 percent improvement in standard deviation.
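
    The dead reckoning baseline against which both proposed models are compared can be sketched as constant-velocity extrapolation. The flat two-dimensional coordinates, the units, and the function name are assumptions for illustration; the abstract does not give the baseline's exact formulation.

```python
def dead_reckon(position_nmi, velocity_nmi_per_s, look_ahead_s):
    """Extrapolate a position assuming the current velocity is held constant.

    position_nmi and velocity_nmi_per_s are (x, y) pairs in a flat local
    frame; look_ahead_s is the prediction horizon in seconds.
    """
    x, y = position_nmi
    vx, vy = velocity_nmi_per_s
    return (x + vx * look_ahead_s, y + vy * look_ahead_s)


# Hypothetical aircraft flying along the x axis at 180 knots,
# i.e. 180 nmi per hour = 0.05 nmi per second.
predicted = dead_reckon((0.0, 0.0), (0.05, 0.0), look_ahead_s=120.0)
```

    Over the 120 sec look-ahead time mentioned in the abstract, this hypothetical aircraft is predicted to travel 6 nmi. Any deceleration on final approach goes uncaptured by such a baseline, which is the kind of error the interpolation-based models aim to reduce.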

  3. Accuracy of the actuator disc-RANS approach for predicting the performance and wake of tidal turbines.

    PubMed

    Batten, W M J; Harrison, M E; Bahaj, A S

    2013-02-28

    The actuator disc-RANS model has widely been used in wind and tidal energy to predict the wake of a horizontal axis turbine. The model is appropriate where large-scale effects of the turbine on a flow are of interest, for example, when considering environmental impacts, or arrays of devices. The accuracy of the model for modelling the wake of tidal stream turbines has not been demonstrated, and flow predictions presented in the literature for similar modelled scenarios vary significantly. This paper compares the results of the actuator disc-RANS model, where the turbine forces have been derived using a blade-element approach, to experimental data measured in the wake of a scaled turbine. It also compares the results with those of a simpler uniform actuator disc model. The comparisons show that the model is accurate and can predict up to 94 per cent of the variation in the experimental velocity data measured on the centreline of the wake, therefore demonstrating that the actuator disc-RANS model is an accurate approach for modelling a turbine wake, and a conservative approach to predict performance and loads. It can therefore be applied to similar scenarios with confidence.

  4. The North American Multi-Model Ensemble (NMME): Phase-1 Seasonal to Interannual Prediction, Phase-2 Toward Developing Intra-Seasonal Prediction

    NASA Technical Reports Server (NTRS)

    Kirtman, Ben P.; Min, Dughong; Infanti, Johnna M.; Kinter, James L., III; Paolino, Daniel A.; Zhang, Qin; vandenDool, Huug; Saha, Suranjana; Mendez, Malaquias Pena; Becker, Emily

    2013-01-01

    The recent US National Academies report "Assessment of Intraseasonal to Interannual Climate Prediction and Predictability" was unequivocal in recommending the need for the development of a North American Multi-Model Ensemble (NMME) operational predictive capability. Indeed, this effort is required to meet the specific tailored regional prediction and decision support needs of a large community of climate information users. The multi-model ensemble approach has proven extremely effective at quantifying prediction uncertainty due to uncertainty in model formulation, and has proven to produce better prediction quality (on average) than any single model ensemble. This multi-model approach is the basis for several international collaborative prediction research efforts and an operational European system, and there are numerous examples of how this multi-model ensemble approach yields superior forecasts compared to any single model. Based on two NOAA Climate Test Bed (CTB) NMME workshops (February 18, and April 8, 2011) a collaborative and coordinated implementation strategy for a NMME prediction system has been developed and is currently delivering real-time seasonal-to-interannual predictions on the NOAA Climate Prediction Center (CPC) operational schedule. The hindcast and real-time prediction data are readily available (e.g., http://iridl.ldeo.columbia.edu/SOURCES/.Models/.NMME/) and in graphical format from CPC (http://origin.cpc.ncep.noaa.gov/products/people/wd51yf/NMME/index.html). Moreover, the NMME forecasts are already being used as guidance for operational forecasters. This paper describes the new NMME effort, presents an overview of the multi-model forecast quality, and the complementary skill associated with individual models.

  5. On the Conditioning of Machine-Learning-Assisted Turbulence Modeling

    NASA Astrophysics Data System (ADS)

    Wu, Jinlong; Sun, Rui; Wang, Qiqi; Xiao, Heng

    2017-11-01

    Recently, several researchers have demonstrated that machine learning techniques can be used to improve the RANS modeled Reynolds stress by training on an available database of high fidelity simulations. However, obtaining an improved mean velocity field remains an unsolved challenge, restricting the predictive capability of current machine-learning-assisted turbulence modeling approaches. In this work we define a condition number to evaluate the model conditioning of data-driven turbulence modeling approaches, and propose a stability-oriented machine learning framework to model Reynolds stress. Two canonical flows, the flow in a square duct and the flow over periodic hills, are investigated to demonstrate the predictive capability of the proposed framework. The satisfactory prediction performance of mean velocity field for both flows demonstrates the predictive capability of the proposed framework for machine-learning-assisted turbulence modeling. By showing that it can improve the prediction of the mean flow field, the proposed stability-oriented machine learning framework bridges the gap between the existing machine-learning-assisted turbulence modeling approaches and the demand for predictive capability of turbulence models in real applications.

  6. A model for prediction of STOVL ejector dynamics

    NASA Technical Reports Server (NTRS)

    Drummond, Colin K.

    1989-01-01

    A semi-empirical control-volume approach to ejector modeling for transient performance prediction is presented. This new approach is motivated by the need for a predictive real-time ejector sub-system simulation for Short Take-Off Vertical Landing (STOVL) integrated flight and propulsion controls design applications. Emphasis is placed on discussion of the approximate characterization of the mixing process central to thrust augmenting ejector operation. The proposed ejector model suggests transient flow predictions are possible with a model based on steady-flow data. A practical test case is presented to illustrate model calibration.

  7. Life extending control for rocket engines

    NASA Technical Reports Server (NTRS)

    Lorenzo, C. F.; Saus, J. R.; Ray, A.; Carpino, M.; Wu, M.-K.

    1992-01-01

    The concept of life extending control is defined. A brief discussion of current fatigue life prediction methods is given and the need for an alternative life prediction model based on a continuous functional relationship is established. Two approaches to life extending control are considered: (1) the implicit approach which uses cyclic fatigue life prediction as a basis for control design; and (2) the continuous life prediction approach which requires a continuous damage law. Progress on an initial formulation of a continuous (in time) fatigue model is presented. Finally, nonlinear programming is used to develop initial results for life extension for a simplified rocket engine (model).

  8. Performance of two predictive uncertainty estimation approaches for conceptual Rainfall-Runoff Model: Bayesian Joint Inference and Hydrologic Uncertainty Post-processing

    NASA Astrophysics Data System (ADS)

    Hernández-López, Mario R.; Romero-Cuéllar, Jonathan; Camilo Múnera-Estrada, Juan; Coccia, Gabriele; Francés, Félix

    2017-04-01

    It is important to emphasize the role of uncertainty, particularly when the model forecasts are used to support decision-making and water management. This research compares two approaches for the evaluation of the predictive uncertainty in hydrological modeling. The first approach is the Bayesian Joint Inference of hydrological and error models. The second approach is carried out through the Model Conditional Processor using the Truncated Normal Distribution in the transformed space. This comparison is focused on the predictive distribution reliability. The case study is applied to two basins included in the Model Parameter Estimation Experiment (MOPEX). These two basins, which have different hydrological complexity, are the French Broad River (North Carolina) and the Guadalupe River (Texas). The results indicate that generally, both approaches are able to provide similar predictive performances. However, differences between them can arise in basins with complex hydrology (e.g. ephemeral basins). This is because the results obtained with Bayesian Joint Inference are strongly dependent on the suitability of the hypothesized error model. Similarly, the results in the case of the Model Conditional Processor are mainly influenced by the selected model of tails or even by the selected full probability distribution model of the data in the real space, and by the definition of the Truncated Normal Distribution in the transformed space. In summary, the different hypotheses that the modeler chooses in each of the two approaches are the main cause of the different results. This research also explores a proper combination of both methodologies which could be useful to achieve less biased hydrological parameter estimation. For this approach, firstly the predictive distribution is obtained through the Model Conditional Processor. Secondly, this predictive distribution is used to derive the corresponding additive error model which is employed for the hydrological parameter estimation with the Bayesian Joint Inference methodology.

  9. Offline modeling for product quality prediction of mineral processing using modeling error PDF shaping and entropy minimization.

    PubMed

    Ding, Jinliang; Chai, Tianyou; Wang, Hong

    2011-03-01

    This paper presents a novel offline modeling approach for product quality prediction of mineral processing which consists of a number of unit processes in series. The prediction of the product quality of the whole mineral process (i.e., the mixed concentrate grade) plays an important role and the establishment of its predictive model is a key issue for the plantwide optimization. For this purpose, a hybrid modeling approach of the mixed concentrate grade prediction is proposed, which consists of a linear model and a nonlinear model. The least-squares support vector machine is adopted to establish the nonlinear model. The inputs of the predictive model are the performance indices of each unit process, while the output is the mixed concentrate grade. In this paper, the model parameter selection is transformed into the shape control of the probability density function (PDF) of the modeling error. In this context, both the PDF-control-based and minimum-entropy-based model parameter selection approaches are proposed. Indeed, this is the first time that the PDF shape control idea is used to deal with system modeling, where the key idea is to tune model parameters so that either the modeling error PDF is controlled to follow a target PDF or the modeling error entropy is minimized. The experimental results using the real plant data and the comparison of the two approaches are discussed. The results show the effectiveness of the proposed approaches.

  10. Prediction of Size Effects in Notched Laminates Using Continuum Damage Mechanics

    NASA Technical Reports Server (NTRS)

    Camanho, D. P.; Maimi, P.; Davila, C. G.

    2007-01-01

    This paper examines the use of a continuum damage model to predict strength and size effects in notched carbon-epoxy laminates. The effects of size and the development of a fracture process zone before final failure are identified in an experimental program. The continuum damage model is described and the resulting predictions of size effects are compared with alternative approaches: the point stress and the inherent flaw models, the Linear-Elastic Fracture Mechanics approach, and the strength of materials approach. The results indicate that the continuum damage model is the most accurate technique to predict size effects in composites. Furthermore, the continuum damage model does not require any calibration and it is applicable to general geometries and boundary conditions.

  11. Reducing hydrologic model uncertainty in monthly streamflow predictions using multimodel combination

    NASA Astrophysics Data System (ADS)

    Li, Weihua; Sankarasubramanian, A.

    2012-12-01

    Model errors are inevitable in any prediction exercise. One approach that is currently gaining attention in reducing model errors is by combining multiple models to develop improved predictions. The rationale behind this approach rests primarily on the premise that optimal weights could be derived for each model so that the developed multimodel predictions will result in improved predictions. A new dynamic approach (MM-1) to combine multiple hydrological models by evaluating their performance/skill contingent on the predictor state is proposed. We combine two hydrological models, "abcd" model and variable infiltration capacity (VIC) model, to develop multimodel streamflow predictions. To quantify precisely under what conditions the multimodel combination results in improved predictions, we compare multimodel scheme MM-1 with optimal model combination scheme (MM-O) by employing them in predicting the streamflow generated from a known hydrologic model (abcd model or VIC model) with heteroscedastic error variance as well as from a hydrologic model that exhibits different structure than that of the candidate models (i.e., "abcd" model or VIC model). Results from the study show that streamflow estimated from single models performed better than multimodels under almost no measurement error. However, under increased measurement errors and model structural misspecification, both multimodel schemes (MM-1 and MM-O) consistently performed better than the single model prediction. Overall, MM-1 performs better than MM-O in predicting the monthly flow values as well as in predicting extreme monthly flows. Comparison of the weights obtained from each candidate model reveals that as measurement errors increase, MM-1 assigns weights equally for all the models, whereas MM-O always assigns higher weights to the candidate model that performed best over the calibration period. Applying the multimodel algorithms for predicting streamflows over four different sites revealed that MM-1 performs better than all single models and optimal model combination scheme, MM-O, in predicting the monthly flows as well as the flows during wetter months.
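
    The idea of weighting candidate models by their skill can be illustrated with a minimal sketch. This is not the MM-1 or MM-O scheme from the abstract: it assumes a simple static rule in which each model's weight is proportional to the inverse of its mean squared error over a calibration period. The function names `inverse_mse_weights` and `combine` are hypothetical and chosen only for illustration.

```python
def inverse_mse_weights(observed, predictions_by_model):
    """Return one weight per model, proportional to 1 / MSE on calibration data.

    Assumption: a static inverse-MSE rule, used here only as an illustration;
    it is not the state-dependent MM-1 scheme nor the optimal MM-O scheme.
    """
    inverse_errors = []
    for preds in predictions_by_model:
        mse = sum((p - o) ** 2 for p, o in zip(preds, observed)) / len(observed)
        # Guard against division by zero for a model that fits perfectly.
        inverse_errors.append(1.0 / max(mse, 1e-12))
    total = sum(inverse_errors)
    return [w / total for w in inverse_errors]


def combine(weights, predictions_by_model):
    """Weighted average of the model predictions at each time step."""
    n_steps = len(predictions_by_model[0])
    return [
        sum(w * preds[t] for w, preds in zip(weights, predictions_by_model))
        for t in range(n_steps)
    ]
```

    With observed flows `[10, 20, 30]` and two models predicting `[11, 21, 31]` and `[12, 22, 32]`, the mean squared errors are 1 and 4, so the weights come out as 0.8 and 0.2. A state-dependent scheme would instead recompute such weights conditional on the predictor state.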

  12. Proactive Supply Chain Performance Management with Predictive Analytics

    PubMed Central

    Stefanovic, Nenad

    2014-01-01

    Today's business climate requires supply chains to be proactive rather than reactive, which demands a new approach that incorporates data mining predictive analytics. This paper introduces a predictive supply chain performance management model which combines process modelling, performance measurement, data mining models, and web portal technologies into a unique model. It presents the supply chain modelling approach based on the specialized metamodel which allows modelling of any supply chain configuration and at different levels of detail. The paper also presents the supply chain semantic business intelligence (BI) model which encapsulates data sources and business rules and includes the data warehouse model with specific supply chain dimensions, measures, and KPIs (key performance indicators). Next, the paper describes two generic approaches for designing the KPI predictive data mining models based on the BI semantic model. KPI predictive models were trained and tested with a real-world data set. Finally, a specialized analytical web portal which offers collaborative performance monitoring and decision making is presented. The results show that these models give very accurate KPI projections and provide valuable insights into newly emerging trends, opportunities, and problems. This should lead to more intelligent, predictive, and responsive supply chains capable of adapting to the future business environment. PMID:25386605

  13. Proactive supply chain performance management with predictive analytics.

    PubMed

    Stefanovic, Nenad

    2014-01-01

    Today's business climate requires supply chains to be proactive rather than reactive, which demands a new approach that incorporates data mining predictive analytics. This paper introduces a predictive supply chain performance management model which combines process modelling, performance measurement, data mining models, and web portal technologies into a unique model. It presents the supply chain modelling approach based on the specialized metamodel which allows modelling of any supply chain configuration and at different levels of detail. The paper also presents the supply chain semantic business intelligence (BI) model which encapsulates data sources and business rules and includes the data warehouse model with specific supply chain dimensions, measures, and KPIs (key performance indicators). Next, the paper describes two generic approaches for designing the KPI predictive data mining models based on the BI semantic model. KPI predictive models were trained and tested with a real-world data set. Finally, a specialized analytical web portal which offers collaborative performance monitoring and decision making is presented. The results show that these models give very accurate KPI projections and provide valuable insights into newly emerging trends, opportunities, and problems. This should lead to more intelligent, predictive, and responsive supply chains capable of adapting to the future business environment.

  14. Selection, calibration, and validation of models of tumor growth.

    PubMed

    Lima, E A B F; Oden, J T; Hormuth, D A; Yankeelov, T E; Almeida, R C

    2016-11-01

    This paper presents general approaches for addressing some of the most important issues in predictive computational oncology concerned with developing classes of predictive models of tumor growth. First, the process of developing mathematical models of vascular tumors evolving in the complex, heterogeneous, macroenvironment of living tissue; second, the selection of the most plausible models among these classes, given relevant observational data; third, the statistical calibration and validation of models in these classes, and finally, the prediction of key Quantities of Interest (QOIs) relevant to patient survival and the effect of various therapies. The most challenging aspect of this endeavor is that all of these issues often involve confounding uncertainties: in observational data, in model parameters, in model selection, and in the features targeted in the prediction. Our approach can be referred to as "model agnostic" in that no single model is advocated; rather, a general approach that explores powerful mixture-theory representations of tissue behavior while accounting for a range of relevant biological factors is presented, which leads to many potentially predictive models. Then representative classes are identified which provide a starting point for the implementation of the Occam Plausibility Algorithm (OPAL), which enables the modeler to select the most plausible models (for given data) and to determine if the model is a valid tool for predicting tumor growth and morphology (in vivo). All of these approaches account for uncertainties in the model, the observational data, the model parameters, and the target QOI. We demonstrate these processes by comparing a list of models for tumor growth, including reaction-diffusion models, phase-field models, and models with and without mechanical deformation effects, for glioma growth measured in murine experiments. Examples are provided that exhibit quite acceptable predictions of tumor growth in laboratory animals while demonstrating successful implementations of OPAL.

  15. Prediction of Protein Structure by Template-Based Modeling Combined with the UNRES Force Field.

    PubMed

    Krupa, Paweł; Mozolewska, Magdalena A; Joo, Keehyoung; Lee, Jooyoung; Czaplewski, Cezary; Liwo, Adam

    2015-06-22

    A new approach to the prediction of protein structures that uses distance and backbone virtual-bond dihedral angle restraints derived from template-based models and simulations with the united residue (UNRES) force field is proposed. The approach combines the accuracy and reliability of template-based methods for the segments of the target sequence with high similarity to those having known structures with the ability of UNRES to pack the domains correctly. Multiplexed replica-exchange molecular dynamics with restraints derived from template-based models of a given target, in which each restraint is weighted according to the accuracy of the prediction of the corresponding section of the molecule, is used to search the conformational space, and the weighted histogram analysis method and cluster analysis are applied to determine the families of the most probable conformations, from which candidate predictions are selected. To test the capability of the method to recover template-based models from restraints, five single-domain proteins with structures that have been well-predicted by template-based methods were used; it was found that the resulting structures were of the same quality as the best of the original models. To assess whether the new approach can improve template-based predictions with incorrectly predicted domain packing, four such targets were selected from the CASP10 targets; for three of them the new approach resulted in significantly better predictions compared with the original template-based models. The new approach can be used to predict the structures of proteins for which good templates can be found for sections of the sequence or an overall good template can be found for the entire sequence but the prediction quality is remarkably weaker in putative domain-linker regions.

  16. Cross-validation analysis for genetic evaluation models for ranking in endurance horses.

    PubMed

    García-Ballesteros, S; Varona, L; Valera, M; Gutiérrez, J P; Cervantes, I

    2018-01-01

    Ranking trait was used as a selection criterion for competition horses to estimate racing performance. In the literature the most common approaches to estimate breeding values are the linear or threshold statistical models. However, recent studies have shown that a Thurstonian approach was able to fix the race effect (competitive level of the horses that participate in the same race), thus suggesting a better prediction accuracy of breeding values for ranking trait. The aim of this study was to compare the predictability of linear, threshold and Thurstonian approaches for genetic evaluation of ranking in endurance horses. For this purpose, eight genetic models were used for each approach with different combinations of random effects: rider, rider-horse interaction and environmental permanent effect. All genetic models included gender, age and race as systematic effects. The database that was used contained 4065 ranking records from 966 horses and that for the pedigree contained 8733 animals (47% Arabian horses), with an estimated heritability around 0.10 for the ranking trait. The prediction ability of the models for racing performance was evaluated using a cross-validation approach. The average correlation between real and predicted performances across genetic models was around 0.25 for threshold, 0.58 for linear and 0.60 for Thurstonian approaches. Although no significant differences were found between models within approaches, the best genetic model included: the rider and rider-horse random effects for threshold, only rider and environmental permanent effects for linear approach and all random effects for Thurstonian approach. The absolute correlations of predicted breeding values among models were higher between threshold and Thurstonian: 0.90, 0.91 and 0.88 for all animals, top 20% and top 5% best animals. For rank correlations these figures were 0.85, 0.84 and 0.86. The lower values were those between linear and threshold approaches (0.65, 0.62 and 0.51). In conclusion, the Thurstonian approach is recommended for the routine genetic evaluations for ranking in endurance horses.

  17. NAPR: a Cloud-Based Framework for Neuroanatomical Age Prediction.

    PubMed

    Pardoe, Heath R; Kuzniecky, Ruben

    2018-01-01

    The availability of cloud computing services has enabled the widespread adoption of the "software as a service" (SaaS) approach for software distribution, which utilizes network-based access to applications running on centralized servers. In this paper we apply the SaaS approach to neuroimaging-based age prediction. Our system, named "NAPR" (Neuroanatomical Age Prediction using R), provides access to predictive modeling software running on a persistent cloud-based Amazon Web Services (AWS) compute instance. The NAPR framework allows external users to estimate the age of individual subjects using cortical thickness maps derived from their own locally processed T1-weighted whole brain MRI scans. As a demonstration of the NAPR approach, we have developed two age prediction models that were trained using healthy control data from the ABIDE, CoRR, DLBS and NKI Rockland neuroimaging datasets (total N = 2367, age range 6-89 years). The provided age prediction models were trained using (i) relevance vector machines and (ii) Gaussian processes machine learning methods applied to cortical thickness surfaces obtained using Freesurfer v5.3. We believe that this transparent approach to out-of-sample evaluation and comparison of neuroimaging age prediction models will facilitate the development of improved age prediction models and allow for robust evaluation of the clinical utility of these methods.

  18. Nonlinear model predictive control of a wave energy converter based on differential flatness parameterisation

    NASA Astrophysics Data System (ADS)

    Li, Guang

    2017-01-01

    This paper presents a fast constrained optimization approach, which is tailored for nonlinear model predictive control of wave energy converters (WEC). The advantage of this approach relies on its exploitation of the differential flatness of the WEC model. This can reduce the dimension of the resulting nonlinear programming problem (NLP) derived from the continuous constrained optimal control of WEC using pseudospectral method. The alleviation of computational burden using this approach helps to promote an economic implementation of nonlinear model predictive control strategy for WEC control problems. The method is applicable to nonlinear WEC models, nonconvex objective functions and nonlinear constraints, which are commonly encountered in WEC control problems. Numerical simulations demonstrate the efficacy of this approach.

  19. Evaluating pictogram prediction in a location-aware augmentative and alternative communication system.

    PubMed

    Garcia, Luís Filipe; de Oliveira, Luís Caldas; de Matos, David Martins

    2016-01-01

    This study compared the performance of two statistical location-aware pictogram prediction mechanisms, with an all-purpose (All) pictogram prediction mechanism, having no location knowledge. The All approach had a unique language model under all locations. One of the location-aware alternatives, the location-specific (Spec) approach, made use of specific language models for pictogram prediction in each location of interest. The other location-aware approach resulted from combining the Spec and the All approaches, and was designated the mixed approach (Mix). In this approach, the language models acquired knowledge from all locations, but a higher relevance was assigned to the vocabulary from the associated location. Results from simulations showed that the Mix and Spec approaches could only outperform the baseline in a statistically significant way if pictogram users reuse more than 50% and 75% of their sentences, respectively. Under low sentence reuse conditions there were no statistically significant differences between the location-aware approaches and the All approach. Under these conditions, the Mix approach performed better than the Spec approach in a statistically significant way.
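
    The Mix idea, in which the model learns from all locations but gives more relevance to vocabulary from the current one, can be sketched as follows. This is an illustrative assumption, not the study's implementation: it uses a unigram count model, and the `boost` factor and function names are hypothetical.

```python
from collections import Counter


def mixed_unigram_model(sentences_by_location, location, boost=3.0):
    """Build a unigram pictogram model from all locations.

    Assumption: counts from the current location are multiplied by `boost`
    (a hypothetical relevance factor); boost=1.0 reduces to an all-purpose
    model that ignores location.
    """
    counts = Counter()
    for loc, sentences in sentences_by_location.items():
        weight = boost if loc == location else 1.0
        for sentence in sentences:
            for pictogram in sentence:
                counts[pictogram] += weight
    total = sum(counts.values())
    return {pictogram: count / total for pictogram, count in counts.items()}


def predict(model, k=3):
    """Return the k most probable pictograms, ties broken alphabetically."""
    return sorted(model, key=lambda p: (-model[p], p))[:k]
```

    A location-specific (Spec) variant would instead count only the sentences recorded at the current location, which explains why it needs more sentence reuse before it can beat the all-purpose baseline.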

  20. A comparative study of clonal selection algorithm for effluent removal forecasting in septic sludge treatment plant.

    PubMed

    Chun, Ting Sie; Malek, M A; Ismail, Amelia Ritahani

    2015-01-01

    The development of effluent removal prediction is crucial in providing a planning tool necessary for the future development and the construction of a septic sludge treatment plant (SSTP), especially in the developing countries. In order to investigate the expected functionality of the required standard, the prediction of the effluent quality, namely biological oxygen demand, chemical oxygen demand and total suspended solid of an SSTP was modelled using an artificial intelligence approach. In this paper, we adopt the clonal selection algorithm (CSA) to set up a prediction model, with a well-established method - namely the least-square support vector machine (LS-SVM) as a baseline model. The test results of the case study showed that the prediction of the CSA-based SSTP model worked well and provided model performance as satisfactory as the LS-SVM model. The CSA approach shows that fewer control and training parameters are required for model simulation as compared with the LS-SVM approach. The ability of a CSA approach in resolving limited data samples, non-linear sample function and multidimensional pattern recognition makes it a powerful tool in modelling the prediction of effluent removals in an SSTP.

  1. Predictive models of poly(ethylene-terephthalate) film degradation under multi-factor accelerated weathering exposures

    PubMed Central

    Ngendahimana, David K.; Fagerholm, Cara L.; Sun, Jiayang; Bruckman, Laura S.

    2017-01-01

    Accelerated weathering exposures were performed on poly(ethylene-terephthalate) (PET) films. Longitudinal multi-level predictive models as a function of PET grades and exposure types were developed for the change in yellowness index (YI) and haze (%). Exposures with similar change in YI were modeled using a linear fixed-effects modeling approach. Due to the complex nature of haze formation, measurement uncertainty, and the differences in the samples’ responses, the change in haze (%) depended on individual samples’ responses and a linear mixed-effects modeling approach was used. When compared to fixed-effects models, the addition of random effects in the haze formation models significantly increased the variance explained. For both modeling approaches, diagnostic plots confirmed independence and homogeneity with normally distributed residual errors. Predictive R2 values for true prediction error and predictive power of the models demonstrated that the models were not subject to over-fitting. These models enable prediction under pre-defined exposure conditions for a given exposure time (or photo-dosage in case of UV light exposure). PET degradation under cyclic exposures combining UV light and condensing humidity is caused by photolytic and hydrolytic mechanisms causing yellowing and haze formation. Quantitative knowledge of these degradation pathways enables cross-correlation of these lab-based exposures with real-world conditions for service life prediction. PMID:28498875

  2. DATA ASSIMILATION APPROACH FOR FORECAST OF SOLAR ACTIVITY CYCLES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kitiashvili, Irina N., E-mail: irina.n.kitiashvili@nasa.gov

    Numerous attempts to predict future solar cycles are mostly based on empirical relations derived from observations of previous cycles, and they yield a wide range of predicted strengths and durations of the cycles. Results obtained with current dynamo models also deviate strongly from each other, thus raising questions about criteria to quantify the reliability of such predictions. The primary difficulties in modeling future solar activity are shortcomings of both the dynamo models and observations that do not allow us to determine the current and past states of the global solar magnetic structure and its dynamics. Data assimilation is a relatively new approach to develop physics-based predictions and estimate their uncertainties in situations where the physical properties of a system are not well-known. This paper presents an application of the ensemble Kalman filter method for modeling and prediction of solar cycles through use of a low-order nonlinear dynamo model that includes the essential physics and can describe general properties of the sunspot cycles. Despite the simplicity of this model, the data assimilation approach provides reasonable estimates for the strengths of future solar cycles. In particular, the prediction of Cycle 24 calculated and published in 2008 is so far holding up quite well. In this paper, I will present my first attempt to predict Cycle 25 using the data assimilation approach, and discuss the uncertainties of that prediction.

  3. Advances in modeling sorption and diffusion of moisture in porous reactive materials.

    PubMed

    Harley, Stephen J; Glascoe, Elizabeth A; Lewicki, James P; Maxwell, Robert S

    2014-06-23

    Water-vapor-uptake experiments were performed on a silica-filled poly(dimethylsiloxane) (PDMS) network and modeled by using two different approaches. The data was modeled by using established methods and the model parameters were used to predict moisture uptake in a sample. The predictions are reasonably good, but not outstanding; many of the shortcomings of the modeling are discussed. A high-fidelity modeling approach is derived and used to improve the modeling of moisture uptake and diffusion. Our modeling approach captures the physics and kinetics of diffusion and adsorption/desorption, simultaneously. It predicts uptake better than the established method; more importantly, it is also able to predict outgassing. The material used for these studies is a filled-PDMS network; physical interpretations concerning the sorption and diffusion of moisture in this network are discussed. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. Hierarchical time series bottom-up approach for forecast the export value in Central Java

    NASA Astrophysics Data System (ADS)

    Mahkya, D. A.; Ulama, B. S.; Suhartono

    2017-10-01

    The purpose of this study is to find the best model for predicting the export value of Central Java using a hierarchical time series. Export value is one of the injection variables in a country's economy, meaning that if the country's export value increases, the country's economy will grow even more. Appropriate modeling is therefore needed to predict export value, especially in Central Java. Exports from Central Java are grouped into 21 commodities, each with a different pattern. One time series approach that can be used is the hierarchical approach, applied here bottom-up. The individual series at all levels are forecast using Autoregressive Integrated Moving Average (ARIMA), Radial Basis Function Neural Network (RBFNN), and hybrid ARIMA-RBFNN models. The best model is selected using the Symmetric Mean Absolute Percentage Error (sMAPE). The analysis showed that, for the export value of Central Java, the bottom-up approach with hybrid ARIMA-RBFNN modeling can be used for long-term predictions, while the bottom-up approach with RBFNN modeling can be used for short- and medium-term predictions. Overall, the bottom-up approach with RBFNN modeling gives the best result.
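
    As a rough sketch of the two ingredients named in the abstract above, the following shows bottom-up aggregation (summing per-commodity forecasts into a total) and one common definition of sMAPE. The abstract does not state which sMAPE variant the authors used, so the formula here is an assumption, as are the function names and the example numbers.

```python
def smape(actual, forecast):
    """Symmetric MAPE in percent; denominator is the mean of |actual| and |forecast|."""
    terms = []
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        # Treat a 0/0 term as zero error to avoid division by zero.
        terms.append(0.0 if denom == 0 else abs(f - a) / denom)
    return 100.0 * sum(terms) / len(terms)


def bottom_up(bottom_forecasts):
    """Sum bottom-level forecasts period by period to obtain the top-level forecast."""
    return [sum(values) for values in zip(*bottom_forecasts)]


# Hypothetical forecasts for three commodities over two periods.
commodity_forecasts = [[10.0, 12.0], [5.0, 6.0], [1.0, 2.0]]
total_forecast = bottom_up(commodity_forecasts)  # [16.0, 20.0]
total_actual = [16.0, 25.0]                      # hypothetical observed totals

print(total_forecast, smape(total_actual, total_forecast))
```

    In a bottom-up scheme only the lowest-level series are modeled; every higher level is obtained by summation, so the hierarchy is coherent by construction.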

  5. Improving stability of prediction models based on correlated omics data by using network approaches.

    PubMed

    Tissier, Renaud; Houwing-Duistermaat, Jeanine; Rodríguez-Girondo, Mar

    2018-01-01

    Building prediction models based on complex omics datasets such as transcriptomics, proteomics, metabolomics remains a challenge in bioinformatics and biostatistics. Regularized regression techniques are typically used to deal with the high dimensionality of these datasets. However, due to the presence of correlation in the datasets, it is difficult to select the best model and application of these methods yields unstable results. We propose a novel strategy for model selection where the obtained models also perform well in terms of overall predictability. Several three step approaches are considered, where the steps are 1) network construction, 2) clustering to empirically derive modules or pathways, and 3) building a prediction model incorporating the information on the modules. For the first step, we use weighted correlation networks and Gaussian graphical modelling. Identification of groups of features is performed by hierarchical clustering. The grouping information is included in the prediction model by using group-based variable selection or group-specific penalization. We compare the performance of our new approaches with standard regularized regression via simulations. Based on these results we provide recommendations for selecting a strategy for building a prediction model given the specific goal of the analysis and the sizes of the datasets. Finally we illustrate the advantages of our approach by application of the methodology to two problems, namely prediction of body mass index in the DIetary, Lifestyle, and Genetic determinants of Obesity and Metabolic syndrome study (DILGOM) and prediction of response of each breast cancer cell line to treatment with specific drugs using a breast cancer cell lines pharmacogenomics dataset.

  6. Neuromusculoskeletal Model Calibration Significantly Affects Predicted Knee Contact Forces for Walking

    PubMed Central

    Serrancolí, Gil; Kinney, Allison L.; Fregly, Benjamin J.; Font-Llagunes, Josep M.

    2016-01-01

    Though walking impairments are prevalent in society, clinical treatments are often ineffective at restoring lost function. For this reason, researchers have begun to explore the use of patient-specific computational walking models to develop more effective treatments. However, the accuracy with which models can predict internal body forces in muscles and across joints depends on how well relevant model parameter values can be calibrated for the patient. This study investigated how knowledge of internal knee contact forces affects calibration of neuromusculoskeletal model parameter values and subsequent prediction of internal knee contact and leg muscle forces during walking. Model calibration was performed using a novel two-level optimization procedure applied to six normal walking trials from the Fourth Grand Challenge Competition to Predict In Vivo Knee Loads. The outer-level optimization adjusted time-invariant model parameter values to minimize passive muscle forces, reserve actuator moments, and model parameter value changes with (Approach A) and without (Approach B) tracking of experimental knee contact forces. Using the current guess for model parameter values but no knee contact force information, the inner-level optimization predicted time-varying muscle activations that were close to experimental muscle synergy patterns and consistent with the experimental inverse dynamic loads (both approaches). For all the six gait trials, Approach A predicted knee contact forces with high accuracy for both compartments (average correlation coefficient r = 0.99 and root mean square error (RMSE) = 52.6 N medial; average r = 0.95 and RMSE = 56.6 N lateral). In contrast, Approach B overpredicted contact force magnitude for both compartments (average RMSE = 323 N medial and 348 N lateral) and poorly matched contact force shape for the lateral compartment (average r = 0.90 medial and −0.10 lateral). 
Approach B had statistically higher lateral muscle forces and lateral optimal muscle fiber lengths but lower medial, central, and lateral normalized muscle fiber lengths compared to Approach A. These findings suggest that poorly calibrated model parameter values may be a major factor limiting the ability of neuromusculoskeletal models to predict knee contact and leg muscle forces accurately for walking. PMID:27210105
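
    The abstract above reports both a correlation coefficient r and an RMSE for each compartment because the two measure different things: r captures agreement in shape, RMSE captures agreement in magnitude. A minimal sketch with made-up force values (not data from the study) shows how a prediction can match shape perfectly while overpredicting magnitude; the function names are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical contact-force samples in newtons; the prediction has a constant offset.
measured = [100.0, 200.0, 300.0, 200.0]
predicted = [m + 300.0 for m in measured]

print("r    =", pearson_r(measured, predicted))  # shape agreement
print("RMSE =", rmse(measured, predicted))       # magnitude error
```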

  7. Weather and seasonal climate prediction for South America using a multi-model superensemble

    NASA Astrophysics Data System (ADS)

    Chaves, Rosane R.; Ross, Robert S.; Krishnamurti, T. N.

    2005-11-01

    This work examines the feasibility of weather and seasonal climate predictions for South America using the multi-model synthetic superensemble approach for climate, and the multi-model conventional superensemble approach for numerical weather prediction, both developed at Florida State University (FSU). The effect on seasonal climate forecasts of the number of models used in the synthetic superensemble is investigated. It is shown that the synthetic superensemble approach for climate and the conventional superensemble approach for numerical weather prediction can reduce the errors over South America in seasonal climate prediction and numerical weather prediction. For climate prediction, a suite of 13 models is used. The forecast lead-time is 1 month for the climate forecasts, which consist of precipitation and surface temperature forecasts. The multi-model ensemble is comprised of four versions of the FSU-Coupled Ocean-Atmosphere Model, seven models from the Development of a European Multi-model Ensemble System for Seasonal to Interannual Prediction (DEMETER), a version of the Community Climate Model (CCM3), and a version of the predictive Ocean Atmosphere Model for Australia (POAMA). The results show that conditions over South America are appropriately simulated by the Florida State University Synthetic Superensemble (FSUSSE) in comparison to observations and that the skill of this approach increases with the use of additional models in the ensemble. When compared to observations, the forecasts are generally better than those from both a single climate model and the multi-model ensemble mean, for the variables tested in this study. For numerical weather prediction, the conventional Florida State University Superensemble (FSUSE) is used to predict the mass and motion fields over South America. Predictions of mean sea level pressure, 500 hPa geopotential height, and 850 hPa wind are made with a multi-model superensemble comprised of six global models for the period January, February, and December of 2000. The six global models are from the following forecast centers: FSU, Bureau of Meteorology Research Center (BMRC), Japan Meteorological Agency (JMA), National Centers for Environmental Prediction (NCEP), Naval Research Laboratory (NRL), and Recherche en Prevision Numerique (RPN). Predictions of precipitation are made for the period January, February, and December of 2001 with a multi-analysis-multi-model superensemble where, in addition to the six forecast models just mentioned, five additional versions of the FSU model are used in the ensemble, each with a different initialization (analysis) based on different physical initialization procedures. On the basis of observations, the results show that the FSUSE provides the best forecasts of the mass and motion field variables to forecast day 5, when compared to both the models comprising the ensemble and the multi-model ensemble mean during the wet season of December-February over South America. Individual case studies show that the FSUSE provides excellent predictions of rainfall for particular synoptic events to forecast day 3.

  8. Dynamics and control of quadcopter using linear model predictive control approach

    NASA Astrophysics Data System (ADS)

    Islam, M.; Okasha, M.; Idres, M. M.

    2017-12-01

    This paper investigates the dynamics and control of a quadcopter using the Model Predictive Control (MPC) approach. The dynamic model is of high fidelity and nonlinear, with six degrees of freedom that include disturbances and model uncertainties. The control approach is developed based on MPC to track different reference trajectories, ranging from simple ones such as circular trajectories to complex helical ones. In this control technique, a linearized model is derived and the receding horizon method is applied to generate the optimal control sequence. Although MPC is computationally expensive, it is highly effective at dealing with different types of nonlinearities and constraints, such as actuator saturation and model uncertainties. The MPC parameters (control and prediction horizons) are selected by a trial-and-error approach. Several simulation scenarios are performed to examine and evaluate the performance of the proposed control approach using the MATLAB and Simulink environment. Simulation results show that this control approach is highly effective at tracking a given reference trajectory.
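
    The receding horizon idea mentioned in the abstract above can be sketched on a far simpler system than a quadcopter. The example below is an assumption-laden illustration, not the authors' controller: it uses a hypothetical scalar linear model x[k+1] = a*x[k] + b*u[k], a quadratic cost with weights `q` and `r`, and no constraints, so the finite-horizon problem can be solved exactly by a backward Riccati recursion instead of a numerical optimizer.

```python
def lq_gain(a, b, q, r, horizon):
    """First-step feedback gain of a finite-horizon LQ problem (backward Riccati recursion)."""
    p = q       # terminal cost weight
    gain = 0.0
    for _ in range(horizon):
        gain = (a * b * p) / (r + b * b * p)
        p = q + a * a * p - a * b * p * gain
    return gain


def simulate(a, b, q, r, horizon, x0, steps):
    """Receding-horizon regulation of the scalar system toward the origin."""
    x = x0
    trajectory = [x0]
    for _ in range(steps):
        # Re-solve the horizon problem at every step and apply only the first input.
        # (For this time-invariant model the gain is the same each time.)
        u = -lq_gain(a, b, q, r, horizon) * x
        x = a * x + b * u
        trajectory.append(x)
    return trajectory


# Hypothetical open-loop-unstable plant (a > 1) with illustrative weights.
trajectory = simulate(a=1.2, b=1.0, q=1.0, r=0.1, horizon=10, x0=5.0, steps=20)
print(trajectory[:5])
```

    A practical MPC for a vehicle would add input and state constraints, which generally require a constrained optimizer at each step; the horizon length trades computation time against closed-loop performance.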

  9. A system identification approach for developing model predictive controllers of antibody quality attributes in cell culture processes

    PubMed Central

    Schmitt, John; Beller, Justin; Russell, Brian; Quach, Anthony; Hermann, Elizabeth; Lyon, David; Breit, Jeffrey

    2017-01-01

    As the biopharmaceutical industry evolves to include more diverse protein formats and processes, more robust control of Critical Quality Attributes (CQAs) is needed to maintain processing flexibility without compromising quality. Active control of CQAs has been demonstrated using model predictive control techniques, which allow development of processes which are robust against disturbances associated with raw material variability and other potentially flexible operating conditions. Wide adoption of model predictive control in biopharmaceutical cell culture processes has been hampered, however, in part due to the large amount of data and expertise required to make a predictive model of controlled CQAs, a requirement for model predictive control. Here we developed a highly automated, perfusion apparatus to systematically and efficiently generate predictive models using application of system identification approaches. We successfully created a predictive model of %galactosylation using data obtained by manipulating galactose concentration in the perfusion apparatus in serialized step change experiments. We then demonstrated the use of the model in a model predictive controller in a simulated control scenario to successfully achieve a %galactosylation set point in a simulated fed‐batch culture. The automated model identification approach demonstrated here can potentially be generalized to many CQAs, and could be a more efficient, faster, and highly automated alternative to batch experiments for developing predictive models in cell culture processes, and allow the wider adoption of model predictive control in biopharmaceutical processes. © 2017 The Authors Biotechnology Progress published by Wiley Periodicals, Inc. on behalf of American Institute of Chemical Engineers Biotechnol. Prog., 33:1647–1661, 2017 PMID:28786215

  10. Hierarchical multi-scale approach to validation and uncertainty quantification of hyper-spectral image modeling

    NASA Astrophysics Data System (ADS)

    Engel, Dave W.; Reichardt, Thomas A.; Kulp, Thomas J.; Graff, David L.; Thompson, Sandra E.

    2016-05-01

    Validating predictive models and quantifying uncertainties inherent in the modeling process is a critical component of the HARD Solids Venture program [1]. Our current research focuses on validating physics-based models predicting the optical properties of solid materials for arbitrary surface morphologies and characterizing the uncertainties in these models. We employ a systematic and hierarchical approach by designing physical experiments and comparing the experimental results with the outputs of computational predictive models. We illustrate this approach through an example comparing a micro-scale forward model to an idealized solid-material system and then propagating the results through a system model to the sensor level. Our efforts should enhance detection reliability of the hyper-spectral imaging technique and the confidence in model utilization and model outputs by users and stakeholders.

  11. A Sub-filter Scale Noise Equation for Hybrid LES Simulations

    NASA Technical Reports Server (NTRS)

    Goldstein, Marvin E.

    2006-01-01

    Hybrid LES/subscale modeling approaches have an important advantage over the current noise prediction methods in that they only involve modeling of the relatively universal subscale motion and not the configuration-dependent larger-scale turbulence. Previous hybrid approaches use approximate statistical techniques or extrapolation methods to obtain the requisite information about the sub-filter scale motion. An alternative approach would be to adopt the modeling techniques used in the current noise prediction methods and determine the unknown stresses from experimental data. The present paper derives an equation for predicting the subscale sound from information that can be obtained with currently available experimental procedures. The resulting prediction method would then be intermediate between the current noise prediction codes and previously proposed hybrid techniques.

  12. Very-short-term wind power prediction by a hybrid model with single- and multi-step approaches

    NASA Astrophysics Data System (ADS)

    Mohammed, E.; Wang, S.; Yu, J.

    2017-05-01

    Very-short-term wind power prediction (VSTWPP) plays an essential role in the operation of electric power systems. This paper aims at improving and applying a hybrid method of VSTWPP based on historical data. The hybrid method combines multiple linear regression and least squares (MLR&LS) and is intended to reduce prediction errors. The predicted values are obtained through two sub-processes: 1) transform the time-series data of actual wind power into the power ratio, and then predict the power ratio; 2) use the predicted power ratio to predict the wind power. In addition, the proposed method supports two prediction approaches: single-step prediction (SSP) and multi-step prediction (MSP). The wind power prediction (WPP) is compared against an auto-regressive moving average (ARMA) model in terms of predicted values and errors. The validity of the proposed hybrid method is confirmed through error analysis using the probability density function (PDF), mean absolute percentage error (MAPE), and mean square error (MSE). Meanwhile, a comparison of the correlation coefficients between the actual and predicted values for different prediction times and windows confirms that the MSP approach using the hybrid model is the most accurate compared with the SSP approach and ARMA. The MLR&LS method is accurate and promising for solving problems in WPP.

  13. Integrative Approaches for Predicting in vivo Effects of Chemicals from their Structural Descriptors and the Results of Short-term Biological Assays

    PubMed Central

    Low, Yen S.; Sedykh, Alexander; Rusyn, Ivan; Tropsha, Alexander

    2017-01-01

    Cheminformatics approaches such as Quantitative Structure Activity Relationship (QSAR) modeling have been used traditionally for predicting chemical toxicity. In recent years, high throughput biological assays have been increasingly employed to elucidate mechanisms of chemical toxicity and predict toxic effects of chemicals in vivo. The data generated in such assays can be considered as biological descriptors of chemicals that can be combined with molecular descriptors and employed in QSAR modeling to improve the accuracy of toxicity prediction. In this review, we discuss several approaches for integrating chemical and biological data for predicting biological effects of chemicals in vivo and compare their performance across several data sets. We conclude that while no method consistently shows superior performance, the integrative approaches rank consistently among the best yet offer enriched interpretation of models over those built with either chemical or biological data alone. We discuss the outlook for such interdisciplinary methods and offer recommendations to further improve the accuracy and interpretability of computational models that predict chemical toxicity. PMID:24805064

  14. Symbolic Processing Combined with Model-Based Reasoning

    NASA Technical Reports Server (NTRS)

    James, Mark

    2009-01-01

    A computer program for the detection of present and prediction of future discrete states of a complex, real-time engineering system utilizes a combination of symbolic processing and numerical model-based reasoning. One of the biggest weaknesses of a purely symbolic approach is that it enables prediction of only future discrete states while missing all unmodeled states or leading to incorrect identification of an unmodeled state as a modeled one. A purely numerical approach is based on a combination of statistical methods and mathematical models of the applicable physics and necessitates development of a complete model to the level of fidelity required for prediction. In addition, a purely numerical approach does not afford the ability to qualify its results without some form of symbolic processing. The present software implements numerical algorithms to detect unmodeled events and symbolic algorithms to predict expected behavior, correlate the expected behavior with the unmodeled events, and interpret the results in order to predict future discrete states. The approach embodied in this software differs from that of the BEAM methodology (aspects of which have been discussed in several prior NASA Tech Briefs articles), which provides for prediction of future measurements in the continuous-data domain.

  15. Linear regression crash prediction models : issues and proposed solutions.

    DOT National Transportation Integrated Search

    2010-05-01

    The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novel data aggregation to satisfy linear regression assumptions, namely error structure normality ...

  16. Activated sludge pilot plant: comparison between experimental and predicted concentration profiles using three different modelling approaches.

    PubMed

    Le Moullec, Y; Potier, O; Gentric, C; Leclerc, J P

    2011-05-01

    This paper presents an experimental and numerical study of an activated sludge channel pilot plant. Concentration profiles of oxygen, COD, NO(3) and NH(4) have been measured for several operating conditions. These profiles have been compared to the simulated ones with three different modelling approaches, namely a systemic approach, CFD and compartmental modelling. For these three approaches, the kinetics model was the ASM-1 model (Henze et al., 2001). The three approaches allowed a reasonable simulation of all the concentration profiles except for ammonium, for which the simulation results were far from the experimental ones. The analysis of the results showed that the role of the kinetics model is of primary importance for the prediction of activated sludge reactor performance. The fact that existing kinetics parameters in the literature have been determined by parametric optimisation using a systemic model limits the reliability of the prediction of local concentrations and of the local design of activated sludge reactors. Copyright © 2011 Elsevier Ltd. All rights reserved.

  17. Comparison of Models and Whole-Genome Profiling Approaches for Genomic-Enabled Prediction of Septoria Tritici Blotch, Stagonospora Nodorum Blotch, and Tan Spot Resistance in Wheat.

    PubMed

    Juliana, Philomin; Singh, Ravi P; Singh, Pawan K; Crossa, Jose; Rutkoski, Jessica E; Poland, Jesse A; Bergstrom, Gary C; Sorrells, Mark E

    2017-07-01

    The leaf spotting diseases in wheat that include Septoria tritici blotch (STB) caused by , Stagonospora nodorum blotch (SNB) caused by , and tan spot (TS) caused by pose challenges to breeding programs in selecting for resistance. A promising approach that could enable selection prior to phenotyping is genomic selection (GS), which uses genome-wide markers to estimate breeding values (BVs) for quantitative traits. To evaluate this approach for seedling and/or adult plant resistance (APR) to STB, SNB, and TS, we compared the predictive ability of the least-squares (LS) approach with genomic-enabled prediction models including genomic best linear unbiased predictor (GBLUP), Bayesian ridge regression (BRR), Bayes A (BA), Bayes B (BB), Bayes Cπ (BC), Bayesian least absolute shrinkage and selection operator (BL), and reproducing kernel Hilbert spaces markers (RKHS-M), a pedigree-based model (RKHS-P) and RKHS markers and pedigree (RKHS-MP). We observed that LS gave the lowest prediction accuracies and RKHS-MP the highest. The genomic-enabled prediction models and RKHS-P gave similar accuracies. The increase in accuracy using genomic prediction models over LS was 48%. The mean genomic prediction accuracies were 0.45 for STB (APR), 0.55 for SNB (seedling), 0.66 for TS (seedling) and 0.48 for TS (APR). We also compared markers from two whole-genome profiling approaches: genotyping by sequencing (GBS) and diversity arrays technology sequencing (DArTseq) for prediction. While GBS markers performed slightly better than DArTseq, combining markers from the two approaches did not improve accuracies. We conclude that implementing GS in breeding for these diseases would help to achieve higher accuracies and rapid gains from selection. Copyright © 2017 Crop Science Society of America.

  18. Prediction of the dollar to the ruble rate. A system-theoretic approach

    NASA Astrophysics Data System (ADS)

    Borodachev, Sergey M.

    2017-07-01

    A simple state-space model of dollar rate formation is proposed, based on changes in oil prices and some mechanisms of money transfer between the monetary and stock markets. Predictions from an input-output model and the state-space model are compared. It is concluded that, with proper use of statistical data (Kalman filter), the second approach provides more adequate predictions of the dollar rate.
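
    For readers unfamiliar with the Kalman filter mentioned in the abstract above, the predict/update cycle can be sketched for the simplest possible state-space model: a scalar random walk observed with noise. This is not the author's exchange-rate model; the function name, the noise variances, and the constant observation series are all assumptions made for illustration.

```python
def kalman_filter(observations, process_var, obs_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed directly with noise."""
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in observations:
        # Predict: a random walk keeps its value, but uncertainty grows.
        p = p + process_var
        # Update: blend prediction and observation according to the Kalman gain.
        gain = p / (p + obs_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical data: 50 identical readings of a quantity whose true value is 10.
estimates = kalman_filter([10.0] * 50, process_var=0.01, obs_var=1.0)
print(estimates[0], estimates[-1])
```

    The ratio of `process_var` to `obs_var` controls how quickly the estimate follows new data: a larger ratio trusts the observations more, a smaller one smooths more heavily.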

  19. Prediction of Patient-Controlled Analgesic Consumption: A Multimodel Regression Tree Approach.

    PubMed

    Hu, Yuh-Jyh; Ku, Tien-Hsiung; Yang, Yu-Hung; Shen, Jia-Ying

    2018-01-01

    Several factors contribute to individual variability in postoperative pain; therefore, individuals consume postoperative analgesics at different rates. Although many statistical studies have analyzed postoperative pain and analgesic consumption, most have identified only the correlation and have not subjected the statistical model to further tests in order to evaluate its predictive accuracy. In this study involving 3052 patients, a multistrategy computational approach was developed for analgesic consumption prediction. This approach uses data on patient-controlled analgesia demand behavior over time and combines clustering, classification, and regression to mitigate the limitations of current statistical models. Cross-validation results indicated that the proposed approach significantly outperforms various existing regression methods. Moreover, a comparison between the predictions by anesthesiologists and medical specialists and those of the computational approach for an independent test data set of 60 patients further evidenced the superiority of the computational approach in predicting analgesic consumption because it produced markedly lower root mean squared errors.

  20. Emerging approaches in predictive toxicology.

    PubMed

    Zhang, Luoping; McHale, Cliona M; Greene, Nigel; Snyder, Ronald D; Rich, Ivan N; Aardema, Marilyn J; Roy, Shambhu; Pfuhler, Stefan; Venkatactahalam, Sundaresan

    2014-12-01

    Predictive toxicology plays an important role in the assessment of toxicity of chemicals and the drug development process. While there are several well-established in vitro and in vivo assays that are suitable for predictive toxicology, recent advances in high-throughput analytical technologies and model systems are expected to have a major impact on the field of predictive toxicology. This commentary provides an overview of the state of the current science and a brief discussion on future perspectives for the field of predictive toxicology for human toxicity. Computational models for predictive toxicology, needs for further refinement and obstacles to expand computational models to include additional classes of chemical compounds are highlighted. Functional and comparative genomics approaches in predictive toxicology are discussed with an emphasis on successful utilization of recently developed model systems for high-throughput analysis. The advantages of three-dimensional model systems and stem cells and their use in predictive toxicology testing are also described. © 2014 Wiley Periodicals, Inc.

  1. Emerging Approaches in Predictive Toxicology

    PubMed Central

    Zhang, Luoping; McHale, Cliona M.; Greene, Nigel; Snyder, Ronald D.; Rich, Ivan N.; Aardema, Marilyn J.; Roy, Shambhu; Pfuhler, Stefan; Venkatactahalam, Sundaresan

    2016-01-01

    Predictive toxicology plays an important role in the assessment of toxicity of chemicals and the drug development process. While there are several well-established in vitro and in vivo assays that are suitable for predictive toxicology, recent advances in high-throughput analytical technologies and model systems are expected to have a major impact on the field of predictive toxicology. This commentary provides an overview of the state of the current science and a brief discussion on future perspectives for the field of predictive toxicology for human toxicity. Computational models for predictive toxicology, needs for further refinement and obstacles to expand computational models to include additional classes of chemical compounds are highlighted. Functional and comparative genomics approaches in predictive toxicology are discussed with an emphasis on successful utilization of recently developed model systems for high-throughput analysis. The advantages of three-dimensional model systems and stem cells and their use in predictive toxicology testing are also described. PMID:25044351

  2. Data Aggregation, Curation and Modeling Approaches to Deliver Prediction Models to Support Computational Toxicology at the EPA (ACS Fall meeting)

    EPA Science Inventory

    The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program develops and utilizes QSAR modeling approaches across a broad range of applications. In terms of physical chemistry we have a particular interest in the prediction of basic physicochemical parameters ...

  3. Prediction accuracy of direct and indirect approaches, and their relationships with prediction ability of calibration models.

    PubMed

    Belay, T K; Dagnachew, B S; Boison, S A; Ådnøy, T

    2018-03-28

    Milk infrared spectra are routinely used for phenotyping traits of interest through links developed between the traits and spectra. Predicted individual traits are then used in genetic analyses for estimated breeding value (EBV) or for phenotypic predictions using a single-trait mixed model; this approach is referred to as indirect prediction (IP). An alternative approach [direct prediction (DP)] is a direct genetic analysis of (a reduced dimension of) the spectra using a multitrait model to predict multivariate EBV of the spectral components and, ultimately, also to predict the univariate EBV or phenotype for the traits of interest. We simulated 3 traits under different genetic (low: 0.10 to high: 0.90) and residual (zero to high: ±0.90) correlation scenarios between the 3 traits and assumed the first trait is a linear combination of the other 2 traits. The aim was to compare the IP and DP approaches for predictions of EBV and phenotypes under the different correlation scenarios. We also evaluated relationships between performances of the 2 approaches and the accuracy of calibration equations. Moreover, the effect of using different regression coefficients estimated from simulated phenotypes (β_p), true breeding values (β_g), and residuals (β_r) on performance of the 2 approaches was evaluated. The simulated data contained 2,100 parents (100 sires and 2,000 cows) and 8,000 offspring (4 offspring per cow). Of the 8,000 observations, 2,000 were randomly selected and used to develop links between the first and the other 2 traits using partial least squares (PLS) regression analysis. The different PLS regression coefficients, such as β_p, β_g, and β_r, were used in subsequent predictions following the IP and DP approaches. We used BLUP analyses for the remaining 6,000 observations using the true (co)variance components that had been used for the simulation. Accuracy of prediction (of EBV and phenotype) was calculated as a correlation between predicted and true values from the simulations. The results showed that accuracies of EBV prediction were higher in the DP than in the IP approach. The reverse was true for accuracy of phenotypic prediction when using β_p but not when using β_g and β_r, where accuracy of phenotypic prediction in the DP was slightly higher than in the IP approach. Within the DP approach, accuracies of EBV when using β_g were higher than when using β_p only at the low genetic correlation scenario. However, we found no differences in EBV prediction accuracy between the β_p and β_g in the IP approach. Accuracy of the calibration models increased with an increase in genetic and residual correlations between the traits. Performance of both approaches increased with an increase in accuracy of the calibration models. In conclusion, the DP approach is a good strategy for EBV prediction but not for phenotypic prediction, where the classical PLS regression-based equations or the IP approach provided better results. The Authors. Published by FASS Inc. and Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).

  4. Comparison of Predictive Modeling Methods of Aircraft Landing Speed

    NASA Technical Reports Server (NTRS)

    Diallo, Ousmane H.

    2012-01-01

    Expected increases in air traffic demand have stimulated the development of air traffic control tools intended to assist the air traffic controller in accurately and precisely spacing aircraft landing at congested airports. Such tools will require an accurate landing-speed prediction to increase throughput while decreasing necessary controller interventions for avoiding separation violations. There are many practical challenges to developing an accurate landing-speed model that has acceptable prediction errors. This paper discusses the development of a near-term implementation, using readily available information, to estimate/model final approach speed from the top of the descent phase of flight to the landing runway. As a first approach, all variables found to contribute directly to the landing-speed prediction model are used to build a multi-regression technique of the response surface equation (RSE). Data obtained from operations of a major airline for a passenger transport aircraft type to the Dallas/Fort Worth International Airport are used to predict the landing speed. The approach was promising because it decreased the standard deviation of the landing-speed error prediction by at least 18% from the standard deviation of the baseline error, depending on the gust condition at the airport. However, when the number of variables is reduced to those most likely obtainable at other major airports, the RSE model shows little improvement over the existing methods. Consequently, a neural network that relies on a nonlinear regression technique is utilized as an alternative modeling approach. For the reduced-variable cases, the standard deviation of the neural network models' errors represents a reduction of over 5% compared to the RSE model errors, and a reduction of at least 10% over the baseline predicted landing-speed error standard deviation. Overall, the constructed models predict the landing speed more accurately and precisely than the current state-of-the-art.

  5. Predicting Human Preferences Using the Block Structure of Complex Social Networks

    PubMed Central

    Guimerà, Roger; Llorente, Alejandro; Moro, Esteban; Sales-Pardo, Marta

    2012-01-01

    With ever-increasing available data, predicting individuals' preferences and helping them locate the most relevant information has become a pressing need. Understanding and predicting preferences is also important from a fundamental point of view, as part of what has been called a “new” computational social science. Here, we propose a novel approach based on stochastic block models, which have been developed by sociologists as plausible models of complex networks of social interactions. Our model is in the spirit of predicting individuals' preferences based on the preferences of others but, rather than fitting a particular model, we rely on a Bayesian approach that samples over the ensemble of all possible models. We show that our approach is considerably more accurate than leading recommender algorithms, with major relative improvements between 38% and 99% over industry-level algorithms. Besides, our approach sheds light on decision-making processes by identifying groups of individuals that have consistently similar preferences, and enabling the analysis of the characteristics of those groups. PMID:22984533

  6. Livestock Helminths in a Changing Climate: Approaches and Restrictions to Meaningful Predictions

    PubMed Central

    Fox, Naomi J.; Marion, Glenn; Davidson, Ross S.; White, Piran C. L.; Hutchings, Michael R.

    2012-01-01

    Simple Summary Parasitic helminths represent one of the most pervasive challenges to livestock, and their intensity and distribution will be influenced by climate change. There is a need for long-term predictions to identify potential risks and highlight opportunities for control. We explore the approaches to modelling future helminth risk to livestock under climate change. One of the limitations to model creation is the lack of purpose driven data collection. We also conclude that models need to include a broad view of the livestock system to generate meaningful predictions. Abstract Climate change is a driving force for livestock parasite risk. This is especially true for helminths including the nematodes Haemonchus contortus, Teladorsagia circumcincta, Nematodirus battus, and the trematode Fasciola hepatica, since survival and development of free-living stages is chiefly affected by temperature and moisture. The paucity of long term predictions of helminth risk under climate change has driven us to explore optimal modelling approaches and identify current bottlenecks to generating meaningful predictions. We classify approaches as correlative or mechanistic, exploring their strengths and limitations. Climate is one aspect of a complex system and, at the farm level, husbandry has a dominant influence on helminth transmission. Continuing environmental change will necessitate the adoption of mitigation and adaptation strategies in husbandry. Long term predictive models need to have the architecture to incorporate these changes. Ultimately, an optimal modelling approach is likely to combine mechanistic processes and physiological thresholds with correlative bioclimatic modelling, incorporating changes in livestock husbandry and disease control. Irrespective of approach, the principal limitation to parasite predictions is the availability of active surveillance data and empirical data on physiological responses to climate variables. 
By combining improved empirical data and refined models with a broad view of the livestock system, robust projections of helminth risk can be developed. PMID:26486780

  7. A Novel Modelling Approach for Predicting Forest Growth and Yield under Climate Change.

    PubMed

    Ashraf, M Irfan; Meng, Fan-Rui; Bourque, Charles P-A; MacLean, David A

    2015-01-01

    Global climate is changing due to increasing anthropogenic emissions of greenhouse gases. Forest managers need growth and yield models that can be used to predict future forest dynamics during the transition period of present-day forests under a changing climatic regime. In this study, we developed a forest growth and yield model that can be used to predict individual-tree growth under current and projected future climatic conditions. The model was constructed by integrating historical tree growth records with predictions from an ecological process-based model using neural networks. The new model predicts basal area (BA) and volume growth for individual trees in pure or mixed species forests. For model development, tree-growth data under current climatic conditions were obtained using over 3000 permanent sample plots from the Province of Nova Scotia, Canada. Data to reflect tree growth under a changing climatic regime were projected with JABOWA-3 (an ecological process-based model). Model validation with designated data produced model efficiencies of 0.82 and 0.89 in predicting individual-tree BA and volume growth. Model efficiency is a relative index of model performance, where 1 indicates an ideal fit, while values lower than zero mean the predictions are no better than the average of the observations. Overall mean prediction error (BIAS) of basal area and volume growth predictions was nominal (i.e., for BA: -0.0177 cm² 5-year⁻¹ and volume: 0.0008 m³ 5-year⁻¹). Model variability described by root mean squared error (RMSE) in basal area prediction was 40.53 cm² 5-year⁻¹ and 0.0393 m³ 5-year⁻¹ in volume prediction. The new modelling approach has potential to reduce uncertainties in growth and yield predictions under different climate change scenarios. This novel approach provides an avenue for forest managers to generate required information for the management of forests in transitional periods of climate change. 
Artificial intelligence technology has substantial potential in forest modelling.
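
    The three validation statistics named in this abstract (model efficiency, BIAS and RMSE) can be sketched as follows. The abstract does not give formulas, so the definitions here are assumptions: model efficiency is taken as 1 - SSE/SST, and BIAS as the mean of predicted minus observed. All function names and sample values are hypothetical.

```python
import math

def model_efficiency(observed, predicted):
    """1 - SSE/SST: 1 is a perfect fit, 0 matches predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

def mean_bias(observed, predicted):
    """Mean of (predicted - observed); the sign convention is an assumption."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean squared error of the predictions."""
    squared = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))

# Hypothetical growth values for illustration only (not data from the study).
observed = [10.0, 12.0, 15.0, 11.0]
predicted = [10.5, 11.5, 14.0, 11.5]
efficiency = model_efficiency(observed, predicted)
```

    Under this definition a model that always predicts the observed mean scores exactly 0, which is consistent with the abstract's reading of values at or below zero.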

  8. A Novel Modelling Approach for Predicting Forest Growth and Yield under Climate Change

    PubMed Central

    Ashraf, M. Irfan; Meng, Fan-Rui; Bourque, Charles P.-A.; MacLean, David A.

    2015-01-01

    Global climate is changing due to increasing anthropogenic emissions of greenhouse gases. Forest managers need growth and yield models that can be used to predict future forest dynamics during the transition period of present-day forests under a changing climatic regime. In this study, we developed a forest growth and yield model that can be used to predict individual-tree growth under current and projected future climatic conditions. The model was constructed by integrating historical tree growth records with predictions from an ecological process-based model using neural networks. The new model predicts basal area (BA) and volume growth for individual trees in pure or mixed species forests. For model development, tree-growth data under current climatic conditions were obtained using over 3000 permanent sample plots from the Province of Nova Scotia, Canada. Data to reflect tree growth under a changing climatic regime were projected with JABOWA-3 (an ecological process-based model). Model validation with designated data produced model efficiencies of 0.82 and 0.89 in predicting individual-tree BA and volume growth. Model efficiency is a relative index of model performance, where 1 indicates an ideal fit, while values lower than zero mean the predictions are no better than the average of the observations. Overall mean prediction error (BIAS) of basal area and volume growth predictions was nominal (i.e., for BA: -0.0177 cm² 5-year⁻¹ and volume: 0.0008 m³ 5-year⁻¹). Model variability described by root mean squared error (RMSE) in basal area prediction was 40.53 cm² 5-year⁻¹ and 0.0393 m³ 5-year⁻¹ in volume prediction. The new modelling approach has potential to reduce uncertainties in growth and yield predictions under different climate change scenarios. This novel approach provides an avenue for forest managers to generate required information for the management of forests in transitional periods of climate change. 
Artificial intelligence technology has substantial potential in forest modelling. PMID:26173081

  9. Hierarchical Multi-Scale Approach To Validation and Uncertainty Quantification of Hyper-Spectral Image Modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Engel, David W.; Reichardt, Thomas A.; Kulp, Thomas J.

    Validating predictive models and quantifying uncertainties inherent in the modeling process is a critical component of the HARD Solids Venture program [1]. Our current research focuses on validating physics-based models predicting the optical properties of solid materials for arbitrary surface morphologies and characterizing the uncertainties in these models. We employ a systematic and hierarchical approach by designing physical experiments and comparing the experimental results with the outputs of computational predictive models. We illustrate this approach through an example comparing a micro-scale forward model to an idealized solid-material system and then propagating the results through a system model to the sensor level. Our efforts should enhance detection reliability of the hyper-spectral imaging technique and the confidence in model utilization and model outputs by users and stakeholders.

  10. Conformal Regression for Quantitative Structure-Activity Relationship Modeling-Quantifying Prediction Uncertainty.

    PubMed

    Svensson, Fredrik; Aniceto, Natalia; Norinder, Ulf; Cortes-Ciriano, Isidro; Spjuth, Ola; Carlsson, Lars; Bender, Andreas

    2018-05-29

    Making predictions with an associated confidence is highly desirable as it facilitates decision making and resource prioritization. Conformal regression is a machine learning framework that allows the user to define the required confidence and delivers predictions that are guaranteed to be correct to the selected extent. In this study, we apply conformal regression to model molecular properties and bioactivity values and investigate different ways to scale the resultant prediction intervals to create as efficient (i.e., narrow) regressors as possible. Different algorithms to estimate the prediction uncertainty were used to normalize the prediction ranges, and the different approaches were evaluated on 29 publicly available data sets. Our results show that the most efficient conformal regressors are obtained when using the natural exponential of the ensemble standard deviation from the underlying random forest to scale the prediction intervals, but other approaches were almost as efficient. This approach afforded an average prediction range of 1.65 pIC50 units at the 80% confidence level when applied to bioactivity modeling. The choice of nonconformity function has a pronounced impact on the average prediction range with a difference of close to one log unit in bioactivity between the tightest and widest prediction range. Overall, conformal regression is a robust approach to generate bioactivity predictions with associated confidence.
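
    The interval scaling described in this abstract can be illustrated with a minimal split (inductive) conformal sketch. It assumes absolute residuals divided by the exponential of a per-sample uncertainty estimate as the nonconformity score, matching the abstract's "natural exponential of the ensemble standard deviation"; the underlying random forest is not reproduced, and all names and numbers below are hypothetical.

```python
import math

def conformal_quantile(scores, confidence):
    """Calibration score at rank ceil((n + 1) * confidence), or infinity."""
    n = len(scores)
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        # Too few calibration points to support this confidence level.
        return math.inf
    return sorted(scores)[rank - 1]

def calibrate(y_cal, pred_cal, sigma_cal, confidence):
    """Normalized nonconformity scores |y - pred| / exp(sigma) on a calibration set."""
    scores = [abs(y - p) / math.exp(s)
              for y, p, s in zip(y_cal, pred_cal, sigma_cal)]
    return conformal_quantile(scores, confidence)

def predict_interval(pred, sigma, q):
    """Interval pred +/- q * exp(sigma); wider where the model is less certain."""
    half_width = q * math.exp(sigma)
    return pred - half_width, pred + half_width

# Hypothetical calibration data (e.g. pIC50 values), for illustration only.
y_cal = [6.1, 5.4, 7.0, 6.6]
pred_cal = [5.9, 5.8, 6.5, 6.7]
sigma_cal = [0.1, 0.3, 0.4, 0.05]  # stand-in for the ensemble standard deviation
q = calibrate(y_cal, pred_cal, sigma_cal, confidence=0.8)
low, high = predict_interval(6.2, 0.2, q)
```

    The validity guarantee of conformal prediction relies on the calibration and test samples being exchangeable; the choice of scaling function affects only how narrow the intervals are, which is the efficiency the abstract compares.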

  11. An integrated approach to infer dynamic protein-gene interactions - A case study of the human P53 protein.

    PubMed

    Wang, Junbai; Wu, Qianqian; Hu, Xiaohua Tony; Tian, Tianhai

    2016-11-01

    Investigating the dynamics of genetic regulatory networks through high throughput experimental data, such as microarray gene expression profiles, is a very important but challenging task. One of the major hindrances in building detailed mathematical models for genetic regulation is the large number of unknown model parameters. To tackle this challenge, a new integrated method is proposed by combining a top-down approach and a bottom-up approach. First, the top-down approach uses probabilistic graphical models to predict the network structure of DNA repair pathway that is regulated by the p53 protein. Two networks are predicted, namely a network of eight genes with eight inferred interactions and an extended network of 21 genes with 17 interactions. Then, the bottom-up approach using differential equation models is developed to study the detailed genetic regulations based on either a fully connected regulatory network or a gene network obtained by the top-down approach. Model simulation error, parameter identifiability and robustness property are used as criteria to select the optimal network. Simulation results together with permutation tests of input gene network structures indicate that the prediction accuracy and robustness property of the two predicted networks using the top-down approach are better than those of the corresponding fully connected networks. In particular, the proposed approach reduces computational cost significantly for inferring model parameters. Overall, the new integrated method is a promising approach for investigating the dynamics of genetic regulation. Copyright © 2016 Elsevier Inc. All rights reserved.

  12. A system identification approach for developing model predictive controllers of antibody quality attributes in cell culture processes.

    PubMed

    Downey, Brandon; Schmitt, John; Beller, Justin; Russell, Brian; Quach, Anthony; Hermann, Elizabeth; Lyon, David; Breit, Jeffrey

    2017-11-01

    As the biopharmaceutical industry evolves to include more diverse protein formats and processes, more robust control of Critical Quality Attributes (CQAs) is needed to maintain processing flexibility without compromising quality. Active control of CQAs has been demonstrated using model predictive control techniques, which allow development of processes which are robust against disturbances associated with raw material variability and other potentially flexible operating conditions. Wide adoption of model predictive control in biopharmaceutical cell culture processes has been hampered, however, in part due to the large amount of data and expertise required to make a predictive model of controlled CQAs, a requirement for model predictive control. Here we developed a highly automated, perfusion apparatus to systematically and efficiently generate predictive models using application of system identification approaches. We successfully created a predictive model of %galactosylation using data obtained by manipulating galactose concentration in the perfusion apparatus in serialized step change experiments. We then demonstrated the use of the model in a model predictive controller in a simulated control scenario to successfully achieve a %galactosylation set point in a simulated fed-batch culture. The automated model identification approach demonstrated here can potentially be generalized to many CQAs, and could be a more efficient, faster, and highly automated alternative to batch experiments for developing predictive models in cell culture processes, and allow the wider adoption of model predictive control in biopharmaceutical processes. © 2017 The Authors Biotechnology Progress published by Wiley Periodicals, Inc. on behalf of American Institute of Chemical Engineers Biotechnol. Prog., 33:1647-1661, 2017.

  13. Predicting flight delay based on multiple linear regression

    NASA Astrophysics Data System (ADS)

    Ding, Yi

    2017-08-01

    Flight delay has been regarded as one of the toughest difficulties in aviation control. Establishing an effective model to handle the delay prediction problem is therefore significant work. Because flight delay is difficult to predict, this study proposes a method to model arriving flights and a multiple linear regression algorithm to predict delay, compared with the Naive-Bayes and C4.5 approaches. Experiments based on a realistic dataset of domestic airports show that the accuracy of the proposed model approximates 80%, an improvement over the Naive-Bayes and C4.5 approaches. Testing shows that this method is convenient to calculate and can predict flight delays effectively. It can provide a decision basis for airport authorities.
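
    Multiple linear regression of the kind named in this abstract can be sketched with ordinary least squares solved through the normal equations. The abstract does not list its predictors or fitting procedure, so the two generic features and the toy data below are purely hypothetical, and the solver is a textbook choice rather than the study's method.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; features may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_linear_regression(features, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] from (X'X) b = X'y."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coeffs, feature_row):
    """Predicted response for one row of feature values."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], feature_row))

# Hypothetical toy data generated from delay = 1 + 2*x1 + 3*x2.
features = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
delays = [1, 3, 4, 6, 8]
coeffs = fit_linear_regression(features, delays)
```

    For larger or ill-conditioned problems a QR- or SVD-based solver from a numerical library would normally be preferred over the normal equations.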

  14. BAYESIAN METHODS FOR REGIONAL-SCALE EUTROPHICATION MODELS. (R830887)

    EPA Science Inventory

    We demonstrate a Bayesian classification and regression tree (CART) approach to link multiple environmental stressors to biological responses and quantify uncertainty in model predictions. Such an approach can: (1) report prediction uncertainty, (2) be consistent with the amou...

  15. PREDICTIVE MODELING OF LIGHT-INDUCED MORTALITY OF ENTEROCOCCI FAECALIS IN RECREATIONAL WATERS

    EPA Science Inventory

    One approach to predictive modeling of biological contamination of recreational waters involves the application of process-based approaches that consider microbial sources, hydrodynamic transport, and microbial fate. This presentation focuses on one important fate process, light-...

  16. Livestock Helminths in a Changing Climate: Approaches and Restrictions to Meaningful Predictions.

    PubMed

    Fox, Naomi J; Marion, Glenn; Davidson, Ross S; White, Piran C L; Hutchings, Michael R

    2012-03-06

    Climate change is a driving force for livestock parasite risk. This is especially true for helminths including the nematodes Haemonchus contortus, Teladorsagia circumcincta, Nematodirus battus, and the trematode Fasciola hepatica, since survival and development of free-living stages is chiefly affected by temperature and moisture. The paucity of long term predictions of helminth risk under climate change has driven us to explore optimal modelling approaches and identify current bottlenecks to generating meaningful predictions. We classify approaches as correlative or mechanistic, exploring their strengths and limitations. Climate is one aspect of a complex system and, at the farm level, husbandry has a dominant influence on helminth transmission. Continuing environmental change will necessitate the adoption of mitigation and adaptation strategies in husbandry. Long term predictive models need to have the architecture to incorporate these changes. Ultimately, an optimal modelling approach is likely to combine mechanistic processes and physiological thresholds with correlative bioclimatic modelling, incorporating changes in livestock husbandry and disease control. Irrespective of approach, the principal limitation to parasite predictions is the availability of active surveillance data and empirical data on physiological responses to climate variables. By combining improved empirical data and refined models with a broad view of the livestock system, robust projections of helminth risk can be developed.

  17. A New Approach to Predict the Fish Fillet Shelf-Life in Presence of Natural Preservative Agents.

    PubMed

    Giuffrida, Alessandro; Giarratana, Filippo; Valenti, Davide; Muscolino, Daniele; Parisi, Roberta; Parco, Alessio; Marotta, Stefania; Ziino, Graziella; Panebianco, Antonio

    2017-04-13

    Three data sets concerning the behaviour of spoilage flora of fillets treated with natural preservative substances (NPS) were used to construct a new kind of mathematical predictive model. This model, unlike other ones, allows expressing the antibacterial effect of the NPS separately from the prediction of the growth rate. This approach, based on the introduction of a parameter into the predictive primary model, produced a good fitting of observed data and allowed characterising quantitatively the increase of shelf-life of fillets.

  18. Dynamic Simulation of Human Gait Model With Predictive Capability.

    PubMed

    Sun, Jinming; Wu, Shaoli; Voglewede, Philip A

    2018-03-01

    In this paper, it is proposed that the central nervous system (CNS) controls human gait using a predictive control approach in conjunction with classical feedback control instead of exclusive classical feedback control theory that controls based on past error. To validate this proposition, a dynamic model of human gait is developed using a novel predictive approach to investigate the principles of the CNS. The model developed includes two parts: a plant model that represents the dynamics of human gait and a controller that represents the CNS. The plant model is a seven-segment, six-joint model that has nine degrees-of-freedom (DOF). The plant model is validated using data collected from able-bodied human subjects. The proposed controller utilizes model predictive control (MPC). MPC uses an internal model to predict the output in advance, compare the predicted output to the reference, and optimize the control input so that the predicted error is minimal. To decrease the complexity of the model, two joints are controlled using a proportional-derivative (PD) controller. The developed predictive human gait model is validated by simulating able-bodied human gait. The simulation results show that the developed model is able to simulate the kinematic output close to experimental data.
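
    The predict-compare-optimize loop that this abstract attributes to MPC can be illustrated on a toy problem. The sketch below assumes a scalar linear plant x[k+1] = a*x[k] + b*u[k], a control input held constant over the prediction horizon, and a grid search in place of a real optimizer. It is not the nine-DOF gait model or the controller from the paper, and every parameter value is hypothetical.

```python
def predict_trajectory(x, u, a, b, horizon):
    """Roll the internal model forward with the input u held constant."""
    trajectory = []
    for _ in range(horizon):
        x = a * x + b * u
        trajectory.append(x)
    return trajectory

def mpc_step(x, reference, a, b, horizon, candidates):
    """Pick the candidate input with the smallest predicted squared error."""
    def cost(u):
        predicted = predict_trajectory(x, u, a, b, horizon)
        return sum((xp - reference) ** 2 for xp in predicted)
    return min(candidates, key=cost)

def simulate(x0, reference, a=0.9, b=0.5, horizon=5, steps=30):
    """Receding-horizon loop: optimize, apply the chosen input once, repeat."""
    candidates = [i / 10 for i in range(-20, 21)]  # inputs from -2.0 to 2.0
    x = x0
    history = [x]
    for _ in range(steps):
        u = mpc_step(x, reference, a, b, horizon, candidates)
        x = a * x + b * u  # plant assumed identical to the internal model
        history.append(x)
    return history

history = simulate(x0=0.0, reference=1.0)
```

    Re-optimizing at every step is what distinguishes this from a feedback law acting only on past error: the input is chosen from the model's forecast of future error.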

  19. Stochastic approaches for time series forecasting of boron: a case study of Western Turkey.

    PubMed

    Durdu, Omer Faruk

    2010-10-01

    In the present study, seasonal and non-seasonal predictions of boron concentration time series data for the period 1996-2004 from the Büyük Menderes river in western Turkey are addressed by means of linear stochastic models. The methodology presented here is to develop adequate linear stochastic models known as autoregressive integrated moving average (ARIMA) and multiplicative seasonal autoregressive integrated moving average (SARIMA) to predict boron content in the Büyük Menderes catchment. Initially, the Box-Whisker plots and Kendall's tau test are used to identify the trends during the study period. The measurement locations do not show a significant overall trend in boron concentrations, though marginal increasing and decreasing trends are observed for certain periods at some locations. The ARIMA modeling approach involves the following three steps: model identification, parameter estimation, and diagnostic checking. In the model identification step, considering the autocorrelation function (ACF) and partial autocorrelation function (PACF) results of the boron data series, different ARIMA models are identified. The model that gives the minimum Akaike information criterion (AIC) is selected as the best-fit model. The parameter estimation step indicates that the estimated model parameters are significantly different from zero. The diagnostic check step is applied to the residuals of the selected ARIMA models and the results indicate that the residuals are independent, normally distributed, and homoscedastic. For model validation purposes, the predicted results using the best ARIMA models are compared to the observed data. The predicted data show reasonably good agreement with the actual data. 
The comparison of the mean and variance of 3-year (2002-2004) observed data vs predicted data from the selected best models shows that the boron model from ARIMA modeling approaches could be used in a safe manner since the predicted values from these models preserve the basic statistics of observed data in terms of mean. The ARIMA modeling approach is recommended for predicting boron concentration series of a river.
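
    The model identification step in this abstract (choose the candidate with the minimum AIC) can be sketched in a deliberately simplified form. The code below is restricted to pure autoregressive AR(p) candidates fitted from sample autocovariances with the Levinson-Durbin recursion, and uses the approximation AIC = n ln(variance) + 2p. This is a swapped-in simplification and not the ARIMA/SARIMA fitting used in the study; all names are hypothetical.

```python
import math

def autocovariance(series, max_lag):
    """Biased sample autocovariances for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    return [sum(centered[t] * centered[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]

def innovation_variances(acov, max_order):
    """Levinson-Durbin recursion: residual variance of AR(p), p = 0..max_order."""
    phi = []           # AR coefficients of the previous order
    var = acov[0]
    variances = [var]
    for p in range(1, max_order + 1):
        # Reflection coefficient (the lag-p partial autocorrelation).
        k = (acov[p] - sum(phi[j] * acov[p - 1 - j] for j in range(p - 1))) / var
        phi = [phi[j] - k * phi[p - 2 - j] for j in range(p - 1)] + [k]
        var *= 1.0 - k * k
        variances.append(var)
    return variances

def select_ar_order(acov, n_obs, max_order):
    """Return (best_order, {order: AIC}) over AR orders 1..max_order."""
    variances = innovation_variances(acov, max_order)
    aic = {p: n_obs * math.log(variances[p]) + 2 * p
           for p in range(1, max_order + 1)}
    return min(aic, key=aic.get), aic

# Theoretical autocovariances of an AR(1) process with coefficient 0.5,
# scaled to unit variance; used here instead of real boron data.
acov_ar1 = [1.0, 0.5, 0.25, 0.125]
best_order, aic_table = select_ar_order(acov_ar1, n_obs=100, max_order=3)
```

    With real measurements, `autocovariance` would supply the input to `select_ar_order`, and the chosen model would still need the residual diagnostics the abstract describes.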

  20. A simulation technique for predicting thickness of thermal sprayed coatings

    NASA Technical Reports Server (NTRS)

    Goedjen, John G.; Miller, Robert A.; Brindley, William J.; Leissler, George W.

    1995-01-01

    The complexity of many of the components being coated today using the thermal spray process makes the trial and error approach traditionally followed in depositing a uniform coating inadequate, thereby necessitating a more analytical approach to developing robotic trajectories. A two dimensional finite difference simulation model has been developed to predict the thickness of coatings deposited using the thermal spray process. The model couples robotic and component trajectories and thermal spraying parameters to predict coating thickness. Simulations and experimental verification were performed on a rotating disk to evaluate the predictive capabilities of the approach.

  1. A Hierarchical Multivariate Bayesian Approach to Ensemble Model output Statistics in Atmospheric Prediction

    DTIC Science & Technology

    2017-09-01

    efficacy of statistical post-processing methods downstream of these dynamical model components with a hierarchical multivariate Bayesian approach to... Bayesian hierarchical modeling, Markov chain Monte Carlo methods, Metropolis algorithm, machine learning, atmospheric prediction... scale processes. However, this dissertation explores the efficacy of statistical post-processing methods downstream of these dynamical model components

  2. Historical Prediction Modeling Approach for Estimating Long-Term Concentrations of PM2.5 in Cohort Studies before the 1999 Implementation of Widespread Monitoring.

    PubMed

    Kim, Sun-Young; Olives, Casey; Sheppard, Lianne; Sampson, Paul D; Larson, Timothy V; Keller, Joshua P; Kaufman, Joel D

    2017-01-01

    Recent cohort studies have used exposure prediction models to estimate the association between long-term residential concentrations of fine particulate matter (PM2.5) and health. Because these prediction models rely on PM2.5 monitoring data, predictions for times before extensive spatial monitoring present a challenge to understanding long-term exposure effects. The U.S. Environmental Protection Agency (EPA) Federal Reference Method (FRM) network for PM2.5 was established in 1999. We evaluated a novel statistical approach to produce high-quality exposure predictions from 1980 through 2010 in the continental United States for epidemiological applications. We developed spatio-temporal prediction models using geographic predictors and annual average PM2.5 data from 1999 through 2010 from the FRM and the Interagency Monitoring of Protected Visual Environments (IMPROVE) networks. Temporal trends before 1999 were estimated by using a) extrapolation based on PM2.5 data in FRM/IMPROVE, b) PM2.5 sulfate data in the Clean Air Status and Trends Network, and c) visibility data across the Weather Bureau Army Navy network. We validated the models using PM2.5 data collected before 1999 from IMPROVE, California Air Resources Board dichotomous sampler monitoring (CARB dichot), the Children's Health Study (CHS), and the Inhalable Particulate Network (IPN). In our validation using pre-1999 data, the prediction model performed well across three trend estimation approaches when validated using IMPROVE and CHS data (R2 = 0.84-0.91) with lower R2 values in early years. Model performance using CARB dichot and IPN data was worse (R2 = 0.00-0.85) most likely because of fewer monitoring sites and inconsistent sampling methods. Our prediction modeling approach will allow health effects estimation associated with long-term exposures to PM2.5 over extended time periods ≤ 30 years. Citation: Kim SY, Olives C, Sheppard L, Sampson PD, Larson TV, Keller JP, Kaufman JD. 2017. 
Historical prediction modeling approach for estimating long-term concentrations of PM2.5 in cohort studies before the 1999 implementation of widespread monitoring. Environ Health Perspect 125:38-46; http://dx.doi.org/10.1289/EHP131.

  3. A predictive modeling approach to increasing the economic effectiveness of disease management programs.

    PubMed

    Bayerstadler, Andreas; Benstetter, Franz; Heumann, Christian; Winter, Fabian

    2014-09-01

    Predictive Modeling (PM) techniques are gaining importance in the worldwide health insurance business. Modern PM methods are used for customer relationship management, risk evaluation or medical management. This article illustrates a PM approach that enables the economic potential of (cost-) effective disease management programs (DMPs) to be fully exploited by optimized candidate selection as an example of successful data-driven business management. The approach is based on a Generalized Linear Model (GLM) that is easy to apply for health insurance companies. By means of a small portfolio from an emerging country, we show that our GLM approach is stable compared to more sophisticated regression techniques in spite of the difficult data environment. Additionally, we demonstrate for this example of a setting that our model can compete with the expensive solutions offered by professional PM vendors and outperforms non-predictive standard approaches for DMP selection commonly used in the market.

  4. Choice Defines Value: A Predictive Modeling Competition in Health Preference Research.

    PubMed

    Jakubczyk, Michał; Craig, Benjamin M; Barra, Mathias; Groothuis-Oudshoorn, Catharina G M; Hartman, John D; Huynh, Elisabeth; Ramos-Goñi, Juan M; Stolk, Elly A; Rand, Kim

    2018-02-01

    To identify which specifications and approaches to model selection better predict health preferences, the International Academy of Health Preference Research (IAHPR) hosted a predictive modeling competition including 18 teams from around the world. In April 2016, an exploratory survey was fielded: 4074 US respondents completed 20 out of 1560 paired comparisons by choosing between two health descriptions (e.g., longer life span vs. better health). The exploratory data were distributed to all teams. By July, eight teams had submitted their predictions for 1600 additional pairs and described their analytical approach. After these predictions had been posted online, a confirmatory survey was fielded (4148 additional respondents). The victorious team, "Discreetly Charming Econometricians," led by Michał Jakubczyk, achieved the smallest χ², 4391.54 (a predefined criterion). Its primary scientific findings were that different models performed better with different pairs, that the value of life span is not constant proportional, and that logit models have poor predictive validity in health valuation. The results demonstrated the diversity and potential of new analytical approaches in health preference research and highlighted the importance of predictive validity in health valuation. Copyright © 2018 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  5. A New Approach of Juvenile Age Estimation using Measurements of the Ilium and Multivariate Adaptive Regression Splines (MARS) Models for Better Age Prediction.

    PubMed

    Corron, Louise; Marchal, François; Condemi, Silvana; Chaumoître, Kathia; Adalian, Pascal

    2017-01-01

    Juvenile age estimation methods used in forensic anthropology generally lack methodological consistency and/or statistical validity. Considering this, a standard approach using nonparametric Multivariate Adaptive Regression Splines (MARS) models was tested to predict age from iliac biometric variables of male and female juveniles from Marseilles, France, aged 0-12 years. Models using unidimensional (length and width) and bidimensional iliac data (module and surface) were constructed on a training sample of 176 individuals and validated on an independent test sample of 68 individuals. Results show that MARS prediction models using iliac width, module and area give overall better and statistically valid age estimates. These models integrate punctual nonlinearities of the relationship between age and osteometric variables. By constructing valid prediction intervals whose size increases with age, MARS models take into account the normal increase of individual variability. MARS models can qualify as a practical and standardized approach for juvenile age estimation. © 2016 American Academy of Forensic Sciences.

  6. Epigenome-wide cross-tissue predictive modeling and comparison of cord blood and placental methylation in a birth cohort

    PubMed Central

    De Carli, Margherita M; Baccarelli, Andrea A; Trevisi, Letizia; Pantic, Ivan; Brennan, Kasey JM; Hacker, Michele R; Loudon, Holly; Brunst, Kelly J; Wright, Robert O; Wright, Rosalind J; Just, Allan C

    2017-01-01

    Aim: We compared predictive modeling approaches to estimate placental methylation using cord blood methylation. Materials & methods: We performed locus-specific methylation prediction using both linear regression and support vector machine models with 174 matched pairs of 450k arrays. Results: At most CpG sites, both approaches gave poor predictions in spite of a misleading improvement in array-wide correlation. CpG islands and gene promoters, but not enhancers, were the genomic contexts where the correlation between measured and predicted placental methylation levels achieved higher values. We provide a list of 714 sites where both models achieved an R² ≥ 0.75. Conclusion: The present study indicates the need for caution in interpreting cross-tissue predictions. Few methylation sites can be predicted between cord blood and placenta. PMID:28234020

  7. Making predictions of mangrove deforestation: a comparison of two methods in Kenya.

    PubMed

    Rideout, Alasdair J R; Joshi, Neha P; Viergever, Karin M; Huxham, Mark; Briers, Robert A

    2013-11-01

    Deforestation of mangroves is of global concern given their importance for carbon storage, biogeochemical cycling and the provision of other ecosystem services, but the links between rates of loss and potential drivers or risk factors are rarely evaluated. Here, we identified key drivers of mangrove loss in Kenya and compared two different approaches to predicting risk. Risk factors tested included various possible predictors of anthropogenic deforestation, related to population, suitability for land use change and accessibility. Two approaches were taken to modelling risk: a quantitative statistical approach and a qualitative categorical ranking approach. A quantitative model linking rates of loss to risk factors was constructed based on generalized least squares regression and using mangrove loss data from 1992 to 2000. Population density, soil type and proximity to roads were the most important predictors. In order to validate this model, it was used to generate a map of losses of Kenyan mangroves predicted to have occurred between 2000 and 2010. The qualitative categorical model was constructed using data from the same selection of variables, with the coincidence of different risk factors in particular mangrove areas used in an additive manner to create a relative risk index which was then mapped. Quantitative predictions of loss were significantly correlated with the actual loss of mangroves between 2000 and 2010, and the categorical risk index values were also highly correlated with the quantitative predictions. Hence, in this case the relatively simple categorical modelling approach was of similar predictive value to the more complex quantitative model of mangrove deforestation. The advantages and disadvantages of each approach are discussed, and the implications for mangroves are outlined. © 2013 Blackwell Publishing Ltd.

  8. Climatological Observations for Maritime Prediction and Analysis Support Service (COMPASS)

    NASA Astrophysics Data System (ADS)

    OConnor, A.; Kirtman, B. P.; Harrison, S.; Gorman, J.

    2016-02-01

    Current US Navy forecasting systems cannot easily incorporate extended-range forecasts that can improve mission readiness and effectiveness; ensure safety; and reduce cost, labor, and resource requirements. If Navy operational planners had systems that incorporated these forecasts, they could plan missions using more reliable and longer-term weather and climate predictions. Further, using multi-model forecast ensembles instead of single forecasts would produce higher predictive performance. Extended-range multi-model forecast ensembles, such as those available in the North American Multi-Model Ensemble (NMME), are ideal for system integration because of their high skill predictions; however, even higher skill predictions can be produced if forecast model ensembles are combined correctly. While many methods for weighting models exist, the best method in a given environment requires expert knowledge of the models and combination methods. We present an innovative approach that uses machine learning to combine extended-range predictions from multi-model forecast ensembles and generate a probabilistic forecast for any region of the globe up to 12 months in advance. Our machine-learning approach uses 30 years of hindcast predictions to learn patterns of forecast model successes and failures. Each model is assigned a weight for each environmental condition, 100 km² region, and day given any expected environmental information. These weights are then applied to the respective predictions for the region and time of interest to effectively stitch together a single, coherent probabilistic forecast. Our experimental results demonstrate the benefits of our approach to produce extended-range probabilistic forecasts for regions and time periods of interest that are superior, in terms of skill, to individual NMME forecast models and commonly weighted models. The probabilistic forecast leverages the strengths of three NMME forecast models to predict environmental conditions for an area spanning from San Diego, CA to Honolulu, HI, seven months in advance. Key findings include: weighted combinations of models are strictly better than individual models; machine-learned combinations are especially better; and forecasts produced using our approach have the highest rank probability skill score most often.
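
    The weighting and scoring steps described in this abstract can be illustrated with a minimal sketch. It assumes three models that each issue probabilities for three ordered categories (for example below, near, and above normal) and uses made-up weights; in the abstract the weights are learned from hindcasts, which is not reproduced here. The function names combine, rps, and rpss are hypothetical and are not taken from the COMPASS system.

```python
from itertools import accumulate

def combine(forecasts, weights):
    """Weighted average of per-model category probabilities.

    forecasts: one probability list per model (same categories, same order).
    weights:   one non-negative weight per model (hypothetical values here).
    """
    total = sum(weights)
    n_categories = len(forecasts[0])
    return [
        sum(w * f[k] for w, f in zip(weights, forecasts)) / total
        for k in range(n_categories)
    ]

def rps(probs, observed_category):
    """Ranked probability score: squared distance between the cumulative
    forecast distribution and the cumulative (step) observation."""
    cum_forecast = list(accumulate(probs))
    cum_observed = [1.0 if k >= observed_category else 0.0
                    for k in range(len(probs))]
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_observed))

def rpss(probs, reference, observed_category):
    """Skill relative to a reference forecast: 1 is perfect, 0 matches the
    reference, negative is worse than the reference."""
    return 1.0 - rps(probs, observed_category) / rps(reference, observed_category)

# Three hypothetical models, three ordered categories, illustrative weights.
models = [[0.2, 0.3, 0.5], [0.1, 0.3, 0.6], [0.4, 0.4, 0.2]]
combined = combine(models, [0.5, 0.4, 0.1])
climatology = [1 / 3, 1 / 3, 1 / 3]
print(combined, rpss(combined, climatology, observed_category=2))
```

    A single case like this only shows the mechanics; a skill score is normally averaged over many forecast and observation pairs before models are compared.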

  9. Cure modeling in real-time prediction: How much does it help?

    PubMed

    Ying, Gui-Shuang; Zhang, Qiang; Lan, Yu; Li, Yimei; Heitjan, Daniel F

    2017-08-01

    Various parametric and nonparametric modeling approaches exist for real-time prediction in time-to-event clinical trials. Recently, Chen (2016 BMC Biomedical Research Methodology 16) proposed a prediction method based on parametric cure-mixture modeling, intending to cover those situations where it appears that a non-negligible fraction of subjects is cured. In this article we apply a Weibull cure-mixture model to create predictions, demonstrating the approach in RTOG 0129, a randomized trial in head-and-neck cancer. We compare the ultimate realized data in RTOG 0129 to interim predictions from a Weibull cure-mixture model, a standard Weibull model without a cure component, and a nonparametric model based on the Bayesian bootstrap. The standard Weibull model predicted that events would occur earlier than the Weibull cure-mixture model, but the difference was unremarkable until late in the trial when evidence for a cure became clear. Nonparametric predictions often gave undefined predictions or infinite prediction intervals, particularly at early stages of the trial. Simulations suggest that cure modeling can yield better-calibrated prediction intervals when there is a cured component, or the appearance of a cured component, but at a substantial cost in the average width of the intervals. Copyright © 2017 Elsevier Inc. All rights reserved.
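
    The difference between the two parametric models compared in this abstract can be sketched as follows. A cure-mixture model is commonly written as S(t) = p + (1 - p) * exp(-(t / scale)^shape), where p is the cured fraction; the parameter values below are invented for illustration and are not estimates from RTOG 0129. The sketch also ignores censoring and staggered enrollment, which a real interim prediction would have to handle.

```python
import math

def weibull_survival(t, shape, scale):
    """Standard Weibull survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))

def cure_mixture_survival(t, cure_fraction, shape, scale):
    """Survival under a cure-mixture model: cured subjects never have the
    event, the remaining subjects follow a Weibull distribution."""
    return cure_fraction + (1.0 - cure_fraction) * weibull_survival(t, shape, scale)

def expected_events(n_subjects, t, cure_fraction, shape, scale):
    """Expected number of events by time t when all subjects start at time 0
    (a simplifying assumption made only for this sketch)."""
    return n_subjects * (1.0 - cure_mixture_survival(t, cure_fraction, shape, scale))

# Hypothetical parameters: 30% cured, shape 1.5, scale 2 years, 100 subjects.
for years in (1, 3, 10):
    no_cure = expected_events(100, years, 0.0, 1.5, 2.0)
    with_cure = expected_events(100, years, 0.3, 1.5, 2.0)
    print(years, round(no_cure, 1), round(with_cure, 1))
```

    With a cured fraction, the expected event count levels off below the number of subjects, which is why a model without a cure component predicts that a target number of events is reached sooner.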

  10. Simulating boundary layer transition with low-Reynolds-number k-epsilon turbulence models. I - An evaluation of prediction characteristics. II - An approach to improving the predictions

    NASA Technical Reports Server (NTRS)

    Schmidt, R. C.; Patankar, S. V.

    1991-01-01

    The capability of two k-epsilon low-Reynolds number (LRN) turbulence models, those of Jones and Launder (1972) and Lam and Bremhorst (1981), to predict transition in external boundary-layer flows subject to free-stream turbulence is analyzed. Both models correctly predict the basic qualitative aspects of boundary-layer transition with free stream turbulence, but for calculations started at low values of certain defined Reynolds numbers, the transition is generally predicted at unrealistically early locations. Also, the methods predict transition lengths significantly shorter than those found experimentally. An approach to overcoming these deficiencies without abandoning the basic LRN k-epsilon framework is developed. This approach limits the production term in the turbulent kinetic energy equation and is based on a simple stability criterion. It is correlated to the free-stream turbulence value. The modification is shown to improve the qualitative and quantitative characteristics of the transition predictions.

  11. Predicting chemical bioavailability using microarray gene expression data and regression modeling: A tale of three explosive compounds.

    PubMed

    Gong, Ping; Nan, Xiaofei; Barker, Natalie D; Boyd, Robert E; Chen, Yixin; Wilkins, Dawn E; Johnson, David R; Suedel, Burton C; Perkins, Edward J

    2016-03-08

    Chemical bioavailability is an important dose metric in environmental risk assessment. Although many approaches have been used to evaluate bioavailability, not a single approach is free from limitations. Previously, we developed a new genomics-based approach that integrated microarray technology and regression modeling for predicting bioavailability (tissue residue) of explosives compounds in exposed earthworms. In the present study, we further compared 18 different regression models and performed variable selection simultaneously with parameter estimation. This refined approach was applied to both previously collected and newly acquired earthworm microarray gene expression datasets for three explosive compounds. Our results demonstrate that a prediction accuracy of R² = 0.71-0.82 was achievable at a relatively low model complexity with as few as 3-10 predictor genes per model. These results are much more encouraging than our previous ones. This study has demonstrated that our approach is promising for bioavailability measurement, which warrants further studies of mixed contamination scenarios in field settings.

  12. SAR/QSAR MODELS FOR TOXICITY PREDICTION: APPROACHES AND NEW DIRECTIONS

    EPA Science Inventory

    Risk assessment typically incorporates some relevant toxicity information upon which to base a sound estimation for a chemical of concern. However, there are many circumstances in whic...

  13. Direct construction of predictive models for describing growth of Salmonella enteritidis in liquid eggs – a one-step approach

    USDA-ARS's Scientific Manuscript database

    The objective of this study was to develop a new approach using a one-step approach to directly construct predictive models for describing the growth of Salmonella Enteritidis (SE) in liquid egg white (LEW) and egg yolk (LEY). A five-strain cocktail of SE, induced to resist rifampicin at 100 mg/L, ...

  14. On the effects of alternative optima in context-specific metabolic model predictions

    PubMed Central

    Nikoloski, Zoran

    2017-01-01

    The integration of experimental data into genome-scale metabolic models can greatly improve flux predictions. This is achieved by restricting predictions to a more realistic context-specific domain, like a particular cell or tissue type. Several computational approaches to integrate data have been proposed—generally obtaining context-specific (sub)models or flux distributions. However, these approaches may lead to a multitude of equally valid but potentially different models or flux distributions, due to possible alternative optima in the underlying optimization problems. Although this issue introduces ambiguity in context-specific predictions, it has not been generally recognized, especially in the case of model reconstructions. In this study, we analyze the impact of alternative optima in four state-of-the-art context-specific data integration approaches, providing both flux distributions and/or metabolic models. To this end, we present three computational methods and apply them to two particular case studies: leaf-specific predictions from the integration of gene expression data in a metabolic model of Arabidopsis thaliana, and liver-specific reconstructions derived from a human model with various experimental data sources. The application of these methods allows us to obtain the following results: (i) we sample the space of alternative flux distributions in the leaf- and the liver-specific case and quantify the ambiguity of the predictions. In addition, we show how the inclusion of ℓ1-regularization during data integration reduces the ambiguity in both cases. (ii) We generate sets of alternative leaf- and liver-specific models that are optimal to each one of the evaluated model reconstruction approaches. We demonstrate that alternative models of the same context contain a marked fraction of disparate reactions. 
Further, we show that a careful balance between model sparsity and metabolic functionality helps in reducing the discrepancies between alternative models. Finally, our findings indicate that alternative optima must be taken into account for rendering the context-specific metabolic model predictions less ambiguous. PMID:28557990

  15. On the effects of alternative optima in context-specific metabolic model predictions.

    PubMed

    Robaina-Estévez, Semidán; Nikoloski, Zoran

    2017-05-01

    The integration of experimental data into genome-scale metabolic models can greatly improve flux predictions. This is achieved by restricting predictions to a more realistic context-specific domain, like a particular cell or tissue type. Several computational approaches to integrate data have been proposed-generally obtaining context-specific (sub)models or flux distributions. However, these approaches may lead to a multitude of equally valid but potentially different models or flux distributions, due to possible alternative optima in the underlying optimization problems. Although this issue introduces ambiguity in context-specific predictions, it has not been generally recognized, especially in the case of model reconstructions. In this study, we analyze the impact of alternative optima in four state-of-the-art context-specific data integration approaches, providing both flux distributions and/or metabolic models. To this end, we present three computational methods and apply them to two particular case studies: leaf-specific predictions from the integration of gene expression data in a metabolic model of Arabidopsis thaliana, and liver-specific reconstructions derived from a human model with various experimental data sources. The application of these methods allows us to obtain the following results: (i) we sample the space of alternative flux distributions in the leaf- and the liver-specific case and quantify the ambiguity of the predictions. In addition, we show how the inclusion of ℓ1-regularization during data integration reduces the ambiguity in both cases. (ii) We generate sets of alternative leaf- and liver-specific models that are optimal to each one of the evaluated model reconstruction approaches. We demonstrate that alternative models of the same context contain a marked fraction of disparate reactions. 
Further, we show that a careful balance between model sparsity and metabolic functionality helps in reducing the discrepancies between alternative models. Finally, our findings indicate that alternative optima must be taken into account for rendering the context-specific metabolic model predictions less ambiguous.

  16. Latin hypercube approach to estimate uncertainty in ground water vulnerability

    USGS Publications Warehouse

    Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.

    2007-01-01

    A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
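
    The propagation step described in this abstract can be sketched in a few lines. The example assumes a one-variable logistic regression whose intercept, slope, and explanatory value are each normally distributed and independent; the means and standard deviations are invented, and the function names are hypothetical. The study itself samples many coefficients and GIS variables, which this sketch does not attempt to reproduce.

```python
import math
import random
from statistics import NormalDist

def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims such that every dimension
    has exactly one point in each of n_samples equal-width strata."""
    columns = []
    for _ in range(n_dims):
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)  # decouple the strata across dimensions
        columns.append(column)
    return [list(row) for row in zip(*columns)]

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def propagate(n_samples, means, sds, seed=0):
    """Sample (intercept, slope, x) by LHS and return the sorted predicted
    probabilities. Independent normal inputs are an assumption of this sketch."""
    rng = random.Random(seed)
    dists = [NormalDist(m, s) for m, s in zip(means, sds)]
    probs = []
    for point in latin_hypercube(n_samples, len(dists), rng):
        # Keep probabilities strictly inside (0, 1) for the inverse CDF.
        clipped = [min(max(u, 1e-12), 1.0 - 1e-12) for u in point]
        b0, b1, x = (d.inv_cdf(u) for d, u in zip(dists, clipped))
        probs.append(logistic(b0 + b1 * x))
    return sorted(probs)

# Hypothetical inputs: intercept -1.0 +/- 0.2, slope 0.8 +/- 0.1, x 2.0 +/- 0.5.
probs = propagate(200, means=[-1.0, 0.8, 2.0], sds=[0.2, 0.1, 0.5])
print(probs[5], probs[-6])  # rough 2.5% and 97.5% bounds for 200 samples
```

    Stratifying each input guarantees that the tails of every distribution are sampled even with a modest number of runs, which is the usual motivation for LHS over simple random sampling.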

  17. Dynamic-landscape metapopulation models predict complex response of wildlife populations to climate and landscape change

    Treesearch

    Thomas W. Bonnot; Frank R. Thompson; Joshua J. Millspaugh

    2017-01-01

    The increasing need to predict how climate change will impact wildlife species has exposed limitations in how well current approaches model important biological processes at scales at which those processes interact with climate. We used a comprehensive approach that combined recent advances in landscape and population modeling into dynamic-landscape metapopulation...

  18. Predicting Graduation Rates at 4-Year Broad Access Institutions Using a Bayesian Modeling Approach

    ERIC Educational Resources Information Center

    Crisp, Gloria; Doran, Erin; Salis Reyes, Nicole A.

    2018-01-01

    This study models graduation rates at 4-year broad access institutions (BAIs). We examine the student body, structural-demographic, and financial characteristics that best predict 6-year graduation rates across two time periods (2008-2009 and 2014-2015). A Bayesian model averaging approach is utilized to account for uncertainty in variable…

  19. An integrated uncertainty analysis and data assimilation approach for improved streamflow predictions

    NASA Astrophysics Data System (ADS)

    Hogue, T. S.; He, M.; Franz, K. J.; Margulis, S. A.; Vrugt, J. A.

    2010-12-01

    The current study presents an integrated uncertainty analysis and data assimilation approach to improve streamflow predictions while simultaneously providing meaningful estimates of the associated uncertainty. Study models include the National Weather Service (NWS) operational snow model (SNOW17) and rainfall-runoff model (SAC-SMA). The proposed approach uses the recently developed DiffeRential Evolution Adaptive Metropolis (DREAM) to simultaneously estimate uncertainties in model parameters, forcing, and observations. An ensemble Kalman filter (EnKF) is configured with the DREAM-identified uncertainty structure and applied to assimilating snow water equivalent data into the SNOW17 model for improved snowmelt simulations. Snowmelt estimates then serve as an input to the SAC-SMA model to provide streamflow predictions at the basin outlet. The robustness and usefulness of the approach are evaluated for a snow-dominated watershed in the northern Sierra Mountains. This presentation describes the implementation of DREAM and EnKF into the coupled SNOW17 and SAC-SMA models and summarizes study results and findings.
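
    The assimilation step mentioned in this abstract can be illustrated with a scalar ensemble Kalman filter update using perturbed observations. The sketch assumes the state (snow water equivalent) is observed directly and that the observation error variance is known; in the study that uncertainty structure comes from DREAM, which is not modeled here. All numbers and the function name enkf_update are hypothetical.

```python
import random
from statistics import mean, variance

def enkf_update(ensemble, observation, obs_variance, rng):
    """Scalar EnKF analysis step with perturbed observations.

    ensemble:     forecast values of the state, one per member
    observation:  measured value of the same quantity (identity observation
                  operator, an assumption of this sketch)
    obs_variance: observation error variance
    """
    forecast_variance = variance(ensemble)  # sample variance of the members
    gain = forecast_variance / (forecast_variance + obs_variance)
    updated = []
    for member in ensemble:
        # Each member sees its own noisy copy of the observation so that the
        # analysis ensemble keeps a realistic spread.
        perturbed_obs = observation + rng.gauss(0.0, obs_variance ** 0.5)
        updated.append(member + gain * (perturbed_obs - member))
    return updated

rng = random.Random(42)
prior = [100.0, 120.0, 140.0, 160.0, 180.0]  # hypothetical SWE forecasts, mm
posterior = enkf_update(prior, observation=120.0, obs_variance=25.0, rng=rng)
print(mean(prior), mean(posterior))
```

    Because the forecast spread here is much larger than the observation error, the gain is close to one and the updated members cluster near the observation.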

  20. Recent NASA Research on Aerodynamic Modeling of Post-Stall and Spin Dynamics of Large Transport Airplanes

    NASA Technical Reports Server (NTRS)

    Murch, Austin M.; Foster, John V.

    2007-01-01

    A simulation study was conducted to investigate aerodynamic modeling methods for prediction of post-stall flight dynamics of large transport airplanes. The research approach involved integrating dynamic wind tunnel data from rotary balance and forced oscillation testing with static wind tunnel data to predict aerodynamic forces and moments during highly dynamic departure and spin motions. Several state-of-the-art aerodynamic modeling methods were evaluated and predicted flight dynamics using these various approaches were compared. Results showed the different modeling methods had varying effects on the predicted flight dynamics and the differences were most significant during uncoordinated maneuvers. Preliminary wind tunnel validation data indicated the potential of the various methods for predicting steady spin motions.

  1. Dispositional and Situational Avoidance and Approach as Predictors of Physical Symptom Bother Following Breast Cancer Diagnosis

    PubMed Central

    Bauer, Margaret R.; Harris, Lauren N.; Wiley, Joshua F.; Crespi, Catherine M.; Krull, Jennifer L.; Weihs, Karen L.; Stanton, Annette L.

    2016-01-01

    Background: Few studies examine whether dispositional approach and avoidance coping and stressor-specific coping strategies differentially predict physical adjustment to cancer-related stress. Purpose: This study examines dispositional and situational avoidance and approach coping as unique predictors of the bother women experience from physical symptoms after breast cancer treatment, as well as whether situational coping mediates the prediction of bother from physical symptoms by dispositional coping. Method: Breast cancer patients (N=460) diagnosed within the past 3 months completed self-report measures of dispositional coping at study entry and of situational coping and bother from physical symptoms every 6 weeks through 6 months. Results: In multilevel structural equation modeling analyses, both dispositional and situational avoidance predict greater symptom bother. Dispositional, but not situational, approach predicts less symptom bother. Supporting mediation models, dispositional avoidance predicts more symptom bother indirectly through greater situational avoidance. Dispositional approach predicts less symptom bother through less situational avoidance. Conclusion: Psychosocial interventions to reduce cancer-related avoidance coping are warranted for cancer survivors who are high in dispositional avoidance and/or low in dispositional approach. PMID:26769023

  2. Dispositional and Situational Avoidance and Approach as Predictors of Physical Symptom Bother Following Breast Cancer Diagnosis.

    PubMed

    Bauer, Margaret R; Harris, Lauren N; Wiley, Joshua F; Crespi, Catherine M; Krull, Jennifer L; Weihs, Karen L; Stanton, Annette L

    2016-06-01

    Few studies examine whether dispositional approach and avoidance coping and stressor-specific coping strategies differentially predict physical adjustment to cancer-related stress. This study examines dispositional and situational avoidance and approach coping as unique predictors of the bother women experience from physical symptoms after breast cancer treatment, as well as whether situational coping mediates the prediction of bother from physical symptoms by dispositional coping. Breast cancer patients (N = 460) diagnosed within the past 3 months completed self-report measures of dispositional coping at study entry and of situational coping and bother from physical symptoms every 6 weeks through 6 months. In multilevel structural equation modeling analyses, both dispositional and situational avoidance predict greater symptom bother. Dispositional, but not situational, approach predicts less symptom bother. Supporting mediation models, dispositional avoidance predicts more symptom bother indirectly through greater situational avoidance. Dispositional approach predicts less symptom bother through less situational avoidance. Psychosocial interventions to reduce cancer-related avoidance coping are warranted for cancer survivors who are high in dispositional avoidance and/or low in dispositional approach.

  3. A Bayesian approach to model structural error and input variability in groundwater modeling

    NASA Astrophysics Data System (ADS)

    Xu, T.; Valocchi, A. J.; Lin, Y. F. F.; Liang, F.

    2015-12-01

    Effective water resource management typically relies on numerical models to analyze groundwater flow and solute transport processes. Model structural error (due to simplification and/or misrepresentation of the "true" environmental system) and input forcing variability (which commonly arises since some inputs are uncontrolled or estimated with high uncertainty) are ubiquitous in groundwater models. Calibration that overlooks errors in model structure and input data can lead to biased parameter estimates and compromised predictions. We present a fully Bayesian approach for a complete assessment of uncertainty for spatially distributed groundwater models. The approach explicitly recognizes stochastic input and uses data-driven error models based on nonparametric kernel methods to account for model structural error. We employ exploratory data analysis to assist in specifying informative prior for error models to improve identifiability. The inference is facilitated by an efficient sampling algorithm based on DREAM-ZS and a parameter subspace multiple-try strategy to reduce the required number of forward simulations of the groundwater model. We demonstrate the Bayesian approach through a synthetic case study of surface-ground water interaction under changing pumping conditions. It is found that explicit treatment of errors in model structure and input data (groundwater pumping rate) has substantial impact on the posterior distribution of groundwater model parameters. Using error models reduces predictive bias caused by parameter compensation. In addition, input variability increases parametric and predictive uncertainty. The Bayesian approach allows for a comparison among the contributions from various error sources, which could inform future model improvement and data collection efforts on how to best direct resources towards reducing predictive uncertainty.

  4. Robust model predictive control for satellite formation keeping with eccentricity/inclination vector separation

    NASA Astrophysics Data System (ADS)

    Lim, Yeerang; Jung, Youeyun; Bang, Hyochoong

    2018-05-01

    This study presents model predictive formation control based on an eccentricity/inclination vector separation strategy. Alternative collision avoidance can be accomplished by using eccentricity/inclination vectors and adding a simple goal function term to the optimization process. Real-time control is also achievable with a model predictive controller based on a convex formulation. A constraint-tightening approach is also addressed to improve the robustness of the controller, and simulation results are presented to verify the performance enhancement of the proposed approach.

  5. Numeric, Agent-based or System Dynamics Model? Which Modeling Approach is the Best for Vast Population Simulation?

    PubMed

    Cimler, Richard; Tomaskova, Hana; Kuhnova, Jitka; Dolezal, Ondrej; Pscheidl, Pavel; Kuca, Kamil

    2018-01-01

    Alzheimer's disease is one of the most common mental illnesses. It is posited that more than 25% of the population is affected by some mental disease during their lifetime. Treatment of each patient draws resources from the economy concerned. Therefore, it is important to quantify the potential economic impact. Agent-based, system dynamics and numerical approaches to dynamic modeling of the population of the European Union and its patients with Alzheimer's disease are presented in this article. Simulations, their characteristics, and the results from different modeling tools are compared. The results of these approaches are compared with EU population growth predictions from Eurostat, the statistical office of the EU. The methodology for creating the models is described and all three modeling approaches are compared. The suitability of each modeling approach for population modeling is discussed. In this case study, all three approaches gave results corresponding with the EU population prediction. Moreover, we were able to predict the number of patients with AD and, based on the modeling method, we were also able to monitor different characteristics of the population. Copyright © Bentham Science Publishers.

  6. Past, present and prospect of an Artificial Intelligence (AI) based model for sediment transport prediction

    NASA Astrophysics Data System (ADS)

    Afan, Haitham Abdulmohsin; El-shafie, Ahmed; Mohtar, Wan Hanna Melini Wan; Yaseen, Zaher Mundher

    2016-10-01

    An accurate model for sediment prediction is a priority for all hydrological researchers. Many conventional methods have shown an inability to achieve an accurate prediction of suspended sediment. These methods are unable to capture the behaviour of sediment transport in rivers due to the complexity, noise, non-stationarity, and dynamism of the sediment pattern. In the past two decades, Artificial Intelligence (AI) and computational approaches have become a remarkable tool for developing an accurate model. These approaches are considered a powerful tool for solving any non-linear model, as they can deal easily with large amounts of data and sophisticated models. This paper is a review of all AI approaches that have been applied in sediment modelling. The current research focuses on the development of AI application in sediment transport. In addition, the review identifies major challenges and opportunities for prospective research. Throughout the literature, complementary models are reported to be superior to classical modelling.

  7. A study on predicting network corrections in PPP-RTK processing

    NASA Astrophysics Data System (ADS)

    Wang, Kan; Khodabandeh, Amir; Teunissen, Peter

    2017-10-01

    In PPP-RTK processing, the network corrections, including the satellite clocks, the satellite phase biases and the ionospheric delays, are provided to users to enable fast single-receiver integer ambiguity resolution. To resolve the rank deficiencies in the undifferenced observation equations, estimable parameters are formed to generate a full-rank design matrix. In this contribution, we first discuss the interpretation of the estimable parameters without and with a dynamic satellite clock model incorporated in a Kalman filter during the network processing. The functionality of the dynamic satellite clock model is tested in the PPP-RTK processing. Due to the latency generated by the network processing and data transfer, the network corrections are delayed for the real-time user processing. To bridge the latencies, we discuss and compare two prediction approaches making use of the network corrections without and with the dynamic satellite clock model, respectively. The first prediction approach is based on polynomial fitting of the estimated network parameters, while the second approach directly follows the dynamic model in the Kalman filter of the network processing and utilises the satellite clock drifts estimated in the network processing. Using 1 Hz data from two networks in Australia, the influences of the two prediction approaches on the user positioning results are analysed and compared for latencies ranging from 3 to 10 s. The accuracy of the positioning results decreases with increasing latency of the network products. For a latency of 3 s, the RMS of the horizontal and vertical coordinates (with respect to the ground truth) does not differ much between the two prediction approaches. For a latency of 10 s, the prediction approach making use of the satellite clock model generated slightly better positioning results, with RMS differences at the mm level. Further advantages and disadvantages of both prediction approaches are also discussed in this contribution.
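
    The two ways of bridging a correction latency described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the polynomial degree, the window of ten epochs and the synthetic clock values are all hypothetical and are not taken from the paper.

```python
import numpy as np

def predict_polynomial(times, values, t_pred, degree=1):
    """Fit a low-order polynomial to recent corrections and evaluate it at t_pred."""
    times = np.asarray(times, dtype=float)
    t0 = times[-1]  # shift the time axis so the fit stays well conditioned
    coeffs = np.polyfit(times - t0, values, degree)
    return float(np.polyval(coeffs, t_pred - t0))

def predict_with_drift(last_value, drift, latency):
    """Propagate the last clock estimate with an estimated drift (constant-drift model)."""
    return last_value + drift * latency

# Hypothetical 1 Hz satellite clock corrections (metres) with a drift of 0.002 m/s.
times = np.arange(0.0, 10.0)
clock = 0.5 + 0.002 * times
latency = 3.0  # seconds between the last correction and the user epoch

p_poly = predict_polynomial(times, clock, times[-1] + latency)
p_drift = predict_with_drift(clock[-1], 0.002, latency)
```

    On noise-free linear data the two predictions coincide; with real corrections they would differ according to the noise in the fitted window and the quality of the drift estimate.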

  8. Expanding metal mixture toxicity models to natural stream and lake invertebrate communities

    USGS Publications Warehouse

    Balistrieri, Laurie S.; Mebane, Christopher A.; Schmidt, Travis S.; Keller, William (Bill)

    2015-01-01

    A modeling approach that was used to predict the toxicity of dissolved single and multiple metals to trout is extended to stream benthic macroinvertebrates, freshwater zooplankton, and Daphnia magna. The approach predicts the accumulation of toxicants (H, Al, Cd, Cu, Ni, Pb, and Zn) in organisms using 3 equilibrium accumulation models that define interactions between dissolved cations and biological receptors (biotic ligands). These models differ in the structure of the receptors and include a 2-site biotic ligand model, a bidentate biotic ligand or 2-pKa model, and a humic acid model. The predicted accumulation of toxicants is weighted using toxicant-specific coefficients and incorporated into a toxicity function called Tox, which is then related to observed mortality or invertebrate community richness using a logistic equation. All accumulation models provide reasonable fits to metal concentrations in tissue samples of stream invertebrates. Despite the good fits, distinct differences in the magnitude of toxicant accumulation and biotic ligand speciation exist among the models for a given solution composition. However, predicted biological responses are similar among the models because there are interdependencies among model parameters in the accumulation–Tox models. To illustrate potential applications of the approaches, the 3 accumulation–Tox models for natural stream invertebrates are used in Monte Carlo simulations to predict the probability of adverse impacts in catchments of differing geology in central Colorado (USA); to link geology, water chemistry, and biological response; and to demonstrate how this approach can be used to screen for potential risks associated with resource development.
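
    As a rough illustration of the structure described in the abstract above (weighted accumulation combined into Tox, then mapped to a response with a logistic equation), a minimal sketch might look like this. The exact functional forms, the parameter names `tox50` and `slope`, and all numeric values are assumptions made for illustration; the abstract does not give them.

```python
import math

def tox(accumulation, weights):
    """Weighted sum of predicted toxicant accumulation (assumed linear form)."""
    return sum(weights[metal] * amount for metal, amount in accumulation.items())

def logistic_response(tox_value, tox50, slope):
    """Logistic link from Tox to a response fraction between 0 and 1 (assumed parameterisation)."""
    return 1.0 / (1.0 + math.exp(-slope * (tox_value - tox50)))

# Hypothetical accumulation values and toxicant-specific coefficients.
accumulation = {"Cu": 2.0, "Zn": 1.0}
weights = {"Cu": 0.5, "Zn": 1.0}

tox_value = tox(accumulation, weights)
mortality = logistic_response(tox_value, tox50=2.0, slope=3.0)
```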

  9. Prediction of relative and absolute permeabilities for gas and water from soil water retention curves using a pore-scale network model

    NASA Astrophysics Data System (ADS)

    Fischer, Ulrich; Celia, Michael A.

    1999-04-01

    Functional relationships for unsaturated flow in soils, including those between capillary pressure, saturation, and relative permeabilities, are often described using analytical models based on the bundle-of-tubes concept. These models are often limited by, for example, inherent difficulties in prediction of absolute permeabilities, and in incorporation of a discontinuous nonwetting phase. To overcome these difficulties, an alternative approach may be formulated using pore-scale network models. In this approach, the pore space of the network model is adjusted to match retention data, and absolute and relative permeabilities are then calculated. A new approach that allows more general assignments of pore sizes within the network model provides for greater flexibility to match measured data. This additional flexibility is especially important for simultaneous modeling of main imbibition and drainage branches. Through comparisons between the network model results, analytical model results, and measured data for a variety of both undisturbed and repacked soils, the network model is seen to match capillary pressure-saturation data nearly as well as the analytical model, to predict water phase relative permeabilities equally well, and to predict gas phase relative permeabilities significantly better than the analytical model. The network model also provides very good estimates for intrinsic permeability and thus for absolute permeabilities. Both the network model and the analytical model lost accuracy in predicting relative water permeabilities for soils characterized by a van Genuchten exponent n≲3. Overall, the computational results indicate that reliable predictions of both relative and absolute permeabilities are obtained with the network model when the model matches the capillary pressure-saturation data well. The results also indicate that measured imbibition data are crucial to good predictions of the complete hysteresis loop.

  10. Opportunities of probabilistic flood loss models

    NASA Astrophysics Data System (ADS)

    Schröter, Kai; Kreibich, Heidi; Lüdtke, Stefan; Vogel, Kristin; Merz, Bruno

    2016-04-01

    Oftentimes, traditional uni-variate damage models, for instance depth-damage curves, fail to reproduce the variability of observed flood damage. However, reliable flood damage models are a prerequisite for the practical usefulness of the model results. Innovative multi-variate probabilistic modelling approaches are promising to capture and quantify the uncertainty involved and thus to improve the basis for decision making. In this study we compare the predictive capability of two probabilistic modelling approaches, namely Bagging Decision Trees and Bayesian Networks, with that of traditional stage damage functions. For model evaluation we use empirical damage data from computer-aided telephone interviews compiled after the floods in 2002, 2005, 2006 and 2013 in the Elbe and Danube catchments in Germany. We carry out a split sample test by sub-setting the damage records. One sub-set is used to derive the models and the remaining records are used to evaluate the predictive performance of the model. Further, we stratify the sample according to catchments, which allows studying model performance in a spatial transfer context. Flood damage estimation is carried out on the scale of the individual buildings in terms of relative damage. The predictive performance of the models is assessed in terms of systematic deviations (mean bias), precision (mean absolute error), as well as the sharpness of the predictions and the reliability, which is represented by the proportion of observations that fall within the 5-quantile to 95-quantile predictive interval. The comparison of the uni-variable stage damage function and the multi-variable model approach emphasises the importance of quantifying predictive uncertainty. With each explanatory variable, the multi-variable model reveals an additional source of uncertainty. However, the predictive performance in terms of mean bias (mbe), mean absolute error (mae) and reliability (HR) is clearly improved in comparison to the uni-variable stage damage function. Overall, probabilistic models provide quantitative information about prediction uncertainty, which is crucial to assess the reliability of model predictions and improves the usefulness of model results.
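
    The three evaluation measures named in the abstract above (mbe, mae and the hit rate HR of the 5 to 95 quantile interval) could be computed along these lines. This is a sketch only: the function name, the use of the predictive median as the point estimate, and the sample numbers are assumptions, not details from the study.

```python
import numpy as np

def evaluate_predictions(observed, pred_median, pred_q05, pred_q95):
    """Return mean bias error, mean absolute error and interval hit rate."""
    observed = np.asarray(observed, dtype=float)
    pred_median = np.asarray(pred_median, dtype=float)
    pred_q05 = np.asarray(pred_q05, dtype=float)
    pred_q95 = np.asarray(pred_q95, dtype=float)

    errors = pred_median - observed
    mbe = float(np.mean(errors))          # systematic deviation
    mae = float(np.mean(np.abs(errors)))  # average size of the error
    # Share of observations that fall inside the 5-95 quantile interval.
    inside = (observed >= pred_q05) & (observed <= pred_q95)
    hit_rate = float(np.mean(inside))
    return {"mbe": mbe, "mae": mae, "hit_rate": hit_rate}

# Hypothetical relative damage values for four buildings.
scores = evaluate_predictions(
    observed=[0.1, 0.2, 0.4, 0.8],
    pred_median=[0.2, 0.2, 0.3, 0.5],
    pred_q05=[0.0, 0.1, 0.1, 0.2],
    pred_q95=[0.3, 0.4, 0.5, 0.7],
)
```

    A well-calibrated 5 to 95 interval would be expected to give a hit rate near 0.9 on a large sample; the four-record example is far too small to judge that.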

  11. Sparse Event Modeling with Hierarchical Bayesian Kernel Methods

    DTIC Science & Technology

    2016-01-05

    The research objective of this proposal was to develop a predictive Bayesian kernel approach to model count data based on...several predictive variables. Such an approach, which we refer to as the Poisson Bayesian kernel model, is able to model the rate of occurrence of...which adds specificity to the model and can make nonlinear data more manageable. Early results show that the...

  12. PREDICTING CLIMATE-INDUCED RANGE SHIFTS: MODEL DIFFERENCES AND MODEL RELIABILITY

    EPA Science Inventory

    Predicted changes in the global climate are likely to cause large shifts in the geographic ranges of many plant and animal species. To date, predictions of future range shifts have relied on a variety of modeling approaches with different levels of model accuracy. Using a common ...

  13. Efficacy of monitoring and empirical predictive modeling at improving public health protection at Chicago beaches

    USGS Publications Warehouse

    Nevers, Meredith B.; Whitman, Richard L.

    2011-01-01

    Efforts to improve public health protection in recreational swimming waters have focused on obtaining real-time estimates of water quality. Current monitoring techniques rely on the time-intensive culturing of fecal indicator bacteria (FIB) from water samples, but rapidly changing FIB concentrations result in management errors that lead to the public being exposed to high FIB concentrations (type II error) or beaches being closed despite acceptable water quality (type I error). Empirical predictive models may provide a rapid solution, but their effectiveness at improving health protection has not been adequately assessed. We sought to determine if emerging monitoring approaches could effectively reduce risk of illness exposure by minimizing management errors. We examined four monitoring approaches (inactive, current protocol, a single predictive model for all beaches, and individual models for each beach) with increasing refinement at 14 Chicago beaches using historical monitoring and hydrometeorological data and compared management outcomes using different standards for decision-making. Predictability (R²) of FIB concentration improved with model refinement at all beaches but one. Predictive models did not always reduce the number of management errors and therefore the overall illness burden. Use of a Chicago-specific single-sample standard, rather than the widely used default of 235 E. coli CFU/100 ml, together with predictive modeling resulted in the greatest number of open beach days without any increase in public health risk. These results emphasize that emerging monitoring approaches such as empirical models are not equally applicable at all beaches, and combining monitoring approaches may expand beach access.
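
    The two kinds of management error defined in the abstract above can be counted with a short routine such as the following. It is a sketch under assumptions: the function name is hypothetical, a beach is treated as closed whenever the predicted concentration is strictly above the standard, and the sample concentrations are invented.

```python
def count_management_errors(predicted, actual, standard=235.0):
    """Count type I errors (needless closures) and type II errors (missed exceedances)."""
    type_i = 0
    type_ii = 0
    for pred, obs in zip(predicted, actual):
        closed = pred > standard   # decision taken from the prediction
        exceeds = obs > standard   # what the water quality actually was
        if closed and not exceeds:
            type_i += 1            # beach closed despite acceptable water quality
        elif not closed and exceeds:
            type_ii += 1           # public exposed to high FIB concentrations
    return type_i, type_ii

# Hypothetical E. coli concentrations in CFU/100 ml for four beach days.
predicted = [100.0, 300.0, 250.0, 50.0]
actual = [120.0, 200.0, 400.0, 500.0]
type_i_errors, type_ii_errors = count_management_errors(predicted, actual)
```

    Raising or lowering `standard` shifts the balance between the two error types, which is the trade-off the study examines when it compares decision standards.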

  14. A novel single-parameter approach for forecasting algal blooms.

    PubMed

    Xiao, Xi; He, Junyu; Huang, Haomin; Miller, Todd R; Christakos, George; Reichwaldt, Elke S; Ghadouani, Anas; Lin, Shengpan; Xu, Xinhua; Shi, Jiyan

    2017-01-01

    Harmful algal blooms frequently occur globally, and forecasting could constitute an essential proactive strategy for bloom control. To decrease the cost of aquatic environmental monitoring and increase the accuracy of bloom forecasting, a novel single-parameter approach combining wavelet analysis with artificial neural networks (WNN) was developed and verified based on daily online monitoring datasets of algal density in the Siling Reservoir, China and Lake Winnebago, U.S.A. Firstly, a detailed modeling process was illustrated using the forecasting of cyanobacterial cell density in the Chinese reservoir as an example. Three WNN models occupying various prediction time intervals were optimized through model training using an early stopped training approach. All models performed well in fitting historical data and predicting the dynamics of cyanobacterial cell density, with the best model predicting cyanobacteria density one-day ahead (r = 0.986 and mean absolute error = 0.103 × 10⁴ cells mL⁻¹). Secondly, the potential of this novel approach was further confirmed by the precise predictions of algal biomass dynamics measured as chl a in both study sites, demonstrating its high performance in forecasting algal blooms, including cyanobacteria as well as other blooming species. Thirdly, the WNN model was compared to current algal forecasting methods (i.e. artificial neural networks, autoregressive integrated moving average model), and was found to be more accurate. In addition, the application of this novel single-parameter approach is cost effective as it requires only a buoy-mounted fluorescent probe, which is merely a fraction (∼15%) of the cost of a typical auto-monitoring system. As such, the newly developed approach presents a promising and cost-effective tool for the future prediction and management of harmful algal blooms.

  15. Evaluation of a Mysis bioenergetics model

    USGS Publications Warehouse

    Chipps, S.R.; Bennett, D.H.

    2002-01-01

    Direct approaches for estimating the feeding rate of the opossum shrimp Mysis relicta can be hampered by variable gut residence time (evacuation rate models) and non-linear functional responses (clearance rate models). Bioenergetics modeling provides an alternative method, but the reliability of this approach needs to be evaluated using independent measures of growth and food consumption. In this study, we measured growth and food consumption for M. relicta and compared experimental results with those predicted from a Mysis bioenergetics model. For Mysis reared at 10 °C, model predictions were not significantly different from observed values. Moreover, decomposition of mean square error indicated that 70% of the variation between model predictions and observed values was attributable to random error. On average, model predictions were within 12% of observed values. A sensitivity analysis revealed that Mysis respiration and prey energy density were the most sensitive parameters affecting model output. By accounting for uncertainty (95% CLs) in Mysis respiration, we observed a significant improvement in the accuracy of model output (within 5% of observed values), illustrating the importance of sensitive input parameters for model performance. These findings help corroborate the Mysis bioenergetics model and demonstrate the usefulness of this approach for estimating Mysis feeding rate.
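
    The abstract above does not say which decomposition of mean square error was used. One common choice, shown here purely as an assumption, is a Theil-style split into a mean (bias) component, a slope component and a random component, which sum exactly to the MSE when population standard deviations are used. Function and variable names are hypothetical.

```python
import numpy as np

def decompose_mse(predicted, observed):
    """Split the MSE into mean, slope and random components (Theil-style, assumed)."""
    p = np.asarray(predicted, dtype=float)
    o = np.asarray(observed, dtype=float)
    mse = float(np.mean((p - o) ** 2))
    s_p, s_o = p.std(), o.std()          # population standard deviations (ddof=0)
    r = float(np.corrcoef(p, o)[0, 1])   # correlation between predictions and observations
    return {
        "mse": mse,
        "mean": float((p.mean() - o.mean()) ** 2),  # systematic offset
        "slope": float((s_p - r * s_o) ** 2),       # systematic slope error
        "random": float((1.0 - r ** 2) * s_o ** 2), # unexplained scatter
    }

# Hypothetical predicted and observed consumption values.
parts = decompose_mse([1.0, 2.0, 3.0, 5.0], [1.0, 2.0, 4.0, 4.0])
random_share = parts["random"] / parts["mse"]
```

    Under this assumed decomposition, `random_share` is the quantity that would correspond to a statement such as "70% of the variation was attributable to random error".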

  16. An Ensemble Approach for Drug Side Effect Prediction

    PubMed Central

    Jahid, Md Jamiul; Ruan, Jianhua

    2014-01-01

    In silico prediction of drug side-effects in the early stages of drug development is becoming more popular nowadays, as it reduces both the time for drug design and the drug development costs. In this article we propose an ensemble approach to predict side-effects of drug molecules based on their chemical structure. Our idea originates from the observation that similar drugs have similar side-effects. Based on this observation we design an ensemble approach that combines the results from different classification models, where each model is generated by a different set of similar drugs. We applied our approach to 1385 side-effects in the SIDER database for 888 drugs. Results show that our approach outperformed previously published approaches and standard classifiers. Furthermore, we applied our method to a number of uncharacterized drug molecules in the DrugBank database and predicted their side-effect profiles for future use. Results from various sources confirm that our method is able to predict the side-effects of uncharacterized drugs and, more importantly, is able to predict rare side-effects which are often ignored by other approaches. The method described in this article can be useful for predicting side-effects at an early stage of drug design to reduce experimental cost and time. PMID:25327524

  17. An integrated approach to evaluating alternative risk prediction strategies: a case study comparing alternative approaches for preventing invasive fungal disease.

    PubMed

    Sadique, Z; Grieve, R; Harrison, D A; Jit, M; Allen, E; Rowan, K M

    2013-12-01

    This article proposes an integrated approach to the development, validation, and evaluation of new risk prediction models illustrated with the Fungal Infection Risk Evaluation study, which developed risk models to identify non-neutropenic, critically ill adult patients at high risk of invasive fungal disease (IFD). Our decision-analytical model compared alternative strategies for preventing IFD at up to three clinical decision time points (critical care admission, after 24 hours, and end of day 3), followed with antifungal prophylaxis for those judged "high" risk versus "no formal risk assessment." We developed prognostic models to predict the risk of IFD before critical care unit discharge, with data from 35,455 admissions to 70 UK adult, critical care units, and validated the models externally. The decision model was populated with positive predictive values and negative predictive values from the best-fitting risk models. We projected lifetime cost-effectiveness and expected value of partial perfect information for groups of parameters. The risk prediction models performed well in internal and external validation. Risk assessment and prophylaxis at the end of day 3 was the most cost-effective strategy at the 2% and 1% risk threshold. Risk assessment at each time point was the most cost-effective strategy at a 0.5% risk threshold. Expected values of partial perfect information were high for positive predictive values or negative predictive values (£11 million-£13 million) and quality-adjusted life-years (£11 million). It is cost-effective to formally assess the risk of IFD for non-neutropenic, critically ill adult patients. This integrated approach to developing and evaluating risk models is useful for informing clinical practice and future research investment.

  18. Why did the bear cross the road? Comparing the performance of multiple resistance surfaces and connectivity modeling methods

    Treesearch

    Samuel A. Cushman; Jesse S. Lewis; Erin L. Landguth

    2014-01-01

    There have been few assessments of the performance of alternative resistance surfaces, and little is known about how connectivity modeling approaches differ in their ability to predict organism movements. In this paper, we evaluate the performance of four connectivity modeling approaches applied to two resistance surfaces in predicting the locations of highway...

  19. Computational Fluid Dynamics Simulation of Flows in an Oxidation Ditch Driven by a New Surface Aerator.

    PubMed

    Huang, Weidong; Li, Kun; Wang, Gan; Wang, Yingzhe

    2013-11-01

    In this article, we present a newly designed inverse umbrella surface aerator and test its performance in driving the flow of an oxidation ditch. Results show that it performs better in driving the oxidation ditch than the original one, with higher average velocity and a more uniform flow field. We also present a computational fluid dynamics model for predicting the flow field in an oxidation ditch driven by a surface aerator. An improved momentum source term approach to simulate the flow field of the oxidation ditch driven by an inverse umbrella surface aerator was developed and validated through experiments. Four kinds of turbulence models were investigated with the approach, including the standard k-ε model, RNG k-ε model, realizable k-ε model, and Reynolds stress model, and the predicted data were compared with those calculated with the multiple rotating reference frame approach (MRF) and sliding mesh approach (SM). Results of the momentum source term approach are in good agreement with the experimental data, and its prediction accuracy is better than that of MRF and close to that of SM. It is also found that the momentum source term approach has lower computational expenses, is simpler to preprocess, and is easier to use.

  20. Predictions of Cockpit Simulator Experimental Outcome Using System Models

    NASA Technical Reports Server (NTRS)

    Sorensen, J. A.; Goka, T.

    1984-01-01

    This study involved predicting the outcome of a cockpit simulator experiment where pilots used cockpit displays of traffic information (CDTI) to establish and maintain in-trail spacing behind a lead aircraft during approach. The experiments were run on the NASA Ames Research Center multicab cockpit simulator facility. Prior to the experiments, a mathematical model of the pilot/aircraft/CDTI flight system was developed which included relative in-trail and vertical dynamics between aircraft in the approach string. This model was used to construct a digital simulation of the string dynamics including response to initial position errors. The model was then used to predict the outcome of the in-trail following cockpit simulator experiments. Outcome included performance and sensitivity to different separation criteria. The experimental results were then used to evaluate the model and its prediction accuracy. Lessons learned in this modeling and prediction study are noted.

  1. Probability-based collaborative filtering model for predicting gene-disease associations.

    PubMed

    Zeng, Xiangxiang; Ding, Ningxiang; Rodríguez-Patón, Alfonso; Zou, Quan

    2017-12-28

    Accurately predicting pathogenic human genes has been challenging in recent research. Considering extensive gene-disease data verified by biological experiments, we can apply computational methods to perform accurate predictions with reduced time and expenses. We propose a probability-based collaborative filtering model (PCFM) to predict pathogenic human genes. Several kinds of data sets, containing data of humans and data of other nonhuman species, are integrated in our model. Firstly, on the basis of a typical latent factorization model, we propose model I with an average heterogeneous regularization. Secondly, we develop modified model II with personal heterogeneous regularization to enhance the accuracy of the aforementioned models. In this model, vector space similarity or Pearson correlation coefficient metrics and data on related species are also used. We compared the results of PCFM with the results of four state-of-the-art approaches. The results show that PCFM performs better than other advanced approaches. The PCFM model can be leveraged for predictions of disease genes, especially for new human genes or diseases with no known relationships.

  2. Protein model quality assessment prediction by combining fragment comparisons and a consensus Cα contact potential

    PubMed Central

    Zhou, Hongyi; Skolnick, Jeffrey

    2009-01-01

    In this work, we develop a fully automated method for the quality assessment prediction of protein structural models generated by structure prediction approaches such as fold recognition servers, or ab initio methods. The approach is based on fragment comparisons and a consensus Cα contact potential derived from the set of models to be assessed and was tested on CASP7 server models. The average Pearson linear correlation coefficient between predicted quality and model GDT-score per target is 0.83 for the 98 targets which is better than those of other quality assessment methods that participated in CASP7. Our method also outperforms the other methods by about 3% as assessed by the total GDT-score of the selected top models. PMID:18004783
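
    The headline figure in the abstract above is an average, over targets, of the Pearson correlation between predicted quality and actual GDT-score. A minimal sketch of that statistic follows; the function name, the dictionary layout and the toy scores are assumptions for illustration and are not CASP7 data.

```python
import numpy as np

def mean_per_target_pearson(predicted, actual):
    """Average the Pearson correlation computed separately for each target."""
    correlations = []
    for target, pred_scores in predicted.items():
        # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
        r = np.corrcoef(pred_scores, actual[target])[0, 1]
        correlations.append(r)
    return float(np.mean(correlations))

# Hypothetical predicted quality and GDT-scores for the models of two targets.
predicted = {"T_a": [1.0, 2.0, 3.0], "T_b": [1.0, 2.0, 3.0]}
actual = {"T_a": [2.0, 4.0, 6.0], "T_b": [3.0, 2.0, 1.0]}
average_r = mean_per_target_pearson(predicted, actual)
```

    Averaging per target, rather than pooling all models, keeps easy and hard targets from dominating one another; the toy example gives r = 1 for one target and r = -1 for the other.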

  3. Model Update of a Micro Air Vehicle (MAV) Flexible Wing Frame with Uncertainty Quantification

    NASA Technical Reports Server (NTRS)

    Reaves, Mercedes C.; Horta, Lucas G.; Waszak, Martin R.; Morgan, Benjamin G.

    2004-01-01

    This paper describes a procedure to update parameters in the finite element model of a Micro Air Vehicle (MAV) to improve displacement predictions under aerodynamic loads. Because of fabrication, materials, and geometric uncertainties, a statistical approach combined with Multidisciplinary Design Optimization (MDO) is used to modify key model parameters. Static test data collected using photogrammetry are used to correlate with model predictions. Results show significant improvements in model predictions after parameters are updated; however, computed probability values indicate low confidence in updated values and/or model structure errors. Lessons learned in the areas of wing design, test procedures, modeling approaches with geometric nonlinearities, and uncertainty quantification are all documented.

  4. The Prediction of the Gas Utilization Ratio Based on TS Fuzzy Neural Network and Particle Swarm Optimization

    PubMed Central

    Jiang, Haihe; Yin, Yixin; Xiao, Wendong; Zhao, Baoyong

    2018-01-01

    Gas utilization ratio (GUR) is an important indicator that is used to evaluate the energy consumption of blast furnaces (BFs). Currently, the existing methods cannot predict the GUR accurately. In this paper, we present a novel data-driven model for predicting the GUR. The proposed approach utilized both the TS fuzzy neural network (TS-FNN) and the particle swarm algorithm (PSO) to predict the GUR. The particle swarm algorithm (PSO) is applied to optimize the parameters of the TS-FNN in order to decrease the error caused by the inaccurate initial parameter. This paper also applied the box graph (Box-plot) method to eliminate the abnormal value of the raw data during the data preprocessing. This method can deal with the data which does not obey the normal distribution which is caused by the complex industrial environments. The prediction results demonstrate that the optimization model based on PSO and the TS-FNN approach achieves higher prediction accuracy compared with the TS-FNN model and SVM model and the proposed approach can accurately predict the GUR of the blast furnace, providing an effective way for the on-line blast furnace distribution control. PMID:29461469
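
    The box-plot preprocessing step mentioned in the abstract above is not specified in detail. A common reading, assumed here, is Tukey's rule: discard values lying more than 1.5 interquartile ranges outside the quartiles, which needs no normality assumption. The function name, the factor `k` and the sample values are hypothetical.

```python
import numpy as np

def remove_boxplot_outliers(values, k=1.5):
    """Keep only values inside the box-plot fences [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values >= lower) & (values <= upper)]

# Hypothetical gas utilization ratios (%) with one abnormal reading.
raw = [48.0, 49.0, 50.0, 51.0, 52.0, 90.0]
cleaned = remove_boxplot_outliers(raw)
```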

  5. The Prediction of the Gas Utilization Ratio based on TS Fuzzy Neural Network and Particle Swarm Optimization.

    PubMed

    Zhang, Sen; Jiang, Haihe; Yin, Yixin; Xiao, Wendong; Zhao, Baoyong

    2018-02-20

    Gas utilization ratio (GUR) is an important indicator that is used to evaluate the energy consumption of blast furnaces (BFs). Currently, the existing methods cannot predict the GUR accurately. In this paper, we present a novel data-driven model for predicting the GUR. The proposed approach utilized both the TS fuzzy neural network (TS-FNN) and the particle swarm algorithm (PSO) to predict the GUR. The particle swarm algorithm (PSO) is applied to optimize the parameters of the TS-FNN in order to decrease the error caused by the inaccurate initial parameter. This paper also applied the box graph (Box-plot) method to eliminate the abnormal value of the raw data during the data preprocessing. This method can deal with the data which does not obey the normal distribution which is caused by the complex industrial environments. The prediction results demonstrate that the optimization model based on PSO and the TS-FNN approach achieves higher prediction accuracy compared with the TS-FNN model and SVM model and the proposed approach can accurately predict the GUR of the blast furnace, providing an effective way for the on-line blast furnace distribution control.
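
    For orientation, a generic particle swarm optimizer of the kind the abstract above refers to can be sketched as follows. This is not the authors' implementation: the inertia and acceleration constants, the search range and the sphere test function are assumptions, and in the paper's setting the objective would instead be the TS-FNN prediction error as a function of the network parameters.

```python
import numpy as np

def pso_minimize(objective, dim, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` with a basic global-best particle swarm."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5.0, 5.0, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                   # best position seen by each particle
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()           # best position seen by the swarm
    for _ in range(iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Inertia plus pulls towards the personal and global best positions.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, float(pbest_val.min())

def sphere(x):
    """Stand-in objective with its minimum of 0 at the origin."""
    return float(np.sum(x ** 2))

best_position, best_value = pso_minimize(sphere, dim=2)
```

    The returned value can never be worse than the best of the initial random particles, because personal and global bests are only replaced on improvement.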

  6. How long will my mouse live? Machine learning approaches for prediction of mouse life span.

    PubMed

    Swindell, William R; Harper, James M; Miller, Richard A

    2008-09-01

    Prediction of individual life span based on characteristics evaluated at middle-age represents a challenging objective for aging research. In this study, we used machine learning algorithms to construct models that predict life span in a stock of genetically heterogeneous mice. Life-span prediction accuracy of 22 algorithms was evaluated using a cross-validation approach, in which models were trained and tested with distinct subsets of data. Using a combination of body weight and T-cell subset measures evaluated before 2 years of age, we show that the life-span quartile to which an individual mouse belongs can be predicted with an accuracy of 35.3% (+/-0.10%). This result provides a new benchmark for the development of life-span-predictive models, but improvement can be expected through identification of new predictor variables and development of computational approaches. Future work in this direction can provide tools for aging research and will shed light on associations between phenotypic traits and longevity.

  7. Utilizing Chinese Admission Records for MACE Prediction of Acute Coronary Syndrome

    PubMed Central

    Hu, Danqing; Huang, Zhengxing; Chan, Tak-Ming; Dong, Wei; Lu, Xudong; Duan, Huilong

    2016-01-01

    Background: Clinical major adverse cardiovascular event (MACE) prediction of acute coronary syndrome (ACS) is important for a number of applications including physician decision support, quality of care assessment, and efficient healthcare service delivery on ACS patients. Admission records, as typical media to contain clinical information of patients at the early stage of their hospitalizations, provide significant potential to be explored for MACE prediction in a proactive manner. Methods: We propose a hybrid approach for MACE prediction by utilizing a large volume of admission records. Firstly, both a rule-based medical language processing method and a machine learning method (i.e., Conditional Random Fields (CRFs)) are developed to extract essential patient features from unstructured admission records. After that, state-of-the-art supervised machine learning algorithms are applied to construct MACE prediction models from data. Results: We comparatively evaluate the performance of the proposed approach on a real clinical dataset consisting of 2930 ACS patient samples collected from a Chinese hospital. Our best model achieved 72% AUC in MACE prediction. In comparison of the performance between our models and two well-known ACS risk score tools, i.e., GRACE and TIMI, our learned models obtain better performances with a significant margin. Conclusions: Experimental results reveal that our approach can obtain competitive performance in MACE prediction. The comparison of classifiers indicates the proposed approach has a competitive generality with datasets extracted by different feature extraction methods. Furthermore, our MACE prediction model obtained a significant improvement by comparison with both GRACE and TIMI. It indicates that using admission records can effectively provide MACE prediction service for ACS patients at the early stage of their hospitalizations. PMID:27649220

  8. Turbulence Modeling Effects on the Prediction of Equilibrium States of Buoyant Shear Flows

    NASA Technical Reports Server (NTRS)

    Zhao, C. Y.; So, R. M. C.; Gatski, T. B.

    2001-01-01

    The effects of turbulence modeling on the prediction of equilibrium states of turbulent buoyant shear flows were investigated. The velocity field models used include a two-equation closure, a Reynolds-stress closure assuming two different pressure-strain models and three different dissipation rate tensor models. As for the thermal field closure models, two different pressure-scrambling models and nine different temperature variance dissipation rate (ε_θ) equations were considered. The emphasis of this paper is focused on the effects of the ε_θ-equation, of the dissipation rate models, of the pressure-strain models and of the pressure-scrambling models on the prediction of the approach to equilibrium turbulence. Equilibrium turbulence is defined by the time rate of change of the scaled Reynolds stress anisotropic tensor and heat flux vector becoming zero. These conditions lead to the equilibrium state parameters. Calculations show that the ε_θ-equation has a significant effect on the prediction of the approach to equilibrium turbulence. For a particular ε_θ-equation, all velocity closure models considered give an equilibrium state if anisotropic dissipation is accounted for in one form or another in the dissipation rate tensor or in the ε_θ-equation. It is further found that the models considered for the pressure-strain tensor and the pressure-scrambling vector have little or no effect on the prediction of the approach to equilibrium turbulence.

  9. THE FUTURE OF TOXICOLOGY-PREDICTIVE TOXICOLOGY: AN EXPANDED VIEW OF CHEMICAL TOXICITY

    EPA Science Inventory

    A chemistry approach to predictive toxicology relies on structure−activity relationship (SAR) modeling to predict biological activity from chemical structure. Such approaches have proven capabilities when applied to well-defined toxicity end points or regions of chemical space. T...

  10. Comparative study of two approaches to model the offshore fish cages

    NASA Astrophysics Data System (ADS)

    Zhao, Yun-peng; Wang, Xin-xin; Decew, Jud; Tsukrov, Igor; Bai, Xiao-dong; Bi, Chun-wei

    2015-06-01

    The goal of this paper is to provide a comparative analysis of two commonly used approaches to discretize offshore fish cages: the lumped-mass approach and the finite element technique. Two case studies are chosen to compare predictions of the LMA (lumped-mass approach) and FEA (finite element analysis) based numerical modeling techniques. In both case studies, we consider several loading conditions consisting of different uniform currents and monochromatic waves. We investigate motion of the cage, its deformation, and the resultant tension in the mooring lines. Both model predictions are sufficiently close to the experimental data, but for the first experiment, the DUT-FlexSim predictions are slightly more accurate than the ones provided by Aqua-FE™. According to the comparisons, both models can be successfully utilized in the design and analysis of offshore fish cages provided that an appropriate safety factor is chosen.

  11. Computational modeling of human oral bioavailability: what will be next?

    PubMed

    Cabrera-Pérez, Miguel Ángel; Pham-The, Hai

    2018-06-01

    The oral route is the most convenient way of administering drugs. Therefore, accurate determination of oral bioavailability is paramount during drug discovery and development. Quantitative structure-property relationship (QSPR), rule-of-thumb (RoT) and physiologically based-pharmacokinetic (PBPK) approaches are promising alternatives for early oral bioavailability prediction. Areas covered: The authors give insight into the factors affecting bioavailability, the fundamental theoretical framework and the practical aspects of computational methods for predicting this property. They also give their perspectives on future computational models for estimating oral bioavailability. Expert opinion: Oral bioavailability is a multi-factorial pharmacokinetic property, and its accurate prediction is challenging. For RoT and QSPR modeling, the reliability of datasets, the significance of molecular descriptor families and the diversity of chemometric tools used are important factors that define model predictability and interpretability. Likewise, for PBPK modeling the integrity of the pharmacokinetic data, the number of input parameters, the complexity of statistical analysis and the software packages used are relevant factors in bioavailability prediction. Although these approaches have been utilized independently, the tendency to use hybrid QSPR-PBPK approaches together with the exploration of ensemble and deep-learning systems for QSPR modeling of oral bioavailability has opened new avenues for developing promising tools for oral bioavailability prediction.

  12. Handling a Small Dataset Problem in Prediction Model by employ Artificial Data Generation Approach: A Review

    NASA Astrophysics Data System (ADS)

    Lateh, Masitah Abdul; Kamilah Muda, Azah; Yusof, Zeratul Izzah Mohd; Azilah Muda, Noor; Sanusi Azmi, Mohd

    2017-09-01

    The emerging era of big data over the past few years has led to large and complex data that demand faster and better decision making. However, small-dataset problems still arise in certain areas, making analysis and decisions hard. In order to build a prediction model, a large sample is required as a training sample of the model. A small dataset is insufficient to produce an accurate prediction model. This paper reviews an artificial data generation approach as one solution to the small-dataset problem.

  13. Improving predictions of large scale soil carbon dynamics: Integration of fine-scale hydrological and biogeochemical processes, scaling, and benchmarking

    NASA Astrophysics Data System (ADS)

    Riley, W. J.; Dwivedi, D.; Ghimire, B.; Hoffman, F. M.; Pau, G. S. H.; Randerson, J. T.; Shen, C.; Tang, J.; Zhu, Q.

    2015-12-01

    Numerical model representations of decadal- to centennial-scale soil-carbon dynamics are a dominant cause of uncertainty in climate change predictions. Recent attempts by some Earth System Model (ESM) teams to integrate previously unrepresented soil processes (e.g., explicit microbial processes, abiotic interactions with mineral surfaces, vertical transport), poor performance of many ESM land models against large-scale and experimental manipulation observations, and complexities associated with spatial heterogeneity highlight the nascent nature of our community's ability to accurately predict future soil carbon dynamics. I will present recent work from our group to develop a modeling framework to integrate pore-, column-, watershed-, and global-scale soil process representations into an ESM (ACME), and apply the International Land Model Benchmarking (ILAMB) package for evaluation. At the column scale and across a wide range of sites, observed depth-resolved carbon stocks and their 14C derived turnover times can be explained by a model with explicit representation of two microbial populations, a simple representation of mineralogy, and vertical transport. Integrating soil and plant dynamics requires a 'process-scaling' approach, since all aspects of the multi-nutrient system cannot be explicitly resolved at ESM scales. I will show that one approach, the Equilibrium Chemistry Approximation, improves predictions of forest nitrogen and phosphorus experimental manipulations and leads to very different global soil carbon predictions. Translating model representations from the site- to ESM-scale requires a spatial scaling approach that either explicitly resolves the relevant processes, or more practically, accounts for fine-resolution dynamics at coarser scales. To that end, I will present recent watershed-scale modeling work that applies reduced order model methods to accurately scale fine-resolution soil carbon dynamics to coarse-resolution simulations. 
    Finally, we contend that creating believable soil carbon predictions requires a robust, transparent, and community-available benchmarking framework. I will present an ILAMB evaluation of several of the above-mentioned approaches in ACME, and attempt to motivate community adoption of this evaluation approach.

  14. Switching Kalman filter for failure prognostic

    NASA Astrophysics Data System (ADS)

    Lim, Chi Keong Reuben; Mba, David

    2015-02-01

    The use of condition monitoring (CM) data to predict remaining useful life has been growing with the increasing use of health and usage monitoring systems on aircraft. There are many data-driven methodologies available for the prediction, and popular ones include artificial intelligence and statistical based approaches. The drawback of such approaches is that they require a lot of failure data for training, which can be scarce in practice. In lieu of this, methods using state-space and regression-based models that extract information from the data history itself have been explored. However, such methods have their own limitations as they utilize a single time-invariant model which does not represent a changing degradation path well. This causes most degradation modeling studies to focus only on segments of their CM data that behave close to the assumed model. In this paper, a state-space based method, the Switching Kalman Filter (SKF), is adopted for model estimation and life prediction. The SKF approach, however, uses multiple models from which the most probable model is inferred from the CM data using Bayesian estimation before it is applied for prediction. At the same time, the inference of the degradation model itself can provide maintainers with more information for their planning. This SKF approach is demonstrated with a case study on gearbox bearings that were found defective from the Republic of Singapore Air Force AH64D helicopter. The use of in-service CM data allows the approach to be applied in a practical scenario, and results showed that the developed SKF approach is a promising tool to support maintenance decision-making.
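
    The idea of inferring the most probable degradation model from the data can be sketched with a small bank of scalar Kalman filters whose model probabilities are updated by Bayes' rule. This is a simplified illustration, not the authors' implementation: the random-walk-with-drift models, the noise variances `q` and `r`, the sticky transition probability `stay_prob`, and the synthetic measurements are all assumptions, and the sketch omits the state-merging step that full switching Kalman filter formulations typically include.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def multi_model_filter(measurements, drifts, q=0.01, r=0.04, stay_prob=0.95):
    """Run one scalar Kalman filter per candidate drift and track model probabilities.

    Each hypothetical model is x_t = x_{t-1} + drift + w,  z_t = x_t + v,
    with process variance q and measurement variance r.
    Returns the list of model-probability vectors, one per processed measurement.
    """
    n = len(drifts)
    if n < 2:
        raise ValueError("need at least two candidate models")
    x = [measurements[0]] * n      # state estimate of each filter
    p = [1.0] * n                  # state variance of each filter
    prob = [1.0 / n] * n           # uniform prior over models
    switch = (1.0 - stay_prob) / (n - 1)
    history = []
    for z in measurements[1:]:
        # Predict model probabilities through a sticky transition matrix.
        prior = [
            sum(prob[j] * (stay_prob if i == j else switch) for j in range(n))
            for i in range(n)
        ]
        likelihood = []
        for m in range(n):
            x_pred = x[m] + drifts[m]          # time update
            p_pred = p[m] + q
            s = p_pred + r                     # innovation variance
            k = p_pred / s                     # Kalman gain
            innovation = z - x_pred
            likelihood.append(gaussian_pdf(innovation, 0.0, s))
            x[m] = x_pred + k * innovation     # measurement update
            p[m] = (1.0 - k) * p_pred
        # Bayes' rule: posterior is proportional to prior times likelihood.
        unnormalized = [prior[m] * likelihood[m] for m in range(n)]
        total = sum(unnormalized)
        prob = [u / total for u in unnormalized]
        history.append(list(prob))
    return history


# Synthetic health indicator that degrades by 0.5 per step (illustrative only).
measurements = [0.5 * t for t in range(20)]
probabilities = multi_model_filter(measurements, drifts=[0.0, 0.5])
print("final model probabilities:", probabilities[-1])
```

    On this synthetic series the probability mass moves to the drift-0.5 model, because the no-drift filter keeps lagging the measurements and its innovations become unlikely.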

  15. Historical Prediction Modeling Approach for Estimating Long-Term Concentrations of PM2.5 in Cohort Studies before the 1999 Implementation of Widespread Monitoring

    PubMed Central

    Kim, Sun-Young; Olives, Casey; Sheppard, Lianne; Sampson, Paul D.; Larson, Timothy V.; Keller, Joshua P.; Kaufman, Joel D.

    2016-01-01

    Introduction: Recent cohort studies have used exposure prediction models to estimate the association between long-term residential concentrations of fine particulate matter (PM2.5) and health. Because these prediction models rely on PM2.5 monitoring data, predictions for times before extensive spatial monitoring present a challenge to understanding long-term exposure effects. The U.S. Environmental Protection Agency (EPA) Federal Reference Method (FRM) network for PM2.5 was established in 1999. Objectives: We evaluated a novel statistical approach to produce high-quality exposure predictions from 1980 through 2010 in the continental United States for epidemiological applications. Methods: We developed spatio-temporal prediction models using geographic predictors and annual average PM2.5 data from 1999 through 2010 from the FRM and the Interagency Monitoring of Protected Visual Environments (IMPROVE) networks. Temporal trends before 1999 were estimated by using a) extrapolation based on PM2.5 data in FRM/IMPROVE, b) PM2.5 sulfate data in the Clean Air Status and Trends Network, and c) visibility data across the Weather Bureau Army Navy network. We validated the models using PM2.5 data collected before 1999 from IMPROVE, California Air Resources Board dichotomous sampler monitoring (CARB dichot), the Children’s Health Study (CHS), and the Inhalable Particulate Network (IPN). Results: In our validation using pre-1999 data, the prediction model performed well across three trend estimation approaches when validated using IMPROVE and CHS data (R2 = 0.84–0.91) with lower R2 values in early years. Model performance using CARB dichot and IPN data was worse (R2 = 0.00–0.85) most likely because of fewer monitoring sites and inconsistent sampling methods. Conclusions: Our prediction modeling approach will allow health effects estimation associated with long-term exposures to PM2.5 over extended time periods ≤ 30 years. 
    Citation: Kim SY, Olives C, Sheppard L, Sampson PD, Larson TV, Keller JP, Kaufman JD. 2017. Historical prediction modeling approach for estimating long-term concentrations of PM2.5 in cohort studies before the 1999 implementation of widespread monitoring. Environ Health Perspect 125:38–46; http://dx.doi.org/10.1289/EHP131 PMID:27340825

  16. Facultative Stabilization Pond: Measuring Biological Oxygen Demand using Mathematical Approaches

    NASA Astrophysics Data System (ADS)

    Wira S, Ihsan; Sunarsih, Sunarsih

    2018-02-01

    Pollution is a man-made phenomenon. Some pollutants that are discharged directly to the environment can create serious pollution problems. Untreated wastewater will cause contamination and even pollution of the water body. Biological Oxygen Demand (BOD) is the amount of oxygen required for oxidation by bacteria. The higher the BOD concentration, the greater the organic matter content. The purpose of this study was to predict the value of BOD contained in wastewater. Mathematical modeling methods were chosen in this study to depict and predict the BOD values contained in facultative wastewater stabilization ponds. Measurements of sampling data were carried out to validate the model. The results of this study indicated that a mathematical approach can be applied to predict the BOD contained in the facultative wastewater stabilization ponds. The model was validated using the Absolute Mean Error (AME) with a 10% tolerance limit; the AME for the model was 7.38% (< 10%), so the model is valid. Furthermore, a mathematical approach can also be applied to illustrate and predict the contents of wastewater.
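
    The validation rule quoted in this abstract (error below a 10% tolerance) can be illustrated with a short sketch. The abstract does not give the error formula, so the definition below, the relative difference between the simulated and observed means, is an assumption based on one common usage of "absolute mean error"; the BOD values are hypothetical and are not the study's data.

```python
def absolute_mean_error(simulated, observed):
    """Relative difference between simulated and observed means, in percent.

    Assumed definition: |mean(simulated) - mean(observed)| / |mean(observed)| * 100.
    """
    if not observed or len(simulated) != len(observed):
        raise ValueError("series must be non-empty and of equal length")
    mean_sim = sum(simulated) / len(simulated)
    mean_obs = sum(observed) / len(observed)
    if mean_obs == 0:
        raise ValueError("observed mean is zero; relative error is undefined")
    return abs(mean_sim - mean_obs) / abs(mean_obs) * 100.0


def is_valid(simulated, observed, tolerance_percent=10.0):
    """Accept the model when the error stays below the tolerance limit."""
    return absolute_mean_error(simulated, observed) < tolerance_percent


# Hypothetical BOD concentrations in mg/L, for illustration only.
observed_bod = [100.0, 80.0, 60.0, 40.0]
simulated_bod = [105.0, 86.0, 63.0, 40.0]

error = absolute_mean_error(simulated_bod, observed_bod)
print(f"AME = {error:.2f}%, valid = {is_valid(simulated_bod, observed_bod)}")
```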

  17. Predictive modeling of dynamic fracture growth in brittle materials with machine learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moore, Bryan A.; Rougier, Esteban; O’Malley, Daniel

    We use simulation data from a high-fidelity Finite-Discrete Element Model to build an efficient Machine Learning (ML) approach to predict fracture growth and coalescence. Our goal is for the ML approach to be used as an emulator in place of the computationally intensive high-fidelity models in an uncertainty quantification framework where thousands of forward runs are required. The failure of materials with various fracture configurations (size, orientation and the number of initial cracks) is explored and used as data to train our ML model. This novel approach has shown promise in predicting spatial (path to failure) and temporal (time to failure) aspects of brittle material failure. Predictions of where dominant fracture paths formed within a material were ~85% accurate and the time of material failure deviated from the actual failure time by an average of ~16%. Additionally, the ML model achieves a reduction in computational cost by multiple orders of magnitude.

  18. Predictive modeling of dynamic fracture growth in brittle materials with machine learning

    DOE PAGES

    Moore, Bryan A.; Rougier, Esteban; O’Malley, Daniel; ...

    2018-02-22

    We use simulation data from a high-fidelity Finite-Discrete Element Model to build an efficient Machine Learning (ML) approach to predict fracture growth and coalescence. Our goal is for the ML approach to be used as an emulator in place of the computationally intensive high-fidelity models in an uncertainty quantification framework where thousands of forward runs are required. The failure of materials with various fracture configurations (size, orientation and the number of initial cracks) is explored and used as data to train our ML model. This novel approach has shown promise in predicting spatial (path to failure) and temporal (time to failure) aspects of brittle material failure. Predictions of where dominant fracture paths formed within a material were ~85% accurate and the time of material failure deviated from the actual failure time by an average of ~16%. Additionally, the ML model achieves a reduction in computational cost by multiple orders of magnitude.

  19. Combined Molecular Dynamics Simulation-Molecular-Thermodynamic Theory Framework for Predicting Surface Tensions.

    PubMed

    Sresht, Vishnu; Lewandowski, Eric P; Blankschtein, Daniel; Jusufi, Arben

    2017-08-22

    A molecular modeling approach is presented with a focus on quantitative predictions of the surface tension of aqueous surfactant solutions. The approach combines classical Molecular Dynamics (MD) simulations with a molecular-thermodynamic theory (MTT) [Y. J. Nikas, S. Puvvada, D. Blankschtein, Langmuir 1992, 8, 2680]. The MD component is used to calculate thermodynamic and molecular parameters that are needed in the MTT model to determine the surface tension isotherm. The MD/MTT approach provides the important link between the surfactant bulk concentration, the experimental control parameter, and the surfactant surface concentration, the MD control parameter. We demonstrate the capability of the MD/MTT modeling approach on nonionic alkyl polyethylene glycol surfactants at the air-water interface and observe reasonable agreement of the predicted surface tensions and the experimental surface tension data over a wide range of surfactant concentrations below the critical micelle concentration. Our modeling approach can be extended to ionic surfactants and their mixtures with both ionic and nonionic surfactants at liquid-liquid interfaces.

  20. The Effect of Visual Information on the Manual Approach and Landing

    NASA Technical Reports Server (NTRS)

    Wewerinke, P. H.

    1982-01-01

    The effect of visual information in combination with basic display information on the approach performance was investigated. A pre-experimental model analysis was performed in terms of the optimal control model. The resulting aircraft approach performance predictions were compared with the results of a moving base simulator program. The results illustrate that the model provides a meaningful description of the visual (scene) perception process involved in the complex (multi-variable, time varying) manual approach task with a useful predictive capability. The theoretical framework was shown to allow a straightforward investigation of the complex interaction of a variety of task variables.

  1. Dissimilarity based Partial Least Squares (DPLS) for genomic prediction from SNPs.

    PubMed

    Singh, Priyanka; Engel, Jasper; Jansen, Jeroen; de Haan, Jorn; Buydens, Lutgarde Maria Celina

    2016-05-04

    Genomic prediction (GP) allows breeders to select plants and animals based on their breeding potential for desirable traits, without lengthy and expensive field trials or progeny testing. We have proposed to use Dissimilarity-based Partial Least Squares (DPLS) for GP. As a case study, we use the DPLS approach to predict Bacterial wilt (BW) in tomatoes using SNPs as predictors. The DPLS approach was compared with the Genomic Best-Linear Unbiased Prediction (GBLUP) and single-SNP regression with SNP as a fixed effect to assess the performance of DPLS. Eight genomic distance measures were used to quantify relationships between the tomato accessions from the SNPs. Subsequently, each of these distance measures was used to predict the BW using the DPLS prediction model. The DPLS model was found to be robust to the choice of distance measures; similar prediction performances were obtained for each distance measure. DPLS greatly outperformed the single-SNP regression approach, showing that BW is a comprehensive trait dependent on several loci. Next, the performance of the DPLS model was compared to that of GBLUP. Although GBLUP and DPLS are conceptually very different, the prediction quality (PQ) measured by DPLS models were similar to the prediction statistics obtained from GBLUP. A considerable advantage of DPLS is that the genotype-phenotype relationship can easily be visualized in a 2-D scatter plot. This so-called score-plot provides breeders an insight to select candidates for their future breeding program. DPLS is a highly appropriate method for GP. The model prediction performance was similar to the GBLUP and far better than the single-SNP approach. The proposed method can be used in combination with a wide range of genomic dissimilarity measures and genotype representations such as allele-count, haplotypes or allele-intensity values. 
    Additionally, the data can be insightfully visualized by the DPLS model, allowing for selection of desirable candidates from the breeding experiments. In this study, we have assessed the DPLS performance on a single trait.

  2. Physical and JIT Model Based Hybrid Modeling Approach for Building Thermal Load Prediction

    NASA Astrophysics Data System (ADS)

    Iino, Yutaka; Murai, Masahiko; Murayama, Dai; Motoyama, Ichiro

    Energy conservation in the building field is one of the key issues from an environmental point of view, as it is in the industrial, transportation and residential fields. Half of the total energy consumption in a building is accounted for by HVAC (Heating, Ventilating and Air Conditioning) systems. In order to realize energy conservation in HVAC systems, a thermal load prediction model for buildings is required. This paper proposes a hybrid modeling approach with physical and Just-in-Time (JIT) models for building thermal load prediction. The proposed method has the following features and benefits: (1) it is applicable to cases in which past operation data for learning the load prediction model is poor; (2) it has a self-checking function, which continuously supervises whether the data-driven load prediction and the physics-based one are consistent, so it can detect when something is wrong in the load prediction procedure; (3) it can adjust the load prediction in real time against sudden changes in model parameters and environmental conditions. The proposed method is evaluated with real operation data of an existing building, and the improvement in load prediction performance is illustrated.

  3. Analysis of the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE) in Assessing Rounding Model

    NASA Astrophysics Data System (ADS)

    Wang, Weijie; Lu, Yanmin

    2018-03-01

    Most existing Collaborative Filtering (CF) algorithms predict a rating as the preference of an active user toward a given item, which is always a decimal fraction. Meanwhile, the actual ratings in most data sets are integers. In this paper, we discuss and demonstrate why rounding can bring different influences to these two metrics, and prove that rounding is necessary in post-processing of the predicted ratings, eliminating model prediction bias and improving the accuracy of the prediction. In addition, we also propose two new rounding approaches based on the predicted rating probability distribution, which can be used to round the predicted rating to an optimal integer rating, and get better prediction accuracy compared to the Basic Rounding approach. Extensive experiments on different data sets validate the correctness of our analysis and the effectiveness of our proposed rounding approaches.
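
    The claim that rounding influences MAE and RMSE differently can be checked on a toy example. The sketch below assumes that "Basic Rounding" means rounding to the nearest integer within a 1 to 5 rating scale; the ratings are made up for illustration, and the two probability-based rounding approaches proposed in the paper are not reproduced here.

```python
import math


def mae(predicted, actual):
    """Mean absolute error."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean square error."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def basic_round(predicted, low=1, high=5):
    """Round half up to the nearest integer rating and clip to the rating scale."""
    return [min(high, max(low, math.floor(p + 0.5))) for p in predicted]


# Made-up integer ratings and decimal predictions, for illustration only.
actual = [4, 3, 5, 2]
predicted = [3.6, 3.4, 4.4, 2.4]
rounded = basic_round(predicted)

print("rounded predictions:", rounded)
print("MAE  raw vs rounded:", mae(predicted, actual), mae(rounded, actual))
print("RMSE raw vs rounded:", rmse(predicted, actual), rmse(rounded, actual))
```

    In this example three of the four rounded predictions become exact, which lowers the MAE, while the single prediction that rounds the wrong way now carries a full one-point error whose square raises the RMSE.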

  4. A Novel Approach to Adaptive Flow Separation Control

    DTIC Science & Technology

    2016-09-03

    particular, it considers control of flow separation over a NACA-0025 airfoil using microjet actuators and develops Adaptive Sampling Based Model...Predictive Control (Adaptive SBMPC), a novel approach to Nonlinear Model Predictive Control that applies the Minimal Resource Allocation Network...

  5. Experimental and Numerical Analysis of Triaxially Braided Composites Utilizing a Modified Subcell Modeling Approach

    NASA Technical Reports Server (NTRS)

    Cater, Christopher; Xiao, Xinran; Goldberg, Robert K.; Kohlman, Lee W.

    2015-01-01

    A combined experimental and analytical approach was performed for characterizing and modeling triaxially braided composites with a modified subcell modeling strategy. Tensile coupon tests were conducted on a [0deg/60deg/-60deg] braided composite at angles of 0deg, 30deg, 45deg, 60deg and 90deg relative to the axial tow of the braid. It was found that measured coupon strength varied significantly with the angle of the applied load and each coupon direction exhibited unique final failures. The subcell modeling approach implemented into the finite element software LS-DYNA was used to simulate the various tensile coupon test angles. The modeling approach was successful in predicting both the coupon strength and reported failure mode for the 0deg, 30deg and 60deg loading directions. The model over-predicted the strength in the 90deg direction; however, the experimental results show a strong influence of free edge effects on damage initiation and failure. In the absence of these local free edge effects, the subcell modeling approach showed promise as a viable and computationally efficient analysis tool for triaxially braided composite structures. Future work will focus on validation of the approach for predicting the impact response of the braided composite against flat panel impact tests.

  6. Experimental and Numerical Analysis of Triaxially Braided Composites Utilizing a Modified Subcell Modeling Approach

    NASA Technical Reports Server (NTRS)

    Cater, Christopher; Xiao, Xinran; Goldberg, Robert K.; Kohlman, Lee W.

    2015-01-01

    A combined experimental and analytical approach was performed for characterizing and modeling triaxially braided composites with a modified subcell modeling strategy. Tensile coupon tests were conducted on a [0deg/60deg/-60deg] braided composite at angles [0deg, 30deg, 45deg, 60deg and 90deg] relative to the axial tow of the braid. It was found that measured coupon strength varied significantly with the angle of the applied load and each coupon direction exhibited unique final failures. The subcell modeling approach implemented into the finite element software LS-DYNA was used to simulate the various tensile coupon test angles. The modeling approach was successful in predicting both the coupon strength and reported failure mode for the 0deg, 30deg and 60deg loading directions. The model over-predicted the strength in the 90deg direction; however, the experimental results show a strong influence of free edge effects on damage initiation and failure. In the absence of these local free edge effects, the subcell modeling approach showed promise as a viable and computationally efficient analysis tool for triaxially braided composite structures. Future work will focus on validation of the approach for predicting the impact response of the braided composite against flat panel impact tests.

  7. An Integrated Approach Linking Process to Structural Modeling With Microstructural Characterization for Injections-Molded Long-Fiber Thermoplastics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nguyen, Ba Nghiep; Bapanapalli, Satish K.; Smith, Mark T.

    2008-09-01

    The objective of our work is to enable the optimum design of lightweight automotive structural components using injection-molded long fiber thermoplastics (LFTs). To this end, an integrated approach that links process modeling to structural analysis with experimental microstructural characterization and validation is developed. First, process models for LFTs are developed and implemented into processing codes (e.g. ORIENT, Moldflow) to predict the microstructure of the as-formed composite (i.e. fiber length and orientation distributions). In parallel, characterization and testing methods are developed to obtain necessary microstructural data to validate process modeling predictions. Second, the predicted LFT composite microstructure is imported into a structural finite element analysis by ABAQUS to determine the response of the as-formed composite to given boundary conditions. At this stage, constitutive models accounting for the composite microstructure are developed to predict various types of behaviors (i.e. thermoelastic, viscoelastic, elastic-plastic, damage, fatigue, and impact) of LFTs. Experimental methods are also developed to determine material parameters and to validate constitutive models. Such a process-linked-structural modeling approach allows an LFT composite structure to be designed with confidence through numerical simulations. Some recent results of our collaborative research will be illustrated to show the usefulness and applications of this integrated approach.

  8. Predictability of weather and climate in a coupled ocean-atmosphere model: A dynamical systems approach. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Nese, Jon M.

    1989-01-01

    A dynamical systems approach is used to quantify the instantaneous and time-averaged predictability of a low-order moist general circulation model. Specifically, the effects on predictability of incorporating an active ocean circulation, implementing annual solar forcing, and asynchronously coupling the ocean and atmosphere are evaluated. The predictability and structure of the model attractors are compared using the Lyapunov exponents, the local divergence rates, and the correlation, fractal, and Lyapunov dimensions. The Lyapunov exponents measure the average rate of growth of small perturbations on an attractor, while the local divergence rates quantify phase-spatial variations of predictability. These local rates are exploited to efficiently identify and distinguish subtle differences in predictability among attractors. In addition, the predictability of monthly averaged and yearly averaged states is investigated by using attractor reconstruction techniques.
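
    The relationship between local divergence rates and the Lyapunov exponent can be illustrated on a much simpler system than the thesis's circulation model. The sketch below swaps in the one-dimensional logistic map, which is an assumption made purely for illustration: each term ln|f'(x)| is a local divergence rate at one point of the orbit, and their long-run average estimates the Lyapunov exponent. The parameter values and iteration counts are arbitrary choices.

```python
import math


def logistic_lyapunov(r, x0=0.2, n_transient=1000, n_steps=20000):
    """Estimate the Lyapunov exponent of x -> r * x * (1 - x).

    The exponent is the orbit average of the local divergence rate
    ln|f'(x)|, where f'(x) = r * (1 - 2x).
    """
    x = x0
    # Discard the transient so the average is taken on the attractor.
    for _ in range(n_transient):
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_steps):
        total += math.log(abs(r * (1.0 - 2.0 * x)))  # local divergence rate
        x = r * x * (1.0 - x)
    return total / n_steps


# A positive exponent indicates chaos (limited predictability);
# a negative exponent indicates a stable, predictable attractor.
print("r = 4.0:", logistic_lyapunov(4.0))
print("r = 2.5:", logistic_lyapunov(2.5))
```

    For r = 4 the exact exponent of the logistic map is ln 2, so the estimate should come out close to 0.69; for r = 2.5 the orbit settles on a stable fixed point and the exponent is negative.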

  9. A Personalized Predictive Framework for Multivariate Clinical Time Series via Adaptive Model Selection.

    PubMed

    Liu, Zitao; Hauskrecht, Milos

    2017-11-01

    Building an accurate predictive model of clinical time series for a patient is critical for understanding the patient's condition, its dynamics, and optimal patient management. Unfortunately, this process is not straightforward. First, patient-specific variations are typically large and population-based models derived or learned from many different patients are often unable to support accurate predictions for each individual patient. Moreover, time series observed for one patient at any point in time may be too short and insufficient to learn a high-quality patient-specific model just from the patient's own data. To address these problems we propose, develop and experiment with a new adaptive forecasting framework for building multivariate clinical time series models for a patient and for supporting patient-specific predictions. The framework relies on the adaptive model switching approach that at any point in time selects the most promising time series model out of the pool of many possible models, and consequently, combines advantages of the population, patient-specific and short-term individualized predictive models. We demonstrate that the adaptive model switching framework is a very promising approach to support personalized time series prediction, and that it is able to outperform predictions based on pure population and patient-specific models, as well as other patient-specific model adaptation strategies.
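
    The model-switching idea, selecting at each time step the most promising model from a pool, can be sketched in a few lines. The abstract does not say how "most promising" is scored, so the rule below (lowest mean absolute error over a recent window) is an assumption, and the three toy forecasters and the example series are hypothetical stand-ins for the population, patient-specific and short-term models.

```python
def last_value(history):
    """Persistence forecast: repeat the latest observation."""
    return history[-1]


def running_mean(history):
    """Forecast the mean of everything seen so far."""
    return sum(history) / len(history)


def linear_trend(history):
    """Extrapolate the latest one-step change."""
    if len(history) < 2:
        return history[-1]
    return history[-1] + (history[-1] - history[-2])


def recent_error(errors, window):
    """Mean absolute error over the last `window` steps (0.0 if no history yet)."""
    recent = errors[-window:]
    return sum(recent) / len(recent) if recent else 0.0


def switching_forecast(series, models, window=3):
    """At each step, forecast with the model that has the lowest recent error.

    Returns a list of (chosen_model_name, prediction) pairs for steps 1..len-1.
    Ties go to the model listed first in `models`.
    """
    errors = {name: [] for name in models}
    output = []
    for t in range(1, len(series)):
        history = series[:t]
        predictions = {name: model(history) for name, model in models.items()}
        best = min(models, key=lambda name: recent_error(errors[name], window))
        output.append((best, predictions[best]))
        # Score every model against the value that was actually observed.
        for name in models:
            errors[name].append(abs(predictions[name] - series[t]))
    return output


pool = {"last_value": last_value, "running_mean": running_mean, "linear_trend": linear_trend}
for name, prediction in switching_forecast([1, 2, 3, 4, 5, 6], pool):
    print(name, prediction)
```

    On a steadily rising series the selector starts with the first model in the pool and switches to the trend model once its recent errors drop below the others.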

  10. Modeling Progressive Damage Using Local Displacement Discontinuities Within the FEAMAC Multiscale Modeling Framework

    NASA Technical Reports Server (NTRS)

    Ranatunga, Vipul; Bednarcyk, Brett A.; Arnold, Steven M.

    2010-01-01

    A method for performing progressive damage modeling in composite materials and structures based on continuum level interfacial displacement discontinuities is presented. The proposed method enables the exponential evolution of the interfacial compliance, resulting in unloading of the tractions at the interface after delamination or failure occurs. In this paper, the proposed continuum displacement discontinuity model has been used to simulate failure within both isotropic and orthotropic materials efficiently and to explore the possibility of predicting the crack path, therein. Simulation results obtained from Mode-I and Mode-II fracture compare the proposed approach with the cohesive element approach and Virtual Crack Closure Techniques (VCCT) available within the ABAQUS (ABAQUS, Inc.) finite element software. Furthermore, an eccentrically loaded 3-point bend test has been simulated with the displacement discontinuity model, and the resulting crack path prediction has been compared with a prediction based on the extended finite element model (XFEM) approach.

  11. Imaging genetics approach to predict progression of Parkinson's diseases.

    PubMed

    Mansu Kim; Seong-Jin Son; Hyunjin Park

    2017-07-01

    Imaging genetics is a tool to extract genetic variants associated with both clinical phenotypes and imaging information. The approach can extract additional genetic variants compared to conventional approaches to better investigate various diseased conditions. Here, we applied imaging genetics to study Parkinson's disease (PD). We aimed to extract significant features derived from imaging genetics and neuroimaging. We built a regression model based on extracted significant features combining genetics and neuroimaging to better predict clinical scores of PD progression (i.e. MDS-UPDRS). Our model yielded high correlation (r = 0.697, p < 0.001) and low root mean squared error (8.36) between predicted and actual MDS-UPDRS scores. Neuroimaging (from 123I-Ioflupane SPECT) predictors of the regression model were computed using an independent component analysis approach. Genetic features were computed using an imaging genetics approach based on identified neuroimaging features as intermediate phenotypes. Joint modeling of neuroimaging and genetics could provide complementary information and thus have the potential to provide further insight into the pathophysiology of PD. Our model included newly found neuroimaging features and genetic variants which need further investigation.

  12. Automatically updating predictive modeling workflows support decision-making in drug design.

    PubMed

    Muegge, Ingo; Bentzien, Jörg; Mukherjee, Prasenjit; Hughes, Robert O

    2016-09-01

    Using predictive models for early decision-making in drug discovery has become standard practice. We suggest that model building needs to be automated with minimum input and low technical maintenance requirements. Models perform best when tailored to answering specific compound optimization related questions. If qualitative answers are required, 2-bin classification models are preferred. Integrating predictive modeling results with structural information stimulates better decision making. For in silico models supporting rapid structure-activity relationship cycles the performance deteriorates within weeks. Frequent automated updates of predictive models ensure best predictions. Consensus between multiple modeling approaches increases the prediction confidence. Combining qualified and nonqualified data optimally uses all available information. Dose predictions provide a holistic alternative to multiple individual property predictions for reaching complex decisions.

  13. Predictive QSAR modeling workflow, model applicability domains, and virtual screening.

    PubMed

    Tropsha, Alexander; Golbraikh, Alexander

    2007-01-01

    Quantitative Structure Activity Relationship (QSAR) modeling has been traditionally applied as an evaluative approach, i.e., with the focus on developing retrospective and explanatory models of existing data. Model extrapolation was considered, if at all, only in a hypothetical sense in terms of potential modifications of known biologically active chemicals that could improve compounds' activity. This critical review re-examines the strategy and the output of the modern QSAR modeling approaches. We provide examples and arguments suggesting that current methodologies may afford robust and validated models capable of accurate prediction of compound properties for molecules not included in the training sets. We discuss a data-analytical modeling workflow developed in our laboratory that incorporates modules for combinatorial QSAR model development (i.e., using all possible binary combinations of available descriptor sets and statistical data modeling techniques), rigorous model validation, and virtual screening of available chemical databases to identify novel biologically active compounds. Our approach places particular emphasis on model validation as well as the need to define model applicability domains in the chemistry space. We present examples of studies where the application of rigorously validated QSAR models to virtual screening identified computational hits that were confirmed by subsequent experimental investigations. The emerging focus of QSAR modeling on target property forecasting brings it forward as a predictive, as opposed to evaluative, modeling approach.

  14. An Efficient Deterministic Approach to Model-based Prediction Uncertainty Estimation

    NASA Technical Reports Server (NTRS)

    Daigle, Matthew J.; Saxena, Abhinav; Goebel, Kai

    2012-01-01

    Prognostics deals with the prediction of the end of life (EOL) of a system. EOL is a random variable, due to the presence of process noise and uncertainty in the future inputs to the system. Prognostics algorithms must account for this inherent uncertainty. In addition, these algorithms never know exactly the state of the system at the desired time of prediction, or the exact model describing the future evolution of the system, accumulating additional uncertainty into the predicted EOL. Prediction algorithms that do not account for these sources of uncertainty are misrepresenting the EOL and can lead to poor decisions based on their results. In this paper, we explore the impact of uncertainty in the prediction problem. We develop a general model-based prediction algorithm that incorporates these sources of uncertainty, and propose a novel approach to efficiently handle uncertainty in the future input trajectories of a system by using the unscented transformation. Using this approach, we are not only able to reduce the computational load but also estimate the bounds of uncertainty in a deterministic manner, which can be useful to consider during decision-making. Using a lithium-ion battery as a case study, we perform several simulation-based experiments to explore these issues, and validate the overall approach using experimental data from a battery testbed.
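
    The unscented transformation mentioned above can be illustrated with a minimal scalar sketch. It is not the authors' prognostics algorithm: the function name `unscented_transform`, the choice kappa = 2 and the toy nonlinearity are assumptions made only for illustration. The idea is to push a small, deterministic set of sigma points through a nonlinear function and recover the output mean and variance from weighted sums, with no random sampling.

```python
import math

def unscented_transform(mean, var, f, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through f using three sigma points.

    Hypothetical helper for illustration; kappa is a tuning parameter
    (n + kappa = 3 is a common heuristic for Gaussian inputs).
    """
    n = 1  # state dimension of this toy example
    spread = math.sqrt((n + kappa) * var)
    sigma_points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    # Evaluate the nonlinear function once per sigma point (deterministic).
    ys = [f(x) for x in sigma_points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var

# Toy usage: propagate x ~ N(1, 0.25) through f(x) = x^2.
y_mean, y_var = unscented_transform(1.0, 0.25, lambda x: x * x)
print(y_mean, y_var)
```

    For this quadratic toy function the three sigma points should recover the analytic mean (1.25) and variance (1.125) up to rounding, using only three function evaluations instead of a large Monte Carlo sample.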

  15. Modelling approaches for pipe inclination effect on deposition limit velocity of settling slurry flow

    NASA Astrophysics Data System (ADS)

    Matoušek, Václav; Kesely, Mikoláš; Vlasák, Pavel

    2018-06-01

    The deposition velocity is an important operation parameter in hydraulic transport of solid particles in pipelines. It represents flow velocity at which transported particles start to settle out at the bottom of the pipe and are no longer transported. A number of predictive models have been developed to determine this threshold velocity for slurry flows of different solids fractions (fractions of different grain size and density). Most of the models consider flow in a horizontal pipe only; modelling approaches for inclined flows are extremely scarce, due partially to a lack of experimental information about the effect of pipe inclination on the slurry flow pattern and behaviour. We survey different approaches to modelling of particle deposition in flowing slurry and discuss mechanisms on which deposition-limit models are based. Furthermore, we analyse possibilities to incorporate the effect of flow inclination into the predictive models and select the most appropriate ones based on their ability to modify the modelled deposition mechanisms to conditions associated with the flow inclination. The usefulness of the selected modelling approaches and their modifications is demonstrated by comparing model predictions with experimental results for inclined slurry flows from our own laboratory and from the literature.

  16. A Bayesian approach for parameter estimation and prediction using a computationally intensive model

    DOE PAGES

    Higdon, Dave; McDonnell, Jordan D.; Schunck, Nicolas; ...

    2015-02-05

    Bayesian methods have been successful in quantifying uncertainty in physics-based problems in parameter estimation and prediction. In these cases, physical measurements y are modeled as the best fit of a physics-based model $$\eta(\theta)$$, where θ denotes the uncertain, best input setting. Hence the statistical model is of the form $$y=\eta(\theta)+\epsilon,$$ where $$\epsilon$$ accounts for measurement, and possibly other, error sources. When nonlinearity is present in $$\eta(\cdot)$$, the resulting posterior distribution for the unknown parameters in the Bayesian formulation is typically complex and nonstandard, requiring computationally demanding computational approaches such as Markov chain Monte Carlo (MCMC) to produce multivariate draws from the posterior. Although generally applicable, MCMC requires thousands (or even millions) of evaluations of the physics model $$\eta(\cdot)$$. This requirement is problematic if the model takes hours or days to evaluate. To overcome this computational bottleneck, we present an approach adapted from Bayesian model calibration. This approach combines output from an ensemble of computational model runs with physical measurements, within a statistical formulation, to carry out inference. A key component of this approach is a statistical response surface, or emulator, estimated from the ensemble of model runs. We demonstrate this approach with a case study in estimating parameters for a density functional theory model, using experimental mass/binding energy measurements from a collection of atomic nuclei. Lastly, we also demonstrate how this approach produces uncertainties in predictions for recent mass measurements obtained at Argonne National Laboratory.

  17. A Deep Learning based Approach to Reduced Order Modeling of Fluids using LSTM Neural Networks

    NASA Astrophysics Data System (ADS)

    Mohan, Arvind; Gaitonde, Datta

    2017-11-01

    Reduced Order Modeling (ROM) can be used as a surrogate for prohibitively expensive simulations to model flow behavior for long time periods. ROM is predicated on extracting dominant spatio-temporal features of the flow from CFD or experimental datasets. We explore ROM development with a deep learning approach, which comprises learning functional relationships between different variables in large datasets for predictive modeling. Although deep learning and related artificial intelligence based predictive modeling techniques have shown varied success in other fields, such approaches are in their initial stages of application to fluid dynamics. Here, we explore the application of the Long Short Term Memory (LSTM) neural network to sequential data, specifically to predict the time coefficients of Proper Orthogonal Decomposition (POD) modes of the flow for future timesteps, by training it on data at previous timesteps. The approach is demonstrated by constructing ROMs of several canonical flows. Additionally, we show that statistical estimates of stationarity in the training data can indicate a priori how amenable a given flow-field is to this approach. Finally, the potential and limitations of deep learning based ROM approaches will be elucidated and further developments discussed.

  18. Body Fat Percentage Prediction Using Intelligent Hybrid Approaches

    PubMed Central

    Shao, Yuehjen E.

    2014-01-01

    Excess of body fat often leads to obesity. Obesity is typically associated with serious medical diseases, such as cancer, heart disease, and diabetes. Accordingly, knowing the body fat is an extremely important issue since it affects everyone's health. Although there are several ways to measure the body fat percentage (BFP), the accurate methods are often associated with hassle and/or high costs. Traditional single-stage approaches may use certain body measurements or explanatory variables to predict the BFP. Diverging from existing approaches, this study proposes new intelligent hybrid approaches to obtain fewer explanatory variables, and the proposed forecasting models are able to effectively predict the BFP. The proposed hybrid models consist of multiple regression (MR), artificial neural network (ANN), multivariate adaptive regression splines (MARS), and support vector regression (SVR) techniques. The first stage of the modeling includes the use of MR and MARS to obtain fewer but more important sets of explanatory variables. In the second stage, the remaining important variables serve as inputs for the other forecasting methods. A real dataset was used to demonstrate the development of the proposed hybrid models. The prediction results revealed that the proposed hybrid schemes outperformed the typical, single-stage forecasting models. PMID:24723804

  19. Climate downscaling effects on predictive ecological models: a case study for threatened and endangered vertebrates in the southeastern United States

    USGS Publications Warehouse

    Bucklin, David N.; Watling, James I.; Speroterra, Carolina; Brandt, Laura A.; Mazzotti, Frank J.; Romañach, Stephanie S.

    2013-01-01

    High-resolution (downscaled) projections of future climate conditions are critical inputs to a wide variety of ecological and socioeconomic models and are created using numerous different approaches. Here, we conduct a sensitivity analysis of spatial predictions from climate envelope models for threatened and endangered vertebrates in the southeastern United States to determine whether two different downscaling approaches (with and without the use of a regional climate model) affect climate envelope model predictions when all other sources of variation are held constant. We found that prediction maps differed spatially between downscaling approaches and that the variation attributable to downscaling technique was comparable to variation between maps generated using different general circulation models (GCMs). Precipitation variables tended to show greater discrepancies between downscaling techniques than temperature variables, and for one GCM, there was evidence that more poorly resolved precipitation variables contributed relatively more to model uncertainty than more well-resolved variables. Our work suggests that ecological modelers requiring high-resolution climate projections should carefully consider the type of downscaling applied to the climate projections prior to their use in predictive ecological modeling. The uncertainty associated with alternative downscaling methods may rival that of other, more widely appreciated sources of variation, such as the general circulation model or emissions scenario with which future climate projections are created.

  20. Efficient statistical mapping of avian count data

    USGS Publications Warehouse

    Royle, J. Andrew; Wikle, C.K.

    2005-01-01

    We develop a spatial modeling framework for count data that is efficient to implement in high-dimensional prediction problems. We consider spectral parameterizations for the spatially varying mean of a Poisson model. The spectral parameterization of the spatial process is very computationally efficient, enabling effective estimation and prediction in large problems using Markov chain Monte Carlo techniques. We apply this model to creating avian relative abundance maps from North American Breeding Bird Survey (BBS) data. Variation in the ability of observers to count birds is modeled as spatially independent noise, resulting in over-dispersion relative to the Poisson assumption. This approach represents an improvement over existing approaches used for spatial modeling of BBS data which are either inefficient for continental scale modeling and prediction or fail to accommodate important distributional features of count data thus leading to inaccurate accounting of prediction uncertainty.

  1. Droplet Deformation Prediction With the Droplet Deformation and Breakup Model (DDB)

    NASA Technical Reports Server (NTRS)

    Vargas, Mario

    2012-01-01

    The Droplet Deformation and Breakup Model was used to predict deformation of droplets approaching the leading edge stagnation line of an airfoil. The quasi-steady model was solved for each position along the droplet path. A program was developed to solve the non-linear, second order, ordinary differential equation that governs the model. A fourth order Runge-Kutta method was used to solve the equation. Experimental slip velocities from droplet breakup studies were used as input to the model which required slip velocity along the particle path. The center of mass displacement predictions were compared to the experimental measurements from the droplet breakup studies for droplets with radii in the range of 200 to 700 µm approaching the airfoil at 50 and 90 m/sec. The model predictions were good for the displacement of the center of mass for small and medium sized droplets. For larger droplets the model predictions did not agree with the experimental results.
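
    As an illustration of the numerical step described above, the following sketch applies a classical fourth order Runge-Kutta scheme to a generic second-order ODE x'' = a(t, x, x') by rewriting it as a first-order system in position and velocity. The DDB governing equation itself is not given in the abstract, so a simple harmonic oscillator stands in as a hypothetical test equation; the function name `rk4_second_order` is likewise an assumption.

```python
import math

def rk4_second_order(accel, x0, v0, t0, t_end, steps):
    """Integrate x'' = accel(t, x, v) from t0 to t_end with classical RK4.

    The second-order equation is treated as the system x' = v, v' = accel.
    """
    h = (t_end - t0) / steps
    t, x, v = t0, x0, v0
    for _ in range(steps):
        k1x, k1v = v, accel(t, x, v)
        k2x = v + 0.5 * h * k1v
        k2v = accel(t + 0.5 * h, x + 0.5 * h * k1x, v + 0.5 * h * k1v)
        k3x = v + 0.5 * h * k2v
        k3v = accel(t + 0.5 * h, x + 0.5 * h * k2x, v + 0.5 * h * k2v)
        k4x = v + h * k3v
        k4v = accel(t + h, x + h * k3x, v + h * k3v)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += h
    return x, v

# Stand-in test equation (not the DDB model): x'' = -x, x(0) = 1, x'(0) = 0.
x, v = rk4_second_order(lambda t, x, v: -x, 1.0, 0.0, 0.0, 1.0, 100)
print(x, math.cos(1.0))  # the two values should agree closely
```

    In the setting of the abstract, the acceleration function would additionally depend on the measured slip velocity at each position along the droplet path.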

  2. Towards more accurate and reliable predictions for nuclear applications

    NASA Astrophysics Data System (ADS)

    Goriely, Stephane; Hilaire, Stephane; Dubray, Noel; Lemaître, Jean-François

    2017-09-01

    The need for nuclear data far from the valley of stability, for applications such as nuclear astrophysics or future nuclear facilities, challenges the robustness as well as the predictive power of present nuclear models. Most of the nuclear data evaluation and prediction are still performed on the basis of phenomenological nuclear models. For the last decades, important progress has been achieved in fundamental nuclear physics, making it now feasible to use more reliable, but also more complex microscopic or semi-microscopic models in the evaluation and prediction of nuclear data for practical applications. Nowadays mean-field models can be tuned at the same level of accuracy as the phenomenological models, renormalized on experimental data if needed, and therefore can replace the phenomenological inputs in the evaluation of nuclear data. The latest achievements to determine nuclear masses within the non-relativistic HFB approach, including the related uncertainties in the model predictions, are discussed. Similarly, recent efforts to determine fission observables within the mean-field approach are described and compared with more traditional existing models.

  3. Addressing issues associated with evaluating prediction models for survival endpoints based on the concordance statistic.

    PubMed

    Wang, Ming; Long, Qi

    2016-09-01

    Prediction models for disease risk and prognosis play an important role in biomedical research, and evaluating their predictive accuracy in the presence of censored data is of substantial interest. The standard concordance (c) statistic has been extended to provide a summary measure of predictive accuracy for survival models. Motivated by a prostate cancer study, we address several issues associated with evaluating survival prediction models based on c-statistic with a focus on estimators using the technique of inverse probability of censoring weighting (IPCW). Compared to the existing work, we provide complete results on the asymptotic properties of the IPCW estimators under the assumption of coarsening at random (CAR), and propose a sensitivity analysis under the mechanism of noncoarsening at random (NCAR). In addition, we extend the IPCW approach as well as the sensitivity analysis to high-dimensional settings. The predictive accuracy of prediction models for cancer recurrence after prostatectomy is assessed by applying the proposed approaches. We find that the estimated predictive accuracy for the models in consideration is sensitive to NCAR assumption, and thus identify the best predictive model. Finally, we further evaluate the performance of the proposed methods in both settings of low-dimensional and high-dimensional data under CAR and NCAR through simulations. © 2016, The International Biometric Society.
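
    For orientation, the standard concordance statistic for right-censored data can be sketched as below. This is the plain pairwise (Harrell-type) estimator, not the IPCW estimators studied in the paper; the function name `harrell_c` and the toy data are assumptions. A pair of subjects is usable only when the subject with the shorter follow-up time actually had the event, and it counts as concordant when that subject also has the higher predicted risk.

```python
def harrell_c(times, events, risks):
    """Pairwise concordance for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair is comparable if subject i failed before subject j's time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in predicted risk count half
    return concordant / comparable

# Toy data: the third subject is censored at time 6.
c = harrell_c([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1])
print(c)
```

    Because censored subjects only enter pairs as the longer-lived member, this simple estimator depends on the censoring pattern, which is the kind of issue the weighting approach in the abstract is designed to address.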

  4. Promises of Machine Learning Approaches in Prediction of Absorption of Compounds.

    PubMed

    Kumar, Rajnish; Sharma, Anju; Siddiqui, Mohammed Haris; Tiwari, Rajesh Kumar

    2018-01-01

    Machine Learning (ML) is one of the fastest developing techniques in the prediction and evaluation of important pharmacokinetic properties such as absorption, distribution, metabolism and excretion. The availability of a large number of robust validation techniques for prediction models devoted to pharmacokinetics has significantly enhanced the trust and authenticity in ML approaches. A series of prediction models has been generated and used for rapid screening of compounds on the basis of absorption in the last decade. Prediction of absorption of compounds using ML models has great potential across the pharmaceutical industry as a non-animal alternative to predict absorption. However, these prediction models still have far to go to earn confidence similar to that placed in conventional experimental methods for estimation of drug absorption. Some of the general concerns are selection of appropriate ML methods and validation techniques in addition to selecting relevant descriptors and authentic data sets for the generation of prediction models. The current review explores published models of ML for the prediction of absorption using physicochemical properties as descriptors and their important conclusions. In addition, some critical challenges in acceptance of ML models for absorption are also discussed. Copyright© Bentham Science Publishers.

  5. Similarity-based multi-model ensemble approach for 1-15-day advance prediction of monsoon rainfall over India

    NASA Astrophysics Data System (ADS)

    Jaiswal, Neeru; Kishtawal, C. M.; Bhomia, Swati

    2018-04-01

    The southwest (SW) monsoon season (June, July, August and September) is the major period of rainfall over the Indian region. The present study focuses on the development of a new multi-model ensemble approach based on the similarity criterion (SMME) for the prediction of SW monsoon rainfall in the extended range. This approach is based on the assumption that training with similar conditions may provide better forecasts than the sequential training used in conventional MME approaches. In this approach, the training dataset has been selected by matching the present day condition to the archived dataset and days with the most similar conditions were identified and used for training the model. The coefficients thus generated were used for the rainfall prediction. The precipitation forecasts from four general circulation models (GCMs), viz. European Centre for Medium-Range Weather Forecasts (ECMWF), United Kingdom Meteorological Office (UKMO), National Centre for Environment Prediction (NCEP) and China Meteorological Administration (CMA) have been used for developing the SMME forecasts. The forecasts of 1-5, 6-10 and 11-15 days were generated using the newly developed approach for each pentad of June-September during the years 2008-2013 and the skill of the model was analysed using verification scores, viz. equitable skill score (ETS), mean absolute error (MAE), Pearson's correlation coefficient and Nash-Sutcliffe model efficiency index. Statistical analysis of SMME forecasts shows superior forecast skill compared to the conventional MME and the individual models for all the pentads, viz. 1-5, 6-10 and 11-15 days.
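
    Three of the verification scores named above have simple closed forms, sketched here with their textbook definitions. The function names and the toy rainfall values are assumptions for illustration; none of the numbers come from the study.

```python
import math

def mae(obs, pred):
    """Mean absolute error between observations and forecasts."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def pearson_r(obs, pred):
    """Pearson's correlation coefficient."""
    mo = sum(obs) / len(obs)
    mp = sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)

def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mo = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return 1.0 - sse / sum((o - mo) ** 2 for o in obs)

# Hypothetical observed and forecast rainfall values (arbitrary units).
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 3.5, 4.5]
print(mae(obs, pred), pearson_r(obs, pred), nash_sutcliffe(obs, pred))
```

    The toy forecast has a constant bias: the correlation is perfect, yet the Nash-Sutcliffe efficiency drops below 1, which shows why several scores are reported together.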

  6. Interpreting Disruption Prediction Models to Improve Plasma Control

    NASA Astrophysics Data System (ADS)

    Parsons, Matthew

    2017-10-01

    In order for the tokamak to be a feasible design for a fusion reactor, it is necessary to minimize damage to the machine caused by plasma disruptions. Accurately predicting disruptions is a critical capability for triggering any mitigative actions, and a modest amount of attention has been given to efforts that employ machine learning techniques to make these predictions. By monitoring diagnostic signals during a discharge, such predictive models look for signs that the plasma is about to disrupt. Typically these predictive models are interpreted simply to give a `yes' or `no' response as to whether a disruption is approaching. However, it is possible to extract further information from these models to indicate which input signals are more strongly correlated with the plasma approaching a disruption. If highly accurate predictive models can be developed, this information could be used in plasma control schemes to make better decisions about disruption avoidance. This work was supported by a Grant from the 2016-2017 Fulbright U.S. Student Program, administered by the Franco-American Fulbright Commission in France.

  7. Prediction and assimilation of surf-zone processes using a Bayesian network: Part I: Forward models

    USGS Publications Warehouse

    Plant, Nathaniel G.; Holland, K. Todd

    2011-01-01

    Prediction of coastal processes, including waves, currents, and sediment transport, can be obtained from a variety of detailed geophysical-process models with many simulations showing significant skill. This capability supports a wide range of research and applied efforts that can benefit from accurate numerical predictions. However, the predictions are only as accurate as the data used to drive the models and, given the large temporal and spatial variability of the surf zone, inaccuracies in data are unavoidable such that useful predictions require corresponding estimates of uncertainty. We demonstrate how a Bayesian-network model can be used to provide accurate predictions of wave-height evolution in the surf zone given very sparse and/or inaccurate boundary-condition data. The approach is based on a formal treatment of a data-assimilation problem that takes advantage of significant reduction of the dimensionality of the model system. We demonstrate that predictions of a detailed geophysical model of the wave evolution are reproduced accurately using a Bayesian approach. In this surf-zone application, forward prediction skill was 83%, and uncertainties in the model inputs were accurately transferred to uncertainty in output variables. We also demonstrate that if modeling uncertainties were not conveyed to the Bayesian network (i.e., perfect data or model were assumed), then overly optimistic prediction uncertainties were computed. More consistent predictions and uncertainties were obtained by including model-parameter errors as a source of input uncertainty. Improved predictions (skill of 90%) were achieved because the Bayesian network simultaneously estimated optimal parameters while predicting wave heights.

  8. Required Collaborative Work in Online Courses: A Predictive Modeling Approach

    ERIC Educational Resources Information Center

    Smith, Marlene A.; Kellogg, Deborah L.

    2015-01-01

    This article describes a predictive model that assesses whether a student will have greater perceived learning in group assignments or in individual work. The model produces correct classifications 87.5% of the time. The research is notable in that it is the first in the education literature to adopt a predictive modeling methodology using data…

  9. Patient No-Show Predictive Model Development using Multiple Data Sources for an Effective Overbooking Approach

    PubMed Central

    Hanauer, D.A.

    2014-01-01

    Background: Patient no-shows in outpatient delivery systems remain problematic. The negative impacts include underutilized medical resources, increased healthcare costs, decreased access to care, and reduced clinic efficiency and provider productivity. Objective: To develop an evidence-based predictive model for patient no-shows, and thus improve overbooking approaches in outpatient settings to reduce the negative impact of no-shows. Methods: Ten years of retrospective data were extracted from a scheduling system and an electronic health record system from a single general pediatrics clinic, consisting of 7,988 distinct patients and 104,799 visits along with variables regarding appointment characteristics, patient demographics, and insurance information. Descriptive statistics were used to explore the impact of variables on show or no-show status. Logistic regression was used to develop a no-show predictive model, which was then used to construct an algorithm to determine the no-show threshold that calculates a predicted show/no-show status. This approach aims to overbook an appointment where a scheduled patient is predicted to be a no-show. The approach was compared with two commonly-used overbooking approaches to demonstrate the effectiveness in terms of patient wait time, physician idle time, overtime and total cost. Results: From the training dataset, the optimal error rate is 10.6% with a no-show threshold being 0.74. This threshold successfully predicts the validation dataset with an error rate of 13.9%. The proposed overbooking approach demonstrated a significant reduction of at least 6% on patient waiting, 27% on overtime, and 3% on total costs compared to other common flat-overbooking methods. Conclusions: This paper demonstrates an alternative way to accommodate overbooking, accounting for the prediction of an individual patient’s show/no-show status. The predictive no-show model leads to a dynamic overbooking policy that could improve patient waiting, overtime, and total costs in a clinic day while maintaining a full scheduling capacity. PMID:25298821
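
    The threshold-selection step can be sketched as a simple scan over candidate cut-offs, keeping the one with the lowest classification error on training data. This is only a guess at the general shape of such a procedure, not the authors' algorithm: the function names, the convention that label 1 means no-show, and the toy probabilities are all assumptions.

```python
def error_rate(probs, labels, threshold):
    """Fraction of visits misclassified when predicting no-show if prob >= threshold.

    probs  : predicted no-show probabilities (e.g. from a logistic regression)
    labels : 1 for an actual no-show, 0 for a kept appointment (assumed coding)
    """
    wrong = sum((p >= threshold) != bool(y) for p, y in zip(probs, labels))
    return wrong / len(labels)

def best_threshold(probs, labels, candidates):
    """Return the candidate threshold with the lowest error rate (first on ties)."""
    return min(candidates, key=lambda t: error_rate(probs, labels, t))

# Hypothetical model outputs and outcomes for five appointments.
probs = [0.1, 0.4, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1]
t = best_threshold(probs, labels, [0.25, 0.5, 0.75, 0.85])
print(t, error_rate(probs, labels, t))
```

    A threshold chosen this way on training data would then be applied unchanged to a validation set, which is how the abstract reports separate training and validation error rates.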

  10. Statistical Approaches for Spatiotemporal Prediction of Low Flows

    NASA Astrophysics Data System (ADS)

    Fangmann, A.; Haberlandt, U.

    2017-12-01

    An adequate assessment of regional climate change impacts on streamflow requires the integration of various sources of information and modeling approaches. This study proposes simple statistical tools for inclusion into model ensembles, which are fast and straightforward in their application, yet able to yield accurate streamflow predictions in time and space. Target variables for all approaches are annual low flow indices derived from a data set of 51 records of average daily discharge for northwestern Germany. The models require input of climatic data in the form of meteorological drought indices, derived from observed daily climatic variables, averaged over the streamflow gauges' catchment areas. Four different modeling approaches are analyzed. All of them are based on multiple linear regression models that estimate low flows as a function of a set of meteorological indices and/or physiographic and climatic catchment descriptors. For the first method, individual regression models are fitted at each station, predicting annual low flow values from a set of annual meteorological indices, which are subsequently regionalized using a set of catchment characteristics. The second method combines temporal and spatial prediction within a single panel data regression model, allowing estimation of annual low flow values from input of both annual meteorological indices and catchment descriptors. The third and fourth methods represent non-stationary low flow frequency analyses and require fitting of regional distribution functions. Method three relies on a spatiotemporal prediction of an index value, method four on estimation of L-moments that adapt the regional frequency distribution to the at-site conditions. The results show that method two outperforms successive prediction in time and space. Method three also shows a high performance in the near future period, but since it relies on a stationary distribution, its application for prediction of far future changes may be problematic. Spatiotemporal prediction of L-moments appeared highly uncertain for higher-order moments resulting in unrealistic future low flow values. All in all, the results promote an inclusion of simple statistical methods in climate change impact assessment.

  11. Combining inferences from models of capture efficiency, detectability, and suitable habitat to classify landscapes for conservation of threatened bull trout

    USGS Publications Warehouse

    Peterson, J.; Dunham, J.B.

    2003-01-01

    Effective conservation efforts for at-risk species require knowledge of the locations of existing populations. Species presence can be estimated directly by conducting field-sampling surveys or alternatively by developing predictive models. Direct surveys can be expensive and inefficient, particularly for rare and difficult-to-sample species, and models of species presence may produce biased predictions. We present a Bayesian approach that combines sampling and model-based inferences for estimating species presence. The accuracy and cost-effectiveness of this approach were compared to those of sampling surveys and predictive models for estimating the presence of the threatened bull trout (Salvelinus confluentus) via simulation with existing models and empirical sampling data. Simulations indicated that a sampling-only approach would be the most effective and would result in the lowest presence and absence misclassification error rates for three thresholds of detection probability. When sampling effort was considered, however, the combined approach resulted in the lowest error rates per unit of sampling effort. Hence, lower probability-of-detection thresholds can be specified with the combined approach, resulting in lower misclassification error rates and improved cost-effectiveness.
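
    One textbook way to combine a model-based prior with sampling results is a direct application of Bayes' rule, sketched below. It is offered only as an illustration of the general idea and may differ from the authors' formulation; the function name, the assumption of independent samples with a constant detection probability, and the example numbers are all hypothetical.

```python
def posterior_presence(prior, detect_prob, n_samples):
    """Probability the species is present after n_samples surveys with no detection.

    prior       : model-based probability of presence before sampling
    detect_prob : probability one survey detects the species if it is present
    Assumes independent surveys and no false detections.
    """
    miss_if_present = (1.0 - detect_prob) ** n_samples
    # Bayes' rule: P(present | no detections)
    return prior * miss_if_present / (prior * miss_if_present + (1.0 - prior))

# Hypothetical site: habitat model gives 0.5, each survey detects with 0.5.
for n in range(4):
    print(n, posterior_presence(0.5, 0.5, n))
```

    Under these assumptions each additional empty survey lowers the probability of presence, so a site with a low model-based prior needs fewer surveys to be classified as unoccupied, which is one way a combined approach can save sampling effort.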

  12. Refining metabolic models and accounting for regulatory effects.

    PubMed

    Kim, Joonhoon; Reed, Jennifer L

    2014-10-01

    Advances in genome-scale metabolic modeling allow us to investigate and engineer metabolism at a systems level. Metabolic network reconstructions have been made for many organisms and computational approaches have been developed to convert these reconstructions into predictive models. However, due to incomplete knowledge, these reconstructions often have missing or extraneous components and interactions, which can be identified by reconciling model predictions with experimental data. Recent studies have provided methods to further improve metabolic model predictions by incorporating transcriptional regulatory interactions and high-throughput omics data to yield context-specific metabolic models. Here we discuss recent approaches for resolving model-data discrepancies and building context-specific metabolic models. Once developed, highly accurate metabolic models can be used in a variety of biotechnology applications. Copyright © 2014 Elsevier Ltd. All rights reserved.

  13. The Complexity of Developmental Predictions from Dual Process Models

    ERIC Educational Resources Information Center

    Stanovich, Keith E.; West, Richard F.; Toplak, Maggie E.

    2011-01-01

    Drawing developmental predictions from dual-process theories is more complex than is commonly realized. Overly simplified predictions drawn from such models may lead to premature rejection of the dual process approach as one of many tools for understanding cognitive development. Misleading predictions can be avoided by paying attention to several…

  14. Predictive optimal control of sewer networks using CORAL tool: application to Riera Blanca catchment in Barcelona.

    PubMed

    Puig, V; Cembrano, G; Romera, J; Quevedo, J; Aznar, B; Ramón, G; Cabot, J

    2009-01-01

    This paper deals with the global control of the Riera Blanca catchment in the Barcelona sewer network using a predictive optimal control approach. This catchment has been modelled using a conceptual modelling approach based on decomposing the catchment into subcatchments and representing them as virtual tanks. This conceptual modelling approach allows real-time model calibration and control of the sewer network. The global control problem of the Riera Blanca catchment is solved using an optimal/predictive control algorithm. To implement the predictive optimal control of the Riera Blanca catchment, a software tool named CORAL is used. The on-line control is simulated by interfacing CORAL with a high-fidelity simulator of sewer networks (MOUSE). CORAL exchanges readings from the limnimeters and gate commands with MOUSE as if it were connected to the real SCADA system. Finally, the global control results obtained using the predictive optimal control are presented and compared against the results obtained using the current local control system. The results obtained using the global control are very satisfactory compared to those obtained using the local control.

  15. A coupled melt-freeze temperature index approach in a one-layer model to predict bulk volumetric liquid water content dynamics in snow

    NASA Astrophysics Data System (ADS)

    Avanzi, Francesco; Yamaguchi, Satoru; Hirashima, Hiroyuki; De Michele, Carlo

    2016-04-01

    Liquid water in snow governs runoff dynamics and wet snow avalanche release. Moreover, it affects snow viscosity and snow albedo. As a result, measuring and modeling liquid water dynamics in snow have important implications for many scientific applications. However, measurements are usually challenging, while modeling is difficult due to an overlap of mechanical, thermal and hydraulic processes. Here, we evaluate the use of a simple one-layer one-dimensional model to predict hourly time-series of bulk volumetric liquid water content in seasonal snow. The model considers both a simple temperature-index approach (melt only) and a coupled melt-freeze temperature-index approach that is able to reconstruct melt-freeze dynamics. Performance of this approach is evaluated at three sites in Japan. These sites (Nagaoka, Shinjo and Sapporo) present multi-year time-series of snow and meteorological data, vertical profiles of snow physical properties and snowmelt lysimeter data. These data-sets are an interesting opportunity to test this application in different climatic conditions, as sites span a wide latitudinal range and are subjected to different snow conditions during the season. When melt-freeze dynamics are included in the model, results show that median absolute differences between observations and predictions of bulk volumetric liquid water content are consistently lower than 1 vol%. Moreover, the model is able to predict an observed dry condition of the snowpack in 80% of observed cases at a non-calibration site, where parameters from calibration sites are transferred. Overall, the analysis shows that a coupled melt-freeze temperature-index approach may be a valid solution to predict average wetness conditions of a snow cover at local scale.
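
    The abstract does not give the model equations, so the following is only a generic illustration of what a coupled melt-freeze temperature-index scheme can look like. The melt and freeze factors, the 0 °C threshold, the millimetre units, and the omission of drainage are all assumptions made for this sketch, not values or choices taken from the paper.

```python
def simulate_liquid_water(temps_c, melt_factor=0.2, freeze_factor=0.1,
                          threshold_c=0.0, initial_mm=0.0):
    """Hourly bulk liquid water (mm) in a one-layer snowpack.

    temps_c       -- hourly air temperatures in degrees Celsius
    melt_factor   -- hypothetical mm of melt per degree-hour above the threshold
    freeze_factor -- hypothetical mm of refreezing per degree-hour below it
    Drainage and snowpack depletion are ignored in this sketch.
    """
    liquid = initial_mm
    series = []
    for temp in temps_c:
        if temp > threshold_c:
            # Melt-only branch: liquid water grows with positive degree-hours.
            liquid += melt_factor * (temp - threshold_c)
        else:
            # Freeze branch: liquid water refreezes, but cannot become negative.
            refreeze = freeze_factor * (threshold_c - temp)
            liquid = max(0.0, liquid - refreeze)
        series.append(liquid)
    return series
```

    Setting freeze_factor to zero recovers a melt-only temperature-index scheme, which makes the two variants compared in the abstract easy to contrast on the same temperature series.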

  16. Optimal population prediction of sandhill crane recruitment based on climate-mediated habitat limitations

    USGS Publications Warehouse

    Gerber, Brian D.; Kendall, William L.; Hooten, Mevin B.; Dubovsky, James A.; Drewien, Roderick C.

    2015-01-01

    Prediction is fundamental to scientific enquiry and application; however, ecologists tend to favour explanatory modelling. We discuss a predictive modelling framework to evaluate ecological hypotheses and to explore novel/unobserved environmental scenarios to assist conservation and management decision-makers. We apply this framework to develop an optimal predictive model for juvenile (<1 year old) sandhill crane Grus canadensis recruitment of the Rocky Mountain Population (RMP). We consider spatial climate predictors motivated by hypotheses of how drought across multiple time-scales and spring/summer weather affects recruitment. Our predictive modelling framework focuses on developing a single model that includes all relevant predictor variables, regardless of collinearity. This model is then optimized for prediction by controlling model complexity using a data-driven approach that marginalizes or removes irrelevant predictors from the model. Specifically, we highlight two approaches of statistical regularization, Bayesian least absolute shrinkage and selection operator (LASSO) and ridge regression. Our optimal predictive Bayesian LASSO and ridge regression models were similar and on average 37% superior in predictive accuracy to an explanatory modelling approach. Our predictive models confirmed a priori hypotheses that drought and cold summers negatively affect juvenile recruitment in the RMP. The effects of long-term drought can be alleviated by short-term wet spring–summer months; however, the alleviation of long-term drought has a much greater positive effect on juvenile recruitment. The number of freezing days and snowpack during the summer months can also negatively affect recruitment, while spring snowpack has a positive effect. Breeding habitat, mediated through climate, is a limiting factor on population growth of sandhill cranes in the RMP, which could become more limiting with a changing climate (i.e. increased drought).
These effects are likely not unique to cranes. The alteration of hydrological patterns and water levels by drought may impact many migratory, wetland nesting birds in the Rocky Mountains and beyond. Generalizable predictive models (trained by out-of-sample fit and based on ecological hypotheses) are needed by conservation and management decision-makers. Statistical regularization improves predictions and provides a general framework for fitting models with a large number of predictors, even those with collinearity, to simultaneously identify an optimal predictive model while conducting rigorous Bayesian model selection. Our framework is important for understanding population dynamics under a changing climate and has direct applications for making harvest and habitat management decisions.
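
    How regularization removes irrelevant predictors can be shown with a small example. Note that the sketch below swaps in a plain (non-Bayesian) LASSO fitted by coordinate descent, not the Bayesian LASSO used in the study; it assumes centred data with no intercept, and the function names and penalty scaling are choices made for this illustration.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty, returning 0.0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimise (1/2n) * ||y - X b||^2 + penalty * ||b||_1 by cyclic coordinate descent.

    X is a list of rows (lists of floats), y a list of floats.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of predictor j with the partial residual
            z = 0.0    # squared norm of predictor j
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            # Coefficients whose signal falls below the penalty are set exactly to zero.
            beta[j] = soft_threshold(rho / n, penalty) / (z / n) if z > 0 else 0.0
    return beta
```

    With a zero penalty this reduces to ordinary least squares; increasing the penalty shrinks all coefficients and drops weak predictors entirely, whereas ridge regression would shrink them without setting any exactly to zero.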

  17. Gaussian functional regression for output prediction: Model assimilation and experimental design

    NASA Astrophysics Data System (ADS)

    Nguyen, N. C.; Peraire, J.

    2016-03-01

    In this paper, we introduce a Gaussian functional regression (GFR) technique that integrates multi-fidelity models with model reduction to efficiently predict the input-output relationship of a high-fidelity model. The GFR method combines the high-fidelity model with a low-fidelity model to provide an estimate of the output of the high-fidelity model in the form of a posterior distribution that can characterize uncertainty in the prediction. A reduced basis approximation is constructed upon the low-fidelity model and incorporated into the GFR method to yield an inexpensive posterior distribution of the output estimate. As this posterior distribution depends crucially on a set of training inputs at which the high-fidelity models are simulated, we develop a greedy sampling algorithm to select the training inputs. Our approach results in an output prediction model that inherits the fidelity of the high-fidelity model and has the computational complexity of the reduced basis approximation. Numerical results are presented to demonstrate the proposed approach.

  18. Computational Fluid Dynamics Simulation of Flows in an Oxidation Ditch Driven by a New Surface Aerator

    PubMed Central

    Huang, Weidong; Li, Kun; Wang, Gan; Wang, Yingzhe

    2013-01-01

    In this article, we present a newly designed inverse umbrella surface aerator and test its performance in driving the flow of an oxidation ditch. Results show that it performs better in driving the oxidation ditch than the original one, with higher average velocity and a more uniform flow field. We also present a computational fluid dynamics model for predicting the flow field in an oxidation ditch driven by a surface aerator. The improved momentum source term approach to simulate the flow field of the oxidation ditch driven by an inverse umbrella surface aerator was developed and validated through experiments. Four kinds of turbulence models were investigated with the approach, including the standard k−ɛ model, RNG k−ɛ model, realizable k−ɛ model, and Reynolds stress model, and the predicted data were compared with those calculated with the multiple rotating reference frame approach (MRF) and sliding mesh approach (SM). Results of the momentum source term approach are in good agreement with the experimental data, and its prediction accuracy is better than MRF, close to SM. It is also found that the momentum source term approach has lower computational expense, is simpler to preprocess, and is easier to use. PMID:24302850

  19. The practice of prediction: What can ecologists learn from applied, ecology-related fields?

    USGS Publications Warehouse

    Pennekamp, Frank; Adamson, Matthew; Petchey, Owen L; Poggiale, Jean-Christophe; Aguiar, Maira; Kooi, Bob W.; Botkin, Daniel B.; DeAngelis, Donald L.

    2017-01-01

    The pervasive influence of human-induced global environmental change affects biodiversity across the globe, and there is great uncertainty as to how the biosphere will react on short and longer time scales. To adapt to what the future holds and to manage the impacts of global change, scientists need to predict the expected effects with some confidence and communicate these predictions to policy makers. However, recent reviews found that we currently lack a clear understanding of how predictable ecology is, with views ranging from mostly unpredictable to potentially predictable, at least over short time frames. In applied, ecology-related fields, by contrast, predictions are more commonly formulated and reported, as well as evaluated in hindsight, potentially allowing one to define baselines of predictive proficiency in these fields. We searched the literature for representative case studies in these fields and collected information about modeling approaches, target variables of prediction, predictive proficiency achieved, as well as the availability of data to parameterize predictive models. We find that some fields such as epidemiology achieve high predictive proficiency, but even in the more predictive fields proficiency is evaluated in different ways. Both phenomenological and mechanistic approaches are used in most fields, but differences are often small, with no clear superiority of one approach over the other. Data availability is limiting in most fields, with long-term studies being rare and detailed data for parameterizing mechanistic models being in short supply. We suggest that ecologists adopt a more rigorous approach to report and assess predictive proficiency, and embrace the challenges of real-world decision making to strengthen the practice of prediction in ecology.

  20. An Efficient Pattern Mining Approach for Event Detection in Multivariate Temporal Data

    PubMed Central

    Batal, Iyad; Cooper, Gregory; Fradkin, Dmitriy; Harrison, James; Moerchen, Fabian; Hauskrecht, Milos

    2015-01-01

    This work proposes a pattern mining approach to learn event detection models from complex multivariate temporal data, such as electronic health records. We present Recent Temporal Pattern mining, a novel approach for efficiently finding predictive patterns for event detection problems. This approach first converts the time series data into time-interval sequences of temporal abstractions. It then constructs more complex time-interval patterns backward in time using temporal operators. We also present the Minimal Predictive Recent Temporal Patterns framework for selecting a small set of predictive and non-spurious patterns. We apply our methods for predicting adverse medical events in real-world clinical data. The results demonstrate the benefits of our methods in learning accurate event detection models, which is a key step for developing intelligent patient monitoring and decision support systems. PMID:26752800
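
    The first step described above, converting a time series into time-interval sequences of temporal abstractions, can be sketched as follows. This is a simplified illustration rather than the authors' method: it assumes a value abstraction with three states (low, normal, high) defined by two hypothetical cut-offs, and the function name is invented for this example.

```python
def to_state_intervals(times, values, low, high):
    """Convert a time series into (state, start, end) intervals.

    Each value is mapped to "L" (below low), "H" (above high) or "N" (otherwise),
    and consecutive observations with the same state are merged into one interval.
    """
    intervals = []
    for t, v in zip(times, values):
        state = "L" if v < low else "H" if v > high else "N"
        if intervals and intervals[-1][0] == state:
            # Same state as the previous observation: extend the open interval.
            intervals[-1] = (state, intervals[-1][1], t)
        else:
            # State changed: start a new interval at this time point.
            intervals.append((state, t, t))
    return intervals
```

    For example, readings 5, 6, 12, 13, 8 at times 1 to 5 with cut-offs 7 and 10 become a low interval over times 1 to 2, a high interval over times 3 to 4, and a normal interval at time 5. Temporal patterns can then be built over such intervals.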

  1. A hybrid modelling approach for predicting ground vibration from trains

    NASA Astrophysics Data System (ADS)

    Triepaischajonsak, N.; Thompson, D. J.

    2015-01-01

    The prediction of ground vibration from trains presents a number of difficulties. The ground is effectively an infinite medium, often with a layered structure and with properties that may vary greatly from one location to another. The vibration from a passing train forms a transient event, which limits the usefulness of steady-state frequency domain models. Moreover, there is often a need to consider vehicle/track interaction in more detail than is commonly used in frequency domain models, such as the 2.5D approach, while maintaining the computational efficiency of the latter. However, full time-domain approaches involve large computation times, particularly where three-dimensional ground models are required. Here, a hybrid modelling approach is introduced. The vehicle/track interaction is calculated in the time domain in order to be able to account directly for effects such as the discrete sleeper spacing. Forces acting on the ground are extracted from this first model and used in a second model to predict the ground response at arbitrary locations. In the present case the second model is a layered ground model operating in the frequency domain. Validation of the approach is provided by comparison with an existing frequency domain model. The hybrid model is then used to study the sleeper-passing effect, which is shown to be less significant than excitation due to track unevenness in all the cases considered.

  2. Sweat loss prediction using a multi-model approach

    NASA Astrophysics Data System (ADS)

    Xu, Xiaojiang; Santee, William R.

    2011-07-01

    A new multi-model approach (MMA) for sweat loss prediction is proposed to improve prediction accuracy. MMA was computed as the average of sweat loss predicted by two existing thermoregulation models: i.e., the rational model SCENARIO and the empirical model Heat Strain Decision Aid (HSDA). Three independent physiological datasets, a total of 44 trials, were used to compare predictions by MMA, SCENARIO, and HSDA. The observed sweat losses were collected under different combinations of uniform ensembles, environmental conditions (15-40°C, RH 25-75%), and exercise intensities (250-600 W). Root mean square deviation (RMSD), residual plots, and paired t tests were used to compare predictions with observations. Overall, MMA reduced RMSD by 30-39% in comparison with either SCENARIO or HSDA, and increased the prediction accuracy to 66% from 34% or 55%. Of the MMA predictions, 70% fell within the range of mean observed value ± SD, while only 43% of SCENARIO and 50% of HSDA predictions fell within the same range. Paired t tests showed that differences between observations and MMA predictions were not significant, but differences between observations and SCENARIO or HSDA predictions were significantly different for two datasets. Thus, MMA predicted sweat loss more accurately than either of the two single models for the three datasets used. Future work will be to evaluate MMA using additional physiological data to expand the scope of populations and conditions.
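
    The two computations named in the abstract, averaging the predictions of two models and scoring them by root mean square deviation, can be sketched in a few lines. The function names and the numbers in the usage note are hypothetical; only the averaging rule and the RMSD metric come from the abstract.

```python
import math


def multi_model_average(pred_a, pred_b):
    """Element-wise mean of two models' predictions for the same trials."""
    if len(pred_a) != len(pred_b):
        raise ValueError("both models must predict the same trials")
    return [(a + b) / 2.0 for a, b in zip(pred_a, pred_b)]


def rmsd(predicted, observed):
    """Root mean square deviation between predictions and observations."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

    As a made-up illustration, if one model predicts 1.5 and 2.5 and the other 0.5 and 1.5 for observed values of 1.0 and 2.0, each model alone has an RMSD of 0.5 while their average matches the observations exactly. Averaging helps most when the two models' errors have opposite signs.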

  3. Stochastic or statistic? Comparing flow duration curve models in ungauged basins and changing climates

    NASA Astrophysics Data System (ADS)

    Müller, M. F.; Thompson, S. E.

    2015-09-01

    The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by a strong wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are strongly favored over statistical models.
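
    For readers unfamiliar with the quantity being predicted, an empirical flow duration curve pairs each observed flow with the fraction of time it is equalled or exceeded. The sketch below builds one from gauged data; the Weibull plotting position rank / (n + 1) is an assumption of this example, and the abstract does not say which estimator the authors used.

```python
def flow_duration_curve(flows):
    """Return [(exceedance_probability, flow), ...] sorted from high to low flow.

    Uses the Weibull plotting position rank / (n + 1), where rank 1 is the
    largest observed flow.
    """
    ordered = sorted(flows, reverse=True)
    n = len(ordered)
    return [(rank / (n + 1), q) for rank, q in enumerate(ordered, start=1)]
```

    In an ungauged basin no such record exists, which is why the curve has to be predicted from interpolated statistical parameters or from a process-based model.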

  4. Comparing statistical and process-based flow duration curve models in ungauged basins and changing rain regimes

    NASA Astrophysics Data System (ADS)

    Müller, M. F.; Thompson, S. E.

    2016-02-01

    The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by frequent wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are favored over statistical models.
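
    The Nash-Sutcliffe coefficient quoted above compares the squared error of a prediction with the variance of the observations: a value of 1 indicates a perfect match, and 0 indicates a prediction no better than the observed mean. A minimal sketch of the standard formula follows; the function name is chosen for this example.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0.0:
        raise ValueError("observations are constant; efficiency is undefined")
    return 1.0 - error / variance
```

    When applied to flow duration curves, the observed and simulated inputs would be flow quantiles at matching exceedance probabilities; whether the study scored raw or transformed flows is not stated in the abstract.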

  5. Multistep-Ahead Air Passengers Traffic Prediction with Hybrid ARIMA-SVMs Models

    PubMed Central

    Ming, Wei; Xiong, Tao

    2014-01-01

    Hybrid ARIMA-SVMs prediction models, established recently, take advantage of the respective strengths of ARIMA and SVMs models in linear and nonlinear modeling. Building upon these hybrid ARIMA-SVMs models, this study extends them to multistep-ahead prediction of air passenger traffic with the two most commonly used multistep-ahead prediction strategies, that is, the iterated strategy and the direct strategy. Additionally, the effectiveness of data preprocessing approaches, such as deseasonalization and detrending, is investigated and verified along with the two strategies. Real data sets including four selected airlines' monthly series were collected to justify the effectiveness of the proposed approach. Empirical results demonstrate that the direct strategy performs better than the iterated one in long-term prediction, while the iterated one performs better in short-term prediction. Furthermore, both deseasonalization and detrending can significantly improve the prediction accuracy for both strategies, indicating the necessity of data preprocessing. As such, this study serves as a reference for planners in the air transportation industry on how to tackle multistep-ahead prediction tasks with either prediction strategy. PMID:24723814
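
    The difference between the iterated and direct strategies is easiest to see with a deliberately simple forecaster. The sketch below replaces the ARIMA-SVMs hybrid with a one-lag linear model through the origin, purely for illustration; the function names are hypothetical and nothing here reproduces the study's models.

```python
def fit_slope(xs, ys):
    """Least-squares slope of y = a * x (no intercept)."""
    denom = sum(x * x for x in xs)
    if denom == 0:
        raise ValueError("cannot fit a slope to all-zero inputs")
    return sum(x * y for x, y in zip(xs, ys)) / denom


def iterated_forecast(series, horizon):
    """Fit a one-step model once, then feed each forecast back in horizon times."""
    if horizon < 1 or len(series) < 2:
        raise ValueError("need horizon >= 1 and at least two observations")
    a = fit_slope(series[:-1], series[1:])
    value = series[-1]
    for _ in range(horizon):
        value = a * value
    return value


def direct_forecast(series, horizon):
    """Fit a separate model that maps y[t] straight to y[t + horizon]."""
    if horizon < 1 or len(series) <= horizon:
        raise ValueError("need horizon >= 1 and more observations than the horizon")
    a_h = fit_slope(series[:-horizon], series[horizon:])
    return a_h * series[-1]
```

    The iterated strategy reuses one model, so one-step errors can compound over long horizons; the direct strategy needs one model per horizon and has fewer training pairs for each. This trade-off is consistent with the finding that each strategy wins at a different horizon.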

  6. Alcohol-Approach Inclinations and Drinking Identity as Predictors of Behavioral Economic Demand for Alcohol

    PubMed Central

    Ramirez, Jason J.; Dennhardt, Ashley A.; Baldwin, Scott A.; Murphy, James G.; Lindgren, Kristen P.

    2016-01-01

    Behavioral economic demand curve indices of alcohol consumption reflect decisions to consume alcohol at varying costs. Although these indices predict alcohol-related problems beyond established predictors, little is known about the determinants of elevated demand. Two cognitive constructs that may underlie alcohol demand are alcohol-approach inclinations and drinking identity. The aim of this study was to evaluate implicit and explicit measures of these constructs as predictors of alcohol demand curve indices. College student drinkers (N = 223, 59% female) completed implicit and explicit measures of drinking identity and alcohol-approach inclinations at three timepoints separated by three-month intervals, and completed the Alcohol Purchase Task to assess demand at Time 3. Given no change in our alcohol-approach inclinations and drinking identity measures over time, random intercept-only models were used to predict two demand indices: Amplitude, which represents maximum hypothetical alcohol consumption and expenditures, and Persistence, which represents sensitivity to increasing prices. When modeled separately, implicit and explicit measures of drinking identity and alcohol-approach inclinations positively predicted demand indices. When implicit and explicit measures were included in the same model, both measures of drinking identity predicted Amplitude, but only explicit drinking identity predicted Persistence. In contrast, explicit measures of alcohol-approach inclinations, but not implicit measures, predicted both demand indices. Therefore, there was more support for explicit, versus implicit, measures as unique predictors of alcohol demand. Overall, drinking identity and alcohol-approach inclinations both exhibit positive associations with alcohol demand and represent potentially modifiable cognitive constructs that may underlie elevated demand in college student drinkers. PMID:27379444

  7. A semi-supervised learning approach for RNA secondary structure prediction.

    PubMed

    Yonemoto, Haruka; Asai, Kiyoshi; Hamada, Michiaki

    2015-08-01

    RNA secondary structure prediction is a key technology in RNA bioinformatics. Most algorithms for RNA secondary structure prediction use probabilistic models, in which the model parameters are trained with reliable RNA secondary structures. Because of the difficulty of determining RNA secondary structures by experimental procedures, such as NMR or X-ray crystal structural analyses, there are still many RNA sequences that could be useful for training whose secondary structures have not been experimentally determined. In this paper, we introduce a novel semi-supervised learning approach for training parameters in a probabilistic model of RNA secondary structures in which we employ not only RNA sequences with annotated secondary structures but also ones with unknown secondary structures. Our model is based on a hybrid of generative (stochastic context-free grammars) and discriminative models (conditional random fields) that has been successfully applied to natural language processing. Computational experiments indicate that the accuracy of secondary structure prediction is improved by incorporating RNA sequences with unknown secondary structures into training. To our knowledge, this is the first study of a semi-supervised learning approach for RNA secondary structure prediction. This technique will be useful when the number of reliable structures is limited. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Practical guidance on representing the heteroscedasticity of residual errors of hydrological predictions

    NASA Astrophysics Data System (ADS)

    McInerney, David; Thyer, Mark; Kavetski, Dmitri; Kuczera, George

    2016-04-01

    Appropriate representation of residual errors in hydrological modelling is essential for accurate and reliable probabilistic streamflow predictions. In particular, residual errors of hydrological predictions are often heteroscedastic, with large errors associated with high runoff events. Although multiple approaches exist for representing this heteroscedasticity, few if any studies have undertaken a comprehensive evaluation and comparison of these approaches. This study fills this research gap by evaluating a range of approaches for representing heteroscedasticity in residual errors. These approaches include the 'direct' weighted least squares approach and 'transformational' approaches, such as logarithmic, Box-Cox (with and without fitting the transformation parameter), logsinh and the inverse transformation. The study reports (1) theoretical comparison of heteroscedasticity approaches, (2) empirical evaluation of heteroscedasticity approaches using a range of multiple catchments / hydrological models / performance metrics and (3) interpretation of empirical results using theory to provide practical guidance on the selection of heteroscedasticity approaches. Importantly, for hydrological practitioners, the results will simplify the choice of approaches to represent heteroscedasticity. This will enhance their ability to provide hydrological probabilistic predictions with the best reliability and precision for different catchment types (e.g. high/low degree of ephemerality).
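
    The 'transformational' approaches mentioned above compute residuals on transformed flows so that errors at high and low flows become comparable. The sketch below shows the standard Box-Cox transformation, which reduces to the logarithm when its parameter is zero; the offset argument and the function names are assumptions of this example, and no parameter values from the study are implied.

```python
import math


def box_cox(flow, lam, offset=0.0):
    """Box-Cox transform of a positive flow; lam = 0 gives the natural logarithm."""
    shifted = flow + offset
    if shifted <= 0:
        raise ValueError("flow + offset must be positive")
    if lam == 0:
        return math.log(shifted)
    return (shifted ** lam - 1.0) / lam


def transformed_residual(observed, simulated, lam, offset=0.0):
    """Residual between observed and simulated flow in transformed space."""
    return box_cox(observed, lam, offset) - box_cox(simulated, lam, offset)
```

    With lam = 0, a 10% overestimate yields the same residual at a flow of 1 as at a flow of 100, whereas the raw residuals would differ by a factor of 100. With lam = 1 the transform only shifts the flows, so the residuals are unchanged and no heteroscedasticity is accounted for.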

  9. A Five-Stage Prediction-Observation-Explanation Inquiry-Based Learning Model to Improve Students' Learning Performance in Science Courses

    ERIC Educational Resources Information Center

    Hsiao, Hsien-Sheng; Chen, Jyun-Chen; Hong, Jon-Chao; Chen, Po-Hsi; Lu, Chow-Chin; Chen, Sherry Y.

    2017-01-01

    A five-stage prediction-observation-explanation inquiry-based learning (FPOEIL) model was developed to improve students' scientific learning performance. In order to intensify the science learning effect, the repertory grid technology-assisted learning (RGTL) approach and the collaborative learning (CL) approach were utilized. A quasi-experimental…

  10. Students’ Achievement Goals, Learning-Related Emotions and Academic Achievement

    PubMed Central

    Lüftenegger, Marko; Klug, Julia; Harrer, Katharina; Langer, Marie; Spiel, Christiane; Schober, Barbara

    2016-01-01

    In the present research, the recently proposed 3 × 2 model of achievement goals is tested and associations with achievement emotions and their joint influence on academic achievement are investigated. The study was conducted with 388 students using the 3 × 2 Achievement Goal Questionnaire including the six proposed goal constructs (task-approach, task-avoidance, self-approach, self-avoidance, other-approach, other-avoidance) and the enjoyment and boredom scales from the Achievement Emotion Questionnaire. Exam grades were used as an indicator of academic achievement. Findings from CFAs provided strong support for the proposed structure of the 3 × 2 achievement goal model. Self-based goals, other-based goals and task-approach goals predicted enjoyment. Task-approach goals negatively predicted boredom. Task-approach and other-approach predicted achievement. The indirect effects of achievement goals through emotion variables on achievement were assessed using bias-corrected bootstrapping. No mediation effects were found. Implications for educational practice are discussed. PMID:27199836

  11. Clinical Predictive Modeling Development and Deployment through FHIR Web Services.

    PubMed

    Khalilia, Mohammed; Choi, Myung; Henderson, Amelia; Iyengar, Sneha; Braunstein, Mark; Sun, Jimeng

    2015-01-01

    Clinical predictive modeling involves two challenging tasks: model development and model deployment. In this paper we demonstrate a software architecture for developing and deploying clinical predictive models using web services via the Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) standard. The services enable model development using electronic health records (EHRs) stored in OMOP CDM databases and model deployment for scoring individual patients through FHIR resources. The MIMIC2 ICU dataset and a synthetic outpatient dataset were transformed into OMOP CDM databases for predictive model development. The resulting predictive models are deployed as FHIR resources, which receive requests of patient information, perform prediction against the deployed predictive model and respond with prediction scores. To assess the practicality of this approach we evaluated the response and prediction time of the FHIR modeling web services. We found the system to be reasonably fast with one second total response time per patient prediction.

  12. Clinical Predictive Modeling Development and Deployment through FHIR Web Services

    PubMed Central

    Khalilia, Mohammed; Choi, Myung; Henderson, Amelia; Iyengar, Sneha; Braunstein, Mark; Sun, Jimeng

    2015-01-01

    Clinical predictive modeling involves two challenging tasks: model development and model deployment. In this paper we demonstrate a software architecture for developing and deploying clinical predictive models using web services via the Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) standard. The services enable model development using electronic health records (EHRs) stored in OMOP CDM databases and model deployment for scoring individual patients through FHIR resources. The MIMIC2 ICU dataset and a synthetic outpatient dataset were transformed into OMOP CDM databases for predictive model development. The resulting predictive models are deployed as FHIR resources, which receive requests of patient information, perform prediction against the deployed predictive model and respond with prediction scores. To assess the practicality of this approach we evaluated the response and prediction time of the FHIR modeling web services. We found the system to be reasonably fast with one second total response time per patient prediction. PMID:26958207

  13. Near infrared spectroscopy to estimate the temperature reached on burned soils: strategies to develop robust models.

    NASA Astrophysics Data System (ADS)

    Guerrero, César; Pedrosa, Elisabete T.; Pérez-Bejarano, Andrea; Keizer, Jan Jacob

    2014-05-01

    The temperature reached on soils is an important parameter needed to describe wildfire effects. However, methods for measuring the temperature reached on burned soils have been poorly developed. Recently, near-infrared (NIR) spectroscopy has been pointed out as a valuable tool for this purpose. The NIR spectrum of a soil sample contains information on the organic matter (quantity and quality), clay (quantity and quality), minerals (such as carbonates and iron oxides) and water contents. Some of these components are modified by heat, and each temperature causes a group of changes, leaving a typical fingerprint on the NIR spectrum. This technique requires a model (or calibration) in which the changes in the NIR spectra are related to the temperature reached. For the development of the model, several aliquots are heated at known temperatures and used as standards in the calibration set. This model makes it possible to estimate the temperature reached on a burned sample from its NIR spectrum. However, the estimation of the temperature reached using NIR spectroscopy is due to changes in several components, and cannot be attributed to changes in a single soil component. Thus, we can estimate the temperature reached through the interaction between temperature and the thermo-sensitive soil components. In addition, we cannot expect a uniform distribution of these components, even at small scale. Consequently, the proportion of these soil components can vary spatially across the site. This variation will be present in the samples used to construct the model and also in the samples affected by the wildfire. Therefore, the strategies followed to develop robust models should focus on managing this expected variation. In this work we compared the prediction accuracy of models constructed with different approaches. These approaches were designed to provide insights into how to distribute the effort needed for the development of robust models, since this step is the bottleneck of this technique. In the first approach, a plot-scale model was used to predict the temperature reached in samples collected in other plots from the same site. In a plot-scale model, all the heated aliquots come from a single plot-scale sample. As expected, the results obtained with this approach were disappointing, because this approach assumed that a plot-scale model would be enough to represent the whole variability of the site. The accuracy (measured as the root mean square error of prediction, hereinafter RMSEP) was 86°C, and the bias was also high (>30°C). In the second approach, the temperatures predicted through several plot-scale models were averaged. The accuracy improved (RMSEP=65°C) with respect to the first approach, because the variability from several plots was considered and biased predictions were partially counterbalanced. However, this approach implies more effort, since several plot-scale models are needed. In the third approach, the predictions were obtained with site-scale models. These models were constructed with aliquots from several plots. In this case, the results were accurate, since the RMSEP was around 40°C, the bias was very small (<1°C) and the R2 was 0.92. As expected, this approach clearly outperformed the second approach, in spite of the fact that the same effort was needed. In a plot-scale model, only one interaction between temperature and soil components was modelled. However, several different interactions between temperature and soil components were present in the calibration matrix of a site-scale model. Consequently, the site-scale models were able to model the temperature reached excluding the influence of the differences in soil composition, resulting in models that are more robust with respect to that variation. In summary, the results highlight the importance of an adequate strategy to develop robust and accurate models with moderate effort, and how a wrong strategy can result in disappointing predictions.
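
    The two accuracy measures named in this abstract, and the averaging of several plot-scale models used in the second approach, can be sketched as follows. This is a minimal illustration only: the function names and all temperature values are hypothetical and are not taken from the study.

```python
import math

def rmsep(predicted, observed):
    """Root mean square error of prediction."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

def bias(predicted, observed):
    """Mean signed error (predicted minus observed)."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)

def average_models(predictions_per_model):
    """Average, sample by sample, the predictions of several models."""
    return [sum(vals) / len(vals) for vals in zip(*predictions_per_model)]

# Hypothetical temperatures in degrees C, invented for illustration only.
observed = [200.0, 300.0, 400.0, 500.0]
plot_model_a = [260.0, 350.0, 470.0, 560.0]  # a plot-scale model biased high
plot_model_b = [150.0, 260.0, 340.0, 450.0]  # a plot-scale model biased low
averaged = average_models([plot_model_a, plot_model_b])

print(bias(plot_model_a, observed), bias(plot_model_b, observed))
print(bias(averaged, observed), rmsep(averaged, observed))
```

    In this toy case the two models err in opposite directions, so averaging cancels most of the bias. That mirrors the abstract's remark that biased predictions were partially counterbalanced, although real plot-scale models need not offset each other so neatly.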

  14. Constrained off-line synthesis approach of model predictive control for networked control systems with network-induced delays.

    PubMed

    Tang, Xiaoming; Qu, Hongchun; Wang, Ping; Zhao, Meng

    2015-03-01

    This paper investigates the off-line synthesis approach of model predictive control (MPC) for a class of networked control systems (NCSs) with network-induced delays. A new augmented model, which can be readily applied to a time-varying control law, is proposed to describe the NCS where bounded deterministic network-induced delays may occur in both sensor to controller (S-A) and controller to actuator (C-A) links. Based on this augmented model, a sufficient condition for closed-loop stability is derived by applying the Lyapunov method. The off-line synthesis approach of model predictive control is addressed using the stability results of the system, and explicitly considers the satisfaction of input and state constraints. A numerical example is given to illustrate the effectiveness of the proposed method. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.

  15. Exploration of Machine Learning Approaches to Predict Pavement Performance

    DOT National Transportation Integrated Search

    2018-03-23

    Machine learning (ML) techniques were used to model and predict pavement condition index (PCI) for various pavement types using a variety of input variables. The primary objective of this research was to develop and assess PCI predictive models for t...

  16. Integrating predictive information into an agro-economic model to guide agricultural management

    NASA Astrophysics Data System (ADS)

    Zhang, Y.; Block, P.

    2016-12-01

    Skillful season-ahead climate predictions linked with responsive agricultural planning and management have the potential to reduce losses, if adopted by farmers, particularly for rainfed-dominated agriculture such as in Ethiopia. Precipitation predictions during the growing season in major agricultural regions of Ethiopia are used to generate predicted climate yield factors, which reflect the influence of precipitation amounts on crop yields and serve as inputs into an agro-economic model. The adapted model, originally developed by the International Food Policy Research Institute, produces outputs of economic indices (GDP, poverty rates, etc.) at zonal and national levels. Forecast-based approaches, in which farmers' actions are in response to forecasted conditions, are compared with no-forecast approaches in which farmers follow business as usual practices, expecting "average" climate conditions. The effects of farmer adoption rates, including the potential for reduced uptake due to poor predictions, and increasing forecast lead-time on economic outputs are also explored. Preliminary results indicate superior gains under forecast-based approaches.

  17. How can machine-learning methods assist in virtual screening for hyperuricemia? A healthcare machine-learning approach.

    PubMed

    Ichikawa, Daisuke; Saito, Toki; Ujita, Waka; Oyama, Hiroshi

    2016-12-01

    Our purpose was to develop a new machine-learning approach (a virtual health check-up) toward identification of those at high risk of hyperuricemia. Applying the system to general health check-ups is expected to reduce medical costs compared with administering an additional test. Data were collected during annual health check-ups performed in Japan between 2011 and 2013 (inclusive). We prepared training and test datasets from the health check-up data to build prediction models; these were composed of 43,524 and 17,789 persons, respectively. Gradient-boosting decision tree (GBDT), random forest (RF), and logistic regression (LR) approaches were trained using the training dataset and were then used to predict hyperuricemia in the test dataset. Undersampling was applied to build the prediction models to deal with the imbalanced class dataset. The results showed that the RF and GBDT approaches afforded the best performances in terms of sensitivity and specificity, respectively. The area under the curve (AUC) values of the models, which reflected the total discriminative ability of the classification, were 0.796 [95% confidence interval (CI): 0.766-0.825] for the GBDT, 0.784 [95% CI: 0.752-0.815] for the RF, and 0.785 [95% CI: 0.752-0.819] for the LR approaches. No significant differences were observed between pairs of each approach. Small changes occurred in the AUCs after applying undersampling to build the models. We developed a virtual health check-up that predicted the development of hyperuricemia using machine-learning methods. The GBDT, RF, and LR methods had similar predictive capability. Undersampling did not remarkably improve predictive power. Copyright © 2016 Elsevier Inc. All rights reserved.
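
    Two of the techniques mentioned in this abstract, random undersampling of the majority class and the AUC as a measure of discriminative ability, can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, the undersampling ratio (1:1) is an assumption, and the scores are invented.

```python
import random

def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in keep], [labels[i] for i in keep]

def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented toy data: 2 positives among 8 training rows.
train_rows = list(range(8))
train_labels = [1, 0, 0, 0, 0, 0, 1, 0]
balanced_rows, balanced_labels = undersample(train_rows, train_labels)

# Invented scores from some classifier on a held-out test set.
test_scores = [0.9, 0.8, 0.3, 0.1]
test_labels = [1, 0, 1, 0]
print(balanced_labels.count(1), balanced_labels.count(0), auc(test_scores, test_labels))
```

    Undersampling is applied to the training data only; the test set keeps its original class balance so that the AUC reflects the population the model would be used on.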

  18. Spatiotemporal Bayesian networks for malaria prediction.

    PubMed

    Haddawy, Peter; Hasan, A H M Imrul; Kasantikul, Rangwan; Lawpoolsri, Saranath; Sa-Angchai, Patiwat; Kaewkungwal, Jaranit; Singhasivanon, Pratap

    2018-01-01

    Targeted intervention and resource allocation are essential for effective malaria control, particularly in remote areas, with predictive models providing important information for decision making. While a diversity of modeling techniques have been used to create predictive models of malaria, no work has made use of Bayesian networks. Bayes nets are attractive due to their ability to represent uncertainty, model time-lagged and nonlinear relations, and provide explanations. This paper explores the use of Bayesian networks to model malaria, demonstrating the approach by creating village level models with weekly temporal resolution for Tha Song Yang district in northern Thailand. The networks are learned using data on cases and environmental covariates. Three types of networks are explored: networks for numeric prediction, networks for outbreak prediction, and networks that incorporate spatial autocorrelation. Evaluation of the numeric prediction network shows that the Bayes net has prediction accuracy in terms of mean absolute error of about 1.4 cases for 1-week prediction and 1.7 cases for 6-week prediction. The network for outbreak prediction has an ROC AUC above 0.9 for all prediction horizons. Comparison of prediction accuracy of both Bayes nets against several traditional modeling approaches shows the Bayes nets to outperform the other models for longer time horizon prediction of high incidence transmission. To model spread of malaria over space, we elaborate the models with links between the village networks. This results in some very large models which would be far too laborious to build by hand. So we represent the models as collections of probability logic rules and automatically generate the networks. Evaluation of the models shows that the autocorrelation links significantly improve prediction accuracy for some villages in regions of high incidence. We conclude that spatiotemporal Bayesian networks are a highly promising modeling alternative for prediction of malaria and other vector-borne diseases. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Prediction of Cell Wall Properties and Response to Deconstruction Using Alkaline Pretreatment in Diverse Maize Genotypes Using Py-MBMS and NIR

    DOE PAGES

    Li, Muyang; Williams, Daniel L.; Heckwolf, Marlies; ...

    2016-10-04

    In this paper, we explore the ability of several characterization approaches for phenotyping to extract information about plant cell wall properties in diverse maize genotypes with the goal of identifying approaches that could be used to predict the plant's response to deconstruction in a biomass-to-biofuel process. Specifically, a maize diversity panel was subjected to two high-throughput biomass characterization approaches, pyrolysis molecular beam mass spectrometry (py-MBMS) and near-infrared (NIR) spectroscopy, and chemometric models to predict a number of plant cell wall properties as well as enzymatic hydrolysis yields of glucose following either no pretreatment or with mild alkaline pretreatment. These were compared to multiple linear regression (MLR) models developed from quantified properties. We were able to demonstrate that direct correlations to specific mass spectrometry ions from pyrolysis as well as characteristic regions of the second derivative of the NIR spectrum regions were comparable in their predictive capability to partial least squares (PLS) models for p-coumarate content, while the direct correlation to the spectral data was superior to the PLS for Klason lignin content and guaiacyl monomer release by thioacidolysis as assessed by cross-validation. The PLS models for prediction of hydrolysis yields using either py-MBMS or NIR spectra were superior to MLR models based on quantified properties for unpretreated biomass. However, the PLS models using the two high-throughput characterization approaches could not predict hydrolysis following alkaline pretreatment while MLR models based on quantified properties could. This is likely a consequence of quantified properties including some assessments of pretreated biomass, while the py-MBMS and NIR only utilized untreated biomass.

  20. Prediction of Cell Wall Properties and Response to Deconstruction Using Alkaline Pretreatment in Diverse Maize Genotypes Using Py-MBMS and NIR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Muyang; Williams, Daniel L.; Heckwolf, Marlies

    In this paper, we explore the ability of several characterization approaches for phenotyping to extract information about plant cell wall properties in diverse maize genotypes with the goal of identifying approaches that could be used to predict the plant's response to deconstruction in a biomass-to-biofuel process. Specifically, a maize diversity panel was subjected to two high-throughput biomass characterization approaches, pyrolysis molecular beam mass spectrometry (py-MBMS) and near-infrared (NIR) spectroscopy, and chemometric models to predict a number of plant cell wall properties as well as enzymatic hydrolysis yields of glucose following either no pretreatment or with mild alkaline pretreatment. These were compared to multiple linear regression (MLR) models developed from quantified properties. We were able to demonstrate that direct correlations to specific mass spectrometry ions from pyrolysis as well as characteristic regions of the second derivative of the NIR spectrum regions were comparable in their predictive capability to partial least squares (PLS) models for p-coumarate content, while the direct correlation to the spectral data was superior to the PLS for Klason lignin content and guaiacyl monomer release by thioacidolysis as assessed by cross-validation. The PLS models for prediction of hydrolysis yields using either py-MBMS or NIR spectra were superior to MLR models based on quantified properties for unpretreated biomass. However, the PLS models using the two high-throughput characterization approaches could not predict hydrolysis following alkaline pretreatment while MLR models based on quantified properties could. This is likely a consequence of quantified properties including some assessments of pretreated biomass, while the py-MBMS and NIR only utilized untreated biomass.

  1. Detecting the influence of rare stressors on rare species in Yosemite National Park using a novel stratified permutation test

    USGS Publications Warehouse

    Matchett, John R.; Stark, Philip B.; Ostoja, Steven M.; Knapp, Roland A.; McKenny, Heather C.; Brooks, Matthew L.; Langford, William T.; Joppa, Lucas N.; Berlow, Eric L.

    2015-01-01

    Statistical models often use observational data to predict phenomena; however, interpreting model terms to understand their influence can be problematic. This issue poses a challenge in species conservation where setting priorities requires estimating influences of potential stressors using observational data. We present a novel approach for inferring influence of a rare stressor on a rare species by blending predictive models with nonparametric permutation tests. We illustrate the approach with two case studies involving rare amphibians in Yosemite National Park, USA. The endangered frog, Rana sierrae, is known to be negatively impacted by non-native fish, while the threatened toad, Anaxyrus canorus, is potentially affected by packstock. Both stressors and amphibians are rare, occurring in ~10% of potential habitat patches. We first predict amphibian occupancy with a statistical model that includes all predictors but the stressor to stratify potential habitat by predicted suitability. A stratified permutation test then evaluates the association between stressor and amphibian, all else equal. Our approach confirms the known negative relationship between fish and R. sierrae, but finds no evidence of a negative relationship between current packstock use and A. canorus breeding. Our statistical approach has potential broad application for deriving understanding (not just prediction) from observational data.
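
    The idea of shuffling the stressor only within strata of similar predicted suitability can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' procedure: the test statistic (number of patches where stressor and species co-occur), the one-sided direction, the function name, and the toy data are all assumptions made for illustration.

```python
import random

def stratified_permutation_test(strata, stressor, present, n_perm=2000, seed=0):
    """One-sided test: does the species co-occur with the stressor less than chance?

    strata   -- stratum label per habitat patch (e.g. binned predicted suitability)
    stressor -- 1 if the stressor occurs in the patch, else 0
    present  -- 1 if the species occupies the patch, else 0
    """
    rng = random.Random(seed)

    def statistic(stress):
        # Assumed statistic: count of patches where stressor and species co-occur.
        return sum(s * p for s, p in zip(stress, present))

    observed = statistic(stressor)

    # Group patch indices by stratum so that shuffling stays within strata.
    groups = {}
    for i, label in enumerate(strata):
        groups.setdefault(label, []).append(i)

    as_extreme = 0
    for _ in range(n_perm):
        shuffled = list(stressor)
        for idx in groups.values():
            vals = [stressor[i] for i in idx]
            rng.shuffle(vals)
            for i, v in zip(idx, vals):
                shuffled[i] = v
        if statistic(shuffled) <= observed:
            as_extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (as_extreme + 1) / (n_perm + 1)

# Invented toy data: one stratum of 12 equally suitable patches in which
# the species is present only where the stressor is absent.
strata = ["high"] * 12
stressor = [1] * 6 + [0] * 6
present = [0] * 6 + [1] * 6
p_value = stratified_permutation_test(strata, stressor, present)
print(p_value)
```

    Because labels are permuted only among patches of comparable suitability, a small p-value cannot be explained by the stressor simply occurring in poorer habitat, which is the "all else equal" comparison the abstract describes.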

  2. Detecting the influence of rare stressors on rare species in Yosemite National Park using a novel stratified permutation test

    PubMed Central

    Matchett, J. R.; Stark, Philip B.; Ostoja, Steven M.; Knapp, Roland A.; McKenny, Heather C.; Brooks, Matthew L.; Langford, William T.; Joppa, Lucas N.; Berlow, Eric L.

    2015-01-01

    Statistical models often use observational data to predict phenomena; however, interpreting model terms to understand their influence can be problematic. This issue poses a challenge in species conservation where setting priorities requires estimating influences of potential stressors using observational data. We present a novel approach for inferring influence of a rare stressor on a rare species by blending predictive models with nonparametric permutation tests. We illustrate the approach with two case studies involving rare amphibians in Yosemite National Park, USA. The endangered frog, Rana sierrae, is known to be negatively impacted by non-native fish, while the threatened toad, Anaxyrus canorus, is potentially affected by packstock. Both stressors and amphibians are rare, occurring in ~10% of potential habitat patches. We first predict amphibian occupancy with a statistical model that includes all predictors but the stressor to stratify potential habitat by predicted suitability. A stratified permutation test then evaluates the association between stressor and amphibian, all else equal. Our approach confirms the known negative relationship between fish and R. sierrae, but finds no evidence of a negative relationship between current packstock use and A. canorus breeding. Our statistical approach has potential broad application for deriving understanding (not just prediction) from observational data. PMID:26031755

  3. Numerical weather prediction model tuning via ensemble prediction system

    NASA Astrophysics Data System (ADS)

    Jarvinen, H.; Laine, M.; Ollinaho, P.; Solonen, A.; Haario, H.

    2011-12-01

    This paper discusses a novel approach to tune predictive skill of numerical weather prediction (NWP) models. NWP models contain tunable parameters which appear in parameterization schemes of sub-grid scale physical processes. Currently, numerical values of these parameters are specified manually. In a recent dual manuscript (QJRMS, revised) we developed a new concept and method for on-line estimation of the NWP model parameters. The EPPES ("Ensemble prediction and parameter estimation system") method requires only minimal changes to the existing operational ensemble prediction infrastructure and it seems very cost-effective because practically no new computations are introduced. The approach provides an algorithmic decision making tool for model parameter optimization in operational NWP. In EPPES, statistical inference about the NWP model tunable parameters is made by (i) generating each member of the ensemble of predictions using different model parameter values, drawn from a proposal distribution, and (ii) feeding back the relative merits of the parameter values to the proposal distribution, based on evaluation of a suitable likelihood function against verifying observations. In the presentation, the method is first illustrated in low-order numerical tests using a stochastic version of the Lorenz-95 model which effectively emulates the principal features of ensemble prediction systems. The EPPES method correctly detects the unknown and wrongly specified parameter values, and leads to an improved forecast skill. Second, results with an atmospheric general circulation model based ensemble prediction system show that the NWP model tuning capacity of EPPES scales up to realistic models and ensemble prediction systems. Finally, a global top-end NWP model tuning exercise with preliminary results is published.

  4. Ordered LOGIT Model approach for the determination of financial distress.

    PubMed

    Kinay, B

    2010-01-01

    As a result of global competition, numerous companies now face financial distress. Predicting such problems and responding to them proactively is therefore important, and the prediction of crisis and financial distress is essential for revealing the financial condition of companies. In this study, financial ratios of 156 industrial firms quoted on the Istanbul Stock Exchange are used, and probabilities of financial distress are predicted by means of an ordered logit regression model. The dependent variable is constructed by scaling the level of risk according to Altman's Z Score. Thus, a model that can serve as an early warning system and predict financial distress is proposed.

  5. Compound Structure-Independent Activity Prediction in High-Dimensional Target Space.

    PubMed

    Balfer, Jenny; Hu, Ye; Bajorath, Jürgen

    2014-08-01

    Profiling of compound libraries against arrays of targets has become an important approach in pharmaceutical research. The prediction of multi-target compound activities also represents an attractive task for machine learning with potential for drug discovery applications. Herein, we have explored activity prediction in high-dimensional target space. Different types of models were derived to predict multi-target activities. The models included naïve Bayesian (NB) and support vector machine (SVM) classifiers based upon compound structure information and NB models derived on the basis of activity profiles, without considering compound structure. Because the latter approach can be applied to incomplete training data and principally depends on the feature independence assumption, SVM modeling was not applicable in this case. Furthermore, iterative hybrid NB models making use of both activity profiles and compound structure information were built. In high-dimensional target space, NB models utilizing activity profile data were found to yield more accurate activity predictions than structure-based NB and SVM models or hybrid models. An in-depth analysis of activity profile-based models revealed the presence of correlation effects across different targets and rationalized prediction accuracy. Taken together, the results indicate that activity profile information can be effectively used to predict the activity of test compounds against novel targets. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  6. Comparative analysis of predictive models for nongenotoxic hepatocarcinogenicity using both toxicogenomics and quantitative structure-activity relationships.

    PubMed

    Liu, Zhichao; Kelly, Reagan; Fang, Hong; Ding, Don; Tong, Weida

    2011-07-18

    The primary testing strategy to identify nongenotoxic carcinogens largely relies on the 2-year rodent bioassay, which is time-consuming and labor-intensive. There is an increasing effort to develop alternative approaches to prioritize the chemicals for, supplement, or even replace the cancer bioassay. In silico approaches based on quantitative structure-activity relationships (QSAR) are rapid and inexpensive and thus have been investigated for such purposes. A slightly more expensive approach based on short-term animal studies with toxicogenomics (TGx) represents another attractive option for this application. Thus, the primary questions are how much better predictive performance using short-term TGx models can be achieved compared to that of QSAR models, and what length of exposure is sufficient for high quality prediction based on TGx. In this study, we developed predictive models for rodent liver carcinogenicity using gene expression data generated from short-term animal models at different time points and QSAR. The study was focused on the prediction of nongenotoxic carcinogenicity since the genotoxic chemicals can be inexpensively removed from further development using various in vitro assays individually or in combination. We identified 62 chemicals whose hepatocarcinogenic potential was available from the National Center for Toxicological Research liver cancer database (NCTRlcdb). The gene expression profiles of liver tissue obtained from rats treated with these chemicals at different time points (1 day, 3 days, and 5 days) are available from the Gene Expression Omnibus (GEO) database. Both TGx and QSAR models were developed on the basis of the same set of chemicals using the same modeling approach, a nearest-centroid method with a minimum redundancy and maximum relevancy-based feature selection with performance assessed using compound-based 5-fold cross-validation. We found that the TGx models outperformed QSAR in every aspect of modeling. For example, the TGx models' predictive accuracy (0.77, 0.77, and 0.82 for the 1-day, 3-day, and 5-day models, respectively) was much higher for an independent validation set than that of a QSAR model (0.55). Permutation tests confirmed the statistical significance of the model's prediction performance. The study concluded that a short-term 5-day TGx animal model holds the potential to predict nongenotoxic hepatocarcinogenicity. © 2011 American Chemical Society
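
    The modeling approach named in this abstract, a nearest-centroid classifier assessed by 5-fold cross-validation, can be sketched in plain Python. This is a simplified illustration, not the study's code: the feature-selection step (minimum redundancy, maximum relevancy) is omitted, the fold assignment is an assumption, each row is taken to represent one compound, and the data are invented.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit(rows, labels):
    """Return one centroid per class label."""
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    return {y: centroid(members) for y, members in by_class.items()}

def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))

def cross_validate(rows, labels, k=5):
    """Accuracy from k-fold cross-validation; fold f holds out samples f, f+k, ..."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(rows)) if i % k != fold]
        test = [i for i in range(len(rows)) if i % k == fold]
        model = fit([rows[i] for i in train], [labels[i] for i in train])
        correct += sum(predict(model, rows[i]) == labels[i] for i in test)
    return correct / len(rows)

# Invented toy data: two well-separated groups of five "compounds" each.
rows = [[0.1 * i, 0.0] for i in range(5)] + [[5.0 + 0.1 * i, 5.0] for i in range(5)]
labels = [0] * 5 + [1] * 5
accuracy = cross_validate(rows, labels, k=5)
print(accuracy)
```

    Holding out whole compounds in each fold, as the abstract's "compound-based" wording suggests, keeps measurements of the same chemical from appearing in both the training and the test portion.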

  7. Estimation and impact assessment of input and parameter uncertainty in predicting groundwater flow with a fully distributed model

    NASA Astrophysics Data System (ADS)

    Touhidul Mustafa, Syed Md.; Nossent, Jiri; Ghysels, Gert; Huysmans, Marijke

    2017-04-01

    Transient numerical groundwater flow models have been used to understand and forecast groundwater flow systems under anthropogenic and climatic effects, but the reliability of the predictions is strongly influenced by different sources of uncertainty. Hence, researchers in hydrological sciences are developing and applying methods for uncertainty quantification. Nevertheless, spatially distributed flow models pose significant challenges for parameter and spatially distributed input estimation and uncertainty quantification. In this study, we present a general and flexible approach for input and parameter estimation and uncertainty analysis of groundwater models. The proposed approach combines a fully distributed groundwater flow model (MODFLOW) with the DiffeRential Evolution Adaptive Metropolis (DREAM) algorithm. To avoid over-parameterization, the uncertainty of the spatially distributed model input has been represented by multipliers. The posterior distributions of these multipliers and the regular model parameters were estimated using DREAM. The proposed methodology has been applied in an overexploited aquifer in Bangladesh where groundwater pumping and recharge data are highly uncertain. The results confirm that input uncertainty does have a considerable effect on the model predictions and parameter distributions. Additionally, our approach also provides a new way to optimize the spatially distributed recharge and pumping data along with the parameter values under uncertain input conditions. It can be concluded from our approach that considering model input uncertainty along with parameter uncertainty is important for obtaining realistic model predictions and a correct estimation of the uncertainty bounds.

  8. Artificial intelligence: a new approach for prescription and monitoring of hemodialysis therapy.

    PubMed

    Akl, A I; Sobh, M A; Enab, Y M; Tattersall, J

    2001-12-01

    The effect of dialysis on patients is conventionally predicted using a formal mathematical model. This approach requires many assumptions of the processes involved, and validation of these may be difficult. The validity of dialysis urea modeling using a formal mathematical model has been challenged. Artificial intelligence using neural networks (NNs) has been used to solve complex problems without needing a mathematical model or an understanding of the mechanisms involved. In this study, we applied an NN model to study and predict concentrations of urea during a hemodialysis session. We measured blood concentrations of urea, patient weight, and total urea removal by direct dialysate quantification (DDQ) at 30-minute intervals during the session (in 15 chronic hemodialysis patients). The NN model was trained to recognize the evolution of measured urea concentrations and was subsequently able to predict hemodialysis session time needed to reach a target solute removal index (SRI) in patients not previously studied by the NN model (in another 15 chronic hemodialysis patients). Comparing results of the NN model with the DDQ model, the prediction error was 10.9%, with a nonsignificant difference between predicted total urea nitrogen (UN) removal and measured UN removal by DDQ. NN model predictions of time showed a nonsignificant difference with actual intervals needed to reach the same SRI level at the same patient conditions, except for the prediction of SRI at the first 30-minute interval, which showed a significant difference (P = 0.001). This indicates the sensitivity of the NN model to what is called patient clearance time; the prediction error was 8.3%. From our results, we conclude that artificial intelligence applications in urea kinetics can give an idea of intradialysis profiling according to individual clinical needs. In theory, this approach can be extended easily to other solutes, making the NN model a step forward to achieving artificial-intelligent dialysis control.

  9. 78 FR 70303 - Announcement of Requirements and Registration for the Predict the Influenza Season Challenge

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-11-25

    ... public. Mathematical and statistical models can be useful in predicting the timing and impact of the... applying any mathematical, statistical, or other approach to predictive modeling. This challenge will... Services (HHS) region level(s) in the United States by developing mathematical and statistical models that...

  10. On the application of multilevel modeling in environmental and ecological studies

    USGS Publications Warehouse

    Qian, Song S.; Cuffney, Thomas F.; Alameddine, Ibrahim; McMahon, Gerard; Reckhow, Kenneth H.

    2010-01-01

    This paper illustrates the advantages of a multilevel/hierarchical approach for predictive modeling, including flexibility of model formulation, explicitly accounting for hierarchical structure in the data, and the ability to predict the outcome of new cases. As a generalization of the classical approach, the multilevel modeling approach explicitly models the hierarchical structure in the data by considering both the within- and between-group variances leading to a partial pooling of data across all levels in the hierarchy. The modeling framework provides means for incorporating variables at different spatiotemporal scales. The examples used in this paper illustrate the iterative process of model fitting and evaluation, a process that can lead to improved understanding of the system being studied.
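
    The partial pooling described above can be illustrated with a minimal sketch. It assumes a varying-intercept model with known within-group and between-group variances (sigma_y2 and sigma_a2, both hypothetical inputs) and uses the grand mean of all observations as a stand-in for the population-level mean. It is not the authors' code.

```python
from statistics import mean

def partial_pool(groups, sigma_y2, sigma_a2):
    """Shrink each group mean toward the grand mean (partial pooling).

    groups   : dict mapping group name -> list of observations
    sigma_y2 : assumed within-group variance
    sigma_a2 : assumed between-group variance
    """
    all_obs = [y for ys in groups.values() for y in ys]
    grand = mean(all_obs)  # stand-in for the population-level mean
    pooled = {}
    for name, ys in groups.items():
        w_group = len(ys) / sigma_y2  # precision of the group's own mean
        w_grand = 1.0 / sigma_a2      # precision contributed by the population
        pooled[name] = (w_group * mean(ys) + w_grand * grand) / (w_group + w_grand)
    return pooled

# Toy data: a well-sampled site and a site with a single observation.
sites = {"A": [10.0, 12.0, 11.0, 13.0], "B": [20.0]}
estimates = partial_pool(sites, sigma_y2=4.0, sigma_a2=4.0)
```

    In this toy case the sparsely sampled site B is pulled much further toward the grand mean than site A, which is the behaviour that distinguishes partial pooling from fitting each group separately.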

  11. A Novel Two-Step Hierarchical Quantitative Structure–Activity Relationship Modeling Work Flow for Predicting Acute Toxicity of Chemicals in Rodents

    PubMed Central

    Zhu, Hao; Ye, Lin; Richard, Ann; Golbraikh, Alexander; Wright, Fred A.; Rusyn, Ivan; Tropsha, Alexander

    2009-01-01

    Background: Accurate prediction of in vivo toxicity from in vitro testing is a challenging problem. Large public–private consortia have been formed with the goal of improving chemical safety assessment by the means of high-throughput screening. Objective: A wealth of available biological data requires new computational approaches to link chemical structure, in vitro data, and potential adverse health effects. Methods and results: A database containing experimental cytotoxicity values for in vitro half-maximal inhibitory concentration (IC50) and in vivo rodent median lethal dose (LD50) for more than 300 chemicals was compiled by Zentralstelle zur Erfassung und Bewertung von Ersatz- und Ergaenzungsmethoden zum Tierversuch (ZEBET; National Center for Documentation and Evaluation of Alternative Methods to Animal Experiments). The application of conventional quantitative structure–activity relationship (QSAR) modeling approaches to predict mouse or rat acute LD50 values from chemical descriptors of ZEBET compounds yielded no statistically significant models. The analysis of these data showed no significant correlation between IC50 and LD50. However, a linear IC50 versus LD50 correlation could be established for a fraction of compounds. To capitalize on this observation, we developed a novel two-step modeling approach as follows. First, all chemicals are partitioned into two groups based on the relationship between IC50 and LD50 values: One group comprises compounds with linear IC50 versus LD50 relationships, and another group comprises the remaining compounds. Second, we built conventional binary classification QSAR models to predict the group affiliation based on chemical descriptors only. Third, we developed k-nearest neighbor continuous QSAR models for each subclass to predict LD50 values from chemical descriptors. All models were extensively validated using special protocols. Conclusions: The novelty of this modeling approach is that it uses the relationships between in vivo and in vitro data only to inform the initial construction of the hierarchical two-step QSAR models. Models resulting from this approach employ chemical descriptors only for external prediction of acute rodent toxicity. PMID:19672406
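
    The hierarchical prediction step (classify a compound into a group, then apply that group's regression model) can be sketched as follows. This is an illustration only: a k-nearest-neighbour vote stands in for the binary classification QSAR models, the descriptors and LD50 values are made up, and the function names are hypothetical.

```python
import math
from collections import Counter

def nearest(rows, x, k):
    """Return the k rows whose descriptor vectors are closest to x."""
    return sorted(rows, key=lambda row: math.dist(row[0], x))[:k]

def predict_group(train, x, k=3):
    """Step 1: assign a compound to a group from its descriptors only."""
    votes = Counter(group for _, group, _ in nearest(train, x, k))
    return votes.most_common(1)[0][0]

def predict_ld50(train, x, k=3):
    """Step 2: kNN regression restricted to compounds of the predicted group."""
    group = predict_group(train, x, k)
    members = [row for row in train if row[1] == group]
    neighbours = nearest(members, x, min(k, len(members)))
    return group, sum(ld50 for _, _, ld50 in neighbours) / len(neighbours)

# Toy rows: (descriptor tuple, group label, log LD50). All values are invented.
train = [
    ((0.10, 0.20), "linear", 1.0), ((0.20, 0.10), "linear", 1.2),
    ((0.15, 0.25), "linear", 1.1), ((0.90, 0.80), "other", 3.0),
    ((0.80, 0.90), "other", 3.2), ((0.85, 0.95), "other", 2.8),
]
group, value = predict_ld50(train, (0.12, 0.18))
```

    Note that only descriptors are needed at prediction time; the IC50 versus LD50 relationship would be used solely to create the group labels in the training set.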

  12. A novel two-step hierarchical quantitative structure-activity relationship modeling work flow for predicting acute toxicity of chemicals in rodents.

    PubMed

    Zhu, Hao; Ye, Lin; Richard, Ann; Golbraikh, Alexander; Wright, Fred A; Rusyn, Ivan; Tropsha, Alexander

    2009-08-01

    Accurate prediction of in vivo toxicity from in vitro testing is a challenging problem. Large public-private consortia have been formed with the goal of improving chemical safety assessment by the means of high-throughput screening. A wealth of available biological data requires new computational approaches to link chemical structure, in vitro data, and potential adverse health effects. A database containing experimental cytotoxicity values for in vitro half-maximal inhibitory concentration (IC(50)) and in vivo rodent median lethal dose (LD(50)) for more than 300 chemicals was compiled by Zentralstelle zur Erfassung und Bewertung von Ersatz- und Ergaenzungsmethoden zum Tierversuch (ZEBET; National Center for Documentation and Evaluation of Alternative Methods to Animal Experiments). The application of conventional quantitative structure-activity relationship (QSAR) modeling approaches to predict mouse or rat acute LD(50) values from chemical descriptors of ZEBET compounds yielded no statistically significant models. The analysis of these data showed no significant correlation between IC(50) and LD(50). However, a linear IC(50) versus LD(50) correlation could be established for a fraction of compounds. To capitalize on this observation, we developed a novel two-step modeling approach as follows. First, all chemicals are partitioned into two groups based on the relationship between IC(50) and LD(50) values: One group comprises compounds with linear IC(50) versus LD(50) relationships, and another group comprises the remaining compounds. Second, we built conventional binary classification QSAR models to predict the group affiliation based on chemical descriptors only. Third, we developed k-nearest neighbor continuous QSAR models for each subclass to predict LD(50) values from chemical descriptors. All models were extensively validated using special protocols. 
    The novelty of this modeling approach is that it uses the relationships between in vivo and in vitro data only to inform the initial construction of the hierarchical two-step QSAR models. Models resulting from this approach employ chemical descriptors only for external prediction of acute rodent toxicity.

  13. Using "big data" to optimally model hydrology and water quality across expansive regions

    USGS Publications Warehouse

    Roehl, E.A.; Cook, J.B.; Conrads, P.A.

    2009-01-01

    This paper describes a new divide and conquer approach that leverages big environmental data, utilizing all available categorical and time-series data without subjectivity, to empirically model hydrologic and water-quality behaviors across expansive regions. The approach decomposes large, intractable problems into smaller ones that are optimally solved; decomposes complex signals into behavioral components that are easier to model with "sub-models"; and employs a sequence of numerically optimizing algorithms that include time-series clustering, nonlinear, multivariate sensitivity analysis and predictive modeling using multi-layer perceptron artificial neural networks, and classification for selecting the best sub-models to make predictions at new sites. This approach has many advantages over traditional modeling approaches, including being faster and less expensive, more comprehensive in its use of available data, and more accurate in representing a system's physical processes. This paper describes the application of the approach to model groundwater levels in Florida, stream temperatures across Western Oregon and Wisconsin, and water depths in the Florida Everglades. © 2009 ASCE.

  14. Interpretable Deep Models for ICU Outcome Prediction

    PubMed Central

    Che, Zhengping; Purushotham, Sanjay; Khemani, Robinder; Liu, Yan

    2016-01-01

    Exponential surge in health care data, such as longitudinal data from electronic health records (EHR), sensor data from intensive care unit (ICU), etc., is providing new opportunities to discover meaningful data-driven characteristics and patterns of diseases. Recently, deep learning models have been employed for many computational phenotyping and healthcare prediction tasks to achieve state-of-the-art performance. However, deep models lack interpretability which is crucial for wide adoption in medical research and clinical decision-making. In this paper, we introduce a simple yet powerful knowledge-distillation approach called interpretable mimic learning, which uses gradient boosting trees to learn interpretable models and at the same time achieves strong prediction performance as deep learning models. Experiment results on Pediatric ICU dataset for acute lung injury (ALI) show that our proposed method not only outperforms state-of-the-art approaches for mortality and ventilator free days prediction tasks but can also provide interpretable models to clinicians. PMID:28269832

  15. QSAR modelling using combined simple competitive learning networks and RBF neural networks.

    PubMed

    Sheikhpour, R; Sarram, M A; Rezaeian, M; Sheikhpour, E

    2018-04-01

    The aim of this study was to propose a QSAR modelling approach based on the combination of simple competitive learning (SCL) networks with radial basis function (RBF) neural networks for predicting the biological activity of chemical compounds. The proposed QSAR method consisted of two phases. In the first phase, an SCL network was applied to determine the centres of an RBF neural network. In the second phase, the RBF neural network was used to predict the biological activity of various phenols and Rho kinase (ROCK) inhibitors. The predictive ability of the proposed QSAR models was evaluated and compared with other QSAR models using external validation. The results of this study showed that the proposed QSAR modelling approach leads to better performances than other models in predicting the biological activity of chemical compounds. This indicated the efficiency of simple competitive learning networks in determining the centres of RBF neural networks.
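
    The two phases can be sketched in a few lines, assuming winner-take-all updates for the competitive learning phase and Gaussian basis functions for the RBF phase. Initialising the centres from the first samples, the learning rate and the width are all illustrative choices, not details taken from the paper; fitting the output weights is omitted.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def scl_centres(samples, n_centres, lr=0.1, epochs=50):
    """Phase 1: place centres with winner-take-all competitive learning."""
    centres = [list(s) for s in samples[:n_centres]]  # assumed initialisation
    for _ in range(epochs):
        for x in samples:
            winner = min(centres, key=lambda c: sq_dist(c, x))
            for k in range(len(winner)):
                winner[k] += lr * (x[k] - winner[k])  # move only the winner toward x
    return centres

def rbf_features(x, centres, width=1.0):
    """Phase 2: Gaussian RBF activations of x, one per centre."""
    return [math.exp(-sq_dist(x, c) / (2.0 * width ** 2)) for c in centres]

# Toy descriptors forming two clusters (values invented).
samples = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.1, 4.9), (0.1, 0.2), (4.9, 5.0)]
centres = scl_centres(samples, n_centres=2)
features = rbf_features((0.1, 0.1), centres)
```

    The activations returned by rbf_features would then feed a linear output layer whose weights are fitted to the measured activities.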

  16. Large-scale model quality assessment for improving protein tertiary structure prediction.

    PubMed

    Cao, Renzhi; Bhattacharya, Debswapna; Adhikari, Badri; Li, Jilong; Cheng, Jianlin

    2015-06-15

    Sampling structural models and ranking them are the two major challenges of protein structure prediction. Traditional protein structure prediction methods generally use one or a few quality assessment (QA) methods to select the best-predicted models, which cannot consistently select relatively better models and rank a large number of models well. Here, we develop a novel large-scale model QA method in conjunction with model clustering to rank and select protein structural models. It unprecedentedly applied 14 model QA methods to generate consensus model rankings, followed by model refinement based on model combination (i.e. averaging). Our experiment demonstrates that the large-scale model QA approach is more consistent and robust in selecting models of better quality than any individual QA method. Our method was blindly tested during the 11th Critical Assessment of Techniques for Protein Structure Prediction (CASP11) as MULTICOM group. It was officially ranked third out of all 143 human and server predictors according to the total scores of the first models predicted for 78 CASP11 protein domains and second according to the total scores of the best of the five models predicted for these domains. MULTICOM's outstanding performance in the extremely competitive 2014 CASP11 experiment proves that our large-scale QA approach together with model clustering is a promising solution to one of the two major problems in protein structure modeling. The web server is available at: http://sysbio.rnet.missouri.edu/multicom_cluster/human/. © The Author 2015. Published by Oxford University Press.
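
    One simple way to build a consensus ranking from several QA methods is to average each model's rank across methods. The sketch below assumes that rule and uses invented scores; the abstract does not state how the 14 rankings were actually combined.

```python
def consensus_ranking(scores_by_method):
    """Order models by their average rank across QA methods.

    scores_by_method: dict method -> dict model -> score (higher is better).
    """
    models = list(next(iter(scores_by_method.values())))
    rank_sums = {m: 0 for m in models}
    for scores in scores_by_method.values():
        ordered = sorted(models, key=lambda m: scores[m], reverse=True)
        for rank, model in enumerate(ordered, start=1):
            rank_sums[model] += rank
    n_methods = len(scores_by_method)
    return sorted(models, key=lambda m: rank_sums[m] / n_methods)

# Invented scores from three hypothetical QA methods for three models.
qa_scores = {
    "qa1": {"m1": 0.8, "m2": 0.6, "m3": 0.7},
    "qa2": {"m1": 0.5, "m2": 0.9, "m3": 0.6},
    "qa3": {"m1": 0.9, "m2": 0.4, "m3": 0.8},
}
ranking = consensus_ranking(qa_scores)
```

    Here m2 is rated best by one method but last by the other two, so it ends up at the bottom of the consensus; a single QA method could have selected it.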

  17. Assessment of turbulent models for scramjet flowfields

    NASA Technical Reports Server (NTRS)

    Sindir, M. M.; Harsha, P. T.

    1982-01-01

    The behavior of several turbulence models applied to the prediction of scramjet combustor flows is described. These models include the basic two equation model, the multiple dissipation length scale variant of the two equation model, and the algebraic stress model (ASM). Predictions were made of planar backward facing step flows and axisymmetric sudden expansion flows using each of these approaches. The formulation of each of these models is discussed, and the application of the different approaches to supersonic flows is described. A modified version of the ASM is found to provide the best prediction of the planar backward facing step flow in the region near the recirculation zone, while the basic ASM provides the best results downstream of the recirculation. Aspects of the interaction of numerical modeling and turbulence modeling as they affect the assessment of turbulence models are discussed.

  18. A hybrid PCA-CART-MARS-based prognostic approach of the remaining useful life for aircraft engines.

    PubMed

    Sánchez Lasheras, Fernando; García Nieto, Paulino José; de Cos Juez, Francisco Javier; Mayo Bayón, Ricardo; González Suárez, Victor Manuel

    2015-03-23

    Prognostics is an engineering discipline that predicts the future health of a system. In this research work, a data-driven approach for prognostics is proposed. Indeed, the present paper describes a data-driven hybrid model for the successful prediction of the remaining useful life of aircraft engines. The approach combines the multivariate adaptive regression splines (MARS) technique with the principal component analysis (PCA), dendrograms and classification and regression trees (CARTs). Elements extracted from sensor signals are used to train this hybrid model, representing different levels of health for aircraft engines. In this way, this hybrid algorithm is used to predict the trends of these elements. Based on this fitting, one can determine the future health state of a system and estimate its remaining useful life (RUL) with accuracy. To evaluate the proposed approach, a test was carried out using aircraft engine signals collected from physical sensors (temperature, pressure, speed, fuel flow, etc.). Simulation results show that the PCA-CART-MARS-based approach can forecast faults long before they occur and can predict the RUL. The proposed hybrid model presents as its main advantage the fact that it does not require information about the previous operation states of the input variables of the engine. The performance of this model was compared with those obtained by other benchmark models (multivariate linear regression and artificial neural networks) also applied in recent years for the modeling of remaining useful life. Therefore, the PCA-CART-MARS-based approach is very promising in the field of prognostics of the RUL for aircraft engines.

  19. A Hybrid PCA-CART-MARS-Based Prognostic Approach of the Remaining Useful Life for Aircraft Engines

    PubMed Central

    Lasheras, Fernando Sánchez; Nieto, Paulino José García; de Cos Juez, Francisco Javier; Bayón, Ricardo Mayo; Suárez, Victor Manuel González

    2015-01-01

    Prognostics is an engineering discipline that predicts the future health of a system. In this research work, a data-driven approach for prognostics is proposed. Indeed, the present paper describes a data-driven hybrid model for the successful prediction of the remaining useful life of aircraft engines. The approach combines the multivariate adaptive regression splines (MARS) technique with the principal component analysis (PCA), dendrograms and classification and regression trees (CARTs). Elements extracted from sensor signals are used to train this hybrid model, representing different levels of health for aircraft engines. In this way, this hybrid algorithm is used to predict the trends of these elements. Based on this fitting, one can determine the future health state of a system and estimate its remaining useful life (RUL) with accuracy. To evaluate the proposed approach, a test was carried out using aircraft engine signals collected from physical sensors (temperature, pressure, speed, fuel flow, etc.). Simulation results show that the PCA-CART-MARS-based approach can forecast faults long before they occur and can predict the RUL. The proposed hybrid model presents as its main advantage the fact that it does not require information about the previous operation states of the input variables of the engine. The performance of this model was compared with those obtained by other benchmark models (multivariate linear regression and artificial neural networks) also applied in recent years for the modeling of remaining useful life. Therefore, the PCA-CART-MARS-based approach is very promising in the field of prognostics of the RUL for aircraft engines. PMID:25806876

  20. A Systematic Approach to Predicting Spring Force for Sagittal Craniosynostosis Surgery.

    PubMed

    Zhang, Guangming; Tan, Hua; Qian, Xiaohua; Zhang, Jian; Li, King; David, Lisa R; Zhou, Xiaobo

    2016-05-01

    Spring-assisted surgery (SAS) can effectively treat scaphocephaly by reshaping crania with the appropriate spring force. However, it is difficult to accurately estimate spring force without considering biomechanical properties of tissues. This study presents and validates a reliable system to accurately predict the spring force for sagittal craniosynostosis surgery. The authors randomly chose 23 patients who underwent SAS and had been followed for at least 2 years. An elastic model was designed to characterize the biomechanical behavior of calvarial bone tissue for each individual. After simulating the contact force of the springs at the accurate position on the skull strip, the finite element method was applied to calculate the stress of each tissue node based on the elastic model. A support vector regression approach was then used to model the relationships between biomechanical properties generated from spring force, bone thickness, and the change of cephalic index after surgery. Therefore, for a new patient, the optimal spring force can be predicted based on the learned model with virtual spring simulation and dynamic programming approach prior to SAS. Leave-one-out cross-validation was implemented to assess the accuracy of the prediction. As a result, the mean prediction accuracy of this model was 93.35%, demonstrating the great potential of this model as a useful adjunct tool for preoperative planning.
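
    Leave-one-out cross-validation itself is easy to sketch. The example below substitutes an ordinary least-squares line for the support vector regression used in the study, runs on invented data, and assumes that "accuracy" means one minus the mean absolute relative error; the abstract does not define the metric.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def loocv_errors(xs, ys):
    """Hold out each sample once; return the absolute relative errors."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)  # model never sees sample i
        pred = a + b * xs[i]
        errors.append(abs(pred - ys[i]) / abs(ys[i]))
    return errors

# Invented data: one predictor and one response per patient.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
errs = loocv_errors(xs, ys)
mean_accuracy = 1.0 - sum(errs) / len(errs)  # assumed definition of accuracy
```

    With 23 patients, as in the study, the loop would fit 23 models, each trained on 22 patients and tested on the one held out.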

  1. Trust from the past: Bayesian Personalized Ranking based Link Prediction in Knowledge Graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Baichuan; Choudhury, Sutanay; Al-Hasan, Mohammad

    2016-02-01

    Estimating the confidence for a link is a critical task for Knowledge Graph construction. Link prediction, or predicting the likelihood of a link in a knowledge graph based on prior state is a key research direction within this area. We propose a Latent Feature Embedding based link recommendation model for prediction task and utilize Bayesian Personalized Ranking based optimization technique for learning models for each predicate. Experimental results on large-scale knowledge bases such as YAGO2 show that our approach achieves substantially higher performance than several state-of-art approaches. Furthermore, we also study the performance of the link prediction algorithm in terms of topological properties of the Knowledge Graph and present a linear regression model to reason about its expected level of accuracy.
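
    Bayesian Personalized Ranking trains a model so that observed links score higher than unobserved ones, by maximising ln sigma(score_pos - score_neg) over sampled pairs. The sketch below shows one stochastic gradient step under the assumption that a link's score is the dot product of a head-entity vector and a tail-entity vector; that scoring function, the learning rate and the regularisation weight are illustrative choices, not details from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def bpr_step(h, t_pos, t_neg, lr=0.1, reg=0.01):
    """One gradient-ascent step on ln sigma(h.t_pos - h.t_neg) with L2 decay.

    h, t_pos, t_neg are latent vectors (lists of floats) updated in place;
    t_pos is an observed tail entity, t_neg a sampled unobserved one.
    Returns the score difference measured before the update.
    """
    diff = dot(h, t_pos) - dot(h, t_neg)
    g = 1.0 - sigmoid(diff)  # derivative of ln sigma(diff) with respect to diff
    for k in range(len(h)):
        hk, pk, nk = h[k], t_pos[k], t_neg[k]
        h[k] += lr * (g * (pk - nk) - reg * hk)
        t_pos[k] += lr * (g * hk - reg * pk)
        t_neg[k] += lr * (-g * hk - reg * nk)
    return diff

# Toy vectors: the observed link starts out scoring lower than the unobserved one.
h, pos, neg = [0.1, 0.1], [0.1, -0.1], [0.1, 0.1]
initial = bpr_step(h, pos, neg)
for _ in range(200):
    bpr_step(h, pos, neg)
final = dot(h, pos) - dot(h, neg)
```

    After repeated updates the observed link is ranked above the sampled negative, which is the pairwise ordering the objective rewards.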

  2. Quantifying predictability variations in a low-order ocean-atmosphere model - A dynamical systems approach

    NASA Technical Reports Server (NTRS)

    Nese, Jon M.; Dutton, John A.

    1993-01-01

    The predictability of the weather and climatic states of a low-order moist general circulation model is quantified using a dynamic systems approach, and the effect of incorporating a simple oceanic circulation on predictability is evaluated. The predictability and the structure of the model attractors are compared using Liapunov exponents, local divergence rates, and the correlation and Liapunov dimensions. It was found that the activation of oceanic circulation increases the average error doubling time of the atmosphere and the coupled ocean-atmosphere system by 10 percent and decreases the variance of the largest local divergence rate by 20 percent. When an oceanic circulation develops, the average predictability of annually averaged states is improved by 25 percent and the variance of the largest local divergence rate decreases by 25 percent.
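
    For orientation, the link between error doubling time and divergence rate can be written out under the common assumption that small errors grow exponentially at the rate of the largest Liapunov exponent, here denoted \lambda_1. The abstract does not give the authors' exact definition, so this is a sketch of the standard relation only.

```latex
\varepsilon(t) \approx \varepsilon_0\, e^{\lambda_1 t}
\quad\Longrightarrow\quad
T_2 = \frac{\ln 2}{\lambda_1},
\qquad
\frac{T_2'}{T_2} = 1.10
\;\Longleftrightarrow\;
\frac{\lambda_1'}{\lambda_1} = \frac{1}{1.10} \approx 0.91 .
```

    Under that assumption, a 10 percent longer doubling time corresponds to a largest exponent that is roughly 9 percent smaller.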

  3. Computational Model-Based Prediction of Human Episodic Memory Performance Based on Eye Movements

    NASA Astrophysics Data System (ADS)

    Sato, Naoyuki; Yamaguchi, Yoko

    Subjects' episodic memory performance is not simply reflected by eye movements. We use a ‘theta phase coding’ model of the hippocampus to predict subjects' memory performance from their eye movements. Results demonstrate the ability of the model to predict subjects' memory performance. These studies provide a novel approach to computational modeling in the human-machine interface.

  4. Wind turbine rotor simulation using the actuator disk and actuator line methods

    NASA Astrophysics Data System (ADS)

    Tzimas, M.; Prospathopoulos, J.

    2016-09-01

    The present paper focuses on wind turbine rotor modeling for loads and wake flow prediction. Two steady-state models based on the actuator disk approach are considered, using either a uniform thrust or a blade element momentum calculation of the wind turbine loads. A third model is based on the unsteady-state actuator line approach. Predictions are compared with measurements in wind tunnel experiments and in atmospheric environment and the capabilities and weaknesses of the different models are addressed.

  5. Managing uncertainty in metabolic network structure and improving predictions using EnsembleFBA

    PubMed Central

    2017-01-01

    Genome-scale metabolic network reconstructions (GENREs) are repositories of knowledge about the metabolic processes that occur in an organism. GENREs have been used to discover and interpret metabolic functions, and to engineer novel network structures. A major barrier preventing more widespread use of GENREs, particularly to study non-model organisms, is the extensive time required to produce a high-quality GENRE. Many automated approaches have been developed which reduce this time requirement, but automatically-reconstructed draft GENREs still require curation before useful predictions can be made. We present a novel approach to the analysis of GENREs which improves the predictive capabilities of draft GENREs by representing many alternative network structures, all equally consistent with available data, and generating predictions from this ensemble. This ensemble approach is compatible with many reconstruction methods. We refer to this new approach as Ensemble Flux Balance Analysis (EnsembleFBA). We validate EnsembleFBA by predicting growth and gene essentiality in the model organism Pseudomonas aeruginosa UCBPP-PA14. We demonstrate how EnsembleFBA can be included in a systems biology workflow by predicting essential genes in six Streptococcus species and mapping the essential genes to small molecule ligands from DrugBank. We found that some metabolic subsystems contributed disproportionately to the set of predicted essential reactions in a way that was unique to each Streptococcus species, leading to species-specific outcomes from small molecule interactions. Through our analyses of P. aeruginosa and six Streptococci, we show that ensembles increase the quality of predictions without drastically increasing reconstruction time, thus making GENRE approaches more practical for applications which require predictions for many non-model organisms. All of our functions and accompanying example code are available in an open online repository. PMID:28263984
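
    The idea of generating predictions from an ensemble can be sketched with a simple vote. The abstract does not say how member predictions are aggregated, so the majority rule, the threshold and the gene names below are assumptions; in practice each member's prediction would come from a flux balance analysis of one draft network, which is not shown.

```python
def ensemble_predict(member_predictions, threshold=0.5):
    """Combine boolean predictions from ensemble members by voting.

    member_predictions: list of dicts, one per draft network, mapping a
    gene ID to True (predicted essential) or False.
    Returns gene -> (fraction of members voting True, consensus call).
    """
    result = {}
    for gene in member_predictions[0]:
        votes = [pred[gene] for pred in member_predictions]
        fraction = sum(votes) / len(votes)
        result[gene] = (fraction, fraction > threshold)
    return result

# Invented essentiality calls from three hypothetical draft networks.
members = [
    {"geneA": True, "geneB": False, "geneC": True},
    {"geneA": True, "geneB": False, "geneC": False},
    {"geneA": True, "geneB": True, "geneC": False},
]
calls = ensemble_predict(members)
```

    Keeping the vote fraction alongside the call gives a rough measure of how much the prediction depends on uncertain parts of the network structure.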

  6. Managing uncertainty in metabolic network structure and improving predictions using EnsembleFBA.

    PubMed

    Biggs, Matthew B; Papin, Jason A

    2017-03-01

    Genome-scale metabolic network reconstructions (GENREs) are repositories of knowledge about the metabolic processes that occur in an organism. GENREs have been used to discover and interpret metabolic functions, and to engineer novel network structures. A major barrier preventing more widespread use of GENREs, particularly to study non-model organisms, is the extensive time required to produce a high-quality GENRE. Many automated approaches have been developed which reduce this time requirement, but automatically-reconstructed draft GENREs still require curation before useful predictions can be made. We present a novel approach to the analysis of GENREs which improves the predictive capabilities of draft GENREs by representing many alternative network structures, all equally consistent with available data, and generating predictions from this ensemble. This ensemble approach is compatible with many reconstruction methods. We refer to this new approach as Ensemble Flux Balance Analysis (EnsembleFBA). We validate EnsembleFBA by predicting growth and gene essentiality in the model organism Pseudomonas aeruginosa UCBPP-PA14. We demonstrate how EnsembleFBA can be included in a systems biology workflow by predicting essential genes in six Streptococcus species and mapping the essential genes to small molecule ligands from DrugBank. We found that some metabolic subsystems contributed disproportionately to the set of predicted essential reactions in a way that was unique to each Streptococcus species, leading to species-specific outcomes from small molecule interactions. Through our analyses of P. aeruginosa and six Streptococci, we show that ensembles increase the quality of predictions without drastically increasing reconstruction time, thus making GENRE approaches more practical for applications which require predictions for many non-model organisms. All of our functions and accompanying example code are available in an open online repository.

  7. Modeling changes in biomass composition during microwave-based alkali pretreatment of switchgrass.

    PubMed

    Keshwani, Deepak R; Cheng, Jay J

    2010-01-01

    This study used two different approaches to model changes in biomass composition during microwave-based pretreatment of switchgrass: kinetic modeling using a time-dependent rate coefficient, and a Mamdani-type fuzzy inference system. In both modeling approaches, the dielectric loss tangent of the alkali reagent and pretreatment time were used as predictors for changes in amounts of lignin, cellulose, and xylan during the pretreatment. Training and testing data sets for development and validation of the models were obtained from pretreatment experiments conducted using 1-3% w/v NaOH (sodium hydroxide) and pretreatment times ranging from 5 to 20 min. The kinetic modeling approach for lignin and xylan gave comparable results for training and testing data sets, and the differences between the predictions and experimental values were within 2%. The kinetic modeling approach for cellulose was not as effective, and the differences were within 5-7%. The time-dependent rate coefficients of the kinetic models estimated from experimental data were consistent with the heterogeneity of individual biomass components. The Mamdani-type fuzzy inference was shown to be an effective approach to model the pretreatment process and yielded predictions with less than 2% deviation from the experimental values for lignin and with less than 3% deviation from the experimental values for cellulose and xylan. The entropies of the fuzzy outputs from the Mamdani-type fuzzy inference system were calculated to quantify the uncertainty associated with the predictions. Results indicate that there is no significant difference between the entropies associated with the predictions for lignin, cellulose, and xylan. It is anticipated that these models could be used in process simulations of bioethanol production from lignocellulosic materials.

  8. Predicting Deforestation Patterns in Loreto, Peru from 2000-2010 Using a Nested GLM Approach

    NASA Astrophysics Data System (ADS)

    Vijay, V.; Jenkins, C.; Finer, M.; Pimm, S.

    2013-12-01

    Loreto is the largest province in Peru, covering about 370,000 km². Because of its remote location in the Amazonian rainforest, it is also one of the most sparsely populated. Though a majority of the region remains covered by forest, deforestation is being driven by human encroachment through industrial activities and the spread of colonization and agriculture. The importance of accurate predictive modeling of deforestation has spawned an extensive body of literature on the topic. We present a nested GLM approach based on predictions of deforestation from 2000-2010 and using variables representing the expected drivers of deforestation. Models were constructed using 2000 to 2005 changes and tested against data for 2005 to 2010. The most complex model, which included transportation variables (roads and navigable rivers), spatial contagion processes, population centers and industrial activities, performed better in predicting the 2005 to 2010 changes (75.8% accurate) than did a simpler model using only transportation variables (69.2% accurate). Finally, we contrast the GLM approach with a more complex spatially articulated model.

  9. Optimization of a novel biophysical model using large scale in vivo antisense hybridization data displays improved prediction capabilities of structurally accessible RNA regions

    PubMed Central

    Vazquez-Anderson, Jorge; Mihailovic, Mia K.; Baldridge, Kevin C.; Reyes, Kristofer G.; Haning, Katie; Cho, Seung Hee; Amador, Paul; Powell, Warren B.

    2017-01-01

    Current approaches to design efficient antisense RNAs (asRNAs) rely primarily on a thermodynamic understanding of RNA–RNA interactions. However, these approaches depend on structure predictions and have limited accuracy, arguably due to overlooking important cellular environment factors. In this work, we develop a biophysical model to describe asRNA–RNA hybridization that incorporates in vivo factors using large-scale experimental hybridization data for three model RNAs: a group I intron, CsrB and a tRNA. A unique element of our model is the estimation of the availability of the target region to interact with a given asRNA using a differential entropic consideration of suboptimal structures. We showcase the utility of this model by evaluating its prediction capabilities in four additional RNAs: a group II intron, Spinach II, 2-MS2 binding domain and glgC 5′ UTR. Additionally, we demonstrate the applicability of this approach to other bacterial species by predicting sRNA–mRNA binding regions in two newly discovered, though uncharacterized, regulatory RNAs. PMID:28334800

  10. Predicting locations of rare aquatic species’ habitat with a combination of species-specific and assemblage-based models

    USGS Publications Warehouse

    McKenna, James E.; Carlson, Douglas M.; Payne-Wynne, Molly L.

    2013-01-01

    Aim: Rare aquatic species are a substantial component of biodiversity, and their conservation is a major objective of many management plans. However, they are difficult to assess, and their optimal habitats are often poorly known. Methods to effectively predict the likely locations of suitable rare aquatic species habitats are needed. We combine two modelling approaches to predict occurrence and general abundance of several rare fish species. Location: Allegheny watershed of western New York State (USA). Methods: Our method used two empirical neural network modelling approaches (species specific and assemblage based) to predict stream-by-stream occurrence and general abundance of rare darters, based on broad-scale habitat conditions. Species-specific models were developed for longhead darter (Percina macrocephala), spotted darter (Etheostoma maculatum) and variegate darter (Etheostoma variatum) in the Allegheny drainage. An additional model predicted the type of rare darter-containing assemblage expected in each stream reach. Predictions from both models were then combined inclusively and exclusively and compared with additional independent data. Results: Example rare darter predictions demonstrate the method's effectiveness. Models performed well (R2 ≥ 0.79), identified where suitable darter habitat was most likely to occur, and predictions matched well to those of collection sites. Additional independent data showed that the most conservative (exclusive) model slightly underestimated the distributions of these rare darters or predictions were displaced by one stream reach, suggesting that new darter habitat types were detected in the later collections. Main conclusions: Broad-scale habitat variables can be used to effectively identify rare species' habitats. Combining species-specific and assemblage-based models enhances our ability to make use of the sparse data on rare species and to identify habitat units most likely and least likely to support those species. This hybrid approach may assist managers with the prioritization of habitats to be examined or conserved for rare species.
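
    The inclusive and exclusive combination of two sets of predictions described above can be sketched as follows. This is only an illustration under stated assumptions: "inclusive" is read here as "either model predicts presence" and "exclusive" as "both models predict presence", and the reach identifiers and predictions are hypothetical rather than taken from the study.

```python
# Hypothetical presence/absence predictions per stream reach (illustrative only).
species_model = {"reach_01": True, "reach_02": True, "reach_03": False, "reach_04": False}
assemblage_model = {"reach_01": True, "reach_02": False, "reach_03": True, "reach_04": False}


def combine(a, b, mode):
    """Combine two presence/absence maps keyed by stream reach.

    mode="inclusive": flag a reach if either model predicts presence (assumed reading).
    mode="exclusive": flag a reach only if both models predict presence (assumed reading).
    Only reaches present in both maps are returned.
    """
    if mode not in ("inclusive", "exclusive"):
        raise ValueError("mode must be 'inclusive' or 'exclusive'")
    shared_reaches = sorted(set(a) & set(b))
    if mode == "inclusive":
        return {r: a[r] or b[r] for r in shared_reaches}
    return {r: a[r] and b[r] for r in shared_reaches}


inclusive = combine(species_model, assemblage_model, "inclusive")
exclusive = combine(species_model, assemblage_model, "exclusive")
```

    Under this reading the exclusive map is the more conservative one, since it can never flag a reach that the inclusive map leaves out.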

  11. Bayesian Framework Approach for Prognostic Studies in Electrolytic Capacitor under Thermal Overstress Conditions

    DTIC Science & Technology

    2012-09-01

    make end of life (EOL) and remaining useful life (RUL) estimations. Model-based prognostics approaches perform these tasks with the help of first...in parameters Degradation Modeling Parameter estimation Prediction Thermal / Electrical Stress Experimental Data State Space model RUL EOL ...distribution at given single time point kP, and use this for multi-step predictions to EOL. There are several methods which exist for selecting the sigma

  12. Reverse engineering systems models of regulation: discovery, prediction and mechanisms.

    PubMed

    Ashworth, Justin; Wurtmann, Elisabeth J; Baliga, Nitin S

    2012-08-01

    Biological systems can now be understood in comprehensive and quantitative detail using systems biology approaches. Putative genome-scale models can be built rapidly based upon biological inventories and strategic system-wide molecular measurements. Current models combine statistical associations, causative abstractions, and known molecular mechanisms to explain and predict quantitative and complex phenotypes. This top-down 'reverse engineering' approach generates useful organism-scale models despite noise and incompleteness in data and knowledge. Here we review and discuss the reverse engineering of biological systems using top-down data-driven approaches, in order to improve discovery, hypothesis generation, and the inference of biological properties. Copyright © 2011 Elsevier Ltd. All rights reserved.

  13. Towards a whole-cell modeling approach for synthetic biology

    NASA Astrophysics Data System (ADS)

    Purcell, Oliver; Jain, Bonny; Karr, Jonathan R.; Covert, Markus W.; Lu, Timothy K.

    2013-06-01

    Despite rapid advances over the last decade, synthetic biology lacks the predictive tools needed to enable rational design. Unlike established engineering disciplines, the engineering of synthetic gene circuits still relies heavily on experimental trial-and-error, a time-consuming and inefficient process that slows down the biological design cycle. This reliance on experimental tuning is because current modeling approaches are unable to make reliable predictions about the in vivo behavior of synthetic circuits. A major reason for this lack of predictability is that current models view circuits in isolation, ignoring the vast number of complex cellular processes that impinge on the dynamics of the synthetic circuit and vice versa. To address this problem, we present a modeling approach for the design of synthetic circuits in the context of cellular networks. Using the recently published whole-cell model of Mycoplasma genitalium, we examined the effect of adding genes into the host genome. We also investigated how codon usage correlates with gene expression and find agreement with existing experimental results. Finally, we successfully implemented a synthetic Goodwin oscillator in the whole-cell model. We provide an updated software framework for the whole-cell model that lays the foundation for the integration of whole-cell models with synthetic gene circuit models. This software framework is made freely available to the community to enable future extensions. We envision that this approach will be critical to transforming the field of synthetic biology into a rational and predictive engineering discipline.
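
    For readers unfamiliar with the Goodwin oscillator mentioned above, the following is a minimal stand-alone sketch of the textbook three-variable negative-feedback loop, integrated with a simple forward Euler scheme. It is not the whole-cell implementation from the paper; the parameter values (Hill coefficient n, rate constants k and d, step size, and initial state) are arbitrary assumptions chosen for illustration.

```python
import math


def goodwin_step(x, y, z, dt, n=10, k=1.0, d=0.5):
    """One forward-Euler step of a three-variable Goodwin loop.

    x: mRNA, y: protein, z: repressor (all in arbitrary units).
    z represses production of x through a Hill term with coefficient n.
    All parameter values are illustrative assumptions.
    """
    dx = k / (1.0 + z ** n) - d * x
    dy = k * x - d * y
    dz = k * y - d * z
    return x + dt * dx, y + dt * dy, z + dt * dz


def simulate(steps=20000, dt=0.01, state=(0.1, 0.2, 0.3)):
    """Return the list of (x, y, z) states, including the initial state."""
    trajectory = [state]
    for _ in range(steps):
        state = goodwin_step(*state, dt)
        trajectory.append(state)
    return trajectory


trajectory = simulate()
```

    Forward Euler is used only to keep the sketch dependency-free; a stiff or long simulation would normally call for a proper ODE solver.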

  14. Comparison of individual-based modeling and population approaches for prediction of foodborne pathogens growth.

    PubMed

    Augustin, Jean-Christophe; Ferrier, Rachel; Hezard, Bernard; Lintz, Adrienne; Stahl, Valérie

    2015-02-01

    An individual-based modeling (IBM) approach combined with microenvironment modeling of vacuum-packed cold-smoked salmon was more effective at describing the variability of the growth of a few Listeria monocytogenes cells contaminating irradiated salmon slices than the traditional population models. The IBM approach was particularly relevant for predicting the absence of growth in 25% (5 among 20) of artificially contaminated cold-smoked salmon samples stored at 8 °C. These results confirmed similar observations obtained with smear soft cheese (Ferrier et al., 2013). These two different food models were used to compare the IBM/microscale and population/macroscale modeling approaches in more global exposure and risk assessment frameworks taking into account the variability and/or the uncertainty of the factors influencing the growth of L. monocytogenes. We observed that the traditional population models significantly overestimate exposure and risk estimates in comparison to the IBM approach when contamination of foods occurs with a low number of cells (<100 per serving). Moreover, the exposure estimates obtained with the population model were characterized by a great uncertainty. The overestimation was mainly linked to the ability of IBM to predict no-growth situations rather than the consideration of the microscale environment. On the other hand, when the aim of quantitative risk assessment studies is only to assess the relative impact of changes in control measures affecting the growth of foodborne bacteria, the two modeling approaches gave similar results and the simplest population approach was suitable. Copyright © 2014 Elsevier Ltd. All rights reserved.

  15. Modeling Stationary Lithium-Ion Batteries for Optimization and Predictive Control

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baker, Kyri A; Shi, Ying; Christensen, Dane T

    Accurately modeling stationary battery storage behavior is crucial to understand and predict its limitations in demand-side management scenarios. In this paper, a lithium-ion battery model was derived to estimate lifetime and state-of-charge for building-integrated use cases. The proposed battery model aims to balance speed and accuracy when modeling battery behavior for real-time predictive control and optimization. In order to achieve these goals, a mixed modeling approach was taken, which incorporates regression fits to experimental data and an equivalent circuit to model battery behavior. A comparison of the proposed battery model output to actual data from the manufacturer validates the modeling approach taken in the paper. Additionally, a dynamic test case demonstrates the effects of using regression models to represent internal resistance and capacity fading.
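
    As a rough illustration of the equivalent-circuit idea mentioned above, the sketch below tracks state-of-charge by coulomb counting and computes terminal voltage as open-circuit voltage minus the drop across a series resistance. It is not the model from the paper: the linear open-circuit-voltage curve, the capacity, and the constant resistance are hypothetical placeholders, whereas the abstract describes regression fits for internal resistance and capacity fading.

```python
def open_circuit_voltage(soc):
    """Hypothetical linear open-circuit voltage (volts) for soc in [0, 1]."""
    return 3.0 + 1.2 * soc


def step(soc, current_a, dt_s, capacity_ah=2.0, resistance_ohm=0.05):
    """Advance the cell by one time step.

    Positive current discharges the cell. Capacity and resistance are
    assumed constants here, not fitted values.
    Returns (new_soc, terminal_voltage).
    """
    # Coulomb counting: charge removed divided by total capacity in coulombs.
    new_soc = soc - current_a * dt_s / (capacity_ah * 3600.0)
    new_soc = min(1.0, max(0.0, new_soc))
    # Terminal voltage of a simple series-resistance equivalent circuit.
    voltage = open_circuit_voltage(new_soc) - current_a * resistance_ohm
    return new_soc, voltage


soc = 1.0
voltage = open_circuit_voltage(soc)
for _ in range(60):  # one hour of 1 A discharge in 60 s steps
    soc, voltage = step(soc, current_a=1.0, dt_s=60.0)
```

    A model this cheap to evaluate is the kind of thing that suits real-time predictive control, at the cost of ignoring transient dynamics.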

  16. Modeling Stationary Lithium-Ion Batteries for Optimization and Predictive Control: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Raszmann, Emma; Baker, Kyri; Shi, Ying

    Accurately modeling stationary battery storage behavior is crucial to understand and predict its limitations in demand-side management scenarios. In this paper, a lithium-ion battery model was derived to estimate lifetime and state-of-charge for building-integrated use cases. The proposed battery model aims to balance speed and accuracy when modeling battery behavior for real-time predictive control and optimization. In order to achieve these goals, a mixed modeling approach was taken, which incorporates regression fits to experimental data and an equivalent circuit to model battery behavior. A comparison of the proposed battery model output to actual data from the manufacturer validates the modeling approach taken in the paper. Additionally, a dynamic test case demonstrates the effects of using regression models to represent internal resistance and capacity fading.

  17. ADMET Evaluation in Drug Discovery. 16. Predicting hERG Blockers by Combining Multiple Pharmacophores and Machine Learning Approaches.

    PubMed

    Wang, Shuangquan; Sun, Huiyong; Liu, Hui; Li, Dan; Li, Youyong; Hou, Tingjun

    2016-08-01

    Blockade of human ether-à-go-go related gene (hERG) channel by compounds may lead to drug-induced QT prolongation, arrhythmia, and Torsades de Pointes (TdP), and therefore reliable prediction of hERG liability in the early stages of drug design is quite important to reduce the risk of cardiotoxicity-related attritions in the later development stages. In this study, pharmacophore modeling and machine learning approaches were combined to construct classification models to distinguish hERG active from inactive compounds based on a diverse data set. First, an optimal ensemble of pharmacophore hypotheses that had good capability to differentiate hERG active from inactive compounds was identified by the recursive partitioning (RP) approach. Then, the naive Bayesian classification (NBC) and support vector machine (SVM) approaches were employed to construct classification models by integrating multiple important pharmacophore hypotheses. The integrated classification models showed improved predictive capability over any single pharmacophore hypothesis, suggesting that the broad binding polyspecificity of hERG can only be well characterized by multiple pharmacophores. The best SVM model achieved the prediction accuracies of 84.7% for the training set and 82.1% for the external test set. Notably, the accuracies for the hERG blockers and nonblockers in the test set reached 83.6% and 78.2%, respectively. Analysis of significant pharmacophores helps to understand the multimechanisms of action of hERG blockers. We believe that the combination of pharmacophore modeling and SVM is a powerful strategy to develop reliable theoretical models for the prediction of potential hERG liability.

  18. Predicting drug-target interactions using restricted Boltzmann machines.

    PubMed

    Wang, Yuhao; Zeng, Jianyang

    2013-07-01

    In silico prediction of drug-target interactions plays an important role toward identifying and developing new uses of existing or abandoned drugs. Network-based approaches have recently become a popular tool for discovering new drug-target interactions (DTIs). Unfortunately, most of these network-based approaches can only predict binary interactions between drugs and targets, and information about different types of interactions has not been well exploited for DTI prediction in previous studies. On the other hand, incorporating additional information about drug-target relationships or drug modes of action can improve prediction of DTIs. Furthermore, the predicted types of DTIs can broaden our understanding about the molecular basis of drug action. We propose a first machine learning approach to integrate multiple types of DTIs and predict unknown drug-target relationships or drug modes of action. We cast the new DTI prediction problem into a two-layer graphical model, called restricted Boltzmann machine, and apply a practical learning algorithm to train our model and make predictions. Tests on two public databases show that our restricted Boltzmann machine model can effectively capture the latent features of a DTI network and achieve excellent performance on predicting different types of DTIs, with the area under precision-recall curve up to 89.6. In addition, we demonstrate that integrating multiple types of DTIs can significantly outperform other predictions either by simply mixing multiple types of interactions without distinction or using only a single interaction type. Further tests show that our approach can infer a high fraction of novel DTIs that has been validated by known experiments in the literature or other databases. These results indicate that our approach can have highly practical relevance to DTI prediction and drug repositioning, and hence advance the drug discovery process. Software and datasets are available on request. 
Supplementary data are available at Bioinformatics online.

  19. Bayesian geostatistical prediction of soil organic carbon contents of solonchak soils in northern Tarim Basin, Xinjiang, China.

    PubMed

    Wu, Wei Mo; Wang, Jia Qiang; Cao, Qi; Wu, Jia Ping

    2017-02-01

    Accurate prediction of soil organic carbon (SOC) distribution is crucial for soil resources utilization and conservation, climate change adaptation, and ecosystem health. In this study, we selected a 1300 m × 1700 m solonchak sampling area in northern Tarim Basin, Xinjiang, China, and collected a total of 144 soil samples (5-10 cm). The objectives of this study were to build a Bayesian geostatistical model to predict SOC content, and to assess the performance of the Bayesian model for the prediction of SOC content by comparing it with three other geostatistical approaches [ordinary kriging (OK), sequential Gaussian simulation (SGS), and inverse distance weighting (IDW)]. In the study area, soil organic carbon contents ranged from 1.59 to 9.30 g·kg⁻¹ with a mean of 4.36 g·kg⁻¹ and a standard deviation of 1.62 g·kg⁻¹. The sample semivariogram was best fitted by an exponential model with a nugget-to-sill ratio of 0.57. By using the Bayesian geostatistical approach, we generated the SOC content map, and obtained the prediction variance and the upper and lower 95% limits of SOC content, which were then used to evaluate the prediction uncertainty. The Bayesian geostatistical approach performed better than OK, SGS and IDW, demonstrating the advantages of the Bayesian approach in SOC prediction.
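
    Of the comparison methods named above, inverse distance weighting (IDW) is simple enough to sketch in a few lines. The example below is a generic IDW interpolator, not the implementation used in the study; the power parameter of 2 and the three sample points are assumptions for illustration.

```python
import math


def idw_predict(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples: list of (xi, yi, value) tuples.
    power: distance exponent (2.0 is a common default, assumed here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        dist = math.hypot(x - xi, y - yi)
        if dist == 0.0:
            # Exactly on a sample point: return the measured value.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical SOC samples (coordinates in metres, values in g/kg).
samples = [(0.0, 0.0, 2.0), (100.0, 0.0, 6.0), (0.0, 100.0, 4.0)]
at_sample = idw_predict(samples, 0.0, 0.0)
midpoint = idw_predict(samples, 50.0, 0.0)
```

    Unlike kriging or the Bayesian approach, plain IDW yields no prediction variance, which is one reason the methods are not directly comparable on uncertainty.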

  20. HABITAT MODELING APPROACHES FOR RESTORATION SITE SELECTION

    EPA Science Inventory

    Numerous modeling approaches have been used to develop predictive models of species-environment and species-habitat relationships. These models have been used in conservation biology and habitat or species management, but their application to restoration efforts has been minimal...

  1. A model for predicting air quality along highways.

    DOT National Transportation Integrated Search

    1973-01-01

    The subject of this report is an air quality prediction model for highways, AIRPOL Version 2, July 1973. AIRPOL has been developed by modifying the basic Gaussian approach to gaseous dispersion. The resultant model is smooth and continuous throughout...

  2. Use of statistical and neural net approaches in predicting toxicity of chemicals.

    PubMed

    Basak, S C; Grunwald, G D; Gute, B D; Balasubramanian, K; Opitz, D

    2000-01-01

    Hierarchical quantitative structure-activity relationships (H-QSAR) have been developed as a new approach in constructing models for estimating physicochemical, biomedicinal, and toxicological properties of interest. This approach uses increasingly more complex molecular descriptors in a graduated approach to model building. In this study, statistical and neural network methods have been applied to the development of H-QSAR models for estimating the acute aquatic toxicity (LC50) of 69 benzene derivatives to Pimephales promelas (fathead minnow). Topostructural, topochemical, geometrical, and quantum chemical indices were used as the four levels of the hierarchical method. It is clear from both the statistical and neural network models that topostructural indices alone cannot adequately model this set of congeneric chemicals. Not surprisingly, topochemical indices greatly increase the predictive power of both statistical and neural network models. Quantum chemical indices also add significantly to the modeling of this set of acute aquatic toxicity data.

  3. Model predictive and reallocation problem for CubeSat fault recovery and attitude control

    NASA Astrophysics Data System (ADS)

    Franchi, Loris; Feruglio, Lorenzo; Mozzillo, Raffaele; Corpino, Sabrina

    2018-01-01

    In recent years, thanks to growing know-how in machine-learning techniques and advances in the computational capabilities of on-board processing, computationally expensive algorithms, such as Model Predictive Control, have begun to spread in space applications, even on small on-board processors. The paper presents an algorithm for optimal fault recovery of a 3U CubeSat, developed in the MathWorks MATLAB & Simulink environment. This algorithm uses optimization techniques to obtain the optimal recovery solution and a Model Predictive Control approach for the attitude control. The simulated system is a CubeSat in Low Earth Orbit: the attitude control is performed with three magnetic torquers and a single reaction wheel. The simulation neglects the errors in the attitude determination of the satellite, and focuses on the recovery approach and control method. The optimal recovery approach takes advantage of the properties of magnetic actuation, which makes it possible to redistribute the control action when a fault occurs on a single magnetic torquer, even in the absence of redundant actuators. In addition, the paper presents the results of implementing the Model Predictive Control approach to control the attitude of the satellite.

  4. Modelling the mating system of polar bears: a mechanistic approach to the Allee effect.

    PubMed

    Molnár, Péter K; Derocher, Andrew E; Lewis, Mark A; Taylor, Mitchell K

    2008-01-22

    Allee effects may render exploited animal populations extinction prone, but empirical data are often lacking to describe the circumstances leading to an Allee effect. Arbitrary assumptions regarding Allee effects could lead to erroneous management decisions so that predictive modelling approaches are needed that identify the circumstances leading to an Allee effect before such a scenario occurs. We present a predictive approach of Allee effects for polar bears where low population densities, an unpredictable habitat and harvest-depleted male populations result in infrequent mating encounters. We develop a mechanistic model for the polar bear mating system that predicts the proportion of fertilized females at the end of the mating season given population density and operational sex ratio. The model is parametrized using pairing data from Lancaster Sound, Canada, and describes the observed pairing dynamics well. Female mating success is shown to be a nonlinear function of the operational sex ratio, so that a sudden and rapid reproductive collapse could occur if males are severely depleted. The operational sex ratio where an Allee effect is expected is dependent on population density. We focus on the prediction of Allee effects in polar bears but our approach is also applicable to other species.

  5. Classification of Time Series Gene Expression in Clinical Studies via Integration of Biological Network

    PubMed Central

    Qian, Liwei; Zheng, Haoran; Zhou, Hong; Qin, Ruibin; Li, Jinlong

    2013-01-01

    The increasing availability of time series expression datasets, although promising, raises a number of new computational challenges. Accordingly, the development of suitable classification methods to make reliable and sound predictions is becoming a pressing issue. We propose, here, a new method to classify time series gene expression via integration of biological networks. We evaluated our approach on 2 different datasets and showed that the use of a hidden Markov model/Gaussian mixture models hybrid explores the time-dependence of the expression data, thereby leading to better prediction results. We demonstrated that the biclustering procedure identifies function-related genes as a whole, giving rise to high accordance in prognosis prediction across independent time series datasets. In addition, we showed that integration of biological networks into our method significantly improves prediction performance. Moreover, we compared our approach with several state-of-the-art algorithms and found that our method outperformed previous approaches with regard to various criteria. Finally, our approach achieved better prediction results on early-stage data, implying the potential of our method for practical prediction. PMID:23516469

  6. Weather models as virtual sensors to data-driven rainfall predictions in urban watersheds

    NASA Astrophysics Data System (ADS)

    Cozzi, Lorenzo; Galelli, Stefano; Pascal, Samuel Jolivet De Marc; Castelletti, Andrea

    2013-04-01

    Weather and climate predictions are a key element of urban hydrology, where they are used to inform water management and assist in delivering flood warnings. Indeed, the modelling of the very fast dynamics of urbanized catchments can be substantially improved by the use of weather/rainfall predictions. For example, in Singapore's Marina Reservoir catchment, runoff processes have a very short time of concentration (roughly one hour); observational data are thus nearly useless for runoff predictions, and weather predictions are required. Unfortunately, radar nowcasting methods do not allow long-term weather predictions to be carried out, whereas numerical models are limited by their coarse spatial scale. Moreover, numerical models are usually poorly reliable because of the fast motion and limited spatial extension of rainfall events. In this study we investigate the combined use of data-driven modelling techniques and weather variables observed/simulated with a numerical model as a way to improve rainfall prediction accuracy and lead time in the Singapore metropolitan area. To explore the feasibility of the approach, we use a Weather Research and Forecasting (WRF) model as a virtual sensor network for the input variables (the states of the WRF model) to a machine learning rainfall prediction model. More precisely, we combine an input variable selection method and a non-parametric tree-based model to characterize the empirical relation between the rainfall measured at the catchment level and all possible weather input variables provided by the WRF model. We explore different lead times to evaluate the model reliability for different long-term predictions, as well as different time lags to see how past information could improve results. Results show that the proposed approach allows a significant improvement of the prediction accuracy of the WRF model over the Singapore urban area.

  7. Predicting birth weight with conditionally linear transformation models.

    PubMed

    Möst, Lisa; Schmid, Matthias; Faschingbauer, Florian; Hothorn, Torsten

    2016-12-01

    Low and high birth weight (BW) are important risk factors for neonatal morbidity and mortality. Gynecologists must therefore accurately predict BW before delivery. Most prediction formulas for BW are based on prenatal ultrasound measurements carried out within one week prior to birth. Although successfully used in clinical practice, these formulas focus on point predictions of BW but do not systematically quantify uncertainty of the predictions, i.e. they result in estimates of the conditional mean of BW but do not deliver prediction intervals. To overcome this problem, we introduce conditionally linear transformation models (CLTMs) to predict BW. Instead of focusing only on the conditional mean, CLTMs model the whole conditional distribution function of BW given prenatal ultrasound parameters. Consequently, the CLTM approach delivers both point predictions of BW and fetus-specific prediction intervals. Prediction intervals constitute an easy-to-interpret measure of prediction accuracy and allow identification of fetuses subject to high prediction uncertainty. Using a data set of 8712 deliveries at the Perinatal Centre at the University Clinic Erlangen (Germany), we analyzed variants of CLTMs and compared them to standard linear regression estimation techniques used in the past and to quantile regression approaches. The best-performing CLTM variant was competitive with quantile regression and linear regression approaches in terms of conditional coverage and average length of the prediction intervals. We propose that CLTMs be used because they are able to account for possible heteroscedasticity, kurtosis, and skewness of the distribution of BWs. © The Author(s) 2014.
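
    The two criteria used above to compare interval forecasts, coverage and average interval length, can be computed as in the sketch below. Note the simplification: this computes overall (marginal) coverage across all cases, whereas the abstract refers to conditional coverage. The birth weights and interval bounds are hypothetical numbers in grams, not data from the study.

```python
def interval_summary(observed, lower, upper):
    """Return (coverage, mean_length) for a set of prediction intervals.

    coverage: fraction of observations falling inside their interval
              (marginal coverage, a simplification of conditional coverage).
    mean_length: average width of the intervals.
    """
    if not observed or not (len(observed) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    covered = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    coverage = covered / len(observed)
    mean_length = sum(hi - lo for lo, hi in zip(lower, upper)) / len(observed)
    return coverage, mean_length


# Hypothetical birth weights (g) and fetus-specific interval bounds.
observed = [3200, 3550, 2900, 4100]
lower = [2900, 3300, 3000, 3700]
upper = [3500, 3900, 3400, 4300]
coverage, mean_length = interval_summary(observed, lower, upper)
```

    The two numbers trade off against each other: wider intervals raise coverage but are less informative, so both are usually reported together.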

  8. Model Averaging for Predicting the Exposure to Aflatoxin B1 Using DNA Methylation in White Blood Cells of Infants

    NASA Astrophysics Data System (ADS)

    Rahardiantoro, S.; Sartono, B.; Kurnia, A.

    2017-03-01

    In recent years, DNA methylation has become a prominent topic in revealing the patterns of many human diseases. Very large data sets are unavoidable in this setting. In addition, some researchers are interested in making predictions from these large data sets, especially using regression analysis. The classical approach fails at this task. Model averaging by Ando and Li [1] could be an alternative approach to this problem. This research applied model averaging to obtain the best prediction for high-dimensional data. In practice, model averaging was implemented on the case study by Vargas et al. [3], which provides data on exposure to aflatoxin B1 (AFB1) and DNA methylation in white blood cells of infants in The Gambia. The best ensemble model was selected based on the minimum MAPE, MAE, and MSE of the predictions. The result is an ensemble model obtained by model averaging in which the number of predictors in each candidate model is 15.
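
    The three error measures named above (MAPE, MAE, MSE) can be computed as in the following sketch, which also picks the candidate with the lowest MSE. The model names and numbers are hypothetical, and the sketch only illustrates the selection criteria, not the Ando and Li model-averaging weights themselves.

```python
def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical observations and predictions from two candidate ensembles.
actual = [10.0, 20.0, 40.0]
candidates = {
    "model_a": [12.0, 18.0, 44.0],
    "model_b": [11.0, 21.0, 38.0],
}

scores = {
    name: {"MAPE": mape(actual, pred), "MAE": mae(actual, pred), "MSE": mse(actual, pred)}
    for name, pred in candidates.items()
}
best_by_mse = min(scores, key=lambda name: scores[name]["MSE"])
```

    When the three measures disagree on the best candidate, some tie-breaking rule is needed; the abstract does not say which one was used.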

  9. Predicting fundamental and realized distributions based on thermal niche: A case study of a freshwater turtle

    NASA Astrophysics Data System (ADS)

    Rodrigues, João Fabrício Mota; Coelho, Marco Túlio Pacheco; Ribeiro, Bruno R.

    2018-04-01

    Species distribution models (SDM) have been broadly used in ecology to address theoretical and practical problems. Currently, there are two main approaches to generate SDMs: (i) correlative models, which are based on species occurrences and environmental predictor layers, and (ii) process-based models, which are constructed based on species' functional traits and physiological tolerances. The distributions estimated by each approach are based on different components of species niche. Predictions of correlative models approach species realized niches, while predictions of process-based models are more akin to species fundamental niche. Here, we integrated the predictions of fundamental and realized distributions of the freshwater turtle Trachemys dorbigni. Fundamental distribution was estimated using data of T. dorbigni's egg incubation temperature, and realized distribution was estimated using species occurrence records. Both types of distributions were estimated using the same regression approaches (logistic regression and support vector machines), both considering macroclimatic and microclimatic temperatures. The realized distribution of T. dorbigni was generally nested in its fundamental distribution, reinforcing theoretical assumptions that the species' realized niche is a subset of its fundamental niche. Both modelling algorithms produced similar results, but microtemperature generated better results than macrotemperature for the incubation model. Finally, our results reinforce the conclusion that species realized distributions are constrained by factors other than just thermal tolerances.

  10. Moving beyond regression techniques in cardiovascular risk prediction: applying machine learning to address analytic challenges

    PubMed Central

    Goldstein, Benjamin A.; Navar, Ann Marie; Carter, Rickey E.

    2017-01-01

    Risk prediction plays an important role in clinical cardiology research. Traditionally, most risk models have been based on regression models. While useful and robust, these statistical methods are limited to using a small number of predictors which operate in the same way on everyone, and uniformly throughout their range. The purpose of this review is to illustrate the use of machine-learning methods for development of risk prediction models. Typically presented as black box approaches, most machine-learning methods are aimed at solving particular challenges that arise in data analysis that are not well addressed by typical regression approaches. To illustrate these challenges, as well as how different methods can address them, we consider predicting mortality after diagnosis of acute myocardial infarction. We use data derived from our institution's electronic health record and abstract data on 13 regularly measured laboratory markers. We walk through different challenges that arise in modelling these data and then introduce different machine-learning approaches. Finally, we discuss general issues in the application of machine-learning methods including tuning parameters, loss functions, variable importance, and missing data. Overall, this review serves as an introduction for those working on risk modelling to approach the diffuse field of machine learning. PMID:27436868

  11. Benchmarking novel approaches for modelling species range dynamics

    PubMed Central

    Zurell, Damaris; Thuiller, Wilfried; Pagel, Jörn; Cabral, Juliano S; Münkemüller, Tamara; Gravel, Dominique; Dullinger, Stefan; Normand, Signe; Schiffers, Katja H.; Moore, Kara A.; Zimmermann, Niklaus E.

    2016-01-01

    Increasing biodiversity loss due to climate change is one of the most vital challenges of the 21st century. To anticipate and mitigate biodiversity loss, models are needed that reliably project species’ range dynamics and extinction risks. Recently, several new approaches to model range dynamics have been developed to supplement correlative species distribution models (SDMs), but applications clearly lag behind model development. Indeed, no comparative analysis has been performed to evaluate their performance. Here, we build on process-based, simulated data for benchmarking five range (dynamic) models of varying complexity including classical SDMs, SDMs coupled with simple dispersal or more complex population dynamic models (SDM hybrids), and a hierarchical Bayesian process-based dynamic range model (DRM). We specifically test the effects of demographic and community processes on model predictive performance. Under current climate, DRMs performed best, although only marginally. Under climate change, predictive performance varied considerably, with no clear winners. Yet, all range dynamic models improved predictions under climate change substantially compared to purely correlative SDMs, and the population dynamic models also predicted reasonable extinction risks for most scenarios. When benchmarking data were simulated with more complex demographic and community processes, simple SDM hybrids including only dispersal often proved most reliable. Finally, we found that structural decisions during model building can have great impact on model accuracy, but prior system knowledge on important processes can reduce these uncertainties considerably. Our results reassure the clear merit in using dynamic approaches for modelling species’ response to climate change but also emphasise several needs for further model and data improvement. 
We propose and discuss perspectives for improving range projections through combination of multiple models and for making these approaches operational for large numbers of species. PMID:26872305

  12. Benchmarking novel approaches for modelling species range dynamics.

    PubMed

    Zurell, Damaris; Thuiller, Wilfried; Pagel, Jörn; Cabral, Juliano S; Münkemüller, Tamara; Gravel, Dominique; Dullinger, Stefan; Normand, Signe; Schiffers, Katja H; Moore, Kara A; Zimmermann, Niklaus E

    2016-08-01

    Increasing biodiversity loss due to climate change is one of the most vital challenges of the 21st century. To anticipate and mitigate biodiversity loss, models are needed that reliably project species' range dynamics and extinction risks. Recently, several new approaches to model range dynamics have been developed to supplement correlative species distribution models (SDMs), but applications clearly lag behind model development. Indeed, no comparative analysis has been performed to evaluate their performance. Here, we build on process-based, simulated data for benchmarking five range (dynamic) models of varying complexity including classical SDMs, SDMs coupled with simple dispersal or more complex population dynamic models (SDM hybrids), and a hierarchical Bayesian process-based dynamic range model (DRM). We specifically test the effects of demographic and community processes on model predictive performance. Under current climate, DRMs performed best, although only marginally. Under climate change, predictive performance varied considerably, with no clear winners. Yet, all range dynamic models improved predictions under climate change substantially compared to purely correlative SDMs, and the population dynamic models also predicted reasonable extinction risks for most scenarios. When benchmarking data were simulated with more complex demographic and community processes, simple SDM hybrids including only dispersal often proved most reliable. Finally, we found that structural decisions during model building can have great impact on model accuracy, but prior system knowledge on important processes can reduce these uncertainties considerably. Our results reassure the clear merit in using dynamic approaches for modelling species' response to climate change but also emphasize several needs for further model and data improvement. 
    We propose and discuss perspectives for improving range projections through combination of multiple models and for making these approaches operational for large numbers of species. © 2016 John Wiley & Sons Ltd.

  13. The prediction of drug metabolism, tissue distribution, and bioavailability of 50 structurally diverse compounds in rat using mechanism-based absorption, distribution, and metabolism prediction tools.

    PubMed

    De Buck, Stefan S; Sinha, Vikash K; Fenu, Luca A; Gilissen, Ron A; Mackie, Claire E; Nijsen, Marjoleen J

    2007-04-01

    The aim of this study was to assess a physiologically based modeling approach for predicting drug metabolism, tissue distribution, and bioavailability in rat for a structurally diverse set of neutral and moderate-to-strong basic compounds (n = 50). Hepatic blood clearance (CL(h)) was projected using microsomal data and shown to be well predicted, irrespective of the type of hepatic extraction model (80% within 2-fold). Best predictions of CL(h) were obtained disregarding both plasma and microsomal protein binding, whereas strong bias was seen using either blood binding only or both plasma and microsomal protein binding. Two mechanistic tissue composition-based equations were evaluated for predicting volume of distribution (V(dss)) and tissue-to-plasma partitioning (P(tp)). A first approach, which accounted for ionic interactions with acidic phospholipids, resulted in accurate predictions of V(dss) (80% within 2-fold). In contrast, a second approach, which disregarded ionic interactions, was a poor predictor of V(dss) (60% within 2-fold). The first approach also yielded accurate predictions of P(tp) in muscle, heart, and kidney (80% within 3-fold), whereas in lung, liver, and brain, predictions ranged from 47% to 62% within 3-fold. Using the second approach, P(tp) prediction accuracy in muscle, heart, and kidney was on average 70% within 3-fold, and ranged from 24% to 54% in all other tissues. Combining all methods for predicting V(dss) and CL(h) resulted in accurate predictions of the in vivo half-life (70% within 2-fold). Oral bioavailability was well predicted using CL(h) data and Gastroplus Software (80% within 2-fold). These results illustrate that physiologically based prediction tools can provide accurate predictions of rat pharmacokinetics.
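
The "within 2-fold" and "within 3-fold" figures quoted above are fold-error summaries. A minimal sketch of how such a share could be computed follows, assuming the common symmetric criterion (predicted/observed ratio between 1/fold and fold); the abstract does not spell out the exact definition, and the function name and numbers are hypothetical.

```python
def fraction_within_fold(predicted, observed, fold=2.0):
    """Share of predictions whose ratio to the observed value lies within `fold`."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    hits = 0
    for p, o in zip(predicted, observed):
        ratio = p / o
        # Assumed symmetric criterion: 1/fold <= predicted/observed <= fold.
        if 1.0 / fold <= ratio <= fold:
            hits += 1
    return hits / len(predicted)


# Hypothetical clearance values (arbitrary units), for illustration only.
predicted = [10.0, 4.0, 30.0, 2.0, 9.0]
observed = [12.0, 10.0, 20.0, 2.5, 8.0]
share_2fold = fraction_within_fold(predicted, observed, fold=2.0)
share_3fold = fraction_within_fold(predicted, observed, fold=3.0)
```

With these made-up values, four of the five predictions fall within 2-fold and all five within 3-fold.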

  14. The string prediction models as invariants of time series in the forex market

    NASA Astrophysics Data System (ADS)

    Pincak, R.

    2013-12-01

    In this paper we apply a new approach of string theory to the real financial market. The models are constructed with an idea of prediction models based on the string invariants (PMBSI). The performance of PMBSI is compared to support vector machines (SVM) and artificial neural networks (ANN) on an artificial and a financial time series. A brief overview of the results and analysis is given. The first model is based on the correlation function as invariant and the second one is an application based on the deviations from the closed string/pattern form (PMBCS). We found the difference between these two approaches. The first model cannot predict the behavior of the forex market with good efficiency in comparison with the second one which is, in addition, able to make relevant profit per year. The presented string models could be useful for portfolio creation and financial risk management in the banking sector as well as for a nonlinear statistical approach to data optimization.

  15. Predicting diabetes mellitus using SMOTE and ensemble machine learning approach: The Henry Ford ExercIse Testing (FIT) project.

    PubMed

    Alghamdi, Manal; Al-Mallah, Mouaz; Keteyian, Steven; Brawner, Clinton; Ehrman, Jonathan; Sakr, Sherif

    2017-01-01

    Machine learning is becoming a popular and important approach in the field of medical research. In this study, we investigate the relative performance of various machine learning methods such as Decision Tree, Naïve Bayes, Logistic Regression, Logistic Model Tree and Random Forests for predicting incident diabetes using medical records of cardiorespiratory fitness. In addition, we apply different techniques to uncover potential predictors of diabetes. This FIT project study used data of 32,555 patients who are free of any known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 5-year follow-up. At the completion of the fifth year, 5,099 of those patients have developed diabetes. The dataset contained 62 attributes classified into four categories: demographic characteristics, disease history, medication use history, and stress test vital signs. We developed an Ensembling-based predictive model using 13 attributes that were selected based on their clinical importance, Multiple Linear Regression, and Information Gain Ranking methods. The negative effect of the imbalance class of the constructed model was handled by Synthetic Minority Oversampling Technique (SMOTE). The overall performance of the predictive model classifier was improved by the Ensemble machine learning approach using the Vote method with three Decision Trees (Naïve Bayes Tree, Random Forest, and Logistic Model Tree) and achieved high accuracy of prediction (AUC = 0.92). The study shows the potential of ensembling and SMOTE approaches for predicting incident diabetes using cardiorespiratory fitness data.
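
The two techniques named above can be sketched in a few lines. This is a simplified illustration, not the study's pipeline: `smote` interpolates between a minority-class point and one of its nearest minority neighbours, and `majority_vote` is a hard vote standing in for the Vote method, whose combination rule the abstract does not state. All names and data are hypothetical.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        idx = rng.randrange(len(minority))
        base = minority[idx]
        # Candidate neighbours: every other minority point, nearest first.
        others = [p for j, p in enumerate(minority) if j != idx]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


def majority_vote(predictions):
    """Hard vote over the 0/1 labels returned by several classifiers."""
    return 1 if sum(predictions) * 2 > len(predictions) else 0


# Hypothetical two-feature minority class.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=5)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.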

  16. A Hybrid RANS/LES Approach for Predicting Jet Noise

    NASA Technical Reports Server (NTRS)

    Goldstein, Marvin E.

    2006-01-01

    Hybrid acoustic prediction methods have an important advantage over the current Reynolds averaged Navier-Stokes (RANS) based methods in that they only involve modeling of the relatively universal subscale motion and not the configuration dependent larger scale turbulence. Unfortunately, they are unable to account for the high frequency sound generated by the turbulence in the initial mixing layers. This paper introduces an alternative approach that directly calculates the sound from a hybrid RANS/LES flow model (which can resolve the steep gradients in the initial mixing layers near the nozzle lip) and adopts modeling techniques similar to those used in current RANS based noise prediction methods to determine the unknown sources in the equations for the remaining unresolved components of the sound field. The resulting prediction method would then be intermediate between the current noise prediction codes and previously proposed hybrid noise prediction methods.

  17. Prediction complements explanation in understanding the developing brain.

    PubMed

    Rosenberg, Monica D; Casey, B J; Holmes, Avram J

    2018-02-21

    A central aim of human neuroscience is understanding the neurobiology of cognition and behavior. Although we have made significant progress towards this goal, reliance on group-level studies of the developed adult brain has limited our ability to explain population variability and developmental changes in neural circuitry and behavior. In this review, we suggest that predictive modeling, a method for predicting individual differences in behavior from brain features, can complement descriptive approaches and provide new ways to account for this variability. Highlighting the outsized scientific and clinical benefits of prediction in developmental populations including adolescence, we show that predictive brain-based models are already providing new insights on adolescent-specific risk-related behaviors. Together with large-scale developmental neuroimaging datasets and complementary analytic approaches, predictive modeling affords us the opportunity and obligation to identify novel treatment targets and individually tailor the course of interventions for developmental psychopathologies that impact so many young people today.

  18. Biomechanical Model for Computing Deformations for Whole-Body Image Registration: A Meshless Approach

    PubMed Central

    Li, Mao; Miller, Karol; Joldes, Grand Roman; Kikinis, Ron; Wittek, Adam

    2016-01-01

    Patient-specific biomechanical models have been advocated as a tool for predicting deformations of soft body organs/tissue for medical image registration (aligning two sets of images) when differences between the images are large. However, complex and irregular geometry of the body organs makes generation of patient-specific biomechanical models very time consuming. Meshless discretisation has been proposed to solve this challenge. However, applications so far have been limited to 2-D models and computing single organ deformations. In this study, 3-D comprehensive patient-specific non-linear biomechanical models implemented using Meshless Total Lagrangian Explicit Dynamics (MTLED) algorithms are applied to predict a 3-D deformation field for whole-body image registration. Unlike a conventional approach which requires dividing (segmenting) the image into non-overlapping constituents representing different organs/tissues, the mechanical properties are assigned using the Fuzzy C-Means (FCM) algorithm without the image segmentation. Verification indicates that the deformations predicted using the proposed meshless approach are for practical purposes the same as those obtained using the previously validated finite element models. To quantitatively evaluate the accuracy of the predicted deformations, we determined the spatial misalignment between the registered (i.e. source images warped using the predicted deformations) and target images by computing the edge-based Hausdorff distance. The Hausdorff distance-based evaluation determines that our meshless models led to successful registration of the vast majority of the image features. PMID:26791945
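
For readers unfamiliar with the evaluation metric, the following sketch computes a plain symmetric Hausdorff distance between two point sets. It assumes the edge points have already been extracted from the registered and target images; the edge detection and any robust (percentile) variant used in the study are not shown, and the coordinates are hypothetical.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the worse of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical edge points (in voxel units).
registered_edges = [(0.0, 0.0), (1.0, 0.0)]
target_edges = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]
misalignment = hausdorff(registered_edges, target_edges)
```

A small value means every edge point in one image has a close counterpart in the other; a single unmatched feature, such as the point at (1, 3) here, dominates the result.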

  19. A Pareto-optimal moving average multigene genetic programming model for daily streamflow prediction

    NASA Astrophysics Data System (ADS)

    Danandeh Mehr, Ali; Kahya, Ercan

    2017-06-01

    Genetic programming (GP) is able to systematically explore alternative model structures of different accuracy and complexity from observed input and output data. The effectiveness of GP in hydrological system identification has been recognized in recent studies. However, selecting a parsimonious (accurate and simple) model from such alternatives still remains a question. This paper proposes a Pareto-optimal moving average multigene genetic programming (MA-MGGP) approach to develop a parsimonious model for single-station streamflow prediction. The three main components of the approach that take us from observed data to a validated model are: (1) data pre-processing, (2) system identification and (3) system simplification. The data pre-processing ingredient uses a simple moving average filter to diminish the lagged prediction effect of stand-alone data-driven models. The multigene ingredient of the model tends to identify the underlying nonlinear system with expressions simpler than classical monolithic GP and, eventually simplification component exploits Pareto front plot to select a parsimonious model through an interactive complexity-efficiency trade-off. The approach was tested using the daily streamflow records from a station on Senoz Stream, Turkey. Comparing to the efficiency results of stand-alone GP, MGGP, and conventional multi linear regression prediction models as benchmarks, the proposed Pareto-optimal MA-MGGP model put forward a parsimonious solution, which has a noteworthy importance of being applied in practice. In addition, the approach allows the user to enter human insight into the problem to examine evolved models and pick the best performing programs out for further analysis.
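
Two of the three components described above, the moving average pre-filter and the Pareto-based selection, can be illustrated as follows. This is a sketch under assumptions: a trailing window is used (the abstract does not give the window form or length), and the candidate models with their complexity and error scores are hypothetical.

```python
def moving_average(series, window=3):
    """Trailing simple moving average; the first window-1 values are dropped."""
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def pareto_front(models):
    """Keep models not dominated in (complexity, error); lower is better for both."""
    front = []
    for name, cx, err in models:
        dominated = any(
            c2 <= cx and e2 <= err and (c2 < cx or e2 < err)
            for _, c2, e2 in models
        )
        if not dominated:
            front.append((name, cx, err))
    return front


# Hypothetical daily flows and candidate models (name, complexity, error).
smoothed = moving_average([2.0, 4.0, 6.0, 8.0], window=3)
candidates = [("A", 3, 0.30), ("B", 5, 0.20), ("C", 6, 0.25), ("D", 9, 0.19)]
front = pareto_front(candidates)
```

Model C is dropped because B is both simpler and more accurate; choosing among A, B and D is the interactive complexity-efficiency trade-off left to the user.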

  20. Application of a coupled smoothed particle hydrodynamics (SPH) and coarse-grained (CG) numerical modelling approach to study three-dimensional (3-D) deformations of single cells of different food-plant materials during drying.

    PubMed

    Rathnayaka, C M; Karunasena, H C P; Senadeera, W; Gu, Y T

    2018-03-14

    Numerical modelling has gained popularity in many science and engineering streams due to the economic feasibility and advanced analytical features compared to conventional experimental and theoretical models. Food drying is one of the areas where numerical modelling is increasingly applied to improve drying process performance and product quality. This investigation applies a three-dimensional (3-D) Smoothed Particle Hydrodynamics (SPH) and Coarse-Grained (CG) numerical approach to predict the morphological changes of different categories of food-plant cells such as apple, grape, potato and carrot during drying. To validate the model predictions, experimental findings from in-house experimental procedures (for apple) and sources of literature (for grape, potato and carrot) have been utilised. The subsequent comparison indicates that the model predictions agree reasonably well with the experimental findings, both qualitatively and quantitatively. In this numerical model, a higher computational accuracy has been maintained by limiting the consistency error below 1% for all four cell types. The proposed meshfree-based approach is well-equipped to predict the morphological changes of plant cellular structure over a wide range of moisture contents (10% to 100% dry basis). Compared to the previous 2-D meshfree-based models developed for plant cell drying, the proposed model can draw more useful insights on the morphological behaviour due to the 3-D nature of the model. In addition, the proposed computational modelling approach has a high potential to be used as a comprehensive tool in many other tissue morphology related investigations.

  1. Comparing flood loss models of different complexity

    NASA Astrophysics Data System (ADS)

    Schröter, Kai; Kreibich, Heidi; Vogel, Kristin; Riggelsen, Carsten; Scherbaum, Frank; Merz, Bruno

    2013-04-01

    Any deliberation on flood risk requires the consideration of potential flood losses. In particular, reliable flood loss models are needed to evaluate cost-effectiveness of mitigation measures, to assess vulnerability, for comparative risk analysis and financial appraisal during and after floods. In recent years, considerable improvements have been made both concerning the data basis and the methodological approaches used for the development of flood loss models. Despite that, flood loss models remain an important source of uncertainty. Likewise, the temporal and spatial transferability of flood loss models is still limited. This contribution investigates the predictive capability of different flood loss models in a split sample cross regional validation approach. For this purpose, flood loss models of different complexity, i.e. based on different numbers of explanatory variables, are learned from a set of damage records that was obtained from a survey after the Elbe flood in 2002. The validation of model predictions is carried out for different flood events in the Elbe and Danube river basins in 2002, 2005 and 2006 for which damage records are available from surveys after the flood events. The models investigated are a stage-damage model, the rule based model FLEMOps+r as well as novel model approaches which are derived using data mining techniques of regression trees and Bayesian networks. The Bayesian network approach to flood loss modelling provides attractive additional information concerning the probability distribution of both model predictions and explanatory variables.

  2. Multifidelity, Multidisciplinary Design Under Uncertainty with Non-Intrusive Polynomial Chaos

    NASA Technical Reports Server (NTRS)

    West, Thomas K., IV; Gumbert, Clyde

    2017-01-01

    The primary objective of this work is to develop an approach for multifidelity uncertainty quantification and to lay the framework for future design under uncertainty efforts. In this study, multifidelity is used to describe both the fidelity of the modeling of the physical systems, as well as the difference in the uncertainty in each of the models. For computational efficiency, a multifidelity surrogate modeling approach based on non-intrusive polynomial chaos using the point-collocation technique is developed for the treatment of both multifidelity modeling and multifidelity uncertainty modeling. Two stochastic model problems are used to demonstrate the developed methodologies: a transonic airfoil model and a multidisciplinary aircraft analysis model. The results of both showed the multifidelity modeling approach was able to predict the output uncertainty predicted by the high-fidelity model at a significant reduction in computational cost.

  3. Exploring the social dimension of sandy beaches through predictive modelling.

    PubMed

    Domínguez-Tejo, Elianny; Metternicht, Graciela; Johnston, Emma L; Hedge, Luke

    2018-05-15

    Sandy beaches are unique ecosystems increasingly exposed to human-induced pressures. Consistent with emerging frameworks promoting this holistic approach towards beach management is the need to improve the integration of social data into management practices. This paper aims to increase understanding of links between demographics and community values and preferred beach activities, as key components of the social dimension of the beach environment. A mixed method approach was adopted to elucidate users' opinions on beach preferences and community values through a survey carried out in Manly Local Government Area in Sydney Harbour, Australia. A proposed conceptual model was used to frame demographic models (using age, education, employment, household income and residence status) as predictors of these two community responses. All possible regression-model combinations were compared using Akaike's information criterion. Best models were then used to calculate quantitative likelihoods of the responses, presented as heat maps. Findings concur with international research indicating the relevance of social and restful activities as important social links between the community and the beach environment. Participants' age was a significant variable in the four predictive models. The use of predictive models informed by demographics could potentially increase our understanding of interactions between the social and ecological systems of the beach environment, as a prelude to integrated beach management approaches. The research represents a practical demonstration of how demographic predictive models could support proactive approaches to beach management. Copyright © 2018 Elsevier Ltd. All rights reserved.
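
The all-combinations comparison mentioned above can be sketched as follows. The sketch assumes main-effects-only models built from the five listed demographic predictors and excludes the intercept-only model; the abstract confirms neither choice. Fitting is omitted, so `aic` simply applies the standard definition AIC = 2k - 2 ln(L) to a supplied log-likelihood, and the identifier names are hypothetical.

```python
from itertools import combinations

# The five demographic predictors named in the abstract.
PREDICTORS = ["age", "education", "employment", "household_income", "residence_status"]


def candidate_models(predictors):
    """Yield every non-empty subset of predictors (main effects only, by assumption)."""
    for size in range(1, len(predictors) + 1):
        yield from combinations(predictors, size)


def aic(log_likelihood, n_params):
    """Akaike's information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


models = list(candidate_models(PREDICTORS))
example_score = aic(log_likelihood=-100.0, n_params=3)
```

Under these assumptions there are 2^5 - 1 = 31 candidate models per response, each of which would be fitted and ranked by its AIC.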

  4. Distributed Prognostics based on Structural Model Decomposition

    NASA Technical Reports Server (NTRS)

    Daigle, Matthew J.; Bregon, Anibal; Roychoudhury, I.

    2014-01-01

    Within systems health management, prognostics focuses on predicting the remaining useful life of a system. In the model-based prognostics paradigm, physics-based models are constructed that describe the operation of a system and how it fails. Such approaches consist of an estimation phase, in which the health state of the system is first identified, and a prediction phase, in which the health state is projected forward in time to determine the end of life. Centralized solutions to these problems are often computationally expensive, do not scale well as the size of the system grows, and introduce a single point of failure. In this paper, we propose a novel distributed model-based prognostics scheme that formally describes how to decompose both the estimation and prediction problems into independent local subproblems whose solutions may be easily composed into a global solution. The decomposition of the prognostics problem is achieved through structural decomposition of the underlying models. The decomposition algorithm creates from the global system model a set of local submodels suitable for prognostics. Independent local estimation and prediction problems are formed based on these local submodels, resulting in a scalable distributed prognostics approach that allows the local subproblems to be solved in parallel, thus offering increases in computational efficiency. Using a centrifugal pump as a case study, we perform a number of simulation-based experiments to demonstrate the distributed approach, compare the performance with a centralized approach, and establish its scalability. Index Terms: model-based prognostics, distributed prognostics, structural model decomposition.

  5. Prediction of biomechanical parameters of the proximal femur using statistical appearance models and support vector regression.

    PubMed

    Fritscher, Karl; Schuler, Benedikt; Link, Thomas; Eckstein, Felix; Suhm, Norbert; Hänni, Markus; Hengg, Clemens; Schubert, Rainer

    2008-01-01

    Fractures of the proximal femur are one of the principal causes of mortality among elderly persons. Traditional methods for the determination of femoral fracture risk use methods for measuring bone mineral density. However, BMD alone is not sufficient to predict bone failure load for an individual patient and additional parameters have to be determined for this purpose. In this work an approach that uses statistical models of appearance to identify relevant regions and parameters for the prediction of biomechanical properties of the proximal femur will be presented. By using Support Vector Regression the proposed model based approach is capable of predicting two different biomechanical parameters accurately and fully automatically in two different testing scenarios.

  6. Modeling the reactivities of hydroxyl radical and ozone towards atmospheric organic chemicals using quantitative structure-reactivity relationship approaches.

    PubMed

    Gupta, Shikha; Basant, Nikita; Mohan, Dinesh; Singh, Kunwar P

    2016-07-01

    The persistence and the removal of organic chemicals from the atmosphere are largely determined by their reactions with the OH radical and O3. Experimental determinations of the kinetic rate constants of OH and O3 with a large number of chemicals are tedious and resource intensive and development of computational approaches has widely been advocated. Recently, ensemble machine learning (EML) methods have emerged as unbiased tools to establish relationship between independent and dependent variables having a nonlinear dependence. In this study, EML-based, temperature-dependent quantitative structure-reactivity relationship (QSRR) models have been developed for predicting the kinetic rate constants for OH (kOH) and O3 (kO3) reactions with diverse chemicals. Structural diversity of chemicals was evaluated using a Tanimoto similarity index. The generalization and prediction abilities of the constructed models were established through rigorous internal and external validation performed employing statistical checks. In test data, the EML QSRR models yielded correlation (R²) of ≥0.91 between the measured and the predicted reactivities. The applicability domains of the constructed models were determined using methods based on descriptors range, Euclidean distance, leverage, and standardization approaches. The prediction accuracies for the higher reactivity compounds were relatively better than those of the low reactivity compounds. Proposed EML QSRR models performed well and outperformed the previous reports. The proposed QSRR models can make predictions of rate constants at different temperatures. The proposed models can be useful tools in predicting the reactivities of chemicals towards OH radical and O3 in the atmosphere.
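
The Tanimoto similarity index mentioned above has a compact definition for binary fingerprints: shared features divided by all features present in either molecule. The sketch below assumes each chemical is represented as a set of "on" fingerprint bit positions; the fingerprint type used in the study is not given in the abstract, and the bit sets are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


# Hypothetical fingerprints given as indices of set bits.
chem_1 = {1, 2, 3, 4}
chem_2 = {3, 4, 5}
similarity = tanimoto(chem_1, chem_2)
```

Here two bits are shared out of five present overall, giving 0.4; a data set whose pairwise similarities are mostly low would be considered structurally diverse.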

  7. Comparison of two statistical methods for probability prediction of monthly precipitation during summer over Huaihe River Basin in China, and applications in runoff prediction based on hydrological model

    NASA Astrophysics Data System (ADS)

    Liu, L.; Du, L.; Liao, Y.

    2017-12-01

    Based on the ensemble hindcast dataset of CSM1.1m by NCC, CMA, Bayesian merging models and a two-step statistical model are developed and employed to predict monthly grid/station precipitation in the Huaihe River Basin, China, during summer at lead times of 1 to 3 months. The hindcast datasets span the period 1991 to 2014. The skill of the two models is evaluated using the area under the ROC curve (AUC) in a leave-one-out cross-validation framework, and is compared to the skill of CSM1.1m. CSM1.1m has the highest skill for summer precipitation when initialized in April and the lowest when initialized in May, and has the highest skill for precipitation in June but the lowest for precipitation in July. Compared with the raw outputs of the climate model, some schemes of the two approaches have higher skill for predictions from March and May, but almost all schemes have lower skill for predictions from April. Compared to the two-step approach, one sampling scheme of the Bayesian merging approach has higher skill for predictions from March, but lower skill from May. The results suggest that there is potential to apply the two statistical models for monthly summer precipitation forecasts from March and from May over the Huaihe River Basin, whereas there is potential to apply the CSM1.1m forecast from April. Finally, the summer runoff during 1991 to 2014 is simulated with one hydrological model using the climate hindcasts of CSM1.1m and the two statistical models.
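
The skill measure used above can be illustrated with a short sketch. `roc_auc` computes the area under the ROC curve as the probability that an event year receives a higher forecast score than a non-event year (ties count one half), and `leave_one_out` yields the train/test index splits in which each hindcast year is held out once. The event labels and scores are hypothetical, and the statistical models themselves are not reproduced.

```python
def roc_auc(labels, scores):
    """AUC as the share of (event, non-event) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def leave_one_out(n):
    """Yield (training indices, held-out index) for each of n samples."""
    for i in range(n):
        yield [j for j in range(n) if j != i], i


# Hypothetical held-out forecast probabilities for five years.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.4]
auc = roc_auc(labels, scores)
splits = list(leave_one_out(len(labels)))
```

An AUC of 0.5 corresponds to no skill over random ranking; values closer to 1 indicate better discrimination of wet events.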

  8. A dynamic spatio-temporal model for spatial data

    USGS Publications Warehouse

    Hefley, Trevor J.; Hooten, Mevin B.; Hanks, Ephraim M.; Russell, Robin; Walsh, Daniel P.

    2017-01-01

    Analyzing spatial data often requires modeling dependencies created by a dynamic spatio-temporal data generating process. In many applications, a generalized linear mixed model (GLMM) is used with a random effect to account for spatial dependence and to provide optimal spatial predictions. Location-specific covariates are often included as fixed effects in a GLMM and may be collinear with the spatial random effect, which can negatively affect inference. We propose a dynamic approach to account for spatial dependence that incorporates scientific knowledge of the spatio-temporal data generating process. Our approach relies on a dynamic spatio-temporal model that explicitly incorporates location-specific covariates. We illustrate our approach with a spatially varying ecological diffusion model implemented using a computationally efficient homogenization technique. We apply our model to understand individual-level and location-specific risk factors associated with chronic wasting disease in white-tailed deer from Wisconsin, USA and estimate the location the disease was first introduced. We compare our approach to several existing methods that are commonly used in spatial statistics. Our spatio-temporal approach resulted in a higher predictive accuracy when compared to methods based on optimal spatial prediction, obviated confounding among the spatially indexed covariates and the spatial random effect, and provided additional information that will be important for containing disease outbreaks.

  9. Comparative evaluation of statistical and mechanistic models of Escherichia coli at beaches in southern Lake Michigan

    USGS Publications Warehouse

    Safaie, Ammar; Wendzel, Aaron; Ge, Zhongfu; Nevers, Meredith; Whitman, Richard L.; Corsi, Steven R.; Phanikumar, Mantha S.

    2016-01-01

    Statistical and mechanistic models are popular tools for predicting the levels of indicator bacteria at recreational beaches. Researchers tend to use one class of model or the other, and it is difficult to generalize statements about their relative performance due to differences in how the models are developed, tested, and used. We describe a cooperative modeling approach for freshwater beaches impacted by point sources in which insights derived from mechanistic modeling were used to further improve the statistical models and vice versa. The statistical models provided a basis for assessing the mechanistic models which were further improved using probability distributions to generate high-resolution time series data at the source, long-term “tracer” transport modeling based on observed electrical conductivity, better assimilation of meteorological data, and the use of unstructured-grids to better resolve nearshore features. This approach resulted in improved models of comparable performance for both classes including a parsimonious statistical model suitable for real-time predictions based on an easily measurable environmental variable (turbidity). The modeling approach outlined here can be used at other sites impacted by point sources and has the potential to improve water quality predictions resulting in more accurate estimates of beach closures.

  10. Bayesian Maximum Entropy Integration of Ozone Observations and Model Predictions: A National Application.

    PubMed

    Xu, Yadong; Serre, Marc L; Reyes, Jeanette; Vizuete, William

    2016-04-19

    To improve ozone exposure estimates for ambient concentrations at a national scale, we introduce our novel Regionalized Air Quality Model Performance (RAMP) approach to integrate chemical transport model (CTM) predictions with the available ozone observations using the Bayesian Maximum Entropy (BME) framework. The framework models the nonlinear and nonhomoscedastic relation between air pollution observations and CTM predictions and for the first time accounts for variability in CTM model performance. A validation analysis using only noncollocated data outside of a validation radius r_v was performed and the R² between observations and re-estimated values for two daily metrics, the daily maximum 8-h average (DM8A) and the daily 24-h average (D24A) ozone concentrations, were obtained with the OBS scenario using ozone observations only in contrast with the RAMP and a Constant Air Quality Model Performance (CAMP) scenarios. We show that, by accounting for the spatial and temporal variability in model performance, our novel RAMP approach is able to extract more information in terms of R² increase percentage, with over 12 times for the DM8A and over 3.5 times for the D24A ozone concentrations, from CTM predictions than the CAMP approach assuming that model performance does not change across space and time.

  11. A hybrid approach to survival model building using integration of clinical and molecular information in censored data.

    PubMed

    Choi, Ickwon; Kattan, Michael W; Wells, Brian J; Yu, Changhong

    2012-01-01

    In medical society, the prognostic models, which use clinicopathologic features and predict prognosis after a certain treatment, have been externally validated and used in practice. In recent years, most research has focused on high dimensional genomic data and small sample sizes. Since clinically similar but molecularly heterogeneous tumors may produce different clinical outcomes, the combination of clinical and genomic information, which may be complementary, is crucial to improve the quality of prognostic predictions. However, there is a lack of an integrating scheme for clinic-genomic models due to the P ≥ N problem, in particular, for a parsimonious model. We propose a methodology to build a reduced yet accurate integrative model using a hybrid approach based on the Cox regression model, which uses several dimension reduction techniques, L₂ penalized maximum likelihood estimation (PMLE), and resampling methods to tackle the problem. The predictive accuracy of the modeling approach is assessed by several metrics via an independent and thorough scheme to compare competing methods. In breast cancer data studies on a metastasis and death event, we show that the proposed methodology can improve prediction accuracy and build a final model with a hybrid signature that is parsimonious when integrating both types of variables.

  12. Predicting the continuum between corridors and barriers to animal movements using Step Selection Functions and Randomized Shortest Paths.

    PubMed

    Panzacchi, Manuela; Van Moorter, Bram; Strand, Olav; Saerens, Marco; Kivimäki, Ilkka; St Clair, Colleen C; Herfindal, Ivar; Boitani, Luigi

    2016-01-01

    The loss, fragmentation and degradation of habitat everywhere on Earth prompts increasing attention to identifying landscape features that support animal movement (corridors) or impede it (barriers). Most algorithms used to predict corridors assume that animals move through preferred habitat either optimally (e.g. least cost path) or as random walkers (e.g. current models), but neither extreme is realistic. We propose that corridors and barriers are two sides of the same coin and that animals experience landscapes as spatiotemporally dynamic corridor-barrier continua connecting (separating) functional areas where individuals fulfil specific ecological processes. Based on this conceptual framework, we propose a novel methodological approach that uses high-resolution individual-based movement data to predict corridor-barrier continua with increased realism. Our approach consists of two innovations. First, we use step selection functions (SSF) to predict friction maps quantifying corridor-barrier continua for tactical steps between consecutive locations. Secondly, we introduce to movement ecology the randomized shortest path algorithm (RSP), which operates on friction maps to predict the corridor-barrier continuum for strategic movements between functional areas. By modulating the parameter θ, which controls the trade-off between exploration and optimal exploitation of the environment, RSP bridges the gap between algorithms assuming optimal movements (when θ approaches infinity, RSP is equivalent to LCP) or random walk (when θ → 0, RSP → current models). Using this approach, we identify migration corridors for GPS-monitored wild reindeer (Rangifer t. tarandus) in Norway. We demonstrate that reindeer movement is best predicted by an intermediate value of θ, indicative of a movement trade-off between optimization and exploration.
Model calibration allows identification of a corridor-barrier continuum that closely fits empirical data and demonstrates that RSP outperforms models that assume either optimality or random walk. The proposed approach models the multiscale cognitive maps by which animals likely navigate real landscapes and generalizes the most common algorithms for identifying corridors. Because suboptimal, but non-random, movement strategies are likely widespread, our approach has the potential to predict more realistic corridor-barrier continua for a wide range of species. © 2015 The Authors. Journal of Animal Ecology © 2015 British Ecological Society.
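
    The role of the RSP trade-off parameter can be illustrated on a toy example. The sketch below is not the authors' implementation: it assumes a small hypothetical directed acyclic graph (so that all source-to-target paths can be enumerated), a uniform reference random walk, and the usual RSP formulation in which a path's probability is proportional to its reference probability times exp(-theta * cost). The graph, its friction values and the function names are illustrative assumptions.

```python
import math

# Hypothetical toy landscape: a small directed acyclic graph, so the set of
# source-to-target paths is finite. Edge values are friction costs.
COSTS = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"D": 1.0},
    "C": {"D": 1.0},
    "D": {},
}


def enumerate_paths(graph, source, target):
    """Yield every path from source to target as a list of nodes."""
    stack = [[source]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == target:
            yield path
            continue
        for nxt in graph[node]:
            stack.append(path + [nxt])


def rsp_path_probabilities(graph, source, target, theta):
    """Boltzmann distribution over paths: P(path) ∝ P_ref(path) * exp(-theta * cost)."""
    weights = {}
    for path in enumerate_paths(graph, source, target):
        cost = 0.0
        ref_prob = 1.0
        for u, v in zip(path, path[1:]):
            cost += graph[u][v]
            ref_prob *= 1.0 / len(graph[u])  # assumed uniform reference random walk
        weights[tuple(path)] = ref_prob * math.exp(-theta * cost)
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}


random_walk = rsp_path_probabilities(COSTS, "A", "D", theta=0.0)
near_optimal = rsp_path_probabilities(COSTS, "A", "D", theta=10.0)
```

    With theta = 0 the two paths keep their reference random-walk probabilities, while a large theta concentrates almost all probability on the least-cost path A-B-D, mirroring the two limits described in the abstract. Real landscapes contain cycles, where path enumeration is not possible and matrix-based formulations are used instead.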

  13. POPULATION EXPOSURES TO PARTICULATE MATTER: A COMPARISON OF EXPOSURE MODEL PREDICTIONS AND MEASUREMENT DATA

    EPA Science Inventory

    The US EPA National Exposure Research Laboratory (NERL) is currently developing an integrated human exposure source-to-dose modeling system (HES2D). This modeling system will incorporate models that use a probabilistic approach to predict population exposures to environmental ...

  14. Using a RIVPACS model to predict expected macrofaunal species richness in Puget Sound

    EPA Science Inventory

    As part of a project to develop regional indicators for Pacific coastal environments using soft-bottom benthic species, we are evaluating a RIVPACS predictive model (River InVertebrate Prediction and Classification System). This approach was originally developed for rivers and s...

  15. Development of a hybrid modeling approach for predicting intensively managed Douglas-fir growth at multiple scales.

    Treesearch

    A. Weiskittel; D. Maguire; R. Monserud

    2007-01-01

    Hybrid models offer the opportunity to improve future growth projections by combining advantages of both empirical and process-based modeling approaches. Hybrid models have been constructed in several regions and their performance relative to a purely empirical approach has varied. A hybrid model was constructed for intensively managed Douglas-fir plantations in the...

  16. Predicting progression of mild cognitive impairment to dementia using neuropsychological data: a supervised learning approach using time windows.

    PubMed

    Pereira, Telma; Lemos, Luís; Cardoso, Sandra; Silva, Dina; Rodrigues, Ana; Santana, Isabel; de Mendonça, Alexandre; Guerreiro, Manuela; Madeira, Sara C

    2017-07-19

    Predicting progression from a stage of Mild Cognitive Impairment (MCI) to dementia is a major pursuit in current research. It is broadly accepted that cognition declines along a continuum between MCI and dementia. As such, cohorts of MCI patients are usually heterogeneous, containing patients at different stages of the neurodegenerative process. This hampers the prognostic task. Nevertheless, when learning prognostic models, most studies use the entire cohort of MCI patients regardless of their disease stages. In this paper, we propose a Time Windows approach to predict conversion to dementia, learning with patients stratified using time windows, thus fine-tuning the prognosis regarding the time to conversion. In the proposed Time Windows approach, we grouped patients based on the clinical information of whether they converted (converter MCI) or remained MCI (stable MCI) within a specific time window. We tested time windows of 2, 3, 4 and 5 years. We developed a prognostic model for each time window using clinical and neuropsychological data and compared this approach with the one commonly used in the literature, where all patients are used to learn the models, referred to as the First Last approach. This makes it possible to move from the traditional question "Will an MCI patient convert to dementia sometime in the future" to the question "Will an MCI patient convert to dementia in a specific time window". The proposed Time Windows approach outperformed the First Last approach. The results showed that we can predict conversion to dementia as early as 5 years before the event with an AUC of 0.88 in the cross-validation set and 0.76 in an independent validation set. Prognostic models using time windows have higher performance when predicting progression from MCI to dementia, when compared to the prognostic approach commonly used in the literature.
Furthermore, the proposed Time Windows approach is more relevant from a clinical point of view, predicting conversion within a temporal interval rather than sometime in the future and allowing clinicians to timely adjust treatments and clinical appointments.
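
    The grouping of patients by time window can be sketched as a labelling rule. This is a minimal sketch under stated assumptions, not the authors' code: the function name and arguments are hypothetical, and the exclusion of patients whose follow-up is shorter than the window (and who did not convert) is an assumption, since the abstract does not say how such cases are handled.

```python
def label_for_window(years_to_conversion, years_of_follow_up, window_years):
    """Return 'converter', 'stable', or None for one patient and one time window.

    years_to_conversion: time from baseline to dementia diagnosis, or None
        if no conversion was observed.
    years_of_follow_up: time from baseline to the last assessment.
    """
    if years_to_conversion is not None and years_to_conversion <= window_years:
        return "converter"  # converted inside the window
    if years_of_follow_up >= window_years:
        return "stable"  # still MCI at the end of the window
    return None  # assumption: too little follow-up to label, so excluded


# One learning set per window, as in the 2, 3, 4 and 5 year windows tested.
patients = [(1.5, 1.5), (4.0, 4.0), (None, 1.0), (None, 6.0)]
labels_by_window = {
    w: [label_for_window(conv, follow, w) for conv, follow in patients]
    for w in (2, 3, 4, 5)
}
```

    Note that a patient who converts after 4 years counts as stable for the 2-year window and as a converter for the 4-year and 5-year windows, which is what lets each window answer a time-specific question.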

  17. Numerical Investigation of Desulfurization Kinetics in Gas-Stirred Ladles by a Quick Modeling Analysis Approach

    NASA Astrophysics Data System (ADS)

    Cao, Qing; Nastac, Laurentiu; Pitts-Baggett, April; Yu, Qiulin

    2018-03-01

    A quick modeling analysis approach for predicting the slag-steel reaction and desulfurization kinetics in argon gas-stirred ladles has been developed in this study. The model consists of two uncoupled components: (i) a computational fluid dynamics (CFD) model for predicting the fluid flow and the characteristics of slag-steel interface, and (ii) a multicomponent reaction kinetics model for calculating the desulfurization evolution. The steel-slag interfacial area and mass transfer coefficients predicted by the CFD simulation are used as the processing data for the reaction model. Since the desulfurization predictions are uncoupled from the CFD simulation, the computational time of this uncoupled predictive approach is decreased by at least 100 times for each case study when compared with the CFD-reaction kinetics fully coupled model. The uncoupled modeling approach was validated by comparing the evolution of steel and slag compositions with the experimentally measured data during ladle metallurgical furnace (LMF) processing at Nucor Steel Tuscaloosa, Inc. Then, the validated approach was applied to investigate the effects of the initial steel and slag compositions, as well as different types of additions during the refining process on the desulfurization efficiency. The results revealed that the sulfur distribution ratio and the desulfurization reaction can be promoted by making Al and CaO additions during the refining process. It was also shown that by increasing the initial Al content in liquid steel, both Al oxidation and desulfurization rates rapidly increase. In addition, it was found that the variation of the initial Si content in steel has no significant influence on the desulfurization rate. Lastly, if the initial CaO content in slag is increased or the initial Al2O3 content is decreased in the fluid-slag compositional range, the desulfurization rate can be improved significantly during the LMF process.

  18. Numerical Investigation of Desulfurization Kinetics in Gas-Stirred Ladles by a Quick Modeling Analysis Approach

    NASA Astrophysics Data System (ADS)

    Cao, Qing; Nastac, Laurentiu; Pitts-Baggett, April; Yu, Qiulin

    2018-06-01

    A quick modeling analysis approach for predicting the slag-steel reaction and desulfurization kinetics in argon gas-stirred ladles has been developed in this study. The model consists of two uncoupled components: (i) a computational fluid dynamics (CFD) model for predicting the fluid flow and the characteristics of slag-steel interface, and (ii) a multicomponent reaction kinetics model for calculating the desulfurization evolution. The steel-slag interfacial area and mass transfer coefficients predicted by the CFD simulation are used as the processing data for the reaction model. Since the desulfurization predictions are uncoupled from the CFD simulation, the computational time of this uncoupled predictive approach is decreased by at least 100 times for each case study when compared with the CFD-reaction kinetics fully coupled model. The uncoupled modeling approach was validated by comparing the evolution of steel and slag compositions with the experimentally measured data during ladle metallurgical furnace (LMF) processing at Nucor Steel Tuscaloosa, Inc. Then, the validated approach was applied to investigate the effects of the initial steel and slag compositions, as well as different types of additions during the refining process on the desulfurization efficiency. The results revealed that the sulfur distribution ratio and the desulfurization reaction can be promoted by making Al and CaO additions during the refining process. It was also shown that by increasing the initial Al content in liquid steel, both Al oxidation and desulfurization rates rapidly increase. In addition, it was found that the variation of the initial Si content in steel has no significant influence on the desulfurization rate. Lastly, if the initial CaO content in slag is increased or the initial Al2O3 content is decreased in the fluid-slag compositional range, the desulfurization rate can be improved significantly during the LMF process.

  19. A Market-Basket Approach to Predict the Acute Aquatic Toxicity of Munitions and Energetic Materials.

    PubMed

    Burgoon, Lyle D

    2016-06-01

    An ongoing challenge in chemical production, including the production of insensitive munitions and energetics, is the ability to make predictions about potential environmental hazards early in the process. To address this challenge, a quantitative structure activity relationship model was developed to predict acute fathead minnow toxicity of insensitive munitions and energetic materials. Computational predictive toxicology models like this one may be used to identify and prioritize environmentally safer materials early in their development. The developed model is based on the Apriori market-basket/frequent itemset mining approach to identify probabilistic prediction rules using chemical atom-pairs and the lethality data for 57 compounds from a fathead minnow acute toxicity assay. Lethality data were discretized into four categories based on the Globally Harmonized System of Classification and Labelling of Chemicals. Apriori identified toxicophores for categories two and three. The model classified 32 of the 57 compounds correctly, with a fivefold cross-validation classification rate of 74 %. A structure-based surrogate approach classified the remaining 25 chemicals correctly at 48 %. This result is unsurprising as these 25 chemicals were fairly unique within the larger set.
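
    The Apriori frequent-itemset idea underlying this model can be sketched in a few lines. The example below is an illustrative sketch, not the published model: the atom-pair names, the category labels and the support threshold are invented for illustration, and each compound is treated as a "basket" holding its atom-pair features plus its toxicity category.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support_count} for itemsets seen in >= min_support baskets."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items]
    frequent = {}
    size = 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join frequent itemsets, then prune any candidate that has an
        # infrequent subset (the Apriori property).
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent


# Hypothetical baskets: atom-pair features plus a toxicity category label.
baskets = [
    {"N-O", "C-N", "cat2"},
    {"N-O", "C-N", "cat2"},
    {"N-O", "C-C", "cat3"},
    {"C-C", "cat3"},
]
supports = frequent_itemsets(baskets, min_support=2)

# Confidence of the rule {N-O, C-N} -> cat2 = support(all three) / support(antecedent).
confidence = (
    supports[frozenset({"N-O", "C-N", "cat2"})] / supports[frozenset({"N-O", "C-N"})]
)
```

    A rule whose antecedent contains only structural features and whose consequent is a toxicity category plays the role of a probabilistic prediction rule; the confidence value is the estimated probability attached to it.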

  20. Modeling and prediction of peptide drift times in ion mobility spectrometry using sequence-based and structure-based approaches.

    PubMed

    Zhang, Yiming; Jin, Quan; Wang, Shuting; Ren, Ren

    2011-05-01

    The mobility behavior of 1481 peptides in ion mobility spectrometry (IMS), which are generated by protease digestion of the Drosophila melanogaster proteome, is modeled and predicted based on two different types of characterization methods, i.e. a sequence-based approach and a structure-based approach. In this procedure, the sequence-based approach considers both the amino acid composition of a peptide and the local environment profile of each amino acid in the peptide; the structure-based approach is performed with the CODESSA protocol, which regards a peptide as a common organic compound and generates more than 200 statistically significant variables to characterize the whole structure profile of a peptide molecule. Subsequently, the nonlinear support vector machine (SVM) and Gaussian process (GP) as well as linear partial least squares (PLS) regression are employed to correlate the structural parameters of the characterizations with the IMS drift times of these peptides. The obtained quantitative structure-spectrum relationship (QSSR) models are evaluated rigorously and investigated systematically via both one-deep and two-deep cross-validations as well as the rigorous Monte Carlo cross-validation (MCCV). We also give a comprehensive comparison of the statistics arising from the different combinations of variable types with modeling methods and find that the sequence-based approach can give QSSR models with better fitting ability and predictive power but worse interpretability than the structure-based approach. In addition, because QSSR modeling using the sequence-based approach does not require preparing minimized peptide structures before the modeling, it is considerably more efficient than modeling using the structure-based approach. Copyright © 2011 Elsevier Ltd. All rights reserved.
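
    Monte Carlo cross-validation repeatedly draws random train/test partitions rather than using fixed folds. The sketch below shows only the splitting step, under assumed settings: the function name, the number of repeats, the test fraction and the seed are illustrative and are not taken from the study.

```python
import random


def monte_carlo_cv_splits(n_samples, n_repeats, test_fraction, seed=0):
    """Yield (train_indices, test_indices) for repeated random partitions."""
    rng = random.Random(seed)
    n_test = max(1, int(round(n_samples * test_fraction)))
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # The first n_test shuffled indices form the test set, the rest train.
        yield sorted(shuffled[n_test:]), sorted(shuffled[:n_test])


splits = list(monte_carlo_cv_splits(n_samples=10, n_repeats=5, test_fraction=0.3))
```

    A model would be refitted on each training set and scored on the matching test set, and the scores averaged over repeats. Unlike k-fold cross-validation, a given sample may appear in several test sets or in none.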

  1. The First Attempt at Non-Linear in Silico Prediction of Sampling Rates for Polar Organic Chemical Integrative Samplers (POCIS)

    PubMed Central

    2016-01-01

    Modeling and prediction of polar organic chemical integrative sampler (POCIS) sampling rates (Rs) for 73 compounds using artificial neural networks (ANNs) is presented for the first time. Two models were constructed: the first was developed ab initio using a genetic algorithm (GSD-model) to shortlist 24 descriptors covering constitutional, topological, geometrical and physicochemical properties and the second model was adapted for Rs prediction from a previous chromatographic retention model (RTD-model). Mechanistic evaluation of descriptors showed that models did not require comprehensive a priori information to predict Rs. Average predicted errors for the verification and blind test sets were 0.03 ± 0.02 L d–1 (RTD-model) and 0.03 ± 0.03 L d–1 (GSD-model) relative to experimentally determined Rs. Prediction variability in replicated models was the same or less than for measured Rs. Networks were externally validated using a measured Rs data set of six benzodiazepines. The RTD-model performed best in comparison to the GSD-model for these compounds (average absolute errors of 0.0145 ± 0.008 L d–1 and 0.0437 ± 0.02 L d–1, respectively). Improvements to generalizability of modeling approaches will be reliant on the need for standardized guidelines for Rs measurement. The use of in silico tools for Rs determination represents a more economical approach than laboratory calibrations. PMID:27363449

  2. Application of various FLD modelling approaches

    NASA Astrophysics Data System (ADS)

    Banabic, D.; Aretz, H.; Paraianu, L.; Jurco, P.

    2005-07-01

    This paper focuses on a comparison between different modelling approaches to predict the forming limit diagram (FLD) for sheet metal forming under a linear strain path using the recently introduced orthotropic yield criterion BBC2003 (Banabic D et al 2005 Int. J. Plasticity 21 493-512). The FLD models considered here are a finite element based approach, the well known Marciniak-Kuczynski model, the modified maximum force criterion according to Hora et al (1996 Proc. Numisheet'96 Conf. (Dearborn/Michigan) pp 252-6), Swift's diffuse (Swift H W 1952 J. Mech. Phys. Solids 1 1-18) and Hill's classical localized necking approach (Hill R 1952 J. Mech. Phys. Solids 1 19-30). The FLD of an AA5182-O aluminium sheet alloy has been determined experimentally in order to quantify the predictive capabilities of the models mentioned above.

  3. An integrated approach to reconstructing genome-scale transcriptional regulatory networks

    DOE PAGES

    Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.; ...

    2015-02-27

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall were also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis.
Our results highlight the advantages of integrating comparative genomics of closely related organisms with gene expression data to assemble large-scale TRN models with high-quality predictions.

  4. A computational approach to compare regression modelling strategies in prediction research.

    PubMed

    Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H

    2016-08-25

    It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
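
    The Brier score used above to assess model performance is the mean squared difference between predicted probabilities and observed binary outcomes. A minimal sketch follows; the function name and the example values are illustrative and not taken from the study.

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(predicted_probs) != len(outcomes):
        raise ValueError("predicted_probs and outcomes must have the same length")
    if not outcomes:
        raise ValueError("at least one observation is required")
    return sum((p - y) ** 2 for p, y in zip(predicted_probs, outcomes)) / len(outcomes)


perfect = brier_score([1.0, 0.0], [1, 0])        # every prediction is exactly right
uninformative = brier_score([0.5, 0.5], [1, 0])  # always predicting 0.5
```

    Lower values are better: a perfect model scores 0, and a model that always predicts 0.5 scores 0.25 regardless of the outcomes.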

  5. Predicting multi-level drug response with gene expression profile in multiple myeloma using hierarchical ordinal regression.

    PubMed

    Zhang, Xinyan; Li, Bingzong; Han, Huiying; Song, Sha; Xu, Hongxia; Hong, Yating; Yi, Nengjun; Zhuang, Wenzhuo

    2018-05-10

    Multiple myeloma (MM), like other cancers, is caused by the accumulation of genetic abnormalities. Heterogeneity exists in the patients' response to treatments, for example, bortezomib. This calls for efforts to identify biomarkers from numerous molecular features and build predictive models for identifying patients that can benefit from a certain treatment scheme. However, previous studies treated the multi-level ordinal drug response as a binary response where only responsive and non-responsive groups are considered. It is desirable to directly analyze the multi-level drug response, rather than collapsing the response into two groups. In this study, we present a novel method to identify significantly associated biomarkers and then develop an ordinal genomic classifier using the hierarchical ordinal logistic model. The proposed hierarchical ordinal logistic model employs the heavy-tailed Cauchy prior on the coefficients and is fitted by an efficient quasi-Newton algorithm. We apply our hierarchical ordinal regression approach to analyze two publicly available datasets for MM with five-level drug response and numerous gene expression measures. Our results show that our method is able to identify genes associated with the multi-level drug response and to generate powerful predictive models for predicting the multi-level response. The proposed method allows us to jointly fit numerous correlated predictors and thus build efficient models for predicting the multi-level drug response. The predictive model for the multi-level drug response can be more informative than the previous approaches. Thus, the proposed approach provides a powerful tool for predicting multi-level drug response and has an important impact on cancer studies.

  6. Optimal population prediction of sandhill crane recruitment based on climate-mediated habitat limitations.

    PubMed

    Gerber, Brian D; Kendall, William L; Hooten, Mevin B; Dubovsky, James A; Drewien, Roderick C

    2015-09-01

    1. Prediction is fundamental to scientific enquiry and application; however, ecologists tend to favour explanatory modelling. We discuss a predictive modelling framework to evaluate ecological hypotheses and to explore novel/unobserved environmental scenarios to assist conservation and management decision-makers. We apply this framework to develop an optimal predictive model for juvenile (<1 year old) sandhill crane Grus canadensis recruitment of the Rocky Mountain Population (RMP). We consider spatial climate predictors motivated by hypotheses of how drought across multiple time-scales and spring/summer weather affects recruitment. 2. Our predictive modelling framework focuses on developing a single model that includes all relevant predictor variables, regardless of collinearity. This model is then optimized for prediction by controlling model complexity using a data-driven approach that marginalizes or removes irrelevant predictors from the model. Specifically, we highlight two approaches of statistical regularization, Bayesian least absolute shrinkage and selection operator (LASSO) and ridge regression. 3. Our optimal predictive Bayesian LASSO and ridge regression models were similar and on average 37% superior in predictive accuracy to an explanatory modelling approach. Our predictive models confirmed a priori hypotheses that drought and cold summers negatively affect juvenile recruitment in the RMP. The effects of long-term drought can be alleviated by short-term wet spring-summer months; however, the alleviation of long-term drought has a much greater positive effect on juvenile recruitment. The number of freezing days and snowpack during the summer months can also negatively affect recruitment, while spring snowpack has a positive effect. 4. Breeding habitat, mediated through climate, is a limiting factor on population growth of sandhill cranes in the RMP, which could become more limiting with a changing climate (i.e. increased drought). 
These effects are likely not unique to cranes. The alteration of hydrological patterns and water levels by drought may impact many migratory, wetland nesting birds in the Rocky Mountains and beyond. 5. Generalizable predictive models (trained by out-of-sample fit and based on ecological hypotheses) are needed by conservation and management decision-makers. Statistical regularization improves predictions and provides a general framework for fitting models with a large number of predictors, even those with collinearity, to simultaneously identify an optimal predictive model while conducting rigorous Bayesian model selection. Our framework is important for understanding population dynamics under a changing climate and has direct applications for making harvest and habitat management decisions. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.

  7. Bayesian averaging over Decision Tree models for trauma severity scoring.

    PubMed

    Schetinin, V; Jakaite, L; Krzanowski, W

    2018-01-01

    Health care practitioners analyse possible risks of misleading decisions and need to estimate and quantify uncertainty in predictions. We have examined the "gold" standard of screening a patient's conditions for predicting survival probability, based on logistic regression modelling, which is used in trauma care for clinical purposes and quality audit. This methodology is based on theoretical assumptions about data and uncertainties. Models induced within such an approach have exposed a number of problems, providing unexplained fluctuation of predicted survival and low accuracy of estimating uncertainty intervals within which predictions are made. The Bayesian method, which in theory is capable of providing accurate predictions and uncertainty estimates, has been adopted in our study using Decision Tree models. Our approach has been tested on a large set of patients registered in the US National Trauma Data Bank and has outperformed the standard method in terms of prediction accuracy, thereby providing practitioners with accurate estimates of the predictive posterior densities of interest that are required for making risk-aware decisions. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Ensemble predictive model for more accurate soil organic carbon spectroscopic estimation

    NASA Astrophysics Data System (ADS)

    Vašát, Radim; Kodešová, Radka; Borůvka, Luboš

    2017-07-01

    A myriad of signal pre-processing strategies and multivariate calibration techniques have been explored over the last few decades in an attempt to improve the spectroscopic prediction of soil organic carbon (SOC). Coming up with a novel, more powerful, and more accurate predictive approach that beats the existing ones has therefore become a challenging task. However, there may be a way: combining several individual predictions into a single final one (according to ensemble learning theory). As this approach performs best when combining predictive algorithms that differ in nature and are calibrated with structurally different predictor variables, we tested predictors of two different kinds: 1) reflectance values (or transforms) at each wavelength and 2) absorption feature parameters. We then applied four different calibration techniques, two for each type of predictor: a) partial least squares regression and support vector machines for type 1, and b) multiple linear regression and random forest for type 2. The weights assigned to individual predictions within the ensemble model (constructed as a weighted average) were determined by an automated procedure that ensured that the best solution among all possible ones was selected. The approach was tested on soil samples taken from the surface horizon of four sites differing in the prevailing soil units. Employing the ensemble predictive model improved the prediction accuracy of SOC at all four sites. The coefficient of determination in cross-validation (R2cv) increased from 0.849, 0.611, 0.811 and 0.644 (the best individual predictions) to 0.864, 0.650, 0.824 and 0.698 for Site 1, 2, 3 and 4, respectively. Generally, the ensemble model reduced the maximal deviations of predicted vs. observed values relative to the individual predictions, and thus the correlation cloud became thinner, as desired.
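
    A weighted-average ensemble with automatically selected weights can be sketched as an exhaustive search over a weight grid. The abstract does not describe the authors' procedure in detail, so the grid search, the RMSE criterion, the function names and the toy numbers below are all assumptions made for illustration.

```python
import math
from itertools import product


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


def best_ensemble_weights(predictions, observed, steps=10):
    """Search weight vectors on a grid (multiples of 1/steps that sum to 1)."""
    n_models = len(predictions)
    best_weights, best_error = None, float("inf")
    for combo in product(range(steps + 1), repeat=n_models):
        if sum(combo) != steps:
            continue  # keep only weight vectors that sum to 1
        weights = [c / steps for c in combo]
        blended = [
            sum(w * model[i] for w, model in zip(weights, predictions))
            for i in range(len(observed))
        ]
        error = rmse(blended, observed)
        if error < best_error:
            best_weights, best_error = weights, error
    return best_weights, best_error


# Hypothetical SOC values: one model biased upward, one biased downward.
observed = [1.0, 2.0, 3.0]
model_a = [1.2, 2.2, 3.2]
model_b = [0.8, 1.8, 2.8]
weights, error = best_ensemble_weights([model_a, model_b], observed)
```

    In this toy case the opposite biases cancel at equal weights. In practice the weights would be chosen on cross-validated predictions rather than on the calibration fit, and the grid grows quickly with the number of models, so a coarser step or a constrained optimizer may be preferable.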

  9. NCEP/NLDAS Drought Monitoring and Prediction

    NASA Astrophysics Data System (ADS)

    Xia, Y.; Ek, M.; Wood, E.; Luo, L.; Sheffield, J.; Lettenmaier, D.; Livneh, B.; Cosgrove, B.; Mocko, D.; Meng, J.; Wei, H.; Restrepo, P.; Schaake, J.; Mo, K.

    2009-05-01

    The NCEP Environmental Modeling Center (EMC) collaborated with its CPPA (Climate Prediction Program of the Americas) partners to develop a North American Land Data Assimilation System (NLDAS, http://www.emc.ncep.noaa.gov/mmb/nldas) to monitor and predict drought over the Continental United States (CONUS). The realtime NLDAS drought monitor, executed daily at NCEP/EMC, includes daily, weekly and monthly anomalies and percentiles of six fields (soil moisture, snow water equivalent, total runoff, streamflow, evaporation, precipitation) output from four land surface models (Noah, Mosaic, SAC, and VIC) on a common 1/8th degree grid using common hourly land surface forcing. The non-precipitation surface forcing is derived from NCEP's retrospective and realtime North American Regional Reanalysis System (NARR). The precipitation forcing is anchored to a daily gauge-only precipitation analysis over CONUS that applies a Parameter-elevation Regressions on Independent Slopes Model (PRISM) correction. This daily precipitation analysis is then temporally disaggregated to hourly precipitation amounts using radar and satellite precipitation. The NARR-based surface downward solar radiation is bias-corrected using seven years (1997-2004) of GOES satellite-derived solar radiation retrievals. The uncoupled ensemble seasonal drought prediction utilizes the following three independent approaches for generating downscaled ensemble seasonal forecasts of surface forcing: (1) Ensemble Streamflow Prediction, (2) CPC Official Seasonal Climate Outlook, and (3) NCEP CFS ensemble dynamical model prediction. For each of these three approaches, twenty ensemble members of forcing realizations are generated using a Bayesian merging algorithm developed by Princeton University. The three forcing methods are then used to drive the VIC model in seasonal prediction mode over thirteen large river basins that together span the CONUS domain.
One to nine month ensemble seasonal prediction products such as air temperature, precipitation, soil moisture, snowpack, total runoff, evaporation and streamflow are derived for each forcing approach. The anomalies and percentiles of the predicted products for each approach may be used for CONUS drought prediction. This system is executed at the beginning of each month and distributes its products by the 10th of each month. The prediction products are evaluated using corresponding monitoring products for the VIC model and are compared with the prediction products from other research groups (e.g., University of Washington at Seattle, NASA Goddard) in the CONUS.
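
    The anomaly and percentile products mentioned in this record can be illustrated with a minimal sketch. It assumes the simplest possible definitions (anomaly as departure from the climatological mean, percentile as the share of climatological values at or below the current value); the record does not state how NLDAS actually computes them, and the function names and sample values are hypothetical.

```python
def anomaly(value, climatology):
    """Departure of a value from the climatological mean (assumed definition)."""
    return value - sum(climatology) / len(climatology)


def percentile_rank(value, climatology):
    """Percentage of climatological values at or below the given value
    (an empirical percentile; an assumption, not the NLDAS procedure)."""
    at_or_below = sum(1 for c in climatology if c <= value)
    return 100.0 * at_or_below / len(climatology)


# Hypothetical soil moisture values for the same calendar day in five past years.
climatology = [10.0, 20.0, 30.0, 40.0, 50.0]
today = 20.0

today_anomaly = anomaly(today, climatology)
today_percentile = percentile_rank(today, climatology)
```

    Under these assumed definitions, a low percentile together with a negative anomaly would flag drier-than-usual conditions at a grid cell.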

  10. What should be considered if you decide to build your own mathematical model for predicting the development of bacterial resistance? Recommendations based on a systematic review of the literature

    PubMed Central

    Arepeva, Maria; Kolbin, Alexey; Kurylev, Alexey; Balykina, Julia; Sidorenko, Sergey

    2015-01-01

    Acquired bacterial resistance is one of the causes of mortality and morbidity from infectious diseases. Mathematical modeling allows us to predict the spread of resistance and to some extent to control its dynamics. The purpose of this review was to examine existing mathematical models in order to understand the pros and cons of currently used approaches and to build our own model. During the analysis, seven articles on mathematical approaches to studying resistance that satisfied the inclusion/exclusion criteria were selected. All models were classified according to the approach used to study resistance in the presence of an antibiotic and were analyzed in terms of our research. Some models require modifications due to the specifics of the research. The plan for further work on model building is as follows: modify some models, according to our research, check all obtained models against our data, and select the optimal model or models with the best quality of prediction. After that we would be able to build a model for the development of resistance using the obtained results. PMID:25972847

  11. Multivariate Models for Prediction of Human Skin Sensitization ...

    EPA Pesticide Factsheets

    One of the Interagency Coordinating Committee on the Validation of Alternative Methods' (ICCVAM) top priorities is the development and evaluation of non-animal approaches to identify potential skin sensitizers. The complexity of biological events necessary to produce skin sensitization suggests that no single alternative method will replace the currently accepted animal tests. ICCVAM is evaluating an integrated approach to testing and assessment based on the adverse outcome pathway for skin sensitization that uses machine learning approaches to predict human skin sensitization hazard. We combined data from three in chemico or in vitro assays - the direct peptide reactivity assay (DPRA), human cell line activation test (h-CLAT) and KeratinoSens™ assay - six physicochemical properties and an in silico read-across prediction of skin sensitization hazard into 12 variable groups. The variable groups were evaluated using two machine learning approaches, logistic regression and support vector machine, to predict human skin sensitization hazard. Models were trained on 72 substances and tested on an external set of 24 substances. The six models (three logistic regression and three support vector machine) with the highest accuracy (92%) used: (1) DPRA, h-CLAT and read-across; (2) DPRA, h-CLAT, read-across and KeratinoSens; or (3) DPRA, h-CLAT, read-across, KeratinoSens and log P. The models performed better at predicting human skin sensitization hazard than the murine

  12. Effect of heteroscedasticity treatment in residual error models on model calibration and prediction uncertainty estimation

    NASA Astrophysics Data System (ADS)

    Sun, Ruochen; Yuan, Huiling; Liu, Xiaoli

    2017-11-01

    The heteroscedasticity treatment in residual error models directly impacts the model calibration and prediction uncertainty estimation. This study compares three methods to deal with the heteroscedasticity, including the explicit linear modeling (LM) method and nonlinear modeling (NL) method using hyperbolic tangent function, as well as the implicit Box-Cox transformation (BC). Then a combined approach (CA) combining the advantages of both LM and BC methods has been proposed. In conjunction with the first order autoregressive model and the skew exponential power (SEP) distribution, four residual error models are generated, namely LM-SEP, NL-SEP, BC-SEP and CA-SEP, and their corresponding likelihood functions are applied to the Variable Infiltration Capacity (VIC) hydrologic model over the Huaihe River basin, China. Results show that the LM-SEP yields the poorest streamflow predictions with the widest uncertainty band and unrealistic negative flows. The NL and BC methods can better deal with the heteroscedasticity and hence their corresponding predictive performances are improved, yet the negative flows cannot be avoided. The CA-SEP produces the most accurate predictions with the highest reliability and effectively avoids the negative flows, because the CA approach is capable of addressing the complicated heteroscedasticity over the study basin.
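
    To make the implicit Box-Cox treatment in this record concrete, here is a minimal sketch of the transformation applied to observed and simulated flows before residuals are taken. The flows and the value of lam are hypothetical, and the sketch covers only the standard Box-Cox formula, not the LM, NL or CA methods or the SEP likelihood used in the study.

```python
import math


def box_cox(y, lam):
    """Standard Box-Cox transform of a strictly positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


# Hypothetical observed and simulated streamflow for low, medium and high flow.
observed = [12.0, 150.0, 900.0]
simulated = [10.0, 120.0, 1000.0]
lam = 0.3  # assumed transformation parameter, not taken from the study

raw_residuals = [o - s for o, s in zip(observed, simulated)]
bc_residuals = [box_cox(o, lam) - box_cox(s, lam)
                for o, s in zip(observed, simulated)]
```

    In this toy data the raw residuals grow with flow magnitude, while the residuals in transformed space are of similar size, which is the sense in which the transformation handles heteroscedasticity implicitly.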

  13. An iterative fullwave simulation approach to multiple scattering in media with randomly distributed microbubbles

    NASA Astrophysics Data System (ADS)

    Joshi, Aditya; Lindsey, Brooks D.; Dayton, Paul A.; Pinton, Gianmarco; Muller, Marie

    2017-05-01

    Ultrasound contrast agents (UCA), such as microbubbles, enhance the scattering properties of blood, which is otherwise hypoechoic. The multiple scattering interactions of the acoustic field with UCA are poorly understood due to the complexity of the multiple scattering theories and the nonlinear microbubble response. The majority of bubble models describe the behavior of UCA as single, isolated microbubbles suspended in infinite medium. Multiple scattering models such as the independent scattering approximation can approximate phase velocity and attenuation for low scatterer volume fractions. However, all current models and simulation approaches only describe multiple scattering and nonlinear bubble dynamics separately. Here we present an approach that combines two existing models: (1) a full-wave model that describes nonlinear propagation and scattering interactions in a heterogeneous attenuating medium and (2) a Paul-Sarkar model that describes the nonlinear interactions between an acoustic field and microbubbles. These two models were solved numerically and combined with an iterative approach. The convergence of this combined model was explored in silico for 0.5 × 10⁶ microbubbles ml⁻¹, 1% and 2% bubble concentration by volume. The backscattering predicted by our modeling approach was verified experimentally with water tank measurements performed with a 128-element linear array transducer. An excellent agreement in terms of the fundamental and harmonic acoustic fields is shown. Additionally, our model correctly predicts the phase velocity and attenuation measured using through transmission and predicted by the independent scattering approximation.

  14. Multiphase, multicomponent phase behavior prediction

    NASA Astrophysics Data System (ADS)

    Dadmohammadi, Younas

    Accurate prediction of phase behavior of fluid mixtures in the chemical industry is essential for designing and operating a multitude of processes. Reliable generalized predictions of phase equilibrium properties, such as pressure, temperature, and phase compositions offer an attractive alternative to costly and time consuming experimental measurements. The main purpose of this work was to assess the efficacy of recently generalized activity coefficient models based on binary experimental data to (a) predict binary and ternary vapor-liquid equilibrium systems, and (b) characterize liquid-liquid equilibrium systems. These studies were completed using a diverse binary VLE database consisting of 916 binary and 86 ternary systems involving 140 compounds belonging to 31 chemical classes. Specifically the following tasks were undertaken: First, a comprehensive assessment of the two common approaches (gamma-phi (gamma-ϕ) and phi-phi (ϕ-ϕ)) used for determining the phase behavior of vapor-liquid equilibrium systems is presented. Both the representation and predictive capabilities of these two approaches were examined, as delineated from internal and external consistency tests of 916 binary systems. For this purpose, the universal quasi-chemical (UNIQUAC) model and the Peng-Robinson (PR) equation of state (EOS) were used in this assessment. Second, the efficacy of the recently developed generalized UNIQUAC and nonrandom two-liquid (NRTL) models for predicting multicomponent VLE systems was investigated. Third, the abilities of the recently modified NRTL models (mNRTL2 and mNRTL1) to characterize liquid-liquid equilibria (LLE) phase conditions and attributes, including phase stability, miscibility, and consolute point coordinates, were assessed. The results of this work indicate that the ϕ-ϕ approach represents the binary VLE systems considered within three times the error of the gamma-ϕ approach. A similar trend was observed for the generalized model predictions using quantitative structure-property parameter generalizations (QSPR). For ternary systems, where all three constituent binary systems were available, the NRTL-QSPR, UNIQUAC-QSPR, and UNIFAC-6 models produce comparable accuracy. For systems where at least one constituent binary is missing, the UNIFAC-6 model produces larger errors than the QSPR generalized models. In general, the LLE characterization results indicate the accuracy of the modified models in reproducing the findings of the original NRTL model.

  15. Cost Models for MMC Manufacturing Processes

    NASA Technical Reports Server (NTRS)

    Elzey, Dana M.; Wadley, Haydn N. G.

    1996-01-01

    Processes for the manufacture of advanced metal matrix composites are rapidly approaching maturity in the research laboratory and there is growing interest in their transition to industrial production. However, research conducted to date has almost exclusively focused on overcoming the technical barriers to producing high-quality material, and little attention has been given to the economic feasibility of these laboratory approaches and process cost issues. A quantitative cost modeling (QCM) approach was developed to address these issues. QCM are cost analysis tools based on predictive process models relating process conditions to the attributes of the final product. An important attribute of the QCM approach is the ability to predict the sensitivity of material production costs to product quality and to quantitatively explore trade-offs between cost and quality. Applications of the cost models allow more efficient direction of future MMC process technology development and a more accurate assessment of MMC market potential. Cost models were developed for two state-of-the-art metal matrix composite (MMC) manufacturing processes: tape casting and plasma spray deposition. Quality and cost models are presented for both processes and the resulting predicted quality-cost curves are presented and discussed.

  16. Model selection and assessment for multi-species occupancy models

    USGS Publications Warehouse

    Broms, Kristin M.; Hooten, Mevin B.; Fitzpatrick, Ryan M.

    2016-01-01

    While multi-species occupancy models (MSOMs) are emerging as a popular method for analyzing biodiversity data, formal checking and validation approaches for this class of models have lagged behind. Concurrent with the rise in application of MSOMs among ecologists, a quiet regime shift is occurring in Bayesian statistics where predictive model comparison approaches are experiencing a resurgence. Unlike single-species occupancy models that use integrated likelihoods, MSOMs are usually couched in a Bayesian framework and contain multiple levels. Standard model checking and selection methods are often unreliable in this setting and there is only limited guidance in the ecological literature for this class of models. We examined several different contemporary Bayesian hierarchical approaches for checking and validating MSOMs and applied these methods to a freshwater aquatic study system in Colorado, USA, to better understand the diversity and distributions of plains fishes. Our findings indicated distinct differences among model selection approaches, with cross-validation techniques performing the best in terms of prediction.

  17. Confronting uncertainty in flood damage predictions

    NASA Astrophysics Data System (ADS)

    Schröter, Kai; Kreibich, Heidi; Vogel, Kristin; Merz, Bruno

    2015-04-01

    Reliable flood damage models are a prerequisite for the practical usefulness of the model results. Oftentimes, traditional uni-variate damage models as for instance depth-damage curves fail to reproduce the variability of observed flood damage. Innovative multi-variate probabilistic modelling approaches are promising to capture and quantify the uncertainty involved and thus to improve the basis for decision making. In this study we compare the predictive capability of two probabilistic modelling approaches, namely Bagging Decision Trees and Bayesian Networks. For model evaluation we use empirical damage data which are available from computer aided telephone interviews that were respectively compiled after the floods in 2002, 2005 and 2006, in the Elbe and Danube catchments in Germany. We carry out a split sample test by sub-setting the damage records. One sub-set is used to derive the models and the remaining records are used to evaluate the predictive performance of the model. Further we stratify the sample according to catchments which allows studying model performance in a spatial transfer context. Flood damage estimation is carried out on the scale of the individual buildings in terms of relative damage. The predictive performance of the models is assessed in terms of systematic deviations (mean bias), precision (mean absolute error) as well as in terms of reliability which is represented by the proportion of the number of observations that fall within the 95-quantile and 5-quantile predictive interval. The reliability of the probabilistic predictions within validation runs decreases only slightly and achieves a very good coverage of observations within the predictive interval. Probabilistic models provide quantitative information about prediction uncertainty which is crucial to assess the reliability of model predictions and improves the usefulness of model results.
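
    The three evaluation measures named in this record (mean bias, mean absolute error, and the share of observations inside the 5%-95% predictive interval) can be sketched as follows. The relative damage values and quantiles are hypothetical, and the functions are a plain reading of those measures rather than the authors' code.

```python
def mean_bias(predicted, observed):
    """Average signed error; negative values indicate underestimation."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)


def mean_absolute_error(predicted, observed):
    """Average magnitude of the error, ignoring its sign."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def interval_coverage(lower, upper, observed):
    """Fraction of observations that fall inside [lower, upper]."""
    hits = sum(1 for lo, hi, o in zip(lower, upper, observed) if lo <= o <= hi)
    return hits / len(observed)


# Hypothetical relative damage (0..1) for four buildings in the validation set.
observed = [0.10, 0.25, 0.05, 0.40]
predicted = [0.12, 0.20, 0.10, 0.30]   # e.g. the median of the predictive distribution
q05 = [0.02, 0.10, 0.01, 0.15]         # assumed 5% predictive quantiles
q95 = [0.30, 0.45, 0.04, 0.60]         # assumed 95% predictive quantiles

bias = mean_bias(predicted, observed)
mae = mean_absolute_error(predicted, observed)
coverage = interval_coverage(q05, q95, observed)
```

    For a well-calibrated 5%-95% interval, coverage should be close to 0.90 on a large validation sample; a toy set of four records can only illustrate the calculation.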

  18. Matrix approach to uncertainty assessment and reduction for modeling terrestrial carbon cycle

    NASA Astrophysics Data System (ADS)

    Luo, Y.; Xia, J.; Ahlström, A.; Zhou, S.; Huang, Y.; Shi, Z.; Wang, Y.; Du, Z.; Lu, X.

    2017-12-01

    Terrestrial ecosystems absorb approximately 30% of the anthropogenic carbon dioxide emissions. This estimate has been deduced indirectly: combining analyses of atmospheric carbon dioxide concentrations with ocean observations to infer the net terrestrial carbon flux. In contrast, when knowledge about the terrestrial carbon cycle is integrated into different terrestrial carbon models they make widely different predictions. To improve the terrestrial carbon models, we have recently developed a matrix approach to uncertainty assessment and reduction. Specifically, the terrestrial carbon cycle has been commonly represented by a series of carbon balance equations to track carbon influxes into and effluxes out of individual pools in earth system models. This representation matches our understanding of carbon cycle processes well and can be reorganized into one matrix equation without changing any modeled carbon cycle processes and mechanisms. We have developed matrix equations of several global land C cycle models, including CLM3.5, 4.0 and 4.5, CABLE, LPJ-GUESS, and ORCHIDEE. Indeed, the matrix equation is generic and can be applied to other land carbon models. This matrix approach offers a suite of new diagnostic tools, such as the 3-dimensional (3-D) parameter space, traceability analysis, and variance decomposition, for uncertainty analysis. For example, predictions of carbon dynamics with complex land models can be placed in a 3-D parameter space (carbon input, residence time, and storage potential) as a common metric to measure how much model predictions are different. The latter can be traced to its source components by decomposing model predictions to a hierarchy of traceable components. Then, variance decomposition can help attribute the spread in predictions among multiple models to precisely identify sources of uncertainty. The highly uncertain components can be constrained by data as the matrix equation makes data assimilation computationally possible. 
We will illustrate various applications of this matrix approach to uncertainty assessment and reduction for terrestrial carbon cycle models.

  19. A strategy to establish Food Safety Model Repositories.

    PubMed

    Plaza-Rodríguez, C; Thoens, C; Falenski, A; Weiser, A A; Appel, B; Kaesbohrer, A; Filter, M

    2015-07-02

    Transferring the knowledge of predictive microbiology into real world food manufacturing applications is still a major challenge for the whole food safety modelling community. To facilitate this process, a strategy for creating open, community driven and web-based predictive microbial model repositories is proposed. These collaborative model resources could significantly improve the transfer of knowledge from research into commercial and governmental applications and also increase efficiency, transparency and usability of predictive models. To demonstrate the feasibility, predictive models of Salmonella in beef previously published in the scientific literature were re-implemented using an open source software tool called PMM-Lab. The models were made publicly available in a Food Safety Model Repository within the OpenML for Predictive Modelling in Food community project. Three different approaches were used to create new models in the model repositories: (1) all information relevant for model re-implementation is available in a scientific publication, (2) model parameters can be imported from tabular parameter collections and (3) models have to be generated from experimental data or primary model parameters. All three approaches were demonstrated in the paper. The sample Food Safety Model Repository is available via: http://sourceforge.net/projects/microbialmodelingexchange/files/models and the PMM-Lab software can be downloaded from http://sourceforge.net/projects/pmmlab/. This work also illustrates that a standardized information exchange format for predictive microbial models, as the key component of this strategy, could be established by adoption of resources from the Systems Biology domain. Copyright © 2015. Published by Elsevier B.V.

  20. Stochastic Residual-Error Analysis For Estimating Hydrologic Model Predictive Uncertainty

    EPA Science Inventory

    A hybrid time series-nonparametric sampling approach, referred to herein as semiparametric, is presented for the estimation of model predictive uncertainty. The methodology is a two-step procedure whereby a distributed hydrologic model is first calibrated, then followed by brute ...

  1. The Rangeland Hydrology and Erosion Model: A dynamic approach for predicting soil loss on rangelands

    USDA-ARS's Scientific Manuscript database

    In this study we present the improved Rangeland Hydrology and Erosion Model (RHEM V2.3), a process-based erosion prediction tool specific for rangeland application. The article provides the mathematical formulation of the model and parameter estimation equations. Model performance is assessed agains...

  2. Predicting Energy Performance of a Net-Zero Energy Building: A Statistical Approach

    PubMed Central

    Kneifel, Joshua; Webb, David

    2016-01-01

    Performance-based building requirements have become more prevalent because it gives freedom in building design while still maintaining or exceeding the energy performance required by prescriptive-based requirements. In order to determine if building designs reach target energy efficiency improvements, it is necessary to estimate the energy performance of a building using predictive models and different weather conditions. Physics-based whole building energy simulation modeling is the most common approach. However, these physics-based models include underlying assumptions and require significant amounts of information in order to specify the input parameter values. An alternative approach to test the performance of a building is to develop a statistically derived predictive regression model using post-occupancy data that can accurately predict energy consumption and production based on a few common weather-based factors, thus requiring less information than simulation models. A regression model based on measured data should be able to predict energy performance of a building for a given day as long as the weather conditions are similar to those during the data collection time frame. This article uses data from the National Institute of Standards and Technology (NIST) Net-Zero Energy Residential Test Facility (NZERTF) to develop and validate a regression model to predict the energy performance of the NZERTF using two weather variables aggregated to the daily level, applies the model to estimate the energy performance of hypothetical NZERTFs located in different cities in the Mixed-Humid climate zone, and compares these estimates to the results from already existing EnergyPlus whole building energy simulations. This regression model exhibits agreement with EnergyPlus predictive trends in energy production and net consumption, but differs greatly in energy consumption. 
The model can be used as a framework for alternative and more complex models based on the experimental data collected from the NZERTF. PMID:27956756

  3. Predicting Energy Performance of a Net-Zero Energy Building: A Statistical Approach.

    PubMed

    Kneifel, Joshua; Webb, David

    2016-09-01

    Performance-based building requirements have become more prevalent because it gives freedom in building design while still maintaining or exceeding the energy performance required by prescriptive-based requirements. In order to determine if building designs reach target energy efficiency improvements, it is necessary to estimate the energy performance of a building using predictive models and different weather conditions. Physics-based whole building energy simulation modeling is the most common approach. However, these physics-based models include underlying assumptions and require significant amounts of information in order to specify the input parameter values. An alternative approach to test the performance of a building is to develop a statistically derived predictive regression model using post-occupancy data that can accurately predict energy consumption and production based on a few common weather-based factors, thus requiring less information than simulation models. A regression model based on measured data should be able to predict energy performance of a building for a given day as long as the weather conditions are similar to those during the data collection time frame. This article uses data from the National Institute of Standards and Technology (NIST) Net-Zero Energy Residential Test Facility (NZERTF) to develop and validate a regression model to predict the energy performance of the NZERTF using two weather variables aggregated to the daily level, applies the model to estimate the energy performance of hypothetical NZERTFs located in different cities in the Mixed-Humid climate zone, and compares these estimates to the results from already existing EnergyPlus whole building energy simulations. This regression model exhibits agreement with EnergyPlus predictive trends in energy production and net consumption, but differs greatly in energy consumption. 
The model can be used as a framework for alternative and more complex models based on the experimental data collected from the NZERTF.

  4. QUANTITATIVE MODELING APPROACHES TO PREDICTING THE ACUTE NEUROTOXICITY OF VOLATILE ORGANIC COMPOUNDS (VOCS).

    EPA Science Inventory

    Lack of complete and appropriate human data requires prediction of the hazards for exposed human populations by extrapolation from available animal and in vitro data. Predictive models for the toxicity of chemicals can be constructed by linking kinetic and mode of action data uti...

  5. Artificial Neural Networks: A New Approach to Predicting Application Behavior.

    ERIC Educational Resources Information Center

    Gonzalez, Julie M. Byers; DesJardins, Stephen L.

    2002-01-01

    Applied the technique of artificial neural networks to predict which students were likely to apply to one research university. Compared the results to the traditional analysis tool, logistic regression modeling. Found that the addition of artificial intelligence models was a useful new tool for predicting student application behavior. (EV)

  6. MACE prediction of acute coronary syndrome via boosted resampling classification using electronic medical records.

    PubMed

    Huang, Zhengxing; Chan, Tak-Ming; Dong, Wei

    2017-02-01

    Major adverse cardiac events (MACE) of acute coronary syndrome (ACS) often occur suddenly resulting in high mortality and morbidity. Recently, the rapid development of electronic medical records (EMR) provides the opportunity to utilize the potential of EMR to improve the performance of MACE prediction. In this study, we present a novel data-mining based approach specialized for MACE prediction from a large volume of EMR data. The proposed approach presents a new classification algorithm by applying both over-sampling and under-sampling on minority-class and majority-class samples, respectively, and integrating the resampling strategy into a boosting framework so that it can effectively handle imbalance of MACE of ACS patients analogous to domain practice. The method learns a new and stronger MACE prediction model each iteration from a more difficult subset of EMR data with wrongly predicted MACEs of ACS patients by a previous weak model. We verify the effectiveness of the proposed approach on a clinical dataset containing 2930 ACS patient samples with 268 feature types. While the imbalanced ratio does not seem extreme (25.7%), MACE prediction targets pose great challenge to traditional methods. As these methods degenerate dramatically with increasing imbalanced ratios, the performance of our approach for predicting MACE remains robust and reaches 0.672 in terms of AUC. On average, the proposed approach improves the performance of MACE prediction by 4.8%, 4.5%, 8.6% and 4.8% over the standard SVM, Adaboost, SMOTE, and the conventional GRACE risk scoring system for MACE prediction, respectively. We consider that the proposed iterative boosting approach has demonstrated great potential to meet the challenge of MACE prediction for ACS patients using a large volume of EMR. Copyright © 2017 Elsevier Inc. All rights reserved.
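
    The resampling step described in this record, over-sampling the minority class and under-sampling the majority class, can be sketched in isolation. This is only an illustration under simple assumptions (random sampling, binary labels with 1 as the minority class, a hypothetical function name); it omits the boosting framework, and the authors' actual resampling rules are not given in the abstract.

```python
import random


def resample_balanced(samples, labels, n_per_class, seed=0):
    """Return a shuffled, class-balanced list of (sample, label) pairs.

    The minority class (label 1) is over-sampled with replacement and the
    majority class (label 0) is under-sampled without replacement.
    """
    rng = random.Random(seed)
    minority = [s for s, y in zip(samples, labels) if y == 1]
    majority = [s for s, y in zip(samples, labels) if y == 0]
    if not minority or n_per_class > len(majority):
        raise ValueError("need at least one minority sample and "
                         "n_per_class <= number of majority samples")
    over = [rng.choice(minority) for _ in range(n_per_class)]
    under = rng.sample(majority, n_per_class)
    balanced = [(s, 1) for s in over] + [(s, 0) for s in under]
    rng.shuffle(balanced)
    return balanced


# Hypothetical patient identifiers: 2 with an event, 8 without.
samples = ["p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8", "p9"]
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
balanced = resample_balanced(samples, labels, n_per_class=4)
```

    In a boosting setting, a step of this kind would be repeated in each round so that every weak learner is trained on a balanced subset.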

  7. Physics-of-Failure Approach to Prognostics

    NASA Technical Reports Server (NTRS)

    Kulkarni, Chetan S.

    2017-01-01

    As electric vehicles become increasingly common in daily operation, a critical challenge lies in accurate prediction of the state of the electrical components present in the system. In the case of electric vehicles, computing remaining battery charge is safety-critical. In order to tackle and solve the prediction problem, it is essential to have awareness of the current state and health of the system, especially since it is necessary to perform condition-based predictions. To be able to predict the future state of the system, it is also required to possess knowledge of the current and future operations of the vehicle. This presentation describes our approach to developing a system-level health-monitoring safety indicator for different electronic components, which runs estimation and prediction algorithms to determine state-of-charge and estimate the remaining useful life of the respective components. Given models of the current and future system behavior, the general approach of model-based prognostics can be employed as a solution to the prediction problem and further for decision making.

  8. Extended-Range Prediction with Low-Dimensional, Stochastic-Dynamic Models: A Data-driven Approach

    DTIC Science & Technology

    2012-09-30

    characterization of extratropical storms and extremes and link these to LFV modes. Mingfang Ting, Yochanan Kushnir, Andrew W. Robertson...simulating and predicting a wide range of climate phenomena including ENSO, tropical Atlantic sea surface temperatures (SSTs), storm track variability...into empirical prediction models. Use observations to improve low-order dynamical MJO models. Adam Sobel, Daehyun Kim. Extratropical variability

  9. Reliability prediction of ontology-based service compositions using Petri net and time series models.

    PubMed

    Li, Jia; Xia, Yunni; Luo, Xin

    2014-01-01

    OWL-S, one of the most important Semantic Web service ontologies proposed to date, provides a core ontological framework and guidelines for describing the properties and capabilities of web services in an unambiguous, computer-interpretable form. Predicting the reliability of composite service processes specified in OWL-S allows service users to decide whether the process meets the quantitative quality requirement. In this study, we consider the runtime quality of services to be fluctuating and introduce a dynamic framework to predict the runtime reliability of services specified in OWL-S, employing the Non-Markovian stochastic Petri net (NMSPN) and the time series model. The framework includes the following steps: obtaining the historical response time series of individual service components; fitting these series with an autoregressive moving-average (ARMA) model and predicting the future firing rates of service components; mapping the OWL-S process into an NMSPN model; employing the predicted firing rates as the model input of the NMSPN and calculating the normal completion probability as the reliability estimate. In the case study, a comparison between the static model and our approach based on experimental data is presented and it is shown that our approach achieves higher prediction accuracy.
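
    The time series step in this record can be illustrated with a deliberately simplified stand-in: a first-order autoregressive model, AR(1), fitted by least squares to a response time history and iterated forward. This is not the ARMA procedure of the study; the function names and the sample series are hypothetical.

```python
def fit_ar1(series):
    """Estimate the mean and the lag-1 coefficient phi of an AR(1) model."""
    mean = sum(series) / len(series)
    centered = [x - mean for x in series]
    numerator = sum(a * b for a, b in zip(centered[1:], centered[:-1]))
    denominator = sum(a * a for a in centered[:-1])
    # A constant series carries no lag information; fall back to phi = 0.
    phi = numerator / denominator if denominator != 0 else 0.0
    return mean, phi


def forecast_ar1(series, steps):
    """Forecast the next `steps` values by iterating the fitted AR(1) model."""
    mean, phi = fit_ar1(series)
    deviation = series[-1] - mean
    forecasts = []
    for _ in range(steps):
        deviation = phi * deviation
        forecasts.append(mean + deviation)
    return forecasts


# Hypothetical response times (seconds) of one service component.
response_times = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
next_two = forecast_ar1(response_times, steps=2)
```

    In the framework described above, forecasts of this kind would be converted into firing rates and passed to the Petri net model; that conversion is not shown here because the abstract does not specify it.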

  10. A computational language approach to modeling prose recall in schizophrenia

    PubMed Central

    Rosenstein, Mark; Diaz-Asper, Catherine; Foltz, Peter W.; Elvevåg, Brita

    2014-01-01

    Many cortical disorders are associated with memory problems. In schizophrenia, verbal memory deficits are a hallmark feature. However, the exact nature of this deficit remains elusive. Modeling aspects of language features used in memory recall have the potential to provide means for measuring these verbal processes. We employ computational language approaches to assess time-varying semantic and sequential properties of prose recall at various retrieval intervals (immediate, 30 min and 24 h later) in patients with schizophrenia, unaffected siblings and healthy unrelated control participants. First, we model the recall data to quantify the degradation of performance with increasing retrieval interval and the effect of diagnosis (i.e., group membership) on performance. Next we model the human scoring of recall performance using an n-gram language sequence technique, and then with a semantic feature based on Latent Semantic Analysis. These models show that automated analyses of the recalls can produce scores that accurately mimic human scoring. The final analysis addresses the validity of this approach by ascertaining the ability to predict group membership from models built on the two classes of language features. Taken individually, the semantic feature is most predictive, while a model combining the features improves accuracy of group membership prediction slightly above the semantic feature alone as well as over the human rating approach. We discuss the implications for cognitive neuroscience of such a computational approach in exploring the mechanisms of prose recall. PMID:24709122

  11. Prediction of Bispectral Index during Target-controlled Infusion of Propofol and Remifentanil: A Deep Learning Approach.

    PubMed

    Lee, Hyung-Chul; Ryu, Ho-Geol; Chung, Eun-Jin; Jung, Chul-Woo

    2018-03-01

    The discrepancy between predicted effect-site concentration and measured bispectral index is problematic during intravenous anesthesia with target-controlled infusion of propofol and remifentanil. We hypothesized that bispectral index during total intravenous anesthesia would be more accurately predicted by a deep learning approach. Long short-term memory and the feed-forward neural network were sequenced to simulate the pharmacokinetic and pharmacodynamic parts of an empirical model, respectively, to predict intraoperative bispectral index during combined use of propofol and remifentanil. Inputs of long short-term memory were infusion histories of propofol and remifentanil, which were retrieved from target-controlled infusion pumps for 1,800 s at 10-s intervals. Inputs of the feed-forward network were the outputs of long short-term memory and demographic data such as age, sex, weight, and height. The final output of the feed-forward network was the bispectral index. The performance of bispectral index prediction was compared between the deep learning model and a previously reported response surface model. The model hyperparameters comprised 8 memory cells in the long short-term memory layer and 16 nodes in the hidden layer of the feed-forward network. The model training and testing were performed with separate data sets of 131 and 100 cases. The concordance correlation coefficient (95% CI) was 0.561 (0.560 to 0.562) in the deep learning model, which was significantly larger than that in the response surface model (0.265 [0.263 to 0.266], P < 0.001). The deep learning model predicted bispectral index during target-controlled infusion of propofol and remifentanil more accurately than the traditional model. The deep learning approach in anesthetic pharmacology seems promising because of its excellent performance and extensibility.
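
    Illustrative sketch (not part of the cited record): the abstract reports agreement as a concordance correlation coefficient. Assuming the usual definition due to Lin (twice the covariance divided by the sum of the two variances and the squared difference of the means), it could be computed as below. The function name is hypothetical, and the confidence interval procedure used in the study is not reproduced.

```python
from statistics import mean


def concordance_cc(predicted, measured):
    """Lin's concordance correlation coefficient, using population (1/n) moments."""
    n = len(predicted)
    mp, mm = mean(predicted), mean(measured)
    var_p = sum((p - mp) ** 2 for p in predicted) / n
    var_m = sum((m - mm) ** 2 for m in measured) / n
    cov = sum((p - mp) * (m - mm) for p, m in zip(predicted, measured)) / n
    # The squared mean difference penalizes systematic offset, unlike Pearson's r.
    return 2 * cov / (var_p + var_m + (mp - mm) ** 2)
```

    Unlike an ordinary correlation, this measure drops below 1 when predictions are perfectly correlated with the measurements but shifted by a constant offset.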

  12. A systems approach to college drinking: development of a deterministic model for testing alcohol control policies.

    PubMed

    Scribner, Richard; Ackleh, Azmy S; Fitzpatrick, Ben G; Jacquez, Geoffrey; Thibodeaux, Jeremy J; Rommel, Robert; Simonsen, Neal

    2009-09-01

    The misuse and abuse of alcohol among college students remain persistent problems. Using a systems approach to understand the dynamics of student drinking behavior and thus forecasting the impact of campus policy to address the problem represents a novel approach. Toward this end, the successful development of a predictive mathematical model of college drinking would represent a significant advance for prevention efforts. A deterministic, compartmental model of college drinking was developed, incorporating three processes: (1) individual factors, (2) social interactions, and (3) social norms. The model quantifies these processes in terms of the movement of students between drinking compartments characterized by five styles of college drinking: abstainers, light drinkers, moderate drinkers, problem drinkers, and heavy episodic drinkers. Predictions from the model were first compared with actual campus-level data and then used to predict the effects of several simulated interventions to address heavy episodic drinking. First, the model provides a reasonable fit of actual drinking styles of students attending Social Norms Marketing Research Project campuses varying by "wetness" and by drinking styles of matriculating students. Second, the model predicts that a combination of simulated interventions targeting heavy episodic drinkers at a moderately "dry" campus would extinguish heavy episodic drinkers, replacing them with light and moderate drinkers. Instituting the same combination of simulated interventions at a moderately "wet" campus would result in only a moderate reduction in heavy episodic drinkers (i.e., 50% to 35%). A simple, five-state compartmental model adequately predicted the actual drinking patterns of students from a variety of campuses surveyed in the Social Norms Marketing Research Project study. The model predicted the impact on drinking patterns of several simulated interventions to address heavy episodic drinking on various types of campuses.
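
    Illustrative sketch (not part of the cited record): a deterministic compartmental model moves fractions of a population between states at specified rates. The sketch below uses the five drinking styles named in the abstract, but the transition rates are hypothetical placeholders, and only constant per-capita transitions are included; the social interaction and social norm terms of the published model are not represented.

```python
STYLES = ["abstainer", "light", "moderate", "problem", "heavy_episodic"]

# Hypothetical per-year transition rates between adjacent styles (not from the paper).
RATES = {
    ("abstainer", "light"): 0.30,
    ("light", "abstainer"): 0.05,
    ("light", "moderate"): 0.20,
    ("moderate", "light"): 0.10,
    ("moderate", "problem"): 0.10,
    ("problem", "moderate"): 0.10,
    ("problem", "heavy_episodic"): 0.15,
    ("heavy_episodic", "problem"): 0.05,
}


def step(state, rates, dt):
    """One forward-Euler step: each flow moves rate * source fraction * dt."""
    new = dict(state)
    for (src, dst), rate in rates.items():
        flow = rate * state[src] * dt
        new[src] -= flow
        new[dst] += flow
    return new


def simulate(state, rates, years, dt=0.01):
    """Integrate the compartment fractions over the given number of years."""
    for _ in range(round(years / dt)):
        state = step(state, rates, dt)
    return state
```

    A simulated intervention can then be expressed as a change to one or more rates, for example lowering the hypothetical rate from problem to heavy episodic drinking and comparing the resulting fractions.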

  13. A Systems Approach to College Drinking: Development of a Deterministic Model for Testing Alcohol Control Policies*

    PubMed Central

    Scribner, Richard; Ackleh, Azmy S.; Fitzpatrick, Ben G.; Jacquez, Geoffrey; Thibodeaux, Jeremy J.; Rommel, Robert; Simonsen, Neal

    2009-01-01

    Objective: The misuse and abuse of alcohol among college students remain persistent problems. Using a systems approach to understand the dynamics of student drinking behavior and thus forecasting the impact of campus policy to address the problem represents a novel approach. Toward this end, the successful development of a predictive mathematical model of college drinking would represent a significant advance for prevention efforts. Method: A deterministic, compartmental model of college drinking was developed, incorporating three processes: (1) individual factors, (2) social interactions, and (3) social norms. The model quantifies these processes in terms of the movement of students between drinking compartments characterized by five styles of college drinking: abstainers, light drinkers, moderate drinkers, problem drinkers, and heavy episodic drinkers. Predictions from the model were first compared with actual campus-level data and then used to predict the effects of several simulated interventions to address heavy episodic drinking. Results: First, the model provides a reasonable fit of actual drinking styles of students attending Social Norms Marketing Research Project campuses varying by “wetness” and by drinking styles of matriculating students. Second, the model predicts that a combination of simulated interventions targeting heavy episodic drinkers at a moderately “dry” campus would extinguish heavy episodic drinkers, replacing them with light and moderate drinkers. Instituting the same combination of simulated interventions at a moderately “wet” campus would result in only a moderate reduction in heavy episodic drinkers (i.e., 50% to 35%). Conclusions: A simple, five-state compartmental model adequately predicted the actual drinking patterns of students from a variety of campuses surveyed in the Social Norms Marketing Research Project study. The model predicted the impact on drinking patterns of several simulated interventions to address heavy episodic drinking on various types of campuses. PMID:19737506

  14. Forecasting influenza in Hong Kong with Google search queries and statistical model fusion.

    PubMed

    Xu, Qinneng; Gel, Yulia R; Ramirez Ramirez, L Leticia; Nezafati, Kusha; Zhang, Qingpeng; Tsui, Kwok-Leung

    2017-01-01

    The objective of this study is to investigate the predictive utility of online social media and web search queries, particularly Google search data, to forecast new cases of influenza-like-illness (ILI) in general outpatient clinics (GOPC) in Hong Kong. To mitigate the impact of sensitivity to self-excitement (i.e., fickle media interest) and other artifacts of online social media data, in our approach we fuse multiple offline and online data sources. Four individual models, namely a generalized linear model (GLM), least absolute shrinkage and selection operator (LASSO), autoregressive integrated moving average (ARIMA), and deep learning (DL) with Feedforward Neural Networks (FNN), are employed to forecast ILI-GOPC both one week and two weeks in advance. The covariates include Google search queries, meteorological data, and previously recorded offline ILI. To our knowledge, this is the first study that introduces deep learning methodology into surveillance of infectious diseases and investigates its predictive utility. Furthermore, to exploit the strengths of each individual forecasting model, we use statistical model fusion, using Bayesian model averaging (BMA), which allows a systematic integration of multiple forecast scenarios. For each model, an adaptive approach is used to capture the recent relationship between ILI and covariates. DL with FNN appears to deliver the most competitive predictive performance among the four considered individual models. Combining all four models in a comprehensive BMA framework further improves predictive evaluation metrics such as root mean squared error (RMSE) and mean absolute predictive error (MAPE). Nevertheless, DL with FNN remains the preferred method for predicting locations of influenza peaks. The proposed approach can be viewed as a feasible alternative for forecasting ILI in Hong Kong or other countries where ILI has no constant seasonal trend and influenza data resources are limited. The proposed methodology is easily tractable and computationally efficient.
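
    Illustrative sketch (not part of the cited record): model fusion by averaging can be pictured as a weighted sum of individual forecasts. The sketch below assumes a common textbook approximation in which posterior model weights are proportional to exp(-BIC/2), with a Gaussian-error BIC computed from past forecast errors. This is a simplification chosen for illustration; the record does not state how the BMA weights were estimated, and all function names are hypothetical.

```python
import math


def rmse(pred, obs):
    """Root mean squared error of a forecast series."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def bic_gaussian(pred, obs, n_params):
    """BIC under Gaussian errors: n*ln(MSE) + k*ln(n). Requires a nonzero MSE."""
    n = len(obs)
    mse = sum((p - o) ** 2 for p, o in zip(pred, obs)) / n
    return n * math.log(mse) + n_params * math.log(n)


def bma_weights(bics):
    """Approximate posterior model weights, proportional to exp(-BIC/2)."""
    best = min(bics)  # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


def fuse(forecasts, weights):
    """Weighted average of the individual models' forecasts for one time point."""
    return sum(w * f for w, f in zip(weights, forecasts))
```

    Recomputing the weights on a moving window of recent errors would be one way to mirror the adaptive behaviour mentioned in the abstract.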

  15. Proton exchange membrane fuel cell model for aging predictions: Simulated equivalent active surface area loss and comparisons with durability tests

    NASA Astrophysics Data System (ADS)

    Robin, C.; Gérard, M.; Quinaud, M.; d'Arbigny, J.; Bultel, Y.

    2016-09-01

    The prediction of Proton Exchange Membrane Fuel Cell (PEMFC) lifetime is one of the major challenges in optimizing both material properties and dynamic control of the fuel cell system. In this study, a multiscale modeling approach couples a mechanistic catalyst dissolution model to a dynamic PEMFC cell model to predict the performance loss of the PEMFC. Results are compared to two 2000-h experimental aging tests. More precisely, an original approach is introduced to estimate the loss of an equivalent active surface area during an aging test. Indeed, when the computed Electrochemical Catalyst Surface Area profile is fitted to the experimental measurements from Cyclic Voltammetry, the computed performance loss of the PEMFC is underestimated. To be able to predict the performance loss measured by polarization curves during the aging test, an equivalent active surface area is obtained by a model inversion. This methodology successfully reproduces the experimental cell voltage decay over time. The model parameters are fitted from the polarization curves so that they include the global degradation. Moreover, the model captures the aging heterogeneities along the surface of the cell observed experimentally. Finally, a second 2000-h durability test in dynamic operating conditions validates the approach.

  16. Validating spatiotemporal predictions of an important pest of small grains.

    PubMed

    Merrill, Scott C; Holtzer, Thomas O; Peairs, Frank B; Lester, Philip J

    2015-01-01

    Arthropod pests are typically managed using tactics applied uniformly to the whole field. Precision pest management applies tactics under the assumption that within-field pest pressure differences exist. This approach allows for more precise and judicious use of scouting resources and management tactics. For example, a portion of a field delineated as attractive to pests may be selected to receive extra monitoring attention. Likely because of the high variability in pest dynamics, little attention has been given to developing precision pest prediction models. Here, multimodel synthesis was used to develop a spatiotemporal model predicting the density of a key pest of wheat, the Russian wheat aphid, Diuraphis noxia (Kurdjumov). Spatially implicit and spatially explicit models were synthesized to generate spatiotemporal pest pressure predictions. Cross-validation and field validation were used to confirm model efficacy. A strong within-field signal depicting aphid density was confirmed with low prediction errors. Results show that the within-field model predictions will provide higher-quality information than would be provided by traditional field scouting. With improvements to the broad-scale model component, the model synthesis approach and resulting tool could improve pest management strategy and provide a template for the development of spatially explicit pest pressure models. © 2014 Society of Chemical Industry.

  17. Reducing usage of the computational resources by event driven approach to model predictive control

    NASA Astrophysics Data System (ADS)

    Misik, Stefan; Bradac, Zdenek; Cela, Arben

    2017-08-01

    This paper deals with real-time optimal control of dynamic systems while also considering the constraints to which these systems might be subject. The main objective of this work is to propose a simple modification of the existing Model Predictive Control approach to better suit the needs of computational resource-constrained real-time systems. An example using a model of a mechanical system is presented, and the performance of the proposed method is evaluated in a simulated environment.

  18. Using a neural network approach and time series data from an international monitoring station in the Yellow Sea for modeling marine ecosystems.

    PubMed

    Zhang, Yingying; Wang, Juncheng; Vorontsov, A M; Hou, Guangli; Nikanorova, M N; Wang, Hongliang

    2014-01-01

    The international marine ecological safety monitoring demonstration station in the Yellow Sea was developed as a collaborative project between China and Russia. It is a nonprofit technical workstation designed as a facility for marine scientific research for public welfare. By undertaking long-term monitoring of the marine environment and automatic data collection, this station will provide valuable information for marine ecological protection and disaster prevention and reduction. The results of some initial research by scientists at the research station into predictive modeling of marine ecological environments and early warning are described in this paper. Marine ecological processes are influenced by many factors including hydrological and meteorological conditions, biological factors, and human activities. Consequently, it is very difficult to incorporate all these influences and their interactions in a deterministic or analysis model. A prediction model integrating a time series prediction approach with neural network nonlinear modeling is proposed for marine ecological parameters. The model explores the natural fluctuations in marine ecological parameters by learning from the latest observed data automatically, and then predicting future values of the parameter. The model is updated in a "rolling" fashion with new observed data from the monitoring station. Prediction experiments results showed that the neural network prediction model based on time series data is effective for marine ecological prediction and can be used for the development of early warning systems.

  19. Near-Surface Wind Predictions in Complex Terrain with a CFD Approach Optimized for Atmospheric Boundary Layer Flows

    NASA Astrophysics Data System (ADS)

    Wagenbrenner, N. S.; Forthofer, J.; Butler, B.; Shannon, K.

    2014-12-01

    Near-surface wind predictions are important for a number of applications, including transport and dispersion, wind energy forecasting, and wildfire behavior. Researchers and forecasters would benefit from a wind model that could be readily applied to complex terrain for use in these various disciplines. Unfortunately, near-surface winds in complex terrain are not handled well by traditional modeling approaches. Numerical weather prediction models employ coarse horizontal resolutions which do not adequately resolve sub-grid terrain features important to the surface flow. Computational fluid dynamics (CFD) models are increasingly being applied to simulate atmospheric boundary layer (ABL) flows, especially in wind energy applications; however, the standard functionality provided in commercial CFD models is not suitable for ABL flows. Appropriate CFD modeling in the ABL requires modification of empirically-derived wall function parameters and boundary conditions to avoid erroneous streamwise gradients due to inconsistencies between inlet profiles and specified boundary conditions. This work presents a new version of a near-surface wind model for complex terrain called WindNinja. The new version of WindNinja offers two options for flow simulations: 1) the native, fast-running mass-consistent method available in previous model versions and 2) a CFD approach based on the OpenFOAM modeling framework and optimized for ABL flows. The model is described and evaluations of predictions with surface wind data collected from two recent field campaigns in complex terrain are presented. A comparison of predictions from the native mass-consistent method and the new CFD method is also provided.

  20. Genetic determinants of freckle occurrence in the Spanish population: Towards ephelides prediction from human DNA samples.

    PubMed

    Hernando, Barbara; Ibañez, Maria Victoria; Deserio-Cuesta, Julio Alberto; Soria-Navarro, Raquel; Vilar-Sastre, Inca; Martinez-Cadenas, Conrado

    2018-03-01

    Prediction of human pigmentation traits, one of the most differentiable externally visible characteristics among individuals, from biological samples represents a useful tool in the field of forensic DNA phenotyping. In spite of freckling being a relatively common pigmentation characteristic in Europeans, little is known about the genetic basis of this largely genetically determined phenotype in southern European populations. In this work, we explored the predictive capacity of eight freckle and sunlight sensitivity-related genes in 458 individuals (266 non-freckled controls and 192 freckled cases) from Spain. Four loci were associated with freckling (MC1R, IRF4, ASIP and BNC2), and female sex was also found to be a predictive factor for having a freckling phenotype in our population. After identifying the most informative genetic variants responsible for human ephelides occurrence in our sample set, we developed a DNA-based freckle prediction model using a multivariate regression approach. Once developed, the capabilities of the prediction model were tested by a repeated 10-fold cross-validation approach. The proportion of correctly predicted individuals using the DNA-based freckle prediction model was 74.13%. The implementation of sex into the DNA-based freckle prediction model slightly improved the overall prediction accuracy by 2.19% (76.32%). Further evaluation of the newly-generated prediction model was performed by assessing the model's performance in a new cohort of 212 Spanish individuals, reaching a classification success rate of 74.61%. Validation of this prediction model may be carried out in larger populations, including samples from different European populations. Further research to validate and improve this newly-generated freckle prediction model will be needed before its forensic application. Together with DNA tests already validated for eye and hair colour prediction, this freckle prediction model may lead to a substantially more detailed physical description of unknown individuals from DNA found at the crime scene. Copyright © 2017 Elsevier B.V. All rights reserved.
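
    Illustrative sketch (not part of the cited record): repeated 10-fold cross-validation reshuffles the samples, splits them into ten folds, and scores the model on each held-out fold in turn, repeating the whole procedure several times. The sketch below shows one way to generate such splits and accumulate accuracy. The number of repeats, the seed, and the `fit`/`predict` callables are assumptions for illustration; the study's regression model is not reproduced.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=3, seed=0):
    """Yield (train_indices, test_indices) for every fold of every repeat."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random partition for each repeat
        folds = [idx[i::k] for i in range(k)]
        for f in range(k):
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield train, folds[f]


def cv_accuracy(features, labels, fit, predict, **split_args):
    """Proportion of held-out samples classified correctly across all folds."""
    correct = total = 0
    for train, test in repeated_kfold(len(labels), **split_args):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        for i in test:
            correct += predict(model, features[i]) == labels[i]
            total += 1
    return correct / total
```

    As a reference point, a classifier that always predicts the majority class would score 266/458 (about 58%) on a sample with the class sizes given in the abstract, which is the baseline against which the reported accuracies can be read.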

  1. An integrative formal model of motivation and decision making: The MGPM*.

    PubMed

    Ballard, Timothy; Yeo, Gillian; Loft, Shayne; Vancouver, Jeffrey B; Neal, Andrew

    2016-09-01

    We develop and test an integrative formal model of motivation and decision making. The model, referred to as the extended multiple-goal pursuit model (MGPM*), is an integration of the multiple-goal pursuit model (Vancouver, Weinhardt, & Schmidt, 2010) and decision field theory (Busemeyer & Townsend, 1993). Simulations of the model generated predictions regarding the effects of goal type (approach vs. avoidance), risk, and time sensitivity on prioritization. We tested these predictions in an experiment in which participants pursued different combinations of approach and avoidance goals under different levels of risk. The empirical results were consistent with the predictions of the MGPM*. Specifically, participants pursuing 1 approach and 1 avoidance goal shifted priority from the approach to the avoidance goal over time. Among participants pursuing 2 approach goals, those with low time sensitivity prioritized the goal with the larger discrepancy, whereas those with high time sensitivity prioritized the goal with the smaller discrepancy. Participants pursuing 2 avoidance goals generally prioritized the goal with the smaller discrepancy. Finally, all of these effects became weaker as the level of risk increased. We used quantitative model comparison to show that the MGPM* explained the data better than the original multiple-goal pursuit model, and that the major extensions from the original model were justified. The MGPM* represents a step forward in the development of a general theory of decision making during multiple-goal pursuit. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  2. QSPR models for half-wave reduction potential of steroids: a comparative study between feature selection and feature extraction from subsets of or entire set of descriptors.

    PubMed

    Hemmateenejad, Bahram; Yazdani, Mahdieh

    2009-02-16

    Steroids are widely distributed in nature and are found in abundance in plants, animals, and fungi. A data set consisting of a diverse set of steroids was used to develop quantitative structure-electrochemistry relationship (QSER) models for their half-wave reduction potential. Modeling was carried out by means of multiple linear regression (MLR) and principal component regression (PCR) analyses. In the MLR analysis, the QSPR models were constructed either by first grouping descriptors and then selecting variables stepwise from each group (MLR1) or by stepwise selection of predictor variables from the pool of all calculated descriptors (MLR2). A similar procedure was used in the PCR analysis, so that the principal components (or features) were extracted from different groups of descriptors (PCR1) and from the entire set of descriptors (PCR2). The resulting models were evaluated using cross-validation, chance correlation, application to prediction of the reduction potential of some test samples, and assessment of the applicability domain. Both MLR approaches gave accurate results; however, the QSPR model found by MLR1 was statistically more significant. The PCR1 approach produced a model as accurate as the MLR approaches, whereas less accurate results were obtained by the PCR2 approach. Overall, the correlation coefficients of cross-validation and prediction of the QSPR models resulting from the MLR1, MLR2 and PCR1 approaches were higher than 90%, which shows the high ability of the models to predict the reduction potential of the studied steroids.

  3. Predicting seasonal influenza transmission using functional regression models with temporal dependence.

    PubMed

    Oviedo de la Fuente, Manuel; Febrero-Bande, Manuel; Muñoz, María Pilar; Domínguez, Àngela

    2018-01-01

    This paper proposes a novel approach that uses meteorological information to predict the incidence of influenza in Galicia (Spain). It extends the Generalized Least Squares (GLS) methods in the multivariate framework to functional regression models with dependent errors. These kinds of models are useful when the recent history of the incidence of influenza is not readily available (for instance, because of delays in communication with health informants) and the prediction must be constructed by correcting the temporal dependence of the residuals and using more accessible variables. A simulation study shows that the GLS estimators render better estimations of the parameters associated with the regression model than the classical models do. They obtain extremely good results from the predictive point of view and are competitive with the classical time series approach for the incidence of influenza. An iterative version of the GLS estimator (called iGLS) is also proposed that can help to model complicated dependence structures. For constructing the model, the distance correlation measure [Formula: see text] was employed to select relevant information to predict the influenza rate from a mix of multivariate and functional variables. These kinds of models are extremely useful to health managers in allocating resources in advance to manage influenza epidemics.

  4. A Numerical-Analytical Approach to Modeling the Axial Rotation of the Earth

    NASA Astrophysics Data System (ADS)

    Markov, Yu. G.; Perepelkin, V. V.; Rykhlova, L. V.; Filippova, A. S.

    2018-04-01

    A model for the non-uniform axial rotation of the Earth is studied using a celestial-mechanical approach and numerical simulations. The application of an approximate model containing a small number of parameters to predict variations of the axial rotation velocity of the Earth over short time intervals is justified. This approximate model is obtained by averaging variable parameters that are subject to small variations due to non-stationarity of the perturbing factors. The model is verified and compared with predictions over a long time interval published by the International Earth Rotation and Reference Systems Service (IERS).

  5. Multiscale Modeling of PEEK Using Reactive Molecular Dynamics Modeling and Micromechanics

    NASA Technical Reports Server (NTRS)

    Pisani, William A.; Radue, Matthew; Chinkanjanarot, Sorayot; Bednarcyk, Brett A.; Pineda, Evan J.; King, Julia A.; Odegard, Gregory M.

    2018-01-01

    Polyether ether ketone (PEEK) is a high-performance, semi-crystalline thermoplastic that is used in a wide range of engineering applications, including some structural components of aircraft. The design of new PEEK-based materials requires a precise understanding of the multiscale structure and behavior of semi-crystalline PEEK. Molecular Dynamics (MD) modeling can efficiently predict bulk-level properties of single phase polymers, and micromechanics can be used to homogenize those phases based on the overall polymer microstructure. In this study, MD modeling was used to predict the mechanical properties of the amorphous and crystalline phases of PEEK. The hierarchical microstructure of PEEK, which combines the aforementioned phases, was modeled using a multiscale modeling approach facilitated by NASA's MSGMC. The bulk mechanical properties of semi-crystalline PEEK predicted using MD modeling and MSGMC agree well with vendor data, thus validating the multiscale modeling approach.

  6. Improving orbit prediction accuracy through supervised machine learning

    NASA Astrophysics Data System (ADS)

    Peng, Hao; Bai, Xiaoli

    2018-05-01

    Due to the lack of information such as the space environment condition and resident space objects' (RSOs') body characteristics, current orbit predictions that are solely grounded on physics-based models may fail to achieve required accuracy for collision avoidance and have led to satellite collisions already. This paper presents a methodology to predict RSOs' trajectories with higher accuracy than that of the current methods. Inspired by the machine learning (ML) theory through which the models are learned based on large amounts of observed data and the prediction is conducted without explicitly modeling space objects and space environment, the proposed ML approach integrates physics-based orbit prediction algorithms with a learning-based process that focuses on reducing the prediction errors. Using a simulation-based space catalog environment as the test bed, the paper demonstrates three types of generalization capability for the proposed ML approach: (1) the ML model can be used to improve the same RSO's orbit information that is not available during the learning process but shares the same time interval as the training data; (2) the ML model can be used to improve predictions of the same RSO at future epochs; and (3) the ML model based on an RSO can be applied to other RSOs that share some common features.
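
    Illustrative sketch (not part of the cited record): the idea of combining a physics-based prediction with a learned error correction can be shown in one dimension. Here the historical prediction error is modelled as a straight-line function of a single feature (for example, the prediction horizon) and then subtracted from new physics-based predictions. The linear error model, the single feature, and the function names are assumptions for illustration; the learning algorithm and features used in the paper are not specified in this record.

```python
from statistics import mean


def fit_error_model(features, errors):
    """Least-squares line error = intercept + slope * feature, from past predictions."""
    mf, me = mean(features), mean(errors)
    sxx = sum((f - mf) ** 2 for f in features)
    sxy = sum((f - mf) * (e - me) for f, e in zip(features, errors))
    slope = sxy / sxx
    return me - slope * mf, slope


def correct(physics_prediction, feature, model):
    """Subtract the learned error estimate from the physics-based prediction."""
    intercept, slope = model
    return physics_prediction - (intercept + slope * feature)
```

    Here `errors` are defined as physics-based prediction minus the later observed value, so the physics model is left unchanged and only its residual is learned.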

  7. Comparison of Predicted Thermoelectric Energy Conversion Efficiency by Cumulative Properties and Reduced Variables Approaches

    NASA Astrophysics Data System (ADS)

    Linker, Thomas M.; Lee, Glenn S.; Beekman, Matt

    2018-06-01

    The semi-analytical methods of thermoelectric energy conversion efficiency calculation based on the cumulative properties approach and reduced variables approach are compared for 21 high performance thermoelectric materials. Both approaches account for the temperature dependence of the material properties as well as the Thomson effect, thus the predicted conversion efficiencies are generally lower than that based on the conventional thermoelectric figure of merit ZT for nearly all of the materials evaluated. The two methods also predict material energy conversion efficiencies that are in very good agreement with each other, even for large temperature differences (average percent difference of 4% with maximum observed deviation of 11%). The tradeoff between obtaining a reliable assessment of a material's potential for thermoelectric applications and the complexity of implementation of the three models, as well as the advantages of using more accurate modeling approaches in evaluating new thermoelectric materials, are highlighted.
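
    Illustrative sketch (not part of the cited record): the conventional estimate against which the two semi-analytical methods are compared is commonly written as the Carnot factor multiplied by a term depending on an average figure of merit ZT, assuming temperature-independent material properties. The function below encodes that standard constant-property expression; the function and argument names are hypothetical, and the cumulative properties and reduced variables calculations themselves are not shown.

```python
import math


def efficiency_constant_zt(t_hot, t_cold, zt_avg):
    """Maximum efficiency for constant properties; temperatures in kelvin."""
    carnot = (t_hot - t_cold) / t_hot
    root = math.sqrt(1.0 + zt_avg)
    return carnot * (root - 1.0) / (root + t_cold / t_hot)
```

    Because this expression ignores the temperature dependence of the properties and the Thomson effect, it serves only as the simple baseline mentioned in the abstract.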

  8. Modeling and predicting historical volatility in exchange rate markets

    NASA Astrophysics Data System (ADS)

    Lahmiri, Salim

    2017-04-01

    Volatility modeling and forecasting of currency exchange rates is important in several business risk management tasks, including treasury risk management, derivatives pricing, and portfolio risk evaluation. The purpose of this study is to present a simple and effective approach for predicting historical volatility of currency exchange rates. The approach is based on a limited set of technical indicators as inputs to artificial neural networks (ANN). To show the effectiveness of the proposed approach, it was applied to forecast US/Canada and US/Euro exchange rate volatilities. The forecasting results show that our simple approach outperformed the conventional GARCH and EGARCH with different distribution assumptions, and also the hybrid GARCH and EGARCH with ANN in terms of mean absolute error, mean of squared errors, and Theil's inequality coefficient. Because of the simplicity and effectiveness of the approach, it is promising for US currency volatility prediction tasks.
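
    Illustrative sketch (not part of the cited record): the quantities named in the abstract can be made concrete under common textbook definitions. Historical volatility is assumed here to be the sample standard deviation of log returns (without annualization), and Theil's inequality coefficient is assumed to be the form bounded between 0 and 1. The record does not state which definitions the study used, and the function names are hypothetical.

```python
import math


def historical_volatility(prices):
    """Sample standard deviation of log returns (needs at least three prices)."""
    r = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    m = sum(r) / len(r)
    return math.sqrt(sum((x - m) ** 2 for x in r) / (len(r) - 1))


def mae(forecast, actual):
    """Mean absolute error."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


def mse(forecast, actual):
    """Mean of squared errors."""
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)


def theil_u(forecast, actual):
    """Theil's inequality coefficient: 0 is a perfect forecast, 1 is the worst case."""
    n = len(actual)
    rms_f = math.sqrt(sum(f * f for f in forecast) / n)
    rms_a = math.sqrt(sum(a * a for a in actual) / n)
    return math.sqrt(mse(forecast, actual)) / (rms_f + rms_a)
```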

  9. Technosocial Predictive Analytics in Support of Naturalistic Decision Making

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanfilippo, Antonio P.; Cowell, Andrew J.; Malone, Elizabeth L.

    2009-06-23

    A main challenge we face in fostering sustainable growth is to anticipate outcomes through predictive and proactive across domains as diverse as energy, security, the environment, health and finance in order to maximize opportunities, influence outcomes and counter adversities. The goal of this paper is to present new methods for anticipatory analytical thinking which address this challenge through the development of a multi-perspective approach to predictive modeling as a core to a creative decision making process. This approach is uniquely multidisciplinary in that it strives to create decision advantage through the integration of human and physical models, and leverages knowledge management and visual analytics to support creative thinking by facilitating the achievement of interoperable knowledge inputs and enhancing the user’s cognitive access. We describe a prototype system which implements this approach and exemplify its functionality with reference to a use case in which predictive modeling is paired with analytic gaming to support collaborative decision-making in the domain of agricultural land management.

  10. Coupling of EIT with computational lung modeling for predicting patient-specific ventilatory responses.

    PubMed

    Roth, Christian J; Becher, Tobias; Frerichs, Inéz; Weiler, Norbert; Wall, Wolfgang A

    2017-04-01

    Providing optimal personalized mechanical ventilation for patients with acute or chronic respiratory failure is still a challenge within a clinical setting for each case anew. In this article, we integrate electrical impedance tomography (EIT) monitoring into a powerful patient-specific computational lung model to create an approach for personalizing protective ventilatory treatment. The underlying computational lung model is based on a single computed tomography scan and is able to predict global airflow quantities, as well as local tissue aeration and strains, for any ventilation maneuver. For validation, a novel "virtual EIT" module is added to our computational lung model, allowing EIT images to be simulated based on the patient's thorax geometry and the results of our numerically predicted tissue aeration. Clinically measured EIT images are not used to calibrate the computational model; thus, they provide an independent method to validate the computational predictions at high temporal resolution. The performance of this coupling approach has been tested in an example patient with acute respiratory distress syndrome. The method shows good agreement between computationally predicted and clinically measured airflow data and EIT images. These results imply that the proposed framework can be used for numerical prediction of patient-specific responses to certain therapeutic measures before applying them to an actual patient. In the long run, definition of patient-specific optimal ventilation protocols might be assisted by computational modeling. NEW & NOTEWORTHY In this work, we present a patient-specific computational lung model that is able to predict global and local ventilatory quantities for a given patient and any selected ventilation protocol. For the first time, such a predictive lung model is equipped with a virtual electrical impedance tomography module allowing real-time validation of the computed results with the patient measurements. First promising results obtained in an acute respiratory distress syndrome patient show the potential of this approach for personalized computationally guided optimization of mechanical ventilation in the future. Copyright © 2017 the American Physiological Society.

  11. Optimization of a novel biophysical model using large scale in vivo antisense hybridization data displays improved prediction capabilities of structurally accessible RNA regions.

    PubMed

    Vazquez-Anderson, Jorge; Mihailovic, Mia K; Baldridge, Kevin C; Reyes, Kristofer G; Haning, Katie; Cho, Seung Hee; Amador, Paul; Powell, Warren B; Contreras, Lydia M

    2017-05-19

    Current approaches to design efficient antisense RNAs (asRNAs) rely primarily on a thermodynamic understanding of RNA-RNA interactions. However, these approaches depend on structure predictions and have limited accuracy, arguably due to overlooking important cellular environment factors. In this work, we develop a biophysical model to describe asRNA-RNA hybridization that incorporates in vivo factors using large-scale experimental hybridization data for three model RNAs: a group I intron, CsrB and a tRNA. A unique element of our model is the estimation of the availability of the target region to interact with a given asRNA using a differential entropic consideration of suboptimal structures. We showcase the utility of this model by evaluating its prediction capabilities in four additional RNAs: a group II intron, Spinach II, 2-MS2 binding domain and glgC 5′ UTR. Additionally, we demonstrate the applicability of this approach to other bacterial species by predicting sRNA-mRNA binding regions in two newly discovered, though uncharacterized, regulatory RNAs. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Control of Systems With Slow Actuators Using Time Scale Separation

    NASA Technical Reports Server (NTRS)

    Stepanyan, Vehram; Nguyen, Nhan

    2009-01-01

    This paper addresses the problem of controlling a nonlinear plant with a slow actuator using the singular perturbation method. For the known plant-actuator cascaded system, the proposed scheme achieves tracking of a given reference model with considerably less control demand than would otherwise result from conventional design techniques. This is the consequence of excluding the small parameter from the actuator dynamics via time scale separation. The resulting tracking error is within the order of this small parameter. For the unknown system, the adaptive counterpart is developed based on the prediction model, which is driven towards the reference model by the control design. It is proven that the prediction model tracks the reference model with an error proportional to the small parameter, while the prediction error converges to zero. The resulting closed-loop system with all prediction models and adaptive laws remains stable. The benefits of the approach are demonstrated in simulation studies and compared to conventional control approaches.

  13. All-atom 3D structure prediction of transmembrane β-barrel proteins from sequences.

    PubMed

    Hayat, Sikander; Sander, Chris; Marks, Debora S; Elofsson, Arne

    2015-04-28

    Transmembrane β-barrels (TMBs) carry out major functions in substrate transport and protein biogenesis but experimental determination of their 3D structure is challenging. Encouraged by successful de novo 3D structure prediction of globular and α-helical membrane proteins from sequence alignments alone, we developed an approach to predict the 3D structure of TMBs. The approach combines the maximum-entropy evolutionary coupling method for predicting residue contacts (EVfold) with a machine-learning approach (boctopus2) for predicting β-strands in the barrel. In a blinded test for 19 TMB proteins of known structure that have a sufficient number of diverse homologous sequences available, this combined method (EVfold_bb) predicts hydrogen-bonded residue pairs between adjacent β-strands at an accuracy of ∼70%. This accuracy is sufficient for the generation of all-atom 3D models. In the transmembrane barrel region, the average 3D structure accuracy [template-modeling (TM) score] of top-ranked models is 0.54 (ranging from 0.36 to 0.85), with a higher (44%) number of residue pairs in correct strand-strand registration than in earlier methods (18%). Although the nonbarrel regions are predicted less accurately overall, the evolutionary couplings identify some highly constrained loop residues and, for FecA protein, the barrel including the structure of a plug domain can be accurately modeled (TM score = 0.68). Lower prediction accuracy tends to be associated with insufficient sequence information and we therefore expect increasing numbers of β-barrel families to become accessible to accurate 3D structure prediction as the number of available sequences increases.

  14. A time series modeling approach in risk appraisal of violent and sexual recidivism.

    PubMed

    Bani-Yaghoub, Majid; Fedoroff, J Paul; Curry, Susan; Amundsen, David E

    2010-10-01

    For over half a century, various clinical and actuarial methods have been employed to assess the likelihood of violent recidivism. Yet there is a need for new methods that can improve the accuracy of recidivism predictions. This study proposes a new time series modeling approach that generates high levels of predictive accuracy over short and long periods of time. The proposed approach outperformed two widely used actuarial instruments (i.e., the Violence Risk Appraisal Guide and the Sex Offender Risk Appraisal Guide). Furthermore, analysis of temporal risk variations based on specific time series models can add valuable information into risk assessment and management of violent offenders.

  15. Vehicular traffic noise prediction using soft computing approach.

    PubMed

    Singh, Daljeet; Nigam, S P; Agrawal, V P; Kumar, Maneek

    2016-12-01

    A new approach for the development of vehicular traffic noise prediction models is presented. Four different soft computing methods, namely, Generalized Linear Model, Decision Trees, Random Forests and Neural Networks, have been used to develop models to predict the hourly equivalent continuous sound pressure level, Leq, at different locations in the city of Patiala, India. The input variables include the traffic volume per hour, percentage of heavy vehicles and average speed of vehicles. The performance of the four models is compared on the basis of three performance criteria: coefficient of determination, mean square error and accuracy. 10-fold cross validation is done to check the stability of the Random Forest model, which gave the best results. A t-test is performed to check the fit of the model with the field data. Copyright © 2016 Elsevier Ltd. All rights reserved.
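
    The 10-fold cross validation step mentioned in this abstract can be sketched as below. This is only an illustration under stated assumptions: a one-variable least-squares line stands in for the Random Forest model, the folds are contiguous rather than shuffled, and all function names and data are hypothetical.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for the real model)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def cross_validated_mse(xs, ys, k=10):
    """Return one held-out mean square error per fold."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors = [(ys[i] - (a + b * xs[i])) ** 2 for i in test_idx]
        scores.append(sum(errors) / len(errors))
    return scores


if __name__ == "__main__":
    volume = [float(v) for v in range(100, 300, 10)]   # made-up hourly traffic volumes
    leq = [55.0 + 0.05 * v for v in volume]            # made-up noise levels in dB
    print(cross_validated_mse(volume, leq, k=10))
```

    A small spread across the per-fold errors suggests the fitted model is stable with respect to which observations are held out; in practice the rows would normally be shuffled before the folds are formed.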

  16. A predictive parameter estimation approach for the thermodynamically constrained averaging theory applied to diffusion in porous media

    NASA Astrophysics Data System (ADS)

    Valdes-Parada, F. J.; Ostvar, S.; Wood, B. D.; Miller, C. T.

    2017-12-01

    Modeling of hierarchical systems such as porous media can be performed by different approaches that bridge microscale physics to the macroscale. Among the several alternatives available in the literature, the thermodynamically constrained averaging theory (TCAT) has emerged as a robust modeling approach that provides macroscale models that are consistent across scales. For specific closure relation forms, TCAT models are expressed in terms of parameters that depend upon the physical system under study. These parameters are usually obtained from inverse modeling based upon either experimental data or direct numerical simulation at the pore scale. Other upscaling approaches, such as the method of volume averaging, involve an a priori scheme for parameter estimation for certain microscale and transport conditions. In this work, we show how such a predictive scheme can be implemented in TCAT by studying the simple problem of single-phase passive diffusion in rigid and homogeneous porous media. The components of the effective diffusivity tensor are predicted for several porous media by solving ancillary boundary-value problems in periodic unit cells. The results are validated through a comparison with data from direct numerical simulation. This extension of TCAT constitutes a useful advance for certain classes of problems amenable to this estimation approach.

  17. Comparison of time series models for predicting campylobacteriosis risk in New Zealand.

    PubMed

    Al-Sakkaf, A; Jones, G

    2014-05-01

    Predicting campylobacteriosis cases is a matter of considerable concern in New Zealand, after the number of notified cases was the highest among the developed countries in 2006. Thus, there is a need to develop a model or tool to accurately predict the number of campylobacteriosis cases, as the Microbial Risk Assessment Model previously used for this purpose failed to predict the actual number of cases accurately. We explore the appropriateness of classical time series modelling approaches for predicting campylobacteriosis. Finding the most appropriate time series model for New Zealand data has additional practical considerations given a possible structural change, that is, a specific and sudden change in response to the implemented interventions. A univariate methodological approach was used to predict monthly disease cases using New Zealand surveillance data of campylobacteriosis incidence from 1998 to 2009. The data from the years 1998 to 2008 were used to model the time series, with the year 2009 held out of the data set for model validation. The best two models were then fitted to the full 1998-2009 data and used to predict for each month of 2010. The Holt-Winters (multiplicative) and ARIMA (additive) intervention models were considered the best models for predicting campylobacteriosis in New Zealand. The prediction by an additive ARIMA with intervention was slightly better than the prediction by the Holt-Winters multiplicative method for the annual total in year 2010, the former predicting only 23 cases fewer than the actual reported cases. It is confirmed that classical time series techniques such as ARIMA with intervention and Holt-Winters can provide a good prediction performance for campylobacteriosis risk in New Zealand. The results reported by this study are useful to the New Zealand Health and Safety Authority's efforts in addressing the problem of the campylobacteriosis epidemic. © 2013 Blackwell Verlag GmbH.
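
    The hold-out scheme described in this abstract (fit on the earlier years, validate on the final year) can be sketched as follows. A seasonal-naive forecaster is used here purely as a stand-in for the Holt-Winters and ARIMA intervention models, which are not reproduced; the function names and the monthly counts are hypothetical.

```python
def seasonal_naive_forecast(history, horizon, period=12):
    """Forecast each future month with the value observed one season earlier."""
    last_season = history[-period:]
    return [last_season[h % period] for h in range(horizon)]


def holdout_evaluate(series, holdout=12, period=12):
    """Hold out the last `holdout` points, forecast them, and score the forecast.

    Returns (annual_total_error, mean_absolute_error), where the first value is
    the forecast annual total minus the actual annual total.
    """
    train, test = series[:-holdout], series[-holdout:]
    forecast = seasonal_naive_forecast(train, holdout, period)
    annual_total_error = sum(forecast) - sum(test)
    mae = sum(abs(f - t) for f, t in zip(forecast, test)) / holdout
    return annual_total_error, mae


if __name__ == "__main__":
    one_year = [90, 80, 70, 60, 50, 45, 40, 50, 60, 75, 95, 110]  # made-up counts
    monthly_cases = one_year * 3 + [c + 5 for c in one_year]      # last year held out
    print(holdout_evaluate(monthly_cases))
```

    Scoring the annual total alongside the monthly error mirrors the comparison in the abstract, where the models are judged partly on how far the predicted yearly total falls from the reported one.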

  18. A predictive pilot model for STOL aircraft landing

    NASA Technical Reports Server (NTRS)

    Kleinman, D. L.; Killingsworth, W. R.

    1974-01-01

    An optimal control approach has been used to model pilot performance during STOL flare and landing. The model is used to predict pilot landing performance for three STOL configurations, each having a different level of automatic control augmentation. Model predictions are compared with flight simulator data. It is concluded that the model can be an effective design tool for studying analytically the effects of display modifications, different stability augmentation systems, and proposed changes in the landing area geometry.

  19. Neutrinos and flavor symmetries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tanimoto, Morimitsu

    2015-07-15

    We discuss the recent progress of flavor models with the non-Abelian discrete symmetry in the lepton sector, focusing on θ13 and the CP violating phase. In both the direct approach and the indirect approach to the flavor symmetry, the non-vanishing θ13 is predictable. The flavor symmetry with the generalised CP symmetry can also predict the CP violating phase. We show the phenomenological analyses of neutrino mixing for the typical flavor models.

  20. Ecological Forecasting in Chesapeake Bay: Using a Mechanistic-Empirical Modelling Approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, C. W.; Hood, Raleigh R.; Long, Wen

    The Chesapeake Bay Ecological Prediction System (CBEPS) automatically generates daily nowcasts and three-day forecasts of several environmental variables, such as sea-surface temperature and salinity, the concentrations of chlorophyll, nitrate, and dissolved oxygen, and the likelihood of encountering several noxious species, including harmful algal blooms and water-borne pathogens, for the purpose of monitoring the Bay's ecosystem. While the physical and biogeochemical variables are forecast mechanistically using the Regional Ocean Modeling System configured for the Chesapeake Bay, the species predictions are generated using a novel mechanistic-empirical approach, whereby real-time output from the coupled physical-biogeochemical model drives multivariate empirical habitat models of the target species. The predictions, in the form of digital images, are available via the World Wide Web to interested groups to guide recreational, management, and research activities. Though full validation of the integrated forecasts for all species is still a work in progress, we argue that the mechanistic-empirical approach can be used to generate a wide variety of short-term ecological forecasts, and that it can be applied in any marine system where sufficient data exist to develop empirical habitat models. This paper provides an overview of this system, its predictions, and the approach taken.

  1. Predicting the geographical distribution of two invasive termite species from occurrence data.

    PubMed

    Tonini, Francesco; Divino, Fabio; Lasinio, Giovanna Jona; Hochmair, Hartwig H; Scheffrahn, Rudolf H

    2014-10-01

    Predicting the potential habitat of species under both current and future climate change scenarios is crucial for monitoring invasive species and understanding a species' response to different environmental conditions. Frequently, the only data available on a species is the location of its occurrence (presence-only data). Using occurrence records only, two models were used to predict the geographical distribution of two destructive invasive termite species, Coptotermes gestroi (Wasmann) and Coptotermes formosanus Shiraki. The first model uses a Bayesian linear logistic regression approach adjusted for presence-only data while the second one is the widely used maximum entropy approach (Maxent). Results show that the predicted distributions of both C. gestroi and C. formosanus are strongly linked to urban development. The impact of future scenarios such as climate warming and population growth on the biotic distribution of both termite species was also assessed. Future climate warming seems to affect their projected probability of presence to a lesser extent than population growth. The Bayesian logistic approach outperformed Maxent consistently in all models according to evaluation criteria such as model sensitivity and ecological realism. The importance of further studies for an explicit treatment of residual spatial autocorrelation and a more comprehensive comparison between both statistical approaches is suggested.

  2. Moving beyond regression techniques in cardiovascular risk prediction: applying machine learning to address analytic challenges.

    PubMed

    Goldstein, Benjamin A; Navar, Ann Marie; Carter, Rickey E

    2017-06-14

    Risk prediction plays an important role in clinical cardiology research. Traditionally, most risk models have been based on regression models. While useful and robust, these statistical methods are limited to using a small number of predictors which operate in the same way on everyone, and uniformly throughout their range. The purpose of this review is to illustrate the use of machine-learning methods for development of risk prediction models. Typically presented as black box approaches, most machine-learning methods are aimed at solving particular challenges that arise in data analysis that are not well addressed by typical regression approaches. To illustrate these challenges, as well as how different methods can address them, we consider predicting mortality after diagnosis of acute myocardial infarction. We use data derived from our institution's electronic health record and abstract data on 13 regularly measured laboratory markers. We walk through different challenges that arise in modelling these data and then introduce different machine-learning approaches. Finally, we discuss general issues in the application of machine-learning methods including tuning parameters, loss functions, variable importance, and missing data. Overall, this review serves as an introduction for those working on risk modelling to approach the diffuse field of machine learning. © The Author 2016. Published by Oxford University Press on behalf of the European Society of Cardiology.

  3. Biomechanical model for computing deformations for whole-body image registration: A meshless approach.

    PubMed

    Li, Mao; Miller, Karol; Joldes, Grand Roman; Kikinis, Ron; Wittek, Adam

    2016-12-01

    Patient-specific biomechanical models have been advocated as a tool for predicting deformations of soft body organs/tissue for medical image registration (aligning two sets of images) when differences between the images are large. However, complex and irregular geometry of the body organs makes generation of patient-specific biomechanical models very time-consuming. Meshless discretisation has been proposed to solve this challenge. However, applications so far have been limited to 2D models and computing single organ deformations. In this study, 3D comprehensive patient-specific nonlinear biomechanical models implemented using meshless Total Lagrangian explicit dynamics algorithms are applied to predict a 3D deformation field for whole-body image registration. Unlike a conventional approach that requires dividing (segmenting) the image into non-overlapping constituents representing different organs/tissues, the mechanical properties are assigned using the fuzzy c-means algorithm without the image segmentation. Verification indicates that the deformations predicted using the proposed meshless approach are for practical purposes the same as those obtained using the previously validated finite element models. To quantitatively evaluate the accuracy of the predicted deformations, we determined the spatial misalignment between the registered (i.e. source images warped using the predicted deformations) and target images by computing the edge-based Hausdorff distance. The Hausdorff distance-based evaluation determines that our meshless models led to successful registration of the vast majority of the image features. Copyright © 2016 John Wiley & Sons, Ltd.
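
    The Hausdorff distance used above as a misalignment measure can be sketched for two small point sets as follows. This is the plain symmetric point-set definition computed by brute force; the edge detection that would produce the point sets from registered and target images is assumed to have happened elsewhere, and the function names and coordinates are hypothetical.

```python
import math


def directed_hausdorff(points_a, points_b):
    """Largest distance from any point in A to its nearest neighbour in B."""
    return max(min(math.dist(p, q) for q in points_b) for p in points_a)


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(points_a, points_b),
               directed_hausdorff(points_b, points_a))


if __name__ == "__main__":
    registered_edges = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    target_edges = [(0.0, 0.5, 0.0), (1.0, 0.5, 0.0), (2.0, 0.5, 0.0)]
    print(hausdorff(registered_edges, target_edges))
```

    Because the measure reports the worst-case mismatch, a single badly aligned feature dominates the result; this brute-force version is quadratic in the number of points and is meant only for small illustrative inputs.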

  4. Evaluating Predictive Uncertainty of Hyporheic Exchange Modelling

    NASA Astrophysics Data System (ADS)

    Chow, R.; Bennett, J.; Dugge, J.; Wöhling, T.; Nowak, W.

    2017-12-01

    Hyporheic exchange is the interaction of water between rivers and groundwater, and is difficult to predict. One of the largest contributions to predictive uncertainty for hyporheic fluxes has been attributed to the representation of heterogeneous subsurface properties. This research aims to evaluate which aspect of the subsurface representation - the spatial distribution of hydrofacies or the model for local-scale (within-facies) heterogeneity - most influences the predictive uncertainty. We also seek to identify the data types that best help reduce this uncertainty. For this investigation, we conduct a modelling study of the Steinlach River meander, in Southwest Germany. The Steinlach River meander is an experimental site established in 2010 to monitor hyporheic exchange at the meander scale. We use HydroGeoSphere, a fully integrated surface water-groundwater model, to model hyporheic exchange and to assess the predictive uncertainty of hyporheic exchange transit times (HETT). A highly parameterized complex model is built and treated as 'virtual reality', which is in turn modelled with simpler subsurface parameterization schemes (Figure). Then, we conduct Monte-Carlo simulations with these models to estimate the predictive uncertainty. Results indicate that: Uncertainty in HETT is relatively small for early times and increases with transit times. Uncertainty from local-scale heterogeneity is negligible compared to uncertainty in the hydrofacies distribution. Introducing more data to a poor model structure may reduce predictive variance, but does not reduce predictive bias. Hydraulic head observations alone cannot constrain the uncertainty of HETT; however, an estimate of hyporheic exchange flux proves to be more effective at reducing this uncertainty. Figure: Approach for evaluating predictive model uncertainty. A conceptual model is first developed from the field investigations. A complex model ('virtual reality') is then developed based on that conceptual model. This complex model then serves as the basis to compare simpler model structures. Through this approach, predictive uncertainty can be quantified relative to a known reference solution.

  5. Modeling Heavy/Medium-Duty Fuel Consumption Based on Drive Cycle Properties

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Lijuan; Duran, Adam; Gonder, Jeffrey

    This paper presents multiple methods for predicting heavy/medium-duty vehicle fuel consumption based on driving cycle information. A polynomial model, a black box artificial neural net model, a polynomial neural network model, and a multivariate adaptive regression splines (MARS) model were developed and verified using data collected from chassis testing performed on a parcel delivery diesel truck operating over the Heavy Heavy-Duty Diesel Truck (HHDDT), City Suburban Heavy Vehicle Cycle (CSHVC), New York Composite Cycle (NYCC), and hydraulic hybrid vehicle (HHV) drive cycles. Each model was trained using one of four drive cycles as a training cycle and the other three as testing cycles. By comparing the training and testing results, a representative training cycle was chosen and used to further tune each method. HHDDT as the training cycle gave the best predictive results, because HHDDT contains a variety of drive characteristics, such as high speed, acceleration, idling, and deceleration. Among the four model approaches, MARS gave the best predictive performance, with an average absolute percent error of -1.84% over the four chassis dynamometer drive cycles. To further evaluate the accuracy of the predictive models, the approaches were first applied to real-world data. MARS outperformed the other three approaches, providing an average absolute percent error of -2.2% of four real-world road segments. The MARS model performance was then compared to HHDDT, CSHVC, NYCC, and HHV drive cycles with the performance from Future Automotive System Technology Simulator (FASTSim). The results indicated that the MARS method achieved a comparative predictive performance with FASTSim.

  6. Measuring pedestrian volumes and conflicts. Volume 2, Accident prediction model

    DOT National Transportation Integrated Search

    1987-12-01

    This final report presents the findings, conclusions, and recommendations of the study conducted to model pedestrian/vehicle accidents. A group-type analysis approach for the prediction of pedestrian/vehicle accidents using pedestrian/vehicle conflic...

  7. Patient Similarity in Prediction Models Based on Health Data: A Scoping Review

    PubMed Central

    Sharafoddini, Anis; Dubin, Joel A

    2017-01-01

    Background: Physicians and health policy makers are required to make predictions during their decision making in various medical problems. Many advances have been made in predictive modeling toward outcome prediction, but these innovations target an average patient and are insufficiently adjustable for individual patients. One developing idea in this field is individualized predictive analytics based on patient similarity. The goal of this approach is to identify patients who are similar to an index patient and derive insights from the records of similar patients to provide personalized predictions. Objective: The aim is to summarize and review published studies describing computer-based approaches for predicting patients’ future health status based on health data and patient similarity, identify gaps, and provide a starting point for related future research. Methods: The method involved (1) conducting the review by performing automated searches in Scopus, PubMed, and ISI Web of Science, selecting relevant studies by first screening titles and abstracts then analyzing full-texts, and (2) documenting by extracting publication details and information on context, predictors, missing data, modeling algorithm, outcome, and evaluation methods into a matrix table, synthesizing data, and reporting results. Results: After duplicate removal, 1339 articles were screened in abstracts and titles and 67 were selected for full-text review. In total, 22 articles met the inclusion criteria. Within included articles, hospitals were the main source of data (n=10). Cardiovascular disease (n=7) and diabetes (n=4) were the dominant patient diseases. Most studies (n=18) used neighborhood-based approaches in devising prediction models. Two studies showed that patient similarity-based modeling outperformed population-based predictive methods. Conclusions: Interest in patient similarity-based predictive modeling for diagnosis and prognosis has been growing. In addition to raw/coded health data, wavelet transform and term frequency-inverse document frequency methods were employed to extract predictors. Selecting predictors with potential to highlight special cases and defining new patient similarity metrics were among the gaps identified in the existing literature that provide starting points for future work. Patient status prediction models based on patient similarity and health data offer exciting potential for personalizing and ultimately improving health care, leading to better patient outcomes. PMID:28258046
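
    The neighborhood-based idea summarized in this abstract (find the patients most similar to an index patient and derive a prediction from their records) can be sketched as a k-nearest-neighbour average. Euclidean distance, the value of k, the feature layout, and the function name are all assumptions made for illustration; the reviewed studies use a variety of similarity metrics.

```python
import math


def predict_from_similar(index_patient, cohort, k=3):
    """Average the outcomes of the k cohort patients closest to the index patient.

    cohort is a list of (feature_vector, outcome) pairs; similarity is taken to
    be plain Euclidean distance on the feature vectors (an assumption).
    """
    ranked = sorted(cohort, key=lambda record: math.dist(index_patient, record[0]))
    nearest = ranked[:k]
    return sum(outcome for _, outcome in nearest) / len(nearest)


if __name__ == "__main__":
    # Made-up records: (age in decades, scaled lab value) -> outcome flag.
    cohort = [
        ((6.1, 0.8), 1.0),
        ((5.9, 0.7), 1.0),
        ((3.0, 0.2), 0.0),
        ((6.3, 0.9), 0.0),
        ((2.5, 0.1), 0.0),
    ]
    print(predict_from_similar((6.0, 0.8), cohort, k=3))
```

    Features would normally be rescaled before distances are computed, since a variable with a large numeric range otherwise dominates the similarity.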

  8. A study of modelling simplifications in ground vibration predictions for railway traffic at grade

    NASA Astrophysics Data System (ADS)

    Germonpré, M.; Degrande, G.; Lombaert, G.

    2017-10-01

    Accurate computational models are required to predict ground-borne vibration due to railway traffic. Such models generally require a substantial computational effort. Therefore, much research has focused on developing computationally efficient methods, by either exploiting the regularity of the problem geometry in the direction along the track or assuming a simplified track structure. This paper investigates the modelling errors caused by commonly made simplifications of the track geometry. A case study is presented investigating a ballasted track in an excavation. The soil underneath the ballast is stiffened by a lime treatment. First, periodic track models with different cross sections are analyzed, revealing that a prediction of the rail receptance only requires an accurate representation of the soil layering directly underneath the ballast. A much more detailed representation of the cross sectional geometry is required, however, to calculate vibration transfer from track to free field. Second, simplifications in the longitudinal track direction are investigated by comparing 2.5D and periodic track models. This comparison shows that the 2.5D model slightly overestimates the track stiffness, while the transfer functions between track and free field are well predicted. Using a 2.5D model to predict the response during a train passage leads to an overestimation of both train-track interaction forces and free field vibrations. A combined periodic/2.5D approach is therefore proposed in this paper. First, the dynamic axle loads are computed by solving the train-track interaction problem with a periodic model. Next, the vibration transfer to the free field is computed with a 2.5D model. This combined periodic/2.5D approach only introduces small modelling errors compared to an approach in which a periodic model is used in both steps, while significantly reducing the computational cost.

  9. Data-driven Analysis and Prediction of Arctic Sea Ice

    NASA Astrophysics Data System (ADS)

    Kondrashov, D. A.; Chekroun, M.; Ghil, M.; Yuan, X.; Ting, M.

    2015-12-01

    We present results of data-driven predictive analyses of sea ice over the main Arctic regions. Our approach relies on the Multilayer Stochastic Modeling (MSM) framework of Kondrashov, Chekroun and Ghil [Physica D, 2015] and it leads to prognostic models of sea ice concentration (SIC) anomalies on seasonal time scales. This approach is applied to monthly time series of leading principal components from the multivariate Empirical Orthogonal Function decomposition of SIC and selected climate variables over the Arctic. We evaluate the predictive skill of MSM models by performing retrospective forecasts with "no-look ahead" for up to 6 months ahead. It will be shown in particular that the memory effects included in our non-Markovian linear MSM models improve predictions of large-amplitude SIC anomalies in certain Arctic regions. Further improvements allowed by the MSM framework will adopt a nonlinear formulation, as well as alternative data-adaptive decompositions.

  10. Operational validation of a multi-period and multi-criteria model conditioning approach for the prediction of rainfall-runoff processes in small forest catchments

    NASA Astrophysics Data System (ADS)

    Choi, H.; Kim, S.

    2012-12-01

    Most of hydrologic models have generally been used to describe and represent the spatio-temporal variability of hydrological processes in the watershed scale. Though it is an obvious fact that hydrological responses have the time varying nature, optimal values of model parameters were normally considered as time invariants or constants in most cases. The recent paper of Choi and Beven (2007) presents a multi-period and multi-criteria model conditioning approach. The approach is based on the equifinality thesis within the Generalised Likelihood Uncertainty Estimation (GLUE) framework. In their application, the behavioural TOPMODEL parameter sets are determined by several performance measures for global (annual) and short (30-days) periods, clustered using a Fuzzy C-means algorithm, into 15 types representing different hydrological conditions. Their study shows a good performance on the calibration of a rainfall-runoff model in a forest catchment, and also gives strong indications that it is uncommon to find model realizations that were behavioural over all multi-periods and all performance measures, and multi-period model conditioning approach may become new effective tool for predictions of hydrological processes in ungauged catchments. This study is a follow-up study on the Choi and Beven's (2007) model conditioning approach to test how the approach is effective for the prediction of rainfall-runoff responses in ungauged catchments. To achieve this purpose, 6 small forest catchments are selected among the several hydrological experimental catchments operated by Korea Forest Research Institute. In each catchment, long-term hydrological time series data varying from 10 to 30 years were available. The areas of the selected catchments range from 13.6 to 37.8 ha, and all areas are covered by coniferous or broad-leaves forests. The selected catchments locate in the southern coastal area to the northern part of South Korea. 
The bedrock is granite gneiss, granite, or limestone. The study proceeds as follows. First, the hydrological time series of each catchment are sampled and clustered into multi-periods having distinctly different temporal characteristics; second, behavioural parameter distributions are determined in each multi-period based on the specification of multi-criteria model performance measures. Finally, the behavioural parameter sets of each multi-period of a single catchment are applied to the corresponding period of the other catchments, and cross-validations are conducted in this manner for all catchments. The multi-period model conditioning approach is clearly effective in reducing the width of prediction limits, giving better model performance against the temporal variability of hydrological characteristics, and has enough potential to be an effective prediction tool for ungauged catchments. However, more advanced and continued studies are needed to expand the application of this approach to the prediction of hydrological responses in ungauged catchments.

  11. Mathematical modeling and computational prediction of cancer drug resistance.

    PubMed

    Sun, Xiaoqiang; Hu, Bin

    2017-06-23

    Diverse forms of resistance to anticancer drugs can lead to the failure of chemotherapy. Drug resistance is one of the most intractable issues for successfully treating cancer in current clinical practice. Effective clinical approaches that could counter drug resistance by restoring the sensitivity of tumors to the targeted agents are urgently needed. As numerous experimental results on resistance mechanisms have been obtained and a mass of high-throughput data has been accumulated, mathematical modeling and computational predictions using systematic and quantitative approaches have become increasingly important, as they can potentially provide deeper insights into resistance mechanisms, generate novel hypotheses or suggest promising treatment strategies for future testing. In this review, we first briefly summarize the current progress of experimentally revealed resistance mechanisms of targeted therapy, including genetic mechanisms, epigenetic mechanisms, posttranslational mechanisms, cellular mechanisms, microenvironmental mechanisms and pharmacokinetic mechanisms. Subsequently, we list several currently available databases and Web-based tools related to drug sensitivity and resistance. Then, we focus primarily on introducing some state-of-the-art computational methods used in drug resistance studies, including mechanism-based mathematical modeling approaches (e.g. molecular dynamics simulation, kinetic model of molecular networks, ordinary differential equation model of cellular dynamics, stochastic model, partial differential equation model, agent-based model, pharmacokinetic-pharmacodynamic model, etc.) and data-driven prediction methods (e.g. omics data-based conventional screening approach for node biomarkers, static network approach for edge biomarkers and module biomarkers, dynamic network approach for dynamic network biomarkers and dynamic module network biomarkers, etc.). 
Finally, we discuss several further questions and future directions for the use of computational methods for studying drug resistance, including inferring drug-induced signaling networks, multiscale modeling, drug combinations and precision medicine. © The Author 2017. Published by Oxford University Press.

  12. The energetic cost of walking: a comparison of predictive methods.

    PubMed

    Kramer, Patricia Ann; Sylvester, Adam D

    2011-01-01

    The energy that animals devote to locomotion has been of intense interest to biologists for decades and two basic methodologies have emerged to predict locomotor energy expenditure: those based on metabolic and those based on mechanical energy. Metabolic energy approaches share the perspective that prediction of locomotor energy expenditure should be based on statistically significant proxies of metabolic function, while mechanical energy approaches, which derive from many different perspectives, focus on quantifying the energy of movement. Some controversy exists as to which mechanical perspective is "best", but from first principles all mechanical methods should be equivalent if the inputs to the simulation are of similar quality. Our goals in this paper are 1) to establish the degree to which the various methods of calculating mechanical energy are correlated, and 2) to investigate to what degree the prediction methods explain the variation in energy expenditure. We use modern humans as the model organism in this experiment because their data are readily attainable, but the methodology is appropriate for use in other species. Volumetric oxygen consumption and kinematic and kinetic data were collected on 8 adults while walking at their self-selected slow, normal and fast velocities. Using hierarchical statistical modeling via ordinary least squares and maximum likelihood techniques, the predictive ability of several metabolic and mechanical approaches was assessed. We found that all approaches are correlated and that the mechanical approaches explain similar amounts of the variation in metabolic energy expenditure. Most methods predict the variation within an individual well, but are poor at accounting for variation between individuals. Our results indicate that the choice of predictive method is dependent on the question(s) of interest and the data available for use as inputs. 
Although we used modern humans as our model organism, these results can be extended to other species.

  13. What do we gain with Probabilistic Flood Loss Models?

    NASA Astrophysics Data System (ADS)

    Schroeter, K.; Kreibich, H.; Vogel, K.; Merz, B.; Lüdtke, S.

    2015-12-01

    The reliability of flood loss models is a prerequisite for their practical usefulness. Oftentimes, traditional uni-variate damage models as for instance depth-damage curves fail to reproduce the variability of observed flood damage. Innovative multi-variate probabilistic modelling approaches are promising to capture and quantify the uncertainty involved and thus to improve the basis for decision making. In this study we compare the predictive capability of two probabilistic modelling approaches, namely Bagging Decision Trees and Bayesian Networks and traditional stage damage functions which are cast in a probabilistic framework. For model evaluation we use empirical damage data which are available from computer aided telephone interviews that were respectively compiled after the floods in 2002, 2005, 2006 and 2013 in the Elbe and Danube catchments in Germany. We carry out a split sample test by sub-setting the damage records. One sub-set is used to derive the models and the remaining records are used to evaluate the predictive performance of the model. Further we stratify the sample according to catchments which allows studying model performance in a spatial transfer context. Flood damage estimation is carried out on the scale of the individual buildings in terms of relative damage. The predictive performance of the models is assessed in terms of systematic deviations (mean bias), precision (mean absolute error) as well as in terms of reliability which is represented by the proportion of the number of observations that fall within the 95-quantile and 5-quantile predictive interval. The reliability of the probabilistic predictions within validation runs decreases only slightly and achieves a very good coverage of observations within the predictive interval. Probabilistic models provide quantitative information about prediction uncertainty which is crucial to assess the reliability of model predictions and improves the usefulness of model results.
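
    The three evaluation criteria named in this abstract (mean bias, mean absolute error, and the share of observations falling inside the 5%-95% predictive interval) can be illustrated with a minimal sketch. The function names and the relative-damage values below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from statistics import mean


def mean_bias(predicted, observed):
    """Systematic deviation: average of (prediction - observation)."""
    return mean(p - o for p, o in zip(predicted, observed))


def mean_absolute_error(predicted, observed):
    """Precision: average absolute deviation."""
    return mean(abs(p - o) for p, o in zip(predicted, observed))


def interval_coverage(lower, upper, observed):
    """Reliability: fraction of observations inside [lower, upper]."""
    hits = sum(1 for lo, hi, o in zip(lower, upper, observed) if lo <= o <= hi)
    return hits / len(observed)


# Hypothetical relative damage (0..1) for four buildings in a validation subset.
observed = [0.10, 0.25, 0.05, 0.40]
predicted = [0.12, 0.20, 0.08, 0.35]   # point predictions
q05 = [0.02, 0.10, 0.01, 0.42]         # 5% predictive quantiles
q95 = [0.30, 0.45, 0.20, 0.60]         # 95% predictive quantiles

print("mean bias:", mean_bias(predicted, observed))
print("MAE:", mean_absolute_error(predicted, observed))
print("coverage:", interval_coverage(q05, q95, observed))
```

    For a well-calibrated 5%-95% interval, the coverage would be expected to lie near 0.90; values far below that suggest the predictive intervals are too narrow.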

  14. Predictive Models for Semiconductor Device Design and Processing

    NASA Technical Reports Server (NTRS)

    Meyyappan, Meyya; Arnold, James O. (Technical Monitor)

    1998-01-01

    The device feature size continues to be on a downward trend with a simultaneous upward trend in wafer size to 300 mm. Predictive models are needed more than ever before for this reason. At NASA Ames, a Device and Process Modeling effort has been initiated recently with a view to address these issues. Our activities cover sub-micron device physics, process and equipment modeling, computational chemistry and material science. This talk will outline these efforts and emphasize the interaction among various components. The device physics component is largely based on integrating quantum effects into device simulators. We have two parallel efforts, one based on a quantum mechanics approach and the second, a semiclassical hydrodynamics approach with quantum correction terms. Under the first approach, three different quantum simulators are being developed and compared: a nonequilibrium Green's function (NEGF) approach, a Wigner function approach, and a density matrix approach. In this talk, results using various codes will be presented. Our process modeling work focuses primarily on epitaxy and etching using first-principles models coupling reactor level and wafer level features. For the latter, we are using a novel approach based on Level Set theory. Sample results from this effort will also be presented.

  15. Applying the age-shift approach to model responses to midrotation fertilization

    Treesearch

    Colleen A. Carlson; Thomas R. Fox; H. Lee Allen; Timothy J. Albaugh

    2010-01-01

    Growth and yield models used to evaluate midrotation fertilization economics require adjustments to account for the typically observed responses. This study investigated the use of age-shift models to predict midrotation fertilizer responses. Age-shift prediction models were constructed from a regional study consisting of 43 installations of a nitrogen (N) by...

  16. Ensemble modeling to predict habitat suitability for a large-scale disturbance specialist

    Treesearch

    Quresh S. Latif; Victoria A. Saab; Jonathan G. Dudley; Jeff P. Hollenbeck

    2013-01-01

    To conserve habitat for disturbance specialist species, ecologists must identify where individuals will likely settle in newly disturbed areas. Habitat suitability models can predict which sites at new disturbances will most likely attract specialists. Without validation data from newly disturbed areas, however, the best approach for maximizing predictive accuracy can...

  17. Predicting Alumni/ae Gift Giving Behavior: A Structural Equation Model Approach.

    ERIC Educational Resources Information Center

    Mosser, John Wayne

    This dissertation focuses on predicting alumni gift giving behavior at a large public research university (University of Michigan). A conceptual model was developed for predicting alumni giving behavior in order to advance the theoretical understanding of how capacity to give, motivation to give, and their interaction affect gift giving behavior.…

  18. Using connectome-based predictive modeling to predict individual behavior from brain connectivity

    PubMed Central

    Shen, Xilin; Finn, Emily S.; Scheinost, Dustin; Rosenberg, Monica D.; Chun, Marvin M.; Papademetris, Xenophon; Constable, R Todd

    2017-01-01

    Neuroimaging is a fast developing research area where anatomical and functional images of human brains are collected using techniques such as functional magnetic resonance imaging (fMRI), diffusion tensor imaging (DTI), and electroencephalography (EEG). Technical advances and large-scale datasets have allowed for the development of models capable of predicting individual differences in traits and behavior using brain connectivity measures derived from neuroimaging data. Here, we present connectome-based predictive modeling (CPM), a data-driven protocol for developing predictive models of brain-behavior relationships from connectivity data using cross-validation. This protocol includes the following steps: 1) feature selection, 2) feature summarization, 3) model building, and 4) assessment of prediction significance. We also include suggestions for visualizing the most predictive features (i.e., brain connections). The final result should be a generalizable model that takes brain connectivity data as input and generates predictions of behavioral measures in novel subjects, accounting for a significant amount of the variance in these measures. It has been demonstrated that the CPM protocol performs as well as or better than most of the existing approaches in brain-behavior prediction. However, because CPM focuses on linear modeling and a purely data-driven approach, neuroscientists with limited or no experience in machine learning or optimization would find it easy to implement the protocols. Depending on the volume of data to be processed, the protocol can take 10–100 minutes for model building, 1–48 hours for permutation testing, and 10–20 minutes for visualization of results. PMID:28182017
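
    The four protocol steps listed in this abstract can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the authors' implementation: feature selection here uses a hypothetical correlation cutoff (`r_threshold`) and keeps only positively correlated edges, summarization is a plain sum, the model is a one-variable least-squares line, and all function names and data are invented for the example.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sxx * syy) ** 0.5


def fit_line(x, y):
    """Least-squares slope and intercept (slope 0 if x is constant)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def cpm_loocv(edges, behavior, r_threshold=0.5):
    """Leave-one-out predictions; edges[i][e] is edge e of subject i."""
    n, n_edges = len(behavior), len(edges[0])
    predictions = []
    for test in range(n):
        train = [i for i in range(n) if i != test]
        y_train = [behavior[i] for i in train]
        # 1) Feature selection on training subjects only (assumed cutoff).
        selected = [e for e in range(n_edges)
                    if pearson([edges[i][e] for i in train], y_train) > r_threshold]
        # 2) Feature summarization: one summed strength per subject.
        x_train = [sum(edges[i][e] for e in selected) for i in train]
        x_test = sum(edges[test][e] for e in selected)
        # 3) Model building: linear fit, then predict the held-out subject.
        slope, intercept = fit_line(x_train, y_train)
        predictions.append(slope * x_test + intercept)
    return predictions


def permutation_p_value(edges, behavior, n_perm=200, seed=0):
    """4) Significance: compare observed r with r under shuffled behavior."""
    rng = random.Random(seed)
    observed = pearson(cpm_loocv(edges, behavior), behavior)
    count = 0
    for _ in range(n_perm):
        shuffled = behavior[:]
        rng.shuffle(shuffled)
        if pearson(cpm_loocv(edges, shuffled), shuffled) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


# Hypothetical data: edge 0 tracks behavior, edge 1 is constant,
# edge 2 is anti-correlated (ignored by the positive-only selection).
strengths = [0.1, 0.4, 0.2, 0.9, 0.5, 0.7]
edges = [[s, 0.3, 1.0 - s] for s in strengths]
behavior = [10.0 + 5.0 * s for s in strengths]

predictions = cpm_loocv(edges, behavior)
observed_r, p_value = permutation_p_value(edges, behavior, n_perm=50)
print(predictions, observed_r, p_value)
```

    Selecting features inside each cross-validation fold, as done here, keeps the held-out subject from influencing which edges enter the model.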

  19. A hybrid predictive model for acoustic noise in urban areas based on time series analysis and artificial neural network

    NASA Astrophysics Data System (ADS)

    Guarnaccia, Claudio; Quartieri, Joseph; Tepedino, Carmine

    2017-06-01

    The dangerous effect of noise on human health is well known. Both the auditory and non-auditory effects are largely documented in literature, and represent an important hazard in human activities. Particular care is devoted to road traffic noise, since it is growing according to the growth of residential, industrial and commercial areas. For these reasons, it is important to develop effective models able to predict the noise in a certain area. In this paper, a hybrid predictive model is presented. The model is based on the mixing of two different approaches: the Time Series Analysis (TSA) and the Artificial Neural Network (ANN). The TSA model is based on the evaluation of trend and seasonality in the data, while the ANN model is based on the capacity of the network to "learn" the behavior of the data. The mixed approach will consist in the evaluation of noise levels by means of TSA and, once the differences (residuals) between TSA estimations and observed data have been calculated, in the training of an ANN on the residuals. This hybrid model will exploit interesting features and results, with a significant variation related to the number of steps forward in the prediction. It will be shown that the best results, in terms of prediction, are achieved predicting one step ahead in the future. Anyway, a 7 days prediction can be performed, with a slightly greater error, but offering a larger range of prediction, with respect to the single day ahead predictive model.

  20. Developing an in silico minimum inhibitory concentration panel test for Klebsiella pneumoniae

    DOE PAGES

    Nguyen, Marcus; Brettin, Thomas; Long, S. Wesley; ...

    2018-01-11

    Here, antimicrobial resistant infections are a serious public health threat worldwide. Whole genome sequencing approaches to rapidly identify pathogens and predict antibiotic resistance phenotypes are becoming more feasible and may offer a way to reduce clinical test turnaround times compared to conventional culture-based methods, and in turn, improve patient outcomes. In this study, we use whole genome sequence data from 1668 clinical isolates of Klebsiella pneumoniae to develop an XGBoost-based machine learning model that accurately predicts minimum inhibitory concentrations (MICs) for 20 antibiotics. The overall accuracy of the model, within ± 1 two-fold dilution factor, is 92%. Individual accuracies are >= 90% for 15/20 antibiotics. We show that the MICs predicted by the model correlate with known antimicrobial resistance genes. Importantly, the genome-wide approach described in this study offers a way to predict MICs for isolates without knowledge of the underlying gene content. This study shows that machine learning can be used to build a complete in silico MIC prediction panel for K. pneumoniae and provides a framework for building MIC prediction models for other pathogenic bacteria.
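
    The accuracy criterion quoted in this abstract ("within ± 1 two-fold dilution factor") can be illustrated with a short sketch. It assumes the usual reading of that phrase, namely that a prediction counts as correct when it differs from the laboratory MIC by at most one step on a log2 scale; the function names and MIC values are hypothetical and are not data from the study.

```python
import math


def within_one_dilution(predicted_mic, true_mic):
    """True if the two MICs differ by at most one two-fold dilution step."""
    # Small tolerance guards against floating-point rounding in log2.
    return abs(math.log2(predicted_mic) - math.log2(true_mic)) <= 1.0 + 1e-9


def dilution_accuracy(predicted, true):
    """Fraction of predictions within +/- 1 two-fold dilution of the truth."""
    hits = sum(within_one_dilution(p, t) for p, t in zip(predicted, true))
    return hits / len(true)


# Hypothetical MICs in ug/mL for six isolates and one antibiotic.
true_mics = [0.5, 1, 2, 4, 8, 16]
pred_mics = [1, 1, 8, 2, 8, 4]

accuracy = dilution_accuracy(pred_mics, true_mics)
print(f"accuracy within +/-1 dilution: {accuracy:.2f}")
```

    Working on the log2 scale treats an error from 2 to 4 the same as an error from 8 to 16, which matches how MIC panels are laid out in doubling steps.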

  1. Developing an in silico minimum inhibitory concentration panel test for Klebsiella pneumoniae

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nguyen, Marcus; Brettin, Thomas; Long, S. Wesley

    Here, antimicrobial resistant infections are a serious public health threat worldwide. Whole genome sequencing approaches to rapidly identify pathogens and predict antibiotic resistance phenotypes are becoming more feasible and may offer a way to reduce clinical test turnaround times compared to conventional culture-based methods, and in turn, improve patient outcomes. In this study, we use whole genome sequence data from 1668 clinical isolates of Klebsiella pneumoniae to develop an XGBoost-based machine learning model that accurately predicts minimum inhibitory concentrations (MICs) for 20 antibiotics. The overall accuracy of the model, within ± 1 two-fold dilution factor, is 92%. Individual accuracies are >= 90% for 15/20 antibiotics. We show that the MICs predicted by the model correlate with known antimicrobial resistance genes. Importantly, the genome-wide approach described in this study offers a way to predict MICs for isolates without knowledge of the underlying gene content. This study shows that machine learning can be used to build a complete in silico MIC prediction panel for K. pneumoniae and provides a framework for building MIC prediction models for other pathogenic bacteria.

  2. Distributed Damage Estimation for Prognostics based on Structural Model Decomposition

    NASA Technical Reports Server (NTRS)

    Daigle, Matthew; Bregon, Anibal; Roychoudhury, Indranil

    2011-01-01

    Model-based prognostics approaches capture system knowledge in the form of physics-based models of components, and how they fail. These methods consist of a damage estimation phase, in which the health state of a component is estimated, and a prediction phase, in which the health state is projected forward in time to determine end of life. However, the damage estimation problem is often multi-dimensional and computationally intensive. We propose a model decomposition approach adapted from the diagnosis community, called possible conflicts, in order to both improve the computational efficiency of damage estimation, and formulate a damage estimation approach that is inherently distributed. Local state estimates are combined into a global state estimate from which prediction is performed. Using a centrifugal pump as a case study, we perform a number of simulation-based experiments to demonstrate the approach.

  3. Qualitative Event-Based Diagnosis: Case Study on the Second International Diagnostic Competition

    NASA Technical Reports Server (NTRS)

    Daigle, Matthew; Roychoudhury, Indranil

    2010-01-01

    We describe a diagnosis algorithm entered into the Second International Diagnostic Competition. We focus on the first diagnostic problem of the industrial track of the competition in which a diagnosis algorithm must detect, isolate, and identify faults in an electrical power distribution testbed and provide corresponding recovery recommendations. The diagnosis algorithm embodies a model-based approach, centered around qualitative event-based fault isolation. Faults produce deviations in measured values from model-predicted values. The sequence of these deviations is matched to those predicted by the model in order to isolate faults. We augment this approach with model-based fault identification, which determines fault parameters and helps to further isolate faults. We describe the diagnosis approach, provide diagnosis results from running the algorithm on provided example scenarios, and discuss the issues faced, and lessons learned, from implementing the approach.

  4. Transition Heat Transfer Modeling Based on the Characteristics of Turbulent Spots

    NASA Technical Reports Server (NTRS)

    Simon, Fred; Boyle, Robert

    1998-01-01

    While turbulence models are being developed which show promise for simulating the transition region on a turbine blade or vane, it is believed that the best approach with the greatest potential for practical use is the use of models which incorporate the physics of turbulent spots present in the transition region. This type of modeling results in the prediction of transition region intermittency which when incorporated in turbulence models give a good to excellent prediction of the transition region heat transfer. Some models are presented which show how turbulent spot characteristics and behavior can be employed to predict the effect of pressure gradient and Mach number on the transition region. The models predict the spot formation rate which is needed, in addition to the transition onset location, in the Narasimha concentrated breakdown intermittency equation. A simplified approach is taken for modeling turbulent spot growth and interaction in the transition region which utilizes the turbulent spot variables governing transition length and spot generation rate. The models are expressed in terms of spot spreading angle, dimensionless spot velocity, dimensionless spot area, disturbance frequency and Mach number. The models are used in conjunction with a computer code to predict the effects of pressure gradient and Mach number on the transition region and compared with VKI experimental turbine data.

  5. Modelling road accidents: An approach using structural time series

    NASA Astrophysics Data System (ADS)

    Junus, Noor Wahida Md; Ismail, Mohd Tahir

    2014-09-01

    In this paper, the trend of road accidents in Malaysia for the years 2001 until 2012 was modelled using a structural time series approach. The structural time series model was identified using a stepwise method, and the residuals for each model were tested. The best-fitted model was chosen based on the smallest Akaike Information Criterion (AIC) and prediction error variance. In order to check the quality of the model, a data validation procedure was performed by predicting the monthly number of road accidents for the year 2012. Results indicate that the best specification of the structural time series model to represent road accidents is the local level with a seasonal model.

  6. Predicting future protection of respirator users: Statistical approaches and practical implications.

    PubMed

    Hu, Chengcheng; Harber, Philip; Su, Jing

    2016-01-01

    The purpose of this article is to describe a statistical approach for predicting a respirator user's fit factor in the future based upon results from initial tests. A statistical prediction model was developed based upon joint distribution of multiple fit factor measurements over time obtained from linear mixed effect models. The model accounts for within-subject correlation as well as short-term (within one day) and longer-term variability. As an example of applying this approach, model parameters were estimated from a research study in which volunteers were trained by three different modalities to use one of two types of respirators. They underwent two quantitative fit tests at the initial session and two on the same day approximately six months later. The fitted models demonstrated correlation and gave the estimated distribution of future fit test results conditional on past results for an individual worker. This approach can be applied to establishing a criterion value for passing an initial fit test to provide reasonable likelihood that a worker will be adequately protected in the future; and to optimizing the repeat fit factor test intervals individually for each user for cost-effective testing.

  7. Modelling the multidimensional niche by linking functional traits to competitive performance

    PubMed Central

    Maynard, Daniel S.; Leonard, Kenneth E.; Drake, John M.; Hall, David W.; Crowther, Thomas W.; Bradford, Mark A.

    2015-01-01

    Linking competitive outcomes to environmental conditions is necessary for understanding species' distributions and responses to environmental change. Despite this importance, generalizable approaches for predicting competitive outcomes across abiotic gradients are lacking, driven largely by the highly complex and context-dependent nature of biotic interactions. Here, we present and empirically test a novel niche model that uses functional traits to model the niche space of organisms and predict competitive outcomes of co-occurring populations across multiple resource gradients. The model makes no assumptions about the underlying mode of competition and instead applies to those settings where relative competitive ability across environments correlates with a quantifiable performance metric. To test the model, a series of controlled microcosm experiments were conducted using genetically related strains of a widespread microbe. The model identified trait microevolution and performance differences among strains, with the predicted competitive ability of each organism mapped across a two-dimensional carbon and nitrogen resource space. Areas of coexistence and competitive dominance between strains were identified, and the predicted competitive outcomes were validated in approximately 95% of the pairings. By linking trait variation to competitive ability, our work demonstrates a generalizable approach for predicting and modelling competitive outcomes across changing environmental contexts. PMID:26136444

  8. Population response to climate change: linear vs. non-linear modeling approaches.

    PubMed

    Ellis, Alicia M; Post, Eric

    2004-03-31

    Research on the ecological consequences of global climate change has elicited a growing interest in the use of time series analysis to investigate population dynamics in a changing climate. Here, we compare linear and non-linear models describing the contribution of climate to the density fluctuations of the population of wolves on Isle Royale, Michigan from 1959 to 1999. The non-linear self excitatory threshold autoregressive (SETAR) model revealed that, due to differences in the strength and nature of density dependence, relatively small and large populations may be differentially affected by future changes in climate. Both linear and non-linear models predict a decrease in the population of wolves with predicted changes in climate. Because specific predictions differed between linear and non-linear models, our study highlights the importance of using non-linear methods that allow the detection of non-linearity in the strength and nature of density dependence. Failure to adopt a non-linear approach to modelling population response to climate change, either exclusively or in addition to linear approaches, may compromise efforts to quantify ecological consequences of future warming.

  9. An efficient approach for site-specific scenery prediction in surveillance imaging near Earth's surface

    NASA Astrophysics Data System (ADS)

    Jylhä, Juha; Marjanen, Kalle; Rantala, Mikko; Metsäpuro, Petri; Visa, Ari

    2006-09-01

    Surveillance camera automation and camera network development are growing areas of interest. This paper proposes a competent approach to enhance the camera surveillance with Geographic Information Systems (GIS) when the camera is located at the height of 10-1000 m. A digital elevation model (DEM), a terrain class model, and a flight obstacle register comprise exploited auxiliary information. The approach takes into account spherical shape of the Earth and realistic terrain slopes. Accordingly, considering also forests, it determines visible and shadow regions. The efficiency arises out of reduced dimensionality in the visibility computation. Image processing is aided by predicting certain advance features of visible terrain. The features include distance from the camera and the terrain or object class such as coniferous forest, field, urban site, lake, or mast. The performance of the approach is studied by comparing a photograph of Finnish forested landscape with the prediction. The predicted background is well-fitting, and potential knowledge-aid for various purposes becomes apparent.

  10. Comparison of two stochastic techniques for reliable urban runoff prediction by modeling systematic errors

    NASA Astrophysics Data System (ADS)

    Del Giudice, Dario; Löwe, Roland; Madsen, Henrik; Mikkelsen, Peter Steen; Rieckermann, Jörg

    2015-07-01

    In urban rainfall-runoff, commonly applied statistical techniques for uncertainty quantification mostly ignore systematic output errors originating from simplified models and erroneous inputs. Consequently, the resulting predictive uncertainty is often unreliable. Our objective is to present two approaches which use stochastic processes to describe systematic deviations and to discuss their advantages and drawbacks for urban drainage modeling. The two methodologies are an external bias description (EBD) and an internal noise description (IND, also known as stochastic gray-box modeling). They emerge from different fields and have not yet been compared in environmental modeling. To compare the two approaches, we develop a unifying terminology, evaluate them theoretically, and apply them to conceptual rainfall-runoff modeling in the same drainage system. Our results show that both approaches can provide probabilistic predictions of wastewater discharge in a similarly reliable way, both for periods ranging from a few hours up to more than 1 week ahead of time. The EBD produces more accurate predictions on long horizons but relies on computationally heavy MCMC routines for parameter inferences. These properties make it more suitable for off-line applications. The IND can help in diagnosing the causes of output errors and is computationally inexpensive. It produces best results on short forecast horizons that are typical for online applications.

  11. Computational predictive models for P-glycoprotein inhibition of in-house chalcone derivatives and drug-bank compounds.

    PubMed

    Ngo, Trieu-Du; Tran, Thanh-Dao; Le, Minh-Tri; Thai, Khac-Minh

    2016-11-01

    The human P-glycoprotein (P-gp) efflux pump is of great interest for medicinal chemists because of its important role in multidrug resistance (MDR). Because of the high polyspecificity as well as the unavailability of high-resolution X-ray crystal structures of this transmembrane protein, ligand-based, and structure-based approaches which were machine learning, homology modeling, and molecular docking were combined for this study. In ligand-based approach, individual two-dimensional quantitative structure-activity relationship models were developed using different machine learning algorithms and subsequently combined into the Ensemble model which showed good performance on both the diverse training set and the validation sets. The applicability domain and the prediction quality of the developed models were also judged using the state-of-the-art methods and tools. In our structure-based approach, the P-gp structure and its binding region were predicted for a docking study to determine possible interactions between the ligands and the receptor. Based on these in silico tools, hit compounds for reversing MDR were discovered from the in-house and DrugBank databases through virtual screening using prediction models and molecular docking in an attempt to restore cancer cell sensitivity to cytotoxic drugs.

  12. A Regularized Deep Learning Approach for Clinical Risk Prediction of Acute Coronary Syndrome Using Electronic Health Records.

    PubMed

    Huang, Zhengxing; Dong, Wei; Duan, Huilong; Liu, Jiquan

    2018-05-01

    Acute coronary syndrome (ACS), as a common and severe cardiovascular disease, is a leading cause of death and the principal cause of serious long-term disability globally. Clinical risk prediction of ACS is important for early intervention and treatment. Existing ACS risk scoring models are based mainly on a small set of hand-picked risk factors and often dichotomize predictive variables to simplify the score calculation. This study develops a regularized stacked denoising autoencoder (SDAE) model to stratify clinical risks of ACS patients from a large volume of electronic health records (EHR). To capture characteristics of patients at similar risk levels, and preserve the discriminating information across different risk levels, two constraints are added on SDAE to make the reconstructed feature representations contain more risk information of patients, which contribute to a better clinical risk prediction result. We validate our approach on a real clinical dataset consisting of 3464 ACS patient samples. The performance of our approach for predicting ACS risk remains robust and reaches 0.868 and 0.73 in terms of AUC and accuracy, respectively. The obtained results show that the proposed approach achieves a competitive performance compared to state-of-the-art models in dealing with the clinical risk prediction problem. In addition, our approach can extract informative risk factors of ACS via a reconstructive learning strategy. Some of these extracted risk factors are not only consistent with existing medical domain knowledge, but also contain suggestive hypotheses that could be validated by further investigations in the medical domain.

  13. A model of strength

    USGS Publications Warehouse

    Johnson, Douglas H.; Cook, R.D.

    2013-01-01

    In her AAAS News & Notes piece "Can the Southwest manage its thirst?" (26 July, p. 362), K. Wren quotes Ajay Kalra, who advocates a particular method for predicting Colorado River streamflow "because it eschews complex physical climate models for a statistical data-driven modeling approach." A preference for data-driven models may be appropriate in this individual situation, but it is not so generally. Data-driven models often come with a warning against extrapolating beyond the range of the data used to develop the models. When the future is like the past, data-driven models can work well for prediction, but it is easy to over-model local or transient phenomena, often leading to predictive inaccuracy (1). Mechanistic models are built on established knowledge of the process that connects the response variables with the predictors, using information obtained outside of an extant data set. One may shy away from a mechanistic approach when the underlying process is judged to be too complicated, but good predictive models can be constructed with statistical components that account for ingredients missing in the mechanistic analysis. Models with sound mechanistic components are more generally applicable and robust than data-driven models.

  14. External intermittency prediction using AMR solutions of RANS turbulence and transported PDF models

    NASA Astrophysics Data System (ADS)

    Olivieri, D. A.; Fairweather, M.; Falle, S. A. E. G.

    2011-12-01

    External intermittency in turbulent round jets is predicted using a Reynolds-averaged Navier-Stokes modelling approach coupled to solutions of the transported probability density function (pdf) equation for scalar variables. Solutions to the descriptive equations are obtained using a finite-volume method, combined with an adaptive mesh refinement algorithm, applied in both physical and compositional space. This method contrasts with conventional approaches to solving the transported pdf equation which generally employ Monte Carlo techniques. Intermittency-modified eddy viscosity and second-moment turbulence closures are used to accommodate the effects of intermittency on the flow field, with the influence of intermittency also included, through modifications to the mixing model, in the transported pdf equation. Predictions of the overall model are compared with experimental data on the velocity and scalar fields in a round jet, as well as against measurements of intermittency profiles and scalar pdfs in a number of flows, with good agreement obtained. For the cases considered, predictions based on the second-moment turbulence closure are clearly superior, although both turbulence models give realistic predictions of the bimodal scalar pdfs observed experimentally.

  15. Overview of Heat Addition and Efficiency Predictions for an Advanced Stirling Convertor

    NASA Technical Reports Server (NTRS)

    Wilson, Scott D.; Reid, Terry; Schifer, Nicholas; Briggs, Maxwell

    2011-01-01

    Past methods of predicting net heat input needed to be validated. The validation effort pursued several paths, including improving model inputs, using test hardware to provide validation data, and validating high-fidelity models. Validation test hardware provided direct measurement of net heat input for comparison to predicted values. The predicted value of net heat input was 1.7 percent less than the measured value, and initial calculations of measurement uncertainty were 2.1 percent (under review). Lessons learned during the validation effort were incorporated into the convertor modeling approach, which improved predictions of convertor efficiency.

  16. Continuum Damage Mechanics Models for the Analysis of Progressive Failure in Open-Hole Tension Laminates

    NASA Technical Reports Server (NTRS)

    Song, Kyonchan; Li, Yingyong; Rose, Cheryl A.

    2011-01-01

    The performance of a state-of-the-art continuum damage mechanics model for interlaminar damage, coupled with a cohesive zone model for delamination is examined for failure prediction of quasi-isotropic open-hole tension laminates. Limitations of continuum representations of intra-ply damage and the effect of mesh orientation on the analysis predictions are discussed. It is shown that accurate prediction of matrix crack paths and stress redistribution after cracking requires a mesh aligned with the fiber orientation. Based on these results, an aligned mesh is proposed for analysis of the open-hole tension specimens consisting of different meshes within the individual plies, such that the element edges are aligned with the ply fiber direction. The modeling approach is assessed by comparison of analysis predictions to experimental data for specimen configurations in which failure is dominated by complex interactions between matrix cracks and delaminations. It is shown that the different failure mechanisms observed in the tests are well predicted. In addition, the modeling approach is demonstrated to predict proper trends in the effect of scaling on strength and failure mechanisms of quasi-isotropic open-hole tension laminates.

  17. QSAR prediction of additive and non-additive mixture toxicities of antibiotics and pesticide.

    PubMed

    Qin, Li-Tang; Chen, Yu-Han; Zhang, Xin; Mo, Ling-Yun; Zeng, Hong-Hu; Liang, Yan-Peng

    2018-05-01

    Antibiotics and pesticides may exist as a mixture in the real environment. The combined effect of a mixture can be either additive or non-additive (synergism and antagonism). However, no effective approach exists for predicting the synergistic and antagonistic toxicities of mixtures. In this study, we developed a quantitative structure-activity relationship (QSAR) model for the toxicities (half effect concentration, EC50) of 45 binary and multi-component mixtures composed of two antibiotics and four pesticides. The acute toxicities of single compounds and mixtures toward Aliivibrio fischeri were tested. A genetic algorithm was used to obtain the optimized model with three theoretical descriptors. Various internal and external validation techniques indicated that the QSAR model, with a coefficient of determination of 0.9366 and a root mean square error of 0.1345, predicted that the 45 mixture toxicities presented additive, synergistic, and antagonistic effects. Compared with the traditional concentration additive and independent action models, the QSAR model exhibited an advantage in predicting mixture toxicity. Thus, the presented approach may be able to fill the gaps in predicting non-additive toxicities of binary and multi-component mixtures. Copyright © 2018 Elsevier Ltd. All rights reserved.

  18. A water balance model to estimate flow through the Old and Middle River corridor

    USGS Publications Warehouse

    Andrews, Stephen W.; Gross, Edward S.; Hutton, Paul H.

    2016-01-01

    We applied a water balance model to predict tidally averaged (subtidal) flows through the Old River and Middle River corridor in the Sacramento–San Joaquin Delta. We reviewed the dynamics that govern subtidal flows and water levels and adopted a simplified representation. In this water balance approach, we estimated ungaged flows as linear functions of known (or specified) flows. We assumed that subtidal storage within the control volume varies because of fortnightly variation in subtidal water level, Delta inflow, and barometric pressure. The water balance model effectively predicts subtidal flows and approaches the accuracy of a 1–D Delta hydrodynamic model. We explore the potential to improve the approach by representing more complex dynamics and identify possible future improvements.

  19. Creep-fatigue life prediction for engine hot section materials (isotropic)

    NASA Technical Reports Server (NTRS)

    Moreno, V.

    1982-01-01

    The objectives of this program are the investigation of fundamental approaches to high temperature crack initiation life prediction, identification of specific modeling strategies and the development of specific models for component relevant loading conditions. A survey of the hot section material/coating systems used throughout the gas turbine industry is included. Two material/coating systems will be identified for the program. The material/coating system designated as the base system shall be used throughout Tasks 1-12. The alternate material/coating system will be used only in Task 12 for further evaluation of the models developed on the base material. In Task II, candidate life prediction approaches will be screened based on a set of criteria that includes experience of the approaches within the literature, correlation with isothermal data generated on the base material, and judgements relative to the applicability of the approach for the complex cycles to be considered in the option program. The two most promising approaches will be identified. Task 3 further evaluates the best approach using additional base material fatigue testing including verification tests. Task 4 consists of technical, schedular, financial and all other reporting requirements in accordance with the Reports of Work clause.

  20. A hierarchical spatial model for well yield in complex aquifers

    NASA Astrophysics Data System (ADS)

    Montgomery, J.; O'sullivan, F.

    2017-12-01

    Efficiently siting and managing groundwater wells requires reliable estimates of the amount of water that can be produced, or the well yield. This can be challenging to predict in highly complex, heterogeneous fractured aquifers due to the uncertainty around local hydraulic properties. Promising statistical approaches have been advanced in recent years. For instance, kriging and multivariate regression analysis have been applied to well test data with limited but encouraging levels of prediction accuracy. Additionally, some analytical solutions to diffusion in homogeneous porous media have been used to infer "effective" properties consistent with observed flow rates or drawdown. However, this is an under-specified inverse problem with substantial and irreducible uncertainty. We describe a flexible machine learning approach capable of combining diverse datasets with constraining physical and geostatistical models for improved well yield prediction accuracy and uncertainty quantification. Our approach can be implemented within a hierarchical Bayesian framework using Markov Chain Monte Carlo, which allows for additional sources of information to be incorporated in priors to further constrain and improve predictions and reduce the model order. We demonstrate the usefulness of this approach using data from over 7,000 wells in a fractured bedrock aquifer.

  1. Quantifying unpredictability: A multiple-model approach based on satellite imagery data from Mediterranean ponds

    PubMed Central

    García-Roger, Eduardo Moisés; Franch, Belen; Carmona, María José; Serra, Manuel

    2017-01-01

    Fluctuations in environmental parameters are increasingly being recognized as essential features of any habitat. The quantification of whether environmental fluctuations are prevalently predictable or unpredictable is remarkably relevant to understanding the evolutionary responses of organisms. However, when characterizing the relevant features of natural habitats, ecologists typically face two problems: (1) gathering long-term data and (2) handling the hard-won data. This paper takes advantage of the free access to long-term recordings of remote sensing data (27 years, Landsat TM/ETM+) to assess a set of environmental models for estimating environmental predictability. The case study included 20 Mediterranean saline ponds and lakes, and the focal variable was the water-surface area. This study first aimed to produce a method for accurately estimating the water-surface area from satellite images. Saline ponds can develop salt-crusted areas that make it difficult to distinguish between soil and water. This challenge was addressed using a novel pipeline that combines band ratio water indices and the short near-infrared band as a salt filter. The study then extracted the predictable and unpredictable components of variation in the water-surface area. Two different approaches, each showing variations in the parameters, were used to obtain the stochastic variation around a regular pattern with the objective of dissecting the effect of assumptions on predictability estimations. The first approach, which is based on Colwell’s predictability metrics, transforms the focal variable into a nominal one. The resulting discrete categories define the relevant variations in the water-surface area. In the second approach, we introduced General Additive Model (GAM) fitting as a new metric for quantifying predictability. Both approaches produced a wide range of predictability for the studied ponds. 
Some model assumptions–which are considered very different a priori–had minor effects, whereas others produced predictability estimations that showed some degree of divergence. We hypothesize that these diverging estimations of predictability reflect the effect of fluctuations on different types of organisms. The fluctuation analysis described in this manuscript is applicable to a wide variety of systems, including both aquatic and non-aquatic systems, and will be valuable for quantifying and characterizing predictability, which is essential within the expected global increase in the unpredictability of environmental fluctuations. We advocate that a priori information for organisms of interest should be used to select the most suitable metrics for estimating predictability, and we provide some guidelines for this approach. PMID:29121667
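    As a rough illustration of the first approach, the sketch below computes Colwell's predictability (P), constancy (C) and contingency (M) from a frequency matrix, using the standard entropy-based formulation of Colwell (1974) as generally stated. The matrix layout (rows are discrete states of the focal variable, columns are time steps within the cycle, cells are counts across years) and the function names are assumptions made for this example; the discretization of water-surface area used in the study is not reproduced here.

```python
import math


def _entropy(counts):
    """Shannon entropy (natural log) of a list of non-negative counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def colwell_indices(matrix):
    """Return (predictability, constancy, contingency) for a state-by-time matrix.

    matrix[i][j] = number of cycles in which state i was observed at time step j.
    Requires at least two states (rows), otherwise log(s) is zero.
    """
    s = len(matrix)                                  # number of states
    row_totals = [sum(row) for row in matrix]        # totals per state
    col_totals = [sum(col) for col in zip(*matrix)]  # totals per time step
    cells = [v for row in matrix for v in row]

    h_time = _entropy(col_totals)   # H(X): uncertainty with respect to time
    h_state = _entropy(row_totals)  # H(Y): uncertainty with respect to state
    h_joint = _entropy(cells)       # H(XY): joint uncertainty
    log_s = math.log(s)

    constancy = 1 - h_state / log_s
    contingency = (h_time + h_state - h_joint) / log_s
    predictability = 1 - (h_joint - h_time) / log_s  # equals C + M
    return predictability, constancy, contingency


# Hypothetical example: two states, four seasons, five years of observations.
seasonal = [[5, 5, 0, 0],
            [0, 0, 5, 5]]
p, c, m = colwell_indices(seasonal)
print(round(p, 3), round(c, 3), round(m, 3))
```

    In this toy matrix the state depends entirely on the season, so all of the predictability comes from contingency rather than constancy.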

  2. Model-driven discovery of underground metabolic functions in Escherichia coli.

    PubMed

    Guzmán, Gabriela I; Utrilla, José; Nurk, Sergey; Brunk, Elizabeth; Monk, Jonathan M; Ebrahim, Ali; Palsson, Bernhard O; Feist, Adam M

    2015-01-20

    Enzyme promiscuity toward substrates has been discussed in evolutionary terms as providing the flexibility to adapt to novel environments. In the present work, we describe an approach toward exploring such enzyme promiscuity in the space of a metabolic network. This approach leverages genome-scale models, which have been widely used for predicting growth phenotypes in various environments or following a genetic perturbation; however, these predictions occasionally fail. Failed predictions of gene essentiality offer an opportunity for targeting biological discovery, suggesting the presence of unknown underground pathways stemming from enzymatic cross-reactivity. We demonstrate a workflow that couples constraint-based modeling and bioinformatic tools with KO strain analysis and adaptive laboratory evolution for the purpose of predicting promiscuity at the genome scale. Three cases of genes that are incorrectly predicted as essential in Escherichia coli--aspC, argD, and gltA--are examined, and isozyme functions are uncovered for each to a different extent. Seven isozyme functions based on genetic and transcriptional evidence are suggested between the genes aspC and tyrB, argD and astC, gabT and puuE, and gltA and prpC. This study demonstrates how a targeted model-driven approach to discovery can systematically fill knowledge gaps, characterize underground metabolism, and elucidate regulatory mechanisms of adaptation in response to gene KO perturbations.

  3. Comparing niche- and process-based models to reduce prediction uncertainty in species range shifts under climate change.

    PubMed

    Morin, Xavier; Thuiller, Wilfried

    2009-05-01

    Obtaining reliable predictions of species range shifts under climate change is a crucial challenge for ecologists and stakeholders. At the continental scale, niche-based models have been widely used in the last 10 years to predict the potential impacts of climate change on species distributions all over the world, although these models do not include any mechanistic relationships. In contrast, species-specific, process-based predictions remain scarce at the continental scale. This is regrettable because to secure relevant and accurate predictions it is always desirable to compare predictions derived from different kinds of models applied independently to the same set of species and using the same raw data. Here we compare predictions of range shifts under climate change scenarios for 2100 derived from niche-based models with those of a process-based model for 15 North American boreal and temperate tree species. A general pattern emerged from our comparisons: niche-based models tend to predict a stronger level of extinction and a greater proportion of colonization than the process-based model. This result likely arises because niche-based models do not take phenotypic plasticity and local adaptation into account. Nevertheless, as the two kinds of models rely on different assumptions, their complementarity is revealed by common findings. Both modeling approaches highlight a major potential limitation on species tracking their climatic niche because of migration constraints and identify similar zones where species extirpation is likely. Such convergent predictions from models built on very different principles provide a useful way to offset uncertainties at the continental scale. This study shows that the use in concert of both approaches with their own caveats and advantages is crucial to obtain more robust results and that comparisons among models are needed in the near future to gain accuracy regarding predictions of range shifts under climate change.

  4. Formability prediction for AHSS materials using damage models

    NASA Astrophysics Data System (ADS)

    Amaral, R.; Santos, Abel D.; José, César de Sá; Miranda, Sara

    2017-05-01

    Advanced high strength steels (AHSS) are seeing increased use, mostly due to lightweight design in the automobile industry and strict regulations on safety and greenhouse gas emissions. However, the use of these materials, characterized by a high strength-to-weight ratio, stiffness and high work hardening at early stages of plastic deformation, has imposed many challenges on the sheet metal industry, mainly their low formability and their different behaviour compared to traditional steels. This makes it a demanding task both to obtain a successful component and to use numerical simulation to predict material behaviour and its fracture limits. Although numerical prediction of critical strains in sheet metal forming processes is still very often based on the classic forming limit diagrams, alternative approaches can use damage models, which are based on stress states to predict failure during the forming process and can be classified as empirical, physics-based and phenomenological models. In the present paper, a comparative analysis of different ductile damage models is carried out in order to numerically evaluate two isotropic coupled damage models proposed by Johnson-Cook and Gurson-Tvergaard-Needleman (GTN), which correspond respectively to the first two groups of this classification. Finite element analysis is used with these damage mechanics approaches, and the obtained results are compared with experimental Nakajima tests, making it possible to evaluate and validate the ability of these approaches to predict damage and formability limits.

  5. Modeling a full-scale primary sedimentation tank using artificial neural networks.

    PubMed

    Gamal El-Din, A; Smith, D W

    2002-05-01

    Modeling the performance of full-scale primary sedimentation tanks has been commonly done using regression-based models, which are empirical relationships derived strictly from observed daily average influent and effluent data. Another approach to model a sedimentation tank is using a hydraulic efficiency model that utilizes tracer studies to characterize the performance of model sedimentation tanks based on eddy diffusion. However, the use of hydraulic efficiency models to predict the dynamic behavior of a full-scale sedimentation tank is very difficult as the development of such models has been done using controlled studies of model tanks. In this paper, another type of model, namely the artificial neural network modeling approach, is used to predict the dynamic response of a full-scale primary sedimentation tank. The neural model consists of two separate networks: one uses flow and influent total suspended solids data in order to predict the effluent total suspended solids from the tank, and the other makes predictions of the effluent chemical oxygen demand using data of the flow and influent chemical oxygen demand as inputs. An extensive sampling program was conducted in order to collect a data set to be used in training and validating the networks. A systematic approach was used in the building process of the model, which allowed the identification of a parsimonious neural model that is able to learn (and not memorize) from past data and generalize very well to unseen data that were used to validate the model. The results seem very promising. The potential of using the model as part of a real-time process control system is also discussed.

  6. Consumer preference models: fuzzy theory approach

    NASA Astrophysics Data System (ADS)

    Turksen, I. B.; Wilson, I. A.

    1993-12-01

    Consumer preference models are widely used in new product design, marketing management, pricing and market segmentation. The purpose of this article is to develop and test a fuzzy set preference model which can represent linguistic variables in individual-level models implemented in parallel with existing conjoint models. The potential improvements in market share prediction and predictive validity can substantially improve management decisions about what to make (product design), for whom to make it (market segmentation) and how much to make (market share prediction).

  7. RNA secondary structure prediction with pseudoknots: Contribution of algorithm versus energy model.

    PubMed

    Jabbari, Hosna; Wark, Ian; Montemagno, Carlo

    2018-01-01

    RNA is a biopolymer with various applications inside the cell and in biotechnology. The structure of an RNA molecule mainly determines its function and is essential to guide nanostructure design. Since experimental structure determination is time-consuming and expensive, accurate computational prediction of RNA structure is of great importance. Prediction of RNA secondary structure is relatively simpler than prediction of its tertiary structure and provides information about the tertiary structure; therefore, RNA secondary structure prediction has received attention in the past decades. Numerous methods with different folding approaches have been developed for RNA secondary structure prediction. While methods for prediction of RNA pseudoknot-free structure (structures with no crossing base pairs) have greatly improved in terms of their accuracy, methods for prediction of RNA pseudoknotted secondary structure (structures with crossing base pairs) still have room for improvement. A long-standing question for improving the prediction accuracy of RNA pseudoknotted secondary structure is whether to focus on the prediction algorithm or the underlying energy model, as there is a trade-off between the computational cost of the prediction algorithm and the generality of the method. The aim of this work is to argue that, when comparing different methods for RNA pseudoknotted structure prediction, the combination of algorithm and energy model should be considered, and a method should not be considered superior or inferior to others if they do not use the same scoring model. We demonstrate that while the folding approach is important in structure prediction, it is not the only important factor in the prediction accuracy of a given method, as the underlying energy model is also of great value. Therefore we encourage researchers to pay particular attention when comparing methods with different energy models.

  8. Feature selection through validation and un-censoring of endovascular repair survival data for predicting the risk of re-intervention.

    PubMed

    Attallah, Omneya; Karthikesalingam, Alan; Holt, Peter J E; Thompson, Matthew M; Sayers, Rob; Bown, Matthew J; Choke, Eddie C; Ma, Xianghong

    2017-08-03

    Feature selection (FS) process is essential in the medical area as it reduces the effort and time needed for physicians to measure unnecessary features. Choosing useful variables is a difficult task in the presence of censoring, which is the unique characteristic of survival analysis. Most survival FS methods depend on Cox's proportional hazard model; however, machine learning techniques (MLT) are preferred but not commonly used due to censoring. Techniques that have been proposed to adopt MLT to perform FS with survival data cannot be used with a high level of censoring. The researcher's previous publications proposed a technique to deal with the high level of censoring. It also used existing FS techniques to reduce dataset dimension. However, in this paper a new FS technique was proposed and combined with feature transformation and the proposed uncensoring approaches to select a reduced set of features and produce a stable predictive model. In this paper, a FS technique based on artificial neural network (ANN) MLT is proposed to deal with highly censored Endovascular Aortic Repair (EVAR) survival data. EVAR datasets were collected from 2004 to 2010 from two vascular centers in order to produce a final stable model. They contain almost 91% censored patients. The proposed approach used a wrapper FS method with ANN to select a reduced subset of features that predict the risk of EVAR re-intervention after 5 years for patients from two different centers located in the United Kingdom, to allow it to be potentially applied to cross-center predictions. The proposed model is compared with the two popular FS techniques, Akaike and Bayesian information criteria (AIC, BIC), that are used with Cox's model. The final model outperforms other methods in distinguishing the high and low risk groups, as they both have concordance index and estimated AUC better than the Cox's model based on AIC, BIC, Lasso, and SCAD approaches.
These models have p-values lower than 0.05, meaning that patients with different risk groups can be separated significantly and those who would need re-intervention can be correctly predicted. The proposed approach will save time and effort made by physicians to collect unnecessary variables. The final reduced model was able to predict the long-term risk of aortic complications after EVAR. This predictive model can help clinicians decide patients' future observation plan.
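    To make the role of censoring in the concordance index concrete, here is a minimal sketch of a Harrell-style concordance index for right-censored data. It is not the authors' implementation: the function name, the argument layout, and the handling of ties (tied risk scores count one half, pairs with tied times are skipped) are assumptions chosen for simplicity.

```python
def concordance_index(times, events, risks):
    """Harrell-style C-index for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if the event (e.g. re-intervention) was observed, 0 if censored
    risks  : predicted risk score per patient (higher = expected earlier event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Pair is comparable only if i's event occurred strictly before j's time.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs (too much censoring or too few patients)")
    return concordant / comparable


# Hypothetical toy data: the third patient is censored at time 3.
print(concordance_index([1, 2, 3], [1, 1, 0], [0.9, 0.5, 0.1]))
```

    With heavy censoring, as in the EVAR data described above, few pairs are comparable, which is one reason the index can be unstable on small samples.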

  9. Non-animal assessment of skin sensitization hazard: Is an integrated testing strategy needed, and if so what should be integrated?

    PubMed

    Roberts, David W; Patlewicz, Grace

    2018-01-01

    There is an expectation that to meet regulatory requirements, and avoid or minimize animal testing, integrated approaches to testing and assessment will be needed that rely on assays representing key events (KEs) in the skin sensitization adverse outcome pathway. Three non-animal assays have been formally validated and adopted for regulatory use: the direct peptide reactivity assay (DPRA), the KeratinoSens™ assay and the human cell line activation test (h-CLAT). There have been many efforts to develop integrated approaches to testing and assessment, with the "two out of three" approach attracting much attention. Here a set of 271 chemicals with mouse, human and non-animal sensitization test data was evaluated to compare the predictive performances of the three individual non-animal assays, their binary combinations and the "two out of three" approach in predicting skin sensitization potential. The most predictive approach was to use both the DPRA and h-CLAT as follows: (1) perform DPRA - if positive, classify as sensitizing, and (2) if negative, perform h-CLAT - a positive outcome denotes a sensitizer, a negative, a non-sensitizer. With this approach, 85% (local lymph node assay) and 93% (human) of non-sensitizer predictions were correct, whereas the "two out of three" approach had 69% (local lymph node assay) and 79% (human) of non-sensitizer predictions correct. The findings are consistent with the argument, supported by published quantitative mechanistic models, that only the first KE needs to be modeled. All three assays model this KE to an extent. The value of using more than one assay depends on how the different assays compensate for each other's technical limitations. Copyright © 2017 John Wiley & Sons, Ltd.
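    The two decision rules compared in this abstract can be sketched as follows. This is only an illustration of the logic as described: the function names and the boolean encoding of assay outcomes are assumptions, and the toy data in the usage lines are invented, not the 271-chemical dataset.

```python
def sequential_dpra_hclat(dpra_positive, hclat_positive):
    """DPRA first: a positive DPRA classifies as sensitizer; otherwise h-CLAT decides."""
    if dpra_positive:
        return True
    return hclat_positive


def two_out_of_three(dpra_positive, keratinosens_positive, hclat_positive):
    """Classify as sensitizer when at least two of the three assays are positive."""
    votes = sum([dpra_positive, keratinosens_positive, hclat_positive])
    return votes >= 2


def non_sensitizer_accuracy(predictions, truths):
    """Share of non-sensitizer predictions that are correct (negative predictive value)."""
    negatives = [t for p, t in zip(predictions, truths) if not p]
    if not negatives:
        raise ValueError("no non-sensitizer predictions to evaluate")
    return sum(1 for t in negatives if not t) / len(negatives)


# Hypothetical chemical: DPRA negative, KeratinoSens negative, h-CLAT positive.
print(sequential_dpra_hclat(False, True))    # True: h-CLAT overrides the negative DPRA
print(two_out_of_three(False, False, True))  # False: only one positive vote
```

    The example shows where the rules diverge: a chemical positive in a single assay is called a sensitizer by the sequential rule but not by majority vote, which is consistent with the sequential rule producing fewer false non-sensitizer calls.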

  10. Optimization of global model composed of radial basis functions using the term-ranking approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cai, Peng; Tao, Chao; Liu, Xiao-Jun

    2014-03-15

    A term-ranking method is put forward to optimize the global model composed of radial basis functions to improve the predictability of the model. The effectiveness of the proposed method is examined by numerical simulation and experimental data. Numerical simulations indicate that this method can significantly lengthen the prediction time and decrease the Bayesian information criterion of the model. The application to real voice signal shows that the optimized global model can capture more predictable component in chaos-like voice data and simultaneously reduce the predictable component (periodic pitch) in the residual signal.

  11. Datamining approaches for modeling tumor control probability.

    PubMed

    Naqa, Issam El; Deasy, Joseph O; Mu, Yi; Huang, Ellen; Hope, Andrew J; Lindsay, Patricia E; Apte, Aditya; Alaly, James; Bradley, Jeffrey D

    2010-11-01

    Tumor control probability (TCP) to radiotherapy is determined by complex interactions between tumor biology, tumor microenvironment, radiation dosimetry, and patient-related variables. The complexity of these heterogeneous variable interactions constitutes a challenge for building predictive models for routine clinical practice. We describe a datamining framework that can unravel the higher order relationships among dosimetric dose-volume prognostic variables, interrogate various radiobiological processes, and generalize to unseen data when applied prospectively. Several datamining approaches are discussed that include dose-volume metrics, equivalent uniform dose, mechanistic Poisson model, and model building methods using statistical regression and machine learning techniques. Institutional datasets of non-small cell lung cancer (NSCLC) patients are used to demonstrate these methods. The performance of the different methods was evaluated using bivariate Spearman rank correlations (rs). Over-fitting was controlled via resampling methods. Using a dataset of 56 patients with primary NSCLC tumors and 23 candidate variables, we estimated GTV volume and V75 to be the best model parameters for predicting TCP using statistical resampling and a logistic model. Using these variables, the support vector machine (SVM) kernel method provided superior performance for TCP prediction with an rs=0.68 on leave-one-out testing compared to logistic regression (rs=0.4), Poisson-based TCP (rs=0.33), and cell kill equivalent uniform dose model (rs=0.17). The prediction of treatment response can be improved by utilizing datamining approaches, which are able to unravel important non-linear complex interactions among model variables and have the capacity to predict on unseen data for prospective clinical applications.
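    For readers unfamiliar with the evaluation metric, the following is a small sketch of the Spearman rank correlation (rs) between model outputs and observed outcomes, computed as the Pearson correlation of average ranks. The function names and the toy inputs are assumptions for illustration; the study's resampling and leave-one-out procedure is not reproduced.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman rank correlation; assumes neither input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical predicted control probabilities vs. observed outcomes (1 = controlled).
predicted = [0.2, 0.4, 0.6, 0.8]
observed = [0, 0, 1, 1]
print(round(spearman_rs(predicted, observed), 3))
```

    Because the metric depends only on ranks, it rewards a model for ordering patients correctly even when its predicted probabilities are poorly calibrated.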

  12. Deep Visual Attention Prediction

    NASA Astrophysics Data System (ADS)

    Wang, Wenguan; Shen, Jianbing

    2018-05-01

    In this work, we aim to predict human eye fixation with view-free scenes based on an end-to-end deep learning architecture. Although Convolutional Neural Networks (CNNs) have made substantial improvement on human attention prediction, CNN-based attention models still need to be improved by efficiently leveraging multi-scale features. Our visual attention network is proposed to capture hierarchical saliency information from deep, coarse layers with global saliency information to shallow, fine layers with local saliency response. Our model is based on a skip-layer network structure, which predicts human attention from multiple convolutional layers with various receptive fields. Final saliency prediction is achieved via the cooperation of those global and local predictions. Our model is learned in a deep supervision manner, where supervision is directly fed into multi-level layers, instead of previous approaches of providing supervision only at the output layer and propagating this supervision back to earlier layers. Our model thus incorporates multi-level saliency predictions within a single network, which significantly decreases the redundancy of previous approaches of learning multiple network streams with different input scales. Extensive experimental analysis on various challenging benchmark datasets demonstrates that our method yields state-of-the-art performance with competitive inference time.

  13. A Unified Spatiotemporal Modeling Approach for Predicting Concentrations of Multiple Air Pollutants in the Multi-Ethnic Study of Atherosclerosis and Air Pollution

    PubMed Central

    Olives, Casey; Kim, Sun-Young; Sheppard, Lianne; Sampson, Paul D.; Szpiro, Adam A.; Oron, Assaf P.; Lindström, Johan; Vedal, Sverre; Kaufman, Joel D.

    2014-01-01

    Background: Cohort studies of the relationship between air pollution exposure and chronic health effects require predictions of exposure over long periods of time. Objectives: We developed a unified modeling approach for predicting fine particulate matter, nitrogen dioxide, oxides of nitrogen, and black carbon (as measured by light absorption coefficient) in six U.S. metropolitan regions from 1999 through early 2012 as part of the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air). Methods: We obtained monitoring data from regulatory networks and supplemented those data with study-specific measurements collected from MESA Air community locations and participants’ homes. In each region, we applied a spatiotemporal model that included a long-term spatial mean, time trends with spatially varying coefficients, and a spatiotemporal residual. The mean structure was derived from a large set of geographic covariates that was reduced using partial least-squares regression. We estimated time trends from observed time series and used spatial smoothing methods to borrow strength between observations. Results: Prediction accuracy was high for most models, with cross-validation R² (R²CV) > 0.80 at regulatory and fixed sites for most regions and pollutants. At home sites, overall R²CV ranged from 0.45 to 0.92, and temporally adjusted R²CV ranged from 0.23 to 0.92. Conclusions: This novel spatiotemporal modeling approach provides accurate fine-scale predictions in multiple regions for four pollutants. We have generated participant-specific predictions for MESA Air to investigate health effects of long-term air pollution exposures. These successes highlight modeling advances that can be adopted more widely in modern cohort studies. Citation: Keller JP, Olives C, Kim SY, Sheppard L, Sampson PD, Szpiro AA, Oron AP, Lindström J, Vedal S, Kaufman JD. 2015. A unified spatiotemporal modeling approach for predicting concentrations of multiple air pollutants in the Multi-Ethnic Study of Atherosclerosis and Air Pollution. Environ Health Perspect 123:301–309; http://dx.doi.org/10.1289/ehp.1408145 PMID:25398188
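
    A cross-validated R2 of the kind reported above can be computed from held-out predictions. The sketch below uses one common MSE-based definition (one minus mean squared error over the variance of the observations, floored at zero); whether this matches the article's exact computation is an assumption, and the function name and sample values are hypothetical.

```python
from statistics import mean

def cv_r2(observed, predicted):
    """MSE-based R2: 1 - MSE / variance of the observations, floored at 0."""
    mu = mean(observed)
    mse = mean([(o - p) ** 2 for o, p in zip(observed, predicted)])
    var = mean([(o - mu) ** 2 for o in observed])
    return max(0.0, 1.0 - mse / var)

# Hypothetical observed concentrations and their cross-validated predictions.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 17.0]
print(cv_r2(observed, predicted))
```

    Unlike a squared correlation, this form penalizes systematic bias: predictions that track the observations but are offset by a constant score lower.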

  14. Prediction and Factor Extraction of Drug Function by Analyzing Medical Records in Developing Countries.

    PubMed

    Hu, Min; Nohara, Yasunobu; Nakamura, Masafumi; Nakashima, Naoki

    2017-01-01

    The World Health Organization has declared Bangladesh one of 58 countries facing acute Human Resources for Health (HRH) crisis. Artificial intelligence in healthcare has been shown to be successful for diagnostics. Using machine learning to predict pharmaceutical prescriptions may solve HRH crises. In this study, we investigate a predictive model by analyzing prescription data of 4,543 subjects in Bangladesh. We predict the function of prescribed drugs, comparing three machine-learning approaches. The approaches compare whether a subject shall be prescribed medicine from the 21 most frequently prescribed drug functions. Receiver Operating Characteristics (ROC) were selected as a way to evaluate and assess prediction models. The results show the drug function with the best prediction performance was oral hypoglycemic drugs, which has an average AUC of 0.962. To understand how the variables affect prediction, we conducted factor analysis based on tree-based algorithms and natural language processing techniques.
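
    The area under the ROC curve (AUC) used above to compare models equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A minimal sketch of that rank-based calculation follows; the function name, labels, and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the chance a positive outscores a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = drug function prescribed) and model scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.6]
print(round(roc_auc(labels, scores), 3))
```

    The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch; a sort-based version would be preferable for thousands of subjects.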

  15. An improved Multimodel Approach for Global Sea Surface Temperature Forecasts

    NASA Astrophysics Data System (ADS)

    Khan, M. Z. K.; Mehrotra, R.; Sharma, A.

    2014-12-01

    The concept of ensemble combinations for formulating improved climate forecasts has gained popularity in recent years. However, many climate models share similar physics or modeling processes, which may lead to similar (or strongly correlated) forecasts. Recent approaches for combining forecasts that take into consideration differences in model accuracy over space and time have either ignored the similarity of forecasts among the models or followed a pairwise dynamic combination approach. Here we present a basis for combining model predictions, illustrating the improvements that can be achieved if procedures for factoring in inter-model dependence are utilised. The utility of the approach is demonstrated by combining sea surface temperature (SST) forecasts from five climate models over the period 1960-2005. The variable of interest, the monthly global sea surface temperature anomalies (SSTA) at a 5°×5° latitude-longitude grid, is predicted three months in advance to demonstrate the utility of the proposed algorithm. Results indicate that the proposed approach offers consistent and significant improvements for the majority of grid points compared to the case where the dependence among the models is ignored. Therefore, the proposed approach of combining multiple models by taking into account the existing interdependence provides an attractive alternative for obtaining improved climate forecasts. In addition, an approach to combine seasonal forecasts from multiple climate models with varying periods of availability is also demonstrated.
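
    Why inter-model dependence matters can be seen in the simplest case of two unbiased forecasts. The sketch below uses the classical minimum-variance combination weights, which is not necessarily the authors' algorithm; the function names and the error statistics are hypothetical.

```python
import math

def min_variance_weights(var1, var2, rho):
    """Weights (w1, w2) minimizing the error variance of w1*f1 + w2*f2.

    var1, var2: error variances of two unbiased forecasts.
    rho: correlation between their errors (the inter-model dependence).
    """
    cov = rho * math.sqrt(var1 * var2)
    w1 = (var2 - cov) / (var1 + var2 - 2.0 * cov)
    return w1, 1.0 - w1

def combined_variance(w1, w2, var1, var2, rho):
    """Error variance of the weighted combination under correlation rho."""
    cov = rho * math.sqrt(var1 * var2)
    return w1 ** 2 * var1 + w2 ** 2 * var2 + 2.0 * w1 * w2 * cov

# Hypothetical error statistics for two SST forecast models.
var1, var2, rho = 1.0, 2.0, 0.6
w_dep = min_variance_weights(var1, var2, rho)   # dependence accounted for
w_ind = min_variance_weights(var1, var2, 0.0)   # dependence ignored
print(combined_variance(*w_dep, var1, var2, rho))
print(combined_variance(*w_ind, var1, var2, rho))
```

    With positively correlated errors, the weights derived under an independence assumption give the weaker model too much influence, so the combined forecast has a larger error variance than the dependence-aware combination.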

  16. A model for the progressive failure of laminated composite structural components

    NASA Technical Reports Server (NTRS)

    Allen, D. H.; Lo, D. C.

    1991-01-01

    Laminated continuous fiber polymeric composites are capable of sustaining substantial load induced microstructural damage prior to component failure. Because this damage eventually leads to catastrophic failure, it is essential to capture the mechanics of progressive damage in any cogent life prediction model. For the past several years the authors have been developing one solution approach to this problem. In this approach the mechanics of matrix cracking and delamination are accounted for via locally averaged internal variables which account for the kinematics of microcracking. Damage progression is predicted by using phenomenologically based damage evolution laws which depend on the load history. The result is a nonlinear and path dependent constitutive model which has previously been implemented to a finite element computer code for analysis of structural components. Using an appropriate failure model, this algorithm can be used to predict component life. In this paper the model will be utilized to demonstrate the ability to predict the load path dependence of the damage and stresses in plates subjected to fatigue loading.

  17. Compaction of North-sea chalk by pore-failure and pressure solution in a producing reservoir

    NASA Astrophysics Data System (ADS)

    Keszthelyi, Daniel; Dysthe, Dag; Jamtveit, Bjorn

    2016-02-01

    The Ekofisk field in the Norwegian North Sea is an example of a compacting chalk reservoir with considerable subsequent seafloor subsidence due to petroleum production. Previously, a number of models were created to predict the compaction using different phenomenological approaches. Here we present a different approach: we use a new creep model based on microscopic mechanisms, with no fitting parameters, to predict strain rate at core scale and at reservoir scale. The model is able to reproduce creep experiments and the magnitude of the observed subsidence, making it the first microstructural model that can explain the Ekofisk compaction.

  18. M5 model tree based predictive modeling of road accidents on non-urban sections of highways in India.

    PubMed

    Singh, Gyanendra; Sachdeva, S N; Pal, Mahesh

    2016-11-01

    This work examines the application of M5 model tree and conventionally used fixed/random effect negative binomial (FENB/RENB) regression models for accident prediction on non-urban sections of highway in Haryana (India). Road accident data for a period of 2-6 years on different sections of 8 National and State Highways in Haryana were collected from police records. Data on road geometry, traffic and road environment variables were collected through field studies. A total of 222 data points were gathered by dividing highways into sections with certain uniform geometric characteristics. For prediction of accident frequencies using fifteen input parameters, two modeling approaches were used: FENB/RENB regression and M5 model tree. Results suggest that both models perform comparably well in terms of correlation coefficient and root mean square error values. M5 model tree provides simple linear equations that are easy to interpret and provide better insight, indicating that this approach can effectively be used as an alternative to the RENB approach if the sole purpose is to predict motor vehicle crashes. Sensitivity analysis using M5 model tree also suggests that its results reflect the physical conditions. Both models clearly indicate that, to improve safety on Indian highways, minor accesses to the highways need to be properly designed and controlled, the service roads need to be made functional, and the dispersion of speeds needs to be brought down. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. Promoter Sequences Prediction Using Relational Association Rule Mining

    PubMed Central

    Czibula, Gabriela; Bocicor, Maria-Iuliana; Czibula, Istvan Gergely

    2012-01-01

    In this paper we approach, from a computational perspective, the problem of promoter sequence prediction, an important problem within the field of bioinformatics. As the conditions for a DNA sequence to function as a promoter are not known, machine learning based classification models are still being developed to approach the problem of promoter identification in DNA. We propose a classification model based on relational association rule mining. Relational association rules are a particular type of association rule and describe numerical orderings between attributes that commonly occur over a data set. Our classifier is based on the discovery of relational association rules for predicting whether a DNA sequence contains a promoter region. An experimental evaluation of the proposed model and a comparison with similar existing approaches are provided. The obtained results show that our classifier outperforms the existing techniques for identifying promoter sequences, confirming the potential of our proposal. PMID:22563233

  20. Departures From Optimality When Pursuing Multiple Approach or Avoidance Goals

    PubMed Central

    2016-01-01

    This article examines how people depart from optimality during multiple-goal pursuit. The authors operationalized optimality using dynamic programming, which is a mathematical model used to calculate expected value in multistage decisions. Drawing on prospect theory, they predicted that people are risk-averse when pursuing approach goals and are therefore more likely to prioritize the goal in the best position than the dynamic programming model suggests is optimal. The authors predicted that people are risk-seeking when pursuing avoidance goals and are therefore more likely to prioritize the goal in the worst position than is optimal. These predictions were supported by results from an experimental paradigm in which participants made a series of prioritization decisions while pursuing either 2 approach or 2 avoidance goals. This research demonstrates the usefulness of using decision-making theories and normative models to understand multiple-goal pursuit. PMID:26963081

  1. Prediction task guided representation learning of medical codes in EHR.

    PubMed

    Cui, Liwen; Xie, Xiaolei; Shen, Zuojun

    2018-06-18

    There have been rapidly growing applications using machine learning models for predictive analytics in Electronic Health Records (EHR) to improve the quality of hospital services and the efficiency of healthcare resource utilization. A fundamental and crucial step in developing such models is to convert medical codes in EHR to feature vectors. These medical codes are used to represent diagnoses or procedures. Their vector representations have a tremendous impact on the performance of machine learning models. Recently, some researchers have utilized representation learning methods from Natural Language Processing (NLP) to learn vector representations of medical codes. However, most previous approaches are unsupervised, i.e. the generation of medical code vectors is independent from prediction tasks. Thus, the obtained feature vectors may be inappropriate for a specific prediction task. Moreover, unsupervised methods often require a lot of samples to obtain reliable results, but most practical problems have very limited patient samples. In this paper, we develop a new method called Prediction Task Guided Health Record Aggregation (PTGHRA), which aggregates health records guided by prediction tasks, to construct training corpus for various representation learning models. Compared with unsupervised approaches, representation learning models integrated with PTGHRA yield a significant improvement in predictive capability of generated medical code vectors, especially for limited training samples. Copyright © 2018. Published by Elsevier Inc.

  2. Ensuring long-term utility of the AOP framework and knowledge for multiple stakeholders

    EPA Science Inventory

    1. Introduction: There is a need to increase the development and implementation of predictive approaches to support chemical safety assessment. These predictive approaches feature generation of data from tools such as computational models, pathway-based in vitro assays, and short-t...

  3. Interactions of timing and prediction error learning.

    PubMed

    Kirkpatrick, Kimberly

    2014-01-01

    Timing and prediction error learning have historically been treated as independent processes, but growing evidence has indicated that they are not orthogonal. Timing emerges at the earliest time point when conditioned responses are observed, and temporal variables modulate prediction error learning in both simple conditioning and cue competition paradigms. In addition, prediction errors, through changes in reward magnitude or value alter timing of behavior. Thus, there appears to be a bi-directional interaction between timing and prediction error learning. Modern theories have attempted to integrate the two processes with mixed success. A neurocomputational approach to theory development is espoused, which draws on neurobiological evidence to guide and constrain computational model development. Heuristics for future model development are presented with the goal of sparking new approaches to theory development in the timing and prediction error fields. Copyright © 2013 Elsevier B.V. All rights reserved.
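
    For readers unfamiliar with prediction error learning, the basic idea can be sketched with the Rescorla-Wagner rule, in which associative strength is updated in proportion to the difference between the obtained and the predicted outcome. The abstract does not name a specific model, so this choice is an assumption made for illustration; the function name and parameter values are hypothetical, and the sketch deliberately ignores timing.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Track associative strength V across conditioning trials.

    trials: iterable of booleans, True when the reward is delivered.
    alpha: learning rate; lam: asymptote supported by the reward.
    """
    v = 0.0
    history = []
    for rewarded in trials:
        target = lam if rewarded else 0.0
        error = target - v          # prediction error on this trial
        v += alpha * error          # update proportional to the error
        history.append(v)
    return history

# Five rewarded trials (acquisition) followed by three unrewarded (extinction).
curve = rescorla_wagner([True] * 5 + [False] * 3)
print([round(v, 3) for v in curve])
```

    The rule treats each trial as a single time step, which is exactly the limitation the abstract points to: it has no representation of when within a trial the outcome occurs.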

  4. A coupled ductile fracture phase-field model for crystal plasticity

    NASA Astrophysics Data System (ADS)

    Hernandez Padilla, Carlos Alberto; Markert, Bernd

    2017-07-01

    Nowadays crack initiation and evolution play a key role in the design of mechanical components. In the past few decades, several numerical approaches have been developed with the objective to predict these phenomena. The objective of this work is to present a simplified, nonetheless representative phenomenological model to predict the crack evolution of ductile fracture in single crystals. The proposed numerical approach is carried out by merging a conventional elasto-plastic crystal plasticity model and a phase-field model modified to predict ductile fracture. A two-dimensional initial boundary value problem of ductile fracture is introduced considering a single-crystal setup and Nickel-base superalloy material properties. The model is implemented into the finite element context subjected to a quasi-static uniaxial tension test. The results are then qualitatively analyzed and briefly compared to current benchmark results in the literature.

  5. Contrasting analytical and data-driven frameworks for radiogenomic modeling of normal tissue toxicities in prostate cancer.

    PubMed

    Coates, James; Jeyaseelan, Asha K; Ybarra, Norma; David, Marc; Faria, Sergio; Souhami, Luis; Cury, Fabio; Duclos, Marie; El Naqa, Issam

    2015-04-01

    We explore analytical and data-driven approaches to investigate the integration of genetic variations (single nucleotide polymorphisms [SNPs] and copy number variations [CNVs]) with dosimetric and clinical variables in modeling radiation-induced rectal bleeding (RB) and erectile dysfunction (ED) in prostate cancer patients. Sixty-two patients who underwent curative hypofractionated radiotherapy (66 Gy in 22 fractions) between 2002 and 2010 were retrospectively genotyped for CNV and SNP rs5489 in the xrcc1 DNA repair gene. Fifty-four patients had full dosimetric profiles. Two parallel modeling approaches were compared to assess the risk of severe RB (Grade⩾3) and ED (Grade⩾1): maximum-likelihood-estimated generalized Lyman-Kutcher-Burman (LKB) and logistic regression. Statistical resampling based on cross-validation was used to evaluate model predictive power and generalizability to unseen data. Integration of biological variables xrcc1 CNV and SNP improved the fit of the RB and ED analytical and data-driven models. Cross-validation of the generalized LKB models yielded increases in classification performance of 27.4% for RB and 14.6% for ED when xrcc1 CNV and SNP were included, respectively. Biological variables added to logistic regression modeling improved classification performance over standard dosimetric models by 33.5% for RB and 21.2% for ED models. As a proof-of-concept, we demonstrated that the combination of genetic and dosimetric variables can provide significant improvement in NTCP prediction using analytical and data-driven approaches. The improvement in prediction performance was more pronounced in the data-driven approaches. Moreover, we have shown that CNVs, in addition to SNPs, may be useful structural genetic variants in predicting radiation toxicities. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
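
    The analytical side of this comparison rests on the Lyman-Kutcher-Burman formulation, which maps a dose-volume histogram to a complication probability through a generalized equivalent uniform dose. The sketch below shows the standard LKB form only; the genetic dose-modifying terms of the generalized model are not reproduced, and the function names, DVH bins, and parameter values (TD50, m, n) are hypothetical.

```python
import math

def geud(doses, volumes, n):
    """Generalized EUD for DVH bins: (sum_i v_i * D_i**(1/n))**n."""
    total = sum(volumes)  # normalize so fractional volumes sum to 1
    return sum((v / total) * d ** (1.0 / n) for d, v in zip(doses, volumes)) ** n

def lkb_ntcp(doses, volumes, td50, m, n):
    """LKB NTCP: standard normal CDF of (gEUD - TD50) / (m * TD50)."""
    t = (geud(doses, volumes, n) - td50) / (m * td50)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# Hypothetical differential DVH (dose in Gy, fractional volumes) and parameters.
doses = [20.0, 40.0, 60.0, 66.0]
volumes = [0.4, 0.3, 0.2, 0.1]
print(lkb_ntcp(doses, volumes, td50=70.0, m=0.15, n=0.1))
```

    A small volume parameter n makes gEUD approach the maximum dose, which suits serial-like organs; n close to 1 makes it approach the mean dose.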

  6. Representation of Vegetation and Other Nonerodible Elements in Aeolian Shear Stress Partitioning Models for Predicting Transport Threshold

    NASA Technical Reports Server (NTRS)

    King, James; Nickling, William G.; Gillies, John A.

    2005-01-01

    The presence of nonerodible elements is well understood to be a reducing factor for soil erosion by wind, but the limits of its protection of the surface and erosion threshold prediction are complicated by the varying geometry, spatial organization, and density of the elements. The predictive capabilities of the most recent models for estimating wind driven particle fluxes are reduced because of the poor representation of the effectiveness of vegetation to reduce wind erosion. Two approaches have been taken to account for roughness effects on sediment transport thresholds. Marticorena and Bergametti (1995) in their dust emission model parameterize the effect of roughness on threshold with the assumption that there is a relationship between roughness density and the aerodynamic roughness length of a surface. Raupach et al. (1993) offer a different approach based on physical modeling of wake development behind individual roughness elements and the partition of the surface stress and the total stress over a roughened surface. A comparison between the models shows the partitioning approach to be a good framework to explain the effect of roughness on entrainment of sediment by wind. Both models provided very good agreement for wind tunnel experiments using solid objects on a nonerodible surface. However, the Marticorena and Bergametti (1995) approach displays a scaling dependency when the difference between the roughness length of the surface and the overall roughness length is too great, while the Raupach et al. (1993) model's predictions perform better owing to the incorporation of the roughness geometry and the alterations to the flow they can cause.
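
    The stress-partitioning approach of Raupach et al. (1993) is usually written as a threshold friction velocity ratio, R_t = [(1 - m·σ·λ)(1 + m·β·λ)]^(-1/2), where λ is the roughness density, σ the basal-to-frontal area ratio of the elements, β the ratio of element to surface drag coefficients, and m a factor for spatial non-uniformity of the surface stress. The sketch below evaluates that expression; the function name is hypothetical and the default parameter values are illustrative assumptions, not values taken from this record.

```python
import math

def threshold_ratio(lam, sigma=1.0, beta=90.0, m=0.5):
    """Ratio of bare-surface to rough-surface threshold friction velocity.

    lam: roughness density (total frontal area of elements per unit ground area).
    sigma, beta, m: partitioning parameters (illustrative defaults, assumed).
    """
    return 1.0 / math.sqrt((1.0 - m * sigma * lam) * (1.0 + m * beta * lam))

# The ratio falls below 1 as roughness density increases, meaning a higher
# wind friction velocity is needed to start sediment transport.
for lam in (0.0, 0.01, 0.05, 0.1):
    print(lam, round(threshold_ratio(lam), 3))
```

    Because σ and β describe element geometry and drag directly, the expression responds to changes in element shape, which is the property the abstract credits for the better performance of this approach.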

  7. Comparison of empirical and data driven hydrometeorological hazard models on coastal cities of São Paulo, Brazil

    NASA Astrophysics Data System (ADS)

    Koga-Vicente, A.; Friedel, M. J.

    2010-12-01

    Every year thousands of people are affected by floods and landslide hazards caused by rainstorms. The problem is more serious in tropical developing countries because of the susceptibility as a result of the high amount of available energy to form storms, and the high vulnerability due to poor economic and social conditions. Predictive models of hazards are important tools to manage this kind of risk. In this study, a comparison of two different modeling approaches was made for predicting hydrometeorological hazards in 12 cities on the coast of São Paulo, Brazil, from 1994 to 2003. In the first approach, an empirical multiple linear regression (MLR) model was developed and used; the second approach used a type of unsupervised nonlinear artificial neural network called a self-organized map (SOM). By using twenty three independent variables of susceptibility (precipitation, soil type, slope, elevation, and regional atmospheric system scale) and vulnerability (distribution and total population, income and educational characteristics, poverty intensity, human development index), binary hazard responses were obtained. Model performance by cross-validation indicated that the respective MLR and SOM model accuracy was about 67% and 80%. Prediction accuracy can be improved by the addition of information, but the SOM approach is preferred because of sparse data and highly nonlinear relations among the independent variables.

  8. Bridging the Gap between Human Judgment and Automated Reasoning in Predictive Analytics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanfilippo, Antonio P.; Riensche, Roderick M.; Unwin, Stephen D.

    2010-06-07

    Events occur daily that impact the health, security and sustainable growth of our society. If we are to address the challenges that emerge from these events, anticipatory reasoning has to become an everyday activity. Strong advances have been made in using integrated modeling for analysis and decision making. However, a wider impact of predictive analytics is currently hindered by the lack of systematic methods for integrating predictive inferences from computer models with human judgment. In this paper, we present a predictive analytics approach that supports anticipatory analysis and decision-making through a concerted reasoning effort that interleaves human judgment and automated inferences. We describe a systematic methodology for integrating modeling algorithms within a serious gaming environment in which role-playing by human agents provides updates to model nodes and the ensuing model outcomes in turn influence the behavior of the human players. The approach ensures a strong functional partnership between human players and computer models while maintaining a high degree of independence and greatly facilitating the connection between model and game structures.

  9. Data-driven modeling and predictive control for boiler-turbine unit using fuzzy clustering and subspace methods.

    PubMed

    Wu, Xiao; Shen, Jiong; Li, Yiguo; Lee, Kwang Y

    2014-05-01

    This paper develops a novel data-driven fuzzy modeling strategy and predictive controller for boiler-turbine unit using fuzzy clustering and subspace identification (SID) methods. To deal with the nonlinear behavior of boiler-turbine unit, fuzzy clustering is used to provide an appropriate division of the operation region and develop the structure of the fuzzy model. Then by combining the input data with the corresponding fuzzy membership functions, the SID method is extended to extract the local state-space model parameters. Owing to the advantages of the both methods, the resulting fuzzy model can represent the boiler-turbine unit very closely, and a fuzzy model predictive controller is designed based on this model. As an alternative approach, a direct data-driven fuzzy predictive control is also developed following the same clustering and subspace methods, where intermediate subspace matrices developed during the identification procedure are utilized directly as the predictor. Simulation results show the advantages and effectiveness of the proposed approach. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.

  10. Assimilation of Satellite to Improve Cloud Simulation in WRF Model

    NASA Astrophysics Data System (ADS)

    Park, Y. H.; Pour Biazar, A.; McNider, R. T.

    2012-12-01

    A simple approach has been introduced to improve cloud simulation spatially and temporally in a meteorological model. The first step in this approach is to use Geostationary Operational Environmental Satellite (GOES) observations to identify clouds and estimate the cloud structure. Then, by comparing GOES observations to the model cloud field, we identify areas in which the model has under-predicted or over-predicted clouds. Next, by introducing subsidence in areas with over-prediction and lifting in areas with under-prediction, erroneous clouds are removed and new clouds are formed. The technique estimates the vertical velocity needed for the cloud correction and then uses a one-dimensional variational scheme (1D-Var) to calculate the horizontal divergence components and the consequent horizontal wind components needed to sustain such vertical velocity. Finally, the new horizontal winds are provided as a nudging field to the model. This nudging provides the dynamical support needed to create/clear clouds in a sustainable manner. The technique was implemented and tested in the Weather Research and Forecasting (WRF) Model and resulted in substantial improvement in model simulated clouds. Some of the results are presented here.

  11. Catchments as non-linear filters: evaluating data-driven approaches for spatio-temporal predictions in ungauged basins

    NASA Astrophysics Data System (ADS)

    Bellugi, D. G.; Tennant, C.; Larsen, L.

    2016-12-01

    Catchment and climate heterogeneity complicate prediction of runoff across time and space, and resulting parameter uncertainty can lead to large accumulated errors in hydrologic models, particularly in ungauged basins. Recently, data-driven modeling approaches have been shown to avoid the accumulated uncertainty associated with many physically-based models, providing an appealing alternative for hydrologic prediction. However, the effectiveness of different methods in hydrologically and geomorphically distinct catchments, and the robustness of these methods to changing climate and changing hydrologic processes remain to be tested. Here, we evaluate the use of machine learning techniques to predict daily runoff across time and space using only essential climatic forcing (e.g. precipitation, temperature, and potential evapotranspiration) time series as model input. Model training and testing was done using a high quality dataset of daily runoff and climate forcing data for 25+ years for 600+ minimally-disturbed catchments (drainage area range 5-25,000 km², median size 336 km²) that cover a wide range of climatic and physical characteristics. Preliminary results using Support Vector Regression (SVR) suggest that in some catchments this nonlinear-based regression technique can accurately predict daily runoff, while the same approach fails in other catchments, indicating that the representation of climate inputs and/or catchment filter characteristics in the model structure need further refinement to increase performance. We bolster this analysis by using Sparse Identification of Nonlinear Dynamics (a sparse symbolic regression technique) to uncover the governing equations that describe runoff processes in catchments where SVR performed well and for ones where it performed poorly, thereby enabling inference about governing processes. This provides a robust means of examining how catchment complexity influences runoff prediction skill, and represents a contribution towards the integration of data-driven inference and physically-based models.

  12. A Bayesian network approach for modeling local failure in lung cancer

    NASA Astrophysics Data System (ADS)

    Oh, Jung Hun; Craft, Jeffrey; Lozi, Rawan Al; Vaidya, Manushka; Meng, Yifan; Deasy, Joseph O.; Bradley, Jeffrey D.; El Naqa, Issam

    2011-03-01

    Locally advanced non-small cell lung cancer (NSCLC) patients suffer from a high local failure rate following radiotherapy. Despite many efforts to develop new dose-volume models for early detection of tumor local failure, there was no reported significant improvement in their application prospectively. Based on recent studies of biomarker proteins' role in hypoxia and inflammation in predicting tumor response to radiotherapy, we hypothesize that combining physical and biological factors with a suitable framework could improve the overall prediction. To test this hypothesis, we propose a graphical Bayesian network framework for predicting local failure in lung cancer. The proposed approach was tested using two different datasets of locally advanced NSCLC patients treated with radiotherapy. The first dataset was collected retrospectively, which comprises clinical and dosimetric variables only. The second dataset was collected prospectively in which in addition to clinical and dosimetric information, blood was drawn from the patients at various time points to extract candidate biomarkers as well. Our preliminary results show that the proposed method can be used as an efficient method to develop predictive models of local failure in these patients and to interpret relationships among the different variables in the models. We also demonstrate the potential use of heterogeneous physical and biological variables to improve the model prediction. With the first dataset, we achieved better performance compared with competing Bayesian-based classifiers. With the second dataset, the combined model had a slightly higher performance compared to individual physical and biological models, with the biological variables making the largest contribution. Our preliminary results highlight the potential of the proposed integrated approach for predicting post-radiotherapy local failure in NSCLC patients.

  13. Maximal Predictability Approach for Identifying the Right Descriptors for Electrocatalytic Reactions.

    PubMed

    Krishnamurthy, Dilip; Sumaria, Vaidish; Viswanathan, Venkatasubramanian

    2018-02-01

    Density functional theory (DFT) calculations are being routinely used to identify new material candidates that approach activity near fundamental limits imposed by thermodynamics or scaling relations. DFT calculations are associated with inherent uncertainty, which limits the ability to delineate materials (distinguishability) that possess high activity. Development of error-estimation capabilities in DFT has enabled uncertainty propagation through activity-prediction models. In this work, we demonstrate an approach to propagating uncertainty through thermodynamic activity models leading to a probability distribution of the computed activity and thereby its expectation value. A new metric, prediction efficiency, is defined, which provides a quantitative measure of the ability to distinguish activity of materials and can be used to identify the optimal descriptor(s) ΔG_opt. We demonstrate the framework for four important electrochemical reactions: hydrogen evolution, chlorine evolution, oxygen reduction and oxygen evolution. Future studies could utilize expected activity and prediction efficiency to significantly improve the prediction accuracy of highly active material candidates.

  14. A generic approach for the development of short-term predictions of Escherichia coli and biotoxins in shellfish

    PubMed Central

    Schmidt, Wiebke; Evers-King, Hayley L.; Campos, Carlos J. A.; Jones, Darren B.; Miller, Peter I.; Davidson, Keith; Shutler, Jamie D.

    2018-01-01

    Microbiological contamination or elevated marine biotoxin concentrations within shellfish can result in temporary closure of shellfish aquaculture harvesting, leading to financial loss for the aquaculture business and a potential reduction in consumer confidence in shellfish products. We present a method for predicting short-term variations in shellfish concentrations of Escherichia coli and biotoxin (okadaic acid and its derivatives dinophysistoxins and pectenotoxins). The approach was evaluated for 2 contrasting shellfish harvesting areas. Through a meta-data analysis and using environmental data (in situ, satellite observations and meteorological nowcasts and forecasts), key environmental drivers were identified and used to develop models to predict E. coli and biotoxin concentrations within shellfish. Models were trained and evaluated using independent datasets, and the best models were identified based on the model exhibiting the lowest root mean square error. The best biotoxin model was able to provide 1 wk forecasts with an accuracy of 86%, a 0% false positive rate and a 0% false discovery rate (n = 78 observations) when used to predict the closure of shellfish beds due to biotoxin. The best E. coli models were used to predict the European hygiene classification of the shellfish beds to an accuracy of 99% (n = 107 observations) and 98% (n = 63 observations) for a bay (St Austell Bay) and an estuary (Turnaware Bar), respectively. This generic approach enables high accuracy short-term farm-specific forecasts, based on readily accessible environmental data and observations. PMID:29805719
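    The three scores quoted for the closure forecasts (accuracy, false positive rate, false discovery rate) follow from a standard confusion matrix. A minimal sketch, assuming closures are coded as True and using made-up data; the function name closure_forecast_metrics is hypothetical and nothing here reproduces the study's observations:

```python
def closure_forecast_metrics(observed, predicted):
    """Score binary closure forecasts (True means the bed is closed)."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)          # closures correctly forecast
    tn = sum(1 for o, p in pairs if not o and not p)  # open periods correctly forecast
    fp = sum(1 for o, p in pairs if not o and p)      # false alarms
    fn = sum(1 for o, p in pairs if o and not p)      # missed closures
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Share of truly open periods that were forecast as closed.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        # Share of forecast closures that did not happen.
        "false_discovery_rate": fp / (fp + tp) if (fp + tp) else 0.0,
        "missed_closures": fn,
    }


# Made-up example: five weekly observations, one closure missed.
observed = [True, True, False, False, False]
predicted = [True, False, False, False, False]
scores = closure_forecast_metrics(observed, predicted)
print(scores)
```

    When the false positive rate is zero and accuracy is below 100%, as in this toy data, every error is necessarily a missed closure rather than a false alarm.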

  15. An Artificial Intelligence Approach for Modeling and Prediction of Water Diffusion Inside a Carbon Nanotube

    PubMed Central

    2009-01-01

    Modeling of water flow in carbon nanotubes is still a challenge for the classic models of fluid dynamics. In this investigation, an adaptive-network-based fuzzy inference system (ANFIS) is presented to solve this problem. The proposed ANFIS approach can construct an input–output mapping based on both human knowledge in the form of fuzzy if-then rules and stipulated input–output data pairs. Good performance of the designed ANFIS ensures its capability as a promising tool for modeling and prediction of fluid flow at nanoscale where the continuum models of fluid dynamics tend to break down. PMID:20596382

  16. An Artificial Intelligence Approach for Modeling and Prediction of Water Diffusion Inside a Carbon Nanotube.

    PubMed

    Ahadian, Samad; Kawazoe, Yoshiyuki

    2009-06-04

    Modeling of water flow in carbon nanotubes is still a challenge for the classic models of fluid dynamics. In this investigation, an adaptive-network-based fuzzy inference system (ANFIS) is presented to solve this problem. The proposed ANFIS approach can construct an input-output mapping based on both human knowledge in the form of fuzzy if-then rules and stipulated input-output data pairs. Good performance of the designed ANFIS ensures its capability as a promising tool for modeling and prediction of fluid flow at nanoscale where the continuum models of fluid dynamics tend to break down.

  17. Approaches to developing alternative and predictive toxicology based on PBPK/PD and QSAR modeling.

    PubMed Central

    Yang, R S; Thomas, R S; Gustafson, D L; Campain, J; Benjamin, S A; Verhaar, H J; Mumtaz, M M

    1998-01-01

    Systematic toxicity testing, using conventional toxicology methodologies, of single chemicals and chemical mixtures is highly impractical because of the immense numbers of chemicals and chemical mixtures involved and the limited scientific resources. Therefore, the development of unconventional, efficient, and predictive toxicology methods is imperative. Using carcinogenicity as an end point, we present approaches for developing predictive tools for toxicologic evaluation of chemicals and chemical mixtures relevant to environmental contamination. Central to the approaches presented is the integration of physiologically based pharmacokinetic/pharmacodynamic (PBPK/PD) and quantitative structure-activity relationship (QSAR) modeling with focused mechanistically based experimental toxicology. In this development, molecular and cellular biomarkers critical to the carcinogenesis process are evaluated quantitatively between different chemicals and/or chemical mixtures. Examples presented include the integration of PBPK/PD and QSAR modeling with a time-course medium-term liver foci assay, molecular biology and cell proliferation studies, Fourier transform infrared spectroscopic analyses of DNA changes, and cancer modeling to assess and attempt to predict the carcinogenicity of the series of 12 chlorobenzene isomers. Also presented is an ongoing effort to develop and apply a similar approach to chemical mixtures using in vitro cell culture (Syrian hamster embryo cell transformation assay and human keratinocytes) methodologies and in vivo studies. The promise and pitfalls of these developments are elaborated. When successfully applied, these approaches may greatly reduce animal usage, personnel, resources, and time required to evaluate the carcinogenicity of chemicals and chemical mixtures. PMID:9860897

  18. Modeling the distribution of white spruce (Picea glauca) for Alaska with high accuracy: an open access role-model for predicting tree species in last remaining wilderness areas

    Treesearch

    Bettina Ohse; Falk Huettmann; Stefanie M. Ickert-Bond; Glenn P. Juday

    2009-01-01

    Most wilderness areas still lack accurate distribution information on tree species. We met this need with a predictive GIS modeling approach, using freely available digital data and computer programs to efficiently obtain high-quality species distribution maps. Here we present a digital map with the predicted distribution of white spruce (Picea glauca...

  19. The People Capability Maturity Model

    ERIC Educational Resources Information Center

    Wademan, Mark R.; Spuches, Charles M.; Doughty, Philip L.

    2007-01-01

    The People Capability Maturity Model® (People CMM®) advocates a staged approach to organizational change. Developed by the Carnegie Mellon University Software Engineering Institute, this model seeks to bring discipline to the people side of management by promoting a structured, repeatable, and predictable approach for improving an…

  20. A biomechanical approach for in vivo lung tumor motion prediction during external beam radiation therapy

    NASA Astrophysics Data System (ADS)

    Karami, Elham; Gaede, Stewart; Lee, Ting-Yim; Samani, Abbas

    2015-03-01

    Lung Cancer is the leading cause of cancer death in both men and women. Among various treatment methods currently being used in the clinic, External Beam Radiation Therapy (EBRT) is used widely not only as the primary treatment method, but also in combination with chemotherapy and surgery. However, this method may lack desirable dosimetric accuracy because of respiration induced tumor motion. Recently, biomechanical modeling of the respiratory system has become a popular approach for tumor motion prediction and compensation. This approach requires reasonably accurate data pertaining to thoracic pressure variation, diaphragm position and biomechanical properties of the lung tissue in order to predict the lung tissue deformation and tumor motion. In this paper, we present preliminary results of an in vivo study obtained from a Finite Element Model (FEM) of the lung developed to predict tumor motion during respiration.

  1. Predictive Utility of Personality Disorder in Depression: Comparison of Outcomes and Taxonomic Approach.

    PubMed

    Newton-Howes, Giles; Mulder, Roger; Ellis, Pete M; Boden, Joseph M; Joyce, Peter

    2017-09-19

    There is debate around the best model for diagnosing personality disorder, both in terms of its relationship to the empirical data and clinical utility. Four randomized controlled trials examining various treatments for depression were analyzed at an individual patient level. Three different approaches to the diagnosis of personality disorder were analyzed in these patients. A total of 578 depressed patients were included in the analysis. Personality disorder, however measured, was of little predictive utility in the short term but added significantly to predictive modelling of medium-term outcomes, accounting for more than twice as much of the variance in social functioning outcome as depression psychopathology. Personality disorder assessment is of predictive utility with longer timeframes and when considering social outcomes as opposed to symptom counts. This utility is sufficiently great that there appears to be value in assessing personality; however, no particular approach outperforms any other.

  2. Reliability Prediction of Ontology-Based Service Compositions Using Petri Net and Time Series Models

    PubMed Central

    Li, Jia; Xia, Yunni; Luo, Xin

    2014-01-01

    OWL-S, one of the most important Semantic Web service ontologies proposed to date, provides a core ontological framework and guidelines for describing the properties and capabilities of web services in an unambiguous, computer-interpretable form. Predicting the reliability of composite service processes specified in OWL-S allows service users to decide whether the process meets the quantitative quality requirement. In this study, we consider the runtime quality of services to be fluctuating and introduce a dynamic framework to predict the runtime reliability of services specified in OWL-S, employing the Non-Markovian stochastic Petri net (NMSPN) and the time series model. The framework includes the following steps: obtaining the historical response time series of individual service components; fitting these series with an autoregressive moving-average (ARMA) model and predicting the future firing rates of service components; mapping the OWL-S process into an NMSPN model; employing the predicted firing rates as the model input of the NMSPN and calculating the normal completion probability as the reliability estimate. In the case study, a comparison between the static model and our approach based on experimental data is presented and it is shown that our approach achieves higher prediction accuracy. PMID:24688429
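    The time-series step of the framework (fit a model to historical response times, then forecast ahead) can be illustrated with a deliberately simplified stand-in. The sketch below fits a first-order autoregressive model, AR(1), by least squares instead of a full ARMA model; the function names fit_ar1 and forecast_ar1 and the data are hypothetical, and the Petri net stage is not covered.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] - mean = phi * (x[t-1] - mean) + noise."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    numerator = sum(centered[t] * centered[t - 1] for t in range(1, n))
    denominator = sum(c * c for c in centered[:-1])
    # A constant series carries no autocorrelation information; use phi = 0.
    phi = numerator / denominator if denominator else 0.0
    return mean, phi


def forecast_ar1(series, steps):
    """Forecast the next `steps` values by iterating the fitted AR(1) model."""
    mean, phi = fit_ar1(series)
    deviation = series[-1] - mean
    forecasts = []
    for _ in range(steps):
        deviation = phi * deviation  # the deviation from the mean shrinks or flips
        forecasts.append(mean + deviation)
    return forecasts


# Made-up response times (seconds) that alternate between fast and slow calls.
response_times = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
print(forecast_ar1(response_times, steps=2))
```

    A forecast response time could then be converted to a rate (for example, its reciprocal) before being passed to a downstream stochastic model; that conversion is an assumption here, not a detail stated in the abstract.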

  3. Treatment Selection in Depression.

    PubMed

    Cohen, Zachary D; DeRubeis, Robert J

    2018-05-07

    Mental health researchers and clinicians have long sought answers to the question "What works for whom?" The goal of precision medicine is to provide evidence-based answers to this question. Treatment selection in depression aims to help each individual receive the treatment, among the available options, that is most likely to lead to a positive outcome for them. Although patient variables that are predictive of response to treatment have been identified, this knowledge has not yet translated into real-world treatment recommendations. The Personalized Advantage Index (PAI) and related approaches combine information obtained prior to the initiation of treatment into multivariable prediction models that can generate individualized predictions to help clinicians and patients select the right treatment. With increasing availability of advanced statistical modeling approaches, as well as novel predictive variables and big data, treatment selection models promise to contribute to improved outcomes in depression.

  4. Finite Element Modeling of the Buckling Response of Sandwich Panels

    NASA Technical Reports Server (NTRS)

    Rose, Cheryl A.; Moore, David F.; Knight, Norman F., Jr.; Rankin, Charles C.

    2002-01-01

    A comparative study of different modeling approaches for predicting sandwich panel buckling response is described. The study considers sandwich panels with anisotropic face sheets and a very thick core. Results from conventional analytical solutions for sandwich panel overall buckling and face-sheet-wrinkling type modes are compared with solutions obtained using different finite element modeling approaches. Finite element solutions are obtained using layered shell element models, with and without transverse shear flexibility, layered shell/solid element models, with shell elements for the face sheets and solid elements for the core, and sandwich models using a recently developed specialty sandwich element. Convergence characteristics of the shell/solid and sandwich element modeling approaches with respect to in-plane and through-the-thickness discretization are demonstrated. Results of the study indicate that the specialty sandwich element provides an accurate and effective modeling approach for predicting both overall and localized sandwich panel buckling response. Furthermore, results indicate that anisotropy of the face sheets, along with the ratio of principal elastic moduli, affects the buckling response, and these effects may not be represented accurately by analytical solutions. Modeling recommendations are also provided.

  5. Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models.

    PubMed

    Daunizeau, J; Friston, K J; Kiebel, S J

    2009-11-01

    In this paper, we describe a general variational Bayesian approach for approximate inference on nonlinear stochastic dynamic models. This scheme extends established approximate inference on hidden-states to cover: (i) nonlinear evolution and observation functions, (ii) unknown parameters and (precision) hyperparameters and (iii) model comparison and prediction under uncertainty. Model identification or inversion entails the estimation of the marginal likelihood or evidence of a model. This difficult integration problem can be finessed by optimising a free-energy bound on the evidence using results from variational calculus. This yields a deterministic update scheme that optimises an approximation to the posterior density on the unknown model variables. We derive such a variational Bayesian scheme in the context of nonlinear stochastic dynamic hierarchical models, for both model identification and time-series prediction. The computational complexity of the scheme is comparable to that of an extended Kalman filter, which is critical when inverting high dimensional models or long time-series. Using Monte-Carlo simulations, we assess the estimation efficiency of this variational Bayesian approach using three stochastic variants of chaotic dynamic systems. We also demonstrate the model comparison capabilities of the method, its self-consistency and its predictive power.
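    The free-energy bound mentioned in the abstract can be written in its generic variational form. The notation below is an assumption chosen for illustration (y for data, ϑ for all unknown states and parameters, m for the model, q for the approximate posterior) and is not copied from the paper:

```latex
\ln p(y \mid m)
  = F(q) + \mathrm{KL}\!\left[\, q(\vartheta) \,\|\, p(\vartheta \mid y, m) \,\right]
  \;\ge\; F(q),
\qquad
F(q) = \big\langle \ln p(y, \vartheta \mid m) \big\rangle_{q}
     - \big\langle \ln q(\vartheta) \big\rangle_{q}.
```

    Because the Kullback-Leibler term is non-negative, maximizing F(q) over q both tightens the bound on the log evidence and pulls q toward the true posterior, which is why a single optimization serves model inversion and model comparison.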

  6. Initializing decadal climate predictions over the North Atlantic region

    NASA Astrophysics Data System (ADS)

    Matei, Daniela Mihaela; Pohlmann, Holger; Jungclaus, Johann; Müller, Wolfgang; Haak, Helmuth; Marotzke, Jochem

    2010-05-01

    Decadal climate prediction aims to predict the internally-generated decadal climate variability in addition to the externally-forced climate change signal. In order to achieve this it is necessary to start the predictions from the current climate state. In this study we investigate the forecast skill of the North Atlantic decadal climate predictions using two different ocean initialization strategies. First we apply an assimilation of ocean synthesis data provided by the GECCO project (Köhl and Stammer, 2008) as initial conditions for the coupled model ECHAM5/MPI-OM. Hindcast experiments are then performed over the period 1952-2001. An alternative approach is one in which the subsurface ocean temperature and salinity are diagnosed from an ensemble of ocean model runs forced by the NCEP-NCAR atmospheric reanalyses for the period 1948-2007, then nudged into the coupled model to produce initial conditions for the hindcast experiments. An anomaly coupling scheme is used in both approaches to avoid the hindcast drift and the associated initial shock. Differences between the two assimilation approaches are discussed by comparing them with the observational data in key regions and processes. We assess the skill of the initialized decadal hindcast experiments against the prediction skill of the non-initialized hindcast simulation. We obtain an overview of the regions with the highest predictability from the regional distribution of the anomaly correlation coefficients and RMSE for the SAT. For the first year the hindcast skill is increased over almost all ocean regions in the NCEP-forced approach. This increase in the hindcast skill for the 1 year lead time is somewhat reduced in the GECCO approach. At lead time 5yr and 10yr, the skill enhancement is still found over the North Atlantic and North Pacific regions. We also consider the potential predictability of the Atlantic Meridional Overturning Circulation (AMOC) and Nordic Seas Overflow by comparing the predicted values to the respective assimilation experiments. Hindcasts of Atlantic MOC and Denmark Strait Overflow show higher predictability than the comparison experiments without initialization and damped persistence predictions up to about 5-6 years.

  7. Physics and chemistry-driven artificial neural network for predicting bioactivity of peptides and proteins and their design.

    PubMed

    Huang, Ri-Bo; Du, Qi-Shi; Wei, Yu-Tuo; Pang, Zong-Wen; Wei, Hang; Chou, Kuo-Chen

    2009-02-07

    Predicting the bioactivity of peptides and proteins is an important challenge in drug development and protein engineering. In this study we introduce a novel approach, the so-called "physics and chemistry-driven artificial neural network (Phys-Chem ANN)", to deal with such a problem. Unlike the existing ANN approaches, which were designed under the inspiration of biological neural system, the Phys-Chem ANN approach is based on the physical and chemical principles, as well as the structural features of proteins. In the Phys-Chem ANN model the "hidden layers" are no longer virtual "neurons", but real structural units of proteins and peptides. It is a hybridization approach, which combines the linear free energy concept of quantitative structure-activity relationship (QSAR) with the advanced mathematical technique of ANN. The Phys-Chem ANN approach has adopted an iterative and feedback procedure, incorporating both machine-learning and artificial intelligence capabilities. In addition to making more accurate predictions for the bioactivities of proteins and peptides than is possible with the traditional QSAR approach, the Phys-Chem ANN approach can also provide more insights about the relationship between bioactivities and the structures involved than the ANN approach does. As an example of the application of the Phys-Chem ANN approach, a predictive model for the conformational stability of human lysozyme is presented.

  8. Flood loss model transfer: on the value of additional data

    NASA Astrophysics Data System (ADS)

    Schröter, Kai; Lüdtke, Stefan; Vogel, Kristin; Kreibich, Heidi; Thieken, Annegret; Merz, Bruno

    2017-04-01

    The transfer of models across geographical regions and flood events is a key challenge in flood loss estimation. Variations in local characteristics and continuous system changes require regional adjustments and continuous updating with current evidence. However, acquiring data on damage influencing factors is expensive and therefore assessing the value of additional data in terms of model reliability and performance improvement is of high relevance. The present study utilizes empirical flood loss data on direct damage to residential buildings available from computer aided telephone interviews that were carried out after the floods in 2002, 2005, 2006, 2010, 2011 and 2013 mainly in the Elbe and Danube catchments in Germany. Flood loss model performance is assessed for incrementally increased numbers of loss data which are differentiated according to region and flood event. Two flood loss modeling approaches are considered: (i) a multi-variable flood loss model approach using Random Forests and (ii) a uni-variable stage damage function. Both model approaches are embedded in a bootstrapping process which allows evaluating the uncertainty of model predictions. Predictive performance of both models is evaluated with regard to mean bias, mean absolute and mean squared errors, as well as hit rate and sharpness. Mean bias and mean absolute error give information about the accuracy of model predictions; mean squared error and sharpness give information about precision; and hit rate is an indicator of model reliability. The results of incremental, regional and temporal updating demonstrate the usefulness of additional data to improve model predictive performance and increase model reliability, particularly in a spatial-temporal transfer setting.
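    The evaluation scores named in the abstract can be computed in a few lines. A minimal sketch with made-up numbers: the function name loss_model_scores is hypothetical, and the hit rate is assumed here to mean the share of observations falling inside a predictive interval (for example, bootstrap quantiles), which the abstract does not spell out.

```python
def loss_model_scores(observed, predicted, lower, upper):
    """Return mean bias, MAE, MSE and interval hit rate for loss predictions."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    # Count observations that fall inside their predictive interval.
    hits = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return {
        "mean_bias": sum(errors) / n,                         # signed systematic error
        "mean_absolute_error": sum(abs(e) for e in errors) / n,
        "mean_squared_error": sum(e * e for e in errors) / n,
        "hit_rate": hits / n,                                 # interval coverage
    }


# Made-up relative losses and a constant prediction with a fixed interval.
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 2.0, 2.0]
lower = [0.0, 0.0, 0.0]
upper = [2.5, 2.5, 2.5]
scores = loss_model_scores(observed, predicted, lower, upper)
print(scores)
```

    The toy data show why several scores are reported together: the mean bias is zero because over- and under-predictions cancel, while the absolute and squared errors are not.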

  9. An improved advertising CTR prediction approach based on the fuzzy deep neural network

    PubMed Central

    Gao, Shu; Li, Mingjiang

    2018-01-01

    Combining a deep neural network with fuzzy theory, this paper proposes an advertising click-through rate (CTR) prediction approach based on a fuzzy deep neural network (FDNN). In this approach, fuzzy Gaussian-Bernoulli restricted Boltzmann machine (FGBRBM) is first applied to input raw data from advertising datasets. Next, fuzzy restricted Boltzmann machine (FRBM) is used to construct the fuzzy deep belief network (FDBN) with the unsupervised method layer by layer. Finally, fuzzy logistic regression (FLR) is utilized for modeling the CTR. The experimental results show that the proposed FDNN model outperforms several baseline models in terms of both data representation capability and robustness in advertising click log datasets with noise. PMID:29727443

  10. An improved advertising CTR prediction approach based on the fuzzy deep neural network.

    PubMed

    Jiang, Zilong; Gao, Shu; Li, Mingjiang

    2018-01-01

    Combining a deep neural network with fuzzy theory, this paper proposes an advertising click-through rate (CTR) prediction approach based on a fuzzy deep neural network (FDNN). In this approach, fuzzy Gaussian-Bernoulli restricted Boltzmann machine (FGBRBM) is first applied to input raw data from advertising datasets. Next, fuzzy restricted Boltzmann machine (FRBM) is used to construct the fuzzy deep belief network (FDBN) with the unsupervised method layer by layer. Finally, fuzzy logistic regression (FLR) is utilized for modeling the CTR. The experimental results show that the proposed FDNN model outperforms several baseline models in terms of both data representation capability and robustness in advertising click log datasets with noise.

  11. Initial Integration of Noise Prediction Tools for Acoustic Scattering Effects

    NASA Technical Reports Server (NTRS)

    Nark, Douglas M.; Burley, Casey L.; Tinetti, Ana; Rawls, John W.

    2008-01-01

    This effort provides an initial glimpse at NASA capabilities available in predicting the scattering of fan noise from a non-conventional aircraft configuration. The Aircraft NOise Prediction Program, Fast Scattering Code, and the Rotorcraft Noise Model were coupled to provide increased fidelity models of scattering effects on engine fan noise sources. The integration of these codes led to the identification of several key issues entailed in applying such multi-fidelity approaches. In particular, for prediction at noise certification points, the inclusion of distributed sources leads to complications with the source semi-sphere approach. Computational resource requirements limit the use of the higher fidelity scattering code to predict radiated sound pressure levels for full scale configurations at relevant frequencies. In addition, the ability to more accurately represent complex shielding surfaces in current lower fidelity models is necessary for general application to scattering predictions. This initial step in determining the potential benefits/costs of these new methods over the existing capabilities illustrates a number of the issues that must be addressed in the development of next generation aircraft system noise prediction tools.

  12. Seizure prediction in hippocampal and neocortical epilepsy using a model-based approach

    PubMed Central

    Aarabi, Ardalan; He, Bin

    2014-01-01

    Objectives: The aim of this study is to develop a model-based seizure prediction method. Methods: A neural mass model was used to simulate the macro-scale dynamics of intracranial EEG data. The model was composed of pyramidal cells and excitatory and inhibitory interneurons described through state equations. Twelve model parameters were estimated by fitting the model to the power spectral density of intracranial EEG signals and then integrated based on information obtained by investigating changes in the parameters prior to seizures. Twenty-one patients with medically intractable hippocampal and neocortical focal epilepsy were studied. Results: Tuned to obtain maximum sensitivity, average sensitivities of 87.07% and 92.6% with average false prediction rates of 0.2 and 0.15/h were achieved using maximum seizure occurrence periods of 30 and 50 min and a minimum seizure prediction horizon of 10 s, respectively. Under maximum specificity conditions, the system sensitivity decreased to 82.9% and 90.05% and the false prediction rates were reduced to 0.16 and 0.12/h using maximum seizure occurrence periods of 30 and 50 min, respectively. Conclusions: The spatio-temporal changes in the parameters demonstrated patient-specific preictal signatures that could be used for seizure prediction. Significance: The present findings suggest that the model-based approach may aid prediction of seizures. PMID:24374087

  13. Robust regression and posterior predictive simulation increase power to detect early bursts of trait evolution.

    PubMed

    Slater, Graham J; Pennell, Matthew W

    2014-05-01

    A central prediction of much theory on adaptive radiations is that traits should evolve rapidly during the early stages of a clade's history and subsequently slow down in rate as niches become saturated--a so-called "Early Burst." Although a common pattern in the fossil record, evidence for early bursts of trait evolution in phylogenetic comparative data has been equivocal at best. We show here that this may not necessarily be due to the absence of this pattern in nature. Rather, commonly used methods to infer its presence perform poorly when the strength of the burst--the rate at which phenotypic evolution declines--is small, and when some morphological convergence is present within the clade. We present two modifications to existing comparative methods that allow greater power to detect early bursts in simulated datasets. First, we develop posterior predictive simulation approaches and show that they outperform maximum likelihood approaches at identifying early bursts at moderate strength. Second, we use a robust regression procedure that allows for the identification and down-weighting of convergent taxa, leading to moderate increases in method performance. We demonstrate the utility and power of these approaches by investigating the evolution of body size in cetaceans. Model fitting using maximum likelihood is equivocal with regard to the mode of cetacean body size evolution. However, posterior predictive simulation combined with a robust node height test returns low support for Brownian motion or rate shift models, but not the early burst model. While the jury is still out on whether early bursts are actually common in nature, our approach will hopefully facilitate more robust testing of this hypothesis. We advocate the adoption of similar posterior predictive approaches to improve the fit and to assess the adequacy of macroevolutionary models in general.

  14. Geographic and temporal validity of prediction models: Different approaches were useful to examine model performance

    PubMed Central

    Austin, Peter C.; van Klaveren, David; Vergouwe, Yvonne; Nieboer, Daan; Lee, Douglas S.; Steyerberg, Ewout W.

    2017-01-01

    Objective: Validation of clinical prediction models traditionally refers to the assessment of model performance in new patients. We studied different approaches to geographic and temporal validation in the setting of multicenter data from two time periods. Study Design and Setting: We illustrated different analytic methods for validation using a sample of 14,857 patients hospitalized with heart failure at 90 hospitals in two distinct time periods. Bootstrap resampling was used to assess internal validity. Meta-analytic methods were used to assess geographic transportability. Each hospital was used once as a validation sample, with the remaining hospitals used for model derivation. Hospital-specific estimates of discrimination (c-statistic) and calibration (calibration intercepts and slopes) were pooled using random effects meta-analysis methods. I² statistics and prediction interval width quantified geographic transportability. Temporal transportability was assessed using patients from the earlier period for model derivation and patients from the later period for model validation. Results: Estimates of reproducibility, pooled hospital-specific performance, and temporal transportability were on average very similar, with c-statistics of 0.75. Between-hospital variation was moderate according to I² statistics and prediction intervals for c-statistics. Conclusion: This study illustrates how performance of prediction models can be assessed in settings with multicenter data at different time periods. PMID:27262237
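    The discrimination measure used throughout this study, the c-statistic, is the probability that a randomly chosen patient with the event received a higher predicted risk than a randomly chosen patient without it. A minimal pair-counting sketch with made-up data (the function name c_statistic is hypothetical, and the pooling across hospitals is not shown):

```python
def c_statistic(outcomes, risks):
    """Concordance between binary outcomes (1 = event) and predicted risks."""
    events = [r for y, r in zip(outcomes, risks) if y == 1]
    non_events = [r for y, r in zip(outcomes, risks) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for event_risk in events:
        for non_event_risk in non_events:
            if event_risk > non_event_risk:
                concordant += 1.0   # pair ranked correctly
            elif event_risk == non_event_risk:
                concordant += 0.5   # ties count as half
    return concordant / (len(events) * len(non_events))


# Made-up validation sample for a single hospital.
outcomes = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
c_value = c_statistic(outcomes, risks)
print(c_value)
```

    In a leave-one-hospital-out design such a value would be computed once per held-out hospital before the hospital-specific estimates are pooled.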

  15. Closure models for transitional blunt-body flows

    NASA Astrophysics Data System (ADS)

    Nance, Robert Paul

    1998-12-01

    A mean-flow modeling approach is proposed for the prediction of high-speed blunt-body wake flows undergoing transition to turbulence. This method couples the k-ζ (Enstrophy) compressible turbulence model with a procedure for characterizing non-turbulent fluctuations upstream of transition. Two different instability mechanisms are examined in this study. In the first model, transition is brought about by streamwise disturbance modes, whereas the second mechanism considers instabilities in the free shear layer associated with the wake flow. An important feature of this combined approach is the ability to specify or predict the location of transition onset. Solutions obtained using the new approach are presented for a variety of perfect-gas hypersonic flows over blunt-cone configurations. These results are shown to provide better agreement with experimental heating data than earlier laminar predictions by other researchers. In addition, it is demonstrated that the free-shear-layer instability mechanism is superior to the streamwise mechanism in terms of comparisons with heating measurements. The favorable comparisons are a strong indication that transition to turbulence is indeed present in the flowfields considered. They also show that the present method is a useful predictive tool for transitional blunt-body wake flows.

  16. Building protein-protein interaction networks for Leishmania species through protein structural information.

    PubMed

    Dos Santos Vasconcelos, Crhisllane Rafaele; de Lima Campos, Túlio; Rezende, Antonio Mauro

    2018-03-06

    Systematic analysis of a parasite interactome is a key approach to understand different biological processes. It makes it possible to elucidate disease mechanisms, to predict protein functions and to select promising targets for drug development. Currently, several approaches for protein interaction prediction for non-model species incorporate only small fractions of the entire proteomes and their interactions. Based on this perspective, this study presents an integration of computational methodologies, protein network predictions and comparative analysis of the protozoan species Leishmania braziliensis and Leishmania infantum. These parasites cause Leishmaniasis, a worldwide distributed and neglected disease, with limited treatment options using currently available drugs. The predicted interactions were obtained from a meta-approach, applying rigid body docking tests and template-based docking on protein structures predicted by different comparative modeling techniques. In addition, we trained a machine-learning algorithm (Gradient Boosting) using docking information performed on a curated set of positive and negative protein interaction data. Our final model obtained an AUC = 0.88, with recall = 0.69, specificity = 0.88 and precision = 0.83. Using this approach, it was possible to confidently predict 681 protein structures and 6198 protein interactions for L. braziliensis, and 708 protein structures and 7391 protein interactions for L. infantum. The predicted networks were integrated with protein interaction data already available, analyzed using several topological features and used to classify proteins as essential for network stability. The present study demonstrated the importance of integrating different methodologies of interaction prediction to increase the coverage of the protein interaction of the studied protocols, and it made available protein structures and interactions not previously reported.

  17. The Use of a Predictive Habitat Model and a Fuzzy Logic Approach for Marine Management and Planning

    PubMed Central

    Hattab, Tarek; Ben Rais Lasram, Frida; Albouy, Camille; Sammari, Chérif; Romdhane, Mohamed Salah; Cury, Philippe; Leprieur, Fabien; Le Loc’h, François

    2013-01-01

    Bottom trawl survey data are commonly used as a sampling technique to assess the spatial distribution of commercial species. However, this sampling technique does not always correctly detect a species even when it is present, and this can create significant limitations when fitting species distribution models. In this study, we aim to test the relevance of a mixed methodological approach that combines presence-only and presence-absence distribution models. We illustrate this approach using bottom trawl survey data to model the spatial distributions of 27 commercially targeted marine species. We use an environmentally- and geographically-weighted method to simulate pseudo-absence data. The species distributions are modelled using regression kriging, a technique that explicitly incorporates spatial dependence into predictions. Model outputs are then used to identify areas that met the conservation targets for the deployment of artificial anti-trawling reefs. To achieve this, we propose the use of a fuzzy logic framework that accounts for the uncertainty associated with different model predictions. For each species, the predictive accuracy of the model is classified as ‘high’. A better result is observed when a large number of occurrences are used to develop the model. The map resulting from the fuzzy overlay shows that three main areas have a high level of agreement with the conservation criteria. These results align with expert opinion, confirming the relevance of the proposed methodology in this study. PMID:24146867

  18. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    ERIC Educational Resources Information Center

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…

  19. A Laboratory Simulation of Urban Runoff and the Potential for Hydrograph Prediction with Curve Numbers

    USDA-ARS's Scientific Manuscript database

    Urban drainages are mosaics of pervious and impervious surfaces, and prediction of runoff hydrology with a lumped modeling approach using the NRCS curve number may be appropriate. However, the prognostic capability of such a lumped approach is complicated by routing and connectivity amongst infiltra...

  20. Predictability and Coupled Dynamics of MJO During DYNAMO

    DTIC Science & Technology

    2013-09-30

    1 DISTRIBUTION STATEMENT A. Approved for public release; distribution is unlimited. Predictability and Coupled Dynamics of MJO During DYNAMO ...Model (LIM) for MJO predictions and apply it in retrospective cross-validated forecast mode to the DYNAMO time period. APPROACH We are working as...a team to study MJO dynamics and predictability using several models as team members of the ONR DRI associated with the DYNAMO experiment. This is a

  1. Prediction of Fracture Behavior in Rock and Rock-like Materials Using Discrete Element Models

    NASA Astrophysics Data System (ADS)

    Katsaga, T.; Young, P.

    2009-05-01

    The study of fracture initiation and propagation in heterogeneous materials such as rock and rock-like materials is of principal interest in the field of rock mechanics and rock engineering. It is crucial to study and investigate failure prediction and safety measures in civil and mining structures. Our work offers a practical approach to predict fracture behaviour using discrete element models. In this approach, the microstructures of materials are presented through the combination of clusters of bonded particles with different inter-cluster particle and bond properties, and intra-cluster bond properties. The geometry of clusters is transferred from information available from thin sections, computed tomography (CT) images and other visual presentation of the modeled material using customized AutoCAD built-in dialog-based Visual Basic Application. Exact microstructures of the tested sample, including fractures, faults, inclusions and void spaces, can be duplicated in the discrete element models. Although the microstructural fabrics of rocks and rock-like structures may have different scales, fracture formation and propagation through these materials are alike and will follow similar mechanics. Synthetic material provides an excellent condition for validating the modelling approaches, as fracture behaviours are known with the well-defined composite's properties. Calibration of the macro-properties of matrix material and inclusions (aggregates) was followed by calibration of the overall mechanical material response by adjusting the interfacial properties. The discrete element model predicted similar fracture propagation features and path as that of the real sample material. The path of the fractures and matrix-inclusion interaction was compared using computed tomography images. Initiation and fracture formation in the model and real material were compared using Acoustic Emission (AE) data. Analysing the temporal and spatial evolution of AE events, collected during the sample testing, in relation to the CT images allows the precise reconstruction of the failure sequence. Our proposed modelling approach illustrates realistic fracture formation and growth predictions at different loading conditions.

  2. Ordinal convolutional neural networks for predicting RDoC positive valence psychiatric symptom severity scores.

    PubMed

    Rios, Anthony; Kavuluru, Ramakanth

    2017-11-01

    The CEGS N-GRID 2016 Shared Task in Clinical Natural Language Processing (NLP) provided a set of 1000 neuropsychiatric notes to participants as part of a competition to predict psychiatric symptom severity scores. This paper summarizes our methods, results, and experiences based on our participation in the second track of the shared task. Classical methods of text classification usually fall into one of three problem types: binary, multi-class, and multi-label classification. In this effort, we study ordinal regression problems with text data where misclassifications are penalized differently based on how far apart the ground truth and model predictions are on the ordinal scale. Specifically, we present our entries (methods and results) in the N-GRID shared task in predicting research domain criteria (RDoC) positive valence ordinal symptom severity scores (absent, mild, moderate, and severe) from psychiatric notes. We propose a novel convolutional neural network (CNN) model designed to handle ordinal regression tasks on psychiatric notes. Broadly speaking, our model combines an ordinal loss function, a CNN, and conventional feature engineering (wide features) into a single model which is learned end-to-end. Given interpretability is an important concern with nonlinear models, we apply a recent approach called locally interpretable model-agnostic explanation (LIME) to identify important words that lead to instance specific predictions. Our best model entered into the shared task placed third among 24 teams and scored a macro mean absolute error (MMAE) based normalized score (100·(1-MMAE)) of 83.86. Since the competition, we improved our score (using basic ensembling) to 85.55, comparable with the winning shared task entry. Applying LIME to model predictions, we demonstrate the feasibility of instance specific prediction interpretation by identifying words that led to a particular decision. 
In this paper, we present a method that successfully uses wide features and an ordinal loss function applied to convolutional neural networks for ordinal text classification specifically in predicting psychiatric symptom severity scores. Our approach leads to excellent performance on the N-GRID shared task and is also amenable to interpretability using existing model-agnostic approaches. Copyright © 2017 Elsevier Inc. All rights reserved.
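
    The normalized score 100·(1-MMAE) quoted in this abstract can be illustrated with a short Python sketch. It assumes, as an interpretation not spelled out in the abstract, that MMAE is a macro-averaged mean absolute error in which each true class's error is divided by the largest error possible for that class; the function names and the integer coding of the four severity levels (0 = absent through 3 = severe) are hypothetical.

```python
from collections import defaultdict

def normalized_macro_mae(y_true, y_pred, n_classes=4):
    """Macro-averaged MAE, each class scaled by its maximum possible error (assumed definition)."""
    errors = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        errors[t].append(abs(t - p))
    per_class = []
    for c, errs in errors.items():
        max_err = max(c, n_classes - 1 - c)   # farthest label from class c
        per_class.append(sum(errs) / len(errs) / max_err)
    return sum(per_class) / len(per_class)    # average over classes that occur

def normalized_score(y_true, y_pred, n_classes=4):
    """Score on a 0-100 scale: 100 * (1 - MMAE)."""
    return 100.0 * (1.0 - normalized_macro_mae(y_true, y_pred, n_classes))

# One 'absent' note predicted as 'severe', everything else correct.
example_score = normalized_score([0, 0, 3], [3, 0, 3])
print(example_score)
```

    Averaging per class rather than per note means that a rare severity level weighs as much as a common one, which is why a macro-averaged error is a natural choice when the labels are imbalanced.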

  3. Forecasting daily lake levels using artificial intelligence approaches

    NASA Astrophysics Data System (ADS)

    Kisi, Ozgur; Shiri, Jalal; Nikoofar, Bagher

    2012-04-01

    Accurate prediction of lake-level variations is important for planning, design, construction, and operation of lakeshore structures and also in the management of freshwater lakes for water supply purposes. In the present paper, three artificial intelligence approaches, namely artificial neural networks (ANNs), adaptive neuro-fuzzy inference system (ANFIS), and gene expression programming (GEP), were applied to forecast daily lake-level variations up to 3 days ahead. The measurements at Lake Iznik in Western Turkey, for the period of January 1961-December 1982, were used for training, testing, and validating the employed models. The results obtained by the GEP approach indicated that it performs better than ANFIS and ANNs in predicting lake-level variations. A comparison was also made between these artificial intelligence approaches and conventional autoregressive moving average (ARMA) models, which demonstrated the superiority of GEP, ANFIS, and ANN models over ARMA models.

  4. A model predictive speed tracking control approach for autonomous ground vehicles

    NASA Astrophysics Data System (ADS)

    Zhu, Min; Chen, Huiyan; Xiong, Guangming

    2017-03-01

    This paper presents a novel speed tracking control approach based on a model predictive control (MPC) framework for autonomous ground vehicles. A switching algorithm without calibration is proposed to determine the drive or brake control. Combined with a simple inverse longitudinal vehicle model and adaptive regulation of MPC, this algorithm can make use of the engine brake torque for various driving conditions and avoid high frequency oscillations automatically. A simplified quadratic program (QP) solving algorithm is used to reduce the computational time, and the approach has been applied in a 16-bit microcontroller. The performance of the proposed approach is evaluated via simulations and vehicle tests, which were carried out in a range of speed-profile tracking tasks. With a well-designed system structure, high-precision speed control is achieved. The system can robustly model uncertainty and external disturbances, and yields a faster response with less overshoot than a PI controller.

  5. NWP model forecast skill optimization via closure parameter variations

    NASA Astrophysics Data System (ADS)

    Järvinen, H.; Ollinaho, P.; Laine, M.; Solonen, A.; Haario, H.

    2012-04-01

    We present results of a novel approach to tune the predictive skill of numerical weather prediction (NWP) models. These models contain tunable parameters which appear in parameterization schemes of sub-grid scale physical processes. The current practice is to specify the numerical parameter values manually, based on expert knowledge. We recently developed a concept and method (QJRMS 2011) for on-line estimation of the NWP model parameters via closure parameter variations. The method, called EPPES ("Ensemble prediction and parameter estimation system"), utilizes ensemble prediction infrastructure for parameter estimation in a very cost-effective way: practically no new computations are introduced. The approach provides an algorithmic decision making tool for model parameter optimization in operational NWP. In EPPES, statistical inference about the NWP model tunable parameters is made by (i) generating an ensemble of predictions so that each member uses different model parameter values, drawn from a proposal distribution, and (ii) feeding back the relative merits of the parameter values to the proposal distribution, based on evaluation of a suitable likelihood function against verifying observations. In this presentation, the method is first illustrated in low-order numerical tests using a stochastic version of the Lorenz-95 model which effectively emulates the principal features of ensemble prediction systems. The EPPES method correctly detects the unknown and wrongly specified parameter values, and leads to an improved forecast skill. Second, results with an ensemble prediction system emulator, based on the ECHAM5 atmospheric GCM, show that the model tuning capability of EPPES scales up to realistic models and ensemble prediction systems. Finally, preliminary results of EPPES in the context of the ECMWF forecasting system are presented.
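
    The two-step loop described in this abstract (draw parameter values from a proposal distribution, then feed their relative merits back into the proposal) can be sketched in a few lines of Python. This is a deliberately simplified, importance-weighted update of the mean of a one-dimensional Gaussian proposal with a fixed spread; it is not the EPPES algorithm itself, and the function names, the toy likelihood and all numerical values are assumptions made for illustration.

```python
import math
import random

def proposal_update_step(mu, sigma, log_likelihood, n_members, rng):
    """One toy iteration: sample members, weight them, return the new proposal mean."""
    # (i) each ensemble member gets its own parameter value from the proposal
    thetas = [rng.gauss(mu, sigma) for _ in range(n_members)]
    # (ii) weight members by how well they verify (toy likelihood), stabilised in log space
    logw = [log_likelihood(t) for t in thetas]
    top = max(logw)
    weights = [math.exp(lw - top) for lw in logw]
    total = sum(weights)
    # feed the relative merits back: the new mean is the weighted member average
    return sum(w * t for w, t in zip(weights, thetas)) / total

# Toy setup (hypothetical): the 'true' parameter is 2.0 and verification noise is 0.5.
def toy_log_likelihood(theta, truth=2.0, noise=0.5):
    return -0.5 * ((theta - truth) / noise) ** 2

rng = random.Random(0)
mu = 0.0  # deliberately wrong initial guess
for _ in range(20):
    mu = proposal_update_step(mu, 1.0, toy_log_likelihood, n_members=50, rng=rng)
print(mu)
```

    In this toy the proposal mean drifts from the wrong initial value towards the parameter value favoured by the likelihood, which mirrors the abstract's observation that wrongly specified parameter values are detected and corrected.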

  6. A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more.

    PubMed

    Rivas, Elena; Lang, Raymond; Eddy, Sean R

    2012-02-01

    The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases.

  7. A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more

    PubMed Central

    Rivas, Elena; Lang, Raymond; Eddy, Sean R.

    2012-01-01

    The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases. PMID:22194308

  8. Biogeochemical metabolic modeling of methanogenesis by Methanosarcina barkeri

    NASA Astrophysics Data System (ADS)

    Jensvold, Z. D.; Jin, Q.

    2015-12-01

    Methanogenesis, the biological process of methane production, is the final step of natural organic matter degradation. In studying natural methanogenesis, important questions include how fast methanogenesis proceeds and how methanogens adapt to the environment. To address these questions, we propose a new approach - biogeochemical reaction modeling - by simulating the metabolic networks of methanogens. Biogeochemical reaction modeling combines geochemical reaction modeling and genome-scale metabolic modeling. Geochemical reaction modeling focuses on the speciation of electron donors and acceptors in the environment, and therefore the energy available to methanogens. Genome-scale metabolic modeling predicts microbial rates and metabolic strategies. Specifically, this approach describes methanogenesis using an enzyme network model, and computes enzyme rates by accounting for both the kinetics and thermodynamics. The network model is simulated numerically to predict enzyme abundances and rates of methanogen metabolism. We applied this new approach to Methanosarcina barkeri strain fusaro, a model methanogen that makes methane by reducing carbon dioxide and oxidizing dihydrogen. The simulation results match well with the results of previous laboratory experiments, including the magnitude of proton motive force and the kinetic parameters of Methanosarcina barkeri. The results also predict that in natural environments, the configuration of methanogenesis network, including the concentrations of enzymes and metabolites, differs significantly from that under laboratory settings.

  9. A Critical Review for Developing Accurate and Dynamic Predictive Models Using Machine Learning Methods in Medicine and Health Care.

    PubMed

    Alanazi, Hamdan O; Abdullah, Abdul Hanan; Qureshi, Kashif Naseer

    2017-04-01

    Recently, Artificial Intelligence (AI) has been used widely in the medicine and health care sector. Within machine learning, classification or prediction is a major field of AI. Today, the study of existing predictive models based on machine learning methods is extremely active. Doctors need accurate predictions for the outcomes of their patients' diseases. In addition, for accurate predictions, timing is another significant factor that influences treatment decisions. In this paper, existing predictive models in medicine and health care are critically reviewed. Furthermore, the most famous machine learning methods are explained, and the confusion between a statistical approach and machine learning is clarified. A review of related literature reveals that the predictions of existing predictive models differ even when the same dataset is used. Therefore, existing predictive models are essential, and current methods must be improved.

  10. Testing process predictions of models of risky choice: a quantitative model comparison approach

    PubMed Central

    Pachur, Thorsten; Hertwig, Ralph; Gigerenzer, Gerd; Brandstätter, Eduard

    2013-01-01

    This article presents a quantitative model comparison contrasting the process predictions of two prominent views on risky choice. One view assumes a trade-off between probabilities and outcomes (or non-linear functions thereof) and the separate evaluation of risky options (expectation models). Another view assumes that risky choice is based on comparative evaluation, limited search, aspiration levels, and the forgoing of trade-offs (heuristic models). We derived quantitative process predictions for a generic expectation model and for a specific heuristic model, namely the priority heuristic (Brandstätter et al., 2006), and tested them in two experiments. The focus was on two key features of the cognitive process: acquisition frequencies (i.e., how frequently individual reasons are looked up) and direction of search (i.e., gamble-wise vs. reason-wise). In Experiment 1, the priority heuristic predicted direction of search better than the expectation model (although neither model predicted the acquisition process perfectly); acquisition frequencies, however, were inconsistent with both models. Additional analyses revealed that these frequencies were primarily a function of what Rubinstein (1988) called “similarity.” In Experiment 2, the quantitative model comparison approach showed that people seemed to rely more on the priority heuristic in difficult problems, but to make more trade-offs in easy problems. This finding suggests that risky choice may be based on a mental toolbox of strategies. PMID:24151472

  11. [GSH fermentation process modeling using entropy-criterion based RBF neural network model].

    PubMed

    Tan, Zuoping; Wang, Shitong; Deng, Zhaohong; Du, Guocheng

    2008-05-01

    The prediction accuracy and generalization of GSH fermentation process modeling are often degraded by noise in the corresponding experimental data. In order to avoid this problem, we present a novel RBF neural network modeling approach based on an entropy criterion. Unlike traditional MSE-criterion based parameter learning, it considers the whole distribution structure of the training data set in the parameter learning process, and thus effectively avoids weak generalization and over-learning. The proposed approach is then applied to GSH fermentation process modeling. Our results demonstrate that the proposed method has better prediction accuracy, generalization and robustness, making it potentially useful for GSH fermentation process modeling.

  12. Moving beyond qualitative evaluations of Bayesian models of cognition.

    PubMed

    Hemmer, Pernille; Tauber, Sean; Steyvers, Mark

    2015-06-01

    Bayesian models of cognition provide a powerful way to understand the behavior and goals of individuals from a computational point of view. Much of the focus in the Bayesian cognitive modeling approach has been on qualitative model evaluations, where predictions from the models are compared to data that is often averaged over individuals. In many cognitive tasks, however, there are pervasive individual differences. We introduce an approach to directly infer individual differences related to subjective mental representations within the framework of Bayesian models of cognition. In this approach, Bayesian data analysis methods are used to estimate cognitive parameters and motivate the inference process within a Bayesian cognitive model. We illustrate this integrative Bayesian approach on a model of memory. We apply the model to behavioral data from a memory experiment involving the recall of heights of people. A cross-validation analysis shows that the Bayesian memory model with inferred subjective priors predicts withheld data better than a Bayesian model where the priors are based on environmental statistics. In addition, the model with inferred priors at the individual subject level led to the best overall generalization performance, suggesting that individual differences are important to consider in Bayesian models of cognition.

  13. Predicting the influence of liposomal lipid composition on liposome size, zeta potential and liposome-induced dendritic cell maturation using a design of experiments approach.

    PubMed

    Soema, Peter C; Willems, Geert-Jan; Jiskoot, Wim; Amorij, Jean-Pierre; Kersten, Gideon F

    2015-08-01

    In this study, the effect of liposomal lipid composition on the physicochemical characteristics and adjuvanticity of liposomes was investigated. Using a design of experiments (DoE) approach, peptide-containing liposomes containing various lipids (EPC, DOPE, DOTAP and DC-Chol) and peptide concentrations were formulated. Liposome size and zeta potential were determined for each formulation. Moreover, the adjuvanticity of the liposomes was assessed in an in vitro dendritic cell (DC) model, by quantifying the expression of DC maturation markers CD40, CD80, CD83 and CD86. The acquired data of these liposome characteristics were successfully fitted with regression models, and response contour plots were generated for each response factor. These models were applied to predict a lipid composition that resulted in a liposome with a target zeta potential. Subsequently, the expression of the DC maturation factors for this lipid composition was predicted and tested in vitro; the acquired maturation responses corresponded well with the predicted ones. These results show that a DoE approach can be used to screen various lipids and lipid compositions, and to predict their impact on liposome size, charge and adjuvanticity. Using such an approach may accelerate the formulation development of liposomal vaccine adjuvants. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.

  14. Genomic Selection in Multi-environment Crop Trials.

    PubMed

    Oakey, Helena; Cullis, Brian; Thompson, Robin; Comadran, Jordi; Halpin, Claire; Waugh, Robbie

    2016-05-03

    Genomic selection in crop breeding introduces modeling challenges not found in animal studies. These include the need to accommodate replicate plants for each line, consider spatial variation in field trials, address line by environment interactions, and capture nonadditive effects. Here, we propose a flexible single-stage genomic selection approach that resolves these issues. Our linear mixed model incorporates spatial variation through environment-specific terms, and also randomization-based design terms. It considers marker, and marker by environment interactions using ridge regression best linear unbiased prediction to extend genomic selection to multiple environments. Since the approach uses the raw data from line replicates, the line genetic variation is partitioned into marker and nonmarker residual genetic variation (i.e., additive and nonadditive effects). This results in a more precise estimate of marker genetic effects. Using barley height data from trials, in 2 different years, of up to 477 cultivars, we demonstrate that our new genomic selection model improves predictions compared to current models. Analyzing single trials revealed improvements in predictive ability of up to 5.7%. For the multiple environment trial (MET) model, combining both year trials improved predictive ability up to 11.4% compared to a single environment analysis. Benefits were significant even when fewer markers were used. Compared to a single-year standard model run with 3490 markers, our partitioned MET model achieved the same predictive ability using between 500 and 1000 markers depending on the trial. Our approach can be used to increase accuracy and confidence in the selection of the best lines for breeding and/or, to reduce costs by using fewer markers. Copyright © 2016 Oakey et al.

  15. The Effect of Delamination on Damage Path and Failure Load Prediction for Notched Composite Laminates

    NASA Technical Reports Server (NTRS)

    Satyanarayana, Arunkumar; Bogert, Philip B.; Chunchu, Prasad B.

    2007-01-01

    The influence of delamination on the progressing damage path and initial failure load in composite laminates is investigated. Results are presented from a numerical and an experimental study of center-notched tensile-loaded coupons. The numerical study includes two approaches. The first approach considers only intralaminar (fiber breakage and matrix cracking) damage modes in calculating the progression of the damage path. In the second approach, the model is extended to consider the effect of interlaminar (delamination) damage modes in addition to the intralaminar damage modes. The intralaminar damage is modeled using progressive damage analysis (PDA) methodology implemented with the VUMAT subroutine in the ABAQUS finite element code. The interlaminar damage mode has been simulated using cohesive elements in ABAQUS. In the experimental study, 2-3 specimens each of two different stacking sequences of center-notched laminates are tensile loaded. The numerical results from the two different modeling approaches are compared with each other and the experimentally observed results for both laminate types. The comparisons reveal that the second modeling approach, where the delamination damage mode is included together with the intralaminar damage modes, better simulates the experimentally observed damage modes and damage paths, which were characterized by splitting failures perpendicular to the notch tips in one or more layers. Additionally, the inclusion of the delamination mode resulted in a better prediction of the loads at which the failure took place, which were higher than those predicted by the first modeling approach which did not include delaminations.

  16. Predicting the effect of cytochrome P450 inhibitors on substrate drugs: analysis of physiologically based pharmacokinetic modeling submissions to the US Food and Drug Administration.

    PubMed

    Wagner, Christian; Pan, Yuzhuo; Hsu, Vicky; Grillo, Joseph A; Zhang, Lei; Reynolds, Kellie S; Sinha, Vikram; Zhao, Ping

    2015-01-01

    The US Food and Drug Administration (FDA) has seen a recent increase in the application of physiologically based pharmacokinetic (PBPK) modeling towards assessing the potential of drug-drug interactions (DDI) in clinically relevant scenarios. To continue our assessment of such approaches, we evaluated the predictive performance of PBPK modeling in predicting cytochrome P450 (CYP)-mediated DDI. This evaluation was based on 15 substrate PBPK models submitted by nine sponsors between 2009 and 2013. For these 15 models, a total of 26 DDI studies (cases) with various CYP inhibitors were available. Sponsors developed the PBPK models, reportedly without considering clinical DDI data. Inhibitor models were either developed by sponsors or provided by PBPK software developers and applied with minimal or no modification. The metric for assessing predictive performance of the sponsors' PBPK approach was the R predicted/observed value (R predicted/observed = [predicted mean exposure ratio]/[observed mean exposure ratio], with the exposure ratio defined as [Cmax (maximum plasma concentration) or AUC (area under the plasma concentration-time curve) in the presence of CYP inhibition]/[Cmax or AUC in the absence of CYP inhibition]). In 81% (21/26) and 77% (20/26) of cases, respectively, the R predicted/observed values for AUC and Cmax ratios were within a pre-defined threshold of 1.25-fold of the observed data. For all cases, the R predicted/observed values for AUC and Cmax were within a 2-fold range. These results suggest that, based on the submissions to the FDA to date, there is a high degree of concordance between PBPK-predicted and observed effects of CYP inhibition, especially CYP3A-based, on the exposure of drug substrates.
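
    The R predicted/observed metric defined in this abstract is a ratio of two exposure ratios, and the fold-range check can be written out as a small Python sketch. The function names and the example numbers are hypothetical, and treating "within 1.25-fold" as the symmetric interval from 1/1.25 to 1.25 is an assumption, since the abstract does not state the lower bound.

```python
def exposure_ratio(with_inhibitor, without_inhibitor):
    """AUC (or Cmax) with CYP inhibition divided by AUC (or Cmax) without it."""
    return with_inhibitor / without_inhibitor

def r_predicted_observed(predicted_ratio, observed_ratio):
    """Predicted mean exposure ratio divided by observed mean exposure ratio."""
    return predicted_ratio / observed_ratio

def within_fold(r, fold):
    """True if r lies in [1/fold, fold] (symmetric interpretation, assumed)."""
    return 1.0 / fold <= r <= fold

# Hypothetical case: the model predicts a 3.0-fold AUC increase, the clinic observes 2.0-fold.
predicted = exposure_ratio(with_inhibitor=300.0, without_inhibitor=100.0)
observed = exposure_ratio(with_inhibitor=200.0, without_inhibitor=100.0)
r = r_predicted_observed(predicted, observed)
print(r, within_fold(r, 1.25), within_fold(r, 2.0))
```

    In this made-up case the prediction misses the 1.25-fold criterion but still falls inside the 2-fold range, which is the situation of the minority of cases described in the abstract.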

  17. Stochastic Earthquake Rupture Modeling Using Nonparametric Co-Regionalization

    NASA Astrophysics Data System (ADS)

    Lee, Kyungbook; Song, Seok Goo

    2017-09-01

    Accurate predictions of the intensity and variability of ground motions are essential in simulation-based seismic hazard assessment. Advanced simulation-based ground motion prediction methods have been proposed to complement the empirical approach, which suffers from the lack of observed ground motion data, especially in the near-source region for large events. It is important to quantify the variability of the earthquake rupture process for future events and to produce a number of rupture scenario models to capture the variability in simulation-based ground motion predictions. In this study, we improved the previously developed stochastic earthquake rupture modeling method by applying the nonparametric co-regionalization, which was proposed in geostatistics, to the correlation models estimated from dynamically derived earthquake rupture models. The nonparametric approach adopted in this study is computationally efficient and, therefore, enables us to simulate numerous rupture scenarios, including large events (M > 7.0). It also gives us an opportunity to check the shape of true input correlation models in stochastic modeling after being deformed for permissibility. We expect that this type of modeling will improve our ability to simulate a wide range of rupture scenario models and thereby predict ground motions and perform seismic hazard assessment more accurately.

  18. Quantifying the uncertainty of nonpoint source attribution in distributed water quality models: A Bayesian assessment of SWAT's sediment export predictions

    NASA Astrophysics Data System (ADS)

    Wellen, Christopher; Arhonditsis, George B.; Long, Tanya; Boyd, Duncan

    2014-11-01

    Spatially distributed nonpoint source watershed models are essential tools to estimate the magnitude and sources of diffuse pollution. However, little work has been undertaken to understand the sources and ramifications of the uncertainty involved in their use. In this study we conduct the first Bayesian uncertainty analysis of the water quality components of the SWAT model, one of the most commonly used distributed nonpoint source models. Working in Southern Ontario, we apply three Bayesian configurations for calibrating SWAT to Redhill Creek, an urban catchment, and Grindstone Creek, an agricultural one. We answer four interrelated questions: can SWAT determine suspended sediment sources with confidence when end of basin data is used for calibration? How does uncertainty propagate from the discharge submodel to the suspended sediment submodels? Do the estimated sediment sources vary when different calibration approaches are used? Can we combine the knowledge gained from different calibration approaches? We show that: (i) despite reasonable fit at the basin outlet, the simulated sediment sources are subject to uncertainty sufficient to undermine the typical approach of reliance on a single, best fit simulation; (ii) more than a third of the uncertainty of sediment load predictions may stem from the discharge submodel; (iii) estimated sediment sources do vary significantly across the three statistical configurations of model calibration despite end-of-basin predictions being virtually identical; and (iv) Bayesian model averaging is an approach that can synthesize predictions when a number of adequate distributed models make divergent source apportionments. We conclude with recommendations for future research to reduce the uncertainty encountered when using distributed nonpoint source models for source apportionment.

  19. An Imbalance of Approach and Effortful Control Predicts Externalizing Problems: Support for Extending the Dual-Systems Model into Early Childhood.

    PubMed

    Jonas, Katherine; Kochanska, Grazyna

    2018-01-25

    Although the association between deficits in effortful control and later externalizing behavior is well established, many researchers (Nigg Journal of Child Psychology and Psychiatry, 47(3-4), 395-422, 2006; Steinberg Developmental Review, 28(1), 78-106, 2008) have hypothesized this association is actually the product of the imbalance of dual systems, or two underlying traits: approach and self-regulation. Very little research, however, has deployed a statistically robust strategy to examine that compelling model; further, no research has done so using behavioral measures, particularly in longitudinal studies. We examined the imbalance of approach and self-regulation (effortful control, EC) as predicting externalizing problems. Latent trait models of approach and EC were derived from behavioral measures collected from 102 children in a community sample at 25, 38, 52, and 67 months (2 to 5 ½ years), and used to predict externalizing behaviors, modeled as a latent trait derived from parent-reported measures at 80, 100, 123, and 147 months (6 ½ to 12 years). The imbalance hypothesis was supported: Children with an imbalance of approach and EC had more externalizing behavior problems in middle childhood and early preadolescence, relative to children with equal levels of the two traits.

  20. An analytics approach to designing patient centered medical homes.

    PubMed

    Ajorlou, Saeede; Shams, Issac; Yang, Kai

    2015-03-01

    Recently the patient centered medical home (PCMH) model has become a popular team based approach focused on delivering more streamlined care to patients. In current practices of medical homes, a clinical based prediction frame is recommended because it can help match the portfolio capacity of PCMH teams with the actual load generated by a set of patients. Without such balances in clinical supply and demand, issues such as excessive under and over utilization of physicians, long waiting time for receiving the appropriate treatment, and non-continuity of care will eliminate many advantages of the medical home strategy. In this paper, by using the hierarchical generalized linear model with multivariate responses, we develop a clinical workload prediction model for care portfolio demands in a Bayesian framework. The model allows for heterogeneous variances and unstructured covariance matrices for nested random effects that arise through complex hierarchical care systems. We show that using a multivariate approach substantially enhances the precision of workload predictions at both primary and non primary care levels. We also demonstrate that care demands depend not only on patient demographics but also on other utilization factors, such as length of stay. Our analyses of recent data from the Veterans Health Administration further indicate that risk adjustment for patient health conditions can considerably improve the prediction power of the model.

  1. A Hybrid Physics-Based Data-Driven Approach for Point-Particle Force Modeling

    NASA Astrophysics Data System (ADS)

    Moore, Chandler; Akiki, Georges; Balachandar, S.

    2017-11-01

    This study improves upon the physics-based pairwise interaction extended point-particle (PIEP) model. The PIEP model leverages a physical framework to predict fluid mediated interactions between solid particles. While the PIEP model is a powerful tool, its pairwise assumption leads to increased error in flows with high particle volume fractions. To reduce this error, a regression algorithm is used to model the differences between the current PIEP model's predictions and the results of direct numerical simulations (DNS) for an array of monodisperse solid particles subjected to various flow conditions. The resulting statistical model and the physical PIEP model are superimposed to construct a hybrid, physics-based data-driven PIEP model. It must be noted that the performance of a pure data-driven approach without the model form provided by the physical PIEP model is substantially inferior. The hybrid model's predictive capabilities are analyzed using additional DNS. In every case tested, the hybrid PIEP model's predictions are more accurate than those of the physical PIEP model. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-1315138 and the U.S. DOE, NNSA, ASC Program, as a Cooperative Agreement under Contract No. DE-NA0002378.
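    The superposition idea in this abstract (fit a regression to the gap between a physical model and reference simulations, then add the fitted correction back) can be sketched as follows. Everything here is a toy assumption: `physical_model` is a hypothetical stand-in rather than the PIEP model, the reference values are synthetic rather than DNS results, and a one-variable least-squares line replaces whatever regression algorithm the authors used. The comparison is made on the training points only.

```python
def physical_model(phi):
    """Hypothetical physics-based prediction as a function of volume fraction phi."""
    return 1.0 + 2.0 * phi


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def rmse(model, xs, ys):
    """Root-mean-square error of model(x) against the reference values ys."""
    return (sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5


# Synthetic reference data standing in for DNS results (an assumption).
phi_train = [0.1, 0.2, 0.3, 0.4]
reference = [1.0 + 2.0 * p + 5.0 * p * p for p in phi_train]

# Step 1: residuals between the reference data and the physical model.
residuals = [r - physical_model(p) for p, r in zip(phi_train, reference)]

# Step 2: regress the residuals on the input.
a, b = fit_line(phi_train, residuals)


# Step 3: superimpose the physical model and the fitted correction.
def hybrid_model(phi):
    return physical_model(phi) + a + b * phi


print(rmse(physical_model, phi_train, reference), rmse(hybrid_model, phi_train, reference))
```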

  2. A three-step approach for the derivation and validation of high-performing predictive models using an operational dataset: congestive heart failure readmission case study.

    PubMed

    AbdelRahman, Samir E; Zhang, Mingyuan; Bray, Bruce E; Kawamoto, Kensaku

    2014-05-27

    The aim of this study was to propose an analytical approach to develop high-performing predictive models for congestive heart failure (CHF) readmission using an operational dataset with incomplete records and changing data over time. Our analytical approach involves three steps: pre-processing, systematic model development, and risk factor analysis. For pre-processing, variables that were absent in >50% of records were removed. Moreover, the dataset was divided into a validation dataset and derivation datasets which were separated into three temporal subsets based on changes to the data over time. For systematic model development, using the different temporal datasets and the remaining explanatory variables, the models were developed by combining the use of various (i) statistical analyses to explore the relationships between the validation and the derivation datasets; (ii) adjustment methods for handling missing values; (iii) classifiers; (iv) feature selection methods; and (v) discretization methods. We then selected the best derivation dataset and the models with the highest predictive performance. For risk factor analysis, factors in the highest-performing predictive models were analyzed and ranked using (i) statistical analyses of the best derivation dataset, (ii) feature rankers, and (iii) a newly developed algorithm to categorize risk factors as being strong, regular, or weak. The analysis dataset consisted of 2,787 CHF hospitalizations at University of Utah Health Care from January 2003 to June 2013. In this study, we used the complete-case analysis and mean-based imputation adjustment methods; the wrapper subset feature selection method; and four ranking strategies based on information gain, gain ratio, symmetrical uncertainty, and wrapper subset feature evaluators.
The best-performing models resulted from the use of a complete-case analysis derivation dataset combined with the Class-Attribute Contingency Coefficient discretization method and a voting classifier which averaged the results of multinomial logistic regression and voting feature intervals classifiers. Of 42 final model risk factors, discharge disposition, discretized age, and indicators of anemia were the most significant. This model achieved a c-statistic of 86.8%. The proposed three-step analytical approach enhanced predictive model performance for CHF readmissions. It could potentially be leveraged to improve predictive model performance in other areas of clinical medicine.
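    Two ideas in this abstract lend themselves to a short sketch: averaging the outputs of two classifiers, and scoring the result with the c-statistic (the share of event/non-event pairs ranked correctly, with ties counted as one half). The probabilities and labels below are hypothetical, and simple probability averaging is an assumption about how the voting classifier combined its members.

```python
def average_vote(prob_a, prob_b):
    """Average the predicted event probabilities of two classifiers."""
    return [(a + b) / 2.0 for a, b in zip(prob_a, prob_b)]


def c_statistic(scores, labels):
    """Concordance: fraction of (event, non-event) pairs where the event scores higher."""
    events = [s for s, y in zip(scores, labels) if y == 1]
    non_events = [s for s, y in zip(scores, labels) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5  # ties count as half a concordant pair
    return concordant / (len(events) * len(non_events))


# Hypothetical readmission labels (1 = readmitted) and classifier outputs.
labels = [1, 0, 1, 0, 0]
prob_a = [0.9, 0.4, 0.3, 0.2, 0.5]
prob_b = [0.7, 0.2, 0.6, 0.3, 0.1]

combined = average_vote(prob_a, prob_b)
print(c_statistic(prob_a, labels), c_statistic(combined, labels))
```

    In this toy data the first classifier alone misranks two of the six pairs, while the averaged scores rank every pair correctly; real data will rarely behave this cleanly.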

  3. Multi-Dimensional Calibration of Impact Dynamic Models

    NASA Technical Reports Server (NTRS)

    Horta, Lucas G.; Reaves, Mercedes C.; Annett, Martin S.; Jackson, Karen E.

    2011-01-01

    NASA Langley, under the Subsonic Rotary Wing Program, recently completed two helicopter tests in support of an in-house effort to study crashworthiness. As part of this effort, work is on-going to investigate model calibration approaches and calibration metrics for impact dynamics models. Model calibration of impact dynamics problems has traditionally assessed model adequacy by comparing time histories from analytical predictions to test at only a few critical locations. Although this approach provides for a direct measure of the model predictive capability, overall system behavior is only qualitatively assessed using full vehicle animations. In order to understand the spatial and temporal relationships of impact loads as they migrate throughout the structure, a more quantitative approach is needed. In this work impact shapes derived from simulated time history data are used to recommend sensor placement and to assess model adequacy using time based metrics and orthogonality multi-dimensional metrics. An approach for model calibration is presented that includes metric definitions, uncertainty bounds, parameter sensitivity, and numerical optimization to estimate parameters to reconcile test with analysis. The process is illustrated using simulated experiment data.

  4. Prediction of Transitional Flows in the Low Pressure Turbine

    NASA Technical Reports Server (NTRS)

    Huang, George; Xiong, Guohua

    1998-01-01

    Current turbulence models tend to give too early and too short a length of flow transition to turbulence, and hence fail to predict flow separation induced by the adverse pressure gradients and streamline flow curvatures. Our discussion will focus on the development and validation of transition models. The baseline data for model comparisons are the T3 series, which include a range of free-stream turbulence intensity and cover zero-pressure gradient to aft-loaded turbine pressure gradient flows. The method will be based on the conditioned N-S equations and a transport equation for the intermittency factor. First, several of the most popular 2-equation models in predicting flow transition are examined: k-ε [Launder-Sharma], k-ω [Wilcox], Lien-Leschziner and SST [Menter] models. All models fail to predict the onset and the length of transition, even for the simplest flat plate with zero-pressure gradient (T3A). Although the predicted onset position of transition can be varied by providing different inlet turbulent energy dissipation rates, the appropriate inlet conditions for turbulence quantities should be adjusted to match the decay of the free-stream turbulence. Arguably, one may adjust the low-Reynolds-number part of the model to predict transition. This approach has so far not been very successful. However, we have found that the low-Reynolds-number model of Launder and Sharma [1974], which is an improved version of Jones and Launder [1972], gave the best overall performance. The Launder and Sharma model was designed to capture flow re-laminarization (a reverse of flow transition), but tends to give rise to a too early and too fast transition in comparison with the physical transition. The three test cases were for flows with zero pressure gradient but with different free-stream turbulent intensities. The same can be said about the model when considering flows subject to pressure gradient (T3C1).
To capture the effects of transition using existing turbulence models, one approach is to make use of the concept of the intermittency to predict the flow transition. It was originally based on the intermittency distribution of Narasimha [1957], and then gradually evolved into a transport equation for the intermittency factor. Gostelow and associates [1994, 1995] have made some improvements to Narasimha's method in an attempt to account for both favorable and adverse pressure gradients. Their approach is based on a linear, explicit combination of laminar and turbulent solutions. This approach fails to predict the overshoot of the skin friction on a flat plate near the end of the transition zone, even though the length of transition is well predicted. The major flaw of Gostelow's approach is that it assumes that the non-turbulent part is the laminar solution and the turbulent part is the turbulent solution, and that they do not interact across the transitional region. The technique of conditionally averaging the flow equations in intermittent flows was first introduced by Libby [1975] and Dopazo [1977] and further refined by Dick and associates [1988, 1996]. This approach employs two sets of transport equations, one for the non-turbulent part and the other for the turbulent part. The advantage of this approach is that it allows the interaction of non-turbulent and turbulent velocities through the introduction of additional source terms in the continuity and momentum equations for the non-turbulent and turbulent velocities. However, the strong coupling of the two sets of equations has caused some numerical difficulties, which require special attention. The prediction of the skin friction can be improved by this approach via the implicit coupling of non-turbulent and turbulent velocity fields. A further improvement of the intermittency model can be made by allowing the intermittency to vary in the cross-stream direction.
This is one step prior to testing any proposal for the transport equation for the intermittency factor. Instead of solving the transport equation for the intermittency factor, the distribution for the intermittency factor is prescribed by Klebanoff's empirical formula [1955]. The skin friction is very well predicted by this new modification, including the overshoot of the profile near the end of the transition zone. The outcome of this study is very encouraging since it indicates that the proper description of the intermittency distribution is the key to the success of the model prediction. This study will be used to guide us on the modelling of the intermittency transport equation.

  5. Quantifying uncertainties in streamflow predictions through signature based inference of hydrological model parameters

    NASA Astrophysics Data System (ADS)

    Fenicia, Fabrizio; Reichert, Peter; Kavetski, Dmitri; Albert, Carlo

    2016-04-01

    The calibration of hydrological models based on signatures (e.g. Flow Duration Curves - FDCs) is often advocated as an alternative to model calibration based on the full time series of system responses (e.g. hydrographs). Signature based calibration is motivated by various arguments. From a conceptual perspective, calibration on signatures is a way to filter out errors that are difficult to represent when calibrating on the full time series. Such errors may for example occur when observed and simulated hydrographs are shifted, either on the "time" axis (i.e. left or right), or on the "streamflow" axis (i.e. above or below). These shifts may be due to errors in the precipitation input (time or amount), and if not properly accounted in the likelihood function, may cause biased parameter estimates (e.g. estimated model parameters that do not reproduce the recession characteristics of a hydrograph). From a practical perspective, signature based calibration is seen as a possible solution for making predictions in ungauged basins. Where streamflow data are not available, it may in fact be possible to reliably estimate streamflow signatures. Previous research has for example shown how FDCs can be reliably estimated at ungauged locations based on climatic and physiographic influence factors. Typically, the goal of signature based calibration is not the prediction of the signatures themselves, but the prediction of the system responses. Ideally, the prediction of system responses should be accompanied by a reliable quantification of the associated uncertainties. Previous approaches for signature based calibration, however, do not allow reliable estimates of streamflow predictive distributions. Here, we illustrate how the Bayesian approach can be employed to obtain reliable streamflow predictive distributions based on signatures. A case study is presented, where a hydrological model is calibrated on FDCs and additional signatures. 
We propose an approach where the likelihood function for the signatures is derived from the likelihood for streamflow (rather than using an "ad-hoc" likelihood for the signatures as done in previous approaches). This likelihood is not easily tractable analytically and we therefore cannot apply "simple" MCMC methods. This numerical problem is solved using Approximate Bayesian Computation (ABC). Our results indicate that the proposed approach is suitable for producing reliable streamflow predictive distributions based on calibration to signature data. Moreover, our results provide indications on which signatures are more appropriate to represent the information content of the hydrograph.
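    The general shape of ABC calibration on a signature can be sketched with a generic rejection scheme. This is not the authors' algorithm: the one-parameter "model" (exponentially distributed daily flows), the uniform prior, the three-point flow duration signature, and the rule of keeping the closest draws are all illustrative assumptions.

```python
import random


def simulate_flows(scale, n_days, rng):
    """Toy stand-in for a hydrological model: exponential flows with mean `scale`."""
    return [rng.expovariate(1.0 / scale) for _ in range(n_days)]


def flow_duration_signature(flows, exceedance=(0.1, 0.5, 0.9)):
    """Flows exceeded with the given probabilities (points on a flow duration curve)."""
    ordered = sorted(flows, reverse=True)
    return [ordered[int(p * (len(ordered) - 1))] for p in exceedance]


def abc_rejection(observed_sig, n_draws, keep, prior=(0.5, 10.0), n_days=365, seed=0):
    """Draw parameters from a uniform prior and keep the `keep` draws whose
    simulated signature lies closest to the observed one (Euclidean distance)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        scale = rng.uniform(*prior)
        sig = flow_duration_signature(simulate_flows(scale, n_days, rng))
        dist = sum((s - o) ** 2 for s, o in zip(sig, observed_sig)) ** 0.5
        draws.append((dist, scale))
    draws.sort()
    return [scale for _, scale in draws[:keep]]


# Synthetic "observations" generated with a known scale of 3.0 (an assumption).
observed_sig = flow_duration_signature(simulate_flows(3.0, 365, random.Random(1)))
posterior = abc_rejection(observed_sig, n_draws=500, keep=25)
print(f"approximate posterior mean of scale: {sum(posterior) / len(posterior):.2f}")
```

    Keeping a fixed number of nearest draws, rather than a fixed distance tolerance, guarantees a non-empty sample; the implied tolerance then depends on the number of draws.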

  6. Development of a coupled hydrological - hydrodynamic model for probabilistic catchment flood inundation modelling

    NASA Astrophysics Data System (ADS)

    Quinn, Niall; Freer, Jim; Coxon, Gemma; Dunne, Toby; Neal, Jeff; Bates, Paul; Sampson, Chris; Smith, Andy; Parkin, Geoff

    2017-04-01

    Computationally efficient flood inundation modelling systems capable of representing important hydrological and hydrodynamic flood generating processes over relatively large regions are vital for those interested in flood preparation, response, and real time forecasting. However, such systems are currently not readily available. This can be particularly important where flood predictions from intense rainfall are considered as the processes leading to flooding often involve localised, non-linear spatially connected hillslope-catchment responses. Therefore, this research introduces a novel hydrological-hydraulic modelling framework for the provision of probabilistic flood inundation predictions across catchment to regional scales that explicitly account for spatial variability in rainfall-runoff and routing processes. Approaches have been developed to automate the provision of required input datasets and estimate essential catchment characteristics from freely available, national datasets. This is an essential component of the framework as when making predictions over multiple catchments or at relatively large scales, and where data is often scarce, obtaining local information and manually incorporating it into the model quickly becomes infeasible. An extreme flooding event in the town of Morpeth, NE England, in 2008 was used as a first case study evaluation of the modelling framework introduced. The results demonstrated a high degree of prediction accuracy when comparing modelled and reconstructed event characteristics for the event, while the efficiency of the modelling approach used enabled the generation of relatively large ensembles of realisations from which uncertainty within the prediction may be represented. 
This research supports previous literature highlighting the importance of probabilistic forecasting, particularly during extreme events, which can often be poorly characterised or even missed by deterministic predictions due to the inherent uncertainty in any model application. Future research will aim to further evaluate the robustness of the approaches introduced by applying the modelling framework to a variety of historical flood events across UK catchments. Furthermore, the flexibility and efficiency of the framework are ideally suited to the examination of the propagation of errors through the model, which will help gain a better understanding of the dominant sources of uncertainty currently impacting flood inundation predictions.

  7. Seasonal prediction of East Asian summer rainfall using a multi-model ensemble system

    NASA Astrophysics Data System (ADS)

    Ahn, Joong-Bae; Lee, Doo-Young; Yoo, Jin‑Ho

    2015-04-01

    Using the retrospective forecasts of seven state-of-the-art coupled models and their multi-model ensemble (MME) for boreal summers, the prediction skills of climate models in the western tropical Pacific (WTP) and East Asian region are assessed. The prediction of summer rainfall anomalies in East Asia is difficult, while the WTP has a strong correlation between model prediction and observation. We focus on developing a new approach to further enhance the seasonal prediction skill for summer rainfall in East Asia and investigate the influence of convective activity in the WTP on East Asian summer rainfall. By analyzing the characteristics of the WTP convection, two distinct patterns associated with El Niño-Southern Oscillation developing and decaying modes are identified. Based on the multiple linear regression method, the East Asia Rainfall Index (EARI) is developed by using the interannual variability of the normalized Maritime continent-WTP Indices (MPIs), as potentially useful predictors for rainfall prediction over East Asia, obtained from the above two main patterns. For East Asian summer rainfall, the EARI has superior performance to the East Asia summer monsoon index or each MPI. Therefore, the regressed rainfall from EARI also shows a strong relationship with the observed East Asian summer rainfall pattern. In addition, we evaluate the prediction skill of the East Asia reconstructed rainfall obtained by hybrid dynamical-statistical approach using the cross-validated EARI from the individual models and their MME. The results show that the rainfalls reconstructed from simulations capture the general features of observed precipitation in East Asia quite well. This study convincingly demonstrates that rainfall prediction skill is considerably improved by using a hybrid dynamical-statistical approach compared to the dynamical forecast alone. 
Acknowledgements This work was carried out with the support of Rural Development Administration Cooperative Research Program for Agriculture Science and Technology Development under grant project PJ009353 and Korea Meteorological Administration Research and Development Program under grant CATER 2012-3100, Republic of Korea.

  8. A Bayesian Hierarchical Modeling Approach to Predicting Flow in Ungauged Basins

    NASA Astrophysics Data System (ADS)

    Gronewold, A.; Alameddine, I.; Anderson, R. M.

    2009-12-01

    Recent innovative approaches to identifying and applying regression-based relationships between land use patterns (such as increasing impervious surface area and decreasing vegetative cover) and rainfall-runoff model parameters represent novel and promising improvements to predicting flow from ungauged basins. In particular, these approaches allow for predicting flows under uncertain and potentially variable future conditions due to rapid land cover changes, variable climate conditions, and other factors. Despite the broad range of literature on estimating rainfall-runoff model parameters, however, the absence of a robust set of modeling tools for identifying and quantifying uncertainties in (and correlation between) rainfall-runoff model parameters represents a significant gap in current hydrological modeling research. Here, we build upon a series of recent publications promoting novel Bayesian and probabilistic modeling strategies for quantifying rainfall-runoff model parameter estimation uncertainty. Our approach applies alternative measures of rainfall-runoff model parameter joint likelihood (including Nash-Sutcliffe efficiency, among others) to simulate samples from the joint parameter posterior probability density function. We then use these correlated samples as response variables in a Bayesian hierarchical model with land use coverage data as predictor variables in order to develop a robust land use-based tool for forecasting flow in ungauged basins while accounting for, and explicitly acknowledging, parameter estimation uncertainty. We apply this modeling strategy to low-relief coastal watersheds of Eastern North Carolina, an area representative of coastal resource waters throughout the world because of its sensitive embayments and because of the abundant (but currently threatened) natural resources it hosts. 
Consequently, this area is the subject of several ongoing studies and large-scale planning initiatives, including those conducted through the United States Environmental Protection Agency (USEPA) total maximum daily load (TMDL) program, as well as those addressing coastal population dynamics and sea level rise. Our approach has several advantages, including the propagation of parameter uncertainty through a nonparametric probability distribution which avoids common pitfalls of fitting parameters and model error structure to a predetermined parametric distribution function. In addition, by explicitly acknowledging correlation between model parameters (and reflecting those correlations in our predictive model) our model yields relatively efficient prediction intervals (unlike those in the current literature which are often unnecessarily large, and may lead to overly-conservative management actions). Finally, our model helps improve understanding of the rainfall-runoff process by identifying model parameters (and associated catchment attributes) which are most sensitive to current and future land use change patterns. Disclaimer: Although this work was reviewed by EPA and approved for publication, it may not necessarily reflect official Agency policy.
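    The Nash-Sutcliffe efficiency mentioned in this abstract is one minus the ratio of the squared simulation error to the variance of the observations about their mean. A minimal sketch follows; the flow values are hypothetical and the function name is an assumption.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect fit; 0.0 means the simulation is no better than
    predicting the mean of the observations; negative values are worse.
    """
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error / spread


# Hypothetical observed and simulated flows.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.5, 2.0, 2.5, 4.0]
print(nash_sutcliffe(observed, simulated))
```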

  9. A Novel Approach for Blast-Induced Flyrock Prediction Based on Imperialist Competitive Algorithm and Artificial Neural Network

    PubMed Central

    Marto, Aminaton; Jahed Armaghani, Danial; Tonnizam Mohamad, Edy; Makhtar, Ahmad Mahir

    2014-01-01

    Flyrock is one of the major disturbances induced by blasting which may cause severe damage to nearby structures. This phenomenon has to be precisely predicted and subsequently controlled through the changing in the blast design to minimize potential risk of blasting. The scope of this study is to predict flyrock induced by blasting through a novel approach based on the combination of imperialist competitive algorithm (ICA) and artificial neural network (ANN). For this purpose, the parameters of 113 blasting operations were accurately recorded and flyrock distances were measured for each operation. By applying the sensitivity analysis, maximum charge per delay and powder factor were determined as the most influential parameters on flyrock. In the light of this analysis, two new empirical predictors were developed to predict flyrock distance. For a comparison purpose, a predeveloped backpropagation (BP) ANN was developed and the results were compared with those of the proposed ICA-ANN model and empirical predictors. The results clearly showed the superiority of the proposed ICA-ANN model in comparison with the proposed BP-ANN model and empirical approaches. PMID:25147856

  10. Longitudinal Study-Based Dementia Prediction for Public Health

    PubMed Central

    Kim, HeeChel; Chun, Hong-Woo; Kim, Seonho; Coh, Byoung-Youl; Kwon, Oh-Jin; Moon, Yeong-Ho

    2017-01-01

    The issue of public health in Korea has attracted significant attention given the aging of the country’s population, which has created many types of social problems. The approach proposed in this article aims to address dementia, one of the most significant symptoms of aging and a public health care issue in Korea. The Korean National Health Insurance Service Senior Cohort Database contains personal medical data of every citizen in Korea. There are many different medical history patterns between individuals with dementia and normal controls. The approach used in this study involved examination of personal medical history features from personal disease history, sociodemographic data, and personal health examinations to develop a prediction model. The prediction model used a support-vector machine learning technique to perform a 10-fold cross-validation analysis. The experimental results demonstrated promising performance (80.9% F-measure). The proposed approach supported the significant influence of personal medical history features during an optimal observation period. It is anticipated that a biomedical “big data”-based disease prediction model may assist the diagnosis of any disease more correctly. PMID:28867810

  11. A novel approach for blast-induced flyrock prediction based on imperialist competitive algorithm and artificial neural network.

    PubMed

    Marto, Aminaton; Hajihassani, Mohsen; Armaghani, Danial Jahed; Mohamad, Edy Tonnizam; Makhtar, Ahmad Mahir

    2014-01-01

    Flyrock is one of the major disturbances induced by blasting which may cause severe damage to nearby structures. This phenomenon has to be precisely predicted and subsequently controlled through the changing in the blast design to minimize potential risk of blasting. The scope of this study is to predict flyrock induced by blasting through a novel approach based on the combination of imperialist competitive algorithm (ICA) and artificial neural network (ANN). For this purpose, the parameters of 113 blasting operations were accurately recorded and flyrock distances were measured for each operation. By applying the sensitivity analysis, maximum charge per delay and powder factor were determined as the most influential parameters on flyrock. In the light of this analysis, two new empirical predictors were developed to predict flyrock distance. For a comparison purpose, a predeveloped backpropagation (BP) ANN was developed and the results were compared with those of the proposed ICA-ANN model and empirical predictors. The results clearly showed the superiority of the proposed ICA-ANN model in comparison with the proposed BP-ANN model and empirical approaches.

  12. Predicting pKa values from EEM atomic charges

    PubMed Central

    2013-01-01

    The acid dissociation constant pKa is a very important molecular property, and there is a strong interest in the development of reliable and fast methods for pKa prediction. We have evaluated the pKa prediction capabilities of QSPR models based on empirical atomic charges calculated by the Electronegativity Equalization Method (EEM). Specifically, we collected 18 EEM parameter sets created for 8 different quantum mechanical (QM) charge calculation schemes. Afterwards, we prepared a training set of 74 substituted phenols. Additionally, for each molecule we generated its dissociated form by removing the phenolic hydrogen. For all the molecules in the training set, we then calculated EEM charges using the 18 parameter sets, and the QM charges using the 8 above mentioned charge calculation schemes. For each type of QM and EEM charges, we created one QSPR model employing charges from the non-dissociated molecules (three descriptor QSPR models), and one QSPR model based on charges from both dissociated and non-dissociated molecules (QSPR models with five descriptors). Afterwards, we calculated the quality criteria and evaluated all the QSPR models obtained. We found that QSPR models employing the EEM charges proved to be a good approach for the prediction of pKa (63% of these models had R² > 0.9, while the best had R² = 0.924). As expected, QM QSPR models provided more accurate pKa predictions than the EEM QSPR models but the differences were not significant. Furthermore, a big advantage of the EEM QSPR models is that their descriptors (i.e., EEM atomic charges) can be calculated markedly faster than the QM charge descriptors. Moreover, we found that the EEM QSPR models are not so strongly influenced by the selection of the charge calculation approach as the QM QSPR models. The robustness of the EEM QSPR models was subsequently confirmed by cross-validation.
The applicability of EEM QSPR models for other chemical classes was illustrated by a case study focused on carboxylic acids. In summary, EEM QSPR models constitute a fast and accurate pKa prediction approach that can be used in virtual screening. PMID:23574978

  13. Predicting landscape vegetation dynamics using state-and-transition simulation models

    Treesearch

    Colin J. Daniel; Leonardo Frid

    2012-01-01

    This paper outlines how state-and-transition simulation models (STSMs) can be used to project changes in vegetation over time across a landscape. STSMs are stochastic, empirical simulation models that use an adapted Markov chain approach to predict how vegetation will transition between states over time, typically in response to interactions between succession,...
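
    The Markov-chain idea behind an STSM can be illustrated with a minimal sketch. The state names, transition probabilities, and function names below are hypothetical placeholders chosen for illustration; they are not taken from the paper, and a real STSM would typically add spatial structure, ages, and disturbance-driven transitions.

```python
import random

# Hypothetical vegetation states and annual transition probabilities.
# Illustrative values only; each row must sum to 1.
TRANSITIONS = {
    "grass": {"grass": 0.80, "shrub": 0.20},
    "shrub": {"grass": 0.05, "shrub": 0.75, "forest": 0.20},
    "forest": {"grass": 0.02, "forest": 0.98},
}


def step(state, rng):
    """Draw the next state of one landscape cell from its transition row."""
    targets = list(TRANSITIONS[state])
    weights = [TRANSITIONS[state][t] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]


def simulate(initial_cells, years, seed=0):
    """Return the list of landscape snapshots, one per year (year 0 included)."""
    rng = random.Random(seed)
    cells = list(initial_cells)
    history = [cells.copy()]
    for _ in range(years):
        cells = [step(c, rng) for c in cells]
        history.append(cells.copy())
    return history
```

    Calling `simulate(["grass"] * 100, years=50)` yields one stochastic realization; repeating the call with different seeds gives a distribution of projected landscape compositions.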

  14. Model predictive control of P-time event graphs

    NASA Astrophysics Data System (ADS)

    Hamri, H.; Kara, R.; Amari, S.

    2016-12-01

    This paper deals with model predictive control of discrete event systems modelled by P-time event graphs. First, the model is obtained by using the dater evolution model written in the standard algebra. Then, for the control law, we used the finite-horizon model predictive control. For the closed-loop control, we used the infinite-horizon model predictive control (IH-MPC). The latter is an approach that calculates static feedback gains that keep the closed-loop system stable while respecting the constraints on the control vector. The IH-MPC problem is formulated as a linear convex programming problem subject to a linear matrix inequality. Finally, the proposed methodology is applied to a transportation system.

  15. Data Assimilation and Propagation of Uncertainty in Multiscale Cardiovascular Simulation

    NASA Astrophysics Data System (ADS)

    Schiavazzi, Daniele; Marsden, Alison

    2015-11-01

    Cardiovascular modeling is the application of computational tools to predict hemodynamics. State-of-the-art techniques couple a 3D incompressible Navier-Stokes solver with a boundary circulation model and can predict local and peripheral hemodynamics, analyze the post-operative performance of surgical designs and complement clinical data collection, minimizing invasive and risky measurement practices. The ability of these tools to make useful predictions is directly related to their accuracy in representing measured physiologies. Tuning of model parameters is therefore a topic of paramount importance and should include clinical data uncertainty, revealing how this uncertainty will affect the predictions. We propose a fully Bayesian, multi-level approach to data assimilation of uncertain clinical data in multiscale circulation models. To reduce the computational cost, we use a stable, condensed approximation of the 3D model built by linear sparse regression of the pressure/flow rate relationship at the outlets. Finally, we consider the problem of non-invasively propagating the uncertainty in model parameters to the resulting hemodynamics and compare Monte Carlo simulation with Stochastic Collocation approaches based on Polynomial or Multi-resolution Chaos expansions.

  16. Jet Noise Modeling for Supersonic Business Jet Application

    NASA Technical Reports Server (NTRS)

    Stone, James R.; Krejsa, Eugene A.; Clark, Bruce J.

    2004-01-01

    This document describes the development of an improved predictive model for coannular jet noise, including noise suppression modifications applicable to small supersonic-cruise aircraft such as the Supersonic Business Jet (SBJ), for NASA Langley Research Center (LaRC). For such aircraft a wide range of propulsion and integration options are under consideration. Thus there is a need for very versatile design tools, including a noise prediction model. The approach used is similar to that used with great success by the Modern Technologies Corporation (MTC) in developing a noise prediction model for two-dimensional mixer ejector (2DME) nozzles under the High Speed Research Program and in developing a more recent model for coannular nozzles over a wide range of conditions. If highly suppressed configurations are ultimately required, the 2DME model is expected to provide reasonable prediction for these smaller scales, although this has not been demonstrated. It is considered likely that more modest suppression approaches, such as dual stream nozzles featuring chevron or chute suppressors, perhaps in conjunction with inverted velocity profiles (IVP), will be sufficient for the SBJ.

  17. Estimating West Nile virus transmission period in Pennsylvania using an optimized degree-day model.

    PubMed

    Chen, Shi; Blanford, Justine I; Fleischer, Shelby J; Hutchinson, Michael; Saunders, Michael C; Thomas, Matthew B

    2013-07-01

    We provide calibrated degree-day models to predict potential West Nile virus (WNV) transmission periods in Pennsylvania. We begin by following the standard approach of treating the degree-days necessary for the virus to complete the extrinsic incubation period (EIP), and mosquito longevity as constants. This approach failed to adequately explain virus transmission periods based on mosquito surveillance data from 4 locations (Harrisburg, Philadelphia, Pittsburgh, and Williamsport) in Pennsylvania from 2002 to 2008. Allowing the EIP and adult longevity to vary across time and space improved model fit substantially. The calibrated models increase the ability to successfully predict the WNV transmission period in Pennsylvania to 70-80% compared to less than 30% in the uncalibrated model. Model validation showed the optimized models to be robust in 3 of the locations, although still showing errors for Philadelphia. These models and methods could provide useful tools to predict WNV transmission period from surveillance datasets, assess potential WNV risk, and make informed mosquito surveillance strategies.
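
    As a rough illustration of the constant-threshold degree-day accumulation that the abstract calls the standard approach, the sketch below sums daily degree-days until a required total is reached. The base temperature, the required total, and the function names are assumptions made for the example and are not the calibrated values from the study; the simple daily-mean method is one of several ways to compute degree-days.

```python
def daily_degree_days(t_min, t_max, base):
    """Degree-days for one day: amount by which the daily mean exceeds `base`."""
    return max(0.0, (t_min + t_max) / 2.0 - base)


def days_to_complete(temps, base, required_dd):
    """Return the 1-based day on which accumulated degree-days reach
    `required_dd`, or None if the total is never reached.

    `temps` is a sequence of (t_min, t_max) pairs, one per day.
    """
    total = 0.0
    for day, (t_min, t_max) in enumerate(temps, start=1):
        total += daily_degree_days(t_min, t_max, base)
        if total >= required_dd:
            return day
    return None
```

    Letting `base` or `required_dd` depend on location or date, rather than being constants, is the kind of relaxation the abstract reports as improving model fit.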

  18. Logical-rule models of classification response times: a synthesis of mental-architecture, random-walk, and decision-bound approaches.

    PubMed

    Fific, Mario; Little, Daniel R; Nosofsky, Robert M

    2010-04-01

    We formalize and provide tests of a set of logical-rule models for predicting perceptual classification response times (RTs) and choice probabilities. The models are developed by synthesizing mental-architecture, random-walk, and decision-bound approaches. According to the models, people make independent decisions about the locations of stimuli along a set of component dimensions. Those independent decisions are then combined via logical rules to determine the overall categorization response. The time course of the independent decisions is modeled via random-walk processes operating along individual dimensions. Alternative mental architectures are used as mechanisms for combining the independent decisions to implement the logical rules. We derive fundamental qualitative contrasts for distinguishing among the predictions of the rule models and major alternative models of classification RT. We also use the models to predict detailed RT-distribution data associated with individual stimuli in tasks of speeded perceptual classification. PsycINFO Database Record (c) 2010 APA, all rights reserved.

  19. A Bayesian network approach to predicting nest presence of the federally-threatened piping plover (Charadrius melodus) using barrier island features

    USGS Publications Warehouse

    Gieder, Katherina D.; Karpanty, Sarah M.; Fraser, James D.; Catlin, Daniel H.; Gutierrez, Benjamin T.; Plant, Nathaniel G.; Turecek, Aaron M.; Thieler, E. Robert

    2014-01-01

    Sea-level rise and human development pose significant threats to shorebirds, particularly for species that utilize barrier island habitat. The piping plover (Charadrius melodus) is a federally-listed shorebird that nests on barrier islands and rapidly responds to changes in its physical environment, making it an excellent species with which to model how shorebird species may respond to habitat change related to sea-level rise and human development. The uncertainty and complexity in predicting sea-level rise, the responses of barrier island habitats to sea-level rise, and the responses of species to sea-level rise and human development necessitate a modelling approach that can link species to the physical habitat features that will be altered by changes in sea level and human development. We used a Bayesian network framework to develop a model that links piping plover nest presence to the physical features of their nesting habitat on a barrier island that is impacted by sea-level rise and human development, using three years of data (1999, 2002, and 2008) from Assateague Island National Seashore in Maryland. Our model performance results showed that we were able to successfully predict nest presence given a wide range of physical conditions within the model’s dataset. We found that model predictions were more successful when the range of physical conditions included in model development was varied rather than when those physical conditions were narrow. We also found that all model predictions had fewer false negatives (nests predicted to be absent when they were actually present in the dataset) than false positives (nests predicted to be present when they were actually absent in the dataset), indicating that our model correctly predicted nest presence better than nest absence. 
    These results indicated that our approach of using a Bayesian network to link specific physical features to nest presence will be useful for modelling impacts of sea-level rise- or human-related habitat change on barrier islands. We recommend that potential users of this method utilize multiple years of data that represent a wide range of physical conditions in model development, because the model performed less well when constructed using a narrow range of physical conditions. Further, given that there will always be some uncertainty in predictions of future physical habitat conditions related to sea-level rise and/or human development, predictive models will perform best when developed using multiple, varied years of data input.

  20. Generalized regression neural network (GRNN)-based approach for colored dissolved organic matter (CDOM) retrieval: case study of Connecticut River at Middle Haddam Station, USA.

    PubMed

    Heddam, Salim

    2014-11-01

    The prediction of colored dissolved organic matter (CDOM) using artificial neural network approaches has received little attention in the past few decades. In this study, CDOM was modeled using generalized regression neural network (GRNN) and multiple linear regression (MLR) models as a function of water temperature (TE), pH, specific conductance (SC), and turbidity (TU). Evaluation of the prediction accuracy of the models is based on the root mean square error (RMSE), mean absolute error (MAE), coefficient of correlation (CC), and Willmott's index of agreement (d). The results indicated that GRNN can be applied successfully for prediction of CDOM.
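
    The four accuracy measures named in the abstract can be computed as in the following sketch. The function names are illustrative, and the index of agreement uses Willmott's original squared-difference form, which is an assumption since the abstract does not say which variant was used.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(_mean([(p - o) ** 2 for o, p in zip(obs, pred)]))


def mae(obs, pred):
    """Mean absolute error."""
    return _mean([abs(p - o) for o, p in zip(obs, pred)])


def corr(obs, pred):
    """Pearson coefficient of correlation (CC)."""
    o_bar, p_bar = _mean(obs), _mean(pred)
    cov = sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, pred))
    var_o = sum((o - o_bar) ** 2 for o in obs)
    var_p = sum((p - p_bar) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def willmott_d(obs, pred):
    """Willmott's index of agreement: 1 means perfect agreement."""
    o_bar = _mean(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - o_bar) + abs(o - o_bar)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den
```

    A constant offset between predictions and observations leaves CC at 1 but lowers d, which is one reason the measures are often reported together.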

  1. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules.

    PubMed

    Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-12-01

    We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.
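
    One simple reading of "Mahalanobis distance to training data" is the distance from a query compound's descriptor vector to the centroid of the training descriptors, scaled by the training covariance. The sketch below shows that reading only; the function name is hypothetical, and the authors may have used a different variant (for example, distances to nearest training points).

```python
import numpy as np


def mahalanobis_to_training(X_train, x):
    """Mahalanobis distance from descriptor vector `x` to the centroid of
    `X_train` (rows are compounds, columns are descriptors)."""
    X_train = np.asarray(X_train, dtype=float)
    mu = X_train.mean(axis=0)
    cov = np.atleast_2d(np.cov(X_train, rowvar=False))
    # The pseudo-inverse tolerates singular covariance matrices,
    # e.g. when descriptors are collinear.
    cov_inv = np.linalg.pinv(cov)
    diff = np.asarray(x, dtype=float) - mu
    return float(np.sqrt(diff @ cov_inv @ diff))
```

    A large distance suggests the query lies outside the region covered by the training set, so a wider error bar (or a warning) may be appropriate for its prediction.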

  2. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules.

    PubMed

    Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-09-01

    We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.

  3. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules

    NASA Astrophysics Data System (ADS)

    Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-12-01

    We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.

  4. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules

    NASA Astrophysics Data System (ADS)

    Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-09-01

    We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.

  5. Robust artificial neural network for reliability and sensitivity analyses of complex non-linear systems.

    PubMed

    Oparaji, Uchenna; Sheu, Rong-Jiun; Bankhead, Mark; Austin, Jonathan; Patelli, Edoardo

    2017-12-01

    Artificial Neural Networks (ANNs) are commonly used in place of expensive models to reduce the computational burden required for uncertainty quantification, reliability and sensitivity analyses. An ANN with a selected architecture is trained with the back-propagation algorithm from a few data points representative of the input/output relationship of the underlying model of interest. However, ANNs with different performance might be obtained from the same training data as a result of the random initialization of the weight parameters in each network, leading to uncertainty in selecting the best performing ANN. On the other hand, using cross-validation to select the best performing ANN based on the highest R2 value can lead to bias in the prediction. This is because R2 cannot determine whether the prediction made by an ANN is biased. Additionally, R2 does not indicate if a model is adequate, as it is possible to have a low R2 for a good model and a high R2 for a bad model. Hence, in this paper, we propose an approach to improve the robustness of a prediction made by an ANN. The approach is based on a systematic combination of identically trained ANNs, by coupling the Bayesian framework and model averaging. Additionally, the uncertainties of the robust prediction derived from the approach are quantified in terms of confidence intervals. To demonstrate the applicability of the proposed approach, two synthetic numerical examples are presented. Finally, the proposed approach is used to perform reliability and sensitivity analyses on a process simulation model of a UK nuclear effluent treatment plant developed by National Nuclear Laboratory (NNL) and treated in this study as a black box, employing a set of training data as a test case. This model has been extensively validated against plant and experimental data and used to support the UK effluent discharge strategy. Copyright © 2017 Elsevier Ltd. All rights reserved.

  6. Crime prediction modeling

    NASA Technical Reports Server (NTRS)

    1971-01-01

    A study of techniques for the prediction of crime in the City of Los Angeles was conducted. Alternative approaches to crime prediction (causal, quasicausal, associative, extrapolative, and pattern-recognition models) are discussed, as is the environment within which predictions were desired for the immediate application. The decision was made to use time series (extrapolative) models to produce the desired predictions. The characteristics of the data and the procedure used to choose equations for the extrapolations are discussed. The usefulness of different functional forms (constant, quadratic, and exponential forms) and of different parameter estimation techniques (multiple regression and multiple exponential smoothing) are compared, and the quality of the resultant predictions is assessed.

  7. Computational approaches for predicting biomedical research collaborations.

    PubMed

    Zhang, Qing; Yu, Hong

    2014-01-01

    Biomedical research is increasingly collaborative, and successful collaborations often produce high impact work. Computational approaches can be developed for automatically predicting biomedical research collaborations. Previous works of collaboration prediction mainly explored the topological structures of research collaboration networks, leaving out rich semantic information from the publications themselves. In this paper, we propose supervised machine learning approaches to predict research collaborations in the biomedical field. We explored both the semantic features extracted from author research interest profile and the author network topological features. We found that the most informative semantic features for author collaborations are related to research interest, including similarity of out-citing citations, similarity of abstracts. Of the four supervised machine learning models (naïve Bayes, naïve Bayes multinomial, SVMs, and logistic regression), the best performing model is logistic regression with an ROC ranging from 0.766 to 0.980 on different datasets. To our knowledge we are the first to study in depth how research interest and productivities can be used for collaboration prediction. Our approach is computationally efficient, scalable and yet simple to implement. The datasets of this study are available at https://github.com/qingzhanggithub/medline-collaboration-datasets.

  8. Looking beyond historical patient outcomes to improve clinical models.

    PubMed

    Chia, Chih-Chun; Rubinfeld, Ilan; Scirica, Benjamin M; McMillan, Sean; Gurm, Hitinder S; Syed, Zeeshan

    2012-04-25

    Conventional algorithms for modeling clinical events focus on characterizing the differences between patients with varying outcomes in historical data sets used for the model derivation. For many clinical conditions with low prevalence and where small data sets are available, this approach to developing models is challenging due to the limited number of positive (that is, event) examples available for model training. Here, we investigate how the approach of developing clinical models might be improved across three distinct patient populations (patients with acute coronary syndrome enrolled in the DISPERSE2-TIMI33 and MERLIN-TIMI36 trials, patients undergoing inpatient surgery in the National Surgical Quality Improvement Program registry, and patients undergoing percutaneous coronary intervention in the Blue Cross Blue Shield of Michigan Cardiovascular Consortium registry). For each of these cases, we supplement an incomplete characterization of patient outcomes in the derivation data set (uncensored view of the data) with an additional characterization of the extent to which patients differ from the statistical support of their clinical characteristics (censored view of the data). Our approach exploits the same training data within the derivation cohort in multiple ways to improve the accuracy of prediction. We position this approach within the context of traditional supervised (2-class) and unsupervised (1-class) learning methods and present a 1.5-class approach for clinical decision-making. We describe a 1.5-class support vector machine (SVM) classification algorithm that implements this approach, and report on its performance relative to logistic regression and 2-class SVM classification with cost-sensitive weighting and oversampling. The 1.5-class SVM algorithm improved prediction accuracy relative to other approaches and may have value in predicting clinical events both at the bedside and for risk-adjusted quality of care assessment.

  9. A design of experiments approach to validation sampling for logistic regression modeling with error-prone medical records.

    PubMed

    Ouyang, Liwen; Apley, Daniel W; Mehrotra, Sanjay

    2016-04-01

    Electronic medical record (EMR) databases offer significant potential for developing clinical hypotheses and identifying disease risk associations by fitting statistical models that capture the relationship between a binary response variable and a set of predictor variables that represent clinical, phenotypical, and demographic data for the patient. However, EMR response data may be error prone for a variety of reasons. Performing a manual chart review to validate data accuracy is time consuming, which limits the number of chart reviews in a large database. The authors' objective is to develop a new design-of-experiments-based systematic chart validation and review (DSCVR) approach that is more powerful than the random validation sampling used in existing approaches. The DSCVR approach judiciously and efficiently selects the cases to validate (i.e., validate whether the response values are correct for those cases) for maximum information content, based only on their predictor variable values. The final predictive model will be fit using only the validation sample, ignoring the remainder of the unvalidated and unreliable error-prone data. A Fisher information based D-optimality criterion is used, and an algorithm for optimizing it is developed. The authors' method is tested in a simulation comparison that is based on a sudden cardiac arrest case study with 23 041 patients' records. This DSCVR approach, using the Fisher information based D-optimality criterion, results in a fitted model with much better predictive performance, as measured by the receiver operating characteristic curve and the accuracy in predicting whether a patient will experience the event, than a model fitted using a random validation sample. The simulation comparisons demonstrate that this DSCVR approach can produce predictive models that are significantly better than those produced from random validation sampling, especially when the event rate is low. © The Author 2015. 
Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.

  10. Using built environment characteristics to predict walking for exercise

    PubMed Central

    Lovasi, Gina S; Moudon, Anne V; Pearson, Amber L; Hurvitz, Philip M; Larson, Eric B; Siscovick, David S; Berke, Ethan M; Lumley, Thomas; Psaty, Bruce M

    2008-01-01

    Background: Environments conducive to walking may help people avoid sedentary lifestyles and associated diseases. Recent studies developed walkability models combining several built environment characteristics to optimally predict walking. Developing and testing such models with the same data could lead to overestimating one's ability to predict walking in an independent sample of the population. More accurate estimates of model fit can be obtained by splitting a single study population into training and validation sets (holdout approach) or through developing and evaluating models in different populations. We used these two approaches to test whether built environment characteristics near the home predict walking for exercise. Study participants lived in western Washington State and were adult members of a health maintenance organization. The physical activity data used in this study were collected by telephone interview and were selected for their relevance to cardiovascular disease. In order to limit confounding by prior health conditions, the sample was restricted to participants in good self-reported health and without a documented history of cardiovascular disease. Results: For 1,608 participants meeting the inclusion criteria, the mean age was 64 years, 90 percent were white, 37 percent had a college degree, and 62 percent of participants reported that they walked for exercise. Single built environment characteristics, such as residential density or connectivity, did not significantly predict walking for exercise. Regression models using multiple built environment characteristics to predict walking were not successful at predicting walking for exercise in an independent population sample. In the validation set, none of the logistic models had a C-statistic confidence interval excluding the null value of 0.5, and none of the linear models explained more than one percent of the variance in time spent walking for exercise. We did not detect significant differences in walking for exercise among census areas or postal codes, which were used as proxies for neighborhoods. Conclusion: None of the built environment characteristics significantly predicted walking for exercise, nor did combinations of these characteristics predict walking for exercise when tested using a holdout approach. These results reflect a lack of neighborhood-level variation in walking for exercise for the population studied. PMID:18312660

  11. Predictions of psychophysical measurements for sinusoidal amplitude modulated (SAM) pulse-train stimuli from a stochastic model.

    PubMed

    Xu, Yifang; Collins, Leslie M

    2007-08-01

    Two approaches have been proposed to reduce the synchrony of the neural response to electrical stimuli in cochlear implants. One approach involves adding noise to the pulse-train stimulus, and the other is based on using a high-rate pulse-train carrier. Hypotheses regarding the efficacy of the two approaches can be tested using computational models of neural responsiveness prior to time-intensive psychophysical studies. In our previous work, we have used such models to examine the effects of noise on several psychophysical measures important to speech recognition. However, to date there has been no parallel analytic solution investigating the neural response to the high-rate pulse-train stimuli and their effect on psychophysical measures. This work investigates the properties of the neural response to high-rate pulse-train stimuli with amplitude modulated envelopes using a stochastic auditory nerve model. The statistics governing the neural response to each pulse are derived using a recursive method. The agreement between the theoretical predictions and model simulations is demonstrated for sinusoidal amplitude modulated (SAM) high rate pulse-train stimuli. With our approach, predicting the neural response in modern implant devices becomes tractable. Psychophysical measurements are also predicted using the stochastic auditory nerve model for SAM high-rate pulse-train stimuli. Changes in dynamic range (DR) and intensity discrimination are compared with that observed for noise-modulated pulse-train stimuli. Modulation frequency discrimination is also studied as a function of stimulus level and pulse rate. Results suggest that high rate carriers may positively impact such psychophysical measures.

  12. Improving probabilistic prediction of daily streamflow by identifying Pareto optimal approaches for modelling heteroscedastic residual errors

    NASA Astrophysics Data System (ADS)

    McInerney, David; Thyer, Mark; Kavetski, Dmitri; Kuczera, George

    2017-04-01

    This study provides guidance to hydrological researchers which enables them to provide probabilistic predictions of daily streamflow with the best reliability and precision for different catchment types (e.g. high/low degree of ephemerality). Reliable and precise probabilistic prediction of daily catchment-scale streamflow requires statistical characterization of residual errors of hydrological models. It is commonly known that hydrological model residual errors are heteroscedastic, i.e. there is a pattern of larger errors in higher streamflow predictions. Although multiple approaches exist for representing this heteroscedasticity, few studies have undertaken a comprehensive evaluation and comparison of these approaches. This study fills this research gap by evaluating 8 common residual error schemes, including standard and weighted least squares, the Box-Cox transformation (with fixed and calibrated power parameter, lambda) and the log-sinh transformation. Case studies include 17 perennial and 6 ephemeral catchments in Australia and USA, and two lumped hydrological models. We find the choice of heteroscedastic error modelling approach significantly impacts on predictive performance, though no single scheme simultaneously optimizes all performance metrics. The set of Pareto optimal schemes, reflecting performance trade-offs, comprises Box-Cox schemes with lambda of 0.2 and 0.5, and the log scheme (lambda=0, perennial catchments only). These schemes significantly outperform even the average-performing remaining schemes (e.g., across ephemeral catchments, median precision tightens from 105% to 40% of observed streamflow, and median biases decrease from 25% to 4%). Theoretical interpretations of empirical results highlight the importance of capturing the skew/kurtosis of raw residuals and reproducing zero flows. Recommendations for researchers and practitioners seeking robust residual error schemes for practical work are provided.
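
    The Box-Cox family referred to in the abstract can be sketched as follows, with the log scheme as the lambda = 0 limit. The `offset` parameter and the function names are assumptions added for illustration (an offset is one common way to handle zero flows); the abstract does not specify how the schemes were implemented.

```python
import math


def box_cox(q, lam, offset=0.0):
    """Box-Cox transform of a flow value q; lam == 0 gives the log scheme."""
    y = q + offset
    if y <= 0.0:
        raise ValueError("q + offset must be positive")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam, offset=0.0):
    """Map a transformed value back to flow space."""
    if lam == 0:
        return math.exp(z) - offset
    return (lam * z + 1.0) ** (1.0 / lam) - offset


def transformed_residuals(obs, sim, lam, offset=0.0):
    """Residuals computed in transformed space, where they are assumed
    to have roughly constant variance."""
    return [box_cox(o, lam, offset) - box_cox(s, lam, offset)
            for o, s in zip(obs, sim)]
```

    With lambda = 0 a 10% overestimate produces the same transformed residual at low and high flows, whereas lambda = 1 leaves residuals in raw flow units; intermediate values such as 0.2 and 0.5 sit between these two behaviors.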

  13. ADMET Evaluation in Drug Discovery. 18. Reliable Prediction of Chemical-Induced Urinary Tract Toxicity by Boosting Machine Learning Approaches.

    PubMed

    Lei, Tailong; Sun, Huiyong; Kang, Yu; Zhu, Feng; Liu, Hui; Zhou, Wenfang; Wang, Zhe; Li, Dan; Li, Youyong; Hou, Tingjun

    2017-11-06

    Xenobiotic chemicals and their metabolites are mainly excreted out of our bodies by the urinary tract through the urine. Chemical-induced urinary tract toxicity is one of the main causes of failure during drug development, and it is a common adverse event for medications, natural supplements, and environmental chemicals. Despite its importance, there are only a few in silico models for assessing urinary tract toxicity for a large number of compounds with diverse chemical structures. Here, we developed a series of qualitative and quantitative structure-activity relationship (QSAR) models for predicting urinary tract toxicity. In our study, the recursive feature elimination method incorporated with random forests (RFE-RF) was used for dimension reduction, and then eight machine learning approaches were used for QSAR modeling, i.e., relevance vector machine (RVM), support vector machine (SVM), regularized random forest (RRF), C5.0 trees, eXtreme gradient boosting (XGBoost), AdaBoost.M1, SVM boosting (SVMBoost), and RVM boosting (RVMBoost). For building classification models, the synthetic minority oversampling technique was used to handle the imbalanced data set problem. Among all the machine learning approaches, SVMBoost based on the RBF kernel achieves both the best quantitative (q²_ext = 0.845) and qualitative predictions for the test set (MCC of 0.787, AUC of 0.893, sensitivity of 89.6%, specificity of 94.1%, and global accuracy of 90.8%). The application domains were then analyzed, and all of the tested chemicals fall within the application domain coverage. We also examined the structure features of the chemicals with large prediction errors. In brief, both the regression and classification models developed by the SVMBoost approach have reliable prediction capability for assessing chemical-induced urinary tract toxicity.
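
    The classification metrics quoted above (MCC, sensitivity, specificity, global accuracy) all derive from a binary confusion matrix. The sketch below shows the standard definitions; the function name and the returned dictionary layout are illustrative assumptions, and no counts from the study are reproduced here.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Sensitivity: fraction of actual positives (toxic) predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of actual negatives (non-toxic) predicted negative.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / total
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }
```

    MCC uses all four cells of the confusion matrix, so it is less easily inflated by class imbalance than accuracy alone, which is one reason it is often reported alongside oversampling methods.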

  14. Low-Cloud Feedbacks from Cloud-Controlling Factors: A Review

    DOE PAGES

    Klein, Stephen A.; Hall, Alex; Norris, Joel R.; ...

    2017-10-24

    Here, the response to warming of tropical low-level clouds including both marine stratocumulus and trade cumulus is a major source of uncertainty in projections of future climate. Climate model simulations of the response vary widely, reflecting the difficulty the models have in simulating these clouds. These inadequacies have led to alternative approaches to predict low-cloud feedbacks. Here, we review an observational approach that relies on the assumption that observed relationships between low clouds and the “cloud-controlling factors” of the large-scale environment are invariant across time-scales. With this assumption, and given predictions of how the cloud-controlling factors change with climate warming, one can predict low-cloud feedbacks without using any model simulation of low clouds. We discuss both fundamental and implementation issues with this approach and suggest steps that could reduce uncertainty in the predicted low-cloud feedback. Recent studies using this approach predict that the tropical low-cloud feedback is positive mainly due to the observation that reflection of solar radiation by low clouds decreases as temperature increases, holding all other cloud-controlling factors fixed. The positive feedback from temperature is partially offset by a negative feedback from the tendency for the inversion strength to increase in a warming world, with other cloud-controlling factors playing a smaller role. A consensus estimate from these studies for the contribution of tropical low clouds to the global mean cloud feedback is 0.25 ± 0.18 W m⁻² K⁻¹ (90% confidence interval), suggesting it is very unlikely that tropical low clouds reduce total global cloud feedback. Because the prediction of positive tropical low-cloud feedback with this approach is consistent with independent evidence from low-cloud feedback studies using high-resolution cloud models, progress is being made in reducing this key climate uncertainty.
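
    The observational approach described above amounts to summing, over the cloud-controlling factors, the observed sensitivity of low-cloud radiative effect to each factor multiplied by that factor's predicted change per degree of warming. The sketch below shows only that bookkeeping. The function name and dictionary layout are hypothetical, and any numbers passed in are placeholders supplied by the caller, not values from the review.

```python
def low_cloud_feedback(sensitivities, factor_changes):
    """Sum per-factor contributions to the low-cloud feedback.

    sensitivities:  observed change in cloud radiative effect per unit change
                    in each controlling factor (W m^-2 per factor unit).
    factor_changes: predicted change in each factor per kelvin of warming
                    (factor unit per K).
    Returns (total feedback in W m^-2 K^-1, per-factor contributions).
    """
    mismatched = set(sensitivities) ^ set(factor_changes)
    if mismatched:
        raise ValueError(f"factors missing from one input: {sorted(mismatched)}")
    contributions = {
        name: sensitivities[name] * factor_changes[name]
        for name in sensitivities
    }
    return sum(contributions.values()), contributions
```

    With a positive temperature term and a negative inversion-strength term, the total is the partially offset sum the abstract describes.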

  15. Low-Cloud Feedbacks from Cloud-Controlling Factors: A Review

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Klein, Stephen A.; Hall, Alex; Norris, Joel R.

    Here, the response to warming of tropical low-level clouds including both marine stratocumulus and trade cumulus is a major source of uncertainty in projections of future climate. Climate model simulations of the response vary widely, reflecting the difficulty the models have in simulating these clouds. These inadequacies have led to alternative approaches to predict low-cloud feedbacks. Here, we review an observational approach that relies on the assumption that observed relationships between low clouds and the “cloud-controlling factors” of the large-scale environment are invariant across time-scales. With this assumption, and given predictions of how the cloud-controlling factors change with climate warming, one can predict low-cloud feedbacks without using any model simulation of low clouds. We discuss both fundamental and implementation issues with this approach and suggest steps that could reduce uncertainty in the predicted low-cloud feedback. Recent studies using this approach predict that the tropical low-cloud feedback is positive mainly due to the observation that reflection of solar radiation by low clouds decreases as temperature increases, holding all other cloud-controlling factors fixed. The positive feedback from temperature is partially offset by a negative feedback from the tendency for the inversion strength to increase in a warming world, with other cloud-controlling factors playing a smaller role. A consensus estimate from these studies for the contribution of tropical low clouds to the global mean cloud feedback is 0.25 ± 0.18 W m⁻² K⁻¹ (90% confidence interval), suggesting it is very unlikely that tropical low clouds reduce total global cloud feedback. Because the prediction of positive tropical low-cloud feedback with this approach is consistent with independent evidence from low-cloud feedback studies using high-resolution cloud models, progress is being made in reducing this key climate uncertainty.

  16. Predictability of extreme weather events for NE U.S.: improvement of the numerical prediction using a Bayesian regression approach

    NASA Astrophysics Data System (ADS)

    Yang, J.; Astitha, M.; Anagnostou, E. N.; Hartman, B.; Kallos, G. B.

    2015-12-01

    Weather prediction accuracy has become very important for the Northeast U.S. given the devastating effects of extreme weather events in recent years. Weather forecasting systems are used to build strategies to prevent catastrophic losses for human lives and the environment. Concurrently, weather forecast tools and techniques have evolved with improved forecast skill as numerical prediction techniques are strengthened by increased super-computing resources. In this study, we examine the combination of two state-of-the-science atmospheric models (WRF and RAMS/ICLAMS) by utilizing a Bayesian regression approach to improve the prediction of extreme weather events for the NE U.S. The basic concept behind the Bayesian regression approach is to take advantage of the strengths of two atmospheric modeling systems and, similar to the multi-model ensemble approach, limit their weaknesses, which are related to systematic and random errors in the numerical prediction of physical processes. The first part of this study is focused on retrospective simulations of seventeen storms that affected the region in the period 2004-2013. Optimal variances are estimated by minimizing the root mean square error and are applied to out-of-sample weather events. The applicability and usefulness of this approach are demonstrated by conducting an error analysis based on in-situ observations from meteorological stations of the National Weather Service (NWS) for wind speed and wind direction, and NCEP Stage IV radar data, mosaicked from the regional multi-sensor for precipitation. The preliminary results indicate a significant improvement in the statistical metrics of the modeled-observed pairs for meteorological variables using various combinations of the sixteen events as predictors of the seventeenth. This presentation will illustrate the implemented methodology and the obtained results for wind speed, wind direction and precipitation, as well as set the research steps that will be followed in the future.
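
    The idea of combining two model forecasts and testing on a held-out event can be sketched as follows. This is not the authors' Bayesian regression: it swaps in simple inverse-mean-squared-error weighting as a stand-in, and every function name is hypothetical. What it does mirror is the evaluation design, in which weights are estimated from all events but one and applied to the remaining event.

```python
def mean_squared_error(forecast, observed):
    """Mean squared error between paired forecast and observed values."""
    return sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed)


def inverse_mse_weights(forecast_a, forecast_b, observed):
    """Weight each model by the inverse of its training-period MSE."""
    mse_a = mean_squared_error(forecast_a, observed)
    mse_b = mean_squared_error(forecast_b, observed)
    if mse_a == 0.0 and mse_b == 0.0:
        return 0.5, 0.5
    if mse_a == 0.0:
        return 1.0, 0.0
    if mse_b == 0.0:
        return 0.0, 1.0
    inv_a, inv_b = 1.0 / mse_a, 1.0 / mse_b
    return inv_a / (inv_a + inv_b), inv_b / (inv_a + inv_b)


def leave_one_out_blend(forecast_a, forecast_b, observed):
    """Blend two models for each event using weights fitted on the other events."""
    blended = []
    for held_out in range(len(observed)):
        train = [i for i in range(len(observed)) if i != held_out]
        w_a, w_b = inverse_mse_weights(
            [forecast_a[i] for i in train],
            [forecast_b[i] for i in train],
            [observed[i] for i in train],
        )
        blended.append(w_a * forecast_a[held_out] + w_b * forecast_b[held_out])
    return blended
```

    Each list holds one value per event, for example a storm-mean wind speed from each model and the matching station observation.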

  17. A Bayesian Hierarchical Modeling Approach to Predicting Flow in Ungauged Basins

    EPA Science Inventory

    Recent innovative approaches to identifying and applying regression-based relationships between land use patterns (such as increasing impervious surface area and decreasing vegetative cover) and rainfall-runoff model parameters represent novel and promising improvements to predic...

  18. Robust multiscale prediction of Po River discharge using a twofold AR-NN approach

    NASA Astrophysics Data System (ADS)

    Alessio, Silvia; Taricco, Carla; Rubinetti, Sara; Zanchettin, Davide; Rubino, Angelo; Mancuso, Salvatore

    2017-04-01

    The Mediterranean area is among the regions most exposed to hydroclimatic changes, with a likely increase of frequency and duration of droughts in the last decades and potentially substantial future drying according to climate projections. However, significant decadal variability is often superposed on or even dominates these long-term hydrological trends, as observed, for instance, in North Italian precipitation and river discharge records. The capability to accurately predict such decadal changes is, therefore, of utmost environmental and social importance. In order to forecast short and noisy hydroclimatic time series, we apply a twofold statistical approach that we improved with respect to previous works [1]. Our prediction strategy consists in the application of two independent methods that use autoregressive models and feed-forward neural networks. Since all prediction methods work better on clean signals, the predictions are not performed directly on the series, but rather on each significant variability component extracted with Singular Spectrum Analysis (SSA). In this contribution, we will illustrate the multiscale prediction approach and its application to the case of decadal prediction of annual-average Po River discharges (Italy). The discharge record is available for the last 209 years and allows working with both interannual and decadal time-scale components. Fifteen-year forecasts obtained with both methods robustly indicate a prominent dry period in the second half of the 2020s. We will discuss advantages and limitations of the proposed statistical approach in the light of the current capabilities of decadal climate prediction systems based on numerical climate models, toward an integrated dynamical and statistical approach for the interannual-to-decadal prediction of hydroclimate variability in medium-size river basins. [1] Alessio et al., Natural variability and anthropogenic effects in a Central Mediterranean core, Clim. of the Past, 8, 831-839, 2012.
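
    The autoregressive half of the twofold approach can be sketched for a single component. The sketch assumes the component has already been extracted (the SSA step is not shown) and fits the AR coefficients with the Yule-Walker equations via the Levinson-Durbin recursion, which is one common choice and not necessarily the authors' estimator. Function names are illustrative.

```python
def autocovariance(series, lag):
    """Biased sample autocovariance of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    return sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    ) / n


def fit_ar_yule_walker(series, order):
    """Estimate AR coefficients with the Levinson-Durbin recursion."""
    r = [autocovariance(series, k) for k in range(order + 1)]
    if r[0] == 0.0:
        raise ValueError("series has zero variance")
    phi = []
    error = r[0]
    for k in range(1, order + 1):
        # Reflection coefficient for extending the model from order k-1 to k.
        kappa = (r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))) / error
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        error *= 1.0 - kappa ** 2
    return phi


def forecast_ar(series, phi, steps):
    """Iterate the fitted AR model forward, feeding forecasts back as inputs."""
    mean = sum(series) / len(series)
    history = [value - mean for value in series]
    forecasts = []
    for _ in range(steps):
        nxt = sum(phi[j] * history[-1 - j] for j in range(len(phi)))
        history.append(nxt)
        forecasts.append(nxt + mean)
    return forecasts
```

    In a multiscale scheme of the kind described, each significant component would be forecast separately, for example with `forecast_ar(component, fit_ar_yule_walker(component, order), 15)`, and the component forecasts summed to give the forecast of the full series.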

  19. Can We Predict Individual Combined Benefit and Harm of Therapy? Warfarin Therapy for Atrial Fibrillation as a Test Case

    PubMed Central

    Li, Guowei; Thabane, Lehana; Delate, Thomas; Witt, Daniel M.; Levine, Mitchell A. H.; Cheng, Ji; Holbrook, Anne

    2016-01-01

    Objectives: To construct and validate a prediction model for individual combined benefit and harm outcomes (stroke with no major bleeding, major bleeding with no stroke, neither event, or both) in patients with atrial fibrillation (AF) with and without warfarin therapy. Methods: Using the Kaiser Permanente Colorado databases, we included patients newly diagnosed with AF between January 1, 2005 and December 31, 2012 for model construction and validation. The primary outcome was a prediction model of composite of stroke or major bleeding using polytomous logistic regression (PLR) modelling. The secondary outcome was a prediction model of all-cause mortality using the Cox regression modelling. Results: We included 9074 patients: 4537 warfarin users and 4537 non-users. In the derivation cohort (n = 4632), there were 136 strokes (2.94%), 280 major bleedings (6.04%) and 1194 deaths (25.78%). In the prediction models, warfarin use was not significantly associated with risk of stroke, but increased the risk of major bleeding and decreased the risk of death. Both the PLR and Cox models were robust, were internally and externally validated, and showed acceptable performance. Conclusions: In this study, we introduce a new methodology for predicting individual combined benefit and harm outcomes associated with warfarin therapy for patients with AF. Should this approach be validated in other patient populations, it has potential advantages over existing risk stratification approaches as a patient-physician aid for shared decision-making. PMID:27513986
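
    A polytomous (multinomial) logistic regression turns one linear score per non-reference outcome into probabilities for all outcome categories. The sketch below shows that final step for the four combined outcomes named in the abstract. The choice of "neither event" as the reference category, the feature vector, and any coefficients passed in are assumptions for illustration; no fitted values from the study are used.

```python
import math

# Outcome categories from the abstract; "neither event" is taken as the
# reference category here, which is an assumption of this sketch.
OUTCOMES = ("neither event", "stroke only", "major bleeding only", "both events")


def outcome_probabilities(features, coefficients):
    """Multinomial-logit probabilities for the four combined outcomes.

    features:     list of numeric patient covariates.
    coefficients: three (intercept, weights) pairs, one per non-reference
                  outcome, in the order of OUTCOMES[1:].
    """
    if len(coefficients) != len(OUTCOMES) - 1:
        raise ValueError("need one coefficient set per non-reference outcome")
    scores = [0.0]  # the reference category has a linear score of zero
    for intercept, weights in coefficients:
        scores.append(intercept + sum(w * x for w, x in zip(weights, features)))
    # Subtract the maximum score before exponentiating for numerical stability.
    peak = max(scores)
    exp_scores = [math.exp(s - peak) for s in scores]
    total = sum(exp_scores)
    return {name: e / total for name, e in zip(OUTCOMES, exp_scores)}
```

    Because the four probabilities sum to one, a single model of this form reports benefit and harm for a patient jointly rather than as separate risks.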

  20. Regression model estimation of early season crop proportions: North Dakota, some preliminary results

    NASA Technical Reports Server (NTRS)

    Lin, K. K. (Principal Investigator)

    1982-01-01

    To estimate crop proportions early in the season, an approach is proposed based on: use of a regression-based prediction equation to obtain an a priori estimate for specific major crop groups; modification of this estimate using current-year LANDSAT and weather data; and a breakdown of the major crop groups into specific crops by regression models. Results from the development and evaluation of appropriate regression models for the first portion of the proposed approach are presented. The results show that the model predicts 1980 crop proportions very well at both county and crop reporting district levels. In terms of planted acreage, the model underpredicted the 1980 published planted acreage by 9.1 percent at the county level. At the crop reporting district level, it matched the 1980 published planted acreage almost exactly, overpredicting by just 0.92 percent.
