Bennett, Erin R; Clausen, Jay; Linkov, Eugene; Linkov, Igor
2009-11-01
Reliable, up-front information on physical and biological properties of emerging materials is essential before making a decision and investment to formulate, synthesize, scale up, test, and manufacture a new material for use in both military and civilian applications. Multiple quantitative structure-activity relationship (QSAR) software tools are available for predicting a material's physical/chemical properties and environmental effects. Even though information on emerging materials is often limited, QSAR software output is treated without sufficient uncertainty analysis. We hypothesize that uncertainty and variability in material properties and uncertainty in model prediction can be too large to provide meaningful results. To test this hypothesis, we predicted octanol-water partitioning coefficients (logP) for multiple, similar compounds with limited physical-chemical properties using six different commercial logP calculators (KOWWIN, MarvinSketch, ACD/Labs, ALogP, CLogP, SPARC). Analysis was done for materials with largely uncertain properties that were similar, based on molecular formula, to military compounds (RDX, BTTN, TNT) and pharmaceuticals (Carbamazepine, Gemfibrozil). We have also compared QSAR modeling results for a well-studied pesticide and pesticide breakdown product (Atrazine, DDE). Our analysis shows that variability due to structural variations of the emerging chemicals may be several orders of magnitude. The model uncertainty across six software packages was very high (10 orders of magnitude) for emerging materials while it was low for traditional chemicals (e.g. Atrazine). Thus the use of QSAR models for emerging materials screening requires extensive model validation and coupling QSAR output with available empirical data and other relevant information.
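The comparison across calculators described above comes down to measuring how far the predictions for a single compound spread apart. The following is a minimal sketch of that bookkeeping. The function name `prediction_spread` and all logP values are hypothetical illustrations and are not taken from the study; because logP is a base-10 logarithm, the difference between the largest and smallest prediction is the spread in orders of magnitude.

```python
from statistics import mean, stdev


def prediction_spread(predictions):
    """Summarize disagreement between tools for one compound.

    `predictions` maps a tool name to its predicted logP (log10 units).
    """
    values = list(predictions.values())
    return {
        "mean": mean(values),
        "stdev": stdev(values),
        # A difference in log10 units is a ratio in orders of magnitude.
        "orders_of_magnitude": max(values) - min(values),
    }


# Hypothetical logP predictions for one compound (NOT data from the study).
example = {
    "KOWWIN": 1.2,
    "MarvinSketch": 0.9,
    "ACD/Labs": 1.5,
    "ALogP": 0.4,
    "CLogP": 1.1,
    "SPARC": 2.9,
}

summary = prediction_spread(example)
print(summary)
```

A large `orders_of_magnitude` value for a compound would flag it as one where the tools disagree and where empirical data are most needed.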
TOXICO-CHEMINFORMATICS AND QSAR MODELING OF ...
This abstract concludes that QSAR approaches combined with toxico-chemoinformatics descriptors can enhance predictive toxicology models.
Predictive QSAR modeling workflow, model applicability domains, and virtual screening.
Tropsha, Alexander; Golbraikh, Alexander
2007-01-01
Quantitative Structure Activity Relationship (QSAR) modeling has been traditionally applied as an evaluative approach, i.e., with the focus on developing retrospective and explanatory models of existing data. Model extrapolation was considered, if at all, only in a hypothetical sense, in terms of potential modifications of known biologically active chemicals that could improve compounds' activity. This critical review re-examines the strategy and the output of the modern QSAR modeling approaches. We provide examples and arguments suggesting that current methodologies may afford robust and validated models capable of accurate prediction of compound properties for molecules not included in the training sets. We discuss a data-analytical modeling workflow developed in our laboratory that incorporates modules for combinatorial QSAR model development (i.e., using all possible binary combinations of available descriptor sets and statistical data modeling techniques), rigorous model validation, and virtual screening of available chemical databases to identify novel biologically active compounds. Our approach places particular emphasis on model validation as well as the need to define model applicability domains in the chemistry space. We present examples of studies where the application of rigorously validated QSAR models to virtual screening identified computational hits that were confirmed by subsequent experimental investigations. The emerging focus of QSAR modeling on target property forecasting brings it forward as a predictive, as opposed to evaluative, modeling approach.
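The abstract stresses applicability domains without giving a formula. One common distance-based way to define such a domain is sketched below, as an assumption rather than the authors' stated method: a query compound counts as inside the domain when its distance to the nearest training compound does not exceed the mean nearest-neighbour distance within the training set plus `z` standard deviations. The function names, the cutoff parameter `z`, and the two-descriptor toy data are all hypothetical.

```python
import math
from statistics import mean, pstdev


def nearest_distance(point, others):
    """Euclidean distance from `point` to the closest descriptor vector in `others`."""
    return min(math.dist(point, other) for other in others)


def domain_threshold(training, z=0.5):
    """Mean nearest-neighbour distance in the training set plus z standard deviations."""
    nn = [
        nearest_distance(p, training[:i] + training[i + 1:])
        for i, p in enumerate(training)
    ]
    return mean(nn) + z * pstdev(nn)


def in_domain(query, training, threshold):
    """True when the query lies within `threshold` of some training compound."""
    return nearest_distance(query, training) <= threshold


# Hypothetical two-descriptor training set (illustrative values only).
training = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
threshold = domain_threshold(training)

print(in_domain((0.5, 0.5), training, threshold))  # query between training points
print(in_domain((5.0, 5.0), training, threshold))  # query far from all training points
```

In a virtual screening setting, predictions for compounds that fall outside the domain would be discarded or flagged rather than trusted.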
QSAR DataBank - an approach for the digital organization and archiving of QSAR model information
2014-01-01
Background: Research efforts in the field of descriptive and predictive Quantitative Structure-Activity Relationships or Quantitative Structure–Property Relationships produce around one thousand scientific publications annually. All the materials and results are mainly communicated using printed media. The printed media in their present form have obvious limitations when it comes to effectively representing mathematical models, including complex and non-linear ones, and large bodies of associated numerical chemical data. They do not support secondary information extraction or reuse efforts, while in silico studies pose additional requirements for accessibility, transparency and reproducibility of the research. This gap can and should be bridged by introducing domain-specific digital data exchange standards and tools. The current publication presents a formal specification of the quantitative structure-activity relationship data organization and archival format called the QSAR DataBank (QsarDB for shorter, or QDB for shortest). Results: The article describes the QsarDB data schema, which formalizes QSAR concepts (objects and relationships between them), and the QsarDB data format, which formalizes their presentation for computer systems. The utility and benefits of QsarDB have been thoroughly tested by solving everyday QSAR and predictive modeling problems, with examples in the field of predictive toxicology, and can be applied for a wide variety of other endpoints. The work is accompanied by an open source reference implementation and tools. Conclusions: The proposed open data, open source, and open standards design is open to public and proprietary extensions on many levels. Selected use cases exemplify the benefits of the proposed QsarDB data format. General ideas for future development are discussed. PMID:24910716
QSAR modeling: where have you been? Where are you going to?
Cherkasov, Artem; Muratov, Eugene N; Fourches, Denis; Varnek, Alexandre; Baskin, Igor I; Cronin, Mark; Dearden, John; Gramatica, Paola; Martin, Yvonne C; Todeschini, Roberto; Consonni, Viviana; Kuz'min, Victor E; Cramer, Richard; Benigni, Romualdo; Yang, Chihae; Rathman, James; Terfloth, Lothar; Gasteiger, Johann; Richard, Ann; Tropsha, Alexander
2014-06-26
Quantitative structure-activity relationship modeling is one of the major computational tools employed in medicinal chemistry. However, throughout its entire history it has drawn both praise and criticism concerning its reliability, limitations, successes, and failures. In this paper, we discuss (i) the development and evolution of QSAR; (ii) the current trends, unsolved problems, and pressing challenges; and (iii) several novel and emerging applications of QSAR modeling. Throughout this discussion, we provide guidelines for QSAR development, validation, and application, which are summarized in best practices for building rigorously validated and externally predictive QSAR models. We hope that this Perspective will help communications between computational and experimental chemists toward collaborative development and use of QSAR models. We also believe that the guidelines presented here will help journal editors and reviewers apply more stringent scientific standards to manuscripts reporting new QSAR studies, as well as encourage the use of high quality, validated QSARs for regulatory decision making.
QSAR Modeling: Where have you been? Where are you going to?
Cherkasov, Artem; Muratov, Eugene N.; Fourches, Denis; Varnek, Alexandre; Baskin, Igor I.; Cronin, Mark; Dearden, John; Gramatica, Paola; Martin, Yvonne C.; Todeschini, Roberto; Consonni, Viviana; Kuz'min, Victor E.; Cramer, Richard; Benigni, Romualdo; Yang, Chihae; Rathman, James; Terfloth, Lothar; Gasteiger, Johann; Richard, Ann; Tropsha, Alexander
2014-01-01
Quantitative Structure-Activity Relationship modeling is one of the major computational tools employed in medicinal chemistry. However, throughout its entire history it has drawn both praise and criticism concerning its reliability, limitations, successes, and failures. In this paper, we discuss: (i) the development and evolution of QSAR; (ii) the current trends, unsolved problems, and pressing challenges; and (iii) several novel and emerging applications of QSAR modeling. Throughout this discussion, we provide guidelines for QSAR development, validation, and application, which are summarized in best practices for building rigorously validated and externally predictive QSAR models. We hope that this Perspective will help communications between computational and experimental chemists towards collaborative development and use of QSAR models. We also believe that the guidelines presented here will help journal editors and reviewers apply more stringent scientific standards to manuscripts reporting new QSAR studies, as well as encourage the use of high quality, validated QSARs for regulatory decision making. PMID:24351051
QSAR models for anti-malarial activity of 4-aminoquinolines.
Masand, Vijay H; Toropov, Andrey A; Toropova, Alla P; Mahajan, Devidas T
2014-03-01
In the present study, predictive quantitative structure-activity relationship (QSAR) models for anti-malarial activity of 4-aminoquinolines have been developed. CORAL, which is freely available on the internet (http://www.insilico.eu/coral), has been used as a tool of QSAR analysis to establish a statistically robust QSAR model of anti-malarial activity of 4-aminoquinolines. Six random splits into the visible sub-system of the training and invisible subsystem of validation were examined. Statistical qualities for these splits vary, but in all these cases, statistical quality of prediction for anti-malarial activity was quite good. The optimal SMILES-based descriptor was used to derive the single descriptor based QSAR model for a data set of 112 aminoquinolones. All the splits had r(2)> 0.85 and r(2)> 0.78 for subtraining and validation sets, respectively. The three parametric multilinear regression (MLR) QSAR model has Q(2) = 0.83, R(2) = 0.84 and F = 190.39. The anti-malarial activity has strong correlation with presence/absence of nitrogen and oxygen at a topological distance of six.
An ensemble model of QSAR tools for regulatory risk assessment.
Pradeep, Prachi; Povinelli, Richard J; White, Shannon; Merrill, Stephen J
2016-01-01
Quantitative structure activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used for chemical risk assessment for protection of human and environmental health, which makes them interesting to regulators, especially in the absence of experimental data. For compatibility with regulatory use, QSAR models should be transparent, reproducible and optimized to minimize the number of false negatives. In silico QSAR tools are gaining wide acceptance as a faster alternative to otherwise time-consuming clinical and animal testing methods. However, different QSAR tools often make conflicting predictions for a given chemical and may also vary in their predictive performance across different chemical datasets. In a regulatory context, conflicting predictions raise interpretation, validation and adequacy concerns. To address these concerns, ensemble learning techniques in the machine learning paradigm can be used to integrate predictions from multiple tools. By leveraging various underlying QSAR algorithms and training datasets, the resulting consensus prediction should yield better overall predictive ability. We present a novel ensemble QSAR model using Bayesian classification. The model allows for varying a cut-off parameter that allows for a selection in the desirable trade-off between model sensitivity and specificity. The predictive performance of the ensemble model is compared with four in silico tools (Toxtree, Lazar, OECD Toolbox, and Danish QSAR) to predict carcinogenicity for a dataset of air toxins (332 chemicals) and a subset of the gold carcinogenic potency database (480 chemicals). Leave-one-out cross validation results show that the ensemble model achieves the best trade-off between sensitivity and specificity (accuracy: 83.8 % and 80.4 %, and balanced accuracy: 80.6 % and 80.8 %) and highest inter-rater agreement [kappa ( κ ): 0
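The performance measures named in this abstract (sensitivity, specificity, accuracy, balanced accuracy, and Cohen's kappa) can all be derived from a binary confusion matrix. The sketch below shows the standard formulas; the function name `classification_metrics` and the counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common binary-classification measures from confusion-matrix counts."""
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / total
    balanced_accuracy = (sensitivity + specificity) / 2

    # Cohen's kappa: observed agreement corrected for agreement expected by chance.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((tn + fn) / total) * ((tn + fp) / total)
    expected = p_yes + p_no
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "kappa": kappa,
    }


# Hypothetical counts for a carcinogenicity classifier (illustrative only).
metrics = classification_metrics(tp=40, fp=10, tn=40, fn=10)
print(metrics)
```

Raising or lowering a classifier's decision cut-off shifts counts between the false-positive and false-negative cells, which is how the sensitivity/specificity trade-off mentioned in the abstract shows up in these numbers.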
Combinatorial QSAR Modeling of Rat Acute Toxicity by Oral Exposure
Quantitative Structure-Activity Relationship (QSAR) toxicity models have become popular tools for identifying potential toxic compounds and prioritizing candidates for animal toxicity tests. However, few QSAR studies have successfully modeled large, diverse mammalian toxicity end...
QSAR and 3D-QSAR studies applied to compounds with anticonvulsant activity.
Garro Martinez, Juan C; Vega-Hissi, Esteban G; Andrada, Matías F; Estrada, Mario R
2015-01-01
Quantitative structure-activity relationships (QSAR and 3D-QSAR) have been applied in the last decade to obtain a reliable statistical model for the prediction of the anticonvulsant activities of new chemical entities. However, despite the large amount of information on QSAR, no recent review has published and discussed this data in detail. In this review, the authors provide a detailed discussion of QSAR studies that have been applied to compounds with anticonvulsant activity published between the years 2003 and 2013. They also evaluate the mathematical approaches and the main software used to develop the QSAR and 3D-QSAR model. QSAR methodologies continue to attract the attention of researchers and provide valuable information for the development of new potentially active compounds including those with anticonvulsant activity. This has been helped in part by improvements in the size and performance of computers; the development of specific software and the development of novel molecular descriptors, which have given rise to new and more predictive QSAR models. The extensive development of descriptors, and the way by which descriptor values are derived, have allowed the evolution of the QSAR methods. This evolution could strengthen the QSAR methods as an important tool in research and development of new and more potent anticonvulsant agents.
QSAR modeling of GPCR ligands: methodologies and examples of applications.
Tropsha, A; Wang, S X
2006-01-01
GPCR ligands represent not only one of the major classes of current drugs but the major continuing source of novel potent pharmaceutical agents. Because 3D structures of GPCRs as determined by experimental techniques are still unavailable, ligand-based drug discovery methods remain the major computational molecular modeling approaches to the analysis of growing data sets of tested GPCR ligands. This paper presents an overview of modern Quantitative Structure Activity Relationship (QSAR) modeling. We discuss the critical issue of model validation and the strategy for applying the successfully validated QSAR models to virtual screening of available chemical databases. We present several examples of applications of validated QSAR modeling approaches to GPCR ligands. We conclude with the comments on exciting developments in the QSAR modeling of GPCR ligands that focus on the study of emerging data sets of compounds with dual or even multiple activities against two or more of GPCRs.
QSAR and 3D QSAR of inhibitors of the epidermal growth factor receptor
NASA Astrophysics Data System (ADS)
Pinto-Bazurco, Mariano; Tsakovska, Ivanka; Pajeva, Ilza
This article reports quantitative structure-activity relationships (QSAR) and 3D QSAR models of 134 structurally diverse inhibitors of the epidermal growth factor receptor (EGFR) tyrosine kinase. Free-Wilson analysis was used to derive the QSAR model. It identified the substituents in aniline, the polycyclic system, and the substituents at the 6- and 7-positions of the polycyclic system as the most important structural features. Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were used in the 3D QSAR modeling. The steric and electrostatic interactions proved the most important for the inhibitory effect. Both QSAR and 3D QSAR models led to consistent results. On the basis of the statistically significant models, new structures were proposed and their inhibitory activities were predicted.
NASA Astrophysics Data System (ADS)
Masand, Vijay H.; El-Sayed, Nahed N. E.; Mahajan, Devidas T.; Mercader, Andrew G.; Alafeefy, Ahmed M.; Shibi, I. G.
2017-02-01
In the present work, sixty substituted 2-Phenylimidazopyridines previously reported with potent anti-human African trypanosomiasis (HAT) activity were selected to build genetic algorithm (GA) based QSAR models to determine the structural features that have significant correlation with the activity. Multiple QSAR models were built using easily interpretable descriptors that are directly associated with the presence or the absence of a structural scaffold, or a specific atom. All the QSAR models have been thoroughly validated according to the OECD principles. All the QSAR models are statistically very robust (R2 = 0.80-0.87) with high external predictive ability (CCCex = 0.81-0.92). The QSAR analysis reveals that the HAT activity has good correlation with the presence of five membered rings in the molecule.
CheS-Mapper 2.0 for visual validation of (Q)SAR models
2014-01-01
Background: Sound statistical validation is important to evaluate and compare the overall performance of (Q)SAR models. However, classical validation does not support the user in better understanding the properties of the model or the underlying data. Even though a number of visualization tools for analyzing (Q)SAR information in small molecule datasets exist, integrated visualization methods that allow the investigation of model validation results are still lacking. Results: We propose visual validation as an approach for the graphical inspection of (Q)SAR model validation results. The approach applies the 3D viewer CheS-Mapper, an open-source application for the exploration of small molecules in virtual 3D space. The present work describes the new functionalities in CheS-Mapper 2.0 that facilitate the analysis of (Q)SAR information and allow the visual validation of (Q)SAR models. The tool enables the comparison of model predictions to the actual activity in feature space. The approach is generic: it is model-independent and can handle physico-chemical and structural input features as well as quantitative and qualitative endpoints. Conclusions: Visual validation with CheS-Mapper enables analyzing (Q)SAR information in the data and indicates how this information is employed by the (Q)SAR model. It reveals whether the endpoint is modeled too specifically or too generically and highlights common properties of misclassified compounds. Moreover, the researcher can use CheS-Mapper to inspect how the (Q)SAR model predicts activity cliffs. The CheS-Mapper software is freely available at http://ches-mapper.org. Graphical abstract: Comparing actual and predicted activity values with CheS-Mapper.
An ensemble model of QSAR tools for regulatory risk assessment
Pradeep, Prachi; Povinelli, Richard J.; White, Shannon; ...
2016-09-22
Quantitative structure activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used for chemical risk assessment for protection of human and environmental health, which makes them interesting to regulators, especially in the absence of experimental data. For compatibility with regulatory use, QSAR models should be transparent, reproducible and optimized to minimize the number of false negatives. In silico QSAR tools are gaining wide acceptance as a faster alternative to otherwise time-consuming clinical and animal testing methods. However, different QSAR tools often make conflicting predictions for a given chemical and may also vary in their predictive performance across different chemical datasets. In a regulatory context, conflicting predictions raise interpretation, validation and adequacy concerns. To address these concerns, ensemble learning techniques in the machine learning paradigm can be used to integrate predictions from multiple tools. By leveraging various underlying QSAR algorithms and training datasets, the resulting consensus prediction should yield better overall predictive ability. We present a novel ensemble QSAR model using Bayesian classification. The model allows for varying a cut-off parameter that allows for a selection in the desirable trade-off between model sensitivity and specificity. The predictive performance of the ensemble model is compared with four in silico tools (Toxtree, Lazar, OECD Toolbox, and Danish QSAR) to predict carcinogenicity for a dataset of air toxins (332 chemicals) and a subset of the gold carcinogenic potency database (480 chemicals). Leave-one-out cross validation results show that the ensemble model achieves the best trade-off between sensitivity and specificity (accuracy: 83.8 % and 80.4 %, and balanced accuracy: 80.6 % and 80.8 %) and highest inter-rater agreement [kappa (κ): 0
From QSAR to QSIIR: Searching for Enhanced Computational Toxicology Models
Zhu, Hao
2017-01-01
Quantitative Structure Activity Relationship (QSAR) is the most frequently used modeling approach to explore the dependency of biological, toxicological, or other types of activities/properties of chemicals on their molecular features. In the past two decades, QSAR modeling has been used extensively in the drug discovery process. However, the predictive models resulting from QSAR studies have limited use for chemical risk assessment, especially for animal and human toxicity evaluations, due to their low predictivity for new compounds. To develop enhanced toxicity models with independently validated external prediction power, novel modeling protocols were pursued by computational toxicologists based on rapidly increasing toxicity testing data in recent years. This chapter reviews the recent effort in our laboratory to incorporate the biological testing results as descriptors in the toxicity modeling process. This effort extended the concept of QSAR to Quantitative Structure In vitro-In vivo Relationship (QSIIR). The QSIIR study examples provided in this chapter indicate that the QSIIR models that are based on the hybrid (biological and chemical) descriptors are indeed superior to the conventional QSAR models that are based only on chemical descriptors for several animal toxicity endpoints. We believe that the applications introduced in this review will be of interest and value to researchers working in the field of computational drug discovery and environmental chemical risk assessment. PMID:23086837
Modelling the effect of structural QSAR parameters on skin penetration using genetic programming
NASA Astrophysics Data System (ADS)
Chung, K. K.; Do, D. Q.
2010-09-01
In order to model relationships between chemical structures and biological effects in quantitative structure-activity relationship (QSAR) data, an alternative technique of artificial intelligence computing—genetic programming (GP)—was investigated and compared to the traditional method—statistical. GP, with the primary advantage of generating mathematical equations, was employed to model QSAR data and to define the most important molecular descriptions in QSAR data. The models predicted by GP agreed with the statistical results, and the most predictive models of GP were significantly improved when compared to the statistical models using ANOVA. Recently, artificial intelligence techniques have been applied widely to analyse QSAR data. With the capability of generating mathematical equations, GP can be considered as an effective and efficient method for modelling QSAR data.
Wang, Wenyi; Kim, Marlene T.; Sedykh, Alexander
2015-01-01
Purpose: Experimental Blood–Brain Barrier (BBB) permeability models for drug molecules are expensive and time-consuming. As alternative methods, several traditional Quantitative Structure-Activity Relationship (QSAR) models have been developed previously. In this study, we aimed to improve the predictivity of traditional QSAR BBB permeability models by employing relevant public bio-assay data in the modeling process. Methods: We compiled a BBB permeability database consisting of 439 unique compounds from various resources. The database was split into a modeling set of 341 compounds and a validation set of 98 compounds. A consensus QSAR modeling workflow was employed on the modeling set to develop various QSAR models. A five-fold cross-validation approach was used to validate the developed models, and the resulting models were used to predict the external validation set compounds. Furthermore, we used previously published membrane transporter models to generate relevant transporter profiles for target compounds. The transporter profiles were used as additional biological descriptors to develop hybrid QSAR BBB models. Results: The consensus QSAR models have R2=0.638 for five-fold cross-validation and R2=0.504 for external validation. The consensus model developed by pooling chemical and transporter descriptors showed better predictivity (R2=0.646 for five-fold cross-validation and R2=0.526 for external validation). Moreover, several external bio-assays that correlate with BBB permeability were identified using our automatic profiling tool. Conclusions: The BBB permeability models developed in this study can be useful for early evaluation of new compounds (e.g., new drug candidates). The combination of chemical and biological descriptors shows a promising direction to improve the current traditional QSAR models. PMID:25862462
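The cross-validated R2 reported here comes from predicting each compound with a model that never saw it during fitting. A minimal sketch of k-fold cross-validation follows, with a single-descriptor least-squares line standing in for the study's consensus QSAR models. That substitution, the function names, and the data are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def cross_validated_r2(xs, ys, k=5):
    """Predict every sample from a model fitted without its fold, then score."""
    n = len(xs)
    predictions = [0.0] * n
    for fold in range(k):
        test_idx = range(fold, n, k)  # every k-th sample forms one fold
        held_out = set(test_idx)
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        slope, intercept = fit_line(train_x, train_y)
        for i in test_idx:
            predictions[i] = slope * xs[i] + intercept
    return r_squared(ys, predictions)


# Hypothetical descriptor/activity pairs with small alternating noise.
xs = [float(i) for i in range(10)]
ys = [2 * x + 1 + (0.1 if i % 2 == 0 else -0.1) for i, x in enumerate(xs)]
cv_r2 = cross_validated_r2(xs, ys, k=5)
print(cv_r2)
```

External validation, as in the abstract, would go one step further: fit on the whole modeling set and score predictions on a separate validation set that played no part in model building.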
QSAR modelling using combined simple competitive learning networks and RBF neural networks.
Sheikhpour, R; Sarram, M A; Rezaeian, M; Sheikhpour, E
2018-04-01
The aim of this study was to propose a QSAR modelling approach based on the combination of simple competitive learning (SCL) networks with radial basis function (RBF) neural networks for predicting the biological activity of chemical compounds. The proposed QSAR method consisted of two phases. In the first phase, an SCL network was applied to determine the centres of an RBF neural network. In the second phase, the RBF neural network was used to predict the biological activity of various phenols and Rho kinase (ROCK) inhibitors. The predictive ability of the proposed QSAR models was evaluated and compared with other QSAR models using external validation. The results of this study showed that the proposed QSAR modelling approach leads to better performances than other models in predicting the biological activity of chemical compounds. This indicated the efficiency of simple competitive learning networks in determining the centres of RBF neural networks.
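The first phase described above, placing RBF centres with a simple competitive learning network, can be sketched as a winner-take-all update: for each sample, only the nearest centre moves a little toward it. The code below is an illustrative reading of that idea, not the authors' implementation. The function names, learning rate, epoch count, Gaussian width, and toy data are assumptions, and fitting the output-layer weights of the RBF network is left out.

```python
import math


def train_scl_centres(samples, initial_centres, learning_rate=0.1, epochs=20):
    """Winner-take-all competitive learning: move only the nearest centre."""
    centres = [list(c) for c in initial_centres]
    for _ in range(epochs):
        for x in samples:
            winner = min(centres, key=lambda c: math.dist(c, x))
            for j, xj in enumerate(x):
                winner[j] += learning_rate * (xj - winner[j])
    return centres


def rbf_activations(x, centres, width=1.0):
    """Gaussian hidden-layer outputs of an RBF network for one input vector."""
    return [
        math.exp(-math.dist(x, c) ** 2 / (2 * width ** 2))
        for c in centres
    ]


# Hypothetical two-descriptor data forming two well-separated clusters.
samples = [
    (0.0, 0.0), (0.2, 0.0), (0.0, 0.2),
    (5.0, 5.0), (5.2, 5.0), (5.0, 5.2),
]
centres = train_scl_centres(samples, initial_centres=[(1.0, 1.0), (4.0, 4.0)])
print(centres)
print(rbf_activations((0.0, 0.0), centres))
```

In the second phase, the activations returned by `rbf_activations` would serve as inputs to a linear output layer trained on the measured activities.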
2D-QSAR and 3D-QSAR Analyses for EGFR Inhibitors
Zhao, Manman; Zheng, Linfeng; Qiu, Chun
2017-01-01
Epidermal growth factor receptor (EGFR) is an important target for cancer therapy. In this study, EGFR inhibitors were investigated to build a two-dimensional quantitative structure-activity relationship (2D-QSAR) model and a three-dimensional quantitative structure-activity relationship (3D-QSAR) model. In the 2D-QSAR model, the support vector machine (SVM) classifier combined with the feature selection method was applied to predict whether a compound was an EGFR inhibitor. As a result, the prediction accuracy of the 2D-QSAR model was 98.99% by using tenfold cross-validation test and 97.67% by using independent set test. Then, in the 3D-QSAR model, the model with q2 = 0.565 (cross-validated correlation coefficient) and r2 = 0.888 (non-cross-validated correlation coefficient) was built to predict the activity of EGFR inhibitors. The mean absolute error (MAE) of the training set and test set was 0.308 log units and 0.526 log units, respectively. In addition, molecular docking was also employed to investigate the interaction between EGFR inhibitors and EGFR. PMID:28630865
Dixon, Steven L; Duan, Jianxin; Smith, Ethan; Von Bargen, Christopher D; Sherman, Woody; Repasky, Matthew P
2016-10-01
We introduce AutoQSAR, an automated machine-learning application to build, validate and deploy quantitative structure-activity relationship (QSAR) models. The process of descriptor generation, feature selection and the creation of a large number of QSAR models has been automated into a single workflow within AutoQSAR. The models are built using a variety of machine-learning methods, and each model is scored using a novel approach. Effectiveness of the method is demonstrated through comparison with literature QSAR models using identical datasets for six end points: protein-ligand binding affinity, solubility, blood-brain barrier permeability, carcinogenicity, mutagenicity and bioaccumulation in fish. AutoQSAR demonstrates similar or better predictive performance as compared with published results for four of the six endpoints while requiring minimal human time and expertise.
The great descriptor melting pot: mixing descriptors for the common good of QSAR models.
Tseng, Yufeng J; Hopfinger, Anton J; Esposito, Emilio Xavier
2012-01-01
The usefulness and utility of QSAR modeling depends heavily on the ability to estimate the values of molecular descriptors relevant to the endpoints of interest followed by an optimized selection of descriptors to form the best QSAR models from a representative set of the endpoints of interest. The performance of a QSAR model is directly related to its molecular descriptors. QSAR modeling, specifically model construction and optimization, has benefited from its ability to borrow from other unrelated fields, yet the molecular descriptors that form QSAR models have remained basically unchanged in both form and preferred usage. There are many types of endpoints that require multiple classes of descriptors (descriptors that encode 1D through multi-dimensional, 4D and above, content) to most fully capture the molecular features and interactions that contribute to the endpoint. The advantages of QSAR models constructed from multiple, and different, descriptor classes have been demonstrated in the exploration of markedly different, and principally biological, systems and endpoints. Multiple examples of such QSAR applications using different descriptor sets are described and examined. The take-home message is that a major part of the future of QSAR analysis, and its application to modeling biological potency, ADME-Tox properties, general use in virtual screening applications, as well as its expanding use into new fields for building QSPR models, lies in developing strategies that combine and use 1D through nD molecular descriptors.
In silico study of in vitro GPCR assays by QSAR modeling ...
The U.S. EPA is screening thousands of chemicals of environmental interest in hundreds of in vitro high-throughput screening (HTS) assays (the ToxCast program). One goal is to prioritize chemicals for more detailed analyses based on activity in molecular initiating events (MIE) of adverse outcome pathways (AOPs). However, the chemical space of interest for environmental exposure is much wider than this set of chemicals. Thus, there is a need to fill data gaps with in silico methods, and quantitative structure-activity relationships (QSARs) are a proven and cost effective approach to predict biological activity. ToxCast in turn provides relatively large datasets that are ideal for training and testing QSAR models. The overall goal of the study described here was to develop QSAR models to fill the data gaps in a larger environmental database of ~32k structures. The specific aim of the current work was to build QSAR models for 18 G-Protein Coupled Receptor (GPCR) assays, part of the aminergic category. Two QSAR modeling strategies were adopted: classification models were developed to separate chemicals into active/non-active classes, and then regression models were built to predict the potency values of the bioassays for the active chemicals. Multiple software programs were used to calculate constitutional, topological and substructural molecular descriptors from two-dimensional (2D) chemical structures. Model-fitting methods included PLSDA (partial least squares d
LQTA-QSAR: a new 4D-QSAR methodology.
Martins, João Paulo A; Barbosa, Euzébio G; Pasqualoto, Kerly F M; Ferreira, Márcia M C
2009-06-01
A novel 4D-QSAR approach which makes use of the molecular dynamics (MD) trajectories and topology information retrieved from the GROMACS package is presented in this study. This new methodology, named LQTA-QSAR (LQTA, Laboratório de Quimiometria Teórica e Aplicada), has a module (LQTAgrid) that calculates intermolecular interaction energies at each grid point considering probes and all aligned conformations resulting from MD simulations. These interaction energies are the independent variables or descriptors employed in a QSAR analysis. The comparison of the proposed methodology to other 4D-QSAR and CoMFA formalisms was performed using a set of forty-seven glycogen phosphorylase b inhibitors (data set 1) and a set of forty-four MAP p38 kinase inhibitors (data set 2). The QSAR models for both data sets were built using the ordered predictor selection (OPS) algorithm for variable selection. Model validation was carried out applying y-randomization and leave-N-out cross-validation in addition to the external validation. PLS models for data set 1 and 2 provided the following statistics: q(2) = 0.72, r(2) = 0.81 for 12 variables selected and 2 latent variables and q(2) = 0.82, r(2) = 0.90 for 10 variables selected and 5 latent variables, respectively. Visualization of the descriptors in 3D space was successfully interpreted from the chemical point of view, supporting the applicability of this new approach in rational drug design.
Sensitivity Analysis of QSAR Models for Assessing Novel Military Compounds
2009-01-01
ERDC TR-09-3. Strategic Environmental Research and Development Program. Sensitivity Analysis of QSAR Models for Assessing Novel... Environmental Research and Development Program. ERDC TR-09-3, January 2009. Sensitivity Analysis of QSAR Models for Assessing Novel Military Compound... Jay L. Clausen, Cold Regions Research and Engineering Laboratory, U.S. Army Engineer Research and Development Center, 72 Lyme Road, Hanover, NH
Statistical molecular design of balanced compound libraries for QSAR modeling.
Linusson, A; Elofsson, M; Andersson, I E; Dahlgren, M K
2010-01-01
A fundamental step in preclinical drug development is the computation of quantitative structure-activity relationship (QSAR) models, i.e. models that link chemical features of compounds with activities towards a target macromolecule associated with the initiation or progression of a disease. QSAR models are computed by combining information on the physicochemical and structural features of a library of congeneric compounds, typically assembled from two or more building blocks, and biological data from one or more in vitro assays. Since the models provide information on features affecting the compounds' biological activity they can be used as guides for further optimization. However, in order for a QSAR model to be relevant to the targeted disease, and drug development in general, the compound library used must contain molecules with balanced variation of the features spanning the chemical space believed to be important for interaction with the biological target. In addition, the assays used must be robust and deliver high quality data that are directly related to the function of the biological target and the associated disease state. In this review, we discuss and exemplify the concept of statistical molecular design (SMD) in the selection of building blocks and final synthetic targets (i.e. compounds to synthesize) to generate information-rich, balanced libraries for biological testing and computation of QSAR models.
Dong, Xialan; Ebalunode, Jerry O; Cho, Sung Jin; Zheng, Weifan
2010-02-22
Quantitative structure-activity relationship (QSAR) methods aim to build quantitatively predictive models for the discovery of new molecules. They have been widely used in medicinal chemistry for drug discovery. Many QSAR techniques have been developed since Hansch's seminal work, and more are still being developed. Motivated by Hopfinger's receptor-dependent QSAR (RD-QSAR) formalism and the Lukacova-Balaz scheme to treat multimode issues, we have initiated studies that focus on a structure-based multimode QSAR (SBMM QSAR) method, where the structure of the target protein is used in characterizing the ligand, and the multimode issue of ligand binding is systematically treated with a modified Lukacova-Balaz scheme. All ligand molecules are first docked to the target binding pocket to obtain a set of aligned ligand poses. A structure-based pharmacophore concept is adopted to characterize the binding pocket. Specifically, we represent the binding pocket as a geometric grid labeled by pharmacophoric features. Each pose of the ligand is also represented as a labeled grid, where each grid point is labeled according to the atom types of nearby ligand atoms. These labeled grids or three-dimensional (3D) maps (both the receptor map (R-map) and the ligand map (L-map)) are compared to each other to derive descriptors for each pose of the ligand, resulting in a multimode structure-activity relationship (SAR) table. Iterative partial least-squares (PLS) is employed to build the QSAR models. When we applied this method to analyze PDE-4 inhibitors, we obtained predictive models with excellent training correlation (r(2) = 0.65-0.66), as well as test correlation (R(2) = 0.64-0.65). A comparative analysis with 4 other QSAR techniques demonstrates that this new method affords better models in terms of the prediction power for the test set.
Rational selection of training and test sets for the development of validated QSAR models
NASA Astrophysics Data System (ADS)
Golbraikh, Alexander; Shen, Min; Xiao, Zhiyan; Xiao, Yun-De; Lee, Kuo-Hsiung; Tropsha, Alexander
2003-02-01
Quantitative Structure-Activity Relationship (QSAR) models are used increasingly to screen chemical databases and/or virtual chemical libraries for potentially bioactive molecules. These developments emphasize the importance of rigorous model validation to ensure that the models have acceptable predictive power. Using k nearest neighbors (kNN) variable selection QSAR method for the analysis of several datasets, we have demonstrated recently that the widely accepted leave-one-out (LOO) cross-validated R2 (q2) is an inadequate characteristic to assess the predictive ability of the models [Golbraikh, A., Tropsha, A. Beware of q2! J. Mol. Graphics Mod. 20, 269-276, (2002)]. Herein, we provide additional evidence that there exists no correlation between the values of q2 for the training set and accuracy of prediction (R2) for the test set and argue that this observation is a general property of any QSAR model developed with LOO cross-validation. We suggest that external validation using rationally selected training and test sets provides a means to establish a reliable QSAR model. We propose several approaches to the division of experimental datasets into training and test sets and apply them in QSAR studies of 48 functionalized amino acid anticonvulsants and a series of 157 epipodophyllotoxin derivatives with antitumor activity. We formulate a set of general criteria for the evaluation of predictive power of QSAR models.
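The two statistics contrasted in this abstract can be computed side by side. The following is a minimal sketch, assuming a single-descriptor linear model rather than the kNN method used by the authors; the function names and the choice of 1 - SS_res/SS_tot as the external R2 are illustrative assumptions, not the paper's exact definitions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def loo_q2(xs, ys):
    """Leave-one-out q2 = 1 - PRESS / SS_total on the training set.

    Each compound is predicted by a model fitted without it.
    """
    n = len(xs)
    my = sum(ys) / n
    press = 0.0
    for i in range(n):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_tot


def external_r2(train_x, train_y, test_x, test_y):
    """Predictive R2 on a held-out test set (assumed definition:
    1 - SS_res / SS_tot, with SS_tot taken around the test-set mean)."""
    a, b = fit_line(train_x, train_y)
    mean_test = sum(test_y) / len(test_y)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(test_x, test_y))
    ss_tot = sum((y - mean_test) ** 2 for y in test_y)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical descriptor values and activities, for illustration only.
    train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
    train_y = [1.0, 3.2, 2.9, 5.1, 4.8]
    test_x = [1.5, 2.5, 3.5]
    test_y = [2.4, 2.6, 4.4]
    print("LOO q2:", loo_q2(train_x, train_y))
    print("external R2:", external_r2(train_x, train_y, test_x, test_y))
```

Reporting both numbers for each candidate model makes it easy to see whether a high q2 is accompanied by a comparable external R2, which is the check the abstract argues for.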
Roy, Kunal; Mitra, Indrani
2011-07-01
Quantitative structure-activity relationships (QSARs) have important applications in drug discovery research, environmental fate modeling, property prediction, etc. Validation has been recognized as a very important step for QSAR model development. As one of the important objectives of QSAR modeling is to predict activity/property/toxicity of new chemicals falling within the domain of applicability of the developed models and QSARs are being used for regulatory decisions, checking reliability of the models and confidence of their predictions is a very important aspect, which can be judged during the validation process. One prime application of a statistically significant QSAR model is virtual screening for molecules with improved potency based on the pharmacophoric features and the descriptors appearing in the QSAR model. Validated QSAR models may also be utilized for design of focused libraries which may be subsequently screened for the selection of hits. The present review focuses on various metrics used for validation of predictive QSAR models together with an overview of the application of QSAR models in the fields of virtual screening and focused library design for diverse series of compounds with citation of some recent examples.
Development of a QSAR Model for Thyroperoxidase Inhibition ...
Thyroid hormones (THs) are involved in multiple biological processes and are critical modulators of fetal development. Even moderate changes in maternal or fetal TH levels can produce irreversible neurological deficits in children, such as lower IQ. The enzyme thyroperoxidase (TPO) plays a key role in the synthesis of THs, and inhibition of TPO by xenobiotics results in decreased TH synthesis. Recently, a high-throughput screening assay for TPO inhibition (AUR-TPO) was developed and used to test the ToxCast Phase I and II chemicals. In the present study, we used the results from AUR-TPO to develop a Quantitative Structure-Activity Relationship (QSAR) model for TPO inhibition. The training set consisted of 898 discrete organic chemicals: 134 inhibitors and 764 non-inhibitors. A five times two-fold cross-validation of the model was performed, yielding a balanced accuracy of 78.7%. More recently, an additional ~800 chemicals were tested in the AUR-TPO assay. These data were used for a blinded external validation of the QSAR model, demonstrating a balanced accuracy of 85.7%. Overall, the cross- and external validation indicate a robust model with high predictive performance. Next, we used the QSAR model to predict 72,526 REACH pre-registered substances. The model could predict 49.5% (35,925) of the substances in its applicability domain and of these, 8,863 (24.7%) were predicted to be TPO inhibitors. Predictions from this screening can be used in a tiered approach to
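Balanced accuracy is reported here because the training set is imbalanced (134 inhibitors versus 764 non-inhibitors). The sketch below uses the standard definition, the mean of sensitivity and specificity, and applies it to a hypothetical trivial model that labels every chemical a non-inhibitor; it also rechecks the screening percentages quoted in the abstract. The function names are illustrative.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (true-positive rate) and specificity (true-negative rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def accuracy(tp, fn, tn, fp):
    """Plain fraction of correct predictions."""
    return (tp + tn) / (tp + fn + tn + fp)


if __name__ == "__main__":
    # Hypothetical trivial model: predicts "non-inhibitor" for all 898 training chemicals.
    tp, fn, tn, fp = 0, 134, 764, 0
    print("accuracy:", accuracy(tp, fn, tn, fp))
    print("balanced accuracy:", balanced_accuracy(tp, fn, tn, fp))

    # Screening figures from the abstract: coverage and predicted-inhibitor fraction.
    print("in-domain share (%):", round(100 * 35925 / 72526, 1))
    print("predicted inhibitors (%):", round(100 * 8863 / 35925, 1))
```

The trivial model scores well on plain accuracy while its balanced accuracy is no better than chance, which is why the latter is the more informative figure for data sets like this one.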
The importance of data curation on QSAR Modeling ...
During the last few decades many QSAR models and tools have been developed at the US EPA, including the widely used EPISuite. During this period the arsenal of computational capabilities supporting cheminformatics has broadened dramatically with multiple software packages. These modern tools allow for more advanced techniques in terms of chemical structure representation and storage, as well as enabling automated data-mining and standardization approaches to examine and fix data quality issues. This presentation will investigate the impact of data curation on the reliability of QSAR models being developed within the EPA's National Center for Computational Toxicology. As part of this work we have attempted to disentangle the influence of the quality versus quantity of data based on the Syracuse PHYSPROP database partly used by EPISuite software. We will review our automated approaches to examining key datasets related to the EPISuite data to validate across chemical structure representations (e.g., mol file and SMILES) and identifiers (chemical names and registry numbers) and approaches to standardize data into QSAR-ready formats prior to modeling procedures. Our efforts to quantify and segregate data into quality categories have allowed us to evaluate the resulting models that can be developed from these data slices and to quantify to what extent efforts developing high-quality datasets have the expected pay-off in terms of predicting performance. The most accur
Web-4D-QSAR: A web-based application to generate 4D-QSAR descriptors.
Ataide Martins, João Paulo; Rougeth de Oliveira, Marco Antônio; Oliveira de Queiroz, Mário Sérgio
2018-06-05
A web-based application is developed to generate 4D-QSAR descriptors using the LQTA-QSAR methodology, based on molecular dynamics (MD) trajectories and topology information retrieved from the GROMACS package. The LQTAGrid module calculates the intermolecular interaction energies at each grid point, considering probes and all aligned conformations resulting from MD simulations. These interaction energies are the independent variables or descriptors employed in a QSAR analysis. A friendly front end web interface, built using the Django framework and Python programming language, integrates all steps of the LQTA-QSAR methodology in a way that is transparent to the user, and in the backend, GROMACS and LQTAGrid are executed to generate 4D-QSAR descriptors to be used later in the process of QSAR model building. © 2018 Wiley Periodicals, Inc.
Modeling Liver-Related Adverse Effects of Drugs Using kNN QSAR Method
Rodgers, Amie D.; Zhu, Hao; Fourches, Dennis; Rusyn, Ivan; Tropsha, Alexander
2010-01-01
Adverse effects of drugs (AEDs) continue to be a major cause of drug withdrawals both in development and post-marketing. While liver-related AEDs are a major concern for drug safety, there are few in silico models for predicting human liver toxicity for drug candidates. We have applied the Quantitative Structure Activity Relationship (QSAR) approach to model liver AEDs. In this study, we aimed to construct a QSAR model capable of binary classification (active vs. inactive) of drugs for liver AEDs based on chemical structure. To build QSAR models, we have employed an FDA spontaneous reporting database of human liver AEDs (elevations in activity of serum liver enzymes), which contains data on approximately 500 approved drugs. Approximately 200 compounds with wide clinical data coverage, structural similarity and balanced (40/60) active/inactive ratio were selected for modeling and divided into multiple training/test and external validation sets. QSAR models were developed using the k nearest neighbor method and validated using external datasets. Models with high sensitivity (>73%) and specificity (>94%) for prediction of liver AEDs in external validation sets were developed. To test applicability of the models, three chemical databases (World Drug Index, Prestwick Chemical Library, and Biowisdom Liver Intelligence Module) were screened in silico and the validity of predictions was determined, where possible, by comparing model-based classification with assertions in publicly available literature. Validated QSAR models of liver AEDs based on the data from the FDA spontaneous reporting system can be employed as sensitive and specific predictors of AEDs in pre-clinical screening of drug candidates for potential hepatotoxicity in humans. PMID:20192250
New public QSAR model for carcinogenicity
2010-01-01
Background One of the main goals of the new chemical regulation REACH (Registration, Evaluation and Authorization of Chemicals) is to fill the gaps in data concerning properties of chemicals that affect human health. (Q)SAR models are accepted as a suitable source of information. The EU funded CAESAR project aimed to develop models for prediction of 5 endpoints for regulatory purposes. Carcinogenicity is one of the endpoints under consideration. Results Models for prediction of carcinogenic potency according to specific requirements of Chemical regulation were developed. The dataset of 805 non-congeneric chemicals extracted from Carcinogenic Potency Database (CPDBAS) was used. Counter Propagation Artificial Neural Network (CP ANN) algorithm was implemented. In the article two alternative models for predicting carcinogenicity are described. The first model employed eight MDL descriptors (model A) and the second one twelve Dragon descriptors (model B). CAESAR's models have been assessed according to the OECD principles for the validation of QSAR. For the model validity we used a wide series of statistical checks. Models A and B yielded accuracy on the training set (644 compounds) equal to 91% and 89%, respectively; the accuracy on the test set (161 compounds) was 73% and 69%, while the specificity was 69% and 61%, respectively. Sensitivity in both cases was equal to 75%. The accuracy of the leave 20% out cross validation for the training set of models A and B was equal to 66% and 62%, respectively. To verify whether the models perform correctly on new compounds, external validation was carried out. The external test set was composed of 738 compounds. We obtained accuracy of external validation equal to 61.4% and 60.0%, sensitivity 64.0% and 61.8% and specificity equal to 58.9% and 58.4%, respectively, for models A and B. Conclusion Carcinogenicity is a particularly important endpoint and it is expected that QSAR models will not replace the human experts opinions
2011-01-01
Background Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds and the size of proprietary, as well as public data sets, is rapidly growing. Hence, there is a demand for computationally efficient machine learning algorithms, easily available to researchers without extensive machine learning knowledge. In granting the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source state-of-the-art high performance machine learning platform, interfacing multiple, customized machine learning algorithms for both graphical programming and scripting, to be used for large scale development of QSAR models of regulatory quality, is of great value to the QSAR community. Results This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange is specially developed to support batch generation of QSAR models in providing the full work flow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated work flow relies upon the customization of the machine learning algorithms and a generalized, automated model hyper-parameter selection process. Several high performance machine learning algorithms are interfaced for efficient data set specific selection of the statistical method, promoting model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge as flexible applications can be created, not only at a scripting level, but also in a graphical programming environment. Conclusions AZOrange is a step towards meeting the needs for an Open Source high performance machine learning platform, supporting the efficient development of
Stålring, Jonna C; Carlsson, Lars A; Almeida, Pedro; Boyer, Scott
2011-07-28
Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds and the size of proprietary, as well as public data sets, is rapidly growing. Hence, there is a demand for computationally efficient machine learning algorithms, easily available to researchers without extensive machine learning knowledge. In granting the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source state-of-the-art high performance machine learning platform, interfacing multiple, customized machine learning algorithms for both graphical programming and scripting, to be used for large scale development of QSAR models of regulatory quality, is of great value to the QSAR community. This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange is specially developed to support batch generation of QSAR models in providing the full work flow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated work flow relies upon the customization of the machine learning algorithms and a generalized, automated model hyper-parameter selection process. Several high performance machine learning algorithms are interfaced for efficient data set specific selection of the statistical method, promoting model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge as flexible applications can be created, not only at a scripting level, but also in a graphical programming environment. AZOrange is a step towards meeting the needs for an Open Source high performance machine learning platform, supporting the efficient development of highly accurate QSAR models
Using Toxicological Evidence from QSAR Models in Practice
The new generation of QSAR models provides supporting documentation in addition to the predicted toxicological value. Such information enables the toxicologist to explore the properties of chemical substances and to review and increase the reliability of toxicity predictions. Thi...
QSAR Modeling of Rat Acute Toxicity by Oral Exposure
Zhu, Hao; Martin, Todd M.; Ye, Lin; Sedykh, Alexander; Young, Douglas M.; Tropsha, Alexander
2009-01-01
Few Quantitative Structure-Activity Relationship (QSAR) studies have successfully modeled large, diverse rodent toxicity endpoints. In this study, a comprehensive dataset of 7,385 compounds with their most conservative lethal dose (LD50) values has been compiled. A combinatorial QSAR approach has been employed to develop robust and predictive models of acute toxicity in rats caused by oral exposure to chemicals. To enable fair comparison between the predictive power of models generated in this study versus a commercial toxicity predictor, TOPKAT (Toxicity Prediction by Komputer Assisted Technology), a modeling subset of the entire dataset was selected that included all 3,472 compounds used in the TOPKAT’s training set. The remaining 3,913 compounds, which were not present in the TOPKAT training set, were used as the external validation set. QSAR models of five different types were developed for the modeling set. The prediction accuracy for the external validation set was estimated by determination coefficient R2 of linear regression between actual and predicted LD50 values. The use of the applicability domain threshold implemented in most models generally improved the external prediction accuracy but expectedly led to the decrease in chemical space coverage; depending on the applicability domain threshold, R2 ranged from 0.24 to 0.70. Ultimately, several consensus models were developed by averaging the predicted LD50 for every compound using all 5 models. The consensus models afforded higher prediction accuracy for the external validation dataset with the higher coverage as compared to individual constituent models. The validated consensus LD50 models developed in this study can be used as reliable computational predictors of in vivo acute toxicity. PMID:19845371
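The trade-off between applicability domain and coverage, and the consensus step, can be sketched in a few lines. This is a minimal illustration, assuming each individual model returns `None` for a compound outside its applicability domain and that the consensus is the plain mean of the available predictions; the data and function names are hypothetical and not taken from the study.

```python
def consensus(predictions):
    """Average the predictions that are available (not None).

    Returns None when no model covers the compound.
    """
    available = [p for p in predictions if p is not None]
    if not available:
        return None
    return sum(available) / len(available)


def coverage(values):
    """Fraction of compounds that received a prediction."""
    return sum(v is not None for v in values) / len(values)


if __name__ == "__main__":
    # Rows: compounds; columns: five hypothetical models.
    # None marks a compound outside that model's applicability domain.
    per_model = [
        [2.1, 2.3, None, 2.0, 2.2],
        [None, None, None, None, None],
        [3.0, None, 3.4, None, 3.2],
    ]
    cons = [consensus(row) for row in per_model]
    print("consensus predictions:", cons)
    print("consensus coverage:", coverage(cons))
    for m in range(5):
        column = [row[m] for row in per_model]
        print("model", m, "coverage:", coverage(column))
```

Because a consensus value exists whenever at least one model covers the compound, consensus coverage is never lower than that of any single constituent model, which is consistent with the higher coverage reported above.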
Rácz, A; Bajusz, D; Héberger, K
2015-01-01
Recent implementations of QSAR modelling software provide the user with numerous models and a wealth of information. In this work, we provide some guidance on how one should interpret the results of QSAR modelling, compare and assess the resulting models, and select the best and most consistent ones. Two QSAR datasets are applied as case studies for the comparison of model performance parameters and model selection methods. We demonstrate the capabilities of sum of ranking differences (SRD) in model selection and ranking, and identify the best performance indicators and models. While the exchange of the original training and (external) test sets does not affect the ranking of performance parameters, it provides improved models in certain cases (despite the lower number of molecules in the training set). Performance parameters for external validation are substantially separated from the other merits in SRD analyses, highlighting their value in data fusion.
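The core of a sum of ranking differences (SRD) calculation is small enough to sketch directly. The version below is a simplified illustration, assuming the reference ranking is supplied by the caller (the row average is a common choice) and omitting the normalization and permutation-based validation that a full SRD analysis would include; the function names are illustrative.

```python
def ranks(values):
    """1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def srd(column, reference):
    """Sum of absolute differences between the ranks of a column and of the reference."""
    return sum(abs(a - b) for a, b in zip(ranks(column), ranks(reference)))


if __name__ == "__main__":
    # Rows: objects (e.g. compounds); columns: hypothetical models to compare.
    table = [
        [0.9, 0.7, 0.2],
        [0.5, 0.6, 0.9],
        [0.1, 0.2, 0.5],
        [0.7, 0.9, 0.4],
    ]
    reference = [sum(row) / len(row) for row in table]  # row average as reference
    for m in range(3):
        column = [row[m] for row in table]
        print("model", m, "SRD:", srd(column, reference))
```

A smaller SRD means the column orders the objects more like the reference does, so columns can be ranked by their SRD values.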
2012-01-01
The ORCHESTRA online questionnaire on “benefits and barriers to the use of QSAR methods” addressed the academic, consultant, regulatory and industry communities potentially interested in QSAR methods in the context of REACH. Replies from more than 60 stakeholders produced some insights on the actual application of QSAR methods, and how to improve their use. A majority of respondents state that they have used QSAR methods. All have some future plans to test or use QSAR methods in accordance with their stakeholder role. The stakeholder respondents cited a total of 28 models, methods or software that they have actually applied. The three most frequently cited suites, used moreover by all the stakeholder categories, are the OECD Toolbox, EPISuite and CAESAR; all are free tools. Results suggest that stereotyped assumptions about the barriers to application of QSAR may be incorrect. Economic costs (including potential delays) are not found to be a major barrier. Only one respondent “prefers” traditional, well-known and accepted toxicological assessment methods. Information and guidance may be the keys to reinforcing use of QSAR models. Regulators appear most interested in obtaining a clear explanation of the basis of the models, to provide a solid basis for decisions. Scientists appear most interested in the exploration of the scientific capabilities of the QSAR approach. Industry shows interest in obtaining reassurance that appropriate uses of QSAR will be accepted by regulators. PMID:23244245
Hodyna, Diana; Kovalishyn, Vasyl; Rogalsky, Sergiy; Blagodatnyi, Volodymyr; Petko, Kirill; Metelytsia, Larisa
2016-09-01
Predictive QSAR models for the inhibitors of B. subtilis and Ps. aeruginosa among imidazolium-based ionic liquids were developed using literature data. The regression QSAR models were created through Artificial Neural Network and k-nearest neighbor procedures. The classification QSAR models were constructed using the WEKA-RF (random forest) method. The predictive ability of the models was tested by fivefold cross-validation, giving q(2) = 0.77-0.92 for regression models and accuracy of 83-88% for classification models. Twenty synthesized samples of 1,3-dialkylimidazolium ionic liquids with predicted levels of antimicrobial activity were evaluated. For all asymmetric 1,3-dialkylimidazolium ionic liquids, only compounds containing at least one radical with an alkyl chain length of 12 carbon atoms showed high antibacterial activity. However, the activity of symmetric 1,3-dialkylimidazolium salts was found to have the opposite relationship with the length of the aliphatic radical, being maximum for compounds based on the 1,3-dioctylimidazolium cation. The obtained experimental results suggested that the application of classification QSAR models is more accurate for the prediction of activity of new imidazolium-based ILs as potential antibacterials. © 2016 John Wiley & Sons A/S.
The importance of data curation on QSAR Modeling - PHYSPROP open data as a case study. (QSAR 2016)
During the last few decades many QSAR models and tools have been developed at the US EPA, including the widely used EPISuite. During this period the arsenal of computational capabilities supporting cheminformatics has broadened dramatically with multiple software packages. These ...
Metabolic biotransformation half-lives in fish: QSAR modeling and consensus analysis.
Papa, Ester; van der Wal, Leon; Arnot, Jon A; Gramatica, Paola
2014-02-01
Bioaccumulation in fish is a function of competing rates of chemical uptake and elimination. For hydrophobic organic chemicals bioconcentration, bioaccumulation and biomagnification potential are high and the biotransformation rate constant is a key parameter. Few measured biotransformation rate constant data are available compared to the number of chemicals that are being evaluated for bioaccumulation hazard and for exposure and risk assessment. Three new Quantitative Structure-Activity Relationships (QSARs) for predicting whole body biotransformation half-lives (HLN) in fish were developed and validated using theoretical molecular descriptors that seek to capture structural characteristics of the whole molecule and three data set splitting schemes. The new QSARs were developed using a minimal number of theoretical descriptors (n=9) and compared to existing QSARs developed using fragment contribution methods that include up to 59 descriptors. The predictive statistics of the models are similar thus further corroborating the predictive performance of the different QSARs; Q(2)ext ranges from 0.75 to 0.77, CCCext ranges from 0.86 to 0.87, RMSE in prediction ranges from 0.56 to 0.58. The new QSARs provide additional mechanistic insights into the biotransformation capacity of organic chemicals in fish by including whole molecule descriptors and they also include information on the domain of applicability for the chemical of interest. Advantages of consensus modeling for improving overall prediction and minimizing false negative errors in chemical screening assessments, for identifying potential sources of residual error in the empirical HLN database, and for identifying structural features that are not well represented in the HLN dataset to prioritize future testing needs are illustrated. © 2013.
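Two of the external-validation statistics quoted above, the concordance correlation coefficient (CCC) and the RMSE in prediction, are straightforward to compute. The sketch below uses Lin's standard CCC formula with population (divide-by-n) variances; the example half-life values are hypothetical and the function names are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def ccc(observed, predicted):
    """Lin's concordance correlation coefficient.

    CCC = 2*cov / (var_obs + var_pred + (mean_obs - mean_pred)**2),
    with variances and covariance computed using n in the denominator.
    """
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    var_o = sum((o - mo) ** 2 for o in observed) / n
    var_p = sum((p - mp) ** 2 for p in predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted)) / n
    return 2 * cov / (var_o + var_p + (mo - mp) ** 2)


if __name__ == "__main__":
    # Hypothetical log-scale half-lives for an external set.
    observed = [0.2, 0.9, 1.4, 2.1, 2.8]
    predicted = [0.5, 0.7, 1.6, 1.8, 3.1]
    print("CCC:", ccc(observed, predicted))
    print("RMSE:", rmse(observed, predicted))
```

Unlike a plain correlation coefficient, CCC is reduced by any systematic offset between observed and predicted values, so a model that is well correlated but biased does not score as perfectly concordant.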
QSAR modeling for predicting mutagenic toxicity of diverse chemicals for regulatory purposes.
Basant, Nikita; Gupta, Shikha
2017-06-01
The safety assessment process of chemicals requires information on their mutagenic potential. The experimental determination of mutagenicity of a large number of chemicals is tedious and time and cost intensive, thus calling for alternative methods. We have established local and global QSAR models for discriminating low and high mutagenic compounds and predicting their mutagenic activity in a quantitative manner in Salmonella typhimurium (TA) bacterial strains (TA98 and TA100). The decision treeboost (DTB)-based classification QSAR models discriminated between the two categories with accuracies of >96%, and the regression QSAR models precisely predicted the mutagenic activity of diverse chemicals, yielding high correlations (R2) between the experimental and model-predicted values in the respective training (>0.96) and test (>0.94) sets. The test set root mean squared error (RMSE) and mean absolute error (MAE) values emphasized the usefulness of the developed models for predicting new compounds. Relevant structural features of diverse chemicals that are responsible for and influence the mutagenic activity were identified. The applicability domains of the developed models were defined. The developed models can be used as tools for screening new chemicals for their mutagenicity assessment for regulatory purposes.
An examination of data quality on QSAR Modeling in regards ...
The development of QSAR models is critically dependent on the quality of available data. As part of our efforts to develop public platforms to provide access to predictive models, we have attempted to discriminate the influence of the quality versus quantity of data available to develop and validate QSAR models. We have focused our efforts on the widely used EPISuite software that was initially developed over two decades ago and, specifically, on the PHYSPROP dataset used to train the EPISuite prediction models. This presentation will review our approaches to examining key datasets, the delivery of curated data and the development of machine-learning models for thirteen separate property endpoints of interest to environmental science. We will also review how these data will be made freely accessible to the community via a new “chemistry dashboard”. This abstract does not reflect U.S. EPA policy. presentation at UNC-CH.
Yadav, Mukesh; Joshi, Shobha; Nayarisseri, Anuraj; Jain, Anuja; Hussain, Aabid; Dubey, Tushar
2013-06-01
Global QSAR models predict the biological response of molecular structures that are generic within a particular class. A global QSAR dataset admits structural features derived from a larger chemical space, which is intricate to model but more applicable in medicinal chemistry. The present work is global in the sense of both structural diversity in the QSAR dataset and the large number of descriptor inputs. Forty phenethylamine derivatives were selected from a large pool (904) of similar phenethylamines available in the PubChem database. LogP values of the selected candidates, determined under an identical set of conditions, were collected from the physical properties database (PHYSPROP). Attempts to model logP values produced significant QSAR models. The MLR-aided linear one-variable and two-variable QSAR models, with their respective R(2) (0.866, 0.937), R(2)A (0.862, 0.932), F-stat (181.936, 199.812) and standard error (0.365, 0.255), are statistically fit and were found predictive after internal and external validation. The descriptors chosen after improvisation and optimization reveal the mechanistic part of the work: the Verhaar model of fish base-line toxicity from MLOGP (BLTF96) and the 3D-MoRSE signal 15, unweighted (Mor15u), a molecular descriptor calculated by summing atom weights viewed by a different angular scattering function, are crucial in the regulation of logP values of phenethylamines.
Naik, P K; Singh, T; Singh, H
2009-07-01
Quantitative structure-activity relationship (QSAR) analyses were performed independently on data sets belonging to two groups of insecticides, namely the organophosphates and carbamates. Several types of descriptors including topological, spatial, thermodynamic, information content, lead likeness and E-state indices were used to derive quantitative relationships between insecticide activities and structural properties of chemicals. A systematic search approach based on missing value, zero value, simple correlation and multi-collinearity tests as well as the use of a genetic algorithm allowed the optimal selection of the descriptors used to generate the models. The QSAR models developed for the organophosphate and carbamate groups revealed good predictability with r(2) values of 0.949 and 0.838 as well as [image omitted] values of 0.890 and 0.765, respectively. In addition, a linear correlation was observed between the predicted and experimental LD(50) values for the test set data, with r(2) of 0.871 and 0.788 for the organophosphate and carbamate groups, respectively, indicating that the prediction accuracy of the QSAR models was acceptable. The models also passed external validation criteria. QSAR models developed in this study should help further design of novel potent insecticides.
Experimental Errors in QSAR Modeling Sets: What We Can Do and What We Cannot Do.
Zhao, Linlin; Wang, Wenyi; Sedykh, Alexander; Zhu, Hao
2017-06-30
Numerous chemical data sets have become available for quantitative structure-activity relationship (QSAR) modeling studies. However, the quality of different data sources may be different based on the nature of experimental protocols. Therefore, potential experimental errors in the modeling sets may lead to the development of poor QSAR models and further affect the predictions of new compounds. In this study, we explored the relationship between the ratio of questionable data in the modeling sets, which was obtained by simulating experimental errors, and the QSAR modeling performance. To this end, we used eight data sets (four continuous endpoints and four categorical endpoints) that have been extensively curated both in-house and by our collaborators to create over 1800 various QSAR models. Each data set was duplicated to create several new modeling sets with different ratios of simulated experimental errors (i.e., randomizing the activities of part of the compounds) in the modeling process. A fivefold cross-validation process was used to evaluate the modeling performance, which deteriorates when the ratio of experimental errors increases. All of the resulting models were also used to predict external sets of new compounds, which were excluded at the beginning of the modeling process. The modeling results showed that the compounds with relatively large prediction errors in cross-validation processes are likely to be those with simulated experimental errors. However, after removing a certain number of compounds with large prediction errors in the cross-validation process, the external predictions of new compounds did not show improvement. Our conclusion is that the QSAR predictions, especially consensus predictions, can identify compounds with potential experimental errors. But removing those compounds by the cross-validation procedure is not a reasonable means to improve model predictivity due to overfitting.
PMID:28691113
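The error-simulation step described in this abstract (randomizing the activities of part of the compounds) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name, the choice to shuffle activities among the selected compounds, and the seeding are all hypothetical.

```python
import random


def randomize_fraction(activities, error_ratio, seed=0):
    """Return (noisy_copy, noisy_indices).

    A fraction `error_ratio` of compounds is picked at random and their
    activity values are shuffled among themselves, which simulates
    experimental errors while keeping the overall activity distribution.
    """
    rng = random.Random(seed)
    n = len(activities)
    n_noisy = round(error_ratio * n)
    noisy_idx = sorted(rng.sample(range(n), n_noisy))
    values = [activities[i] for i in noisy_idx]
    rng.shuffle(values)
    noisy = list(activities)  # leave the curated data untouched
    for i, v in zip(noisy_idx, values):
        noisy[i] = v
    return noisy, noisy_idx
```

Calling this with increasing `error_ratio` values on one curated set yields a series of modeling sets like those the study compares under cross-validation.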
Pérez-Garrido, Alfonso; Morales Helguera, Aliuska; Abellán Guillén, Adela; Cordeiro, M Natália D S; Garrido Escudero, Amalio
2009-01-15
This paper reports a QSAR study for predicting the complexation of a large and heterogeneous variety of substances (233 organic compounds) with beta-cyclodextrins (beta-CDs). Several different theoretical molecular descriptors, calculated solely from the molecular structure of the compounds under investigation, and an efficient variable selection procedure, the Genetic Algorithm, led to models with satisfactory global accuracy and predictivity. The best final QSAR model is based on topological descriptors while still offering a reasonable interpretation. This QSAR model was able to explain ca. 84% of the variance in the experimental activity, and displayed very good internal cross-validation statistics and predictivity on external data. It shows that the driving forces for CD complexation are mainly hydrophobic and steric (van der Waals) interactions. Thus, the results of our study provide a valuable tool for future screening and priority testing of beta-CD guest molecules.
QSAR Modeling Using Large-Scale Databases: Case Study for HIV-1 Reverse Transcriptase Inhibitors.
Tarasova, Olga A; Urusova, Aleksandra F; Filimonov, Dmitry A; Nicklaus, Marc C; Zakharov, Alexey V; Poroikov, Vladimir V
2015-07-27
Large-scale databases are important sources of training sets for various QSAR modeling approaches. Generally, these databases contain information extracted from different sources. This variety of sources can produce inconsistency in the data, defined as sometimes widely diverging activity results for the same compound against the same target. Because such inconsistency can reduce the accuracy of predictive models built from these data, we are addressing the question of how best to use data from publicly and commercially accessible databases to create accurate and predictive QSAR models. We investigate the suitability of commercially and publicly available databases to QSAR modeling of antiviral activity (HIV-1 reverse transcriptase (RT) inhibition). We present several methods for the creation of modeling (i.e., training and test) sets from two, either commercially or freely available, databases: Thomson Reuters Integrity and ChEMBL. We found that the typical predictivities of QSAR models obtained using these different modeling set compilation methods differ significantly from each other. The best results were obtained using training sets compiled for compounds tested using only one method and material (i.e., a specific type of biological assay). Compound sets aggregated by target only typically yielded poorly predictive models. We discuss the possibility of "mix-and-matching" assay data across aggregating databases such as ChEMBL and Integrity and their current severe limitations for this purpose. One of them is the general lack of complete and semantic/computer-parsable descriptions of assay methodology carried by these databases that would allow one to determine mix-and-matchability of result sets at the assay level.
GTM-Based QSAR Models and Their Applicability Domains.
Gaspar, H A; Baskin, I I; Marcou, G; Horvath, D; Varnek, A
2015-06-01
In this paper we demonstrate that Generative Topographic Mapping (GTM), a machine learning method traditionally used for data visualisation, can be efficiently applied to QSAR modelling using probability distribution functions (PDF) computed in the latent 2-dimensional space. Several different scenarios of the activity assessment were considered: (i) the "activity landscape" approach based on direct use of PDF, (ii) QSAR models involving GTM-generated descriptors derived from PDF, and (iii) the k-Nearest Neighbours approach in 2D latent space. Benchmarking calculations were performed on five different datasets: stability constants of metal cation Ca(2+), Gd(3+) and Lu(3+) complexes with organic ligands in water, aqueous solubility and activity of thrombin inhibitors. It has been shown that the performance of GTM-based regression models is similar to that obtained with some popular machine-learning methods (random forest, k-NN, M5P regression tree and PLS) and ISIDA fragment descriptors. By comparing GTM activity landscapes built both on predicted and experimental activities, we may visually assess the model's performance and identify the areas in the chemical space corresponding to reliable predictions. The applicability domain used in this work is based on data likelihood. Its application has significantly improved the model performances for 4 out of 5 datasets. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Majumdar, Subhabrata; Basak, Subhash C
2018-04-26
Proper validation is an important aspect of QSAR modelling. External validation is one of the widely used validation methods in QSAR where the model is built on a subset of the data and validated on the rest of the samples. However, its effectiveness for datasets with a small number of samples but large number of predictors remains suspect. Calculating hundreds or thousands of molecular descriptors using currently available software has become the norm in QSAR research, owing to computational advances in the past few decades. Thus, for n chemical compounds and p descriptors calculated for each molecule, the typical chemometric dataset today has high value of p but small n (i.e. n < p). Motivated by the evidence of inadequacies of external validation in estimating the true predictive capability of a statistical model in recent literature, this paper performs an extensive and comparative study of this method with several other validation techniques. We compared four validation methods: leave-one-out, K-fold, external and multi-split validation, using statistical models built using the LASSO regression, which simultaneously performs variable selection and modelling. We used 300 simulated datasets and one real dataset of 95 congeneric amine mutagens for this evaluation. External validation metrics have high variation among different random splits of the data, hence are not recommended for predictive QSAR models. LOO has the overall best performance among all validation methods applied in our scenario. Results from external validation are too unstable for the datasets we analyzed. Based on our findings, we recommend using the LOO procedure for validating QSAR predictive models built on high-dimensional small-sample data. Copyright © Bentham Science Publishers.
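The leave-one-out (LOO) procedure recommended in this abstract can be sketched in a few lines. Note the substitution: the paper builds LASSO models on many descriptors, whereas this illustration swaps in ordinary least squares on a single descriptor to stay self-contained. The function name and the use of the cross-validated q(2) statistic are assumptions for illustration.

```python
def loo_q2(x, y):
    """Leave-one-out cross-validated q^2 for a one-descriptor linear model.

    Each compound is held out in turn, the line is refit on the rest,
    and the squared error on the held-out compound is accumulated (PRESS).
    """
    n = len(x)
    press = 0.0
    for i in range(n):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        mean_x = sum(x_train) / len(x_train)
        mean_y = sum(y_train) / len(y_train)
        sxx = sum((a - mean_x) ** 2 for a in x_train)
        slope = sum((a - mean_x) * (b - mean_y)
                    for a, b in zip(x_train, y_train)) / sxx
        intercept = mean_y - slope * mean_x
        press += (y[i] - (intercept + slope * x[i])) ** 2
    mean_all = sum(y) / n
    ss_tot = sum((b - mean_all) ** 2 for b in y)
    return 1.0 - press / ss_tot
```

Unlike a single external split, every compound serves once as the held-out sample, so the estimate does not depend on one particular random partition.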
QSAR modeling of cumulative environmental end-points for the prioritization of hazardous chemicals.
Gramatica, Paola; Papa, Ester; Sangion, Alessandro
2018-01-24
The hazard of chemicals in the environment is inherently related to the molecular structure and derives simultaneously from various chemical properties/activities/reactivities. Models based on Quantitative Structure Activity Relationships (QSARs) are useful to screen, rank and prioritize chemicals that may have an adverse impact on humans and the environment. This paper reviews a selection of QSAR models (based on theoretical molecular descriptors) developed for cumulative multivariate endpoints, which were derived by mathematical combination of multiple effects and properties. The cumulative end-points provide an integrated holistic point of view to address environmentally relevant properties of chemicals.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alves, Vinicius M.; Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC 27599; Muratov, Eugene
Repetitive exposure to a chemical agent can induce an immune reaction in inherently susceptible individuals that leads to skin sensitization. Although many chemicals have been reported as skin sensitizers, there have been very few rigorously validated QSAR models with defined applicability domains (AD) that were developed using a large group of chemically diverse compounds. In this study, we have aimed to compile, curate, and integrate the largest publicly available dataset related to chemically-induced skin sensitization, use this data to generate rigorously validated QSAR models for skin sensitization, and employ these models as a virtual screening tool for identifying putative sensitizers among environmental chemicals. We followed best practices for model building and validation implemented with our predictive QSAR workflow using the Random Forest modeling technique in combination with SiRMS and Dragon descriptors. The Correct Classification Rate (CCR) for QSAR models discriminating sensitizers from non-sensitizers was 71–88% when evaluated on several external validation sets, within a broad AD, with positive (for sensitizers) and negative (for non-sensitizers) predicted rates of 85% and 79% respectively. When compared to the skin sensitization module included in the OECD QSAR Toolbox as well as to the skin sensitization model in publicly available VEGA software, our models showed a significantly higher prediction accuracy for the same sets of external compounds as evaluated by Positive Predicted Rate, Negative Predicted Rate, and CCR. These models were applied to identify putative chemical hazards in the Scorecard database of possible skin or sense organ toxicants as primary candidates for experimental validation. Highlights: • The largest publicly available skin sensitization dataset was compiled. • Predictive QSAR models were developed for skin sensitization. • Developed models have higher prediction accuracy than OECD QSAR Toolbox.
SAR/QSAR MODELS FOR TOXICITY PREDICTION: APPROACHES AND NEW DIRECTIONS
Risk assessment typically incorporates some relevant toxicity information upon which to base a sound estimation for a chemical of concern. However, there are many circumstances in whic...
Does rational selection of training and test sets improve the outcome of QSAR modeling?
Martin, Todd M; Harten, Paul; Young, Douglas M; Muratov, Eugene N; Golbraikh, Alexander; Zhu, Hao; Tropsha, Alexander
2012-10-22
Prior to using a quantitative structure activity relationship (QSAR) model for external predictions, its predictive power should be established and validated. In the absence of a true external data set, the best way to validate the predictive ability of a model is to perform its statistical external validation. In statistical external validation, the overall data set is divided into training and test sets. Commonly, this splitting is performed using random division. Rational splitting methods can divide data sets into training and test sets in an intelligent fashion. The purpose of this study was to determine whether rational division methods lead to more predictive models compared to random division. A special data splitting procedure was used to facilitate the comparison between random and rational division methods. For each toxicity end point, the overall data set was divided into a modeling set (80% of the overall set) and an external evaluation set (20% of the overall set) using random division. The modeling set was then subdivided into a training set (80% of the modeling set) and a test set (20% of the modeling set) using rational division methods and by using random division. The Kennard-Stone, minimal test set dissimilarity, and sphere exclusion algorithms were used as the rational division methods. The hierarchical clustering, random forest, and k-nearest neighbor (kNN) methods were used to develop QSAR models based on the training sets. For kNN QSAR, multiple training and test sets were generated, and multiple QSAR models were built. The results of this study indicate that models based on rational division methods generate better statistical results for the test sets than models based on random division, but the predictive power of both types of models is comparable.
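Of the rational division methods named in this abstract, Kennard-Stone is the simplest to illustrate. The sketch below is a textbook version under stated assumptions, not the study's implementation: it uses Euclidean distance on descriptor vectors, and the function name and tie-breaking by lowest index are hypothetical choices.

```python
import math


def kennard_stone(points, n_train):
    """Return indices of `n_train` points picked by Kennard-Stone.

    Start from the two most distant points, then repeatedly add the
    point whose nearest already-selected neighbour is farthest away,
    so the training set spreads evenly over descriptor space.
    """
    n = len(points)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(points)")
    dist = [[math.dist(p, q) for q in points] for p in points]
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - {first, second}
    while len(selected) < n_train:
        nxt = max(sorted(remaining),
                  key=lambda r: min(dist[r][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected
```

The indices not returned form the test set. In the scheme described above, this split would be applied only to the modeling set, after the external evaluation set has been set aside by random division.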
Towards interoperable and reproducible QSAR analyses: Exchange of datasets.
Spjuth, Ola; Willighagen, Egon L; Guha, Rajarshi; Eklund, Martin; Wikberg, Jarl Es
2010-06-30
QSAR is a widely used method to relate chemical structures to responses or properties based on experimental observations. Much effort has been made to evaluate and validate the statistical modeling in QSAR, but these analyses treat the dataset as fixed. An overlooked but highly important issue is the validation of the setup of the dataset, which comprises addition of chemical structures as well as selection of descriptors and software implementations prior to calculations. This process is hampered by the lack of standards and exchange formats in the field, making it virtually impossible to reproduce and validate analyses and drastically constrain collaborations and re-use of data. We present a step towards standardizing QSAR analyses by defining interoperable and reproducible QSAR datasets, consisting of an open XML format (QSAR-ML) which builds on an open and extensible descriptor ontology. The ontology provides an extensible way of uniquely defining descriptors for use in QSAR experiments, and the exchange format supports multiple versioned implementations of these descriptors. Hence, a dataset described by QSAR-ML makes its setup completely reproducible. We also provide a reference implementation as a set of plugins for Bioclipse which simplifies setup of QSAR datasets, and allows for exporting in QSAR-ML as well as old-fashioned CSV formats. The implementation facilitates addition of new descriptor implementations from locally installed software and remote Web services; the latter is demonstrated with REST and XMPP Web services. Standardized QSAR datasets open up new ways to store, query, and exchange data for subsequent analyses. QSAR-ML supports completely reproducible creation of datasets, solving the problems of defining which software components were used and their versions, and the descriptor ontology eliminates confusions regarding descriptors by defining them crisply. This makes it easy to join, extend, combine datasets and hence work collectively, but
NASA Astrophysics Data System (ADS)
Hsieh, Jui-Hua; Wang, Xiang S.; Teotico, Denise; Golbraikh, Alexander; Tropsha, Alexander
2008-09-01
The use of inaccurate scoring functions in docking algorithms may result in the selection of compounds with high predicted binding affinity that nevertheless are known experimentally not to bind to the target receptor. Such falsely predicted binders have been termed "binding decoys". We posed a question as to whether true binders and decoys could be distinguished based only on their structural chemical descriptors using approaches commonly used in ligand based drug design. We have applied the k-Nearest Neighbor (kNN) classification QSAR approach to a dataset of compounds characterized as binders or binding decoys of AmpC beta-lactamase. Models were subjected to rigorous internal and external validation as part of our standard workflow and a special QSAR modeling scheme was employed that took into account the imbalanced ratio of inhibitors to non-binders (1:4) in this dataset. 342 predictive models were obtained with correct classification rate (CCR) for both training and test sets as high as 0.90 or higher. The prediction accuracy was as high as 100% (CCR = 1.00) for the external validation set composed of 10 compounds (5 true binders and 5 decoys) selected randomly from the original dataset. For an additional external set of 50 known non-binders, we have achieved the CCR of 0.87 using a very conservative model applicability domain threshold. The validated binary kNN QSAR models were further employed for mining the NCGC AmpC screening dataset (69653 compounds). The consensus prediction of 64 compounds identified as screening hits in the AmpC PubChem assay disagreed with their annotation in PubChem but was in agreement with the results of secondary assays. At the same time, 15 compounds were identified as potential binders contrary to their annotation in PubChem. Five of them were tested experimentally and showed inhibitory activities in millimolar range with the highest binding constant Ki of 135 μM. Our studies suggest that validated QSAR models could complement
Algamal, Z Y; Lee, M H
2017-01-01
A high-dimensional quantitative structure-activity relationship (QSAR) classification model typically contains a large number of irrelevant and redundant descriptors. In this paper, a new design of descriptor selection for the QSAR classification model estimation method is proposed by adding a new weight inside L1-norm. The experimental results of classifying the anti-hepatitis C virus activity of thiourea derivatives demonstrate that the proposed descriptor selection method in the QSAR classification model performs effectively and competitively compared with other existing penalized methods in terms of classification performance on both the training and the testing datasets. Moreover, it is noteworthy that the results obtained in terms of stability test and applicability domain provide a robust QSAR classification model. It is evident from the results that the developed QSAR classification model could conceivably be employed for further high-dimensional QSAR classification studies.
A QSAR Model for Thyroperoxidase Inhibition and Screening ...
Thyroid hormones (THs) are critical modulators of a wide range of biological processes from neurodevelopment to metabolism. Well regulated levels of THs are critical during development and even moderate changes in maternal or fetal TH levels produce irreversible neurological deficits in children. The enzyme thyroperoxidase (TPO) plays a key role in the synthesis of THs. Inhibition of TPO by xenobiotics leads to decreased TH synthesis and, depending on the degree of synthesis inhibition, may result in adverse developmental outcomes. Recently, a high-throughput screening assay for TPO inhibition (AUR-TPO) was developed and used to screen the ToxCast Phase I and II chemicals. In the present study, we used the results from the AUR-TPO screening to develop a Quantitative Structure-Activity Relationship (QSAR) model for TPO inhibition in Leadscope®. The training set consisted of 898 discrete organic chemicals: 134 positive and 764 negative for TPO inhibition. A 10 times two-fold 50% cross-validation of the model was performed, yielding a balanced accuracy of 78.7% within its defined applicability domain. More recently, an additional ~800 chemicals from the US EPA Endocrine Disruption Screening Program (EDSP21) were screened using the AUR-TPO assay. This data was used for external validation of the QSAR model, demonstrating a balanced accuracy of 85.7% within its applicability domain. Overall, the cross- and external validations indicate a model with a high predictiv
The proposal of architecture for chemical splitting to optimize QSAR models for aquatic toxicity.
Colombo, Andrea; Benfenati, Emilio; Karelson, Mati; Maran, Uko
2008-06-01
One of the challenges in the field of quantitative structure-activity relationship (QSAR) analysis is the correct classification of a chemical compound to an appropriate model for the prediction of activity. Thus, in previous studies, compounds have been divided into distinct groups according to their mode of action or chemical class. In the current study, theoretical molecular descriptors were used to divide 568 organic substances into subsets with toxicity measured for the 96-h lethal median concentration for the Fathead minnow (Pimephales promelas). Simple constitutional descriptors, such as the number of aliphatic and aromatic rings, and a quantum chemical descriptor, the maximum bond order of a carbon atom, divide the compounds into nine subsets. For each subset of compounds the automatic forward selection of descriptors was applied to construct QSAR models. Significant correlations were achieved for each subset of chemicals and all models were validated with the leave-one-out internal validation procedure (R(2)(cv) approximately 0.80). The results encourage consideration of this alternative approach to predicting toxicity, using QSAR subset models without direct reference to the mechanism of toxic action or the traditional chemical classification.
QSAR modeling based on structure-information for properties of interest in human health.
Hall, L H; Hall, L M
2005-01-01
The development of QSAR models based on topological structure description is presented for problems in human health. These models are based on the structure-information approach to quantitative biological modeling and prediction, in contrast to the mechanism-based approach. The structure-information approach is outlined, starting with basic structure information developed from the chemical graph (connection table). Information explicit in the connection table (element identity and skeletal connections) leads to significant (implicit) structure information that is useful for establishing sound models of a wide range of properties of interest in drug design. Valence state definition leads to relationships for valence state electronegativity and atom/group molar volume. Based on these important aspects of molecules, together with skeletal branching patterns, both the electrotopological state (E-state) and molecular connectivity (chi indices) structure descriptors are developed and described. A summary of four QSAR models indicates the wide range of applicability of these structure descriptors and the predictive quality of QSAR models based on them: aqueous solubility (5535 chemically diverse compounds, 938 in external validation), percent oral absorption (%OA, 417 therapeutic drugs, 195 drugs in external validation testing), AMES mutagenicity (2963 compounds including 290 therapeutic drugs, 400 in external validation), fish toxicity (92 substituted phenols, anilines and substituted aromatics). These models are established independent of explicit three-dimensional (3-D) structure information and are directly interpretable in terms of the implicit structure information useful to the drug design process.
QSAR Modeling and Prediction of Drug-Drug Interactions.
Zakharov, Alexey V; Varlamova, Ekaterina V; Lagunin, Alexey A; Dmitriev, Alexander V; Muratov, Eugene N; Fourches, Denis; Kuz'min, Victor E; Poroikov, Vladimir V; Tropsha, Alexander; Nicklaus, Marc C
2016-02-01
Severe adverse drug reactions (ADRs) are the fourth leading cause of fatality in the U.S. with more than 100,000 deaths per year. As up to 30% of all ADRs are believed to be caused by drug-drug interactions (DDIs), typically mediated by cytochrome P450s, possibilities to predict DDIs from existing knowledge are important. We collected data from public sources on 1485, 2628, 4371, and 27,966 possible DDIs mediated by four cytochrome P450 isoforms 1A2, 2C9, 2D6, and 3A4 for 55, 73, 94, and 237 drugs, respectively. For each of these data sets, we developed and validated QSAR models for the prediction of DDIs. As a unique feature of our approach, the interacting drug pairs were represented as binary chemical mixtures in a 1:1 ratio. We used two types of chemical descriptors: quantitative neighborhoods of atoms (QNA) and simplex descriptors. Radial basis functions with self-consistent regression (RBF-SCR) and random forest (RF) were utilized to build QSAR models predicting the likelihood of DDIs for any pair of drug molecules. Our models showed balanced accuracy of 72-79% for the external test sets with a coverage of 81.36-100% when a conservative threshold for the model's applicability domain was applied. We generated virtually all possible binary combinations of marketed drugs and employed our models to identify drug pairs predicted to be instances of DDI. More than 4500 of these predicted DDIs that were not found in our training sets were confirmed by data from the DrugBank database.
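The two figures reported above, balanced accuracy and coverage under an applicability-domain (AD) threshold, are related as in the following sketch. It is an illustration only: the function names are hypothetical, labels are assumed to be 0/1, and the AD decision is taken as a precomputed boolean flag per prediction rather than derived from the paper's actual AD method.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2.0


def evaluate_within_domain(y_true, y_pred, in_domain):
    """Score only predictions inside the AD; return (balanced accuracy, coverage)."""
    kept = [(t, p) for t, p, d in zip(y_true, y_pred, in_domain) if d]
    coverage = len(kept) / len(y_true)
    score = balanced_accuracy([t for t, _ in kept], [p for _, p in kept])
    return score, coverage
```

A stricter AD threshold typically discards more predictions, trading lower coverage for predictions the model is better placed to make, which is why the abstract reports the two numbers together.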
Chen, Meimei; Yang, Fafu; Kang, Jie; Yang, Xuemei; Lai, Xinmei; Gao, Yuxing
2016-11-29
In this study, in silico approaches, including multiple QSAR modeling, structural similarity analysis, and molecular docking, were applied to develop QSAR classification models as a fast screening tool for identifying highly potent ABCA1 up-regulators targeting LXRβ based on a series of new flavonoids. Initially, four modeling approaches, including linear discriminant analysis, support vector machine, radial basis function neural network, and classification and regression trees, were applied to construct different QSAR classification models. The statistical results indicated that these four kinds of QSAR models were powerful tools for screening highly potent ABCA1 up-regulators. Then, a consensus QSAR model was developed by combining the predictions from these four models. To discover new ABCA1 up-regulators at maximum accuracy, the compounds in the ZINC database that fulfilled the requirement of structural similarity of 0.7 compared to a known potent ABCA1 up-regulator were subjected to the consensus QSAR model, which led to the discovery of 50 compounds. Finally, they were docked into the LXRβ binding site to understand their role in up-regulating ABCA1 expression. The excellent binding modes and docking scores of 10 hit compounds suggested they were highly potent ABCA1 up-regulators targeting LXRβ. Overall, this study provided an effective strategy to discover highly potent ABCA1 up-regulators.
QSAR models for prediction of chromatographic behavior of homologous Fab variants.
Robinson, Julie R; Karkov, Hanne S; Woo, James A; Krogh, Berit O; Cramer, Steven M
2017-06-01
While quantitative structure activity relationship (QSAR) models have been employed successfully for the prediction of small model protein chromatographic behavior, there have been few reports to date on the use of this methodology for larger, more complex proteins. Recently our group generated focused libraries of antibody Fab fragment variants with different combinations of surface hydrophobicities and electrostatic potentials, and demonstrated that the unique selectivities of multimodal resins can be exploited to separate these Fab variants. In this work, results from linear salt gradient experiments with these Fabs were employed to develop QSAR models for six chromatographic systems, including multimodal (Capto MMC, Nuvia cPrime, and two novel ligand prototypes), hydrophobic interaction chromatography (HIC; Capto Phenyl), and cation exchange (CEX; CM Sepharose FF) resins. The models utilized newly developed "local descriptors" to quantify changes around point mutations in the Fab libraries as well as novel cluster descriptors recently introduced by our group. Subsequent rounds of feature selection and linearized machine learning algorithms were used to generate robust, well-validated models with high training set correlations (R² > 0.70) that were well suited for predicting elution salt concentrations in the various systems. The developed models then were used to predict the retention of a deamidated Fab and isotype variants, with varying success. The results represent the first successful utilization of QSAR for the prediction of chromatographic behavior of complex proteins such as Fab fragments in multimodal chromatographic systems. The framework presented here can be employed to facilitate process development for the purification of biological products from product-related impurities by in silico screening of resin alternatives. Biotechnol. Bioeng. 2017;114: 1231-1240. © 2016 Wiley Periodicals, Inc.
Myint, Kyaw Z.; Xie, Xiang-Qun
2015-01-01
This chapter focuses on the fingerprint-based artificial neural networks QSAR (FANN-QSAR) approach to predict biological activities of structurally diverse compounds. Three types of fingerprints, namely ECFP6, FP2, and MACCS, were used as inputs to train the FANN-QSAR models. The results were benchmarked against known 2D and 3D QSAR methods, and the derived models were used to predict cannabinoid (CB) ligand binding activities as a case study. In addition, the FANN-QSAR model was used as a virtual screening tool to search a large NCI compound database for lead cannabinoid compounds. We discovered several compounds with good CB2 binding affinities ranging from 6.70 nM to 3.75 μM. The studies proved that the FANN-QSAR method is a useful approach to predict bioactivities or properties of ligands and to find novel lead compounds for drug discovery research. PMID:25502380
DPubChem: a web tool for QSAR modeling and high-throughput virtual screening.
Soufan, Othman; Ba-Alawi, Wail; Magana-Mora, Arturo; Essack, Magbubah; Bajic, Vladimir B
2018-06-14
High-throughput screening (HTS) experimentally tests a large number of chemical compounds to identify those active in a given assay. Alternatively, faster and cheaper large-scale virtual screening can be performed computationally through quantitative structure-activity relationship (QSAR) models. However, the vast amount of heterogeneous HTS data available and the imbalanced ratio of active to inactive compounds in an assay make this a challenging problem. Although different QSAR models have been proposed, they have certain limitations, e.g., high false positive rates, complicated user interfaces, and limited utilization options. Therefore, we developed DPubChem, a novel web tool for deriving QSAR models that implements state-of-the-art machine-learning techniques to enhance the precision of the models and enable efficient analyses of experiments from the PubChem BioAssay database. DPubChem also has a simple interface that provides various options to users. DPubChem predicted active compounds for 300 datasets with an average geometric mean and F1 score of 76.68% and 76.53%, respectively. Furthermore, DPubChem builds interaction networks that highlight novel predicted links between chemical compounds and biological assays. Using such a network, DPubChem successfully suggested a novel drug for the Niemann-Pick type C disease. DPubChem is freely available at www.cbrc.kaust.edu.sa/dpubchem .
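Both metrics reported above are designed for imbalanced active/inactive data. A minimal sketch of how they are commonly computed from a confusion matrix follows; it assumes the "geometric mean" is the geometric mean of sensitivity and specificity, which is the usual convention but is not spelled out in the abstract.

```python
import math


def gmean_and_f1(tp, fp, tn, fn):
    """Return (G-mean, F1) from confusion-matrix counts for the active class."""
    sensitivity = tp / (tp + fn)   # recall on actives
    specificity = tn / (tn + fp)   # recall on inactives
    precision = tp / (tp + fp)
    g_mean = math.sqrt(sensitivity * specificity)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return g_mean, f1
```

Unlike plain accuracy, neither score can be driven up simply by predicting every compound inactive, since that would set sensitivity to zero.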
NASA Astrophysics Data System (ADS)
Andersson, C. David; Hillgren, J. Mikael; Lindgren, Cecilia; Qian, Weixing; Akfur, Christine; Berg, Lotta; Ekström, Fredrik; Linusson, Anna
2015-03-01
Scientific disciplines such as medicinal and environmental chemistry, pharmacology, and toxicology deal with questions related to the effects small organic compounds exert on biological targets and the compounds' physicochemical properties responsible for these effects. A common strategy in this endeavor is to establish structure-activity relationships (SARs). The aim of this work was to illustrate the benefits of performing a statistical molecular design (SMD) and proper statistical analysis of the molecules' properties before SAR and quantitative structure-activity relationship (QSAR) analysis. Our SMD followed by synthesis yielded a set of inhibitors of the enzyme acetylcholinesterase (AChE) that had very few inherent dependencies between the substructures in the molecules. If such dependencies exist, they cause severe errors in SAR interpretation and predictions by QSAR models, and leave a set of molecules less suitable for future decision-making. In our study, SAR and QSAR models could show which molecular substructures and physicochemical features were advantageous for AChE inhibition. Finally, the QSAR model was used for the prediction of the inhibition of AChE by an external prediction set of molecules. The accuracy of these predictions was asserted by statistical significance tests and by comparisons to simple but relevant reference models.
Furuhama, A; Toida, T; Nishikawa, N; Aoki, Y; Yoshioka, Y; Shiraishi, H
2010-07-01
The KAshinhou Tool for Ecotoxicity (KATE) system, including ecotoxicity quantitative structure-activity relationship (QSAR) models, was developed by the Japanese National Institute for Environmental Studies (NIES) using the database of aquatic toxicity results gathered by the Japanese Ministry of the Environment and the US EPA fathead minnow database. In this system chemicals can be entered according to their one-dimensional structures and classified by substructure. The QSAR equations for predicting the toxicity of a chemical compound assume a linear correlation between its log P value and its aquatic toxicity. KATE uses a structural domain called C-judgement, defined by the substructures of specified functional groups in the QSAR models. Internal validation by the leave-one-out method confirms that the QSAR equations, with r² > 0.7, RMSE
Lee, Yunho; von Gunten, Urs
2012-12-01
Various oxidants such as chlorine, chlorine dioxide, ferrate(VI), ozone, and hydroxyl radicals can be applied to eliminate organic micropollutants by oxidative transformation during water treatment in systems such as drinking water, wastewater, and water reuse. Over the last decades, many second-order rate constants (k) have been determined for the reaction of these oxidants with model compounds and micropollutants. Good correlations (quantitative structure-activity relationships or QSARs) are often found between the k-values for an oxidation reaction of closely related compounds (i.e., having a common organic functional group) and substituent descriptor variables such as Hammett or Taft sigma constants. In this study, we developed QSARs for the oxidation of organic and some inorganic compounds and for the transformation of organic micropollutants during oxidative water treatment. A total of 18 QSARs were developed based on 412 k-values overall for the reaction of chlorine, chlorine dioxide, ferrate, and ozone with organic compounds containing electron-rich moieties such as phenols, anilines, olefins, and amines. On average, 303 out of 412 (74%) k-values were predicted by these QSARs within a factor of 1/3 to 3 of the measured values. For HO· reactions, some principles and estimation methods of k-values (e.g., the Group Contribution Method) are discussed. The developed QSARs and the Group Contribution Method could be used to predict the k-values for various emerging organic micropollutants. As a demonstration, 39 out of 45 (87%) predicted k-values were found within a factor of 1/3 to 3 of the measured values for the selected emerging micropollutants. Finally, it is discussed how the uncertainty in the predicted k-values using the QSARs affects the accuracy of prediction for micropollutant elimination during oxidative water treatment. Copyright © 2012 Elsevier Ltd. All rights reserved.
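The "within a factor of 1/3 to 3" criterion and the Hammett-type form of these QSARs can be made concrete with a short sketch. The linear form log k = ρσ + c is the standard Hammett relationship; the function names are illustrative, and any slope or intercept passed in is a placeholder rather than a coefficient from the study.

```python
def hammett_log_k(sigma, rho, intercept):
    """Hammett-type QSAR: log10(k) = rho * sigma + intercept (placeholder coefficients)."""
    return rho * sigma + intercept


def within_factor(k_pred, k_meas, factor=3.0):
    """True if the predicted rate constant is within [1/factor, factor] of the measured one."""
    ratio = k_pred / k_meas
    return 1.0 / factor <= ratio <= factor


def fraction_within_factor(pairs, factor=3.0):
    """pairs: iterable of (k_pred, k_meas). Returns (hits, total, fraction)."""
    pairs = list(pairs)
    hits = sum(1 for p, m in pairs if within_factor(p, m, factor))
    return hits, len(pairs), hits / len(pairs)
```

On a log scale the criterion corresponds to an error of about ±0.48 log units in k, which is the uncertainty that then propagates into predicted micropollutant elimination.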
Li, Yi; Tseng, Yufeng J.; Pan, Dahua; Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Hopfinger, Anton J.
2008-01-01
Currently, the only validated methods to identify skin sensitization effects are in vivo models, such as the Local Lymph Node Assay (LLNA) and guinea pig studies. There is a tremendous need, in particular due to novel legislation, to develop animal alternatives, e.g., Quantitative Structure-Activity Relationship (QSAR) models. Here, QSAR models for skin sensitization using LLNA data have been constructed. The descriptors used to generate these models are derived from the 4D-molecular similarity paradigm and are referred to as universal 4D-fingerprints. A training set of 132 structurally diverse compounds and a test set of 15 structurally diverse compounds were used in this study. The statistical methodologies used to build the models are logistic regression (LR) and partial least squares coupled logistic regression (PLS-LR), which prove to be effective tools for studying skin sensitization measures expressed in the two categorical terms of sensitizer and non-sensitizer. QSAR models with low values of the Hosmer-Lemeshow goodness-of-fit statistic, χ²_HL, are significant and predictive. For the training set, the cross-validated prediction accuracy of the logistic regression models ranges from 77.3% to 78.0%, while that of PLS-logistic regression models ranges from 87.1% to 89.4%. For the test set, the prediction accuracy of logistic regression models ranges from 80.0% to 86.7%, while that of PLS-logistic regression models ranges from 73.3% to 80.0%. The QSAR models are made up of 4D-fingerprints related to aromatic atoms, hydrogen bond acceptors and negatively partially charged atoms. PMID:17226934
Sedykh, Alexander; Fourches, Denis; Duan, Jianmin; Hucke, Oliver; Garneau, Michel; Zhu, Hao; Bonneau, Pierre; Tropsha, Alexander
2013-04-01
Membrane transporters mediate many biological effects of chemicals and play a major role in pharmacokinetics and drug resistance. The selection of viable drug candidates among biologically active compounds requires the assessment of their transporter interaction profiles. Using public sources, we have assembled and curated the largest, to our knowledge, human intestinal transporter database (>5,000 interaction entries for >3,700 molecules). This data was used to develop thoroughly validated classification Quantitative Structure-Activity Relationship (QSAR) models of transport and/or inhibition of several major transporters including MDR1, BCRP, MRP1-4, PEPT1, ASBT, OATP2B1, OCT1, and MCT1. QSAR models have been developed with advanced machine learning techniques such as Support Vector Machines, Random Forest, and k Nearest Neighbors using Dragon and MOE chemical descriptors. These models afforded high external prediction accuracies of 71-100% estimated by 5-fold external validation, and showed hit retrieval rates with up to 20-fold enrichment in the virtual screening of DrugBank compounds. The compendium of predictive QSAR models developed in this study can be used for virtual profiling of drug candidates and/or environmental agents with the optimal transporter profiles.
Jagiello, Karolina; Grzonkowska, Monika; Swirog, Marta; ...
2016-08-29
In this contribution, the advantages and limitations of two computational techniques that can be used for the investigation of nanoparticle activity and toxicity, classic nano-QSAR (Quantitative Structure–Activity Relationships employed for nanomaterials) and 3D nano-QSAR (three-dimensional Quantitative Structure–Activity Relationships, such as Comparative Molecular Field Analysis, CoMFA, and Comparative Molecular Similarity Indices Analysis, CoMSIA, employed for nanomaterials), have been briefly summarized. Both approaches were compared according to selected criteria, including efficiency, type of experimental data, class of nanomaterials, time required for calculations and computational cost, and difficulties in interpretation. Taking into account the advantages and limitations of each method, we provide recommendations for nano-QSAR modellers and QSAR model users to be able to determine a proper and efficient methodology to investigate the biological activity of nanoparticles in order to describe the underlying interactions in the most reliable and useful manner.
Verma, Rajeshwar P; Matthews, Edwin J
2015-03-01
This is part II of an in silico investigation of chemical-induced eye injury that was conducted at FDA's CFSAN. Serious eye damage caused by chemicals (eye corrosion) is assessed using the rabbit Draize test, and this endpoint is an essential part of hazard identification and labeling of industrial and consumer products to ensure occupational and consumer safety. There is an urgent need to develop an alternative to the Draize test because the EU's 7th amendment to the Cosmetic Directive (EC, 2003; 76/768/EEC) and the recast Regulation now ban animal testing on all cosmetic product ingredients, and the EU's REACH Program limits animal testing for chemicals in commerce. Although in silico methods have been reported for eye irritation (reversible damage), QSARs specific for eye corrosion (irreversible damage) have not been published. This report describes the development of 21 ANN c-QSAR models (QSAR-21) for assessing the eye corrosion potential of chemicals using a large and diverse CFSAN data set of 504 chemicals, ADMET Predictor's three sensitivity analyses and ANNE classification functionalities with 20% test set selection from seven different methods. QSAR-21 models were internally and externally validated and exhibited high predictive performance: average statistics for the training, verification, and external test sets of these models were 96/96/94% sensitivity and 91/91/90% specificity. Copyright © 2014 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Politi, Regina; Rusyn, Ivan
2014-10-01
The thyroid hormone receptor (THR) is an important member of the nuclear receptor family that can be activated by endocrine disrupting chemicals (EDC). Quantitative Structure–Activity Relationship (QSAR) models have been developed to facilitate the prioritization of THR-mediated EDC for experimental validation. The largest database of binding affinities available at the time of the study for the ligand binding domain (LBD) of THRβ was assembled to generate both continuous and classification QSAR models with an external accuracy of R² = 0.55 and CCR = 0.76, respectively. In addition, for the first time a QSAR model was developed to predict binding affinities of antagonists inhibiting the interaction of coactivators with the AF-2 domain of THRβ (R² = 0.70). Furthermore, molecular docking studies were performed for a set of THRβ ligands (57 agonists and 15 antagonists of LBD, 210 antagonists of the AF-2 domain, supplemented by putative decoys/non-binders) using several THRβ structures retrieved from the Protein Data Bank. We found that two agonist-bound THRβ conformations could effectively discriminate their corresponding ligands from presumed non-binders. Moreover, one of the agonist conformations could discriminate agonists from antagonists. Finally, we have conducted virtual screening of a chemical library compiled by the EPA as part of the Tox21 program to identify potential THRβ-mediated EDCs using both QSAR models and docking. We concluded that the library is unlikely to have any EDC that would bind to the THRβ. Models developed in this study can be employed either to identify environmental chemicals interacting with the THR or, conversely, to eliminate the THR-mediated mechanism of action for chemicals of concern. - Highlights: • This is the largest curated dataset for ligand binding domain (LBD) of the THRβ. • We report the first QSAR model for antagonists of AF-2 domain of THRβ. • A combination of QSAR and docking enables
Goyal, Sukriti; Dhanjal, Jaspreet K; Tyagi, Chetna; Goyal, Manisha; Grover, Abhinav
2014-07-01
The CRK3 cyclin-dependent kinase of Leishmania plays an important role in regulating cell-cycle progression at the G2/M phase checkpoint transition, proliferation, and viability inside the host macrophage. In this study, a novel fragment-based QSAR model has been developed using 22 pyrazole-derived compounds exhibiting inhibitory activity against Leishmanial CRK3. Unlike other QSAR methods, this fragment-based method gives flexibility to study the relationship between molecular fragments of interest and their contribution to the variation in the biological response by evaluating cross-term fragment descriptors. Based on the fragment-based QSAR model, a combinatorial library was generated, and the top two compounds were reported after predicting their activity. The QSAR model showed satisfactory statistical parameters for the data set (r² = 0.8752, q² = 0.6690, F-ratio = 30.37, and pred_r² = 0.8632) with four descriptors describing the nature of substituent groups and the environment of the substitution site. Evaluation of the model implied that electron-rich substitution at the R1 position improves the inhibitory activity, while a decline in inhibitory activity was observed in the presence of nitrogen at the R2 position. The analysis carried out in this study provides a substantial basis for consideration of the designed pyrazole-based leads as potent antileishmanial drugs. © 2014 John Wiley & Sons A/S.
Development of a general baseline toxicity QSAR model for the fish embryo acute toxicity test.
Klüver, Nils; Vogs, Carolina; Altenburger, Rolf; Escher, Beate I; Scholz, Stefan
2016-12-01
Fish embryos have become a popular model in ecotoxicology and toxicology. The fish embryo acute toxicity test (FET) with the zebrafish embryo was recently adopted by the OECD as technical guideline TG 236, and a large database of concentrations causing 50% lethality (LC50) is available in the literature. Quantitative Structure-Activity Relationships (QSARs) of baseline toxicity (also called narcosis) are helpful to estimate the minimum toxicity of chemicals to be tested and to identify excess toxicity in existing data sets. Here, we analyzed an existing fish embryo toxicity database and established a QSAR for fish embryo LC50 using chemicals that were independently classified to act according to the non-specific mode of action of baseline toxicity. The octanol-water partition coefficient Kow is commonly applied to discriminate between non-polar and polar narcotics. Replacing the Kow by the liposome-water partition coefficient Klipw yielded a common QSAR for polar and non-polar baseline toxicants. This developed baseline toxicity QSAR was applied to compare the final mode of action (MOA) assignment of 132 chemicals. Further, we included the analysis of internal lethal concentration (ILC50) and chemical activity (La50) as complementary approaches to evaluate the robustness of the FET baseline toxicity. The analysis of the FET dataset revealed that specifically acting and reactive chemicals converged towards the baseline toxicity QSAR with increasing hydrophobicity. The developed FET baseline toxicity QSAR can be used to identify specifically acting or reactive compounds by determination of the toxic ratio and in combination with appropriate endpoints to infer the MOA for chemicals. Copyright © 2016 Elsevier Ltd. All rights reserved.
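The toxic-ratio idea mentioned above can be sketched as follows: a baseline QSAR predicts the minimum (narcotic) toxicity from the liposome-water partition coefficient, and the toxic ratio compares that prediction with the measured LC50. The slope and intercept below are placeholders, not the coefficients derived in the study, and the cut-off of 10 is a commonly used convention that the abstract itself does not state.

```python
def baseline_lc50(log_klipw, slope=1.0, intercept=1.0):
    """Baseline LC50 in mol/L from log10(1/LC50) = slope * log Klipw + intercept.

    slope and intercept are hypothetical placeholder values.
    """
    return 10 ** -(slope * log_klipw + intercept)


def toxic_ratio(lc50_experimental, log_klipw, slope=1.0, intercept=1.0):
    """TR = predicted baseline LC50 / experimental LC50 (same units)."""
    return baseline_lc50(log_klipw, slope, intercept) / lc50_experimental


def classify(tr, threshold=10.0):
    """Flag chemicals far more toxic than baseline as specifically acting or reactive."""
    return "excess toxicity" if tr >= threshold else "baseline"
```

For example, with these placeholder coefficients a chemical with log Klipw = 3 has a predicted baseline LC50 of 1e-4 mol/L, so a measured LC50 of 1e-6 mol/L gives a toxic ratio of about 100 and would be flagged.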
Alves, Vinicius M.; Muratov, Eugene; Fourches, Denis; Strickland, Judy; Kleinstreuer, Nicole; Andrade, Carolina H.; Tropsha, Alexander
2015-01-01
Repetitive exposure to a chemical agent can induce an immune reaction in inherently susceptible individuals that leads to skin sensitization. Although many chemicals have been reported as skin sensitizers, there have been very few rigorously validated QSAR models with defined applicability domains (AD) that were developed using a large group of chemically diverse compounds. In this study, we have aimed to compile, curate, and integrate the largest publicly available dataset related to chemically induced skin sensitization, use these data to generate rigorously validated QSAR models for skin sensitization, and employ these models as a virtual screening tool for identifying putative sensitizers among environmental chemicals. We followed best practices for model building and validation implemented with our predictive QSAR workflow using the random forest modeling technique in combination with SiRMS and Dragon descriptors. The Correct Classification Rate (CCR) for QSAR models discriminating sensitizers from non-sensitizers was 71–88% when evaluated on several external validation sets, within a broad AD, with positive (for sensitizers) and negative (for non-sensitizers) predicted rates of 85% and 79%, respectively. When compared to the skin sensitization module included in the OECD QSAR Toolbox as well as to the skin sensitization model in the publicly available VEGA software, our models showed a significantly higher prediction accuracy for the same sets of external compounds as evaluated by Positive Predicted Rate, Negative Predicted Rate, and CCR. These models were applied to identify putative chemical hazards in the ScoreCard database of possible skin or sense organ toxicants as primary candidates for experimental validation. PMID:25560674
Integration of QSAR and in vitro toxicology.
Barratt, M D
1998-01-01
The principles of quantitative structure-activity relationships (QSAR) are based on the premise that the properties of a chemical are implicit in its molecular structure. Therefore, if a mechanistic hypothesis can be proposed linking a group of related chemicals with a particular toxic end point, the hypothesis can be used to define relevant parameters to establish a QSAR. Ways in which QSAR and in vitro toxicology can complement each other in development of alternatives to live animal experiments are described and illustrated by examples from acute toxicological end points. Integration of QSAR and in vitro methods is examined in the context of assessing mechanistic competence and improving the design of in vitro assays and the development of prediction models. The nature of biological variability is explored together with its implications for the selection of sets of chemicals for test development, optimization, and validation. Methods are described to support the use of data from in vivo tests that do not meet today's stringent requirements of acceptability. Integration of QSAR and in vitro methods into strategic approaches for the replacement, reduction, and refinement of the use of animals is described with examples. PMID:9599692
Pandey, Gyanendra; Saxena, Anil K
2006-01-01
A set of 65 flexible peptidomimetic competitive inhibitors (52 in the training set and 13 in the test set) of protein tyrosine phosphatase 1B (PTP1B) has been used to compare the quality and predictive power of 3D quantitative structure-activity relationship (QSAR) comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) models for the three most commonly used conformer-based alignments, namely, cocrystallized conformer-based alignment (CCBA), docked conformer-based alignment (DCBA), and global minima energy conformer-based alignment (GMCBA). These three conformers of 5-[(2S)-2-({(2S)-2-[(tert-butoxycarbonyl)amino]-3-phenylpropanoyl}amino)3-oxo-3-pentylamino)propyl]-2-(carboxymethoxy)benzoic acid (compound number 66) were obtained from the X-ray structure of its cocrystallized complex with PTP1B (PDB ID: 1JF7), its docking studies, and its global minima by simulated annealing. Among the 3D QSAR models developed using the above three alignments, the CCBA provided the optimal predictive CoMFA model for the training set with cross-validated r² (q²) = 0.708, non-cross-validated r² = 0.902, standard error of estimate (s) = 0.165, and F = 202.553 and the optimal CoMSIA model with q² = 0.440, r² = 0.799, s = 0.192, and F = 117.782. These models also showed the best test set prediction for the 13 compounds with predictive r² values of 0.706 and 0.683, respectively. Though the QSAR models derived using the other two alignments also produced statistically acceptable models in the order DCBA>GMCBA in terms of the values of q², r², and predictive r², they were inferior to the corresponding models derived using CCBA. Thus, the order of preference for the alignment selection for 3D QSAR model development may be CCBA>DCBA>GMCBA, and the information obtained from the CoMFA and CoMSIA contour maps may be useful in designing specific PTP1B inhibitors.
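The distinction between the non-cross-validated r² and the cross-validated q² quoted above can be illustrated with a leave-one-out calculation. CoMFA and CoMSIA use partial least squares on field descriptors; the sketch below deliberately swaps in a one-descriptor least-squares line so that it stays self-contained, and all names are illustrative.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(x, y):
    """Non-cross-validated r2: fit on all compounds, score on the same compounds."""
    slope, intercept = fit_line(x, y)
    my = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def q_squared_loo(x, y):
    """Leave-one-out q2 = 1 - PRESS / SS_tot, refitting without each compound in turn."""
    my = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - press / ss_tot
```

Because each compound is predicted by a model that never saw it, q² cannot exceed r² for a least-squares fit, which matches the pattern in the reported values (0.708 vs 0.902 and 0.440 vs 0.799).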
Parameters for Pyrethroid Insecticide QSAR and PBPK/PD Models for Human Risk Assessment
This pyrethroid insecticide parameter review is an extension of our interest in developing quantitative structure–activity relationship–physiologically based pharmacokinetic/pharmacodynamic (QSAR-PBPK/PD) models for assessing health risks, which interest started with the organoph...
Kovarich, Simona; Papa, Ester; Gramatica, Paola
2011-06-15
The identification of potential endocrine disrupting (ED) chemicals is an important task for the scientific community due to their diffusion in the environment; the production and use of such compounds will be strictly regulated through the authorization process of the REACH regulation. To overcome the problem of insufficient experimental data, the quantitative structure-activity relationship (QSAR) approach is applied to predict the ED activity of new chemicals. In the present study QSAR classification models are developed, according to the OECD principles, to predict the ED potency for a class of emerging ubiquitary pollutants, viz. brominated flame retardants (BFRs). Different endpoints related to ED activity (i.e. aryl hydrocarbon receptor agonism and antagonism, estrogen receptor agonism and antagonism, androgen and progesterone receptor antagonism, T4-TTR competition, E2SULT inhibition) are modeled using the k-NN classification method. The best models are selected by maximizing the sensitivity and external predictive ability. We propose simple QSARs (based on few descriptors) characterized by internal stability, good predictive power and with a verified applicability domain. These models are simple tools that are applicable to screen BFRs in relation to their ED activity, and also to design safer alternatives, in agreement with the requirements of REACH regulation at the authorization step. Copyright © 2011 Elsevier B.V. All rights reserved.
Ebalunode, Jerry O; Zheng, Weifan; Tropsha, Alexander
2011-01-01
Optimization of chemical library composition affords more efficient identification of hits from biological screening experiments. The optimization could be achieved through rational selection of reagents used in combinatorial library synthesis. However, with a rapid advent of parallel synthesis methods and availability of millions of compounds synthesized by many vendors, it may be more efficient to design targeted libraries by means of virtual screening of commercial compound collections. This chapter reviews the application of advanced cheminformatics approaches such as quantitative structure-activity relationships (QSAR) and pharmacophore modeling (both ligand and structure based) for virtual screening. Both approaches rely on empirical SAR data to build models; thus, the emphasis is placed on achieving models of the highest rigor and external predictive power. We present several examples of successful applications of both approaches for virtual screening to illustrate their utility. We suggest that the expert use of both QSAR and pharmacophore models, either independently or in combination, enables users to achieve targeted libraries enriched with experimentally confirmed hit compounds.
Collecting the chemical structures and data for necessary QSAR modeling is facilitated by available public databases and open data. However, QSAR model performance is dependent on the quality of data and modeling methodology used. This study developed robust QSAR models for physi...
Jensen, G.E.; Niemelä, J.R.; Wedebye, E.B.; Nikolov, N.G.
2008-01-01
A special challenge in the new European Union chemicals legislation, Registration, Evaluation and Authorisation of Chemicals, will be the toxicological evaluation of chemicals for reproductive toxicity. Use of valid quantitative structure–activity relationships (QSARs) is a possibility under the new legislation. This article focuses on a screening exercise by use of our own and commercial QSAR models for identification of possible reproductive toxicants. Three QSAR models were used for reproductive toxicity for the endpoints teratogenic risk to humans (based on animal tests, clinical data and epidemiological human studies), dominant lethal effect in rodents (in vivo) and Drosophila melanogaster sex-linked recessive lethal effect. A structure set of 57,014 European Inventory of Existing Chemical Substances (EINECS) chemicals was screened. A total of 5240 EINECS chemicals, corresponding to 9.2%, were predicted as reproductive toxicants by one or more of the models. The chemicals predicted positive for reproductive toxicity will be submitted to the Danish Environmental Protection Agency as scientific input for a future updated advisory classification list with advisory classifications for concern for humans owing to possible developmental toxic effects: Xn (Harmful) and R63 (Possible risk of harm to the unborn child). The chemicals were also screened in three models for endocrine disruption. PMID:19061080
QSAR models of human data can enrich or replace LLNA testing for human skin sensitization
Alves, Vinicius M.; Capuzzi, Stephen J.; Muratov, Eugene; Braga, Rodolpho C.; Thornton, Thomas; Fourches, Denis; Strickland, Judy; Kleinstreuer, Nicole; Andrade, Carolina H.; Tropsha, Alexander
2016-01-01
Skin sensitization is a major environmental and occupational health hazard. Although many chemicals have been evaluated in humans, there have been no efforts to model these data to date. We have compiled, curated, analyzed, and compared the available human and LLNA data. Using these data, we have developed reliable computational models and applied them for virtual screening of chemical libraries to identify putative skin sensitizers. The overall concordance between murine LLNA and human skin sensitization responses for a set of 135 unique chemicals was low (R = 28-43%), although several chemical classes had high concordance. We have succeeded in developing predictive QSAR models of all available human data with an external correct classification rate (CCR) of 71%. A consensus model integrating concordant QSAR predictions and LLNA results afforded a higher CCR of 82%, but at the expense of reduced external dataset coverage (52%). We used the developed QSAR models for virtual screening of the CosIng database and identified 1061 putative skin sensitizers; for seventeen of these compounds, we found published evidence of their skin sensitization effects. The models reported herein provide a more accurate alternative to LLNA testing for human skin sensitization assessment across diverse chemical data. In addition, they can be used to guide the structural optimization of toxic compounds to reduce their skin sensitization potential. PMID:28630595
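The trade-off between a higher CCR and lower coverage in a consensus model can be illustrated with a minimal Python sketch. It assumes that CCR is the mean of sensitivity and specificity (a definition commonly used for binary classifiers, not confirmed by this abstract) and that the consensus keeps a prediction only when the QSAR and LLNA calls agree. The function names and the toy labels are hypothetical.

```python
def correct_classification_rate(y_true, y_pred):
    """Assumed definition: CCR = 0.5 * (sensitivity + specificity); 1 = sensitizer."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if p is not None]
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def consensus(qsar_pred, llna_pred):
    """Keep a call only when both sources agree; None marks 'no prediction'."""
    return [q if q == l else None for q, l in zip(qsar_pred, llna_pred)]


def coverage(pred):
    """Fraction of compounds for which a prediction was made."""
    return sum(p is not None for p in pred) / len(pred)


# Toy data (hypothetical): human outcome, QSAR call, LLNA call.
human = [1, 1, 0, 0]
qsar = [1, 0, 0, 0]
llna = [1, 1, 0, 0]
agreed = consensus(qsar, llna)
print(correct_classification_rate(human, qsar), coverage(qsar))
print(correct_classification_rate(human, agreed), coverage(agreed))
```

In the toy data the consensus drops the one compound on which the two sources disagree, so accuracy on the remaining compounds rises while coverage falls, which mirrors the pattern reported in the abstract.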
QSAR models based on quantum topological molecular similarity.
Popelier, P L A; Smith, P J
2006-07-01
A new method called quantum topological molecular similarity (QTMS) was fairly recently proposed [J. Chem. Inf. Comp. Sc., 41, 2001, 764] to construct a variety of medicinal, ecological and physical organic QSAR/QSPRs. The QTMS method uses quantum chemical topology (QCT) to define electronic descriptors drawn from modern ab initio wave functions of geometry-optimised molecules. It was shown that the current abundance of computing power can be utilised to inject realistic descriptors into QSAR/QSPRs. In this article we study seven datasets of medicinal interest: the dissociation constants (pKa) for a set of substituted imidazolines, the pKa of imidazoles, the ability of a set of indole derivatives to displace [3H]flunitrazepam from binding to bovine cortical membranes, the influenza inhibition constants for a set of benzimidazoles, the interaction constants for a set of amides and the enzyme liver alcohol dehydrogenase, the natriuretic activity of sulphonamide carbonic anhydrase inhibitors and the toxicity of a series of benzyl alcohols. A partial least squares analysis in conjunction with a genetic algorithm delivered excellent models. The models are also able to highlight the active site of the ligand or molecule whose structure determines the activity. The advantages and limitations of QTMS are discussed.
Fourches, Denis; Muratov, Eugene; Tropsha, Alexander
2010-01-01
Molecular modelers and cheminformaticians typically analyze experimental data generated by other scientists. Consequently, when it comes to data accuracy, cheminformaticians are always at the mercy of data providers who may inadvertently publish (partially) erroneous data. Thus, dataset curation is crucial for any cheminformatics analysis such as similarity searching, clustering, QSAR modeling, virtual screening, etc., especially nowadays when the availability of chemical datasets in public domain has skyrocketed in recent years. Despite the obvious importance of this preliminary step in the computational analysis of any dataset, there appears to be no commonly accepted guidance or set of procedures for chemical data curation. The main objective of this paper is to emphasize the need for a standardized chemical data curation strategy that should be followed at the onset of any molecular modeling investigation. Herein, we discuss several simple but important steps for cleaning chemical records in a database including the removal of a fraction of the data that cannot be appropriately handled by conventional cheminformatics techniques. Such steps include the removal of inorganic and organometallic compounds, counterions, salts and mixtures; structure validation; ring aromatization; normalization of specific chemotypes; curation of tautomeric forms; and the deletion of duplicates. To emphasize the importance of data curation as a mandatory step in data analysis, we discuss several case studies where chemical curation of the original “raw” database enabled the successful modeling study (specifically, QSAR analysis) or resulted in a significant improvement of model's prediction accuracy. We also demonstrate that in some cases rigorously developed QSAR models could be even used to correct erroneous biological data associated with chemical compounds. We believe that good practices for curation of chemical records outlined in this paper will be of value to all
3D QSAR models built on structure-based alignments of Abl tyrosine kinase inhibitors.
Falchi, Federico; Manetti, Fabrizio; Carraro, Fabio; Naldini, Antonella; Maga, Giovanni; Crespan, Emmanuele; Schenone, Silvia; Bruno, Olga; Brullo, Chiara; Botta, Maurizio
2009-06-01
Quality QSAR: A combination of docking calculations and a statistical approach toward Abl inhibitors resulted in a 3D QSAR model, the analysis of which led to the identification of ligand portions important for affinity. New compounds designed on the basis of the model were found to have very good affinity for the target, providing further validation of the model itself. The X-ray crystallographic coordinates of the Abl tyrosine kinase domain in its active, inactive, and Src-like inactive conformations were used as targets to simulate the binding mode of a large series of pyrazolo[3,4-d]pyrimidines (known Abl inhibitors) by means of GOLD software. Receptor-based alignments provided by molecular docking calculations were submitted to a GRID-GOLPE protocol to generate 3D QSAR models. Analysis of the results showed that the models based on the inactive and Src-like inactive conformations had very poor statistical parameters, whereas the sole model based on the active conformation of Abl was characterized by significant internal and external predictive ability. Subsequent analysis of GOLPE PLS pseudo-coefficient contour plots of this model gave us a better understanding of the relationships between structure and affinity, providing suggestions for the next optimization process. On the basis of these results, new compounds were designed according to the hydrophobic and hydrogen bond donor and acceptor contours, and were found to have improved enzymatic and cellular activity with respect to parent compounds. Additional biological assays confirmed the important role of the selected compounds as inhibitors of cell proliferation in leukemia cells.
(Q)SARs to predict environmental toxicities: current status and future needs.
Cronin, Mark T D
2017-03-22
The current state of the art of (Quantitative) Structure-Activity Relationships ((Q)SARs) to predict environmental toxicity is assessed along with recommendations to develop these models further. The acute toxicity of compounds acting by the non-polar narcotic mechanism of action can be well predicted, however other approaches, including read-across, may be required for compounds acting by specific mechanisms of action. The chronic toxicity of compounds to environmental species is more difficult to predict from (Q)SARs, with robust data sets and more mechanistic information required. In addition, the toxicity of mixtures is little addressed by (Q)SAR approaches. Developments in environmental toxicology including Adverse Outcome Pathways (AOPs) and omics responses should be utilised to develop better, more mechanistically relevant, (Q)SAR models.
Zhou, Peng; Wang, Congcong; Tian, Feifei; Ren, Yanrong; Yang, Chao; Huang, Jian
2013-01-01
Quantitative structure-activity relationship (QSAR), a regression modeling methodology that quantitatively establishes a statistical correlation between structural features and apparent behavior for a series of congeneric molecules, has been widely used to evaluate the activity, toxicity and properties of various small-molecule compounds such as drugs, toxicants and surfactants. However, it is surprising to see that such a useful technique has only very limited applications to biomacromolecules, although solved 3D atom-resolution structures of proteins, nucleic acids and their complexes have accumulated rapidly in the past decades. Here, we present a proof-of-concept paradigm for the modeling, prediction and interpretation of the binding affinity of 144 sequence-nonredundant, structure-available and affinity-known protein complexes (Kastritis et al. Protein Sci 20:482-491, 2011) using a biomacromolecular QSAR (BioQSAR) scheme. We demonstrate that the modeling performance and predictive power of BioQSAR are comparable to or even better than those of traditional knowledge-based strategies, mechanism-type methods and empirical scoring algorithms, while BioQSAR possesses certain additional features compared to the traditional methods, such as adaptability, interpretability, deep-validation and high-efficiency. The BioQSAR scheme could be readily modified to infer the biological behavior and functions of other biomacromolecules, if their X-ray crystal structures, NMR conformation assemblies or computationally modeled structures are available.
Lin, Wei; Jiang, Ruifen; Shen, Yong; Xiong, Yaxin; Hu, Sizi; Xu, Jianqiao; Ouyang, Gangfeng
2018-04-13
Pre-equilibrium passive sampling is a simple and promising technique for studying sampling kinetics, which is crucial to determine the distribution, transfer and fate of hydrophobic organic compounds (HOCs) in environmental water and organisms. Environmental water samples contain complex matrices that complicate the traditional calibration process for obtaining accurate rate constants. This study proposed a QSAR model to predict the sampling rate constants of HOCs (polycyclic aromatic hydrocarbons (PAHs), polychlorinated biphenyls (PCBs) and pesticides) in aqueous systems containing complex matrices. A homemade flow-through system was established to simulate an actual aqueous environment containing dissolved organic matter (DOM), i.e. humic acid (HA) and (2-hydroxypropyl)-β-cyclodextrin (β-HPCD), and to obtain the experimental rate constants. Then, a quantitative structure-activity relationship (QSAR) model using Genetic Algorithm-Multiple Linear Regression (GA-MLR) was found to correlate the experimental rate constants to the system state, including physicochemical parameters of the HOCs and DOM, which were calculated and selected as descriptors by Density Functional Theory (DFT) and Chem 3D. The experimental results showed that the rate constants significantly increased as the concentration of DOM increased, and enhancement factors of 70-fold and 34-fold were observed for the HOCs in HA and β-HPCD, respectively. The established QSAR model was validated as credible (adjusted R² = 0.862) and predictable (Q² = 0.835) in estimating the rate constants of HOCs for complex aqueous sampling, and a probable mechanism was developed by comparison to the reported theoretical study. The present study established a QSAR model of passive sampling rate constants and calibrated the effect of DOM on the sampling kinetics. Copyright © 2018 Elsevier B.V. All rights reserved.
Automated workflows for data curation and standardization of chemical structures for QSAR modeling
Large collections of chemical structures and associated experimental data are publicly available, and can be used to build robust QSAR models for applications in different fields. One common concern is the quality of both the chemical structure information and associated experime...
20180312 - Structure-based QSAR Models to Predict Systemic Toxicity Points of Departure (SOT)
Human health risk assessment associated with environmental chemical exposure is limited by the tens of thousands of chemicals with little or no experimental in vivo toxicity data. Data gap filling techniques, such as quantitative structure activity relationship (QSAR) models base...
Begum, S; Achary, P Ganga Raju
2015-01-01
Quantitative structure-activity relationship (QSAR) models were built for the prediction of inhibition (pIC50, i.e. the negative logarithm of the 50% inhibitory concentration) of MAP kinase-interacting protein kinase (MNK1) by 43 potent inhibitors. The pIC50 values were modelled with five random splits, with the molecular structures represented by the simplified molecular input line entry system (SMILES). QSAR model building was performed by Monte Carlo optimisation using three methods: classic scheme; balance of correlations; and balance of correlations with ideal slopes. The robustness of these models was checked by parameters such as rm², r*m², [Formula: see text] and a randomisation technique. The best QSAR model based on single optimal descriptors was applied to study in vitro structure-activity relationships of 6-(4-(2-(piperidin-1-yl) ethoxy) phenyl)-3-(pyridin-4-yl) pyrazolo [1,5-a] pyrimidine derivatives as a screening tool for the development of novel potent MNK1 inhibitors. The effects of alkyl groups, -OH, -NO2, F, Cl, Br, I, etc. on the IC50 values towards the inhibition of MNK1 were also reported.
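Since the modelled endpoint is pIC50, a short Python sketch of the conversion may help. It uses the standard definition pIC50 = -log10(IC50 in mol/L); the function name and the unit table are illustrative assumptions, not part of the cited study.

```python
import math

# Conversion factors to mol/L (illustrative unit table).
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9}


def pic50_from_ic50(ic50, unit="nM"):
    """Return pIC50 = -log10(IC50 expressed in mol/L)."""
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50 * UNIT_TO_MOLAR[unit])


print(round(pic50_from_ic50(1.0, "uM"), 3))   # 1 uM corresponds to pIC50 of 6
print(round(pic50_from_ic50(10.0, "nM"), 3))  # 10 nM corresponds to pIC50 of 8
```

Larger pIC50 values therefore indicate more potent inhibitors, which is why regression models are usually fitted to pIC50 instead of raw IC50.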
Does Rational Selection of Training and Test Sets Improve the Outcome of QSAR Modeling?
Prior to using a quantitative structure activity relationship (QSAR) model for external predictions, its predictive power should be established and validated. In the absence of a true external dataset, the best way to validate the predictive ability of a model is to perform its s...
Abuhamdah, Sawsan; Habash, Maha; Taha, Mutasem O
2013-12-01
Inhibition of the enzyme acetylcholinesterase (AChE) has been shown to alleviate neurodegenerative diseases, prompting several attempts to discover and optimize new AChE inhibitors. In this direction, we explored the pharmacophoric space of 85 AChE inhibitors to identify high quality pharmacophores. Subsequently, we implemented genetic algorithm-based quantitative structure-activity relationship (QSAR) modeling to select an optimal combination of pharmacophoric models and 2D physicochemical descriptors capable of explaining bioactivity variation among training compounds (r²(68) = 0.94, F-statistic = 125.8, r²(LOO) = 0.92, r²(PRESS) against 17 external test inhibitors = 0.84). Two orthogonal pharmacophores emerged in the QSAR equation, suggesting the existence of at least two binding modes accessible to ligands within the AChE binding pocket. The successful pharmacophores were comparable with the crystallographically resolved AChE binding pocket. We employed the pharmacophoric models and associated QSAR equation to screen the National Cancer Institute list of compounds. Twenty-four low micromolar AChE inhibitors were identified. The most potent gave an IC50 value of 1.0 μM.
Bhhatarai, Barun; Wilson, Daniel M.; Price, Paul S.; Marty, Sue; Parks, Amanda K.; Carney, Edward
2016-01-01
Background: Integrative testing strategies (ITSs) for potential endocrine activity can use tiered in silico and in vitro models. Each component of an ITS should be thoroughly assessed. Objectives: We used the data from three in vitro ToxCast™ binding assays to assess OASIS, a quantitative structure-activity relationship (QSAR) platform covering both estrogen receptor (ER) and androgen receptor (AR) binding. For stronger binders (described here as AC50 < 1 μM), we also examined the relationship of QSAR predictions of ER or AR binding to the results from 18 ER and 10 AR transactivation assays, 72 ER-binding reference compounds, and the in vivo uterotrophic assay. Methods: NovaScreen binding assay data for ER (human, bovine, and mouse) and AR (human, chimpanzee, and rat) were used to assess the sensitivity, specificity, concordance, and applicability domain of two OASIS QSAR models. The binding strength relative to the QSAR-predicted binding strength was examined for the ER data. The relationship of QSAR predictions of binding to transactivation- and pathway-based assays, as well as to in vivo uterotrophic responses, was examined. Results: The QSAR models had both high sensitivity (> 75%) and specificity (> 86%) for ER as well as both high sensitivity (92–100%) and specificity (70–81%) for AR. For compounds within the domains of the ER and AR QSAR models that bound with AC50 < 1 μM, the QSAR models accurately predicted the binding for the parent compounds. The parent compounds were active in all transactivation assays where metabolism was incorporated and, except for those compounds known to require metabolism to manifest activity, all assay platforms where metabolism was not incorporated. Compounds in-domain and predicted to bind by the ER QSAR model that were positive in ToxCast™ ER binding at AC50 < 1 μM were active in the uterotrophic assay. Conclusions: We used the extensive ToxCast™ HTS binding data set to show that OASIS ER and AR QSAR models had
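The assessment described above (sensitivity, specificity and concordance, restricted to compounds inside the applicability domain) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the record layout `(observed, predicted, in_domain)` and the function name are hypothetical, and the toy records are not data from the study.

```python
def binding_model_metrics(records):
    """Score binary binding predictions for in-domain compounds only.

    records: list of (observed, predicted, in_domain) tuples,
    with 1 = binder, 0 = non-binder (hypothetical layout).
    """
    scored = [(obs, pred) for obs, pred, in_domain in records if in_domain]
    tp = sum(1 for o, p in scored if o == 1 and p == 1)
    fn = sum(1 for o, p in scored if o == 1 and p == 0)
    tn = sum(1 for o, p in scored if o == 0 and p == 0)
    fp = sum(1 for o, p in scored if o == 0 and p == 1)
    return {
        "sensitivity": tp / (tp + fn),           # binders correctly predicted
        "specificity": tn / (tn + fp),           # non-binders correctly predicted
        "concordance": (tp + tn) / len(scored),  # overall agreement
        "domain_coverage": len(scored) / len(records),
    }


toy_records = [
    (1, 1, True), (1, 0, True), (0, 0, True), (0, 0, True), (0, 1, False),
]
print(binding_model_metrics(toy_records))
```

Reporting the domain coverage next to the other metrics makes it visible how many compounds were excluded before scoring.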
NASA Astrophysics Data System (ADS)
Rondla, Rohini; Padma Rao, Lavanya Souda; Ramatenki, Vishwanath; Vadija, Rajender; Mukkera, Thirupathi; Potlapally, Sarita Rajender; Vuruputuri, Uma
2017-04-01
The cyclin-dependent kinase 4 (CDK4) enzyme is a key regulator in cell cycle G1 phase progression. It is often overexpressed in a variety of cancer cells, which makes it an attractive therapeutic target for cancer treatment. A number of chemical scaffolds have been reported as CDK4 inhibitors in the literature, in particular azolium scaffolds as potential inhibitors. Here, ligand-based pharmacophore modeling and atom-based 3D-QSAR analyses for a series of azolium-based CDK4 inhibitors are presented. A five-point pharmacophore hypothesis, APRRR, with one H-bond acceptor (A), one positive cationic feature (P) and three ring aromatic sites (R), is developed. It yielded an atom-based 3D-QSAR model that shows an excellent correlation coefficient (R² = 0.93) and Fisher ratio (F = 207), along with good predictive ability (Q² = 0.79) and a Pearson R value of 0.89. Visual inspection of the 3D-QSAR model, with the most active and the least active ligands, demonstrates the favorable and unfavorable structural regions for activity towards CDK4. The roles of the positively charged nitrogen, the steric effect, ligand flexibility, and the substituents in the activity are in good agreement with previously reported experimental results. The generated 3D-QSAR model is further applied as a query for 3D database screening, which identifies 23 lead drug candidates with good predicted activities and diverse scaffolds. The ADME analysis reveals that the pharmacokinetic parameters of all the identified new leads are within the acceptable range.
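Cross-validated statistics such as Q² recur throughout these abstracts. The Python sketch below shows one common way to compute a leave-one-out Q² for a one-descriptor linear model, assuming the definition Q² = 1 - PRESS / SS_total with the total sum of squares taken about the mean of all observations. Definitions vary between software packages, so this is an illustration and not the procedure used in the cited study; the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_leave_one_out(xs, ys):
    """Assumed definition: Q2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Toy descriptor/activity values (hypothetical).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
print(q2_leave_one_out(descriptor, activity))
```

Because every prediction comes from a model that never saw the held-out compound, Q² is normally lower than the fitted R², and a large gap between the two is a usual warning sign of overfitting.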
NASA Astrophysics Data System (ADS)
Zharifah, A.; Kusumowardani, E.; Saputro, A.; Sarwinda, D.
2017-07-01
According to 2012 data from GLOBOCAN (IARC), breast cancer had the highest rate of new cancer cases at 43.3% (after controlling for age), with a mortality rate of 12.9%. Oncology is a major field focused on improving the development of cancer drugs and therapeutics in pharmaceutical and biotechnology companies. Nowadays, many researchers turn to computational chemistry and bioinformatics for pharmacophore generation. A pharmacophore is described as a group of atoms in a molecule that is considered responsible for a pharmacological action. Predicting biological function from chemical structure by in silico modeling reduces the use of chemical reagents, so the risk of environmental pollution decreases. In this research, we propose a QSAR model to analyze the composition of cancer drugs that are assumed to be homogeneous in character and treatment. The atomic interactions analyzed are learned through parameters such as log P as the hydrophobicity descriptor, n_poin as the descriptor of contour strength and molecular structure, and inhibitor activity at various concentrations (micromolar and nanomolar) from the NCBI drug bank. Differences in inhibitor activity were observed through the IC50 values of the inhibitor substances at various concentrations. From these IC50 values, we obtained a general overview of drug safety and stability. We also compared the micromolar and nanomolar inhibitor effects in the QSAR model results. The QSAR model analysis shows that nanomolar drug concentrations perform better than micromolar ones, in relation to the concentration of inhibitor substances. The QSAR model yielded the equation: Log 1/IC50 = 0.284 (±0.195) logP + 0.02 (±0.012) n_poin − 0.005 (±0.083) Inhibition10.2nanoM + 0.1 (±0.079) Inhibition30.5nanoM − 0.016 (±0.045) Inhibition91.5nanoM − 2.572 (±1.570) (n = 13; r = 0.813; r2 = 0.660; s = 0.764; F = 2.720; q2 = 0.660).
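To make the regression equation above concrete, the following Python sketch evaluates it for one compound. The coefficients and intercept are copied from the abstract; the descriptor values in the example are hypothetical placeholders, since the abstract does not list the underlying data, and the function name is an assumption.

```python
# Coefficients and intercept as reported in the abstract.
COEFFICIENTS = {
    "logP": 0.284,
    "n_poin": 0.02,
    "Inhibition10.2nanoM": -0.005,
    "Inhibition30.5nanoM": 0.1,
    "Inhibition91.5nanoM": -0.016,
}
INTERCEPT = -2.572


def predict_log_inverse_ic50(descriptors):
    """Evaluate Log 1/IC50 as the intercept plus the weighted descriptor sum."""
    return INTERCEPT + sum(
        coefficient * descriptors[name]
        for name, coefficient in COEFFICIENTS.items()
    )


# Hypothetical descriptor values for a single compound.
example = {
    "logP": 2.0,
    "n_poin": 10.0,
    "Inhibition10.2nanoM": 5.0,
    "Inhibition30.5nanoM": 8.0,
    "Inhibition91.5nanoM": 12.0,
}
print(predict_log_inverse_ic50(example))
```

Note that several reported uncertainties are as large as or larger than their coefficients (for example −0.005 ± 0.083), so predictions from this equation should be read with caution.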
A MODE-OF-ACTION-BASED QSAR APPROACH TO IMPROVE UNDERSTANDING OF DEVELOPMENTAL TOXICITY
QSAR models of developmental toxicity (devtox) have met with limited regulatory acceptance due to the use of ill-defined endpoints, lack of biological interpretability, and poor model performance. More generally, the lack of biological inference of many QSAR models is often due t...
NASA Astrophysics Data System (ADS)
Li, Peizhen; Tian, Yueli; Zhai, Honglin; Deng, Fangfang; Xie, Meihong; Zhang, Xiaoyun
2013-11-01
Non-purine derivatives have been shown to be promising novel drug candidates as xanthine oxidase inhibitors. Based on three-dimensional quantitative structure-activity relationship (3D-QSAR) methods, including comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA), two 3D-QSAR models for a series of non-purine xanthine oxidase (XO) inhibitors were established, and their reliability was supported by statistical parameters. By combining 3D-QSAR modeling with the results of molecular docking between non-purine xanthine oxidase inhibitors and XO, the main factors that influence inhibitor activity were investigated, and the results obtained could explain known experimental facts. Furthermore, several new potential inhibitors with higher predicted activity were designed on the basis of our analyses and were supported by molecular docking simulations. This study provides useful information for the development of non-purine xanthine oxidase inhibitors with novel structures.
NASA Astrophysics Data System (ADS)
Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Santos-Filho, Osvaldo A.; Esposito, Emilio X.; Hopfinger, Anton J.; Tseng, Yufeng J.
2008-06-01
In previous studies we have developed categorical QSAR models for predicting skin-sensitization potency based on 4D-fingerprint (4D-FP) descriptors and in vivo murine local lymph node assay (LLNA) measures. Only 4D-FP derived from the ground state (GMAX) structures of the molecules were used to build the QSAR models. In this study we have generated 4D-FP descriptors from the first excited state (EMAX) structures of the molecules. The GMAX, EMAX and the combined ground and excited state 4D-FP descriptors (GEMAX) were employed in building categorical QSAR models. Logistic regression (LR) and partial least squares coupled logistic regression (PLS-CLR), found to be effective model-building methods for the LLNA skin-sensitization measures in our previous studies, were used again in this study. This also permitted comparison of the prior ground state models to those involving first excited state 4D-FP descriptors. Three types of categorical QSAR models were constructed for each of the GMAX, EMAX and GEMAX datasets: a binary model (2-state), an ordinal model (3-state) and a binary-binary model (two-2-state). No significant differences exist among the LR 2-state models constructed for each of the three datasets. However, the PLS-CLR 3-state and 2-state models based on the EMAX and GEMAX datasets have higher predictivity than those constructed using only the GMAX dataset. These EMAX and GMAX categorical models are also more significant and predictive than corresponding models built in our previous QSAR studies of LLNA skin-sensitization measures.
NASA Astrophysics Data System (ADS)
Santos-Filho, Osvaldo A.; Mishra, Rama K.; Hopfinger, A. J.
2001-09-01
Free energy force field (FEFF) 3D-QSAR analysis was used to construct ligand-receptor binding models for a set of 18 structurally diverse antifolates including pyrimethamine, cycloguanil, methotrexate, aminopterin and trimethoprim, and 13 pyrrolo[2,3-d]pyrimidines. The molecular target ('receptor') used was a 3D-homology model of a specific mutant type of Plasmodium falciparum (Pf) dihydrofolate reductase (DHFR). The dependent variable of the 3D-QSAR models is the IC50 inhibition constant for the specific mutant type of PfDHFR. The independent variables of the 3D-QSAR models (the descriptors) are scaled energy terms of a modified first-generation AMBER force field combined with a hydration shell aqueous solvation model and a collection of 2D-QSAR descriptors often used in QSAR studies. Multiple temperature molecular dynamics simulation (MDS) and the genetic function approximation (GFA) were employed using partial least squares (PLS) and multidimensional linear regressions as the fitting functions to develop FEFF 3D-QSAR models for the binding process. The significant FEFF energy terms in the best 3D-QSAR models include energy contributions of the direct ligand-receptor interaction. Some changes in conformational energy terms of the ligand due to binding to the enzyme are also found to be important descriptors. The FEFF 3D-QSAR models indicate some structural features perhaps relevant to the mechanism of resistance of the PfDHFR to current antimalarials. The FEFF 3D-QSAR models are also compared to receptor-independent (RI) 4D-QSAR models developed in an earlier study and subsequently refined using recently developed generalized alignment rules.
Hendriks, A Jan; Traas, Theo P; Huijbregts, Mark A J
2005-05-01
To protect thousands of species from thousands of chemicals released in the environment, various risk assessment tools have been developed. Here, we link quantitative structure-activity relationships (QSARs) for response concentrations in water (LC50) to critical concentrations in organisms (C50) by a model for accumulation in lipid or non-lipid phases versus water (Kpw). The model indicates that affinity for neutral body components such as storage fat yields steep Kpw-Kow relationships, whereas slopes for accumulation in polar phases such as proteins are gentle. This pattern is confirmed by LC50 QSARs for different modes of action, such as neutral versus polar narcotics and organochlorine versus organophosphorus insecticides. LC50 QSARs were all between 0.00002 and 0.2 Kow^-1. After calibrating the model with the intercepts and, for the first time, also with the slopes of the LC50 QSARs, critical concentrations in organisms (C50) are calculated and compared to an independent validation data set. About 60% of the variability in lethal body burdens (C50) is explained by the model. Explanations for differences between estimated and measured levels for 11 modes of action are discussed. In particular, relationships between the critical concentrations in organisms (C50) and chemical (Kow) or species (lipid content) characteristics are specified and tested. The analysis combines different models proposed before and provides a substantial extension of the data set in comparison to previous work. Moreover, the concept is applied to species (e.g., plants, lean animals) and substances (e.g., specific modes of action) that were scarcely studied quantitatively so far.
Garro Martinez, Juan C; Vega-Hissi, Esteban G; Andrada, Matías F; Duchowicz, Pablo R; Torrens, Francisco; Estrada, Mario R
2014-01-01
Lacosamide is an anticonvulsant drug that also inhibits carbonic anhydrase. In this paper, we analyzed the apparent relationship between both activities by performing molecular modeling, docking and QSAR studies on 18 lacosamide derivatives with known anticonvulsant activity. Docking results suggested that the zinc-binding site of carbonic anhydrase is a possible target of lacosamide and lacosamide derivatives, which make favorable van der Waals interactions with Asn67, Gln92, Phe131 and Thr200. The mathematical models revealed a poor relationship between the anticonvulsant activity and molecular descriptors obtained from DFT and docking calculations. However, a QSAR model was developed using Dragon software descriptors. The statistical parameters of the model are: correlation coefficient, R = 0.957, and standard deviation, S = 0.162. Our results provide valuable new information regarding the relationship between both activities and contribute important insights into the essential molecular requirements for anticonvulsant activity.
QSAR as a random event: modeling of nanoparticles uptake in PaCa2 cancer cells.
Toropov, Andrey A; Toropova, Alla P; Puzyn, Tomasz; Benfenati, Emilio; Gini, Giuseppina; Leszczynska, Danuta; Leszczynski, Jerzy
2013-06-01
Quantitative structure-property/activity relationships (QSPRs/QSARs) are a tool to predict various endpoints for various substances. The "classic" QSPR/QSAR analysis is based on the representation of the molecular structure by the molecular graph. However, the simplified molecular input-line entry system (SMILES) is gradually becoming the most popular representation of molecular structure in databases available on the Internet. Under such circumstances, the development of molecular descriptors calculated directly from SMILES becomes an attractive alternative to "classic" descriptors. The CORAL software (http://www.insilico.eu/coral) provides SMILES-based optimal molecular descriptors, which are intended to correlate with various endpoints. We analyzed a data set on nanoparticle uptake in PaCa2 pancreatic cancer cells. The data set includes 109 nanoparticles with the same core but different surface modifiers (small organic molecules). The concept of a QSAR as a random event is suggested in opposition to "classic" QSARs, which are based on only one distribution of the available data into the training and validation sets. In other words, five random splits into the "visible" training set and the "invisible" validation set were examined. The SMILES-based optimal descriptors (obtained by the Monte Carlo technique) for these splits are calculated with the CORAL software. The statistical quality of all these models is good. Copyright © 2013 Elsevier Ltd. All rights reserved.
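The idea of examining several random splits into a training set and a validation set, instead of a single fixed split, can be sketched in Python as below. The split count (five) and data set size (109) come from the abstract; the 80/20 training fraction, the seed and the function name are assumptions made for illustration and do not describe how CORAL itself partitions data.

```python
import random


def random_splits(n_items, n_splits=5, train_fraction=0.8, seed=0):
    """Return n_splits independent (training, validation) index partitions."""
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    n_train = round(n_items * train_fraction)
    splits = []
    for _ in range(n_splits):
        indices = list(range(n_items))
        rng.shuffle(indices)
        training = sorted(indices[:n_train])    # "visible" set
        validation = sorted(indices[n_train:])  # "invisible" set
        splits.append((training, validation))
    return splits


for training, validation in random_splits(109):
    print(len(training), len(validation))
```

A model would then be built and scored once per split, and the spread of the resulting statistics indicates how strongly the conclusions depend on one particular partition.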
Speck-Planche, Alejandro; Kleandrova, Valeria V; Luan, Feng; Cordeiro, M Natália D S
2012-08-01
The discovery of new and more potent anti-cancer agents constitutes one of the most active fields of research in chemotherapy. Colorectal cancer (CRC) is one of the most studied cancers because of its high prevalence and number of deaths. In the current pharmaceutical design of more efficient anti-CRC drugs, the use of methodologies based on Chemoinformatics has played a decisive role, including Quantitative-Structure-Activity Relationship (QSAR) techniques. However, until now, there is no methodology able to predict anti-CRC activity of compounds against more than one CRC cell line, which should constitute the principal goal. In an attempt to overcome this problem we develop here the first multi-target (mt) approach for the virtual screening and rational in silico discovery of anti-CRC agents against ten cell lines. Here, two mt-QSAR classification models were constructed using a large and heterogeneous database of compounds. The first model was based on linear discriminant analysis (mt-QSAR-LDA) employing fragment-based descriptors while the second model was obtained using artificial neural networks (mt-QSAR-ANN) with global 2D descriptors. Both models correctly classified more than 90% of active and inactive compounds in training and prediction sets. Some fragments were extracted from the molecules and their contributions to anti-CRC activity were calculated using mt-QSAR-LDA model. Several fragments were identified as potential substructural features responsible for the anti-CRC activity and new molecules designed from those fragments with positive contributions were suggested and correctly predicted by the two models as possible potent and versatile anti-CRC agents. Copyright © 2012 Elsevier Ltd. All rights reserved.
SAR/QSAR methods in public health practice
DOE Office of Scientific and Technical Information (OSTI.GOV)
Demchuk, Eugene, E-mail: edemchuk@cdc.gov; Ruiz, Patricia; Chou, Selene
2011-07-15
Methods of (Quantitative) Structure-Activity Relationship ((Q)SAR) modeling play an important and active role in ATSDR programs in support of the Agency mission to protect human populations from exposure to environmental contaminants. They are used for cross-chemical extrapolation to complement the traditional toxicological approach when chemical-specific information is unavailable. SAR and QSAR methods are used to investigate adverse health effects and exposure levels, bioavailability, and pharmacokinetic properties of hazardous chemical compounds. They are applied as a part of an integrated systematic approach in the development of Health Guidance Values (HGVs), such as ATSDR Minimal Risk Levels, which are used to protect populations exposed to toxic chemicals at hazardous waste sites. (Q)SAR analyses are incorporated into ATSDR documents (such as the toxicological profiles and chemical-specific health consultations) to support environmental health assessments, prioritization of environmental chemical hazards, and to improve study design, when filling the priority data needs (PDNs) as mandated by Congress, in instances when experimental information is insufficient. These cases are illustrated by several examples, which explain how ATSDR applies (Q)SAR methods in public health practice.
Kafoury, Ramzi M; Huang, Ming-Ju
2005-08-01
The sequence of events leading to ozone-induced airway inflammation is not well known. To elucidate the molecular and cellular events underlying ozone toxicity in the lung, we hypothesized that lipid ozonation products (LOPs) generated by the reaction of ozone with unsaturated fatty acids in the epithelial lining fluid and cell membranes play a key role in mediating ozone-induced airway inflammation. To test our hypothesis, we ozonized 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphatidylcholine (POPC) and generated LOPs. Confluent human bronchial epithelial cells were exposed to the derivatives of ozonized POPC (9-oxononanoyl, 9-hydroxy-9-hydroperoxynonanoyl, and 8-(5-octyl-1,2,4-trioxolan-3-yl-)octanoyl) at a concentration of 10 μM, and the activity of phospholipases A2 (PLA2), C (PLC), and D (PLD) was measured (1, 0.5, and 1 h, respectively). Quantitative structure-activity relationship (QSAR) models were utilized to predict the biological activity of LOPs in airway epithelial cells. The QSAR results showed a strong correlation between experimental and computed activity (r = 0.97, 0.98, 0.99, for PLA2, PLC, and PLD, respectively). The results indicate that QSAR models can be utilized to predict the biological activity of the various ozone-derived LOP species in the lung. Copyright 2005 Wiley Periodicals, Inc.
Al-Masri, Ihab M; Mohammad, Mohammad K; Taha, Mutasem O
2008-11-01
Dipeptidyl peptidase IV (DPP IV) deactivates the natural hypoglycemic incretin hormones. Inhibition of this enzyme should restore glucose homeostasis in diabetic patients, making it an attractive target for the development of new antidiabetic drugs. With this in mind, the pharmacophoric space of DPP IV was explored using a set of 358 known inhibitors. Thereafter, genetic algorithm and multiple linear regression analysis were employed to select an optimal combination of pharmacophoric models and physicochemical descriptors that yield self-consistent and predictive quantitative structure-activity relationships (QSAR) (r²(287) = 0.74, F-statistic = 44.5, r²(BS) = 0.74, r²(LOO) = 0.69, r²(PRESS) against 71 external testing inhibitors = 0.51). Two orthogonal pharmacophores (of cross-correlation r² = 0.23) emerged in the QSAR equation, suggesting the existence of at least two distinct binding modes accessible to ligands within the DPP IV binding pocket. Docking experiments supported the binding modes suggested by QSAR/pharmacophore analyses. The validity of the QSAR equation and the associated pharmacophore models was established by the identification of new low-micromolar anti-DPP IV leads retrieved by in silico screening. One of our interesting potent anti-DPP IV hits is the fluoroquinolone gemifloxacin (IC50 = 1.12 μM). The fact that gemifloxacin was recently reported to potently inhibit the prodiabetic target glycogen synthase kinase 3β (GSK-3β) suggests that gemifloxacin is an excellent lead for the development of novel dual antidiabetic inhibitors against DPP IV and GSK-3β.
The collection of chemical structures and associated experimental data for QSAR modeling is facilitated by the increasing number and size of public databases. However, the performance of QSAR models highly depends on the quality of the data used and the modeling methodology. The ...
Politi, Regina; Rusyn, Ivan; Tropsha, Alexander
2016-01-01
The thyroid hormone receptor (THR) is an important member of the nuclear receptor family that can be activated by endocrine disrupting chemicals (EDC). Quantitative Structure-Activity Relationship (QSAR) models have been developed to facilitate the prioritization of THR-mediated EDC for the experimental validation. The largest database of binding affinities available at the time of the study for ligand binding domain (LBD) of THRβ was assembled to generate both continuous and classification QSAR models with an external accuracy of R2=0.55 and CCR=0.76, respectively. In addition, for the first time a QSAR model was developed to predict binding affinities of antagonists inhibiting the interaction of coactivators with the AF-2 domain of THRβ (R2=0.70). Furthermore, molecular docking studies were performed for a set of THRβ ligands (57 agonists and 15 antagonists of LBD, 210 antagonists of the AF-2 domain, supplemented by putative decoys/non-binders) using several THRβ structures retrieved from the Protein Data Bank. We found that two agonist-bound THRβ conformations could effectively discriminate their corresponding ligands from presumed non-binders. Moreover, one of the agonist conformations could discriminate agonists from antagonists. Finally, we have conducted virtual screening of a chemical library compiled by the EPA as part of the Tox21 program to identify potential THRβ-mediated EDCs using both QSAR models and docking. We concluded that the library is unlikely to have any EDC that would bind to the THRβ. Models developed in this study can be employed either to identify environmental chemicals interacting with the THR or, conversely, to eliminate the THR-mediated mechanism of action for chemicals of concern. PMID:25058446
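For readers unfamiliar with the CCR statistic quoted above, the sketch below computes it under the common convention that the correct classification rate is the mean of sensitivity and specificity (balanced accuracy). That convention is assumed here rather than taken from the abstract, and the labels are invented for illustration, not data from the study.

```python
def classification_rates(y_true, y_pred):
    """Return (sensitivity, specificity, ccr) for binary labels coded 1/0.

    CCR is taken as the mean of sensitivity and specificity, which is an
    assumed convention; the abstract does not define it.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, 0.5 * (sensitivity + specificity)


# Invented labels: 4 binders (1) and 6 non-binders (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sensitivity, specificity, ccr = classification_rates(y_true, y_pred)
print(sensitivity, specificity, ccr)
```

Averaging the two class-wise rates keeps the statistic informative when binders and non-binders are unevenly represented, which is typical of screening libraries.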
Verma, Rajeshwar P; Matthews, Edwin J
2015-03-01
Evaluation of potential chemical-induced eye injury through irritation and corrosion is required to ensure occupational and consumer safety for industrial, household and cosmetic ingredient chemicals. The historical method for evaluating eye irritant and corrosion potential of chemicals is the rabbit Draize test. However, the Draize test is controversial and its use is diminishing - the EU 7th Amendment to the Cosmetic Directive (76/768/EEC) and recast Regulation now bans marketing of new cosmetics having animal testing of their ingredients and requires non-animal alternative tests for safety assessments. Thus, in silico and/or in vitro tests are advocated. QSAR models for eye irritation have been reported for several small (congeneric) data sets; however, large global models have not been described. This report describes FDA/CFSAN's development of 21 ANN c-QSAR models (QSAR-21) to predict eye irritation using the ADMET Predictor program and a diverse training data set of 2928 chemicals. The 21 models had external (20% test set) and internal validation and average training/verification/test set statistics were: 88/88/85(%) sensitivity and 82/82/82(%) specificity, respectively. The new method utilized multiple artificial neural network (ANN) molecular descriptor selection functionalities to maximize the applicability domain of the battery. The eye irritation models will be used to provide information to fill the critical data gaps for the safety assessment of cosmetic ingredient chemicals. Copyright © 2014 Elsevier Inc. All rights reserved.
Receptor-based 3D-QSAR in Drug Design: Methods and Applications in Kinase Studies.
Fang, Cheng; Xiao, Zhiyan
2016-01-01
The receptor-based 3D-QSAR strategy represents a superior integration of structure-based drug design (SBDD) and three-dimensional quantitative structure-activity relationship (3D-QSAR) analysis. It combines the accurate prediction of ligand poses by the SBDD approach with the good predictability and interpretability of statistical models derived from the 3D-QSAR approach. Extensive efforts have been devoted to the development of receptor-based 3D-QSAR methods, and two alternative approaches have been exploited. One involves computing the binding interactions between a receptor and a ligand to generate structure-based descriptors for QSAR analyses. The other applies various docking protocols to generate optimal ligand poses so as to provide reliable molecular alignments for the conventional 3D-QSAR operations. This review highlights new concepts and methodologies recently developed in the field of receptor-based 3D-QSAR and, in particular, covers its application in kinase studies.
Domain-Specific QSAR Models for Identifying Potential Estrogenic Activity of Phenols (FutureTox III)
Computational tools can be used for efficient evaluation of untested chemicals for their ability to disrupt the endocrine system. We have employed previously developed global QSAR models that were trained and validated on the ToxCast/Tox21 ER assay data for virtual screening of a...
Residual-QSAR. Implications for genotoxic carcinogenesis
2011-01-01
the direct activity to parameter QSARs. Nevertheless, such contrasted correlations were further incorporated into the advanced statistical minimum paths principle, which selects the minimum hierarchy from Euclidean distances between all considered QSAR models for all combinations and considered molecular sets (i.e., school and validation). This ultimately led to a mechanistic picture based on the identified alpha, beta and gamma paths connecting structural indicators (i.e., the causes) to the global endpoint, with all included causes. The molecular mechanism preserved the self-consistent feature of the residual QSAR, with each descriptor appearing twice in the course of one cycle of ligand-DNA interaction through inter- and intra-cellular stages. Conclusions: Both basal features of the residual-QSAR principle of self-consistency and suitability for non-congeneric molecules make it appropriate for conceptually assessing the mechanistic description of genotoxic carcinogenesis. Additionally, it could be extended to enriched physicochemical structural indices by considering the molecular fragments or structural alerts (or other molecular residues), providing more detailed maps of chemical-biological interactions and pathways. PMID:21668999
Tetko, Igor V; Maran, Uko; Tropsha, Alexander
2017-03-01
Thousands of (Quantitative) Structure-Activity Relationships (Q)SAR models have been described in peer-reviewed publications; however, this way of sharing seldom makes models available for the use by the research community outside of the developer's laboratory. Conversely, on-line models allow broad dissemination and application representing the most effective way of sharing the scientific knowledge. Approaches for sharing and providing on-line access to models range from web services created by individual users and laboratories to integrated modeling environments and model repositories. This emerging transition from the descriptive and informative, but "static", and for the most part, non-executable print format to interactive, transparent and functional delivery of "living" models is expected to have a transformative effect on modern experimental research in areas of scientific and regulatory use of (Q)SAR models. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Bauer, Katharina Christin; Hämmerling, Frank; Kittelmann, Jörg; Dürr, Cathrin; Görlich, Fabian; Hubbuch, Jürgen
2017-04-01
Information about protein-protein interactions provides valuable knowledge about the phase behavior of protein solutions during the biopharmaceutical production process. To date, it is possible to capture their overall impact by an experimentally determined potential of mean force. For the description of this potential, the second virial coefficient B22, the diffusion interaction parameter kD, the storage modulus G', or the diffusion coefficient D is applied. In silico methods not only have the potential to predict these parameters but can also provide a deeper understanding of the molecular origin of the protein-protein interactions by correlating the data to the protein's three-dimensional structure. This methodology furthermore allows a lower sample consumption and less experimental effort. Of all in silico methods, QSAR modeling, which correlates the properties of the molecule's structure with the experimental behavior, seems to be particularly suitable for this purpose. To verify this, the study reported here dealt with the determination of a QSAR model for the diffusion coefficient of proteins. This model consisted of diffusion coefficients for six different model proteins at various pH values and NaCl concentrations. The generated QSAR model showed a good correlation between experimental and predicted data with a coefficient of determination R2 = 0.9 and a good predictability for an external test set with R2 = 0.91. The information about the properties affecting protein-protein interactions present in solution was in agreement with experiment and theory. Furthermore, the model was able to give a more detailed picture of the protein properties influencing the diffusion coefficient and the acting protein-protein interactions. Biotechnol. Bioeng. 2017;114: 821-831. © 2016 Wiley Periodicals, Inc.
3D-QSAR and molecular docking studies on HIV protease inhibitors
NASA Astrophysics Data System (ADS)
Tong, Jianbo; Wu, Yingji; Bai, Min; Zhan, Pei
2017-02-01
To better understand the chemical-biological interactions governing their activity toward HIV protease, QSAR models of 34 cyclic-urea derivatives with HIV inhibitory activity were developed. The quantitative structure-activity relationship (QSAR) model was built by using the comparative molecular similarity indices analysis (CoMSIA) technique. The best CoMSIA model has cross-validated (rcv2) and non-cross-validated (rncv2) values of 0.586 and 0.931, respectively. The predictive ability of the CoMSIA model was further validated by a test set of 7 compounds, giving an rpred2 value of 0.973. Docking studies were used to find the actual conformations of the chemicals in the active site of HIV protease, as well as their binding mode in the binding site of the protease enzyme. The information provided by the 3D-QSAR model and molecular docking may lead to a better understanding of the structural requirements of the 34 cyclic-urea derivatives and help to design potential anti-HIV protease molecules.
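The gap between a cross-validated and a non-cross-validated statistic can be illustrated with a leave-one-out calculation. The sketch below is an assumption-laden simplification: it uses a one-descriptor least-squares line instead of the partial least squares fit normally used with CoMSIA, takes one common convention (1 - PRESS/SS about the overall mean) for the cross-validated value, and runs on invented numbers.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sum_of_squares(ys):
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def fitted_r2(xs, ys):
    """Non-cross-validated r2: every compound is used to fit the line."""
    slope, intercept = fit_line(xs, ys)
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - rss / total_sum_of_squares(ys)


def loo_q2(xs, ys):
    """Cross-validated q2: each compound is predicted by a model built without it."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Invented descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9]
r2 = fitted_r2(descriptor, activity)
q2 = loo_q2(descriptor, activity)
print(r2, q2)
```

For a least-squares fit the leave-one-out value can never exceed the fitted one, which is why a cross-validated statistic is the more conservative indicator of predictive ability.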
A review on principles, theory and practices of 2D-QSAR.
Roy, Kunal; Das, Rudra Narayan
2014-01-01
The central axiom of science purports the explanation of every natural phenomenon using all possible logics coming from pure as well as mixed scientific background. The quantitative structure-activity relationship (QSAR) analysis is a study correlating the behavioral manifestation of compounds with their structures employing the interdisciplinary knowledge of chemistry, mathematics, biology as well as physics. Several studies have attempted to mathematically correlate the chemistry and property (physicochemical/ biological/toxicological) of molecules using various computationally or experimentally derived quantitative parameters termed as descriptors. The dimensionality of the descriptors depends on the type of algorithm employed and defines the nature of QSAR analysis. The most interesting feature of predictive QSAR models is that the behavior of any new or even hypothesized molecule can be predicted by the use of the mathematical equations. The phrase "2D-QSAR" signifies development of QSAR models using 2D-descriptors. Such predictor variables are the most widely practised ones because of their simple and direct mathematical algorithmic nature involving no time consuming energy computations and having reproducible operability. 2D-descriptors have a deluge of contributions in extracting chemical attributes and they are also capable of representing the 3D molecular features to some extent; although in no case they should be considered as the ultimate one, since they often suffer from the problems of intercorrelation, insufficient chemical information as well as lack of interpretation. However, by following rational approaches, novel 2D-descriptors may be developed to obviate various existing problems giving potential 2D-QSAR equations, thereby solving the innumerable chemical mysteries still unexplored.
Votano, Joseph R; Parham, Marc; Hall, L Mark; Hall, Lowell H; Kier, Lemont B; Oloff, Scott; Tropsha, Alexander
2006-11-30
Four modeling techniques, using topological descriptors to represent molecular structure, were employed to produce models of human serum protein binding (% bound) on a data set of 1008 experimental values, carefully screened from publicly available sources. To our knowledge, this data is the largest set on human serum protein binding reported for QSAR modeling. The data was partitioned into a training set of 808 compounds and an external validation test set of 200 compounds. Partitioning was accomplished by clustering the compounds in a structure descriptor space so that random sampling of 20% of the whole data set produced an external test set that is a good representative of the training set with respect to both structure and protein binding values. The four modeling techniques include multiple linear regression (MLR), artificial neural networks (ANN), k-nearest neighbors (kNN), and support vector machines (SVM). With the exception of the MLR model, the ANN, kNN, and SVM QSARs were ensemble models. Training set correlation coefficients and mean absolute error ranged from r2=0.90 and MAE=7.6 for ANN to r2=0.61 and MAE=16.2 for MLR. Prediction results from the validation set yielded correlation coefficients and mean absolute errors which ranged from r2=0.70 and MAE=14.1 for ANN to a low of r2=0.59 and MAE=18.3 for the SVM model. Structure descriptors that contribute significantly to the models are discussed and compared with those found in other published models. For the ANN model, structure descriptor trends with respect to their effects on predicted protein binding can assist the chemist in structure modification during the drug design process.
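The two statistics reported for each model can be computed as in the sketch below. It assumes that r2 denotes the squared Pearson correlation between observed and predicted values, which the abstract does not state explicitly, and the % bound values are invented for illustration.

```python
def squared_pearson(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def mean_absolute_error(observed, predicted):
    """Average absolute difference, in the units of the endpoint (% bound)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Invented % bound values, not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
r2 = squared_pearson(observed, predicted)
mae = mean_absolute_error(observed, predicted)
print(r2, mae)
```

Reporting MAE alongside r2 is useful because MAE is expressed in % bound, so it shows directly how far a typical prediction is from the measurement.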
Pérez-Garrido, Alfonso; Helguera, Aliuska Morales; Rodríguez, Francisco Girón; Cordeiro, M Natália D S
2010-05-01
The purpose of this study is to develop a quantitative structure-activity relationship (QSAR) model that can distinguish mutagenic from non-mutagenic species with alpha,beta-unsaturated carbonyl moiety using two endpoints for this activity - Ames test and mammalian cell gene mutation test - and also to gather information about the molecular features that most contribute to eliminate the mutagenic effects of these chemicals. Two data sets were used for modeling the two mutagenicity endpoints: (1) Ames test and (2) mammalian cells mutagenesis. The first one comprised 220 molecules, while the second one 48 substances, ranging from acrylates, methacrylates to alpha,beta-unsaturated carbonyl compounds. The QSAR models were developed by applying linear discriminant analysis (LDA) along with different sets of descriptors computed using the DRAGON software. For both endpoints, there was a concordance of 89% in the prediction and 97% confidentiality by combining the three models for the Ames test mutagenicity. We have also identified several structural alerts to assist the design of new monomers. These individual models and especially their combination are attractive from the point of view of molecular modeling and could be used for the prediction and design of new monomers that do not pose a human health risk. 2010 Academy of Dental Materials. Published by Elsevier Ltd. All rights reserved.
The objective of this work is to use the Exposure Related Dose Estimating Model (ERDEM) and quantitative structure-activity relationship (QSAR) models to develop an assessment tool for human exposure assessment to triazole fungicides. A dermal exposure route is used for the physi...
Kim, J; Lee, C; Chong, Y
2009-01-01
Influenza endonucleases have appeared as an attractive target of antiviral therapy for influenza infection. With the purpose of designing a novel antiviral agent with enhanced biological activities against influenza endonuclease, a three-dimensional quantitative structure-activity relationships (3D-QSAR) model was generated based on 34 influenza endonuclease inhibitors. The comparative molecular similarity index analysis (CoMSIA) with a steric, electrostatic and hydrophobic (SEH) model showed the best correlative and predictive capability (q(2) = 0.763, r(2) = 0.969 and F = 174.785), which provided a pharmacophore composed of the electronegative moiety as well as the bulky hydrophobic group. The CoMSIA model was used as a pharmacophore query in the UNITY search of the ChemDiv compound library to give virtual active compounds. The 3D-QSAR model was then used to predict the activity of the selected compounds, which identified three compounds as the most likely inhibitor candidates.
The development of QSAR models is critically dependent on the quality of available data. As part of our efforts to develop public platforms to provide access to predictive models, we have attempted to discriminate the influence of the quality versus quantity of data available to...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kiwamoto, R., E-mail: reiko.kiwamoto@wur.nl; Spenkelink, A.; Rietjens, I.M.C.M.
Acyclic α,β-unsaturated aldehydes present in food raise a concern because the α,β-unsaturated aldehyde moiety is considered a structural alert for genotoxicity. However, controversy remains on whether in vivo at realistic dietary exposure DNA adduct formation is significant. The aim of the present study was to develop physiologically based kinetic/dynamic (PBK/D) models to examine dose-dependent detoxification and DNA adduct formation of a group of 18 food-borne acyclic α,β-unsaturated aldehydes without 2- or 3-alkylation, and with no more than one conjugated double bond. Parameters for the PBK/D models were obtained using quantitative structure–activity relationships (QSARs) defined with a training set of six selected aldehydes. Using the QSARs, PBK/D models for the other 12 aldehydes were defined. Results revealed that DNA adduct formation in the liver increases with decreasing bulkiness of the molecule especially due to less efficient detoxification. 2-Propenal (acrolein) was identified to induce the highest DNA adduct levels. At realistic dietary intake, the predicted DNA adduct levels for all aldehydes were two orders of magnitude lower than endogenous background levels observed in disease free human liver, suggesting that for all 18 aldehydes DNA adduct formation is negligible at the relevant levels of dietary intake. The present study provides a proof of principle for the use of QSAR-based PBK/D modelling to facilitate group evaluations and read-across in risk assessment. Highlights: • Physiologically based in silico models were made for 18 α,β-unsaturated aldehydes. • Kinetic parameters were determined by in vitro incubations and a QSAR approach. • DNA adduct formation was negligible at levels relevant for dietary intake. • The use of QSAR-based PBK/D modelling facilitates group evaluations and read-across.
Sparse QSAR modelling methods for therapeutic and regenerative medicine
NASA Astrophysics Data System (ADS)
Winkler, David A.
2018-02-01
The quantitative structure-activity relationships method was popularized by Hansch and Fujita over 50 years ago. The usefulness of the method for drug design and development has been shown in the intervening years. As it was developed initially to elucidate which molecular properties modulated the relative potency of putative agrochemicals, and at a time when computing resources were scarce, there is much scope for applying modern mathematical methods to improve the QSAR method and to extending the general concept to the discovery and optimization of bioactive molecules and materials more broadly. I describe research over the past two decades where we have rebuilt the unit operations of the QSAR method using improved mathematical techniques, and have applied this valuable platform technology to new important areas of research and industry such as nanoscience, omics technologies, advanced materials, and regenerative medicine. This paper was presented as the 2017 ACS Herman Skolnik lecture.
Large collections of chemical structures and associated experimental data are publicly available, and can be used to build robust QSAR models for applications in different fields. One common concern is the quality of both the chemical structure information and associated experime...
QSAR studies in the discovery of novel type-II diabetic therapies.
Abuhammad, Areej; Taha, Mutasem O
2016-01-01
Type-II diabetes mellitus (T2DM) is a complex chronic disease that represents a major therapeutic challenge. Despite extensive efforts in T2DM drug development, therapies remain unsatisfactory. Currently, there are many novel and important antidiabetic drug targets under investigation by many research groups worldwide. One of the main challenges to develop effective orally active hypoglycemic agents is off-target effects. Computational tools have impacted drug discovery at many levels. One of the earliest methods is quantitative structure-activity relationship (QSAR) studies. QSAR strategies help medicinal chemists understand the relationship between hypoglycemic activity and molecular properties. Hence, QSAR may hold promise in guiding the synthesis of specifically designed novel ligands that demonstrate high potency and target selectivity. This review aims to provide an overview of the QSAR strategies used to model antidiabetic agents. In particular, this review focuses on drug targets that raised recent scientific interest and/or led to successful antidiabetic agents in the market. Special emphasis has been made on studies that led to the identification of novel antidiabetic scaffolds. Computer-aided molecular design and discovery techniques like QSAR have a great potential in designing leads against complex diseases such as T2DM. Combined with other in silico techniques, QSAR can provide more useful and rational insights to facilitate the discovery of novel compounds. However, since T2DM is a complex disease that includes several faulty biological targets, multi-target QSAR studies are recommended in the future to achieve efficient antidiabetic therapies.
Gao, Jun; Che, Dongsheng; Zheng, Vincent W; Zhu, Ruixin; Liu, Qi
2012-07-31
The Hedgehog Signaling Pathway is one of the signaling pathways that are very important to embryonic development. Inhibitors of the Hedgehog Signaling Pathway can control cell growth and death, and novel inhibitors of the pathway are in great demand. As a matter of fact, effective inhibitors could provide efficient therapies for a wide range of malignancies, and targeting this pathway in cells represents a promising new paradigm for cell growth and death control. Current research mainly focuses on the synthesis of inhibitors derived from cyclopamine, which bind specifically to the Smo protein and can be used for cancer therapy. While quantitative structure-activity relationship (QSAR) studies have been performed for these compounds among different cell lines, none of them have achieved acceptable results in the prediction of activity values of new compounds. In this study, we proposed a novel collaborative QSAR model for inhibitors of the Hedgehog Signaling Pathway by integrating the information from multiple cell lines. Such a model is expected to substantially improve on the QSAR ability of single cell lines, and to provide useful clues for developing clinically effective inhibitors and modifications of parent lead compounds targeting the Hedgehog Signaling Pathway. In this study, we have presented: (1) a collaborative QSAR model, which integrates information among multiple cell lines to boost the QSAR results, rather than modeling only a single cell line. Our experiments have shown that the performance of our model is significantly better than that of single cell line QSAR methods; and (2) an efficient feature selection strategy under such a collaborative environment, which can derive the commonly important features related to all of the given cell lines, while simultaneously showing their specific contributions to a specific cell line. Based on feature selection results, we have proposed several
DemQSAR: predicting human volume of distribution and clearance of drugs
NASA Astrophysics Data System (ADS)
Demir-Kavuk, Ozgur; Bentzien, Jörg; Muegge, Ingo; Knapp, Ernst-Walter
2011-12-01
In silico methods characterizing molecular compounds with respect to pharmacologically relevant properties can accelerate the identification of new drugs and reduce their development costs. Quantitative structure-activity/-property relationships (QSAR/QSPR) correlate structure and physico-chemical properties of molecular compounds with a specific functional activity/property under study. Typically a large number of molecular features are generated for the compounds. In many cases the number of generated features exceeds the number of molecular compounds with known property values that are available for learning. Machine learning methods tend to overfit the training data in such situations, i.e. the method adjusts to very specific features of the training data, which are not characteristic of the considered property. This problem can be alleviated by diminishing the influence of unimportant, redundant or even misleading features. A better strategy is to eliminate such features completely. Ideally, a molecular property can be described by a small number of features that are chemically interpretable. The purpose of the present contribution is to provide a predictive modeling approach, which combines feature generation, feature selection, model building and control of overtraining into a single application called DemQSAR. DemQSAR is used to predict human volume of distribution (VDss) and human clearance (CL). To control overtraining, quadratic and linear regularization terms were employed. A recursive feature selection approach is used to reduce the number of descriptors. The prediction performance is as good as the best predictions reported in the recent literature. The example presented here demonstrates that DemQSAR can generate a model that uses very few features while maintaining high predictive power. A standalone DemQSAR Java application for model building of any user defined property as well as a web interface for the prediction of human VDss and CL is
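How a linear regularization term can remove an unimportant feature while a quadratic term shrinks the remaining weights may be sketched as below. This is not the DemQSAR implementation: it assumes the "quadratic and linear" terms correspond to L2 and L1 penalties, uses plain coordinate descent without an intercept, and runs on invented data. The function names and penalty strengths are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; values within the threshold become exactly zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def penalized_fit(rows, targets, l1=0.1, l2=0.1, n_iterations=100):
    """Coordinate descent for least squares with L1 (linear) and L2 (quadratic) penalties.

    Minimizes (1/2n) * sum of squared residuals + l1 * sum|w| + (l2/2) * sum w^2.
    No intercept is fitted; l1 and l2 are illustrative values.
    """
    n = len(rows)
    n_features = len(rows[0])
    weights = [0.0] * n_features
    for _ in range(n_iterations):
        for j in range(n_features):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for row, target in zip(rows, targets):
                partial = sum(w * x for k, (w, x) in enumerate(zip(weights, row)) if k != j)
                rho += row[j] * (target - partial)
                z += row[j] ** 2
            weights[j] = soft_threshold(rho / n, l1) / (z / n + l2)
    return weights


# Invented data: the target depends only on the first feature.
rows = [[1.0, 0.1], [2.0, -0.1], [3.0, 0.1], [4.0, -0.1]]
targets = [2.0, 4.0, 6.0, 8.0]
weights = penalized_fit(rows, targets)
print(weights)
```

In this toy case the second weight is driven to exactly zero, so the feature drops out of the model, while the first weight is shrunk slightly below its unpenalized value of 2.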
DemQSAR: predicting human volume of distribution and clearance of drugs.
Demir-Kavuk, Ozgur; Bentzien, Jörg; Muegge, Ingo; Knapp, Ernst-Walter
2011-12-01
In silico methods characterizing molecular compounds with respect to pharmacologically relevant properties can accelerate the identification of new drugs and reduce their development costs. Quantitative structure-activity/-property relationships (QSAR/QSPR) correlate structure and physico-chemical properties of molecular compounds with a specific functional activity/property under study. Typically a large number of molecular features are generated for the compounds. In many cases the number of generated features exceeds the number of molecular compounds with known property values that are available for learning. Machine learning methods tend to overfit the training data in such situations, i.e. the method adjusts to very specific features of the training data, which are not characteristic of the considered property. This problem can be alleviated by diminishing the influence of unimportant, redundant or even misleading features. A better strategy is to eliminate such features completely. Ideally, a molecular property can be described by a small number of features that are chemically interpretable. The purpose of the present contribution is to provide a predictive modeling approach, which combines feature generation, feature selection, model building and control of overtraining into a single application called DemQSAR. DemQSAR is used to predict human volume of distribution (VD(ss)) and human clearance (CL). To control overtraining, quadratic and linear regularization terms were employed. A recursive feature selection approach is used to reduce the number of descriptors. The prediction performance is as good as the best predictions reported in the recent literature. The example presented here demonstrates that DemQSAR can generate a model that uses very few features while maintaining high predictive power. A standalone DemQSAR Java application for model building of any user defined property as well as a web interface for the prediction of human VD(ss) and CL is
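The combination described in this abstract, linear plus quadratic regularization with recursive feature selection, can be illustrated with a minimal sketch. This is not DemQSAR's code: the function names, penalty values, toy data, and the choice of coordinate descent without an intercept are all assumptions made for illustration.

```python
# Sketch only (hypothetical names, not DemQSAR's implementation):
# elastic-net style fit with recursive elimination of the weakest feature.

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (effect of the linear penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_elastic_net(X, y, l1=0.01, l2=0.01, n_iter=100):
    """Coordinate descent for (1/2n)*sum(res^2) + l1*sum|w| + (l2/2)*sum(w^2).

    Assumes no intercept, i.e. data that are already centered or pass through 0.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, l1) / (z / n + l2)
    return w

def recursive_feature_selection(X, y, n_keep, l1=0.01, l2=0.01):
    """Refit repeatedly, dropping the feature with the smallest |weight|."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w = fit_elastic_net(X_sub, y, l1=l1, l2=l2)
        weakest = min(range(len(active)), key=lambda idx: abs(w[idx]))
        del active[weakest]
    return active

# Toy data (made up): the property depends on feature 0 only.
X = [[1, 0.1, -0.2], [2, -0.1, 0.1], [3, 0.2, 0.0],
     [4, -0.2, 0.1], [5, 0.0, -0.1], [6, 0.1, 0.2]]
y = [2.0 * row[0] for row in X]
selected = recursive_feature_selection(X, y, n_keep=1)
```

On this toy set the linear penalty drives the two irrelevant weights to zero, so the elimination loop keeps only the informative descriptor.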
QSAR of phytochemicals for the design of better drugs.
Kar, Supratik; Roy, Kunal
2012-10-01
Phytochemicals have been the single most prolific source of leads for the development of new drug entities since the dawn of drug discovery. They cover a wide range of therapeutic indications with a great diversity of chemical structures. The research fraternity still believes in exploring phytochemicals for new drug discovery. Application of molecular biological techniques has increased the availability of novel compounds that can be conveniently isolated from natural sources. Combinatorial chemistry approaches are being applied based on phytochemical scaffolds to create screening libraries that closely resemble drug-like compounds. In silico techniques like quantitative structure-activity relationships (QSAR), pharmacophore modeling and virtual screening play crucial, rate-accelerating roles in modern drug design. QSAR models of different classes of phytochemicals covering different therapeutic areas are thoroughly discussed in the review. Further, the authors have listed all the available phytochemical databases for the convenience of researchers working in the area. This review justifies the need to develop more QSAR models for the design of better drugs from phytochemicals. Technical drawbacks associated with phytochemical research have been lessened, and there are better opportunities to explore the biological activity of previously inaccessible sources of phytochemicals, although there is still a need to reduce the time and cost involved in such exercises. The future possibilities for the integration of ethnopharmacology with QSAR place us at an exciting stage that will allow us to explore plant sources worldwide and design better drugs.
Liu, Huihui; Wei, Mengbi; Yang, Xianhai; Yin, Cen; He, Xiao
2017-01-01
Partition coefficients are vital parameters for accurately measuring chemical concentrations with passive sampling devices. Given the wide use of low density polyethylene (LDPE) film in passive sampling, we developed a theoretical linear solvation energy relationship (TLSER) model and a quantitative structure-activity relationship (QSAR) model for the prediction of the partition coefficient of chemicals between LDPE and water (Kpew). For chemicals with an octanol-water partition coefficient (log Kow) <8, a TLSER model with Vx (McGowan volume) and qA− (the most negative charge on O, N, S, X atoms) as descriptors was developed, but the model had a relatively low determination coefficient (R²) and cross-validated coefficient (Q²). In order to further explore the theoretical mechanisms involved in the partition process, a QSAR model with four descriptors (MLOGP (Moriguchi octanol-water partition coeff.), P_VSA_s_3 (P_VSA-like on I-state, bin 3), Hy (hydrophilic factor) and NssO (number of atoms of type ssO)) was established, and statistical analysis indicated that the model had satisfactory goodness-of-fit, robustness and predictive ability. For chemicals with log Kow >8, a TLSER model with Vx and a QSAR model with MLOGP as descriptor were developed. This is the first paper to explore the models for highly hydrophobic chemicals. The applicability domain of the models, characterized by the Euclidean distance-based method and Williams plot, covered a large number of structurally diverse chemicals, which included nearly all the common hydrophobic organic compounds. Additionally, through mechanism interpretation, we explored the structural features governing the partition behavior of chemicals between LDPE and water. Copyright © 2016 Elsevier B.V. All rights reserved.
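The leverage axis of a Williams plot, mentioned above for the applicability domain, can be sketched for a single-descriptor model with intercept. The warning threshold h* = 3(p + 1)/n is a commonly used convention and is assumed here; the abstract does not state which threshold the authors used, and the descriptor values below are made up.

```python
# Sketch only: leverage values for a one-descriptor linear model with intercept.
# The threshold h* = 3 * (p + 1) / n is an assumed, commonly used convention.

def leverages(xs):
    """h_i = 1/n + (x_i - mean)^2 / Sxx for simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [1.0 / n + (x - mean_x) ** 2 / sxx for x in xs]

def outside_domain(xs, n_descriptors=1):
    """Indices of compounds whose leverage exceeds the warning threshold."""
    h_star = 3.0 * (n_descriptors + 1) / len(xs)
    return [i for i, h in enumerate(leverages(xs)) if h > h_star]

# Hypothetical descriptor values; the last compound lies far from the rest.
descriptor = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
h_values = leverages(descriptor)
flagged = outside_domain(descriptor)
```

A full Williams plot would also need standardized residuals on the second axis; only the leverage part is shown here.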
López-Lira, Claudia; Alzate-Morales, Jans H; Paulino, Margot; Mella-Raipán, Jaime; Salas, Cristian O; Tapia, Ricardo A; Soto-Delgado, Jorge
2018-01-01
A combination of three-dimensional quantitative structure-activity relationship (3D-QSAR) and molecular modelling methods was used to understand the potent inhibitory NAD(P)H:quinone oxidoreductase 1 (NQO1) activity of a set of 52 heterocyclic quinones. Molecular docking results indicated that some favourable interactions of key amino acid residues at the binding site of NQO1 with these quinones would be responsible for an improvement of the NQO1 activity of these compounds. The main interactions involved are hydrogen bond of the amino group of residue Tyr128, π-stacking interactions with Phe106 and Phe178, and electrostatic interactions with flavin adenine dinucleotide (FADH) cofactor. Three models were prepared by 3D-QSAR analysis. Model I and Model III showed leave-one-out cross-validation correlation coefficients (q²(LOO)) of .75 and .73 as well as conventional correlation coefficients (R²) of .93 and .95, respectively. In addition, the external predictive abilities of these models were evaluated using a test set, producing predicted correlation coefficients (r²(pred)) of .76 and .74, respectively. The good concordance between the docking results and 3D-QSAR contour maps provides helpful information about a rational modification of new molecules based on the quinone scaffold, in order to design more potent NQO1 inhibitors, which would exhibit highly potent antitumor activity. © 2017 John Wiley & Sons A/S.
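The leave-one-out cross-validated coefficient reported above can be illustrated with a minimal sketch for a one-descriptor least-squares model. The definition used here, q² = 1 − PRESS / SS_tot with SS_tot taken about the mean of all observations, is one common convention and is an assumption; the toy numbers are made up and are not from the paper.

```python
# Sketch only: conventional R^2 and leave-one-out q^2 for a one-descriptor
# ordinary least-squares model. Data and function names are hypothetical.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def r_squared(xs, ys):
    """Fit on all compounds and score on the same compounds."""
    slope, intercept = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def q_squared_loo(xs, ys):
    """Refit with each compound left out, then predict the left-out compound."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_tot

activity_x = [1.0, 2.0, 3.0, 4.0, 5.0]
activity_y = [1.1, 1.9, 3.2, 3.9, 5.1]
r2 = r_squared(activity_x, activity_y)
q2 = q_squared_loo(activity_x, activity_y)
```

For least-squares fits each left-out residual is at least as large in magnitude as the corresponding fitted residual, so q² under this convention cannot exceed R², which matches the pattern of the values quoted in the abstract.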
The construction of QSAR models is critically dependent on the quality of available data. As part of our efforts to develop public platforms to provide access to predictive models, we have attempted to discriminate the influence of the quality versus quantity of data available ...
Winkler, David A; Le, Tu C
2017-01-01
Neural networks have generated valuable Quantitative Structure-Activity/Property Relationships (QSAR/QSPR) models for a wide variety of small molecules and materials properties. They have grown in sophistication and many of their initial problems have been overcome by modern mathematical techniques. QSAR studies have almost always used so-called "shallow" neural networks in which there is a single hidden layer between the input and output layers. Recently, a new and potentially paradigm-shifting type of neural network based on Deep Learning has appeared. Deep learning methods have generated impressive improvements in image and voice recognition, and are now being applied to QSAR and QSPR modelling. This paper describes the differences in approach between deep and shallow neural networks, compares their abilities to predict the properties of test sets for 15 large drug data sets (the kaggle set), discusses the results in terms of the Universal Approximation theorem for neural networks, and describes how DNN may ameliorate or remove troublesome "activity cliffs" in QSAR data sets. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Increasing availability of large collections of chemical structures and associated experimental data provides an opportunity to build robust QSAR models for applications in different fields. One common concern is the quality of both the chemical structure information and associat...
Luo, Xiang; Yang, Xianhai; Qiao, Xianliang; Wang, Ya; Chen, Jingwen; Wei, Xiaoxuan; Peijnenburg, Willie J G M
2017-03-22
Reaction with hydroxyl radicals (˙OH) is an important removal pathway for organic pollutants in the aquatic environment. The aqueous reaction rate constant (kOH) is therefore an important parameter for fate assessment of aquatic pollutants. Since experimental determination fails to meet the requirement of being able to efficiently handle numerous organic chemicals at limited cost and within a relatively short period of time, in silico methods such as quantitative structure-activity relationship (QSAR) models are needed to predict kOH. In this study, a QSAR model with a larger and wider applicability domain as compared with existing models was developed. Following the guidelines for the development and validation of QSAR models proposed by the Organization for Economic Co-operation and Development (OECD), the model shows satisfactory performance. The applicability domain of the model has been extended and contained chemicals that have rarely been covered in most previous studies. The chemicals covered in the current model contain functional groups including >C=C<, -C≡C-, -C6H5, -OH, -CHO, -O-, >C=O, -C=O(O)-, -COOH, -C≡N, >N-, -NH2, -NH-C(O)-, -NO2, -N=C-N<, >N-N<, -N=N-, -S-, -S-S-, -SH, -SO3, -SO4, -PO4, and -X (F, Cl, Br, and I).
QSAR studies of macrocyclic diterpenes with P-glycoprotein inhibitory activity.
Sousa, Inês J; Ferreira, Maria-José U; Molnár, Joseph; Fernandes, Miguel X
2013-02-14
Multidrug resistance (MDR) represents a major limitation for cancer chemotherapy. There are several mechanisms of MDR but the most important is associated with P-glycoprotein (P-gp) overexpression. The development of modulators of P-gp that are able to re-establish drug sensitivity of resistant cells has been considered a promising approach for overcoming MDR. Macrocyclic lathyrane and jatrophane-type diterpenes from Euphorbia species were found to be strong MDR reversing agents. In this study we applied quantitative structure-activity relationship (QSAR) methodology in order to identify the most relevant molecular features of macrocyclic diterpenes with P-gp inhibitory activity and to determine which structural modifications can be performed to improve their activity. Using experimental biological data at two concentrations (4 and 40 μg/ml), we developed a QSAR model for a set of 51 bioactive diterpenic compounds which includes lathyrane and jatrophane-type diterpenes and another model just for jatrophanes. The cross-validation correlation values for all diterpenes QSAR models developed for biological activities at compound concentrations of 4 and 40 μg/ml were 0.758 and 0.729, respectively. Regarding the prediction ability, we obtained R²(pred) values of 0.765 and 0.534 for biological activities at compound concentrations of 4 and 40 μg/ml, respectively. Applying the cross-validation test to the jatrophanes QSAR models, we obtained 0.680 and 0.787 for biological activities at compound concentrations of 4 and 40 μg/ml, respectively. For the same concentrations, the obtained R²(pred) values for the jatrophanes models were 0.541 and 0.534, respectively. The obtained models were statistically valid and showed high prediction ability. Copyright © 2012 Elsevier B.V. All rights reserved.
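The external prediction statistic R²(pred) quoted above is often computed against the training-set mean. The sketch below uses that convention, R²(pred) = 1 − Σ(y_test − ŷ_test)² / Σ(y_test − ȳ_train)², as an assumption; the abstract does not give the authors' exact formula, and the numbers are made up.

```python
# Sketch only: external predictive R^2 against the training-set mean.
# This is one common definition, assumed here; the data are hypothetical.

def r2_pred(train_y, test_y, test_pred):
    """1 - (squared test errors) / (squared test deviations from train mean)."""
    train_mean = sum(train_y) / len(train_y)
    ss_err = sum((obs - pred) ** 2 for obs, pred in zip(test_y, test_pred))
    ss_ref = sum((obs - train_mean) ** 2 for obs in test_y)
    return 1.0 - ss_err / ss_ref

perfect = r2_pred([1.0, 2.0, 3.0], [2.0, 4.0], [2.0, 4.0])
baseline = r2_pred([1.0, 2.0, 3.0], [1.0, 3.0], [2.0, 2.0])
```

A model that predicts the test compounds exactly scores 1, while one that only ever predicts the training mean scores 0, which gives a reference scale for reported values such as 0.765 or 0.534.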
Balupuri, Anand; Balasubramanian, Pavithra K; Cho, Seung J
2016-01-01
Checkpoint kinase 1 (Chk1) has emerged as a potential therapeutic target for design and development of novel anticancer drugs. Herein, we have performed three-dimensional quantitative structure-activity relationship (3D-QSAR) and molecular docking analyses on a series of diazacarbazoles to design potent Chk1 inhibitors. 3D-QSAR models were developed using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) techniques. Docking studies were performed using AutoDock. The best CoMFA and CoMSIA models exhibited cross-validated correlation coefficient (q2) values of 0.631 and 0.585, and non-cross-validated correlation coefficient (r2) values of 0.933 and 0.900, respectively. CoMFA and CoMSIA models showed reasonable external predictabilities (r2 pred) of 0.672 and 0.513, respectively. A satisfactory performance in the various internal and external validation techniques indicated the reliability and robustness of the best model. Docking studies were performed to explore the binding mode of inhibitors inside the active site of Chk1. Molecular docking revealed that hydrogen bond interactions with Lys38, Glu85 and Cys87 are essential for Chk1 inhibitory activity. The binding interaction patterns observed during docking studies were complementary to 3D-QSAR results. Information obtained from the contour map analysis was utilized to design novel potent Chk1 inhibitors. Their activities and binding affinities were predicted using the derived model and docking studies. Designed inhibitors were proposed as potential candidates for experimental synthesis.
Gade, Deepak Reddy; Makkapati, Amareswararao; Yarlagadda, Rajesh Babu; Peters, Godefridus J; Sastry, B S; Rajendra Prasad, V V S
2018-06-01
Overexpression of P-glycoprotein (P-gp) leads to the emergence of multidrug resistance (MDR) in cancer treatment. Acridones have the potential to reverse MDR and sensitize cells. In the present study, we aimed to elucidate the chemosensitization potential of acridones by employing various molecular modelling techniques. Pharmacophore modeling was performed for the dataset of chemosensitizing acridones previously shown to have cytotoxic activity against the MCF7 breast cancer cell line. Gaussian-based QSAR studies were also performed to predict the favored and disfavored regions of the acridone molecules. Molecular dynamics simulations were performed for compound 10 and human P-glycoprotein (obtained from homology modeling). An efficient pharmacophore containing 2 hydrogen bond acceptors and 3 aromatic rings (AARRR.14) was identified. The NCI 2012 chemical database was screened against the AARRR.14 CPH, and 25 best-fit molecules were identified. Potential regions of the compound were identified through field (Gaussian) based QSAR. Regression analysis of atom-based QSAR resulted in r² of 0.95 and q² of 0.72, whereas regression analysis of field-based QSAR resulted in r² of 0.92 and q² of 0.87 along with r²(cv) of 0.71. The fate of the acridone molecule (compound 10) in the P-glycoprotein environment was analyzed through the conformational changes occurring during the molecular dynamics simulations. Combined data of different in silico techniques provided a basis for deeper understanding of structural and mechanistic insights into the interaction of acridones with P-glycoprotein and also a strategic basis for designing more potent molecules for anti-cancer and multidrug resistance reversal activities. Copyright © 2018 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Valerio, Luis G.; Arvidson, Kirk B.; Chanderbhan, Ronald F.
2007-07-01
Consistent with the U.S. Food and Drug Administration (FDA) Critical Path Initiative, predictive toxicology software programs employing quantitative structure-activity relationship (QSAR) models are currently under evaluation for regulatory risk assessment and scientific decision support for highly sensitive endpoints such as carcinogenicity, mutagenicity and reproductive toxicity. At the FDA's Center for Food Safety and Applied Nutrition's Office of Food Additive Safety and the Center for Drug Evaluation and Research's Informatics and Computational Safety Analysis Staff (ICSAS), the use of computational SAR tools for both qualitative and quantitative risk assessment applications are being developed and evaluated. One tool of current interest is MDL-QSAR predictive discriminant analysis modeling of rodent carcinogenicity, which has been previously evaluated for pharmaceutical applications by the FDA ICSAS. The study described in this paper aims to evaluate the utility of this software to estimate the carcinogenic potential of small, organic, naturally occurring chemicals found in the human diet. In addition, a group of 19 known synthetic dietary constituents that were positive in rodent carcinogenicity studies served as a control group. In the test group of naturally occurring chemicals, 101 were found to be suitable for predictive modeling using this software's discriminant analysis modeling approach. Predictions performed on these compounds were compared to published experimental evidence of each compound's carcinogenic potential. Experimental evidence included relevant toxicological studies such as rodent cancer bioassays, rodent anti-carcinogenicity studies, genotoxic studies, and the presence of chemical structural alerts. Statistical indices of predictive performance were calculated to assess the utility of the predictive modeling method. Results revealed good predictive performance using this software's rodent carcinogenicity module of over 1200
Use of the Monte Carlo Method for OECD Principles-Guided QSAR Modeling of SIRT1 Inhibitors.
Kumar, Ashwani; Chauhan, Shilpi
2017-01-01
SIRT1 inhibitors offer therapeutic potential for the treatment of a number of diseases including cancer and human immunodeficiency virus infection. A diverse series of 45 compounds with reported SIRT1 inhibitory activity has been employed for the development of quantitative structure-activity relationship (QSAR) models using the Monte Carlo optimization method. This method makes use of simplified molecular input line entry system notation of the molecular structure. The QSAR models were built up according to OECD principles. Three subsets of three splits were examined and validated by respective external sets. All the three described models have good statistical quality. The best model has the following statistical characteristics: R² = 0.8350, Q²(test) = 0.7491 for the test set and R² = 0.9655, Q²(ext) = 0.9261 for the validation set. In the mechanistic interpretation, structural attributes responsible for the endpoint increase and decrease are defined. Further, the design of some prospective SIRT1 inhibitors is also presented on the basis of these structural attributes. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alves, Vinicius M.; Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC 27599; Muratov, Eugene
Skin permeability is widely considered to be mechanistically implicated in chemically-induced skin sensitization. Although many chemicals have been identified as skin sensitizers, there have been very few reports analyzing the relationships between molecular structure and skin permeability of sensitizers and non-sensitizers. The goals of this study were to: (i) compile, curate, and integrate the largest publicly available dataset of chemicals studied for their skin permeability; (ii) develop and rigorously validate QSAR models to predict skin permeability; and (iii) explore the complex relationships between skin sensitization and skin permeability. Based on the largest publicly available dataset compiled in this study, we found no overall correlation between skin permeability and skin sensitization. In addition, cross-species correlation coefficient between human and rodent permeability data was found to be as low as R² = 0.44. Human skin permeability models based on the random forest method have been developed and validated using OECD-compliant QSAR modeling workflow. Their external accuracy was high (Q²(ext) = 0.73 for 63% of external compounds inside the applicability domain). The extended analysis using both experimentally-measured and QSAR-imputed data still confirmed the absence of any overall concordance between skin permeability and skin sensitization. This observation suggests that chemical modifications that affect skin permeability should not be presumed a priori to modulate the sensitization potential of chemicals. The models reported herein as well as those developed in the companion paper on skin sensitization suggest that it may be possible to rationally design compounds with the desired high skin permeability but low sensitization potential. - Highlights: • It was compiled the largest publicly-available skin permeability dataset. • Predictive QSAR models were developed for skin permeability. • No concordance between
The anesthetic action of some polyhalogenated ethers-Monte Carlo method based QSAR study.
Golubović, Mlađan; Lazarević, Milan; Zlatanović, Dragan; Krtinić, Dane; Stoičkov, Viktor; Mladenović, Bojan; Milić, Dragan J; Sokolović, Dušan; Veselinović, Aleksandar M
2018-04-13
To date, there has been an ongoing debate about the mode of action of general anesthetics, with many biological sites postulated as targets for their action. However, postoperative nausea and vomiting are common problems in which inhalational agents may have a role in their development. When a mode of action is unknown, QSAR modelling is essential in drug development. To investigate aspects of their anesthetic action, QSAR models based on the Monte Carlo method were developed for a set of polyhalogenated ethers. Until now, their anesthetic action has not been completely defined, although some hypotheses have been suggested. Therefore, a QSAR model should be developed on molecular fragments that contribute to anesthetic action. QSAR models were built on the basis of optimal molecular descriptors based on the SMILES notation and local graph invariants, whereas the Monte Carlo optimization method with three random splits into the training and test set was applied for model development. Different methods, including the novel index of ideality of correlation, were applied for the determination of the robustness of the model and its predictive potential. The Monte Carlo optimization process proved to be an efficient in silico tool for building up a robust model of good statistical quality. Molecular fragments which have both positive and negative influence on anesthetic action were determined. The presented study can be useful in the search for novel anesthetics. Copyright © 2018 Elsevier Ltd. All rights reserved.
Proteins QSAR with Markov average electrostatic potentials.
González-Díaz, Humberto; Uriarte, Eugenio
2005-11-15
Classic physicochemical and topological indices have been largely used in small molecules QSAR but less in proteins QSAR. In this study, a Markov model is used to calculate, for the first time, average electrostatic potentials ξk for an indirect interaction between amino acids placed at topologic distances k within a given protein backbone. The short-term average stochastic potential ξ1 for 53 Arc repressor mutants was used to model the effect of Alanine scanning on thermal stability. The Arc repressor is a model protein of relevance for biochemical studies on bioorganics and medicinal chemistry. A linear discriminant analysis model developed correctly classified 43 out of 53 (81.1%) of proteins according to their thermal stability. More specifically, the model classified 20/28 (71.4%) of proteins with near wild-type stability and 23/25 (92.0%) of proteins with reduced stability. Moreover, predictability in cross-validation procedures was 81.0%. Expansion of the electrostatic potential in the series ξ0, ξ1, ξ2, and ξ3 justified the use of the abrupt truncation approach, with the overall accuracy being >70.0% for ξ0 but equal for ξ1, ξ2, and ξ3. The ξ1 model compared favorably with respect to others based on D-Fire potential, surface area, volume, partition coefficient, and molar refractivity, with less than 77.0% of accuracy [Ramos de Armas, R.; González-Díaz, H.; Molina, R.; Uriarte, E. Protein Struct. Func. Bioinf. 2004, 56, 715]. The ξ1 model also has a more tractable interpretation than others based on Markovian negentropies and stochastic moments. Finally, the model is notably simpler than the two models based on quadratic and linear indices. Both models, reported by Marrero-Ponce et al., use four to five times more descriptors. Introduction of average stochastic potentials may be useful for QSAR applications, as ξk is amenable to physical interpretation and very effective.
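The classification percentages in this abstract follow directly from the reported counts, as the short check below shows. Only the counts (20/28, 23/25, 43/53) come from the abstract; the variable and function names are illustrative.

```python
# Arithmetic check of the reported classification rates.
# Counts are taken from the abstract; names are hypothetical.

near_wild_type = (20, 28)  # correctly classified, total
reduced_stability = (23, 25)

def percent(correct, total):
    """Percentage of correct classifications, rounded to one decimal."""
    return round(100.0 * correct / total, 1)

overall = (near_wild_type[0] + reduced_stability[0],
           near_wild_type[1] + reduced_stability[1])

rates = {
    "near wild-type": percent(*near_wild_type),
    "reduced stability": percent(*reduced_stability),
    "overall": percent(*overall),
}
```

The two class-level counts add up to the overall 43 of 53, so the three figures in the abstract are mutually consistent.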
2D-QSAR study of fullerene nanostructure derivatives as potent HIV-1 protease inhibitors
NASA Astrophysics Data System (ADS)
Barzegar, Abolfazl; Jafari Mousavi, Somaye; Hamidi, Hossein; Sadeghi, Mehdi
2017-09-01
The protease of human immunodeficiency virus 1 (HIV-PR) is an essential enzyme for antiviral treatments. Carbon nanostructures of fullerene derivatives have nanoscale dimensions, with a diameter comparable to that of the active site of HIV-PR, which would in turn inhibit HIV. In this research, two-dimensional quantitative structure-activity relationships (2D-QSAR) of fullerene derivatives against HIV-PR activity were employed as a powerful tool for elucidating the relationships between structure and experimental observations. A QSAR study of 49 fullerene derivatives was performed by employing stepwise-MLR, GAPLS-MLR, and PCA-MLR models for variable (descriptor) selection and model construction. QSAR models were obtained with higher ability to predict the activity of the fullerene derivatives against HIV-PR, with correlation coefficients (R²(training)) of 0.942, 0.89, and 0.87 as well as R²(test) values of 0.791, 0.67 and 0.674 for the stepwise-MLR, GAPLS-MLR, and PCA-MLR models, respectively. Leave-one-out cross-validated correlation coefficient (R²(CV)) and Y-randomization methods confirmed the models' robustness. The descriptors indicated that the HIV-PR inhibition depends on the van der Waals volumes, polarizability, bond order between two atoms and electronegativities of the fullerene derivatives. The 2D-QSAR simulation, which does not require the receptor's active site geometry, resulted in useful descriptors mainly denoting "C60 backbone-functional groups" and "C60 functional groups" properties. Both properties in fullerene refer to the ligand fitness and improved van der Waals interactions with the HIV-PR active site. Therefore, the QSAR models can be used in the search for novel HIV-PR inhibitors based on fullerene derivatives.
The present study explores the merit of utilizing available pharmaceutical data to construct a quantitative structure-activity relationship (QSAR) for prediction of the fraction of a chemical unbound to plasma protein (Fub) in environmentally relevant compounds. Independent model...
Xie, Huiding; Chen, Lijun; Zhang, Jianqiang; Xie, Xiaoguang; Qiu, Kaixiong; Fu, Jijun
2015-01-01
B-Raf kinase is an important target in treatment of cancers. In order to design and find potent B-Raf inhibitors (BRIs), 3D pharmacophore models were created using the Genetic Algorithm with Linear Assignment of Hypermolecular Alignment of Database (GALAHAD). The best pharmacophore model obtained which was used in effective alignment of the data set contains two acceptor atoms, three donor atoms and three hydrophobes. In succession, comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were performed on 39 imidazopyridine BRIs to build three dimensional quantitative structure-activity relationship (3D QSAR) models based on both pharmacophore and docking alignments. The CoMSIA model based on the pharmacophore alignment shows the best result (q2 = 0.621, r2pred = 0.885). This 3D QSAR approach provides significant insights that are useful for designing potent BRIs. In addition, the obtained best pharmacophore model was used for virtual screening against the NCI2000 database. The hit compounds were further filtered with molecular docking, and their biological activities were predicted using the CoMSIA model, and three potential BRIs with new skeletons were obtained. PMID:26035757
Xie, Huiding; Chen, Lijun; Zhang, Jianqiang; Xie, Xiaoguang; Qiu, Kaixiong; Fu, Jijun
2015-05-29
B-Raf kinase is an important target in treatment of cancers. In order to design and find potent B-Raf inhibitors (BRIs), 3D pharmacophore models were created using the Genetic Algorithm with Linear Assignment of Hypermolecular Alignment of Database (GALAHAD). The best pharmacophore model obtained which was used in effective alignment of the data set contains two acceptor atoms, three donor atoms and three hydrophobes. In succession, comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were performed on 39 imidazopyridine BRIs to build three dimensional quantitative structure-activity relationship (3D QSAR) models based on both pharmacophore and docking alignments. The CoMSIA model based on the pharmacophore alignment shows the best result (q(2) = 0.621, r(2)(pred) = 0.885). This 3D QSAR approach provides significant insights that are useful for designing potent BRIs. In addition, the obtained best pharmacophore model was used for virtual screening against the NCI2000 database. The hit compounds were further filtered with molecular docking, and their biological activities were predicted using the CoMSIA model, and three potential BRIs with new skeletons were obtained.
NASA Astrophysics Data System (ADS)
Masand, Vijay H.; El-Sayed, Nahed N. E.; Bambole, Mukesh U.; Quazi, Syed A.
2018-04-01
Multiple discrete quantitative structure-activity relationship (QSAR) models were constructed for the anticancer activity of α,β-unsaturated carbonyl-based compounds, oxime and oxime ether analogues with a variety of substituents like -Br, -OH, -OMe, etc. at different positions. A big pool of descriptors was considered for QSAR model building. The genetic algorithm (GA) available in QSARINS-Chem was executed to choose the optimum number and set of descriptors to create the multi-linear regression equations for a dataset of sixty-nine compounds. The newly developed five parametric models were subjected to exhaustive internal and external validation along with Y-scrambling using QSARINS-Chem, according to the OECD principles for QSAR model validation. The models were built using easily interpretable descriptors and accepted after confirming statistical robustness with high external predictive ability. The five parametric models were found to have R² = 0.80 to 0.86, R²(ex) = 0.75 to 0.84, and CCC(ex) = 0.85 to 0.90. The models indicate that the frequency of nitrogen and oxygen atoms separated by five bonds from each other and the internal electronic environment of the molecule correlate with the anticancer activity.
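The CCC statistic reported above is taken here to be Lin's concordance correlation coefficient, a common choice for external validation; that reading is an assumption, as are the population-variance form and the toy values in the sketch below.

```python
# Sketch only: Lin's concordance correlation coefficient between observed
# and predicted activities (population-variance form). Data are hypothetical.

def concordance_cc(observed, predicted):
    """CCC = 2*cov / (var_obs + var_pred + (mean_obs - mean_pred)^2)."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    var_o = sum((o - mo) ** 2 for o in observed) / n
    var_p = sum((p - mp) ** 2 for p in predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted)) / n
    return 2.0 * cov / (var_o + var_p + (mo - mp) ** 2)

identical = concordance_cc([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
shifted = concordance_cc([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

Unlike a plain correlation coefficient, this statistic penalizes a systematic offset: predictions shifted by a constant still correlate perfectly with the observations but score below 1 here.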
This presentation will examine the impact of data quality on the construction of QSAR models being developed within the EPA's National Center for Computational Toxicology. We have developed a public-facing platform to provide access to predictive models. As part of the work we ha...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Esposito, Emilio Xavier, E-mail: emilio@exeResearch.com; The Chem21 Group, Inc., 1780 Wilson Drive, Lake Forest, IL 60045; Hopfinger, Anton J., E-mail: hopfingr@gmail.com
2015-10-01
Carbon nanotubes have become widely used in a variety of applications including biosensors and drug carriers. Therefore, the issue of carbon nanotube toxicity is increasingly an area of focus and concern. While previous studies have focused on the gross mechanisms of action relating to nanomaterials interacting with biological entities, this study proposes detailed mechanisms of action, relating to nanotoxicity, for a series of decorated (functionalized) carbon nanotube complexes based on previously reported QSAR models. Possible mechanisms of nanotoxicity for six endpoints (bovine serum albumin, carbonic anhydrase, chymotrypsin, hemoglobin along with cell viability and nitrogen oxide production) have been extracted from the corresponding optimized QSAR models. The molecular features relevant to each endpoint's respective mechanism of action for the decorated nanotubes are also discussed. Based on the molecular information contained within the optimal QSAR models for each nanotoxicity endpoint, either the decorator attached to the nanotube is directly responsible for the expression of a particular activity, irrespective of the decorator's 3D-geometry and independent of the nanotube, or those decorators having structures that place the functional groups of the decorators as far as possible from the nanotube surface most strongly influence the biological activity. These molecular descriptors are further used to hypothesize specific interactions involved in the expression of each of the six biological endpoints. - Highlights: • Proposed toxicity mechanism of action for decorated nanotube complexes • Discussion of the key molecular features for each endpoint's mechanism of action • Unique mechanisms of action for each of the six biological systems • Hypothesized mechanisms of action based on QSAR/QNAR predictive models.
Molecular docking and 3D-QSAR studies on inhibitors of DNA damage signaling enzyme human PARP-1.
Fatima, Sabiha; Bathini, Raju; Sivan, Sree Kanth; Manga, Vijjulatha
2012-08-01
Poly (ADP-ribose) polymerase-1 (PARP-1) operates in a DNA damage signaling network. Molecular docking and three-dimensional quantitative structure-activity relationship (3D-QSAR) studies were performed on human PARP-1 inhibitors. The docked conformation obtained for each molecule was used as such for 3D-QSAR analysis. Molecules were divided randomly into a training set and a test set in four different ways, and partial least squares analysis was performed to obtain QSAR models using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA). The derived models showed good statistical reliability, as is evident from their r², q²(loo) and r²(pred) values. To obtain a consensus for predictive ability from all the models, the average regression coefficient r²(avg) was calculated. The CoMFA and CoMSIA models showed values of 0.930 and 0.936, respectively. Information obtained from the best 3D-QSAR model was applied to the optimization of the lead molecule and the design of novel potential inhibitors.
Alert-QSAR. Implications for Electrophilic Theory of Chemical Carcinogenesis
Putz, Mihai V.; Ionaşcu, Cosmin; Putz, Ana-Maria; Ostafe, Vasile
2011-01-01
Given the modeling and predictive abilities of quantitative structure activity relationships (QSARs) for genotoxic carcinogens or mutagens that directly affect DNA, the present research investigates structural alert (SA) intermediate-predicted correlations $A_{SA}$ of electrophilic molecular structures with observed carcinogenic potencies in rats (observed activity, $A = \log[1/TD_{50}]$, i.e., $A_{SA} = f(X_{1SA}, X_{2SA}, \ldots)$). The present method includes calculation of the recently developed residual correlation of the structural alert models, i.e., $A_{RASA} = f(A - A_{SA}, X_{1SA}, X_{2SA}, \ldots)$. We propose a specific electrophilic ligand-receptor mechanism that combines electronegativity with chemical hardness-associated frontier principles, equality of ligand-reagent electronegativities and ligand maximum chemical hardness for highly diverse toxic molecules against specific receptors in rats. The observed carcinogenic activity is influenced by the induced SA-mutagenic intermediate effect, alongside Hansch indices such as hydrophobicity (LogP), polarizability (POL) and total energy (Etot), which account for molecular membrane diffusion, ionic deformation, and stericity, respectively. A possible QSAR mechanistic interpretation of mutagenicity as the first step in genotoxic carcinogenesis development is discussed using the structural alert chemoinformation and in full accordance with the Organization for Economic Co-operation and Development QSAR guidance principles. PMID:21954348
QSAR Analysis of 2-Amino or 2-Methyl-1-Substituted Benzimidazoles Against Pseudomonas aeruginosa
Podunavac-Kuzmanović, Sanja O.; Cvetković, Dragoljub D.; Barna, Dijana J.
2009-01-01
A set of benzimidazole derivatives were tested for their inhibitory activities against the Gram-negative bacterium Pseudomonas aeruginosa and minimum inhibitory concentrations were determined for all the compounds. Quantitative structure activity relationship (QSAR) analysis was applied to fourteen of the abovementioned derivatives using a combination of various physicochemical, steric, electronic, and structural molecular descriptors. A multiple linear regression (MLR) procedure was used to model the relationships between molecular descriptors and the antibacterial activity of the benzimidazole derivatives. The stepwise regression method was used to derive the most significant models as a calibration model for predicting the inhibitory activity of this class of molecules. The best QSAR models were further validated by a leave one out technique as well as by the calculation of statistical parameters for the established theoretical models. To confirm the predictive power of the models, an external set of molecules was used. High agreement between experimental and predicted inhibitory values, obtained in the validation procedure, indicated the good quality of the derived QSAR models. PMID:19468332
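The leave-one-out validation mentioned above can be sketched as follows. This is a simplified illustration rather than the authors' MLR workflow: it assumes a single descriptor, the data are hypothetical, and `fit_line` and `q2_loo` are names introduced here. Each compound is left out in turn, the line is refitted on the rest, and the squared prediction errors are accumulated into PRESS, giving q² = 1 − PRESS / SS_total.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def q2_loo(x, y):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(x)):
        # Refit without compound i, then predict compound i.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a + b * x[i])) ** 2
    my = sum(y) / len(y)
    ss_total = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss_total

# Hypothetical descriptor values and log(1/MIC) activities (illustrative only).
descriptor = [1.2, 1.9, 2.4, 3.1, 3.8, 4.4, 5.0, 5.7]
activity = [3.9, 4.3, 4.4, 4.9, 5.1, 5.6, 5.7, 6.2]

print(f"q2(LOO) = {q2_loo(descriptor, activity):.3f}")
```

Because every prediction is made for a compound the fit has not seen, q² is typically lower than the fitted r²; a large gap between the two hints at overfitting.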
Ghasemi, Jahan B; Safavi-Sohi, Reihaneh; Barbosa, Euzébio G
2012-02-01
A quasi 4D-QSAR has been carried out on a series of potent Gram-negative LpxC inhibitors. This approach makes use of the molecular dynamics (MD) trajectories and topology information retrieved from the GROMACS package. This new methodology is based on the generation of a conformational ensemble profile, CEP, for each compound instead of only one conformation, followed by the calculation of intermolecular interaction energies at each grid point considering probes and all aligned conformations resulting from MD simulations. These interaction energies are the independent variables employed in a QSAR analysis. The proposed methodology was compared with the comparative molecular field analysis (CoMFA) formalism. This methodology jointly explores the main features of CoMFA and 4D-QSAR models. Step-wise multiple linear regression was used for the selection of the most informative variables. After variable selection, multiple linear regression (MLR) and partial least squares (PLS) methods were used for building the regression models. Leave-N-out cross-validation (LNO) and Y-randomization were performed in order to confirm the robustness of the model, in addition to analysis of the independent test set. The best models provided the following statistics: [Formula in text] (PLS) and [Formula in text] (MLR). A docking study was applied to investigate the major interactions in the protein-ligand complex with the CDOCKER algorithm. Visualization of the descriptors of the best model helps us to interpret the model from the chemical point of view, supporting the applicability of this new approach in rational drug design.
Posa, Mihalj; Pilipović, Ana; Lalić, Mladena; Popović, Jovan
2011-02-15
A linear dependence between temperature (t) and retention coefficient (k, reversed-phase HPLC) of bile acids is obtained. The parameters (a, intercept, and b, slope) of the linear function k=f(t) correlate highly with the bile acids' structures. The investigated bile acids form linear congeneric groups on a principal component (calculated from k=f(t)) score plot that are in accordance with the conformations of the hydroxyl and oxo groups in the bile acid steroid skeleton. The partition coefficient (K(p)) of nitrazepam in bile acid micelles is investigated. Nitrazepam molecules incorporated in micelles show modified bioavailability (depot effect, higher permeability, etc.). Using the multiple linear regression method, QSAR models of nitrazepam's partition coefficient, K(p), are derived at temperatures of 25°C and 37°C. The linear regression models at both temperatures include experimentally obtained lipophilicity parameters (PC1 from data k=f(t)) and in silico descriptors of the shape of a molecule, while at the higher temperature molecular polarisation is also introduced. This indicates that the incorporation mechanism of nitrazepam in BA micelles changes at the higher temperature. QSAR models are derived using the partial least squares method as well. Experimental parameters k=f(t) are shown to be significant predictive variables. Both QSAR models are validated using cross validation and internal validation methods. PLS models have slightly higher predictive capability than MLR models. Copyright © 2010 Elsevier B.V. All rights reserved.
Ko, Gene M; Garg, Rajni; Bailey, Barbara A; Kumar, Sunil
2016-01-01
Quantitative structure-activity relationship (QSAR) models can be used as a predictive tool for virtual screening of chemical libraries to identify novel drug candidates. The aims of this paper were to report the results of a study performed for descriptor selection, QSAR model development, and virtual screening for identifying novel HIV-1 integrase inhibitor drug candidates. First, three evolutionary algorithms were compared for descriptor selection: differential evolution-binary particle swarm optimization (DE-BPSO), binary particle swarm optimization, and genetic algorithms. Next, three QSAR models were developed from an ensemble of multiple linear regression, partial least squares, and extremely randomized trees models. A comparison of the performances of three evolutionary algorithms showed that DE-BPSO has a significant improvement over the other two algorithms. QSAR models developed in this study were used in consensus as a predictive tool for virtual screening of the NCI Open Database containing 265,242 compounds to identify potential novel HIV-1 integrase inhibitors. Six compounds were predicted to be highly active (pIC50 > 6) by each of the three models. The use of a hybrid evolutionary algorithm (DE-BPSO) for descriptor selection and QSAR model development in drug design is a novel approach. Consensus modeling may provide better predictivity by taking into account a broader range of chemical properties within the data set conducive for inhibition that may be missed by an individual model. The six compounds identified provide novel drug candidate leads in the design of next generation HIV-1 integrase inhibitors targeting drug resistant mutant viruses.
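The consensus rule described above, where a compound counts as a hit only if every model predicts pIC50 > 6, can be sketched in a few lines. The compound identifiers and predicted values below are hypothetical, and `consensus_hits` is a name introduced here for illustration; the three columns merely stand in for the MLR, PLS and extremely randomized trees predictions.

```python
def consensus_hits(predictions, threshold=6.0):
    """Return the compounds whose predicted pIC50 exceeds the threshold in every model.

    predictions maps a compound id to a list of predicted pIC50 values,
    one value per QSAR model.
    """
    return [
        compound
        for compound, values in predictions.items()
        if all(value > threshold for value in values)
    ]

# Hypothetical predictions from three models (illustrative only).
predicted_pic50 = {
    "cmpd_A": [6.4, 6.1, 6.8],
    "cmpd_B": [6.9, 5.7, 6.2],  # one model disagrees, so not a consensus hit
    "cmpd_C": [4.2, 4.9, 5.1],
    "cmpd_D": [7.1, 6.6, 6.3],
}

hits = consensus_hits(predicted_pic50)
print(hits)
```

Requiring unanimous agreement is the strictest form of consensus; averaging the predictions or taking a majority vote would be looser alternatives.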
Martínez-Santiago, O; Marrero-Ponce, Y; Vivas-Reyes, R; Rivera-Borroto, O M; Hurtado, E; Treto-Suarez, M A; Ramos, Y; Vergara-Murillo, F; Orozco-Ugarriza, M E; Martínez-López, Y
2017-05-01
Graph derivative indices (GDIs) have recently been defined over N-atoms (N = 2, 3 and 4) simultaneously, which are based on the concept of derivatives in discrete mathematics (finite difference), metaphorical to the derivative concept in classical mathematical analysis. These molecular descriptors (MDs) codify topo-chemical and topo-structural information based on the concept of the derivative of a molecular graph with respect to a given event (S) over duplex, triplex and quadruplex relations of atoms (vertices). These GDIs have been successfully applied in the description of physicochemical properties like reactivity, solubility and chemical shift, among others, and in several comparative quantitative structure activity/property relationship (QSAR/QSPR) studies. Although satisfactory results have been obtained in previous modelling studies with the aforementioned indices, it is necessary to develop new, more rigorous analysis to assess the true predictive performance of the novel structure codification. So, in the present paper, an assessment and statistical validation of the performance of these novel approaches in QSAR studies are executed, as well as a comparison with those of other QSAR procedures reported in the literature. To achieve the main aim of this research, QSARs were developed on eight chemical datasets widely used as benchmarks in the evaluation/validation of several QSAR methods and/or many different MDs (fundamentally 3D MDs). Three to seven variable QSAR models were built for each chemical dataset, according to the original dissection into training/test sets. The models were developed by using multiple linear regression (MLR) coupled with a genetic algorithm as the feature wrapper selection technique in the MobyDigs software. Each family of GDIs (for duplex, triplex and quadruplex) behaves similarly in all modelling, although there were some exceptions. However, when all families were used in combination, the results achieved were quantitatively
Gini, Giuseppina
2016-01-01
In this chapter, we introduce the basis of computational chemistry and discuss how computational methods have been extended to some biological properties and to toxicology in particular. For about 20 years, chemical experimentation has been increasingly replaced by modeling and virtual experimentation, using a large core of mathematics, chemistry, physics, and algorithms. We then see how animal experiments, aimed at providing a standardized result about a biological property, can be mimicked by new in silico methods. Our emphasis here is on toxicology and on predicting properties through chemical structures. Two main streams of such models are available: models that consider the whole molecular structure to predict a value, namely QSAR (Quantitative Structure Activity Relationships), and models that find relevant substructures to predict a class, namely SAR. The term in silico discovery is applied to chemical design, to computational toxicology, and to drug discovery. We discuss how the experimental practice in biological science is moving more and more toward modeling and simulation. Such virtual experiments confirm hypotheses, provide data for regulation, and help in designing new chemicals.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Valencia, Antoni; Prous, Josep; Mora, Oscar
As indicated in ICH M7 draft guidance, in silico predictive tools including statistically-based QSARs and expert analysis may be used as a computational assessment for bacterial mutagenicity for the qualification of impurities in pharmaceuticals. To address this need, we developed and validated a QSAR model to predict Salmonella t. mutagenicity (Ames assay outcome) of pharmaceutical impurities using Prous Institute's Symmetry℠, a new in silico solution for drug discovery and toxicity screening, and the Mold2 molecular descriptor package (FDA/NCTR). Data was sourced from public benchmark databases with known Ames assay mutagenicity outcomes for 7300 chemicals (57% mutagens). Of these data, 90% was used to train the model and the remaining 10% was set aside as a holdout set for validation. The model's applicability to drug impurities was tested using a FDA/CDER database of 951 structures, of which 94% were found within the model's applicability domain. The predictive performance of the model is acceptable for supporting regulatory decision-making with 84 ± 1% sensitivity, 81 ± 1% specificity, 83 ± 1% concordance and 79 ± 1% negative predictivity based on internal cross-validation, while the holdout dataset yielded 83% sensitivity, 77% specificity, 80% concordance and 78% negative predictivity. Given the importance of having confidence in negative predictions, an additional external validation of the model was also carried out, using marketed drugs known to be Ames-negative, and obtained 98% coverage and 81% specificity. Additionally, Ames mutagenicity data from FDA/CFSAN was used to create another data set of 1535 chemicals for external validation of the model, yielding 98% coverage, 73% sensitivity, 86% specificity, 81% concordance and 84% negative predictivity. - Highlights: • A new in silico QSAR model to predict Ames mutagenicity is described. • The model is extensively validated with chemicals from the FDA and the public domain. • Validation
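The four performance figures quoted in this abstract all derive from a two-by-two confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical and are not the study's data, and `ames_metrics` is a name introduced here for illustration.

```python
def ames_metrics(tp, fn, tn, fp):
    """Binary classification metrics from confusion-matrix counts.

    tp: mutagens predicted positive      fn: mutagens predicted negative
    tn: non-mutagens predicted negative  fp: non-mutagens predicted positive
    """
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),            # fraction of mutagens caught
        "specificity": tn / (tn + fp),            # fraction of non-mutagens cleared
        "concordance": (tp + tn) / total,         # overall agreement (accuracy)
        "negative_predictivity": tn / (tn + fn),  # reliability of a negative call
    }

# Hypothetical counts for a 200-compound holdout set (illustrative only).
metrics = ames_metrics(tp=83, fn=17, tn=77, fp=23)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Negative predictivity depends on the proportion of mutagens in the set, so values from data sets with different mutagen fractions are not directly comparable.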
Du, Qi-Shi; Huang, Ri-Bo; Wei, Yu-Tuo; Pang, Zong-Wen; Du, Li-Qin; Chou, Kuo-Chen
2009-01-30
In cooperation with fragment-based design, a new drug design method, the so-called "fragment-based quantitative structure-activity relationship" (FB-QSAR), is proposed. The essence of the new method is that the molecular framework in a family of drug candidates is divided into several fragments according to the substituents being investigated. The bioactivities of molecules are correlated with the physicochemical properties of the molecular fragments through two sets of coefficients in the linear free energy equations. One coefficient set is for the physicochemical properties and the other for the weight factors of the molecular fragments. Meanwhile, an iterative double least square (IDLS) technique is developed to solve the two sets of coefficients in a training data set alternately and iteratively. The IDLS technique is a feedback procedure with machine learning ability. The standard two-dimensional quantitative structure-activity relationship (2D-QSAR) is a special case of the FB-QSAR in which the whole molecule is treated as one entity. The FB-QSAR approach can remarkably enhance the predictive power and provide more structural insights into rational drug design. As an example, the FB-QSAR is applied to build a predictive model of neuraminidase inhibitors for drug development against H5N1 influenza virus. (c) 2008 Wiley Periodicals, Inc.
AQUATIC TOXICITY MODE OF ACTION STUDIES APPLIED TO QSAR DEVELOPMENT
A series of QSAR models for predicting fish acute lethality were developed using systematically collected data on more than 600 chemicals. These models were developed based on the assumption that chemicals producing toxicity through a common mechanism will have commonality in the...
Approaches to developing alternative and predictive toxicology based on PBPK/PD and QSAR modeling.
Yang, R S; Thomas, R S; Gustafson, D L; Campain, J; Benjamin, S A; Verhaar, H J; Mumtaz, M M
1998-01-01
Systematic toxicity testing, using conventional toxicology methodologies, of single chemicals and chemical mixtures is highly impractical because of the immense numbers of chemicals and chemical mixtures involved and the limited scientific resources. Therefore, the development of unconventional, efficient, and predictive toxicology methods is imperative. Using carcinogenicity as an end point, we present approaches for developing predictive tools for toxicologic evaluation of chemicals and chemical mixtures relevant to environmental contamination. Central to the approaches presented is the integration of physiologically based pharmacokinetic/pharmacodynamic (PBPK/PD) and quantitative structure-activity relationship (QSAR) modeling with focused mechanistically based experimental toxicology. In this development, molecular and cellular biomarkers critical to the carcinogenesis process are evaluated quantitatively between different chemicals and/or chemical mixtures. Examples presented include the integration of PBPK/PD and QSAR modeling with a time-course medium-term liver foci assay, molecular biology and cell proliferation studies, Fourier transform infrared spectroscopic analyses of DNA changes, and cancer modeling to assess and attempt to predict the carcinogenicity of the series of 12 chlorobenzene isomers. Also presented is an ongoing effort to develop and apply a similar approach to chemical mixtures using in vitro cell culture (Syrian hamster embryo cell transformation assay and human keratinocytes) methodologies and in vivo studies. The promise and pitfalls of these developments are elaborated. When successfully applied, these approaches may greatly reduce animal usage, personnel, resources, and time required to evaluate the carcinogenicity of chemicals and chemical mixtures. PMID:9860897
QSAR prediction of additive and non-additive mixture toxicities of antibiotics and pesticide.
Qin, Li-Tang; Chen, Yu-Han; Zhang, Xin; Mo, Ling-Yun; Zeng, Hong-Hu; Liang, Yan-Peng
2018-05-01
Antibiotics and pesticides may exist as a mixture in the real environment. The combined effect of a mixture can be either additive or non-additive (synergism and antagonism). However, no effective approach exists for predicting the synergistic and antagonistic toxicities of mixtures. In this study, we developed a quantitative structure-activity relationship (QSAR) model for the toxicities (half effect concentration, EC50) of 45 binary and multi-component mixtures composed of two antibiotics and four pesticides. The acute toxicities of the single compounds and mixtures toward Aliivibrio fischeri were tested. A genetic algorithm was used to obtain the optimized model with three theoretical descriptors. Internal and external validation gave a coefficient of determination of 0.9366 and a root mean square error of 0.1345 for the QSAR model, which predicted the 45 mixture toxicities presenting additive, synergistic, and antagonistic effects. Compared with the traditional concentration addition and independent action models, the QSAR model exhibited an advantage in predicting mixture toxicity. Thus, the presented approach may be able to fill the gaps in predicting non-additive toxicities of binary and multi-component mixtures. Copyright © 2018 Elsevier Ltd. All rights reserved.
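The two traditional reference models that the abstract compares against can be written down compactly. The sketch below uses their standard textbook forms: concentration addition gives EC50_mix = 1 / Σ(p_i / EC50_i), where p_i is the fraction of component i in the mixture, and independent action gives E_mix = 1 − Π(1 − E_i), where E_i is the fractional effect of component i at its concentration in the mixture. The numbers are hypothetical and the function names are introduced here for illustration.

```python
def ca_ec50(fractions, ec50s):
    """Concentration addition: mixture EC50 from component fractions and EC50s."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("component fractions must sum to 1")
    return 1.0 / sum(p / ec50 for p, ec50 in zip(fractions, ec50s))

def ia_effect(effects):
    """Independent action: combined fractional effect of independently acting components."""
    unaffected = 1.0
    for effect in effects:
        unaffected *= 1.0 - effect
    return 1.0 - unaffected

# Hypothetical equimolar binary mixture (illustrative only).
mixture_ec50 = ca_ec50([0.5, 0.5], [1.0, 4.0])  # component EC50s in mg/L
mixture_effect = ia_effect([0.2, 0.5])          # single-compound effects

print(f"CA-predicted EC50: {mixture_ec50:.2f} mg/L")
print(f"IA-predicted effect: {mixture_effect:.0%}")
```

An observed mixture toxicity clearly stronger than these predictions would indicate synergism, and a clearly weaker one antagonism; neither reference model can predict such deviations by construction.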
DAT/SERT Selectivity of Flexible GBR 12909 Analogs Modeled Using 3D-QSAR Methods
Gilbert, Kathleen M.; Boos, Terrence L.; Dersch, Christina M.; Greiner, Elisabeth; Jacobson, Arthur E.; Lewis, David; Matecka, Dorota; Prisinzano, Thomas E.; Zhang, Ying; Rothman, Richard B.; Rice, Kenner C.; Venanzi, Carol A.
2007-01-01
The dopamine reuptake inhibitor GBR 12909 (1-{2-[bis(4-fluorophenyl)methoxy]ethyl}-4-(3-phenylpropyl)piperazine, 1) and its analogs have been developed as tools to test the hypothesis that selective dopamine transporter (DAT) inhibitors will be useful therapeutics for cocaine addiction. This 3D-QSAR study focuses on the effect of substitutions in the phenylpropyl region of 1. CoMFA and CoMSIA techniques were used to determine a predictive and stable model for the DAT/serotonin transporter (SERT) selectivity (represented by pKi (DAT/SERT)) of a set of flexible analogs of 1, most of which have eight rotatable bonds. In the absence of a rigid analog to use as a 3D-QSAR template, six conformational families of analogs were constructed from six pairs of piperazine and piperidine template conformers identified by hierarchical clustering as representative molecular conformations. Three models stable to y-value scrambling were identified after a comprehensive CoMFA and CoMSIA survey with Region Focusing. Test set correlation validation led to an acceptable model, with q2 = 0.508, standard error of prediction = 0.601, two components, r2 = 0.685, standard error of estimate = 0.481, F value = 39, percent steric contribution = 65, and percent electrostatic contribution = 35. A CoMFA contour map identified areas of the molecule that affect pKi (DAT/SERT). This work outlines a protocol for deriving a stable and predictive model of the biological activity of a set of very flexible molecules. PMID:17127069
Chirico, Nicola; Gramatica, Paola
2011-09-26
The main utility of QSAR models is their ability to predict activities/properties for new chemicals, and this external prediction ability is evaluated by means of various validation criteria. As a measure for such evaluation the OECD guidelines have proposed the predictive squared correlation coefficient Q(2)(F1) (Shi et al.). However, other validation criteria have been proposed by other authors: the Golbraikh-Tropsha method, r(2)(m) (Roy), Q(2)(F2) (Schüürmann et al.), Q(2)(F3) (Consonni et al.). In QSAR studies these measures are usually in accordance, though this is not always the case, thus doubts can arise when contradictory results are obtained. It is likely that none of the aforementioned criteria is the best in every situation, so a comparative study using simulated data sets is proposed here, using threshold values suggested by the proponents or those widely used in QSAR modeling. In addition, a different and simple external validation measure, the concordance correlation coefficient (CCC), is proposed and compared with other criteria. Huge data sets were used to study the general behavior of validation measures, and the concordance correlation coefficient was shown to be the most restrictive. On using simulated data sets of a more realistic size, it was found that CCC was broadly in agreement, about 96% of the time, with other validation measures in accepting models as predictive, and in almost all the examples it was the most precautionary. The proposed concordance correlation coefficient also works well on real data sets, where it seems to be more stable, and helps in making decisions when the validation measures are in conflict. Since it is conceptually simple, and given its stability and restrictiveness, we propose the concordance correlation coefficient as a complementary, or alternative, more prudent measure of a QSAR model to be externally predictive.
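To make the comparison above concrete, the following sketch implements the commonly cited definitions of the concordance correlation coefficient and the three Q² variants. The formulas are written from the general literature rather than from this abstract, so treat them as an assumption to check against the paper; the data and function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def ccc(obs, pred):
    """Concordance correlation coefficient between observed and predicted values."""
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    ss_obs = sum((o - mo) ** 2 for o in obs)
    ss_pred = sum((p - mp) ** 2 for p in pred)
    return 2.0 * cov / (ss_obs + ss_pred + len(obs) * (mo - mp) ** 2)

def press(obs, pred):
    """Predictive residual sum of squares over the external set."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred))

def q2_f1(obs, pred, y_train):
    """Reference deviation taken around the training-set mean."""
    return 1.0 - press(obs, pred) / sum((o - mean(y_train)) ** 2 for o in obs)

def q2_f2(obs, pred):
    """Reference deviation taken around the external-set mean."""
    return 1.0 - press(obs, pred) / sum((o - mean(obs)) ** 2 for o in obs)

def q2_f3(obs, pred, y_train):
    """Mean squared external error scaled by the training-set variance."""
    tss_train = sum((y - mean(y_train)) ** 2 for y in y_train)
    return 1.0 - (press(obs, pred) / len(obs)) / (tss_train / len(y_train))

# Hypothetical data: predictions perfectly correlated with observations
# but shifted by one unit (illustrative only).
y_train = [0.0, 2.0, 4.0, 6.0]
y_ext = [1.0, 2.0, 3.0, 4.0]
y_pred = [2.0, 3.0, 4.0, 5.0]

print(f"CCC   = {ccc(y_ext, y_pred):.3f}")
print(f"Q2_F1 = {q2_f1(y_ext, y_pred, y_train):.3f}")
print(f"Q2_F2 = {q2_f2(y_ext, y_pred):.3f}")
print(f"Q2_F3 = {q2_f3(y_ext, y_pred, y_train):.3f}")
```

In this toy case the Pearson correlation is exactly 1, yet every measure is penalized by the systematic offset, and the measures disagree with one another about how severe it is. That kind of conflict is what the comparative study addresses.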
Chi, Yulang; Zhang, Huanteng; Huang, Qiansheng; Lin, Yi; Ye, Guozhu; Zhu, Huimin; Dong, Sijun
2018-02-01
Environmental risks of organic chemicals have been greatly determined by their persistence, bioaccumulation, and toxicity (PBT) and physicochemical properties. Major regulations in different countries and regions identify chemicals according to their bioconcentration factor (BCF) and octanol-water partition coefficient (Kow), which frequently displays a substantial correlation with the sediment sorption coefficient (Koc). Half-life or degradability is crucial for the persistence evaluation of chemicals. Quantitative structure activity relationship (QSAR) estimation models are indispensable for predicting environmental fate and health effects in the absence of field- or laboratory-based data. In this study, 39 chemicals of high concern were chosen for half-life testing based on total organic carbon (TOC) degradation, and two widely accepted and highly used QSAR estimation models (i.e., EPI Suite and PBT Profiler) were adopted for environmental risk evaluation. The experimental results and estimated data, as well as the two model-based results were compared, based on the water solubility, Kow, Koc, BCF and half-life. Environmental risk assessment of the selected compounds was achieved by combining experimental data and estimation models. It was concluded that both EPI Suite and PBT Profiler were fairly accurate in measuring the physicochemical properties and degradation half-lives for water, soil, and sediment. However, the half-lives between the experimental and the estimated results were still not absolutely consistent. This suggests deficiencies of the prediction models in some ways, and the necessity to combine the experimental data and predicted results for the evaluation of environmental fate and risks of pollutants. Copyright © 2016. Published by Elsevier B.V.
Khanfar, Mohammad A; Banat, Fahmy; Alabed, Shada; Alqtaishat, Saja
2017-02-01
High expression of Nek2 has been detected in several types of cancer and it represents a novel target for human cancer. In the current study, structure-based pharmacophore modeling combined with multiple linear regression (MLR)-based QSAR analyses was applied to disclose the structural requirements for NEK2 inhibition. Generated pharmacophoric models were initially validated with receiver operating characteristic (ROC) curves, and optimum models were subsequently implemented in QSAR modeling with other physicochemical descriptors. QSAR-selected models were employed as 3D search filters to mine the National Cancer Institute (NCI) database for novel NEK2 inhibitors, whereas the associated QSAR model prioritized the bioactivities of captured hits for in vitro evaluation. Experimental validation identified several potent NEK2 inhibitors of novel structural scaffolds. The most potent captured hit exhibited an [Formula: see text] value of 237 nM.
5D-QSAR for spirocyclic sigma1 receptor ligands by Quasar receptor surface modeling.
Oberdorf, Christoph; Schmidt, Thomas J; Wünsch, Bernhard
2010-07-01
Based on a contiguous and structurally as well as biologically diverse set of 87 sigma(1) ligands, a 5D-QSAR study was conducted in which a quasi-atomistic receptor surface modeling approach (program package Quasar) was applied. The superposition of the ligands was performed with the tool Pharmacophore Elucidation (MOE-package), which takes all conformations of the ligands into account. This procedure led to four pharmacophoric structural elements with aromatic, hydrophobic, cationic and H-bond acceptor properties. Using the aligned structures a 3D-model of the ligand binding site of the sigma(1) receptor was obtained, whose general features are in good agreement with previous assumptions on the receptor structure, but revealed some novel insights since it represents the receptor surface in more detail. Thus, e.g., our model indicates the presence of an H-bond acceptor moiety in the binding site as counterpart to the ligands' cationic ammonium center, rather than a negatively charged carboxylate group. The presented QSAR model is statistically valid and represents the biological data of all tested compounds, including a test set of 21 ligands not used in the modeling process, with very good to excellent accuracy [q(2) (training set, n=66; leave 1/3 out) = 0.84, p(2) (test set, n=21)=0.64]. Moreover, the binding affinities of 13 further spirocyclic sigma(1) ligands were predicted with reasonable accuracy (mean deviation in pK(i) approximately 0.8). Thus, in addition to novel insights into the requirements for binding of spirocyclic piperidines to the sigma(1) receptor, the presented model can be used successfully in the rational design of new sigma(1) ligands. Copyright (c) 2010 Elsevier Masson SAS. All rights reserved.
Ai, Yong; Wang, Shao-Teng; Sun, Ping-Hua; Song, Fa-Jun
2011-01-01
Aurora kinases have emerged as attractive targets for the design of anticancer drugs. 3D-QSAR (comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA)) and Surflex-docking studies were performed on a series of pyrrole-indoline-2-ones as Aurora A inhibitors. The CoMFA and CoMSIA models using 25 inhibitors in the training set gave r(2) (cv) values of 0.726 and 0.566, and r(2) values of 0.972 and 0.984, respectively. The adapted alignment method with the suitable parameters resulted in reliable models. The contour maps produced by the CoMFA and CoMSIA models were employed to rationalize the key structural requirements responsible for the activity. Surflex-docking studies revealed that the sulfo group, secondary amine group on indolin-2-one, and carbonyl of 6,7-dihydro-1H-indol-4(5H)-one groups were significant for binding to the receptor, and some essential features were also identified. Based on the 3D-QSAR and docking results, a set of new molecules with high predicted activities were designed.
NASA Astrophysics Data System (ADS)
Assefa, Haregewein; Kamath, Shantaram; Buolamwini, John K.
2003-08-01
The overexpression and/or mutation of the epidermal growth factor receptor (EGFR) tyrosine kinase has been observed in many human solid tumors, and is under intense investigation as a novel anticancer molecular target. Comparative 3D-QSAR analyses using different alignments were undertaken employing comparative molecular field analysis (CoMFA) and comparative molecular similarity analysis (CoMSIA) for 122 anilinoquinazoline and 50 anilinoquinoline inhibitors of EGFR kinase. The SYBYL multifit alignment rule was applied to three different conformational templates, two obtained from a MacroModel Monte Carlo conformational search, and one from the bound conformation of erlotinib in complex with EGFR in the X-ray crystal structure. In addition, a flexible ligand docking alignment obtained with the GOLD docking program, and a novel flexible receptor-guided consensus dynamics alignment obtained with the DISCOVER program in the INSIGHTII modeling package were also investigated. 3D-QSAR models with q2 values up to 0.70 and r2 values up to 0.97 were obtained. Among the 4-anilinoquinazoline set, the q2 values were similar, but the ability of the different conformational models to predict the activities of an external test set varied considerably. In this regard, the model derived using the X-ray crystallographically determined bioactive conformation of erlotinib afforded the best predictive model. Electrostatic, hydrophobic and H-bond donor descriptors contributed the most to the QSAR models of the 4-anilinoquinazolines, whereas electrostatic, hydrophobic and H-bond acceptor descriptors contributed the most to the 4-anilinoquinoline QSAR, particularly the H-bond acceptor descriptor. A novel receptor-guided consensus dynamics alignment has also been introduced for 3D-QSAR studies. This new alignment method may incorporate to some extent ligand-receptor induced fit effects into 3D-QSAR models.
Acute oral toxicity data are used to meet both regulatory and non-regulatory needs. Recently, there have been efforts to explore alternative approaches for predicting acute oral toxicity such as QSARs. Evaluating the performance and scope of existing models and investigating the ...
Hisaki, Tomoka; Aiba Née Kaneko, Maki; Yamaguchi, Masahiko; Sasa, Hitoshi; Kouzuki, Hirokazu
2015-04-01
Use of laboratory animals for systemic toxicity testing is subject to strong ethical and regulatory constraints, but few alternatives are yet available. One possible approach to predict systemic toxicity of chemicals in the absence of experimental data is quantitative structure-activity relationship (QSAR) analysis. Here, we present QSAR models for prediction of maximum "no observed effect level" (NOEL) for repeated-dose, developmental and reproductive toxicities. NOEL values of 421 chemicals for repeated-dose toxicity, 315 for reproductive toxicity, and 156 for developmental toxicity were collected from Japan Existing Chemical Data Base (JECDB). Descriptors to predict toxicity were selected based on molecular orbital (MO) calculations, and QSAR models employing multiple independent descriptors as the input layer of an artificial neural network (ANN) were constructed to predict NOEL values. Robustness of the models was indicated by the root-mean-square (RMS) errors after 10-fold cross-validation (0.529 for repeated-dose, 0.508 for reproductive, and 0.558 for developmental toxicity). Evaluation of the models in terms of the percentages of predicted NOELs falling within factors of 2, 5 and 10 of the in-vivo-determined NOELs suggested that the model is applicable to both general chemicals and the subset of chemicals listed in International Nomenclature of Cosmetic Ingredients (INCI). Our results indicate that ANN models using in silico parameters have useful predictive performance, and should contribute to integrated risk assessment of systemic toxicity using a weight-of-evidence approach. Availability of predicted NOELs will allow calculation of the margin of safety, as recommended by the Scientific Committee on Consumer Safety (SCCS).
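The "within a factor of 2, 5 and 10" evaluation described above can be illustrated with a short sketch. This is not the authors' code: the function name `fraction_within_factor` and the NOEL values below are hypothetical and serve only to show how such a fold-error criterion is typically computed.

```python
import math

def fraction_within_factor(predicted, observed, factor):
    """Share of predictions whose ratio to the observed value lies in [1/factor, factor]."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    limit = math.log10(factor)
    hits = sum(
        1
        for p, o in zip(predicted, observed)
        # Compare on the log scale so over- and under-prediction are treated symmetrically.
        if abs(math.log10(p) - math.log10(o)) <= limit + 1e-12
    )
    return hits / len(predicted)

# Hypothetical NOEL values in mg/kg/day (illustrative only, not taken from the study).
observed = [10.0, 100.0, 30.0, 300.0, 1000.0]
predicted = [15.0, 40.0, 200.0, 310.0, 90.0]

for factor in (2, 5, 10):
    print(factor, fraction_within_factor(predicted, observed, factor))
```

With these made-up numbers, two of five predictions fall within a factor of 2, three within a factor of 5, and four within a factor of 10.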
Esposito, Emilio Xavier; Hopfinger, Anton J; Shao, Chi-Yu; Su, Bo-Han; Chen, Sing-Zuo; Tseng, Yufeng Jane
2015-10-01
Carbon nanotubes have become widely used in a variety of applications including biosensors and drug carriers. Therefore, the issue of carbon nanotube toxicity is increasingly an area of focus and concern. While previous studies have focused on the gross mechanisms of action relating to nanomaterials interacting with biological entities, this study proposes detailed mechanisms of action, relating to nanotoxicity, for a series of decorated (functionalized) carbon nanotube complexes based on previously reported QSAR models. Possible mechanisms of nanotoxicity for six endpoints (bovine serum albumin, carbonic anhydrase, chymotrypsin, hemoglobin along with cell viability and nitrogen oxide production) have been extracted from the corresponding optimized QSAR models. The molecular features relevant to each endpoint's respective mechanism of action for the decorated nanotubes are also discussed. Based on the molecular information contained within the optimal QSAR models for each nanotoxicity endpoint, either the decorator attached to the nanotube is directly responsible for the expression of a particular activity, irrespective of the decorator's 3D-geometry and independent of the nanotube, or those decorators having structures that place the functional groups of the decorators as far as possible from the nanotube surface most strongly influence the biological activity. These molecular descriptors are further used to hypothesize specific interactions involved in the expression of each of the six biological endpoints. Copyright © 2015 Elsevier Inc. All rights reserved.
Dai, Yujie; Chen, Nan; Wang, Qiang; Zheng, Heng; Zhang, Xiuli; Jia, Shiru; Dong, Lilong; Feng, Dacheng
2012-01-01
The inhibitors of p53-HDM2 interaction are attractive molecules for the treatment of wild-type p53 tumors. In order to search for more potent HDM2 inhibitors, docking with the CDOCKER protocol in Discovery Studio 2.1 (DS2.1) and multidimensional hybrid quantitative structure-activity relationship (QSAR) studies, using the physicochemical properties obtained from DS2.1 and E-Dragon 1.0 as descriptors, have been performed on 59 1,4-benzodiazepine-2,5-diones which have p53-HDM2 interaction inhibitory activities. The docking results indicate that the π-π interaction between the imidazole group in HIS96 and the aryl ring at 4-N of 1,4-benzodiazepine-2,5-dione may be one of the key factors for the binding of ligands to HDM2. Two QSAR models were obtained using genetic function approximation (GFA) and genetic partial least squares (G/PLS) based on the descriptors obtained from DS2.1 and E-Dragon 1.0, respectively. The best model can explain 85.5% of the variance (R2adj) while it could predict 81.7% of the variance (R2cv). With this model, the bioactivities of some new compounds were predicted.
Itteboina, Ramesh; Ballu, Srilata; Sivan, Sree Kanth; Manga, Vijjulatha
2017-10-01
Janus kinase 1 (JAK 1) belongs to the JAK family of intracellular nonreceptor tyrosine kinases. The JAK-signal transducer and activator of transcription (JAK-STAT) pathway mediates signaling by cytokines, which control survival, proliferation and differentiation of a variety of cells. Three-dimensional quantitative structure-activity relationship (3D-QSAR), molecular docking and molecular dynamics (MD) methods were applied to a dataset of Janus kinase 1 (JAK 1) inhibitors. Ligands were constructed and docked into the active site of the protein using GLIDE 5.6. The best docked poses were selected for further 3D-QSAR analysis using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) methodology. Employing 60 molecules in the training set, 3D-QSAR models were generated that showed good statistical reliability, as reflected in the r2ncv and q2loo values. The predictive ability of these models was determined using a test set of 25 molecules that gave acceptable predictive correlation (r2pred) values. The key amino acid residues were identified by means of molecular docking, and the stability and rationality of the derived molecular conformations were also validated by MD simulation. The good consonance between the docking results and CoMFA/CoMSIA contour maps provides helpful clues about the reasonable modification of molecules in order to design more efficient JAK 1 inhibitors. The developed models are expected to provide some directives for further synthesis of highly effective JAK 1 inhibitors.
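The non-cross-validated and leave-one-out statistics mentioned above can be sketched as follows. CoMFA and CoMSIA fit partial least squares models on field descriptors; the sketch below deliberately swaps in a one-descriptor ordinary least-squares line so that it stays self-contained, and the data, `fit_line`, `r2_ncv` and `q2_loo` are hypothetical names and values chosen for illustration. The q2 here uses the mean of the full training set in the denominator, which is one common convention among several.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r2_ncv(xs, ys):
    """Non-cross-validated r2: fit on all compounds, evaluate on the same compounds."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def q2_loo(xs, ys):
    """Leave-one-out q2: each compound is predicted by a model fitted without it."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - press / ss_tot

# Hypothetical descriptor values and activities (illustrative only).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

print("r2_ncv =", round(r2_ncv(descriptor, activity), 3))
print("q2_loo =", round(q2_loo(descriptor, activity), 3))
```

For a least-squares model the leave-one-out q2 cannot exceed the fitted r2, which is why a large gap between the two is usually read as a sign of overfitting.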
Alves, Vinicius M.; Muratov, Eugene; Fourches, Denis; Strickland, Judy; Kleinstreuer, Nicole; Andrade, Carolina H.; Tropsha, Alexander
2015-01-01
Skin permeability is widely considered to be mechanistically implicated in chemically-induced skin sensitization. Although many chemicals have been identified as skin sensitizers, there have been very few reports analyzing the relationships between molecular structure and skin permeability of sensitizers and non-sensitizers. The goals of this study were to: (i) compile, curate, and integrate the largest publicly available dataset of chemicals studied for their skin permeability; (ii) develop and rigorously validate QSAR models to predict skin permeability; and (iii) explore the complex relationships between skin sensitization and skin permeability. Based on the largest publicly available dataset compiled in this study, we found no overall correlation between skin permeability and skin sensitization. In addition, cross-species correlation coefficient between human and rodent permeability data was found to be as low as R2=0.44. Human skin permeability models based on the random forest method have been developed and validated using OECD-compliant QSAR modeling workflow. Their external accuracy was high (Q2ext = 0.73 for 63% of external compounds inside the applicability domain). The extended analysis using both experimentally-measured and QSAR-imputed data still confirmed the absence of any overall concordance between skin permeability and skin sensitization. This observation suggests that chemical modifications that affect skin permeability should not be presumed a priori to modulate the sensitization potential of chemicals. The models reported herein as well as those developed in the companion paper on skin sensitization suggest that it may be possible to rationally design compounds with the desired high skin permeability but low sensitization potential. PMID:25560673
Bhonsle, Jayendra B; Venugopal, Divakaramenon; Huddler, Donald P; Magill, Alan J; Hicks, Rickey P
2007-12-27
In our laboratory, a series of antimicrobial peptides have been developed, where the resulting 3D-physicochemical properties are controlled by the placement of amino acids with well-defined properties (hydrophobicity, charge density, electrostatic potential, and so on) at specific locations along the peptide backbone. These peptides exhibited different in vitro activity against Staphylococcus aureus (SA) and Mycobacterium ranae (MR) bacteria. We hypothesized that the differences in biological activity are a direct manifestation of different physicochemical interactions that occur between the peptides and the cell membranes of the bacteria. 3D-QSAR analysis has shown that, within this series, specific physicochemical properties are responsible for antibacterial activity and selectivity. There are five physicochemical properties specific to the SA QSAR model, while five properties are specific to the MR QSAR model. These results support the hypothesis that, for any particular AMP, organism selectivity and potency are controlled by the chemical composition of the target cell membrane.
Mansouri, K; Grulke, C M; Richard, A M; Judson, R S; Williams, A J
2016-11-01
The increasing availability of large collections of chemical structures and associated experimental data provides an opportunity to build robust QSAR models for applications in different fields. One common concern is the quality of both the chemical structure information and associated experimental data. Here we describe the development of an automated KNIME workflow to curate and correct errors in the structure and identity of chemicals using the publicly available PHYSPROP physicochemical properties and environmental fate datasets. The workflow first assembles structure-identity pairs using up to four provided chemical identifiers, including chemical name, CASRNs, SMILES, and MolBlock. Problems detected included errors and mismatches in chemical structure formats, identifiers and various structure validation issues, including hypervalency and stereochemistry descriptions. Subsequently, a machine learning procedure was applied to evaluate the impact of this curation process. The performance of QSAR models built on only the highest-quality subset of the original dataset was compared with the larger curated and corrected dataset. The latter showed statistically improved predictive performance. The final workflow was used to curate the full list of PHYSPROP datasets, and is being made publicly available for further usage and integration by the scientific community.
Barber, Chris; Cayley, Alex; Hanser, Thierry; Harding, Alex; Heghes, Crina; Vessey, Jonathan D; Werner, Stephane; Weiner, Sandy K; Wichard, Joerg; Giddings, Amanda; Glowienke, Susanne; Parenty, Alexis; Brigo, Alessandro; Spirkl, Hans-Peter; Amberg, Alexander; Kemper, Ray; Greene, Nigel
2016-04-01
The relative wealth of bacterial mutagenicity data available in the public literature means that in silico quantitative/qualitative structure activity relationship (QSAR) systems can readily be built for this endpoint. A good means of evaluating the performance of such systems is to use private unpublished data sets, which generally represent a more distinct chemical space than publicly available test sets and, as a result, provide a greater challenge to the model. However, raw performance metrics should not be the only factor considered when judging this type of software since expert interpretation of the results obtained may allow for further improvements in predictivity. Enough information should be provided by a QSAR to allow the user to make general, scientifically-based arguments in order to assess and overrule predictions when necessary. With all this in mind, we sought to validate the performance of the statistics-based in vitro bacterial mutagenicity prediction system Sarah Nexus (version 1.1) against private test data sets supplied by nine different pharmaceutical companies. The results of these evaluations were then analysed in order to identify findings presented by the model which would be useful for the user to take into consideration when interpreting the results and making their final decision about the mutagenic potential of a given compound. Copyright © 2015 Elsevier Inc. All rights reserved.
Jing, Pu; Zhao, Shujuan; Ruan, Siyu; Sui, Zhongquan; Chen, Lihong; Jiang, Linlei; Qian, Bingjun
2014-02-15
Three-dimensional quantitative structure-activity relationship (3D-QSAR) models were established from 21 anthocyanins based on their oxygen radical absorbing capacity (ORAC) and were applied to predict the ORAC values of anthocyanins in eggplant and radish. The best QSAR (CoMFA/CoMSIA) models gave cross-validated q(2) = 0.857/0.729, non-cross-validated r(2) = 0.958/0.856, standard error of estimate = 0.153/0.134, and F = 73.267/19.247, and the correlation coefficient r(2)pred = 0.998/0.997 (>0.6) indicated a high predictive ability for each. Additionally, the contour maps suggested structural characteristics of anthocyanins that are favourable for high ORAC. Four anthocyanins from eggplant and radish were screened based on the QSAR models. Pelargonidin-3-[(6''-p-coumaroyl)-glucosyl(2 → 1)glucoside]-5-(6''-malonyl)-glucoside, delphinidin-3-rutinoside-5-glucoside, and delphinidin-3-[(4''-p-coumaroyl)-rhamnosyl(1 → 6)glucoside]-5-glucoside, which the QSAR models predicted to have high ORAC, were isolated and confirmed to have relatively high antioxidant ability, which might be attributed to the bulky and/or electron-donating substituent at the 3-position in the C ring and/or a hydrogen bond donor group/electron-donating group at the R1 position in the B ring. Copyright © 2013 Elsevier Ltd. All rights reserved.
Castillo-Garit, Juan Alberto; Abad, Concepción; Rodríguez-Borges, J Enrique; Marrero-Ponce, Yovani; Torrens, Francisco
2012-01-01
The neglected tropical diseases (NTDs) affect more than one billion people (one-sixth of the world's population) and occur primarily in undeveloped countries in sub-Saharan Africa, Asia, and Latin America. Available drugs for these diseases are decades old and present a number of important limitations, especially high toxicity and, more recently, the emergence of drug resistance. In the last decade several Quantitative Structure-Activity Relationship (QSAR) studies have been developed in order to identify new organic compounds with activity against the parasites responsible for these diseases, and these studies are reviewed in this paper. The topics summarized in this work are: 1) QSAR studies to identify new organic compounds active against Chagas disease; 2) development of QSAR studies to discover new antileishmanial drugs; 3) computational studies to identify new drug-like compounds against human African trypanosomiasis. Each topic includes the general characteristics, epidemiology and chemotherapy of the disease as well as the main QSAR approaches to the discovery/identification of new active compounds for the corresponding neglected disease. The last section is devoted to a new approach known as multi-target QSAR models, developed for antiparasitic drugs, specifically those active against trypanosomatid parasites. At present, as a result of these QSAR studies, several promising compounds active against these parasites have been identified. However, more efforts will be required in the future to develop more selective (specific) useful drugs.
USDA-ARS's Scientific Manuscript database
A three-dimensional quantitative structure-activity relationship (3D-QSAR) model of sulfonamide analogs binding a monoclonal antibody (MAbSMR) produced against sulfamerazine was carried out by Distance Comparison (DISCOtech), comparative molecular field analysis (CoMFA), and comparative molecular si...
Sinha, Siddharth; Goyal, Sukriti; Somvanshi, Pallavi; Grover, Abhinav
2017-01-01
Spinocerebellar ataxia type 2 (SCA-2) is a rare neurological disorder among the nine polyglutamine disorders, mainly caused by polyQ (CAG) trinucleotide repeat expansion within the gene encoding the ataxin-2 protein. The expanded trinucleotide repeats within the ataxin-2 protein sequester transcriptional cofactors, i.e., CREB-binding protein (CBP) and Ataxin-2 binding protein 1 (A2BP1), leading to a state of hypo-acetylation and transcriptional repression. Histone deacetylase inhibitors (HDACi) have been reported to restore transcriptional balance through inhibition of class IIa HDACs, which leads to increased acetylation and transcription as demonstrated through in-vivo studies on mouse models of Huntington's. In this study, 61 di-aryl cyclo-propanehydroxamic acid derivatives were used for developing three dimensional (3D) QSAR and pharmacophore models. These models were then employed for screening and selection of anti-ataxia compounds. The chosen QSAR model was observed to be statistically robust with a correlation coefficient (r2) value of 0.6774, a cross-validated correlation coefficient (q2) of 0.6157 and a correlation coefficient for the external test set (pred_r2) of 0.7570. A high F-test value of 77.7093 signified the robustness of the model. Two potential drug leads, ZINC 00608101 (SEI) and ZINC 00329110 (ACI), were selected after a combined procedure of pharmacophore-based screening using the pharmacophore model ADDRR.20 and structural analysis using molecular docking and dynamics simulations. The pharmacophore and the 3D-QSAR model generated were further validated for their screening and prediction ability using the enrichment factor (EF), goodness of hit (GH), and receiver operating characteristics (ROC) curve analysis. The compounds SEI and ACI exhibited docking scores of −10.097 and −9.182 kcal/mol, respectively. An evaluation of binding conformation of ligand-bound protein complexes was performed with MD simulations for a time period of 30 ns along with free
Sinha, Siddharth; Goyal, Sukriti; Somvanshi, Pallavi; Grover, Abhinav
2016-01-01
Spinocerebellar ataxia type 2 (SCA-2) is a rare neurological disorder among the nine polyglutamine disorders, mainly caused by polyQ (CAG) trinucleotide repeat expansion within the gene encoding the ataxin-2 protein. The expanded trinucleotide repeats within the ataxin-2 protein sequester transcriptional cofactors, i.e., CREB-binding protein (CBP) and Ataxin-2 binding protein 1 (A2BP1), leading to a state of hypo-acetylation and transcriptional repression. Histone deacetylase inhibitors (HDACi) have been reported to restore transcriptional balance through inhibition of class IIa HDACs, which leads to increased acetylation and transcription as demonstrated through in-vivo studies on mouse models of Huntington's. In this study, 61 di-aryl cyclo-propanehydroxamic acid derivatives were used for developing three dimensional (3D) QSAR and pharmacophore models. These models were then employed for screening and selection of anti-ataxia compounds. The chosen QSAR model was observed to be statistically robust with a correlation coefficient (r2) value of 0.6774, a cross-validated correlation coefficient (q2) of 0.6157 and a correlation coefficient for the external test set (pred_r2) of 0.7570. A high F-test value of 77.7093 signified the robustness of the model. Two potential drug leads, ZINC 00608101 (SEI) and ZINC 00329110 (ACI), were selected after a combined procedure of pharmacophore-based screening using the pharmacophore model ADDRR.20 and structural analysis using molecular docking and dynamics simulations. The pharmacophore and the 3D-QSAR model generated were further validated for their screening and prediction ability using the enrichment factor (EF), goodness of hit (GH), and receiver operating characteristics (ROC) curve analysis. The compounds SEI and ACI exhibited docking scores of -10.097 and -9.182 kcal/mol, respectively. An evaluation of binding conformation of ligand-bound protein complexes was performed with MD simulations for a time period of 30 ns along with
Sedykh, Alexander; Zhu, Hao; Tang, Hao; Zhang, Liying; Richard, Ann; Rusyn, Ivan; Tropsha, Alexander
2011-01-01
Background: Quantitative high-throughput screening (qHTS) assays are increasingly being used to inform chemical hazard identification. Hundreds of chemicals have been tested in dozens of cell lines across extensive concentration ranges by the National Toxicology Program in collaboration with the National Institutes of Health Chemical Genomics Center. Objectives: Our goal was to test a hypothesis that dose–response data points of the qHTS assays can serve as biological descriptors of assayed chemicals and, when combined with conventional chemical descriptors, improve the accuracy of quantitative structure–activity relationship (QSAR) models applied to prediction of in vivo toxicity end points. Methods: We obtained cell viability qHTS concentration–response data for 1,408 substances assayed in 13 cell lines from PubChem; for a subset of these compounds, rodent acute toxicity half-maximal lethal dose (LD50) data were also available. We used the k nearest neighbor classification and random forest QSAR methods to model LD50 data using chemical descriptors either alone (conventional models) or combined with biological descriptors derived from the concentration–response qHTS data (hybrid models). Critical to our approach was the use of a novel noise-filtering algorithm to treat qHTS data. Results: Both the external classification accuracy and coverage (i.e., fraction of compounds in the external set that fall within the applicability domain) of the hybrid QSAR models were superior to conventional models. Conclusions: Concentration–response qHTS data may serve as informative biological descriptors of molecules that, when combined with conventional chemical descriptors, may considerably improve the accuracy and utility of computational approaches for predicting in vivo animal toxicity end points. PMID:20980217
Kiwamoto, R; Spenkelink, A; Rietjens, I M C M; Punt, A
2015-01-01
Acyclic α,β-unsaturated aldehydes present in food raise a concern because the α,β-unsaturated aldehyde moiety is considered a structural alert for genotoxicity. However, controversy remains on whether in vivo at realistic dietary exposure DNA adduct formation is significant. The aim of the present study was to develop physiologically based kinetic/dynamic (PBK/D) models to examine dose-dependent detoxification and DNA adduct formation of a group of 18 food-borne acyclic α,β-unsaturated aldehydes without 2- or 3-alkylation, and with no more than one conjugated double bond. Parameters for the PBK/D models were obtained using quantitative structure-activity relationships (QSARs) defined with a training set of six selected aldehydes. Using the QSARs, PBK/D models for the other 12 aldehydes were defined. Results revealed that DNA adduct formation in the liver increases with decreasing bulkiness of the molecule especially due to less efficient detoxification. 2-Propenal (acrolein) was identified to induce the highest DNA adduct levels. At realistic dietary intake, the predicted DNA adduct levels for all aldehydes were two orders of magnitude lower than endogenous background levels observed in disease free human liver, suggesting that for all 18 aldehydes DNA adduct formation is negligible at the relevant levels of dietary intake. The present study provides a proof of principle for the use of QSAR-based PBK/D modelling to facilitate group evaluations and read-across in risk assessment. Copyright © 2014 Elsevier Inc. All rights reserved.
Ai, Yong; Wang, Shao-Teng; Sun, Ping-Hua; Song, Fa-Jun
2011-01-01
Aurora kinases have emerged as attractive targets for the design of anticancer drugs. 3D-QSAR (comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA)) and Surflex-docking studies were performed on a series of pyrrole-indoline-2-ones as Aurora A inhibitors. The CoMFA and CoMSIA models using 25 inhibitors in the training set gave r2cv values of 0.726 and 0.566, and r2 values of 0.972 and 0.984, respectively. The adapted alignment method with the suitable parameters resulted in reliable models. The contour maps produced by the CoMFA and CoMSIA models were employed to rationalize the key structural requirements responsible for the activity. Surflex-docking studies revealed that the sulfo group, secondary amine group on indolin-2-one, and carbonyl of 6,7-dihydro-1H-indol-4(5H)-one groups were significant for binding to the receptor, and some essential features were also identified. Based on the 3D-QSAR and docking results, a set of new molecules with high predicted activities were designed. PMID:21673910
Li, Jiazhong; Gramatica, Paola
2010-11-01
Quantitative structure-activity relationship (QSAR) methodology aims to explore the relationship between molecular structures and experimental endpoints, producing a model for the prediction of new data; the predictive performance of the model must be checked by external validation. Clearly, the quality of the chemical structure information and of the experimental endpoints, as well as the statistical parameters used to verify the external predictivity, have a strong influence on QSAR model reliability. Here, we emphasize the importance of these three aspects by analyzing our models on estrogen receptor binders (Endocrine disruptor knowledge base (EDKB) database). Endocrine disrupting chemicals, which mimic or antagonize the endogenous hormones such as estrogens, are a hot topic in environmental and toxicological sciences. QSAR is of great value in predicting the estrogenic activity and exploring the interactions between the estrogen receptor and ligands. We have verified our previously published model by additional external validation on new EDKB chemicals. Having found some errors in the 3D molecular conformations used, we developed a new model using the same data set with corrected structures, the same method (ordinary least-squares regression, OLS) and DRAGON descriptors. The new model, based on some different descriptors, is more predictive on external prediction sets. Three different formulas to calculate the correlation coefficient for the external prediction set (Q2ext) were compared, and the results indicated that the new proposal of Consonni et al. gave more reasonable results, consistent with the conclusions from the regression line, Williams plot and root mean square error (RMSE) values. Finally, the importance of reliable endpoint values has been highlighted by comparing the classification assignments of EDKB with those of another estrogen receptor binders database (METI): we found that 16.1% of the assignments of the common compounds were opposite (20 among 124 common
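The abstract above does not spell out the three external-validation formulas it compares. The sketch below assumes they are the variants commonly labelled Q2F1, Q2F2 and Q2F3 in the QSAR literature, with Q2F3 being the one usually attributed to Consonni et al.; that mapping, the function name `external_q2` and the data are assumptions made for illustration.

```python
def external_q2(y_train, y_ext, y_ext_pred):
    """Return three commonly used external-validation coefficients as a dict."""
    n_tr, n_ext = len(y_train), len(y_ext)
    mean_tr = sum(y_train) / n_tr
    mean_ext = sum(y_ext) / n_ext
    # Prediction error sum of squares on the external set.
    press = sum((y - p) ** 2 for y, p in zip(y_ext, y_ext_pred))
    ss_ext_about_train = sum((y - mean_tr) ** 2 for y in y_ext)
    ss_ext_about_ext = sum((y - mean_ext) ** 2 for y in y_ext)
    ss_train = sum((y - mean_tr) ** 2 for y in y_train)
    return {
        # External deviations measured from the training-set mean.
        "Q2_F1": 1 - press / ss_ext_about_train,
        # External deviations measured from the external-set mean.
        "Q2_F2": 1 - press / ss_ext_about_ext,
        # Mean squared external error scaled by the training-set variance.
        "Q2_F3": 1 - (press / n_ext) / (ss_train / n_tr),
    }

# Hypothetical activities (illustrative only).
y_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_ext = [2.0, 4.0, 6.0]
y_ext_pred = [2.5, 3.5, 6.5]

for name, value in external_q2(y_train, y_ext, y_ext_pred).items():
    print(name, round(value, 4))
```

The third variant normalizes by the training-set variance, so its value does not depend on how the external compounds happen to be distributed, which is one reason it can behave more consistently across different external sets.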
Modeling uncertainty: quicksand for water temperature modeling
Bartholow, John M.
2003-01-01
Uncertainty has been a hot topic relative to science generally, and modeling specifically. Modeling uncertainty comes in various forms: measured data, limited model domain, model parameter estimation, model structure, sensitivity to inputs, modelers themselves, and users of the results. This paper will address important components of uncertainty in modeling water temperatures, and discuss several areas that need attention as the modeling community grapples with how to incorporate uncertainty into modeling without getting stuck in the quicksand that prevents constructive contributions to policy making. The material, and in particular the reference, are meant to supplement the presentation given at this conference.
Dai, Yujie; Chen, Nan; Wang, Qiang; Zheng, Heng; Zhang, Xiuli; Jia, Shiru; Dong, Lilong; Feng, Dacheng
2012-01-01
The inhibitors of p53-HDM2 interaction are attractive molecules for the treatment of wild-type p53 tumors. In order to search for more potent HDM2 inhibitors, docking with the CDOCKER protocol in Discovery Studio 2.1 (DS2.1) and multidimensional hybrid quantitative structure-activity relationship (QSAR) studies, using the physicochemical properties obtained from DS2.1 and E-Dragon 1.0 as descriptors, have been performed on 59 1,4-benzodiazepine-2,5-diones which have p53-HDM2 interaction inhibitory activities. The docking results indicate that the π-π interaction between the imidazole group in HIS96 and the aryl ring at 4-N of 1,4-benzodiazepine-2,5-dione may be one of the key factors for the binding of ligands to HDM2. Two QSAR models were obtained using genetic function approximation (GFA) and genetic partial least squares (G/PLS) based on the descriptors obtained from DS2.1 and E-Dragon 1.0, respectively. The best model can explain 85.5% of the variance (R2adj) while it could predict 81.7% of the variance (R2cv). With this model, the bioactivities of some new compounds were predicted. PMID:24250508
NASA Astrophysics Data System (ADS)
Wang, Fangfang; Zhou, Bo
2018-04-01
Protein tyrosine phosphatase 1B (PTP1B) is an intracellular non-receptor phosphatase that is implicated in signal transduction of the insulin and leptin pathways; thus PTP1B is considered a potential target for treating type II diabetes and obesity. The present article is an attempt to formulate three-dimensional quantitative structure-activity relationship (3D-QSAR) models of a series of compounds possessing PTP1B inhibitory activities using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) techniques. The optimum template ligand-based models are statistically significant, with good CoMFA (R2cv = 0.600, R2pred = 0.6760) and CoMSIA (R2cv = 0.624, R2pred = 0.8068) values. Molecular docking was employed to elucidate the inhibitory mechanisms of this series of compounds against PTP1B. In addition, the CoMFA and CoMSIA field contour maps agree well with the structural characteristics of the binding pocket of the PTP1B active site. The knowledge of structure-activity relationships and ligand-receptor interactions from the 3D-QSAR model and molecular docking will be useful for better understanding the mechanism of ligand-receptor interaction and facilitating development of novel compounds as potent PTP1B inhibitors.
Tuppurainen, Kari; Viisas, Marja; Laatikainen, Reino; Peräkylä, Mikael
2002-01-01
A novel electronic eigenvalue (EEVA) descriptor of molecular structure for use in the derivation of predictive QSAR/QSPR models is described. Like other spectroscopic QSAR/QSPR descriptors, EEVA is also invariant as to the alignment of the structures concerned. Its performance was tested with respect to the CBG (corticosteroid binding globulin) affinity of 31 benchmark steroids. It appeared that the electronic structure of the steroids, i.e., the "spectra" derived from molecular orbital energies, is directly related to the CBG binding affinities. The predictive ability of EEVA is compared to other QSAR approaches, and its performance is discussed in the context of the Hammett equation. The good performance of EEVA is an indication of the essential quantum mechanical nature of QSAR. The EEVA method is a supplement to conventional 3D QSAR methods, which employ fields or surface properties derived from Coulombic and van der Waals interactions.
The effects of characteristics of substituents on toxicity of the nitroaromatics: HiT QSAR study
NASA Astrophysics Data System (ADS)
Kuz'min, Victor E.; Muratov, Eugene N.; Artemenko, Anatoly G.; Gorb, Leonid; Qasim, Mohammad; Leszczynski, Jerzy
2008-10-01
The present study applies the Hierarchical Technology for Quantitative Structure-Activity Relationships (HiT QSAR) for (i) evaluation of the influence of the characteristics of 28 nitroaromatic compounds (some of which belong to a widely known class of explosives) as to their toxicity; (ii) prediction of toxicity for new nitroaromatic derivatives; (iii) analysis of the effects of substituents in nitroaromatic compounds on their toxicity in vivo. The 50% lethal dose concentration for rats (LD50) was used to develop the QSAR models based on simplex representation of molecular structure. The preliminary 1D QSAR results show that even the information on the composition of molecules reveals the main tendencies of changes in toxicity. The statistical characteristics for partial least squares 2D QSAR models are quite satisfactory (R2 = 0.96-0.98; Q2 = 0.91-0.93; R2test = 0.89-0.92), which allows us to carry out the prediction of activity for 41 novel compounds designed by the application of new combinations of substituents represented in the training set. The comprehensive analysis of toxicity changes as a function of substituent position and nature was carried out. Molecular fragments that promote and interfere with toxicity were defined on the basis of the obtained models. It was shown that the mutual influence of substituents in the benzene ring plays a crucial role regarding toxicity. The influence of different substituents on toxicity can be mediated via different C-H fragments of the aromatic ring.
Fatima, Sabiha; Jatavath, Mohan Babu; Bathini, Raju; Sivan, Sree Kanth; Manga, Vijjulatha
2014-10-01
Poly(ADP-ribose) polymerase-1 (PARP-1) functions as a DNA damage sensor and signaling molecule. It plays a vital role in the repair of DNA strand breaks induced by radiation and chemotherapeutic drugs; inhibitors of this enzyme have the potential to improve cancer chemotherapy or radiotherapy. Three-dimensional quantitative structure activity relationship (3D QSAR) models were developed using comparative molecular field analysis, comparative molecular similarity indices analysis and docking studies. A set of 88 molecules were docked into the active site of six X-ray crystal structures of poly(ADP-ribose)polymerase-1 (PARP-1), by a procedure called multiple receptor conformation docking (MRCD), in order to improve the 3D QSAR models through the analysis of binding conformations. The docked poses were clustered to obtain the best receptor binding conformation. These dock poses from clustering were used for 3D QSAR analysis. Based on MRCD and QSAR information, some key features have been identified that explain the observed variance in the activity. Two receptor-based QSAR models were generated; these models showed good internal and external statistical reliability that is evident from the [Formula: see text], [Formula: see text] and [Formula: see text]. The identified key features enabled us to design new PARP-1 inhibitors.
Molecular docking and QSAR study on steroidal compounds as aromatase inhibitors.
Dai, Yujie; Wang, Qiang; Zhang, Xiuli; Jia, Shiru; Zheng, Heng; Feng, Dacheng; Yu, Peng
2010-12-01
In order to develop more potent, selective and less toxic steroidal aromatase (AR) inhibitors, molecular docking and 2D and 3D hybrid quantitative structure-activity relationship (QSAR) studies have been conducted using topological, molecular shape, spatial, structural and thermodynamic descriptors on 32 steroidal compounds. The molecular docking study shows that one or more hydrogen bonds with MET374 are one of the essential requirements for the optimum binding of ligands. The QSAR model obtained indicates that the aromatase inhibitory activity can be enhanced by increasing SIC, SC_3_C, Jurs_WNSA_1, Jurs_WPSA_1 and decreasing CDOCKER interaction energy (ECD), IAC_Total and Shadow_XZfrac. The predicted results show that this model has comparatively good predictive power and can be used to predict the activity of new steroidal aromatase inhibitors. Copyright © 2010 Elsevier Masson SAS. All rights reserved.
Ferrari, Thomas; Lombardo, Anna; Benfenati, Emilio
2018-05-14
Several methods exist to develop QSAR models automatically. Some are based on indices of the presence of atoms, others on the most similar compounds, others on molecular descriptors. Here we introduce QSARpy v1.0, a new QSAR modeling tool based on a different approach: dissimilarity. This tool fragments the molecules of the training set to extract fragments that can be associated with a difference in the property/activity value, called modulators. If the target molecule shares part of its structure with a molecule of the training set and the differences can be explained with one or more modulators, the property/activity value of the molecule of the training set is adjusted using the value associated with the modulator(s). This tool is tested here on the n-octanol/water partition coefficient (Kow, usually expressed in logarithmic units as log Kow). It is a key parameter in risk assessment since it is a measure of hydrophobicity. Its widespread use makes these estimation methods very useful to reduce testing costs. Using QSARpy v1.0, we obtained a new model to predict log Kow with accurate performance (RMSE 0.43 and R2 0.94 for the external test set), comparing favorably with other programs. QSARpy is freely available on request. Copyright © 2018 Elsevier B.V. All rights reserved.
Liu, Hong; Ji, Ming; Luo, Xiaomin; Shen, Jianhua; Huang, Xiaoqin; Hua, Weiyi; Jiang, Hualiang; Chen, Kaixian
2002-07-04
Class III antiarrhythmic agents selectively delay the effective refractory period (ERP) and increase the transmembrane action potential duration (APD). Using dofetilide (2) as a template of class III antiarrhythmic agents, we designed and synthesized 16 methylsulfonamido phenylethylamine analogues (4a-d and 5a-l). Pharmacological assay indicated that all of these compounds showed activity for increasing the ERP in isolated animal atrium; among them, the effective concentration of compound 4a is 1.6 x 10(-8) mol/L in increasing ERP by 10 ms, slightly less potent than that of 2, 1.1 x 10(-8) mol/L. Compound 4a also produced a slightly lower change in ERP at 10(-5) M, DeltaERP% = 17.5% (DeltaERP% = 24.0% for dofetilide). On the basis of this bioassay result, these 16 compounds together with dofetilide were investigated by the three-dimensional quantitative structure-activity relationship (3D-QSAR) techniques of comparative molecular field analysis (CoMFA), comparative molecular similarity index analysis (CoMSIA), and the hologram QSAR (HQSAR). The 3D-QSAR models were tested with another 11 compounds (4e-h and 5m-s) that we synthesized later. Results revealed that the activities predicted by CoMFA, CoMSIA, and HQSAR for the 11 newly synthesized compounds correlate well with their experimental values, with r(2) = 0.943, 0.891, and 0.809 for the three QSAR models, respectively. This indicates that the 3D-QSAR models showed good predictive ability and could describe the steric, electrostatic, and hydrophobic requirements for recognition forces of the receptor site. On the basis of these results, we designed and synthesized another eight new analogues of methanesulfonamido phenylethylamine (6a-h) according to the clues provided by the 3D-QSAR analyses. Pharmacological assay indicated that the effective concentrations of delaying the ERP by 10 ms of these newly designed compounds correlated well with the 3D-QSAR predicted values. It is remarkable that the percent
Application of 3D-QSAR in the rational design of receptor ligands and enzyme inhibitors.
Mor, Marco; Rivara, Silvia; Lodola, Alessio; Lorenzi, Simone; Bordi, Fabrizio; Plazzi, Pier Vincenzo; Spadoni, Gilberto; Bedini, Annalida; Duranti, Andrea; Tontini, Andrea; Tarzia, Giorgio
2005-11-01
Quantitative structure-activity relationships (QSARs) are frequently employed in medicinal chemistry projects, both to rationalize structure-activity relationships (SAR) for known series of compounds and to help in the design of innovative structures endowed with desired pharmacological actions. Unlike the so-called structure-based drug design tools, they do not require knowledge of the biological target structure, but are based on the comparison of drug structural features, and are thus defined as ligand-based drug design tools. In the 3D-QSAR approach, structural descriptors are calculated from molecular models of the ligands, as interaction fields within a three-dimensional (3D) lattice of points surrounding the ligand structure. These descriptors are collected in a large X matrix, which is submitted to multivariate analysis to look for correlations with biological activity. As for other QSARs, the reliability and usefulness of the correlation models depend on the validity of the assumptions and on the quality of the data. A careful selection of compounds and pharmacological data can improve the application of 3D-QSAR analysis in drug design. Some examples of the application of CoMFA and CoMSIA approaches to the SAR study and design of receptor or enzyme ligands are described, with particular attention to the fields of melatonin receptor ligands and FAAH inhibitors.
Using beta binomials to estimate classification uncertainty for ensemble models.
Clark, Robert D; Liang, Wenkel; Lee, Adam C; Lawless, Michael S; Fraczkiewicz, Robert; Waldman, Marvin
2014-01-01
Quantitative structure-activity (QSAR) models have enormous potential for reducing drug discovery and development costs as well as the need for animal testing. Great strides have been made in estimating their overall reliability, but to fully realize that potential, researchers and regulators need to know how confident they can be in individual predictions. Submodels in an ensemble model which have been trained on different subsets of a shared training pool represent multiple samples of the model space, and the degree of agreement among them contains information on the reliability of ensemble predictions. For artificial neural network ensembles (ANNEs) using two different methods for determining ensemble classification - one using vote tallies and the other averaging individual network outputs - we have found that the distribution of predictions across positive vote tallies can be reasonably well-modeled as a beta binomial distribution, as can the distribution of errors. Together, these two distributions can be used to estimate the probability that a given predictive classification will be in error. Large data sets comprised of logP, Ames mutagenicity, and CYP2D6 inhibition data are used to illustrate and validate the method. The distributions of predictions and errors for the training pool accurately predicted the distribution of predictions and errors for large external validation sets, even when the number of positive and negative examples in the training pool were not balanced. Moreover, the likelihood of a given compound being prospectively misclassified as a function of the degree of consensus between networks in the ensemble could in most cases be estimated accurately from the fitted beta binomial distributions for the training pool. Confidence in an individual predictive classification by an ensemble model can be accurately assessed by examining the distributions of predictions and errors as a function of the degree of agreement among the constituent
Rai, Amit; Aboumanei, Mohamed H.; Verma, Suraj P.; Kumar, Sachidanand; Raj, Vinit
2017-01-01
Introduction: Ebola Virus Disease (EVD) is caused by Ebola virus and is often accompanied by fatal hemorrhagic fever upon infection in humans. This virus has caused the majority of deaths in humans. There are no proper vaccinations and medications available for EVD. This is drawing the attention of scientists to the development of a potent vaccine or a novel lead to inhibit Ebola virus. Methods & Materials: In the present study, we developed a 3D-QSAR model and a pharmacophore model from previously reported potent compounds for the Ebola virus. Results & Discussion: The pharmacophoric model AAAP.116 was generated with better survival value and selectivity. Moreover, the 3D-QSAR model also showed the best r2 value of 0.99 using PLS factor. Thereby, we found the higher F value, which demonstrated the statistical significance of both the models. Furthermore, homology modeling and molecular docking studies were performed to analyze the affinity of the potent lead. This showed the best binding energy and bond formation with the targeted protein. Conclusion: Finally, all the results of this study concluded that 3D-QSAR and pharmacophore models may be helpful in the search for potent leads for EVD treatment in future. PMID:29387271
Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga
2006-08-01
A quantitative-structure activity relationship was obtained by applying Multiple Linear Regression Analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio) thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R (2) (CV) = 0.8160; S (PRESS) = 0.5680) proved to be very accurate both in training and predictive stages.
Quasi-QSAR for mutagenic potential of multi-walled carbon-nanotubes.
Toropov, Andrey A; Toropova, Alla P
2015-04-01
Available on the Internet, the CORAL software (http://www.insilico.eu/coral) has been used to build up quasi-quantitative structure-activity relationships (quasi-QSAR) for prediction of the mutagenic potential of multi-walled carbon nanotubes (MWCNTs). In contrast with previous models built up by CORAL, which were based on representation of the molecular structure by the simplified molecular input-line entry system (SMILES), the quasi-QSARs are based on the representation of conditions (not of the molecular structure), such as concentration, presence or absence of S9 mix, and use or non-use of preincubation, encoded by so-called quasi-SMILES. The statistical characteristics of these models (quasi-QSARs) for three random splits into the visible training set and test set and invisible validation set are the following: (i) split 1: n=13, r(2)=0.8037, q(2)=0.7260, s=0.033, F=45 (training set); n=5, r(2)=0.9102, s=0.071 (test set); n=6, r(2)=0.7627, s=0.044 (validation set); (ii) split 2: n=13, r(2)=0.6446, q(2)=0.4733, s=0.045, F=20 (training set); n=5, r(2)=0.6785, s=0.054 (test set); n=6, r(2)=0.9593, s=0.032 (validation set); and (iii) split 3: n=14, r(2)=0.8087, q(2)=0.6975, s=0.026, F=51 (training set); n=5, r(2)=0.9453, s=0.074 (test set); n=5, r(2)=0.8951, s=0.052 (validation set). Copyright © 2014 Elsevier Ltd. All rights reserved.
Ghanem, Ouahid Ben; Shah, Syed Nasir; Lévêque, Jean-Marc; Mutalib, M I Abdul; El-Harbawi, Mohanad; Khan, Amir Sada; Alnarabiji, Mohamad Sahban; Al-Absi, Hamada R H; Ullah, Zahoor
2018-03-01
Over the past decades, ionic liquids (ILs) have gained considerable attention from the scientific community because of their versatility and performance in many fields. However, they currently remain mainly in laboratory-scale use. The main barrier hampering their use on a larger scale is their questionable ecological toxicity. This study investigated the effect of hydrophobic and hydrophilic cyclic cation-based ILs against four pathogenic bacteria that infect humans. For that, cations, either of aromatic character (imidazolium or pyridinium) or of non-aromatic nature (pyrrolidinium or piperidinium), were selected with different alkyl chain lengths and combined with both hydrophilic and hydrophobic anionic moieties. The results clearly demonstrated that introducing the hydrophobic anion bis((trifluoromethyl)sulfonyl)amide, [NTF2], and elongating the cation substituents dramatically affect IL toxicity behaviour. The established toxicity data [50% effective concentration (EC50)], along with similar endpoints collected from previous work against Aeromonas hydrophila, were combined to develop a quantitative structure-activity relationship (QSAR) model for toxicity prediction. The model was developed and validated in the light of the Organization for Economic Co-operation and Development (OECD) guidelines strategy, producing a good correlation coefficient R2 of 0.904 and a small mean square error (MSE) of 0.095. The reliability of the QSAR model was further determined using k-fold cross validation. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Doytchinova, Irini A.; Walshe, Valerie; Borrow, Persephone; Flower, Darren R.
2005-03-01
The affinities of 177 nonameric peptides binding to the HLA-A*0201 molecule were measured using a FACS-based MHC stabilisation assay and analysed using chemometrics. Their structures were described by global and local descriptors, and QSAR models were derived by genetic algorithm, stepwise regression and PLS. The global molecular descriptors included molecular connectivity χ indices, κ shape indices, E-state indices, molecular properties like molecular weight and log P, and three-dimensional descriptors like polarizability, surface area and volume. The local descriptors were of two types. The first used a binary string to indicate the presence of each amino acid type at each position of the peptide. The second was also position-dependent but used five z-scales to describe the main physicochemical properties of the amino acids forming the peptides. The models were developed using a representative training set of 131 peptides and validated using an independent test set of 46 peptides. It was found that the global descriptors could neither explain the variance in the training set nor predict the affinities of the test set accurately. Both types of local descriptors gave QSAR models with better explained variance and predictive ability. The results suggest that, in its interactions with the MHC molecule, the peptide acts as a complicated ensemble of multiple amino acids mutually potentiating each other.
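The first type of local descriptor described above (a binary string marking which amino acid occupies each peptide position) can be sketched as follows. This is a minimal illustration and not the authors' code: the residue ordering in `AMINO_ACIDS`, the function name `encode_peptide` and the example sequence are assumptions made here.

```python
# Minimal sketch of a position-specific binary (one-hot) peptide descriptor.
# Assumption: the 20 standard amino acids in one-letter code; the ordering
# below is arbitrary and was chosen only for this illustration.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode_peptide(peptide):
    """Return a flat list of 0/1 values, 20 bits per peptide position."""
    bits = []
    for residue in peptide.upper():
        if residue not in AMINO_ACIDS:
            raise ValueError("unknown residue: %r" % residue)
        # Exactly one bit is set within each block of 20.
        bits.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return bits


if __name__ == "__main__":
    vector = encode_peptide("LLFGYPVYV")  # illustrative nonamer
    print(len(vector), sum(vector))
```

For a nonamer this yields 9 x 20 = 180 binary variables with exactly nine bits set, which is why such descriptors are usually paired with variable selection or PLS rather than plain least squares.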
Construction of 4D-QSAR Models for Use in the Design of Novel p38-MAPK Inhibitors
NASA Astrophysics Data System (ADS)
Romeiro, Nelilma Correia; Albuquerque, Magaly Girão; de Alencastro, Ricardo Bicca; Ravi, Malini; Hopfinger, Anton J.
2005-06-01
The p38-mitogen-activated protein kinase (p38-MAPK) plays a key role in lipopolysaccharide-induced tumor necrosis factor-α (TNF-α) and interleukin-1 (IL-1) release during the inflammatory process, emerging as an attractive target for new anti-inflammatory agents. Four-dimensional quantitative structure-activity relationship (4D-QSAR) analysis [Hopfinger et al., J. Am. Chem. Soc., 119 (1997) 10509] was applied to a series of 33 (a training set of 28 and a test set of 5) pyridinyl-imidazole and pyrimidinyl-imidazole inhibitors of p38-MAPK, with IC50 ranging from 0.11 to 2100 nM [Liverton et al., J. Med. Chem., 42 (1999) 2180]. Five thousand conformations of each analogue were sampled from a molecular dynamics simulation (MDS) during 50 ps at a constant temperature of 303 K. Each conformation was placed in a 2 Å grid cell lattice for each of three trial alignments. 4D-QSAR models were constructed by genetic algorithm (GA) optimization and partial least squares (PLS) fitting, and evaluated by leave-one-out cross-validation technique. In the best models, with three to six terms, the adjusted cross-validated squared correlation coefficients, Q 2 adj, ranged from 0.67 to 0.85. Model D ( Q 2 adj = 0.84) was identified as the most robust model from alignment 1, and it is representative of the other best models. This model encompasses new molecular regions as containing pharmacophore sites, such as the amino-benzyl moiety of pyrimidine analogs and the N1-substituent in the imidazole ring. These regions of the ligands should be further explored to identify better anti-inflammatory inhibitors of p38-MAPK.
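The leave-one-out evaluation mentioned above can be illustrated with a small sketch. It is deliberately simplified: it uses a single-descriptor least-squares line instead of GA-optimized PLS, and it computes the plain cross-validated q2 (1 - PRESS / SS_total), not the adjusted Q2adj reported in the abstract. The function names and the data are hypothetical.

```python
# Minimal sketch of leave-one-out (LOO) cross-validation for a regression model.
# Assumption: one descriptor and an ordinary least-squares line stand in for
# the multi-term GA/PLS models of the abstract.

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_q2(xs, ys):
    """Cross-validated q2 = 1 - PRESS / SS_total using leave-one-out."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


if __name__ == "__main__":
    descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical descriptor values
    activity = [2.1, 3.9, 6.2, 7.8, 10.1]    # hypothetical activities
    print(round(loo_q2(descriptor, activity), 3))
```

Because every compound is predicted by a model that never saw it, q2 is normally lower than the fitted r2; a large gap between the two is a common warning sign of overfitting.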
Dong, Lili; Feng, Ruirui; Bi, Jiawei; Shen, Shengqiang; Lu, Huizhe; Zhang, Jianjun
2018-03-06
Human sodium-dependent glucose co-transporter 2 (hSGLT2) is a crucial therapeutic target in the treatment of type 2 diabetes. In this study, both comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were applied to generate three-dimensional quantitative structure-activity relationship (3D-QSAR) models. In the most accurate CoMFA-based and CoMSIA-based QSAR models, the cross-validated coefficients (r2cv) were 0.646 and 0.577, respectively, while the non-cross-validated coefficients (r2) were 0.997 and 0.991, respectively, indicating that both models were reliable. In addition, we constructed a homology model of hSGLT2 in the absence of a crystal structure. Molecular docking was performed to explore the binding mode of inhibitors to the active site of hSGLT2. Molecular dynamics (MD) simulations and binding free energy calculations using MM-PBSA and MM-GBSA were carried out to further elucidate the interaction mechanism. With regard to binding affinity, we found that hydrogen-bond interactions of Asn51 and Glu75, located in the active site of hSGLT2, with compound 40 were critical. Hydrophobic and electrostatic interactions were shown to enhance activity, in agreement with the results obtained from docking and 3D-QSAR analysis. Our study results shed light on the interaction mode between inhibitors and hSGLT2 and may aid in the development of C-aryl glucoside SGLT2 inhibitors.
A 3D QSAR CoMFA study of non-peptide angiotensin II receptor antagonists
NASA Astrophysics Data System (ADS)
Belvisi, Laura; Bravi, Gianpaolo; Catalano, Giovanna; Mabilia, Massimo; Salimbeni, Aldo; Scolastico, Carlo
1996-12-01
A series of non-peptide angiotensin II receptor antagonists was investigated with the aim of developing a 3D QSAR model using comparative molecular field analysis descriptors and approaches. The main goals of the study were dictated by an interest in methodologies and an understanding of the binding requirements to the AT1 receptor. Consistency with the previously derived activity models was always checked to contemporarily test the validity of the various hypotheses. The specific conformations chosen for the study, the procedures invoked to superimpose all structures, the conditions employed to generate steric and electrostatic field values and the various PCA/PLS runs are discussed in detail. The effect of experimental design techniques to select objects (molecules) and variables (descriptors) with respect to the predictive power of the QSAR models derived was especially analysed.
Bhattacharjee, Apurba K; Kyle, Dennis E; Vennerstrom, Jonathan L; Milhous, Wilbur K
2002-01-01
Using CATALYST, a three-dimensional QSAR pharmacophore model for chloroquine(CQ)-resistance reversal was developed from a training set of 17 compounds. These included imipramine (1), desipramine (2), and 15 of their analogues (3-17), some of which fully reversed CQ-resistance, while others were without effect. The generated pharmacophore model indicates that two aromatic hydrophobic interaction sites on the tricyclic ring and a hydrogen bond acceptor (lipid) site at the side chain, preferably on a nitrogen atom, are necessary for potent activity. Stereoelectronic properties calculated by using AM1 semiempirical calculations were consistent with the model, particularly the electrostatic potential profiles characterized by a localized negative potential region by the side chain nitrogen atom and a large region covering the aromatic ring. The calculated data further revealed that aminoalkyl substitution at the N5-position of the heterocycle and a secondary or tertiary aliphatic aminoalkyl nitrogen atom with a two or three carbon bridge to the heteroaromatic nitrogen (N5) are required for potent "resistance reversal activity". Lowest energy conformers for 1-17 were determined and optimized to afford stereoelectronic properties such as molecular orbital energies, electrostatic potentials, atomic charges, proton affinities, octanol-water partition coefficients (log P), and structural parameters. For 1-17, fairly good correlation exists between resistance reversal activity and intrinsic basicity of the nitrogen atom at the tricyclic ring system, frontier orbital energies, and lipophilicity. Significantly, nine out of 11 of a group of structurally diverse CQ-resistance reversal agents mapped very well on the 3D QSAR pharmacophore model.
Cronin, Mark T D; Jaworska, Joanna S; Walker, John D; Comber, Michael H I; Watts, Christopher D; Worth, Andrew P
2003-01-01
This article is a review of the use of quantitative (and qualitative) structure-activity relationships (QSARs and SARs) by regulatory agencies and authorities to predict acute toxicity, mutagenicity, carcinogenicity, and other health effects. A number of SAR and QSAR applications, by regulatory agencies and authorities, are reviewed. These include the use of simple QSAR analyses, as well as the use of multivariate QSARs, and a number of different expert system approaches. PMID:12896862
QSAR studies of benzofuran/benzothiophene biphenyl derivatives as inhibitors of PTPase-1B
Kaushik, D.; Kumar, R.; Saxena, A. K.
2010-01-01
Objectives: Insulin resistance is associated with a defect in protein tyrosine phosphorylation in the insulin signal transduction cascade. The PTPase enzyme dephosphorylates the active form of the insulin receptor and thus attenuates its tyrosine kinase activity. There is therefore a need for a potent PTPase inhibitor, which motivated this QSAR study. Materials and Methods: A quantitative structure-activity relationship (QSAR) has been established on a series of 106 compounds considering 27 variables, for novel biphenyl analogs, using the SYSTAT (Version 7.0) software, for their protein tyrosine phosphatase (PTPase-1B) inhibitor activity, in order to understand the essential structural requirement for binding with the receptor. Results: Among several regression models, one per series was selected on the basis of a high correlation coefficient (r, 0.86), least standard deviation (s, 0.234), and a high value of significance for the maximum number of subjects (n, 101). Conclusions: The influence of the different physicochemical parameters of the substituents in various positions has been discussed by generating the best QSAR model using multiple regression analysis, and the information thus obtained from the present study can be used to design and predict more potent molecules as PTPase-1B inhibitors, prior to their synthesis. PMID:21814427
Consensus QSAR model for identifying novel H5N1 inhibitors.
Sharma, Nitin; Yap, Chun Wei
2012-08-01
Due to the importance of neuraminidase in the pathogenesis of influenza virus infection, it has been regarded as the most important drug target for the treatment of influenza. Resistance to currently available drugs and new findings related to the structure of the protein require novel neuraminidase 1 (N1) inhibitors. In this study, a consensus QSAR model with a defined applicability domain (AD) was developed using published N1 inhibitors. The consensus model was validated using an external validation set. The model achieved high sensitivity, specificity, and overall accuracy along with a low false positive rate (FPR) and false discovery rate (FDR). The performance of the model on the external validation set and the training set was comparable; thus it was unlikely to be overfitted. The low FPR and low FDR will increase its accuracy in screening large chemical libraries. Screening of the ZINC library resulted in 64,772 compounds as probable N1 inhibitors, while 173,674 compounds were defined to be outside the AD of the consensus model. The advantage of the current model is that it was developed using a large and diverse dataset and has a defined AD, which prevents its use on compounds that it is not capable of predicting. The consensus model developed in this study is made available via the free software PaDEL-DDPredictor.
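The classification statistics named above all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions; the function name `screening_metrics` and the counts in the usage example are hypothetical and are not figures from the study.

```python
# Minimal sketch: standard binary-classification metrics from confusion counts.
# tp/fp/tn/fn = true positives, false positives, true negatives, false negatives.

def screening_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, accuracy, FPR and FDR as a dict."""
    return {
        "sensitivity": tp / (tp + fn),               # actives correctly found
        "specificity": tn / (tn + fp),               # inactives correctly rejected
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "fpr": fp / (fp + tn),                       # inactives wrongly flagged
        "fdr": fp / (fp + tp),                       # flagged compounds that are wrong
    }


if __name__ == "__main__":
    # Hypothetical validation-set counts, for illustration only.
    for name, value in screening_metrics(tp=90, fp=5, tn=95, fn=10).items():
        print(name, round(value, 3))
```

FPR and FDR answer different questions: FPR is relative to all true inactives, whereas FDR is relative to everything the model flags as active, which is the quantity that matters most when a large library is screened and only the predicted hits are followed up.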
The discovery of indicator variables for QSAR using inductive logic programming
NASA Astrophysics Data System (ADS)
King, Ross D.; Srinivasan, Ashwin
1997-11-01
A central problem in forming accurate regression equations in QSAR studies is the selection of appropriate descriptors for the compounds under study. We describe a novel procedure for using inductive logic programming (ILP) to discover new indicator variables (attributes) for QSAR problems, and show that these improve the accuracy of the derived regression equations. ILP techniques have previously been shown to work well on drug design problems where there is a large structural component or where clear comprehensible rules are required. However, ILP techniques have had the disadvantage of only being able to make qualitative predictions (e.g. active, inactive) and not to predict real numbers (regression). We unify ILP and linear regression techniques to give a QSAR method that has the strength of ILP at describing steric structure, with the familiarity and power of linear regression. We evaluated the utility of this new QSAR technique by examining the prediction of biological activity with and without the addition of new structural indicator variables formed by ILP. In three out of five datasets examined the addition of ILP variables produced statistically better results (P < 0.01) over the original description. The new ILP variables did not increase the overall complexity of the derived QSAR equations and added insight into possible mechanisms of action. We conclude that ILP can aid in the process of drug design.
Tuning hERG out: Antitarget QSAR Models for Drug Development
Braga, Rodolpho C.; Alves, Vinícius M.; Silva, Meryck F. B.; Muratov, Eugene; Fourches, Denis; Tropsha, Alexander; Andrade, Carolina H.
2015-01-01
Several non-cardiovascular drugs have been withdrawn from the market due to their inhibition of hERG K+ channels that can potentially lead to severe heart arrhythmia and death. As hERG safety testing is a mandatory FDA-required procedure, there is a considerable interest for developing predictive computational tools to identify and filter out potential hERG blockers early in the drug discovery process. In this study, we aimed to generate predictive and well-characterized quantitative structure–activity relationship (QSAR) models for hERG blockage using the largest publicly available dataset of 11,958 compounds from the ChEMBL database. The models have been developed and validated according to OECD guidelines using four types of descriptors and four different machine-learning techniques. The classification accuracies discriminating blockers from non-blockers were as high as 0.83–0.93 on external set. Model interpretation revealed several SAR rules, which can guide structural optimization of some hERG blockers into non-blockers. We have also applied the generated models for screening the World Drug Index (WDI) database and identify putative hERG blockers and non-blockers among currently marketed drugs. The developed models can reliably identify blockers and non-blockers, which could be useful for the scientific community. A freely accessible web server has been developed allowing users to identify putative hERG blockers and non-blockers in chemical libraries of their interest (http://labmol.farmacia.ufg.br/predherg). PMID:24805060
Khashan, Raed; Zheng, Weifan; Tropsha, Alexander
2014-03-01
We present a novel approach to generating fragment-based molecular descriptors. The molecules are represented by labeled undirected chemical graphs. Fast Frequent Subgraph Mining (FFSM) is used to find chemical fragments (subgraphs) that occur in at least a subset of all molecules in a dataset. The collection of frequent subgraphs (FSG) forms a set of dataset-specific descriptors whose values for each molecule are defined by the number of times each frequent fragment occurs in this molecule. We have employed the FSG descriptors to develop variable selection k Nearest Neighbor (kNN) QSAR models of several datasets with binary target property including Maximum Recommended Therapeutic Dose (MRTD), Salmonella Mutagenicity (Ames Genotoxicity), and P-Glycoprotein (PGP) data. Each dataset was divided into training, test, and validation sets to establish the statistical figures of merit reflecting the validated predictive power of the models. The classification accuracies of models for both training and test sets for all datasets exceeded 75%, and the accuracy for the external validation sets exceeded 72%. The model accuracies were comparable to or better than those reported earlier in the literature for the same datasets. Furthermore, the use of fragment-based descriptors affords mechanistic interpretation of validated QSAR models in terms of essential chemical fragments responsible for the compounds' target property. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Li, Hongzhi; Zhong, Ziyan; Li, Lin; Gao, Rui; Cui, Jingxia; Gao, Ting; Hu, Li Hong; Lu, Yinghua; Su, Zhong-Min; Li, Hui
2015-05-30
A cascaded model is proposed to establish the quantitative structure-activity relationship (QSAR) between the overall power conversion efficiency (PCE) and quantum chemical molecular descriptors of all-organic dye sensitizers. The cascaded model is a two-level network in which the outputs of the first level (JSC, VOC, and FF) are the inputs of the second level, and the ultimate end-point is the overall PCE of dye-sensitized solar cells (DSSCs). The model combines quantum chemical methods and machine learning methods, and comprises quantum chemical calculations, data division, feature selection, regression, and validation steps. To improve the efficiency of the model and reduce the redundancy and noise of the molecular descriptors, six feature selection methods (multiple linear regression, genetic algorithms, mean impact value, forward selection, backward elimination, and +n-m algorithm) are used with the support vector machine. The best established cascaded model predicts the PCE values of DSSCs with an MAE of 0.57 (%), which is about 10% of the mean PCE value (5.62%). The validation parameters according to the OECD principles are R² (0.75), Q² (0.77), and Q²cv (0.76), which demonstrate the goodness-of-fit, predictivity, and robustness of the model. Additionally, the applicability domain of the cascaded QSAR model is defined for further application. This study demonstrates that the established cascaded model is able to effectively predict the PCE for organic dye sensitizers with very low cost and relatively high accuracy, providing a useful tool for the design of dye sensitizers with high PCE. © 2015 Wiley Periodicals, Inc.
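The "about 10%" figure quoted in this abstract can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and only the two numbers 0.57 and 5.62 come from the abstract.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def relative_error_percent(mae, mean_value):
    """Express an MAE as a percentage of the mean of the modelled property."""
    return 100.0 * mae / mean_value


# Values reported in the abstract: MAE = 0.57 (%), mean PCE = 5.62 (%).
reported = relative_error_percent(0.57, 5.62)
print(f"MAE is {reported:.1f}% of the mean PCE")
```

The ratio comes out slightly above 10%, consistent with the abstract's rounded statement.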
NASA Astrophysics Data System (ADS)
Ragno, Rino; Ballante, Flavio; Pirolli, Adele; Wickersham, Richard B.; Patsilinakos, Alexandros; Hesse, Stéphanie; Perspicace, Enrico; Kirsch, Gilbert
2015-08-01
Vascular endothelial growth factor receptor-2 (VEGFR-2) is a key element in angiogenesis, the process by which new blood vessels are formed, and is thus an important pharmaceutical target. Here, 3-D quantitative structure-activity relationship (3-D QSAR) methods were used to build a quantitative screening and pharmacophore model of the VEGFR-2 receptor for the design of inhibitors with improved activities. Most of the available experimental data were used as a training set to derive eight optimized and fully cross-validated mono-probe models and one multi-probe quantitative model. Notable is the use of 262 molecules, aligned following both structure-based and ligand-based protocols, as an external test set, confirming the 3-D QSAR models' predictive capability and their usefulness in designing new VEGFR-2 inhibitors. From a survey of the literature, this is the first generation of a wide-ranging computational medicinal chemistry application on VEGFR-2 inhibitors.
Nargotra, Amit; Sharma, Sujata; Koul, Jawahir Lal; Sangwan, Pyare Lal; Khan, Inshad Ali; Kumar, Ashwani; Taneja, Subhash Chander; Koul, Surrinder
2009-10-01
Quantitative structure activity relationship (QSAR) analysis of piperine analogs as inhibitors of the efflux pump NorA from Staphylococcus aureus has been performed in order to obtain a highly accurate model enabling prediction of S. aureus NorA inhibition by new chemical entities from natural as well as synthetic sources. An algorithm based on the genetic function approximation method of variable selection in Cerius2 was used to generate the model. Among the several types of descriptors considered in generating the QSAR model (topological, spatial, thermodynamic, information content and E-state indices), three descriptors, namely the partial negative surface area of the compounds, the area of the molecular shadow in the XZ plane and the heat of formation of the molecules, resulted in a statistically significant model with r² = 0.962 and cross-validation parameter q² = 0.917. The QSAR models were validated by cross-validation, leave-25%-out and external test set prediction. The theoretical approach indicates that an increase in the exposed partial negative surface area increases the inhibitory activity of the compound against NorA, whereas the area of the molecular shadow in the XZ plane is inversely proportional to the inhibitory activity. The model also explains the relationship of the heat of formation of the compound with the inhibitory activity. The model is not only able to predict the activity of new compounds but also explains the important regions in the molecules in a quantitative manner.
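Since this abstract reports both a fitted r² and a cross-validated q², a minimal sketch of how the two differ may help. It assumes a single hypothetical descriptor and ordinary least squares, not the three-descriptor genetic function approximation model of the study, and it uses one common convention for q² (PRESS divided by the total sum of squares about the full-set mean). The data are invented.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys):
    """Goodness of fit on the data the model was trained on."""
    slope, intercept = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out q2: each compound is predicted by a model fitted without it."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Invented descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]
print(r_squared(descriptor, activity), q_squared_loo(descriptor, activity))
```

For a least-squares model, q² computed this way cannot exceed r², which is why the two are usually reported together, as in r² = 0.962 and q² = 0.917 above.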
Vijayaraj, Ramadoss; Devi, Mekapothula Lakshmi Vasavi; Subramanian, Venkatesan; Chattaraj, Pratim Kumar
2012-06-01
Three-dimensional quantitative structure activity relationship (3D-QSAR) study has been carried out on the Escherichia coli DHFR inhibitors 2,4-diamino-5-(substituted-benzyl)pyrimidine derivatives to understand the structural features responsible for the improved potency. To construct highly predictive 3D-QSAR models, comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) methods were used. The predicted models show statistically significant cross-validated and non-cross-validated correlation coefficient of r2 CV and r2 nCV, respectively. The final 3D-QSAR models were validated using structurally diverse test set compounds. Analysis of the contour maps generated from CoMFA and CoMSIA methods reveals that the substitution of electronegative groups at the first and second position along with electropositive group at the third position of R2 substitution significantly increases the potency of the derivatives. The results obtained from the CoMFA and CoMSIA study delineate the substituents on the trimethoprim analogues responsible for the enhanced potency and also provide valuable directions for the design of new trimethoprim analogues with improved affinity. © 2012 John Wiley & Sons A/S.
Prediction of biodegradability of aromatics in water using QSAR modeling.
Cvetnic, Matija; Juretic Perisic, Daria; Kovacic, Marin; Kusic, Hrvoje; Dermadi, Jasna; Horvat, Sanja; Bolanca, Tomislav; Marin, Vedrana; Karamanis, Panaghiotis; Loncaric Bozic, Ana
2017-05-01
The study was aimed at developing models for predicting the biodegradability of aromatic water pollutants. For that purpose, 36 single-benzene ring compounds, with different type, number and position of substituents, were used. The biodegradability was estimated according to the ratio of the biochemical (BOD5) and chemical (COD) oxygen demand values determined for parent compounds ((BOD5/COD)0), as well as for their reaction mixtures at the half-life achieved by the UV-C/H2O2 process ((BOD5/COD)t1/2). The models correlating biodegradability and molecular structure characteristics of studied pollutants were derived using quantitative structure-activity relationship (QSAR) principles and tools. Upon derivation of the models and calibration on the training and subsequent testing on the test set, 3- and 5-variable models were selected as the most predictive for (BOD5/COD)0 and (BOD5/COD)t1/2, respectively, according to the values of statistical parameters R² and Q². Hence, the 3-variable model predicting (BOD5/COD)0 possessed R² = 0.863 and Q² = 0.799 for the training set, and R² = 0.710 for the test set, while the 5-variable model predicting (BOD5/COD)t1/2 possessed R² = 0.886 and Q² = 0.788 for the training set, and R² = 0.564 for the test set. The selected models are interpretable and transparent, reflecting key structural features that influence targeted biodegradability and can be correlated with the degradation mechanisms of studied compounds by UV-C/H2O2. Copyright © 2017 Elsevier Inc. All rights reserved.
Giesen, Daniel; van Gestel, Cornelis A M
2013-03-01
Quantitative structure-activity relationships (QSARs) are an established tool in environmental risk assessment and a valuable alternative to the exhaustive use of test animals under REACH. In this study a QSAR was developed for the toxicity of a series of six chloroanilines to the soil-dwelling collembolan Folsomia candida in standardized natural LUFA2.2 soil. Toxicity endpoints incorporated in the QSAR were the concentrations causing 10% (EC10) and 50% (EC50) reduction in reproduction of F. candida. Toxicity was based on concentrations in interstitial water estimated from nominal concentrations in the soil and published soil-water partition coefficients. Estimated effect concentrations were negatively correlated with the lipophilicity of the compounds. Interstitial water concentrations for both the EC10 and EC50 for four compounds were determined by using solid-phase microextraction (SPME). Measured and estimated concentrations were comparable only for tetra- and pentachloroaniline. With decreasing chlorination the disparity between modelled and actual concentrations increased. Optimisation of the QSAR therefore could not be accomplished, showing the necessity to move from total soil to (bio)available concentration measurements. Copyright © 2012 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Manoharan, Prabu; Vijayan, R. S. K.; Ghoshal, Nanda
2010-10-01
The ability to identify fragments that interact with a biological target is a key step in fragment-based drug design (FBDD). To date, FBDD has been driven increasingly by biophysical methods. To expand the boundaries of the QSAR paradigm and to rationalize FBDD using an in silico approach, we propose a fragment-based QSAR methodology, referred to herein as FB-QSAR. The FB-QSAR methodology was validated on a dataset consisting of 52 hydroxyethylamine (HEA) inhibitors, disclosed by GlaxoSmithKline Pharmaceuticals as potential anti-Alzheimer agents. To address the issue of target selectivity, a major confounding factor in the development of selective BACE1 inhibitors, FB-QSSR models were developed using the reported off-target activity values. A heat map, constructed from the activity and selectivity profiles of the individual R-group fragments, was in turn used to identify superior R-group fragments. Further, simultaneous optimization of multiple properties, an issue encountered in real-world drug discovery and often overlooked in QSAR approaches, was addressed using a multi-objective (MO-QSPR) method that balances properties based on the defined objectives. MO-QSPR was implemented using the Derringer and Suich desirability algorithm to identify the optimal level of independent variables (X) that could confer a trade-off between selectivity and activity. The results obtained from FB-QSAR were further substantiated using molecular interaction field (MIF) studies. To exemplify the potential of FB-QSAR and MO-QSPR in a pragmatic fashion, the insights gleaned from the MO-QSPR study were reverse engineered using inverse QSAR in a combinatorial fashion to enumerate some prospective novel, potent and selective BACE1 inhibitors.
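The Derringer and Suich desirability approach mentioned in this abstract can be sketched as follows. This is a generic illustration of the one-sided "larger is better" desirability function and the geometric-mean overall desirability, not the authors' implementation. The bounds, weights and candidate values are invented.

```python
import math


def desirability_larger_is_better(y, low, target, weight=1.0):
    """One-sided desirability: 0 at or below `low`, 1 at or above `target`."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** weight


def overall_desirability(individual):
    """Geometric mean of the individual desirabilities (0 if any one is 0)."""
    if any(d == 0.0 for d in individual):
        return 0.0
    return math.exp(sum(math.log(d) for d in individual) / len(individual))


# Hypothetical candidate: predicted activity and selectivity on invented scales.
d_activity = desirability_larger_is_better(5.0, low=4.0, target=8.0)
d_selectivity = desirability_larger_is_better(2.5, low=0.0, target=2.0)
print(d_activity, d_selectivity, overall_desirability([d_activity, d_selectivity]))
```

Because the overall score is a geometric mean, a candidate that fails on any single property scores zero overall. That is what makes this approach a trade-off between selectivity and activity rather than a simple average.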
Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Igglessi-Markopoulou, Olga; Kollias, George
2010-05-01
A novel QSAR workflow is constructed that combines MLR with LS-SVM classification techniques for the identification of quinazolinone analogs as "active" or "non-active" CXCR3 antagonists. The accuracy of the LS-SVM classification technique for the training set and test set was 100% and 90%, respectively. For the "active" analogs, a validated MLR QSAR model accurately estimates their I-IP10 IC50 inhibition values. The accuracy of the QSAR model (R² = 0.80) is illustrated using various evaluation techniques, such as the leave-one-out procedure (R²LOO = 0.67) and validation through an external test set (R²pred = 0.78). The key conclusion of this study is that the selected molecular descriptors, Highest Occupied Molecular Orbital energy (HOMO), Principal Moment of Inertia along X and Y axes PMIX and PMIZ, Polar Surface Area (PSA), Presence of triple bond (PTrplBnd), and Kier shape descriptor (¹κ), demonstrate discriminatory and pharmacophore abilities.
Novel 1,4-naphthoquinone-based sulfonamides: Synthesis, QSAR, anticancer and antimalarial studies.
Pingaew, Ratchanok; Prachayasittikul, Veda; Worachartcheewan, Apilak; Nantasenamat, Chanin; Prachayasittikul, Supaluk; Ruchirawat, Somsak; Prachayasittikul, Virapong
2015-10-20
A novel series of 1,4-naphthoquinones (33-44) tethered by open and closed chain sulfonamide moieties were designed, synthesized and evaluated for their cytotoxic and antimalarial activities. All quinone-sulfonamide derivatives displayed a broad spectrum of cytotoxic activities against all of the tested cancer cell lines including HuCCA-1, HepG2, A549 and MOLT-3. Most quinones (33-36 and 38-43) exerted higher anticancer activity against HepG2 cell than that of the etoposide. The open chain analogs 36 and 42 were shown to be the most potent compounds. Notably, the restricted sulfonamide analog 38 with 6,7-dimethoxy groups exhibited the most potent antimalarial activity (IC₅₀ = 2.8 μM). Quantitative structure-activity relationships (QSAR) study was performed to reveal important chemical features governing the biological activities. Five constructed QSAR models provided acceptable predictive performance (Rcv 0.5647-0.9317 and RMSEcv 0.1231-0.2825). Four additional sets of structurally modified compounds were generated in silico (34a-34d, 36a-36k, 40a-40d and 42a-42k) in which their activities were predicted using the constructed QSAR models. A comprehensive discussion of the structure-activity relationships was made and a set of promising compounds (i.e., 33, 36, 38, 42, 36d, 36f, 42e, 42g and 42f) was suggested for further development as anticancer and antimalarial agents. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
A model-averaging method for assessing groundwater conceptual model uncertainty.
Ye, Ming; Pohlmann, Karl F; Chapman, Jenny B; Pohll, Greg M; Reeves, Donald M
2010-01-01
This study evaluates alternative groundwater models with different recharge and geologic components at the northern Yucca Flat area of the Death Valley Regional Flow System (DVRFS), USA. Recharge over the DVRFS has been estimated using five methods, and five geological interpretations are available at the northern Yucca Flat area. Combining the recharge and geological components together with additional modeling components that represent other hydrogeological conditions yields a total of 25 groundwater flow models. As all the models are plausible given available data and information, evaluating model uncertainty becomes inevitable. On the other hand, hydraulic parameters (e.g., hydraulic conductivity) are uncertain in each model, giving rise to parametric uncertainty. Propagation of the uncertainty in the models and model parameters through groundwater modeling causes predictive uncertainty in model predictions (e.g., hydraulic head and flow). Parametric uncertainty within each model is assessed using Monte Carlo simulation, and model uncertainty is evaluated using the model averaging method. Two model-averaging techniques (on the basis of information criteria and GLUE) are discussed. This study shows that contribution of model uncertainty to predictive uncertainty is significantly larger than that of parametric uncertainty. For the recharge and geological components, uncertainty in the geological interpretations has more significant effect on model predictions than uncertainty in the recharge estimates. In addition, weighted residuals vary more for the different geological models than for different recharge models. Most of the calibrated observations are not important for discriminating between the alternative models, because their weighted residuals vary only slightly from one model to another.
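The distinction this abstract draws between parametric (within-model) and conceptual (between-model) uncertainty can be sketched with information-criterion weights. The code below assumes equal prior model probabilities and the common weighting exp(-ΔIC/2). The criterion values, predicted heads and variances are invented and are not taken from the Yucca Flat study.

```python
import math


def ic_weights(ic_values):
    """Model weights from information-criterion values (smaller is better)."""
    best = min(ic_values)
    raw = [math.exp(-0.5 * (ic - best)) for ic in ic_values]
    total = sum(raw)
    return [r / total for r in raw]


def model_average(weights, means, variances):
    """Averaged prediction plus within-model and between-model variance."""
    avg = sum(w * m for w, m in zip(weights, means))
    within = sum(w * v for w, v in zip(weights, variances))
    between = sum(w * (m - avg) ** 2 for w, m in zip(weights, means))
    return avg, within, between


# Invented example: three alternative models predicting hydraulic head (m).
weights = ic_weights([210.0, 212.0, 218.0])
heads = [735.0, 741.0, 728.0]      # per-model mean predictions
head_vars = [1.5, 2.0, 1.0]        # per-model Monte Carlo variances
avg, within, between = model_average(weights, heads, head_vars)
print(weights, avg, within, between)
```

When the between-model term dominates the within-model term, as the study reports for its own models, choosing among conceptual models contributes more to predictive uncertainty than parameter uncertainty inside any single model.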
Ghafouri, Hamidreza; Ranjbar, Mohsen; Sakhteman, Amirhossein
2017-08-01
A great challenge in medicinal chemistry is to develop different methods for structural design based on the pattern of previously synthesized compounds. In this study, two different QSAR methods were established and compared for a series of piperidine acetylcholinesterase inhibitors. In one novel approach, PC-LS-SVM and PLS-LS-SVM were used for modeling 3D interaction descriptors, and in the other method the same nonlinear techniques were used to build QSAR equations based on field descriptors. Different validation methods were used to evaluate the models, and the results revealed the greater applicability and predictive ability of the model generated by field descriptors (Q²LOO-CV = 1, R²ext = 0.97). External validation criteria revealed that both methods can be used in generating reasonable QSAR models. It was concluded that, owing to the ability of interaction descriptors to predict binding mode, this approach could be implemented in future 3D-QSAR software. Copyright © 2017 Elsevier Ltd. All rights reserved.
Wang, Zhanhui; Kai, Zhenpeng; Beier, Ross C.; Shen, Jianzhong; Yang, Xinling
2012-01-01
A three-dimensional quantitative structure-activity relationship (3D-QSAR) study of sulfonamide analogs binding a monoclonal antibody (MAbSMR) produced against sulfamerazine was carried out by Distance Comparison (DISCOtech), comparative molecular field analysis (CoMFA), and comparative molecular similarity indices analysis (CoMSIA). The affinities of the MAbSMR, expressed as log10 IC50, for 17 sulfonamide analogs were determined by competitive fluorescence polarization immunoassay (FPIA). The results demonstrated that the proposed pharmacophore model containing two hydrogen-bond acceptors, two hydrogen-bond donors and two hydrophobic centers characterized the structural features of the sulfonamides necessary for MAbSMR binding. Removal of two outliers from the initial set of 17 sulfonamide analogs improved the predictability of the models. The 3D-QSAR models of 15 sulfonamides based on CoMFA and CoMSIA resulted in q²cv values of 0.600 and 0.523, and r² values of 0.995 and 0.994, respectively, which indicates that both methods have significant predictive capability. Connolly surface analysis, which mainly focused on steric force fields, was performed to complement the results from CoMFA and CoMSIA. This novel study combining FPIA with pharmacophore modeling demonstrates that multidisciplinary research is useful for investigating antigen-antibody interactions and also may provide information required for the design of new haptens. PMID:22754368
Uncertainty in tsunami sediment transport modeling
Jaffe, Bruce E.; Goto, Kazuhisa; Sugawara, Daisuke; Gelfenbaum, Guy R.; La Selle, SeanPaul M.
2016-01-01
Erosion and deposition from tsunamis record information about tsunami hydrodynamics and size that can be interpreted to improve tsunami hazard assessment. We explore sources and methods for quantifying uncertainty in tsunami sediment transport modeling. Uncertainty varies with tsunami, study site, available input data, sediment grain size, and model. Although uncertainty has the potential to be large, published case studies indicate that both forward and inverse tsunami sediment transport models perform well enough to be useful for deciphering tsunami characteristics, including size, from deposits. New techniques for quantifying uncertainty, such as Ensemble Kalman Filtering inversion, and more rigorous reporting of uncertainties will advance the science of tsunami sediment transport modeling. Uncertainty may be decreased with additional laboratory studies that increase our understanding of the semi-empirical parameters and physics of tsunami sediment transport, standardized benchmark tests to assess model performance, and development of hybrid modeling approaches to exploit the strengths of forward and inverse models.
Papamokos, George; Silins, Ilona
2016-01-01
There is an increasing need for new reliable non-animal based methods to predict and test toxicity of chemicals. Quantitative structure-activity relationship (QSAR), a computer-based method linking chemical structures with biological activities, is used in predictive toxicology. In this study, we tested the approach to combine QSAR data with literature profiles of carcinogenic modes of action automatically generated by a text-mining tool. The aim was to generate data patterns to identify associations between chemical structures and biological mechanisms related to carcinogenesis. Using these two methods, individually and combined, we evaluated 96 rat carcinogens of the hematopoietic system, liver, lung, and skin. We found that skin and lung rat carcinogens were mainly mutagenic, while the group of carcinogens affecting the hematopoietic system and the liver also included a large proportion of non-mutagens. The automatic literature analysis showed that mutagenicity was a frequently reported endpoint in the literature of these carcinogens, however, less common endpoints such as immunosuppression and hormonal receptor-mediated effects were also found in connection with some of the carcinogens, results of potential importance for certain target organs. The combined approach, using QSAR and text-mining techniques, could be useful for identifying more detailed information on biological mechanisms and the relation with chemical structures. The method can be particularly useful in increasing the understanding of structure and activity relationships for non-mutagens. PMID:27625608
The QSAR study of flavonoid-metal complexes scavenging •OH free radical
NASA Astrophysics Data System (ADS)
Wang, Bo-chu; Qian, Jun-zhen; Fan, Ying; Tan, Jun
2014-10-01
Flavonoid-metal complexes have antioxidant activities. However, the quantitative structure-activity relationships (QSAR) between flavonoid-metal complexes and their antioxidant activities have not yet been addressed. On the basis of 21 structures of flavonoid-metal complexes and their antioxidant activities for scavenging the •OH free radical, we optimised their structures using the Gaussian 03 software package and subsequently calculated and chose 18 quantum chemistry descriptors such as dipole, charge and energy. We then chose several quantum chemistry descriptors that are very important to the IC50 of flavonoid-metal complexes for scavenging the •OH free radical by stepwise linear regression. Meanwhile, we obtained 4 new variables through principal component analysis. Finally, we built QSAR models using an Artificial Neural Network (ANN), with those important quantum chemistry descriptors and the 4 new variables as the independent variables and the IC50 as the dependent variable, and we validated the two models using experimental data. These results show that the two models in this paper are reliable and predictive.
NASA Astrophysics Data System (ADS)
Valizade Hasanloei, Mohammad Amin; Sheikhpour, Razieh; Sarram, Mehdi Agha; Sheikhpour, Elnaz; Sharifi, Hamdollah
2018-02-01
Quantitative structure-activity relationship (QSAR) modeling is an effective computational technique for drug design that relates the chemical structures of compounds to their biological activities. Feature selection is an important step in QSAR-based drug design to select the most relevant descriptors. One of the most popular feature selection methods for classification problems is the Fisher score, whose aim is to minimize the within-class distance and maximize the between-class distance. In this study, the properties of the Fisher criterion were extended for QSAR models to define new distance metrics based on the continuous activity values of compounds with known activities. Then, a semi-supervised feature selection method was proposed based on the combination of Fisher and Laplacian criteria, which exploits compounds with both known and unknown activities to select the relevant descriptors. To demonstrate the efficiency of the proposed semi-supervised feature selection method in selecting the relevant descriptors, we applied the method and other feature selection methods to three QSAR data sets: serine/threonine-protein kinase PLK3 inhibitors, ROCK inhibitors and phenol compounds. The results demonstrated that the QSAR models built on the descriptors selected by the proposed semi-supervised method perform better than other models. This indicates the efficiency of the proposed method in selecting the relevant descriptors using the compounds with known and unknown activities. The results of this study showed that compounds with known and unknown activities can help to improve the performance of the combined Fisher and Laplacian based feature selection methods.
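For orientation, the classical supervised Fisher score that this abstract takes as its starting point can be sketched as below. This is the standard class-label version only. It is not the continuous-activity extension or the semi-supervised Fisher and Laplacian combination proposed in the study, and the descriptor values are invented.

```python
def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter for one descriptor."""
    n = len(values)
    overall_mean = sum(values) / n
    between = 0.0
    within = 0.0
    for cls in set(labels):
        group = [v for v, l in zip(values, labels) if l == cls]
        mean = sum(group) / len(group)
        between += len(group) * (mean - overall_mean) ** 2
        within += sum((v - mean) ** 2 for v in group)
    # A descriptor that is constant within every class separates them perfectly.
    return between / within if within > 0 else float("inf")


# Invented data: three inactive (0) and three active (1) compounds.
labels = [0, 0, 0, 1, 1, 1]
informative = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]   # separates the two classes
noisy = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0]         # barely related to the labels

scores = {"informative": fisher_score(informative, labels),
          "noisy": fisher_score(noisy, labels)}
print(sorted(scores, key=scores.get, reverse=True))
```

Ranking descriptors by this score and keeping the top few is the usual filter-style selection. The study's contribution is to let compounds without measured activity also influence that ranking.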
Alam, Sarfaraz; Khan, Feroz
2014-01-01
Due to the high mortality rate in India, the identification of novel molecules is important in the development of novel and potent anticancer drugs. Xanthones are natural constituents of plants in the families Bonnetiaceae and Clusiaceae, and comprise oxygenated heterocycles with a variety of biological activities along with an anticancer effect. To explore the anticancer compounds from xanthone derivatives, a quantitative structure activity relationship (QSAR) model was developed by the multiple linear regression method. The structure–activity relationship represented by the QSAR model yielded a high activity–descriptors relationship accuracy (84%) referred by regression coefficient (r2=0.84) and a high activity prediction accuracy (82%). Five molecular descriptors – dielectric energy, group count (hydroxyl), LogP (the logarithm of the partition coefficient between n-octanol and water), shape index basic (order 3), and the solvent-accessible surface area – were significantly correlated with anticancer activity. Using this QSAR model, a set of virtually designed xanthone derivatives was screened out. A molecular docking study was also carried out to predict the molecular interaction between proposed compounds and deoxyribonucleic acid (DNA) topoisomerase IIα. The pharmacokinetics parameters, such as absorption, distribution, metabolism, excretion, and toxicity, were also calculated, and later an appraisal of synthetic accessibility of organic compounds was carried out. The strategy used in this study may provide understanding in designing novel DNA topoisomerase IIα inhibitors, as well as for other cancer targets. PMID:24516330
Chen, Ying; Cai, Xiaoyu; Jiang, Long; Li, Yu
2016-02-01
Based on the experimental data of octanol-air partition coefficients (KOA) for 19 polychlorinated biphenyl (PCB) congeners, two types of QSAR methods, comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA), are used to establish 3D-QSAR models using the structural parameters as independent variables and using logKOA values as the dependent variable with the Sybyl software to predict the KOA values of the remaining 190 PCB congeners. The whole data set (19 compounds) was divided into a training set (15 compounds) for model generation and a test set (4 compounds) for model validation. As a result, the cross-validation correlation coefficient (q(2)) obtained by the CoMFA and CoMSIA models (shuffled 12 times) was in the range of 0.825-0.969 (>0.5), the correlation coefficient (r(2)) obtained was in the range of 0.957-1.000 (>0.9), and the SEP (standard error of prediction) of test set was within the range of 0.070-0.617, indicating that the models were robust and predictive. Randomly selected from a set of models, CoMFA analysis revealed that the corresponding percentages of the variance explained by steric and electrostatic fields were 23.9% and 76.1%, respectively, while CoMSIA analysis by steric, electrostatic and hydrophobic fields were 0.6%, 92.6%, and 6.8%, respectively. The electrostatic field was determined as a primary factor governing the logKOA. The correlation analysis of the relationship between the number of Cl atoms and the average logKOA values of PCBs indicated that logKOA values gradually increased as the number of Cl atoms increased. Simultaneously, related studies on PCB detection in the Arctic and Antarctic areas revealed that higher logKOA values indicate a stronger PCB migration ability. From CoMFA and CoMSIA contour maps, logKOA decreased when substituents possessed electropositive groups at the 2-, 3-, 3'-, 5- and 6- positions, which could reduce the PCB migration ability. These results are
Model averaging techniques for quantifying conceptual model uncertainty.
Singh, Abhishek; Mishra, Srikanta; Ruskauff, Greg
2010-01-01
In recent years a growing understanding has emerged regarding the need to expand the modeling paradigm to include conceptual model uncertainty for groundwater models. Conceptual model uncertainty is typically addressed by formulating alternative model conceptualizations and assessing their relative likelihoods using statistical model averaging approaches. Several model averaging techniques and likelihood measures have been proposed in the recent literature for this purpose with two broad categories--Monte Carlo-based techniques such as Generalized Likelihood Uncertainty Estimation or GLUE (Beven and Binley 1992) and criterion-based techniques that use metrics such as the Bayesian and Kashyap Information Criteria (e.g., the Maximum Likelihood Bayesian Model Averaging or MLBMA approach proposed by Neuman 2003) and Akaike Information Criterion-based model averaging (AICMA) (Poeter and Anderson 2005). These different techniques can often lead to significantly different relative model weights and ranks because of differences in the underlying statistical assumptions about the nature of model uncertainty. This paper provides a comparative assessment of the four model averaging techniques (GLUE, MLBMA with KIC, MLBMA with BIC, and AIC-based model averaging) mentioned above for the purpose of quantifying the impacts of model uncertainty on groundwater model predictions. Pros and cons of each model averaging technique are examined from a practitioner's perspective using two groundwater modeling case studies. Recommendations are provided regarding the use of these techniques in groundwater modeling practice.
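To illustrate the criterion-based family mentioned above, the following minimal sketch computes Akaike weights and a weighted-average prediction. It assumes the standard form of the weights, exp(-delta/2) normalized over the candidate models; the AIC values and predictions are hypothetical and do not come from the paper's case studies.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC values of alternative models into normalized weights."""
    best = min(aic_values)
    # Differences from the best (lowest-AIC) model keep the exponentials stable.
    raw = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(raw)
    return [r / total for r in raw]


def averaged_prediction(predictions, weights):
    """Model-averaged prediction as the weighted sum of model predictions."""
    return sum(p * w for p, w in zip(predictions, weights))


# Hypothetical AIC values and predictions for three conceptual models.
aic = [100.0, 102.0, 110.0]
weights = akaike_weights(aic)
predictions = [12.0, 15.0, 30.0]
mean_prediction = averaged_prediction(predictions, weights)
```

Swapping AIC for BIC or KIC in the same formula changes the weights, which is one reason the techniques compared in the abstract can rank models differently.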
Reusable launch vehicle model uncertainties impact analysis
NASA Astrophysics Data System (ADS)
Chen, Jiaye; Mu, Rongjun; Zhang, Xin; Deng, Yanpeng
2018-03-01
A reusable launch vehicle (RLV) typically has a complex aerodynamic shape and coupling with its propulsion system, and its flight environment is highly complicated and changes intensely. Its model therefore has large uncertainty, which makes the nominal system quite different from the real system. Studying the influence of these uncertainties on the stability of the control system is thus of great significance for controller design. In order to improve the performance of the RLV, this paper proposes an approach for analyzing the influence of the model uncertainties. For a typical RLV, the coupled dynamic and kinematic models are built. Then the different factors that cause uncertainties during model building are analyzed and summarized. After that, the model uncertainties are expressed according to the additive uncertainty model. The maximum singular values of the uncertainty matrix are chosen as the boundary model, and the norm of the uncertainty matrix is used to show how much the uncertainty factors influence the stability of the control system. The simulation results illustrate that the inertial factors have the largest influence on the stability of the system, and that it is necessary and important to take the model uncertainties into consideration before designing the controller of this kind of aircraft (such as an RLV).
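The additive uncertainty idea can be sketched in a few lines: the real system is written as the nominal system plus a perturbation, and the largest singular value of that perturbation serves as its bound. The sketch below is restricted to constant 2x2 matrices so that it needs no linear-algebra library; the matrices are hypothetical and are not taken from the RLV model in the paper.

```python
import math


def max_singular_value_2x2(delta):
    """Largest singular value of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = delta
    # Entries of the symmetric matrix delta^T * delta = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    largest_eig = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    return math.sqrt(largest_eig)


# Hypothetical nominal and perturbed gain matrices.
nominal = [[1.0, 0.2], [0.0, 1.5]]
perturbed = [[1.3, 0.2], [0.0, 1.1]]

# Additive uncertainty: perturbed = nominal + delta.
delta = [[perturbed[i][j] - nominal[i][j] for j in range(2)] for i in range(2)]
bound = max_singular_value_2x2(delta)
```

For frequency-dependent models the same bound would be evaluated at each frequency of interest, which is beyond this sketch.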
Reliable Prescreening of Candidate NerveAgent Prophylaxes via 3D QSAR
2005-12-31
recognize and predict prospective toxicity among covalent-binding AChE inhibitors of potential application to nerve agent prophylaxis and... SUBJECT TERMS: nerve agents, acetylcholinesterase, prophylaxis, QSAR, virtual... ABSTRACT: Organophosphorus (OP) nerve agents are among the
NASA Astrophysics Data System (ADS)
Lalit, Manisha; Gangwal, Rahul P.; Dhoke, Gaurao V.; Damre, Mangesh V.; Khandelwal, Kanchan; Sangamwar, Abhay T.
2013-10-01
A combined pharmacophore modelling, 3D-QSAR and molecular docking approach was employed to reveal structural and chemical features essential for the development of small molecules as LRH-1 agonists. The best HypoGen pharmacophore hypothesis (Hypo1) consists of one hydrogen-bond donor (HBD), two general hydrophobic (H), one hydrophobic aromatic (HYAr) and one hydrophobic aliphatic (HYA) feature. It exhibited a high correlation coefficient of 0.927, a cost difference of 85.178 bits and a low RMS value of 1.411. This pharmacophore hypothesis was cross-validated using a test set, a decoy set and the Cat-Scramble methodology. Subsequently, the validated pharmacophore hypothesis was used in the screening of small chemical databases. Further, 3D-QSAR models were developed based on the alignment obtained using substructure alignment. The best CoMFA and CoMSIA models exhibited excellent r(2)ncv values of 0.991 and 0.987, and r(2)cv values of 0.767 and 0.703, respectively. The CoMFA r(2)pred of 0.87 and the CoMSIA r(2)pred of 0.78 showed that the predicted values were in good agreement with the experimental values. Molecular docking analysis reveals that π-π interaction with His390 and hydrogen bond interaction with His390/Arg393 are essential for LRH-1 agonistic activity. The results from pharmacophore modelling, 3D-QSAR and molecular docking are complementary to each other and could serve as a powerful tool for the discovery of potent small molecules as LRH-1 agonists.
Predicting Drug-induced Hepatotoxicity Using QSAR and Toxicogenomics Approaches
Low, Yen; Uehara, Takeki; Minowa, Yohsuke; Yamada, Hiroshi; Ohno, Yasuo; Urushidani, Tetsuro; Sedykh, Alexander; Muratov, Eugene; Fourches, Denis; Zhu, Hao; Rusyn, Ivan; Tropsha, Alexander
2014-01-01
Quantitative Structure-Activity Relationship (QSAR) modeling and toxicogenomics are used independently as predictive tools in toxicology. In this study, we evaluated the power of several statistical models for predicting drug hepatotoxicity in rats using different descriptors of drug molecules, namely their chemical descriptors and toxicogenomic profiles. The records were taken from the Toxicogenomics Project rat liver microarray database containing information on 127 drugs (http://toxico.nibio.go.jp/datalist.html). The model endpoint was hepatotoxicity in the rat following 28 days of exposure, established by liver histopathology and serum chemistry. First, we developed multiple conventional QSAR classification models using a comprehensive set of chemical descriptors and several classification methods (k nearest neighbor, support vector machines, random forests, and distance weighted discrimination). With chemical descriptors alone, external predictivity (Correct Classification Rate, CCR) from 5-fold external cross-validation was 61%. Next, the same classification methods were employed to build models using only toxicogenomic data (24h after a single exposure) treated as biological descriptors. The optimized models used only 85 selected toxicogenomic descriptors and had CCR as high as 76%. Finally, hybrid models combining both chemical descriptors and transcripts were developed; their CCRs were between 68 and 77%. Although the accuracy of hybrid models did not exceed that of the models based on toxicogenomic data alone, the use of both chemical and biological descriptors enriched the interpretation of the models. In addition to finding 85 transcripts that were predictive and highly relevant to the mechanisms of drug-induced liver injury, chemical structural alerts for hepatotoxicity were also identified. These results suggest that concurrent exploration of the chemical features and acute treatment-induced changes in transcript levels will both enrich the
NASA Technical Reports Server (NTRS)
Shevade, A. V.; Ryan, M. A.; Homer, M. L.; Jewell, A. D.; Zhou, H.; Manatt, K.; Kisor, A. K.
2005-01-01
We report a Quantitative Structure-Activity Relationships (QSAR) study using Genetic Function Approximations (GFA) to describe the polymer-carbon composite sensor activities in the JPL Electronic Nose, when exposed to chemical vapors at parts-per-million concentration levels.
Evaluating Predictive Uncertainty of Hyporheic Exchange Modelling
NASA Astrophysics Data System (ADS)
Chow, R.; Bennett, J.; Dugge, J.; Wöhling, T.; Nowak, W.
2017-12-01
Hyporheic exchange is the interaction of water between rivers and groundwater, and is difficult to predict. One of the largest contributions to predictive uncertainty for hyporheic fluxes has been attributed to the representation of heterogeneous subsurface properties. This research aims to evaluate which aspect of the subsurface representation - the spatial distribution of hydrofacies or the model for local-scale (within-facies) heterogeneity - most influences the predictive uncertainty. Also, we seek to identify data types that help reduce this uncertainty best. For this investigation, we conduct a modelling study of the Steinlach River meander, in Southwest Germany. The Steinlach River meander is an experimental site established in 2010 to monitor hyporheic exchange at the meander scale. We use HydroGeoSphere, a fully integrated surface water-groundwater model, to model hyporheic exchange and to assess the predictive uncertainty of hyporheic exchange transit times (HETT). A highly parameterized complex model is built and treated as 'virtual reality', which is in turn modelled with simpler subsurface parameterization schemes (Figure). Then, we conduct Monte-Carlo simulations with these models to estimate the predictive uncertainty. Results indicate that: (1) uncertainty in HETT is relatively small for early times and increases with transit times; (2) uncertainty from local-scale heterogeneity is negligible compared to uncertainty in the hydrofacies distribution; (3) introducing more data to a poor model structure may reduce predictive variance, but does not reduce predictive bias; and (4) hydraulic head observations alone cannot constrain the uncertainty of HETT, whereas an estimate of hyporheic exchange flux proves to be more effective at reducing this uncertainty. Figure: Approach for evaluating predictive model uncertainty. A conceptual model is first developed from the field investigations. A complex model ('virtual reality') is then developed based on that conceptual model
NASA Astrophysics Data System (ADS)
Adhikari, Nilanjan; Amin, Sk. Abdul; Saha, Achintya; Jha, Tarun
2018-03-01
Matrix metalloproteinase-2 (MMP-2) is a promising pharmacological target for designing potential anticancer drugs. MMP-2 plays critical functions in apoptosis by cleaving the DNA repair enzyme poly (ADP-ribose) polymerase (PARP). Moreover, MMP-2 expression triggers the vascular endothelial growth factor (VEGF), which has a positive influence on tumor size, invasion, and angiogenesis. Therefore, there is an urgent need to develop potential MMP-2 inhibitors without toxicity and with better pharmacokinetic properties. In this article, robust validated multi-quantitative structure-activity relationship (QSAR) modeling approaches were attempted on a dataset of 222 MMP-2 inhibitors to explore the important structural and pharmacophoric requirements for higher MMP-2 inhibition. Different validated regression and classification-based QSARs, pharmacophore mapping and 3D-QSAR techniques were performed. These results were challenged and further validated against 24 in-house MMP-2 inhibitors to judge the reliability of these models. All these models were individually validated internally as well as externally and were supported and validated by each other. These results were further justified by molecular docking analysis. The modeling techniques adopted here help not only to explore the necessary structural and pharmacophoric requirements but also with the overall validation and refinement techniques for designing potential MMP-2 inhibitors.
Horobin, R W; Stockert, J C; Rashid-Doubell, F
2015-05-01
We discuss a variety of biological targets including generic biomembranes and the membranes of the endoplasmic reticulum, endosomes/lysosomes, Golgi body, mitochondria (outer and inner membranes) and the plasma membrane of usual fluidity. For each target, we discuss the access of probes to the target membrane, probe uptake into the membrane and the mechanism of selectivity of the probe uptake. A statement of the QSAR decision rule that describes the required physicochemical features of probes that enable selective staining also is provided, followed by comments on exceptions and limits. Examples of probes typically used to demonstrate each target structure are noted and decision rule tabulations are provided for probes that localize in particular targets; these tabulations show distribution of probes in the conceptual space defined by the relevant structure parameters ("parameter space"). Some general implications and limitations of the QSAR models for probe targeting are discussed including the roles of certain cell and protocol factors that play significant roles in lipid staining. A case example illustrates the predictive ability of QSAR models. Key limiting values of the head group hydrophilicity parameter associated with membrane-probe interactions are discussed in an appendix.
Dong, Dong; Ako, Roland; Hu, Ming; Wu, Baojian
2015-01-01
The UDP-glucuronosyltransferase (UGT) enzyme catalyzes the glucuronidation reaction, which is a major metabolic and detoxification pathway in humans. Understanding the mechanisms for substrate recognition by UGT assumes great importance in an attempt to predict its contribution to xenobiotic/drug disposition in vivo. Spurred on by this interest, 2D/3D-quantitative structure activity relationships (QSAR) and pharmacophore models have been established in the absence of a complete mammalian UGT crystal structure. This review discusses the recent progress in modeling human UGT substrates, including those with multiple sites of glucuronidation. A better understanding of the UGT active site contributing to substrate selectivity (and regioselectivity) from the homologous enzymes (i.e., plant and bacterial UGTs, all of which belong to family 1 of the glycosyltransferases (GT1)) is also highlighted, as these enzymes share a common catalytic mechanism and/or overlapping substrate selectivity. PMID:22385482
Ahlberg, Ernst; Amberg, Alexander; Beilke, Lisa D; Bower, David; Cross, Kevin P; Custer, Laura; Ford, Kevin A; Van Gompel, Jacky; Harvey, James; Honma, Masamitsu; Jolly, Robert; Joossens, Elisabeth; Kemper, Raymond A; Kenyon, Michelle; Kruhlak, Naomi; Kuhnke, Lara; Leavitt, Penny; Naven, Russell; Neilan, Claire; Quigley, Donald P; Shuey, Dana; Spirkl, Hans-Peter; Stavitskaya, Lidiya; Teasdale, Andrew; White, Angela; Wichard, Joerg; Zwickl, Craig; Myatt, Glenn J
2016-06-01
Statistical-based and expert rule-based models built using public domain mutagenicity knowledge and data are routinely used for computational (Q)SAR assessments of pharmaceutical impurities in line with the approach recommended in the ICH M7 guideline. Knowledge from proprietary corporate mutagenicity databases could be used to increase the predictive performance for selected chemical classes as well as expand the applicability domain of these (Q)SAR models. This paper outlines a mechanism for sharing knowledge without the release of proprietary data. Primary aromatic amine mutagenicity was selected as a case study because this chemical class is often encountered in pharmaceutical impurity analysis and mutagenicity of aromatic amines is currently difficult to predict. As part of this analysis, a series of aromatic amine substructures were defined and the number of mutagenic and non-mutagenic examples for each chemical substructure calculated across a series of public and proprietary mutagenicity databases. This information was pooled across all sources to identify structural classes that activate or deactivate aromatic amine mutagenicity. This structure activity knowledge, in combination with newly released primary aromatic amine data, was incorporated into Leadscope's expert rule-based and statistical-based (Q)SAR models where increased predictive performance was demonstrated. Copyright © 2016 Elsevier Inc. All rights reserved.
Patel, Preeti; Singh, Avineesh; Patel, Vijay K; Jain, Deepak K; Veerasamy, Ravichandran; Rajak, Harish
2016-01-01
Histone deacetylase (HDAC) inhibitors can reactivate gene expression and inhibit the growth and survival of cancer cells. The aim was to identify the important pharmacophoric features and correlate 3D chemical structure with biological activity using 3D-QSAR and pharmacophore modeling studies. The pharmacophore hypotheses were developed using the e-pharmacophore script and the Phase module. A pharmacophore hypothesis represents the 3D arrangement of molecular features necessary for activity. A series of 55 compounds with well-assigned HDAC inhibitory activity was used for 3D-QSAR model development. The best 3D-QSAR model, which is a five partial least square (PLS) factor model with good statistics and predictive ability, acquired Q2 (0.7293), R2 (0.9811), cross-validated coefficient r(2)cv = 0.9807 and R2 pred = 0.7147 with low standard deviation (0.0952). Additionally, the selected pharmacophore model DDRRR.419 was used as a 3D query for virtual screening against the ZINC database. In the virtual screening workflow, docking studies (HTVS, SP and XP) were carried out by selecting multiple receptors (PDB ID: 1T69, 1T64, 4LXZ, 4LY1, 3MAX, 2VQQ, 3C10, 1W22). Finally, six compounds were obtained based on high scoring function (dock scores from -11.2278 to -10.2222 kcal/mol) and diverse structures. The structure activity correlation was established using virtual screening, docking, energetic based pharmacophore modelling, pharmacophore, atom based 3D QSAR models and their validation. The outcomes of these studies could be further employed for the design of novel HDAC inhibitors for anticancer activity.
A combined QSAR and partial order ranking approach to risk assessment.
Carlsen, L
2006-04-01
QSAR generated data appear as an attractive alternative to experimental data as foreseen in the proposed new chemicals legislation REACH. A preliminary risk assessment for the aquatic environment can be based on few factors, i.e. the octanol-water partition coefficient (Kow), the vapour pressure (VP) and the potential biodegradability of the compound in combination with the predicted no-effect concentration (PNEC) and the actual tonnage in which the substance is produced. Application of partial order ranking, allowing simultaneous inclusion of several parameters leads to a mutual prioritisation of the investigated substances, the prioritisation possibly being further analysed through the concept of linear extensions and average ranks. The ranking uses endpoint values (log Kow and log VP) derived from strictly linear 'noise-deficient' QSAR models as input parameters. Biodegradation estimates were adopted from the BioWin module of the EPI Suite. The population growth impairment of Tetrahymena pyriformis was used as a surrogate for fish lethality.
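The core of partial order ranking is a dominance test applied to several parameters at once: one substance ranks above another only if it is at least as high on every parameter. A minimal sketch of that test follows, assuming higher values mean higher concern for every parameter; the substance names and the (log Kow, log VP) pairs are hypothetical, and linear extensions and average ranks are not covered.

```python
def dominates(a, b):
    """True if a is >= b on every parameter and strictly > on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def partial_order(substances):
    """Return (above, incomparable) for a dict of name -> parameter tuple.

    above[name] lists the substances that name dominates; incomparable lists
    the pairs where neither dominates the other (including exact ties).
    """
    names = list(substances)
    above = {name: [] for name in names}
    incomparable = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if dominates(substances[first], substances[second]):
                above[first].append(second)
            elif dominates(substances[second], substances[first]):
                above[second].append(first)
            else:
                incomparable.append((first, second))
    return above, incomparable


# Hypothetical substances described by (log Kow, log VP).
substances = {
    "A": (5.0, -2.0),
    "B": (3.0, -4.0),
    "C": (4.0, -5.0),
    "D": (2.0, -6.0),
}
above, incomparable = partial_order(substances)
```

Here A ranks above everything and D below everything, while B and C conflict on the two parameters and stay unranked relative to each other, which is the situation linear extensions are meant to resolve.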
Fassihi, Afshin; Sabet, Razieh
2008-01-01
Quantitative relationships between molecular structure and p56lck protein tyrosine kinase inhibitory activity of 50 flavonoid derivatives are discovered by MLR and GA-PLS methods. Different QSAR models revealed that substituent electronic descriptors (SED) parameters have significant impact on protein tyrosine kinase inhibitory activity of the compounds. Between the two statistical methods employed, GA-PLS gave superior results. The resultant GA-PLS model had a high statistical quality (R2 = 0.74 and Q2 = 0.61) for predicting the activity of the inhibitors. The models proposed in the present work are more useful in describing QSAR of flavonoid derivatives as p56lck protein tyrosine kinase inhibitors than those provided previously. PMID:19325836
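The two statistics quoted above differ in how the residuals are obtained: R2 uses the fit to all compounds, while Q2 refits the model with each compound left out in turn. The sketch below shows this for a single-descriptor linear model as a stand-in for the MLR and GA-PLS models of the paper; the descriptor and activity values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Conventional R2 of the model fitted to all compounds."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out Q2: each compound is predicted by a model fitted without it."""
    mean_y = sum(ys) / len(ys)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - press / ss_tot


# Hypothetical descriptor values and activities.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]
r2 = r_squared(descriptor, activity)
q2 = q_squared_loo(descriptor, activity)
```

For least-squares fits Q2 cannot exceed R2, so a large gap between the two is usually read as a sign of overfitting.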
Fjodorova, Natalja; Novič, Marjana
2012-01-01
The knowledge-based Toxtree expert system (SAR approach) was integrated with the statistically based counter propagation artificial neural network (CP ANN) model (QSAR approach) to contribute to a better mechanistic understanding of a carcinogenicity model for non-congeneric chemicals using Dragon descriptors and carcinogenic potency for rats as a response. The transparency of the CP ANN algorithm was demonstrated using intrinsic mapping technique specifically Kohonen maps. Chemical structures were represented by Dragon descriptors that express the structural and electronic features of molecules such as their shape and electronic surrounding related to reactivity of molecules. It was illustrated how the descriptors are correlated with particular structural alerts (SAs) for carcinogenicity with recognized mechanistic link to carcinogenic activity. Moreover, the Kohonen mapping technique enables one to examine the separation of carcinogens and non-carcinogens (for rats) within a family of chemicals with a particular SA for carcinogenicity. The mechanistic interpretation of models is important for the evaluation of safety of chemicals. PMID:24688639
Combined QSAR and molecule docking studies on predicting P-glycoprotein inhibitors
NASA Astrophysics Data System (ADS)
Tan, Wen; Mei, Hu; Chao, Li; Liu, Tengfei; Pan, Xianchao; Shu, Mao; Yang, Li
2013-12-01
P-glycoprotein (P-gp) is an ATP-binding cassette multidrug transporter. The overexpression of P-gp leads to the development of multidrug resistance (MDR), which is a major obstacle to effective treatment of cancer. Thus, designing effective P-gp inhibitors plays an extremely important role in overcoming MDR. In this paper, both ligand-based quantitative structure-activity relationship (QSAR) and receptor-based molecular docking are used to predict P-gp inhibitors. The results show that each method achieves good prediction performance. According to the results of tenfold cross-validation, an optimal linear SVM model with only three descriptors is established on 857 training samples, for which the overall accuracy (Acc), sensitivity, specificity, and Matthews correlation coefficient are 0.840, 0.873, 0.813, and 0.683, respectively. The SVM model is further validated by 418 test samples with an overall Acc of 0.868. Based on an established homology model of human P-gp, Surflex-dock is also used to give binding free energy-based evaluations, with an overall accuracy of 0.823 for the test set. Furthermore, a consensus evaluation is also performed by using these two methods. Both QSAR and molecular docking studies indicate that molecular volume, hydrophobicity and aromaticity are three dominant factors influencing the inhibitory activities.
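The four statistics reported for the SVM model all derive from a binary confusion matrix. A minimal sketch of the standard definitions is given below; the counts are hypothetical and are not the paper's, since the abstract reports only the resulting ratios.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # fraction of true inhibitors recovered
    specificity = tn / (tn + fp)  # fraction of non-inhibitors rejected
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for an inhibitor / non-inhibitor classifier.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

Unlike accuracy, the Matthews correlation coefficient stays informative when the two classes are unevenly represented, which is likely why it is reported alongside the other three.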
Xiao, Ruiyang; Ye, Tiantian; Wei, Zongsu; Luo, Shuang; Yang, Zhihui; Spinney, Richard
2015-11-17
The sulfate radical anion (SO4•–) based oxidation of trace organic contaminants (TrOCs) has recently received great attention due to its high reactivity and low selectivity. In this study, a meta-analysis was conducted to better understand the role of functional groups on the reactivity between SO4•– and TrOCs. The results indicate that compounds in which electron transfer and addition channels dominate tend to exhibit faster second-order rate constants (kSO4•–) than those in which H–atom abstraction dominates, corroborating the SO4•– reactivity and mechanisms observed in the individual studies. Then, a quantitative structure activity relationship (QSAR) model was developed using a sequential approach with constitutional, geometrical, electrostatic, and quantum chemical descriptors. Two descriptors, the ELUMO and EHOMO energy gap (ELUMO–EHOMO) and the ratio of oxygen atoms to carbon atoms (#O:C), were found to mechanistically and statistically affect kSO4•– to a great extent, giving the standardized QSAR model: ln kSO4•– = 26.8 – 3.97 × #O:C – 0.746 × (ELUMO–EHOMO). In addition, the correlation analysis indicates that there is no dominant reaction channel for SO4•– reactions with various structurally diverse compounds. Our QSAR model provides a robust predictive tool for estimating emerging micropollutants removal using SO4•– during wastewater treatment processes.
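The two-descriptor model quoted in the abstract is simple enough to evaluate directly. The sketch below only encodes the published coefficients; the function names and the input values are hypothetical, and the abstract does not state the units of the energy gap or of the rate constant, so none are assumed here.

```python
import math


def ln_k_so4(o_to_c_ratio, lumo_homo_gap):
    """ln k for the sulfate radical reaction, from the equation in the abstract.

    o_to_c_ratio  : number of oxygen atoms divided by number of carbon atoms
    lumo_homo_gap : ELUMO - EHOMO, in the units used by the original model
    """
    return 26.8 - 3.97 * o_to_c_ratio - 0.746 * lumo_homo_gap


def k_so4(o_to_c_ratio, lumo_homo_gap):
    """Second-order rate constant obtained by exponentiating ln k."""
    return math.exp(ln_k_so4(o_to_c_ratio, lumo_homo_gap))


# Hypothetical compound: one oxygen per four carbons, gap of 8.0.
example_ln_k = ln_k_so4(0.25, 8.0)
example_k = k_so4(0.25, 8.0)
```

Both coefficients are negative, so the model predicts slower reaction for more oxygenated compounds and for compounds with a wider ELUMO–EHOMO gap.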
Vyas, V K; Gupta, N; Ghate, M; Patel, S
2014-01-01
In this study we designed novel substituted benzimidazole derivatives and predicted their absorption, distribution, metabolism, excretion and toxicity (ADMET) properties, based on a predictive 3D QSAR study on 132 substituted benzimidazoles as AngII-AT1 receptor antagonists. The two best predicted compounds were synthesized and evaluated for AngII-AT1 receptor antagonism. Three different alignment tools for comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were used. The best 3D QSAR models were obtained using the rigid body (Distill) alignment method. CoMFA and CoMSIA models were found to be statistically significant with leave-one-out correlation coefficients (q(2)) of 0.630 and 0.623, respectively, cross-validated coefficients (r(2)cv) of 0.651 and 0.630, respectively, and conventional coefficients of determination (r(2)) of 0.848 and 0.843, respectively. 3D QSAR models were validated using a test set of 24 compounds, giving satisfactory predicted results (r(2)pred) of 0.727 and 0.689 for the CoMFA and CoMSIA models, respectively. We have identified some key features in substituted benzimidazole derivatives, such as lipophilicity and H-bonding at the 2- and 5-positions of the benzimidazole nucleus, respectively, for AT1 receptor antagonistic activity. We designed 20 novel substituted benzimidazole derivatives and predicted their activity. In silico ADMET properties were also predicted for these designed molecules. Finally, the compounds with best predicted activity were synthesized and evaluated for in vitro angiotensin II-AT1 receptor antagonism.
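The external-validation statistic r(2)pred quoted above is commonly computed in CoMFA/CoMSIA work as 1 - PRESS/SD, where PRESS sums the squared prediction errors over the test set and SD sums the squared deviations of the test-set activities from the training-set mean. The abstract does not spell out its formula, so the sketch below assumes that convention; all activity values are hypothetical.

```python
def r2_pred(test_observed, test_predicted, train_observed):
    """Predictive r2 for an external test set (1 - PRESS / SD)."""
    train_mean = sum(train_observed) / len(train_observed)
    # PRESS: squared errors of the model's predictions on the test set.
    press = sum((obs - pred) ** 2 for obs, pred in zip(test_observed, test_predicted))
    # SD: squared deviations of test activities from the training-set mean.
    sd = sum((obs - train_mean) ** 2 for obs in test_observed)
    return 1 - press / sd


# Hypothetical activities (e.g. pIC50 values).
train_activity = [5.0, 6.0, 7.0, 8.0]
test_activity = [5.5, 7.5, 6.0]
test_prediction = [5.8, 7.1, 6.4]
value = r2_pred(test_activity, test_prediction, train_activity)
```

A value near 1 means the model predicts the held-out compounds much better than simply guessing the training-set mean; a negative value means it does worse.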
Aeroservoelastic Uncertainty Model Identification from Flight Data
NASA Technical Reports Server (NTRS)
Brenner, Martin J.
2001-01-01
Uncertainty modeling is a critical element in the estimation of robust stability margins for stability boundary prediction and robust flight control system development. There has been a serious deficiency to date in aeroservoelastic data analysis with attention to uncertainty modeling. Uncertainty can be estimated from flight data using both parametric and nonparametric identification techniques. The model validation problem addressed in this paper is to identify aeroservoelastic models with associated uncertainty structures from a limited amount of controlled excitation inputs over an extensive flight envelope. The challenge to this problem is to update analytical models from flight data estimates while also deriving non-conservative uncertainty descriptions consistent with the flight data. Multisine control surface command inputs and control system feedbacks are used as signals in a wavelet-based modal parameter estimation procedure for model updates. Transfer function estimates are incorporated in a robust minimax estimation scheme to get input-output parameters and error bounds consistent with the data and model structure. Uncertainty estimates derived from the data in this manner provide an appropriate and relevant representation for model development and robust stability analysis. This model-plus-uncertainty identification procedure is applied to aeroservoelastic flight data from the NASA Dryden Flight Research Center F-18 Systems Research Aircraft.
Incorporating uncertainty in predictive species distribution modelling.
Beale, Colin M; Lennon, Jack J
2012-01-19
Motivated by the need to solve ecological problems (climate change, habitat fragmentation and biological invasions), there has been increasing interest in species distribution models (SDMs). Predictions from these models inform conservation policy, invasive species management and disease-control measures. However, predictions are subject to uncertainty, the degree and source of which is often unrecognized. Here, we review the SDM literature in the context of uncertainty, focusing on three main classes of SDM: niche-based models, demographic models and process-based models. We identify sources of uncertainty for each class and discuss how uncertainty can be minimized or included in the modelling process to give realistic measures of confidence around predictions. Because this has typically not been performed, we conclude that uncertainty in SDMs has often been underestimated and a false precision assigned to predictions of geographical distribution. We identify areas where development of new statistical tools will improve predictions from distribution models, notably the development of hierarchical models that link different types of distribution model and their attendant uncertainties across spatial scales. Finally, we discuss the need to develop more defensible methods for assessing predictive performance, quantifying model goodness-of-fit and for assessing the significance of model covariates.
Uncertainty Modeling for Structural Control Analysis and Synthesis
NASA Technical Reports Server (NTRS)
Campbell, Mark E.; Crawley, Edward F.
1996-01-01
The development of an accurate model of uncertainties for the control of structures that undergo a change in operational environment, based solely on modeling and experimentation in the original environment is studied. The application used throughout this work is the development of an on-orbit uncertainty model based on ground modeling and experimentation. A ground based uncertainty model consisting of mean errors and bounds on critical structural parameters is developed. The uncertainty model is created using multiple data sets to observe all relevant uncertainties in the system. The Discrete Extended Kalman Filter is used as an identification/parameter estimation method for each data set, in addition to providing a covariance matrix which aids in the development of the uncertainty model. Once ground based modal uncertainties have been developed, they are localized to specific degrees of freedom in the form of mass and stiffness uncertainties. Two techniques are presented: a matrix method which develops the mass and stiffness uncertainties in a mathematical manner; and a sensitivity method which assumes a form for the mass and stiffness uncertainties in macroelements and scaling factors. This form allows the derivation of mass and stiffness uncertainties in a more physical manner. The mass and stiffness uncertainties of the ground based system are then mapped onto the on-orbit system, and projected to create an analogous on-orbit uncertainty model in the form of mean errors and bounds on critical parameters. The Middeck Active Control Experiment is introduced as experimental verification for the localization and projection methods developed. In addition, closed loop results from on-orbit operations of the experiment verify the use of the uncertainty model for control analysis and synthesis in space.
Quantifying Groundwater Model Uncertainty
NASA Astrophysics Data System (ADS)
Hill, M. C.; Poeter, E.; Foglia, L.
2007-12-01
Groundwater models are characterized by the (a) processes simulated, (b) boundary conditions, (c) initial conditions, (d) method of solving the equation, (e) parameterization, and (f) parameter values. Models are related to the system of concern using data, some of which form the basis of observations used most directly, through objective functions, to estimate parameter values. Here we consider situations in which parameter values are determined by minimizing an objective function. Other methods of model development are not considered because their ad hoc nature generally prohibits clear quantification of uncertainty. Quantifying prediction uncertainty ideally includes contributions from (a) to (f). The parameter values of (f) tend to be continuous with respect to both the simulated equivalents of the observations and the predictions, while many aspects of (a) through (e) are discrete. This fundamental difference means that there are options for evaluating the uncertainty related to parameter values that generally do not exist for other aspects of a model. While the methods available for (a) to (e) can be used for the parameter values (f), the inferential methods uniquely available for (f) generally are less computationally intensive and often can be used to considerable advantage. However, inferential approaches require calculation of sensitivities. Whether the numerical accuracy and stability of the model solution required for accurate sensitivities is more broadly important to other model uses is an issue that needs to be addressed. Alternative global methods can require 100 or even 1,000 times the number of runs needed by inferential methods, though methods of reducing the number of needed runs are being developed and tested. Here we present three approaches for quantifying model uncertainty and investigate their strengths and weaknesses. (1) Represent more aspects as parameters so that the computationally efficient methods can be broadly applied. This
QSAR modeling of flotation collectors using principal components extracted from topological indices.
Natarajan, R; Nirdosh, Inderjit; Basak, Subhash C; Mills, Denise R
2002-01-01
Several topological indices were calculated for substituted-cupferrons that were tested as collectors for the froth flotation of uranium. The principal component analysis (PCA) was used for data reduction. Seven principal components (PC) were found to account for 98.6% of the variance among the computed indices. The principal components thus extracted were used in stepwise regression analyses to construct regression models for the prediction of separation efficiencies (Es) of the collectors. A two-parameter model with a correlation coefficient of 0.889 and a three-parameter model with a correlation coefficient of 0.913 were formed. PCs were found to be better than partition coefficient to form regression equations, and inclusion of an electronic parameter such as Hammett sigma or quantum mechanically derived electronic charges on the chelating atoms did not improve the correlation coefficient significantly. The method was extended to model the separation efficiencies of mercaptobenzothiazoles (MBT) and aminothiophenols (ATP) used in the flotation of lead and zinc ores, respectively. Five principal components were found to explain 99% of the data variability in each series. A three-parameter equation with correlation coefficient of 0.985 and a two-parameter equation with correlation coefficient of 0.926 were obtained for MBT and ATP, respectively. The amenability of separation efficiencies of chelating collectors to QSAR modeling using PCs based on topological indices might lead to the selection of collectors for synthesis and testing from a virtual database.
Classification of baseline toxicants for QSAR predictions to replace fish acute toxicity studies.
Nendza, Monika; Müller, Martin; Wenzel, Andrea
2017-03-22
Fish acute toxicity studies are required for environmental hazard and risk assessment of chemicals by national and international legislations such as REACH, the regulations of plant protection products and biocidal products, or the GHS (globally harmonised system) for classification and labelling of chemicals. Alternative methods like QSARs (quantitative structure-activity relationships) can replace many ecotoxicity tests. However, complete substitution of in vivo animal tests by in silico methods may not be realistic. For the so-called baseline toxicants, it is possible to predict the fish acute toxicity with sufficient accuracy from log Kow and, hence, valid QSARs can replace in vivo testing. In contrast, excess toxicants and chemicals not reliably classified as baseline toxicants require further in silico, in vitro or in vivo assessments. Thus, the critical task is to discriminate between baseline and excess toxicants. For fish acute toxicity, we derived a scheme based on structural alerts and physicochemical property thresholds to classify chemicals as either baseline toxicants (=predictable by QSARs) or as potential excess toxicants (=not predictable by baseline QSARs). The step-wise approach identifies baseline toxicants (true negatives) in a precautionary way to avoid false negative predictions. Therefore, a certain fraction of false positives can be tolerated, i.e. baseline toxicants without specific effects that may be tested instead of predicted. Application of the classification scheme to a new heterogeneous dataset for diverse fish species results in 40% baseline toxicants, 24% excess toxicants and 36% compounds not classified. Thus, we can conclude that replacing about half of the fish acute toxicity tests by QSAR predictions is realistically achievable in the short term. The long-term goals are classification criteria also for further groups of toxicants and to replace as many in vivo fish acute toxicity tests as possible with valid QSAR
Burden, Natalie; Maynard, Samuel K; Weltje, Lennart; Wheeler, James R
2016-10-01
The European Plant Protection Products Regulation 1107/2009 requires that registrants establish whether pesticide metabolites pose a risk to the environment. Fish acute toxicity assessments may be carried out to this end. Considering the total number of pesticide (re-) registrations, the number of metabolites can be considerable, and therefore this testing could use many vertebrates. EFSA's recent "Guidance on tiered risk assessment for plant protection products for aquatic organisms in edge-of-field surface waters" outlines opportunities to apply non-testing methods, such as Quantitative Structure Activity Relationship (QSAR) models. However, a scientific evidence base is necessary to support the use of QSARs in predicting acute fish toxicity of pesticide metabolites. Widespread application and subsequent regulatory acceptance of such an approach would reduce the numbers of animals used. The work presented here intends to provide this evidence base, by means of retrospective data analysis. Experimental fish LC50 values for 150 metabolites were extracted from the Pesticide Properties Database (http://sitem.herts.ac.uk/aeru/ppdb/en/atoz.htm). QSAR calculations were performed to predict fish acute toxicity values for these metabolites using the US EPA's ECOSAR software. The most conservative predicted LC50 values generated by ECOSAR were compared with experimental LC50 values. There was a significant correlation between predicted and experimental fish LC50 values (Spearman rs = 0.6304, p < 0.0001). For 62% of metabolites assessed, the QSAR predicted values are equal to or lower than their respective experimental values. Refined analysis, taking into account data quality and experimental variation considerations increases the proportion of sufficiently predictive estimates to 91%. For eight of the nine outliers, there are plausible explanation(s) for the disparity between measured and predicted LC50 values. Following detailed consideration of the robustness of
Sivan, Sree Kanth; Manga, Vijjulatha
2010-06-01
Nonnucleoside reverse transcriptase inhibitors (NNRTIs) are allosteric inhibitors of the HIV-1 reverse transcriptase. Recently, a series of triazolinones and pyridazinones were reported as potent inhibitors of HIV-1 wild-type reverse transcriptase. In the present study, docking and 3D quantitative structure-activity relationship (3D QSAR) studies involving comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were performed on 31 molecules. Ligands were built and minimized using the Tripos force field with Gasteiger-Hückel charges. These ligands were docked into the protein active site using GLIDE 4.0. The docked poses were analyzed; the best docked poses were selected and aligned. CoMFA and CoMSIA fields were calculated using SYBYL 6.9. The molecules were divided into a training set and a test set, a PLS analysis was performed and QSAR models were generated. The models showed good statistical reliability, which is evident from the r² (nv), q² (LOO) and r² (pred) values. The CoMFA model provides the most significant correlation of steric and electrostatic fields with biological activities. The CoMSIA model provides a correlation of steric, electrostatic, acceptor and hydrophobic fields with biological activities. The information provided by the 3D QSAR models prompted us to optimize the lead and design new potential inhibitors.
CURRENT PRACTICES IN QSAR DEVELOPMENT AND APPLICATIONS
Although it is commonly assumed that the structure and properties of a single chemical determines its activity in a particular biological system, it is only through study of how biological activity varies with changes...
Determination of Uncertainties for the New SSME Model
NASA Technical Reports Server (NTRS)
Coleman, Hugh W.; Hawk, Clark W.
1996-01-01
This report discusses the uncertainty analysis performed in support of a new test analysis and performance prediction model for the Space Shuttle Main Engine. The new model utilizes uncertainty estimates for experimental data and for the analytical model to obtain the most plausible operating condition for the engine system. This report discusses the development of the data sets and uncertainty estimates to be used in the development of the new model. It also presents the application of uncertainty analysis to analytical models and the uncertainty analysis for the conservation of mass and energy balance relations. A new methodology for the assessment of the uncertainty associated with linear regressions is presented.
QSAR Study on the anti-tumor activity of levofloxacin-thiadiazole HDACi conjugates
NASA Astrophysics Data System (ADS)
Tang, Ziqiang; Feng, Hui; Chen, Yan; Yue, Wei; Feng, Changjun
2017-12-01
A molecular electronegativity distance vector (Mt) based on 13 atomic types is used to describe the structures of 19 conjugates (LHCc) of levofloxacin-thiadiazole HDAC inhibitor (HDACi) and is related to the anti-tumor activity (MF and PC) of LHCc against MCF-7 and PC-3. The quantitative structure-activity relationships (QSAR) were established by using leaps-and-bounds regression analysis for the anti-tumor activities (MF and PC) of the above 19 compounds against MCF-7 and PC-3 along with the Mt. The correlation coefficients (R²) and the leave-one-out (LOO) cross-validation R²cv for the MF and PC models were 0.792 and 0.679; 0.773 and 0.565, respectively. The QSAR models have favorable correlation, as well as robustness and good prediction capability by R², F, R²cv, AIC, FIT and VIF tests. The results indicate that the molecular structural units -CHg- (g = 1, 2), -NH2, -NH-, -OH, O=, -O-, -S- and -X are the main factors that directly affect the anti-tumor activities MF and PC of these compounds.
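The abstract above reports both a fitted R² and a leave-one-out (LOO) cross-validated R² for each model. As a rough illustration of how the two statistics differ, the following is a minimal sketch for a single-descriptor linear model. The function names and the toy data are hypothetical and are not taken from the study, which used leaps-and-bounds regression on a multi-component descriptor vector. The cross-validated statistic here uses one common convention, 1 - PRESS/SS_tot with the overall mean in SS_tot; other conventions exist.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit y = intercept + slope * x for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def total_sum_of_squares(ys):
    """Sum of squared deviations of the activities from their overall mean."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r_squared(xs, ys):
    """Fitted R^2: every compound is used both to fit and to evaluate."""
    intercept, slope = fit_line(xs, ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)


def loo_q_squared(xs, ys):
    """LOO cross-validated R^2: each compound is predicted by a model
    fitted to all the other compounds (PRESS replaces the residual sum)."""
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Hypothetical descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]

print("R^2      =", round(r_squared(descriptor, activity), 3))
print("LOO R^2  =", round(loo_q_squared(descriptor, activity), 3))
```

For a least-squares model the LOO value cannot exceed the fitted R², which is consistent with the pattern of the values reported in the abstract (0.679 below 0.792, and 0.565 below 0.773).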
3D-QSAR modeling and molecular docking studies on a series of 2,5 disubstituted 1,3,4-oxadiazoles
NASA Astrophysics Data System (ADS)
Ghaleb, Adib; Aouidate, Adnane; Ghamali, Mounir; Sbai, Abdelouahid; Bouachrine, Mohammed; Lakhlifi, Tahar
2017-10-01
3D-QSAR analyses using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were performed on novel 2,5 disubstituted 1,3,4-oxadiazole analogues as anti-fungal agents. The CoMFA and CoMSIA models, using 13 compounds in the training set, give Q2 values of 0.52 and 0.51, respectively, and R2 values of 0.92. The adapted alignment method with suitable parameters resulted in reliable models. The contour maps produced by the CoMFA and CoMSIA models were employed to determine a three-dimensional quantitative structure-activity relationship. Based on this study, a set of new molecules with high predicted activities was designed. Surflex-docking confirmed the stability of the predicted molecules in the receptor.
Cramer, Richard D.
2015-01-01
The possible applicability of the new template CoMFA methodology to the prediction of unknown biological affinities was explored. For twelve selected targets, all ChEMBL binding affinities were used as training and/or prediction sets, making these 3D-QSAR models the most structurally diverse and among the largest ever. For six of the targets, X-ray crystallographic structures provided the aligned templates required as input (BACE, cdk1, chk2, carbonic anhydrase-II, factor Xa, PTP1B). For all targets including the other six (hERG, cyp3A4 binding, endocrine receptor, COX2, D2, and GABAa), six modeling protocols applied to only three familiar ligands provided six alternate sets of aligned templates. The statistical qualities of the six or seven models thus resulting for each individual target were remarkably similar. Also, perhaps unexpectedly, the standard deviations of the errors of cross-validation predictions accompanying model derivations were indistinguishable from the standard deviations of the errors of truly prospective predictions. These standard deviations of prediction ranged from 0.70 to 1.14 log units and averaged 0.89 (8x in concentration units) over the twelve targets, representing an average reduction of almost 50% in uncertainty, compared to the null hypothesis of “predicting” an unknown affinity to be the average of known affinities. These errors of prediction are similar to those from Tanimoto coefficients of fragment occurrence frequencies, the predominant approach to side effect prediction, which template CoMFA can augment by identifying additional active structural classes, by improving Tanimoto-only predictions, by yielding quantitative predictions of potency, and by providing interpretable guidance for avoiding or enhancing any specific target response. PMID:26065424
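The abstract above equates a standard deviation of 0.89 log units with roughly 8x in concentration units. A minimal sketch of that conversion follows, assuming the log units are base 10 (the usual convention for binding affinities such as pKi or pIC50); the function name is hypothetical.

```python
def log_units_to_fold(sd_log10):
    """Convert an error expressed in log10 units into a multiplicative
    (fold) factor in concentration units."""
    return 10.0 ** sd_log10


# Smallest, average and largest standard deviations quoted in the abstract.
for sd in (0.70, 0.89, 1.14):
    print(f"{sd:.2f} log units -> about {log_units_to_fold(sd):.1f}-fold in concentration")
```

Because the factor is exponential in the error, the quoted range of 0.70 to 1.14 log units spans a noticeably wider range of fold errors than the log-scale numbers alone suggest.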
Uncertainty in the Modeling of Tsunami Sediment Transport
NASA Astrophysics Data System (ADS)
Jaffe, B. E.; Sugawara, D.; Goto, K.; Gelfenbaum, G. R.; La Selle, S.
2016-12-01
Erosion and deposition from tsunamis record information about tsunami hydrodynamics and size that can be interpreted to improve tsunami hazard assessment. A recent study (Jaffe et al., 2016) explores sources and methods for quantifying uncertainty in tsunami sediment transport modeling. Uncertainty varies with tsunami properties, study site characteristics, available input data, sediment grain size, and the model used. Although uncertainty has the potential to be large, case studies for both forward and inverse models have shown that sediment transport modeling provides useful information on tsunami inundation and hydrodynamics that can be used to improve tsunami hazard assessment. New techniques for quantifying uncertainty, such as Ensemble Kalman Filtering inversion, and more rigorous reporting of uncertainties will advance the science of tsunami sediment transport modeling. Uncertainty may be decreased with additional laboratory studies that increase our understanding of the semi-empirical parameters and physics of tsunami sediment transport, standardized benchmark tests to assess model performance, and the development of hybrid modeling approaches to exploit the strengths of forward and inverse models. As uncertainty in tsunami sediment transport modeling is reduced, and with increased ability to quantify uncertainty, the geologic record of tsunamis will become more valuable in the assessment of tsunami hazard. Jaffe, B., Goto, K., Sugawara, D., Gelfenbaum, G., and La Selle, S., "Uncertainty in Tsunami Sediment Transport Modeling", Journal of Disaster Research Vol. 11 No. 4, pp. 647-661, 2016, doi: 10.20965/jdr.2016.p0647 https://www.fujipress.jp/jdr/dr/dsstr001100040647/
Chen, Hongming; Carlsson, Lars; Eriksson, Mats; Varkonyi, Peter; Norinder, Ulf; Nilsson, Ingemar
2013-06-24
A novel methodology was developed to build Free-Wilson like local QSAR models by combining R-group signatures and the SVM algorithm. Unlike Free-Wilson analysis this method is able to make predictions for compounds with R-groups not present in a training set. Eleven public data sets were chosen as test cases for comparing the performance of our new method with several other traditional modeling strategies, including Free-Wilson analysis. Our results show that the R-group signature SVM models achieve better prediction accuracy compared with Free-Wilson analysis in general. Moreover, the predictions of R-group signature models are also comparable to the models using ECFP6 fingerprints and signatures for the whole compound. Most importantly, R-group contributions to the SVM model can be obtained by calculating the gradient for R-group signatures. For most of the studied data sets, a significant correlation with that of a corresponding Free-Wilson analysis is shown. These results suggest that the R-group contribution can be used to interpret bioactivity data and highlight that the R-group signature based SVM modeling method is as interpretable as Free-Wilson analysis. Hence the signature SVM model can be a useful modeling tool for any drug discovery project.
Model parameter uncertainty analysis for an annual field-scale P loss model
NASA Astrophysics Data System (ADS)
Bolster, Carl H.; Vadas, Peter A.; Boykin, Debbie
2016-08-01
Phosphorus (P) fate and transport models are important tools for developing and evaluating conservation practices aimed at reducing P losses from agricultural fields. Because all models are simplifications of complex systems, there will exist an inherent amount of uncertainty associated with their predictions. It is therefore important that efforts be directed at identifying, quantifying, and communicating the different sources of model uncertainties. In this study, we conducted an uncertainty analysis with the Annual P Loss Estimator (APLE) model. Our analysis included calculating parameter uncertainties and confidence and prediction intervals for five internal regression equations in APLE. We also estimated uncertainties of the model input variables based on values reported in the literature. We then predicted P loss for a suite of fields under different management and climatic conditions while accounting for uncertainties in the model parameters and inputs and compared the relative contributions of these two sources of uncertainty to the overall uncertainty associated with predictions of P loss. Both the overall magnitude of the prediction uncertainties and the relative contributions of the two sources of uncertainty varied depending on management practices and field characteristics. This was due to differences in the number of model input variables and the uncertainties in the regression equations associated with each P loss pathway. Inspection of the uncertainties in the five regression equations brought attention to a previously unrecognized limitation with the equation used to partition surface-applied fertilizer P between leaching and runoff losses. As a result, an alternate equation was identified that provided similar predictions with much less uncertainty. Our results demonstrate how a thorough uncertainty and model residual analysis can be used to identify limitations with a model. Such insight can then be used to guide future data collection and model
Ruiz, Patricia; Begluitti, Gino; Tincher, Terry; Wheeler, John; Mumtaz, Moiz
2012-07-27
Predicting toxicity quantitatively, using Quantitative Structure Activity Relationships (QSAR), has matured over recent years to the point that the predictions can be used to help identify missing comparison values in a substance's database. In this manuscript we investigate using the lethal dose that kills fifty percent of a test population (LD₅₀) for determining relative toxicity of a number of substances. In general, the smaller the LD₅₀ value, the more toxic the chemical, and the larger the LD₅₀ value, the lower the toxicity. When systemic toxicity and other specific toxicity data are unavailable for the chemical(s) of interest, during emergency responses, LD₅₀ values may be employed to determine the relative toxicity of a series of chemicals. In the present study, a group of chemical warfare agents and their breakdown products have been evaluated using four available rat oral QSAR LD₅₀ models. The QSAR analysis shows that the breakdown products of Sulfur Mustard (HD) are predicted to be less toxic than the parent compound as well as other known breakdown products that have known toxicities. The QSAR estimated break down products LD₅₀ values ranged from 299 mg/kg to 5,764 mg/kg. This evaluation allows for the ranking and toxicity estimation of compounds for which little toxicity information existed; thus leading to better risk decision making in the field.
Impact of uncertainty on modeling and testing
NASA Technical Reports Server (NTRS)
Coleman, Hugh W.; Brown, Kendall K.
1995-01-01
A thorough understanding of the uncertainties associated with the modeling and testing of the Space Shuttle Main Engine (SSME) will greatly aid decisions concerning hardware performance and future development efforts. This report will describe the determination of the uncertainties in the modeling and testing of the Space Shuttle Main Engine test program at the Technology Test Bed facility at Marshall Space Flight Center. Section 2 will present a summary of the uncertainty analysis methodology used and discuss the specific applications to the TTB SSME test program. Section 3 will discuss the application of the uncertainty analysis to the test program and the results obtained. Section 4 presents the results of the analysis of the SSME modeling effort from an uncertainty analysis point of view. The appendices at the end of the report contain a significant amount of information relative to the analysis, including discussions of venturi flowmeter data reduction and uncertainty propagation, bias uncertainty documentations, technical papers published, the computer code generated to determine the venturi uncertainties, and the venturi data and results used in the analysis.
Amin, Sk Abdul; Adhikari, Nilanjan; Jha, Tarun; Gayen, Shovanlal
2016-12-01
Huntington's disease (HD) is caused by mutation of the huntingtin protein (mHtt), leading to neuronal cell death. The mHtt-induced toxicity can be rescued by inhibiting the kynurenine monooxygenase (KMO) enzyme. Therefore, KMO is a promising drug target for neurodegenerative disorders such as Huntington's disease. Fifty-six arylpyrimidine KMO inhibitors are structurally explored through regression- and classification-based multi-QSAR modeling, pharmacophore mapping and molecular docking approaches. Moreover, ten new compounds are proposed and validated through the modeling that may be effective in accelerating Huntington's disease drug discovery efforts. Copyright © 2016 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gordon, R.K.; Breuer, E.; Padilla, F.N.
1987-05-01
QSARs between biological activities and molecular-chemical properties were investigated to aid in designing more effective and potent antimuscarinic pharmacophores. A molecular modeling program was used to calculate geometrical and topological values of a series of DPP pharmacophores. The newly synthesized pharmacophores were tested for their antagonist activities by: (1) inhibition of (N-methyl-³H)scopolamine binding to the muscarinic receptors of N4TG1 neuroblastoma cells; (2) blocking of acetylcholine-induced contraction of guinea pig ileum; and (3) inhibition of carbachol-induced α-amylase release from rat pancreas. The differences in the log of these biological activities were directly and significantly related to the distances between the carbonyl oxygen of the DPP and the quaternary nitrogen of the modified pharmacophores. The biological activities, while depending on each particular assay, varied between three and four logs of activity. The charge remained the same in all the pharmacophores. There were no QSAR correlations between molecular volume, molecular connectivity, or principal moments and their antagonistic activities, although multivariate QSAR was not employed. Thus, based on distance geometry, potent muscarinic pharmacophores can be predicted.
Gholivand, Khodayar; Ebrahimi Valmoozi, Ali Asghar; Bonsaii, Mahyar
2014-06-01
Novel (thio)phosphoramidate derivatives based on piperidinecarboxamide with the general formula of (NH2-C(O)-C5H9N)-P(X=O,S)R1R2 (1-5) and (NH2-C(O)-C5H9N)2-P(O)R (6-9) were synthesized and characterized by ³¹P, ¹³C and ¹H NMR and IR spectroscopy. Furthermore, the crystal structure of compound (NH2-C(O)-C5H9N)2-P(O)(OC6H5) (6) was investigated. The activities of the derivatives on cholinesterases (ChE) were determined using a modified Ellman's method. The mixed-type mechanisms of these compounds were also evaluated by Lineweaver-Burk plots. Molecular docking and quantitative structure-activity relationship (QSAR) analyses were used to understand the relationship between molecular structural features and anti-ChE activity, and to predict the binding affinity of phosphoramido-piperidinecarboxamides (PAPCAs) to ChE receptors. Molecular docking analysis found noncovalent interactions, especially hydrogen bonding as well as hydrophobic interactions, between PAPCAs and ChE. Based on the docking results, appropriate molecular structural parameters were adopted to develop a QSAR model. DFT-QSAR models for ChE enzymes demonstrated the importance of the electrophilicity parameter in describing the anti-AChE and anti-BChE activities of the synthesized compounds. The correlation matrix of the QSAR models and the docking analysis confirmed that the electrophilicity descriptor can control the influence of the hydrophobic properties of the P=(O, S) and CO functional groups of PAPCA derivatives in the inhibition of human ChE enzymes. Copyright © 2014 Elsevier Inc. All rights reserved.
El-Kilany, Yeldez; Nahas, Nariman M; Al-Ghamdi, Mariam A; Badawy, Mohamed E I; El Ashry, El Sayed H
2015-01-01
Ethyl (benzimidazol-1-yl)acetate was subjected to hydrazinolysis with hydrazine hydrate to give (benzimidazol-1-yl)acetohydrazide. The latter was reacted with various aromatic aldehydes to give the respective arylidene (1H-benzimidazol-1-yl)acetohydrazones. Solutions of the prepared hydrazones were found to contain two geometric isomers. Similarly, (2-methyl-benzimidazol-1-yl)acetohydrazide was reacted with various aldehydes to give the corresponding hydrazones. The antibacterial activity was evaluated in vitro by minimum inhibitory concentration (MIC) against Agrobacterium tumefaciens (A. tumefaciens), Erwinia carotovora (E. carotovora), Corynebacterium fascians (C. fascians) and Pseudomonas solanacearum (P. solanacearum). MIC results demonstrated that salicylaldehyde(1H-benzimidazol-1-yl)acetohydrazone (4) was the most active compound (MIC = 20, 35, 25 and 30 mg/L against A. tumefaciens, C. fascians, E. carotovora and P. solanacearum, respectively). Quantitative structure activity relationship (QSAR) investigation using Hansch analysis was applied to find out the correlation between antibacterial activity and physicochemical properties. Various physicochemical descriptors and experimentally determined MIC values for different microorganisms were used as independent and dependent variables, respectively. pMICs of the compounds exhibited good correlation (r = 0.983, 0.914, 0.960 and 0.958 for A. tumefaciens, C. fascians, E. carotovora and P. solanacearum, respectively) with the prediction made by the model. The QSAR study revealed that the hydrophobic parameter (ClogP), the aqueous solubility (LogS), calculated molar refractivity, topological polar surface area and hydrogen bond acceptor had an overall significant correlation with antibacterial activity. The statistical results of training set, correlation coefficient (r and r²), the ratio between regression and residual variances (f, Fisher's statistic), the standard error of estimates and
[Application of Kohonen Self-Organizing Feature Maps in QSAR of human ADMET and kinase data sets].
Hegymegi-Barakonyi, Bálint; Orfi, László; Kéri, György; Kövesdi, István
2013-01-01
QSAR predictions have proven very useful in a large number of drug design studies, such as the design of kinase inhibitors for cancer therapy; however, the overall predictability often remains unsatisfactory. To improve the predictability of ADMET features and kinase inhibitory data, we present a new method using Kohonen's Self-Organizing Feature Map (SOFM) to cluster molecules based on explanatory variables (X) and separate dissimilar ones. We calculated SOFM clusters for a large number of molecules with human ADMET and kinase inhibitory data, and we showed that chemically similar molecules were in the same SOFM cluster, and within such clusters the QSAR models had significantly better predictability. We also used target variables (Y, e.g. ADMET) jointly with X variables to create a novel type of clustering. With our method, cells of loosely coupled XY data could be identified and separated into different model building sets.
3D-QSAR studies on 1,2,4-triazolyl 5-azaspiro [2.4]-heptanes as D3R antagonists
NASA Astrophysics Data System (ADS)
Zhang, Xin; Zhang, Hui
2018-07-01
Dopamine D3 receptor has become an attractive target in the treatment of abused drugs. 3D-QSAR studies were performed on a novel series of D3 receptor antagonists, 1,2,4-triazolyl 5-azaspiro [2.4]-heptanes, using CoMFA and CoMSIA methods. Two predictive 3D-QSAR models have been generated for the modified design of D3R antagonists. Based on the steric, electrostatic, hydrophobic and hydrogen-bond acceptor information of contour maps, key structural factors affecting the bioactivity were explored. This work gives helpful suggestions on the design of novel D3R antagonists with increased activities.
Yadav, Dharmendra Kumar; Kalani, Komal; Khan, Feroz; Srivastava, Santosh Kumar
2013-12-01
For the prediction of anticancer activity of glycyrrhetinic acid (GA-1) analogs against the human lung cancer cell line (A-549), a QSAR model was developed by forward stepwise multiple linear regression methodology. The regression coefficient (r²) and prediction accuracy (rCV²) of the QSAR model were 0.94 and 0.82, respectively. The QSAR study indicates that the dipole moments, size of smallest ring, amine counts, and hydroxyl and nitro functional groups correlate well with cytotoxic activity. The docking studies showed high binding affinity of the predicted active compounds against the lung cancer target EGFR. These active glycyrrhetinic acid derivatives were then semi-synthesized, characterized and tested in vitro for anticancer activity. The experimental results were in agreement with the predicted values, and the ethyl oxalyl derivative of GA-1 (GA-3) showed cytotoxic activity equal to that of the standard anticancer drug paclitaxel.
Fan, Feng; Cheng, Jiagao; Li, Zhong; Xu, Xiaoyong; Qian, Xuhong
2010-02-01
The molecular aggregation state of bioactive compounds plays a key role in their bio-interactive processes. In this article, based on the structural information of dimers, the simplest model of the molecular aggregation state, and combined with solvation computations, a total of four descriptors (DeltaV, MR2, DeltaE(1), and DeltaE(2)) were calculated for a QSAR study of a novel insect-growth regulator, N-(5-phenyl-1,3,4-oxadiazol-2-yl)-N'-benzoyl urea. Two QSAR models were constructed with r(2) = 0.671, q(2) = 0.516 and r(2) = 0.816, q(2) = 0.695, respectively. This implies that the bioactivity may depend strongly on the characteristics of the molecular aggregation state, especially on the dimeric transport ability from oil phase to water phase. Copyright 2009 Wiley Periodicals, Inc.
Validity and validation of expert (Q)SAR systems.
Hulzebos, E; Sijm, D; Traas, T; Posthumus, R; Maslankiewicz, L
2005-08-01
At a recent workshop in Setubal (Portugal) principles were drafted to assess the suitability of (quantitative) structure-activity relationships ((Q)SARs) for assessing the hazards and risks of chemicals. In the present study we applied some of the Setubal principles to test the validity of three (Q)SAR expert systems and validate the results. These principles include a mechanistic basis, the availability of a training set and validation. ECOSAR, BIOWIN and DEREK for Windows have a mechanistic or empirical basis. ECOSAR has a training set for each QSAR. For half of the structural fragments the number of chemicals in the training set is >4. Based on structural fragments and log Kow, ECOSAR uses linear regression to predict ecotoxicity. Validating ECOSAR for three 'valid' classes results in predictivity of ≥ 64%. BIOWIN uses (non-)linear regressions to predict the probability of biodegradability based on fragments and molecular weight. It has a large training set and predicts non-ready biodegradability well. DEREK for Windows predictions are supported by a mechanistic rationale and literature references. The structural alerts in this program have been developed with a training set of positive and negative toxicity data. However, to support the prediction only a limited number of chemicals in the training set is presented to the user. DEREK for Windows predicts effects by 'if-then' reasoning. The program predicts best for mutagenicity and carcinogenicity. Each structural fragment in ECOSAR and DEREK for Windows needs to be evaluated and validated separately.
QSAR study of curcumine derivatives as HIV-1 integrase inhibitors.
Gupta, Pawan; Sharma, Anju; Garg, Prabha; Roy, Nilanjan
2013-03-01
A QSAR study was performed on curcumine derivatives as HIV-1 integrase inhibitors using multiple linear regression. The statistically significant model was developed with squared correlation coefficients (r(2)) 0.891 and cross validated r(2) (r(2) cv) 0.825. The developed model revealed that electronic, shape, size, geometry, substitution's information and hydrophilicity were important atomic properties for determining the inhibitory activity of these molecules. The model was also tested successfully for external validation (r(2) pred = 0.849) as well as Tropsha's test for model predictability. Furthermore, the domain analysis was carried out to evaluate the prediction reliability of external set molecules. The model was statistically robust and had good predictive power which can be successfully utilized for screening of new molecules.
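The fitted and cross-validated statistics quoted in abstracts like this one can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes a single descriptor instead of a multi-descriptor regression, uses invented toy data, and takes q² as 1 − PRESS/SS_tot with the overall mean of the activities (one common convention; others use the training-fold mean). All function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y):
    """Fitted (non-cross-validated) coefficient of determination."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def q_squared_loo(x, y):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_tot."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot


# Toy descriptor values and activities (invented for illustration only).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(descriptor, activity)
q2 = q_squared_loo(descriptor, activity)
print(f"r2 = {r2:.4f}, q2 = {q2:.4f}")
```

For an ordinary least-squares fit, q² computed this way cannot exceed r², so a large gap between the two is a common warning sign of overfitting.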
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Xiaolin; Ye, Li; Wang, Xiaoxiang
2012-12-15
Several recent reports suggested that hydroxylated polybrominated diphenyl ethers (HO-PBDEs) may disturb thyroid hormone homeostasis. To illuminate the structural features for thyroid hormone activity of HO-PBDEs and the binding mode between HO-PBDEs and thyroid hormone receptor (TR), the hormone activity of a series of HO-PBDEs to thyroid receptors β was studied based on the combination of 3D-QSAR, molecular docking, and molecular dynamics (MD) methods. The ligand- and receptor-based 3D-QSAR models were obtained using Comparative Molecular Similarity Index Analysis (CoMSIA) method. The optimum CoMSIA model with region focusing yielded satisfactory statistical results: leave-one-out cross-validation correlation coefficient (q²) was 0.571 and non-cross-validation correlation coefficient (r²) was 0.951. Furthermore, the results of internal validation such as bootstrapping, leave-many-out cross-validation, and progressive scrambling as well as external validation indicated the rationality and good predictive ability of the best model. In addition, molecular docking elucidated the conformations of compounds and key amino acid residues at the docking pocket, MD simulation further determined the binding process and validated the rationality of docking results. -- Highlights: ► The thyroid hormone activities of HO-PBDEs were studied by 3D-QSAR. ► The binding modes between HO-PBDEs and TRβ were explored. ► 3D-QSAR, molecular docking, and molecular dynamics (MD) methods were performed.
Uncertainty Quantification in Climate Modeling and Projection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Qian, Yun; Jackson, Charles; Giorgi, Filippo
The projection of future climate is one of the most complex problems undertaken by the scientific community. Although scientists have been striving to better understand the physical basis of the climate system and to improve climate models, the overall uncertainty in projections of future climate has not been significantly reduced (e.g., from the IPCC AR4 to AR5). With the rapid increase of complexity in Earth system models, reducing uncertainties in climate projections becomes extremely challenging. Since uncertainties always exist in climate models, interpreting the strengths and limitations of future climate projections is key to evaluating risks, and climate change information for use in Vulnerability, Impact, and Adaptation (VIA) studies should be provided with both well-characterized and well-quantified uncertainty. The workshop aimed at providing participants, many of them from developing countries, information on strategies to quantify the uncertainty in climate model projections and assess the reliability of climate change information for decision-making. The program included a mixture of lectures on fundamental concepts in Bayesian inference and sampling, applications, and hands-on computer laboratory exercises employing software packages for Bayesian inference, Markov Chain Monte Carlo methods, and global sensitivity analyses. The lectures covered a range of scientific issues underlying the evaluation of uncertainties in climate projections, such as the effects of uncertain initial and boundary conditions, uncertain physics, and limitations of observational records. Progress in quantitatively estimating uncertainties in hydrologic, land surface, and atmospheric models at both regional and global scales was also reviewed. The application of Uncertainty Quantification (UQ) concepts to coupled climate system models is still in its infancy. The Coupled Model Intercomparison Project (CMIP) multi-model ensemble currently represents the primary data for
Uncertainty quantification for environmental models
Hill, Mary C.; Lu, Dan; Kavetski, Dmitri; Clark, Martyn P.; Ye, Ming
2012-01-01
Environmental models are used to evaluate the fate of fertilizers in agricultural settings (including soil denitrification), the degradation of hydrocarbons at spill sites, and water supply for people and ecosystems in small to large basins and cities—to mention but a few applications of these models. They also play a role in understanding and diagnosing potential environmental impacts of global climate change. The models are typically mildly to extremely nonlinear. The persistent demand for enhanced dynamics and resolution to improve model realism [17] means that lengthy individual model execution times will remain common, notwithstanding continued enhancements in computer power. In addition, high-dimensional parameter spaces are often defined, which increases the number of model runs required to quantify uncertainty [2]. Some environmental modeling projects have access to extensive funding and computational resources; many do not. The many recent studies of uncertainty quantification in environmental model predictions have focused on uncertainties related to data error and sparsity of data, expert judgment expressed mathematically through prior information, poorly known parameter values, and model structure (see, for example, [1,7,9,10,13,18]). Approaches for quantifying uncertainty include frequentist (potentially with prior information [7,9]), Bayesian [13,18,19], and likelihood-based. A few of the numerous methods, including some sensitivity and inverse methods with consequences for understanding and quantifying uncertainty, are as follows: Bayesian hierarchical modeling and Bayesian model averaging; single-objective optimization with error-based weighting [7] and multi-objective optimization [3]; methods based on local derivatives [2,7,10]; screening methods like OAT (one at a time) and the method of Morris [14]; FAST (Fourier amplitude sensitivity testing) [14]; the Sobol' method [14]; randomized maximum likelihood [10]; Markov chain Monte Carlo (MCMC) [10
QSAR models for thiophene and imidazopyridine derivatives inhibitors of the Polo-Like Kinase 1.
Comelli, Nieves C; Duchowicz, Pablo R; Castro, Eduardo A
2014-10-01
The inhibitory activity of 103 thiophene and 33 imidazopyridine derivatives against Polo-Like Kinase 1 (PLK1), expressed as pIC50 (-log IC50), was predicted by QSAR modeling. Multivariate linear regression (MLR) was employed to model the relationship between 0D and 3D molecular descriptors and the biological activities of the molecules, using the replacement method (MR) as the variable selection tool. The 136 compounds were separated into several training and test sets. Two splitting approaches, distribution of biological data and structural diversity, and the statistical experimental design procedure D-optimal distance were applied to the dataset. The significance of the training set models was confirmed by statistically higher values of the internal leave-one-out cross-validated coefficient of determination (Q2) and the external predictive coefficient of determination for the test set (Rtest2). The model developed from a training set obtained with the D-optimal distance protocol, using 3D descriptor space along with activity values, separated chemical features that made it possible to distinguish high and low pIC50 values reasonably well. We then verified that this model was sufficient to reliably and accurately predict the activity of external diverse structures. The model robustness was characterized by means of standard procedures, and the applicability domain (AD) was analyzed by the leverage method. Copyright © 2014 Elsevier B.V. All rights reserved.
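The leverage method for an applicability domain can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it uses one descriptor plus an intercept, for which the leverage has the closed form h = 1/n + (x − mean)²/Sxx, and it applies the commonly used warning threshold h* = 3(p + 1)/n. The function names and the toy descriptor values are hypothetical.

```python
def leverages(train_x, query_x):
    """Leverage of each query point relative to a one-descriptor training set.

    For a model with an intercept and one descriptor:
    h = 1/n + (x - mean)^2 / Sxx.
    """
    n = len(train_x)
    mean_x = sum(train_x) / n
    sxx = sum((x - mean_x) ** 2 for x in train_x)
    return [1.0 / n + (x - mean_x) ** 2 / sxx for x in query_x]


def warning_leverage(n_train, n_descriptors):
    """Commonly used cut-off h* = 3 (p + 1) / n."""
    return 3.0 * (n_descriptors + 1) / n_train


# Toy training descriptor values (invented for illustration only).
train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
h_train = leverages(train, train)
h_star = warning_leverage(len(train), n_descriptors=1)

# One query compound near the training data, one far outside it.
h_query = leverages(train, [4.5, 20.0])
inside_domain = [h <= h_star for h in h_query]
print(h_star, h_query, inside_domain)
```

A prediction for a compound whose leverage exceeds h* is an extrapolation from the training set and is usually treated as less reliable.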
SEDIMENT-ASSOCIATED REACTIONS OF AROMATIC AMINES: QSAR DEVELOPMENT
Despite the common occurrence of the aromatic amine functional group in environmental contaminants, few quantitative structure-activity relationships (QSARs) have been developed to predict sorption kinetics for aromatic amines in natural soils and sediments. Towards the goal of d...
Incorporating parametric uncertainty into population viability analysis models
McGowan, Conor P.; Runge, Michael C.; Larson, Michael A.
2011-01-01
Uncertainty in parameter estimates from sampling variation or expert judgment can introduce substantial uncertainty into ecological predictions based on those estimates. However, in standard population viability analyses, one of the most widely used tools for managing plant, fish and wildlife populations, parametric uncertainty is often ignored in or discarded from model projections. We present a method for explicitly incorporating this source of uncertainty into population models to fully account for risk in management and decision contexts. Our method involves a two-step simulation process where parametric uncertainty is incorporated into the replication loop of the model and temporal variance is incorporated into the loop for time steps in the model. Using the piping plover, a federally threatened shorebird in the USA and Canada, as an example, we compare abundance projections and extinction probabilities from simulations that exclude and include parametric uncertainty. Although final abundance was very low for all sets of simulations, estimated extinction risk was much greater for the simulation that incorporated parametric uncertainty in the replication loop. Decisions about species conservation (e.g., listing, delisting, and jeopardy) might differ greatly depending on the treatment of parametric uncertainty in population models.
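The two-step simulation described above can be sketched as nested loops in Python. This is a toy illustration under loudly stated assumptions, not the authors' piping plover model: the population follows simple multiplicative growth, both sources of variation are Gaussian, and every parameter value and name below is hypothetical.

```python
import random


def extinction_probability(n_reps, n_years, n0, mean_growth,
                           param_sd, temporal_sd,
                           quasi_extinction=2.0, seed=0):
    """Fraction of replicates that fall below a quasi-extinction threshold."""
    rng = random.Random(seed)
    extinct = 0
    for _ in range(n_reps):
        # Replication loop: draw this replicate's mean growth rate once,
        # representing parametric (estimation) uncertainty.
        rep_mean = rng.gauss(mean_growth, param_sd)
        population = n0
        for _ in range(n_years):
            # Time-step loop: draw the annual growth rate around the
            # replicate mean, representing temporal variance.
            growth = max(0.0, rng.gauss(rep_mean, temporal_sd))
            population *= growth
            if population < quasi_extinction:
                extinct += 1
                break
    return extinct / n_reps


# Same temporal variance, with and without parametric uncertainty.
p_without = extinction_probability(1000, 50, 100.0, 0.98, 0.0, 0.10)
p_with = extinction_probability(1000, 50, 100.0, 0.98, 0.05, 0.10)
print(p_without, p_with)
```

Setting `param_sd` to zero recovers the conventional analysis in which the parameter estimate is treated as known exactly.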
Are the Chemical Structures in your QSAR Correct?
Quantitative structure-activity relationships (QSARs) are used to predict many different endpoints, utilize hundreds and even thousands of different parameters (or descriptors), and are created using a variety of approaches. The one thing they all have in common is the assumptio...
Choubey, Sanjay K; Jeyaraman, Jeyakanthan
2016-11-01
Deregulated epigenetic activity of histone deacetylase 1 (HDAC1) in tumor development and carcinogenesis makes it a promising therapeutic target for cancer treatment. HDAC1 has recently captured the attention of researchers owing to its decisive role in multiple types of cancer. In the present study, a multistep framework combining ligand-based 3D-QSAR, molecular docking and molecular dynamics (MD) simulation studies was applied to explore potential compounds with good HDAC1 binding affinity. Four different pharmacophore hypotheses, Hypo1 (AADR), Hypo2 (AAAH), Hypo3 (AAAR) and Hypo4 (ADDR), were obtained. The hypothesis Hypo1 (AADR), with two hydrogen bond acceptors (A), one hydrogen bond donor (D) and one aromatic ring (R), was selected to build the 3D-QSAR model on the basis of statistical parameters. The pharmacophore hypothesis produced a statistically significant QSAR model, with a correlation coefficient r² = 0.82 and a cross-validation correlation coefficient q² = 0.70. External validation shows high predictive power, with an r²(o) value of 0.88 and an r²(m) value of 0.58, supporting further in silico studies. Virtual screening identifies ZINC70450932 as the most promising lead; it interacts with HDAC1 residues Asp99, His178, Tyr204, Phe205 and Leu271, forming seven hydrogen bonds. A high docking score (-11.17 kcal/mol) and low docking energy (-37.84 kcal/mol) indicate the binding efficiency of the ligand. Binding free energy calculation was done using MM/GBSA to assess the affinity of ligands towards the protein. Density Functional Theory was employed to explore electronic features of the ligands describing the intramolecular charge transfer reaction. Molecular dynamics simulation studies at 50 ns display the metal ion (Zn)-ligand interaction, which is vital to inhibit the enzymatic activity of the protein. Copyright © 2016 Elsevier Inc. All rights reserved.
Yang, Zhihui; Luo, Shuang; Wei, Zongsu; Ye, Tiantian; Spinney, Richard; Chen, Dong; Xiao, Ruiyang
2016-04-01
The second-order rate constants (k) of hydroxyl radical (·OH) with polychlorinated biphenyls (PCBs) in the gas phase are of scientific and regulatory importance for assessing their global distribution and fate in the atmosphere. Due to the limited number of measured k values, there is a need to model the k values for unknown PCBs congeners. In the present study, we developed a quantitative structure-activity relationship (QSAR) model with quantum chemical descriptors using a sequential approach, including correlation analysis, principal component analysis, multi-linear regression, validation, and estimation of applicability domain. The result indicates that the single descriptor, polarizability (α), plays an important role in determining the reactivity with a global standardized function of lnk = -0.054 × α ‒ 19.49 at 298 K. In order to validate the QSAR predicted k values and expand the current k value database for PCBs congeners, an independent method, density functional theory (DFT), was employed to calculate the kinetics and thermodynamics of the gas-phase ·OH oxidation of 2,4',5-trichlorobiphenyl (PCB31), 2,2',4,4'-tetrachlorobiphenyl (PCB47), 2,3,4,5,6-pentachlorobiphenyl (PCB116), 3,3',4,4',5,5'-hexachlorobiphenyl (PCB169), and 2,3,3',4,5,5',6-heptachlorobiphenyl (PCB192) at 298 K at B3LYP/6-311++G**//B3LYP/6-31 + G** level of theory. The QSAR predicted and DFT calculated k values for ·OH oxidation of these PCB congeners exhibit excellent agreement with the experimental k values, indicating the robustness and predictive power of the single-descriptor based QSAR model we developed. Copyright © 2015 Elsevier Ltd. All rights reserved.
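As a small worked example, the single-descriptor relation quoted in the abstract, ln k = −0.054 × α − 19.49 at 298 K, can be evaluated directly in Python. The abstract does not state the units of α or k, so the numbers below are unit-agnostic, and the polarizability values are hypothetical inputs chosen only to show the trend.

```python
import math


def ln_rate_constant(alpha):
    """ln k from the relation quoted in the abstract (298 K)."""
    return -0.054 * alpha - 19.49


def rate_constant(alpha):
    """k obtained by exponentiating ln k; units as in the source study."""
    return math.exp(ln_rate_constant(alpha))


# Hypothetical polarizability values, not taken from the paper.
for alpha in (50.0, 100.0, 150.0):
    print(alpha, ln_rate_constant(alpha), rate_constant(alpha))
```

Because the coefficient of α is negative, the predicted rate constant decreases as polarizability increases.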
Lakhlili, Wiame; Yasri, Abdelaziz; Ibrahimi, Azeddine
2016-01-01
The discovery of clinically relevant inhibitors of mammalian target of rapamycin (mTOR) for anticancer therapy has proved to be a challenging task. The quantitative structure–activity relationship (QSAR) approach is a very useful and widespread technique for ligand-based drug design, which can be used to identify novel and potent mTOR inhibitors. In this study, we performed two-dimensional QSAR tests, and molecular docking validation tests of a series of mTOR ATP-competitive inhibitors to elucidate their structural properties associated with their activity. The QSAR tests were performed using partial least square method with a correlation coefficient of r2=0.799 and a cross-validation of q2=0.714. The chemical library screening was done by associating ligand-based to structure-based approach using the three-dimensional structure of mTOR developed by homology modeling. We were able to select 22 compounds from two databases as inhibitors of the mTOR kinase active site. We believe that the method and applications highlighted in this study will help future efforts toward the design of selective ATP-competitive inhibitors. PMID:27980424
Combinatorial Pharmacophore-Based 3D-QSAR Analysis and Virtual Screening of FGFR1 Inhibitors
Zhou, Nannan; Xu, Yuan; Liu, Xian; Wang, Yulan; Peng, Jianlong; Luo, Xiaomin; Zheng, Mingyue; Chen, Kaixian; Jiang, Hualiang
2015-01-01
The fibroblast growth factor/fibroblast growth factor receptor (FGF/FGFR) signaling pathway plays crucial roles in cell proliferation, angiogenesis, migration, and survival. Aberration in FGFRs correlates with several malignancies and disorders. FGFRs have proved to be attractive targets for therapeutic intervention in cancer, and it is of high interest to find FGFR inhibitors with novel scaffolds. In this study, a combinatorial three-dimensional quantitative structure-activity relationship (3D-QSAR) model was developed based on previously reported FGFR1 inhibitors with diverse structural skeletons. This model was evaluated for its prediction performance on a diverse test set containing 232 FGFR inhibitors, and it yielded a SD value of 0.75 pIC50 units from measured inhibition affinities and a Pearson’s correlation coefficient R2 of 0.53. This result suggests that the combinatorial 3D-QSAR model could be used to search for new FGFR1 hit structures and predict their potential activity. To further evaluate the performance of the model, a decoy set validation was used to measure the efficiency of the model by calculating EF (enrichment factor). Based on the combinatorial pharmacophore model, a virtual screening against SPECS database was performed. Nineteen novel active compounds were successfully identified, which provide new chemical starting points for further structural optimization of FGFR1 inhibitors. PMID:26110383
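The enrichment factor used in decoy set validation can be illustrated with a short Python sketch. It assumes the usual definition, EF = (actives in the selected top fraction / number selected) / (total actives / total compounds). The scores and labels are invented, and the function name is hypothetical, not part of any screening package.

```python
def enrichment_factor(scores, is_active, top_fraction):
    """Enrichment factor for the top-scoring fraction of a ranked library."""
    n_total = len(scores)
    total_actives = sum(1 for active in is_active if active)
    if total_actives == 0:
        raise ValueError("the library contains no known actives")
    n_selected = max(1, int(round(top_fraction * n_total)))
    # Rank compounds from highest to lowest model score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0],
                    reverse=True)
    hits = sum(1 for _, active in ranked[:n_selected] if active)
    return (hits / n_selected) / (total_actives / n_total)


# Toy library: 2 actives among 10 compounds, both ranked at the top.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
labels = [True, True, False, False, False, False, False, False, False, False]
ef_top20 = enrichment_factor(scores, labels, top_fraction=0.2)
print(ef_top20)
```

An enrichment factor of 1 corresponds to random selection, and values above 1 mean the model concentrates actives near the top of the ranking.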
Quantifying model uncertainty in seasonal Arctic sea-ice forecasts
NASA Astrophysics Data System (ADS)
Blanchard-Wrigglesworth, Edward; Barthélemy, Antoine; Chevallier, Matthieu; Cullather, Richard; Fučkar, Neven; Massonnet, François; Posey, Pamela; Wang, Wanqiu; Zhang, Jinlun; Ardilouze, Constantin; Bitz, Cecilia; Vernieres, Guillaume; Wallcraft, Alan; Wang, Muyin
2017-04-01
Dynamical model forecasts in the Sea Ice Outlook (SIO) of September Arctic sea-ice extent over the last decade have shown lower skill than that found in both idealized model experiments and hindcasts of previous decades. Additionally, it is unclear how different model physics, initial conditions or post-processing techniques contribute to SIO forecast uncertainty. In this work, we have produced a seasonal forecast of 2015 Arctic summer sea ice using SIO dynamical models initialized with identical sea-ice thickness in the central Arctic. Our goals are to calculate the relative contribution of model uncertainty and irreducible error growth to forecast uncertainty and assess the importance of post-processing, and to contrast pan-Arctic forecast uncertainty with regional forecast uncertainty. We find that prior to forecast post-processing, model uncertainty is the main contributor to forecast uncertainty, whereas after forecast post-processing forecast uncertainty is reduced overall, model uncertainty is reduced by an order of magnitude, and irreducible error growth becomes the main contributor to forecast uncertainty. While all models generally agree in their post-processed forecasts of September sea-ice volume and extent, this is not the case for sea-ice concentration. Additionally, forecast uncertainty of sea-ice thickness grows at a much higher rate along Arctic coastlines relative to the central Arctic ocean. Potential ways of offering spatial forecast information based on the timescale over which the forecast signal beats the noise are also explored.
NASA Astrophysics Data System (ADS)
Rivera, Diego; Rivas, Yessica; Godoy, Alex
2015-02-01
Hydrological models are simplified representations of natural processes and subject to errors. Uncertainty bounds are a commonly used way to assess the impact of an input or model architecture uncertainty in model outputs. Different sets of parameters could have equally robust goodness-of-fit indicators, which is known as Equifinality. We assessed the outputs from a lumped conceptual hydrological model to an agricultural watershed in central Chile under strong interannual variability (coefficient of variability of 25%) by using the Equifinality concept and uncertainty bounds. The simulation period ran from January 1999 to December 2006. Equifinality and uncertainty bounds from GLUE methodology (Generalized Likelihood Uncertainty Estimation) were used to identify parameter sets as potential representations of the system. The aim of this paper is to exploit the use of uncertainty bounds to differentiate behavioural parameter sets in a simple hydrological model. Then, we analyze the presence of equifinality in order to improve the identification of relevant hydrological processes. The water balance model for Chillan River exhibits, at a first stage, equifinality. However, it was possible to narrow the range for the parameters and eventually identify a set of parameters representing the behaviour of the watershed (a behavioural model) in agreement with observational and soft data (calculation of areal precipitation over the watershed using an isohyetal map). The mean width of the uncertainty bound around the predicted runoff for the simulation period decreased from 50 to 20 m3s-1 after fixing the parameter controlling the areal precipitation over the watershed. This decrement is equivalent to decreasing the ratio between simulated and observed discharge from 5.2 to 2.5. Despite the criticisms against the GLUE methodology, such as the lack of statistical formality, it is identified as a useful tool assisting the modeller with the identification of critical parameters.
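The GLUE workflow mentioned above (sample parameter sets, score each with an informal likelihood, keep the behavioural sets, derive uncertainty bounds) can be sketched in Python. This is a toy under stated assumptions, not the water balance model for the Chillan River: the one-parameter model, the uniform prior range, the Nash-Sutcliffe efficiency as likelihood, and the 0.5 behavioural threshold are all hypothetical choices.

```python
import random


def simulate(a, forcing):
    """Toy one-parameter model (hypothetical): output = a * forcing."""
    return [a * x for x in forcing]


def nash_sutcliffe(sim, obs):
    """Nash-Sutcliffe efficiency, used here as the informal likelihood."""
    mean_obs = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalised weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative / total >= q:
            return value
    return pairs[-1][0]


def glue_bounds(forcing, obs, n_samples=2000, threshold=0.5, seed=1):
    """Return behavioural parameter sets and 5-95% prediction bounds."""
    rng = random.Random(seed)
    behavioural = []
    for _ in range(n_samples):
        a = rng.uniform(0.0, 4.0)  # assumed prior range
        score = nash_sutcliffe(simulate(a, forcing), obs)
        if score > threshold:      # keep only behavioural sets
            behavioural.append((a, score))
    sims = [simulate(a, forcing) for a, _ in behavioural]
    weights = [score for _, score in behavioural]
    steps = range(len(forcing))
    lower = [weighted_quantile([s[t] for s in sims], weights, 0.05)
             for t in steps]
    upper = [weighted_quantile([s[t] for s in sims], weights, 0.95)
             for t in steps]
    return behavioural, lower, upper


forcing = [float(i) for i in range(1, 11)]
obs = simulate(2.0, forcing)  # synthetic observations with a = 2
behavioural, lower, upper = glue_bounds(forcing, obs)
print(len(behavioural), lower[-1], upper[-1])
```

Many different values of `a` pass the threshold, which is the equifinality the abstract refers to; tightening the threshold or fixing a parameter from independent data narrows the bounds.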
Trapped Radiation Model Uncertainties: Model-Data and Model-Model Comparisons
NASA Technical Reports Server (NTRS)
Armstrong, T. W.; Colborn, B. L.
2000-01-01
The standard AP8 and AE8 models for predicting trapped proton and electron environments have been compared with several sets of flight data to evaluate model uncertainties. Model comparisons are made with flux and dose measurements made on various U.S. low-Earth orbit satellites (APEX, CRRES, DMSP, LDEF, NOAA) and Space Shuttle flights, on Russian satellites (Photon-8, Cosmos-1887, Cosmos-2044), and on the Russian Mir Space Station. This report gives the details of the model-data comparisons; summary results in terms of empirical model uncertainty factors that can be applied for spacecraft design applications are given in a companion report. The results of model-model comparisons are also presented from standard AP8 and AE8 model predictions compared with the European Space Agency versions of AP8 and AE8 and with Russian trapped radiation models.
Kaiser, K L E
2007-01-01
This presentation will review the evolution of the workshops from a scientific and personal perspective. From their modest beginning in 1983, the workshops have developed into larger international meetings, regularly held every two years. Their initial focus on the aquatic sphere soon expanded to include properties and effects on atmospheric and terrestrial species, including man. Concurrent with this broadening of their scientific scope, the workshops have become an important forum for the early dissemination of all aspects of qualitative and quantitative structure-activity research in ecotoxicology and human health effects. Over the last few decades, the field of quantitative structure/activity relationships (QSARs) has quickly emerged as a major scientific method in understanding the properties and effects of chemicals on the environment and human health. From substances that only affect cell membranes to those that bind strongly to a specific enzyme, QSARs provide insight into the biological effects and chemical and physical properties of substances. QSARs are useful for delineating the quantitative changes in biological effects resulting from minor but systematic variations of the structure of a compound with a specific mode of action. In addition, more holistic approaches are being devised that result in our ability to predict the effects of structurally unrelated compounds with (potentially) different modes of action. Research in QSAR environmental toxicology has led to many improvements in the manufacturing, use, and disposal of chemicals. Furthermore, it has led to national policies and international agreements, from use restrictions or outright bans of compounds, such as polychlorinated biphenyls (PCBs), mirex, and highly chlorinated pesticides (e.g. DDT, dieldrin) for the protection of avian predators, to alternatives for ozone-depleting compounds, to better waste treatment systems, to more powerful and specific acting drugs. Most of the recent advances
Quantifying radar-rainfall uncertainties in urban drainage flow modelling
NASA Astrophysics Data System (ADS)
Rico-Ramirez, M. A.; Liguori, S.; Schellart, A. N. A.
2015-09-01
This work presents the results of the implementation of a probabilistic system to model the uncertainty associated with radar rainfall (RR) estimates and the way this uncertainty propagates through the sewer system of an urban area located in the North of England. The spatial and temporal correlations of the RR errors as well as the error covariance matrix were computed to build an RR error model able to generate RR ensembles that reproduce the uncertainty associated with the measured rainfall. The results showed that the RR ensembles provide important information about the uncertainty in the rainfall measurement that can be propagated in the urban sewer system. They also showed that the measured flow peaks and flow volumes are often bounded within the uncertainty area produced by the RR ensembles. In 55% of the simulated events, the uncertainties in RR measurements can explain the uncertainties observed in the simulated flow volumes. However, there are also some events where the RR uncertainty cannot explain the whole uncertainty observed in the simulated flow volumes, indicating that there are additional sources of uncertainty that must be considered, such as the uncertainty in the urban drainage model structure, the uncertainty in the urban drainage model calibrated parameters, and the uncertainty in the measured sewer flows.
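Generating ensemble members from an error covariance matrix can be sketched in Python with a Cholesky factorisation. This is an illustration under stated assumptions, not the authors' error model: errors are taken to be additive and Gaussian, perturbed rainfall is truncated at zero, and the covariance matrix and rainfall values are invented. All names are hypothetical.

```python
import random


def cholesky(matrix):
    """Lower-triangular factor L of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = (matrix[i][i] - partial) ** 0.5
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def rainfall_ensemble(measured, error_cov, n_members, seed=0):
    """Perturb measured rainfall with correlated Gaussian errors."""
    rng = random.Random(seed)
    lower = cholesky(error_cov)
    n = len(measured)
    members = []
    for _ in range(n_members):
        # Independent standard normal draws, then correlate them via L z.
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        errors = [sum(lower[i][k] * z[k] for k in range(i + 1))
                  for i in range(n)]
        # Assumption: additive errors, with rainfall truncated at zero.
        members.append([max(0.0, r + e) for r, e in zip(measured, errors)])
    return members


# Toy example: two radar pixels with correlated errors (values invented).
cov = [[1.0, 0.5], [0.5, 1.0]]
factor = cholesky(cov)
members = rainfall_ensemble([3.0, 4.0], cov, n_members=20)
print(factor, members[0])
```

Each ensemble member would then be run through the drainage model, so that the spread of simulated flows reflects the rainfall uncertainty.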
Scior, Thomas; Lozano-Aponte, Jorge; Ajmani, Subhash; Hernández-Montero, Eduardo; Chávez-Silva, Fabiola; Hernández-Núñez, Emanuel; Moo-Puc, Rosa; Fraguela-Collar, Andres; Navarrete-Vázquez, Gabriel
2015-01-01
In view of the serious health problems concerning infectious diseases in heavily populated areas, we followed the strategy of lead compound diversification to evaluate the near-by chemical space for new organic compounds. To this end, twenty derivatives of nitazoxanide (NTZ) were synthesized and tested for activity against Entamoeba histolytica parasites. To ensure drug-likeness and activity relatedness of the new compounds, the synthetic work was assisted by a quantitative structure-activity relationships (QSAR) study. We circumvented many of the inherent downsides, which are well known to QSAR practitioners, by using workarounds that we proposed in a prior QSAR publication. To gain further mechanistic insight on a molecular level, ligand-enzyme docking simulations were carried out, since NTZ is known to inhibit the protozoal pyruvate ferredoxin oxidoreductase (PFOR) enzyme as its biomolecular target. PMID:25872791
Dolezal, Rafael; Korabecny, Jan; Malinak, David; Honegr, Jan; Musilek, Kamil; Kuca, Kamil
2015-03-01
To predict unknown reactivation potencies of 12 mono- and bis-pyridinium aldoximes for VX-inhibited rat acetylcholinesterase (rAChE), three-dimensional quantitative structure-activity relationship (3D QSAR) analysis has been carried out. Utilizing molecular interaction fields (MIFs) calculated by molecular mechanical (MMFF94) and quantum chemical (B3LYP/6-31G*) methods, two satisfactory ligand-based CoMFA models have been developed: 1. R(2)=0.9989, Q(LOO)(2)=0.9090, Q(LTO)(2)=0.8921, Q(LMO(20%))(2)=0.8853, R(ext)(2)=0.9259, SDEP(ext)=6.8938; 2. R(2)=0.9962, Q(LOO)(2)=0.9368, Q(LTO)(2)=0.9298, Q(LMO(20%))(2)=0.9248, R(ext)(2)=0.8905, SDEP(ext)=6.6756. High statistical significance of the 3D QSAR models has been achieved through the application of several data noise reduction techniques (i.e. smart region definition SRD, fractional factor design FFD, uninformative/iterative variable elimination UVE/IVE) on the original MIFs. Besides the ligand-based CoMFA models, an alignment molecular set constructed by flexible molecular docking has been also studied. The contour maps as well as the predicted reactivation potencies resulting from 3D QSAR analyses help better understand which structural features are associated with increased reactivation potency of studied compounds. Copyright © 2014 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Norinder, Ulf
1990-12-01
An experimental design based 3-D QSAR analysis using a combination of principal component and PLS analysis is presented and applied to human corticosteroid-binding globulin complexes. The predictive capability of the created model is good. The technique can also be used as guidance when selecting new compounds to be investigated.
Jardínez, Christiaan; Vela, Alberto; Cruz-Borbolla, Julián; Alvarez-Mendez, Rodrigo J; Alvarado-Rodríguez, José G
2016-12-01
The relationship between the chemical structure and biological activity (log IC50) of 40 derivatives of 1,4-dihydropyridines (DHPs) was studied using density functional theory (DFT) and multiple linear regression analysis methods. With the aim of improving the quantitative structure-activity relationship (QSAR) model, the reduced density gradient s(r) of the optimized equilibrium geometries was used as a descriptor to include weak non-covalent interactions. The QSAR model highlights the correlation between log IC50 and the highest occupied molecular orbital energy (E_HOMO), molecular volume (V), partition coefficient (log P), non-covalent interactions NCI(H4-G) and the dual descriptor [Δf(r)]. The model yielded values of R² = 79.57 and Q² = 69.67, which were validated by four internal validations (DK = 0.076, DQ = -0.006, RP = 0.056, and RN = 0.000) and the external validation Q²boot = 64.26. The resulting QSAR model can be used to estimate, with high reliability, the biological activity of new compounds based on a DHP series. Graphical abstract: The good correlation between log IC50 and the NCI(H4-G) estimated by the reduced density gradient approach for the DHP derivatives.
An uncertainty analysis of wildfire modeling [Chapter 13]
Karin Riley; Matthew Thompson
2017-01-01
Before fire models can be understood, evaluated, and effectively applied to support decision making, model-based uncertainties must be analyzed. In this chapter, we identify and classify sources of uncertainty using an established analytical framework, and summarize results graphically in an uncertainty matrix. Our analysis facilitates characterization of the...
Bayesian models for comparative analysis integrating phylogenetic uncertainty.
de Villemereuil, Pierre; Wells, Jessie A; Edwards, Robert D; Blomberg, Simon P
2012-06-28
Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses
Bayesian models for comparative analysis integrating phylogenetic uncertainty
2012-01-01
Background: Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods: We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results: We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions: Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for
Neural network-based QSAR and insecticide discovery: spinetoram
NASA Astrophysics Data System (ADS)
Sparks, Thomas C.; Crouse, Gary D.; Dripps, James E.; Anzeveno, Peter; Martynow, Jacek; DeAmicis, Carl V.; Gifford, James
2008-06-01
Improving the efficacy and spectrum of the spinosyns, a novel class of fermentation-derived insecticides, has long been a goal within Dow AgroSciences. Because the spinosyns are large and complex fermentation products, identifying specific modifications likely to result in improved activity was a difficult process, since most modifications decreased the activity. A variety of approaches were investigated to identify new synthetic directions for the spinosyn chemistry, including several explorations of the quantitative structure-activity relationships (QSAR) of spinosyns, which initially were unsuccessful. However, application of artificial neural networks (ANN) to the spinosyn QSAR problem identified new directions for improved activity in the chemistry, which subsequent synthesis and testing confirmed. The ANN-based analogs, coupled with other information on substitution effects resulting from spinosyn structure-activity relationships, led to the discovery of spinetoram (XDE-175). Launched in late 2007, spinetoram provides both improved efficacy and an expanded spectrum while maintaining the exceptional environmental and toxicological profile already established for the spinosyn chemistry.
Uncertainty and sensitivity analysis for photovoltaic system modeling.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hansen, Clifford W.; Pohl, Andrew Phillip; Jordan, Dirk
2013-12-01
We report an uncertainty and sensitivity analysis for modeling DC energy from photovoltaic systems. We consider two systems, each comprising a single module using either crystalline silicon or CdTe cells, and located either at Albuquerque, NM, or Golden, CO. Output from a PV system is predicted by a sequence of models. Uncertainty in the output of each model is quantified by empirical distributions of each model's residuals. We sample these distributions to propagate uncertainty through the sequence of models to obtain an empirical distribution for each PV system's output. We considered models that: (1) translate measured global horizontal, direct and global diffuse irradiance to plane-of-array (POA) irradiance; (2) estimate effective irradiance from plane-of-array irradiance; (3) predict cell temperature; and (4) estimate DC voltage, current and power. We found the uncertainty in PV system output to be relatively small, on the order of 1% for daily energy. Four alternative models were considered for the POA irradiance modeling step; we did not find the choice of one of these models to be of great significance. However, we observed that the POA irradiance model introduced a bias of upwards of 5% of daily energy, which translates directly to a systematic difference in predicted energy. Sensitivity analyses relate uncertainty in the PV system output to uncertainty arising from each model. We found the residuals arising from the POA irradiance and the effective irradiance models to be the dominant contributors to residuals for daily energy, for either technology or location considered. This analysis indicates that efforts to reduce the uncertainty in PV system output should focus on improvements to the POA and effective irradiance models.
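The propagation step described in this abstract, sampling each model's empirical residual distribution and carrying the draws through the model chain, can be sketched as follows. This is a simplified illustration, not the report's implementation. It assumes that each step's residuals are expressed as relative errors and act multiplicatively on a single nominal output, and the name `propagate` and its parameters are hypothetical.

```python
import numpy as np

def propagate(nominal, residual_sets, n_draws=10000, seed=0):
    """Propagate uncertainty through a chain of models by resampling residuals.

    nominal       -- nominal output of the full chain (e.g. daily energy)
    residual_sets -- list of 1-D arrays of relative residuals, one per model step
    Returns an array of n_draws sampled outputs (an empirical distribution).
    """
    rng = np.random.default_rng(seed)
    out = np.full(n_draws, float(nominal))
    for res in residual_sets:
        # Bootstrap-style draw from this step's empirical residual distribution.
        draws = rng.choice(np.asarray(res, dtype=float), size=n_draws, replace=True)
        # Assumption: relative errors of successive steps combine multiplicatively.
        out *= 1.0 + draws
    return out
```

The spread of the returned array (for example its standard deviation divided by `nominal`) gives a relative uncertainty for the chain, and repeating the call with one residual set at a time indicates which step contributes most.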
Photovoltaic System Modeling. Uncertainty and Sensitivity Analyses
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hansen, Clifford W.; Martin, Curtis E.
2015-08-01
We report an uncertainty and sensitivity analysis for modeling AC energy from photovoltaic systems. Output from a PV system is predicted by a sequence of models. We quantify uncertainty in the output of each model using empirical distributions of each model's residuals. We propagate uncertainty through the sequence of models by sampling these distributions to obtain an empirical distribution of a PV system's output. We consider models that: (1) translate measured global horizontal, direct and global diffuse irradiance to plane-of-array (POA) irradiance; (2) estimate effective irradiance; (3) predict cell temperature; (4) estimate DC voltage, current and power; (5) reduce DC power for losses due to inefficient maximum power point tracking or mismatch among modules; and (6) convert DC to AC power. Our analysis considers a notional PV system comprising an array of FirstSolar FS-387 modules and a 250 kW AC inverter; we use measured irradiance and weather at Albuquerque, NM. We found the uncertainty in PV system output to be relatively small, on the order of 1% for daily energy. We found uncertainty in the models for POA irradiance and effective irradiance to be the dominant contributors to uncertainty in predicted daily energy. Our analysis indicates that efforts to reduce the uncertainty in PV system output predictions may yield the greatest improvements by focusing on the POA and effective irradiance models.
In general, the accuracy of a predicted toxicity value increases with increase in similarity between the query chemical and the chemicals used to develop a QSAR model. A toxicity estimation methodology employing this finding has been developed. A hierarchical based clustering t...
Realising the Uncertainty Enabled Model Web
NASA Astrophysics Data System (ADS)
Cornford, D.; Bastin, L.; Pebesma, E. J.; Williams, M.; Stasch, C.; Jones, R.; Gerharz, L.
2012-12-01
The FP7 funded UncertWeb project aims to create the "uncertainty enabled model web". The central concept here is that geospatial models and data resources are exposed via standard web service interfaces, such as the Open Geospatial Consortium (OGC) suite of encodings and interface standards, allowing the creation of complex workflows combining both data and models. The focus of UncertWeb is on the issue of managing uncertainty in such workflows, and providing the standards, architecture, tools and software support necessary to realise the "uncertainty enabled model web". In this paper we summarise the developments in the first two years of UncertWeb, illustrating several key points with examples taken from the use case requirements that motivate the project. Firstly we address the issue of encoding specifications. We explain the usage of UncertML 2.0, a flexible encoding for representing uncertainty based on a probabilistic approach. This is designed to be used within existing standards such as Observations and Measurements (O&M) and data quality elements of ISO19115 / 19139 (geographic information metadata and encoding specifications) as well as more broadly outside the OGC domain. We show profiles of O&M that have been developed within UncertWeb and how UncertML 2.0 is used within these. We also show encodings based on NetCDF and discuss possible future directions for encodings in JSON. We then discuss the issues of workflow construction, considering discovery of resources (both data and models). We discuss why a brokering approach to service composition is necessary in a world where the web service interfaces remain relatively heterogeneous, including many non-OGC approaches, in particular the more mainstream SOAP and WSDL approaches. We discuss the trade-offs between delegating uncertainty management functions to the service interfaces themselves and integrating the functions in the workflow management system. We describe two utility services to address
Zhang, Wen; Qiu, Kai-Xiong; Yu, Fang; Xie, Xiao-Guang; Zhang, Shu-Qun; Chen, Ya-Juan; Xie, Hui-Ding
2017-10-01
B-Raf kinase has been identified as an important target in recent cancer treatment. In order to discover structurally diverse and novel B-Raf inhibitors (BRIs), a virtual screening of BRIs against the ZINC database was performed in this work using a combination of pharmacophore modelling, molecular docking, a 3D-QSAR model and binding free energy (ΔG_bind) calculations. After the virtual screening, six promising hit compounds were obtained, which were then tested for inhibitory activity against A375 cell lines. As a result, five hit compounds showed good biological activity (IC50 < 50 μM). The present method of virtual screening can be applied to find structurally diverse inhibitors, and the five structurally diverse compounds obtained are expected to aid the development of novel BRIs. Copyright © 2017. Published by Elsevier Ltd.
Development of a Prototype Model-Form Uncertainty Knowledge Base
NASA Technical Reports Server (NTRS)
Green, Lawrence L.
2016-01-01
Uncertainties are generally classified as either aleatory or epistemic. Aleatory uncertainties are those attributed to random variation, either naturally or through manufacturing processes. Epistemic uncertainties are generally attributed to a lack of knowledge. One type of epistemic uncertainty is called model-form uncertainty. The term model-form means that among the choices to be made during a design process within an analysis, there are different forms of the analysis process, which each give different results for the same configuration at the same flight conditions. Examples of model-form uncertainties include the grid density, grid type, and solver type used within a computational fluid dynamics code, or the choice of the number and type of model elements within a structures analysis. The objectives of this work are to identify and quantify a representative set of model-form uncertainties and to make this information available to designers through an interactive knowledge base (KB). The KB can then be used during probabilistic design sessions, so as to enable the possible reduction of uncertainties in the design process through resource investment. An extensive literature search has been conducted to identify and quantify typical model-form uncertainties present within aerospace design. An initial attempt has been made to assemble the results of this literature search into a searchable KB, usable in real time during probabilistic design sessions. A concept of operations and the basic structure of a model-form uncertainty KB are described. Key operations within the KB are illustrated. Current limitations in the KB, and possible workarounds are explained.
Agrawal, Vijay K; Sharma, Ruchi; Khadikar, Padmakar V
2002-09-01
QSAR studies on modelling of biological activity (hCAI) for a series of ureido and thioureido derivatives of aromatic/heterocyclic sulfonamides have been made using a pool of topological indices. Regression analysis of the data showed that excellent results were obtained in multiparametric correlations upon introduction of indicator parameters. The predictive abilities of the models are discussed using cross-validation parameters.
Assessing Uncertainty in Risk Assessment Models (BOSC CSS meeting)
In vitro assays are increasingly being used in risk assessments. Uncertainty in assays leads to uncertainty in models used for risk assessments. This poster assesses uncertainty in the ER and AR models.
Model-specification uncertainty in future forest pest outbreak.
Boulanger, Yan; Gray, David R; Cooke, Barry J; De Grandpré, Louis
2016-04-01
Climate change will modify forest pest outbreak characteristics, although there are disagreements regarding the specifics of these changes. A large part of this variability may be attributed to model specifications. As a case study, we developed a consensus model predicting spruce budworm (SBW, Choristoneura fumiferana [Clem.]) outbreak duration using two different predictor data sets and six different correlative methods. The model was used to project outbreak duration and the uncertainty associated with using different data sets and correlative methods (=model-specification uncertainty) for 2011-2040, 2041-2070 and 2071-2100, according to three forcing scenarios (RCP 2.6, RCP 4.5 and RCP 8.5). The consensus model showed very high explanatory power and low bias. The model projected a larger northward shift and decrease in outbreak duration under the RCP 8.5 scenario. However, variation in single-model projections increases with time, making future projections highly uncertain. Notably, the magnitude of the shifts in northward expansion, overall outbreak duration and the patterns of outbreak duration at the southern edge were highly variable according to the predictor data set and correlative method used. We also demonstrated that variation in forcing scenarios contributed only slightly to the uncertainty of model projections compared with the two sources of model-specification uncertainty. Our approach helped to quantify model-specification uncertainty in future forest pest outbreak characteristics. It may contribute to sounder decision-making by acknowledging the limits of the projections and help to identify areas where model-specification uncertainty is high. As such, we further stress that this uncertainty should be strongly considered when making forest management plans, notably by adopting adaptive management strategies so as to reduce future risks. © 2015 Her Majesty the Queen in Right of Canada Global Change Biology © 2015 Published by John
Gomes, Marcelo N; Braga, Rodolpho C; Grzelak, Edyta M; Neves, Bruno J; Muratov, Eugene; Ma, Rui; Klein, Larry L; Cho, Sanghyun; Oliveira, Guilherme R; Franzblau, Scott G; Andrade, Carolina Horta
2017-09-08
New anti-tuberculosis (anti-TB) drugs are urgently needed to battle drug-resistant Mycobacterium tuberculosis strains and to shorten the current 6-12-month treatment regimen. In this work, we have continued the efforts to develop chalcone-based anti-TB compounds by using an in silico design and QSAR-driven approach. Initially, we developed SAR rules and binary QSAR models using literature data for targeted design of new heteroaryl chalcone compounds with anti-TB activity. Using these models, we prioritized 33 compounds for synthesis and biological evaluation. As a result, 10 heteroaryl chalcone compounds (4, 8, 9, 11, 13, 17-20, and 23) were found to exhibit nanomolar activity against replicating mycobacteria, low micromolar activity against nonreplicating bacteria, and nanomolar and micromolar activity against rifampin (RMP) and isoniazid (INH) monoresistant strains (rRMP and rINH) (<1 μM and <10 μM, respectively). The series also shows low activity against commensal bacteria and generally good selectivity toward M. tuberculosis, with very low cytotoxicity against Vero cells (SI = 11-545). Our results suggest that our designed heteroaryl chalcone compounds, due to their high potency and selectivity, are promising anti-TB agents. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Visual analytics in cheminformatics: user-supervised descriptor selection for QSAR methods.
Martínez, María Jimena; Ponzoni, Ignacio; Díaz, Mónica F; Vazquez, Gustavo E; Soto, Axel J
2015-01-01
The design of QSAR/QSPR models is a challenging problem, where the selection of the most relevant descriptors constitutes a key step of the process. Several feature selection methods that address this step concentrate on statistical associations among descriptors and target properties, whereas chemical knowledge is left out of the analysis. For this reason, the interpretability and generality of the QSAR/QSPR models obtained by these feature selection methods are drastically affected. Therefore, an approach for integrating the domain expert's knowledge into the selection process is needed to increase confidence in the final set of descriptors. In this paper we propose a software tool, named Visual and Interactive DEscriptor ANalysis (VIDEAN), that combines statistical methods with interactive visualizations for choosing a set of descriptors for predicting a target property. Domain expertise can be added to the feature selection process by means of an interactive visual exploration of data, aided by statistical tools and metrics based on information theory. Coordinated visual representations are presented for capturing different relationships and interactions among descriptors, target properties and candidate subsets of descriptors. The competencies of the proposed software were assessed through different scenarios. These scenarios reveal how an expert can use this tool to choose one subset of descriptors from a group of candidate subsets or how to modify existing descriptor subsets and even incorporate new descriptors according to his or her own knowledge of the target property. The reported experiences showed the suitability of our software for selecting sets of descriptors with low cardinality, high interpretability, low redundancy and high statistical performance in a visual exploratory way. Therefore, it is possible to conclude that the resulting tool allows the integration of a chemist's expertise in the descriptor selection process with
Management of California Oak Woodlands: Uncertainties and Modeling
Jay E. Noel; Richard P. Thompson
1995-01-01
A mathematical policy model of oak woodlands is presented. The model illustrates the policy uncertainties that exist in the management of oak woodlands. These uncertainties include: (1) selection of a policy criterion function, (2) woodland dynamics, (3) initial and final state of the woodland stock. The paper provides a review of each of the uncertainty issues. The...
Qian, Haiyan; Chen, Jiongjiong; Pan, Youlu; Chen, Jianzhong
2016-09-19
11β-Hydroxysteroid dehydrogenase type 1 (11β-HSD1) is a potential target for the treatment of numerous human disorders, such as diabetes, obesity, and metabolic syndrome. In this work, molecular modeling studies combining molecular docking, 3D-QSAR, MESP, MD simulations and free energy calculations were performed on pyridine amides and 1,2,4-triazolopyridines as 11β-HSD1 inhibitors to explore structure-activity relationships and the structural requirements for inhibitory activity. 3D-QSAR models, including CoMFA and CoMSIA, were developed from the conformations obtained by a docking strategy. The derived pharmacophoric features were further supported by MESP and Mulliken charge analyses using density functional theory. In addition, MD simulations and free energy calculations were employed to determine the detailed binding process and to compare the binding modes of inhibitors with different bioactivities. The binding free energies calculated by MM/PBSA showed a good correlation with the experimental biological activities. Free energy analyses and per-residue energy decomposition indicated that the van der Waals interaction is the major driving force for the interactions between an inhibitor and 11β-HSD1. Taken together, these results suggest that hydrogen bond interactions with Ser170 and Tyr183 are favorable for enhancing activity. Thr124, Ser170, Tyr177, Tyr183, Val227, and Val231 are the key amino acid residues in the binding pocket. The obtained results are expected to be valuable for the rational design of novel potent 11β-HSD1 inhibitors.
Shock Layer Radiation Modeling and Uncertainty for Mars Entry
NASA Technical Reports Server (NTRS)
Johnston, Christopher O.; Brandis, Aaron M.; Sutton, Kenneth
2012-01-01
A model for simulating nonequilibrium radiation from Mars entry shock layers is presented. A new chemical kinetic rate model is developed that provides good agreement with recent EAST and X2 shock tube radiation measurements. This model includes a CO dissociation rate that is a factor of 13 larger than the rate used widely in previous models. Uncertainties in the proposed rates are assessed along with uncertainties in translational-vibrational relaxation modeling parameters. The stagnation point radiative flux uncertainty due to these flowfield modeling parameter uncertainties is computed to vary from 50 to 200% for a range of free-stream conditions, with densities ranging from 5e-5 to 5e-4 kg/m³ and velocities ranging from 6.3 to 7.7 km/s. These conditions cover the range of anticipated peak radiative heating conditions for proposed hypersonic inflatable aerodynamic decelerators (HIADs). Modeling parameters for the radiative spectrum are compiled along with a non-Boltzmann rate model for the dominant radiating molecules, CO, CN, and C2. A method for treating non-local absorption in the non-Boltzmann model is developed, which is shown to result in up to a 50% increase in the radiative flux through absorption by the CO 4th Positive band. The sensitivity of the radiative flux to the radiation modeling parameters is presented and the uncertainty for each parameter is assessed. The stagnation point radiative flux uncertainty due to these radiation modeling parameter uncertainties is computed to vary from 18 to 167% for the considered range of free-stream conditions. The total radiative flux uncertainty is computed as the root sum square of the flowfield and radiation parametric uncertainties, which results in total uncertainties ranging from 50 to 260%. The main contributors to these significant uncertainties are the CO dissociation rate and the CO heavy-particle excitation rates. Applying the baseline flowfield and radiation models developed in this work, the
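The root-sum-square combination mentioned in this abstract can be checked with a short sketch. Pairing the endpoints of the two quoted ranges is an assumption made here for illustration only: the abstract does not state that the flowfield and radiation extremes occur at the same free-stream condition. The helper name `root_sum_square` is hypothetical.

```python
import math

def root_sum_square(*components):
    """Combine independent uncertainty components (same units, e.g. percent)."""
    return math.sqrt(sum(c * c for c in components))

# Assumed pairing of the quoted range endpoints (flowfield %, radiation %).
low = root_sum_square(50.0, 18.0)     # lower ends of the two ranges
high = root_sum_square(200.0, 167.0)  # upper ends of the two ranges
print(f"combined uncertainty: {low:.1f}% to {high:.1f}%")
```

Under this pairing the upper value comes out close to the 260% total quoted in the abstract, which is consistent with a root-sum-square combination of the two parametric contributions.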
Da, Chenxiao; Mooberry, Susan L.; Gupton, John T.; Kellogg, Glen E.
2013-01-01
αβ-tubulin colchicine site inhibitors (CSIs) from four scaffolds that we previously tested for antiproliferative activity were modeled to better understand their effect on microtubules. Docking models, constructed by exploiting the SAR of a pyrrole subset and HINT scoring, guided ensemble docking of all 59 compounds. This conformation set and two variants having progressively less structure knowledge were subjected to CoMFA, CoMFA+HINT, and CoMSIA 3D-QSAR analyses. The CoMFA+HINT model (docked alignment) showed the best statistics: leave-one-out q² of 0.616, r² of 0.949 and r²pred (internal test set) of 0.755. An external collection (tested in other laboratories) of 24 CSIs from eight scaffolds was evaluated with the 3D-QSAR models, which correctly ranked their activity trends in 7/8 scaffolds for CoMFA+HINT (8/8 for CoMFA). The combination of SAR, ensemble docking, hydropathic analysis and 3D-QSAR provides an atomic-scale colchicine site model more consistent with a target structure resolution much higher than the ~3.6 Å available for αβ-tubulin. PMID:23961916
Exploration of Uncertainty in Glacier Modelling
NASA Technical Reports Server (NTRS)
Thompson, David E.
1999-01-01
There are procedures and methods for verification of coding algebra and for validations of models and calculations that are in use in the aerospace computational fluid dynamics (CFD) community. These methods would be efficacious if used by the glacier dynamics modelling community. This paper is a presentation of some of those methods, and how they might be applied to uncertainty management supporting code verification and model validation for glacier dynamics. The similarities and differences between their use in CFD analysis and the proposed application of these methods to glacier modelling are discussed. After establishing sources of uncertainty and methods for code verification, the paper looks at a representative sampling of verification and validation efforts that are underway in the glacier modelling community, and establishes a context for these within overall solution quality assessment. Finally, an information architecture and interactive interface are introduced and advocated. This Integrated Cryospheric Exploration (ICE) Environment is proposed for exploring and managing sources of uncertainty in glacier modelling codes and methods, and for supporting scientific numerical exploration and verification. The details and functionality of this Environment are described based on modifications of a system already developed for CFD modelling and analysis.
QSAR models for degradation of organic pollutants in ozonation process under acidic condition.
Zhu, Huicen; Guo, Weimin; Shen, Zhemin; Tang, Qingli; Ji, Wenchao; Jia, Lijuan
2015-01-01
Although some research on the degradation of organic pollutants has been carried out in recent years, reaction rate constants are available only for homologous compounds with similar structures or components. Therefore, it is of great significance to find a universal relationship between reaction rate and certain parameters of diverse organic pollutants. In this study, the removal ratio and kinetics of 33 kinds of organic substances, including azo dyes, heterocyclic compounds, ionic compounds and so on, were investigated in the ozonation process. Most quantum chemical parameters, including μ, q(H+), q(C)min, q(C)max, E_LUMO and E_HOMO, were calculated using Gaussian 09 at the DFT B3LYP/6-311G level. Other descriptors, bond order (BO) as well as Fukui indices (f(+), f(-) and f(0)), were calculated with Material Studio 6.1 at the Dmol(3)/GGA-BLYP/DNP(3.5) level for each organic compound. The recommended model for predicting rate constants was ln k' = 1.978 - 95.484 f(0)x - 3.350 q(C)min + 38.221 f(+)x, which had a squared regression coefficient R² = 0.763 and standard deviation SD = 0.716. The results of the t test and the Fisher test suggested that the model exhibited optimum stability. The model was also validated by internal and external validations. The recommended QSAR model showed that the highest f(0) value of main-chain carbons (f(0)x) is more closely related to ln k' than the other quantum descriptors. Copyright © 2014 Elsevier Ltd. All rights reserved.
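The recommended regression model quoted in this abstract can be evaluated directly once the three descriptors are known. The sketch below only encodes the published coefficients. The function and argument names are hypothetical, and reading f(+)x as the main-chain maximum of f(+), by analogy with the stated definition of f(0)x, is an assumption. The descriptor values used in the example are invented for illustration and do not correspond to any compound in the study.

```python
import math

def ln_k_prime(f0_x, qc_min, f_plus_x):
    """ln k' from the quoted QSAR model.

    f0_x     -- highest f(0) Fukui index among main-chain carbons
    qc_min   -- q(C)min, the minimum carbon charge descriptor
    f_plus_x -- f(+)x (assumed: highest f(+) among main-chain carbons)
    """
    return 1.978 - 95.484 * f0_x - 3.350 * qc_min + 38.221 * f_plus_x

# Hypothetical descriptor values, for illustration only.
value = ln_k_prime(f0_x=0.01, qc_min=-0.1, f_plus_x=0.02)
rate_constant = math.exp(value)  # k' on the scale used by the model
```

Given the reported R² of 0.763 and SD of 0.716 in ln k', any single prediction from this equation carries a substantial error band and is best read as a screening estimate.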
Our study assesses the value of both in vitro assay and quantitative structure activity relationship (QSAR) data in predicting in vivo toxicity using numerous statistical models and approaches to process the data. Our models are built on datasets of (i) 586 chemicals for which bo...
Escher, Beate I; Baumer, Andreas; Bittermann, Kai; Henneberger, Luise; König, Maria; Kühnert, Christin; Klüver, Nils
2017-03-22
The Microtox assay, a bioluminescence inhibition assay with the marine bacterium Aliivibrio fischeri, is one of the most popular bioassays for assessing the cytotoxicity of organic chemicals, mixtures and environmental samples. Most environmental chemicals act as baseline toxicants in this short-term screening assay, which is typically run with only 30 min of exposure duration. Numerous Quantitative Structure-Activity Relationships (QSARs) exist for the Microtox assay for nonpolar and polar narcosis. However, typical water pollutants, which have highly diverse structures covering a wide range of hydrophobicity and speciation from neutral to anionic and cationic, are often outside the applicability domain of these QSARs. To include all types of environmentally relevant organic pollutants we developed a general baseline toxicity QSAR using liposome-water distribution ratios as descriptors. Previous limitations in availability of experimental liposome-water partition constants were overcome by reliable prediction models based on polyparameter linear free energy relationships for neutral chemicals and the COSMOmic model for charged chemicals. With this QSAR and targeted mixture experiments we could demonstrate that ionisable chemicals fall in the applicability domain. Most investigated water pollutants acted as baseline toxicants in this bioassay, with the few outliers identified as uncouplers or reactive toxicants. The main limitation of the Microtox assay is that chemicals with a high melting point and/or high hydrophobicity were outside of the applicability domain because of their low water solubility. We quantitatively derived a solubility cut-off but also demonstrated with mixture experiments that chemicals inactive on their own can contribute to mixture toxicity, which is highly relevant for complex environmental mixtures, where these chemicals may be present at concentrations below the solubility cut-off.
Development of the X-33 Aerodynamic Uncertainty Model
NASA Technical Reports Server (NTRS)
Cobleigh, Brent R.
1998-01-01
An aerodynamic uncertainty model for the X-33 single-stage-to-orbit demonstrator aircraft has been developed at NASA Dryden Flight Research Center. The model is based on comparisons of historical flight test estimates to preflight wind-tunnel and analysis code predictions of vehicle aerodynamics documented during six lifting-body aircraft and the Space Shuttle Orbiter flight programs. The lifting-body and Orbiter data were used to define an appropriate uncertainty magnitude in the subsonic and supersonic flight regions, and the Orbiter data were used to extend the database to hypersonic Mach numbers. The uncertainty data consist of increments or percentage variations in the important aerodynamic coefficients and derivatives as a function of Mach number along a nominal trajectory. The uncertainty models will be used to perform linear analysis of the X-33 flight control system and Monte Carlo mission simulation studies. Because the X-33 aerodynamic uncertainty model was developed exclusively using historical data rather than X-33 specific characteristics, the model may be useful for other lifting-body studies.
Combined 3D-QSAR modeling and molecular docking study on azacycles CCR5 antagonists
NASA Astrophysics Data System (ADS)
Ji, Yongjun; Shu, Mao; Lin, Yong; Wang, Yuanqiang; Wang, Rui; Hu, Yong; Lin, Zhihua
2013-08-01
The beta chemokine receptor 5 (CCR5) is an attractive target for the pharmaceutical industry in the HIV-1, inflammation and cancer therapeutic areas. In this study, we have developed quantitative structure activity relationship (QSAR) models for a series of 41 azacycles CCR5 antagonists using comparative molecular field analysis (CoMFA), comparative molecular similarity indices analysis (CoMSIA), and Topomer CoMFA methods. The cross-validated coefficient q2 values of the 3D-QSAR (CoMFA, CoMSIA, and Topomer CoMFA) methods were 0.630, 0.758, and 0.852, respectively, and the non-cross-validated R2 values were 0.979, 0.978, and 0.990, respectively. Docking studies were also employed to determine the most probable binding mode. 3D contour maps and docking results suggested that bulky groups and electron-withdrawing groups on the core part would decrease antiviral activity. Furthermore, docking results indicated that H-bonds and π bonds were favorable for antiviral activities. Finally, a set of novel derivatives with predicted activities were designed.
A Bayesian Framework of Uncertainties Integration in 3D Geological Model
NASA Astrophysics Data System (ADS)
Liang, D.; Liu, X.
2017-12-01
A 3D geological model can describe complicated geological phenomena in an intuitive way, but its application may be limited by uncertain factors. Great progress has been made over the years, yet many studies decompose the uncertainties of a geological model and analyze them separately, source by source, while ignoring the comprehensive impact of multi-source uncertainties. To evaluate the synthetical uncertainty, we use probability distributions to quantify uncertainty and propose a Bayesian framework for uncertainty integration. With this framework, we integrate data errors, spatial randomness, and cognitive information into a posterior distribution to evaluate the synthetical uncertainty of the geological model. Uncertainties propagate and accumulate during the modeling process, so the gradual integration of multi-source uncertainty is a kind of simulation of uncertainty propagation. Bayesian inference accomplishes uncertainty updating during the modeling process. The maximum entropy principle works well for estimating the prior probability distribution, ensuring that the prior is subject to the constraints supplied by the given information with minimum prejudice. In the end, we obtain a posterior distribution that represents the synthetical impact of all the uncertain factors on the spatial structure of the geological model. The framework provides a way to evaluate the synthetical impact of multi-source uncertainties on a geological model and an approach to studying the uncertainty propagation mechanism in geological modeling.
Modelling ecosystem service flows under uncertainty with stochastic SPAN
Johnson, Gary W.; Snapp, Robert R.; Villa, Ferdinando; Bagstad, Kenneth J.
2012-01-01
Ecosystem service models are increasingly in demand for decision making. However, the data required to run these models are often patchy, missing, outdated, or untrustworthy. Further, communication of data and model uncertainty to decision makers is often either absent or unintuitive. In this work, we introduce a systematic approach to addressing both the data gap and the difficulty in communicating uncertainty through a stochastic adaptation of the Service Path Attribution Networks (SPAN) framework. The SPAN formalism assesses ecosystem services through a set of up to 16 maps, which characterize the services in a study area in terms of flow pathways between ecosystems and human beneficiaries. Although the SPAN algorithms were originally defined deterministically, we present them here in a stochastic framework which combines probabilistic input data with a stochastic transport model in order to generate probabilistic spatial outputs. This enables a novel feature among ecosystem service models: the ability to spatially visualize uncertainty in the model results. The stochastic SPAN model can analyze areas where data limitations are prohibitive for deterministic models. Greater uncertainty in the model inputs (including missing data) should lead to greater uncertainty expressed in the model’s output distributions. By using Bayesian belief networks to fill data gaps and expert-provided trust assignments to augment untrustworthy or outdated information, we can account for uncertainty in input data, producing a model that is still able to run and provide information where strictly deterministic models could not. Taken together, these attributes enable more robust and intuitive modelling of ecosystem services under uncertainty.
NASA Astrophysics Data System (ADS)
Cao, Shandong
2012-08-01
The purpose of the present study was to develop in silico models allowing for a reliable prediction of polo-like kinase inhibitors based on a large diverse dataset of 136 compounds. As an effective method, quantitative structure activity relationship (QSAR) was applied using the comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA). The proposed QSAR models showed reasonable predictivity of thiophene analogs (Rcv2=0.533, Rpred2=0.845) and included four molecular descriptors, namely IC3, RDF075m, Mor02m and R4e+. The optimal model for imidazopyridine derivatives (Rcv2=0.776, Rpred2=0.876) was shown to perform well in prediction accuracy, using GATS2m and BEHe1 descriptors. Analysis of the contour maps helped to identify structural requirements for the inhibitors and served as a basis for the design of the next generation of the inhibitor analogues. Docking studies were also employed to position the inhibitors into the polo-like kinase active site to determine the most probable binding mode. These studies may help to understand the factors influencing the binding affinity of chemicals and to develop alternative methods for prescreening and designing of polo-like kinase inhibitors.
Model Uncertainty Quantification Methods In Data Assimilation
NASA Astrophysics Data System (ADS)
Pathiraja, S. D.; Marshall, L. A.; Sharma, A.; Moradkhani, H.
2017-12-01
Data Assimilation involves utilising observations to improve model predictions in a seamless and statistically optimal fashion. Its applications are wide-ranging, from improving weather forecasts to tracking targets such as in the Apollo 11 mission. The use of Data Assimilation methods in high dimensional complex geophysical systems is an active area of research, where there exist many opportunities to enhance existing methodologies. One of the central challenges is model uncertainty quantification: the outcome of any Data Assimilation study is strongly dependent on the uncertainties assigned to both observations and models. I focus on developing improved model uncertainty quantification methods that are applicable to challenging real world scenarios. These include developing methods for cases where the system states are only partially observed, where there is little prior knowledge of the model errors, and where the model error statistics are likely to be highly non-Gaussian.
Performance of Trajectory Models with Wind Uncertainty
NASA Technical Reports Server (NTRS)
Lee, Alan G.; Weygandt, Stephen S.; Schwartz, Barry; Murphy, James R.
2009-01-01
Typical aircraft trajectory predictors use wind forecasts but do not account for the forecast uncertainty. A method for generating estimates of wind prediction uncertainty is described and its effect on aircraft trajectory prediction uncertainty is investigated. The procedure for estimating the wind prediction uncertainty uses a time-lagged ensemble of weather model forecasts from the hourly updated Rapid Update Cycle (RUC) weather prediction system. Forecast uncertainty is estimated using measures of the spread amongst various RUC time-lagged ensemble forecasts. This proof of concept study illustrates the estimated uncertainty and the actual wind errors, and documents the validity of the assumed ensemble-forecast accuracy relationship. Aircraft trajectory predictions are made using RUC winds with provision for the estimated uncertainty. Results for a set of simulated flights indicate this simple approach effectively translates the wind uncertainty estimate into an aircraft trajectory uncertainty. A key strength of the method is the ability to relate uncertainty to specific weather phenomena (contained in the various ensemble members) allowing identification of regional variations in uncertainty.
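The idea of a time-lagged ensemble spread can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not say which spread measure was used, so the population standard deviation here is an assumption, and the function name, member labels, and wind values are hypothetical.

```python
from statistics import mean, pstdev


def time_lagged_spread(forecasts_by_init: dict[str, float]) -> tuple[float, float]:
    """Return (ensemble mean, spread) for forecasts valid at the same time.

    Each entry is one ensemble member: a forecast issued at a different
    initialization time but valid at the same time and location.
    The spread is taken as the population standard deviation (an assumption).
    """
    values = list(forecasts_by_init.values())
    return mean(values), pstdev(values)


if __name__ == "__main__":
    # Made-up wind speeds from three successive hourly forecast cycles.
    members = {"init_t-1h": 10.0, "init_t-2h": 12.0, "init_t-3h": 14.0}
    ens_mean, spread = time_lagged_spread(members)
    print(f"ensemble mean = {ens_mean:.1f}, spread = {spread:.2f}")
```

A larger spread among the lagged forecasts would then be read as a larger wind uncertainty to carry into the trajectory prediction.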
Cronin, Mark T D; Walker, John D; Jaworska, Joanna S; Comber, Michael H I; Watts, Christopher D; Worth, Andrew P
2003-01-01
This article is a review of the use, by regulatory agencies and authorities, of quantitative structure-activity relationships (QSARs) to predict ecologic effects and environmental fate of chemicals. For many years, the U.S. Environmental Protection Agency has been the most prominent regulatory agency using QSARs to predict the ecologic effects and environmental fate of chemicals. However, as increasing numbers of standard QSAR methods are developed and validated to predict ecologic effects and environmental fate of chemicals, it is anticipated that more regulatory agencies and authorities will find them to be acceptable alternatives to chemical testing. PMID:12896861
He, Wensi; Yan, Fangyou; Jia, Qingzhu; Xia, Shuqian; Wang, Qiang
2018-03-01
The hazardous potential of ionic liquids (ILs) is becoming an issue of great concern due to their important role in many industrial fields as green agents. The mathematical model for the toxicological effects of ILs is useful for the risk assessment and design of environmentally benign ILs. The objective of this work is to develop QSAR models to describe the minimal inhibitory concentration (MIC) and minimal bactericidal concentration (MBC) of ILs against Staphylococcus aureus (S. aureus). A total of 169 and 101 ILs with MICs and MBCs, respectively, are used to obtain multiple linear regression models based on matrix norm indexes. The norm indexes used in this work are proposed by our research group and they are first applied to estimate the antibacterial toxicity of these ILs against S. aureus. These two models precisely and reliably calculated the IL toxicities with a square of correlation coefficient (R2) of 0.919 and a standard error of estimate (SE) of 0.341 (in log unit of mM) for pMIC, and an R2 of 0.913 and SE of 0.282 for pMBC.
Santos-Garcia, Letícia; Assis, Letícia C; Silva, Daniela R; Ramalho, Teodorico C; da Cunha, Elaine F F
2016-07-01
Bruton's tyrosine kinase (Btk) is an important enzyme in B-lymphocyte development and differentiation. Furthermore, Btk expression is considered essential for the proliferation and survival of these cells. Btk inhibition has become an attractive strategy for treating autoimmune diseases, B-cell leukemia, and lymphomas. With the objective of proposing new candidates for Btk inhibitors, we applied receptor-dependent four-dimensional quantitative structure-activity relationship (QSAR) methodology to a series of 96 nicotinamide analogs useful as Btk modulators. The QSAR models were developed using 71 compounds, the training set, and externally validated using 25 compounds, the test set. The conformations obtained by molecular dynamics simulation were overlapped in a virtual three-dimensional cubic box comprised of 2 and 5 Å cells, according to the six trial alignments. The models were generated by combining genetic function approximation and partial least squares regression technique. The analyses suggest that Model 1a yields the best results. The best equation shows [Formula: see text], r(2) = .743, RMSEC = .831, RMSECV = .879. Given the importance of the Tyr551, this residue could become a strategic target for the design of novel Btk inhibitors with improved potency. In addition, the good potency predicted for the proposed M2 compound indicates this compound as a potential Btk inhibitor candidate.
Uncertainties in Galactic Chemical Evolution Models
Cote, Benoit; Ritter, Christian; Oshea, Brian W.; ...
2016-06-15
Here we use a simple one-zone galactic chemical evolution model to quantify the uncertainties generated by the input parameters in numerical predictions for a galaxy with properties similar to those of the Milky Way. We compiled several studies from the literature to gather the current constraints for our simulations regarding the typical value and uncertainty of the following seven basic parameters: the lower and upper mass limits of the stellar initial mass function (IMF), the slope of the high-mass end of the stellar IMF, the slope of the delay-time distribution function of Type Ia supernovae (SNe Ia), the number of SNe Ia per M⊙ formed, the total stellar mass formed, and the final mass of gas. We derived a probability distribution function to express the range of likely values for every parameter, which were then included in a Monte Carlo code to run several hundred simulations with randomly selected input parameters. This approach enables us to analyze the predicted chemical evolution of 16 elements in a statistical manner by identifying the most probable solutions along with their 68% and 95% confidence levels. Our results show that the overall uncertainties are shaped by several input parameters that individually contribute at different metallicities, and thus at different galactic ages. The level of uncertainty then depends on the metallicity and is different from one element to another. Among the seven input parameters considered in this work, the slope of the IMF and the number of SNe Ia are currently the two main sources of uncertainty. The thicknesses of the uncertainty bands bounded by the 68% and 95% confidence levels are generally within 0.3 and 0.6 dex, respectively. When looking at the evolution of individual elements as a function of galactic age instead of metallicity, those same thicknesses range from 0.1 to 0.6 dex for the 68% confidence levels and from 0.3 to 1.0 dex for the 95% confidence levels. The uncertainty in our chemical
A tool for efficient, model-independent management optimization under uncertainty
White, Jeremy; Fienen, Michael N.; Barlow, Paul M.; Welter, Dave E.
2018-01-01
To fill a need for risk-based environmental management optimization, we have developed PESTPP-OPT, a model-independent tool for resource management optimization under uncertainty. PESTPP-OPT solves a sequential linear programming (SLP) problem and also implements (optional) efficient, “on-the-fly” (without user intervention) first-order, second-moment (FOSM) uncertainty techniques to estimate model-derived constraint uncertainty. Combined with a user-specified risk value, the constraint uncertainty estimates are used to form chance-constraints for the SLP solution process, so that any optimal solution includes contributions from model input and observation uncertainty. In this way, a “single answer” that includes uncertainty is yielded from the modeling analysis. PESTPP-OPT uses the familiar PEST/PEST++ model interface protocols, which makes it widely applicable to many modeling analyses. The use of PESTPP-OPT is demonstrated with a synthetic, integrated surface-water/groundwater model. The function and implications of chance constraints for this synthetic model are discussed.
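The way a risk value and a FOSM standard deviation combine into a chance constraint can be sketched as follows. This is a generic textbook form under a Gaussian assumption, not PESTPP-OPT's actual implementation; the function names, the "less than or equal" constraint sense, and the numbers are hypothetical. A risk of 0.5 is treated here as risk neutral (no shift).

```python
from statistics import NormalDist


def risk_shifted_value(simulated: float, fosm_std: float, risk: float) -> float:
    """Shift a model-derived constraint value by its FOSM uncertainty.

    Assumes the constraint value is Gaussian with mean `simulated` and
    standard deviation `fosm_std`; `risk` in (0, 1) is the quantile at
    which the constraint is evaluated (0.5 leaves the value unchanged).
    """
    z = NormalDist().inv_cdf(risk)
    return simulated + z * fosm_std


def satisfies_upper_bound(simulated: float, fosm_std: float,
                          risk: float, upper_bound: float) -> bool:
    """Check a 'value <= upper_bound' constraint at the chosen risk level."""
    return risk_shifted_value(simulated, fosm_std, risk) <= upper_bound


if __name__ == "__main__":
    # Made-up numbers: a simulated value of 10 with FOSM std 2, bound 12.
    for risk in (0.5, 0.95):
        ok = satisfies_upper_bound(10.0, 2.0, risk, upper_bound=12.0)
        print(f"risk={risk}: constraint satisfied = {ok}")
```

With these illustrative numbers the constraint holds at risk 0.5 but not at 0.95, which is the sense in which a more risk-averse setting tightens the feasible region.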
Imprecision and Uncertainty in the UFO Database Model.
ERIC Educational Resources Information Center
Van Gyseghem, Nancy; De Caluwe, Rita
1998-01-01
Discusses how imprecision and uncertainty are dealt with in the UFO (Uncertainty and Fuzziness in an Object-oriented) database model. Such information is expressed by means of possibility distributions, and modeled by means of the proposed concept of "role objects." The role objects model uncertain, tentative information about objects,…
Have artificial neural networks met expectations in drug discovery as implemented in QSAR framework?
Dobchev, Dimitar; Karelson, Mati
2016-07-01
Artificial neural networks (ANNs) are highly adaptive nonlinear optimization algorithms that have been applied in many diverse scientific endeavors, ranging from economics, engineering, physics, and chemistry to medical science. Notably, in the past two decades, ANNs have been used widely in the process of drug discovery. In this review, the authors discuss advantages and disadvantages of ANNs in drug discovery as incorporated into the quantitative structure-activity relationships (QSAR) framework. Furthermore, the authors examine the recent studies, which span over a broad area with various diseases in drug discovery. In addition, the authors attempt to answer the question about the expectations of the ANNs in drug discovery and discuss the trends in this field. The old pitfalls of overtraining and interpretability are still present with ANNs. However, despite these pitfalls, the authors believe that ANNs have likely met many of the expectations of researchers and are still considered as excellent tools for nonlinear data modeling in QSAR. It is likely that ANNs will continue to be used in drug development in the future.
A Reliability Estimation in Modeling Watershed Runoff With Uncertainties
NASA Astrophysics Data System (ADS)
Melching, Charles S.; Yen, Ben Chie; Wenzel, Harry G., Jr.
1990-10-01
The reliability of simulation results produced by watershed runoff models is a function of uncertainties in nature, data, model parameters, and model structure. A framework is presented here for using a reliability analysis method (such as first-order second-moment techniques or Monte Carlo simulation) to evaluate the combined effect of the uncertainties on the reliability of output hydrographs from hydrologic models. For a given event the prediction reliability can be expressed in terms of the probability distribution of the estimated hydrologic variable. The peak discharge probability for a watershed in Illinois using the HEC-1 watershed model is given as an example. The study of the reliability of predictions from watershed models provides useful information on the stochastic nature of output from deterministic models subject to uncertainties and identifies the relative contribution of the various uncertainties to unreliability of model predictions.
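The two reliability methods named in this abstract can be compared on a toy problem. The sketch below does not use HEC-1; it substitutes a hypothetical stand-in model of the rational-formula form Q = C * i * A with independent Gaussian inputs, and all parameter values are made up for illustration.

```python
import math
import random


def peak_discharge(c: float, i: float, area: float) -> float:
    """Hypothetical stand-in runoff model: Q = C * i * A."""
    return c * i * area


def fosm(mean_c, sd_c, mean_i, sd_i, area):
    """First-order second-moment estimate of the mean and std of Q."""
    q_mean = peak_discharge(mean_c, mean_i, area)
    dq_dc = mean_i * area  # partial derivative of Q with respect to C
    dq_di = mean_c * area  # partial derivative of Q with respect to i
    q_sd = math.sqrt((dq_dc * sd_c) ** 2 + (dq_di * sd_i) ** 2)
    return q_mean, q_sd


def monte_carlo(mean_c, sd_c, mean_i, sd_i, area, n=20000, seed=0):
    """Sample the inputs, run the model, and summarize the output."""
    rng = random.Random(seed)
    samples = [
        peak_discharge(rng.gauss(mean_c, sd_c), rng.gauss(mean_i, sd_i), area)
        for _ in range(n)
    ]
    q_mean = sum(samples) / n
    q_sd = math.sqrt(sum((q - q_mean) ** 2 for q in samples) / (n - 1))
    return q_mean, q_sd, samples


def exceedance_probability(samples, threshold):
    """Fraction of simulated peaks above a threshold."""
    return sum(q > threshold for q in samples) / len(samples)


if __name__ == "__main__":
    args = (0.5, 0.05, 20.0, 3.0, 10.0)  # made-up means and std devs
    print("FOSM mean, sd:", fosm(*args))
    mc_mean, mc_sd, qs = monte_carlo(*args)
    print("Monte Carlo mean, sd:", mc_mean, mc_sd)
    print("P(Q > 120):", exceedance_probability(qs, 120.0))
```

For a nearly linear model such as this one the two methods should agree closely; Monte Carlo additionally yields the full output distribution, which is what an exceedance probability needs.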
Uncertainty in modeled upper ocean heat content change
NASA Astrophysics Data System (ADS)
Tokmakian, Robin; Challenor, Peter
2014-02-01
This paper examines the uncertainty in the change in the heat content in the ocean component of a general circulation model. We describe the design and implementation of our statistical methodology. Using an ensemble of model runs and an emulator, we produce an estimate of the full probability distribution function (PDF) for the change in upper ocean heat in an Atmosphere/Ocean General Circulation Model, the Community Climate System Model v. 3, across a multi-dimensional input space. We show how the emulator of the GCM's heat content change and hence, the PDF, can be validated and how implausible outcomes from the emulator can be identified when compared to observational estimates of the metric. In addition, the paper describes how the emulator outcomes and related uncertainty information might inform estimates of the same metric from a multi-model Coupled Model Intercomparison Project phase 3 ensemble. We illustrate how to (1) construct an ensemble based on experiment design methods, (2) construct and evaluate an emulator for a particular metric of a complex model, (3) validate the emulator using observational estimates and explore the input space with respect to implausible outcomes and (4) contribute to the understanding of uncertainties within a multi-model ensemble. Finally, we estimate the most likely value for heat content change and its uncertainty for the model, with respect to both observations and the uncertainty in the value for the input parameters.
NASA Astrophysics Data System (ADS)
Ahmed, Nafees; Anwar, Sirajudheen; Thet Htar, Thet
2017-06-01
The Plasmodium falciparum Lactate Dehydrogenase enzyme (PfLDH) catalyzes inter-conversion of pyruvate to lactate during glycolysis, producing the energy required for parasitic growth. PfLDH has been studied as a potential molecular target for development of anti-malarial agents. In an attempt to find potent inhibitors of PfLDH, we have used Discovery Studio to perform molecular docking in the active binding pocket of PfLDH by CDOCKER, followed by three-dimensional quantitative structure-activity relationship (3D-QSAR) studies of tricyclic guanidine batzelladine compounds, which were previously synthesized in our laboratory. Docking studies showed that there is a very strong correlation between in silico and in vitro results. Based on docking results, a highly predictive 3D-QSAR model was developed with q2 of 0.516. The model has a predicted r2 of 0.91, showing that predicted IC50 values are in good agreement with experimental IC50 values. The results obtained from this study revealed that the developed model can be used to design new anti-malarial compounds based on tricyclic guanidine derivatives and to predict activities of new inhibitors.
Ahmed, Nafees; Anwar, Sirajudheen; Thet Htar, Thet
2017-01-01
The Plasmodium falciparum Lactate Dehydrogenase enzyme (PfLDH) catalyzes inter-conversion of pyruvate to lactate during glycolysis, producing the energy required for parasitic growth. PfLDH has been studied as a potential molecular target for development of anti-malarial agents. In an attempt to find potent inhibitors of PfLDH, we have used Discovery Studio to perform molecular docking in the active binding pocket of PfLDH by CDOCKER, followed by three-dimensional quantitative structure-activity relationship (3D-QSAR) studies of tricyclic guanidine batzelladine compounds, which were previously synthesized in our laboratory. Docking studies showed that there is a very strong correlation between in silico and in vitro results. Based on docking results, a highly predictive 3D-QSAR model was developed with q2 of 0.516. The model has a predicted r2 of 0.91, showing that predicted IC50 values are in good agreement with experimental IC50 values. The results obtained from this study revealed that the developed model can be used to design new anti-malarial compounds based on tricyclic guanidine derivatives and to predict activities of new inhibitors. PMID:28664157
Rasulev, Bakhtiyor; Kusić, Hrvoje; Leszczynska, Danuta; Leszczynski, Jerzy; Koprivanac, Natalija
2010-05-01
The goal of the study was to predict toxicity in vivo caused by aromatic compounds structured with a single benzene ring and the presence or absence of different substituent groups such as hydroxyl-, nitro-, amino-, methyl-, methoxy-, etc., by using QSAR/QSPR tools. A Genetic Algorithm and multiple regression analysis were applied to select the descriptors and to generate the correlation models. The most predictive model is shown to be the 3-variable model which also has a good ratio of the number of descriptors and their predictive ability to avoid overfitting. The main contributions to the toxicity were shown to be the polarizability weighted MATS2p and the number of certain groups C-026 descriptors. The GA-MLRA approach showed good results in this study, which allows the building of a simple, interpretable and transparent model that can be used for future studies of predicting toxicity of organic compounds to mammals.
NASA Astrophysics Data System (ADS)
Santos-Filho, Osvaldo Andrade; Hopfinger, Anton J.
2001-01-01
A set of 18 structurally diverse antifolates including pyrimethamine, cycloguanil, methotrexate, aminopterin and trimethoprim, and 13 pyrrolo[2,3-d]pyrimidines were studied using four-dimensional quantitative structure-activity relationship (4D-QSAR) analysis. The corresponding biological activities of these compounds include IC50 inhibition constants for both the wild type, and a specific mutant type of Plasmodium falciparum dihydrofolate reductase (DHFR). Two thousand conformations of each analog were sampled to generate a conformational ensemble profile (CEP) from a molecular dynamics simulation (MDS) of 100,000 conformer trajectory states. Each sampled conformation was placed in a 1 Å cubic grid cell lattice for each of five trial alignments. The frequency of occupation of each grid cell was computed for each of six types of pharmacophore groups of atoms of each compound. These grid cell occupancy descriptors (GCODs) were then used as a descriptor pool to construct 4D-QSAR models. Models for inhibition of both the 'wild' type and the mutant enzyme were generated which provide detailed spatial pharmacophore requirements for inhibition in terms of atom types and their corresponding relative locations in space. The 4D-QSAR models indicate some structural features perhaps relevant to the mechanism of resistance of the Plasmodium falciparum DHFR to current antimalarials. One feature identified is a slightly different binding alignment of the ligands to the mutant form of the enzyme as compared to the wild type.
NASA Astrophysics Data System (ADS)
Munoz-Carpena, R.; Muller, S. J.; Chu, M.; Kiker, G. A.; Perz, S. G.
2014-12-01
Model complexity resulting from the need to integrate environmental system components cannot be overstated. In particular, additional emphasis is urgently needed on rational approaches to guide decision making through uncertainties surrounding the integrated system across decision-relevant scales. However, in spite of the difficulties that the consideration of modeling uncertainty represents for the decision process, it should not be avoided or the value and science behind the models will be undermined. These two issues, i.e., the need for coupled models that can answer the pertinent questions and the need for models that do so with sufficient certainty, are the key indicators of a model's relevance. Model relevance is inextricably linked with model complexity. Although model complexity has advanced greatly in recent years, there has been little work to rigorously characterize the threshold of relevance in integrated and complex models. Formally assessing the relevance of the model in the face of increasing complexity would be valuable because there is growing unease among developers and users of complex models about the cumulative effects of various sources of uncertainty on model outputs. In particular, this issue has prompted doubt over whether the considerable effort going into further elaborating complex models will in fact yield the expected payback. New approaches have been proposed recently to evaluate the uncertainty-complexity-relevance modeling trilemma (Muller, Muñoz-Carpena and Kiker, 2011) by incorporating state-of-the-art global sensitivity and uncertainty analysis (GSA/UA) in every step of the model development so as to quantify not only the uncertainty introduced by the addition of new environmental components, but the effect that these new components have over existing components (interactions, non-linear responses). Outputs from the analysis can also be used to quantify system resilience (stability, alternative states, thresholds or tipping
3D-QSAR and docking studies of flavonoids as potent Escherichia coli inhibitors
Fang, Yajing; Lu, Yulin; Zang, Xixi; Wu, Ting; Qi, XiaoJuan; Pan, Siyi; Xu, Xiaoyun
2016-01-01
Flavonoids are potential antibacterial agents. However, the key substituents and mechanism for their antibacterial activity have not been fully investigated. The quantitative structure-activity relationship (QSAR) and molecular docking of flavonoids relating to potent anti-Escherichia coli agents were investigated. Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) models were developed by using the pIC50 values of flavonoids. The cross-validated coefficient (q2) values for CoMFA (0.743) and for CoMSIA (0.708) were achieved, illustrating high predictive capabilities. Selected descriptors for the CoMFA model were ClogP (logarithm of the octanol/water partition coefficient), steric and electrostatic fields, while ClogP, electrostatic and hydrogen bond donor fields were used for the CoMSIA model. Molecular docking results confirmed that half of the tested flavonoids inhibited DNA gyrase B (GyrB) by interacting with the adenosine-triphosphate (ATP) pocket in the same orientation. Polymethoxyl flavones, flavonoid glycosides and isoflavonoids changed their orientation, resulting in a decrease of inhibitory activity. Moreover, docking results showed that 3-hydroxyl, 5-hydroxyl, 7-hydroxyl and 4-carbonyl groups were crucial active substituents of flavonoids, interacting with key residues of GyrB, which was in agreement with the QSAR study results. These results provide valuable information on the structural requirements of flavonoids as antibacterial agents. PMID:27049530
Uncertainty Quantification in Geomagnetic Field Modeling
NASA Astrophysics Data System (ADS)
Chulliat, A.; Nair, M. C.; Alken, P.; Meyer, B.; Saltus, R.; Woods, A.
2017-12-01
Geomagnetic field models are mathematical descriptions of the various sources of the Earth's magnetic field, and are generally obtained by solving an inverse problem. They are widely used in research to separate and characterize field sources, but also in many practical applications such as aircraft and ship navigation, smartphone orientation, satellite attitude control, and directional drilling. In recent years, more sophisticated models have been developed, thanks to the continuous availability of high quality satellite data and to progress in modeling techniques. Uncertainty quantification has become an integral part of model development, both to assess the progress made and to address specific users' needs. Here we report on recent advances made by our group in quantifying the uncertainty of geomagnetic field models. We first focus on NOAA's World Magnetic Model (WMM) and the International Geomagnetic Reference Field (IGRF), two reference models of the main (core) magnetic field produced every five years. We describe the methods used in quantifying the model commission error as well as the omission error attributed to various un-modeled sources such as magnetized rocks in the crust and electric current systems in the atmosphere and near-Earth environment. A simple error model was derived from this analysis, to facilitate usage in practical applications. We next report on improvements brought by combining a main field model with a high resolution crustal field model and a time-varying, real-time external field model, like in NOAA's High Definition Geomagnetic Model (HDGM). The obtained uncertainties are used by the directional drilling industry to mitigate health, safety and environment risks.
NASA Astrophysics Data System (ADS)
Zhao, Siqi; Zhang, Guanglong; Xia, Shuwei; Yu, Liangmin
2018-06-01
As a group of diversified frameworks, quinazolin derivatives display a broad range of biological functions, especially as anticancer agents. To investigate the quantitative structure-activity relationship, 3D-QSAR models were generated with 24 quinazolin scaffold molecules. The experimental and predicted pIC50 values for both training and test set compounds showed good correlation, which proved the robustness and reliability of the generated QSAR models. The most effective CoMFA and CoMSIA models were obtained with a correlation coefficient r2_ncv of 1.00 (both) and leave-one-out coefficients q2 of 0.61 and 0.59, respectively. The predictive abilities of CoMFA and CoMSIA were quite good, with predictive correlation coefficients (r2_pred) of 0.97 and 0.91. In addition, the statistical results of CoMFA and CoMSIA were used to design new quinazolin molecules.
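For readers unfamiliar with the leave-one-out coefficient q2 quoted in these QSAR abstracts, the following is a minimal sketch of how such a value is commonly computed. It is not the authors' procedure: CoMFA and CoMSIA use partial least squares, whereas this sketch substitutes ordinary least squares for brevity, and it assumes the common definition q2 = 1 - PRESS / SS_total with the total sum of squares taken about the overall mean. The function name `loo_q2` is hypothetical.

```python
import numpy as np

def loo_q2(X, y):
    """Leave-one-out cross-validated q2 for an ordinary least-squares model.

    Assumption: q2 = 1 - PRESS / SS_total, where PRESS sums the squared
    errors of predictions made for each compound by a model fitted
    without that compound.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # intercept + descriptors
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i               # drop compound i
        coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        press += (y[i] - design[i] @ coef) ** 2
    ss_total = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / ss_total

# Illustrative data only: one hypothetical descriptor, noiseless response.
X_demo = np.arange(10, dtype=float).reshape(-1, 1)
y_demo = 2.0 * X_demo[:, 0] + 1.0
q2_demo = loo_q2(X_demo, y_demo)
```

A large gap between a fitted r2 close to 1 and a much lower q2, as in the abstract above, is usually read as a sign that the model fits the training compounds more closely than it predicts held-out ones.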
Identifying influences on model uncertainty: an application using a forest carbon budget model
James E. Smith; Linda S. Heath
2001-01-01
Uncertainty is an important consideration for both developers and users of environmental simulation models. Establishing quantitative estimates of uncertainty for deterministic models can be difficult when the underlying bases for such information are scarce. We demonstrate an application of probabilistic uncertainty analysis that provides for refinements in...
Estimating Coastal Digital Elevation Model (DEM) Uncertainty
NASA Astrophysics Data System (ADS)
Amante, C.; Mesick, S.
2017-12-01
Integrated bathymetric-topographic digital elevation models (DEMs) are representations of the Earth's solid surface and are fundamental to the modeling of coastal processes, including tsunami, storm surge, and sea-level rise inundation. Deviations in elevation values from the actual seabed or land surface constitute errors in DEMs, which originate from numerous sources, including: (i) the source elevation measurements (e.g., multibeam sonar, lidar), (ii) the interpolative gridding technique (e.g., spline, kriging) used to estimate elevations in areas unconstrained by source measurements, and (iii) the datum transformation used to convert bathymetric and topographic data to common vertical reference systems. The magnitude and spatial distribution of the errors from these sources are typically unknown, and the lack of knowledge regarding these errors represents the vertical uncertainty in the DEM. The National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Information (NCEI) has developed DEMs for more than 200 coastal communities. This study presents a methodology developed at NOAA NCEI to derive accompanying uncertainty surfaces that estimate DEM errors at the individual cell-level. The development of high-resolution (1/9th arc-second), integrated bathymetric-topographic DEMs along the southwest coast of Florida serves as the case study for deriving uncertainty surfaces. The estimated uncertainty can then be propagated into the modeling of coastal processes that utilize DEMs. Incorporating the uncertainty produces more reliable modeling results, and in turn, better-informed coastal management decisions.
Modeling transport phenomena and uncertainty quantification in solidification processes
NASA Astrophysics Data System (ADS)
Fezi, Kyle S.
Direct chill (DC) casting is the primary processing route for wrought aluminum alloys. This semicontinuous process consists of primary cooling as the metal is pulled through a water cooled mold followed by secondary cooling with a water jet spray and free falling water. To gain insight into this complex solidification process, a fully transient model of DC casting was developed to predict the transport phenomena of aluminum alloys for various conditions. This model is capable of solving mixture mass, momentum, energy, and species conservation equations during multicomponent solidification. Various DC casting process parameters were examined for their effect on transport phenomena predictions in an alloy of commercial interest (aluminum alloy 7050). The practice of placing a wiper to divert cooling water from the ingot surface was studied and the results showed that placement closer to the mold causes remelting at the surface and increases susceptibility to bleed outs. Numerical models of metal alloy solidification, like the one previously mentioned, are used to gain insight into physical phenomena that cannot be observed experimentally. However, uncertainty in model inputs cause uncertainty in results and those insights. The analysis of model assumptions and probable input variability on the level of uncertainty in model predictions has not been calculated in solidification modeling as yet. As a step towards understanding the effect of uncertain inputs on solidification modeling, uncertainty quantification (UQ) and sensitivity analysis were first performed on a transient solidification model of a simple binary alloy (Al-4.5wt.%Cu) in a rectangular cavity with both columnar and equiaxed solid growth models. This analysis was followed by quantifying the uncertainty in predictions from the recently developed transient DC casting model. The PRISM Uncertainty Quantification (PUQ) framework quantified the uncertainty and sensitivity in macrosegregation, solidification
Ahamad, Shahzaib; Hassan, Md Imtaiyaz; Dwivedi, Neeraja
2018-05-01
Tuberculosis (Tb) is an airborne infectious disease caused by Mycobacterium tuberculosis. Beta-carbonic anhydrase 1 (β-CA1) has emerged as one of the potential targets for new antitubercular drug development. In this work, three-dimensional quantitative structure-activity relationships (3D-QSAR), molecular docking, and molecular dynamics (MD) simulation approaches were performed on a series of natural and synthetic phenol-based β-CA1 inhibitors. The developed 3D-QSAR model (r2 = 0.94, q2 = 0.86, and pred_r2 = 0.74) indicated that the steric and electrostatic factors are important parameters to modulate the bioactivity of phenolic compounds. Based on this indication, we designed 72 new phenolic inhibitors, out of which two compounds (D25 and D50) effectively stabilized β-CA1 receptor and, thus, are potential candidates for new generation antitubercular drug discovery program.
Yavuz, Sevtap Caglar; Sabanci, Nazmiye; Saripinar, Emin
2018-01-01
The EC-GA method was employed in this study as a 4D-QSAR method, for the identification of the pharmacophore (Pha) of ruthenium(II) arene complex derivatives and quantitative prediction of activity. The arrangement of the computed geometric and electronic parameters for atoms and bonds of each compound occurring in a matrix is known as the electron-conformational matrix of congruity (ECMC). It contains the data from HF/3-21G level calculations. Compounds were represented by a group of conformers for each compound rather than a single conformation, known as fourth dimension to generate the model. ECMCs were compared within a certain range of tolerance values by using the EMRE program and the responsible pharmacophore group for ruthenium(II) arene complex derivatives was found. For selecting the sub-parameter which had the most effect on activity in the series and the calculation of theoretical activity values, the non-linear least square method and genetic algorithm which are included in the EMRE program were used. In addition, compounds were classified as the training and test set and the accuracy of the models was tested by cross-validation statistically. The model for training and test sets attained by the optimum 10 parameters gave highly satisfactory results with R2training = 0.817, q2 = 0.718 and SEtraining = 0.066, q2ext1 = 0.867, q2ext2 = 0.849, q2ext3 = 0.895, ccctr = 0.895, ccctest = 0.930 and cccall = 0.905. Since there is no 4D-QSAR research on metal based organic complexes in the literature, this study is original and gives a powerful tool to the design of novel and selective ruthenium(II) arene complexes. Copyright © Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Assessment of parametric uncertainty for groundwater reactive transport modeling
Shi, Xiaoqing; Ye, Ming; Curtis, Gary P.; Miller, Geoffery L.; Meyer, Philip D.; Kohler, Matthias; Yabusaki, Steve; Wu, Jichun
2014-01-01
The validity of using Gaussian assumptions for model residuals in uncertainty quantification of a groundwater reactive transport model was evaluated in this study. Least squares regression methods explicitly assume Gaussian residuals, and the assumption leads to Gaussian likelihood functions, model parameters, and model predictions. While the Bayesian methods do not explicitly require the Gaussian assumption, Gaussian residuals are widely used. This paper shows that the residuals of the reactive transport model are non-Gaussian, heteroscedastic, and correlated in time; characterizing them requires using a generalized likelihood function such as the formal generalized likelihood function developed by Schoups and Vrugt (2010). For the surface complexation model considered in this study for simulating uranium reactive transport in groundwater, parametric uncertainty is quantified using the least squares regression methods and Bayesian methods with both Gaussian and formal generalized likelihood functions. While the least squares methods and Bayesian methods with Gaussian likelihood function produce similar Gaussian parameter distributions, the parameter distributions of Bayesian uncertainty quantification using the formal generalized likelihood function are non-Gaussian. In addition, predictive performance of formal generalized likelihood function is superior to that of least squares regression and Bayesian methods with Gaussian likelihood function. The Bayesian uncertainty quantification is conducted using the differential evolution adaptive metropolis (DREAM(zs)) algorithm; as a Markov chain Monte Carlo (MCMC) method, it is a robust tool for quantifying uncertainty in groundwater reactive transport models. For the surface complexation model, the regression-based local sensitivity analysis and Morris- and DREAM(ZS)-based global sensitivity analysis yield almost identical ranking of parameter importance. The uncertainty analysis may help select appropriate likelihood
Parameter uncertainty analysis of a biokinetic model of caesium
Li, W. B.; Klein, W.; Blanchardon, Eric; ...
2014-04-17
Parameter uncertainties for the biokinetic model of caesium (Cs) developed by Leggett et al. were inventoried and evaluated. The methods of parameter uncertainty analysis were used to assess the uncertainties of model predictions with the assumptions of model parameter uncertainties and distributions. Furthermore, the importance of individual model parameters was assessed by means of sensitivity analysis. The calculated uncertainties of model predictions were compared with human data of Cs measured in blood and in the whole body. It was found that propagating the derived uncertainties in model parameter values reproduced the range of bioassay data observed in human subjects at different times after intake. The maximum ranges, expressed as uncertainty factors (UFs) (defined as a square root of ratio between 97.5th and 2.5th percentiles) of blood clearance, whole-body retention and urinary excretion of Cs predicted at earlier time after intake were, respectively: 1.5, 1.0 and 2.5 at the first day; 1.8, 1.1 and 2.4 at Day 10 and 1.8, 2.0 and 1.8 at Day 100; for the late times (1000 d) after intake, the UFs were increased to 43, 24 and 31, respectively. The model parameters of transfer rates between kidneys and blood, muscle and blood and the rate of transfer from kidneys to urinary bladder content are most influential to the blood clearance and to the whole-body retention of Cs. For the urinary excretion, the parameters of transfer rates from urinary bladder content to urine and from kidneys to urinary bladder content impact mostly. The implication and effect on the estimated equivalent and effective doses of the larger uncertainty of 43 in whole-body retention in the later time, say, after Day 500 will be explored in a successive work in the framework of EURADOS.
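The uncertainty factor defined in this abstract (square root of the ratio between the 97.5th and 2.5th percentiles) can be illustrated with a short sketch. The function name `uncertainty_factor` and the lognormal sample are assumptions made for illustration only; they are not the biokinetic model output or the authors' code.

```python
import numpy as np

def uncertainty_factor(samples):
    """UF = sqrt(P97.5 / P2.5) of a set of positive model predictions."""
    p_low, p_high = np.percentile(samples, [2.5, 97.5])
    return float(np.sqrt(p_high / p_low))

# Hypothetical Monte Carlo predictions: lognormal with log-scale sigma 0.5.
rng = np.random.default_rng(0)
predictions = rng.lognormal(mean=0.0, sigma=0.5, size=100_000)
uf = uncertainty_factor(predictions)
```

For a lognormal distribution the UF reduces to roughly exp(1.96 * sigma), so a UF near 1 means the central 95% of predictions lie close to the median, while a UF of 43 means they span more than three orders of magnitude.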
Maganti, Lakshmi; Das, Sanjit Kumar; Mascarenhas, Nahren Manuel; Ghoshal, Nanda
2011-10-01
The re-emergence of tuberculosis infections, which are resistant to conventional drug therapy, has steadily risen in the last decade. Inhibitors of the aryl acid adenylating enzyme known as MbtA, involved in siderophore biosynthesis in Mycobacterium tuberculosis, are being explored as potential antitubercular agents. The ability to identify fragments that interact with a biological target is a key step in fragment based drug design (FBDD). To expand the boundaries of the quantitative structure-activity relationship (QSAR) paradigm, we have proposed a Fragment Based QSAR methodology, referred to herein as FB-QSAR, for deciphering the structural requirements of a series of nucleoside bisubstrate analogs for inhibition of MbtA, a key enzyme involved in the siderophore biosynthetic pathway. For the development of FB-QSAR models, statistical techniques such as stepwise multiple linear regression (SMLR), genetic function approximation (GFA) and GFAspline were used. The predictive ability of the generated models was validated using different statistical metrics, and similarity-based coverage estimation was carried out to define applicability boundaries. To aid the creation of novel antituberculosis compounds, a bioisosteric database was enumerated using the combichem approach endorsed mining in a lead-like chemical space. The generated library was screened using an integrated in-silico approach and potential hits identified. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The ToxCast and Tox21 programs have tested ~8,200 chemicals in a broad screening panel of in vitro high-throughput screening (HTS) assays for estrogen receptor (ER) agonist and antagonist activity. The present work uses this large in vitro data set to develop in silico QSAR model...
Goyal, Sukriti; Grover, Sonam; Dhanjal, Jaspreet Kaur; Tyagi, Chetna; Goyal, Manisha; Grover, Abhinav
2014-06-01
Tumour suppressor p53 is known to play a central role in prevention of tumour development, DNA repair, senescence and apoptosis which is in normal cells maintained by negative feedback regulator MDM2 (Murine Double Minute 2). In case of dysfunctioning of this regulatory loop, tumour development starts thus resulting in cancerous condition. Inhibition of p53-MDM2 binding would result in activation of the tumour suppressor. In this study, a novel robust fragment-based QSAR model has been developed for piperidinone derived compounds experimentally known to inhibit p53-MDM2 interaction. The QSAR model developed showed satisfactory statistical parameters for the experimentally reported dataset (r(2)=0.9415, q(2)=0.8958, pred_r(2)=0.8894 and F-test=112.7314), thus judging the robustness of the model. Low standard error values (r(2)_se=0.3003, q(2)_se=0.4009 and pred_r(2)_se=0.3315) confirmed the accuracy of the developed model. The regression equation obtained constituted three descriptors (R2-DeltaEpsilonA, R1-RotatableBondCount and R2-SssOCount), two of which had positive contribution while third showed negative correlation. Based on the developed QSAR model, a combinatorial library was generated and activities of the compounds were predicted. These compounds were docked with MDM2 and two top scoring compounds with binding affinities of -10.13 and -9.80kcal/mol were selected. The binding modes of actions of these complexes were analyzed using molecular dynamics simulations. Analysis of the developed fragment-based QSAR model revealed that addition of unsaturated electronegative groups at R2 site and groups with more rotatable bonds at R1 improved the inhibitory activity of these potent lead compounds. The detailed analysis carried out in this study provides a considerable basis for the design and development of novel piperidinone-based lead molecules against cancer and also provides mechanistic insights into their mode of actions. Copyright © 2014 Elsevier Inc. All
NASA Astrophysics Data System (ADS)
Asati, Vivek; Bharti, Sanjay Kumar; Budhwani, Ashok Kumar
2017-04-01
The proviral insertion site in Moloney murine leukemia virus (PIM) is a family of serine/threonine kinases of the Ca2+-calmodulin-dependent protein kinase (CAMK) group which is responsible for the activation and regulation of cellular transcription and translation. The three isoforms of PIM kinase (PIM-1, PIM-2 and PIM-3) share high homology and functional idleness, are widely expressed, and are involved in a variety of biological processes including cell survival, proliferation, differentiation and apoptosis. Altered expression of PIM-1 kinase is correlated with hematologic malignancies and solid tumors. In the present study, atom-based 3D-QSAR, docking and virtual screening studies have been performed on a series of thiazolidine-2,4-dione derivatives as PIM-1 kinase inhibitors. The 3D-QSAR and docking approach has shortlisted the most active thiazolidine-2,4-dione derivatives such as 28, 31, 33 and 35 with the incorporation of more than one structural feature in a single molecule. External validations by various parameters and molecular docking studies at the active site of PIM-1 kinase have proved the reliability of the developed 3D-QSAR model. The generated pharmacophore (AADHR.33) from the 3D-QSAR study was used for screening of drug like compounds from the ZINC database, where ZINC15056464 and ZINC83292944 showed potential binding affinities at the active site amino acid residues (LYS67, GLU171, ASP128 and ASP186) of PIM-1 kinase.
NASA Astrophysics Data System (ADS)
Pathiraja, S. D.; Moradkhani, H.; Marshall, L. A.; Sharma, A.; Geenens, G.
2016-12-01
Effective combination of model simulations and observations through Data Assimilation (DA) depends heavily on uncertainty characterisation. Many traditional methods for quantifying model uncertainty in DA require some level of subjectivity (by way of tuning parameters or by assuming Gaussian statistics). Furthermore, the focus is typically on only estimating the first and second moments. We propose a data-driven methodology to estimate the full distributional form of model uncertainty, i.e. the transition density p(x_t | x_{t-1}). All sources of uncertainty associated with the model simulations are considered collectively, without needing to devise stochastic perturbations for individual components (such as model input, parameter and structural uncertainty). A training period is used to derive the distribution of errors in observed variables conditioned on hidden states. Errors in hidden states are estimated from the conditional distribution of observed variables using non-linear optimization. The theory behind the framework and case study applications are discussed in detail. Results demonstrate improved predictions and more realistic uncertainty bounds compared to a standard perturbation approach.
NASA Astrophysics Data System (ADS)
Dong, Huanhuan; Liu, Jing; Liu, Xiaoru; Yu, Yanying; Cao, Shuwen
2018-01-01
A collection of thirty-six aromatic heterocycle thiosemicarbazone analogues presenting a broad span of anti-tyrosinase activities was designed and obtained. A robust and reliable two-dimensional quantitative structure-activity relationship model, as evidenced by the high q2 and r2 values (0.848 and 0.893, respectively), was built from the analogues to predict the quantitative chemical-biological relationship and the direction of new modifications. Inhibitory activities of the compounds were found to depend greatly on molecular shape and orbital energy. Substituents that brought about large ovality and high highest-occupied molecular orbital energy values helped to improve the activity of these analogues. The molecular docking results provided visual evidence for the QSAR analysis and inhibition mechanism. Based on these, two novel tyrosinase inhibitors, O04 and O05, with predicted IC50 of 0.5384 and 0.8752 nM were designed and suggested for further research.
Spectral optimization and uncertainty quantification in combustion modeling
NASA Astrophysics Data System (ADS)
Sheen, David Allan
Reliable simulations of reacting flow systems require a well-characterized, detailed chemical model as a foundation. Accuracy of such a model can be assured, in principle, by a multi-parameter optimization against a set of experimental data. However, the inherent uncertainties in the rate evaluations and experimental data leave a model still characterized by some finite kinetic rate parameter space. Without a careful analysis of how this uncertainty space propagates into the model's predictions, those predictions can at best be trusted only qualitatively. In this work, the Method of Uncertainty Minimization using Polynomial Chaos Expansions is proposed to quantify these uncertainties. In this method, the uncertainty in the rate parameters of the as-compiled model is quantified. Then, the model is subjected to a rigorous multi-parameter optimization, as well as a consistency-screening process. Lastly, the uncertainty of the optimized model is calculated using an inverse spectral optimization technique, and then propagated into a range of simulation conditions. An as-compiled, detailed H2/CO/C1-C4 kinetic model is combined with a set of ethylene combustion data to serve as an example. The idea that the hydrocarbon oxidation model should be understood and developed in a hierarchical fashion has been a major driving force in kinetics research for decades. How this hierarchical strategy works at a quantitative level, however, has never been addressed. In this work, we use ethylene and propane combustion as examples and explore the question of hierarchical model development quantitatively. The Method of Uncertainty Minimization using Polynomial Chaos Expansions is utilized to quantify the amount of information that a particular combustion experiment, and thereby each data set, contributes to the model. This knowledge is applied to explore the relationships among the combustion chemistry of hydrogen/carbon monoxide, ethylene, and larger alkanes. Frequently, new data will
Rivera, Gildardo; Andrade-Ochoa, Sergio; Romero, Manolo S Ortega; Palos, Isidro; Monge, Antonio; Sanchez-Torres, Luvia Enid
2017-01-01
Quinoxalines have shown a wide variety of biological activities including as antitumor agents. The aims of this study were to evaluate the activity of quinoxaline 1,4-di-N-oxide derivatives on K562 cells, the establishment of the mechanism of induced cell death, and the construction of predictive QSAR models. Sixteen esters of quinoxaline-7-carboxylate 1,4-di-N-oxide were evaluated for antitumor activity on K562 chronic myelogenous leukemia cells and their IC50 values were determined. The mechanism of induced cell death by the most active molecule was assessed by flow cytometry and an in silico study was conducted to optimize and calculate theoretical descriptors of all quinoxaline 1,4-di-N-oxide derivatives. QSAR and QPAR models were created using genetic algorithms. Our results show that compounds C5, C7, C10, C12 and C15 had the lowest IC50 of the series. C15 was the most active compound (IC50= 3.02 μg/mL), inducing caspase-dependent apoptotic cell death via the intrinsic pathway. QSAR and QPAR studies are discussed. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Partitioning uncertainty in streamflow projections under nonstationary model conditions
NASA Astrophysics Data System (ADS)
Chawla, Ila; Mujumdar, P. P.
2018-02-01
Assessing the impacts of Land Use (LU) and climate change on future streamflow projections is necessary for efficient management of water resources. However, model projections are burdened with significant uncertainty arising from various sources. Most of the previous studies have considered climate models and scenarios as major sources of uncertainty, but uncertainties introduced by land use change and hydrologic model assumptions are rarely investigated. In this paper an attempt is made to segregate the contribution from (i) general circulation models (GCMs), (ii) emission scenarios, (iii) land use scenarios, (iv) stationarity assumption of the hydrologic model, and (v) internal variability of the processes, to overall uncertainty in streamflow projections using analysis of variance (ANOVA) approach. Generally, most of the impact assessment studies are carried out with unchanging hydrologic model parameters in future. It is, however, necessary to address the nonstationarity in model parameters with changing land use and climate. In this paper, a regression based methodology is presented to obtain the hydrologic model parameters with changing land use and climate scenarios in future. The Upper Ganga Basin (UGB) in India is used as a case study to demonstrate the methodology. The semi-distributed Variable Infiltration Capacity (VIC) model is set-up over the basin, under nonstationary conditions. Results indicate that model parameters vary with time, thereby invalidating the often-used assumption of model stationarity. The streamflow in UGB under the nonstationary model condition is found to reduce in future. The flows are also found to be sensitive to changes in land use. Segregation results suggest that model stationarity assumption and GCMs along with their interactions with emission scenarios, act as dominant sources of uncertainty. This paper provides a generalized framework for hydrologists to examine stationarity assumption of models before considering them
Autocorrelation descriptor improvements for QSAR: 2DA_Sign and 3DA_Sign
NASA Astrophysics Data System (ADS)
Sliwoski, Gregory; Mendenhall, Jeffrey; Meiler, Jens
2016-03-01
Quantitative structure-activity relationship (QSAR) is a branch of computer aided drug discovery that relates chemical structures to biological activity. Two well established and related QSAR descriptors are two- and three-dimensional autocorrelation (2DA and 3DA). These descriptors encode the relative position of atoms or atom properties by calculating the separation between atom pairs in terms of number of bonds (2DA) or Euclidean distance (3DA). The sums of all values computed for a given small molecule are collected in a histogram. Atom properties can be added with a coefficient that is the product of atom properties for each pair. This procedure can lead to information loss when signed atom properties are considered such as partial charge. For example, the product of two positive charges is indistinguishable from the product of two equivalent negative charges. In this paper, we present variations of 2DA and 3DA called 2DA_Sign and 3DA_Sign that avoid information loss by splitting unique sign pairs into individual histograms. We evaluate these variations with models trained on nine datasets spanning a range of drug target classes. Both 2DA_Sign and 3DA_Sign significantly increase model performance across all datasets when compared with traditional 2DA and 3DA. Lastly, we find that limiting 3DA_Sign to maximum atom pair distances of 6 Å instead of 12 Å further increases model performance, suggesting that conformational flexibility may hinder performance with longer 3DA descriptors. Consistent with this finding, limiting the number of bonds in 2DA_Sign from 11 to 5 fails to improve performance.
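The information loss described in this abstract can be shown with a small sketch. This is not the authors' implementation: the function names, the adjacency-list molecule representation, the restriction to unordered atom pairs, and the use of the absolute product in the signed variant are all assumptions made for illustration.

```python
from collections import deque

def bond_distances(adjacency):
    """All-pairs shortest path lengths (in bonds) via breadth-first search."""
    n = len(adjacency)
    dist = [[None] * n for _ in range(n)]
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if dist[start][v] is None:
                    dist[start][v] = dist[start][u] + 1
                    queue.append(v)
    return dist

def autocorr_2da(adjacency, prop, max_bonds):
    """Plain 2DA: sum prop[i] * prop[j] into a bin per bond distance."""
    dist = bond_distances(adjacency)
    hist = [0.0] * (max_bonds + 1)
    for i in range(len(prop)):
        for j in range(i + 1, len(prop)):
            d = dist[i][j]
            if d is not None and d <= max_bonds:
                hist[d] += prop[i] * prop[j]
    return hist

def autocorr_2da_sign(adjacency, prop, max_bonds):
    """Signed variant: one histogram per sign pair (++, --, +-)."""
    dist = bond_distances(adjacency)
    hists = {key: [0.0] * (max_bonds + 1) for key in ("++", "--", "+-")}
    for i in range(len(prop)):
        for j in range(i + 1, len(prop)):
            d = dist[i][j]
            if d is None or d > max_bonds:
                continue
            if prop[i] >= 0 and prop[j] >= 0:
                key = "++"
            elif prop[i] < 0 and prop[j] < 0:
                key = "--"
            else:
                key = "+-"
            hists[key][d] += abs(prop[i] * prop[j])
    return hists

# Hypothetical two-atom molecule with one bond and two charge assignments.
bonded_pair = [[1], [0]]
plain_pos = autocorr_2da(bonded_pair, [0.5, 0.5], max_bonds=1)
plain_neg = autocorr_2da(bonded_pair, [-0.5, -0.5], max_bonds=1)
signed_pos = autocorr_2da_sign(bonded_pair, [0.5, 0.5], max_bonds=1)
signed_neg = autocorr_2da_sign(bonded_pair, [-0.5, -0.5], max_bonds=1)
```

In this toy case the plain histograms are identical for the two charge assignments, while the signed variant places the contribution in different histograms, which is the distinction the abstract attributes to 2DA_Sign.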
Chen, H F; Dong, X C; Zen, B S; Gao, K; Yuan, S G; Panaye, A; Doucet, J P; Fan, B T
2003-08-01
An efficient virtual and rational drug design method is presented. It combines virtual bioactive compound generation with 3D-QSAR model and docking. Using this method, it is possible to generate a lot of highly diverse molecules and find virtual active lead compounds. The method was validated by the study of a set of anti-tumor drugs. With the constraints of pharmacophore obtained by DISCO implemented in SYBYL 6.8, 97 virtual bioactive compounds were generated, and their anti-tumor activities were predicted by CoMFA. Eight structures with high activity were selected and screened by the 3D-QSAR model. The most active generated structure was further investigated by modifying its structure in order to increase the activity. A comparative docking study with telomeric receptor was carried out, and the results showed that the generated structures could form more stable complexes with receptor than the reference compound selected from experimental data. This investigation showed that the proposed method was a feasible way for rational drug design with high screening efficiency.
Relating Data and Models to Characterize Parameter and Prediction Uncertainty
Applying PBPK models in risk analysis requires that we realistically assess the uncertainty of relevant model predictions in as quantitative a way as possible. The reality of human variability may add a confusing feature to the overall uncertainty assessment, as uncertainty and v...
Analytic uncertainty and sensitivity analysis of models with input correlations
NASA Astrophysics Data System (ADS)
Zhu, Yueying; Wang, Qiuping A.; Li, Wei; Cai, Xu
2018-03-01
Probabilistic uncertainty analysis is a common means of evaluating mathematical models. In mathematical modeling, the uncertainty in input variables is specified through distribution laws. Its contribution to the uncertainty in model response is usually analyzed by assuming that input variables are independent of each other. However, correlated parameters often occur in practical applications. In the present paper, an analytic method is built for the uncertainty and sensitivity analysis of models in the presence of input correlations. With the method, it is straightforward to identify the importance of the independence and correlations of input variables in determining the model response. This allows one to decide whether or not the input correlations should be considered in practice. Numerical examples suggest the effectiveness and validity of our analytic method in the analysis of general models. A practical application of the method to the uncertainty and sensitivity analysis of a deterministic HIV model is also proposed.
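To make the role of input correlations concrete, the sketch below uses the standard first-order (delta-method) variance formula, Var(y) ≈ Σ_i Σ_j g_i g_j σ_i σ_j ρ_ij, where g is the gradient of the model at the nominal point. This is a textbook approximation, not the analytic method of the paper, and the two-input model, unit standard deviations, and correlation of 0.5 are hypothetical.

```python
def output_variance(gradient, sigmas, corr):
    """First-order variance of y = f(x): sum over i, j of g_i g_j sigma_i sigma_j rho_ij."""
    n = len(gradient)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += gradient[i] * gradient[j] * sigmas[i] * sigmas[j] * corr[i][j]
    return total


# Hypothetical model y = x1 + x2 with unit standard deviations.
gradient = [1.0, 1.0]
sigmas = [1.0, 1.0]
independent = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.5], [0.5, 1.0]]

var_independent = output_variance(gradient, sigmas, independent)
var_correlated = output_variance(gradient, sigmas, correlated)

# Share of the output variance that exists only because the inputs are correlated.
correlation_share = (var_correlated - var_independent) / var_correlated
```

Comparing the two variances is one simple way to judge whether ignoring input correlations would matter for a given model.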
Lu, Qingzhang; Shen, Guoli; Yu, Ruqin
2002-11-15
A chaotic dynamical system is introduced into a genetic algorithm used to train an ANN, yielding the CGANN algorithm. Logistic mapping, one of the most important chaotic dynamic mappings, gives each new generation a high chance of retaining the GA's population diversity. This enhances the ability to overcome overfitting in training an ANN. The proposed CGANN has been used for QSAR studies to predict the tetrahedral modes (nu(1)(A1) and nu(2)(E)) of halides [MX(4)](epsilon). The frequencies predicted by QSAR were compared with those calculated by quantum chemistry methods including PM3, AM1, and MNDO/d. The possibility of improving the predictive ability of QSAR by including quantum chemistry parameters as feature variables has been investigated using tetrahedral tetrahalide examples. Copyright 2002 Wiley Periodicals, Inc.
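The logistic map mentioned in this abstract is the recurrence x_{n+1} = r·x_n·(1 − x_n). The sketch below shows one plausible way such a map could seed a diverse population of candidate weight vectors; how CGANN actually couples the map to the genetic algorithm is not stated in the abstract, so the parameter r = 4, the seed 0.3, the population shape, and the helper names are all assumptions.

```python
def logistic_sequence(x0, length, r=4.0):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and return `length` values after x0."""
    values = []
    x = x0
    for _ in range(length):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_population(x0, pop_size, n_genes, lower, upper):
    """Build a pop_size x n_genes population by scaling map values to [lower, upper]."""
    stream = logistic_sequence(x0, pop_size * n_genes)
    return [
        [lower + (upper - lower) * stream[i * n_genes + g] for g in range(n_genes)]
        for i in range(pop_size)
    ]


# Hypothetical use: five candidate weight vectors with three weights each.
population = chaotic_population(x0=0.3, pop_size=5, n_genes=3, lower=-1.0, upper=1.0)
```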
Modeling of Complex Mixtures: JP-8 Toxicokinetics
2008-10-01
generic tissue compartments in which we have combined diffusion limitation and deep tissue (global tissue model). We also applied a QSAR approach for...SUBJECT TERMS jet fuel, JP-8, PBPK modeling, complex mixtures, nonane, decane, naphthalene, QSAR , alternative fuels 16. SECURITY CLASSIFICATION OF...necessary, to apply to the interaction of specific compounds with specific tissues. We have also applied a QSAR approach for estimating blood and tissue
Quantitative structure-activity relationships (QSARs) are being developed to predict the toxicological endpoints for untested chemicals similar in structure to chemicals that have known experimental toxicological data. Based on a very large number of predetermined descriptors, a...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Liying; Sedykh, Alexander; Tripathi, Ashutosh
2013-10-01
Identification of endocrine disrupting chemicals is one of the important goals of environmental chemical hazard screening. We report on the development of validated in silico predictors of chemicals likely to cause estrogen receptor (ER)-mediated endocrine disruption to facilitate their prioritization for future screening. A database of relative binding affinity of a large number of ERα and/or ERβ ligands was assembled (546 for ERα and 137 for ERβ). Both single-task learning (STL) and multi-task learning (MTL) continuous quantitative structure–activity relationship (QSAR) models were developed for predicting ligand binding affinity to ERα or ERβ. High predictive accuracy was achieved for ERα binding affinity (MTL R² = 0.71, STL R² = 0.73). For ERβ binding affinity, MTL models were significantly more predictive (R² = 0.53, p < 0.05) than STL models. In addition, docking studies were performed on a set of ER agonists/antagonists (67 agonists and 39 antagonists for ERα, 48 agonists and 32 antagonists for ERβ, supplemented by putative decoys/non-binders) using the following ER structures (in complexes with respective ligands) retrieved from the Protein Data Bank: ERα agonist (PDB ID: 1L2I), ERα antagonist (PDB ID: 3DT3), ERβ agonist (PDB ID: 2NV7), and ERβ antagonist (PDB ID: 1L2J). We found that all four ER conformations discriminated their corresponding ligands from presumed non-binders. Finally, both QSAR models and ER structures were employed in parallel to virtually screen several large libraries of environmental chemicals to derive a ligand- and structure-based prioritized list of putative estrogenic compounds to be used for in vitro and in vivo experimental validation. - Highlights: • This is the largest curated dataset inclusive of ERα and β (the latter is unique). • New methodology that for the first time affords acceptable ERβ models. • A combination of QSAR and docking enables prediction of affinity and
Data-driven Modelling for decision making under uncertainty
NASA Astrophysics Data System (ADS)
Angria S, Layla; Dwi Sari, Yunita; Zarlis, Muhammad; Tulus
2018-01-01
Decision making under uncertainty has become a prominent topic in operations research. Many models have been presented, one of which is data-driven modelling (DDM). The purpose of this paper is to extract and recognize patterns in data, and to find the best model for decision-making problems under uncertainty by using a data-driven modeling approach with linear programming, linear and nonlinear differential equations, and a Bayesian approach. The candidate models are tested against model criteria, and the model with the smallest error is taken as the best model to use.
Characterizing Uncertainty and Variability in PBPK Models ...
Mode-of-action based risk and safety assessments can rely upon tissue dosimetry estimates in animals and humans obtained from physiologically-based pharmacokinetic (PBPK) modeling. However, risk assessment also increasingly requires characterization of uncertainty and variability; such characterization for PBPK model predictions represents a continuing challenge to both modelers and users. Current practices show significant progress in specifying deterministic biological models and the non-deterministic (often statistical) models, estimating their parameters using diverse data sets from multiple sources, and using them to make predictions and characterize uncertainty and variability. The International Workshop on Uncertainty and Variability in PBPK Models, held Oct 31-Nov 2, 2006, sought to identify the state-of-the-science in this area and recommend priorities for research and changes in practice and implementation. For the short term, these include: (1) multidisciplinary teams to integrate deterministic and non-deterministic/statistical models; (2) broader use of sensitivity analyses, including for structural and global (rather than local) parameter changes; and (3) enhanced transparency and reproducibility through more complete documentation of the model structure(s) and parameter values, the results of sensitivity and other analyses, and supporting, discrepant, or excluded data. Longer-term needs include: (1) theoretic and practical methodological impro
'spup' - an R package for uncertainty propagation in spatial environmental modelling
NASA Astrophysics Data System (ADS)
Sawicka, Kasia; Heuvelink, Gerard
2016-04-01
Computer models have become a crucial tool in engineering and environmental sciences for simulating the behaviour of complex static and dynamic systems. However, while many models are deterministic, the uncertainty in their predictions needs to be estimated before they are used for decision support. Currently, advances in uncertainty propagation and assessment have been paralleled by a growing number of software tools for uncertainty analysis, but none has gained recognition for universal applicability, including case studies with spatial models and spatial model inputs. Due to the growing popularity and applicability of the open source R programming language we undertook a project to develop an R package that facilitates uncertainty propagation analysis in spatial environmental modelling. In particular, the 'spup' package provides functions for examining the uncertainty propagation starting from input data and model parameters, via the environmental model onto model predictions. The functions include uncertainty model specification, stochastic simulation and propagation of uncertainty using Monte Carlo (MC) techniques, as well as several uncertainty visualization functions. Uncertain environmental variables are represented in the package as objects whose attribute values may be uncertain and described by probability distributions. Both numerical and categorical data types are handled. Spatial auto-correlation within an attribute and cross-correlation between attributes are also accommodated. For uncertainty propagation the package has implemented the MC approach with efficient sampling algorithms, i.e. stratified random sampling and Latin hypercube sampling. The design includes facilitation of parallel computing to speed up MC computation. The MC realizations may be used as an input to the environmental models called from R, or externally. Selected static and interactive visualization methods that are understandable by non-experts with limited background in
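The Monte Carlo propagation with Latin hypercube sampling described in this record can be sketched generically in Python. This is not the 'spup' API (the package is written in R, and none of its function names are reproduced here); the toy model, the uniform input ranges, and every identifier below are assumptions chosen only to show the technique.

```python
import random


def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims, one point per stratum in each dimension."""
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        columns.append([(s + rng.random()) / n_samples for s in strata])
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


def propagate(model, ranges, n_samples, seed=0):
    """Push Latin hypercube samples of uniform inputs through `model`; return outputs."""
    rng = random.Random(seed)
    unit_points = latin_hypercube(n_samples, len(ranges), rng)
    outputs = []
    for point in unit_points:
        inputs = [lo + u * (hi - lo) for u, (lo, hi) in zip(point, ranges)]
        outputs.append(model(*inputs))
    return outputs


def toy_model(rainfall, runoff_coeff):
    """Hypothetical deterministic model: runoff as a fixed fraction of rainfall."""
    return rainfall * runoff_coeff


outputs = propagate(toy_model, ranges=[(80.0, 120.0), (0.2, 0.4)], n_samples=100)
mean_output = sum(outputs) / len(outputs)
```

Stratifying each input into equally probable intervals, as done here, is what lets Latin hypercube sampling cover the input space with fewer realizations than simple random sampling.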
Parameterization of Model Validating Sets for Uncertainty Bound Optimizations. Revised
NASA Technical Reports Server (NTRS)
Lim, K. B.; Giesy, D. P.
2000-01-01
Given measurement data, a nominal model and a linear fractional transformation uncertainty structure with an allowance on unknown but bounded exogenous disturbances, easily computable tests for the existence of a model validating uncertainty set are given. Under mild conditions, these tests are necessary and sufficient for the case of complex, nonrepeated, block-diagonal structure. For the more general case which includes repeated and/or real scalar uncertainties, the tests are only necessary but become sufficient if a collinearity condition is also satisfied. With the satisfaction of these tests, it is shown that a parameterization of all model validating sets of plant models is possible. The new parameterization is used as a basis for a systematic way to construct or perform uncertainty tradeoff with model validating uncertainty sets which have specific linear fractional transformation structure for use in robust control design and analysis. An illustrative example which includes a comparison of candidate model validating sets is given.
Assessing model uncertainty using hexavalent chromium and ...
Introduction: The National Research Council recommended quantitative evaluation of uncertainty in effect estimates for risk assessment. This analysis considers uncertainty across model forms and model parameterizations with hexavalent chromium [Cr(VI)] and lung cancer mortality as an example. The objective of this analysis is to characterize model uncertainty by evaluating the variance in estimates across several epidemiologic analyses. Methods: This analysis compared 7 publications analyzing two different chromate production sites in Ohio and Maryland. The Ohio cohort consisted of 482 workers employed from 1940-72, while the Maryland site employed 2,357 workers from 1950-74. Cox and Poisson models were the only model forms considered by study authors to assess the effect of Cr(VI) on lung cancer mortality. All models adjusted for smoking and included a 5-year exposure lag; however, other latency periods and model covariates such as age and race were considered. Published effect estimates were standardized to the same units and normalized by their variances to produce a standardized metric to compare variability in estimates across and within model forms. A total of 7 similarly parameterized analyses were considered across model forms, and 23 analyses with alternative parameterizations were considered within model form (14 Cox; 9 Poisson). Results: Across Cox and Poisson model forms, adjusted cumulative exposure coefficients for 7 similar analyses ranged from 2.47
NASA Astrophysics Data System (ADS)
Tsai, F. T.; Elshall, A. S.; Hanor, J. S.
2012-12-01
Subsurface modeling is challenging because of many possible competing propositions for each uncertain model component. How can we judge that we are selecting the correct proposition for an uncertain model component out of numerous competing propositions? How can we bridge the gap between synthetic mental principles such as mathematical expressions on one hand, and empirical observation such as observation data on the other hand when uncertainty exists on both sides? In this study, we introduce hierarchical Bayesian model averaging (HBMA) as a multi-model (multi-proposition) framework to represent our current state of knowledge and decision for hydrogeological structure modeling. The HBMA framework allows for segregating and prioritizing different sources of uncertainty, and for comparative evaluation of competing propositions for each source of uncertainty. We applied the HBMA to a study of hydrostratigraphy and uncertainty propagation of the Southern Hills aquifer system in the Baton Rouge area, Louisiana. We used geophysical data for hydrogeological structure construction through the indicator hydrostratigraphy method and used lithologic data from drillers' logs for model structure calibration. However, due to uncertainty in model data, structure and parameters, multiple possible hydrostratigraphic models were produced and calibrated. The study considered four sources of uncertainties. To evaluate mathematical structure uncertainty, the study considered three different variogram models and two geological stationarity assumptions. With respect to geological structure uncertainty, the study considered two geological structures with respect to the Denham Springs-Scotlandville fault. With respect to data uncertainty, the study considered two calibration data sets. These four sources of uncertainty with their corresponding competing modeling propositions resulted in 24 calibrated models. The results showed that by segregating different sources of uncertainty, HBMA analysis
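The count of 24 calibrated models follows from combining the propositions stated in this abstract (3 × 2 × 2 × 2). The sketch below enumerates those combinations and adds the standard Bayesian model averaging split of variance into within-model and between-model parts. The proposition labels are placeholders, since the abstract gives only the counts, and the flat (non-hierarchical) averaging function is a simplification rather than the HBMA procedure itself.

```python
from itertools import product

# Placeholder labels: the abstract states how many propositions exist, not their names.
sources = {
    "variogram_model": ["variogram_1", "variogram_2", "variogram_3"],
    "stationarity": ["assumption_1", "assumption_2"],
    "fault_structure": ["structure_1", "structure_2"],
    "calibration_data": ["dataset_1", "dataset_2"],
}

# One candidate model per combination of propositions.
models = [dict(zip(sources, combo)) for combo in product(*sources.values())]


def bma_moments(means, variances, weights):
    """Model-averaged mean, within-model variance, and between-model variance."""
    mean = sum(w * m for w, m in zip(weights, means))
    within = sum(w * v for w, v in zip(weights, variances))
    between = sum(w * (m - mean) ** 2 for w, m in zip(weights, means))
    return mean, within, between


# Hypothetical two-model example with equal posterior weights.
avg_mean, within_var, between_var = bma_moments(
    means=[1.0, 3.0], variances=[0.5, 0.5], weights=[0.5, 0.5]
)
```

The between-model term is the part of the total variance that comes from disagreement among competing propositions, which is the quantity a multi-model framework is designed to expose.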
Model structures amplify uncertainty in predicted soil carbon responses to climate change.
Shi, Zheng; Crowell, Sean; Luo, Yiqi; Moore, Berrien
2018-06-04
Large model uncertainty in projected future soil carbon (C) dynamics has been well documented. However, our understanding of the sources of this uncertainty is limited. Here we quantify the uncertainties arising from model parameters, structures and their interactions, and how those uncertainties propagate through different models to projections of future soil carbon stocks. Both the vertically resolved model and the microbial explicit model project much greater uncertainties to climate change than the conventional soil C model, with both positive and negative C-climate feedbacks, whereas the conventional model consistently predicts positive soil C-climate feedback. Our findings suggest that diverse model structures are necessary to increase confidence in soil C projection. However, the larger uncertainty in the complex models also suggests that we need to strike a balance between model complexity and the need to include diverse model structures in order to forecast soil C dynamics with high confidence and low uncertainty.
He, L; Huang, G H; Lu, H W
2010-04-15
Solving groundwater remediation optimization problems based on proxy simulators can usually yield optimal solutions differing from the "true" ones of the problem. This study presents a new stochastic optimization model under modeling uncertainty and parameter certainty (SOMUM) and the associated solution method for simultaneously addressing modeling uncertainty associated with simulator residuals and optimizing groundwater remediation processes. This is a new attempt, different from previous modeling efforts. The previous ones focused on addressing uncertainty in physical parameters (e.g. soil porosity), while this one aims to deal with uncertainty in the mathematical simulator (arising from model residuals). Compared to the existing modeling approaches (i.e. those in which only parameter uncertainty is considered), the model has the advantages of providing mean-variance analysis for contaminant concentrations, mitigating the effects of modeling uncertainties on optimal remediation strategies, offering confidence levels of optimal remediation strategies to system designers, and reducing computational cost in optimization processes. 2009 Elsevier B.V. All rights reserved.
Robustness for slope stability modelling under deep uncertainty
NASA Astrophysics Data System (ADS)
Almeida, Susana; Holcombe, Liz; Pianosi, Francesca; Wagener, Thorsten
2015-04-01
Landslides can have large negative societal and economic impacts, such as loss of life and damage to infrastructure. However, the ability of slope stability assessment to guide management is limited by high levels of uncertainty in model predictions. Many of these uncertainties cannot be easily quantified, such as those linked to climate change and other future socio-economic conditions, restricting the usefulness of traditional decision analysis tools. Deep uncertainty can be managed more effectively by developing robust, but not necessarily optimal, policies that are expected to perform adequately under a wide range of future conditions. Robust strategies are particularly valuable when the consequences of taking a wrong decision are high, as is often the case when managing natural hazard risks such as landslides. In our work a physically based numerical model of hydrologically induced slope instability (the Combined Hydrology and Stability Model - CHASM) is applied together with robust decision making to evaluate the most important uncertainties (storm events, groundwater conditions, surface cover, slope geometry, material strata and geotechnical properties) affecting slope stability. Specifically, impacts of climate change on long-term slope stability are incorporated, accounting for the deep uncertainty in future climate projections. Our findings highlight the potential of robust decision making to aid decision support for landslide hazard reduction and risk management under conditions of deep uncertainty.
Satellite Re-entry Modeling and Uncertainty Quantification
NASA Astrophysics Data System (ADS)
Horsley, M.
2012-09-01
LEO trajectory modeling is a fundamental aerospace capability and has applications in many areas of aerospace, such as maneuver planning, sensor scheduling, re-entry prediction, collision avoidance, risk analysis, and formation flying. Somewhat surprisingly, modeling the trajectory of an object in low Earth orbit is still a challenging task. This is primarily due to the large uncertainty in the upper atmospheric density, about 15-20% (1-sigma) for most thermosphere models. Other contributions come from our inability to precisely model future solar and geomagnetic activities, the potentially unknown shape, material construction and attitude history of the satellite, and intermittent, noisy tracking data. Current methods to predict a satellite's re-entry trajectory typically involve making a single prediction, with the uncertainty dealt with in an ad-hoc manner, usually based on past experience. However, due to the extreme speed of a LEO satellite, even small uncertainties in the re-entry time translate into a very large uncertainty in the location of the re-entry event. Currently, most methods simply update the re-entry estimate on a regular basis. This results in a wide range of estimates that are literally spread over the entire globe. With no understanding of the underlying distribution of potential impact points, the sequence of impact points predicted by the current methodology is largely useless until just a few hours before re-entry. This paper will discuss the development of a set of High Performance Computing (HPC)-based capabilities to support near real-time quantification of the uncertainty inherent in uncontrolled satellite re-entries. An appropriate management of the uncertainties is essential for a rigorous treatment of the re-entry/LEO trajectory problem. The development of HPC-based tools for re-entry analysis is important as it will allow a rigorous and robust approach to risk assessment by decision makers in an operational setting. Uncertainty
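The claim that small timing errors become large location errors is simple arithmetic: along-track distance is roughly speed times time. The sketch below uses round, assumed figures (about 7.8 km/s for a low circular orbit and about 40,075 km for Earth's equatorial circumference) that do not come from the abstract, and it ignores the difference between orbital speed and ground-track speed.

```python
ORBITAL_SPEED_KM_S = 7.8         # assumed round figure for a low circular orbit
EARTH_CIRCUMFERENCE_KM = 40075   # assumed equatorial value


def along_track_uncertainty_km(time_uncertainty_s, speed_km_s=ORBITAL_SPEED_KM_S):
    """Distance travelled along the orbit during the re-entry time uncertainty."""
    return speed_km_s * time_uncertainty_s


def fraction_of_orbit(time_uncertainty_s):
    """Rough share of one trip around the Earth covered by that uncertainty."""
    return along_track_uncertainty_km(time_uncertainty_s) / EARTH_CIRCUMFERENCE_KM


# Hypothetical case: the re-entry time is known only to within 10 minutes.
ten_minute_spread_km = along_track_uncertainty_km(10 * 60)
ten_minute_fraction = fraction_of_orbit(10 * 60)
```

Under these assumptions a 10-minute uncertainty corresponds to several thousand kilometres along the track, on the order of a tenth of a full revolution, which is why estimates made days ahead scatter across the globe.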
Holistic uncertainty analysis in river basin modeling for climate vulnerability assessment
NASA Astrophysics Data System (ADS)
Taner, M. U.; Wi, S.; Brown, C.
2017-12-01
The challenges posed by uncertain future climate are a prominent concern for water resources managers. A number of frameworks exist for assessing the impacts of climate-related uncertainty, including internal climate variability and anthropogenic climate change, such as scenario-based approaches and vulnerability-based approaches. While in many cases climate uncertainty may be dominant, other factors such as future evolution of the river basin, hydrologic response and reservoir operations are potentially significant sources of uncertainty. While uncertainty associated with modeling hydrologic response has received attention, very little attention has focused on the range of uncertainty and possible effects of the water resources infrastructure and management. This work presents a holistic framework that allows analysis of climate, hydrologic and water management uncertainty in water resources systems analysis with the aid of a water system model designed to integrate component models for hydrology processes and water management activities. The uncertainties explored include those associated with climate variability and change, hydrologic model parameters, and water system operation rules. A Bayesian framework is used to quantify and model the uncertainties at each modeling step in an integrated fashion, including prior and likelihood information about model parameters. The framework is demonstrated in a case study for the St. Croix Basin located at the border of the United States and Canada.
Estimating the fates of organic contaminants in an aquifer using QSAR.
Lim, Seung Joo; Fox, Peter
2013-01-01
The quantitative structure activity relationship (QSAR) model, BIOWIN, was modified to more accurately estimate the fates of organic contaminants in an aquifer. The predictions from BIOWIN were modified to include oxidation and sorption effects. The predictive model therefore included the effects of sorption, biodegradation, and oxidation. A total of 35 organic compounds were used to validate the predictive model. The majority of the ratios of predicted half-life to measured half-life were within a factor of 2 and no ratio values were greater than a factor of 5. In addition, the accuracy of estimating the persistence of organic compounds in the sub-surface was superior when modified by the relative fraction adsorbed to the solid phase, 1/Rf, to that when modified by the remaining fraction of a given compound adsorbed to a solid, 1 - fs.
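The "within a factor of 2" and "factor of 5" criteria used for validation can be written as a small check on the ratio of predicted to measured half-life. The sketch assumes the symmetric reading (1/k ≤ ratio ≤ k), and the half-life pairs are hypothetical values, not data from the study.

```python
def within_factor(predicted, measured, factor):
    """True if predicted/measured lies between 1/factor and factor."""
    ratio = predicted / measured
    return 1.0 / factor <= ratio <= factor


def fraction_within(pairs, factor):
    """Share of (predicted, measured) pairs that agree within the given factor."""
    hits = sum(1 for predicted, measured in pairs if within_factor(predicted, measured, factor))
    return hits / len(pairs)


# Hypothetical (predicted, measured) half-lives in days.
half_lives = [(10.0, 8.0), (30.0, 20.0), (12.0, 40.0), (5.0, 5.0)]

share_factor_2 = fraction_within(half_lives, 2.0)
share_factor_5 = fraction_within(half_lives, 5.0)
```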
Ragno, Rino; Artico, Marino; De Martino, Gabriella; La Regina, Giuseppe; Coluccia, Antonio; Di Pasquali, Alessandra; Silvestri, Romano
2005-01-13
Three-dimensional quantitative structure-activity relationship (3-D QSAR) studies and docking simulations were developed on indolyl aryl sulfones (IASs), a class of novel HIV-1 non-nucleoside reverse transcriptase (RT) inhibitors (Silvestri, et al. J. Med. Chem. 2003, 46, 2482-2493) highly active against wild type and some clinically relevant resistant strains (Y181C, the double mutant K103N-Y181C, and the K103R-V179D-P225H strain, highly resistant to efavirenz). Predictive 3-D QSAR models using the combination of GRID and GOLPE programs were obtained using a receptor-based alignment by means of docking IASs into the non-nucleoside binding site (NNBS) of RT. The derived 3-D QSAR models showed conventional correlation (r²) and cross-validated (q²) coefficient values ranging from 0.79 to 0.93 and from 0.59 to 0.84, respectively. All described models were validated by an external test set compiled from previously reported pyrryl aryl sulfones (Artico, et al. J. Med. Chem. 1996, 39, 522-530). The most predictive 3-D QSAR model was then used to predict the activity of novel untested IASs. The synthesis of six designed derivatives (prediction set) allowed disclosure of new IASs endowed with high anti-HIV-1 activities.
Assessing and reducing hydrogeologic model uncertainty
USDA-ARS's Scientific Manuscript database
NRC is sponsoring research that couples model abstraction techniques with model uncertainty assessment methods. Insights and information from this program will be useful in decision making by NRC staff, licensees and stakeholders in their assessment of subsurface radionuclide transport. All analytic...
Framework for Uncertainty Assessment - Hanford Site-Wide Groundwater Flow and Transport Modeling
NASA Astrophysics Data System (ADS)
Bergeron, M. P.; Cole, C. R.; Murray, C. J.; Thorne, P. D.; Wurstner, S. K.
2002-05-01
Pacific Northwest National Laboratory is developing and implementing an uncertainty estimation methodology for use in future site assessments that addresses parameter uncertainty as well as uncertainties related to the groundwater conceptual model. The long-term goals of the effort are development and implementation of an uncertainty estimation methodology for use in future assessments and analyses being made with the Hanford site-wide groundwater model. The basic approach in the framework developed for uncertainty assessment consists of: 1) Alternate conceptual model (ACM) identification to identify and document the major features and assumptions of each conceptual model. The process must also include a periodic review of the existing and proposed new conceptual models as data or understanding become available. 2) ACM development of each identified conceptual model through inverse modeling with historical site data. 3) ACM evaluation to identify which of the conceptual models are plausible and should be included in any subsequent uncertainty assessments. 4) ACM uncertainty assessments will only be carried out for those ACMs determined to be plausible through comparison with historical observations and model structure identification measures. The parameter uncertainty assessment process generally involves: a) Model Complexity Optimization - to identify the important or relevant parameters for the uncertainty analysis; b) Characterization of Parameter Uncertainty - to develop the pdfs for the important uncertain parameters including identification of any correlations among parameters; c) Propagation of Uncertainty - to propagate parameter uncertainties (e.g., by first order second moment methods if applicable or by a Monte Carlo approach) through the model to determine the uncertainty in the model predictions of interest. 5) Estimation of combined ACM and scenario uncertainty by a double sum with each component of the inner sum (an individual CCDF
Sharma, Pratibha; Kumar, Ashok; Sharma, Manisha; Singh, Jitendra; Bandyopadhyay, Prabal; Sathe, Manisha; Kaushik, M P
2012-04-01
The present communication deals with the synthesis of novel 2-methyl-3-[2-(2-methylprop-1-en-1-yl)-1H-benzimidazol-1-yl]pyrimido[1,2-a]benzimidazol-4(3H)-one derivatives under phase transfer catalysis (PTC) conditions using benzyl triethyl ammonium chloride (BTEAC) as the phase transfer catalyst. It also reports in vitro antimicrobial evaluation of the synthesized compounds against representative genera of gram-negative and gram-positive bacteria, i.e., Bacillus subtilis, Staphylococcus aureus, Pseudomonas diminuta and Escherichia coli. All the compounds have been found to manifest profound antimicrobial activity. Moreover, extensive quantitative structure-activity relationship (QSAR) studies have been performed to deduce a correlation between molecular descriptors under consideration and the elicited biological activity. A tri-parametric QSAR model has been generated upon rigorous statistical treatment.
NASA Astrophysics Data System (ADS)
Gramatica, Paola
This chapter surveys the QSAR modeling approaches (developed by the author's research group) for the validated prediction of environmental properties of organic pollutants. Various chemometric methods, based on different theoretical molecular descriptors, have been applied: explorative techniques (such as PCA for ranking, SOM for similarity analysis), modeling approaches by multiple-linear regression (MLR, in particular OLS), and classification methods (mainly k-NN, CART, CP-ANN). The focus of this review is on the main topics of environmental chemistry and ecotoxicology, related to the physico-chemical properties, the reactivity, and biological activity of chemicals of high environmental concern. Thus, the review deals with atmospheric degradation reactions of VOCs by tropospheric oxidants, persistence and long-range transport of POPs, sorption behavior of pesticides (Koc and leaching), bioconcentration, toxicity (acute aquatic toxicity, mutagenicity of PAHs, estrogen binding activity for endocrine disruptors compounds (EDCs)), and finally persistent bioaccumulative and toxic (PBT) behavior for the screening and prioritization of organic pollutants. Common to all the proposed models is the attention paid to model validation for predictive ability (not only internal, but also external for chemicals not participating in the model development) and checking of the chemical domain of applicability. Adherence to such a policy, requested also by the OECD principles, ensures the production of reliable predicted data, useful also in the new European regulation of chemicals, REACH.
NASA Astrophysics Data System (ADS)
Guziałowska-Tic, Joanna
2017-10-01
According to the Directive of the European Parliament and of the Council concerning the protection of animals used for scientific purposes, the number of experiments involving the use of animals needs to be reduced. The methods which can replace animal testing include computational prediction methods, for instance, the quantitative structure-activity relationships (QSAR). These methods are designed to find a cohesive relationship between differences in the values of the properties of molecules and the biological activity of a series of test compounds. This paper compares the author's own experimental results for the n-octanol/water partition coefficient of the hydroxyester HE-1 with those generated by means of three models: Kowwin, MlogP, AlogP. The test results indicate that, in the case of molecular similarity, the highest determination coefficient was obtained for the model MlogP and the lowest root-mean-square error was obtained for the Kowwin method. When comparing the mean logP value obtained using the QSAR models with the value resulting from the author's own experiments, it was observed that the best conformity was that recorded for the model AlogP, where the relative error was 15.2%.
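The three comparison metrics named in this abstract (determination coefficient, root-mean-square error, relative error) can be written out as follows. The exact definitions used in the paper are not given, so the formulas below are the common textbook forms and should be read as assumptions; the logP values are hypothetical and are not the HE-1 measurements.

```python
import math


def relative_error_percent(predicted, experimental):
    """Absolute deviation from the experimental value, as a percentage of it."""
    return abs(predicted - experimental) / abs(experimental) * 100.0


def rmse(predicted, observed):
    """Root-mean-square error between paired predictions and observations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(predicted, observed):
    """Determination coefficient as 1 - SS_res / SS_tot (one common definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for p, o in zip(predicted, observed))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical experimental and model-predicted logP values.
observed_logp = [1.0, 2.0, 3.0]
predicted_logp = [1.1, 1.9, 3.2]

example_rmse = rmse(predicted_logp, observed_logp)
example_r2 = r_squared(predicted_logp, observed_logp)
example_rel_err = relative_error_percent(2.3, 2.0)
```

A model can rank best on one of these metrics and not on the others, which is consistent with the abstract reporting a different best model for each.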
Are models, uncertainty, and dispute resolution compatible?
NASA Astrophysics Data System (ADS)
Anderson, J. D.; Wilson, J. L.
2013-12-01
Models and their uncertainty often move from an objective use in planning and decision making into the regulatory environment, then sometimes on to dispute resolution through litigation or other legal forums. Through this last transition whatever objectivity the models and uncertainty assessment may have once possessed becomes biased (or more biased) as each party chooses to exaggerate either the goodness of a model, or its worthlessness, depending on which view is in its best interest. If worthlessness is desired, then what was uncertain becomes unknown, or even unknowable. If goodness is desired, then precision and accuracy are often exaggerated and uncertainty, if it is explicitly recognized, encompasses only some parameters or conceptual issues, ignores others, and may minimize the uncertainty that it accounts for. In dispute resolution, how well is the adversarial process able to deal with these biases? The challenge is that they are often cloaked in computer graphics and animations that appear to lend realism to what could be mostly fancy, or even a manufactured outcome. While junk science can be challenged through appropriate motions in federal court, and in most state courts, it is not unusual for biased or even incorrect modeling results, or conclusions based on incorrect results, to be permitted to be presented at trial. Courts allow opinions that are based on a "reasonable degree of scientific certainty," but when that 'certainty' is grossly exaggerated by an expert, one way or the other, how well do the courts determine that someone has stepped over the line? Trials are based on the adversary system of justice, so opposing and often irreconcilable views are commonly allowed, leaving it to the judge or jury to sort out the truth. Can advances in scientific theory and engineering practice, related to both modeling and uncertainty, help address this situation and better ensure that juries and judges see more objective modeling results, or at least see
A framework for modeling uncertainty in regional climate change
In this study, we present a new modeling framework and a large ensemble of climate projections to investigate the uncertainty in regional climate change over the United States associated with four dimensions of uncertainty. The sources of uncertainty considered in this framework ...
Modeling Input Errors to Improve Uncertainty Estimates for Sediment Transport Model Predictions
NASA Astrophysics Data System (ADS)
Jung, J. Y.; Niemann, J. D.; Greimann, B. P.
2016-12-01
Bayesian methods using Markov chain Monte Carlo algorithms have recently been applied to sediment transport models to assess the uncertainty in the model predictions due to the parameter values. Unfortunately, the existing approaches can only attribute overall uncertainty to the parameters. This limitation is critical because no model can produce accurate forecasts if forced with inaccurate input data, even if the model is well founded in physical theory. In this research, an existing Bayesian method is modified to consider the potential errors in input data during the uncertainty evaluation process. The input error is modeled using Gaussian distributions, and the means and standard deviations are treated as uncertain parameters. The proposed approach is tested by coupling it to the Sedimentation and River Hydraulics - One Dimension (SRH-1D) model and simulating a 23-km reach of the Tachia River in Taiwan. The Wu equation in SRH-1D is used for computing the transport capacity for a bed material load of non-cohesive material. Three types of input data are considered uncertain: (1) the input flowrate at the upstream boundary, (2) the water surface elevation at the downstream boundary, and (3) the water surface elevation at a hydraulic structure in the middle of the reach. The benefits of modeling the input errors in the uncertainty analysis are evaluated by comparing the accuracy of the most likely forecast and the coverage of the observed data by the credible intervals to those of the existing method. The results indicate that the internal boundary condition has the largest uncertainty among those considered. Overall, the uncertainty estimates from the new method are notably different from those of the existing method for both the calibration and forecast periods.
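The idea of treating the mean and standard deviation of a Gaussian input error as extra uncertain parameters can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the one-parameter linear "model" stands in for SRH-1D, the data are synthetic, the priors are uniform, and the sampler is a plain random-walk Metropolis rather than the authors' Bayesian method. Because the toy model is linear, the input error can be integrated out analytically in the likelihood.

```python
import math
import random


def synthetic_data(n=100, k=2.0, mu_e=0.5, sigma_e=0.3, sigma_y=0.1, seed=0):
    """Toy data: the recorded input differs from the true input by a Gaussian error."""
    rng = random.Random(seed)
    x_obs, y_obs = [], []
    for _ in range(n):
        x = rng.uniform(1.0, 10.0)               # recorded (erroneous) input
        x_true = x - rng.gauss(mu_e, sigma_e)    # true input seen by the system
        y_obs.append(k * x_true + rng.gauss(0.0, sigma_y))
        x_obs.append(x)
    return x_obs, y_obs


def log_posterior(theta, x_obs, y_obs, sigma_y):
    """Uniform priors plus a Gaussian likelihood with the input error integrated out."""
    k, mu_e, log_sigma_e = theta
    if not (0.0 < k < 10.0 and -5.0 < mu_e < 5.0 and -5.0 < log_sigma_e < 2.0):
        return -math.inf
    # Output variance = propagated input-error variance + output noise variance.
    var = (k * math.exp(log_sigma_e)) ** 2 + sigma_y ** 2
    total = 0.0
    for x, y in zip(x_obs, y_obs):
        resid = y - k * (x - mu_e)
        total -= 0.5 * (math.log(2.0 * math.pi * var) + resid * resid / var)
    return total


def metropolis(x_obs, y_obs, sigma_y, n_iter=12000, burn_in=4000, seed=1):
    """Random-walk Metropolis over (k, mu_e, log sigma_e)."""
    rng = random.Random(seed)
    theta = [1.5, 0.0, 0.0]
    steps = [0.01, 0.03, 0.05]
    current = log_posterior(theta, x_obs, y_obs, sigma_y)
    samples = []
    for i in range(n_iter):
        proposal = [t + rng.gauss(0.0, s) for t, s in zip(theta, steps)]
        candidate = log_posterior(proposal, x_obs, y_obs, sigma_y)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        if i >= burn_in:
            samples.append(tuple(theta))
    return samples


def posterior_means(samples):
    """Posterior means of the model parameter and the input-error mean and std."""
    n = len(samples)
    k_hat = sum(s[0] for s in samples) / n
    mu_hat = sum(s[1] for s in samples) / n
    sigma_hat = sum(math.exp(s[2]) for s in samples) / n
    return k_hat, mu_hat, sigma_hat


if __name__ == "__main__":
    x_obs, y_obs = synthetic_data()
    print(posterior_means(metropolis(x_obs, y_obs, sigma_y=0.1)))
```

In this sketch the input-error parameters are sampled alongside the model parameter, so the posterior spread of the forecast reflects both sources. A nonlinear model such as a sediment transport code would not allow the analytical marginalization used here and would need the input errors sampled explicitly.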
Experimental and QSAR study on the surface activities of alkyl imidazoline surfactants
NASA Astrophysics Data System (ADS)
Kong, Xiangjun; Qian, Chengduo; Fan, Weiyu; Liang, Zupei
2018-03-01
Fifteen alkyl imidazoline surfactants with different structures were synthesized, and their critical micelle concentration (CMC) and surface tension at the CMC (σcmc) in aqueous solution were measured at 298 K. Fifty-four molecular structure descriptors were selected as independent variables, and the quantitative structure-activity relationship (QSAR) between the surface activities of the alkyl imidazolines and their molecular structure was built through the genetic function approximation (GFA) method. Experimental results showed that the maximum surface excess of alkyl imidazoline molecules at the gas-liquid interface increased, and the area occupied by each surfactant molecule and the free energies of micellization ΔGm decreased, with increasing carbon number (NC) of the hydrophobic chain or decreasing hydrophilicity of counterions, which resulted in a decrease in CMC and σcmc, while log CMC and NC had a linear relationship with a negative correlation. The GFA-QSAR model, which was generated from a training set of 13 alkyl imidazolines through GFA regression analysis, gave predicted CMC values that were highly correlated with the experimental values. The correlation coefficient R was 0.9991, which indicates high prediction accuracy. The model quantifies the influence of the alkyl imidazoline molecular structure on the CMC, and its prediction error for the CMCs of the 2 alkyl imidazolines in the validation set was less than 4%.
Itteboina, Ramesh; Ballu, Srilata; Sivan, Sree Kanth; Manga, Vijjulatha
2016-10-01
Janus kinase 1 (JAK 1) plays a critical role in initiating responses to cytokines by the JAK-signal transducer and activator of transcription (JAK-STAT). This controls survival, proliferation and differentiation of a variety of cells. Docking, 3D quantitative structure activity relationship (3D-QSAR) and molecular dynamics (MD) studies were performed on a series of Imidazo-pyrrolopyridine derivatives reported as JAK 1 inhibitors. A QSAR model was generated using 30 molecules in the training set; the developed model showed good statistical reliability, which is evident from its r²ncv and r²loo values. The predictive ability of this model was determined using a test set of 13 molecules that gave acceptable predictive correlation (r²pred) values. Finally, molecular dynamics simulation was performed to validate docking results and MM/GBSA calculations. This allowed us to compare the binding free energies of the cocrystal ligand and the newly designed molecule R1. The good concordance between the docking results and CoMFA/CoMSIA contour maps provided useful clues for the rational modification of molecules to design more potent JAK 1 inhibitors. Copyright © 2016 Elsevier Ltd. All rights reserved.
A structured analysis of uncertainty surrounding modeled impacts of groundwater-extraction rules
NASA Astrophysics Data System (ADS)
Guillaume, Joseph H. A.; Qureshi, M. Ejaz; Jakeman, Anthony J.
2012-08-01
Integrating economic and groundwater models for groundwater-management can help improve understanding of trade-offs involved between conflicting socioeconomic and biophysical objectives. However, there is significant uncertainty in most strategic decision-making situations, including in the models constructed to represent them. If not addressed, this uncertainty may be used to challenge the legitimacy of the models and decisions made using them. In this context, a preliminary uncertainty analysis was conducted of a dynamic coupled economic-groundwater model aimed at assessing groundwater extraction rules. The analysis demonstrates how a variety of uncertainties in such a model can be addressed. A number of methods are used including propagation of scenarios and bounds on parameters, multiple models, block bootstrap time-series sampling and robust linear regression for model calibration. These methods are described within the context of a theoretical uncertainty management framework, using a set of fundamental uncertainty management tasks and an uncertainty typology.
Using CV-GLUE procedure in analysis of wetland model predictive uncertainty.
Huang, Chun-Wei; Lin, Yu-Pin; Chiang, Li-Chi; Wang, Yung-Chieh
2014-07-01
This study develops a procedure that is related to Generalized Likelihood Uncertainty Estimation (GLUE), called the CV-GLUE procedure, for assessing the predictive uncertainty that is associated with different model structures with varying degrees of complexity. The proposed procedure comprises model calibration, validation, and predictive uncertainty estimation in terms of a characteristic coefficient of variation (characteristic CV). The procedure first performed two-stage Monte-Carlo simulations to ensure predictive accuracy by obtaining behavior parameter sets, and then the estimation of CV-values of the model outcomes, which represent the predictive uncertainties for a model structure of interest with its associated behavior parameter sets. Three commonly used wetland models (the first-order K-C model, the plug flow with dispersion model, and the Wetland Water Quality Model; WWQM) were compared based on data that were collected from a free water surface constructed wetland with paddy cultivation in Taipei, Taiwan. The results show that the first-order K-C model, which is simpler than the other two models, has greater predictive uncertainty. This finding shows that predictive uncertainty does not necessarily increase with the complexity of the model structure because in this case, the more simplistic representation (first-order K-C model) of reality results in a higher uncertainty in the prediction made by the model. The CV-GLUE procedure is suggested to be a useful tool not only for designing constructed wetlands but also for other aspects of environmental management. Copyright © 2014 Elsevier Ltd. All rights reserved.
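A GLUE-style workflow of the kind described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' CV-GLUE procedure: it uses a single Monte Carlo stage instead of two, takes the Nash-Sutcliffe efficiency as the likelihood measure, uses hypothetical prior ranges and synthetic data, and defines the "characteristic CV" as the mean of the per-observation coefficients of variation across behavioral runs. The first-order K-C model is written in its common form C_out = C* + (C_in - C*) exp(-k/q).

```python
import math
import random
import statistics


def kc_model(c_in, k, c_star, q):
    """First-order K-C* wetland model: outlet concentration for areal rate k and loading q."""
    return c_star + (c_in - c_star) * math.exp(-k / q)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency, used here as the GLUE likelihood measure."""
    mean_obs = statistics.fmean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def glue_characteristic_cv(c_in, q, obs, n_samples=2000, threshold=0.5, seed=0):
    """Sample parameters, keep behavioral sets, and summarize prediction spread as a CV."""
    rng = random.Random(seed)
    behavioral = []
    for _ in range(n_samples):
        k = rng.uniform(0.05, 0.5)      # hypothetical prior range, m/d
        c_star = rng.uniform(0.0, 5.0)  # hypothetical prior range, mg/L
        sim = [kc_model(ci, k, c_star, qi) for ci, qi in zip(c_in, q)]
        if nse(obs, sim) >= threshold:
            behavioral.append(sim)
    if not behavioral:
        raise ValueError("no behavioral parameter sets; relax the threshold")
    cvs = []
    for j in range(len(obs)):
        column = [sim[j] for sim in behavioral]
        cvs.append(statistics.pstdev(column) / statistics.fmean(column))
    return len(behavioral), statistics.fmean(cvs)


# Synthetic inlet concentrations (mg/L) and hydraulic loading rates (m/d).
C_IN = [20.0, 25.0, 30.0, 22.0, 28.0, 35.0, 18.0, 26.0]
Q = [0.10, 0.12, 0.08, 0.10, 0.15, 0.09, 0.11, 0.10]
OBS = [kc_model(ci, 0.2, 2.0, qi) for ci, qi in zip(C_IN, Q)]

if __name__ == "__main__":
    print(glue_characteristic_cv(C_IN, Q, OBS))
```

Repeating the same calculation for each candidate model structure would give one CV per structure, which is the kind of comparison the abstract reports for the three wetland models.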
Uncertainty quantification for optical model parameters
Lovell, A. E.; Nunes, F. M.; Sarich, J.; ...
2017-02-21
Although uncertainty quantification has been making its way into nuclear theory, these methods have yet to be explored in the context of reaction theory. For example, it is well known that different parameterizations of the optical potential can result in different cross sections, but these differences have not been systematically studied and quantified. The purpose of our work is to investigate the uncertainties in nuclear reactions that result from fitting a given model to elastic-scattering data, as well as to study how these uncertainties propagate to the inelastic and transfer channels. We use statistical methods to determine a best fit and create corresponding 95% confidence bands. A simple model of the process is fit to elastic-scattering data and used to predict either inelastic or transfer cross sections. In this initial work, we assume that our model is correct, and the only uncertainties come from the variation of the fit parameters. Here, we study a number of reactions involving neutron and deuteron projectiles with energies in the range of 5–25 MeV/u, on targets with mass A=12–208. We investigate the correlations between the parameters in the fit. The case of deuterons on ¹²C is discussed in detail: the elastic-scattering fit and the prediction of ¹²C(d,p)¹³C transfer angular distributions, using both uncorrelated and correlated χ² minimization functions. The general features for all cases are compiled in a systematic manner to identify trends. This work shows that, in many cases, the correlated χ² functions (in comparison to the uncorrelated χ² functions) provide a more natural parameterization of the process. These correlated functions do, however, produce broader confidence bands. Further optimization may require improvement in the models themselves and/or more information included in the fit.
On the formulation of a minimal uncertainty model for robust control with structured uncertainty
NASA Technical Reports Server (NTRS)
Belcastro, Christine M.; Chang, B.-C.; Fischl, Robert
1991-01-01
In the design and analysis of robust control systems for uncertain plants, representing the system transfer matrix in the form of what has come to be termed an M-delta model has become widely accepted and applied in the robust control literature. The M represents a transfer function matrix M(s) of the nominal closed loop system, and the delta represents an uncertainty matrix acting on M(s). The nominal closed loop system M(s) results from closing the feedback control system, K(s), around a nominal plant interconnection structure P(s). The uncertainty can arise from various sources, such as structured uncertainty from parameter variations or multiple unstructured uncertainties from unmodeled dynamics and other neglected phenomena. In general, delta is a block diagonal matrix, but for real parameter variations delta is a diagonal matrix of real elements. Conceptually, the M-delta structure can always be formed for any linear interconnection of inputs, outputs, transfer functions, parameter variations, and perturbations. However, very little of the currently available literature addresses computational methods for obtaining this structure, and none of this literature addresses a general methodology for obtaining a minimal M-delta model for a wide class of uncertainty, where the term minimal refers to the dimension of the delta matrix. Since having a minimally dimensioned delta matrix would improve the efficiency of structured singular value (or multivariable stability margin) computations, a method of obtaining a minimal M-delta would be useful. Hence, a method of obtaining the interconnection system P(s) is required. A generalized procedure for obtaining a minimal P-delta structure for systems with real parameter variations is presented. Using this model, the minimal M-delta model can then be easily obtained by closing the feedback loop. The procedure involves representing the system in a cascade-form state-space realization, determining the minimal uncertainty matrix
Morales, Juan F; Montoto, Sebastian Scioli; Fagiolino, Pietro; Ruiz, Maria E
2017-01-01
The Blood-Brain Barrier (BBB) is a physical and biochemical barrier that restricts the entry of certain drugs to the Central Nervous System (CNS), while allowing the passage of others. The ability to predict the permeability of a given molecule through the BBB is a key aspect in CNS drug discovery and development, since neurotherapeutic agents with molecular targets in the CNS should be able to cross the BBB, whereas peripherally acting agents should not, to minimize the risk of CNS adverse effects. In this review we examine and discuss QSAR approaches and current availability of experimental data for the construction of BBB permeability predictive models, focusing on the modeling of the biorelevant parameter unbound partitioning coefficient (Kp,uu). Emphasis is made on two possible strategies to overcome the current limitations of in silico models: considering the prediction of brain penetration as a multifactorial problem, and increasing experimental datasets through accurate and standardized experimental techniques.
Distribution of model uncertainty across multiple data streams
NASA Astrophysics Data System (ADS)
Wutzler, Thomas
2014-05-01
When confronting biogeochemical models with a diversity of observational data streams, we are faced with the problem of weighting the data streams. Without weighting or multiple blocked cost functions, model uncertainty is allocated to the sparse data streams, and possible bias in processes that are strongly constrained is exported to processes that are constrained by sparse data streams only. In this study we propose an approach that aims at making model uncertainty a factor of observation uncertainty that is constant over all data streams. Further, we propose an implementation based on Markov chain Monte Carlo sampling combined with simulated annealing that is able to determine this variance factor. The method is exemplified both with very simple models and artificial data and with an inversion of the DALEC ecosystem carbon model against multiple observations of Howland forest. We argue that the presented approach is able to help with, and perhaps resolve, the problem of bias export to sparse data streams.
Uncertainty analysis of hydrological modeling in a tropical area using different algorithms
NASA Astrophysics Data System (ADS)
Rafiei Emam, Ammar; Kappas, Martin; Fassnacht, Steven; Linh, Nguyen Hoang Khanh
2018-01-01
Hydrological modeling outputs are subject to uncertainty resulting from different sources of error (e.g., error in input data, model structure, and model parameters), making quantification of uncertainty in hydrological modeling imperative for improving the reliability of modeling results. The uncertainty analysis must also cope with difficulties in the calibration of hydrological models, which increase further in areas with data scarcity. The purpose of this study is to apply four uncertainty analysis algorithms to a semi-distributed hydrological model, to quantify different sources of uncertainty (especially parameter uncertainty), and to evaluate the performance of the algorithms. In this study, the Soil and Water Assessment Tool (SWAT) eco-hydrological model was implemented for a watershed in central Vietnam. The sensitivity of parameters was analyzed, and the model was calibrated. The uncertainty analysis for the hydrological model was conducted based on four algorithms: Generalized Likelihood Uncertainty Estimation (GLUE), Sequential Uncertainty Fitting (SUFI), Parameter Solution method (ParaSol) and Particle Swarm Optimization (PSO). The performance of the algorithms was compared using P-factor and R-factor, coefficient of determination (R²), the Nash-Sutcliffe coefficient of efficiency (NSE) and Percent Bias (PBIAS). The results showed the high performance of SUFI and PSO with P-factor>0.83, R-factor <0.56 and R²>0.91, NSE>0.89, and 0.18
Korabecny, Jan; Dolezal, Rafael; Cabelova, Pavla; Horova, Anna; Hruba, Eva; Ricny, Jan; Sedlacek, Lukas; Nepovimova, Eugenie; Spilovska, Katarina; Andrs, Martin; Musilek, Kamil; Opletalova, Veronika; Sepsova, Vendula; Ripova, Daniela; Kuca, Kamil
2014-07-23
A novel series of 7-methoxytacrine (7-MEOTA)-donepezil like compounds was synthesized and tested for their ability to inhibit electric eel acetylcholinesterase (EeAChE), human recombinant AChE (hAChE), equine serum butyrylcholinesterase (eqBChE) and human plasmatic BChE (hBChE). The new hybrids consist of a 7-MEOTA unit, representing a less toxic tacrine (THA) derivative, connected with analogues of N-benzylpiperazine moieties mimicking the N-benzylpiperidine fragment of donepezil. The 7-MEOTA-donepezil like compounds showed a mostly non-selective profile in inhibiting cholinesterases of different origin, with IC50 values ranging from the micromolar to the sub-micromolar concentration scale. Kinetic analysis confirmed mixed-type inhibition, suggesting that these inhibitors are capable of simultaneously binding the peripheral anionic site (PAS) as well as the catalytic anionic site (CAS) of AChE. Molecular modeling and QSAR studies were performed to rationalize the in vitro results. Overall, 7-MEOTA-donepezil like derivatives can be considered as interesting candidates for Alzheimer's disease treatment. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
Uncertainty in spatially explicit animal dispersal models
Mooij, Wolf M.; DeAngelis, Donald L.
2003-01-01
Uncertainty in estimates of survival of dispersing animals is a vexing difficulty in conservation biology. The current notion is that this uncertainty decreases the usefulness of spatially explicit population models in particular. We examined this problem by comparing dispersal models of three levels of complexity: (1) an event-based binomial model that considers only the occurrence of mortality or arrival, (2) a temporally explicit exponential model that employs mortality and arrival rates, and (3) a spatially explicit grid-walk model that simulates the movement of animals through an artificial landscape. Each model was fitted to the same set of field data. A first objective of the paper is to illustrate how the maximum-likelihood method can be used in all three cases to estimate the means and confidence limits for the relevant model parameters, given a particular set of data on dispersal survival. Using this framework we show that the structure of the uncertainty for all three models is strikingly similar. In fact, the results of our unified approach imply that spatially explicit dispersal models, which take advantage of information on landscape details, suffer less from uncertainty than do simpler models. Moreover, we show that the proposed strategy of model development safeguards one from error propagation in these more complex models. Finally, our approach shows that all models related to animal dispersal, ranging from simple to complex, can be related in a hierarchical fashion, so that the various approaches to modeling such dispersal can be viewed from a unified perspective.
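For the simplest of the three models, the event-based binomial model, the maximum-likelihood estimate and its confidence limits can be worked out in a few lines. The sketch below is an illustration under assumptions, not the authors' code: the counts are hypothetical, and the 95% limits are taken from the likelihood-ratio criterion (a log-likelihood drop of 3.841/2 from its maximum, the chi-square value for one degree of freedom), which is one common way to obtain such limits.

```python
import math


def log_likelihood(p, arrivals, n):
    """Binomial log-likelihood (without the constant term) for arrival probability p."""
    return arrivals * math.log(p) + (n - arrivals) * math.log(1.0 - p)


def mle_with_limits(arrivals, n, chi2_crit=3.841):
    """Maximum-likelihood estimate of p and likelihood-ratio 95% confidence limits."""
    if not 0 < arrivals < n:
        raise ValueError("this sketch needs at least one arrival and one death")
    p_hat = arrivals / n
    target = log_likelihood(p_hat, arrivals, n) - chi2_crit / 2.0

    def crossing(lo, hi):
        # Bisection for the point where the log-likelihood equals the target.
        f_lo = log_likelihood(lo, arrivals, n) - target
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            f_mid = log_likelihood(mid, arrivals, n) - target
            if f_mid * f_lo > 0.0:
                lo, f_lo = mid, f_mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    lower = crossing(1e-9, p_hat)
    upper = crossing(p_hat, 1.0 - 1e-9)
    return p_hat, lower, upper


if __name__ == "__main__":
    # Hypothetical data: 20 of 50 tracked dispersers arrive, the rest die en route.
    print(mle_with_limits(20, 50))
```

The same recipe (maximize the log-likelihood, then find where it drops by the critical amount) carries over to models with more parameters, where the drop is evaluated along each parameter's profile.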
Development of a Sigma-2 Receptor affinity filter through a Monte Carlo based QSAR analysis.
Rescifina, Antonio; Floresta, Giuseppe; Marrazzo, Agostino; Parenti, Carmela; Prezzavento, Orazio; Nastasi, Giovanni; Dichiara, Maria; Amata, Emanuele
2017-08-30
For the first time in the sigma-2 (σ2) receptor field, a quantitative structure-activity relationship (QSAR) model has been built using pKi values of the whole set of known selective σ2 receptor ligands (548 compounds), taken from the Sigma-2 Receptor Selective Ligands Database (S2RSLDB) (http://www.researchdsf.unict.it/S2RSLDB/), through the Monte Carlo technique and employing the software CORAL. The model has been developed by using a large and structurally diverse set of compounds, allowing for a prediction of the endpoint (σ2 receptor pKi) for different populations of chemical compounds. The statistical quality reached suggests that the model for pKi determination is robust and possesses a satisfactory predictive potential. The statistical quality is high for both visible and invisible sets. The screening of the FDA approved drugs, external to our dataset, suggested that sixteen compounds might be repositioned as σ2 receptor ligands (predicted pKi ≥8). A literature check showed that six of these compounds have already been tested for affinity at the σ2 receptor and, of these, two (Flunarizine and Terbinafine) have shown an experimental σ2 receptor pKi >7. This suggests that this QSAR model may be used as a focusing screening filter in order to prospectively find or repurpose new drugs with high affinity for the σ2 receptor, overall allowing for an enhanced hit rate with respect to random screening. Copyright © 2017 Elsevier B.V. All rights reserved.
Uncertainty Analysis and Parameter Estimation For Nearshore Hydrodynamic Models
NASA Astrophysics Data System (ADS)
Ardani, S.; Kaihatu, J. M.
2012-12-01
Numerical models represent deterministic approaches used for the relevant physical processes in the nearshore. Complexity of the physics of the model and uncertainty involved in the model inputs compel us to apply a stochastic approach to analyze the robustness of the model. The Bayesian inverse problem is one powerful way to estimate the important input model parameters (determined by a priori sensitivity analysis) and can be used for uncertainty analysis of the outputs. Bayesian techniques can be used to find the range of most probable parameters based on the probability of the observed data and the residual errors. In this study, the effect of input data involving lateral (Neumann) boundary conditions, bathymetry and off-shore wave conditions on nearshore numerical models is considered. Monte Carlo simulation is applied to a deterministic numerical model (the Delft3D modeling suite for coupled waves and flow) for the resulting uncertainty analysis of the outputs (wave height, flow velocity, mean sea level, etc.). Uncertainty analysis of outputs is performed by random sampling from the input probability distribution functions and running the model as required until convergence to consistent results is achieved. The case study used in this analysis is the Duck94 experiment, which was conducted at the U.S. Army Field Research Facility at Duck, North Carolina, USA in the fall of 1994. The joint probability of model parameters relevant for the Duck94 experiments will be found using the Bayesian approach. We will further show that, by using Bayesian techniques to estimate the optimized model parameters as inputs and applying them for uncertainty analysis, we can obtain more consistent results than by using the prior information for input data, which means that the variation of the uncertain parameter will be decreased and the probability of the observed data will improve as well. Keywords: Monte Carlo Simulation, Delft3D, uncertainty analysis, Bayesian techniques
Optical Model and Cross Section Uncertainties
DOE Office of Scientific and Technical Information (OSTI.GOV)
Herman,M.W.; Pigni, M.T.; Dietrich, F.S.
2009-10-05
Distinct minima and maxima in the neutron total cross section uncertainties were observed in model calculations using a spherical optical potential. We found this oscillating structure to be a general feature of quantum mechanical wave scattering. Specifically, we analyzed neutron interaction with 56Fe from 1 keV up to 65 MeV, and investigated the physical origin of the minima. We discuss their potential importance for practical applications as well as the implications for the uncertainties in total and absorption cross sections.
Probabilistic Radiological Performance Assessment Modeling and Uncertainty
NASA Astrophysics Data System (ADS)
Tauxe, J.
2004-12-01
A generic probabilistic radiological Performance Assessment (PA) model is presented. The model, built using the GoldSim systems simulation software platform, concerns contaminant transport and dose estimation in support of decision making with uncertainty. Both the U.S. Nuclear Regulatory Commission (NRC) and the U.S. Department of Energy (DOE) require assessments of potential future risk to human receptors of disposal of low-level radioactive waste (LLW). Commercially operated LLW disposal facilities are licensed by the NRC (or agreement states), and the DOE operates such facilities for disposal of DOE-generated LLW. The type of PA model presented is probabilistic in nature, and hence reflects the current state of knowledge about the site by using probability distributions to capture what is expected (central tendency or average) and the uncertainty (e.g., standard deviation) associated with input parameters, and propagating through the model to arrive at output distributions that reflect expected performance and the overall uncertainty in the system. Estimates of contaminant release rates, concentrations in environmental media, and resulting doses to human receptors well into the future are made by running the model in Monte Carlo fashion, with each realization representing a possible combination of input parameter values. Statistical summaries of the results can be compared to regulatory performance objectives, and decision makers are better informed of the inherently uncertain aspects of the model which supports their decision-making. While this information may make some regulators uncomfortable, they must realize that uncertainties which were hidden in a deterministic analysis are revealed in a probabilistic analysis, and the chance of making a correct decision is now known rather than hoped for. The model includes many typical features and processes that would be part of a PA, but is entirely fictitious. This does not represent any particular site and is meant to be a generic example. A
NASA Technical Reports Server (NTRS)
Stolarski, R. S.; Butler, D. M.; Rundel, R. D.
1977-01-01
A concise stratospheric model was used in a Monte-Carlo analysis of the propagation of reaction rate uncertainties through the calculation of an ozone perturbation due to the addition of chlorine. Two thousand Monte-Carlo cases were run with 55 reaction rates being varied. Excellent convergence was obtained in the output distributions because the model is sensitive to the uncertainties in only about 10 reactions. For a 1 ppbv chlorine perturbation added to a 1.5 ppbv chlorine background, the resultant 1 sigma uncertainty on the ozone perturbation is a factor of 1.69 on the high side and 1.80 on the low side. The corresponding 2 sigma factors are 2.86 and 3.23. Results are also given for the uncertainties, due to reaction rates, in the ambient concentrations of stratospheric species.
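The way multiplicative rate uncertainties turn into high-side and low-side uncertainty factors on a model output can be sketched with a toy calculation. The sketch assumes each rate is perturbed by a lognormal factor whose 1-sigma value is the quoted uncertainty factor, and it replaces the stratospheric model with a hypothetical three-rate expression; neither choice is taken from the paper. The high and low factors are read off the 84.13th and 15.87th percentiles of the output relative to its median.

```python
import math
import random


def uncertainty_factors(model, nominal_rates, rate_factors, n_cases=2000, seed=0):
    """Monte Carlo propagation of lognormal rate uncertainties to 1-sigma output factors."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_cases):
        rates = [
            k * math.exp(rng.gauss(0.0, math.log(f)))  # multiplicative perturbation
            for k, f in zip(nominal_rates, rate_factors)
        ]
        outputs.append(model(rates))
    outputs.sort()

    def percentile(q):
        return outputs[round(q * (n_cases - 1))]

    median = percentile(0.5)
    high = percentile(0.8413) / median   # factor on the high side
    low = median / percentile(0.1587)    # factor on the low side
    return high, low


def toy_model(rates):
    """Hypothetical stand-in for the chemistry model: a ratio of rate constants."""
    return rates[0] * rates[1] / rates[2]


if __name__ == "__main__":
    print(uncertainty_factors(toy_model, [1.0, 2.0, 4.0], [1.2, 1.3, 1.5]))
```

For a pure product or ratio of rates, as in this toy model, the two factors agree up to sampling noise and approach exp(sqrt(sum of squared log factors)). Unequal high-side and low-side factors, like those quoted in the abstract, arise when the model responds nonlinearly to the rates.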
Su, Pingru; Zhu, Huicen; Shen, Zhemin
2016-02-01
Manganese dioxide formed during oxidation by potassium permanganate exhibits promising adsorptive capacity, which can be utilized to remove organic pollutants from wastewater. However, structural variations among organic molecules lead to wide differences in adsorption efficiency. Therefore, it is of great significance to find a general relationship between the removal rate of organic compounds and their quantum parameters. This study focused on building quantitative structure activity relationship (QSAR) models based on the experimental removal rate (r(exp)) of 25 organic compounds and 17 quantum parameters of each compound computed by Gaussian 09 and Materials Studio 6.1. The recommended model is r(pre) = -0.502 - 7.742 f(+)x + 0.107 E(HOMO) + 0.959 q(H+) + 1.388 BOx. Both internal and external validations of the recommended model are satisfactory, suggesting good stability and predictive ability. The definition of the applicability domain and the Y-randomization test indicate that all predictions are reliable and that chance correlation is unlikely. The recommended model contains four variables, which are closely related to the adsorption mechanism. f(+)x reveals the degree of affinity for nucleophilic attack. E(HOMO) represents the difficulty of electron loss. q(H+) reflects the distribution of partial charge between carbon and hydrogen atoms. BOx shows the stability of a molecule.
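The recommended linear model quoted in the abstract above can be evaluated directly once the four descriptors are known. The function below simply encodes the published coefficients; the argument names are labels chosen here, and the descriptor values in the usage lines are hypothetical placeholders rather than values from the study, whose units and scaling the abstract does not state.

```python
def predicted_removal_rate(f_plus_x, e_homo, q_h_plus, bo_x):
    """Evaluate the four-descriptor linear QSAR model with the coefficients from the abstract."""
    return (
        -0.502
        - 7.742 * f_plus_x   # f(+)x descriptor
        + 0.107 * e_homo     # E(HOMO) descriptor
        + 0.959 * q_h_plus   # q(H+) descriptor
        + 1.388 * bo_x       # BOx descriptor
    )


if __name__ == "__main__":
    # Hypothetical descriptor values, chosen only to exercise the formula.
    print(predicted_removal_rate(f_plus_x=0.1, e_homo=1.0, q_h_plus=1.0, bo_x=1.0))
```

The signs of the coefficients show the direction of each effect: a larger f(+)x lowers the predicted removal rate, while larger values of the other three descriptors raise it.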
Van Bossuyt, Melissa; Van Hoeck, Els; Raitano, Giuseppa; Manganelli, Serena; Braeken, Els; Ates, Gamze; Vanhaecke, Tamara; Van Miert, Sabine; Benfenati, Emilio; Mertens, Birgit; Rogiers, Vera
2017-04-01
Over recent years, more stringent safety requirements for an increasing number of chemicals across many regulatory fields (e.g. industrial chemicals, pharmaceuticals, food, cosmetics, …) have triggered the need for an efficient screening strategy to prioritize the substances of highest concern. In this context, alternative methods such as in silico (i.e. computational) techniques are gaining more and more importance. In the current study, a new prioritization strategy for identifying potentially mutagenic substances was developed based on the combination of multiple (quantitative) structure-activity relationship ((Q)SAR) tools. Non-evaluated substances used in printed paper and board food contact materials (FCM) were selected for a case study. By applying our strategy, 106 out of the 1723 substances were assigned 'high priority' as they were predicted mutagenic by 4 different (Q)SAR models. Information provided within the models made it possible to identify 53 substances for which in vitro Ames test results already exist alongside the Ames mutagenicity prediction. For further prioritization, additional support could be obtained by applying local, i.e. specific, models, as demonstrated here for aromatic azo compounds, typically found in printed paper and board FCM. The strategy developed here can easily be applied to other groups of chemicals facing the same need for priority ranking. Copyright © 2017 Elsevier Ltd. All rights reserved.
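The consensus rule described above, where a substance is assigned 'high priority' when all four (Q)SAR models predict it to be mutagenic, can be sketched as a simple tally. The substance and model names below are hypothetical, and the label used for everything else ("not high priority") is a placeholder, since the abstract does not describe the remaining tiers.

```python
def prioritize(predictions):
    """Label each substance by whether every model predicts it to be mutagenic.

    predictions maps substance name -> {model name: True if predicted mutagenic}.
    """
    labels = {}
    for substance, calls in predictions.items():
        positives = sum(1 for is_mutagenic in calls.values() if is_mutagenic)
        if calls and positives == len(calls):
            labels[substance] = "high priority"
        else:
            labels[substance] = "not high priority"
    return labels


# Hypothetical predictions from four hypothetical models.
EXAMPLE = {
    "substance_A": {"model_1": True, "model_2": True, "model_3": True, "model_4": True},
    "substance_B": {"model_1": True, "model_2": False, "model_3": True, "model_4": True},
    "substance_C": {"model_1": False, "model_2": False, "model_3": False, "model_4": False},
}

if __name__ == "__main__":
    for name, label in prioritize(EXAMPLE).items():
        print(name, label)
```

Requiring agreement across all models trades sensitivity for specificity: fewer substances are flagged, but each flag is backed by every tool consulted.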
Toropova, Alla P; Schultz, Terry W; Toropov, Andrey A
2016-03-01
Data on toxicity toward Tetrahymena pyriformis are an indicator of the applicability of a substance in ecological and pharmaceutical contexts. Quantitative structure-activity relationships (QSARs) between the molecular structure of benzene derivatives and toxicity toward T. pyriformis (expressed as the negative logarithms of the population growth inhibition dose, mmol/L) are established. The available data were randomly distributed three times into the visible training and calibration sets, and invisible validation sets. The statistical characteristics for the validation set are the following: r²=0.8179 and s=0.338 (first distribution); r²=0.8682 and s=0.341 (second distribution); r²=0.8435 and s=0.323 (third distribution). These models are built up using only information on the molecular structure: no data on physicochemical parameters, 3D features of the molecular structure and quantum mechanics descriptors are involved in the modeling process. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Debry, Edouard; Mallet, Vivien; Garaud, Damien; Malherbe, Laure; Bessagnet, Bertrand; Rouïl, Laurence
2010-05-01
Prev'Air is the French operational system for air pollution forecasting. It is developed and maintained by INERIS with financial support from the French Ministry for Environment. On a daily basis it delivers forecasts up to three days ahead for ozone, nitrogen dioxide and particles over France and Europe. Maps of concentration peaks and daily averages are freely available to the general public. More accurate data can be provided to customers and modelers. Prev'Air forecasts are based on the Chemical Transport Model CHIMERE. French authorities rely more and more on this platform to alert the general public in case of high pollution events and to assess the efficiency of regulation measures when such events occur. For example, the road speed limit may be reduced in given areas when the ozone level exceeds one regulatory threshold. These operational applications require INERIS to assess the quality of its forecasts and to make end users aware of the confidence level. Indeed, forecast concentrations always remain an approximation of the true concentrations because of the high uncertainty on input data, such as meteorological fields and emissions, because of incomplete or inaccurate representation of physical processes, and because of deficiencies in numerical integration [1]. We would like to present in this communication the uncertainty analysis of the CHIMERE model conducted in the framework of an INERIS research project aiming, on the one hand, to assess the uncertainty of several deterministic models and, on the other hand, to propose relevant indicators describing air quality forecasts and their uncertainty. There exist several methods to assess the uncertainty of one model. Under given assumptions the model may be differentiated into an adjoint model which directly provides the sensitivity of concentrations to given parameters. But so far Monte Carlo methods seem to be the most widely used [2,3] as they are relatively easy to implement. In this framework one
Ballu, Srilata; Itteboina, Ramesh; Sivan, Sree Kanth; Manga, Vijjulatha
2018-04-01
Staphylococcus aureus is a gram-positive bacterium. It is the leading cause of skin and respiratory infections, osteomyelitis, Ritter's disease, endocarditis, and bacteraemia in the developed world. Combined studies of 3D-QSAR and molecular docking, validated by molecular dynamics simulations, together with in silico ADME prediction, were performed on isothiazoloquinolone inhibitors of methicillin-resistant Staphylococcus aureus. A three-dimensional quantitative structure-activity relationship (3D-QSAR) study was carried out using comparative molecular field analysis (CoMFA), with Q² of 0.578 and R² of 0.988, and comparative molecular similarity indices analysis (CoMSIA), with Q² of 0.554 and R² of 0.975. The predictive ability of these models was determined using a test set of molecules that gave acceptable predictive correlation (r²pred) values of 0.55 and 0.57 for CoMFA and CoMSIA, respectively. Docking simulations were employed to position the inhibitors into the protein active site to find the most probable binding mode and most reliable conformations. The developed models and docking methods provide guidance for designing molecules with enhanced activity. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhang, Zhenshan; Zheng, Mingyue; Du, Li; Shen, Jianhua; Luo, Xiaomin; Zhu, Weiliang; Jiang, Hualiang
2006-05-01
To find useful information for discovering dual functional inhibitors against both wild type (WT) and K103N mutant reverse transcriptases (RTs) of HIV-1, molecular docking and 3D-QSAR approaches were applied to a set of twenty-five 4,1-benzoxazepinone analogues of efavirenz (SUSTIVA®), some of which are active against the two RTs. 3D-QSAR models were constructed, based on their binding conformations determined by molecular docking, with r²cv values ranging from 0.656 to 0.834 for CoMFA and CoMSIA, respectively. The models were then validated to be highly predictive and extrapolative by inhibitors in two test sets with different molecular skeletons. Furthermore, CoMFA models were found to be well matched with the binding sites of both WT and K103N RTs. Finally, a reasonable pharmacophore model of 4,1-benzoxazepinones was established. The application of the model not only successfully differentiated the experimentally determined inhibitors from non-inhibitors, but also discovered two potent inhibitors from the compound database SPECS. On the basis of both the 3D-QSAR and pharmacophore models, new clues for discovering and designing potent dual functional drug leads against HIV-1 were proposed: (i) adopting a positively charged aliphatic group at the cis-substituent of C3; (ii) reducing the electronic density at the position of O4; (iii) positioning a small branched aliphatic group at the position of C5; (iv) using negatively charged bulky substituents at the position of C7.
Operationalising uncertainty in data and models for integrated water resources management.
Blind, M W; Refsgaard, J C
2007-01-01
Key sources of uncertainty of importance for water resources management are (1) uncertainty in data; (2) uncertainty related to hydrological models (parameter values, model technique, model structure); and (3) uncertainty related to the context and the framing of the decision-making process. The European funded project 'Harmonised techniques and representative river basin data for assessment and use of uncertainty information in integrated water management (HarmoniRiB)' has resulted in a range of tools and methods to assess such uncertainties, focusing on items (1) and (2). The project also engaged in a number of discussions surrounding uncertainty and risk assessment in support of decision-making in water management. Based on the project's results and experiences, and on the subsequent discussions, a number of conclusions can be drawn on the future needs for successful adoption of uncertainty analysis in decision support. These conclusions range from additional scientific research on specific uncertainties and dedicated guidelines for operational use to capacity building at all levels. The purpose of this paper is to elaborate on these conclusions and to anchor them in the broad objective of making uncertainty and risk assessment an essential and natural part of future decision-making processes.
NASA Astrophysics Data System (ADS)
Aulenbach, B. T.; Burns, D. A.; Shanley, J. B.; Yanai, R. D.; Bae, K.; Wild, A.; Yang, Y.; Dong, Y.
2013-12-01
There are many sources of uncertainty in estimates of streamwater solute flux. Flux is the product of discharge and concentration (summed over time), each of which has measurement uncertainty of its own. Discharge can be measured almost continuously, but concentrations are usually determined from discrete samples, which increases uncertainty depending on sampling frequency and on how concentrations are assigned for the periods between samples. Gaps between samples can be estimated by linear interpolation or by models that use the relations between concentration and continuously measured or known variables such as discharge, season, temperature, and time. For this project, developed in cooperation with QUEST (Quantifying Uncertainty in Ecosystem Studies), we evaluated uncertainty for three flux estimation methods and three different sampling frequencies (monthly, weekly, and weekly plus event). The constituents investigated were dissolved NO3, Si, SO4, and dissolved organic carbon (DOC), solutes whose concentration dynamics exhibit strongly contrasting behavior. The evaluation was completed for a 10-year period at five small, forested watersheds in Georgia, New Hampshire, New York, Puerto Rico, and Vermont. Concentration regression models were developed for each solute at each of the three sampling frequencies for all five watersheds. Fluxes were then calculated using (1) a linear interpolation approach, (2) a regression-model method, and (3) the composite method, which combines the regression-model method for estimating concentrations and the linear interpolation method for correcting model residuals to the observed sample concentrations. We considered the best estimates of flux to be derived using the composite method at the highest sampling frequencies. We also evaluated the importance of sampling frequency and estimation method on flux estimate uncertainty; flux uncertainty was dependent on the variability characteristics of each solute and varied for
QSAR Accelerated Discovery of Potent Ice Recrystallization Inhibitors
NASA Astrophysics Data System (ADS)
Briard, Jennie G.; Fernandez, Michael; de Luna, Phil; Woo, Tom. K.; Ben, Robert N.
2016-05-01
Ice recrystallization is the main contributor to cell damage and death during the cryopreservation of cells and tissues. Over the past five years, many small carbohydrate-based molecules were identified as ice recrystallization inhibitors and several were shown to reduce cryoinjury during the cryopreservation of red blood cells (RBCs) and hematopoietic stem cells (HSCs). Unfortunately, clear structure-activity relationships have not been identified, impeding the rational design of future compounds possessing ice recrystallization inhibition (IRI) activity. A set of 124 previously synthesized compounds with known IRI activities was used to calibrate 3D-QSAR classification models using GRid INdependent Descriptors (GRIND) derived from DFT level quantum mechanical calculations. A partial least squares (PLS) model was calibrated with 70% of the data set and successfully identified 80% of the IRI active compounds with a precision of 0.8. This model exhibited good performance in screening the remaining 30% of the data set, with 70% of active additives successfully recovered with a precision of ~0.7 and specificity of 0.8. The model was further applied to screen a new library of aryl-alditol molecules, which were then experimentally synthesized and tested with a success rate of 82%. Presented is the first computer-aided high-throughput experimental screening for novel IRI active compounds.
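The screening figures quoted in this abstract (fraction of actives recovered, precision, specificity) all derive from a binary confusion matrix. The following is a minimal sketch of how such figures are computed; the function name and the label vectors are hypothetical illustrations and are not data from the study.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall and specificity for binary labels (1 = active, 0 = inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # actives predicted active
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # inactives predicted active
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # actives missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # inactives correctly rejected
    nan = float("nan")
    return {
        "precision": tp / (tp + fp) if (tp + fp) else nan,    # predicted actives that are active
        "recall": tp / (tp + fn) if (tp + fn) else nan,       # fraction of actives recovered
        "specificity": tn / (tn + fp) if (tn + fp) else nan,  # fraction of inactives rejected
    }


# Hypothetical hold-out labels, invented for illustration only.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 7 + [0] * 3 + [1] * 2 + [0] * 8

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Recall answers "how many of the true actives did the screen find", while precision answers "how many of the flagged compounds are worth synthesizing", which is why the abstract reports both.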
QSAR Accelerated Discovery of Potent Ice Recrystallization Inhibitors
Briard, Jennie G.; Fernandez, Michael; De Luna, Phil; Woo, Tom. K.; Ben, Robert N.
2016-01-01
Ice recrystallization is the main contributor to cell damage and death during the cryopreservation of cells and tissues. Over the past five years, many small carbohydrate-based molecules were identified as ice recrystallization inhibitors and several were shown to reduce cryoinjury during the cryopreservation of red blood cells (RBCs) and hematopoietic stem cells (HSCs). Unfortunately, clear structure-activity relationships have not been identified, impeding the rational design of future compounds possessing ice recrystallization inhibition (IRI) activity. A set of 124 previously synthesized compounds with known IRI activities was used to calibrate 3D-QSAR classification models using GRid INdependent Descriptors (GRIND) derived from DFT level quantum mechanical calculations. A partial least squares (PLS) model was calibrated with 70% of the data set and successfully identified 80% of the IRI active compounds with a precision of 0.8. This model exhibited good performance in screening the remaining 30% of the data set, with 70% of active additives successfully recovered with a precision of ~0.7 and specificity of 0.8. The model was further applied to screen a new library of aryl-alditol molecules, which were then experimentally synthesized and tested with a success rate of 82%. Presented is the first computer-aided high-throughput experimental screening for novel IRI active compounds. PMID:27216585
Tyler Jon Smith; Lucy Amanda Marshall
2010-01-01
Model selection is an extremely important aspect of many hydrologic modeling studies because of the complexity, variability, and uncertainty that surrounds the current understanding of watershed-scale systems. However, development and implementation of a complete precipitation-runoff modeling framework, from model selection to calibration and uncertainty analysis, are...
GCR Environmental Models III: GCR Model Validation and Propagated Uncertainties in Effective Dose
NASA Technical Reports Server (NTRS)
Slaba, Tony C.; Xu, Xiaojing; Blattnig, Steve R.; Norman, Ryan B.
2014-01-01
This is the last of three papers focused on quantifying the uncertainty associated with galactic cosmic ray (GCR) models used for space radiation shielding applications. In the first paper, it was found that GCR ions with Z>2 and boundary energy below 500 MeV/nucleon induce less than 5% of the total effective dose behind shielding. This is an important finding since GCR model development and validation have been heavily biased toward Advanced Composition Explorer/Cosmic Ray Isotope Spectrometer measurements below 500 MeV/nucleon. Weights were also developed that quantify the relative contribution of defined GCR energy and charge groups to effective dose behind shielding. In the second paper, it was shown that these weights could be used to efficiently propagate GCR model uncertainties into effective dose behind shielding. In this work, uncertainties are quantified for a few commonly used GCR models. A validation metric is developed that accounts for measurement uncertainty, and the metric is coupled to the fast uncertainty propagation method. For this work, the Badhwar-O'Neill (BON) 2010 and 2011 and the Matthia GCR models are compared to an extensive measurement database. It is shown that BON2011 systematically overestimates heavy ion fluxes in the range 0.5-4 GeV/nucleon. The BON2010 and BON2011 models also show moderate and large errors in reproducing past solar activity near the 2000 solar maximum and 2010 solar minimum. It is found that all three models induce relative errors in effective dose in the interval [-20%, 20%] at a 68% confidence level. The BON2010 and Matthia models are found to have similar overall uncertainty estimates and are preferred for space radiation shielding applications.
Eren, Gokcen; Macchiarulo, Antonio; Banoglu, Erden
2012-02-01
Pharmacological intervention with 5-Lipoxygenase (5-LO) is a promising strategy for treatment of inflammatory and allergic ailments, including asthma. With the aim of developing predictive models of 5-LO affinity and gaining insights into the molecular basis of ligand-target interaction, we herein describe QSAR studies of 59 diverse nonredox-competitive 5-LO inhibitors based on the use of molecular shape descriptors and docking experiments. These studies have successfully yielded a predictive model able to explain much of the variance in the activity of the training set compounds while predicting satisfactorily the 5-LO inhibitory activity of an external test set of compounds. The inspection of the selected variables in the QSAR equation unveils the importance of specific interactions which are observed from docking experiments. Collectively, these results may be used to design novel potent and selective nonredox 5-LO inhibitors. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Climate data induced uncertainty in model-based estimations of terrestrial primary productivity
NASA Astrophysics Data System (ADS)
Wu, Zhendong; Ahlström, Anders; Smith, Benjamin; Ardö, Jonas; Eklundh, Lars; Fensholt, Rasmus; Lehsten, Veiko
2017-06-01
Model-based estimations of historical fluxes and pools of the terrestrial biosphere differ substantially. These differences arise not only from differences between models but also from differences in the environmental and climatic data used as input to the models. Here we investigate the role of uncertainties in historical climate data by performing simulations of terrestrial gross primary productivity (GPP) using a process-based dynamic vegetation model (LPJ-GUESS) forced by six different climate datasets. We find that the climate induced uncertainty, defined as the range among historical simulations in GPP when forcing the model with the different climate datasets, can be as high as 11 Pg C yr-1 globally (9% of mean GPP). We also assessed a hypothetical maximum climate data induced uncertainty by combining climate variables from different datasets, which resulted in significantly larger uncertainties of 41 Pg C yr-1 globally or 32% of mean GPP. The uncertainty is partitioned into components associated with the three main climatic drivers: temperature, precipitation, and shortwave radiation. Additionally, we illustrate how the uncertainty due to a given climate driver depends both on the magnitude of the forcing data uncertainty (climate data range) and the apparent sensitivity of the modeled GPP to the driver (apparent model sensitivity). We find that LPJ-GUESS overestimates GPP compared to empirically based GPP data products in all land cover classes except for tropical forests. Tropical forests emerge as a disproportionate source of uncertainty in GPP estimation both in the simulations and empirical data products. The tropical forest uncertainty is most strongly associated with shortwave radiation and precipitation forcing, of which climate data range contributes more to overall uncertainty than apparent model sensitivity to forcing. Globally, precipitation dominates the climate induced uncertainty over nearly half of the vegetated land area, which is mainly due
'spup' - an R package for uncertainty propagation analysis in spatial environmental modelling
NASA Astrophysics Data System (ADS)
Sawicka, Kasia; Heuvelink, Gerard
2017-04-01
Computer models have become a crucial tool in engineering and environmental sciences for simulating the behaviour of complex static and dynamic systems. However, while many models are deterministic, the uncertainty in their predictions needs to be estimated before they are used for decision support. Currently, advances in uncertainty propagation and assessment have been paralleled by a growing number of software tools for uncertainty analysis, but none has gained recognition for universal applicability and for being able to deal with case studies with spatial models and spatial model inputs. Due to the growing popularity and applicability of the open source R programming language we undertook a project to develop an R package that facilitates uncertainty propagation analysis in spatial environmental modelling. In particular, the 'spup' package provides functions for examining the uncertainty propagation starting from input data and model parameters, via the environmental model onto model predictions. The functions include uncertainty model specification, stochastic simulation and propagation of uncertainty using Monte Carlo (MC) techniques, as well as several uncertainty visualization functions. Uncertain environmental variables are represented in the package as objects whose attribute values may be uncertain and described by probability distributions. Both numerical and categorical data types are handled. Spatial auto-correlation within an attribute and cross-correlation between attributes are also accommodated. For uncertainty propagation the package has implemented the MC approach with efficient sampling algorithms, i.e. stratified random sampling and Latin hypercube sampling. The design includes facilitation of parallel computing to speed up MC computation. The MC realizations may be used as an input to the environmental models called from R, or externally. Selected visualization methods that are understandable by non-experts with limited background in
Bayesian uncertainty quantification in linear models for diffusion MRI.
Sjölund, Jens; Eklund, Anders; Özarslan, Evren; Herberthson, Magnus; Bånkestad, Maria; Knutsson, Hans
2018-03-29
Diffusion MRI (dMRI) is a valuable tool in the assessment of tissue microstructure. By fitting a model to the dMRI signal it is possible to derive various quantitative features. Several of the most popular dMRI signal models are expansions in an appropriately chosen basis, where the coefficients are determined using some variation of least-squares. However, such approaches lack any notion of uncertainty, which could be valuable in e.g. group analyses. In this work, we use a probabilistic interpretation of linear least-squares methods to recast popular dMRI models as Bayesian ones. This makes it possible to quantify the uncertainty of any derived quantity. In particular, for quantities that are affine functions of the coefficients, the posterior distribution can be expressed in closed-form. We simulated measurements from single- and double-tensor models where the correct values of several quantities are known, to validate that the theoretically derived quantiles agree with those observed empirically. We included results from residual bootstrap for comparison and found good agreement. The validation employed several different models: Diffusion Tensor Imaging (DTI), Mean Apparent Propagator MRI (MAP-MRI) and Constrained Spherical Deconvolution (CSD). We also used in vivo data to visualize maps of quantitative features and corresponding uncertainties, and to show how our approach can be used in a group analysis to downweight subjects with high uncertainty. In summary, we convert successful linear models for dMRI signal estimation to probabilistic models, capable of accurate uncertainty quantification. Copyright © 2018 Elsevier Inc. All rights reserved.
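The closed-form posterior mentioned in this abstract can be illustrated with a generic Bayesian linear model. The sketch below is not the authors' implementation: it assumes a zero-mean isotropic Gaussian prior on the coefficients and a known Gaussian noise variance, and all names (`posterior`, `affine_posterior`, `Phi`, `noise_var`, `prior_var`) and the synthetic data are hypothetical. Under those assumptions the coefficient posterior is Gaussian, so any affine quantity q = a·w + b has a Gaussian posterior with mean a·μ + b and variance aᵀΣa.

```python
import numpy as np


def posterior(Phi, y, noise_var, prior_var):
    """Gaussian posterior of w for y = Phi @ w + noise, with prior w ~ N(0, prior_var * I)."""
    d = Phi.shape[1]
    precision = Phi.T @ Phi / noise_var + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ Phi.T @ y / noise_var
    return mean, cov


def affine_posterior(a, b, mean, cov):
    """Posterior mean and variance of the affine quantity q = a @ w + b."""
    return a @ mean + b, a @ cov @ a


# Synthetic example: three basis coefficients, fifty noisy measurements.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = Phi @ w_true + rng.normal(scale=0.1, size=50)

mean, cov = posterior(Phi, y, noise_var=0.1 ** 2, prior_var=100.0)
a = np.array([1.0, 1.0, 1.0])
q_mean, q_var = affine_posterior(a, 0.0, mean, cov)
print(q_mean, np.sqrt(q_var))  # estimate of sum(w_true) and its posterior standard deviation
```

A per-subject variance such as `q_var` is the kind of quantity that could serve as an inverse weight in a group analysis, in the spirit of the downweighting described in the abstract.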
Uncertainty Aware Structural Topology Optimization Via a Stochastic Reduced Order Model Approach
NASA Technical Reports Server (NTRS)
Aguilo, Miguel A.; Warner, James E.
2017-01-01
This work presents a stochastic reduced order modeling strategy for the quantification and propagation of uncertainties in topology optimization. Uncertainty aware optimization problems can be computationally complex due to the substantial number of model evaluations that are necessary to accurately quantify and propagate uncertainties. This computational complexity is greatly magnified if a high-fidelity, physics-based numerical model is used for the topology optimization calculations. Stochastic reduced order model (SROM) methods are applied here to effectively 1) alleviate the prohibitive computational cost associated with an uncertainty aware topology optimization problem; and 2) quantify and propagate the inherent uncertainties due to design imperfections. A generic SROM framework that transforms the uncertainty aware, stochastic topology optimization problem into a deterministic optimization problem that relies only on independent calls to a deterministic numerical model is presented. This approach facilitates the use of existing optimization and modeling tools to accurately solve the uncertainty aware topology optimization problems in a fraction of the computational demand required by Monte Carlo methods. Finally, an example in structural topology optimization is presented to demonstrate the effectiveness of the proposed uncertainty aware structural topology optimization approach.
2011-01-01
used in efforts to develop QSAR models. Measurement of Repellent Efficacy Screening for Repellency of Compounds with Unknown Toxicology In screening...CPT) were used to develop Quantitative Structure Activity Relationship (QSAR) models to predict repellency. Successful prediction of novel...acylpiperidine QSAR models employed 4 descriptors to describe the relationship between structure and repellent duration. The ANN model of the carboxamides did not
Liu, Jing; Li, Yan; Zhang, Shuwei; Xiao, Zhengtao; Ai, Chunzhi
2011-01-01
In recent years, great interest has been paid to the development of compounds with high selectivity for central dopamine (DA) D3 receptors, an interesting therapeutic target in the treatment of different neurological disorders. In the present work, based on a dataset of 110 collected benzazepine (BAZ) DA D3 antagonists with diverse kinds of structures, a variety of in silico modeling approaches, including comparative molecular field analysis (CoMFA), comparative similarity indices analysis (CoMSIA), homology modeling, molecular docking and molecular dynamics (MD), were carried out to reveal the requisite 3D structural features for activity. Our results show that both the receptor-based (Q² = 0.603, R²ncv = 0.829, R²pre = 0.690, SEE = 0.316, SEP = 0.406) and ligand-based 3D-QSAR models (Q² = 0.506, R²ncv = 0.838, R²pre = 0.794, SEE = 0.316, SEP = 0.296) are reliable with proper predictive capacity. In addition, a combined analysis of the CoMFA and CoMSIA contour maps and the MD results with a homology DA receptor model shows that: (1) ring-A, position-2 and the R3 substituent in ring-D are crucial in the design of antagonists with higher activity; (2) bulkier R1 substituents (at position-2 of ring-A) of antagonists may fit well in the binding pocket; (3) hydrophobicity represented by MlogP is important for building satisfactory QSAR models; (4) key amino acids of the binding pocket are CYS101, ILE105, LEU106, VAL151, PHE175, PHE184, PRO254 and ALA251. To the best of our knowledge, this work is the first report on 3D-QSAR modeling of the new fused BAZs as DA D3 antagonists. These results might provide information for a better understanding of the mechanism of antagonism and thus be helpful in designing new potent DA D3 antagonists. PMID:21541053
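The statistics quoted in this abstract contrast a non-cross-validated fit (R²ncv) with a cross-validated one (Q²). The sketch below shows how the two are commonly computed; it substitutes ordinary least squares for the PLS regression that CoMFA and CoMSIA use, takes Q² as 1 − PRESS/SS_tot with leave-one-out predictions and the overall mean (one common convention), and runs on synthetic descriptors. All function names and data are hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients with an intercept column prepended."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef


def r2_noncv(X, y):
    """R^2 of the model fitted and evaluated on the same compounds."""
    resid = y - predict(fit_ols(X, y), X)
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)


def q2_loo(X, y):
    """Leave-one-out Q^2: each compound is predicted by a model fitted without it."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_ols(X[keep], y[keep])
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


# Synthetic descriptors and activities, for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
y = X @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=0.4, size=30)

r2 = r2_noncv(X, y)
q2 = q2_loo(X, y)
print(f"R2 = {r2:.3f}, Q2(LOO) = {q2:.3f}")
```

For a least-squares fit Q² cannot exceed R², since each left-out residual is at least as large in magnitude as the corresponding fitted residual. A wide gap between the two, as in several of the records listed here, is a common warning sign of overfitting.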
NASA Astrophysics Data System (ADS)
Montanari, C. A.; Tute, M. S.; Beezer, A. E.; Mitchell, J. C.
1996-02-01
Results are presented for a QSAR analysis of bisamidines, using a similarity index as descriptor. The method allows for differences in conformation of bisamidines at the receptor site to be taken into consideration. In particular, it has been suggested by others that pentamidine binds in the minor groove of DNA in a so-called isohelical conformation, and our QSAR supports this suggestion. The molecular similarity index for comparison of molecules can be used as a parameter for correlating and hence rationalising the activity as well as suggesting the design of bioactive molecules. The studied compounds had been evaluated for potency against Leishmania mexicana amazonensis, and this potency was used as a dependent variable in a series of QSAR analyses. For the calculation of similarity indexes, each analogue was in turn superimposed on a chosen lead compound in a reference conformation, either extended or isohelical, maximising overlap and hence similarity by flexible fitting.
Propagation of the velocity model uncertainties to the seismic event location
NASA Astrophysics Data System (ADS)
Gesret, A.; Desassis, N.; Noble, M.; Romary, T.; Maisons, C.
2015-01-01
Earthquake hypocentre locations are crucial in many domains of application (academic and industrial) as seismic event location maps are commonly used to delineate faults or fractures. The interpretation of these maps depends on location accuracy and on the reliability of the associated uncertainties. The largest contribution to location and uncertainty errors is due to the fact that the velocity model errors are usually not correctly taken into account. We propose a new Bayesian formulation that properly integrates knowledge of the velocity model into the formulation of the probabilistic earthquake location. In this work, the velocity model uncertainties are first estimated with a Bayesian tomography of active shot data. We implement a Monte Carlo type sampling algorithm to generate velocity models distributed according to the posterior distribution. In a second step, we propagate the velocity model uncertainties to the seismic event location in a probabilistic framework. This makes it possible to obtain more reliable hypocentre locations as well as their associated uncertainties, accounting for picking and velocity model uncertainties. We illustrate the tomography results and the gain in accuracy of earthquake location for two synthetic examples and one real data case study in the context of induced microseismicity.
Vijaya Prabhu, Sitrarasu; Singh, Sanjeev Kumar
2018-05-28
An atom-based three-dimensional quantitative structure-activity relationship (3D-QSAR) model was developed on the basis of a 5-point pharmacophore hypothesis (AARRR), with two hydrogen bond acceptors (A) and three aromatic rings (R), for derivatives of thieno[2,3-b]pyridine, which modulate activity to inhibit the mGluR5 receptor. A highly predictive 3D-QSAR model was generated using the alignment of the predicted pharmacophore hypothesis for the training set (R² = 0.84, SD = 0.26, F = 45.8, N = 29) and test set (Q² = 0.74, RMSE = 0.235, Pearson-R = 0.94, N = 9). The best pharmacophore hypothesis, AARRR, was selected, and the developed 3D-QSAR model also supported the outcome of this study by means of the favorable and unfavorable electron withdrawing group and hydrophobic regions of the most active compound 42d and the least active compound 18b. Subsequently, induced fit docking and binding free energy calculations revealed the reliable binding orientation of the compounds. Finally, molecular dynamics simulations for 100 ns were performed to depict the protein-ligand stability. We anticipate that the resulting outcome could support the discovery of potent negative allosteric modulators for metabotropic glutamate receptor 5 (mGluR5).
NASA Astrophysics Data System (ADS)
Langer, P.; Sepahvand, K.; Guist, C.; Bär, J.; Peplow, A.; Marburg, S.
2018-03-01
A simulation model that examines the dynamic behavior of real structures needs to address the impact of uncertainty in both geometry and material parameters. This article investigates three-dimensional finite element models for structural dynamics problems with respect to both model and parameter uncertainties. The parameter uncertainties are determined via laboratory measurements on several beam-like samples. The parameters are then treated as random variables in the finite element model to explore the effects of uncertainty on the quality of the model outputs, i.e. natural frequencies. The accuracy of the output predictions from the model is compared with the experimental results. To this end, non-contact experimental modal analysis is conducted to identify the natural frequencies of the samples. The results show good agreement with the experimental data. Furthermore, it is demonstrated that geometrical uncertainties have more influence on the natural frequencies compared to material parameters, and that material uncertainties are about two times higher than geometrical uncertainties. This gives valuable insights for improving the finite element model due to various parameter ranges required in a modeling process involving uncertainty.
Inter-sectoral comparison of model uncertainty of climate change impacts in Africa
NASA Astrophysics Data System (ADS)
van Griensven, Ann; Vetter, Tobias; Piontek, Franzisca; Gosling, Simon N.; Kamali, Bahareh; Reinhardt, Julia; Dinkneh, Aklilu; Yang, Hong; Alemayehu, Tadesse
2016-04-01
We present the model results and their uncertainties of an inter-sectoral impact model inter-comparison initiative (ISI-MIP) for climate change impacts in Africa. The study includes results on hydrological, crop and health aspects. The impact models used ensemble inputs consisting of 20 time series of daily rainfall and temperature data obtained from 5 Global Circulation Models (GCMs) and 4 Representative Concentration Pathways (RCPs). In this study, we analysed model uncertainty for the Regional Hydrological Models, Global Hydrological Models, Malaria models and Crop models. For the regional hydrological models, we used 2 African test cases: the Blue Nile in Eastern Africa and the Niger in Western Africa. For both basins, the main sources of uncertainty originate from the GCMs and RCPs, while the uncertainty of the regional hydrological models is relatively low. The hydrological model uncertainty becomes more important when predicting changes in low flows compared to mean or high flows. For the other sectors, the impact models have the largest share of uncertainty compared to GCMs and RCPs, especially for malaria and crop modelling. The overall conclusion of the ISI-MIP is that an ensemble modelling approach is strongly advised for climate change impact studies throughout the whole modelling chain.
An Adaptation Dilemma Caused by Impacts-Modeling Uncertainty
NASA Astrophysics Data System (ADS)
Frieler, K.; Müller, C.; Elliott, J. W.; Heinke, J.; Arneth, A.; Bierkens, M. F.; Ciais, P.; Clark, D. H.; Deryng, D.; Doll, P. M.; Falloon, P.; Fekete, B. M.; Folberth, C.; Friend, A. D.; Gosling, S. N.; Haddeland, I.; Khabarov, N.; Lomas, M. R.; Masaki, Y.; Nishina, K.; Neumann, K.; Oki, T.; Pavlick, R.; Ruane, A. C.; Schmid, E.; Schmitz, C.; Stacke, T.; Stehfest, E.; Tang, Q.; Wisser, D.
2013-12-01
Ensuring future well-being for a growing population under either strong climate change or an aggressive mitigation strategy requires a subtle balance of potentially conflicting response measures. In the case of competing goals, uncertainty in impact estimates plays a central role when high confidence in achieving a primary objective (such as food security) directly implies an increased probability of uncertainty-induced failure with regard to a competing target (such as climate protection). We use cross-sectoral consistent multi-impact model simulations from the Inter-Sectoral Impact Model Intercomparison Project (ISI-MIP, www.isi-mip.org) to illustrate this uncertainty dilemma: RCP projections from 7 global crop, 11 hydrological, and 7 biome models are combined to analyze irrigation and land use changes as possible responses to climate change and increasing crop demand due to population growth and economic development. We show that - while a no-regrets option with regard to climate protection - additional irrigation alone is not expected to balance the demand increase by 2050. In contrast, a strong expansion of cultivated land closes the projected production-demand gap in some crop models. However, it comes at the expense of a loss of natural carbon sinks on the order of 50%. Given the large uncertainty of state-of-the-art crop model projections, even these strong land use changes would not bring us 'on the safe side' with respect to food supply. In a world where increasing carbon emissions continue to shrink the overall solution space, we demonstrate that current impacts-modeling uncertainty is a luxury we cannot afford. ISI-MIP is intended to provide cross-sectoral consistent impact projections for model intercomparison and improvement as well as cross-sectoral integration. The results presented here were generated within the first Fast-Track phase of the project covering global impact projections. The second phase will also include regional projections. It is the aim
Active subspace uncertainty quantification for a polydomain ferroelectric phase-field model
NASA Astrophysics Data System (ADS)
Leon, Lider S.; Smith, Ralph C.; Miles, Paul; Oates, William S.
2018-03-01
Quantum-informed ferroelectric phase-field models capable of predicting material behavior are necessary for facilitating the development and production of many adaptive structures and intelligent systems. Uncertainty is present in these models, given the quantum scale at which calculations take place. A necessary analysis is to determine how the uncertainty in the response can be attributed to the uncertainty in the model inputs or parameters. A second analysis is to identify active subspaces within the original parameter space, which quantify the directions in which the model response varies most dominantly, thus reducing sampling effort and computational cost. In this investigation, we identify an active subspace for a poly-domain ferroelectric phase-field model. Using the active variables as our independent variables, we then construct a surrogate model and perform Bayesian inference. Once we quantify the uncertainties in the active variables, we obtain uncertainties for the original parameters via an inverse mapping. The analysis provides insight into how active subspace methodologies can be used to reduce the computational power needed to perform Bayesian inference on model parameters informed by experimental or simulated data.
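As a rough illustration of the active-subspace idea in this abstract, the sketch below estimates the dominant direction of variation of a toy two-parameter function from sampled gradients. Everything here is an assumption made for illustration: the toy function, the weight vector `W`, the uniform input range and the sample count are hypothetical and unrelated to the ferroelectric model, and power iteration stands in for a full eigendecomposition of the gradient covariance matrix.

```python
import math
import random

# Hypothetical toy model: f(x) = exp(W . x), which varies only along W.
W = (0.6, 0.8)

def grad_f(x):
    """Gradient of the toy function f(x) = exp(W . x)."""
    scale = math.exp(W[0] * x[0] + W[1] * x[1])
    return (W[0] * scale, W[1] * scale)

def gradient_covariance(n_samples, rng):
    """Monte Carlo estimate of C = E[grad f grad f^T] for inputs uniform on [-1, 1]^2."""
    c = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n_samples):
        x = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        g = grad_f(x)
        for i in range(2):
            for j in range(2):
                c[i][j] += g[i] * g[j] / n_samples
    return c

def dominant_direction(c, n_iter=100):
    """Power iteration for the leading eigenvector of a symmetric 2x2 matrix."""
    v = (1.0, 0.0)
    for _ in range(n_iter):
        w = (c[0][0] * v[0] + c[0][1] * v[1], c[1][0] * v[0] + c[1][1] * v[1])
        norm = math.hypot(w[0], w[1])
        v = (w[0] / norm, w[1] / norm)
    return v

rng = random.Random(0)
C = gradient_covariance(2000, rng)
active_dir = dominant_direction(C)
print(active_dir)
```

Because the toy function changes only along `W`, the recovered direction should align with `W`; in a real model the leading eigenvectors would define the active variables on which a surrogate is built.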
NASA Astrophysics Data System (ADS)
Debry, E.; Malherbe, L.; Schillinger, C.; Bessagnet, B.; Rouil, L.
2009-04-01
Evaluation of human exposure to atmospheric pollution usually requires knowledge of pollutant concentrations in ambient air. In the framework of the PAISA project, which studies the influence of socio-economic status on relationships between air pollution and short-term health effects, the concentrations of gas and particle pollutants are computed over Strasbourg with the ADMS-Urban model. As for any modeling result, simulated concentrations come with uncertainties which have to be characterized and quantified. There are several sources of uncertainty: those related to input data and parameters, i.e. fields used to execute the model such as meteorological fields, boundary conditions and emissions; those related to the model formulation, because of incomplete or inaccurate treatment of dynamical and chemical processes; and those inherent to the stochastic behavior of the atmosphere and human activities [1]. Our aim here is to assess the uncertainties of the simulated concentrations with respect to input data and model parameters. To this end, the first step consisted in identifying the input data and model parameters that contribute most to the space and time variability of predicted concentrations. Concentrations of several pollutants were simulated for two months in winter 2004 and two months in summer 2004 over five areas of Strasbourg. The sensitivity analysis shows the dominating influence of boundary conditions and emissions. Among model parameters, the roughness and Monin-Obukhov lengths appear to have non-negligible local effects. Dry deposition is also an important dynamic process. The second step of the characterization and quantification of uncertainties consists in attributing a probability distribution to each input datum and model parameter and in propagating the joint distribution of all data and parameters through the model so as to associate a probability distribution with the modeled concentrations. Several analytical and numerical methods exist to perform an
Model and parametric uncertainty in source-based kinematic models of earthquake ground motion
Hartzell, Stephen; Frankel, Arthur; Liu, Pengcheng; Zeng, Yuehua; Rahman, Shariftur
2011-01-01
Four independent ground-motion simulation codes are used to model the strong ground motion for three earthquakes: 1994 Mw 6.7 Northridge, 1989 Mw 6.9 Loma Prieta, and 1999 Mw 7.5 Izmit. These 12 sets of synthetics are used to make estimates of the variability in ground-motion predictions. In addition, ground-motion predictions over a grid of sites are used to estimate parametric uncertainty for changes in rupture velocity. We find that the combined model uncertainty and random variability of the simulations is in the same range as the variability of regional empirical ground-motion data sets. The majority of the standard deviations lie between 0.5 and 0.7 natural-log units for response spectra and 0.5 and 0.8 for Fourier spectra. The estimate of model epistemic uncertainty, based on the different model predictions, lies between 0.2 and 0.4, which is about one-half of the estimates for the standard deviation of the combined model uncertainty and random variability. Parametric uncertainty, based on variation of just the average rupture velocity, is shown to be consistent in amplitude with previous estimates, showing percentage changes in ground motion from 50% to 300% when rupture velocity changes from 2.5 to 2.9 km/s. In addition, there is some evidence that mean biases can be reduced by averaging ground-motion estimates from different methods.
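The standard deviations quoted in this abstract are expressed in natural-log units. A minimal sketch of that kind of statistic, assuming a made-up set of predictions from four codes at a single site (the numbers and the helper name `ln_std` are hypothetical, not taken from the study):

```python
import math
import statistics

def ln_std(values):
    """Sample standard deviation of ln(values), i.e. spread in natural-log units."""
    return statistics.stdev([math.log(v) for v in values])

# Hypothetical spectral accelerations (in g) predicted at one site by four codes.
predictions = [0.21, 0.30, 0.26, 0.35]
sigma_model = ln_std(predictions)
print(round(sigma_model, 3))
```

Working in log units means the spread describes multiplicative scatter, so the same value applies whether the ground-motion amplitudes are large or small.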
Jiang, Ai; Cheng, Zhiwen; Shen, Zhemin; Guo, Weimin
2018-02-13
This paper studies temperature-dependent quantitative structure-activity relationship (QSAR) models of the supercritical water oxidation (SCWO) process, developed from the Arrhenius relationship between oxidation reaction rate and temperature. Kinetic rate constants were studied for 21 organic substances, including azo dyes, heterocyclic compounds and ionic compounds. We propose the concept of T_R95, defined as the temperature at which the removal ratio reaches 95%; it is a key indicator for evaluating a compound's complete oxidation. Quantum chemical parameters were computed for each organic compound using Gaussian 09 and Materials Studio 7.0. The optimum model is T_R95 = 654.775 + 1761.910 f(+)n - 177.211 qH, with squared regression coefficient R² = 0.620 and standard error SE = 35.1. Nearly all the compounds obtained accurate predictions of their degradation rate. The QSAR model reveals three determinant factors that are directly related to degradation rules. Specifically, the lowest f(+) value of main-chain atoms, f(+)n, indicates the degree of affinity for nucleophilic attack; qH shows the ease or difficulty of valence-bond breakage in organic molecules; and BOx refers to the stability of a bond. The degradation mechanism can reasonably be illustrated from each perspective, providing deeper insight into universal and propagable oxidation rules. In addition, the satisfactory results of internal and external validations suggest the stability, reliability and predictive ability of the optimum model.
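The reported regression can be evaluated directly once the two descriptors are known. The sketch below only encodes the coefficients quoted in the abstract; the function name and the descriptor values are hypothetical, and the abstract does not state the temperature unit, so none is assumed here.

```python
def predict_t_r95(f_plus_n, q_h):
    """Evaluate the quoted linear QSAR: T_R95 = 654.775 + 1761.910*f(+)n - 177.211*qH."""
    return 654.775 + 1761.910 * f_plus_n - 177.211 * q_h

# Hypothetical descriptor values, chosen only to exercise the formula.
example = predict_t_r95(f_plus_n=0.02, q_h=0.15)
print(round(example, 3))
```

With a reported standard error of 35.1, any single prediction from this formula should be read as carrying an uncertainty of roughly that size.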
Gao, Xiaodong; Han, Liping; Ren, Yujie
2016-05-05
Checkpoint kinase 1 (Chk1) is an important serine/threonine kinase with a self-protection function. The combination of Chk1 inhibitors and anti-cancer drugs can enhance the selectivity of tumor therapy. In this work, a set of 1,7-diazacarbazole analogs were identified as potent Chk1 inhibitors through a series of computer-aided drug design processes, including three-dimensional quantitative structure-activity relationship (3D-QSAR) modeling, molecular docking, and molecular dynamics simulations. The optimal QSAR models showed significant cross-validated correlation q² values (0.531, 0.726), fitted correlation r² coefficients (higher than 0.90), and standard error of prediction (less than 0.250). These results suggested that the developed models possess good predictive ability. Moreover, molecular docking and molecular dynamics simulations were applied to highlight the important interactions between the ligand and the Chk1 receptor protein. This study shows that hydrogen bonding and electrostatic forces are key interactions that confer bioactivity.
MotieGhader, Habib; Gharaghani, Sajjad; Masoudi-Sobhanzadeh, Yosef; Masoudi-Nejad, Ali
2017-01-01
Feature selection is of great importance in Quantitative Structure-Activity Relationship (QSAR) analysis. This problem has been addressed using meta-heuristic algorithms such as GA, PSO and ACO. In this work, two novel hybrid meta-heuristic algorithms, Sequential GA and LA (SGALA) and Mixed GA and LA (MGALA), which are based on the genetic algorithm and learning automata, are proposed for QSAR feature selection. The SGALA algorithm uses the advantages of the genetic algorithm and learning automata sequentially, and the MGALA algorithm uses them simultaneously. We applied the proposed algorithms to select the minimum possible number of features from three different datasets and observed that MGALA and SGALA had the best outcomes, both individually and on average, compared with other feature selection algorithms. Comparing the algorithms, we found that the rate of convergence to the optimal result of MGALA and SGALA was better than that of the GA, ACO, PSO and LA algorithms. Finally, the features selected by the GA, ACO, PSO, LA, SGALA and MGALA algorithms were used as input to an LS-SVR model, and the LS-SVR model had more predictive ability with input from SGALA and MGALA than with input from any of the other algorithms. The results therefore corroborate that both the predictive efficiency and the rate of convergence of the proposed algorithms are superior to those of the other algorithms. PMID:28979308
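For readers unfamiliar with GA-based feature selection, the sketch below shows a plain genetic algorithm over feature bitmasks. It is not SGALA or MGALA and contains no learning-automata component; the usefulness scores, the size penalty and all parameter values are hypothetical stand-ins for a real model-based fitness such as cross-validated error.

```python
import random

# Hypothetical per-feature usefulness scores; a real QSAR study would instead
# score a subset by fitting a model (e.g. cross-validated error).
USEFULNESS = [0.9, 0.1, 0.8, 0.05, 0.0, 0.7, 0.02, 0.03]
PENALTY = 0.2  # cost per selected feature, favouring small subsets

def fitness(mask):
    """Total usefulness of the selected features minus a size penalty."""
    return sum(u - PENALTY for u, bit in zip(USEFULNESS, mask) if bit)

def evolve(pop_size=20, generations=40, mutation_rate=0.1, seed=0):
    """Plain GA with tournament selection, one-point crossover and elitism."""
    rng = random.Random(seed)
    n = len(USEFULNESS)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = max(pop, key=fitness)
        history.append(fitness(best))
        new_pop = [best[:]]  # elitism: carry the best mask over unchanged
        while len(new_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            new_pop.append(child)
        pop = new_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history

best_mask, best_history = evolve()
print(best_mask, round(fitness(best_mask), 2))
```

Elitism guarantees that the best fitness never decreases from one generation to the next, which makes the convergence history easy to compare between algorithms.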
Uncertainty Quantification of Turbulence Model Closure Coefficients for Transonic Wall-Bounded Flows
NASA Technical Reports Server (NTRS)
Schaefer, John; West, Thomas; Hosder, Serhat; Rumsey, Christopher; Carlson, Jan-Renee; Kleb, William
2015-01-01
The goal of this work was to quantify the uncertainty and sensitivity of commonly used turbulence models in Reynolds-Averaged Navier-Stokes codes due to uncertainty in the values of closure coefficients for transonic, wall-bounded flows and to rank the contribution of each coefficient to uncertainty in various output flow quantities of interest. Specifically, uncertainty quantification of turbulence model closure coefficients was performed for transonic flow over an axisymmetric bump at zero degrees angle of attack and the RAE 2822 transonic airfoil at a lift coefficient of 0.744. Three turbulence models were considered: the Spalart-Allmaras Model, the Wilcox (2006) k-ω Model, and the Menter Shear-Stress Transport Model. The FUN3D code developed by NASA Langley Research Center was used as the flow solver. The uncertainty quantification analysis employed stochastic expansions based on non-intrusive polynomial chaos as an efficient means of uncertainty propagation. Several integrated and point quantities are considered as uncertain outputs for both CFD problems. All closure coefficients were treated as epistemic uncertain variables represented with intervals. Sobol indices were used to rank the relative contributions of each closure coefficient to the total uncertainty in the output quantities of interest. This study identified a number of closure coefficients for each turbulence model for which more information will reduce the amount of uncertainty in the output significantly for transonic, wall-bounded flows.
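To make the ranking step concrete, the sketch below estimates first-order Sobol indices for a toy function by plain Monte Carlo "pick-freeze" sampling. This is a substitute technique chosen for brevity: the study itself derives the indices from non-intrusive polynomial chaos expansions, and the linear `model`, the interval [-1, 1] and the sample size are all hypothetical.

```python
import random

def model(x):
    """Hypothetical stand-in for a flow-solver output: linear in three coefficients."""
    return 4.0 * x[0] + 2.0 * x[1] + 1.0 * x[2]

def first_order_sobol(f, dim, n, rng):
    """Pick-freeze Monte Carlo estimate of first-order Sobol indices on [-1, 1]^dim."""
    a = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    b = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    fa = [f(row) for row in a]
    fb = [f(row) for row in b]
    mean = sum(fa) / n
    var = sum((y - mean) ** 2 for y in fa) / (n - 1)
    indices = []
    for i in range(dim):
        # Rows of A with entry i taken from B: only input i differs from A.
        fab = [f(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        v_i = sum((yb - mean) * (yab - ya) for yb, yab, ya in zip(fb, fab, fa)) / n
        indices.append(v_i / var)
    return indices

sobol = first_order_sobol(model, dim=3, n=20000, rng=random.Random(1))
print([round(s, 2) for s in sobol])
```

For this linear toy function the exact indices are 16/21, 4/21 and 1/21, so the estimate should rank the first coefficient as the dominant contributor.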
“Wrong, but Useful”: Negotiating Uncertainty in Infectious Disease Modelling
Christley, Robert M.; Mort, Maggie; Wynne, Brian; Wastling, Jonathan M.; Heathwaite, A. Louise; Pickup, Roger; Austin, Zoë; Latham, Sophia M.
2013-01-01
For infectious disease dynamical models to inform policy for containment of infectious diseases the models must be able to predict; however, it is well recognised that such prediction will never be perfect. Nevertheless, the consensus is that although models are uncertain, some may yet inform effective action. This assumes that the quality of a model can be ascertained in order to evaluate sufficiently model uncertainties, and to decide whether or not, or in what ways or under what conditions, the model should be ‘used’. We examined uncertainty in modelling, utilising a range of data: interviews with scientists, policy-makers and advisors, and analysis of policy documents, scientific publications and reports of major inquiries into key livestock epidemics. We show that the discourse of uncertainty in infectious disease models is multi-layered, flexible, contingent, embedded in context and plays a critical role in negotiating model credibility. We argue that usability and stability of a model is an outcome of the negotiation that occurs within the networks and discourses surrounding it. This negotiation employs a range of discursive devices that renders uncertainty in infectious disease modelling a plastic quality that is amenable to ‘interpretive flexibility’. The utility of models in the face of uncertainty is a function of this flexibility, the negotiation this allows, and the contexts in which model outputs are framed and interpreted in the decision making process. We contend that rather than being based predominantly on beliefs about quality, the usefulness and authority of a model may at times be primarily based on its functional status within the broad social and political environment in which it acts. PMID:24146851
Rapid Non-Gaussian Uncertainty Quantification of Seismic Velocity Models and Images
NASA Astrophysics Data System (ADS)
Ely, G.; Malcolm, A. E.; Poliannikov, O. V.
2017-12-01
Conventional seismic imaging typically provides a single estimate of the subsurface without any error bounds. Noise in the observed raw traces as well as the uncertainty of the velocity model directly impact the uncertainty of the final seismic image and its resulting interpretation. We present a Bayesian inference framework to quantify uncertainty in both the velocity model and seismic images, given noise statistics of the observed data. To estimate velocity model uncertainty, we combine the field expansion method, a fast frequency-domain wave equation solver, with the adaptive Metropolis-Hastings algorithm. The speed of the field expansion method and its reduced parameterization allow us to perform the tens or hundreds of thousands of forward solves needed for non-parametric posterior estimations. We then migrate the observed data with the distribution of velocity models to generate uncertainty estimates of the resulting subsurface image. This procedure allows us to create both qualitative descriptions of seismic image uncertainty and put error bounds on quantities of interest such as the dip angle of a subduction slab or thickness of a stratigraphic layer.
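The sampling loop at the heart of such a framework can be sketched with a basic random-walk Metropolis-Hastings sampler. This is a simplification: it is not the adaptive variant named in the abstract, and the one-parameter Gaussian `log_posterior` is a hypothetical placeholder for a posterior that would in practice require a wave-equation forward solve per evaluation.

```python
import math
import random

def log_posterior(v):
    """Hypothetical log-posterior for one velocity parameter: Gaussian, mean 3.0, sd 0.5."""
    return -0.5 * ((v - 3.0) / 0.5) ** 2

def metropolis_hastings(log_p, start, step, n_samples, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    samples = []
    current, current_lp = start, log_p(start)
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_p(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples

chain = metropolis_hastings(log_posterior, start=3.0, step=0.5,
                            n_samples=20000, rng=random.Random(0))
kept = chain[2000:]  # discard an initial burn-in segment
post_mean = sum(kept) / len(kept)
post_sd = math.sqrt(sum((v - post_mean) ** 2 for v in kept) / (len(kept) - 1))
print(round(post_mean, 2), round(post_sd, 2))
```

Each loop iteration costs one posterior evaluation, which is why a fast forward solver matters when tens of thousands of samples are needed.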
Quantum-memory-assisted entropic uncertainty in spin models with Dzyaloshinskii-Moriya interaction
NASA Astrophysics Data System (ADS)
Huang, Zhiming
2018-02-01
In this article, we investigate the dynamics and correlations of quantum-memory-assisted entropic uncertainty, the tightness of the uncertainty, entanglement, quantum correlation and mixedness for various spin chain models with Dzyaloshinskii-Moriya (DM) interaction, including the XXZ model with DM interaction, the XY model with DM interaction and the Ising model with DM interaction. We find that the uncertainty grows to a stable value with growing temperature but reduces as the coupling coefficient, anisotropy parameter and DM values increase. It is found that the entropic uncertainty is closely correlated with the mixedness of the system. The increasing quantum correlation can result in a decrease in the uncertainty, and the robustness of quantum correlation is better than entanglement since entanglement means sudden birth and death. The tightness of the uncertainty drops to zero, apart from slight volatility as various parameters increase. Furthermore, we propose an effective approach to steering the uncertainty by weak measurement reversal.
Sensitivity and uncertainty analysis for the annual phosphorus loss estimator model
USDA-ARS's Scientific Manuscript database
Models are often used to predict phosphorus (P) loss from agricultural fields. While it is commonly recognized that there are inherent uncertainties with model predictions, limited studies have addressed model prediction uncertainty. In this study we assess the effect of model input error on predict...
Liu, Ming; He, Lin; Hu, Xiaopeng; Liu, Peiqing; Luo, Hai-Bin
2010-12-01
The nociceptin/orphanin FQ receptor (NOP) has been implicated in a wide range of biological functions, including pain, anxiety, depression and drug abuse. In particular, its agonists have great potential to be developed into anxiolytics. However, the crystal structure of NOP is still not available. In the present work, both structure-based and ligand-based modeling methods have been used to achieve a comprehensive understanding of 67 N-substituted spiropiperidine analogues as NOP agonists. The comparative molecular field analysis (CoMFA) method was used to formulate a reasonable 3D-QSAR model (cross-validated coefficient q² = 0.819 and conventional r² = 0.950), whose robustness and predictability were further verified by leave-eight-out, Y-randomization, and external test-set validations. The excellent performance of CoMFA in explaining the affinity differences among these compounds was attributed to the contributions of electrostatic/hydrogen-bonding and steric/hydrophobic interactions, which was supported by the Surflex-Dock and CDOCKER molecular-docking simulations based on the 3D model of NOP built by the homology modeling method. The CoMFA contour maps and the molecular docking simulations were integrated to propose a binding mode for the spiropiperidine analogues at the binding site of NOP.
Environmental Containment Property Estimation Using QSARs in an Expert System
1991-10-15
economical method to estimate aqueous solubility, octanol/water partition coefficients, vapor pressures, organic carbon, normalized soil sorption...PROPERTY ESTIMATION USING QSARs IN AN EXPERT SYSTEM William J. Doucette Mark S. Holt Doug J. Denne Joan E. McLean Utah State University Utah Water ...persistence of a chemical are aqueous solubility, octanol/water partition coefficient, soil/water sorption coefficient, Henry’s Law constant
Uncertainties in future-proof decision-making: the Dutch Delta Model
NASA Astrophysics Data System (ADS)
IJmker, Janneke; Snippen, Edwin; Ruijgh, Erik
2013-04-01
In 1953, a number of European countries experienced flooding after a major storm event coming from the northwest. Over 2100 people died in the resulting floods, 1800 of them Dutch. This gave rise to the development of the so-called Delta Works and Zuiderzee Works, which strongly reduced the flood risk in the Netherlands. These measures were a response to a large flooding event. As boundary conditions have changed (increasing population, increasing urban development, etc.), the flood risk should be evaluated continuously, and measures should be taken if necessary. The Delta Programme was designed to be prepared for future changes and to limit the flood risk, taking into account economics, nature, landscape, residence and recreation. To support decisions in the Delta Programme, the Delta Model was developed. By using four different input scenarios (extremes in climate and economics) and variations in system setup, the outcomes of the Delta Model represent a range of possible outcomes for the hydrological situation in 2050 and 2100. These results flow into effect models that give insight into the integrated effects on freshwater supply (including navigation, industry and ecology) and flood risk. As the long-term water management policy of the Netherlands for the next decades will be based on these results, they have to be reliable. Therefore, a study was carried out to investigate the impact of uncertainties on the model outcomes. The study focused on "known unknowns": uncertainties in the boundary conditions, in the parameterization and in the model itself. This showed that for different parts of the Netherlands, the total uncertainty is in the order of meters! Nevertheless, (1) the total uncertainty is dominated by uncertainties in boundary conditions. Internal model uncertainties are subordinate to that. Furthermore, (2) the model responses develop in a logical way, such that the exact model outcomes might be uncertain, but the outcomes of different model runs
Can hydraulic-modelled rating curves reduce uncertainty in high flow data?
NASA Astrophysics Data System (ADS)
Westerberg, Ida; Lam, Norris; Lyon, Steve W.
2017-04-01
Flood risk assessments rely on accurate discharge data records. Establishing a reliable rating curve for calculating discharge from stage at a gauging station normally takes years of data collection efforts. Estimation of high flows is particularly difficult as high flows occur rarely and are often practically difficult to gauge. Hydraulically-modelled rating curves can be derived based on as few as two concurrent stage-discharge and water-surface slope measurements at different flow conditions. This means that a reliable rating curve can, potentially, be derived much faster than a traditional rating curve based on numerous stage-discharge gaugings. In this study we compared the uncertainty in discharge data that resulted from these two rating curve modelling approaches. We applied both methods to a Swedish catchment, accounting for uncertainties in the stage-discharge gauging and water-surface slope data for the hydraulic model and in the stage-discharge gauging data and rating-curve parameters for the traditional method. We focused our analyses on high-flow uncertainty and the factors that could reduce this uncertainty. In particular, we investigated which data uncertainties were most important, and at what flow conditions the gaugings should preferably be taken. First results show that the hydraulically-modelled rating curves were more sensitive to uncertainties in the calibration measurements of discharge than of water-surface slope. The uncertainty of the hydraulically-modelled rating curves was lowest within the range of the three calibration stage-discharge gaugings (i.e. between median and two times median flow), whereas uncertainties were higher outside this range. For instance, at the highest observed stage of the 24-year stage record, the 90% uncertainty band was -15% to +40% of the official rating curve. Additional gaugings at high flows (i.e. four to five times median flow) would likely substantially reduce those uncertainties. These first results show
NASA Astrophysics Data System (ADS)
Li, L.; Xu, C.-Y.; Engeland, K.
2012-04-01
Different approaches have been used in hydrological models for model calibration, parameter estimation and analysis of uncertainty sources. The Bayesian method is one of the most widely used methods for uncertainty assessment of hydrological models; it incorporates different sources of information into a single analysis through Bayes' theorem. However, none of these applications treats the uncertainty in extreme flows of hydrological model simulations well. This study proposes a Bayesian modularization approach to uncertainty assessment of conceptual hydrological models that takes extreme flows into account. It includes a comprehensive comparison and evaluation of uncertainty assessments by the new Bayesian modularization approach and by traditional Bayesian models, using the Metropolis-Hastings (MH) algorithm with the daily hydrological model WASMOD. Three likelihood functions are used in combination with the traditional Bayesian approach: the AR(1) plus Normal and time-period-independent model (Model 1), the AR(1) plus Normal and time-period-dependent model (Model 2) and the AR(1) plus multi-normal model (Model 3). The results reveal that (1) the simulations derived from the Bayesian modularization method are more accurate, with the highest Nash-Sutcliffe efficiency value, and (2) the Bayesian modularization method performs best in uncertainty estimates of entire flows and in terms of application and computational efficiency. The study thus introduces a new Bayesian approach for reducing the effect of extreme flows on the discharge uncertainty assessment of hydrological models. Keywords: extreme flow, uncertainty assessment, Bayesian modularization, hydrological model, WASMOD
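The Nash-Sutcliffe efficiency used as the accuracy measure here has a simple closed form: one minus the ratio of the squared simulation error to the variance of the observations about their mean. A minimal sketch, with hypothetical discharge values:

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    error_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    total_ss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error_ss / total_ss

# Hypothetical daily discharges (m^3/s), for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.1, 1.9, 3.2, 3.8]
print(round(nash_sutcliffe(obs, sim), 3))
```

A value of 1 indicates a perfect match, while 0 means the simulation is no better than predicting the observed mean at every time step. Because errors are squared, the score is dominated by high flows, which is one reason extreme flows receive separate treatment.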
Model uncertainties of local-thermodynamic-equilibrium K-shell spectroscopy
NASA Astrophysics Data System (ADS)
Nagayama, T.; Bailey, J. E.; Mancini, R. C.; Iglesias, C. A.; Hansen, S. B.; Blancard, C.; Chung, H. K.; Colgan, J.; Cosse, Ph.; Faussurier, G.; Florido, R.; Fontes, C. J.; Gilleron, F.; Golovkin, I. E.; Kilcrease, D. P.; Loisel, G.; MacFarlane, J. J.; Pain, J.-C.; Rochau, G. A.; Sherrill, M. E.; Lee, R. W.
2016-09-01
Local-thermodynamic-equilibrium (LTE) K-shell spectroscopy is a common tool to diagnose electron density, ne, and electron temperature, Te, of high-energy-density (HED) plasmas. Knowing the accuracy of such diagnostics is important to provide quantitative conclusions of many HED-plasma research efforts. For example, Fe opacities were recently measured at multiple conditions at the Sandia National Laboratories Z machine (Bailey et al., 2015), showing significant disagreement with modeled opacities. Since the plasma conditions were measured using K-shell spectroscopy of tracer Mg (Nagayama et al., 2014), one concern is the accuracy of the inferred Fe conditions. In this article, we investigate the K-shell spectroscopy model uncertainties by analyzing the Mg spectra computed with 11 different models at the same conditions. We find that the inferred conditions differ by ±20-30% in ne and ±2-4% in Te depending on the choice of spectral model. Also, we find that half of the Te uncertainty comes from ne uncertainty. To refine the accuracy of the K-shell spectroscopy, it is important to scrutinize and experimentally validate line-shape theory. We investigate the impact of the inferred ne and Te model uncertainty on the Fe opacity measurements. Its impact is small and does not explain the reported discrepancies.
Assessing Uncertainties in Surface Water Security: A Probabilistic Multi-model Resampling approach
NASA Astrophysics Data System (ADS)
Rodrigues, D. B. B.
2015-12-01
Various uncertainties are involved in the representation of processes that characterize interactions between societal needs, ecosystem functioning, and hydrological conditions. Here, we develop an empirical uncertainty assessment of water security indicators that characterize scarcity and vulnerability, based on a multi-model and resampling framework. We consider several uncertainty sources including those related to: i) observed streamflow data; ii) hydrological model structure; iii) residual analysis; iv) the definition of Environmental Flow Requirement method; v) the definition of critical conditions for water provision; and vi) the critical demand imposed by human activities. We estimate the overall uncertainty coming from the hydrological model by means of a residual bootstrap resampling approach, and by uncertainty propagation through different methodological arrangements applied to a 291 km² agricultural basin within the Cantareira water supply system in Brazil. Together, the two-component hydrograph residual analysis and the block bootstrap resampling approach result in a more accurate and precise estimate of the uncertainty (95% confidence intervals) in the simulated time series. We then compare the uncertainty estimates associated with water security indicators using a multi-model framework and provided by each model uncertainty estimation approach. The method is general and can be easily extended forming the basis for meaningful support to end-users facing water resource challenges by enabling them to incorporate a viable uncertainty analysis into a robust decision making process.
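The block bootstrap idea mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it resamples contiguous blocks of model residuals (to preserve short-range autocorrelation), adds them to the simulated series, and reads off pointwise 95% bounds. The function names, the block length and the number of replicates are assumptions, and the two-component hydrograph residual analysis is not reproduced.

```python
import random


def block_bootstrap(residuals, block_len, rng):
    """Resample residuals in contiguous blocks so short-range autocorrelation is kept."""
    n = len(residuals)
    out = []
    while len(out) < n:
        start = rng.randrange(0, n - block_len + 1)
        out.extend(residuals[start:start + block_len])
    return out[:n]


def bootstrap_interval(simulated, residuals, block_len=5, n_boot=1000, level=0.95, seed=0):
    """Pointwise interval for a simulated series from block-bootstrapped residuals."""
    rng = random.Random(seed)
    n = len(simulated)
    replicates = []
    for _ in range(n_boot):
        res = block_bootstrap(residuals, block_len, rng)
        replicates.append([s + r for s, r in zip(simulated, res)])
    lo_idx = int((1.0 - level) / 2.0 * n_boot)
    hi_idx = int((1.0 + level) / 2.0 * n_boot) - 1
    lower, upper = [], []
    for t in range(n):
        vals = sorted(rep[t] for rep in replicates)  # empirical distribution at time t
        lower.append(vals[lo_idx])
        upper.append(vals[hi_idx])
    return lower, upper
```

Choosing `block_len` is a judgment call: longer blocks preserve more of the residual autocorrelation but give fewer distinct blocks to resample from.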
Effects of input uncertainty on cross-scale crop modeling
NASA Astrophysics Data System (ADS)
Waha, Katharina; Huth, Neil; Carberry, Peter
2014-05-01
The quality of data on climate, soils and agricultural management in the tropics is in general low or data is scarce leading to uncertainty in process-based modeling of cropping systems. Process-based crop models are common tools for simulating crop yields and crop production in climate change impact studies, studies on mitigation and adaptation options or food security studies. Crop modelers are concerned about input data accuracy as this, together with an adequate representation of plant physiology processes and choice of model parameters, are the key factors for a reliable simulation. For example, assuming an error in measurements of air temperature, radiation and precipitation of ± 0.2°C, ± 2 % and ± 3 % respectively, Fodor & Kovacs (2005) estimate that this translates into an uncertainty of 5-7 % in yield and biomass simulations. In our study we seek to answer the following questions: (1) are there important uncertainties in the spatial variability of simulated crop yields on the grid-cell level displayed on maps, (2) are there important uncertainties in the temporal variability of simulated crop yields on the aggregated, national level displayed in time-series, and (3) how does the accuracy of different soil, climate and management information influence the simulated crop yields in two crop models designed for use at different spatial scales? The study will help to determine whether more detailed information improves the simulations and to advise model users on the uncertainty related to input data. We analyse the performance of the point-scale crop model APSIM (Keating et al., 2003) and the global scale crop model LPJmL (Bondeau et al., 2007) with different climate information (monthly and daily) and soil conditions (global soil map and African soil map) under different agricultural management (uniform and variable sowing dates) for the low-input maize-growing areas in Burkina Faso/West Africa. We test the models' response to different levels of input
Influence of model reduction on uncertainty of flood inundation predictions
NASA Astrophysics Data System (ADS)
Romanowicz, R. J.; Kiczko, A.; Osuch, M.
2012-04-01
Derivation of flood risk maps requires an estimation of the maximum inundation extent for a flood with an assumed probability of exceedance, e.g. a 100 or 500 year flood. The results of numerical simulations of flood wave propagation are used to overcome the lack of relevant observations. In practice, deterministic 1-D models are used for flow routing, giving a simplified image of a flood wave propagation process. The solution of a 1-D model depends on the simplifications to the model structure, the initial and boundary conditions and the estimates of model parameters which are usually identified using the inverse problem based on the available noisy observations. Therefore, there is a large uncertainty involved in the derivation of flood risk maps. In this study we examine the influence of model structure simplifications on estimates of flood extent for the urban river reach. As the study area we chose the Warsaw reach of the River Vistula, where nine bridges and several dikes are located. The aim of the study is to examine the influence of water structures on the derived model roughness parameters, with all the bridges and dikes taken into account, with a reduced number and without any water infrastructure. The results indicate that roughness parameter values of a 1-D HEC-RAS model can be adjusted for the reduction in model structure. However, the price we pay is the model robustness. Apart from a relatively simple question regarding reducing model structure, we also try to answer more fundamental questions regarding the relative importance of input, model structure simplification, parametric and rating curve uncertainty to the uncertainty of flood extent estimates. We apply pseudo-Bayesian methods of uncertainty estimation and Global Sensitivity Analysis as the main methodological tools. The results indicate that the uncertainties have a substantial influence on flood risk assessment. In the paper we present a simplified methodology allowing the influence of
Uncertainties in Atomic Data and Their Propagation Through Spectral Models. I.
NASA Technical Reports Server (NTRS)
Bautista, M. A.; Fivet, V.; Quinet, P.; Dunn, J.; Gull, T. R.; Kallman, T. R.; Mendoza, C.
2013-01-01
We present a method for computing uncertainties in spectral models, i.e., level populations, line emissivities, and emission line ratios, based upon the propagation of uncertainties originating from atomic data. We provide analytic expressions, in the form of linear sets of algebraic equations, for the coupled uncertainties among all levels. These equations can be solved efficiently for any set of physical conditions and uncertainties in the atomic data. We illustrate our method applied to spectral models of O iii and Fe ii and discuss the impact of the uncertainties on atomic systems under different physical conditions. As to intrinsic uncertainties in theoretical atomic data, we propose that these uncertainties can be estimated from the dispersion in the results from various independent calculations. This technique provides excellent results for the uncertainties in A-values of forbidden transitions in [Fe ii]. Key words: atomic data - atomic processes - line: formation - methods: data analysis - molecular data - molecular processes - techniques: spectroscopic
Parameter uncertainty analysis for the annual phosphorus loss estimator (APLE) model
USDA-ARS's Scientific Manuscript database
Technical abstract: Models are often used to predict phosphorus (P) loss from agricultural fields. While it is commonly recognized that model predictions are inherently uncertain, few studies have addressed prediction uncertainties using P loss models. In this study, we conduct an uncertainty analys...
Quantifying uncertainty in stable isotope mixing models
Davis, Paul; Syme, James; Heikoop, Jeffrey; ...
2015-05-19
Mixing models are powerful tools for identifying biogeochemical sources and determining mixing fractions in a sample. However, identification of actual source contributors is often not simple, and source compositions typically vary or even overlap, significantly increasing model uncertainty in calculated mixing fractions. This study compares three probabilistic methods: SIAR [Parnell et al., 2010], a pure Monte Carlo technique (PMC), and the Stable Isotope Reference Source (SIRS) mixing model, a new technique that estimates mixing in systems with more than three sources and/or uncertain source compositions. In this paper, we use nitrate stable isotope examples (δ15N and δ18O), but all methods tested are applicable to other tracers. In Phase I of a three-phase blind test, we compared methods for a set of six-source nitrate problems. PMC was unable to find solutions for two of the target water samples. The Bayesian method, SIAR, experienced anchoring problems, and SIRS calculated mixing fractions that most closely approximated the known mixing fractions. For that reason, SIRS was the only approach used in the next phase of testing. In Phase II, the problem was broadened so that any subset of the six sources could be a possible solution to the mixing problem. Results showed a high rate of Type I errors, where solutions included sources that were not contributing to the sample. In Phase III, some sources were eliminated based on assumed site knowledge and assumed nitrate concentrations, which substantially reduced mixing fraction uncertainties and lowered the Type I error rate. These results demonstrate that valuable insights into stable isotope mixing problems result from probabilistic mixing model approaches like SIRS. The results also emphasize the importance of identifying a minimal set of potential sources and quantifying uncertainties in source isotopic composition as well as demonstrating the value of additional information in reducing the uncertainty in calculated
Chasing Perfection: Should We Reduce Model Uncertainty in Carbon Cycle-Climate Feedbacks
NASA Astrophysics Data System (ADS)
Bonan, G. B.; Lombardozzi, D.; Wieder, W. R.; Lindsay, K. T.; Thomas, R. Q.
2015-12-01
Earth system model simulations of the terrestrial carbon (C) cycle show large multi-model spread in the carbon-concentration and carbon-climate feedback parameters. Large differences among models are also seen in their simulation of global vegetation and soil C stocks and other aspects of the C cycle, prompting concern about model uncertainty and our ability to faithfully represent fundamental aspects of the terrestrial C cycle in Earth system models. Benchmarking analyses that compare model simulations with common datasets have been proposed as a means to assess model fidelity with observations, and various model-data fusion techniques have been used to reduce model biases. While such efforts will reduce multi-model spread, they may not help reduce uncertainty (and increase confidence) in projections of the C cycle over the twenty-first century. Many ecological and biogeochemical processes represented in Earth system models are poorly understood at both the site scale and across large regions, where biotic and edaphic heterogeneity are important. Our experience with the Community Land Model (CLM) suggests that large uncertainty in the terrestrial C cycle and its feedback with climate change is an inherent property of biological systems. The challenge of representing life in Earth system models, with the rich diversity of lifeforms and complexity of biological systems, may necessitate a multitude of modeling approaches to capture the range of possible outcomes. Such models should encompass a range of plausible model structures. We distinguish between model parameter uncertainty and model structural uncertainty. Focusing on improved parameter estimates may, in fact, limit progress in assessing model structural uncertainty associated with realistically representing biological processes. Moreover, higher confidence may be achieved through better process representation, but this does not necessarily reduce uncertainty.
NASA Astrophysics Data System (ADS)
Jacquin, A. P.
2012-04-01
This study analyses the effect of precipitation spatial distribution uncertainty on the uncertainty bounds of a snowmelt runoff model's discharge estimates. Prediction uncertainty bounds are derived using the Generalized Likelihood Uncertainty Estimation (GLUE) methodology. The model analysed is a conceptual watershed model operating at a monthly time step. The model divides the catchment into five elevation zones, where the fifth zone corresponds to the catchment glaciers. Precipitation amounts at each elevation zone i are estimated as the product between observed precipitation (at a single station within the catchment) and a precipitation factor FPi. Thus, these factors provide a simplified representation of the spatial variation of precipitation, specifically the shape of the functional relationship between precipitation and height. In the absence of information about appropriate values of the precipitation factors FPi, these are estimated through standard calibration procedures. The catchment case study is Aconcagua River at Chacabuquito, located in the Andean region of Central Chile. Monte Carlo samples of the model output are obtained by randomly varying the model parameters within their feasible ranges. In the first experiment, the precipitation factors FPi are considered unknown and thus included in the sampling process. The total number of unknown parameters in this case is 16. In the second experiment, precipitation factors FPi are estimated a priori, by means of a long term water balance between observed discharge at the catchment outlet, evapotranspiration estimates and observed precipitation. In this case, the number of unknown parameters reduces to 11. The feasible ranges assigned to the precipitation factors in the first experiment are slightly wider than the range of fixed precipitation factors used in the second experiment. The mean squared error of the Box-Cox transformed discharge during the calibration period is used for the evaluation of the
Uncertainty analysis of least-cost modeling for designing wildlife linkages.
Beier, Paul; Majka, Daniel R; Newell, Shawn L
2009-12-01
Least-cost models for focal species are widely used to design wildlife corridors. To evaluate the least-cost modeling approach used to develop 15 linkage designs in southern California, USA, we assessed robustness of the largest and least constrained linkage. Species experts parameterized models for eight species with weights for four habitat factors (land cover, topographic position, elevation, road density) and resistance values for each class within a factor (e.g., each class of land cover). Each model produced a proposed corridor for that species. We examined the extent to which uncertainty in factor weights and class resistance values affected two key conservation-relevant outputs, namely, the location and modeled resistance to movement of each proposed corridor. To do so, we compared the proposed corridor to 13 alternative corridors created with parameter sets that spanned the plausible ranges of biological uncertainty in these parameters. Models for five species were highly robust (mean overlap 88%, little or no increase in resistance). Although the proposed corridors for the other three focal species overlapped as little as 0% (mean 58%) of the alternative corridors, resistance in the proposed corridors for these three species was rarely higher than resistance in the alternative corridors (mean difference was 0.025 on a scale of 1-10; worst difference was 0.39). As long as the model had the correct rank order of resistance values and factor weights, our results suggest that the predicted corridor is robust to uncertainty. The three carnivore focal species, alone or in combination, were not effective umbrellas for the other focal species. The carnivore corridors failed to overlap the predicted corridors of most other focal species and provided relatively high resistance for the other focal species (mean increase of 2.7 resistance units). Least-cost modelers should conduct uncertainty analysis so that decision-makers can appreciate the potential impact of
Quantifying parametric uncertainty in the Rothermel model
S. Goodrick
2008-01-01
The purpose of the present work is to quantify parametric uncertainty in the Rothermel wildland fire spread model (implemented in software such as fire spread models in the United States. This model consists of a non-linear system of equations that relates environmental variables (input parameter groups...
Ponzano, Stefano; Berteotti, Anna; Petracca, Rita; Vitale, Romina; Mengatto, Luisa; Bandiera, Tiziano; Cavalli, Andrea; Piomelli, Daniele; Bertozzi, Fabio; Bottegoni, Giovanni
2014-12-11
N-(2-Oxo-3-oxetanyl)carbamic acid esters have recently been reported to be noncompetitive inhibitors of the N-acylethanolamine acid amidase (NAAA) potentially useful for the treatment of pain and inflammation. In the present study, we further explored the structure-activity relationships of the carbamic acid ester side chain of 2-methyl-4-oxo-3-oxetanylcarbamic acid ester derivatives. Additional favorable features in the design of potent NAAA inhibitors have been found together with the identification of a single digit nanomolar inhibitor. In addition, we devised a 3D QSAR using the atomic property field method. The model turned out to be able to account for the structural variability and was prospectively validated by designing, synthesizing, and testing novel inhibitors. The fairly good agreement between predictions and experimental potency values points to this 3D QSAR model as the first example of quantitative structure-activity relationships in the field of NAAA inhibitors.
Evaluating the uncertainty of input quantities in measurement models
NASA Astrophysics Data System (ADS)
Possolo, Antonio; Elster, Clemens
2014-06-01
The Guide to the Expression of Uncertainty in Measurement (GUM) gives guidance about how values and uncertainties should be assigned to the input quantities that appear in measurement models. This contribution offers a concrete proposal for how that guidance may be updated in light of the advances in the evaluation and expression of measurement uncertainty that were made in the course of the twenty years that have elapsed since the publication of the GUM, and also considering situations that the GUM does not yet contemplate. Our motivation is the ongoing conversation about a new edition of the GUM. While generally we favour a Bayesian approach to uncertainty evaluation, we also recognize the value that other approaches may bring to the problems considered here, and focus on methods for uncertainty evaluation and propagation that are widely applicable, including to cases that the GUM has not yet addressed. In addition to Bayesian methods, we discuss maximum-likelihood estimation, robust statistical methods, and measurement models where values of nominal properties play the same role that input quantities play in traditional models. We illustrate these general-purpose techniques in concrete examples, employing data sets that are realistic but that also are of conveniently small sizes. The supplementary material available online lists the R computer code that we have used to produce these examples (stacks.iop.org/Met/51/3/339/mmedia). Although we strive to stay close to clause 4 of the GUM, which addresses the evaluation of uncertainty for input quantities, we depart from it as we review the classes of measurement models that we believe are generally useful in contemporary measurement science. We also considerably expand and update the treatment that the GUM gives to Type B evaluations of uncertainty: reviewing the state-of-the-art, disciplined approach to the elicitation of expert knowledge, and its encapsulation in probability distributions that are usable in
Development of an Uncertainty Model for the National Transonic Facility
NASA Technical Reports Server (NTRS)
Walter, Joel A.; Lawrence, William R.; Elder, David W.; Treece, Michael D.
2010-01-01
This paper introduces an uncertainty model being developed for the National Transonic Facility (NTF). The model uses a Monte Carlo technique to propagate standard uncertainties of measured values through the NTF data reduction equations to calculate the combined uncertainties of the key aerodynamic force and moment coefficients and freestream properties. The uncertainty propagation approach to assessing data variability is compared with ongoing data quality assessment activities at the NTF, notably check standard testing using statistical process control (SPC) techniques. It is shown that the two approaches are complementary and both are necessary tools for data quality assessment and improvement activities. The SPC approach is the final arbiter of variability in a facility. Its result encompasses variation due to people, processes, test equipment, and test article. The uncertainty propagation approach is limited mainly to the data reduction process. However, it is useful because it helps to assess the causes of variability seen in the data and consequently provides a basis for improvement. For example, it is shown that Mach number random uncertainty is dominated by static pressure variation over most of the dynamic pressure range tested. However, the random uncertainty in the drag coefficient is generally dominated by axial and normal force uncertainty with much less contribution from freestream conditions.
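A Monte Carlo propagation of the kind this abstract describes can be sketched in a few lines. The NTF data reduction equations are not given here, so the example assumes the standard isentropic relation between Mach number and the total-to-static pressure ratio for a perfect gas with gamma = 1.4; the pressure values and standard uncertainties in the usage note are made up for illustration.

```python
import math
import random
import statistics

GAMMA = 1.4  # ratio of specific heats; assumed perfect-gas value


def mach_number(p_total, p_static, gamma=GAMMA):
    """Isentropic Mach number from total and static pressure (assumed reduction equation)."""
    ratio = p_total / p_static
    return math.sqrt(2.0 / (gamma - 1.0) * (ratio ** ((gamma - 1.0) / gamma) - 1.0))


def propagate(p_total, u_total, p_static, u_static, n=20000, seed=0):
    """Draw Normal samples of both pressures and return mean and std of the Mach number."""
    rng = random.Random(seed)
    samples = [
        mach_number(rng.gauss(p_total, u_total), rng.gauss(p_static, u_static))
        for _ in range(n)
    ]
    return statistics.mean(samples), statistics.stdev(samples)
```

For example, `propagate(100.0, 0.05, 60.0, 0.05)` returns the sample mean and the combined standard uncertainty of Mach number for those hypothetical inputs. Setting one input uncertainty to zero at a time shows how much each measured quantity contributes, which is the kind of attribution the abstract reports for static pressure.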
Implications of Uncertainty in Fossil Fuel Emissions for Terrestrial Ecosystem Modeling
NASA Astrophysics Data System (ADS)
King, A. W.; Ricciuto, D. M.; Mao, J.; Andres, R. J.
2017-12-01
Given observations of the increase in atmospheric CO2, estimates of anthropogenic emissions and models of oceanic CO2 uptake, one can estimate net global CO2 exchange between the atmosphere and terrestrial ecosystems as the residual of the balanced global carbon budget. Estimates from the Global Carbon Project 2016 show that terrestrial ecosystems are a growing sink for atmospheric CO2 (averaging 2.12 Gt C y-1 for the period 1959-2015 with a growth rate of 0.03 Gt C y-1 per year) but with considerable year-to-year variability (standard deviation of 1.07 Gt C y-1). Within the uncertainty of the observations, emissions estimates and ocean modeling, this residual calculation is a robust estimate of a global terrestrial sink for CO2. A task of terrestrial ecosystem science is to explain the trend and variability in this estimate. However, "within the uncertainty" is an important caveat. The uncertainty (2σ; 95% confidence interval) in fossil fuel emissions is 8.4% (±0.8 Gt C in 2015). Combined with uncertainty in other carbon budget components, the 2σ uncertainty surrounding the global net terrestrial ecosystem CO2 exchange is ±1.6 Gt C y-1. Ignoring the uncertainty, the estimate of a general terrestrial sink includes 2 years (1987 and 1998) in which terrestrial ecosystems are a small source of CO2 to the atmosphere. However, with 2σ uncertainty, terrestrial ecosystems may have been a source in as many as 18 years. We examine how well global terrestrial biosphere models simulate the trend and interannual variability of the global-budget estimate of the terrestrial sink within the context of this uncertainty (e.g., which models fall outside the 2σ uncertainty and in what years). Models are generally capable of reproducing the trend in net terrestrial exchange, but are less able to capture interannual variability and often fall outside the 2σ uncertainty. The trend in the residual carbon budget estimate is primarily associated with the increase in atmospheric CO2
Wei Wu; James Clark; James Vose
2010-01-01
Hierarchical Bayesian (HB) modeling allows for multiple sources of uncertainty by factoring complex relationships into conditional distributions that can be used to draw inference and make predictions. We applied an HB model to estimate the parameters and state variables of a parsimonious hydrological model, GR4J, by coherently assimilating the uncertainties from the...
Quantum chemical parameters in QSAR: what do I use when?
Hickey, James P.; Ostrander, Gary K.
1996-01-01
This chapter provides a brief overview of the numerous quantum chemical parameters that have been or are currently being used in quantitative structure-activity relationships (QSAR), along with a representative bibliography. The parameters will be grouped according to their mechanistic interpretations, and representative biological and physical chemical applications will be mentioned. Parameter computation methods and the appropriate software are highlighted, as are sources for software.
Fish acute toxicity syndromes and their use in the QSAR approach to hazard assessment.
McKim, J M; Bradbury, S P; Niemi, G J
1987-01-01
Implementation of the Toxic Substances Control Act of 1977 creates the need to reliably establish testing priorities because laboratory resources are limited and the number of industrial chemicals requiring evaluation is overwhelming. The use of quantitative structure activity relationship (QSAR) models as rapid and predictive screening tools to select more potentially hazardous chemicals for in-depth laboratory evaluation has been proposed. Further implementation and refinement of quantitative structure-toxicity relationships in aquatic toxicology and hazard assessment requires the development of a "mode-of-action" database. With such a database, a qualitative structure-activity relationship can be formulated to assign the proper mode of action, and respective QSAR, to a given chemical structure. In this review, the development of fish acute toxicity syndromes (FATS), which are toxic-response sets based on various behavioral and physiological-biochemical measurements, and their projected use in the mode-of-action database are outlined. Using behavioral parameters monitored in the fathead minnow during acute toxicity testing, FATS associated with acetylcholinesterase (AChE) inhibitors and narcotics could be reliably predicted. However, compounds classified as oxidative phosphorylation uncouplers or stimulants could not be resolved. Refinement of this approach by using respiratory-cardiovascular responses in the rainbow trout, enabled FATS associated with AChE inhibitors, convulsants, narcotics, respiratory blockers, respiratory membrane irritants, and uncouplers to be correctly predicted. PMID:3297660
Di Tullio, Maurizio; Maccallini, Cristina; Ammazzalorso, Alessandra; Giampietro, Letizia; Amoroso, Rosa; De Filippis, Barbara; Fantacuzzi, Marialuigia; Wiczling, Paweł; Kaliszan, Roman
2012-07-01
A series of 27 analogues of clofibric acid, mostly heteroarylalkanoic derivatives, have been analyzed by a novel high-throughput reversed-phase HPLC method employing a combined gradient of the eluent's pH and organic modifier content. The hydrophobicity (lipophilicity) parameters, log kw, and acidity constants, pKa, determined in this way were subjected to multiple regression analysis to obtain a QSRR (Quantitative Structure-Retention Relationships) and a QSPR (Quantitative Structure-Property Relationships) equation, respectively, describing these pharmacokinetics-determining physicochemical parameters in terms of structural descriptors derived from computational chemistry. The previously determined in vitro log EC50 values, which measure transactivation activity towards PPARα (human Peroxisome Proliferator-Activated Receptor α), have also been described in a QSAR (Quantitative Structure-Activity Relationships) equation in terms of the 3D-MoRSE descriptors (3D-Molecule Representation of Structures based on Electron diffraction descriptors). The derived QSAR model can serve for an a priori prediction of the in vitro bioactivity of any designed analogue, whereas the QSRR and the QSPR models can be used to evaluate the lipophilicity and acidity, respectively, of the compounds, and hence to rationally guide the selection of structures with proper pharmacokinetics. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A web-application for visualizing uncertainty in numerical ensemble models
NASA Astrophysics Data System (ADS)
Alberti, Koko; Hiemstra, Paul; de Jong, Kor; Karssenberg, Derek
2013-04-01
Numerical ensemble models are used in the analysis and forecasting of a wide range of environmental processes. Common use cases include assessing the consequences of nuclear accidents, pollution releases into the ocean or atmosphere, forest fires, volcanic eruptions, or identifying areas at risk from such hazards. In addition to the increased use of scenario analyses and model forecasts, the availability of supplementary data describing errors and model uncertainties is increasingly commonplace. Unfortunately most current visualization routines are not capable of properly representing uncertain information. As a result, uncertainty information is not provided at all, not readily accessible, or it is not communicated effectively to model users such as domain experts, decision makers, policy makers, or even novice users. In an attempt to address these issues a lightweight and interactive web-application has been developed. It makes clear and concise uncertainty visualizations available in a web-based mapping and visualization environment, incorporating aggregation (upscaling) techniques to adjust uncertainty information to the zooming level. The application has been built on a web mapping stack of open source software, and can quantify and visualize uncertainties in numerical ensemble models in such a way that both expert and novice users can investigate uncertainties present in a simple ensemble dataset. As a test case, a dataset was used which forecasts the spread of an airborne tracer across Western Europe. Extrinsic uncertainty representations are used in which dynamic circular glyphs are overlaid on model attribute maps to convey various uncertainty concepts. It supports both basic uncertainty metrics such as standard deviation, standard error, width of the 95% confidence interval and interquartile range, as well as more experimental ones aimed at novice users. Ranges of attribute values can be specified, and the circular glyphs dynamically change size to
Climate data induced uncertainty in model based estimations of terrestrial primary productivity
NASA Astrophysics Data System (ADS)
Wu, Z.; Ahlström, A.; Smith, B.; Ardö, J.; Eklundh, L.; Fensholt, R.; Lehsten, V.
2016-12-01
Models used to project global vegetation and carbon cycle differ in their estimates of historical fluxes and pools. These differences arise not only from differences between models but also from differences in the environmental and climatic data that force the models. Here we investigate the role of uncertainties in historical climate data, encapsulated by a set of six historical climate datasets. We focus on terrestrial gross primary productivity (GPP) and analyze the results from a dynamic process-based vegetation model (LPJ-GUESS) forced by six different climate datasets and two empirical datasets of GPP (derived from flux towers and remote sensing). We find that the climate induced uncertainty, defined as the difference among historical simulations in GPP when forcing the model with the different climate datasets, can be as high as 33 Pg C yr-1 globally (19% of mean GPP). The uncertainty is partitioned into the three main climatic drivers: temperature, precipitation, and shortwave radiation. Additionally, we illustrate how the uncertainty due to a given climate driver depends both on the magnitude of the forcing data uncertainty (the data range) and the sensitivity of the modeled GPP to the driver (the ecosystem sensitivity). The analysis is performed globally and stratified into five land cover classes. We find that the dynamic vegetation model overestimates GPP compared to empirically based GPP data over most areas, except for the tropical region. Both the simulations and empirical estimates agree that the tropical region is a disproportionate source of uncertainty in GPP estimation. This is mainly caused by uncertainties in shortwave radiation forcing, of which climate data range contributes slightly higher uncertainty than ecosystem sensitivity to shortwave radiation. We also find that precipitation dominated the climate induced uncertainty over nearly half of terrestrial vegetated surfaces, which is mainly due to large ecosystem sensitivity to
Uncertainty visualisation in the Model Web
NASA Astrophysics Data System (ADS)
Gerharz, L. E.; Autermann, C.; Hopmann, H.; Stasch, C.; Pebesma, E.
2012-04-01
Visualisation of geospatial data as maps is a common way to communicate spatially distributed information. If temporal and, furthermore, uncertainty information is included in the data, efficient visualisation methods are required. For uncertain spatial and spatio-temporal data, numerous visualisation methods have been developed and proposed, but only a few tools exist for visualising such data in a standardised way. Furthermore, they are usually realised as thick clients and lack the functionality to handle data coming from web services, as is envisaged in the Model Web. We present an interactive web tool for visualisation of uncertain spatio-temporal data developed in the UncertWeb project. The client is based on the OpenLayers JavaScript library. OpenLayers provides standard map windows and navigation tools, i.e. pan, zoom in/out, to allow interactive control for the user. Further interactive methods are implemented using jStat, a JavaScript library for statistics plots developed in UncertWeb, and flot. To integrate the uncertainty information into existing standards for geospatial data, the Uncertainty Markup Language (UncertML) was applied in combination with OGC Observations&Measurements 2.0 and JavaScript Object Notation (JSON) encodings for vector and NetCDF for raster data. The client offers methods to visualise uncertain vector and raster data with temporal information. The uncertainty information considered for the tool comprises probabilistic and quantified attribute uncertainties, which can be provided as realisations or samples, full probability distribution functions and statistics. Visualisation is supported for uncertain continuous and categorical data. In the client, the visualisation is realised using a combination of different methods. Based on previously conducted usability studies, a differentiation between expert (in statistics or mapping) and non-expert users has been indicated as useful. Therefore, two different modes are realised together in the tool
Liu, Genyan; Wang, Wenjie; Wan, Youlan; Ju, Xiulian; Gu, Shuangxi
2018-05-11
Diarylpyrimidines (DAPYs), acting as HIV-1 nonnucleoside reverse transcriptase inhibitors (NNRTIs), have been considered to be one of the most potent drug families in the fight against acquired immunodeficiency syndrome (AIDS). To better understand the structural requirements of HIV-1 NNRTIs, three-dimensional quantitative structure-activity relationship (3D-QSAR), pharmacophore, and molecular docking studies were performed on 52 DAPY analogues that were synthesized in our previous studies. The internal and external validation parameters indicated that the generated 3D-QSAR models, including comparative molecular field analysis (CoMFA, q² = 0.679, R² = 0.983, and r²_pred = 0.884) and comparative molecular similarity indices analysis (CoMSIA, q² = 0.734, R² = 0.985, and r²_pred = 0.891), exhibited good predictive abilities and significant statistical reliability. The docking results demonstrated that the phenyl ring at the C4-position of the pyrimidine ring was better than the cycloalkanes for the activity, as the phenyl group was able to participate in π-π stacking interactions with the aromatic residues of the binding site, whereas the cycloalkanes were not. The pharmacophore model and 3D-QSAR contour maps provided significant insights into the key structural features of DAPYs that were responsible for the activity. On the basis of the obtained information, a series of novel DAPY analogues of HIV-1 NNRTIs with potentially higher predicted activity was designed. This work might provide useful information for guiding the rational design of potential HIV-1 NNRTI DAPYs.
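For readers unfamiliar with the three validation statistics quoted in this abstract, the block below gives the definitions most commonly used in 3D-QSAR work. It is a sketch under assumptions: the abstract does not say which cross-validation scheme was used, so leave-one-out is assumed here, and the symbols (y_i for observed activity, ŷ for predictions, ȳ for means) are introduced only for illustration.

```latex
% Assumed (conventional) definitions; the abstract itself does not state them.
% \hat{y}_{(i)} : prediction for compound i from a model fitted without compound i
% \bar{y}_{\mathrm{train}} : mean observed activity of the training set
\begin{align}
  q^2 &= 1 - \frac{\sum_{i \in \mathrm{train}} \bigl(y_i - \hat{y}_{(i)}\bigr)^2}
                  {\sum_{i \in \mathrm{train}} \bigl(y_i - \bar{y}_{\mathrm{train}}\bigr)^2}
  && \text{(internal, cross-validated)} \\
  R^2 &= 1 - \frac{\sum_{i \in \mathrm{train}} \bigl(y_i - \hat{y}_i\bigr)^2}
                  {\sum_{i \in \mathrm{train}} \bigl(y_i - \bar{y}_{\mathrm{train}}\bigr)^2}
  && \text{(fit to the training set)} \\
  r^2_{\mathrm{pred}} &= 1 - \frac{\sum_{j \in \mathrm{test}} \bigl(y_j - \hat{y}_j\bigr)^2}
                  {\sum_{j \in \mathrm{test}} \bigl(y_j - \bar{y}_{\mathrm{train}}\bigr)^2}
  && \text{(external test set)}
\end{align}
```

Under these definitions, R² close to 1 alongside a lower q² (as reported for both CoMFA and CoMSIA) is the usual pattern: the fitted model reproduces its training data more closely than it predicts held-out compounds.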
Niches, models, and climate change: Assessing the assumptions and uncertainties
Wiens, John A.; Stralberg, Diana; Jongsomjit, Dennis; Howell, Christine A.; Snyder, Mark A.
2009-01-01
As the rate and magnitude of climate change accelerate, understanding the consequences becomes increasingly important. Species distribution models (SDMs) based on current ecological niche constraints are used to project future species distributions. These models contain assumptions that add to the uncertainty in model projections stemming from the structure of the models, the algorithms used to translate niche associations into distributional probabilities, the quality and quantity of data, and mismatches between the scales of modeling and data. We illustrate the application of SDMs using two climate models and two distributional algorithms, together with information on distributional shifts in vegetation types, to project fine-scale future distributions of 60 California landbird species. Most species are projected to decrease in distribution by 2070. Changes in total species richness vary over the state, with large losses of species in some “hotspots” of vulnerability. Differences in distributional shifts among species will change species co-occurrences, creating spatial variation in similarities between current and future assemblages. We use these analyses to consider how assumptions can be addressed and uncertainties reduced. SDMs can provide a useful way to incorporate future conditions into conservation and management practices and decisions, but the uncertainties of model projections must be balanced with the risks of taking the wrong actions or the costs of inaction. Doing this will require that the sources and magnitudes of uncertainty are documented, and that conservationists and resource managers be willing to act despite the uncertainties. The alternative, of ignoring the future, is not an option. PMID:19822750
Accounting for uncertainty in health economic decision models by using model averaging.
Jackson, Christopher H; Thompson, Simon G; Sharples, Linda D
2009-04-01
Health economic decision models are subject to considerable uncertainty, much of which arises from choices between several plausible model structures, e.g. choices of covariates in a regression model. Such structural uncertainty is rarely accounted for formally in decision models but can be addressed by model averaging. We discuss the most common methods of averaging models and the principles underlying them. We apply them to a comparison of two surgical techniques for repairing abdominal aortic aneurysms. In model averaging, competing models are usually either weighted by using an asymptotically consistent model assessment criterion, such as the Bayesian information criterion, or a measure of predictive ability, such as Akaike's information criterion. We argue that the predictive approach is more suitable when modelling the complex underlying processes of interest in health economics, such as individual disease progression and response to treatment.
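The weighting step described in this abstract can be illustrated with a minimal sketch. It assumes the standard information-criterion weights, in which each model's weight is proportional to exp(-Δ/2), where Δ is that model's AIC (or BIC) minus the smallest value among the candidates. The function names and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
import math


def information_criterion_weights(ic_values):
    """Turn AIC or BIC values (lower is better) into normalised model weights.

    Hypothetical helper: weight_k is proportional to exp(-0.5 * (IC_k - min IC)).
    """
    best = min(ic_values)
    # Subtracting the best value first keeps the exponentials numerically stable.
    raw = [math.exp(-0.5 * (ic - best)) for ic in ic_values]
    total = sum(raw)
    return [r / total for r in raw]


def model_averaged_estimate(estimates, ic_values):
    """Weighted average of per-model estimates, using the weights above."""
    weights = information_criterion_weights(ic_values)
    return sum(w * e for w, e in zip(weights, estimates))
```

As a usage illustration with made-up values, two models with AIC 100 and 102 receive weights of roughly 0.73 and 0.27, so `model_averaged_estimate([10.0, 20.0], [100.0, 102.0])` lies closer to the first model's estimate. Passing BIC values instead of AIC values gives the consistency-based weighting the abstract contrasts with the predictive one.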
Accounting for uncertainty in health economic decision models by using model averaging
Jackson, Christopher H; Thompson, Simon G; Sharples, Linda D
2009-01-01
Health economic decision models are subject to considerable uncertainty, much of which arises from choices between several plausible model structures, e.g. choices of covariates in a regression model. Such structural uncertainty is rarely accounted for formally in decision models but can be addressed by model averaging. We discuss the most common methods of averaging models and the principles underlying them. We apply them to a comparison of two surgical techniques for repairing abdominal aortic aneurysms. In model averaging, competing models are usually either weighted by using an asymptotically consistent model assessment criterion, such as the Bayesian information criterion, or a measure of predictive ability, such as Akaike's information criterion. We argue that the predictive approach is more suitable when modelling the complex underlying processes of interest in health economics, such as individual disease progression and response to treatment. PMID:19381329
Sensitivity of Earthquake Loss Estimates to Source Modeling Assumptions and Uncertainty
Reasenberg, Paul A.; Shostak, Nan; Terwilliger, Sharon
2006-01-01
Introduction: This report explores how uncertainty in an earthquake source model may affect estimates of earthquake economic loss. Specifically, it focuses on the earthquake source model for the San Francisco Bay region (SFBR) created by the Working Group on California Earthquake Probabilities. The loss calculations are made using HAZUS-MH, a publicly available computer program developed by the Federal Emergency Management Agency (FEMA) for calculating future losses from earthquakes, floods and hurricanes within the United States. The database built into HAZUS-MH includes a detailed building inventory, population data, data on transportation corridors, bridges, utility lifelines, etc. Earthquake hazard in the loss calculations is based upon expected (median value) ground motion maps called ShakeMaps calculated for the scenario earthquake sources defined in WGCEP. The study considers the effect of relaxing certain assumptions in the WG02 model, and explores the effect of hypothetical reductions in epistemic uncertainty in parts of the model. For example, it addresses questions such as what would happen to the calculated loss distribution if the uncertainty in slip rate in the WG02 model were reduced (say, by obtaining additional geologic data)? What would happen if the geometry or amount of aseismic slip (creep) on the region's faults were better known? And what would be the effect on the calculated loss distribution if the time-dependent earthquake probability were better constrained, either by eliminating certain probability models or by better constraining the inherent randomness in earthquake recurrence? The study does not consider the effect of reducing uncertainty in the hazard introduced through models of attenuation and local site characteristics, although these may have a comparable or greater effect than does source-related uncertainty. Nor does it consider sources of uncertainty in the building inventory, building fragility curves, and other assumptions
Eigenspace perturbations for structural uncertainty estimation of turbulence closure models
NASA Astrophysics Data System (ADS)
Jofre, Lluis; Mishra, Aashwin; Iaccarino, Gianluca
2017-11-01
With the present state of computational resources, a purely numerical resolution of turbulent flows encountered in engineering applications is not viable. Consequently, investigations into turbulence rely on various degrees of modeling. Archetypal amongst these variable-resolution approaches would be RANS models in two-equation closures, and subgrid-scale models in LES. However, owing to the simplifications introduced during model formulation, the fidelity of all such models is limited, and therefore the explicit quantification of the predictive uncertainty is essential. In such a scenario, the ideal uncertainty estimation procedure must be agnostic to modeling resolution, methodology, and the nature or level of the model filter. The procedure should be able to give reliable prediction intervals for different Quantities of Interest, over varied flows and flow conditions, and at diametric levels of modeling resolution. In this talk, we present and substantiate the Eigenspace perturbation framework as an uncertainty estimation paradigm that meets these criteria. Commencing from a broad overview, we outline the details of this framework at different modeling resolutions. Thence, using benchmark flows, along with engineering problems, the efficacy of this procedure is established. This research was partially supported by NNSA under the Predictive Science Academic Alliance Program (PSAAP) II, and by DARPA under the Enabling Quantification of Uncertainty in Physical Systems (EQUiPS) project (technical monitor: Dr Fariba Fahroo).
Decay heat uncertainty for BWR used fuel due to modeling and nuclear data uncertainties
Ilas, Germina; Liljenfeldt, Henrik
2017-05-19
Characterization of the energy released from radionuclide decay in nuclear fuel discharged from reactors is essential for the design, safety, and licensing analyses of used nuclear fuel storage, transportation, and repository systems. There are a limited number of decay heat measurements available for commercial used fuel applications. Because decay heat measurements can be expensive or impractical for covering the multitude of existing fuel designs, operating conditions, and specific application purposes, decay heat estimation relies heavily on computer code prediction. Uncertainty evaluation for calculated decay heat is an important aspect when assessing code prediction and a key factor supporting decision making for used fuel applications. While previous studies have largely focused on uncertainties in code predictions due to nuclear data uncertainties, this study discusses uncertainties in calculated decay heat due to uncertainties in assembly modeling parameters as well as in nuclear data. Capabilities in the SCALE nuclear analysis code system were used to quantify the effect on calculated decay heat of uncertainties in nuclear data and selected manufacturing and operation parameters for a typical boiling water reactor (BWR) fuel assembly. Furthermore, the BWR fuel assembly used as the reference case for this study was selected from a set of assemblies for which high-quality decay heat measurements are available, to assess the significance of the results through comparison with calculated and measured decay heat data.
Decay heat uncertainty for BWR used fuel due to modeling and nuclear data uncertainties
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ilas, Germina; Liljenfeldt, Henrik
Characterization of the energy released from radionuclide decay in nuclear fuel discharged from reactors is essential for the design, safety, and licensing analyses of used nuclear fuel storage, transportation, and repository systems. There are a limited number of decay heat measurements available for commercial used fuel applications. Because decay heat measurements can be expensive or impractical for covering the multitude of existing fuel designs, operating conditions, and specific application purposes, decay heat estimation relies heavily on computer code prediction. Uncertainty evaluation for calculated decay heat is an important aspect when assessing code prediction and a key factor supporting decision making for used fuel applications. While previous studies have largely focused on uncertainties in code predictions due to nuclear data uncertainties, this study discusses uncertainties in calculated decay heat due to uncertainties in assembly modeling parameters as well as in nuclear data. Capabilities in the SCALE nuclear analysis code system were used to quantify the effect on calculated decay heat of uncertainties in nuclear data and selected manufacturing and operation parameters for a typical boiling water reactor (BWR) fuel assembly. Furthermore, the BWR fuel assembly used as the reference case for this study was selected from a set of assemblies for which high-quality decay heat measurements are available, to assess the significance of the results through comparison with calculated and measured decay heat data.
Tainio, Marko; Tuomisto, Jouni T; Hänninen, Otto; Ruuskanen, Juhani; Jantunen, Matti J; Pekkanen, Juha
2007-01-01
Background: The estimation of health impacts often involves uncertain input variables and assumptions which have to be incorporated into the model structure. These uncertainties may have significant effects on the results obtained with the model, and, thus, on decision making. Fine particles (PM2.5) are believed to cause major health impacts, and, consequently, uncertainties in their health impact assessment have clear relevance to policy-making. We studied the effects of various uncertain input variables by building a life-table model for fine particles. Methods: Life-expectancy of the Helsinki metropolitan area population and the change in life-expectancy due to fine particle exposures were predicted using a life-table model. A number of parameter and model uncertainties were estimated. Sensitivity analysis for input variables was performed by calculating rank-order correlations between input and output variables. The studied model uncertainties were (i) plausibility of mortality outcomes and (ii) lag, and parameter uncertainties (iii) exposure-response coefficients for different mortality outcomes, and (iv) exposure estimates for different age groups. The monetary value of the years-of-life-lost and the relative importance of the uncertainties related to monetary valuation were predicted to compare the relative importance of the monetary valuation on the health effect uncertainties. Results: The magnitude of the health effects costs depended mostly on discount rate, exposure-response coefficient, and plausibility of the cardiopulmonary mortality. Other mortality outcomes (lung cancer, other non-accidental and infant mortality) and lag had only minor impact on the output. The results highlight the importance of the uncertainties associated with cardiopulmonary mortality in the fine particle impact assessment when compared with other uncertainties. Conclusion: When estimating life-expectancy, the estimates used for cardiopulmonary exposure-response coefficient, discount
Tainio, Marko; Tuomisto, Jouni T; Hänninen, Otto; Ruuskanen, Juhani; Jantunen, Matti J; Pekkanen, Juha
2007-08-23
The estimation of health impacts often involves uncertain input variables and assumptions which have to be incorporated into the model structure. These uncertainties may have significant effects on the results obtained with the model, and, thus, on decision making. Fine particles (PM2.5) are believed to cause major health impacts, and, consequently, uncertainties in their health impact assessment have clear relevance to policy-making. We studied the effects of various uncertain input variables by building a life-table model for fine particles. Life-expectancy of the Helsinki metropolitan area population and the change in life-expectancy due to fine particle exposures were predicted using a life-table model. A number of parameter and model uncertainties were estimated. Sensitivity analysis for input variables was performed by calculating rank-order correlations between input and output variables. The studied model uncertainties were (i) plausibility of mortality outcomes and (ii) lag, and parameter uncertainties (iii) exposure-response coefficients for different mortality outcomes, and (iv) exposure estimates for different age groups. The monetary value of the years-of-life-lost and the relative importance of the uncertainties related to monetary valuation were predicted to compare the relative importance of the monetary valuation on the health effect uncertainties. The magnitude of the health effects costs depended mostly on discount rate, exposure-response coefficient, and plausibility of the cardiopulmonary mortality. Other mortality outcomes (lung cancer, other non-accidental and infant mortality) and lag had only minor impact on the output. The results highlight the importance of the uncertainties associated with cardiopulmonary mortality in the fine particle impact assessment when compared with other uncertainties. When estimating life-expectancy, the estimates used for cardiopulmonary exposure-response coefficient, discount rate, and plausibility require careful
NASA Technical Reports Server (NTRS)
Maggioni, V.; Anagnostou, E. N.; Reichle, R. H.
2013-01-01
The contribution of rainfall forcing errors relative to model (structural and parameter) uncertainty in the prediction of soil moisture is investigated by integrating the NASA Catchment Land Surface Model (CLSM), forced with hydro-meteorological data, in the Oklahoma region. Rainfall-forcing uncertainty is introduced using a stochastic error model that generates ensemble rainfall fields from satellite rainfall products. The ensemble satellite rain fields are propagated through CLSM to produce soil moisture ensembles. Errors in CLSM are modeled with two different approaches: either by perturbing model parameters (representing model parameter uncertainty) or by adding randomly generated noise (representing model structure and parameter uncertainty) to the model prognostic variables. Our findings highlight that the method currently used in the NASA GEOS-5 Land Data Assimilation System to perturb CLSM variables poorly describes the uncertainty in the predicted soil moisture, even when combined with rainfall model perturbations. On the other hand, by adding model parameter perturbations to rainfall forcing perturbations, a better characterization of uncertainty in soil moisture simulations is observed. Specifically, an analysis of the rank histograms shows that the most consistent ensemble of soil moisture is obtained by combining rainfall and model parameter perturbations. When rainfall forcing and model prognostic perturbations are added, the rank histogram shows a U-shape at the domain average scale, which corresponds to a lack of variability in the forecast ensemble. The more accurate estimation of the soil moisture prediction uncertainty obtained by combining rainfall and parameter perturbations is encouraging for the application of this approach in ensemble data assimilation systems.
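The rank-histogram diagnostic used in this abstract can be sketched in a few lines. This is a generic illustration, not the authors' code: the function name and input layout (one observation paired with one list of ensemble members per case) are assumptions, and ties between an observation and a member are simply counted as "not below".

```python
def rank_histogram(observations, ensembles):
    """Count, for each case, how many ensemble members fall below the observation.

    Hypothetical helper. `ensembles[i]` is the list of member values for case i,
    and every case is assumed to have the same number of members. With n members
    there are n + 1 possible ranks (bins 0 .. n).
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for obs, members in zip(observations, ensembles):
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts
```

A roughly flat histogram suggests the ensemble spread is consistent with the observations. When the observation often falls below or above every member, the outer bins fill up and the histogram takes the U-shape the abstract associates with too little variability in the ensemble.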
Leaf area index uncertainty estimates for model-data fusion applications
Andrew D. Richardson; D. Bryan Dail; D.Y. Hollinger
2011-01-01
Estimates of data uncertainties are required to integrate different observational data streams as model constraints using model-data fusion. We describe an approach with which random and systematic uncertainties in optical measurements of leaf area index [LAI] can be quantified. We use data from a measurement campaign at the spruce-dominated Howland Forest AmeriFlux...
Alderman, Phillip D.; Stanfill, Bryan
2016-10-06
Recent international efforts have brought renewed emphasis on the comparison of different agricultural systems models. Thus far, analysis of model-ensemble simulated results has not clearly differentiated between ensemble prediction uncertainties due to model structural differences per se and those due to parameter value uncertainties. Additionally, despite increasing use of Bayesian parameter estimation approaches with field-scale crop models, inadequate attention has been given to the full posterior distributions for estimated parameters. The objectives of this study were to quantify the impact of parameter value uncertainty on prediction uncertainty for modeling spring wheat phenology using Bayesian analysis and to assess the relative contributions of model-structure-driven and parameter-value-driven uncertainty to overall prediction uncertainty. This study used a random walk Metropolis algorithm to estimate parameters for 30 spring wheat genotypes using nine phenology models based on multi-location trial data for days to heading and days to maturity. Across all cases, parameter-driven uncertainty accounted for between 19 and 52% of predictive uncertainty, while model-structure-driven uncertainty accounted for between 12 and 64%. Here, this study demonstrated the importance of quantifying both model-structure- and parameter-value-driven uncertainty when assessing overall prediction uncertainty in modeling spring wheat phenology. More generally, Bayesian parameter estimation provided a useful framework for quantifying and analyzing sources of prediction uncertainty.
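The random walk Metropolis algorithm named in this abstract can be illustrated with a minimal one-parameter sketch. It is not the study's implementation: the function name, the Gaussian proposal, the fixed step size and the toy log-posterior in the usage note are all assumptions chosen to keep the example short.

```python
import math
import random


def random_walk_metropolis(log_posterior, start, step_sd, n_samples, seed=0):
    """Draw samples from a one-dimensional posterior (hypothetical helper).

    log_posterior: function returning the log of the (unnormalised) posterior density
    start:         initial parameter value
    step_sd:       standard deviation of the Gaussian random-walk proposal
    """
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current)
    samples = []
    for _ in range(n_samples):
        # Symmetric proposal centred on the current value.
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio), compared on the log scale.
        # 1.0 - rng.random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples
```

As a usage illustration, `random_walk_metropolis(lambda x: -0.5 * (x - 3.0) ** 2, start=0.0, step_sd=1.0, n_samples=20000)` targets a normal distribution with mean 3 and unit variance. After discarding an initial burn-in, the retained samples approximate the full posterior distribution, which is the quantity the abstract argues deserves more attention than point estimates alone.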
Gu, Wenwen; Chen, Ying; Li, Yu
2017-08-01
Based on the experimental subcooled liquid vapor pressures (P_L) of 17 polychlorinated naphthalene (PCN) congeners, a three-dimensional quantitative structure-activity relationship (3D-QSAR) model of the comparative molecular similarity indices analysis (CoMSIA) type was constructed with Sybyl software. A full factorial experimental design was used to obtain the final regulation scheme for PCN, and PCN-2 was then modified to significantly lower its P_L. The contour maps of the CoMSIA model showed that the migration ability of PCN decreases when the Cl atoms at the 2-, 3-, 4-, 5-, 6-, 7- and 8-positions of PCNs are replaced by electropositive groups. After modification of PCN-2, 12 new modified PCN-2 compounds were obtained with ln P_L values two orders of magnitude lower than that of PCN-2. In addition, there are significant differences between the calculated total energies and energy gaps of the new modified compounds and those of PCN-2.