Liang, Chao; Han, Shu-ying; Qiao, Jun-qin; Lian, Hong-zhen; Ge, Xin
2014-11-01
A strategy to utilize neutral model compounds for lipophilicity measurement of ionizable basic compounds by reversed-phase high-performance liquid chromatography is proposed in this paper. The applicability of the novel protocol was justified by theoretical derivation. Meanwhile, the linear relationships between the logarithm of apparent n-octanol/water partition coefficients (logKow'') and the logarithm of retention factors corresponding to the 100% aqueous fraction of mobile phase (logkw) were established for a basic training set, a neutral training set, and a mixed training set of these two. As predicted by the theory, the good linearity and external validation results indicated that the logKow''-logkw relationships obtained from a neutral model training set were always reliable regardless of mobile phase pH. Afterwards, the above relationships were adopted to determine the logKow of harmaline, a weakly dissociable alkaloid. As far as we know, this is the first report on experimental logKow data for harmaline (logKow = 2.28 ± 0.08). Introducing neutral compounds into a basic model training set or using neutral model compounds alone is recommended for measuring the lipophilicity of weakly ionizable basic compounds, especially those with high hydrophobicity, because it offers more suitable model compound choices and convenient mobile phase pH control. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
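The calibration idea in this abstract (fit a straight line between logkw and logKow on model compounds, then read off logKow for a new compound) can be sketched as follows. This is a minimal illustration only: the function names are hypothetical and every numeric value is an invented placeholder, not data from the paper.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict_log_kow(log_kw, slope, intercept):
    """Estimate logKow of a new compound from its measured logkw."""
    return slope * log_kw + intercept


# Placeholder training set of model compounds (invented values, NOT from the paper).
train_log_kw = [1.0, 2.0, 3.0, 4.0]
train_log_kow = [1.5, 2.5, 3.5, 4.5]

slope, intercept = fit_line(train_log_kw, train_log_kow)
estimate = predict_log_kow(2.5, slope, intercept)
```

In practice the fitted line would be checked against external validation compounds before being applied to an analyte, as the abstract describes.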
Ribay, Kathryn; Kim, Marlene T; Wang, Wenyi; Pinolini, Daniel; Zhu, Hao
2016-03-01
Estrogen receptors (ERα) are a critical target for drug design as well as a potential source of toxicity when activated unintentionally. Thus, evaluating potential ERα binding agents is critical in both drug discovery and chemical toxicity areas. Using computational tools, e.g., Quantitative Structure-Activity Relationship (QSAR) models, can predict potential ERα binding agents before chemical synthesis. The purpose of this project was to develop enhanced predictive models of ERα binding agents by utilizing advanced cheminformatics tools that can integrate publicly available bioassay data. The initial ERα binding agent data set, consisting of 446 binders and 8307 non-binders, was obtained from the Tox21 Challenge project organized by the NIH Chemical Genomics Center (NCGC). After removing the duplicates and inorganic compounds, this data set was used to create a training set (259 binders and 259 non-binders). This training set was used to develop QSAR models using chemical descriptors. The resulting models were then used to predict the binding activity of 264 external compounds, which were available to us after the models were developed. The cross-validation results of training set [Correct Classification Rate (CCR) = 0.72] were much higher than the external predictivity of the unknown compounds (CCR = 0.59). To improve the conventional QSAR models, all compounds in the training set were used to search PubChem and generate a profile of their biological responses across thousands of bioassays. The most important bioassays were prioritized to generate a similarity index that was used to calculate the biosimilarity score between each two compounds. The nearest neighbors for each compound within the set were then identified and its ERα binding potential was predicted by its nearest neighbors in the training set. 
The hybrid model performance (CCR = 0.94 for cross validation; CCR = 0.68 for external prediction) showed significant improvement over the original QSAR models, particularly for the activity cliffs that induce prediction errors. The results of this study indicate that the response profile of chemicals from public data provides useful information for modeling and evaluation purposes. The public big data resources should be considered along with chemical structure information when predicting new compounds, such as unknown ERα binding agents.
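The Correct Classification Rate quoted above can be sketched as below, assuming the common definition of CCR as the mean of sensitivity and specificity (the abstract does not spell the formula out). The function name and the example labels are hypothetical.

```python
def correct_classification_rate(y_true, y_pred):
    """CCR = 0.5 * (sensitivity + specificity) for binary labels (1 = binder, 0 = non-binder).

    Assumption: this is the usual balanced-accuracy definition; the abstract
    itself does not give the formula.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    sensitivity = tp / n_pos
    specificity = tn / n_neg
    return 0.5 * (sensitivity + specificity)


# Invented example labels: 3 of 4 binders and 2 of 4 non-binders are predicted correctly.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
ccr = correct_classification_rate(y_true, y_pred)
```

Because it averages the two class-wise rates, this measure is not inflated by a majority class, which matters for data such as the initial set of 446 binders and 8307 non-binders.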
Kohonen and counterpropagation neural networks applied for mapping and interpretation of IR spectra.
Novic, Marjana
2008-01-01
The principles of learning strategy of Kohonen and counterpropagation neural networks are introduced. The advantages of unsupervised learning are discussed. The self-organizing maps produced in both methods are suitable for a wide range of applications. Here, we present an example of Kohonen and counterpropagation neural networks used for mapping, interpretation, and simulation of infrared (IR) spectra. The artificial neural network models were trained for prediction of structural fragments of an unknown compound from its infrared spectrum. The training set contained over 3,200 IR spectra of diverse compounds of known chemical structure. The structure-spectra relationship was encompassed by the counterpropagation neural network, which assigned structural fragments to individual compounds within certain probability limits, assessed from the predictions of test compounds. The counterpropagation neural network model for prediction of fragments of chemical structure is reversible, which means that, for a given structural domain, limited to the training data set in the study, it can be used to simulate the IR spectrum of a chemical defined with a set of structural fragments.
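For orientation, the learning step of a Kohonen network is usually written in the following textbook form. The notation is an assumption and may differ from that used in the chapter: x is the input spectrum, w_j the weight vector of neuron j, c the winning neuron, η(t) a decreasing learning rate, and h_{c,j}(t) a neighborhood function centered on the winner.

```latex
% Textbook form of the Kohonen update; symbols are assumptions, not the chapter's notation.
c = \arg\min_{j} \lVert \mathbf{x} - \mathbf{w}_j(t) \rVert,
\qquad
\mathbf{w}_j(t+1) = \mathbf{w}_j(t) + \eta(t)\, h_{c,j}(t)\, \bigl[\mathbf{x} - \mathbf{w}_j(t)\bigr]
```

In a counterpropagation network, the winner is still chosen in the input (Kohonen) layer, and the output-layer weights at the same map positions are typically corrected toward the target vector by a rule of the same form.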
Chiang, Yi-Kun; Kuo, Ching-Chuan; Wu, Yu-Shan; Chen, Chung-Tong; Coumar, Mohane Selvaraj; Wu, Jian-Sung; Hsieh, Hsing-Pang; Chang, Chi-Yen; Jseng, Huan-Yi; Wu, Ming-Hsine; Leou, Jiun-Shyang; Song, Jen-Shin; Chang, Jang-Yang; Lyu, Ping-Chiang; Chao, Yu-Sheng; Wu, Su-Ying
2009-07-23
A pharmacophore model, Hypo1, was built on the basis of 21 training-set indole compounds with varying levels of antiproliferative activity. Hypo1 possessed important chemical features required for the inhibitors and demonstrated good predictive ability for biological activity, with high correlation coefficients of 0.96 and 0.89 for the training-set and test-set compounds, respectively. Further use of the Hypo1 pharmacophore model to screen a chemical database in silico led to the identification of four compounds with antiproliferative activity. Among these four compounds, 43 showed potent antiproliferative activity against various cancer cell lines, with the strongest inhibition of the proliferation of KB cells (IC(50) = 187 nM). Further biological characterization revealed that 43 effectively inhibited tubulin polymerization and significantly induced cell cycle arrest in G(2)-M phase. In addition, 43 also showed in vivo-like anticancer effects. To our knowledge, 43 is the most potent antiproliferative compound with antitubulin activity discovered by computer-aided drug design. The chemical novelty of 43 and its anticancer activities make this compound worthy of further lead optimization.
Toward automated biochemotype annotation for large compound libraries.
Chen, Xian; Liang, Yizeng; Xu, Jun
2006-08-01
Combinatorial chemistry allows scientists to probe large synthetically accessible chemical space. However, identifying the sub-space that is selectively associated with a biological target of interest is crucial to drug discovery and life sciences. This paper describes a process to automatically annotate biochemotypes of compounds in a library and thus to identify bioactivity-related chemotypes (biochemotypes) from a large library of compounds. The process consists of two steps: (1) predicting all possible bioactivities for each compound in a library, and (2) deriving possible biochemotypes based on predictions. The Prediction of Activity Spectra for Substances program (PASS) was used in the first step. In the second step, structural similarity and scaffold-hopping technologies are employed. These technologies are used to derive biochemotypes from bioactivity predictions and the corresponding annotated biochemotypes from the MDL Drug Data Report (MDDR) database. A commercially available compound library (CACL) of about one million (982,889) compounds has been tested using this process. This paper demonstrates the feasibility of automatically annotating biochemotypes for large libraries of compounds. Nevertheless, some issues need to be considered in order to improve the process. First, the prediction accuracy of the PASS program has no significant correlation with the number of compounds in a training set. Larger training sets do not necessarily increase the maximal error of prediction (MEP), nor do they increase the hit structural diversity. Smaller training sets do not necessarily decrease MEP, nor do they decrease the hit structural diversity. Second, the success of systematic bioactivity prediction relies on modeling, training data, and the definition of bioactivities (biochemotype ontology). Unfortunately, the biochemotype ontology was not well developed in the PASS program. Consequently, "ill-defined" bioactivities can reduce the quality of predictions. This paper suggests ways in which systematic bioactivity prediction programs should be improved.
Toropov, A A; Toropova, A P; Raska, I
2008-04-01
Simplified molecular input line entry system (SMILES) has been utilized in constructing quantitative structure-property relationships (QSPR) for octanol/water partition coefficient of vitamins and organic compounds of different classes by optimal descriptors. Statistical characteristics of the best model (vitamins) are the following: n=17, R(2)=0.9841, s=0.634, F=931 (training set); n=7, R(2)=0.9928, s=0.773, F=690 (test set). Using this approach for modeling octanol/water partition coefficient for a set of organic compounds gives a model that is statistically characterized by n=69, R(2)=0.9872, s=0.156, F=5184 (training set) and n=70, R(2)=0.9841, s=0.179, F=4195 (test set).
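The reported statistics can be cross-checked against one another. As a hedged worked example, assume each model uses a single optimal descriptor (k = 1), which the abstract does not state explicitly. The Fisher F-ratio then follows from n and R(2) alone:

```latex
% Assumption: one-descriptor linear model (k = 1); not stated in the abstract.
F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}
\quad\Longrightarrow\quad
F_{\text{vitamins, training}} \approx \frac{0.9841}{1 - 0.9841} \times (17 - 1 - 1) \approx 928
```

This is close to the reported F = 931; the small gap is consistent with R(2) having been rounded to four decimals. The same check on the vitamin test set (n = 7, R(2) = 0.9928) gives about 689 against the reported 690.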
Predicting Mouse Liver Microsomal Stability with “Pruned” Machine Learning Models and Public Data
Perryman, Alexander L.; Stratton, Thomas P.; Ekins, Sean; Freundlich, Joel S.
2015-01-01
Purpose: Mouse efficacy studies are a critical hurdle to advance translational research of potential therapeutic compounds for many diseases. Although mouse liver microsomal (MLM) stability studies are not a perfect surrogate for in vivo studies of metabolic clearance, they are the initial model system used to assess metabolic stability. Consequently, we explored the development of machine learning models that can enhance the probability of identifying compounds possessing MLM stability. Methods: Published assays on MLM half-life values were identified in PubChem, reformatted, and curated to create a training set with 894 unique small molecules. These data were used to construct machine learning models assessed with internal cross-validation, external tests with a published set of antitubercular compounds, and independent validation with an additional diverse set of 571 compounds (PubChem data on percent metabolism). Results: “Pruning” out the moderately unstable/moderately stable compounds from the training set produced models with superior predictive power. Bayesian models displayed the best predictive power for identifying compounds with a half-life ≥1 hour. Conclusions: Our results suggest the pruning strategy may be of general benefit to improve test set enrichment and provide machine learning models with enhanced predictive value for the MLM stability of small organic molecules. This study represents the most exhaustive study to date of using machine learning approaches with MLM data from public sources. PMID:26415647
Predicting Mouse Liver Microsomal Stability with "Pruned" Machine Learning Models and Public Data.
Perryman, Alexander L; Stratton, Thomas P; Ekins, Sean; Freundlich, Joel S
2016-02-01
Mouse efficacy studies are a critical hurdle to advance translational research of potential therapeutic compounds for many diseases. Although mouse liver microsomal (MLM) stability studies are not a perfect surrogate for in vivo studies of metabolic clearance, they are the initial model system used to assess metabolic stability. Consequently, we explored the development of machine learning models that can enhance the probability of identifying compounds possessing MLM stability. Published assays on MLM half-life values were identified in PubChem, reformatted, and curated to create a training set with 894 unique small molecules. These data were used to construct machine learning models assessed with internal cross-validation, external tests with a published set of antitubercular compounds, and independent validation with an additional diverse set of 571 compounds (PubChem data on percent metabolism). "Pruning" out the moderately unstable / moderately stable compounds from the training set produced models with superior predictive power. Bayesian models displayed the best predictive power for identifying compounds with a half-life ≥1 h. Our results suggest the pruning strategy may be of general benefit to improve test set enrichment and provide machine learning models with enhanced predictive value for the MLM stability of small organic molecules. This study represents the most exhaustive study to date of using machine learning approaches with MLM data from public sources.
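The "pruning" step described above can be sketched as a simple filter that drops the ambiguous middle of the half-life range before training. This is an illustration only: the function name, the compound names, and the 30-minute lower cutoff are hypothetical; only the ≥1 h stability criterion comes from the abstract.

```python
def prune_training_set(records, unstable_max_min, stable_min_min):
    """Keep only clearly unstable and clearly stable compounds.

    records: iterable of (compound_id, half_life_in_minutes).
    Returns (compound_id, half_life, label) with label 1 = stable, 0 = unstable.
    Compounds falling between the two cutoffs are dropped ("pruned").
    """
    pruned = []
    for compound_id, t_half in records:
        if t_half >= stable_min_min:
            pruned.append((compound_id, t_half, 1))
        elif t_half <= unstable_max_min:
            pruned.append((compound_id, t_half, 0))
    return pruned


# Invented placeholder data; the 60 min cutoff mirrors the ">= 1 h" criterion,
# while the 30 min cutoff is an assumption made for this sketch.
records = [("cmpd_a", 5.0), ("cmpd_b", 45.0), ("cmpd_c", 120.0), ("cmpd_d", 30.0)]
training_set = prune_training_set(records, unstable_max_min=30.0, stable_min_min=60.0)
```

The trade-off is fewer training compounds in exchange for cleaner class labels.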
Prediction of Partition Coefficients of Organic Compounds between SPME/PDMS and Aqueous Solution
Chao, Keh-Ping; Lu, Yu-Ting; Yang, Hsiu-Wen
2014-01-01
Polydimethylsiloxane (PDMS) is commonly used as the coated polymer in the solid phase microextraction (SPME) technique. In this study, the partition coefficients of organic compounds between SPME/PDMS and the aqueous solution were compiled from the literature sources. The correlation analysis for partition coefficients was conducted to interpret the effect of their physicochemical properties and descriptors on the partitioning process. The PDMS-water partition coefficients were significantly correlated to the polarizability of organic compounds (r = 0.977, p < 0.05). An empirical model, consisting of the polarizability, the molecular connectivity index, and an indicator variable, was developed to appropriately predict the partition coefficients of 61 organic compounds for the training set. The predictive ability of the empirical model was demonstrated by using it on a test set of 26 chemicals not included in the training set. The empirical model, applying the straightforward calculated molecular descriptors, for estimating the PDMS-water partition coefficient will contribute to the practical applications of the SPME technique. PMID:24534804
Designing Multi-target Compound Libraries with Gaussian Process Models.
Bieler, Michael; Reutlinger, Michael; Rodrigues, Tiago; Schneider, Petra; Kriegl, Jan M; Schneider, Gisbert
2016-05-01
We present the application of machine learning models to selecting G protein-coupled receptor (GPCR)-focused compound libraries. The library design process was realized by ant colony optimization. A proprietary Boehringer-Ingelheim reference set consisting of 3519 compounds tested in dose-response assays at 11 GPCR targets served as training data for machine learning and activity prediction. We compared the usability of the proprietary data with a public data set from ChEMBL. Gaussian process models were trained to prioritize compounds from a virtual combinatorial library. We obtained meaningful models for three of the targets (5-HT2c , MCH, A1), which were experimentally confirmed for 12 of 15 selected and synthesized or purchased compounds. Overall, the models trained on the public data predicted the observed assay results more accurately. The results of this study motivate the use of Gaussian process regression on public data for virtual screening and target-focused compound library design. © 2016 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA. This is an open access article under the terms of the Creative Commons Attribution Non-Commercial NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
QSAR Study for Carcinogenic Potency of Aromatic Amines Based on GEP and MLPs
Song, Fucheng; Zhang, Anling; Liang, Hui; Cui, Lianhua; Li, Wenlian; Si, Hongzong; Duan, Yunbo; Zhai, Honglin
2016-01-01
A new analysis strategy was used to classify the carcinogenicity of aromatic amines. The physical-chemical parameters are closely related to the carcinogenicity of compounds. Quantitative structure-activity relationship (QSAR) modeling is a method of predicting the carcinogenicity of aromatic amines that can reveal the relationship between carcinogenicity and physical-chemical parameters. This study applied gene expression programming (with APS software) and multilayer perceptrons (with Weka software) to predict the carcinogenicity of aromatic amines. Both methods relied on molecular descriptors calculated by CODESSA software, and eight molecular descriptors were selected to build the function equations. The accuracy of gene expression programming was 0.92 and 0.82 in the training and test sets, respectively, and the accuracy of multilayer perceptrons was 0.84 and 0.74, respectively. Gene expression programming clearly outperformed multilayer perceptrons in both the training set and the test set. QSAR is a highly efficient method for identifying carcinogenic compounds. PMID:27854309
Effect of missing data on multitask prediction methods.
de la Vega de León, Antonio; Chen, Beining; Gillet, Valerie J
2018-05-22
There has been a growing interest in multitask prediction in chemoinformatics, helped by the increasing use of deep neural networks in this field. This technique is applied to multitarget data sets, where compounds have been tested against different targets, with the aim of developing models to predict a profile of biological activities for a given compound. However, multitarget data sets tend to be sparse; i.e., not all compound-target combinations have experimental values. There has been little research on the effect of missing data on the performance of multitask methods. We have used two complete data sets to simulate sparseness by removing data from the training set. Different models to remove the data were compared. These sparse sets were used to train two different multitask methods, deep neural networks and Macau, which is a Bayesian probabilistic matrix factorization technique. Results from both methods were remarkably similar and showed that the performance decrease because of missing data is at first small before accelerating after large amounts of data are removed. This work provides a first approximation to assess how much data is required to produce good performance in multitask prediction exercises.
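Simulating sparseness on a complete compound-by-target matrix can be sketched as below. The abstract compares several removal models; this sketch shows only uniform random removal, which is an assumption, and the function name and data are hypothetical.

```python
import random


def sparsify(matrix, fraction, seed=0):
    """Return a copy of a complete activity matrix with a fraction of cells set to None.

    Assumption: cells are removed uniformly at random; the paper compares
    several removal models that are not reproduced here.
    """
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(matrix) for j in range(len(row))]
    n_remove = int(round(fraction * len(cells)))
    removed = set(rng.sample(cells, n_remove))
    return [
        [None if (i, j) in removed else value for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


# Invented 4 compounds x 3 targets activity matrix.
complete = [[6.1, 5.2, 7.0], [4.8, 6.6, 5.5], [7.3, 5.9, 6.2], [5.0, 5.1, 6.8]]
sparse = sparsify(complete, fraction=0.5)
n_missing = sum(value is None for row in sparse for value in row)
```

Repeating this for increasing values of `fraction` and retraining each time gives the kind of performance-versus-missing-data curve the abstract describes.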
Ventura, Cristina; Latino, Diogo A R S; Martins, Filomena
2013-01-01
The performance of two QSAR methodologies, namely Multiple Linear Regressions (MLR) and Neural Networks (NN), towards the modeling and prediction of antitubercular activity was evaluated and compared. A data set of 173 potentially active compounds belonging to the hydrazide family and represented by 96 descriptors was analyzed. Models were built with Multiple Linear Regressions (MLR), single Feed-Forward Neural Networks (FFNNs), ensembles of FFNNs and Associative Neural Networks (AsNNs) using four different data sets and different types of descriptors. The predictive ability of the different techniques was assessed and discussed on the basis of different validation criteria, and the results show, in general, a better performance of AsNNs in terms of learning ability and prediction of antitubercular behaviors when compared with all other methods. MLRs have, however, the advantage of pinpointing the most relevant molecular characteristics responsible for the behavior of these compounds against Mycobacterium tuberculosis. The best results for the larger data set (94 compounds in training set and 18 in test set) were obtained with AsNNs using seven descriptors (R(2) of 0.874 and RMSE of 0.437 against R(2) of 0.845 and RMSE of 0.472 in MLRs, for test set). Counter-Propagation Neural Networks (CPNNs) were trained with the same data sets and descriptors. From the scrutiny of the weight levels in each CPNN and the information retrieved from MLRs, a rational design of potentially active compounds was attempted. Two new compounds were synthesized and tested against M. tuberculosis showing an activity close to that predicted by the majority of the models. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
How to Achieve Better Results Using Pass-Based Virtual Screening: Case Study for Kinase Inhibitors
NASA Astrophysics Data System (ADS)
Pogodin, Pavel V.; Lagunin, Alexey A.; Rudik, Anastasia V.; Filimonov, Dmitry A.; Druzhilovskiy, Dmitry S.; Nicklaus, Mark C.; Poroikov, Vladimir V.
2018-04-01
Discovery of new pharmaceutical substances is currently boosted by the possibility of utilization of the Synthetically Accessible Virtual Inventory (SAVI) library, which includes about 283 million molecules, each annotated with a proposed synthetic one-step route from commercially available starting materials. The SAVI database is well-suited for ligand-based methods of virtual screening to select molecules for experimental testing. In this study, we compare the performance of three approaches for the analysis of structure-activity relationships that differ in their criteria for selecting “active” and “inactive” compounds included in the training sets. PASS (Prediction of Activity Spectra for Substances), which is based on a modified Naïve Bayes algorithm, was applied since it had been shown to be robust and to provide good predictions of many biological activities based on just the structural formula of a compound even if the information in the training set is incomplete. We used different subsets of kinase inhibitors for this case study because many data are currently available on this important class of drug-like molecules. Based on the subsets of kinase inhibitors extracted from the ChEMBL 20 database we performed the PASS training, and then applied the model to ChEMBL 23 compounds not yet present in ChEMBL 20 to identify novel kinase inhibitors. As one may expect, the best prediction accuracy was obtained if only the experimentally confirmed active and inactive compounds for distinct kinases were used in the training procedure. However, for some kinases, reasonable results were obtained even if we used merged training sets, in which we designated as inactives the compounds not tested against the particular kinase. Thus, depending on the availability of data for a particular biological activity, one may choose the first or the second approach for creating ligand-based computational tools to achieve the best possible results in virtual screening.
Votano, Joseph R; Parham, Marc; Hall, L Mark; Hall, Lowell H; Kier, Lemont B; Oloff, Scott; Tropsha, Alexander
2006-11-30
Four modeling techniques, using topological descriptors to represent molecular structure, were employed to produce models of human serum protein binding (% bound) on a data set of 1008 experimental values, carefully screened from publicly available sources. To our knowledge, this is the largest data set on human serum protein binding reported for QSAR modeling. The data was partitioned into a training set of 808 compounds and an external validation test set of 200 compounds. Partitioning was accomplished by clustering the compounds in a structure descriptor space so that random sampling of 20% of the whole data set produced an external test set that is a good representative of the training set with respect to both structure and protein binding values. The four modeling techniques include multiple linear regression (MLR), artificial neural networks (ANN), k-nearest neighbors (kNN), and support vector machines (SVM). With the exception of the MLR model, the ANN, kNN, and SVM QSARs were ensemble models. Training set correlation coefficients and mean absolute error ranged from r2=0.90 and MAE=7.6 for ANN to r2=0.61 and MAE=16.2 for MLR. Prediction results from the validation set yielded correlation coefficients and mean absolute errors which ranged from r2=0.70 and MAE=14.1 for ANN to a low of r2=0.59 and MAE=18.3 for the SVM model. Structure descriptors that contribute significantly to the models are discussed and compared with those found in other published models. For the ANN model, structure descriptor trends with respect to their effects on predicted protein binding can assist the chemist in structure modification during the drug design process.
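The two figures of merit quoted in this abstract can be computed as sketched below. The sketch assumes that r2 means the squared Pearson correlation between observed and predicted % bound, which the abstract implies but does not define; function names and data are hypothetical.

```python
from statistics import mean


def mean_absolute_error(observed, predicted):
    """MAE in the units of the property (here, percentage points of % bound)."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values.

    Assumption: this is the meaning of "r2" in the abstract.
    """
    o_bar, p_bar = mean(observed), mean(predicted)
    s_op = sum((o - o_bar) * (p - p_bar) for o, p in zip(observed, predicted))
    s_oo = sum((o - o_bar) ** 2 for o in observed)
    s_pp = sum((p - p_bar) ** 2 for p in predicted)
    return s_op ** 2 / (s_oo * s_pp)


# Invented % bound values for four validation compounds.
observed = [10.0, 50.0, 90.0, 70.0]
predicted = [20.0, 50.0, 80.0, 90.0]
mae = mean_absolute_error(observed, predicted)
r2 = r_squared(observed, predicted)
```

Reporting both is useful because r2 measures how well the ranking and trend are captured, while MAE states the typical error in the property's own units.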
Feasibility of Active Machine Learning for Multiclass Compound Classification.
Lang, Tobias; Flachsenberg, Florian; von Luxburg, Ulrike; Rarey, Matthias
2016-01-25
A common task in the hit-to-lead process is classifying sets of compounds into multiple, usually structural classes, which build the groundwork for subsequent SAR studies. Machine learning techniques can be used to automate this process by learning classification models from training compounds of each class. Gathering class information for compounds can be cost-intensive as the required data needs to be provided by human experts or experiments. This paper studies whether active machine learning can be used to reduce the required number of training compounds. Active learning is a machine learning method which processes class label data in an iterative fashion. It has gained much attention in a broad range of application areas. In this paper, an active learning method for multiclass compound classification is proposed. This method selects informative training compounds so as to optimally support the learning progress. The combination with human feedback leads to a semiautomated interactive multiclass classification procedure. This method was investigated empirically on 15 compound classification tasks containing 86-2870 compounds in 3-38 classes. The empirical results show that active learning can solve these classification tasks using 10-80% of the data which would be necessary for standard learning techniques.
Dry selection and wet evaluation for the rational discovery of new anthelmintics
NASA Astrophysics Data System (ADS)
Marrero-Ponce, Yovani; Castañeda, Yeniel González; Vivas-Reyes, Ricardo; Vergara, Fredy Máximo; Arán, Vicente J.; Castillo-Garit, Juan A.; Pérez-Giménez, Facundo; Torrens, Francisco; Le-Thi-Thu, Huong; Pham-The, Hai; Montenegro, Yolanda Vera; Ibarra-Velarde, Froylán
2017-09-01
Helminth infections remain a major problem in medicine and public health. In this report, atom-based 2D bilinear indices, a TOMOCOMD-CARDD (QuBiLs-MAS module) molecular descriptor family, and linear discriminant analysis (LDA) were used to find models that differentiate between anthelmintic and non-anthelmintic compounds. Two classification models, obtained by using non-stochastic and stochastic 2D bilinear indices, correctly classified 86.64% and 84.66% of the compounds in the training set, respectively. Equation 1(2) correctly classified 141(135) out of 165 [85.45%(81.82%)] compounds in the external validation set. A further LDA model was developed in order to identify the most likely mechanism of action of anthelmintics. The model shows an accuracy of 86.84% in the training set and 94.44% in the external prediction set. Finally, we carried out an experiment to predict the biological profile of our 'in-house' collections of indole, indazole, quinoxaline and cinnoline derivatives (∼200 compounds). Subsequently, we selected a group of nine of the theoretically most active structures. These chemicals were then tested in an in vitro assay, and one good candidate fasciolicide compound (VA5-5c; 100% reduction at concentrations of 50 and 10 mg/L) was discovered.
QSAR Modeling Using Large-Scale Databases: Case Study for HIV-1 Reverse Transcriptase Inhibitors.
Tarasova, Olga A; Urusova, Aleksandra F; Filimonov, Dmitry A; Nicklaus, Marc C; Zakharov, Alexey V; Poroikov, Vladimir V
2015-07-27
Large-scale databases are important sources of training sets for various QSAR modeling approaches. Generally, these databases contain information extracted from different sources. This variety of sources can produce inconsistency in the data, defined as sometimes widely diverging activity results for the same compound against the same target. Because such inconsistency can reduce the accuracy of predictive models built from these data, we are addressing the question of how best to use data from publicly and commercially accessible databases to create accurate and predictive QSAR models. We investigate the suitability of commercially and publicly available databases to QSAR modeling of antiviral activity (HIV-1 reverse transcriptase (RT) inhibition). We present several methods for the creation of modeling (i.e., training and test) sets from two, either commercially or freely available, databases: Thomson Reuters Integrity and ChEMBL. We found that the typical predictivities of QSAR models obtained using these different modeling set compilation methods differ significantly from each other. The best results were obtained using training sets compiled for compounds tested using only one method and material (i.e., a specific type of biological assay). Compound sets aggregated by target only typically yielded poorly predictive models. We discuss the possibility of "mix-and-matching" assay data across aggregating databases such as ChEMBL and Integrity and their current severe limitations for this purpose. One of them is the general lack of complete and semantic/computer-parsable descriptions of assay methodology carried by these databases that would allow one to determine mix-and-matchability of result sets at the assay level.
Padró, Juan M; Pellegrino Vidal, Rocío B; Reta, Mario
2014-12-01
The partition coefficients, P IL/w, of several compounds, some of them of biological and pharmacological interest, between water and room-temperature ionic liquids based on the imidazolium, pyridinium, and phosphonium cations, namely 1-octyl-3-methylimidazolium hexafluorophosphate, N-octylpyridinium tetrafluorophosphate, trihexyl(tetradecyl)phosphonium chloride, trihexyl(tetradecyl)phosphonium bromide, trihexyl(tetradecyl)phosphonium bis(trifluoromethylsulfonyl)imide, and trihexyl(tetradecyl)phosphonium dicyanamide, were accurately measured. In this way, we extended our database of partition coefficients in room-temperature ionic liquids previously reported. We employed the solvation parameter model with different probe molecules (the training set) to elucidate the chemical interactions involved in the partition process and discussed the most relevant differences among the three types of ionic liquids. The multiparametric equations obtained with the aforementioned model were used to predict the partition coefficients for compounds (the test set) not present in the training set, most being of biological and pharmacological interest. An excellent agreement between calculated and experimental log P IL/w values was obtained. Thus, the obtained equations can be used to predict, a priori, the extraction efficiency for any compound using these ionic liquids as extraction solvents in liquid-liquid extractions.
For QSAR and QSPR modeling of biological and physicochemical properties, estimating the accuracy of predictions is a critical problem. The “distance to model” (DM) can be defined as a metric that defines the similarity between the training set molecules and the test set compound ...
Kaushik, Karishma S.; Kessel, Ashley; Ratnayeke, Nalin; Gordon, Vernita D.
2015-01-01
We have developed a hands-on experimental module that combines biology experiments with a physics-based analytical model in order to characterize antimicrobial compounds. To understand antibiotic resistance, participants perform a disc diffusion assay to test the antimicrobial activity of different compounds and then apply a diffusion-based analytical model to gain insights into the behavior of the active antimicrobial component. In our experience, this module was robust, reproducible, and cost-effective, suggesting that it could be implemented in diverse settings such as undergraduate research, STEM (science, technology, engineering, and math) camps, school programs, and laboratory training workshops. By providing valuable interdisciplinary research experience in science outreach and education initiatives, this module addresses the paucity of structured training or education programs that integrate diverse scientific fields. Its low-cost requirements make it especially suitable for use in resource-limited settings. PMID:25602254
Thousands of compounds in the environment have not been characterized for developmental neurotoxicity (DNT) hazard. To address this issue, methods to screen compounds rapidly for DNT hazard evaluation are necessary and are being developed for key neurodevelopmental processes. In...
Why relevant chemical information cannot be exchanged without disclosing structures
NASA Astrophysics Data System (ADS)
Filimonov, Dmitry; Poroikov, Vladimir
2005-09-01
Both society and industry are interested in increasing the safety of pharmaceuticals. Potentially dangerous compounds could be filtered out at early stages of R&D by computer prediction of biological activity and ADMET characteristics. The accuracy of such predictions strongly depends on the quality and quantity of information contained in a training set. A suggestion that some relevant chemical information can be added to such training sets without disclosing chemical structures was put forward at the recent ACS Symposium. We presented arguments that such safe exchange of relevant chemical information is impossible. Any relevant information about chemical structures can be used to search for either a particular compound itself or its close analogues. The risk of such structures being identified is enough to prevent the pharmaceutical industry from exchanging relevant chemical information.
Quantitative structure-activity relationship modeling of rat acute toxicity by oral exposure.
Zhu, Hao; Martin, Todd M; Ye, Lin; Sedykh, Alexander; Young, Douglas M; Tropsha, Alexander
2009-12-01
Few quantitative structure-activity relationship (QSAR) studies have successfully modeled large, diverse rodent toxicity end points. In this study, a comprehensive data set of 7385 compounds with their most conservative lethal dose (LD50) values has been compiled. A combinatorial QSAR approach has been employed to develop robust and predictive models of acute toxicity in rats caused by oral exposure to chemicals. To enable fair comparison between the predictive power of models generated in this study versus a commercial toxicity predictor, TOPKAT (Toxicity Prediction by Komputer Assisted Technology), a modeling subset of the entire data set was selected that included all 3472 compounds used in TOPKAT's training set. The remaining 3913 compounds, which were not present in the TOPKAT training set, were used as the external validation set. QSAR models of five different types were developed for the modeling set. The prediction accuracy for the external validation set was estimated by the determination coefficient R² of linear regression between actual and predicted LD50 values. The use of the applicability domain threshold implemented in most models generally improved the external prediction accuracy but expectedly led to a decrease in chemical space coverage; depending on the applicability domain threshold, R² ranged from 0.24 to 0.70. Ultimately, several consensus models were developed by averaging the predicted LD50 for every compound using all five models. The consensus models afforded higher prediction accuracy for the external validation data set with higher coverage as compared to individual constituent models. The validated consensus LD50 models developed in this study can be used as reliable computational predictors of in vivo acute toxicity.
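The consensus step and the external R² described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, `None` is used here to mark a compound that falls outside a model's applicability domain, and averaging only over the models that cover a compound is an assumption.

```python
from statistics import mean


def consensus_predictions(model_predictions):
    """Average per-compound predictions across models.

    model_predictions: one list per model, aligned by compound.
    A value of None means "outside that model's applicability domain"
    (an assumed convention for this sketch).
    """
    consensus = []
    for per_compound in zip(*model_predictions):
        covered = [p for p in per_compound if p is not None]
        consensus.append(mean(covered) if covered else None)
    return consensus


def coverage(predictions):
    """Fraction of compounds that received a prediction."""
    return sum(p is not None for p in predictions) / len(predictions)


def r_squared(actual, predicted):
    """R^2 of the simple linear regression between actual and predicted
    values, computed as the squared Pearson correlation."""
    ma, mp = mean(actual), mean(predicted)
    sxy = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    sxx = sum((p - mp) ** 2 for p in predicted)
    syy = sum((a - ma) ** 2 for a in actual)
    return (sxy * sxy) / (sxx * syy)
```

Raising the applicability-domain threshold in such a scheme leaves fewer `None` entries, which raises coverage, typically at some cost in R²; that is the trade-off the abstract reports.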
Pizzo, Fabiola; Lombardo, Anna; Manganaro, Alberto; Benfenati, Emilio
2016-01-01
The prompt identification of chemical molecules with potential effects on liver may help in drug discovery and in raising the levels of protection for human health. Besides in vitro approaches, computational methods in toxicology are drawing attention. We built a structure-activity relationship (SAR) model for evaluating hepatotoxicity. After compiling a data set of 950 compounds using data from the literature, we randomly split it into training (80%) and test sets (20%). We also compiled an external validation set (101 compounds) for evaluating the performance of the model. To extract structural alerts (SAs) related to hepatotoxicity and non-hepatotoxicity we used SARpy, a statistical application that automatically identifies and extracts chemical fragments related to a specific activity. We also applied the chemical grouping approach for manually identifying other SAs. We calculated accuracy, specificity, sensitivity and Matthews correlation coefficient (MCC) on the training, test and external validation sets. Considering the complexity of the endpoint, the model performed well. In the training, test and external validation sets the accuracy was respectively 81, 63, and 68%, specificity 89, 33, and 33%, sensitivity 93, 88, and 80% and MCC 0.63, 0.27, and 0.13. Since it is preferable to overestimate hepatotoxicity rather than not to recognize unsafe compounds, the model's architecture followed a conservative approach. As it was built using human data, it might be applied without any need for extrapolation from other species. This model will be freely available in the VEGA platform. PMID:27920722
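The four statistics reported above all follow from a binary confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
from math import sqrt


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and Matthews correlation
    coefficient from confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }
```

A conservative model of the kind described, one that prefers to overestimate toxicity, would show up here as high sensitivity paired with low specificity.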
Prediction of biodegradability of aromatics in water using QSAR modeling.
Cvetnic, Matija; Juretic Perisic, Daria; Kovacic, Marin; Kusic, Hrvoje; Dermadi, Jasna; Horvat, Sanja; Bolanca, Tomislav; Marin, Vedrana; Karamanis, Panaghiotis; Loncaric Bozic, Ana
2017-05-01
The study was aimed at developing models for predicting the biodegradability of aromatic water pollutants. For that purpose, 36 single-benzene ring compounds, with different type, number and position of substituents, were used. The biodegradability was estimated according to the ratio of the biochemical (BOD5) and chemical (COD) oxygen demand values determined for parent compounds ((BOD5/COD)0), as well as for their reaction mixtures in half-life achieved by UV-C/H2O2 process ((BOD5/COD)t1/2). The models correlating biodegradability and molecular structure characteristics of studied pollutants were derived using quantitative structure-activity relationship (QSAR) principles and tools. Upon derivation of the models and calibration on the training and subsequent testing on the test set, 3- and 5-variable models were selected as the most predictive for (BOD5/COD)0 and (BOD5/COD)t1/2, respectively, according to the values of statistical parameters R² and Q². Hence, 3-variable model predicting (BOD5/COD)0 possessed R² = 0.863 and Q² = 0.799 for training set, and R² = 0.710 for test set, while 5-variable model predicting (BOD5/COD)t1/2 possessed R² = 0.886 and Q² = 0.788 for training set, and R² = 0.564 for test set. The selected models are interpretable and transparent, reflecting key structural features that influence targeted biodegradability and can be correlated with the degradation mechanisms of studied compounds by UV-C/H2O2. Copyright © 2017 Elsevier Inc. All rights reserved.
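The difference between the fitted R² and the cross-validated Q² quoted above can be illustrated with a small sketch. It assumes leave-one-out cross-validation (the abstract does not say which scheme was used) and, for brevity, a single-descriptor linear model rather than the 3- and 5-variable models of the study; all function names are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one descriptor."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def r2_fit(x, y):
    """R^2 of the model fitted on, and evaluated on, the same data."""
    slope, intercept = fit_line(x, y)
    my = mean(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot


def q2_loo(x, y):
    """Leave-one-out Q^2 = 1 - PRESS / SS_tot.

    Each compound is predicted by a model refitted without it.
    """
    press = 0.0
    for i in range(len(x)):
        x_train, y_train = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    my = mean(y)
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - press / ss_tot
```

Because every left-out compound is predicted by a model that never saw it, Q² cannot exceed the fitted R² for a least-squares model, which matches the pattern in the reported values.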
QSAR Classification Model for Antibacterial Compounds and Its Use in Virtual Screening
2012-09-26
test set molecules that were not used to train the models. This allowed us to more accurately estimate the prediction power of the models. As...pathogens and deposited in PubChem Bioassays. Ultimately, the main purpose of this model is to make predictions, based on known antibacterial and non...the model built from the remaining compounds is used to predict the left out compound. Once all the compounds pass through this cycle of prediction, a
Lee, Won Jun; Kim, Sang Cheol; Lee, Seul Ji; Lee, Jeongmi; Park, Jeong Hill; Yu, Kyung-Sang; Lim, Johan; Kwon, Sung Won
2014-01-01
Based on the process of carcinogenesis, carcinogens are classified as either genotoxic or non-genotoxic. In contrast to non-genotoxic carcinogens, many genotoxic carcinogens have been reported to cause tumor in carcinogenic bioassays in animals. Thus evaluating the genotoxicity potential of chemicals is important to discriminate genotoxic from non-genotoxic carcinogens for health care and pharmaceutical industry safety. Additionally, investigating the difference between the mechanisms of genotoxic and non-genotoxic carcinogens could provide the foundation for a mechanism-based classification for unknown compounds. In this study, we investigated the gene expression of HepG2 cells treated with genotoxic or non-genotoxic carcinogens and compared their mechanisms of action. To enhance our understanding of the differences in the mechanisms of genotoxic and non-genotoxic carcinogens, we implemented a gene set analysis using 12 compounds for the training set (12, 24, 48 h) and validated significant gene sets using 22 compounds for the test set (24, 48 h). For a direct biological translation, we conducted a gene set analysis using Globaltest and selected significant gene sets. To validate the results, training and test compounds were predicted by the significant gene sets using a prediction analysis for microarrays (PAM). Finally, we obtained 6 gene sets, including sets enriched for genes involved in the adherens junction, bladder cancer, p53 signaling pathway, pathways in cancer, peroxisome and RNA degradation. Among the 6 gene sets, the bladder cancer and p53 signaling pathway sets were significant at 12, 24 and 48 h. We also found that the DDB2, RRM2B and GADD45A, genes related to the repair and damage prevention of DNA, were consistently up-regulated for genotoxic carcinogens. 
Our results suggest that a gene set analysis could provide a robust tool in the investigation of the different mechanisms of genotoxic and non-genotoxic carcinogens and construct a more detailed understanding of the perturbation of significant pathways. PMID:24497971
A hierarchical clustering methodology for the estimation of toxicity.
Martin, Todd M; Harten, Paul; Venkatapathy, Raghuraman; Das, Shashikala; Young, Douglas M
2008-01-01
A quantitative structure-activity relationship (QSAR) methodology based on hierarchical clustering was developed to predict toxicological endpoints. This methodology utilizes Ward's method to divide a training set into a series of structurally similar clusters. The structural similarity is defined in terms of 2-D physicochemical descriptors (such as connectivity and E-state indices). A genetic algorithm-based technique is used to generate statistically valid QSAR models for each cluster (using the pool of descriptors described above). The toxicity for a given query compound is estimated using the weighted average of the predictions from the closest cluster from each step in the hierarchical clustering assuming that the compound is within the domain of applicability of the cluster. The hierarchical clustering methodology was tested using a Tetrahymena pyriformis acute toxicity data set containing 644 chemicals in the training set and with two prediction sets containing 339 and 110 chemicals. The results from the hierarchical clustering methodology were compared to the results from several different QSAR methodologies.
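The clustering step can be sketched as a naive agglomerative procedure using Ward's criterion, which at each step merges the two clusters whose union gives the smallest increase in within-cluster sum of squares. This is only an illustration under stated assumptions: the function name is hypothetical, descriptor vectors are plain tuples, and the sketch returns one flat partition, whereas the methodology above fits a QSAR model to the clusters at every step of the hierarchy.

```python
def ward_clusters(points, n_clusters):
    """Agglomerative clustering with Ward's minimum-variance criterion.

    points: sequence of equal-length descriptor vectors.
    Returns a list of clusters, each a list of indices into points.
    Deliberately naive: every merge rescans all cluster pairs.
    """
    clusters = [[i] for i in range(len(points))]
    dim = len(points[0])

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members)
                for d in range(dim)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b merge.
        ca, cb = centroid(a), centroid(b)
        dist2 = sum((u - v) ** 2 for u, v in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * dist2

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs,
                   key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Recording the partition after every merge, rather than only the final one, would give the series of clusterings from which a query compound's closest cluster at each step is taken.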
Viira, Birgit; Gendron, Thibault; Lanfranchi, Don Antoine; Cojean, Sandrine; Horvath, Dragos; Marcou, Gilles; Varnek, Alexandre; Maes, Louis; Maran, Uko; Loiseau, Philippe M; Davioud-Charvet, Elisabeth
2016-06-29
Malaria is a parasitic tropical disease that kills around 600,000 patients every year. The emergence of Plasmodium falciparum parasites resistant to artemisinin-based combination therapies (ACTs) represents a significant public health threat, indicating the urgent need for new effective compounds to reverse ACT resistance and cure the disease. For this, extensive curation and homogenization of experimental anti-Plasmodium screening data from both in-house and ChEMBL sources were conducted. As a result, a strategy was established that allowed compiling coherent training sets that associate compound structures with the respective antimalarial activity measurements. Seventeen of these training sets led to the successful generation of classification models discriminating whether a compound has a significant probability of being active under the specific conditions of the antimalarial test associated with each set. These models were used in consensus prediction of the most likely actives from a series of curcuminoids available in-house. Compounds predicted as active, together with a few predicted as inactive, were then submitted to experimental in vitro antimalarial testing. A large majority of the compounds predicted as active showed antimalarial activity, whereas those predicted as inactive did not, thus experimentally validating the in silico screening approach. The consensus machine learning approach proposed here showed its potential to reduce the cost and duration of antimalarial drug discovery.
Coupling Matched Molecular Pairs with Machine Learning for Virtual Compound Optimization.
Turk, Samo; Merget, Benjamin; Rippmann, Friedrich; Fulle, Simone
2017-12-26
Matched molecular pair (MMP) analyses are widely used in compound optimization projects to gain insights into structure-activity relationships (SAR). The analysis is traditionally done via statistical methods but can also be employed together with machine learning (ML) approaches to extrapolate to novel compounds. The MMP/ML method introduced here combines a fragment-based MMP implementation with different machine learning methods to obtain automated SAR decomposition and prediction. To test the prediction capabilities and model transferability, two different compound optimization scenarios were designed: (1) "new fragments", which occurs when exploring new fragments for a defined compound series, and (2) "new static core and transformations", which resembles, for instance, the identification of a new compound series. Very good results were achieved by all employed machine learning methods, especially for the new fragments case, but overall deep neural network models performed best, allowing reliable predictions also for the new static core and transformations scenario, where comprehensive SAR knowledge of the compound series is missing. Furthermore, we show that models trained on all available data have a higher generalizability compared to models trained on focused series and can extend beyond chemical space covered in the training data. Thus, coupling MMP with deep neural networks provides a promising approach to make high quality predictions on various data sets and in different compound optimization scenarios.
Tsesmeli, Styliani N.
2017-01-01
The study aimed to evaluate the intervention effects on spelling and meaning of compounds by Greek students via group board games in classroom settings. The sample consisted of 60 pupils, who were attending the first and second grade of two primary schools in Greece. Each grade-class was divided into an intervention (N = 29 children) and a control group (N = 31 children). Before intervention, groups were evaluated by standardized tests of reading words/pseudowords, spelling words, and vocabulary. Students were also assessed on compound knowledge by a word analogy task, a meaning task and a spelling task. The experimental design of the intervention included a pre-test, a training program, and a post-test. The pre- and post-assessments consisted of the spelling and the meaning tasks entailing equally morphologically transparent and opaque compounds. The training program was based on word families (N = 10 word families, 56 trained items, 5 sessions) and aimed to offer instruction of morphological decomposition and meaning of words. The findings showed that training was effective in enhancing the spelling and most notably the meaning of compounds. A closer inspection of intervention data in terms of morphological transparency, revealed that training group of first graders improved significantly both on transparent and opaque compounds, while the degree of gains was larger on opaque items for the second graders. These findings are consistent with the experimental literature and particularly optimistic for the literacy enhancement of typically developing children in regular classrooms. PMID:29238316
Teixeira, Ana L; Falcao, Andre O
2014-07-28
Structurally similar molecules tend to have similar properties, i.e. closer molecules in the molecular space are more likely to yield similar property values while distant molecules are more likely to yield different values. Based on this principle, we propose the use of a new method that takes into account the high dimensionality of the molecular space, predicting chemical, physical, or biological properties based on the most similar compounds with measured properties. This methodology uses ordinary kriging coupled with three different molecular similarity approaches (based on molecular descriptors, fingerprints, and atom matching) which creates an interpolation map over the molecular space that is capable of predicting properties/activities for diverse chemical data sets. The proposed method was tested in two data sets of diverse chemical compounds collected from the literature and preprocessed. One of the data sets contained dihydrofolate reductase inhibition activity data, and the second molecules for which aqueous solubility was known. The overall predictive results using kriging for both data sets comply with the results obtained in the literature using typical QSPR/QSAR approaches. However, the procedure did not involve any type of descriptor selection or even minimal information about each problem, suggesting that this approach is directly applicable to a large spectrum of problems in QSAR/QSPR. Furthermore, the predictive results improve significantly with the similarity threshold between the training and testing compounds, allowing the definition of a confidence threshold of similarity and error estimation for each case inferred. The use of kriging for interpolation over the molecular metric space is independent of the training data set size, and no reparametrizations are necessary when more compounds are added or removed from the set, and increasing the size of the database will consequentially improve the quality of the estimations. 
Finally it is shown that this model can be used for checking the consistency of measured data and for guiding an extension of the training set by determining the regions of the molecular space for which new experimental measurements could be used to maximize the model's predictive performance.
Neural network pattern recognition of thermal-signature spectra for chemical defense
NASA Astrophysics Data System (ADS)
Carrieri, Arthur H.; Lim, Pascal I.
1995-05-01
We treat infrared patterns of absorption or emission by nerve and blister agent compounds (and simulants of this chemical group) as features for the training of neural networks to detect the compounds' liquid layers on the ground or their vapor plumes during evaporation by external heating. Training of a four-layer network architecture is composed of a backward-error-propagation algorithm and a gradient-descent paradigm. We conduct testing by feed-forwarding preprocessed spectra through the network in a scaled format consistent with the structure of the training-data-set representation. The best-performance weight matrix (spectral filter) evolved from final network training and testing with software simulation trials is electronically transferred to a set of eight artificial intelligence integrated circuits (ICs) in specific modular form (splitting of weight matrices). This form makes full use of all input-output IC nodes. This neural network computer serves an important real-time detection function when it is integrated into pre- and postprocessing data-handling units of a tactical prototype thermoluminescence sensor now under development at the Edgewood Research, Development, and Engineering Center.
PREDICTION OF MOLECULAR PROPERTIES WITH MID-INFRARED SPECTRA AND INTERFEROGRAMS
We have built infrared spectroscopy-based partial least squares (PLS) models for molecular polarizabilities using a 97 member training set and a 59 member independent prediction set. These 156 compounds span a very wide range of chemical structure. Our goal was to use this well...
Assessment of Stimulus Overselectivity with Tactile Compound Stimuli in Children with Autism
ERIC Educational Resources Information Center
Ploog, Bertram O.; Kim, Nina
2007-01-01
Autistic and typical children mastered a simultaneous discrimination task with three sets of all-tactile compound stimuli. During training, responding to one stimulus (S+) resulted in rewards whereas responding to the alternative (S-) was extinguished. Test 1 was conducted with recombinations of S+ and S- elements. In Test 2, the test stimulus to…
DESCRIPTIVE ANALYSIS OF DIVALENT SALTS
YANG, HEIDI HAI-LING; LAWLESS, HARRY T.
2005-01-01
Many divalent salts (e.g., calcium, iron, zinc), have important nutritional value and are used to fortify food or as dietary supplements. Sensory characterization of some divalent salts in aqueous solutions by untrained judges has been reported in the psychophysical literature, but formal sensory evaluation by trained panels is lacking. To provide this information, a trained descriptive panel evaluated the sensory characteristics of 10 divalent salts including ferrous sulfate, chloride and gluconate; calcium chloride, lactate and glycerophosphate; zinc sulfate and chloride; and magnesium sulfate and chloride. Among the compounds tested, iron compounds were highest in metallic taste; zinc compounds had higher astringency and a glutamate-like sensation; and bitterness was pronounced for magnesium and calcium salts. Bitterness was affected by the anion in ferrous and calcium salts. Results from the trained panelists were largely consistent with the psychophysical literature using untrained judges, but provided a more comprehensive set of oral sensory attributes. PMID:16614749
Quantitative prediction of solvation free energy in octanol of organic compounds.
Delgado, Eduardo J; Jaña, Gonzalo A
2009-03-01
The free energy of solvation, ΔG_S^0, in octanol of organic compounds is quantitatively predicted from the molecular structure. The model, involving only three molecular descriptors, is obtained by multiple linear regression analysis from a data set of 147 compounds containing diverse organic functions, namely, halogenated and non-halogenated alkanes, alkenes, alkynes, aromatics, alcohols, aldehydes, ketones, amines, ethers and esters; covering a ΔG_S^0 range from about -50 to 0 kJ·mol⁻¹. The model predicts the free energy of solvation with a squared correlation coefficient of 0.93 and a standard deviation, 2.4 kJ·mol⁻¹, just marginally larger than the generally accepted value of experimental uncertainty. The involved molecular descriptors have definite physical meaning corresponding to the different intermolecular interactions occurring in the bulk liquid phase. The model is validated with an external set of 36 compounds not included in the training set.
Lienemann, Kai; Plötz, Thomas; Pestel, Sabine
2008-01-01
The aim of safety pharmacology is early detection of compound-induced side-effects. NMR-based urine analysis followed by multivariate data analysis (metabonomics) efficiently identifies differences between toxic and non-toxic compounds, but in most cases multiple administrations of the test compound are necessary. We tested the feasibility of detecting proximal tubule kidney toxicity and phospholipidosis with metabonomics techniques after single compound administration as an early safety pharmacology approach. Rats were treated orally, intravenously, by inhalation or intraperitoneally with different test compounds. Urine was collected at 0-8 h and 8-24 h after compound administration, and 1H NMR patterns were recorded from the samples. Variation of post-processing and feature extraction methods led to different views on the data. Support Vector Machines were trained on these different data sets and then aggregated as experts in an Ensemble. Finally, validity was monitored with a cross-validation study using a training, validation, and test data set. Proximal tubule kidney toxicity could be predicted with reasonable total classification accuracy (85%), specificity (88%) and sensitivity (78%). In comparison to alternative histological studies, results were obtained more quickly, compound need was reduced, and, very importantly, fewer animals were needed. In contrast, the induction of phospholipidosis by the test compounds could not be predicted using NMR-based urine analysis or the previously published biomarker PAG. NMR-based urine analysis was shown to effectively predict proximal tubule kidney toxicity after single compound administration in rats. Thus, this experimental design allows early detection of toxicity risks with relatively low amounts of compound in a reasonably short period of time.
Whitmore, Leanne S.; Davis, Ryan W.; McCormick, Robert L.; ...
2016-09-15
Screening a large number of biologically derived molecules for potential fuel compounds without recourse to experimental testing is important in identifying understudied yet valuable molecules. Experimental testing, although a valuable standard for measuring fuel properties, has several major limitations, including the requirement of testably high quantities, considerable expense, and a large amount of time. This paper discusses the development of a general-purpose fuel property tool, using machine learning, whose outcome is to screen molecules for desirable fuel properties. BioCompoundML adopts a general methodology, requiring as input only a list of training compounds (with identifiers and measured values) and a list of testing compounds (with identifiers). For the training data, BioCompoundML collects open data from the National Center for Biotechnology Information, incorporates user-provided features, imputes missing values, performs feature reduction, builds a classifier, and clusters compounds. BioCompoundML then collects data for the testing compounds, predicts class membership, and determines whether compounds are found in the range of variability of the training data set. We demonstrate this tool using three different fuel properties: research octane number (RON), threshold soot index (TSI), and melting point (MP). Here we provide measures of its success with these properties using randomized train/test measurements: average accuracy is 88% in RON, 85% in TSI, and 94% in MP; average precision is 88% in RON, 88% in TSI, and 95% in MP; and average recall is 88% in RON, 82% in TSI, and 97% in MP. The receiver operator characteristics (area under the curve) were estimated at 0.88 in RON, 0.86 in TSI, and 0.87 in MP. We also measured the success of BioCompoundML by sending 16 compounds for direct RON determination. 
Finally, we provide a screen of 1977 hydrocarbons/oxygenates within the 8696 compounds in MetaCyc, identifying compounds with high predictive strength for high or low RON.
New consensus multivariate models based on PLS and ANN studies of sigma-1 receptor antagonists.
Oliveira, Aline A; Lipinski, Célio F; Pereira, Estevão B; Honorio, Kathia M; Oliveira, Patrícia R; Weber, Karen C; Romero, Roseli A F; de Sousa, Alexsandro G; da Silva, Albérico B F
2017-10-02
The treatment of neuropathic pain is very complex and there are few drugs approved for this purpose. Among the compounds studied in the literature, sigma-1 receptor antagonists have been shown to be promising. In order to develop QSAR studies applied to 1-arylpyrazole derivatives, multivariate analyses have been performed in this work using partial least squares (PLS) and artificial neural network (ANN) methods. A PLS model has been obtained and validated with 45 compounds in the training set and 13 compounds in the test set (r²(training) = 0.761, q² = 0.656, r²(test) = 0.746, MSE(test) = 0.132 and MAE(test) = 0.258). Additionally, multi-layer perceptron ANNs (MLP-ANNs) were employed in order to propose non-linear models trained by gradient descent with momentum backpropagation. Based on MSE(test) values, the best MLP-ANN models were combined in an MLP-ANN consensus model (MLP-ANN-CM; r²(test) = 0.824, MSE(test) = 0.088 and MAE(test) = 0.197). In the end, a general consensus model (GCM) has been obtained using the PLS and MLP-ANN-CM models (r²(test) = 0.811, MSE(test) = 0.100 and MAE(test) = 0.218). In addition, the selected descriptors (GGI6, Mor23m, SRW06, H7m, MLOGP, and μ) revealed important features that should be considered when one is planning new compounds of the 1-arylpyrazole class. The multivariate models proposed in this work are definitely a powerful tool for the rational drug design of new compounds for neuropathic pain treatment.
Castillo-Garit, Juan Alberto; del Toro-Cortés, Oremia; Vega, Maria C; Rolón, Miriam; Rojas de Arias, Antonieta; Casañola-Martin, Gerardo M; Escario, José A; Gómez-Barrio, Alicia; Marrero-Ponce, Yovani; Torrens, Francisco; Abad, Concepción
2015-01-01
Two-dimensional bond-based bilinear indices and linear discriminant analysis are used in this report to perform a quantitative structure-activity relationship study to identify new trypanosomicidal compounds. A data set of 440 organic chemicals, 143 with antitrypanosomal activity and 297 having other clinical uses, is used to develop the theoretical models. Two discriminant models, computed using bond-based bilinear indices, are developed and both show accuracies higher than 86% for training and test sets. The stochastic model correctly identifies nine out of eleven compounds in a set of organic chemicals obtained from our synthetic collaborators. The in vitro antitrypanosomal activity of this set against epimastigote forms of Trypanosoma cruzi is assayed. Both models show good agreement between theoretical predictions and experimental results. Three compounds showed IC50 values for epimastigote elimination (AE) lower than 50 μM, while benznidazole, used as the reference compound, had IC50 = 54.7 μM. The IC50 for cytotoxicity of these compounds is at least 5 times greater than their IC50 for AE. Finally, the present algorithm constitutes a step forward in the search for efficient ways of discovering new antitrypanosomal compounds. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
In silico models for predicting ready biodegradability under REACH: a comparative study.
Pizzo, Fabiola; Lombardo, Anna; Manganaro, Alberto; Benfenati, Emilio
2013-10-01
REACH (Registration, Evaluation, Authorisation and Restriction of Chemicals) legislation is a new European law which aims to raise the level of protection of human health and the environment. Under REACH all chemicals manufactured or imported in quantities of more than one ton per year must be evaluated for their ready biodegradability. Ready biodegradability is also used as a screening test for persistent, bioaccumulative and toxic (PBT) substances. REACH encourages the use of non-testing methods such as QSAR (quantitative structure-activity relationship) models in order to save money and time and to reduce the number of animals used for scientific purposes. Some QSAR models are available for predicting ready biodegradability. We used a dataset of 722 compounds to test four models (VEGA, TOPKAT, BIOWIN 5 and 6, and START) and compared their performance on the basis of the following parameters: accuracy, sensitivity, specificity and Matthews correlation coefficient (MCC). Performance was analyzed from different points of view. The first calculation was done on the whole dataset, and VEGA and TOPKAT gave the best accuracy (88% and 87% respectively). Then we considered the compounds inside and outside the training set: BIOWIN 6 and 5 gave the best results for accuracy (81%) outside the training set. Another analysis examined the applicability domain (AD). VEGA had the highest value for compounds inside the AD for all the parameters taken into account. Finally, compounds outside the training set and in the AD of the models were considered to assess predictive ability. VEGA gave the best accuracy results (99%) for this group of chemicals. Generally, the START model gave poor results. Since the BIOWIN, TOPKAT and VEGA models performed well, they may be used to predict ready biodegradability. Copyright © 2013 Elsevier B.V. All rights reserved.
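The four performance parameters named in this abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' code; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts.

    tp/fn: readily biodegradable compounds predicted correctly/incorrectly;
    tn/fp: non-biodegradable compounds predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("empty confusion matrix")
    # Guard each ratio against an empty class.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

Because MCC uses all four cells of the matrix, it is less sensitive to class imbalance than accuracy alone, which is one reason to report it alongside the other three.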
Pattern recognition and classification of vibrational spectra by artificial neural networks
NASA Astrophysics Data System (ADS)
Yang, Husheng
1999-10-01
A drawback of current open-path Fourier transform infrared (OP/FT-IR) systems is that they need a human expert to determine those compounds that may be quantified from a given spectrum. In this study, three types of artificial neural networks were used to alleviate this problem. Firstly, multi-layer feed-forward neural networks were used to automatically recognize compounds in an OP/FT-IR spectrum. Each neural network was trained to recognize one compound in the presence of up to ten interferents in an OP/FT-IR spectrum. The networks were successfully used to recognize five alcohols and two chlorinated compounds in field-measured controlled-release OP/FT-IR spectra of mixtures of these compounds. It has also been demonstrated that a neural network could correctly identify a spectrum in the presence of an interferent that was not included in the training set and could also reject interferents it has not seen before. Secondly, the possibility of using one- and two- dimensional Kohonen self-organizing maps (SOMs) to recognize similarities in low-resolution vapor-phase infrared spectra without any additional information has been investigated. Both full-range reference spectra and open-path window reference spectra were used to train the networks and the trained networks were then used to classify the reference spectra into several groups. The results showed that the SOMs obtained from the two different training sets were quite different, and it is more appropriate to use the second SOM in OP/FT-IR spectrometry. Thirdly, vapor-phase FT-IR reference spectra of five alcohols along with four baseline spectra were encoded as prototype vectors for a Hopfield network. Inclusion of the baseline spectra allowed the network to classify spectra as unknowns, when the reference spectra of these compounds were not stored as prototype vectors in the network. The network could identify each of the 5 alcohols correctly even in the presence of noise and interfering compounds. 
Finally, one- and two-dimensional Kohonen SOMs were also successfully used for the unsupervised differentiation of the Fourier transform Raman spectra of hardwoods from softwoods. A semi-quantitative method that is based on the Euclidean distances of the weight matrix has been developed to assist the automatic clustering of the neurons in a two-dimensional SOM.
Dragovic, Sanja; Vermeulen, Nico P E; Gerets, Helga H; Hewitt, Philip G; Ingelman-Sundberg, Magnus; Park, B Kevin; Juhila, Satu; Snoeys, Jan; Weaver, Richard J
2016-12-01
The current test systems employed by pharmaceutical industry are poorly predictive for drug-induced liver injury (DILI). The 'MIP-DILI' project addresses this situation by the development of innovative preclinical test systems which are both mechanism-based and of physiological, pharmacological and pathological relevance to DILI in humans. An iterative, tiered approach with respect to test compounds, test systems, bioanalysis and systems analysis is adopted to evaluate existing models and develop new models that can provide validated test systems with respect to the prediction of specific forms of DILI and further elucidation of mechanisms. An essential component of this effort is the choice of compound training set that will be used to inform refinement and/or development of new model systems that allow prediction based on knowledge of mechanisms, in a tiered fashion. In this review, we focus on the selection of MIP-DILI training compounds for mechanism-based evaluation of non-clinical prediction of DILI. The selected compounds address both hepatocellular and cholestatic DILI patterns in man, covering a broad range of pharmacologies and chemistries, and taking into account available data on potential DILI mechanisms (e.g. mitochondrial injury, reactive metabolites, biliary transport inhibition, and immune responses). Known mechanisms by which these compounds are believed to cause liver injury have been described, where many if not all drugs in this review appear to exhibit multiple toxicological mechanisms. Thus, the training compounds selection offered a valuable tool to profile DILI mechanisms and to interrogate existing and novel in vitro systems for the prediction of human DILI.
QSAR Modeling of Rat Acute Toxicity by Oral Exposure
Zhu, Hao; Martin, Todd M.; Ye, Lin; Sedykh, Alexander; Young, Douglas M.; Tropsha, Alexander
2009-01-01
Few Quantitative Structure-Activity Relationship (QSAR) studies have successfully modeled large, diverse rodent toxicity endpoints. In this study, a comprehensive dataset of 7,385 compounds with their most conservative lethal dose (LD50) values has been compiled. A combinatorial QSAR approach has been employed to develop robust and predictive models of acute toxicity in rats caused by oral exposure to chemicals. To enable fair comparison between the predictive power of models generated in this study versus a commercial toxicity predictor, TOPKAT (Toxicity Prediction by Komputer Assisted Technology), a modeling subset of the entire dataset was selected that included all 3,472 compounds used in the TOPKAT’s training set. The remaining 3,913 compounds, which were not present in the TOPKAT training set, were used as the external validation set. QSAR models of five different types were developed for the modeling set. The prediction accuracy for the external validation set was estimated by determination coefficient R2 of linear regression between actual and predicted LD50 values. The use of the applicability domain threshold implemented in most models generally improved the external prediction accuracy but expectedly led to the decrease in chemical space coverage; depending on the applicability domain threshold, R2 ranged from 0.24 to 0.70. Ultimately, several consensus models were developed by averaging the predicted LD50 for every compound using all 5 models. The consensus models afforded higher prediction accuracy for the external validation dataset with the higher coverage as compared to individual constituent models. The validated consensus LD50 models developed in this study can be used as reliable computational predictors of in vivo acute toxicity. PMID:19845371
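The consensus step described above, averaging the LD50 predicted by each model for every compound and scoring the result by R2 between actual and predicted values, might be sketched as follows. This is an illustration under stated assumptions: the function names and numbers are hypothetical, R2 is taken as the squared Pearson correlation, and the applicability-domain filtering used in the study is not modeled.

```python
def consensus_prediction(model_predictions):
    """Average per-compound predictions over all models.

    model_predictions: one list of predicted values per model,
    all ordered by the same compounds.
    """
    n_models = len(model_predictions)
    return [sum(values) / n_models for values in zip(*model_predictions)]

def r_squared(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    sxy = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((a - mean_a) ** 2 for a in actual)
    return sxy * sxy / (sxx * syy)

# Hypothetical predictions from two models for three compounds.
consensus = consensus_prediction([[1.0, 2.0, 3.0], [3.0, 2.0, 5.0]])
print(consensus)
print(r_squared([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```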
3D-QSAR analysis of MCD inhibitors by CoMFA and CoMSIA.
Pourbasheer, Eslam; Aalizadeh, Reza; Ebadi, Amin; Ganjali, Mohammad Reza
2015-01-01
A three-dimensional quantitative structure-activity relationship was developed for a series of malonyl-CoA decarboxylase (MCD) antagonists using the CoMFA and CoMSIA methods. The statistical parameters for the CoMFA (q² = 0.558, r² = 0.841) and CoMSIA (q² = 0.615, r² = 0.870) models were derived from 38 training-set compounds on the basis of the selected alignment. The external predictive abilities of the models were evaluated using a test set of nine compounds. The CoMSIA method was found to have higher predictive capability than the CoMFA method. Based on the CoMSIA and CoMFA contour maps, some features that can enhance the activity of compounds as MCD antagonists were identified and used to design new compounds with better inhibition activity.
Ren, Biye
2003-01-01
Structure-boiling point relationships are studied for a series of oxo organic compounds by means of multiple linear regression (MLR) analysis. Excellent MLR models based on the recently introduced Xu index and the atom-type-based AI indices are obtained for two subsets containing 77 ethers and 107 carbonyl compounds, respectively, and for a combined set of 184 oxo compounds. The best models are tested using leave-one-out cross-validation and an external test set. The MLR model produces a correlation coefficient of r = 0.9977 and a standard error of s = 3.99 °C for the training set of 184 compounds, r(cv) = 0.9974 and s(cv) = 4.16 °C for the cross-validation set, and r(pred) = 0.9949 and s(pred) = 4.38 °C for the prediction set of 21 compounds. For the two subsets containing 77 ethers and 107 carbonyl compounds, the quality of the models is further improved: the standard errors are reduced to 3.30 and 3.02 °C, respectively. Furthermore, the results of this study indicate that the boiling points of the studied oxo compounds depend predominantly on molecular size and also on individual atom types, especially oxygen heteroatoms, owing to strong polar interactions between molecules. These excellent structure-boiling point models not only provide profound insights into the role of structural features in a molecule but also illustrate the usefulness of these indices in QSPR/QSAR modeling of complex compounds.
Predicting hydration free energies of amphetamine-type stimulants with a customized molecular model
NASA Astrophysics Data System (ADS)
Li, Jipeng; Fu, Jia; Huang, Xing; Lu, Diannan; Wu, Jianzhong
2016-09-01
Amphetamine-type stimulants (ATS) are a group of incitation and psychedelic drugs affecting the central nervous system. Physicochemical data for these compounds are essential for understanding the stimulating mechanism, for assessing their environmental impacts, and for developing new drug detection methods. However, experimental data are scarce due to tight regulation of such illicit drugs, yet conventional methods to estimate their properties are often unreliable. Here we introduce a tailor-made multiscale procedure for predicting the hydration free energies and the solvation structures of ATS molecules by a combination of first principles calculations and the classical density functional theory. We demonstrate that the multiscale procedure performs well for a training set with similar molecular characteristics and yields good agreement with a testing set not used in the training. The theoretical predictions serve as a benchmark for the missing experimental data and, importantly, provide microscopic insights into manipulating the hydrophobicity of ATS compounds by chemical modifications.
NASA Astrophysics Data System (ADS)
Qu, Rui; Liu, Shu-Shen; Zheng, Qiao-Feng; Li, Tong
2017-03-01
Concentration addition (CA) was proposed as a reasonable default approach for the ecological risk assessment of chemical mixtures. However, CA cannot predict the toxicity of a mixture in some effect zones if not all components have definite effective concentrations at the given effect, for example when some compounds induce hormesis. In this paper, we developed a new method for the toxicity prediction of various types of binary mixtures: an interpolation method based on the Delaunay triangulation (DT) and Voronoi tessellation (VT) as well as the training set of direct equipartition ray design (EquRay) mixtures, abbreviated IDVequ. First, the EquRay was employed to design the basic concentration compositions of five binary mixture rays. The toxic effects of single components and mixture rays at different times and various concentrations were determined by time-dependent microplate toxicity analysis. Second, the concentration-toxicity data of the pure components and the various mixture rays served as a training set. The DT triangles and VT polygons were constructed from the concentration vertices in the training set. The toxicities of unknown mixtures were predicted by linear interpolation and natural neighbor interpolation of the vertices. The IDVequ successfully predicted the toxicities of various types of binary mixtures. PMID:28287626
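Linear interpolation inside one Delaunay triangle amounts to barycentric weighting of the effects measured at its three vertices. The sketch below illustrates that single step only. It assumes the triangle containing the query mixture is already known, it does not build the triangulation or perform natural neighbor interpolation, and all names and numbers are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    """Barycentric coordinates of point p in triangle (a, b, c).

    Each point is a pair (conc_A, conc_B) of component concentrations.
    """
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1.0 - w1 - w2

def interpolate_effect(p, vertices, effects):
    """Linearly interpolate the effect at p from the three vertex effects."""
    weights = barycentric_weights(p, *vertices)
    if min(weights) < -1e-12:
        raise ValueError("point lies outside the triangle")
    return sum(w * e for w, e in zip(weights, effects))

# Hypothetical training vertices (concentrations) and measured effects.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
effects = [0.0, 0.4, 0.8]
predicted = interpolate_effect((0.25, 0.25), triangle, effects)
print(predicted)
```

In practice the triangulation itself would usually come from a computational-geometry library; that part is deliberately left out here.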
Quantitative Prediction of Solvation Free Energy in Octanol of Organic Compounds
Delgado, Eduardo J.; Jaña, Gonzalo A.
2009-01-01
The free energy of solvation, ΔGS0, in octanol of organic compounds is quantitatively predicted from the molecular structure. The model, involving only three molecular descriptors, is obtained by multiple linear regression analysis from a data set of 147 compounds containing diverse organic functions, namely, halogenated and non-halogenated alkanes, alkenes, alkynes, aromatics, alcohols, aldehydes, ketones, amines, ethers and esters, covering a ΔGS0 range from about −50 to 0 kJ·mol−1. The model predicts the free energy of solvation with a squared correlation coefficient of 0.93 and a standard deviation of 2.4 kJ·mol−1, just marginally larger than the generally accepted value of experimental uncertainty. The molecular descriptors involved have definite physical meaning corresponding to the different intermolecular interactions occurring in the bulk liquid phase. The model is validated with an external set of 36 compounds not included in the training set. PMID:19399236
DOE Office of Scientific and Technical Information (OSTI.GOV)
von Lilienfeld, O. Anatole; Ramakrishnan, Raghunathan; Rupp, Matthias
We introduce a fingerprint representation of molecules based on a Fourier series of atomic radial distribution functions. This fingerprint is unique (except for chirality), continuous, and differentiable with respect to atomic coordinates and nuclear charges. It is invariant with respect to translation, rotation, and nuclear permutation, and requires no preconceived knowledge about chemical bonding, topology, or electronic orbitals. As such, it meets many important criteria for a good molecular representation, suggesting its usefulness for machine learning models of molecular properties trained across chemical compound space. To assess the performance of this new descriptor, we have trained machine learning models of molecular enthalpies of atomization for training sets with up to 10 k organic molecules, drawn at random from a published set of 134 k organic molecules with an average atomization enthalpy of over 1770 kcal/mol. We validate the descriptor on all remaining molecules of the 134 k set. For a training set of 10 k molecules, the fingerprint descriptor achieves a mean absolute error of 8.0 kcal/mol. This is slightly worse than the performance attained using the Coulomb matrix, another popular alternative, reaching 6.2 kcal/mol for the same training and test sets. (c) 2015 Wiley Periodicals, Inc.
A preliminary MTD-PLS study for androgen receptor binding of steroid compounds
NASA Astrophysics Data System (ADS)
Bora, Alina; Seclaman, E.; Kurunczi, L.; Funar-Timofei, Simona
The relative binding affinities (RBA) of a series of 30 steroids for Human Androgen Receptor (AR) were used to initiate an MTD-PLS study. The 3D structures of all the compounds were obtained through geometry optimization in the framework of the AM1 semiempirical quantum chemical method. The MTD hypermolecule (HM) was constructed by superposing these structures on the AR-bound dihydrotestosterone (DHT) skeleton obtained from PDB (AR complex, ID 1I37). The parameters characterizing the HM vertices were collected using AM1 charges, XlogP fragmental values, calculated fragmental polarizabilities (from refractivities), volumes, and H-bond parameters (Raevsky's thermodynamically derived scale). The resulting QSAR data matrix was submitted to PCA (Principal Component Analysis) and PLS (Projections in Latent Structures) procedures (SIMCA P 9.0); five compounds were selected as the test set, and the remaining 25 molecules were used as the training set. In the PLS procedure supplementary chemical information was introduced, i.e. the steric effect was always considered detrimental, and the hydrophobic and van der Waals interactions were constrained to be beneficial. The initial PLS model using the entire training set has the following characteristics: R2Y = 0.584, Q2 = 0.344. Based on the distance-to-model criteria (DMODX and DMODY), five compounds were eliminated and the final model obtained had the following characteristics: R2Y = 0.891, Q2 = 0.591. For this model, the external predictivity on the test set was unsatisfactory. A tentative explanation for this behavior is the weak information content of the input QSAR matrix for the present series compared with other successful MTD-PLS models published elsewhere.
Cheng, Zhanzhan; Zhou, Shuigeng; Wang, Yang; Liu, Hui; Guan, Jihong; Chen, Yi-Ping Phoebe
2016-05-18
Prediction of compound-protein interactions (CPIs) aims to find new compound-protein pairs where a protein is targeted by at least one compound, which is a crucial step in new drug design. Currently, a number of machine learning based methods have been developed to predict new CPIs in the literature. However, as there is not yet any publicly available set of validated negative CPIs, most existing machine learning based approaches use randomly selected unknown interactions (not validated CPIs) as the negative examples to train classifiers for predicting new CPIs. This is not quite reasonable and unavoidably impacts the CPI prediction performance. In this paper, we simply take the unknown CPIs as unlabeled examples, and propose a new method called PUCPI (the abbreviation of PU learning for Compound-Protein Interaction identification) that employs biased-SVM (Support Vector Machine) to predict CPIs using only positive and unlabeled examples. PU learning is a class of learning methods that learns from positive and unlabeled (PU) samples. To the best of our knowledge, this is the first work that identifies CPIs using only positive and unlabeled examples. We first collect known CPIs as positive examples and then randomly select compound-protein pairs not in the positive set as unlabeled examples. For each CPI/compound-protein pair, we extract protein domains as protein features and compound substructures as chemical features, then take the tensor product of the corresponding compound features and protein features as the feature vector of the CPI/compound-protein pair. After that, biased-SVM is employed to train classifiers on different datasets of CPIs and compound-protein pairs. Experiments over various datasets show that our method outperforms six typical classifiers, including random forest, L1- and L2-regularized logistic regression, naive Bayes, SVM and k-nearest neighbor (kNN), and three types of existing CPI prediction models. 
Source code, datasets and related documents of PUCPI are available at: http://admis.fudan.edu.cn/projects/pucpi.html.
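The pair feature vector described in this abstract, the tensor product of a compound's substructure features and a protein's domain features, can be illustrated with plain lists. This is a minimal sketch with hypothetical binary fingerprints, not the PUCPI source code. The biased-SVM step, which penalizes errors on positive and unlabeled examples differently, is not shown.

```python
def tensor_product_features(compound_features, protein_features):
    """Flattened outer product of a compound vector and a protein vector.

    Entry (i, j) is nonzero only when compound substructure i and
    protein domain j are both present in the pair.
    """
    return [c * p for c in compound_features for p in protein_features]

# Hypothetical binary fingerprints: 3 substructures, 2 protein domains.
compound = [1, 0, 1]
protein = [1, 1]
pair_vector = tensor_product_features(compound, protein)
print(pair_vector)
```

The resulting vector has length len(compound) * len(protein), so sparse representations are usually preferable for realistic fingerprint sizes.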
Minimizing DILI risk in drug discovery - A screening tool for drug candidates.
Schadt, S; Simon, S; Kustermann, S; Boess, F; McGinnis, C; Brink, A; Lieven, R; Fowler, S; Youdim, K; Ullah, M; Marschmann, M; Zihlmann, C; Siegrist, Y M; Cascais, A C; Di Lenarda, E; Durr, E; Schaub, N; Ang, X; Starke, V; Singer, T; Alvarez-Sanchez, R; Roth, A B; Schuler, F; Funk, C
2015-12-25
Drug-induced liver injury (DILI) is a leading cause of acute hepatic failure and a major reason for market withdrawal of drugs. Idiosyncratic DILI is multifactorial, with unclear dose-dependency and poor predictability since the underlying patient-related susceptibilities are not sufficiently understood. Because of these limitations, a pharmaceutical research option would be to reduce the compound-related risk factors in the drug-discovery process. Here we describe the development and validation of a methodology for the assessment of DILI risk of drug candidates. As a training set, 81 marketed or withdrawn compounds with differing DILI rates - according to the FDA categorization - were tested in a combination of assays covering different mechanisms and endpoints contributing to human DILI. These include the generation of reactive metabolites (CYP3A4 time-dependent inhibition and glutathione adduct formation), inhibition of the human bile salt export pump (BSEP), mitochondrial toxicity and cytotoxicity (fibroblasts and human hepatocytes). Different approaches for dose- and exposure-based calibrations were assessed and the same parameters applied to a test set of 39 different compounds. We achieved a similar performance to the training set with an overall accuracy of 79% correctly predicted, a sensitivity of 76% and a specificity of 82%. This test system may be applied in a prospective manner to reduce the risk of idiosyncratic DILI of drug candidates. Copyright © 2015 Elsevier B.V. All rights reserved.
Prediction of Skin Sensitization with a Particle Swarm Optimized Support Vector Machine
Yuan, Hua; Huang, Jianping; Cao, Chenzhong
2009-01-01
Skin sensitization is the most commonly reported occupational illness, causing much suffering to a wide range of people. Identification and labeling of environmental allergens is urgently required to protect people from skin sensitization. The guinea pig maximization test (GPMT) and murine local lymph node assay (LLNA) are the two most important in vivo models for identification of skin sensitizers. In order to reduce the number of animal tests, quantitative structure-activity relationships (QSARs) are strongly encouraged in the assessment of skin sensitization of chemicals. This paper has investigated the skin sensitization potential of 162 compounds with LLNA results and 92 compounds with GPMT results using a support vector machine. A particle swarm optimization algorithm was implemented for feature selection from a large number of molecular descriptors calculated by Dragon. For the LLNA data set, the classification accuracies are 95.37% and 88.89% for the training and the test sets, respectively. For the GPMT data set, the classification accuracies are 91.80% and 90.32% for the training and the test sets, respectively. The classification performances were greatly improved compared to those reported in the literature, indicating that the support vector machine optimized by particle swarm in this paper is competent for the identification of skin sensitizers. PMID:19742136
Dou, Ying; Mi, Hong; Zhao, Lingzhi; Ren, Yuqiu; Ren, Yulin
2006-09-01
Radial basis function (RBF) networks, the second most popular type of artificial neural network (ANN), have been applied to the quantitative analysis of drugs during the last decade. In this paper, two components (aspirin and phenacetin) were simultaneously determined in compound aspirin tablets by using near-infrared (NIR) spectroscopy and RBF networks. The total database was randomly divided into a training set (50) and a testing set (17). Different preprocessing methods (standard normal variate (SNV), multiplicative scatter correction (MSC), first-derivative and second-derivative) were applied to two sets of NIR spectra of compound aspirin tablets with different concentrations of the two active components and compared with each other. The RBF networks were then trained with the nearest neighbor clustering algorithm (NNCA), and models were selected using a cross-validation technique. Results show that using RBF networks to analyze tablets quantitatively is reliable, and the best RBF model was obtained with first-derivative spectra.
Padró, Juan M; Ponzinibbio, Agustín; Mesa, Leidy B Agudelo; Reta, Mario
2011-03-01
The partition coefficients, P(IL/w), for different probe molecules as well as for compounds of biological interest between the room-temperature ionic liquids (RTILs) 1-butyl-3-methylimidazolium hexafluorophosphate, [BMIM][PF(6)], 1-hexyl-3-methylimidazolium hexafluorophosphate, [HMIM][PF(6)], 1-octyl-3-methylimidazolium tetrafluoroborate, [OMIM][BF(4)] and water were accurately measured. [BMIM][PF(6)] and [OMIM][BF(4)] were synthesized by adapting a procedure from the literature to a simpler, faster, single-vessel methodology with much lower consumption of organic solvent. We employed the solvation-parameter model to elucidate the general chemical interactions involved in RTIL/water partitioning. For this purpose, we selected different solute descriptor parameters that measure polarity, polarizability, hydrogen-bond-donor and hydrogen-bond-acceptor interactions, and cavity formation for a set of specifically selected probe molecules (the training set). The resulting multiparametric equations were used to predict the partition coefficients for compounds not present in the training set (the test set), most being of biological interest. Partial solubility of the ionic liquid in water (and of water in the ionic liquid) was taken into account to explain the results. This effect has not been considered in depth to date. Solute descriptors were obtained from the literature, when available, or else calculated with commercial software. An excellent agreement between calculated and experimental log P(IL/w) values was obtained, which demonstrated that the resulting multiparametric equations are robust and allow predicting partitioning for any organic molecule in the biphasic systems studied.
Effects of a Modified German Volume Training Program on Muscular Hypertrophy and Strength.
Amirthalingam, Theban; Mavros, Yorgi; Wilson, Guy C; Clarke, Jillian L; Mitchell, Lachlan; Hackett, Daniel A
2017-11-01
Amirthalingam, T, Mavros, Y, Wilson, GC, Clarke, JL, Mitchell, L, and Hackett, DA. Effects of a modified German volume training program on muscular hypertrophy and strength. J Strength Cond Res 31(11): 3109-3119, 2017. German Volume Training (GVT), or the 10 sets method, has been used for decades by weightlifters to increase muscle mass. To date, no study has directly examined the training adaptations after GVT. The purpose of this study was to investigate the effect of a modified GVT intervention on muscular hypertrophy and strength. Nineteen healthy men were randomly assigned to 6 weeks of 10 or 5 sets of 10 repetitions for specific compound resistance exercises included in a split routine performed 3 times per week. Total and regional lean body mass, muscle thickness, and muscle strength were measured before and after the training program. Across groups, there were significant increases in lean body mass measures; however, greater increases in trunk (p = 0.043; effect size [ES] = -0.21) and arm (p = 0.083; ES = -0.25) lean body mass favored the 5-SET group. No significant increases were found for leg lean body mass or measures of muscle thickness across groups. Significant increases were found across groups for muscular strength, with greater increases in the 5-SET group for bench press (p = 0.014; ES = -0.43) and lat pull-down (p = 0.003; ES = -0.54). It seems that the modified GVT program is no more effective than performing 5 sets per exercise for increasing muscle hypertrophy and strength. To maximize hypertrophic training effects, it is recommended that 4-6 sets per exercise be performed, as it seems gains will plateau beyond this set range and may even regress due to overtraining.
Nikolov, Nikolai G; Dybdahl, Marianne; Jónsdóttir, Svava Ó; Wedebye, Eva B
2014-11-01
Ionization is a key factor in hERG K(+) channel blocking, and acids and zwitterions are known to be less probable hERG blockers than bases and neutral compounds. However, a considerable number of acidic compounds block hERG, and the physico-chemical attributes which discriminate acidic blockers from acidic non-blockers have not been fully elucidated. We propose a rule for prediction of hERG blocking by acids and zwitterionic ampholytes based on thresholds for only three descriptors related to acidity, size and reactivity. The training set of 153 acids and zwitterionic ampholytes was predicted with a concordance of 91% by a decision tree based on the rule. Two external validations were performed with sets of 35 and 48 observations, respectively, both showing concordances of 91%. In addition, a global QSAR model of hERG blocking was constructed based on a large diverse training set of 1374 chemicals covering all ionization classes, externally validated showing high predictivity and compared to the decision tree. The decision tree was found to be superior for the acids and zwitterionic ampholytes classes. Copyright © 2014 Elsevier Ltd. All rights reserved.
Allison, Thomas C
2016-03-03
Rate constants for reactions of chemical compounds with hydroxyl radical are a key quantity used in evaluating the global warming potential of a substance. Experimental determination of these rate constants is essential, but it can also be difficult and time-consuming to produce. High-level quantum chemistry predictions of the rate constant can suffer from the same issues. Therefore, it is valuable to devise estimation schemes that can give reasonable results on a variety of chemical compounds. In this article, the construction and training of an artificial neural network (ANN) for the prediction of rate constants at 298 K for reactions of hydroxyl radical with a diverse set of molecules is described. Input to the ANN consists of counts of the chemical bonds and bends present in the target molecule. The ANN is trained using 792 (•)OH reaction rate constants taken from the NIST Chemical Kinetics Database. The mean unsigned percent error (MUPE) for the training set is 12%, and the MUPE of the testing set is 51%. It is shown that the present methodology yields rate constants of reasonable accuracy for a diverse set of inputs. The results are compared to high-quality literature values and to another estimation scheme. This ANN methodology is expected to be of use in a wide range of applications for which (•)OH reaction rate constants are required. The model uses only information that can be gathered from a 2D representation of the molecule, making the present approach particularly appealing, especially for screening applications.
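The error measure quoted above can be illustrated with a minimal sketch. It assumes that MUPE is the mean of |predicted - observed| / |observed| expressed in percent; the abstract does not spell out the formula, and the rate constants below are hypothetical values chosen only for illustration.

```python
def mean_unsigned_percent_error(predicted, observed):
    """Mean of |predicted - observed| / |observed|, in percent (assumed definition)."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, o in zip(predicted, observed):
        total += abs(p - o) / abs(o)
    return 100.0 * total / len(observed)


# Hypothetical rate constants (cm^3 molecule^-1 s^-1), not taken from the paper.
observed = [1.0e-12, 2.0e-12, 4.0e-12]
predicted = [1.1e-12, 1.5e-12, 4.0e-12]

mupe = mean_unsigned_percent_error(predicted, observed)
print(f"MUPE = {mupe:.1f}%")
```

Because the error is relative, a rate constant that is off by a factor of two contributes the same amount whether it is large or small, which suits data spanning several orders of magnitude.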
Netzeva, Tatiana I; Gallegos Saliner, Ana; Worth, Andrew P
2006-05-01
The aim of the present study was to illustrate that it is possible and relatively straightforward to compare the domain of applicability of a quantitative structure-activity relationship (QSAR) model in terms of its physicochemical descriptors with a large inventory of chemicals. A training set of 105 chemicals with data for relative estrogenic gene activation, obtained in a recombinant yeast assay, was used to develop the QSAR. A binary classification model for predicting active versus inactive chemicals was developed using classification tree analysis and two descriptors with a clear physicochemical meaning (octanol-water partition coefficient, or log Kow, and the number of hydrogen bond donors, or n(Hdon)). The model demonstrated a high overall accuracy (90.5%), with a sensitivity of 95.9% and a specificity of 78.1%. The robustness of the model was evaluated using the leave-many-out cross-validation technique, whereas the predictivity was assessed using an artificial external test set composed of 12 compounds. The domain of the QSAR training set was compared with the chemical space covered by the European Inventory of Existing Commercial Chemical Substances (EINECS), as incorporated in the CDB-EC software, in the log Kow / n(Hdon) plane. The results showed that the training set and, therefore, the applicability domain of the QSAR model covers a small part of the physicochemical domain of the inventory, even though a simple method for defining the applicability domain (ranges in the descriptor space) was used. However, a large number of compounds are located within the narrow descriptor window.
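The range-based definition of the applicability domain mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the descriptor keys `log_kow` and `n_hdon`, the helper names, and all numeric values are hypothetical and are not taken from the study or from the CDB-EC software.

```python
def descriptor_ranges(training_set):
    """Return {descriptor: (min, max)} over the training compounds."""
    names = training_set[0].keys()
    return {
        name: (
            min(compound[name] for compound in training_set),
            max(compound[name] for compound in training_set),
        )
        for name in names
    }


def in_domain(compound, ranges):
    """True if every descriptor lies inside the training-set range."""
    return all(low <= compound[name] <= high for name, (low, high) in ranges.items())


# Hypothetical training compounds described by log Kow and H-bond donor count.
training = [
    {"log_kow": 1.2, "n_hdon": 0},
    {"log_kow": 3.5, "n_hdon": 1},
    {"log_kow": 5.0, "n_hdon": 2},
]
ranges = descriptor_ranges(training)

# Hypothetical inventory entries to compare against the domain.
inventory = [
    {"log_kow": 2.0, "n_hdon": 1},  # inside both ranges
    {"log_kow": 7.1, "n_hdon": 0},  # log Kow too high
    {"log_kow": 4.0, "n_hdon": 4},  # too many H-bond donors
    {"log_kow": 1.2, "n_hdon": 2},  # on the boundary, counted as inside
]
coverage = sum(in_domain(c, ranges) for c in inventory) / len(inventory)
print(f"fraction of inventory inside the domain: {coverage:.2f}")
```

A rectangular range check like this is the simplest possible domain definition; it ignores empty regions inside the box, which is one reason it tends to overestimate coverage.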
Predicting the activity of drugs for a group of imidazopyridine anticoccidial compounds.
Si, Hongzong; Lian, Ning; Yuan, Shuping; Fu, Aiping; Duan, Yun-Bo; Zhang, Kejun; Yao, Xiaojun
2009-10-01
Gene expression programming (GEP) is a novel machine learning technique. Here, GEP is used to build a nonlinear quantitative structure-activity relationship model for predicting the IC(50) of imidazopyridine anticoccidial compounds. The model is based on descriptors calculated from the molecular structure. Four descriptors are selected from the descriptor pool by the heuristic method (HM) to build a multivariable linear model. The GEP method produced a nonlinear quantitative model with a correlation coefficient and a mean error of 0.96 and 0.24 for the training set and 0.91 and 0.52 for the test set, respectively. The GEP predictions are shown to be in good agreement with the experimental values.
Predicting novel substrates for enzymes with minimal experimental effort with active learning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pertusi, Dante A.; Moura, Matthew E.; Jeffryes, James G.
Enzymatic substrate promiscuity is more ubiquitous than previously thought, with significant consequences for understanding metabolism and its application to biocatalysis. This realization has given rise to the need for efficient characterization of enzyme promiscuity. Enzyme promiscuity is currently characterized with a limited number of human-selected compounds that may not be representative of the enzyme's versatility. While testing large numbers of compounds may be impractical, computational approaches can exploit existing data to determine the most informative substrates to test next, thereby more thoroughly exploring an enzyme's versatility. To demonstrate this, we used existing studies and tested compounds for four different enzymes, developed support vector machine (SVM) models using these datasets, and selected additional compounds for experiments using an active learning approach. SVMs trained on a chemically diverse set of compounds were discovered to achieve maximum accuracies of ~80% using ~33% fewer compounds than datasets based on all compounds tested in existing studies. Active learning-selected compounds for testing resolved apparent conflicts in the existing training data, while adding diversity to the dataset. The application of these algorithms to wide arrays of metabolic enzymes would result in a library of SVMs that can predict high-probability promiscuous enzymatic reactions and could prove a valuable resource for the design of novel metabolic pathways.
Predicting novel substrates for enzymes with minimal experimental effort with active learning.
Pertusi, Dante A; Moura, Matthew E; Jeffryes, James G; Prabhu, Siddhant; Walters Biggs, Bradley; Tyo, Keith E J
2017-11-01
Enzymatic substrate promiscuity is more ubiquitous than previously thought, with significant consequences for understanding metabolism and its application to biocatalysis. This realization has given rise to the need for efficient characterization of enzyme promiscuity. Enzyme promiscuity is currently characterized with a limited number of human-selected compounds that may not be representative of the enzyme's versatility. While testing large numbers of compounds may be impractical, computational approaches can exploit existing data to determine the most informative substrates to test next, thereby more thoroughly exploring an enzyme's versatility. To demonstrate this, we used existing studies and tested compounds for four different enzymes, developed support vector machine (SVM) models using these datasets, and selected additional compounds for experiments using an active learning approach. SVMs trained on a chemically diverse set of compounds were discovered to achieve maximum accuracies of ~80% using ~33% fewer compounds than datasets based on all compounds tested in existing studies. Active learning-selected compounds for testing resolved apparent conflicts in the existing training data, while adding diversity to the dataset. The application of these algorithms to wide arrays of metabolic enzymes would result in a library of SVMs that can predict high-probability promiscuous enzymatic reactions and could prove a valuable resource for the design of novel metabolic pathways. Copyright © 2017 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
Han, Bucong; Ma, Xiaohua; Zhao, Ruiying; Zhang, Jingxian; Wei, Xiaona; Liu, Xianghui; Liu, Xin; Zhang, Cunlong; Tan, Chunyan; Jiang, Yuyang; Chen, Yuzong
2012-11-23
Src plays various roles in tumour progression, invasion, metastasis, angiogenesis and survival. It is one of the multiple targets of multi-target kinase inhibitors in clinical uses and trials for the treatment of leukemia and other cancers. These successes and appearances of drug resistance in some patients have raised significant interest and efforts in discovering new Src inhibitors. Various in-silico methods have been used in some of these efforts. It is desirable to explore additional in-silico methods, particularly those capable of searching large compound libraries at high yields and reduced false-hit rates. We evaluated support vector machines (SVM) as virtual screening tools for searching Src inhibitors from large compound libraries. SVM trained and tested by 1,703 inhibitors and 63,318 putative non-inhibitors correctly identified 93.53%~ 95.01% inhibitors and 99.81%~ 99.90% non-inhibitors in 5-fold cross validation studies. SVM trained by 1,703 inhibitors reported before 2011 and 63,318 putative non-inhibitors correctly identified 70.45% of the 44 inhibitors reported since 2011, and predicted as inhibitors 44,843 (0.33%) of 13.56M PubChem, 1,496 (0.89%) of 168 K MDDR, and 719 (7.73%) of 9,305 MDDR compounds similar to the known inhibitors. SVM showed comparable yield and reduced false hit rates in searching large compound libraries compared to the similarity-based and other machine-learning VS methods developed from the same set of training compounds and molecular descriptors. We tested three virtual hits of the same novel scaffold from in-house chemical libraries not reported as Src inhibitor, one of which showed moderate activity. SVM may be potentially explored for searching Src inhibitors from large compound libraries at low false-hit rates.
Mihalik, Jason P; Libby, Jeremiah J; Battaglini, Claudio L; McMurray, Robert G
2008-01-01
The purpose of this study was to determine whether there were differences in vertical jump height and lower body power production gains between complex and compound training programs. A secondary purpose was to determine whether differences in gains were observed at a faster rate between complex and compound training programs. Thirty-one college-aged club volleyball players (11 men and 20 women) were assigned into either a complex training group or a compound training group based on gender and pre-training performance measures. Both groups trained twice per week for 4 weeks. Work was equated between the 2 groups. Complex training alternated between resistance and plyometric exercises on each training day; whereas, compound training consisted of resistance training on one day and plyometric training on the other. Our analyses showed significant improvements in vertical jump height in both training groups after only 3 weeks of training (P < 0.0001); vertical jump height increased by approximately 5% and 9% in the complex and compound training groups, respectively. However, neither group improved significantly better than the other, nor did either group experience faster gains in vertical leap or power output. The results of this study suggest that performing a minimum of 3 weeks of either complex or compound training is effective for improving vertical jump height and power output; thus, coaches should choose the program which best suits their training schedules.
NASA Astrophysics Data System (ADS)
Krieger, Ulrich K.; Siegrist, Franziska; Marcolli, Claudia; Emanuelsson, Eva U.; Gøbel, Freya M.; Bilde, Merete; Marsh, Aleksandra; Reid, Jonathan P.; Huisman, Andrew J.; Riipinen, Ilona; Hyttinen, Noora; Myllys, Nanna; Kurtén, Theo; Bannan, Thomas; Percival, Carl J.; Topping, David
2018-01-01
To predict atmospheric partitioning of organic compounds between gas and aerosol particle phase based on explicit models for gas phase chemistry, saturation vapor pressures of the compounds need to be estimated. Estimation methods based on functional group contributions require training sets of compounds with well-established saturation vapor pressures. However, vapor pressures of semivolatile and low-volatility organic molecules at atmospheric temperatures reported in the literature often differ by several orders of magnitude between measurement techniques. These discrepancies exceed the stated uncertainty of each technique, which is generally reported to be smaller than a factor of 2. At present, there is no general reference technique for measuring saturation vapor pressures of atmospherically relevant compounds with low vapor pressures at atmospheric temperatures. To address this problem, we measured vapor pressures with different techniques over a wide temperature range for intercomparison and to establish a reliable training set. We determined saturation vapor pressures for the homologous series of polyethylene glycols (H-(O-CH2-CH2)n-OH) for n = 3 to n = 8, ranging in vapor pressure at 298 K from 10⁻⁷ to 5×10⁻² Pa, and compared them with quantum chemistry calculations. Such a homologous series provides a reference set that covers several orders of magnitude in saturation vapor pressure, allowing a critical assessment of the lower limits of detection of vapor pressures for the different techniques as well as permitting the identification of potential sources of systematic error. Also, internal consistency within the series allows outlying data to be rejected more easily. Most of the measured vapor pressures agreed within the stated uncertainty range. Deviations mostly occurred for vapor pressure values approaching the lower detection limit of a technique.
The good agreement between the measurement techniques (some of which are sensitive to the mass accommodation coefficient and some not) suggests that the mass accommodation coefficients of the studied compounds are close to unity. The quantum chemistry calculations were about 1 order of magnitude higher than the measurements. We find that extrapolation of vapor pressures from elevated to atmospheric temperatures is permissible over a range of about 100 K for these compounds, suggesting that measurements should be performed best at temperatures yielding the highest-accuracy data, allowing subsequent extrapolation to atmospheric temperatures.
New public QSAR model for carcinogenicity
2010-01-01
Background: One of the main goals of the new chemical regulation REACH (Registration, Evaluation and Authorization of Chemicals) is to fill the gaps in data concerning the properties of chemicals that affect human health. (Q)SAR models are accepted as a suitable source of information. The EU-funded CAESAR project aimed to develop models for prediction of 5 endpoints for regulatory purposes. Carcinogenicity is one of the endpoints under consideration. Results: Models for prediction of carcinogenic potency according to specific requirements of chemical regulation were developed. A dataset of 805 non-congeneric chemicals extracted from the Carcinogenic Potency Database (CPDBAS) was used. The Counter Propagation Artificial Neural Network (CP ANN) algorithm was implemented. In the article, two alternative models for prediction of carcinogenicity are described. The first model employed eight MDL descriptors (model A) and the second one twelve Dragon descriptors (model B). CAESAR's models have been assessed according to the OECD principles for the validation of QSAR. For the model validity we used a wide series of statistical checks. Models A and B yielded accuracy on the training set (644 compounds) equal to 91% and 89%, respectively; the accuracy on the test set (161 compounds) was 73% and 69%, while the specificity was 69% and 61%, respectively. Sensitivity in both cases was equal to 75%. The accuracy of the leave-20%-out cross-validation for the training set of models A and B was equal to 66% and 62%, respectively. To verify whether the models perform correctly on new compounds, an external validation was carried out. The external test set was composed of 738 compounds. We obtained accuracy of external validation equal to 61.4% and 60.0%, sensitivity 64.0% and 61.8%, and specificity equal to 58.9% and 58.4%, respectively, for models A and B.
Conclusion: Carcinogenicity is a particularly important endpoint, and it is expected that QSAR models will not replace human experts' opinions and conventional methods. However, we believe that a combination of several methods will provide useful support to the overall evaluation of carcinogenicity. In the present paper, models for classification of carcinogenic compounds using MDL and Dragon descriptors were developed. The models could be used to set priorities among chemicals for further testing. The models at the CAESAR site were implemented in Java and are publicly accessible. PMID:20678182
Breath-based biomarkers for tuberculosis
NASA Astrophysics Data System (ADS)
Kolk, Arend H. J.; van Berkel, Joep J. B. N.; Claassens, Mareli M.; Walters, Elisabeth; Kuijper, Sjoukje; Dallinga, Jan W.; van Schooten, Fredrik-Jan
2012-06-01
We investigated the potential of breath analysis by gas chromatography - mass spectrometry (GC-MS) to discriminate between samples collected prospectively from patients with suspected tuberculosis (TB). Samples were obtained in a TB endemic setting in South Africa where 28% of the culture proven TB patients had a Ziehl-Neelsen (ZN) negative sputum smear. A training set of breath samples from 50 sputum culture proven TB patients and 50 culture negative non-TB patients was analyzed by GC-MS. A classification model with 7 compounds resulted in a training set with a sensitivity of 72%, specificity of 86% and accuracy of 79% compared with culture. The classification model was validated with an independent set of breath samples from 21 TB and 50 non-TB patients. A sensitivity of 62%, specificity of 84% and accuracy of 77% was found. We conclude that the 7 volatile organic compounds (VOCs) that discriminate breath samples from TB and non-TB patients in our study population are probably host-response related VOCs and are not derived from the VOCs secreted by M. tuberculosis. It is concluded that at present GC-MS breath analysis is able to differentiate between TB and non-TB breath samples even among patients with a negative ZN sputum smear but a positive culture for M. tuberculosis. Further research is required to improve the sensitivity and specificity before this method can be used in routine laboratories.
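As a worked check of the figures above, the sketch below recomputes sensitivity, specificity and accuracy from confusion-matrix counts. The counts are not reported in the abstract; they are back-calculated here from the stated percentages and group sizes (for example, 72% of 50 TB patients gives 36 true positives), so they should be read as an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Training set: 50 TB and 50 non-TB patients.
# Assumed counts: 36/50 = 72% sensitivity, 43/50 = 86% specificity.
train = classification_metrics(tp=36, fn=14, tn=43, fp=7)

# Validation set: 21 TB and 50 non-TB patients.
# Assumed counts: 13/21 is about 62% sensitivity, 42/50 = 84% specificity.
valid = classification_metrics(tp=13, fn=8, tn=42, fp=8)

for name, (sens, spec, acc) in (("training", train), ("validation", valid)):
    print(f"{name}: sensitivity={sens:.0%} specificity={spec:.0%} accuracy={acc:.0%}")
```

With these counts the accuracies come out at 79/100 and 55/71, which round to the 79% and 77% quoted in the abstract.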
DOE Office of Scientific and Technical Information (OSTI.GOV)
Uehara, Takeki; Minowa, Yohsuke
2011-09-15
The present study was performed to develop a robust gene-based prediction model for early assessment of potential hepatocarcinogenicity of chemicals in rats by using our toxicogenomics database, TG-GATEs (Genomics-Assisted Toxicity Evaluation System developed by the Toxicogenomics Project in Japan). The positive training set consisted of high- or middle-dose groups that received 6 different non-genotoxic hepatocarcinogens during a 28-day period. The negative training set consisted of high- or middle-dose groups of 54 non-carcinogens. Support vector machine combined with wrapper-type gene selection algorithms was used for modeling. Consequently, our best classifier yielded prediction accuracies for hepatocarcinogenicity of 99% sensitivity and 97% specificity in the training data set, and false positive prediction was almost completely eliminated. Pathway analysis of feature genes revealed that the mitogen-activated protein kinase p38- and phosphatidylinositol-3-kinase-centered interactome and the v-myc myelocytomatosis viral oncogene homolog-centered interactome were the 2 most significant networks. The usefulness and robustness of our predictor were further confirmed in an independent validation data set obtained from the public database. Interestingly, similar positive predictions were obtained in several genotoxic hepatocarcinogens as well as non-genotoxic hepatocarcinogens. These results indicate that the expression profiles of our newly selected candidate biomarker genes might be common characteristics in the early stage of carcinogenesis for both genotoxic and non-genotoxic carcinogens in the rat liver. Our toxicogenomic model might be useful for the prospective screening of hepatocarcinogenicity of compounds and prioritization of compounds for carcinogenicity testing. Highlights: We developed a toxicogenomic model to predict hepatocarcinogenicity of chemicals. The optimized model consisting of 9 probes had 99% sensitivity and 97% specificity. This model enables us to detect genotoxic as well as non-genotoxic hepatocarcinogens.
A modification of the Hammett equation for predicting ionisation constants of p-vinyl phenols.
Sipilä, Julius; Nurmi, Harri; Kaukonen, Ann Marie; Hirvonen, Jouni; Taskinen, Jyrki; Yli-Kauhaluoma, Jari
2005-01-01
Currently there are several compounds used as drugs or studied as new chemical entities, which have an electron withdrawing group connected to a vinylic double bond in a phenolic or catecholic core structure. These compounds share a common feature--current computational methods utilizing the Hammett type equation for the prediction of ionisation constants fail to give accurate prediction of pK(a)'s for compounds containing the vinylic moiety. The hypothesis was that the effect of electron-withdrawing substituents on the pK(a) of p-vinyl phenols is due to the delocalized electronic structure of these compounds. Thus, this effect should be additive for multiple substituents attached to the vinylic double bond and quantifiable by LFER-based methods. The aim of this study was to produce an improved equation with a reduced tendency to underestimate the effect of the double bond on the ionisation of the phenolic hydroxyl. To this end a set of 19 para-substituted vinyl phenols was used. The ionisation constants were measured potentiometrically, and a training set of 10 compounds was selected to build a regression model (r2 = 0.987 and S.E. = 0.09). The average error with an external test set of six compounds was 0.19 for our model and 1.27 for the ACD-labs 7.0. Thus, we have been able to significantly improve the existing model for prediction of the ionisation constants of substituted p-vinyl phenols.
Yavuz, Sevtap Caglar; Sabanci, Nazmiye; Saripinar, Emin
2018-01-01
The EC-GA method was employed in this study as a 4D-QSAR method for the identification of the pharmacophore (Pha) of ruthenium(II) arene complex derivatives and quantitative prediction of activity. The arrangement of the computed geometric and electronic parameters for atoms and bonds of each compound in a matrix is known as the electron-conformational matrix of congruity (ECMC). It contains the data from HF/3-21G level calculations. To generate the model, each compound was represented by a group of conformers rather than a single conformation, which constitutes the fourth dimension. ECMCs were compared within a certain range of tolerance values by using the EMRE program, and the pharmacophore group responsible for the activity of the ruthenium(II) arene complex derivatives was found. For selecting the sub-parameters that had the most effect on activity in the series and for calculating theoretical activity values, the non-linear least squares method and the genetic algorithm included in the EMRE program were used. In addition, compounds were divided into training and test sets, and the accuracy of the models was tested statistically by cross-validation. The model for training and test sets attained by the optimum 10 parameters gave highly satisfactory results with R² (training) = 0.817, q² = 0.718, SE (training) = 0.066, q² (ext1) = 0.867, q² (ext2) = 0.849, q² (ext3) = 0.895, ccc (tr) = 0.895, ccc (test) = 0.930 and ccc (all) = 0.905. Since there is no 4D-QSAR research on metal-based organic complexes in the literature, this study is original and provides a powerful tool for the design of novel and selective ruthenium(II) arene complexes. Copyright© Bentham Science Publishers.
Caballero, Julio; Fernández, Michael; Coll, Deysma
2010-12-01
Three-dimensional quantitative structure-activity relationship studies were carried out on a series of 28 organosulphur compounds as 15-lipoxygenase inhibitors using comparative molecular field analysis and comparative molecular similarity indices analysis. Quantitative information on structure-activity relationships is provided for further rational development and direction of selective synthesis. All models were carried out over a training set including 22 compounds. The best comparative molecular field analysis model only included steric field and had a good Q² = 0.789. Comparative molecular similarity indices analysis overcame the comparative molecular field analysis results: the best comparative molecular similarity indices analysis model also only included steric field and had a Q² = 0.894. In addition, this model predicted adequately the compounds contained in the test set. Furthermore, plots of steric comparative molecular similarity indices analysis field allowed conclusions to be drawn for the choice of suitable inhibitors. In this sense, our model should prove useful in future 15-lipoxygenase inhibitor design studies. © 2010 John Wiley & Sons A/S.
NASA Astrophysics Data System (ADS)
Samari, Fayezeh; Yousefinejad, Saeed
2017-11-01
Emission fluorescence spectroscopy has an extremely restricted scope of application for the analysis of complex mixtures, since its selectivity is reduced by extensive spectral overlap. Synchronous fluorescence spectroscopy (SFS) is a technique that enables the analysis of complex mixtures with overlapping emission and/or excitation spectra. The difference between the excitation and emission wavelengths of a compound (interval wavelength or Δλ) is an important characteristic in SFS. Thus, a multi-parameter model was constructed to predict Δλ for 63 fluorescent compounds; the regression coefficients for the training set, cross-validation and test set were 0.88, 0.85 and 0.91, respectively. Furthermore, the applicability and validity of the model were evaluated using different statistical methods such as y-scrambling and applicability domain analysis. It was concluded that increasing the average valence connectivity, the number of Al2-NH functional groups and the Geary autocorrelation (lag 4) with electronegativity weights can lead to an increase in Δλ of the fluorescent compounds. The current study provides insight into the structural properties of compounds that affect their Δλ, an important parameter in SFS.
New antitrichomonal drug-like chemicals selected by bond (edge)-based TOMOCOMD-CARDD descriptors.
Meneses-Marcel, Alfredo; Rivera-Borroto, Oscar M; Marrero-Ponce, Yovani; Montero, Alina; Machado Tugores, Yanetsy; Escario, José Antonio; Gómez Barrio, Alicia; Montero Pereira, David; Nogal, Juan José; Kouznetsov, Vladimir V; Ochoa Puentes, Cristian; Bohórquez, Arnold R; Grau, Ricardo; Torrens, Francisco; Ibarra-Velarde, Froylán; Arán, Vicente J
2008-09-01
Bond-based quadratic indices, new TOMOCOMD-CARDD molecular descriptors, and linear discriminant analysis (LDA) were used to discover novel lead trichomonacidals. The obtained LDA-based quantitative structure-activity relationships (QSAR) models, using nonstochastic and stochastic indices, were able to classify correctly 87.91% (87.50%) and 89.01% (84.38%) of the chemicals in training (test) sets, respectively. They showed large Matthews correlation coefficients of 0.75 (0.71) and 0.78 (0.65) for the training (test) sets, correspondingly. Later, both models were applied to the virtual screening of 21 chemicals to find new lead antitrichomonal agents. Predictions agreed with experimental results to a great extent because a correct classification for both models of 95.24% (20 of 21) of the chemicals was obtained. Of the 21 compounds that were screened and synthesized, 2 molecules (chemicals G-1, UC-245) showed high to moderate cytocidal activity at the concentration of 10 microg/ml, another 2 compounds (G-0 and CRIS-148) showed high cytocidal activity only at the concentration of 100 microg/ml, and the remaining chemicals (from CRIS-105 to CRIS-153, except CRIS-148) were inactive at these assayed concentrations. Finally, the best candidate, G-1 (cytocidal activity of 100% at 10 microg/ml) was in vivo assayed in ovariectomized Wistar rats achieving promising results as a trichomonacidal drug-like compound.
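The Matthews correlation coefficient reported above can be computed from a 2×2 confusion matrix as sketched below. The formula is the standard one; the counts are hypothetical and are not the study's actual confusion matrices, which the abstract does not give.

```python
import math


def matthews_corrcoef(tp, fn, tn, fp):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts: 50 actives (45 found) and 50 inactives (40 found).
mcc = matthews_corrcoef(tp=45, fn=5, tn=40, fp=10)
accuracy = (45 + 40) / 100
print(f"accuracy = {accuracy:.2f}, MCC = {mcc:.2f}")
```

Unlike the percentage of correct classifications, the coefficient ranges from -1 to 1 and stays near 0 for a classifier that simply predicts the majority class, which is why it is often quoted alongside accuracy.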
Prediction of Environmental Impact of High-Energy Materials with Atomistic Computer Simulations
2010-11-01
from a training set of compounds. Other methods include Quantitative Structure-Activity Relationship (QSAR) and Quantitative Structure-Property...26 28 the development of QSPR/QSAR models, in contrast to boiling points and critical parameters derived from empirical correlations, to improve...Quadratic Configuration Interaction Singles Doubles QSAR Quantitative Structure-Activity Relationship QSPR Quantitative Structure-Property
In vitro transcriptomic prediction of hepatotoxicity for early drug discovery
Cheng, Feng; Theodorescu, Dan; Schulman, Ira G.; Lee, Jae K.
2012-01-01
Liver toxicity (hepatotoxicity) is a critical issue in drug discovery and development. Standard preclinical evaluation of drug hepatotoxicity is generally performed using in vivo animal systems. However, only a small number of preselected compounds can be examined in vivo due to high experimental costs. A more efficient yet accurate screening technique which can identify potentially hepatotoxic compounds in the early stages of drug development would thus be valuable. Here, we develop and apply a novel genomic prediction technique for screening hepatotoxic compounds based on in vitro human liver cell tests. Using a training set of in vivo rodent experiments for drug hepatotoxicity evaluation, we discovered common biomarkers of drug-induced liver toxicity among six heterogeneous compounds. This gene set was further triaged to a subset of 32 genes that can be used as a multi-gene expression signature to predict hepatotoxicity. This multi-gene predictor was independently validated and showed consistently high prediction performance on five test sets of in vitro human liver cell and in vivo animal toxicity experiments. The predictor also demonstrated utility in evaluating different degrees of toxicity in response to drug concentrations which may be useful not only for discerning a compound’s general hepatotoxicity but also for determining its toxic concentration. PMID:21884709
A new biodegradation prediction model specific to petroleum hydrocarbons.
Howard, Philip; Meylan, William; Aronson, Dallas; Stiteler, William; Tunkel, Jay; Comber, Michael; Parkerton, Thomas F
2005-08-01
A new predictive model for determining quantitative primary biodegradation half-lives of individual petroleum hydrocarbons has been developed. This model uses a fragment-based approach similar to that of several other biodegradation models, such as those within the Biodegradation Probability Program (BIOWIN) estimation program. In the present study, a half-life in days is estimated using multiple linear regression against counts of 31 distinct molecular fragments. The model was developed using a data set consisting of 175 compounds with environmentally relevant experimental data that was divided into training and validation sets. The original fragments from the Ministry of International Trade and Industry BIOWIN model were used initially as structural descriptors and additional fragments were then added to better describe the ring systems found in petroleum hydrocarbons and to adjust for nonlinearity within the experimental data. The training and validation sets had r2 values of 0.91 and 0.81, respectively.
He, Yu-su; Sun, Zhi-yi; Zhang, Yan-ling
2014-11-01
Using the pharmacophore model of mineralocorticoid receptor antagonists as a starting point, this work studies a method of traditional Chinese medicine formula design for anti-hypertensive use. Pharmacophore models were generated by the 3D-QSAR pharmacophore (Hypogen) program of DS3.5, based on a training set composed of 33 mineralocorticoid receptor antagonists. The best pharmacophore model consisted of two hydrogen-bond acceptors, three hydrophobic features and four excluded volumes. Its correlation coefficients for the training set and test set, N, and CAI value were 0.9534, 0.6748, 2.878, and 1.119, respectively. Database screening yielded 1700 active compounds from 86 source plants. Because traditional theory lacks an available anti-hypertensive medication strategy, this article uses patent retrieval in a world traditional medicine patent database to design drug formulae. Finally, two anti-hypertensive formulae were obtained.
Legrain, Fleur; Carrete, Jesús; van Roekeghem, Ambroise; Madsen, Georg K H; Mingo, Natalio
2018-01-18
Machine learning (ML) is increasingly becoming a helpful tool in the search for novel functional compounds. Here we use classification via random forests to predict the stability of half-Heusler (HH) compounds, using only experimentally reported compounds as a training set. Cross-validation yields an excellent agreement between the fraction of compounds classified as stable and the actual fraction of truly stable compounds in the ICSD. The ML model is then employed to screen 71 178 different 1:1:1 compositions, yielding 481 likely stable candidates. The predicted stability of HH compounds from three previous high-throughput ab initio studies is critically analyzed from the perspective of the alternative ML approach. The incomplete consistency among the three separate ab initio studies and between them and the ML predictions suggests that additional factors beyond those considered by ab initio phase stability calculations might be determinant to the stability of the compounds. Such factors can include configurational entropies and quasiharmonic contributions.
Design Considerations of a Compounded Sterile Preparations Course
Petraglia, Christine; Mattison, Melissa J.
2016-01-01
Objective. To design a comprehensive learning and assessment environment for the practical application of compounded sterile preparations using a constructivist approach. Design. Compounded Sterile Preparations Laboratory is a required 1-credit course that builds upon the themes of training aseptic technique typically used in health system settings and threads application of concepts from other courses in the curriculum. Students used critical-thinking skills to devise appropriate strategies to compound sterile preparations. Assessment. Aseptic technique skills were assessed with objective, structured, checklist-based rubrics. Most students successfully completed practical assessments using appropriate technique (mean assessment grade=83.2%). Almost all students passed the practical media fill (98%) and gloved fingertip sampling (86%) tests on the first attempt; all passed on the second attempt. Conclusion. Employing a constructivist scaffold approach to teaching proper hygiene and aseptic technique prepared students to pass media fill and gloved fingertip tests and to perform well on practical compounding assessments. PMID:26941438
Wicht, Kathryn J; Combrinck, Jill M; Smith, Peter J; Egan, Timothy J
2015-08-15
A large quantity of high throughput screening (HTS) data for antimalarial activity has become available in recent years. This includes both phenotypic and target-based activity. Realising the maximum value of these data remains a challenge. In this respect, methods that allow such data to be used for virtual screening maximise efficiency and reduce costs. In this study both in vitro antimalarial activity and inhibitory data for β-haematin formation, largely obtained from publicly available sources, have been used to develop Bayesian models for inhibitors of β-haematin formation and in vitro antimalarial activity. These models were used to screen two in silico compound libraries. In the first, the 1510 U.S. Food and Drug Administration approved drugs available on PubChem were ranked from highest to lowest Bayesian score based on a training set of β-haematin inhibiting compounds active against Plasmodium falciparum that did not include any of the clinical antimalarials or close analogues. The six known clinical antimalarials that inhibit β-haematin formation were ranked in the top 2.1% of compounds. Furthermore, the in vitro antimalarial hit-rate for this prioritised set of compounds was found to be 81% in the case of the subset where activity data are available in PubChem. In the second, a library of about 5000 commercially available compounds (Aldrich(CPR)) was virtually screened for ability to inhibit β-haematin formation and then for in vitro antimalarial activity. A selection of 34 compounds was purchased and tested, of which 24 were predicted to be β-haematin inhibitors. The hit rate for inhibition of β-haematin formation was found to be 25% and a third of these were active against P. falciparum, corresponding to enrichments estimated at about 25- and 140-fold relative to random screening, respectively. Copyright © 2014 Elsevier Ltd. All rights reserved.
2012-01-01
Background Src plays various roles in tumour progression, invasion, metastasis, angiogenesis and survival. It is one of the multiple targets of multi-target kinase inhibitors in clinical use and trials for the treatment of leukemia and other cancers. These successes and appearances of drug resistance in some patients have raised significant interest and efforts in discovering new Src inhibitors. Various in-silico methods have been used in some of these efforts. It is desirable to explore additional in-silico methods, particularly those capable of searching large compound libraries at high yields and reduced false-hit rates. Results We evaluated support vector machines (SVM) as virtual screening tools for searching Src inhibitors from large compound libraries. SVM trained and tested by 1,703 inhibitors and 63,318 putative non-inhibitors correctly identified 93.53%-95.01% of inhibitors and 99.81%-99.90% of non-inhibitors in 5-fold cross validation studies. SVM trained by 1,703 inhibitors reported before 2011 and 63,318 putative non-inhibitors correctly identified 70.45% of the 44 inhibitors reported since 2011, and predicted as inhibitors 44,843 (0.33%) of 13.56M PubChem, 1,496 (0.89%) of 168 K MDDR, and 719 (7.73%) of 9,305 MDDR compounds similar to the known inhibitors. Conclusions SVM showed comparable yield and reduced false hit rates in searching large compound libraries compared to the similarity-based and other machine-learning VS methods developed from the same set of training compounds and molecular descriptors. We tested three virtual hits of the same novel scaffold from in-house chemical libraries not previously reported as Src inhibitors, one of which showed moderate activity. SVM may be potentially explored for searching Src inhibitors from large compound libraries at low false-hit rates. PMID:23173901
Automatic Earthquake Detection by Active Learning
NASA Astrophysics Data System (ADS)
Bergen, K.; Beroza, G. C.
2017-12-01
In recent years, advances in machine learning have transformed fields such as image recognition, natural language processing and recommender systems. Many of these performance gains have relied on the availability of large, labeled data sets to train high-accuracy models; labeled data sets are those for which each sample includes a target class label, such as waveforms tagged as either earthquakes or noise. Earthquake seismologists are increasingly leveraging machine learning and data mining techniques to detect and analyze weak earthquake signals in large seismic data sets. One of the challenges in applying machine learning to seismic data sets is the limited labeled data problem; learning algorithms need to be given examples of earthquake waveforms, but the number of known events, taken from earthquake catalogs, may be insufficient to build an accurate detector. Furthermore, earthquake catalogs are known to be incomplete, resulting in training data that may be biased towards larger events and contain inaccurate labels. This challenge is compounded by the class imbalance problem; the events of interest, earthquakes, are infrequent relative to noise in continuous data sets, and many learning algorithms perform poorly on rare classes. In this work, we investigate the use of active learning for automatic earthquake detection. Active learning is a type of semi-supervised machine learning that uses a human-in-the-loop approach to strategically supplement a small initial training set. The learning algorithm incorporates domain expertise through interaction between a human expert and the algorithm, with the algorithm actively posing queries to the user to improve detection performance. We demonstrate the potential of active machine learning to improve earthquake detection performance with limited available training data.
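The query loop described above can be illustrated with pool-based uncertainty sampling, one common form of active learning. The sketch below is an assumption-laden toy, not the authors' detector: the single integer feature, the threshold classifier, and the `oracle` function standing in for the human expert are all hypothetical.

```python
# Minimal sketch of pool-based active learning with uncertainty sampling.
# The 1-D feature, threshold model, and oracle are hypothetical stand-ins.

def fit_threshold(labeled):
    """Midpoint between the largest labeled noise value and the smallest labeled event value."""
    largest_noise = max(x for x, y in labeled if y == 0)
    smallest_event = min(x for x, y in labeled if y == 1)
    return (largest_noise + smallest_event) / 2.0

def active_learn(pool, oracle, seed_points, n_queries):
    """Grow a small labeled seed set by repeatedly querying the most uncertain sample."""
    labeled = [(x, oracle(x)) for x in seed_points]
    unlabeled = [x for x in pool if x not in seed_points]
    for _ in range(n_queries):
        if not unlabeled:
            break
        threshold = fit_threshold(labeled)
        # The sample closest to the decision boundary is the least certain one.
        query = min(unlabeled, key=lambda x: abs(x - threshold))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))  # the expert supplies the label
    return fit_threshold(labeled), labeled

def oracle(x):
    """Hypothetical expert: values of 63 or more are events (1), the rest noise (0)."""
    return 1 if x >= 63 else 0

pool = list(range(101))  # hypothetical feature values 0..100
threshold, labeled = active_learn(pool, oracle, seed_points=[0, 100], n_queries=8)
```

Starting from one noise example and one event example, the loop spends its label budget near the boundary, so a handful of queries brackets the true cutoff instead of labeling the whole pool.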
Marrero-Ponce, Yovani; Iyarreta-Veitía, Maité; Montero-Torres, Alina; Romero-Zaldivar, Carlos; Brandt, Carlos A; Avila, Priscilla E; Kirchgatter, Karin; Machado, Yanetsy
2005-01-01
Malaria has been one of the most significant public health problems for centuries. It affects many tropical and subtropical regions of the world. The increasing resistance of Plasmodium spp. to existing therapies has heightened alarms about malaria in the international health community. Nowadays, there is a pressing need for identifying and developing new drug-based antimalarial therapies. In an effort to overcome this problem, the main purpose of this study is to develop simple linear discriminant-based quantitative structure-activity relationship (QSAR) models for the classification and prediction of antimalarial activity using some of the TOMOCOMD-CARDD (TOpological MOlecular COMputer Design-Computer Aided "Rational" Drug Design) fingerprints, so as to enable computational screening from virtual combinatorial datasets. In this sense, a database of 1562 organic chemicals having great structural variability, 597 of them antimalarial agents and 965 compounds having other clinical uses, was analyzed and presented as a helpful tool, not only for theoretical chemists but also for other researchers in this area. This series of compounds was processed by a k-means cluster analysis in order to design training and predicting sets. Afterward, two linear classification functions were derived in order to discriminate between antimalarial and nonantimalarial compounds. The models (including nonstochastic and stochastic indices) correctly classify more than 93% of the compound set, in both training and external prediction datasets. They showed high Matthews' correlation coefficients, 0.889 and 0.866 for the training set and 0.855 and 0.857 for the test one. The models' predictivity was also assessed and validated by the random removal of 10% of the compounds to form a new test set, for which predictions were made using the models. 
The overall means of the correct classification for this process (leave group 10% full-out cross validation) using the equations with nonstochastic and stochastic atom-based quadratic fingerprints were 93.93% and 92.77%, respectively. The quadratic maps-based TOMOCOMD-CARDD approach implemented in this work was successfully compared with four of the most useful models for antimalarials selection reported to date. The developed models were then used in a simulation of a virtual search for Ras FTase (FTase = farnesyltransferase) inhibitors with antimalarial activity; 70% and 100% of the 10 inhibitors used in this virtual search were correctly classified, showing the ability of the models to identify new lead antimalarials. Finally, these two QSAR models were used in the identification of previously unknown antimalarials. In this sense, three synthetic intermediaries of quinolinic compounds were evaluated as active/inactive ones using the developed models. The synthesis and biological evaluation of these chemicals against two malaria strains, using chloroquine as a reference, was performed. An accuracy of 100% with the theoretical predictions was observed. Compound 3 showed antimalarial activity, being the first report of an arylaminomethylenemalonate having such behavior. This result opens a door to a virtual study considering a higher variability of the structural core already evaluated, as well as of other chemicals not included in this study. We conclude that the approach described here seems to be a promising QSAR tool for the molecular discovery of novel classes of antimalarial drugs, which may meet the dual challenges posed by drug-resistant parasites and the rapid progression of malaria illnesses.
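For readers unfamiliar with the Matthews correlation coefficient quoted above, a minimal sketch of its standard definition from a two-class confusion matrix follows. The example counts are hypothetical and are not taken from the study.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts: 90 actives found, 10 missed, 95 inactives kept, 5 false alarms.
example_mcc = matthews_corrcoef(tp=90, tn=95, fp=5, fn=10)
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect classification), and it stays informative when the two classes differ in size, as with the 597 antimalarials and 965 other compounds here.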
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sadat Hayatshahi, Sayyed Hamed; Abdolmaleki, Parviz; Safarian, Shahrokh
2005-12-16
Logistic regression and artificial neural networks have been developed as two non-linear models to establish quantitative structure-activity relationships between structural descriptors and biochemical activity of adenosine-based competitive inhibitors of adenosine deaminase. The training set included 24 compounds with known k_i values. The models were trained to solve two-class problems. Unlike the previous work, in which multiple linear regression was used, the highest positive charge on the molecules was found to be closely related to their inhibition activity, while the electric charge on atom N1 of adenosine was found to be a poor descriptor. Consequently, the previously developed equation was improved, and the newly formed one could predict the class of 91.66% of compounds correctly. Optimized 2-3-1 and 3-4-1 neural networks could increase this rate to 95.83%.
Nayana, M Ravi Shashi; Sekhar, Y Nataraja; Nandyala, Haritha; Muttineni, Ravikumar; Bairy, Santosh Kumar; Singh, Kriti; Mahmood, S K
2008-10-01
In the present study, a series of 179 quinoline and quinazoline heterocyclic analogues exhibiting inhibitory activity against gastric (H+/K+)-ATPase were investigated using the comparative molecular field analysis (CoMFA) and comparative molecular similarity indices (CoMSIA) methods. Both models exhibited good correlation between the calculated 3D-QSAR fields and the observed biological activity for the respective training set compounds. The optimal CoMFA and CoMSIA models yielded significant leave-one-out cross-validation coefficients, q², of 0.777 and 0.744 and conventional (non-cross-validated) coefficients, r², of 0.927 and 0.914, respectively. The predictive ability of the generated models was tested on a set of 52 compounds having a broad range of activity. CoMFA and CoMSIA yielded predicted activities for test set compounds with r²(pred) of 0.893 and 0.917, respectively. These validation tests not only revealed the robustness of the models but also demonstrated that for our models r²(pred) based on the mean activity of test set compounds can accurately estimate external predictivity. The factors affecting activity were analyzed carefully according to standard coefficient contour maps of steric, electrostatic, hydrophobic, acceptor and donor fields derived from the CoMFA and CoMSIA. These contour plots identified several key features which explain the wide range of activities. The results obtained from the models offer important structural insight into designing novel peptic-ulcer inhibitors prior to their synthesis.
Zhu, Hao; Rusyn, Ivan; Richard, Ann; Tropsha, Alexander
2008-01-01
Background To develop efficient approaches for rapid evaluation of chemical toxicity and human health risk of environmental compounds, the National Toxicology Program (NTP) in collaboration with the National Center for Chemical Genomics has initiated a project on high-throughput screening (HTS) of environmental chemicals. The first HTS results for a set of 1,408 compounds tested for their effects on cell viability in six different cell lines have recently become available via PubChem. Objectives We have explored these data in terms of their utility for predicting adverse health effects of the environmental agents. Methods and results Initially, the classification k nearest neighbor (kNN) quantitative structure–activity relationship (QSAR) modeling method was applied to the HTS data only, for a curated data set of 384 compounds. The resulting models had prediction accuracies for training, test (containing 275 compounds together), and external validation (109 compounds) sets as high as 89%, 71%, and 74%, respectively. We then asked if HTS results could be of value in predicting rodent carcinogenicity. We identified 383 compounds for which data were available from both the Berkeley Carcinogenic Potency Database and NTP–HTS studies. We found that compounds classified by HTS as “actives” in at least one cell line were likely to be rodent carcinogens (sensitivity 77%); however, HTS “inactives” were far less informative (specificity 46%). Using chemical descriptors only, kNN QSAR modeling resulted in 62.3% prediction accuracy for rodent carcinogenicity applied to this data set. Importantly, the prediction accuracy of the model was significantly improved (72.7%) when chemical descriptors were augmented by HTS data, which were regarded as biological descriptors. Conclusions Our studies suggest that combining NTP–HTS profiles with conventional chemical descriptors could considerably improve the predictive power of computational approaches in toxicology. PMID:18414635
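The sensitivity, specificity, and prediction accuracy figures quoted above follow the usual confusion-matrix definitions, sketched below. The counts in the example are hypothetical; they were chosen only so that the ratios match the reported 77% and 46% and are not the study's actual tallies.

```python
# Minimal sketch of standard two-class performance measures.
# Example counts are hypothetical, not taken from the study.

def sensitivity(tp, fn):
    """Fraction of true positives (e.g. carcinogens) that were flagged."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of true negatives (e.g. non-carcinogens) that were cleared."""
    return tn / (tn + fp)

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical tallies: 100 positives and 100 negatives.
tp, fn, tn, fp = 77, 23, 46, 54
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
acc = accuracy(tp, tn, fp, fn)
```

A high sensitivity paired with a low specificity, as in this example, means that an "active" call catches most positives but an "inactive" call says little, which is the pattern the abstract describes for the HTS results.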
Oliveira, Gislene B; Alewijn, Martin; Boerrigter-Eenling, Rita; van Ruth, Saskia M
2015-08-25
Consumers' interest in the way meat is produced is increasing in Europe. The resulting free range and organic meat products retail at a higher price, but are difficult to differentiate from their counterparts. To ascertain authenticity and prevent fraud, relevant markers need to be identified and new analytical methodology developed. The objective of this pilot study was to characterize pork belly meats of different animal welfare classes by their fatty acid (Fatty Acid Methyl Ester-FAME), non-volatile compound (electrospray ionization-tandem mass spectrometry-ESI-MS/MS), and volatile compound (proton-transfer-reaction mass spectrometry-PTR-MS) fingerprints. Well-defined pork belly meat samples (13 conventional, 15 free range, and 13 organic) originating from the Netherlands were subjected to analysis. Fingerprints appeared to be specific for the three categories, and resulted in 100%, 95.3%, and 95.3% correct identity predictions of training set samples for FAME, ESI-MS/MS, and PTR-MS respectively and slightly lower scores for the validation set. Organic meat was also well discriminated from the other two categories with 100% success rates for the training set for all three analytical approaches. Ten out of 25 FAs showed significant differences in abundance between organic meat and the other categories, free range meat differed significantly for 6 out of the 25 FAs. Overall, FAME fingerprinting presented highest discrimination power.
Oliveira, Gislene B.; Alewijn, Martin; Boerrigter-Eenling, Rita; van Ruth, Saskia M.
2015-01-01
Consumers’ interest in the way meat is produced is increasing in Europe. The resulting free range and organic meat products retail at a higher price, but are difficult to differentiate from their counterparts. To ascertain authenticity and prevent fraud, relevant markers need to be identified and new analytical methodology developed. The objective of this pilot study was to characterize pork belly meats of different animal welfare classes by their fatty acid (Fatty Acid Methyl Ester—FAME), non-volatile compound (electrospray ionization-tandem mass spectrometry—ESI-MS/MS), and volatile compound (proton-transfer-reaction mass spectrometry—PTR-MS) fingerprints. Well-defined pork belly meat samples (13 conventional, 15 free range, and 13 organic) originating from the Netherlands were subjected to analysis. Fingerprints appeared to be specific for the three categories, and resulted in 100%, 95.3%, and 95.3% correct identity predictions of training set samples for FAME, ESI-MS/MS, and PTR-MS respectively and slightly lower scores for the validation set. Organic meat was also well discriminated from the other two categories with 100% success rates for the training set for all three analytical approaches. Ten out of 25 FAs showed significant differences in abundance between organic meat and the other categories, free range meat differed significantly for 6 out of the 25 FAs. Overall, FAME fingerprinting presented highest discrimination power. PMID:28231211
Santos-Garcia, Letícia; Assis, Letícia C; Silva, Daniela R; Ramalho, Teodorico C; da Cunha, Elaine F F
2016-07-01
Bruton's tyrosine kinase (Btk) is an important enzyme in B-lymphocyte development and differentiation. Furthermore, Btk expression is considered essential for the proliferation and survival of these cells. Btk inhibition has become an attractive strategy for treating autoimmune diseases, B-cell leukemia, and lymphomas. With the objective of proposing new candidates for Btk inhibitors, we applied receptor-dependent four-dimensional quantitative structure-activity relationship (QSAR) methodology to a series of 96 nicotinamide analogs useful as Btk modulators. The QSAR models were developed using 71 compounds, the training set, and externally validated using 25 compounds, the test set. The conformations obtained by molecular dynamics simulation were overlapped in a virtual three-dimensional cubic box comprised of 2 and 5 Å cells, according to the six trial alignments. The models were generated by combining genetic function approximation and partial least squares regression technique. The analyses suggest that Model 1a yields the best results. The best equation shows [Formula: see text], r² = 0.743, RMSEC = 0.831, RMSECV = 0.879. Given the importance of the Tyr551, this residue could become a strategic target for the design of novel Btk inhibitors with improved potency. In addition, the good potency predicted for the proposed M2 compound indicates this compound as a potential Btk inhibitor candidate.
Julián-Ortiz, Jesus V de; Gozalbes, Rafael; Besalú, Emili
2016-01-01
The search for new drug candidates in databases is of paramount importance in pharmaceutical chemistry. The selection of molecular subsets is greatly optimized and much more promising when potential drug-like molecules are detected a priori. In this work, about one hundred thousand molecules are ranked following a new methodology: a drug/non-drug classifier constructed by a consensual set of classification trees. The classification trees arise from the stochastic generation of training sets, which in turn are used to estimate probability factors of test molecules to be drug-like compounds. Molecules were represented by Topological Quantum Similarity Indices and their Graph Theoretical counterparts. The contribution of the present paper consists of presenting an effective ranking method able to improve the probability of finding drug-like substances by using these types of molecular descriptors.
Hocart, Simon J.; Liu, Huayin; Deng, Haiyan; De, Dibyendu; Krogstad, Frances M.; Krogstad, Donald J.
2011-01-01
Chloroquine (CQ) is a safe and economical 4-aminoquinoline (AQ) antimalarial. However, its value has been severely compromised by the increasing prevalence of CQ resistance. This study examined 108 AQs, including 68 newly synthesized compounds. Of these 108 AQs, 32 (30%) were active only against CQ-susceptible Plasmodium falciparum strains and 59 (55%) were active against both CQ-susceptible and CQ-resistant P. falciparum strains (50% inhibitory concentrations [IC50s], ≤25 nM). All AQs active against both CQ-susceptible and CQ-resistant P. falciparum strains shared four structural features: (i) an AQ ring without alkyl substitution, (ii) a halogen at position 7 (Cl, Br, or I but not F), (iii) a protonatable nitrogen at position 1, and (iv) a second protonatable nitrogen at the end of the side chain distal from the point of attachment to the AQ ring via the nitrogen at position 4. For activity against CQ-resistant parasites, side chain lengths of ≤3 or ≥10 carbons were necessary but not sufficient; they were identified as essential factors by visual comparison of 2-dimensional (2-D) structures in relation to the antiparasite activities of the AQs and were confirmed by computer-based 3-D comparisons and differential contour plots of activity against P. falciparum. The advantage of the method reported here (refinement of quantitative structure-activity relationship [QSAR] descriptors by random assignment of compounds to multiple training and test sets) is that it retains QSAR descriptors according to their abilities to predict the activities of unknown test compounds rather than according to how well they fit the activities of the compounds in the training sets. PMID:21383099
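The random assignment of compounds to multiple training and test sets mentioned above can be sketched as repeated shuffled splits. This shows only the splitting step, under assumptions: the compound identifiers, the 25% test fraction, and the number of splits are hypothetical, and the descriptor scoring the authors performed on each split is not reproduced.

```python
import random

def random_splits(compounds, test_fraction, n_splits, seed=0):
    """Return n_splits (training, test) pairs from independent random shuffles."""
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    n_test = max(1, round(len(compounds) * test_fraction))
    splits = []
    for _ in range(n_splits):
        shuffled = compounds[:]
        rng.shuffle(shuffled)
        splits.append((shuffled[n_test:], shuffled[:n_test]))
    return splits

# Hypothetical identifiers for the 108 aminoquinolines in the study.
compounds = ["AQ-%03d" % i for i in range(1, 109)]
splits = random_splits(compounds, test_fraction=0.25, n_splits=5)
```

A descriptor would then be retained only if models built on each training subset predict the held-out test compounds well across the splits, rather than because it fits a single training set closely.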
Qin, Li-Tang; Liu, Shu-Shen; Liu, Hai-Ling
2010-02-01
A five-variable model (model M2) was developed for the bioconcentration factors (BCFs) of nonpolar organic compounds (NPOCs) by using molecular electronegativity distance vector (MEDV) to characterize the structures of NPOCs and variable selection and modeling based on prediction (VSMP) to select the optimum descriptors. The estimated correlation coefficient (r²) and the leave-one-out cross-validation correlation coefficient (q²) of model M2 were 0.9271 and 0.9171, respectively. The model was externally validated by splitting the whole data set into a representative training set of 85 chemicals and a validation set of 29 chemicals. The results show that the main structural factors influencing the BCFs of NPOCs are -cCc, cCcc, -Cl, and -Br (where "-" refers to a single bond and "c" refers to a conjugated bond). The quantitative structure-property relationship (QSPR) model can effectively predict the BCFs of NPOCs, and the predictions of the model can also extend the current BCF database of experimental values.
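The difference between the fitted r² and the leave-one-out q² reported above can be illustrated with a one-descriptor linear model. This is a minimal sketch under assumptions: the data points are hypothetical, the model has a single descriptor rather than five, and q² is computed as 1 - PRESS/SS_total with SS_total taken about the mean of all responses, which is one common convention.

```python
# Minimal sketch: fitted r^2 versus leave-one-out q^2 for a straight-line model.
# The descriptor values and responses are hypothetical.

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def total_sum_of_squares(ys):
    my = sum(ys) / len(ys)
    return sum((y - my) ** 2 for y in ys)

def r_squared(xs, ys):
    """Goodness of fit on the same data used to build the model."""
    slope, intercept = fit_line(xs, ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)

def q_squared_loo(xs, ys):
    """Leave-one-out q^2: each point is predicted by a model fitted without it."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / total_sum_of_squares(ys)

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
r2 = r_squared(xs, ys)
q2 = q_squared_loo(xs, ys)
```

For a least-squares model q² cannot exceed r², so a small gap between the two, as with 0.9271 and 0.9171 above, suggests that no single compound dominates the fit.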
Mohr, Johannes A; Jain, Brijnesh J; Obermayer, Klaus
2008-09-01
Quantitative structure activity relationship (QSAR) analysis is traditionally based on extracting a set of molecular descriptors and using them to build a predictive model. In this work, we propose a QSAR approach based directly on the similarity between the 3D structures of a set of molecules measured by a so-called molecule kernel, which is independent of the spatial prealignment of the compounds. Predictors can be built using the molecule kernel in conjunction with the potential support vector machine (P-SVM), a recently proposed machine learning method for dyadic data. The resulting models make direct use of the structural similarities between the compounds in the test set and a subset of the training set and do not require an explicit descriptor construction. We evaluated the predictive performance of the proposed method on one classification and four regression QSAR datasets and compared its results to the results reported in the literature for several state-of-the-art descriptor-based and 3D QSAR approaches. In this comparison, the proposed molecule kernel method performed better than the other QSAR methods.
Kumar, Pankaj; Ma, Xiaohua; Liu, Xianghui; Jia, Jia; Bucong, Han; Xue, Ying; Li, Ze Rong; Yang, Sheng Yong; Wei, Yu Quan; Chen, Yu Zong
2011-05-01
Various in vitro and in-silico methods have been used for drug genotoxicity tests, which show limited genotoxicity (GT+) and non-genotoxicity (GT-) identification rates. New methods and combinatorial approaches have been explored for enhanced collective identification capability. The rates of in-silico methods may be further improved by significantly diversified training data enriched by the large number of recently reported GT+ and GT- compounds, but a major concern is the increased noise levels arising from high false-positive rates of in vitro data. In this work, we evaluated the effect of training data size and noise level on the performance of the support vector machine (SVM) method, which is known to tolerate high noise levels in training data. Two SVMs of different diversity/noise levels were developed and tested. H-SVM trained by higher diversity higher noise data (GT+ in any in vivo or in vitro test) outperforms L-SVM trained by lower noise lower diversity data (GT+ in in vivo or Ames test only). H-SVM trained by 4,763 GT+ compounds reported before 2008 and 8,232 GT- compounds excluding clinical trial drugs correctly identified 81.6% of the 38 GT+ compounds reported since 2008, predicted 83.1% of the 2,008 clinical trial drugs as GT-, and 23.96% of 168 K MDDR and 27.23% of 17.86M PubChem compounds as GT+. These are comparable to the 43.1-51.9% GT+ and 75-93% GT- rates of existing in-silico methods, 58.8% GT+ and 79% GT- rates of Ames method, and the estimated percentages of 23% in vivo and 31-33% in vitro GT+ compounds in the "universe of chemicals". There is a substantial level of agreement between H-SVM and L-SVM predicted GT+ and GT- MDDR compounds and the prediction from TOPKAT. SVM showed good potential in identifying GT+ compounds from large compound libraries based on higher diversity and higher noise training data.
Kalva, Sukesh; Vadivelan, S; Sanam, Ramadevi; Jagarlapudi, Sarma ARP; Saleena, Lilly M
2012-01-01
In this study, chemical-feature-based pharmacophore models of MMP-1, MMP-8 and MMP-13 inhibitors have been developed with the aid of the HypoGen module within the Catalyst program package. In MMP-1 and MMP-13, all the compounds in the training set mapped HBA and RA, while in MMP-8, the training set mapped HBA and HY. These features appear to be responsible for the high bioactivity, and the models were further used as three-dimensional queries to screen the knowledge-based designed molecules. These pharmacophore models for collagenases picked up some potent and novel inhibitors. Subsequently, docking studies were performed for the potent molecules, and novel hits were suggested for further studies based on the docking score and active site interactions in MMP-1, MMP-8 and MMP-13. PMID:22553386
De Novo Design of Bioactive Small Molecules by Artificial Intelligence
Merk, Daniel; Friedrich, Lukas; Grisoni, Francesca
2018-01-01
Generative artificial intelligence offers a fresh view on molecular design. We present the first‐time prospective application of a deep learning model for designing new druglike compounds with desired activities. For this purpose, we trained a recurrent neural network to capture the constitution of a large set of known bioactive compounds represented as SMILES strings. By transfer learning, this general model was fine‐tuned on recognizing retinoid X and peroxisome proliferator‐activated receptor agonists. We synthesized five top‐ranking compounds designed by the generative model. Four of the compounds revealed nanomolar to low‐micromolar receptor modulatory activity in cell‐based assays. Apparently, the computational model intrinsically captured relevant chemical and biological knowledge without the need for explicit rules. The results of this study advocate generative artificial intelligence for prospective de novo molecular design, and demonstrate the potential of these methods for future medicinal chemistry. PMID:29319225
Wan, Boyong; Small, Gary W
2011-01-21
A novel synthetic data generation methodology is described for use in the development of pattern recognition classifiers that are employed for the automated detection of volatile organic compounds (VOCs) during infrared remote sensing measurements. The approach used is passive Fourier transform infrared spectrometry implemented in a downward-looking mode on an aircraft platform. A key issue in developing this methodology in practice is the need for example data that can be used to train the classifiers. To replace the time-consuming and costly collection of training data in the field, this work implements a strategy for taking laboratory analyte spectra and superimposing them on background spectra collected from the air. The resulting synthetic spectra can be used to train the classifiers. This methodology is tested by developing classifiers for ethanol and methanol, two prevalent VOCs in wide industrial use. The classifiers are successfully tested with data collected from the aircraft during controlled releases of ethanol and during a methanol release from an industrial facility. For both ethanol and methanol, missed detections in the aircraft data are in the range of 4 to 5%, with false positive detections ranging from 0.1 to 0.3%.
Li, Yi; Tseng, Yufeng J.; Pan, Dahua; Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Hopfinger, Anton J.
2008-01-01
Currently, the only validated methods to identify skin sensitization effects are in vivo models, such as the Local Lymph Node Assay (LLNA) and guinea pig studies. There is a tremendous need, in particular due to novel legislation, to develop animal alternatives, e.g., Quantitative Structure-Activity Relationship (QSAR) models. Here, QSAR models for skin sensitization using LLNA data have been constructed. The descriptors used to generate these models are derived from the 4D-molecular similarity paradigm and are referred to as universal 4D-fingerprints. A training set of 132 structurally diverse compounds and a test set of 15 structurally diverse compounds were used in this study. The statistical methodologies used to build the models are logistic regression (LR) and partial least squares coupled logistic regression (PLS-LR), which prove to be effective tools for studying skin sensitization measures expressed in the two categorical terms of sensitizer and non-sensitizer. QSAR models with low values of the Hosmer-Lemeshow goodness-of-fit statistic, χ²HL, are significant and predictive. For the training set, the cross-validated prediction accuracy of the logistic regression models ranges from 77.3% to 78.0%, while that of PLS-logistic regression models ranges from 87.1% to 89.4%. For the test set, the prediction accuracy of logistic regression models ranges from 80.0% to 86.7%, while that of PLS-logistic regression models ranges from 73.3% to 80.0%. The QSAR models are made up of 4D-fingerprints related to aromatic atoms, hydrogen bond acceptors and negatively partially charged atoms. PMID:17226934
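For readers unfamiliar with the Hosmer-Lemeshow statistic mentioned above, the following sketch computes it in its usual textbook form: sort compounds by predicted probability, split them into groups, and sum (observed − expected)² / (n·p̄·(1 − p̄)) over the groups. The grouping scheme and the default of ten groups are assumptions; the abstract does not say how the authors grouped their data.

```python
def hosmer_lemeshow(y_true, p_pred, n_groups=10):
    """Hosmer-Lemeshow statistic; low values indicate good calibration."""
    pairs = sorted(zip(p_pred, y_true))          # order by predicted probability
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue
        n_g = len(chunk)
        observed = sum(y for _, y in chunk)      # observed sensitizers in the group
        expected = sum(p for p, _ in chunk)      # expected sensitizers in the group
        mean_p = expected / n_g
        denom = n_g * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


# Hypothetical, well-calibrated predictions: the statistic is close to zero.
probs = [0.2] * 5 + [0.8] * 5
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
hl = hosmer_lemeshow(labels, probs, n_groups=2)
```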
NASA Astrophysics Data System (ADS)
Crivori, Patrizia; Zamora, Ismael; Speed, Bill; Orrenius, Christian; Poggesi, Italo
2004-03-01
A number of computational approaches are being proposed for an early optimization of ADME (absorption, distribution, metabolism and excretion) properties to increase the success rate in drug discovery. The present study describes the development of an in silico model able to estimate, from the three-dimensional structure of a molecule, the stability of a compound with respect to the human cytochrome P450 (CYP) 3A4 enzyme activity. Stability data were obtained by measuring the amount of unchanged compound remaining after a standardized incubation with human cDNA-expressed CYP3A4. The computational method transforms the three-dimensional molecular interaction fields (MIFs) generated from the molecular structure into descriptors (VolSurf and Almond procedures). The descriptors were correlated to the experimental metabolic stability classes by a partial least squares discriminant procedure. The model was trained using a set of 1800 compounds from the Pharmacia collection and was validated using two test sets: the first one including 825 compounds from the Pharmacia collection and the second one consisting of 20 known drugs. This model correctly predicted 75% of the first and 85% of the second test set and showed a precision above 86% to correctly select metabolically stable compounds. The model appears a valuable tool in the design of virtual libraries to bias the selection toward more stable compounds. Abbreviations: ADME - absorption, distribution, metabolism and excretion; CYP - cytochrome P450; MIFs - molecular interaction fields; HTS - high throughput screening; DDI - drug-drug interactions; 3D - three-dimensional; PCA - principal components analysis; CPCA - consensus principal components analysis; PLS - partial least squares; PLSD - partial least squares discriminant; GRIND - grid independent descriptors; GRID - software originally created and developed by Professor Peter Goodford.
Fragment-based prediction of skin sensitization using recursive partitioning
NASA Astrophysics Data System (ADS)
Lu, Jing; Zheng, Mingyue; Wang, Yong; Shen, Qiancheng; Luo, Xiaomin; Jiang, Hualiang; Chen, Kaixian
2011-09-01
Skin sensitization is an important toxic endpoint in the risk assessment of chemicals. In this paper, structure-activity relationship analysis was performed on the skin sensitization potential of 357 compounds with local lymph node assay data. Structural fragments were extracted by GASTON (GrAph/Sequence/Tree extractiON) from the training set. Eight fragments with accuracy significantly higher than 0.73 (p < 0.1) were retained to make up an indicator descriptor fragment. The fragment descriptor and eight other physicochemical descriptors closely related to the endpoint were calculated to construct the recursive partitioning tree (RP tree) for classification. The balanced accuracies of the training set, test set I, and test set II in the leave-one-out model were 0.846, 0.800, and 0.809, respectively. The results highlight that the fragment-based RP tree is a preferable method for identifying skin sensitizers. Moreover, the selected fragments provide useful structural information for exploring sensitization mechanisms, and the RP tree creates a graphic tree to identify the most important properties associated with skin sensitization. They can provide some guidance for designing drugs with a lower sensitization level.
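The balanced accuracy reported above is, under its usual definition, the mean of sensitivity and specificity. A small sketch with made-up labels (1 = sensitizer, 0 = non-sensitizer) shows how it differs from plain accuracy on an unbalanced set; the data are hypothetical.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical example: four sensitizers, two non-sensitizers.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
ba = balanced_accuracy(y_true, y_pred)  # sensitivity 0.75, specificity 0.5
```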
Practices of pharmacies that compound extemporaneous formulations.
Treadway, Angela K; Craddock, Deeatra; Leff, Richard
2007-07-01
A survey was conducted to characterize the standard of practice for extemporaneous pharmaceutical compounding within community and institutional pharmacies. Extemporaneous compounding practices vary among pharmacies. Because of this, the survey inquired specifically about a single pharmaceutical product (caffeine citrate 20 mg/mL) to minimize variability among respondents. Survey questions were written to identify compounding practice variations with (1) policies and procedures, (2) process validation, (3) personnel education, training, and evaluation, (4) expiration dating, (5) storage and handling of compounded prescriptions within the pharmacy, (6) labeling, (7) facilities and equipment, (8) end-product evaluation, (9) handling of sterile products outside of the pharmacy, (10) aseptic technique and product preparation, and (11) documentation. A total of 522 surveys were mailed; 117 completed surveys were returned and included in the analyses. Over half of the pharmacies surveyed were large institutional pharmacies with daily prescriptions exceeding 300. Almost 71% of pharmacies reported having policies and procedures for compounding and providing compounding training for staff. Almost one third of the pharmacies that responded did not have compounding policies and procedures and did not provide staff training. For those pharmacies that provided training, the methods used were diverse (e.g., lectures and videotapes, external certificate programs). Formulations used to compound caffeine appeared to be diverse as evidenced by the varied addition of inactive ingredients. A survey of compounding pharmacies found variability in overall compounding practices and training and in practices specifically related to compounding preparations of caffeine citrate.
Anti AIDS drug design with the help of neural networks
NASA Astrophysics Data System (ADS)
Tetko, I. V.; Tanchuk, V. Yu.; Luik, A. I.
1995-04-01
Artificial neural networks were used to analyze and predict human immunodeficiency virus type 1 reverse transcriptase inhibitors. The training and control sets included 44 molecules (most of them well-known substances such as AZT, TIBO, dde, etc.). The biological activities of the molecules were taken from the literature and rated in two classes, active and inactive compounds, according to their values. We used topological indices as molecular parameters. The four most informative parameters (out of 46) were chosen using cluster analysis and an original procedure for estimating input parameters, and were used to predict the activities of both control and new (synthesized in our institute) molecules. We applied a network pruning algorithm and network ensembles to obtain the final classifier and avoid chance correlation. Generalization of the neural network to the control-set data improved when these methods were used. The prognosis for the new molecules identified one molecule as possibly active, which was confirmed by further biological tests. The compound was as active as AZT and an order of magnitude less toxic. The active compound is currently being evaluated in preclinical trials as a possible drug for anti-AIDS therapy.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gissi, Andrea; Dipartimento di Farmacia – Scienze del Farmaco, Università degli Studi di Bari “Aldo Moro”, Via E. Orabona 4, 70125 Bari; Lombardo, Anna
The bioconcentration factor (BCF) is an important bioaccumulation hazard assessment metric in many regulatory contexts. Its assessment is required by the REACH regulation (Registration, Evaluation, Authorization and Restriction of Chemicals) and by CLP (Classification, Labeling and Packaging). We challenged nine well-known and widely used BCF QSAR models against 851 compounds stored in an ad-hoc created database. The goodness of the regression analysis was assessed by considering the determination coefficient (R²) and the Root Mean Square Error (RMSE); Cooper's statistics and the Matthews Correlation Coefficient (MCC) were calculated for all the thresholds relevant for regulatory purposes (i.e. 100 L/kg for Chemical Safety Assessment; 500 L/kg for Classification and Labeling; 2000 and 5000 L/kg for Persistent, Bioaccumulative and Toxic (PBT) and very Persistent, very Bioaccumulative (vPvB) assessment) to assess the classification, with particular attention to the models' ability to control the occurrence of false negatives. As a first step, statistical analysis was performed for the predictions of the entire dataset; R² > 0.70 was obtained using CORAL, T.E.S.T. and EPISuite Arnot–Gobas models. As classifiers, ACD and log P-based equations were the best in terms of sensitivity, ranging from 0.75 to 0.94. External compound predictions were carried out for the models that had their own training sets. The CORAL model returned the best performance (external R² = 0.59), followed by the EPISuite Meylan model (external R² = 0.58). The latter also gave the highest sensitivity on external compounds, with values from 0.55 to 0.85 depending on the threshold. Statistics were also compiled for compounds falling into the models' Applicability Domain (AD), giving better performances. In this respect, VEGA CAESAR was the best model in terms of regression (R² = 0.94) and classification (average sensitivity > 0.80). This model also showed the best regression (R² = 0.85) and sensitivity (average > 0.70) for new compounds in the AD but not present in the training set. However, no single optimal model exists, so a case-by-case assessment is advisable. Yet, integrating the wealth of information from multiple models remains the winning approach. - Highlights: • REACH encourages the use of in silico methods in the assessment of chemicals safety. • The performances of nine BCF models were evaluated on a benchmark database of 851 chemicals. • We compared the models on the basis of both regression and classification performance. • Statistics on chemicals out of the training set and/or within the applicability domain were compiled. • The results show that QSAR models are useful as weight-of-evidence in support to other methods.
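The threshold-based classification assessment described in this abstract can be sketched as below. The sketch turns observed and predicted BCF values into a confusion matrix at one regulatory threshold and derives sensitivity, specificity, accuracy, and MCC. Treating "at or above the threshold" as the positive class is an assumption, and the BCF values are invented for illustration.

```python
import math


def confusion(observed, predicted, threshold):
    """Count (tp, tn, fp, fn), treating BCF >= threshold as positive (assumed convention)."""
    tp = tn = fp = fn = 0
    for obs, pred in zip(observed, predicted):
        obs_pos, pred_pos = obs >= threshold, pred >= threshold
        if obs_pos and pred_pos:
            tp += 1
        elif obs_pos:
            fn += 1          # false negative: bioaccumulative compound missed
        elif pred_pos:
            fp += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def classification_stats(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy, and Matthews correlation coefficient."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical BCF values in L/kg, evaluated at the 500 L/kg threshold.
observed = [50, 150, 800, 2500, 6000, 90]
predicted = [80, 300, 400, 2100, 5500, 120]
stats = classification_stats(*confusion(observed, predicted, 500))
```

Repeating the call for 100, 2000, and 5000 L/kg gives one set of statistics per threshold, and the false-negative count is the quantity the abstract singles out.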
Trainable structure-activity relationship model for virtual screening of CYP3A4 inhibition.
Didziapetris, Remigijus; Dapkunas, Justas; Sazonovas, Andrius; Japertas, Pranas
2010-11-01
A new structure-activity relationship model predicting the probability for a compound to inhibit human cytochrome P450 3A4 has been developed using data for >800 compounds from various literature sources and tested on PubChem screening data. Novel GALAS (Global, Adjusted Locally According to Similarity) modeling methodology has been used, which is a combination of baseline global QSAR model and local similarity based corrections. GALAS modeling method allows forecasting the reliability of prediction thus defining the model applicability domain. For compounds within this domain the statistical results of the final model approach the data consistency between experimental data from literature and PubChem datasets with the overall accuracy of 89%. However, the original model is applicable only for less than a half of PubChem database. Since the similarity correction procedure of GALAS modeling method allows straightforward model training, the possibility to expand the applicability domain has been investigated. Experimental data from PubChem dataset served as an example of in-house high-throughput screening data. The model successfully adapted itself to both data classified using the same and different IC₅₀ threshold compared with the training set. In addition, adjustment of the CYP3A4 inhibition model to compounds with a novel chemical scaffold has been demonstrated. The reported GALAS model is proposed as a useful tool for virtual screening of compounds for possible drug-drug interactions even prior to the actual synthesis.
Consensus QSAR model for identifying novel H5N1 inhibitors.
Sharma, Nitin; Yap, Chun Wei
2012-08-01
Due to the importance of neuraminidase in the pathogenesis of influenza virus infection, it has been regarded as the most important drug target for the treatment of influenza. Resistance to currently available drugs and new findings related to the structure of the protein require novel neuraminidase 1 (N1) inhibitors. In this study, a consensus QSAR model with a defined applicability domain (AD) was developed using published N1 inhibitors. The consensus model was validated using an external validation set. The model achieved high sensitivity, specificity, and overall accuracy along with a low false positive rate (FPR) and false discovery rate (FDR). The performance of the model on the external validation set and the training set was comparable, so it is unlikely to be overfitted. The low FPR and low FDR will increase its accuracy in screening large chemical libraries. Screening of the ZINC library resulted in 64,772 compounds as probable N1 inhibitors, while 173,674 compounds were defined to be outside the AD of the consensus model. The advantage of the current model is that it was developed using a large and diverse dataset and has a defined AD, which prevents its use on compounds that it is not capable of predicting. The consensus model developed in this study is made available via the free software PaDEL-DDPredictor.
2017-01-01
Cytochrome P450 aromatase (CYP19A1) plays a key role in the development of estrogen dependent breast cancer, and aromatase inhibitors have been at the front line of treatment for the past three decades. The development of potent, selective and safer inhibitors is ongoing with in silico screening methods playing a more prominent role in the search for promising lead compounds in bioactivity-relevant chemical space. Here we present a set of comprehensive binding affinity prediction models for CYP19A1 using our automated Linear Interaction Energy (LIE) based workflow on a set of 132 putative and structurally diverse aromatase inhibitors obtained from a typical industrial screening study. We extended the workflow with machine learning methods to automatically cluster training and test compounds in order to maximize the number of explained compounds in one or more predictive LIE models. The method uses protein–ligand interaction profiles obtained from Molecular Dynamics (MD) trajectories to help model search and define the applicability domain of the resolved models. Our method was successful in accounting for 86% of the data set in 3 robust models that show high correlation between calculated and observed values for ligand-binding free energies (RMSE < 2.5 kJ mol⁻¹), with good cross-validation statistics. PMID:28776988
Geiger, Friedemann; Kasper, Jürgen
2012-01-01
Shared decision making (SDM) between patient and physician is an interpersonal process. Most SDM measures use the view of one party (patient, physician or observer) as a proxy to capture this process although these views typically diverge. This study suggests the compound measure SDM(MASS) (SDM Meeting its concept's ASSumptions) integrating these three perspectives in one single index. SDM(MASS) was derived theoretically and compared empirically to unilateral perspectives of patients, physicians and observers by application to a data set of 10 physicians (40 consultations) receiving an SDM training. The constituting parts of SDM(MASS) were highly reliable (Cronbach's alpha .94; interrater reliability .74-.87). Unilateral appraisal of training effects was divergent. SDM(MASS) revealed no effect. SDM(MASS) combines noteworthy information about SDM processes from different viewpoints and thereby delivers plausible assessments. It could overcome immanent shortcomings of unilateral approaches. However, it is a complex measure needing further validation. Copyright © 2012. Published by Elsevier GmbH.
QSAR models for thiophene and imidazopyridine derivatives inhibitors of the Polo-Like Kinase 1.
Comelli, Nieves C; Duchowicz, Pablo R; Castro, Eduardo A
2014-10-01
The inhibitory activity of 103 thiophene and 33 imidazopyridine derivatives against Polo-Like Kinase 1 (PLK1), expressed as pIC50 (-logIC50), was predicted by QSAR modeling. Multivariate linear regression (MLR) was employed to model the relationship between 0D and 3D molecular descriptors and the biological activities of molecules, using the replacement method (MR) as a variable selection tool. The 136 compounds were separated into several training and test sets. Two splitting approaches, distribution of biological data and structural diversity, and the statistical experimental design procedure D-optimal distance were applied to the dataset. The significance of the training set models was confirmed by statistically higher values of the internal leave-one-out cross-validated coefficient of determination (Q²) and the external predictive coefficient of determination for the test set (R²test). The model developed from a training set obtained with the D-optimal distance protocol, using 3D descriptor space along with activity values, separated chemical features that made it possible to distinguish high and low pIC50 values reasonably well. We then verified that this model was sufficient to reliably and accurately predict the activity of external diverse structures. The model robustness was properly characterized by means of standard procedures, and its applicability domain (AD) was analyzed by the leverage method. Copyright © 2014 Elsevier B.V. All rights reserved.
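The leave-one-out Q² mentioned above can be illustrated with a deliberately simplified sketch: a single-descriptor least-squares line instead of the multivariate models in the paper, and the common definition Q² = 1 − PRESS / SS_tot with SS_tot taken about the full-set mean (other conventions exist). The descriptor and pIC50 values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_loo(xs, ys):
    """Leave-one-out cross-validated Q2 = 1 - PRESS / SS_tot."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and pIC50 values.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
pic50 = [5.1, 5.9, 7.2, 7.8, 9.1]
q2 = q2_loo(descriptor, pic50)
```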
Ferrari, Thomas; Lombardo, Anna; Benfenati, Emilio
2018-05-14
Several methods exist to develop QSAR models automatically. Some are based on indices of the presence of atoms, others on the most similar compounds, others on molecular descriptors. Here we introduce QSARpy v1.0, a new QSAR modeling tool based on a different approach: dissimilarity. This tool fragments the molecules of the training set to extract fragments that can be associated with a difference in the property/activity value, called modulators. If the target molecule shares part of its structure with a molecule of the training set and the differences can be explained with one or more modulators, the property/activity value of the training set molecule is adjusted using the value associated with the modulator(s). This tool is tested here on the n-octanol/water partition coefficient (Kow, usually expressed in logarithmic units as log Kow). It is a key parameter in risk assessment since it is a measure of hydrophobicity. Its widespread use makes these estimation methods very useful to reduce testing costs. Using QSARpy v1.0, we obtained a new model to predict log Kow with accurate performance (RMSE 0.43 and R² 0.94 for the external test set), comparing favorably with other programs. QSARpy is freely available on request. Copyright © 2018 Elsevier B.V. All rights reserved.
Modeling of adipose/blood partition coefficient for environmental chemicals.
Papadaki, K C; Karakitsios, S P; Sarigiannis, D A
2017-12-01
A Quantitative Structure Activity Relationship (QSAR) model was developed in order to predict the adipose/blood partition coefficient of environmental chemical compounds. The first step of QSAR modeling was the collection of inputs. Input data included the experimental values of the adipose/blood partition coefficient and two sets of molecular descriptors for 67 organic chemical compounds: a) the descriptors from the Linear Free Energy Relationship (LFER) and b) the PaDEL descriptors. The datasets were split into training and prediction sets and were analysed using two statistical methods: Genetic Algorithm based Multiple Linear Regression (GA-MLR) and Artificial Neural Networks (ANN). The models with LFER and PaDEL descriptors, coupled with ANN, produced satisfying performance results. The fitting performance (R²) of the models using LFER and PaDEL descriptors was 0.94 and 0.96, respectively. The Applicability Domain (AD) of the models was assessed and then the models were applied to a large number of chemical compounds with unknown values of the adipose/blood partition coefficient. In conclusion, the proposed models were checked for fitting, validity and applicability. It was demonstrated that they are stable, reliable and capable of predicting the values of the adipose/blood partition coefficient of "data poor" chemical compounds that fall within the applicability domain. Copyright © 2017. Published by Elsevier Ltd.
The effect of leverage and/or influential on structure-activity relationships.
Bolboacă, Sorana D; Jäntschi, Lorentz
2013-05-01
In the spirit of reporting valid and reliable Quantitative Structure-Activity Relationship (QSAR) models, the aim of our research was to assess how leverage (analyzed with the hat matrix, h(i)) and influence (analyzed with Cook's distance, D(i)) in QSAR models reflect the models' reliability and characteristics. The datasets included in this research were collected from previously published papers. Seven datasets that met the imposed inclusion criteria were analyzed. Three models were obtained for each dataset (full-model, h(i)-model and D(i)-model) and several statistical validation criteria were applied to the models. In 5 out of 7 sets the correlation coefficient increased when compounds with either h(i) or D(i) higher than the threshold were removed. The number of withdrawn compounds varied from 2 to 4 for h(i)-models and from 1 to 13 for D(i)-models. Validation statistics showed that D(i)-models possess systematically better agreement than both full-models and h(i)-models. Removing influential compounds from the training set significantly improves the model and is recommended when developing quantitative structure-activity relationships. The Cook's distance approach should be combined with hat matrix analysis to identify candidate compounds for removal.
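The two diagnostics named in this abstract can be sketched for the simplest case, a one-descriptor linear model with an intercept (p = 2 parameters). The leverage is h(i) = 1/n + (x(i) − x̄)² / Sxx, and Cook's distance is D(i) = e(i)² / (p·s²) · h(i) / (1 − h(i))². The single-descriptor restriction and the data are assumptions for illustration, and the paper's removal thresholds are not reproduced here.

```python
def leverage_and_cooks(xs, ys):
    """Hat-matrix diagonal h(i) and Cook's distance D(i) for a one-descriptor OLS fit."""
    n, p = len(xs), 2                      # p counts slope and intercept
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    s2 = sum(e * e for e in residuals) / (n - p)   # residual variance estimate
    leverages = [1.0 / n + (x - mean_x) ** 2 / sxx for x in xs]
    cooks = [(e * e / (p * s2)) * h / (1.0 - h) ** 2
             for e, h in zip(residuals, leverages)]
    return leverages, cooks


# Hypothetical data: the last compound lies far from the others in descriptor space.
descriptor = [1.0, 2.0, 3.0, 4.0, 10.0]
activity = [1.1, 1.9, 3.2, 3.9, 7.0]
h, d = leverage_and_cooks(descriptor, activity)
```

Leverage depends only on the descriptor values, while Cook's distance also accounts for the residual, which is one reason the two criteria can flag different compounds.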
Application of Generative Autoencoder in De Novo Molecular Design.
Blaschke, Thomas; Olivecrona, Marcus; Engkvist, Ola; Bajorath, Jürgen; Chen, Hongming
2018-01-01
A major challenge in computational chemistry is the generation of novel molecular structures with desirable pharmacological and physicochemical properties. In this work, we investigate the potential use of the autoencoder, a deep learning methodology, for de novo molecular design. Various generative autoencoders were used to map molecule structures into a continuous latent space and vice versa, and their performance as structure generators was assessed. Our results show that the latent space preserves the chemical similarity principle and thus can be used for the generation of analogue structures. Furthermore, the latent space created by the autoencoders was searched systematically to generate novel compounds with predicted activity against dopamine receptor type 2, and compounds similar to known active compounds not included in the training set were identified. © 2018 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.
Bajda, Marek; Jończyk, Jakub; Malawska, Barbara; Filipek, Sławomir
2014-03-24
β-Secretase (BACE-1) constitutes an important target for search of anti-Alzheimer's drugs. The first inhibitors of this enzyme were peptidic compounds with high molecular weight and low bioavailability. Therefore, the search for new efficient non-peptidic inhibitors has been undertaken by many scientific groups. We started our work from the development of in silico methodology for the design of novel BACE-1 ligands. It was validated on the basis of crystal structures of complexes with inhibitors, redocking, cross-docking and training/test sets of reference ligands. The presented procedure of assessment of the novel compounds as β-secretase inhibitors could be widely used in the design process.
Masuda, Yosuke; Yoshida, Tomoki; Yamaotsu, Noriyuki; Hirono, Shuichi
2018-01-01
We recently reported that the Gibbs free energy of hydrolytic water molecules (ΔGwat) in acyl-trypsin intermediates calculated by hydration thermodynamics analysis could be a useful metric for estimating the catalytic rate constants (kcat) of mechanism-based reversible covalent inhibitors. For thorough evaluation, the proposed method was tested with an increased number of covalent ligands that have no corresponding crystal structures. After modeling acyl-trypsin intermediate structures using flexible molecular superposition, ΔGwat values were calculated according to the proposed method. The orbital energies of antibonding π* molecular orbitals (MOs) of carbonyl C=O in covalently modified catalytic serine (Eorb) were also calculated by semi-empirical MO calculations. Then, linear discriminant analysis (LDA) was performed to build a model that can discriminate covalent inhibitor candidates from substrate-like ligands using ΔGwat and Eorb. The model was built using a training set (10 compounds) and then validated by a test set (4 compounds). As a result, the training set and test set ligands were perfectly discriminated by the model. Hydrolysis was slower when (1) the hydrolytic water molecule has lower ΔGwat; (2) the covalent ligand presents higher Eorb (higher reaction barrier). Results also showed that the entropic term of the hydrolytic water molecule (-TΔSwat) could be used for estimating kcat and for covalent inhibitor optimization; when the rotational freedom of the hydrolytic water molecule is limited, the chance for favorable interaction with the electrophilic acyl group would also be limited. The method proposed in this study would be useful for screening and optimizing mechanism-based reversible covalent inhibitors.
Ferreira da Costa, Joana; Silva, David; Caamaño, Olga; Brea, José M; Loza, Maria Isabel; Munteanu, Cristian R; Pazos, Alejandro; García-Mera, Xerardo; González-Díaz, Humbert
2018-06-25
Predicting drug-protein interactions (DPIs) for target proteins involved in dopamine pathways is a very important goal in medicinal chemistry. We can tackle this problem using Molecular Docking or Machine Learning (ML) models for one specific protein. Unfortunately, these models fail to account for large and complex big data sets of preclinical assays reported in public databases. This includes multiple conditions of assays, such as different experimental parameters, biological assays, target proteins, cell lines, organism of the target, or organism of assay. On the other hand, perturbation theory (PT) models allow us to predict the properties of a query compound or molecular system in experimental assays with multiple boundary conditions based on a previously known case of reference. In this work, we report the first PTML (PT + ML) study of a large ChEMBL data set of preclinical assays of compounds targeting dopamine pathway proteins. The best PTML model found predicts 50000 cases with accuracy of 70-91% in training and external validation series. We also compared the linear PTML model with alternative PTML models trained with multiple nonlinear methods (artificial neural network (ANN), Random Forest, Deep Learning, etc.). Some of the nonlinear methods outperform the linear model but at the cost of a notable increment of the complexity of the model. We illustrated the practical use of the new model with a proof-of-concept theoretical-experimental study. We reported for the first time the organic synthesis, chemical characterization, and pharmacological assay of a new series of l-prolyl-l-leucyl-glycinamide (PLG) peptidomimetic compounds. In addition, we performed a molecular docking study for some of these compounds with the software Vina AutoDock. The work ends with a PTML model predictive study of the outcomes of the new compounds in a large number of assays. 
Therefore, this study offers a new computational methodology for predicting the outcome for any compound in new assays. This PTML method focuses on the prediction with a simple linear model of multiple pharmacological parameters (IC50, EC50, Ki, etc.) for compounds in assays involving different cell lines used, organisms of the protein target, or organism of assay for proteins in the dopamine pathway.
Zang, Yang-Yang; Li, Yuan-Mei; Yin, Yue; Chen, Shan-Shan; Kai, Zhen-Peng
2017-09-01
In a previous study we demonstrated that insect 3-hydroxy-3-methylglutaryl-CoA reductase (HMGR) can be a potential selective insecticide target. Three series of inhibitors were designed on the basis of the difference in HMGR structures from Homo sapiens and Manduca sexta, with the aim of discovering potent selective insecticide candidates. An in vitro bioassay showed that gem-difluoromethylenated statin analogues have potent effects on JH biosynthesis of M. sexta and high selectivity between H. sapiens and M. sexta. All series II compounds {1,3,5-trisubstituted [4-tert-butyl 2-(5,5-difluoro-2,2-dimethyl-6-vinyl-4-yl) acetate] pyrazoles} have some effect on JH biosynthesis, whereas most of them are inactive on human HMGR. In particular, the IC50 value of compound II-12 (37.8 nM) is lower than that of lovastatin (99.5 nM) and similar to that of rosuvastatin (24.2 nM). An in vivo bioassay showed that I-1, I-2, I-3 and II-12 are potential selective insecticides, especially for lepidopteran pest control. A predictable and statistically meaningful CoMFA model of 23 inhibitors (20 in the training set and three in the test set) was obtained with q² and r² values of 0.66 and 0.996, respectively. The final model suggested that a potent insect HMGR inhibitor should contain suitable small and non-electronegative groups in the ring part, and electronegative groups in the side chain. Four analogues were discovered as potent selective lepidopteran HMGR inhibitors, which can specifically be used for lepidopteran pest control. The CoMFA model will be useful for the design of new selective insect HMGR inhibitors that are structurally related to the training set compounds. © 2017 Society of Chemical Industry.
Boik, John C; Newman, Robert A
2008-01-01
Background: Quantitative structure-activity relationship (QSAR) models have become popular tools to help identify promising lead compounds in anticancer drug development. Few QSAR studies have investigated multitask learning, however. Multitask learning is an approach that allows distinct but related data sets to be used in training. In this paper, a suite of three QSAR models is developed to identify compounds that are likely to (a) exhibit cytotoxic behavior against cancer cells, (b) exhibit high rat LD50 values (low systemic toxicity), and (c) exhibit low to modest human oral clearance (favorable pharmacokinetic characteristics). Models were constructed using Kernel Multitask Latent Analysis (KMLA), an approach that can effectively handle a large number of correlated data features, nonlinear relationships between features and responses, and multitask learning. Multitask learning is particularly useful when the number of available training records is small relative to the number of features, as was the case with the oral clearance data. Results: Multitask learning modestly but significantly improved the classification precision for the oral clearance model. For the cytotoxicity model, which was constructed using a large number of records, multitask learning did not affect precision but did reduce computation time. The models developed here were used to predict activities for 115,000 natural compounds. Hundreds of natural compounds, particularly in the anthraquinone and flavonoids groups, were predicted to be cytotoxic, have high LD50 values, and have low to moderate oral clearance. Conclusion: Multitask learning can be useful in some QSAR models. A suite of QSAR models was constructed and used to screen a large drug library for compounds likely to be cytotoxic to multiple cancer cell lines in vitro, have low systemic toxicity in rats, and have favorable pharmacokinetic properties in humans. PMID:18554402
Boik, John C; Newman, Robert A
2008-06-13
Quantitative structure-activity relationship (QSAR) models have become popular tools to help identify promising lead compounds in anticancer drug development. Few QSAR studies have investigated multitask learning, however. Multitask learning is an approach that allows distinct but related data sets to be used in training. In this paper, a suite of three QSAR models is developed to identify compounds that are likely to (a) exhibit cytotoxic behavior against cancer cells, (b) exhibit high rat LD50 values (low systemic toxicity), and (c) exhibit low to modest human oral clearance (favorable pharmacokinetic characteristics). Models were constructed using Kernel Multitask Latent Analysis (KMLA), an approach that can effectively handle a large number of correlated data features, nonlinear relationships between features and responses, and multitask learning. Multitask learning is particularly useful when the number of available training records is small relative to the number of features, as was the case with the oral clearance data. Multitask learning modestly but significantly improved the classification precision for the oral clearance model. For the cytotoxicity model, which was constructed using a large number of records, multitask learning did not affect precision but did reduce computation time. The models developed here were used to predict activities for 115,000 natural compounds. Hundreds of natural compounds, particularly in the anthraquinone and flavonoids groups, were predicted to be cytotoxic, have high LD50 values, and have low to moderate oral clearance. Multitask learning can be useful in some QSAR models. A suite of QSAR models was constructed and used to screen a large drug library for compounds likely to be cytotoxic to multiple cancer cell lines in vitro, have low systemic toxicity in rats, and have favorable pharmacokinetic properties in humans.
Goya Jorge, Elizabeth; Rayar, Anita Maria; Barigye, Stephen J; Jorge Rodríguez, María Elisa; Sylla-Iyarreta Veitía, Maité
2016-06-07
A quantitative structure-activity relationship (QSAR) study of the 2,2-diphenyl-1-picrylhydrazyl (DPPH•) radical scavenging ability of 1373 chemical compounds was developed using DRAGON molecular descriptors (MD) and a neural network technique based on the multilayer perceptron (MLP). The resulting model demonstrated satisfactory performance for the training set (R² = 0.713) and the test set (Q²ext = 0.654). To gain greater insight into the relevance of the MD contained in the MLP model, sensitivity and principal component analyses were performed. Moreover, structural and mechanistic interpretation was carried out to understand the relationship of the variables in the model with the modeled property. The constructed MLP model was employed to predict the radical scavenging ability of a group of coumarin-type compounds. Finally, in order to validate the model's predictions, an in vitro assay for one of the compounds (4-hydroxycoumarin) was performed, showing satisfactory agreement between the experimental and predicted pIC50 values.
De Novo Design of Bioactive Small Molecules by Artificial Intelligence.
Merk, Daniel; Friedrich, Lukas; Grisoni, Francesca; Schneider, Gisbert
2018-01-01
Generative artificial intelligence offers a fresh view on molecular design. We present the first-time prospective application of a deep learning model for designing new druglike compounds with desired activities. For this purpose, we trained a recurrent neural network to capture the constitution of a large set of known bioactive compounds represented as SMILES strings. By transfer learning, this general model was fine-tuned on recognizing retinoid X and peroxisome proliferator-activated receptor agonists. We synthesized five top-ranking compounds designed by the generative model. Four of the compounds revealed nanomolar to low-micromolar receptor modulatory activity in cell-based assays. Apparently, the computational model intrinsically captured relevant chemical and biological knowledge without the need for explicit rules. The results of this study advocate generative artificial intelligence for prospective de novo molecular design, and demonstrate the potential of these methods for future medicinal chemistry. © 2018 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.
ERIC Educational Resources Information Center
Grisante, Priscila C.; Galesi, Fernanda L.; Sabino, Nathali M.; Debert, Paula; Arntzen, Erik; McIlvane, William J.
2013-01-01
When the matching-to-sample (MTS) procedure is used, different training structures imply differences in the successive discriminations required in training and test conditions. When the go/no-go procedure with compound stimuli is used, however, differences in training structures do not imply such differences. This study assessed whether the…
Han, Shu-ying; Liang, Chao; Qiao, Jun-qin; Lian, Hong-zhen; Ge, Xin; Chen, Hong-yuan
2012-02-03
The retention factor corresponding to pure water in reversed-phase high performance liquid chromatography (RP-HPLC), k(w), was commonly obtained by extrapolation of retention factor (k) in a mixture of organic modifier and water as mobile phase in tedious experiments. In this paper, a relationship between logk(w) and logk for directly determining k(w) has been proposed for the first time. With a satisfactory validation, the approach was confirmed to enable easy and accurate evaluation of k(w) for compounds in question with similar structure to model compounds. Eight PCB congeners with different degree of chlorination were selected as a training set for modeling the logk(w)-logk correlation on both silica-based C(8) and C(18) stationary phases to evaluate logk(w) of sample compounds including seven PCB, six PBB and eight PBDE congeners. These eight model PCBs were subsequently combined with seven structure-similar benzene derivatives possessing reliable experimental K(ow) values as a whole training set for logK(ow)-logk(w) regressions on the two stationary phases. Consequently, the evaluated logk(w) values of sample compounds were used to determine their logK(ow) by the derived logK(ow)-logk(w) models. The logK(ow) values obtained by these evaluated logk(w) were well comparable with those obtained by experimental-extrapolated logk(w), demonstrating that the proposed method for logk(w) evaluation in this present study could be an effective means in lipophilicity study of environmental contaminants with numerous congeners. As a result, logK(ow) data of many PCBs, PBBs and PBDEs could be offered. These contaminants are considered to widely exist in the environment, but there have been no reliable experimental K(ow) data available yet. Copyright © 2011 Elsevier B.V. All rights reserved.
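The conventional extrapolation step that this abstract describes as tedious can be sketched as follows. This is a minimal illustration, assuming the commonly used linear relation log k = log k_w − S·φ between retention and the organic modifier volume fraction φ; the relation, the function name `extrapolate_log_kw`, and the sample values are assumptions for illustration and are not taken from the paper.

```python
def extrapolate_log_kw(phis, log_ks):
    """Least-squares fit of log k = log k_w - S * phi.

    phis   : organic modifier volume fractions (0..1)
    log_ks : measured log k at each fraction
    Returns (log_kw, S), where log_kw is the intercept at phi = 0.
    """
    n = len(phis)
    mean_phi = sum(phis) / n
    mean_logk = sum(log_ks) / n
    sxx = sum((p - mean_phi) ** 2 for p in phis)
    sxy = sum((p - mean_phi) * (y - mean_logk) for p, y in zip(phis, log_ks))
    slope = sxy / sxx
    intercept = mean_logk - slope * mean_phi
    return intercept, -slope


# Hypothetical isocratic measurements for one compound (not real data).
phis = [0.5, 0.6, 0.7, 0.8]
log_ks = [2.0, 1.5, 1.0, 0.5]
log_kw, s = extrapolate_log_kw(phis, log_ks)
print(f"log k_w = {log_kw:.2f}, S = {s:.2f}")
```

Each compound needs several isocratic runs for this fit, which is the experimental cost that a direct logk(w)-logk relationship is meant to avoid.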
CoMFA and CoMSIA studies on C-aryl glucoside SGLT2 inhibitors as potential anti-diabetic agents.
Vyas, V K; Bhatt, H G; Patel, P K; Jalu, J; Chintha, C; Gupta, N; Ghate, M
2013-01-01
SGLT2 has become a target of therapeutic interest in diabetes research. CoMFA and CoMSIA studies were performed on C-aryl glucoside SGLT2 inhibitors (180 analogues) as potential anti-diabetic agents. Three different alignment strategies were used for the compounds. The best CoMFA and CoMSIA models were obtained by means of Distill rigid body alignment of training and test sets, and were found to be statistically significant, with cross-validated coefficients (q²) of 0.602 and 0.618, respectively, and conventional coefficients (r²) of 0.905 and 0.902, respectively. Both models were validated by a test set of 36 compounds, giving satisfactory predicted correlation coefficients (r² pred) of 0.622 and 0.584 for the CoMFA and CoMSIA models, respectively. A comparison with an earlier 3D QSAR study on SGLT2 inhibitors shows that our 3D QSAR models predict inhibitory activity better than the earlier models. The CoMFA and CoMSIA models generated in this work can provide useful information for designing new compounds and help predict activity prior to synthesis.
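The difference between the conventional r² and the cross-validated q² reported in abstracts like this one can be illustrated with a small sketch. CoMFA and CoMSIA use partial least squares over field descriptors; the single-descriptor ordinary least-squares fit below is a swapped-in stand-in chosen only to keep the example short, q² is computed here as 1 − PRESS/SS with leave-one-out predictions and the full-set mean (conventions vary), and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys):
    """Conventional r2: fit and evaluate on the same compounds."""
    slope, intercept = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out q2: each compound is predicted by a model fit without it."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and activities (not data from the study).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
r2 = r_squared(xs, ys)
q2 = q_squared_loo(xs, ys)
print(f"r2 = {r2:.3f}, q2 = {q2:.3f}")
```

For a least-squares fit, each leave-one-out residual is at least as large in magnitude as the corresponding fitted residual, so q² cannot exceed r²; a large gap between the two is a common warning sign of overfitting.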
Wrappers for Performance Enhancement and Oblivious Decision Graphs
1995-09-01
always select all relevant features. We test different search engines to search the space of feature subsets and introduce compound operators to speed...distinct instances from the original dataset appearing in the test set is thus 0.632m. The ε0_i accuracy estimate is derived by using bootstrap sample...i for training and the rest of the instances for testing. Given a number b, the number of bootstrap samples, let ε0_i be the accuracy estimate for
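The 0.632 factor in this excerpt follows from sampling m instances with replacement: the chance that a given instance is never drawn is (1 − 1/m)^m, which approaches 1/e ≈ 0.368. The sketch below is an illustration under that standard argument; the function names are hypothetical and the code is not taken from the report.

```python
import random


def expected_distinct_fraction(m):
    """Expected fraction of the m original instances that appear at least once
    in a bootstrap sample of size m drawn with replacement."""
    return 1.0 - (1.0 - 1.0 / m) ** m


def bootstrap_split(m, rng):
    """Draw one bootstrap sample of indices and return it together with the
    indices that were never drawn (the held-out instances)."""
    sample = [rng.randrange(m) for _ in range(m)]
    drawn = set(sample)
    held_out = [i for i in range(m) if i not in drawn]
    return sample, held_out


rng = random.Random(0)
sample, held_out = bootstrap_split(1000, rng)
print(f"expected distinct fraction: {expected_distinct_fraction(1000):.4f}")
print(f"observed distinct fraction: {len(set(sample)) / 1000:.3f}")
```

Training on `sample` and testing on `held_out`, then averaging over b such splits, gives the kind of per-sample accuracy estimate the excerpt goes on to define.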
Chandra, Sharat; Pandey, Jyotsana; Tamrakar, Akhilesh Kumar; Siddiqi, Mohammad Imran
2017-01-01
In the insulin and leptin signaling pathways, Protein-Tyrosine Phosphatase 1B (PTP1B) plays a crucial controlling role as a negative regulator, which makes it an attractive therapeutic target for both Type-2 Diabetes (T2D) and obesity. In this work, we generated classification models from the inhibition data set of known PTP1B inhibitors in order to identify new inhibitors of PTP1B, using multiple machine learning techniques (naïve Bayesian, random forest, support vector machine and k-nearest neighbors) along with structural fingerprints and selected molecular descriptors. Several models from each algorithm were constructed and optimized with different combinations of molecular descriptors and structural fingerprints. For the training and test sets, most of the predictive models showed overall prediction accuracies of more than 90%. The best model was obtained with the support vector machine approach and has a Matthews Correlation Coefficient of 0.82 for the external test set; it was further employed for the virtual screening of the Maybridge small compound database. Five compounds were subsequently selected for experimental assay. Of these, two compounds were found to inhibit PTP1B with significant inhibitory activity in an in vitro inhibition assay. The structural fragments that are important for PTP1B inhibition were identified by the naïve Bayesian method and can be further exploited to design new molecules around the identified scaffolds. The descriptive and predictive modeling strategy applied in this study is capable of identifying PTP1B inhibitors from large compound libraries. Copyright © 2016 Elsevier Inc. All rights reserved.
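The Matthews Correlation Coefficient quoted above is computed from the four cells of a binary confusion matrix. A minimal sketch follows; the function name and the example counts are hypothetical and do not come from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical external-test-set counts (inhibitors = positive class).
print(f"MCC = {matthews_corrcoef(tp=45, tn=40, fp=5, fn=10):.3f}")
```

Unlike overall accuracy, MCC stays near zero for a classifier that simply predicts the majority class, which is why it is often reported for imbalanced screening sets.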
Kumar, B V S Suneel; Lakshmi, Narasu; Kumar, M Ravi; Rambabu, Gundla; Manjashetty, Thimmappa H; Arunasree, Kalle M; Sriram, Dharmarajan; Ramkumar, Kavya; Neamati, Nouri; Dayam, Raveendra; Sarma, J A R P
2014-01-01
Fibroblast growth factor receptor 1 (FGFR1), a tyrosine kinase receptor, plays important roles in angiogenesis, embryonic development, cell proliferation, cell differentiation, and wound healing. The FGF isoforms and their receptors (FGFRs) are considered potential targets and are under intense research for the design of potential anticancer agents. Fibroblast growth factors (FGFs) and their receptors (FGFRs) play a vital role in one of the critical pathways in monitoring angiogenesis. In the current study, quantitative pharmacophore models were generated and validated using known FGFR1 inhibitors. The pharmacophore models were generated using a training set of 28 compounds. The top pharmacophore model was selected and validated using a test set of 126 compounds and also using external validation. The validated pharmacophore was used as a virtual screening query to screen a database of 400,000 virtual molecules, and the pharmacophore model retrieved 2800 hits. The retrieved hits were subsequently filtered based on the fit value. The selected hits were subjected to docking studies to observe their binding modes and to reduce the number of false positives. One of the potential hits (a thiazole-2-amine derivative) was selected based on the pharmacophore fit value, dock score, and synthetic feasibility. A few analogues of the thiazole-2-amine derivative were synthesized. These compounds were screened for FGFR1 activity and in anti-proliferative studies. The top active compound showed 56.87% inhibition of FGFR1 activity at 50 µM and also showed good cellular activity. Further optimization of thiazole-2-amine derivatives is in progress.
Hit Dexter: A Machine-Learning Model for the Prediction of Frequent Hitters.
Stork, Conrad; Wagner, Johannes; Friedrich, Nils-Ole; de Bruyn Kops, Christina; Šícho, Martin; Kirchmair, Johannes
2018-03-20
False-positive assay readouts caused by badly behaving compounds (frequent hitters, pan-assay interference compounds [PAINS], aggregators, and others) continue to pose a major challenge to experimental screening. There are only a few in silico methods that allow the prediction of such problematic compounds. We report the development of Hit Dexter, two extremely randomized trees classifiers for the prediction of compounds likely to trigger positive assay readouts either by true promiscuity or by assay interference. The models were trained on a well-prepared dataset extracted from the PubChem Bioassay database, consisting of approximately 311,000 compounds tested for activity on at least 50 proteins. Hit Dexter reached MCC and AUC values of up to 0.67 and 0.96, respectively, on an independent test set. The models are expected to be of high value, in particular to medicinal chemists and biochemists who can use Hit Dexter to identify compounds for which extra caution should be exercised with positive assay readouts. Hit Dexter is available as a free web service at http://hitdexter.zbh.uni-hamburg.de. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Assessing deep and shallow learning methods for quantitative prediction of acute chemical toxicity.
Liu, Ruifeng; Madore, Michael; Glover, Kyle P; Feasel, Michael G; Wallqvist, Anders
2018-05-02
Animal-based methods for assessing chemical toxicity are struggling to meet testing demands. In silico approaches, including machine-learning methods, are promising alternatives. Recently, deep neural networks (DNNs) were evaluated and reported to outperform other machine-learning methods for quantitative structure-activity relationship modeling of molecular properties. However, most of the reported performance evaluations relied on global performance metrics, such as the root mean squared error (RMSE) between the predicted and experimental values of all samples, without considering the impact of sample distribution across the activity spectrum. Here, we carried out an in-depth analysis of DNN performance for quantitative prediction of acute chemical toxicity using several datasets. We found that the overall performance of DNN models on datasets of up to 30,000 compounds was similar to that of random forest (RF) models, as measured by the RMSE and correlation coefficients between the predicted and experimental results. However, our detailed analyses demonstrated that global performance metrics are inappropriate for datasets with a highly uneven sample distribution, because they show a strong bias for the most populous compounds along the toxicity spectrum. For highly toxic compounds, DNN and RF models trained on all samples performed much worse than the global performance metrics indicated. Surprisingly, our variable nearest neighbor method, which utilizes only structurally similar compounds to make predictions, performed reasonably well, suggesting that information of close near neighbors in the training sets is a key determinant of acute toxicity predictions.
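The point about global metrics hiding poor performance on sparsely populated regions of the activity spectrum can be shown with a small sketch that compares a global RMSE with RMSE values computed per activity bin. The data, the bin edge, and the function names below are hypothetical and serve only to illustrate the effect; they are not the authors' analysis.

```python
import math


def rmse(pairs):
    """Root mean squared error over (observed, predicted) pairs."""
    return math.sqrt(sum((pred - obs) ** 2 for obs, pred in pairs) / len(pairs))


def rmse_by_bin(pairs, edges):
    """Group pairs by the observed value and return {bin_index: RMSE}.

    edges are sorted boundaries; bin 0 holds observed values below edges[0],
    bin 1 holds values from edges[0] up to edges[1], and so on.
    """
    bins = {}
    for obs, pred in pairs:
        idx = sum(1 for edge in edges if obs >= edge)
        bins.setdefault(idx, []).append((obs, pred))
    return {idx: rmse(members) for idx, members in sorted(bins.items())}


# Hypothetical set: 90 low-toxicity compounds predicted well,
# 10 highly toxic compounds predicted poorly.
pairs = [(2.0, 2.1)] * 90 + [(6.0, 4.0)] * 10
global_rmse = rmse(pairs)
per_bin = rmse_by_bin(pairs, edges=[4.0])
print(f"global RMSE = {global_rmse:.2f}")
for idx, value in per_bin.items():
    print(f"bin {idx}: RMSE = {value:.2f}")
```

In this toy set the global RMSE is about 0.64 even though the error for the highly toxic bin is 2.0, which mirrors the bias toward the most populous region that the abstract describes.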
Eren, Gokcen; Macchiarulo, Antonio; Banoglu, Erden
2012-02-01
Pharmacological intervention with 5-Lipoxygenase (5-LO) is a promising strategy for treatment of inflammatory and allergic ailments, including asthma. With the aim of developing predictive models of 5-LO affinity and gaining insights into the molecular basis of ligand-target interaction, we herein describe QSAR studies of 59 diverse nonredox-competitive 5-LO inhibitors based on the use of molecular shape descriptors and docking experiments. These studies have successfully yielded a predictive model able to explain much of the variance in the activity of the training set compounds while predicting satisfactorily the 5-LO inhibitory activity of an external test set of compounds. The inspection of the selected variables in the QSAR equation unveils the importance of specific interactions which are observed from docking experiments. Collectively, these results may be used to design novel potent and selective nonredox 5-LO inhibitors. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
de Hoyo, Moises; Gonzalo-Skok, Oliver; Sañudo, Borja; Carrascal, Claudio; Plaza-Armas, Jose R; Camacho-Candil, Fernando; Otero-Esquina, Carlos
2016-02-01
The aim of this study was to analyze the effects of 3 different low/moderate load strength training methods (full-back squat [SQ], resisted sprint with sled towing [RS], and plyometric and specific drills training [PLYO]) on sprinting, jumping, and change of direction (COD) abilities in soccer players. Thirty-two young elite male Spanish soccer players participated in the study. Subjects performed 2 specific strength training sessions per week, in addition to their normal training sessions, for 8 weeks. The full-back squat protocol consisted of 2-3 sets × 4-8 repetitions at 40-60% 1 repetition maximum (~1.28-0.98 m·s⁻¹). The resisted sprint training consisted of 6-10 sets × 20-m loaded sprints (12.6% of body mass). The plyometric and specific drills training was based on 1-3 sets × 2-3 repetitions of 8 plyometric and speed/agility exercises. Testing sessions included a countermovement jump (CMJ), a 20-m sprint (10-m split time), a 50-m sprint (30-m split time), and a COD test (i.e., Zig-Zag test). Substantial improvements (likely to almost certainly) in CMJ (effect size [ES]: 0.50-0.57) and 30-50 m (ES: 0.45-0.84) were found in every group in comparison to pretest results. Moreover, players in the PLYO and SQ groups also showed substantial enhancements (likely to very likely) in 0-50 m (ES: 0.46-0.60). In addition, 10-20 m was also improved (very likely) in the SQ group (ES: 0.61). Between-group analyses showed that improvements in 10-20 m (ES: 0.57) and 30-50 m (ES: 0.40) were likely greater in the SQ group than in the RS group. Also, 10-20 m (ES: 0.49) was substantially better in the SQ group than in the PLYO group. In conclusion, the strength training methods used in this study seem to be effective in improving jumping and sprinting abilities, but COD might need other stimuli to achieve positive effects.
A tool for developing an automatic insect identification system based on wing outlines
Yang, He-Ping; Ma, Chun-Sen; Wen, Hui; Zhan, Qing-Bin; Wang, Xin-Li
2015-01-01
For some insect groups, wing outline is an important character for species identification. We have constructed a program as the integral part of an automated system to identify insects based on wing outlines (DAIIS). This program includes two main functions: (1) outline digitization and Elliptic Fourier transformation and (2) classifier model training by pattern recognition of support vector machines and model validation. To demonstrate the utility of this program, a sample of 120 owlflies (Neuroptera: Ascalaphidae) was split into training and validation sets. After training, the sample was sorted into seven species using this tool. In five repeated experiments, the mean accuracy for identification of each species ranged from 90% to 98%. The accuracy increased to 99% when the samples were first divided into two groups based on features of their compound eyes. DAIIS can therefore be a useful tool for developing a system of automated insect identification. PMID:26251292
Retrospective Revaluation Effects Following Serial Compound Training and Target Extinction
ERIC Educational Resources Information Center
Effting, Marieke; Vervliet, Bram; Kindt, Merel
2010-01-01
Using a conditioned suppression task, two experiments examined retrospective revaluation effects after serial compound training in a release from overshadowing design. In Experiment 1, serial X → A+ training produced suppression to target A, which was enhanced when preceded by feature X, whereas X by itself elicited no suppression.…
Informing the Human Plasma Protein Binding of ...
The free fraction of a xenobiotic in plasma (Fub) is an important determinant of chemical absorption, distribution, metabolism, elimination, and toxicity, yet experimental plasma protein binding data are scarce for environmentally relevant chemicals. The presented work explores the merit of utilizing available pharmaceutical data to predict Fub for environmentally relevant chemicals via machine learning techniques. Quantitative structure-activity relationship (QSAR) models were constructed with k nearest neighbors (kNN), support vector machines (SVM), and random forest (RF) machine learning algorithms from a training set of 1045 pharmaceuticals. The models were then evaluated with independent test sets of pharmaceuticals (200 compounds) and environmentally relevant ToxCast chemicals (406 total, in two groups of 238 and 168 compounds). The selection of a minimal feature set of 10-15 2D molecular descriptors allowed for both informative feature interpretation and practical applicability domain assessment via a bounded box of descriptor ranges and principal component analysis. The diverse pharmaceutical and environmental chemical sets exhibit similarities in terms of chemical space (99-82% overlap), as well as comparable bias and variance in constructed learning curves. All the models exhibit significant predictability with mean absolute errors (MAE) in the range of 0.10-0.18 Fub. The models performed best for highly bound chemicals (MAE 0.07-0.12), neutrals (MAE 0
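The applicability domain check via a bounded box of descriptor ranges mentioned above can be sketched as follows: a query chemical counts as inside the domain only if every descriptor falls within the minimum and maximum seen in the training set. The descriptor values and function names are hypothetical, and the principal component analysis part of the assessment is not covered here.

```python
def descriptor_ranges(training_rows):
    """Per-descriptor (min, max) over the training set.

    training_rows is a list of equal-length descriptor vectors.
    """
    return [(min(column), max(column)) for column in zip(*training_rows)]


def in_domain(row, ranges):
    """True if every descriptor of the query lies inside the training range."""
    return all(low <= value <= high for value, (low, high) in zip(row, ranges))


# Hypothetical training descriptors: (logP, molecular weight, H-bond donors).
training = [
    (1.2, 180.0, 1),
    (3.4, 320.5, 2),
    (2.1, 250.3, 0),
]
ranges = descriptor_ranges(training)
print(in_domain((2.5, 300.0, 1), ranges))  # inside every range
print(in_domain((5.0, 300.0, 1), ranges))  # logP above the training maximum
```

A small descriptor set keeps such a box meaningful; with hundreds of descriptors almost any query would fall outside at least one range.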
NASA Astrophysics Data System (ADS)
Gao, Ming; Li, Shiwei
2017-05-01
Based on experimental data of the soybean yield and quality from 30 sampling points, a quantitative structure-activity relationship model (2D-QSAR) was established using the soil quality (elements, pH, organic matter content and cation exchange capacity) as independent variables and soybean yield or quality as the dependent variable, with SPSS software. During the modeling, the full data set (30 and 14 compounds) was divided into a training set (24 and 11 compounds) for model generation and a test set (6 and 3 compounds) for model validation. The R2 values of the resulting models and data were 0.826 and 0.808 for soybean yield and quality, respectively, and all regression coefficients were significant (P < 0.05). The correlation coefficient R2pred of observed values and predicted values of the soybean yield and soybean quality in the test set were 0.961 and 0.956, respectively, indicating that the models had a good predictive ability. Moreover, the Mo, Se, K, N and organic matter contents and the cation exchange capacity of soil had a positive effect on soybean production, and the B, Mo, Se, K and N contents and cation exchange coefficient had a positive effect on soybean quality. The results are instructive for enhancing soils to improve the yield and quality of soybean, and this method can also be used to study other crops or regions, providing a theoretical basis to improving the yield and quality of crops.
Wang, Shuangquan; Sun, Huiyong; Liu, Hui; Li, Dan; Li, Youyong; Hou, Tingjun
2016-08-01
Blockade of human ether-à-go-go related gene (hERG) channel by compounds may lead to drug-induced QT prolongation, arrhythmia, and Torsades de Pointes (TdP), and therefore reliable prediction of hERG liability in the early stages of drug design is quite important to reduce the risk of cardiotoxicity-related attritions in the later development stages. In this study, pharmacophore modeling and machine learning approaches were combined to construct classification models to distinguish hERG active from inactive compounds based on a diverse data set. First, an optimal ensemble of pharmacophore hypotheses that had good capability to differentiate hERG active from inactive compounds was identified by the recursive partitioning (RP) approach. Then, the naive Bayesian classification (NBC) and support vector machine (SVM) approaches were employed to construct classification models by integrating multiple important pharmacophore hypotheses. The integrated classification models showed improved predictive capability over any single pharmacophore hypothesis, suggesting that the broad binding polyspecificity of hERG can only be well characterized by multiple pharmacophores. The best SVM model achieved the prediction accuracies of 84.7% for the training set and 82.1% for the external test set. Notably, the accuracies for the hERG blockers and nonblockers in the test set reached 83.6% and 78.2%, respectively. Analysis of significant pharmacophores helps to understand the multimechanisms of action of hERG blockers. We believe that the combination of pharmacophore modeling and SVM is a powerful strategy to develop reliable theoretical models for the prediction of potential hERG liability.
Estimation of octanol/water partition coefficients using LSER parameters
Luehrs, Dean C.; Hickey, James P.; Godbole, Kalpana A.; Rogers, Tony N.
1998-01-01
The logarithms of octanol/water partition coefficients, logKow, were regressed against the linear solvation energy relationship (LSER) parameters for a training set of 981 diverse organic chemicals. The standard deviation for logKow was 0.49. The regression equation was then used to estimate logKow for a test set of 146 chemicals, which included pesticides and other diverse polyfunctional compounds. Thus the octanol/water partition coefficient may be estimated from LSER parameters without elaborate software, but only moderate accuracy should be expected.
2010-12-01
compounds and stabilizers in more conventional energetic formulations such as nitrocellulose (NC)-based propellants. Assessing the deposition...were opened and spread out on aluminum foil covered trays to dry at room temperature. The dried material was then sieved under a hood with a...filtrate generated were tracked. The filters were placed in a labeled jar and set out to dry. After drying, the jars were sealed and refrigerated
NASA Astrophysics Data System (ADS)
Goudarzi, Nasser
2016-04-01
In this work, two new and powerful chemometrics methods are applied to the modeling and prediction of the 19F chemical shift values of some fluorinated organic compounds. Radial basis function-partial least squares (RBF-PLS) and random forest (RF) are employed to construct models that predict the 19F chemical shifts. No separate variable selection method was used in this study, since the RF method can serve as both a variable selection and a modeling technique. The effects of the important parameters affecting the prediction power of RF, such as the number of trees (nt) and the number of randomly selected variables used to split each node (m), were investigated. The root-mean-square errors of prediction (RMSEP) for the training set and the prediction set for the RBF-PLS and RF models were 44.70, 23.86, 29.77, and 23.69, respectively. Also, the correlation coefficients of the prediction set for the RBF-PLS and RF models were 0.8684 and 0.9313, respectively. The results obtained reveal that the RF model can be used as a powerful chemometrics tool for quantitative structure-property relationship (QSPR) studies.
Mulder, Nicola; Schwartz, Russell; Brazas, Michelle D; Brooksbank, Cath; Gaeta, Bruno; Morgan, Sarah L; Pauley, Mark A; Rosenwald, Anne; Rustici, Gabriella; Sierk, Michael; Warnow, Tandy; Welch, Lonnie
2018-02-01
Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans.
Brooksbank, Cath; Morgan, Sarah L.; Rosenwald, Anne; Warnow, Tandy; Welch, Lonnie
2018-01-01
Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans. PMID:29390004
Machine Learning of Parameters for Accurate Semiempirical Quantum Chemical Calculations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dral, Pavlo O.; von Lilienfeld, O. Anatole; Thiel, Walter
2015-05-12
We investigate possible improvements in the accuracy of semiempirical quantum chemistry (SQC) methods through the use of machine learning (ML) models for the parameters. For a given class of compounds, ML techniques require sufficiently large training sets to develop ML models that can be used for adapting SQC parameters to reflect changes in molecular composition and geometry. The ML-SQC approach allows the automatic tuning of SQC parameters for individual molecules, thereby improving the accuracy without deteriorating transferability to molecules with molecular descriptors very different from those in the training set. The performance of this approach is demonstrated for the semiempirical OM2 method using a set of 6095 constitutional isomers C7H10O2, for which accurate ab initio atomization enthalpies are available. The ML-OM2 results show improved average accuracy and a much reduced error range compared with those of standard OM2 results, with mean absolute errors in atomization enthalpies dropping from 6.3 to 1.7 kcal/mol. They are also found to be superior to the results from specific OM2 reparameterizations (rOM2) for the same set of isomers. The ML-SQC approach thus holds promise for fast and reasonably accurate high-throughput screening of materials and molecules.
Toxicity challenges in environmental chemicals: Prediction of ...
Physiologically based pharmacokinetic (PBPK) models bridge the gap between in vitro assays and in vivo effects by accounting for the absorption, distribution, metabolism, and excretion of xenobiotics, which is especially useful in the assessment of human toxicity. Quantitative structure-activity relationships (QSAR) serve as a vital tool for the high-throughput prediction of chemical-specific PBPK parameters, such as the fraction of a chemical unbound by plasma protein (Fub). The presented work explores the merit of utilizing experimental pharmaceutical Fub data for the construction of a universal QSAR model, in order to compensate for the limited range of high-quality experimental Fub data for environmentally relevant chemicals, such as pollutants, pesticides, and consumer products. Independent QSAR models were constructed with three machine-learning algorithms, k nearest neighbors (kNN), random forest (RF), and support vector machine (SVM) regression, from a large pharmaceutical training set (~1000) and assessed with independent test sets of pharmaceuticals (~200) and environmentally relevant chemicals in the ToxCast program (~400). Small descriptor sets yielded the optimal balance of model complexity and performance, providing insight into the biochemical factors of plasma protein binding while preventing overfitting to the training set. Overlaps in chemical space between pharmaceutical and environmental compounds were considered through applicability of do
Machine learning of parameters for accurate semiempirical quantum chemical calculations
Dral, Pavlo O.; von Lilienfeld, O. Anatole; Thiel, Walter
2015-04-14
We investigate possible improvements in the accuracy of semiempirical quantum chemistry (SQC) methods through the use of machine learning (ML) models for the parameters. For a given class of compounds, ML techniques require sufficiently large training sets to develop ML models that can be used for adapting SQC parameters to reflect changes in molecular composition and geometry. The ML-SQC approach allows the automatic tuning of SQC parameters for individual molecules, thereby improving the accuracy without deteriorating transferability to molecules with molecular descriptors very different from those in the training set. The performance of this approach is demonstrated for the semiempirical OM2 method using a set of 6095 constitutional isomers C7H10O2, for which accurate ab initio atomization enthalpies are available. The ML-OM2 results show improved average accuracy and a much reduced error range compared with those of standard OM2 results, with mean absolute errors in atomization enthalpies dropping from 6.3 to 1.7 kcal/mol. They are also found to be superior to the results from specific OM2 reparameterizations (rOM2) for the same set of isomers. The ML-SQC approach thus holds promise for fast and reasonably accurate high-throughput screening of materials and molecules.
Zeb, Amir; Park, Chanin; Son, Minky; Rampogu, Shailima; Alam, Syed Ibrar; Park, Seok Ju; Lee, Keun Woo
2018-06-01
Protein deacetylation by histone deacetylase 6 (HDAC6) has been implicated in various chronic human diseases, such as neurodegenerative diseases and cancer, and HDAC6 is hence an important therapeutic target. Because the existing inhibitors contain a hydroxamate group and are not HDAC6-selective, this study was designed to investigate non-hydroxamate HDAC6 inhibitors. A ligand-based pharmacophore was generated from 26 training set compounds of HDAC6 inhibitors. The statistical parameters of the pharmacophore (Hypo1) included the lowest total cost of 115.63, the highest cost difference of 135.00, the lowest RMSD of 0.70 and the highest correlation of 0.98. The pharmacophore was validated by Fischer's randomization and test set validation, and used as a screening tool for chemical databases. The screened compounds were filtered by fit value ([Formula: see text]), estimated inhibitory concentration (IC[Formula: see text]) ([Formula: see text]), Lipinski's Rule of Five and Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) descriptors to identify drug-like compounds. Furthermore, the drug-like compounds were docked into the active site of HDAC6. The best docked compounds were selected as those having a goldfitness score [Formula: see text] and [Formula: see text], and hydrogen bond interactions with catalytic active residues. Finally, three inhibitors having a sulfamoyl group were selected by molecular dynamics (MD) simulation; they showed stable root mean square deviation (RMSD) (1.6-1.9[Formula: see text]Å), lowest potential energy ([Formula: see text][Formula: see text]kJ/mol), and hydrogen bonding with catalytic active residues of HDAC6.
Oral LD50 toxicity modeling and prediction of per- and polyfluorinated chemicals on rat and mouse.
Bhhatarai, Barun; Gramatica, Paola
2011-05-01
Quantitative structure-activity relationship (QSAR) analyses were performed using the LD(50) oral toxicity data of per- and polyfluorinated chemicals (PFCs) on rodents: rat and mouse. PFCs are studied under the EU project CADASTER which uses the available experimental data for prediction and prioritization of toxic chemicals for risk assessment by using the in silico tools. The methodology presented here applies chemometrical analysis on the existing experimental data and predicts the toxicity of new compounds. QSAR analyses were performed on the available 58 mouse and 50 rat LD(50) oral data using multiple linear regression (MLR) based on theoretical molecular descriptors selected by genetic algorithm (GA). Training and prediction sets were prepared a priori from available experimental datasets in terms of structure and response. These sets were used to derive statistically robust and predictive (both internally and externally) models. The structural applicability domain (AD) of the models were verified on 376 per- and polyfluorinated chemicals including those in REACH preregistration list. The rat and mouse endpoints were predicted by each model for the studied compounds, and finally 30 compounds, all perfluorinated, were prioritized as most important for experimental toxicity analysis under the project. In addition, cumulative study on compounds within the AD of all four models, including two earlier published models on LC(50) rodent analysis was studied and the cumulative toxicity trend was observed using principal component analysis (PCA). The similarities and the differences observed in terms of descriptors and chemical/mechanistic meaning encoded by descriptors to prioritize the most toxic compounds are highlighted.
Probabilistic neural networks modeling of the 48-h LC50 acute toxicity endpoint to Daphnia magna.
Niculescu, S P; Lewis, M A; Tigner, J
2008-01-01
Two modeling experiments based on the maximum likelihood estimation paradigm and targeting prediction of the Daphnia magna 48-h LC50 acute toxicity endpoint for both organic and inorganic compounds are reported. The resulting models computational algorithms are implemented as basic probabilistic neural networks with Gaussian kernel (statistical corrections included). The first experiment uses strictly D. magna information for 971 structures as training/learning data and the resulting model targets practical applications. The second experiment uses the same training/learning information plus additional data on another 29 compounds whose endpoint information is originating from D. pulex and Ceriodaphnia dubia. It only targets investigation of the effect of mixing strictly D. magna 48-h LC50 modeling information with small amounts of similar information estimated from related species, and this is done as part of the validation process. A complementary 81 compounds dataset (involving only strictly D. magna information) is used to perform external testing. On this external test set, the Gaussian character of the distribution of the residuals is confirmed for both models. This allows the use of traditional statistical methodology to implement computation of confidence intervals for the unknown measured values based on the models predictions. Examples are provided for the model targeting practical applications. For the same model, a comparison with other existing models targeting the same endpoint is performed.
Thermal deterioration of virgin olive oil monitored by ATR-FTIR analysis of trans content.
Tena, Noelia; Aparicio, Ramón; García-González, Diego L
2009-11-11
The monitoring of frying oils by an effective and rapid method is one of the demands of food companies and small food retailers. In this work, a method based on ATR-FTIR has been developed for monitoring the oil degradation in frying procedures. The IR bands changing during frying in sunflower, soybean, and virgin olive oils have been examined in their linear relationship with the content of total polar compounds, which is a preferred parameter for frying control. The bands assigned to conjugated and isolated trans double bonds that are commonly used for the determination of trans content provided the best relationships. Then, the area covering 978-960 cm(-1) was chosen to build a model for predicting polar material content for the particular case of virgin olive oil. A virgin olive oil was heated up to 94 h, and samples collected every 2 h constituted the training set. These samples were analyzed to obtain their FTIR spectra and to determine the composition of fatty acids and the content of total polar compounds. The excellent results predicting the polar material content (adjusted R(2) 0.997) was successfully validated with an external set of samples. The analysis of the fatty acid composition confirmed the relationship between the trans content and the content of total polar compounds.
Predictive QSAR modeling workflow, model applicability domains, and virtual screening.
Tropsha, Alexander; Golbraikh, Alexander
2007-01-01
Quantitative Structure Activity Relationship (QSAR) modeling has been traditionally applied as an evaluative approach, i.e., with the focus on developing retrospective and explanatory models of existing data. Model extrapolation was considered if only in hypothetical sense in terms of potential modifications of known biologically active chemicals that could improve compounds' activity. This critical review re-examines the strategy and the output of the modern QSAR modeling approaches. We provide examples and arguments suggesting that current methodologies may afford robust and validated models capable of accurate prediction of compound properties for molecules not included in the training sets. We discuss a data-analytical modeling workflow developed in our laboratory that incorporates modules for combinatorial QSAR model development (i.e., using all possible binary combinations of available descriptor sets and statistical data modeling techniques), rigorous model validation, and virtual screening of available chemical databases to identify novel biologically active compounds. Our approach places particular emphasis on model validation as well as the need to define model applicability domains in the chemistry space. We present examples of studies where the application of rigorously validated QSAR models to virtual screening identified computational hits that were confirmed by subsequent experimental investigations. The emerging focus of QSAR modeling on target property forecasting brings it forward as predictive, as opposed to evaluative, modeling approach.
Akbar, Jamshed; Iqbal, Shahid; Batool, Fozia; Karim, Abdul; Chan, Kim Wei
2012-01-01
Quantitative structure-retention relationships (QSRRs) have successfully been developed for naturally occurring phenolic compounds in a reversed-phase liquid chromatographic (RPLC) system. A total of 1519 descriptors were calculated from the optimized structures of the molecules using the MOPAC2009 and DRAGON software packages. The data set of 39 molecules was divided into training and external validation sets. For feature selection and mapping we used step-wise multiple linear regression (SMLR), unsupervised forward selection followed by step-wise multiple linear regression (UFS-SMLR) and artificial neural networks (ANN). Stable and robust models with significant predictive abilities in terms of validation statistics were obtained, and chance correlation was ruled out. ANN models were found to be better than the remaining two approaches. HNar, IDM, Mp, GATS2v, DISP and 3D-MoRSE (signals 22, 28 and 32) descriptors based on van der Waals volume, electronegativity, mass and polarizability, at atomic level, were found to have significant effects on the retention times. The possible implications of these descriptors in RPLC have been discussed. All the models proved able to predict the retention times of phenolic compounds and showed remarkable validation, robustness, stability and predictive performance. PMID:23203132
Computational Exploration for Lead Compounds That Can Reverse the Nuclear Morphology in Progeria
Baek, Ayoung; Son, Minky; Zeb, Amir; Park, Chanin; Kumar, Raj; Lee, Gihwan; Kim, Donghwan; Choi, Yeonuk; Cho, Yeongrae; Park, Yohan
2017-01-01
Progeria is a rare genetic disorder characterized by premature aging that eventually leads to death and is noticed globally. Despite alarming conditions, this disease lacks effective medications; however, the farnesyltransferase inhibitors (FTIs) are a hope in the dark. Therefore, the objective of the present article is to identify new compounds from the databases employing pharmacophore based virtual screening. Utilizing nine training set compounds along with lonafarnib, a common feature pharmacophore was constructed consisting of four features. The validated Hypo1 was subsequently allowed to screen Maybridge, Chembridge, and Asinex databases to retrieve the novel lead candidates, which were then subjected to Lipinski's rule of 5 and ADMET for drug-like assessment. The obtained 3,372 compounds were forwarded to docking simulations and were manually examined for the key interactions with the crucial residues. Two compounds that have demonstrated a higher dock score than the reference compounds and showed interactions with the crucial residues were subjected to MD simulations and binding free energy calculations to assess the stability of docked conformation and to investigate the binding interactions in detail. Furthermore, this study suggests that the Hits may be more effective against progeria and further the DFT studies were executed to understand their orbital energies. PMID:29226142
Rayne, Sierra; Forest, Kaya
2014-09-19
The air-water partition coefficient (Kaw) of perfluoro-2-methyl-3-pentanone (PFMP) was estimated using the G4MP2/G4 levels of theory and the SMD solvation model. A suite of 31 fluorinated compounds was employed to calibrate the theoretical method. Excellent agreement between experimental and directly calculated Kaw values was obtained for the calibration compounds. The PCM solvation model was found to yield unsatisfactory Kaw estimates for fluorinated compounds at both levels of theory. The HENRYWIN Kaw estimation program also exhibited poor Kaw prediction performance on the training set. Based on the resulting regression equation for the calibration compounds, the G4MP2-SMD method constrained the estimated Kaw of PFMP to the range 5-8 × 10(-6) M atm(-1). The magnitude of this Kaw range indicates almost all PFMP released into the atmosphere or near the land-atmosphere interface will reside in the gas phase, with only minor quantities dissolved in the aqueous phase as the parent compound and/or its hydrate/hydrate conjugate base. Following discharge into aqueous systems not at equilibrium with the atmosphere, significant quantities of PFMP will be present as the dissolved parent compound and/or its hydrate/hydrate conjugate base.
[Discovery of potential LXRβ agonists from Chinese herbs using molecular simulation methods].
Luo, Gang-Gang; Lu, Fang; Qiao, Lian-Sheng; Li, Yong; Zhang, Yan-Ling
2016-08-01
Liver X receptor β (LXRβ), which is related to cholesterol homeostasis, has become a new target in the treatment of hyperlipemia. In this study, quantitative pharmacophores were constructed by the 3D-QSAR pharmacophore (Hypogen) method based on LXRβ agonists. The optimal pharmacophore model, containing one hydrogen bond acceptor, two hydrophobic features and one ring aromatic feature, was obtained based on five assessment indicators: the correlation between predicted and experimental values of the compounds in the training set (correlation), Δcost of the models (Δcost), hit rate of active compounds (HRA), identification of effectiveness index (IEI) and comprehensive evaluation index (CAI). The values of the five assessment indicators were 0.95, 128.65, 84.44%, 2.58 and 2.18, respectively. Using the best model as a query to screen the traditional Chinese medicine database (TCMD), a list of 309 compounds was obtained, and these were then refined using the Libdock program. Finally, based on screening rules combining the Libdock score of the initial compound and the key interactions between the initial compound and the receptor, four compounds, demethoxycurcumin, isolicoflavonol, licochalcone E and silydianin, were selected as potential LXRβ agonists. The molecular simulation methods were efficient and time-saving for obtaining potential LXRβ agonists, which could assist further research on novel anti-hyperlipidemia drugs. Copyright© by the Chinese Pharmaceutical Association.
A reliable computational workflow for the selection of optimal screening libraries.
Gilad, Yocheved; Nadassy, Katalin; Senderowitz, Hanoch
2015-01-01
The experimental screening of compound collections is a common starting point in many drug discovery projects. Successes of such screening campaigns critically depend on the quality of the screened library. Many libraries are currently available from different vendors yet the selection of the optimal screening library for a specific project is challenging. We have devised a novel workflow for the rational selection of project-specific screening libraries. The workflow accepts as input a set of virtual candidate libraries and applies the following steps to each library: (1) data curation; (2) assessment of ADME/T profile; (3) assessment of the number of promiscuous binders/frequent HTS hitters; (4) assessment of internal diversity; (5) assessment of similarity to known active compound(s) (optional); (6) assessment of similarity to in-house or otherwise accessible compound collections (optional). For ADME/T profiling, Lipinski's and Veber's rule-based filters were implemented and a new blood brain barrier permeation model was developed and validated (85 and 74 % success rate for training set and test set, respectively). Diversity and similarity descriptors which demonstrated best performances in terms of their ability to select either diverse or focused sets of compounds from three databases (Drug Bank, CMC and CHEMBL) were identified and used for diversity and similarity assessments. The workflow was used to analyze nine common screening libraries available from six vendors. The results of this analysis are reported for each library providing an assessment of its quality. Furthermore, a consensus approach was developed to combine the results of these analyses into a single score for selecting the optimal library under different scenarios. We have devised and tested a new workflow for the rational selection of screening libraries under different scenarios. 
The current workflow was implemented using the Pipeline Pilot software yet due to the usage of generic components, it can be easily adapted and reproduced by computational groups interested in rational selection of screening libraries. Furthermore, the workflow could be readily modified to include additional components. This workflow has been routinely used in our laboratory for the selection of libraries in multiple projects and consistently selects libraries which are well balanced across multiple parameters.
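The rule-based ADME/T filtering step mentioned in the abstract above can be sketched as follows. This is only a minimal illustration, not the authors' Pipeline Pilot workflow: it assumes the commonly cited cut-offs for Lipinski's rule (molecular weight ≤ 500, logP ≤ 5, H-bond donors ≤ 5, H-bond acceptors ≤ 10) and Veber's rule (rotatable bonds ≤ 10, polar surface area ≤ 140 Å²), and it assumes the properties have already been computed elsewhere. The compound names, property values and the `max_violations` parameter are hypothetical.

```python
def lipinski_violations(props):
    """Count violations of Lipinski's rule of five for one compound."""
    checks = [
        props["mw"] > 500,    # molecular weight above 500 Da
        props["logp"] > 5,    # calculated logP above 5
        props["hbd"] > 5,     # more than 5 hydrogen-bond donors
        props["hba"] > 10,    # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)


def passes_veber(props):
    """Veber's rule: at most 10 rotatable bonds and TPSA of at most 140 A^2."""
    return props["rotb"] <= 10 and props["tpsa"] <= 140


def filter_library(library, max_violations=0):
    """Return names of compounds passing both filters.

    max_violations is an assumed tolerance; some workflows allow one violation.
    """
    return [
        name
        for name, props in library.items()
        if lipinski_violations(props) <= max_violations and passes_veber(props)
    ]


# Hypothetical precomputed properties for two made-up compounds.
library = {
    "cmpd_A": {"mw": 350.0, "logp": 2.1, "hbd": 2, "hba": 5, "rotb": 4, "tpsa": 80.0},
    "cmpd_B": {"mw": 620.0, "logp": 6.3, "hbd": 1, "hba": 8, "rotb": 12, "tpsa": 95.0},
}
passing = filter_library(library)
print(passing)
```

In this toy library only `cmpd_A` survives: `cmpd_B` breaks two Lipinski criteria and exceeds the rotatable-bond limit.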
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-04
... equipment that could affect drug safety or effectiveness? 6. Training Is specialized, highly technical training essential to ensure proper compounding of the drug product? 7. Testing and Quality Assurance Is...
Action recognition using mined hierarchical compound features.
Gilbert, Andrew; Illingworth, John; Bowden, Richard
2011-05-01
The field of Action Recognition has seen a large increase in activity in recent years. Much of the progress has been through incorporating ideas from single-frame object recognition and adapting them for temporal-based action recognition. Inspired by the success of interest points in the 2D spatial domain, their 3D (space-time) counterparts typically form the basic components used to describe actions, and in action recognition the features used are often engineered to fire sparsely. This is to ensure that the problem is tractable; however, this can sacrifice recognition accuracy as it cannot be assumed that the optimum features in terms of class discrimination are obtained from this approach. In contrast, we propose to initially use an overcomplete set of simple 2D corners in both space and time. These are grouped spatially and temporally using a hierarchical process, with an increasing search area. At each stage of the hierarchy, the most distinctive and descriptive features are learned efficiently through data mining. This allows large amounts of data to be searched for frequently reoccurring patterns of features. At each level of the hierarchy, the mined compound features become more complex, discriminative, and sparse. This results in fast, accurate recognition with real-time performance on high-resolution video. As the compound features are constructed and selected based upon their ability to discriminate, their speed and accuracy increase at each level of the hierarchy. The approach is tested on four state-of-the-art data sets, the popular KTH data set to provide a comparison with other state-of-the-art approaches, the Multi-KTH data set to illustrate performance at simultaneous multiaction classification, despite no explicit localization information provided during training. Finally, the recent Hollywood and Hollywood2 data sets provide challenging complex actions taken from commercial movie sequences. 
For all four data sets, the proposed hierarchical approach outperforms all other methods reported thus far in the literature and can achieve real-time operation.
Meta-learning framework applied in bioinformatics inference system design.
Arredondo, Tomás; Ormazábal, Wladimir
2015-01-01
This paper describes a meta-learner inference system development framework which is applied and tested in the implementation of bioinformatic inference systems. These inference systems are used for the systematic classification of the best candidates for inclusion in bacterial metabolic pathway maps. This meta-learner-based approach utilises a workflow where the user provides feedback with final classification decisions which are stored in conjunction with analysed genetic sequences for periodic inference system training. The inference systems were trained and tested with three different data sets related to the bacterial degradation of aromatic compounds. The analysis of the meta-learner-based framework involved contrasting several different optimisation methods with various different parameters. The obtained inference systems were also contrasted with other standard classification methods with accurate prediction capabilities observed.
ERIC Educational Resources Information Center
Tsesmeli, Styliani N.; Tsirozi, Theologia
2015-01-01
The case-study aims to examine the effectiveness of training of morphological structure on the spelling of compounds by a spelling-disabled primary school student. The experimental design of the intervention was based on the word-pair paradigm and included a pre-test, a training program and a post-test (n = 50 pairs). The Training Program aimed to…
Using information from historical high-throughput screens to predict active compounds.
Riniker, Sereina; Wang, Yuan; Jenkins, Jeremy L; Landrum, Gregory A
2014-07-28
Modern high-throughput screening (HTS) is a well-established approach for hit finding in drug discovery that is routinely employed in the pharmaceutical industry to screen more than a million compounds within a few weeks. However, as the industry shifts to more disease-relevant but more complex phenotypic screens, the focus has moved to piloting smaller but smarter chemically/biologically diverse subsets followed by an expansion around hit compounds. One standard method for doing this is to train a machine-learning (ML) model with the chemical fingerprints of the tested subset of molecules and then select the next compounds based on the predictions of this model. An alternative approach would be to take advantage of the wealth of bioactivity information contained in older (full-deck) screens using so-called HTS fingerprints, where each element of the fingerprint corresponds to the outcome of a particular assay, as input to machine-learning algorithms. We constructed HTS fingerprints using two collections of data: 93 in-house assays and 95 publicly available assays from PubChem. For each source, an additional set of 51 and 46 assays, respectively, was collected for testing. Three different ML methods, random forest (RF), logistic regression (LR), and naïve Bayes (NB), were investigated for both the HTS fingerprint and a chemical fingerprint, Morgan2. RF was found to be best suited for learning from HTS fingerprints yielding area under the receiver operating characteristic curve (AUC) values >0.8 for 78% of the internal assays and enrichment factors at 5% (EF(5%)) >10 for 55% of the assays. The RF(HTS-fp) generally outperformed the LR trained with Morgan2, which was the best ML method for the chemical fingerprint, for the majority of assays. In addition, HTS fingerprints were found to retrieve more diverse chemotypes. Combining the two models through heterogeneous classifier fusion led to a similar or better performance than the best individual model for all assays. 
Further validation using a pair of in-house assays and data from a confirmatory screen--including a prospective set of around 2000 compounds selected based on our approach--confirmed the good performance. Thus, the combination of machine-learning with HTS fingerprints and chemical fingerprints utilizes information from both domains and presents a very promising approach for hit expansion, leading to more hits. The source code used with the public data is provided.
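The enrichment factor at 5% (EF(5%)) quoted in the abstract above can be illustrated with a short sketch. It assumes the common definition, the hit rate among the top-scored fraction divided by the hit rate in the whole set; the abstract does not spell out the formula, so this definition, the function name and the toy data are assumptions.

```python
def enrichment_factor(scores, labels, fraction=0.05):
    """Enrichment factor: hit rate in the top-scored fraction over the overall hit rate.

    scores: model scores, higher means more likely active.
    labels: 1 for active, 0 for inactive, in the same order as scores.
    """
    n = len(scores)
    total_actives = sum(labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, round(n * fraction))
    # Rank compounds by decreasing score and keep the labels of the top slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    # (hits_top / n_top) / (total_actives / n), rearranged to avoid rounding noise.
    return (hits_top * n) / (n_top * total_actives)


# Hypothetical screen: 100 compounds, 10 actives, 4 of them ranked in the top 5.
scores = list(range(100, 0, -1))
labels = [1, 1, 0, 1, 1] + [1] * 6 + [0] * 89
ef5 = enrichment_factor(scores, labels, fraction=0.05)
print(ef5)
```

Here the top 5% has a hit rate of 80% against an overall hit rate of 10%, so the enrichment factor is 8.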
Gissi, Andrea; Lombardo, Anna; Roncaglioni, Alessandra; Gadaleta, Domenico; Mangiatordi, Giuseppe Felice; Nicolotti, Orazio; Benfenati, Emilio
2015-02-01
The bioconcentration factor (BCF) is an important bioaccumulation hazard assessment metric in many regulatory contexts. Its assessment is required by the REACH regulation (Registration, Evaluation, Authorization and Restriction of Chemicals) and by CLP (Classification, Labeling and Packaging). We challenged nine well-known and widely used BCF QSAR models against 851 compounds stored in an ad-hoc created database. The goodness of the regression analysis was assessed by considering the determination coefficient (R(2)) and the Root Mean Square Error (RMSE); Cooper's statistics and Matthew's Correlation Coefficient (MCC) were calculated for all the thresholds relevant for regulatory purposes (i.e. 100L/kg for Chemical Safety Assessment; 500L/kg for Classification and Labeling; 2000 and 5000L/kg for Persistent, Bioaccumulative and Toxic (PBT) and very Persistent, very Bioaccumulative (vPvB) assessment) to assess the classification, with particular attention to the models' ability to control the occurrence of false negatives. As a first step, statistical analysis was performed for the predictions of the entire dataset; R(2)>0.70 was obtained using CORAL, T.E.S.T. and EPISuite Arnot-Gobas models. As classifiers, ACD and logP-based equations were the best in terms of sensitivity, ranging from 0.75 to 0.94. External compound predictions were carried out for the models that had their own training sets. CORAL model returned the best performance (R(2)ext=0.59), followed by the EPISuite Meylan model (R(2)ext=0.58). The latter gave also the highest sensitivity on external compounds with values from 0.55 to 0.85, depending on the thresholds. Statistics were also compiled for compounds falling into the models Applicability Domain (AD), giving better performances. In this respect, VEGA CAESAR was the best model in terms of regression (R(2)=0.94) and classification (average sensitivity>0.80). 
This model also showed the best regression (R(2)=0.85) and sensitivity (average>0.70) for new compounds in the AD but not present in the training set. However, no single optimal model exists; thus, a case-by-case assessment would be wise. Yet, integrating the wealth of information from multiple models remains the winning approach. Copyright © 2014 Elsevier Inc. All rights reserved.
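The classification statistics used in the BCF comparison above (sensitivity and Matthews correlation coefficient at a regulatory threshold) can be sketched as follows. This is a minimal illustration under stated assumptions: a compound is treated as positive when its BCF is at or above the threshold, the 2000 L/kg cut-off is one of the thresholds named in the abstract, and all BCF values and function names are hypothetical.

```python
import math


def classify_at_threshold(values, threshold):
    """Label a compound 1 (bioaccumulative) if its BCF is at or above the threshold."""
    return [1 if v >= threshold else 0 for v in values]


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical experimental and predicted BCF values in L/kg.
experimental = [50, 150, 800, 2500, 6000, 90]
predicted = [70, 90, 1200, 1800, 7000, 120]

y_true = classify_at_threshold(experimental, 2000)
y_pred = classify_at_threshold(predicted, 2000)
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)  # fraction of true positives recovered
mcc = matthews_cc(tp, tn, fp, fn)
print(sensitivity, mcc)
```

The one missed compound (2500 L/kg predicted as 1800 L/kg) is a false negative, which is why sensitivity is the statistic the abstract singles out for regulatory use.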
NASA Astrophysics Data System (ADS)
de Campos, Luana Janaína; de Melo, Eduardo Borges
2017-08-01
In the present study, 199 compounds derived from pyrimidine, pyrimidone and pyridopyrazine carboxamides with inhibitory activity against HIV-1 integrase were modeled. Subsequently, a multivariate QSAR study was conducted with 54 molecules employed by Ordered Predictors Selection (OPS) and Partial Least Squares (PLS) for the selection of variables and model construction, respectively. Topological, electrotopological, geometric, and molecular descriptors were used. The selected real model was robust and free from chance correlation; in addition, it demonstrated favorable internal and external statistical quality. Once statistically validated, the training model was used to predict the activity of a second data set (n = 145). The root mean square deviation (RMSD) between observed and predicted values was 0.698. Although it is a value outside of the standards, only 15 (10.34%) of the samples exhibited higher residual values than 1 log unit, a result considered acceptable. Results of Williams and Euclidean applicability domains relative to the prediction showed that the predictions did not occur by extrapolation and that the model is representative of the chemical space of test compounds.
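The two external-prediction figures in the abstract above, the RMSD between observed and predicted activities and the share of compounds with residuals above 1 log unit, can be computed as in the sketch below. The activity values are hypothetical; only the counts 15 and 145 come from the abstract.

```python
import math


def rmsd(observed, predicted):
    """Root mean square deviation between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def fraction_large_residuals(observed, predicted, limit=1.0):
    """Fraction of compounds whose absolute residual exceeds the limit (in log units)."""
    large = sum(1 for o, p in zip(observed, predicted) if abs(o - p) > limit)
    return large / len(observed)


# Hypothetical observed and predicted activities on a log scale.
observed = [5.0, 6.2, 7.1, 4.8]
predicted = [5.5, 6.0, 5.9, 4.9]

error = rmsd(observed, predicted)
outlier_share = fraction_large_residuals(observed, predicted)

# Consistency check of the percentage reported in the abstract: 15 of 145 compounds.
reported_percent = round(15 / 145 * 100, 2)
print(error, outlier_share, reported_percent)
```

The last line confirms that 15 out of 145 test compounds corresponds to the 10.34% stated in the abstract.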
Dong, Jian-Jun; Li, Qing-Liang; Yin, Hua; Zhong, Cheng; Hao, Jun-Guang; Yang, Pan-Fei; Tian, Yu-Hong; Jia, Shi-Ru
2014-10-15
Sensory evaluation is regarded as a necessary procedure to ensure a reproducible quality of beer. Meanwhile, high-throughput analytical methods provide a powerful tool to analyse various flavour compounds, such as higher alcohol and ester. In this study, the relationship between flavour compounds and sensory evaluation was established by non-linear models such as partial least squares (PLS), genetic algorithm back-propagation neural network (GA-BP), support vector machine (SVM). It was shown that SVM with a Radial Basis Function (RBF) had a better performance of prediction accuracy for both calibration set (94.3%) and validation set (96.2%) than other models. Relatively lower prediction abilities were observed for GA-BP (52.1%) and PLS (31.7%). In addition, the kernel function of SVM played an essential role of model training when the prediction accuracy of SVM with polynomial kernel function was 32.9%. As a powerful multivariate statistics method, SVM holds great potential to assess beer quality. Copyright © 2014 Elsevier Ltd. All rights reserved.
Lin, Chun-Yuan; Wang, Yen-Ling
2014-01-01
Checkpoint kinase 2 (Chk2) has a great effect on DNA-damage and plays an important role in response to DNA double-strand breaks and related lesions. In this study, we will concentrate on Chk2 and the purpose is to find the potential inhibitors by the pharmacophore hypotheses (PhModels), combinatorial fusion, and virtual screening techniques. Applying combinatorial fusion into PhModels and virtual screening techniques is a novel design strategy for drug design. We used combinatorial fusion to analyze the prediction results and then obtained the best correlation coefficient of the testing set (r test) with the value 0.816 by combining the Best(train)Best(test) and Fast(train)Fast(test) prediction results. The potential inhibitors were selected from NCI database by screening according to Best(train)Best(test) + Fast(train)Fast(test) prediction results and molecular docking with CDOCKER docking program. Finally, the selected compounds have high interaction energy between a ligand and a receptor. Through these approaches, 23 potential inhibitors for Chk2 are retrieved for further study.
NASA Astrophysics Data System (ADS)
Hsieh, Jui-Hua; Wang, Xiang S.; Teotico, Denise; Golbraikh, Alexander; Tropsha, Alexander
2008-09-01
The use of inaccurate scoring functions in docking algorithms may result in the selection of compounds with high predicted binding affinity that nevertheless are known experimentally not to bind to the target receptor. Such falsely predicted binders have been termed 'binding decoys'. We posed a question as to whether true binders and decoys could be distinguished based only on their structural chemical descriptors using approaches commonly used in ligand based drug design. We have applied the k-Nearest Neighbor (kNN) classification QSAR approach to a dataset of compounds characterized as binders or binding decoys of AmpC beta-lactamase. Models were subjected to rigorous internal and external validation as part of our standard workflow and a special QSAR modeling scheme was employed that took into account the imbalanced ratio of inhibitors to non-binders (1:4) in this dataset. 342 predictive models were obtained with correct classification rate (CCR) for both training and test sets as high as 0.90 or higher. The prediction accuracy was as high as 100% (CCR = 1.00) for the external validation set composed of 10 compounds (5 true binders and 5 decoys) selected randomly from the original dataset. For an additional external set of 50 known non-binders, we have achieved the CCR of 0.87 using very conservative model applicability domain threshold. The validated binary kNN QSAR models were further employed for mining the NCGC AmpC screening dataset (69653 compounds). The consensus prediction of 64 compounds identified as screening hits in the AmpC PubChem assay disagreed with their annotation in PubChem but was in agreement with the results of secondary assays. At the same time, 15 compounds were identified as potential binders contrary to their annotation in PubChem. Five of them were tested experimentally and showed inhibitory activities in millimolar range with the highest binding constant Ki of 135 μM.
Our studies suggest that validated QSAR models could complement structure based docking and scoring approaches in identifying promising hits by virtual screening of molecular libraries.
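The correct classification rate (CCR) quoted in this abstract is commonly computed as the mean of sensitivity and specificity, which makes it less affected than plain accuracy by an imbalanced class ratio such as the 1:4 ratio mentioned here. A minimal Python sketch under that assumed definition follows; the function name and the second set of counts are hypothetical, and only the external set of 5 binders and 5 decoys comes from the abstract.

```python
def correct_classification_rate(tp, fn, tn, fp):
    """Mean of sensitivity and specificity (assumed definition of CCR)."""
    sensitivity = tp / (tp + fn)  # fraction of true binders recovered
    specificity = tn / (tn + fp)  # fraction of non-binders recovered
    return 0.5 * (sensitivity + specificity)


# External set from the abstract: 5 true binders and 5 decoys, all classified correctly.
ccr_external = correct_classification_rate(tp=5, fn=0, tn=5, fp=0)

# Hypothetical 1:4 set (20 binders, 80 non-binders) in which every compound
# is labeled a non-binder: accuracy looks acceptable, CCR does not.
ccr_trivial = correct_classification_rate(tp=0, fn=20, tn=80, fp=0)
accuracy_trivial = (0 + 80) / 100

print(ccr_external, ccr_trivial, accuracy_trivial)
```

The second case shows why a balanced measure is preferred for this kind of dataset: a classifier that never predicts a binder still reaches 80% accuracy, while its CCR stays at chance level.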
Molecular Docking Study on Galantamine Derivatives as Cholinesterase Inhibitors.
Atanasova, Mariyana; Yordanov, Nikola; Dimitrov, Ivan; Berkov, Strahil; Doytchinova, Irini
2015-06-01
A training set of 22 synthetic galantamine derivatives binding to acetylcholinesterase was docked by GOLD and the protocol was optimized in terms of scoring function, rigidity/flexibility of the binding site, presence/absence of a water molecule inside and radius of the binding site. A moderate correlation was found between the affinities of compounds expressed as pIC50 values and their docking scores. The optimized docking protocol was validated by an external test set of 11 natural galantamine derivatives and the correlation coefficient between the docking scores and the pIC50 values was 0.800. The derived relationship was used to analyze the interactions between galantamine derivatives and AChE. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Microscopic Analysis of Activated Sludge. Training Manual.
ERIC Educational Resources Information Center
Office of Water Program Operations (EPA), Cincinnati, OH. National Training and Operational Technology Center.
This training manual presents material on the use of a compound microscope to analyze microscopic communities, present in wastewater treatment processes, for operational control. Course topics include: sampling techniques, sample handling, laboratory analysis, identification of organisms, data interpretation, and use of the compound microscope.…
Bond-based linear indices in QSAR: computational discovery of novel anti-trichomonal compounds
NASA Astrophysics Data System (ADS)
Marrero-Ponce, Yovani; Meneses-Marcel, Alfredo; Rivera-Borroto, Oscar M.; García-Domenech, Ramón; De Julián-Ortiz, Jesus Vicente; Montero, Alina; Escario, José Antonio; Barrio, Alicia Gómez; Pereira, David Montero; Nogal, Juan José; Grau, Ricardo; Torrens, Francisco; Vogel, Christian; Arán, Vicente J.
2008-08-01
Trichomonas vaginalis (Tv) is the causative agent of the most common, non-viral, sexually transmitted disease in women and men worldwide. Since 1959, metronidazole (MTZ) has been the drug of choice in the systemic treatment of trichomoniasis. However, resistance to MTZ in some patients and the great cost associated with the development of new trichomonacidals make it necessary to develop computational methods that shorten the drug discovery pipeline. Toward this end, bond-based linear indices, new TOMOCOMD-CARDD molecular descriptors, and linear discriminant analysis were used to discover novel trichomonacidal chemicals. The obtained models, using non-stochastic and stochastic indices, are able to classify correctly 89.01% (87.50%) and 82.42% (84.38%) of the chemicals in the training (test) sets, respectively. These results validate the models for their use in ligand-based virtual screening. In addition, they show large Matthews' correlation coefficients (C) of 0.78 (0.71) and 0.65 (0.65) for the training (test) sets, correspondingly. The result of predictions on the 10% full-out cross-validation test also evidences the robustness of the obtained models. Later, both models are applied to the virtual screening of 12 compounds already tested against Tv. As a result, they correctly classify 10 out of 12 (83.33%) and 9 out of 12 (75.00%) of the chemicals, respectively, which is the most important criterion for validating the models. Besides, these classification functions are applied to a library of seven chemicals in order to find novel antitrichomonal agents. These compounds are synthesized and tested for in vitro activity against Tv. As a result, experimental observations approached the theoretical predictions, with a correct classification of 85.71% (6 out of 7) of the chemicals.
Moreover, out of the seven compounds that are screened, synthesized and biologically assayed, six compounds (VA7-34, VA7-35, VA7-37, VA7-38, VA7-68, VA7-70) show pronounced cytocidal activity at the concentration of 100 μg/ml at 24 h (48 h) within the range of 98.66%-100% (99.40%-100%), while only two molecules (chemicals VA7-37 and VA7-38) show high cytocidal activity at the concentration of 10 μg/ml at 24 h (48 h): 98.38% (94.23%) and 97.59% (98.10%), correspondingly. The LDA-assisted QSAR models presented here could significantly reduce the number of synthesized and tested compounds and could increase the chance of finding new chemical entities with anti-trichomonal activity.
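The Matthews correlation coefficient (C) reported in this abstract is computed from the four cells of a binary confusion matrix. The sketch below, in Python, shows the standard formula; the counts are hypothetical because the abstract gives only percentages, and returning 0.0 when the denominator vanishes is a convention assumed here.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: an empty row or column gives no correlation.
        return 0.0
    return numerator / denominator


# Hypothetical counts for illustration only (not taken from the paper).
c_example = matthews_corrcoef(tp=40, tn=41, fp=5, fn=5)
print(f"C = {c_example:.2f}")
```

Unlike the percentage of correct classifications, C ranges from -1 to 1 and is 0 for a classifier that does no better than chance, which is why it is often reported alongside accuracy.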
Training set selection for the prediction of essential genes.
Cheng, Jian; Xu, Zhao; Wu, Wenwu; Zhao, Li; Li, Xiangchen; Liu, Yanlin; Tao, Shiheng
2014-01-01
Various computational models have been developed to transfer annotations of gene essentiality between organisms. However, despite the increasing number of microorganisms with well-characterized sets of essential genes, selection of appropriate training sets for predicting the essential genes of poorly-studied or newly sequenced organisms remains challenging. In this study, a machine learning approach was applied reciprocally to predict the essential genes in 21 microorganisms. Results showed that training set selection greatly influenced predictive accuracy. We determined four criteria for training set selection: (1) essential genes in the selected training set should be reliable; (2) the growth conditions in which essential genes are defined should be consistent in training and prediction sets; (3) species used as training set should be closely related to the target organism; and (4) organisms used as training and prediction sets should exhibit similar phenotypes or lifestyles. We then analyzed the performance of an incomplete training set and an integrated training set with multiple organisms. We found that the size of the training set should be at least 10% of the total genes to yield accurate predictions. Additionally, the integrated training sets exhibited remarkable increase in stability and accuracy compared with single sets. Finally, we compared the performance of the integrated training sets with the four criteria and with random selection. The results revealed that a rational selection of training sets based on our criteria yields better performance than random selection. Thus, our results provide empirical guidance on training set selection for the identification of essential genes on a genome-wide scale.
Pandey, Gyanendra; Saxena, Anil K
2006-01-01
A set of 65 flexible peptidomimetic competitive inhibitors (52 in the training set and 13 in the test set) of protein tyrosine phosphatase 1B (PTP1B) has been used to compare the quality and predictive power of 3D quantitative structure-activity relationship (QSAR) comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) models for the three most commonly used conformer-based alignments, namely, cocrystallized conformer-based alignment (CCBA), docked conformer-based alignment (DCBA), and global minima energy conformer-based alignment (GMCBA). These three conformers of 5-[(2S)-2-({(2S)-2-[(tert-butoxycarbonyl)amino]-3-phenylpropanoyl}amino)3-oxo-3-pentylamino)propyl]-2-(carboxymethoxy)benzoic acid (compound number 66) were obtained from the X-ray structure of its cocrystallized complex with PTP1B (PDB ID: 1JF7), its docking studies, and its global minima by simulated annealing. Among the 3D QSAR models developed using the above three alignments, the CCBA provided the optimal predictive CoMFA model for the training set with cross-validated r2 (q2)=0.708, non-cross-validated r2=0.902, standard error of estimate (s)=0.165, and F=202.553 and the optimal CoMSIA model with q2=0.440, r2=0.799, s=0.192, and F=117.782. These models also showed the best test set prediction for the 13 compounds with predictive r2 values of 0.706 and 0.683, respectively. Though the QSAR models derived using the other two alignments also produced statistically acceptable models in the order DCBA>GMCBA in terms of the values of q2, r2, and predictive r2, they were inferior to the corresponding models derived using CCBA. Thus, the order of preference for the alignment selection for 3D QSAR model development may be CCBA>DCBA>GMCBA, and the information obtained from the CoMFA and CoMSIA contour maps may be useful in designing specific PTP1B inhibitors.
Chen, Hongming; Carlsson, Lars; Eriksson, Mats; Varkonyi, Peter; Norinder, Ulf; Nilsson, Ingemar
2013-06-24
A novel methodology was developed to build Free-Wilson like local QSAR models by combining R-group signatures and the SVM algorithm. Unlike Free-Wilson analysis this method is able to make predictions for compounds with R-groups not present in a training set. Eleven public data sets were chosen as test cases for comparing the performance of our new method with several other traditional modeling strategies, including Free-Wilson analysis. Our results show that the R-group signature SVM models achieve better prediction accuracy compared with Free-Wilson analysis in general. Moreover, the predictions of R-group signature models are also comparable to the models using ECFP6 fingerprints and signatures for the whole compound. Most importantly, R-group contributions to the SVM model can be obtained by calculating the gradient for R-group signatures. For most of the studied data sets, a significant correlation with that of a corresponding Free-Wilson analysis is shown. These results suggest that the R-group contribution can be used to interpret bioactivity data and highlight that the R-group signature based SVM modeling method is as interpretable as Free-Wilson analysis. Hence the signature SVM model can be a useful modeling tool for any drug discovery project.
Prediction of breast cancer risk with volatile biomarkers in breath.
Phillips, Michael; Cataneo, Renee N; Cruz-Ramos, Jose Alfonso; Huston, Jan; Ornelas, Omar; Pappas, Nadine; Pathak, Sonali
2018-03-23
Human breath contains volatile organic compounds (VOCs) that are biomarkers of breast cancer. We investigated the positive and negative predictive values (PPV and NPV) of breath VOC biomarkers as indicators of breast cancer risk. We employed ultra-clean breath collection balloons to collect breath samples from 54 women with biopsy-proven breast cancer and 124 cancer-free controls. Breath VOCs were analyzed with gas chromatography (GC) combined with either mass spectrometry (GC MS) or surface acoustic wave detection (GC SAW). Chromatograms were randomly assigned to a training set or a validation set. Monte Carlo analysis identified significant breath VOC biomarkers of breast cancer in the training set, and these biomarkers were incorporated into a multivariate algorithm to predict disease in the validation set. In the unsplit dataset, the predictive algorithms generated discriminant function (DF) values that varied with sensitivity, specificity, PPV and NPV. Using GC MS, test accuracy = 90% (area under curve of receiver operating characteristic in unsplit dataset) and cross-validated accuracy = 77%. Using GC SAW, test accuracy = 86% and cross-validated accuracy = 74%. With both assays, a low DF value was associated with a low risk of breast cancer (NPV > 99.9%). A high DF value was associated with a high risk of breast cancer and PPV rising to 100%. Analysis of breath VOC samples collected with ultra-clean balloons detected biomarkers that accurately predicted risk of breast cancer.
Xing, Jing; Lu, Wenchao; Liu, Rongfeng; Wang, Yulan; Xie, Yiqian; Zhang, Hao; Shi, Zhe; Jiang, Hao; Liu, Yu-Chih; Chen, Kaixian; Jiang, Hualiang; Luo, Cheng; Zheng, Mingyue
2017-07-24
Bromodomain-containing protein 4 (BRD4) is implicated in the pathogenesis of a number of different cancers, inflammatory diseases and heart failure. Much effort has been dedicated toward discovering novel scaffold BRD4 inhibitors (BRD4is) with different selectivity profiles and potential antiresistance properties. Structure-based drug design (SBDD) and virtual screening (VS) are the most frequently used approaches. Here, we demonstrate a novel, structure-based VS approach that uses machine-learning algorithms trained on prior structure and activity knowledge to predict the likelihood that a compound is a BRD4i based on its binding pattern with BRD4. In addition to positive experimental data, such as X-ray structures of BRD4-ligand complexes and BRD4 inhibitory potencies, negative data such as false positives (FPs) identified from our earlier ligand screening results were incorporated into our knowledge base. We used the resulting data to train a machine-learning model named BRD4LGR to predict the BRD4i-likeness of a compound. BRD4LGR achieved a 20-30% higher AUC-ROC than that of Glide using the same test set. When conducting in vitro experiments against a library of previously untested, commercially available organic compounds, the second round of VS using BRD4LGR generated 15 new BRD4is. Moreover, inverting the machine-learning model provided easy access to structure-activity relationship (SAR) interpretation for hit-to-lead optimization.
Comparing and Validating Machine Learning Models for Mycobacterium tuberculosis Drug Discovery.
Lane, Thomas; Russo, Daniel P; Zorn, Kimberley M; Clark, Alex M; Korotcov, Alexandru; Tkachenko, Valery; Reynolds, Robert C; Perryman, Alexander L; Freundlich, Joel S; Ekins, Sean
2018-04-26
Tuberculosis is a global health dilemma. In 2016, the WHO reported 10.4 million incidences and 1.7 million deaths. The need to develop new treatments for those infected with Mycobacterium tuberculosis (Mtb) has led to many large-scale phenotypic screens and many thousands of new active compounds identified in vitro. However, with limited funding, efforts to discover new active molecules against Mtb need to be more efficient. Several computational machine learning approaches have been shown to have good enrichment and hit rates. We have curated small molecule Mtb data and developed new models with a total of 18,886 molecules with activity cutoffs of 10 μM, 1 μM, and 100 nM. These data sets were used to evaluate different machine learning methods (including deep learning) and metrics and to generate predictions for additional molecules published in 2017. One Mtb model, a combined in vitro and in vivo data Bayesian model at a 100 nM activity cutoff, yielded the following metrics for 5-fold cross validation: accuracy = 0.88, precision = 0.22, recall = 0.91, specificity = 0.88, kappa = 0.31, and MCC = 0.41. We have also curated an evaluation set (n = 153 compounds) published in 2017, and when used to test our model, it showed comparable statistics (accuracy = 0.83, precision = 0.27, recall = 1.00, specificity = 0.81, kappa = 0.36, and MCC = 0.47). We have also compared these models with additional machine learning algorithms, showing that Bayesian machine learning models constructed with literature Mtb data generated by different laboratories generally were equivalent to or outperformed deep neural networks with external test sets. Finally, we have also compared our training and test sets to show they were suitably diverse and different in order to represent useful evaluation sets. Such Mtb machine learning models could help prioritize compounds for testing in vitro and in vivo.
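The combination of high accuracy and recall with low precision and a modest kappa, as listed in this abstract, is typical of data sets in which actives are rare. The Python sketch below computes these metrics from confusion-matrix counts. The counts are hypothetical: they were chosen to sum to 153, the size of the evaluation set, but they are not taken from the paper, and the function assumes that no denominator is zero.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, specificity and Cohen's kappa.

    Assumes every denominator is non-zero (both classes are present and
    at least one compound is predicted active).
    """
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical counts: 10 actives among 153 compounds, all actives recovered,
# 27 inactives wrongly flagged as active.
metrics = confusion_metrics(tp=10, fp=27, tn=116, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With so few actives, most flagged compounds are false positives even when every active is found, which keeps precision and kappa low while accuracy stays high.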
Introducing Bayesian thinking to high-throughput screening for false-negative rate estimation.
Wei, Xin; Gao, Lin; Zhang, Xiaolei; Qian, Hong; Rowan, Karen; Mark, David; Peng, Zhengwei; Huang, Kuo-Sen
2013-10-01
High-throughput screening (HTS) has been widely used to identify active compounds (hits) that bind to biological targets. Because of cost concerns, the comprehensive screening of millions of compounds is typically conducted without replication. Real hits that fail to exhibit measurable activity in the primary screen due to random experimental errors will be lost as false-negatives. Conceivably, the projected false-negative rate is a parameter that reflects screening quality. Furthermore, it can be used to guide the selection of optimal numbers of compounds for hit confirmation. Therefore, a method that predicts false-negative rates from the primary screening data is extremely valuable. In this article, we describe the implementation of a pilot screen on a representative fraction (1%) of the screening library in order to obtain information about assay variability as well as a preliminary hit activity distribution profile. Using this training data set, we then developed an algorithm based on Bayesian logic and Monte Carlo simulation to estimate the number of true active compounds and potential missed hits from the full library screen. We have applied this strategy to five screening projects. The results demonstrate that this method produces useful predictions on the numbers of false negatives.
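The idea of losing real hits to random experimental error in an unreplicated screen can be illustrated with a small Monte Carlo simulation. The Python sketch below is not the authors' algorithm and omits the Bayesian step entirely; it only assumes Gaussian measurement noise, and the function name, activity values, noise level and hit threshold are all hypothetical.

```python
import random


def simulate_false_negative_rate(true_hit_activities, noise_sd, hit_threshold,
                                 n_trials=2000, seed=0):
    """Fraction of simulated single measurements of real hits that fall
    below the hit threshold because of Gaussian noise (an assumption)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    missed = 0
    total = 0
    for _ in range(n_trials):
        for activity in true_hit_activities:
            measured = activity + rng.gauss(0.0, noise_sd)
            total += 1
            if measured < hit_threshold:
                missed += 1
    return missed / total if total else 0.0


# Hypothetical true activities (% inhibition) of three real hits.
hypothetical_hits = [50.0, 55.0, 80.0]
rate = simulate_false_negative_rate(hypothetical_hits, noise_sd=10.0,
                                    hit_threshold=50.0)
print(f"simulated false-negative rate: {rate:.3f}")
```

In this toy setting, hits whose true activity lies close to the threshold are the ones most often missed, which is the effect a pilot screen with replicate measurements is meant to quantify.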
Predictive Structure-Based Toxicology Approaches To Assess the Androgenic Potential of Chemicals.
Trisciuzzi, Daniela; Alberga, Domenico; Mansouri, Kamel; Judson, Richard; Novellino, Ettore; Mangiatordi, Giuseppe Felice; Nicolotti, Orazio
2017-11-27
We present a practical and easy-to-run in silico workflow exploiting a structure-based strategy making use of docking simulations to derive highly predictive classification models of the androgenic potential of chemicals. Models were trained on a high-quality chemical collection comprising 1689 curated compounds made available within the CoMPARA consortium from the US Environmental Protection Agency and were integrated with a two-step applicability domain whose implementation had the effect of improving both the confidence in prediction and statistics by reducing the number of false negatives. Among the nine androgen receptor X-ray solved structures, the crystal 2PNU (entry code from the Protein Data Bank) was associated with the best performing structure-based classification model. Three validation sets, each comprising 2590 compounds extracted from the DUD-E collection, were used to challenge model performance and the effectiveness of Applicability Domain implementation. Next, the 2PNU model was applied to screen and prioritize two collections of chemicals. The first is a small pool of 12 representative androgenic compounds that were accurately classified based on outstanding rationale at the molecular level. The second is a large external blind set of 55450 chemicals with potential for human exposure. We show how the use of molecular docking provides highly interpretable models and can represent a real-life option as an alternative nontesting method for predictive toxicology.
Linear and nonlinear methods in modeling the aqueous solubility of organic compounds.
Catana, Cornel; Gao, Hua; Orrenius, Christian; Stouten, Pieter F W
2005-01-01
Solubility data for 930 diverse compounds have been analyzed using linear Partial Least Squares (PLS) and nonlinear PLS methods, Continuum Regression (CR), and Neural Networks (NN). 1D and 2D descriptors from the MOE package in combination with E-state or ISIS keys have been used. The best model was obtained using linear PLS for a combination of 22 MOE descriptors and 65 ISIS keys. It has a correlation coefficient (r2) of 0.935 and a root-mean-square error (RMSE) of 0.468 log molar solubility (log S(w)). The model validated on a test set of 177 compounds not included in the training set has r2 0.911 and RMSE 0.475 log S(w). The descriptors were ranked according to their importance, and the 22 MOE descriptors were found at the top of the list. The CR model produced results as good as PLS, and because of the way in which cross-validation has been done it is expected to be a valuable tool in prediction besides the PLS model. The statistics obtained using nonlinear methods did not surpass those obtained with linear ones. The good statistics obtained for linear PLS and CR recommend these models for use in prediction when it is difficult or impossible to make experimental measurements, for virtual screening, combinatorial library design, and efficient lead optimization.
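The two statistics quoted for the solubility models, r2 and RMSE, can be computed from observed and predicted values as sketched below in Python. The abstract does not say which definition of r2 was used; this sketch assumes the coefficient of determination (1 minus the ratio of residual to total sum of squares), and the log S(w) values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical log S(w) values for a handful of test-set compounds.
observed_logs = [-1.2, -3.4, -2.1, -4.8, -0.5]
predicted_logs = [-1.0, -3.9, -2.4, -4.5, -0.9]
print(f"r2 = {r_squared(observed_logs, predicted_logs):.3f}, "
      f"RMSE = {rmse(observed_logs, predicted_logs):.3f} log units")
```

RMSE is expressed in the same log units as the solubility itself, so a value near 0.47 corresponds to a typical error of roughly a factor of three in molar solubility.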
Sazonovas, A; Japertas, P; Didziapetris, R
2010-01-01
This study presents a new type of acute toxicity (LD(50)) prediction that enables automated assessment of the reliability of predictions (which is synonymous with the assessment of the Model Applicability Domain as defined by the Organization for Economic Cooperation and Development). Analysis involved nearly 75,000 compounds from six animal systems (acute rat toxicity after oral and intraperitoneal administration; acute mouse toxicity after oral, intraperitoneal, intravenous, and subcutaneous administration). Fragmental Partial Least Squares (PLS) with 100 bootstraps yielded baseline predictions that were automatically corrected for non-linear effects in local chemical spaces--a combination called Global, Adjusted Locally According to Similarity (GALAS) modelling methodology. Each prediction obtained in this manner is provided with a reliability index value that depends on both the compound's similarity to the training set (that accounts for similar trends in LD(50) variations within multiple bootstraps) and the consistency of experimental results with regard to the baseline model in the local chemical environment. The actual performance of the Reliability Index (RI) was proven by its good (and uniform) correlations with Root Mean Square Error (RMSE) in all validation sets, thus providing quantitative assessment of the Model Applicability Domain. The obtained models can be used for compound screening in the early stages of drug development and prioritization for experimental in vitro testing or later in vivo animal acute toxicity studies.
Samson, Shazwani; Basri, Mahiran; Fard Masoumi, Hamid Reza; Abdul Malek, Emilia; Abedi Karjiban, Roghayeh
2016-01-01
A predictive model of a virgin coconut oil (VCO) nanoemulsion system for the topical delivery of copper peptide (an anti-aging compound) was developed using an artificial neural network (ANN) to investigate the factors that influence particle size. Four independent variables including the amount of VCO, Tween 80: Pluronic F68 (T80:PF68), xanthan gum and water were the inputs whereas particle size was taken as the response for the trained network. Genetic algorithms (GA) were used to model the data, which were divided into training sets, testing sets and validation sets. The model obtained indicated the high quality performance of the neural network and its capability to identify the critical composition factors for the VCO nanoemulsion. The main factor controlling the particle size was found to be xanthan gum (28.56%) followed by T80:PF68 (26.9%), VCO (22.8%) and water (21.74%). The formulation containing copper peptide was then successfully prepared using optimum conditions and particle sizes of 120.7 nm were obtained. The final formulation exhibited a zeta potential lower than -25 mV and showed good physical stability towards centrifugation test, freeze-thaw cycle test and storage at 25°C and 45°C. PMID:27383135
Role of cytopathology in cancer control in low-resource settings: sub-Saharan Africa's perspective.
Thomas, Jaiyeola
2011-03-01
Cancer is an emerging public health problem in Africa especially with increasing exposure to risky life styles, environmental carcinogens and emergence of AIDS-associated cancers. Of the WHO estimated 7.9 million cancer-related deaths in 2007 more than 72% occurred in the low- and middle-income countries and 80% presented in the late stages. To implement the WHO resolution on cancer control programs in these settings, feasible evidence-based interventions for prevention, early diagnosis and detection need to be widely introduced. Fundamental to appropriate cancer treatment and statistics is accurate diagnosis. In low-resource settings, the diagnostic techniques and procedures should be reliable, cost-effective, simple and acceptable to patients. In addition, the required equipment should be affordable, requiring minimal maintenance and with readily available consumables. Cytology, as a simple standardized low-technology procedure, fulfills these criteria and is most effective in addressing the major components of cancer control programs in these areas. The major obstacles to its widespread establishment are lack of awareness and inadequate numbers of trained personnel compounded by sociopolitical factors, poor national planning and implementation. Rather than investing in new technology or alternative screening methods, efforts should focus on the education and training of local personnel, as feasible options, to improve the chances of implementing meaningful cancer control programs.
Huang, Ruili; Southall, Noel; Xia, Menghang; Cho, Ming-Hsuang; Jadhav, Ajit; Nguyen, Dac-Trung; Inglese, James; Tice, Raymond R.; Austin, Christopher P.
2009-01-01
In support of the U.S. Tox21 program, we have developed a simple and chemically intuitive model we call weighted feature significance (WFS) to predict the toxicological activity of compounds, based on the statistical enrichment of structural features in toxic compounds. We trained and tested the model on the following: (1) data from quantitative high-throughput screening cytotoxicity and caspase activation assays conducted at the National Institutes of Health Chemical Genomics Center, (2) data from Salmonella typhimurium reverse mutagenicity assays conducted by the U.S. National Toxicology Program, and (3) hepatotoxicity data published in the Registry of Toxic Effects of Chemical Substances. Enrichments of structural features in toxic compounds are evaluated for their statistical significance and compiled into a simple additive model of toxicity and then used to score new compounds for potential toxicity. The predictive power of the model for cytotoxicity was validated using an independent set of compounds from the U.S. Environmental Protection Agency also tested at the National Institutes of Health Chemical Genomics Center. We compared the performance of our WFS approach with classical classification methods such as Naive Bayesian clustering and support vector machines. In most test cases, WFS showed similar or slightly better predictive power, especially in the prediction of hepatotoxic compounds, where WFS appeared to have the best performance among the three methods. The new algorithm has the important advantages of simplicity, power, interpretability, and ease of implementation. PMID:19805409
Liu, Zhihong; Zheng, Minghao; Yan, Xin; Gu, Qiong; Gasteiger, Johann; Tijhuis, Johan; Maas, Peter; Li, Jiabo; Xu, Jun
2014-09-01
Predicting compound chemical stability is important because unstable compounds can lead to either false positive or false negative conclusions in bioassays. Experimental data (COMDECOM) measured from DMSO/H2O solutions stored at 50 °C for 105 days were used to predict stability by applying rule-embedded naïve Bayesian learning, based upon atom center fragment (ACF) features. To build the naïve Bayesian classifier, we derived ACF features from 9,746 compounds in the COMDECOM dataset. By recursively applying naïve Bayesian learning to the data set, each ACF is assigned an expected stable probability (p(s)) and an unstable probability (p(uns)). 13,340 ACFs, together with their p(s) and p(uns) data, were stored in a knowledge base for use by the Bayesian classifier. For a given compound, its ACFs were derived from its structure connection table with the same protocol used to derive ACFs from the training data. Then, the Bayesian classifier assigned p(s) and p(uns) values to the compound ACFs by a structural pattern recognition algorithm, which was implemented in-house. Compound instability is calculated, with Bayes' theorem, based upon the p(s) and p(uns) values of the compound ACFs. We were able to achieve performance with an AUC value of 84% and a tenfold cross validation accuracy of 76.5%. To reduce false negatives, a rule-based approach has been embedded in the classifier. The rule-based module allows the program to improve its predictivity by expanding its compound instability knowledge base, thus further reducing the possibility of false negatives. To our knowledge, this is the first in silico prediction service for the prediction of the stabilities of organic compounds.
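One way to combine per-fragment p(s) and p(uns) values into a compound-level instability estimate is the usual naive Bayes product, evaluated in log space. The Python sketch below is an assumption about how such a combination could work, not the in-house implementation described in the abstract; the function name, the prior, the clamping constant and the fragment values are all hypothetical.

```python
import math


def instability_probability(fragment_probs, prior_unstable=0.5, eps=1e-9):
    """Naive Bayes combination of per-fragment (p_s, p_uns) pairs.

    Treats fragments as conditionally independent and returns the
    normalized weight of the 'unstable' class.
    """
    log_stable = math.log(1.0 - prior_unstable)
    log_unstable = math.log(prior_unstable)
    for p_s, p_uns in fragment_probs:
        # Clamp to avoid log(0) for fragments seen in only one class.
        log_stable += math.log(max(p_s, eps))
        log_unstable += math.log(max(p_uns, eps))
    # Normalize in a numerically stable way.
    top = max(log_stable, log_unstable)
    w_stable = math.exp(log_stable - top)
    w_unstable = math.exp(log_unstable - top)
    return w_unstable / (w_stable + w_unstable)


# Hypothetical compound with two fragments that are mostly seen in stable compounds.
p_unstable = instability_probability([(0.9, 0.1), (0.8, 0.2)])
print(f"estimated instability: {p_unstable:.3f}")
```

Working in log space avoids numerical underflow when a compound contains many fragments, and a rule-based layer of the kind mentioned in the abstract could then override this estimate for known unstable motifs.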
NASA Astrophysics Data System (ADS)
Ward, Logan; Liu, Ruoqian; Krishna, Amar; Hegde, Vinay I.; Agrawal, Ankit; Choudhary, Alok; Wolverton, Chris
2017-07-01
While high-throughput density functional theory (DFT) has become a prevalent tool for materials discovery, it is limited by the relatively large computational cost. In this paper, we explore using DFT data from high-throughput calculations to create faster, surrogate models with machine learning (ML) that can be used to guide new searches. Our method works by using decision tree models to map DFT-calculated formation enthalpies to a set of attributes consisting of two distinct types: (i) composition-dependent attributes of elemental properties (as have been used in previous ML models of DFT formation energies), combined with (ii) attributes derived from the Voronoi tessellation of the compound's crystal structure. The ML models created using this method have half the cross-validation error and similar training and evaluation speeds to models created with the Coulomb matrix and partial radial distribution function methods. For a dataset of 435 000 formation energies taken from the Open Quantum Materials Database (OQMD), our model achieves a mean absolute error of 80 meV/atom in cross validation, which is lower than the approximate error between DFT-computed and experimentally measured formation enthalpies and below 15% of the mean absolute deviation of the training set. We also demonstrate that our method can accurately estimate the formation energy of materials outside of the training set and be used to identify materials with especially large formation enthalpies. We propose that our models can be used to accelerate the discovery of new materials by identifying the most promising materials to study with DFT at little additional computational cost.
Testate amoebae communities sensitive to surface moisture conditions in Patagonian peatlands
NASA Astrophysics Data System (ADS)
Loisel, J.; Booth, R.; Charman, D.; van Bellen, S.; Yu, Z.
2017-12-01
Here we examine moss surface samples that were collected during three field campaigns (2005, 2010, 2014) across southern Patagonian peatlands to assess the potential use of testate amoebae and 13C isotope data as proxy indicators of soil moisture. These proxies have been widely tested across North America, but their use as paleoecological tools remains sparse in the southern hemisphere. Samples were collected along a hydrological gradient spanning a range of water table depth from 0 cm in wet hollows to over 85 cm in dry hummocks. Moss moisture content was measured in the field. Over 25 taxa were identified, with many of them not found in North America. Ordinations indicate statistically significant and dominant effects of soil moisture and water table depth on testate assemblages, though interestingly 13C is even more strongly correlated with testate amoebae than direct soil conditions. It is possible that moss 13C signature constitutes a compound indicator that represents seasonal soil moisture better than opportunistic sampling during field campaigns. There is no significant effect of year or site across the dataset. In addition to providing a training set that translates testate amoebae moisture tolerance range into water table depth for Patagonian peatlands, we also compare our results with those from the North American training set to show that, despite 'novel' Patagonian taxa, the robustness of international training sets is probably sufficient to quantify most changes in soil moisture from any site around the world. We also identify key indicator species that are shown to be of universal value in peat-based hydrological reconstructions.
Martins Alho, Miriam A; Marrero-Ponce, Yovani; Barigye, Stephen J; Meneses-Marcel, Alfredo; Machado Tugores, Yanetsy; Montero-Torres, Alina; Gómez-Barrio, Alicia; Nogal, Juan J; García-Sánchez, Rory N; Vega, María Celeste; Rolón, Miriam; Martínez-Fernández, Antonio R; Escario, José A; Pérez-Giménez, Facundo; Garcia-Domenech, Ramón; Rivera, Norma; Mondragón, Ricardo; Mondragón, Mónica; Ibarra-Velarde, Froylán; Lopez-Arencibia, Atteneri; Martín-Navarro, Carmen; Lorenzo-Morales, Jacob; Cabrera-Serra, Maria Gabriela; Piñero, Jose; Tytgat, Jan; Chicharro, Roberto; Arán, Vicente J
2014-03-01
Protozoan parasites have been one of the most significant public health problems for centuries and several human infections caused by them have massive global impact. Most of the current drugs used to treat these illnesses have been used for decades and have many limitations such as the emergence of drug resistance, severe side-effects, low-to-medium drug efficacy, administration routes, cost, etc. These drugs have been largely neglected as models for drug development because they are mainly used in countries with limited resources and, as a consequence, with scarce marketing possibilities. Nowadays, there is a pressing need to identify and develop new drug-based antiprotozoan therapies. In an effort to overcome this problem, the main purpose of this study is to develop a QSARs-based ensemble classifier for antiprotozoan drug-like entities from a heterogeneous compounds collection. Here, we use some of the TOMOCOMD-CARDD molecular descriptors and linear discriminant analysis (LDA) to derive individual linear classification functions in order to discriminate between antiprotozoan and non-antiprotozoan compounds as a way to enable the computational screening of virtual combinatorial datasets and/or drugs already approved. Firstly, we construct a wide-spectrum benchmark database comprising 680 organic chemicals with great structural variability (254 of them antiprotozoan agents and 426 drugs having other clinical uses). This series of compounds was processed by a k-means cluster analysis in order to design training and predicting sets. In total, seven discriminant functions were obtained, by using the whole set of atom-based linear indices. All the LDA-based QSAR models show accuracies above 85% in the training set and values of Matthews correlation coefficients (C) vary from 0.70 to 0.86. The external validation set shows rather good global classifications of around 80% (92.05% for best equation).
Later, we developed a multi-agent QSAR classification system, in which the individual QSAR outputs are the inputs of the aforementioned fusion approach. Finally, the fusion model was used for the identification of a novel generation of lead-like antiprotozoan compounds by using ligand-based virtual screening of 'available' small molecules (with synthetic feasibility) in our 'in-house' library. A new molecular subsystem (quinoxalinones) was then theoretically selected as a promising lead series, and its derivatives subsequently synthesized, structurally characterized, and experimentally assayed by using in vitro screening that took into consideration a battery of five parasite-based assays. The chemicals 11(12) and 16 are the most active (hits) against apicomplexa (sporozoa) and mastigophora (flagellata) subphylum parasites, respectively. Both compounds depicted good activity in every protozoan in vitro panel and they did not show unspecific cytotoxicity on the host cells. The described technical framework seems to be a promising QSAR-classifier tool for the molecular discovery and development of novel classes of broad-antiprotozoan-spectrum drugs, which may meet the dual challenges posed by drug-resistant parasites and the rapid progression of protozoan illnesses. Copyright © 2014 Elsevier Ltd. All rights reserved.
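The abstract above reports both accuracy and the Matthews correlation coefficient (C) for its classification models. For reference, both quantities follow directly from a 2x2 confusion matrix, as in the sketch below. The function names are illustrative, and any counts passed in are the reader's own; none are taken from the study.

```python
from math import sqrt

def accuracy(tp, tn, fp, fn):
    """Fraction of compounds classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect classification). Returns 0.0 when a row or column of the
    confusion matrix is empty, a common convention for the undefined case.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Unlike accuracy, the coefficient stays informative when the two classes are unbalanced, as they are in a 254-versus-426 data set.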
NASA Astrophysics Data System (ADS)
Zhao, Siqi; Zhang, Guanglong; Xia, Shuwei; Yu, Liangmin
2018-06-01
As a group of diversified frameworks, quinazolin derivatives display a broad range of biological functions, especially anticancer activity. To investigate the quantitative structure-activity relationship, 3D-QSAR models were generated with 24 quinazolin scaffold molecules. The experimental and predicted pIC50 values for both training and test set compounds showed good correlation, which proved the robustness and reliability of the generated QSAR models. The most effective CoMFA and CoMSIA models were obtained with a non-cross-validated correlation coefficient (r²_ncv) of 1.00 (both) and a leave-one-out coefficient (q²) of 0.61 and 0.59, respectively. The predictive abilities of CoMFA and CoMSIA were quite good, with predictive correlation coefficients (r²_pred) of 0.97 and 0.91. In addition, the statistical results of CoMFA and CoMSIA were used to design new quinazolin molecules.
Besalú, Emili
2016-01-01
The Superposing Significant Interaction Rules (SSIR) method is described. It is a general combinatorial and symbolic procedure able to rank compounds belonging to combinatorial analogue series. The procedure generates structure-activity relationship (SAR) models and also serves as an inverse SAR tool. The method is fast and can deal with large databases. SSIR operates from statistical significances calculated from the available library of compounds and according to the previously attached molecular labels of interest or non-interest. The required symbolic codification allows dealing with almost any combinatorial data set, even in a confidential manner, if desired. The application example categorizes molecules as binding or non-binding, and consensus ranking SAR models are generated from training and two distinct cross-validation methods: leave-one-out and balanced leave-two-out (BL2O), the latter being suited for the treatment of binary properties. PMID:27240346
The essential roles of chemistry in high-throughput screening triage
Dahlin, Jayme L; Walters, Michael A
2015-01-01
It is increasingly clear that academic high-throughput screening (HTS) and virtual HTS triage suffers from a lack of scientists trained in the art and science of early drug discovery chemistry. Many recent publications report the discovery of compounds by screening that are most likely artifacts or promiscuous bioactive compounds, and these results are not placed into the context of previous studies. For HTS to be most successful, it is our contention that there must exist an early partnership between biologists and medicinal chemists. Their combined skill sets are necessary to design robust assays and efficient workflows that will weed out assay artifacts, false positives, promiscuous bioactive compounds and intractable screening hits, efforts that ultimately give projects a better chance at identifying truly useful chemical matter. Expertise in medicinal chemistry, cheminformatics and purification sciences (analytical chemistry) can enhance the post-HTS triage process by quickly removing these problematic chemotypes from consideration, while simultaneously prioritizing the more promising chemical matter for follow-up testing. It is only when biologists and chemists collaborate effectively that HTS can manifest its full promise. PMID:25163000
Vik, Anders; Proszenyák, Agnes; Vermeersch, Marieke; Cos, Paul; Maes, Louis; Gundersen, Lise-Lotte
2009-01-08
There is an urgent need for novel and improved drugs against several tropical diseases caused by protozoa. The marine sponge (Agelas sp.) metabolite agelasine D, as well as other agelasine analogs and related structures were screened for inhibitory activity against Plasmodium falciparum, Leishmania infantum, Trypanosoma brucei and T. cruzi, as well as for toxicity against MRC-5 fibroblast cells. Many compounds displayed high general toxicity towards both the protozoa and MRC-5 cells. However, two compounds exhibited more selective inhibitory activity against L. infantum (IC(50) <0.5 microg/mL) while two others displayed IC(50) <1 microg/mL against T. cruzi in combination with relatively low toxicity against MRC-5 cells. According to criteria set up by the WHO Special Programme for Research & Training in Tropical Diseases (TDR), these compounds could be classified as hits for leishmaniasis and for Chagas disease, respectively. Identification of the hits as well as other SAR data from this initial screening will be valuable for design of more potent and selective potential drugs against these neglected tropical diseases.
Identity and distribution of residues of energetic compounds at army live-fire training ranges.
Jenkins, Thomas F; Hewitt, Alan D; Grant, Clarence L; Thiboutot, Sonia; Ampleman, Guy; Walsh, Marianne E; Ranney, Thomas A; Ramsey, Charles A; Palazzo, Antonio J; Pennington, Judith C
2006-05-01
Environmental investigations have been conducted at 23 military firing ranges in the United States and Canada. The specific training facilities most frequently evaluated were hand grenade, antitank rocket, and artillery ranges. Energetic compounds (explosives and propellants) were determined and linked to the type of munition used and the major mechanisms of deposition.
Quantitative structure-activity relationships for organophosphates binding to acetylcholinesterase.
Ruark, Christopher D; Hack, C Eric; Robinson, Peter J; Anderson, Paul E; Gearhart, Jeffery M
2013-02-01
Organophosphates are a group of pesticides and chemical warfare nerve agents that inhibit acetylcholinesterase, the enzyme responsible for hydrolysis of the excitatory neurotransmitter acetylcholine. Numerous structural variants exist for this chemical class, and data regarding their toxicity can be difficult to obtain in a timely fashion. At the same time, their use as pesticides and military weapons is widespread, which presents a major concern and challenge in evaluating human toxicity. To address this concern, a quantitative structure-activity relationship (QSAR) was developed to predict pentavalent organophosphate oxon human acetylcholinesterase bimolecular rate constants. A database of 278 three-dimensional structures and their bimolecular rates was developed from 15 peer-reviewed publications. A database of simplified molecular input line entry notations and their respective acetylcholinesterase bimolecular rate constants are listed in Supplementary Material, Table I. The database was quite diverse, spanning 7 log units of activity. In order to describe their structure, 675 molecular descriptors were calculated using AMPAC 8.0 and CODESSA 2.7.10. Orthogonal projection to latent structures regression, bootstrap leave-random-many-out cross-validation and y-randomization were used to develop an externally validated consensus QSAR model. The domain of applicability was assessed by the Williams plot. Six external compounds were outside the warning leverage indicating potential model extrapolation. A number of compounds had residuals >2 or <-2, indicating potential outliers or activity cliffs. The results show that the HOMO-LUMO energy gap contributed most significantly to the binding affinity. A mean training R² of 0.80, a mean test set R² of 0.76 and a consensus external test set R² of 0.66 were achieved using the QSAR. The training and external test set RMSE values were found to be 0.76 and 0.88.
The results suggest that this QSAR model can be used in physiologically based pharmacokinetic/pharmacodynamic models of organophosphate toxicity to determine the rate of acetylcholinesterase inhibition.
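The Williams plot used in this abstract places each compound by its leverage and its standardized residual. The following is a deliberately simplified sketch for a single descriptor with an intercept; the study itself used a multivariate projection model with many descriptors. The warning leverage h* = 3(p + 1)/n is the convention commonly quoted in the QSAR literature and is assumed here, as are the function names. The residual cutoff of 2 comes from the abstract.

```python
def leverage(x, train_xs):
    """Leverage of descriptor value x relative to a one-descriptor
    training set (simple linear regression with intercept)."""
    n = len(train_xs)
    mean_x = sum(train_xs) / n
    sxx = sum((v - mean_x) ** 2 for v in train_xs)
    return 1.0 / n + (x - mean_x) ** 2 / sxx

def warning_leverage(n_train, n_descriptors):
    """Commonly used threshold h* = 3(p + 1)/n (assumed convention)."""
    return 3.0 * (n_descriptors + 1) / n_train

def williams_flags(x, std_residual, train_xs, n_descriptors=1, resid_cut=2.0):
    """Classify one compound the way a Williams plot is usually read."""
    h = leverage(x, train_xs)
    return {
        "leverage": h,
        "outside_domain": h > warning_leverage(len(train_xs), n_descriptors),
        "response_outlier": abs(std_residual) > resid_cut,
    }
```

A compound flagged as `outside_domain` is structurally far from the training data, so its prediction is an extrapolation; a compound flagged only as `response_outlier` is poorly predicted despite being well inside the descriptor range.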
Locomotor and discriminative stimulus effects of four novel hallucinogens in rodents.
Gatch, Michael B; Dolan, Sean B; Forster, Michael J
2017-08-01
There has been increasing use of novel synthetic hallucinogenic compounds, 2-(4-bromo-2,5-dimethoxyphenyl)-N-(2-methoxybenzyl)ethanamine hydrochloride (25B-NBOMe), 2-(4-chloro-2,5-dimethoxyphenyl)-N-(2-methoxybenzyl)ethanamine hydrochloride (25C-NBOMe), 2-(4-iodo-2,5-dimethoxyphenyl)-N-(2-methoxybenzyl)ethanamine hydrochloride (25I-NBOMe), and N,N-diallyl-5-methoxy tryptamine (5-MeO-DALT), which have been associated with severe toxicities. These four compounds were tested for discriminative stimulus effects similar to a prototypical hallucinogen (-)-2,5-dimethoxy-4-methylamphetamine (DOM) and the entactogen (±)-3,4-methylenedioxymethamphetamine (MDMA). Locomotor activity in mice was tested to obtain dose range and time-course information. 25B-NBOMe, 25C-NBOMe, and 25I-NBOMe decreased locomotor activity. 5-MeO-DALT dose-dependently increased locomotor activity, with a peak at 10 mg/kg. A higher dose (25 mg/kg) suppressed activity. 25B-NBOMe fully substituted (≥80%) in both DOM-trained and MDMA-trained rats at 0.5 mg/kg. However, higher doses produced much lower levels of drug-appropriate responding in both DOM-trained and MDMA-trained rats. 25C-NBOMe fully substituted in DOM-trained rats, but produced only 67% drug-appropriate responding in MDMA-trained rats at doses that suppressed responding. 25I-NBOMe produced 74-78% drug-appropriate responding in DOM-trained and MDMA-trained rats at doses that suppressed responding. 5-MeO-DALT fully substituted for DOM, but produced few or no MDMA-like effects. All of the compounds, except 25I-NBOMe, fully substituted for DOM, whereas only 25B-NBOMe fully substituted for MDMA. However, the failure of 25I-NBOMe to fully substitute for either MDMA or DOM was more likely because of its substantial rate-depressant effects than weak discriminative stimulus effects. All of the compounds are likely to attract recreational users for their hallucinogenic properties, but are probably of much less interest as substitutes for MDMA.
Although no acute adverse effects were observed at the doses tested, the substantial toxicities reported in humans, coupled with the high likelihood for illicit use, suggest that these compounds have the same potential for abuse as other, currently scheduled compounds.
QSAR modeling for predicting mutagenic toxicity of diverse chemicals for regulatory purposes.
Basant, Nikita; Gupta, Shikha
2017-06-01
The safety assessment process of chemicals requires information on their mutagenic potential. The experimental determination of mutagenicity of a large number of chemicals is tedious and time and cost intensive, which creates a need for alternative methods. We have established local and global QSAR models for discriminating low and high mutagenic compounds and predicting their mutagenic activity in a quantitative manner in Salmonella typhimurium (TA) bacterial strains (TA98 and TA100). The decision treeboost (DTB)-based classification QSAR models discriminated among two categories with accuracies of >96% and the regression QSAR models precisely predicted the mutagenic activity of diverse chemicals yielding high correlations (R²) between the experimental and model-predicted values in the respective training (>0.96) and test (>0.94) sets. The test set root mean squared error (RMSE) and mean absolute error (MAE) values emphasized the usefulness of the developed models for predicting new compounds. Relevant structural features of diverse chemicals that are responsible for and influence the mutagenic activity were identified. The applicability domains of the developed models were defined. The developed models can be used as tools for screening new chemicals for their mutagenicity assessment for regulatory purposes.
QSAR studies on triazole derivatives as sglt inhibitors via CoMFA and CoMSIA
NASA Astrophysics Data System (ADS)
Zhi, Hui; Zheng, Junxia; Chang, Yiqun; Li, Qingguo; Liao, Guochao; Wang, Qi; Sun, Pinghua
2015-10-01
Forty-six sodium-dependent glucose cotransporter-2 (SGLT-2) inhibitors with hypoglycemic activity were selected to develop three-dimensional quantitative structure-activity relationship (3D-QSAR) models using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA). A training set of 39 compounds was used to build the models, which were then evaluated by a series of internal and external cross-validation techniques. A test set of 7 compounds was used for the external validation. The CoMFA model yielded a q² value of 0.792 and an r² value of 0.985. The best CoMSIA model yielded a q² value of 0.633 and an r² value of 0.895 based on a combination of steric, electrostatic, hydrophobic and hydrogen-bond acceptor effects. The predictive correlation coefficients (r²_pred) of the CoMFA and CoMSIA models were 0.872 and 0.839, respectively. The analysis of the contour maps from each model provided insight into the structural requirements for the development of more active SGLT inhibitors, and on the basis of the models 8 new SGLT inhibitors were designed and predicted.
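The statistics quoted in this abstract are the fitted r² and the leave-one-out cross-validated q². As a rough illustration of how the two differ, the sketch below computes both for an ordinary least-squares line on one descriptor. This swaps in simple linear regression for the partial least squares used by CoMFA and CoMSIA, so it shows the definitions only. The convention q² = 1 - PRESS/SS, with SS taken about the mean of all observed activities, is assumed; some authors use the fold-wise training mean instead.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept (xs must not all be equal)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys):
    """Fitted (non-cross-validated) coefficient of determination."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def q_squared_loo(xs, ys):
    """Leave-one-out q2: refit without compound i, then predict compound i."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / ss_tot
```

For a least-squares fit, q² can never exceed r², because each left-out prediction error is at least as large as the corresponding fitted residual. A wide gap between the two, such as an r² near 1 with a much lower q², is usually read as a sign of overfitting.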
Limited Effects of Set Shifting Training in Healthy Older Adults
Grönholm-Nyman, Petra; Soveri, Anna; Rinne, Juha O.; Ek, Emilia; Nyholm, Alexandra; Stigsdotter Neely, Anna; Laine, Matti
2017-01-01
Our ability to flexibly shift between tasks or task sets declines in older age. As this decline may have adverse effects on everyday life of elderly people, it is of interest to study whether set shifting ability can be trained, and if training effects generalize to other cognitive tasks. Here, we report a randomized controlled trial where healthy older adults trained set shifting with three different set shifting tasks. The training group (n = 17) performed adaptive set shifting training for 5 weeks with three training sessions a week (45 min/session), while the active control group (n = 16) played three different computer games for the same period. Both groups underwent extensive pre- and post-testing and a 1-year follow-up. Compared to the controls, the training group showed significant improvements on the trained tasks. Evidence for near transfer in the training group was very limited, as it was seen only on overall accuracy on an untrained computerized set shifting task. No far transfer to other cognitive functions was observed. One year later, the training group was still better on the trained tasks but the single near transfer effect had vanished. The results suggest that computerized set shifting training in the elderly shows long-lasting effects on the trained tasks but very little benefit in terms of generalization. PMID:28386226
Haddon, J E; Killcross, S
2011-12-29
Previous research suggests the infralimbic cortex is important in situations when there is competition between goal-directed and habitual responding. Here we used a response conflict procedure to further explore the involvement of the infralimbic cortex in this relationship. Rats received training on two instrumental biconditional discriminations, one auditory and one visual, in two distinct contexts. One discrimination was "over-trained" relative to the other, "under-trained," discrimination in the ratio 3:1. At test, animals were presented with incongruent audiovisual stimulus compounds of the training stimuli in the under-trained context. The stimulus elements of these test compounds have previously dictated different lever press responses during training. Rats receiving control infusions into the infralimbic cortex showed a significant interference effect, producing more responses to the over-trained (habitual), but context-inappropriate, stimulus element of the incongruent compound. This interference effect was abolished by inactivation of the infralimbic cortex; animals showed a reduced tendency to produce the habitual but inappropriate response compared with animals receiving control infusions. This finding provides evidence that the infralimbic cortex is involved in attenuating the influence of goal-directed behavior, for example context-appropriate responding. Copyright © 2011 IBRO. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Jain, Sankalp; Grandits, Melanie; Richter, Lars; Ecker, Gerhard F.
2017-06-01
The bile salt export pump (BSEP) actively transports conjugated monovalent bile acids from the hepatocytes into the bile. This facilitates the formation of micelles and promotes digestion and absorption of dietary fat. Inhibition of BSEP leads to decreased bile flow and accumulation of cytotoxic bile salts in the liver. A number of compounds have been identified to interact with BSEP, which results in drug-induced cholestasis or liver injury. Therefore, in silico approaches for flagging compounds as potential BSEP inhibitors would be of high value in the early stage of the drug discovery pipeline. Up to now, due to the lack of a high-resolution X-ray structure of BSEP, in silico based identification of BSEP inhibitors focused on ligand-based approaches. In this study, we provide a homology model for BSEP, developed using the corrected mouse P-glycoprotein structure (PDB ID: 4M1M). Subsequently, the model was used for docking-based classification of a set of 1212 compounds (405 BSEP inhibitors, 807 non-inhibitors). Using the scoring function ChemScore, a prediction accuracy of 81% on the training set and 73% on two external test sets could be obtained. In addition, the applicability domain of the models was assessed based on Euclidean distance. Further, analysis of the protein-ligand interaction fingerprints revealed certain functional group-amino acid residue interactions that could play a key role for ligand binding. Though ligand-based models, due to their high speed and accuracy, remain the method of choice for classification of BSEP inhibitors, structure-assisted docking models demonstrate reasonably good prediction accuracies while additionally providing information about putative protein-ligand interactions.
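The abstract states that the applicability domain was assessed with Euclidean distance but gives no further detail. One common way to do this, shown here purely as an assumed illustration, is to measure each compound's distance to the centroid of the training descriptors and to accept it if that distance is within the training mean plus k standard deviations. The cutoff rule, the value of `k`, and the function names are all hypothetical.

```python
from math import dist, sqrt  # math.dist requires Python 3.8+

def centroid(rows):
    """Column-wise mean of a list of equal-length descriptor vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def domain_threshold(train_rows, k=3.0):
    """Return (centroid, cutoff), where cutoff = mean + k * std of the
    training compounds' distances to the centroid (assumed rule)."""
    c = centroid(train_rows)
    distances = [dist(row, c) for row in train_rows]
    mean_d = sum(distances) / len(distances)
    std_d = sqrt(sum((d - mean_d) ** 2 for d in distances) / len(distances))
    return c, mean_d + k * std_d

def in_domain(row, c, cutoff):
    """True if the compound lies within the distance cutoff."""
    return dist(row, c) <= cutoff
```

In practice the descriptors would be scaled before computing distances, since an unscaled descriptor with a large numeric range would dominate the result.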
Dolch, Michael E; Janitza, Silke; Boulesteix, Anne-Laure; Graßmann-Lichtenauer, Carola; Praun, Siegfried; Denzer, Wolfgang; Schelling, Gustav; Schubert, Sören
2016-12-01
Identification of microorganisms in positive blood cultures still relies on standard techniques such as Gram staining followed by culturing with definite microorganism identification. Alternatively, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry or the analysis of headspace volatile compound (VC) composition produced by cultures can help to differentiate between microorganisms under experimental conditions. This study assessed the efficacy of volatile compound based microorganism differentiation into Gram-negatives and -positives in unselected positive blood culture samples from patients. Headspace gas samples of positive blood culture samples were transferred to sterilized, sealed, and evacuated 20 ml glass vials and stored at -30 °C until batch analysis. Headspace gas VC content analysis was carried out via an auto sampler connected to an ion-molecule reaction mass spectrometer (IMR-MS). Measurements covered a mass range from 16 to 135 u including CO2, H2, N2, and O2. Prediction rules for microorganism identification based on VC composition were derived using a training data set and evaluated using a validation data set within a random split validation procedure. One-hundred-fifty-two aerobic samples growing 27 Gram-negatives, 106 Gram-positives, and 19 fungi and 130 anaerobic samples growing 37 Gram-negatives, 91 Gram-positives, and two fungi were analysed. In anaerobic samples, ten discriminators were identified by the random forest method allowing for bacteria differentiation into Gram-negative and -positive (error rate: 16.7 % in validation data set). For aerobic samples the error rate was not better than random. In anaerobic blood culture samples of patients IMR-MS based headspace VC composition analysis facilitates bacteria differentiation into Gram-negative and -positive.
Editor's highlight: Evaluation of a Microelectrode Array-based ...
Thousands of compounds in the environment have not been characterized for developmental neurotoxicity (DNT) hazard. To address this issue, methods to screen compounds rapidly for DNT hazard evaluation are necessary and are being developed for key neurodevelopmental processes. In order to develop an assay for network formation, the current study evaluated effects of a training set of chemicals on network ontogeny by measuring spontaneous electrical activity in neural networks grown on microelectrode arrays (MEA). Rat (0-24 h old) primary cortical cells were plated in 48 well MEA plates and exposed to six compounds: acetaminophen, bisindolylmaleimide-1 (Bis-1), domoic acid, mevastatin, sodium orthovanadate, and loperamide for a period of 12 days. Spontaneous network activity was recorded on days 2, 5, 7, 9, and 12 and viability was assessed using the Cell Titer Blue® assay on day 12. Network activity (e.g. mean firing rate (MFR), burst rate (BR), etc), increased between days 5 and 12. Random Forest analysis indicated that across all compounds and times, temporal correlation of firing patterns (r), MFR, BR, #of active electrodes and % of spikes in a burst were the most influential parameters in separating control from treated wells. All compounds except acetaminophen (≤ 30 µM) caused concentration-related effects on one or more of these parameters. Domoic acid and sodium orthovanadate altered several of these parameters in the absence of cytotoxicity. Although
Improving compound-protein interaction prediction by building up highly credible negative samples.
Liu, Hui; Sun, Jianjiang; Guan, Jihong; Zheng, Jie; Zhou, Shuigeng
2015-06-15
Computational prediction of compound-protein interactions (CPIs) is of great importance for drug design and development, as genome-scale experimental validation of CPIs is not only time-consuming but also prohibitively expensive. With the availability of an increasing number of validated interactions, the performance of computational prediction approaches is severely impeded by the lack of reliable negative CPI samples. A systematic method of screening reliable negative samples becomes critical to improving the performance of in silico prediction methods. This article aims at building up a set of highly credible negative samples of CPIs via an in silico screening method. As most existing computational models assume that similar compounds are likely to interact with similar target proteins and achieve remarkable performance, it is rational to identify potential negative samples based on the converse negative proposition that the proteins dissimilar to every known/predicted target of a compound are unlikely to be targeted by the compound and vice versa. We integrated various resources, including chemical structures, chemical expression profiles and side effects of compounds, amino acid sequences, protein-protein interaction network and functional annotations of proteins, into a systematic screening framework. We first tested the screened negative samples on six classical classifiers, and all these classifiers achieved remarkably higher performance on our negative samples than on randomly generated negative samples for both human and Caenorhabditis elegans. We then verified the negative samples on three existing prediction models, including bipartite local model, Gaussian kernel profile and Bayesian matrix factorization, and found that the performances of these models are also significantly improved on the screened negative samples. Moreover, we validated the screened negative samples on a drug bioactivity dataset.
Finally, we derived two sets of new interactions by training a support vector machine classifier on the positive interactions annotated in DrugBank and our screened negative interactions. The screened negative samples and the predicted interactions provide the research community with a useful resource for identifying new drug targets and a helpful supplement to the current curated compound-protein databases. Supplementary files are available at: http://admis.fudan.edu.cn/negative-cpi/. © The Author 2015. Published by Oxford University Press.
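The screening principle in this abstract, that a protein dissimilar to every known target of a compound is a credible non-target, can be sketched in a few lines. This is only the protein-side half of the idea with a single similarity measure, whereas the study integrates several data resources and also applies the converse over compounds. The `similarity` callable, the `max_sim` cutoff, and the function name are hypothetical.

```python
def screen_negatives(known_targets, candidate_proteins, similarity, max_sim=0.3):
    """Propose credible negative compound-protein pairs.

    known_targets: {compound: set of proteins known to interact with it}
    candidate_proteins: iterable of proteins to consider
    similarity: callable (protein_a, protein_b) -> value in [0, 1]
    A pair (compound, protein) is kept only if the protein is at most
    max_sim similar to EVERY known target of that compound.
    """
    negatives = []
    for compound, targets in known_targets.items():
        for protein in candidate_proteins:
            if protein in targets:
                continue  # known positive, never a negative
            if all(similarity(protein, t) <= max_sim for t in targets):
                negatives.append((compound, protein))
    return negatives
```

Requiring dissimilarity to all known targets, not just to the average target, is what makes the resulting negatives more credible than randomly sampled unlabeled pairs.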
Effects of Goal Setting on Performance and Job Satisfaction
ERIC Educational Resources Information Center
Ivancevich, John M.
1976-01-01
Studied the effect of goal-setting training on the performance and job satisfaction of sales personnel. One group was trained in participative goal setting; one group was trained in assigned goal setting; and one group received no training. Both trained groups showed temporary improvements in performance and job satisfaction. For availability see…
Maragakis, Alexandros; Siddharthan, Ragavan; RachBeisel, Jill; Snipes, Cassandra
2016-09-01
Individuals with serious mental illness (SMI) are more likely to experience preventable medical health issues, such as diabetes, hyperlipidemia, obesity, and cardiovascular disease, than the general population. To further compound this issue, these individuals are less likely to seek preventative medical care. These factors result in higher usage of expensive emergency care, lower quality of care, and lower life expectancy. This manuscript presents literature that examines the health disparities this population experiences, and barriers to accessing primary care. Through the identification of these barriers, we recommend that the field of family medicine work in collaboration with the field of mental health to implement 'reverse' integrated care (RIC) systems, and provide primary care services in the mental health settings. By embedding primary care practitioners in mental health settings, where individuals with SMI are more likely to present for treatment, this population may receive treatment for somatic care by experts. This not only would improve the quality of care received by patients, but would also remove the burden of managing complex somatic care from providers trained in mental health. The rationale for this RIC system, as well as training and policy reforms, are discussed.
Adversarial Threshold Neural Computer for Molecular de Novo Design.
Putin, Evgeny; Asadulaev, Arip; Vanhaelen, Quentin; Ivanenkov, Yan; Aladinskaya, Anastasia V; Aliper, Alex; Zhavoronkov, Alex
2018-03-30
In this article, we propose the deep neural network Adversarial Threshold Neural Computer (ATNC). The ATNC model is intended for the de novo design of novel small-molecule organic structures. The model is based on generative adversarial network architecture and reinforcement learning. ATNC uses a Differentiable Neural Computer as a generator and has a new specific block, called adversarial threshold (AT). AT acts as a filter between the agent (generator) and the environment (discriminator + objective reward functions). Furthermore, to generate more diverse molecules we introduce a new objective reward function named Internal Diversity Clustering (IDC). In this work, ATNC is tested and compared with the ORGANIC model. Both models were trained on the SMILES string representation of the molecules, using four objective functions (internal similarity, Muegge druglikeness filter, presence or absence of sp3-rich fragments, and IDC). The SMILES representations of 15K druglike molecules from the ChemDiv collection were used as a training data set. For the different functions, ATNC outperforms ORGANIC. Combined with the IDC, ATNC generates 72% valid and 77% unique SMILES strings, while ORGANIC generates only 7% valid and 86% unique SMILES strings. For each set of molecules generated by ATNC and ORGANIC, we analyzed distributions of four molecular descriptors (number of atoms, molecular weight, logP, and TPSA) and calculated five chemical statistical features (internal diversity, number of unique heterocycles, number of clusters, number of singletons, and number of compounds that have not been passed through medicinal chemistry filters). Analysis of key molecular descriptors and chemical statistical features demonstrated that the molecules generated by ATNC exhibited better druglikeness properties. We also performed in vitro validation of the molecules generated by ATNC; results indicated that ATNC is an effective method for producing hit compounds.
Ma, X H; Wang, R; Tan, C Y; Jiang, Y Y; Lu, T; Rao, H B; Li, X Y; Go, M L; Low, B C; Chen, Y Z
2010-10-04
Multitarget agents have been increasingly explored for enhancing efficacy and reducing countertarget activities and toxicities. Efficient virtual screening (VS) tools for searching selective multitarget agents are desired. Combinatorial support vector machines (C-SVM) were tested as VS tools for searching dual-inhibitors of 11 combinations of 9 anticancer kinase targets (EGFR, VEGFR, PDGFR, Src, FGFR, Lck, CDK1, CDK2, GSK3). C-SVM trained on 233-1,316 non-dual-inhibitors correctly identified 26.8%-57.3% (majority >36%) of the 56-230 intra-kinase-group dual-inhibitors (equivalent to the 50-70% yields of two independent individual target VS tools), and 12.2% of the 41 inter-kinase-group dual-inhibitors. C-SVM were fairly selective in misidentifying as dual-inhibitors 3.7%-48.1% (majority <20%) of the 233-1,316 non-dual-inhibitors of the same kinase pairs and 0.98%-4.77% of the 3,971-5,180 inhibitors of other kinases. C-SVM produced low false-hit rates in misidentifying as dual-inhibitors 1,746-4,817 (0.013%-0.036%) of the 13.56 M PubChem compounds, 12-175 (0.007%-0.104%) of the 168 K MDDR compounds, and 0-84 (0.0%-2.9%) of the 19,495-38,483 MDDR compounds similar to the known dual-inhibitors. C-SVM was compared to other VS methods Surflex-Dock, DOCK Blaster, kNN and PNN against the same sets of kinase inhibitors and the full set or subset of the 1.02 M Zinc clean-leads data set. C-SVM produced comparable dual-inhibitor yields, slightly better false-hit rates for kinase inhibitors, and significantly lower false-hit rates for the Zinc clean-leads data set. Combinatorial SVM showed promising potential for searching selective multitarget agents against intra-kinase-group kinases without explicit knowledge of multitarget agents.
Reverse bifurcation and fractal of the compound logistic map
NASA Astrophysics Data System (ADS)
Wang, Xingyuan; Liang, Qingyong
2008-07-01
The nature of the fixed points of the compound logistic map is studied, and the boundary equation of the first bifurcation of the map in the parameter space is given. Using the quantitative criteria and rules of chaotic systems, the paper reveals the general features of the compound logistic map in its transition from regularity to chaos. The following conclusions are drawn: (1) chaotic patterns of the map may emerge from period-doubling bifurcation and (2) chaotic crisis phenomena and reverse bifurcation are found. At the same time, we analyze the orbit of the critical point of the compound logistic map and put forward a definition of the Mandelbrot-Julia sets of the compound logistic map. We generalize Welstead and Cromer's periodic scanning technique and use it to construct a series of Mandelbrot-Julia sets of the compound logistic map. We investigate the symmetry of the Mandelbrot-Julia sets and study the topological inflexibility of the distribution of periodic regions in the Mandelbrot set, and find that the Mandelbrot set contains abundant information on the structure of the Julia sets by qualitatively constructing the whole portrait of the Julia sets based on the Mandelbrot set.
Alkhouri, Naim; Cikach, Frank; Eng, Katharine; Moses, Jonathan; Patel, Nishaben; Yan, Chen; Hanouneh, Ibrahim; Grove, David; Lopez, Rocio; Dweik, Raed
2014-01-01
Nonalcoholic fatty liver disease (NAFLD) is one of the most common complications of childhood obesity. Our objective was to investigate the association of breath volatile organic compounds with the diagnosis of NAFLD in children. Patients were screened with an ultrasound of the abdomen to evaluate for NAFLD. Exhaled breath was collected and analyzed per protocol using selective ion flow tube mass spectrometry (SIFT-MS). Sixty patients were included in the study (37 with NAFLD and 23 with normal liver). All children were overweight or obese. The mean age was 14.1±2.8 years and 50% were female. A comparison of the SIFT-MS results of patients with NAFLD with those with normal liver on ultrasound revealed differences in concentration of more than 15 compounds. A panel of four volatile organic compounds can identify the presence of NAFLD with good accuracy (area under the receiver operating characteristic curve of 0.913 in the training set and 0.763 in the validation set). Breath isoprene, acetone, trimethylamine, acetaldehyde, and pentane were significantly higher in the NAFLD group compared with normal liver group (14.7 ppb vs. 8.9 for isoprene; 71.7 vs. 36.9 for acetone; 5.0 vs. 3.2 for trimethylamine; 35.1 vs. 26.0 for acetaldehyde; and 13.3 vs. 8.8 for pentane, P<0.05 for all). Exhaled breath analysis is a promising noninvasive method to detect fatty liver in children. Isoprene, acetone, trimethylamine, acetaldehyde, and pentane are novel biomarkers that may help to gain insight into pathophysiological processes leading to the development of NAFLD.
Vijaya Prabhu, Sitrarasu; Singh, Sanjeev Kumar
2018-05-28
An atom-based three-dimensional quantitative structure-activity relationship (3D-QSAR) model was developed on the basis of a 5-point pharmacophore hypothesis (AARRR) with two hydrogen bond acceptors (A) and three aromatic rings (R) for derivatives of thieno[2,3-b]pyridine, which inhibit the mGluR5 receptor. A highly predictive 3D-QSAR model was generated using the alignment of the predicted pharmacophore hypothesis for the training set (R2 = 0.84, SD = 0.26, F = 45.8, N = 29) and test set (Q2 = 0.74, RMSE = 0.235, Pearson-R = 0.94, N = 9). The best pharmacophore hypothesis, AARRR, was selected, and the developed 3D-QSAR model also supported the outcome of this study by means of favorable and unfavorable electron-withdrawing group and hydrophobic regions of the most active compound 42d and the least active compound 18b. Subsequently, induced fit docking and binding free energy calculations revealed a reliable binding orientation of the compounds. Finally, molecular dynamics simulations for 100 ns were performed to depict the protein-ligand stability. We anticipate that the results could support the discovery of potent negative allosteric modulators for metabotropic glutamate receptor 5 (mGluR5).
NASA Astrophysics Data System (ADS)
Sepehri, Bakhtyar; Ghavami, Raouf
2017-02-01
In this research, molecular docking and CoMFA were used to determine the interactions of α,β-unsaturated carbonyl-based compounds and oxime analogs with P-glycoprotein and to predict their activity. The molecular docking study showed that these molecules establish strong van der Waals interactions with the side chains of PHE-332, PHE-728 and PHE-974. Based on the effect of the number of components on the squared correlation coefficient for cross-validation tests (including leave-one-out and leave-many-out), CoMFA models with five components were built to predict the pIC50 of molecules in seven cancer cell lines (including Panc-1 (pancreas cancer cell line), PaCa-2 (pancreatic carcinoma cell line), MCF-7 (breast cancer cell line), A-549 (epithelial), HT-29 (colon cancer cell line), H-460 (lung cancer cell line), PC-3 (prostate cancer cell line)). R2 values for training and test sets were in the range of 0.94-0.97 and 0.84-0.92, respectively, and for the LOO and LMO cross-validation tests, q2 values were in the range of 0.75-0.82 and 0.65-0.73, respectively. Based on the molecular docking results and the steric and electrostatic contour maps extracted from the CoMFA models, four new molecules with higher activity than the most active compound in the data set were designed.
NASA Astrophysics Data System (ADS)
Ratu Ayu, Humairoh; Suryono, Suryono; Endro Suseno, Jatmiko; Kurniawati, Ratna
2018-05-01
The Adaptive Neural Fuzzy Inference System (ANFIS) model was used to predict and optimize the content of flavonoid compounds in guava leaves (Psidium guajava L.). The extraction process was carried out using ultrasound-assisted extraction (UAE) with the following variable parameters: temperature (25°C to 35°C), ultrasonic frequency (30-40 kHz) and extraction time (20-40 minutes). The ANFIS learning procedure began by providing the input variable data set (temperature, frequency and time) and the output (flavonoid content) from the completed experiments. Subtractive clustering was used to build the FIS (fuzzy inference system) structures by varying the range of influence parameter to generate the ANFIS system. The ANFIS training runs were aimed at minimizing the error value. The results showed that the best ANFIS model used subtractive clustering with a range of influence of 0.1, giving 0.70 × 10⁻⁴ for training RMSE, 8.11 for testing RMSE, 2.7% MAPE, and 7.72 MAE. The optimum condition was obtained at a temperature of 35°C and a frequency of 40 kHz, for 30 minutes. This result indicates that the ANFIS model can be used to predict the content of flavonoid compounds in guava leaves.
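The three error measures quoted in this abstract (RMSE, MAPE and MAE) follow their standard definitions, sketched below for reference. The function names and the toy numbers are illustrative only and are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (undefined if any true value is 0)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values, not measurements from the paper.
measured = [2.0, 4.0]
predicted = [1.0, 5.0]
toy_rmse = rmse(measured, predicted)
toy_mae = mae(measured, predicted)
toy_mape = mape(measured, predicted)
```

Because RMSE squares the residuals, a large gap between training and testing RMSE, as reported here, is usually worth checking against the MAE before drawing conclusions about generalization.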
Vilanova, Mar; Genisheva, Zlatina; Tubio, Miguel; Álvarez, Katia; Lissarrague, Jose Ramón; Oliveira, José Maria
2017-09-08
Viticultural practices influence both grape and wine quality. The influence of training systems on volatile composition was investigated for Albariño wine from Rías Baixas AOC in Northwest Spain. The odoriferous contribution of the compounds to the wine aroma was also studied. Volatile compounds belonging to ten groups (alcohols, C₆-compounds, ethyl esters, acetates, terpenols, C₁₃-norisoprenoids, volatile phenols, volatile fatty acids, lactones and carbonyl compounds) were determined in Albariño wines from different training systems, Vertical Shoot-Positioned (VSP), Scott-Henry (SH), Geneva Double-Curtain (GDC), Arch-Cane (AC), and Parral (P) during 2010 and 2011 vintages. Wines from GDC showed the highest total volatile composition with the highest concentrations of alcohols, ethyl esters, fatty acids, and lactones families. However, the highest levels of terpenes and C₁₃-norisoprenoids were quantified in the SH system. A fruitier aroma was observed in Albariño wines from GDC when odor activity values were calculated.
3D-QSAR modeling and molecular docking studies on a series of 2,5 disubstituted 1,3,4-oxadiazoles
NASA Astrophysics Data System (ADS)
Ghaleb, Adib; Aouidate, Adnane; Ghamali, Mounir; Sbai, Abdelouahid; Bouachrine, Mohammed; Lakhlifi, Tahar
2017-10-01
3D-QSAR studies using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were performed on novel 2,5 disubstituted 1,3,4-oxadiazole analogues as anti-fungal agents. The CoMFA and CoMSIA models, using 13 compounds in the training set, give Q2 values of 0.52 and 0.51, respectively, and R2 values of 0.92. The adapted alignment method with suitable parameters resulted in reliable models. The contour maps produced by the CoMFA and CoMSIA models were employed to determine a three-dimensional quantitative structure-activity relationship. Based on this study, a set of new molecules with high predicted activities was designed. Surflex-docking confirmed the stability of the predicted molecules in the receptor.
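The gap between R2 (fit on the training set) and Q2 (leave-one-out cross-validation) reported in this abstract can be made concrete with a small sketch. A one-descriptor least-squares line stands in for the PLS model that CoMFA/CoMSIA actually use, and the Q2 formula below (1 - PRESS/TSS, with TSS taken about the mean of all observations) is one common convention, assumed here rather than taken from the paper. All names and numbers are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sum_of_squares(ys):
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r_squared(xs, ys):
    """Coefficient of determination of the model fitted on all compounds."""
    slope, intercept = fit_line(xs, ys)
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - rss / total_sum_of_squares(ys)


def q_squared_loo(xs, ys):
    """Leave-one-out Q2: each compound is predicted by a model that never saw it."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Toy descriptor/activity values, not data from the study.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [2.1, 3.9, 6.2, 7.8, 10.1]
toy_r2 = r_squared(toy_x, toy_y)
toy_q2 = q_squared_loo(toy_x, toy_y)
```

For a least-squares model the leave-one-out residuals are never smaller in magnitude than the fitted residuals, so Q2 does not exceed R2; a large gap such as 0.92 versus 0.52 with only 13 training compounds is a signal to treat the model's predictive power with some caution.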
Khashan, Raed; Zheng, Weifan; Tropsha, Alexander
2014-03-01
We present a novel approach to generating fragment-based molecular descriptors. The molecules are represented by labeled undirected chemical graphs. Fast Frequent Subgraph Mining (FFSM) is used to find chemical fragments (subgraphs) that occur in at least a subset of all molecules in a dataset. The collection of frequent subgraphs (FSG) forms a set of dataset-specific descriptors whose values for each molecule are defined by the number of times each frequent fragment occurs in this molecule. We have employed the FSG descriptors to develop variable selection k Nearest Neighbor (kNN) QSAR models of several datasets with binary target property including Maximum Recommended Therapeutic Dose (MRTD), Salmonella Mutagenicity (Ames Genotoxicity), and P-Glycoprotein (PGP) data. Each dataset was divided into training, test, and validation sets to establish the statistical figures of merit reflecting the models' validated predictive power. The classification accuracies of models for both training and test sets for all datasets exceeded 75 %, and the accuracy for the external validation sets exceeded 72 %. The model accuracies were comparable to or better than those reported earlier in the literature for the same datasets. Furthermore, the use of fragment-based descriptors affords mechanistic interpretation of validated QSAR models in terms of essential chemical fragments responsible for the compounds' target property. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
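The descriptor construction described in this abstract (keep fragments that occur in at least a minimum number of molecules, then count occurrences per molecule) can be sketched as follows. The subgraph mining step itself (FFSM) is not reproduced: each molecule is assumed to arrive as a pre-enumerated list of fragment labels, and the function name, the `min_support` parameter and the toy fragments are hypothetical.

```python
from collections import Counter


def fsg_descriptors(mol_fragments, min_support):
    """Build fragment-count descriptors from per-molecule fragment lists.

    mol_fragments: list of lists; each inner list holds the fragment labels
                   found in one molecule (repeats mean repeated occurrences)
    min_support:   minimum number of molecules a fragment must occur in
    Returns (frequent_fragments, descriptor_matrix).
    """
    support = Counter()
    for fragments in mol_fragments:
        support.update(set(fragments))  # count each molecule at most once
    frequent = sorted(f for f, s in support.items() if s >= min_support)
    matrix = []
    for fragments in mol_fragments:
        counts = Counter(fragments)
        matrix.append([counts[f] for f in frequent])
    return frequent, matrix


# Toy fragment lists written as SMILES-like labels (illustrative only).
toy_molecules = [
    ["C=O", "C=O", "c1ccccc1", "Cl"],
    ["C=O", "N"],
    ["c1ccccc1", "N", "N"],
]
toy_fragments, toy_matrix = fsg_descriptors(toy_molecules, min_support=2)
```

Here "Cl" occurs in only one molecule and is dropped, so each molecule ends up with a three-element count vector that a kNN classifier could consume.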
The Effects of a Duathlon Simulation on Ventilatory Threshold and Running Economy
Berry, Nathaniel T.; Wideman, Laurie; Shields, Edgar W.; Battaglini, Claudio L.
2016-01-01
Multisport events continue to grow in popularity among recreational, amateur, and professional athletes around the world. This study aimed to determine the compounding effects of the initial run and cycling legs of an International Triathlon Union (ITU) Duathlon simulation on maximal oxygen uptake (VO2max), ventilatory threshold (VT) and running economy (RE) within a thermoneutral, laboratory controlled setting. Seven highly trained multisport athletes completed three trials: Trial-1 consisted of a speed only VO2max treadmill protocol (SOVO2max) to determine VO2max, VT, and RE during a single-bout run; Trial-2 consisted of a 10 km run at 98% of VT followed by an incremental VO2max test on the cycle ergometer; Trial-3 consisted of a 10 km run and 30 km cycling bout at 98% of VT followed by a speed only treadmill test to determine the compounding effects of the initial legs of a duathlon on VO2max, VT, and RE. A repeated measures ANOVA was performed to determine differences between variables across trials. No difference in VO2max, VT (%VO2max), maximal HR, or maximal RPE was observed across trials. Oxygen consumption at VT was significantly lower during Trial-3 compared to Trial-1 (p = 0.01). This decrease was coupled with a significant reduction in running speed at VT (p = 0.015). A significant interaction between trial and running speed indicates that RE was significantly altered during Trial-3 compared to Trial-1 (p < 0.001). The first two legs of a laboratory based duathlon simulation negatively impact VT and RE. Our findings may provide a useful method to evaluate multisport athletes since a single-bout incremental treadmill test fails to reveal important alterations in physiological thresholds. Key points: Relative oxygen uptake at VT (ml·kg-1·min-1) decreased during the final leg of a duathlon simulation, compared to a single-bout maximal run. 
We observed a decrease in running speed at VT during the final leg of a duathlon simulation, resulting in an increase of more than 2 minutes to complete a 5 km run. During our study, highly trained athletes were unable to complete the final 5 km run at the same intensity at which they completed the initial 10 km run (in a laboratory setting). A better understanding, and determination, of training loads during multisport training may help to better periodize training programs; additional research is required. PMID:27274661
Fourier spatial frequency analysis for image classification: training the training set
NASA Astrophysics Data System (ADS)
Johnson, Timothy H.; Lhamo, Yigah; Shi, Lingyan; Alfano, Robert R.; Russell, Stewart
2016-04-01
The Directional Fourier Spatial Frequencies (DFSF) of a 2D image can identify similarity in spatial patterns within groups of related images. A Support Vector Machine (SVM) can then be used to classify images if the inter-image variance of the FSF in the training set is bounded. However, if variation in FSF increases with training set size, accuracy may decrease as the size of the training set increases. This calls for a method to identify a set of training images from among the originals that can form a vector basis for the entire class. Applying the Cauchy product method we extract the DFSF spectrum from radiographs of osteoporotic bone, and use it as a matched filter set to eliminate noise and image specific frequencies, and demonstrate that selection of a subset of superclassifiers from within a set of training images improves SVM accuracy. Central to this challenge is that the size of the search space can become computationally prohibitive for all but the smallest training sets. We are investigating methods to reduce the search space to identify an optimal subset of basis training images.
Hansen, Ane H; Nyberg, Michael; Bangsbo, Jens; Saltin, Bengt; Hellsten, Ylva
2011-11-01
The effects of physical training on the formation of vasodilating and vasoconstricting compounds, as well as on related proteins important for vascular function, were examined in skeletal muscle of individuals with essential hypertension (n=10). Muscle microdialysis samples were obtained from subjects with hypertension before and after 16 weeks of physical training. Muscle dialysates were analyzed for thromboxane A(2), prostacyclin, nucleotides, and nitrite/nitrate. Protein levels of thromboxane synthase, prostacyclin synthase, cyclooxygenase 1 and 2, endothelial nitric oxide synthase (eNOS), cystathionine-γ-lyase, cytochrome P450 4A and 2C9, and the purinergic receptors P2X1 and P2Y2 were determined in skeletal muscle. The protein levels were compared with those of normotensive control subjects (n=12). Resting muscle dialysate thromboxane A(2) and prostacyclin concentrations were lower (P<0.05) after training compared with before training. Before training, dialysate thromboxane A(2) decreased with acute exercise, whereas after training, no changes were found. Before training, dialysate prostacyclin levels did not increase with acute exercise, whereas after training there was an 82% (P<0.05) increase from rest to exercise. The exercise-induced increase in ATP and ADP was markedly reduced after training (P<0.05). The amount of eNOS protein in the hypertensive subjects was 40% lower (P<0.05) than in the normotensive control subjects, whereas cystathionine-γ-lyase levels were 25% higher (P<0.05), potentially compensating for the lower eNOS level. We conclude that exercise training alters the balance between vasodilating and vasoconstricting compounds as evidenced by a decrease in the level of thromboxane, reduction in the exercise-induced increase in ATP and a greater exercise-induced increase in prostacyclin.
Using beta binomials to estimate classification uncertainty for ensemble models.
Clark, Robert D; Liang, Wenkel; Lee, Adam C; Lawless, Michael S; Fraczkiewicz, Robert; Waldman, Marvin
2014-01-01
Quantitative structure-activity (QSAR) models have enormous potential for reducing drug discovery and development costs as well as the need for animal testing. Great strides have been made in estimating their overall reliability, but to fully realize that potential, researchers and regulators need to know how confident they can be in individual predictions. Submodels in an ensemble model which have been trained on different subsets of a shared training pool represent multiple samples of the model space, and the degree of agreement among them contains information on the reliability of ensemble predictions. For artificial neural network ensembles (ANNEs) using two different methods for determining ensemble classification - one using vote tallies and the other averaging individual network outputs - we have found that the distribution of predictions across positive vote tallies can be reasonably well-modeled as a beta binomial distribution, as can the distribution of errors. Together, these two distributions can be used to estimate the probability that a given predictive classification will be in error. Large data sets comprised of logP, Ames mutagenicity, and CYP2D6 inhibition data are used to illustrate and validate the method. The distributions of predictions and errors for the training pool accurately predicted the distribution of predictions and errors for large external validation sets, even when the number of positive and negative examples in the training pool were not balanced. Moreover, the likelihood of a given compound being prospectively misclassified as a function of the degree of consensus between networks in the ensemble could in most cases be estimated accurately from the fitted beta binomial distributions for the training pool. 
Confidence in an individual predictive classification by an ensemble model can be accurately assessed by examining the distributions of predictions and errors as a function of the degree of agreement among the constituent submodels. Further, ensemble uncertainty estimation can often be improved by adjusting the voting or classification threshold based on the parameters of the error distribution. Finally, the profiles for models whose predictive uncertainty estimates are not reliable provide clues to that effect without the need for comparison to an external test set.
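To make the idea in this abstract concrete, the sketch below evaluates a beta binomial probability mass function over vote tallies and combines a fitted prediction distribution with a fitted error distribution via Bayes' rule. The combination step is one plausible reading of the abstract, not the authors' published procedure, and the function names, shape parameters and overall error rate are hypothetical.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k, n, a, b):
    """P(K = k) for a beta binomial with n trials and shape parameters a, b."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_comb + log_beta(k + a, n - k + b) - log_beta(a, b))


def error_probability(k, n, pred_params, err_params, overall_error_rate):
    """Estimate P(error | k positive votes out of n networks).

    Bayes' rule: P(error | k) = P(k | error) * P(error) / P(k), where
    P(k) and P(k | error) come from the two fitted beta binomials.
    """
    p_k = beta_binomial_pmf(k, n, *pred_params)
    p_k_given_error = beta_binomial_pmf(k, n, *err_params)
    return min(1.0, overall_error_rate * p_k_given_error / p_k)


# Hypothetical fitted parameters for a 33-network ensemble.
toy_risk_split_vote = error_probability(17, 33, (0.4, 0.4), (2.0, 2.0), 0.10)
toy_risk_unanimous = error_probability(33, 33, (0.4, 0.4), (2.0, 2.0), 0.10)
```

With these made-up parameters, errors concentrate near split votes while predictions concentrate near unanimous tallies, so the estimated risk is higher for a 17-of-33 vote than for a unanimous one.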
NASA Astrophysics Data System (ADS)
Khazaei, Ardeshir; Sarmasti, Negin; Seyf, Jaber Yousefi
2016-03-01
Quantitative structure-activity relationships (QSAR) were used to study a series of curcumin-related compounds with inhibitory effect on prostate cancer PC-3 cells, pancreas cancer Panc-1 cells, and colon cancer HT-29 cells. The sphere exclusion method was used to split the data set into training and test sets. Multiple linear regression, principal component regression and partial least squares were used as the regression methods. In addition, to investigate the effect of feature selection methods, stepwise selection, a genetic algorithm, and simulated annealing were used. In two cases (PC-3 cells and Panc-1 cells), the best models were generated by a combination of multiple linear regression and stepwise selection (PC-3 cells: r2 = 0.86, q2 = 0.82, pred_r2 = 0.93, and r2m (test) = 0.43; Panc-1 cells: r2 = 0.85, q2 = 0.80, pred_r2 = 0.71, and r2m (test) = 0.68). For the HT-29 cells, principal component regression with stepwise selection (r2 = 0.69, q2 = 0.62, pred_r2 = 0.54, and r2m (test) = 0.41) is the best method. The QSAR study reveals descriptors that play a crucial role in the inhibitory property of curcumin-like compounds. 6ChainCount, T_C_C_1, and T_O_O_7 are the most important descriptors and have the greatest effect. To design and optimize novel, efficient curcumin-related compounds, it is useful to introduce heteroatoms such as nitrogen, oxygen, and sulfur atoms into the chemical structure (reducing the contribution of the T_C_C_1 descriptor) and to increase the contribution of the 6ChainCount and T_O_O_7 descriptors. The models can be useful in the better design of novel curcumin-related compounds that can be used in the treatment of prostate, pancreas, and colon cancers.
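The abstract does not say which variant of sphere exclusion was used, so the following is only a sketch of one commonly described form: a compound is taken as a sphere center and assigned to the training set, every remaining compound within a chosen radius goes to the test set, and the process repeats on what is left. The function name, the radius, the distance function and the one-dimensional toy descriptors are assumptions made for illustration.

```python
def sphere_exclusion_split(points, radius, distance):
    """Split item indices into training and test sets by sphere exclusion.

    points:   list of descriptor vectors (any type accepted by `distance`)
    radius:   sphere radius in descriptor space (assumed, dataset dependent)
    distance: callable (point_a, point_b) -> non-negative float
    """
    remaining = list(range(len(points)))
    train, test = [], []
    while remaining:
        center = remaining.pop(0)  # next unassigned compound becomes a center
        train.append(center)
        outside = []
        for idx in remaining:
            if distance(points[center], points[idx]) <= radius:
                test.append(idx)  # inside the sphere: represented by its center
            else:
                outside.append(idx)
        remaining = outside
    return train, test


# One-dimensional toy descriptors (illustrative only).
toy_points = [0.0, 0.1, 1.0, 1.05, 2.0]
toy_train, toy_test = sphere_exclusion_split(
    toy_points, radius=0.2, distance=lambda a, b: abs(a - b)
)
```

Under this scheme every test compound lies within one radius of some training compound, which is the property that makes the split useful for checking predictions inside the model's descriptor-space coverage.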
Noorizadeh, Hadi; Farmany, Abbas; Narimani, Hojat; Noorizadeh, Mehrab
2013-05-01
A quantitative structure-retention relationship (QSRR) study based on an artificial neural network (ANN) was carried out for the prediction of the ultra-performance liquid chromatography-Time-of-Flight mass spectrometry (UPLC-TOF-MS) retention time (RT) of a set of 52 pharmaceuticals and drugs of abuse in hair. The genetic algorithm was used as a variable selection tool. A partial least squares (PLS) method was used to select the best descriptors which were used as input neurons in neural network model. For choosing the best predictive model from among comparable models, square correlation coefficient R(2) for the whole set calculated based on leave-group-out predicted values of the training set and model-derived predicted values for the test set compounds is suggested to be a good criterion. Finally, to improve the results, structure-retention relationships were followed by a non-linear approach using artificial neural networks and consequently better results were obtained. This also demonstrates the advantages of ANN. Copyright © 2011 John Wiley & Sons, Ltd.
Feng, Yanli; Mu, Cuicui; Zhai, Jinqing; Li, Jian; Zou, Ting
2010-11-15
Carbonyl compounds, including their concentrations, potential sources, diurnal variations and personal exposure, were investigated in six subway stations and in subway trains in Shanghai in June 2008. The carbonyls were collected onto solid sorbent (Tenax TA) coated with pentafluorophenyl hydrazine (PFPH), followed by solvent extraction and gas chromatography (GC)/mass spectrometry (MS) analysis of the PFPH derivatives. The total carbonyl concentrations in subway trains were about 1.4-2.5 times lower than those in subway stations. A significant correlation (R>0.5, p<0.01) between the concentrations of the low molecular-weight carbonyl compounds (
ERIC Educational Resources Information Center
Haverland, Edgar M.
The report describes a project designed to facilitate the transfer and utilization of training technology by developing a model for evaluating training approaches or innovations in relation to the requirements, resources, and constraints of specific training settings. The model consists of two parallel sets of open-ended questions--one set…
Training set extension for SVM ensemble in P300-speller with familiar face paradigm.
Li, Qi; Shi, Kaiyang; Gao, Ning; Li, Jian; Bai, Ou
2018-03-27
P300-spellers are brain-computer interface (BCI)-based character input systems. Support vector machine (SVM) ensembles are trained with large-scale training sets and used as classifiers in these systems. However, the required large-scale training data necessitate a prolonged collection time for each subject, which results in data collected toward the end of the period being contaminated by the subject's fatigue. This study aimed to develop a method for acquiring more training data based on a collected small training set. A new method was developed in which two corresponding training datasets in two sequences are superposed and averaged to extend the training set. The proposed method was tested offline on a P300-speller with the familiar face paradigm. The SVM ensemble with extended training set achieved 85% classification accuracy for the averaged results of four sequences, and 100% for 11 sequences in the P300-speller. In contrast, the conventional SVM ensemble with non-extended training set achieved only 65% accuracy for four sequences, and 92% for 11 sequences. The SVM ensemble with extended training set achieves higher classification accuracies than the conventional SVM ensemble, which verifies that the proposed method effectively improves the classification performance of BCI P300-spellers, thus enhancing their practicality.
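The extension step described in this abstract (superposing and averaging corresponding data from two sequences to obtain additional training samples) might look roughly like the sketch below. How the paper pairs sequences is not stated in the abstract; averaging every pair of sequences is an assumption, as are the function name and the toy feature vectors.

```python
from itertools import combinations


def extend_training_set(epochs):
    """Append the element-wise average of every pair of epochs.

    epochs: list of equal-length feature vectors recorded for the same
            stimulus (same class label) in different stimulation sequences
    Returns the original epochs followed by the averaged ones.
    """
    extended = [list(epoch) for epoch in epochs]
    for first, second in combinations(epochs, 2):
        extended.append([(a + b) / 2.0 for a, b in zip(first, second)])
    return extended


# Three toy epochs with two features each (illustrative only).
toy_epochs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
toy_extended = extend_training_set(toy_epochs)
```

Three recorded epochs become six training samples here; averaging also tends to raise the signal-to-noise ratio of the added samples, which is the usual motivation for averaging event-related potentials.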
Kiwifruit Flower Odor Perception and Recognition by Honey Bees, Apis mellifera.
Twidle, Andrew M; Mas, Flore; Harper, Aimee R; Horner, Rachael M; Welsh, Taylor J; Suckling, David M
2015-06-17
Volatile organic compounds (VOCs) from male and female kiwifruit (Actinidia deliciosa 'Hayward') flowers were collected by dynamic headspace sampling. Honey bee (Apis mellifera) perception of the flower VOCs was tested using gas chromatography coupled to electroantennogram detection. Honey bees consistently responded to six compounds present in the headspace of female kiwifruit flowers and five compounds in the headspace of male flowers. Analysis of the floral volatiles by gas chromatography-mass spectrometry and microscale chemical derivatization showed the compounds to be nonanal, 2-phenylethanol, 4-oxoisophorone, (3E,6E)-α-farnesene, (6Z,9Z)-heptadecadiene, and (8Z)-heptadecene. Bees were then trained via olfactory conditioning of the proboscis extension response (PER) to synthetic mixtures of these compounds using the ratios present in each flower type. Honey bees trained to the synthetic mixtures showed a high response to the natural floral extracts, indicating that these may be the key compounds for honey bee perception of kiwifruit flower odor.
Cross-Platform Toxicogenomics for the Prediction of Non-Genotoxic Hepatocarcinogenesis in Rat
Metzger, Ute; Templin, Markus F.; Plummer, Simon; Ellinger-Ziegelbauer, Heidrun; Zell, Andreas
2014-01-01
In the area of omics profiling in toxicology, i.e. toxicogenomics, characteristic molecular profiles have previously been incorporated into prediction models for early assessment of a carcinogenic potential and mechanism-based classification of compounds. Traditionally, the biomarker signatures used for model construction were derived from individual high-throughput techniques, such as microarrays designed for monitoring global mRNA expression. In this study, we built predictive models by integrating omics data across complementary microarray platforms and introduced new concepts for modeling of pathway alterations and molecular interactions between multiple biological layers. We trained and evaluated diverse machine learning-based models, differing in the incorporated features and learning algorithms on a cross-omics dataset encompassing mRNA, miRNA, and protein expression profiles obtained from rat liver samples treated with a heterogeneous set of substances. Most of these compounds could be unambiguously classified as genotoxic carcinogens, non-genotoxic carcinogens, or non-hepatocarcinogens based on evidence from published studies. Since mixed characteristics were reported for the compounds Cyproterone acetate, Thioacetamide, and Wy-14643, we reclassified these compounds as either genotoxic or non-genotoxic carcinogens based on their molecular profiles. Evaluating our toxicogenomics models in a repeated external cross-validation procedure, we demonstrated that the prediction accuracy of our models could be increased by joining the biomarker signatures across multiple biological layers and by adding complex features derived from cross-platform integration of the omics data. Furthermore, we found that adding these features resulted in a better separation of the compound classes and a more confident reclassification of the three undefined compounds as non-genotoxic carcinogens. PMID:24830643
Randall, Philip; Johnson, Quentin; Verster, Anna
2012-12-01
Wheat and maize flour fortification is a preventive food-based approach to improve the micronutrient status of populations. In 2009, the World Health Organization (WHO) released recommendations for such fortification, with guidelines on the addition levels for iron, folic acid, vitamin B12, vitamin A, and zinc at various levels of average daily consumption. Iron is the micronutrient of greatest concern to the food industry, as some believe there may be adverse interactions in some or all of the finished products made from wheat flour and maize meal. The objective was to determine whether the choice of iron compound caused any adverse interactions and, if differences were noted, to quantify them. Wheat flour and maize meal were sourced in Kenya, South Africa, and Tanzania, and the iron compound (sodium iron ethylenediaminetetraacetate [NaFeEDTA], ferrous fumarate, or ferrous sulfate) was varied and dosed at rates according to the WHO guidelines for consumption of 75 to 149 g/day of wheat flour and > 300 g/day of maize meal and tested again for 150 to 300 g/day for both. Bread, chapatti, ugali (thick porridge), and uji (thin porridge) were prepared locally and assessed on whether the products were acceptable under industry-approved criteria and whether industry could discern any differences, knowing that differences existed, by academic sensory analysis using a combination of trained and untrained panelists and in direct side-by-side comparison. Industry (the wheat and maize milling sector) scored the samples as well above the minimal standard, and under academic scrutiny no differences were reported. Side-by-side comparison by the milling industry did indicate some slight differences, mainly with respect to color, although these differences did not correlate with any particular iron compound. 
The levels of iron compounds used, in accordance with the WHO guidelines, do not lead to changes in the baking and cooking properties of the wheat flour and maize meal. Respondents trained to measure against a set benchmark and/or discern differences could not consistently replicate perceived difference observations.
Murumkar, Prashant R; Giridhar, Rajani; Yadav, Mange Ram
2008-04-01
A set of 29 benzothiadiazepine hydroxamates having selective tumor necrosis factor-alpha converting enzyme inhibitory activity were used to compare the quality and predictive power of 3D-quantitative structure-activity relationship, comparative molecular field analysis, and comparative molecular similarity indices models for the atom-based, centroid/atom-based, data-based, and docked conformer-based alignment. Removal of two outliers from the initial training set of molecules improved the predictivity of models. Among the 3D-quantitative structure-activity relationship models developed using the above four alignments, the database alignment provided the optimal predictive comparative molecular field analysis model for the training set with cross-validated r(2) (q(2)) = 0.510, non-cross-validated r(2) = 0.972, standard error of estimates (s) = 0.098, and F = 215.44 and the optimal comparative molecular similarity indices model with cross-validated r(2) (q(2)) = 0.556, non-cross-validated r(2) = 0.946, standard error of estimates (s) = 0.163, and F = 99.785. These models also showed the best test set prediction for six compounds with predictive r(2) values of 0.460 and 0.535, respectively. The contour maps obtained from 3D-quantitative structure-activity relationship studies were appraised for activity trends for the molecules analyzed. The comparative molecular similarity indices models exhibited good external predictivity as compared with that of comparative molecular field analysis models. The data generated from the present study helped us to further design and report some novel and potent tumor necrosis factor-alpha converting enzyme inhibitors.
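The gap between the cross-validated r(2) (q(2)) and the non-cross-validated r(2) reported above can be illustrated with a small sketch. It assumes a one-descriptor ordinary least-squares model and leave-one-out cross-validation; the abstract does not give the cross-validation scheme or the regression details, so the function names and the toy data are hypothetical and this is not the CoMFA/CoMSIA procedure itself.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Non-cross-validated r^2: fit and evaluate on the same compounds."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out q^2: each compound is predicted by a model fit without it."""
    mean_y = sum(ys) / len(ys)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.2, 1.9, 3.4, 3.8, 5.3, 5.7]
r2 = r_squared(descriptor, activity)
q2 = q_squared_loo(descriptor, activity)
```

For least-squares models, q^2 computed this way cannot exceed r^2, which is why a large gap between the two (as in the values quoted above) is usually read as a sign that the fit is tighter than the model's predictive ability.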
Barcellona, Massimo G; Morrissey, Matthew C
2016-04-01
The commonly used open kinetic chain knee extensor (OKCKE) exercise loads the sagittal restraints to knee anterior tibial translation. This study investigated the effect of different loads of OKCKE resistance training on anterior knee laxity (AKL) in the uninjured knee. The design was a non-clinical trial. Randomization into one of three supervised training groups occurred with training 3 times per week for 12 weeks. Subjects in the LOW and HIGH groups performed OKCKE resistance training at loads of 2 sets of 20 repetition maximum (RM) and 20 sets of 2RM, respectively. Subjects in the isokinetic training group (ISOK) performed isokinetic OKCKE resistance training using 2 sets of 20 maximal efforts. AKL was measured using the KT2000 arthrometer with concurrent measurement of lateral hamstrings muscle activity at baseline, 6 weeks and 12 weeks. Twenty-six subjects participated (LOW n = 9, HIGH n = 10, ISOK n = 7). The main finding from this study is that a 12-week OKCKE resistance training programme at loads of 20 sets of 2RM leads to an increase in manual maximal AKL. OKCKE resistance training at high loads (20 sets of 2RM) increases AKL, while low-load OKCKE resistance training (2 sets of 20RM) and isokinetic OKCKE resistance training at 2 sets of 20RM do not. Copyright © 2015 Elsevier Ltd. All rights reserved.
Ngo, Trieu-Du; Tran, Thanh-Dao; Le, Minh-Tri; Thai, Khac-Minh
2016-11-01
The human P-glycoprotein (P-gp) efflux pump is of great interest to medicinal chemists because of its important role in multidrug resistance (MDR). Because of the high polyspecificity of this transmembrane protein and the unavailability of high-resolution X-ray crystal structures, ligand-based and structure-based approaches (machine learning, homology modeling, and molecular docking) were combined in this study. In the ligand-based approach, individual two-dimensional quantitative structure-activity relationship models were developed using different machine learning algorithms and subsequently combined into an ensemble model that showed good performance on both the diverse training set and the validation sets. The applicability domain and the prediction quality of the developed models were also assessed using state-of-the-art methods and tools. In the structure-based approach, the P-gp structure and its binding region were predicted for a docking study to determine possible interactions between the ligands and the receptor. Based on these in silico tools, hit compounds for reversing MDR were discovered from the in-house and DrugBank databases through virtual screening using prediction models and molecular docking in an attempt to restore cancer cell sensitivity to cytotoxic drugs.
In silico prediction of drug-induced myelotoxicity by using Naïve Bayes method.
Zhang, Hui; Yu, Peng; Zhang, Teng-Guo; Kang, Yan-Li; Zhao, Xiao; Li, Yuan-Yuan; He, Jia-Hui; Zhang, Ji
2015-11-01
Drug-induced myelotoxicity usually decreases the production of platelets, red cells, and white cells. Thus, early identification and characterization of myelotoxicity hazard in drug development is essential. The purpose of this investigation was to develop a prediction model of drug-induced myelotoxicity by using a Naïve Bayes classifier. For comparison, other prediction models based on support vector machine and single-hidden-layer feed-forward neural network methods were also established. Among all the prediction models, the Naïve Bayes classification model showed the best prediction performance, with an average overall prediction accuracy of [Formula: see text] for the training set and [Formula: see text] for the external test set. The significant contribution of this study is that we developed the first Naïve Bayes classification model of the drug-induced myelotoxicity adverse effect using a larger-scale dataset, which could be employed for the prediction of drug-induced myelotoxicity. In addition, several important molecular descriptors and substructures of myelotoxic compounds have been identified, which should be taken into consideration in the design of new candidate compounds to produce safer and more effective drugs, ultimately reducing the attrition rate in later stages of drug development.
Brainstorming: weighted voting prediction of inhibitors for protein targets.
Plewczynski, Dariusz
2011-09-01
The "Brainstorming" approach presented in this paper is a weighted voting method that can improve the quality of predictions generated by several machine learning (ML) methods. First, an ensemble of heterogeneous ML algorithms is trained on available experimental data, then all solutions are gathered and a consensus is built between them. The final prediction is performed using a voting procedure, whereby the vote of each method is weighted according to a quality coefficient calculated using multivariable linear regression (MLR). The MLR optimization procedure is very fast, therefore no additional computational cost is introduced by using this jury approach. Here, brainstorming is applied to selecting actives from large collections of compounds relating to five diverse biological targets of medicinal interest, namely HIV-reverse transcriptase, cyclooxygenase-2, dihydrofolate reductase, estrogen receptor, and thrombin. The MDL Drug Data Report (MDDR) database was used for selecting known inhibitors for these protein targets, and experimental data was then used to train a set of machine learning methods. The benchmark dataset (available at http://bio.icm.edu.pl/∼darman/chemoinfo/benchmark.tar.gz ) can be used for further testing of various clustering and machine learning methods when predicting the biological activity of compounds. Depending on the protein target, the overall recall value is raised by at least 20% in comparison to any single machine learning method (including ensemble methods like random forest) and unweighted simple majority voting procedures.
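The voting step described above can be sketched in a few lines. This is a minimal illustration, assuming binary active/inactive votes and fixed, hand-picked weights; in the paper the weights are quality coefficients obtained by multivariable linear regression, which is not reproduced here, and the function name, weights, and threshold are hypothetical.

```python
def weighted_vote(votes, weights, threshold=0.5):
    """Combine binary votes (1 = active, 0 = inactive) into one prediction.

    votes   -- one vote per machine learning method
    weights -- one non-negative weight per method (hypothetical values here;
               the paper derives them by multivariable linear regression)
    """
    if len(votes) != len(weights):
        raise ValueError("need exactly one weight per vote")
    score = sum(w * v for w, v in zip(weights, votes)) / sum(weights)
    return 1 if score >= threshold else 0


# Three methods disagree about one compound: only the first calls it active.
votes = [1, 0, 0]

# Unweighted simple majority: every method counts equally.
majority = weighted_vote(votes, [1.0, 1.0, 1.0])

# Weighted vote: the first method is trusted much more than the others.
weighted = weighted_vote(votes, [0.7, 0.2, 0.1])
```

With equal weights the compound is called inactive, while the weighted vote follows the most trusted method and calls it active. That difference is the mechanism by which weighting can raise recall relative to unweighted majority voting.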
A specific pharmacophore model of sodium-dependent glucose co-transporter 2 (SGLT2) inhibitors.
Tang, Chunlei; Zhu, Xiaoyun; Huang, Dandan; Zan, Xin; Yang, Baowei; Li, Ying; Du, Xiaoyong; Qian, Hai; Huang, Wenlong
2012-06-01
Sodium-dependent glucose co-transporter 2 (SGLT2) plays a pivotal role in maintaining glucose equilibrium in the human body, emerging as one of the most promising targets for the treatment of diabetes mellitus type 2. Pharmacophore models of SGLT2 inhibitors have been generated with a training set of 25 SGLT2 inhibitors using Discovery Studio V2.1. The best hypothesis (Hypo1(SGLT2)) contains one hydrogen bond donor, five excluded volumes, one ring aromatic and three hydrophobic features, and has a correlation coefficient of 0.955, cost difference of 68.76, RMSD of 0.85. This model was validated by test set, Fischer randomization test and decoy set methods. The specificity of Hypo1(SGLT2) was evaluated. The pharmacophore features of Hypo1(SGLT2) were different from the best pharmacophore model (Hypo1(SGLT1)) of SGLT1 inhibitors we developed. Moreover, Hypo1(SGLT2) could effectively distinguish selective inhibitors of SGLT2 from those of SGLT1. These results indicate that a highly predictive and specific pharmacophore model of SGLT2 inhibitors has been successfully obtained. Then Hypo1(SGLT2) was used as a 3D query to screen databases including NCI and Maybridge for identifying new inhibitors of SGLT2. The hit compounds were subsequently subjected to filtering by Lipinski's rule of five. And several compounds selected from the top ranked hits have been suggested for further experimental assay studies.
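The Lipinski filtering step mentioned above can be sketched as follows. The thresholds are the standard rule-of-five limits (molecular weight at most 500, logP at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors). The hit names and descriptor values are hypothetical, and allowing one violation is an assumption, since the abstract does not say how strictly the rule was applied.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count how many of the four rule-of-five limits a compound exceeds."""
    return sum([
        mol_weight > 500,
        logp > 5,
        h_donors > 5,
        h_acceptors > 10,
    ])


def filter_hits(hits, max_violations=1):
    """Keep the names of hits with at most `max_violations` violations.

    max_violations=1 is an assumption for this sketch.
    """
    return [
        hit["name"]
        for hit in hits
        if lipinski_violations(
            hit["mol_weight"], hit["logp"], hit["h_donors"], hit["h_acceptors"]
        ) <= max_violations
    ]


# Hypothetical screening hits with precomputed descriptors.
hits = [
    {"name": "hit_A", "mol_weight": 350.4, "logp": 2.1, "h_donors": 2, "h_acceptors": 5},
    {"name": "hit_B", "mol_weight": 720.0, "logp": 6.3, "h_donors": 6, "h_acceptors": 12},
    {"name": "hit_C", "mol_weight": 510.0, "logp": 4.0, "h_donors": 3, "h_acceptors": 8},
]
kept = filter_hits(hits)
```

Here `hit_C` survives despite a molecular weight slightly above 500, because it violates only one limit; passing `max_violations=0` would remove it.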
Carpenter, Kristy A; Huang, Xudong
2018-06-07
Virtual Screening (VS) has emerged as an important tool in the drug development process, as it conducts efficient in silico searches over millions of compounds, ultimately increasing yields of potential drug leads. As a subset of Artificial Intelligence (AI), Machine Learning (ML) is a powerful way of conducting VS for drug leads. ML for VS generally involves assembling a filtered training set of compounds, comprised of known actives and inactives. After training, the model is validated and, if sufficiently accurate, used on previously unseen databases to screen for novel compounds with desired drug target binding activity. The study aims to review ML-based methods used for VS and applications to Alzheimer's disease (AD) drug discovery. To update the current knowledge on ML for VS, we review thorough backgrounds, explanations, and VS applications of the following ML techniques: Naïve Bayes (NB), k-Nearest Neighbors (kNN), Support Vector Machines (SVM), Random Forests (RF), and Artificial Neural Networks (ANN). All techniques have found success in VS, but the future of VS is likely to lean more heavily toward the use of neural networks - and more specifically, Convolutional Neural Networks (CNN), which are a subset of ANN that utilize convolution. We additionally conceptualize a workflow for conducting ML-based VS for potential therapeutics for AD, a complex neurodegenerative disease with no known cure or means of prevention. This serves both as an example of how to apply the concepts introduced earlier in the review and as a potential workflow for future implementation. The different ML techniques are powerful tools for VS, although each has advantages and disadvantages. ML-based VS can be applied to AD drug development. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert
2007-12-01
We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.
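The Mahalanobis-distance idea mentioned above for the Support Vector Machine and Ridge Regression models can be sketched for a two-descriptor case. This is a simplified illustration under stated assumptions: two descriptors only, a sample covariance estimated from the training set, and hypothetical descriptor values. How the authors turn the distance into an error bar is not given in the abstract and is not attempted here.

```python
import math


def mean_and_covariance(points):
    """Mean vector and sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    return (mean_x, mean_y), (sxx, sxy, syy)


def mahalanobis(query, mean, cov):
    """Mahalanobis distance of a query compound from the training data."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy  # determinant of the 2x2 covariance matrix
    dx = query[0] - mean[0]
    dy = query[1] - mean[1]
    # d^2 = v^T C^{-1} v, using the closed-form inverse of a 2x2 matrix.
    d_squared = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d_squared)


# Hypothetical training descriptors (for example, two scaled properties).
training = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
mean, cov = mean_and_covariance(training)

inside = mahalanobis((1.0, 1.0), mean, cov)   # at the centre of the data
outside = mahalanobis((3.0, 1.0), mean, cov)  # beyond the training range
```

A query compound with a large distance lies far from the training data in descriptor space, so its prediction would be treated as less reliable, that is, outside or near the edge of the domain of applicability.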
Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert
2007-09-01
We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.
Quantitative structure-toxicity relationship (QSTR) studies on the organophosphate insecticides.
Can, Alper
2014-11-04
Organophosphate insecticides are the most commonly used pesticides in the world. In this study, quantitative structure-toxicity relationship (QSTR) models were derived for estimating the acute oral toxicity of organophosphate insecticides to male rats. The 20 chemicals of the training set and the seven compounds of the external testing set were described by means of using descriptors. Descriptors for lipophilicity, polarity and molecular geometry, as well as quantum chemical descriptors for energy were calculated. Model development to predict toxicity of organophosphate insecticides in different matrices was carried out using multiple linear regression. The model was validated internally and externally. In the present study, QSTR model was used for the first time to understand the inherent relationships between the organophosphate insecticide molecules and their toxicity behavior. Such studies provide mechanistic insight about structure-toxicity relationship and help in the design of less toxic insecticides. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Woolfrey, John R.; Avery, Mitchell A.; Doweyko, Arthur M.
1998-03-01
Two three-dimensional quantitative structure-activity relationship (3D-QSAR) methods, comparative molecular field analysis (CoMFA) and hypothetical active site lattice (HASL), were compared with respect to the analysis of a training set of 154 artemisinin analogues. Five models were created, including a complete HASL and two trimmed versions, as well as two CoMFA models (leave-one-out standard CoMFA and the guided-region selection protocol). Similar r2 and q2 values were obtained by each method, although some striking differences existed between CoMFA contour maps and the HASL output. Each of the four predictive models exhibited a similar ability to predict the activity of a test set of 23 artemisinin analogues, although some differences were noted as to which compounds were described well by either model.
Improvement of Predictive Ability by Uniform Coverage of the Target Genetic Space
Bustos-Korts, Daniela; Malosetti, Marcos; Chapman, Scott; Biddulph, Ben; van Eeuwijk, Fred
2016-01-01
Genome-enabled prediction provides breeders with the means to increase the number of genotypes that can be evaluated for selection. One of the major challenges in genome-enabled prediction is how to construct a training set of genotypes from a calibration set that represents the target population of genotypes, where the calibration set is composed of a training and validation set. A random sampling protocol of genotypes from the calibration set will lead to low-quality coverage of the total genetic space by the training set when the calibration set contains population structure. As a consequence, predictive ability will be affected negatively, because some parts of the genotypic diversity in the target population will be under-represented in the training set, whereas other parts will be over-represented. Therefore, we propose a training set construction method that uniformly samples the genetic space spanned by the target population of genotypes, thereby increasing predictive ability. To evaluate our method, we constructed training sets and identified the corresponding genomic prediction models for four genotype panels that differed in the amount of population structure they contained (maize Flint, maize Dent, wheat, and rice). Training sets were constructed using uniform sampling, stratified-uniform sampling, stratified sampling and random sampling. We compared these methods with a method that maximizes the generalized coefficient of determination (CD). Several training set sizes were considered. We investigated four genomic prediction models: multi-locus QTL models, GBLUP models, combinations of QTL and GBLUPs, and Reproducing Kernel Hilbert Space (RKHS) models. For the maize and wheat panels, construction of the training set under uniform sampling led to a larger predictive ability than under stratified and random sampling. The results of our methods were similar to those of the CD method. 
For the rice panel, all training set construction methods led to similar predictive ability, a reflection of the very strong population structure in this panel. PMID:27672112
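The contrast between random sampling and sampling that covers the genetic space evenly can be illustrated with a small sketch. Note the substitution: the abstract does not specify the authors' uniform sampling algorithm, so greedy farthest-point selection is used here as a stand-in that also spreads the training set across the space. The 2-D coordinates (which could be thought of as the first two principal components of marker data) and all names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def farthest_point_sample(points, size, start=0):
    """Greedily pick `size` indices so that chosen points are spread out.

    Stand-in for uniform coverage of the genetic space, not the paper's
    own procedure. Each step adds the point farthest from all points
    chosen so far.
    """
    chosen = [start]
    # Distance from every point to its nearest already-chosen point.
    nearest = [squared_distance(p, points[start]) for p in points]
    while len(chosen) < size:
        candidate = max(range(len(points)), key=lambda i: nearest[i])
        chosen.append(candidate)
        nearest = [
            min(d, squared_distance(p, points[candidate]))
            for d, p in zip(nearest, points)
        ]
    return chosen


# Hypothetical genotypes: one dense subpopulation near the origin and two
# genotypes from sparsely represented parts of the space.
genotypes = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
    (10.0, 10.0), (10.0, 0.0),
]
training_idx = farthest_point_sample(genotypes, size=3)
```

Random sampling of three genotypes from this panel would usually draw mostly from the dense cluster, whereas the spread-out selection includes both sparsely represented genotypes, which is the under-representation problem the abstract describes.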
Muscle wasting and sarcopenia in heart failure and beyond: update 2017
Springer, Joshua‐I.; Anker, Stefan D.
2017-01-01
Sarcopenia (loss of muscle mass and muscle function) is a strong predictor of frailty, disability and mortality in older persons and may also occur in obese subjects. The prevalence of sarcopenia is increased in patients suffering from chronic heart failure. However, there are currently few therapy options. The main intervention is resistance exercise, either alone or in combination with nutritional support, which seems to enhance the beneficial effects of training. Also, testosterone has been shown to increase muscle power and function; however, a possible limitation is the side effects of testosterone. Other investigational drugs include selective androgen receptor modulators, growth hormone, IGF‐1, and compounds targeting myostatin signaling, which have their own set of side effects. There are abundant prospective targets for improving muscle function in the elderly with or without chronic heart failure, and the continuing development of new treatment strategies and compounds for sarcopenia and cardiac cachexia makes this field an exciting one. PMID:29154428
Electronic and software systems of an automated portable static mass spectrometer
NASA Astrophysics Data System (ADS)
Chichagov, Yu. V.; Bogdanov, A. A.; Lebedev, D. S.; Kogan, V. T.; Tubol'tsev, Yu. V.; Kozlenok, A. V.; Moroshkin, V. S.; Berezina, A. V.
2017-01-01
The electronic systems of a small high-sensitivity static mass spectrometer and software and hardware tools, which allow one to determine trace concentrations of gases and volatile compounds in air and water samples in real time, have been characterized. These systems and tools have been used to set up the device, control the process of measurement, synchronize this process with accompanying measurements, maintain reliable operation of the device, process the obtained results automatically, and visualize and store them. The developed software and hardware tools allow one to conduct continuous measurements for up to 100 h and provide an opportunity for personnel with no special training to perform maintenance on the device. The test results showed that mobile mass spectrometers for geophysical and medical research, which were fitted with these systems, had a determination limit for target compounds as low as several ppb(m) and a mass resolving power (depending on the current task) as high as 250.
Quantitative analysis of single- vs. multiple-set programs in resistance training.
Wolfe, Brian L; LeMura, Linda M; Cole, Phillip J
2004-02-01
The purpose of this study was to examine the existing research on single-set vs. multiple-set resistance training programs. Using the meta-analytic approach, we included studies that met the following criteria in our analysis: (a) at least 6 subjects per group; (b) subject groups consisting of single-set vs. multiple-set resistance training programs; (c) pretest and posttest strength measures; (d) training programs of 6 weeks or more; (e) apparently "healthy" individuals free from orthopedic limitations; and (f) published studies in English-language journals only. Sixteen studies generated 103 effect sizes (ESs) based on a total of 621 subjects, ranging in age from 15-71 years. Across all designs, intervention strategies, and categories, the pretest to posttest ES in muscular strength was (chi = 1.4 +/- 1.4; 95% confidence interval, 0.41-3.8; p < 0.001). The results of 2 x 2 analysis of variance revealed simple main effects for age, training status (trained vs. untrained), and research design (p < 0.001). No significant main effects were found for sex, program duration, and set end point. Significant interactions were found for training status and program duration (6-16 weeks vs. 17-40 weeks) and number of sets performed (single vs. multiple). The data indicated that trained individuals performing multiple sets generated significantly greater increases in strength (p < 0.001). For programs with an extended duration, multiple sets were superior to single sets (p < 0.05). This quantitative review indicates that single-set programs for an initial short training period in untrained individuals result in similar strength gains as multiple-set programs. However, as progression occurs and higher gains are desired, multiple-set programs are more effective.
10 CFR 35.55 - Training for an authorized nuclear pharmacist.
Code of Federal Regulations, 2010 CFR
2010-01-01
... competency in procurement, compounding, quality assurance, dispensing, distribution, health and safety... hours of classroom and laboratory training in the following areas— (A) Radiation physics and...
Chemical function based pharmacophore generation of endothelin-A selective receptor antagonists.
Funk, Oliver F; Kettmann, Viktor; Drimal, Jan; Langer, Thierry
2004-05-20
Both quantitative and qualitative chemical function based pharmacophore models of endothelin-A (ET(A)) selective receptor antagonists were generated by using the two algorithms HypoGen and HipHop, respectively, which are implemented in the Catalyst molecular modeling software. The input for HypoGen is a training set of 18 ET(A) antagonists exhibiting IC(50) values ranging between 0.19 nM and 67 microM. The best output hypothesis consists of five features: two hydrophobic (HY), one ring aromatic (RA), one hydrogen bond acceptor (HBA), and one negative ionizable (NI) function. The highest scoring HipHop model consists of six features: three hydrophobic (HY), one ring aromatic (RA), one hydrogen bond acceptor (HBA), and one negative ionizable (NI). It is the result of an input of three highly active, selective, and structurally diverse ET(A) antagonists. The predictive power of the quantitative model was confirmed using a test set of 30 compounds, whose activity values spread over 6 orders of magnitude. The two pharmacophores were tested according to their ability to extract known endothelin antagonists from the 3D molecular structure database of Derwent's World Drug Index. The two hypotheses detected most of the selective ET(A) antagonist entries. Furthermore, the pharmacophores were used to screen the Maybridge database. Six compounds were chosen from the output hit lists for in vitro testing of their ability to displace endothelin-1 from its receptor. Two of these are new potential lead compounds because they are structurally novel and exhibit satisfactory activity in the binding assay.
A comparative study of two hazard handling training methods for novice drivers.
Wang, Y B; Zhang, W; Salvendy, G
2010-10-01
The effectiveness of two hazard perception training methods, simulation-based error training (SET) and video-based guided error training (VGET), for novice drivers' hazard handling performance was tested, compared, and analyzed. Thirty-two novice drivers participated in the hazard perception training. Half of the participants were trained using SET by making errors and/or experiencing accidents while driving with a desktop simulator. The other half were trained using VGET by watching prerecorded video clips of errors and accidents that were made by other people. The two groups had exposure to equal numbers of errors for each training scenario. All the participants were tested and evaluated for hazard handling on a full cockpit driving simulator one week after training. Hazard handling performance and hazard response were measured in this transfer test. Both hazard handling performance scores and hazard response distances were significantly better for the SET group than the VGET group. Furthermore, the SET group had more metacognitive activities and intrinsic motivation. SET also seemed more effective in changing participants' confidence, but the result did not reach the significance level. SET exhibited a higher training effectiveness of hazard response and handling than VGET in the simulated transfer test. The superiority of SET might benefit from the higher levels of metacognition and intrinsic motivation during training, which was observed in the experiment. Future research should be conducted to assess whether the advantages of error training are still effective under real road conditions.
Tong, Lidan; Guo, Lixin; Lv, Xiaojun; Li, Yu
2017-01-01
Three-dimensional quantitative structure-activity relationship (3D-QSAR) models were established by comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA). Experimental toxicity data in Poecilia reticulata (pLC50) and physico-chemical properties for 12 polychlorinated phenols were used as dependent and as independent variables, respectively. Among the 12 polychlorinated phenols, nine were randomly selected and used as a training set to construct the 3D-QSAR models through the SYBYL-X software to predict the pLC50 values of the remaining 8 polychlorinated phenols congeners, and the other three polychlorinated phenols were used as a test set to evaluate the 3D-QSAR models (the training set and test set were arranged randomly, shuffled 60 times). Pentachlorophenol (PCP), which is the most toxic among the 20 polychlorinated phenols used in this experiment, was selected as an example for modification using contour maps produced using the established 3D-QSAR models. The aim was to decrease its toxicity and bioconcentration, increase its biodegradation, and maintain or better its effectiveness as a pesticide. The 3D-QSAR models were robust and had good predictive abilities with cross-validation correlation coefficients (q2) of 0.858-0.992 (>0.5), correlation coefficients (r2) of 0.966-1.000 (>0.9), and standard errors of prediction (SEP) of 0.004-0.159. CoMFA showed that the toxicity of the polychlorinated phenols arose mainly from electrostatic (42.7-66.7%) and steric (33.3-7.3%) contributions. By comparison, CoMSIA showed that the toxicity of polychlorinated phenols was dominated by electrostatic (57.5-76.9%) and hydrophobic (19.8-25.7%) contributions, with lesser contributions from the steric (0.7-1.0%), hydrogen bond donor (0.1-20.3%), and hydrogen bond acceptor (0-0.9%) fields. 3D-QSAR electrostatic contour maps were used to modify PCP and design 11 new compounds with lower toxicity. 
The effectiveness of each of these molecules as a pesticide was verified using a 3D-QSAR model for polychlorinated phenol toxicity against Tetrahymena pyriformis. Four of these compounds, with -Br, -I, -OH and -NH2 groups in place of chlorine at the 3-position on PCP, were all at least as effective as PCP against T. pyriformis. The first-order rate constants (Kb) of these four compounds were predicted using a 3D-QSAR model for polychlorinated phenol degradation, which showed they were more biodegradable than PCP. Furthermore, a 3D-QSAR model for polychlorinated phenol bioconcentration in fish (including Poecilia reticulata, Oncorhynchus mykiss, Pimephales promelas and Oryzias latipes) showed that there was no significant difference between the bioconcentration factors of the four new compounds and that of PCP. These results may provide a new route for lowering the POPs characteristics of polychlorinated phenol homologues and derivatives in use. Copyright © 2016 Elsevier Inc. All rights reserved.
Experimental Errors in QSAR Modeling Sets: What We Can Do and What We Cannot Do.
Zhao, Linlin; Wang, Wenyi; Sedykh, Alexander; Zhu, Hao
2017-06-30
Numerous chemical data sets have become available for quantitative structure-activity relationship (QSAR) modeling studies. However, the quality of different data sources may be different based on the nature of experimental protocols. Therefore, potential experimental errors in the modeling sets may lead to the development of poor QSAR models and further affect the predictions of new compounds. In this study, we explored the relationship between the ratio of questionable data in the modeling sets, which was obtained by simulating experimental errors, and the QSAR modeling performance. To this end, we used eight data sets (four continuous endpoints and four categorical endpoints) that have been extensively curated both in-house and by our collaborators to create over 1800 various QSAR models. Each data set was duplicated to create several new modeling sets with different ratios of simulated experimental errors (i.e., randomizing the activities of part of the compounds) in the modeling process. A fivefold cross-validation process was used to evaluate the modeling performance, which deteriorates when the ratio of experimental errors increases. All of the resulting models were also used to predict external sets of new compounds, which were excluded at the beginning of the modeling process. The modeling results showed that the compounds with relatively large prediction errors in cross-validation processes are likely to be those with simulated experimental errors. However, after removing a certain number of compounds with large prediction errors in the cross-validation process, the external predictions of new compounds did not show improvement. Our conclusion is that the QSAR predictions, especially consensus predictions, can identify compounds with potential experimental errors. But removing those compounds by the cross-validation procedure is not a reasonable means to improve model predictivity due to overfitting.
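The error-simulation step described in this abstract (randomizing the activities of part of the compounds) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `simulate_errors`, the `ratio` and `seed` parameters, and the choice to shuffle the selected activities among themselves are all assumptions.

```python
import random


def simulate_errors(activities, ratio, seed=0):
    """Return (noisy_activities, affected_indices).

    Hypothetical helper: a `ratio` fraction of compounds is picked at random
    and their activity values are shuffled among themselves, which mimics
    questionable measurements while keeping the overall value distribution.
    """
    rng = random.Random(seed)
    n_noisy = round(ratio * len(activities))
    idx = rng.sample(range(len(activities)), n_noisy)
    values = [activities[i] for i in idx]
    rng.shuffle(values)
    noisy = list(activities)
    for i, v in zip(idx, values):
        noisy[i] = v
    return noisy, sorted(idx)
```

Calling it with several ratios (for example 0.05, 0.10, 0.20) on the same curated set would give the kind of graded series of modeling sets the abstract mentions.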
Experimental Errors in QSAR Modeling Sets: What We Can Do and What We Cannot Do
2017-01-01
Numerous chemical data sets have become available for quantitative structure–activity relationship (QSAR) modeling studies. However, the quality of different data sources may be different based on the nature of experimental protocols. Therefore, potential experimental errors in the modeling sets may lead to the development of poor QSAR models and further affect the predictions of new compounds. In this study, we explored the relationship between the ratio of questionable data in the modeling sets, which was obtained by simulating experimental errors, and the QSAR modeling performance. To this end, we used eight data sets (four continuous endpoints and four categorical endpoints) that have been extensively curated both in-house and by our collaborators to create over 1800 various QSAR models. Each data set was duplicated to create several new modeling sets with different ratios of simulated experimental errors (i.e., randomizing the activities of part of the compounds) in the modeling process. A fivefold cross-validation process was used to evaluate the modeling performance, which deteriorates when the ratio of experimental errors increases. All of the resulting models were also used to predict external sets of new compounds, which were excluded at the beginning of the modeling process. The modeling results showed that the compounds with relatively large prediction errors in cross-validation processes are likely to be those with simulated experimental errors. However, after removing a certain number of compounds with large prediction errors in the cross-validation process, the external predictions of new compounds did not show improvement. Our conclusion is that the QSAR predictions, especially consensus predictions, can identify compounds with potential experimental errors. But removing those compounds by the cross-validation procedure is not a reasonable means to improve model predictivity due to overfitting. PMID:28691113
Vilar, Santiago; Chakrabarti, Mayukh; Costanzi, Stefano
2010-01-01
The distribution of compounds between blood and brain is a very important consideration for new candidate drug molecules. In this paper, we describe the derivation of two linear discriminant analysis (LDA) models for the prediction of passive blood-brain partitioning, expressed in terms of log BB values. The models are based on computationally derived physicochemical descriptors, namely the octanol/water partition coefficient (log P), the topological polar surface area (TPSA) and the total number of acidic and basic atoms, and were obtained using a homogeneous training set of 307 compounds, for all of which the published experimental log BB data had been determined in vivo. In particular, since molecules with log BB > 0.3 cross the blood-brain barrier (BBB) readily while molecules with log BB < −1 are poorly distributed to the brain, on the basis of these thresholds we derived two distinct models, both of which show a percentage of good classification of about 80%. Notably, the predictive power of our models was confirmed by the analysis of a large external dataset of compounds with reported activity on the central nervous system (CNS) or lack thereof. The calculation of straightforward physicochemical descriptors is the only requirement for the prediction of the log BB of novel compounds through our models, which can be conveniently applied in conjunction with drug design and virtual screenings. PMID:20427217
Vilar, Santiago; Chakrabarti, Mayukh; Costanzi, Stefano
2010-06-01
The distribution of compounds between blood and brain is a very important consideration for new candidate drug molecules. In this paper, we describe the derivation of two linear discriminant analysis (LDA) models for the prediction of passive blood-brain partitioning, expressed in terms of logBB values. The models are based on computationally derived physicochemical descriptors, namely the octanol/water partition coefficient (logP), the topological polar surface area (TPSA) and the total number of acidic and basic atoms, and were obtained using a homogeneous training set of 307 compounds, for all of which the published experimental logBB data had been determined in vivo. In particular, since molecules with logBB>0.3 cross the blood-brain barrier (BBB) readily while molecules with logBB<-1 are poorly distributed to the brain, on the basis of these thresholds we derived two distinct models, both of which show a percentage of good classification of about 80%. Notably, the predictive power of our models was confirmed by the analysis of a large external dataset of compounds with reported activity on the central nervous system (CNS) or lack thereof. The calculation of straightforward physicochemical descriptors is the only requirement for the prediction of the logBB of novel compounds through our models, which can be conveniently applied in conjunction with drug design and virtual screenings. Published by Elsevier Inc.
Set Shifting Training with Categorization Tasks
Soveri, Anna; Waris, Otto; Laine, Matti
2013-01-01
The very few cognitive training studies targeting an important executive function, set shifting, have reported performance improvements that also generalized to untrained tasks. The present randomized controlled trial extends set shifting training research by comparing previously used cued training with uncued training. A computerized adaptation of the Wisconsin Card Sorting Test was utilized as the training task in a pretest-posttest experimental design involving three groups of university students. One group received uncued training (n = 14), another received cued training (n = 14) and the control group (n = 14) only participated in pre- and posttests. The uncued training group showed posttraining performance increases on their training task, but neither training group showed statistically significant transfer effects. Nevertheless, comparison of effect sizes for transfer effects indicated that our results did not differ significantly from the previous studies. Our results suggest that the cognitive effects of computerized set shifting training are mostly task-specific, and would preclude any robust generalization effects with this training. PMID:24324717
Target specific compound identification using a support vector machine.
Plewczynski, Dariusz; von Grotthuss, Marcin; Spieser, Stephane A H; Rychlewski, Leszek; Wyrwicz, Lucjan S; Ginalski, Krzysztof; Koch, Uwe
2007-03-01
In many cases at the beginning of an HTS-campaign, some information about active molecules is already available. Often known active compounds (such as substrate analogues, natural products, inhibitors of a related protein or ligands published by a pharmaceutical company) are identified in low-throughput validation studies of the biochemical target. In this study we evaluate the effectiveness of a support vector machine applied for those compounds and used to classify a collection with unknown activity. This approach was aimed at reducing the number of compounds to be tested against the given target. Our method predicts the biological activity of chemical compounds based on only the atom pairs (AP) two dimensional topological descriptors. The supervised support vector machine (SVM) method herein is trained on compounds from the MDL drug data report (MDDR) known to be active for specific protein target. For detailed analysis, five different biological targets were selected including cyclooxygenase-2, dihydrofolate reductase, thrombin, HIV-reverse transcriptase and antagonists of the estrogen receptor. The accuracy of compound identification was estimated using the recall and precision values. The sensitivities for all protein targets exceeded 80% and the classification performance reached 100% for selected targets. In another application of the method, we addressed the absence of an initial set of active compounds for a selected protein target at the beginning of an HTS-campaign. In such a case, virtual high-throughput screening (vHTS) is usually applied by using a flexible docking procedure. However, the vHTS experiment typically contains a large percentage of false positives that should be verified by costly and time-consuming experimental follow-up assays. 
The subsequent use of our machine learning method was found to improve the speed (since the docking procedure was not required for all compounds from the database) and also the accuracy of the HTS hit lists (the enrichment factor).
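The evaluation measures named in this abstract (recall, precision, and the enrichment factor) can be computed from binary labels as in the sketch below. The definitions are the commonly used ones and are an assumption here; the abstract does not state the exact formulas, and the function names are illustrative.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = active, 0 = inactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def enrichment_factor(y_true, y_pred):
    """Hit rate among predicted actives divided by the hit rate of the whole set.

    Assumes the set contains at least one true active.
    """
    precision, _ = precision_recall(y_true, y_pred)
    base_rate = sum(y_true) / len(y_true)
    return precision / base_rate
```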
NASA Astrophysics Data System (ADS)
Kayastha, Shilva; Kunimoto, Ryo; Horvath, Dragos; Varnek, Alexandre; Bajorath, Jürgen
2017-11-01
The analysis of structure-activity relationships (SARs) becomes rather challenging when large and heterogeneous compound data sets are studied. In such cases, many different compounds and their activities need to be compared, which quickly goes beyond the capacity of subjective assessments. For a comprehensive large-scale exploration of SARs, computational analysis and visualization methods are required. Herein, we introduce a two-layered SAR visualization scheme specifically designed for increasingly large compound data sets. The approach combines a new compound pair-based variant of generative topographic mapping (GTM), a machine learning approach for nonlinear mapping, with chemical space networks (CSNs). The GTM component provides a global view of the activity landscapes of large compound data sets, in which informative local SAR environments are identified, augmented by a numerical SAR scoring scheme. Prioritized local SAR regions are then projected into CSNs that resolve these regions at the level of individual compounds and their relationships. Analysis of CSNs makes it possible to distinguish between regions having different SAR characteristics and select compound subsets that are rich in SAR information.
Modeling groundwater nitrate concentrations in private wells in Iowa
Wheeler, David C.; Nolan, Bernard T.; Flory, Abigail R.; DellaValle, Curt T.; Ward, Mary H.
2015-01-01
Contamination of drinking water by nitrate is a growing problem in many agricultural areas of the country. Ingested nitrate can lead to the endogenous formation of N-nitroso compounds, potent carcinogens. We developed a predictive model for nitrate concentrations in private wells in Iowa. Using 34,084 measurements of nitrate in private wells, we trained and tested random forest models to predict log nitrate levels by systematically assessing the predictive performance of 179 variables in 36 thematic groups (well depth, distance to sinkholes, location, land use, soil characteristics, nitrogen inputs, meteorology, and other factors). The final model contained 66 variables in 17 groups. Some of the most important variables were well depth, slope length within 1 km of the well, year of sample, and distance to nearest animal feeding operation. The correlation between observed and estimated nitrate concentrations was excellent in the training set (r-square = 0.77) and was acceptable in the testing set (r-square = 0.38). The random forest model had substantially better predictive performance than a traditional linear regression model or a regression tree. Our model will be used to investigate the association between nitrate levels in drinking water and cancer risk in the Iowa participants of the Agricultural Health Study cohort.
Modeling groundwater nitrate concentrations in private wells in Iowa.
Wheeler, David C; Nolan, Bernard T; Flory, Abigail R; DellaValle, Curt T; Ward, Mary H
2015-12-01
Contamination of drinking water by nitrate is a growing problem in many agricultural areas of the country. Ingested nitrate can lead to the endogenous formation of N-nitroso compounds, potent carcinogens. We developed a predictive model for nitrate concentrations in private wells in Iowa. Using 34,084 measurements of nitrate in private wells, we trained and tested random forest models to predict log nitrate levels by systematically assessing the predictive performance of 179 variables in 36 thematic groups (well depth, distance to sinkholes, location, land use, soil characteristics, nitrogen inputs, meteorology, and other factors). The final model contained 66 variables in 17 groups. Some of the most important variables were well depth, slope length within 1 km of the well, year of sample, and distance to nearest animal feeding operation. The correlation between observed and estimated nitrate concentrations was excellent in the training set (r-square=0.77) and was acceptable in the testing set (r-square=0.38). The random forest model had substantially better predictive performance than a traditional linear regression model or a regression tree. Our model will be used to investigate the association between nitrate levels in drinking water and cancer risk in the Iowa participants of the Agricultural Health Study cohort. Copyright © 2015 Elsevier B.V. All rights reserved.
Hospital Social Work and Spirituality: Views of Medical Social Workers.
Pandya, Samta P
2016-01-01
This article is based on a study of 1,389 medical social workers in 108 hospitals across 12 countries, on their views on spirituality and spiritually sensitive interventions in hospital settings. Results of the logistic regression analyses and structural equation models showed that medical social workers from European countries, the United States of America, Canada, and Australia, those who had undergone spiritual training, and those who had higher self-reported spiritual experiences scale scores were more likely to view spirituality in hospital settings as a means of facilitating integral healing and wellness of patients. They were also more likely to prefer spiritual packages of New Age movements as the form of spiritual program, to understand spiritual assessment as assessing the patients' spiritual starting point on which to build further interventions, and to understand spiritual techniques as mindfulness techniques. Finally, they were also likely to understand the spiritual goals of intervention in a holistic way, that is, as integral healing, growth of consciousness, and promotion of the overall well-being of patients, rather than only coping and coming to terms with health adversities. Results of the structural equation models also showed covariances between religion, spirituality training, and scores on the self-reported spiritual experiences scale, which thus had a set of compounding effects on social workers' views on spiritual interventions in hospitals. The implications of the results for health care social work practice and curriculum are discussed.
Pérez-Gálvez, Antonio; Rios, José J; Mínguez-Mosquera, María Isabel
2005-06-15
The high-temperature treatment of paprika oleoresins (Capsicum annuum L.) modified the carotenoid profile, yielding several degradation products, which were analyzed by HPLC-APCI-MS. From the initial MS data, compounds were grouped in two sets. Set 1 grouped compounds with m/z 495, and set 2 included compounds with m/z 479, in both cases for the protonated molecular mass. Two compounds of the first set were tentatively identified as 9,10,11,12,13,14,19,20-octanor-capsorubin (compound II) and 9,10,11,12,13,14,19,20-octanor-5,6-epoxide-capsanthin (compound IV), after isolation by semipreparative HPLC and analysis by EI-MS. Compounds VII, VIII, and IX from set 2 were assigned as 9,10,11,12,13,14,19,20-octanor-capsanthin and isomers, respectively. As these compounds were the major products formed in the thermal process, it was possible to apply derivatization techniques (hydrogenation and silylation) to analyze them by EI-MS, before and after chemical derivatization. Taking into account structures of the degradation products, the cyclization of polyolefins could be considered as the general reaction pathway in thermally induced reactions, yielding in the present study xylene as byproduct and the corresponding nor-carotenoids.
Eitrich, T; Kless, A; Druska, C; Meyer, W; Grotendorst, J
2007-01-01
In this paper, we study the classifications of unbalanced data sets of drugs. As an example we chose a data set of 2D6 inhibitors of cytochrome P450. The human cytochrome P450 2D6 isoform plays a key role in the metabolism of many drugs in the preclinical drug discovery process. We have collected a data set from annotated public data and calculated physicochemical properties with chemoinformatics methods. On top of this data, we have built classifiers based on machine learning methods. Data sets with different class distributions lead to the effect that conventional machine learning methods are biased toward the larger class. To overcome this problem and to obtain sensitive but also accurate classifiers we combine machine learning and feature selection methods with techniques addressing the problem of unbalanced classification, such as oversampling and threshold moving. We have used our own implementation of a support vector machine algorithm as well as the maximum entropy method. Our feature selection is based on the unsupervised McCabe method. The classification results from our test set are compared structurally with compounds from the training set. We show that the applied algorithms enable the effective high throughput in silico classification of potential drug candidates.
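The two imbalance remedies named in this abstract, oversampling and threshold moving, can be illustrated with a small sketch. It is not the authors' implementation: random duplication of minority samples and a simple score cutoff are assumed, and both function names are hypothetical.

```python
import random


def oversample_minority(X, y, seed=0):
    """Duplicate randomly chosen minority-class samples until classes balance.

    Assumes binary labels (0/1) and at least one sample in each class.
    """
    rng = random.Random(seed)
    pos = [i for i, lab in enumerate(y) if lab == 1]
    neg = [i for i, lab in enumerate(y) if lab == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(y))) + extra
    return [X[i] for i in idx], [y[i] for i in idx]


def classify_with_threshold(scores, threshold=0.5):
    """Threshold moving: lowering `threshold` favors the positive class."""
    return [1 if s >= threshold else 0 for s in scores]
```

Oversampling changes the training data, whereas threshold moving leaves the trained classifier untouched and only shifts the decision cutoff applied to its scores.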
National Plant Diagnostic Network, Taxonomic training videos: Aphids under the microscope - overview
USDA-ARS's Scientific Manuscript database
Training is a critical part of aphid (Hemiptera: Aphididae) identification. This training video provides an overview of general aphid morphology by using a compound microscope. The narrator discusses and highlights structures on the aphid that are important to make a species identification....
USDA-ARS's Scientific Manuscript database
Training is a critical part of aphid (Hemiptera: Aphididae) identification. This video provides training to identify the palm aphid, Cerataphis brasiliensis, using a compound microscope and an electronic identification key called “LUCID.” The video demonstrates key morphological structures...
Witnauer, James; Rhodes, L Jack; Kysor, Sarah; Narasiwodeyar, Sanjay
2017-11-21
The correlation between blocking and within-compound memory is stronger when compound training occurs before elemental training (i.e., backward blocking) than when the phases are reversed (i.e., forward blocking; Melchers et al., 2004, 2006). This trial order effect is often interpreted as problematic for performance-focused models that assume a critical role for within-compound associations in both retrospective revaluation and traditional cue competition. The present manuscript revisits this issue using a computational modeling approach. The fit of sometimes competing retrieval (SOCR; Stout & Miller, 2007) was compared to the fit of an acquisition-focused model of retrospective revaluation and cue competition. These simulations reveal that SOCR explains this trial order effect in some situations based on its use of local error reduction. Published by Elsevier B.V.
Simon, Alison G; Mills, DeEtta K; Furton, Kenneth G
2017-06-01
Raffaelea lauricola, a fungus causing a vascular wilt (laurel wilt) in Lauraceae trees, was introduced into the United States in the early 2000s. It has devastated forests in the Southeast and has now moved into the commercial avocado groves in southern Florida. Trained detection canines are currently one of the few successful methods for early detection of pre-symptomatic diseased trees. In order to achieve the universal and frequent training required to have successful detection canines, it is desirable to create accessible, safe, and long-lasting training aids. However, identification of odorants and compounds is limited by several factors, including both the availability of chemicals and the need to present chemicals individually and in combination to detection canines. A method for the separation and identification of volatile organic compounds (VOCs) from environmental substances for the creation of such a canine training aid is presented here. Headspace solid phase microextraction-gas chromatography-mass spectrometry (HS-SPME-GC-MS) was used to identify the odors present in avocado trees infected with the R. lauricola phytopathogen. Twenty-eight compounds were detected using this method, with nine present in greater than 80% of samples. The majority of these compounds were not commercially available as standard reference materials, and a canine trial was designed to identify the active odors without the need of pure chemical compounds. To facilitate the creation of a canine training aid, the VOCs above R. lauricola were separated by venting a 0.53mm ID solgel-wax gas chromatography column to the atmosphere. Ten minute fractions of the odor profile were collected on cotton gauze in glass vials and presented to the detection canines in a series of field trials. The canines alerted to the VOCs from the vials that correspond to a portion of the chromatogram containing the most volatile species from R. lauricola. 
This innovative fractionation and collection method can be used to develop reliable and cost effective canine training aids. Copyright © 2017 Elsevier B.V. All rights reserved.
Fraczkiewicz, Robert; Lobell, Mario; Göller, Andreas H; Krenz, Ursula; Schoenneis, Rolf; Clark, Robert D; Hillisch, Alexander
2015-02-23
In a unique collaboration between a software company and a pharmaceutical company, we were able to develop a new in silico pKa prediction tool with outstanding prediction quality. An existing pKa prediction method from Simulations Plus based on artificial neural network ensembles (ANNE), microstates analysis, and literature data was retrained with a large homogeneous data set of drug-like molecules from Bayer. The new model was thus built with curated sets of ∼14,000 literature pKa values (∼11,000 compounds, representing literature chemical space) and ∼19,500 pKa values experimentally determined at Bayer Pharma (∼16,000 compounds, representing industry chemical space). Model validation was performed with several test sets consisting of a total of ∼31,000 new pKa values measured at Bayer. For the largest and most difficult test set with >16,000 pKa values that were not used for training, the original model achieved a mean absolute error (MAE) of 0.72, root-mean-square error (RMSE) of 0.94, and squared correlation coefficient (R²) of 0.87. The new model achieves significantly improved prediction statistics, with MAE = 0.50, RMSE = 0.67, and R² = 0.93. It is commercially available as part of the Simulations Plus ADMET Predictor release 7.0. Good predictions are only of value when delivered effectively to those who can use them. The new pKa prediction model has been integrated into Pipeline Pilot and the PharmacophorInformatics (PIx) platform used by scientists at Bayer Pharma. Different output formats allow customized application by medicinal chemists, physical chemists, and computational chemists.
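The three statistics quoted in this abstract (MAE, RMSE, and the squared correlation coefficient) have standard definitions, sketched below for paired lists of observed and predicted values. The function names are illustrative, and the squared correlation is taken here to be the square of Pearson's r, which is an assumption about the authors' exact definition.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root-mean-square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r_squared(obs, pred):
    """Square of the Pearson correlation between observed and predicted values.

    Assumes neither list is constant (non-zero variance).
    """
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data, which matches the ordering of the values reported above.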
Functional Analysis of Metabolomics Data.
Chagoyen, Mónica; López-Ibáñez, Javier; Pazos, Florencio
2016-01-01
Metabolomics aims at characterizing the repertory of small chemical compounds in a biological sample. As it becomes more massive and larger sets of compounds are detected, a functional analysis is required to convert these raw lists of compounds into biological knowledge. The most common way of performing such analysis is "annotation enrichment analysis," also used in transcriptomics and proteomics. This approach extracts the annotations overrepresented in the set of chemical compounds arisen in a given experiment. Here, we describe the protocols for performing such analysis as well as for visualizing a set of compounds in different representations of the metabolic networks, in both cases using free accessible web tools.
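Annotation enrichment analysis is commonly scored with a one-sided hypergeometric test, and the sketch below assumes that choice; the abstract does not say which statistic the described web tools use. The function name and argument names are hypothetical.

```python
from math import comb


def enrichment_pvalue(k, n, K, N):
    """P(X >= k) under the hypergeometric distribution.

    N: compounds in the background set
    K: background compounds carrying the annotation
    n: compounds detected in the experiment
    k: detected compounds carrying the annotation
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)
```

A small p-value indicates that the annotation appears among the detected compounds more often than random sampling from the background would explain; in practice the values would still need correction for multiple testing across annotations.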
Single- vs. Multiple-Set Strength Training in Women.
ERIC Educational Resources Information Center
Schlumberger, Andreas; Stec, Justyna; Schmidtbleicher, Dietmar
2001-01-01
Compared the effects of single- and multiple-set strength training in women with basic experience in resistance training. Both training groups had significant strength improvements in leg extension. In the seated bench press, only the three-set group showed a significant increase in maximal strength. There were higher strength gains overall in the…
NASA Astrophysics Data System (ADS)
Su, Lihong
In remote sensing communities, support vector machine (SVM) learning has recently received increasing attention. SVM learning usually requires large memory and enormous amounts of computation time on large training sets. According to SVM algorithms, the SVM classification decision function is fully determined by support vectors, which compose a subset of the training sets. In this regard, a solution to optimize SVM learning is to efficiently reduce training sets. In this paper, a data reduction method based on agglomerative hierarchical clustering is proposed to obtain smaller training sets for SVM learning. Using a multiple angle remote sensing dataset of a semi-arid region, the effectiveness of the proposed method is evaluated by classification experiments with a series of reduced training sets. The experiments show that there is no loss of SVM accuracy when the original training set is reduced to 34% using the proposed approach. Maximum likelihood classification (MLC) also is applied on the reduced training sets. The results show that MLC can also maintain the classification accuracy. This implies that the most informative data instances can be retained by this approach.
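The idea of shrinking a training set with agglomerative hierarchical clustering can be illustrated with a naive sketch. It assumes centroid linkage, Euclidean distance, and replacement of each cluster by its centroid; the abstract does not specify these details, so this is not the author's method. For a labelled training set, it would be applied separately to the samples of each class.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def reduce_by_clustering(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain; return centroids.

    Naive cubic-time loop, adequate only for small illustrative inputs.
    """
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [centroid(c) for c in clusters]
```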
Handwritten word preprocessing for database adaptation
NASA Astrophysics Data System (ADS)
Oprean, Cristina; Likforman-Sulem, Laurence; Mokbel, Chafic
2013-01-01
Handwriting recognition systems are typically trained using publicly available databases, where data have been collected in controlled conditions (image resolution, paper background, noise level,...). Since this is not often the case in real-world scenarios, classification performance can be affected when novel data is presented to the word recognition system. To overcome this problem, we present in this paper a new approach called database adaptation. It consists of processing one set (training or test) in order to adapt it to the other set (test or training, respectively). Specifically, two kinds of preprocessing, namely stroke thickness normalization and pixel intensity normalization are considered. The advantage of such approach is that we can re-use the existing recognition system trained on controlled data. We conduct several experiments with the Rimes 2011 word database and with a real-world database. We adapt either the test set or the training set. Results show that training set adaptation achieves better results than test set adaptation, at the cost of a second training stage on the adapted data. Accuracy of data set adaptation is increased by 2% to 3% in absolute value over no adaptation.
Conditioning with compound stimuli in Drosophila melanogaster in the flight simulator.
Brembs, B; Heisenberg, M
2001-08-01
Short-term memory in Drosophila melanogaster operant visual learning in the flight simulator is explored using patterns and colours as a compound stimulus. Presented together during training, the two stimuli accrue the same associative strength whether or not a prior training phase rendered one of the two stimuli a stronger predictor for the reinforcer than the other (no blocking). This result adds Drosophila to the list of other invertebrates that do not exhibit the robust vertebrate blocking phenomenon. Other forms of higher-order learning, however, were detected: a solid sensory preconditioning and a small second-order conditioning effect imply that associations between the two stimuli can be formed, even if the compound is not reinforced.
Gatch, Michael B; Rutledge, Margaret A; Carbonaro, Theresa; Forster, Michael J
2009-07-01
There has been increased recreational use of dimethyltryptamine (DMT), but little is known of its discriminative stimulus effects. The present study assessed the similarity of the discriminative stimulus effects of DMT to other types of hallucinogens and to psychostimulants. Rats were trained to discriminate DMT from saline. To test the similarity of DMT to known hallucinogens, the ability of (+)-lysergic acid diethylamide (LSD), (-)-2,5-dimethoxy-4-methylamphetamine (DOM), (+)-methamphetamine, or (+/-)3,4-methylenedioxymethyl amphetamine (MDMA) to substitute in DMT-trained rats was tested. The ability of DMT to substitute in rats trained to discriminate each of these compounds was also tested. To assess the degree of similarity in discriminative stimulus effects, each of the compounds was tested for substitution in all of the other training groups. LSD, DOM, and MDMA all fully substituted in DMT-trained rats, whereas DMT fully substituted only in DOM-trained rats. Full cross-substitution occurred between DMT and DOM, LSD and DOM, and (+)-methamphetamine and MDMA. MDMA fully substituted for (+)-methamphetamine, DOM, and DMT, but only partially for LSD. In MDMA-trained rats, LSD and (+)-methamphetamine fully substituted, whereas DMT and DOM did not fully substitute. No cross-substitution was evident between (+)-methamphetamine and DMT, LSD, or DOM. DMT produces discriminative stimulus effects most similar to those of DOM, with some similarity to the discriminative stimulus effects of LSD and MDMA. Like DOM and LSD, DMT seems to produce predominately hallucinogenic-like discriminative stimulus effects and minimal psychostimulant effects, in contrast to MDMA which produced hallucinogen- and psychostimulant-like effects.
USDA-ARS's Scientific Manuscript database
Training is a critical part of aphid (Hemiptera: Aphididae) identification. This video provides training to identify the green peach aphid, Myzus persicae, using a compound microscope and an electronic identification key called “LUCID.” The video demonstrates key morphological structures t...
USDA-ARS's Scientific Manuscript database
Training is a critical part of aphid (Hemiptera: Aphididae) identification. This video provides training to identify the cotton aphid, Aphis gossypii, using a compound microscope and an electronic identification key called “LUCID.” The video demonstrates key morphological structures that ca...
Thinking Outside of Outpatient: Underutilized Settings for Psychotherapy Education.
Blumenshine, Philip; Lenet, Alison E; Havel, Lauren K; Arbuckle, Melissa R; Cabaniss, Deborah L
2017-02-01
Although psychiatry residents are expected to achieve competency in conducting psychotherapy during their training, it is unclear how psychotherapy teaching is integrated across diverse clinical settings. Between January and March 2015, 177 psychiatry residency training directors were sent a survey asking about psychotherapy training practices in their programs, as well as perceived barriers to psychotherapy teaching. Eighty-two training directors (44%) completed the survey. While 95% indicated that psychotherapy was a formal learning objective for outpatient clinic rotations, fifty percent or fewer noted psychotherapy was a learning objective in other settings. Most program directors would like to see psychotherapy training included (particularly supportive psychotherapy and cognitive behavioral therapy) on inpatient (82%) and consultation-liaison settings (57%). The most common barriers identified to teaching psychotherapy in these settings were time and perceived inadequate staff training and interest. Non-outpatient rotations appear to be an underutilized setting for psychotherapy teaching.
Prediction of new bioactive molecules using a Bayesian belief network.
Abdo, Ammar; Leclère, Valérie; Jacques, Philippe; Salim, Naomie; Pupin, Maude
2014-01-27
Natural products and synthetic compounds are a valuable source of new small molecules leading to novel drugs to cure diseases. However, identifying new biologically active small molecules is still a challenge. In this paper, we introduce a new activity prediction approach using Bayesian belief network for classification (BBNC). The roots of the network are the fragments composing a compound. The leaves are, on one side, the activities to predict and, on another side, the unknown compound. The activities are represented by sets of known compounds, and sets of inactive compounds are also used. We calculated a similarity between an unknown compound and each activity class. The most similar activity is assigned to the unknown compound. We applied this new approach to eight well-known data sets extracted from the literature and compared its performance to three classical machine learning algorithms. Experiments showed that BBNC provides interesting prediction rates (from 79% accuracy for highly diverse data sets to 99% for less diverse ones) with a short calculation time. Experiments also showed that BBNC is particularly effective for homogeneous data sets but performs less well with structurally heterogeneous sets. However, it is important to stress that we believe that using several approaches whenever possible for activity prediction can often give a broader understanding of the data than using only one approach. Thus, BBNC is a useful addition to the computational chemist's toolbox.
Basics of compounding sterile preparations: nomenclature and considerations.
Allen, Loyd V
2014-01-01
This article focuses on sterile dosage forms and serves as a review for those trained in compounding sterile preparations, as well as to educate those that have not received any formal training on the topics of nomenclature and composition. The use of proper terminology is important for proper/accurate communications among healthcare practitioners. Proper terminology also has potential legal/liability implications. In addition to terminology considerations, it is important to be aware of the different routes of administration of sterile formulations and their different compositions and uses.
New strategies in sport nutrition to increase exercise performance.
Close, G L; Hamilton, D L; Philp, A; Burke, L M; Morton, J P
2016-09-01
Despite over 50 years of research, the field of sports nutrition continues to grow at a rapid rate. Whilst the traditional research focus was one that centred on strategies to maximise competition performance, emerging data in the last decade has demonstrated how both macronutrient and micronutrient availability can play a prominent role in regulating those cell signalling pathways that modulate skeletal muscle adaptations to endurance and resistance training. Nonetheless, in the context of exercise performance, it is clear that carbohydrate (but not fat) still remains king and that carefully chosen ergogenic aids (e.g. caffeine, creatine, sodium bicarbonate, beta-alanine, nitrates) can all promote performance in the correct exercise setting. In relation to exercise training, however, it is now thought that strategic periods of reduced carbohydrate and elevated dietary protein intake may enhance training adaptations whereas high carbohydrate availability and antioxidant supplementation may actually attenuate training adaptation. Emerging evidence also suggests that vitamin D may play a regulatory role in muscle regeneration and subsequent hypertrophy following damaging forms of exercise. Finally, novel compounds (albeit largely examined in rodent models) such as epicatechins, nicotinamide riboside, resveratrol, β-hydroxy β-methylbutyrate, phosphatidic acid and ursolic acid may also promote or attenuate skeletal muscle adaptations to endurance and strength training. When taken together, it is clear that sports nutrition is very much at the heart of the Olympic motto, Citius, Altius, Fortius (faster, higher, stronger). Copyright © 2016 Elsevier Inc. All rights reserved.
Shi, Z; Ma, X H; Qin, C; Jia, J; Jiang, Y Y; Tan, C Y; Chen, Y Z
2012-02-01
Selective multi-target serotonin reuptake inhibitors enhance antidepressant efficacy. Their discovery can be facilitated by multiple methods, including in silico ones. In this study, we developed and tested an in silico method, combinatorial support vector machines (COMBI-SVMs), for virtual screening (VS) multi-target serotonin reuptake inhibitors of seven target pairs (serotonin transporter paired with noradrenaline transporter, H(3) receptor, 5-HT(1A) receptor, 5-HT(1B) receptor, 5-HT(2C) receptor, melanocortin 4 receptor and neurokinin 1 receptor respectively) from large compound libraries. COMBI-SVMs trained with 917-1951 individual target inhibitors correctly identified 22-83.3% (majority >31.1%) of the 6-216 dual inhibitors collected from literature as independent testing sets. COMBI-SVMs showed moderate to good target selectivity in misclassifying as dual inhibitors 2.2-29.8% (majority <15.4%) of the individual target inhibitors of the same target pair and 0.58-7.1% of the other 6 targets outside the target pair. COMBI-SVMs showed low dual inhibitor false hit rates (0.006-0.056%, 0.042-0.21%, 0.2-4%) in screening 17 million PubChem compounds, 168,000 MDDR compounds, and 7-8181 MDDR compounds similar to the dual inhibitors. Compared with similarity searching, k-NN and PNN methods, COMBI-SVM produced comparable dual inhibitor yields, similar target selectivity, and lower false hit rate in screening 168,000 MDDR compounds. The annotated classes of many COMBI-SVMs identified MDDR virtual hits correlate with the reported effects of their predicted targets. COMBI-SVM is potentially useful for searching selective multi-target agents without explicit knowledge of these agents. Copyright © 2011 Elsevier Inc. All rights reserved.
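The combination step behind a screen of this kind can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that a compound is reported as a virtual dual inhibitor only when the two single-target classifiers both call it active, which the abstract does not spell out. The names `combi_screen`, `sert_model`, and `net_model`, the threshold rules, and the descriptor values are hypothetical stand-ins for trained SVMs and real molecular descriptors.

```python
from typing import Callable, List, Sequence

# A single-target classifier maps a descriptor vector to active / inactive.
Classifier = Callable[[Sequence[float]], bool]


def combi_screen(compounds: Sequence[Sequence[float]],
                 clf_a: Classifier,
                 clf_b: Classifier) -> List[int]:
    """Return indices of compounds that both single-target models call active."""
    return [i for i, x in enumerate(compounds) if clf_a(x) and clf_b(x)]


# Hypothetical stand-ins for two trained single-target models.
def sert_model(x: Sequence[float]) -> bool:
    return x[0] > 0.5


def net_model(x: Sequence[float]) -> bool:
    return x[1] > 0.5


# Hypothetical two-descriptor library; real screens use many descriptors.
library = [(0.9, 0.8), (0.9, 0.1), (0.2, 0.7), (0.6, 0.6)]
dual_hits = combi_screen(library, sert_model, net_model)
print(dual_hits)
```

Under this assumed rule, a compound active against only one target of the pair is rejected, which is the behaviour the selectivity figures in the abstract are measuring.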
Proline-Based Carbamates as Cholinesterase Inhibitors.
Pizova, Hana; Havelkova, Marketa; Stepankova, Sarka; Bak, Andrzej; Kauerova, Tereza; Kozik, Violetta; Oravec, Michal; Imramovsky, Ales; Kollar, Peter; Bobal, Pavel; Jampilek, Josef
2017-11-14
A series of twenty-five benzyl (2S)-2-(arylcarbamoyl)pyrrolidine-1-carboxylates was prepared and completely characterized. All the compounds were tested for their in vitro ability to inhibit acetylcholinesterase (AChE) and butyrylcholinesterase (BChE), and the selectivity of compounds to individual cholinesterases was determined. Screening of the cytotoxicity of all the compounds was performed using a human monocytic leukaemia THP-1 cell line, and the compounds demonstrated insignificant toxicity. All the compounds showed rather moderate inhibitory effect against AChE; benzyl (2S)-2-[(2-chlorophenyl)carbamoyl]pyrrolidine-1-carboxylate (IC50 = 46.35 μM) was the most potent agent. On the other hand, benzyl (2S)-2-[(4-bromophenyl)-] and benzyl (2S)-2-[(2-bromophenyl)carbamoyl]pyrrolidine-1-carboxylates expressed anti-BChE activity (IC50 = 28.21 and 27.38 μM, respectively) comparable with that of rivastigmine. The ortho-brominated compound as well as benzyl (2S)-2-[(2-hydroxyphenyl)carbamoyl]pyrrolidine-1-carboxylate demonstrated greater selectivity to BChE. The in silico characterization of the structure-inhibitory potency for the set of proline-based carbamates considering electronic, steric and lipophilic properties was provided using comparative molecular surface analysis (CoMSA) and principal component analysis (PCA). Moreover, the systematic space inspection with splitting data into the training/test subset was performed to monitor the statistical estimators' performance in the effort to map the probability-guided pharmacophore pattern. The comprehensive screening of the AChE/BChE profile revealed potentially relevant structural and physicochemical features that might be essential for mapping of the carbamates inhibition efficiency indicating qualitative variations exerted on the reaction site by the substituent in the 3'-/4'-position of the phenyl ring. In addition, the investigation was completed by a molecular docking study of recombinant human AChE.
Establishing Fire Safety Skills Using Behavioral Skills Training
ERIC Educational Resources Information Center
Houvouras, Andrew J., IV; Harvey, Mark T.
2014-01-01
The use of behavioral skills training (BST) to educate 3 adolescent boys on the risks of lighters and fire setting was evaluated using in situ assessment in a school setting. Two participants had a history of fire setting. After training, all participants adhered to established rules: (a) avoid a deactivated lighter, (b) leave the training area,…
Li, Xiaomeng; Yang, Zhuo
2017-01-01
As a sustainable transportation mode, high-speed railway (HSR) has become an efficient way to meet the huge travel demand. However, due to the high acquisition and maintenance cost, it is impossible to build enough infrastructure and purchase enough train-sets. Great efforts are required to improve the transport capability of HSR. The utilization efficiency of train-sets (carrying tools of HSR) is one of the most important factors of the transport capacity of HSR. In order to enhance the utilization efficiency of the train-sets, this paper proposed a train-set circulation optimization model to minimize the total connection time. An innovative two-stage approach which contains segments generation and segments combination was designed to solve this model. In order to verify the feasibility of the proposed approach, an experiment was carried out in the Beijing-Tianjin passenger dedicated line, to fulfill a 174 trips train diagram. The model results showed that compared with the traditional Ant Colony Algorithm (ACA), the utilization efficiency of train-sets can be increased from 43.4% (ACA) to 46.9% (Two-Stage), and 1 train-set can be saved up to fulfill the same transportation tasks. The approach proposed in the study is faster and more stable than the traditional ones, by using which, the HSR staff can draw up the train-sets circulation plan more quickly and the utilization efficiency of the HSR system is also improved. PMID:28489933
Zevin, Boris; Dedy, Nicolas J; Bonrath, Esther M; Grantcharov, Teodor P
2017-05-01
There is no comprehensive simulation-enhanced training curriculum to address cognitive, psychomotor, and nontechnical skills for an advanced minimally invasive procedure. (1) To develop and provide evidence of validity for a comprehensive simulation-enhanced training (SET) curriculum for an advanced minimally invasive procedure; (2) to demonstrate transfer of acquired psychomotor skills from a simulation laboratory to a live porcine model; and (3) to compare training outcomes of SET curriculum group and chief resident group. University. This prospective single-blinded, randomized, controlled trial allocated 20 intermediate-level surgery residents to receive either conventional training (control) or SET curriculum training (intervention). The SET curriculum consisted of cognitive, psychomotor, and nontechnical training modules. Psychomotor skills in a live anesthetized porcine model in the OR were the primary outcome. Knowledge of advanced minimally invasive and bariatric surgery and nontechnical skills in a simulated OR crisis scenario were the secondary outcomes. Residents in the SET curriculum group went on to perform a laparoscopic jejunojejunostomy in the OR. Cognitive, psychomotor, and nontechnical skills of SET curriculum group were also compared to a group of 12 chief surgery residents. SET curriculum group demonstrated superior psychomotor skills in a live porcine model (56 [47-62] versus 44 [38-53], P<.05) and superior nontechnical skills (41 [38-45] versus 31 [24-40], P<.01) compared with conventional training group. SET curriculum group and conventional training group demonstrated equivalent knowledge (14 [12-15] versus 13 [11-15], P = 0.47). SET curriculum group demonstrated equivalent psychomotor skills in the live porcine model and in the OR in a human patient (56 [47-62] versus 63 [61-68]; P = .21).
SET curriculum group demonstrated inferior knowledge (13 [11-15] versus 16 [14-16]; P<.05), equivalent psychomotor skill (63 [61-68] versus 68 [62-74]; P = .50), and superior nontechnical skills (41 [38-45] versus 34 [27-35], P<.01) compared with chief resident group. Completion of the SET curriculum resulted in superior training outcomes, compared with conventional surgery training. Implementation of the SET curriculum can standardize training for an advanced minimally invasive procedure and can ensure that comprehensive proficiency milestones are met before exposure to patient care. Copyright © 2017 American Society for Bariatric Surgery. Published by Elsevier Inc. All rights reserved.
Nourski, Kirill V; Abbas, Paul J; Miller, Charles A; Robinson, Barbara K; Jeng, Fuh-Cherng
2005-04-01
This study investigated the effects of acoustic noise on the auditory nerve compound action potentials in response to electric pulse trains. Subjects were adult guinea pigs, implanted with a minimally invasive electrode to preserve acoustic sensitivity. Electrically evoked compound action potentials (ECAP) were recorded from the auditory nerve trunk in response to electric pulse trains both during and after the presentation of acoustic white noise. Simultaneously presented acoustic noise produced a decrease in ECAP amplitude. The effect of the acoustic masker on the electric probe was greatest at the onset of the acoustic stimulus and it was followed by a partial recovery of the ECAP amplitude. Following cessation of the acoustic noise, ECAP amplitude recovered over a period of approximately 100-200 ms. The effects of the acoustic noise were more prominent at lower electric pulse rates (interpulse intervals of 3 ms and higher). At higher pulse rates, the ECAP adaptation to the electric pulse train alone was larger and the acoustic noise, when presented, produced little additional effect. The observed effects of noise on ECAP were the greatest at high electric stimulus levels and, for a particular electric stimulus level, at high acoustic noise levels.
Analytic Methods Used in Quality Control in a Compounding Pharmacy.
Allen, Loyd V
2017-01-01
Analytical testing will no doubt become a more important part of pharmaceutical compounding as the public and regulatory agencies demand increasing documentation of the quality of compounded preparations. Compounding pharmacists must decide what types of testing and what amount of testing to include in their quality-control programs, and whether testing should be done in-house or outsourced. Like pharmaceutical compounding, analytical testing should be performed only by those who are appropriately trained and qualified. This article discusses the analytical methods that are used in quality control in a compounding pharmacy. Copyright© by International Journal of Pharmaceutical Compounding, Inc.
2D-QSAR and 3D-QSAR Analyses for EGFR Inhibitors
Zhao, Manman; Zheng, Linfeng; Qiu, Chun
2017-01-01
Epidermal growth factor receptor (EGFR) is an important target for cancer therapy. In this study, EGFR inhibitors were investigated to build a two-dimensional quantitative structure-activity relationship (2D-QSAR) model and a three-dimensional quantitative structure-activity relationship (3D-QSAR) model. In the 2D-QSAR model, the support vector machine (SVM) classifier combined with the feature selection method was applied to predict whether a compound was an EGFR inhibitor. As a result, the prediction accuracy of the 2D-QSAR model was 98.99% by using tenfold cross-validation test and 97.67% by using independent set test. Then, in the 3D-QSAR model, the model with q2 = 0.565 (cross-validated correlation coefficient) and r2 = 0.888 (non-cross-validated correlation coefficient) was built to predict the activity of EGFR inhibitors. The mean absolute error (MAE) of the training set and test set was 0.308 log units and 0.526 log units, respectively. In addition, molecular docking was also employed to investigate the interaction between EGFR inhibitors and EGFR. PMID:28630865
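The two regression statistics quoted above can be computed as in the sketch below. It is illustrative only: the abstract does not give the formula it used for q2, so the common cross-validated form, 1 minus PRESS divided by the total sum of squares about the mean, is assumed here, and the activity values are hypothetical rather than taken from the study.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted activities."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def q_squared(y_true: Sequence[float], y_cv_pred: Sequence[float]) -> float:
    """Assumed form: 1 - PRESS / SS, using cross-validated predictions."""
    if len(y_true) != len(y_cv_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    press = sum((t - p) ** 2 for t, p in zip(y_true, y_cv_pred))
    total = sum((t - mean_y) ** 2 for t in y_true)
    if total == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - press / total


# Hypothetical activities in log units (not data from the paper).
observed = [5.0, 6.0, 7.0]
predicted = [5.5, 6.0, 6.5]
mae = mean_absolute_error(observed, predicted)
q2 = q_squared(observed, predicted)
```

A larger MAE on the test set than on the training set, as reported above, is the usual sign that the fitted model describes its own training compounds better than unseen ones.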
Correcting Evaluation Bias of Relational Classifiers with Network Cross Validation
2010-01-01
classification algorithms: simple random resampling (RRS), equal-instance random resampling (ERS), and network cross-validation (NCV). The first two... NCV procedure that eliminates overlap between test sets altogether. The procedure samples for k disjoint test sets that will be used for evaluation...propLabeled ∗ S) nodes from trainPool; inferenceSet = network − trainSet; F = F ∪ <trainSet, testSet, inferenceSet>; end for; output: F. NCV addresses
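The surviving fragment of the NCV pseudocode can be read as the sampling loop sketched below. Several details are assumptions because the excerpt is truncated: S is taken to be the fold size (number of nodes divided by k), the training pool is taken to be every node outside the current test set, and sampling is uniform. The function name `network_cross_validation` and its parameters are illustrative.

```python
import random
from typing import Hashable, Iterable, List, Set, Tuple

Fold = Tuple[Set[Hashable], Set[Hashable], Set[Hashable]]


def network_cross_validation(nodes: Iterable[Hashable], k: int,
                             prop_labeled: float, seed: int = 0) -> List[Fold]:
    """Build k folds of (train_set, test_set, inference_set) with disjoint test sets."""
    rng = random.Random(seed)
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    all_nodes = set(shuffled)
    fold_size = len(shuffled) // k            # assumed meaning of S
    n_train = int(prop_labeled * fold_size)   # propLabeled * S labeled nodes
    folds: List[Fold] = []
    for i in range(k):
        # Consecutive slices of one shuffle give pairwise disjoint test sets.
        test_set = set(shuffled[i * fold_size:(i + 1) * fold_size])
        # Assumed: labeled nodes are drawn only from outside the test set.
        train_pool = [n for n in shuffled if n not in test_set]
        train_set = set(rng.sample(train_pool, n_train))
        # From the fragment: inference runs on the network minus the labeled nodes.
        inference_set = all_nodes - train_set
        folds.append((train_set, test_set, inference_set))
    return folds


folds = network_cross_validation(range(100), k=5, prop_labeled=0.5)
```

Because each test set is a separate slice of a single shuffle, no node is evaluated in more than one fold, which is the overlap the excerpt says NCV eliminates.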
Sirois, S; Tsoukas, C M; Chou, Kuo-Chen; Wei, Dongqing; Boucher, C; Hatzakis, G E
2005-03-01
Quantitative Structure Activity Relationship (QSAR) techniques are used routinely by computational chemists in drug discovery and development to analyze datasets of compounds. Quantitative numerical methods like Partial Least Squares (PLS) and Artificial Neural Networks (ANN) have been used on QSAR to establish correlations between molecular properties and bioactivity. However, ANN may be advantageous over PLS because it considers the interrelations of the modeled variables. This study focused on the HIV-1 Protease (HIV-1 Pr) inhibitors belonging to the peptidomimetic class of compounds. The main objective was to select molecular descriptors with the best predictive value for antiviral potency (Ki). PLS and ANN were used to predict Ki activity of HIV-1 Pr inhibitors and the results were compared. To address the issue of dimensionality reduction, Genetic Algorithms (GA) were used for variable selection and their performance was compared against that of ANN. Finally, the structure of the optimum ANN achieving the highest Pearson's-R coefficient was determined. On the basis of Pearson's-R, PLS and ANN were compared to determine which exhibits maximum performance. Training and validation of models were performed on 15 random split sets of the master dataset consisting of 231 compounds. For each compound 192 molecular descriptors were considered. The molecular structure and constant of inhibition (Ki) were selected from the NIAID database. Study findings suggested that non-covalent interactions such as hydrophobicity, shape and hydrogen bonding describe well the antiviral activity of the HIV-1 Pr compounds. The significance of lipophilicity and relationship to HIV-1 associated hyperlipidemia and lipodystrophy syndrome warrant further investigation.
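The evaluation protocol described above, repeated random splits scored by Pearson's R, can be sketched as follows. Only the counts of 15 splits and 231 compounds come from the abstract; the 80% training fraction, the seed, and the function names `pearson_r` and `random_splits` are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def random_splits(n_items: int, n_splits: int, train_fraction: float,
                  seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return n_splits independent (train_indices, validation_indices) pairs."""
    rng = random.Random(seed)
    n_train = int(train_fraction * n_items)
    splits = []
    for _ in range(n_splits):
        indices = list(range(n_items))
        rng.shuffle(indices)
        splits.append((indices[:n_train], indices[n_train:]))
    return splits


# 15 splits of 231 compounds as in the abstract; the 0.8 fraction is assumed.
splits = random_splits(231, 15, 0.8)
```

In use, a model would be fitted on each training index list and `pearson_r` applied to the observed and predicted Ki values of the matching validation list, giving 15 coefficients to compare between PLS and ANN.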
Consensus Modeling of Oral Rat Acute Toxicity
An acute toxicity dataset (oral rat LD50) with about 7400 compounds was compiled from the ChemIDplus database. This dataset was divided into a modeling set and a prediction set. The compounds in the prediction set were selected so that they were present in the modeling set used...
Bitter or not? BitterPredict, a tool for predicting taste from chemical structure.
Dagan-Wiener, Ayana; Nissim, Ido; Ben Abu, Natalie; Borgonovo, Gigliola; Bassoli, Angela; Niv, Masha Y
2017-09-21
Bitter taste is an innately aversive taste modality that is considered to protect animals from consuming toxic compounds. Yet, bitterness is not always noxious and some bitter compounds have beneficial effects on health. Hundreds of bitter compounds were reported (and are accessible via the BitterDB http://bitterdb.agri.huji.ac.il/dbbitter.php ), but numerous additional bitter molecules are still unknown. The dramatic chemical diversity of bitterants makes bitterness prediction a difficult task. Here we present a machine learning classifier, BitterPredict, which predicts whether a compound is bitter or not, based on its chemical structure. BitterDB was used as the positive set, and non-bitter molecules were gathered from literature to create the negative set. Adaptive Boosting (AdaBoost), based on decision trees machine-learning algorithm was applied to molecules that were represented using physicochemical and ADME/Tox descriptors. BitterPredict correctly classifies over 80% of the compounds in the hold-out test set, and 70-90% of the compounds in three independent external sets and in sensory test validation, providing a quick and reliable tool for classifying large sets of compounds into bitter and non-bitter groups. BitterPredict suggests that about 40% of random molecules, and a large portion (66%) of clinical and experimental drugs, and of natural products (77%) are bitter.
ERIC Educational Resources Information Center
Campos, Heloisa Cursi; Debert, Paula; Barros, Romariz da Silva; McIlvane, William J.
2011-01-01
A go/no-go procedure with compound stimuli typically establishes emergent behavior that parallels in structure and typical outcome that of conventional tests for symmetric, transitive, and equivalence relations in normally capable adults. The present study employed a go/no-go compound stimulus procedure with pigeons. During training, pecks to…
2016 Consequence Management Advisory Division's (CMAD) Annual Report
CMAD annual report for 2016 which covers activities such as radiation task force leaders annual training, national criminal enforcement response team annual training, field technology demonstrations, and a new method to detect perfluorinated compounds.
Smartphone-Based System for Learning and Inferring Hearing Aid Settings.
Aldaz, Gabriel; Puria, Sunil; Leifer, Larry J
2016-10-01
Previous research has shown that hearing aid wearers can successfully self-train their instruments' gain-frequency response and compression parameters in everyday situations. Combining hearing aids with a smartphone introduces additional computing power, memory, and a graphical user interface that may enable greater setting personalization. To explore the benefits of self-training with a smartphone-based hearing system, a parameter space was chosen with four possible combinations of microphone mode (omnidirectional and directional) and noise reduction state (active and off). The baseline for comparison was the "untrained system," that is, the manufacturer's algorithm for automatically selecting microphone mode and noise reduction state based on acoustic environment. The "trained system" first learned each individual's preferences, self-entered via a smartphone in real-world situations, to build a trained model. The system then predicted the optimal setting (among available choices) using an inference engine, which considered the trained model and current context (e.g., sound environment, location, and time). To develop a smartphone-based prototype hearing system that can be trained to learn preferred user settings. Determine whether user study participants showed a preference for trained over untrained system settings. An experimental within-participants study. Participants used a prototype hearing system-comprising two hearing aids, Android smartphone, and body-worn gateway device-for ∼6 weeks. Sixteen adults with mild-to-moderate sensorineural hearing loss (HL) (ten males, six females; mean age = 55.5 yr). Fifteen had ≥6 mo of experience wearing hearing aids, and 14 had previous experience using smartphones. Participants were fitted and instructed to perform daily comparisons of settings ("listening evaluations") through a smartphone-based software application called Hearing Aid Learning and Inference Controller (HALIC). 
In the four-week-long training phase, HALIC recorded individual listening preferences along with sensor data from the smartphone-including environmental sound classification, sound level, and location-to build trained models. In the subsequent two-week-long validation phase, participants performed blinded listening evaluations comparing settings predicted by the trained system ("trained settings") to those suggested by the hearing aids' untrained system ("untrained settings"). We analyzed data collected on the smartphone and hearing aids during the study. We also obtained audiometric and demographic information. Overall, the 15 participants with valid data significantly preferred trained settings to untrained settings (paired-samples t test). Seven participants had a significant preference for trained settings, while one had a significant preference for untrained settings (binomial test). The remaining seven participants had nonsignificant preferences. Pooling data across participants, the proportion of times that each setting was chosen in a given environmental sound class was on average very similar. However, breaking down the data by participant revealed strong and idiosyncratic individual preferences. Fourteen participants reported positive feelings of clarity, competence, and mastery when training via HALIC. The obtained data, as well as subjective participant feedback, indicate that smartphones could become viable tools to train hearing aids. Individuals who are tech savvy and have milder HL seem well suited to take advantages of the benefits offered by training with a smartphone. American Academy of Audiology
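The per-participant analysis mentioned above uses a binomial test on how often trained settings were chosen. A minimal sketch of an exact two-sided version follows, assuming a null preference probability of 0.5. The abstract does not state whether the test was one- or two-sided, and the counts used here are hypothetical.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one under the null hypothesis.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    probs = [comb(trials, k) * p ** k * (1 - p) ** (trials - k)
             for k in range(trials + 1)]
    observed = probs[successes]
    # Small relative tolerance so ties are not lost to floating-point rounding.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical participant: trained setting chosen in 10 of 10 evaluations.
p_strong = binomial_two_sided_p(10, 10)
# Hypothetical participant with an even split of choices.
p_even = binomial_two_sided_p(5, 10)
```

A participant whose choices split evenly gives a p-value of 1, which matches the "nonsignificant preference" category in the abstract, while a consistent preference in either direction gives a small p-value.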
How well does multiple OCR error correction generalize?
NASA Astrophysics Data System (ADS)
Lund, William B.; Ringger, Eric K.; Walker, Daniel D.
2013-12-01
As the digitization of historical documents, such as newspapers, becomes more common, the need of the archive patron for accurate digital text from those documents increases. Building on our earlier work, the contributions of this paper are: 1. in demonstrating the applicability of novel methods for correcting optical character recognition (OCR) on disparate data sets, including a new synthetic training set, 2. enhancing the correction algorithm with novel features, and 3. assessing the data requirements of the correction learning method. First, we correct errors using conditional random fields (CRF) trained on synthetic training data sets in order to demonstrate the applicability of the methodology to unrelated test sets. Second, we show the strength of lexical features from the training sets on two unrelated test sets, yielding a relative reduction in word error rate on the test sets of 6.52%. New features capture the recurrence of hypothesis tokens and yield an additional relative reduction in WER of 2.30%. Further, we show that only 2.0% of the full training corpus of over 500,000 feature cases is needed to achieve correction results comparable to those using the entire training corpus, effectively reducing both the complexity of the training process and the learned correction model.
Xu, G; Hughes-Oliver, J M; Brooks, J D; Yeatts, J L; Baynes, R E
2013-01-01
Quantitative structure-activity relationship (QSAR) models are being used increasingly in skin permeation studies. The main idea of QSAR modelling is to quantify the relationship between biological activities and chemical properties, and thus to predict the activity of chemical solutes. As a key step, the selection of a representative and structurally diverse training set is critical to the prediction power of a QSAR model. Early QSAR models selected training sets in a subjective way and solutes in the training set were relatively homogenous. More recently, statistical methods such as D-optimal design or space-filling design have been applied but such methods are not always ideal. This paper describes a comprehensive procedure to select training sets from a large candidate set of 4534 solutes. A newly proposed 'Baynes' rule', which is a modification of Lipinski's 'rule of five', was used to screen out solutes that were not qualified for the study. U-optimality was used as the selection criterion. A principal component analysis showed that the selected training set was representative of the chemical space. Gas chromatograph amenability was verified. A model built using the training set was shown to have greater predictive power than a model built using a previous dataset [1].
Over-Selectivity as a Learned Response
ERIC Educational Resources Information Center
Reed, Phil; Petrina, Neysa; McHugh, Louise
2011-01-01
An experiment investigated the effects of different levels of task complexity in pre-training on over-selectivity in a subsequent match-to-sample (MTS) task. Twenty human participants were divided into two groups; exposed either to a 3-element, or a 9-element, compound stimulus as a sample during MTS training. After the completion of training,…
Dörrenbächer, Sandra; Müller, Philipp M.; Tröger, Johannes; Kray, Jutta
2014-01-01
Although motivational reinforcers are often used to enhance the attractiveness of trainings of cognitive control in children, little is known about how such motivational manipulations of the setting contribute to separate gains in motivation and cognitive-control performance. Here we provide a framework for systematically investigating the impact of a motivational video-game setting on the training motivation, the task performance, and the transfer success in a task-switching training in middle-aged children (8–11 years of age). We manipulated both the type of training (low-demanding/single-task training vs. high-demanding/task-switching training) as well as the motivational setting (low-motivational/without video-game elements vs. high-motivational/with video-game elements) separately from one another. The results indicated that the addition of game elements to a training setting enhanced the intrinsic interest in task practice, independently of the cognitive demands placed by the training type. In the task-switching group, the high-motivational training setting led to an additional enhancement of task and switching performance during the training phase right from the outset. These motivation-induced benefits projected onto the switching performance in a switching situation different from the trained one (near-transfer measurement). However, in structurally dissimilar cognitive tasks (far-transfer measurement), the motivational gains only transferred to the response dynamics (speed of processing). Hence, the motivational setting clearly had a positive impact on the training motivation and on the paradigm-specific task-switching abilities; it did not, however, consistently generalize to broad cognitive processes. These findings shed new light on the conflation of motivation and cognition in childhood and may help to refine guidelines for designing adequate training interventions. PMID:25431564
LVQ and backpropagation neural networks applied to NASA SSME data
NASA Technical Reports Server (NTRS)
Doniere, Timothy F.; Dhawan, Atam P.
1993-01-01
Feedforward neural networks with backpropagation learning have been used as function approximators for modeling the space shuttle main engine (SSME) sensor signals. The modeling of these sensor signals is aimed at the development of a sensor fault detection system that can be used during ground test firings. The generalization capability of a neural network based function approximator depends on the training vectors, which in this application may be derived from a number of SSME ground test firings. This yields a large number of training vectors. Large training sets can make the time required to train the network very long. Also, the network may not be able to generalize for large training sets. To reduce the size of the training sets, the SSME test-firing data is reduced using a learning vector quantization (LVQ) based technique. Different compression ratios were used to obtain compressed data for training the neural network model. The performance of the neural model trained using reduced sets of training patterns is presented and compared with the performance of the model trained using complete data. The LVQ can also be used as a function approximator. The performance of the LVQ as a function approximator using reduced training sets is presented and compared with the performance of the backpropagation network.
Expanding the analyte set of the JPL Electronic Nose to include inorganic compounds
NASA Technical Reports Server (NTRS)
Ryan, M. A.; Homer, M. L.; Zhou, H.; Mannat, K.; Manfreda, A.; Kisor, A.; Shevade, A.; Yen, S. P. S.
2005-01-01
An array-based sensing system based on 32 polymer/carbon composite conductometric sensors is under development at JPL. Until the present phase of development, the analyte set has focused on organic compounds and a few selected inorganic compounds, notably ammonia and hydrazine.
Maximizing lipocalin prediction through balanced and diversified training set and decision fusion.
Nath, Abhigyan; Subbiah, Karthikeyan
2015-12-01
Lipocalins are short in sequence length and perform several important biological functions. These proteins share less than 20% sequence similarity among paralogs. Experimentally identifying them is an expensive and time-consuming process. Computational methods that allocate putative members to this family by sequence similarity also remain elusive because of the low sequence similarity among its members. Consequently, machine learning methods become a viable alternative for their prediction, using the underlying sequence- and structure-derived features as input. Ideally, any machine learning based prediction method must be trained with all possible variations in the input feature vector (all the sub-class input patterns) to achieve perfect learning. Near-perfect learning can be achieved by training the model with diverse types of input instances belonging to the different regions of the entire input space. Furthermore, prediction performance can be improved by balancing the training set, as imbalanced data sets tend to bias predictions towards the majority class and its sub-classes. This paper aims to achieve (i) high generalization ability without classification bias through diversified and balanced training sets and (ii) enhanced prediction accuracy by combining the results of individual classifiers with an appropriate fusion scheme. Instead of creating the training set randomly, we first used the unsupervised Kmeans clustering algorithm to create diversified clusters of input patterns and then created the diversified and balanced training set by selecting an equal number of patterns from each of these clusters.
Finally, a probability based classifier fusion scheme was applied to the boosted random forest algorithm (which produced greater sensitivity) and the K nearest neighbour algorithm (which produced greater specificity) to achieve better predictive performance than that of the individual base classifiers. The performance of the models trained on the Kmeans preprocessed training set is far better than that of models trained on randomly generated training sets. The proposed method achieved a sensitivity of 90.6%, specificity of 91.4% and accuracy of 91.0% on the first test set and sensitivity of 92.9%, specificity of 96.2% and accuracy of 94.7% on the second blind test set. These results have established that diversifying the training set improves the performance of predictive models through superior generalization ability and that balancing the training set improves prediction accuracy. For smaller data sets, unsupervised Kmeans based sampling can be a more effective technique for increasing generalization than the usual random splitting method. Copyright © 2015 Elsevier Ltd. All rights reserved.
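The cluster-then-sample idea described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function names `kmeans` and `balanced_sample`, the plain Lloyd's iteration, and the toy feature vectors are all assumptions introduced here.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means (illustrative); returns a cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def balanced_sample(labels, per_cluster, seed=0):
    """Draw the same number of pattern indices from every cluster."""
    rng = random.Random(seed)
    chosen = []
    for c in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        chosen.extend(rng.sample(idx, min(per_cluster, len(idx))))
    return sorted(chosen)

# Hypothetical 2-D feature vectors forming two well-separated groups.
features = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
            (10, 10), (10, 11), (11, 10)]
cluster_of = kmeans(features, k=2)
training_indices = balanced_sample(cluster_of, per_cluster=2)
```

Sampling per cluster rather than uniformly keeps small but distinct regions of the input space represented in the training set, which is the diversification argument the abstract makes.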
Chen, Guangchao; Li, Xuehua; Chen, Jingwen; Zhang, Ya-Nan; Peijnenburg, Willie J G M
2014-12-01
Biodegradation is the principal environmental dissipation process of chemicals. As such, it is a dominant factor determining the persistence and fate of organic chemicals in the environment, and is therefore of critical importance to chemical management and regulation. In the present study, the authors developed in silico methods assessing biodegradability based on a large heterogeneous set of 825 organic compounds, using the techniques of the C4.5 decision tree, the functional inner regression tree, and logistic regression. External validation was subsequently carried out by 2 independent test sets of 777 and 27 chemicals. As a result, the functional inner regression tree exhibited the best predictability with predictive accuracies of 81.5% and 81.0%, respectively, on the training set (825 chemicals) and test set I (777 chemicals). Performance of the developed models on the 2 test sets was subsequently compared with that of the Estimation Program Interface (EPI) Suite Biowin 5 and Biowin 6 models, which also showed a better predictability of the functional inner regression tree model. The model built in the present study exhibits a reasonable predictability compared with existing models while possessing a transparent algorithm. Interpretation of the mechanisms of biodegradation was also carried out based on the models developed. © 2014 SETAC.
Zhang, Yong-Hong; Xia, Zhi-Ning; Qin, Li-Tang; Liu, Shu-Shen
2010-09-01
The objective of this paper is to build a reliable model based on the molecular electronegativity distance vector (MEDV) descriptors for predicting the blood-brain barrier (BBB) permeability and to reveal the effects of the molecular structural segments on the BBB permeability. Using 70 structurally diverse compounds, partial least squares regression (PLSR) models between the BBB permeability and the MEDV descriptors were developed and validated by the variable selection and modeling based on prediction (VSMP) technique. The estimation ability, stability, and predictive power of a model are evaluated by the estimated correlation coefficient (r), leave-one-out (LOO) cross-validation correlation coefficient (q), and predictive correlation coefficient (R(p)). The PLSR model was found to be of good quality, with r=0.9202, q=0.7956, and R(p)=0.6649 for the M1 model based on the training set of 57 samples. To identify the most important structural factors affecting the BBB permeability of compounds, we performed a variable importance in projection (VIP) analysis of the MEDV descriptors. It was found that some structural fragments in compounds, such as -CH(3), -CH(2)-, =CH-, =C, triple bond C-, -CH<, =C<, =N-, -NH-, =O, and -OH, are the most important factors affecting the BBB permeability. (c) 2010. Published by Elsevier Inc.
Burliaeva, E V; Tarkhov, A E; Burliaev, V V; Iurkevich, A M; Shvets, V I
2002-01-01
The search for new anti-HIV agents remains crucial. In general, researchers are looking for inhibitors of certain vital HIV enzymes, especially reverse transcriptase (RT) inhibitors. The modern generation of anti-HIV agents consists of non-nucleoside reverse transcriptase inhibitors (NNRTIs). They are much less toxic than nucleoside analogues and more chemically stable, and are thus metabolized and eliminated from the human body more slowly. The search for new NNRTIs is therefore timely. Synthesis and study of new anti-HIV drugs is very expensive, so employing activity prediction techniques in such a search is very beneficial. Such a technique allows the activities of newly proposed structures to be predicted. It is based on a property model built by investigating a series of known compounds with measured activity. This paper presents an approach to activity prediction based on "structure-activity" models designed to form a hypothesis about the probable activity interval estimate. The hypothesis is based on structure descriptor domains calculated for all energetically allowed conformers of each compound in the studied set. Tetrahydroimidazobenzodiazepinone (TIBO) derivatives and phenylethylthiazolylthiourea (PETT) derivatives illustrate the predictive power of this method. The results are consistent with experimental data and allow prediction of the inhibitory activity of compounds that were not included in the training set.
The UXO Classification Demonstration at San Luis Obispo, CA
2010-09-01
...optimized algorithm by applying it to only the unlabeled data in the test set. 2.17.2 Active Learning Training and Test Set: SIG also used active learning [12]. Active learning, an alternative approach for constructing a training set, is used in conjunction with either supervised or semi...
Does rational selection of training and test sets improve the outcome of QSAR modeling?
Martin, Todd M; Harten, Paul; Young, Douglas M; Muratov, Eugene N; Golbraikh, Alexander; Zhu, Hao; Tropsha, Alexander
2012-10-22
Prior to using a quantitative structure activity relationship (QSAR) model for external predictions, its predictive power should be established and validated. In the absence of a true external data set, the best way to validate the predictive ability of a model is to perform its statistical external validation. In statistical external validation, the overall data set is divided into training and test sets. Commonly, this splitting is performed using random division. Rational splitting methods can divide data sets into training and test sets in an intelligent fashion. The purpose of this study was to determine whether rational division methods lead to more predictive models compared to random division. A special data splitting procedure was used to facilitate the comparison between random and rational division methods. For each toxicity end point, the overall data set was divided into a modeling set (80% of the overall set) and an external evaluation set (20% of the overall set) using random division. The modeling set was then subdivided into a training set (80% of the modeling set) and a test set (20% of the modeling set) using rational division methods and by using random division. The Kennard-Stone, minimal test set dissimilarity, and sphere exclusion algorithms were used as the rational division methods. The hierarchical clustering, random forest, and k-nearest neighbor (kNN) methods were used to develop QSAR models based on the training sets. For kNN QSAR, multiple training and test sets were generated, and multiple QSAR models were built. The results of this study indicate that models based on rational division methods generate better statistical results for the test sets than models based on random division, but the predictive power of both types of models is comparable.
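Of the rational division methods named in this abstract, the Kennard-Stone algorithm is the simplest to sketch. The version below is the textbook form (seed with the two most distant points, then repeatedly add the point farthest from everything already selected) and is not taken from the study; the function name `kennard_stone`, the Euclidean distance, and the toy descriptor values are assumptions made for illustration.

```python
from itertools import combinations
from math import dist

def kennard_stone(points, n_train):
    """Return indices of n_train points chosen by Kennard-Stone (n_train >= 2)."""
    # Seed the training set with the two most distant points.
    i, j = max(combinations(range(len(points)), 2),
               key=lambda pair: dist(points[pair[0]], points[pair[1]]))
    selected = [i, j]
    remaining = [r for r in range(len(points)) if r not in (i, j)]
    while len(selected) < n_train and remaining:
        # Add the candidate whose nearest selected neighbour is farthest away.
        best = max(remaining,
                   key=lambda r: min(dist(points[r], points[s]) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical one-descriptor data set; the unselected points form the test set.
descriptors = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
train_idx = kennard_stone(descriptors, n_train=3)
test_idx = [k for k in range(len(descriptors)) if k not in train_idx]
```

Because the selected points span the extremes of descriptor space, the remaining test compounds tend to sit inside the region covered by the training set, which may help explain why test-set statistics look better under rational division than external predictivity does.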
Barclift, Songhai C; Brown, Elizabeth J; Finnegan, Sean C; Cohen, Elena R; Klink, Kathleen
2016-05-01
Background The Teaching Health Center Graduate Medical Education (THCGME) program is an Affordable Care Act funding initiative designed to expand primary care residency training in community-based ambulatory settings. Statute suggests, but does not require, training in underserved settings. Residents who train in underserved settings are more likely to go on to practice in similar settings, and graduates more often than not practice near where they have trained. Objective The objective of this study was to describe and quantify federally designated clinical continuity training sites of the THCGME program. Methods Geographic locations of the training sites were collected and characterized as Health Professional Shortage Area, Medically Underserved Area, Population, or rural areas, and were compared with the distribution of Centers for Medicare and Medicaid Services (CMS)-funded training positions. Results More than half of the teaching health centers (57%) are located in states that are in the 4 quintiles with the lowest CMS-funded resident-to-population ratio. Of the 109 training sites identified, more than 70% are located in federally designated high-need areas. Conclusions The THCGME program is a model that funds residency training in community-based ambulatory settings. Statute suggests, but does not explicitly require, that training take place in underserved settings. Because the majority of the 109 clinical training sites of the 60 funded programs in 2014-2015 are located in federally designated underserved locations, the THCGME program deserves further study as a model to improve primary care distribution into high-need communities.
Chen, Ying; Cai, Xiaoyu; Jiang, Long; Li, Yu
2016-02-01
Based on the experimental data of octanol-air partition coefficients (KOA) for 19 polychlorinated biphenyl (PCB) congeners, two types of QSAR methods, comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA), are used to establish 3D-QSAR models using the structural parameters as independent variables and using logKOA values as the dependent variable with the Sybyl software to predict the KOA values of the remaining 190 PCB congeners. The whole data set (19 compounds) was divided into a training set (15 compounds) for model generation and a test set (4 compounds) for model validation. As a result, the cross-validation correlation coefficient (q(2)) obtained by the CoMFA and CoMSIA models (shuffled 12 times) was in the range of 0.825-0.969 (>0.5), the correlation coefficient (r(2)) obtained was in the range of 0.957-1.000 (>0.9), and the SEP (standard error of prediction) of test set was within the range of 0.070-0.617, indicating that the models were robust and predictive. Randomly selected from a set of models, CoMFA analysis revealed that the corresponding percentages of the variance explained by steric and electrostatic fields were 23.9% and 76.1%, respectively, while CoMSIA analysis by steric, electrostatic and hydrophobic fields were 0.6%, 92.6%, and 6.8%, respectively. The electrostatic field was determined as a primary factor governing the logKOA. The correlation analysis of the relationship between the number of Cl atoms and the average logKOA values of PCBs indicated that logKOA values gradually increased as the number of Cl atoms increased. Simultaneously, related studies on PCB detection in the Arctic and Antarctic areas revealed that higher logKOA values indicate a stronger PCB migration ability. From CoMFA and CoMSIA contour maps, logKOA decreased when substituents possessed electropositive groups at the 2-, 3-, 3'-, 5- and 6- positions, which could reduce the PCB migration ability. 
These results are expected to be beneficial in predicting logKOA values of PCB homologues and derivatives and in providing a theoretical foundation for further elucidation of the global migration behaviour of PCBs. Copyright © 2015 Elsevier Inc. All rights reserved.
Noh, Kyungrin; Yoo, Sunyong; Lee, Doheon
2018-06-13
Natural products have been widely investigated in the drug development field. Their traditional use as medicinal agents and their resemblance to endogenous human compounds suggest potential for new drug development. Many researchers have focused on identifying therapeutic effects of natural products, yet the resemblance between natural products and human metabolites has rarely been examined. We propose a novel method which predicts therapeutic effects of natural products based on their similarity to human metabolites. In this study, we compare the structure, target and phenotype similarities between natural products and human metabolites to capture molecular and phenotypic properties of both types of compounds. With the generated similarity features, we train a support vector machine model to identify similar natural product and human metabolite pairs. The known functions of human metabolites are then mapped to the paired natural products to predict their therapeutic effects. With our three selected feature sets (structure, target and phenotype similarities), our trained model successfully paired similar natural products and human metabolites. When the model was applied to natural product derived drugs, we could successfully identify their indications with high specificity and sensitivity. We further validated the predicted therapeutic effects of natural products against literature evidence. These results suggest that our model can match natural products to similar human metabolites and provide possible therapeutic effects of natural products. By utilizing information on similar human metabolites, we expect to find new indications of natural products that could not be covered by previous in silico methods.
A Classification of Remote Sensing Image Based on Improved Compound Kernels of Svm
NASA Astrophysics Data System (ADS)
Zhao, Jianing; Gao, Wanlin; Liu, Zili; Mou, Guifen; Lu, Lin; Yu, Lina
Remote sensing (RS) classification based on the support vector machine (SVM), which is developed from statistical learning theory, achieves high accuracy with a small number of training samples, which makes SVM methods well suited to RS classification. The traditional RS classification method combines visual interpretation with computer classification. SVM-based RS classification improves considerably on this, because it saves much of the labor and time spent interpreting images and collecting training samples. Kernel functions play an important part in the SVM algorithm. The method presented here uses an improved compound kernel function and therefore achieves higher classification accuracy on RS images. Moreover, the compound kernel improves the generalization and learning ability of the kernel.
Ren, Ji-Xia; Li, Cheng-Ping; Zhou, Xiu-Ling; Cao, Xue-Song; Xie, Yong
2017-08-22
Myeloid cell leukemia-1 (Mcl-1) has been a validated and attractive target for cancer therapy. Over-expression of Mcl-1 in many cancers allows cancer cells to evade apoptosis and contributes to resistance to current chemotherapeutics. Here, we identified new Mcl-1 inhibitors using a multi-step virtual screening approach. First, based on two different ligand-receptor complexes, 20 pharmacophore models were established by simultaneously using the 'Receptor-Ligand Pharmacophore Generation' method and a manual feature-building method, and then carefully validated against a test database. Pharmacophore-based virtual screening (PB-VS) could then be performed using the 20 pharmacophore models. In addition, a docking study was used to predict the possible binding poses of compounds, and the docking parameters were optimized before performing docking-based virtual screening (DB-VS). Moreover, a 3D QSAR model was established from 55 aligned Mcl-1 inhibitors. The 55 inhibitors sharing the same scaffold were docked into the Mcl-1 active site before alignment, then the inhibitors with possible binding conformations were aligned. For the training set, the 3D QSAR model gave a correlation coefficient r² of 0.996; for the test set, the correlation coefficient r² was 0.812. The developed 3D QSAR model was therefore a good model that could be applied to 3D QSAR-based virtual screening (QSARD-VS). After sequential filtering with the three virtual screening methods above, 23 potential inhibitors with novel scaffolds were identified. Furthermore, we discuss in detail the mapping of two potent compounds onto the pharmacophore models and the 3D QSAR model, and the interactions between the compounds and active site residues.
Gatch, Michael B.; Rutledge, Margaret A.; Carbonaro, Theresa; Forster, Michael J.
2010-01-01
Rationale There has been increased recreational use of dimethyltryptamine (DMT), but little is known of its discriminative stimulus effects. Objectives The present study assessed the similarity of the discriminative stimulus effects of DMT to other types of hallucinogens and to psychostimulants. Methods Rats were trained to discriminate DMT from saline. To test the similarity of DMT to known hallucinogens, the ability of (+)-lysergic acid diethylamide (LSD), (−)-2,5-dimethoxy-4-methylamphetamine (DOM), (+)-methamphetamine, or (±)3,4-methylenedioxymethyl-amphetamine (MDMA) to substitute in DMT-trained rats was tested. The ability of DMT to substitute in rats trained to discriminate each of these compounds was also tested. To assess the degree of similarity in discriminative stimulus effects, each of the compounds was tested for substitution in all of the other training groups. Results LSD, DOM, and MDMA all fully substituted in DMT-trained rats, whereas DMT fully substituted only in DOM-trained rats. Full cross-substitution occurred between DMT and DOM, LSD and DOM, and (+)-methamphetamine and MDMA. MDMA fully substituted for (+)-methamphetamine, DOM, and DMT, but only partially for LSD. In MDMA-trained rats, LSD and (+)-methamphetamine fully substituted, whereas DMT and DOM did not fully substitute. No cross-substitution was evident between (+)-methamphetamine and DMT, LSD, or DOM. Conclusions DMT produces discriminative stimulus effects most similar to those of DOM, with some similarity to the discriminative stimulus effects of LSD and MDMA. Like DOM and LSD, DMT seems to produce predominately hallucinogenic-like discriminative stimulus effects and minimal psychostimulant effects, in contrast to MDMA which produced hallucinogen- and psychostimulant-like effects. PMID:19288085
Ingle, Brandall L; Veber, Brandon C; Nichols, John W; Tornero-Velez, Rogelio
2016-11-28
The free fraction of a xenobiotic in plasma (F_ub) is an important determinant of chemical adsorption, distribution, metabolism, elimination, and toxicity, yet experimental plasma protein binding data are scarce for environmentally relevant chemicals. The presented work explores the merit of utilizing available pharmaceutical data to predict F_ub for environmentally relevant chemicals via machine learning techniques. Quantitative structure-activity relationship (QSAR) models were constructed with k nearest neighbors (kNN), support vector machines (SVM), and random forest (RF) machine learning algorithms from a training set of 1045 pharmaceuticals. The models were then evaluated with independent test sets of pharmaceuticals (200 compounds) and environmentally relevant ToxCast chemicals (406 total, in two groups of 238 and 168 compounds). The selection of a minimal feature set of 10-15 2D molecular descriptors allowed for both informative feature interpretation and practical applicability domain assessment via a bounded box of descriptor ranges and principal component analysis. The diverse pharmaceutical and environmental chemical sets exhibit similarities in terms of chemical space (99-82% overlap), as well as comparable bias and variance in constructed learning curves. All the models exhibit significant predictability with mean absolute errors (MAE) in the range of 0.10-0.18 F_ub. The models performed best for highly bound chemicals (MAE 0.07-0.12), neutrals (MAE 0.11-0.14), and acids (MAE 0.14-0.17). A consensus model had the highest accuracy across both pharmaceuticals (MAE 0.151-0.155) and environmentally relevant chemicals (MAE 0.110-0.131). The inclusion of the majority of the ToxCast test sets within the AD of the consensus model, coupled with high prediction accuracy for these chemicals, indicates the model provides a QSAR for F_ub that is broadly applicable to both pharmaceuticals and environmentally relevant chemicals.
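A bounded box of descriptor ranges, as mentioned in this abstract, can be illustrated with a few lines of code alongside the mean absolute error used to report accuracy. This is only a sketch under stated assumptions: the function names, the two-descriptor rows, and the numeric values are hypothetical, and the principal component analysis part of the authors' applicability domain assessment is not shown.

```python
def descriptor_ranges(training_rows):
    """Per-descriptor (min, max) observed over the training set."""
    return [(min(col), max(col)) for col in zip(*training_rows)]

def in_domain(row, ranges):
    """True when every descriptor of the query lies inside the training range."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(row, ranges))

def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical training descriptors: (descriptor 1, descriptor 2) per compound.
training = [(1.0, 200.0), (3.0, 350.0), (2.0, 500.0)]
box = descriptor_ranges(training)

inside = in_domain((2.5, 300.0), box)    # both descriptors within range
outside = in_domain((4.0, 300.0), box)   # first descriptor exceeds its maximum

# Hypothetical observed vs. predicted unbound fractions.
mae = mean_absolute_error([0.10, 0.50], [0.20, 0.30])
```

A query compound flagged as outside the box is an extrapolation for the model, so its prediction would normally be reported with a caveat or withheld.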
STACCATO: a novel solution to supernova photometric classification with biased training sets
NASA Astrophysics Data System (ADS)
Revsbech, E. A.; Trotta, R.; van Dyk, D. A.
2018-01-01
We present a new solution to the problem of classifying Type Ia supernovae from their light curves alone given a spectroscopically confirmed but biased training set, circumventing the need to obtain an observationally expensive unbiased training set. We use Gaussian processes (GPs) to model the supernovae's (SN's) light curves, and demonstrate that the choice of covariance function has only a small influence on the GPs ability to accurately classify SNe. We extend and improve the approach of Richards et al. - a diffusion map combined with a random forest classifier - to deal specifically with the case of biased training sets. We propose a novel method called Synthetically Augmented Light Curve Classification (STACCATO) that synthetically augments a biased training set by generating additional training data from the fitted GPs. Key to the success of the method is the partitioning of the observations into subgroups based on their propensity score of being included in the training set. Using simulated light curve data, we show that STACCATO increases performance, as measured by the area under the Receiver Operating Characteristic curve (AUC), from 0.93 to 0.96, close to the AUC of 0.977 obtained using the 'gold standard' of an unbiased training set and significantly improving on the previous best result of 0.88. STACCATO also increases the true positive rate for SNIa classification by up to a factor of 50 for high-redshift/low-brightness SNe.
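The AUC figures quoted in this abstract can be read as the probability that a randomly chosen positive (here, a Type Ia supernova) receives a higher classifier score than a randomly chosen negative. The sketch below computes AUC through that pairwise definition; the function name `roc_auc` and the toy labels and scores are assumptions, and nothing here reproduces the STACCATO pipeline itself.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    Counts a win when the positive outscores the negative and half a win
    on ties, then divides by the number of positive-negative pairs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores: 1 = Type Ia, 0 = other supernova type.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)  # 3 of the 4 positive-negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of objects, which is fine for illustration; rank-based formulas give the same value more cheaply on large samples.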
Chiddarwar, Rucha K; Rohrer, Sebastian G; Wolf, Antje; Tresch, Stefan; Wollenhaupt, Sabrina; Bender, Andreas
2017-01-01
The rapid emergence of pesticide resistance has given rise to a demand for herbicides with new modes of action (MoA). In the agrochemical sector, with the availability of experimental high throughput screening (HTS) data, it is now possible to utilize in silico target prediction methods in the early discovery phase to suggest the MoA of a compound via data mining of bioactivity data. While this approach has been established in the pharmaceutical context, in the agrochemical area it poses rather different challenges, as we have found in this work, partially due to different chemistry, but even more so due to different (usually smaller) amounts of data, and different ways of conducting HTS. With the aim of applying computational methods to facilitate herbicide target identification, 48,000 bioactivity data points against 16 herbicide targets were processed to train Laplacian modified Naïve Bayesian (NB) classification models. The herbicide target prediction model ("HerbiMod") is an ensemble of 16 binary classification models which are evaluated by internal, external and prospective validation sets. In addition to the experimental inactives, 10,000 random agrochemical inactives were included in the training process, which was shown to improve the overall balanced accuracy of our models by up to 40%. For all the models, performance in terms of balanced accuracy of ≥80% was achieved in five-fold cross validation. Ranking target predictions was addressed by means of z-scores, which improved predictivity over using raw scores alone. An external testset of 247 compounds from ChEMBL and a prospective testset of 394 compounds from BASF SE tested against five well studied herbicide targets (ACC, ALS, HPPD, PDS and PROTOX) were used for further validation.
Only 4% of the compounds in the external testset lay in the applicability domain and extrapolation (and correct prediction) was hence impossible, which on one hand was surprising, and on the other hand illustrated the utility of using applicability domains in the first place. However, performance better than 60% in balanced accuracy was achieved on the prospective testset, where all the compounds fell within the applicability domain, which hence underlines the possibility of using target prediction also in the area of agrochemicals. Copyright © 2016 Elsevier Inc. All rights reserved.
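Two quantities in this abstract lend themselves to a short worked example: balanced accuracy and z-score ranking of target predictions. The sketch below uses the standard definition of balanced accuracy (mean of sensitivity and specificity). How the authors computed their z-scores is not stated in the abstract, so standardizing a raw model score against that model's scores on a background set is purely an assumption here, as are the function names and all numbers.

```python
from statistics import mean, pstdev

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = active)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)

def z_score(raw, background):
    """Standardize a raw score against a model's background score distribution."""
    return (raw - mean(background)) / pstdev(background)

# Hypothetical imbalanced test: 2 actives, 4 inactives.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
bal_acc = balanced_accuracy(y_true, y_pred)  # 0.5 * (1/2 + 3/4)

# Hypothetical raw scores of one compound under two target models, each
# standardized against that model's own (assumed) background scores.
ranked = sorted(
    {"target_A": z_score(5.0, [1.0, 3.0]),
     "target_B": z_score(5.0, [4.0, 6.0])}.items(),
    key=lambda item: item[1], reverse=True)
```

Standardizing per model puts scores from differently calibrated classifiers on a common scale, so the same raw score of 5.0 ranks target_A well above target_B in this toy case.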
Adaptation of machine translation for multilingual information retrieval in the medical domain.
Pecina, Pavel; Dušek, Ondřej; Goeuriot, Lorraine; Hajič, Jan; Hlaváčová, Jaroslava; Jones, Gareth J F; Kelly, Liadh; Leveling, Johannes; Mareček, David; Novák, Michal; Popel, Martin; Rosa, Rudolf; Tamchyna, Aleš; Urešová, Zdeňka
2014-07-01
We investigate machine translation (MT) of user search queries in the context of cross-lingual information retrieval (IR) in the medical domain. The main focus is on techniques to adapt MT to increase translation quality; however, we also explore MT adaptation to improve effectiveness of cross-lingual IR. Our MT system is Moses, a state-of-the-art phrase-based statistical machine translation system. The IR system is based on the BM25 retrieval model implemented in the Lucene search engine. The MT techniques employed in this work include in-domain training and tuning, intelligent training data selection, optimization of phrase table configuration, compound splitting, and exploiting synonyms as translation variants. The IR methods include morphological normalization and using multiple translation variants for query expansion. The experiments are performed and thoroughly evaluated on three language pairs: Czech-English, German-English, and French-English. MT quality is evaluated on data sets created within the Khresmoi project and IR effectiveness is tested on the CLEF eHealth 2013 data sets. The search query translation results achieved in our experiments are outstanding - our systems outperform not only our strong baselines, but also Google Translate and Microsoft Bing Translator in direct comparison carried out on all the language pairs. The baseline BLEU scores increased from 26.59 to 41.45 for Czech-English, from 23.03 to 40.82 for German-English, and from 32.67 to 40.82 for French-English. This is a 55% improvement on average. In terms of the IR performance on this particular test collection, a significant improvement over the baseline is achieved only for French-English. For Czech-English and German-English, the increased MT quality does not lead to better IR results. Most of the MT techniques employed in our experiments improve MT of medical search queries. 
Especially the intelligent training data selection proves to be very successful for domain adaptation of MT. Certain improvements are also obtained from German compound splitting on the source language side. Translation quality, however, does not appear to correlate with the IR performance - better translation does not necessarily yield better retrieval. We discuss in detail the contribution of the individual techniques and state-of-the-art features and provide future research directions. Copyright © 2014 Elsevier B.V. All rights reserved.
49 CFR 232.213 - Extended haul trains.
Code of Federal Regulations, 2010 CFR
2010-10-01
..., DEPARTMENT OF TRANSPORTATION BRAKE SYSTEM SAFETY STANDARDS FOR FREIGHT AND OTHER NON-PASSENGER TRAINS AND... extended haul trains will originate and a description of the trains that will be operated as extended haul.... (5) The train shall have no more than one pick-up and one set-out en route, except for the set-out of...
Reinstatement after human feature-positive discrimination learning.
Franssen, Mathijs; Claes, Nathalie; Vervliet, Bram; Beckers, Tom; Hermans, Dirk; Baeyens, Frank
2017-04-01
In two experiments, using an online conditioned suppression task, we investigated the possibility of reinstatement of extinguished feature-target compound presentations after sequential feature-positive discrimination training in humans. Furthermore, given a hierarchical account of Pavlovian modulation (e.g., Bonardi, 1998; Bonardi and Jennings, 2009), we predicted A-US reinstatement to be stronger than US-only reinstatement. In Experiment 1, participants learned a sequential feature-positive discrimination (X→A+ | A−), which was subsequently extinguished (X→A−). During the following reinstatement phase, group US-only received US-only presentations (not signalled), group A-US received A-US presentations, and the Control group received exposure to the context, but no CSs or USs, for an equal amount of time. Reinstatement of differential X→A/A responding was observed in the US-only group but not in the Control or A-US groups. Although differential X→A/A responding was not significant in group A-US, responding to the X→A compound was significantly stronger compared to that in group US-only. Hence, it could be the case that group A-US showed stronger reinstatement, but that differential responding was abolished due to excitation gained by A. Experiment 2 was set up to circumvent the acquired excitation of A by testing transfer of the feature after A-US reinstatement to a different target, B. Participants acquired two discriminations, X→A/A and Y→B/B, of which X→A was then extinguished. Subsequently, group A-US received reinforced presentations of A during a reinstatement phase while group Control received exposure to the context. Final testing of the novel X→B compound was hypothesized to show higher responding in group A-US than in group Control, but findings of this approach were limited due to acquired equivalence and/or perceptual factors causing a secondary extinction effect. 
We conclude that we obtained clear evidence in favour of reinstatement of differential responding after human feature-positive discrimination training and subsequent compound extinction, but no evidence that A-US presentations are a stronger trigger for reinstatement than US-only presentations. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Hoffer, R. M. (Principal Investigator); Knowlton, D. J.; Dean, M. E.
1981-01-01
A set of training statistics for the 30 meter resolution simulated thematic mapper MSS data was generated based on land use/land cover classes. In addition to this supervised data set, a nonsupervised multicluster block of training statistics is being defined in order to compare the classification results and evaluate the effect of the different training selection methods on classification performance. Two test data sets, defined using a stratified sampling procedure incorporating a grid system with dimensions of 50 lines by 50 columns, and another set based on an analyst supervised set of test fields were used to evaluate the classifications of the TMS data. The supervised training data set generated training statistics, and a per point Gaussian maximum likelihood classification of the 1979 TMS data was obtained. The August 1980 MSS data was radiometrically adjusted. The SAR data was redigitized and the SAR imagery was qualitatively analyzed.
Child health in low-resource settings: pathways through UK paediatric training.
Goenka, Anu; Magnus, Dan; Rehman, Tanya; Williams, Bhanu; Long, Andrew; Allen, Steve J
2013-11-01
UK doctors training in paediatrics benefit from experience of child health in low-resource settings. Institutions in low-resource settings reciprocally benefit from hosting UK trainees. A wide variety of opportunities exist for trainees working in low-resource settings including clinical work, research and the development of transferable skills in management, education and training. This article explores a range of pathways for UK trainees to develop experience in low-resource settings. It is important for trainees to start planning a robust rationale early for global child health activities via established pathways, in the interests of their own professional development as well as UK service provision. In the future, run-through paediatric training may include core elements of global child health, as well as designated 'tracks' for those wishing to develop their career in global child health further. Hands-on experience in low-resource settings is a critical component of these training initiatives.
Storkel, Holly L; Bontempo, Daniel E; Pak, Natalie S
2014-10-01
In this study, the authors investigated adult word learning to determine how neighborhood density and practice across phonologically related training sets influence online learning from input during training versus offline memory evolution during no-training gaps. Sixty-one adults were randomly assigned to learn low- or high-density nonwords. Within each density condition, participants were trained on one set of words and then were trained on a second set of words, consisting of phonological neighbors of the first set. Learning was measured in a picture-naming test. Data were analyzed using multilevel modeling and spline regression. Steep learning during input was observed, with new words from dense neighborhoods and new words that were neighbors of recently learned words (i.e., second-set words) being learned better than other words. In terms of memory evolution, large and significant forgetting was observed during 1-week gaps in training. Effects of density and practice during memory evolution were opposite of those during input. Specifically, forgetting was greater for high-density and second-set words than for low-density and first-set words. High phonological similarity, regardless of source (i.e., known words or recent training), appears to facilitate online learning from input but seems to impede offline memory evolution.
ERIC Educational Resources Information Center
Anoka-Hennepin Technical Coll., Minneapolis, MN.
This set of two training outlines and one basic skills set list are designed for a machine tool technology program developed during a project to retrain defense industry workers at risk of job loss or dislocation because of conversion of the defense industry. The first troubleshooting training outline lists the categories of problems that develop…
Reinforced Adversarial Neural Computer for de Novo Molecular Design.
Putin, Evgeny; Asadulaev, Arip; Ivanenkov, Yan; Aladinskiy, Vladimir; Sanchez-Lengeling, Benjamin; Aspuru-Guzik, Alán; Zhavoronkov, Alex
2018-06-12
In silico modeling is a crucial milestone in modern drug design and development. Although computer-aided approaches in this field are well-studied, the application of deep learning methods in this research area is at the beginning. In this work, we present an original deep neural network (DNN) architecture named RANC (Reinforced Adversarial Neural Computer) for the de novo design of novel small-molecule organic structures based on the generative adversarial network (GAN) paradigm and reinforcement learning (RL). As a generator RANC uses a differentiable neural computer (DNC), a category of neural networks, with increased generation capabilities due to the addition of an explicit memory bank, which can mitigate common problems found in adversarial settings. The comparative results have shown that RANC trained on the SMILES string representation of the molecules outperforms its first DNN-based counterpart ORGANIC by several metrics relevant to drug discovery: the number of unique structures, passing medicinal chemistry filters (MCFs), Muegge criteria, and high QED scores. RANC is able to generate structures that match the distributions of the key chemical features/descriptors (e.g., MW, logP, TPSA) and lengths of the SMILES strings in the training data set. Therefore, RANC can be reasonably regarded as a promising starting point to develop novel molecules with activity against different biological targets or pathways. In addition, this approach allows scientists to save time and covers a broad chemical space populated with novel and diverse compounds.
Talevi, Alan; Bellera, Carolina L; Castro, Eduardo A; Bruno-Blanch, Luis E
2007-09-01
A discriminant function based on topological descriptors was derived from a training set composed of anticonvulsants in clinical use or in clinical development and compounds with other therapeutic uses. This model was internally and externally validated and applied in the virtual screening of chemical compounds from the 13th edition of the Merck Index. Methylparaben (Nipagin), a preservative widely used in food, cosmetics and pharmaceutics, was flagged as active by the discriminant function and tested in mice in the Maximal Electroshock (MES) test (i.p. administration), according to the NIH Program for Anticonvulsant Drug Development. Based on the results for methylparaben, propylparaben (Nipasol), another preservative usually used in association with the former, was also tested. Both methylparaben and propylparaben were found active in mice at doses of 30, 100, and 300 mg/kg. The discovery of the anticonvulsant activities of methylparaben and propylparaben in the MES test might be useful for the development of new anticonvulsant medications, especially considering the well-known toxicological profile of these drugs.
Muscle wasting and sarcopenia in heart failure and beyond: update 2017.
Springer, Jochen; Springer, Joshua-I; Anker, Stefan D
2017-11-01
Sarcopenia (loss of muscle mass and muscle function) is a strong predictor of frailty, disability and mortality in older persons and may also occur in obese subjects. The prevalence of sarcopenia is increased in patients suffering from chronic heart failure. However, there are currently few therapy options. The main intervention is resistance exercise, either alone or in combination with nutritional support, which seems to enhance the beneficial effects of training. Testosterone has also been shown to increase muscle power and function; however, its side effects are a possible limitation. Other investigational drugs include selective androgen receptor modulators, growth hormone, IGF-1, and compounds targeting myostatin signaling, which have their own sets of side effects. There are abundant prospective targets for improving muscle function in the elderly with or without chronic heart failure, and the continuing development of new treatment strategies and compounds for sarcopenia and cardiac cachexia makes this field an exciting one. © 2017 The Authors. ESC Heart Failure published by John Wiley & Sons Ltd on behalf of the European Society of Cardiology.
NASA Astrophysics Data System (ADS)
Talevi, Alan; Bellera, Carolina L.; Castro, Eduardo A.; Bruno-Blanch, Luis E.
2007-09-01
A discriminant function based on topological descriptors was derived from a training set composed of anticonvulsants in clinical use or in clinical development and compounds with other therapeutic uses. This model was internally and externally validated and applied in the virtual screening of chemical compounds from the 13th edition of the Merck Index. Methylparaben (Nipagin), a preservative widely used in food, cosmetics and pharmaceutics, was flagged as active by the discriminant function and tested in mice in the Maximal Electroshock (MES) test (i.p. administration), according to the NIH Program for Anticonvulsant Drug Development. Based on the results for methylparaben, propylparaben (Nipasol), another preservative usually used in association with the former, was also tested. Both methylparaben and propylparaben were found active in mice at doses of 30, 100, and 300 mg/kg. The discovery of the anticonvulsant activities of methylparaben and propylparaben in the MES test might be useful for the development of new anticonvulsant medications, especially considering the well-known toxicological profile of these drugs.
Zhai, Hong Lin; Zhai, Yue Yuan; Li, Pei Zhen; Tian, Yue Li
2013-01-21
A very simple approach to quantitative analysis is proposed based on digital image processing of three-dimensional (3D) spectra obtained by high-performance liquid chromatography coupled with a diode array detector (HPLC-DAD). As region-based shape features of a grayscale image, Zernike moments, with their inherent invariance property, were employed to establish linear quantitative models. This approach was applied to the quantitative analysis of three compounds in mixed samples using 3D HPLC-DAD spectra, and one linear model was obtained for each compound. The correlation coefficients (R²) for the training and test sets were greater than 0.999, and the statistical parameters and strict validation supported the reliability of the established models. The analytical results suggest that Zernike moments selected by stepwise regression can be used in the quantitative analysis of target compounds. Our study provides a new idea for quantitative analysis using 3D spectra, which can be extended to the analysis of other 3D spectra obtained by different methods or instruments.
Smartphone-Based System for Learning and Inferring Hearing Aid Settings
Aldaz, Gabriel; Puria, Sunil; Leifer, Larry J.
2017-01-01
Background Previous research has shown that hearing aid wearers can successfully self-train their instruments’ gain-frequency response and compression parameters in everyday situations. Combining hearing aids with a smartphone introduces additional computing power, memory, and a graphical user interface that may enable greater setting personalization. To explore the benefits of self-training with a smartphone-based hearing system, a parameter space was chosen with four possible combinations of microphone mode (omnidirectional and directional) and noise reduction state (active and off). The baseline for comparison was the “untrained system,” that is, the manufacturer’s algorithm for automatically selecting microphone mode and noise reduction state based on acoustic environment. The “trained system” first learned each individual’s preferences, self-entered via a smartphone in real-world situations, to build a trained model. The system then predicted the optimal setting (among available choices) using an inference engine, which considered the trained model and current context (e.g., sound environment, location, and time). Purpose To develop a smartphone-based prototype hearing system that can be trained to learn preferred user settings. Determine whether user study participants showed a preference for trained over untrained system settings. Research Design An experimental within-participants study. Participants used a prototype hearing system—comprising two hearing aids, Android smartphone, and body-worn gateway device—for ~6 weeks. Study Sample Sixteen adults with mild-to-moderate sensorineural hearing loss (HL) (ten males, six females; mean age = 55.5 yr). Fifteen had ≥6 mo of experience wearing hearing aids, and 14 had previous experience using smartphones. 
Intervention Participants were fitted and instructed to perform daily comparisons of settings (“listening evaluations”) through a smartphone-based software application called Hearing Aid Learning and Inference Controller (HALIC). In the four-week-long training phase, HALIC recorded individual listening preferences along with sensor data from the smartphone (including environmental sound classification, sound level, and location) to build trained models. In the subsequent two-week-long validation phase, participants performed blinded listening evaluations comparing settings predicted by the trained system (“trained settings”) to those suggested by the hearing aids’ untrained system (“untrained settings”). Data Collection and Analysis We analyzed data collected on the smartphone and hearing aids during the study. We also obtained audiometric and demographic information. Results Overall, the 15 participants with valid data significantly preferred trained settings to untrained settings (paired-samples t test). Seven participants had a significant preference for trained settings, while one had a significant preference for untrained settings (binomial test). The remaining seven participants had nonsignificant preferences. Pooling data across participants, the proportion of times that each setting was chosen in a given environmental sound class was on average very similar. However, breaking down the data by participant revealed strong and idiosyncratic individual preferences. Fourteen participants reported positive feelings of clarity, competence, and mastery when training via HALIC. Conclusions The obtained data, as well as subjective participant feedback, indicate that smartphones could become viable tools to train hearing aids. Individuals who are tech savvy and have milder HL seem well suited to take advantage of the benefits offered by training with a smartphone. PMID:27718350
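The per-participant result above relies on a binomial test of how often trained settings were chosen over untrained ones. As a hedged illustration only, an exact two-sided binomial test against chance (p = 0.5) can be sketched as follows; the function name and the counts in the example (30 of 40 choices) are hypothetical, and the abstract does not state which variant of the test the authors used.

```python
from math import comb

def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided p-value: total probability of all outcomes
    that are no more likely than the observed one."""
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    # Small relative tolerance guards against floating-point ties.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical participant: trained setting chosen in 30 of 40 blinded comparisons.
print(binomial_two_sided_p(30, 40))
```

A participant whose p-value falls below the chosen significance level would count as having a significant preference; which direction depends on whether `successes` lies above or below half of `trials`.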
Literature-based compound profiling: application to toxicogenomics.
Frijters, Raoul; Verhoeven, Stefan; Alkema, Wynand; van Schaik, René; Polman, Jan
2007-11-01
To reduce continuously increasing costs in drug development, adverse effects of drugs need to be detected as early as possible in the process. In recent years, compound-induced gene expression profiling methodologies have been developed to assess compound toxicity, including Gene Ontology term and pathway over-representation analyses. The objective of this study was to introduce an additional approach, in which literature information is used for compound profiling to evaluate compound toxicity and mode of toxicity. Gene annotations were built by text mining in Medline abstracts for retrieval of co-publications between genes, pathology terms, biological processes and pathways. This literature information was used to generate compound-specific keyword fingerprints, representing over-represented keywords calculated in a set of regulated genes after compound administration. To see whether keyword fingerprints can be used for assessment of compound toxicity, we analyzed microarray data sets of rat liver treated with 11 hepatotoxicants. Analysis of keyword fingerprints of two genotoxic carcinogens, two nongenotoxic carcinogens, two peroxisome proliferators and two randomly generated gene sets, showed that each compound produced a specific keyword fingerprint that correlated with the experimentally observed histopathological events induced by the individual compounds. By contrast, the random sets produced a flat aspecific keyword profile, indicating that the fingerprints induced by the compounds reflect biological events rather than random noise. A more detailed analysis of the keyword profiles of diethylhexylphthalate, dimethylnitrosamine and methapyrilene (MPy) showed that the differences in the keyword fingerprints of these three compounds are based upon known distinct modes of action. 
Visualization of MPy-linked keywords and MPy-induced genes in a literature network enabled us to construct a mode of toxicity proposal for MPy, which is in agreement with known effects of MPy in literature. Compound keyword fingerprinting based on information retrieved from literature is a powerful approach for compound profiling, allowing evaluation of compound toxicity and analysis of the mode of action.
Adaptive strategies for materials design using uncertainties
Balachandran, Prasanna V.; Xue, Dezhen; Theiler, James; ...
2016-01-21
Here, we compare several adaptive design strategies using a data set of 223 M2AX family of compounds for which the elastic properties [bulk (B), shear (G), and Young’s (E) modulus] have been computed using density functional theory. The design strategies are decomposed into an iterative loop with two main steps: machine learning is used to train a regressor that predicts elastic properties in terms of elementary orbital radii of the individual components of the materials; and a selector uses these predictions and their uncertainties to choose the next material to investigate. The ultimate goal is to obtain a material with desired elastic properties in as few iterations as possible. We examine how the choice of data set size, regressor and selector impact the design. We find that selectors that use information about the prediction uncertainty outperform those that don’t. Our work is a step in illustrating how adaptive design tools can guide the search for new materials with desired properties.
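The two-step loop described above (a regressor that predicts a property, and a selector that uses predictions plus uncertainties) can be sketched roughly as below. This is not the authors' code: the bootstrap ensemble of linear fits, the selector rule "mean + kappa × standard deviation", and all function names are assumptions made for illustration, and the abstract does not say which regressors or selectors were actually compared.

```python
import numpy as np

def bootstrap_predict(X_train, y_train, X_cand, n_models=50, seed=0):
    """Fit linear models on bootstrap resamples; return mean and std of predictions."""
    rng = np.random.default_rng(seed)
    A_train = np.column_stack([X_train, np.ones(len(X_train))])  # add intercept column
    A_cand = np.column_stack([X_cand, np.ones(len(X_cand))])
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X_train), size=len(X_train))
        coef, *_ = np.linalg.lstsq(A_train[idx], y_train[idx], rcond=None)
        preds.append(A_cand @ coef)
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

def select_next(mean, std, kappa=1.0):
    """Pick the candidate with the largest optimistic estimate mean + kappa * std.
    kappa = 0 ignores uncertainty and simply exploits the current prediction."""
    return int(np.argmax(mean + kappa * std))

# Hypothetical predictions for two untested candidates.
mean = np.array([1.0, 1.2])
std = np.array([0.5, 0.1])
print(select_next(mean, std, kappa=1.0), select_next(mean, std, kappa=0.0))
```

In a full loop, the selected candidate would be evaluated (for example by a new calculation), appended to the training data, and the two steps repeated.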
Adaptive strategies for materials design using uncertainties
DOE Office of Scientific and Technical Information (OSTI.GOV)
Balachandran, Prasanna V.; Xue, Dezhen; Theiler, James
Here, we compare several adaptive design strategies using a data set of 223 M2AX family of compounds for which the elastic properties [bulk (B), shear (G), and Young’s (E) modulus] have been computed using density functional theory. The design strategies are decomposed into an iterative loop with two main steps: machine learning is used to train a regressor that predicts elastic properties in terms of elementary orbital radii of the individual components of the materials; and a selector uses these predictions and their uncertainties to choose the next material to investigate. The ultimate goal is to obtain a material with desired elastic properties in as few iterations as possible. We examine how the choice of data set size, regressor and selector impact the design. We find that selectors that use information about the prediction uncertainty outperform those that don’t. Our work is a step in illustrating how adaptive design tools can guide the search for new materials with desired properties.
Alchemical and structural distribution based representation for universal quantum machine learning
NASA Astrophysics Data System (ADS)
Faber, Felix A.; Christensen, Anders S.; Huang, Bing; von Lilienfeld, O. Anatole
2018-06-01
We introduce a representation of any atom in any chemical environment for the automatized generation of universal kernel ridge regression-based quantum machine learning (QML) models of electronic properties, trained throughout chemical compound space. The representation is based on Gaussian distribution functions, scaled by power laws and explicitly accounting for structural as well as elemental degrees of freedom. The elemental components help us to lower the QML model's learning curve, and, through interpolation across the periodic table, even enable "alchemical extrapolation" to covalent bonding between elements not part of training. This point is demonstrated for the prediction of covalent binding in single, double, and triple bonds among main-group elements as well as for atomization energies in organic molecules. We present numerical evidence that resulting QML energy models, after training on a few thousand random training instances, reach chemical accuracy for out-of-sample compounds. Compound datasets studied include thousands of structurally and compositionally diverse organic molecules, non-covalently bonded protein side-chains, (H2O)40-clusters, and crystalline solids. Learning curves for QML models also indicate competitive predictive power for various other electronic ground state properties of organic molecules, calculated with hybrid density functional theory, including polarizability, heat-capacity, HOMO-LUMO eigenvalues and gap, zero point vibrational energy, dipole moment, and highest vibrational fundamental frequency.
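The models described above are built on kernel ridge regression. The sketch below shows generic Gaussian-kernel ridge regression on plain feature vectors; it does not implement the atomic representation from the abstract, and the function names, the kernel width `sigma`, and the regularizer `lam` are assumptions for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Pairwise Gaussian kernel between rows of A and rows of B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma**2))

def krr_fit(X, y, sigma=1.0, lam=1e-6):
    """Solve (K + lam * I) alpha = y for the regression weights alpha."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, sigma=1.0):
    """Predict as a kernel-weighted sum over the training instances."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

# Toy one-dimensional example standing in for a molecular property.
X = np.linspace(0.0, 3.0, 20)[:, None]
y = np.sin(X[:, 0])
alpha = krr_fit(X, y)
print(krr_predict(X, alpha, np.array([[1.5]])))
```

A learning curve of the kind mentioned in the abstract would be obtained by repeating the fit for increasing training-set sizes and recording the out-of-sample error at each size.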
Effects of treadmill running on rat gastrocnemius function following botulinum toxin A injection.
Tsai, Sen-Wei; Chen, Chun-Jung; Chen, Hsiao-Lin; Chen, Chuan-Mu; Chang, Yin-Yi
2012-02-01
Exercise can improve and maintain neural or muscular function, but the effects of exercise on physiological adaptation to paralysis caused by botulinum toxin A (BoNT-A) have not been well studied. Twenty-four rats were randomly assigned to control and treadmill groups. The rats assigned to the treadmill group were trained on a treadmill three times per week with the running speed set at 15 m/min. The duration of training was 20 min/session. Muscle strength, a nerve conduction study and the sciatic functional index (SFI) were used for functional analysis. Treadmill training improved the SFI at 2, 3, and 4 weeks (p = 0.01, 0.004, and 0.01, respectively). The maximal contraction force of the gastrocnemius muscle in the treadmill group was greater than in the control group (p < 0.05). The percentage of activated fibers was higher in the treadmill botox group than in the control botox group, as demonstrated by differences between the groups in the amplitude and area under the curve of the compound muscle action potential (CMAP) (p < 0.05). After BoNT-A injection, treadmill training improved muscle contraction strength, CMAP amplitude, and the recovery of SFI. Copyright © 2011 Orthopaedic Research Society.
Issues in the Development and Evaluation of Cross-Cultural Training in a Business Setting.
ERIC Educational Resources Information Center
Broadbooks, Wendy J.
Issues in the development and evaluation of cross-cultural training in a business setting were investigated. Cross-cultural training and cross-cultural evaluation were defined as training and evaluation of training that involve the interaction of participants from two or more different countries. Two evaluations of a management development-type…
Yamagata, Tetsuo; Zanelli, Ugo; Gallemann, Dieter; Perrin, Dominique; Dolgos, Hugues; Petersson, Carl
2017-09-01
1. We compared direct scaling, the regression model equation and the so-called "Poulin et al." methods to scale clearance (CL) from in vitro intrinsic clearance (CLint) measured in human hepatocytes using two sets of compounds: one reference set comprising 20 compounds with known elimination pathways and one external evaluation set of 17 compounds under development at Merck (MS). 2. A 90% prospective confidence interval was calculated using the reference set. This interval was found relevant for the regression equation method. The three outliers identified were justified on the basis of their elimination mechanism. 3. The direct scaling method showed a systematic underestimation of clearance in both the reference and evaluation sets. The "Poulin et al." and the regression equation methods showed no obvious bias in either the reference or evaluation sets. 4. The regression model equation was slightly superior to the "Poulin et al." method in the reference set, with a better absolute average fold error (AAFE) of 1.3 compared to 1.6. A larger difference was observed in the evaluation set, where the regression method and "Poulin et al." resulted in AAFEs of 1.7 and 2.6, respectively (after removing the three compounds with known issues mentioned above). A similar pattern was observed for the correlation coefficient. Based on these data, we suggest the regression equation method combined with a prospective confidence interval as the first choice for the extrapolation of human in vivo hepatic metabolic clearance from in vitro systems.
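The abstract compares methods by their absolute average fold error (AAFE) but does not spell out the formula. A commonly used definition is AAFE = 10^(mean |log10(predicted / observed)|), so that 1.0 means perfect agreement and 2.0 means predictions are off by a factor of two on average. The sketch below assumes that definition; the function name and the clearance values are hypothetical.

```python
import math

def aafe(predicted, observed):
    """Absolute average fold error: 10 ** mean(|log10(predicted / observed)|)."""
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))

# Hypothetical clearance values: one twofold over-prediction, one twofold under-prediction.
print(aafe([20.0, 5.0], [10.0, 10.0]))
```

Because the absolute value is taken before averaging, over- and under-predictions do not cancel, which is why AAFE measures precision rather than bias.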
Optimization of genomic selection training populations with a genetic algorithm
USDA-ARS's Scientific Manuscript database
In this article, we derive a computationally efficient statistic to measure the reliability of estimates of genetic breeding values for a fixed set of genotypes based on a given training set of genotypes and phenotypes. We adopt a genetic algorithm scheme to find a training set of certain size from ...
Product identification techniques used as training aids for analytical chemists
NASA Technical Reports Server (NTRS)
Grillo, J. P.
1968-01-01
Laboratory staff assistants are trained to use data and observations of routine product analyses performed by experienced analytical chemists when analyzing compounds for potential toxic hazards. Commercial products are used as examples in teaching the analytical approach to unknowns.
Predicting cytotoxicity from heterogeneous data sources with Bayesian learning.
Langdon, Sarah R; Mulgrew, Joanna; Paolini, Gaia V; van Hoorn, Willem P
2010-12-09
We collected data from over 80 different cytotoxicity assays from Pfizer in-house work as well as from public sources and investigated the feasibility of using these datasets, which come from a variety of assay formats (having for instance different measured endpoints, incubation times and cell types) to derive a general cytotoxicity model. Our main aim was to derive a computational model based on this data that can highlight potentially cytotoxic series early in the drug discovery process. We developed Bayesian models for each assay using Scitegic FCFP_6 fingerprints together with the default physical property descriptors. Pairs of assays that are mutually predictive were identified by calculating the ROC score of the model derived from one predicting the experimental outcome of the other, and vice versa. The prediction pairs were visualised in a network where nodes are assays and edges are drawn for ROC scores >0.60 in both directions. We observed that, if assay pairs (A, B) and (B, C) were mutually predictive, this was often not the case for the pair (A, C). The results from 48 assays connected to each other were merged in one training set of 145590 compounds and a general cytotoxicity model was derived. The model has been cross-validated as well as being validated with a set of 89 FDA approved drug compounds. We have generated a predictive model for general cytotoxicity which could speed up the drug discovery process in multiple ways. Firstly, this analysis has shown that the outcomes of different assay formats can be mutually predictive, thus removing the need to submit a potentially toxic compound to multiple assays. Furthermore, this analysis enables selection of (a) the easiest-to-run assay as corporate standard, or (b) the most descriptive panel of assays by including assays whose outcomes are not mutually predictive. 
The model is no replacement for a cytotoxicity assay but opens the opportunity to be more selective about which compounds are to be submitted to it. On a more mundane level, having data from more than 80 assays in one dataset answers, for the first time, the question - "what are the known cytotoxic compounds from the Pfizer compound collection?" Finally, having a predictive cytotoxicity model will assist the design of new compounds with a desired cytotoxicity profile, since comparison of the model output with data from an in vitro safety/toxicology assay suggests one is predictive of the other.
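The network construction described above (an edge between two assays only when each assay's model predicts the other with a ROC score above 0.60) can be sketched as follows. The dictionary layout, the function name, and the ROC values are hypothetical; only the 0.60 threshold and the both-directions rule come from the abstract.

```python
from itertools import combinations

def mutual_edges(roc, threshold=0.60):
    """Return assay pairs that predict each other above `threshold`.
    roc[(a, b)] is the ROC score of the model built on assay a predicting assay b."""
    assays = sorted({name for pair in roc for name in pair})
    return {
        (a, b)
        for a, b in combinations(assays, 2)
        if roc.get((a, b), 0.0) > threshold and roc.get((b, a), 0.0) > threshold
    }

# Hypothetical scores: A-B and B-C are mutually predictive, A-C is not.
roc = {
    ("A", "B"): 0.70, ("B", "A"): 0.65,
    ("B", "C"): 0.68, ("C", "B"): 0.62,
    ("A", "C"): 0.55, ("C", "A"): 0.71,
}
print(sorted(mutual_edges(roc)))
```

The toy scores reproduce the non-transitivity noted in the abstract: A and C are each linked to B, yet no A-C edge is drawn because the A-to-C score falls below the threshold.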
Training a whole-book LSTM-based recognizer with an optimal training set
NASA Astrophysics Data System (ADS)
Soheili, Mohammad Reza; Yousefi, Mohammad Reza; Kabir, Ehsanollah; Stricker, Didier
2018-04-01
Despite recent progress in OCR technologies, whole-book recognition is still a challenging task, particularly for old and historical books, where unknown font faces and low-quality paper and print add to the challenge. Pre-trained recognizers and generic methods therefore do not usually perform up to the required standards, and performance usually degrades for larger-scale recognition tasks such as a whole book. Methods with reportedly low error rates turn out to require a great deal of manual correction. Generally, such methodologies do not make effective use of concepts such as redundancy in whole-book recognition. In this work, we propose to train Long Short-Term Memory (LSTM) networks on a minimal training set obtained from the book to be recognized. We show that by clustering all the sub-words in the book and using the sub-word cluster centers as the training set for the LSTM network, we can train models that outperform any identical network trained on randomly selected pages of the book. In our experiments, we also show that although the sub-word cluster centers are equivalent to about 8 pages of text for a 101-page book, an LSTM network trained on such a set performs competitively with an identical network trained on a set of 60 randomly selected pages of the book.
Lietz, Arthur C.; Meyer, Michael T.
2006-01-01
The Comprehensive Everglades Restoration Plan has identified highly treated wastewater as a possible water source for the restoration of natural water flows and hydroperiods in selected coastal areas, including the Biscayne Bay coastal wetlands. One potential source of reclaimed wastewater for the Biscayne Bay coastal wetlands is the effluent from the South District Wastewater Treatment Plant in southern Miami-Dade County. The U.S. Geological Survey, in cooperation with the Comprehensive Everglades Restoration Plan Wastewater Reuse Technology Pilot Project Delivery Team, initiated a study to assess the presence of emerging contaminants of concern in the South District Wastewater Treatment Plant influent and effluent using current wastewater-treatment methods. As part of the study, 24-hour composite and discrete samples were collected at six locations (influent at plants 1 and 2, effluent pump, reuse train, chlorine dioxide unit, and ultraviolet pilot unit) at the plant during: (1) a dry-season, low-flow event on March 2-3, 2004, with an average inflow rate of 83.7 million gallons per day; (2) a wet-season, average-flow event on July 20-21, 2004, with an average inflow rate of 89.7 million gallons per day; and (3) high-rate disinfection tests on October 5 and 20, 2004, with average flow rates of 84.1 and 119.6 million gallons per day, respectively. During these four sampling events, 26, 27, 29, and 35 constituents were detected, respectively. 
The following transformations in concentration were determined in the waste stream: -100 to 180 percent at the effluent pump and -100 to 85 percent at the reuse train on March 2-3, 2004, and -100 to 1,609 percent at the effluent pump and -100 to 832 percent at the reuse train on July 20-21, 2004; -100 to -37 percent at the effluent pump, -100 to -62 percent at the reuse train, -100 to -56 percent at the chlorine dioxide unit, and -100 to -40 percent at the ultraviolet pilot unit on October 5, 2004; and -100 to -4 percent at the effluent pump, -100 to 17 percent at the reuse train, -100 to -40 percent at the chlorine dioxide unit, and -100 to -14 percent at the ultraviolet pilot unit on October 20, 2004. Samples were tested for detection of household and industrial (organic) wastewater compounds, pharmaceutical compounds, antibiotic compounds, and hormones in influent. Two 'known' endocrine-disrupting compounds (17 beta-estradiol (E2) and diethoxynonylphenol) and four 'suspected' endocrine-disrupting compounds (1,4-dichlorobenzene, benzophenone, tris(2-chloroethyl) phosphate, and tris(dichloroisopropyl) phosphate) were detected during these sampling events. Phenanthrene and indole showed the greatest concentration ranges and highest concentrations for the organic wastewater compounds. Acetaminophen showed the greatest concentration range and highest concentration, and warfarin showed the smallest concentration range for the pharmaceutical compounds. Sulfamethoxazole (a sulfonamide) showed the greatest concentration range and highest concentration, and sulfathiazole (also a sulfonamide) showed the smallest concentration range for the antibiotic compounds. Two hormones, 17 beta-estradiol (E2) and estrone (E1), were detected in influent. Samples were also tested for detection of organic wastewater compounds, pharmaceutical compounds, antibiotic compounds, and hormones in effluent. 
Indole showed the greatest concentration range and highest concentration, and triphenyl phosphate showed the smallest concentration range for the organic wastewater compounds. Dehydronifedipine showed the greatest concentration range and highest concentration, and warfarin had the smallest concentration range for the pharmaceutical compounds. Anhydro-erythromycin (a macrolide degradation product) showed the greatest concentration range, and sulfadiazine (a sulfonamide) and tetracycline showed the lowest concentration ranges for the antibiotic compounds. One hormone, 17 beta-estradiol (E2), was det
Timon, Rafael; Collado-Mateo, Daniel; Olcina, Guillermo; Gusi, Narcis
2016-03-01
Previous studies have demonstrated positive effects of acute vibration exercise on concentric strength and power, but few have observed the effects of vibration exposure on resistance training. The aim of this study was to verify the effects of whole body vibration applied to the chest via hands on bench press resistance training in trained and untrained individuals. Nineteen participants (10 recreationally trained bodybuilders and 9 untrained students) performed two randomized sessions of resistance training on separate days. Each strength session consisted of 3 bench press sets with a load of 75% 1RM to failure in each set, with 2 minutes' rest between sets. All subjects performed the same strength training either with 30 seconds of vibration exposure (12 Hz, 4 mm) immediately before each bench press set or without vibration. Number of total repetitions, kinematic parameters, blood lactate and perceived exertion were analyzed. In the untrained group, vibration exposure caused a significant increase in the mean velocity (from 0.36±0.02 to 0.39±0.03 m/s) and acceleration (from 0.75±0.10 to 0.86±0.09 m/s²), as well as a decrease in perceived effort (from 8±0.57 to 7.35±0.47) in the first bench press set, but no change was observed in the third bench press set. In the recreationally trained bodybuilders, vibration exposure did not cause any improvement in the performance of bench press resistance training. These results suggest that vibration exposure applied just before the bench press exercise could be a good practice to be implemented by untrained individuals in resistance training.
Morrow, S A; Bates, P E
1987-01-01
This study examined the effectiveness of three sets of school-based instructional materials and community training on acquisition and generalization of a community laundry skill by nine students with severe handicaps. School-based instruction involved artificial materials (pictures), simulated materials (cardboard replica of a community washing machine), and natural materials (modified home model washing machine). Generalization assessments were conducted at two different community laundromats, on two machines represented fully by the school-based instructional materials and two machines not represented fully by these materials. After three phases of school-based instruction, the students were provided ten community training trials in one laundromat setting and a final assessment was conducted in both the trained and untrained community settings. A multiple probe design across students was used to evaluate the effectiveness of the three types of school instruction and community training. After systematic training, most of the students increased their laundry performance with all three sets of school-based materials; however, generalization of these acquired skills was limited in the two community settings. Direct training in one of the community settings resulted in more efficient acquisition of the laundry skills and enhanced generalization to the untrained laundromat setting for most of the students. Results of this study are discussed in regard to the issue of school versus community-based instruction and recommendations are made for future research in this area.
Naik, Pradeep K; Santoshi, Seneha; Joshi, Harish C
2012-01-01
We have identified a new class of microtubule-binding compounds, noscapinoids, that alter microtubule dynamics at stoichiometric concentrations without affecting tubulin polymer mass. Noscapinoids show great promise as chemotherapeutic agents for the treatment of human cancers. To investigate the structural determinants of noscapinoids responsible for anti-cancer activity, we tested 36 structurally diverse noscapinoids in human acute lymphoblastic leukemia cells (CEM). The IC(50) values of these noscapinoids vary from 1.2 to 56.0 μM. Pharmacophore models of anti-cancer activity were generated that identify two hydrogen bond acceptors, two aromatic rings, two hydrophobic groups, and one positively charged group as essential structural features. Additionally, an atom-based quantitative structure-activity relationship (QSAR) model was developed that gave a statistically satisfying result (R(2) = 0.912, Q(2) = 0.908, Pearson R = 0.951) and effectively predicts the anti-cancer activity of training and test set compounds. The pharmacophore model presented here is well supported by electronic property analysis using density functional theory at B3LYP/3-21*G level. Molecular electrostatic potential, particularly localization of negative potential near oxygen atoms of the dimethoxy isobenzofuranone ring of active compounds, matched the hydrogen bond acceptor feature of the generated pharmacophore. Our results further reveal that all active compounds have smaller lowest unoccupied molecular orbital (LUMO) energies concentrated over the dimethoxy isobenzofuranone ring, azido group, and nitro group, which is indicative of the electron acceptor capacity of the compounds. Results obtained from this study will be useful in the efficient design and development of more active noscapinoids.
ERIC Educational Resources Information Center
Frost, Jørgen; Ottem, Ernst; Hagtvet, Bente E.; Snow, Catherine E.
2016-01-01
In the present study, 81 Norwegian students were taught the meaning of words by the Word Generation (WG) method and 51 Norwegian students were taught by an approach inspired by the Thinking Schools (TS) concept. Two sets of words were used: a set of words to be trained and a set of non-trained control words. The two teaching methods yielded no…
ERIC Educational Resources Information Center
Rahyuda, Agoes Ganesha; Soltani, Ebrahim; Syed, Jawad
2018-01-01
Based on a review of the literature on post-training transfer interventions, this paper offers a conceptual model that elucidates potential mechanisms through which two types of post-training transfer intervention (relapse prevention and proximal plus distal goal setting) influence the transfer of training. We explain how the application of…
McConnell, Bridget L.; Urushihara, Kouji; Miller, Ralph R.
2009-01-01
Three conditioned suppression experiments with rats investigated contrasting predictions made by the extended comparator hypothesis and acquisition-focused models of learning, specifically, modified SOP and the revised Rescorla-Wagner model, concerning retrospective revaluation. Two target cues (X and Y) were partially reinforced using a stimulus relative validity design (i.e., AX-Outcome/ BX-No outcome/ CY-Outcome/ DY-No outcome), and subsequently one of the companion cues for each target was extinguished in compound (BC-No outcome). In Experiment 1, which used spaced trials for relative validity training, greater suppression was observed to target cue Y for which the excitatory companion cue had been extinguished relative to target cue X for which the nonexcitatory companion cue had been extinguished. Experiment 2 replicated these results in a sensory preconditioning preparation. Experiment 3 massed the trials during relative validity training, and the opposite pattern of data was observed. The results are consistent with the predictions of the extended comparator hypothesis. Furthermore, this set of experiments is unique in being able to differentiate between these models without invoking higher-order comparator processes. PMID:20141324
Comparison of molecular breeding values based on within- and across-breed training in beef cattle.
Kachman, Stephen D; Spangler, Matthew L; Bennett, Gary L; Hanford, Kathryn J; Kuehn, Larry A; Snelling, Warren M; Thallman, R Mark; Saatchi, Mahdi; Garrick, Dorian J; Schnabel, Robert D; Taylor, Jeremy F; Pollak, E John
2013-08-16
Although the efficacy of genomic predictors based on within-breed training looks promising, it is necessary to develop and evaluate across-breed predictors for the technology to be fully applied in the beef industry. The efficacies of genomic predictors trained in one breed and utilized to predict genetic merit in differing breeds based on simulation studies have been reported, as have the efficacies of predictors trained using data from multiple breeds to predict the genetic merit of purebreds. However, comparable studies using beef cattle field data have not been reported. Molecular breeding values for weaning and yearling weight were derived and evaluated using a database containing BovineSNP50 genotypes for 7294 animals from 13 breeds in the training set and 2277 animals from seven breeds (Angus, Red Angus, Hereford, Charolais, Gelbvieh, Limousin, and Simmental) in the evaluation set. Six single-breed and four across-breed genomic predictors were trained using pooled data from purebred animals. Molecular breeding values were evaluated using field data, including genotypes for 2227 animals and phenotypic records of animals born in 2008 or later. Accuracies of molecular breeding values were estimated based on the genetic correlation between the molecular breeding value and trait phenotype. With one exception, the estimated genetic correlations of within-breed molecular breeding values with trait phenotype were greater than 0.28 when evaluated in the breed used for training. Most estimated genetic correlations for the across-breed trained molecular breeding values were moderate (> 0.30). When molecular breeding values were evaluated in breeds that were not in the training set, estimated genetic correlations clustered around zero. Even for closely related breeds, within- or across-breed trained molecular breeding values have limited prediction accuracy for breeds that were not in the training set. 
For breeds in the training set, across- and within-breed trained molecular breeding values had similar accuracies. The benefit of adding data from other breeds to a within-breed training population is the ability to produce molecular breeding values that are more robust across breeds and these can be utilized until enough training data has been accumulated to allow for a within-breed training set.
Mendiburu, Andrés Z; de Carvalho, João A; Coronado, Christian R
2015-03-21
Estimation of the lower flammability limits of C-H compounds at 25 °C and 1 atm; at moderate temperatures and in presence of diluent was the objective of this study. A set of 120 C-H compounds was divided into a correlation set and a prediction set of 60 compounds each. The absolute average relative error for the total set was 7.89%; for the correlation set, it was 6.09%; and for the prediction set it was 9.68%. However, it was shown that by considering different sources of experimental data the values were reduced to 6.5% for the prediction set and to 6.29% for the total set. The method showed consistency with Le Chatelier's law for binary mixtures of C-H compounds. When tested for a temperature range from 5 °C to 100 °C, the absolute average relative errors were 2.41% for methane; 4.78% for propane; 0.29% for iso-butane and 3.86% for propylene. When nitrogen was added, the absolute average relative errors were 2.48% for methane; 5.13% for propane; 0.11% for iso-butane and 0.15% for propylene. When carbon dioxide was added, the absolute relative errors were 1.80% for methane; 5.38% for propane; 0.86% for iso-butane and 1.06% for propylene. Copyright © 2014 Elsevier B.V. All rights reserved.
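Two quantities in the abstract above lend themselves to a short worked example: Le Chatelier's mixing rule for the lower flammability limit (LFL) of a fuel mixture, and the absolute average relative error used to score predictions. The sketch below assumes the common textbook form of the rule, LFL_mix = 1 / Σ(x_i / LFL_i), with x_i the mole fractions on a fuel-only basis. The methane and propane LFL values are illustrative textbook figures, not data from the study, and the function names are hypothetical.

```python
def le_chatelier_lfl(fuel_fractions, pure_lfls):
    """LFL (vol %) of a fuel mixture from Le Chatelier's mixing rule.

    fuel_fractions: mole fractions of each fuel, summing to 1 (fuel-only basis).
    pure_lfls: LFL of each pure fuel in vol %.
    """
    if abs(sum(fuel_fractions) - 1.0) > 1e-9:
        raise ValueError("fuel mole fractions must sum to 1")
    return 1.0 / sum(x / lfl for x, lfl in zip(fuel_fractions, pure_lfls))

def absolute_average_relative_error(predicted, experimental):
    """Mean of |predicted - experimental| / experimental, in percent."""
    errors = [abs(p - e) / e for p, e in zip(predicted, experimental)]
    return 100.0 * sum(errors) / len(errors)

# Illustrative values only: methane ~5.0 vol %, propane ~2.1 vol %.
mixture_lfl = le_chatelier_lfl([0.5, 0.5], [5.0, 2.1])  # about 2.96 vol %
```

An equimolar methane/propane mixture therefore sits closer to the propane limit, as expected from the harmonic-mean form of the rule.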
Effects of draught load exercise and training on calcium homeostasis in horses.
Vervuert, I; Coenen, M; Zamhöfer, J
2005-01-01
This study was conducted to investigate the effects of draught load exercise on calcium (Ca) homeostasis in young horses. Five 2-year-old untrained Standardbred horses were studied in a 4-month training programme. All exercise workouts were performed on a treadmill at a 6% incline and with a constant draught load of 40 kg (0.44 kN). The training programme started with a standardized exercise test (SET 1; six incremental steps of 5 min duration each, first step 1.38 m/s, stepwise increase by 0.56 m/s). A training programme was then initiated which consisted of low-speed exercise sessions (LSE; constant velocity at 1.67 m/s for 60 min, 48 training sessions in total). After the 16th and 48th LSE sessions, SETs (SET 2: middle of training period, SET 3: finishing training period) were performed again under the identical test protocol of SET 1. Blood samples for blood lactate, plasma total Ca, blood ionized calcium (Ca(2+)), blood pH, plasma inorganic phosphorus (P(i)) and plasma intact parathyroid hormone (PTH) were collected before, during and after SETs, and before and after the first, 16th, 32nd and 48th LSE sessions. During SETs there was a decrease in ionized Ca(2+) and a rise in lactate, P(i) and intact PTH. The LSEs resulted in an increase in pH and P(i), whereas lactate, ionized Ca(2+), total Ca and intact PTH were not affected. No changes in Ca metabolism were detected in the course of training. Results of this study suggest that the type of exercise influences Ca homeostasis and intact PTH response, but that these effects are not influenced in the course of the training period.
Atmospheric Chemistry of Micrometeoritic Organic Compounds
NASA Technical Reports Server (NTRS)
Kress, M. E.; Belle, C. L.; Pevyhouse, A. R.; Iraci, L. T.
2011-01-01
Micrometeorites approximately 100 µm in diameter deliver most of the Earth's annual accumulation of extraterrestrial material. These small particles are so strongly heated upon atmospheric entry that most of their volatile content is vaporized. Here we present preliminary results from two sets of experiments to investigate the fate of the organic fraction of micrometeorites. In the first set of experiments, 300 µm particles of a CM carbonaceous chondrite were subject to flash pyrolysis, simulating atmospheric entry. In addition to CO and CO2, many organic compounds were released, including functionalized benzenes, hydrocarbons, and small polycyclic aromatic hydrocarbons. In the second set of experiments, we subjected two of these compounds to conditions that simulate the heterogeneous chemistry of Earth's upper atmosphere. We find evidence that meteor-derived compounds can follow reaction pathways leading to the formation of more complex organic compounds.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Helguera, Aliuska Morales; Molecular Simulation and Drug Design, Chemical Bioactive Center, Central University of Las Villas, Santa Clara, 54830, Villa Clara; Department of Chemistry, Central University of Las Villas, Santa Clara, 54830, Villa Clara
2008-09-01
In this work, Quantitative Structure-Activity Relationship (QSAR) modelling was used as a tool for predicting the carcinogenic potency of a set of 39 nitroso-compounds, which have been bioassayed in male rats by using the oral route of administration. The optimum QSAR model provided evidence of good fit and predictivity from the training set. It was able to account for about 84% of the variance in the experimental activity and exhibited high values of the determination coefficients of cross validations, leave one out and bootstrapping (q²(LOO) = 78.53 and q²(Boot) = 74.97). Such a model was based on spectral moments weighted with Gasteiger-Marsili atomic charges, polarizability and hydrophobicity, as well as with Abraham indexes, specifically the summation solute hydrogen bond basicity and the combined dipolarity/polarizability. This is the first study to have explored the possibility of combining Abraham solute descriptors with spectral moments. A reasonable interpretation of these molecular descriptors from a toxicological point of view was achieved by means of taking into account bond contributions. The set of relationships so derived revealed the importance of the length of the alkyl chains for determining carcinogenic potential of the chemicals analysed, and were able to explain the difference between mono-substituted and di-substituted nitrosoureas as well as to discriminate between isomeric structures with hydroxyl-alkyl and alkyl substituents in different positions. Moreover, they allowed the recognition of structural alerts in classical structures of two potent nitrosamines, consistent with their biotransformation. These results indicate that this new approach has the potential for improving carcinogenicity predictions based on the identification of structural alerts.
Marrero-Ponce, Yovani; Khan, Mahmud Tareq Hassan; Casañola-Martín, Gerardo M; Ather, Arjumand; Sultankhodzhaev, Mukhlis N; García-Domenech, Ramón; Torrens, Francisco; Rotondo, Richard
2007-04-01
In this paper, we present a new set of bond-level TOMOCOMD-CARDD molecular descriptors (MDs), the bond-based bilinear indices, based on a bilinear map similar to those defined in linear algebra. These novel MDs are used here in Quantitative Structure-Activity Relationship (QSAR) studies of tyrosinase inhibitors, for finding functions that discriminate between the tyrosinase inhibitor compounds and inactive ones. In total 14 models were obtained and the best two discriminant functions (Eqs. 32 and 33) showed good overall classification of 91.00% and 90.17%, respectively, in the training set. The test set had accuracies of 93.33% and 88.89% for models 32 and 33, respectively. A simulated virtual screening was also carried out to prove the quality of the determined models. In a final step, the fitted models were used in the biosilico identification of newly synthesized tetraketones, where a good agreement could be observed between the theoretical and experimental results. Four of the novel bioactive chemicals discovered as tyrosinase inhibitors, TK10 (IC(50) = 2.09 µM), TK11 (IC(50) = 2.61 µM), TK21 (IC(50) = 2.06 µM) and TK23 (IC(50) = 3.19 µM), showed more potent activity than L-mimosine (IC(50) = 3.68 µM). In addition, a heterogeneous database of tyrosinase inhibitors was collected for this study, and could be a useful tool for scientists working in tyrosinase research. The current report could offer some clues for the identification of new chemicals that inhibit the enzyme tyrosinase, for entry into the drug discovery pipeline.
Team Training and Retention of Skills Acquired Above Real Time Training on a Flight Simulator
NASA Technical Reports Server (NTRS)
Ali, Syed Friasat; Guckenberger, Dutch; Crane, Peter; Rossi, Marcia; Williams, Mayard; Williams, Jason; Archer, Matt
2000-01-01
Above Real-Time Training (ARTT) is the training acquired on a real time simulator when it is modified to present events at a faster pace than normal. The experiments related to training of pilots performed by NASA engineers (Kolf in 1973, Hoey in 1976) and others (Guckenberger, Crane and their associates in the nineties) have shown that in comparison with the real time training (RTT), ARTT provides the following benefits: increased rate of skill acquisition, reduced simulator and aircraft training time, and more effective training for emergency procedures. Two sets of experiments have been performed; they were reported at professional conferences and the respective papers are included in this report. The retention of effects of ARTT has been studied in the first set of experiments and the use of ARTT as top-off training has been examined in the second set of experiments. In ARTT, the pace of events was 1.5 times the pace in RTT. In both sets of experiments, university students were trained to perform an aerial gunnery task. The training unit was equipped with a joystick and a throttle. The student acted as a nose gunner in a hypothetical two place attack aircraft. The flight simulation software was installed on a Universal Distributed Interactive Simulator platform supplied by ECC International of Orlando, Florida. In the first set of experiments, two training programs, RTT or ARTT, were used. Students were then tested in real time on more demanding scenarios: either immediately after training or two days later. The effects of ARTT did not decrease over a two day retention interval and ARTT was more time efficient than real time training. Therefore, equal test performance could be achieved with less clock-time spent in the simulator. In the second set of experiments, three training programs, RTT, ARTT or RARTT, were used. In RTT, students received 36 minutes of real time training. In ARTT, students received 36 minutes of above real time training. 
In RARTT, students received 18 minutes of real time training and 18 minutes of above real time training as top-off training. Students were then tested in real time on more demanding scenarios. The use of ARTT as top-off training after RTT offered better training than RTT alone or ARTT alone. It is, however, suggested that a similar experiment be conducted on a relatively more complex task with a larger sample of participants. Within the proposed duration of the research effort, the setting up of experiments and trial runs on using ARTT for team training were also scheduled but they could not be accomplished due to extraordinary challenges faced in developing the required software configuration. Team training is, however, scheduled in a future study sponsored by NASA at Tuskegee University.
NASA Astrophysics Data System (ADS)
Ma, Lei; Cheng, Liang; Li, Manchun; Liu, Yongxue; Ma, Xiaoxue
2015-04-01
Unmanned Aerial Vehicles (UAVs) have been used increasingly for natural resource applications in recent years due to their greater availability and the miniaturization of sensors. In addition, Geographic Object-Based Image Analysis (GEOBIA) has received more attention as a novel paradigm for remote sensing earth observation data. However, GEOBIA generates some new problems compared with pixel-based methods. In this study, we developed a strategy for the semi-automatic optimization of object-based classification, which involves an area-based accuracy assessment that analyzes the relationship between scale and the training set size. We found that the Overall Accuracy (OA) increased as the training set ratio (proportion of the segmented objects used for training) increased when the Segmentation Scale Parameter (SSP) was fixed. The OA increased more slowly as the training set ratio became larger, and a similar rule was obtained for pixel-based image analysis. The OA decreased as the SSP increased when the training set ratio was fixed. Consequently, the SSP should not be too large during classification using a small training set ratio. By contrast, a large training set ratio is required if classification is performed using a high SSP. In addition, we suggest that the optimal SSP for each class has a high positive correlation with the mean area obtained by manual interpretation, which can be summarized by a linear correlation equation. We expect that these results will be applicable to UAV imagery classification to determine the optimal SSP for each class.
Luilo, G B; Cabaniss, S E
2011-10-01
Chlorinating water which contains dissolved organic matter (DOM) produces disinfection byproducts, the majority of unknown structure. Hence, the total organic halide (TOX) measurement is used as a surrogate for toxic disinfection byproducts. This work derives a robust quantitative structure-property relationship (QSPR) for predicting the TOX formation potential of model compounds. Literature data for 49 compounds were used to train the QSPR in moles of chlorine per mole of compound (Cp) (mol-Cl/mol-Cp). The resulting QSPR has four descriptors, calibration [Formula: see text] of 0.72 and standard deviation of estimation of 0.43 mol-Cl/mol-Cp. Internal and external validation indicate that the QSPR has good predictive power and low bias (<1%). Applying this QSPR to predict TOX formation by DOM surrogates - tannic acid, two model fulvic acids and two agent-based model assemblages - gave a predicted TOX range of 136-184 µg-Cl/mg-C, consistent with experimental data for DOM, which ranged from 78 to 192 µg-Cl/mg-C. However, the limited structural variation in the training data may limit QSPR applicability; studies of more sulfur-containing compounds, heterocyclic compounds and high molecular weight compounds could lead to a more widely applicable QSPR.
Systematic review of skills transfer after surgical simulation-based training.
Dawe, S R; Pena, G N; Windsor, J A; Broeders, J A J L; Cregan, P C; Hewett, P J; Maddern, G J
2014-08-01
Simulation-based training assumes that skills are directly transferable to the patient-based setting, but few studies have correlated simulated performance with surgical performance. A systematic search strategy was undertaken to find studies published since the last systematic review, published in 2007. Inclusion of articles was determined using a predetermined protocol, independent assessment by two reviewers and a final consensus decision. Studies that reported on the use of surgical simulation-based training and assessed the transferability of the acquired skills to a patient-based setting were included. Twenty-seven randomized clinical trials and seven non-randomized comparative studies were included. Fourteen studies investigated laparoscopic procedures, 13 endoscopic procedures and seven other procedures. These studies provided strong evidence that participants who reached proficiency in simulation-based training performed better in the patient-based setting than their counterparts who did not have simulation-based training. Simulation-based training was equally as effective as patient-based training for colonoscopy, laparoscopic camera navigation and endoscopic sinus surgery in the patient-based setting. These studies strengthen the evidence that simulation-based training, as part of a structured programme and incorporating predetermined proficiency levels, results in skills transfer to the operative setting. © 2014 BJS Society Ltd. Published by John Wiley & Sons Ltd.
Hebisz, Rafal; Borkowski, Jacek; Zatoń, Marek
2016-01-01
The aim of this study was to determine differences in glycolytic metabolite concentrations and work output in response to an all-out interval training session in 23 cyclists with at least 2 years of interval training experience (E) and those inexperienced (IE) in this form of training. The intervention involved subsequent sets of maximal intensity exercise on a cycle ergometer. Each set comprised four 30 s repetitions interspersed with 90 s recovery periods; sets were repeated when blood pH returned to 7.3. Measurements of post-exercise hydrogen (H+) and lactate ion (LA-) concentrations and work output were taken. The experienced cyclists performed significantly more sets of maximal efforts than the inexperienced athletes (5.8 ± 1.2 vs. 4.3 ± 0.9 sets, respectively). Work output decreased in each subsequent set in the IE group and only in the last set in the E group. Distribution of power output changed only in the E group; power decreased in the initial repetitions of set only to increase in the final repetitions. H+ concentration decreased in the third, penultimate, and last sets in the E group and in each subsequent set in the IE group. LA- decreased in the last set in both groups. In conclusion, the experienced cyclists were able to repeatedly induce elevated levels of lactic acidosis. Power output distribution changed with decreased acid–base imbalance. In this way, this group could compensate for a decreased anaerobic metabolism. The above factors allowed cyclists experienced in interval training to perform more sets of maximal exercise without a decrease in power output compared with inexperienced cyclists. PMID:28149346
An accelerated training method for back propagation networks
NASA Technical Reports Server (NTRS)
Shelton, Robert O. (Inventor)
1993-01-01
The principal objective is to provide a training procedure for a feed forward, back propagation neural network which greatly accelerates the training process. A set of orthogonal singular vectors is determined from the input matrix such that the standard deviations of the projections of the input vectors along these singular vectors, as a set, are substantially maximized, thus providing an optimal means of presenting the input data. Novelty exists in the method of extracting, from the set of input data, a set of features which can serve to represent the input data in a simplified manner, thus greatly reducing the time and expense of training the system.
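Finding a direction along which the projections of the input vectors have maximal standard deviation can be sketched with plain power iteration on the covariance matrix of the inputs. This is one plausible reading of the description above, not the patented procedure itself; the function names and the toy data are assumptions made for illustration, and only the single leading direction is computed.

```python
import math

def leading_direction(rows, iterations=200):
    """Return (column means, unit vector of maximal projection variance).

    Uses power iteration on the covariance matrix of `rows`. The fixed
    starting vector is an arbitrary choice; it must not be orthogonal to
    the leading direction for the iteration to converge to it.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all inputs identical: no direction to find
            break
        v = [x / norm for x in w]
    return means, v

def project(row, means, direction):
    """Coordinate of a centered input vector along `direction`."""
    return sum((x - m) * u for x, m, u in zip(row, means, direction))

# Toy inputs lying on the line y = 2x, so the leading direction is (1, 2)/sqrt(5).
inputs = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
means, direction = leading_direction(inputs)
features = [project(row, means, direction) for row in inputs]
```

Feeding `features` rather than the raw inputs to a network is the kind of simplified representation the abstract refers to; further orthogonal directions could be obtained by deflating the covariance matrix and repeating.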
Janak, Patricia H; Bowers, M Scott; Corbit, Laura H
2012-03-01
Drug abstinence is frequently compromised when addicted individuals are re-exposed to environmental stimuli previously associated with drug use. Research with human addicts and in animal models has demonstrated that extinction learning (non-reinforced cue-exposure) can reduce the capacity of such stimuli to induce relapse, yet extinction therapies have limited long-term success under real-world conditions (Bouton, 2002; O'Brien, 2008). We hypothesized that enhancing extinction would reduce the later ability of drug-predictive cues to precipitate drug-seeking behavior. We, therefore, tested whether compound stimulus presentation and pharmacological treatments that augment noradrenergic activity (atomoxetine; norepinephrine reuptake inhibitor) during extinction training would facilitate the extinction of drug-seeking behaviors, thus reducing relapse. Rats were trained that the presentation of a discrete cue signaled that a lever press response would result in cocaine reinforcement. Rats were subsequently extinguished and spontaneous recovery of drug-seeking behavior following presentation of previously drug-predictive cues was tested 4 weeks later. We find that compound stimulus presentations or pharmacologically increasing noradrenergic activity during extinction training results in less future recovery of responding, whereas propranolol treatment reduced the benefit seen with compound stimulus presentation. These data may have important implications for understanding the biological basis of extinction learning, as well as for improving the outcome of extinction-based therapies.
NASA Astrophysics Data System (ADS)
Samala, Ravi K.; Chan, Heang-Ping; Hadjiiski, Lubomir; Helvie, Mark A.; Richter, Caleb; Cha, Kenny
2018-02-01
We propose a cross-domain, multi-task transfer learning framework to transfer knowledge learned from non-medical images by a deep convolutional neural network (DCNN) to a medical image recognition task while improving the generalization by multi-task learning of auxiliary tasks. A first-stage cross-domain transfer learning was initiated from an ImageNet-trained DCNN to a mammography-trained DCNN. A total of 19,632 regions-of-interest (ROI) from 2,454 mass lesions were collected from two imaging modalities, digitized-screen film mammography (SFM) and full-field digital mammography (DM), and split into training and test sets. In the multi-task transfer learning, the DCNN learned the mass classification task simultaneously from the training set of SFM and DM. The best transfer network for mammography was selected from three transfer networks with different numbers of convolutional layers frozen. The performance of single-task and multi-task transfer learning on an independent SFM test set in terms of the area under the receiver operating characteristic curve (AUC) was 0.78 ± 0.02 and 0.82 ± 0.02, respectively. In the second-stage cross-domain transfer learning, a set of 12,680 ROIs from 317 mass lesions on digital breast tomosynthesis (DBT) were split into validation and independent test sets. We first studied the data requirements for the first-stage mammography-trained DCNN by varying the mammography training data from 1% to 100% and evaluated its learning on the DBT validation set in inference mode. We found that the entire available mammography set provided the best generalization. The DBT validation set was then used to train only the last four fully connected layers, resulting in an AUC of 0.90 ± 0.04 on the independent DBT test set.
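The AUC figures quoted above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes that quantity directly by pairwise comparison. The function name, scores, and labels are hypothetical and are not taken from the study, which does not state how its AUC values or their uncertainties were computed.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie: half credit
    return wins / (len(pos) * len(neg))

# Hypothetical classifier outputs: 1 = malignant, 0 = benign.
example = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; a rank-based formulation gives the same value more efficiently on large test sets.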
Large Dataset of Acute Oral Toxicity Data Created for Testing ...
Acute toxicity data is a common requirement for substance registration in the US. Currently only data derived from animal tests are accepted by regulatory agencies, and the standard in vivo tests use lethality as the endpoint. Non-animal alternatives such as in silico models are being developed due to animal welfare and resource considerations. We compiled a large dataset of oral rat LD50 values to assess the predictive performance of currently available in silico models. Our dataset combines LD50 values from five different sources: literature data provided by The Dow Chemical Company, REACH data from eChemportal, HSDB (Hazardous Substances Data Bank), RTECS data from Leadscope, and the training set underpinning TEST (Toxicity Estimation Software Tool). Combined, these data sources yield 33848 chemical-LD50 pairs (data points), with 23475 unique data points covering 16439 compounds. The entire dataset was loaded into a chemical properties database. All of the compounds were registered in DSSTox and 59.5% have publicly available structures. Compounds without a structure in DSSTox are currently having their structures registered. The structural data will be used to evaluate the predictive performance and applicable chemical domains of three QSAR models (TIMES, PROTOX, and TEST). Future work will combine the dataset with information from ToxCast assays and, using random forest modeling, assess whether ToxCast assays are useful in predicting acute oral toxicity. Pre
Ahmadi, Mehdi; Nowroozi, Amin; Shahlaei, Mohsen
2015-09-01
The P2X purinoceptor 7 (P2X7R) is a trimeric ATP-activated ion channel gated by extracellular ATP. P2X7R has an important role in numerous diseases including pain, neurodegeneration, and inflammatory diseases such as rheumatoid arthritis and osteoarthritis. From this perspective, the discovery of small-molecule inhibitors for P2X7R as a novel therapeutic target has received considerable attention in recent years. First, the 3D structure of P2X7R was built by using homology modeling (HM) and a 50 ns molecular dynamics simulation (MDS). A ligand-based quantitative pharmacophore model of P2X7R antagonists was then developed based on a training set of 49 compounds. The best four-feature pharmacophore model, which includes two hydrophobic aromatic, one hydrophobic and one aromatic ring features, has the highest correlation coefficient (0.874), cost difference (368.677), and a low RMSD (2.876), and it shows a high goodness of fit and enrichment factor. Consequently, some hit compounds were introduced as final candidates by employing virtual screening and molecular docking simultaneously. Among these compounds, six molecules were identified as potential virtual leads which, as such or upon further optimization, can be used to design novel P2X7R inhibitors. Copyright © 2015 Elsevier Inc. All rights reserved.
Broeckling, Corey D.; Ganna, Andrea; Layer, Mark; ...
2016-09-08
Liquid chromatography coupled to electrospray ionization-mass spectrometry (LC–ESI-MS) is a versatile and robust platform for metabolomic analysis. However, while ESI is a soft ionization technique, in-source phenomena including multimerization, nonproton cation adduction, and in-source fragmentation complicate interpretation of MS data. Here, we report chromatographic and mass spectrometric behavior of 904 authentic standards collected under conditions identical to a typical nontargeted profiling experiment. The data illustrate that the often high level of complexity in MS spectra is likely to result in misinterpretation during the annotation phase of the experiment and a large overestimation of the number of compounds detected. However, our analysis of this MS spectral library data indicates that in-source phenomena are not random but depend at least in part on chemical structure. These nonrandom patterns enabled predictions to be made as to which in-source signals are likely to be observed for a given compound. Using the authentic standard spectra as a training set, we modeled the in-source phenomena for all compounds in the Human Metabolome Database to generate a theoretical in-source spectrum and retention time library. A novel spectral similarity matching platform was developed to facilitate efficient spectral searching for nontargeted profiling applications. Taken together, this collection of experimental spectral data, predictive modeling, and informatic tools enables more efficient, reliable, and transparent metabolite annotation.
Broeckling, Corey D.; Ganna, Andrea; Layer, Mark; ...
2016-08-25
Liquid chromatography coupled to electrospray ionization-mass spectrometry (LC–ESI-MS) is a versatile and robust platform for metabolomic analysis. However, while ESI is a soft ionization technique, in-source phenomena including multimerization, nonproton cation adduction, and in-source fragmentation complicate interpretation of MS data. Here, we report chromatographic and mass spectrometric behavior of 904 authentic standards collected under conditions identical to a typical nontargeted profiling experiment. The data illustrate that the often high level of complexity in MS spectra is likely to result in misinterpretation during the annotation phase of the experiment and a large overestimation of the number of compounds detected. However, our analysis of this MS spectral library data indicates that in-source phenomena are not random but depend at least in part on chemical structure. These nonrandom patterns enabled predictions to be made as to which in-source signals are likely to be observed for a given compound. Using the authentic standard spectra as a training set, we modeled the in-source phenomena for all compounds in the Human Metabolome Database to generate a theoretical in-source spectrum and retention time library. A novel spectral similarity matching platform was developed to facilitate efficient spectral searching for nontargeted profiling applications. Taken together, this collection of experimental spectral data, predictive modeling, and informatic tools enables more efficient, reliable, and transparent metabolite annotation.
NASA Astrophysics Data System (ADS)
Rivera, J. D.; Moraes, B.; Merson, A. I.; Jouvel, S.; Abdalla, F. B.; Abdalla, M. C. B.
2018-07-01
We perform an analysis of photometric redshifts estimated by using non-representative training sets in magnitude space. We use the ANNz2 and GPz algorithms to estimate the photometric redshift both in simulations and in real data from the Sloan Digital Sky Survey (DR12). We show that for the representative case, the results obtained by using both algorithms have the same quality, using either magnitudes or colours as input. In order to reduce the errors when estimating the redshifts with a non-representative training set, we perform the training in colour space. We estimate the quality of our results by using a mock catalogue which is split into samples by cuts in the r band in the range 19.4 < r < 20.8. We obtain slightly better results with GPz on single-point z-phot estimates in the complete training set case; however, the photometric redshifts estimated with the ANNz2 algorithm allow us to obtain mildly better results in deeper r-band cuts when estimating the full redshift distribution of the sample in the incomplete training set case. By using a cumulative distribution function and a Monte Carlo process, we manage to define a photometric estimator which fits well the spectroscopic distribution of galaxies in the mock testing set, but with a larger scatter. To complete this work, we perform an analysis of the impact on the detection of clusters via density of galaxies in a field by using the photometric redshifts obtained with a non-representative training set.
NASA Astrophysics Data System (ADS)
Rivera, J. D.; Moraes, B.; Merson, A. I.; Jouvel, S.; Abdalla, F. B.; Abdalla, M. C. B.
2018-04-01
We perform an analysis of photometric redshifts estimated by using non-representative training sets in magnitude space. We use the ANNz2 and GPz algorithms to estimate the photometric redshift both in simulations and in real data from the Sloan Digital Sky Survey (DR12). We show that for the representative case, the results obtained by using both algorithms have the same quality, using either magnitudes or colours as input. In order to reduce the errors when estimating the redshifts with a non-representative training set, we perform the training in colour space. We estimate the quality of our results by using a mock catalogue which is split into samples by cuts in the r band in the range 19.4 < r < 20.8. We obtain slightly better results with GPz on single-point z-phot estimates in the complete training set case; however, the photometric redshifts estimated with the ANNz2 algorithm allow us to obtain mildly better results in deeper r-band cuts when estimating the full redshift distribution of the sample in the incomplete training set case. By using a cumulative distribution function and a Monte Carlo process, we manage to define a photometric estimator which fits well the spectroscopic distribution of galaxies in the mock testing set, but with a larger scatter. To complete this work, we perform an analysis of the impact on the detection of clusters via density of galaxies in a field by using the photometric redshifts obtained with a non-representative training set.
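Why training in colour space can help when the training set is non-representative in magnitude can be seen from a small sketch. It assumes that colours are defined as differences between adjacent SDSS bands (u, g, r, i, z), which is a common convention but is not stated in the abstract. The magnitudes and function names below are hypothetical.

```python
SDSS_BANDS = ("u", "g", "r", "i", "z")

def colours(magnitudes):
    """Differences between adjacent bands: u-g, g-r, r-i, i-z."""
    return [magnitudes[a] - magnitudes[b]
            for a, b in zip(SDSS_BANDS, SDSS_BANDS[1:])]

def in_r_band_cut(magnitudes, r_min=19.4, r_max=20.8):
    """True if the object falls inside the r-band range quoted in the abstract."""
    return r_min < magnitudes["r"] < r_max

# Hypothetical galaxy, and the same galaxy made one magnitude fainter in every band.
bright = {"u": 22.0, "g": 21.0, "r": 20.5, "i": 20.25, "z": 20.0}
faint = {band: mag + 1.0 for band, mag in bright.items()}

same_colours = colours(bright) == colours(faint)
```

The fainter object falls outside the r-band cut yet has identical colours, so a model trained on colours still sees inputs like it, whereas a model trained on magnitudes would have to extrapolate.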
Data Programming: Creating Large Training Sets, Quickly.
Ratner, Alexander; De Sa, Christopher; Wu, Sen; Selsam, Daniel; Ré, Christopher
2016-12-01
Large labeled training sets are the critical building blocks of supervised learning methods and are key enablers of deep learning techniques. For some applications, creating labeled training sets is the most time-consuming and expensive part of applying machine learning. We therefore propose a paradigm for the programmatic creation of training sets called data programming in which users express weak supervision strategies or domain heuristics as labeling functions, which are programs that label subsets of the data, but that are noisy and may conflict. We show that by explicitly representing this training set labeling process as a generative model, we can "denoise" the generated training set, and establish theoretically that we can recover the parameters of these generative models in a handful of settings. We then show how to modify a discriminative loss function to make it noise-aware, and demonstrate our method over a range of discriminative models including logistic regression and LSTMs. Experimentally, on the 2014 TAC-KBP Slot Filling challenge, we show that data programming would have led to a new winning score, and also show that applying data programming to an LSTM model leads to a TAC-KBP score almost 6 F1 points over a state-of-the-art LSTM baseline (and into second place in the competition). Additionally, in initial user studies we observed that data programming may be an easier way for non-experts to create machine learning models when training data is limited or unavailable.
Data Programming: Creating Large Training Sets, Quickly
Ratner, Alexander; De Sa, Christopher; Wu, Sen; Selsam, Daniel; Ré, Christopher
2018-01-01
Large labeled training sets are the critical building blocks of supervised learning methods and are key enablers of deep learning techniques. For some applications, creating labeled training sets is the most time-consuming and expensive part of applying machine learning. We therefore propose a paradigm for the programmatic creation of training sets called data programming in which users express weak supervision strategies or domain heuristics as labeling functions, which are programs that label subsets of the data, but that are noisy and may conflict. We show that by explicitly representing this training set labeling process as a generative model, we can “denoise” the generated training set, and establish theoretically that we can recover the parameters of these generative models in a handful of settings. We then show how to modify a discriminative loss function to make it noise-aware, and demonstrate our method over a range of discriminative models including logistic regression and LSTMs. Experimentally, on the 2014 TAC-KBP Slot Filling challenge, we show that data programming would have led to a new winning score, and also show that applying data programming to an LSTM model leads to a TAC-KBP score almost 6 F1 points over a state-of-the-art LSTM baseline (and into second place in the competition). Additionally, in initial user studies we observed that data programming may be an easier way for non-experts to create machine learning models when training data is limited or unavailable. PMID:29872252
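The notion of labeling functions that are noisy and may conflict can be made concrete with a minimal sketch. The three heuristics, the spouse-relation task, and the label convention (+1, -1, 0 for abstain) are hypothetical illustrations. The unweighted vote used here is only a simple stand-in: the paper instead fits a generative model that estimates how accurate each labeling function is.

```python
ABSTAIN = 0  # assumed convention: +1 positive, -1 negative, 0 no opinion

def lf_mentions_wife(text):
    """Heuristic: the word 'wife' suggests a spouse relation."""
    return 1 if "wife" in text else ABSTAIN

def lf_mentions_brother(text):
    """Heuristic: the word 'brother' suggests a non-spouse family relation."""
    return -1 if "brother" in text else ABSTAIN

def lf_married(text):
    """Heuristic: the word 'married' suggests a spouse relation."""
    return 1 if "married" in text else ABSTAIN

LABELING_FUNCTIONS = [lf_mentions_wife, lf_mentions_brother, lf_married]

def weak_label(text):
    """Combine the votes by unweighted sum; returns +1, -1, or 0 when tied or silent."""
    votes = sum(lf(text) for lf in LABELING_FUNCTIONS)
    return (votes > 0) - (votes < 0)

labels = [weak_label(t) for t in (
    "Bob and his wife got married in 2010",
    "her brother Bob",
    "Alice married Bob, and his brother attended",
)]
```

The third sentence shows a conflict: one function votes positive, another negative, and the unweighted vote cannot resolve it. Learning per-function accuracies, as the abstract describes, is what lets such conflicts be settled in favour of the more reliable heuristic.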
Randall, Patrick A; Cannady, Reginald; Besheer, Joyce
2016-08-01
Nicotine and alcohol co-use is highly prevalent, and as such, individuals experience the interoceptive effects of both substances together. Therefore, examining sensitivity to a compound nicotine and alcohol (N + A) interoceptive cue is critical to broaden our understanding of mechanisms that may contribute to nicotine and alcohol co-use. This work assessed the ability of a N + A interoceptive cue to gain control over goal-tracking behavior and determined the effects of the α4β2 nicotinic partial agonist and smoking cessation compound varenicline on sensitivity to N + A. Two groups of male Long Evans rats were trained to discriminate N + A (0.4 mg/kg nicotine + 1 g/kg alcohol, intragastric gavage (IG)) from water under two different training conditions using a Pavlovian drug discrimination task. The effects of varenicline (0, 1, 3 mg/kg, intraperitoneally (IP)) administered alone and on sensitivity to N + A and the components were determined. Under both training conditions, N + A rapidly gained control over behavior, with a greater contribution of nicotine to the N + A compound cue. Varenicline fully substituted for the N + A training dose, and varenicline (1 mg/kg) enhanced sensitivity to the lowest N + A dose (0.1 N + 0.1 A). Given the high selectivity of varenicline for the α4β2 receptor, this finding suggests a functional role for α4β2 nicotinic acetylcholine receptors (nAChRs) in modulating sensitivity to N + A. The N + A compound cue is a unique cue that is modulated, in part, by activity at the α4β2 nAChR. These findings advance understanding of the interoceptive effects of nicotine and alcohol in combination and may have implications in relation to their co-use.
Shelton, Keith L.
2009-01-01
Rationale: Because the toxicity of many inhalants precludes evaluation in humans, drug discrimination, an animal model of subjective effects, can be used to gain insights on their poorly understood abuse-related effects. Objectives: The purpose of the present study was to train a prototypic inhalant that has known abuse liability, 1,1,1-trichloroethane (TCE), as a discriminative stimulus in mice and compare it to other classes of inhalants. Methods: Eight B6SJLF1/J mice were trained to discriminate 10 min of exposure to 12000 ppm inhaled TCE vapor from air and seven mice were trained to discriminate 4000 ppm TCE from air. Tests were then conducted to characterize the discriminative stimulus of TCE and to compare it to representative aromatic and chlorinated hydrocarbon vapors, volatile halogenated anesthetics as well as an odorant compound. Results: Only the 12000 ppm TCE versus air discrimination group exhibited sufficient discrimination accuracy for substitution testing. TCE vapor concentration- and exposure time-dependently substituted for the 12000 ppm TCE vapor training stimulus. Full substitution was produced by trichloroethylene, toluene, enflurane and sevoflurane. Varying degrees of partial substitution were produced by the other volatile test compounds. The odorant, 2-butanol, did not produce any substitution for TCE. Conclusions: The discriminative stimulus effects of TCE are shared fully or partially by chlorinated and aromatic hydrocarbons as well as by halogenated volatile anesthetics. However, these compounds can be differentiated from TCE both quantitatively and qualitatively. It appears that the degree of similarity is not solely a function of chemical classification but may also be dependent upon the neurochemical effects of the individual compounds. PMID:18972104
Potas, Jason Robert; de Castro, Newton Gonçalves; Maddess, Ted; de Souza, Marcio Nogueira
2015-01-01
Experimental electrophysiological assessment of evoked responses from regenerating nerves is challenging due to the typical complex response of events dispersed over various latencies and poor signal-to-noise ratio. Our objective was to automate the detection of compound action potential events and derive their latencies and magnitudes using a simple cross-correlation template comparison approach. For this, we developed an algorithm called Waveform Similarity Analysis. To test the algorithm, challenging signals were generated in vivo by stimulating sural and sciatic nerves, whilst recording evoked potentials at the sciatic nerve and tibialis anterior muscle, respectively, in animals recovering from sciatic nerve transection. Our template for the algorithm was generated based on responses evoked from the intact side. We also simulated noisy signals and examined the output of the Waveform Similarity Analysis algorithm with imperfect templates. Signals were detected and quantified using Waveform Similarity Analysis, which was compared to event detection, latency and magnitude measurements of the same signals performed by a trained observer, a process we called Trained Eye Analysis. The Waveform Similarity Analysis algorithm could successfully detect and quantify simple or complex responses from nerve and muscle compound action potentials of intact or regenerated nerves. Incorrectly specifying the template outperformed Trained Eye Analysis for predicting signal amplitude, but produced consistent latency errors for the simulated signals examined. Compared to the trained eye, Waveform Similarity Analysis is automatic, objective, does not rely on the observer to identify and/or measure peaks, and can detect small clustered events even when signal-to-noise ratio is poor. 
Waveform Similarity Analysis provides a simple, reliable and convenient approach to quantify latencies and magnitudes of complex waveforms and therefore serves as a useful tool for studying evoked compound action potentials in neural regeneration studies. PMID:26325291
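A cross-correlation template comparison of the kind described above can be sketched as follows. This is a generic normalized cross-correlation, not the published Waveform Similarity Analysis algorithm, whose details the abstract does not give. The function names, the template, and the signal are hypothetical.

```python
import math

def normalized_xcorr(signal, template):
    """Pearson correlation between the template and each same-length signal window."""
    m = len(template)
    t_mean = sum(template) / m
    t_dev = [t - t_mean for t in template]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    scores = []
    for start in range(len(signal) - m + 1):
        window = signal[start:start + m]
        w_mean = sum(window) / m
        w_dev = [w - w_mean for w in window]
        w_norm = math.sqrt(sum(d * d for d in w_dev))
        if t_norm == 0.0 or w_norm == 0.0:
            scores.append(0.0)  # flat window or template: correlation undefined
        else:
            dot = sum(a * b for a, b in zip(t_dev, w_dev))
            scores.append(dot / (t_norm * w_norm))
    return scores

def best_match(signal, template):
    """Return (latency in samples, similarity score) of the best-matching window."""
    scores = normalized_xcorr(signal, template)
    latency = max(range(len(scores)), key=scores.__getitem__)
    return latency, scores[latency]

# Hypothetical biphasic template and a recording containing it at twice the amplitude.
template = [0.0, 1.0, -1.0, 0.0]
recording = [0.0, 0.0, 0.0, 0.0, 2.0, -2.0, 0.0, 0.0, 0.0]
latency, score = best_match(recording, template)
```

Because the score is normalized, it measures similarity of shape independently of amplitude, so the doubled response still matches with a score near 1. The magnitude of a detected event would have to be estimated separately, for example by scaling the template to the matched window.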
1974-08-31
Procedures and techniques for compounding syrups, collodion, waters, spirits, liniments. Use and maintenance of automatic liquid prepacker... liniments, glycerites, elixirs. Use and maintenance of automatic liquid prepacker. Competency: PHARMACY TECHNICIAN (PHT), Unit II: Compounding
Coercion, Conformity, and Kids from the Waco Cult.
ERIC Educational Resources Information Center
Gilliam, Bobby; Daniels, Jack Kyle
1997-01-01
Offers an account of firsthand experiences with children from the Branch Davidian compound in Waco, Texas. Details how the children, who were taken to a Methodist Children's Home following release from the Davidian Compound, exhibited evidence of martial training, severe physical punishment, and sexual abuse. Narrates the children's rapid…
Sirsat, Sujata A; Kim, Kawon; Gibson, Kristen E; Crandall, Phillip G; Ricke, Steven C; Neal, Jack A
2014-03-05
Cross contamination of foodborne pathogens in the retail environment is a significant public health issue contributing to an increased risk for foodborne illness. Ready-to-eat (RTE) processed foods such as deli meats, cheese, and in some cases fresh produce, have been involved in foodborne disease outbreaks due to contamination with pathogens such as Listeria monocytogenes. With respect to L. monocytogenes, deli slicers are often the main source of cross contamination. The goal of this study was to use a fluorescent compound to simulate bacterial contamination and track this contamination in a retail setting. A mock deli kitchen was designed to simulate the retail environment. Deli meat was inoculated with the fluorescent compound and volunteers were recruited to complete a set of tasks similar to those expected of a food retail employee. The volunteers were instructed to slice, package, and store the meat in a deli refrigerator. The potential cross contamination was tracked in the mock retail environment by swabbing specific areas and measuring the optical density of the swabbed area with a spectrophotometer. The results indicated that the refrigerator (i.e. deli case) grip and various areas on the slicer had the highest risk for cross contamination. The results of this study may be used to develop more focused training material for retail employees. In addition, similar methodologies could also be used to track microbial contamination in food production environments (e.g. small farms), hospitals, nursing homes, cruise ships, and hotels.
Task Analysis of Tactical Leadership Skills for Bradley Infantry Fighting Vehicle Leaders
1986-10-01
The Bradley Leader Trainer is conceptualized as a device or set of devices that can be used to teach Bradley leaders to perform their full set of...experts. The task list was examined to determine critical training requirements, requirements for training device support of this training, and...
Tate, C.M.; Heiny, J.S.
1996-01-01
Bed-sediment and fish-tissue samples were collected in the South Platte River Basin to determine the occurrence and distribution of organochlorine compounds in the basin. During August-November 1992 and August 1993, bed sediment (23 sites) and fish tissue (subset of 19 sites) were sampled and analyzed for 32 organochlorine compounds in bed sediment and 27 compounds in fish tissue. More types of organochlorine compounds were detected in fish tissue than in bed sediment. Total DDT, p,p′-DDE, o,p′-DDE, p,p′-DDD, total PCB, Dacthal®, dieldrin, cis-chlordane, cis-nonachlor, trans-nonachlor, and p,p′-DDT were detected in fish tissue at >25% of the sites; p,p′-DDE, total DDT, cis-chlordane, and trans-chlordane were detected in bed sediment at >25% of the sites. Organochlorine concentrations in bed sediment and fish tissue were related to land-use settings. Few organochlorine compounds were detected at minimally impacted sites located in rangeland, forest, and built-up land-use settings. Chlordane-related compounds and p,p′-methoxychlor in bed sediment and fish tissue, endrin in fish tissue, and endosulfan I in bed sediment were associated with urban and mixed (urban and agricultural) sites. Dacthal® in bed sediment and fish tissue was associated with agricultural sites. The compounds HCB, ??-HCH, PCA, and toxaphene were detected only at mixed land-use sites. Although DDT and DDT-metabolites, dieldrin, and total PCB were detected in urban, mixed, and agricultural land-use settings, highest mean concentrations were detected at mixed land-use sites. Mixed land-use sites had the greatest number of organochlorine compounds detected in fish tissue, whereas urban and mixed sites had the greatest number of organochlorine compounds detected in bed sediment. 
Measuring concentrations of organochlorine compounds in bed sediment and fish tissue at the same site offers a more complete picture of the persistence of organochlorine compounds in the environment and their relation to land-use settings.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Noor, Fozia; Niklas, Jens; Mueller-Vieira, Ursula
2009-06-01
Efficient and accurate safety assessment of compounds is extremely important in the preclinical development of drugs, especially when hepatotoxicity is in question. Multiparameter and time-resolved assays are expected to greatly improve the prediction of toxicity by assessing complex mechanisms of toxicity. An integrated approach is presented in which Hep G2 cells and primary rat hepatocytes are compared in frequently used cytotoxicity assays for parent compound toxicity. The interassay variability was determined. The cytotoxicity assays were also compared with a reliable alternative time-resolved respirometric assay. The set of training compounds consisted of well-known hepatotoxins: amiodarone, carbamazepine, clozapine, diclofenac, tacrine, troglitazone and verapamil. The sensitivity of both cell systems in each tested assay was determined. Results show that careful selection of assay parameters and inclusion of a kinetic time-resolved assay improves prediction for non-metabolism-mediated toxicity using Hep G2 cells, as indicated by a sensitivity ratio of 1. The drugs with EC50 values of 100 µM or lower were considered toxic. The difference in the sensitivity of the two cell systems to carbamazepine, which causes toxicity via reactive metabolites, emphasizes the importance of human cell-based in vitro assays. Using the described system, primary rat hepatocytes do not offer an advantage over the Hep G2 cells in parent compound toxicity evaluation. Moreover, the respiration method is non-invasive, highly sensitive and allows following the time course of toxicity. The respiration assay could serve as an early indicator of changes that subsequently lead to toxicity.
Gogoi, Dhrubajyoti; Baruah, Vishwa Jyoti; Chaliha, Amrita Kashyap; Kakoti, Bibhuti Bhushan; Sarma, Diganta; Buragohain, Alak Kumar
2016-12-21
Human epidermal growth factor receptor 2 (HER2) is one of the four members of the epidermal growth factor receptor (EGFR) family and is expressed to facilitate cellular proliferation across various tissue types. Therapies targeting HER2, which is a transmembrane glycoprotein with tyrosine kinase activity, offer promising prospects especially in breast and gastric/gastroesophageal cancer patients. Persistence of both primary and acquired resistance to various routine drugs/antibodies is a disappointing outcome in the treatment of many HER2-positive cancer patients and is a challenge that requires new and improved strategies to overcome it. Identification of novel HER2 inhibitors with an improved therapeutic index was performed with a highly correlating (r=0.975) ligand-based pharmacophore model (Hypo1) in this study. Hypo1 was generated from a training set of 22 compounds with HER2 inhibitory activity, and this well-validated hypothesis was subsequently used as a 3D query to screen compounds in a total of four databases, of which two were natural product databases. Further, these compounds were analyzed for compliance with Veber's drug-likeness rule and optimum ADMET parameters. The selected compounds were then subjected to molecular docking and Density Functional Theory (DFT) analysis to discern their molecular interactions at the active site of HER2. The findings thus presented would be an important starting point towards the development of novel HER2 inhibitors using well-validated computational techniques. Copyright © 2016 Elsevier Ltd. All rights reserved.
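The Veber filter mentioned above is commonly stated as two thresholds: at most 10 rotatable bonds and a polar surface area of at most 140 Å². The sketch below applies that form of the rule to precomputed descriptors. The compound names and descriptor values are hypothetical, and in practice the descriptors would come from a cheminformatics toolkit rather than being typed in by hand.

```python
def passes_veber(rotatable_bonds, polar_surface_area):
    """Veber's rule: <= 10 rotatable bonds and polar surface area <= 140 square angstroms."""
    return rotatable_bonds <= 10 and polar_surface_area <= 140.0

# Hypothetical screening hits: (name, rotatable bonds, polar surface area in A^2).
hits = [
    ("hit_A", 6, 95.2),    # passes both criteria
    ("hit_B", 12, 80.0),   # too flexible
    ("hit_C", 4, 162.5),   # too polar
]

retained = [name for name, rot, psa in hits if passes_veber(rot, psa)]
```

Only compounds meeting both criteria are retained for the later docking stage. The study's own thresholds and its additional ADMET criteria are not given in the abstract.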
Principles to Consider in Defining New Directions in Internal Medicine Training and Certification
Turner, Barbara J; Centor, Robert M; Rosenthal, Gary E
2006-01-01
SGIM endorses seven principles related to current thinking about internal medicine training: 1) internal medicine requires a full three years of residency training before subspecialization; 2) internal medicine residency programs must dramatically increase support for training in the ambulatory setting and offer equivalent opportunities for training in both inpatient and outpatient medicine; 3) in settings where adequate support and time are devoted to ambulatory training, the third year of residency could offer an opportunity to develop further expertise or mastery in a specific type or setting of care; 4) further certification in specific specialties within internal medicine requires the completion of an approved fellowship program; 5) areas of mastery in internal medicine can be demonstrated through modified board certification and recertification examinations; 6) certification processes throughout internal medicine should focus increasingly on demonstration of clinical competence through adherence to validated standards of care within and across practice settings; and 7) regardless of the setting in which General Internists practice, we should unite to promote the critical role that this specialty serves in patient care. PMID:16637826
NASA Technical Reports Server (NTRS)
Decker, Arthur J.
2001-01-01
Artificial neural networks have been used for a number of years to process holography-generated characteristic patterns of vibrating structures. This technology depends critically on the selection and the conditioning of the training sets. A scaling operation called folding is discussed for conditioning training sets optimally for training feed-forward neural networks to process characteristic fringe patterns. Folding allows feed-forward nets to be trained easily to detect damage-induced vibration-displacement-distribution changes as small as 10 nm. A specific application to aerospace of neural-net processing of characteristic patterns is presented to motivate the conditioning and optimization effort.
Borysko, Petro; Moroz, Yurii S; Vasylchenko, Oleksandr V; Hurmach, Vasyl V; Starodubtseva, Anastasia; Stefanishena, Natalia; Nesteruk, Kateryna; Zozulya, Sergey; Kondratov, Ivan S; Grygorenko, Oleksandr O
2018-05-09
A combined approach of fragment screening and "SAR by catalog" was used for the discovery of bromodomain-containing protein 4 (BRD4) inhibitors. Initial screening of a 3695-fragment library against bromodomain 1 of BRD4 using thermal shift assay (TSA), followed by initial hit validation, resulted in 73 fragment hits, which were used to construct a follow-up library selected from the available screening collection. Additionally, analogs of inactive fragments, as well as a set of randomly selected compounds, were also prepared (3 × 3200 compounds in total). Screening of the resulting sets using TSA, followed by re-testing at several concentrations, counter-screen, and TR-FRET assay resulted in 18 confirmed hits. Compounds derived from the initial fragment set showed a better hit rate than the other two sets. Finally, building dose-response curves revealed three compounds with IC50 = 1.9-7.4 μM. For these compounds, binding sites and conformations in BRD4 (4UYD) have been determined by docking. Copyright © 2018 Elsevier Ltd. All rights reserved.
Comparison of in silico models for prediction of mutagenicity.
Bakhtyari, Nazanin G; Raitano, Giuseppa; Benfenati, Emilio; Martin, Todd; Young, Douglas
2013-01-01
Using a dataset with more than 6000 compounds, the performance of eight quantitative structure-activity relationship (QSAR) models was evaluated: ACD/Tox Suite, Absorption, Distribution, Metabolism, Elimination, and Toxicity of chemical substances (ADMET) predictor, Derek, Toxicity Estimation Software Tool (T.E.S.T.), TOxicity Prediction by Komputer Assisted Technology (TOPKAT), Toxtree, CAESAR, and SARpy (SAR in python). In general, the results showed a high level of performance. To have a realistic estimate of the predictive ability, the results for chemicals inside and outside the training set for each model were considered. The effect of applicability domain tools (when available) on the prediction accuracy was also evaluated. The predictive tools included QSAR models, knowledge-based systems, and a combination of both methods. Models based on statistical QSAR methods gave better results.
Hurst, J
2005-01-01
Continuous professional development (CPD) in caring for people with kidney disease is limited in some regions of the UK and within Europe generally. This is compounded for all by limited resources for course fees and the lack of study leave granted away from the clinical area for full-time courses. This is set against recommendations from National and European governments, and renal clinical guidelines concerning expectations of CPD and clinical competency levels of renal nurses (1-4). In the past renal practitioners have been trained in all areas of the renal speciality by Schools of Nursing linked to renal units based in large teaching hospitals. However, more recent changes in the structure of Health Care provision have led in some instances to a rationalising of post registration education delivery.
BUMPER v1.0: a Bayesian user-friendly model for palaeo-environmental reconstruction
NASA Astrophysics Data System (ADS)
Holden, Philip B.; Birks, H. John B.; Brooks, Stephen J.; Bush, Mark B.; Hwang, Grace M.; Matthews-Bird, Frazer; Valencia, Bryan G.; van Woesik, Robert
2017-02-01
We describe the Bayesian user-friendly model for palaeo-environmental reconstruction (BUMPER), a Bayesian transfer function for inferring past climate and other environmental variables from microfossil assemblages. BUMPER is fully self-calibrating, straightforward to apply, and computationally fast, requiring ~2 s to build a 100-taxon model from a 100-site training set on a standard personal computer. We apply the model's probabilistic framework to generate thousands of artificial training sets under ideal assumptions. We then use these to demonstrate the sensitivity of reconstructions to the characteristics of the training set, considering assemblage richness, taxon tolerances, and the number of training sites. We find that a useful guideline for the size of a training set is to provide, on average, at least 10 samples of each taxon. We demonstrate general applicability to real data, considering three different organism types (chironomids, diatoms, pollen) and different reconstructed variables. An identically configured model is used in each application, the only change being the input files that provide the training-set environment and taxon-count data. The performance of BUMPER is shown to be comparable with weighted average partial least squares (WAPLS) in each case. Additional artificial datasets are constructed with similar characteristics to the real data, and these are used to explore the reasons for the differing performances of the different training sets.
NASA Technical Reports Server (NTRS)
1992-01-01
NETS (A Neural Network Development Tool) is a software system for mimicking the human brain. It is used in a University of Arkansas project in pattern matching of chemical systems. If successful, chemists would be able to identify mixtures of compounds without long and costly separation procedures. Using NETS, the group has trained the computer to recognize pattern relationships in a known compound and associate the results to an unknown compound. The research appears to be promising.
Comparison of molecular breeding values based on within- and across-breed training in beef cattle
2013-01-01
Background Although the efficacy of genomic predictors based on within-breed training looks promising, it is necessary to develop and evaluate across-breed predictors for the technology to be fully applied in the beef industry. The efficacies of genomic predictors trained in one breed and utilized to predict genetic merit in differing breeds based on simulation studies have been reported, as have the efficacies of predictors trained using data from multiple breeds to predict the genetic merit of purebreds. However, comparable studies using beef cattle field data have not been reported. Methods Molecular breeding values for weaning and yearling weight were derived and evaluated using a database containing BovineSNP50 genotypes for 7294 animals from 13 breeds in the training set and 2277 animals from seven breeds (Angus, Red Angus, Hereford, Charolais, Gelbvieh, Limousin, and Simmental) in the evaluation set. Six single-breed and four across-breed genomic predictors were trained using pooled data from purebred animals. Molecular breeding values were evaluated using field data, including genotypes for 2227 animals and phenotypic records of animals born in 2008 or later. Accuracies of molecular breeding values were estimated based on the genetic correlation between the molecular breeding value and trait phenotype. Results With one exception, the estimated genetic correlations of within-breed molecular breeding values with trait phenotype were greater than 0.28 when evaluated in the breed used for training. Most estimated genetic correlations for the across-breed trained molecular breeding values were moderate (> 0.30). When molecular breeding values were evaluated in breeds that were not in the training set, estimated genetic correlations clustered around zero. Conclusions Even for closely related breeds, within- or across-breed trained molecular breeding values have limited prediction accuracy for breeds that were not in the training set. 
For breeds in the training set, across- and within-breed trained molecular breeding values had similar accuracies. The benefit of adding data from other breeds to a within-breed training population is the ability to produce molecular breeding values that are more robust across breeds and these can be utilized until enough training data has been accumulated to allow for a within-breed training set. PMID:23953034
Target discrimination method for SAR images based on semisupervised co-training
NASA Astrophysics Data System (ADS)
Wang, Yan; Du, Lan; Dai, Hui
2018-01-01
Synthetic aperture radar (SAR) target discrimination is usually performed in a supervised manner. However, supervised methods for SAR target discrimination may need many labeled training samples, whose acquisition is costly, time consuming, and sometimes impossible. This paper proposes a SAR target discrimination method based on semisupervised co-training, which utilizes a limited number of labeled samples and an abundant number of unlabeled samples. First, Lincoln features, widely used in SAR target discrimination, are extracted from the training samples and partitioned into two sets according to their physical meanings. Second, two support vector machine classifiers are iteratively co-trained with the two extracted feature sets based on the co-training algorithm. Finally, the trained classifiers are used to classify the test data. The experimental results on real SAR image data not only validate the effectiveness of the proposed method compared with the traditional supervised methods, but also demonstrate the superiority of co-training over self-training, which uses only one feature set.
Stewart, Eugene L; Brown, Peter J; Bentley, James A; Willson, Timothy M
2004-08-01
A methodology for the selection and validation of nuclear receptor ligand chemical descriptors is described. After descriptors for a targeted chemical space were selected, a virtual screening methodology utilizing this space was formulated for the identification of potential NR ligands from our corporate collection. Using simple descriptors and our virtual screening method, we are able to quickly identify potential NR ligands from a large collection of compounds. As validation of the virtual screening procedure, an 8,000-membered NR targeted set and a 24,000-membered diverse control set of compounds were selected from our in-house general screening collection and screened in parallel across a number of orphan NR FRET assays. For the two assays that provided at least one hit per set by the established minimum pEC(50) for activity, the results showed a 2-fold increase in the hit-rate of the targeted compound set over the diverse set.
Sample Selection for Training Cascade Detectors.
Vállez, Noelia; Deniz, Oscar; Bueno, Gloria
2015-01-01
Automatic detection systems usually require large and representative training datasets in order to obtain good detection and false positive rates. Training datasets are such that the positive set has few samples and/or the negative set should represent anything except the object of interest. In this respect, the negative set typically contains orders of magnitude more images than the positive set. However, imbalanced training databases lead to biased classifiers. In this paper, we focus our attention on a negative sample selection method to properly balance the training data for cascade detectors. The method is based on the selection of the most informative false positive samples generated in one stage to feed the next stage. The results show that the proposed cascade detector with sample selection obtains on average better partial AUC and smaller standard deviation than the other compared cascade detectors.
Peng, Jiangjun; Leung, Yee; Leung, Kwong-Sak; Wong, Man-Hon; Lu, Gang; Ballester, Pedro J.
2018-01-01
It has recently been claimed that the outstanding performance of machine-learning scoring functions (SFs) is exclusively due to the presence of training complexes with highly similar proteins to those in the test set. Here, we revisit this question using 24 similarity-based training sets, a widely used test set, and four SFs. Three of these SFs employ machine learning instead of the classical linear regression approach of the fourth SF (X-Score which has the best test set performance out of 16 classical SFs). We have found that random forest (RF)-based RF-Score-v3 outperforms X-Score even when 68% of the most similar proteins are removed from the training set. In addition, unlike X-Score, RF-Score-v3 is able to keep learning with an increasing training set size, becoming substantially more predictive than X-Score when the full 1105 complexes are used for training. These results show that machine-learning SFs owe a substantial part of their performance to training on complexes with dissimilar proteins to those in the test set, against what has been previously concluded using the same data. Given that a growing amount of structural and interaction data will be available from academic and industrial sources, this performance gap between machine-learning SFs and classical SFs is expected to enlarge in the future. PMID:29538331
Li, Hongjian; Peng, Jiangjun; Leung, Yee; Leung, Kwong-Sak; Wong, Man-Hon; Lu, Gang; Ballester, Pedro J
2018-03-14
It has recently been claimed that the outstanding performance of machine-learning scoring functions (SFs) is exclusively due to the presence of training complexes with highly similar proteins to those in the test set. Here, we revisit this question using 24 similarity-based training sets, a widely used test set, and four SFs. Three of these SFs employ machine learning instead of the classical linear regression approach of the fourth SF (X-Score which has the best test set performance out of 16 classical SFs). We have found that random forest (RF)-based RF-Score-v3 outperforms X-Score even when 68% of the most similar proteins are removed from the training set. In addition, unlike X-Score, RF-Score-v3 is able to keep learning with an increasing training set size, becoming substantially more predictive than X-Score when the full 1105 complexes are used for training. These results show that machine-learning SFs owe a substantial part of their performance to training on complexes with dissimilar proteins to those in the test set, against what has been previously concluded using the same data. Given that a growing amount of structural and interaction data will be available from academic and industrial sources, this performance gap between machine-learning SFs and classical SFs is expected to enlarge in the future.
A Component-Based Vocabulary-Extensible Sign Language Gesture Recognition Framework.
Wei, Shengjing; Chen, Xiang; Yang, Xidong; Cao, Shuai; Zhang, Xu
2016-04-19
Sign language recognition (SLR) can provide a helpful tool for communication between the deaf and the external world. This paper proposes a component-based, vocabulary-extensible SLR framework using data from surface electromyographic (sEMG) sensors, accelerometers (ACC), and gyroscopes (GYRO). In this framework, a sign word is considered to be a combination of five common sign components (hand shape, axis, orientation, rotation, and trajectory), and sign classification is implemented based on the recognition of these five components. Specifically, the proposed SLR framework consists of two major parts. The first part obtains the component-based form of sign gestures and establishes the code table of the target sign gesture set using data from a reference subject. In the second part, which is designed for new users, component classifiers are trained using a training set suggested by the reference subject, and the classification of unknown gestures is performed with a code matching method. Five subjects participated in this study, and recognition experiments with training sets of different sizes were carried out on a target gesture set consisting of 110 frequently used Chinese Sign Language (CSL) sign words. The experimental results demonstrated that the proposed framework can realize large-scale gesture set recognition with a small-scale training set. With the smallest training sets (containing about one-third of the gestures of the target gesture set) suggested by two reference subjects, average recognition accuracies of (82.6 ± 13.2)% and (79.7 ± 13.4)% were obtained for 110 words, respectively, and the average recognition accuracy climbed to (88 ± 13.7)% and (86.3 ± 13.7)% when the training set included 50-60 gestures (about half of the target gesture set). The proposed framework can significantly reduce the user's training burden in large-scale gesture recognition, which will facilitate the implementation of a practical SLR system.
An Analysis of Quality in the Modular Housing Industry.
1991-12-01
finishing, Station 5, installs rough plumbing and applies the first coat of drywall joint compound. The unit continues to ceiling/roof setting, Station...with joint compound and drywall or plywood plates. 14. Rigid waferboard, oriented strand board, or plywood is used for exterior wall sheathing to...completed and tested, the second coat of joint compound is placed, and windows and doors are set. Insulation, exterior sheathing, roof sheathing
Ling, C Y M; Mak, W W S
2012-03-01
The present study examined the effectiveness of three staff training elements, psychoeducation (PE) on autism, introduction of functional behavioural analysis (FBA) and emotional management (EM), on frontline staff's reactions to challenging behaviours of children with autism in Hong Kong special education settings. A sample of 311 frontline staff in educational settings was recruited to one of three conditions: control, PE-FBA and PE-FBA-EM groups. A total of 175 participants completed all three sets of questionnaires at pre-training, immediate post-training and 1-month follow-up. Findings showed that the one-session staff training workshop increased staff knowledge of autism and perceived efficacy but decreased helping behavioural intention. In spite of the limited effectiveness of a one-session staff training workshop, continued staff training is still necessary for the improvement of service quality. Further exploration of how to change the emotional response of staff is important. © 2011 The Authors. Journal of Intellectual Disability Research © 2011 Blackwell Publishing Ltd.
Sissons, Heather T.; Urcelay, Gonzalo P.; Miller, Ralph R.
2009-01-01
The present experiments examined the role of within-compound associations in the interaction of the overshadowing procedure with conditioned stimulus (CS) duration, using a conditioned suppression procedure with rats. Experiment 1 found that, with elemental reinforced training, conditioned suppression to the target stimulus decreased as CS duration increased (i.e., the CS-duration effect), whereas with compound reinforced training (i.e., the overshadowing procedure) conditioned suppression to the target stimulus increased as CS duration increased. Subsequent experiments replicated these findings in sensory preconditioning and demonstrated that extinction of the overshadowing stimulus results in retrospective revaluation with short CSs and mediated extinction with long CSs. These results highlight the role of the duration of the stimulus in behavioral control. Moreover, these results illuminate one cause (the CS duration) of whether retrospective revaluation or mediated extinction will be observed. PMID:19542092
Bayesian molecular design with a chemical language model
NASA Astrophysics Data System (ADS)
Ikebata, Hisaki; Hongo, Kenta; Isomura, Tetsu; Maezono, Ryo; Yoshida, Ryo
2017-04-01
The aim of computational molecular design is the identification of promising hypothetical molecules with a predefined set of desired properties. We address the issue of accelerating the material discovery with state-of-the-art machine learning techniques. The method involves two different types of prediction; the forward and backward predictions. The objective of the forward prediction is to create a set of machine learning models on various properties of a given molecule. Inverting the trained forward models through Bayes' law, we derive a posterior distribution for the backward prediction, which is conditioned by a desired property requirement. Exploring high-probability regions of the posterior with a sequential Monte Carlo technique, molecules that exhibit the desired properties can computationally be created. One major difficulty in the computational creation of molecules is the exclusion of the occurrence of chemically unfavorable structures. To circumvent this issue, we derive a chemical language model that acquires commonly occurring patterns of chemical fragments through natural language processing of ASCII strings of existing compounds, which follow the SMILES chemical language notation. In the backward prediction, the trained language model is used to refine chemical strings such that the properties of the resulting structures fall within the desired property region while chemically unfavorable structures are successfully removed. The present method is demonstrated through the design of small organic molecules with the property requirements on HOMO-LUMO gap and internal energy. The R package iqspr is available at the CRAN repository.
Bayesian molecular design with a chemical language model.
Ikebata, Hisaki; Hongo, Kenta; Isomura, Tetsu; Maezono, Ryo; Yoshida, Ryo
2017-04-01
The aim of computational molecular design is the identification of promising hypothetical molecules with a predefined set of desired properties. We address the issue of accelerating the material discovery with state-of-the-art machine learning techniques. The method involves two different types of prediction; the forward and backward predictions. The objective of the forward prediction is to create a set of machine learning models on various properties of a given molecule. Inverting the trained forward models through Bayes' law, we derive a posterior distribution for the backward prediction, which is conditioned by a desired property requirement. Exploring high-probability regions of the posterior with a sequential Monte Carlo technique, molecules that exhibit the desired properties can computationally be created. One major difficulty in the computational creation of molecules is the exclusion of the occurrence of chemically unfavorable structures. To circumvent this issue, we derive a chemical language model that acquires commonly occurring patterns of chemical fragments through natural language processing of ASCII strings of existing compounds, which follow the SMILES chemical language notation. In the backward prediction, the trained language model is used to refine chemical strings such that the properties of the resulting structures fall within the desired property region while chemically unfavorable structures are successfully removed. The present method is demonstrated through the design of small organic molecules with the property requirements on HOMO-LUMO gap and internal energy. The R package iqspr is available at the CRAN repository.
Chemometric classification of casework arson samples based on gasoline content.
Sinkov, Nikolai A; Sandercock, P Mark L; Harynuk, James J
2014-02-01
Detection and identification of ignitable liquids (ILs) in arson debris is a critical part of arson investigations. The challenge of this task is due to the complex and unpredictable chemical nature of arson debris, which also contains pyrolysis products from the fire. ILs, most commonly gasoline, are complex chemical mixtures containing hundreds of compounds that will be consumed or otherwise weathered by the fire to varying extents depending on factors such as temperature, air flow, the surface on which the IL was placed, etc. While methods such as ASTM E-1618 are effective, data interpretation can be a costly bottleneck in the analytical process for some laboratories. In this study, we address this issue through the application of chemometric tools. Prior to the application of chemometric tools such as PLS-DA and SIMCA, issues of chromatographic alignment and variable selection need to be addressed. Here we use an alignment strategy based on a ladder consisting of perdeuterated n-alkanes. Variable selection and model optimization were automated using a hybrid backward elimination (BE) and forward selection (FS) approach guided by the cluster resolution (CR) metric. In this work, we demonstrate the automated construction, optimization, and application of chemometric tools to casework arson data. The resulting PLS-DA and SIMCA classification models, trained with 165 training set samples, have provided classification of 55 validation set samples based on gasoline content with 100% specificity and sensitivity. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
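The specificity and sensitivity figures quoted in the record above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name, the 1/0 label encoding (1 = gasoline present) and the example labels are assumptions made purely for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive, 0 = negative).

    Hypothetical helper: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up validation labels: one positive sample is missed, no false alarms.
truth = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
sens, spec = sensitivity_specificity(truth, predicted)
```

A classifier that labels every validation sample correctly, as reported for the 55 validation samples, would return 1.0 for both values.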
Kalderstam, Jonas; Edén, Patrik; Bendahl, Pär-Ola; Strand, Carina; Fernö, Mårten; Ohlsson, Mattias
2013-06-01
The concordance index (c-index) is the standard way of evaluating the performance of prognostic models in the presence of censored data. Constructing prognostic models using artificial neural networks (ANNs) is commonly done by training on error functions which are modified versions of the c-index. Our objective was to demonstrate the capability of training directly on the c-index and to evaluate our approach compared to the Cox proportional hazards model. We constructed a prognostic model using an ensemble of ANNs which were trained using a genetic algorithm. The individual networks were trained on a non-linear artificial data set divided into a training and test set both of size 2000, where 50% of the data was censored. The ANNs were also trained on a data set consisting of 4042 patients treated for breast cancer spread over five different medical studies, 2/3 used for training and 1/3 used as a test set. A Cox model was also constructed on the same data in both cases. The two models' c-indices on the test sets were then compared. The ranking performance of the models is additionally presented visually using modified scatter plots. Cross validation on the cancer training set did not indicate any non-linear effects between the covariates. An ensemble of 30 ANNs with one hidden neuron was therefore used. The ANN model had almost the same c-index score as the Cox model (c-index=0.70 and 0.71, respectively) on the cancer test set. Both models identified similarly sized low risk groups with at most 10% false positives, 49 for the ANN model and 60 for the Cox model, but repeated bootstrap runs indicate that the difference was not significant. A significant difference could however be seen when applied on the non-linear synthetic data set. In that case the ANN ensemble managed to achieve a c-index score of 0.90 whereas the Cox model failed to distinguish itself from the random case (c-index=0.49). 
We have found empirical evidence that ensembles of ANN models can be optimized directly on the c-index. Comparison with a Cox model indicates that near identical performance is achieved on a real cancer data set while on a non-linear data set the ANN model is clearly superior. Copyright © 2013 Elsevier B.V. All rights reserved.
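For readers unfamiliar with the concordance index used in the record above, the following is a minimal Python sketch of Harrell's c-index for right-censored data. It is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, higher scores are assumed to mean higher predicted risk, and pairs with tied observed times are simply skipped.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's c-index (illustrative sketch).

    times: observed follow-up times
    events: 1 if the event was observed, 0 if the record is censored
    risk_scores: model output, assumed higher = shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the abstract's figures (0.70, 0.71, 0.90, 0.49) should be read.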
Effects of training set selection on pain recognition via facial expressions
NASA Astrophysics Data System (ADS)
Shier, Warren A.; Yanushkevich, Svetlana N.
2016-07-01
This paper presents an approach to pain expression classification based on Gabor energy filters with Support Vector Machines (SVMs), followed by an analysis of the effects of training set variations on the system's classification rate. This approach is tested on the UNBC-McMaster Shoulder Pain Archive, which consists of spontaneous pain images, hand labelled using the Prkachin and Solomon Pain Intensity scale. In this paper, the subject's pain intensity level has been quantized into three disjoint groups: no pain, weak pain and strong pain. The results of experiments show that Gabor energy filters with SVMs provide comparable or better results than previous filter-based pain recognition methods, with precision rates of 74%, 30% and 78% for no pain, weak pain and strong pain, respectively. The study of the effects of intra-class skew, or changing the number of images per subject, shows that both completely removing and over-representing poor quality subjects in the training set have little effect on the overall accuracy of the system. This result suggests that poor quality subjects could be removed from the training set to save offline training time and that SVM is robust not only to outliers in training data, but also to significant amounts of poor quality data mixed into the training sets.
International standards for programmes of training in intensive care medicine in Europe.
2011-03-01
To develop internationally harmonised standards for programmes of training in intensive care medicine (ICM). Standards were developed by using consensus techniques. A nine-member nominal group of European intensive care experts developed a preliminary set of standards. These were revised and refined through a modified Delphi process involving 28 European national coordinators representing national training organisations using a combination of moderated discussion meetings, email, and a Web-based tool for determining the level of agreement with each proposed standard, and whether the standard could be achieved in the respondent's country. The nominal group developed an initial set of 52 possible standards which underwent four iterations to achieve maximal consensus. All national coordinators approved a final set of 29 standards in four domains: training centres, training programmes, selection of trainees, and trainers' profiles. Only three standards were considered immediately achievable by all countries, demonstrating a willingness to aspire to quality rather than merely setting a minimum level. Nine proposed standards which did not achieve full consensus were identified as potential candidates for future review. This preliminary set of clearly defined and agreed standards provides a transparent framework for assuring the quality of training programmes, and a foundation for international harmonisation and quality improvement of training in ICM.
Shah, Pranav; Kerns, Edward; Nguyen, Dac-Trung; Obach, R Scott; Wang, Amy Q; Zakharov, Alexey; McKew, John; Simeonov, Anton; Hop, Cornelis E C A; Xu, Xin
2016-10-01
Advancement of in silico tools would be enabled by the availability of data for metabolic reaction rates and intrinsic clearance (CLint) of a diverse compound structure data set by specific metabolic enzymes. Our goal is to measure CLint for a large set of compounds with each major human cytochrome P450 (P450) isozyme. To achieve our goal, it is of utmost importance to develop an automated, robust, sensitive, high-throughput metabolic stability assay that can efficiently handle a large volume of compound sets. The substrate depletion method [in vitro half-life (t1/2) method] was chosen to determine CLint. The assay (384-well format) consisted of three parts: 1) a robotic system for incubation and sample cleanup; 2) two different integrated, ultraperformance liquid chromatography/mass spectrometry (UPLC/MS) platforms to determine the percent remaining of parent compound; and 3) an automated data analysis system. The CYP3A4 assay was evaluated using two long t1/2 compounds, carbamazepine and antipyrine (t1/2 > 30 minutes); one moderate t1/2 compound, ketoconazole (10 < t1/2 < 30 minutes); and two short t1/2 compounds, loperamide and buspirone (t1/2 < 10 minutes). Interday and intraday precision and accuracy of the assay were within acceptable range (∼12%) for the linear range observed. Using this assay, CYP3A4 CLint and t1/2 values for more than 3000 compounds were measured. This high-throughput, automated, and robust assay allows for rapid metabolic stability screening of large compound sets and enables advanced computational modeling for individual human P450 isozymes. U.S. Government work not protected by U.S. copyright.
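Illustrative note (not part of the abstract): the substrate depletion method described above can be sketched as a log-linear fit of percent parent remaining against time. The sketch assumes first-order depletion and the commonly used relation CLint = (ln 2 / t1/2) × (incubation volume / enzyme amount). The function names, units and example values are hypothetical and are not taken from the paper's data analysis system.

```python
import math

def half_life_minutes(times_min, percent_remaining):
    """Fit ln(% remaining) against time by least squares and return t1/2 in minutes.

    Assumes measurable first-order depletion; a flat profile (k = 0) would
    raise ZeroDivisionError and needs separate handling.
    """
    log_remaining = [math.log(p) for p in percent_remaining]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(log_remaining) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, log_remaining))
    k = -sxy / sxx  # first-order depletion rate constant (1/min)
    return math.log(2) / k

def intrinsic_clearance(t_half_min, incubation_volume_ul, enzyme_pmol):
    """CLint = (ln 2 / t1/2) * (incubation volume / enzyme amount), in uL/min/pmol (assumed units)."""
    return math.log(2) / t_half_min * incubation_volume_ul / enzyme_pmol

def stability_class(t_half_min):
    """Bands quoted in the abstract: short < 10 min, moderate 10-30 min, long > 30 min."""
    if t_half_min < 10:
        return "short"
    if t_half_min <= 30:
        return "moderate"
    return "long"

# Hypothetical time course: parent compound halves every 10 minutes.
t_half = half_life_minutes([0, 10, 20], [100, 50, 25])
print(round(t_half, 3), stability_class(t_half))
```

The log-linear fit is one common choice; the abstract does not say how its automated analysis estimates the slope.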
Jin, Xiaohui; Peldszus, Sigrid
2012-01-01
Micropollutants remain of concern in drinking water, and there is a broad interest in the ability of different treatment processes to remove these compounds. To gain a better understanding of treatment effectiveness for structurally diverse compounds and to be cost effective, it is necessary to select a small set of representative micropollutants for experimental studies. Unlike other approaches to-date, in this research micropollutants were systematically selected based solely on their physico-chemical and structural properties that are important in individual water treatment processes. This was accomplished by linking underlying principles of treatment processes such as coagulation/flocculation, oxidation, activated carbon adsorption, and membrane filtration to compound characteristics and corresponding molecular descriptors. A systematic statistical approach not commonly used in water treatment was then applied to a compound pool of 182 micropollutants (identified from the literature) and their relevant calculated molecular descriptors. Principal component analysis (PCA) was used to summarize the information residing in this large dataset. D-optimal onion design was then applied to the PCA results to select structurally representative compounds that could be used in experimental treatment studies. To demonstrate the applicability and flexibility of this selection approach, two sets of 22 representative micropollutants are presented. Compounds in the first set are representative when studying a range of water treatment processes (coagulation/flocculation, oxidation, activated carbon adsorption, and membrane filtration), whereas the second set shows representative compounds for ozonation and advanced oxidation studies. Overall, selected micropollutants in both lists are structurally diverse, have wide-ranging physico-chemical properties and cover a large spectrum of applications. 
The systematic compound selection approach presented here can also be adjusted to fit individual research needs with respect to type of micropollutants, treatment processes and number of compounds selected. Copyright © 2011 Elsevier B.V. All rights reserved.
Acute effects of verbal feedback on upper-body performance in elite athletes.
Argus, Christos K; Gill, Nicholas D; Keogh, Justin Wl; Hopkins, Will G
2011-12-01
Argus, CK, Gill, ND, Keogh, JWL, and Hopkins, WG. Acute effects of verbal feedback on upper-body performance in elite athletes. J Strength Cond Res 25(12): 3282-3287, 2011-Improved training quality has the potential to enhance training adaptations. Previous research suggests that receiving feedback improves single-effort maximal strength and power tasks, but whether quality of a training session with repeated efforts can be improved remains unclear. The purpose of this investigation was to determine the effects of verbal feedback on upper-body performance in a resistance training session consisting of multiple sets and repetitions in well-trained athletes. Nine elite rugby union athletes were assessed using the bench throw exercise on 4 separate occasions each separated by 7 days. Each athlete completed 2 sessions consisting of 3 sets of 4 repetitions of the bench throw with feedback provided after each repetition and 2 identical sessions where no feedback was provided after each repetition. When feedback was received, there was a small increase of 1.8% (90% confidence limits, ±2.7%) and 1.3% (±0.7%) in mean peak power and velocity when averaged over the 3 sets. When individual sets were compared, there was a tendency toward the improvements in mean peak power being greater in the second and third sets. These results indicate that providing verbal feedback produced acute improvements in upper-body power output of well-trained athletes. The benefits of feedback may be greatest in the latter sets of training and could improve training quality and result in greater long-term adaptation.
Effect of creatine supplementation and drop-set resistance training in untrained aging adults.
Johannsmeyer, Sarah; Candow, Darren G; Brahms, C Markus; Michel, Deborah; Zello, Gordon A
2016-10-01
To investigate the effects of creatine supplementation and drop-set resistance training in untrained aging adults. Participants were randomized to one of two groups: Creatine (CR: n=14, 7 females, 7 males; 58.0±3.0 yrs, 0.1 g/kg/day of creatine + 0.1 g/kg/day of maltodextrin) or Placebo (PLA: n=17, 7 females, 10 males; age: 57.6±5.0 yrs, 0.2 g/kg/day of maltodextrin) during 12 weeks of drop-set resistance training (3 days/week; 2 sets of leg press, chest press, hack squat and lat pull-down exercises performed to muscle fatigue at 80% baseline 1-repetition maximum [1-RM] immediately followed by repetitions to muscle fatigue at 30% baseline 1-RM). Prior to and following training and supplementation, assessments were made for body composition, muscle strength, muscle endurance, tasks of functionality, muscle protein catabolism and diet. Drop-set resistance training improved muscle mass, muscle strength, muscle endurance and tasks of functionality (p<0.05). The addition of creatine to drop-set resistance training significantly increased body mass (p=0.002) and muscle mass (p=0.007) compared to placebo. Males on creatine increased muscle strength (lat pull-down only) to a greater extent than females on creatine (p=0.005). Creatine enabled males to resistance train at a greater capacity over time compared to males on placebo (p=0.049) and females on creatine (p=0.012). Males on creatine (p=0.019) and females on placebo (p=0.014) decreased 3-MH compared to females on creatine. The addition of creatine to drop-set resistance training augments the gains in muscle mass from resistance training alone. Creatine is more effective in untrained aging males compared to untrained aging females. Copyright © 2016 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Ndaw, Joseph D.; Faye, Andre; Maïga, Amadou S.
2017-05-01
Artificial neural networks (ANN)-based models are efficient ways of source localisation. However very large training sets are needed to precisely estimate two-dimensional Direction of arrival (2D-DOA) with ANN models. In this paper we present a fast artificial neural network approach for 2D-DOA estimation with reduced training sets sizes. We exploit the symmetry properties of Uniform Circular Arrays (UCA) to build two different datasets for elevation and azimuth angles. Linear Vector Quantisation (LVQ) neural networks are then sequentially trained on each dataset to separately estimate elevation and azimuth angles. A multilevel training process is applied to further reduce the training sets sizes.
ERIC Educational Resources Information Center
Kehoe, E. James; White, Natasha E.
2004-01-01
Rabbits were given reinforced training of the nictitating membrane (NM) response using separate conditioned stimuli (CSs), which were a tone, light, and/or tactile vibration. Then, two CSs were compounded and given further pairings with the unconditioned stimulus (US). Evidence of both overexpectation and summation effects appeared. That is,…
Cappelli Fontanive, Fernando; Souza-Silva, Érica Aparecida; Macedo da Silva, Juliana; Bastos Caramão, Elina; Alcaraz Zini, Claudia
2016-08-26
Diesel and naphtha samples were analyzed using ionic liquid (IL) columns to evaluate the best column set for the analysis of organic sulfur compounds (OSC) and nitrogen (N)-containing compounds with comprehensive two-dimensional gas chromatography coupled to a time-of-flight mass spectrometry detector (GC×GC/TOFMS). Employing a series of stationary phase sets, namely DB-5MS/DB-17, DB-17/DB-5MS, DB-5MS/IL-59, and IL-59/DB-5MS, the following parameters were systematically evaluated: number of tentatively identified OSC, 2D chromatographic space occupation, number of polyaromatic hydrocarbons (PAH) and OSC co-elutions, and percentage of asymmetric peaks. DB-5MS/IL-59 was chosen for OSC analysis, while IL-59/DB-5MS was chosen for nitrogen compounds, as each stationary phase set provided the best chromatographic efficiency for these two classes of compounds, respectively. Most compounds were tentatively identified by Lee and Van den Dool and Kratz retention indexes, and spectra-matching to library. Whenever available, compounds were also positively identified via injection of authentic standards. Copyright © 2016 Elsevier B.V. All rights reserved.
Highly predictive and interpretable models for PAMPA permeability.
Sun, Hongmao; Nguyen, Kimloan; Kerns, Edward; Yan, Zhengyin; Yu, Kyeong Ri; Shah, Pranav; Jadhav, Ajit; Xu, Xin
2017-02-01
Cell membrane permeability is an important determinant for oral absorption and bioavailability of a drug molecule. An in silico model predicting drug permeability is described, which is built based on a large permeability dataset of 7488 compound entries or 5435 structurally unique molecules measured by the same lab using parallel artificial membrane permeability assay (PAMPA). On the basis of customized molecular descriptors, the support vector regression (SVR) model trained with 4071 compounds with quantitative data is able to predict the remaining 1364 compounds with the qualitative data with an area under the curve of receiver operating characteristic (AUC-ROC) of 0.90. The support vector classification (SVC) model trained with half of the whole dataset comprised of both the quantitative and the qualitative data produced accurate predictions to the remaining data with the AUC-ROC of 0.88. The results suggest that the developed SVR model is highly predictive and provides medicinal chemists a useful in silico tool to facilitate design and synthesis of novel compounds with optimal drug-like properties, and thus accelerate the lead optimization in drug discovery. Copyright © 2016 Elsevier Ltd. All rights reserved.
Time-dependent VOC-profile of decomposed human and animal remains in laboratory environment.
Rosier, E; Loix, S; Develter, W; Van de Voorde, W; Tytgat, J; Cuypers, E
2016-09-01
A validated method using a thermal desorber combined with a gas chromatograph coupled to a mass spectrometer was used to identify the volatile organic compounds released in decomposed human and animal remains after 9 and 12 months in glass jars in a laboratory environment. This is a follow-up study on a previous report where the first 6 months of decomposition of 6 human and 26 animal remains was investigated. In the first report, out of 452 identified compounds, a combination of 8 compounds was proposed as human and pig specific. The goal of the current study was to investigate if these 8 compounds were still released after 9 and 12 months. The following results were observed: 287 compounds were identified; only 9 new compounds were detected and 173 were no longer seen. Sulfur-containing compounds were less prevalent as compared to the first month of decomposition. The appearance of nitrogen-containing compounds and alcohols was increasingly evident during the first 6 months, and the same trend was seen in the following 6 months. Esters became less important after 6 months. From the proposed human and pig specific compounds, diethyl disulfide was only detected during the first months of decomposition. Interestingly, the 4 proposed human and pig specific esters, as well as pyridine, 3-methylthio-1-propanol and methyl(methylthio)ethyl disulfide were still present after 9 and 12 months of decomposition. This means that these 7 human and pig specific markers can be used in the development of training aids for cadaver dogs during the whole decomposition process. Diethyl disulfide can be used in training aids for the first month of decomposition. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Ghasemi Damavandi, Hamidreza; Sen Gupta, Ananya; Nelson, Robert K; Reddy, Christopher M
2016-01-01
Comprehensive two-dimensional gas chromatography [Formula: see text] provides high-resolution separations across hundreds of compounds in a complex mixture, thus unlocking unprecedented information for intricate quantitative interpretation. We exploit this compound diversity across the [Formula: see text] topography to provide quantitative compound-cognizant interpretation beyond target compound analysis with petroleum forensics as a practical application. We focus on the [Formula: see text] topography of biomarker hydrocarbons, hopanes and steranes, as they are generally recalcitrant to weathering. We introduce peak topography maps (PTM) and topography partitioning techniques that consider a notably broader and more diverse range of target and non-target biomarker compounds compared to traditional approaches that consider approximately 20 biomarker ratios. Specifically, we consider a range of 33-154 target and non-target biomarkers with highest-to-lowest peak ratio within an injection ranging from 4.86 to 19.6 (precise numbers depend on biomarker diversity of individual injections). We also provide a robust quantitative measure for directly determining "match" between samples, without necessitating training data sets. We validate our methods across 34 [Formula: see text] injections from a diverse portfolio of petroleum sources, and provide quantitative comparison of performance against established statistical methods such as principal components analysis (PCA). Our data set includes a wide range of samples collected following the 2010 Deepwater Horizon disaster that released approximately 160 million gallons of crude oil from the Macondo well (MW). Samples that were clearly collected following this disaster exhibit statistically significant match [Formula: see text] using PTM-based interpretation against other closely related sources. 
PTM-based interpretation also provides higher differentiation between closely correlated but distinct sources than obtained using PCA-based statistical comparisons. In addition to results based on this experimental field data, we also provide extensive perturbation analysis of the PTM method over numerical simulations that introduce random variability of peak locations over the [Formula: see text] biomarker ROI image of the MW pre-spill sample (sample [Formula: see text] in Additional file 4: Table S1). We compare the robustness of the cross-PTM score against peak location variability in both dimensions and compare the results against PCA analysis over the same set of simulated images. Detailed description of the simulation experiment and discussion of results are provided in Additional file 1: Section S8. We provide a peak-cognizant informational framework for quantitative interpretation of [Formula: see text] topography. Proposed topographic analysis enables [Formula: see text] forensic interpretation across target petroleum biomarkers, while including the nuances of lesser-known non-target biomarkers clustered around the target peaks. This allows potential discovery of hitherto unknown connections between target and non-target biomarkers.
Neuromuscular Adaptations to Reduced Use
NASA Technical Reports Server (NTRS)
Ploutz-Snyder, Lori
2009-01-01
This viewgraph presentation reviews the studies done to reduce neuromuscular strength loss during unilateral lower limb suspension (ULLS). Since some animals undergo fairly long periods of muscular disuse with minimal or no muscular atrophy, there may be an answer that is applicable to humans for diminishing the effects of muscular atrophy in situations that require no muscular use. Three sets of ULLS studies were reviewed; they indicated that muscle strength decreased more than muscle mass. The presentation reviewed exercise countermeasures to combat the atrophy, including: ischemia maintained during compound muscle action potential (CMAP), ischemia and low load exercise, Japanese kaatsu, and the potential for rehabilitation or situations where heavy loading is undesirable. Two forms of countermeasures to unloading have been successful: (1) high-load resistance training, which has maintained muscle mass and strength, and (2) low load resistance training with blood flow restriction (LL(sub BFR)). LL(sub BFR) has been shown to increase muscle mass and strength. There has been significant interest in tourniquet training. An increase in growth hormone (GH) has been noted for LL(sub BFR) exercise. An experimental study is described with 16 subjects, 8 of whom performed ULLS and 8 of whom performed ULLS with LL(sub BFR) exercise three times per week during the ULLS. Charts present the results for the two groups, showing that performing LL(sub BFR) exercise during 30 days of ULLS can maintain muscle size and strength and even improve muscular endurance.
Crystal Structure Prediction via Deep Learning.
Ryan, Kevin; Lengyel, Jeff; Shatruk, Michael
2018-06-06
We demonstrate the application of deep neural networks as a machine-learning tool for the analysis of a large collection of crystallographic data contained in the crystal structure repositories. Using input data in the form of multi-perspective atomic fingerprints, which describe coordination topology around unique crystallographic sites, we show that the neural-network model can be trained to effectively distinguish chemical elements based on the topology of their crystallographic environment. The model also identifies structurally similar atomic sites in the entire dataset of ~50000 crystal structures, essentially uncovering trends that reflect the periodic table of elements. The trained model was used to analyze templates derived from the known binary and ternary crystal structures in order to predict the likelihood to form new compounds that could be generated by placing elements into these structural templates in combinatorial fashion. Statistical analysis of predictive performance of the neural-network model, which was applied to a test set of structures never seen by the model during training, indicates its ability to predict known elemental compositions with a high likelihood of success. In ~30% of cases, the known compositions were found among top-10 most likely candidates proposed by the model. These results suggest that the approach developed in this work can be used to effectively guide the synthetic efforts in the discovery of new materials, especially in the case of systems composed of 3 or more chemical elements.
Long-Term Abstract Learning of Attentional Set
ERIC Educational Resources Information Center
Leber, Andrew B.; Kawahara, Jun-Ichiro; Gabari, Yuji
2009-01-01
How does past experience influence visual search strategy (i.e., attentional set)? Recent reports have shown that, when given the option to use 1 of 2 attentional sets, observers persist with the set previously required in a training phase. Here, 2 related questions are addressed. First, does the training effect result only from perseveration with…
Conditional Relations with Compound Abstract Stimuli Using a Go/No-Go Procedure
ERIC Educational Resources Information Center
Debert, Paula; Matos, Maria Amelia; McIlvane, William
2007-01-01
The aim of this study was to evaluate whether emergent conditional relations could be established with a go/no-go procedure using compound abstract stimuli. The procedure was conducted with 6 adult humans. During training, responses emitted in the presence of certain stimulus compounds (A1B1, A2B2, A3B3, B1C1, B2C2, and B3C3) were followed by…
Geropsychology Training in a VA Nursing Home Setting
ERIC Educational Resources Information Center
Karel, Michele J.; Moye, Jennifer
2005-01-01
There is a growing need for professional psychology training in nursing home settings, and nursing homes provide a rich environment for teaching geropsychology competencies. We describe the nursing home training component of our Department of Veterans Affairs (VA) Predoctoral Internship and Geropsychology Postdoctoral Fellowship programs. Our…
The Evolution of On-Board Emergency Training for the International Space Station Crew
NASA Technical Reports Server (NTRS)
LaBuff, Skyler
2015-01-01
The crew of the International Space Station (ISS) receives extensive ground-training in order to safely and effectively respond to any potential emergency event while on-orbit, but few people realize that their training is not concluded when they launch into space. The evolution of the emergency On-Board Training events (OBTs) has recently moved from paper "scripts" to an intranet-based software simulation that allows for the crew, as well as the flight control teams in Mission Control Centers across the world, to share in an improved and more realistic training event. This emergency OBT simulator ensures that the participants experience the training event as it unfolds, completely unaware of the type, location, or severity of the simulated emergency until the scenario begins. The crew interfaces with the simulation software via iPads that they keep with them as they translate through the ISS modules, receiving prompts and information as they proceed through the response. Personnel in the control centers bring up the simulation via an intranet browser at their console workstations, and can view additional telemetry signatures in simulated ground displays in order to assist the crew and communicate vital information to them as applicable. The Chief Training Officers and emergency instructors set the simulation in motion, choosing the type of emergency (rapid depressurization, fire, or toxic atmosphere) and specific initial conditions to emphasize the desired training objectives. Project development, testing, and implementation were a collaborative effort between ISS emergency instructors, Chief Training Officers, Flight Directors, and the Crew Office using commercial off the shelf (COTS) hardware along with simulation software created in-house. 
Due to the success of the Emergency OBT simulator, the already-developed software has been leveraged and repurposed to develop a new emulator used during fire response ground-training to deliver data that the crew receives from the handheld Compound Specific Analyzer for Combustion Products (CSA-CP). This CSA-CP emulator makes use of a portion of codebase from the Emergency OBT simulator dealing with atmospheric contamination during fire scenarios, and feeds various data signatures to crew via an iPod Touch with a flight-like CSA-CP display. These innovative simulations, which make use of COTS hardware with custom in-house software, have yielded drastic improvements to emergency training effectiveness and risk reduction for ISS crew and flight control teams during on-orbit and ground training events.
Balsamo, Sandor; Tibana, Ramires Alsamir; Nascimento, Dahan da Cunha; de Farias, Gleyverton Landim; Petruccelli, Zeno; de Santana, Frederico dos Santos; Martins, Otávio Vanni; de Aguiar, Fernando; Pereira, Guilherme Borges; de Souza, Jéssica Cardoso; Prestes, Jonato
2012-01-01
The super-set is a widely used resistance training method consisting of exercises for agonist and antagonist muscles with limited or no rest interval between them – for example, bench press followed by bent-over rows. In this sense, the aim of the present study was to compare the effects of different super-set exercise sequences on the total training volume. A secondary aim was to evaluate the ratings of perceived exertion and fatigue index in response to different exercise order. On separate testing days, twelve resistance-trained men, aged 23.0 ± 4.3 years, height 174.8 ± 6.75 cm, body mass 77.8 ± 13.27 kg, body fat 12.0% ± 4.7%, were submitted to a super-set method by using two different exercise orders: quadriceps (leg extension) + hamstrings (leg curl) (QH) or hamstrings (leg curl) + quadriceps (leg extension) (HQ). Sessions consisted of three sets with a ten-repetition maximum load with 90 seconds rest between sets. Results revealed that the total training volume was higher for the HQ exercise order (P = 0.02) with lower perceived exertion than the inverse order (P = 0.04). These results suggest that HQ exercise order involving lower limbs may benefit practitioners interested in reaching a higher total training volume with lower ratings of perceived exertion compared with the leg extension plus leg curl order. PMID:22371654
How large a training set is needed to develop a classifier for microarray data?
Dobbin, Kevin K; Zhao, Yingdong; Simon, Richard M
2008-01-01
A common goal of gene expression microarray studies is the development of a classifier that can be used to divide patients into groups with different prognoses, or with different expected responses to a therapy. These types of classifiers are developed on a training set, which is the set of samples used to train a classifier. The question of how many samples are needed in the training set to produce a good classifier from high-dimensional microarray data is challenging. We present a model-based approach to determining the sample size required to adequately train a classifier. It is shown that sample size can be determined from three quantities: standardized fold change, class prevalence, and number of genes or features on the arrays. Numerous examples and important experimental design issues are discussed. The method is adapted to address ex post facto determination of whether the size of a training set used to develop a classifier was adequate. An interactive web site for performing the sample size calculations is provided. We showed that sample size calculations for classifier development from high-dimensional microarray data are feasible, discussed numerous important considerations, and presented examples.
Using Graph Indices for the Analysis and Comparison of Chemical Datasets.
Fourches, Denis; Tropsha, Alexander
2013-10-01
In cheminformatics, compounds are represented as points in multidimensional space of chemical descriptors. When all pairs of points found within certain distance threshold in the original high dimensional chemistry space are connected by distance-labeled edges, the resulting data structure can be defined as Dataset Graph (DG). We show that, similarly to the conventional description of organic molecules, many graph indices can be computed for DGs as well. We demonstrate that chemical datasets can be effectively characterized and compared by computing simple graph indices such as the average vertex degree or Randic connectivity index. This approach is used to characterize and quantify the similarity between different datasets or subsets of the same dataset (e.g., training, test, and external validation sets used in QSAR modeling). The freely available ADDAGRA program has been implemented to build and visualize DGs. The approach proposed and discussed in this report could be further explored and utilized for different cheminformatics applications such as dataset diversification by acquiring external compounds, dataset processing prior to QSAR modeling, or (dis)similarity modeling of multiple datasets studied in chemical genomics applications. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
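Illustrative note (not part of the abstract): the Dataset Graph construction described above can be sketched with the standard library alone. This is a minimal sketch assuming Euclidean distance in descriptor space and a user-chosen threshold. The function names and the toy descriptor values are hypothetical, and the sketch does not reproduce the ADDAGRA program.

```python
import math
from itertools import combinations

def dataset_graph(points, threshold):
    """Connect every pair of compounds within `threshold`; edges are labeled with distances."""
    edges = {}
    for i, j in combinations(range(len(points)), 2):
        distance = math.dist(points[i], points[j])  # Euclidean distance (assumed metric)
        if distance <= threshold:
            edges[(i, j)] = distance
    return edges

def vertex_degrees(n_vertices, edges):
    """Count how many edges touch each vertex."""
    degrees = [0] * n_vertices
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees

def average_vertex_degree(n_vertices, edges):
    """Mean number of neighbours per compound in the Dataset Graph."""
    return 2 * len(edges) / n_vertices

def randic_index(n_vertices, edges):
    """Randic connectivity index: sum over edges of 1 / sqrt(deg(i) * deg(j))."""
    degrees = vertex_degrees(n_vertices, edges)
    return sum(1 / math.sqrt(degrees[i] * degrees[j]) for i, j in edges)

# Four hypothetical compounds in a 2-descriptor space; the last one is an outlier.
compounds = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
graph = dataset_graph(compounds, threshold=1.5)
print(average_vertex_degree(len(compounds), graph), randic_index(len(compounds), graph))
```

Computing the same indices for a training set and an external set under one threshold gives a simple way to compare how densely each set populates descriptor space, which is the kind of comparison the abstract describes.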
Kadurin, Artur; Aliper, Alexander; Kazennov, Andrey; Mamoshina, Polina; Vanhaelen, Quentin; Khrabrov, Kuzma; Zhavoronkov, Alex
2017-01-01
Recent advances in deep learning and specifically in generative adversarial networks have demonstrated surprising results in generating new images and videos upon request even using natural language as input. In this paper we present the first application of generative adversarial autoencoders (AAE) for generating novel molecular fingerprints with a defined set of parameters. We developed a 7-layer AAE architecture with the latent middle layer serving as a discriminator. As an input and output the AAE uses a vector of binary fingerprints and concentration of the molecule. In the latent layer we also introduced a neuron responsible for growth inhibition percentage, which when negative indicates the reduction in the number of tumor cells after the treatment. To train the AAE we used the NCI-60 cell line assay data for 6252 compounds profiled on MCF-7 cell line. The output of the AAE was used to screen 72 million compounds in PubChem and select candidate molecules with potential anti-cancer properties. This approach is a proof of concept of an artificially-intelligent drug discovery engine, where AAEs are used to generate new molecular fingerprints with the desired molecular properties. PMID:28029644
Al-Sahaf, Harith; Zhang, Mengjie; Johnston, Mark
2016-01-01
In the computer vision and pattern recognition fields, image classification represents an important yet difficult task. It is a challenge to build effective computer models to replicate the remarkable ability of the human visual system, which relies on only one or a few instances to learn a completely new class or an object of a class. Recently we proposed two genetic programming (GP) methods, one-shot GP and compound-GP, that aim to evolve a program for the task of binary classification in images. The two methods are designed to use only one or a few instances per class to evolve the model. In this study, we investigate these two methods in terms of performance, robustness, and complexity of the evolved programs. We use ten data sets that vary in difficulty to evaluate these two methods. We also compare them with two other GP and six non-GP methods. The results show that one-shot GP and compound-GP outperform or achieve results comparable to competitor methods. Moreover, the features extracted by these two methods improve the performance of other classifiers with handcrafted features and those extracted by a recently developed GP-based method in most cases.
Özdemir Tarı, Gonca; Gümüş, Sümeyye; Ağar, Erbil
2015-04-15
The title compound, 2-[((3-iodo-4-methyl)phenylimino)methyl]-5-nitrothiophene, C12H9O2N2I1S1, was synthesized and characterized by IR, UV-Vis and single-crystal X-ray diffraction techniques. The molecular structure was optimized at the B3LYP, B3PW91 and PBEPBE levels of density functional theory (DFT) with the 6-311G+(d,p) basis set. Using the TD-DFT method, the electronic absorption spectra of the title compound were computed in both the gas phase and ethanol solvent. The harmonic vibrational frequencies of the title compound were calculated using the same methods with the 6-311G+(d,p) basis set. The calculated results were compared with the experimental results for the compound. The energetic behavior of the title compound in solvent media, such as the total energy, atomic charges and dipole moment, was examined using the B3LYP, B3PW91 and PBEPBE methods with the 6-311G+(d,p) basis set by applying the Onsager and polarizable continuum (PCM) models. The frontier molecular orbital (FMO) analysis, the molecular electrostatic potential (MEP) map and the nonlinear optical (NLO) properties of the title compound were obtained at the same levels of theory. Thermodynamic properties of the title compound were then obtained using the same methods with the 6-311G(d,p) basis set. Copyright © 2015 Elsevier B.V. All rights reserved.
Tables of compound-discount interest rate multipliers for evaluating forestry investments.
Allen L. Lundgren
1971-01-01
Tables, prepared by computer, are presented for 10 selected compound-discount interest rate multipliers commonly used in financial analyses of forestry investments. Two sets of tables are given for each of the 10 multipliers. The first set gives multipliers for each year from 1 to 40 years; the second set gives multipliers at 5-year intervals from 5 to 160 years....
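As a rough illustration of how tables of this kind can be generated, the sketch below computes two standard multipliers, the compound amount factor (1 + i)^n and its reciprocal, the discount (present value) factor. The abstract does not list the 10 multipliers, so the choice of these two, the 5% rate, and all function names are assumptions made for illustration only.

```python
def compound_multiplier(rate: float, years: int) -> float:
    """Future value of 1 unit after `years` at annual interest `rate`."""
    return (1.0 + rate) ** years


def discount_multiplier(rate: float, years: int) -> float:
    """Present value of 1 unit received `years` from now."""
    return 1.0 / (1.0 + rate) ** years


def multiplier_table(rate, years):
    """Rows of (year, compound multiplier, discount multiplier)."""
    return [
        (n, compound_multiplier(rate, n), discount_multiplier(rate, n))
        for n in years
    ]


# Same two layouts as described in the abstract; the 5% rate is hypothetical.
annual = multiplier_table(0.05, range(1, 41))        # every year, 1 to 40
long_run = multiplier_table(0.05, range(5, 161, 5))  # 5-year steps, 5 to 160

for year, up, down in annual[:3]:
    print(f"{year:3d}  {up:10.4f}  {down:10.4f}")
```

The two multipliers are reciprocals of each other, which is a convenient sanity check when extending the table to other factors.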
Performance Measures for Adaptive Decisioning Systems
1991-09-11
set to hypothesis space mapping best approximates the known map. Two assumptions, a sufficiently representative training set and the ability of the...successful prediction of LINEXT performance. The LINEXT algorithm above performs the decision space mapping on the training-set elements exactly. For a
NASA Astrophysics Data System (ADS)
Orenstein, E. C.; Morgado, P. M.; Peacock, E.; Sosik, H. M.; Jaffe, J. S.
2016-02-01
Technological advances in instrumentation and computing have allowed oceanographers to develop imaging systems capable of collecting extremely large data sets. With the advent of in situ plankton imaging systems, scientists must now commonly deal with "big data" sets containing tens of millions of samples spanning hundreds of classes, making manual classification untenable. Automated annotation methods are now considered to be the bottleneck between collection and interpretation. Typically, such classifiers learn to approximate a function that predicts a predefined set of classes for which a considerable amount of labeled training data is available. The requirement that the training data span all the classes of concern is problematic for plankton imaging systems since they sample such diverse, rapidly changing populations. These data sets may contain relatively rare, sparsely distributed, taxa that will not have associated training data; a classifier trained on a limited set of classes will miss these samples. The computer vision community, leveraging advances in Convolutional Neural Networks (CNNs), has recently attempted to tackle such problems using "zero-shot" object categorization methods. Under a zero-shot framework, a classifier is trained to map samples onto a set of attributes rather than a class label. These attributes can include visual and non-visual information such as what an organism is made out of, where it is distributed globally, or how it reproduces. A second stage classifier is then used to extrapolate a class. In this work, we demonstrate a zero-shot classifier, implemented with a CNN, to retrieve out-of-training-set labels from images. This method is applied to data from two continuously imaging, moored instruments: the Scripps Plankton Camera System (SPCS) and the Imaging FlowCytobot (IFCB). Results from simulated deployment scenarios indicate zero-shot classifiers could be successful at recovering samples of rare taxa in image sets. 
This capability will allow ecologists to identify trends in the distribution of difficult to sample organisms in their data.
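The two-stage idea described above can be sketched as follows. This is only a toy second stage under stated assumptions: the class names, the three binary attributes, and the attribute scores are all hypothetical, and the first-stage CNN that would produce the scores is not implemented here.

```python
# Hypothetical attribute signatures: (has_rigid_shell, is_motile, is_gelatinous).
# "rare_larvacean" stands for a taxon with no labeled training images.
CLASS_ATTRIBUTES = {
    "diatom_chain": (1, 0, 0),
    "copepod": (0, 1, 0),
    "rare_larvacean": (0, 1, 1),
}


def classify_from_attributes(scores):
    """Second stage: pick the class whose signature is nearest to the
    attribute scores predicted by the (assumed) first-stage classifier."""

    def squared_distance(signature):
        return sum((s - a) ** 2 for s, a in zip(scores, signature))

    return min(
        CLASS_ATTRIBUTES,
        key=lambda name: squared_distance(CLASS_ATTRIBUTES[name]),
    )


# Attribute scores a first-stage network might emit for one image (made up).
label = classify_from_attributes((0.1, 0.9, 0.8))
print(label)
```

Because the second stage only needs an attribute signature per class, a class can be retrieved even when the first stage never saw an image of it.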
Coordinating a national rangeland monitoring training program: Success and lessons learned
USDA-ARS's Scientific Manuscript database
One of the best ways to ensure quality of information gathered in a rangeland monitoring program is through a strong and uniform set of trainings. Curriculum development and delivery of monitoring trainings pose unique challenges that are not seen in academic settings. Participants come from a rang...
Hopman, J; Hakizimana, B; Meintjes, W A J; Nillessen, M; de Both, E; Voss, A; Mehtar, S
2016-01-01
Hospital-associated infections (HAIs) are more frequently encountered in low- than in high-resource settings. There is a need to identify and implement feasible and sustainable approaches to strengthen HAI prevention in low-resource settings. To evaluate the biological contamination of routinely cleaned mattresses in both high- and low-resource settings. In this two-stage observational study, routine manual bed cleaning was evaluated at two university hospitals using adenosine triphosphate (ATP). Standardized training of cleaning personnel was achieved in both high- and low-resource settings. Qualitative analysis of the cleaning process was performed to identify predictors of cleaning outcome in low-resource settings. Mattresses in low-resource settings were highly contaminated prior to cleaning. Cleaning significantly reduced biological contamination of mattresses in low-resource settings (P < 0.0001). After training, the contamination observed after cleaning in both the high- and low-resource settings seemed comparable. Cleaning with appropriate type of cleaning materials reduced the contamination of mattresses adequately. Predictors for mattresses that remained contaminated in a low-resource setting included: type of product used, type of ward, training, and the level of contamination prior to cleaning. In low-resource settings mattresses were highly contaminated as noted by ATP levels. Routine manual cleaning by trained staff can be as effective in a low-resource setting as in a high-resource setting. We recommend a multi-modal cleaning strategy that consists of training of domestic services staff, availability of adequate time to clean beds between patients, and application of the correct type of cleaning products. Copyright © 2015 The Healthcare Infection Society. Published by Elsevier Ltd. All rights reserved.
Teaching between-class generalization of toy play behavior to handicapped children.
Haring, T G
1985-01-01
In this study, young children with severe and moderate handicaps were taught to generalize play responses. A multiple baseline across responses design, replicated with four children, was used to assess the effects of generalization training within four sets of toys on generalization to untrained toys from four other sets. The responses taught were unique for each set of toys. Across the four participants, training to generalize within-toy sets resulted in complete between-class generalization in 11 sets, partial generalization in 3 sets, and no generalization in 2 sets. No generalization occurred to another class of toys that differed from the previous sets in that they produced a reaction to the play movement (e.g., pianos). Implications for conducting research using strategies based on class interrelationships in training contexts are discussed. PMID:4019349
Ma, J; Meng, X D; Luo, H M; Zhou, H C; Qu, S L; Liu, X T; Dai, Z
2016-06-01
This study aimed to understand the current management of education and training, and the training needs, among new employees working at provincial CDCs in China during 2012-2014, so as to provide a basis for setting up related programs at the CDC level. Based on data gathered through questionnaire surveys run by the CDCs of 32 provinces and 5 specifically designated cities, Microsoft Excel was used to analyze the current management of education and training for new employees. There were 156 management staff members working on education and training programs in 36 CDCs, 70% of whom had received intermediate or higher levels of education. Large differences in training hardware were seen between regions. In 2014 there were 1214 teaching staff, 66 percent of them in public health or related professional areas. From 2012 to 2014, 5084 new employees took part in pre/post training programs, with funding of 750 thousand RMB Yuan. 99.5% of the new employees expressed a need for further training, while 74% expected a 2-5 day training program to be implemented. 79% of the new staff members regarded practice as the most appropriate method of training. Institutional programs for education and training at the CDCs need to be clarified and a management team organized. It is important to provide more financial support for the hardware, software and human resources of training programs set up for new staff members at all levels of CDCs.
Chang, Kuei-Hu; Chang, Yung-Chia; Chain, Kai; Chung, Hsiang-Yu
2016-01-01
The advancement of high technologies and the arrival of the information age have changed modern warfare. The military forces of many countries have partially replaced real training drills with training simulation systems to achieve combat readiness. However, many types of training simulation systems are used in military settings. In addition, differences in system set-up time, functions, the environment, and the competency of system operators, as well as incomplete information, have made it difficult to evaluate the performance of training simulation systems. To address these problems, this study integrated the analytic hierarchy process (AHP), soft set theory, and the fuzzy linguistic representation model to evaluate the performance of various training simulation systems. Furthermore, importance–performance analysis was adopted to examine the influence of cost savings and training safety of training simulation systems. The findings of this study are expected to facilitate the application of military training simulation systems, avoid wasting resources (e.g., low utility and idle time), and provide data for subsequent applications and analysis. To verify the proposed method, numerical examples of the performance evaluation of training simulation systems were adopted and compared with the numerical results of an AHP and a novel AHP-based ranking technique. The results verified that not only could expert-provided questionnaire information be fully considered to lower the repetition rate of performance ranking, but a two-dimensional graph could also be used to help administrators allocate limited resources, thereby enhancing the investment benefits and training effectiveness of a training simulation system. PMID:27598390
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mohammed, Irshad; Gnedin, Nickolay Y.
Baryonic effects are amongst the most severe systematics in the tomographic analysis of weak lensing data, which is the principal probe in many future cosmological surveys such as LSST and Euclid. Modeling or parameterizing these effects is essential in order to extract valuable constraints on cosmological parameters. In a recent paper, Eifler et al. (2015) suggested a reduction technique for baryonic effects by conducting a principal component analysis (PCA) and removing the largest baryonic eigenmodes from the data. In this article, we conducted the investigation further and addressed two critical aspects. Firstly, we performed the analysis by separating the simulations into training and test sets, computing a minimal set of principal components from the training set and examining the fits on the test set. We found that using only four parameters, corresponding to the four largest eigenmodes of the training set, the test sets can be fitted thoroughly with an RMS of ~0.0011. Secondly, we explored the significance of outliers, the most exotic/extreme baryonic scenarios, in this method. We found that excluding the outliers from the training set results in a relatively bad fit and degrades the RMS by nearly a factor of 3. Therefore, for a direct employment of this method in the tomographic analysis of weak lensing data, the principal components should be derived from a training set that comprises adequately exotic but reasonable models, such that reality is included inside the parameter domain sampled by the training set. The baryonic effects can be parameterized as the coefficients of these principal components and should be marginalized over the cosmological parameter space.
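The train/test procedure described above can be sketched in a few lines. This is a minimal toy, not the authors' pipeline: the synthetic "simulations", the 30 data bins, the sample sizes, and every name below are assumptions, and NumPy's SVD is used as one common way to obtain principal components.

```python
import numpy as np

rng = np.random.default_rng(0)
N_BINS, N_MODES = 30, 4

# Hypothetical hidden modes that generate every synthetic model vector.
modes = rng.normal(size=(N_MODES, N_BINS))


def simulate(n_models):
    """Draw synthetic model vectors as random mixtures of the hidden modes."""
    return rng.normal(size=(n_models, N_MODES)) @ modes


train, test = simulate(12), simulate(5)

# Principal components come from the training set only.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)


def fit_rms(k):
    """RMS residual of the test set when fitted with the first k components."""
    components = vt[:k]
    coeffs = (test - mean) @ components.T  # fitted coefficients per test model
    residual = (test - mean) - coeffs @ components
    return float(np.sqrt(np.mean(residual ** 2)))


rms_four = fit_rms(4)
rms_two = fit_rms(2)
print(rms_two, rms_four)
```

In this toy the test models lie in the same four-dimensional space as the training models, so four components fit them almost exactly; a test model outside the space sampled by the training set would leave a larger residual, which is the concern the abstract raises about outliers.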
Plagianakos, V P; Magoulas, G D; Vrahatis, M N
2006-03-01
Distributed computing is a process through which a set of computers connected by a network is used collectively to solve a single problem. In this paper, we propose a distributed computing methodology for training neural networks for the detection of lesions in colonoscopy. Our approach is based on partitioning the training set across multiple processors using a parallel virtual machine. In this way, interconnected computers of varied architectures can be used for the distributed evaluation of the error function and gradient values, and, thus, training neural networks utilizing various learning methods. The proposed methodology has large granularity and low synchronization, and has been implemented and tested. Our results indicate that the parallel virtual machine implementation of the training algorithms developed leads to considerable speedup, especially when large network architectures and training sets are used.
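The partitioning idea can be illustrated with a small sketch in which each worker evaluates the error and gradient on its own share of the training set and the partial results are summed. Everything here is an assumption for illustration: a one-weight linear model stands in for the neural network, and Python threads stand in for the parallel virtual machine processes used in the paper.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_error_and_gradient(weights, chunk):
    """Squared error and its gradient for y = w*x + b on one data partition."""
    w, b = weights
    err = grad_w = grad_b = 0.0
    for x, y in chunk:
        r = w * x + b - y
        err += 0.5 * r * r
        grad_w += r * x
        grad_b += r
    return err, grad_w, grad_b


def partition(data, n_parts):
    """Split the training set into n_parts interleaved chunks."""
    return [data[i::n_parts] for i in range(n_parts)]


def distributed_error_and_gradient(weights, data, n_workers=3):
    """Evaluate each partition on its own worker, then sum the partial results."""
    chunks = partition(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(
            pool.map(lambda c: partial_error_and_gradient(weights, c), chunks)
        )
    return tuple(sum(p[i] for p in parts) for i in range(3))


data = [(float(x), 2.0 * x + 1.0) for x in range(10)]  # toy training set
serial = partial_error_and_gradient((0.5, 0.0), data)
parallel = distributed_error_and_gradient((0.5, 0.0), data)
print(serial, parallel)
```

Because the error function is a sum over training patterns, only the partial sums need to be exchanged once per evaluation, which is consistent with the large granularity and low synchronization the abstract mentions.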
Olfactory Blocking and Odorant Similarity in the Honeybee
ERIC Educational Resources Information Center
Gerber, Bertram; Giurfa, Martin; Guerrieri, Fernando; Lachnit, Harald
2005-01-01
Blocking occurs when previous training with a stimulus A reduces (blocks) subsequent learning about a stimulus B, when A and B are trained in compound. The question of whether blocking exists in olfactory conditioning of proboscis extension reflex (PER) in honeybees is under debate. The last published accounts on blocking in honeybees state that…
de Bruijn, Cornelis Marinus; Houterman, Willem; Ploeg, Margreet; Ducro, Bart; Boshuizen, Berit; Goethals, Klaartje; Verdegaal, Elisabeth-Lidwien; Delesalle, Catherine
2017-02-14
Most Friesian horses reach their anaerobic threshold during a standardized exercise test (SET) which requires lower intensity exercise than daily routine training. The aim was to study the strengths and weaknesses of an alternative SET protocol. Two different SETs (SETA and SETB) were applied during a 2-month training period of 9 young Friesian dressage horses. SETB alternated short episodes of canter with trot and walk, lacking the long episodes of cantering applied in SETA. The following parameters were monitored: blood lactic acid (BLA) after cantering, average heart rate (HR) in trot and maximum HR in canter. HR and BLA of SETA and SETB were analyzed using a paired two-sided t-test and the Spearman correlation coefficient (p* < 0.05). BLA after cantering was significantly higher in SETA than in SETB, and maximum HR in canter was significantly higher in SETA than in SETB. The majority of horses showed a significant training response based upon longitudinal follow-up of BLA. Horses with the lowest fitness at the start displayed the largest training response. BLA was significantly lower in week 8 than in week 0, in both SETA and SETB. A significantly decreased BLA level after cantering was noticeable in week 6 in SETA, whereas in SETB it appeared only as of week 8. In SETA a very strong correlation between BLA and average HR at trot was found throughout the entire training period, but not for canter. Young Friesian horses do reach their anaerobic threshold during a SET which requires lower intensity than daily routine training. Therefore close monitoring throughout training is warranted. Longitudinal follow-up of BLA, and not of HR, is suitable to assess training response. In the current study, horses that started with the lowest fitness level showed the largest training response. During training, monitoring HR in trot rather than in canter is advised. SETB is best suited as a template for daily training in the aerobic window.
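For readers unfamiliar with the paired comparison used above, the following sketch shows how a paired t statistic is computed for two protocols applied to the same animals. The lactate values are invented placeholders, not data from the study, and only the test statistic is computed (no p-value).

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Mean difference and t statistic for paired samples (df = n - 1)."""
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err


# Hypothetical blood lactate values (mmol/L) for five horses under each protocol.
bla_protocol_a = [4.1, 3.8, 5.0, 4.4, 3.9]
bla_protocol_b = [3.0, 3.1, 3.6, 3.5, 2.9]

mean_diff, t_stat = paired_t_statistic(bla_protocol_a, bla_protocol_b)
print(mean_diff, t_stat)
```

With five pairs there are 4 degrees of freedom, for which the two-sided 5% critical value of t is 2.776; a statistic above that would be called significant at that level.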
Hoffman, Erin M; Curran, Allison M; Dulgerian, Nishan; Stockham, Rex A; Eckenrode, Brian A
2009-04-15
Law enforcement agencies frequently use canines trained to detect the odor of human decomposition to aid in determining the location of clandestine burials and human remains deposited or scattered on the surface. However, few studies attempt to identify the specific volatile organic compounds (VOCs) that elicit an appropriate response from victim recovery (VR) canines. Solid-phase microextraction (SPME) was combined with gas chromatography-mass spectrometry (GC-MS) to identify the VOCs released into the headspace associated with 14 separate tissue samples of human remains previously used for VR canine training. The headspace was found to contain various classes of VOCs, including acids, alcohols, aldehydes, halogens, aromatic hydrocarbons, ketones, and sulfides. Analysis of the data indicates that the VOCs associated with human decomposition share similarities across regions of the body and across types of tissue. However, sufficient differences exist to warrant VR canine testing to identify potential mimic odor chemical profiles that can be used as training aids. The resulting data will assist in the identification of the most suitable mixture and relative concentrations of VOCs to appropriately train VR canines.
Selective attention to visual compound stimuli in squirrel monkeys (Saimiri sciureus).
Ploog, Bertram O
2011-05-01
Five squirrel monkeys served under a simultaneous discrimination paradigm with visual compound stimuli that allowed measurement of excitatory and inhibitory control exerted by individual stimulus components (form and luminance/"color"), which could not be presented in isolation (i.e., form could not be presented without color). After performance exceeded a criterion of 75% correct during training, unreinforced test trials with stimuli comprising recombined training stimulus components were interspersed while the overall reinforcement rate remained constant for training and testing. The training-testing series was then repeated with reversed reinforcement contingencies. The findings were that color acquired greater excitatory control than form under the original condition, that no such difference was found for the reversal condition or for inhibitory control under either condition, and that overall inhibitory control was less pronounced than excitatory control. The remarkably accurate performance throughout suggested that a forced 4-s delay between the stimulus presentation and the opportunity to respond was effective in reducing "impulsive" responding, which has implications for suppressing impulsive responding in children with autism and with attention deficit disorder. Copyright © 2011 Elsevier B.V. All rights reserved.
Yu, Jingkai; Finley, Russell L
2009-01-01
High-throughput experimental and computational methods are generating a wealth of protein-protein interaction data for a variety of organisms. However, data produced by current state-of-the-art methods include many false positives, which can hinder the analyses needed to derive biological insights. One way to address this problem is to assign confidence scores that reflect the reliability and biological significance of each interaction. Most previously described scoring methods use a set of likely true positives to train a model to score all interactions in a dataset. A single positive training set, however, may be biased and not representative of true interaction space. We demonstrate a method to score protein interactions by utilizing multiple independent sets of training positives to reduce the potential bias inherent in using a single training set. We used a set of benchmark yeast protein interactions to show that our approach outperforms other scoring methods. Our approach can also score interactions across data types, which makes it more widely applicable than many previously proposed methods. We applied the method to protein interaction data from both Drosophila melanogaster and Homo sapiens. Independent evaluations show that the resulting confidence scores accurately reflect the biological significance of the interactions.
Comba, Peter; Martin, Bodo; Sanyal, Avik; Stephan, Holger
2013-08-21
A QSPR scheme for the computation of lipophilicities of ⁶⁴Cu complexes was developed with a training set of 24 tetraazamacrocyclic and bispidine-based Cu(II) compounds and their experimentally available 1-octanol-water distribution coefficients. A minimum number of physically meaningful parameters were used in the scheme, and these are primarily based on data available from molecular mechanics calculations, using an established force field for Cu(II) complexes and a recently developed scheme for the calculation of fluctuating atomic charges. The developed model was also applied to an independent validation set and was found to accurately predict distribution coefficients of potential ⁶⁴Cu PET (positron emission tomography) systems. A possible next step would be the development of a QSAR-based biodistribution model to track the uptake of imaging agents in different organs and tissues of the body. It is expected that such simple, empirical models of lipophilicity and biodistribution will be very useful in the design and virtual screening of positron emission tomography (PET) imaging agents.
Haftka, Joris J H; Parsons, John R; Govers, Harrie A J
2006-11-24
A gas chromatographic method using Kováts retention indices has been applied to determine the liquid vapour pressure (P(i)), enthalpy of vaporization (ΔH(i)) and difference in heat capacity between gas and liquid phase (ΔC(i)) for a group of polycyclic aromatic hydrocarbons (PAHs). This group consists of 19 unsubstituted, methylated and sulphur-containing PAHs. Differences in log P(i) of -0.04 to +0.99 log units at 298.15 K were observed between experimental values and data from effusion and gas saturation studies. These differences in log P(i) have been fitted with multilinear regression, resulting in a compound- and temperature-dependent correction. Over a temperature range from 273.15 to 423.15 K, differences in corrected log P(i) of a training set (-0.07 to +0.03 log units) and a validation set (-0.17 to +0.19 log units) were within calculated error ranges. The corrected vapour pressures also showed good agreement with other GC-determined vapour pressures (average -0.09 log units).
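As background to the retention indices mentioned above, the sketch below evaluates the standard isothermal Kováts index, I = 100 [n + (log t'x - log t'n) / (log t'(n+1) - log t'n)], where t' are adjusted retention times and n is the carbon number of the n-alkane eluting just before the analyte. The retention times are hypothetical, and the step from indices to vapour pressures used in the paper is not reproduced here.

```python
import math


def kovats_index(t_analyte, t_alkane_n, t_alkane_n1, n):
    """Isothermal Kovats retention index from adjusted retention times.

    t_alkane_n and t_alkane_n1 belong to the n-alkanes with n and n + 1
    carbon atoms that bracket the analyte.
    """
    log_span = math.log10(t_alkane_n1) - math.log10(t_alkane_n)
    return 100.0 * (n + (math.log10(t_analyte) - math.log10(t_alkane_n)) / log_span)


# Hypothetical adjusted retention times in minutes.
t_c14, t_c15 = 8.0, 12.5
index = kovats_index(10.0, t_c14, t_c15, n=14)
print(round(index, 1))
```

By construction the bracketing alkanes themselves score 100n and 100(n + 1), so any analyte eluting between them falls between those two values.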
SAR matrices: automated extraction of information-rich SAR tables from large compound data sets.
Wassermann, Anne Mai; Haebel, Peter; Weskamp, Nils; Bajorath, Jürgen
2012-07-23
We introduce the SAR matrix data structure that is designed to elucidate SAR patterns produced by groups of structurally related active compounds, which are extracted from large data sets. SAR matrices are systematically generated and sorted on the basis of SAR information content. Matrix generation is computationally efficient and enables processing of large compound sets. The matrix format is reminiscent of SAR tables, and SAR patterns revealed by different categories of matrices are easily interpretable. The structural organization underlying matrix formation is more flexible than standard R-group decomposition schemes. Hence, the resulting matrices capture SAR information in a comprehensive manner.
Trushkov, V F; Perminov, K A; Sapozhnikova, V V; Ignatova, O L
2013-01-01
The connection between thermodynamic properties and toxicity parameters of chemical substances was determined. The data obtained are used for the evaluation of toxicity and hygienic rate setting of chemical compounds. The relationship between enthalpy and toxicity of chemical compounds has been established. Orthogonal planning of the experiment was carried out in the course of the investigations. An equation of unified hygienic rate setting for combined, complex, conjunct influence on the organism is presented. Prospects for the determination of toxicity and a methodology of unified hygienic rate setting for combined, complex, conjunct influence on the organism are presented.
ERIC Educational Resources Information Center
Mills, John; Bowman, Kaye; Crean, David; Ranshaw, Danielle
2012-01-01
This literature review examines the available research on skill sets. It provides background for a larger research project "Workforce skills development and engagement in training through skill sets," the report of which will be released early next year. This paper outlines the origin of skill sets and explains the difference between…
Marks, Zach
2014-01-01
Today's health-system pharmacists and those in independent practice face risks, including exposure to potent cytotoxic drugs via needlesticks, that are associated with preparing intravenous compounded sterile preparations for immediate use. Healthcare givers who administer such medications also risk exposure to needlesticks. Those hazards can be minimized when the pharmacist thoroughly understands and complies with current standard operating procedures for preparing intravenous compounded sterile preparations and the healthcare giver uses a needle-free system for drug reconstitution and administration. The components of an overall needlestick risk-reduction strategy to ensure safety in the preparation (and eventual administration) of intravenous compounded sterile preparations should therefore include the use of needle-free connection and administration devices as well as hand-hygiene training, aseptic technique competency evaluation and training, and the maximum use of commercially available or ready-to-use dosage forms. This article, which focuses on the pharmacist's use of a needle-free reconstitution and transfer system for compounded sterile intravenous drug solutions, uses as an example the Vial2Bag (Medimop Medical Projects, Ltd., [a subsidiary of West Pharmaceutical Services, Inc., Exton, Pennsylvania], Ra'anana, Israel), which complies with United States Pharmacopeia Chapter <797> standards. Features of that system are summarized for easy reference.
A simple method to derive bounds on the size and to train multilayer neural networks
NASA Technical Reports Server (NTRS)
Sartori, Michael A.; Antsaklis, Panos J.
1991-01-01
A new derivation is presented for the bounds on the size of a multilayer neural network to exactly implement an arbitrary training set; namely, the training set can be implemented with zero error with two layers and with the number of hidden-layer neurons n1 ≥ p - 1. The derivation does not require the separation of the input space by particular hyperplanes, as in previous derivations. The weights for the hidden layer can be chosen almost arbitrarily, and the weights for the output layer can be found by solving n1 + 1 linear equations. The method presented exactly solves (M), the multilayer neural network training problem, for any arbitrary training set.
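The construction summarized above can be tried on a toy problem. The sketch below assumes that p denotes the number of training patterns (the abstract does not define it), uses scalar inputs, tanh hidden units with arbitrarily fixed weights, and a plain Gaussian elimination routine; all names and numbers are illustrative and not taken from the paper.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def hidden_features(x, hidden):
    """Outputs of the tanh hidden units plus a constant 1.0 for the output bias."""
    return [math.tanh(w * x + b) for w, b in hidden] + [1.0]


def predict(x, hidden, out):
    return sum(w * f for w, f in zip(out, hidden_features(x, hidden)))


# p = 4 training patterns, so p - 1 = 3 hidden neurons (weights picked by hand).
xs = [-1.0, -0.3, 0.4, 1.2]
ys = [0.5, -1.0, 2.0, 0.0]
hidden = [(1.0, 0.0), (2.0, -1.0), (-1.5, 0.5)]

# One linear equation per pattern: 3 output weights + 1 bias = 4 unknowns.
out = solve_linear([hidden_features(x, hidden) for x in xs], ys)
errors = [abs(predict(x, hidden, out) - y) for x, y in zip(xs, ys)]
print(max(errors))
```

The hidden weights only need to make the 4 x 4 system non-singular, which matches the abstract's remark that they can be chosen almost arbitrarily.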
Kiefl, Johannes; Cordero, Chiara; Nicolotti, Luca; Schieberle, Peter; Reichenbach, Stephen E; Bicchi, Carlo
2012-06-22
The continuous interest in non-targeted profiling induced the development of tools for automated cross-sample analysis. Such tools were found to be selective or not comprehensive thus delivering a biased view on the qualitative/quantitative peak distribution across 2D sample chromatograms. Therefore, the performance of non-targeted approaches needs to be critically evaluated. This study focused on the development of a validation procedure for non-targeted, peak-based, GC×GC-MS data profiling. The procedure introduced performance parameters such as specificity, precision, accuracy, and uncertainty for a profiling method known as Comprehensive Template Matching. The performance was assessed by applying a three-week validation protocol based on CITAC/EURACHEM guidelines. Optimized ¹D and ²D retention times search windows, MS match factor threshold, detection threshold, and template threshold were evolved from two training sets by a semi-automated learning process. The effectiveness of proposed settings to consistently match 2D peak patterns was established by evaluating the rate of mismatched peaks and was expressed in terms of results accuracy. The study utilized 23 different 2D peak patterns providing the chemical fingerprints of raw and roasted hazelnuts (Corylus avellana L.) from different geographical origins, of diverse varieties and different roasting degrees. The validation results show that non-targeted peak-based profiling can be reliable with error rates lower than 10% independent of the degree of analytical variance. The optimized Comprehensive Template Matching procedure was employed to study hazelnut roasting profiles and in particular to find marker compounds strongly dependent on the thermal treatment, and to establish the correlation of potential marker compounds to geographical origin and variety/cultivar and finally to reveal the characteristic release of aroma active compounds. Copyright © 2012 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Sidorov, Pavel; Gaspar, Helena; Marcou, Gilles; Varnek, Alexandre; Horvath, Dragos
2015-12-01
Intuitive, visual rendering—mapping—of high-dimensional chemical spaces (CS), is an important topic in chemoinformatics. Such maps were so far dedicated to specific compound collections—either limited series of known activities, or large, even exhaustive enumerations of molecules, but without associated property data. Typically, they were challenged to answer some classification problem with respect to those same molecules, admired for their aesthetical virtues and then forgotten—because they were set-specific constructs. This work wishes to address the question whether a general, compound set-independent map can be generated, and the claim of "universality" quantitatively justified, with respect to all the structure-activity information available so far—or, more realistically, an exploitable but significant fraction thereof. The "universal" CS map is expected to project molecules from the initial CS into a lower-dimensional space that is neighborhood behavior-compliant with respect to a large panel of ligand properties. Such map should be able to discriminate actives from inactives, or even support quantitative neighborhood-based, parameter-free property prediction (regression) models, for a wide panel of targets and target families. It should be polypharmacologically competent, without requiring any target-specific parameter fitting. This work describes an evolutionary growth procedure of such maps, based on generative topographic mapping, followed by the validation of their polypharmacological competence. Validation was achieved with respect to a maximum of exploitable structure-activity information, covering all of Homo sapiens proteins of the ChEMBL database, antiparasitic and antiviral data, etc. Five evolved maps satisfactorily solved hundreds of activity-based ligand classification challenges for targets, and even in vivo properties independent from training data. 
They also stood chemogenomics-related challenges, as cumulated responsibility vectors obtained by mapping of target-specific ligand collections were shown to represent validated target descriptors, complying with currently accepted target classification in biology. Therefore, they represent, in our opinion, a robust and well documented answer to the key question "What is a good CS map?"
Surveillance system and method having parameter estimation and operating mode partitioning
NASA Technical Reports Server (NTRS)
Bickford, Randall L. (Inventor)
2003-01-01
A system and method for monitoring an apparatus or process asset including partitioning an unpartitioned training data set into a plurality of training data subsets each having an operating mode associated thereto; creating a process model comprised of a plurality of process submodels each trained as a function of at least one of the training data subsets; acquiring a current set of observed signal data values from the asset; determining an operating mode of the asset for the current set of observed signal data values; selecting a process submodel from the process model as a function of the determined operating mode of the asset; calculating a current set of estimated signal data values from the selected process submodel for the determined operating mode; and outputting the calculated current set of estimated signal data values for providing asset surveillance and/or control.
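The partition, train, select and estimate flow described in this record can be sketched as follows. This is only an illustration under stated assumptions: the mode rule (a threshold on the first signal), the "submodel" (a per-mode mean of each signal), and every name below are hypothetical stand-ins, not the patented method.

```python
from collections import defaultdict
from statistics import fmean


def determine_mode(observation, threshold=50.0):
    """Hypothetical mode rule: classify by the first signal value."""
    return "high" if observation[0] >= threshold else "low"


def train_submodels(training_data):
    """Partition the training data by mode, then fit one submodel per mode.

    Each 'submodel' here is simply the per-signal mean of its subset.
    """
    subsets = defaultdict(list)
    for obs in training_data:
        subsets[determine_mode(obs)].append(obs)
    return {
        mode: [fmean(column) for column in zip(*rows)]
        for mode, rows in subsets.items()
    }


def estimate(process_model, observation):
    """Select the submodel for the observed mode and return its estimate."""
    mode = determine_mode(observation)
    return mode, process_model[mode]


# Toy two-signal training set (assumed values).
training = [(10.0, 1.0), (20.0, 3.0), (80.0, 9.0), (90.0, 11.0)]
model = train_submodels(training)

current = (85.0, 10.5)
mode, estimated = estimate(model, current)
# Residuals between observed and estimated signals could drive surveillance.
residual = [o - e for o, e in zip(current, estimated)]
print(mode, estimated, residual)
```

In a real system the submodels would be proper estimators and the mode logic would come from the asset's operating conditions; the point of the sketch is only that estimation is routed through the submodel matching the current mode.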
Schoenfeld, Brad J; Contreras, Bret; Vigotsky, Andrew D; Peterson, Mark
2016-12-01
The purpose of the present study was to evaluate muscular adaptations between heavy- and moderate-load resistance training (RT) with all other variables controlled between conditions. Nineteen resistance-trained men were randomly assigned to either a strength-type RT routine (HEAVY) that trained in a loading range of 2-4 repetitions per set (n = 10) or a hypertrophy-type RT routine (MODERATE) that trained in a loading range of 8-12 repetitions per set (n = 9). Training was carried out 3 days a week for 8 weeks. Both groups performed 3 sets of 7 exercises for the major muscle groups of the upper and lower body. Subjects were tested pre- and post-study for: 1 repetition maximum (RM) strength in the bench press and squat, upper body muscle endurance, and muscle thickness of the elbow flexors, elbow extensors, and lateral thigh. Results showed statistically greater increases in 1RM squat strength favoring HEAVY compared to MODERATE. Alternatively, statistically greater increases in lateral thigh muscle thickness were noted for MODERATE versus HEAVY. These findings indicate that heavy load training is superior for maximal strength goals while moderate load training is more suited to hypertrophy-related goals when an equal number of sets are performed between conditions.
Programming "loose training" as a strategy to facilitate language generalization.
Campbell, C R; Stremel-Campbell, K
1982-01-01
This study investigated the generalization of spontaneous complex language behavior across a nontraining setting and the durability of generalization as a result of programming and "loose training" strategy. A within-subject, across-behaviors multiple-baseline design was used to examine the performance of two moderately retarded students in the use of is/are across three syntactic structures (i.e., "wh" questions, "yes/no" reversal questions, and statements). The language training procedure used in this study represented a functional example of programming "loose training." The procedure involved conducting concurrent language training within the context of an academic training task, and establishing a functional reduction in stimulus control by permitting the student to initiate a language response based on a wide array of naturally occurring stimulus events. Concurrent probes were conducted in the free play setting to assess the immediate generalization and the durability of the language behaviors. The results demonstrated that "loose training" was effective in establishing a specific set of language responses with the participants of this investigation. Further, both students demonstrated spontaneous use of the language behavior in the free play generalization setting and a trend was clearly evident for generalization to continue across time. Thus, the methods used appear to be successful for training the use of is/are in three syntactic structures. PMID:7118759
21 CFR 880.5440 - Intravascular administration set.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Intravascular administration set. 880.5440 Section 880.5440 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES... Compounding Systems; Final Guidance for Industry and FDA Reviewers.” Pharmacy compounding systems classified...
Analysis of precision and accuracy in a simple model of machine learning
NASA Astrophysics Data System (ADS)
Lee, Julian
2017-12-01
Machine learning is a procedure in which a model of the world is constructed from a training set of examples. The model should capture the relevant features of the training set and, at the same time, make correct predictions for examples not included in the training set. I consider polynomial regression, the simplest method of learning, and analyze the accuracy and precision for different levels of model complexity.
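A minimal sketch of the setting this abstract describes: polynomial models of increasing degree fitted to one training set and scored on held-out points. The target function, noise level, sample sizes and all function names are assumptions made for illustration; the paper's own analysis is not reproduced here.

```python
import random


def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (A^T A) c = A^T y."""
    n = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(ata[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (aty[i] - tail) / ata[i][i]
    return coeffs  # coeffs[i] multiplies x**i


def mse(coeffs, xs, ys):
    """Mean squared error of the polynomial on the given points."""
    preds = [sum(c * x ** i for i, c in enumerate(coeffs)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)


rng = random.Random(0)          # fixed seed so the run is repeatable
target = lambda x: x ** 2       # assumed "true" relationship

train_x = [i / 5 - 1 for i in range(11)]
train_y = [target(x) + rng.gauss(0, 0.1) for x in train_x]
test_x = [i / 10 - 0.95 for i in range(20)]
test_y = [target(x) + rng.gauss(0, 0.1) for x in test_x]

errors = {}
for degree in (1, 2, 5):
    c = polyfit(train_x, train_y, degree)
    errors[degree] = (mse(c, train_x, train_y), mse(c, test_x, test_y))
    print(degree, errors[degree])
```

Because the models are nested, the training error cannot increase with degree; the held-out error is the quantity that reveals whether added complexity is actually helping.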
HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set.
Nunes, Kelly; Zheng, Xiuwen; Torres, Margareth; Moraes, Maria Elisa; Piovezan, Bruno Z; Pontes, Gerlandia N; Kimura, Lilian; Carnavalli, Juliana E P; Mingroni Netto, Regina C; Meyer, Diogo
2016-03-01
Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource to association studies and evolutionary investigation of the MHC region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the inclusion of samples with various ancestries is an important pre-requisite in studies of admixed populations. We assess the accuracy of HLA imputation using 1000 Genomes Project data as a training set, applying it to a highly admixed Brazilian population, the Quilombos from the state of São Paulo. To assess accuracy, we compared imputed and experimentally determined genotypes for 146 samples at 4 HLA classical loci. We found imputation accuracies of 82.9%, 81.8%, 94.8% and 86.6% for HLA-A, -B, -C and -DRB1 respectively (two-field resolution). Accuracies were improved when we included a subset of Quilombo individuals in the training set. We conclude that the 1000 Genomes data is a valuable resource for construction of training sets due to the diversity of ancestries and the potential for a large overlap of SNPs with the target population. We also show that tailoring training sets to features of the target population substantially enhances imputation accuracy. Copyright © 2016 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.
Gurd, Brendon J; Patel, Jugal; Edgett, Brittany A; Scribbans, Trisha D; Quadrilatero, Joe; Fischer, Steven L
2018-05-28
Whole body sprint-interval training (WB-SIT) represents a mode of exercise training that is both time-efficient and does not require access to an exercise facility. The current study examined the feasibility of implementing a WB-SIT intervention in a workplace setting. A total of 747 employees from a large office building were invited to participate with 31 individuals being enrolled in the study. Anthropometrics, aerobic fitness, core and upper body strength, and lower body mobility were assessed before and after a 12-week exercise intervention consisting of 2-4 training sessions per week. Each training session required participants to complete 8, 20-second intervals (separated by 10 seconds of rest) of whole body exercise. Proportion of participation was 4.2% while the response rate was 35% (11/31 participants completed post training testing). In responders, compliance with prescribed training was 83±17%, and significant (p < 0.05) improvements were observed for aerobic fitness, push-up performance and lower body mobility. These results demonstrate the efficacy of WB-SIT for improving fitness and mobility in an office setting, but highlight the difficulties in achieving high rates of participation and response in this setting.
1984-04-01
compounds time, understanding and coordination problems. Just too many people in the process. In fact, there are numerous versions of a task with the... sometimes this caused interruptions. This was further compounded by the fact that the analysis was started and then stopped, when the first analyst... productive. Discrepancies - The major discrepancy was the use of Anti-Seize Compound. It is applied to components as a light, thin coat to prevent i..re, any
Go/No-Go Procedure with Compound Stimuli with Children with Autism
ERIC Educational Resources Information Center
Silva, Rafael Augusto; Debert, Paula
2017-01-01
The go/no-go with compound stimuli is an alternative to matching-to-sample to produce conditional and emergent relations in adults. The aim of this study was to evaluate the effectiveness of this procedure with two children diagnosed with autism. We trained and tested participants to respond to conditional relations among arbitrary stimuli using…
ERIC Educational Resources Information Center
Kehoe, E. James; Ludvig, Elliot A.; Sutton, Richard S.
2013-01-01
Rabbits were classically conditioned using compounds of tone and light conditioned stimuli (CSs) presented with either simultaneous onsets (Experiment 1) or serial onsets (Experiment 2) in a delay conditioning paradigm. Training with the simultaneous compound reduced the likelihood of a conditioned response (CR) to the individual CSs ("mutual…
Past Experience Influences the Processing of Stimulus Compounds in Human Pavlovian Conditioning
ERIC Educational Resources Information Center
Melchers, Klaus G.; Lachnit, Harold; Shanks, David R.
2004-01-01
In two human skin conductance conditioning experiments we investigated whether processing of stimulus compounds can be influenced by past experience. Participants were either pre-trained with a discrimination problem that could be solved elementally (A+, B-, AB+, C- in Experiment 1 and A+, AB+, C-, CB- in Experiment 2) or one that required a…
Education and Training for Clinical Neuropsychologists in Integrated Care Settings.
Roper, Brad L; Block, Cady K; Osborn, Katie; Ready, Rebecca E
2018-05-01
The increasing importance of integrated care necessitates that education and training experiences prepare clinical neuropsychologists for competent practice in integrated care settings, which includes (a) general competence related to an integrated/interdisciplinary approach and (b) competence specific to the setting. Formal neuropsychology training prepares neuropsychologists with a wide range of knowledge and skills in assessment, intervention, teaching/supervision, and research that are relevant to such settings. However, less attention has been paid to the knowledge and skills that directly address functioning within integrated teams, such as the ability to develop, maintain, and expand collaboration across disciplines, bidirectional clinical-research translation and implementation in integrated team settings, and how such collaboration contributes to clinical and research activities. Foundational knowledge and skills relevant to interdisciplinary systems have been articulated as part of competencies for entry into clinical neuropsychology, but their emphasis in education and training programs is unclear. Recommendations and resources are provided regarding how competencies relevant to integrated care can be provided across the continuum of education and training (i.e., doctoral, internship, postdoctoral, and post-licensure).
Competition between conceptual relations affects compound recognition: the role of entropy.
Schmidtke, Daniel; Kuperman, Victor; Gagné, Christina L; Spalding, Thomas L
2016-04-01
Previous research has suggested that the conceptual representation of a compound is based on a relational structure linking the compound's constituents. Existing accounts of the visual recognition of modifier-head or noun-noun compounds posit that the process involves the selection of a relational structure out of a set of competing relational structures associated with the same compound. In this article, we employ the information-theoretic metric of entropy to gauge relational competition and investigate its effect on the visual identification of established English compounds. The data from two lexical decision megastudies indicates that greater entropy (i.e., increased competition) in a set of conceptual relations associated with a compound is associated with longer lexical decision latencies. This finding indicates that there exists competition between potential meanings associated with the same complex word form. We provide empirical support for conceptual composition during compound word processing in a model that incorporates the effect of the integration of co-activated and competing relational information.
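The entropy measure referred to above can be illustrated with Shannon entropy over the distribution of candidate relations for one compound. The relation labels and counts below are invented for illustration, and the function name is hypothetical; how the study actually estimated relation frequencies is not stated in this abstract.

```python
from math import log2


def relation_entropy(counts):
    """Shannon entropy (in bits) of a relation frequency distribution."""
    total = sum(counts.values())
    probs = [c / total for c in counts.values() if c > 0]
    return -sum(p * log2(p) for p in probs)


# Hypothetical relation counts for two compounds.
dominated = {"MADE OF": 8, "FOR": 1, "LOCATED": 1}            # one clear winner
contested = {"MADE OF": 4, "FOR": 4, "LOCATED": 4, "HAS": 4}  # evenly split

print(relation_entropy(dominated))  # lower entropy: little competition
print(relation_entropy(contested))  # higher entropy: strong competition
```

Under the abstract's finding, the compound with the more evenly split (higher entropy) relation set would be expected to show longer lexical decision latencies.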
[An odour of disease and decay: the nose as a diagnostic instrument].
Bomers, Marije K; Smulders, Yvo M
2015-01-01
Infectious diseases and cancer change a patient's metabolism and hence the metabolic compounds produced. The composition of volatile organic compounds (VOCs) in exhaled breath or urine or stool samples can therefore be characteristic of a particular disease. In recent years many studies have been conducted into the training of animals, including dogs, to recognise diseases by smell. Besides trained animals, electronic noses (e-noses) are also being developed. These devices can identify disease-specific odour profiles in VOCs. Although the results of research in the field of scent diagnosis are promising, the medical community remains largely sceptical. We discuss applications of scent detection as a diagnostic tool in modern medicine.
DOE Office of Scientific and Technical Information (OSTI.GOV)
King, R.D.; Srinivasan, A.
1996-10-01
The machine learning program Progol was applied to the problem of forming the structure-activity relationship (SAR) for a set of compounds tested for carcinogenicity in rodent bioassays by the U.S. National Toxicology Program (NTP). Progol is the first inductive logic programming (ILP) algorithm to use a fully relational method for describing chemical structure in SARs, based on using atoms and their bond connectivities. Progol is well suited to forming SARs for carcinogenicity as it is designed to produce easily understandable rules (structural alerts) for sets of noncongeneric compounds. The Progol SAR method was tested by prediction of a set of compounds that have been widely predicted by other SAR methods (the compounds used in the NTP's first round of carcinogenesis predictions). For these compounds no method (human or machine) was significantly more accurate than Progol. Progol was the most accurate method that did not use data from biological tests on rodents (however, the difference in accuracy is not significant). The Progol predictions were based solely on chemical structure and the results of tests for Salmonella mutagenicity. Using the full NTP database, the prediction accuracy of Progol was estimated to be 63% (±3%) using 5-fold cross validation. A set of structural alerts for carcinogenesis was automatically generated and the chemical rationale for them investigated; these structural alerts are statistically independent of the Salmonella mutagenicity. Carcinogenicity is predicted for the compounds used in the NTP's second round of carcinogenesis predictions. The results for prediction of carcinogenesis, taken together with the previous successful applications of predicting mutagenicity in nitroaromatic compounds, and inhibition of angiogenesis by suramin analogues, show that Progol has a role to play in understanding the SARs of cancer-related compounds. 29 refs., 2 figs., 4 tabs.
ERIC Educational Resources Information Center
Iyioke, Ifeoma Chika
2013-01-01
This dissertation describes a design for training, in accordance with probability judgment heuristics principles, for the Angoff standard setting method. The new training with instruction, practice, and feedback tailored to the probability judgment heuristics principles was called the Heuristic training and the prevailing Angoff method training…
Replacing Maladaptive Speech with Verbal Labeling Responses: An Analysis of Generalized Responding.
ERIC Educational Resources Information Center
Foxx, R. M.; And Others
1988-01-01
Three mentally handicapped students (aged 13, 36, and 40) with maladaptive speech received training to answer questions with verbal labels. The results of their cues-pause-point training showed that the students replaced their maladaptive speech with correct labels (answers) to questions in the training setting and three generalization settings.…
A Model for Teaching Rational Behavior Therapy in a Public School Setting.
ERIC Educational Resources Information Center
Patton, Patricia L.
A training model for the use of rational behavior therapy (RBT) with emotionally disturbed adolescents in a school setting is presented, including a structured, didactic format consisting of five basic RBT training techniques. The training sessions, lasting 10 weeks each, are described. Also presented is the organization for the actual classroom…
Electronic Nose: Evaluation of Kamina Prototype Unit
NASA Technical Reports Server (NTRS)
Schattke, Nathan
2001-01-01
The Kamina, Sam and Cyranose electronic nose systems were evaluated and partially trained. Much work was performed on the Kamina as it has the ability to respond to low (less than 10 ppb) concentrations of hydrazine compounds. We were able to tell the difference between Hydrazine (Hz) and Monomethylhydrazine (MMH) in standard clean humid air. We were able to detect MMH in reduced pressure (1/3 atm) at about 250 ppb; however, the training set was too far from the real situation to be useful now. Various engineering and usability aspects of both noses were noted, especially the software. One serious physical engineering flaw was remedied in the Kamina system. A gas flow manifold was created for the Sam system. Different chips were evaluated for the Kamina system. It is still unclear if they can be exchanged without retraining the software. The Sam Detect commercial unit was evaluated for solvent detection and evaluation. It was able to successfully identify some solvents. The Cyranose was observed and evaluated for two days. It has the ability to detect gases at the 100 parts per million level but not at the 10 parts per billion level. It is very sensitive to humidity changes; there is software to partially handle this.
Novel naïve Bayes classification models for predicting the chemical Ames mutagenicity.
Zhang, Hui; Kang, Yan-Li; Zhu, Yuan-Yuan; Zhao, Kai-Xia; Liang, Jun-Yu; Ding, Lan; Zhang, Teng-Guo; Zhang, Ji
2017-06-01
Prediction of drug candidates for mutagenicity is a regulatory requirement since mutagenic compounds could pose a toxic risk to humans. The aim of this investigation was to develop a novel prediction model of mutagenicity by using a naïve Bayes classifier. The established model was validated by the internal 5-fold cross validation and external test sets. For comparison, a recursive partitioning classifier prediction model was also established and various other reported prediction models of mutagenicity were collected. Among these methods, the naïve Bayes classifier established here performed well and stably, yielding average overall prediction accuracies of 89.1±0.4% for the internal 5-fold cross validation of the training set and 77.3±1.5% for external test set I. The concordance of the external test set II with 446 marketed drugs was 90.9±0.3%. In addition, four simple molecular descriptors (e.g., Apol, No. of H donors, Num-Rings and Wiener) related to mutagenicity and five representative substructures of mutagens (e.g., aromatic nitro, hydroxyl amine, nitroso, aromatic amine and N-methyl-N-methylenemethanaminum) produced by ECFP_14 fingerprints were identified. We hope the established naïve Bayes prediction model can be applied to risk assessment processes, and that the important information obtained on mutagenic chemicals can guide the design of chemical libraries for hit and lead optimization. Copyright © 2017 Elsevier B.V. All rights reserved.
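As a rough illustration of how a naïve Bayes classifier can work on binary substructure indicators, here is a small Bernoulli naïve Bayes sketch with Laplace smoothing. The three-bit "fingerprint", the toy labels and all function names are assumptions for illustration only; the study's actual descriptors, ECFP_14 fingerprints and software are not reproduced.

```python
from math import log


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing (alpha)."""
    n_features = len(X[0])
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior = log(len(rows) / len(X))
        # Smoothed probability that each bit is set within class c.
        p_on = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (log_prior, p_on)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior score."""
    def score(c):
        log_prior, p_on = model[c]
        return log_prior + sum(
            log(p) if bit else log(1 - p) for bit, p in zip(x, p_on)
        )
    return max(model, key=score)


# Hypothetical bits: [aromatic nitro, aromatic amine, hydroxyl]
X = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0], [0, 0, 1]]
y = [1, 1, 1, 0, 0, 0]  # 1 = mutagen, 0 = non-mutagen (toy labels)

nb = train_bernoulli_nb(X, y)
print(predict(nb, [1, 0, 0]), predict(nb, [0, 0, 1]))
```

The smoothing term keeps every bit probability strictly between 0 and 1, so an unseen substructure never forces a class probability to zero.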
Guo, Jing; Chen, Shangxiang; Li, Shun; Sun, Xiaowei; Li, Wei; Zhou, Zhiwei; Chen, Yingbo; Xu, Dazhi
2018-01-12
Several studies have highlighted the prognostic value of individual tumor markers and their various combinations for gastric cancer (GC). Our study was designed to establish a novel model incorporating carcino-embryonic antigen (CEA), carbohydrate antigen 19-9 (CA19-9), and carbohydrate antigen 72-4 (CA72-4). A total of 1,566 GC patients (Primary cohort) between Jan 2000 and July 2013 were analyzed. The Primary cohort was randomly divided into a Training set (n=783) and a Validation set (n=783). A three-tumor marker classifier was developed in the Training set and validated in the Validation set by multivariate regression and risk-score analysis. We identified a three-tumor marker classifier (including CEA, CA19-9 and CA72-4) for the cancer specific survival (CSS) of GC (p<0.001). Consistent results were obtained in both the Training set and the Validation set. Multivariate analysis showed that the classifier was an independent predictor of GC (all p values <0.001 in the Training set, Validation set and Primary cohort). Furthermore, when the leave-one-out approach was performed, the classifier showed superior predictive value to the individual markers or any two of them (with the highest AUC (Area Under Curve): 0.618 for the Training set and 0.625 for the Validation set), which ascertained its predictive value. Our three-tumor marker classifier is closely associated with the CSS of GC and may serve as a novel model for future decisions concerning treatments.
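For readers unfamiliar with the AUC figure quoted above, the sketch below computes it for a binary outcome as the fraction of (event, non-event) pairs in which the event case has the higher risk score, with ties counted as half. The scores and outcomes are invented, the "score" is assumed to be a simple count of elevated markers, and the function name is hypothetical; the study's survival-based analysis is not reproduced.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores: number of elevated markers (0 to 3) per patient.
risk_scores = [3, 2, 2, 1, 0, 1, 0, 0]
events = [1, 1, 0, 1, 0, 0, 0, 0]  # 1 = event observed (toy data)

print(auc(risk_scores, events))
```

An AUC of 0.5 corresponds to a score that ranks cases no better than chance, and 1.0 to perfect ranking, which gives context for values such as 0.618 and 0.625.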
Cronin, John; Storey, Adam; Zourdos, Michael C.
2016-01-01
Ratings of perceived exertion are a valid method of estimating the intensity of a resistance training exercise or session. Scores are given after completion of an exercise or training session for the purposes of athlete monitoring. However, a newly developed scale based on how many repetitions are remaining at the completion of a set may be a more precise tool. This approach adjusts loads automatically to match athlete capabilities on a set-to-set basis and may more accurately gauge intensity at near-limit loads. This article outlines how to incorporate this novel scale into a training plan. PMID:27531969
Cascade Back-Propagation Learning in Neural Networks
NASA Technical Reports Server (NTRS)
Duong, Tuan A.
2003-01-01
The cascade back-propagation (CBP) algorithm is the basis of a conceptual design for accelerating learning in artificial neural networks. The neural networks would be implemented as analog very-large-scale integrated (VLSI) circuits, and circuits to implement the CBP algorithm would be fabricated on the same VLSI circuit chips with the neural networks. Heretofore, artificial neural networks have learned slowly because it has been necessary to train them via software, for lack of a good on-chip learning technique. The CBP algorithm is an on-chip technique that provides for continuous learning in real time. Artificial neural networks are trained by example: A network is presented with training inputs for which the correct outputs are known, and the algorithm strives to adjust the weights of synaptic connections in the network to make the actual outputs approach the correct outputs. The input data are generally divided into three parts. Two of the parts, called the "training" and "cross-validation" sets, respectively, must be such that the corresponding input/output pairs are known. During training, the cross-validation set enables verification of the status of the input-to-output transformation learned by the network to avoid over-learning. The third part of the data, termed the "test" set, consists of the inputs that are required to be transformed into outputs; this set may or may not include the training set and/or the cross-validation set. Proposed neural-network circuitry for on-chip learning would be divided into two distinct networks; one for training and one for validation. Both networks would share the same synaptic weights.
NASA Technical Reports Server (NTRS)
Stevens, Mark A.; Handschuh, Robert F.; Lewicki, David G.
2010-01-01
The Offset Compound Gear Drive is an in-line, discrete, two-speed device utilizing a special offset compound gear that has both an internal tooth configuration on the input end and external tooth configuration on the output end, thus allowing it to mesh in series, simultaneously, with both a smaller external tooth input gear and a larger internal tooth output gear. This unique geometry and offset axis permits the compound gear to mesh with the smaller diameter input gear and the larger diameter output gear, both of which are on the same central, or primary, centerline. This configuration results in a compact in-line reduction gear set consisting of fewer gears and bearings than a conventional planetary gear train. Switching between the two output ratios is accomplished through a main control clutch and sprag. Power flow to the above is transmitted through concentric power paths. Low-speed operation is accomplished in two meshes. For the purpose of illustrating the low-speed output operation, the following example pitch diameters are given. A 5.0 pitch diameter (PD) input gear to 7.50 PD (internal tooth) intermediate gear (0.667 reduction mesh), and a 7.50 PD (external tooth) intermediate gear to a 10.00 PD output gear (0.750 reduction mesh). Note that it is not required that the intermediate gears on the offset axis be of the same diameter. For this example, the resultant low-speed ratio is 2:1 (output speed = 0.500; product of stage one 0.667 reduction and stage two 0.750 stage reduction). The design is not restricted to the example pitch diameters, or output ratio. From the output gear, power is transmitted through a hollow drive shaft, which, in turn, drives a sprag during which time the main clutch is disengaged.
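The low-speed example above can be checked with a few lines of arithmetic. The function name is illustrative; the pitch diameters are the ones quoted in the text, and the speed ratio of each mesh is taken as driver pitch diameter divided by driven pitch diameter.

```python
def mesh_speed_ratio(driver_pd, driven_pd):
    """Output/input speed ratio of one gear mesh from the pitch diameters."""
    return driver_pd / driven_pd


# Stage one: 5.0 PD input gear driving the 7.50 PD internal-tooth end.
stage_one = mesh_speed_ratio(5.0, 7.5)
# Stage two: 7.50 PD external-tooth end driving the 10.00 PD output gear.
stage_two = mesh_speed_ratio(7.5, 10.0)

output_speed = stage_one * stage_two  # relative to an input speed of 1.0
print(round(stage_one, 3), round(stage_two, 3), round(output_speed, 3))
```

The product of the two stage ratios gives an output speed of one half of the input speed, which is the 2:1 low-speed reduction stated in the text.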
Dufour, Marie-Michèle; Lanovaz, Marc J
2017-11-01
The purpose of our study was to compare the effects of serial and concurrent training on the generalization of receptive identification in children with autism spectrum disorders (ASD). We taught one to three pairs of stimulus sets to nine children with ASD between the ages of three and six. One stimulus set within each pair was taught using concurrent training and the other using serial training. We alternated the training sessions within a multielement design and staggered the introduction of subsequent pairs for each participant as in a multiple baseline design. Overall, six participants generalized at least one stimulus set more rapidly with concurrent training whereas two participants showed generalization more rapidly with serial training. Our results differ from other comparison studies on the topic and indicate that practitioners should consider assessing the effects of both procedures prior to teaching receptive identification to children with ASD.
Owen, Lucy; Laird, Katie; Wilson, Philippe B
2018-04-01
Many essential oil components are known to possess broad spectrum antimicrobial activity, including against antibiotic resistant bacteria. These compounds may be a useful source of novel antimicrobials. However, there is limited research on the structure-activity relationship (SAR) of essential oil compounds, which is important for target identification and lead optimization. This study aimed to elucidate SARs of essential oil components from experimental and literature sources. Minimum Inhibitory Concentrations (MICs) of essential oil components were determined against Escherichia coli and Staphylococcus aureus using a microdilution method and then compared to those published in the literature. Of 12 essential oil components tested, carvacrol and cuminaldehyde were most potent with MICs of 1.98 and 2.10 mM, respectively. For the 21 compounds obtained from the literature, MICs ranged from 0.004 mM for limonene to 36.18 mM for α-terpineol. A 3D qualitative SAR model was generated from MICs using FORGE software by consideration of electrostatic and steric parameters. An r² value of 0.807 for training and cross-validation sets was achieved with the model developed. Ligand efficiency was found to correlate well with the observed activity (r² = 0.792), while strongly negative electrostatic regions were present in potent molecules. These descriptors may be useful for target identification of essential oils or their major components in antimicrobial/drug development. Copyright © 2017 Elsevier Ltd. All rights reserved.
The effects of characteristics of substituents on toxicity of the nitroaromatics: HiT QSAR study
NASA Astrophysics Data System (ADS)
Kuz'min, Victor E.; Muratov, Eugene N.; Artemenko, Anatoly G.; Gorb, Leonid; Qasim, Mohammad; Leszczynski, Jerzy
2008-10-01
The present study applies the Hierarchical Technology for Quantitative Structure-Activity Relationships (HiT QSAR) for (i) evaluation of the influence of the characteristics of 28 nitroaromatic compounds (some of which belong to a widely known class of explosives) as to their toxicity; (ii) prediction of toxicity for new nitroaromatic derivatives; (iii) analysis of the effects of substituents in nitroaromatic compounds on their toxicity in vivo. The 50% lethal dose concentration for rats (LD50) was used to develop the QSAR models based on simplex representation of molecular structure. The preliminary 1D QSAR results show that even the information on the composition of molecules reveals the main tendencies of changes in toxicity. The statistical characteristics for partial least squares 2D QSAR models are quite satisfactory (R² = 0.96-0.98; Q² = 0.91-0.93; R²(test) = 0.89-0.92), which allows us to carry out the prediction of activity for 41 novel compounds designed by the application of new combinations of substituents represented in the training set. The comprehensive analysis of toxicity changes as a function of substituent position and nature was carried out. Molecular fragments that promote and interfere with toxicity were defined on the basis of the obtained models. It was shown that the mutual influence of substituents in the benzene ring plays a crucial role regarding toxicity. The influence of different substituents on toxicity can be mediated via different C-H fragments of the aromatic ring.
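The fit statistics quoted in this abstract follow a common pattern: R² measures fit on the training data, while Q² is its cross-validated counterpart. The sketch below computes both for a one-descriptor least-squares line using leave-one-out cross-validation. The single-descriptor model, the toy data and the function names are assumptions chosen for brevity; the study itself used partial least squares models, and its exact cross-validation scheme is not stated here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def total_ss(ys):
    """Total sum of squares about the mean."""
    my = sum(ys) / len(ys)
    return sum((y - my) ** 2 for y in ys)


def r_squared(xs, ys):
    """R^2 = 1 - RSS/TSS, with residuals from the model fitted to all data."""
    slope, intercept = fit_line(xs, ys)
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1 - rss / total_ss(ys)


def q_squared_loo(xs, ys):
    """Q^2 = 1 - PRESS/TSS, each point predicted by a model fitted without it."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1 - press / total_ss(ys)


# Toy descriptor values and activities (invented).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

r2 = r_squared(descriptor, activity)
q2 = q_squared_loo(descriptor, activity)
print(round(r2, 3), round(q2, 3))
```

For least-squares models the leave-one-out Q² cannot exceed R², so a small gap between the two, as in the ranges reported above, suggests the model is not merely memorizing the training set.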
"Functional" Inspiratory and Core Muscle Training Enhances Running Performance and Economy.
Tong, Tomas K; McConnell, Alison K; Lin, Hua; Nie, Jinlei; Zhang, Haifeng; Wang, Jiayuan
2016-10-01
Tong, TK, McConnell, AK, Lin, H, Nie, J, Zhang, H, and Wang, J. "Functional" inspiratory and core muscle training enhances running performance and economy. J Strength Cond Res 30(10): 2942-2951, 2016-We compared the effects of two 6-week high-intensity interval training interventions. Under the control condition (CON), only interval training was undertaken, whereas under the intervention condition (ICT), interval training sessions were followed immediately by core training, which was combined with simultaneous inspiratory muscle training (IMT)-"functional" IMT. Sixteen recreational runners were allocated to either ICT or CON groups. Before the intervention phase, both groups undertook a 4-week program of "foundation" IMT to control for the known ergogenic effect of IMT (30 inspiratory efforts at 50% maximal static inspiratory pressure [P0] per set, 2 sets per day, 6 days per week). The subsequent 6-week interval running training phase consisted of 3-4 sessions per week. In addition, the ICT group undertook 4 inspiratory-loaded core exercises (10 repetitions per set, 2 sets per day, inspiratory load set at 50% post-IMT P0) immediately after each interval training session. The CON group received neither core training nor functional IMT. After the intervention phase, global inspiratory and core muscle functions increased in both groups (p ≤ 0.05), as evidenced by P0 and a sport-specific endurance plank test (SEPT) performance, respectively. Compared with CON, the ICT group showed larger improvements in SEPT, running economy at the speed of the onset of blood lactate accumulation, and 1-hour running performance (3.04% vs. 1.57%, p ≤ 0.05). The changes in these variables were interindividually correlated (r ≥ 0.57, n = 16, p ≤ 0.05). 
Such findings suggest that the addition of inspiratory-loaded core conditioning into a high-intensity interval training program augments the influence of the interval program on endurance running performance and that this may be underpinned by an improvement in running economy.
Training transfer: a systematic review of the impact of inner setting factors.
Jackson, Carrie B; Brabson, Laurel A; Quetsch, Lauren B; Herschell, Amy D
2018-06-19
Consistent with Baldwin and Ford's model (Pers Psychol 41(1):63-105, 1988), training transfer is defined as the generalization of learning from a training to everyday practice in the workplace. The purpose of this review was to examine the influence of work-environment factors, one component of the model hypothesized to influence training transfer within behavioral health. An electronic literature search guided by the Consolidated Framework for Implementation Research's inner setting domain was conducted on Medline OVID, Medline EMBASE, and PsycINFO databases. Of 9184 unique articles, 169 full-text versions of articles were screened for eligibility, yielding 26 articles meeting inclusion criteria. Results from the 26 studies revealed that overall, having more positive networks and communication, culture, implementation climate, and readiness for implementation can facilitate training transfer. Although few studies have examined the impact of inner setting factors on training transfer, these results suggest organizational context is important to consider with training efforts. These findings have important implications for individuals in the broader health professions educational field.
Eigendorf, Julian; May, Marcus; Friedrich, Jan; Engeli, Stefan; Maassen, Norbert; Gros, Gerolf; Meissner, Joachim D
2018-01-01
We present here a longitudinal study determining the effects of two 3-week periods of high intensity high volume interval training (HIHVT) (90 intervals of 6 s cycling at 250% maximum power, Pmax/24 s) on a cycle ergometer. HIHVT was evaluated by comparing performance tests before and after the entire training (baseline, BSL, and endpoint, END) and between the two training sets (intermediate, INT). The mRNA expression levels of myosin heavy chain (MHC) isoforms and markers of energy metabolism were analyzed in M. vastus lateralis biopsies by quantitative real-time PCR. In incremental tests peak power (Ppeak) was increased, whereas V̇O2peak was unaltered. Prolonged time-to-exhaustion was found in endurance tests with 65 and 80% Pmax at INT and END. No changes in blood levels of lipid metabolites were detected. Training-induced decreases of hematocrit indicate hypervolemia. A shift from slow MHCI/β to fast MHCIIa mRNA expression occurred after the first and second training set. The mRNA expression of peroxisome proliferator-activated receptor gamma coactivator 1α (PGC-1α), a master regulator of oxidative energy metabolism, decreased after the second training set. In agreement, a significant decrease was also found for citrate synthase mRNA after the second training set, indicating reduced oxidative capacity. However, mRNA expression levels of glycolytic marker enzyme glyceraldehyde-3-phosphate dehydrogenase did not change after the first and second training set. HIHVT induced a nearly complete slow-to-fast fiber type transformation on the mRNA level, which, however, cannot account for the improvements of performance parameters. The latter might be explained by the well-known effects of hypervolemia on exercise performance.
Training practices and ergogenic aids used by male bodybuilders.
Hackett, Daniel A; Johnson, Nathan A; Chow, Chin-Moi
2013-06-01
Bodybuilding involves performing a series of poses on stage where the competitor is judged on aesthetic muscular appearance. The purpose of this study was to describe training practices and ergogenic aids used by competitive bodybuilders and to determine whether training practices comply with current recommendations for muscular hypertrophy. A web-based survey was completed by 127 competitive male bodybuilders. The results showed that during the off-season phase of training (OFF), the majority of respondents performed 3-6 sets per exercise (95.3%), 7-12 repetition maximum (RM) per set (77.0%), and 61- to 120-seconds recovery between sets and exercises (68.6%). However, training practices changed 6 weeks before competition (PRE), where there was an increased number of respondents who reported undertaking 3-4 sets per exercise at the expense of 5-6 sets per exercise (p < 0.001), an increase in the number reporting 10-15RM per set from 7-9RM per set (p < 0.001), and an increase in the number reporting 30-60 seconds vs. 61-180 seconds recovery between sets and exercises (p < 0.001). Anabolic steroid use was high among respondents competing in amateur competitions (56 of 73 respondents), whereas dietary supplementation was used by all respondents. The findings of this study demonstrate that competitive bodybuilders comply with current resistance exercise recommendations for muscular hypertrophy; however, these changed before competition, during which there is a reduction in resistance training volume and intensity. This alteration, in addition to an increase in aerobic exercise volume, is purportedly used to increase muscle definition. However, these practices may increase the risk of muscle mass loss in natural compared with amateur bodybuilders who reportedly use drugs known to preserve muscle mass.
de Brito, Monique Araújo; Rodrigues, Carlos Rangel; Cirino, José Jair Vianna; de Alencastro, Ricardo Bicca; Castro, Helena Carla; Albuquerque, Magaly Girão
2008-08-01
A series of 74 dihydroalkoxybenzyloxopyrimidines (DABOs), a class of highly potent non-nucleoside reverse transcriptase inhibitors (NNRTIs), was retrieved from the literature and studied by comparative molecular field analysis (CoMFA) in order to derive three-dimensional quantitative structure-activity relationship (3D-QSAR) models. The CoMFA study was performed with a training set of 59 compounds, testing three alignments and four charge schemes (DFT, HF, AM1, and PM3) and using the default probe atom (Csp3, +1 charge), cutoffs (30 kcal·mol⁻¹ for both steric and electrostatic fields), and grid distance (2.0 Å). The best model (N = 59), derived from Alignment 1 and PM3 charges, shows q² = 0.691, SEcv = 0.475, optimum number of components = 6, r² = 0.930, SEE = 0.226, and F-value = 115.544. The steric and electrostatic contributions for the best model were 43.2% and 56.8%, respectively. The external predictive ability (r²pred = 0.918) of the resultant best model was evaluated using a test set of 15 compounds. In order to design more potent DABO analogues as anti-HIV/AIDS agents, care should be taken in selecting a substituent for the 4-oxopyrimidine ring, since, as revealed by the best CoMFA model, there are a steric restriction at the C2-position, an electron-rich group restriction at the C6-position (para-substituent of the 6-benzyl group), and a sterically allowed region at the C5-position.
The eTOX data-sharing project to advance in silico drug-induced toxicity prediction.
Cases, Montserrat; Briggs, Katharine; Steger-Hartmann, Thomas; Pognan, François; Marc, Philippe; Kleinöder, Thomas; Schwab, Christof H; Pastor, Manuel; Wichard, Jörg; Sanz, Ferran
2014-11-14
The high-quality in vivo preclinical safety data produced by the pharmaceutical industry during drug development, which follows numerous strict guidelines, are mostly not available in the public domain. These safety data are sometimes published as a condensed summary for the few compounds that reach the market, but the majority of studies are never made public and are often difficult to access in an automated way, even sometimes within the owning company itself. It is evident from many academic and industrial examples that useful data mining and model development require large and representative data sets and careful curation of the collected data. In 2010, under the auspices of the Innovative Medicines Initiative, the eTOX project started with the objective of extracting and sharing preclinical study data from paper or pdf archives of toxicology departments of the 13 participating pharmaceutical companies and using such data for establishing a detailed, well-curated database, which could then serve as source for read-across approaches (early assessment of the potential toxicity of a drug candidate by comparison of similar structure and/or effects) and training of predictive models. The paper describes the efforts undertaken to allow effective data sharing (intellectual property (IP) protection and set-up of adequate controlled vocabularies) and to establish the database (currently with over 4000 studies contributed by the pharma companies corresponding to more than 1400 compounds). In addition, the status of predictive model building and some specific features of the eTOX predictive system (eTOXsys) are presented as decision support knowledge-based tools for drug development process at an early stage.
Importance of eccentric actions in performance adaptations to resistance training
NASA Technical Reports Server (NTRS)
Dudley, Gary A.; Miller, Bruce J.; Buchanan, Paul; Tesch, Per A.
1991-01-01
The importance of eccentric (ecc) muscle actions in resistance training for the maintenance of muscle strength and mass in hypogravity was investigated in experiments in which human subjects, divided into three groups, were asked to perform four to five sets of 6 to 12 repetitions (rep) per set of three leg press and leg extension exercises, 2 days each week for 19 weeks. One group, labeled 'con', performed each rep with only concentric (con) actions, while group con/ecc performed each rep with both con and ecc actions; the third group, con/con, performed twice as many sets with only con actions. Control subjects did not train. It was found that resistance training with both con and ecc actions induced greater increases in muscle strength than did training with only con actions.
Teaching adolescents with severe disabilities to use the public telephone.
Test, D W; Spooner, F; Keul, P K; Grossi, T
1990-04-01
Two adolescents with severe disabilities served as participants in a study conducted to train in the use of the public telephone to call home. Participants were trained to complete a 17-step task analysis using a training package which consisted of total task presentation in conjunction with a four-level prompting procedure (i.e., independent, verbal, verbal + gesture, verbal + guidance). All instruction took place in a public setting (e.g., a shopping mall) with generalization probes taken in two alternative settings (e.g., a movie theater and a convenience store). A multiple probe across individuals design demonstrated the training package was successful in teaching participants to use the telephone to call home. In addition, newly acquired skills generalized to the two untrained settings. Implications for community-based training are discussed.
Weng, Ziqing; Wolc, Anna; Shen, Xia; Fernando, Rohan L; Dekkers, Jack C M; Arango, Jesus; Settar, Petek; Fulton, Janet E; O'Sullivan, Neil P; Garrick, Dorian J
2016-03-19
Genomic estimated breeding values (GEBV) based on single nucleotide polymorphism (SNP) genotypes are widely used in animal improvement programs. It is typically assumed that the larger the number of animals is in the training set, the higher is the prediction accuracy of GEBV. The aim of this study was to quantify genomic prediction accuracy depending on the number of ancestral generations included in the training set, and to determine the optimal number of training generations for different traits in an elite layer breeding line. Phenotypic records for 16 traits on 17,793 birds were used. All parents and some selection candidates from nine non-overlapping generations were genotyped for 23,098 segregating SNPs. An animal model with pedigree relationships (PBLUP) and the BayesB genomic prediction model were applied to predict EBV or GEBV at each validation generation (progeny of the most recent training generation) based on varying numbers of immediately preceding ancestral generations. Prediction accuracy of EBV or GEBV was assessed as the correlation between EBV and phenotypes adjusted for fixed effects, divided by the square root of trait heritability. The optimal number of training generations that resulted in the greatest prediction accuracy of GEBV was determined for each trait. The relationship between optimal number of training generations and heritability was investigated. On average, accuracies were higher with the BayesB model than with PBLUP. Prediction accuracies of GEBV increased as the number of closely-related ancestral generations included in the training set increased, but reached an asymptote or slightly decreased when distant ancestral generations were used in the training set. The optimal number of training generations was 4 or more for high heritability traits but less than that for low heritability traits. 
For less heritable traits, limiting the training datasets to individuals closely related to the validation population resulted in the best predictions. The effect of adding distant ancestral generations in the training set on prediction accuracy differed between traits and the optimal number of necessary training generations is associated with the heritability of traits.
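The accuracy measure described in the abstract above (correlation between estimated breeding values and adjusted phenotypes, divided by the square root of trait heritability) can be written out as a short sketch. The function names and the toy numbers in the usage note are hypothetical and are not data from the study.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def prediction_accuracy(ebv, adjusted_phenotypes, heritability):
    """Correlation of (G)EBV with adjusted phenotypes, scaled by sqrt(h2)."""
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must lie in (0, 1]")
    return pearson_correlation(ebv, adjusted_phenotypes) / math.sqrt(heritability)
```

For example, `prediction_accuracy([1, 2, 3, 4], [1, 3, 2, 4], 0.64)` divides a correlation of 0.8 by sqrt(0.64) = 0.8. Dividing by the square root of heritability rescales the correlation with phenotypes toward a correlation with true breeding values, so lower heritability inflates the same raw correlation more.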
Paraskevas, Paschalis D; Sabbe, Maarten K; Reyniers, Marie-Françoise; Papayannakos, Nikos G; Marin, Guy B
2014-10-09
Hydrogen-abstraction reactions play a significant role in thermal biomass conversion processes, as well as regular gasification, pyrolysis, or combustion. In this work, a group additivity model is constructed that allows prediction of reaction rates and Arrhenius parameters of hydrogen abstractions by hydrogen atoms from alcohols, ethers, esters, peroxides, ketones, aldehydes, acids, and diketones in a broad temperature range (300-2000 K). A training set of 60 reactions was developed with rate coefficients and Arrhenius parameters calculated by the CBS-QB3 method in the high-pressure limit with tunneling corrections using Eckart tunneling coefficients. From this set of reactions, 15 group additive values were derived for the forward and the reverse reaction, 4 referring to primary and 11 to secondary contributions. The accuracy of the model is validated upon an ab initio and an experimental validation set of 19 and 21 reaction rates, respectively, showing that reaction rates can be predicted with a mean factor of deviation of 2 for the ab initio and 3 for the experimental values. Hence, this work illustrates that the developed group additive model can be reliably applied for the accurate prediction of kinetics of α-hydrogen abstractions by hydrogen atoms from a broad range of oxygenates.
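As a rough illustration of the quantities named in the abstract above, the sketch below evaluates a rate coefficient from Arrhenius parameters and computes a mean factor of deviation between predicted and reference rates. It assumes the simple two-parameter Arrhenius form k = A exp(-Ea/(RT)) and assumes the factor of deviation for each reaction is the larger of the two ratios k_pred/k_ref and k_ref/k_pred; neither the function names nor these definitions are taken from the paper, and the group additivity scheme itself is not reproduced.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol K)


def arrhenius_rate(pre_exponential, activation_energy, temperature):
    """k = A * exp(-Ea / (R T)), with Ea in J/mol and T in K."""
    return pre_exponential * math.exp(
        -activation_energy / (GAS_CONSTANT * temperature)
    )


def mean_factor_of_deviation(predicted, reference):
    """Average of max(k_pred/k_ref, k_ref/k_pred) over all reactions.

    Assumed definition: a value of 1 means perfect agreement, and a value
    of 2 means the rates differ by a factor of two on average.
    """
    factors = [max(p / r, r / p) for p, r in zip(predicted, reference)]
    return sum(factors) / len(factors)
```

Under this definition, over- and under-prediction by the same ratio are penalized equally, which suits rate coefficients that span many orders of magnitude.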
CERAPP: Collaborative Estrogen Receptor Activity Prediction ...
Humans potentially are exposed to thousands of man-made chemicals in the environment. Some chemicals mimic natural endocrine hormones and, thus, have the potential to be endocrine disruptors. Many of these chemicals never have been tested for their ability to interact with the estrogen receptor (ER). Risk assessors need tools to prioritize chemicals for assessment in costly in vivo tests, for instance, within the EPA Endocrine Disruptor Screening Program. Here, we describe a large-scale modeling project called CERAPP (Collaborative Estrogen Receptor Activity Prediction Project) demonstrating the efficacy of using predictive computational models on high-throughput screening data to screen thousands of chemicals against the ER. CERAPP combined multiple models developed in collaboration among 17 groups in the United States and Europe to predict ER activity of a common set of 32,464 chemical structures. Quantitative structure-activity relationship models and docking approaches were employed, mostly using a common training set of 1677 compounds provided by EPA, to build a total of 40 categorical and 8 continuous models for binding, agonist, and antagonist ER activity. All predictions were tested using an evaluation set of 7522 chemicals collected from the literature. To overcome the limitations of single models, a consensus was built weighting models using a scoring function (0 to 1) based on their accuracies. Individual model scores ranged from 0.69 to 0.85, showing
Verma, Manjusha; Chaudhry, Aneese F.; Fahrni, Christoph J.
2010-01-01
The photophysical properties of 1,3,5-triarylpyrazolines are strongly influenced by the nature and position of substituents attached to the aryl-rings, rendering this fluorophore platform well suited for the design of fluorescent probes utilizing a photoinduced electron transfer (PET) switching mechanism. To explore the tunability of two key parameters that govern the PET thermodynamics, the excited state energy ΔE00 and acceptor potential E(A/A−), a library of polyfluoro-substituted 1,3-diaryl-5-phenyl-pyrazolines was synthesized and characterized. The observed trends for the PET parameters were effectively captured through multiple Hammett linear free energy relationships (LFER) using a set of independent substituent constants for each of the two aryl rings. Given the lack of experimental Hammett constants for polyfluoro substituted aromatics, theoretically derived constants based on the electrostatic potential at the nucleus (EPN) of carbon atoms were employed as quantum chemical descriptors. The performance of the LFER was evaluated with a set of compounds that were not included in the training set, yielding a mean unsigned error of 0.05 eV for the prediction of the combined PET parameters. The outlined LFER approach should be well suited to design and optimize the performance of cation-responsive 1,3,5-triarylpyrazolines. PMID:19343239
Kumar, B V S Suneel; Kotla, Rohith; Buddiga, Revanth; Roy, Jyoti; Singh, Sardar Shamshair; Gundla, Rambabu; Ravikumar, Muttineni; Sarma, Jagarlapudi A R P
2011-01-01
Structure and ligand based pharmacophore modeling and docking studies carried out using a diversified set of c-Jun N-terminal kinase-3 (JNK3) inhibitors are presented in this paper. A ligand based pharmacophore model (LBPM) was developed for 106 inhibitors of JNK3 using a training set of 21 compounds to reveal structural and chemical features necessary for these molecules to inhibit JNK3. Hypo1 consisted of two hydrogen bond acceptors (HBA), one hydrogen bond donor (HBD), and a hydrophobic (HY) feature with a correlation coefficient (r²) of 0.950. This pharmacophore model was validated using a test set containing 85 inhibitors and had a good r² of 0.846. All the molecules were docked using Glide software and, interestingly, all the docked conformations showed hydrogen bond interactions with important hinge region amino acids (Gln155 and Met149), and these interactions were compared with Hypo1 features. The results of the ligand based pharmacophore model (LBPM) and docking studies validate each other. The structure based pharmacophore model (SBPM) studies have identified additional features, two hydrogen bond donors and one hydrogen bond acceptor. The combination of these methodologies is useful in designing an ideal pharmacophore, which provides a powerful tool for the discovery of novel and selective JNK3 inhibitors.
Virtual screening of cathepsin k inhibitors using docking and pharmacophore models.
Ravikumar, Muttineni; Pavan, S; Bairy, Santhosh; Pramod, A B; Sumakanth, M; Kishore, Madala; Sumithra, Tirunagaram
2008-07-01
Cathepsin K is a lysosomal cysteine protease that is highly and selectively expressed in osteoclasts, the cells which degrade bone during the continuous cycle of bone degradation and formation. Inhibition of cathepsin K represents a potential therapeutic approach for diseases characterized by excessive bone resorption such as osteoporosis. In order to elucidate the essential structural features for cathepsin K inhibition, three-dimensional pharmacophore hypotheses were built on the basis of a set of known cathepsin K inhibitors selected from the literature using the Catalyst program. Several methods used in validation of the pharmacophore hypotheses are presented, and the fourth hypothesis (Hypo4) was considered to be the best pharmacophore hypothesis, which has a correlation coefficient of 0.944 with the training set and has high prediction of activity for a set of 30 test molecules with a correlation of 0.909. The model (Hypo4) was then employed as a 3D search query to screen the Maybridge database containing 59,000 compounds, to discover novel and highly potent ligands. For analyzing intermolecular interactions between protein and ligand, all the molecules were docked using Glide software. The result showed that the type and spatial location of chemical features encoded in the pharmacophore are in full agreement with the enzyme inhibitor interaction pattern identified from molecular docking.
Lack of training threatening drilling talent supply
DOE Office of Scientific and Technical Information (OSTI.GOV)
Von Flatern, R.
When oil prices crashed in the mid-1980s, the industry tightened budgets. Among the austerity measures taken to survive the consequences of low product prices was an end to expensive, long-term investment training of drilling engineers. In the absence of traditional sources of trained drilling talent, forward-looking contractors are creating their own training programs. The paper describes the activities of some companies who are setting up their own training programs, and an alliance being set up by Chevron and Amoco for training. The paper also discusses training drilling managers, third-party trainers, and the consequences for the industry that does not renew its inventory of people.
Training in interprofessional collaboration: pedagogic innovation in family medicine units.
Paré, Line; Maziade, Jean; Pelletier, Francine; Houle, Nathalie; Iloko-Fundi, Maximilien
2012-04-01
A number of agencies that accredit university health sciences programs recently added standards for the acquisition of knowledge and skills with respect to interprofessional collaboration. Within primary care settings there are no practical training programs that allow students from different disciplines to develop competencies in this area. The training program was developed within family medicine units affiliated with Université Laval in Quebec for family medicine residents and trainees from various disciplines to develop competencies in patient-centred, interprofessional collaborative practice in primary care. Based on adult learning theories, the program was divided into 3 phases--preparing family medicine unit professionals, training preceptors, and training the residents and trainees. The program's pedagogic strategies allowed participants to learn with, from, and about one another while preparing them to engage in contemporary primary care practices. A combination of quantitative and qualitative methods was used to evaluate the implementation process and the immediate results of the training program. The training program had a positive effect on both the clinical settings and the students. Preparation of clinical settings is an important issue that must be considered when planning practical interprofessional training.
Recruiting High School Students into Tech Programs
ERIC Educational Resources Information Center
Squires, Dan; Case, Pauline
2007-01-01
Industry's needs for highly skilled workers are not currently being met. The U.S. needs more than a half-million people in skilled worker training programs now. Not enough young people are choosing to be trained in these areas, and compounding this problem is the reality that the average age of the current skilled labor force is 55--ready for…
Emergence of Tacts following Mand Training in Young Children with Autism
ERIC Educational Resources Information Center
Egan, Claire E.; Barnes-Holmes, Dermot
2009-01-01
This study sought to examine the effects of training mands on the emergence of tacts with the same response forms. Results indicated that training adjective sets as mands resulted in the emergence of adjective sets as tacts under modified, but not standard, antecedent conditions. The findings suggested that the apparent functional independence of…
ERIC Educational Resources Information Center
O'Reilly, Mark F.; Glynn, Dawn
1995-01-01
A process social skills training approach was implemented and evaluated with two high school students having mild intellectual disabilities and social skills deficits. The intervention package was successful in promoting generalization of targeted social skills from the training setting to the classroom for both students. Participants had…
Bringing the Science of Team Training to School-Based Teams
ERIC Educational Resources Information Center
Benishek, Lauren E.; Gregory, Megan E.; Hodges, Karin; Newell, Markeda; Hughes, Ashley M.; Marlow, Shannon; Lacerenza, Christina; Rosenfield, Sylvia; Salas, Eduardo
2016-01-01
Teams are ubiquitous in schools in the 21st Century; yet training for effective teaming within these settings has lagged behind. The authors of this article developed 5 modules, grounded in the science of team training and adapted from an evidence-based curriculum used in medical settings called TeamSTEPPS®, to prepare instructional and…
ERIC Educational Resources Information Center
Kissel, Robert C.; And Others
1980-01-01
A parent and teacher were trained in home and school settings to administer a self-feeding program to a profoundly retarded adult woman. During training, an increase in both the parent and teacher's appropriate use of instruction and attention occurred, and a high stable rate of self-feeding responses developed across settings. (Author)
Effect of core stability training on throwing velocity in female handball players.
Saeterbakken, Atle H; van den Tillaar, Roland; Seiler, Stephen
2011-03-01
The purpose was to study the effect of a sling exercise training (SET)-based core stability program on maximal throwing velocity among female handball players. Twenty-four female high-school handball players (16.6 ± 0.3 years, 63 ± 6 kg, and 169 ± 7 cm) participated and were initially divided into a SET training group (n = 14) and a control group (CON, n = 10). Both groups performed their regular handball training for 6 weeks. In addition, twice a week, the SET group performed a progressive core stability-training program consisting of 6 unstable closed kinetic chain exercises. Maximal throwing velocity was measured before and after the training period using photocells. Maximal throwing velocity increased significantly by 4.9%, from 17.9 ± 0.5 to 18.8 ± 0.4 m·s⁻¹, in the SET group after the training period (p < 0.01), but was unchanged in the control group (17.1 ± 0.4 vs. 16.9 ± 0.4 m·s⁻¹). These results suggest that core stability training using unstable, closed kinetic chain movements can significantly improve maximal throwing velocity. A stronger and more stable lumbopelvic-hip complex may contribute to higher rotational velocity in multisegmental movements. Strength coaches can incorporate exercises exposing the joints to destabilizing forces during training in closed kinetic chain exercises. This may encourage an effective neuromuscular pattern and increase force production and can improve a highly specific performance task such as throwing.