Sample records for test set molecules

  1. Molecular Docking Studies of Flavonoids Derivatives on the Flavonoid 3-O-Glucosyltransferase.

    PubMed

    Harsa, Alexandra M; Harsa, Teodora E; Diudea, Mircea V; Janezic, Dusanka

    2015-01-01

    This study reports 30 flavonoid derivatives, taken from the PubChem database, that were docked on flavonoid 3-O-glucosyltransferase (3HBF) and then submitted to a QSAR study, performed within a hypermolecule frame, to model their LD50 values. The initial set of molecules was split into a training set and a test set (taken from the best-scored molecules in the docking test); the predicted LD50 values, computed on similarity clusters built for each molecule of the test set, surpassed the best model in accuracy. The binding energies to the 3HBF protein, provided by the docking step, are not related to the LD50 of these flavonoids; more protein targets are to be investigated in this respect. However, the docking step was useful in choosing the test set of molecules.

  2. Optimized auxiliary basis sets for density fitted post-Hartree-Fock calculations of lanthanide containing molecules

    NASA Astrophysics Data System (ADS)

    Chmela, Jiří; Harding, Michael E.

    2018-06-01

    Optimised auxiliary basis sets for lanthanide atoms (Ce to Lu) for four basis sets of the Karlsruhe error-balanced segmented contracted def2 series (SVP, TZVP, TZVPP and QZVPP) are reported. These auxiliary basis sets enable the use of the resolution-of-the-identity (RI) approximation in post-Hartree-Fock methods such as second-order perturbation theory (MP2) and coupled cluster (CC) theory. The auxiliary basis sets are tested on an enlarged set of about a hundred molecules, where the test criterion is the size of the RI error in MP2 calculations. Our tests also show that the same auxiliary basis sets can be used together with different effective core potentials. With these auxiliary basis sets, calculations of MP2 and CC quality can now be performed efficiently on medium-sized molecules containing lanthanides.

  3. Bias-Free Chemically Diverse Test Sets from Machine Learning.

    PubMed

    Swann, Ellen T; Fernandez, Michael; Coote, Michelle L; Barnard, Amanda S

    2017-08-14

    Current benchmarking methods in quantum chemistry rely on databases that are built using a chemist's intuition. It is not fully understood how diverse or representative these databases truly are. Multivariate statistical techniques like archetypal analysis and K-means clustering have previously been used to summarize large sets of nanoparticles; however, molecules are more diverse and not as easily characterized by descriptors. In this work, we compare three sets of descriptors based on the one-, two-, and three-dimensional structure of a molecule. Using data from the NIST Computational Chemistry Comparison and Benchmark Database and machine learning techniques, we demonstrate the functional relationship between these structural descriptors and the electronic energy of molecules. Archetypes and prototypes found with topological or Coulomb matrix descriptors can be used to identify smaller, statistically significant test sets that better capture the diversity of chemical space. We apply this same method to find a diverse subset of organic molecules to demonstrate how the methods can easily be reapplied to individual research projects. Finally, we use our bias-free test sets to assess the performance of density functional theory and quantum Monte Carlo methods.

  4. Ab Initio Density Fitting: Accuracy Assessment of Auxiliary Basis Sets from Cholesky Decompositions.

    PubMed

    Boström, Jonas; Aquilante, Francesco; Pedersen, Thomas Bondo; Lindh, Roland

    2009-06-09

    The accuracy of auxiliary basis sets derived by Cholesky decompositions of the electron repulsion integrals is assessed in a series of benchmarks on total ground state energies and dipole moments of a large test set of molecules. The test set includes molecules composed of atoms from the first three rows of the periodic table as well as transition metals. The accuracy of the auxiliary basis sets is tested for the 6-31G**, correlation consistent, and atomic natural orbital basis sets at the Hartree-Fock, density functional theory, and second-order Møller-Plesset levels of theory. By decreasing the decomposition threshold, a hierarchy of auxiliary basis sets is obtained with accuracies ranging from that of standard auxiliary basis sets to that of conventional integral treatments.

  5. Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks

    PubMed Central

    2017-01-01

    In de novo drug design, computational strategies are used to generate novel molecules with good affinity to the desired biological target. In this work, we show that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing. We demonstrate that the properties of the generated molecules correlate very well with the properties of the molecules used to train the model. In order to enrich libraries with molecules active toward a given biological target, we propose to fine-tune the model with small sets of molecules, which are known to be active against that target. Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test molecules that medicinal chemists designed, whereas against Plasmodium falciparum (Malaria), it reproduced 28% of 1240 test molecules. When coupled with a scoring function, our model can perform the complete de novo drug design cycle to generate large sets of novel molecules for drug discovery. PMID:29392184

  6. Relationship between molecular connectivity and carcinogenic activity: a confirmation with a new software program based on graph theory.

    PubMed Central

    Malacarne, D; Pesenti, R; Paolucci, M; Parodi, S

    1993-01-01

    For a database of 826 chemicals tested for carcinogenicity, we fragmented the structural formula of the chemicals into all possible contiguous-atom fragments with size between two and eight (nonhydrogen) atoms. The fragmentation was obtained using a new software program based on graph theory. We used 80% of the chemicals as a training set and 20% as a test set. The two sets were obtained by random sorting. From the training sets, an average (8 computer runs with independently sorted chemicals) of 315 different fragments were significantly (p < 0.125) associated with carcinogenicity or lack thereof. Even using this relatively low level of statistical significance, 23% of the molecules of the test sets lacked significant fragments. For 77% of the molecules of the test sets, we used the presence of significant fragments to predict carcinogenicity. The average level of accuracy of the predictions in the test sets was 67.5%. Chemicals containing only positive fragments were predicted with an accuracy of 78.7%. The level of accuracy was around 60% for chemicals characterized by contradictory fragments or only negative fragments. In a parallel manner, we performed eight paired runs in which carcinogenicity was attributed randomly to the molecules of the training sets. The fragments generated by these pseudo-training sets were devoid of any predictivity in the corresponding test sets. Using an independent software program, we confirmed (for the complex biological endpoint of carcinogenicity) the validity of a structure-activity relationship approach of the type proposed by Klopman and Rosenkranz with their CASE program. PMID:8275991
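
    The random 80/20 partition repeated over eight runs can be sketched as follows. This is a minimal illustration using Python's standard library, not the authors' program; the function name `random_split`, the integer placeholder IDs, and the use of the run index as a seed are assumptions made for the example.

```python
import random


def random_split(items, test_fraction=0.2, seed=None):
    """Shuffle a copy of `items` and return (training, test) lists."""
    rng = random.Random(seed)          # private generator, reproducible per seed
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder IDs standing in for the 826 chemicals (hypothetical data).
chemicals = list(range(826))

# Eight independently shuffled partitions, one per run.
runs = [random_split(chemicals, 0.2, seed=run) for run in range(8)]
train, test = runs[0]
print(len(train), len(test))
```

    Seeding each run separately keeps the partitions independent of one another while leaving every run reproducible.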

  7. The impact of surface area, volume, curvature, and Lennard-Jones potential to solvation modeling.

    PubMed

    Nguyen, Duc D; Wei, Guo-Wei

    2017-01-05

    This article explores the impact of surface area, volume, curvature, and Lennard-Jones (LJ) potential on solvation free energy predictions. Rigidity surfaces are utilized to generate robust analytical expressions for maximum, minimum, mean, and Gaussian curvatures of solvent-solute interfaces, and define a generalized Poisson-Boltzmann (GPB) equation with a smooth dielectric profile. Extensive correlation analysis is performed to examine the linear dependence of surface area, surface enclosed volume, maximum curvature, minimum curvature, mean curvature, and Gaussian curvature for solvation modeling. It is found that surface area and surface enclosed volume are highly correlated to each other, and poorly correlated to the various curvatures, for six test sets of molecules. Different curvatures are weakly correlated to each other across the six test sets of molecules, but are strongly correlated to each other within each test set of molecules. Based on correlation analysis, we construct twenty-six nontrivial nonpolar solvation models. Our numerical results reveal that the LJ potential plays a vital role in nonpolar solvation modeling, especially for molecules involving strong van der Waals interactions. It is found that curvatures are at least as important as surface area or surface enclosed volume in nonpolar solvation modeling. In conjunction with the GPB model, various curvature-based nonpolar solvation models are shown to offer some of the best solvation free energy predictions for a wide range of test sets. For example, root mean square errors from a model constituting surface area, volume, mean curvature, and LJ potential are less than 0.42 kcal/mol for all test sets. © 2016 Wiley Periodicals, Inc.

  8. Automatic Nanodesign Using Evolutionary Techniques

    NASA Technical Reports Server (NTRS)

    Globus, Al; Saini, Subhash (Technical Monitor)

    1998-01-01

    Many problems associated with the development of nanotechnology require custom designed molecules. We use genetic graph software, a new development, to automatically evolve molecules of interest when only the requirements are known. Genetic graph software designs molecules, and potentially nanoelectronic circuits, given a fitness function that determines which of two molecules is better. A set of molecules, the first generation, is generated at random and then tested with the fitness function. Subsequent generations are created by randomly choosing two parent molecules with a bias towards high scoring molecules, tearing each molecule in two at random, and mating parts from the mother and father to create two children. This procedure is repeated until a satisfactory molecule is found. An atom pair similarity test is currently used as the fitness function to evolve molecules similar to existing pharmaceuticals.
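
    The generation loop described above (fitness-biased parent choice, random tearing, mating into two children) can be sketched in simplified form. This sketch substitutes plain strings for molecular graphs and a numeric score for the pairwise fitness comparison, so it illustrates the general scheme rather than the genetic graph software itself; `tear`, `mate`, `next_generation`, `similarity`, and the toy population are all hypothetical.

```python
import random


def tear(parent, rng):
    """Split a sequence at a random interior point into two pieces."""
    cut = rng.randint(1, len(parent) - 1)
    return parent[:cut], parent[cut:]


def mate(mother, father, rng):
    """Tear both parents and recombine the pieces into two children."""
    m_head, m_tail = tear(mother, rng)
    f_head, f_tail = tear(father, rng)
    return m_head + f_tail, f_head + m_tail


def next_generation(population, fitness, rng):
    """Build a new population of the same size from fitness-biased parents."""
    # The small offset keeps every weight positive, so any parent can be drawn.
    weights = [fitness(individual) + 0.01 for individual in population]
    children = []
    while len(children) < len(population):
        mother, father = rng.choices(population, weights=weights, k=2)
        children.extend(mate(mother, father, rng))
    return children[:len(population)]


def similarity(candidate, target):
    """Toy stand-in for a similarity test: fraction of matching positions."""
    matches = sum(a == b for a, b in zip(candidate, target))
    return matches / max(len(candidate), len(target))


rng = random.Random(0)
population = ["CCCC", "CCOC", "NNNN", "CNOC"]   # hypothetical first generation
for _ in range(5):
    population = next_generation(population, lambda s: similarity(s, "CCOC"), rng)
print(population)
```

    Because each parent is cut at an interior point, every child keeps at least one piece from each parent, and the two children together contain exactly the material of the two parents.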

  9. Systematic study on the TD-DFT calculated electronic circular dichroism spectra of chiral aromatic nitro compounds: A comparison of B3LYP and CAM-B3LYP.

    PubMed

    Komjáti, Balázs; Urai, Ákos; Hosztafi, Sándor; Kökösi, József; Kováts, Benjámin; Nagy, József; Horváth, Péter

    2016-02-15

    B3LYP is one of the most widely used functionals for the prediction of electronic circular dichroism spectra; however, if the studied molecule contains an aromatic nitro group, computations may fail to produce reliable results. A test set of molecules of known stereochemistry was synthesized to study this phenomenon in detail. Spectra were computed with the B3LYP and CAM-B3LYP functionals and the 6-311++G(2d,2p) basis set. It was found that the range-separated CAM-B3LYP gives better predictions than B3LYP for all test molecules. Fragment population analysis revealed that the nitro groups form highly localized molecular orbitals, but the exact composition depends on the functional. CAM-B3LYP allows sufficient spatial overlap between the nitro group and distant parts of the molecule, which is necessary for the accurate description of excited states, especially charge transfer states. This phenomenon and the synthesized test molecules can be used to benchmark theoretical methods as well as to help the development of new functionals intended for spectroscopic studies. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. Multiple receptor conformation docking and dock pose clustering as tool for CoMFA and CoMSIA analysis - a case study on HIV-1 protease inhibitors.

    PubMed

    Sivan, Sree Kanth; Manga, Vijjulatha

    2012-02-01

    Multiple receptor conformation docking (MRCD) and clustering of dock poses allow seamless incorporation of the receptor binding conformation of the molecules for a wide range of ligands with varied structural scaffolds. The accuracy of the approach was tested on a set of 120 cyclic urea molecules having HIV-1 protease inhibitory activity, using 12 high resolution X-ray crystal structures and one NMR resolved conformation of HIV-1 protease extracted from the Protein Data Bank. A cross validation was performed on 25 non-cyclic urea HIV-1 protease inhibitors having varied structures. The comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) models were generated using 60 molecules in the training set. Applying the leave-one-out cross-validation method gave r2(loo) values of 0.598 and 0.674 for CoMFA and CoMSIA, respectively, and non-cross-validated regression coefficient r2 values of 0.983 and 0.985, respectively. The predictive ability of these models was determined using a test set of 60 cyclic urea molecules, which gave predictive correlations (r2(pred)) of 0.684 and 0.64 for CoMFA and CoMSIA, respectively, indicating good internal predictive ability. Based on this information, 25 non-cyclic urea molecules were taken as a test set to check the external predictive ability of these models. This gave a remarkable outcome, with r2(pred) of 0.61 and 0.53 for CoMFA and CoMSIA, respectively. The results invariably show that this method is useful for performing 3D QSAR analysis on molecules having different structural motifs.
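
    The abstract does not state how the predictive r2 was computed. A convention commonly used in 3D QSAR work is 1 - PRESS/SD, where PRESS is the sum of squared prediction errors on the test set and SD is the sum of squared deviations of the test activities from the mean activity of the training set. The sketch below assumes that convention; the function name and the activity values are hypothetical.

```python
def predictive_r2(y_test, y_pred, y_train_mean):
    """Predictive r2 = 1 - PRESS / SD (assumed convention, see lead-in)."""
    # PRESS: squared errors between observed and predicted test activities.
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    # SD: squared deviations of test activities from the training-set mean.
    sd = sum((y - y_train_mean) ** 2 for y in y_test)
    return 1.0 - press / sd


# Hypothetical activities (for example pKi values), not data from the study.
observed = [6.1, 7.4, 8.0, 5.5]
predicted = [6.4, 7.0, 7.7, 6.0]
training_mean = 6.8
print(predictive_r2(observed, predicted, training_mean))
```

    Under this definition a model that simply predicts the training-set mean for every test molecule scores 0, and a model that does worse than that scores below 0.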

  11. Bitter or not? BitterPredict, a tool for predicting taste from chemical structure.

    PubMed

    Dagan-Wiener, Ayana; Nissim, Ido; Ben Abu, Natalie; Borgonovo, Gigliola; Bassoli, Angela; Niv, Masha Y

    2017-09-21

    Bitter taste is an innately aversive taste modality that is considered to protect animals from consuming toxic compounds. Yet, bitterness is not always noxious and some bitter compounds have beneficial effects on health. Hundreds of bitter compounds have been reported (and are accessible via the BitterDB http://bitterdb.agri.huji.ac.il/dbbitter.php ), but numerous additional bitter molecules are still unknown. The dramatic chemical diversity of bitterants makes bitterness prediction a difficult task. Here we present a machine learning classifier, BitterPredict, which predicts whether a compound is bitter or not, based on its chemical structure. BitterDB was used as the positive set, and non-bitter molecules were gathered from literature to create the negative set. Adaptive Boosting (AdaBoost), a machine-learning algorithm based on decision trees, was applied to molecules that were represented using physicochemical and ADME/Tox descriptors. BitterPredict correctly classifies over 80% of the compounds in the hold-out test set, and 70-90% of the compounds in three independent external sets and in sensory test validation, providing a quick and reliable tool for classifying large sets of compounds into bitter and non-bitter groups. BitterPredict suggests that about 40% of random molecules, and a large portion (66%) of clinical and experimental drugs, and of natural products (77%) are bitter.

  12. QSARpy: A new flexible algorithm to generate QSAR models based on dissimilarities. The log Kow case study.

    PubMed

    Ferrari, Thomas; Lombardo, Anna; Benfenati, Emilio

    2018-05-14

    Several methods exist to develop QSAR models automatically. Some are based on indices of the presence of atoms, others on the most similar compounds, and others on molecular descriptors. Here we introduce QSARpy v1.0, a new QSAR modeling tool based on a different approach: dissimilarity. This tool fragments the molecules of the training set to extract fragments, called modulators, that can be associated with a difference in the property/activity value. If the target molecule shares part of its structure with a molecule of the training set and the differences can be explained with one or more modulators, the property/activity value of the molecule of the training set is adjusted using the value associated with the modulator(s). This tool is tested here on the n-octanol/water partition coefficient (Kow, usually expressed in logarithmic units as log Kow). It is a key parameter in risk assessment since it is a measure of hydrophobicity. Its widespread use makes these estimation methods very useful to reduce testing costs. Using QSARpy v1.0, we obtained a new model to predict log Kow with accurate performance (RMSE 0.43 and R2 0.94 for the external test set), comparing favorably with other programs. QSARpy is freely available on request. Copyright © 2018 Elsevier B.V. All rights reserved.

  13. Strategy to discover diverse optimal molecules in the small molecule universe.

    PubMed

    Rupakheti, Chetan; Virshup, Aaron; Yang, Weitao; Beratan, David N

    2015-03-23

    The small molecule universe (SMU) is defined as a set of over 10^60 synthetically feasible organic molecules with molecular weight less than ∼500 Da. Exhaustive enumerations and evaluation of all SMU molecules for the purpose of discovering favorable structures is impossible. We take a stochastic approach and extend the ACSESS framework (Virshup et al. J. Am. Chem. Soc. 2013, 135, 7296-7303) to develop diversity oriented molecular libraries that can generate a set of compounds that is representative of the small molecule universe and that also biases the library toward favorable physical property values. We show that the approach is efficient compared to exhaustive enumeration and to existing evolutionary algorithms for generating such libraries by testing in the NKp fitness landscape model and in the fully enumerated GDB-9 chemical universe containing 3 × 10^5 molecules.

  14. Strategy To Discover Diverse Optimal Molecules in the Small Molecule Universe

    PubMed Central

    2015-01-01

    The small molecule universe (SMU) is defined as a set of over 10^60 synthetically feasible organic molecules with molecular weight less than ∼500 Da. Exhaustive enumerations and evaluation of all SMU molecules for the purpose of discovering favorable structures is impossible. We take a stochastic approach and extend the ACSESS framework (Virshup et al. J. Am. Chem. Soc. 2013, 135, 7296–7303; PMID 23548177) to develop diversity oriented molecular libraries that can generate a set of compounds that is representative of the small molecule universe and that also biases the library toward favorable physical property values. We show that the approach is efficient compared to exhaustive enumeration and to existing evolutionary algorithms for generating such libraries by testing in the NKp fitness landscape model and in the fully enumerated GDB-9 chemical universe containing 3 × 10^5 molecules. PMID:25594586

  15. Organic molecule fluorescence as an experimental test-bed for quantum jumps in thermodynamics

    NASA Astrophysics Data System (ADS)

    Browne, Cormac; Farrow, Tristan; Dahlsten, Oscar C. O.; Taylor, Robert A.; Vedral, Vlatko

    2017-08-01

    We demonstrate with an experiment how molecules are a natural test bed for probing fundamental quantum thermodynamics. Single-molecule spectroscopy has undergone transformative change in the past decade with the advent of techniques permitting individual molecules to be distinguished and probed. We demonstrate that the quantum Jarzynski equality for heat is satisfied in this set-up by considering the time-resolved emission spectrum of organic molecules as arising from quantum jumps between states. This relates the heat dissipated into the environment to the free energy difference between the initial and final state. We demonstrate also how utilizing the quantum Jarzynski equality allows for the detection of energy shifts within a molecule, beyond the relative shift.

  16. Organic molecule fluorescence as an experimental test-bed for quantum jumps in thermodynamics.

    PubMed

    Browne, Cormac; Farrow, Tristan; Dahlsten, Oscar C O; Taylor, Robert A; Vedral, Vlatko

    2017-08-01

    We demonstrate with an experiment how molecules are a natural test bed for probing fundamental quantum thermodynamics. Single-molecule spectroscopy has undergone transformative change in the past decade with the advent of techniques permitting individual molecules to be distinguished and probed. We demonstrate that the quantum Jarzynski equality for heat is satisfied in this set-up by considering the time-resolved emission spectrum of organic molecules as arising from quantum jumps between states. This relates the heat dissipated into the environment to the free energy difference between the initial and final state. We demonstrate also how utilizing the quantum Jarzynski equality allows for the detection of energy shifts within a molecule, beyond the relative shift.

  17. Size-independent neural networks based first-principles method for accurate prediction of heat of formation of fuels

    NASA Astrophysics Data System (ADS)

    Yang, GuanYa; Wu, Jiang; Chen, ShuGuang; Zhou, WeiJun; Sun, Jian; Chen, GuanHua

    2018-06-01

    A neural network-based first-principles method for predicting heat of formation (HOF) was previously demonstrated to achieve chemical accuracy for a broad spectrum of target molecules [L. H. Hu et al., J. Chem. Phys. 119, 11501 (2003)]. However, its accuracy deteriorates as molecular size increases. A closer inspection reveals a systematic correlation between the prediction error and the molecular size, which appears correctable by further statistical analysis, calling for a more sophisticated machine learning algorithm. Despite the apparent difference between simple and complex molecules, all the essential physical information is already present in a carefully selected set of small molecule representatives. A model that can capture the fundamental physics would be able to predict large and complex molecules from information extracted only from a small-molecule database. To this end, a size-independent, multi-step multi-variable linear regression-neural network-B3LYP method is developed in this work, which successfully improves the overall prediction accuracy by training with smaller molecules only. In particular, the calculation errors for larger molecules are drastically reduced to the same magnitude as those of the smaller molecules. Specifically, the method is based on a 164-molecule database of molecules made of hydrogen and carbon. Four molecular descriptors were selected to encode each molecule's characteristics, including the raw HOF calculated from B3LYP and the molecular size. Upon the size-independent machine learning correction, the mean absolute deviation (MAD) of the B3LYP/6-311+G(3df,2p)-calculated HOF is reduced from 16.58 to 1.43 kcal/mol and from 17.33 to 1.69 kcal/mol for the training and testing sets (small molecules), respectively. Furthermore, the MAD of the testing set (large molecules) is reduced from 28.75 to 1.67 kcal/mol.

  18. Molecule kernels: a descriptor- and alignment-free quantitative structure-activity relationship approach.

    PubMed

    Mohr, Johannes A; Jain, Brijnesh J; Obermayer, Klaus

    2008-09-01

    Quantitative structure activity relationship (QSAR) analysis is traditionally based on extracting a set of molecular descriptors and using them to build a predictive model. In this work, we propose a QSAR approach based directly on the similarity between the 3D structures of a set of molecules measured by a so-called molecule kernel, which is independent of the spatial prealignment of the compounds. Predictors can be built using the molecule kernel in conjunction with the potential support vector machine (P-SVM), a recently proposed machine learning method for dyadic data. The resulting models make direct use of the structural similarities between the compounds in the test set and a subset of the training set and do not require an explicit descriptor construction. We evaluated the predictive performance of the proposed method on one classification and four regression QSAR datasets and compared its results to the results reported in the literature for several state-of-the-art descriptor-based and 3D QSAR approaches. In this comparison, the proposed molecule kernel method performed better than the other QSAR methods.

  19. 3D QSAR based design of novel oxindole derivative as 5HT7 inhibitors.

    PubMed

    Chitta, Aparna; Sivan, Sree Kanth; Manga, Vijjulatha

    2014-06-01

    To understand the structural requirements of 5-hydroxytryptamine (5HT7) receptor inhibitors and to design new ligands against the 5HT7 receptor with enhanced inhibitory potency, a three-dimensional quantitative structure-activity relationship study with comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) was performed for a data set of 56 molecules consisting of oxindole, tetrahydronaphthalene, and aryl ketone substituted arylpiperazinealkylamide derivatives. The derived models showed good statistical reliability in predicting the 5HT7 inhibitory activity of the molecules, based on molecular property fields such as steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor fields. This is evident from statistical parameters such as the conventional (r2) and cross-validated (q2) values of 0.985 and 0.743 for CoMFA and 0.970 and 0.608 for CoMSIA, respectively. The predictive ability of the models to determine 5HT7 antagonistic activity is validated using a test set of 16 molecules that were not included in the training set. The predictive r2 obtained for the test set was 0.560 and 0.619 for CoMFA and CoMSIA, respectively. Steric and electrostatic fields contributed most to activity, which forms the basis for the design of new molecules. Absorption, distribution, metabolism and elimination (ADME) calculation using QikProp 2.5 (Schrodinger 2010, Portland, OR) reveals that the molecules conform to Lipinski's rule of five in the majority of cases.

  20. Random forest models to predict aqueous solubility.

    PubMed

    Palmer, David S; O'Boyle, Noel M; Glen, Robert C; Mitchell, John B O

    2007-01-01

    Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueous solubility more accurately than those created by PLS, SVM, and ANN and offered methods for automatic descriptor selection, an assessment of descriptor importance, and an in-parallel measure of predictive ability, all of which serve to recommend its use. The prediction of log molar solubility for an external test set of 330 molecules that are solid at 25 °C gave an r2 = 0.89 and RMSE = 0.69 log S units. For a standard data set selected from the literature, the model performed well with respect to other documented methods. Finally, the diversity of the training and test sets is compared to the chemical space occupied by molecules in the MDL drug data report, on the basis of molecular descriptors selected by the regression analysis.
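
    The two figures of merit quoted for the external test set can be computed as in the sketch below. It assumes that r2 means the squared Pearson correlation between observed and predicted log S, which the abstract does not specify; the function names and the solubility values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs (here log S)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r2(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Hypothetical log S values, not data from the study.
log_s_observed = [-1.2, -2.8, -3.5, -0.4, -4.1]
log_s_predicted = [-1.6, -2.5, -3.9, -0.9, -3.6]
print(rmse(log_s_observed, log_s_predicted))
print(pearson_r2(log_s_observed, log_s_predicted))
```

    The two numbers answer different questions: a constant offset in every prediction leaves the squared correlation at 1 while the RMSE equals the offset, which is one reason both are usually reported together.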

  1. Resolving Transition Metal Chemical Space: Feature Selection for Machine Learning and Structure-Property Relationships.

    PubMed

    Janet, Jon Paul; Kulik, Heather J

    2017-11-22

    Machine learning (ML) of quantum mechanical properties shows promise for accelerating chemical discovery. For transition metal chemistry where accurate calculations are computationally costly and available training data sets are small, the molecular representation becomes a critical ingredient in ML model predictive accuracy. We introduce a series of revised autocorrelation functions (RACs) that encode relationships of the heuristic atomic properties (e.g., size, connectivity, and electronegativity) on a molecular graph. We alter the starting point, scope, and nature of the quantities evaluated in standard ACs to make these RACs amenable to inorganic chemistry. On an organic molecule set, we first demonstrate superior standard AC performance to other presently available topological descriptors for ML model training, with mean unsigned errors (MUEs) for atomization energies on set-aside test molecules as low as 6 kcal/mol. For inorganic chemistry, our RACs yield 1 kcal/mol ML MUEs on set-aside test molecules in spin-state splitting in comparison to 15-20× higher errors for feature sets that encode whole-molecule structural information. Systematic feature selection methods including univariate filtering, recursive feature elimination, and direct optimization (e.g., random forest and LASSO) are compared. Random-forest- or LASSO-selected subsets 4-5× smaller than the full RAC set produce sub- to 1 kcal/mol spin-splitting MUEs, with good transferability to metal-ligand bond length prediction (0.004-5 Å MUE) and redox potential on a smaller data set (0.2-0.3 eV MUE). Evaluation of feature selection results across property sets reveals the relative importance of local, electronic descriptors (e.g., electronegativity, atomic number) in spin-splitting and distal, steric effects in redox potential and bond lengths.

  2. Comparing and Validating Machine Learning Models for Mycobacterium tuberculosis Drug Discovery.

    PubMed

    Lane, Thomas; Russo, Daniel P; Zorn, Kimberley M; Clark, Alex M; Korotcov, Alexandru; Tkachenko, Valery; Reynolds, Robert C; Perryman, Alexander L; Freundlich, Joel S; Ekins, Sean

    2018-04-26

    Tuberculosis is a global health dilemma. In 2016, the WHO reported 10.4 million cases and 1.7 million deaths. The need to develop new treatments for those infected with Mycobacterium tuberculosis (Mtb) has led to many large-scale phenotypic screens and many thousands of new active compounds identified in vitro. However, with limited funding, efforts to discover new active molecules against Mtb need to be more efficient. Several computational machine learning approaches have been shown to have good enrichment and hit rates. We have curated small molecule Mtb data and developed new models with a total of 18,886 molecules with activity cutoffs of 10 μM, 1 μM, and 100 nM. These data sets were used to evaluate different machine learning methods (including deep learning) and metrics and to generate predictions for additional molecules published in 2017. One Mtb model, a combined in vitro and in vivo data Bayesian model at a 100 nM activity cutoff, yielded the following metrics for 5-fold cross validation: accuracy = 0.88, precision = 0.22, recall = 0.91, specificity = 0.88, kappa = 0.31, and MCC = 0.41. We have also curated an evaluation set (n = 153 compounds) published in 2017, and when used to test our model, it showed comparable statistics (accuracy = 0.83, precision = 0.27, recall = 1.00, specificity = 0.81, kappa = 0.36, and MCC = 0.47). We have also compared these models with additional machine learning algorithms, showing that Bayesian machine learning models constructed with literature Mtb data generated by different laboratories generally were equivalent to or outperformed deep neural networks with external test sets. Finally, we have also compared our training and test sets to show they were suitably diverse and different in order to represent useful evaluation sets. Such Mtb machine learning models could help prioritize compounds for testing in vitro and in vivo.
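
    All six reported metrics can be derived from the four cells of a binary confusion matrix using their standard definitions. The sketch below shows this; the function name and the counts are hypothetical and are not reconstructed from the study.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)          # fraction of predicted actives that are active
    recall = tp / (tp + fn)             # fraction of true actives that are found
    specificity = tn / (tn + fp)        # fraction of true inactives that are rejected
    # Cohen's kappa: agreement corrected for the agreement expected by chance.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts chosen only to exercise the function.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

    When actives are rare, high recall and accuracy can coexist with low precision, which is why kappa and MCC are reported alongside them.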

  3. A permanent magnet trap for buffer gas cooled atoms and molecules

    NASA Astrophysics Data System (ADS)

    Nohlmans, D.; Skoff, S. M.; Hendricks, R. J.; Segal, D. M.; Sauer, B. E.; Hinds, E. A.; Tarbutt, M. R.

    2013-05-01

    Cold molecules are set to provide a wealth of new science compared to their atomic counterparts. Here we want to present preliminary results for cooling and trapping atoms/molecules in a permanent magnetic trap. By replacing the conventional buffer gas cell with an arrangement of permanent magnets, we will be able to trap a fraction of the molecules right where they are cooled. For this purpose we have designed a quadrupole trap using NdFeB magnets, which has a trap depth of 0.4 K for molecules with a magnetic moment of 1 μB. Cold helium gas is pulsed into the trap region by a solenoid valve and the atoms/molecules are subsequently ablated into this and cooled via elastic collisions, leaving a fraction of them trapped. This new set-up is currently being tested with lithium atoms as they are easier to make. After having optimised the trapping and detection processes, we will use the same trap for YbF molecules.
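
    As a rough check on the quoted trap depth, the field difference that corresponds to 0.4 K for a magnetic moment of 1 μB can be estimated by equating the Zeeman energy μ·ΔB with k_B·T. This sketch assumes a linear Zeeman shift; the function name is hypothetical, and the result is an order-of-magnitude illustration rather than a figure from the abstract.

```python
K_B = 1.380649e-23       # Boltzmann constant in J/K (exact SI value)
MU_B = 9.2740100783e-24  # Bohr magneton in J/T (CODATA 2018)

def field_for_trap_depth(depth_kelvin, moment_in_bohr_magnetons=1.0):
    """Field difference (tesla) whose linear Zeeman shift equals k_B * depth."""
    return K_B * depth_kelvin / (moment_in_bohr_magnetons * MU_B)

delta_b = field_for_trap_depth(0.4)
print(f"{delta_b:.3f} T")  # prints 0.595 T
```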

  4. Fourier series of atomic radial distribution functions: A molecular fingerprint for machine learning models of quantum chemical properties

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    von Lilienfeld, O. Anatole; Ramakrishnan, Raghunathan; Rupp, Matthias

    We introduce a fingerprint representation of molecules based on a Fourier series of atomic radial distribution functions. This fingerprint is unique (except for chirality), continuous, and differentiable with respect to atomic coordinates and nuclear charges. It is invariant with respect to translation, rotation, and nuclear permutation, and requires no preconceived knowledge about chemical bonding, topology, or electronic orbitals. As such, it meets many important criteria for a good molecular representation, suggesting its usefulness for machine learning models of molecular properties trained across chemical compound space. To assess the performance of this new descriptor, we have trained machine learning models of molecular enthalpies of atomization for training sets with up to 10 k organic molecules, drawn at random from a published set of 134 k organic molecules with an average atomization enthalpy of over 1770 kcal/mol. We validate the descriptor on all remaining molecules of the 134 k set. For a training set of 10 k molecules, the fingerprint descriptor achieves a mean absolute error of 8.0 kcal/mol. This is slightly worse than the performance attained using the Coulomb matrix, another popular alternative, reaching 6.2 kcal/mol for the same training and test sets. (c) 2015 Wiley Periodicals, Inc.

  5. Accurate Induction Energies for Small Organic Molecules. 2. Development and Testing of Distributed Polarizability Models against SAPT(DFT) Energies.

    PubMed

    Misquitta, Alston J; Stone, Anthony J; Price, Sarah L

    2008-01-01

    In part 1 of this two-part investigation we set out the theoretical basis for constructing accurate models of the induction energy of clusters of moderately sized organic molecules. In this paper we use these techniques to develop a variety of accurate distributed polarizability models for a set of representative molecules that include formamide, N-methyl propanamide, benzene, and 3-azabicyclo[3.3.1]nonane-2,4-dione. We have also explored damping, penetration, and basis set effects. In particular, we have provided a way to treat the damping of the induction expansion. Different approximations to the induction energy are evaluated against accurate SAPT(DFT) energies, and we demonstrate the accuracy of our induction models on the formamide-water dimer.

  6. Applicability domains for classification problems: benchmarking of distance to models for AMES mutagenicity set

    EPA Science Inventory

    For QSAR and QSPR modeling of biological and physicochemical properties, estimating the accuracy of predictions is a critical problem. The “distance to model” (DM) can be defined as a metric that defines the similarity between the training set molecules and the test set compound ...

  7. Linear Discriminant Analysis for the in Silico Discovery of Mechanism-Based Reversible Covalent Inhibitors of a Serine Protease: Application of Hydration Thermodynamics Analysis and Semi-empirical Molecular Orbital Calculation.

    PubMed

    Masuda, Yosuke; Yoshida, Tomoki; Yamaotsu, Noriyuki; Hirono, Shuichi

    2018-01-01

    We recently reported that the Gibbs free energy of hydrolytic water molecules (ΔG_wat) in acyl-trypsin intermediates calculated by hydration thermodynamics analysis could be a useful metric for estimating the catalytic rate constants (k_cat) of mechanism-based reversible covalent inhibitors. For thorough evaluation, the proposed method was tested with an increased number of covalent ligands that have no corresponding crystal structures. After modeling acyl-trypsin intermediate structures using flexible molecular superposition, ΔG_wat values were calculated according to the proposed method. The orbital energies of antibonding π* molecular orbitals (MOs) of carbonyl C=O in covalently modified catalytic serine (E_orb) were also calculated by semi-empirical MO calculations. Then, linear discriminant analysis (LDA) was performed to build a model that can discriminate covalent inhibitor candidates from substrate-like ligands using ΔG_wat and E_orb. The model was built using a training set (10 compounds) and then validated by a test set (4 compounds). As a result, the training set and test set ligands were perfectly discriminated by the model. Hydrolysis was slower when (1) the hydrolytic water molecule has lower ΔG_wat; (2) the covalent ligand presents higher E_orb (higher reaction barrier). Results also showed that the entropic term of the hydrolytic water molecule (-TΔS_wat) could be used for estimating k_cat and for covalent inhibitor optimization; when the rotational freedom of the hydrolytic water molecule is limited, the chance for favorable interaction with the electrophilic acyl group would also be limited. The method proposed in this study would be useful for screening and optimizing mechanism-based reversible covalent inhibitors.

  8. A Bayesian approach to in silico blood-brain barrier penetration modeling.

    PubMed

    Martins, Ines Filipa; Teixeira, Ana L; Pinheiro, Luis; Falcao, Andre O

    2012-06-25

    The human blood-brain barrier (BBB) is a membrane that protects the central nervous system (CNS) by restricting the passage of solutes. The development of any new drug must take into account its existence, whether for designing new molecules that target components of the CNS or, on the other hand, for finding new substances that should not penetrate the barrier. Several studies in the literature have attempted to predict BBB penetration, so far with limited success and few, if any, applications to real-world drug discovery and development programs. Part of the reason is that only about 2% of small molecules can cross the BBB, and the available data sets are not representative of that reality, being generally biased with an over-representation of molecules that show an ability to permeate the BBB (BBB positives). To circumvent this limitation, the current study aims to devise and use a new approach based on Bayesian statistics, coupled with state-of-the-art machine learning methods, to produce a robust model capable of being applied in real-world drug research scenarios. The data set used, gathered from the literature, totals 1970 curated molecules, one of the largest for similar studies. Random Forests and Support Vector Machines were tested in various configurations against several chemical descriptor set combinations. Models were tested in a 5-fold cross-validation process, and the best one was tested on an independent validation set. The best fitted model produced an overall accuracy of 95%, with a mean square contingency coefficient (ϕ) of 0.74, and showed an overall capacity of 83% for predicting BBB positives and 96% for determining BBB negatives. This model was adapted into a Web-based tool made available for the whole community at http://b3pp.lasige.di.fc.ul.pt.
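
    The abstract's point about the roughly 2% base rate can be illustrated with Bayes' theorem. The sketch below computes the probability that a predicted BBB positive is a true positive, treating the reported 83% and 96% as sensitivity and specificity and 2% as the real-world prevalence. That reading is an assumption made here for illustration, not a calculation from the paper, and the function name is hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(truly positive | predicted positive) via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Same classifier, two different base rates of BBB-permeable molecules.
ppv_rare = positive_predictive_value(0.83, 0.96, prevalence=0.02)
ppv_balanced = positive_predictive_value(0.83, 0.96, prevalence=0.50)
print(f"prevalence 2%:  PPV = {ppv_rare:.2f}")      # prints 0.30
print(f"prevalence 50%: PPV = {ppv_balanced:.2f}")  # prints 0.95
```

    Under these assumptions the same classifier looks far less reliable at a 2% base rate than on a balanced data set, which is one way to see why a data set biased toward BBB positives is a problem.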

  9. Discriminating Drug-Like Compounds by Partition Trees with Quantum Similarity Indices and Graph Invariants.

    PubMed

    Julián-Ortiz, Jesus V de; Gozalbes, Rafael; Besalú, Emili

    2016-01-01

    The search for new drug candidates in databases is of paramount importance in pharmaceutical chemistry. The selection of molecular subsets is greatly optimized and much more promising when potential drug-like molecules are detected a priori. In this work, about one hundred thousand molecules are ranked following a new methodology: a drug/non-drug classifier constructed by a consensual set of classification trees. The classification trees arise from the stochastic generation of training sets, which in turn are used to estimate probability factors of test molecules to be drug-like compounds. Molecules were represented by Topological Quantum Similarity Indices and their Graph Theoretical counterparts. The contribution of the present paper consists of presenting an effective ranking method able to improve the probability of finding drug-like substances by using these types of molecular descriptors.

  10. Estimation of boiling points using density functional theory with polarized continuum model solvent corrections.

    PubMed

    Chan, Poh Yin; Tong, Chi Ming; Durrant, Marcus C

    2011-09-01

    An empirical method for estimation of the boiling points of organic molecules based on density functional theory (DFT) calculations with polarized continuum model (PCM) solvent corrections has been developed. The boiling points are calculated as the sum of three contributions. The first term is calculated directly from the structural formula of the molecule, and is related to its effective surface area. The second is a measure of the electronic interactions between molecules, based on the DFT-PCM solvation energy, and the third is employed only for planar aromatic molecules. The method is applicable to a very diverse range of organic molecules, with normal boiling points in the range of -50 to 500 °C, and includes ten different elements (C, H, Br, Cl, F, N, O, P, S and Si). Plots of observed versus calculated boiling points gave R²=0.980 for a training set of 317 molecules, and R²=0.979 for a test set of 74 molecules. The role of intramolecular hydrogen bonding in lowering the boiling points of certain molecules is quantitatively discussed. Crown Copyright © 2011. Published by Elsevier Inc. All rights reserved.

  11. A novel method to estimate the affinity of HLA-A∗0201 restricted CTL epitope

    NASA Astrophysics Data System (ADS)

    Xu, Yun-sheng; Lin, Yong; Zhu, Bo; Lin, Zhi-hua

    2009-02-01

    A set of 70 peptides with affinity for the class I MHC HLA-A∗0201 molecule was subjected to quantitative structure-affinity relationship studies based on the SCORE function, with good results (r² = 0.6982, RMS = 0.280). Then 'leave-one-out' cross-validation (LOO-CV) and an external test set of 18 samples were used to validate the QSAR model. The results of the LOO-CV were q² = 0.6188, RMS = 0.315, and the results for the external test set were r² = 0.5633, RMS = 0.2292. All of these show that the QSAR model has good predictability. Statistical analysis showed that hydrophobic and hydrogen-bond interactions played a significant role in peptide-MHC molecule binding. The study also provided useful information for structure modification of CTL epitopes, and laid a theoretical basis for the molecular design of therapeutic vaccines.
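
    The difference between the fitted r² and the leave-one-out q² reported above can be made concrete with a small sketch. It uses a one-descriptor least-squares line as a stand-in for the actual SCORE-based model, the common definition q² = 1 - PRESS/SS_tot, and hypothetical toy data; none of these details come from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys):
    """Coefficient of determination of the model fitted on all samples."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def loo_q_squared(xs, ys):
    """Leave-one-out q2: each sample is predicted by a model fitted without it."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot

# Hypothetical toy data: one descriptor value and one affinity per sample.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]
print(f"r2 = {r_squared(xs, ys):.4f}, q2 = {loo_q_squared(xs, ys):.4f}")
```

    For a least-squares fit, q² cannot exceed r², which matches the pattern in the abstract (q² = 0.6188 against r² = 0.6982).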

  12. The Development of Novel Chemical Fragment-Based Descriptors Using Frequent Common Subgraph Mining Approach and Their Application in QSAR Modeling.

    PubMed

    Khashan, Raed; Zheng, Weifan; Tropsha, Alexander

    2014-03-01

    We present a novel approach to generating fragment-based molecular descriptors. The molecules are represented by labeled undirected chemical graphs. Fast Frequent Subgraph Mining (FFSM) is used to find chemical fragments (subgraphs) that occur in at least a subset of all molecules in a dataset. The collection of frequent subgraphs (FSG) forms a set of dataset-specific descriptors whose values for each molecule are defined by the number of times each frequent fragment occurs in this molecule. We have employed the FSG descriptors to develop variable selection k Nearest Neighbor (kNN) QSAR models of several datasets with binary target property including Maximum Recommended Therapeutic Dose (MRTD), Salmonella Mutagenicity (Ames Genotoxicity), and P-Glycoprotein (PGP) data. Each dataset was divided into training, test, and validation sets to establish the statistical figures of merit reflecting the models' validated predictive power. The classification accuracies of models for both training and test sets for all datasets exceeded 75 %, and the accuracy for the external validation sets exceeded 72 %. The model accuracies were comparable to or better than those reported earlier in the literature for the same datasets. Furthermore, the use of fragment-based descriptors affords mechanistic interpretation of validated QSAR models in terms of essential chemical fragments responsible for the compounds' target property. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Efficient G0W0 using localized basis sets: a benchmark for molecules

    NASA Astrophysics Data System (ADS)

    Koval, Petr; Per Ljungberg, Mathias; Sanchez-Portal, Daniel

    Electronic structure calculations within Hedin's GW approximation are becoming increasingly accessible to the community. In particular, as it has been shown earlier and we confirm by calculations using our MBPT_LCAO package, the computational cost of the so-called G0W0 can be made comparable to the cost of a regular Hartree-Fock calculation. In this work, we study the performance of our new implementation of G0W0 to reproduce the ionization potentials of all 117 closed-shell molecules belonging to the G2/97 test set, using a pseudo-potential starting point provided by the popular density-functional package SIESTA. Moreover, the ionization potentials and electron affinities of a set of 24 acceptor molecules are compared to experiment and to reference all-electron calculations. PK: Guipuzcoa Fellow; PK,ML,DSP: Deutsche Forschungsgemeinschaft (SFB1083); PK,DSP: MINECO MAT2013-46593-C6-2-P.

  14. Rapid and accurate prediction and scoring of water molecules in protein binding sites.

    PubMed

    Ross, Gregory A; Morris, Garrett M; Biggin, Philip C

    2012-01-01

    Water plays a critical role in ligand-protein interactions. However, it is still challenging to predict accurately not only where water molecules prefer to bind, but also which of those water molecules might be displaceable. The latter is often seen as a route to optimizing affinity of potential drug candidates. Using a protocol we call WaterDock, we show that the freely available AutoDock Vina tool can be used to predict accurately the binding sites of water molecules. WaterDock was validated using data from X-ray crystallography, neutron diffraction and molecular dynamics simulations and correctly predicted 97% of the water molecules in the test set. In addition, we combined data-mining, heuristic and machine learning techniques to develop probabilistic water molecule classifiers. When applied to WaterDock predictions in the Astex Diverse Set of protein ligand complexes, we could identify whether a water molecule was conserved or displaced to an accuracy of 75%. A second model predicted whether water molecules were displaced by polar groups or by non-polar groups to an accuracy of 80%. These results should prove useful for anyone wishing to undertake rational design of new compounds where the displacement of water molecules is being considered as a route to improved affinity.

  15. Simulation-based cheminformatic analysis of organelle-targeted molecules: lysosomotropic monobasic amines.

    PubMed

    Zhang, Xinyuan; Zheng, Nan; Rosania, Gus R

    2008-09-01

    Cell-based molecular transport simulations are being developed to facilitate exploratory cheminformatic analysis of virtual libraries of small drug-like molecules. For this purpose, mathematical models of single cells are built from equations capturing the transport of small molecules across membranes. In turn, physicochemical properties of small molecules can be used as input to simulate intracellular drug distribution, through time. Here, with mathematical equations and biological parameters adjusted so as to mimic a leukocyte in the blood, simulations were performed to analyze steady state, relative accumulation of small molecules in lysosomes, mitochondria, and cytosol of this target cell, in the presence of a homogenous extracellular drug concentration. Similarly, with equations and parameters set to mimic an intestinal epithelial cell, simulations were also performed to analyze steady state, relative distribution and transcellular permeability in this non-target cell, in the presence of an apical-to-basolateral concentration gradient. With a test set of ninety-nine monobasic amines gathered from the scientific literature, simulation results helped analyze relationships between the chemical diversity of these molecules and their intracellular distributions.

  16. Breaking the polar-nonpolar division in solvation free energy prediction.

    PubMed

    Wang, Bao; Wang, Chengzhang; Wu, Kedi; Wei, Guo-Wei

    2018-02-05

    Implicit solvent models divide solvation free energies into polar and nonpolar additive contributions, whereas polar and nonpolar interactions are inseparable and nonadditive. We present a feature functional theory (FFT) framework to break this ad hoc division. The essential ideas of FFT are as follows: (i) representability assumption: there exists a microscopic feature vector that can uniquely characterize and distinguish one molecule from another; (ii) feature-function relationship assumption: the macroscopic features of a molecule, including solvation free energy, are functionals of microscopic feature vectors; and (iii) similarity assumption: molecules with similar microscopic features have similar macroscopic properties, such as solvation free energies. Based on these assumptions, solvation free energy prediction is carried out in the following protocol. First, we construct a molecular microscopic feature vector that is efficient in characterizing the solvation process using quantum mechanics and Poisson-Boltzmann theory. Microscopic feature vectors are combined with macroscopic features, that is, physical observables, to form extended feature vectors. Additionally, we partition a solvation dataset into queries according to molecular compositions. Moreover, for each target molecule, we adopt a machine learning algorithm for its nearest neighbor search, based on the selected microscopic feature vectors. Finally, from the extended feature vectors of obtained nearest neighbors, we construct a functional of solvation free energy, which is employed to predict the solvation free energy of the target molecule. The proposed FFT model has been extensively validated via a large dataset of 668 molecules. The leave-one-out test gives an optimal root-mean-square error (RMSE) of 1.05 kcal/mol. FFT predictions for the SAMPL0, SAMPL1, SAMPL2, SAMPL3, and SAMPL4 challenge sets deliver RMSEs of 0.61, 1.86, 1.64, 0.86, and 1.14 kcal/mol, respectively. Using a test set of 94 molecules and its associated training set, the present approach was carefully compared with a classic solvation model based on weighted solvent accessible surface area. © 2017 Wiley Periodicals, Inc.

  17. Integrated Computational Solution for Predicting Skin Sensitization Potential of Molecules

    PubMed Central

    Desai, Aarti; Singh, Vivek K.; Jere, Abhay

    2016-01-01

    Introduction: Skin sensitization forms a major toxicological endpoint for dermatology and cosmetic products. The recent ban on animal testing for cosmetics calls for alternative methods. We developed an integrated computational solution (SkinSense) that offers a robust solution and addresses the limitations of existing computational tools, i.e., a high false positive rate and/or limited coverage. Results: The key components of our solution include QSAR models selected from a combinatorial set, similarity information, and literature-derived sub-structure patterns of known skin protein reactive groups. Its prediction performance on a challenge set of molecules showed accuracy = 75.32%, CCR = 74.36%, sensitivity = 70.00% and specificity = 78.72%, which is better than several existing tools including VEGA (accuracy = 45.00% and CCR = 54.17% with ‘High’ reliability scoring), DEREK (accuracy = 72.73% and CCR = 71.44%) and TOPKAT (accuracy = 60.00% and CCR = 61.67%). Although TIMES-SS showed higher predictive power (accuracy = 90.00% and CCR = 92.86%), its coverage was very low (only 10 out of 77 molecules were predicted reliably). Conclusions: Owing to improved prediction performance and coverage, our solution can serve as a useful expert system towards Integrated Approaches to Testing and Assessment for skin sensitization. It would be invaluable to the cosmetic/dermatology industry for pre-screening their molecules, and for reducing time, cost and animal testing. PMID:27271321

  18. Association of blood lipids with Alzheimer's disease: A comprehensive lipidomics analysis.

    PubMed

    Proitsi, Petroula; Kim, Min; Whiley, Luke; Simmons, Andrew; Sattlecker, Martina; Velayudhan, Latha; Lupton, Michelle K; Soininen, Hillka; Kloszewska, Iwona; Mecocci, Patrizia; Tsolaki, Magda; Vellas, Bruno; Lovestone, Simon; Powell, John F; Dobson, Richard J B; Legido-Quigley, Cristina

    2017-02-01

    The aim of this study was to (1) replicate previous associations between six blood lipids and Alzheimer's disease (AD) (Proitsi et al 2015) and (2) identify novel associations between lipids, clinical AD diagnosis, disease progression and brain atrophy (left/right hippocampus/entorhinal cortex). We performed untargeted lipidomic analysis on 148 AD and 152 elderly control plasma samples and used univariate and multivariate analysis methods. We replicated our previous lipid associations and reported novel associations between lipid molecules and all phenotypes. A combination of 24 molecules classified AD patients with >70% accuracy in a test and a validation data set, and we identified lipid signatures that predicted disease progression (R² = 0.10, test data set) and brain atrophy (R² ≥ 0.14, all test data sets except left entorhinal cortex). We putatively identified a number of metabolic features including cholesteryl esters/triglycerides and phosphatidylcholines. Blood lipids are promising AD biomarkers that may lead to new treatment strategies. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  19. Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach.

    PubMed

    Schneider, Nadine; Fechner, Nikolas; Landrum, Gregory A; Stiefl, Nikolaus

    2017-08-28

    Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is not different: more and more data are being generated, for instance, by technologies such as DNA encoded libraries, peptide libraries, text mining of large literature corpora, and new in silico enumeration methods. Handling those huge sets of molecules effectively is quite challenging and requires compromises that often come at the expense of the interpretability of the results. In order to find an intuitive and meaningful approach to organizing large molecular data sets, we adopted a probabilistic framework called "topic modeling" from the text-mining field. Here we present the first chemistry-related implementation of this method, which allows large molecule sets to be assigned to "chemical topics" and the relationships between those topics to be investigated. In this first study, we thoroughly evaluate this novel method in different experiments and discuss both its disadvantages and advantages. We show very promising results in reproducing human-assigned concepts using the approach to identify and retrieve chemical series from sets of molecules. We have also created an intuitive visualization of the chemical topics output by the algorithm. This is a huge benefit compared to other unsupervised machine-learning methods, like clustering, which are commonly used to group sets of molecules. Finally, we applied the new method to the 1.6 million molecules of the ChEMBL22 data set to test its robustness and efficiency. In about 1 h we built a 100-topic model of this large data set in which we could identify interesting topics like "proteins", "DNA", or "steroids". Along with this publication we provide our data sets and an open-source implementation of the new method (CheTo) which will be part of an upcoming version of the open-source cheminformatics toolkit RDKit.

  20. Accurate energetics of small molecules containing third-row atoms Ga-Kr: A comparison of advanced ab initio and density functional theory

    NASA Astrophysics Data System (ADS)

    Yockel, Scott; Mintz, Benjamin; Wilson, Angela K.

    2004-07-01

    Advanced ab initio [coupled cluster theory through quasiperturbative triple excitations (CCSD(T))] and density functional (B3LYP) computational chemistry approaches were used in combination with the standard and augmented correlation consistent polarized valence basis sets [cc-pVnZ and aug-cc-pVnZ, where n=D(2), T(3), Q(4), and 5] to investigate the energetic and structural properties of small molecules containing third-row (Ga-Kr) atoms. These molecules were taken from the Gaussian-2 (G2) extended test set for third-row atoms. Several different schemes were used to extrapolate the calculated energies to the complete basis set (CBS) limit for CCSD(T) and the Kohn-Sham (KS) limit for B3LYP. Zero point energy and spin orbital corrections were included in the results. Overall, CCSD(T) atomization energies, ionization energies, proton affinities, and electron affinities are in good agreement with experiment, within 1.1 kcal/mol when the CBS limit has been determined using a series of two basis sets of at least triple zeta quality. For B3LYP, the overall mean absolute deviation from experiment for the three properties and the series of molecules is more significant at the KS limit, within 2.3 and 2.6 kcal/mol for the cc-pVnZ and aug-cc-pVnZ basis set series, respectively.
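
    The abstract does not say which extrapolation schemes were used. As one widely used example, the two-point inverse-cubic formula assumes the energy converges as E(n) = E_CBS + A/n³ in the cardinal number n of the basis set (3 for triple zeta, 4 for quadruple zeta, and so on). The sketch below implements only that assumed form; the function name and the synthetic energies are hypothetical.

```python
def cbs_two_point(e_small, n_small, e_large, n_large):
    """Two-point extrapolation assuming E(n) = E_CBS + A / n**3."""
    w_small = n_small ** 3
    w_large = n_large ** 3
    return (w_large * e_large - w_small * e_small) / (w_large - w_small)

# Synthetic energies generated from the assumed model, not from any calculation.
def model_energy(n, e_cbs=-1.0, a=0.5):
    return e_cbs + a / n ** 3

estimate = cbs_two_point(model_energy(3), 3, model_energy(4), 4)
print(f"extrapolated energy: {estimate:.6f}")
```

    Because the synthetic energies follow the assumed model exactly, the extrapolation recovers the limiting value; real energies only approximately follow an inverse-cubic law, which is one reason several schemes are usually compared.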

  1. Crossing borders to bind proteins--a new concept in protein recognition based on the conjugation of small organic molecules or short peptides to polypeptides from a designed set.

    PubMed

    Baltzer, Lars

    2011-06-01

    A new concept for protein recognition and binding is highlighted. The conjugation of small organic molecules or short peptides to polypeptides from a designed set provides binder molecules that bind proteins with high affinities, and with selectivities that are equal to those of antibodies. The small organic molecules or peptides need to bind the protein targets but only with modest affinities and selectivities, because conjugation to the polypeptides results in molecules with dramatically improved binder performance. The polypeptides are selected from a set of only sixteen sequences designed to bind, in principle, any protein. The small number of polypeptides used to prepare high-affinity binders contrasts sharply with the huge libraries used in binder technologies based on selection or immunization. Also, unlike antibodies and engineered proteins, the polypeptides have unordered three-dimensional structures and adapt to the proteins to which they bind. Binder molecules for the C-reactive protein, human carbonic anhydrase II, acetylcholine esterase, thymidine kinase 1, phosphorylated proteins, the D-dimer, and a number of antibodies are used as examples to demonstrate that affinities are achieved that are higher than those of the small molecules or peptides by as much as four orders of magnitude. Evaluation by pull-down experiments and ELISA-based tests in human serum show selectivities to be equal to those of antibodies. Small organic molecules and peptides are readily available from pools of endogenous ligands, enzyme substrates, inhibitors or products, from screened small molecule libraries, from phage display, and from mRNA display. The technology is an alternative to established binder concepts for applications in drug development, diagnostics, medical imaging, and protein separation.

  2. QSAR study of curcumine derivatives as HIV-1 integrase inhibitors.

    PubMed

    Gupta, Pawan; Sharma, Anju; Garg, Prabha; Roy, Nilanjan

    2013-03-01

    A QSAR study was performed on curcumine derivatives as HIV-1 integrase inhibitors using multiple linear regression. A statistically significant model was developed with a squared correlation coefficient (r²) of 0.891 and a cross-validated r² (r²_cv) of 0.825. The developed model revealed that electronic, shape, size, geometry, substitution information and hydrophilicity were important atomic properties for determining the inhibitory activity of these molecules. The model also passed external validation (r²_pred = 0.849) as well as Tropsha's test for model predictability. Furthermore, domain analysis was carried out to evaluate the prediction reliability for the external set molecules. The model was statistically robust and had good predictive power, and can be used for screening new molecules.

  3. The Electrolyte Genome project: A big data approach in battery materials discovery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Qu, Xiaohui; Jain, Anubhav; Rajput, Nav Nidhi

    2015-06-01

    We present a high-throughput infrastructure for the automated calculation of molecular properties with a focus on battery electrolytes. The infrastructure is largely open-source and handles both practical aspects (input file generation, output file parsing, and information management) as well as more complex problems (structure matching, salt complex generation, and failure recovery). Using this infrastructure, we have computed the ionization potential (IP) and electron affinities (EA) of 4830 molecules relevant to battery electrolytes (encompassing almost 55,000 quantum mechanics calculations) at the B3LYP/6-31+G(*) level. We describe automated workflows for computing redox potential, dissociation constant, and salt-molecule binding complex structure generation. We present routines for automatic recovery from calculation errors, which bring the failure rate from 9.2% to 0.8% for the QChem DFT code. Automated algorithms to check duplication between two arbitrary molecules and structures are described. We present benchmark data on basis sets and functionals on the G2-97 test set; one finding is that an IP/EA calculation method that combines PBE geometry optimization and B3LYP energy evaluation requires less computational cost and yields nearly identical results as compared to a full B3LYP calculation, and could be suitable for the calculation of large molecules. Our data indicate that among the 8 functionals tested, XYGJ-OS and B3LYP are the two best functionals to predict IP/EA with an RMSE of 0.12 and 0.27 eV, respectively. Application of our automated workflow to a large set of quinoxaline derivative molecules shows that functional group effect and substitution position effect can be separated for IP/EA of quinoxaline derivatives, and the most sensitive position is different for IP and EA. Published by Elsevier B.V.

  4. Anti AIDS drug design with the help of neural networks

    NASA Astrophysics Data System (ADS)

    Tetko, I. V.; Tanchuk, V. Yu.; Luik, A. I.

    1995-04-01

    Artificial neural networks were used to analyze and predict the human immunodeficiency virus type 1 reverse transcriptase inhibitors. Training and control set included 44 molecules (most of them are well-known substances such as AZT, TIBO, dde, etc.). The biological activities of molecules were taken from literature and rated for two classes: active and inactive compounds according to their values. We used topological indices as molecular parameters. Four most informative parameters (out of 46) were chosen using cluster analysis and original input parameters' estimation procedure and were used to predict activities of both control and new (synthesized in our institute) molecules. We applied pruning network algorithm and network ensembles to obtain the final classifier and avoid chance correlation. An increase in neural network generalization on the data from the control set was observed when using the aforementioned methods. The prognosis of new molecules revealed one molecule as possibly active. It was confirmed by further biological tests. The compound was as active as AZT and in order less toxic. The active compound is currently being evaluated in preclinical trials as possible drug for anti-AIDS therapy.

  5. Structural similarity based kriging for quantitative structure activity and property relationship modeling.

    PubMed

    Teixeira, Ana L; Falcao, Andre O

    2014-07-28

    Structurally similar molecules tend to have similar properties, i.e. closer molecules in the molecular space are more likely to yield similar property values while distant molecules are more likely to yield different values. Based on this principle, we propose the use of a new method that takes into account the high dimensionality of the molecular space, predicting chemical, physical, or biological properties based on the most similar compounds with measured properties. This methodology uses ordinary kriging coupled with three different molecular similarity approaches (based on molecular descriptors, fingerprints, and atom matching) which creates an interpolation map over the molecular space that is capable of predicting properties/activities for diverse chemical data sets. The proposed method was tested in two data sets of diverse chemical compounds collected from the literature and preprocessed. One of the data sets contained dihydrofolate reductase inhibition activity data, and the second molecules for which aqueous solubility was known. The overall predictive results using kriging for both data sets comply with the results obtained in the literature using typical QSPR/QSAR approaches. However, the procedure did not involve any type of descriptor selection or even minimal information about each problem, suggesting that this approach is directly applicable to a large spectrum of problems in QSAR/QSPR. Furthermore, the predictive results improve significantly with the similarity threshold between the training and testing compounds, allowing the definition of a confidence threshold of similarity and error estimation for each case inferred. The use of kriging for interpolation over the molecular metric space is independent of the training data set size, and no reparametrizations are necessary when more compounds are added or removed from the set, and increasing the size of the database will consequentially improve the quality of the estimations. Finally, it is shown that this model can be used for checking the consistency of measured data and for guiding an extension of the training set by determining the regions of the molecular space for which new experimental measurements could be used to maximize the model's predictive performance.

  6. Spectral properties from Matsubara Green's function approach: Application to molecules

    NASA Astrophysics Data System (ADS)

    Schüler, M.; Pavlyukh, Y.

    2018-03-01

    We present results for many-body perturbation theory for the one-body Green's function at finite temperatures using the Matsubara formalism. Our method relies on the accurate representation of the single-particle states in standard Gaussian basis sets, allowing to efficiently compute, among other observables, quasiparticle energies and Dyson orbitals of atoms and molecules. In particular, we challenge the second-order treatment of the Coulomb interaction by benchmarking its accuracy for a well-established test set of small molecules, which includes also systems where the usual Hartree-Fock treatment encounters difficulties. We discuss different schemes how to extract quasiparticle properties and assess their range of applicability. With an accurate solution and compact representation, our method is an ideal starting point to study electron dynamics in time-resolved experiments by the propagation of the Kadanoff-Baym equations.

  7. Measuring CAMD technique performance. 2. How "druglike" are drugs? Implications of Random test set selection exemplified using druglikeness classification models.

    PubMed

    Good, Andrew C; Hermsmeier, Mark A

    2007-01-01

    Research into the advancement of computer-aided molecular design (CAMD) has a tendency to focus on the discipline of algorithm development. Such efforts are often wrought to the detriment of the data set selection and analysis used in said algorithm validation. Here we highlight the potential problems this can cause in the context of druglikeness classification. More rigorous efforts are applied to the selection of decoy (nondruglike) molecules from the ACD. Comparisons are made between model performance using the standard technique of random test set creation with test sets derived from explicit ontological separation by drug class. The dangers of viewing druglike space as sufficiently coherent to permit simple classification are highlighted. In addition the issues inherent in applying unfiltered data and random test set selection to (Q)SAR models utilizing large and supposedly heterogeneous databases are discussed.

  8. Atom-type-based AI topological descriptors: application in structure-boiling point correlations of oxo organic compounds.

    PubMed

    Ren, Biye

    2003-01-01

    Structure-boiling point relationships are studied for a series of oxo organic compounds by means of multiple linear regression (MLR) analysis. Excellent MLR models based on the recently introduced Xu index and the atom-type-based AI indices are obtained for the two subsets containing respectively 77 ethers and 107 carbonyl compounds and a combined set of 184 oxo compounds. The best models are tested using the leave-one-out cross-validation and an external test set, respectively. The MLR model produces a correlation coefficient of r = 0.9977 and a standard error of s = 3.99 degrees C for the training set of 184 compounds, and r(cv) = 0.9974 and s(cv) = 4.16 degrees C for the cross-validation set, and r(pred) = 0.9949 and s(pred) = 4.38 degrees C for the prediction set of 21 compounds. For the two subsets containing respectively 77 ethers and 107 carbonyl compounds, the quality of the models is further improved. The standard errors are reduced to 3.30 and 3.02 degrees C, respectively. Furthermore, the results obtained from this study indicate that the boiling points of the studied oxo compound dominantly depend on molecular size and also depend on individual atom types, especially oxygen heteroatoms in molecules due to strong polar interactions between molecules. These excellent structure-boiling point models not only provide profound insights into the role of structural features in a molecule but also illustrate the usefulness of these indices in QSPR/QSAR modeling of complex compounds.

  9. On the accuracy of density functional theory and wave function methods for calculating vertical ionization energies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McKechnie, Scott; Booth, George H.; Cohen, Aron J.

    The best practice in computational methods for determining vertical ionization energies (VIEs) is assessed, via reference to experimentally determined VIEs that are corroborated by highly accurate coupled-cluster calculations. These reference values are used to benchmark the performance of density-functional theory (DFT) and wave function methods: Hartree-Fock theory (HF), second-order Møller-Plesset perturbation theory (MP2) and Electron Propagator Theory (EPT). The core test set consists of 147 small molecules. An extended set of six larger molecules, from benzene to hexacene, is also considered to investigate the dependence of the results on molecule size. The closest agreement with experiment is found for ionization energies obtained from total energy difference calculations. In particular, DFT calculations using exchange-correlation functionals with either a large amount of exact exchange or long-range correction perform best. The results from these functionals are also the least sensitive to an increase in molecule size. In general, ionization energies calculated directly from the orbital energies of the neutral species are less accurate and more sensitive to an increase in molecule size. For the single-calculation approach, the EPT calculations are in closest agreement for both sets of molecules. For the orbital energies from DFT functionals, only those with long-range correction give quantitative agreement, with dramatic failures for all other functionals considered. The results offer a practical hierarchy of approximations for the calculation of vertical ionization energies. In addition, the experimental and computational reference values can be used as a standardized set of benchmarks, against which other approximate methods can be compared.
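
    As a rough illustration of the two routes contrasted in this abstract, the sketch below computes a vertical ionization energy once from a total energy difference (cation minus neutral, both at the neutral geometry) and once from the negative of the highest occupied orbital energy, and scores predictions against reference values with an RMSE. The function names and the input energies are hypothetical placeholders, not values from the paper.

```python
from math import sqrt

HARTREE_TO_EV = 27.211386  # conversion factor, hartree -> eV


def vie_from_total_energies(e_neutral, e_cation):
    """Total-energy-difference route: E(cation) - E(neutral), in eV.

    Both energies are in hartree and are assumed to be computed
    at the geometry of the neutral molecule (vertical process).
    """
    return (e_cation - e_neutral) * HARTREE_TO_EV


def vie_from_orbital_energy(homo_energy):
    """Single-calculation route: minus the HOMO energy of the neutral, in eV."""
    return -homo_energy * HARTREE_TO_EV


def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical energies in hartree, for illustration only.
    print(vie_from_total_energies(-76.0, -75.5))
    print(vie_from_orbital_energy(-0.5))
```

    Keeping the two routes as separate functions makes it easy to tabulate both against the same reference set and compare their errors side by side.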

  10. Comparison of localized basis and plane-wave basis for density-functional calculations of organic molecules on metals

    NASA Astrophysics Data System (ADS)

    Lee, Kyuho; Yu, Jaejun; Morikawa, Yoshitada

    2007-01-01

    Localized pseudoatomic orbitals (PAOs) are mainly optimized and tested for the strong chemical bonds within molecules and solids with their proven accuracy and efficiency, but are prone to significant basis set superposition error (BSSE) for weakly interacting systems. Here we test the accuracy of PAO basis in comparison with the BSSE-free plane-wave basis for the physisorption of pentacene molecule on Au (001) by calculating the binding energy, adsorption height, and energy level alignment. We show that both the large cutoff radius for localized PAOs and the counter-poise correction for BSSE are necessary to obtain well-converged physical properties. Thereby obtained results are as accurate as the plane-wave basis results. The comparison with experiment is given as well.

  11. Versatile Analysis of Single-Molecule Tracking Data by Comprehensive Testing against Monte Carlo Simulations

    PubMed Central

    Wieser, Stefan; Axmann, Markus; Schütz, Gerhard J.

    2008-01-01

    We propose here an approach for the analysis of single-molecule trajectories which is based on a comprehensive comparison of an experimental data set with multiple Monte Carlo simulations of the diffusion process. It allows quantitative data analysis, particularly whenever analytical treatment of a model is infeasible. Simulations are performed on a discrete parameter space and compared with the experimental results by a nonparametric statistical test. The method provides a matrix of p-values that assess the probability for having observed the experimental data at each setting of the model parameters. We show the testing approach for three typical situations observed in the cellular plasma membrane: i), free Brownian motion of the tracer, ii), hop diffusion of the tracer in a periodic meshwork of squares, and iii), transient binding of the tracer to slowly diffusing structures. By plotting the p-value as a function of the model parameters, one can easily identify the most consistent parameter settings but also recover mutual dependencies and ambiguities which are difficult to determine by standard fitting routines. Finally, we used the test to reanalyze previous data obtained on the diffusion of the glycosylphosphatidylinositol-protein CD59 in the plasma membrane of the human T24 cell line. PMID:18805933

  12. Automated Inference of Chemical Discriminants of Biological Activity.

    PubMed

    Raschka, Sebastian; Scott, Anne M; Huertas, Mar; Li, Weiming; Kuhn, Leslie A

    2018-01-01

    Ligand-based virtual screening has become a standard technique for the efficient discovery of bioactive small molecules. Following assays to determine the activity of compounds selected by virtual screening, or other approaches in which dozens to thousands of molecules have been tested, machine learning techniques make it straightforward to discover the patterns of chemical groups that correlate with the desired biological activity. Defining the chemical features that generate activity can be used to guide the selection of molecules for subsequent rounds of screening and assaying, as well as help design new, more active molecules for organic synthesis. The quantitative structure-activity relationship machine learning protocols we describe here, using decision trees, random forests, and sequential feature selection, take as input the chemical structure of a single, known active small molecule (e.g., an inhibitor, agonist, or substrate) for comparison with the structure of each tested molecule. Knowledge of the atomic structure of the protein target and its interactions with the active compound are not required. These protocols can be modified and applied to any data set that consists of a series of measured structural, chemical, or other features for each tested molecule, along with the experimentally measured value of the response variable you would like to predict or optimize for your project, for instance, inhibitory activity in a biological assay or ΔG_binding. To illustrate the use of different machine learning algorithms, we step through the analysis of a dataset of inhibitor candidates from virtual screening that were tested recently for their ability to inhibit GPCR-mediated signaling in a vertebrate.

  13. MultiDK: A Multiple Descriptor Multiple Kernel Approach for Molecular Discovery and Its Application to Organic Flow Battery Electrolytes.

    PubMed

    Kim, Sungjin; Jinich, Adrián; Aspuru-Guzik, Alán

    2017-04-24

    We propose a multiple descriptor multiple kernel (MultiDK) method for efficient molecular discovery using machine learning. We show that the MultiDK method improves both the speed and accuracy of molecular property prediction. We apply the method to the discovery of electrolyte molecules for aqueous redox flow batteries. Using multiple-type (as opposed to single-type) descriptors, we obtain more relevant features for machine learning. Following the principle of "wisdom of the crowds", the combination of multiple-type descriptors significantly boosts prediction performance. Moreover, by employing multiple kernels (more than one kernel function for a set of the input descriptors), MultiDK exploits nonlinear relations between molecular structure and properties better than a linear regression approach. The multiple kernels consist of a Tanimoto similarity kernel and a linear kernel for a set of binary descriptors and a set of nonbinary descriptors, respectively. Using MultiDK, we achieve an average performance of r^2 = 0.92 with a test set of molecules for solubility prediction. We also extend MultiDK to predict pH-dependent solubility and apply it to a set of quinone molecules with different ionizable functional groups to assess their performance as flow battery electrolytes.
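
    The two kernel types named in this abstract can be sketched in a few lines: a Tanimoto similarity kernel for binary descriptors and a linear kernel for nonbinary ones. The abstract does not say how MultiDK combines them, so the weighted sum in `combined_kernel` and its `weight` parameter are assumptions made purely for illustration, as are all function names.

```python
def tanimoto_kernel(bits_a, bits_b):
    """Tanimoto similarity of two equal-length binary descriptor vectors."""
    if len(bits_a) != len(bits_b):
        raise ValueError("bit vectors must have the same length")
    both = sum(1 for a, b in zip(bits_a, bits_b) if a and b)
    either = sum(1 for a, b in zip(bits_a, bits_b) if a or b)
    # Convention chosen here: two all-zero vectors count as identical.
    return both / either if either else 1.0


def linear_kernel(vec_a, vec_b):
    """Plain dot product of two equal-length nonbinary descriptor vectors."""
    if len(vec_a) != len(vec_b):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(vec_a, vec_b))


def combined_kernel(bits_a, bits_b, vec_a, vec_b, weight=0.5):
    """Weighted sum of the two kernels (an assumed combination rule)."""
    return (weight * tanimoto_kernel(bits_a, bits_b)
            + (1.0 - weight) * linear_kernel(vec_a, vec_b))


if __name__ == "__main__":
    # Toy fingerprints and descriptor values, invented for the example.
    print(tanimoto_kernel([1, 1, 0, 1], [1, 0, 0, 1]))
    print(linear_kernel([1.0, 2.0], [3.0, 4.0]))
```

    A nonnegative weighted sum of two valid kernels is itself a valid kernel, which is why it is a convenient stand-in when the actual combination rule is not known.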

  14. Iterative Refinement of a Binding Pocket Model: Active Computational Steering of Lead Optimization

    PubMed Central

    2012-01-01

    Computational approaches for binding affinity prediction are most frequently demonstrated through cross-validation within a series of molecules or through performance shown on a blinded test set. Here, we show how such a system performs in an iterative, temporal lead optimization exercise. A series of gyrase inhibitors with known synthetic order formed the set of molecules that could be selected for “synthesis.” Beginning with a small number of molecules, based only on structures and activities, a model was constructed. Compound selection was done computationally, each time making five selections based on confident predictions of high activity and five selections based on a quantitative measure of three-dimensional structural novelty. Compound selection was followed by model refinement using the new data. Iterative computational candidate selection produced rapid improvements in selected compound activity, and incorporation of explicitly novel compounds uncovered much more diverse active inhibitors than strategies lacking active novelty selection. PMID:23046104

  15. Simulation-based cheminformatic analysis of organelle-targeted molecules: lysosomotropic monobasic amines

    PubMed Central

    Zhang, Xinyuan; Zheng, Nan

    2008-01-01

    Cell-based molecular transport simulations are being developed to facilitate exploratory cheminformatic analysis of virtual libraries of small drug-like molecules. For this purpose, mathematical models of single cells are built from equations capturing the transport of small molecules across membranes. In turn, physicochemical properties of small molecules can be used as input to simulate intracellular drug distribution, through time. Here, with mathematical equations and biological parameters adjusted so as to mimic a leukocyte in the blood, simulations were performed to analyze steady state, relative accumulation of small molecules in lysosomes, mitochondria, and cytosol of this target cell, in the presence of a homogenous extracellular drug concentration. Similarly, with equations and parameters set to mimic an intestinal epithelial cell, simulations were also performed to analyze steady state, relative distribution and transcellular permeability in this non-target cell, in the presence of an apical-to-basolateral concentration gradient. With a test set of ninety-nine monobasic amines gathered from the scientific literature, simulation results helped analyze relationships between the chemical diversity of these molecules and their intracellular distributions. PMID:18338229

  16. Using open source computational tools for predicting human metabolic stability and additional absorption, distribution, metabolism, excretion, and toxicity properties.

    PubMed

    Gupta, Rishi R; Gifford, Eric M; Liston, Ted; Waller, Chris L; Hohman, Moses; Bunin, Barry A; Ekins, Sean

    2010-11-01

    Ligand-based computational models could be more readily shared between researchers and organizations if they were generated with open source molecular descriptors [e.g., chemistry development kit (CDK)] and modeling algorithms, because this would negate the requirement for proprietary commercial software. We initially evaluated open source descriptors and model building algorithms using a training set of approximately 50,000 molecules and a test set of approximately 25,000 molecules with human liver microsomal metabolic stability data. A C5.0 decision tree model demonstrated that CDK descriptors together with a set of Smiles Arbitrary Target Specification (SMARTS) keys had good statistics [κ = 0.43, sensitivity = 0.57, specificity = 0.91, and positive predicted value (PPV) = 0.64], equivalent to those of models built with commercial Molecular Operating Environment 2D (MOE2D) and the same set of SMARTS keys (κ = 0.43, sensitivity = 0.58, specificity = 0.91, and PPV = 0.63). Extending the dataset to ∼193,000 molecules and generating a continuous model using Cubist with a combination of CDK and SMARTS keys or MOE2D and SMARTS keys confirmed this observation. When the continuous predictions and actual values were binned to get a categorical score we observed a similar κ statistic (0.42). The same combination of descriptor set and modeling method was applied to passive permeability and P-glycoprotein efflux data with similar model testing statistics. In summary, open source tools demonstrated predictive results comparable to those of commercial software with attendant cost savings. We discuss the advantages and disadvantages of open source descriptors and the opportunity for their use as a tool for organizations to share data precompetitively, avoiding repetition and assisting drug discovery.
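
    The statistics quoted in this abstract (κ, sensitivity, specificity, and PPV) all follow from a 2x2 confusion matrix. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
def binary_classification_stats(tp, fp, fn, tn):
    """Return kappa, sensitivity, specificity and PPV from confusion counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives and true negatives, respectively.
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # fraction of actual positives recovered
    specificity = tn / (tn + fp)   # fraction of actual negatives recovered
    ppv = tp / (tp + fp)           # fraction of predicted positives that are right

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / total
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)

    return {
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
    }


if __name__ == "__main__":
    # Invented counts, for illustration only.
    for name, value in binary_classification_stats(40, 10, 20, 30).items():
        print(f"{name}: {value:.3f}")
```

    Reporting κ alongside sensitivity and specificity is useful for imbalanced data sets, because κ discounts the agreement a classifier would reach by chance.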

  17. Benefits of statistical molecular design, covariance analysis, and reference models in QSAR: a case study on acetylcholinesterase

    NASA Astrophysics Data System (ADS)

    Andersson, C. David; Hillgren, J. Mikael; Lindgren, Cecilia; Qian, Weixing; Akfur, Christine; Berg, Lotta; Ekström, Fredrik; Linusson, Anna

    2015-03-01

    Scientific disciplines such as medicinal and environmental chemistry, pharmacology, and toxicology deal with the questions related to the effects small organic compounds exert on biological targets and the compounds' physicochemical properties responsible for these effects. A common strategy in this endeavor is to establish structure-activity relationships (SARs). The aim of this work was to illustrate benefits of performing a statistical molecular design (SMD) and proper statistical analysis of the molecules' properties before SAR and quantitative structure-activity relationship (QSAR) analysis. Our SMD followed by synthesis yielded a set of inhibitors of the enzyme acetylcholinesterase (AChE) that had very few inherent dependencies between the substructures in the molecules. If such dependencies exist, they cause severe errors in SAR interpretation and predictions by QSAR models, and leave a set of molecules less suitable for future decision-making. In our study, SAR and QSAR models could show which molecular substructures and physicochemical features were advantageous for the AChE inhibition. Finally, the QSAR model was used for the prediction of the inhibition of AChE by an external prediction set of molecules. The accuracy of these predictions was asserted by statistical significance tests and by comparisons to simple but relevant reference models.

  18. Optimization of auxiliary basis sets for the LEDO expansion and a projection technique for LEDO-DFT.

    PubMed

    Götz, Andreas W; Kollmar, Christian; Hess, Bernd A

    2005-09-01

    We present a systematic procedure for the optimization of the expansion basis for the limited expansion of diatomic overlap density functional theory (LEDO-DFT) and report on optimized auxiliary orbitals for the Ahlrichs split valence plus polarization basis set (SVP) for the elements H, Li-F, and Na-Cl. A new method to deal with near-linear dependences in the LEDO expansion basis is introduced, which greatly reduces the computational effort of LEDO-DFT calculations. Numerical results for a test set of small molecules demonstrate the accuracy of electronic energies, structural parameters, dipole moments, and harmonic frequencies. For larger molecular systems the numerical errors introduced by the LEDO approximation can lead to an uncontrollable behavior of the self-consistent field (SCF) process. A projection technique suggested by Löwdin is presented in the framework of LEDO-DFT, which guarantees SCF convergence. Numerical results on some critical test molecules suggest the general applicability of the auxiliary orbitals presented in combination with this projection technique. Timing results indicate that LEDO-DFT is competitive with conventional density fitting methods. (c) 2005 Wiley Periodicals, Inc.

  19. Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics

    PubMed Central

    Ardui, Simon; Ameur, Adam; Vermeesch, Joris R; Hestand, Matthew S

    2018-01-01

    Short read massive parallel sequencing has emerged as a standard diagnostic tool in the medical setting. However, short read technologies have inherent limitations such as GC bias, difficulties mapping to repetitive elements, trouble discriminating paralogous sequences, and difficulties in phasing alleles. Long read single molecule sequencers resolve these obstacles. Moreover, they offer higher consensus accuracies and can detect epigenetic modifications from native DNA. The first commercially available long read single molecule platform was the RS system based on PacBio's single molecule real-time (SMRT) sequencing technology, which has since evolved into their RSII and Sequel systems. Here we capsulize how SMRT sequencing is revolutionizing constitutional, reproductive, cancer, microbial and viral genetic testing. PMID:29401301

  20. Representation of molecular structure using quantum topology with inductive logic programming in structure-activity relationships.

    PubMed

    Buttingsrud, Bård; Ryeng, Einar; King, Ross D; Alsberg, Bjørn K

    2006-06-01

    The requirement of aligning each individual molecule in a data set severely limits the type of molecules which can be analysed with traditional structure activity relationship (SAR) methods. A method which solves this problem by using relations between objects is inductive logic programming (ILP). Another advantage of this methodology is its ability to include background knowledge as 1st-order logic. However, previous molecular ILP representations have not been effective in describing the electronic structure of molecules. We present a more unified and comprehensive representation based on Richard Bader's quantum topological atoms in molecules (AIM) theory where critical points in the electron density are connected through a network. AIM theory provides a wealth of chemical information about individual atoms and their bond connections enabling a more flexible and chemically relevant representation. To obtain even more relevant rules with higher coverage, we apply manual postprocessing and interpretation of ILP rules. We have tested the usefulness of the new representation in SAR modelling on classifying compounds of low/high mutagenicity and on a set of factor Xa inhibitors of high and low affinity.

  1. Predicting a small molecule-kinase interaction map: A machine learning approach

    PubMed Central

    2011-01-01

    Background: We present a machine learning approach to the problem of protein ligand interaction prediction. We focus on a set of binding data obtained from 113 different protein kinases and 20 inhibitors. It was attained through ATP site-dependent binding competition assays and constitutes the first available dataset of this kind. We extract information about the investigated molecules from various data sources to obtain an informative set of features. Results: A Support Vector Machine (SVM) as well as a decision tree algorithm (C5/See5) is used to learn models based on the available features which in turn can be used for the classification of new kinase-inhibitor pair test instances. We evaluate our approach using different feature sets and parameter settings for the employed classifiers. Moreover, the paper introduces a new way of evaluating predictions in such a setting, where different amounts of information about the binding partners can be assumed to be available for training. Results on an external test set are also provided. Conclusions: In most of the cases, the presented approach clearly outperforms the baseline methods used for comparison. Experimental results indicate that the applied machine learning methods are able to detect a signal in the data and predict binding affinity to some extent. For SVMs, the binding prediction can be improved significantly by using features that describe the active site of a kinase. For C5, besides diversity in the feature set, alignment scores of conserved regions turned out to be very useful. PMID:21708012

  2. vSDC: a method to improve early recognition in virtual screening when limited experimental resources are available.

    PubMed

    Chaput, Ludovic; Martinez-Sanz, Juan; Quiniou, Eric; Rigolet, Pascal; Saettel, Nicolas; Mouawad, Liliane

    2016-01-01

    In drug design, one may be confronted with the problem of finding hits for targets for which no small inhibiting molecules are known and only low-throughput experiments are available (like ITC or NMR studies), two common difficulties encountered in a typical academic setting. Using a virtual screening strategy like docking can alleviate some of the problems and save a considerable amount of time by selecting only top-ranking molecules, but only if the method is very efficient, i.e. when a good proportion of actives are found in the 1-10 % best ranked molecules. The use of several programs (in our study, Gold, Surflex, FlexX and Glide were considered) shows a divergence of the results, which presents a difficulty in guiding the experiments. To overcome this divergence and increase the yield of the virtual screening, we created the standard deviation consensus (SDC) and variable SDC (vSDC) methods, consisting of the intersection of molecule sets from several virtual screening programs, based on the standard deviations of their ranking distributions. SDC allowed us to find hits for two new protein targets by testing only 9 and 11 small molecules from a chemical library of circa 15,000 compounds. Furthermore, vSDC, when applied to the 102 proteins of the DUD-E benchmarking database, succeeded in finding more hits than any of the four isolated programs for 13-60 % of the targets. In addition, when only 10 molecules of each of the 102 chemical libraries were considered, vSDC performed better in the number of hits found, with an improvement of 6-24 % over the 10 best-ranked molecules given by the individual docking programs.
    Graphical abstract: In drug design, for a given target and a given chemical library, the results obtained with different virtual screening programs are divergent. So how can the experimental tests be guided rationally, especially when only a small number of experiments can be made? The variable Standard Deviation Consensus (vSDC) method was developed to address this issue. Left panel: the vSDC principle consists of intersecting molecule sets, chosen on the basis of the standard deviations of their ranking distributions, obtained from various virtual screening programs. In this study Glide, Gold, FlexX and Surflex were used and tested on the 102 targets of the DUD-E database. Right panel: comparison of the average percentage of hits found with vSDC and each of the four programs, when only 10 molecules from each of the 102 chemical libraries of the DUD-E database were considered. On average, vSDC was capable of finding 38 % of the findable hits, against 34 % for Glide, 32 % for Gold, 16 % for FlexX and 14 % for Surflex, showing that with vSDC, it was possible to overcome the unpredictability of the virtual screening results and to improve them.

  3. Small molecule absorption by PDMS in the context of drug response bioassays.

    PubMed

    van Meer, B J; de Vries, H; Firth, K S A; van Weerd, J; Tertoolen, L G J; Karperien, H B J; Jonkheijm, P; Denning, C; IJzerman, A P; Mummery, C L

    2017-01-08

    The polymer polydimethylsiloxane (PDMS) is widely used to build microfluidic devices compatible with cell culture. Whilst convenient in manufacture, PDMS has the disadvantage that it can absorb small molecules such as drugs. In microfluidic devices like "Organs-on-Chip", designed to examine cell behavior and test the effects of drugs, this might impact drug bioavailability. Here we developed an assay to compare the absorption of a test set of four cardiac drugs by PDMS based on measuring the residual non-absorbed compound by High Pressure Liquid Chromatography (HPLC). We showed that absorption was variable and time dependent and not determined exclusively by hydrophobicity as claimed previously. We demonstrated that two commercially available lipophilic coatings and the presence of cells affected absorption. The use of lipophilic coatings may be useful in preventing small molecule absorption by PDMS. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  4. Polyatomic molecular Dirac-Hartree-Fock calculations with Gaussian basis sets

    NASA Technical Reports Server (NTRS)

    Dyall, Kenneth G.; Faegri, Knut, Jr.; Taylor, Peter R.

    1990-01-01

    Numerical methods have been used successfully in atomic Dirac-Hartree-Fock (DHF) calculations for many years. Some DHF calculations using numerical methods have been done on diatomic molecules, but while these serve a useful purpose for calibration, the computational effort in extending this approach to polyatomic molecules is prohibitive. An alternative more in line with traditional quantum chemistry is to use an analytical basis set expansion of the wave function. This approach fell into disrepute in the early 1980's due to problems with variational collapse and intruder states, but has recently been put on firm theoretical foundations. In particular, the problems of variational collapse are well understood, and prescriptions for avoiding the most serious failures have been developed. Consequently, it is now possible to develop reliable molecular programs using basis set methods. This paper describes such a program and reports results of test calculations to demonstrate the convergence and stability of the method.

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Freitez, Juan A.; Sanchez, Morella; Ruette, Fernando

    Simulated annealing (SA) and simplified GSA (SGSA) techniques were applied to parameter optimization of a parametric quantum chemistry method (CATIVIC). A set of organic molecules was selected to test these techniques. The algorithms were compared for error function minimization with respect to experimental values. Results show that SGSA is more efficient than SA with respect to computer time. Accuracy is similar in both methods; however, there are important differences in the final set of parameters.
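
    For orientation, a generic simulated-annealing loop for minimizing an error function over a parameter vector might look like the sketch below. It shows plain SA only, not SGSA and not the CATIVIC parameterization; the function name `anneal`, the linear cooling schedule, and the Gaussian step are assumptions chosen for illustration.

```python
import math
import random


def anneal(error, x0, steps=2000, t0=1.0, step_size=0.1, seed=0):
    """Minimize error(x) by simulated annealing; returns (best_x, best_error)."""
    rng = random.Random(seed)
    x = list(x0)
    e = error(x)
    best_x, best_e = list(x), e
    for i in range(steps):
        # Linear cooling from t0 towards (almost) zero.
        t = t0 * (1.0 - i / steps) + 1e-9
        # Propose a random Gaussian perturbation of every parameter.
        cand = [xi + rng.gauss(0.0, step_size) for xi in x]
        e_cand = error(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if e_cand < e or rng.random() < math.exp(-(e_cand - e) / t):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = list(x), e
    return best_x, best_e
```

    In a parameterization setting, `error` would measure the deviation between computed and experimental values for the chosen molecule set.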

  6. Third-Order Incremental Dual-Basis Set Zero-Buffer Approach: An Accurate and Efficient Way To Obtain CCSD and CCSD(T) Energies.

    PubMed

    Zhang, Jun; Dolg, Michael

    2013-07-09

    An efficient way to obtain accurate CCSD and CCSD(T) energies for large systems, i.e., the third-order incremental dual-basis set zero-buffer approach (inc3-db-B0), has been developed and tested. This approach combines the powerful incremental scheme with the dual-basis set method, and along with the newly proposed K-means clustering (KM) method and zero-buffer (B0) approximation, can obtain very accurate absolute and relative energies efficiently. We tested the approach for 10 systems of different chemical nature, i.e., intermolecular interactions including hydrogen bonding, dispersion interaction, and halogen bonding; an intramolecular rearrangement reaction; aliphatic and conjugated hydrocarbon chains; three compact covalent molecules; and a water cluster. The results show that the errors are <1.94 kJ/mol (or 0.46 kcal/mol) for relative energies and <0.0026 hartree for absolute energies. By parallelization, our approach can be applied to molecules of more than 30 atoms and more than 100 correlated electrons with high-quality basis sets such as cc-pVDZ or cc-pVTZ, saving computational cost by a factor of more than 10-20 compared to the traditional implementation. The physical reasons for the success of the inc3-db-B0 approach are also analyzed.
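
    The incremental scheme mentioned above expands the correlation energy in one-body, two-body, and three-body increments over orbital domains. The sketch below shows only that generic truncated expansion; the dual-basis, K-means clustering, and zero-buffer parts are not modeled, and the callable `energy` is a hypothetical stand-in for a correlation-energy calculation on a subset of domains.

```python
from itertools import combinations


def incremental_energy(domains, energy, order=3):
    """Sum the increments of all domain subsets up to the given order.

    domains: sequence of hashable domain labels.
    energy:  callable taking a tuple of domains and returning its energy.
    """
    increments = {}
    total = 0.0
    for k in range(1, order + 1):
        for subset in combinations(domains, k):
            # Increment = subset energy minus all lower-order increments inside it.
            inc = energy(subset)
            for m in range(1, k):
                for sub in combinations(subset, m):
                    inc -= increments[sub]
            increments[subset] = inc
            total += inc
    return total
```

    With `order=3` the sum is exact whenever the underlying energy contains no four-body or higher contributions, which is the rationale for truncating at third order.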

  7. Predicting Mouse Liver Microsomal Stability with “Pruned” Machine Learning Models and Public Data

    PubMed Central

    Perryman, Alexander L.; Stratton, Thomas P.; Ekins, Sean; Freundlich, Joel S.

    2015-01-01

    Purpose: Mouse efficacy studies are a critical hurdle to advance translational research of potential therapeutic compounds for many diseases. Although mouse liver microsomal (MLM) stability studies are not a perfect surrogate for in vivo studies of metabolic clearance, they are the initial model system used to assess metabolic stability. Consequently, we explored the development of machine learning models that can enhance the probability of identifying compounds possessing MLM stability. Methods: Published assays on MLM half-life values were identified in PubChem, reformatted, and curated to create a training set with 894 unique small molecules. These data were used to construct machine learning models assessed with internal cross-validation, external tests with a published set of antitubercular compounds, and independent validation with an additional diverse set of 571 compounds (PubChem data on percent metabolism). Results: “Pruning” out the moderately unstable/moderately stable compounds from the training set produced models with superior predictive power. Bayesian models displayed the best predictive power for identifying compounds with a half-life ≥1 hour. Conclusions: Our results suggest the pruning strategy may be of general benefit to improve test set enrichment and provide machine learning models with enhanced predictive value for the MLM stability of small organic molecules. This study represents the most exhaustive study to date of using machine learning approaches with MLM data from public sources. PMID:26415647
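
    The pruning step can be pictured as dropping a band of borderline compounds before training a two-class model. In the sketch below the band limits (30 and 120 minutes) and the function name `prune_training_set` are hypothetical; the abstract states only that a half-life of at least 1 hour defines the stable class, not which cutoffs were used for pruning.

```python
def prune_training_set(half_lives, low=30.0, high=120.0):
    """Label clearly unstable/stable compounds and drop the borderline ones.

    half_lives: dict mapping compound id -> MLM half-life in minutes.
    low, high:  hypothetical band limits; compounds with low <= t < high are removed.
    """
    labeled = {}
    for compound_id, t_half in half_lives.items():
        if t_half < low:
            labeled[compound_id] = "unstable"
        elif t_half >= high:
            labeled[compound_id] = "stable"
        # Anything inside the band is treated as ambiguous and left out.
    return labeled
```

    The pruned, labeled dictionary would then feed whatever classifier is being trained, while external test sets are left unpruned.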

  8. Predicting Mouse Liver Microsomal Stability with "Pruned" Machine Learning Models and Public Data.

    PubMed

    Perryman, Alexander L; Stratton, Thomas P; Ekins, Sean; Freundlich, Joel S

    2016-02-01

    Mouse efficacy studies are a critical hurdle to advance translational research of potential therapeutic compounds for many diseases. Although mouse liver microsomal (MLM) stability studies are not a perfect surrogate for in vivo studies of metabolic clearance, they are the initial model system used to assess metabolic stability. Consequently, we explored the development of machine learning models that can enhance the probability of identifying compounds possessing MLM stability. Published assays on MLM half-life values were identified in PubChem, reformatted, and curated to create a training set with 894 unique small molecules. These data were used to construct machine learning models assessed with internal cross-validation, external tests with a published set of antitubercular compounds, and independent validation with an additional diverse set of 571 compounds (PubChem data on percent metabolism). "Pruning" out the moderately unstable / moderately stable compounds from the training set produced models with superior predictive power. Bayesian models displayed the best predictive power for identifying compounds with a half-life ≥1 h. Our results suggest the pruning strategy may be of general benefit to improve test set enrichment and provide machine learning models with enhanced predictive value for the MLM stability of small organic molecules. This study represents the most exhaustive study to date of using machine learning approaches with MLM data from public sources.

  9. Single DNA imaging and length quantification through a mobile phone microscope

    NASA Astrophysics Data System (ADS)

    Wei, Qingshan; Luo, Wei; Chiang, Samuel; Kappel, Tara; Mejia, Crystal; Tseng, Derek; Chan, Raymond Yan L.; Yan, Eddie; Qi, Hangfei; Shabbir, Faizan; Ozkan, Haydar; Feng, Steve; Ozcan, Aydogan

    2016-03-01

    The development of sensitive optical microscopy methods for the detection of single DNA molecules has become an active research area which cultivates various promising applications including point-of-care (POC) genetic testing and diagnostics. Direct visualization of individual DNA molecules usually relies on sophisticated optical microscopes that are mostly available in well-equipped laboratories. For POC DNA testing/detection, there is an increasing need for the development of new single DNA imaging and sensing methods that are field-portable, cost-effective, and accessible for diagnostic applications in resource-limited or field-settings. For this aim, we developed a mobile-phone integrated fluorescence microscopy platform that allows imaging and sizing of single DNA molecules that are stretched on a chip. This handheld device contains an opto-mechanical attachment integrated onto a smartphone camera module, which creates a high signal-to-noise ratio dark-field imaging condition by using an oblique illumination/excitation configuration. Using this device, we demonstrated imaging of individual linearly stretched λ DNA molecules (48 kilobase-pair, kbp) over a 2 mm² field-of-view. We further developed a robust computational algorithm and a smartphone app that allowed the users to quickly quantify the length of each DNA fragment imaged using this mobile interface. The cellphone-based device was tested with five different DNA samples (5, 10, 20, 40, and 48 kbp), and a sizing accuracy of <1 kbp was demonstrated for DNA strands longer than 10 kbp. This mobile DNA imaging and sizing platform can be very useful for various diagnostic applications including the detection of disease-specific genes and quantification of copy-number-variations at POC settings.

  10. Consistent structures and interactions by density functional theory with small atomic orbital basis sets.

    PubMed

    Grimme, Stefan; Brandenburg, Jan Gerit; Bannwarth, Christoph; Hansen, Andreas

    2015-08-07

    A density functional theory (DFT) based composite electronic structure approach is proposed to efficiently compute structures and interaction energies in large chemical systems. It is based on the well-known and numerically robust Perdew-Burke-Ernzerhof (PBE) generalized-gradient-approximation in a modified global hybrid functional with a relatively large amount of non-local Fock-exchange. The orbitals are expanded in Ahlrichs-type valence-double zeta atomic orbital (AO) Gaussian basis sets, which are available for many elements. In order to correct for the basis set superposition error (BSSE) and to account for the important long-range London dispersion effects, our well-established atom-pairwise potentials are used. In the design of the new method, particular attention has been paid to an accurate description of structural parameters in various covalent and non-covalent bonding situations as well as in periodic systems. Together with the recently proposed three-fold corrected (3c) Hartree-Fock method, the new composite scheme (termed PBEh-3c) represents the next member in a hierarchy of "low-cost" electronic structure approaches. They are mainly free of BSSE and account for most interactions in a physically sound and asymptotically correct manner. PBEh-3c yields good results for thermochemical properties in the huge GMTKN30 energy database. Furthermore, the method shows excellent performance for non-covalent interaction energies in small and large complexes. For evaluating its performance on equilibrium structures, a new compilation of standard test sets is suggested. These consist of small (light) molecules, partially flexible, medium-sized organic molecules, molecules comprising heavy main group elements, larger systems with long bonds, 3d-transition metal systems, non-covalently bound complexes (S22 and S66×8 sets), and peptide conformations. For these sets, overall deviations from accurate reference data are smaller than for various other tested DFT methods and reach that of triple-zeta AO basis set second-order perturbation theory (MP2/TZ) level at a tiny fraction of computational effort. Periodic calculations conducted for molecular crystals to test structures (including cell volumes) and sublimation enthalpies indicate very good accuracy competitive to computationally more involved plane-wave based calculations. PBEh-3c can be applied routinely to several hundreds of atoms on a single processor and it is suggested as a robust "high-speed" computational tool in theoretical chemistry and physics.

  11. On a relationship between molecular polarizability and partial molar volume in water.

    PubMed

    Ratkova, Ekaterina L; Fedorov, Maxim V

    2011-12-28

    We reveal a universal relationship between molecular polarizability (a single-molecule property) and partial molar volume in water that is an ensemble property characterizing solute-solvent systems. Since both of these quantities are of the key importance to describe solvation behavior of dissolved molecular species in aqueous solutions, the obtained relationship should have a high impact in chemistry, pharmaceutical, and life sciences as well as in environments. We demonstrated that the obtained relationship between the partial molar volume in water and the molecular polarizability has in general a non-homogeneous character. We performed a detailed analysis of this relationship on a set of ~200 organic molecules from various chemical classes and revealed its fine well-organized structure. We found that this structure strongly depends on the chemical nature of the solutes and can be rationalized in terms of specific solute-solvent interactions. Efficiency and universality of the proposed approach was demonstrated on an external test set containing several dozens of polyfunctional and druglike molecules.

  12. An implicit boundary integral method for computing electric potential of macromolecules in solvent

    NASA Astrophysics Data System (ADS)

    Zhong, Yimin; Ren, Kui; Tsai, Richard

    2018-04-01

    A numerical method using implicit surface representations is proposed to solve the linearized Poisson-Boltzmann equation that arises in mathematical models for the electrostatics of molecules in solvent. The proposed method uses an implicit boundary integral formulation to derive a linear system defined on Cartesian nodes in a narrowband surrounding the closed surface that separates the molecule and the solvent. The needed implicit surface is constructed from the given atomic description of the molecules, by a sequence of standard level set algorithms. A fast multipole method is applied to accelerate the solution of the linear system. A few numerical studies involving some standard test cases are presented and compared to other existing results.

  13. Artificial neural network and classical least-squares methods for neurotransmitter mixture analysis.

    PubMed

    Schulze, H G; Greek, L S; Gorzalka, B B; Bree, A V; Blades, M W; Turner, R F

    1995-02-01

    Identification of individual components in biological mixtures can be a difficult problem regardless of the analytical method employed. In this work, Raman spectroscopy was chosen as a prototype analytical method due to its inherent versatility and applicability to aqueous media, making it useful for the study of biological samples. Artificial neural networks (ANNs) and the classical least-squares (CLS) method were used to identify and quantify the Raman spectra of the small-molecule neurotransmitters and mixtures of such molecules. The transfer functions used by a network, as well as the architecture of a network, played an important role in the ability of the network to identify the Raman spectra of individual neurotransmitters and the Raman spectra of neurotransmitter mixtures. Specifically, networks using sigmoid and hyperbolic tangent transfer functions generalized better from the mixtures in the training data set to those in the testing data sets than networks using sine functions. Networks with connections that permit the local processing of inputs generally performed better than other networks on all the testing data sets, and better than the CLS method of curve fitting on novel spectra of some neurotransmitters. The CLS method was found to perform well on noisy, shifted, and difference spectra.
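
    Classical least squares treats a mixture spectrum as a linear combination of pure-component spectra and solves for the coefficients. The sketch below is a generic illustration of that idea using the normal equations and plain Gaussian elimination; the function name `cls_fit` is an assumption, and no preprocessing (baseline removal, normalization) of the kind a real Raman analysis would need is included.

```python
def cls_fit(components, mixture):
    """Return coefficients c minimizing ||sum_j c[j] * components[j] - mixture||^2.

    components: list of pure-component spectra (equal-length lists of floats).
    mixture:    measured mixture spectrum (same length).
    """
    n = len(components)
    # Normal equations A c = b, with A[i][j] = <s_i, s_j> and b[i] = <s_i, mixture>.
    A = [[sum(x * y for x, y in zip(components[i], components[j]))
          for j in range(n)] for i in range(n)]
    b = [sum(x * y for x, y in zip(components[i], mixture)) for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(A[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / A[r][r]
    return coeffs
```

    The fit assumes the pure-component spectra are linearly independent; nearly collinear spectra make the normal equations ill-conditioned.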

  14. A prototypic small molecule database for bronchoalveolar lavage-based metabolomics

    NASA Astrophysics Data System (ADS)

    Walmsley, Scott; Cruickshank-Quinn, Charmion; Quinn, Kevin; Zhang, Xing; Petrache, Irina; Bowler, Russell P.; Reisdorph, Richard; Reisdorph, Nichole

    2018-04-01

    The analysis of bronchoalveolar lavage fluid (BALF) using mass spectrometry-based metabolomics can provide insight into lung diseases, such as asthma. However, the important step of compound identification is hindered by the lack of a small molecule database that is specific for BALF. Here we describe prototypic, small molecule databases derived from human BALF samples (n=117). Human BALF was extracted into lipid and aqueous fractions and analyzed using liquid chromatography mass spectrometry. Following filtering to reduce contaminants and artifacts, the resulting BALF databases (BALF-DBs) contain 11,736 lipid and 658 aqueous compounds. Over 10% of these were found in 100% of samples. Testing the BALF-DBs using nested test sets produced a 99% match rate for lipids and 47% match rate for aqueous molecules. Searching an independent dataset resulted in 45% matching to the lipid BALF-DB compared to <25% when general databases are searched. The BALF-DBs are available for download from MetaboLights. Overall, the BALF-DBs can reduce false positives and improve confidence in compound identification compared to when general databases are used.

  15. Molecular docking, 3D QSAR and dynamics simulation studies of imidazo-pyrrolopyridines as janus kinase 1 (JAK 1) inhibitors.

    PubMed

    Itteboina, Ramesh; Ballu, Srilata; Sivan, Sree Kanth; Manga, Vijjulatha

    2016-10-01

    Janus kinase 1 (JAK 1) plays a critical role in initiating responses to cytokines by the JAK-signal transducer and activator of transcription (JAK-STAT). This controls survival, proliferation and differentiation of a variety of cells. Docking, 3D quantitative structure activity relationship (3D-QSAR) and molecular dynamics (MD) studies were performed on a series of imidazo-pyrrolopyridine derivatives reported as JAK 1 inhibitors. A QSAR model was generated using 30 molecules in the training set; the developed model showed good statistical reliability, which is evident from the r²(ncv) and r²(loo) values. The predictive ability of this model was determined using a test set of 13 molecules that gave acceptable predictive correlation (r²(pred)) values. Finally, molecular dynamics simulation was performed to validate docking results and MM/GBSA calculations. This allowed us to compare binding free energies of the cocrystal ligand and the newly designed molecule R1. The good concordance between the docking results and CoMFA/CoMSIA contour maps afforded useful clues for the rational modification of molecules to design more potent JAK 1 inhibitors. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Design new P-glycoprotein modulators based on molecular docking and CoMFA study of α, β-unsaturated carbonyl-based compounds and oxime analogs as anticancer agents

    NASA Astrophysics Data System (ADS)

    Sepehri, Bakhtyar; Ghavami, Raouf

    2017-02-01

    In this research, molecular docking and CoMFA were used to determine interactions of α, β-unsaturated carbonyl-based compounds and oxime analogs with P-glycoprotein and to predict their activity. The molecular docking study showed that these molecules establish strong van der Waals interactions with the side chains of PHE-332, PHE-728 and PHE-974. Based on the effect of component numbers on the squared correlation coefficient for cross validation tests (including leave-one-out and leave-many-out), CoMFA models with five components were built to predict pIC50 of molecules in seven cancer cell lines (including Panc-1 (pancreas cancer cell line), PaCa-2 (pancreatic carcinoma cell line), MCF-7 (breast cancer cell line), A-549 (epithelial), HT-29 (colon cancer cell line), H-460 (lung cancer cell line), PC-3 (prostate cancer cell line)). R² values for training and test sets were in the range of 0.94-0.97 and 0.84-0.92, respectively, and for LOO and LMO cross validation tests, q² values were in the range of 0.75-0.82 and 0.65-0.73, respectively. Based on molecular docking results and extracted steric and electrostatic contour maps for CoMFA models, four new molecules with higher activity than the most active compound in the data set were designed.
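
    The leave-one-out q² quoted above is commonly computed as 1 - PRESS/SS, where PRESS sums the squared errors of predictions made for each compound by a model fitted without it. The sketch below illustrates this with a one-descriptor straight-line model instead of the PLS regression used in CoMFA; the function names are assumptions, and SS is taken about the mean of all observed values, which is one of several conventions in use.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def q2_loo(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS for the line model."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict it.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    my = sum(ys) / len(ys)
    ss = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss
```

    Leave-many-out validation follows the same pattern with several compounds held out per refit, which typically gives somewhat lower q² values, as in the ranges reported above.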

  17. Detailed analysis of complex single molecule FRET data with the software MASH

    NASA Astrophysics Data System (ADS)

    Hadzic, Mélodie C. A. S.; Kowerko, Danny; Börner, Richard; Zelger-Paulus, Susann; Sigel, Roland K. O.

    2016-04-01

    The processing and analysis of surface-immobilized single molecule FRET (Förster resonance energy transfer) data follows systematic steps (e.g. single molecule localization, clearance of different sources of noise, selection of the conformational and kinetic model, etc.) that require a solid knowledge in optics, photophysics, signal processing and statistics. The present proceeding aims at standardizing and facilitating procedures for single molecule detection by guiding the reader through an optimization protocol for a particular experimental data set. Relevant features were determined from single molecule movies (SMM) imaging Cy3- and Cy5-labeled Sc.ai5γ group II intron molecules synthetically recreated, to test the performances of four different detection algorithms. Up to 120 different parameterizations per method were routinely evaluated to finally establish an optimum detection procedure. The present protocol is adaptable to any movie displaying surface-immobilized molecules, and can be easily reproduced with our home-written software MASH (multifunctional analysis software for heterogeneous data) and script routines (both available in the download section of www.chem.uzh.ch/rna).

  18. Influence of vehicles used for oral dosing of test molecules on the progression of Mycobacterium tuberculosis infection in mice.

    PubMed

    Singh, Shubhra; Dwivedi, Richa; Chaturvedi, Vinita

    2012-11-01

    Preclinical evaluation of drug-like molecules requires their oral administration to experimental animals using suitable vehicles. We studied the effect of oral dosing with corn oil, carboxymethyl cellulose, dimethyl sulfoxide, and polysorbate-80 on the progression of Mycobacterium tuberculosis infection in mice. Infection was monitored by physical (survival time and body weight) and bacteriological (viable counts in lungs) parameters. Compared with water, corn oil significantly improved both sets of parameters, whereas the other vehicles affected only physical parameters.

  19. Influence of Vehicles Used for Oral Dosing of Test Molecules on the Progression of Mycobacterium tuberculosis Infection in Mice

    PubMed Central

    Singh, Shubhra; Dwivedi, Richa

    2012-01-01

    Preclinical evaluation of drug-like molecules requires their oral administration to experimental animals using suitable vehicles. We studied the effect of oral dosing with corn oil, carboxymethyl cellulose, dimethyl sulfoxide, and polysorbate-80 on the progression of Mycobacterium tuberculosis infection in mice. Infection was monitored by physical (survival time and body weight) and bacteriological (viable counts in lungs) parameters. Compared with water, corn oil significantly improved both sets of parameters, whereas the other vehicles affected only physical parameters. PMID:22926571

  20. Construction of the Fock Matrix on a Grid-Based Molecular Orbital Basis Using GPGPUs.

    PubMed

    Losilla, Sergio A; Watson, Mark A; Aspuru-Guzik, Alán; Sundholm, Dage

    2015-05-12

    We present a GPGPU implementation of the construction of the Fock matrix in the molecular orbital basis using the fully numerical, grid-based bubbles representation. For a test set of molecules containing up to 90 electrons, the total Hartree-Fock energies obtained from reference GTO-based calculations are reproduced within 10⁻⁴ Eh to 10⁻⁸ Eh for most of the molecules studied. Despite the very large number of arithmetic operations involved, the high performance obtained made the calculations possible on a single Nvidia Tesla K40 GPGPU card.

  1. Towards a universal method for calculating hydration free energies: a 3D reference interaction site model with partial molar volume correction.

    PubMed

    Palmer, David S; Frolov, Andrey I; Ratkova, Ekaterina L; Fedorov, Maxim V

    2010-12-15

    We report a simple universal method to systematically improve the accuracy of hydration free energies calculated using an integral equation theory of molecular liquids, the 3D reference interaction site model. A strong linear correlation is observed between the difference of the experimental and (uncorrected) calculated hydration free energies and the calculated partial molar volume for a data set of 185 neutral organic molecules from different chemical classes. By using the partial molar volume as a linear empirical correction to the calculated hydration free energy, we obtain predictions of hydration free energies in excellent agreement with experiment (R = 0.94, σ = 0.99 kcal mol⁻¹ for a test set of 120 organic molecules).
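
    The correction described above amounts to fitting a straight line to the error (experimental minus calculated hydration free energy) as a function of partial molar volume, then adding that line to new calculated values. The sketch below shows this under the assumption of a simple one-variable least-squares fit; the function names are illustrative and no fitted coefficients from the paper are reproduced.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def fit_pmv_correction(pmv, dg_calc, dg_exp):
    """Fit (a, b) so that dg_exp - dg_calc is approximated by a * pmv + b."""
    errors = [e - c for e, c in zip(dg_exp, dg_calc)]
    return fit_line(pmv, errors)


def corrected_dg(dg_calc, pmv, a, b):
    """Apply the linear partial-molar-volume correction to one calculated value."""
    return dg_calc + a * pmv + b
```

    The coefficients would be fitted on a training set and then applied unchanged to an independent test set, which is how the reported agreement with experiment was assessed.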

  2. Hydration of Atmospheric Molecular Clusters: Systematic Configurational Sampling.

    PubMed

    Kildgaard, Jens; Mikkelsen, Kurt V; Bilde, Merete; Elm, Jonas

    2018-05-09

    We present a new systematic configurational sampling algorithm for investigating the potential energy surface of hydrated atmospheric molecular clusters. The algorithm is based on creating a Fibonacci sphere around each atom in the cluster and adding water molecules to each point in 9 different orientations. To allow the sampling of water molecules to existing hydrogen bonds, the cluster is displaced along the hydrogen bond and a water molecule is placed in between in three different orientations. Generated redundant structures are eliminated based on minimizing the root mean square distance (RMSD) of different conformers. Initially, the clusters are sampled using the semiempirical PM6 method and subsequently using density functional theory (M06-2X and ωB97X-D) with the 6-31++G(d,p) basis set. Applying the developed algorithm we study the hydration of sulfuric acid with up to 15 water molecules. We find that the additions of the first four water molecules "saturate" the sulfuric acid molecule and are more thermodynamically favourable than the addition of water molecules 5-15. Using the large generated set of conformers, we assess the performance of approximate methods (ωB97X-D, M06-2X, PW91 and PW6B95-D3) in calculating the binding energies and assigning the global minimum conformation compared to high level CCSD(T)-F12a/VDZ-F12 reference calculations. The tested DFT functionals systematically overestimate the binding energies compared to coupled cluster calculations, and we find that this deficiency can be corrected by a simple scaling factor.
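
    A Fibonacci sphere places points nearly uniformly on a sphere by stepping the polar coordinate evenly and rotating by the golden angle at each step. The sketch below generates such points around one atom; the function name, the radius, and the number of points are illustrative assumptions, and the subsequent placement and orientation of water molecules is not shown.

```python
import math


def fibonacci_sphere(n, radius=1.0, center=(0.0, 0.0, 0.0)):
    """Return n roughly evenly spaced points on a sphere around center."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    cx, cy, cz = center
    points = []
    for i in range(n):
        # z runs from just below +1 to just above -1 in equal steps.
        z = 1.0 - 2.0 * (i + 0.5) / n
        ring = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        points.append((cx + radius * ring * math.cos(theta),
                       cy + radius * ring * math.sin(theta),
                       cz + radius * z))
    return points
```

    Calling this once per atom, with the atom position as `center`, yields candidate sites at which trial water molecules could be added.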

  3. Pharmacophore Based Virtual Screening Approach to Identify Selective PDE4B Inhibitors

    PubMed Central

    Gaurav, Anand; Gautam, Vertika

    2017-01-01

    Phosphodiesterase 4 (PDE4) has been established as a promising target in asthma and chronic obstructive pulmonary disease. PDE4B subtype selective inhibitors are known to reduce the dose limiting adverse effect associated with non-selective PDE4B inhibitors. This makes the development of PDE4B subtype selective inhibitors a desirable research goal. To achieve this goal, a ligand-based pharmacophore modeling approach was employed. Separate pharmacophore hypotheses for PDE4B and PDE4D inhibitors were generated using the HypoGen algorithm and 106 PDE4 inhibitors from the literature having thiopyrano[3,2-d]pyrimidine, 2-arylpyrimidine, and triazine skeletons. Suitable training and test sets were created from the molecules as per the guidelines available for the HypoGen program. The training set was used for hypothesis development while the test set was used for validation. Fisher validation was also used to test the significance of the developed hypotheses. The validated pharmacophore hypotheses for PDE4B and PDE4D inhibitors were used in sequential virtual screening of the ZINC database of drug-like molecules to identify selective PDE4B inhibitors. The hits were screened for their estimated activity and fit value. The top hit was subjected to docking into the active sites of PDE4B and PDE4D to confirm its selectivity for PDE4B. The hits are proposed to be evaluated further using in-vitro assays. PMID:29201082

  4. Varespladib (LY315920) Appears to Be a Potent, Broad-Spectrum, Inhibitor of Snake Venom Phospholipase A2 and a Possible Pre-Referral Treatment for Envenomation

    PubMed Central

    Lewin, Matthew; Samuel, Stephen; Merkel, Janie; Bickler, Philip

    2016-01-01

    Snakebite remains a neglected medical problem of the developing world with up to 125,000 deaths each year despite more than a century of calls to improve snakebite prevention and care. An estimated 75% of fatalities from snakebite occur outside the hospital setting. Because phospholipase A2 (PLA2) activity is an important component of venom toxicity, we sought candidate PLA2 inhibitors by directly testing drugs. Surprisingly, varespladib and its orally bioavailable prodrug, methyl-varespladib showed high-level secretory PLA2 (sPLA2) inhibition at nanomolar and picomolar concentrations against 28 medically important snake venoms from six continents. In vivo proof-of-concept studies with varespladib had striking survival benefit against lethal doses of Micrurus fulvius and Vipera berus venom, and suppressed venom-induced sPLA2 activity in rats challenged with 100% lethal doses of M. fulvius venom. Rapid development and deployment of a broad-spectrum PLA2 inhibitor alone or in combination with other small molecule inhibitors of snake toxins (e.g., metalloproteases) could fill the critical therapeutic gap spanning pre-referral and hospital setting. Lower barriers for clinical testing of safety tested, repurposed small molecule therapeutics are a potentially economical and effective path forward to fill the pre-referral gap in the setting of snakebite. PMID:27571102

  5. Multilevel geometry optimization

    NASA Astrophysics Data System (ADS)

    Rodgers, Jocelyn M.; Fast, Patton L.; Truhlar, Donald G.

    2000-02-01

    Geometry optimization has been carried out for three test molecules using six multilevel electronic structure methods, in particular Gaussian-2, Gaussian-3, multicoefficient G2, multicoefficient G3, and two multicoefficient correlation methods based on correlation-consistent basis sets. In the Gaussian-2 and Gaussian-3 methods, various levels are added and subtracted with unit coefficients, whereas the multicoefficient Gaussian-x methods involve noninteger parameters as coefficients. The multilevel optimizations reduce the average error in the geometry (averaged over the 18 cases) by a factor of about two when compared to the single most expensive component of a given multilevel calculation, and in all 18 cases the accuracy of the atomization energy for the three test molecules improves, with an average improvement of 16.7 kcal/mol.

  6. Prediction of molecular crystal structures by a crystallographic QM/MM model with full space-group symmetry.

    PubMed

    Mörschel, Philipp; Schmidt, Martin U

    2015-01-01

    A crystallographic quantum-mechanical/molecular-mechanical model (c-QM/MM model) with full space-group symmetry has been developed for molecular crystals. The lattice energy was calculated by quantum-mechanical methods for short-range interactions and force-field methods for long-range interactions. The quantum-mechanical calculations covered the interactions within the molecule and the interactions of a reference molecule with each of the surrounding 12-15 molecules. The interactions with all other molecules were treated by force-field methods. In each optimization step the energies in the QM and MM shells were calculated separately as single-point energies; after adding both energy contributions, the crystal structure (including the lattice parameters) was optimized accordingly. The space-group symmetry was maintained throughout. Crystal structures with more than one molecule per asymmetric unit, e.g. structures with Z' = 2, hydrates and solvates, have been optimized as well. Test calculations with different quantum-mechanical methods on nine small organic molecules revealed that the density functional theory methods with dispersion correction using the B97-D functional with 6-31G* basis set in combination with the DREIDING force field reproduced the experimental crystal structures with good accuracy. Subsequently the c-QM/MM method was applied to nine compounds from the CCDC blind tests resulting in good energy rankings and excellent geometric accuracies.

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grimme, Stefan, E-mail: grimme@thch.uni-bonn.de; Brandenburg, Jan Gerit; Bannwarth, Christoph

    A density functional theory (DFT) based composite electronic structure approach is proposed to efficiently compute structures and interaction energies in large chemical systems. It is based on the well-known and numerically robust Perdew-Burke-Ernzerhof (PBE) generalized-gradient-approximation in a modified global hybrid functional with a relatively large amount of non-local Fock-exchange. The orbitals are expanded in Ahlrichs-type valence-double zeta atomic orbital (AO) Gaussian basis sets, which are available for many elements. In order to correct for the basis set superposition error (BSSE) and to account for the important long-range London dispersion effects, our well-established atom-pairwise potentials are used. In the design of the new method, particular attention has been paid to an accurate description of structural parameters in various covalent and non-covalent bonding situations as well as in periodic systems. Together with the recently proposed three-fold corrected (3c) Hartree-Fock method, the new composite scheme (termed PBEh-3c) represents the next member in a hierarchy of “low-cost” electronic structure approaches. They are mainly free of BSSE and account for most interactions in a physically sound and asymptotically correct manner. PBEh-3c yields good results for thermochemical properties in the huge GMTKN30 energy database. Furthermore, the method shows excellent performance for non-covalent interaction energies in small and large complexes. For evaluating its performance on equilibrium structures, a new compilation of standard test sets is suggested. These consist of small (light) molecules, partially flexible, medium-sized organic molecules, molecules comprising heavy main group elements, larger systems with long bonds, 3d-transition metal systems, non-covalently bound complexes (S22 and S66×8 sets), and peptide conformations. For these sets, overall deviations from accurate reference data are smaller than for various other tested DFT methods and reach that of triple-zeta AO basis set second-order perturbation theory (MP2/TZ) level at a tiny fraction of computational effort. Periodic calculations conducted for molecular crystals to test structures (including cell volumes) and sublimation enthalpies indicate very good accuracy competitive to computationally more involved plane-wave based calculations. PBEh-3c can be applied routinely to several hundreds of atoms on a single processor and it is suggested as a robust “high-speed” computational tool in theoretical chemistry and physics.

  8. Towards the chemometric dissection of peptide - HLA-A*0201 binding affinity: comparison of local and global QSAR models

    NASA Astrophysics Data System (ADS)

    Doytchinova, Irini A.; Walshe, Valerie; Borrow, Persephone; Flower, Darren R.

    2005-03-01

    The affinities of 177 nonameric peptides binding to the HLA-A*0201 molecule were measured using a FACS-based MHC stabilisation assay and analysed using chemometrics. Their structures were described by global and local descriptors, and QSAR models were derived by genetic algorithm, stepwise regression and PLS. The global molecular descriptors included molecular connectivity χ indices, κ shape indices, E-state indices, molecular properties like molecular weight and log P, and three-dimensional descriptors like polarizability, surface area and volume. The local descriptors were of two types. The first used a binary string to indicate the presence of each amino acid type at each position of the peptide. The second was also position-dependent but used five z-scales to describe the main physicochemical properties of the amino acids forming the peptides. The models were developed using a representative training set of 131 peptides and validated using an independent test set of 46 peptides. It was found that the global descriptors could neither explain the variance in the training set nor predict the affinities of the test set accurately. Both types of local descriptors gave QSAR models with better explained variance and predictive ability. The results suggest that, in their interactions with the MHC molecule, the peptide acts as a complicated ensemble of multiple amino acids mutually potentiating each other.
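
    The first type of local descriptor described above (a binary string marking which amino acid sits at each position) can be illustrated with a minimal sketch. The residue ordering, the function name `encode_peptide` and the example nonamer are assumptions made for illustration; the abstract does not specify how the authors laid out their bit strings.

```python
# Position-dependent binary ("one-hot") encoding of a peptide.
# Assumption: 20 standard residues in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode_peptide(peptide, length=9):
    """Return a flat list of length * 20 bits, one block of 20 per position."""
    if len(peptide) != length:
        raise ValueError(f"expected a {length}-mer, got {len(peptide)} residues")
    bits = []
    for residue in peptide:
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unknown residue: {residue!r}")
        # Exactly one bit is set in each 20-bit block.
        bits.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return bits


# Hypothetical example nonamer, used only to show the shape of the output.
vector = encode_peptide("GILGFVFTL")
print(len(vector), sum(vector))
```

    For a nonamer this gives 9 × 20 = 180 bits with exactly nine bits set, so each regression coefficient in a model built on it refers to one residue at one position.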

  9. Condorcet and borda count fusion method for ligand-based virtual screening.

    PubMed

    Ahmed, Ali; Saeed, Faisal; Salim, Naomie; Abdo, Ammar

    2014-01-01

    It is known that any individual similarity measure will not always give the best recall of active molecule structure for all types of activity classes. The effectiveness of ligand-based virtual screening approaches can be enhanced by using data fusion. Data fusion can be implemented using two different approaches: group fusion and similarity fusion. Similarity fusion involves searching using multiple similarity measures. The similarity scores, or ranking, for each similarity measure are combined to obtain the final ranking of the compounds in the database. The Condorcet fusion method was examined. This approach combines the outputs of similarity searches from eleven association and distance similarity coefficients, and then the winner measure for each class of molecules, based on Condorcet fusion, was chosen to be the best method of searching. The recall of active molecules retrieved in the top 5% and a significance test are used to evaluate our proposed method. The MDL drug data report (MDDR), maximum unbiased validation (MUV) and Directory of Useful Decoys (DUD) data sets were used for experiments and were represented by 2D fingerprints. Simulated virtual screening experiments with the standard two data sets show that the use of Condorcet fusion provides a very simple way of improving the ligand-based virtual screening, especially when the active molecules being sought have the lowest degree of structural heterogeneity. However, the effectiveness of the Condorcet fusion was increased slightly when structural sets of high diversity activities were being sought.
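
    To make the idea of combining rankings concrete, the following is a generic sketch of Borda count and Condorcet-style fusion, not the authors' implementation. It assumes every similarity measure ranks the same set of compounds, and it turns the Condorcet pairwise comparisons into a full ordering by counting pairwise majority wins (a Copeland-style choice made here for illustration). The compound IDs and function names are hypothetical.

```python
from itertools import combinations


def borda_fusion(rankings):
    """Fuse rankings (lists of compound IDs, best first) by summed Borda points."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, compound in enumerate(ranking):
            # Best rank earns n points, worst rank earns 1 point.
            scores[compound] = scores.get(compound, 0) + (n - position)
    return sorted(scores, key=lambda c: (-scores[c], c))


def condorcet_fusion(rankings):
    """Order compounds by the number of pairwise majority contests they win."""
    compounds = sorted(set(rankings[0]))
    positions = [{c: i for i, c in enumerate(r)} for r in rankings]
    wins = {c: 0 for c in compounds}
    for a, b in combinations(compounds, 2):
        a_preferred = sum(1 for p in positions if p[a] < p[b])
        b_preferred = len(positions) - a_preferred
        if a_preferred > b_preferred:
            wins[a] += 1
        elif b_preferred > a_preferred:
            wins[b] += 1
    return sorted(compounds, key=lambda c: (-wins[c], c))


# Three hypothetical similarity measures ranking four compounds.
rankings = [
    ["m1", "m2", "m3", "m4"],
    ["m2", "m1", "m4", "m3"],
    ["m1", "m3", "m2", "m4"],
]
print(borda_fusion(rankings))
print(condorcet_fusion(rankings))
```

    Borda fusion uses how far apart two compounds sit in each list, whereas the Condorcet variant only asks which of the two is ranked higher by most measures, which makes it less sensitive to a single measure that ranks a compound very badly.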

  10. Condorcet and borda count fusion method for ligand-based virtual screening

    PubMed Central

    2014-01-01

    Background It is known that any individual similarity measure will not always give the best recall of active molecule structure for all types of activity classes. The effectiveness of ligand-based virtual screening approaches can be enhanced by using data fusion. Data fusion can be implemented using two different approaches: group fusion and similarity fusion. Similarity fusion involves searching using multiple similarity measures. The similarity scores, or ranking, for each similarity measure are combined to obtain the final ranking of the compounds in the database. Results The Condorcet fusion method was examined. This approach combines the outputs of similarity searches from eleven association and distance similarity coefficients, and then the winner measure for each class of molecules, based on Condorcet fusion, was chosen to be the best method of searching. The recall of active molecules retrieved in the top 5% and a significance test are used to evaluate our proposed method. The MDL drug data report (MDDR), maximum unbiased validation (MUV) and Directory of Useful Decoys (DUD) data sets were used for experiments and were represented by 2D fingerprints. Conclusions Simulated virtual screening experiments with the standard two data sets show that the use of Condorcet fusion provides a very simple way of improving the ligand-based virtual screening, especially when the active molecules being sought have the lowest degree of structural heterogeneity. However, the effectiveness of the Condorcet fusion was increased slightly when structural sets of high diversity activities were being sought. PMID:24883114

  11. Relationships between lattice energies of inorganic ionic solids

    NASA Astrophysics Data System (ADS)

    Kaya, Savaş

    2018-06-01

    Lattice energy, which is a measure of the stabilities of inorganic ionic solids, is the energy required to decompose a solid into its constituent independent gaseous ions. In the present work, the relationships between lattice energies of many diatomic and triatomic inorganic ionic solids are revealed and a simple rule that can be used for the prediction of the lattice energies of inorganic ionic solids is introduced. According to this rule, the lattice energy of an AB molecule can be predicted with the help of the lattice energies of AX, BY and XY molecules in agreement with the experimental data. This rule is valid for not only diatomic molecules but also triatomic molecules. The lattice energy equations proposed in this rule provide results compatible with previously published lattice energy equations by Jenkins, Kaya, Born-Lande, Born-Mayer, Kapustinskii and Reddy. For a large set of tested molecules, calculated percent standard deviation values considering experimental data and the results of the equations proposed in this work are in general between 1% and 2%.

  12. CoMFA, LeapFrog and blind docking studies on sulfonanilide derivatives acting as selective aromatase expression regulators.

    PubMed

    Gueto, Carlos; Torres, Juan; Vivas-Reyes, Ricardo

    2009-09-01

    Aromatase, the enzyme responsible for estrogen biosynthesis, is an attractive target in the treatment of hormone-dependent breast cancer. In this manuscript, the structure-based drug design approach of sulfonanilide analogues as potential selective aromatase expression regulators (SAERs) is described. Receptor-independent CoMFA (Comparative Molecular Field Analysis) maps were employed for generating a pseudocavity for LeapFrog calculation. A robust model, using 45 and 10 molecules in the training and test sets, respectively, was developed producing statistically significant results with cross-validated and conventional correlation coefficients of 0.656 and 0.956, respectively. This model was used to predict the activity of newly proposed molecules as SAER candidates that are two orders of magnitude more potent than the previously reported compounds. Also in the present study, the computational blind docking method using eHiTS is tested on the studied group of molecules and the COX-2 enzyme. Future perspectives of the method in the screening of SAER candidates with no COX-2 inhibitory activity are discussed.

  13. Density Functional Theory Calculation of pKa's of Thiols in Aqueous Solution Using Explicit Water Molecules and the Polarizable Continuum Model.

    PubMed

    Thapa, Bishnu; Schlegel, H Bernhard

    2016-07-21

    The pKa's of substituted thiols are important for understanding their properties and reactivities in applications in chemistry, biochemistry, and material chemistry. For a collection of 175 different density functionals and the SMD implicit solvation model, the average errors in the calculated pKa's of methanethiol and ethanethiol are almost 10 pKa units higher than for imidazole. A test set of 45 substituted thiols with pKa's ranging from 4 to 12 has been used to assess the performance of 8 functionals with 3 different basis sets. As expected, the basis set needs to include polarization functions on the hydrogens and diffuse functions on the heavy atoms. Solvent cavity scaling was ineffective in correcting the errors in the calculated pKa's. Inclusion of an explicit water molecule that is hydrogen bonded with the H of the thiol group (in neutral) or S(-) (in thiolates) lowers error by an average of 3.5 pKa units. With one explicit water and the SMD solvation model, pKa's calculated with the M06-2X, PBEPBE, BP86, and LC-BLYP functionals are found to deviate from the experimental values by about 1.5-2.0 pKa units whereas pKa's with the B3LYP, ωB97XD and PBEVWN5 functionals are still in error by more than 3 pKa units. The inclusion of three explicit water molecules lowers the calculated pKa further by about 4.5 pKa units. With the B3LYP and ωB97XD functionals, the calculated pKa's are within one unit of the experimental values whereas most other functionals used in this study underestimate the pKa's. This study shows that the ωB97XD functional with the 6-31+G(d,p) and 6-311++G(d,p) basis sets, and the SMD solvation model with three explicit water molecules hydrogen bonded to the sulfur produces the best result for the test set (average error -0.11 ± 0.50 and +0.15 ± 0.58, respectively). The B3LYP functional also performs well (average error -1.11 ± 0.82 and -0.78 ± 0.79, respectively).
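
    The abstract does not state the working equation, but computed pKa values are conventionally obtained from the aqueous deprotonation free energy through pKa = ΔG / (RT ln 10). The sketch below assumes that relation and a temperature of 298.15 K; the function names are hypothetical. It shows how pKa errors translate into free-energy errors.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_per_pka_unit(temperature=298.15):
    """Free energy (kcal/mol) that corresponds to one pKa unit."""
    return R_KCAL * temperature * math.log(10)


def pka_from_delta_g(delta_g_kcal, temperature=298.15):
    """pKa from an aqueous deprotonation free energy in kcal/mol."""
    return delta_g_kcal / delta_g_per_pka_unit(temperature)


print(round(delta_g_per_pka_unit(), 3))
print(round(3.5 * delta_g_per_pka_unit(), 2))
```

    Under these assumptions one pKa unit is about 1.36 kcal/mol, so the 3.5-unit improvement quoted for one explicit water corresponds to roughly 4.8 kcal/mol in the deprotonation free energy.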

  14. A graph-based approach to construct target-focused libraries for virtual screening.

    PubMed

    Naderi, Misagh; Alvin, Chris; Ding, Yun; Mukhopadhyay, Supratik; Brylinski, Michal

    2016-01-01

    Due to exorbitant costs of high-throughput screening, many drug discovery projects commonly employ inexpensive virtual screening to support experimental efforts. However, the vast majority of compounds in widely used screening libraries, such as the ZINC database, will have a very low probability to exhibit the desired bioactivity for a given protein. Although combinatorial chemistry methods can be used to augment existing compound libraries with novel drug-like compounds, the broad chemical space is often too large to be explored. Consequently, the trend in library design has shifted to produce screening collections specifically tailored to modulate the function of a particular target or a protein family. Assuming that organic compounds are composed of sets of rigid fragments connected by flexible linkers, a molecule can be decomposed into its building blocks tracking their atomic connectivity. On this account, we developed eSynth, an exhaustive graph-based search algorithm to computationally synthesize new compounds by reconnecting these building blocks following their connectivity patterns. We conducted a series of benchmarking calculations against the Directory of Useful Decoys, Enhanced database. First, in a self-benchmarking test, the correctness of the algorithm is validated with the objective to recover a molecule from its building blocks. Encouragingly, eSynth can efficiently rebuild more than 80 % of active molecules from their fragment components. Next, the capability to discover novel scaffolds is assessed in a cross-benchmarking test, where eSynth successfully reconstructed 40 % of the target molecules using fragments extracted from chemically distinct compounds. Despite an enormous chemical space to be explored, eSynth is computationally efficient; half of the molecules are rebuilt in less than a second, whereas 90 % take only about a minute to be generated. 
eSynth can successfully reconstruct chemically feasible molecules from molecular fragments. Furthermore, in a procedure mimicking the real application, where one expects to discover novel compounds based on a small set of already developed bioactives, eSynth is capable of generating diverse collections of molecules with the desired activity profiles. Thus, we are very optimistic that our effort will contribute to targeted drug discovery. eSynth is freely available to the academic community at www.brylinski.org/content/molecular-synthesis. Graphical abstract: Assuming that organic compounds are composed of sets of rigid fragments connected by flexible linkers, a molecule can be decomposed into its building blocks tracking their atomic connectivity. Here, we developed eSynth, an automated method to synthesize new compounds by reconnecting these building blocks following the connectivity patterns via an exhaustive graph-based search algorithm. eSynth opens up a possibility to rapidly construct virtual screening libraries for targeted drug discovery.

  15. Molecular docking and 3D-QSAR studies on inhibitors of DNA damage signaling enzyme human PARP-1.

    PubMed

    Fatima, Sabiha; Bathini, Raju; Sivan, Sree Kanth; Manga, Vijjulatha

    2012-08-01

    Poly (ADP-ribose) polymerase-1 (PARP-1) operates in a DNA damage signaling network. Molecular docking and three dimensional-quantitative structure activity relationship (3D-QSAR) studies were performed on human PARP-1 inhibitors. The docked conformation obtained for each molecule was used as such for 3D-QSAR analysis. Molecules were divided randomly into a training set and a test set in four different ways, and partial least squares analysis was performed to obtain QSAR models using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA). The derived models showed good statistical reliability, as is evident from their r², q²(loo) and r²(pred) values. To obtain a consensus for predictive ability from all the models, the average regression coefficient r²(avg) was calculated. CoMFA and CoMSIA models showed a value of 0.930 and 0.936, respectively. Information obtained from the best 3D-QSAR model was applied for optimization of the lead molecule and design of novel potential inhibitors.
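
    The abstract does not define r²(pred). A minimal sketch under the definition conventionally used in CoMFA/CoMSIA work is given here: r²(pred) = 1 - PRESS/SD, where PRESS is the sum of squared prediction errors on the test set and SD is the sum of squared deviations of the test-set activities from the mean activity of the training set. The function name and the activity values are hypothetical.

```python
def r2_pred(train_activities, test_observed, test_predicted):
    """Predictive r^2 of a QSAR model on an external test set."""
    train_mean = sum(train_activities) / len(train_activities)
    # PRESS: squared errors between observed and predicted test activities.
    press = sum((o - p) ** 2 for o, p in zip(test_observed, test_predicted))
    # SD: spread of the test activities around the training-set mean.
    sd = sum((o - train_mean) ** 2 for o in test_observed)
    return 1.0 - press / sd


# Hypothetical activities (e.g. pIC50 values), for illustration only.
value = r2_pred([5.0, 6.0, 7.0], [5.5, 7.5], [5.7, 7.2])
print(round(value, 3))
```

    Because the reference mean comes from the training set, r²(pred) can be negative when the model predicts the test set worse than simply guessing the training mean.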

  16. Performance and Self-Consistency of the Generalized Dielectric Dependent Hybrid Functional

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brawand, Nicholas P.; Govoni, Marco; Vörös, Márton

    Here, we analyze the performance of the recently proposed screened exchange constant functional (SX) on the GW100 test set, and we discuss results obtained at different levels of self-consistency. The SX functional is a generalization of dielectric dependent hybrid functionals to finite systems; it is nonempirical and depends on the average screening of the exchange interaction. We compare results for ionization potentials obtained with SX to those of CCSD(T) calculations and experiments, and we find excellent agreement, on par with recent state of the art methods based on many body perturbation theory. Applying SX perturbatively to correct PBE eigenvalues yields improved results in most cases, except for ionic molecules, for which wave function self-consistency is instead crucial. Calculations where wave functions and the screened exchange constant (αSX) are determined self-consistently, and those where αSX is fixed to the value determined within PBE, yield results of comparable accuracy. Perturbative G0W0 corrections of eigenvalues obtained with self-consistent αSX are small on average, for all molecules in the GW100 test set.

  17. Performance and Self-Consistency of the Generalized Dielectric Dependent Hybrid Functional

    DOE PAGES

    Brawand, Nicholas P.; Govoni, Marco; Vörös, Márton; ...

    2017-05-24

    Here, we analyze the performance of the recently proposed screened exchange constant functional (SX) on the GW100 test set, and we discuss results obtained at different levels of self-consistency. The SX functional is a generalization of dielectric dependent hybrid functionals to finite systems; it is nonempirical and depends on the average screening of the exchange interaction. We compare results for ionization potentials obtained with SX to those of CCSD(T) calculations and experiments, and we find excellent agreement, on par with recent state of the art methods based on many body perturbation theory. Applying SX perturbatively to correct PBE eigenvalues yields improved results in most cases, except for ionic molecules, for which wave function self-consistency is instead crucial. Calculations where wave functions and the screened exchange constant (αSX) are determined self-consistently, and those where αSX is fixed to the value determined within PBE, yield results of comparable accuracy. Perturbative G0W0 corrections of eigenvalues obtained with self-consistent αSX are small on average, for all molecules in the GW100 test set.

  18. Adversarial Threshold Neural Computer for Molecular de Novo Design.

    PubMed

    Putin, Evgeny; Asadulaev, Arip; Vanhaelen, Quentin; Ivanenkov, Yan; Aladinskaya, Anastasia V; Aliper, Alex; Zhavoronkov, Alex

    2018-03-30

    In this article, we propose the deep neural network Adversarial Threshold Neural Computer (ATNC). The ATNC model is intended for the de novo design of novel small-molecule organic structures. The model is based on generative adversarial network architecture and reinforcement learning. ATNC uses a Differentiable Neural Computer as a generator and has a new specific block, called adversarial threshold (AT). AT acts as a filter between the agent (generator) and the environment (discriminator + objective reward functions). Furthermore, to generate more diverse molecules we introduce a new objective reward function named Internal Diversity Clustering (IDC). In this work, ATNC is tested and compared with the ORGANIC model. Both models were trained on the SMILES string representation of the molecules, using four objective functions (internal similarity, Muegge druglikeness filter, presence or absence of sp3-rich fragments, and IDC). The SMILES representations of 15K druglike molecules from the ChemDiv collection were used as a training data set. For the different functions, ATNC outperforms ORGANIC. Combined with the IDC, ATNC generates 72% of valid and 77% of unique SMILES strings, while ORGANIC generates only 7% of valid and 86% of unique SMILES strings. For each set of molecules generated by ATNC and ORGANIC, we analyzed distributions of four molecular descriptors (number of atoms, molecular weight, logP, and TPSA) and calculated five chemical statistical features (internal diversity, number of unique heterocycles, number of clusters, number of singletons, and number of compounds that have not been passed through medicinal chemistry filters). Analysis of key molecular descriptors and chemical statistical features demonstrated that the molecules generated by ATNC elicited better druglikeness properties. We also performed in vitro validation of the molecules generated by ATNC; results indicated that ATNC is an effective method for producing hit compounds.

  19. ChemBank: a small-molecule screening and cheminformatics resource database.

    PubMed

    Seiler, Kathleen Petri; George, Gregory A; Happ, Mary Pat; Bodycombe, Nicole E; Carrinski, Hyman A; Norton, Stephanie; Brudz, Steve; Sullivan, John P; Muhlich, Jeremy; Serrano, Martin; Ferraiolo, Paul; Tolliday, Nicola J; Schreiber, Stuart L; Clemons, Paul A

    2008-01-01

    ChemBank (http://chembank.broad.harvard.edu/) is a public, web-based informatics environment developed through a collaboration between the Chemical Biology Program and Platform at the Broad Institute of Harvard and MIT. This knowledge environment includes freely available data derived from small molecules and small-molecule screens and resources for studying these data. ChemBank is unique among small-molecule databases in its dedication to the storage of raw screening data, its rigorous definition of screening experiments in terms of statistical hypothesis testing, and its metadata-based organization of screening experiments into projects involving collections of related assays. ChemBank stores an increasingly varied set of measurements derived from cells and other biological assay systems treated with small molecules. Analysis tools are available and are continuously being developed that allow the relationships between small molecules, cell measurements, and cell states to be studied. Currently, ChemBank stores information on hundreds of thousands of small molecules and hundreds of biomedically relevant assays that have been performed at the Broad Institute by collaborators from the worldwide research community. The goal of ChemBank is to provide life scientists unfettered access to biomedically relevant data and tools heretofore available primarily in the private sector.

  20. Electronegativity and redox reactions.

    PubMed

    Miranda-Quintana, Ramón Alain; Martínez González, Marco; Ayers, Paul W

    2016-08-10

    Using the maximum hardness principle, we show that the oxidation potential of a molecule increases as its electronegativity increases and also increases as its electronegativity in its oxidized state increases. This insight can be used to construct a linear free energy relation for the oxidation potential, which we train on a set of 31 organic redox couples and test on a set of 10 different redox reactions. Better results are obtained when the electronegativity of the oxidized/reduced reagents are adjusted to account for the reagents' interaction with their chemical environment.

  1. Prediction of Mass Spectral Response Factors from Predicted Chemometric Data for Druglike Molecules

    NASA Astrophysics Data System (ADS)

    Cramer, Christopher J.; Johnson, Joshua L.; Kamel, Amin M.

    2017-02-01

    A method is developed for the prediction of mass spectral ion counts of drug-like molecules using in silico calculated chemometric data. Various chemometric data, including polar and molecular surface areas, aqueous solvation free energies, and gas-phase and aqueous proton affinities were computed, and a statistically significant relationship between measured mass spectral ion counts and the combination of aqueous proton affinity and total molecular surface area was identified. In particular, through multilinear regression of ion counts on predicted chemometric data, we find that log10(MS ion counts) = -4.824 + c1·PA + c2·SA, where PA is the aqueous proton affinity of the molecule computed at the SMD(aq)/M06-L/MIDI!//M06-L/MIDI! level of electronic structure theory, SA is the total surface area of the molecule in its conjugate base form, and c1 and c2 have values of -3.912 × 10^-2 mol kcal^-1 and 3.682 × 10^-3 Å^-2. On a 66-molecule training set, this regression exhibits a multiple R value of 0.791 with p values for the intercept, c1, and c2 of 1.4 × 10^-3, 4.3 × 10^-10, and 2.5 × 10^-6, respectively. Application of this regression to an 11-molecule test set provides a good correlation of prediction with experiment (R = 0.905) albeit with a systematic underestimation of about 0.2 log units. This method may prove useful for semiquantitative analysis of drug metabolites for which MS response factors or authentic standards are not readily available.
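
    The regression quoted in the abstract can be written out directly as a small function. The coefficients are the ones given above; the input values in the example are hypothetical, and the negative sign of the example proton affinity is an assumption about the sign convention, which the abstract does not state.

```python
# Coefficients as quoted in the abstract.
INTERCEPT = -4.824
C1 = -3.912e-2  # mol/kcal, multiplies the aqueous proton affinity PA
C2 = 3.682e-3   # 1/Angstrom^2, multiplies the total surface area SA


def predicted_log10_ion_counts(pa_kcal_per_mol, sa_angstrom2):
    """log10(MS ion counts) from PA (kcal/mol) and SA (Angstrom^2)."""
    return INTERCEPT + C1 * pa_kcal_per_mol + C2 * sa_angstrom2


# Hypothetical molecule: PA = -280 kcal/mol (assumed sign), SA = 300 Angstrom^2.
example = predicted_log10_ion_counts(-280.0, 300.0)
print(round(example, 4))
```

    The sketch applies no correction for the systematic underestimation of about 0.2 log units reported for the test set.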

  2. A Minimal Optical Trapping and Imaging Microscopy System

    PubMed Central

    Hernández Candia, Carmen Noemí; Tafoya Martínez, Sara; Gutiérrez-Medina, Braulio

    2013-01-01

    We report the construction and testing of a simple and versatile optical trapping apparatus, suitable for visualizing individual microtubules (∼25 nm in diameter) and performing single-molecule studies, using a minimal set of components. This design is based on a conventional, inverted microscope, operating under plain bright field illumination. A single laser beam enables standard optical trapping and the measurement of molecular displacements and forces, whereas digital image processing affords real-time sample visualization with reduced noise and enhanced contrast. We have tested our trapping and imaging instrument by measuring the persistence length of individual double-stranded DNA molecules, and by following the stepping of single kinesin motor proteins along clearly imaged microtubules. The approach presented here provides a straightforward alternative for studies of biomaterials and individual biomolecules. PMID:23451216

  3. Multiplexed quantification of nucleic acids with large dynamic range using multivolume digital RT-PCR on a rotational SlipChip tested with HIV and hepatitis C viral load.

    PubMed

    Shen, Feng; Sun, Bing; Kreutz, Jason E; Davydova, Elena K; Du, Wenbin; Reddy, Poluru L; Joseph, Loren J; Ismagilov, Rustem F

    2011-11-09

    In this paper, we are working toward a problem of great importance to global health: determination of viral HIV and hepatitis C (HCV) loads under point-of-care and resource limited settings. While antiretroviral treatments are becoming widely available, viral load must be evaluated at regular intervals to prevent the spread of drug resistance and requires a quantitative measurement of RNA concentration over a wide dynamic range (from 50 up to 10^6 molecules/mL for HIV and up to 10^8 molecules/mL for HCV). "Digital" single molecule measurements are attractive for quantification, but the dynamic range of such systems is typically limited or requires excessive numbers of compartments. Here we designed and tested two microfluidic rotational SlipChips to perform multivolume digital RT-PCR (MV digital RT-PCR) experiments with large and tunable dynamic range. These designs were characterized using synthetic control RNA and validated with HIV viral RNA and HCV control viral RNA. The first design contained 160 wells of each of four volumes (125 nL, 25 nL, 5 nL, and 1 nL) to achieve a dynamic range of 5.2 × 10^2 to 4.0 × 10^6 molecules/mL at 3-fold resolution. The second design tested the flexibility of this approach, and further expanded it to allow for multiplexing while maintaining a large dynamic range by adding additional wells with volumes of 0.2 nL and 625 nL and dividing the SlipChip into five regions to analyze five samples each at a dynamic range of 1.8 × 10^3 to 1.2 × 10^7 molecules/mL at 3-fold resolution. No evidence of cross-contamination was observed. The multiplexed SlipChip can be used to analyze a single sample at a dynamic range of 1.7 × 10^2 to 2.0 × 10^7 molecules/mL at 3-fold resolution with limit of detection of 40 molecules/mL. 
HIV viral RNA purified from clinical samples was tested on the SlipChip, and viral load results were self-consistent and in good agreement with results determined using the Roche COBAS AmpliPrep/COBAS TaqMan HIV-1 Test. With further validation, this SlipChip should become useful to precisely quantify viral HIV and HCV RNA for high-performance diagnostics in resource-limited settings. These microfluidic designs should also be valuable for other diagnostic and research applications, including detecting rare cells and rare mutations, prenatal diagnostics, monitoring residual disease, and quantifying copy number variation and gene expression patterns. The theory for the design and analysis of multivolume digital PCR experiments is presented in other work by Kreutz et al.
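
    The abstract defers the analysis theory to Kreutz et al. As a rough illustration only, the sketch below uses the standard Poisson model for digital PCR (a well of volume v is positive with probability 1 - exp(-λv)) and finds the maximum-likelihood concentration λ across several well volumes by bisection. It is not claimed to reproduce the authors' procedure; the function name, the search bounds and the synthetic well counts are assumptions.

```python
import math


def estimate_concentration(wells, lo=1e-9, hi=1e3):
    """Maximum-likelihood concentration in molecules/nL.

    wells: iterable of (volume_nL, n_wells, n_positive) tuples.
    lo, hi: assumed search bounds in molecules/nL.
    """
    wells = list(wells)
    if all(b == 0 for _, _, b in wells) or all(b == n for _, n, b in wells):
        raise ValueError("no finite estimate when all wells give the same result")

    def score(lam):
        # Derivative of the Poisson log-likelihood with respect to lam.
        total = 0.0
        for v, n, b in wells:
            p_pos = -math.expm1(-lam * v)  # probability that a well is positive
            total += b * v * (1.0 - p_pos) / p_pos - v * (n - b)
        return total

    # The score decreases monotonically in lam, so bisect on a log scale.
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Synthetic check: feed in the expected positive counts for a known
# concentration, using the first design's layout (160 wells per volume).
true_lam = 0.01  # molecules/nL, i.e. 1e4 molecules/mL
wells = [(v, 160, 160 * -math.expm1(-true_lam * v)) for v in (125, 25, 5, 1)]
estimate = estimate_concentration(wells)
print(estimate * 1e6)  # converted to molecules/mL
```

    Combining volumes this way lets the large wells carry the information at low concentrations, where small wells are almost all negative, and the small wells carry it at high concentrations, where large wells saturate.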

  4. Cyndi: a multi-objective evolution algorithm based method for bioactive molecular conformational generation.

    PubMed

    Liu, Xiaofeng; Bai, Fang; Ouyang, Sisheng; Wang, Xicheng; Li, Honglin; Jiang, Hualiang

    2009-03-31

    Conformation generation is a ubiquitous problem in molecular modelling. Many applications require sampling the broad molecular conformational space or perceiving the bioactive conformers to ensure success. Numerous in silico methods have been proposed in an attempt to resolve the problem, ranging from deterministic to non-deterministic and systemic to stochastic ones. In this work, we describe an efficient conformation sampling method named Cyndi, which is based on a multi-objective evolution algorithm (MOEA). The conformational perturbation is subjected to evolutionary operation on the genome encoded with dihedral torsions. Various objectives are designated to render the generated Pareto optimal conformers energy-favoured as well as evenly scattered across the conformational space. An optional objective concerning the degree of molecular extension is added to achieve geometrically extended or compact conformations, which have been observed to impact the molecular bioactivity (J Comput-Aided Mol Des 2002, 16: 105-112). Testing the performance of Cyndi against a test set consisting of 329 small molecules reveals an average minimum RMSD of 0.864 Å to the corresponding bioactive conformations, indicating that Cyndi is highly competitive against other conformation generation methods. Meanwhile, the high-speed performance (0.49 +/- 0.18 seconds per molecule) renders Cyndi a practical toolkit for conformational database preparation and facilitates subsequent pharmacophore mapping or rigid docking. A copy of the precompiled executable of Cyndi and the test set molecules in mol2 format are accessible in Additional file 1. On the basis of the MOEA, we present a new, highly efficient conformation generation method, Cyndi, and report the results of validation and performance studies comparing it with four other methods. The results reveal that Cyndi is capable of generating geometrically diverse conformers and outperforms four other multiple conformer generators in reproducing the bioactive conformations of 329 structures. The speed advantage indicates Cyndi is a powerful alternative method for extensive conformational sampling and large-scale conformer database preparation.
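
    The idea of Pareto optimal conformers in the abstract above can be illustrated with a minimal sketch. It is not Cyndi's implementation: the two objectives (an energy and a crowding score, both to be minimised) and the function names are assumptions made only for this example.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in one.

    All objectives are assumed to be minimised.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical conformers scored as (energy, crowding); lower is better for both.
conformers = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
front = pareto_front(conformers)
```

    Here (3.0, 4.0) is dropped because (2.0, 3.0) is better in both objectives; the remaining three points trade one objective against the other.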

  5. Oligo-branched peptides for tumor targeting: from magic bullets to magic forks.

    PubMed

    Falciani, Chiara; Pini, Alessandro; Bracci, Luisa

    2009-02-01

    Selective targeting of tumor cells is the final goal of research and drug discovery for cancer diagnosis, imaging and therapy. After the invention of hybridoma technology, the concept of magic bullet was introduced into the field of oncology, referring to selective killing of tumor cells, by specific antibodies. More recently, small molecules and peptides have also been proposed as selective targeting agents. We analyze the state of the art of tumor-selective agents that are presently available and tested in clinical settings. A novel approach based on 'armed' oligo-branched peptides as tumor targeting agents, is discussed and compared with existing tumor-selective therapies mediated by antibodies, small molecules or monomeric peptides. Oligo-branched peptides could be novel drugs that combine the advantages of antibodies and small molecules.

  6. Tunable diblock copolypeptide hydrogel depots for local delivery of hydrophobic molecules in healthy and injured central nervous system

    PubMed Central

    Zhang, Shanshan; Anderson, Mark A.; Ao, Yan; Khakh, Baljit S.; Fan, Jessica; Deming, Timothy J.; Sofroniew, Michael V.

    2014-01-01

    Many hydrophobic small molecules are available to regulate gene expression and other cellular functions. Locally restricted application of such molecules in the central nervous system (CNS) would be desirable in many experimental and therapeutic settings, but is limited by a lack of innocuous vehicles able to load and easily deliver hydrophobic cargo. Here, we tested the potential for diblock copolypeptide hydrogels (DCH) to serve as such vehicles. In vitro tests on loading and release were conducted with cholesterol and the anti-cancer agent, temozolomide (TMZ). Loading of hydrophobic cargo modified DCH physical properties such as stiffness and viscosity, but these could readily be tuned to desired ranges by modifying DCH concentration, amino acid composition or chain lengths. Different DCH formulations exhibited different loading capacities and different rates of release. For example, comparison of different DCH with increasing alanine contents showed corresponding increases in both cargo loading capacity and time for cargo release. In vivo tests were conducted with tamoxifen, a small synthetic hydrophobic molecule widely used to regulate transgene expression. Tamoxifen released from DCH depots injected into healthy or injured CNS efficiently activated reporter gene expression in a locally restricted manner in transgenic mice. These findings demonstrate the facile and predictable tunability of DCH to achieve a wide range of loading capacities and release profiles of hydrophobic cargos while retaining CNS compatible physical properties. In addition, the findings show that DCH depots injected into the CNS can efficiently deliver small hydrophobic molecules that regulate gene expression in local cells. PMID:24314556

  7. Predicting the Metabolic Sites by Flavin-Containing Monooxygenase on Drug Molecules Using SVM Classification on Computed Quantum Mechanics and Circular Fingerprints Molecular Descriptors

    PubMed Central

    Fu, Chien-wei; Lin, Thy-Hou

    2017-01-01

    As an important enzyme in Phase I drug metabolism, the flavin-containing monooxygenase (FMO) also metabolizes some xenobiotics with soft nucleophiles. The site of metabolism (SOM) on a molecule is the site where the metabolic reaction is exerted by an enzyme. Accurate prediction of SOMs on drug molecules will assist the search for drug leads during the optimization process. Here, some quantum mechanics features such as the condensed Fukui function and attributes from circular fingerprints (called Molprint2D) are computed and classified using the support vector machine (SVM) for predicting some potential SOMs on a series of drugs that can be metabolized by FMO enzymes. The condensed Fukui function fA− represents the nucleophilicity of central atom A, and the attributes from circular fingerprints account for the influence of neighbors on the central atom. The total number of FMO substrates and non-substrates collected in the study is 85 and they are equally divided into the training and test sets with each carrying roughly the same number of potential SOMs. However, only N-oxidation and S-oxidation features were considered in the prediction since the available C-oxidation data was scarce. In the training process, the LibSVM package of WEKA and the option of 10-fold cross validation are employed. The prediction performance on the test set, evaluated by accuracy, Matthews correlation coefficient and area under the ROC curve, is 0.829, 0.659, and 0.877 respectively. This work reveals that the SVM model built can accurately predict the potential SOMs for drug molecules that are metabolizable by the FMO enzymes. PMID:28072829
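
    Two of the test-set metrics quoted above, accuracy and the Matthews correlation coefficient (MCC), follow directly from a binary confusion matrix. The sketch below uses the standard textbook formulas; the counts are invented for illustration and are not taken from the paper.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from confusion-matrix counts; returns 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 3 true positives, 3 true negatives, 1 false positive, 1 false negative.
acc = accuracy(3, 3, 1, 1)
mcc = matthews_corrcoef(3, 3, 1, 1)
```

    MCC uses all four cells of the confusion matrix, so it stays informative when sites of metabolism are much rarer than non-sites, a situation in which accuracy alone can look deceptively high.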

  8. Defining Desirable Central Nervous System Drug Space through the Alignment of Molecular Properties, in Vitro ADME, and Safety Attributes

    PubMed Central

    2010-01-01

    As part of our effort to increase survival of drug candidates and to move our medicinal chemistry design to higher probability space for success in the Neuroscience therapeutic area, we embarked on a detailed study of the property space for a collection of central nervous system (CNS) molecules. We carried out a thorough analysis of properties for 119 marketed CNS drugs and a set of 108 Pfizer CNS candidates. In particular, we focused on understanding the relationships between physicochemical properties, in vitro ADME (absorption, distribution, metabolism, and elimination) attributes, primary pharmacology binding efficiencies, and in vitro safety data for these two sets of compounds. This scholarship provides guidance for the design of CNS molecules in a property space with increased probability of success and may lead to the identification of druglike candidates with favorable safety profiles that can successfully test hypotheses in the clinic. PMID:22778836

  9. rPM6 parameters for phosphorous and sulphur-containing open-shell molecules

    NASA Astrophysics Data System (ADS)

    Saito, Toru; Takano, Yu

    2018-03-01

    In this article, we have introduced a reparameterisation of PM6 (rPM6) for phosphorus and sulphur to achieve a better description of open-shell species containing the two elements. Two sets of the parameters have been optimised separately using our training sets. The performance of the spin-unrestricted rPM6 (UrPM6) method with the optimised parameters is evaluated against 14 radical species, which contain either phosphorus or sulphur atom, comparing with the original UPM6 and the spin-unrestricted density functional theory (UDFT) methods. The standard UPM6 calculations fail to describe the adiabatic singlet-triplet energy gaps correctly, and may cause significant structural mismatches with UDFT-optimised geometries. Leaving aside three difficult cases, tests on 11 open-shell molecules strongly indicate the superior performance of UrPM6, which provides much better agreement with the results of UDFT methods for geometric and electronic properties.

  10. A new parallel algorithm of MP2 energy calculations.

    PubMed

    Ishimura, Kazuya; Pulay, Peter; Nagase, Shigeru

    2006-03-01

    A new parallel algorithm has been developed for second-order Møller-Plesset perturbation theory (MP2) energy calculations. Its main projected applications are for large molecules, for instance, for the calculation of dispersion interaction. Tests on a moderate number of processors (2-16) show that the program has high CPU and parallel efficiency. Timings are presented for two relatively large molecules, taxol (C47H51NO14) and luciferin (C11H8N2O3S2), the former with the 6-31G* and 6-311G** basis sets (1,032 and 1,484 basis functions, 164 correlated orbitals), and the latter with the aug-cc-pVDZ and aug-cc-pVTZ basis sets (530 and 1,198 basis functions, 46 correlated orbitals). An MP2 energy calculation on C130H10 (1,970 basis functions, 265 correlated orbitals) completed in less than 2 h on 128 processors.

  11. Communication: Multiple-property-based diabatization for open-shell van der Waals molecules

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Karman, Tijs; Avoird, Ad van der; Groenenboom, Gerrit C., E-mail: gerritg@theochem.ru.nl

    2016-03-28

    We derive a new multiple-property-based diabatization algorithm. The transformation between adiabatic and diabatic representations is determined by requiring a set of properties in both representations to be related by a similarity transformation. This set of properties is determined in the adiabatic representation by rigorous electronic structure calculations. In the diabatic representation, the same properties are determined using model diabatic states defined as products of undistorted monomer wave functions. This diabatic model is generally applicable to van der Waals molecules in arbitrary electronic states. Application to locating seams of conical intersections and collisional transfer of electronic excitation energy is demonstrated for O2−O2 in low-lying excited states. Property-based diabatization for this test system included all components of the electric quadrupole tensor, orbital angular momentum, and spin-orbit coupling.

  12. Detection of Interstellar Urea with Carma

    NASA Astrophysics Data System (ADS)

    Kuo, H.-L.; Snyder, L. E.; Friedel, D. N.; Looney, L. W.; McCall, B. J.; Remijan, A. J.; Lovas, F. J.; Hollis, J. M.

    2010-06-01

    Urea, a molecule discovered in human urine by H. M. Rouelle in 1773, has a significant role in prebiotic chemistry. Previous BIMA observations have suggested that interstellar urea [(NH_2)_2CO] is a compact hot core molecule such as other large molecules, e.g. methyl formate and acetic acid (2009, 64th OSU Symposium On Molecular Spectroscopy, WI05). We have conducted an extensive search for urea toward the high mass hot molecular core Sgr B2(N-LMH) using CARMA and the IRAM 30 m. Because the spectral lines of heavy molecules like urea tend to be weak and hot cores display lines from a wide range of molecules, a major problem in identifying urea lines is confusion with lines of other molecules. Therefore, it is necessary to detect a number of urea lines and apply sophisticated statistical tests before having confidence in an identification. The 1 mm resolution of CARMA enables favorable coupling of the source size and synthesized beam size, which was found to be essential for the detection of weak signals. The 2.5″ × 2″ synthesized beam of CARMA significantly resolves out the contamination by extended emission and reveals the eight weak urea lines that were previously blended with nearby transitions. Our analysis indicates that these lines are likely to be urea since the resulting observed line frequencies are coincident with a set of overlapping connecting urea lines, and the observed line intensities are consistent with the expected line strengths of urea. In addition, we have developed a new statistical approach to examine the spatial correlation between the observed lines by applying the Student T-test to the high resolution channel maps obtained from CARMA. The T-test shows similar spatial distributions from all eight candidate lines, suggesting a common molecular origin, urea. Our T-test method could have a broad impact on the next generation of arrays, such as ALMA, because the new arrays will require a method to systematically determine the credibility of detections of weaker signals from new and larger interstellar molecules.

  13. High precision optical spectroscopy and quantum state selected photodissociation of ultracold 88Sr2 molecules in an optical lattice

    NASA Astrophysics Data System (ADS)

    McDonald, Mickey

    2017-04-01

    Over the past several decades, rapid progress has been made toward the accurate characterization and control of atoms, epitomized by the ever-increasing accuracy and precision of optical atomic lattice clocks. Extending this progress to molecules will have exciting implications for chemistry, condensed matter physics, and precision tests of physics beyond the Standard Model. My thesis describes work performed over the past six years to establish the state of the art in manipulation and quantum control of ultracold molecules. We describe a thorough set of measurements characterizing the rovibrational structure of weakly bound 88Sr2 molecules from several different perspectives, including determinations of binding energies; linear, quadratic, and higher order Zeeman shifts; transition strengths between bound states; and lifetimes of narrow subradiant states. Finally, we discuss measurements of photofragment angular distributions produced by photodissociation of molecules in single quantum states, leading to an exploration of quantum-state-resolved ultracold chemistry. The images of exploding photofragments produced in these studies exhibit dramatic interference effects and strongly violate semiclassical predictions, instead requiring a fully quantum mechanical description.

  14. Multiple search methods for similarity-based virtual screening: analysis of search overlap and precision

    PubMed Central

    2011-01-01

    Background Data fusion methods are widely used in virtual screening, and make the implicit assumption that the more often a molecule is retrieved in multiple similarity searches, the more likely it is to be active. This paper tests the correctness of this assumption. Results Sets of 25 searches using either the same reference structure and 25 different similarity measures (similarity fusion) or 25 different reference structures and the same similarity measure (group fusion) show that large numbers of unique molecules are retrieved by just a single search, but that the numbers of unique molecules decrease very rapidly as more searches are considered. This rapid decrease is accompanied by a rapid increase in the fraction of those retrieved molecules that are active. There is an approximately log-log relationship between the numbers of different molecules retrieved and the number of searches carried out, and a rationale for this power-law behaviour is provided. Conclusions Using multiple searches provides a simple way of increasing the precision of a similarity search, and thus provides a justification for the use of data fusion methods in virtual screening. PMID:21824430
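
    The assumption tested in the abstract above (molecules retrieved by more searches are more likely to be active) can be explored with a small counting sketch. The molecule identifiers, the active set, and the "retrieved by at least k searches" threshold are all assumptions made for illustration; the paper's own protocol may differ.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def retrieval_counts(search_results: List[Iterable[str]]) -> Counter:
    """Count in how many searches each molecule appears (once per search)."""
    counts: Counter = Counter()
    for hits in search_results:
        counts.update(set(hits))
    return counts


def precision_by_min_count(counts: Counter, actives: Set[str], n_searches: int) -> Dict[int, float]:
    """For each k, the fraction of actives among molecules retrieved by >= k searches."""
    result: Dict[int, float] = {}
    for k in range(1, n_searches + 1):
        retrieved = [mol for mol, c in counts.items() if c >= k]
        if retrieved:
            result[k] = sum(mol in actives for mol in retrieved) / len(retrieved)
    return result


# Three hypothetical similarity searches and a hypothetical set of known actives.
searches = [["a", "b", "c"], ["a", "b", "d"], ["a", "e"]]
counts = retrieval_counts(searches)
precision = precision_by_min_count(counts, {"a", "b"}, len(searches))
```

    In this toy data the set of retrieved molecules shrinks from five to two to one as k grows, while the fraction of actives rises, which is the qualitative pattern the abstract reports.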

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Krasnokutski, Serge A.; Huisken, Friedrich; Jäger, Cornelia

    A very high abundance of atomic carbon in the interstellar medium (ISM), and the high reactivity of these species toward different hydrocarbon molecules including benzene, raise questions regarding the stability of polycyclic aromatic hydrocarbon (PAH) molecules in space. To test the efficiency of destruction of PAH molecules via reactions with atomic carbon, we performed a set of laboratory and computational studies of the reactions of naphthalene, anthracene, and coronene molecules with carbon atoms in the ground state. The reactions were investigated in liquid helium droplets at T = 0.37 K and by quantum chemical computations. Our studies suggest that all small and all large catacondensed PAHs react barrierlessly with atomic carbon, and therefore should be efficiently destroyed by such reactions in a broad temperature range. At the same time, large compact pericondensed PAHs should be more inert toward such a reaction. In addition, taking into account their higher photostability, much higher abundances of pericondensed PAHs should be expected in various astrophysical environments. The barrierless reactions between carbon atoms and small PAHs also suggest that, in the ISM, these reactions could lead to the bottom-up formation of PAH molecules.

  16. Analysis of Different Fragmentation Strategies on a Variety of Large Peptides: Implementation of a Low Level of Theory in Fragment-Based Methods Can Be a Crucial Factor.

    PubMed

    Saha, Arjun; Raghavachari, Krishnan

    2015-05-12

    We have investigated the performance of two classes of fragmentation methods developed in our group (Molecules-in-Molecules (MIM) and Many-Overlapping-Body (MOB) expansion), to reproduce the unfragmented MP2 energies on a test set composed of 10 small to large biomolecules. They have also been assessed to recover the relative energies of different motifs of the acetyl(ala)18NH2 system. Performance of different bond-cutting environments and the use of Hartree-Fock and different density functionals (as a low level of theory) in conjunction with the fragmentation strategies have been analyzed. Our investigation shows that while a low level of theory (for recovering long-range interactions) may not be necessary for small peptides, it provides a very effective strategy to accurately reproduce the total and relative energies of larger peptides such as the different motifs of the acetyl(ala)18NH2 system. Employing M06-2X as the low level of theory, the calculated mean total energy deviation (maximum deviation) in the total MP2 energies for the 10 molecules in the test set at MIM(d=3.5Å), MIM(η=9), and MOB(d=5Å) are 1.16 (2.31), 0.72 (1.87), and 0.43 (2.02) kcal/mol, respectively. The excellent performance suggests that such fragment-based methods should be of general use for the computation of accurate energies of large biomolecular systems.

  17. Electronegativity Equalization Method: Parameterization and Validation for Large Sets of Organic, Organohalogene and Organometal Molecule

    PubMed Central

    Vařeková, Radka Svobodová; Jiroušková, Zuzana; Vaněk, Jakub; Suchomel, Šimon; Koča, Jaroslav

    2007-01-01

    The Electronegativity Equalization Method (EEM) is a fast approach for charge calculation. A challenging part of the EEM is the parameterization, which is performed using ab initio charges obtained for a set of molecules. The goal of our work was to perform the EEM parameterization for selected sets of organic, organohalogen and organometal molecules. We have performed the most robust parameterization published so far. The EEM parameterization was based on 12 training sets selected from a database of predicted 3D structures (NCI DIS) and from a database of crystallographic structures (CSD). Each set contained from 2000 to 6000 molecules. We have shown that the number of molecules in the training set is very important for quality of the parameters. We have improved EEM parameters (STO-3G MPA charges) for elements that were already parameterized, specifically: C, O, N, H, S, F and Cl. The new parameters provide more accurate charges than those published previously. We have also developed new parameters for elements that were not parameterized yet, specifically for Br, I, Fe and Zn. We have also performed crossover validation of all obtained parameters using all training sets that included relevant elements and confirmed that calculated parameters provide accurate charges.

  18. Vibrational multiconfiguration self-consistent field theory: implementation and test calculations.

    PubMed

    Heislbetz, Sandra; Rauhut, Guntram

    2010-03-28

    A state-specific vibrational multiconfiguration self-consistent field (VMCSCF) approach based on a multimode expansion of the potential energy surface is presented for the accurate calculation of anharmonic vibrational spectra. As a special case of this general approach vibrational complete active space self-consistent field calculations will be discussed. The latter method shows better convergence than the general VMCSCF approach and must be considered the preferred choice within the multiconfigurational framework. Benchmark calculations are provided for a small set of test molecules.

  19. Single molecules can operate as primitive biological sensors, switches and oscillators.

    PubMed

    Hernansaiz-Ballesteros, Rosa D; Cardelli, Luca; Csikász-Nagy, Attila

    2018-06-18

    Switch-like and oscillatory dynamical systems are widely observed in biology. We investigate the simplest biological switch that is composed of a single molecule that can be autocatalytically converted between two opposing activity forms. We test how this simple network can keep its switching behaviour under perturbations in the system. We show that this molecule can work as a robust bistable system, even for alterations in the reactions that drive the switching between various conformations. We propose that this single molecule system could work as a primitive biological sensor and show by steady state analysis of a mathematical model of the system that it could switch between possible states for changes in environmental signals. Particularly, we show that a single molecule phosphorylation-dephosphorylation switch could work as a nucleotide or energy sensor. We also notice that a given set of reductions in the reaction network can lead to the emergence of oscillatory behaviour. We propose that evolution could have converted this switch into a single molecule oscillator, which could have been used as a primitive timekeeper. We discuss how the structure of the simplest known circadian clock regulatory system, found in cyanobacteria, resembles the proposed single molecule oscillator. Besides, we speculate if such minimal systems could have existed in an RNA world.

  20. Characterization of Low Pressure Cold Plasma in the Cleaning of Contaminated Surfaces

    NASA Technical Reports Server (NTRS)

    Lanz, Devin Garrett; Hintze, Paul E.

    2016-01-01

    The characterization of low pressure cold plasma is a broad topic which would benefit many different applications involving such plasma. The characterization described in this paper focuses on cold plasma used as a medium in cleaning and disinfection applications. Optical Emission Spectroscopy (OES) and Mass Spectrometry (MS) are the two analytical methods used in this paper to characterize the plasma. OES analyzes molecules in the plasma phase by displaying the light emitted by the plasma molecules on a graph of wavelength vs. intensity. OES was most useful in identifying species which may interact with other molecules in the plasma, such as atomic oxygen or hydroxide radicals. Extracting useful data from the MS is done by filtering out the peaks generated by expected molecules and looking for peaks caused by foreign ones leaving the plasma chamber. This paper describes the efforts at setting up and testing these methods in order to accurately and effectively characterize the plasma.

  1. On the origin independence of the Verdet tensor

    NASA Astrophysics Data System (ADS)

    Caputo, M. C.; Coriani, S.; Pelloni, S.; Lazzeretti, P.

    2013-07-01

    The condition for invariance under a translation of the coordinate system of the Verdet tensor and the Verdet constant, calculated via quantum chemical methods using gaugeless basis sets, is expressed by a vanishing sum rule involving a third-rank polar tensor. The sum rule is, in principle, satisfied only in the ideal case of optimal variational electronic wavefunctions. In general, it is not fulfilled in non-variational calculations and variational calculations allowing for the algebraic approximation, but it can be satisfied for reasons of molecular symmetry. Group-theoretical procedures have been used to determine (i) the total number of non-vanishing components and (ii) the unique components of both the polar tensor appearing in the sum rule and the axial Verdet tensor, for a series of symmetry groups. Test calculations at the random-phase approximation level of accuracy for water, hydrogen peroxide and ammonia molecules, using basis sets of increasing quality, show a smooth convergence to zero of the sum rule. Verdet tensor components calculated for the same molecules converge to limit values, estimated via large basis sets of gaugeless Gaussian functions and London orbitals.

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aldegunde, Manuel, E-mail: M.A.Aldegunde-Rodriguez@warwick.ac.uk; Kermode, James R., E-mail: J.R.Kermode@warwick.ac.uk; Zabaras, Nicholas

    This paper presents the development of a new exchange–correlation functional from the point of view of machine learning. Using atomization energies of solids and small molecules, we train a linear model for the exchange enhancement factor using a Bayesian approach which allows for the quantification of uncertainties in the predictions. A relevance vector machine is used to automatically select the most relevant terms of the model. We then test this model on atomization energies and also on bulk properties. The average model provides a mean absolute error of only 0.116 eV for the test points of the G2/97 set but a larger 0.314 eV for the test solids. In terms of bulk properties, the prediction for transition metals and monovalent semiconductors has a very low test error. However, as expected, predictions for types of materials not represented in the training set such as ionic solids show much larger errors.

  3. Application of an Artificial Neural Network to the Prediction of OH Radical Reaction Rate Constants for Evaluating Global Warming Potential.

    PubMed

    Allison, Thomas C

    2016-03-03

    Rate constants for reactions of chemical compounds with hydroxyl radical are a key quantity used in evaluating the global warming potential of a substance. Experimental determination of these rate constants is essential, but it can also be difficult and time-consuming to produce. High-level quantum chemistry predictions of the rate constant can suffer from the same issues. Therefore, it is valuable to devise estimation schemes that can give reasonable results on a variety of chemical compounds. In this article, the construction and training of an artificial neural network (ANN) for the prediction of rate constants at 298 K for reactions of hydroxyl radical with a diverse set of molecules is described. Input to the ANN consists of counts of the chemical bonds and bends present in the target molecule. The ANN is trained using 792 •OH reaction rate constants taken from the NIST Chemical Kinetics Database. The mean unsigned percent error (MUPE) for the training set is 12%, and the MUPE of the testing set is 51%. It is shown that the present methodology yields rate constants of reasonable accuracy for a diverse set of inputs. The results are compared to high-quality literature values and to another estimation scheme. This ANN methodology is expected to be of use in a wide range of applications for which •OH reaction rate constants are required. The model uses only information that can be gathered from a 2D representation of the molecule, making the present approach particularly appealing, especially for screening applications.
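
    The abstract above reports a mean unsigned percent error (MUPE) without giving its formula. A plausible reading, assumed here, is the mean of |predicted − reference| / |reference| expressed in percent; the rate constants in the example are invented.

```python
from typing import Sequence


def mean_unsigned_percent_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Mean of |predicted - reference| / |reference|, in percent.

    Assumed definition; reference values must be non-zero.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) / abs(r) for p, r in zip(predicted, reference))
    return 100.0 * total / len(reference)


# Hypothetical rate constants in arbitrary but consistent units.
mupe = mean_unsigned_percent_error([110.0, 90.0], [100.0, 100.0])
```

    Because the error is relative, the metric treats a 10% miss on a slow reaction the same as a 10% miss on a fast one, which matters when rate constants span many orders of magnitude.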

  4. Predicting the partitioning of biological compounds between room-temperature ionic liquids and water by means of the solvation-parameter model.

    PubMed

    Padró, Juan M; Ponzinibbio, Agustín; Mesa, Leidy B Agudelo; Reta, Mario

    2011-03-01

    The partition coefficients, P(IL/w), for different probe molecules as well as for compounds of biological interest between the room-temperature ionic liquids (RTILs) 1-butyl-3-methylimidazolium hexafluorophosphate, [BMIM][PF6], 1-hexyl-3-methylimidazolium hexafluorophosphate, [HMIM][PF6], 1-octyl-3-methylimidazolium tetrafluoroborate, [OMIM][BF4] and water were accurately measured. [BMIM][PF6] and [OMIM][BF4] were synthesized by adapting a procedure from the literature to a simpler, faster, single-vessel methodology with a much lower consumption of organic solvent. We employed the solvation-parameter model to elucidate the general chemical interactions involved in RTIL/water partitioning. For this purpose, we selected different solute descriptor parameters that measure polarity, polarizability, hydrogen-bond-donor and hydrogen-bond-acceptor interactions, and cavity formation for a set of specifically selected probe molecules (the training set). The obtained multiparametric equations were used to predict the partition coefficients for compounds not present in the training set (the test set), most being of biological interest. Partial solubility of the ionic liquid in water (and water into the ionic liquid) was taken into account to explain the obtained results. This fact has not been considered in depth to date. Solute descriptors were obtained from the literature, when available, or else calculated through commercial software. An excellent agreement between calculated and experimental log P(IL/w) values was obtained, which demonstrated that the resulting multiparametric equations are robust and allow predicting partitioning for any organic molecule in the biphasic systems studied.

  5. In silico prediction of nematic transition temperature for liquid crystals using quantitative structure-property relationship approaches.

    PubMed

    Fatemi, Mohammad Hossein; Ghorbanzad'e, Mehdi

    2009-11-01

    Quantitative structure-property relationship models for the prediction of the nematic transition temperature (T_N) were developed by using multilinear regression analysis and a feedforward artificial neural network (ANN). A collection of 42 thermotropic liquid crystals was chosen as the data set. The data set was divided into three sets: a training set, an internal test set, and an external test set. Training and internal test sets were used for ANN model development, and the external test set was used for evaluation of the predictive power of the model. In order to build the models, a set of six descriptors was selected by the best multilinear regression procedure of the CODESSA program. These descriptors were: atomic charge weighted partial negatively charged surface area, relative negative charged surface area, polarity parameter/square distance, minimum most negative atomic partial charge, molecular volume, and the A component of moment of inertia, which encode geometrical and electronic characteristics of molecules. These descriptors were used as inputs to ANN. The optimized ANN model had 6:6:1 topology. The standard errors in the calculation of T_N for the training, internal, and external test sets using the ANN model were 1.012, 4.910, and 4.070, respectively. To further evaluate the ANN model, a crossvalidation test was performed, which produced the statistic Q2 = 0.9796 and standard deviation of 2.67 based on predicted residual sum of squares. Also, the diversity test was performed to ensure the model's stability and prove its predictive capability. The obtained results reveal the suitability of ANN for the prediction of T_N for liquid crystals using molecular structural descriptors.

  6. Rational selection of training and test sets for the development of validated QSAR models

    NASA Astrophysics Data System (ADS)

    Golbraikh, Alexander; Shen, Min; Xiao, Zhiyan; Xiao, Yun-De; Lee, Kuo-Hsiung; Tropsha, Alexander

    2003-02-01

    Quantitative Structure-Activity Relationship (QSAR) models are used increasingly to screen chemical databases and/or virtual chemical libraries for potentially bioactive molecules. These developments emphasize the importance of rigorous model validation to ensure that the models have acceptable predictive power. Using k nearest neighbors (kNN) variable selection QSAR method for the analysis of several datasets, we have demonstrated recently that the widely accepted leave-one-out (LOO) cross-validated R2 (q2) is an inadequate characteristic to assess the predictive ability of the models [Golbraikh, A., Tropsha, A. Beware of q2! J. Mol. Graphics Mod. 20, 269-276, (2002)]. Herein, we provide additional evidence that there exists no correlation between the values of q2 for the training set and accuracy of prediction (R2) for the test set and argue that this observation is a general property of any QSAR model developed with LOO cross-validation. We suggest that external validation using rationally selected training and test sets provides a means to establish a reliable QSAR model. We propose several approaches to the division of experimental datasets into training and test sets and apply them in QSAR studies of 48 functionalized amino acid anticonvulsants and a series of 157 epipodophyllotoxin derivatives with antitumor activity. We formulate a set of general criteria for the evaluation of predictive power of QSAR models.
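
    The distinction drawn above between leave-one-out q2 on the training set and R2 on an external test set can be made concrete with a minimal sketch. It assumes a one-descriptor least-squares line rather than the kNN QSAR method of the paper, and it uses one common definition of each statistic (1 − PRESS/SS for q2, 1 − SS_res/SS_tot for the external R2); the authors' exact definitions may differ.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for a single descriptor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs: List[float], ys: List[float]) -> float:
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total on the training set."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / sum((y - mean_y) ** 2 for y in ys)


def external_r2(train_x: List[float], train_y: List[float],
                test_x: List[float], test_y: List[float]) -> float:
    """R2 of test-set predictions made by a model fitted on the training set only."""
    slope, intercept = fit_line(train_x, train_y)
    mean_y = sum(test_y) / len(test_y)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(test_x, test_y))
    return 1.0 - ss_res / sum((y - mean_y) ** 2 for y in test_y)


# Hypothetical, perfectly linear data: both statistics come out at 1.
q2 = loo_q2([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
r2 = external_r2([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [5.0, 6.0], [10.0, 12.0])
```

    The two functions never share data at prediction time: q2 only reuses training compounds, whereas the external R2 scores compounds the model has never seen, which is why the abstract argues the second is the more reliable check.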

  7. Pharmacophore modeling and virtual screening to identify potential RET kinase inhibitors.

    PubMed

    Shih, Kuei-Chung; Shiau, Chung-Wai; Chen, Ting-Shou; Ko, Ching-Huai; Lin, Chih-Lung; Lin, Chun-Yuan; Hwang, Chrong-Shiong; Tang, Chuan-Yi; Chen, Wan-Ru; Huang, Jui-Wen

    2011-08-01

    Chemical-feature-based 3D pharmacophore models for REarranged during Transfection (RET) tyrosine kinase were developed by using a training set of 26 structurally diverse known RET inhibitors. The best pharmacophore hypothesis, which identified inhibitors with an associated correlation coefficient of 0.90 between their experimental and estimated anti-RET values, contained one hydrogen-bond acceptor, one hydrogen-bond donor, one hydrophobic, and one ring aromatic feature. The model was further validated by a testing set, Fischer's randomization test, and the goodness of hit (GH) test. We applied this pharmacophore model to screen the NCI database for potential RET inhibitors. The hits were docked to RET with GOLD and CDOCKER after filtering by Lipinski's rules. Ultimately, 24 molecules were selected as potential RET inhibitors for further investigation. Copyright © 2011 Elsevier Ltd. All rights reserved.
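
    The goodness of hit (GH) test mentioned above is commonly computed with the Güner-Henry formula. The abstract does not say which variant the authors used, so the following is only a sketch of the widely quoted form; the function name, argument names, and counts are hypothetical.

```python
def gh_score(n_database, n_actives, n_hits, n_active_hits):
    """Guner-Henry goodness-of-hit score (commonly quoted form).

    D  = n_database     total compounds in the screened database
    A  = n_actives      known actives in the database
    Ht = n_hits         compounds retrieved by the pharmacophore
    Ha = n_active_hits  known actives among the retrieved compounds

    GH = [Ha * (3A + Ht) / (4 * Ht * A)] * [1 - (Ht - Ha) / (D - A)]
    """
    yield_recall_term = n_active_hits * (3 * n_actives + n_hits) / (4 * n_hits * n_actives)
    false_positive_penalty = 1 - (n_hits - n_active_hits) / (n_database - n_actives)
    return yield_recall_term * false_positive_penalty


# Hypothetical screening counts, not taken from the paper.
print(gh_score(n_database=1000, n_actives=50, n_hits=60, n_active_hits=40))
```

    In this form the score is 1 when the hit list contains all actives and nothing else, and 0 when it contains no actives.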

  8. Improved pKa Prediction of Substituted Alcohols, Phenols, and Hydroperoxides in Aqueous Medium Using Density Functional Theory and a Cluster-Continuum Solvation Model.

    PubMed

    Thapa, Bishnu; Schlegel, H Bernhard

    2017-06-22

    Acid dissociation constants (pKa's) are key physicochemical properties that are needed to understand the structure and reactivity of molecules in solution. Theoretical pKa's have been calculated for a set of 72 organic compounds with -OH and -OOH groups (48 with known experimental pKa's). This test set includes 17 aliphatic alcohols, 25 substituted phenols, and 30 hydroperoxides. Calculations in aqueous medium have been carried out with SMD implicit solvation and three hybrid DFT functionals (B3LYP, ωB97XD, and M06-2X) with two basis sets (6-31+G(d,p) and 6-311++G(d,p)). The effect of explicit water molecules on calculated pKa's was assessed by including up to three water molecules. pKa's calculated with only SMD implicit solvation are found to have average errors greater than 6 pKa units. Including one explicit water reduces the error by about 3 pKa units, but the error is still far from chemical accuracy. With B3LYP/6-311++G(d,p) and three explicit water molecules in SMD solvation, the mean signed error and standard deviation are only -0.02 ± 0.55; a linear fit with zero intercept has a slope of 1.005 and R2 = 0.97. Thus, this level of theory can be used to calculate pKa's directly without the need for linear correlations or thermodynamic cycles. Estimated pKa values are reported for 24 hydroperoxides that have not yet been determined experimentally.
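
    The summary statistics quoted above (mean signed error, standard deviation, and the slope of a zero-intercept linear fit) can be computed as in the following sketch. It assumes that the error is defined as calculated minus experimental pKa, that the standard deviation is the sample standard deviation of those errors, and that calculated values are regressed on experimental ones; the abstract does not spell out these conventions, and the pKa values below are hypothetical.

```python
from statistics import mean, stdev


def pka_error_statistics(experimental, calculated):
    """Return (mean signed error, sample SD of errors, zero-intercept slope)."""
    errors = [c - e for e, c in zip(experimental, calculated)]
    mean_signed_error = mean(errors)
    sd = stdev(errors)
    # Least-squares slope m of calculated = m * experimental (no intercept):
    # m = sum(x*y) / sum(x*x)
    slope = (sum(e * c for e, c in zip(experimental, calculated))
             / sum(e * e for e in experimental))
    return mean_signed_error, sd, slope


# Hypothetical pKa values, not taken from the paper.
exp_pka = [10.0, 12.0, 14.0]
calc_pka = [10.5, 11.5, 14.0]
mse, sd, slope = pka_error_statistics(exp_pka, calc_pka)
print(f"MSE = {mse:+.2f}, SD = {sd:.2f}, slope = {slope:.4f}")
```

    A slope close to 1 with a near-zero mean signed error is what allows the calculated values to be used directly, without an empirical linear correction.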

  9. Combination of large and small basis sets in electronic structure calculations on large systems

    NASA Astrophysics Data System (ADS)

    Røeggen, Inge; Gao, Bin

    2018-04-01

    Two basis sets, a large one and a small one, are associated with each nucleus of the system. Each atom has its own separate one-electron basis comprising the large basis set of the atom in question and the small basis sets for the partner atoms in the complex. The perturbed atoms in molecules and solids model is at the core of the approach since it allows for the definition of perturbed atoms in a system. It is argued that this basis set approach should be particularly useful for periodic systems. Test calculations are performed on one-dimensional arrays of H and Li atoms. The ground-state energy per atom in the linear H array is determined versus bond length.

  10. Neural network consistent empirical physical formula construction for density functional theory based nonlinear vibrational absorbance and intensity of 6-choloronicotinic acid molecule

    NASA Astrophysics Data System (ADS)

    Yildiz, Nihat; Karabacak, Mehmet; Kurt, Mustafa; Akkoyun, Serkan

    2012-05-01

    Being directly related to the electric charge distributions in a molecule, the vibrational spectra intensities are both experimentally and theoretically important physical quantities. However, these intensities are inherently highly nonlinear and of complex pattern. Therefore, in particular for unknown detailed spatial molecular structures, it is difficult to make ab initio intensity calculations to compare with new experimental data. In this respect, we very recently initiated an entirely novel layered feedforward neural network (LFNN) approach to construct empirical physical formulas (EPFs) for density functional theory (DFT) vibrational spectra of some molecules. In this paper, as a new and far improved contribution to our novel molecular vibrational spectra LFNN-EPF approach, we constructed LFNN-EPFs for absorbances and intensities of the 6-chloronicotinic acid (6-CNA) molecule. The 6-CNA data, borrowed from our previous study, were entirely different from and much larger than the vibrational intensity data of our formerly used LFNN-EPF molecules. In line with another previous work of ours, which theoretically proved the LFNN relevance to EPFs, although the 6-CNA DFT absorbance and intensity were inherently highly nonlinear and sharply fluctuating in character, the optimally constructed train set LFNN-EPFs still very successfully fitted the absorbances and intensities. Moreover, test set (i.e. yet-to-be measured experimental data) LFNN-EPFs consistently and successfully predicted the absorbance and intensity data. This simply means that the physical law embedded in the 6-CNA vibrational data was successfully extracted by the LFNN-EPFs. In conclusion, these vibrational LFNN-EPFs are of explicit form. Therefore, by various suitable operations of mathematical analysis, they can be used to estimate the electronic charge distributions of an unknown molecule of significant complexity. Additionally, these estimations can be combined with those of theoretical DFT atomic polar tensor calculations to contribute to the identification of the molecule.

  11. Neural network consistent empirical physical formula construction for density functional theory based nonlinear vibrational absorbance and intensity of 6-choloronicotinic acid molecule.

    PubMed

    Yildiz, Nihat; Karabacak, Mehmet; Kurt, Mustafa; Akkoyun, Serkan

    2012-05-01

    Being directly related to the electric charge distributions in a molecule, the vibrational spectra intensities are both experimentally and theoretically important physical quantities. However, these intensities are inherently highly nonlinear and of complex pattern. Therefore, in particular for unknown detailed spatial molecular structures, it is difficult to make ab initio intensity calculations to compare with new experimental data. In this respect, we very recently initiated an entirely novel layered feedforward neural network (LFNN) approach to construct empirical physical formulas (EPFs) for density functional theory (DFT) vibrational spectra of some molecules. In this paper, as a new and far improved contribution to our novel molecular vibrational spectra LFNN-EPF approach, we constructed LFNN-EPFs for absorbances and intensities of the 6-chloronicotinic acid (6-CNA) molecule. The 6-CNA data, borrowed from our previous study, were entirely different from and much larger than the vibrational intensity data of our formerly used LFNN-EPF molecules. In line with another previous work of ours, which theoretically proved the LFNN relevance to EPFs, although the 6-CNA DFT absorbance and intensity were inherently highly nonlinear and sharply fluctuating in character, the optimally constructed train set LFNN-EPFs still very successfully fitted the absorbances and intensities. Moreover, test set (i.e. yet-to-be measured experimental data) LFNN-EPFs consistently and successfully predicted the absorbance and intensity data. This simply means that the physical law embedded in the 6-CNA vibrational data was successfully extracted by the LFNN-EPFs. In conclusion, these vibrational LFNN-EPFs are of explicit form. Therefore, by various suitable operations of mathematical analysis, they can be used to estimate the electronic charge distributions of an unknown molecule of significant complexity. Additionally, these estimations can be combined with those of theoretical DFT atomic polar tensor calculations to contribute to the identification of the molecule. Copyright © 2012 Elsevier B.V. All rights reserved.

  12. Consistency of QSAR models: Correct split of training and test sets, ranking of models and performance parameters.

    PubMed

    Rácz, A; Bajusz, D; Héberger, K

    2015-01-01

    Recent implementations of QSAR modelling software provide the user with numerous models and a wealth of information. In this work, we provide some guidance on how one should interpret the results of QSAR modelling, compare and assess the resulting models, and select the best and most consistent ones. Two QSAR datasets are applied as case studies for the comparison of model performance parameters and model selection methods. We demonstrate the capabilities of sum of ranking differences (SRD) in model selection and ranking, and identify the best performance indicators and models. While the exchange of the original training and (external) test sets does not affect the ranking of performance parameters, it provides improved models in certain cases (despite the lower number of molecules in the training set). Performance parameters for external validation are substantially separated from the other merits in SRD analyses, highlighting their value in data fusion.
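
    The sum of ranking differences (SRD) used above can be illustrated with a minimal sketch: each column (for example, a model or a performance parameter) ranks the same objects, and its SRD is the sum of absolute differences between its ranks and those of a reference column. The sketch assumes no tied values and uses the row-wise mean as the reference, which is one common choice but is not stated in the abstract; the data and function names are hypothetical, and the validation steps of a full SRD analysis are omitted.

```python
def ranks(values):
    """Rank positions (1 = smallest value); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def srd(column, reference):
    """Sum of absolute rank differences between a column and the reference."""
    return sum(abs(a - b) for a, b in zip(ranks(column), ranks(reference)))


# Hypothetical data: each column holds values for the same five objects.
columns = {
    "model_A": [0.10, 0.30, 0.20, 0.50, 0.40],
    "model_B": [0.15, 0.25, 0.35, 0.45, 0.55],
    "model_C": [0.50, 0.40, 0.30, 0.20, 0.10],
}
n_objects = 5
# Reference: row-wise mean over all columns (an assumption of this sketch).
reference = [sum(col[i] for col in columns.values()) / len(columns)
             for i in range(n_objects)]

for name, col in columns.items():
    print(name, srd(col, reference))
```

    A smaller SRD means that the column orders the objects more like the reference does.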

  13. Stereochemical analysis of (+)-limonene using theoretical and experimental NMR and chiroptical data

    NASA Astrophysics Data System (ADS)

    Reinscheid, F.; Reinscheid, U. M.

    2016-02-01

    Using limonene as a test molecule, the success and the limitations of three chiroptical methods (optical rotatory dispersion (ORD), electronic and vibrational circular dichroism, ECD and VCD) could be demonstrated. At quite low levels of theory (mpw1pw91/cc-pvdz, IEFPCM (integral equation formalism polarizable continuum model)) the experimental ORD values differ by less than 10 units from the calculated values. The modelling in the condensed phase still represents a challenge so that experimental NMR data were used to test for aggregation and solvent-solute interactions. After establishing a reasonable structural model, only the ECD spectra prediction showed a decisive dependence on the basis set: only augmented (in the case of Dunning's basis sets) or diffuse (in the case of Pople's basis sets) basis sets predicted the position and shape of the ECD bands correctly. Based on these results we propose a procedure to assign the absolute configuration (AC) of an unknown compound using the comparison between experimental and calculated chiroptical data.

  14. Robust validation of approximate 1-matrix functionals with few-electron harmonium atoms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cioslowski, Jerzy; Piris, Mario; Matito, Eduard

    2015-12-07

    A simple comparison between the exact and approximate correlation components U of the electron-electron repulsion energy of several states of few-electron harmonium atoms with varying confinement strengths provides a stringent validation tool for 1-matrix functionals. The robustness of this tool is clearly demonstrated in a survey of 14 known functionals, which reveals their substandard performance within different electron correlation regimes. Unlike spot-testing that employs dissociation curves of diatomic molecules or more extensive benchmarking against experimental atomization energies of molecules comprising some standard set, the present approach not only uncovers the flaws and patent failures of the functionals but, even more importantly, also allows for pinpointing their root causes. Since the approximate values of U are computed at exact 1-densities, the testing requires minimal programming and thus is particularly suitable for rapid screening of new functionals.

  15. Setting up a Rayleigh Scattering Based Flow Measuring System in a Large Nozzle Testing Facility

    NASA Technical Reports Server (NTRS)

    Panda, Jayanta; Gomez, Carlos R.

    2002-01-01

    A molecular Rayleigh scattering based air density measurement system has been built in a large nozzle testing facility at NASA Glenn Research Center. The technique depends on the light scattering by gas molecules present in air; no artificial seeding is required. Light from a single mode, continuous wave laser was transmitted to the nozzle facility by optical fiber, and light scattered by gas molecules at various points along the laser beam was collected and measured by photon-counting electronics. By placing the laser beam and collection optics on synchronized traversing units, the point measurement technique is made effective for surveying density variation over a cross-section of the nozzle plume. Various difficulties associated with dust particles, stray light, high noise level and vibration are discussed. Finally, a limited amount of data from an underexpanded jet is presented and compared with expected variations to validate the technique.

  16. Detection of Interstellar Urea

    NASA Astrophysics Data System (ADS)

    Kuo, Hsin-Lun; Remijan, Anthony J.; Snyder, Lewis E.; Looney, Leslie W.; Friedel, Douglas N.; Lovas, Francis J.; McCall, Benjamin J.; Hollis, Jan M.

    2010-11-01

    Urea, a molecule discovered in human urine by H. M. Rouelle in 1773, has a significant role in prebiotic chemistry. Previous BIMA observations have suggested that interstellar urea [(NH2)2CO] is a compact hot core molecule like other large molecules (e.g. methyl formate and acetic acid). We have conducted an extensive search for urea toward the high mass hot molecular core Sgr B2(N-LMH) using BIMA, CARMA and the IRAM 30 m. Because the spectral lines of heavy molecules like urea tend to be weak and hot cores display lines from a wide range of molecules, it is necessary to detect a number of urea lines and apply sophisticated statistical tests before having confidence in an identification. The 1 mm resolution of CARMA enables favorable coupling of the source size and synthesized beam size, which was found to be essential for the detection of weak signals. We have detected a total of 65 spectral lines (32 molecular transitions and 33 unidentified transitions), most of which are narrower than in the SEST survey (Nummelin et al. 1998) due to the small synthesized beam (2.5" x 2") of CARMA. It significantly resolves out the contamination by extended emission and reveals the eight weak urea lines that were previously blended with nearby transitions. Our analysis indicates that these lines are likely to be urea since the resulting observed line frequencies are coincident with a set of overlapping connecting urea lines, and the observed line intensities are consistent with the expected line strengths of urea. In addition, we have developed a new statistical approach to examine the spatial correlation between the observed lines by applying the Student's t test to the high resolution channel maps obtained from CARMA. The t test shows consistent spatial distributions from all eight candidate lines, suggesting a common molecular origin, urea. Our t test method could have a broad impact on the next generation of arrays, such as ALMA, because the new arrays will require a method to systematically determine the credibility of detections of weaker signals from new and larger interstellar molecules.

  17. Theoretical investigation of gas-surface interactions

    NASA Technical Reports Server (NTRS)

    Dyall, Kenneth G.

    1990-01-01

    A Dirac-Hartree-Fock code was developed for polyatomic molecules. The program uses integrals over symmetry-adapted real spherical harmonic Gaussian basis functions generated by a modification of the MOLECULE integrals program. A single Gaussian function is used for the nuclear charge distribution, to ensure proper boundary conditions at the nuclei. The Gaussian primitive functions are chosen to satisfy the kinetic balance condition. However, contracted functions which do not necessarily satisfy this condition may be used. The Fock matrix is constructed in the scalar basis and transformed to a jj-coupled 2-spinor basis before diagonalization. The program was tested against numerical results for atoms with a Gaussian nucleus and diatomic molecules with point nuclei. The energies converge on the numerical values as the basis set size is increased. Full use of molecular symmetry (restricted to D2h and subgroups) is yet to be implemented.

  18. Calculation of multicenter electric field gradient integrals over Slater-type orbitals using unsymmetrical one-range addition theorems.

    PubMed

    Guseinov, Israfil I; Görgün, Nurşen Seçkin

    2011-06-01

    The electric field induced within a molecule by its electrons determines a whole series of important physical properties of the molecule. In particular, the values of the gradient of this field at the nuclei determine the interaction of their quadrupole moments with the electrons. Using unsymmetrical one-range addition theorems introduced by one of the authors, the sets of series expansion relations for multicenter electric field gradient integrals over Slater-type orbitals in terms of multicenter charge density expansion coefficients and two-center basic integrals are presented. The convergence of the series is tested by calculating concrete cases for different values of quantum numbers, parameters and locations of orbitals.

  19. Use of Crystal Structure Informatics for Defining the Conformational Space Needed for Predicting Crystal Structures of Pharmaceutical Molecules.

    PubMed

    Iuzzolino, Luca; Reilly, Anthony M; McCabe, Patrick; Price, Sarah L

    2017-10-10

    Determining the range of conformations that a flexible pharmaceutical-like molecule could plausibly adopt in a crystal structure is a key to successful crystal structure prediction (CSP) studies. We aim to use conformational information from the crystal structures in the Cambridge Structural Database (CSD) to facilitate this task. The conformations produced by the CSD Conformer Generator are reduced in number by considering the underlying rotamer distributions, an analysis of changes in molecular shape, and a minimal number of molecular ab initio calculations. This method is tested for five pharmaceutical-like molecules where an extensive CSP study has already been performed. The CSD informatics-derived set of crystal structure searches generates almost all the low-energy crystal structures previously found, including all experimental structures. The workflow effectively combines information on individual torsion angles and then eliminates the combinations that are too high in energy to be found in the solid state, reducing the resources needed to cover the solid-state conformational space of a molecule. This provides insights into how the low-energy solid-state and isolated-molecule conformations are related to the properties of the individual flexible torsion angles.

  20. Using support vector machines to improve elemental ion identification in macromolecular crystal structures

    DOE PAGES

    Morshed, Nader; Echols, Nathaniel; Adams, Paul D.

    2015-04-25

    In the process of macromolecular model building, crystallographers must examine electron density for isolated atoms and differentiate sites containing structured solvent molecules from those containing elemental ions. This task requires specific knowledge of metal-binding chemistry and scattering properties and is prone to error. A method has previously been described to identify ions based on manually chosen criteria for a number of elements. Here, the use of support vector machines (SVMs) to automatically classify isolated atoms as either solvent or one of various ions is described. Two data sets of protein crystal structures, one containing manually curated structures deposited with anomalous diffraction data and another with automatically filtered, high-resolution structures, were constructed. On the manually curated data set, an SVM classifier was able to distinguish calcium from manganese, zinc, iron and nickel, as well as all five of these ions from water molecules, with a high degree of accuracy. Additionally, SVMs trained on the automatically curated set of high-resolution structures were able to successfully classify most common elemental ions in an independent validation test set. This method is readily extensible to other elemental ions and can also be used in conjunction with previous methods based on a priori expectations of the chemical environment and X-ray scattering.

  1. Prediction of small molecule binding property of protein domains with Bayesian classifiers based on Markov chains.

    PubMed

    Bulashevska, Alla; Stein, Martin; Jackson, David; Eils, Roland

    2009-12-01

    Accurate computational methods that can help to predict the biological function of a protein from its sequence are of great interest to research biologists and pharmaceutical companies. One approach to inferring the function of proteins is to predict the interactions between proteins and other molecules. In this work, we propose a machine learning method that uses the primary sequence of a domain to predict its propensity for interaction with small molecules. By curating the Pfam database with respect to the small molecule binding ability of its component domains, we have constructed a dataset of small molecule binding and non-binding domains. This dataset was then used as a training set to learn a Bayesian classifier, which should distinguish members of each class. The domain sequences of both classes are modelled with Markov chains. In a jackknife test, our classification procedure achieved predictive accuracies of 77.2% and 66.7% for the binding and non-binding classes, respectively. We demonstrate the applicability of our classifier by using it to identify previously unknown small molecule binding domains. Our predictions are available as supplementary material and can provide very useful information to drug discovery specialists. Given the ubiquitous and essential role small molecules play in biological processes, our method is important for identifying pharmaceutically relevant components of complete proteomes. The software is available from the author upon request.
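
    The general idea of a Bayesian classifier built on per-class Markov chains can be sketched as follows. This is not the authors' software: it assumes a first-order chain, add-one (pseudocount) smoothing, non-empty sequences, and a toy four-letter alphabet, and every class, function name, and sequence below is hypothetical.

```python
import math
from collections import defaultdict


class MarkovChainModel:
    """First-order Markov chain over sequence symbols, with pseudocounts."""

    def __init__(self, sequences, alphabet, pseudocount=1.0):
        self.alphabet = alphabet
        self.pseudocount = pseudocount
        self.n_sequences = len(sequences)
        self.start_counts = defaultdict(float)
        self.transition_counts = defaultdict(lambda: defaultdict(float))
        for seq in sequences:
            self.start_counts[seq[0]] += 1
            for prev, nxt in zip(seq, seq[1:]):
                self.transition_counts[prev][nxt] += 1

    def log_likelihood(self, seq):
        k = len(self.alphabet)
        p = self.pseudocount
        # Smoothed probability of the first symbol.
        ll = math.log((self.start_counts[seq[0]] + p) / (self.n_sequences + p * k))
        # Smoothed transition probabilities for the rest of the sequence.
        for prev, nxt in zip(seq, seq[1:]):
            row = self.transition_counts[prev]
            total = sum(row.values())
            ll += math.log((row[nxt] + p) / (total + p * k))
        return ll


def classify(seq, models, priors):
    """Pick the class with the highest log prior + log likelihood (Bayes rule)."""
    return max(models, key=lambda c: math.log(priors[c]) + models[c].log_likelihood(seq))


# Toy training data over a hypothetical 4-letter alphabet.
alphabet = "ACDG"
models = {
    "binding": MarkovChainModel(["ACAC", "ACACAC"], alphabet),
    "non-binding": MarkovChainModel(["GDGD", "GDGDGD"], alphabet),
}
priors = {"binding": 0.5, "non-binding": 0.5}
print(classify("ACA", models, priors))
```

    A jackknife estimate of accuracy would repeat the training with each sequence held out in turn and classify the held-out sequence with the resulting models.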

  2. Reading Out Single-Molecule Digital RNA and DNA Isothermal Amplification in Nanoliter Volumes with Unmodified Camera Phones

    PubMed Central

    2016-01-01

    Digital single-molecule technologies are expanding diagnostic capabilities, enabling the ultrasensitive quantification of targets, such as viral load in HIV and hepatitis C infections, by directly counting single molecules. Replacing fluorescent readout with a robust visual readout that can be captured by any unmodified cell phone camera will facilitate the global distribution of diagnostic tests, including in limited-resource settings where the need is greatest. This paper describes a methodology for developing a visual readout system for digital single-molecule amplification of RNA and DNA by (i) selecting colorimetric amplification-indicator dyes that are compatible with the spectral sensitivity of standard mobile phones, and (ii) identifying an optimal ratiometric image-process for a selected dye to achieve a readout that is robust to lighting conditions and camera hardware and provides unambiguous quantitative results, even for colorblind users. We also include an analysis of the limitations of this methodology, and provide a microfluidic approach that can be applied to expand dynamic range and improve reaction performance, allowing ultrasensitive, quantitative measurements at volumes as low as 5 nL. We validate this methodology using SlipChip-based digital single-molecule isothermal amplification with λDNA as a model and hepatitis C viral RNA as a clinically relevant target. The innovative combination of isothermal amplification chemistry in the presence of a judiciously chosen indicator dye and ratiometric image processing with SlipChip technology allowed the sequence-specific visual readout of single nucleic acid molecules in nanoliter volumes with an unmodified cell phone camera. When paired with devices that integrate sample preparation and nucleic acid amplification, this hardware-agnostic approach will increase the affordability and the distribution of quantitative diagnostic and environmental tests. PMID:26900709

  3. Techniques for estimating Space Station aerodynamic characteristics

    NASA Technical Reports Server (NTRS)

    Thomas, Richard E.

    1993-01-01

    A method was devised and calculations were performed to determine the effects of reflected molecules on the aerodynamic force and moment coefficients for a body in free molecule flow. A procedure was developed for determining the velocity and temperature distributions of molecules reflected from a surface of arbitrary momentum and energy accommodation. A system of equations, based on momentum and energy balances for the surface, incident, and reflected molecules, was solved by a numerical optimization technique. The minimization of a 'cost' function, developed from the set of equations, resulted in the determination of the defining properties of the flow reflected from the arbitrary surface. The properties used to define both the incident and reflected flows were: average temperature of the molecules in the flow, angle of the flow with respect to a vector normal to the surface, and the molecular speed ratio. The properties of the reflected flow were used to calculate the contribution of multiply reflected molecules to the force and moments on a test body in the flow. The test configuration consisted of two flat plates joined along one edge at a right angle to each other. When force and moment coefficients of this 90 deg concave wedge were compared to results that did not include multiple reflections, it was found that multiple reflections could nearly double lift and drag coefficients, with nearly a 50 percent increase in pitching moment for cases with specular or nearly specular accommodation. The cases of diffuse or nearly diffuse accommodation often had minor reductions in axial and normal forces when multiple reflections were included. There were several cases of intermediate accommodation where the addition of multiple reflection effects more than tripled the lift coefficient over the convex technique.

  4. QSAR Analysis of 2-Amino or 2-Methyl-1-Substituted Benzimidazoles Against Pseudomonas aeruginosa

    PubMed Central

    Podunavac-Kuzmanović, Sanja O.; Cvetković, Dragoljub D.; Barna, Dijana J.

    2009-01-01

    A set of benzimidazole derivatives were tested for their inhibitory activities against the Gram-negative bacterium Pseudomonas aeruginosa and minimum inhibitory concentrations were determined for all the compounds. Quantitative structure activity relationship (QSAR) analysis was applied to fourteen of the abovementioned derivatives using a combination of various physicochemical, steric, electronic, and structural molecular descriptors. A multiple linear regression (MLR) procedure was used to model the relationships between molecular descriptors and the antibacterial activity of the benzimidazole derivatives. The stepwise regression method was used to derive the most significant models as a calibration model for predicting the inhibitory activity of this class of molecules. The best QSAR models were further validated by a leave one out technique as well as by the calculation of statistical parameters for the established theoretical models. To confirm the predictive power of the models, an external set of molecules was used. High agreement between experimental and predicted inhibitory values, obtained in the validation procedure, indicated the good quality of the derived QSAR models. PMID:19468332

  5. Selecting an optimal number of binding site waters to improve virtual screening enrichments against the adenosine A2A receptor.

    PubMed

    Lenselink, Eelke B; Beuming, Thijs; Sherman, Woody; van Vlijmen, Herman W T; IJzerman, Adriaan P

    2014-06-23

    A major challenge in structure-based virtual screening (VS) involves the treatment of explicit water molecules during docking in order to improve the enrichment of active compounds over decoys. Here we have investigated this in the context of the adenosine A2A receptor, where water molecules have previously been shown to be important for achieving high enrichment rates with docking, and where the positions of some binding site waters are known from a high-resolution crystal structure. The effect of these waters (both their presence and orientations) on VS enrichment was assessed using a carefully curated set of 299 high affinity A2A antagonists and 17,337 decoys. We show that including certain crystal waters greatly improves VS enrichment and that optimization of water hydrogen positions is needed in order to achieve the best results. We also show that waters derived from a molecular dynamics simulation - without any knowledge of crystallographic waters - can improve enrichments to a similar degree as the crystallographic waters, which makes this strategy applicable to structures without experimental knowledge of water positions. Finally, we used decision trees to select an ensemble of structures with different water molecule positions and orientations that outperforms any single structure with water molecules. The approach presented here is validated against independent test sets of A2A receptor antagonists and decoys from the literature. In general, this water optimization strategy could be applied to any target with water-mediated protein-ligand interactions.
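
    Enrichment of actives over decoys is often summarized by an enrichment factor (EF): the hit rate among the top-ranked fraction of the screened library divided by the hit rate expected from random selection. The abstract does not specify which enrichment metric was used, so the sketch below is only one common definition; it assumes that a higher score means a better rank, and the scores and function name are hypothetical. For scale, with 299 actives among 17,636 compounds the random hit rate is about 1.7%.

```python
def enrichment_factor(scored, fraction):
    """EF at a given top fraction.

    scored: list of (score, is_active) pairs; higher score = better rank
            (an assumption of this sketch; flip the sort for docking energies).
    fraction: top fraction of the ranked list to inspect, e.g. 0.01 for 1%.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n_total = len(ranked)
    n_actives = sum(1 for _, is_active in ranked if is_active)
    n_top = max(1, round(fraction * n_total))
    hits_in_top = sum(1 for _, is_active in ranked[:n_top] if is_active)
    return (hits_in_top / n_top) / (n_actives / n_total)


# Hypothetical ranked list: 10 compounds, the 2 actives score highest.
toy = [(10 - i, i < 2) for i in range(10)]
print(enrichment_factor(toy, 0.2))
```

    An EF of 1 corresponds to random selection; larger values mean that actives are concentrated near the top of the ranked list.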

  6. Enhanced Optical and Electric Manipulation of a Quantum Gas of KRb Molecules

    NASA Astrophysics Data System (ADS)

    Covey, Jacob P.

    Polar molecules are an ideal platform for studying quantum information and quantum simulation due to their long-range dipolar interactions. However, they have many degrees of freedom at disparate energy scales and thus are difficult to cool. Ultracold KRb molecules near quantum degeneracy were first produced in 2008. Nevertheless, it was found that even when the molecules are prepared in the absolute lowest state, chemical reactions can make the gas unstable. During my PhD we worked to mitigate these limitations by loading molecules into an optical lattice where the tunneling rates, and thus the chemistry, can be exquisitely controlled. This setting allowed us to start using the rotational degree of freedom as a pseudo-spin, and paved the way for studying models of quantum magnetism, such as the t-J model and the XXZ model. Further, by allowing molecules of two "spin" states to tunnel in the lattice, we were able to observe a continuous manifestation of the quantum Zeno effect, where increased mobility counterintuitively suppresses dissipation from inelastic collisions. In a deep lattice we observed dipolar spin-exchange interactions, and we were able to elucidate their truly many-body nature. These two sets of experiments informed us that the filling fraction of the molecules in the lattice was only 5-10%, and so we implemented a quantum synthesis approach where atomic insulators were used to maximize the number of sites with one K and one Rb, and then these "doublons" were converted to molecules with a filling of 30%. Despite these successes, a number of tools such as high resolution detection and addressing as well as large, stable electric fields were unavailable. Also during my PhD I led efforts to design, build, test, and implement a new apparatus which provides access to these tools and more. We have successfully produced ultracold molecules in this new apparatus, and we are now applying AC and DC electric fields with in-vacuum electrodes. This apparatus will allow us to study quantum magnetism in a large electric field, and to detect the dynamics of out-of-equilibrium many-body states.

  7. How to Achieve Better Results Using PASS-Based Virtual Screening: Case Study for Kinase Inhibitors

    NASA Astrophysics Data System (ADS)

    Pogodin, Pavel V.; Lagunin, Alexey A.; Rudik, Anastasia V.; Filimonov, Dmitry A.; Druzhilovskiy, Dmitry S.; Nicklaus, Mark C.; Poroikov, Vladimir V.

    2018-04-01

    Discovery of new pharmaceutical substances is currently boosted by the possibility of utilization of the Synthetically Accessible Virtual Inventory (SAVI) library, which includes about 283 million molecules, each annotated with a proposed synthetic one-step route from commercially available starting materials. The SAVI database is well-suited for ligand-based methods of virtual screening to select molecules for experimental testing. In this study, we compare the performance of three approaches for the analysis of structure-activity relationships that differ in their criteria for selecting “active” and “inactive” compounds included in the training sets. PASS (Prediction of Activity Spectra for Substances), which is based on a modified Naïve Bayes algorithm, was applied since it had been shown to be robust and to provide good predictions of many biological activities based on just the structural formula of a compound even if the information in the training set is incomplete. We used different subsets of kinase inhibitors for this case study because many data are currently available on this important class of drug-like molecules. Based on the subsets of kinase inhibitors extracted from the ChEMBL 20 database we performed the PASS training, and then applied the model to ChEMBL 23 compounds not yet present in ChEMBL 20 to identify novel kinase inhibitors. As one may expect, the best prediction accuracy was obtained if only the experimentally confirmed active and inactive compounds for distinct kinases were used in the training procedure. However, for some kinases, reasonable results were obtained even if we used merged training sets, in which we designated as inactives the compounds not tested against the particular kinase. Thus, depending on the availability of data for a particular biological activity, one may choose the first or the second approach for creating ligand-based computational tools to achieve the best possible results in virtual screening.

  8. DAT/SERT Selectivity of Flexible GBR 12909 Analogs Modeled Using 3D-QSAR Methods

    PubMed Central

    Gilbert, Kathleen M.; Boos, Terrence L.; Dersch, Christina M.; Greiner, Elisabeth; Jacobson, Arthur E.; Lewis, David; Matecka, Dorota; Prisinzano, Thomas E.; Zhang, Ying; Rothman, Richard B.; Rice, Kenner C.; Venanzi, Carol A.

    2007-01-01

    The dopamine reuptake inhibitor GBR 12909 (1-{2-[bis(4-fluorophenyl)methoxy]ethyl}-4-(3-phenylpropyl)piperazine, 1) and its analogs have been developed as tools to test the hypothesis that selective dopamine transporter (DAT) inhibitors will be useful therapeutics for cocaine addiction. This 3D-QSAR study focuses on the effect of substitutions in the phenylpropyl region of 1. CoMFA and CoMSIA techniques were used to determine a predictive and stable model for the DAT/serotonin transporter (SERT) selectivity (represented by pKi (DAT/SERT)) of a set of flexible analogs of 1, most of which have eight rotatable bonds. In the absence of a rigid analog to use as a 3D-QSAR template, six conformational families of analogs were constructed from six pairs of piperazine and piperidine template conformers identified by hierarchical clustering as representative molecular conformations. Three models stable to y-value scrambling were identified after a comprehensive CoMFA and CoMSIA survey with Region Focusing. Test set correlation validation led to an acceptable model, with q2 = 0.508, standard error of prediction = 0.601, two components, r2 = 0.685, standard error of estimate = 0.481, F value = 39, percent steric contribution = 65, and percent electrostatic contribution = 35. A CoMFA contour map identified areas of the molecule that affect pKi (DAT/SERT). This work outlines a protocol for deriving a stable and predictive model of the biological activity of a set of very flexible molecules. PMID:17127069

  9. Hydration in drug design. 3. Conserved water molecules at the ligand-binding sites of homologous proteins

    NASA Astrophysics Data System (ADS)

    Poornima, C. S.; Dean, P. M.

    1995-12-01

    Water molecules are known to play an important rôle in mediating protein-ligand interactions. If water molecules are conserved at the ligand-binding sites of homologous proteins, such a finding may suggest the structural importance of water molecules in ligand binding. Structurally conserved water molecules change the conventional definition of `binding sites' by changing the shape and complementarity of these sites. Such conserved water molecules can be important for site-directed ligand/drug design. Therefore, five different sets of homologous protein/protein-ligand complexes have been examined to identify the conserved water molecules at the ligand-binding sites. Our analysis reveals that there are as many as 16 conserved water molecules at the FAD binding site of glutathione reductase between the crystal structures obtained from human and E. coli. In the remaining four sets of high-resolution crystal structures, 2-4 water molecules have been found to be conserved at the ligand-binding sites. The majority of these conserved water molecules are either bound in deep grooves at the protein-ligand interface or completely buried in cavities between the protein and the ligand. All these water molecules, conserved between the protein/protein-ligand complexes from different species, have identical or similar apolar and polar interactions in a given set. The site residues interacting with the conserved water molecules at the ligand-binding sites have been found to be highly conserved among proteins from different species; they are more conserved compared to the other site residues interacting with the ligand. These water molecules, in general, make multiple polar contacts with protein-site residues.

  10. Diffusion and mobility of atomic particles in a liquid

    NASA Astrophysics Data System (ADS)

    Smirnov, B. M.; Son, E. E.; Tereshonok, D. V.

    2017-11-01

    The diffusion coefficient of a test atom or molecule in a liquid is determined for the mechanism where the displacement of the test molecule results from the vibrations and motion of liquid molecules surrounding the test molecule and of the test particle itself. This leads to a random change in the coordinate of the test molecule, which eventually results in the diffusion motion of the test particle in space. Two model parameters of interaction of a particle and a liquid are used to find the activation energy of the diffusion process under consideration: the gas-kinetic cross section for scattering of test molecules in the parent gas and the Wigner-Seitz radius for test molecules. In the context of this approach, we have calculated the diffusion coefficient of atoms and molecules in water, where based on experimental data, we have constructed the dependence of the activation energy for the diffusion of test molecules in water on the interaction parameter and the temperature dependence for the diffusion coefficient of atoms or molecules in water within the models considered. The statistically averaged difference of the activation energies for the diffusion coefficients of different test molecules in water that we have calculated based on each of the presented models does not exceed 10% of the diffusion coefficient itself. We have considered the diffusion of clusters in water and present the dependence of the diffusion coefficient on the cluster size. The accuracy of the presented formulas for the diffusion coefficient of atomic particles in water is estimated to be 50%.

  11. Improved pan-specific MHC class I peptide-binding predictions using a novel representation of the MHC-binding cleft environment.

    PubMed

    Carrasco Pro, S; Zimic, M; Nielsen, M

    2014-02-01

    Major histocompatibility complex (MHC) molecules play a key role in cell-mediated immune responses presenting bounded peptides for recognition by the immune system cells. Several in silico methods have been developed to predict the binding affinity of a given peptide to a specific MHC molecule. One of the current state-of-the-art methods for MHC class I is NetMHCpan, which has a core ingredient for the representation of the MHC class I molecule using a pseudo-sequence representation of the binding cleft amino acid environment. New and large MHC-peptide-binding data sets are constantly being made available, and also new structures of MHC class I molecules with a bound peptide have been published. In order to test if the NetMHCpan method can be improved by integrating this novel information, we created new pseudo-sequence definitions for the MHC-binding cleft environment from sequence and structural analyses of different MHC data sets including human leukocyte antigen (HLA), non-human primates (chimpanzee, macaque and gorilla) and other animal alleles (cattle, mouse and swine). From these constructs, we showed that by focusing on MHC sequence positions found to be polymorphic across the MHC molecules used to train the method, the NetMHCpan method achieved a significant increase in the predictive performance, in particular, of non-human MHCs. This study hence showed that an improved performance of MHC-binding methods can be achieved not only by the accumulation of more MHC-peptide-binding data but also by a refined definition of the MHC-binding environment including information from non-human species. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  12. Pharmacophore modeling, molecular docking, and molecular dynamics simulation approaches for identifying new lead compounds for inhibiting aldose reductase 2.

    PubMed

    Sakkiah, Sugunadevi; Thangapandian, Sundarapandian; Lee, Keun Woo

    2012-07-01

    Aldose reductase 2 (ALR2), which catalyzes the reduction of glucose to sorbitol using NADP as a cofactor, has been implicated in the etiology of secondary complications of diabetes. A pharmacophore model, Hypo1, was built based on 26 compounds with known ALR2-inhibiting activity values. Hypo1 contains important chemical features required for an ALR2 inhibitor, and demonstrates good predictive ability by having a high correlation coefficient (0.95) as well as the highest cost difference (128.44) and the lowest RMS deviation (1.02) among the ten pharmacophore models examined. Hypo1 was further validated by Fisher's randomization method (95%), test set (r = 0.91), and the decoy set shows the goodness of fit (0.70). Furthermore, during virtual screening, Hypo1 was used as a 3D query to screen the NCI database, and the hit leads were sorted by applying Lipinski's rule of five and ADME properties. The best-fitting leads were subjected to docking to identify a suitable orientation at the ALR2 active site. The molecule that showed the strongest interactions with the critical amino acids was used in molecular dynamics simulations to calculate its binding affinity to the candidate molecules. Thus, Hypo1 describes the key structure-activity relationship along with the estimated activities of ALR2 inhibitors. The hit molecules were searched against PubChem to find similar molecules with new scaffolds. Finally, four molecules were found to satisfy all of the chemical features and the geometric constraints of Hypo1, as well as to show good dock scores, PLPs and PMFs. Thus, we believe that Hypo1 facilitates the selection of novel scaffolds for ALR2, allowing new classes of ALR2 inhibitors to be designed.

  13. Evaluations of the conformational search accuracy of CAMDAS using experimental three-dimensional structures of protein-ligand complexes

    NASA Astrophysics Data System (ADS)

    Oda, A.; Yamaotsu, N.; Hirono, S.; Takano, Y.; Fukuyoshi, S.; Nakagaki, R.; Takahashi, O.

    2013-08-01

    CAMDAS is a conformational search program, through which high temperature molecular dynamics (MD) calculations are carried out. In this study, the conformational search ability of CAMDAS was evaluated using 281 structurally known protein-ligand complexes as a test set. For the test, the influences of initial settings and initial conformations on search results were validated. By using the CAMDAS program, reasonable conformations whose root mean square deviations (RMSDs) in comparison with crystal structures were less than 2.0 Å could be obtained from 96% of the test set even though the worst initial settings were used. The success rate was comparable to that of OMEGA, and the errors of CAMDAS were less than those of OMEGA. Based on the results obtained using CAMDAS, the worst RMSD was around 2.5 Å, whereas the worst value obtained using OMEGA was around 4.0 Å. The results indicated that CAMDAS is a robust and versatile conformational search method and that it can be used for a wide variety of small molecules. In addition, the accuracy of a conformational search in relation to this study was improved by longer MD calculations and multiple MD simulations.
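
    The success criterion described in this abstract (an RMSD below 2.0 Å against the crystal structure) can be illustrated with a minimal sketch. It assumes the two conformations are already superimposed and list their atoms in the same order; the function names `rmsd` and `success_rate` are hypothetical and are not part of CAMDAS or OMEGA.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of (x, y, z).

    Assumes both structures are already in the same frame (no superposition
    is performed here) and that atoms are listed in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

def success_rate(rmsd_values, cutoff=2.0):
    """Fraction of test-set entries whose best RMSD is below the cutoff (in Å)."""
    return sum(1 for value in rmsd_values if value < cutoff) / len(rmsd_values)
```

    For a real benchmark one would first superimpose each generated conformer on the crystal pose; that step is deliberately left out of this sketch.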

  14. Space flight effects on antioxidant molecules in dry tardigrades: the TARDIKISS experiment.

    PubMed

    Rizzo, Angela Maria; Altiero, Tiziana; Corsetto, Paola Antonia; Montorfano, Gigliola; Guidetti, Roberto; Rebecchi, Lorena

    2015-01-01

    The TARDIKISS (Tardigrades in Space) experiment was part of the Biokon in Space (BIOKIS) payload, a set of multidisciplinary experiments performed during the DAMA (Dark Matter) mission organized by the Italian Space Agency and Italian Air Force in 2011. This mission supported the execution of short-duration (16 days) experiments, taking advantage of the microgravity environment on board the Space Shuttle Endeavour (its last mission, STS-134) docked to the International Space Station. TARDIKISS was composed of three sample sets: one flight sample and two ground control samples. These samples provided the biological material used to test how space stressors, including microgravity, affected animal survivability, life cycle, DNA integrity, and pathways of molecules working as antioxidants. In this paper we compared the molecular pathways of some antioxidant molecules, thiobarbituric acid reactive substances, and fatty acid composition between flight and control samples in two tardigrade species, namely, Paramacrobiotus richtersi and Ramazzottius oberhaeuseri. In both species, the activities of ROS scavenging enzymes, the total content of glutathione, and the fatty acid composition between flight and control samples showed few significant differences. The TARDIKISS experiment, together with a previous space experiment (TARSE), further confirms that both desiccated and hydrated tardigrades represent a useful animal tool for space research.

  15. Rapid parameterization of small molecules using the Force Field Toolkit.

    PubMed

    Mayne, Christopher G; Saam, Jan; Schulten, Klaus; Tajkhorshid, Emad; Gumbart, James C

    2013-12-15

    The inability to rapidly generate accurate and robust parameters for novel chemical matter continues to severely limit the application of molecular dynamics simulations to many biological systems of interest, especially in fields such as drug discovery. Although the release of generalized versions of common classical force fields, for example, General Amber Force Field and CHARMM General Force Field, have posited guidelines for parameterization of small molecules, many technical challenges remain that have hampered their wide-scale extension. The Force Field Toolkit (ffTK), described herein, minimizes common barriers to ligand parameterization through algorithm and method development, automation of tedious and error-prone tasks, and graphical user interface design. Distributed as a VMD plugin, ffTK facilitates the traversal of a clear and organized workflow resulting in a complete set of CHARMM-compatible parameters. A variety of tools are provided to generate quantum mechanical target data, setup multidimensional optimization routines, and analyze parameter performance. Parameters developed for a small test set of molecules using ffTK were comparable to existing CGenFF parameters in their ability to reproduce experimentally measured values for pure-solvent properties (<15% error from experiment) and free energy of solvation (±0.5 kcal/mol from experiment). Copyright © 2013 Wiley Periodicals, Inc.

  16. Many-Body Descriptors for Predicting Molecular Properties with Machine Learning: Analysis of Pairwise and Three-Body Interactions in Molecules.

    PubMed

    Pronobis, Wiktor; Tkatchenko, Alexandre; Müller, Klaus-Robert

    2018-06-12

    Machine learning (ML) based prediction of molecular properties across chemical compound space is an important and alternative approach to efficiently estimate the solutions of highly complex many-electron problems in chemistry and physics. Statistical methods represent molecules as descriptors that should encode molecular symmetries and interactions between atoms. Many such descriptors have been proposed; all of them have advantages and limitations. Here, we propose a set of general two-body and three-body interaction descriptors which are invariant to translation, rotation, and atomic indexing. By adapting the successfully used kernel ridge regression methods of machine learning, we evaluate our descriptors on predicting several properties of small organic molecules calculated using density-functional theory. We use two data sets. The GDB-7 set contains 6868 molecules with up to 7 heavy atoms of type CNO. The GDB-9 set is composed of 131722 molecules with up to 9 heavy atoms containing CNO. When trained on 5000 random molecules, our best model achieves an accuracy of 0.8 kcal/mol (on the remaining 1868 molecules of GDB-7) and 1.5 kcal/mol (on the remaining 126722 molecules of GDB-9) respectively. Applying a linear regression model on our novel many-body descriptors performs almost equal to a nonlinear kernelized model. Linear models are readily interpretable: a feature importance ranking measure helps to obtain qualitative and quantitative insights on the importance of two- and three-body molecular interactions for predicting molecular properties computed with quantum-mechanical methods.

  17. Nature is the best source of anti-inflammatory drugs: indexing natural products for their anti-inflammatory bioactivity.

    PubMed

    Aswad, Miran; Rayan, Mahmoud; Abu-Lafi, Saleh; Falah, Mizied; Raiyn, Jamal; Abdallah, Ziyad; Rayan, Anwar

    2018-01-01

    The aim was to index natural products for less expensive preventive or curative anti-inflammatory therapeutic drugs. A set of 441 anti-inflammatory drugs representing the active domain and 2892 natural products representing the inactive domain was used to construct a predictive model for bioactivity-indexing purposes. The model for indexing the natural products for potential anti-inflammatory activity was constructed using the iterative stochastic elimination algorithm (ISE). ISE is capable of differentiating between active and inactive anti-inflammatory molecules. By applying the prediction model to a mixed set of (active/inactive) substances, we managed to capture 38% of the anti-inflammatory drugs in the top 1% of the screened set of chemicals, yielding an enrichment factor of 38. Ten natural products that scored highly as potential anti-inflammatory drug candidates are disclosed. Searching PubMed revealed that only three molecules (Moupinamide, Capsaicin, and Hypaphorine) out of the ten were tested and reported as anti-inflammatory. The other seven phytochemicals await evaluation for their anti-inflammatory activity in the wet lab. The proposed anti-inflammatory model can be utilized for the virtual screening of large chemical databases and for indexing natural products for potential anti-inflammatory activity.
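
    The enrichment factor quoted in this abstract follows from simple arithmetic: capturing 38% of the actives in the top 1% of the ranked list gives 0.38 / 0.01 = 38. A minimal sketch of that calculation is given below; the function name `enrichment_factor` and its arguments are hypothetical, and the standard definition (hit rate in the top fraction divided by the hit rate in the whole set) is assumed rather than taken from the paper.

```python
def enrichment_factor(scores, labels, top_fraction):
    """Enrichment factor of a ranked screen.

    scores: model scores, higher meaning more likely active.
    labels: 1 for active, 0 for inactive, aligned with scores.
    top_fraction: fraction of the ranked list to inspect (e.g. 0.01 for top 1%).
    """
    n_total = len(scores)
    n_top = max(1, round(n_total * top_fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])
    total_actives = sum(labels)
    # (hits_in_top / n_top) / (total_actives / n_total), written to stay exact
    return (hits_in_top * n_total) / (n_top * total_actives)
```

    The toy numbers used to exercise this sketch (10,000 compounds, 100 actives) are illustrative and are not the data set sizes reported above.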

  18. Theoretical investigation of gas-phase molecular complex formation between 2-hydroxy thiophenol and a water molecule.

    PubMed

    Kumar Deb, Debojit; Sarkar, Biplab

    2017-01-18

    The torsional potential of OH and SH rotations in 2-hydroxy thiophenol is systematically studied using the MP2 ab initio method. The outcome of state-of-the-art calculations is used in the investigation of the structures and conformational preferences of 2-hydroxy thiophenol and aims at further interaction studies with a gas phase water molecule. SCS-MP2 and CCSD(T) complete basis set (CBS) limit interaction energies for these complexes are presented. The SCS-MP2/CBS limit is achieved using various two-point extrapolation methods with aug-cc-pVDZ and aug-cc-pVTZ basis sets. The CCSD(T) correction term is determined as the difference between CCSD(T) and SCS-MP2 interaction energies calculated using a smaller basis set. The effect of counterpoise correction on the extrapolation to the CBS limit is discussed. The performance of DFT based wB97XD, M06-2X and B3LYP-D3 functionals is tested against the benchmark energy from ab initio calculations. Hydrogen bond interactions are characterized by carrying out QTAIM, NCIPLOT, NBO and SAPT analyses.
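
    The composite scheme described in this abstract can be written out explicitly. The first two lines restate the abstract: the CCSD(T) correction is the difference between CCSD(T) and SCS-MP2 interaction energies in a smaller basis set, added to the SCS-MP2/CBS limit. The third line is only one commonly used two-point extrapolation of the correlation energy (the inverse-cubic form, with cardinal numbers X = 2 for aug-cc-pVDZ and Y = 3 for aug-cc-pVTZ); the abstract says various two-point schemes were used, so which forms the authors applied is an assumption here.

```latex
\begin{align}
  \Delta E^{\mathrm{CCSD(T)}} &=
    E^{\mathrm{CCSD(T)}}_{\text{small basis}} - E^{\mathrm{SCS\text{-}MP2}}_{\text{small basis}} \\
  E^{\mathrm{CCSD(T)}}_{\mathrm{CBS}} &\approx
    E^{\mathrm{SCS\text{-}MP2}}_{\mathrm{CBS}} + \Delta E^{\mathrm{CCSD(T)}} \\
  E^{\mathrm{corr}}_{\mathrm{CBS}} &\approx
    \frac{Y^{3} E^{\mathrm{corr}}_{Y} - X^{3} E^{\mathrm{corr}}_{X}}{Y^{3} - X^{3}},
    \qquad X = 2,\; Y = 3
\end{align}
```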

  19. Noncontiguous atom matching structural similarity function.

    PubMed

    Teixeira, Ana L; Falcao, Andre O

    2013-10-28

    Measuring similarity between molecules is a fundamental problem in cheminformatics. Given that similar molecules tend to have similar physical, chemical, and biological properties, the notion of molecular similarity plays an important role in the exploration of molecular data sets, query-retrieval in molecular databases, and in structure-property/activity modeling. Various methods to define structural similarity between molecules are available in the literature, but so far none has been used with consistent and reliable results for all situations. We propose a new similarity method based on atom alignment for the analysis of structural similarity between molecules. This method is based on the comparison of the bonding profiles of atoms on comparable molecules, including features that are seldom found in other structural or graph matching approaches like chirality or double bond stereoisomerism. The similarity measure is then defined on the annotated molecular graph, based on an iterative directed graph similarity procedure and optimal atom alignment between atoms using a pairwise matching algorithm. With the proposed approach the similarities detected are more intuitively understood because similar atoms in the molecules are explicitly shown. This noncontiguous atom matching structural similarity method (NAMS) was tested and compared with one of the most widely used similarity methods (fingerprint-based similarity) using three difficult data sets with different characteristics. Despite having a higher computational cost, the method performed well being able to distinguish either different or very similar hydrocarbons that were indistinguishable using a fingerprint-based approach. 
NAMS also verified the similarity principle using a data set of structurally similar steroids with differences in the binding affinity to the corticosteroid binding globulin receptor by showing that pairs of steroids with a high degree of similarity (>80%) tend to have smaller differences in the absolute value of binding activity. Using a highly diverse set of compounds with information about the monoamine oxidase inhibition level, the method was also able to recover a significantly higher average fraction of active compounds when the seed is active for different cutoff threshold values of similarity. Particularly, for the cutoff threshold values of 86%, 93%, and 96.5%, NAMS was able to recover a fraction of actives of 0.57, 0.63, and 0.83, respectively, while the fingerprint-based approach was able to recover a fraction of actives of 0.41, 0.40, and 0.39, respectively. NAMS is made available freely for the whole community in a simple Web based tool as well as the Python source code at http://nams.lasige.di.fc.ul.pt/.

  20. Low-lying excited states by constrained DFT

    NASA Astrophysics Data System (ADS)

    Ramos, Pablo; Pavanello, Michele

    2018-04-01

    Exploiting the machinery of Constrained Density Functional Theory (CDFT), we propose a variational method for calculating low-lying excited states of molecular systems. We dub this method eXcited CDFT (XCDFT). Excited states are obtained by self-consistently constraining a user-defined population of electrons, Nc, in the virtual space of a reference set of occupied orbitals. By imposing this population to be Nc = 1.0, we computed the first excited state of 15 molecules from a test set. Our results show that XCDFT achieves an accuracy in the predicted excitation energy only slightly worse than linear-response time-dependent DFT (TDDFT), but without incurring into problems of variational collapse typical of the more commonly adopted ΔSCF method. In addition, we selected a few challenging processes to test the limits of applicability of XCDFT. We find that in contrast to TDDFT, XCDFT is capable of reproducing energy surfaces featuring conical intersections (azobenzene and H3) with correct topology and correct overall energetics also away from the intersection. Venturing to condensed-phase systems, XCDFT reproduces the TDDFT solvatochromic shift of benzaldehyde when it is embedded by a cluster of water molecules. Thus, we find XCDFT to be a competitive method among single-reference methods for computations of excited states in terms of time to solution, rate of convergence, and accuracy of the result.

  1. Large scale study of multiple-molecule queries

    PubMed Central

    2009-01-01

    Background In ligand-based screening, as well as in other chemoinformatics applications, one seeks to effectively search large repositories of molecules in order to retrieve molecules that are similar typically to a single molecule lead. However, in some cases, multiple molecules from the same family are available to seed the query and search for other members of the same family. Multiple-molecule query methods have been less studied than single-molecule query methods. Furthermore, the previous studies have relied on proprietary data and sometimes have not used proper cross-validation methods to assess the results. In contrast, here we develop and compare multiple-molecule query methods using several large publicly available data sets and background. We also create a framework based on a strict cross-validation protocol to allow unbiased benchmarking for direct comparison in future studies across several performance metrics. Results Fourteen different multiple-molecule query methods were defined and benchmarked using: (1) 41 publicly available data sets of related molecules with similar biological activity; and (2) publicly available background data sets consisting of up to 175,000 molecules randomly extracted from the ChemDB database and other sources. Eight of the fourteen methods were parameter free, and six of them fit one or two free parameters to the data using a careful cross-validation protocol. All the methods were assessed and compared for their ability to retrieve members of the same family against the background data set by using several performance metrics including the Area Under the Accumulation Curve (AUAC), Area Under the Curve (AUC), F1-measure, and BEDROC metrics. Consistent with the previous literature, the best parameter-free methods are the MAX-SIM and MIN-RANK methods, which score a molecule to a family by the maximum similarity, or minimum ranking, obtained across the family. 
One new parameterized method introduced in this study and two previously defined methods, the Exponential Tanimoto Discriminant (ETD), the Tanimoto Power Discriminant (TPD), and the Binary Kernel Discriminant (BKD), outperform most other methods but are more complex, requiring one or two parameters to be fit to the data. Conclusion Fourteen methods for multiple-molecule querying of chemical databases, including novel methods, (ETD) and (TPD), are validated using publicly available data sets, standard cross-validation protocols, and established metrics. The best results are obtained with ETD, TPD, BKD, MAX-SIM, and MIN-RANK. These results can be replicated and compared with the results of future studies using data freely downloadable from http://cdb.ics.uci.edu/. PMID:20298525
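
    The two parameter-free methods named in this abstract can be sketched in a few lines. MAX-SIM scores a candidate by its maximum similarity to any family member; MIN-RANK ranks the database once per family member and keeps each candidate's best (minimum) rank. The sketch assumes Tanimoto similarity on fingerprints represented as sets of "on" bit indices, which is an illustrative choice; the function names `tanimoto`, `max_sim`, and `min_rank` are hypothetical and not taken from the study's code.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of 'on' bits."""
    fp_a, fp_b = set(fp_a), set(fp_b)
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def max_sim(candidate, family):
    """MAX-SIM: highest similarity between the candidate and any family member."""
    return max(tanimoto(candidate, query) for query in family)

def min_rank(database, family):
    """MIN-RANK: best (lowest) rank of each database molecule over all queries.

    Returns a list aligned with `database`; rank 1 is the most similar.
    Ties are broken by database order because Python's sort is stable.
    """
    best = [len(database)] * len(database)
    for query in family:
        order = sorted(
            range(len(database)),
            key=lambda i: tanimoto(database[i], query),
            reverse=True,
        )
        for rank, index in enumerate(order, start=1):
            best[index] = min(best[index], rank)
    return best
```

    A database would then be sorted by descending MAX-SIM score, or by ascending MIN-RANK value, to retrieve likely members of the family.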

  2. BioCompoundML: A General Biofuel Property Screening Tool for Biological Molecules Using Random Forest Classifiers

    DOE PAGES

    Whitmore, Leanne S.; Davis, Ryan W.; McCormick, Robert L.; ...

    2016-09-15

    Screening a large number of biologically derived molecules for potential fuel compounds without recourse to experimental testing is important in identifying understudied yet valuable molecules. Experimental testing, although a valuable standard for measuring fuel properties, has several major limitations, including the requirement of testably high quantities, considerable expense, and a large amount of time. This paper discusses the development of a general-purpose fuel property tool, using machine learning, whose outcome is to screen molecules for desirable fuel properties. BioCompoundML adopts a general methodology, requiring as input only a list of training compounds (with identifiers and measured values) and a list of testing compounds (with identifiers). For the training data, BioCompoundML collects open data from the National Center for Biotechnology Information, incorporates user-provided features, imputes missing values, performs feature reduction, builds a classifier, and clusters compounds. BioCompoundML then collects data for the testing compounds, predicts class membership, and determines whether compounds are found in the range of variability of the training data set. We demonstrate this tool using three different fuel properties: research octane number (RON), threshold soot index (TSI), and melting point (MP). Here we provide measures of its success with these properties using randomized train/test measurements: average accuracy is 88% in RON, 85% in TSI, and 94% in MP; average precision is 88% in RON, 88% in TSI, and 95% in MP; and average recall is 88% in RON, 82% in TSI, and 97% in MP. The receiver operator characteristics (area under the curve) were estimated at 0.88 in RON, 0.86 in TSI, and 0.87 in MP. We also measured the success of BioCompoundML by sending 16 compounds for direct RON determination. Finally, we provide a screen of 1977 hydrocarbons/oxygenates within the 8696 compounds in MetaCyc, identifying compounds with high predictive strength for high or low RON.
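
    The accuracy, precision, and recall figures quoted in this abstract are standard binary-classification metrics. The sketch below shows how they are computed from true and predicted class labels; the function name `classification_metrics` is hypothetical and this is not BioCompoundML's own code.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Fall back to 0.0 when a denominator is empty.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

    Averaging these values over repeated random train/test splits would give figures of the kind reported above.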

  3. Electronic Coupling Calculations for Bridge-Mediated Charge Transfer Using Constrained Density Functional Theory (CDFT) and Effective Hamiltonian Approaches at the Density Functional Theory (DFT) and Fragment-Orbital Density Functional Tight Binding (FODFTB) Level

    DOE PAGES

    Gillet, Natacha; Berstis, Laura; Wu, Xiaojing; ...

    2016-09-09

    In this paper, four methods to calculate charge transfer integrals in the context of bridge-mediated electron transfer are tested. These methods are based on density functional theory (DFT). We consider two perturbative Green's function effective Hamiltonian methods (first, at the DFT level of theory, using localized molecular orbitals; second, applying a tight-binding DFT approach, using fragment orbitals) and two constrained DFT implementations with either plane-wave or local basis sets. To assess the performance of the methods for through-bond (TB)-dominated or through-space (TS)-dominated transfer, different sets of molecules are considered. For through-bond electron transfer (ET), several molecules that were originally synthesized by Paddon-Row and co-workers for the deduction of electronic coupling values from photoemission and electron transmission spectroscopies, are analyzed. The tested methodologies prove to be successful in reproducing experimental data, the exponential distance decay constant and the superbridge effects arising from interference among ET pathways. For through-space ET, dedicated π-stacked systems with heterocyclopentadiene molecules were created and analyzed on the basis of electronic coupling dependence on donor-acceptor distance, structure of the bridge, and ET barrier height. The inexpensive fragment-orbital density functional tight binding (FODFTB) method gives similar results to constrained density functional theory (CDFT) and both reproduce the expected exponential decay of the coupling with donor-acceptor distances and the number of bridging units. Finally, these four approaches appear to give reliable results for both TB and TS ET and present a good alternative to expensive ab initio methodologies for large systems involving long-range charge transfers.

  4. Electronic Coupling Calculations for Bridge-Mediated Charge Transfer Using Constrained Density Functional Theory (CDFT) and Effective Hamiltonian Approaches at the Density Functional Theory (DFT) and Fragment-Orbital Density Functional Tight Binding (FODFTB) Level

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gillet, Natacha; Berstis, Laura; Wu, Xiaojing

    In this paper, four methods to calculate charge transfer integrals in the context of bridge-mediated electron transfer are tested. These methods are based on density functional theory (DFT). We consider two perturbative Green's function effective Hamiltonian methods (first, at the DFT level of theory, using localized molecular orbitals; second, applying a tight-binding DFT approach, using fragment orbitals) and two constrained DFT implementations with either plane-wave or local basis sets. To assess the performance of the methods for through-bond (TB)-dominated or through-space (TS)-dominated transfer, different sets of molecules are considered. For through-bond electron transfer (ET), several molecules that were originally synthesized by Paddon-Row and co-workers for the deduction of electronic coupling values from photoemission and electron transmission spectroscopies, are analyzed. The tested methodologies prove to be successful in reproducing experimental data, the exponential distance decay constant and the superbridge effects arising from interference among ET pathways. For through-space ET, dedicated π-stacked systems with heterocyclopentadiene molecules were created and analyzed on the basis of electronic coupling dependence on donor-acceptor distance, structure of the bridge, and ET barrier height. The inexpensive fragment-orbital density functional tight binding (FODFTB) method gives similar results to constrained density functional theory (CDFT) and both reproduce the expected exponential decay of the coupling with donor-acceptor distances and the number of bridging units. Finally, these four approaches appear to give reliable results for both TB and TS ET and present a good alternative to expensive ab initio methodologies for large systems involving long-range charge transfers.

  5. Quantitative Assessment of Molecular Dynamics Sampling for Flexible Systems.

    PubMed

    Nemec, Mike; Hoffmann, Daniel

    2017-02-14

    Molecular dynamics (MD) simulation is a natural method for the study of flexible molecules but at the same time is limited by the large size of the conformational space of these molecules. We ask by how much the MD sampling quality for flexible molecules can be improved by two means: the use of diverse sets of trajectories starting from different initial conformations to detect deviations between samples and sampling with enhanced methods such as accelerated MD (aMD) or scaled MD (sMD) that distort the energy landscape in controlled ways. To this end, we test the effects of these approaches on MD simulations of two flexible biomolecules in aqueous solution, Met-Enkephalin (5 amino acids) and HIV-1 gp120 V3 (a cycle of 35 amino acids). We assess the convergence of the sampling quantitatively with known, extensive measures of cluster number N_c and cluster distribution entropy S_c and with two new quantities, conformational overlap O_conf and density overlap O_dens, both conveniently ranging from 0 to 1. These new overlap measures quantify self-consistency of sampling in multitrajectory MD experiments, a necessary condition for converged sampling. A comprehensive assessment of sampling quality of MD experiments identifies the combination of diverse trajectory sets and aMD as the most efficient approach among those tested. However, analysis of O_dens between conventional and aMD trajectories also reveals that we have not completely corrected aMD sampling for the distorted energy landscape. Moreover, for V3, the courses of N_c and O_dens indicate that much higher resources than those generally invested today will probably be needed to achieve convergence. The comparative analysis also shows that conventional MD simulations with insufficient sampling can be easily misinterpreted as being converged.
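
    The cluster number and cluster distribution entropy named in this abstract can be illustrated with a short sketch. The abstract does not give the exact definitions, so this sketch assumes the entropy is the Shannon entropy of the cluster populations; the function name `cluster_statistics` and the example labels are hypothetical and not taken from the paper.

```python
import math
from collections import Counter

def cluster_statistics(labels):
    """Return (cluster count, Shannon entropy of cluster populations).

    Assumption: the entropy is -sum(p_i * ln p_i) over cluster
    populations p_i; the paper's exact definition may differ.
    """
    counts = Counter(labels)          # frames per cluster
    total = len(labels)
    n_clusters = len(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return n_clusters, entropy

# Hypothetical cluster assignments for eight trajectory frames.
frame_labels = [0, 0, 1, 1, 2, 2, 3, 3]
n_c, s_c = cluster_statistics(frame_labels)
```

    Tracking both values as more frames are added gives a rough picture of whether new conformations are still being discovered.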

  6. Electronic Coupling Calculations for Bridge-Mediated Charge Transfer Using Constrained Density Functional Theory (CDFT) and Effective Hamiltonian Approaches at the Density Functional Theory (DFT) and Fragment-Orbital Density Functional Tight Binding (FODFTB) Level.

    PubMed

    Gillet, Natacha; Berstis, Laura; Wu, Xiaojing; Gajdos, Fruzsina; Heck, Alexander; de la Lande, Aurélien; Blumberger, Jochen; Elstner, Marcus

    2016-10-11

    In this article, four methods to calculate charge transfer integrals in the context of bridge-mediated electron transfer are tested. These methods are based on density functional theory (DFT). We consider two perturbative Green's function effective Hamiltonian methods (first, at the DFT level of theory, using localized molecular orbitals; second, applying a tight-binding DFT approach, using fragment orbitals) and two constrained DFT implementations with either plane-wave or local basis sets. To assess the performance of the methods for through-bond (TB)-dominated or through-space (TS)-dominated transfer, different sets of molecules are considered. For through-bond electron transfer (ET), several molecules that were originally synthesized by Paddon-Row and co-workers for the deduction of electronic coupling values from photoemission and electron transmission spectroscopies, are analyzed. The tested methodologies prove to be successful in reproducing experimental data, the exponential distance decay constant and the superbridge effects arising from interference among ET pathways. For through-space ET, dedicated π-stacked systems with heterocyclopentadiene molecules were created and analyzed on the basis of electronic coupling dependence on donor-acceptor distance, structure of the bridge, and ET barrier height. The inexpensive fragment-orbital density functional tight binding (FODFTB) method gives similar results to constrained density functional theory (CDFT) and both reproduce the expected exponential decay of the coupling with donor-acceptor distances and the number of bridging units. These four approaches appear to give reliable results for both TB and TS ET and present a good alternative to expensive ab initio methodologies for large systems involving long-range charge transfers.
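
    The exponential distance decay mentioned in this abstract can be extracted from a set of couplings by a log-linear fit. The sketch below assumes the common convention |H| = A exp(-beta R / 2), so that the squared coupling decays as exp(-beta R); the convention, the function name `fit_decay_constant`, and the synthetic data are assumptions for illustration and are not taken from the paper.

```python
import math

def fit_decay_constant(distances, couplings):
    """Least-squares fit of ln|H| = ln(A) - (beta / 2) * R.

    Returns (A, beta). Assumes the |H| = A * exp(-beta * R / 2) convention.
    """
    xs = list(distances)
    ys = [math.log(abs(h)) for h in couplings]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -2.0 * slope

# Synthetic couplings generated with A = 0.5 and beta = 1.0 (not real data).
distances = [4.0, 6.0, 8.0, 10.0]
couplings = [0.5 * math.exp(-0.5 * 1.0 * r) for r in distances]
prefactor, beta = fit_decay_constant(distances, couplings)
```

    With couplings computed by any of the four methods, the fitted beta values could then be compared against each other or against experiment.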

  7. QSAR analyses of 3-(4-benzylpiperidin-1-yl)-N-phenylpropylamine derivatives as potent CCR5 antagonists.

    PubMed

    Roy, Kunal; Leonard, J Thomas

    2005-01-01

    CCR5 receptor binding affinity of a series of 3-(4-benzylpiperidin-1-yl)propylamine congeners was subjected to QSAR study using the linear free energy related (LFER) model of Hansch. Appropriate indicator variables encoding different group contributions and different physicochemical variables such as hydrophobicity (pi), electronic (Hammett sigma), and steric (molar refractivity, STERIMOL values) parameters of phenyl ring substituents of the compounds were used as predictor variables. The Hansch analysis explores the importance of the lipophilicity and electron-donating substituents for the binding affinity. However, this method could not give more insight into the structure-activity relationships because of the diverse molecular features in the data set. 3D-QSAR analyses of the same data set using Molecular Shape Analysis (MSA), Receptor Surface Analysis (RSA), and Molecular Field Analysis (MFA) techniques were also performed. The best model with acceptable statistical quality was derived from the MSA, which showed the importance of the relative negative charge (RNCG): substituents with a high RNCG value have more binding affinity than the unsubstituted piperidine and phenyl (R1 position) congeners. The relative negative charge surface area (RNCS) is detrimental (e.g. R2 = 3,4-Cl2) for the activity. An increase in the length of the molecule in the Z dimension (Lz) is conducive (e.g. R3 = sulfonylmorpholino), while an increase in the area of the molecular shadow in the XZ plane (Sxz) is detrimental (e.g. R1 = N-c-hexylmethyl-5-oxopyrrolidin-3-yl) for the binding affinity. The presence of a chiral center makes the molecule less active (e.g. R1 = N-methyl-5-oxopyrrolidin-3-yl). An increase in the van der Waals area, the molecular volume, and the difference between the volume of the individual molecule and the shape reference compound are conducive (e.g. R3 = (CH3)2NSO2-) for the binding affinity. Substituents with higher JursFPSA_2 values (fractional charged partial surface area) like the N-methylsulfonylpiperidin-4-yl (R1 position) group have better binding affinity than the substituents such as 4-chlorophenylamino (R1 position). Unsubstituted piperidines (R1 position) with less JursFNSA_1 values have lower binding affinity than the 4-chlorophenyl substituted compounds. The MFA derived equation shows interaction energies at different grid points, while the RSA model shows the importance of hydrophobicity and charge at different regions of the molecules. The models were validated through the leave-one-out, leave-15%-out, and leave-25%-out cross-validation techniques. The developed models were also subjected to a randomization test (99% confidence level). Although the MSA derived models had excellent statistical qualities both for the training as well as test sets, RSA and MFA results for the test sets are not comparable statistically with the MSA derived models.

  8. Get Your Atoms in Order--An Open-Source Implementation of a Novel and Robust Molecular Canonicalization Algorithm.

    PubMed

    Schneider, Nadine; Sayle, Roger A; Landrum, Gregory A

    2015-10-26

    Finding a canonical ordering of the atoms in a molecule is a prerequisite for generating a unique representation of the molecule. The canonicalization of a molecule is usually accomplished by applying some sort of graph relaxation algorithm, the most common of which is the Morgan algorithm. There are known issues with that algorithm that lead to noncanonical atom orderings as well as problems when it is applied to large molecules like proteins. Furthermore, each cheminformatics toolkit or software provides its own version of a canonical ordering, most based on unpublished algorithms, which also complicates the generation of a universal unique identifier for molecules. We present an alternative canonicalization approach that uses a standard stable-sorting algorithm instead of a Morgan-like index. Two new invariants that allow canonical ordering of molecules with dependent chirality as well as those with highly symmetrical cyclic graphs have been developed. The new approach proved to be robust and fast when tested on the 1.45 million compounds of the ChEMBL 20 data set in different scenarios like random renumbering of input atoms or SMILES round tripping. Our new algorithm is able to generate a canonical order of the atoms of protein molecules within a few milliseconds. The novel algorithm is implemented in the open-source cheminformatics toolkit RDKit. With this paper, we provide a reference Python implementation of the algorithm that could easily be integrated in any cheminformatics toolkit. This provides a first step toward a common standard for canonical atom ordering to generate a universal unique identifier for molecules other than InChI.

  9. Force Field Benchmark of Organic Liquids: Density, Enthalpy of Vaporization, Heat Capacities, Surface Tension, Isothermal Compressibility, Volumetric Expansion Coefficient, and Dielectric Constant.

    PubMed

    Caleman, Carl; van Maaren, Paul J; Hong, Minyan; Hub, Jochen S; Costa, Luciano T; van der Spoel, David

    2012-01-10

    The chemical composition of small organic molecules is often very similar to amino acid side chains or the bases in nucleic acids, and hence there is no a priori reason why a molecular mechanics force field could not describe both organic liquids and biomolecules with a single parameter set. Here, we devise a benchmark for force fields in order to test the ability of existing force fields to reproduce some key properties of organic liquids, namely, the density, enthalpy of vaporization, the surface tension, the heat capacity at constant volume and pressure, the isothermal compressibility, the volumetric expansion coefficient, and the static dielectric constant. Well over 1200 experimental measurements were used for comparison to the simulations of 146 organic liquids. Novel polynomial interpolations of the dielectric constant (32 molecules), heat capacity at constant pressure (three molecules), and the isothermal compressibility (53 molecules) as a function of the temperature have been made, based on experimental data, in order to be able to compare simulation results to them. To compute the heat capacities, we applied the two phase thermodynamics method (Lin et al. J. Chem. Phys. 2003, 119, 11792), which allows one to compute thermodynamic properties on the basis of the density of states as derived from the velocity autocorrelation function. The method is implemented in a new utility within the GROMACS molecular simulation package, named g_dos, and a detailed exposé of the underlying equations is presented. The purpose of this work is to establish the state of the art of two popular force fields, OPLS/AA (all-atom optimized potential for liquid simulation) and GAFF (generalized Amber force field), to find common bottlenecks, i.e., particularly difficult molecules, and to serve as a reference point for future force field development. To make for a fair playing field, all molecules were evaluated with the same parameter settings, such as thermostats and barostats, treatment of electrostatic interactions, and system size (1000 molecules). The densities and enthalpy of vaporization from an independent data set based on simulations using the CHARMM General Force Field (CGenFF) presented by Vanommeslaeghe et al. (J. Comput. Chem. 2010, 31, 671) are included for comparison. We find that, overall, the OPLS/AA force field performs somewhat better than GAFF, but there are significant issues with reproduction of the surface tension and dielectric constants for both force fields.

  10. Force Field Benchmark of Organic Liquids: Density, Enthalpy of Vaporization, Heat Capacities, Surface Tension, Isothermal Compressibility, Volumetric Expansion Coefficient, and Dielectric Constant

    PubMed Central

    2011-01-01

    The chemical composition of small organic molecules is often very similar to amino acid side chains or the bases in nucleic acids, and hence there is no a priori reason why a molecular mechanics force field could not describe both organic liquids and biomolecules with a single parameter set. Here, we devise a benchmark for force fields in order to test the ability of existing force fields to reproduce some key properties of organic liquids, namely, the density, enthalpy of vaporization, the surface tension, the heat capacity at constant volume and pressure, the isothermal compressibility, the volumetric expansion coefficient, and the static dielectric constant. Well over 1200 experimental measurements were used for comparison to the simulations of 146 organic liquids. Novel polynomial interpolations of the dielectric constant (32 molecules), heat capacity at constant pressure (three molecules), and the isothermal compressibility (53 molecules) as a function of the temperature have been made, based on experimental data, in order to be able to compare simulation results to them. To compute the heat capacities, we applied the two phase thermodynamics method (Lin et al. J. Chem. Phys. 2003, 119, 11792), which allows one to compute thermodynamic properties on the basis of the density of states as derived from the velocity autocorrelation function. The method is implemented in a new utility within the GROMACS molecular simulation package, named g_dos, and a detailed exposé of the underlying equations is presented. The purpose of this work is to establish the state of the art of two popular force fields, OPLS/AA (all-atom optimized potential for liquid simulation) and GAFF (generalized Amber force field), to find common bottlenecks, i.e., particularly difficult molecules, and to serve as a reference point for future force field development. To make for a fair playing field, all molecules were evaluated with the same parameter settings, such as thermostats and barostats, treatment of electrostatic interactions, and system size (1000 molecules). The densities and enthalpy of vaporization from an independent data set based on simulations using the CHARMM General Force Field (CGenFF) presented by Vanommeslaeghe et al. (J. Comput. Chem. 2010, 31, 671) are included for comparison. We find that, overall, the OPLS/AA force field performs somewhat better than GAFF, but there are significant issues with reproduction of the surface tension and dielectric constants for both force fields. PMID:22241968

  11. Automation of the CHARMM General Force Field (CGenFF) I: bond perception and atom typing

    PubMed Central

    Vanommeslaeghe, K.; MacKerell, A. D.

    2012-01-01

    Molecular mechanics force fields are widely used in computer-aided drug design for the study of drug-like molecules alone or interacting with biological systems. In simulations involving biological macromolecules, the biological part is typically represented by a specialized biomolecular force field, while the drug is represented by a matching general (organic) force field. In order to apply these general force fields to an arbitrary drug-like molecule, functionality for assignment of atom types, parameters and charges is required. In the present article, which is part I of a series of two, we present the algorithms for bond perception and atom typing for the CHARMM General Force Field (CGenFF). The CGenFF atom typer first associates attributes to the atoms and bonds in a molecule, such as valence, bond order, and ring membership among others. Of note are a number of features that are specifically required for CGenFF. This information is then used by the atom typing routine to assign CGenFF atom types based on a programmable decision tree. This allows for straightforward implementation of CGenFF’s complicated atom typing rules and for equally straightforward updating of the atom typing scheme as the force field grows. The presented atom typer was validated by assigning correct atom types on 477 model compounds including in the training set as well as 126 test-set molecules that were constructed to specifically verify its different components. The program may be utilized via an online implementation at https://www.paramchem.org/. PMID:23146088

  12. Automation of the CHARMM General Force Field (CGenFF) I: bond perception and atom typing.

    PubMed

    Vanommeslaeghe, K; MacKerell, A D

    2012-12-21

    Molecular mechanics force fields are widely used in computer-aided drug design for the study of drug-like molecules alone or interacting with biological systems. In simulations involving biological macromolecules, the biological part is typically represented by a specialized biomolecular force field, while the drug is represented by a matching general (organic) force field. In order to apply these general force fields to an arbitrary drug-like molecule, functionality for assignment of atom types, parameters, and charges is required. In the present article, which is part I of a series of two, we present the algorithms for bond perception and atom typing for the CHARMM General Force Field (CGenFF). The CGenFF atom typer first associates attributes to the atoms and bonds in a molecule, such as valence, bond order, and ring membership among others. Of note are a number of features that are specifically required for CGenFF. This information is then used by the atom typing routine to assign CGenFF atom types based on a programmable decision tree. This allows for straightforward implementation of CGenFF's complicated atom typing rules and for equally straightforward updating of the atom typing scheme as the force field grows. The presented atom typer was validated by assigning correct atom types on 477 model compounds including in the training set as well as 126 test-set molecules that were constructed to specifically verify its different components. The program may be utilized via an online implementation at https://www.paramchem.org/ .

  13. Constant size descriptors for accurate machine learning models of molecular properties

    NASA Astrophysics Data System (ADS)

    Collins, Christopher R.; Gordon, Geoffrey J.; von Lilienfeld, O. Anatole; Yaron, David J.

    2018-06-01

    Two different classes of molecular representations for use in machine learning of thermodynamic and electronic properties are studied. The representations are evaluated by monitoring the performance of linear and kernel ridge regression models on well-studied data sets of small organic molecules. One class of representations studied here counts the occurrence of bonding patterns in the molecule. These require only the connectivity of atoms in the molecule as may be obtained from a line diagram or a SMILES string. The second class utilizes the three-dimensional structure of the molecule. These include the Coulomb matrix and Bag of Bonds, which list the inter-atomic distances present in the molecule, and Encoded Bonds, which encode such lists into a feature vector whose length is independent of molecular size. Encoded Bonds' features introduced here have the advantage of leading to models that may be trained on smaller molecules and then used successfully on larger molecules. A wide range of feature sets are constructed by selecting, at each rank, either a graph or geometry-based feature. Here, rank refers to the number of atoms involved in the feature, e.g., atom counts are rank 1, while Encoded Bonds are rank 2. For atomization energies in the QM7 data set, the best graph-based feature set gives a mean absolute error of 3.4 kcal/mol. Inclusion of 3D geometry substantially enhances the performance, with Encoded Bonds giving 2.4 kcal/mol, when used alone, and 1.19 kcal/mol, when combined with graph features.

  14. Numerical Treatment of Stokes Solvent Flow and Solute-Solvent Interfacial Dynamics for Nonpolar Molecules.

    PubMed

    Sun, Hui; Zhou, Shenggao; Moore, David K; Cheng, Li-Tien; Li, Bo

    2016-05-01

    We design and implement numerical methods for the incompressible Stokes solvent flow and solute-solvent interface motion for nonpolar molecules in aqueous solvent. The balance of viscous force, surface tension, and van der Waals type dispersive force leads to a traction boundary condition on the solute-solvent interface. To allow the change of solute volume, we design special numerical boundary conditions on the boundary of a computational domain through a consistency condition. We use a finite difference ghost fluid scheme to discretize the Stokes equation with such boundary conditions. The method is tested to have a second-order accuracy. We combine this ghost fluid method with the level-set method to simulate the motion of the solute-solvent interface that is governed by the solvent fluid velocity. Numerical examples show that our method can predict accurately the blow up time for a test example of curvature flow and reproduce the polymodal (e.g., dry and wet) states of hydration of some simple model molecular systems.

  15. Numerical Treatment of Stokes Solvent Flow and Solute-Solvent Interfacial Dynamics for Nonpolar Molecules

    PubMed Central

    Sun, Hui; Zhou, Shenggao; Moore, David K.; Cheng, Li-Tien; Li, Bo

    2015-01-01

    We design and implement numerical methods for the incompressible Stokes solvent flow and solute-solvent interface motion for nonpolar molecules in aqueous solvent. The balance of viscous force, surface tension, and van der Waals type dispersive force leads to a traction boundary condition on the solute-solvent interface. To allow the change of solute volume, we design special numerical boundary conditions on the boundary of a computational domain through a consistency condition. We use a finite difference ghost fluid scheme to discretize the Stokes equation with such boundary conditions. The method is tested to have a second-order accuracy. We combine this ghost fluid method with the level-set method to simulate the motion of the solute-solvent interface that is governed by the solvent fluid velocity. Numerical examples show that our method can predict accurately the blow up time for a test example of curvature flow and reproduce the polymodal (e.g., dry and wet) states of hydration of some simple model molecular systems. PMID:27365866

  16. Quantifying entanglement of rotor chains using basis truncation: Application to dipolar endofullerene peapods.

    PubMed

    Halverson, Tom; Iouchtchenko, Dmitri; Roy, Pierre-Nicholas

    2018-02-21

    We propose a variational approach for the calculation of the quantum entanglement entropy of assemblies of rotating dipolar molecules. A basis truncation scheme based on the total angular momentum quantum number is proposed. The method is tested on hydrogen fluoride (HF) molecules confined in C60 fullerene cages themselves trapped in a nanotube to form a carbon peapod. The rotational degrees of freedom of the HF molecules and dipolar interactions between neighboring molecules are considered in our model Hamiltonian. Both screened and unscreened dipoles are simulated and results are obtained for the ground state and one excited state that is expected to be accessible via a far-infrared collective excitation. The effect of basis truncation on energetic and entanglement properties is examined and discussed in terms of size extensivity. It is empirically found that for unscreened dipoles, a total angular momentum cutoff that increases linearly with the number of rotors is required in order to obtain proper system size scaling of the chemical potential and entanglement entropy. Recent experiments [A. Krachmalnicoff et al., Nat. Chem. 8, 953 (2016)] suggest substantial screening of the HF dipole moment, so much smaller basis sets are required to obtain converged results in this realistic case. Static correlation functions are also computed and are shown to decay much quicker in the case of screened dipoles. Our variational results are also used to test the accuracy of perturbative and pairwise ansatz treatments.

  17. Structure-based prediction of free energy changes of binding of PTP1B inhibitors

    NASA Astrophysics Data System (ADS)

    Wang, Jing; Ling Chan, Shek; Ramnarayan, Kal

    2003-08-01

    The goals were (1) to understand the driving forces in the binding of small molecule inhibitors to the active site of PTP1B and (2) to develop a molecular mechanics-based empirical free energy function for compound potency prediction. A set of compounds with known activities was docked onto the active site. The related energy components and molecular surface areas were calculated. The bridging water molecules were identified and their contributions were considered. Linear relationships were explored between the above terms and the binding free energies of compounds derived based on experimental inhibition constants. We found that minimally three terms are required to give rise to a good correlation (0.86) with predictive power in five-group cross-validation test (q2 = 0.70). The dominant terms are the electrostatic energy and non-electrostatic energy stemming from the intra- and intermolecular interactions of solutes and from those of bridging water molecules in complexes.
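
    The five-group cross-validation statistic q2 quoted in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a single descriptor rather than the three energy terms of the paper, assigns groups by index modulo the group count, and takes q2 = 1 - PRESS / SS_tot with SS_tot computed about the mean of the full data set. The function names and the synthetic data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def cross_validated_q2(xs, ys, n_groups=5):
    """Leave-group-out q2 = 1 - PRESS / SS_tot (assumed definition)."""
    n = len(xs)
    press = 0.0
    for g in range(n_groups):
        held_out = set(range(g, n, n_groups))   # every n_groups-th point
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        slope, intercept = fit_line(train_x, train_y)
        press += sum((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out)
    mean_y = sum(ys) / n
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot

# Synthetic data (not from the paper): an exact line and a perturbed copy.
descriptor = [float(i) for i in range(10)]
exact = [2.0 * x + 1.0 for x in descriptor]
perturbed = [y + (0.5 if i % 2 == 0 else -0.5) for i, y in enumerate(exact)]
q2_exact = cross_validated_q2(descriptor, exact)
q2_perturbed = cross_validated_q2(descriptor, perturbed)
```

    Because every prediction is made for compounds left out of the fit, q2 is normally lower than the fitted correlation, as with the 0.70 versus 0.86 reported here.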

  18. Introducing new reactivity descriptors: "Bond reactivity indices." Comparison of the new definitions and atomic reactivity indices.

    PubMed

    Sánchez-Márquez, Jesús

    2016-11-21

    A new methodology to obtain reactivity indices has been defined. This is based on reactivity functions such as the Fukui function or the dual descriptor and makes it possible to project the information of reactivity functions over molecular orbitals instead of the atoms of the molecule (atomic reactivity indices). The methodology focuses on the molecule's natural bond orbitals (bond reactivity indices) because these orbitals (with physical meaning) have the advantage of being very localized, allowing the reaction site of an electrophile or nucleophile to be determined within a very precise molecular region. This methodology gives a reactivity index for every Natural Bond Orbital (NBO), and we have verified that they have equivalent information to the reactivity functions. A representative set of molecules has been used to test the new definitions. Also, the bond reactivity index has been related with the atomic reactivity one, and complementary information has been obtained from the comparison. Finally, a new atomic reactivity index has been defined and compared with previous definitions.

  19. Molecular Determinants of Peptide Binding to Two Common Rhesus Macaque Major Histocompatibility Complex Class II Molecules

    PubMed Central

    Dzuris, John L.; Sidney, John; Horton, Helen; Correa, Rose; Carter, Donald; Chesnut, Robert W.; Watkins, David I.; Sette, Alessandro

    2001-01-01

    Major histocompatibility complex class II molecules encoded by two common rhesus macaque alleles Mamu-DRB1*0406 and Mamu-DRB*w201 have been purified, and quantitative binding assays have been established. The structural requirements for peptide binding to each molecule were characterized by testing panels of single-substitution analogs of the two previously defined epitopes HIV Env242 (Mamu-DRB1*0406 restricted) and HIV Env482 (Mamu-DRB*w201 restricted). Anchor positions of both macaque DR molecules were spaced following a position 1 (P1), P4, P6, P7, and P9 pattern. The specific binding motif associated with each molecule was distinct, but largely overlapping, and was based on crucial roles of aromatic and/or hydrophobic residues at P1, P6, and P9. Based on these results, a tentative Mamu class II DR supermotif was defined. This pattern is remarkably similar to a previously defined human HLA-DR supermotif. Similarities in binding motifs between human HLA and macaque Mamu-DR molecules were further illustrated by testing a panel of more than 60 different single-substitution analogs of the HLA-DR-restricted HA 307–319 epitope for binding to Mamu-DRB*w201 and HLA-DRB1*0101. The Mamu-DRB1*0406 and -DRB*w201 binding capacity of a set of 311 overlapping peptides spanning the entire simian immunodeficiency virus (SIV) genome was also evaluated. Ten peptides capable of binding both molecules were identified, together with 19 DRB1*0406 and 43 DRB*w201 selective binders. The Mamu-DR supermotif was found to be present in about 75% of the good binders and in 50% of peptides binding with intermediate affinity but only in approximately 25% of the peptides which did not bind either Mamu class II molecule. Finally, using flow cytometric detection of antigen-induced intracellular gamma interferon, we identify a new CD4+ T-lymphocyte epitope encoded within the Rev protein of SIV. PMID:11602736

  20. Headspace Analysis of Volatile Compounds Using Segmented Chirped-Pulse Fourier Transform Mm-Wave Spectroscopy

    NASA Astrophysics Data System (ADS)

    Harris, Brent; Steber, Amanda; Pate, Brooks

    2014-06-01

    A chirped-pulse Fourier transform mm-wave spectrometer has been tested in analytical chemistry applications of headspace analysis of volatile species. A solid-state mm-wave light source (260-290 GHz) provides 30-50 mW of power. This power is sufficient to achieve optimal excitation of individual transitions of molecules with dipole moments larger than about 0.1 D. The chirped-pulse spectrometer has near 100% measurement duty cycle using a high-speed digitizer (4 GS/s) with signal accumulation in an FPGA. The combination of the ability to perform optimal pulse excitation and near 100% measurement duty cycle gives a spectrometer that is fully optimized for trace detection. The performance of the instrument is tested using an EPA sample (EPA VOC Mix 6 - Supelco) that contains a set of molecules that are fast eluting on gas chromatographs and, as a result, present analysis challenges to mass spectrometry. The ability to directly analyze the VOC mixture is tested by acquiring the full bandwidth (260-290 GHz) spectrum in a "high dynamic range" measurement mode that minimizes spurious spectrometer responses. The high-resolution of molecular rotational spectroscopy makes it easy to analyze this mixture without the need for chemical separation. The sensitivity of the instrument for individual molecule detection, where a single transition is polarized by the excitation pulse, is also tested. Detection limits in water will be reported. In the case of chloromethane, the detection limit (0.1 microgram/L), matches the sensitivity reported in the EPA measurement protocol (EPA Method 524) for GC/MS.

  1. Ab initio phasing based on topological restraints: automated determination of the space group and the number of molecules in the unit cell.

    PubMed

    Urzhumtseva, Ludmila; Lunina, Natalia; Fokine, Andrei; Samama, Jean Pierre; Lunin, Vladimir Y; Urzhumtsev, Alexandre

    2004-09-01

    The connectivity-based phasing method has been demonstrated to be capable of finding molecular packing and envelopes even for difficult cases of structure determination, as well as of identifying, in favorable cases, secondary-structure elements of protein molecules in the crystal. This method uses a single set of structure factor magnitudes and general topological features of a crystallographic image of the macromolecule under study. This information is expressed through a number of parameters. Most of these parameters are easy to estimate, and the results of phasing are practically independent of these parameters when they are chosen within reasonable limits. By contrast, the correct choice for such parameters as the expected number of connected regions in the unit cell is sometimes ambiguous. To study these dependencies, numerous tests were performed with simulated data, experimental data and mixed data sets, where several reflections missed in the experiment were completed by computed data. This paper demonstrates that the procedure is able to control this choice automatically and helps in difficult cases to identify the correct number of molecules in the asymmetric unit. In addition, the procedure behaves abnormally if the space group is defined incorrectly and therefore may distinguish between the rotation and screw axes even when high-resolution data are not available.

  2. Genarris: Random generation of molecular crystal structures and fast screening with a Harris approximation

    NASA Astrophysics Data System (ADS)

    Li, Xiayue; Curtis, Farren S.; Rose, Timothy; Schober, Christoph; Vazquez-Mayagoitia, Alvaro; Reuter, Karsten; Oberhofer, Harald; Marom, Noa

    2018-06-01

    We present Genarris, a Python package that performs configuration space screening for molecular crystals of rigid molecules by random sampling with physical constraints. For fast energy evaluations, Genarris employs a Harris approximation, whereby the total density of a molecular crystal is constructed via superposition of single molecule densities. Dispersion-inclusive density functional theory is then used for the Harris density without performing a self-consistency cycle. Genarris uses machine learning for clustering, based on a relative coordinate descriptor developed specifically for molecular crystals, which is shown to be robust in identifying packing motif similarity. In addition to random structure generation, Genarris offers three workflows based on different sequences of successive clustering and selection steps: the "Rigorous" workflow is an exhaustive exploration of the potential energy landscape, the "Energy" workflow produces a set of low energy structures, and the "Diverse" workflow produces a maximally diverse set of structures. The latter is recommended for generating initial populations for genetic algorithms. Here, the implementation of Genarris is reported and its application is demonstrated for three test cases.

  3. QSAR Accelerated Discovery of Potent Ice Recrystallization Inhibitors

    NASA Astrophysics Data System (ADS)

    Briard, Jennie G.; Fernandez, Michael; de Luna, Phil; Woo, Tom. K.; Ben, Robert N.

    2016-05-01

    Ice recrystallization is the main contributor to cell damage and death during the cryopreservation of cells and tissues. Over the past five years, many small carbohydrate-based molecules were identified as ice recrystallization inhibitors and several were shown to reduce cryoinjury during the cryopreservation of red blood cells (RBCs) and hematopoietic stem cells (HSCs). Unfortunately, clear structure-activity relationships have not been identified, impeding the rational design of future compounds possessing ice recrystallization inhibition (IRI) activity. A set of 124 previously synthesized compounds with known IRI activities was used to calibrate 3D-QSAR classification models using GRid INdependent Descriptors (GRIND) derived from DFT level quantum mechanical calculations. A partial least squares (PLS) model was calibrated with 70% of the data set and successfully identified 80% of the IRI active compounds with a precision of 0.8. This model exhibited good performance in screening the remaining 30% of the data set, with 70% of active additives successfully recovered with a precision of ~0.7 and specificity of 0.8. The model was further applied to screen a new library of aryl-alditol molecules which were then experimentally synthesized and tested with a success rate of 82%. Presented is the first computer-aided high-throughput experimental screening for novel IRI active compounds.
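
    The screening figures quoted above (fraction of actives recovered, precision, specificity) all come from a binary confusion matrix. The following is a minimal sketch of how such numbers are computed; the function name `screening_metrics` and the labels are hypothetical and are not taken from the study.

```python
def screening_metrics(y_true, y_pred):
    """Recall, precision and specificity for binary labels (1 = active, 0 = inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # actives flagged as active
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # inactives flagged as active
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # inactives flagged as inactive
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # actives missed

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "recall": ratio(tp, tp + fn),       # fraction of actives recovered
        "precision": ratio(tp, tp + fp),    # fraction of predicted actives that are active
        "specificity": ratio(tn, tn + fp),  # fraction of inactives correctly rejected
    }


# Made-up labels for ten compounds, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

    Reporting specificity alongside precision matters when inactive compounds outnumber actives, since precision alone does not show how many inactives were correctly rejected.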

  4. Ligand-based and structure-based approaches in identifying ideal pharmacophore against c-Jun N-terminal kinase-3.

    PubMed

    Kumar, B V S Suneel; Kotla, Rohith; Buddiga, Revanth; Roy, Jyoti; Singh, Sardar Shamshair; Gundla, Rambabu; Ravikumar, Muttineni; Sarma, Jagarlapudi A R P

    2011-01-01

    Structure- and ligand-based pharmacophore modeling and docking studies carried out using a diversified set of c-Jun N-terminal kinase-3 (JNK3) inhibitors are presented in this paper. A ligand-based pharmacophore model (LBPM) was developed for 106 inhibitors of JNK3 using a training set of 21 compounds to reveal structural and chemical features necessary for these molecules to inhibit JNK3. Hypo1 consisted of two hydrogen bond acceptors (HBA), one hydrogen bond donor (HBD), and a hydrophobic (HY) feature with a correlation coefficient (r²) of 0.950. This pharmacophore model was validated using a test set containing 85 inhibitors and had a good r² of 0.846. All the molecules were docked using Glide software and, interestingly, all the docked conformations showed hydrogen bond interactions with important hinge region amino acids (Gln155 and Met149); these interactions were compared with Hypo1 features. The results of the ligand-based pharmacophore model (LBPM) and the docking studies validate each other. The structure-based pharmacophore model (SBPM) studies have identified additional features, two hydrogen bond donors and one hydrogen bond acceptor. The combination of these methodologies is useful in designing an ideal pharmacophore, which provides a powerful tool for the discovery of novel and selective JNK3 inhibitors.

  5. Virtual screening of cathepsin k inhibitors using docking and pharmacophore models.

    PubMed

    Ravikumar, Muttineni; Pavan, S; Bairy, Santhosh; Pramod, A B; Sumakanth, M; Kishore, Madala; Sumithra, Tirunagaram

    2008-07-01

    Cathepsin K is a lysosomal cysteine protease that is highly and selectively expressed in osteoclasts, the cells which degrade bone during the continuous cycle of bone degradation and formation. Inhibition of cathepsin K represents a potential therapeutic approach for diseases characterized by excessive bone resorption such as osteoporosis. In order to elucidate the essential structural features for cathepsin K, three-dimensional pharmacophore hypotheses were built on the basis of a set of known cathepsin K inhibitors selected from the literature using the Catalyst program. Several methods used in validation of the pharmacophore hypotheses are presented, and the fourth hypothesis (Hypo4) was considered to be the best pharmacophore hypothesis, with a correlation coefficient of 0.944 for the training set and high predictive ability for a set of 30 test molecules, with a correlation of 0.909. The model (Hypo4) was then employed as a 3D search query to screen the Maybridge database containing 59,000 compounds, to discover novel and highly potent ligands. For analyzing intermolecular interactions between protein and ligand, all the molecules were docked using Glide software. The result showed that the type and spatial location of chemical features encoded in the pharmacophore are in full agreement with the enzyme inhibitor interaction pattern identified from molecular docking.

  6. QSAR Accelerated Discovery of Potent Ice Recrystallization Inhibitors

    PubMed Central

    Briard, Jennie G.; Fernandez, Michael; De Luna, Phil; Woo, Tom. K.; Ben, Robert N.

    2016-01-01

    Ice recrystallization is the main contributor to cell damage and death during the cryopreservation of cells and tissues. Over the past five years, many small carbohydrate-based molecules were identified as ice recrystallization inhibitors and several were shown to reduce cryoinjury during the cryopreservation of red blood cells (RBCs) and hematopoietic stem cells (HSCs). Unfortunately, clear structure-activity relationships have not been identified, impeding the rational design of future compounds possessing ice recrystallization inhibition (IRI) activity. A set of 124 previously synthesized compounds with known IRI activities was used to calibrate 3D-QSAR classification models using GRid INdependent Descriptors (GRIND) derived from DFT level quantum mechanical calculations. A partial least squares (PLS) model was calibrated with 70% of the data set and successfully identified 80% of the IRI active compounds with a precision of 0.8. This model exhibited good performance in screening the remaining 30% of the data set, with 70% of active additives successfully recovered with a precision of ~0.7 and specificity of 0.8. The model was further applied to screen a new library of aryl-alditol molecules which were then experimentally synthesized and tested with a success rate of 82%. Presented is the first computer-aided high-throughput experimental screening for novel IRI active compounds. PMID:27216585

  7. QSAR Accelerated Discovery of Potent Ice Recrystallization Inhibitors.

    PubMed

    Briard, Jennie G; Fernandez, Michael; De Luna, Phil; Woo, Tom K; Ben, Robert N

    2016-05-24

    Ice recrystallization is the main contributor to cell damage and death during the cryopreservation of cells and tissues. Over the past five years, many small carbohydrate-based molecules were identified as ice recrystallization inhibitors and several were shown to reduce cryoinjury during the cryopreservation of red blood cells (RBCs) and hematopoietic stem cells (HSCs). Unfortunately, clear structure-activity relationships have not been identified, impeding the rational design of future compounds possessing ice recrystallization inhibition (IRI) activity. A set of 124 previously synthesized compounds with known IRI activities was used to calibrate 3D-QSAR classification models using GRid INdependent Descriptors (GRIND) derived from DFT level quantum mechanical calculations. A partial least squares (PLS) model was calibrated with 70% of the data set and successfully identified 80% of the IRI active compounds with a precision of 0.8. This model exhibited good performance in screening the remaining 30% of the data set, with 70% of active additives successfully recovered with a precision of ~0.7 and specificity of 0.8. The model was further applied to screen a new library of aryl-alditol molecules which were then experimentally synthesized and tested with a success rate of 82%. Presented is the first computer-aided high-throughput experimental screening for novel IRI active compounds.

  8. Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach.

    PubMed

    Gómez-Bombarelli, Rafael; Aguilera-Iparraguirre, Jorge; Hirzel, Timothy D; Duvenaud, David; Maclaurin, Dougal; Blood-Forsythe, Martin A; Chae, Hyun Sik; Einzinger, Markus; Ha, Dong-Gwang; Wu, Tony; Markopoulos, Georgios; Jeon, Soonok; Kang, Hosuk; Miyazaki, Hiroshi; Numata, Masaki; Kim, Sunghan; Huang, Wenliang; Hong, Seong Ik; Baldo, Marc; Adams, Ryan P; Aspuru-Guzik, Alán

    2016-10-01

    Virtual screening is becoming a ground-breaking tool for molecular discovery due to the exponential growth of available computer time and constant improvement of simulation and machine learning techniques. We report an integrated organic functional material design process that incorporates theoretical insight, quantum chemistry, cheminformatics, machine learning, industrial expertise, organic synthesis, molecular characterization, device fabrication and optoelectronic testing. After exploring a search space of 1.6 million molecules and screening over 400,000 of them using time-dependent density functional theory, we identified thousands of promising novel organic light-emitting diode molecules across the visible spectrum. Our team collaboratively selected the best candidates from this set. The experimentally determined external quantum efficiencies for these synthesized candidates were as large as 22%.

  9. Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach

    NASA Astrophysics Data System (ADS)

    Gómez-Bombarelli, Rafael; Aguilera-Iparraguirre, Jorge; Hirzel, Timothy D.; Duvenaud, David; MacLaurin, Dougal; Blood-Forsythe, Martin A.; Chae, Hyun Sik; Einzinger, Markus; Ha, Dong-Gwang; Wu, Tony; Markopoulos, Georgios; Jeon, Soonok; Kang, Hosuk; Miyazaki, Hiroshi; Numata, Masaki; Kim, Sunghan; Huang, Wenliang; Hong, Seong Ik; Baldo, Marc; Adams, Ryan P.; Aspuru-Guzik, Alán

    2016-10-01

    Virtual screening is becoming a ground-breaking tool for molecular discovery due to the exponential growth of available computer time and constant improvement of simulation and machine learning techniques. We report an integrated organic functional material design process that incorporates theoretical insight, quantum chemistry, cheminformatics, machine learning, industrial expertise, organic synthesis, molecular characterization, device fabrication and optoelectronic testing. After exploring a search space of 1.6 million molecules and screening over 400,000 of them using time-dependent density functional theory, we identified thousands of promising novel organic light-emitting diode molecules across the visible spectrum. Our team collaboratively selected the best candidates from this set. The experimentally determined external quantum efficiencies for these synthesized candidates were as large as 22%.

  10. Around the macrolide - Impact of 3D structure of macrocycles on lipophilicity and cellular accumulation.

    PubMed

    Koštrun, Sanja; Munic Kos, Vesna; Matanović Škugor, Maja; Palej Jakopović, Ivana; Malnar, Ivica; Dragojević, Snježana; Ralić, Jovica; Alihodžić, Sulejman

    2017-06-16

    The aim of this study was to investigate lipophilicity and cellular accumulation of rationally designed azithromycin and clarithromycin derivatives at the molecular level. The effect of substitution site and substituent properties on the global physico-chemical profile and cellular accumulation of the investigated compounds was studied using calculated structural parameters as well as experimentally determined lipophilicity. In silico models based on the 3D structure of molecules were generated to investigate the conformational effect on the studied properties and to enable prediction of lipophilicity and cellular accumulation for this class of molecules based on non-empirical parameters. The applicability of the developed models was explored on validation and test sets and compared with previously developed empirical models. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  11. Evolution of the physicochemical properties of marketed drugs: can history foretell the future?

    PubMed

    Faller, Bernard; Ottaviani, Giorgio; Ertl, Peter; Berellini, Giuliano; Collis, Alan

    2011-11-01

    A set of diverse bioactive molecules, relevant from a medicinal chemistry viewpoint, was assembled and used to navigate the physicochemical property space of new and old, or traditional drugs against a larger set of 12,000 diverse bioactive small molecules. Most drugs on the market only occupy a fraction of the property space of the bioactive molecules, whereas new molecular entities (NMEs) approved since 2002 are moving away from this traditional drug space. In this new territory, semi-empirical rules derived from knowledge accumulated from historic, older molecules are not necessarily valid and different liabilities become more prominent. Copyright © 2011 Elsevier Ltd. All rights reserved.

  12. QSAR Classification Model for Antibacterial Compounds and Its Use in Virtual Screening

    DTIC Science & Technology

    2012-09-26

    test set molecules that were not used to train the models. This allowed us to more accurately estimate the prediction power of the models. As...pathogens and deposited in PubChem Bioassays. Ultimately, the main purpose of this model is to make predictions, based on known antibacterial and non...the model built from the remaining compounds is used to predict the left out compound. Once all the compounds pass through this cycle of prediction, a

  13. Which Method of Assigning Bond Orders in Lewis Structures Best Reflects Experimental Data? An Analysis of the Octet Rule and Formal Charge Systems for Period 2 and 3 Nonmetallic Compounds

    ERIC Educational Resources Information Center

    See, Ronald F.

    2009-01-01

    Two systems were evaluated for drawing Lewis structures of period 2 and 3 non-metallic compounds: the octet rule and minimization of formal charge. The test set of molecules consisted of the oxides, halides, oxohalides, oxoanions, and oxoacids of B, N, O, F, Al, P, S, and Cl. Bond orders were quantified using experimental data, including bond…

  14. A ligand-based comparative molecular field analysis (CoMFA) and homology model based molecular docking studies on 3', 4'-dihydroxyflavones as rat 5-lipoxygenase inhibitors: Design of new inhibitors.

    PubMed

    Ahamed, T K Shameera; Muraleedharan, K

    2017-12-01

    In this study, ligand-based comparative molecular field analysis (CoMFA) with five principal components was performed on a class of 3', 4'-dihydroxyflavone derivatives for potent rat 5-LOX inhibitors. The percentage contributions in building of the CoMFA model were 91.36% for steric field and 8.6% for electrostatic field. R² values for training and test sets were found to be 0.9320 and 0.8259, respectively. In case of LOO, LTO and LMO cross validation test, q² values were 0.6587, 0.6479 and 0.5547, respectively. These results indicate that the model has high statistical reliability and good predictive power. The extracted contour maps were used to identify the important regions where the modification was necessary to design a new molecule with improved activity. The study has developed a homology model for rat 5-LOX and recognized the key residues at the binding site. Docking of the most active molecule to the binding site of 5-LOX confirmed the stability and rationality of the CoMFA model. Based on molecular docking results and CoMFA contour plots, new inhibitors with higher activity with respect to the most active compound in the data set were designed. Copyright © 2017 Elsevier Ltd. All rights reserved.
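
    The difference between the fitted R² and the leave-one-out (LOO) cross-validated q² reported above can be illustrated with a short sketch. It assumes ordinary least squares as a stand-in for the PLS regression that underlies CoMFA, and it uses made-up descriptor values; the helper names are hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients [intercept, slopes...] (stand-in for PLS)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]


def r_squared(X, y):
    """Fitted R^2: the model sees every compound before predicting it."""
    y_hat = predict_ols(fit_ols(X, y), X)
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)


def q2_loo(X, y):
    """Leave-one-out q^2 = 1 - PRESS / SS_tot, each compound predicted while held out."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = fit_ols(X[keep], y[keep])
        preds[i] = predict_ols(coef, X[i:i + 1])[0]
    press = np.sum((y - preds) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


# Made-up single-descriptor data set, for illustration only.
X = np.arange(8, dtype=float).reshape(-1, 1)
y = 2.0 * X[:, 0] + 1.0 + np.array([0.3, -0.4, 0.2, 0.5, -0.6, 0.1, -0.2, 0.4])
print(f"R2 = {r_squared(X, y):.3f}, q2(LOO) = {q2_loo(X, y):.3f}")
```

    For least-squares fits q² never exceeds R² when both use the same total sum of squares, which is why a sizeable gap between the two is usually read as a sign of overfitting. Some authors compute the total sum of squares from the training-fold mean instead; that convention is not used here.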

  15. Low-lying excited states by constrained DFT.

    PubMed

    Ramos, Pablo; Pavanello, Michele

    2018-04-14

    Exploiting the machinery of Constrained Density Functional Theory (CDFT), we propose a variational method for calculating low-lying excited states of molecular systems. We dub this method eXcited CDFT (XCDFT). Excited states are obtained by self-consistently constraining a user-defined population of electrons, N_c, in the virtual space of a reference set of occupied orbitals. By imposing this population to be N_c = 1.0, we computed the first excited state of 15 molecules from a test set. Our results show that XCDFT achieves an accuracy in the predicted excitation energy only slightly worse than linear-response time-dependent DFT (TDDFT), but without incurring the problems of variational collapse typical of the more commonly adopted ΔSCF method. In addition, we selected a few challenging processes to test the limits of applicability of XCDFT. We find that in contrast to TDDFT, XCDFT is capable of reproducing energy surfaces featuring conical intersections (azobenzene and H3) with correct topology and correct overall energetics also away from the intersection. Venturing to condensed-phase systems, XCDFT reproduces the TDDFT solvatochromic shift of benzaldehyde when it is embedded in a cluster of water molecules. Thus, we find XCDFT to be a competitive method among single-reference methods for computations of excited states in terms of time to solution, rate of convergence, and accuracy of the result.

  16. New molecular descriptors based on local properties at the molecular surface and a boiling-point model derived from them.

    PubMed

    Ehresmann, Bernd; de Groot, Marcel J; Alex, Alexander; Clark, Timothy

    2004-01-01

    New molecular descriptors based on statistical descriptions of the local ionization potential, local electron affinity, and the local polarizability at the surface of the molecule are proposed. The significance of these descriptors has been tested by calculating them for the Maybridge database in addition to our set of 26 descriptors reported previously. The new descriptors show little correlation with those already in use. Furthermore, the principal components of the extended set of descriptors for the Maybridge data show that especially the descriptors based on the local electron affinity extend the variance in our set of descriptors, which we have previously shown to be relevant to physical properties. The first nine principal components are shown to be most significant. As an example of the usefulness of the new descriptors, we have set up a QSPR model for boiling points using both the old and new descriptors.

  17. Assessment of Linear Finite-Difference Poisson-Boltzmann Solvers

    PubMed Central

    Wang, Jun; Luo, Ray

    2009-01-01

    CPU time and memory usage are two vital issues that any numerical solvers for the Poisson-Boltzmann equation have to face in biomolecular applications. In this study we systematically analyzed the CPU time and memory usage of five commonly used finite-difference solvers with a large and diversified set of biomolecular structures. Our comparative analysis shows that modified incomplete Cholesky conjugate gradient and geometric multigrid are the most efficient in the diversified test set. For the two efficient solvers, our test shows that their CPU times increase approximately linearly with the numbers of grids. Their CPU times also increase almost linearly with the negative logarithm of the convergence criterion at very similar rate. Our comparison further shows that geometric multigrid performs better in the large set of tested biomolecules. However, modified incomplete Cholesky conjugate gradient is superior to geometric multigrid in molecular dynamics simulations of tested molecules. We also investigated other significant components in numerical solutions of the Poisson-Boltzmann equation. It turns out that the time-limiting step is the free boundary condition setup for the linear systems for the selected proteins if the electrostatic focusing is not used. Thus, development of future numerical solvers for the Poisson-Boltzmann equation should balance all aspects of the numerical procedures in realistic biomolecular applications. PMID:20063271

  18. Prediction of kinase-inhibitor binding affinity using energetic parameters

    PubMed Central

    Usha, Singaravelu; Selvaraj, Samuel

    2016-01-01

    The combination of physicochemical properties and energetic parameters derived from protein-ligand complexes plays a vital role in determining the biological activity of a molecule. In the present work, protein-ligand interaction energy along with logP values was used to predict the experimental log (IC50) values of 25 different kinase-inhibitors using multiple regression, which gave a correlation coefficient of 0.93. The regression equation obtained was tested on 93 kinase-inhibitor complexes and showed an average deviation of 0.92 from the experimental log IC50 values. The same set of descriptors was used to predict binding affinities for a test set of five individual kinase families, with correlation values > 0.9. We show that the protein-ligand interaction energies and partition coefficient values form the major deterministic factors for binding affinity of the ligand for its receptor. PMID:28149052
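
    A two-descriptor multiple regression of the kind described above, scored by correlation coefficient and average deviation, might look like the sketch below. All numbers are invented for illustration, and the function names are hypothetical; nothing here reproduces the published regression equation.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns [b0, b1, b2]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def evaluate(coef, X, y):
    """Return (Pearson r, mean absolute deviation) of predictions against y."""
    pred = coef[0] + X @ coef[1:]
    r = np.corrcoef(pred, y)[0, 1]
    mad = np.mean(np.abs(pred - y))
    return r, mad


# Hypothetical data: columns are interaction energy and logP.
X_train = np.array([[-45.0, 2.1], [-52.3, 3.0], [-38.7, 1.5],
                    [-60.1, 3.8], [-41.2, 2.4], [-55.6, 2.9]])
y_train = np.array([-0.60, -0.745, -0.475, -0.875, -0.31, -0.93])  # made-up log(IC50)

coef = fit_linear(X_train, y_train)
r_train, mad_train = evaluate(coef, X_train, y_train)
print(f"r = {r_train:.2f}, mean |deviation| = {mad_train:.3f}")
```

    Calling `evaluate` with the same coefficients on complexes that were not used in the fit gives the external-test figures, which is the more informative number for judging predictive ability.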

  19. Evaluation of genotoxicity testing of FDA approved large molecule therapeutics.

    PubMed

    Sawant, Satin G; Fielden, Mark R; Black, Kurt A

    2014-10-01

    Large molecule therapeutics (MW > 1000 daltons) are not expected to enter the cell and thus have reduced potential to interact directly with DNA or related physiological processes. Genotoxicity studies are therefore not relevant and typically not required for large molecule therapeutic candidates. Regulatory guidance supports this approach; however, there are examples of marketed large molecule therapeutics where sponsors have conducted genotoxicity studies. A retrospective analysis was performed on genotoxicity studies of United States FDA approved large molecule therapeutics since 1998 identified through the Drugs@FDA website. This information was used to provide a data-driven rationale for genotoxicity evaluations of large molecule therapeutics. Fifty-three of the 99 therapeutics identified were tested for genotoxic potential. None of the therapeutics tested showed a positive outcome in any study except the peptide glucagon (GlucaGen®), which showed equivocal in vitro results, as stated in the product labeling. Scientific rationale and data from this review indicate that testing of a majority of large molecule modalities does not add value to risk assessment, and support current regulatory guidance. Similarly, the data do not support testing of peptides containing only natural amino acids. Peptides containing non-natural amino acids and small molecules in conjugated products may need to be tested. Copyright © 2014 Elsevier Inc. All rights reserved.

  20. Introducing a new bond reactivity index: Philicities for natural bond orbitals.

    PubMed

    Sánchez-Márquez, Jesús; Zorrilla, David; García, Víctor; Fernández, Manuel

    2017-12-22

    In the present work, a new methodology defined for obtaining reactivity indices (philicities) is proposed. This is based on reactivity functions such as the Fukui function or the dual descriptor, and makes it possible to project the information from reactivity functions onto molecular orbitals, instead of onto the atoms of the molecule (atomic reactivity indices). The methodology focuses on the molecules' natural bond orbitals (bond reactivity indices) because these orbitals have the advantage of being localized, allowing the reaction site of an electrophile or nucleophile to be determined within a very precise molecular region. This methodology provides a "philicity" index for every NBO, and a representative set of molecules has been used to test the new definition. A new methodology has also been developed to compare the "finite difference" and the "frontier molecular orbital" approximations. To facilitate their use, the proposed methodology as well as the possibility of calculating the new indices have been implemented in a new version of UCA-FUKUI software. In addition, condensation schemes based on atomic populations of the "atoms in molecules" theory, the Hirshfeld population analysis, the approximation of Mulliken (with a minimal basis set) and electrostatic potential-derived charges have also been implemented, including the calculation of "bond reactivity indices" defined in previous studies. Graphical abstract A new methodology defined for obtaining bond reactivity indices (philicities) is proposed and makes it possible to project the information from reactivity functions onto molecular orbitals. The proposed methodology as well as the possibility of calculating the new indices have been implemented in a new version of UCA-FUKUI software. In addition, this version can use new atomic condensation schemes and new "utilities" have also been included in this second version.

  1. Machine Learning Estimation of Atom Condensed Fukui Functions.

    PubMed

    Zhang, Qingyou; Zheng, Fangfang; Zhao, Tanfeng; Qu, Xiaohui; Aires-de-Sousa, João

    2016-02-01

    To enable the fast estimation of atom condensed Fukui functions, machine learning algorithms were trained with databases of DFT pre-calculated values for ca. 23,000 atoms in organic molecules. The problem was approached as the ranking of atom types with the Bradley-Terry (BT) model, and as the regression of the Fukui function. Random Forests (RF) were trained to predict the condensed Fukui function, to rank atoms in a molecule, and to classify atoms as high/low Fukui function. Atomic descriptors were based on counts of atom types in spheres around the kernel atom. The BT coefficients assigned to atom types enabled the identification (93-94 % accuracy) of the atom with the highest Fukui function in pairs of atoms in the same molecule with differences ≥0.1. In whole molecules, the atom with the top Fukui function could be recognized in ca. 50 % of the cases and, on average, about 3 of the top 4 atoms could be recognized in a shortlist of 4. Regression RF yielded predictions for test sets with R² = 0.68-0.69, improving the ability of BT coefficients to rank atoms in a molecule. Atom classification (as high/low Fukui function) was obtained with RF with sensitivity of 55-61 % and specificity of 94-95 %. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
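
    The Bradley-Terry model mentioned above assigns each item a positive strength so that item i outranks item j with probability p_i / (p_i + p_j). The sketch below fits those strengths from pairwise outcomes with the standard minorization-maximization update; it is a generic illustration, not the authors' implementation, and the `wins` matrix is made up.

```python
def bradley_terry(wins, n_iter=200):
    """Fit Bradley-Terry strengths from a square matrix of pairwise outcomes.

    wins[i][j] is the number of comparisons in which item i ranked above item j.
    Returns strengths normalized to sum to 1. Assumes at least one recorded win overall.
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in range(n) if j != i)
            # Items never compared keep their current strength.
            updated.append(total_wins / denom if denom > 0 else p[i])
        norm = sum(updated)
        p = [value / norm for value in updated]
    return p


# Hypothetical counts for three atom types; rows beat columns.
wins = [[0, 8, 9],
        [2, 0, 6],
        [1, 4, 0]]
strengths = bradley_terry(wins)
ranking = sorted(range(len(strengths)), key=lambda k: strengths[k], reverse=True)
print(strengths, ranking)
```

    In the setting of the abstract, an "item" would be an atom type and a "win" would mean having the higher Fukui function within a pair of atoms from the same molecule; that mapping is an interpretation of the text rather than a documented detail.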

  2. Projected Hybrid Orbitals: A General QM/MM Method

    PubMed Central

    2015-01-01

    A projected hybrid orbital (PHO) method was described to model the covalent boundary in a hybrid quantum mechanical and molecular mechanical (QM/MM) system. The PHO approach can be used in ab initio wave function theory and in density functional theory with any basis set without introducing system-dependent parameters. In this method, a secondary basis set on the boundary atom is introduced to formulate a set of hybrid atomic orbitals. The primary basis set on the boundary atom used for the QM subsystem is projected onto the secondary basis to yield a representation that provides a good approximation to the electron-withdrawing power of the primary basis set to balance electronic interactions between QM and MM subsystems. The PHO method has been tested on a range of molecules and properties. Comparison with results obtained from QM calculations on the entire system shows that the present PHO method is a robust and balanced QM/MM scheme that preserves the structural and electronic properties of the QM region. PMID:25317748

  3. Conserved water molecules in bacterial serine hydroxymethyltransferases.

    PubMed

    Milano, Teresa; Di Salvo, Martino Luigi; Angelaccio, Sebastiana; Pascarella, Stefano

    2015-10-01

    Water molecules occurring in the interior of protein structures often are endowed with key structural and functional roles. We report the results of a systematic analysis of conserved water molecules in bacterial serine hydroxymethyltransferases (SHMTs). SHMTs are an important group of pyridoxal-5'-phosphate-dependent enzymes that catalyze the reversible conversion of l-serine and tetrahydropteroylglutamate to glycine and 5,10-methylenetetrahydropteroylglutamate. The approach utilized in this study relies on two programs, ProACT2 and WatCH. The first software is able to categorize water molecules in a protein crystallographic structure as buried, positioned in clefts or at the surface. The other program finds, in a set of superposed homologous proteins, water molecules that occur approximately in equivalent position in each of the considered structures. These groups of molecules are referred to as 'clusters' and represent structurally conserved water molecules. Several conserved clusters of buried or cleft water molecules were found in the set of 11 bacterial SHMTs we took into account for this work. The majority of these clusters were not described previously. Possible structural and functional roles for the conserved water molecules are envisaged. This work provides a map of the conserved water molecules helpful for deciphering SHMT mechanism and for rational design of molecular engineering experiments. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  4. Many Molecular Properties from One Kernel in Chemical Space

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ramakrishnan, Raghunathan; von Lilienfeld, O. Anatole

    We introduce property-independent kernels for machine learning modeling of arbitrarily many molecular properties. The kernels encode molecular structures for training sets of varying size, as well as similarity measures sufficiently diffuse in chemical space to sample over all training molecules. Corresponding molecular reference properties provided, they enable the instantaneous generation of ML models which can systematically be improved through the addition of more data. This idea is exemplified for single kernel based modeling of internal energy, enthalpy, free energy, heat capacity, polarizability, electronic spread, zero-point vibrational energy, energies of frontier orbitals, HOMO-LUMO gap, and the highest fundamental vibrational wavenumber. Models of these properties are trained and tested using 112 kilo organic molecules of similar size. Resulting models are discussed as well as the kernels’ use for generating and using other property models.

  5. Chemical reactivity indices for the complete series of chlorinated benzenes: solvent effect.

    PubMed

    Padmanabhan, J; Parthasarathi, R; Subramanian, V; Chattaraj, P K

    2006-03-02

    We present a comprehensive analysis to probe the effect of solvation on the reactivity of the complete series of chlorobenzenes through the conceptual density functional theory (DFT)-based global and local descriptors. We propose a multiphilic descriptor in this study to explore the nature of attack at a particular site in a molecule. It is defined as the difference between nucleophilic and electrophilic condensed philicity functions. This descriptor is capable of explaining both the nucleophilicity and electrophilicity of the given atomic sites in the molecule simultaneously. The predictive ability of this descriptor is tested on the complete series of chlorobenzenes in gas and solvent media. A structure-toxicity analysis of these entire sets of chlorobenzenes toward aquatic organisms demonstrates the importance of the electrophilicity index in the prediction of the reactivity/toxicity.
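
    The definition above can be sketched in a few lines. This is a minimal illustration, assuming the condensed philicity at site k is the global electrophilicity index ω multiplied by the condensed Fukui function for the corresponding type of attack; the function name and all numerical values are hypothetical and are not taken from the paper.

```python
def multiphilic_descriptor(omega, f_plus, f_minus):
    """Return omega * (f_k^+ - f_k^-) for every atomic site k.

    omega   : global electrophilicity index (assumed given, e.g. in eV)
    f_plus  : condensed Fukui functions for nucleophilic attack
    f_minus : condensed Fukui functions for electrophilic attack
    """
    if len(f_plus) != len(f_minus):
        raise ValueError("f_plus and f_minus must have the same length")
    return [omega * (fp - fm) for fp, fm in zip(f_plus, f_minus)]


# Hypothetical three-site molecule (illustrative numbers only).
omega = 1.2
f_plus = [0.50, 0.30, 0.20]
f_minus = [0.10, 0.30, 0.60]

delta_omega = multiphilic_descriptor(omega, f_plus, f_minus)
for site, value in enumerate(delta_omega):
    print(f"site {site}: multiphilic descriptor = {value:+.3f}")
```

    Under this sign convention a positive value marks a site that favors nucleophilic attack and a negative value a site that favors electrophilic attack. Because each set of condensed Fukui functions sums to one, the descriptor sums to zero over the molecule.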

  6. Molecular engineering and measurements to test hypothesized mechanisms in single molecule conductance switching.

    PubMed

    Moore, Amanda M; Dameron, Arrelaine A; Mantooth, Brent A; Smith, Rachel K; Fuchs, Daniel J; Ciszek, Jacob W; Maya, Francisco; Yao, Yuxing; Tour, James M; Weiss, Paul S

    2006-02-15

    Six customized phenylene-ethynylene-based oligomers have been studied for their electronic properties using scanning tunneling microscopy to test hypothesized mechanisms of stochastic conductance switching. Previously suggested mechanisms include functional group reduction, functional group rotation, backbone ring rotation, neighboring molecule interactions, bond fluctuations, and hybridization changes. Here, we test these hypotheses experimentally by varying the molecular designs of the switches; the ability of the molecules to switch via each hypothetical mechanism is selectively engineered into or out of each molecule. We conclude that hybridization changes at the molecule-surface interface are responsible for the switching we observe.

  7. A systematization of spectral data on the methanol molecule

    NASA Astrophysics Data System (ADS)

    Akhlyostin, A. Yu.; Voronina, S. S.; Lavrentiev, N. A.; Privezentsev, A. I.; Rodimova, O. B.; Fazliev, A. Z.

    2015-11-01

    Problems underlying a systematization of spectral data on the methanol molecule are formulated. Data on the energy levels and vacuum wavenumbers acquired from the published literature are presented in the form of information sources imported into the W@DIS information system. Sets of quantum numbers and labels used to describe the CH3OH molecular states are analyzed. The set of labels is different from universally accepted sets. A system of importing the data sources into W@DIS is outlined. The structure of databases characterizing transitions in an isolated CH3OH molecule is introduced and a digital library of the relevant published literature is discussed. A brief description is given of an imported data quality analysis and representation of the results obtained in the form of ontologies for subsequent computer processing.

  8. Experimental and computational prediction of glass transition temperature of drugs.

    PubMed

    Alzghoul, Ahmad; Alhalaweh, Amjad; Mahlin, Denny; Bergström, Christel A S

    2014-12-22

    Glass transition temperature (Tg) is an important inherent property of an amorphous solid material which is usually determined experimentally. In this study, the relation between Tg and melting temperature (Tm) was evaluated using a data set of 71 structurally diverse druglike compounds. Further, in silico models for prediction of Tg were developed based on calculated molecular descriptors and linear (multilinear regression, partial least-squares, principal component regression) and nonlinear (neural network, support vector regression) modeling techniques. The models based on Tm predicted Tg with an RMSE of 19.5 K for the test set. Among the five computational models developed herein the support vector regression gave the best result with RMSE of 18.7 K for the test set using only four chemical descriptors. Hence, two different models that predict Tg of drug-like molecules with high accuracy were developed. If Tm is available, a simple linear regression can be used to predict Tg. However, the results also suggest that support vector regression and calculated molecular descriptors can predict Tg with equal accuracy, already before compound synthesis.
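
    The simple linear route from Tm to Tg, and the RMSE used to judge it, can be sketched as follows. This is only an illustration of ordinary least squares with a held-out test set; the temperatures below are hypothetical placeholders, not the 71-compound data set, and the fitted coefficients carry no physical meaning.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(observed, predicted):
    """Root-mean-square error between two equally long sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical melting and glass transition temperatures in kelvin.
tm_train = [380.0, 410.0, 445.0, 470.0, 505.0]
tg_train = [275.0, 300.0, 318.0, 345.0, 362.0]
tm_test = [395.0, 460.0]
tg_test = [290.0, 330.0]

slope, intercept = fit_line(tm_train, tg_train)
tg_pred = [slope * tm + intercept for tm in tm_test]
test_rmse = rmse(tg_test, tg_pred)
print(f"Tg = {slope:.3f} * Tm + {intercept:.1f} K, test RMSE = {test_rmse:.1f} K")
```

    Reporting the error on compounds that were not used in the fit, as the abstract does, is what makes the RMSE a measure of predictive rather than descriptive accuracy.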

  9. PAREMD: A parallel program for the evaluation of momentum space properties of atoms and molecules

    NASA Astrophysics Data System (ADS)

    Meena, Deep Raj; Gadre, Shridhar R.; Balanarayan, P.

    2018-03-01

    The present work describes a code for evaluating the electron momentum density (EMD), its moments and the associated Shannon information entropy for a multi-electron molecular system. The code works specifically for electronic wave functions obtained from traditional electronic structure packages such as GAMESS and GAUSSIAN. For the momentum space orbitals, the general expression for Gaussian basis sets in position space is analytically Fourier transformed to momentum space Gaussian basis functions. The molecular orbital coefficients of the wave function are taken as an input from the output file of the electronic structure calculation. The analytic expressions of EMD are evaluated over a fine grid and the accuracy of the code is verified by a normalization check and a numerical kinetic energy evaluation which is compared with the analytic kinetic energy given by the electronic structure package. Apart from electron momentum density, electron density in position space has also been integrated into this package. The program is written in C++ and is executed through a Shell script. It is also tuned for multicore machines with shared memory through OpenMP. The program has been tested for a variety of molecules and correlated methods such as CISD, Møller-Plesset second order (MP2) theory and density functional methods. For correlated methods, the PAREMD program uses natural spin orbitals as an input. The program has been benchmarked for a variety of Gaussian basis sets for different molecules showing a linear speedup on a parallel architecture.
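
    The two verification steps mentioned above (a normalization check and a kinetic energy check) can be illustrated on the simplest possible case. The sketch below is not PAREMD code: it assumes a single normalized s-type Gaussian exp(-αr²) in atomic units, whose Fourier transform gives the momentum density (2πα)^(-3/2) exp(-p²/(2α)), and compares numerical radial integrals with the analytic values 1 and 3α/2. Function names and the exponent value are chosen for illustration.

```python
import math


def emd_s_gaussian(p, alpha):
    """Momentum density of a normalized s-type Gaussian exp(-alpha * r^2)."""
    return (2.0 * math.pi * alpha) ** -1.5 * math.exp(-p * p / (2.0 * alpha))


def radial_integral(f, p_max, n):
    """Trapezoidal estimate of the integral of 4*pi*p^2*f(p) from 0 to p_max."""
    h = p_max / n
    total = 0.0
    for i in range(n + 1):
        p = i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * 4.0 * math.pi * p * p * f(p)
    return total * h


alpha = 0.8                          # hypothetical Gaussian exponent
p_max = 12.0 * math.sqrt(alpha)      # density is negligible beyond this cutoff

norm = radial_integral(lambda p: emd_s_gaussian(p, alpha), p_max, 4000)
kinetic = radial_integral(
    lambda p: 0.5 * p * p * emd_s_gaussian(p, alpha), p_max, 4000
)

print(f"normalization  = {norm:.8f} (analytic 1)")
print(f"kinetic energy = {kinetic:.8f} (analytic {1.5 * alpha:.8f})")
```

    For a real wave function the same two integrals would be taken over the full three-dimensional grid, and the kinetic energy would be compared with the value printed by the electronic structure package.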

  10. Guest Molecule Exchange Kinetics for the 2012 Ignik Sikumi Gas Hydrate Field Trial

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, Mark D.; Lee, Won Suk

    A commercially viable technology for producing methane from natural gas hydrate reservoirs remains elusive. Short-term depressurization field tests have demonstrated the potential for producing natural gas via dissociation of the clathrate structure, but the long-term performance of the depressurization technology ultimately requires a heat source to sustain the dissociation. A decade of laboratory experiments and theoretical studies have demonstrated the exchange of pure CO2 and N2-CO2 mixtures with CH4 in sI gas hydrates, yielding critical information about molecular mechanisms, recoveries, and exchange kinetics. Findings indicated the potential for producing natural gas with little to no production of water and rapid exchange kinetics, generating sufficient interest in the guest-molecule exchange technology for a field test. In 2012 the U.S. DOE/NETL, ConocoPhillips Company, and Japan Oil, Gas and Metals National Corporation jointly sponsored the first field trial of injecting a mixture of N2-CO2 into a CH4-hydrate bearing formation beneath the permafrost on the Alaska North Slope. Known as the Ignik Sikumi #1 Gas Hydrate Field Trial, this experiment involved three stages: 1) the injection of a N2-CO2 mixture into a targeted hydrate-bearing layer, 2) a 4-day pressurized soaking period, and 3) a sustained depressurization and fluid production period. Data collected during the three stages of the field trial were made available after an extensive quality check. These data included continuous temperature and pressure logs, injected and recovered fluid compositions and volumes. The Ignik Sikumi #1 data set is extensive, but contains no direct evidence of the guest-molecule exchange process. This investigation is directed at using numerical simulation to provide an interpretation of the collected data. A numerical simulator, STOMP-HYDT-KE, was recently completed that solves conservation equations for energy, water, mobile fluid guest molecules, and hydrate guest molecules, for up to three gas hydrate guest molecules: CH4, CO2, and N2. The independent tracking of mobile fluid and hydrate guest molecules allows for the kinetic exchange of guest molecules between the mobile fluids and hydrate. The particular interest of this numerical investigation is to determine whether kinetic exchange parameters, determined from laboratory-scale experiments, are directly applicable to interpreting the Ignik Sikumi #1 data.

  11. Extended Fenske-Hall calculation of inner-shell binding energies using (Z + 1)-basis sets: Sulfur-containing molecules

    NASA Astrophysics Data System (ADS)

    Zwanziger, Ch.; Zwanziger, H.; Szargan, R.; Reinhold, J.

    1981-08-01

    It is shown that the S1s and S2p binding energies and their chemical shifts in the molecules H2S, SO2, SF6 and COS obtained with hole-state calculations using an extended Fenske-Hall method are in good agreement with experimental values if mixed (Z + 1)-basis sets are applied.

  12. Combining solvent thermodynamic profiles with functionality maps of the Hsp90 binding site to predict the displacement of water molecules.

    PubMed

    Haider, Kamran; Huggins, David J

    2013-10-28

    Intermolecular interactions in the aqueous phase must compete with the interactions between the two binding partners and their solvating water molecules. In biological systems, water molecules in protein binding sites cluster at well-defined hydration sites and can form strong hydrogen-bonding interactions with backbone and side-chain atoms. Displacement of such water molecules is only favorable when the ligand can form strong compensating hydrogen bonds. Conversely, water molecules in hydrophobic regions of protein binding sites make only weak interactions, and the requirements for favorable displacement are less stringent. The propensity of water molecules for displacement can be identified using inhomogeneous fluid solvation theory (IFST), a statistical mechanical method that decomposes the solvation free energy of a solute into the contributions from different spatial regions and identifies potential binding hotspots. In this study, we employed IFST to study the displacement of water molecules from the ATP binding site of Hsp90, using a test set of 103 ligands. The predicted contribution of a hydration site to the hydration free energy was found to correlate well with the observed displacement. Additionally, we investigated if this correlation could be improved by using the energetic scores of favorable probe groups binding at the location of hydration sites, derived from a multiple copy simultaneous search (MCSS) method. The probe binding scores were not highly predictive of the observed displacement and did not improve the predictivity when used in combination with IFST-based hydration free energies. The results show that IFST alone can be used to reliably predict the observed displacement of water molecules in Hsp90. However, MCSS can augment IFST calculations by suggesting which functional groups should be used to replace highly displaceable water molecules. Such an approach could be very useful in improving the hit-to-lead process for new drug targets.

  13. Methods And Devices For Characterizing Duplex Nucleic Acid Molecules

    DOEpatents

    Akeson, Mark; Vercoutere, Wenonah; Haussler, David; Winters-Hilt, Stephen

    2005-08-30

    Methods and devices are provided for characterizing a duplex nucleic acid, e.g., a duplex DNA molecule. In the subject methods, a fluid conducting medium that includes a duplex nucleic acid molecule is contacted with a nanopore under the influence of an applied electric field and the resulting changes in current through the nanopore caused by the duplex nucleic acid molecule are monitored. The observed changes in current through the nanopore are then employed as a set of data values to characterize the duplex nucleic acid, where the set of data values may be employed in raw form or manipulated, e.g., into a current blockade profile. Also provided are nanopore devices for practicing the subject methods, where the subject nanopore devices are characterized by the presence of an algorithm which directs a processing means to employ monitored changes in current through a nanopore to characterize a duplex nucleic acid molecule responsible for the current changes. The subject methods and devices find use in a variety of applications, including, among other applications, the identification of an analyte duplex DNA molecule in a sample, the specific base sequence at a single nucleotide polymorphism (SNP), and the sequencing of duplex DNA molecules.

  14. Molecular Dynamics Calculations of Optical Nonlinear Properties of Materials

    DTIC Science & Technology

    1991-12-20

    by saturating the hydrogens with five sets each of d and p functions with exponents of 1.0, 0.5, 0.25, 0.125, 0.0625 but for a molecule like ASH 3...of d polarization functions using the exponents suggested by Dykstra et al. A similar calculation was also performed in which a second diffuse p set...one set each of d and p functions with exponents of 0.05 as suggested by DuPuis et al. for larger molecules was used. There was a loss in & of only

  15. Computational tests of quantum chemical models for excited and ionized states of molecules with phosphorus and sulfur atoms.

    PubMed

    Hahn, David K; RaghuVeer, Krishans; Ortiz, J V

    2014-05-15

    Time-dependent density functional theory (TD-DFT) and electron propagator theory (EPT) are used to calculate the electronic transition energies and ionization energies, respectively, of species containing phosphorus or sulfur. The accuracy of TD-DFT and EPT, in conjunction with various basis sets, is assessed with data from gas-phase spectroscopy. TD-DFT is tested using 11 prominent exchange-correlation functionals on a set of 37 vertical and 19 adiabatic transitions. For vertical transitions, TD-CAM-B3LYP calculations performed with the MG3S basis set are lowest in overall error, having a mean absolute deviation from experiment of 0.22 eV, or 0.23 eV over valence transitions and 0.21 eV over Rydberg transitions. Using a larger basis set, aug-pc3, improves accuracy over the valence transitions via hybrid functionals, but improved accuracy over the Rydberg transitions is only obtained via the BMK functional. For adiabatic transitions, all hybrid functionals paired with the MG3S basis set perform well, and B98 is best, with a mean absolute deviation from experiment of 0.09 eV. The testing of EPT used the Outer Valence Green's Function (OVGF) approximation and the Partial Third Order (P3) approximation on 37 vertical first ionization energies. It is found that OVGF outperforms P3 when basis sets of at least triple-ζ quality in the polarization functions are used. The largest basis set used in this study, aug-pc3, obtained the best mean absolute error from both methods: 0.08 eV for OVGF and 0.18 eV for P3. The OVGF/6-31+G(2df,p) level of theory is particularly cost-effective, yielding a mean absolute error of 0.11 eV.

  16. Large-Scale Validation of Mixed-Solvent Simulations to Assess Hotspots at Protein-Protein Interaction Interfaces.

    PubMed

    Ghanakota, Phani; van Vlijmen, Herman; Sherman, Woody; Beuming, Thijs

    2018-04-23

    The ability to target protein-protein interactions (PPIs) with small molecule inhibitors offers great promise in expanding the druggable target space and addressing a broad range of untreated diseases. However, due to their nature and function of interacting with protein partners, PPI interfaces tend to extend over large surfaces without the typical pockets of enzymes and receptors. These features present unique challenges for small molecule inhibitor design. As such, determining whether a particular PPI of interest could be pursued with a small molecule discovery strategy requires an understanding of the characteristics of the PPI interface and whether it has hotspots that can be leveraged by small molecules to achieve desired potency. Here, we assess the ability of mixed-solvent molecular dynamics (MSMD) simulations to detect hotspots at PPI interfaces. MSMD simulations using three cosolvents (acetonitrile, isopropanol, and pyrimidine) were performed on a large test set of 21 PPI targets that have been experimentally validated by small molecule inhibitors. We compare MSMD, which includes explicit solvent and full protein flexibility, to a simpler approach that does not include dynamics or explicit solvent (SiteMap) and find that MSMD simulations reveal additional information about the characteristics of these targets and the ability for small molecules to inhibit the PPI interface. In the few cases where MSMD simulations did not detect hotspots, we explore the shortcomings of this technique and propose future improvements. Finally, using Interleukin-2 as an example, we highlight the advantage of the MSMD approach for detecting transient cryptic druggable pockets that exist at PPI interfaces.

  17. Electronegativity equalization method: parameterization and validation for organic molecules using the Merz-Kollman-Singh charge distribution scheme.

    PubMed

    Jirousková, Zuzana; Vareková, Radka Svobodová; Vanek, Jakub; Koca, Jaroslav

    2009-05-01

    The electronegativity equalization method (EEM) was developed by Mortier et al. as a semiempirical method based on the density-functional theory. After parameterization, in which EEM parameters A(i), B(i), and adjusting factor kappa are obtained, this approach can be used for calculation of average electronegativity and charge distribution in a molecule. The aim of this work is to perform the EEM parameterization using the Merz-Kollman-Singh (MK) charge distribution scheme obtained from B3LYP/6-31G* and HF/6-31G* calculations. To achieve this goal, we selected a set of 380 organic molecules from the Cambridge Structural Database (CSD) and used the methodology that was recently successfully applied to EEM parameterization to calculate the HF/STO-3G Mulliken charges on large sets of molecules. In the case of B3LYP/6-31G* MK charges, we have improved the EEM parameters for already parameterized elements, specifically C, H, N, O, and F. Moreover, EEM parameters for S, Br, Cl, and Zn, which have not as yet been parameterized for this level of theory and basis set, were also developed. In the case of HF/6-31G* MK charges, we have developed the EEM parameters for C, H, N, O, S, Br, Cl, F, and Zn that have not been parameterized for this level of theory and basis set so far. The obtained EEM parameters were verified by a previously developed validation procedure and used for the charge calculation on a different set of 116 organic molecules from the CSD. The calculated EEM charges are in very good agreement with the quantum mechanically obtained ab initio charges. 2008 Wiley Periodicals, Inc.
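
    Once the parameters are known, the charge calculation reduces to one small linear system. The sketch below assumes the commonly quoted EEM form, in which every atom i satisfies A(i) + B(i)·q(i) + kappa·Σ q(j)/R(ij) = average electronegativity, together with the constraint that the charges sum to the total molecular charge. The function names, the two-atom geometry, and all parameter values are hypothetical and are not the parameters derived in the paper.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular EEM matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def eem_charges(coords, a_params, b_params, kappa, total_charge=0.0):
    """Return (charges, average electronegativity) for one molecule."""
    n = len(coords)
    matrix = []
    for i in range(n):
        row = [
            b_params[i] if i == j else kappa / math.dist(coords[i], coords[j])
            for j in range(n)
        ]
        row.append(-1.0)              # coefficient of the unknown electronegativity
        matrix.append(row)
    matrix.append([1.0] * n + [0.0])  # charge conservation constraint
    rhs = [-a for a in a_params] + [total_charge]
    solution = solve_linear(matrix, rhs)
    return solution[:n], solution[n]


# Hypothetical diatomic with made-up per-atom parameters.
coords = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)]
a_params = [2.4, 2.6]
b_params = [0.9, 1.1]
kappa = 0.5

charges, chi = eem_charges(coords, a_params, b_params, kappa)
print("charges:", [round(q, 4) for q in charges], "electronegativity:", round(chi, 4))
```

    In this toy case the atom with the smaller A value ends up with the positive charge. Parameterization, as described in the abstract, is the inverse problem: choosing A(i), B(i), and kappa so that charges computed this way reproduce the reference quantum mechanical charges.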

  18. The accuracy of ab initio calculations without ab initio calculations for charged systems: Kriging predictions of atomistic properties for ions in aqueous solutions

    NASA Astrophysics Data System (ADS)

    Di Pasquale, Nicodemo; Davie, Stuart J.; Popelier, Paul L. A.

    2018-06-01

    Using the machine learning method kriging, we predict the energies of atoms in ion-water clusters, consisting of either Cl- or Na+ surrounded by a number of water molecules (i.e., without Na+Cl- interaction). These atomic energies are calculated following the topological energy partitioning method called Interacting Quantum Atoms (IQAs). Kriging predicts atomic properties (in this case IQA energies) by a model that has been trained over a small set of geometries with known property values. The results presented here are part of the development of an advanced type of force field, called FFLUX, which offers quantum mechanical information to molecular dynamics simulations without the limiting computational cost of ab initio calculations. The results reported for the prediction of the IQA components of the energy in the test set exhibit an accuracy of a few kJ/mol, corresponding to an average error of less than 5%, even when a large cluster of water molecules surrounding an ion is considered. Ions represent an important chemical system and this work shows that they can be correctly taken into account in the framework of the FFLUX force field.

  19. RigFit: a new approach to superimposing ligand molecules.

    PubMed

    Lemmen, C; Hiller, C; Lengauer, T

    1998-09-01

    If structural knowledge of a receptor under consideration is lacking, drug design approaches focus on similarity or dissimilarity analysis of putative ligands. In this context the mutual ligand superposition is of utmost importance. Methods that are rapid enough to facilitate interactive usage, that allow sets of conformers to be processed and that enable database screening are of special interest here. The ability to superpose molecular fragments instead of entire molecules has proven to be helpful too. The RIGFIT approach meets these requirements and has several additional advantages. In three distinct test applications, we evaluated how closely we can approximate the observed relative orientation for a set of known crystal structures, we employed RIGFIT as a fragment placement procedure, and we performed a fragment-based database screening. The run time of RIGFIT can be traded off against its accuracy. To be competitive in accuracy with another state-of-the-art alignment tool, with which we compare our method explicitly, computing times of about 6 s per superposition on a common day workstation are required. If longer run times can be afforded the accuracy increases significantly. RIGFIT is part of the flexible superposition software FLEXS which can be accessed on the WWW [http://cartan.gmd.de/FlexS].

  20. aCLIMAX 4.0.1, The new version of the software for analyzing and interpreting INS spectra

    NASA Astrophysics Data System (ADS)

    Ramirez-Cuesta, A. J.

    2004-03-01

    In Inelastic Neutron Scattering Spectroscopy, the neutron scattering intensity is plotted versus neutron energy loss giving a spectrum that looks like an infrared or a Raman spectrum. Unlike IR or Raman, INS does not have selection rules, i.e. all transitions are in principle observable. This particular characteristic makes INS a test bed for Density Functional Theory calculations of vibrational modes. aCLIMAX is the first user friendly program, within the Windows environment, that uses the output of normal modes to generate the calculated INS of the model molecule, making it a lot easier to establish a connection between theory and experiment. Program summary. Title of program: aCLIMAX 4.0.1. Catalogue identifier: ADSW. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADSW. Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland. Operating systems: Windows 95 onwards, except Windows ME where it does not work. Programming language used: Visual Basic. Memory requirements: 64 MB. No. of processors: 1. Has the code been parallelized: No. No. of bytes in distributed program, including test data, etc.: 2 432 775. No. of lines in distributed program, including test data, etc.: 17 998. Distribution format: tar gzip file. Nature of physical problem: Calculation of the Inelastic Neutron Scattering Spectra from DFT calculations of the vibrational density of states for molecules. Method of solution: INS spectral intensity calculated from normal modes analysis. Isolated molecule approximation. Typical time of running: From a few seconds to a few minutes depending on the size of the molecule. Unusual features of the program: Special care has to be taken in the case of computers that have different regional options than the English speaking countries; the decimal separator has to be set as "." (dot) instead of the usual "," (comma) that most countries use.

  1. Hybrid Correlation Energy (HyCE): An Approach Based on Separate Evaluations of Internal and External Components.

    PubMed

    Ivanic, Joseph; Schmidt, Michael W

    2018-06-04

    A novel hybrid correlation energy (HyCE) approach is proposed that determines the total correlation energy via distinct computation of its internal and external components. This approach evolved from two related studies. First, rigorous assessment of the accuracies and size extensivities of a number of electron correlation methods, that include perturbation theory (PT2), coupled-cluster (CC), configuration interaction (CI), and coupled electron pair approximation (CEPA), shows that the CEPA(0) variant of the latter and triples-corrected CC methods consistently perform very similarly. These findings were obtained by comparison to near full CI results for four small molecules and by charting recovered correlation energies for six steadily growing chain systems. Second, by generating valence virtual orbitals (VVOs) and utilizing the CEPA(0) method, we were able to partition total correlation energies into internal (or nondynamic) and external (or dynamic) parts for the aforementioned six chain systems and a benchmark test bed of 36 molecules. When using triple-ζ basis sets it was found that per orbital internal correlation energies were appreciably larger than per orbital external energies and that the former showed far more chemical variation than the latter. Additionally, accumulations of external correlation energies were seen to proceed smoothly, and somewhat linearly, as the virtual space is gradually increased. Combination of these two studies led to development of the HyCE approach, whereby the internal and external correlation energies are determined separately by CEPA(0)/VVO and PT2/external calculations, respectively. When applied to the six chain systems and the 36-molecule benchmark test set it was found that HyCE energies followed closely those of triples-corrected CC and CEPA(0) while easily outperforming MP2 and CCSD. The success of the HyCE approach is more notable when considering that its cost is only slightly more than MP2 and significantly cheaper than the CC approaches.

  2. Development of a Simple Electron Transfer and Polarization Model and Its Application to Biological Systems.

    PubMed

    Diller, David J

    2017-01-10

    Here we present a new method for point charge calculation which we call QET (charges by electron transfer). The intent of this work is to develop a method that can be useful for studying charge transfer in large biological systems. It is based on the intuitive framework of the QEQ method with the key difference being that the QET method tracks all pairwise electron transfers by augmenting the QEQ pseudoenergy function with a distance dependent cost function for each electron transfer. This approach solves the key limitation of the QEQ method which is its handling of formally charged groups. First, we parametrize the QET method by fitting to electrostatic potentials calculated using ab initio quantum mechanics on over 11,000 small molecules. On an external test set of over 2500 small molecules the QET method achieves a mean absolute error of 1.37 kcal/mol/electron when compared to the ab initio electrostatic potentials. Second, we examine the conformational dependence of the charges on over 2700 tripeptides. With the tripeptide data set, we show that the conformational effects account for approximately 0.4 kcal/mol/electron on the electrostatic potentials. Third, we test the QET method for its ability to reproduce the effects of polarization and electron transfer on 1000 water clusters. For the water clusters, we show that the QET method captures about 50% of the polarization and electron transfer effects. Finally, we examine the effects of electron transfer and polarizability on the electrostatic interaction between p38 and 94 small molecule ligands. When used in conjunction with the Generalized-Born continuum solvent model, polarization and electron transfer with the QET model lead to an average change of 17 kcal/mol on the calculated electrostatic component of ΔG.

  3. Molecular Docking Study on Galantamine Derivatives as Cholinesterase Inhibitors.

    PubMed

    Atanasova, Mariyana; Yordanov, Nikola; Dimitrov, Ivan; Berkov, Strahil; Doytchinova, Irini

    2015-06-01

    A training set of 22 synthetic galantamine derivatives binding to acetylcholinesterase was docked by GOLD and the protocol was optimized in terms of scoring function, rigidity/flexibility of the binding site, presence/absence of a water molecule inside and radius of the binding site. A moderate correlation was found between the affinities of compounds expressed as pIC50 values and their docking scores. The optimized docking protocol was validated by an external test set of 11 natural galantamine derivatives and the correlation coefficient between the docking scores and the pIC50 values was 0.800. The derived relationship was used to analyze the interactions between galantamine derivatives and AChE. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. Water adsorption on a copper formate paddlewheel model of CuBTC: A comparative MP2 and DFT study

    NASA Astrophysics Data System (ADS)

    Toda, Jordi; Fischer, Michael; Jorge, Miguel; Gomes, José R. B.

    2013-11-01

    Simultaneous adsorption of two water molecules on open metal sites of the HKUST-1 metal-organic framework (MOF), modeled with a Cu2(HCOO)4 cluster, was studied by means of density functional theory (DFT) and second-order Møller-Plesset (MP2) approaches together with correlation consistent basis sets. Experimental geometries and MP2 energetic data extrapolated to the complete basis set limit were used as benchmarks for testing the accuracy of several different exchange-correlation functionals in the correct description of the water-MOF interaction. M06-L and some LC-DFT methods arise as the most appropriate in terms of the quality of geometrical data, energetic data and computational resources needed.

  5. Computational Chemistry Comparison and Benchmark Database

    National Institute of Standards and Technology Data Gateway

    SRD 101 NIST Computational Chemistry Comparison and Benchmark Database (Web, free access)   The NIST Computational Chemistry Comparison and Benchmark Database is a collection of experimental and ab initio thermochemical properties for a selected set of molecules. The goals are to provide a benchmark set of molecules for the evaluation of ab initio computational methods and allow the comparison between different ab initio computational methods for the prediction of thermochemical properties.

  6. Predicting binding poses and affinities for protein - ligand complexes in the 2015 D3R Grand Challenge using a physical model with a statistical parameter estimation

    NASA Astrophysics Data System (ADS)

    Grudinin, Sergei; Kadukova, Maria; Eisenbarth, Andreas; Marillet, Simon; Cazals, Frédéric

    2016-09-01

    The 2015 D3R Grand Challenge provided an opportunity to test our new model for the binding free energy of small molecules, as well as to assess our protocol to predict binding poses for protein-ligand complexes. Our pose predictions were ranked 3-9 for the HSP90 dataset, depending on the assessment metric. For the MAP4K dataset the ranks are very dispersed and equal to 2-35, depending on the assessment metric, which does not provide any insight into the accuracy of the method. The main success of our pose prediction protocol was the re-scoring stage using the recently developed Convex-PL potential. We make a thorough analysis of our docking predictions made with AutoDock Vina and discuss the effect of the choice of rigid receptor templates, the number of flexible residues in the binding pocket, the binding pocket size, and the benefits of re-scoring. However, the main challenge was to predict experimentally determined binding affinities for two blind test sets. Our affinity prediction model consisted of two terms, a pairwise-additive enthalpy and a non-pairwise-additive entropy. We trained the free parameters of the model with a regularized regression using affinity and structural data from the PDBBind database. Our model performed very well on the training set but failed on the two test sets. We explain the drawbacks and pitfalls of our model, in particular in terms of relative coverage of the test set by the training set and missed dynamical properties from crystal structures, and discuss different routes to improve it.

  7. Rapid detection of urinary soluble intercellular adhesion molecule-1 for determination of lupus nephritis activity.

    PubMed

    Wang, Yanyun; Tao, Ye; Liu, Yi; Zhao, Yi; Song, Chao; Zhou, Bin; Wang, Tao; Gao, Linbo; Zhang, Lin; Hu, Huaizhong

    2018-06-01

    The current methods of monitoring the activity of lupus nephritis (LN) may cause unnecessary hospital visits or delayed immunosuppressive therapy. We aimed to find a urinary biomarker that could be developed as a home-based test for monitoring the activity of LN. Urine samples were collected immediately before a renal biopsy from patients with suspected active LN, and also from patients with inactive LN, patients with systemic lupus erythematosus without LN, and healthy controls. Biomarker search was conducted on a cytokine antibody array and confirmation was done by quantitative evaluation with enzyme-linked immunosorbent assay. The Mann-Whitney test or Student t test was used to compare the levels of 9 cytokines between different groups. The sensitivity and specificity of each cytokine for diagnosis of LN was evaluated by receiver operating characteristic curve. A rapid test based on colloidal gold immunochromatography was then developed for bedside or home use. Furthermore, an experimental e-healthcare system was constructed for recording and sharing the results of the rapid test via a cloud-assisted internet of things (IoT) consisting of a sensing device, an IoT device and a cloud server. Adiponectin (Acrp30), soluble intercellular cell adhesion molecule-1 (sICAM-1), neural cell adhesion molecule 1 (NCAM-1), and CD26 were significantly higher in urine samples of active LN patients. sICAM-1 appeared more sensitive and specific among these candidates. When the cut-off value of sICAM-1 was set at 1.44 ng/mL, the sensitivity reached 98.33% with a specificity of 85.71%. The sICAM-1 strip test showed a comparable sensitivity of 95% and a specificity of 83.3% for assessing LN activity. Meanwhile, the e-healthcare system was able to conveniently digitize and share the sICAM-1 rapid test results. sICAM-1 appeared to be an excellent biomarker for monitoring LN activity. The e-healthcare system with cloud-assisted IoT could assist the digitalization and sharing of the bedside or home-based sICAM-1 test results.

  8. Supplementation of conventional freezing medium with a combination of catalase and trehalose results in better protection of surface molecules and functionality of hematopoietic cells.

    PubMed

    Sasnoor, Lalita M; Kale, Vaijayanti P; Limaye, Lalita S

    2003-10-01

    Our previous studies had shown that a combination of the bio-antioxidant catalase and the membrane stabilizer trehalose in the conventional freezing mixture affords better cryoprotection to hematopoietic cells as judged by clonogenic assays. In the present investigation, we extended these studies using several parameters like responsiveness to growth factors, expression of growth factor receptors, adhesion assays, adhesion molecule expression, and long-term culture-forming ability. Cells were frozen with (test cells) or without additives (control cells) in the conventional medium containing 10% dimethylsulfoxide (DMSO). Experiments were done on mononuclear cells (MNC) from cord blood/fetal liver hematopoietic cells (CB/FL) and CD34(+) cells isolated from frozen MNC. Our results showed that the responsiveness of test cells to the two early-acting cytokines, viz. interleukin-3 (IL-3) and stem cell factor (SCF) in CFU assays was better than control cells as seen by higher colony formation at limiting concentrations of these cytokines. We, therefore, analyzed the expression of these two growth factor receptors by flow cytometry. We found that in cryopreserved test MNC, as well as CD34(+) cells isolated from them, the expression of both cytokine receptors was two- to three-fold higher than control MNC and CD34(+) cells isolated from them. Adhesion assays carried out with CB/FL-derived CD34(+) cells and KG1a cells showed significantly higher adherence of test cells to M210B4 than respective control cells. Cryopreserved test MNC as well as CD34(+) cells isolated from them showed increased expression of adhesion molecules like CD43, CD44, CD49d, and CD49e. On isolated CD34(+) cells and KG1a cells, there was a two- to three-fold increase in a double-positive population expressing CD34/L-selectin in test cells as compared to control cells. Long-term cultures (LTC) were set up with frozen MNC as well as with CD34(+) cells. 
    Clonogenic cells from LTC were enumerated at the end of the fifth week. CFU formation was significantly greater from test cells than from control cells, indicating better preservation of early progenitors in test cells. Our results suggest that use of a combination of catalase and trehalose as a supplement in the conventional freezing medium results in better protection of growth factor receptors, adhesion molecules, and functionality of hematopoietic cells, yielding a better graft quality.

  9. LEAP into the Pfizer Global Virtual Library (PGVL) space: creation of readily synthesizable design ideas automatically.

    PubMed

    Hu, Qiyue; Peng, Zhengwei; Kostrowicki, Jaroslav; Kuki, Atsuo

    2011-01-01

    Pfizer Global Virtual Library (PGVL) of 10^13 readily synthesizable molecules offers a tremendous opportunity for lead optimization and scaffold hopping in drug discovery projects. However, mining into a chemical space of this size presents a challenge for the concomitant design informatics due to the fact that standard molecular similarity searches against a collection of explicit molecules cannot be utilized, since no chemical information system could create and manage more than 10^8 explicit molecules. Nevertheless, by accepting a tolerable level of false negatives in search results, we were able to bypass the need for full 10^13 enumeration and enabled the efficient similarity search and retrieval into this huge chemical space for practical usage by medicinal chemists. In this report, two search methods (LEAP1 and LEAP2) are presented. The first method uses PGVL reaction knowledge to disassemble the incoming search query molecule into a set of reactants and then uses reactant-level similarities into actual available starting materials to focus on a much smaller sub-region of the full virtual library compound space. This sub-region is then explicitly enumerated and searched via a standard similarity method using the original query molecule. The second method uses a fuzzy mapping onto candidate reactions and does not require exact disassembly of the incoming query molecule. Instead Basis Products (or capped reactants) are mapped into the query molecule and the resultant asymmetric similarity scores are used to prioritize the corresponding reactions and reactant sets. All sets of Basis Products are inherently indexed to specific reactions and specific starting materials. This again allows focusing on a much smaller sub-region for explicit enumeration and subsequent standard product-level similarity search. A set of validation studies were conducted. The results have shown that the level of false negatives for the disassembly-based method is acceptable when the query molecule can be recognized for exact disassembly, and the fuzzy reaction mapping method based on Basis Products has an even better performance in terms of lower false-negative rate because it is not limited by the requirement that the query molecule needs to be recognized by any disassembly algorithm. Both search methods have been implemented and accessed through a powerful desktop molecular design tool (see ref. (33) for details). The chapter will end with a comparison of published search methods against large virtual chemical space.

  10. Counting numbers of synaptic proteins: absolute quantification and single molecule imaging techniques

    PubMed Central

    Patrizio, Angela; Specht, Christian G.

    2016-01-01

    The ability to count molecules is essential to elucidating cellular mechanisms, as these often depend on the absolute numbers and concentrations of molecules within specific compartments. Such is the case at chemical synapses, where the transmission of information from presynaptic to postsynaptic terminals requires complex interactions between small sets of molecules. Be it the subunit stoichiometry specifying neurotransmitter receptor properties, the copy numbers of scaffold proteins setting the limit of receptor accumulation at synapses, or protein packing densities shaping the molecular organization and plasticity of the postsynaptic density, all of these depend on exact quantities of components. A variety of proteomic, electrophysiological, and quantitative imaging techniques have yielded insights into the molecular composition of synaptic complexes. In this review, we compare the different quantitative approaches and consider the potential of single molecule imaging techniques for the quantification of synaptic components. We also discuss specific neurobiological data to contextualize the obtained numbers and to explain how they aid our understanding of synaptic structure and function. PMID:27335891

  11. Counting numbers of synaptic proteins: absolute quantification and single molecule imaging techniques.

    PubMed

    Patrizio, Angela; Specht, Christian G

    2016-10-01

    The ability to count molecules is essential to elucidating cellular mechanisms, as these often depend on the absolute numbers and concentrations of molecules within specific compartments. Such is the case at chemical synapses, where the transmission of information from presynaptic to postsynaptic terminals requires complex interactions between small sets of molecules. Be it the subunit stoichiometry specifying neurotransmitter receptor properties, the copy numbers of scaffold proteins setting the limit of receptor accumulation at synapses, or protein packing densities shaping the molecular organization and plasticity of the postsynaptic density, all of these depend on exact quantities of components. A variety of proteomic, electrophysiological, and quantitative imaging techniques have yielded insights into the molecular composition of synaptic complexes. In this review, we compare the different quantitative approaches and consider the potential of single molecule imaging techniques for the quantification of synaptic components. We also discuss specific neurobiological data to contextualize the obtained numbers and to explain how they aid our understanding of synaptic structure and function.

  12. Practical aspects of mutagenicity testing strategy: an industrial perspective.

    PubMed

    Gollapudi, B B; Krishna, G

    2000-11-20

    Genetic toxicology studies play a central role in the development and marketing of new chemicals for pharmaceutical, agricultural, industrial, and consumer use. During the discovery phase of product development, rapid screening tests that require minimal amounts of test materials are used to assist in the design and prioritization of new molecules. At this stage, a modified Salmonella reverse mutation assay and an in vitro micronucleus test with mammalian cell culture are frequently used for screening. Regulatory genetic toxicology studies are conducted with a short list of compounds using protocols that conform to various international guidelines. A set of four assays usually constitutes the minimum test battery that satisfies global requirements. This set includes a bacterial reverse mutation assay, an in vitro cytogenetic test with mammalian cell culture, an in vitro gene mutation assay in mammalian cell cultures, and an in vivo rodent bone marrow micronucleus test. Supplementary studies are conducted in certain instances either as a follow-up to the findings from this initial testing battery and/or to satisfy a regulatory requirement. Currently available genetic toxicology assays have helped the scientific and industrial community over the past several decades in evaluating the mutagenic potential of chemical agents. The emerging field of toxicogenomics has the potential to redefine our ability to study the response of cells to genetic damage and hence our ability to study threshold phenomenon.

  13. DFT and 3D-QSAR Studies of Anti-Cancer Agents m-(4-Morpholinoquinazolin-2-yl) Benzamide Derivatives for Novel Compounds Design

    NASA Astrophysics Data System (ADS)

    Zhao, Siqi; Zhang, Guanglong; Xia, Shuwei; Yu, Liangmin

    2018-06-01

    As a group of diversified frameworks, quinazoline derivatives display a broad range of biological functions, especially as anticancer agents. To investigate the quantitative structure-activity relationship, 3D-QSAR models were generated with 24 quinazoline scaffold molecules. The experimental and predicted pIC50 values for both training and test set compounds showed good correlation, which supported the robustness and reliability of the generated QSAR models. The most effective CoMFA and CoMSIA models were obtained with a non-cross-validated correlation coefficient r^2_ncv of 1.00 (both) and a leave-one-out coefficient q^2 of 0.61 and 0.59, respectively. The predictive abilities of CoMFA and CoMSIA were quite good, with predictive correlation coefficients (r^2_pred) of 0.97 and 0.91. In addition, the statistical results of CoMFA and CoMSIA were used to design new quinazoline molecules.

  14. Sustainable reduction of bioreactor contamination in an industrial fermentation pilot plant.

    PubMed

    Junker, Beth; Lester, Michael; Leporati, James; Schmitt, John; Kovatch, Michael; Borysewicz, Stan; Maciejak, Waldemar; Seeley, Anna; Hesse, Michelle; Connors, Neal; Brix, Thomas; Creveling, Eric; Salmon, Peter

    2006-10-01

    Facility experience primarily in drug-oriented fermentation equipment (producing small molecules such as secondary metabolites, bioconversions, and enzymes) and, to a lesser extent, in biologics-oriented fermentation equipment (producing large molecules such as recombinant proteins and microbial vaccines) in an industrial fermentation pilot plant over the past 15 years is described. Potential approaches for equipment design and maintenance, operational procedures, validation/verification testing, medium selection, culture purity/sterility analysis, and contamination investigation are presented, and those approaches implemented are identified. Failure data collected for pilot plant operation for nearly 15 years are presented and best practices for documentation and tracking are outlined. This analysis does not exhaustively discuss available design, operational and procedural options; rather it selectively presents what has been determined to be beneficial in an industrial pilot plant setting. Literature references have been incorporated to provide background and context where appropriate.

  15. 3D-quantitative structure-activity relationship studies on benzothiadiazepine hydroxamates as inhibitors of tumor necrosis factor-alpha converting enzyme.

    PubMed

    Murumkar, Prashant R; Giridhar, Rajani; Yadav, Mange Ram

    2008-04-01

    A set of 29 benzothiadiazepine hydroxamates having selective tumor necrosis factor-alpha converting enzyme inhibitory activity were used to compare the quality and predictive power of 3D-quantitative structure-activity relationship, comparative molecular field analysis, and comparative molecular similarity indices models for the atom-based, centroid/atom-based, data-based, and docked conformer-based alignment. Removal of two outliers from the initial training set of molecules improved the predictivity of models. Among the 3D-quantitative structure-activity relationship models developed using the above four alignments, the database alignment provided the optimal predictive comparative molecular field analysis model for the training set with cross-validated r^2 (q^2) = 0.510, non-cross-validated r^2 = 0.972, standard error of estimates (s) = 0.098, and F = 215.44 and the optimal comparative molecular similarity indices model with cross-validated r^2 (q^2) = 0.556, non-cross-validated r^2 = 0.946, standard error of estimates (s) = 0.163, and F = 99.785. These models also showed the best test set prediction for six compounds with predictive r^2 values of 0.460 and 0.535, respectively. The contour maps obtained from 3D-quantitative structure-activity relationship studies were appraised for activity trends for the molecules analyzed. The comparative molecular similarity indices models exhibited good external predictivity as compared with that of comparative molecular field analysis models. The data generated from the present study helped us to further design and report some novel and potent tumor necrosis factor-alpha converting enzyme inhibitors.

  16. Using support vector machines to improve elemental ion identification in macromolecular crystal structures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morshed, Nader; Lawrence Berkeley National Laboratory, Berkeley, CA 94720; Echols, Nathaniel, E-mail: nechols@lbl.gov

    2015-05-01

    A method to automatically identify possible elemental ions in X-ray crystal structures has been extended to use support vector machine (SVM) classifiers trained on selected structures in the PDB, with significantly improved sensitivity over manually encoded heuristics. In the process of macromolecular model building, crystallographers must examine electron density for isolated atoms and differentiate sites containing structured solvent molecules from those containing elemental ions. This task requires specific knowledge of metal-binding chemistry and scattering properties and is prone to error. A method has previously been described to identify ions based on manually chosen criteria for a number of elements. Here, the use of support vector machines (SVMs) to automatically classify isolated atoms as either solvent or one of various ions is described. Two data sets of protein crystal structures, one containing manually curated structures deposited with anomalous diffraction data and another with automatically filtered, high-resolution structures, were constructed. On the manually curated data set, an SVM classifier was able to distinguish calcium from manganese, zinc, iron and nickel, as well as all five of these ions from water molecules, with a high degree of accuracy. Additionally, SVMs trained on the automatically curated set of high-resolution structures were able to successfully classify most common elemental ions in an independent validation test set. This method is readily extensible to other elemental ions and can also be used in conjunction with previous methods based on a priori expectations of the chemical environment and X-ray scattering.

  17. Effect of hydration on the stability of fullerene-like silica molecules

    NASA Astrophysics Data System (ADS)

    Filonenko, O. V.; Lobanov, V. V.

    2011-05-01

    The hydration of fullerene-like silica molecules was studied by the density functional method (exchange-correlation functional B3LYP, basis set 6-31G**). It was demonstrated that completely coordinated structures transform to more stable hydroxylated ones during hydrolysis. These in turn react with H2O molecules with the formation of hydrogen bonds.

  18. X-ray characterization of solid small molecule organic materials

    DOEpatents

    Billinge, Simon; Shankland, Kenneth; Shankland, Norman; Florence, Alastair

    2014-06-10

    The present invention provides, inter alia, methods of characterizing a small molecule organic material, e.g., a drug or a drug product. This method includes subjecting the solid small molecule organic material to x-ray total scattering analysis at a short wavelength, collecting data generated thereby, and mathematically transforming the data to provide a refined set of data.

  19. Accurate Valence Ionization Energies from Kohn-Sham Eigenvalues with the Help of Potential Adjustors.

    PubMed

    Thierbach, Adrian; Neiss, Christian; Gallandi, Lukas; Marom, Noa; Körzdörfer, Thomas; Görling, Andreas

    2017-10-10

    An accurate yet computationally very efficient and formally well justified approach to calculate molecular ionization potentials is presented and tested. The first as well as higher ionization potentials are obtained as the negatives of the Kohn-Sham eigenvalues of the neutral molecule after adjusting the eigenvalues by a potential adjustor for exchange-correlation potentials that was introduced recently [Görling, Phys. Rev. B 2015, 91, 245120]. Technically the method is very simple. Besides a Kohn-Sham calculation of the neutral molecule, only a second Kohn-Sham calculation of the cation is required. The eigenvalue spectrum of the neutral molecule is shifted such that the negative of the eigenvalue of the highest occupied molecular orbital equals the energy difference of the total electronic energies of the cation minus the neutral molecule. For the first ionization potential this simply amounts to a ΔSCF calculation. Then, the higher ionization potentials are obtained as the negatives of the correspondingly shifted Kohn-Sham eigenvalues. Importantly, this shift of the Kohn-Sham eigenvalue spectrum is not just ad hoc. In fact, it is formally necessary for the physically correct energetic adjustment of the eigenvalue spectrum as it results from ensemble density-functional theory. An analogous approach for electron affinities is equally well obtained and justified. To illustrate the practical benefits of the approach, we calculate the valence ionization energies of test sets of small- and medium-sized molecules and photoelectron spectra of medium-sized electron acceptor molecules using a typical semilocal (PBE) and two typical global hybrid functionals (B3LYP and PBE0). The potential adjusted B3LYP and PBE0 eigenvalues yield valence ionization potentials that are in very good agreement with experimental values, reaching an accuracy that is as good as the best G0W0 methods, however, at much lower computational costs. The potential adjusted PBE eigenvalues result in somewhat less accurate ionization energies, which, however, are almost as accurate as those obtained from the most commonly used G0W0 variants.
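
    The shift described in this abstract can be sketched in a few lines. This is a minimal illustration of the arithmetic only, assuming a single rigid shift applied to all occupied eigenvalues as the abstract states; the function name and every number below are hypothetical and do not come from the paper.

```python
def shifted_ionization_potentials(eigenvalues, e_neutral, e_cation):
    """Ionization potentials from rigidly shifted Kohn-Sham eigenvalues.

    eigenvalues: occupied orbital energies of the neutral molecule (eV)
    e_neutral, e_cation: total electronic energies of neutral and cation (eV)
    """
    homo = max(eigenvalues)
    # First ionization potential as a Delta-SCF energy difference.
    first_ip = e_cation - e_neutral
    # Constant shift chosen so that -(homo + shift) equals first_ip.
    shift = -first_ip - homo
    # Higher ionization potentials: negatives of the shifted eigenvalues,
    # listed from the HOMO downwards.
    return [-(eps + shift) for eps in sorted(eigenvalues, reverse=True)]

# Hypothetical values, for illustration only.
occupied = [-6.0, -8.5, -11.0]
ips = shifted_ionization_potentials(occupied, e_neutral=-1000.0, e_cation=-991.0)
```

    With these made-up inputs the first entry of ips equals the ΔSCF energy difference by construction, and the spacing between the remaining entries is the spacing of the original eigenvalues.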

  20. High precision optical spectroscopy and quantum state selected photodissociation of ultracold 88Sr2 molecules in an optical lattice

    NASA Astrophysics Data System (ADS)

    McDonald, Mickey Patrick

    Over the past several decades, rapid progress has been made toward the accurate characterization and control of atoms, made possible largely by the development of narrow-linewidth lasers and techniques for trapping and cooling at ultracold temperatures. Extending this progress to molecules will have exciting implications for chemistry, condensed matter physics, and precision tests of physics beyond the Standard Model. These possibilities are all consequences of the richness of molecular structure, which is governed by physics substantially different from that characterizing atomic structure. This same richness of structure, however, increases the complexity of any molecular experiment manyfold over its atomic counterpart, magnifying the difficulty of everything from trapping and cooling to the comparison of theory with experiment. This thesis describes work performed over the past six years to establish the state of the art in manipulation and quantum control of ultracold molecules. Our molecules are produced via photoassociation of ultracold strontium atoms followed by spontaneous decay to a stable ground state. We describe a thorough set of measurements characterizing the rovibrational structure of very weakly bound (and therefore very large) 88Sr2 molecules from several different perspectives, including determinations of binding energies; linear, quadratic, and higher order Zeeman shifts; transition strengths between bound states; and lifetimes of narrow subradiant states. The physical intuition gained in these experiments applies generally to weakly bound diatomic molecules, and suggests extensive applications in precision measurement and metrology. In addition, we present a detailed analysis of the thermally broadened spectroscopic lineshape of molecules in a non-magic optical lattice trap, showing how such lineshapes can be used to directly determine the temperature of atoms or molecules in situ, addressing a long-standing problem in ultracold physics. 
    Finally, we discuss the measurement of photofragment angular distributions produced by photodissociation, leading to an exploration of quantum-state-resolved ultracold chemistry.

  1. A New Structure-Activity Relationship (SAR) Model for Predicting Drug-Induced Liver Injury, Based on Statistical and Expert-Based Structural Alerts

    PubMed Central

    Pizzo, Fabiola; Lombardo, Anna; Manganaro, Alberto; Benfenati, Emilio

    2016-01-01

    The prompt identification of chemical molecules with potential effects on liver may help in drug discovery and in raising the levels of protection for human health. Besides in vitro approaches, computational methods in toxicology are drawing attention. We built a structure-activity relationship (SAR) model for evaluating hepatotoxicity. After compiling a data set of 950 compounds using data from the literature, we randomly split it into training (80%) and test sets (20%). We also compiled an external validation set (101 compounds) for evaluating the performance of the model. To extract structural alerts (SAs) related to hepatotoxicity and non-hepatotoxicity we used SARpy, a statistical application that automatically identifies and extracts chemical fragments related to a specific activity. We also applied the chemical grouping approach for manually identifying other SAs. We calculated accuracy, specificity, sensitivity and Matthews correlation coefficient (MCC) on the training, test and external validation sets. Considering the complexity of the endpoint, the model performed well. In the training, test and external validation sets the accuracy was respectively 81, 63, and 68%, specificity 89, 33, and 33%, sensitivity 93, 88, and 80% and MCC 0.63, 0.27, and 0.13. Since it is preferable to overestimate hepatotoxicity rather than not to recognize unsafe compounds, the model's architecture followed a conservative approach. As it was built using human data, it might be applied without any need for extrapolation from other species. This model will be freely available in the VEGA platform. PMID:27920722
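
    The four performance measures named in this abstract follow directly from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the counts are hypothetical and are not the study's data.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts.

    Assumes at least one actual positive and one actual negative.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # fraction of true positives recovered
    specificity = tn / (tn + fp)   # fraction of true negatives recovered
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, tn=30, fp=20, fn=10)
```

    A pattern of high sensitivity with low specificity, as reported for the test and external sets, corresponds to many false positives, which is consistent with the conservative design the authors describe.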

  2. Classification of Breast Cancer Resistant Protein (BCRP) Inhibitors and Non-Inhibitors Using Machine Learning Approaches.

    PubMed

    Belekar, Vilas; Lingineni, Karthik; Garg, Prabha

    2015-01-01

    The breast cancer resistant protein (BCRP) is an important transporter and its inhibitors play an important role in cancer treatment by improving the oral bioavailability as well as blood brain barrier (BBB) permeability of anticancer drugs. In this work, a computational model was developed to classify compounds as BCRP inhibitors or non-inhibitors. Various machine learning approaches, such as support vector machine (SVM), k-nearest neighbor (k-NN) and artificial neural network (ANN), were used to develop the models. The Matthews correlation coefficients (MCC) of the models developed using ANN, k-NN and SVM are 0.67, 0.71 and 0.77, and the prediction accuracies are 85.2%, 88.3% and 90.8%, respectively. The developed models were tested with a test set of 99 compounds and further validated with an external set of 98 compounds. Distribution plot analysis and various machine learning models were also developed based on druglikeness descriptors. The applicability domain is used to check the prediction reliability for new molecules.

  3. Statistical analysis of particle trajectories in living cells

    NASA Astrophysics Data System (ADS)

    Briane, Vincent; Kervrann, Charles; Vimond, Myriam

    2018-06-01

    Recent advances in molecular biology and fluorescence microscopy imaging have made possible the inference of the dynamics of molecules in living cells. Such inference allows us to understand and determine the organization and function of the cell. The trajectories of particles (e.g., biomolecules) in living cells, computed with the help of object tracking methods, can be modeled with diffusion processes. Three types of diffusion are considered: (i) free diffusion, (ii) subdiffusion, and (iii) superdiffusion. The mean-square displacement (MSD) is generally used to discriminate the three types of particle dynamics. We propose here a nonparametric three-decision test as an alternative to the MSD method. The rejection of the null hypothesis, i.e., free diffusion, is accompanied by claims of the direction of the alternative (subdiffusion or superdiffusion). We study the asymptotic behavior of the test statistic under the null hypothesis and under parametric alternatives which are currently considered in the biophysics literature. In addition, we adapt the multiple-testing procedure of Benjamini and Hochberg to fit with the three-decision-test setting, in order to apply the test procedure to a collection of independent trajectories. The performance of our procedure is much better than the MSD method as confirmed by Monte Carlo experiments. The method is demonstrated on real data sets corresponding to protein dynamics observed in fluorescence microscopy.
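
    For orientation, the standard Benjamini-Hochberg step-up procedure that the authors start from can be sketched as below. This is the textbook version only, not the three-decision adaptation proposed in the paper; the function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# Hypothetical p-values, one per trajectory, for illustration only.
p = [0.001, 0.045, 0.025, 0.20, 0.012]
decisions = benjamini_hochberg(p, alpha=0.05)
```

    In the setting of the abstract, each p-value would come from testing free diffusion on one trajectory, and a rejection would additionally carry a direction (subdiffusion or superdiffusion), which this sketch does not model.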

  4. Training a molecular automaton to play a game

    NASA Astrophysics Data System (ADS)

    Pei, Renjun; Matamoros, Elizabeth; Liu, Manhong; Stefanovic, Darko; Stojanovic, Milan N.

    2010-11-01

    Research at the interface between chemistry and cybernetics has led to reports of 'programmable molecules', but what does it mean to say 'we programmed a set of solution-phase molecules to do X'? A survey of recently implemented solution-phase circuitry indicates that this statement could be replaced with 'we pre-mixed a set of molecules to do X and functional subsets of X'. These hard-wired mixtures are then exposed to a set of molecular inputs, which can be interpreted as being keyed to human moves in a game, or as assertions of logical propositions. In nucleic acids-based systems, stemming from DNA computation, these inputs can be seen as generic oligonucleotides. Here, we report using reconfigurable nucleic acid catalyst-based units to build a multipurpose reprogrammable molecular automaton that goes beyond single-purpose 'hard-wired' molecular automata. The automaton covers all possible responses to two consecutive sets of four inputs (such as four first and four second moves for a generic set of trivial two-player two-move games). This is a model system for more general molecular field programmable gate array (FPGA)-like devices that can be programmed by example, which means that the operator need not have any knowledge of molecular computing methods.

  5. Training a molecular automaton to play a game.

    PubMed

    Pei, Renjun; Matamoros, Elizabeth; Liu, Manhong; Stefanovic, Darko; Stojanovic, Milan N

    2010-11-01

    Research at the interface between chemistry and cybernetics has led to reports of 'programmable molecules', but what does it mean to say 'we programmed a set of solution-phase molecules to do X'? A survey of recently implemented solution-phase circuitry indicates that this statement could be replaced with 'we pre-mixed a set of molecules to do X and functional subsets of X'. These hard-wired mixtures are then exposed to a set of molecular inputs, which can be interpreted as being keyed to human moves in a game, or as assertions of logical propositions. In nucleic acids-based systems, stemming from DNA computation, these inputs can be seen as generic oligonucleotides. Here, we report using reconfigurable nucleic acid catalyst-based units to build a multipurpose reprogrammable molecular automaton that goes beyond single-purpose 'hard-wired' molecular automata. The automaton covers all possible responses to two consecutive sets of four inputs (such as four first and four second moves for a generic set of trivial two-player two-move games). This is a model system for more general molecular field programmable gate array (FPGA)-like devices that can be programmed by example, which means that the operator need not have any knowledge of molecular computing methods.

  6. Molecular modeling-driven approach for identification of Janus kinase 1 inhibitors through 3D-QSAR, docking and molecular dynamics simulations.

    PubMed

    Itteboina, Ramesh; Ballu, Srilata; Sivan, Sree Kanth; Manga, Vijjulatha

    2017-10-01

    Janus kinase 1 (JAK 1) belongs to the JAK family of intracellular nonreceptor tyrosine kinases. The JAK-signal transducer and activator of transcription (JAK-STAT) pathway mediates signaling by cytokines, which control survival, proliferation and differentiation of a variety of cells. Three-dimensional quantitative structure activity relationship (3D-QSAR), molecular docking and molecular dynamics (MD) methods were applied to a dataset of Janus kinase 1 (JAK 1) inhibitors. Ligands were constructed and docked into the active site of the protein using GLIDE 5.6. The best docked poses were selected after analysis for further 3D-QSAR analysis using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) methodology. Employing 60 molecules in the training set, 3D-QSAR models were generated that showed good statistical reliability, which is clearly observed in terms of the r²_ncv and q²_loo values. The predictive ability of these models was determined using a test set of 25 molecules that gave acceptable predictive correlation (r²_pred) values. The key amino acid residues were identified by means of molecular docking, and the stability and rationality of the derived molecular conformations were also validated by MD simulation. The good consonance between the docking results and CoMFA/CoMSIA contour maps provides helpful clues about the reasonable modification of molecules in order to design more efficient JAK 1 inhibitors. The developed models are expected to provide some directives for further synthesis of highly effective JAK 1 inhibitors.
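
    The leave-one-out cross-validated q² mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single descriptor and ordinary least squares in place of the PLS regression normally used in CoMFA/CoMSIA, so it only shows how the statistic is formed (q² = 1 - PRESS/SS). The function names and the toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def q2_loo(xs, ys):
    """Leave-one-out q2: refit without sample i, then predict sample i."""
    mean_y = sum(ys) / len(ys)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / total

# Toy descriptor/activity pairs, invented for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
print(q2_loo(descriptor, activity))
```

    Because each prediction is made for a molecule the model has not seen, q² is never larger than the fitted r² on the same data in typical cases, which is why both are usually reported together.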

  7. Asymmetric bagging and feature selection for activities prediction of drug molecules.

    PubMed

    Li, Guo-Zheng; Meng, Hao-Hua; Lu, Wen-Cong; Yang, Jack Y; Yang, Mary Qu

    2008-05-28

    Activities of drug molecules can be predicted by QSAR (quantitative structure activity relationship) models, which overcome the high cost and long cycle of the traditional experimental method. Because the number of drug molecules with positive activity is considerably smaller than the number of negatives, it is important to predict molecular activities with this imbalance taken into account. Here, asymmetric bagging and feature selection are introduced into the problem, and asymmetric bagging of support vector machines (asBagging) is proposed for predicting drug activities on such unbalanced data. At the same time, the features extracted from the structures of drug molecules affect the prediction accuracy of QSAR models. Therefore, a novel algorithm named PRIFEAB is proposed, which applies an embedded feature selection method to remove redundant and irrelevant features for asBagging. Numerical experimental results on a data set of molecular activities show that asBagging improves the AUC and sensitivity values of molecular activities, and that PRIFEAB with feature selection further improves the prediction ability. Asymmetric bagging can help to improve the prediction accuracy of activities of drug molecules, which can be further improved by performing feature selection to select relevant features from the drug molecule data sets.
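
    The sampling idea behind asymmetric bagging can be sketched as below. The abstract does not spell out the sampling scheme, so this assumes a common formulation: every bag keeps all positive examples and bootstraps an equal number of negatives, and the per-bag classifiers are combined by majority vote. The function names are hypothetical, and the base learner (an SVM in the paper) is left out.

```python
import random

def asymmetric_bags(labels, n_bags, seed=0):
    """Return n_bags lists of sample indices, each balanced by construction.

    Assumption: positives (label 1) are kept whole; negatives are drawn
    with replacement until they match the number of positives.
    """
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y != 1]
    bags = []
    for _ in range(n_bags):
        sampled_negatives = [rng.choice(negatives) for _ in positives]
        bags.append(positives + sampled_negatives)
    return bags

def majority_vote(votes):
    """Combine 0/1 predictions from the per-bag classifiers (ties give 0)."""
    return 1 if 2 * sum(votes) > len(votes) else 0

labels = [1, 1, 0, 0, 0, 0, 0, 0]  # two actives, six inactives
for bag in asymmetric_bags(labels, n_bags=3):
    print(bag)
```

    Each bag would then be used to train one classifier, so no single model sees the full class imbalance.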

  8. Wavelength dependence and multiple-induced states in photoresponses of copper phthalocyanine-doped gold nanoparticle single-electron device

    NASA Astrophysics Data System (ADS)

    Yamamoto, Makoto; Ueda, Rieko; Terui, Toshifumi; Imazu, Keisuke; Tamada, Kaoru; Sakano, Takeshi; Matsuda, Kenji; Ishii, Hisao; Noguchi, Yutaka

    2014-01-01

    We have proposed a gold nanoparticle (GNP)-based single-electron transistor (SET) doped with a dye molecule, where the molecule works as a photoresponsive floating gate. Here, we examined the source-drain current (I_SD) at a constant drain voltage under light irradiation with various wavelengths ranging from 400 to 700 nm. Current change was enhanced at the wavelengths of 600 and 700 nm, corresponding to the optical absorption band of the doped molecule (copper phthalocyanine: CuPc). Moreover, several peaks appear in the histograms of I_SD during light irradiation, indicating that multiple discrete states were induced in the device. The results suggest that the current change was initiated by the light absorption of CuPc and multiple CuPc molecules near the GNP working as a floating gate. Molecular doping can activate advanced device functions in GNP-based SETs.

  9. A theoretical study of potentially observable chirality-sensitive NMR effects in molecules.

    PubMed

    Garbacz, Piotr; Cukras, Janusz; Jaszuński, Michał

    2015-09-21

    Two recently predicted nuclear magnetic resonance effects, the chirality-induced rotating electric polarization and the oscillating magnetization, are examined for several experimentally available chiral molecules. We discuss in detail the requirements for experimental detection of chirality-sensitive NMR effects of the studied molecules. These requirements are related to two parameters: the shielding polarizability and the antisymmetric part of the nuclear magnetic shielding tensor. The dominant second contribution has been computed for small molecules at the coupled cluster and density functional theory levels. It was found that DFT calculations using the KT2 functional and the aug-cc-pCVTZ basis set adequately reproduce the CCSD(T) values obtained with the same basis set. The largest values of parameters, thus most promising from the experimental point of view, were obtained for the fluorine nuclei in 1,3-difluorocyclopropene and 1,3-diphenyl-2-fluoro-3-trifluoromethylcyclopropene.

  10. Binding screen for cystic fibrosis transmembrane conductance regulator correctors finds new chemical matter and yields insights into cystic fibrosis therapeutic strategy.

    PubMed

    Hall, Justin D; Wang, Hong; Byrnes, Laura J; Shanker, Suman; Wang, Kelong; Efremov, Ivan V; Chong, P Andrew; Forman-Kay, Julie D; Aulabaugh, Ann E

    2016-02-01

    The most common mutation in cystic fibrosis (CF) patients is deletion of F508 (ΔF508) in the first nucleotide binding domain (NBD1) of the CF transmembrane conductance regulator (CFTR). ΔF508 causes a decrease in the trafficking of CFTR to the cell surface and reduces the thermal stability of isolated NBD1; it is well established that both of these effects can be rescued by additional revertant mutations in NBD1. The current paradigm in CF small molecule drug discovery is that, like revertant mutations, a path may exist to ΔF508 CFTR correction through a small molecule chaperone binding to NBD1. We, therefore, set out to find small molecule binders of NBD1 and test whether it is possible to develop these molecules into potent binders that increase CFTR trafficking in CF-patient-derived human bronchial epithelial cells. Several fragments were identified that bind NBD1 at either the CFFT-001 site or the BIA site. However, repeated attempts to improve the affinity of these fragments resulted in only modest gains. Although these results cannot prove that there is no possibility of finding a high-affinity small molecule binder of NBD1, they are discouraging and lead us to hypothesize that the nature of these two binding sites, and isolated NBD1 itself, may not contain the features needed to build high-affinity interactions. Future work in this area may, therefore, require constructs including other domains of CFTR in addition to NBD1, if high-affinity small molecule binding is to be achieved. © 2016 The Protein Society.

  11. Profiling cellular protein complexes by proximity ligation with dual tag microarray readout.

    PubMed

    Hammond, Maria; Nong, Rachel Yuan; Ericsson, Olle; Pardali, Katerina; Landegren, Ulf

    2012-01-01

    Patterns of protein interactions provide important insights into basic biology, and their analysis plays an increasing role in drug development and diagnostics of disease. We have established a scalable technique to compare two biological samples for the levels of all pairwise interactions among a set of targeted protein molecules. The technique is a combination of the proximity ligation assay with readout via dual tag microarrays. In the proximity ligation assay, protein identities are encoded as DNA sequences by attaching DNA oligonucleotides to antibodies directed against the proteins of interest. Upon binding by pairs of antibodies to proteins present in the same molecular complexes, ligation reactions give rise to reporter DNA molecules that contain the combined sequence information from the two DNA strands. The ligation reactions also serve to incorporate a sample barcode in the reporter molecules to allow for direct comparison between pairs of samples. The samples are evaluated using a dual tag microarray where information is decoded, revealing which pairs of tags have become joined. As a proof of concept, we demonstrate that this approach can be used to detect a set of five proteins and their pairwise interactions both in cellular lysates and in fixed tissue culture cells. This paper provides a general strategy to analyze the extent of any pairwise interactions in large sets of molecules by decoding reporter DNA strands that identify the interacting molecules.

  12. Quantum Mechanical Studies of Molecular Hyperpolarizabilities.

    DTIC Science & Technology

    1980-04-30

    exponent, reflects the screening of an electron in a given orbital by the interior electrons in the atom or molecule. In practice, when studying...Basis sets have evolved over the years in molecular quantum mechanics until sets of orbital exponents for the different atoms composing the molecule have...and R. P. Hurst, J. Chem. Phys. 46, 2356 (1967); S. P. Liebmann and J. W. Moskowitz, J. Chem. Phys. 54, 3622 (1971). 26. T. H. Dunning, J. Chem. Phys

  13. DFT analysis on the molecular structure, vibrational and electronic spectra of 2-(cyclohexylamino)ethanesulfonic acid

    NASA Astrophysics Data System (ADS)

    Renuga Devi, T. S.; Sharmi kumar, J.; Ramkumaar, G. R.

    2015-02-01

    The FTIR and FT-Raman spectra of 2-(cyclohexylamino)ethanesulfonic acid were recorded in the regions 4000-400 cm⁻¹ and 4000-50 cm⁻¹, respectively. The structural and spectroscopic data of the molecule in the ground state were calculated using the Hartree-Fock and density functional (B3LYP) methods with the correlation-consistent polarized valence double zeta (cc-pVDZ) basis set and the 6-311++G(d,p) basis set. The most stable conformer was optimized, and the structural and vibrational parameters were determined from it. The complete assignments were performed based on the Potential Energy Distribution (PED) of the vibrational modes, calculated using the Vibrational Energy Distribution Analysis (VEDA) 4 program. With the observed FTIR and FT-Raman data, a complete vibrational assignment and analysis of the fundamental modes of the compound were carried out. Thermodynamic properties and atomic charges were calculated using both the Hartree-Fock and density functional methods with the cc-pVDZ basis set and compared. The calculated HOMO-LUMO energy gap revealed that charge transfer occurs within the molecule. ¹H and ¹³C NMR chemical shifts of the molecule were calculated using the Gauge Including Atomic Orbital (GIAO) method and were compared with experimental results. The stability of the molecule arising from hyperconjugative interactions and charge delocalization has been analyzed using Natural Bond Orbital (NBO) analysis. The first-order hyperpolarizability (β) and Molecular Electrostatic Potential (MEP) of the molecule were computed using DFT calculations. Electron-density-based local reactivity descriptors such as Fukui functions were calculated to explain the chemical reactivity sites in the molecule.

  14. Exploiting the spatial locality of electron correlation within the parametric two-electron reduced-density-matrix method

    NASA Astrophysics Data System (ADS)

    DePrince, A. Eugene; Mazziotti, David A.

    2010-01-01

    The parametric variational two-electron reduced-density-matrix (2-RDM) method is applied to computing electronic correlation energies of medium-to-large molecular systems by exploiting the spatial locality of electron correlation within the framework of the cluster-in-molecule (CIM) approximation [S. Li et al., J. Comput. Chem. 23, 238 (2002); J. Chem. Phys. 125, 074109 (2006)]. The 2-RDMs of individual molecular fragments within a molecule are determined, and selected portions of these 2-RDMs are recombined to yield an accurate approximation to the correlation energy of the entire molecule. In addition to extending CIM to the parametric 2-RDM method, we (i) suggest a more systematic selection of atomic-orbital domains than that presented in previous CIM studies and (ii) generalize the CIM method for open-shell quantum systems. The resulting method is tested with a series of polyacetylene molecules, water clusters, and diazobenzene derivatives in minimal and nonminimal basis sets. Calculations show that the computational cost of the method scales linearly with system size. We also compute hydrogen-abstraction energies for a series of hydroxyurea derivatives. Abstraction of hydrogen from hydroxyurea is thought to be a key step in its treatment of sickle cell anemia; the design of hydroxyurea derivatives that oxidize more rapidly is one approach to devising more effective treatments.

  15. Application of ab initio many-body perturbation theory with Gaussian basis sets to the singlet and triplet excitations of organic molecules

    NASA Astrophysics Data System (ADS)

    Hamed, Samia; Rangel, Tonatiuh; Bruneval, Fabien; Neaton, Jeffrey B.

    Quantitative understanding of charged and neutral excitations of organic molecules is critical in diverse areas of study that include astrophysics and the development of energy technologies that are clean and efficient. The recent use of local basis sets with ab initio many-body perturbation theory in the GW approximation and the Bethe-Salpeter equation (BSE) approach, methods traditionally applied to periodic condensed phases with a plane-wave basis, has opened the door to detailed study of such excitations for molecules, as well as accurate numerical benchmarks. Here, through a series of systematic benchmarks with a Gaussian basis, we report on the extent to which the predictive power and utility of this approach depend critically on interdependent underlying approximations and choices for molecules, including the mean-field starting point (e.g., optimally tuned range-separated hybrids, pure DFT functionals, and untuned hybrids), the GW scheme, and the Tamm-Dancoff approximation. We demonstrate the effects of these choices in the context of Thiel's set while drawing analogies to linear-response time-dependent DFT and making comparisons to best theoretical estimates from higher-order wavefunction-based theories.

  16. ForceGen 3D structure and conformer generation: from small lead-like molecules to macrocyclic drugs

    NASA Astrophysics Data System (ADS)

    Cleves, Ann E.; Jain, Ajay N.

    2017-05-01

    We introduce the ForceGen method for 3D structure generation and conformer elaboration of drug-like small molecules. ForceGen is novel, avoiding use of distance geometry, molecular templates, or simulation-oriented stochastic sampling. The method is primarily driven by the molecular force field, implemented using an extension of MMFF94s and a partial charge estimator based on electronegativity-equalization. The force field is coupled to algorithms for direct sampling of realistic physical movements made by small molecules. Results are presented on a standard benchmark from the Cambridge Crystallographic Database of 480 drug-like small molecules, including full structure generation from SMILES strings. Reproduction of protein-bound crystallographic ligand poses is demonstrated on four carefully curated data sets: the ConfGen Set (667 ligands), the PINC cross-docking benchmark (1062 ligands), a large set of macrocyclic ligands (182 total with typical ring sizes of 12-23 atoms), and a commonly used benchmark for evaluating macrocycle conformer generation (30 ligands total). Results compare favorably to alternative methods, and performance on macrocyclic compounds approaches that observed on non-macrocycles while yielding a roughly 100-fold speed improvement over alternative MD-based methods with comparable performance.

  17. A Quantum-Based Similarity Method in Virtual Screening.

    PubMed

    Al-Dabbagh, Mohammed Mumtaz; Salim, Naomie; Himmat, Mubarak; Ahmed, Ali; Saeed, Faisal

    2015-10-02

    One of the most widely used techniques for ligand-based virtual screening is similarity searching. This study adopted concepts from quantum mechanics to present a state-of-the-art molecular similarity method inspired by quantum theory. The representation of molecular compounds in mathematical quantum space plays a vital role in the development of the quantum-based similarity approach. One of the key concepts of quantum theory is the use of complex numbers. Hence, this study proposed three techniques to embed and re-represent molecular compounds in complex-number format. The quantum-based similarity method developed in this study, which relies on a complex pure Hilbert space of molecules, is called Standard Quantum-Based (SQB). The recall of retrieved active molecules was measured at the top 1% and top 5%, and a significance test was used to evaluate the proposed methods. The MDL Drug Data Report (MDDR), Maximum Unbiased Validation (MUV) and Directory of Useful Decoys (DUD) data sets were used for the experiments and were represented by 2D fingerprints. Simulated virtual screening experiments show that the effectiveness of the SQB method was significantly increased, owing to the representational power of molecular compounds in complex-number form, compared to the Tanimoto benchmark similarity measure.
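
    The Tanimoto benchmark and the recall-at-top-fraction metric named in this abstract can be sketched as follows. This is a minimal illustration of the baseline only, not of the SQB method. Fingerprints are assumed to be sets of on-bit indices, and the function names and toy library are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on bits."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def recall_at_top(query, library, actives, fraction):
    """Fraction of known actives found in the top-ranked slice of the library.

    library maps a compound name to its fingerprint; actives is a set of names.
    """
    ranked = sorted(
        library,
        key=lambda name: tanimoto(query, library[name]),
        reverse=True,
    )
    cutoff = max(1, int(len(ranked) * fraction))
    hits = sum(1 for name in ranked[:cutoff] if name in actives)
    return hits / len(actives)

# Toy data, invented for illustration only.
library = {"a": {1, 2, 3}, "b": {1, 2}, "c": {7, 8}, "d": {9}}
query = {1, 2, 3}
print(tanimoto({1, 2, 3}, {2, 3, 4}))
print(recall_at_top(query, library, actives={"a", "b"}, fraction=0.5))
```

    In a real screen the library holds many thousands of compounds, which is why recall is usually reported at small cut-offs such as the top 1% and 5%.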

  18. Experimental and TD-DFT study of optical absorption of six explosive molecules: RDX, HMX, PETN, TNT, TATP, and HMTD.

    PubMed

    Cooper, Jason K; Grant, Christian D; Zhang, Jin Z

    2013-07-25

    Time-dependent density functional theory (TD-DFT) has been utilized to calculate the excitation energies and oscillator strengths of six common explosives: RDX (1,3,5-trinitroperhydro-1,3,5-triazine), β-HMX (octahydro-1,3,5,7-tetranitro-1,3,5,7-tetrazocine), TATP (triacetone triperoxide), HMTD (hexamethylene triperoxide diamine), TNT (2,4,6-trinitrotoluene), and PETN (pentaerythritol tetranitrate). The results were compared to experimental UV-vis absorption spectra collected in acetonitrile. Four computational methods were tested: B3LYP, CAM-B3LYP, ωB97XD, and PBE0. PBE0 outperforms the other methods tested. Basis set effects on the electronic energies and oscillator strengths were evaluated with 6-31G(d), 6-31+G(d), 6-31+G(d,p), and 6-311+G(d,p). The minimal basis set required was 6-31+G(d); however, additional calculations were performed with 6-311+G(d,p). For each molecule studied, the natural transition orbitals (NTOs) were reported for the most prominent singlet excitations. The TD-DFT results have been combined with the IPv calculated by CBS-QB3 to construct energy level diagrams for the six compounds. The results suggest optimization approaches for fluorescence-based detection methods for these explosives by guiding materials selection for optimal band alignment between fluorescent probe and explosive analyte. Also, the role of TNT Meisenheimer complex formation, and of the resulting electronic structure, in the quenching mechanism of II-VI semiconductors is discussed.

  19. Systems Biological Approach of Molecular Descriptors Connectivity: Optimal Descriptors for Oral Bioavailability Prediction

    PubMed Central

    Ahmed, Shiek S. S. J.; Ramakrishnan, V.

    2012-01-01

    Background: Poor oral bioavailability is an important parameter accounting for the failure of drug candidates. Approximately 50% of drugs in development fail because of unfavorable oral bioavailability. In silico prediction of oral bioavailability (%F) based on physicochemical properties is highly needed. Although many computational models have been developed to predict oral bioavailability, their accuracy remains low, with a significant number of false positives. In this study, we present an oral bioavailability model based on a systems biological approach, using a machine learning algorithm coupled with an optimal discriminative set of physicochemical properties. Results: The models were developed based on 247 computationally derived physicochemical descriptors from 2279 molecules, of which 969, 605 and 705 molecules correspond to the oral bioavailability, intestinal absorption (HIA) and caco-2 permeability data sets, respectively. Partial least squares discriminant analysis showed that 49 descriptors of HIA and 50 descriptors of caco-2 are the major contributing descriptors in classifying into groups. Of these descriptors, 47 were common to HIA and caco-2, which suggests that they play a vital role in classifying oral bioavailability. To determine the best machine learning algorithm, 21 classifiers were compared using a bioavailability data set of 969 molecules with 47 descriptors. Each molecule in the data set was represented by a set of 47 physicochemical properties with the functional relevance labeled as (+bioavailability/−bioavailability) to indicate good-bioavailability/poor-bioavailability molecules. The best-performing algorithm was the logistic algorithm. The correlation based feature selection (CFS) algorithm was implemented, which confirms that these 47 descriptors are the fundamental descriptors for oral bioavailability prediction. Conclusion: The logistic algorithm with 47 selected descriptors correctly predicted the oral bioavailability, with a predictive accuracy of more than 71%. Overall, the method captures the fundamental molecular descriptors, which can be used as an entity to facilitate prediction of oral bioavailability. PMID:22815781

  20. Systems biological approach of molecular descriptors connectivity: optimal descriptors for oral bioavailability prediction.

    PubMed

    Ahmed, Shiek S S J; Ramakrishnan, V

    2012-01-01

    Poor oral bioavailability is an important parameter accounting for the failure of drug candidates. Approximately 50% of drugs in development fail because of unfavorable oral bioavailability. In silico prediction of oral bioavailability (%F) based on physicochemical properties is highly needed. Although many computational models have been developed to predict oral bioavailability, their accuracy remains low, with a significant number of false positives. In this study, we present an oral bioavailability model based on a systems biological approach, using a machine learning algorithm coupled with an optimal discriminative set of physicochemical properties. The models were developed based on 247 computationally derived physicochemical descriptors from 2279 molecules, of which 969, 605 and 705 molecules correspond to the oral bioavailability, intestinal absorption (HIA) and caco-2 permeability data sets, respectively. Partial least squares discriminant analysis showed that 49 descriptors of HIA and 50 descriptors of caco-2 are the major contributing descriptors in classifying into groups. Of these descriptors, 47 were common to HIA and caco-2, which suggests that they play a vital role in classifying oral bioavailability. To determine the best machine learning algorithm, 21 classifiers were compared using a bioavailability data set of 969 molecules with 47 descriptors. Each molecule in the data set was represented by a set of 47 physicochemical properties with the functional relevance labeled as (+bioavailability/-bioavailability) to indicate good-bioavailability/poor-bioavailability molecules. The best-performing algorithm was the logistic algorithm. The correlation based feature selection (CFS) algorithm was implemented, which confirms that these 47 descriptors are the fundamental descriptors for oral bioavailability prediction. The logistic algorithm with 47 selected descriptors correctly predicted the oral bioavailability, with a predictive accuracy of more than 71%. Overall, the method captures the fundamental molecular descriptors, which can be used as an entity to facilitate prediction of oral bioavailability.

  1. Single molecule data under scrutiny. Comment on "Extracting physics of life at the molecular level: A review of single-molecule data analyses" by W. Colomb & S.K. Sarkar

    NASA Astrophysics Data System (ADS)

    Wohland, Thorsten

    2015-06-01

    Single Molecule Detection and Spectroscopy have grown from their first beginnings into mainstream, mature research areas that are widely applied in the biological sciences. However, despite the advances in technology and the application of many single molecule techniques even in in vivo settings, the data analysis of single molecule experiments is complicated by noise, systematic errors, and complex underlying processes that are only incompletely understood. Colomb and Sarkar provide in this issue an overview of single molecule experiments and the accompanying problems in data analysis, which have to be overcome for a proper interpretation of the experiments [1].

  2. Excitation energies of molecules within time-independent density functional theory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hemanadhan, M., E-mail: hemanadh@iitk.ac.in; Harbola, Manoj K., E-mail: hemanadh@iitk.ac.in

    2014-04-24

    A recently proposed exchange energy functional for excited states is tested for obtaining excitation energies of diatomic molecules. The functional is the ground-state counterpart of the local-density approximation, the modified local spin density (MLSD). The MLSD functional is tested for the N2 and CO diatomic molecules. The excitation energy obtained with the MLSD functional for the N2 molecule is close to that obtained from the exact exchange orbital functional of Krieger, Li and Iafrate (KLI). For the CO molecule, a departure in excitation energy is observed and is due to the overcorrection of self-interaction.

  3. Excitation energies of molecules within time-independent density functional theory

    NASA Astrophysics Data System (ADS)

    Hemanadhan, M.; Harbola, Manoj K.

    2014-04-01

    A recently proposed exchange energy functional for excited states is tested for obtaining excitation energies of diatomic molecules. The functional is the ground-state counterpart of the local-density approximation, the modified local spin density (MLSD). The MLSD functional is tested for the N2 and CO diatomic molecules. The excitation energy obtained with the MLSD functional for the N2 molecule is close to that obtained from the exact exchange orbital functional of Krieger, Li and Iafrate (KLI). For the CO molecule, a departure in excitation energy is observed and is due to the overcorrection of self-interaction.

  4. The correcting method for the estimation of correlation energies of MF2 (M = Be, Mg, Ca) set molecules

    NASA Astrophysics Data System (ADS)

    Zhuo, Shuping; Wei, Jichong; Ju, Guanzhi

    The intrapair and interpair correlation energies of MF2 (M = Be, Mg, Ca) set molecules are calculated and analysed, and the transferability of the inner core correlation effects of Mδ+ is investigated. A detailed comparison of the correlation energies of the neutral atoms with those of their corresponding ions Mδ+ and Fδ-/2 is given in terms of the correlation contribution of each component. The study reveals that the total correlation energy of MF2 molecules can be obtained by summing the correlation contributions of Mδ+ and two Fδ-/2 components. This simple estimation method sheds light on the importance of searching for useful means of calculating the electron correlation energy of large biological systems.
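
    The additive estimate described in this abstract can be written compactly as below. The symbol E_corr is a label introduced here for the correlation contribution of each component; the abstract itself gives no formula, so this is only a restatement of its summation rule.

```latex
E_{\mathrm{corr}}(\mathrm{MF_2}) \;\approx\; E_{\mathrm{corr}}\!\left(\mathrm{M}^{\delta+}\right) + 2\,E_{\mathrm{corr}}\!\left(\mathrm{F}^{\delta-/2}\right),
\qquad \mathrm{M} = \mathrm{Be},\ \mathrm{Mg},\ \mathrm{Ca}
```

    The partial charges balance by construction, since the charge δ+ on the metal is offset by two fluorine components carrying δ-/2 each.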

  5. MS2Analyzer: A Software for Small Molecule Substructure Annotations from Accurate Tandem Mass Spectra

    PubMed Central

    2015-01-01

    Systematic analysis and interpretation of the large number of tandem mass spectra (MS/MS) obtained in metabolomics experiments is a bottleneck in discovery-driven research. MS/MS mass spectral libraries are small compared to all known small molecule structures and are often not freely available. MS2Analyzer was therefore developed to enable user-defined searches of thousands of spectra for mass spectral features such as neutral losses, m/z differences, and product and precursor ions from MS/MS spectra in MSP/MGF files. The software is freely available at http://fiehnlab.ucdavis.edu/projects/MS2Analyzer/. As the reference query set, 147 literature-reported neutral losses and their corresponding substructures were collected. This set was tested for accuracy of linking neutral loss analysis to substructure annotations using 19 329 accurate mass tandem mass spectra of structurally known compounds from the NIST11 MS/MS library. Validation studies showed that 92.1 ± 6.4% of 13 typical neutral losses, such as acetylations, cysteine conjugates, or glycosylations, correctly annotate the associated substructures, while the absence of mass spectral features does not necessarily imply the absence of such substructures. Use of this tool has been successfully demonstrated for complex lipids in microalgae. PMID:25263576
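
    A neutral-loss search of the kind this abstract describes can be sketched as follows. This is a minimal illustration and not MS2Analyzer's implementation or interface: it assumes singly charged ions, a fixed absolute m/z tolerance, and a three-entry loss table. The function name, the tolerance, and the example spectrum are hypothetical; the masses are standard monoisotopic values.

```python
# Monoisotopic masses of a few common neutral losses (Da).
NEUTRAL_LOSSES = {
    "H2O": 18.010565,
    "NH3": 17.026549,
    "CO2": 43.989829,
}

def find_neutral_losses(precursor_mz, fragment_mzs, losses=NEUTRAL_LOSSES, tol=0.005):
    """Return (loss name, fragment m/z) pairs where precursor - fragment
    matches a known neutral loss within tol.

    Assumes singly charged precursor and fragment ions.
    """
    hits = []
    for fragment in fragment_mzs:
        delta = precursor_mz - fragment
        for name, mass in losses.items():
            if abs(delta - mass) <= tol:
                hits.append((name, fragment))
    return hits

# Invented spectrum: one fragment sits exactly one water loss below the precursor.
print(find_neutral_losses(200.0, [181.989435, 150.0]))
```

    A match only suggests the associated substructure, and, as the abstract notes, a missing loss does not rule the substructure out.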

  6. Improved hybrid algorithm with Gaussian basis sets and plane waves: First-principles calculations of ethylene adsorption on β-SiC(001)-(3×2)

    NASA Astrophysics Data System (ADS)

    Wieferink, Jürgen; Krüger, Peter; Pollmann, Johannes

    2006-11-01

    We present an algorithm for DFT calculations employing Gaussian basis sets for the wave function and a Fourier basis for the potential representation. In particular, a numerically very efficient calculation of the local potential matrix elements and the charge density is described. Special emphasis is placed on the consequences of periodicity and explicit k-vector dependence. The algorithm is tested by comparison with more straightforward ones for the case of adsorption of ethylene on the silicon-rich SiC(001)-(3×2) surface, clearly revealing its substantial advantages. A complete self-consistency cycle is sped up by roughly one order of magnitude, since the calculation of matrix elements and of the charge density are accelerated by factors of 10 and 80, respectively, as compared to their straightforward calculation. Our results for C2H4:SiC(001)-(3×2) show that ethylene molecules preferentially adsorb in on-top positions above Si dimers on the substrate surface, saturating both dimer dangling bonds per unit cell. In addition, a twist of the molecules around a surface-perpendicular axis is slightly favored energetically, similar to the case of a complete monolayer of ethylene adsorbed on the Si(001)-(2×1) surface.

  7. Investigation of antimicrobial activities, DNA interaction, structural and spectroscopic properties of 2-chloro-6-(trifluoromethyl)pyridine

    NASA Astrophysics Data System (ADS)

    Evecen, Meryem; Kara, Mehmet; Idil, Onder; Tanak, Hasan

    2017-06-01

    2-Chloro-6-(trifluoromethyl)pyridine has been characterized by FT-IR, 1H and 13C NMR experiments. The FT-IR spectrum of the molecule was recorded in the 4000-400 cm-1 region. The molecular structural parameters and vibrational frequencies were computed using the HF and DFT (B3LYP, B3PW91) methods with the 6-31+G(d,p) and 6-311++G(d,p) basis sets. 1H and 13C NMR Gauge Including Atomic Orbital (GIAO) chemical shifts of the compound were calculated using the density functional method (B3LYP) with the 6-311++G(d,p) basis set. The vibrational wavenumbers and chemical shifts were compared with the experimental data of the compound. Using the TD-DFT methodology, electronic absorption spectra of the compound have been computed. In addition, solvent effects on the excitation energies and chemical shifts were evaluated using the integral equation formalism of the polarisable continuum model (IEF-PCM). Mulliken charges, the molecular electrostatic potential (MEP), natural bond orbital (NBO) properties and thermodynamic properties of the compound were also obtained from DFT calculations. Finally, the antimicrobial activities were tested using the minimal inhibitory concentration (MIC) method, and the effect of the molecule on pBR322 plasmid DNA was monitored by agarose gel electrophoresis experiments.

  8. C-Peptide Test

    MedlinePlus

    ... Cancer Therapy Glucose Tests Gonorrhea Testing Gram Stain Growth Hormone Haptoglobin hCG Pregnancy hCG Tumor Marker HDL Cholesterol ... splits apart and forms one molecule of C-peptide and one molecule of insulin . Insulin is the hormone that is vital for the body to use ...

  9. Structures, Bonding, and Energetics of Potential Triatomic Circumstellar Molecules Containing Group 15 and 16 Elements.

    PubMed

    Turner, Walter E; Agarwal, Jay; Schaefer, Henry F

    2015-12-03

    The recent discovery of PN in the oxygen-rich shell of the supergiant star VY Canis Majoris points to the formation of several triatomic molecules involving oxygen, nitrogen, and phosphorus; these are also intriguing targets for main-group synthetic inorganic chemistry. In this research, high-level ab initio electronic structure computations were conducted on the potential circumstellar molecule OPN and several of its heavier group 15 and 16 congeners (SPN, SePN, TePN, OPP, OPAs, and OPSb). For each congener, four isomers were examined. Optimized geometries were obtained with coupled cluster theory [CCSD(T)] using large Dunning basis sets [aug-cc-pVQZ, aug-cc-pV(Q+d)Z, and aug-cc-pVQZ-PP], and relative energies were determined at the complete basis set limit of CCSDT(Q) from focal point analyses. The linear phosphorus-centered molecules were consistently the lowest in energy of the group 15 congeners by at least 6 kcal mol(-1), resulting from double-triple and single-double bond resonances within the molecule. The linear nitrogen-centered molecules were consistently the lowest in energy of the group 16 congeners by at least 5 kcal mol(-1), due to the electronegative central nitrogen atom encouraging electron delocalization throughout the molecule. For OPN, OPP, and SPN, anharmonic vibrational frequencies and vibrationally corrected rotational constants are predicted; good agreement with available experimental data is observed.

  10. Quantitative Lateral Flow Assays for Salivary Biomarker Assessment: A Review

    PubMed Central

    Miočević, Olga; Cole, Craig R.; Laughlin, Mary J.; Buck, Robert L.; Slowey, Paul D.; Shirtcliff, Elizabeth A.

    2017-01-01

    Saliva is an emerging biofluid with a significant number of applications in use across research and clinical settings. The present paper explores the reasons why saliva has grown in popularity in recent years, balancing both the potential strengths and weaknesses of this biofluid. Focusing on reasons why saliva is different from other common biological fluids such as blood, urine, or tears, we review how saliva is easily obtained, with minimal risk to the donor, and reduced costs for collection, transportation, and analysis. We then move on to a brief review of the history and progress in rapid salivary testing, again reviewing the strengths and weaknesses of rapid immunoassays (e.g., lateral flow immunoassay) compared to more traditional immunoassays. We consider the potential for saliva as an alternative biofluid in a setting where rapid results are important. We focus the review on salivary tests for small molecule biomarkers using cortisol as an example. Such salivary tests can be applied readily in a variety of settings and for specific measurement purposes, providing researchers and clinicians with opportunities to assess biomarkers in real time with lower transportation, collection, and analysis costs, faster turnaround time, and minimal training requirements. We conclude with a note of cautious optimism that the field will soon gain the ability to collect and analyze salivary specimens at any location and return viable results within minutes. PMID:28660183

  11. Prediction of new bioactive molecules using a Bayesian belief network.

    PubMed

    Abdo, Ammar; Leclère, Valérie; Jacques, Philippe; Salim, Naomie; Pupin, Maude

    2014-01-27

    Natural products and synthetic compounds are a valuable source of new small molecules leading to novel drugs to cure diseases. However, identifying new biologically active small molecules is still a challenge. In this paper, we introduce a new activity prediction approach using a Bayesian belief network for classification (BBNC). The roots of the network are the fragments composing a compound. The leaves are, on one side, the activities to predict and, on the other side, the unknown compound. The activities are represented by sets of known compounds, and sets of inactive compounds are also used. We calculated a similarity between an unknown compound and each activity class. The most similar activity is assigned to the unknown compound. We applied this new approach to eight well-known data sets extracted from the literature and compared its performance to three classical machine learning algorithms. Experiments showed that BBNC provides interesting prediction rates (from 79% accuracy for highly diverse data sets to 99% for low-diversity ones) with a short calculation time. Experiments also showed that BBNC is particularly effective for homogeneous data sets but performs less well with structurally heterogeneous sets. However, it is important to stress that we believe that using several approaches whenever possible for activity prediction can often give a broader understanding of the data than using only one approach. Thus, BBNC is a useful addition to the computational chemist's toolbox.

  12. Liquid chromatography coupled to quadrupole-time of flight tandem mass spectrometry based quantitative structure-retention relationships of amino acid analogues derivatized via n-propyl chloroformate mediated reaction.

    PubMed

    Kritikos, Nikolaos; Tsantili-Kakoulidou, Anna; Loukas, Yannis L; Dotsikas, Yannis

    2015-07-17

    In the current study, quantitative structure-retention relationships (QSRR) were constructed based on data obtained by an LC-(ESI)-QTOF-MS/MS method for the determination of amino acid analogues, following their derivatization via chloroformate esters. Molecules were derivatized via an n-propyl chloroformate/n-propanol mediated reaction. Derivatives were acquired through a liquid-liquid extraction procedure. Chromatographic separation is based on gradient elution using methanol/water mixtures from a 70/30% composition to an 85/15% final one, maintaining a constant rate of change. The group of examined molecules was diverse, including mainly α-amino acids, yet also β- and γ-amino acids, γ-amino acid analogues, decarboxylated and phosphorylated analogues and dipeptides. The projection to latent structures (PLS) method was selected for the formation of QSRRs, resulting in a total of three PLS models with high cross-validated coefficients of determination Q(2)Y. For this reason, molecular structures were previously described through the use of descriptors. Through stratified random sampling procedures, 57 compounds were split into a training set and a test set. Model creation was based on multiple criteria including principal component significance and eigenvalue, variable importance, form of residuals, etc. Validation was based on statistical metrics Rpred(2), QextF2(2), QextF3(2) for the test set and Roy's metrics rm(Av)(2) and rm(δ)(2), assessing both predictive stability and internal validity. Based on the aforementioned models, simplified equivalents were then created using a multi-linear regression (MLR) method. MLR models were also validated with the same metrics. The suggested models are considered useful for the estimation of retention times of amino acid analogues for a series of applications. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. The molecular gradient using the divide-expand-consolidate resolution of the identity second-order Møller-Plesset perturbation theory: The DEC-RI-MP2 gradient

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bykov, Dmytro; Kristensen, Kasper; Kjærgaard, Thomas

    We report an implementation of the molecular gradient using the divide-expand-consolidate resolution of the identity second-order Møller-Plesset perturbation theory (DEC-RI-MP2). The new DEC-RI-MP2 gradient method combines the precision control as well as the linear-scaling and massively parallel features of the DEC scheme with efficient evaluations of the gradient contributions using the RI approximation. We further demonstrate that the DEC-RI-MP2 gradient method is capable of calculating molecular gradients for very large molecular systems. A test set of supramolecular complexes containing up to 158 atoms and 1960 contracted basis functions has been employed to demonstrate the general applicability of the DEC-RI-MP2 method and to analyze the errors of the DEC approximation. Moreover, the test set contains molecules of complicated electronic structures and is thus deliberately chosen to stress test the DEC-RI-MP2 gradient implementation. Additionally, as a showcase example the full molecular gradient for insulin (787 atoms and 7604 contracted basis functions) has been evaluated.

  14. QSAR models to predict mutagenicity of acrylates, methacrylates and alpha,beta-unsaturated carbonyl compounds.

    PubMed

    Pérez-Garrido, Alfonso; Helguera, Aliuska Morales; Rodríguez, Francisco Girón; Cordeiro, M Natália D S

    2010-05-01

    The purpose of this study is to develop a quantitative structure-activity relationship (QSAR) model that can distinguish mutagenic from non-mutagenic species with alpha,beta-unsaturated carbonyl moiety using two endpoints for this activity - Ames test and mammalian cell gene mutation test - and also to gather information about the molecular features that most contribute to eliminate the mutagenic effects of these chemicals. Two data sets were used for modeling the two mutagenicity endpoints: (1) Ames test and (2) mammalian cells mutagenesis. The first one comprised 220 molecules, while the second one 48 substances, ranging from acrylates, methacrylates to alpha,beta-unsaturated carbonyl compounds. The QSAR models were developed by applying linear discriminant analysis (LDA) along with different sets of descriptors computed using the DRAGON software. For both endpoints, there was a concordance of 89% in the prediction and 97% confidentiality by combining the three models for the Ames test mutagenicity. We have also identified several structural alerts to assist the design of new monomers. These individual models and especially their combination are attractive from the point of view of molecular modeling and could be used for the prediction and design of new monomers that do not pose a human health risk. 2010 Academy of Dental Materials. Published by Elsevier Ltd. All rights reserved.

  15. New insights into the stereochemical requirements of the bradykinin B2 receptor antagonists binding

    NASA Astrophysics Data System (ADS)

    Lupala, Cecylia S.; Gomez-Gutierrez, Patricia; Perez, Juan J.

    2016-01-01

    Bradykinin (BK) is a member of the kinin family, released in response to inflammation, trauma, burns, shock, allergy and some cardiovascular diseases, provoking vasodilatation and increased vascular permeability among other effects. Their actions are mediated through at least two G-protein coupled receptors: B1, a receptor up-regulated during inflammation episodes or tissue trauma, and B2, which is constitutively expressed in a variety of cell types. The goal of the present work is to carry out a structure-activity study of BK B2 antagonism, taking into account the stereochemical features of diverse non-peptide antagonists and the way these features translate into ligand anchoring points to complementary regions of the receptor, through the analysis of the respective ligand-receptor complex. For this purpose an atomistic model of the BK B2 receptor was built by homology modeling and subsequently refined embedded in a lipid bilayer by means of a 600 ns molecular dynamics trajectory. The average structure from the last hundred nanoseconds of the molecular dynamics trajectory was energy minimized and used as a model of the receptor for docking studies. For this purpose, a set of compounds with antagonistic profile, covering maximal diversity, was selected from the literature. Specifically, the set of compounds includes Fasitibant, FR173657, Anatibant, WIN64338, Bradyzide, CHEMBL442294, and JSM10292. Molecules were docked into the BK B2 receptor model and the corresponding complexes analyzed to understand ligand-receptor interactions. The outcome of this study is summarized in a 3D pharmacophore that explains the observed structure-activity results and provides insight into the design of novel molecules with antagonistic profile. To test the validity of the hypothesized pharmacophore, a virtual screening process was also carried out. The pharmacophore was used as a query to identify new hits using diverse databases of molecules. The results of this study revealed a set of new hits with structures not connected to the molecules used for pharmacophore development. A few of these structures were purchased and tested. The results of the binding studies show about a 33% success rate with a correlation between the number of pharmacophore points fulfilled and their antagonistic potency. Some of these structures are disclosed in the present work.

  16. Controlled manipulation of the Co-Alq3 interface by rational design of Alq3 derivatives.

    PubMed

    Großmann, Nicolas; Magri, Andrea; Laux, Martin; Stadtmüller, Benjamin; Thielen, Philip; Schäfer, Bernhard; Fuhr, Olaf; Ruben, Mario; Cinchetti, Mirko; Aeschlimann, Martin

    2016-11-15

    Recently, research has revealed that molecules can be used to steer the local spin properties of ferromagnetic surfaces. One possibility to manipulate ferromagnetic-metal-molecule interfaces in a controlled way is to synthesize specific, non-magnetic molecules to obtain a desired interaction with the ferromagnetic substrate. Here, we have synthesized derivatives of the well-known semiconductor Alq3 (with q = 8-hydroxyquinolinate), in which the 8-hydroxyquinolinate ligands are partially or completely replaced by similar ligands bearing O- or N-donor sets. The goal of this study was to investigate how the presence of (i) different donor atom sets and (ii) aromaticity in different conjugated π-systems influences the spin properties of the metal-molecule interface formed with a Co(100) surface. The spin-dependent metal-molecule-interface properties have been measured by spin-resolved photoemission spectroscopy, backed up by DFT calculations. Overall, our results show that, in the case of the Co-molecule interface, chemical synthesis of organic ligands leads to specific electronic properties of the interface, such as exciton formation or highly spin-polarized interface states. We find that these properties are even additive, i.e. they can be engineered into one single molecular system that incorporates all the relevant ligands.

  17. Comparative investigation of low-molecular-weight fulvic acids of different origin by SEC-Q-TOF-ms: new insights into structure and formation.

    PubMed

    Reemtsma, Thorsten; These, Anja

    2005-05-15

    Size exclusion chromatography (SEC) coupled to electrospray ionization quadrupole time-of-flight mass spectrometry (ESI-Q-TOF-MS) was used to analyze the elemental composition and structure of low-molecular-weight fulvic acid molecules. It is shown that the set of hundreds of individual molecules forms a homogeneous and structurally unique class of compounds that can be clearly differentiated from any other class of biogenic matter investigated to date. The molecular composition of low-molecular-weight fulvic acids in isolates of very different origin (surface water, groundwater, peat) is virtually indistinguishable. Significant and characteristic differences are, however, recognized when qualitative information and quantitative information provided by ESI-Q-TOF-MS are linked to each other. The relative frequency of the various molecules in each mixture can differ significantly, with the peat showing higher intensity of the aromatic and less carboxylated molecules of this set, whereas the aquatic fulvic acids show a strong contribution of the molecules with less aromaticity and a higher carboxylate content. The identity of fulvic acid molecules in isolates of different origin implies that no specific source material is required for fulvic acid formation but that they may be formed from different sources by different oxidative processes.

  18. DFT analysis on the molecular structure, vibrational and electronic spectra of 2-(cyclohexylamino)ethanesulfonic acid.

    PubMed

    Renuga Devi, T S; Sharmi kumar, J; Ramkumaar, G R

    2015-02-25

    The FTIR and FT-Raman spectra of 2-(cyclohexylamino)ethanesulfonic acid were recorded in the regions 4000-400 cm(-1) and 4000-50 cm(-1), respectively. The structural and spectroscopic data of the molecule in the ground state were calculated using the Hartree-Fock and density functional (B3LYP) methods with the correlation consistent-polarized valence double zeta (cc-pVDZ) basis set and the 6-311++G(d,p) basis set. The most stable conformer was optimized and the structural and vibrational parameters were determined based on this. The complete assignments were performed based on the Potential Energy Distribution (PED) of the vibrational modes, calculated using the Vibrational Energy Distribution Analysis (VEDA) 4 program. With the observed FTIR and FT-Raman data, a complete vibrational assignment and analysis of the fundamental modes of the compound were carried out. Thermodynamic properties and atomic charges were calculated using both the Hartree-Fock and density functional methods with the cc-pVDZ basis set and compared. The calculated HOMO-LUMO energy gap revealed that charge transfer occurs within the molecule. (1)H and (13)C NMR chemical shifts of the molecule were calculated using the Gauge Including Atomic Orbital (GIAO) method and were compared with experimental results. The stability of the molecule arising from hyperconjugative interactions and charge delocalization was analyzed using Natural Bond Orbital (NBO) analysis. The first order hyperpolarizability (β) and Molecular Electrostatic Potential (MEP) of the molecule were computed using DFT calculations. Electron density based local reactivity descriptors such as Fukui functions were calculated to explain the chemical reactivity sites in the molecule. Copyright © 2014 Elsevier B.V. All rights reserved.

  19. Fibronectin-based scaffold domain proteins that bind myostatin: a patent evaluation of WO2014043344.

    PubMed

    Walker, Ryan G; Thompson, Thomas B

    2015-05-01

    Muscular dystrophies (MD) are commonly characterized by progressive loss of muscle mass and function. It is hypothesized that therapeutic blockade of the TGF-β ligand myostatin, a negative regulator of muscle mass, will stimulate muscle growth and restore muscle function. Although many anti-myostatin targets are currently being pursued in the clinical setting, the efficacies of the tested molecules have shown mixed results. The patent WO2014043344 describes a novel approach for myostatin inhibition using a modified fibronectin type III domain that could potentially be used to treat MD and other muscle-related pathologies.

  20. Do Practical Standard Coupled Cluster Calculations Agree Better than Kohn–Sham Calculations with Currently Available Functionals When Compared to the Best Available Experimental Data for Dissociation Energies of Bonds to 3d Transition Metals?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, Xuefei; Zhang, Wenjing; Tang, Mingsheng

    2015-05-12

    Coupled-cluster (CC) methods have been extensively used as the high-level approach in quantum electronic structure theory to predict various properties of molecules when experimental results are unavailable. It is often assumed that CC methods, if they include at least up to connected-triple-excitation quasiperturbative corrections to a full treatment of single and double excitations (in particular, CCSD(T)), and a very large basis set, are more accurate than Kohn–Sham (KS) density functional theory (DFT). In the present work, we tested and compared the performance of standard CC and KS methods on bond energy calculations of 20 3d transition metal-containing diatomic molecules against the most reliable experimental data available, as collected in a database called 3dMLBE20. It is found that, although the CCSD(T) and higher-level CC methods have mean unsigned deviations from experiment that are smaller than most exchange-correlation functionals for metal–ligand bond energies of transition metals, the improvement is less than one standard deviation of the mean unsigned deviation. Furthermore, on average, almost half of the 42 exchange-correlation functionals that we tested are closer to experiment than CCSD(T) with the same extended basis set for the same molecule. The results show that, when both relativistic and core–valence correlation effects are considered, even the very high-level (expensive) CC method with single, double, triple, and perturbative quadruple cluster operators, namely, CCSDT(2)Q, averaged over 20 bond energies, gives a mean unsigned deviation (MUD(20) = 4.7 kcal/mol when one correlates only valence, 3p, and 3s electrons of transition metals and only valence electrons of ligands, or 4.6 kcal/mol when one correlates all core electrons except for 1s shells of transition metals, S, and Cl) that is similar to some good xc functionals (e.g., B97-1 (MUD(20) = 4.5 kcal/mol) and PW6B95 (MUD(20) = 4.9 kcal/mol)) when the same basis set is used. We found that, for both coupled cluster calculations and KS calculations, the T1 diagnostics correlate the errors better than either the M diagnostics or the B1 DFT-based diagnostics. The potential use of practical standard CC methods as a benchmark theory is further confounded by the finding that CC and DFT methods usually have different signs of the error. We conclude that the available experimental data do not provide a justification for using conventional single-reference CC theory calculations to validate or test xc functionals for systems involving 3d transition metals.
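
    The mean unsigned deviation (MUD) quoted in this record, and the remark that CC and DFT errors often differ in sign, can be illustrated with a short Python sketch. The function name and the bond energies below are invented for illustration and are not taken from the 3dMLBE20 database.

```python
def deviation_summary(calculated, experimental):
    """Return (mean signed deviation, mean unsigned deviation) of
    calculated values from experimental ones, in the input units."""
    errors = [c - e for c, e in zip(calculated, experimental)]
    n = len(errors)
    msd = sum(errors) / n                    # keeps the sign of the error
    mud = sum(abs(x) for x in errors) / n    # magnitude only
    return msd, mud

# Hypothetical bond energies in kcal/mol (illustrative values only).
expt = [98.0, 55.0, 80.0]
calc = [100.0, 52.0, 78.0]
msd, mud = deviation_summary(calc, expt)
print(f"MSD = {msd:.2f} kcal/mol, MUD = {mud:.2f} kcal/mol")
```

    Two methods can share a similar MUD while having signed deviations of opposite sign, which is why both numbers are worth reporting.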

  1. Performance of machine-learning scoring functions in structure-based virtual screening.

    PubMed

    Wójcikowski, Maciej; Ballester, Pedro J; Siedlecki, Pawel

    2017-04-25

    Classical scoring functions have reached a plateau in their performance in virtual screening and binding affinity prediction. Recently, machine-learning scoring functions trained on protein-ligand complexes have shown great promise in small tailored studies. They have also raised controversy, specifically concerning model overfitting and applicability to novel targets. Here we provide a new ready-to-use scoring function (RF-Score-VS) trained on 15 426 active and 893 897 inactive molecules docked to a set of 102 targets. We use the full DUD-E data sets along with three docking tools, five classical and three machine-learning scoring functions for model building and performance assessment. Our results show RF-Score-VS can substantially improve virtual screening performance: RF-Score-VS top 1% provides a 55.6% hit rate, whereas that of Vina is only 16.2% (for smaller percentages the difference is even more encouraging: RF-Score-VS top 0.1% achieves an 88.6% hit rate versus 27.5% using Vina). In addition, RF-Score-VS provides much better prediction of measured binding affinity than Vina (Pearson correlation of 0.56 and -0.18, respectively). Lastly, we tested RF-Score-VS on an independent test set from the DEKOIS benchmark and observed comparable results. We provide full data sets to facilitate further research in this area (http://github.com/oddt/rfscorevs) as well as ready-to-use RF-Score-VS (http://github.com/oddt/rfscorevs_binary).
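
    The "top x% hit rate" figures in this record can be made concrete with a small Python sketch. It assumes that a higher score means a better-ranked molecule and that the top fraction is rounded to a whole number of molecules; the function name and the toy data are illustrative and do not come from the RF-Score-VS code base.

```python
def hit_rate_at_top(scores, is_active, fraction):
    """Fraction of actives among the top-scoring `fraction` of molecules.
    Assumes higher score = better rank; keeps at least one molecule."""
    n_top = max(1, round(fraction * len(scores)))
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    top = ranked[:n_top]
    return sum(1 for _, active in top if active) / n_top

# Toy screening library of ten molecules (invented values).
scores = [0.90, 0.80, 0.70, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.05]
actives = [True, False, True, False, False, False, False, True, False, False]
print(hit_rate_at_top(scores, actives, 0.2))  # top 2 molecules
print(hit_rate_at_top(scores, actives, 0.5))  # top 5 molecules
```

    For scoring functions where lower is better (docking energies, for example), the sort order would need to be reversed.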

  2. QSAR models for thiophene and imidazopyridine derivatives inhibitors of the Polo-Like Kinase 1.

    PubMed

    Comelli, Nieves C; Duchowicz, Pablo R; Castro, Eduardo A

    2014-10-01

    The inhibitory activity of 103 thiophene and 33 imidazopyridine derivatives against Polo-Like Kinase 1 (PLK1) expressed as pIC50 (-logIC50) was predicted by QSAR modeling. Multivariate linear regression (MLR) was employed to model the relationship between 0D and 3D molecular descriptors and biological activities of molecules using the replacement method (MR) as variable selection tool. The 136 compounds were separated into several training and test sets. Two splitting approaches, distribution of biological data and structural diversity, and the statistical experimental design procedure D-optimal distance were applied to the dataset. The significance of the training set models was confirmed by statistically higher values of the internal leave one out cross-validated coefficient of determination (Q2) and external predictive coefficient of determination for the test set (Rtest2). The model developed from a training set, obtained with the D-optimal distance protocol and using 3D descriptor space along with activity values, separated chemical features that allowed to distinguish high and low pIC50 values reasonably well. Then, we verified that such model was sufficient to reliably and accurately predict the activity of external diverse structures. The model robustness was properly characterized by means of standard procedures and their applicability domain (AD) was analyzed by leverage method. Copyright © 2014 Elsevier B.V. All rights reserved.
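
    The leave-one-out cross-validated coefficient of determination (Q2) mentioned in this record can be sketched in plain Python for the simplest case of a single descriptor. This is an illustration only: the study uses multivariate linear regression with several descriptors, and the definition assumed here is Q2 = 1 - PRESS / SS, with SS taken around the mean of all observed activities. Function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_q2(xs, ys):
    """Leave-one-out Q2: refit without each compound, predict it,
    and compare the summed squared errors (PRESS) with the total
    sum of squares around the overall mean."""
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total

# Hypothetical descriptor values and pIC50 values.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
pic50 = [5.1, 5.9, 7.2, 7.8, 9.1]
print(f"Q2(LOO) = {loo_q2(descriptor, pic50):.3f}")
```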

  3. Extended solvent-contact model approach to blind SAMPL5 prediction challenge for the distribution coefficients of drug-like molecules

    NASA Astrophysics Data System (ADS)

    Chung, Kee-Choo; Park, Hwangseo

    2016-11-01

    The performance of the extended solvent-contact model has been addressed in the SAMPL5 blind prediction challenge for distribution coefficient (LogD) of drug-like molecules with respect to the cyclohexane/water partitioning system. All the atomic parameters defined for 41 atom types in the solvation free energy function were optimized by operating a standard genetic algorithm with respect to water and cyclohexane solvents. In the parameterizations for cyclohexane, the experimental solvation free energy (ΔGsol) data of 15 molecules for 1-octanol were combined with those of 77 molecules for cyclohexane to construct a training set because ΔGsol values of the former were unavailable for cyclohexane in publicly accessible databases. Using this hybrid training set, we established the LogD prediction model with the correlation coefficient (R), average error (AE), and root mean square error (RMSE) of 0.55, 1.53, and 3.03, respectively, for the comparison of experimental and computational results for 53 SAMPL5 molecules. The modest accuracy in LogD prediction could be attributed to the incomplete optimization of atomic solvation parameters for cyclohexane. With respect to 31 SAMPL5 molecules containing the atom types for which experimental reference data for ΔGsol were available for both water and cyclohexane, the accuracy in LogD prediction increased remarkably with the R, AE, and RMSE values of 0.82, 0.89, and 1.60, respectively. This significant enhancement in performance stemmed from the better optimization of atomic solvation parameters by limiting the element of training set to the molecules with experimental ΔGsol data for cyclohexane. Due to the simplicity in model building and to low computational cost for parameterizations, the extended solvent-contact model is anticipated to serve as a valuable computational tool for LogD prediction upon the enrichment of experimental ΔGsol data for organic solvents.
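
    The three accuracy measures reported in this record (R, AE, RMSE) can be computed as in the following Python sketch. It assumes that R is the Pearson correlation coefficient and that "average error" means the mean absolute error; the abstract does not define either term, and the function name and LogD values are invented for illustration.

```python
import math

def regression_metrics(experimental, predicted):
    """Return (Pearson R, mean absolute error, RMSE) for paired values."""
    n = len(experimental)
    mean_e = sum(experimental) / n
    mean_p = sum(predicted) / n
    cov = sum((e - mean_e) * (p - mean_p) for e, p in zip(experimental, predicted))
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_e * var_p)
    ae = sum(abs(p - e) for e, p in zip(experimental, predicted)) / n
    rmse = math.sqrt(sum((p - e) ** 2 for e, p in zip(experimental, predicted)) / n)
    return r, ae, rmse

# Hypothetical experimental and predicted LogD values.
logd_expt = [-1.2, 0.4, 1.5, -2.8, 0.9]
logd_pred = [-0.5, 0.1, 2.6, -1.9, 0.2]
r, ae, rmse = regression_metrics(logd_expt, logd_pred)
print(f"R = {r:.2f}, AE = {ae:.2f}, RMSE = {rmse:.2f}")
```

    RMSE is never smaller than the mean absolute error, so a large gap between the two (as in the 1.53 versus 3.03 pair above) points to a few molecules with large errors.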

  4. Improved methods for predicting peptide binding affinity to MHC class II molecules.

    PubMed

    Jensen, Kamilla Kjaergaard; Andreatta, Massimo; Marcatili, Paolo; Buus, Søren; Greenbaum, Jason A; Yan, Zhen; Sette, Alessandro; Peters, Bjoern; Nielsen, Morten

    2018-07-01

    Major histocompatibility complex class II (MHC-II) molecules are expressed on the surface of professional antigen-presenting cells where they display peptides to T helper cells, which orchestrate the onset and outcome of many host immune responses. Understanding which peptides will be presented by the MHC-II molecule is therefore important for understanding the activation of T helper cells and can be used to identify T-cell epitopes. We here present updated versions of two MHC-II-peptide binding affinity prediction methods, NetMHCII and NetMHCIIpan. These were constructed using an extended data set of quantitative MHC-peptide binding affinity data obtained from the Immune Epitope Database covering HLA-DR, HLA-DQ, HLA-DP and H-2 mouse molecules. We show that training with this extended data set improved the performance for peptide binding predictions for both methods. Both methods are publicly available at www.cbs.dtu.dk/services/NetMHCII-2.3 and www.cbs.dtu.dk/services/NetMHCIIpan-3.2. © 2018 John Wiley & Sons Ltd.

  5. Non-perturbative calculation of orbital and spin effects in molecules subject to non-uniform magnetic fields

    NASA Astrophysics Data System (ADS)

    Sen, Sangita; Tellgren, Erik I.

    2018-05-01

    External non-uniform magnetic fields acting on molecules induce non-collinear spin densities and spin-symmetry breaking. This necessitates a general two-component Pauli spinor representation. In this paper, we report the implementation of a general Hartree-Fock method, without any spin constraints, for non-perturbative calculations with finite non-uniform fields. London atomic orbitals are used to ensure faster basis convergence as well as invariance under constant gauge shifts of the magnetic vector potential. The implementation has been applied to investigate the joint orbital and spin response to a field gradient—quantified through the anapole moments—of a set of small molecules. The relative contributions of orbital and spin-Zeeman interaction terms have been studied both theoretically and computationally. Spin effects are stronger and show a general paramagnetic behavior for closed shell molecules while orbital effects can have either direction. Basis set convergence and size effects of anapole susceptibility tensors have been reported. The relation of the mixed anapole susceptibility tensor to chirality is also demonstrated.

  6. DFT study of the effect of substituents on the absorption and emission spectra of Indigo

    PubMed Central

    2012-01-01

    Background: Theoretical analyses of the indigo dye molecule and its derivatives with Chlorine (Cl), Sulfur (S), Selenium (Se) and Bromine (Br) substituents, as well as an analysis of the Hemi-Indigo molecule, were performed using the Gaussian 03 software package. Results: Calculations were performed within the framework of density functional theory (DFT) with the Becke 3-parameter Lee-Yang-Parr (B3LYP) functional and the 6-31G(d,p) basis set. The configuration interaction singles (CIS) method with the same basis set was employed for the analysis of excited states and for the acquisition of the emission spectra. Conclusions: The presented absorption and emission spectra were affected by the substitution position. When a hydrogen atom of the molecule was substituted by Cl or Br, practically no change in the absorbed and emitted energies relative to those of the indigo molecule was observed; however, when N was substituted by S or Se, the absorbed and emitted energies increased. PMID:22809100

  7. Imaging and sizing of single DNA molecules on a mobile phone.

    PubMed

    Wei, Qingshan; Luo, Wei; Chiang, Samuel; Kappel, Tara; Mejia, Crystal; Tseng, Derek; Chan, Raymond Yan Lok; Yan, Eddie; Qi, Hangfei; Shabbir, Faizan; Ozkan, Haydar; Feng, Steve; Ozcan, Aydogan

    2014-12-23

    DNA imaging techniques using optical microscopy have found numerous applications in biology, chemistry and physics and are based on relatively expensive, bulky and complicated set-ups that limit their use to advanced laboratory settings. Here we demonstrate imaging and length quantification of single molecule DNA strands using a compact, lightweight and cost-effective fluorescence microscope installed on a mobile phone. In addition to an optomechanical attachment that creates a high contrast dark-field imaging setup using an external lens, thin-film interference filters, a miniature dovetail stage and a laser-diode for oblique-angle excitation, we also created a computational framework and a mobile phone application connected to a server back-end for measurement of the lengths of individual DNA molecules that are labeled and stretched using disposable chips. Using this mobile phone platform, we imaged single DNA molecules of various lengths to demonstrate a sizing accuracy of <1 kilobase-pairs (kbp) for 10 kbp and longer DNA samples imaged over a field-of-view of ∼2 mm².

  8. Magnetic field dependent electronic transport of Mn4 single-molecule magnet.

    NASA Astrophysics Data System (ADS)

    Haque, F.; Langhirt, M.; Henderson, J. J.; Del Barco, E.; Taguchi, T.; Christou, G.

    2010-03-01

    We have performed single-electron transport measurements on a Mn4 single-molecule magnet (SMM) in which amino groups were added to electrically protect the magnetic core and to increase the stability of the molecule when deposited on the single-electron transistor (SET) chip. A three-terminal SET with nano-gap electro-migrated gold electrodes and a naturally oxidized aluminum back gate was used. Experiments were conducted at temperatures down to 230 mK in the presence of high magnetic fields generated by a superconducting vector magnet. Mn4 molecules were deposited from solution to form a mono-layer. The optimum deposition time was determined by AFM analysis on atomically flat gold surfaces. We have observed Coulomb blockade and electronic excitations that curve with the magnetic field and present zero-field splitting, which represents evidence of magnetic anisotropy. Level anticrossings and large excitation slopes are associated with the behavior of molecular states with high spin values (S ~ 9), as expected from Mn4.

  9. Performance of Deep and Shallow Neural Networks, the Universal Approximation Theorem, Activity Cliffs, and QSAR.

    PubMed

    Winkler, David A; Le, Tu C

    2017-01-01

    Neural networks have generated valuable Quantitative Structure-Activity/Property Relationships (QSAR/QSPR) models for a wide variety of small molecules and materials properties. They have grown in sophistication and many of their initial problems have been overcome by modern mathematical techniques. QSAR studies have almost always used so-called "shallow" neural networks in which there is a single hidden layer between the input and output layers. Recently, a new and potentially paradigm-shifting type of neural network based on Deep Learning has appeared. Deep learning methods have generated impressive improvements in image and voice recognition, and are now being applied to QSAR and QSPR modelling. This paper describes the differences in approach between deep and shallow neural networks, compares their abilities to predict the properties of test sets for 15 large drug data sets (the kaggle set), discusses the results in terms of the Universal Approximation theorem for neural networks, and describes how deep neural networks (DNN) may ameliorate or remove troublesome "activity cliffs" in QSAR data sets. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. A Novel Low-Molecular-Weight Compound Enhances Ectopic Bone Formation and Fracture Repair

    PubMed Central

    Wong, Eugene; Sangadala, Sreedhara; Boden, Scott D.; Yoshioka, Katsuhito; Hutton, William C.; Oliver, Colleen; Titus, Louisa

    2013-01-01

    Background: Use of recombinant human bone morphogenetic protein-2 (rhBMP-2) is expensive and may cause local side effects. A small synthetic molecule, SVAK-12, has recently been shown in vitro to potentiate rhBMP-2-induced transdifferentiation of myoblasts into the osteoblastic phenotype. The aims of this study were to test the ability of SVAK-12 to enhance bone formation in a rodent ectopic model and to test whether a single percutaneous injection of SVAK-12 can accelerate callus formation in a rodent femoral fracture model. Methods: Collagen disks with rhBMP-2 alone or with rhBMP-2 and SVAK-12 were implanted in a standard athymic rat chest ectopic model, and radiographic analysis was performed at four weeks. In a second set of rats (Sprague-Dawley), SVAK-12 was percutaneously injected into the site of a closed femoral fracture. The fractures were analyzed radiographically and biomechanically (with torsional testing) five weeks after surgery. Results: In the ectopic model, there was dose-dependent enhancement of rhBMP-2 activity with use of SVAK-12 at doses of 100 to 500 μg. In the fracture model, the SVAK-12-treated group had significantly higher radiographic healing scores than the untreated group (p = 0.028). Biomechanical testing revealed that the fractured femora in the 200 to 250-μg SVAK-12 group were 43% stronger (p = 0.008) and 93% stiffer (p = 0.014) than those in the control group. In summary, at five weeks the femoral fracture group injected with SVAK-12 showed significantly improved radiographic and biomechanical evidence of healing compared with the controls. Conclusions: A single local dose of a low-molecular-weight compound, SVAK-12, enhanced bone-healing in the presence of low-dose exogenous rhBMP-2 (in the ectopic model) and endogenous rhBMPs (in the femoral fracture model). Clinical Relevance: This study demonstrates that rhBMP-2 responsiveness can be enhanced by a novel small molecule, SVAK-12. 
Local application of anabolic small molecules has the potential for potentiating and accelerating fracture-healing. Use of this small molecule to lower required doses of rhBMPs might both decrease their cost and improve their safety profile. PMID:23467869

  11. Highly efficient implementation of pseudospectral time-dependent density-functional theory for the calculation of excitation energies of large molecules.

    PubMed

    Cao, Yixiang; Hughes, Thomas; Giesen, Dave; Halls, Mathew D; Goldberg, Alexander; Vadicherla, Tati Reddy; Sastry, Madhavi; Patel, Bhargav; Sherman, Woody; Weisman, Andrew L; Friesner, Richard A

    2016-06-15

    We have developed and implemented pseudospectral time-dependent density-functional theory (TDDFT) in the quantum mechanics package Jaguar to calculate restricted singlet and restricted triplet, as well as unrestricted excitation energies with either full linear response (FLR) or the Tamm-Dancoff approximation (TDA). Pseudospectral length scales, pseudospectral atomic corrections, and a pseudospectral multigrid strategy are included in the implementations to improve the chemical accuracy and to speed up the pseudospectral calculations. The calculations based on pseudospectral time-dependent density-functional theory with full linear response (PS-FLR-TDDFT) and within the Tamm-Dancoff approximation (PS-TDA-TDDFT) for G2 set molecules using B3LYP/6-31G*(*) show mean and maximum absolute deviations of 0.0015 eV and 0.0081 eV, 0.0007 eV and 0.0064 eV, 0.0004 eV and 0.0022 eV for restricted singlet excitation energies, restricted triplet excitation energies, and unrestricted excitation energies, respectively, compared with the results calculated from the conventional spectral method. The application of PS-FLR-TDDFT to OLED molecules and organic dyes, as well as comparisons between results calculated with PS-FLR-TDDFT and best estimates, demonstrates the accuracy of both PS-FLR-TDDFT and PS-TDA-TDDFT. Calculations for a set of medium-sized molecules, including Cn fullerenes and nanotubes, using the B3LYP functional and 6-31G(**) basis set show PS-TDA-TDDFT provides 19- to 34-fold speedups for Cn fullerenes with 450-1470 basis functions, 11- to 32-fold speedups for nanotubes with 660-3180 basis functions, and 9- to 16-fold speedups for organic molecules with 540-1340 basis functions compared to fully analytic calculations without sacrificing chemical accuracy.
The calculations on a set of larger molecules, including the antibiotic drug Ramoplanin, the 46-residue crambin protein, fullerenes up to C540 and nanotubes up to 14×(6,6), using the B3LYP functional and 6-31G(**) basis set with up to 8100 basis functions show that PS-FLR-TDDFT CPU time scales as N^2.05 with the number of basis functions N. © 2016 Wiley Periodicals, Inc.

  12. Parameterization of DFTB3/3OB for Sulfur and Phosphorus for Chemical and Biological Applications

    PubMed Central

    2015-01-01

    We report the parametrization of the approximate density functional tight binding method, DFTB3, for sulfur and phosphorus. The parametrization is done in a framework consistent with our previous 3OB set established for O, N, C, and H, so the resulting parameters can be used to describe a broad set of organic and biologically relevant molecules. The 3d orbitals are included in the parametrization, and the electronic parameters are chosen to minimize errors in the atomization energies. The parameters are tested using a fairly diverse set of molecules of biological relevance, focusing on the geometries, reaction energies, proton affinities, and hydrogen bonding interactions of these molecules; vibrational frequencies are also examined, although less systematically. The results of DFTB3/3OB are compared to those from DFT (B3LYP and PBE), ab initio (MP2, G3B3), and several popular semiempirical methods (PM6 and PDDG), as well as predictions of DFTB3 with the older parametrization (the MIO set). In general, DFTB3/3OB is a major improvement over the previous parametrization (DFTB3/MIO), and in the majority of cases tested here, it also outperforms PM6 and PDDG, especially for structural properties, vibrational frequencies, hydrogen bonding interactions, and proton affinities. For reaction energies, DFTB3/3OB exhibits major improvement over DFTB3/MIO, due mainly to significant reduction of errors in atomization energies; compared to PM6 and PDDG, DFTB3/3OB also generally performs better, although the magnitude of improvement is more modest. Compared to high-level calculations, DFTB3/3OB is most successful at predicting geometries; larger errors are found in the energies, although the results can be greatly improved by computing single point energies at a high level with DFTB3 geometries.
There are several remaining issues with the DFTB3/3OB approach, most notably its difficulty in describing phosphate hydrolysis reactions involving a change in the coordination number of the phosphorus, for which a specific parametrization (3OB/OPhyd) is developed as a temporary solution; this suggests that the current DFTB3 methodology has limited transferability for complex phosphorus chemistry at the level of accuracy required for detailed mechanistic investigations. Therefore, fundamental improvements in the DFTB3 methodology are needed for a reliable method that describes phosphorus chemistry without ad hoc parameters. Nevertheless, DFTB3/3OB is expected to be a competitive QM method in QM/MM calculations for studying phosphorus/sulfur chemistry in condensed phase systems, especially as a low-level method that drives the sampling in a dual-level QM/MM framework. PMID:24803865

  13. Extensive database of liquid phase diffusion coefficients of some frequently used test molecules in reversed-phase liquid chromatography and hydrophilic interaction liquid chromatography.

    PubMed

    Song, Huiying; Vanderheyden, Yoachim; Adams, Erwin; Desmet, Gert; Cabooter, Deirdre

    2016-07-15

    Diffusion plays an important role in all aspects of band broadening in chromatography. An accurate knowledge of molecular diffusion coefficients in different mobile phases is therefore crucial in fundamental column performance studies. Correlations available in literature, such as the Wilke-Chang equation, can provide good approximations of molecular diffusion under reversed-phase conditions. However, these correlations have been demonstrated to be less accurate for mobile phases containing a large percentage of acetonitrile, as is the case in hydrophilic interaction liquid chromatography. A database of experimentally measured molecular diffusion coefficients of some 45 polar and apolar compounds that are frequently used as test molecules under hydrophilic interaction liquid chromatography and reversed-phase conditions is therefore presented. Special attention is given to diffusion coefficients of polar compounds obtained in large percentages of acetonitrile (>90%). The effect of the buffer concentration (5-10 mM ammonium acetate) on the obtained diffusion coefficients is investigated and is demonstrated to mainly influence the molecular diffusion of charged molecules. Diffusion coefficients are measured using the Taylor-Aris method and hence deduced from the peak broadening of a solute when flowing through a long open tube. The validity of the set-up employed for the measurement of the diffusion coefficients is demonstrated by ruling out the occurrence of longitudinal diffusion, secondary flow interactions and extra-column effects, while it is also shown that radial equilibration in the 15 m long capillary is effective. Copyright © 2016 Elsevier B.V. All rights reserved.
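
    As a rough illustration of how a diffusion coefficient follows from peak broadening in an open tube, the sketch below uses the commonly quoted long-time Taylor-Aris relation D = r² t_R / (24 σ_t²), which assumes that longitudinal diffusion is negligible and that the peak is Gaussian. The function names and all numerical inputs are hypothetical and are not taken from the record above.

```python
import math

def sigma_from_fwhm(w_half_s):
    """Standard deviation (s) of a Gaussian peak from its full width at half maximum."""
    return w_half_s / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def taylor_aris_diffusion(radius_m, t_r_s, sigma_t_s):
    """Long-time Taylor-Aris estimate: D = r^2 * t_R / (24 * sigma_t^2)."""
    return radius_m ** 2 * t_r_s / (24.0 * sigma_t_s ** 2)

# Hypothetical inputs chosen only for illustration.
radius = 50e-6    # capillary inner radius, m
t_r = 1800.0      # residence time of the peak, s
w_half = 60.0     # peak full width at half maximum, s

sigma = sigma_from_fwhm(w_half)
d_m = taylor_aris_diffusion(radius, t_r, sigma)

# Dimensionless residence time; values well above 1 indicate that
# radial equilibration has had time to occur.
tau = d_m * t_r / radius ** 2

print(f"D = {d_m:.3e} m^2/s, tau = {tau:.1f}")
```

    Expressed directly in terms of the half-height width, the same relation reads D = (ln 2 / 3) r² t_R / w_h².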

  14. Identification of critical chemical features for Aurora kinase-B inhibitors using Hip-Hop, virtual screening and molecular docking

    NASA Astrophysics Data System (ADS)

    Sakkiah, Sugunadevi; Thangapandian, Sundarapandian; John, Shalini; Lee, Keun Woo

    2011-01-01

    This study was performed to find the selective chemical features for Aurora kinase-B inhibitors using methods such as Hip-Hop, virtual screening, homology modeling, molecular dynamics and docking. The best hypothesis, Hypo1, was validated against a wide-ranging test set containing the selective inhibitors of Aurora kinase-B. Homology modeling and molecular dynamics studies were carried out to prepare the protein structure for the molecular docking studies. Hypo1 was used as a 3D query to screen the chemical databases. The screened molecules from the databases were sorted based on ADME and drug-like properties. The selective hit compounds were docked, and the hydrogen bond interactions with the critical amino acids present in Aurora kinase-B were compared with the chemical features present in Hypo1. Finally, we suggest that the chemical features present in Hypo1 are vital for a molecule to inhibit the Aurora kinase-B activity.

  15. In-Vivo Real-Time Control of Protein Expression from Endogenous and Synthetic Gene Networks

    PubMed Central

    Orabona, Emanuele; De Stefano, Luca; Ferry, Mike; Hasty, Jeff; di Bernardo, Mario; di Bernardo, Diego

    2014-01-01

    We describe an innovative experimental and computational approach to control the expression of a protein in a population of yeast cells. We designed a simple control algorithm to automatically regulate the administration of inducer molecules to the cells by comparing the actual protein expression level in the cell population with the desired expression level. We then built an automated platform based on a microfluidic device, a time-lapse microscopy apparatus, and a set of motorized syringes, all controlled by a computer. We tested the platform to force yeast cells to express a desired fixed, or time-varying, amount of a reporter protein over thousands of minutes. The computer automatically switched the type of sugar administered to the cells, its concentration and its duration, according to the control algorithm. Our approach can be used to control expression of any protein, fused to a fluorescent reporter, provided that an external molecule known to (indirectly) affect its promoter activity is available. PMID:24831205
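
    The feedback idea in this record, comparing the measured expression level with the desired one and switching the inducer accordingly, can be sketched as follows. The abstract does not state which control law was used, so the on/off (relay) rule, the first-order toy model and every parameter value here are assumptions made purely for illustration.

```python
def relay_controller(measured, target):
    """Hypothetical on/off rule: supply the inducer while expression is below target."""
    return measured < target

def simulate(target, steps=600, dt=1.0, production=0.05, decay=0.02):
    """Toy first-order model (an assumption): dx/dt = production * u - decay * x."""
    x = 0.0
    history = []
    for _ in range(steps):
        u = 1.0 if relay_controller(x, target) else 0.0  # inducer on or off
        x += dt * (production * u - decay * x)           # explicit Euler step
        history.append(x)
    return history

trace = simulate(target=1.0)
print(f"final expression level = {trace[-1]:.3f}")
```

    With these toy parameters the level climbs to the target and then oscillates in a narrow band around it, which is the qualitative behaviour expected of a relay controller.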

  16. Efficient calculation of beyond RPA correlation energies in the dielectric matrix formalism

    NASA Astrophysics Data System (ADS)

    Beuerle, Matthias; Graf, Daniel; Schurkus, Henry F.; Ochsenfeld, Christian

    2018-05-01

    We present efficient methods to calculate beyond random phase approximation (RPA) correlation energies for molecular systems with up to 500 atoms. To reduce the computational cost, we employ the resolution-of-the-identity and a double-Laplace transform of the non-interacting polarization propagator in conjunction with an atomic orbital formalism. Further improvements are achieved using integral screening and the introduction of Cholesky decomposed densities. Our methods are applicable to the dielectric matrix formalism of RPA including second-order screened exchange (RPA-SOSEX), the RPA electron-hole time-dependent Hartree-Fock (RPA-eh-TDHF) approximation, and RPA renormalized perturbation theory using an approximate exchange kernel (RPA-AXK). We give an application of our methodology by presenting RPA-SOSEX benchmark results for the L7 test set of large, dispersion dominated molecules, yielding a mean absolute error below 1 kcal/mol. The present work enables calculating beyond RPA correlation energies for significantly larger molecules than possible to date, thereby extending the applicability of these methods to a wider range of chemical systems.

  17. Efficient anharmonic vibrational spectroscopy for large molecules using local-mode coordinates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cheng, Xiaolu; Steele, Ryan P., E-mail: ryan.steele@utah.edu

    This article presents a general computational approach for efficient simulations of anharmonic vibrational spectra in chemical systems. An automated local-mode vibrational approach is presented, which borrows techniques from localized molecular orbitals in electronic structure theory. This approach generates spatially localized vibrational modes, in contrast to the delocalization exhibited by canonical normal modes. The method is rigorously tested across a series of chemical systems, ranging from small molecules to large water clusters and a protonated dipeptide. It is interfaced with exact, grid-based approaches, as well as vibrational self-consistent field methods. Most significantly, this new set of reference coordinates exhibits a well-behaved spatial decay of mode couplings, which allows for a systematic, a priori truncation of mode couplings and increased computational efficiency. Convergence can typically be reached by including modes within only about 4 Å. The local nature of this truncation suggests particular promise for the ab initio simulation of anharmonic vibrational motion in large systems, where connection to experimental spectra is currently most challenging.

  18. QuickVina: accelerating AutoDock Vina using gradient-based heuristics for global optimization.

    PubMed

    Handoko, Stephanus Daniel; Ouyang, Xuchang; Su, Chinh Tran To; Kwoh, Chee Keong; Ong, Yew Soon

    2012-01-01

    Predicting binding between a macromolecule and a small molecule is a crucial phase in the field of rational drug design. AutoDock Vina, one of the most widely used docking programs, released in 2009, uses an empirical scoring function to evaluate the binding affinity between the molecules and employs the iterated local search global optimizer for global optimization, achieving a significantly improved speed and better accuracy of the binding mode prediction compared with its predecessor, AutoDock 4. In this paper, we propose further improvement in the local search algorithm of Vina by heuristically preventing some intermediate points from undergoing local search. Our improved version of Vina, dubbed QVina, achieved a maximum acceleration of about 25 times with an average speed-up of 8.34 times compared to the original Vina when tested on a set of 231 protein-ligand complexes, while keeping the optimal scores mostly identical. Using our heuristics, a larger number of different ligands can be quickly screened against a given receptor within the same time frame.

  19. Solubility of organic compounds in octanol: Improved predictions based on the geometrical fragment approach.

    PubMed

    Mathieu, Didier

    2017-09-01

    Two new models are introduced to predict the solubility of chemicals in octanol (S_oct), taking advantage of the extensive character of log(S_oct) through a decomposition of molecules into so-called geometrical fragments (GF). They are extensively validated and their compliance with regulatory requirements is demonstrated. The first model requires just a molecular formula as input. Despite an extreme simplicity, it performs as well as an advanced random forest model involving 86 descriptors, with a root mean square error (RMSE) of 0.64 log units for an external test set of 100 molecules. For the second one, which requires the melting point T_m as input, introducing GF descriptors reduces the RMSE from about 0.7 to <0.5 log units, a performance that could previously be obtained only through the use of Abraham descriptors. A script is provided for easy application of the models, taking into account the limits of their applicability domains. Copyright © 2017 Elsevier Ltd. All rights reserved.
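
    To make the error measure concrete, the sketch below computes an RMSE in log units and converts such an error into a typical multiplicative factor on the solubility itself. The observed and predicted values are invented for illustration and are not data from the record above.

```python
import math

def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical log(S_oct) values, for illustration only.
observed = [-0.50, 0.10, -1.20, 0.80]
predicted = [-0.10, 0.40, -1.90, 0.30]

error_log_units = rmse(predicted, observed)

# An error of e log units corresponds to a factor of 10**e on S_oct.
factor = 10 ** error_log_units
print(f"RMSE = {error_log_units:.3f} log units, typical factor = {factor:.2f}")
```

    By the same conversion, the 0.64 log units quoted above corresponds to a typical error of roughly a factor of four in the solubility.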

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Noe, F; Diadone, Isabella; Lollmann, Marc

    There is a gap between kinetic experiment and simulation in their views of the dynamics of complex biomolecular systems. Whereas experiments typically reveal only a few readily discernible exponential relaxations, simulations often indicate complex multistate behavior. Here, a theoretical framework is presented that reconciles these two approaches. The central concept is dynamical fingerprints which contain peaks at the time scales of the dynamical processes involved with amplitudes determined by the experimental observable. Fingerprints can be generated from both experimental and simulation data, and their comparison by matching peaks permits assignment of structural changes present in the simulation to experimentally observed relaxation processes. The approach is applied here to a test case interpreting single molecule fluorescence correlation spectroscopy experiments on a set of fluorescent peptides with molecular dynamics simulations. The peptides exhibit complex kinetics shown to be consistent with the apparent simplicity of the experimental data. Moreover, the fingerprint approach can be used to design new experiments with site-specific labels that optimally probe specific dynamical processes in the molecule under investigation.

  1. Broth Microdilution In Vitro Screening: An Easy and Fast Method to Detect New Antifungal Compounds.

    PubMed

    de-Souza-Silva, Calliandra Maria; Guilhelmelli, Fernanda; Zamith-Miranda, Daniel; de Oliveira, Marco Antônio; Nosanchuk, Joshua Daniel; Silva-Pereira, Ildinete; Albuquerque, Patrícia

    2018-02-14

    Fungal infections have become an important medical condition in the last decades, but the number of available antifungal drugs is limited. In this scenario, the search for new antifungal drugs is necessary. The protocol reported here details a method to screen peptides for their antifungal properties. It is based on the broth microdilution susceptibility test from the Clinical and Laboratory Standards Institute (CLSI) M27-A3 guidelines with modifications to suit the research of antimicrobial peptides as potential new antifungals. This protocol describes a functional assay to evaluate the activity of antifungal compounds and may be easily modified to suit any particular class of molecules under investigation. Since the assays are performed in 96-well plates using small volumes, a large-scale screening can be completed in a short amount of time, especially if carried out in an automation setting. This procedure illustrates how a standardized and adjustable clinical protocol can help the bench-work pursuit of new molecules to improve the therapy of fungal diseases.

  2. Rank Order Entropy: why one metric is not enough

    PubMed Central

    McLellan, Margaret R.; Ryan, M. Dominic; Breneman, Curt M.

    2011-01-01

    The use of Quantitative Structure-Activity Relationship models to address problems in drug discovery has a mixed history, generally resulting from the misapplication of QSAR models that were either poorly constructed or used outside of their domains of applicability. This situation has motivated the development of a variety of model performance metrics (r², PRESS r², F-tests, etc.) designed to increase user confidence in the validity of QSAR predictions. In a typical workflow scenario, QSAR models are created and validated on training sets of molecules using metrics such as Leave-One-Out or many-fold cross-validation methods that attempt to assess their internal consistency. However, few current validation methods are designed to directly address the stability of QSAR predictions in response to changes in the information content of the training set. Since the main purpose of QSAR is to quickly and accurately estimate a property of interest for an untested set of molecules, it makes sense to have a means at hand to correctly set user expectations of model performance. In fact, the numerical value of a molecular prediction is often less important to the end user than knowing the rank order of that set of molecules according to their predicted endpoint values. Consequently, a means for characterizing the stability of predicted rank order is an important component of predictive QSAR. Unfortunately, none of the many validation metrics currently available directly measures the stability of rank order prediction, making the development of an additional metric that can quantify model stability a high priority. To address this need, this work examines the stabilities of QSAR rank order models created from representative data sets, descriptor sets, and modeling methods that were then assessed using Kendall Tau as a rank order metric, upon which the Shannon Entropy was evaluated as a means of quantifying rank-order stability.
Random removal of data from the training set, also known as Data Truncation Analysis (DTA), was used as a means for systematically reducing the information content of each training set while examining both rank order performance and rank order stability in the face of training set data loss. The premise for DTA ROE model evaluation is that the response of a model to incremental loss of training information will be indicative of the quality and sufficiency of its training set, learning method, and descriptor types to cover a particular domain of applicability. This process is termed a “rank order entropy” evaluation, or ROE. By analogy with information theory, an unstable rank order model displays a high level of implicit entropy, while a QSAR rank order model which remains nearly unchanged during training set reductions would show low entropy. In this work, the ROE metric was applied to 71 data sets of different sizes, and was found to reveal more information about the behavior of the models than traditional metrics alone. Stable, or consistently performing models, did not necessarily predict rank order well. Models that performed well in rank order did not necessarily perform well in traditional metrics. In the end, it was shown that ROE metrics suggested that some QSAR models that are typically used should be discarded. ROE evaluation helps to discern which combinations of data set, descriptor set, and modeling methods lead to usable models in prioritization schemes, and provides confidence in the use of a particular model within a specific domain of applicability. PMID:21875058
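
    A minimal sketch of the general idea, under several assumptions: training data are randomly truncated, a model is refit on each subset, Kendall tau measures how well each refit model ranks a fixed test set, and the Shannon entropy of the resulting tau histogram summarizes how stable that ranking is. The abstract does not give the binning scheme, the truncation schedule or the modeling methods, so the 1-nearest-neighbour toy model, the synthetic data and the 10-bin histogram below are illustrative choices and not the authors' procedure.

```python
import math
import random
from itertools import combinations

def kendall_tau(a, b):
    """Kendall tau-a between two equal-length score lists (tied pairs count as zero)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        s = (a[i] - a[j]) * (b[i] - b[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs

def shannon_entropy(values, bins=10, lo=-1.0, hi=1.0):
    """Shannon entropy in bits of a fixed-width histogram of values on [lo, hi]."""
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def one_nn_predict(train, x):
    """Toy model: return the response of the closest training descriptor."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(60)]
train = [(x, 2.0 * x + rng.gauss(0.0, 0.5)) for x in xs]  # synthetic training set
test = [(0.5 + i, 2.0 * (0.5 + i)) for i in range(10)]    # synthetic test set

taus = []
for _ in range(200):
    subset = rng.sample(train, 30)  # randomly drop half of the training data
    preds = [one_nn_predict(subset, x) for x, _ in test]
    taus.append(kendall_tau(preds, [y for _, y in test]))

roe = shannon_entropy(taus)
print(f"mean tau = {sum(taus) / len(taus):.3f}, entropy = {roe:.3f} bits")
```

    In this toy setting, a model whose ranking barely changes under truncation concentrates its tau values in a few bins and gives a low entropy, while an unstable one spreads them out.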

  3. Predictions of BuChE inhibitors using support vector machine and naive Bayesian classification techniques in drug discovery.

    PubMed

    Fang, Jiansong; Yang, Ranyao; Gao, Li; Zhou, Dan; Yang, Shengqian; Liu, Ai-Lin; Du, Guan-hua

    2013-11-25

    Butyrylcholinesterase (BuChE, EC 3.1.1.8) is an important pharmacological target for Alzheimer's disease (AD) treatment. However, the currently available BuChE inhibitor screening assays are expensive, labor-intensive, and compound-dependent. It is necessary to develop robust in silico methods to predict the activities of BuChE inhibitors for lead identification. In this investigation, support vector machine (SVM) models and naive Bayesian models were built to discriminate BuChE inhibitors (BuChEIs) from the noninhibitors. Each molecule was initially represented by 1870 structural descriptors (1235 from ADRIANA.Code, 334 from MOE, and 301 from Discovery Studio). Correlation analysis and a stepwise variable selection method were applied to identify activity-related descriptors for the prediction models. Additionally, structural fingerprint descriptors were added to improve the predictive ability of the models, which was measured by cross-validation, a test set validation with 1001 compounds and an external test set validation with 317 diverse chemicals. The best two models gave Matthews correlation coefficients of 0.9551 and 0.9550 for the test set and 0.9132 and 0.9221 for the external test set. To demonstrate the practical applicability of the models in virtual screening, we screened an in-house data set with 3601 compounds, and 30 compounds were selected for further bioactivity assay. The assay results showed that 10 out of 30 compounds exerted significant BuChE inhibitory activities with IC50 values ranging from 0.32 to 22.22 μM, among which three new scaffolds as BuChE inhibitors were identified for the first time. To the best of our knowledge, this is the first report on BuChE inhibitors using machine learning approaches. The models generated from SVM and naive Bayesian approaches successfully predicted BuChE inhibitors. The study proved the feasibility of a new method for predicting bioactivities of ligands and discovering novel lead compounds.
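
    For readers unfamiliar with the quality measure quoted above, the following sketch computes the Matthews correlation coefficient from the four entries of a binary confusion matrix. The counts in the example are hypothetical and do not reproduce the study's results; returning 0.0 for an empty row or column is a common convention adopted here as an assumption.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

# Hypothetical counts: inhibitors are the positive class.
mcc = matthews_corrcoef(tp=90, tn=95, fp=5, fn=10)
print(f"MCC = {mcc:.4f}")
```

    The coefficient ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (perfect classification), and it remains informative when the two classes differ in size.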

  4. Local-global alignment for finding 3D similarities in protein structures

    DOEpatents

    Zemla, Adam T [Brentwood, CA]

    2011-09-20

    A method of finding 3D similarities in the protein structures of a first molecule and a second molecule is described. The method comprises providing preselected information regarding the first molecule and the second molecule; comparing the two molecules using Longest Continuous Segments (LCS) analysis; comparing them using Global Distance Test (GDT) analysis; comparing them using Local Global Alignment Scoring function (LGA_S) analysis; and verifying the constructed alignment and repeating the steps to find the regions of 3D similarity in the protein structures.
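
    The Global Distance Test step can be illustrated with a simplified score: for two residue-matched, already superimposed coordinate sets, take the fraction of residues lying within 1, 2, 4 and 8 Å of their counterparts and average the four fractions. A full GDT calculation also searches over many superpositions to maximize each count, which this sketch deliberately omits; the function name and the coordinates are hypothetical.

```python
import math

def gdt_ts_fixed_superposition(coords_a, coords_b, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS-style score (0-100) for one fixed superposition of matched residues."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    distances = [math.dist(p, q) for p, q in zip(coords_a, coords_b)]
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(fractions)

# Hypothetical C-alpha positions in angstroms, for illustration only.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
target = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 9.0, 0.0)]

score = gdt_ts_fixed_superposition(model, target)
print(f"GDT_TS-style score = {score:.2f}")
```

    Because only one superposition is evaluated, the value is a lower bound on what a full search over superpositions would report for the same pair of structures.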

  5. Exotic Molecules in Space: A Coordinated Astronomical Laboratory and Theoretical Study

    NASA Technical Reports Server (NTRS)

    Thaddeus, Patrick

    1999-01-01

    The past three years have been a period of great progress in our laboratory investigation of molecules of astrophysical interest, the most productive by far in the 20-year history of a research program which has led to the discovery of over 20% of the 123 known interstellar and circumstellar molecules. Most of the discoveries made during this period have been the result of the construction in late 1995 and early 1996 of a Fourier transform microwave spectrometer working in the centimeter-wave band. The sensitivity of this instrument from the moment that it was turned on has exceeded our expectations by an order of magnitude. The Table below shows the 46 new molecules which have been discovered. Most are carbon chains, the dominant type of molecule which has been found in space. Several comments with respect to these molecules should be made: 1. There are probably no mistakes in any of the identifications, since these have been confirmed by the standard, powerful assays and tests used to check spectroscopic identifications: isotopic substitution, quantum calculations of the expected molecular structures, detection of hyperfine structure, Zeeman effect, etc. 2. The radio laboratory astrophysics of the entire set is complete for the time being, in the sense that essentially all the astronomically interesting radio transitions (including hfs when present) are either directly measured or can now be calculated from the derived spectroscopic constants to better than 1 part per million (or 0.3 km s-1 in radial velocity, and often much better than that). 3. Six of the forty-six new molecules have already been identified in space, in every case but one on the basis of our laboratory measurements. 4. Sensitive as they are, our laboratory techniques are far from fundamental limits on sensitivity, and 5. One of the principal motivations of our research is to close the fairly small mass and size gap, now only a factor of a few, between the smallest postulated interstellar grains and the largest identified interstellar molecules.
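
    The equivalence between 1 part per million in frequency and about 0.3 km s-1 in radial velocity follows from the non-relativistic Doppler relation, in which the velocity uncertainty equals the speed of light times the fractional frequency uncertainty. A minimal sketch of that arithmetic (the function name is an illustrative assumption):

```python
C_KM_S = 299_792.458  # speed of light in km/s


def velocity_uncertainty_km_s(fractional_freq_error):
    """Radial-velocity uncertainty implied by a fractional frequency error.

    Uses the non-relativistic Doppler relation dv = c * (d_nu / nu).
    """
    return C_KM_S * fractional_freq_error


# 1 part per million corresponds to roughly 0.3 km/s.
print(round(velocity_uncertainty_km_s(1e-6), 2))
```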

  6. Machine Learning Model Analysis and Data Visualization with Small Molecules Tested in a Mouse Model of Mycobacterium tuberculosis Infection (2014–2015)

    PubMed Central

    2016-01-01

    The renewed urgency to develop new treatments for Mycobacterium tuberculosis (Mtb) infection has resulted in large-scale phenotypic screening and thousands of new active compounds in vitro. The next challenge is to identify candidates to pursue in a mouse in vivo efficacy model as a step to predicting clinical efficacy. We previously analyzed over 70 years of this mouse in vivo efficacy data, which we used to generate and validate machine learning models. Curation of 60 additional small molecules with in vivo data published in 2014 and 2015 was undertaken to further test these models. This represents a much larger test set than for the previous models. Several computational approaches have now been applied to analyze these molecules and compare their molecular properties beyond those attempted previously. Our previous machine learning models have been updated, and a novel aspect has been added in the form of mouse liver microsomal half-life (MLM t1/2) and in vitro-based Mtb models incorporating cytotoxicity data that were used to predict in vivo activity for comparison. Our best Mtb in vivo models possess fivefold ROC values > 0.7, sensitivity > 80%, and concordance > 60%, while the best specificity value is >40%. Use of an MLM t1/2 Bayesian model affords comparable results for scoring the 60 compounds tested. Combining MLM stability and in vitro Mtb models in a novel consensus workflow in the best cases has a positive predicted value (hit rate) > 77%. Our results indicate that Bayesian models constructed with literature in vivo Mtb data generated by different laboratories in various mouse models can have predictive value and may be used alongside MLM t1/2 and in vitro-based Mtb models to assist in selecting antitubercular compounds with desirable in vivo efficacy. We demonstrate for the first time that consensus models of any kind can be used to predict in vivo activity for Mtb. In addition, we describe a new clustering method for data visualization and apply this to the in vivo training and test data, ultimately making the method accessible in a mobile app. PMID:27335215
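
    The sensitivity, specificity, concordance, and positive predicted value reported above are all simple ratios of confusion-matrix counts. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not the study's data, and the function assumes each denominator is non-zero.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard binary-classification ratios from confusion-matrix counts.

    Assumes at least one actual positive, one actual negative, and one
    predicted positive, so that no denominator is zero.
    """
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),       # actives correctly flagged
        "specificity": tn / (tn + fp),       # inactives correctly rejected
        "ppv": tp / (tp + fp),               # hit rate among predicted actives
        "concordance": (tp + tn) / total,    # overall agreement (accuracy)
    }


# Hypothetical counts for a 20-compound test set.
print(screening_metrics(tp=8, fp=6, tn=4, fn=2))
```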

  7. Multiple reaction monitoring (MRM)-profiling for biomarker discovery applied to human polycystic ovarian syndrome.

    PubMed

    Cordeiro, Fernanda B; Ferreira, Christina R; Sobreira, Tiago Jose P; Yannell, Karen E; Jarmusch, Alan K; Cedenho, Agnaldo P; Lo Turco, Edson G; Cooks, R Graham

    2017-09-15

    We describe multiple reaction monitoring (MRM)-profiling, which provides accelerated discovery of discriminating molecular features, and its application to human polycystic ovary syndrome (PCOS) diagnosis. The discovery phase of the MRM-profiling seeks molecular features based on some prior knowledge of the chemical functional groups likely to be present in the sample. It does this through use of a limited number of pre-chosen and chemically specific neutral loss and/or precursor ion MS/MS scans. The output of the discovery phase is a set of precursor/product transitions. In the screening phase these MRM transitions are used to interrogate multiple samples (hence the name MRM-profiling). MRM-profiling was applied to follicular fluid samples of 22 controls and 29 clinically diagnosed PCOS patients. Representative samples were delivered by flow injection to a triple quadrupole mass spectrometer set to perform a number of pre-chosen and chemically specific neutral loss and/or precursor ion MS/MS scans. The output of this discovery phase was a set of 1012 precursor/product transitions. In the screening phase each individual sample was interrogated for these MRM transitions. Principal component analysis (PCA) and receiver operating characteristic (ROC) curves were used for statistical analysis. To evaluate the method's performance, half the samples were used to build a classification model (testing set) and half were blinded (validation set). Twenty transitions were used for the classification of the blind samples; most of them (N = 19) showed lower abundances in the PCOS group and corresponded to phosphatidylethanolamine (PE) and phosphatidylserine (PS) lipids. Agreement of 73% with clinical diagnosis was found when classifying the 26 blind samples. MRM-profiling is a supervised method characterized by its simplicity, speed and the absence of chromatographic separation. It can be used to rapidly isolate discriminating molecules in healthy/disease conditions by tailored screening of signals associated with hundreds of molecules in complex samples. Copyright © 2017 John Wiley & Sons, Ltd.

  8. Classification of nervous system withdrawn and approved drugs with ToxPrint features via machine learning strategies.

    PubMed

    Onay, Aytun; Onay, Melih; Abul, Osman

    2017-04-01

    Early-phase virtual screening of candidate drug molecules, drawing on data mining and machine learning, plays a key role in the pharmaceutical industry in preventing adverse drug effects. Computational classification methods can distinguish approved drugs from withdrawn ones. We focused on 6 data sets including at most 110 approved and 110 withdrawn drugs, for all diseases and for nervous system diseases, to distinguish approved drugs from withdrawn ones. In this study, we used support vector machines (SVMs) and ensemble methods (EMs) such as boosted and bagged trees to classify drugs into approved and withdrawn categories. Also, we used the CORINA Symphony program to identify ToxPrint chemotypes, including over 700 predefined chemotypes, for the risk and safety assessment of candidate drug molecules. In addition, we studied nervous system withdrawn drugs to determine the key fragments with the ParMol package, which includes the gSpan algorithm. According to our results, the descriptors named the number of total chemotypes and bond CN_amine_aliphatic_generic were the more significant descriptors. The developed Medium Gaussian SVM model reached 78% prediction accuracy on the test set for the drug data set covering all diseases. Here, bagged tree and linear SVM models showed 89% accuracy for psycholeptic and psychoanaleptic drugs. A set of discriminative fragments in nervous system withdrawn drug (NSWD) data sets was obtained. The fragments associated with drugs removed from the market were benzene, toluene, N,N-dimethylethylamine, crotylamine, 5-methyl-2,4-heptadiene, octatriene and the carbonyl group. This paper covers the development of computational classification methods to distinguish approved drugs from withdrawn ones. In addition, the results of this study indicate that the identification of discriminative fragments is important for designing new approved nervous system drugs through interpretation of the structures of the NSWDs. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. In silico study toward the identification of new and safe potential inhibitors of photosynthetic electron transport.

    PubMed

    Ribeiro, Taisa Pereira Piacentini; Manarin, Flávia Giovana; Borges de Melo, Eduardo

    2018-05-30

    To address the rising global demand for food, it is necessary to search for new herbicides that can control resistant weeds. We performed a 2D-quantitative structure-activity relationship (QSAR) study to predict compounds with photosynthesis-inhibitory activity. A data set of 44 compounds (quinolines and naphthalenes), which are described as photosynthetic electron transport (PET) inhibitors, was used. The obtained model was approved in internal and external validation tests. 2D Similarity-based virtual screening was performed and 64 compounds were selected from the ZINC database. By using the VEGA QSAR software, 48 compounds were shown to have potential toxic effects (mutagenicity and carcinogenicity). Therefore, the model was also tested using a set of 16 molecules obtained by a similarity search of the ZINC database. Six compounds showed good predicted inhibition of PET. The obtained model shows potential utility in the design of new PET inhibitors, and the hit compounds found by virtual screening are novel bicyclic scaffolds of this class. Copyright © 2018 Elsevier Inc. All rights reserved.

  10. MPAI (mass probes aided ionization) method for total analysis of biomolecules by mass spectrometry.

    PubMed

    Honda, Aki; Hayashi, Shinichiro; Hifumi, Hiroki; Honma, Yuya; Tanji, Noriyuki; Iwasawa, Naoko; Suzuki, Yoshio; Suzuki, Koji

    2007-01-01

    We have designed and synthesized various mass probes, which enable us to effectively ionize various molecules to be detected with mass spectrometry. We call the ionization method using mass probes the "MPAI (mass probes aided ionization)" method. We aim at the sensitive detection of various biological molecules, and also the detection of bio-molecules by a single mass spectrometry serially without changing the mechanical settings. Here, we review mass probes for small molecules with various functional groups and mass probes for proteins. Further, we introduce newly developed mass probes for proteins for highly sensitive detection.

  11. Solid harmonic wavelet scattering for predictions of molecule properties

    NASA Astrophysics Data System (ADS)

    Eickenberg, Michael; Exarchakis, Georgios; Hirn, Matthew; Mallat, Stéphane; Thiry, Louis

    2018-06-01

    We present a machine learning algorithm for the prediction of molecule properties inspired by ideas from density functional theory (DFT). Using Gaussian-type orbital functions, we create surrogate electronic densities of the molecule from which we compute invariant "solid harmonic scattering coefficients" that account for different types of interactions at different scales. Multilinear regressions of various physical properties of molecules are computed from these invariant coefficients. Numerical experiments show that these regressions have near state-of-the-art performance, even with relatively few training examples. Predictions over small sets of scattering coefficients can reach a DFT precision while being interpretable.

  12. Novel naïve Bayes classification models for predicting the carcinogenicity of chemicals.

    PubMed

    Zhang, Hui; Cao, Zhi-Xing; Li, Meng; Li, Yu-Zhi; Peng, Cheng

    2016-11-01

    Carcinogenicity prediction has become a significant issue for the pharmaceutical industry. The purpose of this investigation was to develop a novel prediction model of the carcinogenicity of chemicals by using a naïve Bayes classifier. The established model was validated by internal 5-fold cross validation and an external test set. The naïve Bayes classifier gave an average overall prediction accuracy of 90 ± 0.8% for the training set and 68 ± 1.9% for the external test set. Moreover, five simple molecular descriptors (e.g., AlogP, molecular weight (MW), No. of H donors, Apol and Wiener) considered as important for the carcinogenicity of chemicals were identified, and some substructures related to carcinogenicity were obtained. Thus, we hope the established naïve Bayes prediction model could be applied to filter early-stage molecules for this potential carcinogenicity adverse effect; and the identified five simple molecular descriptors and substructures of carcinogens would give a better understanding of the carcinogenicity of chemicals, and further provide guidance for medicinal chemists in the design of new candidate drugs and lead optimization, ultimately reducing the attrition rate in later stages of drug development. Copyright © 2016 Elsevier Ltd. All rights reserved.
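
    The internal 5-fold cross validation mentioned in this abstract partitions the training data into five folds and holds each one out in turn. The sketch below only illustrates how such index splits can be generated; the function name, the seed, and the sample count are assumptions, and the study's actual splitting procedure is not described in the abstract.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Indices are shuffled once with a fixed seed, then dealt into k
    folds of near-equal size; every sample is held out exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


# Hypothetical data set of 23 molecules.
for train, test in k_fold_indices(23, k=5):
    print(len(train), len(test))
```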

  13. In search for an optimal methodology to calculate the valence electron affinities of temporary anions.

    PubMed

    Puiatti, Marcelo; Vera, D Mariano A; Pierini, Adriana B

    2009-10-28

    Recently, we have proposed an approach for finding the valence anion ground state, based on the stabilization exerted by a polar solvent; the methodology used standard DFT methods and relatively inexpensive basis sets and yielded correct electron affinity (EA) values by gradually decreasing the dielectric constant of the medium. In order to address the overall performance of the new methodology, to find the best conditions for stabilizing the valence state and to evaluate its scope and limitations, we gathered a pool of 60 molecules, 25 of them bearing the conventional valence state as the ground anion and 35 for which the lowest anion state found holds the extra electron in a diffuse orbital around the molecule (non valence state). The results obtained by testing this representative set suggest a very good performance for most species having an experimental EA less negative than -3.0 eV; the correlation at the B3LYP/6-311+G(2df,p) level being y = 1.01x + 0.06, with a correlation index of 0.985. As an alternative, the time dependent DFT (TD-DFT) approach was also tested with both B3LYP and PBE0 functionals. The methodology we proposed shows a comparable or better accuracy with respect to TD-DFT, although the TD-DFT approach with the PBE0 functional is suggested as a suitable estimate for species with the most negative EAs (ca.-2.5 to -3.5 eV), for which stabilization strategies can hardly reach the valence state. As an application, a pool of 8 compounds of key biological interest with EAs which remain unknown or unclear were predicted using the new methodology.
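
    A correlation such as y = 1.01x + 0.06 with a correlation index of 0.985 is typically obtained from an ordinary least-squares fit of calculated against experimental values. The sketch below shows that calculation in plain Python; the function name and the sample points are hypothetical, and the abstract does not state exactly how the authors performed their fit.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept.

    Also returns the Pearson correlation coefficient r.
    Requires equal-length inputs with non-zero variance in x and y.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical experimental (x) and calculated (y) values in eV.
print(linear_fit([-3.0, -2.0, -1.0, 0.0], [-2.9, -2.0, -0.9, 0.1]))
```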

  14. Predicting Displaceable Water Sites Using Mixed-Solvent Molecular Dynamics.

    PubMed

    Graham, Sarah E; Smith, Richard D; Carlson, Heather A

    2018-02-26

    Water molecules are an important factor in protein-ligand binding. Upon binding of a ligand with a protein's surface, waters can either be displaced by the ligand or may be conserved and possibly bridge interactions between the protein and ligand. Depending on the specific interactions made by the ligand, displacing waters can yield a gain in binding affinity. The extent to which binding affinity may increase is difficult to predict, as the favorable displacement of a water molecule is dependent on the site-specific interactions made by the water and the potential ligand. Several methods have been developed to predict the location of water sites on a protein's surface, but the majority of methods are not able to take into account both protein dynamics and the interactions made by specific functional groups. Mixed-solvent molecular dynamics (MixMD) is a cosolvent simulation technique that explicitly accounts for the interaction of both water and small molecule probes with a protein's surface, allowing for their direct competition. This method has previously been shown to identify both active and allosteric sites on a protein's surface. Using a test set of eight systems, we have developed a method using MixMD to identify conserved and displaceable water sites. Conserved sites can be determined by an occupancy-based metric to identify sites which are consistently occupied by water even in the presence of probe molecules. Conversely, displaceable water sites can be found by considering the sites which preferentially bind probe molecules. Furthermore, the inclusion of six probe types allows the MixMD method to predict which functional groups are capable of displacing which water sites. The MixMD method consistently identifies sites which are likely to be nondisplaceable and predicts the favorable displacement of water sites that are known to be displaced upon ligand binding.

  15. Comparative analysis of local spin definitions.

    PubMed

    Herrmann, Carmen; Reiher, Markus; Hess, Bernd A

    2005-01-15

    This work provides a survey of the definition of electron spin as a local property and its dependence on several parameters in actual calculations. We analyze one-determinant wave functions constructed from Hartree-Fock and, in particular, from Kohn-Sham orbitals within the collinear approach to electron spin. The scalar total spin operators S2 and Sz are partitioned by projection operators, as introduced by Clark and Davidson, in order to obtain local spin operators SASB and SzA, respectively. To complement the work of Davidson and co-workers, we analyze some features of local spins which have not yet been discussed in sufficient depth. The dependence of local spin on the choice of basis set, density functional, and projector is studied. We also discuss the results of Sz partitioning and show that SzA values depend less on these parameters than SASB values. Furthermore, we demonstrate that for small organic test molecules, a partitioning of Sz with preorthogonalized Lowdin projectors yields nearly the same results as one obtains using atoms-in-molecules projectors. In addition, the physical significance of nonzero SASB values for closed-shell molecules is investigated. It is shown that due to this problem, SASB values are useful for calculations of relative spin values, but not for absolute local spins, where SzA values appear to be better suited.

  16. Novel AgoshRNA molecules for silencing of the CCR5 co-receptor for HIV-1 infection

    PubMed Central

    Herrera-Carrillo, Elena

    2017-01-01

    Allogeneic transplantation of blood stem cells from a CCR5-Δ32 homozygous donor to an HIV-infected individual, the “Berlin patient”, led to a cure. Since then there has been a search for approaches that mimic this intervention in a gene therapy setting. RNA interference (RNAi) has evolved as a powerful tool to regulate gene expression in a sequence-specific manner and can be used to inactivate the CCR5 mRNA. Short hairpin RNA (shRNA) molecules can impair CCR5 expression, but these molecules may cause unintended side effects and they will not be processed in cells that lack Dicer, such as monocytes. Dicer-independent RNAi pathways have opened opportunities for new AgoshRNA designs that rely exclusively on Ago2 for maturation. Furthermore, AgoshRNA processing yields a single active guide RNA, thus reducing off-target effects. In this study, we tested different AgoshRNA designs against CCR5. We selected AgoshRNAs that potently downregulated CCR5 expression on human T cells and peripheral blood mononuclear cells (PBMC) and that had no apparent adverse effect on T cell development as assessed in a competitive cell growth assay. CCR5 knockdown significantly protected T cells from CCR5 tropic HIV-1 infection. PMID:28542329

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Witte, Jonathon; Molecular Foundry, Lawrence Berkeley National Laboratory, Berkeley, California 94720; Neaton, Jeffrey B.

    With the aim of systematically characterizing the convergence of common families of basis sets such that general recommendations for basis sets can be made, we have tested a wide variety of basis sets against complete-basis binding energies across the S22 set of intermolecular interactions (noncovalent interactions of small and medium-sized molecules consisting of first- and second-row atoms) with three distinct density functional approximations: SPW92, a form of local-density approximation; B3LYP, a global hybrid generalized gradient approximation; and B97M-V, a meta-generalized gradient approximation with nonlocal correlation. We have found that it is remarkably difficult to reach the basis set limit; for the methods and systems examined, the most complete basis is Jensen’s pc-4. The Dunning correlation-consistent sequence of basis sets converges slowly relative to the Jensen sequence. The Karlsruhe basis sets are quite cost effective, particularly when a correction for basis set superposition error is applied: counterpoise-corrected def2-SVPD binding energies are better than corresponding energies computed in comparably sized Dunning and Jensen bases, and on par with uncorrected results in basis sets 3-4 times larger. These trends are exhibited regardless of the level of density functional approximation employed. A sense of the magnitude of the intrinsic incompleteness error of each basis set not only provides a foundation for guiding basis set choice in future studies but also facilitates quantitative comparison of existing studies on similar types of systems.
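
    For reference, the counterpoise correction for basis set superposition error mentioned above is usually written in the Boys-Bernardi form shown below. The notation is an assumption introduced here for illustration: the superscript names the basis in which each energy is evaluated, and the argument indicates that every fragment is taken at its geometry in the complex AB.

```latex
\Delta E_{\mathrm{bind}}^{\mathrm{CP}}
  = E_{AB}^{AB}(AB) - E_{A}^{AB}(AB) - E_{B}^{AB}(AB)
```

    Each monomer energy is computed in the full dimer basis (with ghost functions on the partner), so the monomers benefit from the same extra basis functions as the complex and the artificial overbinding is largely removed.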

  18. Molecular docking and 3D-QSAR studies on triazolinone and pyridazinone, non-nucleoside inhibitor of HIV-1 reverse transcriptase.

    PubMed

    Sivan, Sree Kanth; Manga, Vijjulatha

    2010-06-01

    Nonnucleoside reverse transcriptase inhibitors (NNRTIs) are allosteric inhibitors of the HIV-1 reverse transcriptase. Recently a series of triazolinones and pyridazinones were reported as potent inhibitors of HIV-1 wild type reverse transcriptase. In the present study, docking and 3D quantitative structure activity relationship (3D QSAR) studies involving comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were performed on 31 molecules. Ligands were built and minimized using the Tripos force field and applying Gasteiger-Hückel charges. These ligands were docked into the protein active site using GLIDE 4.0. The docked poses were analyzed; the best docked poses were selected and aligned. CoMFA and CoMSIA fields were calculated using SYBYL6.9. The molecules were divided into a training set and a test set, a PLS analysis was performed and QSAR models were generated. The model showed good statistical reliability, which is evident from the r2 nv, q2 loo and r2 pred values. The CoMFA model provides the most significant correlation of steric and electrostatic fields with biological activities. The CoMSIA model provides a correlation of steric, electrostatic, acceptor and hydrophobic fields with biological activities. The information rendered by the 3D QSAR model prompted us to optimize the lead and design new potential inhibitors.

  19. Machine Learning of Parameters for Accurate Semiempirical Quantum Chemical Calculations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dral, Pavlo O.; von Lilienfeld, O. Anatole; Thiel, Walter

    2015-05-12

    We investigate possible improvements in the accuracy of semiempirical quantum chemistry (SQC) methods through the use of machine learning (ML) models for the parameters. For a given class of compounds, ML techniques require sufficiently large training sets to develop ML models that can be used for adapting SQC parameters to reflect changes in molecular composition and geometry. The ML-SQC approach allows the automatic tuning of SQC parameters for individual molecules, thereby improving the accuracy without deteriorating transferability to molecules with molecular descriptors very different from those in the training set. The performance of this approach is demonstrated for the semiempirical OM2 method using a set of 6095 constitutional isomers C7H10O2, for which accurate ab initio atomization enthalpies are available. The ML-OM2 results show improved average accuracy and a much reduced error range compared with those of standard OM2 results, with mean absolute errors in atomization enthalpies dropping from 6.3 to 1.7 kcal/mol. They are also found to be superior to the results from specific OM2 reparameterizations (rOM2) for the same set of isomers. The ML-SQC approach thus holds promise for fast and reasonably accurate high-throughput screening of materials and molecules.
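
    The mean absolute error used above to compare OM2 and ML-OM2 atomization enthalpies is the average magnitude of the deviation from the reference values. A minimal sketch follows; the function name and the sample numbers are hypothetical and are not data from the paper.

```python
def mean_absolute_error(reference, predicted):
    """Average of |reference - predicted| over paired values."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(r - p) for r, p in zip(reference, predicted))
    return total / len(reference)


# Hypothetical reference and predicted enthalpies in kcal/mol.
print(mean_absolute_error([1.0, 2.0, 3.0], [2.0, 2.0, 1.0]))
```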

  20. Machine learning of parameters for accurate semiempirical quantum chemical calculations

    DOE PAGES

    Dral, Pavlo O.; von Lilienfeld, O. Anatole; Thiel, Walter

    2015-04-14

    We investigate possible improvements in the accuracy of semiempirical quantum chemistry (SQC) methods through the use of machine learning (ML) models for the parameters. For a given class of compounds, ML techniques require sufficiently large training sets to develop ML models that can be used for adapting SQC parameters to reflect changes in molecular composition and geometry. The ML-SQC approach allows the automatic tuning of SQC parameters for individual molecules, thereby improving the accuracy without deteriorating transferability to molecules with molecular descriptors very different from those in the training set. The performance of this approach is demonstrated for the semiempirical OM2 method using a set of 6095 constitutional isomers C7H10O2, for which accurate ab initio atomization enthalpies are available. The ML-OM2 results show improved average accuracy and a much reduced error range compared with those of standard OM2 results, with mean absolute errors in atomization enthalpies dropping from 6.3 to 1.7 kcal/mol. They are also found to be superior to the results from specific OM2 reparameterizations (rOM2) for the same set of isomers. The ML-SQC approach thus holds promise for fast and reasonably accurate high-throughput screening of materials and molecules.

  1. Quantum Behavior of Water Molecules Confined to Nanocavities in Gemstones.

    PubMed

    Gorshunov, Boris P; Zhukova, Elena S; Torgashev, Victor I; Lebedev, Vladimir V; Shakurov, Gil'man S; Kremer, Reinhard K; Pestrjakov, Efim V; Thomas, Victor G; Fursenko, Dimitry A; Dressel, Martin

    2013-06-20

    When water is confined to nanocavities, its quantum mechanical behavior can be revealed by terahertz spectroscopy. We place H2O molecules in the nanopores of a beryl crystal lattice and observe a rich and highly anisotropic set of absorption lines in the terahertz spectral range. Two bands can be identified, which originate from translational and librational motions of the water molecule isolated within the cage; they correspond to the analogous broad bands in liquid water and ice. In the present case of well-defined and highly symmetric nanocavities, the observed fine structure can be explained by macroscopic tunneling of the H2O molecules within a six-fold potential caused by the interaction of the molecule with the cavity walls.

  2. Detection of the aromatic molecule benzonitrile (c-C6H5CN) in the interstellar medium

    NASA Astrophysics Data System (ADS)

    McGuire, Brett A.; Burkhardt, Andrew M.; Kalenskii, Sergei; Shingledecker, Christopher N.; Remijan, Anthony J.; Herbst, Eric; McCarthy, Michael C.

    2018-01-01

    Polycyclic aromatic hydrocarbons and polycyclic aromatic nitrogen heterocycles are thought to be widespread throughout the universe, because these classes of molecules are probably responsible for the unidentified infrared bands, a set of emission features seen in numerous Galactic and extragalactic sources. Despite their expected ubiquity, astronomical identification of specific aromatic molecules has proven elusive. We present the discovery of benzonitrile (c-C6H5CN), one of the simplest nitrogen-bearing aromatic molecules, in the interstellar medium. We observed hyperfine-resolved transitions of benzonitrile in emission from the molecular cloud TMC-1. Simple aromatic molecules such as benzonitrile may be precursors for polycyclic aromatic hydrocarbon formation, providing a chemical link to the carriers of the unidentified infrared bands.

  3. Analyzing Single-Molecule Time Series via Nonparametric Bayesian Inference

    PubMed Central

    Hines, Keegan E.; Bankston, John R.; Aldrich, Richard W.

    2015-01-01

    The ability to measure the properties of proteins at the single-molecule level offers an unparalleled glimpse into biological systems at the molecular scale. The interpretation of single-molecule time series has often been rooted in statistical mechanics and the theory of Markov processes. While existing analysis methods have been useful, they are not without significant limitations including problems of model selection and parameter nonidentifiability. To address these challenges, we introduce the use of nonparametric Bayesian inference for the analysis of single-molecule time series. These methods provide a flexible way to extract structure from data instead of assuming models beforehand. We demonstrate these methods with applications to several diverse settings in single-molecule biophysics. This approach provides a well-constrained and rigorously grounded method for determining the number of biophysical states underlying single-molecule data. PMID:25650922

  4. Developing an Efficient and General Strategy for Immobilization of Small Molecules onto Microarrays Using Isocyanate Chemistry.

    PubMed

    Zhu, Chenggang; Zhu, Xiangdong; Landry, James P; Cui, Zhaomeng; Li, Quanfu; Dang, Yongjun; Mi, Lan; Zheng, Fengyun; Fei, Yiyan

    2016-03-16

    Small-molecule microarray (SMM) is an effective platform for identifying lead compounds from large collections of small molecules in drug discovery, and efficient immobilization of molecular compounds is a pre-requisite for the success of such a platform. On an isocyanate functionalized surface, we studied the dependence of immobilization efficiency on chemical residues on molecular compounds, terminal residues on isocyanate functionalized surface, lengths of spacer molecules, and post-printing treatment conditions, and we identified a set of optimized conditions that enable us to immobilize small molecules with significantly improved efficiencies, particularly for those molecules with carboxylic acid residues that are known to have low isocyanate reactivity. We fabricated microarrays of 3375 bioactive compounds on isocyanate functionalized glass slides under these optimized conditions and confirmed that immobilization percentage is over 73%.

  5. Atomic and molecular data for spacecraft re-entry plasmas

    NASA Astrophysics Data System (ADS)

    Celiberto, R.; Armenise, I.; Cacciatore, M.; Capitelli, M.; Esposito, F.; Gamallo, P.; Janev, R. K.; Laganà, A.; Laporta, V.; Laricchiuta, A.; Lombardi, A.; Rutigliano, M.; Sayós, R.; Tennyson, J.; Wadehra, J. M.

    2016-06-01

    The modeling of atmospheric gas, interacting with the space vehicles in re-entry conditions in planetary exploration missions, requires a large set of scattering data for all those elementary processes occurring in the system. A fundamental aspect of re-entry problems is represented by the strong non-equilibrium conditions met in the atmospheric plasma close to the surface of the thermal shield, where numerous interconnected relaxation processes determine the evolution of the gaseous system towards equilibrium conditions. A central role is played by the vibrational exchanges of energy, so that collisional processes involving vibrationally excited molecules assume a particular importance. In the present paper, theoretical calculations of complete sets of vibrationally state-resolved cross sections and rate coefficients are reviewed, focusing on the relevant classes of collisional processes: resonant and non-resonant electron-impact excitation of molecules, atom-diatom and molecule-molecule collisions as well as gas-surface interaction. In particular, collisional processes involving atomic and molecular species, relevant to Earth (N2, O2, NO), Mars (CO2, CO, N2) and Jupiter (H2, He) atmospheres are considered.

  6. Benchmark of Ab Initio Bethe-Salpeter Equation Approach with Numeric Atom-Centered Orbitals

    NASA Astrophysics Data System (ADS)

    Liu, Chi; Kloppenburg, Jan; Kanai, Yosuke; Blum, Volker

    The Bethe-Salpeter equation (BSE) approach based on the GW approximation has been shown to be successful for optical spectra prediction of solids and recently also for small molecules. We here present an all-electron implementation of the BSE using numeric atom-centered orbital (NAO) basis sets. In this work, we present a benchmark of the BSE as implemented in FHI-aims for low-lying excitation energies of a set of small organic molecules, the well-known Thiel's set. The difference between our implementation (using an analytic continuation of the GW self-energy on the real axis) and the results generated by a fully frequency dependent GW treatment on the real axis is on the order of 0.07 eV for the benchmark molecular set. We study the convergence behavior to the complete basis set limit for excitation spectra, using a group of valence correlation consistent NAO basis sets (NAO-VCC-nZ), as well as for standard NAO basis sets for ground state DFT with extended augmentation functions (NAO+aug). The BSE results and convergence behavior are compared to linear-response time-dependent DFT, where excellent numerical convergence is shown for NAO+aug basis sets.

  7. Predicting hydration free energies of amphetamine-type stimulants with a customized molecular model

    NASA Astrophysics Data System (ADS)

    Li, Jipeng; Fu, Jia; Huang, Xing; Lu, Diannan; Wu, Jianzhong

    2016-09-01

    Amphetamine-type stimulants (ATS) are a group of incitation and psychedelic drugs affecting the central nervous system. Physicochemical data for these compounds are essential for understanding the stimulating mechanism, for assessing their environmental impacts, and for developing new drug detection methods. However, experimental data are scarce due to tight regulation of such illicit drugs, yet conventional methods to estimate their properties are often unreliable. Here we introduce a tailor-made multiscale procedure for predicting the hydration free energies and the solvation structures of ATS molecules by a combination of first principles calculations and the classical density functional theory. We demonstrate that the multiscale procedure performs well for a training set with similar molecular characteristics and yields good agreement with a testing set not used in the training. The theoretical predictions serve as a benchmark for the missing experimental data and, importantly, provide microscopic insights into manipulating the hydrophobicity of ATS compounds by chemical modifications.

  8. Experimental and theoretical study on DPPH radical scavenging mechanism of some chalcone quinoline derivatives

    NASA Astrophysics Data System (ADS)

    Hamlaoui, Ikram; Bencheraiet, Reguia; Bensegueni, Rafik; Bencharif, Mustapha

    2018-03-01

    In this study, the antioxidant capacity of three chalcone derivatives was evaluated by DPPH free radical scavenging. Experimental data showed low antioxidant activity (IC50±SD) of these molecules in comparison with BHT. The mechanism of DPPH radical scavenging was elucidated by means of density functional theory (DFT) calculations. The tested compounds and their corresponding radicals and anions were optimized using the B3LYP functional with the 6-31G(d,p) basis set in the gas phase. The C-PCM model was used to perform solvent medium calculations. On the basis of theoretical calculations, it was shown that the HAT mechanism was predominant in the gas phase, whereas the SET-PT and SPLET mechanisms were favored in the presence of the solvent. Moreover, the HOMO orbitals and spin density distribution were evaluated to predict the probable sites for free radical attack.

  9. Quantitative structure-toxicity relationship (QSTR) studies on the organophosphate insecticides.

    PubMed

    Can, Alper

    2014-11-04

    Organophosphate insecticides are the most commonly used pesticides in the world. In this study, quantitative structure-toxicity relationship (QSTR) models were derived for estimating the acute oral toxicity of organophosphate insecticides to male rats. The 20 chemicals of the training set and the seven compounds of the external testing set were characterized by molecular descriptors. Descriptors for lipophilicity, polarity and molecular geometry, as well as quantum chemical descriptors for energy were calculated. Model development to predict toxicity of organophosphate insecticides in different matrices was carried out using multiple linear regression. The model was validated internally and externally. In the present study, a QSTR model was used for the first time to understand the inherent relationships between the organophosphate insecticide molecules and their toxicity behavior. Such studies provide mechanistic insight about structure-toxicity relationship and help in the design of less toxic insecticides. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  10. Molecular dynamics simulations and docking enable to explore the biophysical factors controlling the yields of engineered nanobodies.

    PubMed

    Soler, Miguel A; de Marco, Ario; Fortuna, Sara

    2016-10-10

    Nanobodies (VHHs) have proved to be valuable substitutes for conventional antibodies for molecular recognition. Their small size represents a precious advantage for rational mutagenesis based on modelling. Here we address the problem of predicting how Camelidae nanobody sequences can tolerate mutations by developing a simulation protocol based on all-atom molecular dynamics and whole-molecule docking. The method was tested on two sets of nanobodies characterized experimentally for their biophysical features. One set contained point mutations introduced to humanize a wild type sequence; in the second, the CDRs were swapped between single-domain frameworks with Camelidae and human hallmarks. The method resulted in accurate scoring approaches to predict experimental yields and enabled the identification of the structural modifications induced by mutations. This work is a promising tool for the in silico development of single-domain antibodies and opens the opportunity to customize single functional domains of larger macromolecules.

  11. Molecular dynamics simulations and docking enable to explore the biophysical factors controlling the yields of engineered nanobodies

    NASA Astrophysics Data System (ADS)

    Soler, Miguel A.; De Marco, Ario; Fortuna, Sara

    2016-10-01

    Nanobodies (VHHs) have proved to be valuable substitutes for conventional antibodies for molecular recognition. Their small size represents a precious advantage for rational mutagenesis based on modelling. Here we address the problem of predicting how Camelidae nanobody sequences can tolerate mutations by developing a simulation protocol based on all-atom molecular dynamics and whole-molecule docking. The method was tested on two sets of nanobodies characterized experimentally for their biophysical features. One set contained point mutations introduced to humanize a wild type sequence; in the second, the CDRs were swapped between single-domain frameworks with Camelidae and human hallmarks. The method resulted in accurate scoring approaches to predict experimental yields and enabled the identification of the structural modifications induced by mutations. This work is a promising tool for the in silico development of single-domain antibodies and opens the opportunity to customize single functional domains of larger macromolecules.

  12. Machine learning methods in chemoinformatics

    PubMed Central

    Mitchell, John B O

    2014-01-01

    Machine learning algorithms are generally developed in computer science or adjacent disciplines and find their way into chemical modeling by a process of diffusion. Though particular machine learning methods are popular in chemoinformatics and quantitative structure–activity relationships (QSAR), many others exist in the technical literature. This discussion is methods-based and focused on some algorithms that chemoinformatics researchers frequently use. It makes no claim to be exhaustive. We concentrate on methods for supervised learning, predicting the unknown property values of a test set of instances, usually molecules, based on the known values for a training set. Particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machine, k-Nearest Neighbors and naïve Bayes classifiers. WIREs Comput Mol Sci 2014, 4:468–481. doi:10.1002/wcms.1183 PMID:25285160

  13. High mobility high efficiency organic films based on pure organic materials

    DOEpatents

    Salzman, Rhonda F [Ann Arbor, MI]; Forrest, Stephen R [Ann Arbor, MI]

    2009-01-27

    A method of purifying small molecule organic material, performed as a series of operations beginning with a first sample of the organic small molecule material. The first step is to purify the organic small molecule material by thermal gradient sublimation. The second step is to test the purity of at least one sample from the purified organic small molecule material by spectroscopy. The third step is to repeat the first through third steps on the purified small molecule material if the spectroscopic testing reveals any peaks exceeding a threshold percentage of a magnitude of a characteristic peak of a target organic small molecule. The steps are performed at least twice. The threshold percentage is at most 10%. Preferably the threshold percentage is 5% and more preferably 2%. The threshold percentage may be selected based on the spectra of past samples that achieved target performance characteristics in finished devices.
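
    The purify-test-repeat loop described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the spectrum is represented as a hypothetical dictionary of peak magnitudes, and `purify` and `measure` are placeholder callables standing in for thermal gradient sublimation and spectroscopy. None of these names come from the patent.

```python
def needs_repurification(peaks, target_peak, threshold=0.10):
    """True if any non-target peak exceeds threshold * characteristic peak.

    peaks: dict of peak label -> magnitude (hypothetical representation).
    target_peak: label of the characteristic peak of the target molecule.
    """
    limit = threshold * peaks[target_peak]
    return any(mag > limit for label, mag in peaks.items() if label != target_peak)


def purify_until_clean(sample, purify, measure, target_peak,
                       threshold=0.10, max_cycles=10):
    """Repeat purify + test; the abstract requires at least two passes.

    max_cycles is an added safety limit, not something stated in the abstract.
    """
    cycles = 0
    while cycles < max_cycles:
        sample = purify(sample)
        cycles += 1
        clean = not needs_repurification(measure(sample), target_peak, threshold)
        if cycles >= 2 and clean:
            break
    return sample, cycles


# Toy stand-ins: each "sublimation" halves every impurity peak.
toy_purify = lambda s: {k: (v if k == "target" else v / 2) for k, v in s.items()}
toy_measure = lambda s: s
final, n_cycles = purify_until_clean({"target": 100.0, "impurity": 50.0},
                                     toy_purify, toy_measure, "target")
```

    With the default 10% threshold, the toy sample needs three passes before its impurity peak drops below the limit.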

  14. Solvent effects in time-dependent self-consistent field methods. I. Optical response calculations

    DOE PAGES

    Bjorgaard, J. A.; Kuzmenko, V.; Velizhanin, K. A.; ...

    2015-01-22

    In this study, we implement and examine three excited state solvent models in time-dependent self-consistent field methods using a consistent formalism which unambiguously shows their relationship. These are the linear response, state specific, and vertical excitation solvent models. Their effects on energies calculated with the equivalent of COSMO/CIS/AM1 are given for a set of test molecules with varying excited state charge transfer character. The resulting solvent effects are explained qualitatively using a dipole approximation. It is shown that the fundamental differences between these solvent models are reflected by the character of the calculated excitations.

  15. Virtual reality visual feedback for hand-controlled scanning probe microscopy manipulation of single molecules.

    PubMed

    Leinen, Philipp; Green, Matthew F B; Esat, Taner; Wagner, Christian; Tautz, F Stefan; Temirov, Ruslan

    2015-01-01

    Controlled manipulation of single molecules is an important step towards the fabrication of single molecule devices and nanoscale molecular machines. Currently, scanning probe microscopy (SPM) is the only technique that facilitates direct imaging and manipulations of nanometer-sized molecular compounds on surfaces. The technique of hand-controlled manipulation (HCM) introduced recently in Beilstein J. Nanotechnol. 2014, 5, 1926-1932 simplifies the identification of successful manipulation protocols in situations when the interaction pattern of the manipulated molecule with its environment is not fully known. Here we present a further technical development that substantially improves the effectiveness of HCM. By adding Oculus Rift virtual reality goggles to our HCM set-up we provide the experimentalist with 3D visual feedback that displays the currently executed trajectory and the position of the SPM tip during manipulation in real time, while simultaneously plotting the experimentally measured frequency shift (Δf) of the non-contact atomic force microscope (NC-AFM) tuning fork sensor as well as the magnitude of the electric current (I) flowing between the tip and the surface. The advantages of the set-up are demonstrated by applying it to the model problem of the extraction of an individual PTCDA molecule from its hydrogen-bonded monolayer grown on Ag(111) surface.

  16. Choosing the right fluorophore for single-molecule fluorescence studies in a lipid environment.

    PubMed

    Zhang, Zhenfu; Yomo, Dan; Gradinaru, Claudiu

    2017-07-01

    Nonspecific interactions between lipids and fluorophores can alter the outcomes of single-molecule spectroscopy of membrane proteins in live cells, liposomes or lipid nanodiscs and of cytosolic proteins encapsulated in liposomes or tethered to supported lipid bilayers. To gain insight into these effects, we examined interactions between 9 dyes that are commonly used as labels for single-molecule fluorescence (SMF) and 6 standard lipids including cationic, zwitterionic and anionic types. The diffusion coefficients of dyes in the absence and presence of set amounts of lipid vesicles were measured by fluorescence correlation spectroscopy (FCS). The partition coefficients and the free energies of partitioning for different fluorophore-lipid pairs were obtained by global fitting of the titration FCS curves. Lipids with different charges, head groups and degrees of chain saturation were investigated, and interactions with dyes are discussed in terms of hydrophobic, electrostatic and steric contributions. Fluorescence imaging of individual fluorophores adsorbed on supported lipid bilayers provides visualization and additional quantification of the strength of dye-lipid interaction in the context of single-molecule measurements. By dissecting fluorophore-lipid interactions, our study provides new insights into setting up single-molecule fluorescence spectroscopy experiments with minimal interference from interactions between fluorescent labels and lipids in the environment. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Bayesian approach to MSD-based analysis of particle motion in live cells.

    PubMed

    Monnier, Nilah; Guo, Syuan-Ming; Mori, Masashi; He, Jun; Lénárt, Péter; Bathe, Mark

    2012-08-08

    Quantitative tracking of particle motion using live-cell imaging is a powerful approach to understanding the mechanism of transport of biological molecules, organelles, and cells. However, inferring complex stochastic motion models from single-particle trajectories in an objective manner is nontrivial due to noise from sampling limitations and biological heterogeneity. Here, we present a systematic Bayesian approach to multiple-hypothesis testing of a general set of competing motion models based on particle mean-square displacements that automatically classifies particle motion, properly accounting for sampling limitations and correlated noise while appropriately penalizing model complexity according to Occam's Razor to avoid over-fitting. We test the procedure rigorously using simulated trajectories for which the underlying physical process is known, demonstrating that it chooses the simplest physical model that explains the observed data. Further, we show that computed model probabilities provide a reliability test for the downstream biological interpretation of associated parameter values. We subsequently illustrate the broad utility of the approach by applying it to disparate biological systems including experimental particle trajectories from chromosomes, kinetochores, and membrane receptors undergoing a variety of complex motions. This automated and objective Bayesian framework easily scales to large numbers of particle trajectories, making it ideal for classifying the complex motion of large numbers of single molecules and cells from high-throughput screens, as well as single-cell-, tissue-, and organism-level studies. Copyright © 2012 Biophysical Society. Published by Elsevier Inc. All rights reserved.
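
    The quantity that this analysis starts from, the mean-square displacement (MSD) of a trajectory, can be computed as in the sketch below. It covers only the time-averaged MSD of a single trajectory. The Bayesian model comparison itself is not reproduced here, and the function name and the trajectory representation (a list of coordinate tuples sampled at equal time steps) are assumptions made for illustration.

```python
def mean_square_displacement(trajectory, max_lag=None):
    """Time-averaged MSD for lags 1..max_lag.

    trajectory: list of coordinate tuples sampled at a fixed time step.
    Returns a list whose element k-1 is the MSD at lag k.
    """
    n = len(trajectory)
    if max_lag is None:
        max_lag = n - 1
    msd = []
    for lag in range(1, max_lag + 1):
        count = n - lag  # number of position pairs separated by this lag
        total = 0.0
        for i in range(count):
            start, end = trajectory[i], trajectory[i + lag]
            total += sum((q - p) ** 2 for p, q in zip(start, end))
        msd.append(total / count)
    return msd


# Uniform directed motion: the MSD grows quadratically with the lag.
directed = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
directed_msd = mean_square_displacement(directed)
```

    Competing motion models (for example pure diffusion versus directed transport) predict different dependences of the MSD on the lag, which is what the model-selection step would then weigh against each other.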

  18. N-(3-Chloro-1-methyl-1H-indazol-5-yl)-4-methylbenzene-sulfonamide.

    PubMed

    Chicha, Hakima; Rakib, El Mostapha; Amiri, Ouafa; Saadi, Mohamed; El Ammari, Lahcen

    2014-02-01

    The asymmetric unit of the title compound, C15H14ClN3O2S, contains two independent molecules showing different conformations: in one molecule, the indazole ring system makes a dihedral angle of 51.5 (1)° with the benzene ring whereas in the other, the indazole unit is almost perpendicular to the benzene ring [dihedral angle 77.7 (1)°]. In the crystal, the molecules are linked by N-H⋯N and N-H⋯O hydrogen bonds, forming a set of four molecules linked in pairs about an inversion centre.

  19. Parallel algorithms for the molecular conformation problem

    NASA Astrophysics Data System (ADS)

    Rajan, Kumar

    Given a set of objects, and some of the pairwise distances between them, the problem of identifying the positions of the objects in the Euclidean space is referred to as the molecular conformation problem. This problem is known to be computationally difficult. One of the most important applications of this problem is the determination of the structure of molecules. In the case of molecular structure determination, usually only the lower and upper bounds on some of the interatomic distances are available. The process of obtaining a tighter set of bounds between all pairs of atoms, using the available interatomic distance bounds, is referred to as bound-smoothing. One method for bound-smoothing is to use the limits imposed by the triangle inequality. The distance bounds so obtained can often be tightened further by applying the tetrangle inequality: the limits imposed on the six pairwise distances among a set of four atoms (instead of three for the triangle inequalities). The tetrangle inequality is expressed by the Cayley-Menger determinants. The sequential tetrangle-inequality bound-smoothing algorithm considers a quadruple of atoms at a time, and tightens the bounds on each of its six distances. The sequential algorithm is computationally expensive, and its application is limited to molecules with up to a few hundred atoms. Here, we conduct an experimental study of tetrangle-inequality bound-smoothing and reduce the sequential time by identifying the most computationally expensive portions of the process. We also present a simple criterion to determine which of the quadruples of atoms are likely to be tightened the most by tetrangle-inequality bound-smoothing. This test could be used to enhance the applicability of this process to large molecules. We map the problem of parallelizing tetrangle-inequality bound-smoothing to that of generating disjoint packing designs of a certain kind. We map this, in turn, to a regular-graph coloring problem, and present a simple, parallel algorithm for tetrangle-inequality bound-smoothing. We implement the parallel algorithm on the Intel Paragon X/PS, and apply it to real-life molecules. Our results show that with this parallel algorithm, tetrangle inequality can be applied to large molecules in a reasonable amount of time. We extend the regular graph to represent more general packing designs, and present a coloring algorithm for this graph. This can be used to generate constant-weight binary codes in parallel. Once a tighter set of distance bounds is obtained, the molecular conformation problem is usually formulated as a non-linear optimization problem, and a global optimization algorithm is then used to solve the problem. Here we present a parallel, deterministic algorithm for the optimization problem based on Interval Analysis. We implement our algorithm, using dynamic load balancing, on a network of Sun Ultra-Sparc workstations. Our experience with this algorithm shows that its application is limited to small instances of the molecular conformation problem, where the number of measured, pairwise distances is close to the maximum value. However, since the interval method eliminates a substantial portion of the initial search space very quickly, it can be used to prune the search space before any of the more efficient, nondeterministic methods can be applied.
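
    The simpler of the two smoothing steps mentioned in this abstract, triangle-inequality bound-smoothing, can be illustrated with a short sequential sketch. It is not the parallel tetrangle algorithm of the thesis. It assumes symmetric matrices of lower and upper distance bounds with zeros on the diagonal, tightens the upper bounds with an all-pairs shortest-path pass, and then raises the lower bounds with the inverse triangle inequality. The function and variable names are illustrative.

```python
def triangle_bound_smoothing(lower, upper):
    """Tighten interatomic distance bounds using the triangle inequality.

    lower, upper: symmetric n x n lists of lists with zero diagonals.
    Returns new (lower, upper) matrices; the inputs are left unchanged.
    """
    n = len(upper)
    lo = [row[:] for row in lower]
    up = [row[:] for row in upper]

    # Upper bounds: d(i, j) <= d(i, k) + d(k, j), a Floyd-Warshall style pass.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if up[i][k] + up[k][j] < up[i][j]:
                    up[i][j] = up[i][k] + up[k][j]

    # Lower bounds: d(i, j) >= d(i, k) - d(k, j), using the smoothed uppers.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                candidate = max(lo[i][k] - up[k][j], lo[k][j] - up[i][k])
                if candidate > lo[i][j]:
                    lo[i][j] = candidate
    return lo, up


# Three atoms: pair (0, 1) is close, pair (0, 2) is tightly bracketed,
# and pair (1, 2) starts with almost no information.
lower = [[0, 0, 3], [0, 0, 0], [3, 0, 0]]
upper = [[0, 1, 4], [1, 0, 10], [4, 10, 0]]
smooth_lower, smooth_upper = triangle_bound_smoothing(lower, upper)
```

    In this toy input the bounds on the pair (1, 2) tighten from [0, 10] to [2, 5], purely from what is known about the other two pairs.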

  20. Performance of machine-learning scoring functions in structure-based virtual screening

    PubMed Central

    Wójcikowski, Maciej; Ballester, Pedro J.; Siedlecki, Pawel

    2017-01-01

    Classical scoring functions have reached a plateau in their performance in virtual screening and binding affinity prediction. Recently, machine-learning scoring functions trained on protein-ligand complexes have shown great promise in small tailored studies. They have also raised controversy, specifically concerning model overfitting and applicability to novel targets. Here we provide a new ready-to-use scoring function (RF-Score-VS) trained on 15 426 active and 893 897 inactive molecules docked to a set of 102 targets. We use the full DUD-E data sets along with three docking tools, five classical and three machine-learning scoring functions for model building and performance assessment. Our results show RF-Score-VS can substantially improve virtual screening performance: RF-Score-VS top 1% provides a 55.6% hit rate, whereas that of Vina is only 16.2% (for smaller percentages the difference is even more encouraging: RF-Score-VS top 0.1% achieves an 88.6% hit rate versus 27.5% using Vina). In addition, RF-Score-VS provides much better prediction of measured binding affinity than Vina (Pearson correlation of 0.56 and −0.18, respectively). Lastly, we test RF-Score-VS on an independent test set from the DEKOIS benchmark and observe comparable results. We provide full data sets to facilitate further research in this area (http://github.com/oddt/rfscorevs) as well as ready-to-use RF-Score-VS (http://github.com/oddt/rfscorevs_binary). PMID:28440302
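
    The "top x% hit rate" figures quoted in this abstract can be read as the fraction of true actives among the best-scoring x% of the screened molecules. The sketch below shows that calculation on toy data. It assumes that a higher score means a better predicted binder (scores where lower is better, such as docking energies, would need to be negated first), and the function name is illustrative rather than taken from the RF-Score-VS code.

```python
def top_fraction_hit_rate(scores, is_active, fraction):
    """Fraction of actives among the top-scoring `fraction` of molecules.

    scores: predicted scores, higher = better (assumption of this sketch).
    is_active: booleans, True for experimentally active molecules.
    fraction: value in (0, 1], e.g. 0.01 for the top 1%.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * fraction))  # always keep at least one molecule
    hits = sum(1 for _, active in ranked[:n_top] if active)
    return hits / n_top


toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_active = [True, False, True, False, False, False, False, False, False, False]
top_20_percent = top_fraction_hit_rate(toy_scores, toy_active, 0.2)
top_50_percent = top_fraction_hit_rate(toy_scores, toy_active, 0.5)
```

    Here one of the two best-scoring molecules is active (hit rate 0.5), and two of the top five are (hit rate 0.4).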

  1. Substitution Structures of Large Molecules and Medium Range Correlations in Quantum Chemistry Calculations

    NASA Astrophysics Data System (ADS)

    Evangelisti, Luca; Pate, Brooks

    2017-06-01

    A study of the minimally exciting topic of agreement between theoretical and measured rotational constants of molecules was performed on a set of large molecules with 16-18 heavy atoms (carbon and oxygen). The molecules are: nootkatone (C_{15}H_{22}O), cedrol (C_{15}H_{26}O), ambroxide (C_{16}H_{28}O), sclareolide (C_{16}H_{22}O_{2}), and dihydroartemisinic acid (C_{15}H_{24}O_{2}). For this set of molecules we obtained 13C-substitution structures for six molecules (this includes two conformers of nootkatone). A comparison of theoretical structures and experimental substitution structures was performed in the spirit of the recent work of Grimme and Steinmetz.[1] Our analysis focused on the center-of-mass distance of the carbon atoms in the molecules. Four different computational methods were studied: standard DFT (B3LYP), dispersion corrected DFT (B3LYP-D3BJ), hybrid DFT with dispersion correction (B2PLYP-D3), and MP2. A significant difference in these theories is how they handle medium range correlation of electrons that produce dispersion forces. For larger molecules, these dispersion forces produce an overall contraction of the molecule around the center-of-mass. DFT poorly treats this effect and produces structures that are too expanded. MP2 calculations overestimate the correction and produce structures that are too compact. Both dispersion corrected DFT methods produce structures in excellent agreement with experiment. The analysis shows that the difference in computational methods can be described by a linear error in the center-of-mass distance. This makes it possible to correct poorer performing calculations with a single scale factor. We also reexamine the issue of the "Costain error" in substitution structures and show that it is significantly larger in these systems than in the smaller molecules used by Costain to establish the error limits.
    [1] Stefan Grimme and Marc Steinmetz, "Effects of London dispersion correction in density functional theory on structures of organic molecules in the gas phase", Phys. Chem. Chem. Phys. 15, 16031-16042 (2013).
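
    The single scale factor mentioned in this abstract can be illustrated with a one-parameter least-squares fit. The sketch assumes that the error is strictly proportional to the center-of-mass distance, so that experimental distances are modelled as s times the calculated ones. The abstract does not state how the authors fit the factor, and the numbers below are made up for illustration.

```python
def fit_scale_factor(calculated, experimental):
    """Least-squares s minimizing sum((e - s * c)**2) over paired distances."""
    numerator = sum(c * e for c, e in zip(calculated, experimental))
    denominator = sum(c * c for c in calculated)
    return numerator / denominator


# Illustrative center-of-mass distances (arbitrary units, not real data):
# the "calculated" structure is uniformly too expanded by a factor of two.
calc_distances = [1.0, 2.0, 4.0]
exp_distances = [0.5, 1.0, 2.0]
scale = fit_scale_factor(calc_distances, exp_distances)
corrected = [scale * c for c in calc_distances]
```

    A fitted factor below 1 corresponds to a calculation that is too expanded, and a factor above 1 to one that is too compact.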

  2. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules.

    PubMed

    Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-12-01

    We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.
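
    One of the domain-of-applicability measures named in this abstract, the Mahalanobis distance to the training data, can be sketched with the standard library alone. The version below measures the distance of a query descriptor vector from the centroid of the training set, which is an assumption of this sketch: the abstract does not say whether the distance is taken to the centroid or to individual training compounds. All function names are illustrative.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length descriptor vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def covariance_matrix(rows):
    """Sample covariance matrix (divides by n - 1)."""
    mu = mean_vector(rows)
    d = len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mu)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j]
    return [[value / (len(rows) - 1) for value in line] for line in cov]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def mahalanobis_to_training(x, training):
    """sqrt((x - mu)^T C^{-1} (x - mu)) with mu, C estimated from `training`."""
    diff = [xi - mi for xi, mi in zip(x, mean_vector(training))]
    weighted = solve(covariance_matrix(training), diff)
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))


training_set = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
inside = mahalanobis_to_training((1.0, 1.0), training_set)
outside = mahalanobis_to_training((3.0, 1.0), training_set)
```

    A query compound with a large distance lies far from the descriptor space covered by the training set, so a prediction for it would be given a wider error bar.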

  3. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules.

    PubMed

    Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-09-01

    We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.

  4. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules

    NASA Astrophysics Data System (ADS)

    Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-12-01

    We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.

  5. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules

    NASA Astrophysics Data System (ADS)

    Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-09-01

    We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.

  6. [Application of Kohonen Self-Organizing Feature Maps in QSAR of human ADMET and kinase data sets].

    PubMed

    Hegymegi-Barakonyi, Bálint; Orfi, László; Kéri, György; Kövesdi, István

    2013-01-01

    QSAR predictions have proven very useful in a large number of drug design studies, such as the design of kinase inhibitors for cancer therapy; however, the overall predictability often remains unsatisfactory. To improve predictability of ADMET features and kinase inhibitory data, we present a new method using Kohonen's Self-Organizing Feature Map (SOFM) to cluster molecules based on explanatory variables (X) and separate dissimilar ones. We calculated SOFM clusters for a large number of molecules with human ADMET and kinase inhibitory data, and we showed that chemically similar molecules were in the same SOFM cluster, and within such clusters the QSAR models had significantly better predictability. We also used target variables (Y, e.g. ADMET) jointly with X variables to create a novel type of clustering. With our method, cells of loosely coupled XY data could be identified and separated into different model building sets.

  7. Photoisomerization-induced manipulation of single-electron tunneling for novel Si-based optical memory.

    PubMed

    Hayakawa, Ryoma; Higashiguchi, Kenji; Matsuda, Kenji; Chikyow, Toyohiro; Wakayama, Yutaka

    2013-11-13

    We demonstrated optical manipulation of single-electron tunneling (SET) by photoisomerization of diarylethene molecules in a metal-insulator-semiconductor (MIS) structure. Stress is placed on the fact that device operation is realized in the practical device configuration of MIS structure and that it is not achieved in structures based on nanogap electrodes and scanning probe techniques. Namely, this is a basic memory device configuration that has the potential for large-scale integration. In our device, the threshold voltage of SET was clearly modulated as a reversible change in the molecular orbital induced by photoisomerization, indicating that diarylethene molecules worked as optically controllable quantum dots. These findings will allow the integration of photonic functionality into current Si-based memory devices, which is a unique feature of organic molecules that is unobtainable with inorganic materials. Our proposed device therefore has enormous potential for providing a breakthrough in Si technology.

  8. Designing a multiroute synthesis scheme in combinatorial chemistry.

    PubMed

    Akavia, Adi; Senderowitz, Hanoch; Lerner, Alon; Shamir, Ron

    2004-01-01

    Solid-phase mix-and-split combinatorial synthesis is often used to produce large arrays of compounds to be tested during the various stages of the drug development process. This method can be represented by a synthesis graph in which nodes correspond to grow operations and arcs to beads transferred among the different reaction vessels. In this work, we address the problem of designing such a graph which maximizes the number of produced target compounds (namely, compounds out of an input library of desired molecules), given constraints on the number of beads used for library synthesis and on the number of reaction vessels available for concurrent grow steps. We present a heuristic based on a discrete search for solving this problem, test our solution on several data sets, explore its behavior, and show that it achieves good performance.

  9. Pre-symptomatic diagnosis and treatment of filovirus diseases

    PubMed Central

    Shurtleff, Amy C.; Whitehouse, Chris A.; Ward, Michael D.; Cazares, Lisa H.; Bavari, Sina

    2015-01-01

    Filoviruses are virulent human pathogens which cause severe illness with high case fatality rates and for which there are no available FDA-approved vaccines or therapeutics. Diagnostic tools including antibody- and molecular-based assays, mass spectrometry, and next-generation sequencing are continually under development. Assays using the polymerase chain reaction (PCR) have become the mainstay for the detection of filoviruses in outbreak settings. In many cases, real-time reverse transcriptase-PCR allows for the detection of filoviruses to be carried out with minimal manipulation and equipment and can provide results in less than 2 h. In cases of novel, highly diverse filoviruses, random-primed pyrosequencing approaches have proved useful. Ideally, diagnostic tests would allow for diagnosis of filovirus infection as early as possible after infection, either before symptoms begin, in the event of a known exposure or epidemiologic outbreak, or post-symptomatically. If tests could provide an early definitive diagnosis, then this information may be used to inform the choice of possible therapeutics. Several exciting new candidate therapeutics have been described recently; molecules that have therapeutic activity when administered to animal models of infection several days post-exposure, once signs of disease have begun. The latest data for candidate nucleoside analogs, small interfering RNA (siRNA) molecules, phosphorodiamidate (PMO) molecules, as well as antibody and blood-product therapeutics and therapeutic vaccines are discussed. For filovirus researchers and government agencies interested in making treatments available for a nation’s defense as well as its general public, having the right diagnostic tools to identify filovirus infections, as well as a panel of available therapeutics for treatment when needed, is a high priority. Additional research in both areas is required for ultimate success, but significant progress is being made to reach these goals. 
    PMID:25750638

  10. Endothelial activation biomarkers increase after HIV-1 acquisition: plasma vascular cell adhesion molecule-1 predicts disease progression.

    PubMed

    Graham, Susan M; Rajwans, Nimerta; Jaoko, Walter; Estambale, Benson B A; McClelland, R Scott; Overbaugh, Julie; Liles, W Conrad

    2013-07-17

    We aimed to determine whether endothelial activation biomarkers increase after HIV-1 acquisition, and whether biomarker levels measured in chronic infection would predict disease progression and death in HIV-1 seroconverters. HIV-1-seronegative Kenyan women were monitored monthly for seroconversion, and followed prospectively after HIV-1 acquisition. Plasma levels of angiopoietin-1 and angiopoietin-2 (ANG-1, ANG-2) and soluble vascular cell adhesion molecule-1 (VCAM-1), intercellular adhesion molecule-1 (ICAM-1), and E-selectin were tested in stored samples from pre-infection, acute infection, and two chronic infection time points. We used nonparametric tests to compare biomarkers before and after HIV-1 acquisition, and Cox proportional-hazards regression to analyze associations with disease progression (CD4 < 200 cells/μl, stage IV disease, or antiretroviral therapy initiation) or death. Soluble ICAM-1 and VCAM-1 were elevated relative to baseline in all postinfection periods assessed (P < 0.0001). Soluble E-selectin and the ANG-2:ANG-1 ratio increased in acute infection (P = 0.0001), and ANG-1 decreased in chronic infection (P = 0.0004). Among 228 participants followed over 1028 person-years, 115 experienced disease progression or death. Plasma VCAM-1 levels measured during chronic infection were independently associated with time to HIV progression or death (adjusted hazard ratio 5.36, 95% confidence interval 1.99-14.44 per log10 increase), after adjustment for set point plasma viral load, age at infection, and soluble ICAM-1 levels. HIV-1 acquisition was associated with endothelial activation, with sustained elevations of soluble ICAM-1 and VCAM-1 postinfection. Soluble VCAM-1 may be an informative biomarker for predicting the risk of HIV-1 disease progression, morbidity, and mortality.
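    The hazard ratio above is quoted "per log10 increase" in plasma VCAM-1. As a minimal sketch of what that unit means, assuming the usual Cox model in which log10(VCAM-1) enters as a linear covariate, the ratio for any other fold change follows by exponentiation. The function name `hazard_ratio_for_fold_change` is hypothetical and is not taken from the study.

```python
import math

def hazard_ratio_for_fold_change(hr_per_log10, fold_change):
    """Rescale a hazard ratio reported per log10 unit to an arbitrary fold change.

    Assumes a Cox model with log10(biomarker) as a linear covariate, so that
    HR(fold) = exp(beta * log10(fold)) with beta = ln(hr_per_log10).
    """
    beta = math.log(hr_per_log10)
    return math.exp(beta * math.log10(fold_change))

# A 10-fold increase reproduces the reported value; a doubling gives a smaller ratio.
hr_tenfold = hazard_ratio_for_fold_change(5.36, 10)
hr_doubling = hazard_ratio_for_fold_change(5.36, 2)
print(round(hr_tenfold, 2), round(hr_doubling, 2))
```

    The same rescaling applies to the confidence limits, since both are functions of the same regression coefficient.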

  11. Excitation spectra of aromatic molecules within a real-space GW-BSE formalism: Role of self-consistency and vertex corrections

    DOE PAGES

    Hung, Linda; da Jornada, Felipe H.; Souto-Casares, Jaime; ...

    2016-08-15

    Here, we present first-principles calculations on the vertical ionization potentials (IPs), electron affinities (EAs), and singlet excitation energies on an aromatic-molecule test set (benzene, thiophene, 1,2,5-thiadiazole, naphthalene, benzothiazole, and tetrathiafulvalene) within the GW and Bethe-Salpeter equation (BSE) formalisms. Our computational framework, which employs a real-space basis for ground-state and a transition-space basis for excited-state calculations, is well suited for high-accuracy calculations on molecules, as we show by comparing against G0W0 calculations within a plane-wave-basis formalism. We then generalize our framework to test variants of the GW approximation that include a local density approximation (LDA)–derived vertex function (ΓLDA) and quasiparticle-self-consistent (QS) iterations. We find that ΓLDA and quasiparticle self-consistency shift IPs and EAs by roughly the same magnitude, but with opposite sign for IPs and the same sign for EAs. G0W0 and QSGWΓLDA are more accurate for IPs, while G0W0ΓLDA and QSGW are best for EAs. For optical excitations, we find that perturbative GW-BSE underestimates the singlet excitation energy, while self-consistent GW-BSE results in good agreement with previous best-estimate values for both valence and Rydberg excitations. Finally, our work suggests that a hybrid approach, in which G0W0 energies are used for occupied orbitals and G0W0ΓLDA for unoccupied orbitals, also yields optical excitation energies in good agreement with experiment but at a smaller computational cost.

  12. Pharmacophore generation, atom-based 3D-QSAR, molecular docking and molecular dynamics simulation studies on benzamide analogues as FtsZ inhibitors.

    PubMed

    Tripathy, Swayansiddha; Azam, Mohammed Afzal; Jupudi, Srikanth; Sahu, Susanta Kumar

    2017-10-11

    FtsZ is an appealing target for the design of antimicrobial agents that can be used to defeat multidrug-resistant bacterial pathogens. Pharmacophore modelling, molecular docking and molecular dynamics (MD) simulation studies were performed on a series of three-substituted benzamide derivatives. In the present study a five-featured pharmacophore model with one hydrogen bond acceptor, one hydrogen bond donor, one hydrophobic group and two aromatic rings was developed using 97 molecules having MIC values ranging from 0.07 to 957 μM. A statistically significant 3D-QSAR model was obtained using this pharmacophore hypothesis with a good correlation coefficient (R² = 0.8319), cross-validated coefficient (Q² = 0.6213) and a high Fisher ratio (F = 103.9) using three PLS factors. A good correlation between experimental and predicted activity of the training (R² = 0.83) and test set (R² = 0.67) molecules was displayed by the ADHRR.1682 model. The generated model was further validated by enrichment studies using the decoy test and MAE-based criteria to measure the efficiency of the model. The docking studies of all selected inhibitors in the active site of FtsZ protein showed crucial hydrogen bond interactions with Val 207, Asn 263, Leu 209, Gly 205 and Asn 299 residues. The binding free energies of these inhibitors were calculated by the molecular mechanics/generalized Born surface area VSGB 2.0 method. Finally, a 15 ns MD simulation was done to confirm the stability of the 4DXD-ligand complex. More broadly, the present work provides insight into designing molecules with better selective FtsZ inhibitory potential.
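    The R² and Q² statistics quoted above can be illustrated with a small sketch. It is not the authors' procedure: the abstract reports a three-factor PLS model, whereas this example swaps in a one-descriptor least-squares line, and it assumes leave-one-out cross-validation, which the abstract does not specify. All function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x for a single descriptor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def q_squared_loo(xs, ys):
    """Leave-one-out Q2: each point is predicted by a model fitted without it."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a + b * xs[i])
    return r_squared(ys, preds)

# Made-up illustrative data (descriptor value, activity); not from the study.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]
a, b = fit_line(xs, ys)
print(r_squared(ys, [a + b * x for x in xs]), q_squared_loo(xs, ys))
```

    For a least-squares model, Q² computed this way cannot exceed the fitted R², which is why a gap between the two (here 0.83 versus 0.62 in the abstract) is read as a measure of how much the fit depends on individual training molecules.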

  13. Excitation spectra of aromatic molecules within a real-space GW-BSE formalism: Role of self-consistency and vertex corrections

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hung, Linda; da Jornada, Felipe H.; Souto-Casares, Jaime

    Here, we present first-principles calculations on the vertical ionization potentials (IPs), electron affinities (EAs), and singlet excitation energies on an aromatic-molecule test set (benzene, thiophene, 1,2,5-thiadiazole, naphthalene, benzothiazole, and tetrathiafulvalene) within the GW and Bethe-Salpeter equation (BSE) formalisms. Our computational framework, which employs a real-space basis for ground-state and a transition-space basis for excited-state calculations, is well suited for high-accuracy calculations on molecules, as we show by comparing against G0W0 calculations within a plane-wave-basis formalism. We then generalize our framework to test variants of the GW approximation that include a local density approximation (LDA)–derived vertex function (ΓLDA) and quasiparticle-self-consistent (QS) iterations. We find that ΓLDA and quasiparticle self-consistency shift IPs and EAs by roughly the same magnitude, but with opposite sign for IPs and the same sign for EAs. G0W0 and QSGWΓLDA are more accurate for IPs, while G0W0ΓLDA and QSGW are best for EAs. For optical excitations, we find that perturbative GW-BSE underestimates the singlet excitation energy, while self-consistent GW-BSE results in good agreement with previous best-estimate values for both valence and Rydberg excitations. Finally, our work suggests that a hybrid approach, in which G0W0 energies are used for occupied orbitals and G0W0ΓLDA for unoccupied orbitals, also yields optical excitation energies in good agreement with experiment but at a smaller computational cost.

  14. NetMHCIIpan-2.0 - Improved pan-specific HLA-DR predictions using a novel concurrent alignment and weight optimization training procedure.

    PubMed

    Nielsen, Morten; Justesen, Sune; Lund, Ole; Lundegaard, Claus; Buus, Søren

    2010-11-13

    Binding of peptides to Major Histocompatibility class II (MHC-II) molecules plays a central role in governing responses of the adaptive immune system. MHC-II molecules sample peptides from the extracellular space, allowing the immune system to detect the presence of foreign microbes from this compartment. Predicting which peptides bind to an MHC-II molecule is therefore of pivotal importance for understanding the immune response and its effect on host-pathogen interactions. The experimental cost associated with characterizing the binding motif of an MHC-II molecule is significant, and large efforts have therefore been put into developing accurate computer methods capable of predicting this binding event. Prediction of peptide binding to MHC-II is complicated by the open binding cleft of the MHC-II molecule, allowing binding of peptides extending out of the binding groove. Moreover, the genes encoding the MHC molecules are immensely diverse, leading to a large set of different MHC molecules each potentially binding a unique set of peptides. Characterizing each MHC-II molecule using peptide-screening binding assays is hence not a viable option. Here, we present an MHC-II binding prediction algorithm aimed at dealing with these challenges. The method is a pan-specific version of the earlier published allele-specific NN-align algorithm and does not require any pre-alignment of the input data. This allows the method to benefit also from information from alleles covered by limited binding data. The method is evaluated on a large and diverse set of benchmark data, and is shown to significantly out-perform state-of-the-art MHC-II prediction methods. In particular, the method is found to boost the performance for alleles characterized by limited binding data where conventional allele-specific methods tend to achieve poor prediction accuracy.
    The method thus shows great potential for efficiently boosting the accuracy of MHC-II binding prediction, as accurate predictions can be obtained for novel alleles at highly reduced experimental costs. Pan-specific binding predictions can be obtained for all alleles with known protein sequence, and the method can benefit by including data in the training from alleles even where only a few binders are known. The method and benchmark data are available at http://www.cbs.dtu.dk/services/NetMHCIIpan-2.0.

  15. Where to place the positive muon in the Periodic Table?

    PubMed

    Goli, Mohammad; Shahbazian, Shant

    2015-03-14

    In a recent study it was suggested that the positively charged muon is capable of forming its own "atoms in molecules" (AIM) in the muonic hydrogen-like molecules, composed of two electrons, a muon and one of the hydrogen's isotopes, and thus deserves to be placed in the Periodic Table [Phys. Chem. Chem. Phys., 2014, 16, 6602]. In the present report, the capacity of the positively charged muon in forming its own AIM is considered in a large set of molecules obtained by replacing all protons with muons in the hydrides of the second and third rows of the Periodic Table. Accordingly, in a comparative study the wavefunctions of both sets of hydrides and their muonic congeners are first derived beyond the Born-Oppenheimer (BO) paradigm, assuming protons and muons as quantum waves instead of clamped particles. Then, the non-BO wavefunctions are used to derive the AIM structures of both hydrides and muonic congeners within the context of the multi-component quantum theory of atoms in molecules. The results of the analysis demonstrate that muons are generally capable of forming their own atomic basins and the properties of these basins are not fundamentally different from those of AIM containing protons. Particularly, the bonding modes in the muonic species seem to be qualitatively similar to their congener hydrides and no new bonding model is required to describe the bonding of muons to a diverse set of neighboring atoms. All in all, the positively charged muon is similar to a proton from the structural and bonding viewpoint and deserves to be placed in the same box as hydrogen in the Periodic Table. This conclusion is in line with a large body of studies on the chemical kinetics of the muonic molecules portraying the positively charged muon as a lighter isotope of hydrogen.

  16. Push it to the limit: Characterizing the convergence of common sequences of basis sets for intermolecular interactions as described by density functional theory

    NASA Astrophysics Data System (ADS)

    Witte, Jonathon; Neaton, Jeffrey B.; Head-Gordon, Martin

    2016-05-01

    With the aim of systematically characterizing the convergence of common families of basis sets such that general recommendations for basis sets can be made, we have tested a wide variety of basis sets against complete-basis binding energies across the S22 set of intermolecular interactions—noncovalent interactions of small and medium-sized molecules consisting of first- and second-row atoms—with three distinct density functional approximations: SPW92, a form of local-density approximation; B3LYP, a global hybrid generalized gradient approximation; and B97M-V, a meta-generalized gradient approximation with nonlocal correlation. We have found that it is remarkably difficult to reach the basis set limit; for the methods and systems examined, the most complete basis is Jensen's pc-4. The Dunning correlation-consistent sequence of basis sets converges slowly relative to the Jensen sequence. The Karlsruhe basis sets are quite cost effective, particularly when a correction for basis set superposition error is applied: counterpoise-corrected def2-SVPD binding energies are better than corresponding energies computed in comparably sized Dunning and Jensen bases, and on par with uncorrected results in basis sets 3-4 times larger. These trends are exhibited regardless of the level of density functional approximation employed. A sense of the magnitude of the intrinsic incompleteness error of each basis set not only provides a foundation for guiding basis set choice in future studies but also facilitates quantitative comparison of existing studies on similar types of systems.
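    For reference, the counterpoise correction mentioned above is commonly written in the Boys-Bernardi form below. This is the standard textbook expression evaluated at the dimer geometry, with monomer deformation neglected; the abstract itself does not state which variant the authors used, so the notation here is an assumption. Superscripts denote the basis set and subscripts the system.

```latex
% Uncorrected binding energy: each monomer is computed in its own basis
E_{\mathrm{bind}} = E_{AB}^{AB} - E_{A}^{A} - E_{B}^{B}

% Counterpoise-corrected binding energy: each monomer is computed in the
% full dimer basis, with ghost functions placed on the partner's centres
E_{\mathrm{bind}}^{\mathrm{CP}} = E_{AB}^{AB} - E_{A}^{AB} - E_{B}^{AB}

% The difference is an estimate of the basis set superposition error
\Delta E_{\mathrm{BSSE}} = \left(E_{A}^{A} - E_{A}^{AB}\right) + \left(E_{B}^{B} - E_{B}^{AB}\right) \ge 0
```

    Because the dimer basis contains each monomer basis, the monomer energies in the dimer basis are variationally lower or equal, so the corrected binding is never stronger than the uncorrected one.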

  17. Accuracy of Lagrange-sinc functions as a basis set for electronic structure calculations of atoms and molecules

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Choi, Sunghwan; Hong, Kwangwoo; Kim, Jaewook

    2015-03-07

    We developed a self-consistent field program based on Kohn-Sham density functional theory using Lagrange-sinc functions as a basis set and examined its numerical accuracy for atoms and molecules through comparison with the results of Gaussian basis sets. The result of the Kohn-Sham inversion formula from the Lagrange-sinc basis set manifests that the pseudopotential method is essential for cost-effective calculations. The Lagrange-sinc basis set shows faster convergence of the kinetic and correlation energies of benzene as its size increases than the finite difference method does, though both share the same uniform grid. Using a scaling factor smaller than or equal to 0.226 bohr and pseudopotentials with nonlinear core correction, its accuracy for the atomization energies of the G2-1 set is comparable to all-electron complete basis set limits (mean absolute deviation ≤1 kcal/mol). The same basis set also shows small mean absolute deviations in the ionization energies, electron affinities, and static polarizabilities of atoms in the G2-1 set. In particular, the Lagrange-sinc basis set shows high accuracy with rapid convergence in describing density or orbital changes by an external electric field. Moreover, the Lagrange-sinc basis set can readily improve its accuracy toward a complete basis set limit by simply decreasing the scaling factor regardless of systems.

  18. An exact variational method to calculate rovibrational spectra of polyatomic molecules with large amplitude motion

    NASA Astrophysics Data System (ADS)

    Yu, Hua-Gen

    2016-08-01

    We report a new full-dimensional variational algorithm to calculate rovibrational spectra of polyatomic molecules using an exact quantum mechanical Hamiltonian. The rovibrational Hamiltonian of system is derived in a set of orthogonal polyspherical coordinates in the body-fixed frame. It is expressed in an explicitly Hermitian form. The Hamiltonian has a universal formulation regardless of the choice of orthogonal polyspherical coordinates and the number of atoms in molecule, which is suitable for developing a general program to study the spectra of many polyatomic systems. An efficient coupled-state approach is also proposed to solve the eigenvalue problem of the Hamiltonian using a multi-layer Lanczos iterative diagonalization approach via a set of direct product basis set in three coordinate groups: radial coordinates, angular variables, and overall rotational angles. A simple set of symmetric top rotational functions is used for the overall rotation whereas a potential-optimized discrete variable representation method is employed in radial coordinates. A set of contracted vibrationally diabatic basis functions is adopted in internal angular variables. Those diabatic functions are first computed using a neural network iterative diagonalization method based on a reduced-dimension Hamiltonian but only once. The final rovibrational energies are computed using a modified Lanczos method for a given total angular momentum J, which is usually fast. Two numerical applications to CH4 and H2CO are given, together with a comparison with previous results.

  19. An exact variational method to calculate rovibrational spectra of polyatomic molecules with large amplitude motion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yu, Hua-Gen

    We report a new full-dimensional variational algorithm to calculate rovibrational spectra of polyatomic molecules using an exact quantum mechanical Hamiltonian. The rovibrational Hamiltonian of system is derived in a set of orthogonal polyspherical coordinates in the body-fixed frame. It is expressed in an explicitly Hermitian form. The Hamiltonian has a universal formulation regardless of the choice of orthogonal polyspherical coordinates and the number of atoms in molecule, which is suitable for developing a general program to study the spectra of many polyatomic systems. An efficient coupled-state approach is also proposed to solve the eigenvalue problem of the Hamiltonian using a multi-layer Lanczos iterative diagonalization approach via a set of direct product basis set in three coordinate groups: radial coordinates, angular variables, and overall rotational angles. A simple set of symmetric top rotational functions is used for the overall rotation whereas a potential-optimized discrete variable representation method is employed in radial coordinates. A set of contracted vibrationally diabatic basis functions is adopted in internal angular variables. Those diabatic functions are first computed using a neural network iterative diagonalization method based on a reduced-dimension Hamiltonian but only once. The final rovibrational energies are computed using a modified Lanczos method for a given total angular momentum J, which is usually fast. Two numerical applications to CH4 and H2CO are given, together with a comparison with previous results.

  20. A dataset of images and morphological profiles of 30 000 small-molecule treatments using the Cell Painting assay

    PubMed Central

    Bray, Mark-Anthony; Gustafsdottir, Sigrun M; Rohban, Mohammad H; Singh, Shantanu; Ljosa, Vebjorn; Sokolnicki, Katherine L; Bittker, Joshua A; Bodycombe, Nicole E; Dančík, Vlado; Hasaka, Thomas P; Hon, Cindy S; Kemp, Melissa M; Li, Kejie; Walpita, Deepika; Wawer, Mathias J; Golub, Todd R; Schreiber, Stuart L; Clemons, Paul A; Shamji, Alykhan F

    2017-01-01

    Background: Large-scale image sets acquired by automated microscopy of perturbed samples enable a detailed comparison of cell states induced by each perturbation, such as a small molecule from a diverse library. Highly multiplexed measurements of cellular morphology can be extracted from each image and subsequently mined for a number of applications. Findings: This microscopy dataset includes 919 265 five-channel fields of view, representing 30 616 tested compounds, available at “The Cell Image Library” (CIL) repository. It also includes data files containing morphological features derived from each cell in each image, both at the single-cell level and population-averaged (i.e., per-well) level; the image analysis workflows that generated the morphological features are also provided. Quality-control metrics are provided as metadata, indicating fields of view that are out-of-focus or containing highly fluorescent material or debris. Lastly, chemical annotations are supplied for the compound treatments applied. Conclusions: Because computational algorithms and methods for handling single-cell morphological measurements are not yet routine, the dataset serves as a useful resource for the wider scientific community applying morphological (image-based) profiling. The dataset can be mined for many purposes, including small-molecule library enrichment and chemical mechanism-of-action studies, such as target identification. Integration with genetically perturbed datasets could enable identification of small-molecule mimetics of particular disease- or gene-related phenotypes that could be useful as probes or potential starting points for development of future therapeutics. PMID:28327978

  1. Introducing Bond-Line Organic Structures in High School Biology: An Activity that Incorporates Pleasant-Smelling Molecules

    ERIC Educational Resources Information Center

    Rios, Andro C.; French, Gerald

    2011-01-01

    Chemical education occurs in settings other than just the chemistry classroom. High school biology courses are frequently where students are introduced to organic molecules and their importance to cellular chemistry. However, structural representations are often intimidating because students have not been introduced to the language. As part of a…

  2. Harnessing the Efficiency of O(1D) Insertion Reactions for Prebiotic Astrochemistry

    NASA Astrophysics Data System (ADS)

    Widicus Weaver, Susanna

    We propose a THz spectroscopic study of the small prebiotic molecules aminomethanol, methanediol, and methoxymethanol. These target molecules are predicted as the dominant products of photo-driven grain surface chemistry in interstellar environments, and are precursors to important prebiotic molecules like sugars and amino acids. These molecules are also expected to be major contributors to the spectral line density in the submillimeter spectral surveys from the Herschel and SOFIA observatories. We will use our custom mixing source to produce these molecules through O(1D) insertion reactions with the precursor molecules methylamine, methanol, and dimethyl ether, respectively. We will then record their rotational spectra across the THz frequency range using our existing submillimeter spectrometer. This research will increase the science return from NASA missions because the target molecules serve as tracers of the simplest organic chemistry that can occur in star-forming regions. This chemistry begins with methanol, which is the predominant organic molecule observed in interstellar ices. Methanol photodissociation leads to small organic radicals such as CH3O, CH2OH, and CH3. These radicals can undergo combination reactions on interstellar ices to form many of the complex organic molecules that are routinely observed in star-forming regions. Our target molecules aminomethanol, methanediol, and methoxymethanol are some of the simplest molecules that can form from this type of chemistry, and serve as tracers of ice mantle liberation in star-forming regions. These molecules also participate in gas-phase reactions that lead to amino acids and sugars, and as such are fundamentally important prebiotic molecules in interstellar environments. These types of small organic molecules also have high spectral line density, and are major contributors to line confusion in observational spectral surveys such as those conducted by Herschel and SOFIA.
    Therefore, the proposed research will aid in full data interpretation from Herschel and SOFIA observations. Currently there is no spectral information available for these molecules to guide observational studies, despite their importance in astrochemistry. This is because these molecules are difficult to study in laboratory settings due to their instability and reactivity. We are using highly exothermic O(1D) insertion reactions to produce these molecules in a supersonic expansion, and investigating the products using THz spectroscopy. This work builds on the work involved in our previous APRA award (Grant NNX11AI07G) "New THz Tools to Support Herschel Observations: Integrative Studies in Laboratory Spectroscopy, Observational Astronomy, and Chemical Modeling". In this previous award, we laid the groundwork for these experiments by constructing and benchmarking the spectrometer, designing and testing the molecular source used for the O(1D) reactions, and studying the proposed formation reactions for the laboratory work through computational studies. We have confirmed production of methanol from O(1D) insertion into methane, and then applied this chemistry to produce vinyl alcohol from ethylene. We have now also obtained preliminary spectra of aminomethanol. Here we propose to extend this work by finishing the aminomethanol characterization as well as examining methanediol and methoxymethanol during the next proposal period.

  3. A general method for controlling and resolving rotational orientation of molecules in molecule-surface collisions

    PubMed Central

    Godsi, Oded; Corem, Gefen; Alkoby, Yosef; Cantin, Joshua T.; Krems, Roman V.; Somers, Mark F.; Meyer, Jörg; Kroes, Geert-Jan; Maniv, Tsofar; Alexandrowicz, Gil

    2017-01-01

    The outcome of molecule–surface collisions can be modified by pre-aligning the molecule; however, experiments accomplishing this are rare because of the difficulty of preparing molecules in aligned quantum states. Here we present a general solution to this problem based on magnetic manipulation of the rotational magnetic moment of the incident molecule. We apply the technique to the scattering of H2 from flat and stepped copper surfaces. We demonstrate control of the molecule's initial quantum state, allowing a direct comparison of differences in the stereodynamic scattering from the two surfaces. Our results show that a stepped surface exhibits a much larger dependence of the corrugation of the interaction on the alignment of the molecule than the low-index surface. We also demonstrate an extension of the technique that transforms the set-up into an interferometer, which is sensitive to molecular quantum states both before and after the scattering event. PMID:28480890

  4. Dimerization drives EGFR endocytosis through two sets of compatible endocytic codes.

    PubMed

    Wang, Qian; Chen, Xinmei; Wang, Zhixiang

    2015-03-01

    We have shown previously that epidermal growth factor (EGF) receptor (EGFR) endocytosis is controlled by EGFR dimerization. However, it is not clear how the dimerization drives receptor internalization. We propose that EGFR endocytosis is driven by dimerization, bringing two sets of endocytic codes, one contained in each receptor monomer, in close proximity. Here, we tested this hypothesis by generating specific homo- or hetero-dimers of various receptors and their mutants. We show that ErbB2 and ErbB3 homodimers are endocytosis deficient owing to the lack of endocytic codes. Interestingly, EGFR-ErbB2 or EGFR-ErbB3 heterodimers are also endocytosis deficient. Moreover, the heterodimer of EGFR and the endocytosis-deficient mutant EGFRΔ1005-1017 is also impaired in endocytosis. These results indicate that two sets of endocytic codes are required for receptor endocytosis. We found that an EGFR-PDGFRβ heterodimer is endocytosis deficient, although both EGFR and PDGFRβ homodimers are endocytosis-competent, indicating that two compatible sets of endocytic codes are required. Finally, we found that to mediate the endocytosis of the receptor dimer, the two sets of compatible endocytic codes, one contained in each receptor molecule, have to be spatially coordinated. © 2015. Published by The Company of Biologists Ltd.

  5. Alignment-independent technique for 3D QSAR analysis

    NASA Astrophysics Data System (ADS)

    Wilkes, Jon G.; Stoyanova-Slavova, Iva B.; Buzatu, Dan A.

    2016-04-01

    Molecular biochemistry is controlled by 3D phenomena but structure-activity models based on 3D descriptors are infrequently used for large data sets because of the computational overhead for determining molecular conformations. A diverse dataset of 146 androgen receptor binders was used to investigate how different methods for defining molecular conformations affect the performance of 3D-quantitative spectral data activity relationship models. Molecular conformations tested: (1) global minimum of molecules' potential energy surface; (2) alignment-to-templates using equal electronic and steric force field contributions; (3) alignment using contributions "Best-for-Each" template; (4) non-energy optimized, non-aligned (2D > 3D). Aggregate predictions from models were compared. Highest average coefficients of determination ranged from R²(Test) = 0.56 to 0.61. The best model using 2D > 3D (imported directly from ChemSpider) produced R²(Test) = 0.61. It was superior to energy-minimized and conformation-aligned models and was achieved in only 3-7% of the time required using the other conformation strategies. Predictions averaged from models built on different conformations achieved a consensus R²(Test) = 0.65. The best 2D > 3D model was analyzed for underlying structure-activity relationships. For the compound strongest binding to the androgen receptor, 10 substructural features contributing to binding were flagged. Utility of 2D > 3D was compared for two other activity endpoints, each modeling a medium sized data set. Results suggested that large scale, accurate predictions using 2D > 3D SDAR descriptors may be produced for interactions involving endocrine system nuclear receptors and other data sets in which strongest activities are produced by fairly inflexible substrates.

  6. Development of a reference material of a single DNA molecule for the quality control of PCR testing.

    PubMed

    Mano, Junichi; Hatano, Shuko; Futo, Satoshi; Yoshii, Junji; Nakae, Hiroki; Naito, Shigehiro; Takabatake, Reona; Kitta, Kazumi

    2014-09-02

    We developed a reference material of a single DNA molecule with a specific nucleotide sequence. The double-strand linear DNA which has PCR target sequences at the both ends was prepared as a reference DNA molecule, and we named the PCR targets on each side as confirmation sequence and standard sequence. The highly diluted solution of the reference molecule was dispensed into 96 wells of a plastic PCR plate to make the average number of molecules in a well below one. Subsequently, the presence or absence of the reference molecule in each well was checked by real-time PCR targeting for the confirmation sequence. After an enzymatic treatment of the reaction mixture in the positive wells for the digestion of PCR products, the resultant solution was used as the reference material of a single DNA molecule with the standard sequence. PCR analyses revealed that the prepared samples included only one reference molecule with high probability. The single-molecule reference material developed in this study will be useful for the absolute evaluation of a detection limit of PCR-based testing methods, the quality control of PCR analyses, performance evaluations of PCR reagents and instruments, and the preparation of an accurate calibration curve for real-time PCR quantitation.
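
    The limiting-dilution step in this record can be illustrated with a short calculation. The sketch below assumes that the number of molecules per well follows a Poisson distribution (an assumption, not stated in the abstract) and uses hypothetical mean values. It estimates what fraction of PCR-positive wells would hold exactly one molecule.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k molecules in a well, for a Poisson mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def single_molecule_fraction(mean: float) -> float:
    """Among wells holding at least one molecule (PCR-positive wells),
    return the fraction that hold exactly one."""
    p_positive = 1.0 - poisson_pmf(0, mean)
    return poisson_pmf(1, mean) / p_positive


if __name__ == "__main__":
    # Hypothetical average numbers of molecules per well (all at or below one).
    for mean in (0.05, 0.1, 0.5, 1.0):
        positive = 1.0 - poisson_pmf(0, mean)
        single = single_molecule_fraction(mean)
        print(f"mean={mean:<4} positive wells={positive:.3f} "
              f"single among positives={single:.3f}")
```

    Under this assumption, a lower mean per well leaves fewer positive wells but makes each positive well more likely to contain a single molecule, which is the trade-off behind diluting to below one molecule per well.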

  7. Machine learning of molecular properties: Locality and active learning

    NASA Astrophysics Data System (ADS)

    Gubaev, Konstantin; Podryabinkin, Evgeny V.; Shapeev, Alexander V.

    2018-06-01

    In recent years, machine learning techniques have shown great potential in various problems from a multitude of disciplines, including materials design and drug discovery. The high computational speed on the one hand and the accuracy comparable to that of density functional theory on the other hand make machine learning algorithms efficient for high-throughput screening through chemical and configurational space. However, the machine learning algorithms available in the literature require large training datasets to reach chemical accuracy and also show large errors for the so-called outliers (out-of-sample molecules that are not well represented in the training set). In the present paper, we propose a new machine learning algorithm for predicting molecular properties that addresses these two issues: it is based on a local model of interatomic interactions providing high accuracy when trained on relatively small training sets and an active learning algorithm of optimally choosing the training set that significantly reduces the errors for the outliers. We compare our model to the other state-of-the-art algorithms from the literature on the widely used benchmark tests.

  8. Revealing nonergodic dynamics in living cells from a single particle trajectory

    NASA Astrophysics Data System (ADS)

    Lanoiselée, Yann; Grebenkov, Denis S.

    2016-05-01

    We propose the improved ergodicity and mixing estimators to identify nonergodic dynamics from a single particle trajectory. The estimators are based on the time-averaged characteristic function of the increments and can thus capture additional information on the process as compared to the conventional time-averaged mean-square displacement. The estimators are first investigated and validated for several models of anomalous diffusion, such as ergodic fractional Brownian motion and diffusion on percolating clusters, and nonergodic continuous-time random walks and scaled Brownian motion. The estimators are then applied to two sets of earlier published trajectories of mRNA molecules inside live Escherichia coli cells and of Kv2.1 potassium channels in the plasma membrane. These statistical tests did not reveal nonergodic features in the former set, while some trajectories of the latter set could be classified as nonergodic. Time averages along such trajectories are thus not representative and may be strongly misleading. Since the estimators do not rely on ensemble averages, the nonergodic features can be revealed separately for each trajectory, providing a more flexible and reliable analysis of single-particle tracking experiments in microbiology.

  9. Performance of an Optimally Tuned Range-Separated Hybrid Functional for 0-0 Electronic Excitation Energies.

    PubMed

    Jacquemin, Denis; Moore, Barry; Planchat, Aurélien; Adamo, Carlo; Autschbach, Jochen

    2014-04-08

    Using a set of 40 conjugated molecules, we assess the performance of an "optimally tuned" range-separated hybrid functional in reproducing the experimental 0-0 energies. The selected protocol accounts for the impact of solvation using a corrected linear-response continuum approach and vibrational corrections through calculations of the zero-point energies of both ground and excited-states and provides basis set converged data thanks to the systematic use of diffuse-containing atomic basis sets at all computational steps. It turns out that an optimally tuned long-range corrected hybrid form of the Perdew-Burke-Ernzerhof functional, LC-PBE*, delivers both the smallest mean absolute error (0.20 eV) and standard deviation (0.15 eV) of all tested approaches, while the obtained correlation (0.93) is large but remains slightly smaller than its M06-2X counterpart (0.95). In addition, the efficiency of two other recently developed exchange-correlation functionals, namely SOGGA11-X and ωB97X-D, has been determined in order to allow more complete comparisons with previously published data.

  10. A molecule-based genetic association approach implicates a range of voltage-gated calcium channels associated with schizophrenia.

    PubMed

    Li, Wen; Fan, Chun Chieh; Mäki-Marttunen, Tuomo; Thompson, Wesley K; Schork, Andrew J; Bettella, Francesco; Djurovic, Srdjan; Dale, Anders M; Andreassen, Ole A; Wang, Yunpeng

    2018-06-01

    Traditional genome-wide association studies (GWAS) have successfully detected genetic variants associated with schizophrenia. However, only a small fraction of heritability can be explained. Gene-set/pathway-based methods can overcome limitations arising from single nucleotide polymorphism (SNP)-based analysis, but most of them place constraints on size which may exclude highly specific and functional sets, like macromolecules. Voltage-gated calcium (Cav) channels, belonging to macromolecules, are composed of several subunits whose encoding genes are located far away or even on different chromosomes. We combined information about such molecules with GWAS data to investigate how functional channels are associated with schizophrenia. We defined a biologically meaningful SNP-set based on channel structure and performed an association study by using a validated method: SNP-set (sequence) kernel association test. We identified eight subtypes of Cav channels significantly associated with schizophrenia from a subsample of published data (N = 56,605), including the L-type channels (Cav1.1, Cav1.2, Cav1.3), P-/Q-type Cav2.1, N-type Cav2.2, R-type Cav2.3, T-type Cav3.1, and Cav3.3. Only genes from Cav1.2 and Cav3.3 have been implicated by the largest GWAS (N = 82,315). Each subtype of Cav channels showed relatively high chip heritability, proportional to the size of its constituent gene regions. The results suggest that abnormalities of Cav channels may play an important role in the pathophysiology of schizophrenia and these channels may represent appropriate drug targets for therapeutics. Analyzing subunit-encoding genes of a macromolecule in aggregate is a complementary way to identify more genetic variants of polygenic diseases. This study offers potential power for discovering the biological mechanisms of schizophrenia. © 2018 Wiley Periodicals, Inc.

  11. Problem-Solving Test: Restriction Endonuclease Mapping

    ERIC Educational Resources Information Center

    Szeberenyi, Jozsef

    2011-01-01

    The term "restriction endonuclease mapping" covers a number of related techniques used to identify specific restriction enzyme recognition sites on small DNA molecules. A method for restriction endonuclease mapping of a 1,000-basepair (bp)-long DNA molecule is described in the fictitious experiment of this test. The most important fact needed to…

  12. Benchmark quality total atomization energies of small polyatomic molecules

    NASA Astrophysics Data System (ADS)

    Martin, Jan M. L.; Taylor, Peter R.

    1997-05-01

    Successive coupled-cluster [CCSD(T)] calculations in basis sets of spdf, spdfg, and spdfgh quality, combined with separate Schwartz-type extrapolations A+B/(l+1/2)^α of the self-consistent field (SCF) and correlation energies, permit the calculations of molecular total atomization energies (TAEs) with a mean absolute error of as low as 0.12 kcal/mol. For the largest molecule treated, C2H4, we find ∑D0=532.0 kcal/mol, in perfect agreement with experiment. The aug-cc-pV5Z basis set recovers on average about 99% of the valence correlation contribution to the TAE, and essentially the entire SCF contribution.
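
    The Schwartz-type form quoted in this record, E(l) = A + B/(l+1/2)^α, can be sketched as a two-point extrapolation. The code below is a minimal illustration and not the authors' procedure: it assumes the exponent α is held fixed (the value 4 is a hypothetical choice), and the energies are synthetic numbers generated from known A and B so that the recovery of the limit A can be checked.

```python
from typing import Tuple


def schwartz_two_point(e_lo: float, l_lo: int,
                       e_hi: float, l_hi: int,
                       alpha: float) -> Tuple[float, float]:
    """Solve E(l) = A + B / (l + 1/2)**alpha for A and B from two energies.

    l_lo and l_hi are the highest angular momenta in the two basis sets;
    alpha is assumed to be fixed in advance.
    """
    x_lo = (l_lo + 0.5) ** (-alpha)
    x_hi = (l_hi + 0.5) ** (-alpha)
    b = (e_lo - e_hi) / (x_lo - x_hi)
    a = e_hi - b * x_hi  # A is the estimated basis-set limit
    return a, b


if __name__ == "__main__":
    # Synthetic data: choose A and B, generate energies, then recover them.
    true_a, true_b, alpha = -1.0, 0.5, 4.0
    energy = lambda l: true_a + true_b / (l + 0.5) ** alpha
    a, b = schwartz_two_point(energy(4), 4, energy(5), 5, alpha)
    print(f"extrapolated limit A = {a:.6f}, slope B = {b:.6f}")
```

    With three basis sets one could also treat α as a free parameter; that variant needs a nonlinear solve and is not shown here.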

  13. Atomistic Monte Carlo Simulation of Lipid Membranes

    PubMed Central

    Wüstner, Daniel; Sklenar, Heinz

    2014-01-01

    Biological membranes are complex assemblies of many different molecules of which analysis demands a variety of experimental and computational approaches. In this article, we explain challenges and advantages of atomistic Monte Carlo (MC) simulation of lipid membranes. We provide an introduction into the various move sets that are implemented in current MC methods for efficient conformational sampling of lipids and other molecules. In the second part, we demonstrate for a concrete example, how an atomistic local-move set can be implemented for MC simulations of phospholipid monomers and bilayer patches. We use our recently devised chain breakage/closure (CBC) local move set in the bond-/torsion angle space with the constant-bond-length approximation (CBLA) for the phospholipid dipalmitoylphosphatidylcholine (DPPC). We demonstrate rapid conformational equilibration for a single DPPC molecule, as assessed by calculation of molecular energies and entropies. We also show transition from a crystalline-like to a fluid DPPC bilayer by the CBC local-move MC method, as indicated by the electron density profile, head group orientation, area per lipid, and whole-lipid displacements. We discuss the potential of local-move MC methods in combination with molecular dynamics simulations, for example, for studying multi-component lipid membranes containing cholesterol. PMID:24469314

  14. Route to three-dimensional fragments using diversity-oriented synthesis

    PubMed Central

    Hung, Alvin W.; Ramek, Alex; Wang, Yikai; Kaya, Taner; Wilson, J. Anthony; Clemons, Paul A.; Young, Damian W.

    2011-01-01

    Fragment-based drug discovery (FBDD) has proven to be an effective means of producing high-quality chemical ligands as starting points for drug-discovery pursuits. The increasing number of clinical candidate drugs developed using FBDD approaches is a testament of the efficacy of this approach. The success of fragment-based methods is highly dependent on the identity of the fragment library used for screening. The vast majority of FBDD has centered on the use of sp2-rich aromatic compounds. An expanded set of fragments that possess more 3D character would provide access to a larger chemical space of fragments than those currently used. Diversity-oriented synthesis (DOS) aims to efficiently generate a set of molecules diverse in skeletal and stereochemical properties. Molecules derived from DOS have also displayed significant success in the modulation of function of various “difficult” targets. Herein, we describe the application of DOS toward the construction of a unique set of fragments containing highly sp3-rich skeletons for fragment-based screening. Using cheminformatic analysis, we quantified the shapes and physical properties of the new 3D fragments and compared them with a database containing known fragment-like molecules. PMID:21482811

  15. Route to three-dimensional fragments using diversity-oriented synthesis.

    PubMed

    Hung, Alvin W; Ramek, Alex; Wang, Yikai; Kaya, Taner; Wilson, J Anthony; Clemons, Paul A; Young, Damian W

    2011-04-26

    Fragment-based drug discovery (FBDD) has proven to be an effective means of producing high-quality chemical ligands as starting points for drug-discovery pursuits. The increasing number of clinical candidate drugs developed using FBDD approaches is a testament of the efficacy of this approach. The success of fragment-based methods is highly dependent on the identity of the fragment library used for screening. The vast majority of FBDD has centered on the use of sp(2)-rich aromatic compounds. An expanded set of fragments that possess more 3D character would provide access to a larger chemical space of fragments than those currently used. Diversity-oriented synthesis (DOS) aims to efficiently generate a set of molecules diverse in skeletal and stereochemical properties. Molecules derived from DOS have also displayed significant success in the modulation of function of various "difficult" targets. Herein, we describe the application of DOS toward the construction of a unique set of fragments containing highly sp(3)-rich skeletons for fragment-based screening. Using cheminformatic analysis, we quantified the shapes and physical properties of the new 3D fragments and compared them with a database containing known fragment-like molecules.

  16. Optimal Down Regulation of mRNA Translation

    NASA Astrophysics Data System (ADS)

    Zarai, Yoram; Margaliot, Michael; Tuller, Tamir

    2017-01-01

    Down regulation of mRNA translation is an important problem in various bio-medical domains ranging from developing effective medicines for tumors and for viral diseases to developing attenuated virus strains that can be used for vaccination. Here, we study the problem of down regulation of mRNA translation using a mathematical model called the ribosome flow model (RFM). In the RFM, the mRNA molecule is modeled as a chain of n sites. The flow of ribosomes between consecutive sites is regulated by n + 1 transition rates. Given a set of feasible transition rates, that models the outcome of all possible mutations, we consider the problem of maximally down regulating protein production by altering the rates within this set of feasible rates. Under certain conditions on the feasible set, we show that an optimal solution can be determined efficiently. We also rigorously analyze two special cases of the down regulation optimization problem. Our results suggest that one must focus on the position along the mRNA molecule where the transition rate has the strongest effect on the protein production rate. However, this rate is not necessarily the slowest transition rate along the mRNA molecule. We discuss some of the biological implications of these results.
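
    The ribosome flow model described in this record (a chain of n sites with n + 1 transition rates) can be sketched numerically. The code below assumes the commonly used RFM equations, which are not spelled out in the abstract: each site density x_i changes by the inflow λ_{i-1} x_{i-1} (1 - x_i) minus the outflow λ_i x_i (1 - x_{i+1}), with x_0 = 1 and x_{n+1} = 0, and the production rate is λ_n x_n. The rate values, step size, and forward-Euler integration are hypothetical choices for illustration only.

```python
from typing import List, Sequence


def rfm_derivative(x: Sequence[float], rates: Sequence[float]) -> List[float]:
    """Time derivative of the n site densities for n + 1 transition rates."""
    # Pad with x_0 = 1 (entry always available) and x_{n+1} = 0 (exit always free).
    padded = [1.0] + list(x) + [0.0]
    # flows[i] is the ribosome flow from site i to site i + 1.
    flows = [rates[i] * padded[i] * (1.0 - padded[i + 1])
             for i in range(len(rates))]
    return [flows[i] - flows[i + 1] for i in range(len(x))]


def steady_state_production(rates: Sequence[float],
                            dt: float = 0.01, steps: int = 20000) -> float:
    """Integrate from an empty chain with forward Euler and return the
    production rate (last rate times last site density)."""
    x = [0.0] * (len(rates) - 1)
    for _ in range(steps):
        dx = rfm_derivative(x, rates)
        x = [xi + dt * di for xi, di in zip(x, dx)]
    return rates[-1] * x[-1]


if __name__ == "__main__":
    base = [1.0] * 6                          # n = 5 sites, hypothetical rates
    slowed = [1.0, 1.0, 0.2, 1.0, 1.0, 1.0]   # one rate reduced
    print("baseline production:", round(steady_state_production(base), 4))
    print("slowed production:  ", round(steady_state_production(slowed), 4))
```

    Repeating the run while lowering one rate at a time gives a simple way to see which position affects production most, which is the kind of sensitivity question the record discusses.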

  17. Alchemical prediction of hydration free energies for SAMPL

    PubMed Central

    Mobley, David L.; Liu, Shaui; Cerutti, David S.; Swope, William C.; Rice, Julia E.

    2013-01-01

    Hydration free energy calculations have become important tests of force fields. Alchemical free energy calculations based on molecular dynamics simulations provide a rigorous way to calculate these free energies for a particular force field, given sufficient sampling. Here, we report results of alchemical hydration free energy calculations for the set of small molecules comprising the 2011 Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) challenge. Our calculations are largely based on the Generalized Amber Force Field (GAFF) with several different charge models, and we achieved RMS errors in the 1.4-2.2 kcal/mol range depending on charge model, marginally higher than what we typically observed in previous studies [1-5]. The test set consists of ethane, biphenyl, and a dibenzyl dioxin, as well as a series of chlorinated derivatives of each. We found that, for this set, using high-quality partial charges from MP2/cc-PVTZ SCRF RESP fits provided marginally improved agreement with experiment over using AM1-BCC partial charges as we have more typically done, in keeping with our recent findings [5]. Switching to OPLS Lennard-Jones parameters with AM1-BCC charges also improves agreement with experiment. We also find a number of chemical trends within each molecular series which we can explain, but there are also some surprises, including some that are captured by the calculations and some that are not. PMID:22198475

  18. Chirality measures of α-amino acids.

    PubMed

    Jamróz, Michał H; Rode, Joanna E; Ostrowski, Sławomir; Lipiński, Piotr F J; Dobrowolski, Jan Cz

    2012-06-25

    To measure molecular chirality, the molecule is treated as a finite set of points in the Euclidean R^3 space supplemented by k properties, p_1^(i), p_2^(i), ..., p_k^(i) assigned to the ith atom, which constitute a point in the Property P^k space. Chirality measures are described as the distance between a molecule and its mirror image minimized over all its arbitrary orientation-preserving isometries in the R^3 × P^k Cartesian product space. Following this formalism, different chirality measures can be estimated by taking into consideration different sets of atomic properties. Here, for α-amino acid zwitterionic structures taken from the Cambridge Structural Database and for all 1684 neutral conformers of 19 biogenic α-amino acid molecules, except glycine and cystine, found at the B3LYP/6-31G** level, chirality measures have been calculated by a CHIMEA program written in this project. It is demonstrated that there is a significant correlation between the measures determined for the α-amino acid zwitterions in crystals and the neutral forms in the gas phase. Performance of the studied chirality measures with changes of the basis set and computation method was also checked. An exemplary quantitative structure–activity relationship (QSAR) application of the chirality measures was presented by an introductory model for the benchmark Cramer data set of steroidal ligands of the sex-hormone binding globulin.

  19. Deducing protein structures using logic programming: exploiting minimum data of diverse types.

    PubMed

    Sibbald, P R

    1995-04-21

    The extent to which a protein can be modeled from constraint data depends on the amount and quality of the data. This report quantifies a relationship between the amount of data and the achievable model resolution. In an information-theoretic framework the number of bits of information per residue needed to constrain a solution was calculated. The number of bits provided by different kinds of constraints was estimated from a tetrahedral lattice where all unique molecules of 6, 9, ..., 21 atoms were enumerated. Subsets of these molecules consistent with different constraint sets were then chosen, counted, and the root-mean-square distance between them calculated. This provided the desired relations. In a discrete system the number of possible models can be severely limited with relatively few constraints. An expert system that can model a protein from data of different types was built to illustrate the principle and was tested using known proteins as examples. C-alpha resolutions of 5 A are obtainable from 5 bits of information per amino acid and, in principle, from data that could be rapidly collected using standard biophysical techniques.

  20. Kinetic analysis of single molecule FRET transitions without trajectories

    NASA Astrophysics Data System (ADS)

    Schrangl, Lukas; Göhring, Janett; Schütz, Gerhard J.

    2018-03-01

    Single molecule Förster resonance energy transfer (smFRET) is a popular tool to study biological systems that undergo topological transitions on the nanometer scale. smFRET experiments typically require recording of long smFRET trajectories and subsequent statistical analysis to extract parameters such as the states' lifetimes. Alternatively, analysis of probability distributions exploits the shapes of smFRET distributions at well chosen exposure times and hence works without the acquisition of time traces. Here, we describe a variant that utilizes statistical tests to compare experimental datasets with Monte Carlo simulations. For a given model, parameters are varied to cover the full realistic parameter space. As output, the method yields p-values which quantify the likelihood for each parameter setting to be consistent with the experimental data. The method provides suitable results even if the actual lifetimes differ by an order of magnitude. We also demonstrated the robustness of the method to inaccurately determined input parameters. As proof of concept, the new method was applied to the determination of transition rate constants for Holliday junctions.

  1. The structural, electronic and spectroscopic properties of 4FPBAPE molecule: Experimental and theoretical study

    NASA Astrophysics Data System (ADS)

    Tanış, Emine; Babur Sas, Emine; Kurban, Mustafa; Kurt, Mustafa

    2018-02-01

    An experimental and theoretical study of the 4-Formyl Phenyl Boronic Acid Pinacol Ester (4FPBAPE) molecule was performed in this work. 1H, 13C NMR and UV-Vis spectra were tested in dimethyl sulfoxide (DMSO). The structural, spectroscopic properties and energies of 4FPBAPE were obtained for two potential conformers from density functional theory (DFT) with B3LYP/6-311G(d,p) and CAM-B3LYP/6-311G(d,p) basis sets. The optimal geometry of those structures was obtained according to the position of the oxygen atom upon determining the scan coordinates for each conformation. The most stable conformer was found to be the A2 form. The fundamental vibrations were determined based on the optimized structure in terms of total energy distribution. Electronic properties such as oscillator strength, wavelength, excitation energy, HOMO, LUMO and molecular electrostatic potential and structural properties such as radial distribution functions (RDF) and probability density depending on coordination number are presented. Theoretical results of 4FPBAPE spectra were found to be compatible with observed spectra.

  2. Correlated natural transition orbital framework for low-scaling excitation energy calculations (CorNFLEx).

    PubMed

    Baudin, Pablo; Kristensen, Kasper

    2017-06-07

    We present a new framework for calculating coupled cluster (CC) excitation energies at a reduced computational cost. It relies on correlated natural transition orbitals (NTOs), denoted CIS(D')-NTOs, which are obtained by diagonalizing generalized hole and particle density matrices determined from configuration interaction singles (CIS) information and additional terms that represent correlation effects. A transition-specific reduced orbital space is determined based on the eigenvalues of the CIS(D')-NTOs, and a standard CC excitation energy calculation is then performed in that reduced orbital space. The new method is denoted CorNFLEx (Correlated Natural transition orbital Framework for Low-scaling Excitation energy calculations). We calculate second-order approximate CC singles and doubles (CC2) excitation energies for a test set of organic molecules and demonstrate that CorNFLEx yields excitation energies of CC2 quality at a significantly reduced computational cost, even for relatively small systems and delocalized electronic transitions. In order to illustrate the potential of the method for large molecules, we also apply CorNFLEx to calculate CC2 excitation energies for a series of solvated formamide clusters (up to 4836 basis functions).

  3. Single-molecule FRET unveils induced-fit mechanism for substrate selectivity in flap endonuclease 1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rashid, Fahad; Harris, Paul D.; Zaher, Manal S.

    Human flap endonuclease 1 (FEN1) and related structure-specific 5’nucleases precisely identify and incise aberrant DNA structures during replication, repair and recombination to avoid genomic instability. Yet, it is unclear how the 5’nuclease mechanisms of DNA distortion and protein ordering robustly mediate efficient and accurate substrate recognition and catalytic selectivity. Here, single-molecule sub-millisecond and millisecond analyses of FEN1 reveal a protein-DNA induced-fit mechanism that efficiently verifies substrate and suppresses off-target cleavage. FEN1 sculpts DNA with diffusion-limited kinetics to test DNA substrate. This DNA distortion mutually ‘locks’ protein and DNA conformation and enables substrate verification with extreme precision. Strikingly, FEN1 never misses cleavage of its cognate substrate while blocking probable formation of catalytically competent interactions with noncognate substrates and fostering their pre-incision dissociation. These findings establish FEN1 has practically perfect precision and that separate control of induced-fit substrate recognition sets up the catalytic selectivity of the nuclease active site for genome stability.

  4. Single-molecule FRET unveils induced-fit mechanism for substrate selectivity in flap endonuclease 1

    DOE PAGES

    Rashid, Fahad; Harris, Paul D.; Zaher, Manal S.; ...

    2017-02-23

    Human flap endonuclease 1 (FEN1) and related structure-specific 5’nucleases precisely identify and incise aberrant DNA structures during replication, repair and recombination to avoid genomic instability. Yet, it is unclear how the 5’nuclease mechanisms of DNA distortion and protein ordering robustly mediate efficient and accurate substrate recognition and catalytic selectivity. Here, single-molecule sub-millisecond and millisecond analyses of FEN1 reveal a protein-DNA induced-fit mechanism that efficiently verifies substrate and suppresses off-target cleavage. FEN1 sculpts DNA with diffusion-limited kinetics to test DNA substrate. This DNA distortion mutually ‘locks’ protein and DNA conformation and enables substrate verification with extreme precision. Strikingly, FEN1 never misses cleavage of its cognate substrate while blocking probable formation of catalytically competent interactions with noncognate substrates and fostering their pre-incision dissociation. These findings establish FEN1 has practically perfect precision and that separate control of induced-fit substrate recognition sets up the catalytic selectivity of the nuclease active site for genome stability.

  5. Bayesian screening for active compounds in high-dimensional chemical spaces combining property descriptors and molecular fingerprints.

    PubMed

    Vogt, Martin; Bajorath, Jürgen

    2008-01-01

    Bayesian classifiers are increasingly being used to distinguish active from inactive compounds and search large databases for novel active molecules. We introduce an approach to directly combine the contributions of property descriptors and molecular fingerprints in the search for active compounds that is based on a Bayesian framework. Conventionally, property descriptors and fingerprints are used as alternative features for virtual screening methods. Following the approach introduced here, probability distributions of descriptor values and fingerprint bit settings are calculated for active and database molecules and the divergence between the resulting combined distributions is determined as a measure of biological activity. In test calculations on a large number of compound activity classes, this methodology was found to consistently perform better than similarity searching using fingerprints and multiple reference compounds or Bayesian screening calculations using probability distributions calculated only from property descriptors. These findings demonstrate that there is considerable synergy between different types of property descriptors and fingerprints in recognizing diverse structure-activity relationships, at least in the context of Bayesian modeling.

  6. Atomic Charge Parameters for the Finite Difference Poisson-Boltzmann Method Using Electronegativity Neutralization.

    PubMed

    Yang, Qingyi; Sharp, Kim A

    2006-07-01

    An optimization of Rappe and Goddard's charge equilibration (QEq) method of assigning atomic partial charges is described. This optimization is designed for fast and accurate calculation of solvation free energies using the finite difference Poisson-Boltzmann (FDPB) method. The optimization is performed against experimental small molecule solvation free energies using the FDPB method and adjusting Rappe and Goddard's atomic electronegativity values. Using a test set of compounds for which experimental solvation energies are available and a rather small number of parameters, very good agreement was obtained with experiment, with a mean unsigned error of about 0.5 kcal/mol. The QEq atomic partial charge assignment method can reflect the effects of the conformational changes and solvent induction on charge distribution in molecules. In the second section of the paper we examined this feature with a study of the alanine dipeptide conformations in water solvent. The different contributions to the energy surface of the dipeptide were examined and compared with the results from fixed CHARMm charge potential, which is widely used for molecular dynamics studies.

  7. Single-molecule FRET unveils induced-fit mechanism for substrate selectivity in flap endonuclease 1

    PubMed Central

    Rashid, Fahad; Harris, Paul D; Zaher, Manal S; Sobhy, Mohamed A; Joudeh, Luay I; Yan, Chunli; Piwonski, Hubert; Tsutakawa, Susan E; Ivanov, Ivaylo; Tainer, John A; Habuchi, Satoshi; Hamdan, Samir M

    2017-01-01

    Human flap endonuclease 1 (FEN1) and related structure-specific 5’nucleases precisely identify and incise aberrant DNA structures during replication, repair and recombination to avoid genomic instability. Yet, it is unclear how the 5’nuclease mechanisms of DNA distortion and protein ordering robustly mediate efficient and accurate substrate recognition and catalytic selectivity. Here, single-molecule sub-millisecond and millisecond analyses of FEN1 reveal a protein-DNA induced-fit mechanism that efficiently verifies substrate and suppresses off-target cleavage. FEN1 sculpts DNA with diffusion-limited kinetics to test DNA substrate. This DNA distortion mutually ‘locks’ protein and DNA conformation and enables substrate verification with extreme precision. Strikingly, FEN1 never misses cleavage of its cognate substrate while blocking probable formation of catalytically competent interactions with noncognate substrates and fostering their pre-incision dissociation. These findings establish FEN1 has practically perfect precision and that separate control of induced-fit substrate recognition sets up the catalytic selectivity of the nuclease active site for genome stability. DOI: http://dx.doi.org/10.7554/eLife.21884.001 PMID:28230529

  8. Measuring performance in off-patent drug markets: a methodological framework and empirical evidence from twelve EU Member States.

    PubMed

    Kanavos, Panos

    2014-11-01

    This paper develops a methodological framework to help evaluate the performance of generic pharmaceutical policies post-patent expiry or after loss of exclusivity in non-tendering settings, comprising five indicators (generic availability, time delay to and speed of generic entry, number of generic competitors, price developments, and generic volume share evolution) and proposes a series of metrics to evaluate performance. The paper subsequently tests this framework across twelve EU Member States (MS) by using IMS data on 101 patent expired molecules over the 1998-2010 period. Results indicate that significant variation exists in generic market entry, price competition and generic penetration across the study countries. Size of a geographical market is not a predictor of generic market entry intensity or price decline. Regardless of geographic or product market size, many off patent molecules lack generic competitors two years after loss of exclusivity. The ranges in each of the five proposed indicators suggest, first, that there are numerous factors--including institutional ones--contributing to the success of generic entry, price decline and market penetration and, second, MS should seek a combination of supply and demand-side policies in order to maximise cost-savings from generics. Overall, there seems to be considerable potential for faster generic entry, uptake and greater generic competition, particularly for molecules at the lower end of the market. Copyright © 2014. Published by Elsevier Ireland Ltd.

  9. Integration of multi-scale molecular modeling approaches with experiments for the in silico guided design and discovery of novel hERG-Neutral antihypertensive oxazalone and imidazolone derivatives and analysis of their potential restrictive effects on cell proliferation.

    PubMed

    Durdagi, Serdar; Aksoydan, Busecan; Erol, Ismail; Kantarcioglu, Isik; Ergun, Yavuz; Bulut, Gulay; Acar, Melih; Avsar, Timucin; Liapakis, George; Karageorgos, Vlasios; Salmas, Ramin E; Sergi, Barış; Alkhatib, Sara; Turan, Gizem; Yigit, Berfu Nur; Cantasir, Kutay; Kurt, Bahar; Kilic, Turker

    2018-02-10

    AT1 antagonists are the most recent class of drug molecules against hypertension, and they act by blocking the detrimental effects of angiotensin II (A-II) when it acts on the type I (AT1) A-II receptor. The effects of AT1 antagonists are not limited to cardiovascular diseases. AT1 receptor blockers may be used as potential anti-cancer agents because they inhibit cell proliferation stimulated by A-II. Therefore, AT1 receptors and the A-II biosynthesis mechanisms are targets for the development of new synthetic drugs and therapeutic treatment of various cardiovascular and other diseases. In this work, multi-scale molecular modeling approaches were applied, and oxazolone and imidazolone derivatives were found to show similar or better interaction energy profiles compared to the FDA-approved sartan molecules at the binding site of the AT1 receptor. In silico-guided designed hit molecules were then synthesized and tested for their binding affinities to the human AT1 receptor in radioligand binding studies, using [125I-Sar1-Ile8] AngII. Among the compounds tested, molecules 19d and 9j bound to the receptor in a dose-response manner and with relatively high affinities. Next, cytotoxicity and wound healing assays were performed for these hit molecules. Since hit molecule 19d led to deceleration of cell motility in all three cell lines (NIH3T3, A549, and H358) tested in this study, this molecule was investigated in further tests. In two cell lines (HUVEC and MCF-7) tested, 19d induced G2/M cell cycle arrest in a concentration-dependent manner. Adherent cells detached from the plates and underwent cell death, possibly due to apoptosis, at 19d concentrations that induced cell cycle arrest. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  10. QSPR models of n-octanol/water partition coefficients and aqueous solubility of halogenated methyl-phenyl ethers by DFT method.

    PubMed

    Zeng, Xiao-Lan; Wang, Hong-Jun; Wang, Yan

    2012-02-01

    The possible molecular geometries of 134 halogenated methyl-phenyl ethers were optimized at the B3LYP/6-31G(*) level with the Gaussian 98 program. The calculated structural parameters were taken as theoretical descriptors to establish two novel QSPR models for predicting aqueous solubility (-lgS(w,l)) and n-octanol/water partition coefficient (lgK(ow)) of halogenated methyl-phenyl ethers. The two models achieved in this work both contain three variables: energy of the lowest unoccupied molecular orbital (E(LUMO)), most positive atomic partial charge in the molecule (q(+)), and quadrupole moment (Q(yy) or Q(zz)). Their R values are 0.992 and 0.970, and their standard errors of estimate in modeling (SD) are 0.132 and 0.178, respectively. The results of leave-one-out (LOO) cross-validation for the training set and validation with external test sets both show that the models obtained exhibit optimum stability and good predictive power. We suggest that the two QSPR models derived here can be used to predict S(w,l) and K(ow) accurately for untested halogenated methyl-phenyl ether congeners. Copyright © 2011 Elsevier Ltd. All rights reserved.

  11. Automatic Molecular Design using Evolutionary Techniques

    NASA Technical Reports Server (NTRS)

    Globus, Al; Lawton, John; Wipke, Todd; Saini, Subhash (Technical Monitor)

    1998-01-01

    Molecular nanotechnology is the precise, three-dimensional control of materials and devices at the atomic scale. An important part of nanotechnology is the design of molecules for specific purposes. This paper describes early results using genetic software techniques to automatically design molecules under the control of a fitness function. The fitness function must be capable of determining which of two arbitrary molecules is better for a specific task. The software begins by generating a population of random molecules. The population is then evolved towards greater fitness by randomly combining parts of the better individuals to create new molecules. These new molecules then replace some of the worst molecules in the population. The unique aspect of our approach is that we apply genetic crossover to molecules represented by graphs, i.e., sets of atoms and the bonds that connect them. We present evidence suggesting that crossover alone, operating on graphs, can evolve any possible molecule given an appropriate fitness function and a population containing both rings and chains. Prior work evolved strings or trees that were subsequently processed to generate molecular graphs. In principle, genetic graph software should be able to evolve other graph representable systems such as circuits, transportation networks, metabolic pathways, computer networks, etc.
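
    The evolutionary loop described above can be sketched as follows. This is a minimal illustration, not the authors' software: the graph encoding (a list of element symbols plus a set of index pairs), the fragment-and-join crossover, the steady-state replacement of the single worst individual, and the fitness function `toy_fitness` are all assumptions made for the example. For brevity it evolves acyclic molecules only, whereas the abstract stresses populations containing both rings and chains.

```python
import random

def random_molecule(rng, n_atoms, elements="CNO"):
    """Random acyclic molecule: each new atom bonds to one earlier atom."""
    atoms = [rng.choice(elements) for _ in range(n_atoms)]
    bonds = {(rng.randrange(i), i) for i in range(1, n_atoms)}
    return atoms, bonds

def fragment(rng, mol):
    """Grow a connected fragment of random size from a random seed atom."""
    atoms, bonds = mol
    target = rng.randint(1, len(atoms))
    chosen = [rng.randrange(len(atoms))]
    while len(chosen) < target:
        # Atoms outside the fragment that are bonded to an atom inside it.
        frontier = sorted({b if a in chosen else a
                           for a, b in bonds if (a in chosen) != (b in chosen)})
        if not frontier:
            break
        chosen.append(rng.choice(frontier))
    index = {old: new for new, old in enumerate(chosen)}
    frag_bonds = {tuple(sorted((index[a], index[b])))
                  for a, b in bonds if a in index and b in index}
    return [atoms[i] for i in chosen], frag_bonds

def crossover(rng, parent_a, parent_b):
    """Join one fragment from each parent with a single new bond."""
    atoms_a, bonds_a = fragment(rng, parent_a)
    atoms_b, bonds_b = fragment(rng, parent_b)
    offset = len(atoms_a)
    bonds = set(bonds_a) | {(a + offset, b + offset) for a, b in bonds_b}
    bonds.add((rng.randrange(offset), offset + rng.randrange(len(atoms_b))))
    return atoms_a + atoms_b, bonds

def toy_fitness(mol):
    """Hypothetical objective: prefer 8 atoms, exactly 2 of them nitrogen."""
    atoms, _ = mol
    return -abs(len(atoms) - 8) - abs(atoms.count("N") - 2)

def evolve(fitness, generations=200, pop_size=30, seed=0):
    rng = random.Random(seed)
    population = [random_molecule(rng, rng.randint(2, 6)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        better = population[: pop_size // 2]
        # The child of two of the better individuals replaces the worst one.
        population[-1] = crossover(rng, rng.choice(better), rng.choice(better))
    return max(population, key=fitness)

best = evolve(toy_fitness)
print(best[0], sorted(best[1]), toy_fitness(best))
```

    Because only the worst individual is overwritten in each generation, the best fitness in the population can never decrease in this sketch; a real fitness function would score molecules for the specific design task.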

  12. A revised set of values of single-bond radii derived from the observed interatomic distances in metals by correction for bond number and resonance energy

    PubMed Central

    Pauling, Linus; Kamb, Barclay

    1986-01-01

    An earlier discussion [Pauling, L. (1947) J. Am. Chem. Soc. 69, 542] of observed bond lengths in elemental metals with correction for bond number and resonance energy led to a set of single-bond metallic radii with values usually somewhat less than the corresponding values obtained from molecules and complex ions. A theory of resonating covalent bonds has now been developed that permits calculation of the number of resonance structures per atom and of the effective resonance energy per bond. With this refined method of correcting the observed bond lengths for the effect of resonance energy, a new set of single-bond covalent radii, in better agreement with values from molecules and complex ions, has been constructed. PMID:16593698

  13. Computational laser intensity stabilisation for organic molecule concentration estimation in low-resource settings

    NASA Astrophysics Data System (ADS)

    Haider, Shahid A.; Kazemzadeh, Farnoud; Wong, Alexander

    2017-03-01

    An ideal laser is a useful tool for the analysis of biological systems. In particular, the polarization property of lasers can allow for the concentration of important organic molecules in the human body, such as proteins, amino acids, lipids, and carbohydrates, to be estimated. However, lasers do not always work as intended and there can be effects such as mode hopping and thermal drift that can cause time-varying intensity fluctuations. The causes of these effects can be from the surrounding environment, where either an unstable current source is used or the temperature of the surrounding environment is not temporally stable. This intensity fluctuation can cause bias and error in typical organic molecule concentration estimation techniques. In a low-resource setting where cost must be limited and where environmental factors, like unregulated power supplies and temperature, cannot be controlled, the hardware required to correct for these intensity fluctuations can be prohibitive. We propose a method for computational laser intensity stabilisation that uses Bayesian state estimation to correct for the time-varying intensity fluctuations from electrical and thermal instabilities without the use of additional hardware. This method will allow for consistent intensities across all polarization measurements for accurate estimates of organic molecule concentrations.
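
    One simple form of Bayesian state estimation is a scalar Kalman filter, sketched below. The abstract does not say which estimator the authors use, so the random-walk intensity model, the variances `process_var` and `meas_var`, and the simulated drift and detector noise are all assumptions chosen for illustration.

```python
import math
import random

def kalman_track(readings, process_var=1e-4, meas_var=1e-2):
    """Track a slowly drifting intensity from noisy readings (random-walk model)."""
    estimate, variance = readings[0], meas_var
    estimates = [estimate]
    for z in readings[1:]:
        variance += process_var                 # predict: intensity may have drifted
        gain = variance / (variance + meas_var)
        estimate += gain * (z - estimate)       # update with the new reading
        variance *= 1.0 - gain
        estimates.append(estimate)
    return estimates

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Simulated data: slow thermal-style drift plus detector noise (hypothetical values).
rng = random.Random(1)
true_intensity = [1.0 + 0.2 * math.sin(2 * math.pi * t / 500) for t in range(1000)]
readings = [i + rng.gauss(0.0, 0.1) for i in true_intensity]
estimates = kalman_track(readings)

print("raw MSE:", mse(readings, true_intensity))
print("filtered MSE:", mse(estimates, true_intensity))
```

    Dividing each polarization measurement by the tracked intensity at the same time step would then put all measurements on a common intensity scale, which is the stated goal of the stabilisation.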

  14. The study of electrical conductivity of DNA molecules by scanning tunneling spectroscopy

    NASA Astrophysics Data System (ADS)

    Sharipov, T. I.; Bakhtizin, R. Z.

    2017-10-01

    Interest in the processes of charge transport in DNA molecules is very high because of their prospective use in nanoelectronics. An original sample preparation for studying the electrical conductivity of DNA molecules by scanning tunneling spectroscopy has been proposed and tested. The DNA molecules immobilized on a gold surface have been imaged clearly, and their current-voltage curves have been measured.

  15. Plasmonics and SERS activity of post-transition metal nanoparticles

    NASA Astrophysics Data System (ADS)

    Bezerra, A. G.; Machado, T. N.; Woiski, T. D.; Turchetti, D. A.; Lenz, J. A.; Akcelrud, L.; Schreiner, W. H.

    2018-05-01

    Nanoparticles of the post-transition metals, In, Sn, Pb, and Bi, and of the metalloid Sb were produced by laser ablation synthesis in solution (LASiS) and tested for localized surface plasmon resonances (LSPR) and surface-enhanced Raman scattering (SERS). The nanoparticles were characterized by UV-Vis optical absorption, dynamic light scattering (DLS), and transmission electron microscopy (TEM). Several organic and biological molecules were tested, and SERS activity was demonstrated for all tested nanoparticles and molecules. The Raman enhancement factor for each nanoparticle class and molecule was experimentally determined. The search for new plasmonic nanostructures is important mainly for life sciences-related applications and this study expands the range of SERS active systems.

  16. DNA molecule stretching through thermo-electrophoresis and thermal convection in a heated converging-diverging microchannel.

    PubMed

    Hsieh, Shou-Shing; Chen, Jyun-Hong; Tsai, Cheng-Fung

    2013-02-18

    A novel DNA molecule stretching technique is developed and tested herein. Through a heated converging-diverging microchannel, thermal convection and thermophoresis induced by regional heating are shown to significantly elongate single DNA molecules; they are visualized via confocal laser scanning microscopy. In addition, electrophoretic stretching is implemented to examine the hybrid effect on the conformation and dynamics of single DNA molecules. The physical properties of the DNA molecules are secured via experimental measurements.

  17. Midbond basis functions for weakly bound complexes

    NASA Astrophysics Data System (ADS)

    Shaw, Robert A.; Hill, J. Grant

    2018-06-01

    Weakly bound systems present a difficult problem for conventional atom-centred basis sets due to large separations, necessitating the use of large, computationally expensive bases. This can be remedied by placing a small number of functions in the region between molecules in the complex. We present compact sets of optimised midbond functions for a range of complexes involving noble gases, alkali metals and small molecules for use in high accuracy coupled-cluster calculations, along with a more robust procedure for their optimisation. It is shown that excellent results are possible with double-zeta quality orbital basis sets when a few midbond functions are added, improving both the interaction energy and the equilibrium bond lengths of a series of noble gas dimers by 47% and 8%, respectively. When used in conjunction with explicitly correlated methods, near complete basis set limit accuracy is readily achievable at a fraction of the cost that using a large basis would entail. General purpose auxiliary sets are developed to allow explicitly correlated midbond function studies to be carried out, making it feasible to perform very high accuracy calculations on weakly bound complexes.

  18. Rapid method to detect duplex formation in sequencing by hybridization methods

    DOEpatents

    Mirzabekov, A.D.; Timofeev, E.N.; Florentiev, V.L.; Kirillov, E.V.

    1999-01-19

    A method for determining the existence of duplexes of oligonucleotide complementary molecules is provided. A plurality of immobilized oligonucleotide molecules, each of a specific length and each having a specific base sequence, is contacted with complementary, single stranded oligonucleotide molecules to form a duplex. Each duplex facilitates intercalation of a fluorescent dye between the base planes of the duplex. The invention also provides for a method for constructing oligonucleotide matrices comprising confining light sensitive fluid to a surface and exposing the light-sensitive fluid to a light pattern. This causes the fluid exposed to the light to coalesce into discrete units and adhere to the surface. This places each of the units in contact with a set of different oligonucleotide molecules so as to allow the molecules to disperse into the units. 13 figs.

  19. Predicting human olfactory perception from chemical features of odor molecules.

    PubMed

    Keller, Andreas; Gerkin, Richard C; Guan, Yuanfang; Dhurandhar, Amit; Turu, Gabor; Szalai, Bence; Mainland, Joel D; Ihara, Yusuke; Yu, Chung Wen; Wolfinger, Russ; Vens, Celine; Schietgat, Leander; De Grave, Kurt; Norel, Raquel; Stolovitzky, Gustavo; Cecchi, Guillermo A; Vosshall, Leslie B; Meyer, Pablo

    2017-02-24

    It is still not possible to predict whether a given molecule will have a perceived odor or what olfactory percept it will produce. We therefore organized the crowd-sourced DREAM Olfaction Prediction Challenge. Using a large olfactory psychophysical data set, teams developed machine-learning algorithms to predict sensory attributes of molecules based on their chemoinformatic features. The resulting models accurately predicted odor intensity and pleasantness and also successfully predicted 8 among 19 rated semantic descriptors ("garlic," "fish," "sweet," "fruit," "burnt," "spices," "flower," and "sour"). Regularized linear models performed nearly as well as random forest-based ones, with a predictive accuracy that closely approaches a key theoretical limit. These models help to predict the perceptual qualities of virtually any molecule with high accuracy and also reverse-engineer the smell of a molecule. Copyright © 2017, American Association for the Advancement of Science.

  20. Modeling Molecules

    NASA Technical Reports Server (NTRS)

    2000-01-01

    The molecule modeling method known as Multibody Order (N) Dynamics, or MBO(N)D, was developed by Moldyn, Inc. at Goddard Space Flight Center through funding provided by the SBIR program. The software can model the dynamics of molecules through technology which simulates low-frequency molecular motions and properties, such as movements among a molecule's constituent parts. With MBO(N)D, a molecule is substructured into a set of interconnected rigid and flexible bodies. These bodies replace the computational burden of mapping individual atoms. Moldyn's technology cuts computation time while increasing accuracy. The MBO(N)D technology is available as Insight II 97.0 from Molecular Simulations, Inc. Currently the technology is used to account for forces on spacecraft parts and to perform molecular analyses for pharmaceutical purposes. It permits the solution of molecular dynamics problems on a moderate workstation, as opposed to on a supercomputer.

  1. Rapid method to detect duplex formation in sequencing by hybridization methods

    DOEpatents

    Mirzabekov, Andrei Darievich; Timofeev, Edward Nikolaevich; Florentiev, Vladimer Leonidovich; Kirillov, Eugene Vladislavovich

    1999-01-01

    A method for determining the existence of duplexes of oligonucleotide complementary molecules is provided whereby a plurality of immobilized oligonucleotide molecules, each of a specific length and each having a specific base sequence, is contacted with complementary, single stranded oligonucleotide molecules to form a duplex so as to facilitate intercalation of a fluorescent dye between the base planes of the duplex. The invention also provides for a method for constructing oligonucleotide matrices comprising confining light sensitive fluid to a surface, exposing said light-sensitive fluid to a light pattern so as to cause the fluid exposed to the light to coalesce into discrete units and adhere to the surface; and contacting each of the units with a set of different oligonucleotide molecules so as to allow the molecules to disperse into the units.

  2. Method and apparatus for detecting and quantifying bacterial spores on a surface

    NASA Technical Reports Server (NTRS)

    Ponce, Adrian (Inventor)

    2009-01-01

    A method and an apparatus for detecting and quantifying bacterial spores on a surface. In accordance with the method: bacterial spores are transferred from a place of origin to a test surface, the test surface comprises lanthanide ions. Aromatic molecules are released from the bacterial spores; a complex of the lanthanide ions and aromatic molecules is formed on the test surface, the complex is excited to generate a characteristic luminescence on the test surface; the luminescence on the test surface is detected and quantified.

  3. Method and Apparatus for Detecting and Quantifying Bacterial Spores on a Surface

    NASA Technical Reports Server (NTRS)

    Ponce, Adrian (Inventor)

    2016-01-01

    A method and an apparatus for detecting and quantifying bacterial spores on a surface. In accordance with the method: bacterial spores are transferred from a place of origin to a test surface, the test surface comprises lanthanide ions. Aromatic molecules are released from the bacterial spores; a complex of the lanthanide ions and aromatic molecules is formed on the test surface, the complex is excited to generate a characteristic luminescence on the test surface; the luminescence on the test surface is detected and quantified.

  4. Combined 3D-QSAR and molecular docking study on 7,8-dialkyl-1,3-diaminopyrrolo-[3,2-f] Quinazoline series compounds to understand the binding mechanism of DHFR inhibitors

    NASA Astrophysics Data System (ADS)

    Aouidate, Adnane; Ghaleb, Adib; Ghamali, Mounir; Chtita, Samir; Choukrad, M'barek; Sbai, Abdelouahid; Bouachrine, Mohammed; Lakhlifi, Tahar

    2017-07-01

    A series of nineteen DHFR inhibitors was studied based on the combination of two computational techniques, namely three-dimensional quantitative structure activity relationship (3D-QSAR) and molecular docking. The comparative molecular field analysis (CoMFA) and comparative molecular similarity index analysis (CoMSIA) models were developed using 19 molecules having pIC50 ranging from 9.244 to 5.839. The best CoMFA and CoMSIA models show conventional determination coefficients R^2 of 0.96 and 0.93 as well as leave-one-out cross-validation determination coefficients Q^2 of 0.64 and 0.72, respectively. The predictive ability of those models was evaluated by external validation using a test set of five compounds, with predicted determination coefficients R^2(test) of 0.92 and 0.94, respectively. The binding mode between compounds of this kind and the DHFR enzyme, in addition to the key amino acid residues, was explored by molecular docking simulation. Contour maps and molecular docking identified the nature of the R1 and R2 substituents at the pyrazole moiety as the important features for the optimization of the binding affinity to the DHFR receptor. Given the good concordance between the CoMFA/CoMSIA contour maps and the docking results, the obtained information was used to design novel molecules.
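
    The three validation statistics quoted above (the conventional determination coefficient, its leave-one-out counterpart, and the external test-set value) can be illustrated with a short sketch. It assumes a single-descriptor least-squares line rather than the PLS models used in CoMFA/CoMSIA, and every descriptor and pIC50 value below is made up for the example. The leave-one-out coefficient is computed here against the full training-set mean, which is one common convention among several.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def r_squared(ys, preds):
    """1 - SS_res / SS_tot, with SS_tot taken about the mean of ys."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def loo_q_squared(xs, ys):
    """Refit with each compound left out and predict the omitted one."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return r_squared(ys, preds)

# Hypothetical training set: one descriptor value and one pIC50 per compound.
xs = [0.5, 1.1, 1.9, 2.4, 3.2, 3.8, 4.5, 5.1]
ys = [5.9, 6.3, 6.8, 7.4, 7.6, 8.3, 8.6, 9.2]
# Hypothetical external test set, never used for fitting.
xs_test = [0.9, 2.0, 3.0, 4.1, 4.9]
ys_test = [6.2, 7.0, 7.5, 8.5, 9.0]

slope, intercept = fit_line(xs, ys)
r2_train = r_squared(ys, [slope * x + intercept for x in xs])
q2_loo = loo_q_squared(xs, ys)
r2_test = r_squared(ys_test, [slope * x + intercept for x in xs_test])
print(r2_train, q2_loo, r2_test)
```

    For a least-squares fit, the leave-one-out value can never exceed the conventional training value, which is why a gap such as 0.96 versus 0.64 is read as a measure of how much the fit depends on individual compounds.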

  5. Local CC2 response method for triplet states based on Laplace transform: excitation energies and first-order properties.

    PubMed

    Freundorfer, Katrin; Kats, Daniel; Korona, Tatiana; Schütz, Martin

    2010-12-28

    A new multistate local CC2 response method for calculating excitation energies and first-order properties of excited triplet states in extended molecular systems is presented. The Laplace transform technique is employed to partition the left/right local CC2 eigenvalue problems as well as the linear equations determining the Lagrange multipliers needed for the properties. The doubles part in the equations can then be inverted on-the-fly and only effective equations for the singles part must be solved iteratively. The local approximation presented here is adaptive and state-specific. The density-fitting method is utilized to approximate the electron-repulsion integrals. The accuracy of the new method is tested by comparison to canonical reference values for a set of 12 test molecules and 62 excited triplet states. As an illustrative application example, the lowest four triplet states of 3-(5-(5-(4-(bis(4-(hexyloxy)phenyl)amino)phenyl)thiophene-2-yl)thiophene-2-yl)-2-cyanoacrylic acid, an organic sensitizer for solar-cell applications, are computed in the present work. No triplet charge-transfer states are detected among these states. This situation contrasts with the singlet states of this molecule, where the lowest singlet state has been recently found to correspond to an excited state with a pronounced charge-transfer character having a large transition strength.

  6. Bioturbo similarity searching: combining chemical and biological similarity to discover structurally diverse bioactive molecules.

    PubMed

    Wassermann, Anne Mai; Lounkine, Eugen; Glick, Meir

    2013-03-25

    Virtual screening using bioactivity profiles has become an integral part of currently applied hit finding methods in pharmaceutical industry. However, a significant drawback of this approach is that it is only applicable to compounds that have been biologically tested in the past and have sufficient activity annotations for meaningful profile comparisons. Although bioactivity data generated in pharmaceutical institutions are growing on an unprecedented scale, the number of biologically annotated compounds still covers only a minuscule fraction of chemical space. For a newly synthesized compound or an isolated natural product to be biologically characterized across multiple assays, it may take a considerable amount of time. Consequently, this chemical matter will not be included in virtual screening campaigns based on bioactivity profiles. To overcome this problem, we herein introduce bioturbo similarity searching that uses chemical similarity to map molecules without biological annotations into bioactivity space and then searches for biologically similar compounds in this reference system. In benchmark calculations on primary screening data, we demonstrate that our approach generally achieves higher hit rates and identifies structurally more diverse compounds than approaches using chemical information only. Furthermore, our method is able to discover hits with novel modes of inhibition that traditional 2D and 3D similarity approaches are unlikely to discover. Test calculations on a set of natural products reveal the practical utility of the approach for identifying novel and synthetically more accessible chemical matter.
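
    The two-stage idea, chemical similarity first and biological similarity second, can be sketched as below. The abstract does not give the mapping in detail, so the choices here are assumptions: fingerprints are plain sets of feature ids compared by Tanimoto similarity, the query's bioactivity profile is inferred as the mean profile of its `k` chemically nearest annotated neighbours, and profiles are compared by cosine similarity. The four reference compounds are invented for the example.

```python
import math

def tanimoto(a, b):
    """Tanimoto similarity between two fingerprint feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def cosine(u, v):
    """Cosine similarity between two bioactivity profiles."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def inferred_profile(query_fp, references, k=2):
    """Map an unannotated query into bioactivity space via its chemical neighbours."""
    nearest = sorted(references, key=lambda r: tanimoto(query_fp, r["fp"]),
                     reverse=True)[:k]
    n_assays = len(nearest[0]["profile"])
    return [sum(r["profile"][j] for r in nearest) / len(nearest)
            for j in range(n_assays)]

def bioturbo_rank(query_fp, references, k=2):
    """Rank annotated compounds by biological similarity to the inferred profile."""
    profile = inferred_profile(query_fp, references, k)
    return sorted(references, key=lambda r: cosine(profile, r["profile"]),
                  reverse=True)

# Hypothetical annotated compounds: fingerprint features and activity in 3 assays.
references = [
    {"name": "A", "fp": {1, 2, 3, 4}, "profile": [0.9, 0.1, 0.0]},
    {"name": "B", "fp": {1, 2, 3, 5}, "profile": [0.8, 0.0, 0.1]},
    {"name": "C", "fp": {10, 11, 12}, "profile": [0.85, 0.05, 0.05]},
    {"name": "D", "fp": {20, 21}, "profile": [0.0, 0.9, 0.1]},
]
query_fp = {1, 2, 3, 6}  # newly synthesized compound with no assay data

ranked = bioturbo_rank(query_fp, references)
print([r["name"] for r in ranked])
```

    In this toy data set, compound C shares no fingerprint features with the query, so a purely chemical search would score it at zero, yet it ranks first in bioactivity space because its profile matches that of the query's chemical neighbours.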

  7. Explicitly correlated coupled-cluster theory using cusp conditions. II. Treatment of connected triple excitations.

    PubMed

    Köhn, Andreas

    2010-11-07

    The coupled-cluster singles and doubles method augmented with single Slater-type correlation factors (CCSD-F12) determined by the cusp conditions (also denoted as SP ansatz) yields results close to the basis set limit with only small overhead compared to conventional CCSD. Quantitative calculations on many-electron systems, however, require at least the inclusion of the effect of connected triple excitations. In this contribution, the recently proposed [A. Köhn, J. Chem. Phys. 130, 131101 (2009)] extended SP ansatz and its application to the noniterative triples correction CCSD(T) are reviewed. The approach allows explicit correlation to be included in connected triple excitations without introducing additional unknown parameters. The explicit expressions are presented and analyzed, and possible simplifications to arrive at a computationally efficient scheme are suggested. Numerical tests based on an implementation obtained by an automated approach are presented. Using a partial wave expansion for the neon atom, we can show that the proposed ansatz indeed leads to the expected (L_max+1)^(-7) convergence of the noniterative triples correction, where L_max is the maximum angular momentum in the orbital expansion. Further results are reported for a test set of 29 molecules, employing Peterson's F12-optimized basis sets. We find that the customary approach of using the conventional noniterative triples correction on top of a CCSD-F12 calculation leads to significant basis set errors. This, however, is not always directly visible for total CCSD(T) energies due to fortuitous error compensation. The new approach offers a thoroughly explicitly correlated CCSD(T)-F12 method with improved basis set convergence of the triples contributions to both total and relative energies.

  8. The Laser Cooling and Magneto-Optical Trapping of the YO Molecule

    NASA Astrophysics Data System (ADS)

    Yeo, Mark

    Laser cooling and magneto-optical trapping of neutral atoms have revolutionized the field of atomic physics by providing an elegant and efficient method to produce dense samples of ultracold atoms. Molecules, with their strong anisotropic dipolar interactions, promise to unlock even richer phenomena. However, due to their additional vibrational and rotational degrees of freedom, laser cooling techniques have only been extended to a small set of diatomic molecules. In this thesis, we demonstrate the first magneto-optical trapping of a diatomic molecule using a quasi-cycling transition and an oscillating quadrupole magnetic field. The transverse temperature of a cryogenically produced YO beam was reduced from 25 mK to 10 mK via Doppler cooling and further reduced to 2 mK with the addition of magneto-optical trapping forces. The optical cycling in YO is complicated by the presence of an intermediate electronic state, as decays through this state lead to optical pumping into dark rotational states. Thus, we also demonstrate the mixing of rotational states in the ground electronic state using microwave radiation. This technique greatly enhances optical cycling, leading to a factor of 4 increase in the YO beam fluorescence, and is used in conjunction with a frequency-modulated and chirped continuous-wave laser to longitudinally slow the YO beam. We generate YO molecules below 10 m/s that are directly loadable into a three-dimensional magneto-optical trap. This mixing technique provides an alternative to maintaining rotational closure and should extend laser cooling to a larger set of molecules.

  9. The Chemistry of Multiply Deuterated Molecules in Protoplanetary Disks: I. The Outer Disk

    NASA Technical Reports Server (NTRS)

    Willacy, K.

    2007-01-01

    We present new models of the deuterium chemistry in protoplanetary disks, including, for the first time, multiply deuterated species. We use these models to explore whether observations in combination with models can give us clues as to which desorption processes occur in disks. We find, in common with other authors, that photodesorption can allow strongly bound molecules such as HDO to exist in the gas phase in a layer above the midplane. Models including this process give the best agreement with the observations. In the midplane, cosmic-ray heating can desorb weakly bound molecules such as CO and N2. We find the observations suggest that N2 is gaseous in this region, but that CO must be retained on the grains to account for the observed DCO+/HCO+. This could be achieved by CO having a higher binding energy than N2 (as may be the case when these molecules are accreted onto water ice) or by a smaller cosmic-ray desorption rate for CO than assumed here, as suggested by recent theoretical work. For gaseous molecules the calculated deuteration can be greatly changed by chemical processing in the disk from the input molecular cloud values. On the grains singly deuterated species tend to retain the D/H ratio set in the molecular cloud, whereas multiply deuterated species are more affected by the disk chemistry. Consequently, the D/H ratios observed in comets may be partly set in the parent cloud and partly in the disk, depending on the molecule.

  10. Block-localized wavefunction (BLW) method at the density functional theory (DFT) level.

    PubMed

    Mo, Yirong; Song, Lingchun; Lin, Yuchun

    2007-08-30

    The block-localized wavefunction (BLW) approach is an ab initio valence bond (VB) method incorporating the efficiency of molecular orbital (MO) theory. It can generate the wavefunction for a resonance structure or diabatic state self-consistently by partitioning the overall electrons and primitive orbitals into several subgroups and expanding each block-localized molecular orbital in only one subspace. Although block-localized molecular orbitals in the same subspace are constrained to be orthogonal (a feature of MO theory), orbitals between different subspaces are generally nonorthogonal (a feature of VB theory). The BLW method is particularly useful in the quantification of the electron delocalization (resonance) effect within a molecule and the charge-transfer effect between molecules. In this paper, we extend the BLW method to the density functional theory (DFT) level and implement the BLW-DFT method in the quantum mechanical software GAMESS. Test applications to the pi conjugation in the planar allyl radical and ions with the basis sets of 6-31G(d), 6-31+G(d), 6-311+G(d,p), and cc-pVTZ show that the basis set dependency is insignificant. In addition, the BLW-DFT method can be used to elucidate the nature of intermolecular interactions. Examples of pi-cation interactions and solute-solvent interactions will be presented and discussed. By expressing each diabatic state with one BLW, the BLW method can be further used to study chemical reactions and electron-transfer processes whose potential energy surfaces are typically described by two or more diabatic states.

  11. VUV absorption spectroscopy measurements of the role of fast neutral atoms in a high-power gap breakdown

    NASA Astrophysics Data System (ADS)

    Filuk, A. B.; Bailey, J. E.; Cuneo, M. E.; Lake, P. W.; Nash, T. J.; Noack, D. D.; Maron, Y.

    2000-12-01

    The maximum power achieved in a wide variety of high-power devices, including electron and ion diodes, z pinches, and microwave generators, is presently limited by anode-cathode gap breakdown. A frequently discussed hypothesis for this effect is ionization of fast neutral atoms injected throughout the anode-cathode gap during the power pulse. We describe a newly developed diagnostic tool that provides a direct test of this hypothesis. Time-resolved vacuum-ultraviolet absorption spectroscopy is used to directly probe fast neutral atoms with 1-mm spatial resolution in the 10-mm anode-cathode gap of the SABRE 5 MV, 1 TW applied-B ion diode. Absorption spectra collected during Ar RF glow discharges and with CO2 gas fills confirm the reliability of the diagnostic technique. Throughout the 50-100 ns ion diode pulses no measurable neutral absorption was seen, setting upper limits of (0.12-1.5)×10^14 cm^-3 for ground-state fast neutral atom densities of H, C, N, O, and F. The absence of molecular absorption bands also sets upper limits of (0.16-1.2)×10^15 cm^-3 for common simple molecules. These limits are low enough to rule out ionization of fast neutral atoms as a breakdown mechanism. Breakdown due to ionization of molecules is also found to be unlikely. This technique can now be applied to quantify the role of neutral atoms in other high-power devices.

  12. Quantum chemical approach to estimating the thermodynamics of metabolic reactions.

    PubMed

    Jinich, Adrian; Rappoport, Dmitrij; Dunn, Ian; Sanchez-Lengeling, Benjamin; Olivares-Amaya, Roberto; Noor, Elad; Even, Arren Bar; Aspuru-Guzik, Alán

    2014-11-12

    Thermodynamics plays an increasingly important role in modeling and engineering metabolism. We present the first nonempirical computational method for estimating standard Gibbs reaction energies of metabolic reactions based on quantum chemistry, which can help fill in the gaps in the existing thermodynamic data. When applied to a test set of reactions from core metabolism, the quantum chemical approach is comparable in accuracy to group contribution methods for isomerization and group transfer reactions and for reactions not including multiply charged anions. The errors in standard Gibbs reaction energy estimates are correlated with the charges of the participating molecules. The quantum chemical approach is amenable to systematic improvements and holds potential for providing thermodynamic data for all of metabolism.

  13. Enantioselective Total Synthesis of Antibiotic CJ-16,264, Synthesis and Biological Evaluation of Designed Analogues, and Discovery of Highly Potent and Simpler Antibacterial Agents.

    PubMed

    Nicolaou, K C; Pulukuri, Kiran Kumar; Rigol, Stephan; Buchman, Marek; Shah, Akshay A; Cen, Nicholas; McCurry, Megan D; Beabout, Kathryn; Shamoo, Yousif

    2017-11-08

    An improved and enantioselective total synthesis of antibiotic CJ-16,264 through a practical kinetic resolution and an iodolactonization reaction to form the iodo pyrrolizidinone fragment of the molecule is described. A series of racemic and enantiopure analogues of CJ-16,264 was designed and synthesized through the developed synthetic technologies and tested against drug-resistant bacterial strains. These studies led to interesting structure-activity relationships and the identification of a number of simpler, and yet equipotent, or even more potent, antibacterial agents than the natural product, thereby setting the foundation for further investigations in the quest for new anti-infective drugs.

  14. Spin-orbit splitted excited states using explicitly-correlated equation-of-motion coupled-cluster singles and doubles eigenvectors

    NASA Astrophysics Data System (ADS)

    Bokhan, Denis; Trubnikov, Dmitrii N.; Perera, Ajith; Bartlett, Rodney J.

    2018-04-01

    An explicitly correlated method for calculating excited states with spin-orbit couplings has been formulated and implemented. The approach uses the left and right eigenvectors of the equation-of-motion coupled-cluster model, which is based on the linearly approximated explicitly correlated coupled-cluster singles and doubles [CCSD(F12)] method. The spin-orbit interactions are introduced by using the spin-orbit mean field (SOMF) approximation of the Breit-Pauli Hamiltonian. Numerical tests for several atoms and molecules show good agreement between the explicitly correlated results and the corresponding values calculated in the complete basis set (CBS) limit; highly accurate excitation energies can be obtained already at the triple-ζ level.

  15. On the performance of large Gaussian basis sets for the computation of total atomization energies

    NASA Technical Reports Server (NTRS)

    Martin, J. M. L.

    1992-01-01

    The total atomization energies of a number of molecules have been computed using an augmented coupled-cluster method and (5s4p3d2f1g) and (4s3p2d1f) atomic natural orbital (ANO) basis sets, as well as the correlation consistent valence triple zeta plus polarization (cc-pVTZ) and correlation consistent valence quadruple zeta plus polarization (cc-pVQZ) basis sets. The performance of ANO and correlation consistent basis sets is comparable throughout, although the latter can result in significant CPU time savings. Whereas the inclusion of g functions has significant effects on the computed ΣD_e values, chemical accuracy is still not reached for molecules involving multiple bonds. A Gaussian-1 (G1) type correction lowers the error, but not much beyond the accuracy of the G1 model itself. Using separate corrections for sigma bonds, pi bonds, and valence pairs brings down the mean absolute error to less than 1 kcal/mol for the spdf basis sets, and about 0.5 kcal/mol for the spdfg basis sets. Some conclusions on the success of the Gaussian-1 and Gaussian-2 models are drawn.

  16. Three-dimensional quantitative structure-activity relationship (3D QSAR) and pharmacophore elucidation of tetrahydropyran derivatives as serotonin and norepinephrine transporter inhibitors

    NASA Astrophysics Data System (ADS)

    Kharkar, Prashant S.; Reith, Maarten E. A.; Dutta, Aloke K.

    2008-01-01

    Three-dimensional quantitative structure-activity relationship (3D QSAR) using comparative molecular field analysis (CoMFA) was performed on a series of substituted tetrahydropyran (THP) derivatives possessing serotonin (SERT) and norepinephrine (NET) transporter inhibitory activities. The study aimed to rationalize the potency of these inhibitors for SERT and NET as well as the observed selectivity differences for NET over SERT. The dataset consisted of 29 molecules, of which 23 molecules were used as the training set for deriving CoMFA models for SERT and NET uptake inhibitory activities. Superimpositions were performed using atom-based fitting and 3-point pharmacophore-based alignment. Two charge calculation methods, Gasteiger-Hückel and semiempirical PM3, were tried. Both alignment methods were analyzed in terms of their predictive abilities and produced comparable results with high internal and external predictivities. The models obtained using the 3-point pharmacophore-based alignment outperformed the models with atom-based fitting in terms of relevant statistics and interpretability of the generated contour maps. Steric fields dominated electrostatic fields in terms of contribution. The selectivity analysis (NET over SERT), although it yielded models with good internal predictivity, showed very poor external test set predictions. The analysis was repeated with 24 molecules after systematically excluding so-called outliers (5 out of 29) from the model derivation process. The resulting CoMFA model using the atom-based fitting exhibited good statistics and was able to explain most of the selectivity (NET over SERT)-discriminating factors. The presence of an -OH substituent on the THP ring was found to be one of the most important factors governing the NET selectivity over SERT. Thus, a 4-point NET-selective pharmacophore, after introducing this newly found H-bond donor/acceptor feature in addition to the initial 3-point pharmacophore, was proposed.

  17. Less is more: Sampling chemical space with active learning

    NASA Astrophysics Data System (ADS)

    Smith, Justin S.; Nebgen, Ben; Lubbers, Nicholas; Isayev, Olexandr; Roitberg, Adrian E.

    2018-06-01

    The development of accurate and transferable machine learning (ML) potentials for predicting molecular energetics is a challenging task. The process of data generation to train such ML potentials is a task neither well understood nor researched in detail. In this work, we present a fully automated approach for the generation of datasets with the intent of training universal ML potentials. It is based on the concept of active learning (AL) via Query by Committee (QBC), which uses the disagreement between an ensemble of ML potentials to infer the reliability of the ensemble's prediction. QBC allows the presented AL algorithm to automatically sample regions of chemical space where the ML potential fails to accurately predict the potential energy. AL improves the overall fitness of ANAKIN-ME (ANI) deep learning potentials in rigorous test cases by mitigating human biases in deciding what new training data to use. AL also reduces the training set size to a fraction of the data required when using naive random sampling techniques. To provide validation of our AL approach, we develop the COmprehensive Machine-learning Potential (COMP6) benchmark (publicly available on GitHub) which contains a diverse set of organic molecules. Active learning-based ANI potentials outperform the original random sampled ANI-1 potential with only 10% of the data, while the final active learning-based model vastly outperforms ANI-1 on the COMP6 benchmark after training to only 25% of the data. Finally, we show that our proposed AL technique develops a universal ANI potential (ANI-1x) that provides accurate energy and force predictions on the entire COMP6 benchmark. This universal ML potential achieves a level of accuracy on par with the best ML potentials for single molecules or materials, while remaining applicable to the general class of organic molecules composed of the elements CHNO.
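
    The Query by Committee idea described in this abstract can be illustrated with a minimal sketch. It assumes a toy committee of three one-dimensional functions standing in for trained ML potentials and uses the population standard deviation of their predictions as the disagreement measure; the names `committee_disagreement` and `select_queries`, the threshold value, and the toy models are all hypothetical, and the exact selection criterion used for ANI is not given in the abstract.

```python
from statistics import pstdev
from typing import Callable, List, Sequence

# A "model" here is any callable mapping an input to a predicted energy.
Model = Callable[[float], float]


def committee_disagreement(models: Sequence[Model], x: float) -> float:
    """Spread (population standard deviation) of the committee's predictions at x."""
    return pstdev([model(x) for model in models])


def select_queries(models: Sequence[Model], candidates: Sequence[float],
                   threshold: float) -> List[float]:
    """Keep only the candidates on which the committee disagrees by more than threshold."""
    return [x for x in candidates if committee_disagreement(models, x) > threshold]


# Toy committee (assumption): three models that agree near 0 and diverge for large x.
committee = [
    lambda x: x ** 2,
    lambda x: x ** 2 + 0.01 * x ** 3,
    lambda x: x ** 2 - 0.01 * x ** 3,
]
candidates = [0.0, 1.0, 2.0, 4.0]

# Points selected here would be sent for new reference calculations and
# added to the training set in the next active learning cycle.
picked = select_queries(committee, candidates, threshold=0.1)
print(picked)
```

    In this sketch only the candidate where the committee members diverge strongly is selected, which mirrors the abstract's point that sampling is directed to regions where the potential is unreliable.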

  18. Kinetics and Thermodynamics of the Reaction between the (•)OH Radical and Adenine: A Theoretical Investigation.

    PubMed

    Milhøj, Birgitte O; Sauer, Stephan P A

    2015-06-18

    The accessibility of all possible reaction paths for the reaction between the nucleobase adenine and the •OH radical is investigated through quantum chemical calculations of barrier heights and rate constants at the ωB97X-D/6-311++G(2df,2pd) level with Eckart tunneling corrections. First the computational method is validated by considering the hydrogen abstraction from the heterocyclic N9 nitrogen in adenine as a test system. Geometries for all molecules in the reaction are optimized with four different DFT exchange-correlation functionals (B3LYP, BHandHLYP, M06-2X, and ωB97X-D), in combination with Pople and Dunning basis sets, all of which have been employed in similar investigations in the literature. Improved energies are obtained through single point calculations with CCSD(T) and the same basis sets, and reaction rate constants are calculated for all methods both without tunneling corrections and with the Wigner, Bell, and Eckart corrections. In comparison to CCSD(T)//BHandHLYP/aug-cc-pVTZ reference results, the ωB97X-D/6-311++G(2df,2pd) method combined with Eckart tunneling corrections provides a sensible compromise between accuracy and time. Using this method, all subreactions of the reaction between adenine and the •OH radical are investigated. The total rate constants for hydrogen abstraction and addition for adenine are predicted with this method to be 1.06 × 10^-12 and 1.10 × 10^-12 cm^3 molecules^-1 s^-1, respectively. Abstractions of H61 and H62 contribute the most, while only addition to the C8 carbon is found to be of any significance, in contrast to previous claims that addition is the dominant reaction pathway. The overall rate constant for the complete reaction is found to be 2.17 × 10^-12 cm^3 molecules^-1 s^-1, which agrees exceptionally well with experimental results.
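
    Of the tunneling corrections named in this abstract, the Wigner correction is the simplest to write down: κ = 1 + (1/24)(hcν̃/k_B T)^2, where ν̃ is the magnitude of the imaginary frequency at the transition state. The sketch below evaluates it and also adds the two partial rate constants quoted in the abstract. The 1000 cm^-1 frequency and 298.15 K temperature are illustrative assumptions, not values from the paper, and the function names are hypothetical.

```python
# Second radiation constant h*c/k_B in cm*K (CODATA value, rounded).
SECOND_RADIATION_CONSTANT_CM_K = 1.438777


def wigner_correction(imag_freq_cm1: float, temperature_k: float) -> float:
    """Wigner tunneling factor for an imaginary frequency given in cm^-1."""
    u = SECOND_RADIATION_CONSTANT_CM_K * abs(imag_freq_cm1) / temperature_k
    return 1.0 + u * u / 24.0


def total_rate(partial_rates):
    """Overall rate constant as the sum of parallel-channel rate constants."""
    return sum(partial_rates)


# Illustrative inputs (assumptions): 1000i cm^-1 barrier frequency at 298.15 K.
kappa = wigner_correction(1000.0, 298.15)

# Abstraction and addition totals as quoted in the abstract (cm^3 molecules^-1 s^-1).
k_total = total_rate([1.06e-12, 1.10e-12])
print(f"kappa = {kappa:.3f}, k_total = {k_total:.2e}")
```

    The sum of the two rounded partial values is 2.16 × 10^-12, against the 2.17 × 10^-12 reported for the complete reaction; the small gap is presumably due to rounding of the partial rate constants.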

  19. New Parameters for Higher Accuracy in the Computation of Binding Free Energy Differences upon Alanine Scanning Mutagenesis on Protein-Protein Interfaces.

    PubMed

    Simões, Inês C M; Costa, Inês P D; Coimbra, João T S; Ramos, Maria J; Fernandes, Pedro A

    2017-01-23

    Knowing how proteins make stable complexes enables the development of inhibitors to preclude protein-protein (P:P) binding. The identification of the specific interfacial residues that mostly contribute to protein binding, denominated as hot spots, is thus critical. Here, we refine an in silico alanine scanning mutagenesis protocol, based on a residue-dependent dielectric constant version of the Molecular Mechanics/Poisson-Boltzmann Surface Area method. We have used a large data set of structurally diverse P:P complexes to redefine the residue-dependent dielectric constants used in the determination of binding free energies. The accuracy of the method was validated through comparison with experimental data, considering the per-residue P:P binding free energy (ΔΔG_binding) differences upon alanine mutation. Different protocols were tested, i.e., a geometry optimization protocol and three molecular dynamics (MD) protocols: (1) one using explicit water molecules, (2) another with an implicit solvation model, and (3) a third where we have carried out an accelerated MD with explicit water molecules. Using a set of protein dielectric constants (within the range from 1 to 20) we showed that the dielectric constants of 7 for nonpolar and polar residues and 11 for charged residues (and histidine) provide optimal ΔΔG_binding predictions. An overall mean unsigned error (MUE) of 1.4 kcal mol^-1 relative to the experiment was achieved in 210 mutations only with geometry optimization, which was further reduced with MD simulations (MUE of 1.1 kcal mol^-1 for the MD employing explicit solvent). This recalibrated method allows for a better computational identification of hot spots, avoiding expensive and time-consuming experiments or thermodynamic integration/free energy perturbation/uBAR calculations, and will hopefully help new drug discovery campaigns in their quest of searching spots of interest for binding small drug-like molecules at P:P interfaces.
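
    The mean unsigned error used as the accuracy metric in this abstract is the average of the absolute differences between predicted and experimental values. A minimal sketch follows; the function name and the three ΔΔG values per list are made up for illustration and are not data from the paper.

```python
from typing import Sequence


def mean_unsigned_error(predicted: Sequence[float],
                        experimental: Sequence[float]) -> float:
    """Average of |predicted - experimental| over paired values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


# Hypothetical binding free energy differences in kcal/mol (illustration only).
predicted_ddg = [1.0, 2.5, 0.2]
experimental_ddg = [0.5, 3.0, 1.2]

mue = mean_unsigned_error(predicted_ddg, experimental_ddg)
print(f"MUE = {mue:.2f} kcal/mol")
```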

  20. Three-dimensional quantitative structure-activity relationship study on anti-cancer activity of 3,4-dihydroquinazoline derivatives against human lung cancer A549 cells

    NASA Astrophysics Data System (ADS)

    Cho, Sehyeon; Choi, Min Ji; Kim, Minju; Lee, Sunhoe; Lee, Jinsung; Lee, Seok Joon; Cho, Haelim; Lee, Kyung-Tae; Lee, Jae Yeol

    2015-03-01

    A series of 3,4-dihydroquinazoline derivatives with anti-cancer activities against human lung cancer A549 cells was subjected to three-dimensional quantitative structure-activity relationship (3D-QSAR) studies using the comparative molecular similarity indices analysis (CoMSIA) approach. The most potent compound, 1, was used to align the molecules. The best prediction was obtained with CoMSIA combining the steric, electrostatic, hydrophobic, hydrogen bond donor, and hydrogen bond acceptor fields (q^2 = 0.720, r^2 = 0.897). This model was validated by an external test set of 6 compounds, giving a satisfactory predictive r^2 value of 0.923, as well as by the scrambling stability test. This model could guide the design of potent 3,4-dihydroquinazoline derivatives as anti-cancer agents for the treatment of human lung cancer.

  1. Virology: The Next Generation from Digital PCR to Single Virion Genomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, Richard A.; Brazelton De Cardenas, Jessica N.; Hayden, Randall T.

    In the past 25 years, virology has had major technology breakthroughs stemming first from the introduction of nucleic acid amplification testing, but more recently from the use of next-generation sequencing, digital PCR, and the possibility of single virion genomics. These technologies have and will improve diagnosis and disease state monitoring in clinical settings, aid in environmental monitoring, and reveal the vast genetic potential of viruses. Using the principle of limiting dilution, digital PCR amplifies single molecules of DNA in highly partitioned endpoint reactions and reads each of those reactions as either positive or negative based on the presence or absence of target fluorophore. In this review, digital PCR will be highlighted along with current studies, advantages/disadvantages, and future perspectives with regard to digital PCR, viral load testing, and the possibility of single virion genomics.
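
    The positive/negative readout described in this abstract is usually converted into a copy number with a Poisson correction: if a fraction p of partitions is positive, the mean number of target copies per partition is λ = -ln(1 - p). The sketch below shows that arithmetic; the function name and the partition counts are hypothetical, and the review itself does not give this formula in the abstract.

```python
import math


def mean_copies_per_partition(positive: int, total: int) -> float:
    """Poisson estimate of target copies per partition from endpoint counts."""
    if total <= 0 or not 0 <= positive < total:
        # With every partition positive the estimate diverges (sample too concentrated).
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1.0 - positive / total)


# Hypothetical run: 5000 of 20000 partitions read as positive.
lam = mean_copies_per_partition(5000, 20000)
total_copies = lam * 20000
print(f"lambda = {lam:.4f} copies/partition, about {total_copies:.0f} copies loaded")
```

    The correction matters because a positive partition may contain more than one molecule, so simply counting positive partitions underestimates the load.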

  2. "Ask Ernö": a self-learning tool for assignment and prediction of nuclear magnetic resonance spectra.

    PubMed

    Castillo, Andrés M; Bernal, Andrés; Dieden, Reiner; Patiny, Luc; Wist, Julien

    2016-01-01

    We present "Ask Ernö", a self-learning system for the automatic analysis of NMR spectra, consisting of integrated chemical shift assignment and prediction tools. The output of the automatic assignment component initializes and improves a database of assigned protons that is used by the chemical shift predictor. In turn, the predictions provided by the latter facilitate improvement of the assignment process. Iteration on these steps allows Ask Ernö to improve its ability to assign and predict spectra without any prior knowledge or assistance from human experts. This concept was tested by training such a system with a dataset of 2341 molecules and their 1H-NMR spectra, and evaluating the accuracy of chemical shift predictions on a test set of 298 partially assigned molecules (2007 assigned protons). After 10 iterations, Ask Ernö was able to decrease its prediction error by 17%, reaching an average error of 0.265 ppm. Over 60% of the test chemical shifts were predicted within 0.2 ppm, while only 5% still presented a prediction error of more than 1 ppm. Ask Ernö introduces an innovative approach to automatic NMR analysis that constantly learns and improves when provided with new data. Furthermore, it completely avoids the need for manually assigned spectra. This system has the potential to be turned into a fully autonomous tool able to compete with the best alternatives currently available. Graphical abstract: Self-learning loop. Any progress in the prediction (forward problem) will improve the assignment ability (reverse problem) and vice versa.

  3. Identification of Human IKK-2 Inhibitors of Natural Origin (Part I): Modeling of the IKK-2 Kinase Domain, Virtual Screening and Activity Assays

    PubMed Central

    Sala, Esther; Guasch, Laura; Iwaszkiewicz, Justyna; Mulero, Miquel; Salvadó, Maria-Josepa; Pinent, Montserrat; Zoete, Vincent; Grosdidier, Aurélien; Garcia-Vallvé, Santiago; Michielin, Olivier; Pujadas, Gerard

    2011-01-01

    Background: Their large scaffold diversity and properties, such as structural complexity and drug similarity, form the basis of claims that natural products are ideal starting points for drug design and development. Consequently, there has been great interest in determining whether such molecules show biological activity toward protein targets of pharmacological relevance. One target of particular interest is hIKK-2, a serine-threonine protein kinase belonging to the IKK complex that is the primary component responsible for activating NF-κB in response to various inflammatory stimuli. Indeed, this has led to the development of synthetic ATP-competitive inhibitors for hIKK-2. Therefore, the main goals of this study were (a) to use virtual screening to identify potential hIKK-2 inhibitors of natural origin that compete with ATP and (b) to evaluate the reliability of our virtual-screening protocol by experimentally testing the in vitro activity of selected natural-product hits. Methodology/Principal Findings: We thus predicted that 1,061 out of the 89,425 natural products present in the studied database would inhibit hIKK-2 with good ADMET properties. Notably, when these 1,061 molecules were merged with the 98 synthetic hIKK-2 inhibitors used in this study and the resulting set was classified into ten clusters according to chemical similarity, there were three clusters that contained only natural products. Five molecules from these three clusters (for which no anti-inflammatory activity has been previously described) were then selected for in vitro activity testing, in which three out of the five molecules were shown to inhibit hIKK-2. Conclusions/Significance: We demonstrated that our virtual-screening protocol was successful in identifying lead compounds for developing new inhibitors for hIKK-2, a target of great interest in medicinal chemistry. Additionally, all the tools developed during the current study (i.e., the homology model for the hIKK-2 kinase domain and the pharmacophore) will be made available to interested readers upon request. PMID:21390216

  4. Quantitative structure-activity relationships for reactivities of sulfate and hydroxyl radicals with aromatic contaminants through single-electron transfer pathway.

    PubMed

    Luo, Shuang; Wei, Zongsu; Spinney, Richard; Villamena, Frederick A; Dionysiou, Dionysios D; Chen, Dong; Tang, Chong-Jian; Chai, Liyuan; Xiao, Ruiyang

    2018-02-15

    Sulfate radical anion (SO4•-) and hydroxyl radical (•OH) based advanced oxidation technologies have been extensively used for the removal of aromatic contaminants (ACs) in waters. In this study, we investigated the Gibbs free energy (ΔG°_SET) of the single electron transfer (SET) reactions for 76 ACs with SO4•- and •OH, respectively. The result reveals that SO4•- possesses a greater propensity to react with ACs through the SET channel than •OH. We hypothesized that the electron distribution within the molecule plays an essential role in determining the ΔG°_SET and subsequent SET reactions. To test the hypothesis, a quantitative structure-activity relationship (QSAR) model was developed for predicting ΔG°_SET using the highest occupied molecular orbital energies (E_HOMO), a measure of electron distribution and donating ability. The standardized QSAR models are reported to be ΔG°_SET = -0.97 × E_HOMO - 181 and ΔG°_SET = -0.97 × E_HOMO - 164 for SO4•- and •OH, respectively. The models were internally and externally validated to ensure robustness and predictability, and the application domain and limitations were discussed. The single-descriptor based models account for 95% of the variability for SO4•- and •OH. These results provide mechanistic insight into the SET reaction pathway of radical and non-radical bimolecular reactions, and have important applications for radical based oxidation technologies to remove target ACs in different waters.
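
    The two single-descriptor models quoted in this abstract share a slope of -0.97 and differ only in their intercepts (-181 for the sulfate radical, -164 for the hydroxyl radical). A minimal sketch of applying them is given below. The abstract does not state the units of either quantity, so the input value of -200 is purely illustrative, and the function and dictionary names are hypothetical.

```python
# Coefficients as quoted in the abstract; units are not stated there.
SLOPE = -0.97
INTERCEPTS = {"SO4": -181.0, "OH": -164.0}


def predict_delta_g_set(e_homo: float, radical: str) -> float:
    """Linear QSAR estimate of the SET Gibbs free energy from the HOMO energy."""
    if radical not in INTERCEPTS:
        raise ValueError(f"unknown radical {radical!r}; expected one of {sorted(INTERCEPTS)}")
    return SLOPE * e_homo + INTERCEPTS[radical]


# Illustrative HOMO energy (assumption, in the same unspecified units as the model).
e_homo = -200.0
dg_so4 = predict_delta_g_set(e_homo, "SO4")
dg_oh = predict_delta_g_set(e_homo, "OH")
print(dg_so4, dg_oh, dg_so4 - dg_oh)
```

    Because the slopes are identical, the predicted free energy for the sulfate radical is lower than that for the hydroxyl radical by a constant 17 units for any contaminant, which is consistent with the abstract's statement that the sulfate radical has the greater propensity for the SET channel.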

  5. Comparison of gas-solid chromatography and MM2 force field molecular binding energies for greenhouse gases on a carbonaceous surface.

    PubMed

    Rybolt, Thomas R; Bivona, Kevin T; Thomas, Howard E; O'Dell, Casey M

    2009-10-01

    Gas-solid chromatography was used to determine B2s (gas-solid virial coefficient) values for eight molecular adsorbates interacting with a carbon powder (Carbopack B, Supelco). B2s values were determined by multiple size variant injections within the temperature range of 313-553 K. The molecular adsorbates included: carbon dioxide (CO2); tetrafluoromethane (CF4); hexafluoroethane (C2F6); 1,1-difluoroethane (C2H4F2); 1-chloro-1,1-difluoroethane (C2H3ClF2); dichlorodifluoromethane (CCl2F2); trichlorofluoromethane (CCl3F); and 1,1,1-trichloroethane (C2H3Cl3). Two of these molecules are of special interest because they are "super greenhouse gases". The global warming potential, GWP, for CF4 is 6500 and for C2F6 is 9200 relative to the reference value of 1 for CO2. The GWP index considers both radiative blocking and molecular lifetime. For these and other industrial greenhouse gases, adsorptive trapping on a carbonaceous solid, which depends on molecule-surface binding energy, could avoid atmospheric release. The temperature variations of the gas-solid virial coefficients in conjunction with van't Hoff plots were used to find the experimental adsorption energy or binding energy values (E*) for each adsorbate. A molecular mechanics based, rough-surface model was used to calculate the molecule-surface binding energy (Ecal*) using augmented MM2 parameters. The surface model consisted of parallel graphene layers with two separated nanostructures each containing 17 benzene rings arranged in linear strips. The separation of the parallel nanostructures had been optimized in a prior study to appropriately represent molecule-surface interactions for Carbopack B. Linear regressions of E* versus Ecal* for the current data set of eight molecules and the same surface model gave E* = 0.926 Ecal* and r^2 = 0.956. A combined set of the current and prior Carbopack B adsorbates studied (linear alkanes, branched alkanes, cyclic alkanes, ethers, and halogenated hydrocarbons) gave a data set with 33 molecules and a regression of E* = 0.991 Ecal* and r^2 = 0.968. These results indicated a good correlation between the experimental and the MM2 computed molecule-surface binding energies.
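
    The regressions quoted in this abstract have the form E = slope × Ecal with no intercept term, which suggests a least-squares fit constrained through the origin. A minimal sketch of such a fit follows. It assumes the through-origin form and a conventional mean-centred r^2, neither of which the abstract confirms, and the three data points are invented so that the result can be checked by hand.

```python
from typing import Sequence


def fit_through_origin(x: Sequence[float], y: Sequence[float]) -> float:
    """Least-squares slope for y = slope * x (no intercept)."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least one non-zero value")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


def r_squared(x: Sequence[float], y: Sequence[float], slope: float) -> float:
    """1 - SS_res/SS_tot, with SS_tot taken about the mean of y (an assumption)."""
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Invented calculated (x) and experimental (y) binding energies, illustration only.
e_cal = [1.0, 2.0, 3.0]
e_exp = [2.0, 4.0, 6.0]

slope = fit_through_origin(e_cal, e_exp)
r2 = r_squared(e_cal, e_exp, slope)
print(slope, r2)
```

    A slope close to 1, as in the 0.991 value reported for the combined 33-molecule set, indicates that the computed binding energies reproduce the experimental ones almost without scaling.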

  6. Discovery of anti-microbial and anti-tubercular molecules from Fusarium solani: an endophyte of Glycyrrhiza glabra.

    PubMed

    Shah, A; Rather, M A; Hassan, Q P; Aga, M A; Mushtaq, S; Shah, A M; Hussain, A; Baba, S A; Ahmad, Z

    2017-05-01

    Glycyrrhiza glabra is a high-value medicinal plant thriving in the biodiversity-rich Kashmir Himalaya. The present study was designed to explore the fungal endophytes from G. glabra as a source of bioactive molecules. The extracts prepared from the isolated endophytes were evaluated for anti-microbial activities using a broth micro-dilution assay. The endophytic strain coded as A2, exhibiting promising anti-bacterial as well as anti-tuberculosis activity, was identified as Fusarium solani by the ITS-5.8S ribosomal gene sequencing technique. This strain was subjected to large-scale fermentation followed by isolation of its bioactive compounds using column chromatography. From the results of spectral data analysis and comparison with literature, the molecules were identified as 3,6,9-trihydroxy-7-methoxy-4,4-dimethyl-3,4-dihydro-1H-benzo[g]isochromene-5,10-dione (1), fusarubin (2), 3-O-methylfusarubin (3) and javanicin (4). Compound 1 is reported for the first time from this strain. All four compounds inhibited the growth of various tested bacterial strains with MIC values in the range of <1 to 256 μg ml^-1. Fusarubin showed good activity against Mycobacterium tuberculosis strain H37Rv with an MIC value of 8 μg ml^-1, whereas compounds 1, 3 and 4 exhibited moderate activity with MIC values of 256, 64 and 32 μg ml^-1, respectively. To the best of our knowledge, this is the first study that reports significant anti-tuberculosis potential of bioactive molecules from endophytic F. solani evaluated against the virulent strain of M. tuberculosis. This study provides a basis for their synthetic modification in activity enhancement experiments in anti-microbial drug discovery programmes. Because the chemoprofile of the same endophyte varies with source plant and ecoregion, further studies are required to explore endophytes of medicinal plants of all unusual biodiversity-rich ecoregions for important and/or novel bioactive molecules.

  7. An in vitro screening cascade to identify neuroprotective antioxidants in ALS

    PubMed Central

    Barber, Siân C.; Higginbottom, Adrian; Mead, Richard J.; Barber, Stuart; Shaw, Pamela J.

    2009-01-01

    Amyotrophic lateral sclerosis (ALS) is an adult-onset neurodegenerative disease, characterized by progressive dysfunction and death of motor neurons. Although evidence for oxidative stress in ALS pathogenesis is well described, antioxidants have generally shown poor efficacy in animal models and human clinical trials. We have developed an in vitro screening cascade to identify antioxidant molecules capable of rescuing NSC34 motor neuron cells expressing an ALS-associated mutation of superoxide dismutase 1. We have tested known antioxidants and screened a library of 2000 small molecules. The library screen identified 164 antioxidant molecules, which were refined to the 9 most promising molecules in subsequent experiments. Analysis of the in silico properties of hit compounds and a review of published literature on their in vivo effectiveness have enabled us to systematically identify molecules with antioxidant activity combined with chemical properties necessary to penetrate the central nervous system. The top-performing molecules identified include caffeic acid phenethyl ester, esculetin, and resveratrol. These compounds were tested for their ability to rescue primary motor neuron cultures after trophic factor withdrawal, and the mechanisms of action of their antioxidant effects were investigated. Subsequent in vivo studies can be targeted using molecules with the greatest probability of success. PMID:19439221

  8. Electronic spectra from TDDFT and machine learning in chemical space

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ramakrishnan, Raghunathan; Hartmann, Mia; Tapavicza, Enrico

    Due to its favorable computational efficiency, time-dependent (TD) density functional theory (DFT) enables the prediction of electronic spectra in a high-throughput manner across chemical space. Its predictions, however, can be quite inaccurate. We resolve this issue with machine learning models trained on deviations of reference second-order approximate coupled-cluster (CC2) singles and doubles spectra from TDDFT counterparts, or even from DFT gap. We applied this approach to low-lying singlet-singlet vertical electronic spectra of over 20 000 synthetically feasible small organic molecules with up to eight CONF atoms. The prediction errors decay monotonically as a function of training set size. For a training set of 10 000 molecules, CC2 excitation energies can be reproduced to within +/- 0.1 eV for the remaining molecules. Analysis of our spectral database via chromophore counting suggests that even higher accuracies can be achieved. Based on the evidence collected, we discuss open challenges associated with data-driven modeling of high-lying spectra and transition intensities.

  9. An acidic microenvironment sets the humoral pattern recognition molecule PTX3 in a tissue repair mode

    PubMed Central

    Doni, Andrea; Musso, Tiziana; Morone, Diego; Bastone, Antonio; Zambelli, Vanessa; Sironi, Marina; Castagnoli, Carlotta; Cambieri, Irene; Stravalaci, Matteo; Pasqualini, Fabio; Laface, Ilaria; Valentino, Sonia; Tartari, Silvia; Ponzetta, Andrea; Maina, Virginia; Barbieri, Silvia S.; Tremoli, Elena; Catapano, Alberico L.; Norata, Giuseppe D.; Bottazzi, Barbara; Garlanda, Cecilia

    2015-01-01

    Pentraxin 3 (PTX3) is a fluid-phase pattern recognition molecule and a key component of the humoral arm of innate immunity. In four different models of tissue damage in mice, PTX3 deficiency was associated with increased fibrin deposition and persistence, and thicker clots, followed by increased collagen deposition, when compared with controls. Ptx3-deficient macrophages showed defective pericellular fibrinolysis in vitro. PTX3 bound fibrinogen/fibrin and plasminogen at acidic pH and increased plasmin-mediated fibrinolysis. The second exon-encoded N-terminal domain of PTX3 recapitulated the activity of the intact molecule. Thus, a prototypic component of humoral innate immunity, PTX3, plays a nonredundant role in the orchestration of tissue repair and remodeling. Tissue acidification resulting from metabolic adaptation during tissue repair sets PTX3 in a tissue remodeling and repair mode, suggesting that matrix and microbial recognition are common, ancestral features of the humoral arm of innate immunity. PMID:25964372

  10. Controlling Brownian motion of single protein molecules and single fluorophores in aqueous buffer.

    PubMed

    Cohen, Adam E; Moerner, W E

    2008-05-12

    We present an Anti-Brownian Electrokinetic trap (ABEL trap) capable of trapping individual fluorescently labeled protein molecules in aqueous buffer. The ABEL trap operates by tracking the Brownian motion of a single fluorescent particle in solution, and applying a time-dependent electric field designed to induce an electrokinetic drift that cancels the Brownian motion. The trapping strength of the ABEL trap is limited by the latency of the feedback loop. In previous versions of the trap, this latency was set by the finite frame rate of the camera used for video-tracking. In the present system, the motion of the particle is tracked entirely in hardware (without a camera or image-processing software) using a rapidly rotating laser focus and lock-in detection. The feedback latency is set by the finite rate of arrival of photons. We demonstrate trapping of individual molecules of the protein GroEL in buffer, and we show confinement of single fluorophores of the dye Cy3 in water.
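
    The role of feedback latency described above can be explored with a one-dimensional toy model. This is a sketch under simple assumptions (discrete time steps, Gaussian kicks, a proportional correction based on a delayed position measurement, arbitrary units); it is not a model of the authors' hardware, and `simulate_trap`, `gain`, and `latency` are hypothetical names.

```python
import random

def simulate_trap(steps, gain, latency, sigma=1.0, seed=0):
    """1D random walk with a proportional feedback push.

    At each step the particle receives a Gaussian kick and a correction
    of -gain * (position measured `latency` steps ago).
    gain=0.0 gives free Brownian motion.
    """
    rng = random.Random(seed)
    xs = [0.0]
    for t in range(steps):
        delayed_index = t - latency
        # Before the first delayed measurement arrives, no correction is applied.
        measured = xs[delayed_index] if delayed_index >= 0 else 0.0
        kick = rng.gauss(0.0, sigma)
        xs.append(xs[-1] + kick - gain * measured)
    return xs

def rms(xs):
    """Root-mean-square displacement from the trap centre."""
    return (sum(x * x for x in xs) / len(xs)) ** 0.5

if __name__ == "__main__":
    print("free walk RMS:", rms(simulate_trap(2000, gain=0.0, latency=1)))
    print("trapped RMS:  ", rms(simulate_trap(2000, gain=0.5, latency=1)))
```

    Raising `latency` at fixed `gain` makes the correction act on stale position information, which is the qualitative reason a slow feedback loop limits trapping strength.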

  11. A combined experimental and DFT investigation of disazo dye having pyrazole skeleton

    NASA Astrophysics Data System (ADS)

    Şener, Nesrin; Bayrakdar, Alpaslan; Kart, Hasan Hüseyin; Şener, İzzet

    2017-02-01

    A disazo dye containing a pyrazole skeleton has been synthesized. The structure of the dye has been confirmed by FT-IR, 1H NMR, 13C NMR, and HRMS spectral techniques and by elemental analysis. The molecular geometry and infrared spectrum are also calculated by Density Functional Theory (DFT) at the B3LYP level with the 6-311G (d,p) basis set. The 1H NMR chemical shifts of the title molecule are calculated with the Gauge-Invariant Atomic Orbital (GIAO) method using the same basis set. The total density of states, the partial density of states and the overlap population density of states diagrams are analyzed via the Gauss Sum 3.0 program. Frontier molecular orbitals such as the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) and the molecular electrostatic potential surface of the title molecule are predicted for various intramolecular interactions that are responsible for the stabilization of the molecule. The experimental results and theoretical values have been compared.

  12. Prebiotic molecules formation through the gas-phase reaction between HNO and CH2CHOH2+

    NASA Astrophysics Data System (ADS)

    Redondo, Pilar; Martínez, Henar; Largo, Antonio; Barrientos, Carmen

    2017-07-01

    Context. Knowing how the molecules that are present in the ISM can evolve to more complex ones is an interesting topic in interstellar chemistry. The study of possible reactions between detected species can help to understand the evolution in complexity of the interstellar matter and also allows knowing the formation of new molecules which could be candidates to be detected. We focus our attention on two molecules detected in space, vinyl alcohol (CH2CHOH) and azanone (HNO). Aims: We aim to carry out a theoretical study of the ion-molecule reaction between protonated vinyl alcohol and azanone. The viability of formation of complex organic molecules (COMs) from these reactants is expected to provide some insight into the formation of prebiotic species through gas phase reactions. Methods: The reaction of protonated vinyl alcohol with azanone has been theoretically studied by using ab initio methods. Stationary points on the potential energy surface (PES) were characterized at the second-order Moller-Plesset level in conjunction with the aug-cc-pVTZ (correlation-consistent polarized valence triple-zeta) basis set. In addition, the electronic energies were refined by means of single-point calculations at the CCSD(T) level (coupled cluster single and double excitation model augmented with a non-iterative treatment of triple excitations) with the same basis set. Results: From a thermodynamic point of view, twelve products, composed of carbon, oxygen, nitrogen, and hydrogen which could be precursors in the formation of more complex biological molecules, can be obtained from this reaction. Among these, we focus especially on ionized glycine and two of its isomers. The analysis of the PES shows that only formation of cis- and trans-O-protonated imine acetaldehyde, CH2NHCOH+ and, CHNHCHOH+, are viable under interstellar conditions. 
Conclusions: The reaction of protonated vinyl alcohol with azanone can evolve in the interstellar medium to more complex organic molecules of prebiotic interest. Our results suggest that imine acetaldehyde could be a feasible candidate molecule to be searched for in space.

  13. Benchmarking the pseudopotential and fixed-node approximations in diffusion Monte Carlo calculations of molecules and solids

    DOE PAGES

    Nazarov, Roman; Shulenburger, Luke; Morales, Miguel A.; ...

    2016-03-28

    We performed diffusion Monte Carlo (DMC) calculations of the spectroscopic properties of a large set of molecules, assessing the effect of different approximations. In systems containing elements with large atomic numbers, we show that the errors associated with the use of nonlocal mean-field-based pseudopotentials in DMC calculations can be significant and may surpass the fixed-node error. In conclusion, we suggest practical guidelines for reducing these pseudopotential errors, which allow us to obtain DMC-computed spectroscopic parameters of molecules and equation of state properties of solids in excellent agreement with experiment.

  14. Energy-switching potential energy surface for the water molecule revisited: A highly accurate singled-sheeted form.

    PubMed

    Galvão, B R L; Rodrigues, S P J; Varandas, A J C

    2008-07-28

    A global ab initio potential energy surface is proposed for the water molecule by energy-switching/merging a highly accurate isotope-dependent local potential function reported by Polyansky et al. [Science 299, 539 (2003)] with a global form of the many-body expansion type suitably adapted to account explicitly for the dynamical correlation and parametrized from extensive accurate multireference configuration interaction energies extrapolated to the complete basis set limit. The new function mimics also the complicated Sigma/Pi crossing that arises at linear geometries of the water molecule.

  15. Benchmarking the pseudopotential and fixed-node approximations in diffusion Monte Carlo calculations of molecules and solids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nazarov, Roman; Shulenburger, Luke; Morales, Miguel A.

    We performed diffusion Monte Carlo (DMC) calculations of the spectroscopic properties of a large set of molecules, assessing the effect of different approximations. In systems containing elements with large atomic numbers, we show that the errors associated with the use of nonlocal mean-field-based pseudopotentials in DMC calculations can be significant and may surpass the fixed-node error. In conclusion, we suggest practical guidelines for reducing these pseudopotential errors, which allow us to obtain DMC-computed spectroscopic parameters of molecules and equation of state properties of solids in excellent agreement with experiment.

  16. An improvement of quantum parametric methods by using SGSA parameterization technique and new elementary parametric functionals

    NASA Astrophysics Data System (ADS)

    Sánchez, M.; Oldenhof, M.; Freitez, J. A.; Mundim, K. C.; Ruette, F.

    A systematic improvement of parametric quantum methods (PQM) is performed by considering: (a) a new application of parameterization procedure to PQMs and (b) novel parametric functionals based on properties of elementary parametric functionals (EPF) [Ruette et al., Int J Quantum Chem 2008, 108, 1831]. Parameterization was carried out by using the simplified generalized simulated annealing (SGSA) method in the CATIVIC program. This code has been parallelized and comparison with MOPAC/2007 (PM6) and MINDO/SR was performed for a set of molecules with C-C, C-H, and H-H bonds. Results showed better accuracy than MINDO/SR and MOPAC-2007 for a selected trial set of molecules.

  17. In Silico Modeling of Indigo and Tyrian Purple Single-Electron Nano-Transistors Using Density Functional Theory Approach

    NASA Astrophysics Data System (ADS)

    Shityakov, Sergey; Roewer, Norbert; Förster, Carola; Broscheit, Jens-Albert

    2017-07-01

    The purpose of this study was to develop and implement an in silico model of indigoid-based single-electron transistor (SET) nanodevices, which consist of indigoid molecules from natural dye weakly coupled to gold electrodes that function in a Coulomb blockade regime. The electronic properties of the indigoid molecules were investigated using the optimized density-functional theory (DFT) with a continuum model. Higher electron transport characteristics were determined for Tyrian purple, consistent with experimentally derived data. Overall, these results can be used to correctly predict and emphasize the electron transport functions of organic SETs, demonstrating their potential for sustainable nanoelectronics comprising the biodegradable and biocompatible materials.

  18. Defining Multiple Characteristic Raman Bands of α-Amino Acids as Biomarkers for Planetary Missions Using a Statistical Method

    NASA Astrophysics Data System (ADS)

    Rolfe, S. M.; Patel, M. R.; Gilmour, I.; Olsson-Francis, K.; Ringrose, T. J.

    2016-06-01

    Biomarker molecules, such as amino acids, are key to discovering whether life exists elsewhere in the Solar System. Raman spectroscopy, a technique capable of detecting biomarkers, will be on board future planetary missions including the ExoMars rover. Generally, the position of the strongest band in the spectra of amino acids is reported as the identifying band. However, for an unknown sample, it is desirable to define multiple characteristic bands for molecules to avoid any ambiguous identification. To date, there has been no definition of multiple characteristic bands for amino acids of interest to astrobiology. This study examined l-alanine, l-aspartic acid, l-cysteine, l-glutamine and glycine and defined several Raman bands per molecule for reference as characteristic identifiers. Per amino acid, 240 spectra were recorded and compared using established statistical tests including ANOVA. The number of characteristic bands defined were 10, 12, 12, 14 and 19 for l-alanine (strongest intensity band: 832 cm-1), l-aspartic acid (938 cm-1), l-cysteine (679 cm-1), l-glutamine (1090 cm-1) and glycine (875 cm-1), respectively. The intensity of bands differed by up to six times when several points on the crystal sample were rotated through 360 °; to reduce this effect when defining characteristic bands for other molecules, we find that spectra should be recorded at a statistically significant number of points per sample to remove the effect of sample rotation. It is crucial that sets of characteristic Raman bands are defined for biomarkers that are targets for future planetary missions to ensure a positive identification can be made.

  19. Defining Multiple Characteristic Raman Bands of α-Amino Acids as Biomarkers for Planetary Missions Using a Statistical Method.

    PubMed

    Rolfe, S M; Patel, M R; Gilmour, I; Olsson-Francis, K; Ringrose, T J

    2016-06-01

    Biomarker molecules, such as amino acids, are key to discovering whether life exists elsewhere in the Solar System. Raman spectroscopy, a technique capable of detecting biomarkers, will be on board future planetary missions including the ExoMars rover. Generally, the position of the strongest band in the spectra of amino acids is reported as the identifying band. However, for an unknown sample, it is desirable to define multiple characteristic bands for molecules to avoid any ambiguous identification. To date, there has been no definition of multiple characteristic bands for amino acids of interest to astrobiology. This study examined L-alanine, L-aspartic acid, L-cysteine, L-glutamine and glycine and defined several Raman bands per molecule for reference as characteristic identifiers. Per amino acid, 240 spectra were recorded and compared using established statistical tests including ANOVA. The number of characteristic bands defined were 10, 12, 12, 14 and 19 for L-alanine (strongest intensity band: 832 cm(-1)), L-aspartic acid (938 cm(-1)), L-cysteine (679 cm(-1)), L-glutamine (1090 cm(-1)) and glycine (875 cm(-1)), respectively. The intensity of bands differed by up to six times when several points on the crystal sample were rotated through 360 °; to reduce this effect when defining characteristic bands for other molecules, we find that spectra should be recorded at a statistically significant number of points per sample to remove the effect of sample rotation. It is crucial that sets of characteristic Raman bands are defined for biomarkers that are targets for future planetary missions to ensure a positive identification can be made.

  20. Mismatched HLA-DRB3 Can Induce a Potent Immune Response After HLA 10/10 Matched Stem Cell Transplantation.

    PubMed

    van Balen, Peter; van Luxemburg-Heijs, Simone A P; van de Meent, Marian; van Bergen, Cornelis A M; Halkes, Constantijn J M; Jedema, Inge; Falkenburg, J H Frederik

    2017-12-01

    Donors for allogeneic stem cell transplantation are preferentially matched with patients for HLA-A, -B, -C, and -DRB1. Mismatches between donor and patient in these alleles are associated with an increased risk of graft-versus-host disease (GVHD). In contrast, HLA-DRB3, 4 and 5, HLA-DQ and HLA-DP are usually assumed to be low expression loci with limited relevance, although mismatches in HLA-DQ and HLA-DP can result in alloimmune responses. Mismatches in HLA-DRB3, 4, and 5 are usually not taken into account in donor selection. Conversion of chimerism in the presence of GVHD after CD4 donor lymphocyte infusion was observed in a patient, HLA 10/10 matched, but mismatched for HLA-DRB3 and HLA-DPB1 compared with the donor. Alloreactive CD4 T cells were isolated from peripheral blood after CD4 donor lymphocyte infusion and recognition of donor-derived target cells transduced with the mismatched patient variant HLA-DRB3 and HLA-DPB1 molecule was tested. A dominant polyclonal CD4 T cell response against patient's mismatched HLA-DRB3 molecule was found in addition to an immune response against patient's mismatched HLA-DPB1 molecule. CD4 T cells specific for these HLA class II molecules recognized both hematopoietic target cells as well as GVHD target cells. In contrast to the assumption that mismatches in HLA-DRB3, 4, and 5 are not of immunogenic significance after HLA 10/10 matched allogeneic stem cell transplantation, we show that in this matched setting not only mismatches in HLA-DPB1, but also mismatches in HLA-DRB3 may induce a polyclonal allo-immune response associated with conversion of chimerism and severe GVHD.

  1. Cross section data sets for electron collisions with H2, O2, CO, CO2, N2O and H2O

    NASA Astrophysics Data System (ADS)

    Anzai, K.; Kato, H.; Hoshino, M.; Tanaka, H.; Itikawa, Y.; Campbell, L.; Brunger, M. J.; Buckman, S. J.; Cho, H.; Blanco, F.; Garcia, G.; Limão-Vieira, P.; Ingólfsson, O.

    2012-02-01

    We review earlier cross section data sets for electron-collisions with H2, O2, CO, CO2, H2O and N2O, updated here by experimental results for their electronic states. Based on our recent measurements of differential cross sections for the electronic states of those molecules, integral cross sections (ICSs) are derived by applying a generalized oscillator strength analysis and then assessed against theory (BE f-scaling [Y.-K. Kim, J. Chem. Phys. 126, 064305 (2007)]). As they now represent benchmark electronic state cross sections, those ICSs for the above molecules are added into the original cross section sets taken from the data reviews for H2, O2, CO2 and H2O (the Itikawa group), and for CO and N2O (the Zecca group).

  2. Quantum-chemical investigation of the structures and electronic spectra of the nucleic acid bases at the coupled cluster CC2 level.

    PubMed

    Fleig, Timo; Knecht, Stefan; Hättig, Christof

    2007-06-28

    We study the ground-state structures and singlet- and triplet-excited states of the nucleic acid bases by applying the coupled cluster model CC2 in combination with a resolution-of-the-identity approximation for electron interaction integrals. Both basis set effects and the influence of dynamic electron correlation on the molecular structures are elucidated; the latter by comparing CC2 with Hartree-Fock and Møller-Plesset perturbation theory to second order. Furthermore, we investigate basis set and electron correlation effects on the vertical excitation energies and compare our highest-level results with experiment and other theoretical approaches. It is shown that small basis sets are insufficient for obtaining accurate results for excited states of these molecules and that the CC2 approach to dynamic electron correlation is a reliable and efficient tool for electronic structure calculations on medium-sized molecules.

  3. Vibrational spectroscopic and non-linear optical activity studies on nicotinanilide : A DFT approach

    NASA Astrophysics Data System (ADS)

    Premkumar, S.; Jawahar, A.; Mathavan, T.; Dhas, M. Kumara; Benial, A. Milton Franklin

    2015-06-01

    The molecular structure of nicotinanilide was optimized by the DFT/B3LYP method with the cc-pVTZ basis set using the Gaussian 09 program. The first-order hyperpolarizability of the molecule was calculated and indicates higher nonlinear optical activity. The natural bond orbital analysis confirms the presence of intramolecular charge transfer and the hydrogen bonding interaction, which leads to the higher nonlinear optical activity of the molecule. The frontier molecular orbital analysis shows that delocalization of electron density occurs within the molecule. The lower energy gap indicates hydrogen bond formation between the charged species. The vibrational frequencies were calculated and assigned on the basis of potential energy distribution calculation using the VEDA 4.0 program and the corresponding vibrational spectra were simulated. Hence, the nicotinanilide molecule can be a good candidate for second-order NLO material.

  4. Combined use of computational chemistry and chemoinformatics methods for chemical discovery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sugimoto, Manabu, E-mail: sugimoto@kumamoto-u.ac.jp; Institute for Molecular Science, 38 Nishigo-Naka, Myodaiji, Okazaki 444-8585; CREST, Japan Science and Technology Agency, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012

    2015-12-31

    Numerical data from computational chemistry calculations are analyzed to obtain knowledge about molecules. A molecular database is developed to systematically store chemical, electronic-structure, and knowledge-based information. The database is used to find molecules related to the keyword “cancer”. Electronic-structure calculations are then performed to quantitatively evaluate the quantum chemical similarity of the molecules. Among the 377 compounds registered in the database, 24 molecules are found to be “cancer”-related. This set of molecules includes both carcinogens and anticancer drugs. The quantum chemical similarity analysis, which is carried out by using numerical results of the density-functional theory calculations, shows that, when some energy spectra are referred to, carcinogens are reasonably distinguished from the anticancer drugs. Therefore these spectral properties are considered important measures for classification.

  5. Rapid method to detect duplex formation in sequencing by hybridization methods, a method for constructing containment structures for reagent interaction

    DOEpatents

    Mirzabekov, Andrei Darievich; Yershov, Gennadiy Moiseyevich; Guschin, Dmitry Yuryevich; Gemmell, Margaret Anne; Shick, Valentine V.; Proudnikov, Dmitri Y.; Timofeev, Edward N.

    2002-01-01

    A method for determining the existence of duplexes of oligonucleotide complementary molecules is provided whereby a plurality of immobilized oligonucleotide molecules, each of a specific length and each having a specific base sequence, is contacted with complementary, single stranded oligonucleotide molecules to form a duplex so as to facilitate intercalation of a fluorescent dye between the base planes of the duplex. The invention also provides for a method for constructing oligonucleotide matrices comprising confining light sensitive fluid to a surface, exposing said light-sensitive fluid to a light pattern so as to cause the fluid exposed to the light to polymerize into discrete units and adhere to the surface; and contacting each of the units with a set of different oligonucleotide molecules so as to allow the molecules to disperse into the units.

  6. A Quantitative Method for Comparing the Brightness of Antibody-dye Reagents and Estimating Antibodies Bound per Cell.

    PubMed

    Kantor, Aaron B; Moore, Wayne A; Meehan, Stephen; Parks, David R

    2016-07-01

    We present a quantitative method for comparing the brightness of antibody-dye reagents and estimating antibodies bound per cell. The method is based on complementary binding of test and fill reagents to antibody capture microspheres. Several aliquots of antibody capture beads are stained with varying amounts of the test conjugate. The remaining binding sites on the beads are then filled with a second conjugate containing a different fluorophore. Finally, the fluorescence of the test conjugate compared to the fill conjugate is used to measure the relative brightness of the test conjugate. The fundamental assumption of the test-fill method is that if it takes X molecules of one test antibody to lower the fill signal by Y units, it will take the same X molecules of any other test antibody to give the same effect. We apply a quadratic fit to evaluate the test-fill signal relationship across different amounts of test reagent. If the fit is close to linear, we consider the test reagent to be suitable for quantitative evaluation of antibody binding. To calibrate the antibodies bound per bead, a PE conjugate with 1 PE molecule per antibody is used as a test reagent and the fluorescence scale is calibrated with Quantibrite PE beads. When the fluorescence per antibody molecule has been determined for a particular conjugate, that conjugate can be used for measurement of antibodies bound per cell. This provides comparisons of the brightness of different conjugates when conducted on an instrument whose statistical photoelectron (Spe) scales are known. © 2016 by John Wiley & Sons, Inc.
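
    The quadratic-fit linearity check mentioned in the abstract can be sketched as follows. This is an illustration only: the abstract does not state which variable is fitted against which or what threshold defines "close to linear", so the choice of axes, the curvature criterion, and the `tolerance` value are all assumptions, and the function names are hypothetical.

```python
def quadratic_fit(xs, ys):
    """Least-squares fit y ~ c0 + c1*x + c2*x**2 via the 3x3 normal equations."""
    # Power sums S_k = sum(x**k) and moments T_k = sum(y * x**k).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix: row i encodes sum_j c_j * S_(i+j) = T_i.
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def is_nearly_linear(coeffs, x_max, tolerance=0.05):
    """Assumed criterion: the quadratic term at x_max is small next to the linear slope."""
    _, c1, c2 = coeffs
    return abs(c2 * x_max) <= tolerance * abs(c1)
```

    Here `xs` would hold the test-conjugate signal and `ys` the fill-conjugate signal for each bead aliquot; a fit that passes the check behaves like the straight-line trade-off the test-fill assumption implies.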

  7. High-throughput identification and rational design of synergistic small-molecule pairs for combating and bypassing antibiotic resistance.

    PubMed

    Wambaugh, Morgan A; Shakya, Viplendra P S; Lewis, Adam J; Mulvey, Matthew A; Brown, Jessica C S

    2017-06-01

    Antibiotic-resistant infections kill approximately 23,000 people and cost $20,000,000,000 each year in the United States alone despite the widespread use of small-molecule antimicrobial combination therapy. Antibiotic combinations typically have an additive effect: the efficacy of the combination matches the sum of the efficacies of each antibiotic when used alone. Small molecules can also act synergistically when the efficacy of the combination is greater than the additive efficacy. However, synergistic combinations are rare and have been historically difficult to identify. High-throughput identification of synergistic pairs is limited by the scale of potential combinations: a modest collection of 1,000 small molecules involves 1 million pairwise combinations. Here, we describe a high-throughput method for rapid identification of synergistic small-molecule pairs, the overlap2 method (O2M). O2M extracts patterns from chemical-genetic datasets, which are created when a collection of mutants is grown in the presence of hundreds of different small molecules, producing a precise set of phenotypes induced by each small molecule across the mutant set. The identification of mutants that show the same phenotype when treated with known synergistic molecules allows us to pinpoint additional molecule combinations that also act synergistically. As a proof of concept, we focus on combinations with the antibiotics trimethoprim and sulfamethizole, which had been standard treatment against urinary tract infections until widespread resistance decreased efficacy. Using O2M, we screened a library of 2,000 small molecules and identified several that synergize with the antibiotic trimethoprim and/or sulfamethizole. The most potent of these synergistic interactions is with the antiviral drug azidothymidine (AZT). 
We then demonstrate that understanding the molecular mechanism underlying small-molecule synergistic interactions allows the rational design of additional combinations that bypass drug resistance. Trimethoprim and sulfamethizole are both folate biosynthesis inhibitors. We find that this activity disrupts nucleotide homeostasis, which blocks DNA replication in the presence of AZT. Building on these data, we show that other small molecules that disrupt nucleotide homeostasis through other mechanisms (hydroxyurea and floxuridine) also act synergistically with AZT. These novel combinations inhibit the growth and virulence of trimethoprim-resistant clinical Escherichia coli and Klebsiella pneumoniae isolates, suggesting that they may be able to be rapidly advanced into clinical use. In sum, we present a generalizable method to screen for novel synergistic combinations, to identify particular mechanisms resulting in synergy, and to use the mechanistic knowledge to rationally design new combinations that bypass drug resistance.
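
    The scale argument in the abstract can be checked with a short calculation. The sketch below assumes that "1 million pairwise combinations" refers to the full 1,000 × 1,000 grid of ordered pairs; counting each unordered pair of distinct molecules once gives roughly half that number. The function name `pair_counts` is illustrative.

```python
from math import comb

def pair_counts(n):
    """Number of pairwise tests for a library of n molecules.

    "grid" counts every cell of an n x n matrix (ordered pairs, self-pairs included);
    "unordered_distinct" counts each pair of different molecules once.
    """
    return {"grid": n * n, "unordered_distinct": comb(n, 2)}

if __name__ == "__main__":
    for size in (1000, 2000):
        print(size, pair_counts(size))
```

    Either way the count grows quadratically with library size, which is the bottleneck that a pre-selection method such as O2M is meant to avoid.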

  8. High-throughput identification and rational design of synergistic small-molecule pairs for combating and bypassing antibiotic resistance

    PubMed Central

    Lewis, Adam J.; Mulvey, Matthew A.

    2017-01-01

    Antibiotic-resistant infections kill approximately 23,000 people and cost $20,000,000,000 each year in the United States alone despite the widespread use of small-molecule antimicrobial combination therapy. Antibiotic combinations typically have an additive effect: the efficacy of the combination matches the sum of the efficacies of each antibiotic when used alone. Small molecules can also act synergistically when the efficacy of the combination is greater than the additive efficacy. However, synergistic combinations are rare and have been historically difficult to identify. High-throughput identification of synergistic pairs is limited by the scale of potential combinations: a modest collection of 1,000 small molecules involves 1 million pairwise combinations. Here, we describe a high-throughput method for rapid identification of synergistic small-molecule pairs, the overlap2 method (O2M). O2M extracts patterns from chemical-genetic datasets, which are created when a collection of mutants is grown in the presence of hundreds of different small molecules, producing a precise set of phenotypes induced by each small molecule across the mutant set. The identification of mutants that show the same phenotype when treated with known synergistic molecules allows us to pinpoint additional molecule combinations that also act synergistically. As a proof of concept, we focus on combinations with the antibiotics trimethoprim and sulfamethizole, which had been standard treatment against urinary tract infections until widespread resistance decreased efficacy. Using O2M, we screened a library of 2,000 small molecules and identified several that synergize with the antibiotic trimethoprim and/or sulfamethizole. The most potent of these synergistic interactions is with the antiviral drug azidothymidine (AZT). 
We then demonstrate that understanding the molecular mechanism underlying small-molecule synergistic interactions allows the rational design of additional combinations that bypass drug resistance. Trimethoprim and sulfamethizole are both folate biosynthesis inhibitors. We find that this activity disrupts nucleotide homeostasis, which blocks DNA replication in the presence of AZT. Building on these data, we show that other small molecules that disrupt nucleotide homeostasis through other mechanisms (hydroxyurea and floxuridine) also act synergistically with AZT. These novel combinations inhibit the growth and virulence of trimethoprim-resistant clinical Escherichia coli and Klebsiella pneumoniae isolates, suggesting that they may be able to be rapidly advanced into clinical use. In sum, we present a generalizable method to screen for novel synergistic combinations, to identify particular mechanisms resulting in synergy, and to use the mechanistic knowledge to rationally design new combinations that bypass drug resistance. PMID:28632788

  9. Mapping of the Available Chemical Space versus the Chemical Universe of Lead-Like Compounds.

    PubMed

    Lin, Arkadii; Horvath, Dragos; Afonina, Valentina; Marcou, Gilles; Reymond, Jean-Louis; Varnek, Alexandre

    2018-03-20

    This is, to our knowledge, the most comprehensive analysis to date based on generative topographic mapping (GTM) of fragment-like chemical space (40 million molecules with no more than 17 heavy atoms, both from the theoretically enumerated GDB-17 and real-world PubChem/ChEMBL databases). The challenge was to prove that a robust map of fragment-like chemical space can actually be built, in spite of a limited (≪10^5) maximal number of compounds ("frame set") usable for fitting the GTM manifold. An evolutionary map building strategy has been updated with a "coverage check" step, which discards manifolds failing to accommodate compounds outside the frame set. The evolved map has a good propensity to separate actives from inactives for more than 20 external structure-activity sets. It was proven to properly accommodate the entire collection of 40 million compounds. Next, it served as a library comparison tool to highlight biases of real-world molecules (PubChem and ChEMBL) versus the universe of all possible species represented by FDB-17, a fragment-like subset of GDB-17 containing 10 million molecules. Specific patterns, proper to some libraries and absent from others (diversity holes), were highlighted. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. Overview of the SAMPL5 host–guest challenge: Are we doing better?

    PubMed Central

    Yin, Jian; Henriksen, Niel M.; Slochower, David R.; Shirts, Michael R.; Chiu, Michael W.; Mobley, David L.; Gilson, Michael K.

    2016-01-01

    The ability to computationally predict protein-small molecule binding affinities with high accuracy would accelerate drug discovery and reduce its cost by eliminating rounds of trial-and-error synthesis and experimental evaluation of candidate ligands. As academic and industrial groups work toward this capability, there is an ongoing need for datasets that can be used to rigorously test new computational methods. Although protein–ligand data are clearly important for this purpose, their size and complexity make it difficult to obtain well-converged results and to troubleshoot computational methods. Host–guest systems offer a valuable alternative class of test cases, as they exemplify noncovalent molecular recognition but are far smaller and simpler. As a consequence, host–guest systems have been part of the prior two rounds of SAMPL prediction exercises, and they also figure in the present SAMPL5 round. In addition to being blinded, and thus avoiding biases that may arise in retrospective studies, the SAMPL challenges have the merit of focusing multiple researchers on a common set of molecular systems, so that methods may be compared and ideas exchanged. The present paper provides an overview of the host–guest component of SAMPL5, which centers on three different hosts, two octa-acids and a glycoluril-based molecular clip, and two different sets of guest molecules, in aqueous solution. A range of methods were applied, including electronic structure calculations with implicit solvent models; methods that combine empirical force fields with implicit solvent models; and explicit solvent free energy simulations. The most reliable methods tend to fall in the latter class, consistent with results in prior SAMPL rounds, but the level of accuracy is still below that sought for reliable computer-aided drug design. Advances in force field accuracy, modeling of protonation equilibria, electronic structure methods, and solvent models, hold promise for future improvements. 
PMID:27658802

  11. Overview of the SAMPL5 host-guest challenge: Are we doing better?

    PubMed

    Yin, Jian; Henriksen, Niel M; Slochower, David R; Shirts, Michael R; Chiu, Michael W; Mobley, David L; Gilson, Michael K

    2017-01-01

    The ability to computationally predict protein-small molecule binding affinities with high accuracy would accelerate drug discovery and reduce its cost by eliminating rounds of trial-and-error synthesis and experimental evaluation of candidate ligands. As academic and industrial groups work toward this capability, there is an ongoing need for datasets that can be used to rigorously test new computational methods. Although protein-ligand data are clearly important for this purpose, their size and complexity make it difficult to obtain well-converged results and to troubleshoot computational methods. Host-guest systems offer a valuable alternative class of test cases, as they exemplify noncovalent molecular recognition but are far smaller and simpler. As a consequence, host-guest systems have been part of the prior two rounds of SAMPL prediction exercises, and they also figure in the present SAMPL5 round. In addition to being blinded, and thus avoiding biases that may arise in retrospective studies, the SAMPL challenges have the merit of focusing multiple researchers on a common set of molecular systems, so that methods may be compared and ideas exchanged. The present paper provides an overview of the host-guest component of SAMPL5, which centers on three different hosts, two octa-acids and a glycoluril-based molecular clip, and two different sets of guest molecules, in aqueous solution. A range of methods were applied, including electronic structure calculations with implicit solvent models; methods that combine empirical force fields with implicit solvent models; and explicit solvent free energy simulations. The most reliable methods tend to fall in the latter class, consistent with results in prior SAMPL rounds, but the level of accuracy is still below that sought for reliable computer-aided drug design. Advances in force field accuracy, modeling of protonation equilibria, electronic structure methods, and solvent models, hold promise for future improvements.

  12. A novel structure-based multimode QSAR method affords predictive models for phosphodiesterase inhibitors.

    PubMed

    Dong, Xialan; Ebalunode, Jerry O; Cho, Sung Jin; Zheng, Weifan

    2010-02-22

    Quantitative structure-activity relationship (QSAR) methods aim to build quantitatively predictive models for the discovery of new molecules. They have been widely used in medicinal chemistry for drug discovery. Many QSAR techniques have been developed since Hansch's seminal work, and more are still being developed. Motivated by Hopfinger's receptor-dependent QSAR (RD-QSAR) formalism and the Lukacova-Balaz scheme to treat multimode issues, we have initiated studies that focus on a structure-based multimode QSAR (SBMM QSAR) method, where the structure of the target protein is used in characterizing the ligand, and the multimode issue of ligand binding is systematically treated with a modified Lukacova-Balaz scheme. All ligand molecules are first docked to the target binding pocket to obtain a set of aligned ligand poses. A structure-based pharmacophore concept is adopted to characterize the binding pocket. Specifically, we represent the binding pocket as a geometric grid labeled by pharmacophoric features. Each pose of the ligand is also represented as a labeled grid, where each grid point is labeled according to the atom types of nearby ligand atoms. These labeled grids or three-dimensional (3D) maps (both the receptor map (R-map) and the ligand map (L-map)) are compared to each other to derive descriptors for each pose of the ligand, resulting in a multimode structure-activity relationship (SAR) table. Iterative partial least-squares (PLS) is employed to build the QSAR models. When we applied this method to PDE-4 inhibitors, we obtained predictive models with excellent training correlation (r² = 0.65-0.66) as well as test correlation (R² = 0.64-0.65). A comparative analysis with 4 other QSAR techniques demonstrates that this new method affords better models, in terms of the prediction power for the test set.
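
    As a rough illustration of the labeled-grid idea in this abstract, the sketch below labels each grid point with the atom types of the ligand atoms that lie within a cutoff distance. The 1.5 Å cutoff, the atom-type names, and the function name `label_grid` are assumptions made for illustration; they are not taken from the SBMM QSAR paper.

```python
import math

def label_grid(grid_points, atoms, cutoff=1.5):
    """Label each grid point with the set of atom types found within `cutoff`.

    grid_points: list of (x, y, z) tuples
    atoms: list of (atom_type, (x, y, z)) tuples for one ligand pose
    cutoff: distance threshold in angstroms (hypothetical value)
    """
    labels = []
    for point in grid_points:
        nearby = {atom_type for atom_type, pos in atoms
                  if math.dist(point, pos) <= cutoff}
        labels.append(frozenset(nearby))
    return labels

# Toy pose with made-up coordinates and pharmacophore-style type names.
grid = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
pose = [("donor", (0.2, 0.0, 0.0)), ("hydrophobe", (1.8, 0.0, 0.0))]
l_map = label_grid(grid, pose)
```

    Applying the same labeling to every docked pose would give one L-map per pose. How the L-maps are compared with the R-map to produce descriptors is not specified in the abstract and is left out here.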

  13. A promising tool to achieve chemical accuracy for density functional theory calculations on Y-NO homolysis bond dissociation energies.

    PubMed

    Li, Hong Zhi; Hu, Li Hong; Tao, Wei; Gao, Ting; Li, Hui; Lu, Ying Hua; Su, Zhong Min

    2012-01-01

    A DFT-SOFM-RBFNN method is proposed to improve the accuracy of DFT calculations on Y-NO (Y = C, N, O, S) homolysis bond dissociation energies (BDE) by combining density functional theory (DFT) and artificial intelligence/machine learning methods, which consist of self-organizing feature mapping neural networks (SOFMNN) and radial basis function neural networks (RBFNN). A descriptor refinement step including SOFMNN clustering analysis and correlation analysis is implemented. The SOFMNN clustering analysis is applied to classify descriptors, and the representative descriptors in the groups are selected as neural network inputs according to their closeness to the experimental values through correlation analysis. Redundant descriptors and intuitively biased choices of descriptors can be avoided by this newly introduced step. Using RBFNN calculation with the selected descriptors, chemical accuracy (≤1 kcal·mol−1) is achieved for all 92 calculated organic Y-NO homolysis BDE calculated by DFT-B3LYP, and the mean absolute deviations (MADs) of the B3LYP/6-31G(d) and B3LYP/STO-3G methods are reduced from 4.45 and 10.53 kcal·mol−1 to 0.15 and 0.18 kcal·mol−1, respectively. The improved results for the minimal basis set STO-3G reach the same accuracy as those of 6-31G(d), and thus B3LYP calculation with the minimal basis set is recommended to be used for minimizing the computational cost and to expand the applications to large molecular systems. Further extrapolation tests are performed with six molecules (two containing Si-NO bonds and two containing fluorine), and the accuracy of the tests was within 1 kcal·mol−1. This study shows that DFT-SOFM-RBFNN is an efficient and highly accurate method for Y-NO homolysis BDE. The method may be used as a tool to design new NO carrier molecules.
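
    The two accuracy measures quoted in this abstract, the mean absolute deviation (MAD) and the 1 kcal·mol−1 chemical-accuracy threshold, can be sketched as follows. The numbers are made up for illustration and are not data from the paper; the function names are likewise hypothetical.

```python
def mean_absolute_deviation(predicted, reference):
    """Mean absolute deviation between predicted and reference values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)

def within_chemical_accuracy(predicted, reference, threshold=1.0):
    """True if every absolute error is at or below `threshold` (kcal/mol)."""
    return all(abs(p - r) <= threshold for p, r in zip(predicted, reference))

# Made-up BDE values in kcal/mol, for illustration only.
experimental = [30.0, 42.5, 25.0]
corrected = [30.2, 42.4, 25.3]
mad = mean_absolute_deviation(corrected, experimental)
all_accurate = within_chemical_accuracy(corrected, experimental)
```

    A low MAD alone does not guarantee that every molecule meets the threshold, which is why the per-molecule check is kept separate.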

  14. A Promising Tool to Achieve Chemical Accuracy for Density Functional Theory Calculations on Y-NO Homolysis Bond Dissociation Energies

    PubMed Central

    Li, Hong Zhi; Hu, Li Hong; Tao, Wei; Gao, Ting; Li, Hui; Lu, Ying Hua; Su, Zhong Min

    2012-01-01

    A DFT-SOFM-RBFNN method is proposed to improve the accuracy of DFT calculations on Y-NO (Y = C, N, O, S) homolysis bond dissociation energies (BDE) by combining density functional theory (DFT) and artificial intelligence/machine learning methods, which consist of self-organizing feature mapping neural networks (SOFMNN) and radial basis function neural networks (RBFNN). A descriptor refinement step including SOFMNN clustering analysis and correlation analysis is implemented. The SOFMNN clustering analysis is applied to classify descriptors, and the representative descriptors in the groups are selected as neural network inputs according to their closeness to the experimental values through correlation analysis. Redundant descriptors and intuitively biased choices of descriptors can be avoided by this newly introduced step. Using RBFNN calculation with the selected descriptors, chemical accuracy (≤1 kcal·mol−1) is achieved for all 92 calculated organic Y-NO homolysis BDE calculated by DFT-B3LYP, and the mean absolute deviations (MADs) of the B3LYP/6-31G(d) and B3LYP/STO-3G methods are reduced from 4.45 and 10.53 kcal·mol−1 to 0.15 and 0.18 kcal·mol−1, respectively. The improved results for the minimal basis set STO-3G reach the same accuracy as those of 6-31G(d), and thus B3LYP calculation with the minimal basis set is recommended to be used for minimizing the computational cost and to expand the applications to large molecular systems. Further extrapolation tests are performed with six molecules (two containing Si-NO bonds and two containing fluorine), and the accuracy of the tests was within 1 kcal·mol−1. This study shows that DFT-SOFM-RBFNN is an efficient and highly accurate method for Y-NO homolysis BDE. The method may be used as a tool to design new NO carrier molecules. PMID:22942689

  15. Bayesian molecular design with a chemical language model

    NASA Astrophysics Data System (ADS)

    Ikebata, Hisaki; Hongo, Kenta; Isomura, Tetsu; Maezono, Ryo; Yoshida, Ryo

    2017-04-01

    The aim of computational molecular design is the identification of promising hypothetical molecules with a predefined set of desired properties. We address the issue of accelerating the material discovery with state-of-the-art machine learning techniques. The method involves two different types of prediction; the forward and backward predictions. The objective of the forward prediction is to create a set of machine learning models on various properties of a given molecule. Inverting the trained forward models through Bayes' law, we derive a posterior distribution for the backward prediction, which is conditioned by a desired property requirement. Exploring high-probability regions of the posterior with a sequential Monte Carlo technique, molecules that exhibit the desired properties can computationally be created. One major difficulty in the computational creation of molecules is the exclusion of the occurrence of chemically unfavorable structures. To circumvent this issue, we derive a chemical language model that acquires commonly occurring patterns of chemical fragments through natural language processing of ASCII strings of existing compounds, which follow the SMILES chemical language notation. In the backward prediction, the trained language model is used to refine chemical strings such that the properties of the resulting structures fall within the desired property region while chemically unfavorable structures are successfully removed. The present method is demonstrated through the design of small organic molecules with the property requirements on HOMO-LUMO gap and internal energy. The R package iqspr is available at the CRAN repository.
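
    The backward prediction described here rests on Bayes' law: the posterior over structures is proportional to the forward-model likelihood of hitting the desired property region times a prior over structures. The sketch below shows that step on a tiny enumerated candidate set. It is a toy under stated assumptions: the SMILES strings and probabilities are invented, and exhaustive enumeration stands in for the sequential Monte Carlo sampling and language-model prior used in the paper.

```python
def backward_posterior(prior, likelihood):
    """Posterior p(S | Y in U) over a discrete candidate set via Bayes' law.

    prior: dict mapping candidate -> p(S)
    likelihood: dict mapping candidate -> p(Y in U | S) from a forward model
    """
    unnormalized = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no candidate has non-zero posterior mass")
    return {s: w / total for s, w in unnormalized.items()}

# Hypothetical candidates with made-up probabilities.
prior = {"CCO": 0.5, "CCN": 0.3, "CC=O": 0.2}
likelihood = {"CCO": 0.1, "CCN": 0.6, "CC=O": 0.3}
posterior = backward_posterior(prior, likelihood)
best = max(posterior, key=posterior.get)
```

    In this toy case the candidate with the highest prior is not the one with the highest posterior, because the likelihood term reweights candidates toward the desired property region.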

  16. The electronegativity equalization method and the split charge equilibration applied to organic systems: parametrization, validation, and comparison.

    PubMed

    Verstraelen, Toon; Van Speybroeck, Veronique; Waroquier, Michel

    2009-07-28

    An extensive benchmark of the electronegativity equalization method (EEM) and the split charge equilibration (SQE) model on a very diverse set of organic molecules is presented. These models efficiently compute atomic partial charges and are used in the development of polarizable force fields. The predicted partial charges that depend on empirical parameters are calibrated to reproduce results from quantum mechanical calculations. SQE was recently presented as an extension of the EEM to obtain the correct size dependence of the molecular polarizability. In this work, 12 parametrization protocols are applied to each model and the optimal parameters are benchmarked systematically. The training data for the empirical parameters comprise MP2/Aug-CC-pVDZ calculations on 500 organic molecules containing the elements H, C, N, O, F, S, Cl, and Br. These molecules have been selected by an ingenious and autonomous protocol from an initial set of almost 500,000 small organic molecules. It is clear that the SQE model outperforms the EEM in all benchmark assessments. When using Hirshfeld-I charges for the calibration, the SQE model optimally reproduces the molecular electrostatic potential from the ab initio calculations. Applications on chain molecules, i.e., alkanes, alkenes, and alpha alanine helices, confirm that the EEM gives rise to a divergent behavior for the polarizability, while the SQE model shows the correct trends. We conclude that the SQE model is an essential component of a polarizable force field, showing several advantages over the original EEM.
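
    For readers unfamiliar with the EEM, a minimal sketch of how it yields partial charges may help. It assumes the common textbook energy expression E(q) = Σ_i (χ_i q_i + ½ η_i q_i²) + Σ_{i<j} q_i q_j / r_ij in units where the Coulomb constant is 1, minimized subject to a fixed total charge. The parameter values are invented, and the functions `solve_linear` and `eem_charges` are illustrative helpers, not the parametrization protocols of the paper.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x

def eem_charges(chi, eta, coords, total_charge=0.0):
    """Return (charges, mu), where mu is the equalized electronegativity.

    Stationarity gives, for each atom i:
        chi_i + eta_i q_i + sum_{j != i} q_j / r_ij = mu
    together with the constraint sum_i q_i = total_charge.
    """
    n = len(chi)
    a = [[0.0] * (n + 1) for _ in range(n + 1)]
    b = [0.0] * (n + 1)
    for i in range(n):
        for j in range(n):
            a[i][j] = eta[i] if i == j else 1.0 / math.dist(coords[i], coords[j])
        a[i][n] = -1.0   # coefficient of mu
        a[n][i] = 1.0    # total-charge constraint row
        b[i] = -chi[i]
    b[n] = total_charge
    solution = solve_linear(a, b)
    return solution[:n], solution[n]

# Hypothetical diatomic: made-up electronegativities, hardnesses, and geometry.
charges, mu = eem_charges(chi=[1.0, 2.0], eta=[4.0, 5.0],
                          coords=[(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
```

    For a neutral diatomic the same equations reduce to q_1 = (χ_2 − χ_1) / (η_1 + η_2 − 2/r), so the more electronegative atom carries the negative charge. SQE replaces the atomic charges with bond-wise split charges and is not covered by this sketch.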

  17. Bayesian molecular design with a chemical language model.

    PubMed

    Ikebata, Hisaki; Hongo, Kenta; Isomura, Tetsu; Maezono, Ryo; Yoshida, Ryo

    2017-04-01

    The aim of computational molecular design is the identification of promising hypothetical molecules with a predefined set of desired properties. We address the issue of accelerating the material discovery with state-of-the-art machine learning techniques. The method involves two different types of prediction; the forward and backward predictions. The objective of the forward prediction is to create a set of machine learning models on various properties of a given molecule. Inverting the trained forward models through Bayes' law, we derive a posterior distribution for the backward prediction, which is conditioned by a desired property requirement. Exploring high-probability regions of the posterior with a sequential Monte Carlo technique, molecules that exhibit the desired properties can computationally be created. One major difficulty in the computational creation of molecules is the exclusion of the occurrence of chemically unfavorable structures. To circumvent this issue, we derive a chemical language model that acquires commonly occurring patterns of chemical fragments through natural language processing of ASCII strings of existing compounds, which follow the SMILES chemical language notation. In the backward prediction, the trained language model is used to refine chemical strings such that the properties of the resulting structures fall within the desired property region while chemically unfavorable structures are successfully removed. The present method is demonstrated through the design of small organic molecules with the property requirements on HOMO-LUMO gap and internal energy. The R package iqspr is available at the CRAN repository.

  18. QSAR and molecular modelling studies on B-DNA recognition of minor groove binders.

    PubMed

    de Oliveira, André Mauricio; Custódio, Flávia Beatriz; Donnici, Cláudio Luis; Montanari, Carlos Alberto

    2003-02-01

    Aromatic bisamidines have been proved to be efficient compounds against Leishmania spp. and Pneumocystis carinii. Although the mode of action is still not known, these molecules are supposed to be DNA minor groove binders (MGBs). This paper describes a molecular modelling study for a set of MGBs in order to rank them through their complementarity to the Dickerson Drew Dodecamer (DDD) according to their interaction energies with B-DNA. A comparative molecular field analysis (CoMFA) has shown the importance of relatively bulky positively charged groups attached to the MGB aromatic rings, and small and negatively charged substituents into the middle chain. Models were obtained for DNA denaturation related to H-bonding processes of binding modes. Validation of the model demonstrated the robustness of CoMFA in terms of independent test set of similar MGBs. GRID results allotted bioisosteric substitution of -O- by -NH- in furan ring of furamidine and related compounds as being capable to enhance the binding to DDD.

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gesta, E.; Intelligent Insect Control, 118 Chemin des Alouettes, Castelnau-le-Lez, 34170; Skovmand, O., E-mail: osk@insectcontrol.net

    The purpose of this study is to understand the influence of the yarn processing on the migration of additive molecules, especially insecticide, within polyethylene (PE) yarns. Yarns were manufactured in the laboratory focusing on three key steps (spinning, post-stretching and heat-setting). The influence of each step on yarn properties was investigated using tensile tests, differential scanning calorimetry and wide-angle X-ray diffraction. The post-stretching step proved to be critical in defining yarn mechanical and structural properties. Although a first orientation of polyethylene crystals was induced during spinning, the optimal orientation was only reached by post-stretching. The results also showed that heat-setting did not significantly change these properties. The presence of additive crystals at the yarn surface was evidenced by scanning-electron microscopy. These studies performed at each yarn production step allowed a detailed analysis of the additives' ability to migrate. It is concluded that while post-stretching decreased the migration rate, heat-setting seems to boost this migration.

  20. Vascular endothelial growth factor receptor-2 (VEGFR-2) inhibitors: development and validation of predictive 3-D QSAR models through extensive ligand- and structure-based approaches

    NASA Astrophysics Data System (ADS)

    Ragno, Rino; Ballante, Flavio; Pirolli, Adele; Wickersham, Richard B.; Patsilinakos, Alexandros; Hesse, Stéphanie; Perspicace, Enrico; Kirsch, Gilbert

    2015-08-01

    Vascular endothelial growth factor receptor-2 (VEGFR-2) is a key element in angiogenesis, the process by which new blood vessels are formed, and is thus an important pharmaceutical target. Here, 3-D quantitative structure-activity relationship (3-D QSAR) methods were used to build a quantitative screening and pharmacophore model of the VEGFR-2 receptors for the design of inhibitors with improved activities. Most of the available experimental data were used as a training set to derive eight optimized and fully cross-validated mono-probe models and one multi-probe quantitative model. Notable is the use of 262 molecules, aligned following both structure-based and ligand-based protocols, as an external test set, which confirmed the 3-D QSAR models' predictive capability and their usefulness in designing new VEGFR-2 inhibitors. From a survey of the literature, this is the first generation of a wide-ranging computational medicinal chemistry application on VEGFR-2 inhibitors.

  1. Influence of protonation, tautomeric, and stereoisomeric states on protein-ligand docking results.

    PubMed

    ten Brink, Tim; Exner, Thomas E

    2009-06-01

    In this work, we present a systematic investigation of the influence of ligand protonation states, stereoisomers, and tautomers on results obtained with the two protein-ligand docking programs GOLD and PLANTS. These different states were generated with a fully automated tool, called SPORES (Structure PrOtonation and Recognition System). First, the most probable protonations, as defined by this rule-based system, were compared to the ones stored in the well-known, manually revised CCDC/ASTEX data set. Then, to investigate the influence of the ligand protonation state on the docking results, different protonation states were created. Redocking and virtual screening experiments were conducted, demonstrating that both docking programs have problems in identifying the correct protomer for each complex. Therefore, a preselection of plausible protomers or the improvement of the scoring functions concerning their ability to rank different molecules/states is needed. Additionally, ligand stereoisomers were tested for a subset of the CCDC/ASTEX set, showing similar problems regarding the ranking of these stereoisomers as the ranking of the protomers.

  2. An Optimization-Based Framework for the Transformation of Incomplete Biological Knowledge into a Probabilistic Structure and Its Application to the Utilization of Gene/Protein Signaling Pathways in Discrete Phenotype Classification.

    PubMed

    Esfahani, Mohammad Shahrokh; Dougherty, Edward R

    2015-01-01

    Phenotype classification via genomic data is hampered by small sample sizes that negatively impact classifier design. Utilization of prior biological knowledge in conjunction with training data can improve both classifier design and error estimation via the construction of the optimal Bayesian classifier. In the genomic setting, gene/protein signaling pathways provide a key source of biological knowledge. Although these pathways are neither complete, nor regulatory, with no timing associated with them, they are capable of constraining the set of possible models representing the underlying interaction between molecules. The aim of this paper is to provide a framework and the mathematical tools to transform signaling pathways to prior probabilities governing uncertainty classes of feature-label distributions used in classifier design. Structural motifs extracted from the signaling pathways are mapped to a set of constraints on a prior probability on a Multinomial distribution. Because the Dirichlet distribution is the conjugate prior for the Multinomial distribution, we propose optimization paradigms to estimate the parameters of a Dirichlet distribution in the Bayesian setting. The performance of the proposed methods is tested on two widely studied pathways: mammalian cell cycle and a p53 pathway model.
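
    The practical appeal of a Dirichlet prior on a Multinomial distribution is that the posterior is again Dirichlet, with the observed counts simply added to the prior parameters. The sketch below shows only this standard conjugate update. The prior parameters and counts are made up, and the pathway-constrained optimization that the paper uses to choose the prior is not reproduced.

```python
def dirichlet_posterior(alpha, counts):
    """Conjugate update: Dirichlet(alpha) prior plus multinomial counts."""
    if len(alpha) != len(counts):
        raise ValueError("alpha and counts must have the same length")
    return [a + c for a, c in zip(alpha, counts)]

def dirichlet_mean(alpha):
    """Expected category probabilities under Dirichlet(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]

# Hypothetical prior (as if derived from pathway knowledge) and small sample.
prior_alpha = [2.0, 1.0, 1.0]
observed_counts = [3, 0, 1]
posterior_alpha = dirichlet_posterior(prior_alpha, observed_counts)
posterior_mean = dirichlet_mean(posterior_alpha)
```

    With few samples the prior parameters carry much of the weight in the posterior mean, which is the regime the abstract is concerned with.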

  3. Consistent assignment of the vibrations of symmetric and asymmetric meta-disubstituted benzenes

    NASA Astrophysics Data System (ADS)

    Kemp, David J.; Tuttle, William D.; Jones, Florence M. S.; Gardner, Adrian M.; Andrejeva, Anna; Wakefield, Jonathan C. A.; Wright, Timothy G.

    2018-04-01

    The assignment of vibrational structure in spectra gives valuable insights into geometric and electronic structure changes upon electronic excitation or ionization; particularly when such information is available for families of molecules. We give a description of the phenyl-ring-localized vibrational modes of the ground (S0) electronic states of sets of meta-disubstituted benzene molecules including both symmetrically- and asymmetrically-substituted cases. As in our earlier work on monosubstituted benzenes (Gardner and Wright, 2011), para-disubstituted benzenes (Andrejeva et al., 2016), and ortho-disubstituted benzenes (Tuttle et al., 2018), we conclude that the use of the commonly-used Wilson or Varsányi mode labels, which are based on the vibrational motions of benzene itself, is misleading and ambiguous. Instead, we label the phenyl-ring-localized modes consistently based upon the Mulliken (Herzberg) method for the modes of meta-difluorobenzene (mDFB) under Cs symmetry, since we wish the labelling scheme to cover both symmetrically- and asymmetrically-substituted molecules. By studying the vibrational wavenumbers obtained from the same force-field while varying the mass of the substituent, we are able to follow the evolving modes across a wide range of molecules and hence provide consistent assignments. We assign the vibrations of the following sets of molecules: the symmetric meta-dihalobenzenes, meta-xylene and resorcinol (meta-dihydroxybenzene); and the asymmetric meta-dihalobenzenes, meta-halotoluenes, meta-halophenols and meta-cresol. In the symmetrically-substituted species, we find two pairs of in-phase and out-of-phase carbon-substituent stretches, and this motion persists in asymmetrically-substituted molecules for heavier substituents; however, when at least one of the substituents is light, then we find that these evolve into localized carbon-substituent stretches.

  4. Insight of endo-1,4-xylanase II from Trichoderma reesei: conserved water-mediated H-bond and ion pairs interactions.

    PubMed

    Vijayakumar, Balakrishnan; Velmurugan, Devadasan

    2013-12-01

    Endo-1,4-Xylanase II is an enzyme which degrades the linear polysaccharide beta-1,4-xylan into xylose. This enzyme shows its highest activity around 55 °C, even without being stabilized by disulphide bridges. A set of nine high resolution crystal structures of Xylanase II (1.11-1.80 Å) from Trichoderma reesei were selected and analyzed in order to identify the invariant water molecules, ion pairs and water-mediated ionic interactions. The crystal structure (PDB-id: 2DFB) solved at highest resolution (1.11 Å) was chosen as the reference and the remaining structures were treated as mobile molecules. These structures were then superimposed with the reference molecule to observe the invariant water molecules using a 3-dimensional structural superposition server. A total of 37 water molecules were identified to be invariant molecules in all the crystal structures, of which 26 invariant molecules have hydrogen bond interactions with the backbone of residues and 21 invariant water molecules have interactions with side chain residues. The structural and functional roles of these water molecules and ion pairs have been discussed. The results show that the invariant water molecules and ion pairs may be involved in maintaining the structural architecture, dynamics and function of the Endo-1,4-Xylanase II.
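
    One simple way to picture the invariant-water search: after superposition onto the reference, a reference water counts as invariant if every other structure has a water oxygen close to the same position. The sketch below assumes the coordinates are already superimposed and uses a 1.0 Å cutoff. The cutoff, the coordinates, and the function name are illustrative assumptions, since the abstract does not state the criterion used.

```python
import math

def invariant_waters(reference_waters, other_structures, cutoff=1.0):
    """Return reference water positions that have a partner within `cutoff`
    (angstroms) in every other, already superimposed, structure."""
    invariant = []
    for ref in reference_waters:
        if all(any(math.dist(ref, w) <= cutoff for w in waters)
               for waters in other_structures):
            invariant.append(ref)
    return invariant

# Made-up water oxygen coordinates for a reference and two mobile structures.
reference = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
others = [
    [(0.3, 0.0, 0.0), (9.0, 9.0, 9.0)],
    [(0.0, 0.4, 0.0), (5.1, 5.0, 5.0)],
]
conserved = invariant_waters(reference, others)
```

    Here the second reference water is rejected because one of the mobile structures has no water near it, even though the other does.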

  5. Exploring Polypharmacology Using a ROCS-Based Target Fishing Approach

    DTIC Science & Technology

    2012-01-01

    target representatives. Target profiles were then generated for a given query molecule by computing maximal shape/chemistry overlap between the query...molecule and the drug sets assigned to each protein target. The overlap was computed using the program ROCS (Rapid Overlay of Chemical Structures). We...approaches in off-target prediction has been reviewed.9,10 Many structure-based target fishing (SBTF) approaches, such as INVDOCK11 and Target Fishing Dock
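
    The profile construction described in this excerpt, taking the maximal overlap between the query and the drugs assigned to each target, can be sketched as below. ROCS itself is not called: a Tanimoto score on invented feature sets stands in for the shape/chemistry overlap, and all names and data are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two feature sets (stand-in for a ROCS score)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def target_profile(query, drugs_by_target, score=tanimoto):
    """Map each target to the best score between the query and its drug set."""
    return {target: max(score(query, drug) for drug in drugs)
            for target, drugs in drugs_by_target.items() if drugs}

# Made-up feature sets; real use would pass molecules and an overlap function.
query = {"aromatic", "donor", "acceptor"}
drugs_by_target = {
    "target_A": [{"aromatic", "donor"}, {"cation"}],
    "target_B": [{"acceptor", "anion", "hydrophobe"}],
}
profile = target_profile(query, drugs_by_target)
ranked = sorted(profile, key=profile.get, reverse=True)
```

    Taking the maximum means a single closely matching drug is enough to rank a target highly, regardless of how dissimilar its other assigned drugs are.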

  6. Theoretical research program to study chemical reactions in AOTV bow shock tubes

    NASA Technical Reports Server (NTRS)

    Taylor, Peter R.

    1993-01-01

    The main focus was the development, implementation, and calibration of methods for performing molecular electronic structure calculations to high accuracy. These various methods were then applied to a number of chemical reactions and species of interest to NASA, notably in the area of combustion chemistry. Among the development work undertaken was a collaborative effort to develop a program to efficiently predict molecular structures and vibrational frequencies using energy derivatives. Another major development effort involved the design of new atomic basis sets for use in chemical studies: these sets were considerably more accurate than those previously in use. Much effort was also devoted to calibrating methods for computing accurate molecular wave functions, including the first reliable calibrations for realistic molecules using full CI results. A wide variety of application calculations were undertaken. One area of interest was the spectroscopy and thermochemistry of small molecules, including establishing small molecule binding energies to an accuracy rivaling, or even on occasion surpassing, the experiment. Such binding energies are essential input to modeling chemical reaction processes, such as combustion. Studies of large molecules and processes important in both hydrogen and hydrocarbon combustion chemistry were also carried out. Finally, some effort was devoted to the structure and spectroscopy of small metal clusters, with applications to materials science problems.

  7. Thermodynamics of Anharmonic Systems: Uncoupled Mode Approximations for Molecules

    DOE PAGES

    Li, Yi-Pei; Bell, Alexis T.; Head-Gordon, Martin

    2016-05-26

    The partition functions, heat capacities, entropies, and enthalpies of selected molecules were calculated using uncoupled mode (UM) approximations, where the full-dimensional potential energy surface for internal motions was modeled as a sum of independent one-dimensional potentials for each mode. The computational cost of such approaches scales the same with molecular size as standard harmonic oscillator vibrational analysis using harmonic frequencies (HO hf). To compute thermodynamic properties, a computational protocol for obtaining the energy levels of each mode was established. The accuracy of the UM approximation depends strongly on how the one-dimensional potentials of each mode are defined. If the potentials are determined by the energy as a function of displacement along each normal mode (UM-N), the accuracies of the calculated thermodynamic properties are not significantly improved versus the HO hf model. Significant improvements can be achieved by constructing potentials for internal rotations and vibrations using the energy surfaces along the torsional coordinates and the remaining vibrational normal modes, respectively (UM-VT). For hydrogen peroxide and its isotopologs at 300 K, UM-VT captures more than 70% of the partition functions on average. By contrast, the HO hf model and UM-N can capture no more than 50%. For a selected test set of C2 to C8 linear and branched alkanes and species with different moieties, the enthalpies calculated using the HO hf model, UM-N, and UM-VT are all quite accurate compared with reference values, though the RMS errors of the HO model and UM-N are slightly higher than that of UM-VT. However, the accuracies in entropy calculations differ significantly between these three models. For the same test set, the RMS error of the standard entropies calculated by UM-VT is 2.18 cal mol−1 K−1 at 1000 K. By contrast, the RMS errors obtained using the HO model and UM-N are 6.42 and 5.73 cal mol−1 K−1, respectively. For a test set composed of nine alkanes ranging from C5 to C8, the heat capacities calculated with the UM-VT model agree with the experimental values to within an RMS error of 0.78 cal mol−1 K−1, which is less than one-third of the RMS error of the HO hf (2.69 cal mol−1 K−1) and UM-N (2.41 cal mol−1 K−1) models.
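
    Because the potential is a sum of independent one-dimensional terms, the partition function factorizes into a product over modes, each computed from that mode's own energy levels. The sketch below illustrates this factorization with energies measured from each mode's lowest level. The level spacings are invented, harmonic levels are used only so the result can be compared with the closed-form harmonic expression, and the function names are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal mol^-1 K^-1

def mode_partition_function(levels, temperature):
    """Sum of Boltzmann factors over one mode's energy levels (kcal/mol),
    measured relative to that mode's lowest level."""
    e0 = min(levels)
    return sum(math.exp(-(e - e0) / (K_B * temperature)) for e in levels)

def uncoupled_partition_function(levels_per_mode, temperature):
    """Product of independent one-dimensional mode partition functions."""
    q = 1.0
    for levels in levels_per_mode:
        q *= mode_partition_function(levels, temperature)
    return q

def harmonic_levels(spacing, n_levels=200):
    """Harmonic oscillator levels spacing * (n + 1/2), truncated at n_levels."""
    return [spacing * (n + 0.5) for n in range(n_levels)]

T = 300.0
spacings = (0.5, 3.0)  # made-up level spacings in kcal/mol
modes = [harmonic_levels(s) for s in spacings]
q_um = uncoupled_partition_function(modes, T)

# Closed-form harmonic result for comparison.
q_closed = 1.0
for s in spacings:
    q_closed *= 1.0 / (1.0 - math.exp(-s / (K_B * T)))
```

    Replacing `harmonic_levels` with levels obtained from an anharmonic or torsional one-dimensional potential changes only the input lists, which is what distinguishes UM-N and UM-VT from the harmonic model in this picture.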

  8. Greater Real-Life Diagnostic Efficacy of Allergen Molecule-Based Diagnosis for Prescription of Immunotherapy in an Area with Multiple Pollen Exposure

    PubMed Central

    Saltabayeva, Ulbosin; Garib, Victoria; Morenko, Marina; Rosenson, Rafail; Ispayeva, Zhanat; Gatauova, Madina; Zulus, Loreta; Karaulov, Alexander; Gastager, Felix; Valenta, Rudolf

    2017-01-01

    Background Allergen molecule-based diagnosis has been suggested to facilitate the identification of disease-causing allergen sources and the prescription of allergen-specific immunotherapy (AIT). The aim of the current study was to compare allergen molecule-based IgE serology with allergen extract-based skin testing for the identification of the disease-causing allergen sources. The study was conducted in an area where patients are exposed to pollen from multiple sources (trees, grasses, and weeds) at the same time to compare the diagnostic efficiency of the 2 forms of diagnosis. Methods Patients from Astana, Kazakhstan, who suffered from pollen-induced allergy (n = 95) were subjected to skin prick testing (SPT) with a local panel of tree pollen, grass pollen, and weed pollen allergen extracts and IgE antibodies specific for marker allergen molecules (nArt v 1, nArt v 3, rAmb a 1, rPhl p 1, rPhl p 5, rBet v 1) were measured by ImmunoCAP. Direct and indirect costs for diagnosis based on SPT and marker allergen-based IgE serology as well as direct costs for immunotherapy depending on SPT and serological test results were calculated. Results The costs for SPT-based diagnosis per patient were lower than the costs for allergen molecule-based IgE serology. However, allergen molecule-based serology was more precise in detecting the disease-causing allergen sources. A lower number of immunotherapy treatments (n = 119) was needed according to molecular diagnosis as compared to extract-based diagnosis (n = 275), which considerably reduced the total costs for diagnosis and for a 3-year treatment from EUR 1,112.30 to 521.77 per patient. Conclusions The results from this real-life study show that SPT is less expensive than allergen molecule-based diagnostic testing, but molecular diagnosis allowed more precise prescription of immunotherapy which substantially reduced treatment costs and combined costs for diagnosis and treatment. PMID:28654920
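
    The size of the reported reductions is easier to judge as percentages. The short calculation below uses only the four figures quoted in the Results above; the variable names are illustrative.

```python
cost_extract_based = 1112.30  # EUR per patient with extract-based diagnosis
cost_molecular = 521.77       # EUR per patient with molecule-based diagnosis
treatments_extract = 275      # immunotherapy treatments, extract-based
treatments_molecular = 119    # immunotherapy treatments, molecule-based

saving = cost_extract_based - cost_molecular
saving_fraction = saving / cost_extract_based
treatment_reduction = 1 - treatments_molecular / treatments_extract

print(f"saving per patient: EUR {saving:.2f} ({saving_fraction:.1%})")
print(f"fewer treatments: {treatment_reduction:.1%}")
```

    This works out to a saving of EUR 590.53 per patient, roughly 53% of the combined cost, alongside about 57% fewer prescribed treatments.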

  9. Inclusion of orbital relaxation and correlation through the unitary group adapted open shell coupled cluster theory using non-relativistic and scalar relativistic Hamiltonians to study the core ionization potential of molecules containing light to medium-heavy elements

    NASA Astrophysics Data System (ADS)

    Sen, Sangita; Shee, Avijit; Mukherjee, Debashis

    2018-02-01

    The orbital relaxation attendant on ionization is particularly important for the core electron ionization potential (core IP) of molecules. The Unitary Group Adapted State Universal Coupled Cluster (UGA-SUMRCC) theory, recently formulated and implemented by Sen et al. [J. Chem. Phys. 137, 074104 (2012)], is very effective in capturing orbital relaxation accompanying ionization or excitation of both the core and the valence electrons [S. Sen et al., Mol. Phys. 111, 2625 (2013); A. Shee et al., J. Chem. Theory Comput. 9, 2573 (2013)] while preserving the spin-symmetry of the target states and using the neutral closed-shell spatial orbitals of the ground state. Our Ansatz invokes a normal-ordered exponential representation of spin-free cluster-operators. The orbital relaxation induced by a specific set of cluster operators in our Ansatz is good enough to eliminate the need for different sets of orbitals for the ground and the core-ionized states. We call the single configuration state function (CSF) limit of this theory the Unitary Group Adapted Open-Shell Coupled Cluster (UGA-OSCC) theory. The aim of this paper is to comprehensively explore the efficacy of our Ansatz to describe orbital relaxation, using both theoretical analysis and numerical performance. Whenever warranted, we also make appropriate comparisons with other coupled-cluster theories. A physically motivated truncation of the chains of spin-free T-operators is also made possible by the normal-ordering, and the operational resemblance to single reference coupled-cluster theory allows easy implementation. Our test case is the prediction of the 1s core IP of molecules containing a single light- to medium-heavy nucleus and thus, in addition to demonstrating the orbital relaxation, we have addressed the scalar relativistic effects on the accuracy of the IPs by using a hierarchy of spin-free Hamiltonians in conjunction with our theory. 
Additionally, the contribution of the spin-free component of the two-electron Gaunt term, not usually taken into consideration, has been estimated at the Self-Consistent Field (ΔSCF) level and is found to become increasingly important and eventually quite prominent for molecules with third period atoms and below. The accuracies of the IPs computed using UGA-OSCC are found to be of the same order as the Coupled Cluster Singles Doubles (ΔCCSD) values while being free from spin contamination. Since the UGA-OSCC uses a common set of orbitals for the ground state and the ion, it obviates the need for two N^5 AO-to-MO transformations, in contrast to the ΔCCSD method.

  10. Inclusion of orbital relaxation and correlation through the unitary group adapted open shell coupled cluster theory using non-relativistic and scalar relativistic Hamiltonians to study the core ionization potential of molecules containing light to medium-heavy elements.

    PubMed

    Sen, Sangita; Shee, Avijit; Mukherjee, Debashis

    2018-02-07

    The orbital relaxation attendant on ionization is particularly important for the core electron ionization potential (core IP) of molecules. The Unitary Group Adapted State Universal Coupled Cluster (UGA-SUMRCC) theory, recently formulated and implemented by Sen et al. [J. Chem. Phys. 137, 074104 (2012)], is very effective in capturing orbital relaxation accompanying ionization or excitation of both the core and the valence electrons [S. Sen et al., Mol. Phys. 111, 2625 (2013); A. Shee et al., J. Chem. Theory Comput. 9, 2573 (2013)] while preserving the spin-symmetry of the target states and using the neutral closed-shell spatial orbitals of the ground state. Our Ansatz invokes a normal-ordered exponential representation of spin-free cluster-operators. The orbital relaxation induced by a specific set of cluster operators in our Ansatz is good enough to eliminate the need for different sets of orbitals for the ground and the core-ionized states. We call the single configuration state function (CSF) limit of this theory the Unitary Group Adapted Open-Shell Coupled Cluster (UGA-OSCC) theory. The aim of this paper is to comprehensively explore the efficacy of our Ansatz to describe orbital relaxation, using both theoretical analysis and numerical performance. Whenever warranted, we also make appropriate comparisons with other coupled-cluster theories. A physically motivated truncation of the chains of spin-free T-operators is also made possible by the normal-ordering, and the operational resemblance to single reference coupled-cluster theory allows easy implementation. Our test case is the prediction of the 1s core IP of molecules containing a single light- to medium-heavy nucleus and thus, in addition to demonstrating the orbital relaxation, we have addressed the scalar relativistic effects on the accuracy of the IPs by using a hierarchy of spin-free Hamiltonians in conjunction with our theory. 
Additionally, the contribution of the spin-free component of the two-electron Gaunt term, not usually taken into consideration, has been estimated at the Self-Consistent Field (ΔSCF) level and is found to become increasingly important and eventually quite prominent for molecules with third period atoms and below. The accuracies of the IPs computed using UGA-OSCC are found to be of the same order as the Coupled Cluster Singles Doubles (ΔCCSD) values while being free from spin contamination. Since the UGA-OSCC uses a common set of orbitals for the ground state and the ion, it obviates the need for two N^5 AO-to-MO transformations, in contrast to the ΔCCSD method.

  11. Stochastic voyages into uncharted chemical space produce a representative library of all possible drug-like compounds

    PubMed Central

    Virshup, Aaron M.; Contreras-García, Julia; Wipf, Peter; Yang, Weitao; Beratan, David N.

    2013-01-01

    The “small molecule universe” (SMU), the set of all synthetically feasible organic molecules of 500 Daltons molecular weight or less, is estimated to contain over 10^60 structures, making exhaustive searches for structures of interest impractical. Here, we describe the construction of a “representative universal library” spanning the SMU that samples the full extent of feasible small molecule chemistries. This library was generated using the newly developed Algorithm for Chemical Space Exploration with Stochastic Search (ACSESS). ACSESS makes two important contributions to chemical space exploration: it allows the systematic search of the unexplored regions of the small molecule universe, and it facilitates the mining of chemical libraries that do not yet exist, providing a near-infinite source of diverse novel compounds. PMID:23548177

  12. Internal vibrations of a molecule consisting of rigid segments. I - Non-interacting internal vibrations

    NASA Technical Reports Server (NTRS)

    He, X. M.; Craven, B. M.

    1993-01-01

    For molecular crystals, a procedure is proposed for interpreting experimentally determined atomic mean square anisotropic displacement parameters (ADPs) in terms of the overall molecular vibration together with internal vibrations with the assumption that the molecule consists of a set of linked rigid segments. The internal librations (molecular torsional or bending modes) are described using the variable internal coordinates of the segmented body. With this procedure, the experimental ADPs obtained from crystal structure determinations involving six small molecules (sym-trinitrobenzene, adenosine, tetra-cyanoquinodimethane, benzamide, alpha-cyanoacetic acid hydrazide and N-acetyl-L-tryptophan methylamide) have been analyzed. As a consequence, vibrational corrections to the bond lengths and angles of the molecule are calculated as well as the frequencies and force constants for each internal torsional or bending vibration.

  13. A Diffusion-Based and Dynamic 3D-Printed Device That Enables Parallel in Vitro Pharmacokinetic Profiling of Molecules

    PubMed Central

    Lockwood, Sarah Y.; Meisel, Jayda E.; Monsma, Frederick J.; Spence, Dana M.

    2016-01-01

    The process of bringing a drug to market involves many steps, including the preclinical stage, where various properties of the drug candidate molecule are determined. These properties, which include drug absorption, distribution, metabolism, and excretion, are often displayed in a pharmacokinetic (PK) profile. While PK profiles are determined in animal models, in vitro systems that model in vivo processes are available, although each possesses shortcomings. Here, we present a 3D-printed, diffusion-based, and dynamic in vitro PK device. The device contains six flow channels, each with integrated porous membrane-based insert wells. The pores of these membranes enable drugs to freely diffuse back and forth between the flow channels and the inserts, thus enabling both loading and clearance portions of a standard PK curve to be generated. The device is designed to work with 96-well plate technology and consumes single-digit milliliter volumes to generate multiple PK profiles, simultaneously. Generation of PK profiles by use of the device was initially performed with fluorescein as a test molecule. Effects of such parameters as flow rate, loading time, volume in the insert well, and initial concentration of the test molecule were investigated. A prediction model was generated from this data, enabling the user to predict the concentration of the test molecule at any point along the PK profile within a coefficient of variation of ~5%. Depletion of the analyte from the well was characterized and was determined to follow first-order rate kinetics, indicated by statistically equivalent (p > 0.05) depletion half-lives that were independent of the starting concentration. A PK curve for an approved antibiotic, levofloxacin, was generated to show utility beyond the fluorescein test molecule. PMID:26727249
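
    As a worked illustration of the first-order depletion described above, the sketch below shows why a first-order half-life does not depend on the starting concentration. The rate constant and starting concentrations are hypothetical values chosen for illustration; they are not data from the study.

```python
import math

def concentration(c0, k, t):
    """First-order decay: C(t) = C0 * exp(-k * t)."""
    return c0 * math.exp(-k * t)

def half_life(k):
    """Half-life of a first-order process: t_1/2 = ln(2) / k."""
    return math.log(2) / k

# Hypothetical rate constant (per minute); not taken from the paper.
K_PER_MIN = 0.05

t_half = half_life(K_PER_MIN)
for c0 in (1.0, 10.0, 100.0):  # hypothetical starting concentrations
    remaining = concentration(c0, K_PER_MIN, t_half)
    # The remaining fraction at t_1/2 is one half for every C0.
    print(f"C0 = {c0:g}: fraction left at t_1/2 = {remaining / c0:.3f}")
```

    Because t_1/2 = ln(2)/k contains no concentration term, equal half-lives across different starting concentrations are consistent with first-order kinetics.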

  14. Theoretical Investigation of Single-Molecule Sensing Using Nanotube-Enhanced Circular Dichroism.

    PubMed

    Silva, Jaime; Milne, Bruce F; Nogueira, Fernando

    2018-06-19

    First-principles calculations have been used to investigate the potential use of circular dichroism (CD) spectroscopy in single-molecule sensing. Using a real-space implementation of time-dependent density functional theory (TDDFT), several systems involving single-walled carbon nanotubes (SWCNT) and small molecules have been studied to evaluate their CD response. Large induced CD (ICD) effects, differing for each test molecule, were observed in all SWCNT-molecule complexes. As the SWCNT used in this study shows no intrinsic CD response, the ICD spectra are the result of interaction with the small molecules. This finding is general and independent of the (a)chiral nature of the adsorbed molecule. Our results indicate that it is possible to design a system that uses SWCNT for detection of molecules using the change in CD spectrum of the system induced by adsorption of the molecule onto the SWCNT surface.

  15. FTIR, FT-RAMAN, NMR, spectra, normal co-ordinate analysis, NBO, NLO and DFT calculation of N,N-diethyl-4-methylpiperazine-1-carboxamide molecule

    NASA Astrophysics Data System (ADS)

    Muthu, S.; Elamurugu Porchelvi, E.

    2013-11-01

    The Fourier Transform Infrared (FT-IR) and FT-Raman spectra of N,N-diethyl-4-methylpiperazine-1-carboxamide (NND4MC) have been recorded and analyzed. The structure of the compound was optimized and the structural characteristics were determined by density functional theory (DFT) using the B3LYP method with 6-31G(d,p) and 6-311G(d,p) basis sets. The difference between the observed and scaled wavenumber values of most of the fundamentals is very small. The theoretically predicted FT-IR and FT-Raman spectra of the title molecule have been constructed. The detailed interpretation of the vibrational spectra has been carried out with the aid of normal coordinate analysis (NCA) following the scaled quantum mechanical force field methodology. Stability of the molecule arising from hyperconjugative interactions and charge delocalization has been analyzed using natural bond orbital (NBO) analysis. The results show that electron density (ED) in the σ* and π* antibonding orbitals and second order delocalization energies (E2) confirm the occurrence of intramolecular charge transfer (ICT) within the molecule. The electronic dipole moment (μD) and the first hyperpolarizability (βtot) values of the investigated molecule were computed using Density Functional Theory (DFT/B3LYP) with 6-31G(d,p) and 6-311G(d,p) basis sets. The calculated results also show that the NND4MC molecule may have microscopic nonlinear optical (NLO) behavior with nonzero values. Mulliken atomic charges of NND4MC were calculated. The 13C nuclear magnetic resonance (NMR) chemical shifts of the molecule were calculated by the gauge independent atomic orbital (GIAO) method and compared with experimental results. The UV-Vis spectrum of the compound was recorded. The theoretical electronic absorption spectra have been calculated using the CIS and TD-DFT methods. A study of the electronic properties, such as HOMO and LUMO energies and the molecular electrostatic potential (MEP), was also performed.

  16. Design of novel quinazolinone derivatives as inhibitors for 5HT7 receptor.

    PubMed

    Chitta, Aparna; Jatavath, Mohan Babu; Fatima, Sabiha; Manga, Vijjulatha

    2012-02-01

    To study the pharmacophore properties of quinazolinone derivatives as 5HT(7) inhibitors, the 3D QSAR methodologies Comparative Molecular Field Analysis (CoMFA) and Comparative Molecular Similarity Indices Analysis (CoMSIA) were applied; partial least squares (PLS) analysis was performed and QSAR models were generated. The derived models showed good statistical reliability in predicting the 5HT(7) inhibitory activity of the quinazolinone derivatives, based on molecular property fields such as steric, electrostatic, hydrophobic, hydrogen bond donor and hydrogen bond acceptor fields. This is evident from statistical parameters such as q(2) (cross-validated correlation coefficient) of 0.642 and 0.602 and r(2) (conventional correlation coefficient) of 0.937 and 0.908 for CoMFA and CoMSIA, respectively. The predictive ability of the models to determine 5HT(7) antagonistic activity was validated using a test set of 26 molecules that were not included in the training set; the predictive r(2) obtained for the test set was 0.512 and 0.541, respectively. Further, the results of the derived models are illustrated by means of contour maps, which give an insight into the interaction of the drug with the receptor. The molecular fields so obtained served as the basis for the design of twenty new ligands. In addition, ADME (Absorption, Distribution, Metabolism and Elimination) properties have been calculated in order to predict the relevant pharmaceutical properties, and the results conform to the required drug-like properties.
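
    To make the test-set statistic above concrete, the sketch below computes a predictive r(2) using one common definition from the 3D QSAR literature: 1 - PRESS/SD, where SD is taken about the training-set mean. Whether the authors used exactly this definition is an assumption, and all activity values shown are hypothetical.

```python
def predictive_r2(y_obs_test, y_pred_test, y_train_mean):
    """Predictive r^2 = 1 - PRESS / SD (one common definition).

    PRESS: sum of squared differences between observed and predicted
           test-set activities.
    SD:    sum of squared differences between observed test-set
           activities and the mean activity of the training set.
    """
    press = sum((o - p) ** 2 for o, p in zip(y_obs_test, y_pred_test))
    sd = sum((o - y_train_mean) ** 2 for o in y_obs_test)
    return 1.0 - press / sd

# Hypothetical activities (e.g. pKi values) for a small test set.
observed = [6.1, 7.4, 5.8, 8.0]
predicted = [6.4, 7.0, 6.1, 7.5]
train_mean = 6.5  # hypothetical mean activity of the training set

print(f"predictive r^2 = {predictive_r2(observed, predicted, train_mean):.3f}")
```

    A value of 1 means the test-set activities are reproduced exactly; a value of 0 means the model does no better than always predicting the training-set mean.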

  17. Single-Molecule Test for Markovianity of the Dynamics along a Reaction Coordinate.

    PubMed

    Berezhkovskii, Alexander M; Makarov, Dmitrii E

    2018-05-03

    In an effort to answer the much-debated question of whether the time evolution of common experimental observables can be described as one-dimensional diffusion in the potential of mean force, we propose a simple criterion that allows one to test whether the Markov assumption is applicable to a single-molecule trajectory x(t). This test does not involve fitting of the data to any presupposed model and can be applied to experimental data with relatively low temporal resolution.

  18. Evaluation of Complexation Ability Using a Sensor Electrode Chip Equipped with a Wireless Screening System

    PubMed Central

    Isoda, Takaaki; Urushibara, Ikuko; Sato, Hikaru; Yamauchi, Noriyoshi

    2012-01-01

    We fabricated an electrode chip with a structure coated by an insulation layer that contains dispersed SiO2 adsorbent particles modified by an amino group on a source-drain electrode. Voltage changes caused by chelate molecule adsorption onto electrode surfaces and by specific cation interactions were investigated. The detection of specific cations without the presence of chelate molecules on the free electrode was also examined. By comparing both sets of results, the complexation ability of the studied chelate molecules onto the electrode was evaluated. Five pairs of source-drain electrodes (×8 arrays) were fabricated on a glass substrate of 20 × 30 mm in size. The individual Au/Cr (1.0/0.1 μm thickness) electrodes had widths of 50 μm and an inter-electrode interval of 100 μm. The fabricated source-drain electrodes were further coated with an insulation layer comprising porous SiO2 particles modified with amino groups to adsorb the chelate molecules. The electrode chip was equipped with a handy-type sensor signal analyzer that was mounted on an amplifier circuit using a Miniship™ or a system in a packaged LSI device. For electrode surfaces containing different adsorbed chelate molecules, an increase in the sensor voltage depended on a combination of host-guest reactions and generally decreased in the following order: 5,10,15,20-tetrakis(N-methylpyridinium-4-yl)-21H,23H-porphine, tetrakis(p-toluenesulfonate) (TMPyP) as a Cu2+ chelator and Cu2+ > 2-nitroso-5-[N-n-propyl-N-(3-sulfopropyl)amino]phenol (nitroso-PSAP) as an Fe2+ chelator and Fe2+ > 4,7-diphenyl-1,10-phenanthrolinedisulfonic acid, disodium salt (BPDSA) as an Fe2+ chelator and Fe2+ > 3-[3-(2,4-dimethylphenylcarbamoyl)-2-hydroxynaphthalene-1-yl-azo]-4-hydroxybenzenesulfonic acid, sodium salt (XB-1) as a Mg2+ chelator and Mg2+ > 2,9-dimethyl-4,7-diphenyl-1,10-phenanthrolinedisulfonic acid, disodium salt (BCIDSA) as a Cu2+ chelator and Cu2+, respectively. 
In contrast, for the electrode surfaces with adsorbed O,O′-bis(2-aminoethyl)ethyleneglycol-N,N,N′,N′-tetraacetic acid (GEDTA) or O,O′-bis(2-aminophenyl)ethyleneglycol-N,N,N′,N′-tetraacetic acid, tetrapotassium salt, hydrate (BAPTA) as a Ca2+ chelator, no increase in the detection voltage was found in any of the electrode tests conducted in the presence of Ca2+. To determine the differences in electrode detection, molecular orbital (MO) calculations of the chelate molecules and surface molecular modeling of the adsorbents were carried out. In accordance with frontier orbital theory, the lowest unoccupied MO (LUMO) of the chelate molecules can accept two lone pair electrons at the highest occupied MO (HOMO) of the amino group on the model surface structure of the SiO2 particle. As a result, a good correlation was obtained between the LUMO-HOMO difference and the ion response of all the electrodes tested. Based on the results obtained, the order of adsorbed chelate molecules on adsorption particles reflects the different metal ion detection abilities of the electrode chips. PMID:22969407

  19. Experimental validation of FINDSITEcomb virtual ligand screening results for eight proteins yields novel nanomolar and micromolar binders

    PubMed Central

    2014-01-01

    Background Identification of ligand-protein binding interactions is a critical step in drug discovery. Experimental screening of large chemical libraries, in spite of its specific role and importance in drug discovery, suffers from the disadvantages of being random, time-consuming and expensive. To accelerate the process, traditional structure- or ligand-based VLS approaches are combined with experimental high-throughput screening (HTS). Often a single protein or, at most, a protein family is considered. Large scale VLS benchmarking across diverse protein families is rarely done, and the reported success rate is very low. Here, we demonstrate the experimental HTS validation of a novel VLS approach, FINDSITEcomb, across a diverse set of medically-relevant proteins. Results For eight different proteins belonging to different fold-classes and from diverse organisms, the top 1% of FINDSITEcomb’s VLS predictions were tested, and depending on the protein target, 4%-47% of the predicted ligands were shown to bind with μM or better affinities. In total, 47 small molecule binders were identified. Low nanomolar (nM) binders for dihydrofolate reductase and protein tyrosine phosphatases (PTPs) and micromolar binders for the other proteins were identified. Six novel molecules had cytotoxic activity (<10 μg/ml) against the HCT-116 colon carcinoma cell line and one novel molecule had potent antibacterial activity. Conclusions We show that FINDSITEcomb is a promising new VLS approach that can assist drug discovery. PMID:24936211

  20. Biosimilar therapeutics-what do we need to consider?

    PubMed

    Schellekens, Huub

    2009-01-01

    Patents for the first generation of approved biopharmaceuticals have either expired or are about to expire. Thus the market is opening for generic versions, referred to as 'biosimilars' (European Union) or 'follow-on protein products' (United States). Healthcare professionals need to understand the critical issues surrounding the use of biosimilars to make informed treatment decisions. The complex high-molecular-weight three-dimensional structures of biopharmaceuticals, their heterogeneity and dependence on production in living cells make them different from classical chemical drugs. Current analytical methods cannot characterize these complex molecules sufficiently to confirm structural equivalence with reference molecules. Verification of the similarity of biosimilars to innovator biopharmaceuticals remains a key challenge. Furthermore, a critical safety issue, the immunogenicity of biopharmaceuticals, has been highlighted in recent years, confirming a need for comprehensive immunogenicity testing prior to approval and extended post-marketing surveillance. Biosimilars present a new set of challenges for regulatory authorities when compared with conventional generics. While the demonstration of a pharmacokinetic similarity is sufficient for conventional, small-molecule generic agents, a number of issues will make the approval of biosimilars more complicated. Documents recently published by the European Medicines Agency (EMEA) outlining requirements for the market approval of biosimilars provide much-needed guidance. The EMEA has approved a number of biosimilar products in a scientifically rigorous and balanced process. Outstanding issues include the interchangeability of biosimilars and innovator products, the possible need for unique naming to differentiate the various biopharmaceutical products, and more comprehensive labelling for biosimilars to include relevant clinical data.

  1. Partition coefficients of organic compounds between water and imidazolium-, pyridinium-, and phosphonium-based ionic liquids.

    PubMed

    Padró, Juan M; Pellegrino Vidal, Rocío B; Reta, Mario

    2014-12-01

    The partition coefficients, P_IL/w, of several compounds, some of them of biological and pharmacological interest, between water and room-temperature ionic liquids based on the imidazolium, pyridinium, and phosphonium cations, namely 1-octyl-3-methylimidazolium hexafluorophosphate, N-octylpyridinium tetrafluorophosphate, trihexyl(tetradecyl)phosphonium chloride, trihexyl(tetradecyl)phosphonium bromide, trihexyl(tetradecyl)phosphonium bis(trifluoromethylsulfonyl)imide, and trihexyl(tetradecyl)phosphonium dicyanamide, were accurately measured. In this way, we extended our database of partition coefficients in room-temperature ionic liquids previously reported. We employed the solvation parameter model with different probe molecules (the training set) to elucidate the chemical interactions involved in the partition process and discussed the most relevant differences among the three types of ionic liquids. The multiparametric equations obtained with the aforementioned model were used to predict the partition coefficients for compounds (the test set) not present in the training set, most being of biological and pharmacological interest. An excellent agreement between calculated and experimental log P_IL/w values was obtained. Thus, the obtained equations can be used to predict, a priori, the extraction efficiency for any compound using these ionic liquids as extraction solvents in liquid-liquid extractions.
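
    The prediction step described above can be sketched as follows, assuming the widely used Abraham form of the solvation parameter model, log P = c + eE + sS + aA + bB + vV. Whether the authors used exactly these terms is an assumption; the coefficients and solute descriptors below are hypothetical placeholders, not values from the paper.

```python
# Hypothetical system coefficients (c, e, s, a, b, v) for one ionic liquid,
# as would be obtained by fitting the training-set molecules.
COEFFICIENTS = {"c": 0.1, "e": 0.5, "s": -1.0, "a": -2.0, "b": -3.0, "v": 4.0}

def predict_log_p(descriptors, coefficients=COEFFICIENTS):
    """Evaluate log P = c + e*E + s*S + a*A + b*B + v*V for one solute.

    `descriptors` maps the solute descriptor names E, S, A, B, V to values.
    """
    return (
        coefficients["c"]
        + coefficients["e"] * descriptors["E"]
        + coefficients["s"] * descriptors["S"]
        + coefficients["a"] * descriptors["A"]
        + coefficients["b"] * descriptors["B"]
        + coefficients["v"] * descriptors["V"]
    )

# Hypothetical descriptors for a test-set compound.
solute = {"E": 0.8, "S": 0.5, "A": 0.0, "B": 0.3, "V": 1.0}
print(f"predicted log P = {predict_log_p(solute):.2f}")
```

    In this form, each coefficient weights one type of solute-solvent interaction, which is what allows the fitted equation to be read chemically as well as used for prediction.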

  2. Spectral properties of minimal-basis-set orbitals: Implications for molecular electronic continuum states

    NASA Astrophysics Data System (ADS)

    Langhoff, P. W.; Winstead, C. L.

    Early studies of the electronically excited states of molecules by John A. Pople and coworkers employing ab initio single-excitation configuration interaction (SECI) calculations helped to stimulate related applications of these methods to the partial-channel photoionization cross sections of polyatomic molecules. The Gaussian representations of molecular orbitals adopted by Pople and coworkers can describe SECI continuum states when sufficiently large basis sets are employed. Minimal-basis virtual Fock orbitals stabilized in the continuous portions of such SECI spectra are generally associated with strong photoionization resonances. The spectral attributes of these resonance orbitals are illustrated here by revisiting previously reported experimental and theoretical studies of molecular formaldehyde (H2CO) in combination with recently calculated continuum orbital amplitudes.

  3. Ligandbook: an online repository for small and drug-like molecule force field parameters.

    PubMed

    Domanski, Jan; Beckstein, Oliver; Iorga, Bogdan I

    2017-06-01

    Ligandbook is a public database and archive for force field parameters of small and drug-like molecules. It is a repository for parameter sets that are part of published work but are not easily available to the community otherwise. Parameter sets can be downloaded and immediately used in molecular dynamics simulations. The sets of parameters are versioned with full histories and carry unique identifiers to facilitate reproducible research. Text-based search on rich metadata and chemical substructure search allow precise identification of desired compounds or functional groups. Ligandbook enables the rapid set up of reproducible molecular dynamics simulations of ligands and protein-ligand complexes. Ligandbook is available online at https://ligandbook.org and supports all modern browsers. Parameters can be searched and downloaded without registration, including access through a programmatic RESTful API. Deposition of files requires free user registration. Ligandbook is implemented in the PHP Symfony2 framework with TCL scripts using the CACTVS toolkit.

  4. Spherical and hyperspherical harmonics representation of van der Waals aggregates

    NASA Astrophysics Data System (ADS)

    Lombardi, Andrea; Palazzetti, Federico; Aquilanti, Vincenzo; Grossi, Gaia; Albernaz, Alessandra F.; Barreto, Patricia R. P.; Cruz, Ana Claudia P. S.

    2016-12-01

    The representation of the potential energy surfaces of atom-molecule or molecular-dimer interactions should account faithfully for the symmetry properties of the systems, preserving at the same time a compact analytical form. To this aim, the choice of a proper set of coordinates is a necessary precondition. Here we illustrate a description in terms of hyperspherical coordinates and the expansion of the intermolecular interaction energy in terms of hyperspherical harmonics, as a general method for building potential energy surfaces suitable for molecular dynamics simulations of van der Waals aggregates. Examples for the prototypical case of diatomic-molecule-diatomic-molecule interactions are shown.

  5. Generation of Murine Monoclonal Antibodies by Hybridoma Technology.

    PubMed

    Holzlöhner, Pamela; Hanack, Katja

    2017-01-02

    Monoclonal antibodies are universal binding molecules and are widely used in biomedicine and research. Nevertheless, the generation of these binding molecules is time-consuming and laborious due to the complicated handling and lack of alternatives. The aim of this protocol is to provide one standard method for the generation of monoclonal antibodies using hybridoma technology. This technology combines two steps. Step 1 is an appropriate immunization of the animal and step 2 is the fusion of B lymphocytes with immortal myeloma cells in order to generate hybrids possessing both parental functions, such as the production of antibody molecules and immortality. The generated hybridoma cells were then recloned and diluted to obtain stable monoclonal cell cultures secreting the desired monoclonal antibody in the culture supernatant. The supernatants were tested in enzyme-linked immunosorbent assays (ELISA) for antigen specificity. After the selection of appropriate cell clones, the cells were transferred to mass cultivation in order to produce the desired antibody molecule in large amounts. The purification of the antibodies is routinely performed by affinity chromatography. After purification, the antibody molecule can be characterized and validated for the final test application. The whole process takes 8 to 12 months of development, and there is a high risk that the antibody will not work in the desired test system.

  6. Surface functionalization of bioactive glasses with natural molecules of biological significance, Part I: Gallic acid as model molecule

    NASA Astrophysics Data System (ADS)

    Zhang, Xin; Ferraris, Sara; Prenesti, Enrico; Verné, Enrica

    2013-12-01

    Gallic acid (3,4,5-trihydroxybenzoic acid, GA) and its derivatives are a group of biomolecules (polyphenols) obtained from plants. They have effects which are potentially beneficial to health; for example, they are antioxidant, anticarcinogenic and antibacterial, as recently investigated in many fields such as medicine, food and plant sciences. The main drawbacks of these molecules are both low stability and bioavailability. In this research work the opportunity to graft GA to bioactive glasses is investigated, in order to deliver the undamaged biological molecule into the body, using the biomaterial surfaces as a localized carrier. GA was considered for functionalization since it is a good model molecule for polyphenols and presents several interesting biological activities, like antibacterial, antioxidant and anticarcinogenic properties. Two different silica based bioactive glasses (SCNA and CEL2), with different reactivity, were employed as substrates. UV photometry combined with the Folin-Ciocalteu reagent was adopted to test the concentration of GA in uptake solution after functionalization. This test verified how much GA consumption occurred with surface modification and it was also used on solid samples to test the presence of GA on functionalized glasses. XPS and SEM-EDS techniques were employed to characterize the modification of material surface properties and functional group composition before and after functionalization.

  7. Towards crystal structure prediction of complex organic compounds – a report on the fifth blind test

    PubMed Central

    Bardwell, David A.; Adjiman, Claire S.; Arnautova, Yelena A.; Bartashevich, Ekaterina; Boerrigter, Stephan X. M.; Braun, Doris E.; Cruz-Cabeza, Aurora J.; Day, Graeme M.; Della Valle, Raffaele G.; Desiraju, Gautam R.; van Eijck, Bouke P.; Facelli, Julio C.; Ferraro, Marta B.; Grillo, Damian; Habgood, Matthew; Hofmann, Detlef W. M.; Hofmann, Fridolin; Jose, K. V. Jovan; Karamertzanis, Panagiotis G.; Kazantsev, Andrei V.; Kendrick, John; Kuleshova, Liudmila N.; Leusen, Frank J. J.; Maleev, Andrey V.; Misquitta, Alston J.; Mohamed, Sharmarke; Needs, Richard J.; Neumann, Marcus A.; Nikylov, Denis; Orendt, Anita M.; Pal, Rumpa; Pantelides, Constantinos C.; Pickard, Chris J.; Price, Louise S.; Price, Sarah L.; Scheraga, Harold A.; van de Streek, Jacco; Thakur, Tejender S.; Tiwari, Siddharth; Venuti, Elisabetta; Zhitkov, Ilia K.

    2011-01-01

    Following on from the success of the previous crystal structure prediction blind tests (CSP1999, CSP2001, CSP2004 and CSP2007), a fifth such collaborative project (CSP2010) was organized at the Cambridge Crystallographic Data Centre. A range of methodologies was used by the participating groups in order to evaluate the ability of the current computational methods to predict the crystal structures of the six organic molecules chosen as targets for this blind test. The first four targets, two rigid molecules, one semi-flexible molecule and a 1:1 salt, matched the criteria for the targets from CSP2007, while the last two targets belonged to two new challenging categories – a larger, much more flexible molecule and a hydrate with more than one polymorph. Each group submitted three predictions for each target it attempted. There was at least one successful prediction for each target, and two groups were able to successfully predict the structure of the large flexible molecule as their first place submission. The results show that while not as many groups successfully predicted the structures of the three smallest molecules as in CSP2007, there is now evidence that methodologies such as dispersion-corrected density functional theory (DFT-D) are able to reliably do so. The results also highlight the many challenges posed by more complex systems and show that there are still issues to be overcome. PMID:22101543

  8. New rate coefficients of CS in collision with para- and ortho-H2 and astrophysical implications

    NASA Astrophysics Data System (ADS)

    Denis-Alpizar, Otoniel; Stoecklin, Thierry; Guilloteau, Stéphane; Dutrey, Anne

    2018-05-01

    Astronomers use the CS molecule as a gas mass tracer in dense regions of the interstellar medium, either to measure the gas density through multi-line observations or the level of turbulence. This necessarily requires knowledge of the rate coefficients with the most common colliders in the interstellar medium, He and H2. In the present work, the close coupling collisional rates are computed for the first thirty rotational states of CS in collision with para- and ortho-H2 using a recent rigid rotor potential energy surface. Some radiative transfer calculations, using typical astrophysical conditions, are also performed to test this new set of data and to compare with the existing ones.

  9. Laser photoactivation gibberellin molecules in the surface tissues of plants

    NASA Astrophysics Data System (ADS)

    Grishkanich, Alexander; Zhevlakov, Alexander; Kascheev, Sergey; Sidorov, Igor; Ruzankina, Julia; Yakovlev, Alexey; Mak, Andrey

    2016-03-01

    The experimental results presented in this study are early studies of germination, using Picea abies as an example, and were aimed at testing the germination of seeds and the development of morphology caused by the therapeutic effect of the laser radiation field in the early stages of development under the action of ultraviolet and red light at wavelengths of 405 nm and 640 nm. A set of seeds was irradiated at various energy doses within the same time. The experimental results were analyzed in parallel with a control group. In all analyzed seeds, the germination and growth of seedlings were studied. The results showed that the percentage of germination was higher than in the control group Samanids all of the recurrence options.

  10. Communication: Hilbert-space partitioning of the molecular one-electron density matrix with orthogonal projectors

    NASA Astrophysics Data System (ADS)

    Vanfleteren, Diederik; Van Neck, Dimitri; Bultinck, Patrick; Ayers, Paul W.; Waroquier, Michel

    2010-12-01

    A double-atom partitioning of the molecular one-electron density matrix is used to describe atoms and bonds. All calculations are performed in Hilbert space. The concept of atomic weight functions (familiar from Hirshfeld analysis of the electron density) is extended to atomic weight matrices. These are constructed to be orthogonal projection operators on atomic subspaces, which has significant advantages in the interpretation of the bond contributions. In close analogy to the iterative Hirshfeld procedure, self-consistency is built in at the level of atomic charges and occupancies. The method is applied to a test set of about 67 molecules, representing various types of chemical binding. A close correlation is observed between the atomic charges and the Hirshfeld-I atomic charges.

  11. The quantitative structure-insecticidal activity relationships from plant derived compounds against chikungunya and zika Aedes aegypti (Diptera:Culicidae) vector.

    PubMed

    Saavedra, Laura M; Romanelli, Gustavo P; Rozo, Ciro E; Duchowicz, Pablo R

    2018-01-01

    The insecticidal activity of a series of 62 plant derived molecules against the chikungunya, dengue and zika vector, the Aedes aegypti (Diptera:Culicidae) mosquito, is subjected to a Quantitative Structure-Activity Relationships (QSAR) analysis. The Replacement Method (RM) variable subset selection technique based on Multivariable Linear Regression (MLR) proves to be successful for exploring 4885 molecular descriptors calculated with Dragon 6. The predictive capability of the obtained models is confirmed through an external test set of compounds, Leave-One-Out (LOO) cross-validation and Y-Randomization. The present study constitutes a first necessary computational step for designing less toxic insecticides. Copyright © 2017 Elsevier B.V. All rights reserved.
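
    The validation steps named in this abstract (Leave-One-Out cross-validation and Y-Randomization of a multivariable linear regression model) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' procedure: the descriptor matrix, the coefficients, and the helper names (fit_mlr, predict_mlr, loo_q2, y_randomization) are all assumptions introduced here, and the Replacement Method descriptor selection is not reproduced.

```python
import numpy as np


def fit_mlr(X, y):
    """Least-squares fit of y on X with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Predict with coefficients returned by fit_mlr."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def loo_q2(X, y):
    """Leave-one-out Q^2 = 1 - PRESS / SS_tot (SS_tot about the overall mean of y)."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i          # hold out compound i
        coef = fit_mlr(X[mask], y[mask])  # refit on the remaining n - 1
        preds[i] = predict_mlr(coef, X[i:i + 1])[0]
    press = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / ss_tot


def y_randomization(X, y, n_rounds=50, seed=0):
    """Q^2 values obtained after scrambling the activities; these should be poor."""
    rng = np.random.default_rng(seed)
    return np.array([loo_q2(X, rng.permutation(y)) for _ in range(n_rounds)])


# Synthetic stand-in data (hypothetical): 62 compounds, 3 descriptors.
rng = np.random.default_rng(42)
X = rng.normal(size=(62, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.3, size=62)

q2_true = loo_q2(X, y)
q2_scrambled = y_randomization(X, y)
print(f"LOO Q^2 (real activities):      {q2_true:.3f}")
print(f"LOO Q^2 (scrambled, mean/max):  {q2_scrambled.mean():.3f} / {q2_scrambled.max():.3f}")
```

    A model that captures a genuine structure-activity relationship should give a much higher Q^2 on the real activities than on any scrambled copy; comparable values would point to chance correlation.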

  12. Quantum Chemical Approach to Estimating the Thermodynamics of Metabolic Reactions

    PubMed Central

    Jinich, Adrian; Rappoport, Dmitrij; Dunn, Ian; Sanchez-Lengeling, Benjamin; Olivares-Amaya, Roberto; Noor, Elad; Even, Arren Bar; Aspuru-Guzik, Alán

    2014-01-01

    Thermodynamics plays an increasingly important role in modeling and engineering metabolism. We present the first nonempirical computational method for estimating standard Gibbs reaction energies of metabolic reactions based on quantum chemistry, which can help fill in the gaps in the existing thermodynamic data. When applied to a test set of reactions from core metabolism, the quantum chemical approach is comparable in accuracy to group contribution methods for isomerization and group transfer reactions and for reactions not including multiply charged anions. The errors in standard Gibbs reaction energy estimates are correlated with the charges of the participating molecules. The quantum chemical approach is amenable to systematic improvements and holds potential for providing thermodynamic data for all of metabolism. PMID:25387603

  13. Spectroscopic studies (FTIR, FT-Raman and UV-Visible), normal coordinate analysis, NBO analysis, first order hyper polarizability, HOMO and LUMO analysis of (1R)-N-(Prop-2-yn-1-yl)-2,3-dihydro-1H-inden-1-amine molecule by ab initio HF and density functional methods.

    PubMed

    Muthu, S; Ramachandran, G

    2014-01-01

    The Fourier transform infrared (FT-IR) and FT-Raman spectra of (1R)-N-(Prop-2-yn-1-yl)-2,3-dihydro-1H-inden-1-amine (1RNPDA) were recorded in the regions 4000-400 cm⁻¹ and 4000-100 cm⁻¹, respectively. A complete assignment and analysis of the fundamental vibrational modes of the molecule were carried out. The observed fundamental modes were compared with the harmonic vibrational frequencies computed using the HF and DFT (B3LYP) methods, both with the 6-31G(d,p) basis set. The vibrational studies were interpreted in terms of Potential Energy Distribution (PED). The complete vibrational frequency assignments were made by Normal Co-ordinate Analysis (NCA) following the scaled quantum mechanical force field methodology (SQMFF). The first order hyperpolarizability (β0) of this molecular system and related properties (α, μ, and Δα) were calculated using the B3LYP/6-31G(d,p) method based on the finite-field approach. The thermodynamic functions of the title compound were also calculated with the above methods and basis set. A detailed interpretation of the infrared and Raman spectra of 1RNPDA is reported. The ¹H and ¹³C nuclear magnetic resonance (NMR) chemical shifts of the molecule, calculated using the GIAO method, agree with the experimental values. Stability of the molecule arising from hyper-conjugative interactions and charge delocalization has been analyzed using Natural Bond Orbital (NBO) analysis. The UV-vis spectrum of the compound was recorded, and electronic properties such as excitation energies, oscillator strengths and wavelengths were computed by TD-DFT/B3LYP using the 6-31G(d,p) basis set. The HOMO-LUMO energy gap reflects the chemical activity of the molecule. The observed and calculated wavenumbers are found to be in good agreement. The experimental spectra also coincide satisfactorily with the theoretically constructed spectra. Copyright © 2013 Elsevier B.V. All rights reserved.

  14. Adsorption of small molecules on the [Zn-Zn]2+ linkage in zeolite. A DFT study of ferrierite

    NASA Astrophysics Data System (ADS)

    Benco, Lubomir

    2017-02-01

    In zeolites, monovalent Zn(I) forms sub-nano [Zn-Zn]2+ particles stabilized in rings of the zeolite framework, which exhibit interesting catalytic properties. This work reports on adsorption properties of [Zn-Zn]2+ particles in zeolite ferrierite investigated for a set of diatomic (N2, O2, H2, CO, NO) and triatomic (CO2, N2O, NO2, H2O) probe molecules using dispersion-corrected DFT. Three [Zn-Zn]2+ sites differing in location and stability are compared. On all sites, molecules form physisorbed clusters with the molecule connected on top of the Zn-Zn linkage. In physisorbed clusters, adsorption induces only a slight change in the bonding and geometry of the Zn-Zn linkage. Some molecules can form stable chemisorbed clusters in which the molecule is integrated between two Zn+ cations. The sandwich-like chemisorption causes pronounced changes of bonding and can lead to the transfer of electron density between the two Zn+ cations and to a change of the oxidation state. Knowledge of the bonding of small molecules can aid understanding of the mechanism of conversion reactions catalyzed by sub-nano [Zn-Zn] particles.

  15. Pharmacophore modeling and virtual screening studies to design some potential histone deacetylase inhibitors as new leads.

    PubMed

    Vadivelan, S; Sinha, B N; Rambabu, G; Boppana, Kiran; Jagarlapudi, Sarma A R P

    2008-02-01

    Histone deacetylase is one of the important targets in the treatment of solid tumors and hematological cancers. A total of 20 well-defined inhibitors were used to generate pharmacophore models using the HypoGen module of Catalyst. These 20 molecules broadly represent 3 different chemotypes. The best HypoGen model consists of four pharmacophore features: one hydrogen bond acceptor, one hydrophobic aliphatic and two ring aromatic centers. This model was validated against 378 known HDAC inhibitors with a correlation of 0.897 as well as an enrichment factor of 2.68 against a maximum value of 3. This model was further used to retrieve molecules from the NCI database of 238,819 molecules. A total of 4638 molecules from this pool were identified as hits, while 297 molecules were indicated as highly active. Also, a similarity analysis was carried out for the set of 4638 hits with respect to the most active molecule of each chemotype, which not only validated the virtual screening potential of the model but also identified possible new chemotypes. This type of similarity analysis would prove to be efficient not only for lead generation but also for lead optimization.

  16. DNA origami as biocompatible surface to match single-molecule and ensemble experiments

    PubMed Central

    Gietl, Andreas; Holzmeister, Phil; Grohmann, Dina; Tinnefeld, Philip

    2012-01-01

    Single-molecule experiments on immobilized molecules allow unique insights into the dynamics of molecular machines and enzymes as well as their interactions. The immobilization, however, can invoke perturbation to the activity of biomolecules causing incongruities between single molecule and ensemble measurements. Here we introduce the recently developed DNA origami as a platform to transfer ensemble assays to the immobilized single molecule level without changing the nano-environment of the biomolecules. The idea is a stepwise transfer of common functional assays first to the surface of a DNA origami, which can be checked at the ensemble level, and then to the microscope glass slide for single-molecule inquiry using the DNA origami as a transfer platform. We studied the structural flexibility of a DNA Holliday junction and the TATA-binding protein (TBP)-induced bending of DNA both on freely diffusing molecules and attached to the origami structure by fluorescence resonance energy transfer. This resulted in highly congruent data sets demonstrating that the DNA origami does not influence the functionality of the biomolecule. Single-molecule data collected from surface-immobilized biomolecule-loaded DNA origami are in very good agreement with data from solution measurements supporting the fact that the DNA origami can be used as biocompatible surface in many fluorescence-based measurements. PMID:22523083

  17. Spectroscopic and molecular docking studies on N,N-di-tert-butoxycarbonyl (Boc)-2-amino pyridine: A potential bioactive agent for lung cancer treatment

    NASA Astrophysics Data System (ADS)

    Mohamed Asath, R.; Premkumar, R.; Mathavan, T.; Milton Franklin Benial, A.

    2017-09-01

    Potential energy surface scan was performed and the most stable molecular structure of the N,N-di-tert-butoxycarbonyl (Boc)-2-amino pyridine (DBAP) molecule was predicted. The most stable molecular structure of the molecule was optimized using B3LYP method with cc-pVTZ basis set. Anticancer activity of the DBAP molecule was evaluated by molecular docking analysis. The structural parameters and vibrational wavenumbers were calculated for the optimized molecular structure. The experimental and theoretical wavenumbers were assigned and compared. Ultraviolet-Visible spectrum was simulated and validated experimentally. The molecular electrostatic potential surface was simulated and Fukui function calculations were also carried out to investigate the reactive nature of the DBAP molecule. The natural bond orbital analysis was also performed to probe the intramolecular interactions and confirm the bioactivity of the DBAP molecule. The molecular docking analysis reveals the better inhibitory nature of the DBAP molecule against the epidermal growth factor receptor (EGFR) protein which causes lung cancer. Hence, the present study unveils the structural and bioactive nature of the title molecule. The DBAP molecule was identified as a potential inhibitor against the lung cancer which may be useful in further development of drug designing in the treatment of lung cancer.

  18. Effects of Functional Groups in Redox-Active Organic Molecules: A High-Throughput Screening Approach

    DOE PAGES

    Pelzer, Kenley M.; Cheng, Lei; Curtiss, Larry A.

    2016-12-08

    Nonaqueous redox flow batteries have attracted recent attention with their potential for high electrochemical storage capacity, with organic electrolytes serving as solvents with a wide electrochemical stability window. Organic molecules can also serve as electroactive species, where molecules with low reduction potentials or high oxidation potentials can provide substantial chemical energy. To identify promising electrolytes in a vast chemical space, high-throughput screening (HTS) of candidate molecules plays an important role, where HTS is used to calculate properties of thousands of molecules and identify a few organic molecules worthy of further attention in battery research. Here, in this work, we present reduction and oxidation potentials obtained from HTS of 4178 molecules. The molecules are composed of base groups of five- or six-membered rings with one or two functional groups attached, with the set of possible functional groups including both electron-withdrawing and electron-donating groups. In addition to observing the trends in potentials that result from differences in organic base groups and functional groups, we analyze the effects of molecular characteristics such as multiple bonds, Hammett parameters, and functional group position. In conclusion, this work provides useful guidance in determining how the identities of the base groups and functional groups are correlated with desirable reduction and oxidation potentials.

  19. Effects of Functional Groups in Redox-Active Organic Molecules: A High-Throughput Screening Approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pelzer, Kenley M.; Cheng, Lei; Curtiss, Larry A.

    Nonaqueous redox flow batteries have attracted recent attention with their potential for high electrochemical storage capacity, with organic electrolytes serving as solvents with a wide electrochemical stability window. Organic molecules can also serve as electroactive species, where molecules with low reduction potentials or high oxidation potentials can provide substantial chemical energy. To identify promising electrolytes in a vast chemical space, high-throughput screening (HTS) of candidate molecules plays an important role, where HTS is used to calculate properties of thousands of molecules and identify a few organic molecules worthy of further attention in battery research. Here, in this work, we present reduction and oxidation potentials obtained from HTS of 4178 molecules. The molecules are composed of base groups of five- or six-membered rings with one or two functional groups attached, with the set of possible functional groups including both electron-withdrawing and electron-donating groups. In addition to observing the trends in potentials that result from differences in organic base groups and functional groups, we analyze the effects of molecular characteristics such as multiple bonds, Hammett parameters, and functional group position. In conclusion, this work provides useful guidance in determining how the identities of the base groups and functional groups are correlated with desirable reduction and oxidation potentials.

  20. Eckart frame vibration-rotation Hamiltonians: Contravariant metric tensor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pesonen, Janne, E-mail: janne.pesonen@helsinki.fi

    2014-02-21

    The Eckart frame is a unique embedding in the theory of molecular vibrations and rotations. It is defined by the condition that the Coriolis coupling of the reference structure of the molecule is zero for every choice of the shape coordinates. It is far from trivial to set up Eckart kinetic energy operators (KEOs) when the shape of the molecule is described by curvilinear coordinates. In order to obtain the KEO, one needs to set up the corresponding contravariant metric tensor. Here, I derive explicitly the Eckart frame rotational measuring vectors. Their inner products with themselves give the rotational elements, and their inner products with the vibrational measuring vectors (which, in the absence of constraints, are the mass-weighted gradients of the shape coordinates) give the Coriolis elements of the contravariant metric tensor. The vibrational elements are given as the inner products of the vibrational measuring vectors with themselves, and these elements do not depend on the choice of the body-frame. The present approach has the advantage that it does not depend on any particular choice of the shape coordinates, but it can be used in conjunction with all shape coordinates. Furthermore, it does not involve evaluation of covariant metric tensors, chain rules of derivation, or numerical differentiation, and it can be easily modified if there are constraints on the shape of the molecule. Both the planar and non-planar reference structures are accounted for. The present method is particularly suitable for numerical work. Its computational implementation is outlined in an example, where I discuss how to evaluate vibration-rotation energies and eigenfunctions of a general N-atomic molecule, the shape of which is described by a set of local polyspherical coordinates.
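
    The vibrational block described in this abstract can be written out as a short worked formula. The notation below is an assumption introduced for illustration (shape coordinates $q_i$, nuclear masses $m_\alpha$, measuring vectors $\mathbf{e}^{(q_i)}_\alpha$) and is not taken from the paper; it covers only the unconstrained vibrational elements, not the rotational or Coriolis elements, whose Eckart-frame measuring vectors are derived in the paper itself.

```latex
% Vibrational measuring vectors as mass-weighted gradients of the shape
% coordinates (unconstrained case), with \nabla_\alpha the gradient with
% respect to the position of nucleus \alpha:
\mathbf{e}^{(q_i)}_\alpha = m_\alpha^{-1/2}\,\nabla_\alpha q_i ,
\qquad \alpha = 1,\dots,N .

% Vibrational elements of the contravariant metric tensor as inner products
% of the measuring vectors with themselves:
g^{(q_i q_j)}
  = \sum_{\alpha=1}^{N} \mathbf{e}^{(q_i)}_\alpha \cdot \mathbf{e}^{(q_j)}_\alpha
  = \sum_{\alpha=1}^{N} \frac{1}{m_\alpha}\,
    \nabla_\alpha q_i \cdot \nabla_\alpha q_j .
```

    Because these elements involve only gradients of the shape coordinates, they take the same form for any choice of body-fixed frame, consistent with the statement in the abstract.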
