Science.gov

Sample records for molecular structure descriptors

  1. Molecular Descriptors

    NASA Astrophysics Data System (ADS)

    Consonni, Viviana; Todeschini, Roberto

    In recent decades, much research has focused on how to convert, by a theoretical pathway, the information encoded in the molecular structure into one or more numbers used to establish quantitative relationships between structures and properties, biological activities, or other experimental properties. Molecular descriptors are formally mathematical representations of a molecule obtained by a well-specified algorithm applied to a defined molecular representation or a well-specified experimental procedure. They play a fundamental role in chemistry, pharmaceutical sciences, environmental protection policy, toxicology, ecotoxicology, health research, and quality control. Evidence of the scientific community's interest in molecular descriptors is provided by the huge number of descriptors proposed to date: more than 5000 descriptors derived from different theories and approaches are defined in the literature, and most of them can be calculated by means of dedicated software applications. Molecular descriptors are of outstanding importance in the research fields of quantitative structure-activity relationships (QSARs) and quantitative structure-property relationships (QSPRs), where they are the independent chemical information used to predict the properties of interest. Along with the definition of appropriate molecular descriptors, the molecular structure representation and the mathematical tools for deriving and assessing models are other fundamental components of the QSAR/QSPR approach. The remarkable progress during the last few years in chemometrics and chemoinformatics has led to new strategies for finding mathematically meaningful relationships between the molecular structure and the biological activities and physico-chemical, toxicological, and environmental properties of chemicals. Different approaches for deriving molecular descriptors are reviewed here, and some of the most relevant descriptors are presented in

  2. Reverse engineering chemical structures from molecular descriptors : how many solutions?

    SciTech Connect

    Brown, William Michael; Martin, Shawn Bryan; Faulon, Jean-Loup Michel

    2005-06-01

    Physical, chemical and biological properties are the ultimate information of interest for chemical compounds. Molecular descriptors that map structural information to activities and properties are obvious candidates for information sharing. In this paper, we consider the feasibility of using molecular descriptors to safely exchange chemical information in such a way that the original chemical structures cannot be reverse engineered. To investigate the safety of sharing such descriptors, we compute the degeneracy (the number of structures matching a descriptor value) of several 2D descriptors, and use various methods to search for and reverse engineer structures. We examine degeneracy in the entire chemical space, taking descriptor values from the alkane isomer series and the PubChem database. We further use a stochastic search to retrieve structures matching specific topological index values. Finally, we investigate the safety of exchanging fragmental descriptors using deterministic enumeration.
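
    The notion of degeneracy used in this abstract can be illustrated with a small sketch. The example below is not the authors' code: it assumes the Wiener index (the sum of topological distances over all atom pairs) as a representative 2D descriptor and uses the five hexane isomers as a tiny stand-in for the alkane isomer series. The function names and atom numbering are chosen here for illustration only.

```python
from collections import Counter, deque


def wiener_index(n_atoms, bonds):
    """Sum of shortest-path distances over all atom pairs of a molecular graph."""
    adjacency = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    total = 0
    for start in range(n_atoms):
        # Breadth-first search gives topological distances from `start`.
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    return total // 2  # every pair was counted twice


# Hydrogen-suppressed carbon skeletons of the five hexane isomers
# (atom numbering is arbitrary and chosen for this illustration).
HEXANE_ISOMERS = {
    "n-hexane": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)],
    "2-methylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5)],
    "3-methylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)],
    "2,3-dimethylbutane": [(0, 1), (1, 2), (2, 3), (1, 4), (2, 5)],
    "2,2-dimethylbutane": [(0, 1), (1, 2), (2, 3), (1, 4), (1, 5)],
}


def degeneracy(descriptor_values):
    """Map each descriptor value to the number of structures sharing it."""
    return Counter(descriptor_values)


wiener = {name: wiener_index(6, bonds) for name, bonds in HEXANE_ISOMERS.items()}
atom_count_degeneracy = degeneracy(6 for _ in HEXANE_ISOMERS)
wiener_degeneracy = degeneracy(wiener.values())
```

    On this small set the carbon count is shared by all five isomers (degeneracy 5), whereas each isomer has its own Wiener index (degeneracy 1). A low-degeneracy descriptor is therefore easier to reverse engineer, which is the trade-off the abstract examines.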

  3. Molecular descriptor subset selection in theoretical peptide quantitative structure-retention relationship model development using nature-inspired optimization algorithms.

    PubMed

    Žuvela, Petar; Liu, J Jay; Macur, Katarzyna; Bączek, Tomasz

    2015-10-01

    In this work, performance of five nature-inspired optimization algorithms, genetic algorithm (GA), particle swarm optimization (PSO), artificial bee colony (ABC), firefly algorithm (FA), and flower pollination algorithm (FPA), was compared in molecular descriptor selection for development of quantitative structure-retention relationship (QSRR) models for 83 peptides that originate from eight model proteins. The matrix with 423 descriptors was used as input, and QSRR models based on selected descriptors were built using partial least squares (PLS), whereas root mean square error of prediction (RMSEP) was used as a fitness function for their selection. Three performance criteria, prediction accuracy, computational cost, and the number of selected descriptors, were used to evaluate the developed QSRR models. The results show that all five variable selection methods outperform interval PLS (iPLS), sparse PLS (sPLS), and the full PLS model, whereas GA is superior because of its lowest computational cost and higher accuracy (RMSEP of 5.534%) with a smaller number of variables (nine descriptors). The GA-QSRR model was validated initially through Y-randomization. In addition, it was successfully validated with an external testing set out of 102 peptides originating from Bacillus subtilis proteomes (RMSEP of 22.030%). Its applicability domain was defined, from which it was evident that the developed GA-QSRR exhibited strong robustness. All the sources of the model's error were identified, thus allowing for further application of the developed methodology in proteomics. PMID:26346190
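
    The fitness function named in this abstract, the root mean square error of prediction (RMSEP), can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract reports RMSEP as a percentage, but the normalisation used is not stated there, so the sketch returns the error in the units of the response. The function name `rmsep` is chosen here.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction over paired observations."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

    In a descriptor-selection loop, a candidate subset with a lower RMSEP on the prediction set would be ranked as fitter.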

  4. Signature molecular descriptor : advanced applications.

    SciTech Connect

    Visco, Donald Patrick, Jr.

    2010-04-01

    In this work we report on the development of the Signature Molecular Descriptor (or Signature) for use in the solution of inverse design problems as well as in high-throughput screening applications. The ultimate goal of using Signature is to identify novel and non-intuitive chemical structures with optimal predicted properties for a given application. We demonstrate this in three studies: green solvent design, glucocorticoid receptor ligand design and the design of inhibitors for Factor XIa. In many areas of engineering, compounds are designed and/or modified in incremental ways which rely upon heuristics or institutional knowledge. Often multiple experiments are performed and the optimal compound is identified in this brute-force fashion. Perhaps a traditional chemical scaffold is identified and movement of a substituent group around a ring constitutes the whole of the design process. Also notably, a chemical being evaluated in one area might demonstrate properties very attractive in another area, with serendipity as the mechanism for solution. In contrast to such approaches, computer-aided molecular design (CAMD) looks to encompass both experimental and heuristic-based knowledge into a strategy that will design a molecule on a computer to meet a given target. Depending on the algorithm employed, the molecule which is designed might be quite novel (i.e., no CAS registration number) and/or non-intuitive relative to what is known about the problem at hand. While CAMD is a fairly recent strategy (dating to the early 1980s), it contains a variety of bottlenecks and limitations which have prevented the technique from garnering more attention in academic, governmental and industrial institutions. A main reason for this is how the molecules are described in the computer. This step can control how models are developed for the properties of interest on a given problem as well as how to go from an output of the algorithm to an actual chemical structure. This report

  5. Relationships between structure and binding affinity of humic substances for polycyclic aromatic hydrocarbons: Relevance of molecular descriptors

    SciTech Connect

    Perminova, I.V.; Grechishcheva, N.Y.; Petrosyan, V.S.

    1999-11-01

    Partition coefficients for the binding affinities of pyrene, fluoranthene, and anthracene to 26 different humic materials were determined by fluorescence quenching. Sources included isolated humic acids, fulvic acids, and combined humic and fulvic fractions from soil, peat, and freshwater as well as Aldrich humic acid. Each of the humic materials was characterized by elemental composition, ultraviolet absorbance at 280 nm, molecular weight, and, for 19 samples, composition of main structural fragments determined by ¹³C solution-state NMR. The magnitude of the K_oc values correlated strongly with the independent descriptors of aromaticity of humic materials, including atomic H/C ratio, absorptivity at 280 nm, and three interdependent ¹³C NMR descriptors (C_Ar-H,R, ΣC_Ar, ΣC_Ar/ΣC_Alk). Statistical comparison of humic sources grouped by origin revealed that binding affinities were best predicted by the ¹³C NMR descriptors, with a slight prevalence of the ΣC_Ar/ΣC_Alk ratio, while molecular weight was the poorest predictor. The latter produced either direct or inverse significant correlation with the K_oc values depending upon the origin and/or fractional composition of the grouped humic materials.

  6. ANN expert system screening for illicit amphetamines using molecular descriptors

    NASA Astrophysics Data System (ADS)

    Gosav, S.; Praisler, M.; Dorohoi, D. O.

    2007-05-01

    The goal of this study was to develop an artificial neural network (ANN) based on computed descriptors, which would be able to classify the molecular structures of potential illicit amphetamines and to derive their biological activity according to the similarity of their molecular structure with amphetamines of known toxicity. The system is necessary for testing new molecular structures for epidemiological, clinical, and forensic purposes. It was built using a database formed by 146 compounds representing drugs of abuse (mainly central stimulants, hallucinogens, sympathomimetic amines, narcotics and other potent analgesics), precursors, or derivatized counterparts. Their molecular structures were characterized by computing three types of descriptors: 38 constitutional descriptors (CDs), 69 topological descriptors (TDs) and 160 3D-MoRSE descriptors (3DDs). An ANN system was built for each category of variables. All three networks (CD-NN, TD-NN and 3DD-NN) were trained to distinguish between stimulant amphetamines, hallucinogenic amphetamines, and nonamphetamines. A selection of variables was performed when necessary. The efficiency with which each network identifies the class identity of an unknown sample was evaluated by calculating several figures of merit. The results of the comparative analysis are presented.

  7. The signature molecular descriptor. 3. Inverse-quantitative structure-activity relationship of ICAM-1 inhibitory peptides.

    PubMed

    Churchwell, Carla J; Rintoul, Mark D; Martin, Shawn; Visco, Donald P; Kotu, Archana; Larson, Richard S; Sillerud, Laurel O; Brown, David C; Faulon, Jean-Loup

    2004-03-01

    We present a methodology for solving the inverse-quantitative structure-activity relationship (QSAR) problem using the molecular descriptor called signature. This methodology is detailed in four parts. First, we create a QSAR equation that correlates the occurrence of a signature to the activity values using a stepwise multilinear regression technique. Second, we construct constraint equations, specifically the graphicality and consistency equations, which facilitate the reconstruction of the solution compounds directly from the signatures. Third, we solve the set of constraint equations, which are both linear and Diophantine in nature. Last, we reconstruct and enumerate the solution molecules and calculate their activity values from the QSAR equation. We apply this inverse-QSAR method to a small set of LFA-1/ICAM-1 peptide inhibitors to assist in the search and design of more-potent inhibitory compounds. Many novel inhibitors were predicted, a number of which are predicted to be more potent than the strongest inhibitor in the training set. Two of the more potent inhibitors were synthesized and tested in-vivo, confirming them to be the strongest inhibiting peptides to date. Some of these compounds can be recycled to train a new QSAR and develop a more focused library of lead compounds. PMID:15177078
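
    The third step of the methodology, solving linear Diophantine constraint equations, can be illustrated with a brute-force sketch. The graphicality and consistency equations themselves are not given in the abstract, so the system below is a hypothetical toy example, and the bounded exhaustive search is a stand-in chosen for clarity rather than the solver used by the authors.

```python
from itertools import product


def nonnegative_solutions(coefficients, targets, upper_bound):
    """Enumerate integer vectors x with 0 <= x_i <= upper_bound and A x = b."""
    n_unknowns = len(coefficients[0])
    solutions = []
    for candidate in product(range(upper_bound + 1), repeat=n_unknowns):
        # Keep the candidate only if every linear constraint holds exactly.
        if all(
            sum(a * x for a, x in zip(row, candidate)) == b
            for row, b in zip(coefficients, targets)
        ):
            solutions.append(candidate)
    return solutions


# Toy system (hypothetical, not taken from the paper): x1 + 2*x2 + 3*x3 = 6.
toy_solutions = nonnegative_solutions([[1, 2, 3]], [6], upper_bound=6)
```

    In the signature setting each unknown would count occurrences of one fragment type, and each solution vector would then be passed to a structure-reconstruction step. Exhaustive search scales exponentially with the number of unknowns, so it is only practical for very small systems.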

  8. Notes on quantitative structure-properties relationships (QSPR) part 2: the role of the number of atoms as a molecular descriptor.

    PubMed

    Carbó-Dorca, Ramon; Gallegos Saliner, Ana

    2009-10-01

    A previous analysis performed in our laboratory of the polynomial dependency of the atomic quantum self-similarity measures on the atomic number, together with recent publications by various authors on quantitative structure-properties relationships (QSPR) based on the number of atoms in a molecule, has led us to show here that a simplified form of the fundamental quantum QSPR (QQSPR) equation makes it possible to demonstrate theoretically the important, but obvious, role of the number of atoms in a molecule as a possible molecular descriptor. A discussion of the practical use of the number of atoms in QSPR is given at the end, along with a discussion of the role of Ockham's razor in descriptor simplification choices. PMID:19242962

  9. Systems Biological Approach of Molecular Descriptors Connectivity: Optimal Descriptors for Oral Bioavailability Prediction

    PubMed Central

    Ahmed, Shiek S. S. J.; Ramakrishnan, V.

    2012-01-01

    bioavailability, with a predictive accuracy of more than 71%. Overall, the method captures the fundamental molecular descriptors, that can be used as an entity to facilitate prediction of oral bioavailability. PMID:22815781

  10. Lattice enumeration for inverse molecular design using the signature descriptor.

    PubMed

    Martin, Shawn

    2012-07-23

    We describe an inverse quantitative structure-activity relationship (QSAR) framework developed for the design of molecular structures with desired properties. This framework uses chemical fragments encoded with a molecular descriptor known as a signature. It solves a system of linear constrained Diophantine equations to reorganize the fragments into novel molecular structures. The method has been previously applied to problems in drug and materials design but has inherent computational limitations due to the necessity of solving the Diophantine constraints. We propose a new approach to overcome these limitations using the Fincke-Pohst algorithm for lattice enumeration. We benchmark the new approach against previous results on LFA-1/ICAM-1 inhibitory peptides, linear homopolymers, and hydrofluoroether foam blowing agents. Software implementing the new approach is available at www.cs.otago.ac.nz/homepages/smartin. PMID:22657105

  11. A novel method to compare protein structures using local descriptors

    PubMed Central

    2011-01-01

    Background: Protein structure comparison is one of the most widely performed tasks in bioinformatics. However, currently used methods have problems with the so-called "difficult similarities", including considerable shifts and distortions of structure, sequential swaps and circular permutations. There is a demand for efficient and automated systems capable of overcoming these difficulties, which may lead to the discovery of previously unknown structural relationships. Results: We present a novel method for protein structure comparison based on the formalism of local descriptors of protein structure - DEscriptor Defined Alignment (DEDAL). Local similarities identified by pairs of similar descriptors are extended into global structural alignments. We demonstrate the method's capability by aligning structures in difficult benchmark sets: curated alignments in the SISYPHUS database, as well as SISY and RIPC sets, including non-sequential and non-rigid-body alignments. On the most difficult RIPC set of sequence alignment pairs the method achieves an accuracy of 77% (the second best method tested achieves 60% accuracy). Conclusions: DEDAL is fast enough to be used in whole proteome applications, and by lowering the threshold of detectable structure similarity it may shed additional light on molecular evolution processes. It is well suited to improving automatic classification of structure domains, helping analyze protein fold space, or to improving protein classification schemes. DEDAL is available online at http://bioexploratorium.pl/EP/DEDAL. PMID:21849047

  12. A new graph-based molecular descriptor using the canonical representation of the molecule.

    PubMed

    Hentabli, Hamza; Saeed, Faisal; Abdo, Ammar; Salim, Naomie

    2014-01-01

    Molecular similarity is a pervasive concept in drug design. The basic idea underlying molecular similarity is the similar property principle, which states that structurally similar molecules will exhibit similar physicochemical and biological properties. In this paper, a new graph-based molecular descriptor (GBMD) is introduced. The GBMD is a new method of obtaining a rough description of 2D molecular structure in textual form based on the canonical representations of the molecule outline shape, and it allows rigorous structure specification using small and natural grammars. Simulated virtual screening experiments with the MDDR database clearly show the superiority of the graph-based descriptor compared to many standard descriptors (ALOGP, MACCS, EPFP4, CDKFP, PCFP, and SMILE) using the Tanimoto coefficient (TAN) and the basic local alignment search tool (BLAST) when searches were carried out. PMID:25140330
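
    The Tanimoto coefficient mentioned above has a standard definition for binary fingerprints: the number of bits set in both fingerprints divided by the number set in either. The sketch below represents each fingerprint as a set of on-bit positions. The function name and the convention of returning 0.0 for two empty fingerprints are assumptions made for this illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as collections of on-bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Assumed convention for this sketch: two empty fingerprints score 0.0.
        return 0.0
    return len(a & b) / len(union)
```

    For example, fingerprints with on-bits {1, 2, 3} and {2, 3, 4} share two of four distinct bits, giving a coefficient of 0.5. In a similarity search, database molecules would be ranked by this value against a reference structure.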

  13. How diverse are diversity assessment methods? A comparative analysis and benchmarking of molecular descriptor space.

    PubMed

    Koutsoukas, Alexios; Paricharak, Shardul; Galloway, Warren R J D; Spring, David R; Ijzerman, Adriaan P; Glen, Robert C; Marcus, David; Bender, Andreas

    2014-01-27

    Chemical diversity is a widely applied approach to select structurally diverse subsets of molecules, often with the objective of maximizing the number of hits in biological screening. While many methods exist in the area, few systematic comparisons using current descriptors in particular with the objective of assessing diversity in bioactivity space have been published, and this shortage is what the current study is aiming to address. In this work, 13 widely used molecular descriptors were compared, including fingerprint-based descriptors (ECFP4, FCFP4, MACCS keys), pharmacophore-based descriptors (TAT, TAD, TGT, TGD, GpiDAPH3), shape-based descriptors (rapid overlay of chemical structures (ROCS) and principal moments of inertia (PMI)), a connectivity-matrix-based descriptor (BCUT), physicochemical-property-based descriptors (prop2D), and a more recently introduced molecular descriptor type (namely, "Bayes Affinity Fingerprints"). We assessed both the similar behavior of the descriptors in assessing the diversity of chemical libraries, and their ability to select compounds from libraries that are diverse in bioactivity space, which is a property of much practical relevance in screening library design. This is particularly evident, given that many future targets to be screened are not known in advance, but that the library should still maximize the likelihood of containing bioactive matter also for future screening campaigns. Overall, our results showed that descriptors based on atom topology (i.e., fingerprint-based descriptors and pharmacophore-based descriptors) correlate well in rank-ordering compounds, both within and between descriptor types. On the other hand, shape-based descriptors such as ROCS and PMI showed weak correlation with the other descriptors utilized in this study, demonstrating significantly different behavior. We then applied eight of the molecular descriptors compared in this study to sample a diverse subset of sample compounds (4%) from an

  14. Molecular descriptors that influence the amount of drugs transfer into human breast milk.

    PubMed

    Agatonovic-Kustrin, S; Ling, L H; Tham, S Y; Alany, R G

    2002-06-20

    Most drugs are excreted into breast milk to some extent and are bioavailable to the infant. The ability to predict the approximate amount of drug that might be present in milk from the drug structure would be very useful in the clinical setting. The aim of this research was to simplify and upgrade the previously developed model for prediction of the milk to plasma (M/P) concentration ratio, given only the molecular structure of the drug. The set of 123 drug compounds, with experimentally derived M/P values taken from the literature, was used to develop, test and validate a predictive model. Each compound was encoded with 71 calculated molecular structure descriptors, including constitutional descriptors, topological descriptors, molecular connectivity, geometrical descriptors, quantum chemical descriptors, physicochemical descriptors and liquid properties. A genetic algorithm was used to select a subset of the descriptors that best describe the drug transfer into breast milk, and an artificial neural network (ANN) was used to correlate the selected descriptors with the M/P ratio and develop a QSAR. The averaged literature M/P values were used as the ANN's output and calculated molecular descriptors as the inputs. A nine-descriptor nonlinear computational neural network model has been developed for the estimation of M/P ratio values for a data set of 123 drugs. The model used the percent of oxygen, parachor, density, highest occupied molecular orbital energy (HOMO), topological indices (chiV2, chi2 and chi1) and shape indices (kappa3, kappa2) as the inputs and had four hidden neurons and one output neuron. The QSPR that was developed indicates that molecular size (parachor, density), shape (topological shape indices, molecular connectivity indices) and electronic properties (HOMO) are the most important for drug transfer into breast milk. Unlike previously reported models, the QSPR model described here does not require experimentally derived parameters and could potentially provide a

  15. An Infrastructure to Mine Molecular Descriptors for Ligand Selection on Virtual Screening

    PubMed Central

    Seus, Vinicius Rosa; Perazzo, Giovanni Xavier; Winck, Ana T.; Werhli, Adriano V.; Machado, Karina S.

    2014-01-01

    Evaluation of the receptor-ligand interaction is an important step in rational drug design. The databases that provide the structures of the ligands are growing on a daily basis. This makes it impossible to test all the ligands for a target receptor. Hence, ligand selection is needed before testing. One possible approach is to evaluate a set of molecular descriptors. With the aim of describing the characteristics of promising compounds for a specific receptor, we introduce a data warehouse-based infrastructure to mine molecular descriptors for virtual screening (VS). We performed experiments that take the receptor HIV-1 protease as the target, together with different compounds for this protein. A set of 9 molecular descriptors is taken as the predictive attributes and the free energy of binding (FEB) is taken as the target attribute. By applying the J48 algorithm to the data we obtain decision tree models that achieve up to 84% accuracy. The models indicate which molecular descriptors, and which of their values, are associated with good FEB results. Using their rules we performed ligand selection on the ZINC database. Our results show a substantial reduction in the number of ligands selected for VS experiments; for instance, the best selection model picked only 0.21% of the total number of drug-like ligands. PMID:24812613

  16. QSPR study on refractive indices of solvents commonly used in polymer chemistry using flexible molecular descriptors.

    PubMed

    Fioressi, S E; Bacelo, D E; Cui, W P; Saavedra, L M; Duchowicz, P R

    2015-06-01

    A predictive Quantitative Structure-Property Relationship (QSPR) for the refractive indices of 370 solvents commonly used in the processing and analysis of polymers is presented, using as chemical information descriptors the simplified molecular input line entry system (SMILES). The model employs a flexible molecular descriptor and a conformation-independent approach. Various well-known techniques, such as the use of an external test set of compounds, the cross-validation method, and Y-randomization were used to test and validate the established equations. The predicted values were finally compared with published results from the literature. The simple model proposed correlates the refractive index values with good accuracy, and it is not dependent on 3D-molecular geometries. PMID:26223885

  17. A local average distance descriptor for flexible protein structure comparison

    PubMed Central

    2014-01-01

    Background: Protein structures are flexible and often show conformational changes upon binding to other molecules to exert biological functions. As protein structures correlate with characteristic functions, structure comparison allows classification and prediction of proteins of undefined functions. However, most comparison methods treat proteins as rigid bodies and cannot retrieve similarities of proteins with large conformational changes effectively. Results: In this paper, we propose a novel descriptor, local average distance (LAD), based on either the geodesic distances (GDs) or Euclidean distances (EDs) for pairwise flexible protein structure comparison. The proposed method was compared with 7 structural alignment methods and 7 shape descriptors on two datasets comprising hinge bending motions from the MolMovDB, and the results have shown that our method outperformed all other methods regarding retrieving similar structures in terms of precision-recall curve, retrieval success rate, R-precision, mean average precision and F1-measure. Conclusions: Both ED- and GD-based LAD descriptors are effective to search deformed structures and overcome the problems of self-connection caused by a large bending motion. We have also demonstrated that the ED-based LAD is more robust than the GD-based descriptor. The proposed algorithm provides an alternative approach for blasting structure database, discovering previously unknown conformational relationships, and reorganizing protein structure classification. PMID:24694083

  18. QSPR models based on molecular mechanics and quantum chemical calculations. 1. Construction of Boltzmann-averaged descriptors for alkanes, alcohols, diols, ethers and cyclic compounds.

    PubMed

    Dyekjaer, Jane; Rasmussen, Kjeld; Jónsdóttir, Svava

    2002-09-01

    Values for nine descriptors for QSPR (quantitative structure-property relationships) modeling of physical properties of 96 alkanes, alcohols, ethers, diols, triols and cyclic alkanes and alcohols in conjunction with the program Codessa are presented. The descriptors are Boltzmann-averaged by selection of the most relevant conformers out of a set of possible molecular conformers generated by a systematic scheme presented in this paper. Six of these descriptors are calculated with molecular mechanics and three with quantum chemical methods. Especially interesting descriptors are the relative van der Waals energies and the molecular polarizabilities, which correlate very well with boiling points. Five more simple descriptors that only depend on the molecular constitutional formula are also discussed briefly. PMID:12415333
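
    Boltzmann averaging of a descriptor over conformers can be sketched as below. The abstract does not give the authors' conformer-selection scheme or their formula, so this uses the textbook canonical weighting, in which each conformer value is weighted by exp(-E/RT) and the weights are normalised to sum to one. The function name, the kJ/mol energy units and the default temperature of 298.15 K are assumptions of this illustration.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # molar gas constant in kJ/(mol K)


def boltzmann_average(values, energies_kj_mol, temperature_k=298.15):
    """Average per-conformer descriptor values with Boltzmann weights."""
    if len(values) != len(energies_kj_mol) or len(values) == 0:
        raise ValueError("values and energies must be non-empty and equal in length")
    # Shift by the lowest energy so the largest weight is exactly 1 (numerical stability).
    lowest = min(energies_kj_mol)
    weights = [
        math.exp(-(energy - lowest) / (GAS_CONSTANT_KJ * temperature_k))
        for energy in energies_kj_mol
    ]
    partition_sum = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / partition_sum
```

    Conformers of equal energy contribute equally, while a conformer far above the minimum contributes almost nothing, which is why restricting the average to the most relevant low-energy conformers changes the result very little.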

  19. Investigation of a novel molecular descriptor for the lead optimization of 4-aminoquinazolines as vascular endothelial growth factor receptor-2 inhibitors: application for quantitative structure-activity relationship analysis in lead optimization.

    PubMed

    Kawakami, Joel K; Martinez, Yannica; Sasaki, Brandi; Harris, Melissa; Kurata, Wendy E; Lau, Alan F

    2011-03-01

    We investigated the use of infrared vibrational frequency of ligands as a potential novel molecular descriptor in three different molecular target and chemical series. The vibrational energy of a ligand was approximated from the sum of infrared (IR) absorptions of each functional group within a molecule and normalized by its molecular weight (MDIR). Calculations were performed on a set of 4-aminoquinazolines with similar docking scores for the VEGFR2/KDR receptor. 4-Aminoquinazolines with MDIR values ranging 192-196 provided compounds with KDR inhibitory activity. The correlation of KDR inhibitory activity was similarly observed in a separate chemical series, the pyrazolo[1,5-a]pyrimidines. Initial exploration of this molecular descriptor supports a tool for rapid lead optimization in the 4-aminoquinazoline chemical series and a potential method for scaffold hopping in pursuit of new inhibitors. PMID:21306896

  20. Investigation of a novel molecular descriptor for the lead optimization of 4-aminoquinazolines as vascular endothelial growth factor receptor – 2 inhibitors: Application for quantitative structure activity relationship analysis in lead optimization

    PubMed Central

    Kawakami, Joel K.; Martinez, Yannica; Sasaki, Brandi; Harris, Melissa; Kurata, Wendy E.; Lau, Alan F.

    2013-01-01

    We investigated the use of infrared vibrational frequency of ligands as a potential novel molecular descriptor in three different molecular target and chemical series. The vibrational energy of a ligand was approximated from the sum of infrared (IR) absorptions of each functional group within a molecule and normalized by its molecular weight (MDIR). Calculations were performed on a set of 4-aminoquinazolines with similar docking scores for the VEGFR2/KDR receptor. 4-Aminoquinazolines with MDIR values ranging 192–196 provided compounds with KDR inhibitory activity. The correlation of KDR inhibitory activity was similarly observed in a separate chemical series, the pyrazolo[1,5-a]pyrimidines. Initial exploration of this molecular descriptor supports a tool for rapid lead optimization in the 4-aminoquinazoline chemical series and a potential method for scaffold hopping in pursuit of new inhibitors. PMID:21306896

  1. Heuman indices of hydrophobicity of bile acids and their comparison with a newly developed and conventional molecular descriptors.

    PubMed

    Poša, Mihalj

    2014-02-01

    Bile salts (BSs), in addition to their physiological role in the digestion of lipids in vertebrates, are also of significant importance in biomedical investigations. For predicting biological-pharmacological activity and physico-chemical properties of BSs it is important to develop such molecular descriptors that adequately describe the structural characteristics of the steroid skeleton. The present study encompassed the following bile acids (BAs): cholic, chenodeoxycholic, deoxycholic, hyodeoxycholic, ursodeoxycholic, hyocholic, and ursocholic acid, as well as oxo derivatives of certain BAs. For all of them, Heuman hydrophobicity indices (HI(BA)) (RP-HPLC parameters) were determined, and a detailed conformational analysis of the steroid skeleton showed that HI(BA) has the discrimination power for BAs based on the size of the hydrophobic surface on the β side and the lateral L7 and L12 sides of the steroid skeleton. Also, HI(BA) discerns the regiochemical characteristics of OH and oxo groups. Based on a survey of the structural factors of the steroid skeleton that influence the HI(BA) values of the tested BAs, we constructed a new molecular descriptor, CHIBA, with the characteristics of 2D and 3D topological descriptors. In respect of the structure of the steroid skeleton, the descriptor CHIBA behaves as a reversed-phase chromatographic descriptor of BAs. PMID:24076126

  2. Dissecting molecular descriptors into atomic contributions in density functional reactivity theory

    SciTech Connect

    Rong, Chunying; Lu, Tian; Liu, Shubin

    2014-01-14

    Density functional reactivity theory (DFRT) employs the electron density of a molecule and its related quantities such as gradient and Laplacian to describe its structure and reactivity properties. Proper descriptions at both molecular (global) and atomic (local) levels are equally important and illuminating. In this work, we make use of Bader's zero-flux partition scheme and consider atomic contributions for a few global reactivity descriptors in DFRT, including the density-based quantification of steric effect and related indices. Earlier, we proved that these quantities are intrinsically correlated for atomic and molecular systems [S. B. Liu, J. Chem. Phys. 126, 191107 (2007); ibid. 126, 244103 (2007)]. In this work, a new basin-based integration algorithm has been implemented, whose reliability and effectiveness have been extensively examined. We also investigated a list of simple hydrocarbon systems and different scenarios of bonding processes, including stretching, bending, and rotating. Interesting changing patterns for the atomic and molecular values of these quantities have been revealed for different systems. This work not only confirms the strong correlation between these global reactivity descriptors for molecular systems, as theoretically proven earlier by us, it also provides new and unexpected changing patterns for their atomic values, which can be employed to understand the origin and nature of chemical phenomena.

  3. Informatics calibration of a molecular descriptors database to predict solid dispersion potential of small molecule organic solids.

    PubMed

    Moore, Michael D; Wildfong, Peter L D

    2011-10-14

    The use of a novel, in silico method for making an intelligent polymer selection to physically stabilize small molecule organic (SMO) solid compounds formulated as amorphous molecular solid dispersions is reported. 12 compounds (75%, w/w) were individually co-solidified with polyvinyl pyrrolidone:vinyl acetate (PVPva) copolymer by melt-quenching. Co-solidified products were analyzed intact using differential scanning calorimetry (DSC) and the pair distribution function (PDF) transform of powder X-ray diffraction (PXRD) data to assess miscibility. Molecular descriptor indices were calculated for all twelve compounds using their reported crystallographic structures. Logistic regression was used to assess correlation between molecular descriptors and amorphous molecular solid dispersion potential. The final model was challenged with three compounds. Of the 12 compounds, 6 were miscible with PVPva (i.e. successful formation) and 6 were phase separated (i.e. unsuccessful formation). 2 of the 6 unsuccessful compounds exhibited detectable phase-separation using the PDF method, where DSC indicated miscibility. Logistic regression identified 7 molecular descriptors correlated to solid dispersion potential (α=0.001). The atomic mass-weighted third-order R autocorrelation index (R3m) was the only significant descriptor to provide completely accurate predictions of dispersion potential. The three compounds used to challenge the R3m model were also successfully predicted. PMID:21756988

  4. Chemical and Molecular Descriptors for the Reactivity of Amines with CO2

    SciTech Connect

    Lee, Anita S.; Kitchin, John R.

    2012-10-24

    Amine-based solvents are likely to play an important role in CO2 capture applications in the future, and the identification of amines with superior performance will facilitate their use in CO2 capture. While some improvements in performance will be achieved through process modifications, modifying the CO2 capture performance of an amine also implies in part an ability to modify the reactions between the amine and CO2 through development of new functionalized amines. We present a computational study of trends in the reactions between CO2 and functionalized amines with a focus on identifying molecular descriptors that determine trends in reactivity. We examine the formation of bicarbonate and carbamate species on three classes of functionalized amines: alkylamines, alkanolamines, and fluorinated alkylamines including primary, secondary and tertiary amines in each class. These functional groups span electron-withdrawing to donating behavior, hydrogen-bonding, extent of functionalization, and proximity effects of the functional groups. Electron-withdrawing groups tend to destabilize CO2 reaction products, whereas electron-donating groups tend to stabilize CO2 reaction products. Hydrogen bonding stabilizes CO2 reaction products. Electronic structure descriptors based on electronegativity were found to describe trends in the bicarbonate formation energy. A chemical correlation was observed between the carbamate formation energy and the carbamic acid formation energy. The local softness on the reacting N in the amine was found to partially explain trends in the carbamic acid formation energy.

  5. Establishment of an in silico phospholipidosis prediction method using descriptors related to molecular interactions causing phospholipid-compound complex formation.

    PubMed

    Haranosono, Yu; Nemoto, Shingo; Kurata, Masaaki; Sakaki, Hideyuki

    2016-01-01

    Although phospholipidosis (PLD) often affects drug development, there is no convenient in vitro or in vivo test system for PLD detection. In this study, we developed an in silico PLD prediction method based on the PLD-inducing mechanism. We focused on phospholipid (PL)-compound complex formation, which inhibits PL degradation by phospholipase. Thus, we used some molecular interactions, such as electrostatic interactions, hydrophobic interactions, and intermolecular forces, between PL and compounds as descriptors. First, we performed descriptor screening for intermolecular force and then developed a new in silico PLD prediction using descriptors related to molecular interactions. Based on the screening, we identified molecular refraction (MR) as a descriptor of intermolecular force. It is known that ClogP and most-basic pKa can be used for PLD prediction. Thereby, we developed an in silico prediction method using ClogP, most-basic pKa, and MR, which were related to hydrophobic interactions, electrostatic interactions, and intermolecular forces. In addition, a resampling method was used to determine the cut-off values for each descriptor. We obtained good results for 77 compounds as follows: sensitivity = 95.8%, specificity = 75.9%, and concordance = 88.3%. Although there is a concern regarding false-negative compounds for pKa calculations, this predictive ability will be adequate for PLD screening. In conclusion, the mechanism-based in silico PLD prediction method provided good prediction ability, and this method will be useful for evaluating the potential of drugs to cause PLD, particularly in the early stage of drug development, because this method only requires knowledge of the chemical structure. PMID:26961617
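
    The three performance figures quoted in this abstract follow from a 2x2 confusion table. The sketch below shows the arithmetic only. The counts are hypothetical: they are one split of 77 compounds that happens to reproduce the rounded percentages and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # PLD-positive compounds predicted positive
    fn: int  # PLD-positive compounds predicted negative
    tn: int  # PLD-negative compounds predicted negative
    fp: int  # PLD-negative compounds predicted positive


def sensitivity(c: Counts) -> float:
    """Fraction of true positives that the method detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c: Counts) -> float:
    """Fraction of true negatives that the method clears."""
    return c.tn / (c.tn + c.fp)


def concordance(c: Counts) -> float:
    """Fraction of all compounds classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical counts (assumption, not reported in the abstract).
counts = Counts(tp=46, fn=2, tn=22, fp=7)

for name, fn_ in (("sensitivity", sensitivity),
                  ("specificity", specificity),
                  ("concordance", concordance)):
    print(f"{name} = {100 * fn_(counts):.1f}%")
```

    With these assumed counts, the two false negatives are the cases the authors flag as a concern, since a missed PLD inducer is costlier in screening than a false alarm.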

  6. Molecular field extrema as descriptors of biological activity: definition and validation.

    PubMed

    Cheeseright, Tim; Mackey, Mark; Rose, Sally; Vinter, Andy

    2006-01-01

    The paper describes the generation of four types of three-dimensional molecular field descriptors or 'field points' as extrema of electrostatic, steric, and hydrophobic fields. These field points are used to define the properties necessary for a molecule to bind in a characteristic way into a specified active site. The hypothesis is that compounds showing a similar field point pattern are likely to bind at the same target site regardless of structure. The methodology to test this idea is illustrated using HIV NNRTI and thrombin ligands and validated across seven other targets. From the in silico comparisons of field point overlays, the experimentally observed binding poses of these ligands in their respective sites can be reproduced from pairwise comparisons. PMID:16562997

  7. Prediction of substrate-enzyme-product interaction based on molecular descriptors and physicochemical properties.

    PubMed

    Niu, Bing; Huang, Guohua; Zheng, Linfeng; Wang, Xueyuan; Chen, Fuxue; Zhang, Yuhui; Huang, Tao

    2013-01-01

    It is important to correctly and efficiently predict substrate-enzyme interactions and their products in metabolic pathways. In this work, a novel approach was introduced to encode substrate/product and enzyme molecules with molecular descriptors and physicochemical properties, respectively. Based on this encoding method, KNN was adopted to build the substrate-enzyme-product interaction network. After selection of the optimal features that are able to represent the main factors of substrate-enzyme-product interaction in our prediction, a total of 160 of the 290 features were retained, which can be clustered into ten categories: elemental analysis, geometry, chemistry, amino acid composition, predicted secondary structure, hydrophobicity, polarizability, solvent accessibility, normalized van der Waals volume, and polarity. As a result, our prediction model achieved an MCC of 0.423 and an overall prediction accuracy of 89.1% in a 10-fold cross-validation test. PMID:24455714
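
    A high overall accuracy can coexist with a moderate Matthews correlation coefficient (MCC) when the classes are imbalanced, which may be why both figures are reported here. The sketch below computes the MCC from confusion-table counts using its standard definition. The counts are invented for illustration and are not the study's data.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-table counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: return 0 when a row or column of the table is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical, imbalanced example: 10 positives and 90 negatives.
example = dict(tp=5, tn=85, fp=5, fn=5)
print(f"accuracy = {accuracy(**example):.3f}")
print(f"MCC      = {mcc(**example):.3f}")
```

    In this invented example 90% of cases are classified correctly, yet only half of the positives are found, and the MCC reflects that.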

  8. Relationship between molecular descriptors and the enthalpies of sublimation of natural amino acids

    NASA Astrophysics Data System (ADS)

    Badelin, V. G.; Tyunina, V. V.; Girichev, G. V.; Tyunina, E. Yu.

    2016-07-01

    A multiparameter correlation between the enthalpies of sublimation and molecular descriptors of natural amino acids is proposed, based on generalized experimental and literature data on the heat effects of sublimation. The contributions of Van der Waals interactions, hydrogen bond formation, and electrostatic effects to the enthalpy of sublimation have been evaluated using regression coefficients.

  9. Classification of signaling proteins based on molecular star graph descriptors using Machine Learning models.

    PubMed

    Fernandez-Lozano, Carlos; Cuiñas, Rubén F; Seoane, José A; Fernández-Blanco, Enrique; Dorado, Julian; Munteanu, Cristian R

    2015-11-01

    Signaling proteins are an important topic in drug development due to the increased importance of finding fast, accurate and cheap methods to evaluate new molecular targets involved in specific diseases. The complexity of the protein structure hinders the direct association of the signaling activity with the molecular structure. Therefore, the proposed solution involves the use of protein star graphs to encode the peptide sequence information into specific topological indices calculated with the S2SNet tool. The Quantitative Structure-Activity Relationship classification model obtained with Machine Learning techniques is able to predict new signaling peptides. The best classification model is the first signaling prediction model; it is based on eleven descriptors, was obtained using the Support Vector Machines-Recursive Feature Elimination (SVM-RFE) technique with the Laplacian kernel (RFE-LAP), and has an AUROC of 0.961. The prediction performance of the model was assessed by testing a set of 3114 proteins of unknown function from the PDB database. Important signaling pathways are presented for three UniprotIDs (34 PDBs) with a signaling prediction greater than 98.0%. PMID:26297890

  10. Structure-activity relationship analysis of N-benzoylpyrazoles for elastase inhibitory activity: a simplified approach using atom pair descriptors.

    PubMed

    Khlebnikov, Andrei I; Schepetkin, Igor A; Quinn, Mark T

    2008-03-15

    Previously, we utilized high throughput screening of a chemical diversity library to identify potent inhibitors of human neutrophil elastase and found that many of these compounds had N-benzoylpyrazole core structures. We also found individual ring substituents had significant impact on elastase inhibitory activity and compound stability. In the present study, we utilized computational structure-activity relationship (SAR) analysis of a series of 53 N-benzoylpyrazole derivatives to further optimize these lead molecules. We present an improved approach to SAR methodology based on atom pair descriptors in combination with 2-dimensional (2D) molecular descriptors. This approach utilizes the rich representation of chemical structure and leads to SAR analysis that is both accurate and intuitively easy to understand. A sequence of ANOVA, linear discriminant, and binary classification tree analyses of the molecular descriptors led to the derivation of SAR rule-based algorithms. These rules revealed that the main factors influencing elastase inhibitory activity of N-benzoylpyrazole molecules were the presence of methyl groups in the pyrazole moiety and ortho-substituents in the benzoyl radical. Furthermore, our data showed that physicochemical characteristics (energy of frontier molecular orbitals, molar refraction, lipophilicity) were not necessary for achieving good SAR, as comparable quality of SAR classification was obtained with atom pairs and 2D descriptors only. This simplified SAR approach may be useful to qualitative SAR recognition problems in a variety of data sets. PMID:18234502
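
    Atom pair descriptors count pairs of atoms together with the topological distance separating them. The sketch below is a simplified illustration and not the authors' implementation: atoms are typed by element symbol only, hydrogens are omitted, and the distance is the number of bonds on the shortest path. Published atom pair schemes typically use richer atom types.

```python
from collections import Counter, deque


def bond_distances(n_atoms, bonds, source):
    """Breadth-first search: bond-count distance from `source` to each reachable atom."""
    neighbours = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


def atom_pairs(elements, bonds):
    """Count (type_a, type_b, distance) triples over all connected atom pairs."""
    counts = Counter()
    n_atoms = len(elements)
    for i in range(n_atoms):
        dist = bond_distances(n_atoms, bonds, i)
        for j in range(i + 1, n_atoms):
            if j in dist:
                # Sort the two types so that C-O and O-C map to the same key.
                type_a, type_b = sorted((elements[i], elements[j]))
                counts[(type_a, type_b, dist[j])] += 1
    return counts


# Hydrogen-suppressed ethanol, C-C-O, as a toy input.
ethanol = atom_pairs(["C", "C", "O"], [(0, 1), (1, 2)])
print(ethanol)
```

    Each distinct triple becomes one column of a descriptor table, so a set of molecules can be compared by the counts they share.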

  11. 'Quasi-Mixture' Descriptors for QSPR Analysis of Molecular Macroscopic Properties. The Critical Properties of Organic Compounds.

    PubMed

    Mokshyna, E; Nedostup, V I; Polishchuk, P G; Kuzmin, V E

    2014-10-01

    A rational approach to QSAR/QSPR modeling requires descriptors to be computationally efficient, yet physically and chemically meaningful. On the basis of the existing simplex representation of molecular structure (SiRMS), novel 'quasi-mixture' descriptors were developed in order to characterize molecules on the 2D level (i.e. without explicit generation of 3D structure and exhaustive conformational search) while accounting for potential intermolecular interactions. The critical properties of organic compounds were chosen as target properties for the estimation of the descriptors' efficacy because of their well-known physical nature, rigorously estimated experimental errors and large quantity of experimental data. The properties described include critical temperature, pressure and volume. The obtained models have high statistical characteristics, thereby showing the efficacy of the suggested 'quasi-mixture' approach. Moreover, the 'quasi-mixture' approach, as a branch of SiRMS, allows results to be interpreted in terms of simple basic molecular properties. The obtained picture of influences corresponds to the accepted theoretical views. PMID:27485300

  12. Structure-activity Relationship Analysis of N-Benzoylpyrazoles for Elastase Inhibitory Activity: A Simplified Approach Using Atom Pair Descriptors

    PubMed Central

    Khlebnikov, Andrei I.; Schepetkin, Igor A.; Quinn, Mark T.

    2008-01-01

    Previously, we utilized high throughput screening of a chemical diversity library to identify potent inhibitors of human neutrophil elastase and found that many of these compounds had N-benzoylpyrazole core structures. We also found individual ring substituents had significant impact on elastase inhibitory activity and compound stability. In the present study, we utilized computational structure–activity relationship (SAR) analysis of a series of 53 N-benzoylpyrazole derivatives to further optimize these lead molecules. We present an improved approach to SAR methodology based on atom pair descriptors in combination with 2-dimensional (2D) molecular descriptors. This approach utilizes the rich representation of chemical structure and leads to SAR analysis that is both accurate and intuitively easy to understand. A sequence of ANOVA, linear discriminant, and binary classification tree analyses of the molecular descriptors led to the derivation of SAR rule-based algorithms. These rules revealed that the main factors influencing elastase inhibitory activity of N-benzoylpyrazole molecules were the presence of methyl groups in the pyrazole moiety and ortho-substituents in the benzoyl radical. Furthermore, our data showed that physicochemical characteristics (energy of frontier molecular orbitals, molar refraction, lipophilicity) were not necessary for achieving good SAR, as comparable quality of SAR classification was obtained with atom pairs and 2D descriptors only. This simplified SAR approach may be useful to qualitative SAR recognition problems in a variety of data sets. PMID:18234502

  13. Quantitative structure-activity relationships of selective antagonists of glucagon receptor using QuaSAR descriptors.

    PubMed

    Manoj Kumar, Palanivelu; Karthikeyan, Chandrabose; Hari Narayana Moorthy, Narayana Subbiah; Trivedi, Piyush

    2006-11-01

    In the present paper, a quantitative structure-activity relationship (QSAR) approach was applied to understand the affinity and selectivity of a novel series of triaryl imidazole derivatives towards the glucagon receptor. Statistically significant and highly predictive QSARs were derived for glucagon receptor inhibition by triaryl imidazoles using QuaSAR descriptors of the Molecular Operating Environment (MOE) employing a computer-assisted multiple regression procedure. The generated QSAR models revealed that factors related to hydrophobicity, molecular shape and geometry predominantly influence the glucagon receptor binding affinity of the triaryl imidazoles, indicating the relevance of shape-specific steric interactions between the molecule and the receptor. Further, QSAR models formulated for selective inhibition of the glucagon receptor over p38 mitogen-activated protein (MAP) kinase by the compounds in the series highlight that the same structural features that influence glucagon receptor affinity also contribute to their selective inhibition. PMID:17077558

  14. Combined experimental (FT-IR, UV-visible spectra, NMR) and theoretical studies on the molecular structure, vibrational spectra, HOMO, LUMO, MESP surfaces, reactivity descriptor and molecular docking of Phomarin

    NASA Astrophysics Data System (ADS)

    Kumar, Abhishek; Srivastava, Ambrish Kumar; Gangwar, Shashi; Misra, Neeraj; Mondal, Avijit; Brahmachari, Goutam

    2015-09-01

    Phomarin is an important natural product belonging to the anthraquinone series of compounds. The equilibrium geometry of phomarin has been determined and analyzed by the DFT method at the B3LYP/6-311++G(d,p) level of computation. Reactivity descriptors of the molecule, such as Fukui functions, local softness, electrophilicity, electronegativity, hardness, and the HOMO-LUMO gap, are calculated and discussed. The infrared and UV-vis spectra of phomarin are calculated and compared with the experimentally observed ones. Moreover, 1H and 13C NMR spectra have been calculated by using the gauge independent atomic orbital method. We also note that phomarin shows remarkable biological activity against the malaria parasite. The study suggests further investigation of phomarin for its pharmacological importance.
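
    Global reactivity descriptors of the kind listed in this abstract are often estimated from the frontier orbital energies. The sketch below uses one common convention (a Koopmans-type approximation with hardness defined as half the gap). The abstract does not say which convention the authors used, and the orbital energies in the example are placeholders, not values for phomarin.

```python
def reactivity_descriptors(e_homo: float, e_lumo: float) -> dict:
    """Estimate global reactivity descriptors from frontier orbital energies (eV).

    Assumes I = -E(HOMO) and A = -E(LUMO); other conventions exist,
    for example hardness defined as I - A without the factor 1/2.
    """
    ionization_potential = -e_homo
    electron_affinity = -e_lumo
    electronegativity = (ionization_potential + electron_affinity) / 2
    hardness = (ionization_potential - electron_affinity) / 2
    electrophilicity = electronegativity ** 2 / (2 * hardness)
    return {
        "gap": e_lumo - e_homo,
        "electronegativity": electronegativity,
        "hardness": hardness,
        "electrophilicity": electrophilicity,
    }


# Placeholder energies in eV (assumption, for illustration only).
result = reactivity_descriptors(e_homo=-6.0, e_lumo=-2.0)
for name, value in result.items():
    print(f"{name:>17s} = {value:.2f} eV")
```

    Local quantities such as Fukui functions and local softness need atom-resolved populations from the electronic structure calculation and are outside the scope of this sketch.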

  15. In silico modelling of permeation enhancement potency in Caco-2 monolayers based on molecular descriptors and random forest.

    PubMed

    Welling, Søren H; Clemmensen, Line K H; Buckley, Stephen T; Hovgaard, Lars; Brockhoff, Per B; Refsgaard, Hanne H F

    2015-08-01

    Structural traits of permeation enhancers are important determinants of their capacity to promote enhanced drug absorption. Therefore, in order to obtain a better understanding of structure-activity relationships for permeation enhancers, a Quantitative Structural Activity Relationship (QSAR) model has been developed. The random forest-QSAR model was based upon Caco-2 data for 41 surfactant-like permeation enhancers from Whitehead et al. (2008) and molecular descriptors calculated from their structure. The QSAR model was validated by two test-sets: (i) an eleven compound experimental set with Caco-2 data and (ii) nine compounds with Caco-2 data from literature. Feature contributions, a recently developed diagnostic tool, were applied to elucidate the contribution of individual molecular descriptors to the predicted potency. Feature contributions provided easily interpretable suggestions of important structural properties for potent permeation enhancers such as segregation of hydrophilic and lipophilic domains. Focusing on surfactant-like properties, it is possible to model the potency of the complex pharmaceutical excipients, permeation enhancers. For the first time, a QSAR model has been developed for permeation enhancement. The model is a valuable in silico approach for both screening of new permeation enhancers and physicochemical optimisation of surfactant enhancer systems. PMID:26004819

  16. Predicting cytotoxicity of PAMAM dendrimers using molecular descriptors

    PubMed Central

    Jones, David E; Ghandehari, Hamidreza

    2015-01-01

    Summary: The use of data mining techniques in the field of nanomedicine has been very limited. In this paper we demonstrate that data mining techniques can be used for the development of predictive models of the cytotoxicity of poly(amido amine) (PAMAM) dendrimers using their chemical and structural properties. We present predictive models developed using 103 PAMAM dendrimer cytotoxicity values that were extracted from twelve cancer nanomedicine journal articles. The results indicate that data mining and machine learning can be effectively used to predict the cytotoxicity of PAMAM dendrimers on Caco-2 cells. PMID:26665059

  17. Predictability of physicochemical properties of polychlorinated dibenzo-p-dioxins (PCDDs) based on single-molecular descriptor models.

    PubMed

    Kim, Minhee; Li, Loretta Y; Grace, John R

    2016-06-01

    Polychlorinated dibenzo-p-dioxins (PCDDs) are of global concern due to their persistence, bioaccumulation and toxicity. Although the fate of PCDDs in the environment is determined by their physical-chemical properties, such as aqueous solubility, vapor pressure, octanol/water-, air/water-, and octanol/water-partition coefficients, experimental property data on the entire set of 75 PCDD congeners are limited. The quantitative structure-property relationship (QSPR) approach is applied to predict the properties of all PCDD congeners. Experimental property data available from the literature are correlated against 16 molecular descriptors of five types. Reported and newly developed QSPR models for PCDDs are presented and reviewed. The values calculated by the best QSPRs are further adjusted to satisfy fundamental thermodynamic relationships. Although the single-descriptor models with chlorine number, molar volume, solvent accessible surface area and polarizability are based on good statistical results, these models cannot distinguish among PCDDs having the same chlorine number. The QSPR model based on the hyper-Wiener index of quantum-chemical descriptor gives useful statistical results and is able to distinguish among congeners with the same chlorine number, as well as satisfying thermodynamic relationships. The resulting consistent properties of the 75 PCDD congeners can be used for environmental modeling. PMID:26878604
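
    The abstract's point that a simple count cannot separate isomers while a topological index can is easy to illustrate. The sketch below computes the hyper-Wiener index from its usual graph-theoretical definition, half the sum of d + d^2 over all atom pairs, where d is the bond-count distance. The two four-carbon skeletons are toy stand-ins, not PCDD congeners, and the graph is assumed to be connected.

```python
from collections import deque
from itertools import combinations


def shortest_paths(n_atoms, bonds):
    """All-pairs bond-count distances by breadth-first search from every atom."""
    neighbours = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    table = []
    for source in range(n_atoms):
        dist = [None] * n_atoms
        dist[source] = 0
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in neighbours[current]:
                if dist[nxt] is None:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        table.append(dist)
    return table


def hyper_wiener(n_atoms, bonds):
    """Hyper-Wiener index of a connected, hydrogen-suppressed molecular graph."""
    d = shortest_paths(n_atoms, bonds)
    total = sum(d[i][j] + d[i][j] ** 2
                for i, j in combinations(range(n_atoms), 2))
    return total // 2  # d + d^2 is always even, so the division is exact


# Two isomeric carbon skeletons with the same atom count.
n_butane = hyper_wiener(4, [(0, 1), (1, 2), (2, 3)])   # linear chain
isobutane = hyper_wiener(4, [(0, 1), (0, 2), (0, 3)])  # branched
print(n_butane, isobutane)
```

    Both skeletons contain four heavy atoms, yet the index differs, in the same way that congeners with equal chlorine number can receive different index values.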

  18. Alpha shapes applied to molecular shape characterization exhibit novel properties compared to established shape descriptors

    PubMed Central

    Wilson, J. Anthony; Bender, Andreas; Kaya, Taner; Clemons, Paul A.

    2011-01-01

    Despite considerable efforts, description of molecular shape is still largely an unresolved problem. Given the importance of molecular shape in the description of spatial interactions in crystals or ligand-target complexes, this is not a satisfying state. In the current work, we propose a novel application of alpha shapes to the description of the shapes of small molecules. Alpha shapes are parameterized generalizations of the convex hull. For a specific value of α, the alpha shape is the geometric dual of the space-filling model of a molecule, with the parameter α allowing description of shape in varying degrees of detail. To date, alpha shapes have been used to find macromolecular cavities and to estimate molecular surface areas and volumes. We developed a novel methodology for computing molecular shape characteristics from the alpha shape. In this work, we show that alpha-shape descriptors reveal aspects of molecular shape that are complementary to other shape descriptors, and that accord well with chemists’ intuition about shape. While our implementation of alpha-shape descriptors is not computationally trivial, we suggest that the additional shape characteristics they provide can be used to improve and complement shape-analysis methods in domains such as crystallography and ligand-target interactions. In this communication, we present a unique methodology for computing molecular shape characteristics from the alpha shape. We first describe details of the alpha-shape calculation, an outline of validation experiments performed, and a discussion of the advantages and challenges we found while implementing this approach. The results show that, relative to known shape calculations, this method provides a high degree of shape resolution with even small changes in atomic coordinates. PMID:19775113

  19. Aquifer vulnerability to pesticide pollution - Combining soil, land-use and aquifer properties with molecular descriptors

    USGS Publications Warehouse

    Worrall, F.; Kolpin, D.W.

    2004-01-01

    This study uses an extensive survey of herbicides in groundwater across the midwest United States to predict occurrences of a range of compounds across the region from a combination of their molecular properties and the properties of the catchment of a borehole. The study covers 100 boreholes and eight pesticides. For each borehole, the soil, land-use and aquifer properties of its catchment were characterized. Discriminating boreholes where pollution occurred from those where no pollution occurred gave a model that was 74% correct with organic carbon content, percentage sand content and depth to the water table being significant properties of the borehole catchment. Molecular topological descriptors as well as Koc, solubility and half-life were used to characterize each compound included in the study. Inclusion of molecular properties makes it possible to discriminate between occurrence and non-occurrence of each compound in each well. The best-fit model combines organic carbon content, percentage sand content and depth to the water table with molecular descriptors representing molecular size, molecular branching and functional group composition of the herbicides.

  20. Controlling the Adsorption of Aromatic Compounds on Pt(111) with Oxygenate Substituents: From DFT to Simple Molecular Descriptors.

    PubMed

    Réocreux, Romain; Huynh, Minh; Michel, Carine; Sautet, Philippe

    2016-06-01

    Aromatic chemistry on metallic surfaces is involved in many processes within the contexts of biomass valorization, pollutant degradation, or corrosion protection. Albeit theoretically and experimentally challenging, knowing the structure and the stability of aromatic compounds on such surfaces is essential to understand their properties. To gain insights on this topic, we performed periodic ab initio calculations on Pt(111) to determine a set of simple molecular descriptors that predict both the stability and the structure of aromatic adsorbates substituted with alkyl and alkoxy (or hydroxy) groups. While the van der Waals (vdW) interaction is controlled by the molecular weight and the deformation energy by both the nature and the relative position of the substituents to the surface, the chemical bonding can be correlated to the Hard and Soft Acids and Bases (HSAB) interaction energy. This work gives general insights on the interaction of aromatic compounds with the Pt(111) surface. PMID:27206155

  1. Structural protein descriptors in 1-dimension and their sequence-based predictions.

    PubMed

    Kurgan, Lukasz; Disfani, Fatemeh Miri

    2011-09-01

    The last few decades observed an increasing interest in development and application of 1-dimensional (1D) descriptors of protein structure. These descriptors project 3D structural features onto 1D strings of residue-wise structural assignments. They cover a wide-range of structural aspects including conformation of the backbone, burying depth/solvent exposure and flexibility of residues, and inter-chain residue-residue contacts. We perform first-of-its-kind comprehensive comparative review of the existing 1D structural descriptors. We define, review and categorize ten structural descriptors and we also describe, summarize and contrast over eighty computational models that are used to predict these descriptors from the protein sequences. We show that the majority of the recent sequence-based predictors utilize machine learning models, with the most popular being neural networks, support vector machines, hidden Markov models, and support vector and linear regressions. These methods provide high-throughput predictions and most of them are accessible to a non-expert user via web servers and/or stand-alone software packages. We empirically evaluate several recent sequence-based predictors of secondary structure, disorder, and solvent accessibility descriptors using a benchmark set based on CASP8 targets. Our analysis shows that the secondary structure can be predicted with over 80% accuracy and segment overlap (SOV), disorder with over 0.9 AUC, 0.6 Matthews Correlation Coefficient (MCC), and 75% SOV, and relative solvent accessibility with PCC of 0.7 and MCC of 0.6 (0.86 when homology is used). We demonstrate that the secondary structure predicted from sequence without the use of homology modeling is as good as the structure extracted from the 3D folds predicted by top-performing template-based methods. PMID:21787299

  2. Quantitative structure-activity relationship study of antioxidative peptide by using different sets of amino acids descriptors

    NASA Astrophysics Data System (ADS)

    Li, Yao-Wang; Li, Bo; He, Jiguo; Qian, Ping

    2011-07-01

    A database consisting of 214 tripeptides which contain either a His or a Tyr residue was used to study quantitative structure-activity relationships (QSAR) of antioxidative tripeptides. Partial Least-Squares Regression (PLSR) analysis was conducted using the parameters of each amino acid descriptor set individually, including Divided Physico-chemical Property Scores (DPPS), Hydrophobic, Electronic, Steric, and Hydrogen (HESH), Vectors of Hydrophobic, Steric, and Electronic properties (VHSE), Molecular Surface-Weighted Holistic Invariant Molecular (MS-WHIM), isotropic surface area-electronic charge index (ISA-ECI) and Z-scale, to describe antioxidative tripeptides as X-variables, with antioxidant activities measured by ferric thiocyanate methods as the Y-variable. After elimination of outliers by Hotelling's T² method and residual analysis, six significant models were obtained describing the entire data set. According to the cumulative squared multiple correlation coefficients (R²), cumulative cross-validation coefficients (Q²) and relative standard deviation for the calibration set (RSDc), the models using DPPS, HESH, ISA-ECI, and VHSE descriptors are of better quality (R² > 0.6, Q² > 0.5, RSDc < 0.39) than the models using MS-WHIM and Z-scale descriptors (R² < 0.6, Q² < 0.5, RSDc > 0.44). Furthermore, the predictive ability of the model using the DPPS descriptor is the best among the six descriptor systems (cumulative multiple correlation coefficient for the prediction set (R²ext) > 0.7). It was concluded that DPPS is better suited to describing the amino acids of antioxidative tripeptides. The results for the DPPS descriptor reveal that the center amino acid and the N-terminal amino acid are far more important than the C-terminal amino acid for antioxidative tripeptides. The hydrophobic (positively to activity) and electronic (negatively to activity) properties of the N-terminal amino acid are suggested to be the most significant for activity, followed

  3. A Survey of Quantitative Descriptions of Molecular Structure

    PubMed Central

    Guha, Rajarshi; Willighagen, Egon

    2013-01-01

    Numerical characterization of molecular structure is a first step in many computational analyses of chemical structure data. These numerical representations, termed descriptors, come in many forms, ranging from simple atom counts and invariants of the molecular graph to distribution of properties, such as charge, across a molecular surface. In this article we first present a broad categorization of descriptors and then describe applications and toolkits that can be employed to evaluate them. We highlight a number of issues surrounding molecular descriptor calculations such as versioning and reproducibility and describe how some toolkits have attempted to address these problems. PMID:23110530
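
    The simplest descriptors mentioned here, atom counts, can be read directly from a molecular formula. The sketch below is a minimal illustration using only the Python standard library. It handles flat formulas such as C2H6O and makes no attempt to support parentheses, charges or isotopes. The function names are made up for this example and do not come from any toolkit.

```python
import re
from collections import Counter

# One element symbol (capital letter plus optional lowercase letter)
# followed by an optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def atom_counts(formula: str) -> Counter:
    """Count atoms per element in a flat molecular formula."""
    counts = Counter()
    for element, number in _TOKEN.findall(formula):
        counts[element] += int(number) if number else 1
    return counts


def heavy_atom_count(formula: str) -> int:
    """Number of non-hydrogen atoms, a common size descriptor."""
    return sum(n for element, n in atom_counts(formula).items()
               if element != "H")


print(dict(atom_counts("C2H6O")))
print(heavy_atom_count("C2H6O"))
```

    Counts like these are constitutional descriptors: they ignore connectivity, so isomers such as ethanol and dimethyl ether receive identical values.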

  4. Binary classification of chalcone derivatives with LDA or KNN based on their antileishmanial activity and molecular descriptors selected using the Successive Projections Algorithm feature-selection technique.

    PubMed

    Goodarzi, Mohammad; Saeys, Wouter; de Araujo, Mario Cesar Ugulino; Galvão, Roberto Kawakami Harrop; Vander Heyden, Yvan

    2014-01-23

    Chalcones are naturally occurring aromatic ketones, which consist of an α,β-unsaturated carbonyl system joining two aryl rings. These compounds are reported to exhibit several pharmacological activities, including antiparasitic, antibacterial, antifungal, anticancer, immunomodulatory, nitric oxide inhibition and anti-inflammatory effects. In the present work, a Quantitative Structure-Activity Relationship (QSAR) study is carried out to classify chalcone derivatives with respect to their antileishmanial activity (active/inactive) on the basis of molecular descriptors. For this purpose, two techniques to select descriptors are employed, the Successive Projections Algorithm (SPA) and the Genetic Algorithm (GA). The selected descriptors are initially employed to build Linear Discriminant Analysis (LDA) models. An additional investigation is then carried out to determine whether the results can be improved by using a non-parametric classification technique (One Nearest Neighbour, 1NN). In a case study involving 100 chalcone derivatives, the 1NN models were found to provide better rates of correct classification than LDA, both in the training and test sets. The best result was achieved by a SPA-1NN model with six molecular descriptors, which provided correct classification rates of 97% and 84% for the training and test sets, respectively. PMID:24090733
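
    A one-nearest-neighbour (1NN) classifier assigns each query compound the label of the closest training compound in descriptor space. The sketch below shows the idea with Euclidean distance and two made-up descriptors per compound. The descriptor values and labels are invented, and the feature selection step (SPA or GA) described in the abstract is not reproduced.

```python
import math


def predict_1nn(training, query):
    """Return the label of the training vector closest to `query`."""
    nearest = min(training, key=lambda item: math.dist(item[0], query))
    return nearest[1]


def correct_rate(training, test):
    """Fraction of test compounds whose predicted label matches the true one."""
    hits = sum(predict_1nn(training, x) == label for x, label in test)
    return hits / len(test)


# Invented (descriptor vector, label) pairs, for illustration only.
training = [
    ((0.1, 0.2), "inactive"),
    ((0.3, 0.1), "inactive"),
    ((0.9, 0.8), "active"),
    ((0.8, 1.0), "active"),
]
test = [
    ((0.2, 0.3), "inactive"),
    ((1.0, 0.9), "active"),
    ((0.6, 0.4), "active"),  # lies closer to an inactive training compound
]

print(f"correct classification rate = {correct_rate(training, test):.2f}")
```

    Because the method has no parameters to fit, its behaviour depends on which descriptors define the distance. Descriptors on different scales would normally be standardized first, a step this sketch omits.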

  5. Multi-Server Approach for High-Throughput Molecular Descriptors Calculation based on Multi-Linear Algebraic Maps.

    PubMed

    García-Jacas, César R; Aguilera-Mendoza, Longendri; González-Pérez, Reisel; Marrero-Ponce, Yovani; Acevedo-Martínez, Liesner; Barigye, Stephen J; Avdeenko, Tatiana

    2015-01-01

    The present report introduces a novel module of the QuBiLS-MIDAS software for the distributed computation of the 3D Multi-Linear algebraic molecular indices. The main motivation for developing this module is to deal with the computational complexity experienced during the calculation of the descriptors over large datasets. To accomplish this task, a multi-server computing platform named T-arenal was developed, which is suited for institutions with many workstations interconnected through a local network and without resources dedicated specifically to computation tasks. This new system was deployed in 337 workstations and it was perfectly integrated with the QuBiLS-MIDAS software. To illustrate the usability of the T-arenal platform, performance tests over a dataset comprised of 15 000 compounds are carried out, yielding a 52- and 60-fold reduction in the sequential processing time for the 2-Linear and 3-Linear indices, respectively. Therefore, it can be stated that the T-arenal based distribution of computation tasks constitutes a suitable strategy for performing high-throughput calculations of 3D Multi-Linear descriptors over thousands of chemical structures for subsequent QSAR and/or ADME-Tox studies. PMID:27490863

  6. Mammary Carcinogen-Protein Binding Potentials: Novel and Biologically Relevant Structure-Activity Relationship Model Descriptors

    PubMed Central

    Cunningham, A.R.; Qamar, S.; Carrasquer, C.A.; Holt, P.A.; Maguire, J.M.; Cunningham, S.L.; Trent, J.O.

    2010-01-01

    Previously, SAR models for carcinogenesis used descriptors that are essentially chemical descriptors. Herein we report the development of models with the cat-SAR expert system using biological descriptors (i.e., ligand-receptor interactions) for rat mammary carcinogens. These new descriptors are derived from the virtual screening for ligand-receptor interactions of carcinogens, non-carcinogens, and mammary carcinogens to a set of 5494 target proteins. Leave-one-out validations of the ligand mammary carcinogen/non-carcinogen model had a concordance between experimental and predicted results of 71%, and the mammary carcinogen/non-mammary carcinogen model was 72% concordant. The development of a hybrid fragment-ligand model improved the concordances to 85 and 83%, respectively. In a separate external validation exercise, hybrid fragment-ligand models had concordances of 81 and 76%. In analyses of example rat mammary carcinogens, including the food mutagen and estrogenic compound PhIP, the herbicide atrazine, and the drug indomethacin, the ligand model identified a number of proteins associated with each compound that had previously been referenced in Medline in conjunction with the test chemical and separately with association to breast cancer. This new modelling approach can enhance model predictivity and help bridge the gap between chemical structure and carcinogenic activity by descriptors that are related to biological targets. PMID:20818582

  7. Lagrangian Descriptors: A Method for Revealing Phase Space Structures of General Time Dependent Dynamical Systems

    NASA Astrophysics Data System (ADS)

    Mancho, Ana M.; Wiggins, Stephen; Curbelo, Jezabel; Mendoza, Carolina

    2013-11-01

    Lagrangian descriptors are a recent technique which reveals geometrical structures in phase space and which are valid for aperiodically time dependent dynamical systems. We discuss a general methodology for constructing them and we discuss a "heuristic argument" that explains why this method is successful. We support this argument by explicit calculations on a benchmark problem. Several other benchmark examples are considered that allow us to assess the performance of Lagrangian descriptors with both finite time Lyapunov exponents (FTLEs) and finite time averages of certain components of the vector field ("time averages"). In all cases Lagrangian descriptors are shown to be both more accurate and computationally efficient than these methods. We thank CESGA for computing facilities. This research was supported by MINECO grants: MTM2011-26696, I-Math C3-0104, ICMAT Severo Ochoa project SEV-2011-0087, and CSIC grant OCEANTECH. SW acknowledges the support of the ONR (Grant No. N00014-01-1-0769).

  8. QuBiLS-MIDAS: a parallel free-software for molecular descriptors computation based on multilinear algebraic maps.

    PubMed

    García-Jacas, César R; Marrero-Ponce, Yovani; Acevedo-Martínez, Liesner; Barigye, Stephen J; Valdés-Martiní, José R; Contreras-Torres, Ernesto

    2014-07-01

    The present report introduces the QuBiLS-MIDAS software belonging to the ToMoCoMD-CARDD suite for the calculation of three-dimensional molecular descriptors (MDs) based on the two-linear (bilinear), three-linear, and four-linear (multilinear or N-linear) algebraic forms. Thus, it is unique software that computes these tensor-based indices. These descriptors establish relations for two, three, and four atoms by using several (dis-)similarity metrics or multimetrics, matrix transformations, cutoffs, local calculations and aggregation operators. The theoretical background of these N-linear indices is also presented. The QuBiLS-MIDAS software was developed in the Java programming language and employs the Chemistry Development Kit library for the manipulation of the chemical structures and the calculation of the atomic properties. This software is composed of a user-friendly desktop interface and an Abstract Programming Interface library. The former was created to simplify the configuration of the different options of the MDs, whereas the library was designed to allow its easy integration into other software for chemoinformatics applications. This program provides functionalities for data cleaning tasks and for batch processing of the molecular indices. In addition, it offers parallel calculation of the MDs through the use of all available processors in current computers. The studies of complexity of the main algorithms demonstrate that these were efficiently implemented with respect to their trivial implementation. Lastly, the performance tests reveal that this software behaves suitably when the number of processors is increased. Therefore, the QuBiLS-MIDAS software constitutes a useful application for the computation of the molecular indices based on N-linear algebraic maps and it can be used freely to perform chemoinformatics studies. PMID:24889018

  9. Understanding the comparative molecular field analysis (CoMFA) in terms of molecular quantum similarity and DFT-based reactivity descriptors.

    PubMed

    Morales-Bayuelo, Alejandro; Matute, Ricardo A; Caballero, Julio

    2015-06-01

    The three-dimensional quantitative structure-activity relationship (3D QSAR) models have many applications, although the inherent difficulty of understanding 3D-QSAR results calls for new insights into their interpretation. Hence, the quantum similarity field as well as reactivity descriptors based on the density functional theory were used in this work as a consistent approach to better understand the 3D-QSAR studies in drug design. For this purpose, the quantification of steric and electrostatic effects on a series of bicyclo[4.1.0]heptane derivatives as melanin-concentrating hormone receptor 1 antagonists was performed on the basis of molecular quantum similarity measures. The maximum similarity superposition and the topo-geometrical superposition algorithms were used as molecular alignment methods to deal with the problem of relative molecular orientation in quantum similarity. In addition, a chemical reactivity analysis using global and local descriptors such as chemical hardness, softness, electrophilicity, and Fukui functions was developed. Overall, our results suggest that the application of this methodology in drug design can be useful whether or not the receptor is known. PMID:26016942

  10. Prediction of enantiomeric selectivity in chromatography. Application of conformation-dependent and conformation-independent descriptors of molecular chirality.

    PubMed

    Aires-de-Sousa, João; Gasteiger, Johann

    2002-03-01

    In order to process molecular chirality by computational methods and to obtain predictions for properties that are influenced by chirality, a fixed-length conformation-dependent chirality code is introduced. The code consists of a set of molecular descriptors representing the chirality of a 3D molecular structure. It includes information about molecular geometry and atomic properties, and can distinguish between enantiomers, even if chirality does not result from chiral centers. The new molecular transform was applied to two datasets of chiral compounds, each of them containing pairs of enantiomers that had been separated by chiral chromatography. The elution order within each pair of isomers was predicted by means of Kohonen neural networks (NN) using the chirality codes as input. A previously described conformation-independent chirality code was also applied and the results were compared. In both applications clustering of the two classes of enantiomers (first eluted and last eluted enantiomers) could be successfully achieved by NN and accurate predictions could be obtained for independent test sets. The chirality code described here has a potential for a broad range of applications from stereoselective reactions to analytical chemistry and to the study of biological activity of chiral compounds. PMID:11885960

  11. Prioritization of in silico models and molecular descriptors for the assessment of ready biodegradability.

    PubMed

    Fernández, Alberto; Rallo, Robert; Giralt, Francesc

    2015-10-01

    Ready biodegradability is a key property for evaluating the long-term effects of chemicals on the environment and human health. As such, it is used as a screening test for the assessment of persistent, bioaccumulative and toxic substances. Regulators encourage the use of non-testing methods, such as in silico models, to save money and time. A dataset of 757 chemicals was collected to assess the performance of four freely available in silico models that predict ready biodegradability. They were applied to develop a new consensus method that prioritizes the use of each individual model according to its performance on chemical subsets driven by the presence or absence of different molecular descriptors. This consensus method was capable of almost eliminating unpredictable chemicals, while the performance of combined models was substantially improved with respect to that of the individual models. PMID:26160046

  12. Molecular docking using the molecular lipophilicity potential as hydrophobic descriptor: impact on GOLD docking performance.

    PubMed

    Nurisso, Alessandra; Bravo, Juan; Carrupt, Pierre-Alain; Daina, Antoine

    2012-05-25

    GOLD is a molecular docking program widely used in drug design. In the initial steps of docking, it creates a list of hydrophobic fitting points inside protein cavities that steer the positioning of ligand hydrophobic moieties. These points are generated based on the Lennard-Jones potential between a carbon probe and each atom of the residues delimiting the binding site. To thoroughly describe hydrophobic regions in protein pockets and properly guide ligand hydrophobic moieties toward favorable areas, an in-house tool, the MLP filter, was developed and herein applied. This strategy only retains GOLD hydrophobic fitting points that match the rigorous definition of hydrophobicity given by the molecular lipophilicity potential (MLP), a molecular interaction field that relies on an atomic fragmental system based on 1-octanol/water experimental partition coefficients (log P(oct)). MLP computations in the binding sites of crystallographic protein structures revealed that a significant number of points considered hydrophobic by GOLD were actually polar according to the MLP definition of hydrophobicity. To examine the impact of this new tool, ligand-protein complexes from the Astex Diverse Set and the PDBbind core database were redocked with and without the use of the MLP filter. Reliable docking results were obtained by using the MLP filter that increased the quality of docking in nonpolar cavities and outperformed the standard GOLD docking approach. PMID:22462609

  13. Quantitative crystal structure descriptors from multiplicative congruential generators.

    PubMed

    Hornfeck, Wolfgang

    2012-03-01

    Special types of number-theoretic relations, termed multiplicative congruential generators (MCGs), exhibit an intrinsic sublattice structure. This has considerable implications within the crystallographic realm, namely for the coordinate description of crystal structures for which MCGs allow for a concise way of encoding the numerical structural information. Thus, a conceptual framework is established, with some focus on layered superstructures, which proposes the use of MCGs as a tool for the quantitative description of crystal structures. The multiplicative congruential method eventually affords an algorithmic generation of three-dimensional crystal structures with a near-uniform distribution of atoms, whereas a linearization procedure facilitates their combinatorial enumeration and classification. The outlook for homometric structures and dual-space crystallography is given. Some generalizations and extensions are formulated in addition, revealing the connections of MCGs with geometric algebra, discrete dynamical systems (iterative maps), as well as certain quasicrystal approximants. PMID:22338652
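
    The sublattice structure mentioned in this abstract can be seen in a small sketch. It assumes the simplest two-dimensional reading of a multiplicative congruential generator, in which the points (n, a·n mod m) for n = 0, ..., m-1 are taken as fractional coordinates after division by m; the function names and the choice of multiplier and modulus are hypothetical, and the paper's own construction may differ.

```python
def mcg_points(multiplier, modulus):
    """Integer pairs (n, multiplier * n mod modulus) for n = 0 .. modulus - 1."""
    return [(n, (multiplier * n) % modulus) for n in range(modulus)]


def fractional_coordinates(points, modulus):
    """Scale integer pairs into the unit cell [0, 1) x [0, 1)."""
    return [(x / modulus, y / modulus) for x, y in points]


def is_closed_under_addition(points, modulus):
    """Lattice check: the sum of any two points (mod modulus) is again a point."""
    point_set = set(points)
    return all(
        ((x1 + x2) % modulus, (y1 + y2) % modulus) in point_set
        for x1, y1 in points
        for x2, y2 in points
    )


# Illustrative parameters only (assumption): multiplier 2, modulus 5.
points = mcg_points(2, 5)
sites = fractional_coordinates(points, 5)
```

    Closure under addition modulo m is what makes the point set a lattice within the unit cell, so a whole arrangement of sites is encoded by just the pair (multiplier, modulus).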

  14. Morphological and Molecular Descriptors of the Developmental Cycle of Babesia divergens Parasites in Human Erythrocytes

    PubMed Central

    Rossouw, Ingrid; Maritz-Olivier, Christine; Niemand, Jandeli; van Biljon, Riette; Smit, Annel; Olivier, Nicholas A.; Birkholtz, Lyn-Marie

    2015-01-01

    Human babesiosis, especially caused by the cattle derived Babesia divergens parasite, is on the increase, resulting in renewed attentiveness to this potentially life threatening emerging zoonotic disease. The molecular mechanisms underlying the pathophysiology and intra-erythrocytic development of these parasites are poorly understood. This impedes concerted efforts aimed at the discovery of novel anti-babesiacidal agents. By applying sensitive cell biological and molecular functional genomics tools, we describe the intra-erythrocytic development cycle of B. divergens parasites from immature, mono-nucleated ring forms to bi-nucleated paired piriforms and ultimately multi-nucleated tetrads that characterizes zoonotic Babesia spp. This is further correlated for the first time to nuclear content increases during intra-erythrocytic development progression, providing insight into the part of the life cycle that occurs during human infection. High-content temporal evaluation elucidated the contribution of the different stages to life cycle progression. Moreover, molecular descriptors indicate that B. divergens parasites employ physiological adaptation to in vitro cultivation. Additionally, differential expression is observed as the parasite equilibrates its developmental stages during its life cycle. Together, this information provides the first temporal evaluation of the functional transcriptome of B. divergens parasites, information that could be useful in identifying biological processes essential to parasite survival for future anti-babesiacidal discoveries. PMID:25955414

  16. Improving B3LYP heats of formation with three-dimensional molecular descriptors.

    PubMed

    Zhou, Yuwei; Wu, Jianming; Xu, Xin

    2016-05-15

    In the present work, we propose the X3D method that extends the B3LYP method by correcting its errors on heats of formation of hydrocarbons (HCs) with three-dimensional (3D) molecular descriptors. Inspired by the widely used Wiener index, these 3D descriptors are developed to improve over the original B3LYP method for a better description of atom-atom, atom-bond and bond-bond interactions. On top of a training set of only 45 species, the X3D method is validated against various sets of different chemistry, displaying an overall near chemical accuracy. In particular, X3D improves over B3LYP, reducing its mean absolute errors from 28.4 to 0.3 kcal/mol for (Set 1) 21 n-alkanes up to n-C32H66, from 19.3 to 0.6 kcal/mol for (Set 2) n-C7H16 and its branched isomers, from 29.5 to 1.6 kcal/mol for (Set 3) 36 polycyclic saturated HCs, from 8.6 to 1.1 kcal/mol for (Set 4) 41 C6H8 isomers of rings, alkenes, alkynes, and cumulenes, from 20.3 to 0.6 kcal/mol for (Set 5) 41 benzene-based compounds, and from 8.1 to 1.3 kcal/mol for (Set 6) 66 radicals, etc. Comparisons with the G4 results are also presented. © 2016 Wiley Periodicals, Inc. PMID:26887921
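
    For readers unfamiliar with the Wiener index that inspired these descriptors, the following sketch computes the classical topological version: the sum of shortest-path bond counts over all pairs of heavy atoms in the hydrogen-suppressed molecular graph. It does not reproduce the 3D descriptors of X3D, which the abstract does not specify; the adjacency dictionaries are hand-written illustrations.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path lengths (in bonds) over all unordered atom pairs."""
    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    return total // 2  # every pair was counted once from each end


# Carbon skeletons only (hydrogens suppressed).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

w_linear = wiener_index(n_butane)
w_branched = wiener_index(isobutane)
```

    The branched isomer has the smaller index, which is the kind of sensitivity to branching that makes such indices useful for distinguishing isomers.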

  17. Computing a new family of shape descriptors for protein structures.

    PubMed

    Røgen, Peter; Sinclair, Robert

    2003-01-01

    The large-scale 3D structure of a protein can be represented by the polygonal curve through the carbon alpha atoms of the protein backbone. We introduce an algorithm for computing the average number of times that a given configuration of crossings on such polygonal curves is seen, the average being taken over all directions in space. Hereby, we introduce a new family of global geometric measures of protein structures, which we compare with the so-called generalized Gauss integrals. PMID:14632419

  18. High-throughput screening for thermoelectric sulphides by using crystal structure features as descriptors

    NASA Astrophysics Data System (ADS)

    Zhang, Ruizhi; Du, Baoli; Chen, Kan; Reece, Mike; Materials Research Institute Team

    With the increasing computational power and reliable databases, high-throughput screening is playing a more and more important role in the search for new thermoelectric (TE) materials. Rather than the well established density functional theory (DFT) calculation based methods, we propose an alternative approach to screen for new TE materials: using crystal structural features as 'descriptors'. We show that a non-distorted transition metal sulphide polyhedral network can be a good descriptor for high power factor according to crystal field theory. By using Cu/S containing compounds as an example, 1600+ Cu/S containing entries in the Inorganic Crystal Structure Database (ICSD) were screened, and of those 84 phases are identified as promising thermoelectric materials. The screening results are validated by both electronic structure calculations and experimental results from the literature. We also fabricated some new compounds to test our screening results. Another advantage of using crystal structure features as descriptors is that we can easily establish structural relationships between the identified phases. Based on this, two material design approaches are discussed: 1) High-pressure synthesis of metastable phase; 2) In-situ 2-phase composites with coherent interface. This work was supported by a Marie Curie International Incoming Fellowship of the European Community Human Potential Program.

  19. On phase/current components of entropy/information descriptors of molecular states

    NASA Astrophysics Data System (ADS)

    Nalewajski, Roman F.

    2014-10-01

    Quantum-generalised descriptors of the information content of electronic states in molecules are proposed, in which non-classical (current) terms complement classical (probability) functionals of the ordinary information theory. The relation between densities of the familiar classical Fisher and Shannon information/entropy measures is applied to determine their non-classical complements. The quantum supplement of the classical Shannon entropy describes the average magnitude of the phase distribution, while the current term in the Fisher measure accounts for the gradient content of the state phase function. Illustrative applications of these quantum information concepts are presented and thermodynamical analogies are commented upon. The particle-density-constrained (vertical) and -unconstrained (horizontal) equilibria in molecules and their fragments are explored and the corresponding equilibrium 'thermodynamic' phases are determined. A separation of the density (modulus) and current (phase) factors of general many-electron states is effected using the Harriman-Zumbach-Maschke construction of antisymmetric states yielding the specified electron density. A phenomenological framework in the spirit of the non-equilibrium thermodynamical description is proposed. It accounts for both the density and current degrees of freedom of molecular states. The associated entropy source in the information continuity equation is derived.

  20. A rotation-translation invariant molecular descriptor of partial charges and its use in ligand-based virtual screening

    PubMed Central

    2014-01-01

    Background: Measures of similarity for chemical molecules have been developed since the dawn of chemoinformatics. Molecular similarity has been measured by a variety of methods including molecular descriptor based similarity, common molecular fragments, graph matching and 3D methods such as shape matching. Similarity measures are widespread in practice and have proven to be useful in drug discovery. Because of our interest in electrostatics and high throughput ligand-based virtual screening, we sought to exploit the information contained in atomic coordinates and partial charges of a molecule. Results: A new molecular descriptor based on partial charges is proposed. It uses the autocorrelation function and linear binning to encode all atoms of a molecule into two rotation-translation invariant vectors. Combined with a scoring function, the descriptor allows a database of compounds to be rank-ordered against a query molecule. The proposed implementation is called ACPC (AutoCorrelation of Partial Charges) and is released as open source. Extensive retrospective ligand-based virtual screening experiments were performed, and the method was compared against other methods in order to validate it and the associated protocol. Conclusions: While it is a simple method, it performed remarkably well in experiments. At an average speed of 1649 molecules per second, it reached an average median area under the curve of 0.81 on 40 different targets, thereby validating the proposed protocol and implementation. PMID:24887178
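
    The idea of a rotation-translation invariant charge autocorrelation can be sketched as follows. This is not the ACPC implementation: it builds a single vector rather than the two described in the abstract, and it interprets "linear binning" as sharing each pair's charge product between the two nearest distance bins, which is an assumption. The function name, bin width, bin count and toy coordinates are all hypothetical.

```python
import math
from itertools import combinations


def charge_autocorrelation(atoms, bin_width=1.0, n_bins=5):
    """atoms: list of (x, y, z, partial_charge). Returns a list of n_bins floats.

    Bin k is centred on distance k * bin_width. Only interatomic distances are
    used, so the result does not depend on the molecule's position or orientation.
    """
    vector = [0.0] * n_bins
    for (x1, y1, z1, q1), (x2, y2, z2, q2) in combinations(atoms, 2):
        distance = math.dist((x1, y1, z1), (x2, y2, z2))
        position = distance / bin_width      # fractional bin coordinate
        low = int(math.floor(position))
        fraction = position - low
        product = q1 * q2
        # Linear binning: split the product between the two neighbouring bins.
        if low < n_bins:
            vector[low] += product * (1.0 - fraction)
        if low + 1 < n_bins:
            vector[low + 1] += product * fraction
    return vector


# Toy three-atom "molecule" with made-up coordinates and charges.
molecule = [(0.0, 0.0, 0.0, -0.4), (1.0, 0.0, 0.0, 0.2), (0.0, 1.5, 0.0, 0.2)]
# The same molecule rotated by 90 degrees about z, then translated by (3, -2, 5).
moved = [(-y + 3.0, x - 2.0, z + 5.0, q) for x, y, z, q in molecule]

original_vector = charge_autocorrelation(molecule)
moved_vector = charge_autocorrelation(moved)
```

    Sharing each contribution between neighbouring bins makes the vector vary continuously with the coordinates, so small numerical differences in a distance cannot move a whole contribution from one bin to the next.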

  1. Descriptors, physical properties, and drug-likeness.

    PubMed

    Brüstle, Matthias; Beck, Bernd; Schindler, Torsten; King, William; Mitchell, Timothy; Clark, Timothy

    2002-08-01

    We have investigated techniques for distinguishing between drugs and nondrugs using a set of molecular descriptors derived from semiempirical molecular orbital (AM1) calculations. The "drug" data set of 2105 compounds was derived from the World Drug Index (WDI) using a procedure designed to select real drugs. The "nondrug" data set was the Maybridge database. We have first investigated the dimensionality of physical properties space based on a set of 26 descriptors that we have used successfully to build absorption, distribution, metabolism, and excretion-related quantitative structure-property relationship models. We discuss the general nature of the descriptors for physical property space and the ability of these descriptors to distinguish between drugs and nondrugs. The third most significant principal component of this set of descriptors serves as a useful numerical index of drug-likeness, but no others are able to distinguish between drugs and nondrugs. We have therefore extended our set of descriptors to a total of 66 and have used recursive partitioning to identify the descriptors that can distinguish between drugs and nondrugs. This procedure pointed to two of the descriptors that play an important role in the principal component found above and one more from the set of 40 extra descriptors. These three descriptors were then used to train a Kohonen artificial neural net for the entire Maybridge data set. Projecting the drug database onto the map obtained resulted in a clear distinction not only between drugs and nondrugs but also, for instance, between hormones and other drugs. Projection of 42 131 compounds from the WDI onto the Kohonen map also revealed pronounced clustering in the regions of the map assigned as druglike. PMID:12139446

  2. A novel texture descriptor for detection of glandular structures in colon histology images

    NASA Astrophysics Data System (ADS)

    Sirinukunwattana, Korsuk; Snead, David R.; Rajpoot, Nasir M.

    2015-03-01

    The first step prior to most analyses of histopathology images is the detection of the area of interest. In this work, we present a superpixel-based approach for glandular structure detection in colon histology images. An image is first segmented into superpixels with the constraint on the presence of glandular boundaries. Texture and color information is then extracted from each superpixel to calculate the probability of that superpixel belonging to glandular regions, resulting in a glandular probability map. In addition, we present a novel texture descriptor derived from a region covariance matrix of scattering coefficients. Our approach shows encouraging results for the detection of glandular structures in colon tissue samples.

  3. Methanol Oxidative Dehydrogenation on Oxide Catalysts: Molecular and Dissociative Routes and Hydrogen Addition Energies as Descriptors of Reactivity

    SciTech Connect

    Deshlahra, Prashant; Iglesia, Enrique

    2014-11-13

    The oxidative dehydrogenation (ODH) of alkanols on oxide catalysts is generally described as involving H-abstraction from alkoxy species formed via O–H dissociation. Kinetic and isotopic data cannot discern between such routes and those involving kinetically-relevant H-abstraction from undissociated alkanols. Here, we combine such experiments with theoretical estimates of activation energies and entropies to show that the latter molecular routes prevail over dissociative routes for methanol reactions on polyoxometalate (POM) clusters at all practical reaction temperatures. The stability of the late transition states that mediate H-abstraction depends predominantly on the stability of the O–H bond formed, making H-addition energies (HAE) accurate and single-valued descriptors of reactivity. Density functional theory-derived activation energies depend linearly on HAE values at each O-atom location on clusters with a range of composition (H3PMo12, H4SiMo12, H3PW12, H4PV1Mo11, and H4PV1W11); both barriers and HAE values reflect the lowest unoccupied molecular orbital energy of metal centers that accept the electron and the protonation energy of O-atoms that accept the proton involved in the H-atom transfer. Bridging O-atoms form O–H bonds that are stronger than those of terminal atoms and therefore exhibit more negative HAE values and higher ODH reactivity on all POM clusters. For each cluster composition, ODH turnover rates reflect the reactivity-averaged HAE of all accessible O-atoms, which can be evaluated for each cluster composition to provide a rigorous and accurate predictor of ODH reactivity for catalysts with known structure. These relations together with oxidation reactivity measurements can then be used to estimate HAE values and to infer plausible structures for catalysts with uncertain active site structures.

  4. Hydration Free Energy as a Molecular Descriptor in Drug Design: A Feasibility Study.

    PubMed

    Zafar, Ayesha; Reynisson, Jóhannes

    2016-05-01

    This work investigated whether calculated hydration energy (ΔGhyd) can be used as a molecular descriptor in defining promising regions of chemical space for drug design. Calculating ΔGhyd using the Density Solvation Model (SMD) in conjunction with the density functional theory (DFT) gave an excellent correlation with experimental values. Furthermore, calculated ΔGhyd correlates reasonably well with experimental water solubility (r² = 0.545) and also log P (r² = 0.530). Three compound collections were used: known drugs (n=150), drug-like compounds (n=100) and simple organic compounds (n=140). As an approximation, only molecules that do not de/protonate at physiological pH were considered. A relatively broad distribution was seen for the known drugs, with an average of -15.3 kcal/mol and a standard deviation of 7.5 kcal/mol. Interestingly, averages much smaller in magnitude were found for the drug-like compounds (-7.5 kcal/mol) and the simple organic compounds (-3.1 kcal/mol), with tighter distributions (standard deviations of 4.3 and 3.2 kcal/mol, respectively). This trend was not observed for these collections when calculated log P and log S values were used. The considerably more exothermic ΔGhyd average for the known drugs clearly indicates that, in order to develop a successful drug candidate, a ΔGhyd value of -5 kcal/mol or lower is preferable. PMID:27492087

  5. Comparison of topological descriptors for similarity-based virtual screening using multiple bioactive reference structures.

    PubMed

    Hert, Jérôme; Willett, Peter; Wilton, David J; Acklin, Pierre; Azzaoui, Kamal; Jacoby, Edgar; Schuffenhauer, Ansgar

    2004-11-21

    This paper reports a detailed comparison of a range of different types of 2D fingerprints when used for similarity-based virtual screening with multiple reference structures. Experiments with the MDL Drug Data Report database demonstrate the effectiveness of fingerprints that encode circular substructure descriptors generated using the Morgan algorithm. These fingerprints are notably more effective than fingerprints based on a fragment dictionary, on hashing and on topological pharmacophores. The combination of these fingerprints with data fusion based on similarity scores provides both an effective and an efficient approach to virtual screening in lead-discovery programmes. PMID:15534703
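
    Similarity searching with multiple reference structures and score-based data fusion can be sketched in a few lines. The sketch assumes fingerprints are represented as sets of "on" bit indices, uses the Tanimoto coefficient, and fuses by taking the maximum score over the references, which is one common fusion rule and is an assumption here rather than a detail stated in the abstract. The fingerprints and molecule names are made up; in practice they would come from a cheminformatics toolkit.

```python
def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fingerprint_a | fingerprint_b)
    return len(fingerprint_a & fingerprint_b) / union if union else 0.0


def max_fusion_ranking(references, database):
    """Rank database molecules by their best similarity to any reference structure."""
    scored = []
    for name, fingerprint in database.items():
        fused = max(tanimoto(reference, fingerprint) for reference in references)
        scored.append((fused, name))
    # Highest fused score first; ties broken alphabetically for reproducibility.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in scored]


# Hypothetical fingerprints for two known actives and three database molecules.
references = [frozenset({1, 2, 3, 4}), frozenset({10, 11, 12})]
database = {
    "mol_a": frozenset({1, 2, 3, 5}),
    "mol_b": frozenset({10, 11, 12}),
    "mol_c": frozenset({20, 21}),
}

ranking = max_fusion_ranking(references, database)
```

    Taking the maximum lets a database molecule rank highly if it resembles any one of the references, which suits a set of structurally diverse actives.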

  6. A new quantitative structure-property relationship model to predict bioconcentration factors of polychlorinated biphenyls (PCBs) in fishes using E-state index and topological descriptors.

    PubMed

    de Melo, Eduardo Borges

    2012-01-01

    A quantitative structure-property relationship (QSPR) study for predicting the logarithm of bioconcentration factors (LogBCF) of polychlorinated biphenyls (PCBs) is presented in this work. The descriptors were obtained using only the Simplified Molecular Input Line Entry System (SMILES) strings in the free web server Parameter Client. The model was built using the Partial Least Squares (PLS) regression method. The best model used five descriptors (one E-state index and four topological descriptors) and showed high quality of fit and of internal and external predictions. The leave-N-out (LNO) cross validation and the y-randomization test showed that the model is robust and shows no chance correlation. With a second test set, the model was compared to other models and presented a root mean square error (RMSE) very close to that of the best model. The mechanistic interpretation was corroborated by other works in the literature and by the descriptors' theory. Thus, the results meet the five Organization for Economic Co-operation and Development (OECD) principles for validation of QSA(P)R models, and it is expected that the model can effectively predict the BCF values in fishes of the PCB congeners that lack highly reliable experimental BCF values. PMID:21959189

  7. Collision cross section prediction of deprotonated phenolics in a travelling-wave ion mobility spectrometer using molecular descriptors and chemometrics.

    PubMed

    Gonzales, Gerard Bryan; Smagghe, Guy; Coelus, Sofie; Adriaenssens, Dieter; De Winter, Karel; Desmet, Tom; Raes, Katleen; Van Camp, John

    2016-06-14

    The combination of ion mobility and mass spectrometry (MS) affords significant improvements over conventional MS/MS, especially in the characterization of isomeric metabolites due to the differences in their collision cross sections (CCS). Experimentally obtained CCS values are typically matched with theoretical CCS values from Trajectory Method (TM) and/or Projection Approximation (PA) calculations. In this paper, predictive models for CCS of deprotonated phenolics were developed using molecular descriptors and chemometric tools, stepwise multiple linear regression (SMLR), principal components regression (PCR), and partial least squares regression (PLS). A total of 102 molecular descriptors were generated and reduced to 28 after employing a feature selection tool, composed of mass, topological descriptors, Jurs descriptors and shadow indices. Therefore, the generated models considered the effects of mass, 3D conformation and partial charge distribution on CCS, which are the main parameters for either TM or PA (only 3D conformation) calculations. All three techniques yielded highly predictive models for both the training (R(2)SMLR = 0.9911; R(2)PCR = 0.9917; R(2)PLS = 0.9918) and validation datasets (R(2)SMLR = 0.9489; R(2)PCR = 0.9761; R(2)PLS = 0.9760). Also, the high cross validated R(2) values indicate that the generated models are robust and highly predictive (Q(2)SMLR = 0.9859; Q(2)PCR = 0.9748; Q(2)PLS = 0.9760). The predictions were also very comparable to the results from TM calculations using modified mobcal (N2). Most importantly, this method offered a rapid (<10 min) alternative to TM calculations without compromising predictive ability. These methods could therefore be used in routine analysis and could be easily integrated into metabolite identification platforms. PMID:27181646
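
    The difference between the training R(2) and the cross-validated Q(2) quoted above can be shown with a small sketch. It is deliberately simplified: a single hypothetical descriptor and ordinary least squares stand in for the multivariate SMLR, PCR and PLS models of the paper, and the numbers are invented. Q(2) is computed here by leave-one-out as 1 - PRESS/SS_tot, with SS_tot taken about the mean of the full training set; other conventions exist.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sum_of_squares(y: List[float]) -> float:
    """Sum of squared deviations of y about its mean."""
    mean_y = sum(y) / len(y)
    return sum((yi - mean_y) ** 2 for yi in y)


def r_squared(x: List[float], y: List[float]) -> float:
    """Coefficient of determination of the fit on the training data."""
    slope, intercept = fit_line(x, y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - ss_res / total_sum_of_squares(y)


def q_squared_loo(x: List[float], y: List[float]) -> float:
    """Leave-one-out Q2: refit without sample i, then predict sample i."""
    press = 0.0
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    return 1.0 - press / total_sum_of_squares(y)


# Invented data: one descriptor value and one "CCS" value per compound.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ccs = [101.2, 109.8, 121.1, 129.5, 141.0, 149.7]
print(f"R2 = {r_squared(descriptor, ccs):.4f}")
print(f"Q2 = {q_squared_loo(descriptor, ccs):.4f}")
```

    For least-squares fits Q(2) computed this way cannot exceed R(2), so a large gap between the two is a common warning sign of overfitting.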

  8. Surface area and cortical thickness descriptors reveal different attributes of the structural human brain networks.

    PubMed

    Sanabria-Diaz, Gretel; Melie-García, Lester; Iturria-Medina, Yasser; Alemán-Gómez, Yasser; Hernández-González, Gertrudis; Valdés-Urrutia, Lourdes; Galán, Lídice; Valdés-Sosa, Pedro

    2010-05-01

    Recently, a related morphometry-based connection concept has been introduced using local mean cortical thickness and volume to study the underlying complex architecture of the brain networks. In this article, the surface area is employed as a morphometric descriptor to study the concurrent changes between brain structures and to build binarized connectivity graphs. The statistical similarity in surface area between pairs of regions was measured by computing the partial correlation coefficient across 186 normal subjects of the Cuban Human Brain Mapping Project. We demonstrated that the connectivity matrices obtained follow a small-world behavior for two different parcellations of the brain gray matter. The properties of the connectivity matrices were compared to the matrices obtained using the mean cortical thickness for the same cortical parcellations. The topologies of the cortical thickness and surface area networks were statistically different, demonstrating that both capture distinct properties of the interaction or different aspects of the same interaction (mechanical, anatomical, chemical, etc.) between brain structures. This finding could be explained by the fact that each descriptor is driven by distinct cellular mechanisms as a result of a distinct genetic origin. To our knowledge, this is the first time that surface area has been used to study the morphological connectivity of brain networks. PMID:20083210
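
    The step from partial correlations to a binarized connectivity graph can be sketched as follows. This is a toy illustration under stated assumptions, not the study's pipeline: it handles exactly three regions, controls each pairwise correlation for the single remaining region using the first-order partial correlation formula, and applies a fixed threshold to the absolute value. The region names, per-subject values and the threshold are all hypothetical, and the abstract does not say which variables were controlled for or how the graphs were thresholded.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x: List[float], y: List[float], z: List[float]) -> float:
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def binarized_graph(
    regions: Dict[str, List[float]], threshold: float
) -> Dict[Tuple[str, str], int]:
    """Edge = 1 where |partial correlation| >= threshold (three regions only)."""
    names = list(regions)
    edges = {}
    for a, b in combinations(names, 2):
        (c,) = [n for n in names if n not in (a, b)]  # the one remaining region
        r = partial_corr(regions[a], regions[b], regions[c])
        edges[(a, b)] = 1 if abs(r) >= threshold else 0
    return edges


# Invented surface-area values (one entry per subject) for three regions.
surface_area = {
    "region_A": [10.1, 11.3, 9.8, 12.0, 10.7, 11.1],
    "region_B": [8.2, 9.1, 7.9, 9.6, 8.8, 8.7],
    "region_C": [5.0, 5.2, 5.9, 5.1, 5.6, 5.3],
}
print(binarized_graph(surface_area, threshold=0.5))
```

    With many regions the partial correlations would normally be obtained from the inverse of the correlation matrix rather than pair by pair.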

  9. Advances in structural damage assessment using strain measurements and invariant shape descriptors

    NASA Astrophysics Data System (ADS)

    Patki, Amol Suhas

    to the area surrounding the damage, while damage in orthotropic materials tends to have more global repercussions. This calls for analysis of full-field strain distributions adding to the complexity of post-damage life estimation. This study explores shape descriptors used in the field of medical imagery, military targeting and biometric recognition for obtaining a qualitative and quantitative comparison between full-field strain data recorded from damaged composite panels using sophisticated experimental techniques. These descriptors are capable of decomposing images with 10^3 to 10^6 pixels into a feature vector with only a few hundred elements. This ability of shape descriptors to achieve enormous reduction in strain data, while providing unique representation, makes them a practical choice for the purpose of structural damage assessment. Consequently, it is relatively easy to statistically compare the shape descriptors of the full-field strain maps using similarity measures rather than the strain maps themselves. However, the wide range of geometric and design features in engineering components poses difficulties in the application of traditional shape description techniques. Thus a new shape descriptor is developed which is applicable to a wide range of specimen geometries. This work also illustrates how shape description techniques can be applied to full-field finite element model validations and updating.

  10. Molecular descriptor data explain market prices of a large commercial chemical compound library

    PubMed Central

    Polanski, Jaroslaw; Kucia, Urszula; Duszkiewicz, Roksana; Kurczyk, Agata; Magdziarz, Tomasz; Gasteiger, Johann

    2016-01-01

    The relationship between the structure and a property of a chemical compound is an essential concept in chemistry guiding, for example, drug design. Actually, however, we need economic considerations to fully understand the fate of drugs on the market. We are performing here for the first time the exploration of quantitative structure-economy relationships (QSER) for a large dataset of a commercial building block library of over 2.2 million chemicals. This investigation provided molecular statistics that shows that on average what we are paying for is the quantity of matter. On the other side, the influence of synthetic availability scores is also revealed. Finally, we are buying substances by looking at the molecular graphs or molecular formulas. Thus, those molecules that have a higher number of atoms look more attractive and are, on average, also more expensive. Our study shows how data binning could be used as an informative method when analyzing big data in chemistry. PMID:27334348
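
    Data binning of the kind mentioned above can be sketched in a few lines. The example groups compounds into fixed-width bins by atom count and reports the mean price per bin. The records, the bin width and the choice of atom count as the binned variable are illustrative assumptions, not values from the study.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def binned_means(
    records: List[Tuple[int, float]], bin_width: int
) -> Dict[int, float]:
    """Mean price per atom-count bin, keyed by the lower edge of each bin."""
    totals = defaultdict(lambda: [0.0, 0])  # lower edge -> [price sum, count]
    for atom_count, price in records:
        lower_edge = (atom_count // bin_width) * bin_width
        totals[lower_edge][0] += price
        totals[lower_edge][1] += 1
    return {
        lower: price_sum / count
        for lower, (price_sum, count) in sorted(totals.items())
    }


# Invented (atom count, price) pairs for five compounds.
records = [(12, 30.0), (14, 50.0), (23, 80.0), (27, 100.0), (35, 150.0)]
means = binned_means(records, bin_width=10)
for lower, mean_price in means.items():
    print(f"{lower}-{lower + 9} atoms: mean price {mean_price:.1f}")
```

    Averaging within bins smooths out compound-to-compound scatter, which is what makes broad trends visible in very large libraries.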

  11. Molecular descriptor data explain market prices of a large commercial chemical compound library

    NASA Astrophysics Data System (ADS)

    Polanski, Jaroslaw; Kucia, Urszula; Duszkiewicz, Roksana; Kurczyk, Agata; Magdziarz, Tomasz; Gasteiger, Johann

    2016-06-01

    The relationship between the structure and a property of a chemical compound is an essential concept in chemistry guiding, for example, drug design. Actually, however, we need economic considerations to fully understand the fate of drugs on the market. We are performing here for the first time the exploration of quantitative structure-economy relationships (QSER) for a large dataset of a commercial building block library of over 2.2 million chemicals. This investigation provided molecular statistics that shows that on average what we are paying for is the quantity of matter. On the other side, the influence of synthetic availability scores is also revealed. Finally, we are buying substances by looking at the molecular graphs or molecular formulas. Thus, those molecules that have a higher number of atoms look more attractive and are, on average, also more expensive. Our study shows how data binning could be used as an informative method when analyzing big data in chemistry.

  12. Molecular descriptor data explain market prices of a large commercial chemical compound library.

    PubMed

    Polanski, Jaroslaw; Kucia, Urszula; Duszkiewicz, Roksana; Kurczyk, Agata; Magdziarz, Tomasz; Gasteiger, Johann

    2016-01-01

    The relationship between the structure and a property of a chemical compound is an essential concept in chemistry guiding, for example, drug design. Actually, however, we need economic considerations to fully understand the fate of drugs on the market. We are performing here for the first time the exploration of quantitative structure-economy relationships (QSER) for a large dataset of a commercial building block library of over 2.2 million chemicals. This investigation provided molecular statistics that shows that on average what we are paying for is the quantity of matter. On the other side, the influence of synthetic availability scores is also revealed. Finally, we are buying substances by looking at the molecular graphs or molecular formulas. Thus, those molecules that have a higher number of atoms look more attractive and are, on average, also more expensive. Our study shows how data binning could be used as an informative method when analyzing big data in chemistry. PMID:27334348

  13. Quantitative structure-activation barrier relationship modeling for Diels-Alder ligations utilizing quantum chemical structural descriptors

    PubMed Central

    2013-01-01

    Background: In the present study, we show the correlation of quantum chemical structural descriptors with the activation barriers of the Diels-Alder ligations. A set of 72 non-catalysed Diels-Alder reactions were subjected to quantitative structure-activation barrier relationship (QSABR) under the framework of theoretical quantum chemical descriptors calculated solely from the structures of diene and dienophile reactants. Experimental activation barrier data were obtained from literature. Descriptors were computed using Hartree-Fock theory with the 6-31G(d) basis set as implemented in Gaussian 09 software. Results: Variable selection and model development were carried out by stepwise multiple linear regression methodology. Predictive performance of the quantitative structure-activation barrier relationship (QSABR) model was assessed by the training and test set concept and by calculating leave-one-out cross-validated Q(2) and predictive R(2) values. The QSABR model can explain and predict 86.5% and 80% of the variances, respectively, in the activation energy barrier training data. Alternatively, a neural network model based on back propagation of errors was developed to assess the nonlinearity of the sought correlations between theoretical descriptors and experimental reaction barriers. Conclusions: A reasonable predictability for the activation barrier of the test set reactions was obtained, which enabled an exploration and interpretation of the significant variables responsible for Diels-Alder interaction between dienes and dienophiles. Thus, studies in the direction of QSABR modelling that provide efficient and fast prediction of activation barriers of the Diels-Alder reactions turn out to be a meaningful alternative to transition state theory based computation. PMID:24171724

  14. Adaptive modelling of structured molecular representations for toxicity prediction

    NASA Astrophysics Data System (ADS)

    Bertinetto, Carlo; Duce, Celia; Micheli, Alessio; Solaro, Roberto; Tiné, Maria Rosaria

    2012-12-01

    We investigated the possibility of modelling structure-toxicity relationships by direct treatment of the molecular structure (without using descriptors) through an adaptive model able to retain the appropriate structural information. With respect to traditional descriptor-based approaches, this provides a more general and flexible way to tackle prediction problems that is particularly suitable when little or no background knowledge is available. Our method employs a tree-structured molecular representation, which is processed by a recursive neural network (RNN). To explore the realization of RNN modelling in toxicological problems, we employed a data set containing growth impairment concentrations (IGC50) for Tetrahymena pyriformis.

  15. Electronic structure descriptor for the discovery of narrow-band red-emitting phosphors

    DOE PAGESBeta

    Wang, Zhenbin; Chu, Iek-Heng; Zhou, Fei; Ong, Shyue Ping

    2016-05-09

    Narrow-band red-emitting phosphors are a critical component of phosphor-converted light-emitting diodes for highly efficient illumination-grade lighting. In this work, we report the discovery of a quantitative descriptor for narrow-band Eu2+-activated emission identified through a comparison of the electronic structures of known narrow-band and broad-band phosphors. We find that a narrow emission bandwidth is characterized by a large splitting of more than 0.1 eV between the two highest Eu2+ 4f7 bands. By incorporating this descriptor in a high-throughput first-principles screening of 2259 nitride compounds, we identify five promising new nitride hosts for Eu2+-activated red-emitting phosphors that are predicted to exhibit good chemical stability, thermal quenching resistance, and quantum efficiency, as well as narrow-band emission. Lastly, our findings provide important insights into the emission characteristics of rare-earth activators in phosphor hosts and a general strategy to the discovery of phosphors with a desired emission peak and bandwidth.

  16. Predictive Modeling of Chemical Hazard by Integrating Numerical Descriptors of Chemical Structures and Short-term Toxicity Assay Data

    PubMed Central

    Rusyn, Ivan; Sedykh, Alexander; Guyton, Kathryn Z.; Tropsha, Alexander

    2012-01-01

    Quantitative structure-activity relationship (QSAR) models are widely used for in silico prediction of in vivo toxicity of drug candidates or environmental chemicals, adding value to candidate selection in drug development or in a search for less hazardous and more sustainable alternatives for chemicals in commerce. The development of traditional QSAR models is enabled by numerical descriptors representing the inherent chemical properties that can be easily defined for any number of molecules; however, traditional QSAR models often have limited predictive power due to the lack of data and complexity of in vivo endpoints. Although it has been indeed difficult to obtain experimentally derived toxicity data on a large number of chemicals in the past, the results of quantitative in vitro screening of thousands of environmental chemicals in hundreds of experimental systems are now available and continue to accumulate. In addition, publicly accessible toxicogenomics data collected on hundreds of chemicals provide another dimension of molecular information that is potentially useful for predictive toxicity modeling. These new characteristics of molecular bioactivity arising from short-term biological assays, i.e., in vitro screening and/or in vivo toxicogenomics data can now be exploited in combination with chemical structural information to generate hybrid QSAR–like quantitative models to predict human toxicity and carcinogenicity. Using several case studies, we illustrate the benefits of a hybrid modeling approach, namely improvements in the accuracy of models, enhanced interpretation of the most predictive features, and expanded applicability domain for wider chemical space coverage. PMID:22387746

  17. Novel 3D bio-macromolecular bilinear descriptors for protein science: Predicting protein structural classes.

    PubMed

    Marrero-Ponce, Yovani; Contreras-Torres, Ernesto; García-Jacas, César R; Barigye, Stephen J; Cubillán, Néstor; Alvarado, Ysaías J

    2015-06-01

    In the present study, we introduce novel 3D protein descriptors based on the bilinear algebraic form in the ℝ(n) space on the coulombic matrix. For the calculation of these descriptors, macromolecular vectors belonging to ℝ(n) space, whose components represent certain amino acid side-chain properties, were used as weighting schemes. Generalization approaches for the calculation of inter-amino acidic residue spatial distances based on Minkowski metrics are proposed. The simple- and double-stochastic schemes were defined as approaches to normalize the coulombic matrix. The local-fragment indices for both amino acid-types and amino acid-groups are presented in order to permit characterizing fragments of interest in proteins. On the other hand, with the objective of taking into account specific interactions among amino acids in global or local indices, geometric and topological cut-offs are defined. To assess the utility of global and local indices, a classification model for the prediction of the four major protein structural classes was built with the Linear Discriminant Analysis (LDA) technique. The developed LDA-model correctly classifies 92.6% and 92.7% of the proteins in the training and test sets, respectively. The obtained model showed high values of the generalized square correlation coefficient (GC(2)) on both the training and test series. The statistical parameters derived from the internal and external validation procedures demonstrate the robustness, stability and the high predictive power of the proposed model. The performance of the LDA-model demonstrates the capability of the proposed indices not only to codify relevant biochemical information related to the structural classes of proteins, but also to yield suitable interpretability. It is anticipated that the current method will benefit the prediction of other protein attributes or functions. PMID:25843214

  18. Modular Chemical Descriptor Language (MCDL): Stereochemical modules

    SciTech Connect

    Gakh, Andrei A; Burnett, Michael N; Trepalin, Sergei V.; Yarkov, Alexander V

    2011-01-01

    In our previous papers we introduced the Modular Chemical Descriptor Language (MCDL) for providing a linear representation of chemical information. A subsequent development was the MCDL Java Chemical Structure Editor which is capable of drawing chemical structures from linear representations and generating MCDL descriptors from structures. In this paper we present MCDL modules and accompanying software that incorporate unique representation of molecular stereochemistry based on Cahn-Ingold-Prelog and Fischer ideas in constructing stereoisomer descriptors. The paper also contains additional discussions regarding canonical representation of stereochemical isomers, and brief algorithm descriptions of the open source LINDES, Java applet, and Open Babel MCDL processing module software packages. Testing of the upgraded MCDL Java Chemical Structure Editor on compounds taken from several large and diverse chemical databases demonstrated satisfactory performance for storage and processing of stereochemical information in MCDL format.

  19. Convergent study of Ru-ligand interactions through QTAIM, ELF, NBO molecular descriptors and TDDFT analysis of organometallic dyes

    NASA Astrophysics Data System (ADS)

    Sánchez-Coronilla, Antonio; Sánchez-Márquez, Jesús; Zorrilla, David; Martín, Elisa I.; de los Santos, Desireé M.; Navas, Javier; Fernández-Lorenzo, Concha; Alcántara, Rodrigo; Martín-Calleja, Joaquín

    2014-08-01

    We report a theoretical study of a series of Ru complexes of interest in dye-sensitised solar cells, in organic light-emitting diodes, and in the war against cancer. Other metal centres, such as Cr, Co, Ni, Rh, Pd, and Pt, have been included for comparison purposes. The metal-ligand trends in organometallic chemistry for those compounds are shown synergistically by using three molecular descriptors: quantum theory of atoms in molecules (QTAIM), electron localisation function (ELF) and second-order perturbation theory analysis of the natural bond orbital (NBO). The metal-ligand bond order is addressed through both the delocalisation index (DI) of QTAIM and the fluctuation index (λ) of ELF. Correlation between DI and λ for the Ru-N bond in those complexes is introduced for the first time. Electron transfer and stability were also assessed by the second-order perturbation theory analysis of the NBO. Electron transfer from the lone pair NBO of the ligands toward the antibonding lone pair NBO of the metal plays a relevant role in stabilising the complexes, providing useful insights into understanding the effect of the 'expanded ligand' principle in supramolecular chemistry. Finally, absorption wavelengths associated with the metal-to-ligand charge transfer transitions and the highest occupied molecular orbital (HOMO)--lowest unoccupied molecular orbital (LUMO) characteristics were studied by time-dependent density functional theory.

  20. The use of density functional theory-based reactivity descriptors in molecular similarity calculations

    NASA Astrophysics Data System (ADS)

    Boon, Greet; De Proft, Frank; Langenaeker, Wilfried; Geerlings, Paul

    1998-10-01

    Molecular similarity is studied via density functional theory-based similarity indices using a numerical integration method. Complementary to the existing similarity indices, we introduce a reactivity-related similarity index based on the local softness. After a study of some test systems, a series of peptide isosteres is studied in view of their importance in pharmacology. The whole of the present work illustrates the importance of the study of molecular similarity based on both shape and reactivity.
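
    For orientation, a density-based similarity index of the Carbó type and a softness-based analogue can be written as below. The abstract does not give the functional form of either index, so this is an assumption: the first expression is the standard Carbó index built from the electron densities of molecules A and B, and the second simply substitutes the local softness s(r) for the density to show what a reactivity-related index of the same shape would look like.

```latex
% Carbo-type similarity index from electron densities (standard form)
Z_{AB} = \frac{\int \rho_A(\mathbf{r})\,\rho_B(\mathbf{r})\,d\mathbf{r}}
              {\sqrt{\int \rho_A^{2}(\mathbf{r})\,d\mathbf{r}\,
                     \int \rho_B^{2}(\mathbf{r})\,d\mathbf{r}}}

% Assumed softness-based analogue: local softness s(r) replaces the density
Z_{AB}^{(s)} = \frac{\int s_A(\mathbf{r})\,s_B(\mathbf{r})\,d\mathbf{r}}
                    {\sqrt{\int s_A^{2}(\mathbf{r})\,d\mathbf{r}\,
                           \int s_B^{2}(\mathbf{r})\,d\mathbf{r}}}
```

    Both expressions are normalized overlap integrals, so each equals 1 when the two functions are identical, and both depend on how the two molecules are aligned before integration.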

  1. On the Development and Use of Large Chemical Similarity Networks, Informatics Best Practices and Novel Chemical Descriptors Towards Materials Quantitative Structure Property Relationships

    NASA Astrophysics Data System (ADS)

    Krein, Michael

    After decades of development and use in a variety of application areas, Quantitative Structure Property Relationships (QSPRs) and related descriptor-based statistical learning methods have achieved a level of infamy due to their misuse. The field is rife with past examples of overtrained models, overoptimistic performance assessment, and outright cheating in the form of explicitly removing data to fit models. These actions do not serve the community well, nor are they beneficial to future predictions based on established models. In practice, in order to select combinations of descriptors and machine learning methods that might work best, one must consider the nature and size of the training and test datasets, be aware of existing hypotheses about the data, and resist the temptation to bias structure representation and modeling to explicitly fit the hypotheses. The definition and application of these best practices is important for obtaining actionable modeling outcomes, and for setting user expectations of modeling accuracy when predicting the endpoint values of unknowns. A wide variety of statistical learning approaches, descriptor types, and model validation strategies are explored herein, with the goals of helping end users understand the factors involved in creating and using QSPR models effectively, and to better understand relationships within the data, especially by looking at the problem space from multiple perspectives. Molecular relationships are commonly envisioned in a continuous high-dimensional space of numerical descriptors, referred to as chemistry space. Descriptor and similarity metric choice influence the partitioning of this space into regions corresponding to local structural similarity. These regions, known as domains of applicability, are most likely to be successfully modeled by a QSPR. In Chapter 2, the network topology and scaling relationships of several chemistry spaces are thoroughly investigated. Chemistry spaces studied include the

  2. Essential Set of Molecular Descriptors for ADME Prediction in Drug and Environmental Chemical Space

    EPA Science Inventory

    Historically, the disciplines of pharmacology and toxicology have embraced quantitative structure-activity relationships (QSAR) and quantitative structure-property relationships (QSPR) to predict ADME properties or biological activities of untested chemicals. The question arises ...

  3. Taking advantage of local structure descriptors to analyze interresidue contacts in protein structures and protein complexes.

    PubMed

    Martin, Juliette; Regad, Leslie; Etchebest, Catherine; Camproux, Anne-Claude

    2008-11-15

    Interresidue contacts within protein structures and at protein-protein interfaces are classically described by the amino acid types of the interacting residues, and the local structural context of the contact, if any, is described using secondary structures. In this study, we present an alternate analysis of interresidue contacts using local structures defined by the structural alphabet introduced by Camproux et al. This structural alphabet allows a 3D structure to be described as a sequence of prototype fragments, called structural letters, of 27 different types. Each residue can then be assigned to a particular local structure, even in loop regions. The analysis of interresidue contacts within protein structures defined using Voronoï tessellations reveals that pairwise contact specificity is greater in terms of structural letters than amino acids. Using a simple heuristic based on specificity score comparison, we find that 74% of the long-range contacts within protein structures are better described using structural letters than amino acid types. The investigation is extended to a set of protein-protein complexes, showing that similar global rules apply as for intraprotein contacts, with 64% of the interprotein contacts best described by local structures. We then present an evaluation of pairing functions integrating structural letters to decoy scoring and show that some complexes could benefit from the use of structural letter-based pairing functions. PMID:18491388

  4. A novel and robust rotation and scale invariant structuring elements based descriptor for pedestrian classification in infrared images

    NASA Astrophysics Data System (ADS)

    Soundrapandiyan, Rajkumar; Chandra Mouli, P. V. S. S. R.

    2016-09-01

    In this paper, a novel and robust rotation and scale invariant structuring elements based descriptor (RSSED) for pedestrian classification in infrared (IR) images is proposed. In addition, a segmentation method using difference of Gaussian (DoG) and horizontal intensity projection is proposed. The three major steps are moving object segmentation, feature extraction and classification of objects as pedestrian or non-pedestrian. The segmentation result is used to extract the RSSED feature descriptor. To extract features, the segmentation result is encoded using local directional pattern (LDP). This helps in the identification of local textural patterns. The LDP encoded image is further quantized adaptively to four levels. Finally, the proposed RSSED is used to formalize the descriptor from the quantized image. A support vector machine (SVM) is employed for classification of the moving objects in a given IR image into pedestrian and non-pedestrian classes. The segmentation results show the robustness in extracting the moving objects. The classification results obtained from the SVM classifier show the efficacy of the proposed method.

  5. Chemometric Methods and Theoretical Molecular Descriptors in Predictive QSAR Modeling of the Environmental Behavior of Organic Pollutants

    NASA Astrophysics Data System (ADS)

    Gramatica, Paola

    This chapter surveys the QSAR modeling approaches (developed by the author's research group) for the validated prediction of environmental properties of organic pollutants. Various chemometric methods, based on different theoretical molecular descriptors, have been applied: explorative techniques (such as PCA for ranking, SOM for similarity analysis), modeling approaches by multiple-linear regression (MLR, in particular OLS), and classification methods (mainly k-NN, CART, CP-ANN). The focus of this review is on the main topics of environmental chemistry and ecotoxicology, related to the physico-chemical properties, the reactivity, and biological activity of chemicals of high environmental concern. Thus, the review deals with atmospheric degradation reactions of VOCs by tropospheric oxidants, persistence and long-range transport of POPs, sorption behavior of pesticides (Koc and leaching), bioconcentration, toxicity (acute aquatic toxicity, mutagenicity of PAHs, estrogen binding activity for endocrine disruptors compounds (EDCs)), and finally persistent bioaccumulative and toxic (PBT) behavior for the screening and prioritization of organic pollutants. Common to all the proposed models is the attention paid to model validation for predictive ability (not only internal, but also external for chemicals not participating in the model development) and checking of the chemical domain of applicability. Adherence to such a policy, requested also by the OECD principles, ensures the production of reliable predicted data, useful also in the new European regulation of chemicals, REACH.

  6. Explorations of molecular structure-property relationships.

    PubMed

    Seybold, P G

    1999-01-01

    The problem of the relationship between the structure of a molecule and its physical, chemical, and biological properties is one of the most fundamental in chemistry. Three molecular structure-property studies are discussed as illustrations of different approaches to this problem. In the first study the carcinogenic activities of polycyclic aromatic hydrocarbons and their derivatives are examined. Molecular orbital calculations of the presumptive activation steps and species for these compounds (based on the "bay region" theory of activation) are seen to yield a surprisingly good guide to the observed carcinogenic activities. Both activation and deactivation steps are considered. The second study reviews structure-property work on the tissue solubilities of halogenated hydrocarbons. Relatively simple structural descriptors give a good account of the solubilities of these compounds in blood, muscle, fat, and liver tissue. With the aid of principal components analysis it is shown that there are two dominant dimensions to this problem, which can be interpreted in terms of solubilities of the compounds in lipid and saline environments. The final study, which examines the boiling points of aliphatic alcohols, illustrates the value of using more than one descriptor set. The (perhaps surprising) conclusion is that a theoretical model can sometimes be more accurate than the data upon which it is based. Moreover, two models are better than one. PMID:10491848

  7. Relationship between reaction rate constants of organic pollutants and their molecular descriptors during Fenton oxidation and in situ formed ferric-oxyhydroxides.

    PubMed

    Jia, Lijuan; Shen, Zhemin; Su, Pingru

    2016-05-01

    Fenton oxidation is a promising water treatment method to degrade organic pollutants. In this study, 30 different organic compounds were selected and their reaction rate constants (k) were determined for the Fenton oxidation process. Gaussian09 and Materials Studio software packages were used to carry out calculations and obtain values of 10 different molecular descriptors for each studied compound. Ferric-oxyhydroxide coagulation experiments were conducted to determine the coagulation percentage. Based upon the adsorption capacity, all of the investigated organic compounds were divided into two groups (Group A and Group B). The percentage adsorption of organic compounds in Group A was less than 15% (wt./wt.) and that in Group B was higher than 15% (wt./wt.). For Group A, removal of the compounds by oxidation was the dominant process, while for Group B, removal by both oxidation and coagulation (as a synergistic process) took place. Results showed that the relationship between the rate constants (k values) and the molecular descriptors of Group A was more pronounced than for Group B compounds. For the oxidation-dominated process, EHOMO and Fukui indices (f(0)x, f(-)x, f(+)x) were the most significant factors. The influence of bond order was more significant for the synergistic process of oxidation and coagulation than for the oxidation-dominated process. The influences of all other molecular descriptors on the synergistic process were weaker than on the oxidation-dominated process. PMID:27155432

  8. PowerMV: a software environment for molecular viewing, descriptor generation, data analysis and hit evaluation.

    PubMed

    Liu, Kejun; Feng, Jun; Young, S Stanley

    2005-01-01

    Ideally, a team of biologists, medicinal chemists and information specialists will evaluate the hits from high throughput screening. In practice, it often falls to nonmedicinal chemists to make the initial evaluation of HTS hits. Chemical genetics and high content screening both rely on screening in cells or animals where the biological target may not be known. There is a need to place active compounds into a context to suggest potential biological mechanisms. Our idea is to build an operating environment to help the biologist make the initial evaluation of HTS data. To this end the operating environment provides viewing of compound structure files, computation of basic biologically relevant chemical properties and searching against biologically annotated chemical structure databases. The benefit is to help the nonmedicinal chemist, biologist and statistician put compounds into a potentially informative biological context. Although there are several similar public and private programs used in the pharmaceutical industry to help evaluate hits, these programs are often built for computational chemists. Our program is designed for use by biologists and statisticians. PMID:15807517

  9. Large-scale structure-activity relationship study of hepatitis C virus NS5B polymerase inhibition using SMILES-based descriptors.

    PubMed

    Worachartcheewan, Apilak; Prachayasittikul, Virapong; Toropova, Alla P; Toropov, Andrey A; Nantasenamat, Chanin

    2015-11-01

    Hepatitis C virus (HCV) is composed of structural and non-structural proteins involved in viral transcription and propagation. In particular, NS5B is an RNA-dependent RNA polymerase for viral transcription and genome replication and is a target for designing anti-viral agents. In this study, classification and quantitative structure-activity relationship (QSAR) models of HCV NS5B inhibitors were constructed using the Correlation and Logic software. Molecular descriptors for a set of 970 HCV NS5B inhibitors were encoded using the simplified molecular input line entry system notation, and predictive models were built via the Monte Carlo method. The QSAR models provided acceptable correlation coefficients of [Formula: see text] and [Formula: see text] in the ranges of 0.6038-0.7344 and 0.6171-0.7294, respectively, while the classification models displayed sensitivity, specificity, and accuracy in ranges of 88.24-98.84, 83.87-93.94, and 86.50-94.41 %, respectively. Furthermore, molecular fragments as substructures involved in increased and decreased inhibitory activities were explored. The results provide information on QSAR and classification models for high-throughput screening and mechanistic insights into the inhibitory activity of HCV NS5B polymerase. PMID:26164590
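    As an illustration of the classification statistics quoted in this abstract, the following minimal Python sketch derives sensitivity, specificity, and accuracy from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and accuracy from confusion counts.

    tp/fn: actives predicted active/inactive
    tn/fp: inactives predicted inactive/active
    """
    return {
        "sensitivity": tp / (tp + fn),             # fraction of actives recovered
        "specificity": tn / (tn + fp),             # fraction of inactives recovered
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f} %")
```

    Reporting all three figures together, as the abstract does, guards against a model that scores well on accuracy only because one class dominates the data set.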

  10. Data mining PubChem using a support vector machine with the Signature molecular descriptor: classification of factor XIa inhibitors.

    PubMed

    Weis, Derick C; Visco, Donald P; Faulon, Jean-Loup

    2008-11-01

    The amount of high-throughput screening (HTS) data readily available has significantly increased because of the PubChem project (http://pubchem.ncbi.nlm.nih.gov/). There is considerable opportunity for data mining of small molecules for a variety of biological systems using cheminformatic tools and the resources available through PubChem. In this work, we trained a support vector machine (SVM) classifier using the Signature molecular descriptor on factor XIa inhibitor HTS data. The optimal number of Signatures was selected by implementing a feature selection algorithm of highly correlated clusters. Our method included an improvement that allowed clusters to work together for accuracy improvement, where previous methods have scored clusters on an individual basis. The resulting model had a 10-fold cross-validation accuracy of 89%, and additional validation was provided by two independent test sets. We applied the SVM to rapidly predict activity for approximately 12 million compounds also deposited in PubChem. Confidence in these predictions was assessed by considering the number of Signatures within the training set range for a given compound, defined as the overlap metric. To further evaluate compounds identified as active by the SVM, docking studies were performed using AutoDock. A focused database of compounds predicted to be active was obtained with several of the compounds appreciably dissimilar to those used in training the SVM. This focused database is suitable for further study. The data mining technique presented here is not specific to factor XIa inhibitors, and could be applied to other bioassays in PubChem where one is looking to expand the search for small molecules as chemical probes. PMID:18829357
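    The overlap metric mentioned in this abstract can be sketched roughly as follows. This is only one plausible reading of "the number of Signatures within the training set range": each compound is assumed to be a dictionary mapping a feature name to its occurrence count, and only features present in a compound are considered. The names `training_ranges` and `overlap` and all data are hypothetical.

```python
def training_ranges(training_set):
    """Map each feature name to the (min, max) count seen in the training set."""
    ranges = {}
    for compound in training_set:
        for name, count in compound.items():
            low, high = ranges.get(name, (count, count))
            ranges[name] = (min(low, count), max(high, count))
    return ranges


def overlap(compound, ranges):
    """Count the compound's features whose counts lie within the training range."""
    return sum(
        1
        for name, count in compound.items()
        if name in ranges and ranges[name][0] <= count <= ranges[name][1]
    )


# Hypothetical feature counts, for illustration only.
training = [{"a": 1, "b": 2}, {"a": 3, "c": 1}]
query = {"a": 2, "b": 5, "d": 1}
print(overlap(query, training_ranges(training)))
```

    In this toy case only feature "a" falls inside the training range, so a prediction for the query compound would carry relatively low confidence.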

  11. Improving Predictions of Protein-Protein Interfaces by Combining Amino Acid-Specific Classifiers Based on Structural and Physicochemical Descriptors with Their Weighted Neighbor Averages

    PubMed Central

    de Moraes, Fábio R.; Neshich, Izabella A. P.; Mazoni, Ivan; Yano, Inácio H.; Pereira, José G. C.; Salim, José A.; Jardine, José G.; Neshich, Goran

    2014-01-01

    Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. The IFR classifier developed in this study is now

  12. Improving predictions of protein-protein interfaces by combining amino acid-specific classifiers based on structural and physicochemical descriptors with their weighted neighbor averages.

    PubMed

    de Moraes, Fábio R; Neshich, Izabella A P; Mazoni, Ivan; Yano, Inácio H; Pereira, José G C; Salim, José A; Jardine, José G; Neshich, Goran

    2014-01-01

    Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. The IFR classifier developed in this study is now

  13. Predicting the auto-ignition temperatures of organic compounds from molecular structure using support vector machine.

    PubMed

    Pan, Yong; Jiang, Juncheng; Wang, Rui; Cao, Hongyin; Cui, Yi

    2009-05-30

    A quantitative structure-property relationship (QSPR) study is suggested for the prediction of auto-ignition temperatures (AIT) of organic compounds. Various kinds of molecular descriptors were calculated to represent the molecular structures of compounds, such as topological, charge, and geometric descriptors. The variable selection method of genetic algorithm (GA) was employed to select an optimal subset of descriptors that contribute significantly to the overall AIT property from the large pool of calculated descriptors. The novel modeling method of support vector machine (SVM) was then employed to model the possible quantitative relationship between these selected descriptors and the AIT property. The resulting model showed high prediction ability, with an average absolute error of 28.88 degrees C and a root mean square error of 36.86 for the prediction set, which are within the range of the experimental error of AIT measurements. The proposed method can be successfully used to predict the auto-ignition temperatures of organic compounds with only nine pre-selected theoretical descriptors, which can be calculated directly from molecular structure alone. PMID:18952371
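    The two error statistics reported in this abstract can be computed as in the minimal Python sketch below. The observed and predicted temperatures are hypothetical values chosen for easy arithmetic; they are not data from the study.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average of |observed - predicted| over all compounds."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_square_error(observed, predicted):
    """Square root of the mean squared residual."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Hypothetical AIT values in degrees C, for illustration only.
observed = [400.0, 450.0, 500.0, 550.0]
predicted = [410.0, 430.0, 500.0, 580.0]

mae = mean_absolute_error(observed, predicted)
rmse = root_mean_square_error(observed, predicted)
print(f"MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

    Because the residuals are squared before averaging, the RMSE is never smaller than the mean absolute error, which is consistent with the ordering of the two figures quoted above.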

  14. Benchmarking of HPCC: A novel 3D molecular representation combining shape and pharmacophoric descriptors for efficient molecular similarity assessments.

    PubMed

    Karaboga, Arnaud S; Petronin, Florent; Marchetti, Gino; Souchet, Michel; Maigret, Bernard

    2013-04-01

    Since 3D molecular shape is an important determinant of biological activity, designing accurate 3D molecular representations is still of high interest. Several chemoinformatic approaches have been developed to try to describe accurate molecular shapes. Here, we present a novel 3D molecular description, namely harmonic pharma chemistry coefficient (HPCC), combining a ligand-centric pharmacophoric description projected onto a spherical harmonic based shape of a ligand. The performance of HPCC was evaluated by comparison to the standard ROCS software in a ligand-based virtual screening (VS) approach using the publicly available directory of useful decoys (DUD) data set comprising over 100,000 compounds distributed across 40 protein targets. Our results were analyzed using commonly reported statistics such as the area under the curve (AUC) and normalized sum of logarithms of ranks (NSLR) metrics. Overall, our HPCC 3D method is globally as efficient as the state-of-the-art ROCS software in terms of enrichment and slightly better for more than half of the DUD targets. Since it is largely admitted that VS results depend strongly on the nature of the protein families, we believe that the present HPCC solution is of interest over the current ligand-based VS methods. PMID:23467019
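    For readers unfamiliar with the area under the curve (AUC) statistic used in this benchmark, the sketch below computes it as the probability that a randomly chosen active is scored above a randomly chosen decoy, with ties counted as one half. The function name and the scores are hypothetical; the NSLR metric is not covered here.

```python
def roc_auc(active_scores, decoy_scores):
    """ROC AUC by pairwise comparison of active and decoy scores."""
    wins = 0.0
    for active in active_scores:
        for decoy in decoy_scores:
            if active > decoy:
                wins += 1.0
            elif active == decoy:
                wins += 0.5  # ties count as half a win
    return wins / (len(active_scores) * len(decoy_scores))


# Hypothetical similarity scores, for illustration only.
actives = [0.9, 0.8, 0.4]
decoys = [0.7, 0.3, 0.2, 0.1]
auc = roc_auc(actives, decoys)
print(f"AUC = {auc:.3f}")
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from decoys. The pairwise loop is quadratic in the number of compounds, so a rank-based formulation would be preferable for a data set the size of DUD.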

  15. Quantitative relationships between structure and cytotoxic activity of flavonoid derivatives. An application of Hirshfeld surface derived descriptors.

    PubMed

    Kupcewicz, Bogumiła; Małecka, Magdalena; Zapadka, Mariusz; Krajewska, Urszula; Rozalski, Marek; Budzisz, Elzbieta

    2016-07-15

    Quantitative relationships between the structure and cytotoxic activity of a series of flavonoid derivatives were examined. The first regression-based model, developed for 18 flavanone-2-pyrazoline hybrids, involved two interpretable descriptors: Mor04v and partial atomic charge. The second model, developed for a structurally diverse set of compounds, was based on descriptors derived from Hirshfeld surface analysis. This model suggests that the cytotoxic activity of compounds can be successfully predicted based on the fraction of H⋯H contacts and the fraction of interactions involving a halogen atom. For non-halogen derivatives, the data reveal that cytotoxic activity is inversely proportional to the percentage of O⋯H and N⋯H close contacts to the Hirshfeld surface, while directly proportional to the percentage of H⋯H interactions. Chlorine (1k) and bromine (1l) derivatives of compounds containing flavanone fused with N-methyl-2-pyrazoline exhibited high cytotoxic potential against the HL-60 cancer cell line (IC50<10μM). The cytotoxicity of 1k and 1l towards normal cells (HUVEC) was 10- and 25-fold lower, respectively. PMID:27234147

  16. Investigating local spatially-enhanced structural and textural descriptors for classification of iPSC colony images.

    PubMed

    Gizatdinova, Yulia; Rasku, Jyrki; Haponen, Markus; Joutsijoki, Henry; Baldin, Ivan; Paci, Michelangelo; Hyttinen, Jari; Aalto-Setälä, Katriina; Juhola, Martti

    2014-01-01

    Induced pluripotent stem cells (iPSC) can be derived from fully differentiated cells of adult individuals and used to obtain any other cell type of the human body. This implies numerous prospective applications of iPSCs in regenerative medicine and drug development. In order to obtain a valid cell culture, a quality control process must be applied to identify and discard abnormal iPSC colonies. Computer vision systems that analyze visual characteristics of iPSC colony health can be especially useful in automating and improving the quality control process. In this paper, we present ongoing research that aims at the development of local spatially-enhanced descriptors for classification of iPSC colony images. For this, local oriented edges and local binary patterns are extracted from the detected colony regions and used to represent structural and textural properties of the colonies, respectively. We preliminarily tested the proposed descriptors in classifying iPSC colonies according to the degree of colony abnormality. The tests showed promising results for both detection of iPSC colony borders and colony classification. PMID:25570711

  17. Calculation of aqueous solubility of crystalline un-ionized organic chemicals and drugs based on structural similarity and physicochemical descriptors.

    PubMed

    Raevsky, Oleg A; Grigor'ev, Veniamin Yu; Polianczyk, Daniel E; Raevskaja, Olga E; Dearden, John C

    2014-02-24

    Solubilities of crystalline organic compounds calculated according to AMP (arithmetic mean property) and LoReP (local one-parameter regression) models based on structural and physicochemical similarities are presented. We used data on water solubility of 2615 compounds in un-ionized form measured at 25±5 °C. The calculation results were compared with the equation based on the experimental data for lipophilicity and melting point. According to statistical criteria, the model based on structural and physicochemical similarities showed a better fit with the experimental data. An additional advantage of this model is that it uses only theoretical descriptors, and this provides means for calculating water solubility for both existing and not yet synthesized compounds. PMID:24456022
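    The general idea behind an arithmetic mean property (AMP) estimate, predicting a property as the mean over the most similar reference compounds, can be sketched as below. The abstract does not specify the similarity measure or the number of neighbours; this sketch assumes Tanimoto similarity on sets of structural features and k = 2, and every name and value in it is hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def amp_predict(query, reference, k=2):
    """Mean property of the k reference compounds most similar to the query.

    reference: list of (feature_set, property_value) pairs.
    """
    ranked = sorted(reference, key=lambda item: tanimoto(query, item[0]), reverse=True)
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical feature sets and log-solubility values, for illustration only.
reference = [
    ({"a", "b", "c"}, -2.0),
    ({"a", "b", "d"}, -3.0),
    ({"x", "y"}, 1.0),
]
prediction = amp_predict({"a", "b"}, reference, k=2)
print(prediction)
```

    Since such an estimate needs only computed features of the query structure, it can in principle be applied to compounds that have not yet been synthesized, which is the advantage the abstract highlights.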

  18. Derivatives in discrete mathematics: a novel graph-theoretical invariant for generating new 2/3D molecular descriptors. I. Theory and QSPR application.

    PubMed

    Marrero-Ponce, Yovani; Santiago, Oscar Martínez; López, Yoan Martínez; Barigye, Stephen J; Torrens, Francisco

    2012-11-01

    In this report, we present a new mathematical approach for describing chemical structures of organic molecules at the atomic-molecular level, proposing for the first time the use of the concept of the derivative ([Formula: see text]) of a molecular graph (MG) with respect to a given event (E), to obtain a new family of molecular descriptors (MDs). For this purpose, a new matrix representation of the MG, which generalizes graph theory's traditional incidence matrix, is introduced. This matrix, denominated the generalized incidence matrix, Q, arises from the Boolean representation of molecular sub-graphs that participate in the formation of the graph molecular skeleton MG and could be complete (representing all possible connected sub-graphs) or constitute sub-graphs of determined orders or types as well as a combination of these. The Q matrix is non-quadratic and unsymmetrical in nature; its columns (n) and rows (m) are conditions (letters) and collections of conditions (words) with which the event occurs. This non-quadratic and unsymmetrical matrix is transformed, by algebraic manipulation, to a quadratic and symmetric matrix known as the relations frequency matrix, F, which characterizes the participation intensity of the conditions (letters) in the events (words). With F, we calculate the derivative over a pair of atomic nuclei. The local index for the atomic nucleus i, Δ(i), can therefore be obtained as a linear combination of all the pair derivatives of the atomic nucleus i with all the other atomic nuclei j. Here, we also define new strategies that generalize the present form of obtaining global or local (group or atom-type) invariants from atomic contributions (local vertex invariants, LOVIs). In this respect, metric (norms), means and statistical invariants are introduced. These invariants are applied to a vector whose components are the values Δ(i) for the atomic nuclei of the molecule or its fragments. Moreover, with the purpose of differentiating

  19. On the Development and Use of Large Chemical Similarity Networks, Informatics Best Practices and Novel Chemical Descriptors towards Materials Quantitative Structure Property Relationships

    ERIC Educational Resources Information Center

    Krein, Michael

    2011-01-01

    After decades of development and use in a variety of application areas, Quantitative Structure Property Relationships (QSPRs) and related descriptor-based statistical learning methods have achieved a level of infamy due to their misuse. The field is rife with past examples of overtrained models, overoptimistic performance assessment, and outright…

  20. Prediction of novel and selective TNF-alpha converting enzyme (TACE) inhibitors and characterization of correlative molecular descriptors by machine learning approaches.

    PubMed

    Cong, Yong; Yang, Xue-Gang; Lv, Wei; Xue, Ying

    2009-10-01

    The inhibition of TNF-alpha converting enzyme (TACE) has been explored as a feasible therapy for the treatment of rheumatoid arthritis (RA) and Crohn's disease (CD). Recently, large numbers of novel and selective TACE inhibitors have been reported. It is desirable to develop machine learning (ML) models for identifying the inhibitors of TACE in the early drug design phase and test the prediction capabilities of these ML models. This work evaluated four ML methods, support vector machine (SVM), k-nearest neighbor (k-NN), back-propagation neural network (BPNN) and C4.5 decision tree (C4.5 DT), which were trained and tested by using a diverse set of 443 TACE inhibitors and 759 non-inhibitors. A well-established feature selection method, the recursive feature elimination (RFE) method, was used to select the most appropriate descriptors for classification from a large pool of descriptors, and two evaluation methods, 5-fold cross-validation and independent evaluation, were used to assess the performances of these developed models. In this study, all these ML models achieved promising prediction accuracies. By using the RFE method, the prediction accuracies are further improved. The k-NN model gives the best prediction for TACE inhibitors (98.32%), and the SVM model gives the best prediction for non-inhibitors (99.51%). Both the k-NN and SVM models give the best overall prediction accuracy (98.45%). To the best of our knowledge, the SVM model developed in this work is the first one for the classification prediction of TACE inhibitors with a broad applicability domain. Our study suggests that ML methods, particularly SVM, are potentially useful for facilitating the discovery of TACE inhibitors and for exhibiting the molecular descriptors associated with TACE inhibitors. PMID:19729328

  1. In silico prediction of major drug clearance pathways by support vector machines with feature-selected descriptors.

    PubMed

    Toshimoto, Kouta; Wakayama, Naomi; Kusama, Makiko; Maeda, Kazuya; Sugiyama, Yuichi; Akiyama, Yutaka

    2014-11-01

    We have previously established an in silico classification method ("CPathPred") to predict the major clearance pathways of drugs based on an empirical decision with only four physicochemical descriptors (charge, molecular weight, octanol-water distribution coefficient, and protein unbound fraction in plasma) using a rectangular method. In this study, we attempted to improve the prediction performance of the method by introducing a support vector machine (SVM) and increasing the number of descriptors. The data set consisted of 141 approved drugs whose major clearance pathways were classified into metabolism by CYP3A4, CYP2C9, or CYP2D6; organic anion transporting polypeptide-mediated hepatic uptake; or renal excretion. With the same four default descriptors as used in CPathPred, the SVM-based predictor (named "default descriptor SVM") resulted in higher prediction performance compared with a rectangular-based predictor judged by 10-fold cross-validation. Two SVM-based predictors were also established by adding some descriptors as follows: 1) 881 descriptors predicted in silico from the chemical structures of drugs in addition to 4 default descriptors ("885 descriptor SVM"); and 2) selected descriptors extracted by a feature selection based on a greedy algorithm with default descriptors ("feature selection SVM"). The prediction accuracies of the rectangular-based predictor, default descriptor SVM, 885 descriptor SVM, and feature selection SVM were 0.49, 0.60, 0.72, and 0.91, respectively, and the overall precision values for these four methods were 0.72, 0.77, 0.86, and 0.98, respectively. In conclusion, we successfully constructed SVM-based predictors with limited numbers of descriptors to classify the major clearance pathways of drugs in humans with high prediction performance. PMID:25128502

  2. Phospholipophilicity of CxHyN(+) amines: chromatographic descriptors and molecular simulations for understanding partitioning into membranes.

    PubMed

    Droge, S T J; Hermens, J L M; Rabone, J; Gutsell, S; Hodges, G

    2016-08-10

    Using immobilized artificial membrane high-performance liquid chromatography (IAM-HPLC) the sorption affinity of 70 charged amine structures to phospholipids was determined. The amines contained only 1 charged moiety and no other polar groups, the rest of the molecule being aliphatic and/or aromatic hydrocarbon groups. We systematically evaluated the influence of the amine type (1°, 2°, 3° amines and quaternary ammonium), alkyl chain branching, phenyl ring positioning, charge positioning (terminal vs. central in the molecule) on the phospholipid-water partitioning coefficient (KPLIPW). These experimental results were compared with quantum-chemistry based three-dimensional (3D) molecular simulations of the partitioning of charged amines, including the most likely solute conformers, using a hydrated phospholipid bilayer in the COSMOmic module of COSMOtherm software. Both IAM-HPLC retention data and the simulations suggest that the molecular orientation of charged amines at the location in the bilayer with the lowest calculated Gibbs free energy exerts a strong influence over the partitioning within the membrane. The most favourable position of charged amines coincides with the region where the phosphate anions in the phospholipid bilayer are most abundant. Hydrocarbon units oriented in this layer are located more towards the aqueous phase and contribute less to the overall membrane affinity than hydrocarbon units extending into the more hydrophobic core of the bilayer. COSMOmic simulations explain most of the trends between the structural differences observed in IAM-HPLC based KPLIPW. For this set of cationic structures, the mean absolute difference between COSMOmic simulations and IAM-HPLC data, accounting only for amine type corrective increments, is 0.31 log units. PMID:27118065

  3. Euclidian embeddings of periodic nets: definition of a topologically induced complete set of geometric descriptors for crystal structures.

    PubMed

    Eon, Jean-Guillaume

    2011-01-01

    Crystal-structure topologies, represented by periodic nets, are described by labelled quotient graphs (or voltage graphs). Because the edge space of a finite graph is the direct sum of its cycle and co-cycle spaces, a Euclidian representation of the derived periodic net is provided by mapping a basis of the cycle and co-cycle spaces to a set of real vectors. The mapping is consistent if every cycle of the basis is mapped on its own net voltage. The sum of all outgoing edges at every vertex may be chosen as a generating set of the co-cycle space. The embedding maps the cycle space onto the lattice L. By analogy, the concept of the co-lattice L* is defined as the image of the generators of the co-cycle space; a co-lattice vector is proportional to the distance vector between an atom and the centre of gravity of its neighbours. The pair (L, L*) forms a complete geometric descriptor of the embedding, generalizing the concept of barycentric embedding. An algebraic expression permits the direct calculation of fractional coordinates. Non-zero co-lattice vectors allow nets with collisions, displacive transitions etc. to be dealt with. The method applies to nets of any periodicity and dimension, be they crystallographic nets or not. Examples are analyzed: α-cristobalite, the seven unstable 3-periodic minimal nets etc. PMID:21173475

  4. Quantitative structure-activity relationship models with receptor-dependent descriptors for predicting peroxisome proliferator-activated receptor activities of thiazolidinedione and oxazolidinedione derivatives.

    PubMed

    Lather, Viney; Kairys, Visvaldas; Fernandes, Miguel X

    2009-04-01

    A quantitative structure-activity relationship study has been carried out, in which the relationship between the peroxisome proliferator-activated receptor alpha and the peroxisome proliferator-activated receptor gamma agonistic activities of thiazolidinedione and oxazolidinedione derivatives and quantitative descriptors, V(site), calculated in a receptor-dependent manner, is modeled. These descriptors quantify the volume occupied by the optimized ligands in regions that are either common or specific to the superimposed binding sites of the targets under consideration. The quantitative structure-activity relationship models were built by forward stepwise linear regression modeling for a training set of 27 compounds and validated for a test set of seven compounds, resulting in a squared correlation coefficient value of 0.90 for peroxisome proliferator-activated receptor alpha and of 0.89 for peroxisome proliferator-activated receptor gamma. The leave-one-out cross-validation and test set predictability squared correlation coefficient values for these models were 0.85 and 0.62 for peroxisome proliferator-activated receptor alpha and 0.89 and 0.50 for peroxisome proliferator-activated receptor gamma, respectively. A dual peroxisome proliferator-activated receptor model has also been developed, and it indicates the structural features required for the design of ligands with dual peroxisome proliferator-activated receptor activity. These quantitative structure-activity relationship models show the importance of the descriptors introduced here in the prediction and interpretation of the compounds' affinity and selectivity. PMID:19243388
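    The leave-one-out cross-validated squared correlation coefficient quoted in this abstract can be illustrated with the simplified Python sketch below. It assumes a single descriptor fitted by ordinary least squares, whereas the study used forward stepwise regression over several descriptors, and it uses the common convention q2 = 1 - PRESS / SS_total with the full-set mean. The function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out q2: refit without each compound, then predict it."""
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.1, 1.9, 3.2, 3.8, 5.1]
q2 = loo_q2(descriptor, activity)
print(f"q2 = {q2:.3f}")
```

    Each compound is predicted by a model that never saw it, so q2 is usually lower than the fitted squared correlation coefficient. It remains an internal check, which is why the abstract also reports values for an external test set.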

  5. Local functional descriptors for surface comparison based binding prediction

    PubMed Central

    2012-01-01

    Background: Molecular recognition in proteins occurs due to appropriate arrangements of physical, chemical, and geometric properties of an atomic surface. Similar surface regions should create similar binding interfaces. Effective methods for comparing surface regions can be used in identifying similar regions, and to predict interactions without regard to the underlying structural scaffold that creates the surface. Results: We present a new descriptor for protein functional surfaces and algorithms for using these descriptors to compare protein surface regions to identify ligand binding interfaces. Our approach uses descriptors of local regions of the surface, and assembles collections of matches to compare larger regions. Our approach uses a variety of physical, chemical, and geometric properties, adaptively weighting these properties as appropriate for different regions of the interface. Our approach builds a classifier based on a training corpus of examples of binding sites of the target ligand. The constructed classifiers can be applied to a query protein providing a probability for each position on the protein that the position is part of a binding interface. We demonstrate the effectiveness of the approach on a number of benchmarks, demonstrating performance that is comparable to the state-of-the-art, with an approach with more generality than these prior methods. Conclusions: Local functional descriptors offer a new method for protein surface comparison that is sufficiently flexible to serve in a variety of applications. PMID:23176080

  6. Relations between water physico-chemistry and benthic algal communities in a northern Canadian watershed: defining reference conditions using multiple descriptors of community structure.

    PubMed

    Thomas, Kathryn E; Hall, Roland I; Scrimgeour, Garry J

    2015-09-01

    Defining reference conditions is central to identifying environmental effects of anthropogenic activities. Using a watershed approach, we quantified reference conditions for benthic algal communities and their relations to physico-chemical conditions in rivers in the South Nahanni River watershed, NWT, Canada, in 2008 and 2009. We also compared the ability of three descriptors that vary in terms of analytical costs to define algal community structure based on relative abundances of (i) all algal taxa, (ii) only diatom taxa, and (iii) photosynthetic pigments. Ordination analyses showed that variance in algal community structure was strongly related to gradients in environmental variables describing water physico-chemistry, stream habitats, and sub-watershed structure. Water physico-chemistry and local watershed-scale descriptors differed significantly between algal communities from sites in the Selwyn Mountain ecoregion compared to sites in the Nahanni-Hyland ecoregions. Distinct differences in algal community types between ecoregions were apparent irrespective of whether algal community structure was defined using all algal taxa, diatom taxa, or photosynthetic pigments. Two algal community types were highly predictable using environmental variables, a core consideration in the development of Reference Condition Approach (RCA) models. These results suggest that assessments of environmental impacts could be completed using RCA models for each ecoregion. We suggest that use of algal pigments, a high-throughput analysis, is a promising alternative to more labor-intensive and costly taxonomic approaches for defining algal community structure. PMID:26255271

  7. Evaluation of structure-reactivity descriptors and biological activity spectra of 4-(6-methoxy-2-naphthyl)-2-butanone using spectroscopic techniques

    NASA Astrophysics Data System (ADS)

    Agrawal, Megha; Deval, Vipin; Gupta, Archana; Sangala, Bagvanth Reddy; Prabhu, S. S.

    2016-10-01

    The structure and several spectroscopic features along with reactivity parameters of the compound 4-(6-methoxy-2-naphthyl)-2-butanone (Nabumetone) have been studied using experimental techniques and tools derived from quantum chemical calculations. Structure optimization is followed by force field calculations based on density functional theory (DFT) at the B3LYP/6-311++G(d,p) level of theory. The vibrational spectra have been interpreted with the aid of normal coordinate analysis. The UV-visible spectrum and the effect of solvent have been discussed. The electronic properties such as HOMO and LUMO energies have been determined by the TD-DFT approach. In order to understand various aspects of pharmacological sciences, several new chemical reactivity descriptors - chemical potential, global hardness and electrophilicity - have been evaluated. Local reactivity descriptors - Fukui functions and local softnesses - have also been calculated to find the reactive sites within the molecule. Aqueous solubility and lipophilicity, which are crucial for estimating transport properties of organic molecules in drug development, have been calculated. Biological effects and toxic/side effects have been estimated on the basis of prediction of activity spectra for substances (PASS) results and their analysis by Pharma Expert software. Using the THz-TDS technique, the frequency-dependent absorptions of NBM have been measured in the frequency range up to 3 THz.
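
    The global reactivity descriptors named in this abstract (chemical potential, global hardness, electrophilicity) are commonly estimated from frontier orbital energies. The following is a minimal Python sketch, assuming the widely used finite-difference conventions mu = (E_HOMO + E_LUMO)/2, eta = (E_LUMO - E_HOMO)/2 and omega = mu^2/(2 eta). Some authors omit the factor 1/2 in eta, and the abstract does not state which convention the study used. The orbital energies below are hypothetical placeholders, not values from the study.

```python
def global_reactivity_descriptors(e_homo, e_lumo):
    """Return (chemical potential, global hardness, electrophilicity index).

    Energies are expected in eV. Conventions assumed here:
      mu    = (E_HOMO + E_LUMO) / 2
      eta   = (E_LUMO - E_HOMO) / 2
      omega = mu**2 / (2 * eta)
    """
    if e_lumo <= e_homo:
        raise ValueError("E_LUMO must lie above E_HOMO")
    mu = (e_homo + e_lumo) / 2.0
    eta = (e_lumo - e_homo) / 2.0
    omega = mu ** 2 / (2.0 * eta)
    return mu, eta, omega


# Hypothetical orbital energies in eV (illustration only).
mu, eta, omega = global_reactivity_descriptors(e_homo=-6.0, e_lumo=-2.0)
print(f"mu = {mu:.2f} eV, eta = {eta:.2f} eV, omega = {omega:.2f} eV")
```

    A small hardness (narrow HOMO-LUMO gap) yields a large electrophilicity index for a given chemical potential, which is the usual reading of these descriptors when comparing candidate molecules.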

  8. Evaluation of structure-reactivity descriptors and biological activity spectra of 4-(6-methoxy-2-naphthyl)-2-butanone using spectroscopic techniques.

    PubMed

    Agrawal, Megha; Deval, Vipin; Gupta, Archana; Sangala, Bagvanth Reddy; Prabhu, S S

    2016-10-01

    The structure and several spectroscopic features along with reactivity parameters of the compound 4-(6-methoxy-2-naphthyl)-2-butanone (Nabumetone) have been studied using experimental techniques and tools derived from quantum chemical calculations. Structure optimization is followed by force field calculations based on density functional theory (DFT) at the B3LYP/6-311++G(d,p) level of theory. The vibrational spectra have been interpreted with the aid of normal coordinate analysis. The UV-visible spectrum and the effect of solvent have been discussed. The electronic properties such as HOMO and LUMO energies have been determined by the TD-DFT approach. In order to understand various aspects of pharmacological sciences, several new chemical reactivity descriptors - chemical potential, global hardness and electrophilicity - have been evaluated. Local reactivity descriptors - Fukui functions and local softnesses - have also been calculated to find the reactive sites within the molecule. Aqueous solubility and lipophilicity, which are crucial for estimating transport properties of organic molecules in drug development, have been calculated. Biological effects and toxic/side effects have been estimated on the basis of prediction of activity spectra for substances (PASS) results and their analysis by Pharma Expert software. Using the THz-TDS technique, the frequency-dependent absorptions of NBM have been measured in the frequency range up to 3 THz. PMID:27284764

  9. Potential energy profile, structural, vibrational and reactivity descriptors of trans-2-methoxycinnamic acid by FTIR, FT-Raman and quantum chemical studies

    NASA Astrophysics Data System (ADS)

    Arjunan, V.; Anitha, R.; Thenmozhi, S.; Marchewka, M. K.; Mohan, S.

    2016-06-01

    The stable conformers of trans-2-methoxycinnamic acid (trans-2MCA) are determined by potential energy profile analysis. The energies of the s-cis and s-trans conformers of trans-2MCA determined by the B3LYP/cc-pVTZ method are -612.9788331 Hartrees and -612.9780953 Hartrees, respectively. The vibrational and electronic investigations of the stable s-cis and s-trans conformers of trans-2-methoxycinnamic acid have been carried out extensively with FTIR and FT-Raman spectral techniques. The s-cis conformer (I) with a (C16-C17-C18-O19) dihedral angle equal to 0° is found to be more favoured relative to the s-trans conformer (II) with (C16-C17-C18-O19) = 180°, possibly due to delocalization, hydrogen bonding and steric repulsion effects between the methoxy and acrylic acid groups. The DFT studies are performed with the B3LYP method by utilizing 6-311++G** and cc-pVTZ basis sets to determine the structure, thermodynamic properties, vibrational characteristics and chemical shifts of the compound. The total dipole moments of the conformers determined by the B3LYP/cc-pVTZ method are 3.35 D and 4.87 D for s-cis and s-trans, respectively. This reveals the higher polarity of the s-trans conformer of the trans-2MCA molecule. The electronic and steric influence of the methoxy group on the skeletal frequencies has been analysed. The energies of the frontier molecular orbitals and the LUMO-HOMO energy gap have been determined. The MEP of the s-cis conformer lies in the range +1.374e × 10^-2 to -1.374e × 10^-2, while for s-trans it is +1.591e × 10^-2 to -1.591e × 10^-2. The total electron density of the s-cis conformer lies in the range +5.273e × 10^-2 to -5.273e × 10^-2, while for s-trans it is +5.403e × 10^-2 to -5.403e × 10^-2. The MEP and total electron density show that the s-cis conformer is less polar, less reactive and more stable than the s-trans conformer. All the reactivity descriptors of the molecule have been discussed. Intramolecular electronic interactions and their stabilisation energies have been analysed
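
    The two conformer energies quoted above differ only in the fourth decimal place, so a unit conversion helps to judge the size of the preference. The following is a minimal Python sketch that converts the reported B3LYP/cc-pVTZ energies to kJ/mol and estimates a two-state Boltzmann population. It assumes that the raw electronic energies can be compared directly (no zero-point, thermal or entropic corrections) and that T = 298.15 K; neither assumption comes from the abstract.

```python
import math

HARTREE_TO_KJ_PER_MOL = 2625.4996   # 1 Hartree expressed in kJ/mol
GAS_CONSTANT_KJ = 8.314462618e-3    # R in kJ/(mol K)

# Energies quoted in the abstract (Hartree).
E_S_CIS = -612.9788331
E_S_TRANS = -612.9780953


def relative_energy_kj(e_low, e_high):
    """Energy of the higher conformer relative to the lower one, in kJ/mol."""
    return (e_high - e_low) * HARTREE_TO_KJ_PER_MOL


def two_state_populations(delta_kj, temperature=298.15):
    """Boltzmann populations (lower state, higher state) for an energy gap."""
    weight = math.exp(-delta_kj / (GAS_CONSTANT_KJ * temperature))
    p_low = 1.0 / (1.0 + weight)
    return p_low, 1.0 - p_low


delta_kj = relative_energy_kj(E_S_CIS, E_S_TRANS)
p_cis, p_trans = two_state_populations(delta_kj)
print(f"s-trans lies {delta_kj:.2f} kJ/mol above s-cis")
print(f"estimated populations: s-cis {p_cis:.1%}, s-trans {p_trans:.1%}")
```

    Under these assumptions the gap is below the thermal energy RT at room temperature, so both conformers would be appreciably populated even though s-cis is the more stable one.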

  10. Logistic regression models to predict solvent accessible residues using sequence- and homology-based qualitative and quantitative descriptors applied to a domain-complete X-ray structure learning set

    PubMed Central

    Nepal, Reecha; Spencer, Joanna; Bhogal, Guneet; Nedunuri, Amulya; Poelman, Thomas; Kamath, Thejas; Chung, Edwin; Kantardjieff, Katherine; Gottlieb, Andrea; Lustig, Brooke

    2015-01-01

    A working example of relative solvent accessibility (RSA) prediction for proteins is presented. Novel logistic regression models with various qualitative descriptors that include amino acid type and quantitative descriptors that include 20- and six-term sequence entropy have been built and validated. A domain-complete learning set of over 1300 proteins is used to fit initial models with various sequence homology descriptors as well as query residue qualitative descriptors. Homology descriptors are derived from BLASTp sequence alignments, whereas the RSA values are determined directly from the crystal structure. The logistic regression models are fitted using dichotomous responses indicating buried or solvent-accessible residues, with binary classifications obtained from the RSA values. The fitted models determine binary predictions of residue solvent accessibility with accuracies comparable to other less computationally intensive methods, using the standard RSA threshold criteria of 20 and 25% for solvent accessibility. When an additional non-homology descriptor describing Lobanov–Galzitskaya residue disorder propensity is included, incremental improvements in accuracy are achieved with 25% threshold accuracies of 76.12 and 74.79% for the Manesh-215 and CASP(8+9) test sets, respectively. Moreover, the described software and the accompanying learning and validation sets allow students and researchers to explore the utility of RSA prediction with simple, physically intuitive models in any number of related applications. PMID:26664348
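
    Two ingredients of the approach described above lend themselves to a short illustration: a 20-term sequence entropy computed from one aligned column, and the binarization of an RSA value at a threshold. The following is a minimal Python sketch, assuming Shannon entropy in bits over the residues observed in the column, gaps written as "-" and ignored, and a residue counted as accessible when its RSA is at or above the threshold. The function names and the handling of gaps and boundary values are hypothetical choices, not taken from the described software.

```python
import math
from collections import Counter


def sequence_entropy(column):
    """Shannon entropy (bits) of the residue distribution in one alignment column.

    `column` is a string of one-letter amino acid codes; gap characters are skipped.
    """
    residues = [r for r in column if r != "-"]
    total = len(residues)
    counts = Counter(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def is_accessible(rsa_percent, threshold=25.0):
    """Dichotomous response: True if the residue counts as solvent accessible."""
    return rsa_percent >= threshold


# Hypothetical aligned columns (illustration only).
conserved = "LLLLLLLL"
variable = "LIVAGSTK"
print(f"conserved column: {sequence_entropy(conserved):.2f} bits")
print(f"variable column:  {sequence_entropy(variable):.2f} bits")
print("RSA 30% accessible at 25% threshold:", is_accessible(30.0))
```

    Low entropy marks a conserved position and high entropy a variable one; these values would then enter a logistic regression as quantitative predictors alongside the qualitative ones.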

  11. RELATIONSHIPS BETWEEN DESCRIPTORS FOR HYDROPHOBICITY AND SOFT ELECTROPHILICITY IN PREDICTING TOXICITY

    EPA Science Inventory

    The toxicity of chemicals is orthogonal with individual molecular descriptors used to quantify hydrophobicity and soft electrophilicity when considering large data sets. Estimating the toxicity of reactive chemicals requires descriptors of both passive transport and the stereoel...

  12. New quantitative structure-fragmentation relationship strategy for chemical structure identification using the calculated enthalpy of formation as a descriptor for the fragments produced in electron ionization mass spectrometry: a case study with tetrachlorinated biphenyls.

    PubMed

    Dinca, Nicolae; Dragan, Simona; Dinca, Mihael; Sisu, Eugen; Covaci, Adrian

    2014-05-20

    Differential mass spectrometry correlated with quantum chemical calculations (QCC-ΔMS) has been shown to be an efficient tool for the chemical structure identification (CSI) of isomers with similar mass spectra. For this type of analysis, we report here a new strategy based on ordering (ORD), linear correlation (LCOR) algorithms, and their coupling, to filter the most probable structures corresponding to similar mass spectra belonging to a group with dozens of isomers (e.g., tetrachlorinated biphenyls, TeCBs). This strategy quantifies and compares the values of enthalpies of formation (Δ(f)H) obtained by QCC for some isobaric ions from the electron ionization (EI)-MS mass spectra, to the corresponding relative intensities. The result of CSI is provided in the form of lists of decreasing probabilities calculated for all the position-isomeric structures using the specialized software package CSI-Diff-MS Analysis 3.1.1. The simulation of CSI with ORD, LCOR, and their coupling of six TeCBs (IUPAC no. 44, 46, 52, 66, 74, and 77) has allowed us to find the best semiempirical molecular-orbital methods for several of their common isobaric fragments. The study of algorithms and strategy for the entire group of TeCBs (42 isomers) was made with one of the optimal variants for the computation of Δ(f)H using semiempirical molecular orbital methods of HyperChem: AM1 for M(+•) and [M - 4Cl](+•) ions and RM1 for [M - Cl](+) and [M - 2Cl](+•). The analytical performance of ORD, LCOR, and their coupling resulted from the CSI simulation of an analyte of known structure, using a decreasing number of isomeric standards, s = 5, 4, 3, and 2. Compared with the results obtained by a classical library search for TeCB isomers, the novel strategies of assigning structures of isomers with very similar mass spectra based on ORD, LCOR, and their coupling were much more efficient, because they provide the correct structure at the top of the probability list. Databases used in these CSI

  13. Student Descriptor Scale Manual.

    ERIC Educational Resources Information Center

    Goetz, Lori; And Others

    The Student Descriptor Scale (SDS) was developed as a validation measure to determine whether students described and counted by states as "severely handicapped" were, indeed, students with severe disabilities. The SDS addresses nine characteristics: intellectual disability, health impairment, need for toileting assistance, upper torso motor…

  14. Learning discriminant face descriptor.

    PubMed

    Lei, Zhen; Pietikäinen, Matti; Li, Stan Z

    2014-02-01

    The local feature descriptor is an important module for face recognition, and descriptors such as Gabor and local binary patterns (LBP) have proven to be effective face descriptors. Traditionally, the form of such local descriptors is predefined in a handcrafted way. In this paper, we propose a method to learn a discriminant face descriptor (DFD) in a data-driven way. The idea is to learn the most discriminant local features that minimize the difference of the features between images of the same person and maximize that between images from different people. In particular, we propose to enhance the discriminative ability of face representation in three aspects. First, the discriminant image filters are learned. Second, the optimal neighborhood sampling strategy is soft determined. Third, the dominant patterns are statistically constructed. Discriminative learning is incorporated to extract effective and robust features. We further apply the proposed method to the heterogeneous (cross-modality) face recognition problem and learn DFD in a coupled way (coupled DFD or C-DFD) to reduce the gap between features of heterogeneous face images and so improve performance on this challenging problem. Extensive experiments on FERET, CAS-PEAL-R1, LFW, and HFB face databases validate the effectiveness of the proposed DFD learning on both homogeneous and heterogeneous face recognition problems. The DFD improves POEM and LQP by about 4.5 percent on the LFW database and the C-DFD enhances the heterogeneous face recognition performance of LBP by over 25 percent. PMID:24356350

  15. Synthesis, molecular structure, spectroscopic analysis, thermodynamic parameters and molecular modeling studies of (2-methoxyphenyl)oxalate

    NASA Astrophysics Data System (ADS)

    Şahin, Zarife Sibel; Kantar, Günay Kaya; Şaşmaz, Selami; Büyükgüngör, Orhan

    2015-05-01

    The aim of this study is to find out the molecular characteristic and structural parameters that govern the chemical behavior of a new (2-methoxyphenyl)oxalate compound and to compare predictions made from theory with experimental observations. The title compound, (2-methoxyphenyl)oxalate, (I), (C16H14O6), has been synthesized. The compound has been characterized by elemental analysis, IR, 1H NMR, 13C NMR spectroscopies and single crystal X-ray diffraction techniques. Optimized molecular structure, harmonic vibrational frequencies, 1H and 13C NMR chemical shifts have been investigated by B3LYP/6-31G(d,p) method using density functional theory (DFT). The calculated results show that the predicted geometry can well reproduce structural parameters. In addition, global chemical reactivity descriptors, molecular electrostatic potential map (MEP), frontier molecular orbitals (FMOs), Mulliken population method and natural population analysis (NPA) and thermodynamic properties have also been studied. The energetic behavior of title compound has been examined in solvent media using polarizable continuum model (PCM).

  16. Molecular structure and elastic properties of thermotropic liquid crystals: Integrated molecular dynamics—Statistical mechanical theory vs molecular field approach

    NASA Astrophysics Data System (ADS)

    Capar, M. Ilk; Nar, A.; Ferrarini, A.; Frezza, E.; Greco, C.; Zakharov, A. V.; Vakulenko, A. A.

    2013-03-01

    The connection between the molecular structure of liquid crystals and their elastic properties, which control the director deformations relevant for electro-optic applications, remains a challenging objective for theories and computations. Here, we compare two methods that have been proposed for this purpose, both characterized by a detailed molecular level description. One is an integrated molecular dynamics-statistical mechanical approach, where the bulk elastic constants of nematics are calculated from the direct correlation function (DCF) and the single molecule orientational distribution function [D. A. McQuarrie, Statistical Mechanics (Harper & Row, New York, 1973)]. The latter is obtained from atomistic molecular dynamics trajectories, together with the radial distribution function, from which the DCF is then determined by solving the Ornstein-Zernike equation. The other approach is based on a molecular field theory, where the potential of mean torque experienced by a mesogen in the liquid crystal phase is parameterized according to its molecular surface. In this case, the calculation of elastic constants is combined with the Monte Carlo sampling of single molecule conformations. Using these different approaches, but the same description, at the level of molecular geometry and torsional potentials, we have investigated the elastic properties of the nematic phase of two typical mesogens, 4'-n-pentyloxy-4-cyanobiphenyl and 4'-n-heptyloxy-4-cyanobiphenyl. Both methods yield K3 (bend) > K1 (splay) > K2 (twist), although there are some discrepancies in the average elastic constants and in their anisotropy. These are interpreted in terms of the different approximations and the different ways of accounting for the structural properties of molecules in the two approaches. In general, the results point to the role of the molecular shape, which is modulated by the conformational freedom and cannot be fully accounted for by a single descriptor such as the aspect ratio.

  17. Molecular structure and elastic properties of thermotropic liquid crystals: integrated molecular dynamics--statistical mechanical theory vs molecular field approach.

    PubMed

    Ilk Capar, M; Nar, A; Ferrarini, A; Frezza, E; Greco, C; Zakharov, A V; Vakulenko, A A

    2013-03-21

    The connection between the molecular structure of liquid crystals and their elastic properties, which control the director deformations relevant for electro-optic applications, remains a challenging objective for theories and computations. Here, we compare two methods that have been proposed for this purpose, both characterized by a detailed molecular level description. One is an integrated molecular dynamics-statistical mechanical approach, where the bulk elastic constants of nematics are calculated from the direct correlation function (DCF) and the single molecule orientational distribution function [D. A. McQuarrie, Statistical Mechanics (Harper & Row, New York, 1973)]. The latter is obtained from atomistic molecular dynamics trajectories, together with the radial distribution function, from which the DCF is then determined by solving the Ornstein-Zernike equation. The other approach is based on a molecular field theory, where the potential of mean torque experienced by a mesogen in the liquid crystal phase is parameterized according to its molecular surface. In this case, the calculation of elastic constants is combined with the Monte Carlo sampling of single molecule conformations. Using these different approaches, but the same description, at the level of molecular geometry and torsional potentials, we have investigated the elastic properties of the nematic phase of two typical mesogens, 4'-n-pentyloxy-4-cyanobiphenyl and 4'-n-heptyloxy-4-cyanobiphenyl. Both methods yield K3 (bend) > K1 (splay) > K2 (twist), although there are some discrepancies in the average elastic constants and in their anisotropy. These are interpreted in terms of the different approximations and the different ways of accounting for the structural properties of molecules in the two approaches. In general, the results point to the role of the molecular shape, which is modulated by the conformational freedom and cannot be fully accounted for by a single descriptor such as the aspect ratio

  18. Admissible consensus for heterogeneous descriptor multi-agent systems

    NASA Astrophysics Data System (ADS)

    Yang, Xin-Rong; Liu, Guo-Ping

    2016-09-01

    This paper focuses on the admissible consensus problem for heterogeneous descriptor multi-agent systems. Based on algebra, graph and descriptor system theory, the necessary and sufficient conditions are proposed for heterogeneous descriptor multi-agent systems achieving admissible consensus. The provided conditions depend on not only the structure properties of each agent dynamics but also the topologies within the descriptor multi-agent systems. Moreover, an algorithm is given to design the novel consensus protocol. A numerical example demonstrates the effectiveness of the proposed design approach.

  19. Interactive Modelling of Molecular Structures

    NASA Astrophysics Data System (ADS)

    Rustad, J. R.; Kreylos, O.; Hamann, B.

    2004-12-01

    The "Nanotech Construction Kit" (NCK) [1] is a new project aimed at improving the understanding of molecular structures at a nanometer-scale level by visualization and interactive manipulation. Our very first prototype is a virtual-reality program allowing the construction of silica and carbon structures from scratch by assembling them one atom at a time. In silica crystals or glasses, the basic building block is an SiO4 unit, with the four oxygen atoms arranged around the central silicon atom in the shape of a regular tetrahedron. Two silicate units can connect to each other by their silicon atoms covalently bonding to one shared oxygen atom. Geometrically, this means that two tetrahedra can link at their vertices. Our program is based on geometric representations and uses simple force fields to simulate the interaction of building blocks, such as forming/breaking of bonds and repulsion. Together with stereoscopic visualization and direct manipulation of building blocks using wands or data gloves, this enables users to create realistic and complex molecular models in short amounts of time. The NCK can either be used as a standalone tool, to analyze or experiment with molecular structures, or it can be used in combination with "traditional" molecular dynamics (MD) simulations. In a first step, the NCK can create initial configurations for subsequent MD simulation. In a more evolved setup, the NCK can serve as a visual front-end for an ongoing MD simulation, visualizing changes in simulation state in real time. Additionally, the NCK can be used to change simulation state on-the-fly, to experiment with different simulation conditions, or force certain events, e.g., the forming of a bond, and observe the simulation's reaction. [1] http://graphics.cs.ucdavis.edu/~okreylos/ResDev/NanoTech

  20. Application of the quantum mechanical IEF/PCM-MST hydrophobic descriptors to selectivity in ligand binding.

    PubMed

    Ginex, Tiziana; Muñoz-Muriedas, Jordi; Herrero, Enric; Gibert, Enric; Cozzini, Pietro; Luque, F Javier

    2016-06-01

    We have recently reported the development and validation of quantum mechanical (QM)-based hydrophobic descriptors derived from the parametrized IEF/PCM-MST continuum solvation model for 3D-QSAR studies within the framework of the Hydrophobic Pharmacophore (HyPhar) method. In this study we explore the applicability of these descriptors to the analysis of selectivity fields. To this end, we have examined a series of 88 compounds with inhibitory activities against thrombin, trypsin and factor Xa, and the HyPhar results have been compared with 3D-QSAR models reported in the literature. The quantitative models obtained by combining the electrostatic and non-electrostatic components of the octanol/water partition coefficient yield results that compare well with the predictive potential of standard CoMFA and CoMSIA techniques. The results also highlight the potential of HyPhar descriptors to discriminate the selectivity of the compounds against thrombin, trypsin, and factor Xa. Moreover, the graphical representation of the hydrophobic maps provides a direct linkage with the pattern of interactions found in crystallographic structures. Overall, the results support the usefulness of the QM/MST-based hydrophobic descriptors as a complementary approach for disclosing structure-activity relationships in drug design and for gaining insight into the molecular determinants of ligand selectivity. Graphical Abstract: Quantum mechanical continuum solvation calculations performed with the IEF/PCM-MST method are used to derive atomic hydrophobic descriptors, which are then used to discriminate the selectivity of ligands against thrombin, trypsin and factor Xa. The descriptors provide a complementary view to standard 3D-QSAR analysis, leading to a more comprehensive understanding of ligand recognition. PMID:27188723

  1. Predicting activities without computing descriptors: graph machines for QSAR.

    PubMed

    Goulon, A; Picot, T; Duprat, A; Dreyfus, G

    2007-01-01

    We describe graph machines, an alternative approach to traditional machine-learning-based QSAR, which circumvents the problem of designing, computing and selecting molecular descriptors. In that approach, which is similar in spirit to recursive networks, molecules are considered as structured data, represented as graphs. For each example of the data set, a mathematical function (graph machine) is built, whose structure reflects the structure of the molecule under consideration; it is the combination of identical parameterised functions, called "node functions" (e.g. a feedforward neural network). The parameters of the node functions, shared both within and across the graph machines, are adjusted during training with the "shared weights" technique. Model selection is then performed by traditional cross-validation. Therefore, the designer's main task consists in finding the optimal complexity for the node function. The efficiency of this new approach has been demonstrated in many QSAR or QSPR tasks, as well as in modelling the activities of complex chemicals (e.g. the toxicity of a family of phenols or the anti-HIV activities of HEPT derivatives). It generally outperforms traditional techniques without requiring the selection and computation of descriptors. PMID:17365965

  2. Quantitative structure-antibacterial activity relationship modeling using a combination of piecewise linear regression-discriminant analysis (I): Quantum chemical, topographic, and topological descriptors

    NASA Astrophysics Data System (ADS)

    Molina, Enrique; Estrada, Ernesto; Nodarse, Delvin; Torres, Luis A.; González, Humberto; Uriarte, Eugenio

    Time-dependent antibacterial activity of 2-furylethylenes, expressed as inhibition of respiration in E. coli, is described using quantum chemical, topographic, and topological indices. A QSAR strategy based on the combination of piecewise linear regression and discriminant analysis is used to predict the biological activity values of strong and moderate antibacterial furylethylenes. The breakpoint in the values of the biological activity was detected. The biological activities of the compounds are described by two linear regression equations. A discriminant analysis is carried out to classify the compounds into one of the two biological activity groups. The results obtained using different kinds of descriptors were compared. In all cases the piecewise linear regression - discriminant analysis (PLR-DA) method produced significantly better QSAR models than the linear regression analysis. The QSAR models were validated using an external validation set previously extracted from the original data. A prediction of reported antibacterial activity analysis was carried out showing dependence between the probability of a good classification and the experimental antibacterial activity. Statistical parameters showed the quality of prediction of the models based on quantum-chemical descriptors in LDA, with an accuracy of 0.9 and a C of 0.9. The best PLR-DA model explains more than 92% of the variance of experimental activity. The models with the best prediction results were those based on quantum-chemical descriptors. An interpretation of the quantum-chemical descriptors entered in the models was carried out.

  3. Molecular structure-adsorption study on current textile dyes.

    PubMed

    Örücü, E; Tugcu, G; Saçan, M T

    2014-01-01

    This study was performed to investigate the adsorption of a diverse set of textile dyes onto granulated activated carbon (GAC). The adsorption experiments were carried out in a batch system. The Langmuir and Freundlich isotherm models were applied to experimental data and the isotherm constants were calculated for 33 anthraquinone and azo dyes. The adsorption equilibrium data fitted the Langmuir isotherm model more adequately than the Freundlich isotherm model. In addition to a qualitative analysis of experimental results, multiple linear regression (MLR), support vector regression (SVR) and back propagation neural network (BPNN) methods were used to develop quantitative structure-property relationship (QSPR) models with the novel adsorption data. The data were divided randomly into training and test sets. The predictive ability of all models was evaluated using the test set. Descriptors were selected with a genetic algorithm (GA) using QSARINS software. Results related to QSPR models on the adsorption capacity of GAC showed that the molecular structure of dyes was represented by ionization potential based on two-dimensional topological distances, chromophoric features and a property filter index. Comparison of the performance of the models demonstrated the superiority of the BPNN over GA-MLR and SVR models. PMID:25529487
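
    The comparison of Langmuir and Freundlich fits mentioned above can be illustrated with the textbook linearized forms of the two isotherms. The following is a minimal Python sketch, assuming the Langmuir form q = q_max K_L C / (1 + K_L C), linearized as C/q = C/q_max + 1/(K_L q_max), and the Freundlich form q = K_F C^(1/n), linearized as ln q = ln K_F + (1/n) ln C. The data are synthetic, generated from a Langmuir curve with arbitrary parameters, and are not measurements from the study, whose fitting procedure is not described in the abstract.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_langmuir(conc, uptake):
    """Return (q_max, K_L) from the linearized form C/q versus C."""
    slope, intercept = linear_fit(conc, [c / q for c, q in zip(conc, uptake)])
    return 1.0 / slope, slope / intercept


def fit_freundlich(conc, uptake):
    """Return (K_F, n) from the linearized form ln q versus ln C."""
    slope, intercept = linear_fit([math.log(c) for c in conc],
                                  [math.log(q) for q in uptake])
    return math.exp(intercept), 1.0 / slope


def r_squared(observed, predicted):
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic equilibrium data from a Langmuir curve (arbitrary parameters).
conc = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]            # equilibrium concentration
uptake = [200.0 * 0.05 * c / (1.0 + 0.05 * c) for c in conc]  # adsorbed amount

q_max, k_l = fit_langmuir(conc, uptake)
k_f, n_f = fit_freundlich(conc, uptake)

r2_langmuir = r_squared(uptake, [q_max * k_l * c / (1.0 + k_l * c) for c in conc])
r2_freundlich = r_squared(uptake, [k_f * c ** (1.0 / n_f) for c in conc])
print(f"Langmuir:   q_max = {q_max:.1f}, K_L = {k_l:.3f}, R2 = {r2_langmuir:.4f}")
print(f"Freundlich: K_F = {k_f:.2f}, n = {n_f:.2f}, R2 = {r2_freundlich:.4f}")
```

    Since the synthetic data saturate at high concentration, the Langmuir form reproduces them while the Freundlich power law cannot, which mirrors the kind of comparison the abstract reports.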

  4. Quantitative structure-hydrophobicity relationships of molecular fragments and beyond.

    PubMed

    Zou, Jian-Wei; Huang, Meilan; Huang, Jian-Xiang; Hu, Gui-Xiang; Jiang, Yong-Jun

    2016-03-01

    Quantitative structure-property relationship (QSPR) models were first established for the hydrophobic substituent constant (πX) using theoretical descriptors derived solely from electrostatic potentials (ESPs) at the substituent atoms. The descriptors introduced are found to be related to hydrogen-bond basicity, hydrogen-bond acidity, cavity, or dipolarity/polarizability terms in the linear solvation energy relationship, which endows the models with good interpretability. The predictive capabilities of the models constructed were also verified by rigorous Monte Carlo cross-validation. Then, eight groups of meta- or para-disubstituted benzenes and one group of substituted pyridines were investigated. QSPR models for individual systems were achieved with the ESP-derived descriptors. Additionally, two QSPR models were also established for Rekker's fragment constant (foct), which is a secondary-treatment quantity and reflects the average contribution of the fragment to logP. It has been demonstrated that the descriptors derived from ESPs at the fragments can be used to quantitatively express the relationship between fragment structures and their hydrophobic properties, regardless of the attached parent structure or the valence state. Finally, the relations between the Hammett σ constant and ESP quantities were explored. The results imply that σ and π, which are essential in classic QSAR and represent different types of contributions to biological activities, are also complementary in interaction site. PMID:26826800

  5. Periodic table-based descriptors to encode cytotoxicity profile of metal oxide nanoparticles: a mechanistic QSTR approach.

    PubMed

    Kar, Supratik; Gajewicz, Agnieszka; Puzyn, Tomasz; Roy, Kunal; Leszczynski, Jerzy

    2014-09-01

    Nanotechnology has evolved as a frontrunner in the development of modern science. Current studies have established the toxicity of some nanoparticles to humans and the environment. Lack of sufficient data and low adequacy of experimental protocols hinder comprehensive risk assessment of nanoparticles (NPs). In the present work, metal electronegativity (χ), the charge of the metal cation corresponding to a given oxide (χox), atomic number and valence electron number of the metal have been used as simple molecular descriptors to build up quantitative structure-toxicity relationship (QSTR) models for prediction of cytotoxicity of metal oxide NPs to the bacterium Escherichia coli. These descriptors can be obtained easily and quickly from the molecular formula and information acquired from the periodic table. It has been shown that a simple molecular descriptor χox can efficiently encode cytotoxicity of metal oxides, leading to models with high statistical quality as well as interpretability. Based on this model and previously published experimental results, we have hypothesized the most probable mechanism of the cytotoxicity of metal oxide nanoparticles to E. coli. Moreover, the required information for descriptor calculation is independent of the size range of NPs, nullifying a significant problem that various physical properties of NPs change for different size ranges. PMID:24949897

  6. The Timbre Toolbox: extracting audio descriptors from musical signals.

    PubMed

    Peeters, Geoffroy; Giordano, Bruno L; Susini, Patrick; Misdariis, Nicolas; McAdams, Stephen

    2011-11-01

    The analysis of musical signals to extract audio descriptors that can potentially characterize their timbre has been disparate and often too focused on a particular small set of sounds. The Timbre Toolbox provides a comprehensive set of descriptors that can be useful in perceptual research, as well as in music information retrieval and machine-learning approaches to content-based retrieval in large sound databases. Sound events are first analyzed in terms of various input representations (short-term Fourier transform, harmonic sinusoidal components, an auditory model based on the equivalent rectangular bandwidth concept, the energy envelope). A large number of audio descriptors are then derived from each of these representations to capture temporal, spectral, spectrotemporal, and energetic properties of the sound events. Some descriptors are global, providing a single value for the whole sound event, whereas others are time-varying. Robust descriptive statistics are used to characterize the time-varying descriptors. To examine the information redundancy across audio descriptors, correlational analysis followed by hierarchical clustering is performed. This analysis suggests ten classes of relatively independent audio descriptors, showing that the Timbre Toolbox is a multidimensional instrument for the measurement of the acoustical structure of complex sound signals. PMID:22087919
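
    As a rough illustration of summarizing a time-varying audio descriptor with robust statistics, the sketch below computes a spectral centroid per frame and then reports its median and interquartile range. The choice of median and IQR, the function names, and the numbers are assumptions made for illustration; the abstract does not specify which robust statistics the Timbre Toolbox uses.

```python
import statistics

def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency of one spectral frame."""
    total = sum(mags)
    if total <= 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, mags)) / total

def robust_summary(values):
    """Median and interquartile range of a time-varying descriptor."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"median": q2, "iqr": q3 - q1}

# One illustrative frame: three spectral bins (Hz) and their magnitudes.
centroid = spectral_centroid([100.0, 200.0, 300.0], [1.0, 2.0, 1.0])

# Illustrative centroid track with one outlier frame (values in Hz).
track = [1000.0, 1010.0, 990.0, 1005.0, 5000.0]
summary = robust_summary(track)
mean_value = sum(track) / len(track)
print(centroid, summary, mean_value)
```

    In this toy track the single outlier frame pulls the mean far from the typical value, while the median and IQR are barely affected, which is the usual motivation for robust summaries.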

  7. Structure parameters in molecular tunneling ionization theory

    NASA Astrophysics Data System (ADS)

    Wang, Jun-Ping; Li, Wei; Zhao, Song-Feng

    2014-04-01

    We extracted accurate structure parameters in molecular tunneling ionization theory (so-called MO-ADK theory) for 22 selected linear molecules including some inner orbitals. The molecular wave functions with the correct asymptotic behavior are obtained by solving the time-independent Schrödinger equation with B-spline functions and molecular potentials numerically constructed using the modified Leeuwen-Baerends (LBα) model.

  8. Molecular modeling of nucleic acid structure

    PubMed Central

    Galindo-Murillo, Rodrigo; Bergonzo, Christina

    2013-01-01

    This unit is the first in a series of four units covering the analysis of nucleic acid structure by molecular modeling. This unit provides an overview of computer simulation of nucleic acids. Topics include the static structure model, computational graphics and energy models, generation of an initial model, and characterization of the overall three-dimensional structure. PMID:18428873

  9. The Molecular Structure of Penicillin

    NASA Astrophysics Data System (ADS)

    Bentley, Ronald

    2004-10-01

    The chemical structure of penicillin was determined between 1942 and 1945 under conditions of secrecy established by the U.S. and U.K. governments. The evidence was not published in the open literature but as a monograph. This complex volume does not present a structure proof that can be readily comprehended by a student. In this article, a basic structural proof for the penicillin molecule is provided, emphasizing the chemical work. The stereochemistry of penicillin is also described, and various rearrangements are considered on the basis of the accepted β-lactam structure.

  10. The Molecular Structure of Penicillin

    ERIC Educational Resources Information Center

    Bentley, Ronald

    2004-01-01

    Overviews of the observations that constitute a structure proof for penicillin, specifically aimed at the general student population, are presented. Melting points and boiling points were criteria of purity and a crucial tool was microanalysis leading to empirical formulas.

  11. A Generally Applicable Computer Algorithm Based on the Group Additivity Method for the Calculation of Seven Molecular Descriptors: Heat of Combustion, LogPO/W, LogS, Refractivity, Polarizability, Toxicity and LogBB of Organic Compounds; Scope and Limits of Applicability.

    PubMed

    Naef, Rudolf

    2015-01-01

    A generally applicable computer algorithm for the calculation of the seven molecular descriptors heat of combustion, logPoctanol/water, logS (water solubility), molar refractivity, molecular polarizability, aqueous toxicity (protozoan growth inhibition) and logBB (log (cblood/cbrain)) is presented. The method, an extendable form of the group-additivity method, is based on the complete break-down of the molecules into their constituting atoms and their immediate neighbourhood. The contribution of the resulting atom groups to the descriptor values is calculated using the Gauss-Seidel fitting method, based on experimental data gathered from literature. The plausibility of the method was tested for each descriptor by means of a k-fold cross-validation procedure demonstrating good to excellent predictive power for the former six descriptors and low reliability of logBB predictions. The goodness of fit (Q²) and the standard deviation of the 10-fold cross-validation calculation was >0.9999 and 25.2 kJ/mol, respectively, (based on N = 1965 test compounds) for the heat of combustion, 0.9451 and 0.51 (N = 2640) for logP, 0.8838 and 0.74 (N = 1419) for logS, 0.9987 and 0.74 (N = 4045) for the molar refractivity, 0.9897 and 0.77 (N = 308) for the molecular polarizability, 0.8404 and 0.42 (N = 810) for the toxicity and 0.4709 and 0.53 (N = 383) for logBB. The latter descriptor revealing a very low Q² for the test molecules (R² was 0.7068 and standard deviation 0.38 for N = 413 training molecules) is included as an example to show the limits of the group-additivity method. An eighth molecular descriptor, the heat of formation, was indirectly calculated from the heat of combustion data and correlated with published experimental heat of formation data with a correlation coefficient R² of 0.9974 (N = 2031). PMID:26457702
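
    The group-additivity idea can be sketched as a least-squares problem in which each molecule is a row of group counts and the unknowns are the group contributions. The sketch below solves the normal equations with a plain Gauss-Seidel iteration. The three groups (CH3, CH2, OH), their contribution values, and the function names are invented for illustration; the paper's actual atom-group definitions and fitting details are not given in the abstract.

```python
def gauss_seidel_fit(counts, values, sweeps=500):
    """Fit group contributions x so that counts @ x approximates values.

    Solves the normal equations (A^T A) x = A^T y by Gauss-Seidel sweeps,
    which converge when A^T A is symmetric positive definite.
    """
    m = len(counts[0])
    ata = [[sum(r[i] * r[j] for r in counts) for j in range(m)] for i in range(m)]
    aty = [sum(r[i] * y for r, y in zip(counts, values)) for i in range(m)]
    x = [0.0] * m
    for _ in range(sweeps):
        for i in range(m):
            s = sum(ata[i][j] * x[j] for j in range(m) if j != i)
            x[i] = (aty[i] - s) / ata[i][i]
    return x

def predict(row, contrib):
    """Descriptor value as the sum of group counts times contributions."""
    return sum(c * g for c, g in zip(row, contrib))

# Hypothetical group counts per molecule: [CH3, CH2, OH].
counts = [
    [2, 0, 0],  # ethane
    [2, 1, 0],  # propane
    [2, 2, 0],  # butane
    [1, 0, 1],  # methanol
    [1, 1, 1],  # ethanol
    [1, 2, 1],  # 1-propanol
]
true_contrib = [10.0, 5.0, -3.0]  # made-up contributions, arbitrary units
values = [predict(row, true_contrib) for row in counts]

contrib = gauss_seidel_fit(counts, values)
print([round(c, 6) for c in contrib])
```

    A k-fold cross-validation such as the one reported in the abstract would repeat this fit on subsets of the molecules and score the held-out predictions; that step is omitted here for brevity.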

  12. STRUCTURED MOLECULAR GAS REVEALS GALACTIC SPIRAL ARMS

    SciTech Connect

    Sawada, Tsuyoshi; Hasegawa, Tetsuo; Koda, Jin

    2012-11-01

    We explore the development of structures in molecular gas in the Milky Way by applying the analysis of the brightness distribution function and the brightness distribution index (BDI) in the archival data from the Boston University-Five College Radio Astronomy Observatory ¹³CO J = 1-0 Galactic Ring Survey. The BDI measures the fractional contribution of spatially confined bright molecular emission over faint emission extended over large areas. This relative quantity is largely independent of the amount of molecular gas and of any conventional, pre-conceived structures, such as cores, clumps, or giant molecular clouds. The structured molecular gas traced by higher BDI is located continuously along the spiral arms in the Milky Way in the longitude-velocity diagram. This clearly indicates that molecular gas changes its structure as it flows through the spiral arms. Although the high-BDI gas generally coincides with H II regions, there is also some high-BDI gas with no/little signature of ongoing star formation. These results support a possible evolutionary sequence in which unstructured, diffuse gas transforms itself into a structured state on encountering the spiral arms, followed by star formation and an eventual return to the unstructured state after the spiral arm passage.

  13. Exploiting non-linear relationships between retention time and molecular structure of peptides originating from proteomes and comparing three multivariate approaches.

    PubMed

    Žuvela, Petar; Macur, Katarzyna; Jay Liu, J; Bączek, Tomasz

    2016-08-01

    Peptide retention time prediction is gaining increasing popularity in liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based proteomics. This is a promising approach for improving successful proteome mapping, useful both in identification and quantification workflows. In this work, a quantitative structure-retention relationships (QSRR) model for direct prediction of retention time from the molecular structure of 185 peptides originating from 8 well-characterized proteins and two Bacillus subtilis proteomes has been developed. A Genetic Algorithm (GA) was used for selection of a subset of molecular descriptors, coupled with three machine learning methods for regression: Support Vector Regression (SVR), Artificial Neural Networks (ANN), and kernel Partial Least Squares (kPLS). The final GA-SVR, GA-ANN, and GA-kPLS models were validated through an external validation set of 95 peptides originating from human epithelial HeLa cell proteomes. Robustness and stability were ensured by defining their applicability domain. The descriptors of the developed models were interpreted, confirming a causal relationship between parameters of molecular structure and retention time. The GA-SVR model was shown to be superior to the others in terms of both predictive ability and interpretation of the selected descriptors. PMID:26856456

  14. A Quantitative Structure-Property Relationship (QSPR) Study of Aliphatic Alcohols by the Method of Dividing the Molecular Structure into Substructure

    PubMed Central

    Liu, Fengping; Cao, Chenzhong; Cheng, Bin

    2011-01-01

    A quantitative structure–property relationship (QSPR) analysis of aliphatic alcohols is presented. Four physicochemical properties were studied: boiling point (BP), n-octanol–water partition coefficient (lg POW), water solubility (lg W) and the chromatographic retention indices (RI) on different polar stationary phases. In order to investigate the quantitative structure–property relationship of aliphatic alcohols, the molecular structure ROH is divided into two parts, R and OH, to generate structural parameters. It was proposed that the property is affected by three main factors for aliphatic alcohols: the alkyl group R, the substituted group OH, and the interaction between R and OH. On the basis of the polarizability effect index (PEI), previously developed by Cao, the novel molecular polarizability effect index (MPEI), combined with the odd-even index (OEI) and the sum of eigenvalues of the bond-connecting matrix (SX1CH) previously developed in our team, was used to predict the properties of aliphatic alcohols. The sets of molecular descriptors were derived directly from the structure of the compounds based on graph theory. QSPR models were generated using only calculated descriptors and multiple linear regression techniques. These QSPR models showed high values of multiple correlation coefficient (R > 0.99) and Fisher-ratio statistics. The leave-one-out cross-validation demonstrated the final models to be statistically significant and reliable. PMID:21731451
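
    The combination of multiple linear regression and leave-one-out cross-validation mentioned in this abstract can be sketched as follows. The two descriptor columns and the property values are synthetic (generated from y = 1 + 2*x1 + 3*x2 plus small offsets) and do not correspond to MPEI, OEI or SX1CH values; all function names are hypothetical. Q² is taken here as 1 - PRESS/SS_tot, one common convention.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_mlr(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def loo_q2(X, y):
    """Leave-one-out Q^2: refit without each compound, predict it, sum errors."""
    press = 0.0
    for i in range(len(y)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_tot

# Synthetic descriptors and property values (illustrative only).
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0], [6.0, 5.0], [7.0, 7.0]]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.0]
y = [1.0 + 2.0 * a + 3.0 * b + e for (a, b), e in zip(X, noise)]

coef = fit_mlr(X, y)
q2 = loo_q2(X, y)
print([round(c, 3) for c in coef], round(q2, 4))
```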

  15. Ambient gas/particle partitioning. 3. Estimating partition coefficients of apolar, polar, and ionizable organic compounds by their molecular structure.

    PubMed

    Arp, Hans Peter H; Gosses, Kai-Uwe

    2009-03-15

    Equilibrium gas/particle partitioning coefficients of terrestrial aerosols, Kip, are dependent on various intermolecular interactions that can be quantified by experimentally determined compound-specific descriptors. For many compounds of environmental interest, such as emerging contaminants and atmospheric phototransformation products, these compound-specific descriptors are unknown or immeasurable. Often, only the molecular structure is known. Here we present the ability of two computer programs to predict equilibrium partitioning to terrestrial aerosols solely on the basis of molecular structure: COSMOtherm and SPARC. The greatest hurdle in designing such an approach is to identify suitable molecular surrogates to represent the dominating sorbing phases, which for ambient terrestrial aerosols are the water insoluble organic matter (WIOM) phase and the mixed-aqueous phase. For the WIOM phase, hypothetical urban secondary organic aerosol structural units from Kalberer et al. Science 2004, 303, 1659-1662 were investigated as input surrogates, and for the mixed-aqueous phase mildly acidic water was used as a surrogate. Using a validation data set of more than 1400 experimentally determined Kip values for polar, apolar, and ionic compounds ranging over 9 orders of magnitude (including semivolatile compounds such as PCDD/Fs, pesticides, and PBDEs), SPARC and COSMOtherm were generally able to predict Kip values well within an order of magnitude over an ambient range of temperature and relative humidity. This is remarkable as these two models were not fitted or calibrated to any experimental data. As these models can be used for potentially any organic molecule, they are particularly recommended for environmental screening purposes and for use when experimental compound descriptor data are not available. PMID:19368193

  16. Estimation of biliary excretion of foreign compounds using properties of molecular structure.

    PubMed

    Sharifi, Mohsen; Ghafourian, Taravat

    2014-01-01

    Biliary excretion is one of the main elimination pathways for drugs and/or their metabolites. Therefore, an insight into the structural profile of cholephilic compounds through accurate modelling of the biliary excretion is important for the estimation of clinical pharmacokinetics in early stages of drug discovery. The aim of this study was to develop quantitative structure-activity relationships as computational tools for the estimation of biliary excretion and identification of the molecular properties controlling this process. The study used percentage of dose excreted intact into bile measured in vivo in rat for a diverse dataset of 217 compounds. Statistical techniques were multiple linear regression analysis, regression trees, random forest and boosted trees. A simple regression tree model generated using the CART algorithm was the most accurate in the estimation of the percentage of bile excretion of compounds, and this outperformed the more sophisticated boosted trees and random forest techniques. Analysis of the outliers indicated that the models perform best when lipophilicity is not too extreme (log P < 5.35) and for compounds with molecular weight above 280 Da. Molecular descriptors selected by all these models including the top ten incorporated in boosted trees and random forest indicated a higher biliary excretion for relatively hydrophilic compounds especially if they are anionic or cationic, and have a large molecular size. A statistically validated molecular weight threshold for potentially significant biliary excretion was above 348 Da. PMID:24202722

  17. Excited-state properties from ground-state DFT descriptors: A QSPR approach for dyes.

    PubMed

    Fayet, Guillaume; Jacquemin, Denis; Wathelet, Valérie; Perpète, Eric A; Rotureau, Patricia; Adamo, Carlo

    2010-02-26

    This work presents a quantitative structure-property relationship (QSPR)-based approach allowing an accurate prediction of the excited-state properties of organic dyes (anthraquinones and azobenzenes) from ground-state molecular descriptors, obtained within the (conceptual) density functional theory (DFT) framework. The ab initio computation of the descriptors was achieved at several levels of theory, so that the influence of the basis set size as well as of the modeling of environmental effects could be statistically quantified. It turns out that, for the entire data set, a statistically robust four-variable multiple linear regression based on PCM-PBE0/6-31G calculations delivers an adjusted R² of 0.93, associated with predictive errors allowing for rapid and efficient dye design. All the selected descriptors are independent of the dye's family, an advantage over previously designed QSPR schemes. On top of that, the obtained accuracy is comparable to that of today's reference methods while exceeding that of hardness-based fittings. QSPR relationships specific to both families of dyes have also been built up. This work paves the way towards reliable and computationally affordable color design for organic dyes. PMID:20036173
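
    For reference, the adjusted coefficient of determination quoted in this abstract is conventionally defined as below. Here n is the number of dyes in the data set (not stated in the abstract) and p is the number of regression variables (four for the model described); the symbols are introduced here for illustration only.

```latex
R^2_{\mathrm{adj}} \;=\; 1 - \left(1 - R^2\right)\,\frac{n - 1}{n - p - 1},
\qquad
R^2 \;=\; 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}
```

    Because the correction factor (n - 1)/(n - p - 1) exceeds 1 whenever p > 0, the adjusted value never exceeds the ordinary R² and penalizes models that add variables without improving the fit.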

  18. Structures in Molecular Clouds: Modeling

    SciTech Connect

    Kane, J O; Mizuta, A; Pound, M W; Remington, B A; Ryutov, D D

    2006-04-20

    We attempt to predict the observed morphology, column density and velocity gradient of Pillar II of the Eagle Nebula, using Rayleigh-Taylor (RT) models in which growth is seeded by an initial perturbation in density or in shape of the illuminated surface, and cometary models in which structure arises from an initially spherical cloud with a dense core. Attempting to mitigate suppression of RT growth by recombination, we use a large cylindrical model volume containing the illuminating source and the self-consistently evolving ablated outflow and the photon flux field, and use initial clouds with finite lateral extent. An RT model shows no growth, while a cometary model appears to be more successful at reproducing observations.

  19. On the emergence of molecular structure

    SciTech Connect

    Matyus, Edit; Reiher, Markus; Hutter, Juerg; Mueller-Herold, Ulrich

    2011-05-15

    The structure of (a±,a±,b∓)-type Coulombic systems is characterized by the effective ground-state density of the a-type particles, computed via nonrelativistic quantum mechanics without introduction of the Born-Oppenheimer approximation. A structural transition is observed when varying the relative mass of the a- and b-type particles, e.g., between atomic H⁻ and molecular H₂⁺. The particle-density profile indicates a molecular-type behavior for the positronium ion, Ps⁻.

  20. Gun bore flaw image matching based on improved SIFT descriptor

    NASA Astrophysics Data System (ADS)

    Zeng, Luan; Xiong, Wei; Zhai, You

    2013-01-01

    In order to increase the operation speed and matching ability of the SIFT algorithm, the SIFT descriptor and matching strategy are improved. First, a method of constructing the feature descriptor based on sector areas is proposed. By computing the gradient histograms of location bins that are divided into 6 sector areas, a descriptor with 48 dimensions is constructed. This reduces the dimension of the feature vector and decreases the complexity of building the descriptor. Second, a strategy is introduced that partitions the circular region into 6 identical sector areas starting from the dominant orientation. Consequently, the computational complexity is reduced because the rotation operation for the area is no longer needed. The experimental results indicate that, compared with the OpenCV SIFT algorithm, the average matching speed of the new method increases by about 55.86%. The matching accuracy can be increased even under some variation of viewpoint, illumination, rotation, scale and defocus. The new method achieved satisfactory results in gun bore flaw image matching. Keywords: Metrology, Flaw image matching, Gun bore, Feature descriptor
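
    A minimal sketch of a sector-based descriptor in the spirit of this abstract is given below. It assumes 8 gradient-orientation bins per sector, inferred from 48 dimensions over 6 sectors and not stated in the abstract. It also omits the Gaussian weighting, interpolation and clipping steps of standard SIFT. The function name and input layout are hypothetical, and this is not the authors' implementation.

```python
import math

def sector_descriptor(samples, dominant, n_sectors=6, n_bins=8):
    """Build a (n_sectors * n_bins)-dimensional descriptor.

    samples: iterable of (dx, dy, grad_mag, grad_ori), with offsets relative
    to the keypoint and angles in radians. Sectors are counted from the
    dominant orientation, so the patch itself is never rotated.
    """
    two_pi = 2.0 * math.pi
    hist = [0.0] * (n_sectors * n_bins)
    for dx, dy, mag, ori in samples:
        if dx == 0 and dy == 0:
            continue  # the keypoint itself has no defined sector
        pos = (math.atan2(dy, dx) - dominant) % two_pi
        sector = min(int(pos / two_pi * n_sectors), n_sectors - 1)
        rel = (ori - dominant) % two_pi
        obin = min(int(rel / two_pi * n_bins), n_bins - 1)
        hist[sector * n_bins + obin] += mag
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm > 0 else hist

def rotate(samples, angle):
    """Rotate sample positions and gradient orientations by the same angle."""
    ca, sa = math.cos(angle), math.sin(angle)
    return [(dx * ca - dy * sa, dx * sa + dy * ca, mag, ori + angle)
            for dx, dy, mag, ori in samples]

# Illustrative samples around a keypoint with dominant orientation 0.1 rad.
samples = [(1.0, 0.2, 1.0, 0.3), (-0.5, 1.0, 2.0, 2.0), (0.3, -1.2, 0.5, 4.0)]
desc = sector_descriptor(samples, dominant=0.1)
desc_rot = sector_descriptor(rotate(samples, 0.7), dominant=0.1 + 0.7)
print(len(desc), max(abs(u - v) for u, v in zip(desc, desc_rot)))
```

    Measuring both the sector index and the gradient orientation relative to the dominant orientation is what makes the descriptor unchanged under an in-plane rotation in this toy example.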

  1. How We Teach Molecular Structure to Freshmen.

    ERIC Educational Resources Information Center

    Hurst, Michael O.

    2002-01-01

    Currently molecular structure is taught in general chemistry using three theories, this being based more on historical development rather than logical pedagogy. Electronegativity is taught with a confusing mixture of definitions that do not correspond to modern practice. Valence bond theory and VSEPR are used together in a way that often confuses…

  2. Molecular Structure of Human-Liver Glycogen

    PubMed Central

    Deng, Bin; Sullivan, Mitchell A.; Chen, Cheng; Li, Jialun; Powell, Prudence O.; Hu, Zhenxia; Gilbert, Robert G.

    2016-01-01

    Glycogen is a highly branched glucose polymer which is involved in maintaining blood-sugar homeostasis. Liver glycogen contains large composite α particles made up of linked β particles. Previous studies have shown that the binding which links β particles into α particles is impaired in diabetic mice. The present study reports the first molecular structural characterization of human-liver glycogen from non-diabetic patients, using transmission electron microscopy for morphology and size-exclusion chromatography for the molecular size distribution; the latter is also studied as a function of time during acid hydrolysis in vitro, which is sensitive to certain structural features, particularly glycosidic vs. proteinaceous linkages. The results are compared with those seen in mice and pigs. The molecular structural change during acid hydrolysis is similar in each case, and indicates that the linkage of β into α particles is not glycosidic. This result, and the similar morphology in each case, together imply that human liver glycogen has similar molecular structure to those of mice and pigs. This knowledge will be useful for future diabetes drug targets. PMID:26934359

  3. Molecular Association and Structure of Hydrogen Peroxide.

    ERIC Educational Resources Information Center

    Giguere, Paul A.

    1983-01-01

    The statement is sometimes made in textbooks that liquid hydrogen peroxide is more strongly associated than water, evidenced by its higher boiling point and greater heat of vaporization. Discusses these and an additional factor (the nearly double molecular mass of the peroxide), focusing on hydrogen bonds and structure of the molecule. (JN)

  4. Quantitative Structure-Property Relationship (QSPR) Models for a Local Quantum Descriptor: Investigation of the 4- and 3-Substituted-Cinnamic Acid Esterification.

    PubMed

    Rodrigues-Santos, Cláudio E; Echevarria, Aurea; Sant'Anna, Carlos M R; Bitencourt, Thiago B; Nascimento, Maria G; Bauerfeldt, Glauco F

    2015-01-01

    In this work, the theoretical description of the 4- and 3-substituted-cinnamic acid esterification with different electron donating and electron withdrawing groups was performed at the B3LYP and M06-2X levels, as a two-step process: the O-protonation and the nucleophilic attack by ethanol. In parallel, an experimental work devoted to the synthesis and characterization of the substituted-cinnamate esters has also been performed. In order to quantify the substituent effects, quantitative structure-property relationship (QSPR) models based on the atomic charges, Fukui functions and the Frontier Effective-for-Reaction Molecular Orbitals (FERMO) energies were investigated. In fact, the Fukui functions, ƒ⁺C and ƒ⁻O, indicated poor correlations for each individual step, and in contrast with the general literature, the O-protonation step is affected both by the FERMO energies and the O-charges of the carbonyl group. Since the process was shown not to be totally described by either charges or frontier orbitals, it is proposed to be frontier-charge-miscere controlled. Moreover, the observed trend for the experimental reaction yields suggests that the electron withdrawing groups favor the reaction and the same was observed for Step 2, which can thus be pointed out as the determining step. PMID:26402661

  5. Is electronegativity a useful descriptor for the pseudo-alkali metal NH4?

    PubMed

    Whiteside, Alexander; Xantheas, Sotiris S; Gutowski, Maciej

    2011-11-18

    Molecular ions in the form of "pseudo-atoms" are common structural motifs in chemistry, with properties that are transferrable between different compounds. We have determined one such property, the electronegativity, for the "pseudo-alkali metal" ammonium (NH4), and evaluated its reliability as a descriptor versus the electronegativities of the alkali metals. The computed properties of ammonium's binary complexes with astatine and of selected borohydrides confirm the similarity of NH4 to the alkali metal atoms, although the electronegativity of NH4 is relatively large in comparison to its cationic radius. We have paid particular attention to the molecular properties of ammonium (angular anisotropy, geometric relaxation and reactivity), which can cause deviations from the behaviour expected of a conceptual "true alkali metal" with this electronegativity. These deviations allow for the discrimination of effects associated with the molecular nature of NH4. PMID:21928287

  6. A local adaptive image descriptor

    NASA Astrophysics Data System (ADS)

    Zahid Ishraque, S. M.; Shoyaib, Mohammad; Abdullah-Al-Wadud, M.; Monirul Hoque, Md; Chae, Oksam

    2013-12-01

    The local binary pattern (LBP) is a robust but computationally simple approach in texture analysis. However, LBP performs poorly in the presence of noise and large illumination variation. Thus, a local adaptive image descriptor, termed LAID, is introduced in this work. It is a ternary pattern and is able to generate persistent codes to represent microtextures in a given image, especially in noisy conditions. It can also generate stable texture codes if the pixel intensities change abruptly due to illumination changes. Experimental results also show the superiority of the proposed method over other state-of-the-art methods.
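
    The noise sensitivity of LBP that motivates this work can be seen in a small sketch. The code below computes the basic 8-neighbour LBP code and, for contrast, a generic ternary pattern with a fixed tolerance t. The ternary variant shown is a plain thresholded pattern chosen for illustration; it is not the LAID descriptor, whose adaptive rule is not described in the abstract. The pixel values are invented.

```python
# Clockwise 8-neighbour offsets (row, col), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, r, c):
    """Basic LBP: set one bit per neighbour that is >= the centre pixel."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def ternary_pattern(img, r, c, t):
    """Generic ternary pattern: +1 / 0 / -1 with a fixed tolerance t."""
    center = img[r][c]
    pattern = []
    for dr, dc in OFFSETS:
        d = img[r + dr][c + dc] - center
        pattern.append(1 if d > t else (-1 if d < -t else 0))
    return pattern

# Illustrative 3x3 patch and a copy with one neighbour changed by 1 grey level.
patch = [[52, 50, 49],
         [51, 50, 50],
         [48, 50, 53]]
noisy = [[52, 49, 49],
         [51, 50, 50],
         [48, 50, 53]]

print(lbp_code(patch, 1, 1), lbp_code(noisy, 1, 1))
print(ternary_pattern(patch, 1, 1, t=2) == ternary_pattern(noisy, 1, 1, t=2))
```

    A one-level change in a single neighbour flips a bit of the binary code, while the ternary pattern with a small tolerance is unchanged in this example.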

  7. Students' understanding of molecular structure representations

    NASA Astrophysics Data System (ADS)

    Ferk, Vesna; Vrtacnik, Margareta; Blejec, Andrej; Gril, Alenka

    2003-10-01

    The purpose of the investigation was to determine the meanings attached by students to the different kinds of molecular structure representations used in chemistry teaching. The students (n = 124) were from primary (aged 13-14 years) and secondary (aged 17-18 years) schools and a university (aged 21-25 years). A computerised 'Chemical Visualisation Test' was developed and applied. The research indicates that students' appreciation of three-dimensional molecular structures differs according to the kind of representation used. The best results were achieved with the use of concrete, and pseudo-concrete types of representations (e.g. three-dimensional models, their photographs, computer-generated models). However, the use of more abstract types (e.g. schematic representations, stereochemical formula) was less effective. A correlation between students' results on the Chemical Visualisation Test and their educational level, spatial visualisation, and spatial relations skills was shown statistically, but no statistically significant gender differences were observed.

  8. 2004 Reversible Associations in Structure & Molecular Biology

    SciTech Connect

    Eisenstein, Edward; Gray, Nancy Ryan

    2005-03-23

    The 2004 Gordon Research Conference (GRC) on Reversible Associations in Structure & Molecular Biology was held at Four Points Sheraton, CA, 1/25-30/2004. The Conference was well attended with 82 participants (attendees list attached). The attendees represented the spectrum of endeavor in this field, coming from academia, industry, and government laboratories, and included both U.S. and foreign scientists, senior researchers, young investigators, and students.

  9. 8B structure in Fermionic Molecular Dynamics

    NASA Astrophysics Data System (ADS)

    Henninger, K. R.; Neff, T.; Feldmeier, H.

    2015-04-01

    The structure of the light exotic nucleus 8B is investigated in the Fermionic Molecular Dynamics (FMD) model. The decay of 8B is responsible for almost the entire high-energy solar-neutrino flux, making structure calculations of 8B important for determining the solar core temperature. 8B is a proton halo candidate thought to exhibit clustering. FMD uses a wave-packet basis and is well-suited for modelling clustering and halos. For a multiconfiguration treatment we construct the many-body Hilbert space from antisymmetrised angular-momentum projected 8-particle states. First results show formation of a proton halo.

  10. Prediction of the Fate of Organic Compounds in the Environment From Their Molecular Properties: A Review

    PubMed Central

    Mamy, Laure; Patureau, Dominique; Barriuso, Enrique; Bedos, Carole; Bessac, Fabienne; Louchart, Xavier; Martin-laurent, Fabrice; Miege, Cecile; Benoit, Pierre

    2015-01-01

    A comprehensive review was conducted of quantitative structure-activity relationships (QSAR) allowing the prediction of the fate of organic compounds in the environment from their molecular properties. The considered processes were water dissolution, dissociation, volatilization, retention on soils and sediments (mainly adsorption and desorption), degradation (biotic and abiotic), and absorption by plants. A total of 790 equations involving 686 structural molecular descriptors are reported to estimate 90 environmental parameters related to these processes. A significant number of equations was found for the dissociation process (pKa), water dissolution or hydrophobic behavior (especially through the KOW parameter), adsorption to soils and biodegradation. A lack of QSAR was observed for estimating desorption or the potential for transfer to water. Among the 686 molecular descriptors, five were found to be dominant in the 790 collected equations and the most generic ones: four quantum-chemical descriptors, the energy of the highest occupied molecular orbital (EHOMO) and the energy of the lowest unoccupied molecular orbital (ELUMO), polarizability (α) and dipole moment (μ), and one constitutional descriptor, the molecular weight. Keeping in mind that combining descriptors belonging to different categories (constitutional, topological, quantum-chemical) improved QSAR performance, these descriptors should be considered for the development of new QSAR, for further predictions of environmental parameters. This review also helps in finding the relevant QSAR equations to predict the fate of a wide diversity of compounds in the environment. PMID:25866458

  11. Molecular-structure-based models of chemical inventories using neural networks.

    PubMed

    Wernet, Gregor; Hellweg, Stefanie; Fischer, Ulrich; Papadokonstantakis, Stavros; Hungerbühler, Konrad

    2008-09-01

    Chemical synthesis is a complex and diverse procedure, and production data are often scarce or incomplete. A detailed inventory analysis of all mass and energy flows necessary for the production of chemicals is often costly and time-intensive. Therefore only a few chemical inventories exist, even though they are essential for process optimization and the environmental assessment of many products. This paper introduces a new type of model to provide estimates for inventory data and environmental impacts of chemical production based on the molecular structure of a chemical and without a priori knowledge of the production process. These molecular-structure-based models offer inventory data for users in process design and optimization, screening life cycle assessment (LCA), and supply chain management. They can be applied even if the producer is unknown or the production process is not documented. We assessed the capabilities of linear regression and neural network models for this purpose. All models were generated with a data set of inventory data on 103 chemicals. Different input sets were chosen as ways to transform the chemical structure into a numerical vector of descriptors and the effectiveness of the different input sets was analyzed. The results show that a correctly developed neural network model can perform on an acceptable level for many purposes. The models can assist process developers to improve energy efficiency in all design stages and aid in LCA and supply chain management by filling data gaps. PMID:18800554

  12. Filamentary structure in the Orion molecular cloud

    NASA Astrophysics Data System (ADS)

    Bally, John; Langer, William D.; Stark, Antony A.; Wilson, Robert W.

    1987-01-01

    A large-scale (C-13)O map (containing 33,000 spectra on a 1-arcmin grid) is presented for the giant molecular cloud located in the southern part of Ori which contains the Ori Nebula, NGC 1977, and the L1641 dark cloud complex. The overall structure of the cloud is filamentary, with individual features having a length up to 40 times their width. The northern portion of the cloud is compressed, dynamically relaxed, and supports massive star formation. In contrast, the southern part of the Ori A cloud is diffuse, exhibits chaotic spatial and velocity structure, and supports only intermediate- to low-mass star formation. This morphology may be the consequence of the formation and evolution of the Ori OB I association centered north of the molecular cloud. The entire cloud, in addition to the 5000-solar-mass filament containing both OMC-1 and OMC-2, exhibits a north-south velocity gradient. Implications of the observed cloud morphology for theories of molecular cloud evolution are discussed.

  13. Filamentary structure in the Orion molecular cloud

    SciTech Connect

    Bally, J.; Stark, A.A.; Wilson, R.W.; Langer, W.D.

    1987-01-01

    A large-scale (C-13)O map (containing 33,000 spectra on a 1-arcmin grid) is presented for the giant molecular cloud located in the southern part of Ori which contains the Ori Nebula, NGC 1977, and the L1641 dark cloud complex. The overall structure of the cloud is filamentary, with individual features having a length up to 40 times their width. The northern portion of the cloud is compressed, dynamically relaxed, and supports massive star formation. In contrast, the southern part of the Ori A cloud is diffuse, exhibits chaotic spatial and velocity structure, and supports only intermediate- to low-mass star formation. This morphology may be the consequence of the formation and evolution of the Ori OB I association centered north of the molecular cloud. The entire cloud, in addition to the 5000-solar-mass filament containing both OMC-1 and OMC-2, exhibits a north-south velocity gradient. Implications of the observed cloud morphology for theories of molecular cloud evolution are discussed. 14 references.

  14. The assess facility descriptor module

    SciTech Connect

    Jordan, S.E.; Winblad, A.; Key, B.; Walker, S.; Renis, T.; Saleh, R.

    1989-01-01

    The Facility Descriptor (Facility) module is part of the Analytic System and Software for Evaluating Safeguards and Security (ASSESS). Facility is the foundational software application in the ASSESS system for modelling a nuclear facility's safeguards and security system to determine the effectiveness against theft of special nuclear material. The Facility module provides the tools for an analyst to define a complete description of a facility's physical protection system which can then be used by other ASSESS software modules to determine vulnerability to a spectrum of insider and outsider threats. The analyst can enter a comprehensive description of the protection system layout including all secured areas, target locations, and detailed safeguards specifications. An extensive safeguard component catalog provides the reference data for calculating delay and detection performance. Multiple target locations within the same physical area may be specified, and the facility may be defined for two different operational states such as dayshift and nightshift. 6 refs., 5 figs.

  15. THESAURUS OF ERIC DESCRIPTORS (INTERIM) JANUARY 1967.

    ERIC Educational Resources Information Center

    1967

    THE "THESAURUS OF ERIC DESCRIPTORS (INTERIM)" SUPERSEDES, AND REPRESENTS A REFINEMENT OF, THE "THESAURUS OF ERIC DESCRIPTORS." THE INTERIM ISSUE IS A PRELIMINARY ERIC SYSTEM TOOL AND IS NOT TO BE CONSIDERED A COMPLETE REPRESENTATION OF THE FINAL PRODUCT. THIS REFINEMENT IS THE RESULT OF TWO MAJOR PROJECTS--(1) THE INCORPORATION OF SUGGESTIONS…

  16. The Molecular Structure of cis-FONO

    NASA Technical Reports Server (NTRS)

    Lee, Timothy J.; Dateo, Christopher E.; Rice, Julia E.; Langhoff, Stephen R. (Technical Monitor)

    1994-01-01

    The molecular structure of cis-FONO has been determined with the CCSD(T) correlation method using an spdf quality basis set. In agreement with previous coupled-cluster calculations but in disagreement with density functional theory, cis-FONO is found to exhibit normal bond distances. The quadratic and cubic force fields of cis-FONO have also been determined in order to evaluate the effect of vibrational averaging on the molecular geometry. Vibrational averaging is found to increase bond distances, as expected, but it does not affect the qualitative nature of the bonding. The CCSD(T)/spdf harmonic frequencies of cis-FONO support our previous assertion that a band observed at 1200 cm⁻¹ is a combination band (ν3 + ν4), and not a fundamental.

  17. Aircraft noise descriptor and its application

    NASA Astrophysics Data System (ADS)

    Igarashi, Juichi

    The methods and indices used in Japan to evaluate aircraft noise and the government-enforced countermeasures are discussed. The ECPNL descriptor was modified so as to make the new descriptor, WECPNL', approximately equivalent to Lden, and the noise contours were calculated for each airport in Japan. The government enforced the policy of land purchase within the WECPNL' of 85, and the houses within the value of 75 were declared as needing insulation. The noise descriptor Leq or Ldn has been used to describe human responses to various kinds of noises. However, a single value descriptor was found to have a limit of applicability, because the human response is not a linear function of a sound level. Another defect of the descriptor is a failure to represent the human response adequately for a small number of flights. It is noted that the house vibration caused by low-frequency components of aircraft noise cannot yet be evaluated.
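
    As a hedged illustration of the Ldn descriptor named in this abstract, the sketch below uses the standard day-night level definition: the energy average of a 15-hour daytime equivalent level and a 9-hour night-time equivalent level, with a 10 dB penalty added at night. The function name `day_night_level` and the input levels are assumptions made for this example; the abstract does not give the formula or any measured values, and WECPNL' is not reproduced here.

```python
import math


def day_night_level(l_day, l_night):
    """Return Ldn in dB from daytime and night-time equivalent levels (dB)."""
    # 15 daytime hours contribute their energy unweighted.
    day_energy = 15 * 10 ** (l_day / 10)
    # 9 night-time hours carry a 10 dB penalty before averaging.
    night_energy = 9 * 10 ** ((l_night + 10) / 10)
    return 10 * math.log10((day_energy + night_energy) / 24)


# Hypothetical levels: 60 dB by day and by night.
print(round(day_night_level(60.0, 60.0), 1))
```

    With equal day and night levels, the night penalty alone raises Ldn by about 6.4 dB above the plain equivalent level, which shows how a single-value descriptor weights some flights more heavily than others.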

  18. Is Electronegativity a Useful Descriptor for the "Pseudo-Alkali-Metal" NH4?

    SciTech Connect

    Whiteside, Alexander; Xantheas, Sotiris S.; Gutowski, Maciej S.

    2011-11-18

    Molecular ions in the form of "pseudo-atoms" are common structural motifs in chemistry, with properties that are transferable between different compounds. We have determined the electronegativity of the "pseudo-alkali metal" ammonium (NH4) and evaluated its reliability as a descriptor in comparison to the electronegativities of the alkali metals. The computed properties of its binary complexes with astatine and of selected borohydrides confirm the similarity of NH4 to the alkali metal atoms, although the electronegativity of NH4 is relatively large in comparison to its cationic radius. We paid particular attention to the molecular properties of ammonium (angular anisotropy, geometric relaxation, and reactivity), which can cause deviations from the behaviour expected of a conceptual "true alkali metal" with this electronegativity. These deviations allow for the discrimination of effects associated with the polyatomic nature of NH4.

  19. Relationship between molecular structure, concentration and odor qualities of oxygenated aliphatic molecules.

    PubMed

    Laing, D G; Legha, P K; Jinks, A L; Hutchinson, I

    2003-01-01

    Increasing the concentration of an odorant increases the number of receptor cells and glomeruli in the olfactory bulb that are stimulated, and it is commonly acknowledged that these represent increased numbers of receptor types. Currently, it is not known whether a receptor type is associated with a unique quality and a unique molecular feature of an odorant, or its activation is used by the brain in a combinatorial manner with other activated receptor types to produce a characteristic quality. The present study investigated the proposal that a molecular feature common to several aliphatic odorants and known to be the key feature required to stimulate the same mitral cells in the olfactory bulb results in a quality that is common to the odorants. Since the common structural feature may activate a specific receptor type possibly at a similar concentration, the qualities of the odorants were determined at seven concentrations where the lowest and highest concentrations were the detection threshold (DT) and 729DT of each subject. A list of 146 descriptors was used by 15 subjects to describe the qualities of each odorant at each concentration. The results indicate that each of the five odorants was characterized by different qualities and the qualities of four of the odorants changed with changes in concentration. Importantly, no quality common to each of the odorants that had the same molecular feature could be identified and it is proposed that identification of the odorants occurs via a combinatorial mechanism involving several types of receptors. PMID:12502524
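
    The range from DT to 729DT in seven steps is consistent with a threefold series, since 3^6 = 729. The abstract does not state the step size, so the geometric spacing in the sketch below is an assumption, and `concentration_series` is a hypothetical helper introduced only for illustration.

```python
def concentration_series(detection_threshold, steps=7, factor=3):
    """Return `steps` concentrations, each `factor` times the previous one.

    The first entry is the detection threshold itself (assumed spacing).
    """
    return [detection_threshold * factor ** k for k in range(steps)]


# Concentrations as multiples of one subject's detection threshold (DT = 1).
multiples = concentration_series(1)
print(multiples)  # [1, 3, 9, 27, 81, 243, 729]
```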

  20. Gaze estimation using a hybrid appearance and motion descriptor

    NASA Astrophysics Data System (ADS)

    Xiong, Chunshui; Huang, Lei; Liu, Changping

    2015-03-01

    Realizing a robust and low-cost gaze estimation system is a challenging problem. Existing appearance-based and feature-based methods have both achieved impressive progress in the past several years, but their improvements are still limited by feature representation. Therefore, in this paper, we propose a novel descriptor combining eye appearance and pupil center-cornea reflections (PCCR). The hybrid gaze descriptor represents eye structure at both the feature level and the topology level. At the feature level, a glints-centered appearance descriptor is presented to capture intensity and contour information of the eye, and a polynomial representation of the normalized PCCR vector is employed to capture motion information of the eyeball. At the topology level, partial least squares is applied for feature fusion and selection. Finally, sparse representation based regression is employed to map the descriptor to the point-of-gaze (PoG). Experimental results show that the proposed method achieves high accuracy and has a good tolerance to head movements.

  1. Analysis of peptide-protein binding using amino acid descriptors: prediction and experimental verification for human histocompatibility complex HLA-A0201.

    PubMed

    Guan, Pingping; Doytchinova, Irini A; Walshe, Valerie A; Borrow, Persephone; Flower, Darren R

    2005-11-17

    Amino acid descriptors are often used in quantitative structure-activity relationship (QSAR) analysis of proteins and peptides. In the present study, descriptors were used to characterize peptides binding to the human MHC allele HLA-A0201. Two sets of amino acid descriptors were chosen: 93 descriptors taken from the amino acid descriptor database AAindex and the z descriptors defined by Wold and Sandberg. Variable selection techniques (SIMCA, genetic algorithm, and GOLPE) were applied to remove redundant descriptors. Our results indicate that QSAR models generated using five z descriptors had the highest predictivity and explained variance (q2 between 0.6 and 0.7 and r2 between 0.6 and 0.9). Further to the QSAR analysis, 15 peptides were synthesized and tested using a T2 stabilization assay. All peptides bound to HLA-A0201 well, and four peptides were identified as high-affinity binders. PMID:16279801
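
    The q2 and r2 values quoted above are conventionally the cross-validated and fitted coefficients of determination. The sketch below shows one common way to compute them, assuming leave-one-out cross-validation and a single-descriptor least-squares line; the abstract does not state the validation scheme, and the data are invented for illustration rather than taken from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y):
    """Explained variance of the model fitted to all samples."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def q_squared_loo(x, y):
    """Leave-one-out q2 = 1 - PRESS / SS_tot (one common convention)."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        # Refit without sample i, then predict the held-out sample.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - press / ss_tot


# Hypothetical descriptor values and binding affinities.
descriptor = [0.1, 0.5, 0.9, 1.4, 2.0, 2.3]
affinity = [5.1, 5.6, 5.9, 6.8, 7.1, 7.9]
print(r_squared(descriptor, affinity), q_squared_loo(descriptor, affinity))
```

    For a least-squares fit, q2 never exceeds r2, because each held-out prediction error is at least as large as the corresponding fitted residual. Other q2 conventions use the training-set mean in the denominator and give slightly different values.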

  2. Is conformation a fundamental descriptor in QSAR? A case for halogenated anesthetics

    PubMed Central

    Guimarães, Maria C; Duarte, Mariene H; Silla, Josué M

    2016-01-01

    An intriguing question in 3D-QSAR is which conformation(s) to use when generating molecular descriptors (MDs) for correlation with bioactivity values. This is not a simple task because the bioactive conformation in molecule data sets is usually unknown and, therefore, optimized structures in a receptor-free environment are often used to generate the MDs. In this case, a wrong conformational choice can cause misinterpretation of the QSAR model. The present computational work reports the conformational analysis of the volatile anesthetic isoflurane (2-chloro-2-(difluoromethoxy)-1,1,1-trifluoroethane) in the gas phase and also in polar and nonpolar implicit and explicit solvents to show that stable minima (ruled by intramolecular interactions) do not necessarily coincide with the bioconformation (ruled by enzyme induced fit). Consequently, a QSAR model based on two-dimensional chemical structures was built and exhibited satisfactory modeling/prediction capability and interpretability, suggesting that these 2D MDs can be advantageous over some three-dimensional descriptors. PMID:27340468

  3. Learning spectral descriptors for deformable shape correspondence.

    PubMed

    Litman, R; Bronstein, A M

    2014-01-01

    Informative and discriminative feature descriptors play a fundamental role in deformable shape analysis. For example, they have been successfully employed in correspondence, registration, and retrieval tasks. In recent years, significant attention has been devoted to descriptors obtained from the spectral decomposition of the Laplace-Beltrami operator associated with the shape. Notable examples in this family are the heat kernel signature (HKS) and the recently introduced wave kernel signature (WKS). The Laplacian-based descriptors achieve state-of-the-art performance in numerous shape analysis tasks; they are computationally efficient, isometry-invariant by construction, and can gracefully cope with a variety of transformations. In this paper, we formulate a generic family of parametric spectral descriptors. We argue that to be optimized for a specific task, the descriptor should take into account the statistics of the corpus of shapes to which it is applied (the "signal") and those of the class of transformations to which it is made insensitive (the "noise"). While such statistics are hard to model axiomatically, they can be learned from examples. Following the spirit of the Wiener filter in signal processing, we show a learning scheme for the construction of optimized spectral descriptors and relate it to Mahalanobis metric learning. The superiority of the proposed approach in generating correspondences is demonstrated on synthetic and scanned human figures. We also show that the learned descriptors are robust enough to be learned on synthetic data and transferred successfully to scanned shapes. PMID:24231874
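
    For orientation, the heat kernel signature mentioned above has a closed form in the Laplacian eigenpairs: HKS(x, t) = sum_i exp(-lambda_i t) phi_i(x)^2. The sketch below evaluates that expression from a precomputed spectrum. It is not the learned descriptor proposed in the paper, and the two-vertex graph Laplacian is an assumed toy stand-in for a mesh Laplace-Beltrami operator.

```python
import math


def heat_kernel_signature(eigenvalues, eigenvectors, vertex, times):
    """Evaluate HKS(x, t) = sum_i exp(-lambda_i * t) * phi_i(x)**2.

    eigenvectors[i][vertex] is the value of the i-th eigenfunction at `vertex`.
    """
    return [
        sum(math.exp(-lam * t) * phi[vertex] ** 2
            for lam, phi in zip(eigenvalues, eigenvectors))
        for t in times
    ]


# Toy spectrum of the graph Laplacian [[1, -1], [-1, 1]] (two joined vertices).
s = 1 / math.sqrt(2)
eigenvalues = [0.0, 2.0]
eigenvectors = [[s, s], [s, -s]]

signature = heat_kernel_signature(eigenvalues, eigenvectors, vertex=0,
                                  times=[0.0, 0.5, 1.0])
print(signature)
```

    Because the signature depends only on the spectrum, it is unchanged by isometric deformations of the shape, which is the invariance the abstract refers to.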

  4. Application of SMILES Notation Based Optimal Descriptors in Drug Discovery and Design.

    PubMed

    Veselinović, Aleksandar M; Veselinović, Jovana B; Živković, Jelena V; Nikolić, Goran M

    2015-01-01

    SMILES notation based optimal descriptors are presented as a universal tool for QSAR analysis with further application in drug discovery and design. The basis of this QSAR modeling, the Monte Carlo method, is discussed; it has important advantages over other methods, such as the possibility of analyzing a QSAR as a random event. The advantages of SMILES notation based optimal descriptors in comparison to commonly used descriptors are defined. The published results of QSAR modeling with SMILES notation based optimal descriptors applied for various pharmacologically important endpoints are listed. The presented QSAR modeling approach obeys OECD principles and has a mechanistic interpretation, with the possibility of identifying molecular fragments that contribute positively or negatively to the studied biological activity, which is of great importance in computer-aided drug design of new compounds with desired activity. PMID:25961525

  5. Structure and Dynamics of Cellulose Molecular Solutions

    NASA Astrophysics Data System (ADS)

    Wang, Howard; Zhang, Xin; Tyagi, Madhusudan; Mao, Yimin; Briber, Robert

    Molecular dissolution of microcrystalline cellulose has been achieved through mixing with the ionic liquid 1-Ethyl-3-methylimidazolium acetate (EMIMAc) and the organic solvent dimethylformamide (DMF). The mechanism of cellulose dissolution in ternary mixtures has been investigated by combining quasielastic and small angle neutron scattering (QENS and SANS). As SANS data show that cellulose chains take Gaussian-like conformations in homogeneous solutions, which exhibit characteristics of having an upper critical solution temperature, the dynamic signals predominantly from EMIMAc molecules indicate strong association with cellulose in the dissolution state. The mean square displacement quantities support the observation of the stoichiometric 3:1 EMIMAc to cellulose unit molar ratio, which is a necessary criterion for the molecular dissolution of cellulose. Analyses of dynamic structure factors reveal the temperature dependence of a slow and a fast process for EMIMAc molecules bound to cellulose and in DMF, respectively, as well as a very fast process due possibly to the rotational motion of methyl groups, which persists to near absolute zero.

  6. Algorithmic dimensionality reduction for molecular structure analysis

    PubMed Central

    Brown, W. Michael; Martin, Shawn; Pollock, Sara N.; Coutsias, Evangelos A.; Watson, Jean-Paul

    2008-01-01

    Dimensionality reduction approaches have been used to exploit the redundancy in a Cartesian coordinate representation of molecular motion by producing low-dimensional representations of molecular motion. This has been used to help visualize complex energy landscapes, to extend the time scales of simulation, and to improve the efficiency of optimization. Until recently, linear approaches for dimensionality reduction have been employed. Here, we investigate the efficacy of several automated algorithms for nonlinear dimensionality reduction for representation of trans, trans-1,2,4-trifluorocyclo-octane conformation—a molecule whose structure can be described on a 2-manifold in a Cartesian coordinate phase space. We describe an efficient approach for a deterministic enumeration of ring conformations. We demonstrate a drastic improvement in dimensionality reduction with the use of nonlinear methods. We discuss the use of dimensionality reduction algorithms for estimating intrinsic dimensionality and the relationship to the Whitney embedding theorem. Additionally, we investigate the influence of the choice of high-dimensional encoding on the reduction. We show for the case studied that, in terms of reconstruction error root mean square deviation, Cartesian coordinate representations and encodings based on interatom distances provide better performance than encodings based on a dihedral angle representation. PMID:18715062
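
    To make the idea of a high-dimensional encoding concrete, the sketch below converts Cartesian atom coordinates into a vector of interatom distances, one of the encodings compared in the abstract. The function name `distance_encoding` and the three-atom example are assumptions for illustration; the paper's own encoding details are not given here.

```python
from itertools import combinations
from math import dist


def distance_encoding(coords):
    """Encode a conformation as the list of all pairwise atom distances.

    Pairs are ordered as (0, 1), (0, 2), ..., (n-2, n-1).
    """
    return [dist(a, b) for a, b in combinations(coords, 2)]


# Hypothetical three-atom conformation, coordinates in arbitrary units.
atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
print(distance_encoding(atoms))  # [3.0, 4.0, 5.0]
```

    Unlike raw Cartesian coordinates, this encoding is unchanged by rigid translation and rotation of the molecule, at the cost of growing quadratically with the number of atoms.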

  7. Computing stoichiometric molecular composition from crystal structures

    PubMed Central

    Gražulis, Saulius; Merkys, Andrius; Vaitkus, Antanas; Okulič-Kazarinas, Mykolas

    2015-01-01

    Crystallographic investigations deliver high-accuracy information about positions of atoms in crystal unit cells. For chemists, however, the structure of a molecule is most often of interest. The structure must thus be reconstructed from crystallographic files using symmetry information and chemical properties of atoms. Most existing algorithms faithfully reconstruct separate molecules but not the overall stoichiometry of the complex present in a crystal. Here, an algorithm that can reconstruct stoichiometrically correct multimolecular ensembles is described. This algorithm uses only the crystal symmetry information for determining molecule numbers and their stoichiometric ratios. The algorithm can be used by chemists and crystallographers as a standalone implementation for investigating above-molecular ensembles or as a function implemented in graphical crystal analysis software. The greatest envisaged benefit of the algorithm, however, is for the users of large crystallographic and chemical databases, since it will permit database maintainers to generate stoichiometrically correct chemical representations of crystal structures automatically and to match them against chemical databases, enabling multidisciplinary searches across multiple databases. PMID:26089747
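
    The last step of such a procedure, turning molecule counts into a stoichiometric ratio, can be sketched as a reduction by the greatest common divisor. This is only the final arithmetic under assumed inputs, not the symmetry-based algorithm the paper describes; the species labels and counts are hypothetical.

```python
from functools import reduce
from math import gcd


def stoichiometric_ratio(counts):
    """Reduce per-cell molecule counts to the smallest integer ratio.

    `counts` maps a species label to a positive integer count and must
    contain at least one entry.
    """
    divisor = reduce(gcd, counts.values())
    return {species: n // divisor for species, n in counts.items()}


# Hypothetical unit cell holding 4 host molecules and 8 solvent molecules.
cell_counts = {"host": 4, "solvent": 8}
print(stoichiometric_ratio(cell_counts))  # {'host': 1, 'solvent': 2}
```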

  8. Structural disorder in molecular framework materials.

    PubMed

    Cairns, Andrew B; Goodwin, Andrew L

    2013-06-21

    It is increasingly apparent that many important classes of molecular framework material exhibit a variety of interesting and useful types of structural disorder. This tutorial review summarises a number of recent efforts to understand better both the complex microscopic nature of this disorder and also how it might be implicated in useful functionalities of these materials. We draw on a number of topical examples including topologically-disordered zeolitic imidazolate frameworks (ZIFs), porous aromatic frameworks (PAFs), the phenomena of temperature-, pressure- and desorption-induced amorphisation, partial interpenetration, ferroelectric transition-metal formates, negative thermal expansion in cyanide frameworks, and the mechanics and processing of layered frameworks. We outline the various uses of pair distribution function (PDF) analysis, dielectric spectroscopy, peak-shape analysis of powder diffraction data and single-crystal diffuse scattering measurements as means of characterising disorder in these systems, and we suggest a number of opportunities for future research in the field. PMID:23471316

  9. Molecular structure, vibrational, electronic and thermal properties of 4-vinylcyclohexene by quantum chemical calculations

    NASA Astrophysics Data System (ADS)

    Nagabalasubramanian, P. B.; Periandy, S.; Karabacak, Mehmet; Govindarajan, M.

    2015-06-01

    The solid phase FT-IR and FT-Raman spectra of 4-vinylcyclohexene (abbreviated as 4-VCH) have been recorded in the region 4000-100 cm⁻¹. The optimized molecular geometry and vibrational frequencies of the fundamental modes of 4-VCH have been precisely assigned and analyzed with the aid of structure optimizations and normal coordinate force field calculations based on the density functional theory (DFT) method with the 6-311++G(d,p) basis set. The theoretical frequencies were properly scaled and compared with experimentally obtained FT-IR and FT-Raman spectra. Also, the effect due to the substitution of the vinyl group on the ring vibrational frequencies was analyzed and a detailed interpretation of the vibrational spectra of this compound has been made on the basis of the calculated total energy distribution (TED). The time dependent DFT (TD-DFT) method was employed to predict its electronic properties, such as electronic transitions by UV-Visible analysis, HOMO and LUMO energies, molecular electrostatic potential (MEP) and various global reactivity and selectivity descriptors (chemical hardness, chemical potential, softness, electrophilicity index). Stability of the molecule arising from hyperconjugative interaction and charge delocalization has been analyzed using natural bond orbital (NBO) analysis. Atomic charges obtained by Mulliken population analysis and NBO analysis are compared. Thermodynamic properties (heat capacity, entropy and enthalpy) of the title compound at different temperatures are also calculated.

  10. Molecular structure, vibrational, electronic and thermal properties of 4-vinylcyclohexene by quantum chemical calculations.

    PubMed

    Nagabalasubramanian, P B; Periandy, S; Karabacak, Mehmet; Govindarajan, M

    2015-06-15

    The solid phase FT-IR and FT-Raman spectra of 4-vinylcyclohexene (abbreviated as 4-VCH) have been recorded in the region 4000-100 cm⁻¹. The optimized molecular geometry and vibrational frequencies of the fundamental modes of 4-VCH have been precisely assigned and analyzed with the aid of structure optimizations and normal coordinate force field calculations based on the density functional theory (DFT) method with the 6-311++G(d,p) basis set. The theoretical frequencies were properly scaled and compared with experimentally obtained FT-IR and FT-Raman spectra. Also, the effect due to the substitution of the vinyl group on the ring vibrational frequencies was analyzed and a detailed interpretation of the vibrational spectra of this compound has been made on the basis of the calculated total energy distribution (TED). The time dependent DFT (TD-DFT) method was employed to predict its electronic properties, such as electronic transitions by UV-Visible analysis, HOMO and LUMO energies, molecular electrostatic potential (MEP) and various global reactivity and selectivity descriptors (chemical hardness, chemical potential, softness, electrophilicity index). Stability of the molecule arising from hyperconjugative interaction and charge delocalization has been analyzed using natural bond orbital (NBO) analysis. Atomic charges obtained by Mulliken population analysis and NBO analysis are compared. Thermodynamic properties (heat capacity, entropy and enthalpy) of the title compound at different temperatures are also calculated. PMID:25795608

  11. Deconstructing field-induced ketene isomerization through Lagrangian descriptors.

    PubMed

    Craven, Galen T; Hernandez, Rigoberto

    2016-02-01

    The time-dependent geometrical separatrices governing state transitions in field-induced ketene isomerization are constructed using the method of Lagrangian descriptors. We obtain the stable and unstable manifolds of time-varying transition states as dynamic phase space objects governing configurational changes when the ketene molecule is subjected to an oscillating electric field. The dynamics of the isomerization reaction are modeled through classical trajectory studies on the Gezelter-Miller potential energy surface and an approximate dipole moment model which is coupled to a time-dependent electric field. We obtain a representation of the reaction geometry, over varying field strengths and oscillation frequencies, by partitioning an initial phase space into basins labeled according to which product state is reached at a given time. The borders between these basins are in agreement with those obtained using Lagrangian descriptors, even in regimes exhibiting chaotic dynamics. Major outcomes of this work are: validation and extension of a transition state theory framework built from Lagrangian descriptors, elaboration of the applicability for this theory to periodically- and aperiodically-driven molecular systems, and prediction of regimes in which isomerization of ketene and its derivatives may be controlled using an external field. PMID:26778728

  12. Comparison of Global Reactivity Descriptors Calculated Using Various Density Functionals: A QSAR Perspective.

    PubMed

    Vijayaraj, R; Subramanian, V; Chattaraj, P K

    2009-10-13

    Conceptual density functional theory (DFT) based global reactivity descriptors are used to understand the relationship between structure, stability, and global chemical reactivity. Furthermore, these descriptors are employed in the development of quantitative structure-activity (QSAR), structure-property (QSPR), and structure-toxicity (QSTR) relationships. However, the predictive power of various relationships depends on the reliable estimates of these descriptors. The basic working equations used to calculate these descriptors contain both the ionization potential and the electron affinity of chosen molecules. Therefore, efficiency of different density functionals (DFs) in predicting the ionization potential and the electron affinity has to be systematically evaluated. With a view to benchmark the method of calculation of global reactivity descriptors, comprehensive calculations have been carried out on a series of chlorinated benzenes using a variety of density functionals employing different basis sets. In addition, to assess the utility of global reactivity descriptors, the relationships between the reactivity-electrophilicity and the structure-toxicity have been developed. The ionization potential and the electron affinity values obtained from M05-2X method using the ΔSCF approach are closer to the corresponding experimental values. This method reliably predicts these electronic properties when compared to the other DFT methods. The analysis of a series of QSTR equations reveals that computationally economic DFT functionals can be effectively and routinely applied in the development of QSAR/QSPR/QSTR. PMID:26631787
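
    The working equations referred to above are commonly written in terms of the ionization potential I and electron affinity A: chemical potential mu = -(I + A)/2, chemical hardness eta = (I - A)/2, and electrophilicity index omega = mu^2 / (2 eta). The sketch below applies these finite-difference forms together with the ΔSCF energy differences. Conventions differ between authors (some define eta = I - A, or softness as 1/eta), so the factors of two are an assumption here, and the energies are hypothetical values in eV rather than results from the paper.

```python
def delta_scf(e_cation, e_neutral, e_anion):
    """Return (I, A) from total energies of the N-1, N and N+1 electron systems."""
    ionization_potential = e_cation - e_neutral
    electron_affinity = e_neutral - e_anion
    return ionization_potential, electron_affinity


def global_reactivity(ionization_potential, electron_affinity):
    """Finite-difference global reactivity descriptors (same units as inputs)."""
    mu = -(ionization_potential + electron_affinity) / 2   # chemical potential
    eta = (ionization_potential - electron_affinity) / 2   # chemical hardness
    return {
        "chemical_potential": mu,
        "hardness": eta,
        "softness": 1 / (2 * eta),
        "electrophilicity": mu ** 2 / (2 * eta),
    }


# Hypothetical total energies in eV for the cation, neutral and anion.
i_p, e_a = delta_scf(e_cation=-91.0, e_neutral=-100.0, e_anion=-101.0)
print(global_reactivity(i_p, e_a))
```

    Since every descriptor is built from I and A, any error a density functional makes in those two quantities propagates directly into the descriptors, which is why the abstract benchmarks them first.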

  13. Plant sex chromosomes: molecular structure and function.

    PubMed

    Jamilena, M; Mariotti, B; Manzano, S

    2008-01-01

    Recent molecular and genomic studies carried out in a number of model dioecious plant species, including Asparagus officinalis, Carica papaya, Silene latifolia, Rumex acetosa and Marchantia polymorpha, have shed light on the molecular structure of both homomorphic and heteromorphic sex chromosomes, and also on the gene functions they have maintained since their evolution from a pair of autosomes. The molecular structure of sex chromosomes in species from different plant families represents the evolutionary pathway followed by sex chromosomes during their evolution. The degree of Y chromosome degeneration that accompanies the suppression of recombination between the Xs and Ys differs among species. The primitive Ys of A. officinalis and C. papaya have only diverged from their homomorphic Xs in a short male-specific and non-recombining region (MSY), while the heteromorphic Ys of S. latifolia, R. acetosa and M. polymorpha have diverged from their respective Xs. As in the Y chromosomes of mammals and Drosophila, the accumulation of repetitive DNA, including both transposable elements and satellite DNA, has played an important role in the divergence and size enlargement of plant Ys, and consequently in reducing gene density. Nevertheless, the degeneration process in plants does not appear to have reached the Y-linked genes. Although a low gene density has been found in the sequenced Y chromosome of M. polymorpha, most of its genes are essential and are expressed in the vegetative and reproductive organs in both males and females. Similarly, most of the Y-linked genes that have been isolated and characterized up to now in S. latifolia are housekeeping genes that have X-linked homologues, and are therefore expressed in both males and females. Only one of them seems to be degenerate with respect to its homologous region in the X. Sequence analysis of larger regions in the homomorphic X and Y chromosomes of papaya and asparagus, and also in the heteromorphic sex chromosomes

  14. A contour-based shape descriptor for biomedical image classification and retrieval

    NASA Astrophysics Data System (ADS)

    You, Daekeun; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.

    2013-12-01

    Contours, object blobs, and specific feature points are utilized to represent object shapes and extract shape descriptors that can then be used for object detection or image classification. In this research we develop a shape descriptor for biomedical image type (or, modality) classification. We adapt a feature extraction method used in optical character recognition (OCR) for character shape representation, and apply various image preprocessing methods to successfully adapt the method to our application. The proposed shape descriptor is applied to radiology images (e.g., MRI, CT, ultrasound, X-ray, etc.) to assess its usefulness for modality classification. In our experiment we compare our method with other visual descriptors such as CEDD, CLD, Tamura, and PHOG that extract color, texture, or shape information from images. The proposed method achieved the highest classification accuracy of 74.1% among all other individual descriptors in the test, and when combined with CSD (color structure descriptor) showed better performance (78.9%) than using the shape descriptor alone.

  15. The sEDA(=) and pEDA(=) descriptors of the double bonded substituent effect.

    PubMed

    Mazurek, Andrzej; Dobrowolski, Jan Cz

    2013-05-14

    New descriptors of the double bonded substituent effect, sEDA(=) and pEDA(=), were constructed based on quantum chemical calculations and NBO methodology. They show to what extent the σ and π electrons are donated to or withdrawn from the substituted system by a double bonded substituent. The new descriptors differ from descriptors of the classical substituent effect for which the pz orbital of the ipso carbon atom is engaged in the π-electron system of the two neighboring atoms in the ring. For double bonded substituents, the pz orbital participates in double bond formation with only one external atom. Moreover, the external double bond forces localization of the double bond system of the ring, significantly changing the core molecule. We demonstrated good agreement between our descriptors and Weinhold and Landis' "natural σ and π-electronegativities", so far the only descriptors allowing for evaluation of the substitution effect by a double bonded atom. The equivalency between descriptors constructed for 5- and 6-membered model structures as well as linear dependence/independence of the constructed parameters was discussed. Some interrelations between sEDA(=) and pEDA(=) and the other descriptors of (hetero)cyclic systems such as aromaticity and electron density in the ring and bond critical points were also examined. PMID:23532500

  16. Multilayer descriptors for medical image classification.

    PubMed

    Lumini, Alessandra; Nanni, Loris; Brahnam, Sheryl

    2016-05-01

    In this paper, we propose a new method for improving the performance of 2D descriptors by building an n-layer image using different preprocessing approaches from which multilayer descriptors are extracted and used as feature vectors for training a Support Vector Machine. The different preprocessing approaches are used to build different n-layer images (n=3, n=5, etc.). We test both color and gray-level images, two well-known texture descriptors (Local Phase Quantization and Local Binary Pattern), and three of their variants suited for n-layer images (Volume Local Phase Quantization, Local Phase Quantization Three-Orthogonal-Planes, and Volume Local Binary Patterns). Our results show that multilayers and texture descriptors can be combined to outperform the standard single-layer approaches. Experiments on 10 datasets demonstrate the generalizability of the proposed descriptors. Most of these datasets are medical, but in each case the images are very different. Two datasets are completely unrelated to medicine and are included to demonstrate the discriminative power of the proposed descriptors across very different image recognition tasks. A MATLAB version of the complete system developed in this paper will be made available at https://www.dei.unipd.it/node/2357. PMID:26656952

  17. Towards accurate porosity descriptors based on guest-host interactions.

    PubMed

    Paik, Dooam; Haranczyk, Maciej; Kim, Jihan

    2016-05-01

    For nanoporous materials at the characterization level, geometry-based approaches have become the methods of choice to provide information, often encoded in numerical descriptors, about the pores and the channels of a porous material. Examples of the most common descriptors of the latter are pore limiting diameters, accessible surface area and accessible volume. The geometry-based methods exploit a hard-sphere approximation for atoms, which (1) reduces costly computations of the interatomic interactions between the probe guest molecule and the porous material framework atoms, and (2) effectively exploits applied mathematics methods such as Voronoi decomposition to represent and characterize porosity. In this work, we revisit and quantify the shortcomings of the geometry-based approaches. To do so, we have developed a series of algorithms to calculate pore descriptors such as void fraction, accessible surface area, and pore limiting diameters (largest included sphere and largest free sphere) based on a classical force field model of interactions between the guest and the framework atoms. Our resulting energy-based methods are tested on diverse sets of metal-organic frameworks and zeolite structures, and comparisons against results obtained from geometry-based methods indicate deviations for structures with small pore sizes. The method provides both high accuracy and performance, making it suitable when screening a large database of materials. PMID:27054971

  18. Fractal and Euclidean descriptors of platelet shape.

    PubMed

    Kraus, Max-Joseph; Neeb, Heiko; Strasser, Erwin F

    2014-01-01

    Platelet shape change is a dynamic membrane surface process that exhibits remarkable morphological heterogeneity. Once the outline of an irregular shape is identified and segmented from a digital image, several mathematical descriptors can be applied to numerically characterize the irregularity of the shape's surface. 13072 platelet outlines (PLO) were segmented automatically from 1928 microscopic images using a newly developed algorithm for the software product Matlab R2012b. The fractal dimension (FD), circularity, eccentricity, area and perimeter of each PLO were determined. 972 PLO were randomly assigned for computer-assisted manual measurement of platelet diameter as well as number, width and length of filopodia per platelet. FD can be used as a surrogate parameter for determining the roughness of the PLO and circularity can be used as a surrogate to estimate the number and length of filopodia. The relationship between FD and perimeter of the PLO reveals the existence of distinct groups of platelets with significant structural differences which may be caused by platelet activation. This new method allows for the standardized continuous numerical classification of platelet shape and its dynamic change, which is useful for the analysis of altered platelet activity (e.g. inflammatory diseases, contact activation, drug testing). PMID:24224894
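
    As an illustration of the Euclidean descriptors named in this abstract, the sketch below computes area, perimeter and circularity for a segmented outline given as polygon vertices. The abstract does not state which circularity formula was used; the common definition 4πA/P² is assumed here, and the function names are hypothetical.

```python
import math


def polygon_area_perimeter(points):
    """Return (area, perimeter) of a closed polygon given as (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the outline
        twice_area += x0 * y1 - x1 * y0  # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter


def circularity(points):
    """Assumed definition: 4*pi*A / P**2, equal to 1 for a perfect circle."""
    area, perimeter = polygon_area_perimeter(points)
    if perimeter == 0.0:
        raise ValueError("outline has zero perimeter")
    return 4.0 * math.pi * area / perimeter ** 2
```

    Under this definition a smooth, round outline scores close to 1, while an outline with long protrusions has a large perimeter relative to its area and scores lower, which is consistent with circularity serving as a surrogate for filopodia.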

  19. [Molecular structure of luminal diuretic receptors].

    PubMed

    Gamba, G

    1995-01-01

    Since day to day sodium and water intake is more or less constant, the output by urinary sodium excretion is the key to maintain extracellular fluid volume within physiologic ranges. To achieve this goal, the kidneys ensure that most of the large quantities of filtered sodium are reabsorbed, a function that takes place in the proximal tubule, the loop of Henle and the distal tubule, and then the kidneys adjust the small amount of sodium that is excreted in urine in such a way that sodium balance is maintained. This adjustment occurs in the collecting duct. Three groups of diuretic-sensitive sodium transport mechanisms have been identified in the apical membranes of the distal nephron based on their different sensitivities to diuretics and requirements for chloride and potassium: 1) the sulfamoylbenzoic (or bumetanide)-sensitive Na+:K+:2Cl- and Na+:Cl- symporters in the thick ascending loop of Henle; 2) the benzothiadiazine (or thiazide)-sensitive Na+:Cl- cotransporter in the distal tubule; and 3) the amiloride-sensitive Na+ channel in the collecting tubule. The inhibition of any one of these proteins by diuretics results in increased sodium urinary excretion. Recently, the use of molecular biology techniques, especially functional expression cloning in Xenopus laevis oocytes, has led to the identification of cDNAs encoding members of the three groups of diuretic-sensitive transport proteins. The present paper reviews the primary structure and some aspects of the relationship between structure and function of these transporters as well as the new protein families emerging from these sequences. It also discusses the future implications of these discoveries on the physiology and pathophysiology of kidney disease and sodium retaining states. PMID:7569367

  20. Filamentary structure in the Orion molecular cloud

    NASA Astrophysics Data System (ADS)

    Bally, J.; Dragovan, M.; Langer, W. D.; Stark, A. A.; Wilson, R. W.

    1986-10-01

    A large scale 13CO map (containing 33,000 spectra) of the giant molecular cloud located in the southern part of Orion is presented which contains the Orion Nebula, NGC1977, and the L1641 dark cloud complex. The overall structure of the cloud is filamentary, with individual features having a length up to 40 times their width. This morphology may result from the effects of star formation in the region or embedded magnetic fields in the cloud. We suggest a simple picture for the evolution of the Orion-A cloud and the formation of the major filament. A rotating proto-cloud (counter rotating with respect to the galaxy) contains a B-field aligned with the galactic plane. The northern portion of this cloud collapsed first, perhaps triggered by the pressure of the Ori I OB association. The magnetic field combined with the anisotropic pressure produced by the OB association breaks the symmetry of the pancake instability, so a filament rather than a disc is produced. The growth of instabilities in the filament formed sub-condensations which are recent sites of star formation.

  1. Molecular structure of brown-dwarf disks

    NASA Astrophysics Data System (ADS)

    Wiebe, D. S.; Semenov, D. A.; Henning, T.

    2008-11-01

    We describe typical features of the chemical composition of proto-planetary disks around brown dwarfs. We model the chemical evolution in the disks around a low-mass T Tauri star and a cooler brown dwarf over a time span of 1 Myr using a model for the physical structure of an accretion disk with a vertical temperature gradient and an extensive set of gas-phase chemical reactions. We find that the disks of T Tauri stars are, in general, hotter and denser than the disks of lower-luminosity substellar objects. In addition, they have more pronounced vertical temperature gradients. The atmospheres of the disks around low-mass stars are more strongly ionized by UV and X-ray radiation, while less dense brown-dwarf disks have higher fractional ionizations in their midplanes. Nevertheless, in both cases, most molecules are concentrated in the so-called warm molecular layer between the ionized atmosphere and cold midplane, where grains with ice mantles are abundant.

  2. Filamentary structure in the Orion molecular cloud

    NASA Technical Reports Server (NTRS)

    Bally, J.; Langer, W. D.

    1986-01-01

    A large scale 13CO map (containing 33,000 spectra) of the giant molecular cloud located in the southern part of Orion is presented which contains the Orion Nebula, NGC1977, and the L1641 dark cloud complex. The overall structure of the cloud is filamentary, with individual features having a length up to 40 times their width. This morphology may result from the effects of star formation in the region or embedded magnetic fields in the cloud. We suggest a simple picture for the evolution of the Orion-A cloud and the formation of the major filament. A rotating proto-cloud (counter rotating with respect to the galaxy) contains a B-field aligned with the galactic plane. The northern portion of this cloud collapsed first, perhaps triggered by the pressure of the Ori I OB association. The magnetic field combined with the anisotropic pressure produced by the OB association breaks the symmetry of the pancake instability, so a filament rather than a disc is produced. The growth of instabilities in the filament formed sub-condensations which are recent sites of star formation.

  3. The Determination of Molecular Structure from Rotational Spectra

    DOE R&D Accomplishments Database

    Laurie, V. W.; Herschbach, D. R.

    1962-07-01

    An analysis is presented concerning the average molecular configuration variations and their effects on molecular structure determinations. It is noted that the isotopic dependence of the zero-point is often primarily governed by the isotopic variation of the average molecular configuration. (J.R.D.)

  4. Quantitative Structure-Cytotoxicity Relationship of Bioactive Heterocycles by the Semi-empirical Molecular Orbital Method with the Concept of Absolute Hardness

    NASA Astrophysics Data System (ADS)

    Ishihara, Mariko; Sakagami, Hiroshi; Kawase, Masami; Motohashi, Noboru

    The relationship between the cytotoxicity of N-heterocycles (13 4-trifluoromethylimidazole, 15 phenoxazine and 12 5-trifluoromethyloxazole derivatives), O-heterocycles (11 3-formylchromone and 20 coumarin derivatives) and seven vitamin K2 derivatives against eight tumor cell lines (HSC-2, HSC-3, HSC-4, T98G, HSG, HepG2, HL-60, MT-4) and a maximum of 15 chemical descriptors was investigated using CAChe Worksystem 4.9 project reader. After determination of the conformation of these compounds and approximation to the molecular form present in vivo (biomimetic) by CONFLEX5, the most stable structure was determined by CAChe Worksystem 4.9 MOPAC (PM3). The present study demonstrates the best relationship between the cytotoxic activity and molecular shape or molecular weight of these compounds. Their biological activities can be estimated by hardness and softness, and by using η-χ activity diagrams.
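
    For readers unfamiliar with the quantities plotted in an η-χ diagram, the standard Parr-Pearson definitions are summarized below. The abstract does not say how the values were obtained from the PM3 output; estimating them from frontier orbital energies via Koopmans' approximation is an assumption made here for illustration.

```latex
% Absolute hardness (eta) and absolute electronegativity (chi)
% in terms of ionization potential I and electron affinity A:
\eta = \frac{I - A}{2}, \qquad \chi = \frac{I + A}{2}

% Assumed Koopmans-type estimate from frontier orbital energies,
% I \approx -E_{\mathrm{HOMO}}, \; A \approx -E_{\mathrm{LUMO}}:
\eta \approx \frac{E_{\mathrm{LUMO}} - E_{\mathrm{HOMO}}}{2}, \qquad
\chi \approx -\frac{E_{\mathrm{HOMO}} + E_{\mathrm{LUMO}}}{2}
```

    Softness is the reciprocal of hardness, up to a convention-dependent constant factor, so a compound with a small HOMO-LUMO gap is soft and one with a large gap is hard.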

  5. Meaningful structural descriptors from charge density.

    PubMed

    Stalke, Dietmar

    2011-08-16

    This paper provides a short introduction to the basics of electron density investigations. The two predominant approaches for the modelling and various interpretations of electron density distributions are presented. Their potential translations into chemical concepts are explained. The focus of the article lies on the deduction of chemical properties from charge density studies in some selected main group compounds. The relationship between the obtained numerical data and commonly accepted simple chemical concepts unfortunately is not always straightforward, and often the chemist relies on heuristic connections rather than rigorously defined ones. This article tries to demonstrate how charge density analyses can shed light on aspects of chemical bonding and reactivity resulting from the determined bonding situation. Sometimes this helps to identify misconceptions and sets the scene for new unconventional synthetic approaches. PMID:21717511

  6. Molecular cloning of chicken aggrecan. Structural analyses.

    PubMed Central

    Chandrasekaran, L; Tanzer, M L

    1992-01-01

    The large, aggregating chondroitin sulphate proteoglycan of cartilage, aggrecan, has served as a generic model of proteoglycan structure. Molecular cloning of aggrecans has further defined their amino acid sequences and domain structures. In this study, we have obtained the complete coding sequence of chicken sternal cartilage aggrecan by a combination of cDNA and genomic DNA sequencing. The composite sequence is 6117 bp in length, encoding 1951 amino acids. Comparison of chicken aggrecan protein primary structure with rat, human and bovine aggrecans has disclosed both similarities and differences. The domains which are most highly conserved at 70-80% identity are the N-terminal domains G1 and G2 and the C-terminal domain G3. The chondroitin sulphate domain of chicken aggrecan is smaller than that of rat and human aggrecans and has very distinctive repeat sequences. It has two separate sections, one comprising 12 consecutive Ser-Gly-Glu repeats of 20 amino acids each, adjacent to the other which has 23 discontinuous Ser-Gly-Glu repeats of 10 amino acids each; this latter region, N-terminal to the former one, appears to be unique to chicken aggrecan. The two regions contain a total of 94 potential chondroitin sulphate attachment sites. Genomic comparison shows that, although chicken exons 11-14 are identical in size to the rat and human exons, chicken exon 10 is the smallest of the three species. This is also reflected in the size of its chondroitin sulphate coding region and in the total number of Ser-Gly pairs. The putative keratan sulphate domain shows 31-45% identity with the other species and lacks the repetitive sequences seen in the others. In summary, while the linear arrangement of specific domains of chicken aggrecan is identical to that in the aggrecans of other species, and while there is considerable identity of three separate domains, chicken aggrecan demonstrates unique features, notably in its chondroitin sulphate domain and its keratan sulphate

  7. Automated detection of microaneurysms using robust blob descriptors

    NASA Astrophysics Data System (ADS)

    Adal, K.; Ali, S.; Sidibé, D.; Karnowski, T.; Chaum, E.; Mériaudeau, F.

    2013-03-01

    Microaneurysms (MAs) are among the first signs of diabetic retinopathy (DR) that can be seen as round dark-red structures in digital color fundus photographs of retina. In recent years, automated computer-aided detection and diagnosis (CAD) of MAs has attracted many researchers due to its low-cost and versatile nature. In this paper, the MA detection problem is modeled as finding interest points from a given image and several interest point descriptors are introduced and integrated with machine learning techniques to detect MAs. The proposed approach starts by applying a novel fundus image contrast enhancement technique using Singular Value Decomposition (SVD) of fundus images. Then, a Hessian-based candidate selection algorithm is applied to extract image regions which are more likely to be MAs. For each candidate region, robust low-level blob descriptors such as Speeded Up Robust Features (SURF) and Intensity Normalized Radon Transform are extracted to characterize candidate MA regions. The combined features are then classified using an SVM which has been trained using ten manually annotated training images. The performance of the overall system is evaluated on the Retinopathy Online Challenge (ROC) competition database. Preliminary results show the competitiveness of the proposed candidate selection techniques against state-of-the-art methods as well as the promising future for the proposed descriptors to be used in the localization of MAs from fundus images.

  8. Replenishing data descriptors in a DMA injection FIFO buffer

    DOEpatents

    Archer, Charles J.; Blocksome, Michael A.; Cernohous, Bob R.; Heidelberger, Philip; Kumar, Sameer; Parker, Jeffrey J.

    2011-10-11

    Methods, apparatus, and products are disclosed for replenishing data descriptors in a Direct Memory Access (`DMA`) injection first-in-first-out (`FIFO`) buffer that include: determining, by a messaging module on an origin compute node, whether a number of data descriptors in a DMA injection FIFO buffer exceeds a predetermined threshold, each data descriptor specifying an application message for transmission to a target compute node; queuing, by the messaging module, a plurality of new data descriptors in a pending descriptor queue if the number of the data descriptors in the DMA injection FIFO buffer exceeds the predetermined threshold; establishing, by the messaging module, interrupt criteria that specify when to replenish the injection FIFO buffer with the plurality of new data descriptors in the pending descriptor queue; and injecting, by the messaging module, the plurality of new data descriptors into the injection FIFO buffer in dependence upon the interrupt criteria.
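
    The control flow described in this abstract can be sketched as a small software model. This is a minimal illustration only, not the patented implementation: the class name, the capacity limit, the choice of "FIFO occupancy has dropped to the threshold" as the interrupt criterion, and the direct injection when the threshold is not exceeded are all assumptions.

```python
from collections import deque


class InjectionFifoModel:
    """Toy model of a DMA injection FIFO with a pending descriptor queue."""

    def __init__(self, capacity, threshold):
        self.capacity = capacity      # assumed hard limit of the FIFO
        self.threshold = threshold    # the "predetermined threshold"
        self.fifo = deque()           # descriptors visible to the DMA engine
        self.pending = deque()        # descriptors waiting to be injected
        self.interrupt_level = None   # assumed criterion: fire at this occupancy

    def submit(self, descriptors):
        """Messaging-module side: accept new descriptors for transmission."""
        self.pending.extend(descriptors)
        if len(self.fifo) > self.threshold:
            # FIFO is too full: leave the descriptors queued and arm the interrupt.
            self.interrupt_level = self.threshold
        else:
            self._inject()

    def _inject(self):
        """Move pending descriptors into the FIFO while space remains."""
        while self.pending and len(self.fifo) < self.capacity:
            self.fifo.append(self.pending.popleft())
        # Keep the interrupt armed only if something is still waiting.
        self.interrupt_level = self.threshold if self.pending else None

    def hardware_consume(self):
        """DMA-engine side: process one descriptor, then check the criterion."""
        if not self.fifo:
            return None
        descriptor = self.fifo.popleft()
        if self.interrupt_level is not None and len(self.fifo) <= self.interrupt_level:
            self._inject()  # models the interrupt handler replenishing the FIFO
        return descriptor
```

    Deferring injection until an interrupt fires keeps the messaging module from polling a nearly full buffer; descriptors still leave the pending queue in the order in which they were submitted.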

  9. Molecular structure and motion in zero field magnetic resonance

    SciTech Connect

    Jarvie, T.P.

    1989-10-01

    Zero field magnetic resonance is well suited for the determination of molecular structure and the study of motion in disordered materials. Experiments performed in zero applied magnetic field avoid the anisotropic broadening in high field nuclear magnetic resonance (NMR) experiments. As a result, molecular structure and subtle effects of motion are more readily observed.

  10. Analysis of activity space by fragment fingerprints, 2D descriptors, and multitarget dependent transformation of 2D descriptors.

    PubMed

    Givehchi, Alireza; Bender, Andreas; Glen, Robert C

    2006-01-01

    The effect of multitarget dependent descriptor transformation on classification performance is explored in this work. To this end decision trees as well as neural net QSAR in combination with PLS were applied to predict the activity class of 5HT3 ligands, angiotensin converting enzyme inhibitors, 3-hydroxyl-3-methyl glutaryl coenzyme A reductase inhibitors, platelet activating factor antagonists, and thromboxane A2 antagonists. Physicochemical descriptors calculated by MOE and fragment-based descriptors (MOLPRINT 2D) were employed to generate descriptor vectors. In a subsequent step the physicochemical descriptor vectors were transformed to a lower dimensional space using multitarget dependent descriptor transformation. Cross-validation of the original physicochemical descriptors in combination with decision trees and neural net QSAR as well as cross-validation of PLS multitarget transformed descriptors with neural net QSAR were performed. For comparison this was repeated using fragment-based descriptors in combination with decision trees. PMID:16711727

  11. Molecular clouds and galactic spiral structure

    NASA Technical Reports Server (NTRS)

    Dame, T. M.

    1984-01-01

    Galactic CO line emission at 115 GHz was surveyed in order to study the distribution of molecular clouds in the inner galaxy. Comparison of this survey with similar H I data reveals a detailed correlation with the most intense 21 cm features. To each of the classical 21 cm H I spiral arms of the inner galaxy there corresponds a CO molecular arm which is generally more clearly defined and of higher contrast. A simple model is devised for the galactic distribution of molecular clouds. The modeling results suggest that molecular clouds are essentially transient objects, existing for 15 to 40 million years after their formation in a spiral arm, and are largely confined to spiral features about 300 pc wide.

  12. Unraveling the Molecular Structures of Asphaltenes by Atomic Force Microscopy.

    PubMed

    Schuler, Bruno; Meyer, Gerhard; Peña, Diego; Mullins, Oliver C; Gross, Leo

    2015-08-12

    Petroleum is one of the most precious and complex molecular mixtures existing. Because of its chemical complexity, the solid component of crude oil, the asphaltenes, poses an exceptional challenge for structure analysis, with tremendous economic relevance. Here, we combine atomic-resolution imaging using atomic force microscopy and molecular orbital imaging using scanning tunnelling microscopy to study more than 100 asphaltene molecules. The complexity and range of asphaltene polycyclic aromatic hydrocarbons are established in detail. Identifying molecular structures provides a foundation to understand all aspects of petroleum science from colloidal structure and interfacial interactions to petroleum thermodynamics, enabling a first-principles approach to optimize resource utilization. Particularly, the findings contribute to a long-standing debate about asphaltene molecular architecture. Our technique constitutes a paradigm shift for the analysis of complex molecular mixtures, with possible applications in molecular electronics, organic light emitting diodes, and photovoltaic devices. PMID:26170086

  13. Design, synthesis, crystal structure, insecticidal activity, molecular docking, and QSAR studies of novel N3-substituted imidacloprid derivatives.

    PubMed

    Wang, Mei-Juan; Zhao, Xiao-Bo; Wu, Dan; Liu, Ying-Qian; Zhang, Yan; Nan, Xiang; Liu, Huanxiang; Yu, Hai-Tao; Hu, Guan-Fang; Yan, Li-Ting

    2014-06-18

    Three novel series of N3-substituted imidacloprid derivatives were designed and synthesized, and their structures were identified on the basis of satisfactory analytical and spectral (¹H NMR, ¹³C NMR, MS, elemental analysis, and X-ray) data. Preliminary bioassays indicated that all of the derivatives exhibited significant insecticidal activities against Aphis craccivora, with LC50 values ranging from 0.00895 to 0.49947 mmol/L, and the insecticidal activities of some of them were comparable to those of the control imidacloprid. Some key structural features related to their insecticidal activities were identified, and the binding modes between target compounds and the nAChR model were also further explored by molecular docking. By comparing the interaction features of imidacloprid and compound 26, which has the highest insecticidal activity, the origin of the high insecticidal activity of compound 26 was identified. On the basis of the conformations generated by molecular docking, a satisfactory 2D-QSAR model with six selected descriptors was built using the genetic algorithm-multiple linear regression (GA-MLR) method. The analysis of the built model showed that molecular size, shape, and the ability to form hydrogen bonds were important for insecticidal potency. The information obtained in the study will be very helpful for the design of new derivatives with high insecticidal activities. PMID:24834971

  14. Dual-tree complex wavelet transform applied on color descriptors for remote-sensed images retrieval

    NASA Astrophysics Data System (ADS)

    Sebai, Houria; Kourgli, Assia; Serir, Amina

    2015-01-01

    This paper highlights color component features that improve high-resolution satellite (HRS) images retrieval. Color component correlation across image lines and columns is used to define a revised color space. It is designed to simultaneously take both color and neighborhood information. From this space, color descriptors, namely rotation invariant uniform local binary pattern, histogram of gradient, and a modified version of local variance are derived through dual-tree complex wavelet transform (DT-CWT). A new color descriptor called smoothed local variance (SLV) using an edge-preserving smoothing filter is introduced. It is intended to offer an efficient way to represent texture/structure information using an invariant to rotation descriptor. This descriptor takes advantage of DT-CWT representation to enhance the retrieval performance of HRS images. We report an evaluation of the SLV descriptor associated with the new color space using different similarity distances in our content-based image retrieval scheme. We also perform comparison with some standard features. Experimental results show that SLV descriptor allied to DT-CWT representation outperforms the other approaches.

  15. Entropy descriptors and Entropy Stabilized Oxides

    NASA Astrophysics Data System (ADS)

    Curtarolo, Stefano

    In this presentation we will discuss the development of entropy descriptors for the AFLOWLIB.org ab-initio repository and the path leading to the synthesis of the novel entropy stabilized oxides. [Nat. Comm. 6:8485 (2015)]. Research sponsored by DOD-ONR N000141310635 and N000141512863.

  16. Local Pyramidal Descriptors for Image Recognition.

    PubMed

    Seidenari, Lorenzo; Serra, Giuseppe; Bagdanov, Andrew D; Del Bimbo, Alberto

    2014-05-01

    In this paper, we present a novel method to improve the flexibility of descriptor matching for image recognition by using local multiresolution pyramids in feature space. We propose that image patches be represented at multiple levels of descriptor detail and that these levels be defined in terms of local spatial pooling resolution. Preserving multiple levels of detail in local descriptors is a way of hedging one's bets on which levels will be most relevant for matching during learning and recognition. We introduce the Pyramid SIFT (P-SIFT) descriptor and show that its use in four state-of-the-art image recognition pipelines improves accuracy and yields state-of-the-art results. Our technique is applicable independently of spatial pyramid matching and we show that spatial pyramids can be combined with local pyramids to obtain further improvement. We achieve state-of-the-art results on Caltech-101 (80.1%) and Caltech-256 (52.6%) when compared to other approaches based on SIFT features over intensity images. Our technique is efficient and is extremely easy to integrate into image recognition pipelines. PMID:26353235

  17. Local Pyramidal Descriptors for Image Recognition.

    PubMed

    Seidenari, Lorenzo; Serra, Giuseppe; Bagdanov, Andrew D; Del Bimbo, Alberto

    2013-11-22

    In this paper we present a novel method to improve the flexibility of descriptor matching for image recognition by using local multiresolution pyramids in feature space. We propose that image patches be represented at multiple levels of descriptor detail and that these levels be defined in terms of local spatial pooling resolution. Preserving multiple levels of detail in local descriptors is a way of hedging one's bets on which levels will be most relevant for matching during learning and recognition. We introduce the Pyramid SIFT (P-SIFT) descriptor and show that its use in four state-of-the-art image recognition pipelines improves accuracy and yields state-of-the-art results. Our technique is applicable independently of spatial pyramid matching and we show that spatial pyramids can be combined with local pyramids to obtain further improvement. We achieve state-of-the-art results on Caltech-101 (80.1%) and Caltech-256 (52.6%) when compared to other approaches based on SIFT features over intensity images. Our technique is efficient and is extremely easy to integrate into image recognition pipelines. PMID:24277944

  18. Ear biometric recognition using local texture descriptors

    NASA Astrophysics Data System (ADS)

    Benzaoui, Amir; Hadid, Abdenour; Boukrouche, Abdelhani

    2014-09-01

    Automated personal identification using the shape of the human ear is emerging as an appealing modality in biometric and forensic domains. This is mainly due to the fact that the ear pattern can provide rich and stable information to differentiate and recognize people. In the literature, there are many approaches and descriptors that achieve relatively good results in constrained environments. The recognition performance tends, however, to significantly decrease under illumination variation, pose variation, and partial occlusion. In this work, we investigate the use of local texture descriptors, namely local binary patterns, local phase quantization, and binarized statistical image features for robust human identification from two-dimensional ear imaging. In contrast to global image descriptors which compute features directly from the entire image, local descriptors representing the features in small local image patches have proven to be more effective in real-world conditions. Our extensive experimental results on the benchmarks IIT Delhi-1, IIT Delhi-2, and USTB ear databases show that local texture features in general and BSIF in particular provide a significant performance improvement compared to the state-of-the-art.
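
    To make the notion of a local texture descriptor concrete, the sketch below computes the basic 8-neighbour local binary pattern (LBP) code and a 256-bin histogram for a gray-level image stored as a list of rows. It illustrates the textbook operator only; the bit ordering, the use of `>=` for thresholding, and the function names are assumptions, and the paper's own variants and parameters are not reproduced here.

```python
# Neighbour positions in a 3x3 patch, clockwise from the top-left corner.
_NEIGHBOURS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def lbp_code(patch):
    """Return the 8-bit LBP code of the centre pixel of a 3x3 patch."""
    centre = patch[1][1]
    code = 0
    for bit, (r, c) in enumerate(_NEIGHBOURS):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[r][c] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels of a 2D image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

    Because each code depends only on the sign of local intensity differences, the histogram is unchanged by monotonic brightness changes, which is one reason such descriptors tolerate illumination variation better than raw pixel values. In practice the image is usually split into blocks and the per-block histograms are concatenated into the final feature vector.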

  19. Molecular structural studies of human factor VIII.

    PubMed

    McKee, P A; Andersen, J C; Switzer, M E

    1975-01-20

    Neither normal nor hemophilic factor VIII protein enters a 5% sodium dodecyl sulfate gel; on reduction, however, a single 195 000-molecular-weight peptide is observed. Hemophilic and normal factor VIII contain carbohydrate and appear identical in subunit molecular weight, electrical charge, and major antigenic determinants. Thrombin activation and inactivation of factor VIII does not detectably change the subunit molecular weight. Trypsin causes similar activity changes and obviously cleaves the factor VIII subunit. Human plasmin destroys factor VIII procoagulant activity and degrades the factor VIII subunit to 103 000-, 88 000-, and 17 000-molecular-weight peptides. Both normal and hemophilic factor VIII as well as thrombin-inactivated factor VIII support ristocetin-induced platelet aggregation. Purified factor VIII chromatographed on 4% agarose in 1.0 M sodium chloride shows no dissociation of the procoagulant activity from the void volume protein. Gel chromatography on 4% agarose in 0.25 M calcium chloride results in a procoagulant activity peak removed from the void volume protein; both peaks contain protein which does not enter a 5% SDS gel, but on reduction a 195 000-molecular-weight subunit band is observed for each. Both the void volume protein peak and the procoagulant activity peak from the 0.25 M calcium chloride-agarose gel column support ristocetin-induced platelet aggregation. After removal of calcium, a small amount of procoagulant activity is present only in the void volume peak. These data suggest that both the procoagulant and von Willebrand activities are on the same molecule. Thus our previous conclusion remains the same: human factor VIII is a large glycoprotein composed of identical 195 000-molecular-weight subunits joined by disulfide bonds and is responsible for both antihemophilic and von Willebrand activities in human plasma. PMID:122889

  20. Protein-protein docking using region-based 3D Zernike descriptors

    PubMed Central

    2009-01-01

    Background: Protein-protein interactions are a pivotal component of many biological processes and mediate a variety of functions. Knowing the tertiary structure of a protein complex is therefore essential for understanding the interaction mechanism. However, solving the structure of a complex experimentally is often difficult. Computational protein-protein docking approaches can provide a useful alternative. Prediction of docking conformations relies on methods that effectively capture shape features of the participating proteins while giving due consideration to conformational changes that may occur. Results: We present a novel protein docking algorithm based on the use of 3D Zernike descriptors as regional features of molecular shape. The key motivation for using these descriptors is their invariance to transformation, in addition to a compact representation of local surface shape characteristics. Docking decoys are generated using geometric hashing and then ranked by a scoring function that incorporates a buried surface area and a novel geometric complementarity term based on normals associated with the 3D Zernike shape description. Our docking algorithm was tested on both bound and unbound cases in the ZDOCK benchmark 2.0 dataset. In 74% of the bound docking predictions, our method was able to find a near-native solution (interface Cα RMSD ≤ 2.5 Å) within the top 1000 ranks. For unbound docking, among the 60 complexes for which our algorithm returned at least one hit, 60% of the cases were ranked within the top 2000. Comparison with existing shape-based docking algorithms shows that our method performs better than the others in unbound docking while remaining competitive for bound docking cases. Conclusion: We show for the first time that the 3D Zernike descriptors are adept at capturing shape complementarity at the protein-protein interface and useful for protein docking prediction.
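
    The near-native criterion quoted above (interface Cα RMSD ≤ 2.5 Å) can be illustrated with a minimal Python sketch. It assumes the two sets of interface Cα coordinates are already matched atom by atom and already superimposed; the function names and the example coordinates are hypothetical and are not taken from the paper.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes both lists have the same length and ordering and that the
    structures are already superimposed (no fitting is done here).
    """
    squared = [
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    ]
    return math.sqrt(sum(squared) / len(squared))

def is_near_native(decoy, native, cutoff=2.5):
    """Apply the cutoff in angstroms quoted in the abstract."""
    return rmsd(decoy, native) <= cutoff

# Hypothetical interface C-alpha coordinates (angstroms)
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
decoy = [(x + 1.0, y, z) for x, y, z in native]  # rigid 1 angstrom shift
print(rmsd(decoy, native), is_near_native(decoy, native))
```

    A full evaluation would first superimpose the decoy on the native complex; that step is left out here to keep the sketch short.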

  1. Colour Chemistry, Part I, Principles, Colour, and Molecular Structure

    ERIC Educational Resources Information Center

    Hallas, G.

    1975-01-01

    Discusses various topics in color chemistry, including the electromagnetic spectrum, the absorption and reflection of light, additive and subtractive color mixing, and the molecular structure of simple colored substances. (MLH)

  2. Modeling Polymorphic Molecular Crystals with Electronic Structure Theory.

    PubMed

    Beran, Gregory J O

    2016-05-11

    Interest in molecular crystals has grown thanks to their relevance to pharmaceuticals, organic semiconductor materials, foods, and many other applications. Electronic structure methods have become an increasingly important tool for modeling molecular crystals and polymorphism. This article reviews electronic structure techniques used to model molecular crystals, including periodic density functional theory, periodic second-order Møller-Plesset perturbation theory, fragment-based electronic structure methods, and diffusion Monte Carlo. It also discusses the use of these models for predicting a variety of crystal properties that are relevant to the study of polymorphism, including lattice energies, structures, crystal structure prediction, polymorphism, phase diagrams, vibrational spectroscopies, and nuclear magnetic resonance spectroscopy. Finally, tools for analyzing crystal structures and intermolecular interactions are briefly discussed. PMID:27008426

  3. Predictive Modeling of Antioxidant Coumarin Derivatives Using Multiple Approaches: Descriptor-Based QSAR, 3D-Pharmacophore Mapping, and HQSAR.

    PubMed

    Mitra, Indrani; Saha, Achintya; Roy, Kunal

    2013-03-01

    The inability of the systemic antioxidants to alleviate the exacerbation of free radical formation from metabolic outputs and environmental pollutants claims an urgent demand for the identification and design of new chemical entities with potent antioxidant activity. In the present work, different QSAR approaches have been utilized for identifying the essential structural attributes imparting a potential antioxidant activity profile of the coumarin derivatives. The descriptor-based QSAR model provides a quantitative outline regarding the structural prerequisites of the molecules, while 3D pharmacophore and HQSAR models emphasize the favourable spatial arrangement of the various chemical features and the crucial molecular fragments, respectively. All the models infer that the fused benzene ring and the oxygen atom of the pyran ring constituting the parent coumarin nucleus capture the prime pharmacophoric features, imparting superior antioxidant activity to the molecules. The developed models may serve as indispensable query tools for screening untested molecules belonging to the class of coumarin derivatives. PMID:23641329

  4. Models for anti-tumor activity of bisphosphonates using refined topochemical descriptors

    NASA Astrophysics Data System (ADS)

    Goyal, Rakesh K.; Singh, G.; Madan, A. K.

    2011-10-01

    An in silico approach comprising decision tree (DT), random forest (RF) and moving average analysis (MAA) was successfully employed for development of models for prediction of anti-tumor activity of bisphosphonates. A dataset consisting of 65 analogues of both nitrogen-containing and non-nitrogen-containing bisphosphonates was selected for the present study. Four refinements of the eccentric distance sum topochemical index, termed augmented eccentric distance sum topochemical indices 1-4 (ξ_{1c}^{ADS}, ξ_{2c}^{ADS}, ξ_{3c}^{ADS}, ξ_{4c}^{ADS}), have been proposed so as to significantly augment discriminating power. The proposed topological indices (TIs) along with the existing TIs (>1,400) were subsequently utilized for development of models for prediction of anti-tumor activity of bisphosphonates. A total of 43 descriptors of diverse nature, from a large pool of molecular descriptors calculated through E-Dragon software (version 1.0) and an in-house computer program, were selected for development of suitable models by employing DT, RF and MAA. DT identified two TIs as most important and classified the analogues of the dataset with an accuracy of 97% in the training set and 90.7% in the tenfold cross-validated set. Random forest correctly classified the analogues with an accuracy of 89.2%. Four independent models developed through MAA predicted the activity of analogues of the dataset with an accuracy of 87.6% to 89%. The statistical significance of the proposed models was assessed through intercorrelation analysis, specificity, sensitivity and the Matthews correlation coefficient. The proposed models offer a vast potential for providing lead structures for development of potent anti-tumor agents for treatment of cancer that has spread to the bone.
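
    The validation statistics named above (sensitivity, specificity and the Matthews correlation coefficient) follow directly from a binary confusion matrix. The Python sketch below uses the standard textbook definitions; the function name and the counts are hypothetical and are not results from the paper.

```python
import math

def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # The MCC denominator is zero when any row or column of the matrix is
    # empty; returning 0.0 in that case is a common convention.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

# Hypothetical counts for a 65-compound set (illustration only)
sens, spec, mcc = confusion_metrics(tp=30, fp=3, tn=28, fn=4)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

    Unlike plain accuracy, the Matthews coefficient stays informative when the active and inactive classes are unbalanced, which is one reason it is often reported alongside sensitivity and specificity.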

  5. Feature Point Descriptors: Infrared and Visible Spectra

    PubMed Central

    Ricaurte, Pablo; Chilán, Carmen; Aguilera-Carrasco, Cristhian A.; Vintimilla, Boris X.; Sappa, Angel D.

    2014-01-01

    This manuscript evaluates the behavior of classical feature point descriptors when they are used on images from the long-wave infrared spectral band and compares them with the results obtained in the visible spectrum. Robustness to changes in rotation, scaling, blur, and additive noise is analyzed using a state-of-the-art framework. Experimental results using a cross-spectral outdoor image data set are presented and conclusions from these experiments are given. PMID:24566634

  6. Instructional Approach to Molecular Electronic Structure Theory

    ERIC Educational Resources Information Center

    Dykstra, Clifford E.; Schaefer, Henry F.

    1977-01-01

    Describes a graduate quantum mechanics project in which students write a computer program that performs ab initio calculations on the electronic structure of a simple molecule. Theoretical potential energy curves are produced. (MLH)

  7. Synthesis and molecular structure of gold triarylcorroles.

    PubMed

    Thomas, Kolle E; Alemayehu, Abraham B; Conradie, Jeanet; Beavers, Christine; Ghosh, Abhik

    2011-12-19

    A number of third-row transition-metal corroles have remained elusive as synthetic targets until now, notably osmium, platinum, and gold corroles. Against this backdrop, we present a simple and general synthesis of β-unsubstituted gold(III) triarylcorroles and the first X-ray crystal structure of such a complex. Comparison with analogous copper and silver corrole structures, supplemented by extensive scalar-relativistic, dispersion-corrected density functional theory calculations, suggests that "inherent saddling" may occur for all coinage metal corroles. The degree of saddling, however, varies considerably among the three metals, decreasing conspicuously along the series Cu > Ag > Au. The structural differences reflect significant differences in metal-corrole bonding, which are also reflected in the electrochemistry and electronic absorption spectra of the complexes. From Cu to Au, the electronic structure changes from noninnocent metal(II)-corrole(•2-) to relatively innocent metal(III)-corrole(3-). PMID:22111600

  8. DFT analysis on the molecular structure, vibrational and electronic spectra of 2-(cyclohexylamino)ethanesulfonic acid.

    PubMed

    Renuga Devi, T S; Sharmi kumar, J; Ramkumaar, G R

    2015-02-25

    The FTIR and FT-Raman spectra of 2-(cyclohexylamino)ethanesulfonic acid were recorded in the regions 4000-400 cm⁻¹ and 4000-50 cm⁻¹, respectively. The structural and spectroscopic data of the molecule in the ground state were calculated using the Hartree-Fock and density functional (B3LYP) methods with the correlation-consistent polarized valence double-zeta (cc-pVDZ) basis set and the 6-311++G(d,p) basis set. The most stable conformer was optimized and the structural and vibrational parameters were determined based on this. The complete assignments were performed based on the Potential Energy Distribution (PED) of the vibrational modes, calculated using the Vibrational Energy Distribution Analysis (VEDA) 4 program. With the observed FTIR and FT-Raman data, a complete vibrational assignment and analysis of the fundamental modes of the compound were carried out. Thermodynamic properties and atomic charges were calculated using both the Hartree-Fock and density functional methods with the cc-pVDZ basis set and compared. The calculated HOMO-LUMO energy gap revealed that charge transfer occurs within the molecule. ¹H and ¹³C NMR chemical shifts of the molecule were calculated using the Gauge-Including Atomic Orbital (GIAO) method and were compared with experimental results. Stability of the molecule arising from hyperconjugative interactions and charge delocalization was analyzed using Natural Bond Orbital (NBO) analysis. The first-order hyperpolarizability (β) and Molecular Electrostatic Potential (MEP) of the molecule were computed using DFT calculations. Electron-density-based local reactivity descriptors such as Fukui functions were calculated to explain the chemical reactivity sites in the molecule. PMID:25262144

  9. DFT analysis on the molecular structure, vibrational and electronic spectra of 2-(cyclohexylamino)ethanesulfonic acid

    NASA Astrophysics Data System (ADS)

    Renuga Devi, T. S.; Sharmi kumar, J.; Ramkumaar, G. R.

    2015-02-01

    The FTIR and FT-Raman spectra of 2-(cyclohexylamino)ethanesulfonic acid were recorded in the regions 4000-400 cm⁻¹ and 4000-50 cm⁻¹, respectively. The structural and spectroscopic data of the molecule in the ground state were calculated using the Hartree-Fock and density functional (B3LYP) methods with the correlation-consistent polarized valence double-zeta (cc-pVDZ) basis set and the 6-311++G(d,p) basis set. The most stable conformer was optimized and the structural and vibrational parameters were determined based on this. The complete assignments were performed based on the Potential Energy Distribution (PED) of the vibrational modes, calculated using the Vibrational Energy Distribution Analysis (VEDA) 4 program. With the observed FTIR and FT-Raman data, a complete vibrational assignment and analysis of the fundamental modes of the compound were carried out. Thermodynamic properties and atomic charges were calculated using both the Hartree-Fock and density functional methods with the cc-pVDZ basis set and compared. The calculated HOMO-LUMO energy gap revealed that charge transfer occurs within the molecule. ¹H and ¹³C NMR chemical shifts of the molecule were calculated using the Gauge-Including Atomic Orbital (GIAO) method and were compared with experimental results. Stability of the molecule arising from hyperconjugative interactions and charge delocalization was analyzed using Natural Bond Orbital (NBO) analysis. The first-order hyperpolarizability (β) and Molecular Electrostatic Potential (MEP) of the molecule were computed using DFT calculations. Electron-density-based local reactivity descriptors such as Fukui functions were calculated to explain the chemical reactivity sites in the molecule.

  10. A novel biologically inspired local feature descriptor.

    PubMed

    Zhang, Yun; Tian, Tian; Tian, Jinwen; Gong, Junbin; Ming, Delie

    2014-06-01

    A local feature descriptor is a fundamental representation of an image patch and has been extensively used in many computer vision applications. In this paper, in contrast to state-of-the-art features, a novel biologically inspired local descriptor (BILD) is proposed based on the visual information processing mechanism of the ventral pathway in the human brain. The local features used for constructing BILD are extracted by a two-layer network, which corresponds to the simple-to-complex cell hierarchy in the primary visual cortex (V1). It works in a similar way to simple and complex cells, obtaining responses by applying lateral inhibition from different orientations and performing an improved cortical pooling. To enhance the distinctiveness of BILD, we combine the local features from different orientations. Extensive evaluations have been performed for image matching and object recognition. Experimental results reveal that our proposed BILD outperforms many widely used descriptors such as SIFT and SURF, which demonstrates its efficiency for representing local regions. PMID:24677037

  11. A Structural Modelling Study on Marine Sediments Toxicity

    PubMed Central

    Jäntschi, Lorentz; Bolboacã, Sorana D.

    2008-01-01

    Quantitative structure-activity relationship models were obtained by applying the Molecular Descriptor Family approach to eight ordnance compounds with different toxicity on five marine species (Arbacia punctulata, Dinophilus gyrociliatus, Sciaenops ocellatus, opossum shrimp, and Ulva fasciata). Selecting the best of the molecular descriptors generated and calculated from the structures of the ordnance compounds led to accurate monovariate models. The resulting models obtained for six endpoints proved to be accurate in estimation (the squared correlation coefficient varied from 0.8186 to 0.9997) and prediction (the correlation coefficient obtained in leave-one-out analysis varied from 0.7263 to 0.9984). PMID:18728732
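
    The leave-one-out analysis mentioned above can be sketched for a monovariate (single-descriptor) linear model as follows. This is a generic Python illustration, not the Molecular Descriptor Family software; the function names and the eight descriptor/activity pairs are hypothetical, and the squared correlation is used here for both estimation and prediction as a simplifying assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def squared_correlation(a, b):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return sab * sab / (saa * sbb)

def leave_one_out_predictions(xs, ys):
    """Predict each compound from a model fitted on all the other compounds."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds

# Hypothetical descriptor values and activities for eight compounds
descriptor = [0.8, 1.1, 1.9, 2.4, 3.0, 3.7, 4.1, 5.2]
activity = [1.9, 2.4, 3.6, 4.9, 5.8, 7.1, 8.4, 10.1]

slope, intercept = fit_line(descriptor, activity)
fitted = [slope * x + intercept for x in descriptor]
r2_estimation = squared_correlation(fitted, activity)
r2_loo = squared_correlation(leave_one_out_predictions(descriptor, activity), activity)
print(r2_estimation, r2_loo)
```

    With only eight compounds, each left-out point can shift the fit noticeably, so the leave-one-out value is normally the more conservative of the two.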

  12. Prediction of inhibition of the sodium ion-proton antiporter by benzoylguanidine derivatives from molecular structure.

    PubMed

    Kauffman, G W; Jurs, P C

    2000-01-01

    The use of quantitative structure-activity relationships to predict IC50 values of 113 potential Na+/H+ antiporter inhibitors is reported. Multiple linear regression and computational neural networks (CNNs) are used to develop models using a set of information-rich descriptors. The descriptors encode information about topology, geometry, electronics, and combination hybrids. A five-descriptor CNN model with root-mean-square (rms) errors of 0.278 log units for the training set and 0.377 log units for the prediction set was developed. Examination of data set subclasses showed that systematic structural variations were also well-encoded resulting in 100% accuracy of prediction trends. An experiment involving a committee of five CNNs was also performed to examine the effect of network output averaging. This showed improved results decreasing the training and cross-validation set rms error to 0.228 log units and the prediction set rms error to 0.296 log units. PMID:10850779
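
    The committee experiment described above averages the outputs of several networks before scoring them. A minimal Python sketch of that averaging and of the root-mean-square error is given below; the function names are hypothetical and the numbers are invented for illustration, not taken from the study.

```python
import math

def rms_error(predicted, observed):
    """Root-mean-square error between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

def committee_average(member_predictions):
    """Average the per-compound predictions of all committee members."""
    n_members = len(member_predictions)
    return [sum(column) / n_members for column in zip(*member_predictions)]

# Hypothetical log-unit activities and the outputs of two committee members
observed = [5.0, 6.0, 7.0]
members = [
    [5.2, 5.8, 7.3],
    [4.8, 6.2, 6.7],
]

averaged = committee_average(members)
for member in members:
    print("member rms:", rms_error(member, observed))
print("committee rms:", rms_error(averaged, observed))
```

    In this toy case the members err in opposite directions, so averaging cancels the errors; real committees usually show a smaller but still useful reduction.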

  13. Molecular gymnastics: serpin structure, folding and misfolding.

    PubMed

    Whisstock, James C; Bottomley, Stephen P

    2006-12-01

    The native state of serpins represents a long-lived intermediate or metastable structure on the serpin folding pathway. Upon interaction with a protease, the serpin trap is sprung and the molecule continues to fold into a more stable conformation. However, thermodynamic stability can also be achieved through alternative, unproductive folding pathways that result in the formation of inactive conformations. Our increasing understanding of the mechanism of protease inhibition and the dynamics of native serpin structures has begun to reveal how evolution has harnessed the actual process of protein folding (rather than the final folded outcome) to elegantly achieve function. The cost of using metastability for function, however, is an increased propensity for misfolding. PMID:17079131

  14. Molecular Eigensolution Symmetry Analysis and Fine Structure

    PubMed Central

    Harter, William G.; Mitchell, Justin C.

    2013-01-01

    Spectra of high-symmetry molecules contain fine and superfine level cluster structure related to J-tunneling between hills and valleys on rovibronic energy surfaces (RES). Such graphic visualizations help disentangle multi-level dynamics, selection rules, and state mixing effects including widespread violation of nuclear spin symmetry species. A review of RES analysis compares it to that of potential energy surfaces (PES) used in Born–Oppenheimer approximations. Both take advantage of adiabatic coupling in order to visualize Hamiltonian eigensolutions. RES of symmetric and D2 asymmetric top rank-2-tensor Hamiltonians are compared with Oh spherical top rank-4-tensor fine-structure clusters of 6-fold and 8-fold tunneling multiplets. Then extreme 12-fold and 24-fold multiplets are analyzed by RES plots of higher rank tensor Hamiltonians. Such extreme clustering is rare in fundamental bands but prevalent in hot bands, and analysis of its superfine structure requires more efficient labeling and a more powerful group theory. This is introduced using elementary examples involving two groups of order-6 (C6 and D3~C3v), then applied to families of Oh clusters in SF6 spectra and to extreme clusters. PMID:23344041

  15. Molecular Dynamics Simulations and Structural Analysis of Giardia duodenalis 14-3-3 Protein-Protein Interactions.

    PubMed

    Cau, Ylenia; Fiorillo, Annarita; Mori, Mattia; Ilari, Andrea; Botta, Maurizo; Lalle, Marco

    2015-12-28

    Giardiasis is a gastrointestinal diarrheal illness caused by the protozoan parasite Giardia duodenalis, which affects annually over 200 million people worldwide. The limited antigiardial drug arsenal and the emergence of clinical cases refractory to standard treatments dictate the need for new chemotherapeutics. The 14-3-3 family of regulatory proteins, extensively involved in protein-protein interactions (PPIs) with pSer/pThr clients, represents a highly promising target. Despite homology with human counterparts, the single 14-3-3 of G. duodenalis (g14-3-3) is characterized by a constitutive phosphorylation in a region critical for target binding, thus affecting the function and the conformation of g14-3-3/clients interaction. However, to approach the design of specific small molecule modulators of g14-3-3 PPIs, structural elucidations are required. Here, we present a detailed computational and crystallographic study exploring the implications of g14-3-3 phosphorylation on protein structure and target binding. Self-Guided Langevin Dynamics and classical molecular dynamics simulations show that phosphorylation affects locally and globally g14-3-3 conformation, inducing a structural rearrangement more suitable for target binding. Profitable features for g14-3-3/clients interaction were highlighted using a hydrophobicity-based descriptor to characterize g14-3-3 client peptides. Finally, the X-ray structure of g14-3-3 in complex with a mode-1 prototype phosphopeptide was solved and combined with structure-based simulations to identify molecular features relevant for clients binding to g14-3-3. The data presented herein provide a further and structural understanding of g14-3-3 features and set the basis for drug design studies. PMID:26551337

  16. Complementary molecular information changes our perception of food web structure

    PubMed Central

    Wirta, Helena K.; Hebert, Paul D. N.; Kaartinen, Riikka; Prosser, Sean W.; Várkonyi, Gergely; Roslin, Tomas

    2014-01-01

    How networks of ecological interactions are structured has a major impact on their functioning. However, accurately resolving both the nodes of the webs and the links between them is fraught with difficulties. We ask whether the new resolution conferred by molecular information changes perceptions of network structure. To probe a network of antagonistic interactions in the High Arctic, we use two complementary sources of molecular data: parasitoid DNA sequenced from the tissues of their hosts and host DNA sequenced from the gut of adult parasitoids. The information added by molecular analysis radically changes the properties of interaction structure. Overall, three times as many interaction types were revealed by combining molecular information from parasitoids and hosts with rearing data, versus rearing data alone. At the species level, our results alter the perceived host specificity of parasitoids, the parasitoid load of host species, and the web-wide role of predators with a cryptic lifestyle. As the northernmost network of host–parasitoid interactions quantified, our data point exerts high leverage on global comparisons of food web structure. However, how we view its structure will depend on what information we use: compared with variation among networks quantified at other sites, the properties of our web vary as much or much more depending on the techniques used to reconstruct it. We thus urge ecologists to combine multiple pieces of evidence in assessing the structure of interaction webs, and suggest that current perceptions of interaction structure may be strongly affected by the methods used to construct them. PMID:24449902

  17. Giant Molecular Cloud Structure and Evolution

    NASA Technical Reports Server (NTRS)

    Hollenbach, David (Technical Monitor); Bodenheimer, P. H.

    2003-01-01

    Bodenheimer and Burkert extended earlier calculations of cloud core models to study collapse and fragmentation. The initial condition for an SPH collapse calculation is the density distribution of a Bonnor-Ebert sphere, with near balance between turbulent plus thermal energy and gravitational energy. The main parameter is the turbulent Mach number. For each Mach number several runs are made, each with a different random realization of the initial turbulent velocity field. The turbulence decays on a dynamical time scale, leading the cloud into collapse. The collapse proceeds isothermally until the density has increased to about 10(exp 13) g cm(exp -3). Then heating is included in the dense regions. The nature of the fragmentation is investigated. About 15 different runs have been performed with Mach numbers ranging from 0.3 to 3.5 (the typical value observed in molecular cloud cores is 0.7). The results show a definite trend of increasing multiplicity with increasing Mach number (M), with the number of fragments approximately proportional to (1 + M). In general, this result agrees with that of Fisher, Klein, and McKee who published three cases with an AMR grid code. However our results show that there is a large spread about this curve. For example, for M=0.3 one case resulted in no fragmentation while a second produced three fragments. Thus it is not only the value of M but also the details of the superposition of the various velocity modes that play a critical role in the formation of binaries. Also, the simulations produce a wide range of separations (10-1000 AU) for the multiple systems, in rough agreement with observations. These results are discussed in two conference proceedings.

  18. Ionization probes of molecular structure and chemistry

    SciTech Connect

    Johnson, P.M.

    1993-12-01

    Various photoionization processes provide very sensitive probes for the detection and understanding of the spectra of molecules relevant to combustion processes. The detection of ionization can be selective by using resonant multiphoton ionization or by exploiting the fact that different molecules have different sets of ionization potentials. Therefore, the structure and dynamics of individual molecules can be studied even in a mixed sample. The authors are continuing to develop methods for the selective spectroscopic detection of molecules by ionization, and to use these methods for the study of some molecules of combustion interest.

  19. Dominant color correlogram descriptor for content-based image retrieval

    NASA Astrophysics Data System (ADS)

    Fierro-Radilla, Atoany; Perez-Daniel, Karina; Nakano-Miyatake, Mariko; Benois, Jenny

    2015-03-01

    Content-based image retrieval (CBIR) has become an interesting and urgent research topic due to the growing need to index and classify multimedia content in large databases. Low-level visual descriptors, such as color-based, texture-based and shape-based descriptors, have been used for the CBIR task. In this paper we propose a color-based descriptor that describes image contents well, integrating both the global feature provided by dominant color and the local features provided by the color correlogram. The performance of the proposed descriptor, called the Dominant Color Correlogram descriptor (DCCD), is evaluated by comparison with some MPEG-7 visual descriptors and other color-based descriptors reported in the literature, using two image datasets of different size and content. The performance of the proposed descriptor is assessed using three metrics commonly used in image retrieval tasks: ARP (Average Retrieval Precision), ARR (Average Retrieval Rate) and ANMRR (Average Normalized Modified Retrieval Rank). Precision-recall curves are also provided to show the better performance of the proposed descriptor compared with other color-based descriptors.

  20. In silico evaluation, molecular docking and QSAR analysis of quinazoline-based EGFR-T790M inhibitors.

    PubMed

    Asadollahi-Baboli, M

    2016-08-01

    Mutated epidermal growth factor receptor (EGFR-T790M) inhibitors hold promise as new agents against cancer. Molecular docking and QSAR analysis were performed based on a series of fifty-three quinazoline derivatives to elucidate key structural and physicochemical properties affecting inhibitory activity. Molecular docking analysis identified the true conformations of ligands in the receptor's active pocket. The structural features of the ligands, expressed as molecular descriptors, were derived from the obtained docked conformations. Non-linear and spline QSAR models were developed through novel genetic algorithm and artificial neural network (GA-ANN) and multivariate adaptive regression spline techniques, respectively. The former technique was employed to consider the non-linear relation between molecular descriptors and inhibitory activity of quinazoline derivatives. The latter technique was also used to describe the non-linearity using basis functions and sub-region equations for each descriptor. Our QSAR model gave a high predictive performance ([Formula: see text] and [Formula: see text]) using diverse validation techniques. Eight new compounds were designed using our QSAR model as potent EGFR-T790M inhibitors. Overall, the proposed in silico strategy based on docking-derived descriptors and non-linear descriptor subset selection may help design novel quinazoline derivatives with improved EGFR-T790M inhibitory activity. PMID:27209475

  1. Comprehensive Molecular Structure of the Eukaryotic Ribosome

    PubMed Central

    Taylor, Derek J.; Devkota, Batsal; Huang, Andrew D.; Topf, Maya; Narayanan, Eswar; Sali, Andrej; Harvey, Stephen C.; Frank, Joachim

    2009-01-01

    Despite the emergence of a large number of X-ray crystallographic models of the bacterial 70S ribosome over the past decade, an accurate atomic model of the eukaryotic 80S ribosome is still not available. Eukaryotic ribosomes possess more ribosomal proteins and ribosomal RNA than bacterial ribosomes, which are implicated in extra-ribosomal functions in the eukaryotic cells. By combining cryo-EM with RNA and protein homology modeling, we obtained an atomic model of the yeast 80S ribosome complete with all ribosomal RNA expansion segments and all ribosomal proteins for which a structural homolog can be identified. Mutation or deletion of 80S ribosomal proteins can abrogate maturation of the ribosome, leading to several human diseases. We have localized one such protein unique to eukaryotes, rpS19e, whose mutations are associated with Diamond-Blackfan anemia in humans. Additionally, we characterize crucial and novel interactions between the dynamic stalk base of the ribosome and eukaryotic elongation factor 2. PMID:20004163

  2. COMPUTER-ASSISTED STRUCTURE ACTIVITY RELATIONSHIPS OF NITROGENOUS CYCLIC COMPOUNDS TESTED IN SALMONELLA ASSAYS FOR MUTAGENICITY

    EPA Science Inventory

    Study of the relationship between mutagenicity and molecular structure for a data set of nitrogenous cyclic compounds is reported. A computerized SAR system (ADAPT) was utilized to classify a data set of 114 nitrogenous cyclic compounds with 19 molecular descriptors. All of the d...

  3. Mechanistic Details and Reactivity Descriptors in Oxidation and Acid Catalysis of Methanol

    SciTech Connect

    Deshlahra, Prashant; Carr, Robert T.; Chai, Song-Hai; Iglesia, Enrique

    2015-02-06

    Acid and redox reaction rates of CH₃OH-O₂ mixtures on polyoxometalate (POM) clusters, together with isotopic, spectroscopic, and theoretical assessments of catalyst properties and reaction pathways, were used to define rigorous descriptors of reactivity and to probe the compositional effects for oxidative dehydrogenation (ODH) and dehydration reactions. ³¹P-MAS NMR, transmission electron microscopy and titrations of protons with di-tert-butylpyridine during catalysis showed that POM clusters retained their Keggin structure upon dispersion on SiO₂ and after use in CH₃OH reactions. The effects of CH₃OH and O₂ pressures and of D-substitution on ODH rates show that C-H activation in molecularly adsorbed CH₃OH is the sole kinetically relevant step and leads to reduced centers as intermediates present at low coverages; their concentrations, measured from UV-vis spectra obtained during catalysis, are consistent with the effects of CH₃OH/O₂ ratios predicted from the elementary steps proposed. First-order ODH rate constants depend strongly on the addenda atoms (Mo vs W) but weakly on the central atom (P vs Si) in POM clusters, because C-H activation steps inject electrons into the lowest unoccupied molecular orbitals (LUMO) of the clusters, which are the d-orbitals at Mo⁶⁺ and W⁶⁺ centers. H-atom addition energies (HAE) at O-atoms in POM clusters represent the relevant theoretical probe of the LUMO energies and of ODH reactivity. The calculated energies of ODH transition states at each O-atom depend linearly on their HAE values with slopes near unity, as predicted for late transition states in which electron transfer and C-H cleavage are essentially complete. HAE values averaged over all accessible O-atoms in POM clusters provide the appropriate reactivity descriptor for oxides whose known structures allow accurate HAE calculations. CH₃OH dehydration proceeds via parallel pathways mediated by late carbenium-ion transition states; effects of

  4. The Classification of HEp-2 Cell Patterns Using Fractal Descriptor.

    PubMed

    Xu, Rudan; Sun, Yuanyuan; Yang, Zhihao; Song, Bo; Hu, Xiaopeng

    2015-07-01

    Indirect immunofluorescence (IIF) with HEp-2 cells is considered a powerful, sensitive and comprehensive technique for analyzing antinuclear autoantibodies (ANAs). The automatic classification of HEp-2 cell images from IIF plays an important role in diagnosis. Fractal dimension can be used to analyze image representation and to quantify properties such as texture complexity and spatial occupation. In this study, we apply fractal theory to HEp-2 cell staining pattern classification, using a fractal descriptor for the first time in this task, together with a morphological descriptor and a pixel difference descriptor. The method is applied to the MIVIA data set and uses a support vector machine (SVM) classifier. Experimental results show that combining the fractal descriptor with the morphological and pixel difference descriptors makes the precision for the six patterns more stable, all above 50%, achieving 67.17% overall accuracy at best with relatively simple feature vectors. PMID:26011888
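
    The abstract does not say how the fractal dimension is estimated. As an illustration only, the sketch below uses box counting, one common estimator, on a binary image stored as a list of rows. The function names and the test images are hypothetical and are not taken from the paper.

```python
import math


def box_count(image, size):
    """Count the size x size boxes that contain at least one foreground pixel."""
    rows, cols = len(image), len(image[0])
    count = 0
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            if any(image[i][j]
                   for i in range(r, min(r + size, rows))
                   for j in range(c, min(c + size, cols))):
                count += 1
    return count


def box_counting_dimension(image, sizes):
    """Least-squares slope of log N(s) against log(1/s).

    Assumes the image has at least one foreground pixel, so that every
    box count is positive, and that at least two box sizes are given.
    """
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(image, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Hypothetical test images: a filled square and a single straight line.
filled = [[1] * 16 for _ in range(16)]
line = [[1 if r == 0 else 0 for _ in range(16)] for r in range(16)]

print(box_counting_dimension(filled, [1, 2, 4, 8]))
print(box_counting_dimension(line, [1, 2, 4, 8]))
```

    A filled region should give a slope close to 2 and a straight line a slope close to 1; textured staining patterns would fall in between, which is what makes the value usable as a feature.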

  5. Weighted measurement fusion Kalman estimator for multisensor descriptor system

    NASA Astrophysics Data System (ADS)

    Dou, Yinfeng; Ran, Chenjian; Gao, Yuan

    2016-08-01

    For the multisensor linear stochastic descriptor system with correlated measurement noises, the fused measurement can be obtained by the weighted least squares (WLS) method, and the reduced-order state components are obtained by applying the singular value decomposition method. The multisensor descriptor system is then transformed into a fused reduced-order non-descriptor system with correlated noise, and the weighted measurement fusion (WMF) Kalman estimator of this reduced-order subsystem is presented. According to the relationship between the presented non-descriptor system and the original descriptor system, the WMF Kalman estimator and its estimation error variance matrix for the original multisensor descriptor system are presented. Compared with the state fusion method, the presented WMF Kalman estimator has global optimality and avoids computing the cross-variances of the local Kalman estimators. A simulation example of a three-sensor stochastic dynamic input and output system in economics verifies its effectiveness.
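
    To make the WLS fusion step concrete, here is a minimal sketch for a much simpler case than the one treated in the paper: every sensor observes the same scalar quantity directly, and the measurement noises have a known, possibly correlated, covariance matrix R. Under that assumption the fused measurement is (eᵀR⁻¹e)⁻¹eᵀR⁻¹z, where e is a vector of ones. The function names and numbers are hypothetical; the descriptor-system reduction and the Kalman estimator itself are not shown.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fuse_measurements(z, noise_cov):
    """WLS fusion of direct scalar measurements z with noise covariance R.

    Returns the fused measurement and the variance of its noise.
    """
    w = solve(noise_cov, [1.0] * len(z))   # w = R^{-1} e
    total = sum(w)                          # e^T R^{-1} e
    fused = sum(wi * zi for wi, zi in zip(w, z)) / total
    return fused, 1.0 / total


# Hypothetical two-sensor example with correlated noise.
R = [[1.0, 0.5],
     [0.5, 2.0]]
fused, variance = fuse_measurements([10.0, 20.0], R)
print(fused, variance)
```

    In this example the fused noise variance comes out below the smaller of the two sensor variances, which is the reason for fusing measurements before filtering.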

  6. Invariant Descriptor Learning Using a Siamese Convolutional Neural Network

    NASA Astrophysics Data System (ADS)

    Chen, L.; Rottensteiner, F.; Heipke, C.

    2016-06-01

    In this paper we describe learning of a descriptor based on the Siamese Convolutional Neural Network (CNN) architecture and evaluate our results on a standard patch comparison dataset. The descriptor learning architecture is composed of an input module, a Siamese CNN descriptor module and a cost computation module that is based on the L2 norm. The cost function we use pulls the descriptors of matching patches close to each other in feature space while pushing the descriptors of non-matching pairs away from each other. Compared to related work, we optimize the training parameters by combining a moving average strategy for gradients with Nesterov's Accelerated Gradient. Experiments show that our learned descriptor performs well and achieves state-of-the-art results in terms of the false positive rate at a 95% recall rate on standard benchmark datasets.
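
    The abstract does not give the exact cost function. A widely used loss with the described pull and push behaviour is the margin-based contrastive loss, sketched below on plain Python lists. The choice of this loss, the margin value and the function names are assumptions made for illustration, not the authors' formulation.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(desc_a, desc_b, is_match, margin=1.0):
    """Contrastive loss for one pair of descriptors.

    Matching pairs are penalised by their squared distance (pull together).
    Non-matching pairs are penalised only while they are closer than
    `margin` (push apart until the margin is reached).
    """
    d = l2_distance(desc_a, desc_b)
    if is_match:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical 2-D descriptors.
print(contrastive_loss([0.0, 0.0], [0.1, 0.0], is_match=True))
print(contrastive_loss([0.0, 0.0], [0.1, 0.0], is_match=False))
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], is_match=False))
```

    The third pair is already farther apart than the margin, so it contributes nothing to the loss; only non-matching pairs that are still too close generate a gradient.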

  7. Fingerprint identification using SIFT-based minutia descriptors and improved all descriptor-pair matching.

    PubMed

    Zhou, Ru; Zhong, Dexing; Han, Jiuqiang

    2013-01-01

    The performance of conventional minutiae-based fingerprint authentication algorithms degrades significantly when dealing with low quality fingerprints with lots of cuts or scratches. A similar degradation of the minutiae-based algorithms is observed when small overlapping areas appear because of the quite narrow width of the sensors. Based on the detection of minutiae, Scale Invariant Feature Transformation (SIFT) descriptors are employed to fulfill verification tasks in the above difficult scenarios. However, the original SIFT algorithm is not suitable for fingerprint because of: (1) the similar patterns of parallel ridges; and (2) high computational resource consumption. To enhance the efficiency and effectiveness of the algorithm for fingerprint verification, we propose a SIFT-based Minutia Descriptor (SMD) to improve the SIFT algorithm through image processing, descriptor extraction and matcher. A two-step fast matcher, named improved All Descriptor-Pair Matching (iADM), is also proposed to implement the 1:N verifications in real-time. Fingerprint Identification using SMD and iADM (FISiA) achieved a significant improvement with respect to accuracy in representative databases compared with the conventional minutiae-based method. The speed of FISiA also can meet real-time requirements. PMID:23467056

  8. Fingerprint Identification Using SIFT-Based Minutia Descriptors and Improved All Descriptor-Pair Matching

    PubMed Central

    Zhou, Ru; Zhong, Dexing; Han, Jiuqiang

    2013-01-01

    The performance of conventional minutiae-based fingerprint authentication algorithms degrades significantly when dealing with low quality fingerprints with lots of cuts or scratches. A similar degradation of the minutiae-based algorithms is observed when small overlapping areas appear because of the quite narrow width of the sensors. Based on the detection of minutiae, Scale Invariant Feature Transformation (SIFT) descriptors are employed to fulfill verification tasks in the above difficult scenarios. However, the original SIFT algorithm is not suitable for fingerprint because of: (1) the similar patterns of parallel ridges; and (2) high computational resource consumption. To enhance the efficiency and effectiveness of the algorithm for fingerprint verification, we propose a SIFT-based Minutia Descriptor (SMD) to improve the SIFT algorithm through image processing, descriptor extraction and matcher. A two-step fast matcher, named improved All Descriptor-Pair Matching (iADM), is also proposed to implement the 1:N verifications in real-time. Fingerprint Identification using SMD and iADM (FISiA) achieved a significant improvement with respect to accuracy in representative databases compared with the conventional minutiae-based method. The speed of FISiA also can meet real-time requirements. PMID:23467056

  9. An Efficient Wide-Baseline Dense Matching Descriptor

    NASA Astrophysics Data System (ADS)

    Wan, Yanli; Miao, Zhenjiang; Tang, Zhen; Wan, Lili; Wang, Zhe

    This letter proposes an efficient local descriptor for wide-baseline dense matching. It improves the existing Daisy descriptor by combining intensity-based Haar wavelet response with a new color-based ratio model. The color ratio model is invariant to changes of viewing direction, object geometry, and the direction, intensity and spectral power distribution of the illumination. The experiments show that our descriptor has high discriminative power and robustness.

  10. Connecting the density structure of molecular clouds and star formation.

    NASA Astrophysics Data System (ADS)

    Kainulainen, Jouni

    2015-08-01

    In the current paradigm of turbulence-regulated interstellar medium (ISM), star formation rates of entire galaxies are intricately linked to the density structure of the individual molecular clouds in the ISM. This density structure is essentially encapsulated in the probability distribution function of volume densities (rho-PDF), which directly affects the star formation rates predicted by analytic models. Contrasting its fundamental role, the rho-PDF function and its evolution have remained virtually unconstrained by observations. I describe in this contribution our recent progress in attaining observational constraints for the rho-PDFs of molecular clouds. Specifically, I review our first systematic determination of the rho-PDFs in Solar neighborhood molecular clouds. I will also present new evidence of the time evolution of the projected rho-PDFs, i.e., column density PDFs. These results together enable us to build the first observationally constrained link between the evolving density structure of molecular clouds and the star formation within. Finally, I discuss our work to expand the analysis into a Galactic context and to observationally connect the physical processes acting at the scale of molecular clouds with star formation at the scale of galaxies.

  11. Determination of structure parameters in molecular tunnelling ionisation model

    NASA Astrophysics Data System (ADS)

    Wang, Jun-Ping; Zhao, Song-Feng; Zhang, Cai-Rong; Li, Wei; Zhou, Xiao-Xin

    2014-04-01

    We extracted the accurate structure parameters in a molecular tunnelling ionisation model (the so-called MO-ADK model) for 23 selected linear molecules including some inner orbitals. The molecular wave functions with the correct asymptotic behaviour are obtained by solving the time-independent Schrödinger equation with B-spline functions and molecular potentials numerically constructed using the modified Leeuwen-Baerends (LBα) model. We show that the orientation-dependent ionisation rate reflects the shape of the ionising orbitals in general. The influences of the Stark shifts of the energy levels on the orientation-dependent ionisation rates of the polar molecules are studied. We also examine the angle-dependent ionisation rates (or probabilities) based on the MO-ADK model by comparing with the molecular strong-field approximation calculations and with recent experimental measurements.

  12. From non-random molecular structure to life and mind

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1989-01-01

    The evolutionary hierarchy molecular structure-->macromolecular structure-->protobiological structure-->biological structure-->biological functions has been traced by experiments. The sequence always moves through protein. Extension of the experiments traces the formation of nucleic acids instructed by proteins. The proteins themselves were, in this picture, instructed by the self-sequencing of precursor amino acids. While the sequence indicated explains the thread of the emergence of life, protein in cellular membrane also provides the only known material basis for the emergence of mind in the context of emergence of life.

  13. [Structure and molecular mechanisms of infection and replication of HIV].

    PubMed

    Sato, Hironori; Ode, Hirotaka; Motomura, Kazushi; Yokoyama, Masaru

    2009-01-01

    Studies on the molecular structure and replication mechanisms of a pathogen are important from both scientific and clinical viewpoints. The replication study allows us to identify key molecules that regulate the life cycle of the pathogen and to screen anti-pathogen drugs rationally. The structural study helps us understand how the key molecules work at the atomic level and design drugs appropriately. In this article, we review important findings on structural and replication studies of human immunodeficiency virus (HIV). We also summarize the latest methods for the structural study, mainly focusing on computational simulation technology (in silico analysis). Finally, we briefly summarize standard methods to study replication of viruses. PMID:19177750

  14. Hierarchical and binary spatial descriptors for lung nodule image retrieval.

    PubMed

    Ng, Gillian; Song, Yang; Cai, Weidong; Zhou, Yun; Liu, Sidong; Feng, David Dagan

    2014-01-01

    With the increasing amount of image data available for cancer staging and diagnosis, it is clear that content-based image retrieval techniques are becoming more important to assist physicians in making diagnoses and tracking disease. Domain-specific feature descriptors have been previously shown to be effective in the retrieval of lung tumors. This work proposes a method to improve the rotation invariance of the hierarchical spatial descriptor, as well as presents a new binary descriptor for the retrieval of lung nodule images. The descriptors were evaluated on the ELCAP public access database, exhibiting good performance overall. PMID:25571476

  15. Heliconia phenotypic diversity based on qualitative descriptors.

    PubMed

    Guimarães, W N R; Martins, L S S; Castro, C E F; Carvalho Filho, J L S; Loges, V

    2014-01-01

    The aim of this study was to characterize Heliconia genotypes phenotypically using 26 qualitative descriptors. The evaluations were conducted in five flowering stems per clump in three replicates of 22 Heliconia genotypes. Data were subjected to multivariate analysis, the Mahalanobis dissimilarity measure was estimated, and the dendrogram was generated using the nearest neighbor method. From the values generated by the dissimilarity matrix and the clusters formed among the Heliconia genotypes studied, the phenotypic characterizations that best differentiated the genotypes were: pseudostem and wax green tone (light or dark green), leaf-wax petiole, the petiole hair, cleft margin at the base of the petiole, midrib underside shade of green, wax midrib underside, color sheet (light or dark green), unequal lamina base, torn limb, inflorescence-wax, position of inflorescence, bract leaf in apex, twisting of the rachis, and type of bloom. These results will be applied in the preparation of a catalog for Heliconia descriptors, in the selection of different genotypes with most promising characteristics for crosses, and for the characterization of new genotypes to be introduced in germplasm collections. PMID:24782170

  16. Notes on quantitative structure-property relationships (QSPR), part 3: density functions origin shift as a source of quantum QSPR algorithms in molecular spaces.

    PubMed

    Carbó-Dorca, Ramon

    2013-04-01

    A general algorithm implementing a useful variant of quantum quantitative structure-property relationships (QQSPR) theory is described. Based on the quantum similarity framework and previous theoretical developments on the subject, the present QQSPR procedure relies on the possibility of performing geometrical origin shifts over molecular density function sets. In this way, molecular collections attached to known properties can be easily used over other quantum mechanically well-described molecular structures for the estimation of their unknown property values. The proposed procedure takes the quantum mechanical expectation value as provider of the causal relation background and overcomes the dimensionality paradox, which haunts classical descriptor space QSPR. Also, contrary to classical procedures, which are also attached to heavy statistical gear, the present QQSPR approach might use a geometrical assessment only or just some simple statistical outline or both. From an applied point of view, several easily reachable computational levels can be set up. A Fortran 95 program, QQSPR-n, is described in two versions, which might be downloaded from a dedicated web site. Various practical examples are provided, yielding excellent results. Finally, it is also shown that an equivalent molecular space classical QSPR formalism can be easily developed. PMID:23238931

  17. A novel phantom system facilitating better descriptors of density within mammographic images

    NASA Astrophysics Data System (ADS)

    Li, Yanpeng; Brennan, Patrick C.; Nickson, Carolyn; Pietrzyk, Mariusz W.; Al Mousa, Dana; Ryan, Elaine

    2013-03-01

    High mammographic density is a risk factor for breast cancer. As it is impossible to measure the actual weight or volume of fibroglandular tissue evident within a mammogram, it is hard to know the correlation between measured mammographic density and the actual fibroglandular tissue volume. The aim of this study is to develop a phantom that represents glandular tissue within an adipose tissue structure so that correlations between image feature descriptors and the synthesised glandular structure can be accurately quantified. In this phantom study, ten different weights of fine steel wool were put into gelatine to simulate breast structure. Image feature descriptors are investigated for both the whole phantom image and the simulated density. Descriptors included actual area and percentage area of density, mean pixel intensity for the whole image and dense area, standard deviation of mean intensity, and integrated pixel density, which is the product of area and mean intensity. The results show a high correlation between steel-wool weight and percentage density measured on images (r = 0.8421), and with the integrated pixel density of the dense area (r = 0.8760). The correlation is also significant for the standard deviation of mean intensity for the whole phantom (r = 0.8043). This phantom study may help identify more accurate descriptors of mammographic density, thus facilitating better assessments of fibroglandular tissue appearances.
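
    The two quantities named in this abstract, integrated pixel density (area times mean intensity) and the correlation coefficient r, can be sketched as follows. The sketch assumes that r is the Pearson coefficient, which the abstract does not state, and the sample numbers are invented for illustration; they are not the study's data.

```python
import math


def integrated_pixel_density(area, mean_intensity):
    """Integrated pixel density: dense area multiplied by its mean intensity."""
    return area * mean_intensity


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented example: steel-wool weights (g) and measured dense regions.
weights = [1.0, 2.0, 3.0, 4.0, 5.0]
areas = [120.0, 210.0, 330.0, 380.0, 520.0]        # pixels
mean_intensities = [0.40, 0.45, 0.43, 0.50, 0.52]  # normalised units

ipd = [integrated_pixel_density(a, m) for a, m in zip(areas, mean_intensities)]
print(pearson_r(weights, ipd))
```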

  18. Molecular structure of DNA by scanning tunneling microscopy.

    PubMed

    Cricenti, A; Selci, S; Felici, A C; Generosi, R; Gori, E; Djaczenko, W; Chiarotti, G

    1989-09-15

    Uncoated DNA molecules marked with an activated tris(1-aziridinyl) phosphine oxide (TAPO) solution were deposited on gold substrates and imaged in air with the use of a high-resolution scanning tunneling microscope (STM). Constant-current and gap-modulated STM images show clear evidence of the helicity of the DNA structure: pitch periodicity ranges from 25 to 35 angstroms, whereas the average diameter is 20 angstroms. Molecular structure within a single helix turn was also observed. PMID:2781279

  19. Molecular Structure of DNA by Scanning Tunneling Microscopy

    NASA Astrophysics Data System (ADS)

    Cricenti, A.; Selci, S.; Felici, A. C.; Generosi, R.; Gori, E.; Djaczenko, W.; Chiarotti, G.

    1989-09-01

    Uncoated DNA molecules marked with an activated tris(1-aziridinyl) phosphine oxide (TAPO) solution were deposited on gold substrates and imaged in air with the use of a high-resolution scanning tunneling microscope (STM). Constant-current and gap-modulated STM images show clear evidence of the helicity of the DNA structure: pitch periodicity ranges from 25 to 35 angstroms, whereas the average diameter is 20 angstroms. Molecular structure within a single helix turn was also observed.

  20. Relating Soil Organic Matter Dynamics to its Molecular Structure

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Our understanding of the dynamics of soil organic matter (SOM) must be integrated with a sound knowledge of its biochemical complexity. The molecular structure of SOM was determined in 98% sand soils to eliminate the known protective effects of clay on the amount and turnover rate of the SOM constitu...

  1. Carcinogenic classification of polycyclic aromatic hydrocarbons through theoretical descriptors

    NASA Astrophysics Data System (ADS)

    Troche, Karla S.; Braga, Scheila F.; Coluci, Vitor R.; Galvão, Douglas S.

    Polycyclic aromatic hydrocarbons (PAHs) constitute an important family of molecules capable of inducing chemical carcinogenesis. In this work we report a comparative structure-activity relationship (SAR) study for 81 PAHs using different methodologies. The recently developed electronic indices methodology (EIM) with quantum descriptors obtained from different semiempirical methods (AM1, PM3, and PM5) was contrasted against more standard pattern recognition methods (PRMs), principal component analysis (PCA), hierarchical cluster analysis (HCA), Kth nearest neighbor (KNN), soft independent modeling of class analogies (SIMCA), and neural networks (NN). Our results show that PRMs validate the statistical value of electronic parameters derived from EIM analysis and their ability to identify active compounds. EIM outperformed more standard SAR methodologies and does not appear to be significantly Hamiltonian-dependent.

  2. Extracting Structure Parameters of Dimers for Molecular Tunneling Ionization Model

    NASA Astrophysics Data System (ADS)

    Zhao, Song-Feng; Huang, Fang; Wang, Guo-Li; Zhou, Xiao-Xin

    2016-03-01

    We determine structure parameters of the highest occupied molecular orbital (HOMO) of 27 dimers for the molecular tunneling ionization (so-called MO-ADK) model of Tong et al. [Phys. Rev. A 66 (2002) 033402]. The molecular wave functions with correct asymptotic behavior are obtained by solving the time-independent Schrödinger equation with B-spline functions and molecular potentials which are numerically created using density functional theory. We examine the alignment-dependent tunneling ionization probabilities from the MO-ADK model for several molecules by comparing with the molecular strong-field approximation (MO-SFA) calculations. We show that the molecular Perelomov–Popov–Terent'ev (MO-PPT) model can successfully give the laser wavelength dependence of ionization rates (or probabilities). Based on the MO-PPT model, two diatomic molecules having valence orbital with antibonding systems (i.e., Cl₂, Ne₂) show strong ionization suppression when compared with their corresponding closest companion atoms. Supported by National Natural Science Foundation of China under Grant Nos. 11164025, 11264036, 11465016, 11364038, the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20116203120001, and the Basic Scientific Research Foundation for Institution of Higher Learning of Gansu Province

  3. Extracting Structure Parameters of Dimers for Molecular Tunneling Ionization Model

    NASA Astrophysics Data System (ADS)

    Zhao, Song-Feng; Huang, Fang; Wang, Guo-Li; Zhou, Xiao-Xin

    2016-03-01

    We determine structure parameters of the highest occupied molecular orbital (HOMO) of 27 dimers for the molecular tunneling ionization (so-called MO-ADK) model of Tong et al. [Phys. Rev. A 66 (2002) 033402]. The molecular wave functions with correct asymptotic behavior are obtained by solving the time-independent Schrödinger equation with B-spline functions and molecular potentials which are numerically created using density functional theory. We examine the alignment-dependent tunneling ionization probabilities from the MO-ADK model for several molecules by comparing with the molecular strong-field approximation (MO-SFA) calculations. We show that the molecular Perelomov-Popov-Terent'ev (MO-PPT) model can successfully give the laser wavelength dependence of ionization rates (or probabilities). Based on the MO-PPT model, two diatomic molecules having valence orbital with antibonding systems (i.e., Cl₂, Ne₂) show strong ionization suppression when compared with their corresponding closest companion atoms. Supported by National Natural Science Foundation of China under Grant Nos. 11164025, 11264036, 11465016, 11364038, the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20116203120001, and the Basic Scientific Research Foundation for Institution of Higher Learning of Gansu Province

  4. Molecular solids of actinide hexacyanoferrate: Structure and bonding

    NASA Astrophysics Data System (ADS)

    Dupouy, G.; Dumas, T.; Fillaux, C.; Guillaumont, D.; Moisy, P.; Den Auwer, C.; Le Naour, C.; Simoni, E.; Fuster, E. G.; Papalardo, R.; Sanchez Marcos, E.; Hennig, C.; Scheinost, A.; Conradson, S. D.; Shuh, D. K.; Tyliszczak, T.

    2010-03-01

    The hexacyanometallate family is well known in transition metal chemistry because the remarkable electronic delocalization along the metal-cyano-metal bond can be tuned in order to design systems that undergo a reversible and controlled change of their physical properties. We have been working for a few years on the description of the molecular and electronic structure of materials formed with [Fe(CN)₆]ⁿ⁻ building blocks and actinide ions (An = Th, U, Np, Pu, Am) and have compared these new materials to those obtained with lanthanide cations at oxidation state +III. In order to evaluate the influence of the actinide coordination polyhedron on the three-dimensional molecular structure, both atomic number and formal oxidation state have been varied: oxidation states +III, +IV. EXAFS at both the iron K edge and the actinide LIII edge is the dedicated structural probe to obtain structural information on these systems. Data at both edges have been combined to obtain a three-dimensional model. In addition, qualitative electronic information has been gathered with two spectroscopic tools: UV-Near IR spectrophotometry and low energy XANES data that can probe each atom of the structural unit: Fe, C, N and An. Coupling these spectroscopic tools to theoretical calculations will lead in the future to a better description of bonding in these molecular solids. Of primary interest is the ability of the actinide cation to form ionic-covalent bonding as 5f orbitals are being filled by modification of oxidation state and/or atomic number.

  5. Molecular design for growth of supramolecular membranes with hierarchical structure.

    PubMed

    Zha, R Helen; Velichko, Yuri S; Bitton, Ronit; Stupp, Samuel I

    2016-02-01

    Membranes with hierarchical structure exist in biological systems, and bio-inspired building blocks have been used to grow synthetic analogues in the laboratory through self-assembly. The formation of these synthetic membranes is initiated at the interface of two aqueous solutions, one containing cationic peptide amphiphiles (PA) and the other containing the anionic biopolymer hyaluronic acid (HA). The membrane growth process starts within milliseconds of interface formation and continues over much longer timescales to generate robust membranes with supramolecular PA-HA nanofibers oriented orthogonal to the interface. Computer simulation indicates that formation of these hierarchically structured membranes requires strong interactions between molecular components at early time points in order to generate a diffusion barrier between both solutions. Experimental studies using structurally designed PAs confirm simulation results by showing that only PAs with high ζ potential are able to yield hierarchically structured membranes. Furthermore, the chemical structure of such PAs must incorporate residues that form β-sheets, which facilitates self-assembly of long nanofibers. In contrast, PAs that form low aspect ratio nanostructures interact weakly with HA and yield membranes that exhibit non-fibrous fingering protrusions. Furthermore, experimental results show that increasing HA molecular weight decreases the growth rate of orthogonal nanofibers. This result is supported by simulation results suggesting that the thickness of the interfacial contact layer generated immediately after initiation of self-assembly increases with polymer molecular weight. PMID:26649980

  6. Photoelectron Angular Distribution and Molecular Structure in Multiply Charged Anions

    SciTech Connect

    Xing, Xiaopeng; Wang, Xue B.; Wang, Lai S.

    2009-02-12

    Photoelectrons emitted from multiply charged anions (MCAs) carry information of the intramolecular Coulomb repulsion (ICR), which is dependent on molecular structures. Using photoelectron imaging, we observed the effects of ICR on photoelectron angular distributions (PAD) of the three isomers of benzene dicarboxylate dianions C₆H₄(CO₂)₂²⁻ (o-, m- and p-BDC²⁻). Photoelectrons were observed to peak along the laser polarization due to the ICR, but the anisotropy was the largest for p-BDC²⁻, followed by the m- and o-isomer. The observed anisotropy is related to the direction of the ICR or the detailed molecular structures, suggesting that photoelectron imaging may allow structural information to be obtained for complex multiply charged anions.

  7. Structure factor and rheology of chain molecules from molecular dynamics

    NASA Astrophysics Data System (ADS)

    Castrejón-González, Omar; Castillo-Tejas, Jorge; Manero, Octavio; Alvarado, Juan F. J.

    2013-05-01

    Equilibrium and non-equilibrium molecular dynamics were performed to determine the relationship between the static structure factor, the molecular conformation, and the rheological properties of chain molecules. A spring-monomer model with Finitely Extensible Nonlinear Elastic and Lennard-Jones force field potentials was used to describe chain molecules. The equations of motion were solved for shear flow with SLLOD equations of motion integrated with Verlet's algorithm. A multiple time scale algorithm extended to non-equilibrium situations was used as the integration method. Concentric circular patterns in the structure factor were obtained, indicating an isotropic Newtonian behavior. Under simple shear flow, some peaks in the structure factor emerged, corresponding to an anisotropic pattern as chains aligned along the flow direction. Pure chain molecules and chain molecules in solution displayed shear-thinning regions. Power-law and Carreau-Yasuda models were used to fit the generated data. Results are in qualitative agreement with rheological and light scattering experiments.
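
    For reference, the two viscosity models named in this abstract have the standard forms η = K·γ̇ⁿ⁻¹ (power law) and η = η∞ + (η₀ − η∞)[1 + (λγ̇)ᵃ]^((n−1)/a) (Carreau-Yasuda). The sketch below evaluates both; the parameter values are arbitrary illustrations and are not fitted values from the paper.

```python
def power_law_viscosity(shear_rate, k, n):
    """Power-law (Ostwald-de Waele) viscosity; n < 1 gives shear thinning."""
    return k * shear_rate ** (n - 1.0)


def carreau_yasuda_viscosity(shear_rate, eta0, eta_inf, lam, a, n):
    """Carreau-Yasuda viscosity.

    eta0 and eta_inf are the zero- and infinite-shear viscosities, lam is a
    relaxation time, a sets the width of the transition and n is the
    power-law index of the shear-thinning region.
    """
    return eta_inf + (eta0 - eta_inf) * (
        1.0 + (lam * shear_rate) ** a) ** ((n - 1.0) / a)


# Arbitrary illustrative parameters (reduced units).
params = dict(eta0=10.0, eta_inf=0.1, lam=5.0, a=2.0, n=0.5)
for rate in (0.0, 0.01, 0.1, 1.0, 10.0):
    print(rate, carreau_yasuda_viscosity(rate, **params))
```

    At low shear rate the Carreau-Yasuda curve sits on the Newtonian plateau η₀, which matches the isotropic, Newtonian regime described above, and it decreases monotonically once λγ̇ approaches 1.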

  8. Plant Identification Based on Leaf Midrib Cross-Section Images Using Fractal Descriptors

    PubMed Central

    da Silva, Núbia Rosa; Florindo, João Batista; Gómez, María Cecilia; Rossatto, Davi Rodrigo; Kolb, Rosana Marta; Bruno, Odemir Martinez

    2015-01-01

    The correct identification of plants is a common necessity not only for researchers but also for the lay public. Recently, computational methods have been employed to facilitate this task; however, there are few studies given the wide diversity of plants occurring in the world. This study proposes to analyse images obtained from cross-sections of the leaf midrib using fractal descriptors. These descriptors are obtained from the fractal dimension of the object computed at a range of scales. In this way, they provide rich information regarding the spatial distribution of the analysed structure and, as a consequence, they measure the multiscale morphology of the object of interest. In Biology, such morphology is of great importance because it is related to evolutionary aspects and is successfully employed to characterize and discriminate among different biological structures. Here, the fractal descriptors are used to identify the species of plants based on the image of their leaves. A large number of samples are examined: 606 leaf samples of 50 species from the Brazilian flora. The results are compared to other imaging methods in the literature and demonstrate that fractal descriptors are precise and reliable in the taxonomic process of plant species identification. PMID:26091501

  9. Cytoskeleton Molecular Motors: Structures and Their Functions in Neuron

    PubMed Central

    Xiao, Qingpin; Hu, Xiaohui; Wei, Zhiyi; Tam, Kin Yip

    2016-01-01

    Cells make use of molecular motors to transport small molecules, macromolecules and cellular organelles to target regions to execute biological functions, which is of utmost importance for polarized cells such as neurons. In particular, cytoskeleton motors play fundamental roles in neuron polarization, extension, shape and neurotransmission. Cytoskeleton motors comprise myosin, kinesin and cytoplasmic dynein. F-actin filaments act as myosin tracks, while kinesin and cytoplasmic dynein move on microtubules. Cytoskeleton motors work together to build a highly polarized and regulated system in neuronal cells via different molecular mechanisms and functional regulations. This review discusses the structures and working mechanisms of the cytoskeleton motors in neurons. PMID:27570482

  10. On calculating the equilibrium structure of molecular crystals.

    SciTech Connect

    Mattsson, Ann Elisabet; Wixom, Ryan R.; Mattsson, Thomas Kjell Rene

    2010-03-01

    The difficulty of calculating the ambient properties of molecular crystals, such as the explosive PETN, has long hampered much needed computational investigations of these materials. One reason for the shortcomings is that the exchange-correlation functionals available for Density Functional Theory (DFT) based calculations do not correctly describe the weak intermolecular van der Waals' forces present in molecular crystals. However, this weak interaction also poses other challenges for the computational schemes used. We will discuss these issues in the context of calculations of lattice constants and structure of PETN with a number of different functionals, and also discuss if these limitations can be circumvented for studies at non-ambient conditions.