Quantum descriptors for predictive toxicology of halogenated aliphatic hydrocarbons.
Trohalaki, S; Pachter, R
2003-04-01
In order to improve Quantitative Structure-Activity Relationships (QSARs) for halogenated aliphatics (HA) and to better understand the biophysical mechanism of toxic response to these ubiquitous chemicals, we employ improved quantum-mechanical descriptors to account for HA electrophilicity. We demonstrate that, unlike the lowest unoccupied molecular orbital energy, ELUMO, which was previously used as a descriptor, the electron affinity can be systematically improved by application of higher levels of theory. We also show that employing the reciprocal of ELUMO, which is more consistent with frontier molecular orbital (FMO) theory, improves the correlations with in vitro toxicity data. We offer explanations based on FMO theory for a result from our previous work, in which the LUMO energies of HA anions correlated surprisingly well with in vitro toxicity data. Additional descriptors are also suggested and interpreted in terms of the accepted biophysical mechanism of toxic response to HAs and new QSARs are derived for various chemical categories that compose the data set employed. These alternate descriptors provide important insight and could benefit other classes of compounds where the biophysical mechanism of toxic response involves dissociative attachment.
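The comparison between ELUMO and its reciprocal as a descriptor can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names `pearson_r` and `compare_descriptor_forms` are hypothetical, and the sketch assumes the correlation is measured with a plain Pearson coefficient against a single toxicity endpoint.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def compare_descriptor_forms(e_lumo, toxicity):
    """Correlate toxicity with E_LUMO and with 1/E_LUMO (hypothetical helper).

    Assumes no E_LUMO value is zero; energies and toxicity values are
    supplied by the caller in whatever units the data set uses.
    """
    reciprocal = [1.0 / e for e in e_lumo]
    return {
        "E_LUMO": pearson_r(e_lumo, toxicity),
        "1/E_LUMO": pearson_r(reciprocal, toxicity),
    }
```

Comparing the two returned coefficients on the same data set shows whether the reciprocal form tracks the endpoint more closely, which is the kind of comparison the abstract describes.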
Visual analytics in cheminformatics: user-supervised descriptor selection for QSAR methods.
Martínez, María Jimena; Ponzoni, Ignacio; Díaz, Mónica F; Vazquez, Gustavo E; Soto, Axel J
2015-01-01
The design of QSAR/QSPR models is a challenging problem, where the selection of the most relevant descriptors constitutes a key step of the process. Several feature selection methods that address this step concentrate on statistical associations among descriptors and target properties, whereas chemical knowledge is left out of the analysis. For this reason, the interpretability and generality of the QSAR/QSPR models obtained by these feature selection methods are drastically affected. Therefore, an approach for integrating domain experts' knowledge in the selection process is needed to increase the confidence in the final set of descriptors. In this paper we propose a software tool, named Visual and Interactive DEscriptor ANalysis (VIDEAN), that combines statistical methods with interactive visualizations for choosing a set of descriptors for predicting a target property. Domain expertise can be added to the feature selection process by means of an interactive visual exploration of data, aided by statistical tools and metrics based on information theory. Coordinated visual representations are presented for capturing different relationships and interactions among descriptors, target properties and candidate subsets of descriptors. The competencies of the proposed software were assessed through different scenarios. These scenarios reveal how an expert can use this tool to choose one subset of descriptors from a group of candidate subsets, or how to modify existing descriptor subsets and even incorporate new descriptors according to his or her own knowledge of the target property. The reported experiences showed the suitability of our software for selecting sets of descriptors with low cardinality, high interpretability, low redundancy and high statistical performance in a visual exploratory way.
Therefore, it is possible to conclude that the resulting tool allows the integration of a chemist's expertise in the descriptor selection process with a low cognitive effort in contrast with the alternative of using an ad-hoc manual analysis of the selected descriptors. Graphical abstract: VIDEAN allows the visual analysis of candidate subsets of descriptors for QSAR/QSPR. In the two panels on the top, users can interactively explore numerical correlations as well as co-occurrences in the candidate subsets through two interactive graphs.
Fjodorova, Natalja; Novič, Marjana
2012-01-01
The knowledge-based Toxtree expert system (SAR approach) was integrated with the statistically based counter propagation artificial neural network (CP ANN) model (QSAR approach) to contribute to a better mechanistic understanding of a carcinogenicity model for non-congeneric chemicals using Dragon descriptors and carcinogenic potency for rats as a response. The transparency of the CP ANN algorithm was demonstrated using an intrinsic mapping technique, specifically Kohonen maps. Chemical structures were represented by Dragon descriptors that express the structural and electronic features of molecules, such as their shape and the electronic surroundings related to the reactivity of molecules. It was illustrated how the descriptors are correlated with particular structural alerts (SAs) for carcinogenicity that have a recognized mechanistic link to carcinogenic activity. Moreover, the Kohonen mapping technique enables one to examine the separation of carcinogens and non-carcinogens (for rats) within a family of chemicals with a particular SA for carcinogenicity. The mechanistic interpretation of models is important for the evaluation of the safety of chemicals. PMID:24688639
Pang, Siu-Kwong
2017-03-30
Quantum chemical methods and molecular mechanics approaches face many challenges in drug metabolism studies because of insufficient accuracy, huge computational cost, or the lack of a clear molecular-level picture for building computational models. Low-cost QSAR methods can often be carried out even when the molecular-level picture is not well defined; however, they have difficulty identifying the mechanisms of drug metabolism and delineating the effects of chemical structure on drug toxicity because a number of molecular descriptors are difficult to interpret. To address this, it was proposed that mechanistically interpretable molecular descriptors be correlated with biological activity to establish structure-activity plots. The mechanistically interpretable molecular descriptors used in this study include electrophilicity and the mathematical function in the London formula for dispersion interaction, and they were calculated using quantum chemical methods. The biological activity is the lethality of anthracycline anticancer antibiotics, denoted as log LD50, which was obtained by intraperitoneal injection into mice. The results reveal that the plots for electrophilicity, which can be interpreted as the redox reactivity of anthracyclines, can describe oxidative degradation for detoxification and reductive bioactivation for toxicity induction. The plots for the dispersion interaction function, which represent the attraction between anthracyclines and biomolecules, can describe efflux from and influx into the target cells of toxicity. The plots can also identify three structural scaffolds of anthracyclines that have different metabolic pathways, resulting in their different toxicity behavior. This structure-dependent toxicity behavior revealed in the plots can provide perspectives on the design of anthracycline anticancer antibiotics.
Copyright © Bentham Science Publishers.
Drosos, Juan Carlos; Viola-Rhenals, Maricela; Vivas-Reyes, Ricardo
2010-06-25
Polycyclic aromatic compounds (PAHs) are of concern in environmental chemistry and toxicology. In the present work, a QSRR study was performed for 209 previously reported PAHs using quantum-mechanical descriptors and descriptors from other sources, estimated by different approaches. The B3LYP/6-31G* level of theory was used for geometry optimization and for the quantum-mechanical variables. A good linear relationship between the gas-chromatographic retention index and electronic or topological descriptors was found by stepwise linear regression analysis. The molecular polarizability (alpha) and the second-order Kier and Hall molecular connectivity index ((2)chi) correlated significantly with the retention index, with squared coefficients of determination of R(2)=0.950 and 0.962, respectively. A one-variable QSRR model is presented for each descriptor, and both models demonstrate significant predictive capacity, established with the leave-many-out (LMO, excluding 25% of rows) cross-validation coefficient q(2)(CV-LMO25%) (0.947 and 0.960, respectively). Furthermore, the physicochemical interpretation of the selected descriptors allowed a detailed explanation of the source of the observed statistical correlation. The model analysis suggests that only one descriptor is sufficient to establish a consistent retention index-structure relationship. Moderate or non-significant improvement was observed in the quantitative results or statistical validation parameters when more terms were introduced into the predictive equation. The proposed one-parameter QSRR model offers a consistent scheme to predict chromatographic properties of PAH compounds. Copyright 2010 Elsevier B.V. All rights reserved.
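A leave-many-out cross-validation of a one-descriptor linear model can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the number of random repeats, and the particular q(2) definition (one minus PRESS over the sum of squares about the training-set mean) are all choices made here, and other q(2) conventions exist.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


def q2_leave_many_out(x, y, fraction=0.25, repeats=100, seed=0):
    """Hypothetical helper: q2 from repeated random hold-outs.

    Each repeat withholds `fraction` of the rows, refits the line on the
    remaining rows, and predicts the withheld rows.
    """
    rng = random.Random(seed)
    n = len(x)
    k = max(1, round(fraction * n))
    press = 0.0  # predictive residual sum of squares
    total = 0.0  # squared deviations from the training-set mean
    for _ in range(repeats):
        held_out = set(rng.sample(range(n), k))
        train = [i for i in range(n) if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        train_mean = sum(y[i] for i in train) / len(train)
        for i in held_out:
            press += (y[i] - (slope * x[i] + intercept)) ** 2
            total += (y[i] - train_mean) ** 2
    return 1.0 - press / total
```

Here `x` would hold one descriptor (for example polarizability) and `y` the retention indices; a q(2) close to the fitted R(2) indicates that the single-descriptor model is not overfitted.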
A review on principles, theory and practices of 2D-QSAR.
Roy, Kunal; Das, Rudra Narayan
2014-01-01
The central axiom of science purports the explanation of every natural phenomenon using all possible logics coming from pure as well as mixed scientific background. The quantitative structure-activity relationship (QSAR) analysis is a study correlating the behavioral manifestation of compounds with their structures employing the interdisciplinary knowledge of chemistry, mathematics, biology as well as physics. Several studies have attempted to mathematically correlate the chemistry and property (physicochemical/ biological/toxicological) of molecules using various computationally or experimentally derived quantitative parameters termed as descriptors. The dimensionality of the descriptors depends on the type of algorithm employed and defines the nature of QSAR analysis. The most interesting feature of predictive QSAR models is that the behavior of any new or even hypothesized molecule can be predicted by the use of the mathematical equations. The phrase "2D-QSAR" signifies development of QSAR models using 2D-descriptors. Such predictor variables are the most widely practised ones because of their simple and direct mathematical algorithmic nature involving no time consuming energy computations and having reproducible operability. 2D-descriptors have a deluge of contributions in extracting chemical attributes and they are also capable of representing the 3D molecular features to some extent; although in no case they should be considered as the ultimate one, since they often suffer from the problems of intercorrelation, insufficient chemical information as well as lack of interpretation. However, by following rational approaches, novel 2D-descriptors may be developed to obviate various existing problems giving potential 2D-QSAR equations, thereby solving the innumerable chemical mysteries still unexplored.
Nandi, Sisir; Monesi, Alessandro; Drgan, Viktor; Merzel, Franci; Novič, Marjana
2013-10-30
In the present study, we show the correlation of quantum chemical structural descriptors with the activation barriers of the Diels-Alder ligations. A set of 72 non-catalysed Diels-Alder reactions were subjected to quantitative structure-activation barrier relationship (QSABR) under the framework of theoretical quantum chemical descriptors calculated solely from the structures of diene and dienophile reactants. Experimental activation barrier data were obtained from literature. Descriptors were computed using Hartree-Fock theory using 6-31G(d) basis set as implemented in Gaussian 09 software. Variable selection and model development were carried out by stepwise multiple linear regression methodology. Predictive performance of the quantitative structure-activation barrier relationship (QSABR) model was assessed by training and test set concept and by calculating leave-one-out cross-validated Q2 and predictive R2 values. The QSABR model can explain and predict 86.5% and 80% of the variances, respectively, in the activation energy barrier training data. Alternatively, a neural network model based on back propagation of errors was developed to assess the nonlinearity of the sought correlations between theoretical descriptors and experimental reaction barriers. A reasonable predictability for the activation barrier of the test set reactions was obtained, which enabled an exploration and interpretation of the significant variables responsible for Diels-Alder interaction between dienes and dienophiles. Thus, studies in the direction of QSABR modelling that provide efficient and fast prediction of activation barriers of the Diels-Alder reactions turn out to be a meaningful alternative to transition state theory based computation.
TALMACIU, MONA MARIA; BODOKI, EDE; OPREAN, RADU
2016-01-01
Background and aim: Beta-adrenergic antagonists have been established as first line treatment in the medical management of hypertension, acute coronary syndrome and other cardiovascular diseases, as well as for the prevention of initial episodes of gastrointestinal bleeding in patients with cirrhosis and esophageal varices, glaucoma, and have recently become the main form of treatment of infantile hemangiomas. The aim of the present study is to calculate for 14 beta-blockers several quantum chemical descriptors in order to interpret various molecular properties such as electronic structure, conformation, reactivity, in the interest of determining how such descriptors could have an impact on our understanding of the experimental observations and describing various aspects of chemical binding of beta-blockers in terms of these descriptors. Methods: The 2D chemical structures of the beta-blockers (14 molecules with one stereogenic center) were cleaned in 3D, their geometry was preoptimized using the software MOPAC2012, by PM6 method, and then further refined using standard settings in MOE; HOMO and LUMO descriptors were calculated using semi-empirical molecular orbital methods AM1, MNDO and PM3, for the lowest energy conformers and the quantum chemical descriptors (HLG, electronegativity, chemical potential, hardness and softness, electrophilicity) were then calculated. Results: According to HOMO-LUMO gap and the chemical hardness the most stable compounds are alprenolol, bisoprolol and esmolol. The softness values calculated for the study molecules revolve around 0.100. Propranolol, sotalol and timolol have among the highest electrophilicity index of the studied beta-blocker molecules. Results obtained from calculations showed that acebutolol, atenolol, timolol and sotalol have the highest values for the electronegativity index.
Conclusions: The future aim is to determine whether it is possible to find a valid correlation between these descriptors and the physicochemical behavior of the molecules from this class. The HLG could be correlated to the experimentally recorded electrochemical properties of the molecules. HOMO could be correlated to the observed oxidation potential, since the required voltage is related to the energy of the HOMO, because only the electron from this orbital is involved in the oxidation process. PMID:27857521
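The descriptors listed in the Methods can be derived from the frontier orbital energies with the commonly used Koopmans-type expressions, sketched below. The abstract does not state which conventions the authors used, so the factors of one half in the hardness and softness are an assumption (some authors define hardness as the full gap and softness as its plain reciprocal), and the function name is hypothetical.

```python
def reactivity_descriptors(e_homo, e_lumo):
    """Global reactivity descriptors from HOMO/LUMO energies (same unit, e.g. eV).

    Conventions assumed here (others exist):
      gap               HLG   = E_LUMO - E_HOMO
      electronegativity chi   = -(E_HOMO + E_LUMO) / 2
      chemical potential mu   = -chi
      hardness          eta   = (E_LUMO - E_HOMO) / 2
      softness          S     = 1 / (2 * eta)
      electrophilicity  omega = mu**2 / (2 * eta)
    """
    gap = e_lumo - e_homo
    chi = -(e_homo + e_lumo) / 2.0
    mu = -chi
    eta = gap / 2.0
    return {
        "HLG": gap,
        "electronegativity": chi,
        "chemical_potential": mu,
        "hardness": eta,
        "softness": 1.0 / (2.0 * eta),
        "electrophilicity": mu ** 2 / (2.0 * eta),
    }
```

Under these conventions a larger gap gives a larger hardness and a smaller softness, which is the sense in which the abstract ranks the compounds with the largest HLG and hardness as the most stable.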
Rasulev, Bakhtiyor; Kusić, Hrvoje; Leszczynska, Danuta; Leszczynski, Jerzy; Koprivanac, Natalija
2010-05-01
The goal of the study was to predict toxicity in vivo caused by aromatic compounds structured with a single benzene ring and the presence or absence of different substituent groups such as hydroxyl-, nitro-, amino-, methyl-, methoxy-, etc., by using QSAR/QSPR tools. A Genetic Algorithm and multiple regression analysis were applied to select the descriptors and to generate the correlation models. The most predictive model is shown to be the 3-variable model which also has a good ratio of the number of descriptors and their predictive ability to avoid overfitting. The main contributions to the toxicity were shown to be the polarizability weighted MATS2p and the number of certain groups C-026 descriptors. The GA-MLRA approach showed good results in this study, which allows the building of a simple, interpretable and transparent model that can be used for future studies of predicting toxicity of organic compounds to mammals.
2018-05-01
the descriptors were correlated to experimental rate constants. The five descriptors fell into one of two categories: whole molecule descriptors or...model based on these correlations . Although that goal was not achieved in full, considerable progress has been made, and there is potential for a...readme.txt) and compiled. We then searched for correlations between the calculated properties from theory and the experimental measurements of reaction rate
NASA Astrophysics Data System (ADS)
Kweun, Joshua Minwoo; Li, Chenzhe; Zheng, Yongping; Cho, Maenghyo; Kim, Yoon Young; Cho, Kyeongjae
2016-05-01
Designing metal-oxides consisting of earth-abundant elements has been a crucial issue to replace precious metal catalysts. To achieve efficient screening of metal-oxide catalysts via bulk descriptors rather than surface descriptors, we investigated the relationship between the electronic structure of bulk and that of the surface for lanthanum-based perovskite oxides, LaMO3 (M = Ti, V, Cr, Mn, Fe, Co, Ni, Cu). Through density functional theory calculations, we examined the d-band occupancy of the bulk and surface transition-metal atoms (nBulk and nSurf) and the adsorption energy of an oxygen atom (Eads) on (001), (110), and (111) surfaces. For the (001) surface, we observed strong correlation between the nBulk and nSurf with an R-squared value over 94%, and the result was interpreted in terms of ligand field splitting and antibonding/bonding level splitting. Moreover, the Eads on the surfaces was highly correlated with the nBulk with an R-squared value of more than 94%, and different surface relaxations could be explained by the bulk electronic structure (e.g., LaMnO3 vs. LaTiO3). These results suggest that a bulk-derived descriptor such as nBulk can be used to screen metal-oxide catalysts.
Prediction of Partition Coefficients of Organic Compounds between SPME/PDMS and Aqueous Solution
Chao, Keh-Ping; Lu, Yu-Ting; Yang, Hsiu-Wen
2014-01-01
Polydimethylsiloxane (PDMS) is commonly used as the coated polymer in the solid phase microextraction (SPME) technique. In this study, the partition coefficients of organic compounds between SPME/PDMS and the aqueous solution were compiled from the literature sources. The correlation analysis for partition coefficients was conducted to interpret the effect of their physicochemical properties and descriptors on the partitioning process. The PDMS-water partition coefficients were significantly correlated to the polarizability of organic compounds (r = 0.977, p < 0.05). An empirical model, consisting of the polarizability, the molecular connectivity index, and an indicator variable, was developed to appropriately predict the partition coefficients of 61 organic compounds for the training set. The predictive ability of the empirical model was demonstrated by using it on a test set of 26 chemicals not included in the training set. The empirical model, applying the straightforward calculated molecular descriptors, for estimating the PDMS-water partition coefficient will contribute to the practical applications of the SPME technique. PMID:24534804
Ligand Electron Density Shape Recognition Using 3D Zernike Descriptors
NASA Astrophysics Data System (ADS)
Gunasekaran, Prasad; Grandison, Scott; Cowtan, Kevin; Mak, Lora; Lawson, David M.; Morris, Richard J.
We present a novel approach to crystallographic ligand density interpretation based on Zernike shape descriptors. Electron density for a bound ligand is expanded in an orthogonal polynomial series (3D Zernike polynomials) and the coefficients from this expansion are employed to construct rotation-invariant descriptors. These descriptors can be compared highly efficiently against large databases of descriptors computed from other molecules. In this manuscript we describe this process and show initial results from an electron density interpretation study on a dataset containing over a hundred OMIT maps. We could identify the correct ligand as the first hit in about 30 % of the cases, within the top five in a further 30 % of the cases, and giving rise to an 80 % probability of getting the correct ligand within the top ten matches. In all but a few examples, the top hit was highly similar to the correct ligand in both shape and chemistry. Further extensions and intrinsic limitations of the method are discussed.
Local Descriptors of Dynamic and Nondynamic Correlation.
Ramos-Cordoba, Eloy; Matito, Eduard
2017-06-13
Quantitatively accurate electronic structure calculations rely on the proper description of electron correlation. A judicious choice of the approximate quantum chemistry method depends upon the importance of dynamic and nondynamic correlation, which is usually assessed by scalar measures. Existing measures of electron correlation do not consider separately the regions of the Cartesian space where dynamic or nondynamic correlation are most important. We introduce real-space descriptors of dynamic and nondynamic electron correlation that admit orbital decomposition. Integration of the local descriptors yields global numbers that can be used to quantify dynamic and nondynamic correlation. Illustrative examples over different chemical systems with varying electron correlation regimes are used to demonstrate the capabilities of the local descriptors. Since the expressions only require orbitals and occupation numbers, they can be readily applied in the context of local correlation methods, hybrid methods, density matrix functional theory, and fractional-occupancy density functional theory.
Ivanciuc, O; Ivanciuc, T; Klein, D J; Seitz, W A; Balaban, A T
2001-02-01
Quantitative structure-retention relationships (QSRR) represent statistical models that quantify the connection between the molecular structure and the chromatographic retention indices of organic compounds, allowing the prediction of retention indices of novel, not yet synthesized compounds, solely from their structural descriptors. Using multiple linear regression, QSRR models for the gas chromatographic Kováts retention indices of 129 alkylbenzenes are generated using molecular graph descriptors. The correlational ability of structural descriptors computed from 10 molecular matrices is investigated, showing that the novel reciprocal matrices give numerical indices with improved correlational ability. A QSRR equation with 5 graph descriptors gives the best calibration and prediction results, demonstrating the usefulness of the molecular graph descriptors in modeling chromatographic retention parameters. The sequential orthogonalization of descriptors suggests simpler QSRR models by eliminating redundant structural information.
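As an illustration of a graph descriptor built from a reciprocal matrix, the sketch below computes the reciprocal distance matrix of a hydrogen-suppressed molecular graph and sums its upper triangle, which gives the Harary index. The abstract does not list the 10 matrices used, so choosing this particular matrix and index is an assumption, and the function names are hypothetical. The graph is assumed to be connected.

```python
from collections import deque


def distance_matrix(adjacency):
    """Topological (bond-count) distances via breadth-first search.

    `adjacency[i]` lists the neighbours of atom i; the graph must be connected.
    """
    n = len(adjacency)
    dist = [[0] * n for _ in range(n)]
    for source in range(n):
        seen = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        for v, d in seen.items():
            dist[source][v] = d
    return dist


def reciprocal_distance_matrix(dist):
    """Off-diagonal entries are 1/d_ij; diagonal entries stay zero."""
    return [
        [0.0 if i == j else 1.0 / d for j, d in enumerate(row)]
        for i, row in enumerate(dist)
    ]


def harary_index(adjacency):
    """Sum of reciprocal distances over all atom pairs."""
    rd = reciprocal_distance_matrix(distance_matrix(adjacency))
    n = len(rd)
    return sum(rd[i][j] for i in range(n) for j in range(i + 1, n))
```

For a three-carbon chain, `harary_index([[1], [0, 2], [1]])` sums 1 + 1 + 1/2. Because reciprocal entries weight near neighbours more heavily than distant atom pairs, indices of this kind respond differently to branching than indices built on the plain distance matrix.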
Lu, Tong; Tai, Chiew-Lan; Yang, Huafei; Cai, Shijie
2009-08-01
We present a novel knowledge-based system to automatically convert real-life engineering drawings to content-oriented high-level descriptions. The proposed method essentially turns the complex interpretation process into two parts: knowledge representation and knowledge-based interpretation. We propose a new hierarchical descriptor-based knowledge representation method to organize the various types of engineering objects and their complex high-level relations. The descriptors are defined using an Extended Backus Naur Form (EBNF), facilitating modification and maintenance. When interpreting a set of related engineering drawings, the knowledge-based interpretation system first constructs an EBNF-tree from the knowledge representation file, then searches for potential engineering objects guided by a depth-first order of the nodes in the EBNF-tree. Experimental results and comparisons with other interpretation systems demonstrate that our knowledge-based system is accurate and robust for high-level interpretation of complex real-life engineering projects.
Yu, Zhiguo; Nguyen, Thang; Dhombres, Ferdinand; Johnson, Todd; Bodenreider, Olivier
2018-01-01
Extracting and understanding information, themes and relationships from large collections of documents is an important task for biomedical researchers. Latent Dirichlet Allocation is an unsupervised topic modeling technique using the bag-of-words assumption that has been applied extensively to unveil hidden thematic information within large sets of documents. In this paper, we added MeSH descriptors to the bag-of-words assumption to generate ‘hybrid topics’, which are mixed vectors of words and descriptors. We evaluated this approach on the quality and interpretability of topics in both a general corpus and a specialized corpus. Our results demonstrated that the coherence of ‘hybrid topics’ is higher than that of regular bag-of-words topics in the specialized corpus. We also found that the proportion of topics that are not associated with MeSH descriptors is higher in the specialized corpus than in the general corpus. PMID:29295179
Using network analysis to study behavioural phenotypes: an example using domestic dogs.
Goold, Conor; Vas, Judit; Olsen, Christine; Newberry, Ruth C
2016-10-01
Phenotypic integration describes the complex interrelationships between organismal traits, traditionally focusing on morphology. Recently, research has sought to represent behavioural phenotypes as composed of quasi-independent latent traits. Concurrently, psychologists have opposed latent variable interpretations of human behaviour, proposing instead a network perspective envisaging interrelationships between behaviours as emerging from causal dependencies. Network analysis could also be applied to understand integrated behavioural phenotypes in animals. Here, we assimilate this cross-disciplinary progression of ideas by demonstrating the use of network analysis on survey data collected on behavioural and motivational characteristics of police patrol and detection dogs (Canis lupus familiaris). Networks of conditional independence relationships illustrated a number of functional connections between descriptors, which varied between dog types. The most central descriptors denoted desirable characteristics in both patrol and detection dog networks, with 'Playful' being widely correlated and possessing mediating relationships between descriptors. Bootstrap analyses revealed the stability of network results. We discuss the results in relation to previous research on dog personality, and benefits of using network analysis to study behavioural phenotypes. We conclude that a network perspective offers widespread opportunities for advancing the understanding of phenotypic integration in animal behaviour.
Hansson, Mari; Pemberton, John; Engkvist, Ola; Feierberg, Isabella; Brive, Lars; Jarvis, Philip; Zander-Balderud, Linda; Chen, Hongming
2014-06-01
High-throughput screening (HTS) is widely used in the pharmaceutical industry to identify novel chemical starting points for drug discovery projects. The current study focuses on the relationship between molecular hit rate in recent in-house HTS and four common molecular descriptors: lipophilicity (ClogP), size (heavy atom count, HEV), fraction of sp(3)-hybridized carbons (Fsp3), and fraction of molecular framework (f(MF)). The molecular hit rate is defined as the fraction of times the molecule has been assigned as active in the HTS campaigns where it has been screened. Beta-binomial statistical models were built to model the molecular hit rate as a function of these descriptors. The advantage of the beta-binomial statistical models is that the correlation between the descriptors is taken into account. Higher degree polynomial terms of the descriptors were also added into the beta-binomial statistic model to improve the model quality. The relative influence of different molecular descriptors on molecular hit rate has been estimated, taking into account that the descriptors are correlated to each other through applying beta-binomial statistical modeling. The results show that ClogP has the largest influence on the molecular hit rate, followed by Fsp3 and HEV. f(MF) has only a minor influence besides its correlation with the other molecular descriptors. © 2013 Society for Laboratory Automation and Screening.
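The hit-rate definition and the beta-binomial likelihood behind such a model can be sketched as follows. How the authors link the descriptors to the distribution parameters is not stated in the abstract, so this sketch stops at the probability mass function; the function names are hypothetical.

```python
from math import exp, lgamma


def molecular_hit_rate(n_active, n_screened):
    """Fraction of screening campaigns in which the molecule was active."""
    return n_active / n_screened


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_logpmf(k, n, alpha, beta):
    """Log-probability of k actives in n screens under a beta-binomial model.

    alpha and beta are the shape parameters of the beta distribution
    placed on the per-screen hit probability.
    """
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)
```

In a regression setting one would typically let the mean alpha / (alpha + beta) depend on ClogP, HEV, Fsp3 and f(MF) and maximise the summed log-likelihood over all molecules; that step is left out because its exact form here would be a guess.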
Yin, Tong; König, Sven
2018-03-01
The most common approach in dairy cattle to prove genotype by environment interactions is to apply a multiple-trait model, considering the same traits in different environments as different traits. We enhanced such concepts by defining continuous phenotypic, genetic, and genomic herd descriptors, and applying random regression sire models. Traits of interest were test-day traits for milk yield, fat percentage, protein percentage, and somatic cell score, considering 267,393 records from 32,707 first-lactation Holstein cows. Cows were born in the years 2010 to 2013, and kept in 52 large-scale herds from 2 federal states of north-east Germany. The average number of genotyped cows per herd (45,613 single nucleotide polymorphism markers per cow) was 133.5 (range: 45 to 415 genotyped cows). Genomic herd descriptors were (1) the level of linkage disequilibrium (r²) within specific chromosome segments, and (2) the average allele frequency for single nucleotide polymorphisms in close distance to a functional mutation. Genetic herd descriptors were (1) the intra-herd inbreeding coefficient, and (2) the percentage of daughters from foreign sires. Phenotypic herd descriptors were (1) herd size, and (2) the herd mean for nonreturn rate. Most correlations among herd descriptors were close to 0, indicating independence of genomic, genetic, and phenotypic characteristics. Heritabilities for milk yield increased with increasing intra-herd linkage disequilibrium, inbreeding, and herd size. Genetic correlations in same traits between adjacent levels of herd descriptors were close to 1, but declined for descriptor levels in greater distance. Genetic correlation declines were more obvious for somatic cell score, compared with test-day traits with larger heritabilities (fat percentage and protein percentage). Also, for milk yield, alterations of herd descriptor levels had an obvious effect on heritabilities and genetic correlations.
By trend, multiple trait model results (based on created discrete herd classes) confirmed the random regression estimates. Identified alterations of breeding values in dependency of herd descriptors suggest utilization of specific sires for specific herd structures, offering new possibilities to improve sire selection strategies. Regarding genomic selection designs and genetic gain transfer into commercial herds, cow herds for the utilization in cow training sets should reflect the genomic, genetic, and phenotypic pattern of the broad population. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Koutsoukas, Alexios; Paricharak, Shardul; Galloway, Warren R J D; Spring, David R; Ijzerman, Adriaan P; Glen, Robert C; Marcus, David; Bender, Andreas
2014-01-27
Chemical diversity is a widely applied approach to select structurally diverse subsets of molecules, often with the objective of maximizing the number of hits in biological screening. While many methods exist in the area, few systematic comparisons using current descriptors in particular with the objective of assessing diversity in bioactivity space have been published, and this shortage is what the current study is aiming to address. In this work, 13 widely used molecular descriptors were compared, including fingerprint-based descriptors (ECFP4, FCFP4, MACCS keys), pharmacophore-based descriptors (TAT, TAD, TGT, TGD, GpiDAPH3), shape-based descriptors (rapid overlay of chemical structures (ROCS) and principal moments of inertia (PMI)), a connectivity-matrix-based descriptor (BCUT), physicochemical-property-based descriptors (prop2D), and a more recently introduced molecular descriptor type (namely, "Bayes Affinity Fingerprints"). We assessed both the similar behavior of the descriptors in assessing the diversity of chemical libraries, and their ability to select compounds from libraries that are diverse in bioactivity space, which is a property of much practical relevance in screening library design. This is particularly evident, given that many future targets to be screened are not known in advance, but that the library should still maximize the likelihood of containing bioactive matter also for future screening campaigns. Overall, our results showed that descriptors based on atom topology (i.e., fingerprint-based descriptors and pharmacophore-based descriptors) correlate well in rank-ordering compounds, both within and between descriptor types. On the other hand, shape-based descriptors such as ROCS and PMI showed weak correlation with the other descriptors utilized in this study, demonstrating significantly different behavior. 
We then applied eight of the molecular descriptors compared in this study to sample a diverse subset of sample compounds (4%) from an initial population of 2587 compounds, covering the 25 largest human activity classes from ChEMBL and measured the coverage of activity classes by the subsets. Here, it was found that "Bayes Affinity Fingerprints" achieved an average coverage of 92% of activity classes. Using the descriptors ECFP4, GpiDAPH3, TGT, and random sampling, 91%, 84%, 84%, and 84% of the activity classes were represented in the selected compounds respectively, followed by BCUT, prop2D, MACCS, and PMI (in order of decreasing performance). In addition, we were able to show that there is no visible correlation between compound diversity in PMI space and in bioactivity space, despite frequent utilization of PMI plots to this end. To summarize, in this work, we assessed which descriptors select compounds with high coverage of bioactivity space, and can hence be used for diverse compound selection for biological screening. In cases where multiple descriptors are to be used for diversity selection, this work describes which descriptors behave complementarily, and can hence be used jointly to focus on different aspects of diversity in chemical space.
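The selection-and-coverage experiment described above can be illustrated with a small sketch. The abstract does not state which selection algorithm was used, so the greedy MaxMin picker below is an assumption, as are the function names, the representation of fingerprints as sets of on-bits, and the toy data. It only shows the general idea: pick a structurally diverse subset, then count how many activity classes the subset covers.

```python
from typing import FrozenSet, List, Sequence


def tanimoto_distance(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """1 - Tanimoto similarity for fingerprints stored as sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union


def maxmin_pick(fps: Sequence[FrozenSet[int]], n_pick: int,
                seed_index: int = 0) -> List[int]:
    """Greedy MaxMin (assumed method): add the compound farthest from the picked set."""
    picked = [seed_index]
    # Distance from every compound to its nearest already-picked compound.
    nearest = [tanimoto_distance(fp, fps[seed_index]) for fp in fps]
    while len(picked) < min(n_pick, len(fps)):
        remaining = [i for i in range(len(fps)) if i not in picked]
        candidate = max(remaining, key=lambda i: nearest[i])
        picked.append(candidate)
        for i, fp in enumerate(fps):
            nearest[i] = min(nearest[i], tanimoto_distance(fp, fps[candidate]))
    return picked


def class_coverage(picked: Sequence[int], labels: Sequence[str]) -> float:
    """Fraction of all activity classes represented in the picked subset."""
    return len({labels[i] for i in picked}) / len(set(labels))


# Toy data (invented): two structural families, one activity class each.
fps = [frozenset({1, 2}), frozenset({1, 2, 3}),
       frozenset({7, 8}), frozenset({7, 8, 9})]
labels = ["A", "A", "B", "B"]
subset = maxmin_pick(fps, n_pick=2)
coverage = class_coverage(subset, labels)
```

Swapping `tanimoto_distance` for a distance in another descriptor space would allow the same coverage comparison between descriptor types.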
Antúnez, Lucía; Giménez, Ana; Maiche, Alejandro; Ares, Gastón
2015-01-01
To study the influence of 2 interpretational aids of front-of-package (FOP) nutrition labels (color code and text descriptors) on attentional capture and consumers' understanding of nutritional information. A full factorial design was used to assess the influence of color code and text descriptors using visual search and eye tracking. Ten trained assessors participated in the visual search study and 54 consumers completed the eye-tracking study. In the visual search study, assessors were asked to indicate whether there was a label high in fat within sets of mayonnaise labels with different FOP labels. In the eye-tracking study, assessors answered a set of questions about the nutritional content of labels. The researchers used logistic regression to evaluate the influence of interpretational aids of FOP nutrition labels on the percentage of correct answers. Analyses of variance were used to evaluate the influence of the studied variables on attentional measures and participants' response times. Response times were significantly higher for monochromatic FOP labels compared with color-coded ones (3,225 vs 964 ms; P < .001), which suggests that color codes increase attentional capture. The highest number and duration of fixations and visits were recorded on labels that did not include color codes or text descriptors (P < .05). The lowest percentage of incorrect answers was observed when the nutrient level was indicated using color code and text descriptors (P < .05). The combination of color codes and text descriptors seems to be the most effective alternative to increase attentional capture and understanding of nutritional information. Copyright © 2015 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Masand, Vijay H.; El-Sayed, Nahed N. E.; Bambole, Mukesh U.; Quazi, Syed A.
2018-04-01
Multiple discrete quantitative structure-activity relationship (QSAR) models were constructed for the anticancer activity of α,β-unsaturated carbonyl-based compounds, oxime and oxime ether analogues with a variety of substituents such as -Br, -OH, -OMe, etc. at different positions. A large pool of descriptors was considered for QSAR model building. The genetic algorithm (GA) available in QSARINS-Chem was used to choose the optimum number and set of descriptors for the multi-linear regression equations for a dataset of sixty-nine compounds. The newly developed five parametric models were subjected to exhaustive internal and external validation along with Y-scrambling using QSARINS-Chem, according to the OECD principles for QSAR model validation. The models were built using easily interpretable descriptors and accepted after confirming statistical robustness with high external predictive ability. The five parametric models were found to have R2 = 0.80 to 0.86, R2ex = 0.75 to 0.84, and CCCex = 0.85 to 0.90. The models indicate that the frequency of nitrogen and oxygen atoms separated by five bonds from each other and the internal electronic environment of the molecule correlate with the anticancer activity.
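The two kinds of statistic quoted above, R2 and the concordance correlation coefficient (CCC), can be computed as in the following sketch. It uses the standard definitions (coefficient of determination; Lin's CCC with population variances). The function names and numbers are illustrative assumptions, and the exact variants implemented in QSARINS-Chem may differ.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient (population variances)."""
    n = len(observed)
    mo, mp = fmean(observed), fmean(predicted)
    var_o = sum((o - mo) ** 2 for o in observed) / n
    var_p = sum((p - mp) ** 2 for p in predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted)) / n
    # The (mo - mp)^2 term penalises systematic bias, unlike Pearson's r.
    return 2 * cov / (var_o + var_p + (mo - mp) ** 2)


# Invented example: predictions that are perfectly correlated but shifted by +1.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 3.0, 4.0, 5.0]
r2 = r_squared(obs, pred)
ccc = concordance_cc(obs, pred)
```

In this example the predictions are perfectly correlated with the observations, yet both statistics drop below 1 because of the constant offset, which is why they are useful for external validation.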
Kilborn, Joshua P; Jones, David L; Peebles, Ernst B; Naar, David F
2017-04-01
Clustering data continues to be a highly active area of data analysis, and resemblance profiles are being incorporated into ecological methodologies as a hypothesis testing-based approach to clustering multivariate data. However, these new clustering techniques have not been rigorously tested to determine the performance variability based on the algorithm's assumptions or any underlying data structures. Here, we use simulation studies to estimate the statistical error rates for the hypothesis test for multivariate structure based on dissimilarity profiles (DISPROF). We concurrently tested a widely used algorithm that employs the unweighted pair group method with arithmetic mean (UPGMA) to estimate the proficiency of clustering with DISPROF as a decision criterion. We simulated unstructured multivariate data from different probability distributions with increasing numbers of objects and descriptors, and grouped data with increasing overlap, overdispersion for ecological data, and correlation among descriptors within groups. Using simulated data, we measured the resolution and correspondence of clustering solutions achieved by DISPROF with UPGMA against the reference grouping partitions used to simulate the structured test datasets. Our results highlight the dynamic interactions between dataset dimensionality, group overlap, and the properties of the descriptors within a group (i.e., overdispersion or correlation structure) that are relevant to resemblance profiles as a clustering criterion for multivariate data. These methods are particularly useful for multivariate ecological datasets that benefit from distance-based statistical analyses. We propose guidelines for using DISPROF as a clustering decision tool that will help future users avoid potential pitfalls during the application of methods and the interpretation of results.
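For readers unfamiliar with UPGMA, a minimal sketch of average-linkage agglomeration follows. In the study above the DISPROF hypothesis test decides when to stop splitting or merging; here a fixed target number of clusters stands in for that decision criterion, which is a simplifying assumption, as are the function name and the toy distance matrix.

```python
from itertools import combinations


def upgma(dist, n_clusters):
    """Average-linkage (UPGMA) agglomeration on a symmetric distance matrix.

    dist: list of lists; n_clusters: stop when this many groups remain
    (a stand-in for a statistical stopping rule such as DISPROF).
    Returns a sorted list of sorted index lists.
    """
    clusters = [[i] for i in range(len(dist))]

    def average_distance(a, b):
        # Unweighted mean of all pairwise distances between members of a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return sorted(clusters)


# Invented example: four objects on a line at positions 0, 1, 10 and 11.
positions = [0, 1, 10, 11]
dist = [[abs(p - q) for q in positions] for p in positions]
groups = upgma(dist, n_clusters=2)
```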
Clare, Brian W; Supuran, Claudiu T
2005-03-15
A QSAR based almost entirely on quantum theoretically calculated descriptors has been developed for a large and heterogeneous group of aromatic and heteroaromatic carbonic anhydrase inhibitors, using orbital energies, nodal angles, atomic charges, and some other intuitively appealing descriptors. Most calculations have been done at the B3LYP/6-31G* level of theory. For the first time we have treated five-membered rings by the same means that we have used for benzene rings in the past. Our flip regression technique has been expanded to encompass automatic variable selection. The statistical quality of the results, while not equal to those we have had with benzene derivatives, is very good considering the noncongeneric nature of the compounds. The most significant correlation was with charge on the atoms of the sulfonamide group, followed by the nodal orientation and the solvation energy calculated by COSMO and the charge polarization of the molecule calculated as the mean absolute Mulliken charge over all atoms.
Ehresmann, Bernd; de Groot, Marcel J; Alex, Alexander; Clark, Timothy
2004-01-01
New molecular descriptors based on statistical descriptions of the local ionization potential, local electron affinity, and the local polarizability at the surface of the molecule are proposed. The significance of these descriptors has been tested by calculating them for the Maybridge database in addition to our set of 26 descriptors reported previously. The new descriptors show little correlation with those already in use. Furthermore, the principal components of the extended set of descriptors for the Maybridge data show that especially the descriptors based on the local electron affinity extend the variance in our set of descriptors, which we have previously shown to be relevant to physical properties. The first nine principal components are shown to be most significant. As an example of the usefulness of the new descriptors, we have set up a QSPR model for boiling points using both the old and new descriptors.
Real-Time Visual Tracking through Fusion Features
Ruan, Yang; Wei, Zhenzhong
2016-01-01
Due to their high speed, correlation filters for object tracking have begun to receive increasing attention. Traditional object trackers based on correlation filters typically use a single type of feature. In this paper, we attempt to integrate multiple feature types to improve the performance, and we propose a new DD-HOG fusion feature that consists of discriminative descriptors (DDs) and histograms of oriented gradients (HOG). However, fusion features as multi-vector descriptors cannot be directly used in prior correlation filters. To overcome this difficulty, we propose a multi-vector correlation filter (MVCF) that can directly convolve with a multi-vector descriptor to obtain a single-channel response that indicates the location of an object. Experiments on the CVPR2013 tracking benchmark with the evaluation of state-of-the-art trackers show the effectiveness and speed of the proposed method. Moreover, we show that our MVCF tracker, which uses the DD-HOG descriptor, outperforms the structure-preserving object tracker (SPOT) in multi-object tracking because of its high speed and ability to address heavy occlusion. PMID:27347951
[The use of Cantonese pain descriptors among healthy young adults in Hong Kong].
Chung, W Y; Wong, C H; Yang, J C; Tan, P P
1998-12-01
The interpretation and expression of pain are closely related to an individual's social and cultural background. In conveying messages about pain, language and words (pain descriptors) are particularly significant in the assessment and evaluation of pain severity and its management. Therefore, the study of pain descriptors is crucial in clinical practice. The study was of exploratory-descriptive design. Samples were recruited by convenience. Data were collected by a structured self-administered questionnaire. Data obtained included demographic information and pain descriptors used by the subjects in various pain conditions. Data were analyzed by descriptive statistics. Pain descriptors were categorized according to nature, process, intensity, aggravating factors, accompanying symptoms and behavioral manifestation. The total number of pain descriptors (in Cantonese) based on real pain experience was 3017, with a mean of 3 (n = 986). The most commonly used descriptors concerned the nature of pain (41%). The intensity of pain constituted 20%. There was no significant difference in the number of pain descriptors between males and females. However, there was a significant difference in the type of pain descriptors used (Mfemale = 526, Mmale = 453, Z = -2.9729, p = 0.0029). There were also significant differences in the use of pain descriptors among the various age groups (χ2 = 15.0157, df = 4, P = 0.0047) and educational levels (χ2 = 11.2443, df = 4, P = 0.0240). The types of descriptors used increased with age and education level. This exploratory-descriptive study explores the use of pain descriptors among Chinese young adults in Hong Kong. The results show that females use more pain descriptors than males. The pain descriptors that females used are mostly of the nature type. The similarities and differences between these findings and those of Ho (1991) are compared.
ERIC Educational Resources Information Center
Papageorgiou, Spiros; Morgan, Rick; Becker, Valerie
2015-01-01
The purpose of this study was to enhance the meaning of the scores of an English-language test by developing performance levels and descriptors for reporting overall test performance. The levels and descriptors were intended to accompany the total scale scores of TOEFL Junior® Standard, an international test of English as a second/foreign…
A Community Database of Quartz Microstructures: Can we make measurements that constrain rheology?
NASA Astrophysics Data System (ADS)
Toy, Virginia; Peternell, Mark; Morales, Luiz; Kilian, Ruediger
2014-05-01
Rheology can be explored by performing deformation experiments, and by examining resultant microstructures and textures as links to naturally deformed rocks. Certain deformation processes are assumed to result in certain microstructures or textures, of which some might be uniquely indicative, while most cannot be unequivocally used to interpret the deformation mechanism and hence rheology. Despite our lack of a sufficient understanding of microstructure and texture forming processes, huge advances in texture measurements and quantification of microstructural parameters have been made. Unfortunately, there are neither standard procedures nor a common consensus on interpretation of many parameters (e.g. texture, grain size, shape preferred orientation). Textures (crystallographic preferred orientations) have been extensively correlated to the interpretation of deformation mechanisms. For example, the strength of textures can be measured either from the orientation distribution function (e.g. the J-index (Bunge, 1983) or texture entropy (Hielscher et al., 2007)) or via the intensity of pole figures. However, there are various ways to identify a representative volume, to measure, to process the data and to calculate an ODF and texture descriptors, which restricts their use as a comparative and diagnostic measurement. Microstructural parameters such as grain size, grain shape descriptors and fabric descriptors are similarly used to deduce and quantify deformation mechanisms. However, there is very little consensus on how to measure and calculate some of these very important parameters, e.g. grain size, which makes comparison of a vast amount of precious data in the literature very difficult. We propose establishing a community database of a standard set of such measurements, made using typical samples of different types of quartz rocks through standard methods of microstructural and texture quantification. 
We invite suggestions and discussion from the community about the worth of proposed parameters, methodology and usefulness and willingness to contribute to a database with free access of the community. We further invite institutions to participate on a benchmark analysis of a set of 'standard' thin sections. Bunge, H.J. 1983, Texture Analysis in Materials Science: mathematical methods. Butterworth-Heinemann, 593pp. Hielscher, R., Schaeben, H., Chateigner, D., 2007, On the entropy to texture index relationship in quantitative texture analysis: Journal of Applied Crystallography 40, 371-375.
Tham, S Y; Agatonovic-Kustrin, S
2002-05-15
A quantitative structure-retention relationship (QSRR) method was used to model reversed-phase high-performance liquid chromatography (RP-HPLC) separation of 18 selected amino acids. Retention data for phenylthiocarbamyl (PTC) amino acid derivatives were obtained using gradient elution on an ODS column with a mobile phase of varying acetonitrile and acetate buffer composition, containing 0.5 ml/l of triethylamine (TEA). The molecular structure of each amino acid was encoded with 36 calculated molecular descriptors. The correlation between the molecular descriptors and the retention time of the compounds in the calibration set was established using the genetic neural network method. A genetic algorithm (GA) was used to select important molecular descriptors and a supervised artificial neural network (ANN) was used to correlate mobile phase composition and selected descriptors with the experimentally derived retention times. Retention time values were used as the network's output and calculated molecular descriptors and mobile phase composition as the inputs. The best model with five input descriptors was chosen, and the significance of the selected descriptors for amino acid separation was examined. Results confirmed the dominant role of the organic modifier in such chromatographic systems in addition to lipophilicity (log P) and molecular size and shape (topological indices) of investigated solutes.
A new approach of sensorial evaluation of cooked cereal foods: fractal analysis of rheological data
NASA Astrophysics Data System (ADS)
Scher, J.; Hardy, J.
2002-11-01
An analytical method based on a fractal geometry concept was developed through the structure-texture relationship of solid-like crackers, flat bread and Bretzels. A universal testing machine was used to perform indentation tests. The graphs were so irregularly shaped that the usual interpretation was not possible. Nevertheless, the irregular shape, or “roughness”, displays self-similarity properties which can be interpreted in terms of an apparent fractal dimension of texture (D_T). A trained panel able to quantify the “hardness”, “porous structure” and “crispness” descriptors carried out sensorial characterisation of the products. High correlation was found between sensorial hardness and resistance to indentation, on the one hand, and between crispness and D_T, on the other. Mathematical modelling methods for complex systems can make a useful contribution to Food Science.
NASA Astrophysics Data System (ADS)
Masand, Vijay H.; El-Sayed, Nahed N. E.; Mahajan, Devidas T.; Mercader, Andrew G.; Alafeefy, Ahmed M.; Shibi, I. G.
2017-02-01
In the present work, sixty substituted 2-Phenylimidazopyridines previously reported with potent anti-human African trypanosomiasis (HAT) activity were selected to build genetic algorithm (GA) based QSAR models to determine the structural features that have significant correlation with the activity. Multiple QSAR models were built using easily interpretable descriptors that are directly associated with the presence or the absence of a structural scaffold, or a specific atom. All the QSAR models have been thoroughly validated according to the OECD principles. All the QSAR models are statistically very robust (R2 = 0.80-0.87) with high external predictive ability (CCCex = 0.81-0.92). The QSAR analysis reveals that the HAT activity has good correlation with the presence of five membered rings in the molecule.
Saint Arnault, Denise; Sakamoto, Shinji; Moriwaki, Aiko
2005-04-01
Research findings that depressed Americans endorse more negative self-related adjectives than controls may be related to a shared self-enhancement cultural frame. This study examines the relationship between negative core self-descriptors and depressive symptoms in 79 Japanese and 50 American women. Americans had more positive self-descriptions and core self-descriptors; however, there were no cultural group differences in the number of negative self-descriptors or core self-descriptors. There was a significant correlation between negative core self-descriptors and the Beck Depression Inventory (BDI) for Americans only, explaining 10.6% of the BDI variance. Analysis of variance revealed that there were significant BDI group differences for American negative core self-descriptors only. Theoretical possibilities are discussed.
Sakamoto, Shinji; Moriwaki, Aiko
2007-01-01
Research findings that depressed Americans endorse more negative self-related adjectives than controls may be related to a shared self-enhancement cultural frame. This study examines the relationship between negative core self-descriptors and depressive symptoms in 79 Japanese and 50 American women. Americans had more positive self-descriptions and core self-descriptors; however, there were no cultural group differences in the number of negative self-descriptors or core self-descriptors. There was a significant correlation between negative core self-descriptors and the Beck Depression Inventory (BDI) for Americans only, explaining 10.6% of the BDI variance. Analysis of variance revealed that there were significant BDI group differences for American negative core self-descriptors only. Theoretical possibilities are discussed. PMID:15902678
Descriptor selection for banana accessions based on univariate and multivariate analysis.
Brandão, L P; Souza, C P F; Pereira, V M; Silva, S O; Santos-Serejo, J A; Ledo, C A S; Amorim, E P
2013-05-14
Our objective was to establish a minimum number of morphological descriptors for the characterization of banana germplasm and evaluate the efficiency of removal of redundant characters, based on univariate and multivariate statistical analyses. Phenotypic characterization was made of 77 accessions from Bahia, Brazil, using 92 descriptors. The selection of the descriptors was carried out by principal components analysis (quantitative) and by entropy (multi-category). Efficiency of elimination was analyzed by a comparative study between the clusters formed, taking into consideration all 92 descriptors and smaller groups. The selected descriptors were analyzed with the Ward-MLM procedure and a combined matrix formed by the Gower algorithm. We were able to reduce the number of descriptors used for characterizing the banana germplasm (42%). The correlation between the matrices considering the 92 descriptors and the selected ones was 0.82, showing that the reduction in the number of descriptors did not influence estimation of genetic variability between the banana accessions. We conclude that removing these descriptors caused no loss of information, considering the groups formed from pre-established criteria, including subgroup/subspecies.
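The matrix comparison reported above (a correlation of 0.82 between the dissimilarity matrix from all 92 descriptors and the one from the reduced set) can be sketched as a Pearson correlation over the upper triangles of two symmetric matrices. The abstract does not say which correlation was used, so this choice, the function names and the toy matrices are assumptions.

```python
from math import sqrt


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries of a square symmetric matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def matrix_correlation(m1, m2):
    """Pearson correlation between corresponding pairwise dissimilarities."""
    a, b = upper_triangle(m1), upper_triangle(m2)
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Invented example: the "reduced" matrix is the full one rescaled by 2,
# so the rank and spacing of all pairwise dissimilarities are preserved.
full = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
reduced = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
r = matrix_correlation(full, reduced)
```

A value close to 1 indicates that the reduced descriptor set reproduces the pairwise relationships among accessions seen with the full set.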
Covariance descriptor fusion for target detection
NASA Astrophysics Data System (ADS)
Cukur, Huseyin; Binol, Hamidullah; Bal, Abdullah; Yavuz, Fatih
2016-05-01
Target detection is one of the most important topics for military or civilian applications. In order to address such detection tasks, hyperspectral imaging sensors provide useful image data containing both spatial and spectral information. Target detection has various challenging scenarios for hyperspectral images. To overcome these challenges, the covariance descriptor offers many advantages. The detection capability of the conventional covariance descriptor technique can be improved by fusion methods. In this paper, hyperspectral bands are clustered according to inter-band correlation. Target detection is then realized by fusion of covariance descriptor results based on the band clusters. The proposed combination technique is denoted Covariance Descriptor Fusion (CDF). The efficiency of the CDF is evaluated by applying it to hyperspectral imagery to detect man-made objects. The obtained results show that the CDF presents better performance than the conventional covariance descriptor.
Zhang, Yan-Yan; Liu, Houfu; Summerfield, Scott G; Luscombe, Christopher N; Sahi, Jasminder
2016-05-02
Estimation of uptake across the blood-brain barrier (BBB) is key to designing central nervous system (CNS) therapeutics. In silico approaches ranging from physicochemical rules to quantitative structure-activity relationship (QSAR) models are utilized to predict potential for CNS penetration of new chemical entities. However, there are still gaps in our knowledge of (1) the relationship between marketed human drug derived CNS-accessible chemical space and preclinical neuropharmacokinetic (neuroPK) data, (2) interpretability of the selected physicochemical descriptors, and (3) correlation of the in vitro human P-glycoprotein (P-gp) efflux ratio (ER) and in vivo rodent unbound brain-to-blood ratio (Kp,uu), as these are assays routinely used to predict clinical CNS exposure during drug discovery. To close these gaps, we explored the CNS druglike property boundaries of 920 marketed oral drugs (315 CNS and 605 non-CNS) and 846 compounds (54 CNS drugs and 792 proprietary GlaxoSmithKline compounds) with available rat Kp,uu data. The exact permeability coefficient (Pexact) and P-gp ER were determined for 176 compounds from the rat Kp,uu data set. Receiver operating characteristic analyses were performed to evaluate the predictive power of human P-gp ER for rat Kp,uu. Our data demonstrate that simple physicochemical rules (most acidic pKa ≥ 9.5 and TPSA < 100) in combination with P-gp ER < 1.5 provide mechanistic insights for filtering BBB permeable compounds. For comparison, six classification modeling methods were investigated using multiple sets of in silico molecular descriptors. We present a random forest model with excellent predictive power (∼0.75 overall accuracy) using the rat neuroPK data set. We also observed good concordance between the structural interpretation results and physicochemical descriptor importance from the Kp,uu classification QSAR model. 
In summary, we propose a novel, hybrid in silico/in vitro approach and an in silico screening model for the effective development of chemical series with the potential to achieve optimal CNS exposure.
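The rule-based filter quoted in the abstract above (most acidic pKa ≥ 9.5, TPSA < 100, P-gp ER < 1.5) can be written down directly. The sketch below is only an illustration of applying those thresholds: the `Compound` record, its field names and the example compounds are hypothetical, and the thresholds are taken as stated in the abstract.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Compound:
    name: str                # hypothetical identifier
    most_acidic_pka: float   # pKa of the most acidic group
    tpsa: float              # topological polar surface area (conventionally in Å^2)
    pgp_er: float            # in vitro P-gp efflux ratio


def passes_cns_filter(c: Compound, pka_min: float = 9.5,
                      tpsa_max: float = 100.0, er_max: float = 1.5) -> bool:
    """True when all three rules from the abstract hold for the compound."""
    return (c.most_acidic_pka >= pka_min
            and c.tpsa < tpsa_max
            and c.pgp_er < er_max)


def filter_library(compounds: List[Compound]) -> List[str]:
    """Names of the compounds that satisfy every rule."""
    return [c.name for c in compounds if passes_cns_filter(c)]


# Invented compounds, each violating at most one rule.
library = [
    Compound("cmpd-1", most_acidic_pka=10.2, tpsa=65.0, pgp_er=1.1),
    Compound("cmpd-2", most_acidic_pka=4.5, tpsa=65.0, pgp_er=1.1),    # too acidic
    Compound("cmpd-3", most_acidic_pka=10.2, tpsa=130.0, pgp_er=1.1),  # too polar
    Compound("cmpd-4", most_acidic_pka=10.2, tpsa=65.0, pgp_er=4.0),   # P-gp substrate
]
kept = filter_library(library)
```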
NASA Astrophysics Data System (ADS)
Florindo, João Batista
2018-04-01
This work proposes the use of Singular Spectrum Analysis (SSA) for the classification of texture images, more specifically, to enhance the performance of the Bouligand-Minkowski fractal descriptors in this task. Fractal descriptors are known to be a powerful approach to model and particularly identify complex patterns in natural images. Nevertheless, the multiscale analysis involved in those descriptors makes them highly correlated. Although other attempts to address this point have been proposed in the literature, none of them investigated the relation between the fractal correlation and the well-established analysis employed in time series, and SSA is one of the most powerful techniques for this purpose. The proposed method was employed for the classification of benchmark texture images and the results were compared with other state-of-the-art classifiers, confirming the potential of this analysis in image classification.
Benndorf, Matthias; Kotter, Elmar; Langer, Mathias; Herda, Christoph; Wu, Yirong; Burnside, Elizabeth S
2015-06-01
To develop and validate a decision support tool for mammographic mass lesions based on a standardized descriptor terminology (BI-RADS lexicon) to reduce variability of practice. We used separate training data (1,276 lesions, 138 malignant) and validation data (1,177 lesions, 175 malignant). We created naïve Bayes (NB) classifiers from the training data with tenfold cross-validation. Our "inclusive model" comprised BI-RADS categories, BI-RADS descriptors, and age as predictive variables; our "descriptor model" comprised BI-RADS descriptors and age. The resulting NB classifiers were applied to the validation data. We evaluated and compared classifier performance with ROC-analysis. In the training data, the inclusive model yields an AUC of 0.959; the descriptor model yields an AUC of 0.910 (P < 0.001). The inclusive model is superior to the clinical performance (BI-RADS categories alone, P < 0.001); the descriptor model performs similarly. When applied to the validation data, the inclusive model yields an AUC of 0.935; the descriptor model yields an AUC of 0.876 (P < 0.001). Again, the inclusive model is superior to the clinical performance (P < 0.001); the descriptor model performs similarly. We consider our classifier a step towards a more uniform interpretation of combinations of BI-RADS descriptors. We provide our classifier at www.ebm-radiology.com/nbmm/index.html . • We provide a decision support tool for mammographic masses at www.ebm-radiology.com/nbmm/index.html . • Our tool may reduce variability of practice in BI-RADS category assignment. • A formal analysis of BI-RADS descriptors may enhance radiologists' diagnostic performance.
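A naïve Bayes classifier over categorical descriptors, as used in the abstract above, can be sketched in a few lines. This is a generic illustration with Laplace smoothing, not the authors' implementation; the feature names and values are ordinary mammography terms, but the four training rows and their labels are invented.

```python
from collections import Counter, defaultdict
from math import exp, log


def train_nb(rows, labels, alpha=1.0):
    """rows: list of dicts (feature -> category); labels: class per row."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (label, feature) -> category counts
    categories = defaultdict(set)        # feature -> categories seen in training
    for row, y in zip(rows, labels):
        for feat, val in row.items():
            value_counts[(y, feat)][val] += 1
            categories[feat].add(val)
    return class_counts, value_counts, categories, alpha


def predict_proba(model, row):
    """Posterior class probabilities under the conditional-independence assumption."""
    class_counts, value_counts, categories, alpha = model
    total = sum(class_counts.values())
    log_post = {}
    for y, n_y in class_counts.items():
        lp = log(n_y / total)  # class prior
        for feat, val in row.items():
            k = len(categories[feat] | {val})  # number of categories for smoothing
            lp += log((value_counts[(y, feat)][val] + alpha) / (n_y + alpha * k))
        log_post[y] = lp
    top = max(log_post.values())
    norm = sum(exp(v - top) for v in log_post.values())
    return {y: exp(v - top) / norm for y, v in log_post.items()}


# Invented training data: 1 = malignant, 0 = benign.
rows = [
    {"margin": "spiculated", "shape": "irregular"},
    {"margin": "spiculated", "shape": "irregular"},
    {"margin": "circumscribed", "shape": "oval"},
    {"margin": "circumscribed", "shape": "round"},
]
labels = [1, 1, 0, 0]
model = train_nb(rows, labels)
posterior = predict_proba(model, {"margin": "spiculated", "shape": "irregular"})
```

Ranking lesions by the malignant-class posterior is what makes an ROC analysis, as reported in the abstract, possible.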
Low, Yen S.; Sedykh, Alexander; Rusyn, Ivan; Tropsha, Alexander
2017-01-01
Cheminformatics approaches such as Quantitative Structure Activity Relationship (QSAR) modeling have been used traditionally for predicting chemical toxicity. In recent years, high throughput biological assays have been increasingly employed to elucidate mechanisms of chemical toxicity and predict toxic effects of chemicals in vivo. The data generated in such assays can be considered as biological descriptors of chemicals that can be combined with molecular descriptors and employed in QSAR modeling to improve the accuracy of toxicity prediction. In this review, we discuss several approaches for integrating chemical and biological data for predicting biological effects of chemicals in vivo and compare their performance across several data sets. We conclude that while no method consistently shows superior performance, the integrative approaches rank consistently among the best yet offer enriched interpretation of models over those built with either chemical or biological data alone. We discuss the outlook for such interdisciplinary methods and offer recommendations to further improve the accuracy and interpretability of computational models that predict chemical toxicity. PMID:24805064
Barisoni, Laura; Troost, Jonathan P; Nast, Cynthia; Bagnasco, Serena; Avila-Casado, Carmen; Hodgin, Jeffrey; Palmer, Matthew; Rosenberg, Avi; Gasim, Adil; Liensziewski, Chrysta; Merlino, Lino; Chien, Hui-Ping; Chang, Anthony; Meehan, Shane M; Gaut, Joseph; Song, Peter; Holzman, Lawrence; Gibson, Debbie; Kretzler, Matthias; Gillespie, Brenda W; Hewitt, Stephen M
2016-07-01
The multicenter Nephrotic Syndrome Study Network (NEPTUNE) digital pathology scoring system employs a novel and comprehensive methodology to document pathologic features from whole-slide images, immunofluorescence and ultrastructural digital images. To estimate inter- and intra-reader concordance of this descriptor-based approach, data from 12 pathologists (eight NEPTUNE and four non-NEPTUNE) with experience from training to 30 years were collected. A descriptor reference manual was generated and a webinar-based protocol for consensus/cross-training implemented. Intra-reader concordance for 51 glomerular descriptors was evaluated on jpeg images by seven NEPTUNE pathologists scoring 131 glomeruli three times (Tests I, II, and III), each test following a consensus webinar review. Inter-reader concordance of glomerular descriptors was evaluated in 315 glomeruli by all pathologists; interstitial fibrosis and tubular atrophy (244 cases, whole-slide images) and four ultrastructural podocyte descriptors (178 cases, jpeg images) were evaluated once by six and five pathologists, respectively. Cohen's kappa for inter-reader concordance for 48/51 glomerular descriptors with sufficient observations was moderate (0.40
Receptive fields selection for binary feature description.
Fan, Bin; Kong, Qingqun; Trzcinski, Tomasz; Wang, Zhiheng; Pan, Chunhong; Fua, Pascal
2014-06-01
Feature description for local image patches is widely used in computer vision. While the conventional way to design a local descriptor is based on expert experience and knowledge, learning-based methods for designing local descriptors are becoming more and more popular because of their good performance and data-driven property. This paper proposes a novel data-driven method for designing a binary feature descriptor, which we call the receptive fields descriptor (RFD). Technically, RFD is constructed by thresholding responses of a set of receptive fields, which are selected from a large number of candidates according to their distinctiveness and correlations in a greedy way. Using two different kinds of receptive fields (namely rectangular pooling area and Gaussian pooling area) for selection, we obtain two binary descriptors, RFDR and RFDG, accordingly. Image matching experiments on the well-known patch data set and Oxford data set demonstrate that RFD significantly outperforms the state-of-the-art binary descriptors, and is comparable with the best float-valued descriptors at a fraction of the processing time. Finally, experiments on object recognition tasks confirm that both RFDR and RFDG successfully bridge the performance gap between binary descriptors and their floating-point competitors.
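The core idea above, thresholding pooled responses of receptive fields to obtain bits and comparing descriptors by Hamming distance, can be sketched as follows. In RFD the fields and thresholds are learned from data; here they are hand-picked, and pooling over raw intensities in rectangular areas is an assumption made to keep the example short. All names are hypothetical.

```python
def pooled_response(patch, field):
    """Mean intensity over a rectangular field (row0, col0, row1, col1), end-exclusive."""
    r0, c0, r1, c1 = field
    values = [patch[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(values) / len(values)


def binary_descriptor(patch, fields, thresholds):
    """One bit per receptive field: 1 if its pooled response exceeds the threshold."""
    bits = 0
    for field, threshold in zip(fields, thresholds):
        bit = 1 if pooled_response(patch, field) > threshold else 0
        bits = (bits << 1) | bit
    return bits


def hamming(a, b):
    """Number of differing bits between two descriptors stored as integers."""
    return bin(a ^ b).count("1")


# Hand-picked fields (left and right halves of a 2x2 patch) and thresholds.
fields = [(0, 0, 2, 1), (0, 1, 2, 2)]
thresholds = [5.0, 5.0]
bright_right = [[0, 10], [0, 10]]
bright_left = [[10, 0], [10, 0]]
d1 = binary_descriptor(bright_right, fields, thresholds)
d2 = binary_descriptor(bright_left, fields, thresholds)
distance = hamming(d1, d2)
```

Storing the descriptor as an integer makes matching a single XOR plus a bit count, which is the source of the speed advantage of binary descriptors over float-valued ones.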
Optimizing ROOT’s Performance Using C++ Modules
NASA Astrophysics Data System (ADS)
Vassilev, Vassil
2017-10-01
ROOT comes with a C++-compliant interpreter, cling. Cling needs to understand the content of the libraries in order to interact with them. Exposing the full shared library descriptors to the interpreter at runtime translates into an increased memory footprint. ROOT’s exploratory programming concepts allow implicit and explicit runtime shared library loading, which requires the interpreter to load the library descriptor. Re-parsing of the descriptors’ content has a noticeable effect on runtime performance. The present state-of-the-art lazy parsing technique brings runtime performance to reasonable levels but proves to be fragile and can introduce correctness issues. An elegant solution is to load information from the descriptor lazily and in a non-recursive way. The LLVM community is advancing its C++ Modules technology, which provides an I/O-efficient, on-disk representation capable of reducing build times and peak memory usage. The feature is standardized as a C++ technical specification. C++ Modules are a flexible concept, which can be employed to match CMS and other experiments’ requirements for ROOT: to optimize both runtime memory usage and performance. Cling technically “inherits” the feature; however, tweaking it to ROOT scale and beyond is a complex endeavor. The paper discusses the status of C++ Modules in the context of ROOT, supported by a few preliminary performance results. It shows a step-by-step migration plan and describes potential challenges that could appear.
Impact of volatile composition on the sensorial attributes of dried paprikas.
Martín, Alberto; Hernández, Alejandro; Aranda, Emilio; Casquete, Rocio; Velázquez, Rocio; Bartolomé, Teresa; Córdoba, María G
2017-10-01
Here we characterised the aroma of smoked, oven-dried, and sun-dried paprikas by sensorial evaluation and analysis of their volatile profiles. The sensorial panel defined smoked paprikas as having an intense, persistent, smoked odour and flavour and the highest acceptability. The oven-dried paprikas had a fruity odour and flavour, with aroma notes related to fresh peppers. The sun-dried paprikas were associated with straw aromas and were the worst valued. The chemical classes of volatile compounds also defined the paprika types. The smoked paprikas were richer in alcohols, phenols, pyrroles, and pyranones, whereas the oven-dried samples were characterised by their aldehydes and terpenes. The sun-dried paprikas had significantly lower amounts of odorant substances than the smoked and oven-dried paprikas. The intensity, persistence and smokiness descriptors (associated with smoked paprika) were positively associated with phenols and alcohols. Aldehydes were positively correlated with a fruity descriptor, which defined oven-dried paprikas, and negatively correlated with intensity, persistence, smokiness, toasted, and dried pepper descriptors. The descriptor straw, which defined sun-dried paprikas, was negatively correlated with alcohols, phenols, furans, and pyrroles. Copyright © 2017 Elsevier Ltd. All rights reserved.
3D molecular descriptors important for clinical success.
Kombo, David C; Tallapragada, Kartik; Jain, Rachit; Chewning, Joseph; Mazurov, Anatoly A; Speake, Jason D; Hauser, Terry A; Toler, Steve
2013-02-25
The pharmacokinetic and safety profiles of clinical drug candidates are greatly influenced by their requisite physicochemical properties. In particular, it has been shown that 2D molecular descriptors such as the fraction of sp3 carbon atoms (Fsp3) and the number of stereo centers correlate with clinical success. Using the proteomic off-target hit rate of nicotinic ligands, we found that shape-based 3D descriptors such as the radius of gyration and shadow indices discriminate off-target promiscuity better than do Fsp3 and the number of stereo centers. We have deduced the relevant descriptor values required for a ligand to be nonpromiscuous. Investigating the MDL Drug Data Report (MDDR) database as compounds move from the preclinical stage toward the market, we have found that these shape-based 3D descriptors predict clinical success of compounds at the preclinical and phase 1 stages vs compounds withdrawn from the market better than do Fsp3 and LogD. Further, these computed 3D molecular descriptors correlate well with experimentally observed solubility, which is among the well-known physicochemical properties that drive clinical success. We also found that about 84% of launched drugs satisfy either the Shadow index or Fsp3 criteria, whereas withdrawn and discontinued compounds fail to meet the same criteria. Our studies suggest that spherical compounds (rather than their elongated counterparts) with a minimal number of aromatic rings may exhibit a high propensity to advance from clinical trials to market.
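As an illustration of one of the shape-based descriptors named in this abstract, the radius of gyration can be computed from 3D atomic coordinates. This is a minimal sketch under the usual mass-weighted definition; the abstract does not say which software or weighting the authors used, and the function name is hypothetical.

```python
import numpy as np

def radius_of_gyration(coords, masses=None):
    """Root-mean-square distance of atoms from their (mass-weighted) centre.

    coords: (n_atoms, 3) array of Cartesian coordinates.
    masses: optional per-atom weights; equal weights are assumed if omitted.
    """
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))
    masses = np.asarray(masses, dtype=float)
    center = np.average(coords, axis=0, weights=masses)
    sq_dist = ((coords - center) ** 2).sum(axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))

# Two equal-mass points two units apart: each lies one unit from the centre.
rg = radius_of_gyration([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
```

For a fixed number of atoms, compact (more spherical) arrangements give smaller values than elongated ones, which is the property the abstract relates to promiscuity.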
Influence of Texture and Colour in Breast TMA Classification
Fernández-Carrobles, M. Milagro; Bueno, Gloria; Déniz, Oscar; Salido, Jesús; García-Rojo, Marcial; González-López, Lucía
2015-01-01
Breast cancer diagnosis is still done by observation of biopsies under the microscope. The development of automated methods for breast TMA classification would reduce diagnostic time. This paper is a step towards the solution for this problem and shows a complete study of breast TMA classification based on colour models and texture descriptors. The TMA images were divided into four classes: i) benign stromal tissue with cellularity, ii) adipose tissue, iii) benign and benign anomalous structures, and iv) ductal and lobular carcinomas. A relevant set of features was obtained on eight different colour models from first and second order Haralick statistical descriptors obtained from the intensity image, Fourier, Wavelets, Multiresolution Gabor, M-LBP and textons descriptors. Furthermore, four types of classification experiments were performed using six different classifiers: (1) classification per colour model individually, (2) classification by combination of colour models, (3) classification by combination of colour models and descriptors, and (4) classification by combination of colour models and descriptors with a previous feature set reduction. The best result shows an average of 99.05% accuracy and 98.34% positive predictive value. These results have been obtained by means of a bagging tree classifier with combination of six colour models and the use of 1719 non-correlated (correlation threshold of 97%) textural features based on Statistical, M-LBP, Gabor and Spatial textons descriptors. PMID:26513238
Sun, Lili; Zhou, Liping; Yu, Yu; Lan, Yukun; Li, Zhiliang
2007-01-01
Polychlorinated diphenyl ethers (PCDEs) have attracted increasing concern as a group of ubiquitous potential persistent organic pollutants (POPs). Using the molecular electronegativity distance vector (MEDV-4), multiple linear regression (MLR) models are developed for the sub-cooled liquid vapor pressures (P(L)), n-octanol/water partition coefficients (K(OW)) and sub-cooled liquid water solubilities (S(W,L)) of 209 PCDEs and diphenyl ether. The correlation coefficients (R) and the leave-one-out cross-validation (LOO) correlation coefficients (R(CV)) of all the 6-descriptor models for logP(L), logK(OW) and logS(W,L) are greater than 0.98. The descriptors are then selected by stepwise multiple regression (SMR), and the resulting models are a 5-descriptor model for logP(L), a 4-descriptor model for logK(OW), and a 6-descriptor model for logS(W,L). All these models exhibit excellent estimation capabilities for the internal sample set and good predictive capabilities for the external sample set. The consistency between observed and estimated/predicted values is best for logP(L) (R=0.996, R(CV)=0.996), followed by logK(OW) (R=0.992, R(CV)=0.992) and logS(W,L) (R=0.983, R(CV)=0.980). With MEDV-4 descriptors, the QSPR models can be used for prediction, and the model predictions can hence extend the current database of experimental values.
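The leave-one-out statistic reported in this abstract can be illustrated with a short sketch. It is not the authors' code: it assumes R(CV) is the Pearson correlation between observed values and predictions made with each sample held out in turn, a common convention that the abstract does not confirm. The function names are hypothetical and the data are synthetic.

```python
import numpy as np

def fit_predict(X_train, y_train, X_test):
    """Ordinary least-squares MLR with an intercept term."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def loo_correlation(X, y):
    """Correlation between observed y and leave-one-out predictions."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    preds = np.empty_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i  # hold out sample i
        preds[i] = fit_predict(X[keep], y[keep], X[i:i + 1])[0]
    return float(np.corrcoef(y, preds)[0, 1])

# Synthetic, noise-free data: two descriptors, exactly linear response.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(10, 2))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - X_demo[:, 1]
r_cv = loo_correlation(X_demo, y_demo)
```

With real data, a value of R(CV) close to R, as reported above, suggests the fit does not depend heavily on any single compound.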
Toropov, Andrey A; Toropova, Alla P; Benfenati, Emilio; Salmona, Mario
2018-06-01
The aim of the present work is to define a computable measure of similarity between different endpoints. The similarity of structural alerts of different biochemical endpoints can be used to solve tasks in medicinal chemistry. Optimal descriptors are a tool for building models for different endpoints. The optimal descriptor is calculated from the simplified molecular input-line entry system (SMILES). Any SMILES can be represented by a group of elements (a single symbol or a pair of symbols). Each element of a SMILES can be represented by a so-called correlation weight, i.e. a coefficient that is used to calculate the descriptor. Numerical data on the correlation weights are calculated by the Monte Carlo method, i.e. by an optimization procedure that gives the maximal correlation coefficient between the optimal descriptor and the endpoint for the training set. Statistically stable correlation weights observed in several runs of the optimization can be examined as structural alerts, which are promoters of the increase or the decrease of a biochemical activity of a substance. Having data on the correlation weights from several runs of the optimization, one can extract a list of promoters of increase and a list of promoters of decrease for an endpoint. The study of similarity and dissimilarity of the above lists has been carried out for the following pairs of endpoints: (i) mutagenicity and anticancer activity; (ii) mutagenicity and blood brain barrier; and (iii) blood brain barrier and anticancer activity. The computational experiment confirms that similarity and dissimilarity for pairs of endpoints can be measured.
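The idea of a descriptor built from per-element correlation weights can be sketched as below. This is a simplified illustration and not the authors' software. It assumes the descriptor is the plain sum of the weights and that every SMILES character is one element, so two-letter atoms such as Cl would be split. The weights are made up and would normally come from the Monte Carlo optimization.

```python
def optimal_descriptor(smiles, weights, default=0.0):
    """Sum hypothetical correlation weights over the characters of a SMILES.

    weights: mapping from SMILES symbol to its correlation weight.
    default: weight assumed for symbols absent from the mapping.
    """
    return sum(weights.get(symbol, default) for symbol in smiles)

# Made-up weights for illustration only.
demo_weights = {"C": 0.5, "O": -1.0, "=": 0.25}
dcw = optimal_descriptor("CC=O", demo_weights)  # acetaldehyde
```

Symbols whose weights stay positive (or negative) across several optimization runs are the ones the abstract treats as promoters of increase (or decrease).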
Identification of terms to define unconstrained air transportation demands
NASA Technical Reports Server (NTRS)
Jacobson, I. D.; Kuhilhau, A. R.
1982-01-01
The factors involved in the evaluation of unconstrained air transportation systems were carefully analyzed. By definition an unconstrained system is taken to be one in which the design can employ innovative and advanced concepts no longer limited by present environmental, social, political or regulatory settings. Four principal evaluation criteria are involved: (1) service utilization, based on the operating performance characteristics as viewed by potential patrons; (2) community impacts, reflecting decisions based on the perceived impacts of the system; (3) technological feasibility, estimating what is required to reduce the system to practice; and (4) financial feasibility, predicting the ability of the concepts to attract financial support. For each of these criteria, a set of terms or descriptors was identified, which should be used in the evaluation to render it complete. It is also demonstrated that these descriptors have the following properties: (a) their interpretation may be made by different groups of evaluators; (b) their interpretations and the way they are used may depend on the stage of development of the system in which they are used; (c) in formulating the problem, all descriptors should be addressed independent of the evaluation technique selected.
Silva, R S; Moura, E F; Farias-Neto, J T; Ledo, C A S; Sampaio, J E
2017-04-13
The aim of this study was to select morphoagronomic descriptors to characterize cassava accessions representative of Eastern Brazilian Amazonia. A total of 262 accessions were characterized using 21 qualitative descriptors. The multiple-correspondence analysis (MCA) technique was applied using the following criteria: contribution of the descriptor in the last factorial axis of analysis in successive cycles (SMCA); reverse order of the descriptor's contribution in the last factorial axis of analysis with all descriptors ('O'´p') of Jolliffe's method; mean of the contribution orders of the descriptor in the first three factorial axes in the analysis with all descriptors ('Os') together with ('O'´p'); and order of contribution of weighted mean in the first three factorial axes in the analysis of all descriptors ('Oz'). The dissimilarity coefficient was measured by the method of multicategorical variables. The correlation between the matrix generated with all descriptors and the matrices based on each criterion varied (r = 0.21, r = 0.97, r = 0.98, r = 0.13 for SMCA, 'Os', 'Oz' and 'O'´p', respectively). The least informative descriptors were discarded independently and according to both the 'Os' and 'Oz' criteria. Thirteen descriptors were able to discriminate the accessions and to represent the morphological variability of the accessions sampled in Brazilian Eastern Amazonia: color of apical leaves, petiole color, color of stem exterior, external color of storage root, color of stem cortex, color of root pulp, texture of root epidermis, color of leaf vein, color of stem epidermis, color of end branches of adult plant, branching habit, root shape, and constriction of root.
Modeling the drugs' passive transfer in the body based on their chromatographic behavior.
Kouskoura, Maria G; Kachrimanis, Kyriakos G; Markopoulou, Catherine K
2014-11-01
One of the most challenging aims in modern analytical chemistry and pharmaceutical analysis is to create models of drugs' behavior based on simulation experiments. Since drugs' effects are closely related to their molecular properties, numerous characteristics of drugs are used to obtain a model of passive absorption and transfer in the human body. Importantly, this direction in innovative bioanalytical methodologies is also urgently needed in the area of personalized medicine to implement nanotechnological and genomics advancements. Simulation experiments were carried out by examining and interpreting the chromatographic behavior of 113 analytes/drugs (400 observations) in RP-HPLC. The dataset employed for this purpose included 73 descriptors that refer to the physicochemical properties of the mobile phase mixture in different proportions, the physicochemical properties of the analytes and the structural characteristics of their molecules. A series of different software packages was used to calculate all the descriptors apart from those referring to the structure of the analytes. The correlation of the descriptors with the retention time of the analytes eluted from a C4 column with an aqueous mobile phase was employed as the dataset to introduce the behavior models in the human body. Their evaluation with Partial Least Squares (PLS) software showed that the chromatographic behavior of a drug on a lipophilic stationary phase and a polar mobile phase is directly related to its drug-ability. At the same time, the behavior of an unknown drug in the human body can be predicted reliably via Artificial Neural Networks (ANNs) software. Copyright © 2014 Elsevier B.V. All rights reserved.
Zhang, Yong-Hong; Xia, Zhi-Ning; Qin, Li-Tang; Liu, Shu-Shen
2010-09-01
The objective of this paper is to build a reliable model based on the molecular electronegativity distance vector (MEDV) descriptors for predicting blood-brain barrier (BBB) permeability and to reveal the effects of molecular structural segments on BBB permeability. Using 70 structurally diverse compounds, partial least squares regression (PLSR) models between the BBB permeability and the MEDV descriptors were developed and validated by the variable selection and modeling based on prediction (VSMP) technique. The estimation ability, stability, and predictive power of a model are evaluated by the estimated correlation coefficient (r), the leave-one-out (LOO) cross-validation correlation coefficient (q), and the predictive correlation coefficient (R(p)). The PLSR model was found to have good quality, with r=0.9202, q=0.7956, and R(p)=0.6649 for the M1 model based on the training set of 57 samples. To search for the most important structural factors affecting the BBB permeability of compounds, we performed variable importance in projection (VIP) analysis for the MEDV descriptors. It was found that some structural fragments in compounds, such as -CH(3), -CH(2)-, =CH-, =C, ≡C-, -CH<, =C<, =N-, -NH-, =O, and -OH, are the most important factors affecting the BBB permeability. (c) 2010. Published by Elsevier Inc.
Catana, Cornel
2009-03-01
Using a well-defined set of fragments/pharmacophores, a new methodology is presented to calculate fragment/pharmacophore descriptors for any molecule onto which at least one fragment/pharmacophore can be mapped. To each fragment/pharmacophore present in a molecule, we attach a descriptor that is calculated by identifying the molecule's atoms onto which it maps and summing over its constituent atomic descriptors. The attached descriptors are named C-fragment/pharmacophore descriptors, and this methodology can be applied to any descriptors defined at the atomic level, such as the partition coefficient, molar refractivity, electrotopological state, etc. With this methodology, the same fragment/pharmacophore can have different values in different molecules, resulting in better discrimination power. Fragment and pharmacophore fingerprints have many applications in chemical informatics. This study has attempted to find the impact of replacing the traditional value of "1" in a fingerprint with real numbers derived from C-fragment/pharmacophore descriptors. One way to do this is to assess the utility of C-fragment/pharmacophore descriptors in modeling different end points. Here, we exemplify with data from CYP and hERG. The fact that, in many cases, the obtained models were fairly successful and C-fragment descriptors were ranked among the top ones supports the idea that they play an important role in correlation. When we modeled hERG with C-pharmacophore descriptors, however, the model performances decreased slightly, and we attribute this mainly to the fact that there is no technique capable of handling multiple instances (states). We hope this will open new research, especially in the emerging field of machine learning. Further research is needed to see the impact of C-fragment/pharmacophore descriptors in similarity/dissimilarity applications.
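The summation described in this abstract can be sketched in a few lines. This is an illustrative reading, not the author's code: the atom indices of each fragment match are assumed to come from some substructure search that is not shown, and the per-atom values and function name are hypothetical.

```python
def c_fragment_descriptor(atom_values, mapped_atoms):
    """Sum an atomic-level descriptor over the atoms a fragment maps onto.

    atom_values: per-atom descriptor values for one molecule
                 (e.g. atomic contributions to a partition coefficient).
    mapped_atoms: indices of the atoms matched by the fragment.
    """
    return sum(atom_values[i] for i in mapped_atoms)

# The same hypothetical two-atom fragment mapped in two different molecules.
molecule_a = [0.10, 0.20, 0.30, 0.40]
molecule_b = [0.50, 0.25, 0.25]
value_a = c_fragment_descriptor(molecule_a, [1, 2])
value_b = c_fragment_descriptor(molecule_b, [0, 1])
```

Because the atomic values depend on each atom's environment, the same fragment yields a different number in each molecule, whereas a conventional fingerprint would record "1" in both cases.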
Madison, Guy; Gouyon, Fabien; Ullén, Fredrik; Hörnström, Kalle
2011-10-01
Groove is often described as the experience of music that makes people tap their feet and want to dance. A high degree of consistency in ratings of groove across listeners indicates that physical properties of the sound signal contribute to groove (Madison, 2006). Here, correlations were assessed between listeners' ratings and a number of quantitative descriptors of rhythmic properties for one hundred music examples from five distinct traditional music genres. Groove was related to several different rhythmic properties, some of which were genre-specific and some of which were general across genres. Two descriptors corresponding to the density of events between beats and the salience of the beat, respectively, were strongly correlated with groove across domains. In contrast, systematic deviations from strict positions on the metrical grid, so-called microtiming, did not play any significant role. The results are discussed from a functional perspective of rhythmic music to enable and facilitate entrainment and precise synchronization among individuals.
Senior, Samir A; Madbouly, Magdy D; El massry, Abdel-Moneim
2011-09-01
Quantum chemical and topological descriptors of some organophosphorus compounds (OP) were correlated with their dermal toxicity, LD(50). The quantum chemical parameters were obtained using B3LYP/LANL2DZdp-ECP optimization. Using linear regression analysis, equations were derived to calculate the theoretical LD(50) of the studied compounds. The inclusion of quantum parameters, having both charge indices and topological indices, affects the toxicity of the studied compounds, resulting in high correlation coefficients for the obtained equations. Two of the four newly proposed descriptors give higher correlation coefficients, namely the Heteroatom Corrected Extended Connectivity Randic index ((1)X(HCEC)) and the Density Randic index ((1)X(Den)). The obtained linear equations were applied to predict the toxicity of some related structures. It was found that the sulfur atoms in these compounds must be replaced by oxygen atoms to achieve improved toxicity. Copyright © 2011 Elsevier Ltd. All rights reserved.
Determination of solute descriptors by chromatographic methods.
Poole, Colin F; Atapattu, Sanka N; Poole, Salwa K; Bell, Andrea K
2009-10-12
The solvation parameter model is now well established as a useful tool for obtaining quantitative structure-property relationships for chemical, biomedical and environmental processes. The model correlates a free-energy related property of a system to six free-energy derived descriptors describing molecular properties. These molecular descriptors are defined as L (gas-liquid partition coefficient on hexadecane at 298K), V (McGowan's characteristic volume), E (excess molar refraction), S (dipolarity/polarizability), A (hydrogen-bond acidity), and B (hydrogen-bond basicity). McGowan's characteristic volume is trivially calculated from structure and the excess molar refraction can be calculated for liquids from their refractive index and easily estimated for solids. The remaining four descriptors are derived by experiment using (largely) two-phase partitioning, chromatography, and solubility measurements. In this article, the use of gas chromatography, reversed-phase liquid chromatography, micellar electrokinetic chromatography, and two-phase partitioning for determining solute descriptors is described. A large database of experimental retention factors and partition coefficients is constructed after first applying selection tools to remove unreliable experimental values and an optimized collection of varied compounds with descriptor values suitable for calibrating chromatographic systems is presented. These optimized descriptors are demonstrated to be robust and more suitable than other groups of descriptors characterizing the separation properties of chromatographic systems.
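In its commonly cited form for transfer from the gas phase to a condensed phase, the solvation parameter model is written log SP = c + eE + sS + aA + bB + lL, with the vV term replacing lL for transfer between two condensed phases. The abstract does not spell out the equation, so this form is an assumption, and the system constants and solute descriptor values below are invented for illustration.

```python
def log_sp(system, solute, size_term="L"):
    """Evaluate c + eE + sS + aA + bB + lL (or vV when size_term == "V").

    system: lower-case system constants (c, e, s, a, b and l or v).
    solute: upper-case solute descriptors (E, S, A, B and L or V).
    """
    size_coeff = "l" if size_term == "L" else "v"
    return (
        system["c"]
        + system["e"] * solute["E"]
        + system["s"] * solute["S"]
        + system["a"] * solute["A"]
        + system["b"] * solute["B"]
        + system[size_coeff] * solute[size_term]
    )

# Invented values, not taken from any published system or compound.
demo_system = {"c": -0.2, "e": 0.1, "s": 0.5, "a": 1.0, "b": 0.0, "l": 0.6}
demo_solute = {"E": 0.8, "S": 0.9, "A": 0.0, "B": 0.5, "L": 4.0}
predicted = log_sp(demo_system, demo_solute)
```

Determining solute descriptors chromatographically amounts to running this relation in reverse: retention is measured on several calibrated systems and the unknown S, A, B (and L) values are solved for.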
Kadam, Kiran; Prabhakar, Prashant; Jayaraman, V K
2012-11-01
Bacterial lipoproteins play critical roles in various physiological processes, including the maintenance of pathogenicity, and a number of them are being considered as potential candidates for generating novel vaccines. In this work, we put forth an algorithm to identify and predict ligand-binding sites in bacterial lipoproteins. The method uses three types of pocket descriptors, namely fpocket descriptors, 3D Zernike descriptors and shell descriptors, and combines them with the Support Vector Machine (SVM) method for classification. The three types of descriptors represent shape-based properties of the pocket as well as its local physicochemical features. All three types of descriptors, along with their hybrid combinations, are evaluated with SVM, and WEKA-InfoGain feature selection is applied to improve classification performance. Results obtained in the study show that the classifier successfully differentiates between ligand-binding and non-binding pockets. For the combination of the three types of descriptors, a 10-fold cross-validation accuracy of 86.83% is obtained for training, while the selected model achieved a test Matthews Correlation Coefficient (MCC) of 0.534. Individually or in combination with new and existing methods, our model can be a very useful tool for the prediction of potential ligand-binding sites in bacterial lipoproteins.
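For readers unfamiliar with the test metric quoted in this abstract, the Matthews Correlation Coefficient for a binary classifier can be computed from the confusion-matrix counts as sketched below. The counts in the example are made up and are not the study's results. Returning 0.0 when the denominator vanishes is a common convention, assumed here.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention when any marginal total is zero
    return (tp * tn - fp * fn) / denom

perfect = mcc(tp=5, tn=5, fp=0, fn=0)
inverted = mcc(tp=0, tn=0, fp=5, fn=5)
```

Unlike accuracy, MCC stays informative when binding and non-binding pockets are unevenly represented, which may explain why it is reported alongside the cross-validation accuracy.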
In silico quantitative structure-toxicity relationship study of aromatic nitro compounds.
Pasha, Farhan Ahmad; Neaz, Mohammad Morshed; Cho, Seung Joo; Ansari, Mohiuddin; Mishra, Sunil Kumar; Tiwari, Sharvan
2009-05-01
Small molecules often have toxicities that are a function of molecular structural features. Minor variations in structural features can make a large difference in such toxicity. Consequently, in silico techniques may be used to correlate such molecular toxicities with their structural features. For nine different sets of aromatic nitro compounds with known observed toxicities against different targets, we developed ligand-based 2D quantitative structure-toxicity relationship models using 20 selected topological descriptors. Topological descriptors have several advantages, such as conformational independence and facile, less time-consuming computation that yields good results. Multiple linear regression analysis was used to correlate variations of toxicity with molecular properties. The information index on molecular size, lopping centric index and Kier flexibility index were identified as fundamental descriptors for different kinds of toxicity, and further showed that molecular size, branching and molecular flexibility might be particularly important factors in quantitative structure-toxicity relationship analysis. This study revealed that the topological descriptor-guided quantitative structure-toxicity relationship provides a very useful, cost- and time-efficient in silico tool for describing small-molecule toxicities.
Kar, Supratik; Gajewicz, Agnieszka; Puzyn, Tomasz; Roy, Kunal; Leszczynski, Jerzy
2014-09-01
Nanotechnology has evolved as a frontrunner in the development of modern science. Current studies have established the toxicity of some nanoparticles to humans and the environment. Lack of sufficient data and low adequacy of experimental protocols hinder comprehensive risk assessment of nanoparticles (NPs). In the present work, metal electronegativity (χ), the charge of the metal cation corresponding to a given oxide (χox), and the atomic number and valence electron number of the metal have been used as simple molecular descriptors to build quantitative structure-toxicity relationship (QSTR) models for predicting the cytotoxicity of metal oxide NPs to the bacterium Escherichia coli. These descriptors can be obtained easily and quickly from the molecular formula and the periodic table. It has been shown that a simple molecular descriptor, χox, can efficiently encode the cytotoxicity of metal oxides, leading to models with high statistical quality as well as interpretability. Based on this model and previously published experimental results, we have hypothesized the most probable mechanism of the cytotoxicity of metal oxide nanoparticles to E. coli. Moreover, the information required for descriptor calculation is independent of the size range of the NPs, nullifying a significant problem, namely that various physical properties of NPs change for different size ranges. Copyright © 2014 Elsevier Inc. All rights reserved.
Molecular structure and gas chromatographic retention behavior of the components of Ylang-Ylang oil.
Olivero, J; Gracia, T; Payares, P; Vivas, R; Díaz, D; Daza, E; Geerlings, P
1997-05-01
Using quantitative structure-retention relationships (QSRR) methodologies the Kovats gas chromatographic retention indices for both apolar (DB-1) and polar (DB-Wax) columns for 48 compounds from Ylang-Ylang essential oil were empirically predicted from calculated and experimental data on molecular structure. Topological, geometric, and electronic descriptors were obtained for model generation. Relationships between descriptors and the retention data reported were established by linear multiple regression, giving equations that can be used to predict the Kovats indices for compounds present in essential oils, both in DB-1 and DB-Wax columns. Factor analysis was performed to interpret the meaning of the descriptors included in the models. The prediction model for the DB-1 column includes descriptors such as Randic's first-order connectivity index (1X), the molecular surface (MSA), the sum of the atomic charge on all the hydrogens (QH), Randic's third-order connectivity index (3X) and the molecular electronegativity (chi). The prediction model for the DB-Wax column includes the first three descriptors mentioned for the DB-1 column (1X, MSA and QH) and the most negative charge (MNC), the global softness (S), and the difference between Randic's and Kier and Hall's third-order connectivity indexes (3X-3XV).
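Randic's first-order connectivity index (1X), the first descriptor listed in both models, can be computed from the hydrogen-suppressed molecular graph as the sum over bonds of 1/sqrt(d_i * d_j), where d is the number of heavy-atom neighbours. The sketch below assumes a plain edge list as input; it does not cover the valence-corrected Kier and Hall variant or the third-order index, and the function name is hypothetical.

```python
import math

def randic_index(edges):
    """First-order Randic connectivity index of a hydrogen-suppressed graph.

    edges: iterable of (atom_i, atom_j) pairs, one per bond.
    """
    edges = list(edges)
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return sum(1.0 / math.sqrt(degree[a] * degree[b]) for a, b in edges)

# Carbon skeletons: n-butane (chain) and isobutane (branched).
n_butane = randic_index([(0, 1), (1, 2), (2, 3)])
isobutane = randic_index([(0, 1), (0, 2), (0, 3)])
```

The branched isomer gives the smaller value, which shows how the index encodes branching, a feature that generally influences gas chromatographic retention.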
Finding Chemical Structures Corresponding to a Set of Coordinates in Chemical Descriptor Space.
Miyao, Tomoyuki; Funatsu, Kimito
2017-08-01
When chemical structures are searched based on descriptor values, or descriptors are interpreted based on values, it is important that corresponding chemical structures actually exist. In order to consider the existence of chemical structures located in a specific region of chemical space, we propose to search for them inside training data domains (TDDs), which are dense areas of a training dataset in chemical space. We investigated the features of TDDs using diverse and local datasets, assuming that GDB11 is the chemical universe. These two analyses showed that considering TDDs gives a higher chance of finding chemical structures than a random search-based method, and that novel chemical structures actually exist inside TDDs. In addition to those findings, we tested the hypothesis that chemical structures are distributed over limited areas of chemical space. This hypothesis was confirmed by the fact that distances among chemical structures in several descriptor spaces were much shorter than those among randomly generated coordinates in the training data range. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Ivanciuc, Ovidiu
2013-06-01
Chemical and molecular graphs have fundamental applications in chemoinformatics, quantitative structure-property relationships (QSPR), quantitative structure-activity relationships (QSAR), virtual screening of chemical libraries, and computational drug design. Chemoinformatics applications of graphs include chemical structure representation and coding, database search and retrieval, and physicochemical property prediction. QSPR, QSAR and virtual screening are based on the structure-property principle, which states that the physicochemical and biological properties of chemical compounds can be predicted from their chemical structure. Such structure-property correlations are usually developed from topological indices and fingerprints computed from the molecular graph and from molecular descriptors computed from the three-dimensional chemical structure. We present here a selection of the most important graph descriptors and topological indices, including molecular matrices, graph spectra, spectral moments, graph polynomials, and vertex topological indices. These graph descriptors are used to define several topological indices based on molecular connectivity, graph distance, reciprocal distance, distance-degree, distance-valency, spectra, polynomials, and information theory concepts. The molecular descriptors and topological indices can be developed with a more general approach, based on molecular graph operators, which define a family of graph indices related by a common formula. Graph descriptors and topological indices for molecules containing heteroatoms and multiple bonds are computed with weighting schemes based on atomic properties, such as the atomic number, covalent radius, or electronegativity. The correlation in QSPR and QSAR models can be improved by optimizing some parameters in the formula of topological indices, as demonstrated for structural descriptors based on atomic connectivity and graph distance.
Ahmed, Shiek S. S. J.; Ramakrishnan, V.
2012-01-01
Background: Poor oral bioavailability is an important parameter accounting for the failure of drug candidates. Approximately 50% of drugs in development fail because of unfavorable oral bioavailability. In silico prediction of oral bioavailability (%F) based on physicochemical properties is highly needed. Although many computational models have been developed to predict oral bioavailability, their accuracy remains low, with a significant number of false positives. In this study, we present an oral bioavailability model based on a systems biology approach, using a machine learning algorithm coupled with an optimal discriminative set of physicochemical properties. Results: The models were developed based on 247 computationally derived physicochemical descriptors from 2279 molecules, among which 969, 605 and 705 molecules correspond to the oral bioavailability, intestinal absorption (HIA) and caco-2 permeability data sets, respectively. Partial least squares discriminant analysis showed that 49 descriptors of HIA and 50 descriptors of caco-2 are the major contributing descriptors in classifying into groups. Of these descriptors, 47 were common to HIA and caco-2, which suggests that they play a vital role in classifying oral bioavailability. To determine the best machine learning algorithm, 21 classifiers were compared using a bioavailability data set of 969 molecules with 47 descriptors. Each molecule in the data set was represented by a set of 47 physicochemical properties with the functional relevance labeled as (+bioavailability/−bioavailability) to indicate good-bioavailability/poor-bioavailability molecules. The best-performing algorithm was the logistic algorithm. The correlation-based feature selection (CFS) algorithm was implemented, which confirms that these 47 descriptors are the fundamental descriptors for oral bioavailability prediction.
Conclusion: The logistic algorithm with 47 selected descriptors correctly predicted the oral bioavailability, with a predictive accuracy of more than 71%. Overall, the method captures the fundamental molecular descriptors, which can be used as an entity to facilitate prediction of oral bioavailability. PMID:22815781
Anouar, El Hassane
2014-01-01
Phenolic Schiff bases are known as powerful antioxidants. To select the electronic, 2D and 3D descriptors responsible for the free radical scavenging ability of a series of 30 phenolic Schiff bases, a set of molecular descriptors was calculated using B3P86 (Becke’s three-parameter hybrid functional with the Perdew 86 correlation functional) combined with the 6-31+G(d,p) basis set (i.e., at the B3P86/6-31+G(d,p) level of theory). The chemometric methods of simple and multiple linear regression (SLR and MLR), principal component analysis (PCA) and hierarchical cluster analysis (HCA) were employed to reduce the dimensionality and to investigate the relationship between the calculated descriptors and the antioxidant activity. The results showed that the antioxidant activity mainly depends on the first and second bond dissociation enthalpies of phenolic hydroxyl groups, the dipole moment and the hydrophobicity descriptors. The antioxidant activity is inversely proportional to the main descriptors. The selected descriptors discriminate the Schiff bases into active and inactive antioxidants. PMID:26784873
Pallicer, Juan M; Pascual, Rosalia; Port, Adriana; Rosés, Martí; Ràfols, Clara; Bosch, Elisabeth
2013-02-14
The influence of hydrogen bond acidity on the determination of the 1-octanol/water partition coefficient (log P(o/w)) of drugs from chromatographic measurements was studied in this work. This influence was first evaluated by comparing the Abraham solvation parameter model as applied to 1-octanol/water partitioning and to chromatographic retention, expressed as the solute polarity p. Then, several hydrogen bond acidity descriptors were compared in order to determine properly the log P(o/w) of drugs. These descriptors were obtained from different software and comprise two-dimensional parameters such as the calculated Abraham hydrogen bond acidity A and three-dimensional descriptors like HDCA-2 from the CODESSA program or the WO1 and DRDODO descriptors calculated with Volsurf+ software. The additional HOMO-LUMO polarizability descriptor should be added when the three-dimensional descriptors are used to complement the chromatographic retention. The models generated using these descriptors were compared by studying the correlations between the determined log P(o/w) values and the reference ones. The comparison showed that there was no significant difference between the tested models and that any of them was able to determine the log P(o/w) of drugs from a single chromatographic measurement and the corresponding molecular descriptor terms. However, the model that involved the calculated A descriptor was simpler and is thus recommended for practical use. Copyright © 2012 Elsevier B.V. All rights reserved.
Gonzalez Viejo, Claudia; Fuentes, Sigfredo; Torrico, Damir D; Howell, Kate; Dunshea, Frank R
2018-05-01
Sensory attributes of beer are directly linked to perceived foam-related parameters and beer color. The aim of this study was to develop an objective predictive model using machine learning modeling to assess the intensity levels of sensory descriptors in beer using the physical measurements of color and foam-related parameters. A robotic pourer (RoboBEER) was used to obtain 15 color and foam-related parameters from 22 different commercial beer samples. A sensory session using quantitative descriptive analysis (QDA®) with trained panelists was conducted to assess the intensity of 10 beer descriptors. Results showed that the principal component analysis explained 64% of data variability with correlations found between foam-related descriptors from sensory and RoboBEER such as the positive and significant correlation between carbon dioxide and carbonation mouthfeel (R = 0.62), correlation of viscosity to sensory, and maximum volume of foam and total lifetime of foam (R = 0.75, R = 0.77, respectively). Using the RoboBEER parameters as inputs, an artificial neural network (ANN) regression model showed high correlation (R = 0.91) to predict the intensity levels of 10 related sensory descriptors such as yeast, grains and hops aromas, hops flavor, bitter, sour and sweet tastes, viscosity, carbonation, and astringency. This paper is a novel approach for food science using machine modeling techniques that could contribute significantly to rapid screenings of food and brewage products for the food industry and the implementation of Artificial Intelligence (AI). The use of RoboBEER to assess beer quality proved to be a reliable, objective, accurate, and less time-consuming method to predict sensory descriptors compared to trained sensory panels. Hence, this method could be useful as a rapid screening procedure to evaluate beer quality at the end of the production line for industry applications. © 2018 Institute of Food Technologists®.
Learning moment-based fast local binary descriptor
NASA Astrophysics Data System (ADS)
Bellarbi, Abdelkader; Zenati, Nadia; Otmane, Samir; Belghit, Hayet
2017-03-01
Recently, binary descriptors have attracted significant attention due to their speed and low memory consumption; however, using intensity differences to calculate the binary descriptive vector is not efficient enough. We propose an approach to binary description called POLAR_MOBIL, in which we perform binary tests between geometrical and statistical information using moments in the patch instead of the classical intensity binary test. In addition, we introduce a learning technique used to select an optimized set of binary tests with low correlation and high variance. This approach offers high distinctiveness against affine transformations and appearance changes. An extensive evaluation on well-known benchmark datasets reveals the robustness and the effectiveness of the proposed descriptor, as well as its good performance in terms of low computation complexity when compared with state-of-the-art real-time local descriptors.
Marrero-Ponce, Yovani; Contreras-Torres, Ernesto; García-Jacas, César R; Barigye, Stephen J; Cubillán, Néstor; Alvarado, Ysaías J
2015-06-07
In the present study, we introduce novel 3D protein descriptors based on the bilinear algebraic form in the ℝ(n) space on the coulombic matrix. For the calculation of these descriptors, macromolecular vectors belonging to ℝ(n) space, whose components represent certain amino acid side-chain properties, were used as weighting schemes. Generalization approaches for the calculation of inter-amino acidic residue spatial distances based on Minkowski metrics are proposed. The simple- and double-stochastic schemes were defined as approaches to normalize the coulombic matrix. The local-fragment indices for both amino acid-types and amino acid-groups are presented in order to permit characterizing fragments of interest in proteins. In addition, with the objective of taking into account specific interactions among amino acids in global or local indices, geometric and topological cut-offs are defined. To assess the utility of global and local indices, a classification model for the prediction of the four major protein structural classes was built with the Linear Discriminant Analysis (LDA) technique. The developed LDA-model correctly classifies 92.6% and 92.7% of the proteins in the training and test sets, respectively. The obtained model showed high values of the generalized square correlation coefficient (GC(2)) on both the training and test series. The statistical parameters derived from the internal and external validation procedures demonstrate the robustness, stability and the high predictive power of the proposed model. The performance of the LDA-model demonstrates the capability of the proposed indices not only to codify relevant biochemical information related to the structural classes of proteins, but also to yield suitable interpretability. It is anticipated that the current method will benefit the prediction of other protein attributes or functions. Copyright © 2015 Elsevier Ltd. All rights reserved.
Learning Careers/Learning Trajectories. Trends and Issues Alert.
ERIC Educational Resources Information Center
Kerka, Sandra
"Learning autobiography,""learning career," and "learning trajectory" are related descriptors for the process of developing attitudes toward learning and the origins of interests, learning styles, and learning processes. The learning career is composed of events, activities, and interpretations that develop individual…
Log-Gabor Weber descriptor for face recognition
NASA Astrophysics Data System (ADS)
Li, Jing; Sang, Nong; Gao, Changxin
2015-09-01
The Log-Gabor transform, which is suitable for analyzing gradually changing data such as in iris and face images, has been widely used in image processing, pattern recognition, and computer vision. In most cases, only the magnitude or phase information of the Log-Gabor transform is considered. However, the complementary effect of combining magnitude and phase information simultaneously for an image-feature extraction problem has not been systematically explored in existing work. We propose a local image descriptor for face recognition, called the Log-Gabor Weber descriptor (LGWD). The novelty of our LGWD is twofold: (1) to fully utilize the information from the magnitude or phase feature of the multiscale and orientation Log-Gabor transform, we apply the Weber local binary pattern operator to each transform response; (2) the encoded Log-Gabor magnitude and phase information are fused at the feature level by a kernel canonical correlation analysis strategy, considering that feature-level information fusion is effective when the modalities are correlated. Experimental results on the AR, Extended Yale B, and UMIST face databases, compared with those available from recent experiments reported in the literature, show that our descriptor yields better performance than state-of-the-art methods.
Predicting Drug-induced Hepatotoxicity Using QSAR and Toxicogenomics Approaches
Low, Yen; Uehara, Takeki; Minowa, Yohsuke; Yamada, Hiroshi; Ohno, Yasuo; Urushidani, Tetsuro; Sedykh, Alexander; Muratov, Eugene; Fourches, Denis; Zhu, Hao; Rusyn, Ivan; Tropsha, Alexander
2014-01-01
Quantitative Structure-Activity Relationship (QSAR) modeling and toxicogenomics are used independently as predictive tools in toxicology. In this study, we evaluated the power of several statistical models for predicting drug hepatotoxicity in rats using different descriptors of drug molecules, namely their chemical descriptors and toxicogenomic profiles. The records were taken from the Toxicogenomics Project rat liver microarray database containing information on 127 drugs (http://toxico.nibio.go.jp/datalist.html). The model endpoint was hepatotoxicity in the rat following 28 days of exposure, established by liver histopathology and serum chemistry. First, we developed multiple conventional QSAR classification models using a comprehensive set of chemical descriptors and several classification methods (k nearest neighbor, support vector machines, random forests, and distance weighted discrimination). With chemical descriptors alone, external predictivity (Correct Classification Rate, CCR) from 5-fold external cross-validation was 61%. Next, the same classification methods were employed to build models using only toxicogenomic data (24h after a single exposure) treated as biological descriptors. The optimized models used only 85 selected toxicogenomic descriptors and had CCR as high as 76%. Finally, hybrid models combining both chemical descriptors and transcripts were developed; their CCRs were between 68 and 77%. Although the accuracy of hybrid models did not exceed that of the models based on toxicogenomic data alone, the use of both chemical and biological descriptors enriched the interpretation of the models. In addition to finding 85 transcripts that were predictive and highly relevant to the mechanisms of drug-induced liver injury, chemical structural alerts for hepatotoxicity were also identified. 
These results suggest that concurrent exploration of the chemical features and acute treatment-induced changes in transcript levels will both enrich the mechanistic understanding of sub-chronic liver injury and afford models capable of accurate prediction of hepatotoxicity from chemical structure and short-term assay results. PMID:21699217
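The Correct Classification Rate quoted above can be made concrete with a short sketch. It assumes CCR is the mean of sensitivity and specificity (i.e., balanced accuracy), a common convention that the abstract does not spell out; the function name and the example labels are hypothetical and do not come from the study.

```python
def correct_classification_rate(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = toxic, 0 = non-toxic).

    Assumption: CCR = 0.5 * (sensitivity + specificity).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)


# Hypothetical external-fold predictions: 4 toxic and 2 non-toxic compounds.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(correct_classification_rate(y_true, y_pred))  # 0.625
```

Unlike plain accuracy, this measure is not inflated when one class dominates the data set, which is why it is often preferred for imbalanced toxicity data.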
Hobin, E; Sacco, J; Vanderlee, L; White, C M; Zuo, F; Sheeshka, J; McVey, G; Fodor O'Brien, M; Hammond, D
2015-12-01
Given the proposed changes to nutrition labelling in Canada and the dearth of research examining comprehension and use of nutrition facts tables (NFts) by adolescents and young adults, our objective was to experimentally test the efficacy of modifications to NFts on young Canadians' ability to interpret, compare and mathematically manipulate nutrition information in NFts on prepackaged food. An online survey was conducted among 2010 Canadians aged 16 to 24 years drawn from a consumer sample. Participants were randomized to view two NFts according to one of six experimental conditions, using a between-groups 2 x 3 factorial design: serving size (current NFt vs. standardized serving-sizes across similar products) x percent daily value (% DV) (current NFt vs. "low/med/high" descriptors vs. colour coding). The survey included seven performance tasks requiring participants to interpret, compare and mathematically manipulate nutrition information on NFts. Separate modified Poisson regression models were conducted for each of the three outcomes. The ability to compare two similar products was significantly enhanced in NFt conditions that included standardized serving-sizes (p ≤ .001 for all). Adding descriptors or colour coding of % DV next to calories and nutrients on NFts significantly improved participants' ability to correctly interpret % DV information (p ≤ .001 for all). Providing both standardized serving-sizes and descriptors of % DV had a modest effect on participants' ability to mathematically manipulate nutrition information to calculate the nutrient content of multiple servings of a product (relative ratio = 1.19; 95% confidence limit: 1.04-1.37). Standardizing serving-sizes and adding interpretive % DV information on NFts improved young Canadians' comprehension and use of nutrition information. Some caution should be exercised in generalizing these findings to all Canadian youth due to the sampling issues associated with the study population. 
Further research is needed to replicate this study in a more heterogeneous sample in Canada and across a range of food products and categories.
Khashan, Raed; Zheng, Weifan; Tropsha, Alexander
2014-03-01
We present a novel approach to generating fragment-based molecular descriptors. The molecules are represented by labeled undirected chemical graphs. Fast Frequent Subgraph Mining (FFSM) is used to find chemical fragments (subgraphs) that occur in at least a subset of all molecules in a dataset. The collection of frequent subgraphs (FSG) forms a set of dataset-specific descriptors whose values for each molecule are defined by the number of times each frequent fragment occurs in this molecule. We have employed the FSG descriptors to develop variable selection k Nearest Neighbor (kNN) QSAR models of several datasets with a binary target property, including Maximum Recommended Therapeutic Dose (MRTD), Salmonella Mutagenicity (Ames Genotoxicity), and P-Glycoprotein (PGP) data. Each dataset was divided into training, test, and validation sets to establish the statistical figures of merit reflecting the validated predictive power of the models. The classification accuracies of models for both training and test sets for all datasets exceeded 75%, and the accuracy for the external validation sets exceeded 72%. The model accuracies were comparable to or better than those reported earlier in the literature for the same datasets. Furthermore, the use of fragment-based descriptors affords mechanistic interpretation of validated QSAR models in terms of essential chemical fragments responsible for the compounds' target property. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Austin, Jehannine C.; Hippman, Catriona; Honer, William G.
2013-01-01
Studies show that individuals with psychotic illnesses and their families want information about psychosis risks for other relatives. However, deriving accurate numeric probabilities for psychosis risk is challenging, and people have difficulty interpreting probabilistic information, thus some have suggested that clinicians should use risk descriptors, such as ‘moderate’ or ‘quite high’, rather than numbers. Little is known about how individuals with psychosis and their family members use quantitative and qualitative descriptors of risk in the specific context of chance for an individual to develop psychosis. We explored numeric and descriptive estimations of psychosis risk among individuals with psychotic disorders and unaffected first-degree relatives. In an online survey, respondents numerically and descriptively estimated risk for an individual to develop psychosis in scenarios where they had: A) no affected family members; and B) an affected sibling. 219 affected individuals and 211 first-degree relatives participated. Affected individuals estimated significantly higher risks than relatives. Participants attributed all descriptors between “very low” and “very high” to probabilities of 1%, 10%, 25% and 50%+. For a given numeric probability, different risk descriptors were attributed in different scenarios. Clinically, brief interventions around risk (using either probabilities or descriptors alone) are vulnerable to miscommunication and potentially profoundly negative consequences –interventions around risk are best suited to in-depth discussion. PMID:22421074
Pronobis, Wiktor; Tkatchenko, Alexandre; Müller, Klaus-Robert
2018-06-12
Machine learning (ML) based prediction of molecular properties across chemical compound space is an important and alternative approach to efficiently estimate the solutions of highly complex many-electron problems in chemistry and physics. Statistical methods represent molecules as descriptors that should encode molecular symmetries and interactions between atoms. Many such descriptors have been proposed; all of them have advantages and limitations. Here, we propose a set of general two-body and three-body interaction descriptors which are invariant to translation, rotation, and atomic indexing. By adapting the successfully used kernel ridge regression methods of machine learning, we evaluate our descriptors on predicting several properties of small organic molecules calculated using density-functional theory. We use two data sets. The GDB-7 set contains 6868 molecules with up to 7 heavy atoms of type CNO. The GDB-9 set is composed of 131722 molecules with up to 9 heavy atoms containing CNO. When trained on 5000 random molecules, our best model achieves an accuracy of 0.8 kcal/mol (on the remaining 1868 molecules of GDB-7) and 1.5 kcal/mol (on the remaining 126722 molecules of GDB-9) respectively. Applying a linear regression model on our novel many-body descriptors performs almost equal to a nonlinear kernelized model. Linear models are readily interpretable: a feature importance ranking measure helps to obtain qualitative and quantitative insights on the importance of two- and three-body molecular interactions for predicting molecular properties computed with quantum-mechanical methods.
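The invariance requirements named above (translation, rotation, atomic indexing) can be illustrated with a deliberately simplified two-body descriptor: the sorted list of inverse interatomic distances. This is not the descriptor proposed by the authors; it ignores element types and three-body terms, and the function name and coordinates are assumptions chosen only to show the invariances.

```python
import math


def two_body_descriptor(coords):
    """Sorted inverse pairwise distances.

    Distances are unchanged by translation and rotation; sorting removes
    the dependence on atom ordering. Element types are ignored here,
    which is a simplification made for this sketch.
    """
    inverse_distances = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            inverse_distances.append(1.0 / math.dist(coords[i], coords[j]))
    return sorted(inverse_distances)


# Hypothetical water-like geometry (angstrom): O, H, H.
molecule = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]

# Rotate 90 degrees about z, translate by (1, 2, 3), then reverse the atom order.
transformed = [(-y + 1.0, x + 2.0, z + 3.0) for x, y, z in molecule][::-1]

original = two_body_descriptor(molecule)
moved = two_body_descriptor(transformed)
print(all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(original, moved)))  # True
```

A fixed-length vector of this kind can be passed to a kernel or linear regressor; real descriptors additionally resolve pairs by element type so that chemically different molecules with similar geometry are distinguished.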
Hemmateenejad, Bahram; Yazdani, Mahdieh
2009-02-16
Steroids are widely distributed in nature and are found in abundance in plants, animals, and fungi. A data set consisting of a diverse set of steroids has been used to develop quantitative structure-electrochemistry relationship (QSER) models for their half-wave reduction potential. Modeling was established by means of multiple linear regression (MLR) and principal component regression (PCR) analyses. In the MLR analysis, the QSPR models were constructed by first grouping descriptors and then stepwise selection of variables from each group (MLR1), and by stepwise selection of predictor variables from the pool of all calculated descriptors (MLR2). A similar procedure was used in the PCR analysis, so that the principal components (or features) were extracted from different groups of descriptors (PCR1) and from the entire set of descriptors (PCR2). The resulting models were evaluated using cross-validation, chance correlation, application to the prediction of the reduction potential of some test samples, and assessment of the applicability domain. Both MLR approaches gave accurate results; however, the QSPR model found by MLR1 was statistically more significant. The PCR1 approach produced a model as accurate as the MLR approaches, whereas less accurate results were obtained by the PCR2 approach. Overall, the correlation coefficients of cross-validation and prediction of the QSPR models resulting from the MLR1, MLR2 and PCR1 approaches were higher than 90%, which shows the high ability of the models to predict the reduction potential of the studied steroids.
Hou, Tingjun; Xu, Xiaojie
2002-12-01
In this study, the relationships between the brain-blood concentration ratio of 96 structurally diverse compounds with a large number of structurally derived descriptors were investigated. The linear models were based on molecular descriptors that can be calculated for any compound simply from a knowledge of its molecular structure. The linear correlation coefficients of the models were optimized by genetic algorithms (GAs), and the descriptors used in the linear models were automatically selected from 27 structurally derived descriptors. The GA optimizations resulted in a group of linear models with three or four molecular descriptors with good statistical significance. The change of descriptor use as the evolution proceeds demonstrates that the octane/water partition coefficient and the partial negative solvent-accessible surface area multiplied by the negative charge are crucial to brain-blood barrier permeability. Moreover, we found that the predictions using multiple QSPR models from GA optimization gave quite good results in spite of the diversity of structures, which was better than the predictions using the best single model. The predictions for the two external sets with 37 diverse compounds using multiple QSPR models indicate that the best linear models with four descriptors are sufficiently effective for predictive use. Considering the ease of computation of the descriptors, the linear models may be used as general utilities to screen the blood-brain barrier partitioning of drugs in a high-throughput fashion.
Monitoring the sensory quality of canned white asparagus through cluster analysis.
Arana, Inés; Ibañez, Francisco C; Torre, Paloma
2016-05-01
White asparagus is one of the 30 vegetables most consumed in the world. This paper unifies the stages of its sensory quality control. The aims of this work were to describe the sensory properties of canned white asparagus and their quality control and to evaluate the applicability of agglomerative hierarchical clustering (AHC) for classifying and monitoring the sensory quality of manufacturers. Sixteen sensory descriptors and their evaluation technique were defined. The sensory profile of canned white asparagus was high flavor characteristic, little acidity and bitterness, medium firmness and very light fibrosity, among other characteristics. The dendrogram established groups of manufacturers that had similar scores in the same set of descriptors, and each cluster grouped the manufacturers that had a similar quality profile. The sensory profile of canned white asparagus was clearly defined through the intensity evaluation of 16 descriptors, and the sensory quality report provided to the manufacturers is detailed and easy to interpret. AHC grouped the manufacturers according to the highest quality scores in certain descriptors and is a useful tool because it is very visual. © 2015 Society of Chemical Industry.
An insight into morphometric descriptors of cell shape that pertain to regenerative medicine.
Lobo, Joana; See, Eugene Yong-Shun; Biggs, Manus; Pandit, Abhay
2016-07-01
Cellular morphology has recently been indicated as a powerful indicator of cellular function. The analysis of cell shape has evolved from rudimentary forms of microscopic visual inspection to more advanced methodologies that utilize high-resolution microscopy coupled with sophisticated computer hardware and software for data analysis. Despite this progress, there is still a lack of standardization in quantification of morphometric parameters. In addition, uncertainty remains as to which methodologies and parameters of cell morphology will yield meaningful data, which methods should be utilized to categorize cell shape, and the extent of reliability of measurements and the interpretation of the resulting analysis. A large range of descriptors has been employed to objectively assess the cellular morphology in two-dimensional and three-dimensional domains. Intuitively, simple and applicable morphometric descriptors are preferable and standardized protocols for cell shape analysis can be achieved with the help of computerized tools. In this review, cellular morphology is discussed as a descriptor of cellular function and the current morphometric parameters that are used quantitatively in two- and three-dimensional environments are described. Furthermore, the current problems associated with these morphometric measurements are addressed. Copyright © 2015 John Wiley & Sons, Ltd.
Evaluation of estimation methods for organic carbon normalized sorption coefficients
Baker, James R.; Mihelcic, James R.; Luehrs, Dean C.; Hickey, James P.
1997-01-01
A critically evaluated set of 94 soil water partition coefficients normalized to soil organic carbon content (Koc) is presented for 11 classes of organic chemicals. This data set is used to develop and evaluate Koc estimation methods using three different descriptors. The three types of descriptors used in predicting Koc were octanol/water partition coefficient (Kow), molecular connectivity (mXt) and linear solvation energy relationships (LSERs). The best results were obtained estimating Koc from Kow, though a slight improvement in the correlation coefficient was obtained by using a two-parameter regression with Kow and the third order difference term from mXt. Molecular connectivity correlations seemed to be best suited for use with specific chemical classes. The LSER provided a better fit than mXt but not as good as the correlation with Kow. The correlation to predict Koc from Kow was developed for 72 chemicals: log Koc = 0.903 log Kow + 0.094. This correlation accounts for 91% of the variability in the data for chemicals with log Kow ranging from 1.7 to 7.0. The expression to determine the 95% confidence interval on the estimated Koc is provided along with an example for two chemicals of different hydrophobicity showing the confidence interval of the retardation factor determined from the estimated Koc. The data showed that the Koc correlation is not likely to be applicable for chemicals with log Kow < 1.7. Finally, the Koc correlation developed using Kow as a descriptor was compared with three nonclass-specific correlations and two 'commonly used' class-specific correlations to determine which method(s) are most suitable.
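The regression reported above is simple enough to apply directly. The sketch below evaluates log Koc = 0.903 log Kow + 0.094 and refuses inputs outside the stated range of log Kow (1.7 to 7.0); the function name and the example value log Kow = 3.0 are hypothetical, and the 95% confidence-interval expression mentioned in the abstract is not reproduced because it is not given there.

```python
def estimate_log_koc(log_kow):
    """Estimate log Koc from log Kow with the correlation quoted in the abstract.

    The correlation was reported for 1.7 <= log Kow <= 7.0, so values
    outside that range are rejected rather than extrapolated.
    """
    if not 1.7 <= log_kow <= 7.0:
        raise ValueError("log Kow outside the range covered by the correlation")
    return 0.903 * log_kow + 0.094


# Hypothetical compound with log Kow = 3.0.
log_koc = estimate_log_koc(3.0)
print(round(log_koc, 3))  # 2.803
print(10 ** log_koc)      # Koc on a linear scale
```

Rejecting out-of-range inputs reflects the abstract's own caution that the correlation is unlikely to hold for chemicals with log Kow below 1.7.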
Temperature sensitivity of organic compound destruction in SCWO process.
Tan, Yaqin; Shen, Zhemin; Guo, Weimin; Ouyang, Chuang; Jia, Jinping; Jiang, Weili; Zhou, Haiyun
2014-03-01
To study the temperature sensitivity of the destruction of organic compounds in supercritical water oxidation process (SCWO), oxidation effects of twelve chemicals in supercritical water were investigated. The SCWO reaction rates of different compounds improved to varying degrees with the increase of temperature, so the highest slope of the temperature-effect curve (imax) was defined as the maximum ratio of removal ratio to working temperature. It is an important index to stand for the temperature sensitivity effect in SCWO. It was proven that the higher imax is, the more significant the effect of temperature on the SCWO effect is. Since the high-temperature area of SCWO equipment is subject to considerable damage from fatigue, the temperature is of great significance in SCWO equipment operation. Generally, most compounds (imax > 0.25) can be completely oxidized when the reactor temperature reaches 500°C. However, some compounds (imax > 0.25) need a higher temperature for complete oxidation, up to 560°C. To analyze the correlation coefficients between imax and various molecular descriptors, a quantum chemical method was used in this study. The structures of the twelve organic compounds were optimized by the Density Functional Theory B3LYP/6-311G method, as well as their quantum properties. It was shown that six molecular descriptors were negatively correlated to imax while other three descriptors were positively correlated to imax. Among them, dipole moment had the greatest effect on the oxidation thermodynamics of the twelve organic compounds. Once a correlation between molecular descriptors and imax is established, SCWO can be run at an appropriate temperature according to molecular structure. Copyright © 2014 The Research Centre for Eco-Environmental Sciences, Chinese Academy of Sciences. Published by Elsevier B.V. All rights reserved.
Webb, Samuel J; Hanser, Thierry; Howlin, Brendan; Krause, Paul; Vessey, Jonathan D
2014-03-25
A new algorithm has been developed to enable the interpretation of black box models. The developed algorithm is agnostic to the learning algorithm and open to all structure-based descriptors such as fragments, keys and hashed fingerprints. The algorithm has provided meaningful interpretation of Ames mutagenicity predictions from both random forest and support vector machine models built on a variety of structural fingerprints. A fragmentation algorithm is utilised to investigate the model's behaviour on specific substructures present in the query. An output is formulated summarising causes of activation and deactivation. The algorithm is able to identify multiple causes of activation or deactivation in addition to identifying localised deactivations where the prediction for the query is active overall. No loss in performance is seen as there is no change in the prediction; the interpretation is produced directly on the model's behaviour for the specific query. Models have been built using multiple learning algorithms including support vector machine and random forest. The models were built on public Ames mutagenicity data and a variety of fingerprint descriptors were used. These models produced a good performance in both internal and external validation with accuracies around 82%. The models were used to evaluate the interpretation algorithm. The interpretations produced link closely with understood mechanisms for Ames mutagenicity. This methodology allows for a greater utilisation of the predictions made by black box models and can expedite further study based on the output for a (quantitative) structure activity model. Additionally the algorithm could be utilised for chemical dataset investigation and knowledge extraction/human SAR development.
Zarei, Kobra; Atabati, Morteza; Ahmadi, Monire
2017-05-04
The bee algorithm (BA) is an optimization algorithm, inspired by the natural foraging behaviour of honey bees, that can be applied to feature selection. In this paper, shuffling cross-validation-BA (CV-BA) was applied to select the best descriptors for describing the retention factor (log k) in the biopartitioning micellar chromatography (BMC) of 79 heterogeneous pesticides. Six descriptors were obtained using BA, and the selected descriptors were then used for model development by multiple linear regression (MLR). Descriptor selection was also performed using stepwise, genetic algorithm and simulated annealing methods, each followed by MLR model development, and the results were compared with those obtained from shuffling CV-BA. The results showed that shuffling CV-BA can be applied as a powerful descriptor selection method. A support vector machine (SVM) was also applied for model development using the six descriptors selected by BA. The statistical results obtained using SVM were better than those obtained using MLR: the root mean square error (RMSE) and correlation coefficient (R) for the whole data set (training and test) were 0.1863 and 0.9426, respectively, for shuffling CV-BA-MLR, and 0.0704 and 0.9922, respectively, for shuffling CV-BA-SVM.
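The two statistics used above to compare the MLR and SVM models, RMSE and the correlation coefficient R, can be sketched as follows. This is a minimal illustration assuming R is the Pearson correlation between observed and predicted log k values; the abstract does not spell out the exact definition, and the numbers below are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need at least two matching values")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_o * var_p)


# Hypothetical observed and predicted log k values (not from the study).
observed_log_k = [1.0, 2.0, 3.0, 4.0]
predicted_log_k = [1.1, 1.9, 3.2, 3.8]
example_rmse = rmse(observed_log_k, predicted_log_k)
example_r = pearson_r(observed_log_k, predicted_log_k)
```

A lower RMSE together with an R closer to 1 is the pattern the abstract reports for the SVM model relative to MLR.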
Ponec, R; Amat, L; Carbó-Dorca, R
1999-05-01
Since the dawn of quantitative structure-properties relationships (QSPR), empirical parameters related to structural, electronic and hydrophobic molecular properties have been used as molecular descriptors to determine such relationships. Among all these parameters, Hammett sigma constants and the logarithm of the octanol-water partition coefficient, log P, have been massively employed in QSPR studies. In the present paper, a new molecular descriptor, based on quantum similarity measures (QSM), is proposed as a general substitute of these empirical parameters. This work continues previous analyses related to the use of QSM to QSPR, introducing molecular quantum self-similarity measures (MQS-SM) as a single working parameter in some cases. The use of MQS-SM as a molecular descriptor is first confirmed from the correlation with the aforementioned empirical parameters. The Hammett equation has been examined using MQS-SM for a series of substituted carboxylic acids. Then, for a series of aliphatic alcohols and acetic acid esters, log P values have been correlated with the self-similarity measure between density functions in water and octanol of a given molecule. And finally, some examples and applications of MQS-SM to determine QSAR are presented. In all studied cases MQS-SM appeared to be excellent molecular descriptors usable in general QSPR applications of chemical interest.
NASA Astrophysics Data System (ADS)
Ponec, Robert; Amat, Lluís; Carbó-Dorca, Ramon
1999-05-01
Since the dawn of quantitative structure-properties relationships (QSPR), empirical parameters related to structural, electronic and hydrophobic molecular properties have been used as molecular descriptors to determine such relationships. Among all these parameters, Hammett σ constants and the logarithm of the octanol-water partition coefficient, log P, have been massively employed in QSPR studies. In the present paper, a new molecular descriptor, based on quantum similarity measures (QSM), is proposed as a general substitute of these empirical parameters. This work continues previous analyses related to the use of QSM to QSPR, introducing molecular quantum self-similarity measures (MQS-SM) as a single working parameter in some cases. The use of MQS-SM as a molecular descriptor is first confirmed from the correlation with the aforementioned empirical parameters. The Hammett equation has been examined using MQS-SM for a series of substituted carboxylic acids. Then, for a series of aliphatic alcohols and acetic acid esters, log P values have been correlated with the self-similarity measure between density functions in water and octanol of a given molecule. And finally, some examples and applications of MQS-SM to determine QSAR are presented. In all studied cases MQS-SM appeared to be excellent molecular descriptors usable in general QSPR applications of chemical interest.
Katulska, Katarzyna; Milewska, Agata; Wykretowicz, Mateusz; Krauze, Tomasz; Przymuszala, Dagmara; Piskorski, Jaroslaw; Stajgis, Marek; Guzik, Przemyslaw; Wysocki, Henryk; Wykrętowicz, Andrzej
2013-10-01
Left atrial (LA) size is an important predictor of stroke, death, and atrial fibrillation. It was demonstrated recently that body fat, arterial stiffness and renal functions are associated with LA diameter. However, data are lacking for comprehensive assessments of all these risk factors in a single population. Therefore, the aim of the present study was to investigate the association between LA size and different fat descriptors, central hemodynamics, arterial stiffness, and renal function in healthy subjects. To this end, body fat percentage, abdominal, subcutaneous fat, and general descriptors of body fat were estimated in 162 healthy subjects (mean age 51 years). Echocardiography was performed to assess LA diameter. Arterial stiffness and peripheral and central hemodynamics were estimated by digital volume pulse analysis and pulse wave analysis. Glomerular filtration rate was estimated by MDRD formula. There were significant (p < 0.05) bivariate correlations between LA diameter and all descriptors of body fat (except subcutaneous fat). Arterial stiffness and estimated glomerular filtration rate (eGFR) were also significantly correlated with LA size. Multiple regression analysis including all significant confounders, such as sex, mean arterial pressure, arterial stiffness, eGFR and body fat descriptors, explained 35% of variance in LA diameter. In conclusion, the present study reveals significant, independent relationships between body fat, arterial stiffness, and LA size.
Computational study of AuSi_n (n=1-9) nanoalloy clusters invoking DFT based descriptors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ranjan, Prabhat; Kumar, Ajay; Chakraborty, Tanmoy, E-mail: tanmoy.chakraborty@jaipur.manipal.edu, E-mail: tanmoychem@gmail.com
2016-04-13
Nanoalloy clusters formed between Au and Si are topics of great interest today from both scientific and technological points of view. Due to their remarkable catalytic, electronic, mechanical and magnetic properties, Au-Si nanoalloy clusters have extensive applications in microelectronics, catalysis, biomedicine, and the jewelry industry. Density functional theory (DFT) is a quantum-mechanical approach that is widely used to study the electronic properties of materials. Conceptual DFT based descriptors have been invoked to correlate the experimental properties of nanoalloy clusters. In this venture, we have systematically investigated AuSi_n (n=1-9) nanoalloy clusters in the theoretical frame of the B3LYP exchange correlation. The experimental properties of AuSi_n (n=1-9) nanoalloy clusters are correlated in terms of DFT based descriptors, viz. HOMO-LUMO gap, electronegativity (χ), global hardness (η), global softness (S) and electrophilicity index (ω). The calculated HOMO-LUMO gap exhibits an interesting odd-even alternation, indicating that even-numbered clusters possess higher stability than their neighbouring odd-numbered clusters. This study also shows very good agreement between experimental bond lengths and computed data.
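The conceptual DFT descriptors named in the abstract above are commonly estimated from frontier orbital energies. The sketch below assumes Koopmans-type approximations (ionization energy I ≈ −E_HOMO, electron affinity A ≈ −E_LUMO) and the convention η = (I − A)/2, S = 1/(2η), ω = χ²/(2η); the abstract does not state which convention the authors used, and some authors define η = I − A instead. The function name and the orbital energies are hypothetical.

```python
def conceptual_dft_descriptors(e_homo, e_lumo):
    """Estimate global reactivity descriptors from orbital energies (in eV).

    Assumptions (not taken from the abstract): I = -E_HOMO, A = -E_LUMO,
    and the hardness convention eta = (I - A) / 2.
    """
    ionization_energy = -e_homo
    electron_affinity = -e_lumo
    gap = e_lumo - e_homo
    chi = (ionization_energy + electron_affinity) / 2.0  # electronegativity
    eta = (ionization_energy - electron_affinity) / 2.0  # global hardness
    if eta <= 0:
        raise ValueError("E_LUMO must lie above E_HOMO")
    softness = 1.0 / (2.0 * eta)                         # global softness
    omega = chi ** 2 / (2.0 * eta)                       # electrophilicity index
    return {
        "gap": gap,
        "electronegativity": chi,
        "hardness": eta,
        "softness": softness,
        "electrophilicity": omega,
    }


# Hypothetical orbital energies in eV (not values from the study).
example_descriptors = conceptual_dft_descriptors(e_homo=-6.0, e_lumo=-2.0)
```

Under this convention a larger HOMO-LUMO gap gives a larger hardness and a smaller softness, which is consistent with reading the gap as a stability indicator.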
Evidence-Centered Design as a Foundation for ALD Development
ERIC Educational Resources Information Center
Plake, Barbara S.; Huff, Kristen; Reshetar, Rosemary
2009-01-01
[Slides] presented at the Annual Meeting of National Council on Measurement in Education (NCME) in San Diego, CA in April 2009. This presentation discusses a methodology for directly connecting evidence-centered assessment design (ECD) to score interpretation and use through the development of Achievement level descriptors.
Olfactory perception of chemically diverse molecules.
Keller, Andreas; Vosshall, Leslie B
2016-08-08
Understanding the relationship between a stimulus and how it is perceived reveals fundamental principles about the mechanisms of sensory perception. While this stimulus-percept problem is mostly understood for color vision and tone perception, it is not currently possible to predict how a given molecule smells. While there has been some progress in predicting the pleasantness and intensity of an odorant, perceptual data for a larger number of diverse molecules are needed to improve current predictions. Towards this goal, we tested the olfactory perception of 480 structurally and perceptually diverse molecules at two concentrations using a panel of 55 healthy human subjects. For each stimulus, we collected data on perceived intensity, pleasantness, and familiarity. In addition, subjects were asked to apply 20 semantic odor quality descriptors to these stimuli, and were offered the option to describe the smell in their own words. Using this dataset, we replicated several previous correlations between molecular features of the stimulus and olfactory perception. The number of sulfur atoms in a molecule was correlated with the odor quality descriptors "garlic," "fish," and "decayed," and large and structurally complex molecules were perceived to be more pleasant. We discovered a number of correlations in intensity perception between molecules. We show that familiarity had a strong effect on the ability of subjects to describe a smell. Many subjects used commercial products to describe familiar odorants, highlighting the role of prior experience in verbal reports of olfactory perception. Nonspecific descriptors like "chemical" were applied frequently to unfamiliar odorants, and unfamiliar odorants were generally rated as neither pleasant nor unpleasant. We present a very large psychophysical dataset and use this to correlate molecular features of a stimulus to olfactory percept. 
Our work reveals robust correlations between molecular features and perceptual qualities, and highlights the dominant role of familiarity and experience in assigning verbal descriptors to odorants.
Local Multi-Grouped Binary Descriptor With Ring-Based Pooling Configuration and Optimization.
Gao, Yongqiang; Huang, Weilin; Qiao, Yu
2015-12-01
Local binary descriptors are attracting increasing attention due to their great advantages in computational speed, which allow real-time performance in numerous image/vision applications. Various methods have been proposed to learn data-dependent binary descriptors. However, most existing binary descriptors aim overly at computational simplicity at the expense of significant information loss, which causes ambiguity in similarity measures using the Hamming distance. In this paper, considering that multiple features might share complementary information, we present a novel local binary descriptor, referred to as the ring-based multi-grouped descriptor (RMGD), to bridge the performance gap between current binary and floating-point descriptors. Our contributions are twofold. First, we introduce a new pooling configuration based on spatial ring-region sampling, which allows binary tests on the full set of pairwise regions with different shapes, scales, and distances. This leads to a more meaningful description than existing methods, which normally apply a limited set of pooling configurations. Then, an extended Adaboost is proposed for efficient bit selection by emphasizing high variance and low correlation, achieving a highly compact representation. Second, the RMGD is computed from multiple image properties from which binary strings are extracted. We cast the integration of multi-grouped features as a rankSVM or sparse support vector machine learning problem, so that different features can compensate strongly for each other, which is the key to discriminativeness and robustness. The performance of the RMGD was evaluated on a number of publicly available benchmarks, where the RMGD significantly outperforms state-of-the-art binary descriptors.
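The Hamming-distance similarity measure mentioned above is the reason binary descriptors are fast to match. The following minimal sketch assumes each descriptor is packed into a non-negative Python integer; it illustrates generic binary-descriptor matching only and is not the RMGD method itself. The function names and bit patterns are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two packed binary descriptors differ."""
    if a < 0 or b < 0:
        raise ValueError("descriptors must be non-negative integers")
    # XOR leaves a 1 exactly where the two bit strings disagree.
    return bin(a ^ b).count("1")


def nearest_descriptor(query: int, candidates: list) -> int:
    """Return the candidate with the smallest Hamming distance to the query."""
    if not candidates:
        raise ValueError("no candidates given")
    return min(candidates, key=lambda c: hamming_distance(query, c))


# Hypothetical 4-bit descriptors (real descriptors are typically 128-512 bits).
example_distance = hamming_distance(0b1011, 0b0010)
example_match = nearest_descriptor(0b1011, [0b0000, 0b1010, 0b0100])
```

Because the distance only counts differing bits, two different patches that happen to share most bits look similar, which is the ambiguity the abstract attributes to information loss.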
Berthod, L; Whitley, D C; Roberts, G; Sharpe, A; Greenwood, R; Mills, G A
2017-02-01
Understanding the sorption of pharmaceuticals to sewage sludge during waste water treatment processes is important for understanding their environmental fate and in risk assessments. The degree of sorption is defined by the sludge/water partition coefficient (Kd). Experimental Kd values (n=297) for active pharmaceutical ingredients (n=148) in primary and activated sludge were collected from literature. The compounds were classified by their charge at pH 7.4 (44 uncharged, 60 positively and 28 negatively charged, and 16 zwitterions). Univariate models relating log Kd to log Kow for each charge class showed weak correlations (maximum R² = 0.51 for positively charged) with no overall correlation for the combined dataset (R² = 0.04). Weaker correlations were found when relating log Kd to log Dow. Three sets of molecular descriptors (Molecular Operating Environment, VolSurf and ParaSurf) encoding a range of physico-chemical properties were used to derive multivariate models using stepwise regression, partial least squares and Bayesian artificial neural networks (ANN). The best predictive performance was obtained with ANN, with R² = 0.62-0.69 for these descriptors using the complete dataset. Use of more complex VolSurf and ParaSurf descriptors showed little improvement over Molecular Operating Environment descriptors. The most influential descriptors in the ANN models, identified by automatic relevance determination, highlighted the importance of hydrophobicity, charge and molecular shape effects in these sorbate-sorbent interactions. The heterogeneous nature of the different sewage sludges used to measure Kd limited the predictability of sorption from physico-chemical properties of the pharmaceuticals alone. Standardization of test materials for the measurement of Kd would improve comparability of data from different studies, in the long-term leading to better quality environmental risk assessments. Copyright © 2016 British Geological Survey, NERC.
Published by Elsevier B.V. All rights reserved.
Predicting the activity of drugs for a group of imidazopyridine anticoccidial compounds.
Si, Hongzong; Lian, Ning; Yuan, Shuping; Fu, Aiping; Duan, Yun-Bo; Zhang, Kejun; Yao, Xiaojun
2009-10-01
Gene expression programming (GEP) is a novel machine learning technique. Here GEP is used to build a nonlinear quantitative structure-activity relationship model for predicting the IC50 of imidazopyridine anticoccidial compounds. The model is based on descriptors calculated from the molecular structure. Four descriptors are selected from the descriptor pool by the heuristic method (HM) to build a multivariable linear model. The GEP method produced a nonlinear quantitative model with a correlation coefficient and a mean error of 0.96 and 0.24 for the training set, and 0.91 and 0.52 for the test set, respectively. The GEP predictions are in good agreement with the experimental values.
Kajita, Seiji; Ohba, Nobuko; Jinnouchi, Ryosuke; Asahi, Ryoji
2017-12-05
Material informatics (MI), driven by machine-learning algorithms, is a promising approach to liberate us from the time-consuming Edisonian (trial and error) process of material discovery. Several descriptors, which are encoded material features fed to computers, have been proposed in the last few decades. For solid systems in particular, however, their insufficient representation of the three-dimensionality of field quantities such as electron distributions and local potentials has critically hindered broad and practical success of solid-state MI. We develop a simple, generic 3D voxel descriptor that compacts any field quantity in a way suitable for convolutional neural networks (CNNs). We examine the 3D voxel descriptor encoded from the electron distribution by a regression test with data for 680 oxides. The present scheme outperforms other existing descriptors in the prediction of Hartree energies, which are significantly relevant to the long-wavelength distribution of the valence electrons. The results indicate that this scheme can forecast any functional of field quantities just by learning a sufficient amount of data, if there is an explicit correlation between the target properties and the field quantities. This 3D descriptor opens a way to import prominent CNN-based algorithms for supervised, semi-supervised and reinforcement learning into solid-state MI.
Wang, Yi; Shao, Yonghua; Wang, Yangyang; Fan, Lingling; Yu, Xiang; Zhi, Xiaoyan; Yang, Chun; Qu, Huan; Yao, Xiaojun; Xu, Hui
2012-08-29
In continuation of our program aimed at the discovery and development of natural-product-based insecticidal agents, 33 isoxazoline and oxime derivatives of podophyllotoxin modified in the C and D rings were synthesized, and their structures were characterized by proton nuclear magnetic resonance (1H NMR), high-resolution mass spectrometry (HRMS), electrospray ionization-mass spectrometry (ESI-MS), optical rotation, melting point (mp), and infrared (IR) spectroscopy. The stereochemical configurations of compounds 5e, 5f, and 9f were unambiguously determined by X-ray crystallography. Their insecticidal activity was evaluated against the pre-third-instar larvae of northern armyworm, Mythimna separata (Walker), in vivo. Compounds 5e, 9c, 11g, and 11h in particular exhibited more promising insecticidal activity than toosendanin, a commercial botanical insecticide extracted from Melia azedarach. A genetic algorithm combined with multiple linear regression (GA-MLR) calculation was performed with the MOBY DIGS package. The five selected descriptors are as follows: one two-dimensional (2D) autocorrelation descriptor (GATS4e), one edge adjacency index (EEig06x), one RDF descriptor (RDF080v), one three-dimensional (3D) MoRSE descriptor (Mor09v), and one atom-centered fragment (H-052) descriptor. Quantitative structure-activity relationship studies demonstrated that the insecticidal activity of these compounds was influenced by many factors, such as electronic distribution and steric factors. For this model, the standard deviation error in prediction (SDEP) is 0.0592, the correlation coefficient (R²) is 0.861, and the leave-one-out cross-validation correlation coefficient (Q²LOO) is 0.797.
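The leave-one-out cross-validated coefficient reported above can be illustrated with a minimal sketch. For brevity it assumes a single descriptor and ordinary least squares rather than the five-descriptor GA-MLR model of the study, and it assumes the common definition Q² = 1 − PRESS/SS_tot with SS_tot taken about the mean of all observed values; other definitions exist. All names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("descriptor values are constant")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def q2_loo(xs, ys):
    """Leave-one-out Q^2: refit without each compound, then predict it."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three matching observations")
    press = 0.0  # predictive residual sum of squares
    for i in range(n):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / n
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("activity values are constant")
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and activities (not data from the study).
example_x = [1.0, 2.0, 3.0, 4.0]
example_y = [2.1, 3.9, 6.2, 7.8]
example_q2 = q2_loo(example_x, example_y)
```

Q² is normally lower than the fitted R² because each compound is predicted by a model that never saw it, which matches the ordering of the two values reported in the abstract.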
Murat, Miraemiliana; Abu, Arpah; Yap, Hwa Jen; Yong, Kien-Thai
2017-01-01
Plants play a crucial role in foodstuff, medicine, industry, and environmental protection. The skill of recognising plants is very important in some applications, including conservation of endangered species and rehabilitation of lands after mining activities. However, it is a difficult task to identify plant species because it requires specialized knowledge. Developing an automated classification system for plant species is necessary and valuable since it can help specialists as well as the public in identifying plant species easily. Shape descriptors were applied on the myDAUN dataset that contains 45 tropical shrub species collected from the University of Malaya (UM), Malaysia. Based on literature review, this is the first study in the development of tropical shrub species image dataset and classification using a hybrid of leaf shape and machine learning approach. Four types of shape descriptors were used in this study namely morphological shape descriptors (MSD), Histogram of Oriented Gradients (HOG), Hu invariant moments (Hu) and Zernike moments (ZM). Single descriptor, as well as the combination of hybrid descriptors were tested and compared. The tropical shrub species are classified using six different classifiers, which are artificial neural network (ANN), random forest (RF), support vector machine (SVM), k-nearest neighbour (k-NN), linear discriminant analysis (LDA) and directed acyclic graph multiclass least squares twin support vector machine (DAG MLSTSVM). In addition, three types of feature selection methods were tested in the myDAUN dataset, Relief, Correlation-based feature selection (CFS) and Pearson’s coefficient correlation (PCC). The well-known Flavia dataset and Swedish Leaf dataset were used as the validation dataset on the proposed methods. 
The results showed that the hybrid of all descriptors of ANN outperformed the other classifiers with an average classification accuracy of 98.23% for the myDAUN dataset, 95.25% for the Flavia dataset and 99.89% for the Swedish Leaf dataset. In addition, the Relief feature selection method achieved the highest classification accuracy of 98.13% after 80 (or 60%) of the original features were reduced, from 133 to 53 descriptors in the myDAUN dataset with the reduction in computational time. Subsequently, the hybridisation of four descriptors gave the best results compared to others. It is proven that the combination MSD and HOG were good enough for tropical shrubs species classification. Hu and ZM descriptors also improved the accuracy in tropical shrubs species classification in terms of invariant to translation, rotation and scale. ANN outperformed the others for tropical shrub species classification in this study. Feature selection methods can be used in the classification of tropical shrub species, as the comparable results could be obtained with the reduced descriptors and reduced in computational time and cost. PMID:28924506
Murat, Miraemiliana; Chang, Siow-Wee; Abu, Arpah; Yap, Hwa Jen; Yong, Kien-Thai
2017-01-01
Plants play a crucial role in foodstuff, medicine, industry, and environmental protection. The skill of recognising plants is very important in some applications, including conservation of endangered species and rehabilitation of lands after mining activities. However, it is a difficult task to identify plant species because it requires specialized knowledge. Developing an automated classification system for plant species is necessary and valuable since it can help specialists as well as the public in identifying plant species easily. Shape descriptors were applied on the myDAUN dataset that contains 45 tropical shrub species collected from the University of Malaya (UM), Malaysia. Based on literature review, this is the first study in the development of tropical shrub species image dataset and classification using a hybrid of leaf shape and machine learning approach. Four types of shape descriptors were used in this study namely morphological shape descriptors (MSD), Histogram of Oriented Gradients (HOG), Hu invariant moments (Hu) and Zernike moments (ZM). Single descriptor, as well as the combination of hybrid descriptors were tested and compared. The tropical shrub species are classified using six different classifiers, which are artificial neural network (ANN), random forest (RF), support vector machine (SVM), k-nearest neighbour (k-NN), linear discriminant analysis (LDA) and directed acyclic graph multiclass least squares twin support vector machine (DAG MLSTSVM). In addition, three types of feature selection methods were tested in the myDAUN dataset, Relief, Correlation-based feature selection (CFS) and Pearson's coefficient correlation (PCC). The well-known Flavia dataset and Swedish Leaf dataset were used as the validation dataset on the proposed methods. 
The results showed that the hybrid of all descriptors of ANN outperformed the other classifiers with an average classification accuracy of 98.23% for the myDAUN dataset, 95.25% for the Flavia dataset and 99.89% for the Swedish Leaf dataset. In addition, the Relief feature selection method achieved the highest classification accuracy of 98.13% after 80 (or 60%) of the original features were reduced, from 133 to 53 descriptors in the myDAUN dataset with the reduction in computational time. Subsequently, the hybridisation of four descriptors gave the best results compared to others. It is proven that the combination MSD and HOG were good enough for tropical shrubs species classification. Hu and ZM descriptors also improved the accuracy in tropical shrubs species classification in terms of invariant to translation, rotation and scale. ANN outperformed the others for tropical shrub species classification in this study. Feature selection methods can be used in the classification of tropical shrub species, as the comparable results could be obtained with the reduced descriptors and reduced in computational time and cost.
Chemosensory characteristics of regional Vidal icewines from China and Canada.
Huang, Ling; Ma, Yue; Tian, Xin; Li, Ji-Ming; Li, Lan-Xiao; Tang, Ke; Xu, Yan
2018-09-30
This work aimed to compare the flavor characteristics of Vidal icewines from China and Canada and to establish relationships between sensory descriptors and chemical composition. Descriptive analysis was performed with a trained panel to obtain the sensory profiles. Thirty important aroma-active compounds were quantified by four different methodologies. Partial least squares discriminant analysis was used to identify candidate compounds, which were unique to certain sensory descriptors. The sensory profiles of icewines from China were characterized by nut and honey aromas, while icewines from Canada expressed caramel and rose aromas. Nut and honey aromas had a close correlation with 1-hexanol, isoamyl acetate, phenethyl acetate and phenylethyl alcohol. Caramel aroma was correlated with ethyl esters and lactones and rose aroma was correlated with terpenes. Copyright © 2018 Elsevier Ltd. All rights reserved.
Hypercognitive seizures - Proposal of a new term for the phenomenon forced thinking in epilepsy.
Stephani, C; Koubeissi, M
2017-08-01
Here we propose the term hypercognitive seizures as a descriptor for seizures that manifest as a transient mental experience of intrusive thoughts or words that do not consist mainly of reminiscence. Currently, the term forced thinking is used to describe this uncommon seizure semiology, which has also been elicited by electrical brain stimulation. The available literature on forced thinking shows discordant interpretations of its meaning, justifying the suggestion of a new descriptor. In this paper, we would like to suggest and explain the term hypercognitive seizure and argue that this type of seizure lateralizes to the dominant hemisphere. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
André, M. P.; Galperin, M.; Berry, A.; Ojeda-Fournier, H.; O'Boyle, M.; Olson, L.; Comstock, C.; Taylor, A.; Ledgerwood, M.
Our computer-aided diagnostic (CADx) tool uses advanced image processing and artificial intelligence to analyze findings on breast sonography images. The goal is to standardize reporting of such findings using well-defined descriptors and to improve accuracy and reproducibility of interpretation of breast ultrasound by radiologists. This study examined several factors that may impact accuracy and reproducibility of the CADx software, which proved to be highly accurate and stable over several operating conditions.
Li, Hong Zhi; Hu, Li Hong; Tao, Wei; Gao, Ting; Li, Hui; Lu, Ying Hua; Su, Zhong Min
2012-01-01
A DFT-SOFM-RBFNN method is proposed to improve the accuracy of DFT calculations on Y-NO (Y = C, N, O, S) homolysis bond dissociation energies (BDE) by combining density functional theory (DFT) and artificial intelligence/machine learning methods, which consist of self-organizing feature mapping neural networks (SOFMNN) and radial basis function neural networks (RBFNN). A descriptor refinement step including SOFMNN clustering analysis and correlation analysis is implemented. The SOFMNN clustering analysis is applied to classify descriptors, and the representative descriptors in the groups are selected as neural network inputs according to their closeness to the experimental values through correlation analysis. Redundant descriptors and intuitively biased choices of descriptors can be avoided by this newly introduced step. Using RBFNN calculation with the selected descriptors, chemical accuracy (≤1 kcal·mol−1) is achieved for all 92 calculated organic Y-NO homolysis BDE calculated by DFT-B3LYP, and the mean absolute deviations (MADs) of the B3LYP/6-31G(d) and B3LYP/STO-3G methods are reduced from 4.45 and 10.53 kcal·mol−1 to 0.15 and 0.18 kcal·mol−1, respectively. The improved results for the minimal basis set STO-3G reach the same accuracy as those of 6-31G(d), and thus B3LYP calculation with the minimal basis set is recommended to be used for minimizing the computational cost and to expand the applications to large molecular systems. Further extrapolation tests are performed with six molecules (two containing Si-NO bonds and two containing fluorine), and the accuracy of the tests was within 1 kcal·mol−1. This study shows that DFT-SOFM-RBFNN is an efficient and highly accurate method for Y-NO homolysis BDE. The method may be used as a tool to design new NO carrier molecules.
Li, Hong Zhi; Hu, Li Hong; Tao, Wei; Gao, Ting; Li, Hui; Lu, Ying Hua; Su, Zhong Min
2012-01-01
A DFT-SOFM-RBFNN method is proposed to improve the accuracy of DFT calculations on Y-NO (Y = C, N, O, S) homolysis bond dissociation energies (BDE) by combining density functional theory (DFT) and artificial intelligence/machine learning methods, which consist of self-organizing feature mapping neural networks (SOFMNN) and radial basis function neural networks (RBFNN). A descriptor refinement step including SOFMNN clustering analysis and correlation analysis is implemented. The SOFMNN clustering analysis is applied to classify descriptors, and the representative descriptors in the groups are selected as neural network inputs according to their closeness to the experimental values through correlation analysis. Redundant descriptors and intuitively biased choices of descriptors can be avoided by this newly introduced step. Using RBFNN calculation with the selected descriptors, chemical accuracy (≤1 kcal·mol−1) is achieved for all 92 calculated organic Y-NO homolysis BDE calculated by DFT-B3LYP, and the mean absolute deviations (MADs) of the B3LYP/6-31G(d) and B3LYP/STO-3G methods are reduced from 4.45 and 10.53 kcal·mol−1 to 0.15 and 0.18 kcal·mol−1, respectively. The improved results for the minimal basis set STO-3G reach the same accuracy as those of 6-31G(d), and thus B3LYP calculation with the minimal basis set is recommended to be used for minimizing the computational cost and to expand the applications to large molecular systems. Further extrapolation tests are performed with six molecules (two containing Si-NO bonds and two containing fluorine), and the accuracy of the tests was within 1 kcal·mol−1. This study shows that DFT-SOFM-RBFNN is an efficient and highly accurate method for Y-NO homolysis BDE. The method may be used as a tool to design new NO carrier molecules. PMID:22942689
New Fukui, dual and hyper-dual kernels as bond reactivity descriptors.
Franco-Pérez, Marco; Polanco-Ramírez, Carlos-A; Ayers, Paul W; Gázquez, José L; Vela, Alberto
2017-06-21
We define three new linear response indices with promising applications for bond reactivity using the mathematical framework of τ-CRT (finite temperature chemical reactivity theory). The τ-Fukui kernel is defined as the ratio between the fluctuations of the average electron density at two different points in the space and the fluctuations in the average electron number and is designed to integrate to the finite-temperature definition of the electronic Fukui function. When this kernel is condensed, it can be interpreted as a site-reactivity descriptor of the boundary region between two atoms. The τ-dual kernel corresponds to the first order response of the Fukui kernel and is designed to integrate to the finite temperature definition of the dual descriptor; it indicates the ambiphilic reactivity of a specific bond and enriches the traditional dual descriptor by allowing one to distinguish between the electron-accepting and electron-donating processes. Finally, the τ-hyper dual kernel is defined as the second-order derivative of the Fukui kernel and is proposed as a measure of the strength of ambiphilic bonding interactions. Although these quantities have never been proposed, our results for the τ-Fukui kernel and for τ-dual kernel can be derived in zero-temperature formulation of the chemical reactivity theory with, among other things, the widely-used parabolic interpolation model.
Structure-Activity Relationships for Rates of Aromatic Amine Oxidation by Manganese Dioxide.
Salter-Blanc, Alexandra J; Bylaska, Eric J; Lyon, Molly A; Ness, Stuart C; Tratnyek, Paul G
2016-05-17
New energetic compounds are designed to minimize their potential environmental impacts, which includes their transformation and the fate and effects of their transformation products. The nitro groups of energetic compounds are readily reduced to amines, and the resulting aromatic amines are subject to oxidation and coupling reactions. Manganese dioxide (MnO2) is a common environmental oxidant and model system for kinetic studies of aromatic amine oxidation. In this study, a training set of new and previously reported kinetic data for the oxidation of model and energetic-derived aromatic amines was assembled and subjected to correlation analysis against descriptor variables that ranged from general purpose [Hammett σ constants (σ⁻), pKa values of the amines, and energies of the highest occupied molecular orbital (EHOMO)] to specific for the likely rate-limiting step [one-electron oxidation potentials (Eox)]. The selection of calculated descriptors (pKa, EHOMO, and Eox) was based on validation with experimental data. All of the correlations gave satisfactory quantitative structure-activity relationships (QSARs), but they improved with the specificity of the descriptor. The scope of correlation analysis was extended beyond MnO2 to include literature data on aromatic amine oxidation by other environmentally relevant oxidants (ozone, chlorine dioxide, and phosphate and carbonate radicals) by correlating relative rate constants (normalized to 4-chloroaniline) to EHOMO (calculated with a modest level of theory).
Large-scale feature searches of collections of medical imagery
NASA Astrophysics Data System (ADS)
Hedgcock, Marcus W.; Karshat, Walter B.; Levitt, Tod S.; Vosky, D. N.
1993-09-01
Large scale feature searches of accumulated collections of medical imagery are required for multiple purposes, including clinical studies, administrative planning, epidemiology, teaching, quality improvement, and research. To perform a feature search of large collections of medical imagery, one can either search text descriptors of the imagery in the collection (usually the interpretation), or (if the imagery is in digital format) the imagery itself. At our institution, text interpretations of medical imagery are all available in our VA Hospital Information System. These are downloaded daily into an off-line computer. The text descriptors of most medical imagery are usually formatted as free text, and so require a user friendly database search tool to make searches quick and easy for any user to design and execute. We are tailoring such a database search tool (Liveview), developed by one of the authors (Karshat). To further facilitate search construction, we are constructing (from our accumulated interpretation data) a dictionary of medical and radiological terms and synonyms. If the imagery database is digital, the imagery which the search discovers is easily retrieved from the computer archive. We describe our database search user interface, with examples, and compare the efficacy of computer assisted imagery searches from a clinical text database with manual searches. Our initial work on direct feature searches of digital medical imagery is outlined.
NASA Astrophysics Data System (ADS)
Shevade, Abhijit V.; Ryan, Margaret A.; Homer, Margie L.; Zhou, Hanying; Manfreda, Allison M.; Lara, Liana M.; Yen, Shiao-Pin S.; Jewell, April D.; Manatt, Kenneth S.; Kisor, Adam K.
We have developed a Quantitative Structure-Activity Relationships (QSAR) based approach to correlate the response of chemical sensors in an array with molecular descriptors. A novel molecular descriptor set has been developed; this set combines descriptors of sensing film-analyte interactions, representing sensor response, with a basic analyte descriptor set commonly used in QSAR studies. The descriptors are obtained using a combination of molecular modeling tools and empirical and semi-empirical Quantitative Structure-Property Relationships (QSPR) methods. The sensors under investigation are polymer-carbon sensing films which have been exposed to analyte vapors at parts-per-million (ppm) concentrations; response is measured as change in film resistance. Statistically validated QSAR models have been developed using Genetic Function Approximations (GFA) for a sensor array for a given training data set. The applicability of the sensor response models has been tested by using them to predict the sensor activities for test analytes not included in the training set used for model development. The validated QSAR sensor response models show good predictive ability. The QSAR approach is a promising computational tool for sensing materials evaluation and selection. It can also be used to predict the response of an existing sensing film to new target analytes.
Bestelmeyer, Brandon T.; Williamson, Jeb C.; Talbot, Curtis J.; Cates, Greg W.; Duniway, Michael C.; Brown, Joel R.
2016-01-01
State-and-transition models (STMs) are useful tools for management, but they can be difficult to use and have limited content. STMs created for groups of related ecological sites could simplify and improve their utility. The amount of information linked to models can be increased using tables that communicate management interpretations and important within-group variability. We created a new web-based information system (the Ecosystem Dynamics Interpretive Tool) to house STMs, associated tabular information, and other ecological site data and descriptors. Fewer, more informative, better organized, and easily accessible STMs should increase the accessibility of science information.
Sensory shelf life of dulce de leche.
Garitta, L; Hough, G; Sánchez, R
2004-06-01
The objectives of this research were to determine the sensory cutoff points for dulce de leche (DL) critical descriptors, both for defective off-flavors and for storage changes in desirable attributes, and to estimate the shelf life of DL as a function of storage temperature. The critical descriptors used to determine the cutoff points were plastic flavor, burnt flavor, dark color, and spreadability. Linear correlations between sensory acceptability and trained panel scores were used to determine the sensory failure cutoff point for each descriptor. To estimate shelf life, DL samples were stored at 25, 37, and 45 °C. Plastic flavor was the first descriptor to reach its cutoff point at 25 °C and was used for shelf-life calculations. Plastic flavor vs. storage time followed zero-order reaction rate. Shelf-life estimations at different temperatures were 109 d at 25 °C, 53 d at 37 °C, and 9 d at 45 °C. The activation energy, necessary to calculate shelf lives at different temperatures, was 14,370 ± 2080 cal/mol.
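The temperature dependence described in this abstract can be illustrated with the Arrhenius relation. The following is a minimal Python sketch, not the authors' fitting procedure: it assumes that shelf life is inversely proportional to a rate constant of the form k = A·exp(−Ea/(RT)), and it takes the reported activation energy (14,370 cal/mol) and the reported 25 °C shelf life (109 d) as the reference point. The function name `shelf_life_days` and its parameters are hypothetical.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mol*K)


def shelf_life_days(temp_c, ref_days=109.0, ref_temp_c=25.0, ea_cal_per_mol=14370.0):
    """Extrapolate shelf life from a reference temperature (Arrhenius sketch).

    Assumption: shelf life is inversely proportional to a rate constant
    k = A * exp(-Ea / (R * T)). The default values are those quoted in the
    abstract; the function itself is illustrative only.
    """
    t_kelvin = temp_c + 273.15
    ref_kelvin = ref_temp_c + 273.15
    # Ratio of rate constants k(ref) / k(T) expressed through the exponent.
    exponent = (ea_cal_per_mol / R_CAL) * (1.0 / t_kelvin - 1.0 / ref_kelvin)
    return ref_days * math.exp(exponent)


if __name__ == "__main__":
    for temp in (25.0, 37.0, 45.0):
        print(f"{temp:.0f} C: {shelf_life_days(temp):.1f} d")
```

Because this extrapolates from a single reference point, its output need not match the shelf lives reported at 37 and 45 °C, which were estimated from the authors' own storage data.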
Combining multiple features for color texture classification
NASA Astrophysics Data System (ADS)
Cusano, Claudio; Napoletano, Paolo; Schettini, Raimondo
2016-11-01
The analysis of color and texture has a long history in image analysis and computer vision. These two properties are often considered as independent, even though they are strongly related in images of natural objects and materials. Correlation between color and texture information is especially relevant in the case of variable illumination, a condition that has a crucial impact on the effectiveness of most visual descriptors. We propose an ensemble of hand-crafted image descriptors designed to capture different aspects of color textures. We show that the use of these descriptors in a multiple classifiers framework makes it possible to achieve a very high classification accuracy in classifying texture images acquired under different lighting conditions. A powerful alternative to hand-crafted descriptors is represented by features obtained with deep learning methods. We also show how, with the proposed combining strategy, hand-crafted and convolutional neural network features can be used together to further improve the classification accuracy. Experimental results on a food database (raw food texture) demonstrate the effectiveness of the proposed strategy.
Cavity Versus Ligand Shape Descriptors: Application to Urokinase Binding Pockets.
Cerisier, Natacha; Regad, Leslie; Triki, Dhoha; Camproux, Anne-Claude; Petitjean, Michel
2017-11-01
We analyzed 78 binding pockets of the human urokinase plasminogen activator (uPA) catalytic domain extracted from a data set of crystallized uPA-ligand complexes. These binding pockets were computed with an original geometric method that does NOT involve any arbitrary parameter, such as cutoff distances, angles, and so on. We measured the deviation from convexity of each pocket shape with the pocket convexity index (PCI). We defined a new pocket descriptor called distributional sphericity coefficient (DISC), which indicates to which extent the protein atoms of a given pocket lie on the surface of a sphere. The DISC values were computed with the freeware PCI. The pocket descriptors and their high correspondences with ligand descriptors are crucial for polypharmacology prediction. We found that the protein heavy atoms lining the urokinases binding pockets are either located on the surface of their convex hull or lie close to this surface. We also found that the radii of the urokinases binding pockets and the radii of their ligands are highly correlated (r = 0.9).
Liu, Shu-Shen; Liu, Yan; Yin, Da-Qian; Wang, Xiao-Dong; Wang, Lian-Sheng
2006-02-01
Using the molecular electronegativity distance vector (MEDV) descriptors derived directly from the molecular topological structures, the gas chromatographic relative retention times (RRTs) of 209 polychlorinated biphenyls (PCBs) on the SE-54 stationary phase were predicted. A five-variable regression equation with a correlation coefficient of 0.9964 and a root mean square error of 0.0152 was developed. The descriptors included in the equation represent degree of chlorination (nCl), nonortho index (Ino), and interactions between three pairs of atom types, i.e., atom groups -C= and -C=, -C= and >C=, -C= and -Cl. It has been shown that the retention times of all 209 PCB congeners can be accurately predicted as long as there are more than 50 calibration compounds. In the same way, the MEDV descriptors are also used to develop five- or six-variable models of the RRTs of PCBs on 18 other stationary phases; the correlation coefficients in both the modeling stage and the LOO cross-validation step are not lower than 0.99, except for two models.
Vibration mode shape recognition using image processing
NASA Astrophysics Data System (ADS)
Wang, Weizhuo; Mottershead, John E.; Mares, Cristinel
2009-10-01
Currently the most widely used method for comparing mode shapes from finite elements and experimental measurements is the modal assurance criterion (MAC), which can be interpreted as the cosine of the angle between the numerical and measured eigenvectors. However, the eigenvectors only contain the displacement of discrete coordinates, so that the MAC index carries no explicit information on shape features. New techniques, based upon the well-developed philosophies of image processing (IP) and pattern recognition (PR) are considered in this paper. The Zernike moment descriptor (ZMD), Fourier descriptor (FD), and wavelet descriptor (WD) are the most popular shape descriptors due to their outstanding properties in IP and PR. These include (1) for the ZMD: rotational invariance, expression and computing efficiency, ease of reconstruction and robustness to noise; (2) for the FD: separation of the global shape and shape details by low and high frequency components, respectively, and invariance under geometric transformation; (3) for the WD: multi-scale representation and local feature detection. Once a shape descriptor has been adopted, the comparison of mode shapes is transformed to a comparison of multidimensional shape feature vectors. Deterministic and statistical methods are presented. The deterministic problem of measuring the degree of similarity between two mode shapes (possibly one from a vibration test and the other from a finite element model) may be carried out using Pearson's correlation. Similar shape feature vectors may be arranged in clusters separated by Euclidian distances in the feature space. In the statistical analysis we are typically concerned with the classification of a test mode shape according to clusters of shape feature vectors obtained from a randomised finite element model. The dimension of the statistical problem may often be reduced by principal component analysis. 
Then, in addition to the Euclidian distance, the Mahalanobis distance, defining the separation of the test point from the cluster in terms of its standard deviation, becomes an important measure. Bayesian decision theory may be applied to formally minimise the risk of misclassification of the test shape feature vector. In this paper the ZMD is applied to the problem of mode shape recognition for a circular plate. Results show that the ZMD has considerable advantages over the traditional MAC index when identifying the cyclically symmetric mode shapes that occur in axisymmetric structures at identical frequencies. Mode shape recognition of rectangular plates is carried out by the FD. Also, the WD is applied to the problem of recognising the mode shapes in the thin and thick regions of a plate with different thicknesses. It shows the benefit of using the WD to identify mode-shapes having both local and global components. The comparison and classification of mode shapes using IP and PR provides a 'toolkit' to complement the conventional MAC approach. The selection of a particular shape descriptor and classification method will depend upon the problem in hand and the experience of the analyst.
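The similarity measures named in this abstract can be sketched in a few lines. The Python below is an illustrative sketch only, not the authors' code. `mac` uses the usual definition of the modal assurance criterion for real vectors, the squared normalized inner product (that is, the squared cosine of the angle between the two vectors). `pearson` is the standard correlation coefficient between two feature vectors. `mahalanobis_diagonal` assumes uncorrelated features (a diagonal covariance, as would hold for principal-component scores), which is a simplification of the general Mahalanobis distance. All function names are hypothetical.

```python
import math


def mac(phi_a, phi_e):
    """Modal assurance criterion between two real mode-shape vectors."""
    dot_ae = sum(a * e for a, e in zip(phi_a, phi_e))
    dot_aa = sum(a * a for a in phi_a)
    dot_ee = sum(e * e for e in phi_e)
    return dot_ae ** 2 / (dot_aa * dot_ee)


def pearson(x, y):
    """Pearson correlation between two equally long feature vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mahalanobis_diagonal(point, mean, variances):
    """Mahalanobis distance of a point from a cluster.

    Assumption: the features are uncorrelated, so the covariance matrix
    is diagonal and only per-feature variances are needed.
    """
    return math.sqrt(
        sum((p - m) ** 2 / v for p, m, v in zip(point, mean, variances))
    )


if __name__ == "__main__":
    print(mac([1.0, 2.0, 3.0], [-2.0, -4.0, -6.0]))  # sign and scale do not matter
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(mahalanobis_diagonal([3.0, 4.0], [0.0, 0.0], [1.0, 1.0]))
```

With unit variances the last function reduces to the Euclidean distance, which makes the relationship between the two distance measures in the abstract explicit.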
Structure-Activity Relationships for Rates of Aromatic Amine Oxidation by Manganese Dioxide
Salter-Blanc, Alexandra J.; Bylaska, Eric J.; Lyon, Molly A.; ...
2016-04-13
New energetic compounds are designed to minimize their potential environmental impacts, which includes their transformation and the fate and effects of their transformation products. The nitro groups of energetic compounds are readily reduced to amines, and the resulting aromatic amines are subject to oxidation and coupling reactions. Manganese dioxide (MnO2) is a common environmental oxidant and model system for kinetic studies of aromatic amine oxidation. In this study, a training set of new and previously reported kinetic data for the oxidation of model and energetic-derived aromatic amines was assembled and subjected to correlation analysis against descriptor variables that ranged from general purpose [Hammett σ constants (σ⁻), pKa values of the amines, and energies of the highest occupied molecular orbital (EHOMO)] to specific for the likely rate-limiting step [one-electron oxidation potentials (Eox)]. The selection of calculated descriptors (pKa, EHOMO, and Eox) was based on validation with experimental data. All of the correlations gave satisfactory quantitative structure-activity relationships (QSARs), but they improved with the specificity of the descriptor. The scope of correlation analysis was extended beyond MnO2 to include literature data on aromatic amine oxidation by other environmentally relevant oxidants (ozone, chlorine dioxide, and phosphate and carbonate radicals) by correlating relative rate constants (normalized to 4-chloroaniline) to EHOMO (calculated with a modest level of theory).
Effective structural descriptors for natural and engineered radioactive waste confinement barriers
NASA Astrophysics Data System (ADS)
Lemmens, Laurent; Rogiers, Bart; De Craen, Mieke; Laloy, Eric; Jacques, Diederik; Huysmans, Marijke; Swennen, Rudy; Urai, Janos L.; Desbois, Guillaume
2017-04-01
The microstructure of a radioactive waste confinement barrier strongly influences its flow and transport properties. Numerical flow and transport simulations for these porous media at the pore scale therefore require input data that describe the microstructure as accurately as possible. To date, no imaging method can resolve all heterogeneities within important radioactive waste confinement barrier materials, such as hardened cement paste and natural clays, at the micro scale (nm-cm). Therefore, it is necessary to merge information from different 2D and 3D imaging methods using porous media reconstruction techniques. To qualitatively compare the results of different reconstruction techniques, visual inspection might suffice. To quantitatively compare training-image based algorithms, Tan et al. (2014) proposed an algorithm using an analysis of distance. However, the ranking of the algorithms depends on the choice of the structural descriptor, in their case multiple-point or cluster-based histograms. We present here preliminary work in which we will review different structural descriptors and test their effectiveness for capturing the main structural characteristics of radioactive waste confinement barrier materials, to determine the descriptors to use in the analysis of distance. The investigated descriptors are particle size distributions, surface area distributions, two point probability functions, multiple point histograms, linear functions and two point cluster functions. The descriptor testing consists of stochastically generating realizations from a reference image using the simulated annealing optimization procedure introduced by Karsanina et al. (2015). This procedure basically minimizes the differences between pre-specified descriptor values associated with the training image and the image being produced. The most efficient descriptor set can therefore be identified by comparing the image generation quality among the tested descriptor combinations. 
The assessment of the quality of the simulations will be made by combining all considered descriptors. Once the set of the most efficient descriptors is determined, they can be used in the analysis of distance, to rank different reconstruction algorithms in a more objective way in future work. Karsanina MV, Gerke KM, Skvortsova EB, Mallants D (2015) Universal Spatial Correlation Functions for Describing and Reconstructing Soil Microstructure. PLoS ONE 10(5): e0126515. doi:10.1371/journal.pone.0126515 Tan, Xiaojin, Pejman Tahmasebi, and Jef Caers. "Comparing training-image based algorithms using an analysis of distance." Mathematical Geosciences 46.2 (2014): 149-169.
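One of the descriptors listed in this abstract, the two-point probability function, and the kind of mismatch that a simulated-annealing reconstruction minimizes can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of Karsanina et al. (2015): the image is a binary list of lists, lags are taken along rows only, boundaries are non-periodic, and the mismatch is a plain sum of squared differences. The function names are hypothetical.

```python
def two_point_probability(image, max_lag):
    """Estimate S2(r) along rows for a binary image (lists of 0/1).

    S2(r) is the probability that two pixels separated by lag r both
    belong to phase 1. max_lag must be smaller than the row width.
    """
    s2 = []
    for lag in range(max_lag + 1):
        hits = 0
        pairs = 0
        for row in image:
            for j in range(len(row) - lag):
                pairs += 1
                hits += row[j] * row[j + lag]
        s2.append(hits / pairs)
    return s2


def descriptor_mismatch(reference, candidate):
    """Sum of squared differences between two descriptor curves.

    Assumption: this simple objective stands in for the quantity that an
    annealing-based reconstruction would try to drive towards zero.
    """
    return sum((r - c) ** 2 for r, c in zip(reference, candidate))


if __name__ == "__main__":
    reference_image = [[1, 0, 1, 0], [0, 1, 0, 1]]
    candidate_image = [[1, 1, 0, 0], [0, 0, 1, 1]]
    s2_ref = two_point_probability(reference_image, 2)
    s2_cand = two_point_probability(candidate_image, 2)
    print(s2_ref, s2_cand, descriptor_mismatch(s2_ref, s2_cand))
```

At lag 0 the function returns the phase-1 volume fraction, so two images with the same fraction can still differ at larger lags; that difference is what makes the curve useful as a structural descriptor.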
NASA Astrophysics Data System (ADS)
Li, Yao-Wang; Li, Bo; He, Jiguo; Qian, Ping
2011-07-01
A database consisting of 214 tripeptides which contain either a His or a Tyr residue was used to study quantitative structure-activity relationships (QSAR) of antioxidative tripeptides. Partial Least-Squares Regression analysis (PLSR) was conducted using the parameters of each amino acid descriptor set individually, including Divided Physico-chemical Property Scores (DPPS), Hydrophobic, Electronic, Steric, and Hydrogen (HESH), Vectors of Hydrophobic, Steric, and Electronic properties (VHSE), Molecular Surface-Weighted Holistic Invariant Molecular (MS-WHIM), isotropic surface area-electronic charge index (ISA-ECI) and Z-scale, to describe antioxidative tripeptides as X-variables; antioxidant activities measured with ferric thiocyanate methods were used as the Y-variable. After elimination of outliers by Hotelling's T² method and residual analysis, six significant models were obtained describing the entire data set. According to cumulative squared multiple correlation coefficients (R²), cumulative cross-validation coefficients (Q²) and relative standard deviation for the calibration set (RSDc), the qualities of models using DPPS, HESH, ISA-ECI, and VHSE descriptors are better (R² > 0.6, Q² > 0.5, RSDc < 0.39) than those of models using MS-WHIM and Z-scale descriptors (R² < 0.6, Q² < 0.5, RSDc > 0.44). Furthermore, the predictive ability of models using the DPPS descriptor is the best among the six descriptor systems (cumulative multiple correlation coefficient for the prediction set (R²ext) > 0.7). It was concluded that DPPS is better suited to describing the amino acids of antioxidative tripeptides. The results for the DPPS descriptor reveal that the center amino acid and the N-terminal amino acid are far more important than the C-terminal amino acid for antioxidative tripeptides. 
The hydrophobic (positively related to activity) and electronic (negatively related to activity) properties of the N-terminal amino acid are suggested to be the most important for activity, followed by the hydrogen bond property (positively related to activity) of the center amino acid. The N-terminal amino acid should be a highly hydrophobic and low electronic amino acid (such as Ala, Gly, Val, and Leu); the center amino acid should be an amino acid that possesses a high hydrogen bond property (such as the basic amino acids Arg, Lys, and His). The structural characteristics of antioxidative peptides found in this paper may contribute to further research on the antioxidative mechanism.
Static sign language recognition using 1D descriptors and neural networks
NASA Astrophysics Data System (ADS)
Solís, José F.; Toxqui, Carina; Padilla, Alfonso; Santiago, César
2012-10-01
A framework for static sign language recognition using descriptors which represent 2D images as 1D data, together with artificial neural networks, is presented in this work. The 1D descriptors were computed by two methods: the first consists of a correlation rotational operator, and the second is based on contour analysis of the hand shape. One of the main problems in sign language recognition is segmentation; most papers report the use of a special color in gloves or background for hand shape analysis. In order to avoid the use of gloves or special clothing, a thermal imaging camera was used to capture images. Static signs were taken from the digits 1 to 9 of American Sign Language; a multilayer perceptron reached 100% recognition with cross-validation.
Li, Yuqin; You, Guirong; Jia, Baoxiu; Si, Hongzong; Yao, Xiaojun
2014-01-01
Quantitative structure-activity relationships (QSAR) were developed to predict the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase via the heuristic method (HM) and gene expression programming (GEP). The descriptors of 33 pyrrolidine derivatives were calculated by the software CODESSA, which can calculate quantum chemical, topological, geometrical, constitutional, and electrostatic descriptors. HM was also used for the preselection of 5 appropriate molecular descriptors. Linear and nonlinear QSAR models were developed based on the HM and GEP separately, and the two prediction models gave good correlation coefficients (R²) of 0.93 and 0.94. The two QSAR models are useful in predicting the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase during the discovery of new anticancer drugs and in providing theoretical information for studying the new drugs.
Geng, Hua; Todd, Naomi M; Devlin-Mullin, Aine; Poologasundarampillai, Gowsihan; Kim, Taek Bo; Madi, Kamel; Cartmell, Sarah; Mitchell, Christopher A; Jones, Julian R; Lee, Peter D
2016-06-01
A correlative imaging methodology was developed to accurately quantify bone formation in the complex lattice structure of additive manufactured implants. Micro computed tomography (μCT) and histomorphometry were combined, integrating the best features from both, while demonstrating the limitations of each imaging modality. This semi-automatic methodology registered each modality using a coarse graining technique to speed the registration of 2D histology sections to high resolution 3D μCT datasets. Once registered, histomorphometric qualitative and quantitative bone descriptors were directly correlated to 3D quantitative bone descriptors, such as bone ingrowth and bone contact. The correlative imaging allowed the significant volumetric shrinkage of histology sections to be quantified for the first time (~15 %). This technique demonstrated the importance of location of the histological section, demonstrating that up to a 30 % offset can be introduced. The results were used to quantitatively demonstrate the effectiveness of 3D printed titanium lattice implants.
Inductive electronegativity scale. Iterative calculation of inductive partial charges.
Cherkasov, Artem
2003-01-01
A number of novel QSAR descriptors have been introduced on the basis of the previously elaborated models for steric and inductive effects. The developed "inductive" parameters include absolute and effective electronegativity, atomic partial charges, and local and global chemical hardness and softness. Being based on traditional inductive and steric substituent constants these 3D descriptors provide a valuable insight into intramolecular steric and electronic interactions and can find broad application in structure-activity studies. Possible interpretation of physical meaning of the inductive descriptors has been suggested by considering a neutral molecule as an electrical capacitor formed by charged atomic spheres. This approximation relates inductive chemical softness and hardness of bound atom(s) with the total area of the facings of electrical capacitor formed by the atom(s) and the rest of the molecule. The derived full electronegativity equalization scheme allows iterative calculation of inductive partial charges on the basis of atomic electronegativities, covalent radii, and intramolecular distances. A range of inductive descriptors has been computed for a variety of organic compounds. The calculated inductive charges in the studied molecules have been validated by experimental C-1s Electron Core Binding Energies and molecular dipole moments. Several semiempirical chemical rules, such as equalized electronegativity's arithmetic mean, principle of maximum hardness, and principle of hardness borrowing could be explicitly illustrated in the framework of the developed approach.
NASA Astrophysics Data System (ADS)
Basant, Nikita; Gupta, Shikha
2018-03-01
The reactions of molecular ozone (O3), hydroxyl (•OH) and nitrate (NO3) radicals are among the major pathways of removal of volatile organic compounds (VOCs) in the atmospheric environment. The gas-phase kinetic rate constants (kO3, kOH, kNO3) are thus important in assessing the ultimate fate and exposure risk of atmospheric VOCs. Experimental data for rate constants are not available for many emerging VOCs, and the computational methods reported so far address single-target modeling only. In this study, we have developed a multi-target (mt) QSPR model for simultaneous prediction of multiple kinetic rate constants (kO3, kOH, kNO3) of diverse organic chemicals, considering an experimental data set of VOCs for which values of all three rate constants are available. The mt-QSPR model identified and used five descriptors related to the molecular size, degree of saturation and electron density in a molecule, which were mechanistically interpretable. These descriptors successfully predicted the three rate constants simultaneously. The model yielded high correlations (R² = 0.874-0.924) between the experimental and simultaneously predicted endpoint rate constant (kO3, kOH, kNO3) values in test arrays for all three systems. The model also passed all the stringent statistical validation tests for external predictivity. The proposed multi-target QSPR model can be successfully used for predicting the reactivity of new VOCs simultaneously for their exposure risk assessment.
Do location specific forecasts pose a new challenge for communicating uncertainty?
NASA Astrophysics Data System (ADS)
Abraham, Shyamali; Bartlett, Rachel; Standage, Matthew; Black, Alison; Charlton-Perez, Andrew; McCloy, Rachel
2015-04-01
In the last decade, the growth of local, site-specific weather forecasts delivered by mobile phone or website represents arguably the fastest change in forecast consumption since the beginning of television weather forecasts 60 years ago. In this study, using a street-interception survey of 274 members of the public, a clear first preference for narrow weather forecasts above traditional broad weather forecasts is shown for the first time, with a clear bias towards this preference for users under 40. The impact of this change on the understanding of forecast probability and intensity information is explored. While the rate of correct interpretation of the statement 'There is a 30% chance of rain tomorrow' is still low in the cohort, in common with previous studies, a clear impact of age and educational attainment on understanding is shown, with those under 40 and educated to degree level or above more likely to interpret it correctly. The interpretation of rainfall intensity descriptors ('Light', 'Moderate', 'Heavy') by the cohort is shown to be significantly different from official and expert assessment of the same descriptors and to have large variance amongst the cohort. However, despite these key uncertainties, members of the cohort generally seem to make appropriate decisions about rainfall forecasts. There is some evidence that the decisions made are different depending on the communication format used, and the cohort expressed a clear preference for tabular over graphical weather forecast presentation.
Descriptors of Oxygen-Evolution Activity for Oxides: A Statistical Evaluation
Hong, Wesley T.; Welsch, Roy E.; Shao-Horn, Yang
2015-12-16
Catalysts for oxygen electrochemical processes are critical for the commercial viability of renewable energy storage and conversion devices such as fuel cells, artificial photosynthesis, and metal-air batteries. Transition metal oxides are an excellent system for developing scalable, non-noble-metal-based catalysts, especially for the oxygen evolution reaction (OER). Central to the rational design of novel catalysts is the development of quantitative structure-activity relationships, which correlate the desired catalytic behavior to structural and/or elemental descriptors of materials. The ultimate goal is to use these relationships to guide materials design. In this study, 101 intrinsic OER activities of 51 perovskites were compiled from five studies in literature and additional measurements made for this work. We explored the behavior and performance of 14 descriptors of the metal-oxygen bond strength using a number of statistical approaches, including factor analysis and linear regression models. We found that these descriptors can be classified into five descriptor families and identify electron occupancy and metal-oxygen covalency as the dominant influences on the OER activity. However, multiple descriptors still need to be considered in order to develop strong predictive relationships, largely outperforming the use of only one or two descriptors (as conventionally done in the field). Here, we confirmed that the number of d electrons, charge-transfer energy (covalency), and optimality of eg occupancy play the important roles, but found that structural factors such as M-O-M bond angle and tolerance factor are relevant as well. With these tools, we demonstrate how statistical learning can be used to draw novel physical insights and combined with data mining to rapidly screen OER electrocatalysts across a wide chemical space.
Structural protein descriptors in 1-dimension and their sequence-based predictions.
Kurgan, Lukasz; Disfani, Fatemeh Miri
2011-09-01
The last few decades have seen an increasing interest in the development and application of 1-dimensional (1D) descriptors of protein structure. These descriptors project 3D structural features onto 1D strings of residue-wise structural assignments. They cover a wide range of structural aspects including conformation of the backbone, burying depth/solvent exposure and flexibility of residues, and inter-chain residue-residue contacts. We perform a first-of-its-kind comprehensive comparative review of the existing 1D structural descriptors. We define, review and categorize ten structural descriptors and we also describe, summarize and contrast over eighty computational models that are used to predict these descriptors from the protein sequences. We show that the majority of the recent sequence-based predictors utilize machine learning models, with the most popular being neural networks, support vector machines, hidden Markov models, and support vector and linear regressions. These methods provide high-throughput predictions and most of them are accessible to a non-expert user via web servers and/or stand-alone software packages. We empirically evaluate several recent sequence-based predictors of secondary structure, disorder, and solvent accessibility descriptors using a benchmark set based on CASP8 targets. Our analysis shows that the secondary structure can be predicted with over 80% accuracy and segment overlap (SOV), disorder with over 0.9 AUC, 0.6 Matthews Correlation Coefficient (MCC), and 75% SOV, and relative solvent accessibility with PCC of 0.7 and MCC of 0.6 (0.86 when homology is used). We demonstrate that the secondary structure predicted from sequence without the use of homology modeling is as good as the structure extracted from the 3D folds predicted by top-performing template-based methods.
Improving Large-Scale Image Retrieval Through Robust Aggregation of Local Descriptors.
Husain, Syed Sameed; Bober, Miroslaw
2017-09-01
Visual search and image retrieval underpin numerous applications, however the task is still challenging predominantly due to the variability of object appearance and ever increasing size of the databases, often exceeding billions of images. Prior art methods rely on aggregation of local scale-invariant descriptors, such as SIFT, via mechanisms including Bag of Visual Words (BoW), Vector of Locally Aggregated Descriptors (VLAD) and Fisher Vectors (FV). However, their performance is still short of what is required. This paper presents a novel method for deriving a compact and distinctive representation of image content called Robust Visual Descriptor with Whitening (RVD-W). It significantly advances the state of the art and delivers world-class performance. In our approach local descriptors are rank-assigned to multiple clusters. Residual vectors are then computed in each cluster, normalized using a direction-preserving normalization function and aggregated based on the neighborhood rank. Importantly, the residual vectors are de-correlated and whitened in each cluster before aggregation, leading to a balanced energy distribution in each dimension and significantly improved performance. We also propose a new post-PCA normalization approach which improves separability between the matching and non-matching global descriptors. This new normalization benefits not only our RVD-W descriptor but also improves existing approaches based on FV and VLAD aggregation. Furthermore, we show that the aggregation framework developed using hand-crafted SIFT features also performs exceptionally well with Convolutional Neural Network (CNN) based features. The RVD-W pipeline outperforms state-of-the-art global descriptors on both the Holidays and Oxford datasets. 
On the large scale datasets, Holidays1M and Oxford1M, SIFT-based RVD-W representation obtains a mAP of 45.1 and 35.1 percent, while CNN-based RVD-W achieve a mAP of 63.5 and 44.8 percent, all yielding superior performance to the state-of-the-art.
Sajadi, Seyyed Mohammad Ali; Majidi, Alireza; Abdollahimajd, Fahimeh; jalali, Fatemeh
2017-01-01
Introduction: History taking and physical examination help clinicians identify the patient’s problem and effectively treat it. This study aimed to evaluate the descriptors of dyspnea in patients presenting to emergency department (ED) with asthma, congestive heart failure (CHF), and chronic obstructive pulmonary disease (COPD). Method: This cross-sectional study was conducted on all patients presenting to ED with chief complaint of dyspnea, during 2 years. The patients were asked to describe their dyspnea by choosing three items from the valid and reliable questionnaire or articulating their sensation. The relationship between dyspnea descriptors and underlying cause of symptom was evaluated using SPSS version 16. Results: 312 patients with the mean age of 60.96±17.01 years were evaluated (53.2% male). Most of the patients were > 65 years old (48.7%) and had basic level of education (76.9%). "My breath doesn’t go out all the way" with 83.1%, “My chest feels tight " with 45.8%, and "I feel that my airway is obstructed" with 40.7%, were the most frequent dyspnea descriptors in asthma patients. "My breathing requires work" with 46.3%, "I feel that I am suffocating" with 31.5%, and "My breath doesn’t go out all the way" with 29.6%, were the most frequent dyspnea descriptors in COPD patients. "My breathing is heavy" with 74.4%, "A hunger for more air” with 24.4%, and "I cannot get enough air" with 23.2%, were the most frequent dyspnea descriptors in CHF patients. Except for “My breath does not go in all the way”, there was significant correlation between studied dyspnea descriptors and underlying disease (p = 0.001 for all analyses). Conclusion: It seems that dyspnea descriptors along with other findings from history and physical examination could be helpful in differentiating the causes of the symptom in patients presenting to ED suffering from dyspnea. PMID:28894777
Sajadi, Seyyed Mohammad Ali; Majidi, Alireza; Abdollahimajd, Fahimeh; Jalali, Fatemeh
2017-01-01
History taking and physical examination help clinicians identify the patient's problem and effectively treat it. This study aimed to evaluate the descriptors of dyspnea in patients presenting to emergency department (ED) with asthma, congestive heart failure (CHF), and chronic obstructive pulmonary disease (COPD). This cross-sectional study was conducted on all patients presenting to ED with chief complaint of dyspnea, during 2 years. The patients were asked to describe their dyspnea by choosing three items from the valid and reliable questionnaire or articulating their sensation. The relationship between dyspnea descriptors and underlying cause of symptom was evaluated using SPSS version 16. 312 patients with the mean age of 60.96±17.01 years were evaluated (53.2% male). Most of the patients were > 65 years old (48.7%) and had basic level of education (76.9%). "My breath doesn't go out all the way" with 83.1%, "My chest feels tight " with 45.8%, and "I feel that my airway is obstructed" with 40.7%, were the most frequent dyspnea descriptors in asthma patients. "My breathing requires work" with 46.3%, "I feel that I am suffocating" with 31.5%, and "My breath doesn't go out all the way" with 29.6%, were the most frequent dyspnea descriptors in COPD patients. "My breathing is heavy" with 74.4%, "A hunger for more air" with 24.4%, and "I cannot get enough air" with 23.2%, were the most frequent dyspnea descriptors in CHF patients. Except for "My breath does not go in all the way", there was significant correlation between studied dyspnea descriptors and underlying disease (p = 0.001 for all analyses). It seems that dyspnea descriptors along with other findings from history and physical examination could be helpful in differentiating the causes of the symptom in patients presenting to ED suffering from dyspnea.
Alonso-Carné, J; García-Martín, A; Estrada-Peña, A
2015-01-01
Ticks are sensitive to changes in relative humidity and saturation deficit at the microclimate scale. Trends and changes in rainfall are commonly used as descriptors of field observations of tick populations, to capture the climate niche of ticks or to predict the climate suitability for ticks under future climate scenarios. We evaluated daily and monthly relationships between rainfall, relative humidity and saturation deficit over different ecosystems in Europe using daily climate values from 177 stations over a period of 10 years. We demonstrate that rainfall is poorly correlated with both relative humidity and saturation deficit in any of the ecological domains studied. We conclude that the amount of rainfall recorded in 1 day does not correlate with the values of humidity or saturation deficit recorded 24 h later: rainfall is not an adequate surrogate for evaluating the physiological processes of ticks at regional scales. We compared the Normalized Difference Vegetation Index (NDVI), a descriptor of photosynthetic activity, at a spatial resolution of 0.05°, with monthly averages of relative humidity and saturation deficit and also determined a lack of significant correlation. With the limitations of spatial scale and habitat coverage of this study, we suggest that the rainfall or NDVI cannot replace relative humidity or saturation deficit as descriptors of tick processes.
Alzate-Morales, Jans H; Caballero, Julio; Vergara Jague, Ariela; González Nilo, Fernando D
2009-04-01
N2 and O6 substituted guanine derivatives are well known as potent and selective CDK2 inhibitors. The ability of molecular docking using the program AutoDock3, combined with the hybrid method ONIOM, to provide quantum chemical descriptors that successfully rank these inhibitors was assessed. The quantum chemical descriptors were used to explain the affinity of the series studied for a model of the CDK2 binding site. The initial structures were obtained from docking studies and the ONIOM method was applied with only a single point energy calculation on the protein-ligand structure. We obtained a good correlation model between the ONIOM derived quantum chemical descriptor "H-bond interaction energy" and the experimental biological activity, with a correlation coefficient value of R = 0.80 for 75 compounds. To the best of our knowledge, this is the first time that both methodologies are used in conjunction in order to obtain a correlation model. The model suggests that electrostatic interactions are the principal driving force in this protein-ligand interaction. Overall, the approach was successful for the cases considered, and it suggests that it could be useful for the design of inhibitors in the lead optimization phase of drug discovery.
Segmentation, modeling and classification of the compact objects in a pile
NASA Technical Reports Server (NTRS)
Gupta, Alok; Funka-Lea, Gareth; Wohn, Kwangyoen
1990-01-01
The problem of interpreting dense range images obtained from the scene of a heap of man-made objects is discussed. A range image interpretation system consisting of segmentation, modeling, verification, and classification procedures is described. First, the range image is segmented into regions and reasoning is done about the physical support of these regions. Second, for each region several possible three-dimensional interpretations are made based on various scenarios of the objects physical support. Finally each interpretation is tested against the data for its consistency. The superquadric model is selected as the three-dimensional shape descriptor, plus tapering deformations along the major axis. Experimental results obtained from some complex range images of mail pieces are reported to demonstrate the soundness and the robustness of our approach.
Cavity Versus Ligand Shape Descriptors: Application to Urokinase Binding Pockets
Cerisier, Natacha; Regad, Leslie; Triki, Dhoha; Camproux, Anne-Claude
2017-01-01
We analyzed 78 binding pockets of the human urokinase plasminogen activator (uPA) catalytic domain extracted from a data set of crystallized uPA–ligand complexes. These binding pockets were computed with an original geometric method that does not involve any arbitrary parameter, such as cutoff distances, angles, and so on. We measured the deviation from convexity of each pocket shape with the pocket convexity index (PCI). We defined a new pocket descriptor called distributional sphericity coefficient (DISC), which indicates to which extent the protein atoms of a given pocket lie on the surface of a sphere. The DISC values were computed with the freeware PCI. The pocket descriptors and their high correspondences with ligand descriptors are crucial for polypharmacology prediction. We found that the protein heavy atoms lining the urokinases binding pockets are either located on the surface of their convex hull or lie close to this surface. We also found that the radii of the urokinases binding pockets and the radii of their ligands are highly correlated (r = 0.9). PMID:28570103
Mining chemical reactions using neighborhood behavior and condensed graphs of reactions approaches.
de Luca, Aurélie; Horvath, Dragos; Marcou, Gilles; Solov'ev, Vitaly; Varnek, Alexandre
2012-09-24
This work addresses the problem of similarity search and classification of chemical reactions using Neighborhood Behavior (NB) and Condensed Graphs of Reaction (CGR) approaches. The CGR formalism represents chemical reactions as a classical molecular graph with dynamic bonds, enabling descriptor calculations on this graph. Different types of the ISIDA fragment descriptors generated for CGRs in combination with two metrics--Tanimoto and Euclidean--were considered as chemical spaces, to serve for reaction dissimilarity scoring. The NB method has been used to select an optimal combination of descriptors which distinguish different types of chemical reactions in a database containing 8544 reactions of 9 classes. Relevance of NB analysis has been validated in generic (multiclass) similarity search and in clustering with Self-Organizing Maps (SOM). NB-compliant sets of descriptors were shown to display enhanced mapping propensities, allowing the construction of better Self-Organizing Maps and similarity searches (NB and classical similarity search criteria--AUC ROC--correlate at a level of 0.7). The analysis of the SOM clusters proved chemically meaningful CGR substructures representing specific reaction signatures.
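As an illustration of the two metrics named in the abstract above, the following is a minimal sketch of Tanimoto and Euclidean dissimilarity scoring on fragment-count vectors. It assumes the continuous (count-vector) form of the Tanimoto coefficient; the abstract does not state which variant was used, and the function names and the toy vectors `rxn_a` and `rxn_b` are hypothetical, not taken from the ISIDA software.

```python
import math


def tanimoto_similarity(a, b):
    """Continuous Tanimoto coefficient: a.b / (|a|^2 + |b|^2 - a.b)."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    # Two all-zero vectors are treated as identical (assumption).
    return 1.0 if denom == 0 else dot / denom


def tanimoto_dissimilarity(a, b):
    """Dissimilarity score derived from the Tanimoto coefficient."""
    return 1.0 - tanimoto_similarity(a, b)


def euclidean_distance(a, b):
    """Plain Euclidean distance between two descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical fragment counts for two reactions (illustrative values only).
rxn_a = [2, 0, 1, 1]
rxn_b = [2, 1, 0, 1]

print(tanimoto_dissimilarity(rxn_a, rxn_b))
print(euclidean_distance(rxn_a, rxn_b))
```

Either function can serve as the distance in a nearest-neighbor search; which metric suits a given descriptor type is exactly what an analysis such as NB is meant to decide.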
Jhin, Changho; Hwang, Keum Taek
2014-01-01
Radical scavenging activity of anthocyanins is well known, but only a few studies have been conducted by quantum chemical approach. The adaptive neuro-fuzzy inference system (ANFIS) is an effective technique for solving problems with uncertainty. The purpose of this study was to construct and evaluate quantitative structure-activity relationship (QSAR) models for predicting radical scavenging activities of anthocyanins with good prediction efficiency. ANFIS-applied QSAR models were developed by using quantum chemical descriptors of anthocyanins calculated by semi-empirical PM6 and PM7 methods. Electron affinity (A) and electronegativity (χ) of flavylium cation, and ionization potential (I) of quinoidal base were significantly correlated with radical scavenging activities of anthocyanins. These descriptors were used as independent variables for QSAR models. ANFIS models with two triangular-shaped input fuzzy functions for each independent variable were constructed and optimized by 100 learning epochs. The constructed models using descriptors calculated by both PM6 and PM7 had good prediction efficiency with Q-square of 0.82 and 0.86, respectively. PMID:25153627
Determining the Scoring Validity of a Co-Constructed CEFR-Based Rating Scale
ERIC Educational Resources Information Center
Deygers, Bart; Van Gorp, Koen
2015-01-01
Considering scoring validity as encompassing both reliable rating scale use and valid descriptor interpretation, this study reports on the validation of a CEFR-based scale that was co-constructed and used by novice raters. The research questions this paper wishes to answer are (a) whether it is possible to construct a CEFR-based rating scale with…
Agius, Rudi; Torchala, Mieczyslaw; Moal, Iain H.; Fernández-Recio, Juan; Bates, Paul A.
2013-01-01
Predicting the effects of mutations on the kinetic rate constants of protein-protein interactions is central to both the modeling of complex diseases and the design of effective peptide drug inhibitors. However, while most studies have concentrated on the determination of association rate constants, dissociation rates have received less attention. In this work we take a novel approach by relating the changes in dissociation rates upon mutation to the energetics and architecture of hotspots and hotregions, by performing alanine scans pre- and post-mutation. From these scans, we design a set of descriptors that capture the change in hotspot energy and distribution. The method is benchmarked on 713 kinetically characterized mutations from the SKEMPI database. Our investigations show that, with the use of hotspot descriptors, energies from single-point alanine mutations may be used for the estimation of off-rate mutations to any residue type and also multi-point mutations. A number of machine learning models are built from a combination of molecular and hotspot descriptors, with the best models achieving a Pearson's Correlation Coefficient of 0.79 with experimental off-rates and a Matthew's Correlation Coefficient of 0.6 in the detection of rare stabilizing mutations. Using specialized feature selection models we identify descriptors that are highly specific and, conversely, broadly important to predicting the effects of different classes of mutations, interface regions and complexes. Our results also indicate that the distribution of the critical stability regions across protein-protein interfaces is a function of complex size more strongly than interface area. In addition, mutations at the rim are critical for the stability of small complexes, but consistently harder to characterize. 
The relationship between hotregion size and the dissociation rate is also investigated and, using hotspot descriptors which model cooperative effects within hotregions, we show how the contribution of hotregions of different sizes, changes under different cooperative effects. PMID:24039569
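The Matthews correlation coefficient quoted above for the detection of rare stabilizing mutations can be computed directly from a confusion matrix. The sketch below uses the standard formula; the function name and the convention of returning 0.0 when a marginal count is zero are assumptions made for illustration, and no counts from the study are reproduced.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: an undefined MCC (zero marginal) is reported as 0.0.
    return 0.0 if denominator == 0 else numerator / denominator


# Hypothetical, imbalanced confusion matrix (illustrative values only).
print(matthews_corrcoef(tp=6, tn=80, fp=4, fn=10))
```

Unlike plain accuracy, the MCC stays near zero for a classifier that simply predicts the majority class, which is why it is a common choice when the positive class is rare.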
Kittelmann, Jörg; Lang, Katharina M H; Ottens, Marcel; Hubbuch, Jürgen
2017-01-27
Quantitative structure-activity relationship (QSAR) modeling for prediction of biomolecule parameters has become an established technique in chromatographic purification process design. Unfortunately, available descriptor sets fail to describe the orientation of biomolecules and the effects of ionic strength in the mobile phase on the interaction with the stationary phase. The literature describes several special descriptors used for chromatographic retention modeling, but none of these describes the screening of electrostatic potential by the mobile phase in use. In this work we introduce two new approaches to descriptor calculation, namely surface patches and plane projection, which capture an oriented binding to charged surfaces and steric hindrance of the interaction with chromatographic ligands with regard to electrostatic potential screening by mobile phase ions. We present the use of the developed descriptor sets for predictive modeling of Langmuir isotherms for proteins at different pH values between pH 5 and 10 and varying ionic strength in the range of 10-100 mM. The resulting model has a high correlation of calculated descriptors and experimental results, with a coefficient of determination of 0.82 and a predictive coefficient of determination of 0.92 for unknown molecular structures and conditions. The agreement of calculated molecular interaction orientations with both experimental results and molecular dynamics simulations from the literature is shown. The developed descriptors provide the means for improved QSAR models of chromatographic processes, as they reflect the complex interactions of biomolecules with chromatographic phases. Copyright © 2016 Elsevier B.V. All rights reserved.
McMullin, Brian T; Leung, Ming-Ying; Shanbhag, Arun S; McNulty, Donald; Mabrey, Jay D; Agrawal, C Mauli
2006-02-01
A total of 750 images of individual ultra-high molecular weight polyethylene (UHMWPE) particles isolated from periprosthetic failed hip, knee, and shoulder arthroplasties were extracted from archival scanning electron micrographs. Particle size and morphology was subsequently analyzed using computerized image analysis software utilizing five descriptors found in ASTM F1877-98, a standard for quantitative description of wear debris. An online survey application was developed to display particle images, and allowed ten respondents to classify particle morphologies according to commonly used terminology as fibers, flakes, or granules. Particles were categorized based on a simple majority of responses. All descriptors were evaluated using a one-way ANOVA and Tukey-Kramer test for all-pairs comparison among each class of particles. A logistic regression model using half of the particles included in the survey was then used to develop a mathematical scheme to predict whether a given particle should be classified as a fiber, flake, or granule based on its quantitative measurements. The validity of the model was then assessed using the other half of the survey particles and compared with human responses. Comparison of the quantitative measurements of isolated particles showed that the morphologies of each particle type classified by respondents were statistically different from one another (p<0.05). The average agreement between mathematical prediction and human respondents was 83.5% (standard error 0.16%). These data suggest that computerized descriptors can be feasibly correlated with subjective terminology, thus providing a basis for a common vocabulary for particle description which can be translated into quantitative dimensions.
McMullin, Brian T.; Leung, Ming-Ying; Shanbhag, Arun S.; McNulty, Donald; Mabrey, Jay D.; Agrawal, C. Mauli
2014-01-01
A total of 750 images of individual ultra-high molecular weight polyethylene (UHMWPE) particles isolated from periprosthetic failed hip, knee, and shoulder arthroplasties were extracted from archival scanning electron micrographs. Particle size and morphology was subsequently analyzed using computerized image analysis software utilizing five descriptors found in ASTM F1877-98, a standard for quantitative description of wear debris. An online survey application was developed to display particle images, and allowed ten respondents to classify particle morphologies according to commonly used terminology as fibers, flakes, or granules. Particles were categorized based on a simple majority of responses. All descriptors were evaluated using a one-way ANOVA and Tukey–Kramer test for all-pairs comparison among each class of particles. A logistic regression model using half of the particles included in the survey was then used to develop a mathematical scheme to predict whether a given particle should be classified as a fiber, flake, or granule based on its quantitative measurements. The validity of the model was then assessed using the other half of the survey particles and compared with human responses. Comparison of the quantitative measurements of isolated particles showed that the morphologies of each particle type classified by respondents were statistically different from one another (p<0.05). The average agreement between mathematical prediction and human respondents was 83.5% (standard error 0.16%). These data suggest that computerized descriptors can be feasibly correlated with subjective terminology, thus providing a basis for a common vocabulary for particle description which can be translated into quantitative dimensions. PMID:16112725
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sheng, WC; Zhuang, ZB; Gao, MR
2015-01-08
The hydrogen oxidation/evolution reactions are two of the most fundamental reactions in distributed renewable electrochemical energy conversion and storage systems. The identification of the reaction descriptor is therefore of critical importance for the rational catalyst design and development. Here we report the correlation between hydrogen oxidation/evolution activity and experimentally measured hydrogen binding energy for polycrystalline platinum examined in several buffer solutions in a wide range of electrolyte pH from 0 to 13. The hydrogen oxidation/evolution activity obtained using the rotating disk electrode method is found to decrease with the pH, while the hydrogen binding energy, obtained from cyclic voltammograms, linearly increases with the pH. Correlating the hydrogen oxidation/evolution activity to the hydrogen binding energy renders a monotonic decreasing hydrogen oxidation/evolution activity with the hydrogen binding energy, strongly supporting the hypothesis that hydrogen binding energy is the sole reaction descriptor for the hydrogen oxidation/evolution activity on monometallic platinum.
He, Junyi; Peng, Tao; Yang, Xianhai; Liu, Huihui
2018-02-01
Endocrine disrupting effect has become a central point of concern, and various biological mechanisms are involved in the disruption of the endocrine system. Recently, we have explored the mechanism of disrupting hormonal transport protein, through the binding affinity of sex hormone-binding globulin in different fish species. This study, serving as a companion article, focused on the mechanism of activating/inhibiting hormone receptor, by investigating the binding interaction of chemicals with the estrogen receptor (ER) of different fish species. We collected the relative binding affinity (RBA) of chemicals with 17β-estradiol binding to the ER of eight fish species. With this parameter as the endpoint, quantitative structure-activity relationship (QSAR) models were established using DRAGON descriptors. Statistical results indicated that the developed models had satisfactory goodness of fit, robustness and predictive ability. The Euclidean distance and Williams plot verified that these models had wide application domains, which covered a large number of structurally diverse chemicals. Based on the screened descriptors, we proposed an appropriate mechanism interpretation for the binding potency. Additionally, even though the same chemical had different affinities for ER from different fish species, the affinity of ER exhibited a high correlation for fish species within the same Order (i.e., Salmoniformes, Cypriniformes, Perciformes), which is consistent with our previous study. Hence, when assessing endocrine disrupting effects, species diversity should be taken into account, but fish species in the same Order may possibly be grouped together. Copyright © 2017 Elsevier Inc. All rights reserved.
Cheminformatics-aided pharmacovigilance: application to Stevens-Johnson Syndrome
Low, Yen S; Caster, Ola; Bergvall, Tomas; Fourches, Denis; Zang, Xiaoling; Norén, G Niklas; Rusyn, Ivan; Edwards, Ralph
2016-01-01
Objective: Quantitative Structure-Activity Relationship (QSAR) models can predict adverse drug reactions (ADRs), and thus provide early warnings of potential hazards. Timely identification of potential safety concerns could protect patients and aid early diagnosis of ADRs among the exposed. Our objective was to determine whether global spontaneous reporting patterns might allow chemical substructures associated with Stevens-Johnson Syndrome (SJS) to be identified and utilized for ADR prediction by QSAR models. Materials and Methods: Using a reference set of 364 drugs having positive or negative reporting correlations with SJS in the VigiBase global repository of individual case safety reports (Uppsala Monitoring Center, Uppsala, Sweden), chemical descriptors were computed from drug molecular structures. Random Forest and Support Vector Machines methods were used to develop QSAR models, which were validated by external 5-fold cross validation. Models were employed for virtual screening of DrugBank to predict SJS actives and inactives, which were corroborated using knowledge bases like VigiBase, ChemoText, and MicroMedex (Truven Health Analytics Inc, Ann Arbor, Michigan). Results: We developed QSAR models that could accurately predict if drugs were associated with SJS (area under the curve of 75%–81%). Our 10 most active and inactive predictions were substantiated by SJS reports (or lack thereof) in the literature. Discussion: Interpretation of QSAR models in terms of significant chemical descriptors suggested novel SJS structural alerts. Conclusions: We have demonstrated that QSAR models can accurately identify SJS active and inactive drugs. Requiring chemical structures only, QSAR models provide effective computational means to flag potentially harmful drugs for subsequent targeted surveillance and pharmacoepidemiologic investigations. PMID:26499102
Correlations between chromatographic parameters and bioactivity predictors of potential herbicides.
Janicka, Małgorzata
2014-08-01
Different liquid chromatography techniques, including reversed-phase liquid chromatography on Purosphere RP-18e, IAM.PC.DD2 and Cosmosil Cholester columns and micellar liquid chromatography with a Purosphere RP-8e column and using buffered sodium dodecyl sulfate-acetonitrile as the mobile phase, were applied to study the lipophilic properties of 15 newly synthesized phenoxyacetic and carbamic acid derivatives, which are potential herbicides. Chromatographic lipophilicity descriptors were used to extrapolate log k parameters (log kw and log km) and log k values. Partitioning lipophilicity descriptors, i.e., log P coefficients in an n-octanol-water system, were computed from the molecular structures of the tested compounds. Bioactivity descriptors, including partition coefficients in a water-plant cuticle system and water-human serum albumin and coefficients for human skin partition and permeation were calculated in silico by ACD/ADME software using the linear solvation energy relationship of Abraham. Principal component analysis was applied to describe similarities between various chromatographic and partitioning lipophilicities. Highly significant, predictive linear relationships were found between chromatographic parameters and bioactivity descriptors. © The Author [2013]. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Lu, Jiwen; Erin Liong, Venice; Zhou, Jie
2017-08-09
In this paper, we propose a simultaneous local binary feature learning and encoding (SLBFLE) approach for both homogeneous and heterogeneous face recognition. Unlike existing hand-crafted face descriptors such as local binary pattern (LBP) and Gabor features which usually require strong prior knowledge, our SLBFLE is an unsupervised feature learning approach which automatically learns face representation from raw pixels. Unlike existing binary face descriptors such as the LBP, discriminant face descriptor (DFD), and compact binary face descriptor (CBFD) which use a two-stage feature extraction procedure, our SLBFLE jointly learns binary codes and the codebook for local face patches so that discriminative information from raw pixels from face images of different identities can be obtained by using a one-stage feature learning and encoding procedure. Moreover, we propose a coupled simultaneous local binary feature learning and encoding (C-SLBFLE) method to make the proposed approach suitable for heterogeneous face matching. Unlike most existing coupled feature learning methods which learn a pair of transformation matrices for each modality, we exploit both the common and specific information from heterogeneous face samples to characterize their underlying correlations. Experimental results on six widely used face datasets are presented to demonstrate the effectiveness of the proposed method.
The proposal of architecture for chemical splitting to optimize QSAR models for aquatic toxicity.
Colombo, Andrea; Benfenati, Emilio; Karelson, Mati; Maran, Uko
2008-06-01
One of the challenges in the field of quantitative structure-activity relationship (QSAR) analysis is the correct assignment of a chemical compound to an appropriate model for the prediction of activity. Thus, in previous studies, compounds have been divided into distinct groups according to their mode of action or chemical class. In the current study, theoretical molecular descriptors were used to divide 568 organic substances into subsets, with toxicity measured as the 96-h median lethal concentration for the fathead minnow (Pimephales promelas). Simple constitutional descriptors, such as the number of aliphatic and aromatic rings, and a quantum chemical descriptor, the maximum bond order of a carbon atom, divided the compounds into nine subsets. For each subset of compounds, automatic forward selection of descriptors was applied to construct QSAR models. Significant correlations were achieved for each subset of chemicals, and all models were validated with the leave-one-out internal validation procedure (R(2)(cv) approximately 0.80). The results encourage consideration of this alternative approach, in which toxicity is predicted using QSAR subset models without direct reference to the mechanism of toxic action or the traditional chemical classification.
Dong, Jie; Yao, Zhi-Jiang; Zhang, Lin; Luo, Feijun; Lin, Qinlu; Lu, Ai-Ping; Chen, Alex F; Cao, Dong-Sheng
2018-03-20
With the increasing development of biotechnology and informatics technology, publicly available data in chemistry and biology are undergoing explosive growth. The wealth of information in these data needs to be extracted and transformed into useful knowledge by various data mining methods. Considering the rate at which data are accumulated in the chemistry and biology fields, new tools that process and interpret large and complex interaction data are increasingly important. So far, there are no suitable toolkits that can effectively link the chemical and biological space in terms of molecular representation. To further explore these complex data, an integrated toolkit for various molecular representations is urgently needed, one that can be easily integrated with data mining algorithms to start a full data analysis pipeline. Herein, the Python library PyBioMed is presented, which comprises functionalities for the online download of various molecular objects by providing different IDs, the pretreatment of molecular structures, and the computation of various molecular descriptors for chemicals, proteins, DNAs and their interactions. PyBioMed is a feature-rich and highly customized Python library used for the characterization of various complex chemical and biological molecules and interaction samples. The current version of PyBioMed can calculate 775 chemical descriptors and 19 kinds of chemical fingerprints, 9920 protein descriptors based on protein sequences, more than 6000 DNA descriptors from nucleotide sequences, and interaction descriptors from pairwise samples using three different combining strategies. Several examples and five real-life applications are provided to guide users in how to use PyBioMed as an integral part of data analysis projects. By using PyBioMed, users are able to run a full pipeline, from getting molecular data, pretreating molecules and computing molecular representations to constructing machine learning models.
PyBioMed provides various user-friendly and highly customized APIs to calculate various features of biological molecules and complex interaction samples conveniently, which aims at building integrated analysis pipelines from data acquisition, data checking, and descriptor calculation to modeling. PyBioMed is freely available at http://projects.scbdd.com/pybiomed.html .
New descriptors of homogeneity of the propagation of ventricular repolarization.
Batchvarov, V; Dilaveris, P; Färbom, P; Ghuran, A; Acar, B; Hnatkova, K; Camm, A J; Malik, M
2000-11-01
Available descriptors of irregularities of ventricular repolarization are of limited clinical value. We studied the effect of autonomic variations on several new descriptors of the three-dimensional T loop. Twelve-lead digital ECGs were recorded continuously in 40 healthy subjects at baseline in the supine position, during postural changes (supine → sitting → standing → supine → standing), and during the Valsalva maneuver performed three times in the supine and three times in the standing positions. A minimum dimensional space was constructed from the 12-lead ECG, using singular value decomposition, on the basis of median ECG beats constructed from 10-second consecutive ECG recordings. Temporal variations (TLA and PL, which measure the T loop area, and LD, the interlead relationship during repolarization) and wavefront direction descriptors (TCRT, the deviation between the QRS and T vectors) were calculated and expressed as normalized values. Values of TLA, PL, and TCRT were significantly lower in the sitting than in the supine position (-38,139 +/- 9099 vs 47,133 +/- 7511, -0.017 +/- 0.005 vs 0.033 +/- 0.005 and -0.032 +/- 0.019 vs 0.071 +/- 0.015, respectively, P < 0.001 for all) and decreased further in the standing position (-88,288 +/- 14,468, -0.067 +/- 0.013, -0.198 +/- 0.025, respectively, P < 0.001 for all). LD increased from supine to sitting (98.7 +/- 29.4 vs -87.5 +/- 15.2, P < 0.001) and increased further, though nonsignificantly, in the standing position (118.3 +/- 35.2). TLA, PL, and TCRT decreased from baseline during Valsalva in the supine (-34,118 +/- 11,424 vs 62,234 +/- 12,215, -0.038 +/- 0.014 vs 0.065 +/- 0.010, -0.08 +/- 0.03 vs 0.10 +/- 0.02, respectively, P < 0.001 for all) and standing positions (-108,263 +/- 21,051 vs -68,909 +/- 10,271, -0.109 +/- 0.014 vs -0.048 +/- 0.009, -0.30 +/- 0.035 vs -0.15 +/- 0.016, respectively, P < 0.05 for all).
LD was significantly increased by Valsalva in the supine position (13 +/- 46 vs -153 +/- 30, P < 0.001) and nonsignificantly in the standing position (99 +/- 50 vs 86 +/- 30, P = NS). There were significant correlations among TLA, PL, and LD, and no significant correlation between TCRT and any of the temporal variation descriptors. These new temporal and wavefront direction descriptors are sensitive and rapid detectors of autonomic effects on ventricular repolarization.
Jardínez, Christiaan; Vela, Alberto; Cruz-Borbolla, Julián; Alvarez-Mendez, Rodrigo J; Alvarado-Rodríguez, José G
2016-12-01
The relationship between the chemical structure and biological activity (log IC50) of 40 derivatives of 1,4-dihydropyridines (DHPs) was studied using density functional theory (DFT) and multiple linear regression analysis methods. With the aim of improving the quantitative structure-activity relationship (QSAR) model, the reduced density gradient s(r) of the optimized equilibrium geometries was used as a descriptor to include weak non-covalent interactions. The QSAR model highlights the correlation of log IC50 with the highest occupied molecular orbital energy (EHOMO), molecular volume (V), partition coefficient (log P), non-covalent interactions NCI(H4-G) and the dual descriptor [Δf(r)]. The model yielded values of R2 = 79.57 and Q2 = 69.67, which were validated with the following four internal analytical validations, DK = 0.076, DQ = -0.006, RP = 0.056, and RN = 0.000, and the external validation Q2boot = 64.26. The QSAR model found can be used to estimate biological activity with high reliability in new compounds based on a DHP series. Graphical abstract: The good correlation of log IC50 with the NCI(H4-G) estimated by the reduced density gradient approach for the DHP derivatives.
Model for the partition of neutral compounds between n-heptane and formamide.
Karunasekara, Thushara; Poole, Colin F
2010-04-01
Partition coefficients for 84 varied compounds were determined for the n-heptane-formamide biphasic partition system and used to derive a model for the distribution of neutral compounds between the n-heptane-rich and formamide-rich layers. The partition coefficients, log K(p), were correlated through the solvation parameter model, giving log K(p) = 0.083 + 0.559E - 2.244S - 3.250A - 1.614B + 2.387V with a multiple correlation coefficient of 0.996, standard error of the estimate 0.139, and Fisher statistic 1791. In the model, the solute descriptors are excess molar refraction, E, dipolarity/polarizability, S, overall hydrogen-bond acidity, A, overall hydrogen-bond basicity, B, and McGowan's characteristic volume, V. The model is expected to be able to estimate further values of the partition coefficient to about 0.13 log units for the same descriptor space covered by the calibration compounds (E = -0.26 to 2.29, S = 0 to 1.93, A = 0 to 1.25, B = 0.02 to 1.58, and V = 0.78 to 2.50). The n-heptane-formamide partition system is shown to have different selectivity from other totally organic biphasic systems and to be suitable for estimating descriptor values for compounds of low water solubility and/or stability.
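As a minimal sketch of how the reported equation could be applied, the Python snippet below evaluates log K(p) from the five solute descriptors and checks whether a solute lies inside the calibration descriptor space. The coefficients and ranges are taken from the abstract; the function names and the example descriptor values are illustrative assumptions, not part of the original work.

```python
# Coefficients quoted in the abstract:
# log K(p) = 0.083 + 0.559E - 2.244S - 3.250A - 1.614B + 2.387V
COEFFS = {"c": 0.083, "E": 0.559, "S": -2.244, "A": -3.250, "B": -1.614, "V": 2.387}

# Descriptor space covered by the calibration compounds (from the abstract).
CALIBRATION_RANGE = {
    "E": (-0.26, 2.29),
    "S": (0.0, 1.93),
    "A": (0.0, 1.25),
    "B": (0.02, 1.58),
    "V": (0.78, 2.50),
}


def log_kp(E, S, A, B, V):
    """Evaluate the n-heptane-formamide model for one neutral solute."""
    descriptors = {"E": E, "S": S, "A": A, "B": B, "V": V}
    return COEFFS["c"] + sum(COEFFS[name] * value for name, value in descriptors.items())


def in_calibration_space(E, S, A, B, V):
    """Return True only if every descriptor lies inside the calibration range."""
    descriptors = {"E": E, "S": S, "A": A, "B": B, "V": V}
    return all(lo <= descriptors[name] <= hi for name, (lo, hi) in CALIBRATION_RANGE.items())


# Hypothetical solute; these descriptor values are made up for illustration.
example = dict(E=0.5, S=0.5, A=0.0, B=0.2, V=1.0)
estimate = log_kp(**example)
trusted = in_calibration_space(**example)
```

Checking the descriptor range before using an estimate reflects the abstract's caveat that the stated accuracy (about 0.13 log units) is expected only within the calibrated descriptor space.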
DataWarrior: an open-source program for chemistry aware data visualization and analysis.
Sander, Thomas; Freyss, Joel; von Korff, Modest; Rufener, Christian
2015-02-23
Drug discovery projects in the pharmaceutical industry accumulate thousands of chemical structures and tens of thousands of data points from a dozen or more biological and pharmacological assays. A sufficient interpretation of the data requires understanding which molecular families are present, which structural motifs correlate with measured properties, and which tiny structural changes cause large property changes. Data visualization and analysis software with sufficient chemical intelligence to support chemists in this task is rare. In an attempt to help fill this gap, we released our in-house developed chemistry-aware data analysis program DataWarrior for free public use. This paper gives an overview of DataWarrior's functionality and architecture. As an example, a new unsupervised, 2-dimensional scaling algorithm is presented, which employs vector-based or nonvector-based descriptors to visualize the chemical or pharmacophore space of even large data sets. DataWarrior uses this method to interactively explore chemical space, activity landscapes, and activity cliffs.
Votano, Joseph R; Parham, Marc; Hall, L Mark; Hall, Lowell H; Kier, Lemont B; Oloff, Scott; Tropsha, Alexander
2006-11-30
Four modeling techniques, using topological descriptors to represent molecular structure, were employed to produce models of human serum protein binding (% bound) on a data set of 1008 experimental values, carefully screened from publicly available sources. To our knowledge, this data is the largest set on human serum protein binding reported for QSAR modeling. The data was partitioned into a training set of 808 compounds and an external validation test set of 200 compounds. Partitioning was accomplished by clustering the compounds in a structure descriptor space so that random sampling of 20% of the whole data set produced an external test set that is a good representative of the training set with respect to both structure and protein binding values. The four modeling techniques include multiple linear regression (MLR), artificial neural networks (ANN), k-nearest neighbors (kNN), and support vector machines (SVM). With the exception of the MLR model, the ANN, kNN, and SVM QSARs were ensemble models. Training set correlation coefficients and mean absolute error ranged from r2=0.90 and MAE=7.6 for ANN to r2=0.61 and MAE=16.2 for MLR. Prediction results from the validation set yielded correlation coefficients and mean absolute errors which ranged from r2=0.70 and MAE=14.1 for ANN to a low of r2=0.59 and MAE=18.3 for the SVM model. Structure descriptors that contribute significantly to the models are discussed and compared with those found in other published models. For the ANN model, structure descriptor trends with respect to their effects on predicted protein binding can assist the chemist in structure modification during the drug design process.
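The two statistics quoted above can be computed as in the following sketch. It assumes that r2 denotes the squared Pearson correlation between observed and predicted % bound values, which the abstract does not state explicitly; the function names and the three data points are hypothetical.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def squared_correlation(observed, predicted):
    """Squared Pearson correlation coefficient (assumed meaning of r2)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Hypothetical % bound values for three compounds (not from the paper).
observed = [10.0, 50.0, 90.0]
predicted = [20.0, 50.0, 80.0]

mae = mean_absolute_error(observed, predicted)
r2 = squared_correlation(observed, predicted)
```

In this toy case the predictions are perfectly correlated with the observations yet still carry a nonzero MAE, which illustrates why the abstract reports both numbers side by side.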
Khushaba, Rami N; Al-Timemy, Ali H; Al-Ani, Ahmed; Al-Jumaily, Adel
2017-10-01
The extraction of accurate and efficient descriptors of muscular activity plays an important role in tackling the challenging problem of myoelectric control of powered prostheses. In this paper, we present a new feature extraction framework that aims to give an enhanced representation of muscular activities by increasing the amount of information that can be extracted from individual and combined electromyogram (EMG) channels. We propose to use time-domain descriptors (TDDs) in estimating the EMG signal power spectrum characteristics, a step that preserves the computational power required for the construction of spectral features. Subsequently, TDD is used in a process that involves: 1) representing the temporal evolution of the EMG signals by progressively tracking the correlation between the TDD extracted from each analysis time window and a nonlinearly mapped version of it across the same EMG channel and 2) representing the spatial coherence between the different EMG channels, which is achieved by calculating the correlation between the TDD extracted from the differences of all possible combinations of pairs of channels and their nonlinearly mapped versions. The proposed temporal-spatial descriptors (TSDs) are validated on multiple sparse and high-density (HD) EMG data sets collected from a number of intact-limbed subjects and amputees performing a large number of hand and finger movements. Classification results showed significant reductions in the achieved error rates in comparison to other methods, with an improvement of at least 8% on average across all subjects. Additionally, the proposed TSDs performed well in problems with HD-EMG, with average classification errors of <5% across all subjects using window lengths of only 50 ms.
Method of data communications with reduced latency
Blocksome, Michael A; Parker, Jeffrey J
2013-11-05
Data communications with reduced latency, including: writing, by a producer, a descriptor and message data into at least two descriptor slots of a descriptor buffer, the descriptor buffer comprising allocated computer memory segmented into descriptor slots, each descriptor slot having a fixed size, the descriptor buffer having a header pointer that identifies a next descriptor slot to be processed by a DMA controller, the descriptor buffer having a tail pointer that identifies a descriptor slot for entry of a next descriptor in the descriptor buffer; recording, by the producer, in the descriptor a value signifying that message data has been written into descriptor slots; and setting, by the producer, in dependence upon the recorded value, a tail pointer to point to a next open descriptor slot.
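The slot-based buffer described above can be pictured with the following Python sketch. It is an illustration under stated assumptions rather than the patented implementation: the class name `DescriptorBuffer`, the `SLOT_SIZE` value, the `data_slots` field, the `consume` method standing in for the DMA controller, and the rule of keeping one slot free are all hypothetical choices.

```python
SLOT_SIZE = 16  # bytes per descriptor slot (assumed value)


class DescriptorBuffer:
    """Circular buffer of fixed-size slots with head and tail pointers."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots
        self.head = 0  # next slot to be processed by the (simulated) DMA controller
        self.tail = 0  # slot where the producer enters the next descriptor

    def _free_slots(self):
        # One slot is kept empty so that head == tail always means "empty".
        return (self.head - self.tail - 1) % len(self.slots)

    def produce(self, message):
        """Write a descriptor plus inline message data, then advance the tail."""
        if not message:
            raise ValueError("message must be non-empty")
        chunks = [message[i:i + SLOT_SIZE] for i in range(0, len(message), SLOT_SIZE)]
        if 1 + len(chunks) > self._free_slots():
            raise BufferError("not enough free descriptor slots")
        n = len(self.slots)
        # The descriptor records that message data follows, and in how many slots.
        self.slots[self.tail] = {"data_slots": len(chunks), "length": len(message)}
        for offset, chunk in enumerate(chunks, start=1):
            self.slots[(self.tail + offset) % n] = chunk
        # The tail moves in dependence upon the value recorded in the descriptor.
        self.tail = (self.tail + 1 + self.slots[self.tail]["data_slots"]) % n

    def consume(self):
        """Simulate the DMA controller: read one descriptor and its inline data."""
        if self.head == self.tail:
            return None
        n = len(self.slots)
        descriptor = self.slots[self.head]
        count = descriptor["data_slots"]
        data = b"".join(self.slots[(self.head + k) % n] for k in range(1, count + 1))
        self.head = (self.head + 1 + count) % n
        return data[:descriptor["length"]]
```

For example, with `buf = DescriptorBuffer(8)`, calling `buf.produce(b"a" * 20)` occupies three slots (one descriptor and two data slots) and moves the tail from 0 to 3. Because the message data sits next to its descriptor, the consumer does not need a separate lookup to find it, which is one plausible reading of how the scheme reduces latency.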
Ionic and Covalent Stabilization of Intermediates and Transition States in Catalysis by Solid Acids
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deshlahra, Prashant; Carr, Robert T.; Iglesia, Enrique
Reactivity descriptors describe catalyst properties that determine the stability of kinetically relevant transition states and adsorbed intermediates. Theoretical descriptors, such as deprotonation energies (DPE), rigorously account for Brønsted acid strength for catalytic solids with known structure. Here, mechanistic interpretations of methanol dehydration turnover rates are used to assess how charge reorganization (covalency) and electrostatic interactions determine DPE and how such interactions are recovered when intermediates and transition states interact with the conjugate anion in W and Mo polyoxometalate (POM) clusters and gaseous mineral acids. Turnover rates are lower and kinetically relevant species are less stable on Mo than W POM clusters with similar acid strength, and such species are more stable on mineral acids than that predicted from W-POM DPE–reactivity trends, indicating that DPE and acid strength are essential but incomplete reactivity descriptors. Born–Haber thermochemical cycles indicate that these differences reflect more effective charge reorganization upon deprotonation of Mo than W POM clusters and the much weaker reorganization in mineral acids. Such covalency is disrupted upon deprotonation but cannot be recovered fully upon formation of ion pairs at transition states. Predictive descriptors of reactivity for general classes of acids thus require separate assessments of the covalent and ionic DPE components. Here, we describe methods to estimate electrostatic interactions, which, taken together with energies derived from density functional theory, give the covalent and ionic energy components of protons, intermediates, and transition states. In doing so, we provide a framework to predict the reactive properties of protons for chemical reactions mediated by ion-pair transition states.
Hathout, Rania M; Metwally, Abdelkader A
2016-11-01
This study represents one of the series applying computer-oriented processes and tools in digging for information, analysing data and finally extracting correlations and meaningful outcomes. In this context, binding energies could be used to model and predict the mass of loaded drugs in solid lipid nanoparticles after molecular docking of literature-gathered drugs using MOE® software package on molecularly simulated tripalmitin matrices using GROMACS®. Consequently, Gaussian processes as a supervised machine learning artificial intelligence technique were used to correlate the drugs' descriptors (e.g. M.W., xLogP, TPSA and fragment complexity) with their molecular docking binding energies. Lower percentage bias was obtained compared to previous studies which allows the accurate estimation of the loaded mass of any drug in the investigated solid lipid nanoparticles by just projecting its chemical structure to its main features (descriptors). Copyright © 2016 Elsevier B.V. All rights reserved.
Pérez-Garrido, Alfonso; Morales Helguera, Aliuska; Abellán Guillén, Adela; Cordeiro, M Natália D S; Garrido Escudero, Amalio
2009-01-15
This paper reports a QSAR study for predicting the complexation of a large and heterogeneous variety of substances (233 organic compounds) with beta-cyclodextrins (beta-CDs). Several different theoretical molecular descriptors, calculated solely from the molecular structure of the compounds under investigation, and an efficient variable selection procedure, the Genetic Algorithm, led to models with satisfactory global accuracy and predictivity. The best final QSAR model is based on topological descriptors and also offers a reasonable interpretation. This QSAR model was able to explain ca. 84% of the variance in the experimental activity, and displayed very good internal cross-validation statistics and predictivity on external data. It shows that the driving forces for CD complexation are mainly hydrophobic and steric (van der Waals) interactions. Thus, the results of our study provide a valuable tool for future screening and priority testing of beta-CD guest molecules.
Freitas, Alex A; Limbu, Kriti; Ghafourian, Taravat
2015-01-01
Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). Hence, the pros and cons of several different types of decision tree-based regression methods have been discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p) - often used in physiologically-based pharmacokinetics. Therefore, this work has assessed whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Comparison of the models that used only molecular descriptors, in particular, the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. Decision tree based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and flexible non-linear data mining methods such as decision trees, which can produce interpretable models. Graphical Abstract: Decision trees for the prediction of tissue partition coefficient and volume of distribution of drugs.
Tsatsishvili, Valeri; Burunat, Iballa; Cong, Fengyu; Toiviainen, Petri; Alluri, Vinoo; Ristaniemi, Tapani
2018-06-01
There has been growing interest in naturalistic neuroimaging experiments, which deepen our understanding of how the human brain processes and integrates incoming streams of multifaceted sensory information, as commonly occurs in the real world. Music is a good example of such a complex continuous phenomenon. In a few recent fMRI studies examining neural correlates of music in continuous listening settings, multiple perceptual attributes of the music stimulus were represented by a set of high-level features, produced as linear combinations of the acoustic descriptors computationally extracted from the stimulus audio. NEW METHOD: fMRI data from a naturalistic music listening experiment were employed here. Kernel principal component analysis (KPCA) was applied to acoustic descriptors extracted from the stimulus audio to generate a set of nonlinear stimulus features. Subsequently, perceptual and neural correlates of the generated high-level features were examined. The generated features captured musical percepts that were hidden from the linear PCA features, namely Rhythmic Complexity and Event Synchronicity. Neural correlates of the new features revealed activations associated with the processing of complex rhythms, including auditory, motor, and frontal areas. Results were compared with the findings of the previously published study, which analyzed the same fMRI data but applied linear PCA for generating stimulus features. To enable comparison of the results, the methodology for finding stimulus-driven functional maps was adopted from the previous study. Exploiting nonlinear relationships among acoustic descriptors can lead to novel high-level stimulus features, which can in turn reveal new brain structures involved in music processing. Copyright © 2018 Elsevier B.V. All rights reserved.
Issues and solutions for storage, retrieval, and searching of MPEG-7 documents
NASA Astrophysics Data System (ADS)
Chang, Yuan-Chi; Lo, Ming-Ling; Smith, John R.
2000-10-01
The ongoing MPEG-7 standardization activity aims at creating a standard for describing multimedia content in order to facilitate the interpretation of the associated information content. Attempting to address a broad range of applications, MPEG-7 has defined a flexible framework consisting of Descriptors, Description Schemes, and Description Definition Language. Descriptors and Description Schemes describe features, structure and semantics of multimedia objects. They are written in the Description Definition Language (DDL). In the most recent revision, DDL applies XML (Extensible Markup Language) Schema with MPEG-7 extensions. DDL has constructs that support inclusion, inheritance, reference, enumeration, choice, sequence, and abstract type of Description Schemes and Descriptors. In order to enable multimedia systems to use MPEG-7, a number of important problems in storing, retrieving and searching MPEG-7 documents need to be solved. This paper reports initial findings on issues and solutions for storing and accessing MPEG-7 documents. In particular, we discuss the benefits of using a virtual document management framework based on XML Access Server (XAS) in order to bridge the MPEG-7 multimedia applications and database systems. The need arises partly because MPEG-7 descriptions need customized storage schema, indexing and search engines. We also discuss issues arising in managing dependence and cross-description scheme search.
Scaling left ventricular mass in adolescent boys aged 11-15 years.
Valente-Dos-Santos, João; Coelho-E-Silva, Manuel J; Ferraz, António; Castanheira, Joaquim; Ronque, Enio R; Sherar, Lauren B; Elferink-Gemser, Marije T; Malina, Robert M
2014-01-01
Normalizing left ventricular mass (LVM) for inter-individual variation in body size is a central issue in human biology. During the adolescent growth spurt, variability in body size descriptors needs to be interpreted in combination with biological maturation. This study examined the contribution of biological maturation, stature, sitting height, body mass, fat-free mass (FFM) and fat mass (FM) to inter-individual variability in LVM in boys, using proportional allometric modelling. The cross-sectional sample included 110 boys of 11-15 years (12.9 ± 1.0 years). Stature, sitting height, body mass, cardiac chamber dimensions and LVM were measured. Age at peak height velocity (APHV) was predicted and used as an indicator of biological maturation. Percentage fat was estimated from triceps and subscapular skinfolds; FM and FFM were derived. Exponents for body size descriptors were k = 2.33 for stature, k = 2.18 for sitting height, k = 0.68 for body mass, k = 0.17 for FM and k = 0.80 for FFM (adjusted R(2) = 19-62%). The combination of body descriptors and APHV increased the explained variance in LVM (adjusted R(2) = 56-69%). Stature, FM and FFM are the best combination for normalizing LVM in adolescent boys; when body composition is not available, an indicator of biological maturity should be included with stature.
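A simple allometric model of the form LVM = a · size^k is commonly fitted as a straight line after taking logarithms, ln(LVM) = ln(a) + k · ln(size). The sketch below shows that single-descriptor fit with ordinary least squares. It is an assumption about the general technique, not the authors' exact procedure, and the data are synthetic values generated with the FFM exponent k = 0.80 reported in the abstract.

```python
import math


def fit_allometric(size, response):
    """Fit response = a * size**k by least squares on log-transformed data."""
    xs = [math.log(s) for s in size]
    ys = [math.log(r) for r in response]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx                      # allometric exponent (slope)
    a = math.exp(mean_y - k * mean_x)  # proportionality constant (intercept)
    return a, k


# Synthetic, noise-free data: fat-free mass in kg and LVM in g (hypothetical).
ffm = [30.0, 35.0, 40.0, 45.0, 50.0, 55.0]
lvm = [2.0 * m ** 0.80 for m in ffm]

a_hat, k_hat = fit_allometric(ffm, lvm)
```

Dividing each LVM value by `size ** k_hat` then gives a size-normalized index. With real data the fit would include residual scatter, and further terms (for example APHV) would turn this into a multiple regression on the log scale.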
Synthesis and DFT calculations of some 2-aminothiazoles
NASA Astrophysics Data System (ADS)
Rezania, Jafar; Behzadi, Hadi; Shockravi, Abbas; Ehsani, Morteza; Akbarzadeh, Elahe
2018-04-01
A series of 2-aminothiazole derivatives has been synthesized by the reaction of acetyl compounds with thiourea, using iodine as a catalyst under solvent-free conditions, a green chemistry method. Quantum chemical calculations at the DFT/B3LYP level of theory in the gas phase were carried out for the starting acetyl derivatives. The highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) energies and related reactivity descriptors of the acetyl derivatives, as well as the enthalpies of reaction, were calculated in order to investigate the reaction properties of the acetyl compounds and the yields of the reactions. The calculated reactivity descriptors correlate well with the activity of the different acetyl derivatives.
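The abstract does not say which reactivity descriptors were derived from the frontier orbitals. As a hedged illustration, the sketch below uses one common conceptual-DFT convention based on Koopmans-type estimates (I ≈ -E_HOMO, A ≈ -E_LUMO), with chemical potential μ = -(I + A)/2, hardness η = (I - A)/2 and electrophilicity index ω = μ²/(2η). The function name and the orbital energies are hypothetical, and other conventions (for example η = I - A) are also in use.

```python
def reactivity_descriptors(e_homo, e_lumo):
    """Global reactivity descriptors from frontier orbital energies (in eV).

    Assumes Koopmans-type estimates and the eta = (I - A) / 2 convention.
    """
    ionization_potential = -e_homo
    electron_affinity = -e_lumo
    chemical_potential = -(ionization_potential + electron_affinity) / 2
    hardness = (ionization_potential - electron_affinity) / 2
    electrophilicity = chemical_potential ** 2 / (2 * hardness)
    return {
        "gap": e_lumo - e_homo,
        "chemical_potential": chemical_potential,
        "hardness": hardness,
        "electrophilicity": electrophilicity,
    }


# Hypothetical orbital energies in eV, not taken from the paper.
result = reactivity_descriptors(e_homo=-6.0, e_lumo=-2.0)
```

Descriptors of this kind could then be tabulated for each starting acetyl derivative and compared with the observed yields, in the spirit of the correlation the abstract reports.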
NASA Astrophysics Data System (ADS)
Tian, Feifei; Zhou, Peng; Li, Zhiliang
2007-03-01
In this paper, a new topological descriptor, T-scale, is derived from principal component analysis (PCA) of 67 collected structural and topological variables of 135 amino acids. When T-scale is applied to three peptide panels, namely 58 angiotensin-converting enzyme (ACE) inhibitors, 20 thromboplastin inhibitors (TI) and 28 bovine lactoferricin-(17-31)-pentadecapeptides (LFB), the resulting QSAR models, constructed by partial least squares (PLS), are all superior to previously reported models, with correlation coefficient r2 and cross-validated q2 of 0.845, 0.786; 0.996, 0.782 (0.988, 0.961); 0.760, 0.627, respectively.
Alves, E O S; Cerqueira-Silva, C B M; Souza, A M; Santos, C A F; Lima Neto, F P; Corrêa, R X
2012-03-14
We investigated seven distance measures in a set of observations of physicochemical variables of mango (Mangifera indica) submitted to multivariate analyses (distance, projection and grouping). To estimate the distance measurements, five mango progeny (total of 25 genotypes) were analyzed, using six fruit physicochemical descriptors (fruit weight, equatorial diameter, longitudinal diameter, total soluble solids in °Brix, total titratable acidity, and pH). The distance measurements were compared by the Spearman correlation test, projection in two-dimensional space and grouping efficiency. The Spearman correlation coefficients between the seven distance measurements were, except for the Mahalanobis' generalized distance (0.41 ≤ rs ≤ 0.63), high and significant (rs ≥ 0.91; P < 0.001). Regardless of the origin of the distance matrix, the unweighted pair group method with arithmetic mean grouping method proved to be the most adequate. The various distance measurements and grouping methods gave different values for distortion (-116.5 ≤ D ≤ 74.5), cophenetic correlation (0.26 ≤ rc ≤ 0.76) and stress (-1.9 ≤ S ≤ 58.9). Choice of distance measurement and analysis methods influence the.
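To make the comparison of distance measures concrete, the sketch below computes two of the many possible measures (Euclidean and Manhattan) for all pairs of genotypes and then their Spearman rank correlation, using average ranks for ties. The genotype coordinates are made-up two-descriptor vectors, and the helper names are illustrative; the study itself compared seven measures on six descriptors.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical genotypes described by two (already standardized) descriptors.
genotypes = [(1.0, 2.0), (2.0, 4.0), (5.0, 1.0), (6.0, 6.0)]
pairs = list(combinations(genotypes, 2))
d_euclid = [euclidean(p, q) for p, q in pairs]
d_manhattan = [manhattan(p, q) for p, q in pairs]
rs = spearman(d_euclid, d_manhattan)
```

A value of `rs` close to 1 means the two measures order the genotype pairs almost identically, which is the sense in which most of the measures in the abstract agreed (rs ≥ 0.91) while the Mahalanobis distance did not.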
Effects of G-Quadruplex Topology on Electronic Transfer Integrals
Sun, Wenming; Varsano, Daniele; Di Felice, Rosa
2016-01-01
G-quadruplex is a quadruple helical form of nucleic acids that can appear in guanine-rich parts of the genome. The basic unit is the G-tetrad, a planar assembly of four guanines connected by eight hydrogen bonds. Its rich topology and its possible relevance as a drug target for a number of diseases have stimulated several structural studies. The superior stiffness and electronic π-π overlap between consecutive G-tetrads suggest exploitation for nanotechnologies. Here we inspect the intimate link between the structure and the electronic properties, with focus on charge transfer parameters. We show that the electronic couplings between stacked G-tetrads strongly depend on the three-dimensional atomic structure. Furthermore, we reveal a remarkable correlation with the topology: a topology characterized by the absence of syn-anti G-G sequences can better support electronic charge transfer. On the other hand, there is no obvious correlation of the electronic coupling with usual descriptors of the helix shape. We establish a procedure to maximize the correlation with a global helix shape descriptor. PMID:28335314
NASA Astrophysics Data System (ADS)
Sánchez-Márquez, Jesús; Zorrilla, David; García, Víctor; Fernández, Manuel
2018-07-01
This work presents a new development based on the condensation scheme proposed by Chamorro and Pérez, in which new terms to correct the frozen molecular orbital approximation have been introduced (improved frontier molecular orbital approximation). The changes made to the original development allow orbital relaxation effects to be taken into account, giving results equivalent to those of the finite difference approximation and leading also to a methodology with great advantages. Local reactivity indices based on this new development have been obtained for a sample set of molecules and compared with the indices based on the frontier molecular orbital and finite difference approximations. A new definition of the dual descriptor index based on the improved frontier molecular orbital methodology is also shown. In addition, taking advantage of the characteristics of the definitions obtained with the new condensation scheme, the local philicity descriptor is analysed by separating the components corresponding to the frontier molecular orbital approximation and to orbital relaxation effects; the local multiphilic descriptor is analysed in the same way. Finally, the effect of the basis set is studied, and calculations using DFT, CI and Møller-Plesset methodologies are performed to analyse the consequences of different electronic-correlation levels.
Kamath, Padmaja; Fernandez, Alberto; Giralt, Francesc; Rallo, Robert
2015-01-01
Nanoparticles are likely to interact in real-case application scenarios with mixtures of proteins and biomolecules that will absorb onto their surface forming the so-called protein corona. Information related to the composition of the protein corona and net cell association was collected from literature for a library of surface-modified gold and silver nanoparticles. For each protein in the corona, sequence information was extracted and used to calculate physicochemical properties and statistical descriptors. Data cleaning and preprocessing techniques including statistical analysis and feature selection methods were applied to remove highly correlated, redundant and non-significant features. A weighting technique was applied to construct specific signatures that represent the corona composition for each nanoparticle. Using this basic set of protein descriptors, a new Protein Corona Structure-Activity Relationship (PCSAR) that relates net cell association with the physicochemical descriptors of the proteins that form the corona was developed and validated. The features that resulted from the feature selection were in line with already published literature, and the computational model constructed on these features had a good accuracy (R(2)LOO=0.76 and R(2)LMO(25%)=0.72) and stability, with the advantage that the fingerprints based on physicochemical descriptors were independent of the specific proteins that form the corona.
Ruan, Xiaofang; Zhang, Ruisheng; Yao, Xiaojun; Liu, Mancang; Fan, Botao
2007-03-01
Alkylphenols are a group of persistent pollutants in the environment and could adversely disturb the human endocrine system. It is therefore important to effectively separate and measure the alkylphenols. To guide the chromatographic analysis of these compounds in practice, a quantitative relationship between the molecular structure and the retention time of alkylphenols is needed. In this study, topological, constitutional, geometrical, electrostatic and quantum-chemical descriptors of 44 alkylphenols were calculated using the CODESSA software, and these descriptors were pre-selected using the heuristic method. As a result, a three-descriptor linear model (LM) was developed to describe the relationship between the molecular structure and the retention time of alkylphenols. A non-linear regression model based on a support vector machine (SVM) was also developed using the same three descriptors. The correlation coefficients (R²) for the LM and SVM were 0.98 and 0.92, and the corresponding root-mean-square errors were 0.99 and 2.77, respectively. Comparing the stability and prediction ability of the two models showed that the linear model was the better method for describing the quantitative relationship between the retention time of alkylphenols and the molecular structure. The results suggest that the linear model could be applied to the chromatographic analysis of alkylphenols with known molecular structural parameters.
Sensory description of marine oils through development of a sensory wheel and vocabulary.
Larssen, W E; Monteleone, E; Hersleth, M
2018-04-01
The Omega-3 industry lacks a defined methodology and a vocabulary for evaluating the sensory quality of marine oils. This study was conducted to identify the sensory descriptors of marine oils and organize them in a sensory wheel for use as a tool in quality assessment. Samples of marine oils were collected from six of the largest producers of omega-3 products in Norway. The oils were selected to cover as much variation in sensory characteristics as possible, i.e. oils with different fatty acid content originating from different species. Oils were evaluated by six industry expert panels and one trained sensory panel to build up a vocabulary through a series of language sessions. A total of 184 aroma (odor by nose), flavor, taste and mouthfeel descriptors were generated. A sensory wheel based on 60 selected descriptors grouped together in 21 defined categories was created to form a graphical presentation of the sensory vocabulary. A selection of the oil samples was also evaluated by a trained sensory panel using descriptive analysis. Chemical analysis showed a positive correlation between primary and secondary oxidation products and sensory properties such as rancidity, chemical flavor and process flavor and a negative correlation between primary oxidation products and acidic. This research is a first step towards the broader objective of standardizing the sensory terminology related to marine oils. Copyright © 2017 Elsevier Ltd. All rights reserved.
AllergenFP: allergenicity prediction by descriptor fingerprints.
Dimitrov, Ivan; Naneva, Lyudmila; Doytchinova, Irini; Bangov, Ivan
2014-03-15
Allergenicity, like antigenicity and immunogenicity, is a property encoded linearly and non-linearly, and therefore alignment-based approaches are not able to identify this property unambiguously. A novel alignment-free descriptor-based fingerprint approach is presented here and applied to identify allergens and non-allergens. The approach was implemented as a four-step algorithm. Initially, the protein sequences are described by amino acid principal properties such as hydrophobicity, size, relative abundance, and helix and β-strand forming propensities. Then, the generated strings of different length are converted into vectors of equal length by auto- and cross-covariance (ACC). The vectors are transformed into binary fingerprints and compared in terms of the Tanimoto coefficient. The approach was applied to a set of 2427 known allergens and 2427 non-allergens and correctly identified 88% of them, with a Matthews correlation coefficient of 0.759. The descriptor fingerprint approach presented here is universal. It could be applied to any classification problem in computational biology. The set of E-descriptors is able to capture the main structural and physicochemical properties of the amino acids building the proteins. The ACC transformation overcomes the main problem in alignment-based comparative studies, which arises from the different lengths of the aligned protein sequences. The conversion of protein ACC values into binary descriptor fingerprints allows similarity search and classification. The algorithm described in the present study was implemented in a specially designed Web site, named AllergenFP (FP stands for FingerPrint). AllergenFP is written in Python, with a GUI in HTML. It is freely accessible at http://ddg-pharmfac.net/AllergenFP. Contact: idoytchinova@pharmfac.net or ivanbangov@shu-bg.net.
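The two numerical steps named in this abstract, the auto-covariance transform and the Tanimoto comparison, can be illustrated with a minimal sketch. The auto-covariance form used here (mean product of a property with itself `lag` residues later) is a commonly used definition and is an assumption, not taken from the abstract; the function names, the toy property values and the example fingerprints are hypothetical, and the binarization step is not shown because the abstract does not describe it.

```python
from typing import List, Sequence, Set


def auto_covariance(values: Sequence[float], lag: int) -> float:
    """Mean product of a per-residue property with itself `lag` positions later.

    Assumed form: sum_i v[i] * v[i + lag] / (n - lag).
    """
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(values)")
    return sum(values[i] * values[i + lag] for i in range(n - lag)) / (n - lag)


def acc_vector(values: Sequence[float], max_lag: int) -> List[float]:
    """One entry per lag, so the length does not depend on the sequence length."""
    return [auto_covariance(values, lag) for lag in range(1, max_lag + 1)]


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two binary fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


# Hypothetical per-residue property values for two sequences of different length.
short_seq = [0.5, -1.0, 0.25, 1.0, -0.5, 0.75]
long_seq = short_seq + [0.1, -0.2, 0.9, -0.6]

# Both sequences map to vectors with the same number of entries (here 3).
print(len(acc_vector(short_seq, 3)), len(acc_vector(long_seq, 3)))

# Two hypothetical fingerprints sharing 2 of 5 distinct on-bits.
print(tanimoto({1, 4, 7, 9}, {1, 4, 8}))
```

Representing fingerprints as sets of on-bit indices keeps the Tanimoto computation to a single intersection and union; a bit-vector representation would work equally well.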
Structural alignment of protein descriptors - a combinatorial model.
Antczak, Maciej; Kasprzak, Marta; Lukasiak, Piotr; Blazewicz, Jacek
2016-09-17
Structural alignment of proteins is one of the most challenging problems in molecular biology. The tertiary structure of a protein strictly correlates with its function and computationally predicted structures are nowadays a main premise for understanding the latter. However, computationally derived 3D models often exhibit deviations from the native structure. A way to confirm a model is a comparison with other structures. The structural alignment of a pair of proteins can be defined with the use of a concept of protein descriptors. The protein descriptors are local substructures of protein molecules, which allow us to divide the original problem into a set of subproblems and, consequently, to propose a more efficient algorithmic solution. In the literature, one can find many applications of the descriptors concept that prove its usefulness for insight into protein 3D structures, but the proposed approaches are presented rather from the biological perspective than from the computational or algorithmic point of view. Efficient algorithms for identification and structural comparison of descriptors can become crucial components of methods for structural quality assessment as well as tertiary structure prediction. In this paper, we propose a new combinatorial model and new polynomial-time algorithms for the structural alignment of descriptors. The model is based on the maximum-size assignment problem, which we define here and prove that it can be solved in polynomial time. We demonstrate suitability of this approach by comparison with an exact backtracking algorithm. Besides a simplification coming from the combinatorial modeling, both on the conceptual and complexity level, we gain with this approach high quality of obtained results, in terms of 3D alignment accuracy and processing efficiency. 
All the proposed algorithms were developed and integrated in a computationally efficient tool descs-standalone, which allows the user to identify and structurally compare descriptors of biological molecules, such as proteins and RNAs. Both PDB (Protein Data Bank) and mmCIF (macromolecular Crystallographic Information File) formats are supported. The proposed tool is available as an open source project stored on GitHub ( https://github.com/mantczak/descs-standalone ).
ERIC Educational Resources Information Center
Madison, Guy; Gouyon, Fabien; Ullen, Fredrik; Hornstrom, Kalle
2011-01-01
"Groove" is often described as the experience of music that makes people tap their feet and want to dance. A high degree of consistency in ratings of groove across listeners indicates that physical properties of the sound signal contribute to groove (Madison, 2006). Here, correlations were assessed between listeners' ratings and a number…
NASA Astrophysics Data System (ADS)
Krein, Michael
After decades of development and use in a variety of application areas, Quantitative Structure Property Relationships (QSPRs) and related descriptor-based statistical learning methods have achieved a level of infamy due to their misuse. The field is rife with past examples of overtrained models, overoptimistic performance assessment, and outright cheating in the form of explicitly removing data to fit models. These actions do not serve the community well, nor are they beneficial to future predictions based on established models. In practice, in order to select combinations of descriptors and machine learning methods that might work best, one must consider the nature and size of the training and test datasets, be aware of existing hypotheses about the data, and resist the temptation to bias structure representation and modeling to explicitly fit the hypotheses. The definition and application of these best practices is important for obtaining actionable modeling outcomes, and for setting user expectations of modeling accuracy when predicting the endpoint values of unknowns. A wide variety of statistical learning approaches, descriptor types, and model validation strategies are explored herein, with the goals of helping end users understand the factors involved in creating and using QSPR models effectively, and to better understand relationships within the data, especially by looking at the problem space from multiple perspectives. Molecular relationships are commonly envisioned in a continuous high-dimensional space of numerical descriptors, referred to as chemistry space. Descriptor and similarity metric choice influence the partitioning of this space into regions corresponding to local structural similarity. These regions, known as domains of applicability, are most likely to be successfully modeled by a QSPR. In Chapter 2, the network topology and scaling relationships of several chemistry spaces are thoroughly investigated. 
Chemistry spaces studied include the ZINC data set, a qHTS PubChem bioassay, as well as the protein binding sites from the PDB. The characteristics of these networks are compared and contrasted with those of the bioassay Structure Activity Landscape Index (SALI) subnetwork, which maps discontinuities or cliffs in the structure activity landscape. Mapping this newly generated information over underlying chemistry space networks generated using different descriptors demonstrates local modeling capacity and can guide the choice of better local representations of chemistry space. Chapter 2 introduces and demonstrates this novel concept, which also enables future work in visualization and interpretation of chemical spaces. Initially, it was discovered that there were no community-available tools to leverage best-practice ideas to comprehensively build, compare, and interpret QSPRs. The Yet Another Modeling System (YAMS) tool performs a series of balanced, rational decisions in dataset preprocessing and parameter/feature selection over a choice of modeling methods. To date, YAMS is the only community-available informatics tool that performs such decisions consistently between methods while also providing multiple model performance comparisons and detailed descriptor importance information. The focus of the tool is thus to convey rich information about model quality and predictions that help to "close the loop" between modeling and experimental efforts, for example, in tailoring nanocomposite properties. Polymer nanocomposites (PNC) are complex material systems encompassing many potential structures, chemistries, and self assembled morphologies that could significantly impact commercial and military applications. There is a strong desire to characterize and understand the tradespace of nanocomposites, to identify the important factors relating nanostructure to materials properties and determine an effective way to control materials properties at the manufacturing scale. 
Due to the complexity of the systems, existing design approaches rely heavily on trial-and-error learning. By leveraging existing experimental data, Materials Quantitative Structure-Property Relationships (MQSPRs) relate molecular structures to the polar and dispersive components of corresponding surface tensions. In turn, existing theories relate polymer and nanofiller polar and dispersive surface tension components to the dispersion state and interfacial polymer relaxation times. These quantities may, in the future, be used as input to continuum mechanics approaches shown able to predict the thermomechanical response of nanocomposites. For a polymer dataset and a particle dataset, multiple structural representations and descriptor sets are benchmarked, including a set of high performance surface-property descriptors developed as part of this work. The systematic variation of structural representations as part of the informatics approach reveals important insight in modeling polymers, and should become common practice when defining new problem spaces.
The psychoacoustics of musical articulation
NASA Astrophysics Data System (ADS)
Spiegelberg, Scott Charles
This dissertation develops psychoacoustical definitions of notated articulations, the necessary first step in articulation research. This research can be useful to theorists interested in timbre analysis, the psychology of performance, analysis and performance, the psychology of style differentiation, and performance pedagogy. An explanation of wavelet transforms precedes the development of new techniques for analyzing transient sounds. A history of timbre perception research reveals the inadequacies of current sound segmentation models, resulting in the creation of a new model, the Pitch/Amplitude/Centroid Trajectory (PACT) model of sound segmentation. The new analysis techniques and PACT model are used to analyze recordings of performers playing a melodic fragment in a series of notated articulations. Statistical tests showed that the performers generally agreed on the interpretation of five different articulation groups. A cognitive test of articulation similarity, using musicians and non-musicians as participants, revealed a close correlation between similarity judgments and physical attributes, though additional unknown factors are clearly present. A second psychological test explored the perceptual salience of articulation notation, by asking musically-trained participants to match stimuli to the same notations the performers used. The participants also marked verbal descriptors for each articulation, such as short/long, sharp/dull, loud/soft, harsh/gentle, and normal/extreme. These results were matched against the results of Chapters Five and Six, providing an overall interpretation of the psychoacoustics of articulation.
Polanski, Jaroslaw; Tkocz, Aleksandra; Kucia, Urszula
2017-09-11
On the one hand, ligand efficiency (LE) and the binding efficiency index (BEI), which are binding properties (B) averaged versus the heavy atom count (HAC: LE) or molecular weight (MW: BEI), have recently been declared a novel universal tool for drug design. On the other hand, questions have been raised about the mathematical validity of the LE approach. In fact, neither the critics nor the advocates are precise enough to provide a generally understandable and accepted chemistry of the LE metrics. In particular, this refers to the puzzle of the LE trends for small and large molecules. In this paper, we explain the chemistry and mathematics of the LE type of data. Because LE is a weight metric related to binding per gram, its hyperbolic decrease with an increasing number of heavy atoms can be easily understood by its 1/MW dependency. Accordingly, we analyzed how this influences the LE trends for ligand-target binding, economic big data or molecular descriptor data. In particular, we compared the trends for the thermodynamic ∆G data of a series of ligands that interact with 14 different target classes, which were extracted from the BindingDB database, with the market prices of a commercial compound library of ca. 2.5 million synthetic building blocks. An interpretation of LE and BEI that clearly explains the observed trends for these parameters is presented here for the first time. Accordingly, we show that the main misunderstanding of the chemical meaning of the BEI and LE parameters is their interpretation as molecular descriptors that are connected with a single molecule, while binding is a statistical effect in which a population of ligands limits the formation of ligand-receptor complexes. Therefore, LE (BEI) should not be interpreted as a molecular (physicochemical) descriptor that is connected with a single molecule but as a property (binding per gram). Accordingly, the puzzle of the surprising behavior of LE is explained by the 1/MW dependency. 
This effect clearly explains the hyperbolic LE trend not as a real increase in binding potency but as a physical limitation due to the different population of ligands with different MWs in a 1 g sample available for the formation of ligand-receptor complexes.
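The hyperbolic trend discussed in this abstract follows directly from dividing a binding quantity by a size measure. The sketch below assumes the commonly used definitions LE = -ΔG / HAC and BEI = pKi per kDa, which the abstract describes only in words; the function names and the numbers are hypothetical.

```python
def ligand_efficiency(delta_g_kcal_mol: float, heavy_atoms: int) -> float:
    """LE: binding free energy averaged over the heavy atom count (assumed -dG / HAC)."""
    return -delta_g_kcal_mol / heavy_atoms


def binding_efficiency_index(p_affinity: float, mw_da: float) -> float:
    """BEI: pKi (or pKd, pIC50) per kilodalton of molecular weight (assumed definition)."""
    return p_affinity * 1000.0 / mw_da


# With the binding free energy held constant, LE falls as 1 / HAC:
# doubling the heavy atom count halves the efficiency.
for hac in (10, 20, 40):
    print(hac, ligand_efficiency(-10.0, hac))
```

Holding ΔG fixed isolates the 1/size dependence: any decline in LE in this toy series comes from the denominator alone, not from a change in binding.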
Qin, Li-Tang; Liu, Shu-Shen; Liu, Hai-Ling
2010-02-01
A five-variable model (model M2) was developed for the bioconcentration factors (BCFs) of nonpolar organic compounds (NPOCs) by using molecular electronegativity distance vector (MEDV) to characterize the structures of NPOCs and variable selection and modeling based on prediction (VSMP) to select the optimum descriptors. The estimated correlation coefficient (r²) and the leave-one-out cross-validation correlation coefficient (q²) of model M2 were 0.9271 and 0.9171, respectively. The model was externally validated by splitting the whole data set into a representative training set of 85 chemicals and a validation set of 29 chemicals. The results show that the main structural factors influencing the BCFs of NPOCs are -cCc, cCcc, -Cl, and -Br (where "-" refers to a single bond and "c" refers to a conjugated bond). The quantitative structure-property relationship (QSPR) model can effectively predict the BCFs of NPOCs, and the predictions of the model can also extend the current BCF database of experimental values.
Structure-activity relationships between sterols and their thermal stability in oil matrix.
Hu, Yinzhou; Xu, Junli; Huang, Weisu; Zhao, Yajing; Li, Maiquan; Wang, Mengmeng; Zheng, Lufei; Lu, Baiyi
2018-08-30
Structure-activity relationships between 20 sterols and their thermal stabilities were studied in a model oil system. All sterol degradations were found to be consistent with a first-order kinetic model, with coefficients of determination (R²) higher than 0.9444. The number of double bonds in the sterol structure was negatively correlated with the thermal stability of the sterol, whereas the length of the branch chain was positively correlated with it. A quantitative structure-activity relationship (QSAR) model to predict the thermal stability of sterols was developed by using partial least squares regression (PLSR) combined with a genetic algorithm (GA). A regression model was built with an R² of 0.806. Almost all sterol degradation constants can be predicted accurately, with a cross-validated R² of 0.680. Four important variables were selected in the optimal QSAR model, and the selected variables were related to information indices, RDF descriptors, and 3D-MoRSE descriptors. Copyright © 2018 Elsevier Ltd. All rights reserved.
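A first-order kinetic model of the kind mentioned in this abstract implies ln C(t) = ln C(0) - k t, so the degradation constant k can be estimated as the negative slope of a straight-line fit. The sketch below uses an ordinary least-squares slope on synthetic data; the function name, the time points and the rate constant are hypothetical, and the abstract does not state how the authors fitted their constants.

```python
import math
from typing import Sequence


def first_order_rate_constant(times: Sequence[float],
                              concentrations: Sequence[float]) -> float:
    """Return k from a least-squares fit of ln(C) against t (k = -slope)."""
    log_c = [math.log(c) for c in concentrations]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(log_c) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, log_c))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -sxy / sxx


# Synthetic, noise-free data generated with k = 0.25 per time unit.
times = [0.0, 1.0, 2.0, 4.0, 8.0]
conc = [100.0 * math.exp(-0.25 * t) for t in times]
k = first_order_rate_constant(times, conc)
print(round(k, 6))
```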
Wang, Zhanhui; Luo, Pengjie; Cheng, Linli; Zhang, Suxia; Shen, Jianzhong
2011-01-01
The molecular recognition between hapten and antibody is a fundamental event in competitive immunoassay, which guarantees the sensitivity and specificity of immunoassay for the detection of haptens. The aim of this study is to investigate the correlation between the binding ability of one monoclonal antibody, 1H9B4, and the molecular aspects of the α-zearalanol analogs it recognizes. The mouse-derived monoclonal antibody was produced by using α-zearalanol conjugated to bovine serum albumin as an immunogen. The antibody recognition abilities, expressed as IC(50) values, were determined by a competitive ELISA. All of the hapten molecules were optimized by Density Functional Theory (DFT) at the B3LYP/6-31G* level, and the conformation and electrostatic molecular isosurface were employed to explain the molecular recognition between α-zearalanol analogs and antibody 1H9B4. Pearson correlation analysis between molecular descriptors and IC(50) values was qualitatively undertaken, and the results showed that one molecular descriptor, the surface of the hapten molecule, clearly demonstrated a linear relationship with antibody recognition ability, where the correlation coefficient was 0.88 and the correlation was significant at the p < 0.05 level. The study shows that computational chemistry and Pearson correlation analysis can be used as a tool to help immunochemists better understand the process of antibody recognition of hapten molecules in competitive immunoassay. Copyright © 2011 John Wiley & Sons, Ltd.
Contreras-Torres, Ernesto
2018-06-02
In this study, I introduce novel global and local 0D-protein descriptors based on a statistical quantity named Total Sum of Squares (TSS). This quantity represents the sum of the squared differences of amino acid properties from the arithmetic mean property. As an extension, the amino acid-types and amino acid-groups formalisms are used for describing zones of interest in proteins. To assess the effectiveness of the proposed descriptors, a Nearest Neighbor model for predicting the four major protein structural classes was built. This model has a success rate of 98.53% on the jackknife cross-validation test, a performance superior to other reported methods despite the simplicity of the predictor. Additionally, this predictor has an average success rate of 98.35% in the different cross-validation tests performed. A value of 0.98 for the Kappa statistic clearly discriminates this model from a random predictor. The results obtained by the Nearest Neighbor model demonstrated the ability of the proposed descriptors not only to reflect relevant biochemical information related to the structural classes of proteins but also to allow appropriate interpretability. It can thus be expected that the current method may play a supplementary role to other existing approaches for predicting protein structural class and other protein attributes. Copyright © 2018 Elsevier Ltd. All rights reserved.
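The Total Sum of Squares quantity described in this abstract can be written out for a single amino acid property. The sketch below is a minimal global version only; the choice of the Kyte-Doolittle hydropathy scale, the function name and the example sequences are illustrative assumptions, and the local (amino acid-type and amino acid-group) variants are not reproduced because the abstract does not define them.

```python
from typing import Mapping

# Kyte-Doolittle hydropathy values, used here only as an example property scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def total_sum_of_squares(sequence: str,
                         scale: Mapping[str, float] = KYTE_DOOLITTLE) -> float:
    """Sum of squared deviations of residue property values from their mean."""
    values = [scale[residue] for residue in sequence]
    mean = sum(values) / len(values)
    return sum((value - mean) ** 2 for value in values)


# A homopolymer has no spread in the property, so its TSS is zero.
print(total_sum_of_squares("AAAA"))
# A mixed sequence yields a positive value that grows with property spread.
print(total_sum_of_squares("MKVLAAGIV"))
```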
NASA Astrophysics Data System (ADS)
Agrawal, Megha; Deval, Vipin; Gupta, Archana; Sangala, Bagvanth Reddy; Prabhu, S. S.
2016-10-01
The structure and several spectroscopic features along with reactivity parameters of the compound 4-(6-methoxy-2-naphthyl)-2-butanone (Nabumetone) have been studied using experimental techniques and tools derived from quantum chemical calculations. Structure optimization is followed by force field calculations based on density functional theory (DFT) at the B3LYP/6-311++G(d,p) level of theory. The vibrational spectra have been interpreted with the aid of normal coordinate analysis. UV-visible spectrum and the effect of solvent have been discussed. The electronic properties such as HOMO and LUMO energies have been determined by TD-DFT approach. In order to understand various aspects of pharmacological sciences several new chemical reactivity descriptors - chemical potential, global hardness and electrophilicity have been evaluated. Local reactivity descriptors - Fukui functions and local softnesses have also been calculated to find out the reactive sites within molecule. Aqueous solubility and lipophilicity have been calculated which are crucial for estimating transport properties of organic molecules in drug development. Estimation of biological effects, toxic/side effects has been made on the basis of prediction of activity spectra for substances (PASS) prediction results and their analysis by Pharma Expert software. Using the THz-TDS technique, the frequency-dependent absorptions of NBM have been measured in the frequency range up to 3 THz.
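The global reactivity descriptors named in this abstract are often estimated from frontier orbital energies. The sketch below assumes the widely used Koopmans-type approximations μ ≈ (E_HOMO + E_LUMO)/2, η ≈ (E_LUMO - E_HOMO)/2 and ω = μ²/(2η); the abstract does not state which expressions the authors used, some conventions omit the factor 1/2 in η, and the orbital energies below are hypothetical.

```python
from typing import Dict


def global_reactivity_descriptors(e_homo: float, e_lumo: float) -> Dict[str, float]:
    """Chemical potential, hardness and electrophilicity from orbital energies (eV)."""
    mu = (e_homo + e_lumo) / 2.0      # chemical potential (assumed approximation)
    eta = (e_lumo - e_homo) / 2.0     # global hardness (assumed convention with 1/2)
    omega = mu ** 2 / (2.0 * eta)     # electrophilicity index
    return {"chemical_potential": mu, "hardness": eta, "electrophilicity": omega}


# Hypothetical frontier orbital energies in eV.
descriptors = global_reactivity_descriptors(e_homo=-6.0, e_lumo=-2.0)
print(descriptors)
```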
Yeung, S; Genaidy, A; Deddens, J; Shoaf, C; Leung, P
2003-01-01
Aims: To investigate the use of a worker based methodology to assess the physical stresses of lifting tasks on effort expended, and to associate this loading with musculoskeletal outcomes (MO). Methods: A cross sectional study was conducted on 217 male manual handling workers from the Hong Kong area. The effects of four lifting variables (weight of load, horizontal distance, twisting angle, and vertical travel distance) on effort were examined using a linguistic approach (that is, characterising variables in descriptors such as "heavy" for weight of load). The numerical interpretations of linguistic descriptors were established. In addition, the associations between on the job effort and MO were investigated for 10 body regions including the spine, and both upper and lower extremities. Results: MO were prevalent in multiple body regions (range 12–58%); effort was significantly associated with MO in 8 of 10 body regions (odds ratios with age adjusted ranged from 1.31 for low back to 1.71 for elbows and forearm). The lifting task variables had significant effects on effort, with the weight of load having twice the effect of other variables; each linguistic descriptor was better described by a range of numerical values rather than a single numerical value. Conclusions: The participatory worker based approach on musculoskeletal outcomes is a promising methodology. Further testing of this approach is recommended. PMID:14504360
2014-01-01
Background: Measures of similarity for chemical molecules have been developed since the dawn of chemoinformatics. Molecular similarity has been measured by a variety of methods including molecular descriptor based similarity, common molecular fragments, graph matching and 3D methods such as shape matching. Similarity measures are widespread in practice and have proven to be useful in drug discovery. Because of our interest in electrostatics and high throughput ligand-based virtual screening, we sought to exploit the information contained in atomic coordinates and partial charges of a molecule. Results: A new molecular descriptor based on partial charges is proposed. It uses the autocorrelation function and linear binning to encode all atoms of a molecule into two rotation-translation invariant vectors. Combined with a scoring function, the descriptor allows a database of compounds to be rank-ordered against a query molecule. The proposed implementation is called ACPC (AutoCorrelation of Partial Charges) and is released as open source. Extensive retrospective ligand-based virtual screening experiments were performed, and the method was compared with other methods in order to validate it and the associated protocol. Conclusions: While it is a simple method, it performed remarkably well in experiments. At an average speed of 1649 molecules per second, it reached an average median area under the curve of 0.81 on 40 different targets, thereby validating the proposed protocol and implementation. PMID:24887178
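The idea of an autocorrelation of partial charges with linear binning can be sketched as follows. This is not the ACPC implementation: the split into one vector for positive and one for negative charge products, the bin width, the number of bins and all names are assumptions made for illustration. Because only interatomic distances enter, the result does not change when the molecule is rotated or translated.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Atom = Tuple[float, float, float, float]  # x, y, z, partial charge


def charge_autocorrelation(atoms: Sequence[Atom],
                           bin_width: float = 1.0,
                           n_bins: int = 10) -> Tuple[List[float], List[float]]:
    """Accumulate charge products over atom pairs into distance bins.

    Linear binning: each product is shared between the two nearest bin
    positions in proportion to how close the distance is to each of them.
    """
    positive = [0.0] * n_bins
    negative = [0.0] * n_bins
    for (x1, y1, z1, q1), (x2, y2, z2, q2) in combinations(atoms, 2):
        distance = math.dist((x1, y1, z1), (x2, y2, z2))
        product = q1 * q2
        target = positive if product >= 0 else negative
        position = distance / bin_width
        low = int(position)
        frac = position - low
        if low < n_bins:
            target[low] += product * (1.0 - frac)
        if low + 1 < n_bins:
            target[low + 1] += product * frac
    return positive, negative


# Hypothetical three-atom molecule and a translated copy of it.
molecule = [(0.0, 0.0, 0.0, 0.4), (1.5, 0.0, 0.0, -0.4), (0.0, 2.0, 0.0, 0.2)]
shifted = [(x + 5.0, y - 3.0, z + 2.0, q) for x, y, z, q in molecule]
print(charge_autocorrelation(molecule))
print(charge_autocorrelation(shifted))
```

Two such vectors could then be compared with any vector similarity score; the scoring function used by ACPC is not described in the abstract and is not reproduced here.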
Linear and nonlinear methods in modeling the aqueous solubility of organic compounds.
Catana, Cornel; Gao, Hua; Orrenius, Christian; Stouten, Pieter F W
2005-01-01
Solubility data for 930 diverse compounds have been analyzed using linear Partial Least Square (PLS) and nonlinear PLS methods, Continuum Regression (CR), and Neural Networks (NN). 1D and 2D descriptors from the MOE package in combination with E-state or ISIS keys have been used. The best model was obtained using linear PLS for a combination of 22 MOE descriptors and 65 ISIS keys. It has a correlation coefficient (r2) of 0.935 and a root-mean-square error (RMSE) of 0.468 log molar solubility (log S(w)). The model validated on a test set of 177 compounds not included in the training set has r2 0.911 and RMSE 0.475 log S(w). The descriptors were ranked according to their importance, and the 22 MOE descriptors were found at the top of the list. The CR model produced results as good as PLS, and because of the way in which cross-validation has been done it is expected to be a valuable prediction tool alongside the PLS model. The statistics obtained using nonlinear methods did not surpass those obtained with linear ones. The good statistics obtained for linear PLS and CR recommend these models for prediction when it is difficult or impossible to make experimental measurements, for virtual screening, combinatorial library design, and efficient lead optimization.
QSAR as a random event: modeling of nanoparticles uptake in PaCa2 cancer cells.
Toropov, Andrey A; Toropova, Alla P; Puzyn, Tomasz; Benfenati, Emilio; Gini, Giuseppina; Leszczynska, Danuta; Leszczynski, Jerzy
2013-06-01
Quantitative structure-property/activity relationships (QSPRs/QSARs) are a tool to predict various endpoints for various substances. The "classic" QSPR/QSAR analysis is based on the representation of the molecular structure by the molecular graph. However, the simplified molecular input-line entry system (SMILES) is gradually becoming the most popular representation of the molecular structure in the databases available on the Internet. Under such circumstances, the development of molecular descriptors calculated directly from SMILES becomes an attractive alternative to "classic" descriptors. The CORAL software (http://www.insilico.eu/coral) provides SMILES-based optimal molecular descriptors that are intended to correlate with various endpoints. We analyzed a data set on nanoparticle uptake in PaCa2 pancreatic cancer cells. The data set includes 109 nanoparticles with the same core but different surface modifiers (small organic molecules). The concept of a QSAR as a random event is suggested in opposition to "classic" QSARs, which are based on only one distribution of the available data into the training and validation sets. In other words, five random splits into the "visible" training set and the "invisible" validation set were examined. The SMILES-based optimal descriptors (obtained by the Monte Carlo technique) for these splits are calculated with the CORAL software. The statistical quality of all these models is good. Copyright © 2013 Elsevier Ltd. All rights reserved.
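Examining several random partitions of one data set, as this abstract describes, can be sketched with the standard library alone. The sketch below is not the CORAL procedure: the validation fraction, the seed and the function name are hypothetical, and only the data set size (109) and the number of splits (five) come from the abstract.

```python
import random
from typing import List, Tuple


def random_splits(n_items: int,
                  n_splits: int = 5,
                  validation_fraction: float = 0.25,
                  seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return independent random (training, validation) index partitions."""
    rng = random.Random(seed)
    n_validation = round(n_items * validation_fraction)
    splits = []
    for _ in range(n_splits):
        indices = list(range(n_items))
        rng.shuffle(indices)
        validation = sorted(indices[:n_validation])
        training = sorted(indices[n_validation:])
        splits.append((training, validation))
    return splits


# Five random partitions of 109 items; a model would be built on each
# training part and judged on the matching held-out validation part.
splits = random_splits(109)
for training, validation in splits:
    print(len(training), len(validation))
```

Comparing model statistics across the partitions shows how much of a reported result depends on one particular choice of training and validation sets.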
Abramson, Richard G.; Su, Pei-Fang; Shyr, Yu
2012-01-01
Quantitative imaging has emerged as a leading priority on the imaging research agenda, yet clinical radiology has traditionally maintained a skeptical attitude toward numerical measurement in diagnostic interpretation. To gauge the extent to which quantitative reporting has been incorporated into routine clinical radiology practice, and to offer preliminary baseline data against which the evolution of quantitative imaging can be measured, we obtained all clinical computed tomography (CT) and magnetic resonance imaging (MRI) reports from two randomly selected weekdays in 2011 at a single mixed academic-community practice and evaluated those reports for the presence of quantitative descriptors. We found that 44% of all reports contained at least one “quantitative metric” (QM), defined as any numerical descriptor of a physical property other than quantity, but only 2% of reports contained an “advanced quantitative metric” (AQM), defined as a numerical parameter reporting on lesion function or composition, excluding simple size and distance measurements. Possible reasons for the slow translation of AQMs into routine clinical radiology reporting include perceptions that the primary clinical question may be qualitative in nature or that a qualitative answer may be sufficient; concern that quantitative approaches may obscure important qualitative information, may not be adequately validated, or may not allow sufficient expression of uncertainty; the feeling that “gestalt” interpretation may be superior to quantitative paradigms; and practical workflow limitations. We suggest that quantitative imaging techniques will evolve primarily as dedicated instruments for answering specific clinical questions requiring precise and standardized interpretation. Validation in real-world settings, ease of use, and reimbursement economics will all play a role in determining the rate of translation of AQMs into broad practice. PMID:22795791
Calès, Paul; Chaigneau, Julien; Hunault, Gilles; Michalak, Sophie; Cavaro-Menard, Christine; Fasquel, Jean-Baptiste; Bertrais, Sandrine; Rousselet, Marie-Christine
2015-01-01
Background: Liver fibrosis staging provides prognostic value, although it is hampered by observer variability. We used digital analysis to develop diagnostic morphometric scores for significant fibrosis, cirrhosis and fibrosis staging in chronic hepatitis C. Materials and Methods: We automated the measurement of 44 classical and new morphometric descriptors. The reference was histological METAVIR fibrosis (F) staging (F0 to F4) on liver biopsies. The derivation population included 416 patients and liver biopsies ≥20 mm in length. Two validation populations included 438 patients. Results: In the derivation population, the area under the receiver operating characteristic (AUROC) for clinically significant fibrosis (F stage ≥2) of a logistic score combining 5 new descriptors (stellar fibrosis area, edge linearity, bridge thickness, bridge number, nodularity) was 0.957. The AUROC for cirrhosis of 6 new descriptors (edge linearity, nodularity, portal stellar fibrosis area, portal distance, granularity, fragmentation) was 0.994. Predicted METAVIR F staging combining 8 morphometric descriptors agreed well with METAVIR F staging by pathologists: κ = 0.868. Morphometric score of clinically significant fibrosis had a higher correlation with porto-septal fibrosis area (rs = 0.835) than METAVIR F staging (rs = 0.756, P < 0.001) and the same correlations with fibrosis biomarkers, e.g., serum hyaluronate: rs = 0.484 versus rs = 0.476 for METAVIR F (P = 0.862). In the validation population, the AUROCs of clinically significant fibrosis and cirrhosis scores were, respectively: 0.893 and 0.993 in 153 patients (biopsy < 20 mm); 0.955 and 0.994 in 285 patients (biopsy ≥ 20 mm). The three morphometric diagnoses agreed with consensus expert reference as well as or better than diagnoses by first-line pathologists in 285 patients, respectively: significant fibrosis: 0.733 versus 0.733 (κ), cirrhosis: 0.900 versus 0.827, METAVIR F: 0.881 versus 0.865. 
Conclusion: The new automated morphometric scores provide reproducible and accurate diagnoses of fibrosis stages via “virtual expert pathologist.” PMID:26110088
Sahoo, Sagarika; Adhikari, Chandana; Kuanar, Minati; Mishra, Bijay K
2016-01-01
Synthesis of organic compounds with specific biological activity or physicochemical characteristics needs a thorough analysis of the enumerable data set obtained from literature. Quantitative structure property/activity relationships have made it simple by predicting the structure of the compound with any optimized activity. For that there is a paramount data set of molecular descriptors (MD). This review is a survey of the generation of molecular descriptors and their probable applications in QSP/AR. Literature was collected from a wide class of research journals, citable web reports, seminar proceedings and books. The MDs were classified according to their generation. The applications of the MDs in QSP/AR are also reported in this review. The MDs can be classified into experimental and theoretical types, with a subclassification of the latter into structural and quantum chemical descriptors. The structural parameters are derived from molecular graphs or topology of the molecules. Even the pixels of a molecular image can be used as a molecular descriptor. In QSPR studies the physicochemical properties include boiling point, heat capacity, density, refractive index, molar volume, surface tension, heat of formation, octanol-water partition coefficient, solubility, chromatographic retention indices etc. Among biological activities, toxicity, antimalarial activity, sensory irritation, potency of local anesthetics, tadpole narcosis, antifungal activity, and enzyme inhibiting activity are some important parameters in QSAR studies. The classification of the MDs is mostly generic in nature. The application of the MDs in QSP/AR also has a generic link. Experimental MDs are more suitable in correlation analysis than the theoretical ones but are more expensive to generate. 
With the advent of sophisticated computational tools and experimental design, proliferation of MDs is inevitable, but for a highly optimized MD, the study of MD generation is an unending process.
The effect of phenol composition on the sensory profile of smoke affected wines.
Kelly, David; Zerihun, Ayalsew
2015-05-26
Vineyards exposed to wildfire generated smoke can produce wines with elevated levels of lignin derived phenols that have acrid, metallic and smoky aromas and flavour attributes. While a large number of phenols are present in smoke affected wines, the effect of smoke vegetation source on the sensory descriptors has not been reported. Here we report on a descriptive sensory analysis of wines made from grapes exposed to different vegetation sources of smoke to examine: (1) the effect vegetation source has on wine sensory attribute ratings and; (2) associations between volatile and glycoconjugated phenol composition and sensory attributes. Sensory attribute ratings were determined by a trained sensory panel and phenol concentrations determined by gas chromatography-mass spectroscopy. Analysis of variance, principal component analysis and partial least squares regressions were used to evaluate the interrelationships between the phenol composition and sensory attributes. The results showed that vegetation source of smoke significantly affected sensory attribute intensity, especially the taste descriptors. Differences in aroma and taste from smoke exposure were not limited to an elevation in a range of detractive descriptors but also a masking of positive fruit descriptors. Sensory differences due to vegetation type were driven by phenol composition and concentration. In particular, the glycoconjugates of 4-hydroxy-3-methoxybenzaldehyde (vanillin), 1-(4-hydroxy-3-methoxyphenyl)ethanone (acetovanillone), 4-hydroxy-3,5-dimethoxybenzaldehyde (syringaldehyde) and 1-(4-hydroxy-3,5-dimethoxyphenyl)ethanone (acetosyringone) concentrations were influential in separating the vegetation sources of smoke. It is concluded that the detractive aroma attributes of smoke affected wine, especially of smoke and ash, were associated with volatile phenols while the detractive flavour descriptors were correlated with glycoconjugated phenols.
NASA Astrophysics Data System (ADS)
Oses, Corey; Isayev, Olexandr; Toher, Cormac; Curtarolo, Stefano; Tropsha, Alexander
Historically, materials discovery has been driven by a laborious trial-and-error process. The growth of materials databases and emerging informatics approaches finally offer the opportunity to transform this practice into data- and knowledge-driven rational design, accelerating discovery of novel materials exhibiting desired properties. By using data from the AFLOW repository for high-throughput, ab-initio calculations, we have generated Quantitative Materials Structure-Property Relationship (QMSPR) models to predict critical materials properties, including the metal/insulator classification, band gap energy, and bulk modulus. The prediction accuracy obtained with these QMSPR models approaches training data for virtually any stoichiometric inorganic crystalline material. We attribute the success and universality of these models to the construction of new materials descriptors, referred to as the universal Property-Labeled Material Fragments (PLMF). This representation affords straightforward model interpretation in terms of simple heuristic design rules that could guide rational materials design. This proof-of-concept study demonstrates the power of materials informatics to dramatically accelerate the search for new materials.
NASA Astrophysics Data System (ADS)
Alam, Mahboob; Park, Soonheum
2018-05-01
The synthesis of 3β,6β-dichloro-5α-hydroxy-5α-cholestane (in general, steroidal chlorohydrin or steroidal halohydrin) and a theoretical study of its structure are reported in this paper. The individuality of the chlorohydrin was confirmed by FT-IR, NMR, MS, CHN microanalysis and X-ray crystallography. DFT calculations on the title molecule have been performed. The molecular structure and spectra explained by Gaussian hybrid computational analysis theory (B3LYP) are found to be in correlation with the experimental data obtained from the various spectrophotometric techniques. The theoretical geometry optimization data were compared with the X-ray data. The vibrational bands appearing in the FT-IR are assigned with accuracy using harmonic frequencies along with intensities and animated modes. Molecular properties like NBO, HOMO-LUMO analysis, chemical reactivity descriptors, MEP mapping and dipole moment have been treated at the same level of theory. The calculated electronic spectrum of the chlorohydrin is interpreted on the basis of TD-DFT calculations.
NASA Technical Reports Server (NTRS)
Garbell, Maurice A.
1990-01-01
A rational, internationally consistent, noise descriptor system is needed to express existing and predicted en route aircraft noise levels in terms closely correlated to the annoyance perceived by people and physiologically identifiable in people, to provide guidance for aircraft and powerplant design, flight management, land-use planning, and building codes. Expanding on previous discussions, a new comprehensive statement of the specific questions that must be resolved by needed research, and the nature and quality of proof that must be adduced to justify further steps toward the drafting and adoption of new international en route aircraft-noise standards is sought. The single noise-descriptor system envisioned must be valid for widely varying aircraft-noise frequency spectra, including time-variant components and agreeable and disagreeable discrete tones and combinations of tones. The measures and criteria established by the system must be valid at high and low immission levels, at high and low ambient noise levels, for great and small number of noise events, and outdoors and indoors.
Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan
2012-12-01
A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere polybutadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization (ACO) is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (log k(w)). The overall best model was the SVM one built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.
Batool, Fozia; Iqbal, Shahid; Akbar, Jamshed
2018-04-03
The present study describes Quantitative Structure Property Relationship (QSPR) modeling to relate metal ion characteristics to the adsorption potential of Ficus carica leaves for 13 selected metal ions (Ca²⁺, Cr³⁺, Co²⁺, Cu²⁺, Cd²⁺, K⁺, Mg²⁺, Mn²⁺, Na⁺, Ni²⁺, Pb²⁺, Zn²⁺, and Fe²⁺). A set of 21 characteristic descriptors was selected and the relationship of these metal characteristics with the adsorptive behavior of the metal ions was investigated. Stepwise Multiple Linear Regression (SMLR) analysis and Artificial Neural Network (ANN) were applied for descriptor selection and model generation. Langmuir and Freundlich isotherms were also applied to the adsorption data to generate proper correlation for experimental findings. The model generated indicated covalent index as the most significant descriptor, which is responsible for more than 90% predictive adsorption (α = 0.05). Internal validation of the model was performed by measuring [Formula: see text] (0.98). The results indicate that the present model is a useful tool for prediction of the adsorptive behavior of different metal ions based on their ionic characteristics.
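The Langmuir and Freundlich isotherms mentioned in this abstract are commonly fitted through their linearized forms, ln q = ln K_F + (1/n) ln C and C/q = C/q_max + 1/(q_max K_L). The sketch below assumes those textbook forms (the abstract does not say how the fits were done); all function names are hypothetical and the data are synthetic, generated from known parameters rather than taken from the study.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_freundlich(c_eq, q_eq):
    """Fit ln q = ln K_F + (1/n) ln C; returns (K_F, n)."""
    slope, intercept = linear_fit([math.log(c) for c in c_eq],
                                  [math.log(q) for q in q_eq])
    return math.exp(intercept), 1.0 / slope

def fit_langmuir(c_eq, q_eq):
    """Fit C/q = C/q_max + 1/(q_max * K_L); returns (q_max, K_L)."""
    slope, intercept = linear_fit(c_eq, [c / q for c, q in zip(c_eq, q_eq)])
    q_max = 1.0 / slope
    return q_max, 1.0 / (q_max * intercept)

# Synthetic equilibrium data (illustration only).
c_f = [1.0, 4.0, 9.0, 16.0]
q_f = [2.0 * math.sqrt(c) for c in c_f]               # Freundlich with K_F = 2, n = 2
c_l = [1.0, 2.0, 4.0, 8.0]
q_l = [10.0 * 0.5 * c / (1.0 + 0.5 * c) for c in c_l]  # Langmuir with q_max = 10, K_L = 0.5

print(fit_freundlich(c_f, q_f))
print(fit_langmuir(c_l, q_l))
```

Linearized fits are simple but weight the data differently from a direct nonlinear fit, so parameters from real measurements may differ somewhat between the two approaches.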
Liu, Shubin; Rong, Chunying; Lu, Tian
2017-01-04
One of the main tasks of theoretical chemistry is to rationalize computational results with chemical insights. Key concepts of such nature include nucleophilicity, electrophilicity, regioselectivity, and stereoselectivity. While computational tools are available to predict barrier heights and other reactivity properties with acceptable accuracy, a conceptual framework to appreciate the above quantities is still lacking. In this work, we introduce the electronic force as the fundamental driving force of chemical processes to understand and predict molecular reactivity. It has three components but only two are independent. These forces, electrostatic and steric, can be employed as reliable descriptors for nucleophilic and electrophilic regioselectivity and stereoselectivity. The advantages of using these forces to evaluate molecular reactivity are that electrophilic and nucleophilic attacks are marked by distinct characteristics in the electrostatic force and no knowledge of quantum effects included in the kinetic and exchange-correlation energies is required. Examples are provided to highlight the validity and general applicability of these reactivity descriptors. Possible applications in ambident reactivity, σ and π holes, frustrated Lewis pairs, and stereoselective reactions are also included in this work.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Benigni, R.; Andreoli, C.; Giuliani, A.
1989-01-01
The interrelationships among carcinogenicity, mutagenicity, acute toxicity (LD50), and a number of molecular descriptors were studied by computerized data analysis methods on the data base generated by the International Program for the Evaluation of Short-Term Tests for Carcinogens (IPESTTC). With the use of statistical regression methods, three main associations were evidenced: (1) the well-known correlation between carcinogenicity and mutagenicity; (2) a correlation between mutagenicity and toxicity (LD50 ip in mice); and (3) a correlation between toxicity and a recently introduced estimator of the free energy of binding of the molecules to biological receptors. As expected on the basis of the large variety of chemical classes represented in the IPESTTC data base, no simple relationship between mutagenicity or carcinogenicity and chemical descriptors was found. To overcome this problem, a new pattern recognition method (REPAD), developed by us for structure-activity studies of noncongeneric chemicals, has been used. This allowed us to highlight a significant difference between the whole patterns of relationships among chemicophysical variables in the two groups of active (mutagenic and/or carcinogenic) and inactive chemicals. This approach generated a classification rule able to correctly assign about 80% of carcinogens or mutagens.
Psychometric Study of the Pain Drawing.
Trahan, Lisa H; Cox-Martin, Emily; Johnson, Carrie E; Dougherty, Patrick M; Yu, Jun; Feng, Lei; Cook, Christina; Novy, Diane M
2017-12-01
The objectives of the study were to (1) assess the extent to which interrater reliability of pain drawing location and dispersion scoring methods are similar across pain disciplines in a sample of patients with cancer treatment-induced neuropathic pain (N = 56) and (2) investigate indicators of validity of the pain drawing in this unique sample. Patients undergoing cancer therapy completed the Brief Pain Inventory Body Map, the MD Anderson Symptom Inventory, and the McGill Pain Questionnaire. Intraclass correlation coefficients among medical and psychology professionals ranged from .93-.99. Correlations between pain drawing score and symptom burden severity ranged from .29-.39; correlations between pain drawing score and symptom burden interference ranged from .28-.34. Patients who endorsed pain in the hands and feet more often described their pain as electric, numb, and shooting than patients without pain in the hands and feet. They also endorsed significantly more descriptors of neuropathic pain. Results suggest a similar understanding among members of a multidisciplinary pain team as to the location and dispersion of pain as represented by patients' pain drawings. In addition, pain drawing scores were related to symptom burden severity and interference and descriptors of neuropathic pain in expected ways.
Urban area thermal monitoring: Liepaja case study using satellite and aerial thermal data
NASA Astrophysics Data System (ADS)
Gulbe, Linda; Caune, Vairis; Korats, Gundars
2017-12-01
The aim of this study is to explore large scale (60 m/pixel) and small scale (individual building level) temperature distribution patterns from thermal remote sensing data and to determine what kind of information could be extracted from thermal remote sensing on a regular basis. The Landsat program provides frequent large scale thermal images useful for analysis of city temperature patterns. The study examined the correlation of temperature patterns with vegetation content based on NDVI and with building coverage based on OpenStreetMap data. Landsat based temperature patterns were independent of the season, negatively correlated with vegetation content and positively correlated with building coverage. Small scale analysis included spatial and raster descriptor analysis for polygons corresponding to roofs of individual buildings for evaluating insulation of roofs. Remote sensing and spatial descriptors are poorly related to heat consumption data; however, the median and entropy of the thermal aerial data can help to identify poorly insulated roofs. Automated quantitative roof analysis has high potential for acquiring city wide information about roof insulation, but its quality is limited by reference data quality, and information on building types and roof materials would be crucial for further studies.
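The two roof descriptors singled out in this abstract, the median and the entropy of the thermal pixels inside a roof polygon, could be computed along the following lines. This is a sketch under stated assumptions: the abstract does not define its entropy, so Shannon entropy over temperature bins is assumed, and the 1-degree bin width, the function names, and the pixel values are all hypothetical.

```python
import math
from collections import Counter
from statistics import median

def shannon_entropy(values, bin_width=1.0):
    """Shannon entropy in bits of values grouped into bins of width bin_width."""
    bins = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    return -sum((count / total) * math.log2(count / total) for count in bins.values())

def roof_descriptors(pixel_temperatures, bin_width=1.0):
    """Return (median, entropy) for the thermal pixels of one roof polygon."""
    return median(pixel_temperatures), shannon_entropy(pixel_temperatures, bin_width)

# Hypothetical roof pixel temperatures in degrees Celsius.
roof_pixels = [10.2, 10.7, 11.1, 11.9]
print(roof_descriptors(roof_pixels))
```

A roof with a uniform temperature gives zero entropy, while a roof with scattered warm patches spreads its pixels over more bins and scores higher.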
Quantitative Machine Learning Analysis of Brain MRI Morphology throughout Aging.
Shamir, Lior; Long, Joe
2016-01-01
While cognition is clearly affected by aging, it is unclear whether the process of brain aging is driven solely by accumulation of environmental damage, or involves biological pathways. We applied quantitative image analysis to profile the alteration of brain tissues during aging. A dataset of 463 brain MRI images taken from a cohort of 416 subjects was analyzed using a large set of low-level numerical image content descriptors computed from the entire brain MRI images. The correlation between the numerical image content descriptors and the age was computed, and the alterations of the brain tissues during aging were quantified and profiled using machine learning. The comprehensive set of global image content descriptors provides a high Pearson correlation of ~0.9822 with the chronological age, indicating that the machine learning analysis of global features is sensitive to the age of the subjects. Profiling of the predicted age shows several periods of mild changes, separated by shorter periods of more rapid alterations. The periods with the most rapid changes were around the age of 55, and around the age of 65. The results show that the process of brain aging is not linear, and exhibits short periods of rapid aging separated by periods of milder change. These results are in agreement with patterns observed in cognitive decline, mental health status, and general human aging, suggesting that brain aging might not be driven solely by accumulation of environmental damage. Code and data used in the experiments are publicly available.
Analyzing Molecular Clouds with the Spectral Correlation Function
NASA Astrophysics Data System (ADS)
Rosolowsky, E. W.; Goodman, A. A.; Williams, J. P.; Wilner, D. J.
1997-12-01
The Spectral Correlation Function (SCF) is a new data analysis algorithm that measures how the properties of spectra vary from position to position in a spectral-line map. For each spectrum in a data cube, the SCF measures the "difference" between that spectrum and a specified subset of its neighbors. This algorithm is intended for use on both simulated and observed position-position-velocity data cubes. In initial tests of the SCF, we have shown that a histogram of the SCF for a map is a good descriptor of the spatial-velocity distribution of material. In one test, we compare the SCF distributions for: 1) a real data cube; 2) a cube made from the real cube's spectra with randomized positions; and 3) the results of a preliminary MHD simulation by Gammie, Ostriker, and Stone. The results of the test show that the real cloud and the simulation are much closer to each other in their SCF distributions than is either to the randomized cube. We are now in the process of applying the SCF to a larger set of observed and simulated data cubes. Our ultimate aim is to use the SCF both on its own, as a descriptor of the spatial-kinetic properties of interstellar gas, and also as a tool for evaluating how well simulations resemble observations. Our expectation is that the SCF will be more discriminatory (less likely to produce a false match) than the data cube descriptors currently available.
NASA Astrophysics Data System (ADS)
Di Mauro, Biagio; Fava, Francesco; Busetto, Lorenzo; Crosta, Giovanni Franco; Colombo, Roberto
2013-04-01
In this study a method based on the analysis of MODerate-resolution Imaging Spectroradiometer (MODIS) time series is proposed to estimate the post-fire resilience of mountain vegetation (broadleaf forest and prairies) in the Italian Alps. Resilience is defined here as the ability of a dynamical system to counteract disturbances. It can be quantified by the amount of time the disturbed system takes to resume, in statistical terms, an ecological functionality comparable with its undisturbed behavior. Satellite images of the Normalized Difference Vegetation Index (NDVI) and of the Enhanced Vegetation Index (EVI) with a spatial resolution of 250 m and a temporal resolution of 16 days in the 2000-2012 time period were used. Wildfire affected areas in the Lombardy region between the years 2000 and 2010 were analysed. Only large fires (affected area >40 ha) were selected. For each burned area, an undisturbed adjacent control site was located. Data pre-processing consisted of smoothing the MODIS time series for noise removal, after which a double logistic function was fitted. Land surface phenology descriptors (proxies for growing season start/end/length and green biomass) were extracted in order to characterize the time evolution of the vegetation. Descriptors from a burned area were compared to those extracted from the respective control site by means of one-way analysis of variance. According to the number of subsequent years which exhibit a statistically meaningful difference between burned and control site, five classes of resilience were identified and a set of thematic maps was created for each descriptor. The same method was applied to all 84 aggregated events and to events aggregated by main land cover. The EVI proved more sensitive to fire impact than the NDVI. Analysis shows that fire causes both a reduction of the biomass and a variation in the phenology of the Alpine vegetation. Results suggest an average ecosystem resilience of 6-7 years. 
Moreover, broadleaf forest and prairies show different post-fire behavior in terms of land surface phenology descriptors. In addition to the above analysis, another method is proposed, which derives from the qualitative theory of dynamical systems. The (time dependent) spectral index of a burned area over the period of one year was plotted against its counterpart from the control site. Yearly plots (or scattergrams) before and after the fire were obtained. Each plot is a sequence of points on the plane, which are the vertices of a generally self-intersecting polygonal chain. Some geometrical descriptors were obtained from the yearly chains of each fire. Principal Components Analysis (PCA) of geometrical descriptors was applied to a set of case studies and the obtained results provide a system dynamics interpretation of the natural process.
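The double logistic function fitted to the vegetation index time series can be illustrated as below. The abstract does not give its functional form, so this sketch assumes a widely used parameterization with a baseline, an amplitude, and separate logistic terms for green-up and senescence; the parameter names and values are hypothetical.

```python
import math

def double_logistic(t, base, amplitude, rise_rate, rise_day, fall_rate, fall_day):
    """Vegetation index on day t: logistic green-up followed by logistic senescence."""
    green_up = 1.0 / (1.0 + math.exp(-rise_rate * (t - rise_day)))
    senescence = 1.0 / (1.0 + math.exp(fall_rate * (t - fall_day)))
    return base + amplitude * (green_up + senescence - 1.0)

# Hypothetical parameters: winter value 0.2, summer plateau near 0.8,
# green-up inflection on day 120 and senescence inflection on day 280.
params = dict(base=0.2, amplitude=0.6, rise_rate=0.1, rise_day=120,
              fall_rate=0.1, fall_day=280)

# Evaluate at 16-day steps, mirroring the temporal resolution of the composites.
curve = [(day, double_logistic(day, **params)) for day in range(1, 366, 16)]
for day, value in curve:
    print(day, round(value, 3))
```

In this parameterization the two inflection days serve as proxies for the start and end of the growing season, their difference for its length, and the amplitude for green biomass, which is the kind of phenology descriptor the abstract compares between burned and control sites.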
What Is a Hydrogen Bond? Resonance Covalency in the Supramolecular Domain
ERIC Educational Resources Information Center
Weinhold, Frank; Klein, Roger A.
2014-01-01
We address the broader conceptual and pedagogical implications of recent recommendations of the International Union of Pure and Applied Chemistry (IUPAC) concerning the re-definition of hydrogen bonding, drawing upon the recommended IUPAC statistical methodology of mutually correlated experimental and theoretical descriptors to operationally…
Marrero-Ponce, Yovani
2004-01-01
This report describes a new set of molecular descriptors of relevance to QSAR/QSPR studies and drug design, atom linear indices f_k(x_i). These atomic level chemical descriptors are based on the calculation of linear maps on ℝⁿ [f_k(x_i): ℝⁿ → ℝⁿ] in the canonical basis. In this context, the kth power of the molecular pseudograph's atom adjacency matrix [M^k(G)] denotes the matrix of f_k(x_i) with respect to the canonical basis. In addition, a local-fragment (atom-type) formalism was developed. The kth atom-type linear indices are calculated by summing the kth atom linear indices of all atoms of the same atom type in the molecules. Moreover, total (whole-molecule) linear indices are also proposed. This descriptor is a linear functional (linear form) on ℝⁿ. That is, the kth total linear index is a linear map from ℝⁿ to the scalars ℝ [f_k(x): ℝⁿ → ℝ]. Thus, the kth total linear indices are calculated by summing the atom linear indices of all atoms in the molecule. The features of the kth total and local linear indices are illustrated by examples of various types of molecular structures, including chain-lengthening, branching, heteroatoms-content, and multiple bonds. Additionally, the linear independence of the local linear indices to other 0D, 1D, 2D, and 3D molecular descriptors is demonstrated by using principal component analysis for 42 very heterogeneous molecules. Much redundancy and overlapping was found among total linear indices and most of the other structural indices presently in use in the QSPR/QSAR practice. On the contrary, the information carried by atom-type linear indices was strikingly different from that codified in most of the 229 0D-3D molecular descriptors used in this study. It is concluded that the local linear indices are independent indices containing important structural information to be used in QSPR/QSAR and drug design studies. 
In this sense, atom, atom-type, and total linear indices were used for the prediction of pIC50 values for the cleavage process of a set of flavone derivatives inhibitors of HIV-1 integrase. Quantitative models found are significant from a statistical point of view (R of 0.965, 0.902, and 0.927, respectively) and permit a clear interpretation of the studied properties in terms of the structural features of molecules. A LOO cross-validation procedure revealed that the regression models had a fairly good predictability (q2 of 0.679, 0.543, and 0.721, respectively). The comparison with other approaches reveals good behavior of the method proposed. The approach described in this paper appears to be an excellent alternative or guide for discovery and optimization of new lead compounds.
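The construction described in this abstract can be sketched directly: the kth atom linear indices are the components of M^k x, where M is the atom adjacency matrix and x a vector of atomic properties, and the atom-type and total indices are sums of those components. The example below uses a simple three-atom C-C-O chain with a plain (loop-free) adjacency matrix and placeholder property values; the choice of atomic property and all function names are assumptions, since the abstract does not specify them.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def atom_linear_indices(adjacency, atom_properties, k):
    """kth atom linear indices f_k(x_i): the components of M^k x."""
    result = list(atom_properties)
    for _ in range(k):
        result = mat_vec(adjacency, result)
    return result

def atom_type_linear_index(adjacency, atom_properties, k, atom_types, wanted_type):
    """Sum of the kth atom linear indices over atoms of one type."""
    indices = atom_linear_indices(adjacency, atom_properties, k)
    return sum(f for f, t in zip(indices, atom_types) if t == wanted_type)

def total_linear_index(adjacency, atom_properties, k):
    """Sum of the kth atom linear indices over all atoms in the molecule."""
    return sum(atom_linear_indices(adjacency, atom_properties, k))

# Hydrogen-suppressed C-C-O chain; property values are placeholders.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
atom_types = ["C", "C", "O"]
atom_properties = [1.0, 1.0, 2.0]

for k in range(3):
    print(k, atom_linear_indices(adjacency, atom_properties, k),
          total_linear_index(adjacency, atom_properties, k))
```

Because each multiplication by M pulls in the properties of atoms one bond further away, increasing k makes an atom's index reflect a progressively larger neighborhood.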
Development of structure-activity relationship for metal oxide nanoparticles
NASA Astrophysics Data System (ADS)
Liu, Rong; Zhang, Hai Yuan; Ji, Zhao Xia; Rallo, Robert; Xia, Tian; Chang, Chong Hyun; Nel, Andre; Cohen, Yoram
2013-05-01
Nanomaterial structure-activity relationships (nano-SARs) for metal oxide nanoparticles (NPs) toxicity were investigated using metrics based on dose-response analysis and consensus self-organizing map clustering. The NP cellular toxicity dataset included toxicity profiles consisting of seven different assays for human bronchial epithelial (BEAS-2B) and murine myeloid (RAW 264.7) cells, over a concentration range of 0.39-100 mg L-1 and exposure time up to 24 h, for twenty-four different metal oxide NPs. Various nano-SAR building models were evaluated, based on an initial pool of thirty NP descriptors. The conduction band energy and ionic index (often correlated with the hydration enthalpy) were identified as suitable NP descriptors that are consistent with suggested toxicity mechanisms for metal oxide NPs and metal ions. The best performing nano-SAR with the above two descriptors, built with support vector machine (SVM) model and of validated robustness, had a balanced classification accuracy of ~94%. An applicability domain for the present data was established with a reasonable confidence level of 80%. Given the potential role of nano-SARs in decision making, regarding the environmental impact of NPs, the class probabilities provided by the SVM nano-SAR enabled the construction of decision boundaries with respect to toxicity classification under different acceptance levels of false negative relative to false positive predictions. Electronic supplementary information (ESI) available. See DOI: 10.1039/c3nr01533e
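The balanced classification accuracy and the probability-based decision boundaries mentioned in the abstract above can be illustrated with a minimal sketch. It assumes binary labels (1 = toxic, 0 = non-toxic), that both classes occur in the data, and that class probabilities are already available from some classifier; the function names and the threshold values are hypothetical and are not taken from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = toxic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # assumes at least one toxic sample
    specificity = tn / (tn + fp)  # assumes at least one non-toxic sample
    return (sensitivity + specificity) / 2


def classify(prob_toxic, threshold=0.5):
    """Label a sample toxic when its class probability reaches the threshold.

    Lowering the threshold (a hypothetical knob) trades more false
    positives for fewer false negatives.
    """
    return [1 if p >= threshold else 0 for p in prob_toxic]


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    probs = [0.9, 0.7, 0.4, 0.1, 0.2, 0.3, 0.45, 0.6]
    for thr in (0.5, 0.35):
        print(thr, balanced_accuracy(y_true, classify(probs, thr)))
```

Unlike plain accuracy, the balanced form is not inflated when one class dominates the dataset.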
Previous modelling of the median lethal dose (oral rat LD50) has indicated that local class-based models yield better correlations than global models. We evaluated the hypothesis that dividing the dataset by pesticidal mechanisms would improve prediction accuracy. A linear discri...
Chaining direct memory access data transfer operations for compute nodes in a parallel computer
Archer, Charles J.; Blocksome, Michael A.
2010-09-28
Methods, systems, and products are disclosed for chaining DMA data transfer operations for compute nodes in a parallel computer that include: receiving, by an origin DMA engine on an origin node in an origin injection FIFO buffer for the origin DMA engine, a RGET data descriptor specifying a DMA transfer operation data descriptor on the origin node and a second RGET data descriptor on the origin node, the second RGET data descriptor specifying a target RGET data descriptor on the target node, the target RGET data descriptor specifying an additional DMA transfer operation data descriptor on the origin node; creating, by the origin DMA engine, an RGET packet in dependence upon the RGET data descriptor, the RGET packet containing the DMA transfer operation data descriptor and the second RGET data descriptor; and transferring, by the origin DMA engine to a target DMA engine on the target node, the RGET packet.
Replenishing data descriptors in a DMA injection FIFO buffer
Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Cernohous, Bob R [Rochester, MN; Heidelberger, Philip [Cortlandt Manor, NY; Kumar, Sameer [White Plains, NY; Parker, Jeffrey J [Rochester, MN]
2011-10-11
Methods, apparatus, and products are disclosed for replenishing data descriptors in a Direct Memory Access (`DMA`) injection first-in-first-out (`FIFO`) buffer that include: determining, by a messaging module on an origin compute node, whether a number of data descriptors in a DMA injection FIFO buffer exceeds a predetermined threshold, each data descriptor specifying an application message for transmission to a target compute node; queuing, by the messaging module, a plurality of new data descriptors in a pending descriptor queue if the number of the data descriptors in the DMA injection FIFO buffer exceeds the predetermined threshold; establishing, by the messaging module, interrupt criteria that specify when to replenish the injection FIFO buffer with the plurality of new data descriptors in the pending descriptor queue; and injecting, by the messaging module, the plurality of new data descriptors into the injection FIFO buffer in dependence upon the interrupt criteria.
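The replenishing scheme described above can be sketched as a toy model. This is only an illustration under stated assumptions: the class and attribute names are hypothetical, the interrupt criterion is modelled as a simple low-watermark on FIFO occupancy, and direct injection when the threshold is not exceeded is an assumption, since the abstract does not say what happens in that case.

```python
from collections import deque


class MessagingModule:
    """Toy model of a messaging module feeding a DMA injection FIFO."""

    def __init__(self, threshold, low_watermark):
        self.fifo = deque()                 # stands in for the injection FIFO buffer
        self.pending = deque()              # pending descriptor queue
        self.threshold = threshold          # predetermined threshold
        self.low_watermark = low_watermark  # hypothetical interrupt criterion
        self.interrupt_armed = False

    def submit(self, descriptors):
        """Queue descriptors if the FIFO is above the threshold, else inject."""
        if len(self.fifo) > self.threshold:
            self.pending.extend(descriptors)
            self.interrupt_armed = True     # establish the interrupt criteria
        else:
            # Assumption: below the threshold, descriptors go straight in.
            self.fifo.extend(descriptors)

    def consume(self):
        """Simulate the DMA engine processing one descriptor."""
        descriptor = self.fifo.popleft()
        if self.interrupt_armed and len(self.fifo) <= self.low_watermark:
            self._replenish()
        return descriptor

    def _replenish(self):
        """Move every pending descriptor into the injection FIFO."""
        self.fifo.extend(self.pending)
        self.pending.clear()
        self.interrupt_armed = False


if __name__ == "__main__":
    m = MessagingModule(threshold=2, low_watermark=1)
    m.submit(["a", "b", "c"])   # FIFO empty, injected directly
    m.submit(["d", "e"])        # FIFO above threshold, queued as pending
    m.consume()
    m.consume()                 # occupancy hits the watermark, replenish
    print(list(m.fifo), list(m.pending))
```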
Long, Jiang; Youli, Qiu; Yu, Li
2017-11-01
Twelve substituent descriptors, 17 quantum chemical descriptors and 1/T were selected to establish a quantitative structure-property relationship (QSPR) model of Henry's law constants for 7 polybrominated diphenyl ethers (PBDEs) at five different temperatures. Then, the lgH values of 202 congeners at different temperatures were predicted. The variation rule and regulating mechanism of lgH were studied from the perspectives of both quantum chemical descriptors and substituent characteristics. The R² values for the modeling and testing sets of the final QSPR model are 0.977 and 0.979, respectively, thus indicating good fit and predictive ability for Henry's law constants of PBDEs at different temperatures. The favorable hydrogen binding sites are the 5,5',6,6'-positions for high substituent congeners and the O atom of the ether bond for low substituent congeners, which affects the interaction between PBDEs and water molecules. lgH is negatively and linearly correlated with 1/T, and the variation trends of lgH with temperature are primarily regulated by individual substituent characteristics, wherein: the more substituents involved, the smaller the lgH. The significant sequence for the main effect of substituent positions is para>meta>ortho, where the ortho-positions are mainly involved in second-order interaction effect (64.01%). Having two substituents in the same ring also provides a significant effect, with 81.36% of second-order interaction effects, particularly where there is an adjacent distribution (55.02%). Copyright © 2017 Elsevier Inc. All rights reserved.
In silico design of anti-atherogenic biomaterials.
Lewis, Daniel R; Kholodovych, Vladyslav; Tomasini, Michael D; Abdelhamid, Dalia; Petersen, Latrisha K; Welsh, William J; Uhrich, Kathryn E; Moghe, Prabhas V
2013-10-01
Atherogenesis, the uncontrolled deposition of modified lipoproteins in inflamed arteries, serves as a focal trigger of cardiovascular disease (CVD). Polymeric biomaterials have been envisioned to counteract atherogenesis based on their ability to repress scavenger mediated uptake of oxidized lipoprotein (oxLDL) in macrophages. Following the conceptualization in our laboratories of a new library of amphiphilic macromolecules (AMs), assembled from sugar backbones, aliphatic chains and poly(ethylene glycol) tails, a more rational approach is necessary to parse the diverse features such as charge, hydrophobicity, sugar composition and stereochemistry. In this study, we advance a computational biomaterials design approach to screen and elucidate anti-atherogenic biomaterials with high efficacy. AMs were quantified in terms of not only 1D (molecular formula) and 2D (molecular connectivity) descriptors, but also new 3D (molecular geometry) descriptors of AMs modeled by coarse-grained molecular dynamics (MD) followed by all-atom MD simulations. Quantitative structure-activity relationship (QSAR) models for anti-atherogenic activity were then constructed by screening a total of 1164 descriptors against the corresponding, experimentally measured potency of AM inhibition of oxLDL uptake in human monocyte-derived macrophages. Five key descriptors were identified to provide a strong linear correlation between the predicted and observed anti-atherogenic activity values, and were then used to correctly forecast the efficacy of three newly designed AMs. Thus, a new ligand-based drug design framework was successfully adapted to computationally screen and design biomaterials with cardiovascular therapeutic properties. Copyright © 2013 Elsevier Ltd. All rights reserved.
Das, Rudra Narayan; Roy, Kunal; Popelier, Paul L A
2015-11-01
The present study explores the chemical attributes of diverse ionic liquids responsible for their cytotoxicity in a rat leukemia cell line (IPC-81) by developing predictive classification as well as regression-based mathematical models. Simple and interpretable descriptors derived from a two-dimensional representation of the chemical structures along with quantum topological molecular similarity indices have been used for model development, employing unambiguous modeling strategies that strictly obey the guidelines of the Organization for Economic Co-operation and Development (OECD) for quantitative structure-activity relationship (QSAR) analysis. The structure-toxicity relationships that emerged from both classification and regression-based models were in accordance with the findings of some previous studies. The models suggested that the cytotoxicity of ionic liquids is dependent on the cationic surfactant action, long alkyl side chains, cationic lipophilicity as well as aromaticity, the presence of a dialkylamino substituent at the 4-position of the pyridinium nucleus and a bulky anionic moiety. The models have been transparently presented in the form of equations, thus allowing their easy transferability in accordance with the OECD guidelines. The models have also been subjected to rigorous validation tests proving their predictive potential and can hence be used for designing novel and "greener" ionic liquids. The major strength of the present study lies in the use of a diverse and large dataset, use of simple reproducible descriptors and compliance with the OECD norms. Copyright © 2015 Elsevier Ltd. All rights reserved.
A Bayesian cluster analysis method for single-molecule localization microscopy data.
Griffié, Juliette; Shannon, Michael; Bromley, Claire L; Boelen, Lies; Burn, Garth L; Williamson, David J; Heard, Nicholas A; Cope, Andrew P; Owen, Dylan M; Rubin-Delanchy, Patrick
2016-12-01
Cell function is regulated by the spatiotemporal organization of the signaling machinery, and a key facet of this is molecular clustering. Here, we present a protocol for the analysis of clustering in data generated by 2D single-molecule localization microscopy (SMLM), for example, photoactivated localization microscopy (PALM) or stochastic optical reconstruction microscopy (STORM). Three features of such data can cause standard cluster analysis approaches to be ineffective: (i) the data take the form of a list of points rather than a pixel array; (ii) there is a non-negligible unclustered background density of points that must be accounted for; and (iii) each localization has an associated uncertainty in regard to its position. These issues are overcome using a Bayesian, model-based approach. Many possible cluster configurations are proposed and scored against a generative model, which assumes Gaussian clusters overlaid on a completely spatially random (CSR) background, before every point is scrambled by its localization precision. We present the process of generating simulated and experimental data that are suitable to our algorithm, the analysis itself, and the extraction and interpretation of key cluster descriptors such as the number of clusters, cluster radii and the number of localizations per cluster. Variations in these descriptors can be interpreted as arising from changes in the organization of the cellular nanoarchitecture. The protocol requires no specific programming ability, and the processing time for one data set, typically containing 30 regions of interest, is ∼18 h; user input takes ∼1 h.
Dostanić, J; Lončarević, D; Zlatar, M; Vlahović, F; Jovanović, D M
2016-10-05
A series of arylazo pyridone dyes was synthesized by changing the type of the substituent group in the diazo moiety, ranging from strong electron-donating to strong electron-withdrawing groups. The structural and electronic properties of the investigated dyes were calculated at the M062X/6-31+G(d,p) level of theory. The observed good linear correlations between atomic charges and Hammett σp constants provided a basis to discuss the transmission of electronic substituent effects through a dye framework. The reactivity of the synthesized dyes was tested through their decolorization efficiency in a TiO2 photocatalytic system (Degussa P-25). Quantitative structure-activity relationship analysis revealed a strong correlation between the reactivity of the investigated dyes and Hammett substituent constants. The reaction was facilitated by electron-withdrawing groups, and retarded by electron-donating ones. Quantum mechanical calculations were used to describe the mechanism of the photocatalytic oxidation reactions of the investigated dyes and to interpret their reactivities within the framework of Density Functional Theory (DFT). According to DFT-based reactivity descriptors, i.e. Fukui functions and local softness, the active site moves from the azo nitrogen atom linked to the benzene ring to the pyridone carbon atom linked to the azo bond, going from dyes with electron-donating groups to dyes with electron-withdrawing groups. Copyright © 2016 Elsevier B.V. All rights reserved.
Hybrid optimal descriptors as a tool to predict skin sensitization in accordance to OECD principles.
Toropova, Alla P; Toropov, Andrey A
2017-06-05
Skin sensitization (allergic contact dermatitis) is a widespread problem arising from the contact of chemicals with the skin. The detection of molecular features with an undesired effect on skin is a complex task, owing to unclear biochemical mechanisms and unclear conditions of action of chemicals on skin. The development of computational methods for estimation of this endpoint in order to reduce animal testing is recommended (Cosmetics Directive EC regulation 1907/2006; EU Regulation 1223/2009). The CORAL software (http://www.insilico.eu/coral) gives good predictive models for skin sensitization. The simplified molecular input-line entry system (SMILES) together with the molecular graph is used to represent the molecular structure for these models. So-called hybrid optimal descriptors are used to establish quantitative structure-activity relationships (QSARs). The aim of this study is the estimation of the predictive potential of the hybrid descriptors. Three different distributions into the training (≈70%), calibration (≈15%), and validation (≈15%) sets are studied. QSARs for these three distributions are built using the Monte Carlo technique. The statistical characteristics of these models for the external validation set are used as a measure of their predictive potential. The best model, according to the above criterion, is characterized by n(validation) = 29, r²(validation) = 0.8596, RMSE(validation) = 0.489. Mechanistic interpretation and domain of applicability for these models are defined. Copyright © 2017 Elsevier B.V. All rights reserved.
Anomaly Detection Based on Local Nearest Neighbor Distance Descriptor in Crowded Scenes
Hu, Shiqiang; Zhang, Huanlong; Luo, Lingkun
2014-01-01
We propose a novel local nearest neighbor distance (LNND) descriptor for anomaly detection in crowded scenes. Compared with the low-level feature descriptors commonly used in previous works, the LNND descriptor has two major advantages. First, the LNND descriptor efficiently incorporates spatial and temporal contextual information around the video event, which is important for detecting anomalous interaction among multiple events, while most existing feature descriptors only contain the information of a single event. Second, the LNND descriptor is a compact representation and its dimensionality is typically much lower than that of the low-level feature descriptor. Therefore, using the LNND descriptor for an anomaly detection method with offline training not only saves computation time and storage, but also avoids the negative aspects of high-dimensional feature descriptors. We validate the effectiveness of the LNND descriptor by conducting extensive experiments on different benchmark datasets. Experimental results show the promising performance of the LNND-based method against state-of-the-art methods. It is worth noting that the LNND-based approach requires fewer intermediate processing steps, without any subsequent processing such as smoothing, but achieves comparable or even better performance. PMID:25105164
Ruiz-Angel, M J; Carda-Broch, S; García-Alvarez-Coque, M C; Berthod, A
2004-03-19
Logarithm of retention factors (log k) of a group of 14 ionizable diuretics were correlated with the molecular (log P o/w) and apparent (log P(app)) octanol-water partition coefficients. The compounds were chromatographed using aqueous-organic (reversed-phase liquid chromatography, RPLC) and micellar-organic mobile phases (micellar liquid chromatography, MLC) with the anionic surfactant sodium dodecyl sulfate (SDS), in the pH range 3-7, and a conventional octadecylsilane column. Acetonitrile was used as the organic modifier in both modes. The quality of the correlations obtained for log P(app) at varying ionization degree confirms that this correction is required in the aqueous-organic mixtures. The correlation is less improved with SDS micellar media because the acid-base equilibriums are shifted towards higher pH values for acidic compounds. In micellar chromatography, an electrostatic interaction with charged solutes is added to hydrophobic forces; consequently, different correlations should be established for neutral and acidic compounds, and for basic compounds. Correlations between log k and the isocratic descriptors log k(w), log k(wm) (extrapolated retention to pure water in the aqueous-organic and micellar-organic systems, respectively), and psi0 (extrapolated mobile phase composition giving a k = 1 retention factor or twice the dead time), and between these descriptors and log P(app) were also satisfactory, although poorer than those between log k and log P(app) due to the extrapolation. The study shows that, in the particular case of the ionizable diuretics studied, classical RPLC gives better results than MLC with SDS in the retention hydrophobicity correlations.
Manfredini, Marco; Arginelli, Federica; Dunsby, Christopher; French, Paul; Talbot, Clifford; König, Karsten; Pellacani, Giovanni; Ponti, Giovanni; Seidenari, Stefania
2013-02-01
The aim of this study was to compare morphological aspects of basal cell carcinoma (BCC) as assessed by two different imaging methods: in vivo reflectance confocal microscopy (RCM) and multiphoton tomography with fluorescence lifetime imaging implementation (MPT-FLIM). The study comprised 16 BCCs for which a complete set of RCM and MPT-FLIM images was available. The presence of seven MPT-FLIM descriptors was evaluated. The presence of seven RCM equivalent parameters was scored according to their extension. Chi-squared test with Fisher's exact test and Spearman's rank correlation coefficient were determined between MPT-FLIM scores and adjusted-RCM scores. MPT-FLIM and RCM descriptors of BCC were coupled to match the descriptors that define the same pathological structures. The comparison included: Streaming and Aligned elongated cells, Streaming with multiple directions and Double alignment, Palisading (RCM) and Palisading (MPT-FLIM), Typical tumor islands and Cell islands surrounded by fibers, Dark silhouettes and Phantom islands, Plump bright cells and Melanophages, and Vessels (RCM) and Vessels (MPT-FLIM). The parameters that were significantly correlated were Melanophages/Plump Bright Cells, Aligned elongated cells/Streaming, Double alignment/Streaming with multiple directions, and Palisading (MPT-FLIM)/Palisading (RCM). According to our data, both methods are suitable to image BCC's features. The concordance between MPT-FLIM and RCM is high, with some limitations due to the technical differences between the two devices. The greatest difficulty when comparing the images generated by the two imaging modalities is their different field of view. © 2012 John Wiley & Sons A/S.
Santoro, Adriana Leandra; Carrilho, Emanuel; Lanças, Fernando Mauro; Montanari, Carlos Alberto
2016-06-10
The pharmacokinetic properties of flavonoids with differing degrees of lipophilicity were investigated using immobilized artificial membranes (IAMs) as the stationary phase in high performance liquid chromatography (HPLC). For each flavonoid compound, we investigated whether the type of column used affected the correlation between the retention factors and the calculated octanol/water partition coefficient (log Poct). Three-dimensional (3D) molecular descriptors were calculated from the molecular structure of each compound using i) VolSurf software, ii) the GRID method (computational procedure for determining energetically favorable binding sites in molecules of known structure using a probe for calculating the 3D molecular interaction fields, between the probe and the molecule), and iii) the relationship between partition and molecular structure, analyzed in terms of physicochemical descriptors. The VolSurf built-in Caco-2 model was used to estimate compound permeability. The extent to which the datasets obtained from different columns differ both from each other and from both the calculated log Poct and the predicted permeability in Caco-2 cells was examined by principal component analysis (PCA). The immobilized membrane partition coefficients (kIAM) were analyzed using molecular descriptors in partial least squares (PLS) regression, and a quantitative structure-retention relationship was generated for the chromatographic retention in the cholesterol column. The cholesterol column provided the best correlation with the permeability predicted by the Caco-2 cell model, and a well-fitting model with high predictive power was obtained for its retention data (R² = 0.96 and Q² = 0.85 with four latent variables). Copyright © 2015 Elsevier B.V. All rights reserved.
Descriptors used to define running-related musculoskeletal injury: a systematic review.
Yamato, Tiê Parma; Saragiotto, Bruno Tirotti; Hespanhol Junior, Luiz Carlos; Yeung, Simon S; Lopes, Alexandre Dias
2015-05-01
Systematic review. To systematically review the descriptors used to define running-related musculoskeletal injury and to analyze the implications of different definitions on the results of studies. Studies have developed their own definitions of running-related musculoskeletal injuries based on different criteria. This may affect the rates of injury, which can be overestimated or underestimated due to the lack of a standard definition. Searches were conducted in the Embase, PubMed, CINAHL, SPORTDiscus, LILACS, and SciELO databases, without limits on date of publication and language. Only articles that reported a definition of running-related injury were included. The definitions were classified according to 3 domains and subcategories: (1) presence of physical complaint (symptom, body system involved, region), (2) interruption of training or competition (primary sports involved, extent of injury, extent of limitation, interruption, period of injury), and (3) need for medical assistance. Spearman rank correlation was performed to evaluate the correlation between the completeness of definitions and the rates of injury reported in the studies. A total of 48 articles were included. Most studies described more than half of the subcategories, but with no standardization between the terms used within each category, showing that there is no consensus for a definition. The injury rates ranged between 3% and 85%, and tended to increase with less specific definitions. The descriptors commonly used by researchers to define a running-related injury vary between studies and may affect the rates of injuries. The lack of a standardized definition hinders comparison between studies and rates of injuries.
Revealing plant cryptotypes: defining meaningful phenotypes among infinite traits.
Chitwood, Daniel H; Topp, Christopher N
2015-04-01
The plant phenotype is infinite. Plants vary morphologically and molecularly over developmental time, in response to the environment, and genetically. Exhaustive phenotyping remains not only out of reach, but is also the limiting factor to interpreting the wealth of genetic information currently available. Although phenotyping methods are always improving, an impasse remains: even if we could measure the entirety of phenotype, how would we interpret it? We propose the concept of cryptotype to describe latent, multivariate phenotypes that maximize the separation of a priori classes. Whether the infinite points comprising a leaf outline or shape descriptors defining root architecture, statistical methods to discern the quantitative essence of an organism will be required as we approach measuring the totality of phenotype. Copyright © 2015 Elsevier Ltd. All rights reserved.
Nonparametric regression applied to quantitative structure-activity relationships
Constans; Hirst
2000-03-01
Several nonparametric regressors have been applied to modeling quantitative structure-activity relationship (QSAR) data. The simplest regressor, the Nadaraya-Watson, was assessed in a genuine multivariate setting. Other regressors, the local linear and the shifted Nadaraya-Watson, were implemented within additive models, a computationally more expedient approach, better suited for low-density designs. Performances were benchmarked against the nonlinear method of smoothing splines. A linear reference point was provided by multilinear regression (MLR). Variable selection was explored using systematic combinations of different variables and combinations of principal components. For the data set examined, 47 inhibitors of dopamine beta-hydroxylase, the additive nonparametric regressors have greater predictive accuracy (as measured by the mean absolute error of the predictions or the Pearson correlation in cross-validation trials) than MLR. The use of principal components did not improve the performance of the nonparametric regressors over use of the original descriptors, since the original descriptors are not strongly correlated. It remains to be seen if the nonparametric regressors can be successfully coupled with better variable selection and dimensionality reduction in the context of high-dimensional QSARs.
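For readers unfamiliar with the Nadaraya-Watson regressor named above, the following is a minimal multivariate sketch. It assumes a Gaussian kernel with a single shared bandwidth, which is a common choice but is not stated in the abstract; the function name and the toy data are hypothetical.

```python
import math


def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Kernel-weighted average of training responses at a query point.

    x_train: list of descriptor vectors, y_train: list of activities.
    Assumes a Gaussian kernel (an illustrative choice). If the query is
    very far from every training point, all weights can underflow to
    zero and the division below will fail.
    """
    weights = []
    for x in x_train:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_query))
        weights.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, y_train)) / total


if __name__ == "__main__":
    descriptors = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
    activities = [1.2, 0.8, 2.5]
    print(nadaraya_watson(descriptors, activities, [1.0, 1.0], bandwidth=1.0))
```

A small bandwidth makes the prediction follow the nearest compounds closely, while a large one pulls it toward the global mean of the training activities.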
Correlation between the pattern volatiles and the overall aroma of wild edible mushrooms.
de Pinho, P Guedes; Ribeiro, Bárbara; Gonçalves, Rui F; Baptista, Paula; Valentão, Patrícia; Seabra, Rosa M; Andrade, Paula B
2008-03-12
Volatile and semivolatile components of 11 wild edible mushrooms, Suillus bellini, Suillus luteus, Suillus granulatus, Tricholomopsis rutilans, Hygrophorus agathosmus, Amanita rubescens, Russula cyanoxantha, Boletus edulis, Tricholoma equestre, Fistulina hepatica, and Cantharellus cibarius, were determined by headspace solid-phase microextraction (HS-SPME) and by liquid extraction combined with gas chromatography-mass spectrometry (GC-MS). Fifty volatile and nonvolatile components were formally identified and 13 others were tentatively identified. Using sensorial analysis, the descriptors "mushroomlike", "farm-feed", "floral", "honeylike", "hay-herb", and "nutty" were obtained. A correlation between sensory descriptors and volatiles was observed by applying multivariate analysis (principal component analysis and agglomerative hierarchic cluster analysis) to the sensorial and chemical data. The studied edible mushrooms can be divided into three groups. One of them is rich in C8 derivatives, such as 3-octanol, 1-octen-3-ol, trans-2-octen-1-ol, 3-octanone, and 1-octen-3-one; another one is rich in terpenic volatile compounds; and the last one is rich in methional. The presence and contents of these compounds contribute considerably to the sensory characteristics of the analyzed species.
QSAR Study and Molecular Design of Open-Chain Enaminones as Anticonvulsant Agents
Garro Martinez, Juan C.; Duchowicz, Pablo R.; Estrada, Mario R.; Zamarbide, Graciela N.; Castro, Eduardo A.
2011-01-01
The present work employs the QSAR formalism to predict the ED50 anticonvulsant activity of ringed-enaminones, in order to apply these relationships to the prediction of unknown open-chain compounds containing the same types of functional groups in their molecular structure. Two different modeling approaches are applied with the purpose of comparing the consistency of our results: (a) the search for molecular descriptors via multivariable linear regressions; and (b) the calculation of flexible descriptors with the CORAL (CORrelation And Logic) program. Among the results found, we propose some potent candidate open-chain enaminones having ED50 values lower than 10 mg·kg−1 for corresponding pharmacological studies. These compounds are classified as Class 1 and Class 2 according to the Anticonvulsant Selection Project. PMID:22272137
Dyekjaer, Jane Dannow; Jónsdóttir, Svava Osk
2004-01-22
Quantitative Structure-Property Relationships (QSPR) have been developed for a series of monosaccharides, including the physical properties of partial molar heat capacity, heat of solution, melting point, heat of fusion, glass-transition temperature, and solid state density. The models were based on molecular descriptors obtained from molecular mechanics and quantum chemical calculations, combined with other types of descriptors. Saccharides exhibit a large degree of conformational flexibility, therefore a methodology for selecting the energetically most favorable conformers has been developed, and was used for the development of the QSPR models. In most cases good correlations were obtained for monosaccharides. For five of the properties predictions were made for disaccharides, and the predicted values for the partial molar heat capacities were in excellent agreement with experimental values.
Nowik, Witold; Héron, Sylvie; Bonose, Myriam; Tchapla, Alain
2013-10-07
A comparison of chromatograms obtained in a series of separation conditions for a given complex mixture may be done with a series of chromatographic descriptors. In this study, we used two descriptors: the number of critical pairs and symmetry of peaks, further rescaled and converted to the corresponding critical pairs' coefficient (CPc) and symmetry coefficient (Sc). Considering the difficulty of appreciating global separation quality using CPc and Sc criteria separately, as their respective values are usually uncorrelated, a double-criteria cross-evaluation system was required. For that purpose we tested the commonly used multi-criteria decision-making method, Derringer's desirability function (D), as well as the recently introduced sum of ranking differences (SRD). To facilitate the graphical comparison of both approaches, the desirability function (D) was used in the inverse form (Dinv). The advantages and drawbacks of both evaluation methods, especially the respective under- or over-evaluation of outliers, caused us to introduce a new ranking approach, separation system suitability (3S). The obtained suitability rankings for the three tested approaches (Dinv, SRD and 3S) are different; nevertheless, 3S appears to be the most balanced and the easiest to interpret as well. The approach developed for selection of suitable systems was applied to the problem of separation of complex mixtures through the analysis of a series of standards of anthraquinone derivatives. To judge the pertinence of this evaluation, a sample containing a number of natural anthraquinones extracted from the bark of Indian mulberry (Morinda citrifolia) was analysed. In conclusion, the proposed methodology for the cross-evaluation of the series of chromatograms using single specific descriptors (CPc and Sc) through a global composite descriptor (3S) significantly simplifies the decision as to which separation systems are the most suitable for the separation of complex target mixtures of compounds.
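As background for the Derringer desirability function (D) mentioned above, here is a minimal sketch of its usual form: each criterion is mapped to an individual desirability in [0, 1], and the overall D is their geometric mean. The one-sided "larger is better" transform, the bounds and the example values are illustrative assumptions; the abstract does not give the authors' exact rescaling of CPc and Sc, nor the definition of Dinv, so neither is reproduced here.

```python
import math


def desirability_larger_is_better(y, low, high, weight=1.0):
    """One-sided Derringer-type transform: 0 at or below low, 1 at or above high."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** weight


def overall_desirability(individual):
    """Geometric mean of individual desirabilities.

    A single zero makes the overall value zero, which is the property that
    penalises systems failing on any one criterion.
    """
    if any(d == 0.0 for d in individual):
        return 0.0
    return math.exp(sum(math.log(d) for d in individual) / len(individual))


if __name__ == "__main__":
    # Hypothetical coefficients for one separation system, already in [0, 1].
    cpc, sc = 0.64, 0.81
    print(overall_desirability([cpc, sc]))
```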
Roth, Michal
2016-12-06
High-pressure phase behavior of systems containing water, carbon dioxide and organics has been important in several environment- and energy-related fields including carbon capture and storage, CO2 sequestration and CO2-assisted enhanced oil recovery. Here, partition coefficients (K-factors) of organic solutes between water and supercritical carbon dioxide have been correlated with extended linear solvation energy relationships (LSERs). In addition to the Abraham molecular descriptors of the solutes, the explanatory variables also include the logarithm of solute vapor pressure, the solubility parameters of carbon dioxide and water, and the internal pressure of water. This is the first attempt to include also the properties of water as explanatory variables in LSER correlations of K-factor data in CO2-water-organic systems. Increasing values of the solute hydrogen bond acidity, the solute hydrogen bond basicity, the solute dipolarity/polarizability, the internal pressure of water and the solubility parameter of water all tend to reduce the K-factor, that is, to favor the solute partitioning to the water-rich phase. On the contrary, increasing values of the solute characteristic volume, the solute vapor pressure and the solubility parameter of CO2 tend to raise the K-factor, that is, to favor the solute partitioning to the CO2-rich phase.
Incidence of neoplasms in the most prevalent autoimmune rheumatic diseases: a systematic review.
Machado, Roberta Ismael Lacerda; Braz, Alessandra de Sousa; Freire, Eutilia Andrade Medeiros
2014-01-01
This article is a systematic review of the literature on the coexistence of cancer and autoimmune rheumatic diseases, their main associations, the cancers involved and possible associated risk factors, with emphasis on existing population-based studies; it also examines how this coexistence relates to the drugs used in the treatment of autoimmune diseases. A search was conducted for scientific articles indexed in Cochrane/BVS, Pubmed/Medline and Scielo/Lilacs in the period from 2002 to 2012. The IBICT (Brazilian digital library of theses and dissertations) was also consulted, with descriptors in Portuguese and English for "Systemic sclerosis", "Rheumatoid Arthritis", "Systemic Lupus Erythematosus" and "Sjögren's syndrome", each combined with the descriptor "neoplasms" using AND. In the IBICT database, one thesis and one dissertation for the descriptor SLE met the inclusion criteria, none did for RA, and one thesis did for SS. The Lilacs/Scielo search found two articles on "Rheumatoid Arthritis" AND "neoplasms". In Pubmed/Medline the initial search resulted in 118 articles, of which 41 were selected. The review noted the relationship between cancer and autoimmune rheumatic diseases, both as a risk factor and as a protective factor, although the pathophysiological mechanisms are not known.
A Global Covariance Descriptor for Nuclear Atypia Scoring in Breast Histopathology Images.
Khan, Adnan Mujahid; Sirinukunwattana, Korsuk; Rajpoot, Nasir
2015-09-01
Nuclear atypia scoring is a diagnostic measure commonly used to assess tumor grade of various cancers, including breast cancer. It provides a quantitative measure of deviation in visual appearance of cell nuclei from those in normal epithelial cells. In this paper, we present a novel image-level descriptor for nuclear atypia scoring in breast cancer histopathology images. The method is based on the region covariance descriptor that has recently become a popular method in various computer vision applications. The descriptor in its original form is not suitable for classification of histopathology images as cancerous histopathology images tend to possess diversely heterogeneous regions in a single field of view. Our proposed image-level descriptor, which we term as the geodesic mean of region covariance descriptors, possesses all the attractive properties of covariance descriptors lending itself to tractable geodesic-distance-based k-nearest neighbor classification using efficient kernels. The experimental results suggest that the proposed image descriptor yields high classification accuracy compared to a variety of widely used image-level descriptors.
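The region covariance descriptor on which this abstract builds can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the per-pixel feature set (column, row, intensity and absolute gradients) is an assumption chosen for brevity, the patch is synthetic, and the geodesic mean over several regions is not implemented.

```python
import numpy as np

def region_covariance(patch):
    """Covariance matrix of per-pixel feature vectors within a grayscale region.

    Assumed features per pixel: column, row, intensity, |dI/dx|, |dI/dy|.
    """
    patch = np.asarray(patch, dtype=float)
    rows, cols = np.indices(patch.shape)
    # np.gradient returns derivatives along axis 0 (rows) then axis 1 (columns).
    grad_y, grad_x = np.gradient(patch)
    features = np.stack([
        cols.ravel(),
        rows.ravel(),
        patch.ravel(),
        np.abs(grad_x).ravel(),
        np.abs(grad_y).ravel(),
    ])
    # Each row of `features` is one variable, each column one pixel.
    return np.cov(features)

# Synthetic 16x16 patch used only to exercise the function.
rng = np.random.default_rng(0)
descriptor = region_covariance(rng.random((16, 16)))
print(descriptor.shape)
```

The result is a small symmetric positive semi-definite matrix whose size depends only on the number of features, not on the region size, which is why distances between such matrices are usually measured on the manifold of covariance matrices rather than entrywise.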
Modern pollen data from the Canadian Arctic, 1972-1973
NASA Astrophysics Data System (ADS)
Nichols, Harvey; Stolze, Susann
2017-05-01
This data descriptor reports results of a 1972-73 baseline study of modern pollen deposition in the Canadian Arctic, originally intended to aid interpretation of Holocene pollen diagrams from that region, with a particular focus on the arctic tree-line. The data set is geographically unique due to its extent, and allows the assessment of the effects of modern climate change on northern ecosystems, including fluctuations of the arctic tree-line. Repeated sampling was conducted along an interior transect at 29 sites from the Boreal Forest to the High Arctic, with five additional coastal sites covering a total distance of 3,200 km. Static pollen samplers captured both local pollen and long-distance pollen wind-blown from the Boreal Forest. Moss and lichen polsters provided multi-year pollen fallout to assess the effectiveness of the static pollen samplers. The local vegetation was recorded at each site. This descriptor provides information on data archived at the World Data Center PANGAEA, which includes spreadsheets detailing site and sample information as well as raw and processed pollen data obtained on over 500 samples.
Modern pollen data from the Canadian Arctic, 1972–1973
Nichols, Harvey; Stolze, Susann
2017-01-01
This data descriptor reports results of a 1972–73 baseline study of modern pollen deposition in the Canadian Arctic, originally intended to aid interpretation of Holocene pollen diagrams from that region, with a particular focus on the arctic tree-line. The data set is geographically unique due to its extent, and allows the assessment of the effects of modern climate change on northern ecosystems, including fluctuations of the arctic tree-line. Repeated sampling was conducted along an interior transect at 29 sites from the Boreal Forest to the High Arctic, with five additional coastal sites covering a total distance of 3,200 km. Static pollen samplers captured both local pollen and long-distance pollen wind-blown from the Boreal Forest. Moss and lichen polsters provided multi-year pollen fallout to assess the effectiveness of the static pollen samplers. The local vegetation was recorded at each site. This descriptor provides information on data archived at the World Data Center PANGAEA, which includes spreadsheets detailing site and sample information as well as raw and processed pollen data obtained on over 500 samples. PMID:28509898
A flexible motif search technique based on generalized profiles.
Bucher, P; Karplus, K; Moeri, N; Hofmann, K
1996-03-01
A flexible motif search technique is presented which has two major components: (1) a generalized profile syntax serving as a motif definition language; and (2) a motif search method specifically adapted to the problem of finding multiple instances of a motif in the same sequence. The new profile structure, which is the core of the generalized profile syntax, combines the functions of a variety of motif descriptors implemented in other methods, including regular expression-like patterns, weight matrices, previously used profiles, and certain types of hidden Markov models (HMMs). The relationship between generalized profiles and other biomolecular motif descriptors is analyzed in detail, with special attention to HMMs. Generalized profiles are shown to be equivalent to a particular class of HMMs, and conversion procedures in both directions are given. The conversion procedures provide an interpretation for local alignment in the framework of stochastic models, allowing for clear, simple significance tests. A mathematical statement of the motif search problem defines the new method exactly without linking it to a specific algorithmic solution. Part of the definition includes a new definition of disjointness of alignments.
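Weight matrices are one of the motif descriptors that the generalized profile is said to subsume. The sketch below scans a sequence with an ungapped weight matrix and keeps several non-overlapping instances of the motif. It is only an illustration: the matrix, the mismatch penalty and the greedy selection rule are assumptions, and the greedy rule is not the definition of disjointness introduced in the paper.

```python
def scan_weight_matrix(sequence, matrix, threshold, mismatch=-1.0):
    """Return non-overlapping (start, score) hits of an ungapped weight matrix.

    `matrix` is a list of dicts, one per motif position, mapping residues to
    scores; residues missing from a dict receive the `mismatch` score.
    """
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = sum(matrix[i].get(sequence[start + i], mismatch)
                    for i in range(width))
        if score >= threshold:
            hits.append((score, start))
    # Greedy selection: best score first, ties broken by leftmost position,
    # skipping any hit that overlaps a hit already kept.
    chosen = []
    for score, start in sorted(hits, key=lambda h: (-h[0], h[1])):
        if all(abs(start - kept_start) >= width for _, kept_start in chosen):
            chosen.append((score, start))
    return sorted((start, score) for score, start in chosen)

# Hypothetical four-column matrix favouring the motif TATA.
tata = [{"T": 2.0}, {"A": 2.0}, {"T": 2.0}, {"A": 2.0}]
print(scan_weight_matrix("GGTATACCTATAGG", tata, threshold=8.0))
```

Keeping more than one hit per sequence is the point of the example; a conventional best-hit search would report only a single position.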
Streit, M; Reinhardt, F; Thaller, G; Bennewitz, J
2013-01-01
Genotype by environment interaction (G × E) has been widely reported in dairy cattle. If the environment can be measured on a continuous scale, reaction norms can be applied to study G × E. The average herd milk production level has frequently been used as an environmental descriptor because it is influenced by the level of feeding or the feeding regimen. Another important environmental factor is the level of udder health and hygiene, for which the average herd somatic cell count might be a descriptor. In the present study, we conducted a genome-wide association analysis to identify single nucleotide polymorphisms (SNP) that affect intercept and slope of milk protein yield reaction norms when using the average herd test-day solution for somatic cell score as an environmental descriptor. Sire estimates for intercept and slope of the reaction norms were calculated from around 12 million daughter records, using linear reaction norm models. Sires were genotyped for ~54,000 SNP. The sire estimates were used as observations in the association analysis, using 1,797 sires. Significant SNP were confirmed in an independent validation set consisting of 500 sires. A known major gene affecting protein yield was included as a covariable in the statistical model. Sixty (21) SNP were confirmed for intercept with P ≤ 0.01 (P ≤ 0.001) in the validation set, and 28 and 11 SNP, respectively, were confirmed for slope. Most but not all SNP affecting slope also affected intercept. Comparison with an earlier study revealed that SNP affecting slope were, in general, also significant for slope when the environment was modeled by the average herd milk production level, although the two environmental descriptors were poorly correlated. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Zhu, Hao; Ye, Lin; Richard, Ann; Golbraikh, Alexander; Wright, Fred A.; Rusyn, Ivan; Tropsha, Alexander
2009-01-01
Background: Accurate prediction of in vivo toxicity from in vitro testing is a challenging problem. Large public–private consortia have been formed with the goal of improving chemical safety assessment by the means of high-throughput screening. Objective: A wealth of available biological data requires new computational approaches to link chemical structure, in vitro data, and potential adverse health effects. Methods and results: A database containing experimental cytotoxicity values for in vitro half-maximal inhibitory concentration (IC50) and in vivo rodent median lethal dose (LD50) for more than 300 chemicals was compiled by Zentralstelle zur Erfassung und Bewertung von Ersatz- und Ergaenzungsmethoden zum Tierversuch (ZEBET; National Center for Documentation and Evaluation of Alternative Methods to Animal Experiments). The application of conventional quantitative structure–activity relationship (QSAR) modeling approaches to predict mouse or rat acute LD50 values from chemical descriptors of ZEBET compounds yielded no statistically significant models. The analysis of these data showed no significant correlation between IC50 and LD50. However, a linear IC50 versus LD50 correlation could be established for a fraction of compounds. To capitalize on this observation, we developed a novel two-step modeling approach as follows. First, all chemicals are partitioned into two groups based on the relationship between IC50 and LD50 values: One group comprises compounds with linear IC50 versus LD50 relationships, and another group comprises the remaining compounds. Second, we built conventional binary classification QSAR models to predict the group affiliation based on chemical descriptors only. Third, we developed k-nearest neighbor continuous QSAR models for each subclass to predict LD50 values from chemical descriptors. All models were extensively validated using special protocols.
Conclusions: The novelty of this modeling approach is that it uses the relationships between in vivo and in vitro data only to inform the initial construction of the hierarchical two-step QSAR models. Models resulting from this approach employ chemical descriptors only for external prediction of acute rodent toxicity. PMID:19672406
RED: a set of molecular descriptors based on Renyi entropy.
Delgado-Soler, Laura; Toral, Raul; Tomás, M Santos; Rubio-Martinez, Jaime
2009-11-01
New molecular descriptors, RED (Renyi entropy descriptors), based on the generalized entropies introduced by Renyi are presented. Topological descriptors based on molecular features have proven to be useful for describing molecular profiles. Renyi entropy is used as a variability measure to contract a feature-pair distribution composing the descriptor vector. The performance of RED descriptors was tested for the analysis of different sets of molecular distances, virtual screening, and pharmacological profiling. A free parameter of the Renyi entropy has been optimized for all the considered applications.
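For reference, the Renyi entropy of order alpha for a discrete distribution p is H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha), with the Shannon entropy as the limit alpha -> 1. The sketch below computes it for a plain probability vector; how the RED descriptors build their feature-pair distributions is not described in the abstract, so the input here is a hypothetical example, and alpha plays the role of the free parameter mentioned above.

```python
import math

def renyi_entropy(probabilities, alpha):
    """Renyi entropy (natural log) of a discrete distribution, order alpha >= 0."""
    p = [x for x in probabilities if x > 0.0]
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    if alpha < 0.0:
        raise ValueError("alpha must be non-negative")
    if abs(alpha - 1.0) < 1e-12:
        # Limit alpha -> 1 is the Shannon entropy.
        return -sum(x * math.log(x) for x in p)
    return math.log(sum(x ** alpha for x in p)) / (1.0 - alpha)

# Hypothetical distribution over three feature-pair bins.
example = [0.5, 0.25, 0.25]
for order in (0.5, 1.0, 2.0):
    print(order, renyi_entropy(example, order))
```

For a fixed distribution the value does not increase as alpha grows, so larger orders put more weight on the most populated bins.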
Self-pacing direct memory access data transfer operations for compute nodes in a parallel computer
Blocksome, Michael A
2015-02-17
Methods, apparatus, and products are disclosed for self-pacing DMA data transfer operations for nodes in a parallel computer that include: transferring, by an origin DMA on an origin node, an RTS message to a target node, the RTS message specifying a message on the origin node for transfer to the target node; receiving, in an origin injection FIFO for the origin DMA from a target DMA on the target node in response to transferring the RTS message, a target RGET descriptor followed by a DMA transfer operation descriptor, the DMA descriptor for transmitting a message portion to the target node, the target RGET descriptor specifying an origin RGET descriptor on the origin node that specifies an additional DMA descriptor for transmitting an additional message portion to the target node; processing, by the origin DMA, the target RGET descriptor; and processing, by the origin DMA, the DMA transfer operation descriptor.
Predicting membrane protein types by the LLDA algorithm.
Wang, Tong; Yang, Jie; Shen, Hong-Bin; Chou, Kuo-Chen
2008-01-01
Membrane proteins are generally classified into the following eight types: (1) type I transmembrane, (2) type II, (3) type III, (4) type IV, (5) multipass transmembrane, (6) lipid-chain-anchored membrane, (7) GPI-anchored membrane, and (8) peripheral membrane (K.C. Chou and H.B. Shen: BBRC, 2007, 360: 339-345). Knowing the type of an uncharacterized membrane protein often provides useful clues for finding its biological function and interaction process with other molecules in a biological system. With the explosion of protein sequences generated in the Post-Genomic Age, an automated method to deal with this challenge is urgently needed. Recently, the PsePSSM (Pseudo Position-Specific Score Matrix) descriptor was proposed by Chou and Shen (Biochem. Biophys. Res. Comm. 2007, 360, 339-345) to represent a protein sample. The advantage of the PsePSSM descriptor is that it can combine the evolution information and sequence-correlated information. However, incorporating all these effects into a descriptor may cause the "high dimension disaster". To overcome this problem, the fusion approach was adopted by Chou and Shen. Here, a completely different approach, the so-called LLDA (Local Linear Discriminant Analysis), is introduced to extract the key features from the high-dimensional PsePSSM space. The dimension-reduced descriptor vector thus obtained is a compact representation of the original high-dimensional vector. Our jackknife and independent dataset test results indicate that the LLDA approach is very promising for coping with complicated problems in biological systems, such as predicting the membrane protein type.
Mohamed Kamal, Rasha; Hussien Helal, Maha; Wessam, Rasha; Mahmoud Mansour, Sahar; Godda, Iman; Alieldin, Nelly
2015-06-01
To analyze the morphology and enhancement characteristics of breast lesions on contrast-enhanced spectral mammography (CESM) and to assess their impact on the differentiation between benign and malignant lesions. This ethics committee approved study included 168 consecutive patients with 211 breast lesions over 18 months. Lesions were classified as non-enhancing or enhancing, and the latter group was then subdivided into mass and non-mass lesions. Mass lesion descriptors included shape, margins, and pattern and degree of internal enhancement. Non-mass lesion descriptors included distribution, and pattern and degree of internal enhancement. The impact of each descriptor on diagnosis was individually assessed using the chi-square test, and its validity was compared in benign and malignant lesions. The overall performance of CESM was also calculated. The study included 102 benign (48.3%) and 109 malignant (51.7%) lesions. Enhancement was encountered in 145/211 (68.7%) lesions. These were further classified into enhancing mass (99/145, 68.3%) and non-mass lesions (46/145, 31.7%). Contrast uptake was significantly more frequent in malignant breast lesions (p value ≤ 0.001). Irregular mass lesions with intense and heterogeneous enhancement patterns correlated with a malignant pathology (p value ≤ 0.001). CESM showed an overall sensitivity of 88.99% and specificity of 83.33%. The positive and negative likelihood ratios were 5.34 and 0.13 respectively. The assessment of the morphology and enhancement characteristics of breast lesions on CESM enhances the performance of digital mammography in the differentiation between benign and malignant breast lesions. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
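The performance figures quoted in this abstract are linked by simple formulas, sketched below. The abstract reports only the totals (109 malignant, 102 benign) and the percentages; the confusion-matrix counts used here (97 true positives, 12 false negatives, 85 true negatives, 17 false positives) are an inference that happens to reproduce those percentages, not values taken from the paper.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and likelihood ratios from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # LR+ = sensitivity / (1 - specificity)
        "lr_positive": sensitivity / (1.0 - specificity),
        # LR- = (1 - sensitivity) / specificity
        "lr_negative": (1.0 - sensitivity) / specificity,
    }

# Inferred counts (assumption): 109 malignant and 102 benign lesions in total.
metrics = diagnostic_metrics(tp=97, fn=12, tn=85, fp=17)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the rounded values agree with the reported sensitivity, specificity and likelihood ratios.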
Grimm, Lars J; Johnson, David Y; Johnson, Karen S; Baker, Jay A; Soo, Mary Scott; Hwang, E Shelley; Ghate, Sujata V
2017-06-01
To determine the malignancy rate overall and for specific BI-RADS descriptors in women ≥70 years who undergo stereotactic biopsy for calcifications. We retrospectively reviewed 14,577 consecutive mammogram reports in 6839 women ≥70 years to collect 231 stereotactic biopsies of calcifications in 215 women. Cases with missing images or histopathology and calcifications associated with masses, distortion, or asymmetries were excluded. Three breast radiologists determined BI-RADS descriptors by majority. Histology, hormone receptor status, and lymph node status were correlated with BI-RADS descriptors. There were 131 (57 %) benign, 22 (10 %) atypia/lobular carcinomas in situ, 55 (24 %) ductal carcinomas in situ (DCIS), and 23 (10 %) invasive diagnoses. Twenty-seven (51 %) DCIS cases were high-grade. Five (22 %) invasive cases were high-grade, two (9 %) were triple-negative, and three (12 %) were node-positive. Malignancy was found in 49 % (50/103) of fine pleomorphic, 50 % (14/28) of fine linear, 25 % (10/40) of amorphous, 20 % (3/15) of round, 3 % (1/36) of coarse heterogeneous, and 0 % (0/9) of dystrophic calcifications. Among women ≥70 years that underwent stereotactic biopsy for calcifications only, we observed a high rate of malignancy. Additionally, coarse heterogeneous calcifications may warrant a probable benign designation. • Cancer rates of biopsied calcifications in women ≥70 years are high • Radiologists should not dismiss suspicious calcifications in older women • Coarse heterogeneous calcifications may warrant a probable benign designation.
A comparison between space-time video descriptors
NASA Astrophysics Data System (ADS)
Costantini, Luca; Capodiferro, Licia; Neri, Alessandro
2013-02-01
The description of space-time patches is a fundamental task in many applications such as video retrieval or classification. Each space-time patch can be described by using a set of orthogonal functions that represent a subspace, for example a sphere or a cylinder, within the patch. In this work, our aim is to investigate the differences between the spherical descriptors and the cylindrical descriptors. In order to compute the descriptors, the 3D spherical and cylindrical Zernike polynomials are employed. This is important because both the functions are based on the same family of polynomials, and only the symmetry is different. Our experimental results show that the cylindrical descriptor outperforms the spherical descriptor. However, the performances of the two descriptors are similar.
Physicochemical descriptors of aromatic character and their use in drug discovery.
Ritchie, Timothy J; Macdonald, Simon J F
2014-09-11
Published physicochemical descriptors of molecules that convey aromaticity-related character are reviewed in the context of drug design and discovery. Studies that have employed aromatic descriptors are discussed, and several descriptors are compared and contrasted.
Ishii, Kotaro; Iwata, Hiroyoshi; Oshika, Tetsuro
2011-11-04
To evaluate changes in eyeball shape in emmetropization and myopic changes using magnetic resonance imaging (MRI) and elliptic Fourier descriptors (EFDs). The subjects were 105 patients (age range, 1 month-19 years) who underwent head MRI. The refractive error was determined in 30 patients, and eyeball shape was expressed numerically by principal components analysis of standardized EFDs. In the first principal component (PC1; the oblate-to-prolate change), the proportion of variance/total variance in the development of the eyeball shape was 76%. In all subjects, PC1 showed a significant correlation with age (Pearson r = -0.314; P = 0.001), axial length (AL, r = -0.378; P < 0.001), width (r = -0.200, P = 0.0401), oblateness (r = 0.657, P < 0.001), and spherical equivalent refraction (SER, r = 0.438; P = 0.0146; n = 30). In the group containing patients aged 1 month to 6 years (n = 49), PC1 showed a significant correlation with age (r = -0.366; P = 0.0093). In the group containing patients aged 7 to 19 years (n = 56), PC1 showed a significant correlation with SER (r = 0.640; P = 0.0063). The main deformation pattern in the development of the eyeball shape from oblate to prolate was clarified by quantitative analysis based on EFDs. The results showed clear differences between age groups with regard to changes in the shape of the eyeball, the correlation between these changes, and refractive status changes.
Prediction of atmospheric degradation data for POPs by gene expression programming.
Luan, F; Si, H Z; Liu, H T; Wen, Y Y; Zhang, X Y
2008-01-01
Quantitative structure-activity relationship models for the prediction of the mean and the maximum atmospheric degradation half-life values of persistent organic pollutants were developed based on the linear heuristic method (HM) and non-linear gene expression programming (GEP). Molecular descriptors, calculated from the structures alone, were used to represent the characteristics of the compounds. HM was used both to pre-select the whole descriptor sets and to build the linear model. GEP yielded satisfactory prediction results: the square of the correlation coefficient r(2) was 0.80 and 0.81 for the mean and maximum half-life values of the test set, and the root mean square errors were 0.448 and 0.426, respectively. The results of this work indicate that the GEP is a very promising tool for non-linear approximations.
Courtiol, Alexandre; Ferdy, Jean Baptiste; Godelle, Bernard; Raymond, Michel; Claude, Julien
2010-05-01
Many studies use representations of human body outlines to study how individual characteristics, such as height and body mass, affect perception of body shape. These typically involve reality-based stimuli (e.g., pictures) or manipulated stimuli (e.g., drawings). These two classes of stimuli have important drawbacks that limit result interpretations. Realistic stimuli vary in terms of traits that are correlated, which makes it impossible to assess the effect of a single trait independently. In addition, manipulated stimuli usually do not represent realistic morphologies. We describe and examine a method based on elliptic Fourier descriptors to automatically predict and represent body outlines for a given set of predicted variables (e.g., sex, height, and body mass). We first estimate whether these predictive variables are significantly related to human outlines. We find that height and body mass significantly influence body shape. Unlike height, the effect of body mass on shape differs between sexes. Then, we show that we can easily build a regression model that creates hypothetical outlines for an arbitrary set of covariates. These statistically computed outlines are quite realistic and may be used as stimuli in future studies.
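Elliptic Fourier descriptors of a closed outline can be computed as sketched below. The sketch assumes the widely used Kuhl and Giardina formulation for a polygonal contour; the abstract does not say which formulation or normalization the authors used, and no size, rotation or starting-point normalization is applied here. The circular test outline is synthetic.

```python
import numpy as np

def elliptic_fourier_descriptors(contour, order=10):
    """Coefficients (a_n, b_n, c_n, d_n), n = 1..order, of a closed polygon.

    `contour` is an (N, 2) array of distinct consecutive (x, y) points; the
    polygon is closed automatically by joining the last point to the first.
    """
    contour = np.asarray(contour, dtype=float)
    closed = np.vstack([contour, contour[:1]])
    delta = np.diff(closed, axis=0)               # segment displacements
    dt = np.hypot(delta[:, 0], delta[:, 1])       # segment lengths
    t = np.concatenate([[0.0], np.cumsum(dt)])    # arc length at each vertex
    total = t[-1]                                 # perimeter
    coeffs = np.zeros((order, 4))
    for n in range(1, order + 1):
        phase = 2.0 * np.pi * n * t / total
        d_cos = np.diff(np.cos(phase))
        d_sin = np.diff(np.sin(phase))
        scale = total / (2.0 * n ** 2 * np.pi ** 2)
        coeffs[n - 1] = scale * np.array([
            np.sum(delta[:, 0] / dt * d_cos),     # a_n
            np.sum(delta[:, 0] / dt * d_sin),     # b_n
            np.sum(delta[:, 1] / dt * d_cos),     # c_n
            np.sum(delta[:, 1] / dt * d_sin),     # d_n
        ])
    return coeffs

# Synthetic outline: circle of radius 2 traced counter-clockwise from angle 0.
theta = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
circle = np.column_stack([2.0 * np.cos(theta), 2.0 * np.sin(theta)])
efd = elliptic_fourier_descriptors(circle, order=5)
print(np.round(efd[0], 3))
```

For the circle, only the first harmonic is appreciably non-zero, with a_1 and d_1 close to the radius. Regressing such coefficients on covariates and inverting the series is one way outlines for arbitrary covariate values could be reconstructed.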
Medeiros Turra, Kely; Pineda Rivelli, Diogo; Berlanga de Moraes Barros, Silvia; Mesquita Pasqualoto, Kerly Fernanda
2016-07-01
A receptor-independent (RI) four-dimensional quantitative structure-activity relationship (4D-QSAR) formalism was applied to a set of sixty-four β-N-biaryl ether sulfonamide hydroxamate derivatives, previously reported as potent inhibitors of matrix metalloproteinase subtype 9 (MMP-9). MMP-9 belongs to a group of enzymes related to the cleavage of several extracellular matrix components and has been associated with cancer invasiveness/metastasis. The best RI 4D-QSAR model was statistically significant (N=47; r(2)=0.91; q(2)=0.83; LSE=0.09; LOF=0.35; outliers=0). Leave-N-out (LNO) and y-randomization approaches indicated that the QSAR model was robust and presented no chance correlation, respectively. Furthermore, it also had good external predictability (82%) regarding the test set (N=17). In addition, the grid cell occupancy descriptors (GCOD) of the predicted bioactive conformation for the most potent inhibitor were successfully interpreted when docked into the MMP-9 active site. The 3D-pharmacophore findings were used to predict novel ligands and to explore the calculated MMP-9 binding affinity through a molecular docking procedure. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
[Glossary of terms used by radiologists in image processing].
Rolland, Y; Collorec, R; Bruno, A; Ramée, A; Morcet, N; Haigron, P
1995-01-01
We give the definition of 166 words used in image processing. Adaptivity, aliasing, analog-digital converter, analysis, approximation, arc, artifact, artificial intelligence, attribute, autocorrelation, bandwidth, boundary, brightness, calibration, class, classification, classify, centre, cluster, coding, color, compression, contrast, connectivity, convolution, correlation, data base, decision, decomposition, deconvolution, deduction, descriptor, detection, digitization, dilation, discontinuity, discretization, discrimination, disparity, display, distance, distortion, distribution dynamic, edge, energy, enhancement, entropy, erosion, estimation, event, extrapolation, feature, file, filter, filter floaters, fitting, Fourier transform, frequency, fusion, fuzzy, Gaussian, gradient, graph, gray level, group, growing, histogram, Hough transform, Hounsfield, image, impulse response, inertia, intensity, interpolation, interpretation, invariance, isotropy, iterative, JPEG, knowledge base, label, laplacian, learning, least squares, likelihood, matching, Markov field, mask, matching, mathematical morphology, merge (to), MIP, median, minimization, model, moiré, moment, MPEG, neural network, neuron, node, noise, norm, normal, operator, optical system, optimization, orthogonal, parametric, pattern recognition, periodicity, photometry, pixel, polygon, polynomial, prediction, pulsation, pyramidal, quantization, raster, reconstruction, recursive, region, rendering, representation space, resolution, restoration, robustness, ROC, thinning, transform, sampling, saturation, scene analysis, segmentation, separable function, sequential, smoothing, spline, split (to), shape, threshold, tree, signal, speckle, spectrum, spline, stationarity, statistical, stochastic, structuring element, support, syntactic, synthesis, texture, truncation, variance, vision, voxel, windowing.
Descriptors of sensation confirm the multidimensional nature of desire to void.
Das, Rebekah; Buckley, Jonathan D; Williams, Marie T
2015-02-01
To collect and categorize descriptors of "desire to void" sensation, determine the reliability of descriptor categories and assess whether descriptor categories discriminate between people with and without symptoms of overactive bladder. This observational, repeated measures study involved 64 Australian volunteers (47 female), aged 50 years or more, with and without symptoms of overactive bladder. Descriptors of desire to void sensation were derived from a structured interview (conducted on two occasions, 1 week apart). Descriptors were recorded verbatim and categorized in a three-stage process. Overactive bladder status was determined by the Overactive Bladder Awareness Tool and the Overactive Bladder Symptom Score. McNemar's test assessed the reliability of descriptors volunteered between two occasions and Partial Least Squares Regression determined whether language categories discriminated according to overactive bladder status. Post hoc Chi squared analysis and relative risk calculation determined the size and direction of overactive bladder prediction. Thirteen language categories (Urgency, Fullness, Pressure, Tickle/tingle, Pain/ache, Heavy, Normal, Intense, Sudden, Annoying, Uncomfortable, Anxiety, and Unique somatic) encapsulated 344 descriptors of sensation. Descriptor categories were stable between two interviews. The categories "Urgency" and "Fullness" predicted overactive bladder status. Participants who volunteered "Urgency" descriptors were twice as likely to have overactive bladder and participants who volunteered "Fullness" descriptors were almost three times as likely not to have overactive bladder. The sensation of desire to void is reliably described over sessions separated by a week, the language used reflects multiple dimensions of sensation, and can predict overactive bladder status. © 2013 Wiley Periodicals, Inc.
The great descriptor melting pot: mixing descriptors for the common good of QSAR models.
Tseng, Yufeng J; Hopfinger, Anton J; Esposito, Emilio Xavier
2012-01-01
The usefulness and utility of QSAR modeling depend heavily on the ability to estimate the values of molecular descriptors relevant to the endpoints of interest, followed by an optimized selection of descriptors to form the best QSAR models from a representative set of the endpoints of interest. The performance of a QSAR model is directly related to its molecular descriptors. QSAR modeling, specifically model construction and optimization, has benefited from its ability to borrow from other unrelated fields, yet the molecular descriptors that form QSAR models have remained basically unchanged in both form and preferred usage. There are many types of endpoints that require multiple classes of descriptors (descriptors that encode 1D through multi-dimensional, 4D and above, content) to most fully capture the molecular features and interactions that contribute to the endpoint. The advantages of QSAR models constructed from multiple, and different, descriptor classes have been demonstrated in the exploration of markedly different, and principally biological, systems and endpoints. Multiple examples of such QSAR applications using different descriptor sets are described and examined. The take-home message is that a major part of the future of QSAR analysis, and its application to modeling biological potency, ADME-Tox properties, general use in virtual screening applications, as well as its expanding use into new fields for building QSPR models, lies in developing strategies that combine and use 1D through nD molecular descriptors.
2D-QSAR study of fullerene nanostructure derivatives as potent HIV-1 protease inhibitors
NASA Astrophysics Data System (ADS)
Barzegar, Abolfazl; Jafari Mousavi, Somaye; Hamidi, Hossein; Sadeghi, Mehdi
2017-09-01
The protease of human immunodeficiency virus 1 (HIV-PR) is an essential enzyme for antiviral treatments. Carbon nanostructures of fullerene derivatives have nanoscale dimensions, with a diameter comparable to the diameter of the active site of HIV-PR, which would in turn inhibit HIV. In this research, two-dimensional quantitative structure-activity relationships (2D-QSAR) of fullerene derivatives against HIV-PR activity were employed as a powerful tool for elucidating the relationships between structure and experimental observations. A QSAR study of 49 fullerene derivatives was performed by employing stepwise-MLR, GAPLS-MLR, and PCA-MLR models for variable (descriptor) selection and model construction. QSAR models were obtained with higher ability to predict the activity of the fullerene derivatives against HIV-PR, with correlation coefficients (R2training) of 0.942, 0.89, and 0.87 as well as R2test values of 0.791, 0.67, and 0.674 for the stepwise-MLR, GAPLS-MLR, and PCA-MLR models, respectively. Leave-one-out cross-validated correlation coefficient (R2CV) and Y-randomization methods confirmed the models' robustness. The descriptors indicated that HIV-PR inhibition depends on the van der Waals volumes, polarizability, bond order between two atoms and electronegativities of fullerene derivatives. 2D-QSAR simulation, without needing the receptor's active site geometry, resulted in useful descriptors mainly denoting "C60 backbone-functional groups" and "C60 functional groups" properties. Both properties in fullerenes refer to the ligand fitness and improved van der Waals interactions with the HIV-PR active site. Therefore, the QSAR models can be used in the search for novel HIV-PR inhibitors based on fullerene derivatives.
Prasanna, S; Manivannan, E; Chaturvedi, S C
2005-04-15
As a part of our continuing efforts in discerning the structural and physicochemical requirements for selective COX-2 over COX-1 inhibition among the fused pyrazole ring systems, herein we report the QSAR analyses of the title compounds. The conformational flexibility of the title compounds was examined using a simple connection table representation. The conformational investigation was aided by calculating a connection table parameter called fraction of rotatable bonds, b_rotR, encompassing the number of rotatable bonds and b_count, the number of bonds including implicit hydrogens of each ligand. The hydrophobic and steric correlation of the title compounds towards selective COX-2 inhibition was reported previously in one of our recent publications. In this communication, we attempt to calculate Wang-Ford charges of the non-hydrogen common atoms of AM1 optimized geometries of the title compounds. Owing to the partial conformational flexibility of the title compounds, conformationally restricted and unrestricted descriptors were calculated from MOE. Correlation analysis of these 2D and 3D descriptors and Wang-Ford charges was accomplished by linear regression analysis. The 2D molecular descriptor b_single and the 3D molecular descriptors glob and std_dim3 showed significant contributions towards COX-2 inhibitory activity. Balaban J, a connectivity topological index, showed a negative and a positive contribution towards COX-1 and selective COX-2 over COX-1 inhibition, respectively. Wang-Ford charges calculated on C(7) showed a significant contribution towards COX-1 inhibitory activity, whereas charges calculated on C(8) were crucial in governing the selectivity of COX-2 over COX-1 inhibition among these congeners.
Information-theoretic signatures of biodiversity in the barcoding gene.
Barbosa, Valmir C
2018-08-14
Analyzing the information content of DNA, though holding the promise to help quantify how the processes of evolution have led to information gain throughout the ages, has remained an elusive goal. Paradoxically, one of the main reasons for this has been precisely the great diversity of life on the planet: if on the one hand this diversity is a rich source of data for information-content analysis, on the other hand there is so much variation as to make the task unmanageable. During the past decade or so, however, succinct fragments of the COI mitochondrial gene, which is present in all animal phyla and in a few others, have been shown to be useful for species identification through DNA barcoding. A few million such fragments are now publicly available through the BOLD systems initiative, thus providing an unprecedented opportunity for relatively comprehensive information-theoretic analyses of DNA to be attempted. Here we show how a generalized form of total correlation can yield distinctive information-theoretic descriptors of the phyla represented in those fragments. In order to illustrate the potential of this analysis to provide new insight into the evolution of species, we performed principal component analysis on standardized versions of the said descriptors for 23 phyla. Surprisingly, we found that, though based solely on the species represented in the data, the first principal component correlates strongly with the natural logarithm of the number of all known living species for those phyla. The new descriptors thus constitute clear information-theoretic signatures of the processes whereby evolution has given rise to current biodiversity, which suggests their potential usefulness in further related studies. Copyright © 2018 Elsevier Ltd. All rights reserved.
Karmakar, Chandan K; Khandoker, Ahsan H; Voss, Andreas; Palaniswami, Marimuthu
2011-03-03
A novel descriptor (Complex Correlation Measure (CCM)) for measuring the variability in the temporal structure of Poincaré plot has been developed to characterize or distinguish between Poincaré plots with similar shapes. This study was designed to assess the changes in temporal structure of the Poincaré plot using CCM during atropine infusion, 70° head-up tilt and scopolamine administration in healthy human subjects. CCM quantifies the point-to-point variation of the signal rather than gross description of the Poincaré plot. The physiological relevance of CCM was demonstrated by comparing the changes in CCM values with autonomic perturbation during all phases of the experiment. The sensitivities of short term variability (SD1), long term variability (SD2) and variability in temporal structure (CCM) were analyzed by changing the temporal structure by shuffling the sequences of points of the Poincaré plot. Surrogate analysis was used to show CCM as a measure of changes in temporal structure rather than random noise and sensitivity of CCM with changes in parasympathetic activity. CCM was found to be most sensitive to changes in temporal structure of the Poincaré plot as compared to SD1 and SD2. The values of all descriptors decreased with decrease in parasympathetic activity during atropine infusion and 70° head-up tilt phase. In contrast, values of all descriptors increased with increase in parasympathetic activity during scopolamine administration. The concordant reduction and enhancement in CCM values with parasympathetic activity indicates that the temporal variability of Poincaré plot is modulated by the parasympathetic activity which correlates with changes in CCM values. CCM is more sensitive than SD1 and SD2 to changes of parasympathetic activity.
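The insensitivity of SD1 and SD2 to temporal structure can be illustrated with a short sketch. It uses the standard definitions of SD1 and SD2 (dispersion across and along the line of identity of a lag-1 Poincaré plot); the RR-interval values and function names are illustrative assumptions, not data or code from the study, and CCM itself is not implemented here.

```python
import math
import random
import statistics

def poincare_points(rr):
    """Lag-1 Poincaré plot: pairs (x_n, x_{n+1}) of consecutive intervals."""
    return list(zip(rr[:-1], rr[1:]))

def sd1_sd2(points):
    """Dispersion across (SD1) and along (SD2) the line of identity."""
    across = [(y - x) / math.sqrt(2) for x, y in points]
    along = [(y + x) / math.sqrt(2) for x, y in points]
    return statistics.pstdev(across), statistics.pstdev(along)

# Made-up RR intervals in milliseconds, for illustration only.
rr = [812, 790, 805, 830, 841, 822, 798, 776, 801, 835, 850, 828]
points = poincare_points(rr)
sd1, sd2 = sd1_sd2(points)

# Shuffle the order of the plot points: the point cloud is unchanged,
# only its temporal sequence is destroyed.
shuffled = points[:]
random.Random(0).shuffle(shuffled)
sd1_shuf, sd2_shuf = sd1_sd2(shuffled)

print(f"original: SD1={sd1:.3f} SD2={sd2:.3f}")
print(f"shuffled: SD1={sd1_shuf:.3f} SD2={sd2_shuf:.3f}")
```

Because SD1 and SD2 depend only on the set of points, they come out the same after shuffling, which is why a descriptor that follows the point-to-point sequence is needed to capture temporal structure.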
Zare-Shahabadi, Vali; Abbasitabar, Fatemeh
2010-09-01
Quantitative structure-activity relationship models were derived for 107 analogs of 1-[(2-hydroxyethoxy)methyl]-6-(phenylthio)thymine, a potent inhibitor of the HIV-1 reverse transcriptase. The activities of these compounds were investigated by means of the multiple linear regression (MLR) technique. An ant colony optimization algorithm, called Memorized_ACS, was applied for selecting relevant descriptors and detecting outliers. This algorithm uses an external memory based upon knowledge incorporation from previous iterations. At first, the memory is empty, and then it is filled by running several ACS algorithms. In this respect, after each ACS run, the elite ant is stored in the memory and the process is continued to fill the memory. Here, pheromone updating is performed by all elite ants collected in the memory; this results in improvements in both exploration and exploitation behaviors of the ACS algorithm. The memory is then emptied and filled again by performing several ACS algorithms using updated pheromone trails. This process is repeated for several iterations. At the end, the memory contains several top solutions for the problem. The number of appearances of each descriptor in the external memory is a good criterion for its importance. Finally, prediction is performed by the elitist ant, and interpretation is carried out by considering the importance of each descriptor. The best MLR model has a training error of 0.47 log (1/EC(50)) units (R(2) = 0.90) and a prediction error of 0.76 log (1/EC(50)) units (R(2) = 0.88). Copyright 2010 Wiley Periodicals, Inc.
Rashotte, Judy; Coburn, Geraldine; Harrison, Denise; Stevens, Bonnie J; Yamada, Janet; Abbott, Laura K
2013-01-01
Although documentation of children's pain by health care professionals is frequently undertaken, few studies have explored the nature of the language used to describe pain in the medical records of hospitalized children. To describe health care professionals' use of written language related to the quality and quantity of pain experienced by hospitalized children. Free-text pain narratives documented during a 24 h period were collected from the medical records of 3822 children (0 to 18 years of age) hospitalized on 32 inpatient units in eight Canadian pediatric hospitals. A qualitative descriptive exploration using a content analysis approach was used. Pain narratives were documented a total of 5390 times in 1518 of the 3822 children's medical records (40%). Overall, word choices represented objective and subjective descriptors. Two major categories were identified, with their respective subcategories of word indicators and associated cues: indicators of pain, including behavioural (e.g., vocal, motor, facial and activities cues), affective and physiological cues, and children's descriptors; and word qualifiers, including intensity, comparator and temporal qualifiers. The richness and complexity of vocabulary used by clinicians to document children's pain lend support to the concept that the word 'pain' is a label that represents a myriad of different experiences. There is potential to refine pediatric pain assessment measures to be inclusive of other cues used to identify children's pain. The results enhance the discussion concerning the development of standardized nomenclature. Further research is warranted to determine whether there is congruence in interpretation across time, place and individuals.
Toropov, Andrey A; Toropova, Alla P; Raska, Ivan; Benfenati, Emilio
2010-04-01
Three different splits into the subtraining set (n = 22), the set of calibration (n = 21), and the test set (n = 12) of 55 antineoplastic agents have been examined. By the correlation balance of SMILES-based optimal descriptors quite satisfactory models for the octanol/water partition coefficient have been obtained on all three splits. The correlation balance is the optimization of a one-variable model with a target function that provides both the maximal values of the correlation coefficient for the subtraining and calibration set and the minimum of the difference between the above-mentioned correlation coefficients. Thus, the calibration set is a preliminary test set. Copyright (c) 2009 Elsevier Masson SAS. All rights reserved.
A contour-based shape descriptor for biomedical image classification and retrieval
NASA Astrophysics Data System (ADS)
You, Daekeun; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.
2013-12-01
Contours, object blobs, and specific feature points are utilized to represent object shapes and extract shape descriptors that can then be used for object detection or image classification. In this research we develop a shape descriptor for biomedical image type (or, modality) classification. We adapt a feature extraction method used in optical character recognition (OCR) for character shape representation, and apply various image preprocessing methods to successfully adapt the method to our application. The proposed shape descriptor is applied to radiology images (e.g., MRI, CT, ultrasound, X-ray, etc.) to assess its usefulness for modality classification. In our experiment we compare our method with other visual descriptors such as CEDD, CLD, Tamura, and PHOG that extract color, texture, or shape information from images. The proposed method achieved the highest classification accuracy of 74.1% among all other individual descriptors in the test, and when combined with CSD (color structure descriptor) showed better performance (78.9%) than using the shape descriptor alone.
Jabeen, Safia; Mehmood, Zahid; Mahmood, Toqeer; Saba, Tanzila; Rehman, Amjad; Mahmood, Muhammad Tariq
2018-01-01
For the last three decades, content-based image retrieval (CBIR) has been an active research area, representing a viable solution for retrieving similar images from an image repository. In this article, we propose a novel CBIR technique based on the visual words fusion of speeded-up robust features (SURF) and fast retina keypoint (FREAK) feature descriptors. SURF is a sparse descriptor whereas FREAK is a dense descriptor. Moreover, SURF is a scale and rotation-invariant descriptor that performs better in the case of repeatability, distinctiveness, and robustness. It is robust to noise, detection errors, geometric, and photometric deformations. It also performs better at low illumination within an image as compared to the FREAK descriptor. In contrast, FREAK is a retina-inspired speedy descriptor that performs better for classification-based problems as compared to the SURF descriptor. Experimental results show that the proposed technique based on the visual words fusion of SURF-FREAK descriptors combines the features of both descriptors and resolves the aforementioned issues. The qualitative and quantitative analysis performed on three image collections, namely Corel-1000, Corel-1500, and Caltech-256, shows that proposed technique based on visual words fusion significantly improved the performance of the CBIR as compared to the feature fusion of both descriptors and state-of-the-art image retrieval techniques. PMID:29694429
Wood, Dustin
2015-02-01
Using a set of 498 English words identified by Saucier (1997) as common person-descriptor adjectives or trait terms, I tested 3 instantiations of the lexical hypothesis, which posit that more socially important person descriptors show greater density in the lexicon. Specifically, I explored whether trait terms that have greater relational impact (i.e., more greatly influence how others respond to a person) have more synonyms, are more frequently used, and are more strongly correlated with other trait terms. I found little evidence to suggest that trait terms rated as having greater relational impact were more frequently used or had more synonyms. However, these terms correlated more strongly with other trait terms in the set. Conversely, a trait term's loadings on structural factors (e.g., the Big Five, HEXACO) were extremely good predictors of the term's relational impact. The findings suggest that the lexical hypothesis may not be strongly supported in some ways it is commonly understood but is supported in the manner most important to investigations of trait structure. Specifically, trait terms with greater relational impact tend to more strongly correlate with other terms in lexical sets and thus have a greater role in driving the location of factors in analyses of trait structure. Implications for understanding the meaning of lexical factors such as the Big Five are discussed. PsycINFO Database Record (c) 2015 APA, all rights reserved.
A new method for shape and texture classification of orthopedic wear nanoparticles.
Zhang, Dongning; Page, Janet R; Kavanaugh, Aaron E; Billi, Fabrizio
2012-09-27
Detailed morphologic analysis of particles produced during wear of orthopedic implants is important in determining a correlation among material, wear, and biological effects. However, the use of simple shape descriptors is insufficient to categorize the data and to compare the nature of wear particles generated by different implants. An approach based on Discrete Fourier Transform (DFT) is presented for describing particle shape and surface texture. Four metal-on-metal bearing couples were tested in an orbital wear simulator under standard and adverse (steep-angled cups) wear simulator conditions. Digitized Scanning Electron Microscope (SEM) images of the wear particles were imported into MATLAB to carry out Fourier descriptor calculations via a specifically developed algorithm. The descriptors were then used for studying particle characteristics (shape and texture) as well as for cluster classification. Analysis of the particles demonstrated the validity of the proposed model by showing that steep-angle Co-Cr wear particles were more asymmetric, compressed, extended, triangular, square, and roughened at 3 Mc than after 0.25 Mc. In contrast, particles from standard angle samples were only more compressed and extended after 3 Mc compared to 0.25 Mc. Cluster analysis revealed that the 0.25 Mc steep-angle particle distribution was a subset of the 3 Mc distribution.
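As a rough illustration of how DFT-based shape descriptors can be made independent of particle position, size and orientation, the sketch below computes normalized Fourier-descriptor magnitudes of a closed contour. It is a generic textbook construction written in Python rather than the authors' MATLAB algorithm; the function names, the choice of normalizing by the first harmonic, and the synthetic square contour are all assumptions, and the contour is assumed to be traced counter-clockwise.

```python
import cmath
import math

def dft(z):
    """Plain O(n^2) discrete Fourier transform, normalized by n."""
    n = len(z)
    return [sum(z[k] * cmath.exp(-2j * math.pi * u * k / n) for k in range(n)) / n
            for u in range(n)]

def fourier_descriptors(contour, n_coeffs=6):
    """Magnitudes |c_u| / |c_1| for u = 2 .. n_coeffs + 1.

    Skipping c_0 removes translation, dividing by |c_1| removes scale,
    and taking magnitudes removes rotation and start-point dependence.
    """
    coeffs = dft([complex(x, y) for x, y in contour])
    scale = abs(coeffs[1])
    return [abs(coeffs[u]) / scale for u in range(2, 2 + n_coeffs)]

def square_contour(m):
    """4*m boundary points of a square, traced counter-clockwise."""
    steps = [2.0 * i / m for i in range(m)]
    bottom = [(-1 + s, -1.0) for s in steps]
    right = [(1.0, -1 + s) for s in steps]
    top = [(1 - s, 1.0) for s in steps]
    left = [(-1.0, 1 - s) for s in steps]
    return bottom + right + top + left

def transform(contour, scale, angle, shift):
    """Scale, rotate and translate a contour (similarity transform)."""
    rot = cmath.rect(scale, angle)
    out = []
    for x, y in contour:
        w = rot * complex(x, y) + complex(*shift)
        out.append((w.real, w.imag))
    return out

square = square_contour(8)
moved = transform(square, scale=3.0, angle=0.7, shift=(5.0, -2.0))
fd_square = fourier_descriptors(square)
fd_moved = fourier_descriptors(moved)
print([round(v, 4) for v in fd_square])
print([round(v, 4) for v in fd_moved])
```

The two printed lists agree, so descriptors of this kind can be compared across particles imaged at different positions and magnifications; which harmonics map to labels such as "triangular" or "square" is specific to the authors' method and is not reproduced here.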
Dong, Pei-Pei; Ge, Guang-Bo; Zhang, Yan-Yan; Ai, Chun-Zhi; Li, Guo-Hui; Zhu, Liang-Liang; Luan, Hong-Wei; Liu, Xing-Bao; Yang, Ling
2009-10-16
Seven pairs of epimers and one pair of isomeric metabolites of taxanes, each pair of which have similar structures but different retention behaviors, together with additional 13 taxanes with different substitutions were chosen to investigate the quantitative structure-retention relationship (QSRR) of taxanes in ultra fast liquid chromatography (UFLC). Monte Carlo variable selection (MCVS) method was adopted to choose descriptors. The selected four descriptors were used to build QSRR model with multi-linear regression (MLR) and artificial neural network (ANN) modeling techniques. Both linear and nonlinear models show good predictive ability, of which ANN model was better with the determination coefficient R(2) for training, validation and test set being 0.9892, 0.9747 and 0.9840, respectively. The results of 100 times' leave-12-out cross validation showed the robustness of this model. All the isomers can be correctly differentiated by this model. According to the selected descriptors, the three dimensional structural information was critical for recognition of epimers. Hydrophobic interaction was the uppermost factor for retention in UFLC. Molecules' polarizability and polarity properties were also closely correlated with retention behaviors. This QSRR model will be useful for separation and identification of taxanes including epimers and metabolites from botanical or biological samples.
Li, Hang; Wang, Maolin; Gong, Ya-Nan; Yan, Aixia
2016-01-01
β-secretase (BACE1) is an aspartyl protease, which is considered a novel vital target in Alzheimer's disease therapy. We collected a data set of 294 BACE1 inhibitors, and built six classification models to discriminate active and weakly active inhibitors using Kohonen's Self-Organizing Map (SOM) method and Support Vector Machine (SVM) method. Each molecular descriptor was calculated using the program ADRIANA.Code. We adopted two different methods, a random method and a Self-Organizing Map method, for the training/test set split. The descriptors were selected by F-score and stepwise linear regression analysis. The best SVM model, Model 2C, has a good prediction performance on the test set, with prediction accuracy, sensitivity (SE), specificity (SP) and Matthews correlation coefficient (MCC) of 89.02%, 90%, 88%, and 0.78, respectively. Model 1A is the best SOM model, whose accuracy and MCC on the test set were 94.57% and 0.98, respectively. The lone pair electronegativity and polarizability related descriptors contributed substantially to the bioactivity of BACE1 inhibitors. The Extended-Connectivity Finger-Prints_4 (ECFP_4) analysis found some key substructural features, which could be helpful for further drug design research. The SOM and SVM models built in this study can be obtained from the authors by email or other contacts.
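The four reported metrics all derive from a single confusion matrix, which the following sketch makes explicit. The abstract does not give the confusion matrix; the counts used below are a hypothetical example chosen only because they are arithmetically consistent with the figures quoted for the SVM model, and the function name is likewise an assumption.

```python
import math

def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity and Matthews correlation coefficient."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)          # true-positive rate (SE)
    specificity = tn / (tn + fp)          # true-negative rate (SP)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, sensitivity, specificity, mcc

# Hypothetical counts for an 82-compound test set (NOT taken from the paper).
acc, se, sp, mcc = classification_metrics(tp=36, fn=4, tn=37, fp=5)
print(f"accuracy={acc:.2%} SE={se:.0%} SP={sp:.0%} MCC={mcc:.2f}")
```

Unlike accuracy, MCC uses all four cells of the matrix, so it stays informative when the active and weakly active classes are unbalanced.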
Akbar, Jamshed; Iqbal, Shahid; Batool, Fozia; Karim, Abdul; Chan, Kim Wei
2012-01-01
Quantitative structure-retention relationships (QSRRs) have successfully been developed for naturally occurring phenolic compounds in a reversed-phase liquid chromatographic (RPLC) system. A total of 1519 descriptors were calculated from the optimized structures of the molecules using the MOPAC2009 and DRAGON software. The data set of 39 molecules was divided into training and external validation sets. For feature selection and mapping we used step-wise multiple linear regression (SMLR), unsupervised forward selection followed by step-wise multiple linear regression (UFS-SMLR) and artificial neural networks (ANN). Stable and robust models with significant predictive abilities in terms of validation statistics were obtained with negation of any chance correlation. ANN models were found to be better than the remaining two approaches. HNar, IDM, Mp, GATS2v, DISP and 3D-MoRSE (signals 22, 28 and 32) descriptors based on van der Waals volume, electronegativity, mass and polarizability, at the atomic level, were found to have significant effects on the retention times. The possible implications of these descriptors in RPLC have been discussed. All the models are proven to be quite able to predict the retention times of phenolic compounds and have shown remarkable validation, robustness, stability and predictive performance. PMID:23203132
Luo, Wen; Medrek, Sarah; Misra, Jatin; Nohynek, Gerhard J
2007-02-01
The objective of this study was to construct and validate a quantitative structure-activity relationship model for skin absorption. Such models are valuable tools for screening and prioritization in safety and efficacy evaluation, and risk assessment of drugs and chemicals. A database of 340 chemicals with percutaneous absorption was assembled. Two models were derived from the training set consisting of 306 chemicals (90/10 random split). In addition to the experimental K(ow) values, over 300 2D and 3D atomic and molecular descriptors were analyzed using MDL's QsarIS computer program. Subsequently, the models were validated using both internal (leave-one-out) and external validation (test set) procedures. Using stepwise regression analysis, three molecular descriptors were determined to have significant statistical correlation with K(p) (R2 = 0.8225): logK(ow), X0 (quantification of both molecular size and the degree of skeletal branching), and SsssCH (count of aromatic carbon groups). In conclusion, two models to estimate skin absorption were developed. When compared to other skin absorption QSAR models in the literature, our model incorporated more chemicals and explored a large number of descriptors. Additionally, our models are reasonably predictive and have met both internal and external statistical validations.
The conventional Junge-Pankow adsorption model uses the sub-cooled liquid vapor pressure (pLo) as a correlation parameter for gas/particle interactions. An alternative is the octanol-air partition coefficient (Koa) absorption model. Log-log plots of the particle-gas partition c...
Phi-s correlation and dynamic time warping - Two methods for tracking ice floes in SAR images
NASA Technical Reports Server (NTRS)
Mcconnell, Ross; Kober, Wolfgang; Kwok, Ronald; Curlander, John C.; Pang, Shirley S.
1991-01-01
The authors present two algorithms for performing shape matching on ice floe boundaries in SAR (synthetic aperture radar) images. These algorithms quickly produce a set of ice motion and rotation vectors that can be used to guide a pixel value correlator. The algorithms match a shape descriptor known as the Phi-s curve. The first algorithm uses normalized correlation to match the Phi-s curves, while the second uses dynamic programming to compute an elastic match that better accommodates ice floe deformation. Some empirical data on the performance of the algorithms on Seasat SAR images are presented.
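The idea of an elastic match computed by dynamic programming can be sketched with a classic dynamic time warping (DTW) distance between two sampled curves. This is a generic DTW recurrence, not the authors' implementation; the function name and the sample tangent-angle values are illustrative assumptions, and angle wrap-around as well as the choice of starting point on a closed boundary are ignored for brevity.

```python
def dtw_distance(a, b):
    """Minimum cumulative |a_i - b_j| over monotonic alignments of a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning the first i samples of a
    # with the first j samples of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# Made-up tangent-angle samples (radians) along a boundary, and a locally
# stretched version of the same curve, as a deformed floe might produce.
phi_a = [0.0, 0.5, 1.0, 1.5, 1.0, 0.5]
phi_b = [0.0, 0.5, 0.5, 1.0, 1.5, 1.5, 1.0, 0.5]

elastic = dtw_distance(phi_a, phi_b)
rigid = sum(abs(x - y) for x, y in zip(phi_a, phi_b))  # sample-by-sample
print(elastic, rigid)
```

The elastic cost is zero because the warping absorbs the local stretching, whereas a rigid sample-by-sample comparison reports a mismatch; this is the sense in which a dynamic-programming match accommodates deformation better than plain correlation.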
Naik, P K; Singh, T; Singh, H
2009-07-01
Quantitative structure-activity relationship (QSAR) analyses were performed independently on data sets belonging to two groups of insecticides, namely the organophosphates and carbamates. Several types of descriptors including topological, spatial, thermodynamic, information content, lead likeness and E-state indices were used to derive quantitative relationships between insecticide activities and structural properties of chemicals. A systematic search approach based on missing value, zero value, simple correlation and multi-collinearity tests as well as the use of a genetic algorithm allowed the optimal selection of the descriptors used to generate the models. The QSAR models developed for both organophosphate and carbamate groups revealed good predictability with r(2) values of 0.949 and 0.838 as well as [image omitted] values of 0.890 and 0.765, respectively. In addition, a linear correlation was observed between the predicted and experimental LD(50) values for the test set data with r(2) of 0.871 and 0.788 for both the organophosphate and carbamate groups, indicating that the prediction accuracy of the QSAR models was acceptable. The models were also tested successfully from external validation criteria. QSAR models developed in this study should help further design of novel potent insecticides.
Taylor, Emily M.; Sweetkind, Donald S.
2014-01-01
Understanding the subsurface geologic framework of the Cenozoic basin fill that underlies the Amargosa Desert in southern Nevada and southeastern California has been improved by using borehole data to construct three-dimensional lithologic and interpreted facies models. Lithologic data from 210 boreholes from a 20-kilometer (km) by 90-km area were reduced to a limited suite of descriptors based on geologic knowledge of the basin and distributed in three-dimensional space using interpolation methods. The resulting lithologic model of the Amargosa Desert basin portrays a complex system of interfingered coarse- to fine-grained alluvium, playa and palustrine deposits, eolian sands, and interbedded volcanic units. Lithologic units could not be represented in the model as a stacked stratigraphic sequence due to the complex interfingering of lithologic units and the absence of available time-stratigraphic markers. Instead, lithologic units were grouped into interpreted genetic classes, such as playa or alluvial fan, to create a three-dimensional model of the interpreted facies data. Three-dimensional facies models computed from these data portray the alluvial infilling of a tectonically formed basin with intermittent internal drainage and localized regional groundwater discharge. The lithologic and interpreted facies models compare favorably to resistivity, aeromagnetic, and geologic map data, lending confidence to the interpretation.
Signaling completion of a message transfer from an origin compute node to a target compute node
Blocksome, Michael A [Rochester, MN]; Parker, Jeffrey J [Rochester, MN]
2011-05-24
Signaling completion of a message transfer from an origin node to a target node includes: sending, by an origin DMA engine, an RTS message, the RTS message specifying an application message for transfer to the target node from the origin node; receiving, by the origin DMA engine, a remote get message containing a data descriptor for the message and a completion notification descriptor, the completion notification descriptor specifying a local direct put transfer operation for transferring data locally on the origin node; inserting, by the origin DMA engine in an injection FIFO buffer, the data descriptor followed by the completion notification descriptor; transferring, by the origin DMA engine to the target node, the message in dependence upon the data descriptor; and notifying, by the origin DMA engine, the application that transfer of the message is complete in dependence upon the completion notification descriptor.
Direct memory access transfer completion notification
Archer, Charles J.; Blocksome, Michael A.; Parker, Jeffrey J. [Rochester, MN]
2011-02-15
Methods, systems, and products are disclosed for DMA transfer completion notification that include: inserting, by an origin DMA on an origin node in an origin injection FIFO, a data descriptor for an application message; inserting, by the origin DMA, a reflection descriptor in the origin injection FIFO, the reflection descriptor specifying a remote get operation for injecting a completion notification descriptor in a reflection injection FIFO on a reflection node; transferring, by the origin DMA to a target node, the message in dependence upon the data descriptor; in response to completing the message transfer, transferring, by the origin DMA to the reflection node, the completion notification descriptor in dependence upon the reflection descriptor; receiving, by the origin DMA from the reflection node, a completion packet; and notifying, by the origin DMA in response to receiving the completion packet, the origin node's processing core that the message transfer is complete.
Signaling completion of a message transfer from an origin compute node to a target compute node
Blocksome, Michael A [Rochester, MN
2011-02-15
Signaling completion of a message transfer from an origin node to a target node includes: sending, by an origin DMA engine, an RTS message, the RTS message specifying an application message for transfer to the target node from the origin node; receiving, by the origin DMA engine, a remote get message containing a data descriptor for the message and a completion notification descriptor, the completion notification descriptor specifying a local memory FIFO data transfer operation for transferring data locally on the origin node; inserting, by the origin DMA engine in an injection FIFO buffer, the data descriptor followed by the completion notification descriptor; transferring, by the origin DMA engine to the target node, the message in dependence upon the data descriptor; and notifying, by the origin DMA engine, the application that transfer of the message is complete in dependence upon the completion notification descriptor.
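The ordering guarantee described above (data descriptor first, completion notification descriptor second, both in the same injection FIFO) can be illustrated with a toy model. This is a minimal Python sketch and not the patented implementation: the names `Descriptor`, `OriginDMA`, `on_remote_get`, and `process` are hypothetical, and the network transfer is simulated by appending to a list.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Descriptor:
    kind: str             # "data" or "completion" (hypothetical labels)
    payload: bytes = b""  # message bytes for a data descriptor


class OriginDMA:
    """Toy model of an origin DMA engine with a single injection FIFO."""

    def __init__(self):
        self.injection_fifo = deque()
        self.sent_to_target = []  # stands in for the network link
        self.completed = False    # flag the application would observe

    def on_remote_get(self, data_desc, completion_desc):
        # Insert the data descriptor first, then the completion descriptor,
        # so FIFO order places the notification after the transfer.
        self.injection_fifo.append(data_desc)
        self.injection_fifo.append(completion_desc)

    def process(self):
        # Drain the FIFO strictly in insertion order.
        while self.injection_fifo:
            desc = self.injection_fifo.popleft()
            if desc.kind == "data":
                self.sent_to_target.append(desc.payload)
            elif desc.kind == "completion":
                # Local operation on the origin node: mark transfer complete.
                self.completed = True


dma = OriginDMA()
dma.on_remote_get(Descriptor("data", b"hello"), Descriptor("completion"))
dma.process()
```

Because both descriptors travel through one queue, the completion flag cannot be set before the data descriptor has been handled, which is the property the abstract relies on.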
Multi-Scale Surface Descriptors
Cipriano, Gregory; Phillips, George N.; Gleicher, Michael
2010-01-01
Local shape descriptors compactly characterize regions of a surface, and have been applied to tasks in visualization, shape matching, and analysis. Classically, curvature has been used as a shape descriptor; however, this differential property characterizes only an infinitesimal neighborhood. In this paper, we provide shape descriptors for surface meshes designed to be multi-scale, that is, capable of characterizing regions of varying size. These descriptors capture statistically the shape of a neighborhood around a central point by fitting a quadratic surface. They therefore mimic differential curvature, are efficient to compute, and encode anisotropy. We show how simple variants of mesh operations can be used to compute the descriptors without resorting to expensive parameterizations, and additionally provide a statistical approximation for reduced computational cost. We show how these descriptors apply to a number of uses in visualization, analysis, and matching of surfaces, particularly to tasks in protein surface analysis. PMID:19834190
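The idea of fitting a quadratic surface to a neighborhood and reading curvature-like values from it can be sketched as follows. This is a hedged illustration only: it assumes the neighborhood points are already expressed in a local tangent frame centered on the point of interest, it fits z = a x^2 + b x y + c y^2 by ordinary least squares, and it does not reproduce the paper's mesh operations or statistical approximation. All function names are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-15:
        raise ValueError("degenerate neighborhood: cannot fit a quadratic")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def fit_quadratic(points):
    """Least-squares fit of z = a*x^2 + b*x*y + c*y^2 to (x, y, z) samples."""
    ata = [[0.0] * 3 for _ in range(3)]  # normal-equation matrix A^T A
    atz = [0.0] * 3                      # right-hand side A^T z
    for x, y, z in points:
        basis = (x * x, x * y, y * y)
        for i in range(3):
            atz[i] += basis[i] * z
            for j in range(3):
                ata[i][j] += basis[i] * basis[j]
    return solve3(ata, atz)


def principal_curvatures(a, b, c):
    """Eigenvalues of the Hessian [[2a, b], [b, 2c]] at the patch center."""
    mean = a + c
    gauss = 4.0 * a * c - b * b
    spread = math.sqrt(max(0.0, mean * mean - gauss))
    return mean + spread, mean - spread


def sample_patch(radius):
    """Synthetic neighborhood of the given radius on z = 0.5 x^2 + 0.25 y^2."""
    steps = (-1.0, -0.5, 0.0, 0.5, 1.0)
    return [(radius * u, radius * v,
             0.5 * (radius * u) ** 2 + 0.25 * (radius * v) ** 2)
            for u in steps for v in steps]


# Descriptor evaluated at two neighborhood sizes (the "multi-scale" part).
descriptors = {r: principal_curvatures(*fit_quadratic(sample_patch(r)))
               for r in (0.5, 2.0)}
```

The synthetic surface here is exactly quadratic, so both scales return the same pair of values; on a real surface the fitted curvatures would generally change with neighborhood radius, and the unequal pair (k1, k2) is what carries the anisotropy.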
Respiratory complaints in Chinese: cultural and diagnostic specificities.
Han, Jiangna; Zhu, Yuanjue; Li, Shunwei; Chen, Xiansheng; Put, Claudia; Van de Woestijne, Karel P; Van den Bergh, Omer
2005-06-01
We investigated the qualitative components of a wide range of Chinese descriptors of dyspnea and associated symptoms, and their relevance for clinical diagnosis. Sixty-one spontaneously reported descriptors were elicited in Chinese patients to make a symptom checklist, which was administered to new groups of patients with different cardiopulmonary diseases, to patients with medically unexplained dyspnea and to healthy subjects. Test-retest reliability was satisfactory for most of the descriptors. A principal component analysis on 61 descriptors yielded the following eight factors: dyspnea-effort of breathing; dyspnea-affective aspect; wheezing; anxiety; tingling; palpitation; coughing and sputum; and dying experience. Although the descriptors of dyspnea-effort of breathing resembled Western wordings and were shared by patients with a variety of diseases, the descriptors of dyspnea-affective aspect appeared to be more culturally specific and were primarily linked to the diagnosis of medically unexplained dyspnea, whereas wheezing was specifically linked to asthma. Three factors of breathlessness were found in Chinese. The descriptors of dyspnea-effort of breathing and wheezing appear to be similar to Western descriptors, whereas the dyspnea-affective aspect seems to bear cultural specificity.
Nisius, Britta; Gohlke, Holger
2012-09-24
Analyzing protein binding sites provides detailed insights into the biological processes proteins are involved in, e.g., into drug-target interactions, and so is of crucial importance in drug discovery. Herein, we present novel alignment-independent binding site descriptors based on DrugScore potential fields. The potential fields are transformed to a set of information-rich descriptors using a series expansion in 3D Zernike polynomials. The resulting Zernike descriptors show a promising performance in detecting similarities among proteins with low pairwise sequence identities that bind identical ligands, as well as within subfamilies of one target class. Furthermore, the Zernike descriptors are robust against structural variations among protein binding sites. Finally, the Zernike descriptors show a high data compression power, and computing similarities between binding sites based on these descriptors is highly efficient. Consequently, the Zernike descriptors are a useful tool for computational binding site analysis, e.g., to predict the function of novel proteins, off-targets for drug candidates, or novel targets for known drugs.
Zhang, P; Tao, L; Zeng, X; Qin, C; Chen, S Y; Zhu, F; Yang, S Y; Li, Z R; Chen, W P; Chen, Y Z
2017-02-03
The studies of biological, disease, and pharmacological networks are facilitated by the systems-level investigations using computational tools. In particular, the network descriptors developed in other disciplines have found increasing applications in the study of the protein, gene regulatory, metabolic, disease, and drug-targeted networks. Facilities are provided by the public web servers for computing network descriptors, but many descriptors are not covered, including those used or useful for biological studies. We upgraded the PROFEAT web server http://bidd2.nus.edu.sg/cgi-bin/profeat2016/main.cgi for computing up to 329 network descriptors and protein-protein interaction descriptors. PROFEAT network descriptors comprehensively describe the topological and connectivity characteristics of unweighted (uniform binding constants and molecular levels), edge-weighted (varying binding constants), node-weighted (varying molecular levels), edge-node-weighted (varying binding constants and molecular levels), and directed (oriented processes) networks. The usefulness of the network descriptors is illustrated by the literature-reported studies of the biological networks derived from the genome, interactome, transcriptome, metabolome, and diseasome profiles. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Crivori, Patrizia; Zamora, Ismael; Speed, Bill; Orrenius, Christian; Poggesi, Italo
2004-03-01
A number of computational approaches are being proposed for an early optimization of ADME (absorption, distribution, metabolism and excretion) properties to increase the success rate in drug discovery. The present study describes the development of an in silico model able to estimate, from the three-dimensional structure of a molecule, the stability of a compound with respect to the human cytochrome P450 (CYP) 3A4 enzyme activity. Stability data were obtained by measuring the amount of unchanged compound remaining after a standardized incubation with human cDNA-expressed CYP3A4. The computational method transforms the three-dimensional molecular interaction fields (MIFs) generated from the molecular structure into descriptors (VolSurf and Almond procedures). The descriptors were correlated to the experimental metabolic stability classes by a partial least squares discriminant procedure. The model was trained using a set of 1800 compounds from the Pharmacia collection and was validated using two test sets: the first one including 825 compounds from the Pharmacia collection and the second one consisting of 20 known drugs. This model correctly predicted 75% of the first and 85% of the second test set and showed a precision above 86% in selecting metabolically stable compounds. The model appears to be a valuable tool in the design of virtual libraries to bias the selection toward more stable compounds. Abbreviations: ADME - absorption, distribution, metabolism and excretion; CYP - cytochrome P450; MIFs - molecular interaction fields; HTS - high throughput screening; DDI - drug-drug interactions; 3D - three-dimensional; PCA - principal components analysis; CPCA - consensus principal components analysis; PLS - partial least squares; PLSD - partial least squares discriminant; GRIND - grid independent descriptors; GRID - software originally created and developed by Professor Peter Goodford.
Yang, Zhihui; Luo, Shuang; Wei, Zongsu; Ye, Tiantian; Spinney, Richard; Chen, Dong; Xiao, Ruiyang
2016-04-01
The second-order rate constants (k) of hydroxyl radical (·OH) with polychlorinated biphenyls (PCBs) in the gas phase are of scientific and regulatory importance for assessing their global distribution and fate in the atmosphere. Due to the limited number of measured k values, there is a need to model the k values for unknown PCB congeners. In the present study, we developed a quantitative structure-activity relationship (QSAR) model with quantum chemical descriptors using a sequential approach, including correlation analysis, principal component analysis, multi-linear regression, validation, and estimation of applicability domain. The result indicates that the single descriptor, polarizability (α), plays an important role in determining the reactivity with a global standardized function of ln k = −0.054 × α − 19.49 at 298 K. In order to validate the QSAR predicted k values and expand the current k value database for PCB congeners, an independent method, density functional theory (DFT), was employed to calculate the kinetics and thermodynamics of the gas-phase ·OH oxidation of 2,4',5-trichlorobiphenyl (PCB31), 2,2',4,4'-tetrachlorobiphenyl (PCB47), 2,3,4,5,6-pentachlorobiphenyl (PCB116), 3,3',4,4',5,5'-hexachlorobiphenyl (PCB169), and 2,3,3',4,5,5',6-heptachlorobiphenyl (PCB192) at 298 K at the B3LYP/6-311++G**//B3LYP/6-31+G** level of theory. The QSAR predicted and DFT calculated k values for ·OH oxidation of these PCB congeners exhibit excellent agreement with the experimental k values, indicating the robustness and predictive power of the single-descriptor based QSAR model we developed. Copyright © 2015 Elsevier Ltd. All rights reserved.
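The single-descriptor relation quoted in this abstract is simple enough to evaluate directly. The sketch below only encodes that published linear form; the abstract does not state the units of α or k, so the input value used here is a hypothetical placeholder and the function names are illustrative.

```python
import math


def ln_k(alpha):
    """ln k at 298 K from the reported relation ln k = -0.054 * alpha - 19.49."""
    return -0.054 * alpha - 19.49


def rate_constant(alpha):
    """Rate constant k implied by the relation (units as in the source paper)."""
    return math.exp(ln_k(alpha))


# Hypothetical polarizability value, chosen only to exercise the formula.
example_alpha = 150.0
example_ln_k = ln_k(example_alpha)
```

Since the slope is negative, a larger polarizability gives a smaller predicted rate constant under this relation.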
Blue Marble Matches: Using Earth for Planetary Comparisons
NASA Technical Reports Server (NTRS)
Graff, Paige Valderrama
2009-01-01
Goal: This activity is designed to introduce students to geologic processes on Earth and model how scientists use Earth to gain a better understanding of other planetary bodies in the solar system. Objectives: Students will: 1. Identify common descriptor characteristics used by scientists to describe geologic features in images. 2. Identify geologic features and how they form on Earth. 3. Create a list of defining/distinguishing characteristics of geologic features 4. Identify geologic features in images of other planetary bodies. 5. List observations and interpretations about planetary body comparisons. 6. Create summary statements about planetary body comparisons.
Forging a signature of in vivo senescence.
Sharpless, Norman E; Sherr, Charles J
2015-07-01
'Cellular senescence', a term originally defining the characteristics of cultured cells that exceed their replicative limit, has been broadened to describe durable states of proliferative arrest induced by disparate stress factors. Proposed relationships between cellular senescence, tumour suppression, loss of tissue regenerative capacity and ageing suffer from lack of uniform definition and consistently applied criteria. Here, we highlight caveats in interpreting the importance of suboptimal senescence-associated biomarkers, expressed either alone or in combination. We advocate that more-specific descriptors be substituted for the now broadly applied umbrella term 'senescence' in defining the suite of diverse physiological responses to cellular stress.
Teixeira, Christiane Aires; Rodrigues Júnior, Antonio Luiz; Straccia, Luciana Cristina; Vianna, Elcio Dos Santos Oliveira; Silva, Geruza Alves da; Martinez, José Antônio Baddini
2011-01-01
To develop a set of descriptive terms applied to the sensation of dyspnea (dyspnea descriptors) for use in Brazil and to investigate the usefulness of these descriptors in four distinct clinical conditions that can be accompanied by dyspnea. We collected 111 dyspnea descriptors from 67 patients and 10 health professionals. These descriptors were analyzed and reduced to 15 based on their frequency of use, similarity of meaning, and potential pathophysiological value. Those 15 descriptors were applied in 50 asthma patients, 50 COPD patients, 30 patients with heart failure, and 50 patients with class II or III obesity. The three best descriptors, as selected by the patients, were studied by cluster analysis. Potential associations between the identified clusters and the four clinical conditions were also investigated. The use of this set of descriptors led to a solution with seven clusters, designated sufoco (suffocating), aperto (tight), rápido (rapid), fadiga (fatigue), abafado (stuffy), trabalho/inspiração (work/inhalation), and falta de ar (shortness of breath). Overlapping of descriptors was quite common among the patients, regardless of their clinical condition. Asthma was significantly associated with the sufoco and trabalho/inspiração clusters, whereas COPD and heart failure were associated with the sufoco, trabalho/inspiração, and falta de ar clusters. Obesity was associated only with the falta de ar cluster. In Brazil, patients who are accustomed to perceiving dyspnea employ various descriptors in order to describe the symptom, and these descriptors can be grouped into similar clusters. In our study sample, such clusters showed no usefulness in differentiating among the four clinical conditions evaluated.
Garro Martinez, Juan C; Vega-Hissi, Esteban G; Andrada, Matías F; Duchowicz, Pablo R; Torrens, Francisco; Estrada, Mario R
2014-01-01
Lacosamide is an anticonvulsant drug that also inhibits carbonic anhydrase. In this paper, we analyzed the apparent relationship between both activities by performing molecular modeling, docking and QSAR studies on 18 lacosamide derivatives with known anticonvulsant activity. Docking results suggested that the zinc-binding site of carbonic anhydrase is a possible target of lacosamide and its derivatives, which make favorable van der Waals interactions with Asn67, Gln92, Phe131 and Thr200. The mathematical models revealed a poor relationship between the anticonvulsant activity and molecular descriptors obtained from DFT and docking calculations. However, a QSAR model was developed using Dragon software descriptors. The statistical parameters of the model are: correlation coefficient, R=0.957 and standard deviation, S=0.162. Our results provide new valuable information regarding the relationship between both activities and contribute important insights into the essential molecular requirements for the anticonvulsant activity.
Defining the clinical course of multiple sclerosis
Reingold, Stephen C.; Cohen, Jeffrey A.; Cutter, Gary R.; Sørensen, Per Soelberg; Thompson, Alan J.; Wolinsky, Jerry S.; Balcer, Laura J.; Banwell, Brenda; Barkhof, Frederik; Bebo, Bruce; Calabresi, Peter A.; Clanet, Michel; Comi, Giancarlo; Fox, Robert J.; Freedman, Mark S.; Goodman, Andrew D.; Inglese, Matilde; Kappos, Ludwig; Kieseier, Bernd C.; Lincoln, John A.; Lubetzki, Catherine; Miller, Aaron E.; Montalban, Xavier; O'Connor, Paul W.; Petkau, John; Pozzilli, Carlo; Rudick, Richard A.; Sormani, Maria Pia; Stüve, Olaf; Waubant, Emmanuelle; Polman, Chris H.
2014-01-01
Accurate clinical course descriptions (phenotypes) of multiple sclerosis (MS) are important for communication, prognostication, design and recruitment of clinical trials, and treatment decision-making. Standardized descriptions published in 1996 based on a survey of international MS experts provided purely clinical phenotypes based on data and consensus at that time, but imaging and biological correlates were lacking. Increased understanding of MS and its pathology, coupled with general concern that the original descriptors may not adequately reflect more recently identified clinical aspects of the disease, prompted a re-examination of MS disease phenotypes by the International Advisory Committee on Clinical Trials of MS. While imaging and biological markers that might provide objective criteria for separating clinical phenotypes are lacking, we propose refined descriptors that include consideration of disease activity (based on clinical relapse rate and imaging findings) and disease progression. Strategies for future research to better define phenotypes are also outlined. PMID:24871874
Lin, Chun-Yu; Zhang, Lipeng; Zhao, Zhenghang; Xia, Zhenhai
2017-05-01
Covalent organic frameworks (COFs), an emerging class of framework materials linked by covalent bonds, hold potential for various applications such as efficient electrocatalysts, photovoltaics, and sensors. To rationally design COF-based electrocatalysts for oxygen reduction and evolution reactions in fuel cells and metal-air batteries, activity descriptors, derived from orbital energy and bonding structures, are identified with first-principles calculations for the COFs, which correlate COF structures with their catalytic activities. The calculations also predict that alkaline-earth metal-porphyrin COFs could catalyze the direct production of H2O2, a green oxidizer and an energy carrier. These predictions are supported by experimental data, and the design principles derived from the descriptors provide an approach for rational design of new electrocatalysts for both clean energy conversion and green oxidizer production. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Quantitative prediction of solvation free energy in octanol of organic compounds.
Delgado, Eduardo J; Jaña, Gonzalo A
2009-03-01
The free energy of solvation, ΔGS0, in octanol of organic compounds is quantitatively predicted from the molecular structure. The model, involving only three molecular descriptors, is obtained by multiple linear regression analysis from a data set of 147 compounds containing diverse organic functions, namely, halogenated and non-halogenated alkanes, alkenes, alkynes, aromatics, alcohols, aldehydes, ketones, amines, ethers and esters; covering a ΔGS0 range from about −50 to 0 kJ·mol−1. The model predicts the free energy of solvation with a squared correlation coefficient of 0.93 and a standard deviation, 2.4 kJ·mol−1, just marginally larger than the generally accepted value of experimental uncertainty. The involved molecular descriptors have definite physical meaning corresponding to the different intermolecular interactions occurring in the bulk liquid phase. The model is validated with an external set of 36 compounds not included in the training set.
Quantitative Prediction of Solvation Free Energy in Octanol of Organic Compounds
Delgado, Eduardo J.; Jaña, Gonzalo A.
2009-01-01
The free energy of solvation, ΔGS0, in octanol of organic compounds is quantitatively predicted from the molecular structure. The model, involving only three molecular descriptors, is obtained by multiple linear regression analysis from a data set of 147 compounds containing diverse organic functions, namely, halogenated and non-halogenated alkanes, alkenes, alkynes, aromatics, alcohols, aldehydes, ketones, amines, ethers and esters; covering a ΔGS0 range from about −50 to 0 kJ·mol−1. The model predicts the free energy of solvation with a squared correlation coefficient of 0.93 and a standard deviation, 2.4 kJ·mol−1, just marginally larger than the generally accepted value of experimental uncertainty. The involved molecular descriptors have definite physical meaning corresponding to the different intermolecular interactions occurring in the bulk liquid phase. The model is validated with an external set of 36 compounds not included in the training set. PMID:19399236
2013-01-01
Background: While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 amino acid descriptor sets have been benchmarked with respect to their ability to establish bioactivity models. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI, BLOSUM, a novel protein descriptor set (termed ProtFP (4 variants)), and in addition we created and benchmarked three pairs of descriptor combinations. Prediction performance was evaluated in seven structure-activity benchmarks which comprise Angiotensin Converting Enzyme (ACE) dipeptidic inhibitor data, and three proteochemometric data sets, namely (1) GPCR ligands modeled against a GPCR panel, (2) enzyme inhibitors (NNRTIs) with associated bioactivities against a set of HIV enzyme mutants, and (3) enzyme inhibitors (PIs) with associated bioactivities on a large set of HIV enzyme mutants. Results: The amino acid descriptor sets compared here show similar performance (<0.1 log units RMSE difference and <0.1 difference in MCC), while errors for individual proteins were in some cases found to be larger than those resulting from descriptor set differences (>0.3 log units RMSE difference and >0.7 difference in MCC). Combining different descriptor sets generally leads to better modeling performance than utilizing individual sets. The best performers were Z-scales (3) combined with ProtFP (Feature), or Z-Scales (3) combined with an average Z-Scale value for each target, while ProtFP (PCA8), ST-Scales, and ProtFP (Feature) rank last. Conclusions: While amino acid descriptor sets capture different aspects of amino acids, their ability to be used for bioactivity modeling is still – on average – surprisingly similar. 
Still, combining sets describing complementary information consistently leads to a small improvement in modeling performance (average MCC 0.01 better, average RMSE 0.01 log units lower). Finally, performance differences exist between the targets compared, thereby underlining that choosing an appropriate descriptor set is of fundamental importance for bioactivity modeling, both from the ligand side and the protein side. PMID:24059743
Violence in E-rated video games.
Thompson, K M; Haninger, K
2001-08-01
Children's exposure to violence, alcohol, tobacco and other substances, and sexual messages in the media are a source of public health concern; however, content in video games commonly played by children has not been quantified. To quantify and characterize the depiction of violence, alcohol, tobacco and other substances, and sex in video games rated E (for "Everyone"), analogous to the G rating of films, which suggests suitability for all audiences. We created a database of all existing E-rated video games available for rent or sale in the United States by April 1, 2001, to identify the distribution of games by genre and to characterize the distribution of content descriptors associated with these games. We played and assessed the content of a convenience sample of 55 E-rated video games released for major home video game consoles between 1985 and 2000. Game genre; duration of violence; number of fatalities; types of weapons used; whether injuring characters or destroying objects is rewarded or is required to advance in the game; depiction of alcohol, tobacco and other substances; and sexual content. Based on analysis of the 672 current E-rated video games played on home consoles, 77% were in sports, racing, or action genres and 57% did not receive any content descriptors. We found that 35 of the 55 games we played (64%) involved intentional violence for an average of 30.7% of game play (range, 1.5%-91.2%), and we noted significant differences in the amount of violence among game genres. Injuring characters was rewarded or required for advancement in 33 games (60%). The presence of any content descriptor for violence (n = 23 games) was significantly correlated with the presence of intentional violence in the game (at a 5% significance level based on a 2-sided Wilcoxon rank-sum test, t(53) = 2.59). Notably, 14 of 32 games (44%) that did not receive a content descriptor for violence contained acts of violence. 
Action and shooting games led to the largest numbers of deaths from violent acts, and we found a significant correlation between the proportion of violent game play and the number of deaths per minute of play. We noted potentially objectionable sexual content in 2 games and the presence of alcohol in 1 game. Content analysis suggests a significant amount of violence in some E-rated video games. The content descriptors provide some information to parents and should be used along with the rating, but the game's genre also appears to play a role in the amount of violent play. Physicians and parents should understand that popular E-rated video games may be a source of exposure to violence and other unexpected content for children and that games may reward the players for violent actions.
Quantitative EEG correlations with brain glucose metabolic rate during anesthesia in volunteers.
Alkire, M T
1998-08-01
To help elucidate the relationship between anesthetic-induced changes in the electroencephalogram (EEG) and the concurrent cerebral metabolic changes caused by anesthesia, positron emission tomography data of cerebral metabolism obtained in volunteers during anesthesia were correlated retrospectively with various concurrently measured EEG descriptors. Volunteers underwent functional brain imaging using the 18fluorodeoxyglucose technique; one scan always assessed awake-baseline cerebral metabolism (n = 7), and the other scans assessed metabolism during propofol sedation (n = 4), propofol anesthesia (n = 4), or isoflurane anesthesia (n = 5). The EEG was recorded continuously during metabolism assessment using a frontal-mastoid montage. Power spectrum variables, median frequency, 95% spectral edge, and bispectral index (BIS) values subsequently were correlated with the percentage of absolute cerebral metabolic reduction (PACMR) of glucose utilization caused by anesthesia. The percentage of absolute cerebral metabolic reduction, evident during anesthesia, trended with median frequency (r = -0.46, P = 0.11) and the spectral edge (r = -0.52, P = 0.07), and correlated with anesthetic type (r = -0.70, P < 0.05), relative beta power (r = -0.60, P < 0.05), total power (r = 0.71, P < 0.01), and bispectral index (r = -0.81, P < 0.001). After controlling for anesthetic type, only bispectral index (r = 0.40, P = 0.08) and alpha power (r = 0.37, P = 0.10) approached significance for explaining residual percentage of absolute cerebral metabolic reduction prediction error. Some EEG descriptors correlated linearly with the magnitude of the cerebral metabolic reduction caused by propofol and isoflurane anesthesia. These data suggest that a physiologic link exists between the EEG and cerebral metabolism during anesthesia that is mathematically quantifiable.
Hasegawa, Kiyoshi; Funatsu, Kimito
2014-12-01
Chemogenomics is a new strategy in drug discovery for interrogating all molecules capable of interacting with all biological targets. Because of the almost infinite number of drug-like organic molecules, bench-based experimental chemogenomics methods are not generally feasible. Several in silico chemogenomics models have therefore been developed for high-throughput screening of large numbers of drug candidate compounds and target proteins. In previous studies, we described two novel bi-modal PLS approaches. These methods provide a significant advantage in that they enable direct connections to be made between biological activities and ligand and protein descriptors. In this special issue, we review these two PLS-based approaches using two different chemogenomics datasets for illustration. We then compare the predictive and interpretive performance of the two methods using the same congeneric data set. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
O'Boyle, Noel M; Palmer, David S; Nigsch, Florian; Mitchell, John Bo
2008-10-29
We present a novel feature selection algorithm, Winnowing Artificial Ant Colony (WAAC), that performs simultaneous feature selection and model parameter optimisation for the development of predictive quantitative structure-property relationship (QSPR) models. The WAAC algorithm is an extension of the modified ant colony algorithm of Shen et al. (J Chem Inf Model 2005, 45: 1024-1029). We test the ability of the algorithm to develop a predictive partial least squares model for the Karthikeyan dataset (J Chem Inf Model 2005, 45: 581-590) of melting point values. We also test its ability to perform feature selection on a support vector machine model for the same dataset. Starting from an initial set of 203 descriptors, the WAAC algorithm selected a PLS model with 68 descriptors which has an RMSE on an external test set of 46.6 degrees C and R2 of 0.51. The number of components chosen for the model was 49, which was close to optimal for this feature selection. The selected SVM model has 28 descriptors (cost of 5, epsilon of 0.21) and an RMSE of 45.1 degrees C and R2 of 0.54. This model outperforms a kNN model (RMSE of 48.3 degrees C, R2 of 0.47) for the same data and has similar performance to a Random Forest model (RMSE of 44.5 degrees C, R2 of 0.55). However it is much less prone to bias at the extremes of the range of melting points as shown by the slope of the line through the residuals: -0.43 for WAAC/SVM, -0.53 for Random Forest. With a careful choice of objective function, the WAAC algorithm can be used to optimise machine learning and regression models that suffer from overfitting. Where model parameters also need to be tuned, as is the case with support vector machine and partial least squares models, it can optimise these simultaneously. The moving probabilities used by the algorithm are easily interpreted in terms of the best and current models of the ants, and the winnowing procedure promotes the removal of irrelevant descriptors.
OPERA models for predicting physicochemical properties and environmental fate endpoints.
Mansouri, Kamel; Grulke, Chris M; Judson, Richard S; Williams, Antony J
2018-03-08
The collection of chemical structure information and associated experimental data for quantitative structure-activity/property relationship (QSAR/QSPR) modeling is facilitated by an increasing number of public databases containing large amounts of useful data. However, the performance of QSAR models highly depends on the quality of the data and modeling methodology used. This study aims to develop robust QSAR/QSPR models for chemical properties of environmental interest that can be used for regulatory purposes. This study primarily uses data from the publicly available PHYSPROP database consisting of a set of 13 common physicochemical and environmental fate properties. These datasets have undergone extensive curation using an automated workflow to select only high-quality data, and the chemical structures were standardized prior to calculation of the molecular descriptors. The modeling procedure was developed based on the five Organization for Economic Cooperation and Development (OECD) principles for QSAR models. A weighted k-nearest neighbor approach was adopted using a minimum number of required descriptors calculated using PaDEL, an open-source software package. The genetic algorithms selected only the most pertinent and mechanistically interpretable descriptors (2-15, with an average of 11 descriptors). The sizes of the modeled datasets varied from 150 chemicals for biodegradability half-life to 14,050 chemicals for logP, with an average of 3222 chemicals across all endpoints. The optimal models were built on randomly selected training sets (75%) and validated using fivefold cross-validation (CV) and test sets (25%). The CV Q2 of the models varied from 0.72 to 0.95, with an average of 0.86 and an R2 test value from 0.71 to 0.96, with an average of 0.82. Modeling and performance details are described in QSAR model reporting format and were validated by the European Commission's Joint Research Center to be OECD compliant. 
All models are freely available as an open-source, command-line application called OPEn structure-activity/property Relationship App (OPERA). OPERA models were applied to more than 750,000 chemicals to produce freely available predicted data on the U.S. Environmental Protection Agency's CompTox Chemistry Dashboard.
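The weighted k-nearest neighbor idea mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not OPERA's implementation: the abstract does not state the weighting scheme, so inverse-distance weighting and Euclidean distance are assumptions, and the function name `weighted_knn_predict` is hypothetical.

```python
import math


def weighted_knn_predict(train_x, train_y, query, k=5):
    """Predict a property as the inverse-distance-weighted mean of the
    k nearest training chemicals (assumed weighting, Euclidean distance)."""
    # Pair each training distance with its property value and keep the k closest.
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    # An exact match takes all the weight, which also avoids division by zero.
    for distance, value in nearest:
        if distance == 0.0:
            return value
    weights = [1.0 / distance for distance, _ in nearest]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted_sum / sum(weights)
```

In a workflow like the one described, such a predictor would be fitted on the 75% training split and scored on the held-out 25%; the descriptor vectors here stand in for the PaDEL descriptors chosen by the genetic algorithm.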
Efficient 3D porous microstructure reconstruction via Gaussian random field and hybrid optimization.
Jiang, Z; Chen, W; Burkhart, C
2013-11-01
Obtaining an accurate three-dimensional (3D) structure of a porous microstructure is important for assessing the material properties based on finite element analysis. Whereas directly obtaining 3D images of the microstructure is impractical under many circumstances, two sets of methods have been developed in literature to generate (reconstruct) 3D microstructure from its 2D images: one characterizes the microstructure based on certain statistical descriptors, typically two-point correlation function and cluster correlation function, and then performs an optimization process to build a 3D structure that matches those statistical descriptors; the other method models the microstructure using stochastic models like a Gaussian random field and generates a 3D structure directly from the function. The former obtains a relatively accurate 3D microstructure, but computationally the optimization process can be very intensive, especially for problems with large image size; the latter generates a 3D microstructure quickly but sacrifices the accuracy due to issues in numerical implementations. A hybrid optimization approach of modelling the 3D porous microstructure of random isotropic two-phase materials is proposed in this paper, which combines the two sets of methods and hence maintains the accuracy of the correlation-based method with improved efficiency. The proposed technique is verified for 3D reconstructions based on silica polymer composite images with different volume fractions. A comparison of the reconstructed microstructures and the optimization histories for both the original correlation-based method and our hybrid approach demonstrates the improved efficiency of the approach. © 2013 The Authors Journal of Microscopy © 2013 Royal Microscopical Society.
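The two-point correlation function named above as a statistical descriptor can be illustrated with a small sketch. It assumes a binary 2D image stored as nested lists, periodic boundaries, and separations measured along the x axis only; the function name is hypothetical and the code is not the authors' reconstruction method.

```python
def two_point_correlation(image, max_r):
    """S2(r): fraction of pixel pairs separated by r along x in which both
    pixels belong to phase 1 (periodic boundaries assumed)."""
    rows, cols = len(image), len(image[0])
    total = rows * cols
    s2 = []
    for r in range(max_r + 1):
        hits = sum(
            image[i][j] * image[i][(j + r) % cols]
            for i in range(rows)
            for j in range(cols)
        )
        s2.append(hits / total)
    return s2
```

At r = 0 the function returns the volume fraction of phase 1, which is why correlation-based reconstruction preserves volume fraction when it matches S2.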
A practical introduction to skeletons for the plant sciences
Bucksch, Alexander
2014-01-01
Before the availability of digital photography resulting from the invention of charge-coupled devices in 1969, the measurement of plant architecture was a manual process either on the plant itself or on traditional photographs. The introduction of cheap digital imaging devices for the consumer market enabled the wide use of digital images to capture the shape of plant networks such as roots, tree crowns, or leaf venation. Plant networks contain geometric traits that can establish links to genetic or physiological characteristics, support plant breeding efforts, drive evolutionary studies, or serve as input to plant growth simulations. Typically, traits are encoded in shape descriptors that are computed from imaging data. Skeletons are one class of shape descriptors that are used to describe the hierarchies and extent of branching and looping plant networks. While the mathematical understanding of skeletons is well developed, their application within the plant sciences remains challenging because the quality of the measurement depends partly on the interpretation of the skeleton. This article is meant to bridge the skeletonization literature in the plant sciences and related technical fields by discussing best practices for deriving diameters and approximating branching hierarchies in a plant network. PMID:25202645
Zarzo, Manuel
2015-06-01
Many authors have proposed different schemes of odor classification, which are useful to aid the complex task of describing smells. However, reaching a consensus on a particular classification seems difficult because our psychophysical space of odor description is a continuum and is not clustered into well-defined categories. An alternative approach is to describe the perceptual space of odors as a low-dimensional coordinate system. This idea was first proposed by Crocker and Henderson in 1927, who suggested using numeric profiles based on 4 dimensions: "fragrant," "acid," "burnt," and "caprylic." In the present work, the odor profiles of 144 aroma chemicals were compared by means of statistical regression with comparable numeric odor profiles obtained from 2 databases, enabling a plausible interpretation of the 4 dimensions. Based on the results and taking into account comparable 2D sensory maps of odor descriptors from the literature, a 3D sensory map (odor cube) has been drawn up to improve understanding of the similarities and dissimilarities of the odor descriptors most frequently used in fragrance chemistry. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Ren, Y Y; Zhou, L C; Yang, L; Liu, P Y; Zhao, B W; Liu, H X
2016-09-01
The paper highlights the use of the logistic regression (LR) method in the construction of acceptable, statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. Essentials accounting for a reliable model were all considered carefully. The model predictors were selected by stepwise forward linear discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics were assessed through the influence plot, which reflects the hat-values, studentized residuals, and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.
Automated selection of BI-RADS lesion descriptors for reporting calcifications in mammograms
NASA Astrophysics Data System (ADS)
Paquerault, Sophie; Jiang, Yulei; Nishikawa, Robert M.; Schmidt, Robert A.; D'Orsi, Carl J.; Vyborny, Carl J.; Newstead, Gillian M.
2003-05-01
We are developing an automated computer technique to describe calcifications in mammograms according to the BI-RADS lexicon. We evaluated this technique by its agreement with radiologists' description of the same lesions. Three expert mammographers reviewed our database of 90 cases of digitized mammograms containing clustered microcalcifications and described the calcifications according to BI-RADS. In our study, the radiologists used only 4 of the 5 calcification distribution descriptors and 5 of the 14 calcification morphology descriptors contained in BI-RADS. Our computer technique was therefore designed specifically for these 4 calcification distribution descriptors and 5 calcification morphology descriptors. For calcification distribution, 4 linear discriminant analysis (LDA) classifiers were developed using 5 computer-extracted features to produce scores of how well each descriptor describes a cluster. Similarly, for calcification morphology, 5 LDAs were designed using 10 computer-extracted features. We trained the LDAs using only the BI-RADS data reported by the first radiologist and compared the computer output to the descriptor data reported by all 3 radiologists (for the first radiologist, the leave-one-out method was used). The computer output consisted of the best calcification distribution descriptor and the best 2 calcification morphology descriptors. The results of the comparison with the data from each radiologist, respectively, were: for calcification distribution, percent agreement, 74%, 66%, and 73%, kappa value, 0.44, 0.36, and 0.46; for calcification morphology, percent agreement, 83%, 77%, and 57%, kappa value, 0.78, 0.70, and 0.44. These results indicate that the proposed computer technique can select BI-RADS descriptors in good agreement with radiologists.
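The two agreement measures reported above, percent agreement and kappa, can be computed as in the sketch below. The abstract does not say which kappa statistic was used, so Cohen's kappa for two raters with one label per case is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def agreement_and_kappa(labels_a, labels_b):
    """Return (percent agreement as a fraction, Cohen's kappa) for two raters
    who each assigned one descriptor label per case."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance from each rater's marginal label frequencies.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

Kappa discounts the agreement that two raters would reach by chance, which is why a 74% agreement can correspond to a kappa as low as 0.44 when a few descriptors dominate.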
ERIC Educational Resources Information Center
Rentzou, Konstantina
2013-01-01
All children and young people need to play. The impulse to play is innate. Yet the pure essence of play is playfulness, a notion that is not new but has been little researched. Playfulness refers to the individual style each child has to play, which is linked to personality descriptors and attributes. The present study had a twofold aim. On the one hand, it…
ERIC Educational Resources Information Center
Howard, Pierce J.; Howard, Jane M.
The first section of this monograph shows how, by analyzing the language of personality descriptors, researchers have identified five correlated groups of behaviors. It finds that the most popular formulation of the Five-Factor Model (FFM) is that of Costa and McCrae (1992) and that their nomenclature can be adapted to come up with a version for…
NASA Astrophysics Data System (ADS)
Beig, Niha; Patel, Jay; Prasanna, Prateek; Partovi, Sasan; Varadan, Vinay; Madabhushi, Anant; Tiwari, Pallavi
2017-03-01
Glioblastoma Multiforme (GBM) is a highly aggressive brain tumor with a median survival of 14 months. Hypoxia is a hallmark trait in GBM that is known to be associated with angiogenesis, tumor growth, and resistance to conventional therapy, thereby limiting treatment options for GBM patients. There is thus an urgent clinical need for non-invasively capturing tumor hypoxia in GBM towards identifying a subset of patients who would likely benefit from anti-angiogenic therapies (bevacizumab) in the adjuvant setting. In this study, we employed radiomic descriptors to (a) capture molecular variations of tumor hypoxia on routine MRI that are otherwise not appreciable; and (b) employ the radiomic correlates of hypoxia to discriminate patients with short-term survival (STS, overall survival (OS) < 7 months), mid-term survival (MTS) (7 months
Visual feature discrimination versus compression ratio for polygonal shape descriptors
NASA Astrophysics Data System (ADS)
Heuer, Joerg; Sanahuja, Francesc; Kaup, Andre
2000-10-01
In the last decade several methods for low-level indexing of visual features appeared. Most often these were evaluated with respect to their discrimination power using measures like precision and recall. Accordingly, the targeted application was indexing of visual data within databases. During the standardization process of MPEG-7 the view on indexing of visual data changed, also taking into account communication aspects where coding efficiency is important. Even if the descriptors used for indexing are small compared to the size of images, it is recognized that there can be several descriptors linked to an image, characterizing different features and regions. Besides the importance of a small memory footprint for the transmission of the descriptor and for its storage in a database, the search and filtering can eventually be sped up by reducing the dimensionality of the descriptor if the metric of the matching can be adjusted. Based on a polygon shape descriptor presented for MPEG-7, this paper compares the discrimination power versus memory consumption of the descriptor. Different methods based on quantization are presented and their effect on the retrieval performance is measured. Finally, an optimized computation of the descriptor is presented.
Ariyasena, Thiloka C; Poole, Colin F
2014-09-26
Retention factors on several columns and at various temperatures using gas chromatography and from reversed-phase liquid chromatography on a SunFire C18 column with various mobile phase compositions containing acetonitrile, methanol and tetrahydrofuran as strength adjusting solvents are combined with liquid-liquid partition coefficients in totally organic biphasic systems to calculate descriptors for 23 polycyclic aromatic hydrocarbons and eighteen related compounds of environmental interest. The use of a consistent protocol for the above measurements provides descriptors that are more self-consistent for the estimation of physicochemical properties (octanol-water, air-octanol, air-water, aqueous solubility, and subcooled liquid vapor pressure). The descriptors in this report tend to have smaller values for the L and E descriptors and random differences in the B and S descriptors compared with literature sources. A simple atom fragment constant model is proposed for the estimation of descriptors from structure for polycyclic aromatic hydrocarbons. Unlike the literature values, the new descriptors show no bias in the prediction of the air-water partition coefficient for polycyclic aromatic hydrocarbons. Copyright © 2014 Elsevier B.V. All rights reserved.
Hybrid Histogram Descriptor: A Fusion Feature Representation for Image Retrieval.
Feng, Qinghe; Hao, Qiaohong; Chen, Yuqi; Yi, Yugen; Wei, Ying; Dai, Jiangyan
2018-06-15
Currently, visual sensors are becoming increasingly affordable and fashionable, accelerating the growth in the volume of image data. Image retrieval has attracted increasing interest due to space exploration, industrial, and biomedical applications. Nevertheless, designing effective feature representation is acknowledged as a hard yet fundamental issue. This paper presents a fusion feature representation called a hybrid histogram descriptor (HHD) for image retrieval. The proposed descriptor comprises two histograms jointly: a perceptually uniform histogram which is extracted by exploiting the color and edge orientation information in perceptually uniform regions; and a motif co-occurrence histogram which is acquired by calculating the probability of a pair of motif patterns. To evaluate the performance, we benchmarked the proposed descriptor on RSSCN7, AID, Outex-00013, Outex-00014 and ETHZ-53 datasets. Experimental results suggest that the proposed descriptor is more effective and robust than ten recent fusion-based descriptors under the content-based image retrieval framework. The computational complexity was also analyzed to give an in-depth evaluation. Furthermore, compared with the state-of-the-art convolutional neural network (CNN)-based descriptors, the proposed descriptor also achieves comparable performance, but does not require any training process.
Real-time, resource-constrained object classification on a micro-air vehicle
NASA Astrophysics Data System (ADS)
Buck, Louis; Ray, Laura
2013-12-01
A real-time embedded object classification algorithm is developed through the novel combination of binary feature descriptors, a bag-of-visual-words object model and the cortico-striatal loop (CSL) learning algorithm. The BRIEF, ORB and FREAK binary descriptors are tested and compared to SIFT descriptors with regard to their respective classification accuracies, execution times, and memory requirements when used with CSL on a 12.6 g ARM Cortex embedded processor running at 800 MHz. Additionally, the effect of χ² feature mapping and opponent-color representations used with these descriptors is examined. These tests are performed on four data sets of varying sizes and difficulty, and the BRIEF descriptor is found to yield the best combination of speed and classification accuracy. Its use with CSL achieves accuracies between 67% and 95% of those achieved with SIFT descriptors and allows for the embedded classification of a 128x192 pixel image in 0.15 seconds, 60 times faster than classification with SIFT. χ² mapping is found to provide substantial improvements in classification accuracy for all of the descriptors at little cost, while opponent-color descriptors offer accuracy improvements only on colorful datasets.
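To illustrate why binary descriptors suit a resource-constrained bag-of-visual-words pipeline, the sketch below assigns each binary descriptor to its nearest visual word by Hamming distance, which needs only XOR and bit counting. The function names and the tiny one-byte vocabulary are hypothetical; this is not the paper's implementation and it leaves out the CSL classifier entirely.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def bag_of_words(descriptors, vocabulary):
    """Histogram of nearest-visual-word assignments for one image."""
    histogram = [0] * len(vocabulary)
    for descriptor in descriptors:
        nearest = min(
            range(len(vocabulary)),
            key=lambda i: hamming(descriptor, vocabulary[i]),
        )
        histogram[nearest] += 1
    return histogram
```

The resulting histogram is the fixed-length feature vector that a classifier would consume; real BRIEF, ORB or FREAK descriptors are typically 32 or 64 bytes rather than the single byte used for illustration.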
Local intensity area descriptor for facial recognition in ideal and noise conditions
NASA Astrophysics Data System (ADS)
Tran, Chi-Kien; Tseng, Chin-Dar; Chao, Pei-Ju; Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Lee, Tsair-Fwu
2017-03-01
We propose a local texture descriptor, local intensity area descriptor (LIAD), which is applied for human facial recognition in ideal and noisy conditions. Each facial image is divided into small regions from which LIAD histograms are extracted and concatenated into a single feature vector to represent the facial image. The recognition is performed using a nearest neighbor classifier with histogram intersection and chi-square statistics as dissimilarity measures. Experiments were conducted with LIAD using the ORL database of faces (Olivetti Research Laboratory, Cambridge), the Face94 face database, the Georgia Tech face database, and the FERET database. The results demonstrated the improvement in accuracy of our proposed descriptor compared to conventional descriptors [local binary pattern (LBP), uniform LBP, local ternary pattern, histogram of oriented gradients, and local directional pattern]. Moreover, the proposed descriptor was less sensitive to noise and had low histogram dimensionality. Thus, it is expected to be a powerful texture descriptor that can be used for various computer vision problems.
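The nearest neighbor matching with histogram intersection and chi-square measures mentioned above can be sketched as follows. The chi-square form used here (without a 1/2 factor) is one common variant and is an assumption, as are the function names; the LIAD feature extraction itself is not shown.

```python
def histogram_intersection(h1, h2):
    """Similarity: overlap between two histograms (larger means more alike)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def chi_square(h1, h2):
    """Dissimilarity: chi-square statistic between two histograms,
    skipping bins that are empty in both."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def nearest_neighbor(query, gallery):
    """Return the label of the gallery histogram closest to the query.
    gallery is a list of (label, histogram) pairs."""
    return min(gallery, key=lambda item: chi_square(query, item[1]))[0]
```

Because histogram intersection is a similarity rather than a distance, using it for nearest neighbor search means taking the maximum instead of the minimum.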
A 3D model retrieval approach based on Bayesian networks lightfield descriptor
NASA Astrophysics Data System (ADS)
Xiao, Qinhan; Li, Yanjun
2009-12-01
A new 3D model retrieval methodology is proposed by exploiting a novel Bayesian networks lightfield descriptor (BNLD). There are two key novelties in our approach: (1) a BN-based method for building a lightfield descriptor; and (2) a 3D model retrieval scheme based on the proposed BNLD. To overcome the disadvantages of the existing 3D model retrieval methods, we explore BN for building a new lightfield descriptor. First, the 3D model is placed in a lightfield, from which about 300 binary views are obtained along a sphere; Fourier descriptors and Zernike moment descriptors are then calculated from the binary views, and the resulting shape feature sequence is learned into a BN model using a BN learning algorithm. Second, we propose a new 3D model retrieval method that calculates the Kullback-Leibler divergence (KLD) between BNLDs. Benefiting from statistical learning, our BNLD is more robust to noise than existing methods. A comparison between our method and the lightfield descriptor-based approach is conducted to demonstrate the effectiveness of the proposed methodology.
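As a reminder of what the Kullback-Leibler divergence measures, here is a minimal sketch for two discrete probability distributions. The paper computes KLD between Bayesian network models, which is more involved; the flat distributions, the `eps` floor, and the function name are assumptions made for illustration only.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) in nats for two discrete distributions of equal length.
    q is floored at eps so that zero entries do not cause division by zero."""
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )
```

The divergence is asymmetric, so a retrieval system that uses it must fix which model plays the role of the query, or symmetrise the measure.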
Flightspeed Integral Image Analysis Toolkit
NASA Technical Reports Server (NTRS)
Thompson, David R.
2009-01-01
The Flightspeed Integral Image Analysis Toolkit (FIIAT) is a C library that provides image analysis functions in a single, portable package. It provides basic low-level filtering, texture analysis, and subwindow descriptors for applications dealing with image interpretation and object recognition. Designed with spaceflight in mind, it addresses: ease of integration (minimal external dependencies); fast, real-time operation using integer arithmetic where possible (useful for platforms lacking a dedicated floating-point processor); implementation entirely in C (easily modified); mostly static memory allocation; and 8-bit image data. The basic goal of the FIIAT library is to compute meaningful numerical descriptors for images or rectangular image regions. These n-vectors can then be used directly for novelty detection or pattern recognition, or as a feature space for higher-level pattern recognition tasks. The library provides routines for leveraging training data to derive descriptors that are most useful for a specific data set. Its runtime algorithms exploit a structure known as the "integral image." This is a caching method that permits fast summation of values within rectangular regions of an image. This integral frame facilitates a wide range of fast image-processing functions. This toolkit has applicability to a wide range of autonomous image analysis tasks in the space-flight domain, including novelty detection, object and scene classification, target detection for autonomous instrument placement, and science analysis of geomorphology. It makes real-time texture and pattern recognition possible for platforms with severe computational constraints. The software provides an order of magnitude speed increase over alternative software libraries currently in use by the research community. FIIAT can commercially support intelligent video cameras used in intelligent surveillance. It is also useful for object recognition by robots or other autonomous vehicles.
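The "integral image" structure described above can be sketched in a few lines. FIIAT itself is a C library and its API is not shown in the abstract; the Python functions below are hypothetical names that only illustrate the caching idea, using integer arithmetic throughout.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros,
    so that ii[r][c] is the sum of img[0..r-1][0..c-1]."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            ii[r + 1][c + 1] = img[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii


def region_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1,
    obtained from four table lookups regardless of region size."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
```

After a single pass to build the table, the sum over any rectangle costs four lookups, which is what makes per-window texture statistics affordable on processors without floating-point hardware.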
Liu, Huihui; Wei, Mengbi; Yang, Xianhai; Yin, Cen; He, Xiao
2017-01-01
Partition coefficients are vital parameters for accurately measuring chemical concentrations with passive sampling devices. Given the wide use of low density polyethylene (LDPE) film in passive sampling, we developed a theoretical linear solvation energy relationship (TLSER) model and a quantitative structure-activity relationship (QSAR) model for the prediction of the partition coefficient of chemicals between LDPE and water (K_pew). For chemicals with an octanol-water partition coefficient (log K_ow) < 8, a TLSER model with V_x (McGowan volume) and qA⁻ (the most negative charge on O, N, S, X atoms) as descriptors was developed, but the model had a relatively low determination coefficient (R²) and cross-validated coefficient (Q²). In order to further explore the theoretical mechanisms involved in the partition process, a QSAR model with four descriptors (MLOGP (Moriguchi octanol-water partition coeff.), P_VSA_s_3 (P_VSA-like on I-state, bin 3), Hy (hydrophilic factor) and NssO (number of atoms of type ssO)) was established, and statistical analysis indicated that the model had satisfactory goodness-of-fit, robustness and predictive ability. For chemicals with log K_ow > 8, a TLSER model with V_x and a QSAR model with MLOGP as descriptor were developed. This is the first paper to explore such models for highly hydrophobic chemicals. The applicability domain of the models, characterized by the Euclidean distance-based method and Williams plot, covered a large number of structurally diverse chemicals, which included nearly all the common hydrophobic organic compounds. Additionally, through mechanism interpretation, we explored the structural features governing the partition behavior of chemicals between LDPE and water. Copyright © 2016 Elsevier B.V. All rights reserved.
Qualitative data analysis for an exploratory sensory study of Grechetto wine.
Esti, Marco; González Airola, Ricardo L; Moneta, Elisabetta; Paperaio, Marina; Sinesio, Fiorella
2010-02-15
Grechetto is a traditional white-grape vine, widespread in the Umbria and Lazio regions of central Italy. Despite the wine's commercial diffusion, little literature on its sensory characteristics is available. The present study is an exploratory research conducted with the aim of identifying the sensory markers of Grechetto wine and of evaluating the effect of clone, geographical area, vintage and producer on sensory attributes. A qualitative sensory study was conducted on 16 wines, differing in vintage, Typical Geographic Indication, and clone, collected from 7 wineries, using a trained panel in isolation who referred to a glossary of 133 white wine descriptors. Sixty-five attributes identified by a minimum of 50% of the respondents were submitted to a correspondence analysis to link wine samples to the sensory attributes. Seventeen terms identified as common to all samples are considered as characteristics of Grechetto wine, 10 of which are olfactory: fruity, apple, acacia flower, pineapple, banana, floral, herbaceous, honey, apricot and peach. In order to interpret the relationship between design variables and sensory attributes data on 2005 and 2006 wines, the 28 most discriminating descriptors were projected in a principal component analysis. The first principal component was best described by olfactory terms and the second by gustative attributes. Good reproducibility of results was obtained for the two vintages. For one winery, vintage effect (2002-2006) was described in a new principal component analysis model applied to the 39 most discriminating descriptors, which globally explained about 84% of the variance. In the young wines the notes of sulphur, yeast, dried fruit, butter, combined with herbaceous fresh and tropical fruity notes (melon, grapefruit) were dominant. During wine aging, sweeter notes, like honey, caramel, jam, become more dominant as well as some mineral notes, such as tuff and flint. Copyright 2009 Elsevier B.V. All rights reserved.
Masino, Aaron J; Dechene, Elizabeth T; Dulik, Matthew C; Wilkens, Alisha; Spinner, Nancy B; Krantz, Ian D; Pennington, Jeffrey W; Robinson, Peter N; White, Peter S
2014-07-21
Exome sequencing is a promising method for diagnosing patients with a complex phenotype. However, variant interpretation relative to patient phenotype can be challenging in some scenarios, particularly clinical assessment of rare complex phenotypes. Each patient's sequence reveals many possibly damaging variants that must be individually assessed to establish clear association with patient phenotype. To assist interpretation, we implemented an algorithm that ranks a given set of genes relative to patient phenotype. The algorithm orders genes by the semantic similarity computed between phenotypic descriptors associated with each gene and those describing the patient. Phenotypic descriptor terms are taken from the Human Phenotype Ontology (HPO) and semantic similarity is derived from each term's information content. Model validation was performed via simulation and with clinical data. We simulated 33 Mendelian diseases with 100 patients per disease. We modeled clinical conditions by adding noise and imprecision, i.e. phenotypic terms unrelated to the disease and terms less specific than the actual disease terms. We ranked the causative gene against all 2488 HPO annotated genes. The median causative gene rank was 1 for the optimal and noise cases, 12 for the imprecision case, and 60 for the imprecision with noise case. Additionally, we examined a clinical cohort of subjects with hearing impairment. The disease gene median rank was 22. However, when also considering the patient's exome data and filtering non-exomic and common variants, the median rank improved to 3. Semantic similarity can rank a causative gene highly within a gene list relative to patient phenotype characteristics, provided that imprecision is mitigated. The clinical case results suggest that phenotype rank combined with variant analysis provides significant improvement over the individual approaches. 
We expect that this combined prioritization approach may increase accuracy and decrease effort for clinical genetic diagnosis.
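The phenotype-driven ranking described above can be sketched with an information-content similarity over a toy ontology. The abstract only says that similarity is derived from each term's information content; the Resnik-style measure (information content of the most informative common ancestor), the best-match averaging, and all names and data structures below are assumptions, not the authors' algorithm.

```python
def ancestors(term, parents):
    """Return the term together with all of its ancestors in the ontology.
    parents maps each term to a list of its direct parent terms."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def resnik(term_a, term_b, parents, ic):
    """Information content of the most informative common ancestor."""
    common = ancestors(term_a, parents) & ancestors(term_b, parents)
    return max((ic[t] for t in common), default=0.0)


def gene_score(patient_terms, gene_terms, parents, ic):
    """Average, over patient terms, of the best similarity to any gene term."""
    best = [
        max(resnik(p, g, parents, ic) for g in gene_terms)
        for p in patient_terms
    ]
    return sum(best) / len(best)


def rank_genes(patient_terms, gene_annotations, parents, ic):
    """Order gene names from most to least similar to the patient phenotype."""
    return sorted(
        gene_annotations,
        key=lambda g: gene_score(patient_terms, gene_annotations[g], parents, ic),
        reverse=True,
    )
```

The sketch also shows why imprecise terms hurt ranking: a patient described only by a general ancestor term can never score higher than that ancestor's low information content.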
Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J
2008-01-01
The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, Hotelling T-square test, and, finally, out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps (sets of genes coordinately involved in key biological processes) with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity.
Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.
Fukushima, Hidetada; Panczyk, Micah; Hu, Chengcheng; Dameff, Christian; Chikani, Vatsal; Vadeboncoeur, Tyler; Spaite, Daniel W; Bobrow, Bentley J
2017-08-29
Emergency 9-1-1 callers use a wide range of terms to describe abnormal breathing in persons with out-of-hospital cardiac arrest (OHCA). These breathing descriptors can obstruct the telephone cardiopulmonary resuscitation (CPR) process. We conducted an observational study of emergency call audio recordings linked to confirmed OHCAs in a statewide Utstein-style database. Breathing descriptors fell into 1 of 8 groups (eg, gasping, snoring). We divided the study population into groups with and without descriptors for abnormal breathing to investigate the impact of these descriptors on patient outcomes and telephone CPR process. Callers used descriptors in 459 of 2411 cases (19.0%) between October 1, 2010, and December 31, 2014. Survival outcome was better when the caller used a breathing descriptor (19.6% versus 8.8%, P <0.0001), with an odds ratio of 1.63 (95% confidence interval, 1.17-2.25). After exclusions, 379 of 459 cases were eligible for process analysis. When callers described abnormal breathing, the rates of telecommunicator OHCA recognition, CPR instruction, and telephone CPR were lower than when callers did not use a breathing descriptor (79.7% versus 93.0%, P <0.0001; 65.4% versus 72.5%, P =0.0078; and 60.2% versus 66.9%, P =0.0123, respectively). The time interval between call receipt and OHCA recognition was longer when the caller used a breathing descriptor (118.5 versus 73.5 seconds, P <0.0001). Descriptors of abnormal breathing are associated with improved outcomes but also with delays in the identification of OHCA. Familiarizing telecommunicators with these descriptors may improve the telephone CPR process including OHCA recognition for patients with increased probability of survival. © 2017 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley.
Descriptions and identifications of strangers by youth and adult eyewitnesses.
Pozzulo, Joanna D; Warren, Kelly L
2003-04-01
Two studies varying target gender and mode of target exposure were conducted to compare the quantity, nature, and accuracy of free recall person descriptions provided by youths and adults. In addition, the relation among age, identification accuracy, and number of descriptors reported was considered. Youths (10-14 years) reported fewer descriptors than adults. Exterior facial descriptors (e.g., hair items) were predominant and accurately reported by youths and adults. Accuracy was consistently problematic for youths when reporting body descriptors (e.g., height, weight) and interior facial features. Youths reported a similar number of descriptors when making accurate versus inaccurate identification decisions. This pattern also was consistent for adults. With target-absent lineups, the difference in the number of descriptors reported between adults and youths was greater when making a false positive versus correct rejection.
NASA Astrophysics Data System (ADS)
Al-Temeemy, Ali A.
2018-03-01
A descriptor is proposed for use in domiciliary healthcare monitoring systems. The descriptor is produced from chromatic methodology to extract robust features from the monitoring system's images. It has superior discrimination capabilities, is robust to events that normally disturb monitoring systems, and requires less computational time and storage space to achieve recognition. A method of human region segmentation is also used with this descriptor. The performance of the proposed descriptor was evaluated using experimental data sets, obtained through a series of experiments performed in the Centre for Intelligent Monitoring Systems, University of Liverpool. The evaluation results show high recognition performance for the proposed descriptor in comparison to traditional descriptors, such as moments invariant. The results also show the effectiveness of the proposed segmentation method regarding distortion effects associated with domiciliary healthcare systems.
NASA Astrophysics Data System (ADS)
Willaime, J. M. Y.; Turkheimer, F. E.; Kenny, L. M.; Aboagye, E. O.
2013-01-01
Intra-tumour heterogeneity is a characteristic shared by all cancers. We explored the use of texture variables derived from images of [18F]fluorothymidine-positron emission tomography (FLT-PET), thus notionally assessing the heterogeneity of proliferation in individual tumours. Our aims were to study the range of textural feature values across tissue types, verify the repeatability of these image descriptors and further, to explore associations with clinical response to chemotherapy in breast cancer patients. The repeatability of 28 textural descriptors was assessed in patients who had two FLT-PET scans prior to therapy using relative differences and the intra-class correlation coefficient (ICC). We tested associations between features at baseline and clinical response measured in 11 patients after three cycles of chemotherapy, and explored changes in FLT-PET at one week after the start of therapy. A subset of eight features was characterized by low variations at baseline (<±30%) and high repeatability (0.7 ≤ ICC ≤ 1). The intensity distribution profile suggested fewer highly proliferating cells in lesions of non-responders compared to responders at baseline. A true increase in CV and homogeneity was measured in four out of six responders one week after the start of therapy. A number of textural features derived from FLT-PET are altered following chemotherapy in breast cancer, and should be evaluated in larger clinical trials for clinical relevance.
Al-Kadi, Omar S; Chung, Daniel Y F; Carlisle, Robert C; Coussios, Constantin C; Noble, J Alison
2015-04-01
Intensity variations in image texture can provide powerful quantitative information about physical properties of biological tissue. However, tissue patterns can vary according to the utilized imaging system and are intrinsically correlated to the scale of analysis. In the case of ultrasound, the Nakagami distribution is a general model of the ultrasonic backscattering envelope under various scattering conditions and densities where it can be employed for characterizing image texture, but the subtle intra-heterogeneities within a given mass are difficult to capture via this model as it works at a single spatial scale. This paper proposes a locally adaptive 3D multi-resolution Nakagami-based fractal feature descriptor that extends Nakagami-based texture analysis to accommodate subtle speckle spatial frequency tissue intensity variability in volumetric scans. Local textural fractal descriptors - which are invariant to affine intensity changes - are extracted from volumetric patches at different spatial resolutions from voxel lattice-based generated shape and scale Nakagami parameters. Using ultrasound radio-frequency datasets we found that after applying an adaptive fractal decomposition label transfer approach on top of the generated Nakagami voxels, tissue characterization results were superior to the state of the art. Experimental results on real 3D ultrasonic pre-clinical and clinical datasets suggest that describing tumor intra-heterogeneity via this descriptor may facilitate improved prediction of therapy response and disease characterization. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
A novel binary shape context for 3D local surface description
NASA Astrophysics Data System (ADS)
Dong, Zhen; Yang, Bisheng; Liu, Yuan; Liang, Fuxun; Li, Bijun; Zang, Yufu
2017-08-01
3D local surface description is now at the core of many computer vision technologies, such as 3D object recognition, intelligent driving, and 3D model reconstruction. However, most of the existing 3D feature descriptors still suffer from low descriptiveness, weak robustness, and inefficiency in both time and memory. To overcome these challenges, this paper presents a robust and descriptive 3D Binary Shape Context (BSC) descriptor with high efficiency in both time and memory. First, a novel BSC descriptor is generated for 3D local surface description, and the performance of the BSC descriptor under different settings of its parameters is analyzed. Next, the descriptiveness, robustness, and efficiency in both time and memory of the BSC descriptor are evaluated and compared to those of several state-of-the-art 3D feature descriptors. Finally, the performance of the BSC descriptor for 3D object recognition is also evaluated on a number of popular benchmark datasets, and an urban-scene dataset is collected by a terrestrial laser scanner system. Comprehensive experiments demonstrate that the proposed BSC descriptor obtained high descriptiveness, strong robustness, and high efficiency in both time and memory and achieved high recognition rates of 94.8%, 94.1% and 82.1% on the considered UWA, Queen, and WHU datasets, respectively.
Bhongsatiern, Jiraganya; Stockmann, Chris; Yu, Tian; Constance, Jonathan E; Moorthy, Ganesh; Spigarelli, Michael G; Desai, Pankaj B; Sherwin, Catherine M T
2016-05-01
Growth and maturational changes have been identified as significant covariates in describing variability in clearance of renally excreted drugs such as vancomycin. Because of immaturity of clearance mechanisms, quantification of renal function in neonates is of importance. Several serum creatinine (SCr)-based renal function descriptors have been developed in adults and children, but none are selectively derived for neonates. This review summarizes development of the neonatal kidney and discusses assessment of the renal function regarding estimation of glomerular filtration rate using renal function descriptors. Furthermore, identification of the renal function descriptors that best describe the variability of vancomycin clearance was performed in a sample study of a septic neonatal cohort. Population pharmacokinetic models were developed applying a combination of age-weight, renal function descriptors, or SCr alone. In addition to age and weight, SCr or renal function descriptors significantly reduced variability of vancomycin clearance. The population pharmacokinetic models with Léger and modified Schwartz formulas were selected as the optimal final models, although the other renal function descriptors and SCr provided reasonably good fit to the data, suggesting further evaluation of the final models using external data sets and cross validation. The present study supports incorporation of renal function descriptors in the estimation of vancomycin clearance in neonates. © 2015, The American College of Clinical Pharmacology.
Robust image region descriptor using local derivative ordinal binary pattern
NASA Astrophysics Data System (ADS)
Shang, Jun; Chen, Chuanbo; Pei, Xiaobing; Liang, Hu; Tang, He; Sarem, Mudar
2015-05-01
Binary image descriptors have received a lot of attention in recent years, since they provide numerous advantages, such as low memory footprint and efficient matching strategy. However, they utilize intermediate representations and are generally less discriminative than floating-point descriptors. We propose an image region descriptor, namely local derivative ordinal binary pattern, for object recognition and image categorization. In order to preserve more local contrast and edge information, we quantize the intensity differences between the central pixels and their neighbors of the detected local affine covariant regions in an adaptive way. These differences are then sorted and mapped into binary codes and histogrammed with a weight of the sum of the absolute value of the differences. Furthermore, the gray level of the central pixel is quantized to further improve the discriminative ability. Finally, we combine them to form a joint histogram to represent the features of the image. We observe that our descriptor preserves more local brightness and edge information than traditional binary descriptors. Also, our descriptor is robust to rotation, illumination variations, and other geometric transformations. We conduct extensive experiments on the standard ETHZ and Kentucky datasets for object recognition and PASCAL for image classification. The experimental results show that our descriptor outperforms existing state-of-the-art methods.
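The neighbour-difference coding and weighted histogramming described above can be illustrated with a much simplified sketch. It is not the authors' local derivative ordinal binary pattern: the ordinal sorting, the adaptive quantization and the affine covariant region detection are all omitted, and a classic 8-neighbour binary code is swapped in. The function name `weighted_lbp_histogram` and the plain list-of-lists image format are assumptions made for illustration only.

```python
def weighted_lbp_histogram(image):
    """Simplified illustration: 8-neighbour binary codes, histogrammed with a
    weight equal to the sum of absolute centre-neighbour differences.

    `image` is a list of equally long rows of integer grey levels.
    Border pixels are skipped because they lack a full neighbourhood.
    """
    # Clockwise neighbour offsets (row, column), starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0.0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            center = image[r][c]
            code = 0
            weight = 0
            for bit, (dr, dc) in enumerate(offsets):
                diff = image[r + dr][c + dc] - center
                if diff >= 0:          # neighbour at least as bright as centre
                    code |= 1 << bit
                weight += abs(diff)    # local contrast acts as the vote weight
            hist[code] += weight
    return hist
```

Weighting each vote by local contrast means flat regions contribute little to the histogram, which is one plausible reading of why the abstract reports better preservation of edge information.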
Circular blurred shape model for multiclass symbol recognition.
Escalera, Sergio; Fornés, Alicia; Pujol, Oriol; Lladós, Josep; Radeva, Petia
2011-04-01
In this paper, we propose a circular blurred shape model descriptor to deal with the problem of symbol detection and classification as a particular case of object recognition. The feature extraction is performed by capturing the spatial arrangement of significant object characteristics in a correlogram structure. The shape information from objects is shared among correlogram regions, where a prior blurring degree defines the level of distortion allowed in the symbol, making the descriptor tolerant to irregular deformations. Moreover, the descriptor is rotation invariant by definition. We validate the effectiveness of the proposed descriptor in both the multiclass symbol recognition and symbol detection domains. In order to perform the symbol detection, the descriptors are learned using a cascade of classifiers. In the case of multiclass categorization, the new feature space is learned using a set of binary classifiers which are embedded in an error-correcting output code design. The results over four symbol data sets show the significant improvements of the proposed descriptor compared to the state-of-the-art descriptors. In particular, the results are even more significant in those cases where the symbols suffer from elastic deformations.
The analysis of image feature robustness using cometcloud
Qi, Xin; Kim, Hyunjoo; Xing, Fuyong; Parashar, Manish; Foran, David J.; Yang, Lin
2012-01-01
The robustness of image features is a very important consideration in quantitative image analysis. The objective of this paper is to investigate the robustness of a range of image texture features using hematoxylin stained breast tissue microarray slides which are assessed while simulating different imaging challenges including out of focus, changes in magnification and variations in illumination, noise, compression, distortion, and rotation. We employed five texture analysis methods and tested them while introducing all of the challenges listed above. The texture features that were evaluated include co-occurrence matrix, center-symmetric auto-correlation, texture feature coding method, local binary pattern, and texton. Due to the independence of each transformation and texture descriptor, a network structured combination was proposed and deployed on the Rutgers private cloud. The experiments utilized 20 randomly selected tissue microarray cores. All the combinations of the image transformations and deformations are calculated, and the whole feature extraction procedure was completed in 70 minutes using a cloud equipped with 20 nodes. Center-symmetric auto-correlation outperforms all the other four texture descriptors but also requires the longest computational time. It is roughly 10 times slower than local binary pattern and texton. From a speed perspective, both the local binary pattern and texton features provided excellent performance for classification and content-based image retrieval. PMID:23248759
Molecular properties of steroids involved in their effects on the biophysical state of membranes.
Wenz, Jorge J
2015-10-01
The activity of steroids on membranes was studied in relation to their ordering, rigidifying, condensing and/or raft promoting ability. The structures of 82 steroids were modeled by a semi-empirical procedure (AM1) and 245 molecular descriptors were next computed on the optimized energy conformations. Principal component analysis, mean contrasting and logistic regression were used to correlate the molecular properties with 212 cases of documented activities. It was possible to group steroids based on their properties and activities, indicating that steroids having similar molecular properties have similar activities on membranes. Steroids having high values of area, partition coefficient, volume, number of rotatable bonds, molar refractivity, polarizability or mass displayed ordering, rigidifying, condensing and/or raft promoting activity on membranes higher than those steroids having low values in such molecular properties. After a variable selection procedure circumventing correlation problems among descriptors, area and log P were found as the most relevant properties in governing and predicting the activity of steroids on membranes. A logistic regression model as a function of the area and log P of the steroids is proposed, which is able to predict correctly 92.5% of the cases. A rationale of the findings is discussed. Copyright © 2015 Elsevier B.V. All rights reserved.
Food product design: emerging evidence for food policy.
Al-Hamdani, Mohammed; Smith, Steven
2017-03-01
The impact of specific brand elements such as food descriptors and package colors is underexplored. We tested whether a "light" color and a "low-calorie" descriptor on food packages gain favorable consumer perception ratings as compared with regular packages. Our online experiment recruited 406 adults in a 3 (product type: Chips versus Juice versus Yoghurt) × 2 (descriptor type: regular versus low-calorie) × 2 (color type: regular versus light) mixed design. Dependent variables were sensory (evaluations of the product's nutritional value and quality), product-based (evaluations of the product's physical appeal), and consumer-based (evaluations of the potential consumers of the product) scales. "Low-calorie" descriptors were found to increase sensory ratings as compared with regular descriptors and light-colored packages received higher product-based ratings as compared with their regular-colored counterparts. Food package color and descriptors present a promising avenue for understanding preventative measures against obesity.
Descriptors for ions and ion-pairs for use in linear free energy relationships.
Abraham, Michael H; Acree, William E
2016-01-22
The determination of Abraham descriptors for single ions is reviewed, and equations are given for the partition of single ions from water to a number of solvents. These ions include permanent anions and cations and ionic species such as carboxylic acid anions, phenoxide anions and protonated base cations. Descriptors for a large number of ions and ionic species are listed, and equations for the prediction of Abraham descriptors for ionic species are given. The application of descriptors for ions and ionic species to physicochemical processes is given; these are to water-solvent partitions, HPLC retention data, immobilised artificial membranes, the Finkelstein reaction and diffusion in water. Applications to biological processes include brain permeation, microsomal degradation of drugs, skin permeation and human intestinal absorption. The review concludes with a section on the determination of descriptors for ion-pairs. Copyright © 2015 Elsevier B.V. All rights reserved.
Identifying factors of comfort in using hand tools.
Kuijt-Evers, L F M; Groenesteijn, L; de Looze, M P; Vink, P
2004-09-01
To design comfortable hand tools, knowledge about comfort/discomfort in using hand tools is required. We investigated which factors determine comfort/discomfort in using hand tools according to users. Therefore, descriptors of comfort/discomfort in using hand tools were collected from literature and interviews. After that, the relatedness of a selection of the descriptors to comfort in using hand tools was investigated. Six comfort factors could be distinguished (functionality, posture and muscles, irritation and pain of hand and fingers, irritation of hand surface, handle characteristics, aesthetics). These six factors can be classified into three meaningful groups: functionality, physical interaction and appearance. The main conclusions were that (1) the same descriptors were related to comfort and discomfort in using hand tools, (2) descriptors of functionality are most related to comfort in using hand tools followed by descriptors of physical interaction and (3) descriptors of appearance become secondary in comfort in using hand tools.
Design of an optimal preview controller for linear discrete-time descriptor systems with state delay
NASA Astrophysics Data System (ADS)
Cao, Mengjuan; Liao, Fucheng
2015-04-01
In this paper, the linear discrete-time descriptor system with state delay is studied, and a design method for an optimal preview controller is proposed. First, by using the discrete lifting technique, the original system is transformed into a general descriptor system without state delay in form. Then, taking advantage of the first-order forward difference operator, we construct a descriptor augmented error system, including the state vectors of the lifted system, error vectors, and desired target signals. Rigorous mathematical proofs are given for the regularity, stabilisability, causal controllability, and causal observability of the descriptor augmented error system. Based on these, the optimal preview controller with preview feedforward compensation for the original system is obtained by using the standard optimal regulator theory of the descriptor system. The effectiveness of the proposed method is shown by numerical simulation.
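For orientation, a generic form of the system class named above can be written as follows. This is a sketch under assumed notation, not the paper's own equations: the symbols $E$, $A$, $A_d$, $B$, $C$, the delay $d$ and the target signal $r(k)$ are illustrative choices, with $E$ allowed to be singular (which is what distinguishes a descriptor system from an ordinary state-space system).

```latex
\begin{align}
  E\,x(k+1) &= A\,x(k) + A_d\,x(k-d) + B\,u(k), \\
  y(k)      &= C\,x(k), \\
  e(k)      &= y(k) - r(k), \\
  \Delta x(k) &= x(k+1) - x(k).
\end{align}
```

Here $e(k)$ is the tracking error and $\Delta$ is the first-order forward difference operator that the abstract applies when building the augmented error system.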
Mass transport modelling for the electroreduction of CO2 on Cu nanowires
NASA Astrophysics Data System (ADS)
Raciti, David; Mao, Mark; Wang, Chao
2018-01-01
Mass transport plays an important role in CO2 reduction electrocatalysis. Although its effects are more pronounced on nanostructured electrodes, studies of mass transport for CO2 reduction have so far been limited to planar electrodes. We report here the development of a mass transport model for the electroreduction of CO2 on Cu nanowire electrodes. Fed with the experimental data from electrocatalytic studies, the local concentrations of CO2, HCO3-, CO3^2- and OH- on the nanostructured electrodes are calculated by solving the diffusion equations with spatially distributed electrochemical reaction terms incorporated. The mass transport effects on the catalytic activity and selectivity of the Cu nanowire electrocatalysts are thus discussed by using the local pH as the descriptor. The established correlations between the electrocatalytic performance and the local pH show that the latter not only determines the acid-base reaction equilibrium but also regulates the mass transport and reaction kinetics. Based on these findings, the optimal range of local pH for CO2 reduction is discussed in terms of a fine balance among the suppression of hydrogen evolution, improvement of C2 product selectivity and limitation of CO2 supply. Our work highlights the importance of understanding the mass transport effects in interpretation of CO2 reduction electrocatalysis on high-surface-area catalysts.
Impact of low alcohol verbal descriptors on perceived strength: An experimental study.
Vasiljevic, Milica; Couturier, Dominique-Laurent; Marteau, Theresa M
2018-02-01
Low alcohol labels are a set of labels that carry descriptors such as 'low' or 'lighter' to denote alcohol content in beverages. There is growing interest from policymakers and producers in lower strength alcohol products. However, there is a lack of evidence on how the general population perceives verbal descriptors of strength. The present research examines consumers' perceptions of strength (% ABV) and appeal of alcohol products using low or high alcohol verbal descriptors. In a within-subjects experimental study, participants rated the strength and appeal of 18 terms denoting low (nine terms), high (eight terms) and regular (one term) strengths for either (1) wine or (2) beer according to drinking preference. One thousand six hundred adults (796 wine and 804 beer drinkers) were sampled from a nationally representative UK panel. Low, Lower, Light, Lighter, and Reduced formed a cluster and were rated as denoting lower strength products than Regular, but higher strength than the cluster with intensifiers consisting of Extra Low, Super Low, Extra Light, and Super Light. Similar clustering in perceived strength was observed amongst the high verbal descriptors. Regular was the most appealing strength descriptor, with the low and high verbal descriptors using intensifiers rated least appealing. The perceived strength and appeal of alcohol products diminished the more the verbal descriptors implied a deviation from Regular. The implications of these findings are discussed in terms of policy implications for lower strength alcohol labelling and associated public health outcomes. Statement of contribution What is already known about this subject? Current UK and EU legislation limits the number of low strength verbal descriptors and the associated alcohol by volume (ABV) to 1.2% ABV and lower. There is growing interest from policymakers and producers to extend the range of lower strength alcohol products above the current cap of 1.2% ABV set out in national legislation. 
There is a lack of evidence on how the general population perceives verbal descriptors of alcohol product strength (both low and high). What does this study add? Verbal descriptors of lower strength wine and beer form two clusters and effectively communicate reduced alcohol content. Low, Lower, Light, Lighter, and Reduced were considered lower in strength than Regular (average % ABV). Descriptors using intensifiers (Extra Low, Super Low, Extra Light, and Super Light) were considered lowest in strength. Similar clustering in perceived strength was observed amongst the high verbal descriptors. The appeal of alcohol products reduced the more the verbal descriptors implied a deviation from Regular. © 2017 The Authors. British Journal of Health Psychology published by John Wiley & Sons Ltd on behalf of British Psychological Society.
Fatigue reliability of deck structures subjected to correlated crack growth
NASA Astrophysics Data System (ADS)
Feng, G. Q.; Garbatov, Y.; Guedes Soares, C.
2013-12-01
The objective of this work is to analyse the fatigue reliability of deck structures subjected to correlated crack growth. The stress intensity factors of the correlated cracks are obtained by finite element analysis, from which the geometry correction functions are derived. Monte Carlo simulations are applied to predict the statistical descriptors of correlated cracks based on the Paris-Erdogan equation. A probabilistic model of crack growth as a function of time is used to analyse the fatigue reliability of deck structures accounting for the crack propagation correlation. A deck structure is modelled as a series system of stiffened panels, where a stiffened panel is regarded as a parallel system composed of plates and longitudinals. It has been proven that the method developed here can be conveniently applied to perform the fatigue reliability assessment of structures subjected to correlated crack growth.
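A minimal Monte Carlo sketch of correlated crack growth under the Paris-Erdogan law, da/dN = C (ΔK)^m with ΔK = Y Δσ sqrt(π a), is given below. It is an illustration only and not the authors' model: the geometry factor Y is taken as constant (the paper derives it from finite element analysis), only two cracks are simulated, the system is treated as a simple series of the two, and every numerical value (initial and critical crack sizes, stress range, median and scatter of C, correlation `rho`) is an assumed placeholder.

```python
import math
import random


def cycles_to_critical(a0, a_crit, C, m, delta_sigma, Y=1.0):
    """Cycles for a crack to grow from a0 to a_crit (metres) under
    da/dN = C * (Y * delta_sigma * sqrt(pi * a))**m.

    Closed-form integral, valid for constant Y and m != 2.
    """
    k = C * (Y * delta_sigma * math.sqrt(math.pi)) ** m
    p = 1.0 - m / 2.0
    return (a_crit ** p - a0 ** p) / (k * p)


def simulate_lives(n_samples=10000, rho=0.5, seed=0,
                   a0=1e-3, a_crit=5e-2, m=3.0, delta_sigma=50.0,
                   c_median=1e-11, sigma_ln_c=0.3):
    """Sample fatigue lives of two cracks whose ln(C) values are correlated.

    All default parameter values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    lives = []
    for _ in range(n_samples):
        # Correlated standard normals via a 2x2 Cholesky factor.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        c1 = c_median * math.exp(sigma_ln_c * z1)
        c2 = c_median * math.exp(sigma_ln_c * z2)
        n1 = cycles_to_critical(a0, a_crit, c1, m, delta_sigma)
        n2 = cycles_to_critical(a0, a_crit, c2, m, delta_sigma)
        # Series assumption: the first crack to reach a_crit governs.
        lives.append(min(n1, n2))
    return lives


def failure_probability(lives, n_cycles):
    """Fraction of simulated lives that do not exceed n_cycles."""
    return sum(life <= n_cycles for life in lives) / len(lives)
```

Calling `failure_probability(simulate_lives(), n)` for increasing `n` traces out an estimated failure probability as a function of load cycles; rerunning with different `rho` shows how correlation between the cracks changes that curve.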
NASA Astrophysics Data System (ADS)
Consonni, Viviana; Todeschini, Roberto
In recent decades, much scientific research has focused on how to convert - by a theoretical pathway - the information encoded in the molecular structure into one or more numbers used to establish quantitative relationships between structures and properties, biological activities, or other experimental properties. Molecular descriptors are formally mathematical representations of a molecule obtained by a well-specified algorithm applied to a defined molecular representation or a well-specified experimental procedure. They play a fundamental role in chemistry, pharmaceutical sciences, environmental protection policy, toxicology, ecotoxicology, health research, and quality control. Evidence of the interest of the scientific community in molecular descriptors is provided by the huge number of descriptors proposed to date: more than 5000 descriptors derived from different theories and approaches are defined in the literature and most of them can be calculated by means of dedicated software applications. Molecular descriptors are of outstanding importance in the research fields of quantitative structure-activity relationships (QSARs) and quantitative structure-property relationships (QSPRs), where they are the independent chemical information used to predict the properties of interest. Along with the definition of appropriate molecular descriptors, the molecular structure representation and the mathematical tools for deriving and assessing models are other fundamental components of the QSAR/QSPR approach. The remarkable progress during the last few years in chemometrics and chemoinformatics has led to new strategies for finding mathematically meaningful relationships between the molecular structure and biological activities, physico-chemical, toxicological, and environmental properties of chemicals. 
Different approaches for deriving molecular descriptors are reviewed here, and some of the most relevant descriptors are presented in detail with numerical examples.
Mobile visual object identification: from SIFT-BoF-RANSAC to Sketchprint
NASA Astrophysics Data System (ADS)
Voloshynovskiy, Sviatoslav; Diephuis, Maurits; Holotyak, Taras
2015-03-01
Mobile object identification based on visual features finds many applications in the interaction with physical objects and in security. Discriminative and robust content representation plays a central role in object and content identification. Complex post-processing methods are used to compress descriptors and their geometrical information, aggregate them into more compact and discriminative representations and finally re-rank the results based on the similarity geometries of descriptors. Unfortunately, most of the existing descriptors are not very robust and discriminative once applied to varied content such as real images, text or noise-like microstructures, and they require at least 500-1'000 descriptors per image for reliable identification. At the same time, the geometric re-ranking procedures are still too complex to be applied to the numerous candidates obtained from the feature similarity based search only. This restricts the list of candidates to fewer than 1'000, which obviously causes a higher probability of miss. In addition, the security and privacy of content representation have become a hot research topic in the multimedia and security communities. In this paper, we introduce a new framework for non-local content representation based on SketchPrint descriptors. It extends the properties of local descriptors to a more informative and discriminative, yet geometrically invariant content representation. In particular it allows images to be compactly represented by 100 SketchPrint descriptors without being fully dependent on re-ranking methods. We consider several use cases, applying SketchPrint descriptors to natural images, text documents, packages and micro-structures and compare them with the traditional local descriptors.
Mobile Visual Search Based on Histogram Matching and Zone Weight Learning
NASA Astrophysics Data System (ADS)
Zhu, Chuang; Tao, Li; Yang, Fan; Lu, Tao; Jia, Huizhu; Xie, Xiaodong
2018-01-01
In this paper, we propose a novel image retrieval algorithm for mobile visual search. First, a short visual codebook is generated based on the descriptor database to represent the statistical information of the dataset. Then, an accurate local descriptor similarity score is computed by merging the tf-idf weighted histogram matching and the weighting strategy in compact descriptors for visual search (CDVS). Finally, both the global descriptor matching score and the local descriptor similarity score are summed up to rerank the retrieval results according to the learned zone weights. The results show that the proposed approach outperforms the state-of-the-art image retrieval method in CDVS.
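The tf-idf weighted histogram matching step can be sketched generically as follows. This is an assumed, textbook-style formulation (idf-weighted histogram intersection over visual-word counts), not the scoring used in the paper or in CDVS; the function names and the dictionary representation of a histogram (visual-word id mapped to count) are hypothetical.

```python
import math
from collections import Counter


def idf_weights(database):
    """Inverse document frequency per visual word.

    `database` is a list of histograms, each a dict {word_id: count}.
    """
    n_images = len(database)
    doc_freq = Counter()
    for hist in database:
        doc_freq.update(w for w, count in hist.items() if count > 0)
    return {w: math.log(n_images / df) for w, df in doc_freq.items()}


def tfidf_intersection(query, candidate, idf):
    """idf-weighted histogram intersection, normalised by the query mass.

    Returns a score in [0, 1]; words unknown to `idf` contribute nothing.
    """
    num = sum(idf.get(w, 0.0) * min(count, candidate.get(w, 0))
              for w, count in query.items())
    den = sum(idf.get(w, 0.0) * count for w, count in query.items())
    return num / den if den > 0 else 0.0


def rank(query, database):
    """Database indices sorted from best to worst match."""
    idf = idf_weights(database)
    scores = [tfidf_intersection(query, hist, idf) for hist in database]
    return sorted(range(len(database)), key=lambda i: scores[i], reverse=True)
```

In a full pipeline such a local score would be combined with a global descriptor score before reranking, as the abstract describes; that combination and the learned zone weights are not modelled here.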
2017-01-01
Electrospray ionization (ESI) is widely used in liquid chromatography coupled to mass spectrometry (LC–MS) for the analysis of biomolecules. However, the ESI process is still not completely understood, and it is often a matter of trial and error to enhance ESI efficiency and, hence, the response of a given set of compounds. In this work we performed a systematic study of the ESI response of 14 amino acids that were acylated with organic acid anhydrides of increasing chain length and with poly(ethylene glycol) (PEG) changing certain physicochemical properties in a predictable manner. By comparing the ESI response of 70 derivatives, we found that there was a strong correlation between the calculated molecular volume and the ESI response, while correlation with hydrophobicity (log P values), pKa, and the inverse calculated surface tension was significantly lower although still present, especially for individual derivatized amino acids with increasing acyl chain lengths. Acylation with PEG containing five ethylene glycol units led to the largest gain in ESI response. This response was maximal independent of the calculated physicochemical properties or the type of amino acid. Since no actual physicochemical data is available for most derivatized compounds, the responses were also used as input for a quantitative structure–property relationship (QSPR) model to find the best physicochemical descriptors relating to the ESI response from molecular structures using the amino acids and their derivatives as a reference set. A topological descriptor related to molecular size (SPAN) was isolated next to a descriptor related to the atomic composition and structural groups (BIC0). The validity of the model was checked with a test set of 43 additional compounds that were unrelated to amino acids. While prediction was generally good (R2 > 0.9), compounds containing halogen atoms or nitro groups gave a lower predicted ESI response. PMID:28737384
Fang, Jiansong; Yang, Ranyao; Gao, Li; Zhou, Dan; Yang, Shengqian; Liu, Ai-Lin; Du, Guan-hua
2013-11-25
Butyrylcholinesterase (BuChE, EC 3.1.1.8) is an important pharmacological target for Alzheimer's disease (AD) treatment. However, the currently available BuChE inhibitor screening assays are expensive, labor-intensive, and compound-dependent. It is necessary to develop robust in silico methods to predict the activities of BuChE inhibitors for lead identification. In this investigation, support vector machine (SVM) models and naive Bayesian models were built to discriminate BuChE inhibitors (BuChEIs) from the noninhibitors. Each molecule was initially represented by 1870 structural descriptors (1235 from ADRIANA.Code, 334 from MOE, and 301 from Discovery Studio). Correlation analysis and a stepwise variable selection method were applied to identify activity-related descriptors for the prediction models. Additionally, structural fingerprint descriptors were added to improve the predictive ability of the models, which was measured by cross-validation, a test set validation with 1001 compounds and an external test set validation with 317 diverse chemicals. The best two models gave Matthews correlation coefficients of 0.9551 and 0.9550 for the test set and 0.9132 and 0.9221 for the external test set. To demonstrate the practical applicability of the models in virtual screening, we screened an in-house data set with 3601 compounds, and 30 compounds were selected for further bioactivity assay. The assay results showed that 10 out of 30 compounds exerted significant BuChE inhibitory activities with IC50 values ranging from 0.32 to 22.22 μM, among which three new scaffolds as BuChE inhibitors were identified for the first time. To the best of our knowledge, this is the first report on BuChE inhibitors using machine learning approaches. The models generated from SVM and naive Bayesian approaches successfully predicted BuChE inhibitors. The study proved the feasibility of a new method for predicting bioactivities of ligands and discovering novel lead compounds.
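For readers unfamiliar with the Matthews correlation coefficient used to score the classifiers above, the standard definition from a binary confusion matrix can be computed as in the sketch below. The counts in the usage comment are hypothetical and are not taken from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction). Returns 0.0 when the denominator
    vanishes, a common convention for degenerate matrices.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical example: 90 inhibitors and 85 noninhibitors classified
# correctly, 15 false positives and 10 false negatives.
example_score = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
```

Unlike plain accuracy, the coefficient stays informative when inhibitors and noninhibitors are present in very different proportions, which is a likely reason it is reported for screening models.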
Sharma, Ashok K.; Srivastava, Gopal N.; Roy, Ankita; Sharma, Vineet K.
2017-01-01
The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84–0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. 
The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules. PMID:29249969
Learning Rotation-Invariant Local Binary Descriptor.
Duan, Yueqi; Lu, Jiwen; Feng, Jianjiang; Zhou, Jie
2017-08-01
In this paper, we propose a rotation-invariant local binary descriptor (RI-LBD) learning method for visual recognition. Compared with hand-crafted local binary descriptors, such as local binary pattern and its variants, which require strong prior knowledge, local binary feature learning methods are more efficient and data-adaptive. Unlike existing learning-based local binary descriptors, such as compact binary face descriptor and simultaneous local binary feature learning and encoding, which are susceptible to rotations, our RI-LBD first categorizes each local patch into a rotational binary pattern (RBP), and then jointly learns the orientation for each pattern and the projection matrix to obtain RI-LBDs. As all the rotation variants of a patch belong to the same RBP, they are rotated into the same orientation and projected into the same binary descriptor. Then, we construct a codebook by a clustering method on the learned binary codes, and obtain a histogram feature for each image as the final representation. In order to exploit higher order statistical information, we extend our RI-LBD to the triple rotation-invariant co-occurrence local binary descriptor (TRICo-LBD) learning method, which learns a triple co-occurrence binary code for each local patch. Extensive experimental results on four different visual recognition tasks, including image patch matching, texture classification, face recognition, and scene classification, show that our RI-LBD and TRICo-LBD outperform most existing local descriptors.
Bradley, Jean-Claude; Abraham, Michael H; Acree, William E; Lang, Andrew Sid; Beck, Samantha N; Bulger, David A; Clark, Elizabeth A; Condron, Lacey N; Costa, Stephanie T; Curtin, Evan M; Kurtu, Sozit B; Mangir, Mark I; McBride, Matthew J
2015-01-01
Calculating Abraham descriptors from solubility values requires that the solute have the same form when dissolved in all solvents. However, carboxylic acids can form dimers when dissolved in non-polar solvents. For such compounds, Abraham descriptors can be calculated for both the monomeric and dimeric forms by treating the polar and non-polar systems separately. We illustrate how this can be done by calculating the Abraham descriptors for both the monomeric and dimeric forms of trans-cinnamic acid, the first time that descriptors for a carboxylic acid dimer have been obtained. Abraham descriptors were calculated for the monomeric form of trans-cinnamic acid using experimental solubility measurements in polar solvents from the Open Notebook Science Challenge together with a number of water-solvent partition coefficients from the literature. Similarly, experimental solubility measurements in non-polar solvents were used to determine Abraham descriptors for the trans-cinnamic acid dimer. Having descriptors for both the monomeric and dimeric forms allows the prediction of further solubilities of trans-cinnamic acid in both polar and non-polar solvents with an error of about 0.10 log units. Graphical abstract: Molar concentration of trans-cinnamic acid in various polar and non-polar solvents.
Marrero-Ponce, Yovani; Medina-Marrero, Ricardo; Castillo-Garit, Juan A; Romero-Zaldivar, Vicente; Torrens, Francisco; Castro, Eduardo A
2005-04-15
A novel approach to bio-macromolecular design from a linear algebra point of view is introduced. A protein's total (whole protein) and local (one or more amino acid) linear indices are a new set of bio-macromolecular descriptors of relevance to protein QSAR/QSPR studies. These amino-acid level biochemical descriptors are based on the calculation of linear maps on $\mathbb{R}^n$ [$f_k(x_{mi}): \mathbb{R}^n \to \mathbb{R}^n$] in the canonical basis. These bio-macromolecular indices are calculated from the kth power of the macromolecular pseudograph alpha-carbon atom adjacency matrix. Total linear indices are linear functionals on $\mathbb{R}^n$. That is, the kth total linear indices are linear maps from $\mathbb{R}^n$ to the scalar field $\mathbb{R}$ [$f_k(x_m): \mathbb{R}^n \to \mathbb{R}$]. Thus, the kth total linear indices are calculated by summing the amino-acid linear indices of all amino acids in the protein molecule. A study of the protein stability effects for a complete set of alanine substitutions in the Arc repressor illustrates this approach. A quantitative model that discriminates near wild-type stability alanine mutants from the reduced-stability ones in a training series was obtained. This model permitted the correct classification of 97.56% (40/41) and 91.67% (11/12) of proteins in the training and test set, respectively. It shows a high Matthews correlation coefficient (MCC = 0.952) for the training set and an MCC = 0.837 for the external prediction set. Additionally, canonical regression analysis corroborated the statistical quality of the classification model ($R_{canc} = 0.824$). This analysis was also used to compute biological stability canonical scores for each Arc alanine mutant. In addition, the linear piecewise regression model compared favorably with the linear regression one in predicting the melting temperature ($t_m$) of the Arc alanine mutants. The linear model explains almost 81% of the variance of the experimental $t_m$ ($R = 0.90$ and $s = 4.29$), and the LOO PRESS statistics evidenced its predictive ability ($q^2 = 0.72$ and $s_{cv} = 4.79$).
Moreover, the TOMOCOMD-CAMPS method produced a linear piecewise regression ($R = 0.97$) between protein backbone descriptors and $t_m$ values for alanine mutants of the Arc repressor. A break-point value of 51.87 °C characterized two mutant clusters and coincided perfectly with the experimental scale. For this reason, we can use the linear discriminant analysis and piecewise models in combination to classify and predict the stability of the mutant Arc homodimers. These models also permitted the interpretation of the driving forces of this folding process, indicating that topologic/topographic protein backbone interactions control the stability profile of wild-type Arc and its alanine mutants.
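The construction described in the abstract above (local indices from the kth power of an adjacency matrix, total index as the sum of the local ones) can be sketched as follows. This is only an illustration under stated assumptions: the three-residue chain, the property vector `x`, and all function names are hypothetical, and a plain graph stands in for the alpha-carbon pseudograph, which may carry additional entries.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(m, k):
    """k-th power of a square matrix (k = 0 gives the identity)."""
    n = len(m)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, m)
    return result

def local_linear_indices(adj, x, k):
    """f_k(x_i) = sum_j (M^k)_ij * x_j for every residue i."""
    mk = mat_power(adj, k)
    return [sum(mk[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]

def total_linear_index(adj, x, k):
    """k-th total index: sum of the k-th local indices over all residues."""
    return sum(local_linear_indices(adj, x, k))

# Hypothetical 3-residue chain and made-up amino-acid property values.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
x = [1, 2, 3]
print(local_linear_indices(adjacency, x, 2), total_linear_index(adjacency, x, 2))
```

Increasing k lets each local index collect property values from residues further along the backbone graph, which is what makes the family of indices sensitive to topology.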
Gottschlich, Carsten
2016-01-01
We present a new type of local image descriptor which yields binary patterns from small image patches. For the application to fingerprint liveness detection, we achieve rotation invariant image patches by taking the fingerprint segmentation and orientation field into account. We compute the discrete cosine transform (DCT) for these rotation invariant patches and attain binary patterns by comparing pairs of two DCT coefficients. These patterns are summarized into one or more histograms per image. Each histogram comprises the relative frequencies of pattern occurrences. Multiple histograms are concatenated and the resulting feature vector is used for image classification. We name this novel type of descriptor convolution comparison pattern (CCP). Experimental results show the usefulness of the proposed CCP descriptor for fingerprint liveness detection. CCP outperforms other local image descriptors such as LBP, LPQ and WLD on the LivDet 2013 benchmark. The CCP descriptor is a general type of local image descriptor which we expect to prove useful in areas beyond fingerprint liveness detection such as biological and medical image processing, texture recognition, face recognition and iris recognition, liveness detection for face and iris images, and machine vision for surface inspection and material classification. PMID:26844544
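The core of the descriptor described above (DCT of a patch, bits from pairwise coefficient comparisons, histogram of relative pattern frequencies) can be sketched in a few lines. This is a simplified illustration, not the published implementation: the coefficient pairs in `PAIRS` are hypothetical, the DCT is left unnormalized, and the rotation-invariant patch extraction from the orientation field is omitted.

```python
import math
from collections import Counter

def dct2(patch):
    """Unnormalized 2D DCT-II of a square patch; result[u][v] has
    vertical frequency u and horizontal frequency v."""
    n = len(patch)
    return [[sum(patch[y][x]
                 * math.cos(math.pi * (2 * y + 1) * u / (2 * n))
                 * math.cos(math.pi * (2 * x + 1) * v / (2 * n))
                 for y in range(n) for x in range(n))
             for v in range(n)]
            for u in range(n)]

# Hypothetical choice of coefficient pairs to compare.
PAIRS = [((0, 0), (0, 1)), ((0, 1), (1, 0))]

def binary_pattern(patch, pairs=PAIRS):
    """One bit per pair: 1 if the first coefficient exceeds the second."""
    c = dct2(patch)
    return tuple(1 if c[a][b] > c[p][q] else 0 for (a, b), (p, q) in pairs)

def pattern_histogram(patches, pairs=PAIRS):
    """Relative frequency of each binary pattern over a set of patches."""
    counts = Counter(binary_pattern(p, pairs) for p in patches)
    return {pattern: n / len(patches) for pattern, n in counts.items()}

horizontal_ramp = [[0, 1, 2, 3] for _ in range(4)]
vertical_ramp = [[y] * 4 for y in range(4)]
print(pattern_histogram([horizontal_ramp, vertical_ramp]))
```

Several such histograms, for example from different pair sets, would then be concatenated into the feature vector passed to the classifier.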
Yuan, Yaxia; Zheng, Fang; Zhan, Chang-Guo
2018-03-21
Blood-brain barrier (BBB) permeability of a compound determines whether the compound can effectively enter the brain. It is an essential property which must be accounted for in drug discovery with a target in the brain. Several computational methods have been used to predict the BBB permeability. In particular, support vector machine (SVM), which is a kernel-based machine learning method, has been used popularly in this field. For SVM training and prediction, the compounds are characterized by molecular descriptors. Some SVM models were based on the use of molecular property-based descriptors (including 1D, 2D, and 3D descriptors) or fragment-based descriptors (known as the fingerprints of a molecule). The selection of descriptors is critical for the performance of a SVM model. In this study, we aimed to develop a generally applicable new SVM model by combining all of the features of the molecular property-based descriptors and fingerprints to improve the accuracy for the BBB permeability prediction. The results indicate that our SVM model has improved accuracy compared to the currently available models of the BBB permeability prediction.
Effect of the image resolution on the statistical descriptors of heterogeneous media.
Ledesma-Alonso, René; Barbosa, Romeli; Ortegón, Jaime
2018-02-01
The characterization and reconstruction of heterogeneous materials, such as porous media and electrode materials, involve the application of image processing methods to data acquired by scanning electron microscopy or other microscopy techniques. Among them, binarization and decimation are critical in order to compute the correlation functions that characterize the microstructure of the above-mentioned materials. In this study, we present a theoretical analysis of the effects of the image-size reduction, due to the progressive and sequential decimation of the original image. Three different decimation procedures (random, bilinear, and bicubic) were implemented and their consequences on the discrete correlation functions (two-point, line-path, and pore-size distribution) and the coarseness (derived from the local volume fraction) are reported and analyzed. The chosen statistical descriptors (correlation functions and coarseness) are typically employed to characterize and reconstruct heterogeneous materials. A normalization for each of the correlation functions has been performed. When the loss of statistical information has not been significant for a decimated image, its normalized correlation function is forecast by the trend of the original image (reference function). In contrast, when the decimated image does not hold statistical evidence of the original one, the normalized correlation function diverts from the reference function. Moreover, the equally weighted sum of the average of the squared difference, between the discrete correlation functions of the decimated images and the reference functions, leads to a definition of an overall error. During the first stages of the gradual decimation, the error remains relatively small and independent of the decimation procedure. Above a threshold defined by the correlation length of the reference function, the error becomes a function of the number of decimation steps. 
At this stage, some statistical information is lost and the error becomes dependent on the decimation procedure. These results may help us to restrict the amount of information that one can afford to lose during a decimation process, in order to reduce the computational and memory cost, when one aims to diminish the time consumed by a characterization or reconstruction technique, yet maintaining the statistical quality of the digitized sample.
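As a rough illustration of the quantities involved, the sketch below evaluates a discrete two-point correlation function along image rows for a binarized image and applies a plain subsampling step. Both functions are assumptions made for illustration: the subsampling is a stand-in and is not one of the three decimation procedures (random, bilinear, bicubic) studied above, and the correlation is sampled only in the horizontal direction without periodic boundaries.

```python
def two_point_correlation(image, r):
    """Fraction of horizontal pixel pairs at separation r where both
    pixels belong to phase 1. At r = 0 this is the volume fraction."""
    hits = 0
    pairs = 0
    for row in image:
        for x in range(len(row) - r):
            pairs += 1
            if row[x] == 1 and row[x + r] == 1:
                hits += 1
    return hits / pairs if pairs else 0.0

def subsample(image, step=2):
    """Keep every step-th pixel in both directions (illustrative stand-in
    for a decimation procedure)."""
    return [row[::step] for row in image[::step]]

image = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
coarse = subsample(image)
print([two_point_correlation(image, r) for r in range(3)])
print(two_point_correlation(coarse, 0))
```

Comparing such curves before and after each decimation step is the kind of check that the overall error defined in the abstract summarizes in a single number.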
NASA Astrophysics Data System (ADS)
Trimborn, Barbara; Wolf, Ivo; Abu-Sammour, Denis; Henzler, Thomas; Schad, Lothar R.; Zöllner, Frank G.
2017-03-01
Image registration of preprocedural contrast-enhanced CTs to intraprocedural cone-beam computed tomography (CBCT) can provide additional information for interventional liver oncology procedures such as transcatheter arterial chemoembolisation (TACE). In this paper, a novel similarity metric for gradient-based image registration is proposed. The metric relies on the patch-based computation of histograms of oriented gradients (HOG), which form the basis for a feature descriptor. The metric was implemented in a framework for rigid 3D-3D registration of pre-interventional CT with intra-interventional CBCT data obtained during the workflow of a TACE. To evaluate the performance of the new metric, the capture range was estimated based on the calculation of the mean target registration error and compared to the results obtained with a normalized cross correlation metric. The results show that 3D HOG feature descriptors are suitable as an image-similarity metric and that the novel metric can compete with established methods in terms of registration accuracy.
A computational framework to characterize and compare the geometry of coronary networks.
Bulant, C A; Blanco, P J; Lima, T P; Assunção, A N; Liberato, G; Parga, J R; Ávila, L F R; Pereira, A C; Feijóo, R A; Lemos, P A
2017-03-01
This work presents a computational framework to perform a systematic and comprehensive assessment of the morphometry of coronary arteries from in vivo medical images. The methodology embraces image segmentation, arterial vessel representation, characterization and comparison, data storage, and finally analysis. Validation is performed using a sample of 48 patients. Data mining of morphometric information of several coronary arteries is presented. Results agree with medical reports in terms of basic geometric and anatomical variables. Concerning geometric descriptors, inter-artery and intra-artery correlations are studied. Data reported here can be useful for the construction and setup of blood flow models of the coronary circulation. Finally, as an application example, a similarity criterion to assess vasculature likelihood based on geometric features is presented and used to test geometric similarity among sibling patients. Results indicate that likelihood, measured through geometric descriptors, is stronger between siblings than between non-relative patients. Copyright © 2016 John Wiley & Sons, Ltd.
Mathematical modeling of tetrahydroimidazole benzodiazepine-1-one derivatives as an anti HIV agent
NASA Astrophysics Data System (ADS)
Ojha, Lokendra Kumar
2017-07-01
The goal of the present work is the study of drug-receptor interaction via QSAR (Quantitative Structure-Activity Relationship) analysis for a set of 89 TIBO (Tetrahydroimidazole Benzodiazepine-1-one) derivatives. The MLR (Multiple Linear Regression) method is utilized to generate predictive models of quantitative structure-activity relationships between a set of molecular descriptors and biological activity (IC50). The best QSAR model selected has a correlation coefficient (r) of 0.9299, a Standard Error of Estimation (SEE) of 0.5022, a Fisher Ratio (F) of 159.822, and a Quality factor (Q) of 1.852. This model is statistically significant and strongly favours substitution by a sulphur atom at the -Z position of the TIBO derivatives, as captured by the indicator parameter IS. Two other parameters, logP (octanol-water partition coefficient) and SAG (Surface Area Grid), also played a vital role in the generation of the best QSAR model. All three descriptors show very good stability towards data variation in leave-one-out (LOO) validation.
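The abstract does not define the quality factor. Assuming the definition commonly used in QSAR work, Q = r / SEE, the reported statistics are mutually consistent, as this short check suggests (the function name is illustrative):

```python
def quality_factor(r, see):
    """Quality factor under the assumed definition Q = r / SEE."""
    return r / see

r, see = 0.9299, 0.5022
print(round(quality_factor(r, see), 3))  # compare with the reported Q of 1.852
print(round(r ** 2, 4))                  # fraction of variance explained
```

Under this reading, a higher Q reflects a model that combines a strong correlation with a small standard error.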
Teixeira, Christiane Aires; Rodrigues Júnior, Antonio Luiz; Straccia, Luciana Cristina; Vianna, Elcio Dos Santos Oliveira; Silva, Geruza Alves da; Martinez, José Antônio Baddini
2011-01-01
To investigate the usefulness of descriptive terms applied to the sensation of dyspnea (dyspnea descriptors) that were developed in English and translated to Brazilian Portuguese in patients with four distinct clinical conditions that can be accompanied by dyspnea. We translated, from English to Brazilian Portuguese, a list of 15 dyspnea descriptors reported in a study conducted in the USA. Those 15 descriptors were applied in 50 asthma patients, 50 COPD patients, 30 patients with heart failure, and 50 patients with class II or III obesity. The three best descriptors, as selected by the patients, were studied by cluster analysis. Potential associations between the identified clusters and the four clinical conditions were also investigated. The use of this set of descriptors led to a solution with nine clusters, designated expiração (exhalation), fome de ar (air hunger), sufoco (suffocating), superficial (shallow), rápido (rapid), aperto (tight), falta de ar (shortness of breath), trabalho (work), and inspiração (inhalation). Overlapping of the descriptors was quite common among the patients, regardless of their clinical condition. Asthma, COPD, and heart failure were significantly associated with the inspiração cluster. Heart failure was also associated with the trabalho cluster, whereas obesity was not associated with any of the clusters. In our study sample, the application of dyspnea descriptors translated from English to Portuguese led to the identification of distinct clusters, some of which were similar to those identified in a study conducted in the USA. The translated descriptors were less useful than were those developed in Brazil regarding their ability to generate significant associations among the clinical conditions investigated here.
Direct memory access transfer completion notification
Chen, Dong; Giampapa, Mark E.; Heidelberger, Philip; Kumar, Sameer; Parker, Jeffrey J.; Steinmacher-Burow, Burkhard D.; Vranas, Pavlos
2010-07-27
Methods, compute nodes, and computer program products are provided for direct memory access (`DMA`) transfer completion notification. Embodiments include determining, by an origin DMA engine on an origin compute node, whether a data descriptor for an application message to be sent to a target compute node is currently in an injection first-in-first-out (`FIFO`) buffer in dependence upon a sequence number previously associated with the data descriptor, the total number of descriptors currently in the injection FIFO buffer, and the current sequence number for the newest data descriptor stored in the injection FIFO buffer; and notifying a processor core on the origin DMA engine that the message has been sent if the data descriptor for the message is not currently in the injection FIFO buffer.
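The membership test described above can be illustrated with a small sketch. It assumes, beyond what the abstract states, that sequence numbers increase by one per injected descriptor, that the FIFO drains strictly in order, and that sequence numbers do not wrap around; the function and parameter names are hypothetical.

```python
def descriptor_in_fifo(seq, count_in_fifo, newest_seq):
    """True if the descriptor with sequence number `seq` is still queued.

    Under the stated assumptions, the FIFO holds exactly the descriptors
    numbered newest_seq - count_in_fifo + 1 through newest_seq.
    """
    oldest_seq = newest_seq - count_in_fifo + 1
    return oldest_seq <= seq <= newest_seq

def transfer_complete(seq, count_in_fifo, newest_seq):
    """The message counts as sent once its descriptor has left the FIFO."""
    return not descriptor_in_fifo(seq, count_in_fifo, newest_seq)

# Descriptors 5, 6 and 7 are queued; 4 has already been processed.
print(transfer_complete(4, count_in_fifo=3, newest_seq=7))
print(transfer_complete(5, count_in_fifo=3, newest_seq=7))
```

The appeal of such a scheme is that completion can be inferred from three counters alone, without scanning the buffer contents.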
Character context: a shape descriptor for Arabic handwriting recognition
NASA Astrophysics Data System (ADS)
Mudhsh, Mohammed; Almodfer, Rolla; Duan, Pengfei; Xiong, Shengwu
2017-11-01
In the handwriting recognition field, designing good descriptors is essential for obtaining rich information from the data. However, finding a good descriptor for handwriting recognition is still an open issue because of the unlimited variation in human handwriting. We introduce a "character context descriptor" that efficiently deals with the structural characteristics of Arabic handwritten characters. First, the character image is smoothed and normalized; then the character context descriptor of 32 feature bins is built based on the proposed "distance function." Finally, a multilayer perceptron with regularization is used as a classifier. In experiments on a handwritten Arabic characters database, the proposed method achieved state-of-the-art performance, with recognition rates of 98.93% and 99.06% for the 66 and 24 classes, respectively.
Adjacent bin stability evaluating for feature description
NASA Astrophysics Data System (ADS)
Nie, Dongdong; Ma, Qinyong
2018-04-01
A recent study improved descriptor performance by accumulating stability votes over all scale pairs to compose the local descriptor. We argue that the stability of a bin depends more on the differences across adjacent scale pairs than on the differences across all scale pairs, and we compose a new local descriptor based on this hypothesis. First, a series of SIFT descriptors is extracted at multiple scales. Then the difference in each bin across adjacent scales is calculated, and a stability value for the bin is derived from it and accumulated to compose the final descriptor. The performance of the proposed method is evaluated on two popular matching datasets and compared with other state-of-the-art works. Experimental results show that the proposed method performs satisfactorily.
RPBS: Rotational Projected Binary Structure for point cloud representation
NASA Astrophysics Data System (ADS)
Fang, Bin; Zhou, Zhiwei; Ma, Tao; Hu, Fangyu; Quan, Siwen; Ma, Jie
2018-03-01
In this paper, we propose a novel three-dimensional local surface descriptor named RPBS for point cloud representation. First, the points within a predefined radius of the query point are cropped and regarded as a local surface patch. Pose normalization is then applied to the local surface to make the descriptor invariant to rotation. To capture more information about the cropped surface, a multi-view representation is formed by successively rotating it about the coordinate axes. Orthogonal projections onto the three coordinate planes are then used to construct two-dimensional distribution matrices, and each matrix is binarized by setting a grid cell to one if it is occupied and to zero otherwise. The binary maps from all viewpoints are computed and concatenated to form the final descriptor. Comparative experiments against several state-of-the-art 3D descriptors were conducted on the standard Bologna dataset, and the results show that our descriptor achieves the best performance in the feature matching experiments.
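The projection and binarization step of such a descriptor can be sketched for a single view as follows. This is an illustration under assumptions rather than the authors' implementation: the patch is taken to be already centred on the query point and pose-normalized, the multi-view rotations are omitted, and the grid resolution `bins` and the function name are hypothetical.

```python
def occupancy_bits(points, radius, bins):
    """Binary occupancy maps of a local patch projected onto the
    xy, yz and xz planes, concatenated into one bit list."""
    def cell(value):
        # Map a coordinate in [-radius, radius] to a grid index.
        index = int((value + radius) / (2 * radius) * bins)
        return min(max(index, 0), bins - 1)

    bits = []
    for a, b in [(0, 1), (1, 2), (0, 2)]:  # xy, yz, xz projections
        grid = [0] * (bins * bins)
        for p in points:
            grid[cell(p[a]) * bins + cell(p[b])] = 1  # occupied cell -> 1
        bits.extend(grid)
    return bits

patch = [(0.5, 0.5, 0.5), (-0.5, 0.5, 0.5)]
print(occupancy_bits(patch, radius=1.0, bins=2))
```

Because the result is a bit string, two descriptors can be compared with a Hamming distance, which is a common motivation for binary descriptors.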
Branch length similarity entropy-based descriptors for shape representation
NASA Astrophysics Data System (ADS)
Kwon, Ohsung; Lee, Sang-Hee
2017-11-01
In previous studies, we showed that the branch length similarity (BLS) entropy profile could be successfully used for recognizing shapes such as battle tanks, facial expressions, and butterflies. In the present study, we propose new descriptors for recognition, namely roundness, symmetry, and surface roughness, which are more accurate and faster to compute than the previous descriptors. The roundness represents how closely a shape resembles a circle, the symmetry characterizes how similar a shape is to itself when flipped, and the surface roughness quantifies the degree of vertical deviation of a shape boundary. To evaluate the performance of the descriptors, we used a database of leaf images from 12 species. Each species was represented by 10 to 20 leaf images, and the total number of images was 160. The evaluation showed that the new descriptors successfully discriminated the leaf species. We believe that the descriptors can be a useful tool in the field of pattern recognition.
Gun bore flaw image matching based on improved SIFT descriptor
NASA Astrophysics Data System (ADS)
Zeng, Luan; Xiong, Wei; Zhai, You
2013-01-01
To increase the speed and matching ability of the SIFT algorithm, the SIFT descriptor and matching strategy are improved. First, a method of constructing the feature descriptor based on sector areas is proposed. By computing the gradient histograms of location bins that are divided into 6 sector areas, a 48-dimensional descriptor is constructed. This reduces the dimension of the feature vector and the complexity of constructing the descriptor. Second, a strategy is introduced that partitions the circular region into 6 identical sector areas starting from the dominant orientation. Consequently, the computational complexity is reduced because the rotation operation for the area is no longer needed. The experimental results indicate that, compared with the OpenCV SIFT algorithm, the average matching speed of the new method increases by about 55.86%. The matching accuracy can also be increased under some variation of viewpoint, illumination, rotation, scale, and defocus. The new method achieved satisfactory results in gun bore flaw image matching. Keywords: Metrology, Flaw image matching, Gun bore, Feature descriptor
Sorich, Michael J; McKinnon, Ross A; Miners, John O; Winkler, David A; Smith, Paul A
2004-10-07
This study aimed to evaluate in silico models based on quantum chemical (QC) descriptors derived using the electronegativity equalization method (EEM) and to assess the use of QC properties to predict chemical metabolism by human UDP-glucuronosyltransferase (UGT) isoforms. Various EEM-derived QC molecular descriptors were calculated for known UGT substrates and nonsubstrates. Classification models were developed using support vector machine and partial least squares discriminant analysis. In general, the most predictive models were generated with the support vector machine. Combining QC and 2D descriptors (from previous work) using a consensus approach resulted in a statistically significant improvement in predictivity (to 84%) over both the QC and 2D models and the other methods of combining the descriptors. EEM-derived QC descriptors were shown to be both highly predictive and computationally efficient. It is likely that EEM-derived QC properties will be generally useful for predicting ADMET and physicochemical properties during drug discovery.
NASA Astrophysics Data System (ADS)
Nalewajski, Roman F.
The flow of information in the molecular communication networks in the (condensed) atomic orbital (AO) resolution is investigated and the plane-wave (momentum-space) interpretation of the average Fisher information in the molecular information system is given. It is argued using the quantum-mechanical superposition principle that, in the LCAO MO theory the squares of corresponding elements of the Charge and Bond-Order (CBO) matrix determine the conditional probabilities between AO, which generate the molecular communication system of the Orbital Communication Theory (OCT) of the chemical bond. The conditional-entropy ("noise," information-theoretic "covalency") and the mutual-information (information flow, information-theoretic "ionicity") descriptors of these molecular channels are related to Wiberg's covalency indices of chemical bonds. The illustrative application of OCT to the three-orbital model of the chemical bond X-Y, which is capable of describing the forward- and back-donations as well as the atom promotion accompanying the bond formation, is reported. It is demonstrated that the entropy/information characteristics of these separate bond-effects can be extracted by an appropriate reduction of the output of the molecular information channel, carried out by combining several exits into a single (condensed) one. The molecular channels in both the AO and hybrid orbital representations are examined for both the molecular and representative promolecular input probabilities.
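A rough numerical illustration of the two channel descriptors can be given for a two-orbital model. The sketch below makes assumptions that go beyond the abstract: conditional probabilities are obtained by normalizing the squared CBO elements within each row, the mutual information is the ordinary Shannon quantity between channel input and output, and the matrices and function names are purely illustrative.

```python
import math

def conditional_probabilities(cbo):
    """Row-normalized squares of CBO matrix elements, read as P(j|i)."""
    rows = []
    for row in cbo:
        squares = [g * g for g in row]
        total = sum(squares)
        rows.append([s / total for s in squares] if total else [0.0] * len(row))
    return rows

def channel_descriptors(cbo, p_in):
    """Conditional entropy ("covalency") and mutual information
    ("ionicity") of the channel, both in bits."""
    cond = conditional_probabilities(cbo)
    n = len(cond)
    p_out = [sum(p_in[i] * cond[i][j] for i in range(n)) for j in range(n)]
    entropy = 0.0
    information = 0.0
    for i in range(n):
        for j in range(n):
            pij = cond[i][j]
            if p_in[i] > 0 and pij > 0:
                entropy -= p_in[i] * pij * math.log2(pij)
                information += p_in[i] * pij * math.log2(pij / p_out[j])
    return entropy, information

# Symmetric two-orbital bond: fully delocalized channel.
print(channel_descriptors([[1, 1], [1, 1]], [0.5, 0.5]))
# Two non-interacting orbitals: deterministic channel.
print(channel_descriptors([[1, 0], [0, 1]], [0.5, 0.5]))
```

In this toy setting the delocalized channel is all noise and the deterministic one is all information flow, which mirrors the covalent versus ionic reading of the two descriptors.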
Munteanu, Cristian R; Gonzalez-Diaz, Humberto; Garcia, Rafael; Loza, Mabel; Pazos, Alejandro
2015-01-01
Encoding molecular information into molecular descriptors is the first step of in silico chemoinformatics methods in drug design. Machine learning methods are a complex solution for finding prediction models for specific biological properties of molecules. These models connect molecular structure information, such as atom connectivity (molecular graphs) or the physicochemical properties of an atom or group of atoms, to molecular activity (Quantitative Structure-Activity Relationship, QSAR). Owing to the complexity of proteins, predicting their activity is a complicated task, and interpreting the models is more difficult. The current review presents a series of 11 prediction models for proteins, implemented as free Web tools on an Artificial Intelligence Model Server in Biosciences, Bio-AIMS (http://bio-aims.udc.es/TargetPred.php). Six tools predict protein activity, two models evaluate drug-protein target interactions, and the other three calculate protein-protein interactions. The input information is based on the protein 3D structure for nine models, the 1D peptide amino acid sequence for three tools, and drug SMILES formulas for two servers. The molecular graph descriptor-based machine learning models could be useful tools for in silico screening of new peptides/proteins as future drug targets for specific treatments.
A dynamic appearance descriptor approach to facial actions temporal modeling.
Jiang, Bihan; Valstar, Michel; Martinez, Brais; Pantic, Maja
2014-02-01
Both the configuration and the dynamics of facial expressions are crucial for the interpretation of human facial behavior. Yet to date, the vast majority of reported efforts in the field either do not take the dynamics of facial expressions into account, or focus only on prototypic facial expressions of six basic emotions. Facial dynamics can be explicitly analyzed by detecting the constituent temporal segments in Facial Action Coding System (FACS) Action Units (AUs): onset, apex, and offset. In this paper, we present a novel approach to explicit analysis of temporal dynamics of facial actions using the dynamic appearance descriptor Local Phase Quantization from Three Orthogonal Planes (LPQ-TOP). Temporal segments are detected by combining a discriminative classifier for detecting the temporal segments on a frame-by-frame basis with Markov Models that enforce temporal consistency over the whole episode. The system is evaluated in detail over the MMI facial expression database, the UNBC-McMaster pain database, the SAL database, and the GEMEP-FERA dataset in database-dependent experiments, and in cross-database experiments using the Cohn-Kanade and SEMAINE databases. The comparison with other state-of-the-art methods shows that the proposed LPQ-TOP method outperforms the other approaches for the problem of AU temporal segment detection, and that overall AU activation detection benefits from dynamic appearance information.
Siedenburg, Kai; Jones-Mollerup, Kiray; McAdams, Stephen
2016-01-01
This paper investigates the role of acoustic and categorical information in timbre dissimilarity ratings. Using a Gammatone-filterbank-based sound transformation, we created tones that were rated as less familiar than recorded tones from orchestral instruments and that were harder to associate with an unambiguous sound source (Experiment 1). A subset of transformed tones, a set of orchestral recordings, and a mixed set were then rated on pairwise dissimilarity (Experiment 2A). We observed that recorded instrument timbres clustered into subsets that distinguished timbres according to acoustic and categorical properties. For the subset of cross-category comparisons in the mixed set, we observed asymmetries in the distribution of ratings, as well as a stark decay of inter-rater agreement. These effects were replicated in a more robust within-subjects design (Experiment 2B) and cannot be explained by acoustic factors alone. We finally introduced a novel model of timbre dissimilarity based on partial least-squares regression that compared the contributions of both acoustic and categorical timbre descriptors. The best model fit (R2 = 0.88) was achieved when both types of descriptors were taken into account. These findings are interpreted as evidence for an interplay of acoustic and categorical information in timbre dissimilarity perception. PMID:26779086
Li, Linnan; Xie, Shaodong; Cai, Hao; Bai, Xuetao; Xue, Zhao
2008-08-01
Theoretical molecular descriptors were tested against logK(OW) values for polybrominated diphenyl ethers (PBDEs) using the Partial Least-Squares Regression method which can be used to analyze data with many variables and few observations. A quantitative structure-property relationship (QSPR) model was successfully developed with a high cross-validated value (Q(cum)(2)) of 0.961, indicating a good predictive ability and stability of the model. The predictive power of the QSPR model was further cross-validated. The values of logK(OW) for PBDEs are mainly governed by molecular surface area, energy of the lowest unoccupied molecular orbital and the net atomic charges on the oxygen atom. All these descriptors have been discussed to interpret the partitioning mechanism of PBDE chemicals. The bulk property of the molecules represented by molecular surface area is the leading factor, and K(OW) values increase with the increase of molecular surface area. Higher energy of the lowest unoccupied molecular orbital and higher net atomic charge on the oxygen atom of PBDEs result in smaller K(OW). The energy of the lowest unoccupied molecular orbital and the net atomic charge on PBDEs oxygen also play important roles in affecting the partition of PBDEs between octanol and water by influencing the interactions between PBDEs and solvent molecules.
Informing the Human Plasma Protein Binding of ...
The free fraction of a xenobiotic in plasma (Fub) is an important determinant of chemical absorption, distribution, metabolism, elimination, and toxicity, yet experimental plasma protein binding data are scarce for environmentally relevant chemicals. The presented work explores the merit of utilizing available pharmaceutical data to predict Fub for environmentally relevant chemicals via machine learning techniques. Quantitative structure-activity relationship (QSAR) models were constructed with k nearest neighbors (kNN), support vector machines (SVM), and random forest (RF) machine learning algorithms from a training set of 1045 pharmaceuticals. The models were then evaluated with independent test sets of pharmaceuticals (200 compounds) and environmentally relevant ToxCast chemicals (406 total, in two groups of 238 and 168 compounds). The selection of a minimal feature set of 10-15 2D molecular descriptors allowed for both informative feature interpretation and practical applicability domain assessment via a bounded box of descriptor ranges and principal component analysis. The diverse pharmaceutical and environmental chemical sets exhibit similarities in terms of chemical space (99-82% overlap), as well as comparable bias and variance in constructed learning curves. All the models exhibit significant predictability with mean absolute errors (MAE) in the range of 0.10-0.18 Fub. The models performed best for highly bound chemicals (MAE 0.07-0.12), neutrals (MAE 0
Leclercq-Perlat, M-N; Sicard, M; Perrot, N; Trelea, I C; Picque, D; Corrieu, G
2015-02-01
Ripening descriptors are the main factors that determine consumers' preferences of soft cheeses. Six descriptors were defined to represent the sensory changes in Camembert cheeses: Penicillium camemberti appearance, cheese odor and rind color, creamy underrind thickness and consistency, and core hardness. To evaluate the effects of the main process parameters on these descriptors, Camembert cheeses were ripened under different temperatures (8, 12, and 16°C) and relative humidity (RH; 88, 92, and 98%). The sensory descriptors were highly dependent on the temperature and RH used throughout ripening in a ripening chamber. All sensory descriptor changes could be explained by microorganism growth, pH, carbon substrate metabolism, and cheese moisture, as well as by microbial enzymatic activities. On d 40, at 8°C and 88% RH, all sensory descriptors scored the worst: the cheese was too dry, its odor and its color were similar to those of the unripe cheese, the underrind was driest, and the core was hardest. At 16°C and 98% RH, the odor was strongly ammonia and the color was dark brown, and the creamy underrind represented the entire thickness of the cheese but was completely runny, descriptors indicative of an over ripened cheese. Statistical analysis showed that the best ripening conditions to achieve an optimum balance between cheese sensory qualities and marketability were 13±1°C and 94±1% RH. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
The effects of variant descriptors on the potential effectiveness of plain packaging.
Borland, Ron; Savvas, Steven
2014-01-01
This study examined the effects that variant descriptor labels on cigarette packs have on smokers' perceptions of those packs and the cigarettes contained within. As part of two larger web-based studies (each involving 160 young adult ever-smokers 18-29 years old), respondents were shown a computer image of a plain cigarette pack and sets of related variant descriptors. The sets included terms that varied in descriptors of colours as names, flavour strength, degrees of filter venting, filter types, quality, type of cigarette and numbers. For each set, respondents rated the highest and lowest of two or three of the following four characteristics: quality, strongest or weakest in taste, delivers most or least tar/nicotine, and most or least level of harm. There were significant differences on all four ratings. Quality ratings were the least differentiated. Except for colour descriptors, where 'Gold' rated high in quality but medium in other ratings, ratings of quality, harm, strength and delivery were all positively associated when rated on the same descriptors. Descriptor labels on cigarette packs can affect smokers' perceptions of the characteristics of the cigarettes contained within. Therefore, they are a potential means by which product differentiation can occur. In particular, having variants that differ in perceived strength while not differing in deliveries of harmful ingredients is problematic. Any packaging policy should take into account the possibility that variant descriptors can mislead smokers into making inappropriate product attributions.
NASA Astrophysics Data System (ADS)
de Oliveira, Helder C. R.; Moraes, Diego R.; Reche, Gustavo A.; Borges, Lucas R.; Catani, Juliana H.; de Barros, Nestor; Melo, Carlos F. E.; Gonzaga, Adilson; Vieira, Marcelo A. C.
2017-03-01
This paper presents a new local micro-pattern texture descriptor for the detection of Architectural Distortion (AD) in digital mammography images. AD is a subtle contraction of breast parenchyma that may represent an early sign of breast cancer. Due to its subtlety and variability, AD is more difficult to detect than microcalcifications and masses, and is commonly found in retrospective evaluations of false-negative mammograms. Several computer-based systems have been proposed for automatic detection of AD, but their performance is still unsatisfactory. The proposed descriptor, Local Mapped Pattern (LMP), is a generalization of the Local Binary Pattern (LBP), which is considered one of the most powerful feature descriptors for texture classification in digital images. Compared to LBP, the LMP descriptor captures the minor differences between the local image pixels more effectively. Moreover, LMP is a parametric model that can be optimized for the desired application. In our work, the LMP performance was compared to the LBP and four of Haralick's texture descriptors for the classification of 400 regions of interest (ROIs) extracted from clinical mammograms. ROIs were selected and divided into four classes: AD, normal tissue, microcalcifications and masses. Feature vectors were used as input to a multilayer perceptron neural network with a single hidden layer. Results showed that LMP is a good descriptor to distinguish AD from other anomalies in digital mammography. LMP performance was slightly better than the LBP and comparable to Haralick's descriptors (mean classification accuracy = 83%).
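As background to the abstract above, the baseline Local Binary Pattern that LMP generalizes can be sketched for a single 3x3 neighbourhood. This is a minimal illustration of the classic LBP only, not of the authors' LMP; the clockwise bit ordering starting at the top-left pixel and the function name `lbp_code` are assumptions, since conventions differ between implementations.

```python
def lbp_code(patch):
    """Classic 8-neighbour LBP code of a 3x3 patch (list of three rows).

    Each neighbour contributes a 1 bit when it is >= the centre pixel.
    Bit order (an assumed convention): clockwise from the top-left pixel.
    """
    center = patch[1][1]
    neighbours = [patch[0][0], patch[0][1], patch[0][2],
                  patch[1][2], patch[2][2], patch[2][1],
                  patch[2][0], patch[1][0]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code

# Illustrative grey levels only.
example = lbp_code([[6, 5, 2],
                    [7, 6, 1],
                    [9, 8, 7]])
```

A texture feature vector is then typically the histogram of such codes over a region of interest. The hard threshold against the centre pixel is the step that discards small grey-level differences.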
NASA Astrophysics Data System (ADS)
Gironés, X.; Gallegos, A.; Carbó-Dorca, R.
2001-12-01
In this work, the antimalarial activities of two series of 20 and 7 synthetic 1,2,4-trioxanes and a set of 20 cyclic peroxy ketals are examined for correlations by means of Molecular Quantum Similarity Measures (MQSM). QSAR models, dealing with different biological responses (IC90, IC50 and ED90) of the parasite Plasmodium falciparum, are constructed using MQSM as molecular descriptors and are satisfactorily correlated. The statistical results for the 20 1,2,4-trioxanes are analyzed in depth to elucidate the structural features relevant to the biological activity, revealing the importance of phenyl substitutions.
Vocabulary Development and Maintenance--Descriptors. ERIC Processing Manual, Section VIII (Part 1).
ERIC Educational Resources Information Center
Houston, Jim, Ed.
Comprehensive rules, guidelines, and examples are provided for use by ERIC indexers and lexicographers in developing and maintaining the "Thesaurus of ERIC Descriptors." Evaluation and decision criteria, research procedures, and inputting details for adding new Descriptors are documented. Instructions for modifying existing Thesaurus…
Towards operational interpretations of generalized entropies
NASA Astrophysics Data System (ADS)
Topsøe, Flemming
2010-12-01
The driving force behind our study has been to overcome the difficulties you encounter when you try to extend the clear and convincing operational interpretations of classical Boltzmann-Gibbs-Shannon entropy to other notions, especially to generalized entropies as proposed by Tsallis. Our approach is philosophical, based on speculations regarding the interplay between truth, belief and knowledge. The main result demonstrates that, accepting philosophically motivated assumptions, the only possible measures of entropy are those suggested by Tsallis - which, as we know, include classical entropy. This result constitutes, so it seems, a more transparent interpretation of entropy than previously available. However, further research to clarify the assumptions is still needed. Our study points to the thesis that one should never consider the notion of entropy in isolation - in order to enable a rich and technically smooth study, further concepts, such as divergence, score functions and descriptors or controls should be included in the discussion. This will clarify the distinction between Nature and Observer and facilitate a game theoretical discussion. The usefulness of this distinction and the subsequent exploitation of game theoretical results - such as those connected with the notion of Nash equilibrium - is demonstrated by a discussion of the Maximum Entropy Principle.
O'Boyle, Noel M; Palmer, David S; Nigsch, Florian; Mitchell, John BO
2008-01-01
Background: We present a novel feature selection algorithm, Winnowing Artificial Ant Colony (WAAC), that performs simultaneous feature selection and model parameter optimisation for the development of predictive quantitative structure-property relationship (QSPR) models. The WAAC algorithm is an extension of the modified ant colony algorithm of Shen et al. (J Chem Inf Model 2005, 45: 1024–1029). We test the ability of the algorithm to develop a predictive partial least squares model for the Karthikeyan dataset (J Chem Inf Model 2005, 45: 581–590) of melting point values. We also test its ability to perform feature selection on a support vector machine model for the same dataset. Results: Starting from an initial set of 203 descriptors, the WAAC algorithm selected a PLS model with 68 descriptors which has an RMSE on an external test set of 46.6°C and R2 of 0.51. The number of components chosen for the model was 49, which was close to optimal for this feature selection. The selected SVM model has 28 descriptors (cost of 5, ε of 0.21) and an RMSE of 45.1°C and R2 of 0.54. This model outperforms a kNN model (RMSE of 48.3°C, R2 of 0.47) for the same data and has similar performance to a Random Forest model (RMSE of 44.5°C, R2 of 0.55). However it is much less prone to bias at the extremes of the range of melting points as shown by the slope of the line through the residuals: -0.43 for WAAC/SVM, -0.53 for Random Forest. Conclusion: With a careful choice of objective function, the WAAC algorithm can be used to optimise machine learning and regression models that suffer from overfitting. Where model parameters also need to be tuned, as is the case with support vector machine and partial least squares models, it can optimise these simultaneously. The moving probabilities used by the algorithm are easily interpreted in terms of the best and current models of the ants, and the winnowing procedure promotes the removal of irrelevant descriptors. PMID:18959785
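The two test-set diagnostics quoted in the abstract above, RMSE and the slope of the line through the residuals, can be sketched as follows. This assumes residuals are defined as predicted minus observed and regressed on the observed values by ordinary least squares; the abstract does not state the exact definition, and the function names and toy melting points are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def ols_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residual_slope(obs, pred):
    """Slope of residuals (pred - obs, an assumed convention) versus obs.

    A negative slope means low values are over-predicted and high values
    under-predicted, i.e. predictions are compressed toward the mean.
    """
    residuals = [p - o for o, p in zip(obs, pred)]
    return ols_slope(obs, residuals)

# Toy melting points in degrees C, for illustration only.
observed = [0.0, 100.0, 200.0]
predicted = [50.0, 100.0, 150.0]
bias_slope = residual_slope(observed, predicted)
error = rmse(observed, predicted)
```

Under this convention a slope closer to zero indicates less bias at the extremes of the range, which is how the -0.43 versus -0.53 comparison in the abstract reads.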
Piou, Tiffany; Romanov-Michailidis, Fedor; Romanova-Michaelides, Maria; Jackson, Kelvin E; Semakul, Natthawat; Taggart, Trevor D; Newell, Brian S; Rithner, Christopher D; Paton, Robert S; Rovis, Tomislav
2017-01-25
CpXRh(III)-catalyzed C-H functionalization reactions are a proven method for the efficient assembly of small molecules. However, rationalization of the effects of cyclopentadienyl (CpX) ligand structure on reaction rate and selectivity has been viewed as a black box, and a truly systematic study is lacking. Consequently, predicting the outcomes of these reactions is challenging because subtle variations in ligand structure can cause notable changes in reaction behavior. A predictive tool is, nonetheless, of considerable value to the community as it would greatly accelerate reaction development. Designing a data set in which the steric and electronic properties of the CpXRh(III) catalysts were systematically varied allowed us to apply multivariate linear regression algorithms to establish correlations between these catalyst-based descriptors and the regio-, diastereoselectivity, and rate of model reactions. This, in turn, led to the development of quantitative predictive models that describe catalyst performance. Our newly described cone angles and Sterimol parameters for CpX ligands served as highly correlative steric descriptors in the regression models. Through rational design of training and validation sets, key diastereoselectivity outliers were identified. Computations reveal the origins of the outstanding stereoinduction displayed by these outliers. The results are consistent with partial η5-η3 ligand slippage that occurs in the transition state of the selectivity-determining step. In addition to the instructive value of our study, we believe that the insights gained are transposable to other group 9 transition metals and pave the way toward rational design of C-H functionalization catalysts.
Artificial neural networks and the study of the psychoactivity of cannabinoid compounds.
Honório, Káthia M; de Lima, Emmanuela F; Quiles, Marcos G; Romero, Roseli A F; Molfetta, Fábio A; da Silva, Albérico B F
2010-06-01
Cannabinoid compounds have been widely employed because of their medicinal and psychotropic properties. These compounds are isolated from Cannabis sativa (or marijuana) and are used in several medical treatments, such as for glaucoma, nausea associated with chemotherapy, pain and many other conditions. More recently, their use as appetite stimulants has been indicated in patients with cachexia or AIDS. In this work, the influence of several molecular descriptors on the psychoactivity of 50 cannabinoid compounds is analyzed with the aim of obtaining a model able to predict the psychoactivity of new cannabinoids. For this purpose, the selection of descriptors was initially carried out using the Fisher's weight, the correlation matrix among the calculated variables and principal component analysis. From these analyses, the following descriptors were considered most relevant: E(LUMO) (energy of the lowest unoccupied molecular orbital), Log P (logarithm of the partition coefficient), VC4 (volume of the substituent at the C4 position) and LP1 (Lovasz-Pelikan index, a molecular branching index). Next, two neural network models were used to construct a more adequate model for classifying new cannabinoid compounds. The first model employed was a multi-layer perceptron with the back-propagation algorithm, and the second was the Kohonen network. The results obtained from both networks were compared and showed that both techniques achieved a high percentage of correct classifications in discriminating psychoactive from psychoinactive compounds. However, the Kohonen network was superior to the multi-layer perceptron.
Mondal Roy, Sutapa; Roy, Debesh R; Sahoo, Suban K
2015-11-01
The applicability of Density Functional Theory (DFT) based descriptors for the development of quantitative structure-toxicity relationships (QSTR) is assessed for two different series of toxic aromatic compounds, viz., polyhalogenated dibenzo-p-dioxins (PHDDs) and phenols (PHs). A series of 20 compounds each for PHDDs and PHs, with their experimental toxicities (IC50 and IGC50), is chosen in the present study to develop efficient DFT-based quantum chemical parameters (QCPs) for explaining the toxic potential of the considered compounds. A systematic analysis of the electron donating/accepting nature of these selected compounds toward the considered model biosystems, viz., nucleic acid (NA) bases and DNA base pairs, is performed to identify potential QCPs. Accordingly, PHDDs are found to be electron acceptors, whereas phenols are found to be donors, during their interaction with biosystems. A two-parameter regression model is constructed comprising the global charge transfer (ΔN) and the local Fukui function for nucleophilic attack (fk(+)) for PHDDs, and the same for electrophilic attack (fk(-)) in the case of PHs. The chosen descriptors, viz., charge transfer (ΔN) and Fukui function (fk(±)), play a crucial role by explaining more than 90% of the observed toxic behavior (in terms of the correlation coefficient, R) of PHDDs and PHs. The developed QCPs, viz., ΔN and fk(±), can be added as new descriptors in the QSTR parlance. Copyright © 2015 Elsevier Inc. All rights reserved.
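For orientation, the two descriptors named in the abstract above are commonly written as below in conceptual DFT. The abstract gives no working equations, so this is a sketch under assumptions: the Parr-Pearson form of the charge transfer, the finite-difference estimates of chemical potential and hardness, and the condensed (Yang-Mortier) Fukui functions. The symbols I, A (ionization potential and electron affinity) and q_k (electron population on atom k) are introduced here for illustration, and the factor of 2 depends on the hardness convention chosen.

```latex
% Electrons transferred from system B (donor) to system A (acceptor)
\Delta N = \frac{\mu_B - \mu_A}{2\,(\eta_A + \eta_B)},
\qquad \mu \approx -\frac{I + A}{2},
\qquad \eta \approx \frac{I - A}{2}

% Condensed Fukui functions on atom k, from electron populations q_k
f_k^{+} = q_k(N+1) - q_k(N) \quad \text{(nucleophilic attack)}
\qquad
f_k^{-} = q_k(N) - q_k(N-1) \quad \text{(electrophilic attack)}
```

With this sign convention a positive ΔN means that electrons flow toward A.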
Pereira, José Aldo Alves; de Oliveira-Filho, Ary Teixeira; Eisenlohr, Pedro V; Miranda, Pedro L S; de Lemos Filho, José Pires
2015-02-01
The loss in forest area due to human occupancy is not the only threat to the remaining biodiversity: forest fragments are susceptible to additional human impact. Our aim was to investigate the effect of human impact on tree community features (species composition and abundance, and structural descriptors) and check if there was a decrease in the number of slender trees, an increase in the amount of large trees, and also a reduction in the number of tree species that occur in 20 fragments of Atlantic montane semideciduous forest in southeastern Brazil. We produced digital maps of each forest fragment using Landsat 7 satellite images and processed the maps to obtain morphometric variables. We used investigative questionnaires and field observations to survey the history of human impact. We then converted the information into scores given to the extent, severity, and duration of each impact, including proportional border area, fire, trails, coppicing, logging, and cattle, and converted these scores into categorical levels. We used linear models to assess the effect of impacts on tree species abundance distribution and stand structural descriptors. Part of the variation in floristic patterns was significantly correlated to the impacts of fire, logging, and proportional border area. Structural descriptors were influenced by cattle and outer roads. Our results provided, for the first time, strong evidence that tree species occurrence and abundance, and forest structure of Atlantic seasonal forest fragments respond differently to various modes of disturbance by humans.
Quantum chemical and statistical study of megazol-derived compounds with trypanocidal activity
NASA Astrophysics Data System (ADS)
Rosselli, F. P.; Albuquerque, C. N.; da Silva, A. B. F.
In this work we performed a structure-activity relationship (SAR) study with the aim of correlating molecular properties of the megazol compound and 10 of its analogs with the biological activity against Trypanosoma cruzi (trypanocidal or antichagasic activity) presented by these molecules. The biological activity indication was obtained from in vitro tests and the molecular properties (variables or descriptors) were obtained from the optimized chemical structures by using the PM3 semiempirical method. About 80 molecular properties were calculated, selected from among steric, constitutional, electronic, and lipophilicity properties. In order to reduce dimensionality and investigate which subset of variables (descriptors) would be more effective in classifying the compounds studied according to their degree of trypanocidal activity, we employed statistical methodologies (pattern recognition and classification techniques) such as principal component analysis (PCA), hierarchical cluster analysis (HCA), K-nearest neighbor (KNN), and discriminant function analysis (DFA). These methods showed that the descriptors molecular mass (MM), energy of the second lowest unoccupied molecular orbital (LUMO+1), charge on the first nitrogen at substituent 2 (qN'), dihedral angles (D1 and D2), bond length between atom C4 and its substituent (L4), Moriguchi octanol-water partition coefficient (MLogP), and length-to-breadth ratio (L/Bw) were the variables responsible for the separation between active and inactive compounds against T. cruzi. Afterwards, the PCA, KNN, and DFA models built in this work were used to perform trypanocidal activity predictions for eight new megazol analog compounds.
ERIC Educational Resources Information Center
Klingbiel, Paul H.; Jacobs, Charles R.
This report summarizes the frequency of use of the 7144 descriptors used for indexing technical reports in the Defense Documentation Center (DDC) collection. The descriptors are arranged alphabetically in the first section and by frequency in the second section. The frequency data cover about 427,000 AD documents spanning the interval from March…
Kim, Sungjin; Jinich, Adrián; Aspuru-Guzik, Alán
2017-04-24
We propose a multiple descriptor multiple kernel (MultiDK) method for efficient molecular discovery using machine learning. We show that the MultiDK method improves both the speed and accuracy of molecular property prediction. We apply the method to the discovery of electrolyte molecules for aqueous redox flow batteries. Using multiple-type (as opposed to single-type) descriptors, we obtain more relevant features for machine learning. Following the principle of "wisdom of the crowds", the combination of multiple-type descriptors significantly boosts prediction performance. Moreover, by employing multiple kernels (more than one kernel function for a set of the input descriptors), MultiDK exploits nonlinear relations between molecular structure and properties better than a linear regression approach. The multiple kernels consist of a Tanimoto similarity kernel and a linear kernel for a set of binary descriptors and a set of nonbinary descriptors, respectively. Using MultiDK, we achieve an average performance of r2 = 0.92 with a test set of molecules for solubility prediction. We also extend MultiDK to predict pH-dependent solubility and apply it to a set of quinone molecules with different ionizable functional groups to assess their performance as flow battery electrolytes.
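The two kernel functions named in the abstract above can be sketched in a few lines. The Tanimoto and linear kernels follow their standard definitions; the weighted-sum combination in `combined_kernel` and its weight `w` are assumptions made for illustration, since the abstract does not say how MultiDK combines the kernels, and all names and toy inputs are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two binary descriptor vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)    # bits set in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # bits set in either
    return both / either if either else 1.0           # two empty vectors: 1

def linear(u, v):
    """Linear (dot-product) kernel between two real-valued vectors."""
    return sum(x * y for x, y in zip(u, v))

def combined_kernel(bin_a, bin_b, real_a, real_b, w=0.5):
    """ASSUMED combination: weighted sum of the two kernels.

    Tanimoto is applied to the binary descriptors and the linear kernel
    to the nonbinary ones, mirroring the split described in the abstract.
    """
    return w * tanimoto(bin_a, bin_b) + (1.0 - w) * linear(real_a, real_b)

# Toy fingerprints and real-valued descriptors, for illustration only.
k = combined_kernel([1, 1, 0, 0], [1, 0, 1, 0], [1.0, 2.0], [0.5, 1.0])
```

A non-negative weighted sum of valid kernels is itself a valid kernel, which is why it is a convenient placeholder here; the actual method may weight or learn the combination differently.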
Deep Learning for Low-Textured Image Matching
NASA Astrophysics Data System (ADS)
Kniaz, V. V.; Fedorenko, V. V.; Fomin, N. A.
2018-05-01
Low-textured objects pose challenges for automatic 3D model reconstruction. Such objects are common in archeological applications of photogrammetry. Most of the common feature point descriptors fail to match local patches in featureless regions of an object. Hence, automatic documentation of the archeological process using Structure from Motion (SfM) methods is challenging. Nevertheless, such documentation is possible with the aid of a human operator. Deep learning-based descriptors have recently outperformed most common feature point descriptors. This paper focuses on the development of a new Wide Image Zone Adaptive Robust feature Descriptor (WIZARD) based on deep learning. We use a convolutional auto-encoder to compress discriminative features of a local patch into a descriptor code. We build a codebook to perform point matching on multiple images. The matching is performed using the nearest neighbor search and a modified voting algorithm. We present a new "Multi-view Amphora" (Amphora) dataset for evaluation of point matching algorithms. The dataset includes images of an Ancient Greek vase found at Taman Peninsula in Southern Russia. The dataset provides color images, a ground truth 3D model, and a ground truth optical flow. We evaluated the WIZARD descriptor on the "Amphora" dataset to show that it outperforms the SIFT and SURF descriptors on the complex patch pairs.
An object-based approach for areal rainfall estimation and validation of atmospheric models
NASA Astrophysics Data System (ADS)
Troemel, Silke; Simmer, Clemens
2010-05-01
An object-based approach for areal rainfall estimation is applied to pseudo-radar data simulated by a weather forecast model as well as to real radar volume data. The method aims to exploit as fully as possible the three-dimensional radar signals produced by precipitation-generating systems during their lifetime in order to enhance areal rainfall estimation. Therefore, tracking of radar-detected precipitation centroids is performed and rain events are investigated using so-called Integral Radar Volume Descriptors (IRVD) containing relevant information about the underlying precipitation process. Some investigated descriptors are statistical quantities derived from the radar reflectivities within the boundary of a tracked rain cell, like the area mean reflectivity or the compactness of a cell; others evaluate the mean vertical structure during the tracking period at the near-surface reflectivity-weighted center of the cell, like the mean effective efficiency or the mean echo top height. The stage of evolution of a system is given by the trend in the brightband fraction or related quantities. Furthermore, two descriptors not directly derived from radar data are considered: the mean wind shear and an orographic rainfall amplifier. While in the case of pseudo-radar data a model based on a small set of IRVDs alone provides rainfall estimates of high accuracy, the application of such a model to the real world remains within the accuracies achievable with a constant Z-R relationship. However, a combined model based on single IRVDs and the Marshall-Palmer Z-R estimator already provides considerable enhancements, even though the resolution of the database used has room for improvement. The mean echo top height, the mean effective efficiency, the empirical standard deviation and the Marshall-Palmer estimator are selected for the final rainfall estimator.
High correlations between storm height and rain rates, a shift of the probability distribution to higher values with increasing effective efficiency, and the possibility of classifying continental and maritime systems using the effective efficiency confirm the informative value of the qualified descriptors. The IRVDs especially correct for the underestimation in the case of intense rain events, and the information content of the descriptors is most likely higher than demonstrated so far. We used quite sparse information about the meteorological variables needed for the calculation of some IRVDs, taken from single radiosoundings, and several descriptors suffered from the range-dependent vertical resolution of the reflectivity profile. Inclusion of neighbouring radars and assimilation runs of weather forecasting models will further enhance the accuracy of rainfall estimates. Finally, the clear difference between the IRVD selection from the pseudo-radar data and that from the real-world data hints at a new object-based avenue for the validation of higher-resolution atmospheric models and for evaluating their potential to digest radar observations in data assimilation schemes.
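The constant Z-R relationship used as the baseline in the abstract above can be sketched as follows. This assumes the classical Marshall-Palmer coefficients, Z = 200 R^1.6 with Z in mm^6 m^-3 and R in mm/h; the abstract does not state which coefficients the authors used, and the function name is hypothetical.

```python
import math

def rain_rate_from_dbz(dbz, a=200.0, b=1.6):
    """Rain rate R (mm/h) from radar reflectivity in dBZ via Z = a * R**b.

    Defaults are the classical Marshall-Palmer coefficients (an assumption
    here; operational radars often use other values of a and b).
    """
    z_linear = 10.0 ** (dbz / 10.0)       # dBZ -> linear Z in mm^6 m^-3
    return (z_linear / a) ** (1.0 / b)    # invert the power law

# A reflectivity of 40 dBZ corresponds to roughly 11.5 mm/h.
moderate_rain = rain_rate_from_dbz(40.0)
```

Because a single power law maps each reflectivity to exactly one rain rate, it cannot reflect the vertical structure or life-cycle stage of a rain cell, which is the information the IRVDs are meant to add.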
Bidgood, W. Dean; Bray, Bruce; Brown, Nicolas; Mori, Angelo Rossi; Spackman, Kent A.; Golichowski, Alan; Jones, Robert H.; Korman, Louis; Dove, Brent; Hildebrand, Lloyd; Berg, Michael
1999-01-01
Objective: To support clinically relevant indexing of biomedical images and image-related information based on the attributes of image acquisition procedures and the judgments (observations) expressed by observers in the process of image interpretation. Design: The authors introduce the notion of “image acquisition context,” the set of attributes that describe image acquisition procedures, and present a standards-based strategy for utilizing the attributes of image acquisition context as indexing and retrieval keys for digital image libraries. Methods: The authors' indexing strategy is based on an interdependent message/terminology architecture that combines the Digital Imaging and Communications in Medicine (DICOM) standard, the SNOMED (Systematized Nomenclature of Human and Veterinary Medicine) vocabulary, and the SNOMED DICOM microglossary. The SNOMED DICOM microglossary provides context-dependent mapping of terminology to DICOM data elements. Results: The capability of embedding standard coded descriptors in DICOM image headers and image-interpretation reports improves the potential for selective retrieval of image-related information. This favorably affects information management in digital libraries. PMID:9925229
Challenges of ECG monitoring and ECG interpretation in dialysis units.
Poulikakos, Dimitrios; Malik, Marek
Patients on hemodialysis (HD) suffer from high cardiovascular morbidity and mortality due to high rates of coronary artery disease and arrhythmias. Electrocardiography (ECG) is often performed in the dialysis units as part of routine clinical assessment. However, fluid and electrolyte changes have been shown to affect all ECG morphologies and intervals. ECG interpretation thus depends on the time of the recording in relation to the HD session. In addition, arrhythmias during HD are common, and dialysis-related ECG artifacts mimicking arrhythmias have been reported. Studies using advanced ECG analyses have examined the impact of the HD procedure on selected repolarization descriptors and heart rate variability indices. Despite the challenges related to the impact of the fluctuant fluid and electrolyte status on conventional and advanced ECG parameters, further research in ECG monitoring during dialysis has the potential to provide clinically meaningful and practically useful information for diagnostic and risk stratification purposes. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.
NASA Technical Reports Server (NTRS)
Wilson, Robert M.
2014-01-01
Examined are sunspot cycle- (SC-) length averages of the annual January-December values of the Global Land-Ocean Temperature Index (
Mapping axillary microbiota responsible for body odours using a culture-independent approach.
Troccaz, Myriam; Gaïa, Nadia; Beccucci, Sabine; Schrenzel, Jacques; Cayeux, Isabelle; Starkenmann, Christian; Lazarevic, Vladimir
2015-01-01
Human axillary odour is commonly attributed to the bacterial degradation of precursors in sweat secretions. To assess the role of bacterial communities in the formation of body odours, we used a culture-independent approach to study axillary skin microbiota and correlated these data with olfactory analysis. Twenty-four Caucasian male and female volunteers and four assessors showed that the underarms of non-antiperspirant (non-AP) users have significantly higher global sweat odour intensities and harboured on average about 50 times more bacteria than those of AP users. Global sweat odour and odour descriptors sulfury-cat urine and acid-spicy generally increased from the morning to the afternoon sessions. Among non-AP users, male underarm odours were judged higher in intensity with higher fatty and acid-spicy odours and higher bacterial loads. Although the content of odour precursors in underarm secretions varied widely among individuals, males had a higher acid: sulfur precursor ratio than females did. No direct correlations were found between measured precursor concentration and sweat odours. High-throughput sequencing targeting the 16S rRNA genes of underarm bacteria collected from 11 non-AP users (six females and five males) confirmed the strong dominance of the phyla Firmicutes and Actinobacteria, with 96% of sequences assigned to the genera Staphylococcus, Corynebacterium and Propionibacterium. The proportion of several bacterial taxa showed significant variation between males and females. The genera Anaerococcus and Peptoniphilus and the operational taxonomic units (OTUs) from Staphylococcus haemolyticus and the genus Corynebacterium were more represented in males than in females. The genera Corynebacterium and Propionibacterium were correlated and anti-correlated, respectively, with body odours. Within the genus Staphylococcus, different OTUs were either positively or negatively correlated with axillary odour. 
The relative abundance of five OTUs (three assigned to S. hominis and one each to Corynebacterium tuberculostearicum and Anaerococcus) were positively correlated with at least one underarm olfactory descriptor. Positive and negative correlations between bacterial taxa found at the phylum, genus and OTU levels suggest the existence of mutualism and competition among skin bacteria. Such interactions, and the types and quantities of underarm bacteria, affect the formation of body odours. These findings open the possibility of developing new solutions for odour control.
How good is good? Students and assessors' perceptions of qualitative markers of performance.
Ma, Heung Kan; Min, Cynthia; Neville, Alan; Eva, Kevin
2013-01-01
Qualitative markers of performance are routinely used for medical student assessment, though the extent to which such markers can be readily translated to actionable pieces of information remains uncertain. To explore (a) the perceived value to be indicated by descriptor phrases commonly used for describing student performance, (b) the perceived weight of the different performance domains (e.g. communication skills, work ethic, knowledge base, etc), and (c) whether or not the perceived value of the descriptors changes as a function of the performance domains. Five domains of performance were identified from the thematic coding of past medical student transcripts (N = 156). From the transcripts, 91 distinct descriptors indicating the language commonly used by assessors were also identified. From the list of 91 descriptors, Thurstone's method of equal-appearing intervals was used to extract 10 descriptors that were representative of the continuum of student performance. A modified paired comparisons method was then used to enable the relative ranking of each of 10 descriptors combined with each of 5 different domains of performance. A web-based survey was used to collect responses from participants (N = 209), which consisted of medical students and faculty members who were previously involved in student assessment. Results demonstrated that respondents did not simply sum positive and negative descriptors in a uniform manner. Rather, comments on some domains (e.g., "ability to apply patient centred medicine") were seen as particularly positive when associated with positive descriptors but not particularly negative when associated with negative descriptors. For others (e.g., "receptivity and responsiveness to feedback") the reverse was true. Comments on "knowledge-base" elicited a relatively muted perception at both ends of the scale. Finally, the results also revealed moderate misalignment in the perceptions of assessors and students. 
The findings from this study suggest that the use of any given descriptor conveys slightly different meaning dependent on the context in which it is used. This helps to address some key issues surrounding the application of qualitative markers to performance assessment in medical education.
Benndorf, Matthias; Hahn, Felix; Krönig, Malte; Jilg, Cordula Annette; Krauss, Tobias; Langer, Mathias; Dovi-Akué, Philippe
2017-08-01
To examine the diagnostic performance of PI-RADSv2 T2w and diffusion weighted imaging (DWI) based lexicon descriptors, inter-observer agreement for descriptor assignment and diagnostic accuracy of the PI-RADSv2 assessment categories for multiparametric prostate MRI. 176 lesions in 79 consecutive patients are analyzed, lesions are histopathologically verified by MRI-ultrasound fusion biopsy. All lesions are rated according to the PI-RADSv2 lexicon, descriptors for T2w and DWI sequences and resulting assessment categories are assigned by two independent blinded radiologists. We perform receiver-operating-characteristic analysis using the assessment categories. To analyze inter-observer agreement, we calculate weighted kappa values for assessment category assignment and unweighted kappa values for descriptor assignment. PI-RADSv2 assessment categories yield an area under the curve of 0.76/0.74 (radiologist 1/radiologist 2), P >0.05. Weighted kappa for agreement is 0.601 in the peripheral zone and 0.580 in the transition zone. We detect a difference in the cancer rate for PI-RADSv2 category 3 between peripheral zone (32%) and transition zone (12%), P <0.05. We obtain moderate agreement at most for descriptor assignment with kappa values ranging from 0.082 (T2w shape in the transition zone) to 0.407 (T2w signal intensity in the peripheral zone) and 0.493 (ADC pattern in the peripheral zone). Our analysis corroborates typical descriptors for benign/malignant lesions, but also reveals insights into potential pitfalls - T2w wedge shaped lesions in the peripheral zone have a considerable cancer rate, despite being labelled category 2 in the lexicon. Agreement for descriptor assignment in the PI-RADSv2 lexicon is at most moderate in our study. Typical descriptors for benign and malignant lesions are validated, whereas the discriminatory power of some descriptors is challenged. 
The difference in the cancer rate for PI-RADSv2 category 3 between peripheral zone and transition zone should be considered when management recommendations are linked to assessment categories in the future. Copyright © 2017 Elsevier B.V. All rights reserved.
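The unweighted kappa used above for descriptor assignment can be sketched in a few lines. This is a generic implementation of Cohen's kappa for two raters, not the authors' analysis code; the function name and the example ratings in the usage note are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when both raters use one label
    return (observed - expected) / (1.0 - expected)
```

For example, `cohen_kappa(["round", "wedge", "round"], ["round", "round", "round"])` compares two readers' shape labels for three lesions. Kappa corrects raw agreement for the agreement expected by chance, which is why it can be low even when the readers agree on most lesions.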
RANZCR Body Systems Framework of diagnostic imaging examination descriptors.
Pitman, Alexander G; Penlington, Lisa; Doromal, Darren; Slater, Gregory; Vukolova, Natalia
2014-08-01
A unified and logical system of descriptors for diagnostic imaging examinations and procedures is a desirable resource for radiology in Australia and New Zealand and is needed to support core activities of RANZCR. Existing descriptor systems available in Australia and New Zealand (including the Medicare DIST and the ACC Schedule) have significant limitations and are inappropriate for broader clinical application. An anatomically based grid was constructed, with anatomical structures arranged in rows and diagnostic imaging modalities arranged in columns (including nuclear medicine and positron emission tomography). The grid was segregated into five body systems. The cells at the intersection of an anatomical structure row and an imaging modality column were populated with short, formulaic descriptors of the applicable diagnostic imaging examinations. Clinically illogical or physically impossible combinations were 'greyed out'. Where the same examination applied to different anatomical structures, the descriptor was kept identical for the purposes of streamlining. The resulting Body Systems Framework of diagnostic imaging examination descriptors lists all the reasonably common diagnostic imaging examinations currently performed in Australia and New Zealand using a unified grid structure allowing navigation by both referrers and radiologists. The Framework has been placed on the RANZCR website and is available for access free of charge by registered users. The Body Systems Framework of diagnostic imaging examination descriptors is a system of descriptors based on relationships between anatomical structures and imaging modalities. The Framework is now available as a resource and reference point for the radiology profession and to support core College activities. © 2014 The Royal Australian and New Zealand College of Radiologists.
Fang, Xingang; Bagui, Sikha; Bagui, Subhash
2017-08-01
The readily available high throughput screening (HTS) data from the PubChem database provides an opportunity for mining of small molecules in a variety of biological systems using machine learning techniques. From the thousands of available molecular descriptors developed to encode useful chemical information representing the characteristics of molecules, descriptor selection is an essential step in building an optimal quantitative structural-activity relationship (QSAR) model. For the development of a systematic descriptor selection strategy, we need the understanding of the relationship between: (i) the descriptor selection; (ii) the choice of the machine learning model; and (iii) the characteristics of the target bio-molecule. In this work, we employed the Signature descriptor to generate a dataset on the Human kallikrein 5 (hK 5) inhibition confirmatory assay data and compared multiple classification models including logistic regression, support vector machine, random forest and k-nearest neighbor. Under optimal conditions, the logistic regression model provided extremely high overall accuracy (98%) and precision (90%), with good sensitivity (65%) in the cross validation test. In testing the primary HTS screening data with more than 200K molecular structures, the logistic regression model exhibited the capability of eliminating more than 99.9% of the inactive structures. As part of our exploration of the descriptor-model-target relationship, the excellent predictive performance of the combination of the Signature descriptor and the logistic regression model on the assay data of the Human kallikrein 5 (hK 5) target suggested a feasible descriptor/model selection strategy on similar targets. Copyright © 2017 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Weisberg, Paul
2003-01-01
Six preschool children, mostly from poverty-level backgrounds, were taught to make descriptive statements about objects. The category-descriptor statements were organized and sequenced into four clusters. As sets of new statements were successively taught and evaluated, the number and diversity of probed category and descriptor terms steadily and…
2017-01-01
Abstract Background: Leaf shape among Passiflora species is spectacularly diverse. Underlying this diversity in leaf shape are profound changes in the patterning of the primary vasculature and laminar outgrowth. Each of these aspects of leaf morphology—vasculature and blade—provides different insights into leaf patterning. Results: Here, we morphometrically analyze >3300 leaves from 40 different Passiflora species collected sequentially across the vine. Each leaf is measured in two different ways: using 1) 15 homologous Procrustes-adjusted landmarks of the vasculature, sinuses, and lobes; and 2) Elliptical Fourier Descriptors (EFDs), which quantify the outline of the leaf. The ability of landmarks, EFDs, and both datasets together are compared to determine their relative ability to predict species and node position within the vine. Pairwise correlation of x and y landmark coordinates and EFD harmonic coefficients reveals close associations between traits and insights into the relationship between vasculature and blade patterning. Conclusions: Landmarks, more reflective of the vasculature, and EFDs, more reflective of the blade contour, describe both similar and distinct features of leaf morphology. Landmarks and EFDs vary in ability to predict species identity and node position in the vine and exhibit a correlational structure (both within landmark or EFD traits and between the two data types) revealing constraints between vascular and blade patterning underlying natural variation in leaf morphology among Passiflora species. PMID:28369351
Šegan, Sandra; Trifković, Jelena; Verbić, Tatjana; Opsenica, Dejan; Zlatović, Mario; Burnett, James; Šolaja, Bogdan; Milojković-Opsenica, Dušanka
2013-01-01
The physicochemical properties, retention parameters (R(M)(0)), partition coefficients (logP(OW)), and pK(a) values for a series of thirteen 1,7-bis(aminoalkyl) diazachrysene (1,7-DAAC) derivatives were determined in order to reveal the characteristics responsible for their biological behavior. The investigated compounds inhibit three unrelated pathogens (the Botulinum neurotoxin serotype A light chain (BoNT/A LC), Plasmodium falciparum malaria, and Ebola filovirus) via three different mechanisms of action. To determine the most influential factors governing the retention and activities of the investigated diazachrysenes, R(M)(0), logP(OW), and biological activity values were correlated with 2D and 3D molecular descriptors, using a partial least squares regression. The resulting quantitative structure-retention (property) relationships indicate the importance of descriptors related to the hydrophobicity of the molecules (e.g., predicted partition coefficients and hydrophobic surface area). Quantitative structure-activity relationship models for describing biological activity against the BoNT/A LC and malarial strains also include overall compound polarity, electron density distribution, and proton donor/acceptor potential. Furthermore, models for Ebola filovirus inhibition are presented qualitatively to provide insights into parameters that may contribute to the compounds' antiviral activities. Overall, the models form the basis for selecting structural features that significantly affect the compound's absorption, distribution, metabolism, excretion, and toxicity profiles. Copyright © 2012 Elsevier B.V. All rights reserved.
Application of 3D-QSAR in the rational design of receptor ligands and enzyme inhibitors.
Mor, Marco; Rivara, Silvia; Lodola, Alessio; Lorenzi, Simone; Bordi, Fabrizio; Plazzi, Pier Vincenzo; Spadoni, Gilberto; Bedini, Annalida; Duranti, Andrea; Tontini, Andrea; Tarzia, Giorgio
2005-11-01
Quantitative structure-activity relationships (QSARs) are frequently employed in medicinal chemistry projects, both to rationalize structure-activity relationships (SAR) for known series of compounds and to help in the design of innovative structures endowed with desired pharmacological actions. Unlike the so-called structure-based drug design tools, they do not require knowledge of the biological target structure, but are based on the comparison of drug structural features, and are therefore defined as ligand-based drug design tools. In the 3D-QSAR approach, structural descriptors are calculated from molecular models of the ligands, as interaction fields within a three-dimensional (3D) lattice of points surrounding the ligand structure. These descriptors are collected in a large X matrix, which is submitted to multivariate analysis to look for correlations with biological activity. As with other QSARs, the reliability and usefulness of the correlation models depend on the validity of the assumptions and on the quality of the data. A careful selection of compounds and pharmacological data can improve the application of 3D-QSAR analysis in drug design. Some examples of the application of CoMFA and CoMSIA approaches to the SAR study and design of receptor or enzyme ligands are described, with attention to the fields of melatonin receptor ligands and FAAH inhibitors.
NASA Astrophysics Data System (ADS)
Anick, David J.
2003-12-01
A method is described for a rapid prediction of B3LYP-optimized geometries for polyhedral water clusters (PWCs). Starting with a database of 121 B3LYP-optimized PWCs containing 2277 H-bonds, linear regressions yield formulas correlating O-O distances, O-O-O angles, and H-O-H orientation parameters, with local and global cluster descriptors. The formulas predict O-O distances with a rms error of 0.85 pm to 1.29 pm and predict O-O-O angles with a rms error of 0.6° to 2.2°. An algorithm is given which uses the O-O and O-O-O formulas to determine coordinates for the oxygen nuclei of a PWC. The H-O-H formulas then determine positions for two H's at each O. For 15 test clusters, the gap between the electronic energy of the predicted geometry and the true B3LYP optimum ranges from 0.11 to 0.54 kcal/mol or 4 to 18 cal/mol per H-bond. Linear regression also identifies 14 parameters that strongly correlate with PWC electronic energy. These descriptors include the number of H-bonds in which both oxygens carry a non-H-bonding H, the number of quadrilateral faces, the number of symmetric angles in 5- and in 6-sided faces, and the square of the cluster's estimated dipole moment.
Shape of wear particles found in human knee joints and their relationship to osteoarthritis.
Kuster, M S; Podsiadlo, P; Stachowiak, G W
1998-09-01
To analyse and compare the shape of wear particles found in healthy and osteoarthritic human knee joints for monitoring the progress of osteoarthritis, the long-term prognosis and to evaluate therapeutic regimens. Joint particles from seven patients with normal cartilage in all compartments of the knee joint, 12 patients with fibrillation of less than half the cartilage thickness (grade 1), seven patients with fibrillation of more than half the cartilage thickness (grade 2) and four patients with erosions down to bone (grade 3) were analysed. A total of 565 particles were extracted from synovial fluid samples by ferrography and analysed in a scanning electron microscope. A number of numerical descriptors, i.e. boundary fractal dimension, shape factor, convexity and elongation, were calculated for each particle image and correlated to the degree of osteoarthritis using non-parametric tests. Experiments demonstrated that there were significant differences between the numerical descriptors calculated for wear particles from healthy and osteoarthritic knee joints (P < 0.01), suggesting that the particle shape can be used as an indicator of the joint condition. In particular, the fractal dimension of the particle boundary was shown to correlate directly with the degree of osteoarthritis. Numerical analysis of the shape of wear particles found in human knee joints may provide a reliable means for the assessment of cartilage repair after surgical or conservative treatment of osteoarthritis.
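One of the numerical descriptors named above, the shape factor, can be illustrated on a polygonal particle outline. The abstract does not give its formula, so this sketch assumes the common form factor 4πA/P² (1 for a circle, smaller for irregular outlines); the function names are hypothetical.

```python
import math


def polygon_area_perimeter(vertices):
    """Area (shoelace formula) and perimeter of a simple closed polygon."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the outline
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter


def form_factor(vertices):
    """Assumed shape factor 4*pi*A / P**2 of a polygonal outline."""
    area, perimeter = polygon_area_perimeter(vertices)
    return 4.0 * math.pi * area / perimeter ** 2
```

A unit square gives π/4, and a thin, elongated outline gives a much smaller value, so a single number separates compact from elongated particles.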
NASA Astrophysics Data System (ADS)
Malmir, Hessam; Sahimi, Muhammad; Tabar, M. Reza Rahimi
2016-12-01
Packing of cubic particles arises in a variety of problems, ranging from biological materials to colloids and the fabrication of new types of porous materials with controlled morphology. The properties of such packings may also be relevant to problems involving suspensions of cubic zeolites, precipitation of salt crystals during CO2 sequestration in rock, and intrusion of fresh water in aquifers by saline water. Not much is known, however, about the structure and statistical descriptors of such packings. We present a detailed simulation and microstructural characterization of packings of nonoverlapping monodisperse cubic particles, following up on our preliminary results [H. Malmir et al., Sci. Rep. 6, 35024 (2016), 10.1038/srep35024]. A modification of the random sequential addition (RSA) algorithm has been developed to generate such packings, and a variety of microstructural descriptors, including the radial distribution function, the face-normal correlation function, two-point probability and cluster functions, the lineal-path function, the pore-size distribution function, and surface-surface and surface-void correlation functions, have been computed, along with the specific surface and mean chord length of the packings. The results indicate the existence of both spatial and orientational long-range order as the packing density increases. The maximum packing fraction achievable with the RSA method is about 0.57, which represents the limit for a structure similar to liquid crystals.
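The basic idea of random sequential addition can be sketched as below. This is plain RSA with axis-aligned cubes in a non-periodic box, a deliberate simplification: the abstract describes a modified RSA whose details are not given here, so the function names, the fixed orientation, and the boundary handling are all assumptions.

```python
import random


def rsa_aligned_cubes(box, side, attempts, seed=0):
    """Plain RSA: place equal, axis-aligned cubes one by one at random.

    A trial cube is kept only if it overlaps no cube already placed.
    Returns the list of accepted cube centres.
    """
    rng = random.Random(seed)
    centers = []
    for _ in range(attempts):
        trial = tuple(rng.uniform(side / 2.0, box - side / 2.0) for _ in range(3))
        # Two aligned cubes are disjoint if they are separated by at least
        # one side length along at least one axis.
        if all(any(abs(trial[k] - c[k]) >= side for k in range(3)) for c in centers):
            centers.append(trial)
    return centers


def packing_fraction(centers, box, side):
    """Fraction of the box volume occupied by the placed cubes."""
    return len(centers) * side ** 3 / box ** 3


if __name__ == "__main__":
    placed = rsa_aligned_cubes(box=10.0, side=1.0, attempts=5000, seed=1)
    print(len(placed), packing_fraction(placed, 10.0, 1.0))
```

Accepted cubes are never moved or removed, so the packing fraction saturates as free space fragments; that irreversibility is what distinguishes RSA from equilibrium packing methods.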
Effects of metal ions on the reactivity and corrosion electrochemistry of Fe/FeS nanoparticles.
Kim, Eun-Ju; Kim, Jae-Hwan; Chang, Yoon-Seok; Turcio-Ortega, David; Tratnyek, Paul G
2014-04-01
Nano-zerovalent iron (nZVI) formed under sulfidic conditions results in a biphasic material (Fe/FeS) that reduces trichloroethene (TCE) more rapidly than nZVI associated only with iron oxides (Fe/FeO). Exposing Fe/FeS to dissolved metals (Pd(2+), Cu(2+), Ni(2+), Co(2+), and Mn(2+)) results in their sequestration by coprecipitation as dopants into FeS and FeO and/or by electroless precipitation as zerovalent metals that are hydrogenation catalysts. Using TCE reduction rates to probe the effect of metal amendments on the reactivity of Fe/FeS, it was found that Mn(2+) and Cu(2+) decreased TCE reduction rates, while Pd(2+), Co(2+), and Ni(2+) increased them. Electrochemical characterization of metal-amended Fe/FeS showed that aging caused passivation by growth of FeO and FeS phases and poisoning of catalytic metal deposits by sulfide. Correlation of rate constants for TCE reduction (kobs) with electrochemical parameters (corrosion potentials and currents, Tafel slopes, and polarization resistance) and descriptors of hydrogen activation by metals (exchange current density for hydrogen reduction and enthalpy of solution into metals) showed the controlling process changed with aging. For fresh Fe/FeS, kobs was best described by the exchange current density for activation of hydrogen, whereas kobs for aged Fe/FeS correlated with electrochemical descriptors of electron transfer.
Chitwood, Daniel H; Otoni, Wagner C
2017-10-01
Leaf shape among Passiflora species is spectacularly diverse. Underlying this diversity in leaf shape are profound changes in the patterning of the primary vasculature and laminar outgrowth. Each of these aspects of leaf morphology-vasculature and blade-provides different insights into leaf patterning. Here, we morphometrically analyze >3300 leaves from 40 different Passiflora species collected sequentially across the vine. Each leaf is measured in two different ways: using 1) 15 homologous Procrustes-adjusted landmarks of the vasculature, sinuses, and lobes; and 2) Elliptical Fourier Descriptors (EFDs), which quantify the outline of the leaf. The ability of landmarks, EFDs, and both datasets together are compared to determine their relative ability to predict species and node position within the vine. Pairwise correlation of x and y landmark coordinates and EFD harmonic coefficients reveals close associations between traits and insights into the relationship between vasculature and blade patterning. Landmarks, more reflective of the vasculature, and EFDs, more reflective of the blade contour, describe both similar and distinct features of leaf morphology. Landmarks and EFDs vary in ability to predict species identity and node position in the vine and exhibit a correlational structure (both within landmark or EFD traits and between the two data types) revealing constraints between vascular and blade patterning underlying natural variation in leaf morphology among Passiflora species. © The Authors 2017. Published by Oxford University Press.
Texture classification using non-Euclidean Minkowski dilation
NASA Astrophysics Data System (ADS)
Florindo, Joao B.; Bruno, Odemir M.
2018-03-01
This study presents a new method to extract meaningful descriptors of gray-scale texture images using Minkowski morphological dilation based on the Lp metric. The proposed approach is motivated by the success previously achieved by Bouligand-Minkowski fractal descriptors on texture classification. In essence, such descriptors are directly derived from the morphological dilation of a three-dimensional representation of the gray-level pixels using the classical Euclidean metric. In this way, we generalize the dilation for different values of p in the Lp metric (Euclidean is a particular case when p = 2) and obtain the descriptors from the cumulated distribution of the distance transform computed over the texture image. The proposed method is compared to other state-of-the-art approaches (such as local binary patterns and textons for example) in the classification of two benchmark data sets (UIUC and Outex). The proposed descriptors outperformed all the other approaches in terms of rate of images correctly classified. The interesting results suggest the potential of these descriptors in this type of task, with a wide range of possible applications to real-world problems.
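The effect of the Lp metric on the dilation can be sketched with a brute-force voxel count. This is an illustration only, assuming integer gray levels, p ≥ 1, and a voxel-counting definition of the dilation volume; the authors' descriptors are derived from a distance transform, which this sketch does not reproduce. The function name is hypothetical.

```python
from itertools import product


def dilation_volume(image, radius, p):
    """Count voxels within Lp distance `radius` of the gray-level surface.

    `image` is a list of rows of integer gray levels. Each pixel (x, y)
    with gray level g becomes the 3D point (x, y, g), and every point is
    dilated by an Lp ball. Assumes p >= 1, so the ball fits in the cube
    of half-width `radius`.
    """
    points = [(x, y, gray) for y, row in enumerate(image) for x, gray in enumerate(row)]
    reach = int(radius)
    limit = radius ** p  # compare p-th powers to avoid taking roots
    ball = [
        offset
        for offset in product(range(-reach, reach + 1), repeat=3)
        if sum(abs(c) ** p for c in offset) <= limit
    ]
    covered = set()
    for x, y, z in points:
        for dx, dy, dz in ball:
            covered.add((x + dx, y + dy, z + dz))
    return len(covered)
```

Evaluating the volume over a range of radii for several values of p gives one curve per metric; p = 2 corresponds to the classical Euclidean dilation, while p = 1 yields octahedral rather than spherical structuring elements.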
Flavoured cigarettes, sensation seeking and adolescents' perceptions of cigarette brands.
Manning, K C; Kelly, K J; Comello, M L
2009-12-01
This study examined the interactive effects of cigarette package flavour descriptors and sensation seeking on adolescents' brand perceptions. High school students (n = 253) were randomly assigned to one of two experimental conditions and sequentially exposed to cigarette package illustrations for three different brands. In the flavour descriptor condition, the packages included a description of the cigarettes as "cherry", while in the traditional descriptor condition the cigarette brands were described with common phrases found on tobacco packages such as "domestic blend." Following exposure to each package participants' hedonic beliefs, brand attitudes and trial intentions were assessed. Sensation seeking was also measured, and participants were categorised as lower or higher sensation seekers. Across hedonic belief, brand attitude and trial intention measures, there were interactions between package descriptor condition and sensation seeking. These interactions revealed that among high (but not low) sensation seekers, exposure to cigarette packages including sweet flavour descriptors led to more favourable brand impressions than did exposure to packages with traditional descriptors. Among high sensation seeking youths, the appeal of cigarette brands is enhanced through the use of flavours and associated descriptions on product packaging.
An Effective 3D Shape Descriptor for Object Recognition with RGB-D Sensors
Liu, Zhong; Zhao, Changchen; Wu, Xingming; Chen, Weihai
2017-01-01
RGB-D sensors have been widely used in various areas of computer vision and graphics. A good descriptor will effectively improve the performance of operation. This article further analyzes the recognition performance of shape features extracted from multi-modality source data using RGB-D sensors. A hybrid shape descriptor is proposed as a representation of objects for recognition. We first extracted five 2D shape features from contour-based images and five 3D shape features over point cloud data to capture the global and local shape characteristics of an object. The recognition performance was tested for category recognition and instance recognition. Experimental results show that the proposed shape descriptor outperforms several common global-to-global shape descriptors and is comparable to some partial-to-global shape descriptors that achieved the best accuracies in category and instance recognition. Contribution of partial features and computational complexity were also analyzed. The results indicate that the proposed shape features are strong cues for object recognition and can be combined with other features to boost accuracy. PMID:28245553
Vogt, Martin; Bajorath, Jürgen
2008-01-01
Bayesian classifiers are increasingly being used to distinguish active from inactive compounds and search large databases for novel active molecules. We introduce an approach to directly combine the contributions of property descriptors and molecular fingerprints in the search for active compounds that is based on a Bayesian framework. Conventionally, property descriptors and fingerprints are used as alternative features for virtual screening methods. Following the approach introduced here, probability distributions of descriptor values and fingerprint bit settings are calculated for active and database molecules and the divergence between the resulting combined distributions is determined as a measure of biological activity. In test calculations on a large number of compound activity classes, this methodology was found to consistently perform better than similarity searching using fingerprints and multiple reference compounds or Bayesian screening calculations using probability distributions calculated only from property descriptors. These findings demonstrate that there is considerable synergy between different types of property descriptors and fingerprints in recognizing diverse structure-activity relationships, at least in the context of Bayesian modeling.
Borràs, Eva; Ferré, Joan; Boqué, Ricard; Mestres, Montserrat; Aceña, Laura; Calvo, Angels; Busto, Olga
2016-08-01
Headspace-Mass Spectrometry (HS-MS), Fourier Transform Mid-Infrared spectroscopy (FT-MIR) and UV-Visible spectrophotometry (UV-vis) instrumental responses have been combined to predict virgin olive oil sensory descriptors. 343 olive oil samples analyzed during four consecutive harvests (2010-2014) were used to build multivariate calibration models using partial least squares (PLS) regression. The reference values of the sensory attributes were provided by expert assessors from an official taste panel. The instrumental data were modeled individually and also using data fusion approaches. The use of fused data with both low- and mid-level of abstraction improved PLS predictions for all the olive oil descriptors. The best PLS models were obtained for two positive attributes (fruity and bitter) and two defective descriptors (fusty and musty), all of them using data fusion of MS and MIR spectral fingerprints. Although good predictions were not obtained for some sensory descriptors, the results are encouraging, especially considering that the legal categorization of virgin olive oils only requires the determination of fruity and defective descriptors. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Murugavel, S.; Stephen, C. S. Jacob Prasanna; Subashini, R.; Reddy, H. Raveendranatha; AnanthaKrishnan, Dhanabalan
2016-10-01
The title compound 1-(2-chloro-4-phenylquinolin-3-yl)ethanone (CPQE) was synthesised effectively by chlorination of 3-acetyl-4-phenylquinolin-2(1H)-one (APQ) using POCl3 reagent. Structural and vibrational spectroscopic studies were performed using single crystal X-ray diffraction, FTIR and NMR spectral analysis, along with the DFT method using GAUSSIAN 03 software. The VEDA program was employed to perform a detailed interpretation of the vibrational spectra. Mulliken population analyses of atomic charges, MEP, HOMO-LUMO, NBO, global chemical reactivity descriptors and thermodynamic properties were examined by the DFT/B3LYP method with the 6-311G(d,p) basis set.
Vander Kloet, S P; Dickinson, T A
2009-05-01
The taxonomic integrity of Vaccinium section Bracteata sensu Sleumer was assessed using a variety of numerical measures on a data matrix created from 46 OTUs scored for 65 descriptors. These analyses supported a much restricted ambit for section Bracteata and the concomitant resurrection of section Nesococcus and section Euepigynium, a more cosmopolitan interpretation for section Eococcus and section Pyxothamnus as well as a new taxon, Vaccinium section Baccula-nigra Kloet, sect. nov. to accommodate V. fragile Franch. and its conspecifics. A key to all the sections as well as a brief description for each section is also provided.
A Kinect based intelligent e-rehabilitation system in physical therapy.
Gal, Norbert; Andrei, Diana; Nemeş, Dan Ion; Nădăşan, Emanuela; Stoicu-Tivadar, Vasile
2015-01-01
This paper presents an intelligent e-rehabilitation system based on the Kinect and a fuzzy inference system. The Kinect can detect the posture and motion of the patients, while the fuzzy inference system can interpret the acquired data on the cognitive level. The system is capable of assessing the initial posture and motion ranges of 20 joints. Using angles to describe the motion of the joints, exercise patterns can be developed for each patient. Using the exercise descriptors, the fuzzy inference system can track the patient and deliver real-time feedback to maximize the efficiency of the rehabilitation. The first laboratory tests confirm the utility of this system for initial posture detection, motion range and exercise tracking.
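The abstract above describes joint motion in terms of angles. A minimal sketch of how such an angle could be computed from three tracked 3D joint positions is given below; the function name `joint_angle` and the example coordinates are hypothetical, and the paper's actual angle definition is not stated in the abstract.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at joint b, between the segments b->a and b->c."""
    u = [ai - bi for ai, bi in zip(a, b)]
    v = [ci - bi for ci, bi in zip(c, b)]
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("joint positions must be distinct")
    cos_theta = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # Hypothetical shoulder, elbow and wrist positions (arbitrary units).
    shoulder, elbow, wrist = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
    print(joint_angle(shoulder, elbow, wrist))
```

Tracking this value over time for each joint would give the kind of motion range that an exercise descriptor could be built on.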
Lesnicki, Dominika; Sulpizi, Marialore
2018-06-13
What happens when extra vibrational energy is added to water? Using nonequilibrium molecular dynamics simulations that include the full electronic structure, together with novel descriptors based on the projected vibrational density of states, we are able to follow the flow of excess vibrational energy from the excited stretching and bending modes. We find that the energy relaxation, mostly mediated by a stretching-stretching coupling in the first solvation shell, is highly heterogeneous and strongly depends on the local environment: a strong hydrogen bond network can transport energy on a time scale of 200 fs, whereas a weaker network can slow down the transport by a factor of 2-3.
Characterizing region of interest in image using MPEG-7 visual descriptors
NASA Astrophysics Data System (ADS)
Ryu, Min-Sung; Park, Soo-Jun; Won, Chee Sun
2005-08-01
In this paper, we propose a region-based image retrieval system using the EHD (Edge Histogram Descriptor) and CLD (Color Layout Descriptor) of the MPEG-7 descriptors. The combined descriptor can efficiently describe edge and color features in terms of sub-image regions. That is, the basic unit for the selection of the region-of-interest (ROI) in the image is the sub-image block of the EHD, which corresponds to 16 (i.e., 4x4) non-overlapping image blocks in the image space. This implies that, to have a one-to-one region correspondence between EHD and CLD, we need to take an 8x8 inverse DCT (IDCT) for the CLD. Experimental results show that the proposed retrieval scheme can be used for ROI-based image retrieval of MPEG-7 indexed images.
Learning to assign binary weights to binary descriptor
NASA Astrophysics Data System (ADS)
Huang, Zhoudi; Wei, Zhenzhong; Zhang, Guangjun
2016-10-01
The construction of robust binary local feature descriptors is receiving increasing interest because their binary nature enables fast processing while requiring significantly less memory than their floating-point competitors. To bridge the performance gap between binary and floating-point descriptors without increasing the computational cost of computing and matching, optimal binary weights are learned and assigned to the binary descriptor, on the premise that each bit might contribute differently to distinctiveness and robustness. Technically, a large-scale regularized optimization method is applied to learn float weights for each bit of the binary descriptor. Furthermore, a binary approximation of the float weights is obtained using an efficient alternating greedy strategy, which can significantly improve the discriminative power while preserving the fast matching advantage. Extensive experimental results on two challenging datasets (Brown dataset and Oxford dataset) demonstrate the effectiveness and efficiency of the proposed method.
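To make the matching cost argument concrete, the sketch below shows one plausible way that binary weights could enter the comparison: as a bit mask applied to the XOR of two descriptors before counting set bits. This is an assumption made for illustration; the abstract does not specify how the learned weights are used at matching time, and the names `hamming` and `weighted_hamming` are hypothetical.

```python
def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def hamming(a, b):
    """Plain Hamming distance between two descriptors stored as integers."""
    return popcount(a ^ b)


def weighted_hamming(a, b, weights):
    """Hamming distance counting only the bits selected by the binary mask.

    Bits whose weight is 0 are ignored; bits whose weight is 1 count once.
    The cost is one extra AND compared with the plain distance.
    """
    return popcount((a ^ b) & weights)


if __name__ == "__main__":
    d1, d2 = 0b1011, 0b0010   # hypothetical 4-bit descriptors
    mask = 0b0011             # hypothetical learned binary weights
    print(hamming(d1, d2), weighted_hamming(d1, d2, mask))
```

Because the mask is itself binary, the comparison stays a sequence of bitwise operations, which is consistent with the fast matching property the abstract emphasises.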
NASA Astrophysics Data System (ADS)
Desai, Alok; Lee, Dah-Jye
2013-12-01
There has been significant research on the development of feature descriptors in the past few years. Most of them do not emphasize real-time applications. This paper presents the development of an affine invariant feature descriptor for low resource applications such as UAV and UGV that are equipped with an embedded system with a small microprocessor, a field programmable gate array (FPGA), or a smart phone device. UAV and UGV have proven suitable for many promising applications such as unknown environment exploration and search and rescue operations. These applications require onboard image processing for obstacle detection, avoidance and navigation. All these real-time vision applications require a camera to grab images and match features using a feature descriptor. A good feature descriptor will uniquely describe a feature point, thus allowing it to be correctly identified and matched with its corresponding feature point in another image. A few feature description algorithms are available for a resource limited system. They either require too much of the device's resources or too much simplification of the algorithm, which results in reduced performance. This research is aimed at meeting the needs of these systems without sacrificing accuracy. This paper introduces a new feature descriptor called PRObabilistic model (PRO) for UGV navigation applications. It is a compact and efficient binary descriptor that is hardware-friendly and easy to implement.
Cirujeda, Pol; Muller, Henning; Rubin, Daniel; Aguilera, Todd A; Loo, Billy W; Diehn, Maximilian; Binefa, Xavier; Depeursinge, Adrien
2015-01-01
In this paper we present a novel technique for characterizing and classifying 3D textured volumes belonging to different lung tissue types in 3D CT images. We build a volume-based 3D descriptor, robust to changes of size, rigid spatial transformations and texture variability, thanks to the integration of Riesz-wavelet features within a Covariance-based descriptor formulation. 3D Riesz features characterize the morphology of tissue density due to their response to changes in intensity in CT images. These features are encoded in a Covariance-based descriptor formulation: this provides a compact and flexible representation thanks to the use of feature variations rather than dense features themselves and adds robustness to spatial changes. Furthermore, the particular symmetric positive definite matrix form of these descriptors causes them to lie in a Riemannian manifold. Thus, descriptors can be compared with analytical measures, and accurate techniques from machine learning and clustering can be adapted to their spatial domain. Additionally we present a classification model following a "Bag of Covariance Descriptors" paradigm in order to distinguish three different nodule tissue types in CT: solid, ground-glass opacity, and healthy lung. The method is evaluated on an acquired dataset of 95 patients with ground truth manually delineated in 3D by radiation oncology specialists, and quantitative sensitivity and specificity values are presented.
Discovering collectively informative descriptors from high-throughput experiments
2009-01-01
Background Improvements in high-throughput technology and its increasing use have led to the generation of many highly complex datasets that often address similar biological questions. Combining information from these studies can increase the reliability and generalizability of results and also yield new insights that guide future research. Results This paper describes a novel algorithm called BLANKET for symmetric analysis of two experiments that assess informativeness of descriptors. The experiments are required to be related only in that their descriptor sets intersect substantially and their definitions of case and control are consistent. From resulting lists of n descriptors ranked by informativeness, BLANKET determines shortlists of descriptors from each experiment, generally of different lengths p and q. For any pair of shortlists, four numbers are evident: the number of descriptors appearing in both shortlists, in exactly one shortlist, or in neither shortlist. From the associated contingency table, BLANKET computes Right Fisher Exact Test (RFET) values used as scores over a plane of possible pairs of shortlist lengths [1,2]. BLANKET then chooses a pair or pairs with RFET score less than a threshold; the threshold depends upon n and shortlist length limits and represents a quality of intersection achieved by less than 5% of random lists. Conclusions Researchers seek within a universe of descriptors some minimal subset that collectively and efficiently predicts experimental outcomes. Ideally, any smaller subset should be insufficient for reliable prediction and any larger subset should have little additional accuracy. As a method, BLANKET is easy to conceptualize and presents only moderate computational complexity. Many existing databases could be mined using BLANKET to suggest optimal sets of predictive descriptors. PMID:20021653
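The scoring step described above can be sketched as follows, assuming both experiments rank the same universe of n descriptors. The 2x2 table is built from a pair of shortlists of lengths p and q, and the right-tailed Fisher exact test is the hypergeometric probability of an overlap at least as large as the one observed. The function names are hypothetical and this is not the published BLANKET code; threshold selection and the search over (p, q) pairs are omitted.

```python
from fractions import Fraction
from math import comb


def contingency(ranked1, ranked2, p, q):
    """Counts of descriptors in both shortlists, only one, or neither."""
    short1, short2 = set(ranked1[:p]), set(ranked2[:q])
    universe = set(ranked1) | set(ranked2)
    both = len(short1 & short2)
    only1 = len(short1 - short2)
    only2 = len(short2 - short1)
    neither = len(universe) - both - only1 - only2
    return both, only1, only2, neither


def right_fisher(both, only1, only2, neither):
    """Right-tailed Fisher exact test: P(overlap >= observed) by chance."""
    n = both + only1 + only2 + neither
    p = both + only1          # length of the first shortlist
    q = both + only2          # length of the second shortlist
    tail = sum(comb(p, k) * comb(n - p, q - k) for k in range(both, min(p, q) + 1))
    return float(Fraction(tail, comb(n, q)))


if __name__ == "__main__":
    # Hypothetical rankings of ten descriptors from two experiments.
    exp1 = list("abcdefghij")
    exp2 = list("cbajihgfed")
    table = contingency(exp1, exp2, 3, 3)
    print(table, right_fisher(*table))
```

A small score means the two shortlists share more descriptors than random lists of the same lengths would be expected to, which is the sense of "quality of intersection" used in the abstract.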
Health sciences descriptors in the brazilian speech-language and hearing science.
Campanatti-Ostiz, Heliane; Andrade, Claudia Regina Furquim de
2010-01-01
Terminology in Speech-Language and Hearing Science. To propose a specific thesaurus for Speech-Language and Hearing Science, in the English, Portuguese and Spanish languages, based on the existing keywords available in the Health Sciences Descriptors (DeCS). The methodology was based on the pilot study developed by Campanatti-Ostiz and Andrade, whose purpose was to verify the methodological viability of creating a Speech-Language and Hearing Science category in the DeCS. The scientific journals selected for analysis of the titles, abstracts and keywords of all scientific articles were those in the field of Speech-Language and Hearing Science indexed in SciELO. 1. Recovery of the descriptors in the English language (Medical Subject Headings--MeSH); 2. Recovery and hierarchic organization of the descriptors in the Portuguese language (DeCS). The obtained data were analyzed as follows: descriptive analyses and relative relevance analyses of the DeCS areas. Based on the first analyses, we decided to select all 761 descriptors, with all the hierarchic numbers, independently of their occurrence (occurrence number--ON), and based on the second analyses, we decided to propose excluding the less relevant areas and the exclusive DeCS areas. The proposal was finished with a total of 1676 occurrences of DeCS descriptors, distributed in the following areas: Anatomy; Diseases; Analytical, Diagnostic and Therapeutic Techniques and Equipments; Psychiatry and Psychology; Phenomena and Processes; Health Care. The presented proposal of a thesaurus contains the specific terminology of the Brazilian Speech-Language and Hearing Sciences and reflects the descriptors of the published scientific production. Because the DeCS is a trilingual vocabulary (Portuguese, English and Spanish), the proposed organization of descriptors can be used in these three languages, allowing greater cultural interchange between different nations.
Direct memory access transfer completion notification
Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Parker, Jeffrey J [Rochester, MN
2011-02-15
DMA transfer completion notification includes: inserting, by an origin DMA engine on an origin node in an injection first-in-first-out (`FIFO`) buffer, a data descriptor for an application message to be transferred to a target node on behalf of an application on the origin node; inserting, by the origin DMA engine, a completion notification descriptor in the injection FIFO buffer after the data descriptor for the message, the completion notification descriptor specifying a packet header for a completion notification packet; transferring, by the origin DMA engine to the target node, the message in dependence upon the data descriptor; sending, by the origin DMA engine, the completion notification packet to a local reception FIFO buffer using a local memory FIFO transfer operation; and notifying, by the origin DMA engine, the application that transfer of the message is complete in response to receiving the completion notification packet in the local reception FIFO buffer.
Direct memory access transfer completion notification
Archer, Charles J.; Blocksome, Michael A.; Parker, Jeffrey J.
2010-08-17
Methods, apparatus, and products are disclosed for DMA transfer completion notification that include: inserting, by an origin DMA engine on an origin compute node in an injection FIFO buffer, a data descriptor for an application message to be transferred to a target compute node on behalf of an application on the origin compute node; inserting, by the origin DMA engine, a completion notification descriptor in the injection FIFO buffer after the data descriptor for the message, the completion notification descriptor specifying an address of a completion notification field in application storage for the application; transferring, by the origin DMA engine to the target compute node, the message in dependence upon the data descriptor; and notifying, by the origin DMA engine, the application that the transfer of the message is complete, including performing a local direct put operation to store predesignated notification data at the address of the completion notification field.
Predicted Hematologic and Plasma Volume Responses Following Rapid Ascent to Progressive Altitudes
2014-06-01
of these changes, and define baseline demographics and physiologic descriptors that are important in predicting these changes. The overall impact of... physiologic descriptors that are important in predicting these changes. Using general linear mixed models and a comprehensive relational database...accomplished using a comprehensive relational database containing individual ascent profiles, demographics, and physiologic subject descriptors as well as
Ancillao, Andrea; van der Krogt, Marjolein M; Buizer, Annemieke I; Witbreuk, Melinda M; Cappa, Paolo; Harlaar, Jaap
2017-10-01
Gait analysis is used for the assessment of walking ability of children with cerebral palsy (CP), to inform clinical decision making and to quantify changes after treatment. To simplify gait analysis interpretation and to quantify deviations from normality, some quantitative synthetic descriptors were developed over the years, such as the Movement Analysis Profile (MAP) and the Linear Fit Method (LFM), but their interpretation is not always straightforward. The aims of this work were to: (i) study gait changes, by means of synthetic descriptors, in children with CP that underwent Single Event Multilevel Surgery; (ii) compare the MAP and the LFM on these patients; (iii) design a new index that may overcome the limitations of the previous methods, i.e. the lack of information about the direction of deviation or its source. Gait analysis exams of 10 children with CP, pre- and post-surgery, were collected and MAP and LFM were computed. A new index was designed as a modified version of the MAP by separating out changes in offset (named OC-MAP). MAP documented an improvement in the gait pattern after surgery. The highest effect was observed for the knee flexion/extension angle. However, a worsening was observed as an increase in anterior pelvic tilt. An important source of gait deviation was recognized in the offset between observed tracks and reference. OC-MAP allowed the assessment of the offset component versus the shape component of deviation. LFM provided results similar to OC-MAP offset analysis but could not be considered reliable due to intrinsic limitations. As offset in gait features played an important role in gait deviation, OC-MAP synthetic analysis was proposed as a novel approach to a meaningful parameterisation of global deviations in gait patterns of subjects with CP and gait changes after treatment. Copyright © 2017 Elsevier B.V. All rights reserved.
CINAHL and MEDLINE: a comparison of indexing practices.
Brenner, S H; McKinin, E J
1989-10-01
A random sample of fifty nursing articles indexed in both MEDLINE and CINAHL (NURSING & ALLIED HEALTH) during 1986 was used for comparing indexing practices. Indexing was analyzed by counting the number of major descriptors, the number of major and minor descriptors, the number of indexing access points, the number of common indexing access points, and the number and type of unique indexing access points. The study results indicate: there are few differences in the number of major descriptors used, MEDLINE uses almost twice as many descriptors, MEDLINE has almost twice as many indexing access points, and MEDLINE and CINAHL provide few common access points.
CINAHL and MEDLINE: a comparison of indexing practices.
Brenner, S H; McKinin, E J
1989-01-01
A random sample of fifty nursing articles indexed in both MEDLINE and CINAHL (NURSING & ALLIED HEALTH) during 1986 was used for comparing indexing practices. Indexing was analyzed by counting the number of major descriptors, the number of major and minor descriptors, the number of indexing access points, the number of common indexing access points, and the number and type of unique indexing access points. The study results indicate: there are few differences in the number of major descriptors used, MEDLINE uses almost twice as many descriptors, MEDLINE has almost twice as many indexing access points, and MEDLINE and CINAHL provide few common access points. PMID:2676049
Spherical harmonics coefficients for ligand-based virtual screening of cyclooxygenase inhibitors.
Wang, Quan; Birod, Kerstin; Angioni, Carlo; Grösch, Sabine; Geppert, Tim; Schneider, Petra; Rupp, Matthias; Schneider, Gisbert
2011-01-01
Molecular descriptors are essential for many applications in computational chemistry, such as ligand-based similarity searching. Spherical harmonics have previously been suggested as comprehensive descriptors of molecular structure and properties. We investigate a spherical harmonics descriptor for shape-based virtual screening. We introduce and validate a partially rotation-invariant three-dimensional molecular shape descriptor based on the norm of spherical harmonics expansion coefficients. Using this molecular representation, we parameterize molecular surfaces, i.e., isosurfaces of spatial molecular property distributions. We validate the shape descriptor in a comprehensive retrospective virtual screening experiment. In a prospective study, we virtually screen a large compound library for cyclooxygenase inhibitors, using a self-organizing map as a pre-filter and the shape descriptor for candidate prioritization. 12 compounds were tested in vitro for direct enzyme inhibition and in a whole blood assay. Active compounds containing a triazole scaffold were identified as direct cyclooxygenase-1 inhibitors. This outcome corroborates the usefulness of spherical harmonics for representation of molecular shape in virtual screening of large compound collections. The combination of pharmacophore and shape-based filtering of screening candidates proved to be a straightforward approach to finding novel bioactive chemotypes with minimal experimental effort.
Zhou, Ru; Zhong, Dexing; Han, Jiuqiang
2013-01-01
The performance of conventional minutiae-based fingerprint authentication algorithms degrades significantly when dealing with low quality fingerprints with lots of cuts or scratches. A similar degradation of the minutiae-based algorithms is observed when small overlapping areas appear because of the quite narrow width of the sensors. Based on the detection of minutiae, Scale Invariant Feature Transformation (SIFT) descriptors are employed to fulfill verification tasks in the above difficult scenarios. However, the original SIFT algorithm is not suitable for fingerprints because of: (1) the similar patterns of parallel ridges; and (2) high computational resource consumption. To enhance the efficiency and effectiveness of the algorithm for fingerprint verification, we propose a SIFT-based Minutia Descriptor (SMD) to improve the SIFT algorithm through image processing, descriptor extraction and matching. A two-step fast matcher, named improved All Descriptor-Pair Matching (iADM), is also proposed to implement 1:N verifications in real time. Fingerprint Identification using SMD and iADM (FISiA) achieved a significant improvement in accuracy on representative databases compared with the conventional minutiae-based method. The speed of FISiA can also meet real-time requirements. PMID:23467056
NASA Astrophysics Data System (ADS)
Rhodes, Andrew P.; Christian, John A.; Evans, Thomas
2017-12-01
With the availability and popularity of 3D sensors, it is advantageous to re-examine the use of point cloud descriptors for the purpose of pose estimation and spacecraft relative navigation. One popular descriptor is the oriented unique repeatable clustered viewpoint feature histogram (
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when, in a stepwise regression, a descriptor is included in or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes, at different steps of the stepwise regression, a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes, which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of a greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined, showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
ANN expert system screening for illicit amphetamines using molecular descriptors
NASA Astrophysics Data System (ADS)
Gosav, S.; Praisler, M.; Dorohoi, D. O.
2007-05-01
The goal of this study was to develop an artificial neural network (ANN) based on computed descriptors, which would be able to classify the molecular structures of potential illicit amphetamines and to derive their biological activity according to the similarity of their molecular structure with amphetamines of known toxicity. The system is necessary for testing new molecular structures for epidemiological, clinical, and forensic purposes. It was built using a database formed by 146 compounds representing drugs of abuse (mainly central stimulants, hallucinogens, sympathomimetic amines, narcotics and other potent analgesics), precursors, or derivatized counterparts. Their molecular structures were characterized by computing three types of descriptors: 38 constitutional descriptors (CDs), 69 topological descriptors (TDs) and 160 3D-MoRSE descriptors (3DDs). An ANN system was built for each category of variables. All three networks (CD-NN, TD-NN and 3DD-NN) were trained to distinguish between stimulant amphetamines, hallucinogenic amphetamines, and nonamphetamines. A selection of variables was performed when necessary. The efficiency with which each network identifies the class identity of an unknown sample was evaluated by calculating several figures of merit. The results of the comparative analysis are presented.
Arismendi, Ivan; Johnson, Sherri L.; Dunham, Jason B.; Haggerty, Roy
2013-01-01
1. Temperature is a major driver of ecological processes in stream ecosystems, yet the dynamics of thermal regimes remain poorly described. Most work has focused on relatively simple descriptors that fail to capture the full range of conditions that characterise thermal regimes of streams across seasons or throughout the year. 2. To more completely describe thermal regimes, we developed several descriptors of magnitude, variability, frequency, duration and timing of thermal events throughout a year. We evaluated how these descriptors change over time using long-term (1979–2009), continuous temperature data from five relatively undisturbed cold-water streams in western Oregon, U.S.A. In addition to trends for each descriptor, we evaluated similarities among them, as well as patterns of spatial coherence, and temporal synchrony. 3. Using different groups of descriptors, we were able to more fully capture distinct aspects of the full range of variability in thermal regimes across space and time. A subset of descriptors showed both higher coherence and synchrony and, thus, an appropriate level of responsiveness to examine evidence of regional climatic influences on thermal regimes. Most notably, daily minimum values during winter–spring were the most responsive descriptors to potential climatic influences. 4. Overall, thermal regimes in streams we studied showed high frequency and low variability of cold temperatures during the cold-water period in winter and spring, and high frequency and high variability of warm temperatures during the warm-water period in summer and autumn. The cold and warm periods differed in the distribution of events with a higher frequency and longer duration of warm events in summer than cold events in winter. The cold period exhibited lower variability in the duration of events, but showed more variability in timing. 5. In conclusion, our results highlight the importance of a year-round perspective in identifying the most responsive characteristics or descriptors of thermal regimes in streams. The descriptors we provide herein can be applied across hydro-ecological regions to evaluate spatial and temporal patterns in thermal regimes. Evaluation of coherence and synchrony of different components of thermal regimes can facilitate identification of impacts of regional climate variability or local human or natural influences.
Kamal, Rasha M; Helal, Maha H; Mansour, Sahar M; Haggag, Marwa A; Nada, Omniya M; Farahat, Iman G; Alieldin, Nelly H
2016-07-12
To assess the feasibility of using the MRI breast imaging reporting and data system (BI-RADS) lexicon morphology descriptors to characterize enhancing breast lesions identified on contrast-enhanced spectral mammography (CESM). The study is a retrospective analysis of the morphology descriptors of 261 enhancing breast lesions identified on CESM in 239 patients. We presented the morphological categorization of the included lesions into focus, mass and non-mass. Further classifications included (1) the multiplicity for "focus" category, (2) the shape, margin and internal enhancement for "mass" category and (3) the distribution and internal enhancement for "non-mass" category. Each morphology descriptor was evaluated individually (irrespective of all other descriptors) by calculating its sensitivity, specificity, positive-predictive value (PPV) and negative-predictive value (NPV) and likelihood ratios (LRs). The study included 68/261 (26.1%) benign lesions and 193/261 (73.9%) malignant lesions. Intensely enhancing foci, whether single (7/12, 58.3%) or multiple (2/12, 16.7%), were malignant. Descriptors of "irregular"-shape (PPV: 92.4%) and "non-circumscribed" margin (odds ratio: 55.2, LR positive: 4.77; p-value: <0.001) were more compatible with malignancy. Internal mass enhancement patterns showed a very low specificity (58.0%) and NPV (40.0%). Non-mass enhancement (NME) was detected in 81/261 lesions. Asymmetrical NME in 81% (n = 52/81) lesions was malignant lesions and internal enhancement patterns indicative of malignancy were the heterogeneous and clumped ones. We can apply the MRI morphology descriptors to characterize lesions on CESM, but with a few exceptions. In many situations, irregular-shaped, non-circumscribed masses and NME with focal, ductal or segmental distribution and heterogeneous or clumped enhancement are the most suggestive descriptors of malignant pathologies.
(1) The MRI BI-RADS lexicon morphology descriptors can be applied in the characterization of enhancing lesions on CESM with a few exceptions. (2) Multiple bilateral intensely enhancing foci should not be included under the normal background parenchymal enhancement unless they are proved to be benign by biopsy. (3) Mass lesion features that indicated malignancy were irregular-shaped, spiculated and irregular margins and heterogeneous internal enhancement patterns. The rim enhancement pattern should not be considered as a descriptor of malignant lesions unless CESM is coupled with an ultrasound examination.
Kamal, Rasha M; Helal, Maha H; Haggag, Marwa A; Nada, Omniya M; Farahat, Iman G; Alieldin, Nelly H
2016-01-01
Objective: To assess the feasibility of using the MRI breast imaging reporting and data system (BI-RADS) lexicon morphology descriptors to characterize enhancing breast lesions identified on contrast-enhanced spectral mammography (CESM). Methods: The study is a retrospective analysis of the morphology descriptors of 261 enhancing breast lesions identified on CESM in 239 patients. We categorized the included lesions morphologically into focus, mass and non-mass. Further classifications included (1) the multiplicity for the “focus” category, (2) the shape, margin and internal enhancement for the “mass” category and (3) the distribution and internal enhancement for the “non-mass” category. Each morphology descriptor was evaluated individually (irrespective of all other descriptors) by calculating its sensitivity, specificity, positive-predictive value (PPV), negative-predictive value (NPV) and likelihood ratios (LRs). Results: The study included 68/261 (26.1%) benign lesions and 193/261 (73.9%) malignant lesions. Intensely enhancing foci, whether single (7/12, 58.3%) or multiple (2/12, 16.7%), were malignant. Descriptors of “irregular” shape (PPV: 92.4%) and “non-circumscribed” margin (odds ratio: 55.2, LR positive: 4.77; p-value: <0.001) were more compatible with malignancy. Internal mass enhancement patterns showed a very low specificity (58.0%) and NPV (40.0%). Non-mass enhancement (NME) was detected in 81/261 lesions. Asymmetrical NME in 81% (n = 52/81) of lesions was malignant, and the internal enhancement patterns indicative of malignancy were the heterogeneous and clumped ones. Conclusion: We can apply the MRI morphology descriptors to characterize lesions on CESM, but with a few exceptions. In many situations, irregular-shaped, non-circumscribed masses and NME with focal, ductal or segmental distribution and heterogeneous or clumped enhancement are the most suggestive descriptors of malignant pathologies.
Advances in knowledge: (1) The MRI BI-RADS lexicon morphology descriptors can be applied in the characterization of enhancing lesions on CESM with a few exceptions. (2) Multiple bilateral intensely enhancing foci should not be included under the normal background parenchymal enhancement unless they are proved to be benign by biopsy. (3) Mass lesion features that indicated malignancy were irregular shape, spiculated and irregular margins and heterogeneous internal enhancement patterns. The rim enhancement pattern should not be considered as a descriptor of malignant lesions unless CESM is coupled with an ultrasound examination. PMID:27327403
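The per-descriptor measures named in the abstract above (sensitivity, specificity, PPV, NPV and likelihood ratios) all follow from a 2x2 table of descriptor presence against pathology. The following is a minimal sketch under stated assumptions: the function name and the counts are hypothetical illustrations, not data from the study, and every cell of the table is assumed to be non-zero.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures for one descriptor from a 2x2 table.

    tp/fn: malignant lesions with/without the descriptor;
    fp/tn: benign lesions with/without the descriptor.
    Assumes no cell is zero (no guard against division by zero).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                      # positive-predictive value
        "npv": tn / (tn + fn),                      # negative-predictive value
        "lr_pos": sensitivity / (1 - specificity),  # positive likelihood ratio
        "lr_neg": (1 - sensitivity) / specificity,  # negative likelihood ratio
        "odds_ratio": (tp * tn) / (fp * fn),        # diagnostic odds ratio
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = diagnostic_metrics(tp=90, fp=10, fn=30, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the diagnostic odds ratio equals the ratio of the positive to the negative likelihood ratio, which is a convenient consistency check when tabulating results per descriptor.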
Classifying Measures of Biological Variation
Gregorius, Hans-Rolf; Gillet, Elizabeth M.
2015-01-01
Biological variation is commonly measured at two basic levels: variation within individual communities, and the distribution of variation over communities or within a metacommunity. We develop a classification for the measurement of biological variation on both levels: within communities into the categories of dispersion and diversity, and within metacommunities into the categories of compositional differentiation and partitioning of variation. There are essentially two approaches to characterizing the distribution of trait variation over communities: individuals with the same trait state or type tend to occur in the same community (which describes differentiation tendencies), and individuals with different types tend to occur in different communities (which describes apportionment tendencies). Both approaches can be viewed from the dual perspectives of trait variation distributed over communities (CT perspective) and community membership distributed over trait states (TC perspective). This classification covers most of the relevant descriptors (qualified measures) of biological variation, as is demonstrated with the help of major families of descriptors. Moreover, the classification is shown to open ways to develop new descriptors that meet current needs. Yet the classification also reveals the misclassification of some prominent and widely applied descriptors: dispersion is often misclassified as diversity, particularly in cases where dispersion descriptors allow for the computation of effective numbers; the descriptor GST of population genetics is commonly misclassified as compositional differentiation and confused with partitioning-oriented differentiation, whereas it actually measures partitioning-oriented apportionment; descriptors of β-diversity are ambiguous about the differentiation effects they are supposed to represent and therefore require conceptual reconsideration. PMID:25807558
Quantitative Structure-Cytotoxicity Relationship of Oleoylamides.
Sakagami, Hiroshi; Uesawa, Yoshihiro; Ishihara, Mariko; Kagaya, Hajime; Kanamoto, Taisei; Terakubo, Shigemi; Nakashima, Hideki; Takao, Koichi; Sugita, Yoshiaki
2015-10-01
Eighteen oleoylamides were subjected to quantitative structure-activity relationship analysis based on their cytotoxicity, tumor selectivity and anti-HIV activity, in order to assess their biological activities. Cytotoxicity against four human oral squamous cell carcinoma (OSCC) cell lines and five types of normal human oral cells (gingival fibroblast, periodontal ligament fibroblast, pulp cell, oral keratinocyte, primary gingival epithelial cells) was determined by the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) method. Tumor selectivity (TS) was evaluated by the ratio of the mean 50% cytotoxic concentration (CC50) against normal human oral cells to that against OSCC cell lines. Potency-selectivity expression (PSE) was determined by the ratio of TS to CC50 against OSCC. Anti-HIV activity was evaluated by the ratio of CC50 to the concentration leading to 50% cytoprotection from HIV infection (EC50). Physicochemical, structural and quantum-chemical parameters were calculated based on the conformations optimized by the LowModeMD method. Among the 18 derivatives, compounds 8 (with a catechol group) and 18 (with a (2-pyridyl)amino group) had the highest TS. On the other hand, doxorubicin and 5-fluorouracil (5-FU) were more highly cytotoxic to normal epithelial cells, displaying unexpectedly lower TS and PSE values. None of the compounds had anti-HIV activity. Among 330 chemical descriptors, 75, 73 and 19 descriptors significantly correlated with the cytotoxicity to normal cells, the cytotoxicity to tumor cells, and TS, respectively. Multivariate statistics with chemical descriptors for molecular polarization and hydrophobicity may be useful for the evaluation of cytotoxicity and TS of oleoylamides. Copyright© 2015 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.
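The three ratios defined in the abstract above can be written out directly. This is an illustrative sketch only: the function names and all concentration values are hypothetical, and using the mean CC50 over the OSCC lines as the denominator of PSE is an assumption, since the abstract does not say whether any scaling factor is applied.

```python
from statistics import mean


def tumor_selectivity(cc50_normal, cc50_tumor):
    """TS: mean CC50 against normal cells over mean CC50 against tumor lines."""
    return mean(cc50_normal) / mean(cc50_tumor)


def potency_selectivity(ts, cc50_tumor):
    """PSE: TS divided by the mean CC50 against tumor lines (assumed form)."""
    return ts / mean(cc50_tumor)


def anti_hiv_index(cc50, ec50):
    """Anti-HIV activity expressed as the ratio CC50 / EC50."""
    return cc50 / ec50


# Hypothetical CC50 values in micromolar, not taken from the study.
cc50_normal = [100, 200, 150, 250, 300]  # five normal oral cell types
cc50_oscc = [20, 30, 10, 40]             # four OSCC cell lines

ts = tumor_selectivity(cc50_normal, cc50_oscc)
pse = potency_selectivity(ts, cc50_oscc)
print(f"TS = {ts:.2f}, PSE = {pse:.2f}")
```

A compound scores well on PSE only when it is both selective (high TS) and potent against the tumor lines (low CC50), which is why the two values are reported together.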
Sirois, S; Tsoukas, C M; Chou, Kuo-Chen; Wei, Dongqing; Boucher, C; Hatzakis, G E
2005-03-01
Quantitative Structure Activity Relationship (QSAR) techniques are used routinely by computational chemists in drug discovery and development to analyze datasets of compounds. Quantitative numerical methods like Partial Least Squares (PLS) and Artificial Neural Networks (ANN) have been used in QSAR to establish correlations between molecular properties and bioactivity. However, ANN may be advantageous over PLS because it considers the interrelations of the modeled variables. This study focused on the HIV-1 Protease (HIV-1 Pr) inhibitors belonging to the peptidomimetic class of compounds. The main objective was to select molecular descriptors with the best predictive value for antiviral potency (Ki). PLS and ANN were used to predict Ki activity of HIV-1 Pr inhibitors and the results were compared. To address the issue of dimensionality reduction, Genetic Algorithms (GA) were used for variable selection and their performance was compared against that of ANN. Finally, the structure of the optimum ANN achieving the highest Pearson's R coefficient was determined. On the basis of Pearson's R, PLS and ANN were compared to determine which exhibits maximum performance. Training and validation of models were performed on 15 random split sets of the master dataset, which consisted of 231 compounds. For each compound 192 molecular descriptors were considered. The molecular structure and constant of inhibition (Ki) were selected from the NIAID database. Study findings suggested that non-covalent interactions such as hydrophobicity, shape and hydrogen bonding describe well the antiviral activity of the HIV-1 Pr compounds. The significance of lipophilicity and its relationship to HIV-1 associated hyperlipidemia and lipodystrophy syndrome warrant further investigation.
Andrade-Ochoa, S; García-Machorro, J; Bello, Martiniano; Rodríguez-Valdez, L M; Flores-Sandoval, C A; Correa-Basurto, J
2017-08-03
Human immunodeficiency virus type-1 (HIV-1) has infected more than 40 million people around the world. HIV-1 treatment still has several side effects, and the development of a vaccine, which is another potential option for decreasing human infections, has faced challenges. This work presents a computational study that includes a quantitative structure activity relationship (QSAR) using density functional theory (DFT) for reported peptides to identify the principal quantum mechanics descriptors related to peptide activity. In addition, the molecular recognition properties of these peptides are explored on major histocompatibility complex I (MHC-I) through docking and molecular dynamics (MD) simulations accompanied by the Molecular Mechanics Generalized Born Surface Area (MMGBSA) approach for correlating peptide activity reported elsewhere vs. theoretical peptide affinity. The results show that the carboxylic acid and hydroxyl groups are chemical moieties that have an inverse relationship with biological activity. The numbers of sulfides, pyrroles and imidazoles in the peptide structure are directly related to biological activity. In addition, the HOMO orbital energy values, the total absolute charge and the Ghose-Crippen molar refractivity of peptides are descriptors directly related to the activity and affinity on MHC-I. Docking and MD simulation studies accompanied by an MMGBSA analysis show that the binding free energy without considering the entropic contribution is energetically favorable for all the complexes. Furthermore, good peptide interaction with the most affinity is evaluated experimentally for three proteins. Overall, this study shows that the combination of quantum mechanics descriptors and molecular modeling studies could help describe the immunogenic properties of peptides from HIV-1.
Correlating methane production to microbiota in anaerobic digesters fed synthetic wastewater.
Venkiteshwaran, K; Milferstedt, K; Hamelin, J; Fujimoto, M; Johnson, M; Zitomer, D H
2017-03-01
A quantitative structure activity relationship (QSAR) between relative abundance values and digester methane production rate was developed. For this, 50 triplicate anaerobic digester sets (150 total digesters) were each seeded with different methanogenic biomass samples obtained from full-scale, engineered methanogenic systems. Although all digesters were operated identically for at least 5 solids retention times (SRTs), their quasi steady-state function varied significantly, with average daily methane production rates ranging from 0.09 ± 0.004 to 1 ± 0.05 L-CH4/LR-day (LR = liter of reactor volume) (average ± standard deviation). Digester microbial community structure was analyzed using more than 4.1 million partial 16S rRNA gene sequences of Archaea and Bacteria. At the genus level, 1300 operational taxonomic units (OTUs) were observed across all digesters, whereas each digester contained 158 ± 27 OTUs. Digester function did not correlate with typical biomass descriptors such as volatile suspended solids (VSS) concentration, microbial richness, diversity or evenness indices. However, methane production rate did correlate notably with relative abundances of one Archaeal and nine Bacterial OTUs. These relative abundances were used as descriptors to develop a multiple linear regression (MLR) QSAR equation to predict methane production rates solely based on microbial community data. The model explained over 66% of the variance in the experimental data set based on 149 anaerobic digesters with a standard error of 0.12 L-CH4/LR-day. This study provides a framework to relate engineered process function and microbial community composition which can be further expanded to include different feed stocks and digester operating conditions in order to develop a more robust QSAR model. Copyright © 2016 Elsevier Ltd. All rights reserved.
Text Extraction from Scene Images by Character Appearance and Structure Modeling
Yi, Chucai; Tian, Yingli
2012-01-01
In this paper, we propose a novel algorithm to detect text information from natural scene images. Scene text classification and detection are still open research topics. Our proposed algorithm is able to model both character appearance and structure to generate representative and discriminative text descriptors. The contributions of this paper include three aspects: 1) a new character appearance model by a structure correlation algorithm which extracts discriminative appearance features from detected interest points of character samples; 2) a new text descriptor based on structons and correlatons, which model character structure by structure differences among character samples and structure component co-occurrence; and 3) a new text region localization method by combining color decomposition, character contour refinement, and string line alignment to localize character candidates and refine detected text regions. We perform three groups of experiments to evaluate the effectiveness of our proposed algorithm, including text classification, text detection, and character identification. The evaluation results on benchmark datasets demonstrate that our algorithm achieves the state-of-the-art performance on scene text classification and detection, and significantly outperforms the existing algorithms for character identification. PMID:23316111
Characterizing Atomistic Geometries and Potential Functions Using Strain Functionals
NASA Astrophysics Data System (ADS)
Kober, Edward; Mathew, Nithin; Rudin, Sven
2017-06-01
We demonstrate the use of strain tensor functionals for characterizing arbitrarily ordered atomistic structures. This approach defines a Gaussian-weighted neighborhood around each atom and characterizes that local geometry in terms of n-th order strain tensors, which are equivalent to the n-th order moments/derivatives of the neighborhood. Fourth order expansions can distinguish the cubic structures (and deformations thereof), but sixth order expansions are required to fully characterize hexagonal structures. These functions are continuous and smooth and much less sensitive to thermal fluctuations than other descriptors based on discrete neighborhoods. Reducing these metrics to rotationally invariant descriptors allows a large number of defect structures to be readily identified and forms the basis of a classification scheme that allows molecular dynamics simulations to be readily analyzed. Applications to the analysis of shock waves impinging on samples of Cu, Ta and Ti will be presented. The method has also been extended to vector fields, enabling the local stress to be cast in terms of rotationally invariant functions. The stress-strain correlations can then be used as the basis for developing and analyzing potential functions.
Real-time probabilistic covariance tracking with efficient model update.
Wu, Yi; Cheng, Jian; Wang, Jinqiao; Lu, Hanqing; Wang, Jun; Ling, Haibin; Blasch, Erik; Bai, Li
2012-05-01
The recently proposed covariance region descriptor has been proven robust and versatile for a modest computational cost. The covariance matrix enables efficient fusion of different types of features, where the spatial and statistical properties, as well as their correlation, are characterized. The similarity between two covariance descriptors is measured on Riemannian manifolds. Based on the same metric but with a probabilistic framework, we propose a novel tracking approach on Riemannian manifolds with a novel incremental covariance tensor learning (ICTL). To address the appearance variations, ICTL incrementally learns a low-dimensional covariance tensor representation and efficiently adapts online to appearance changes of the target with only O(1) computational complexity, resulting in a real-time performance. The covariance-based representation and the ICTL are then combined with the particle filter framework to allow better handling of background clutter, as well as the temporary occlusions. We test the proposed probabilistic ICTL tracker on numerous benchmark sequences involving different types of challenges including occlusions and variations in illumination, scale, and pose. The proposed approach demonstrates excellent real-time performance, both qualitatively and quantitatively, in comparison with several previously proposed trackers.
Biologically-inspired data decorrelation for hyper-spectral imaging
NASA Astrophysics Data System (ADS)
Picon, Artzai; Ghita, Ovidiu; Rodriguez-Vaamonde, Sergio; Iriondo, Pedro Ma; Whelan, Paul F.
2011-12-01
Hyper-spectral data allows the construction of more robust statistical models to sample the material properties than the standard tri-chromatic color representation. However, because of the large dimensionality and complexity of the hyper-spectral data, the extraction of robust features (image descriptors) is not a trivial issue. Thus, to facilitate efficient feature extraction, decorrelation techniques are commonly applied to reduce the dimensionality of the hyper-spectral data with the aim of generating compact and highly discriminative image descriptors. Current methodologies for data decorrelation such as principal component analysis (PCA), linear discriminant analysis (LDA), wavelet decomposition (WD), or band selection methods require complex and subjective training procedures; in addition, the compressed spectral information is not directly related to the physical (spectral) characteristics associated with the analyzed materials. The major objective of this article is to introduce and evaluate a new data decorrelation methodology using an approach that closely emulates human vision. The proposed data decorrelation scheme has been employed to optimally minimize the amount of redundant information contained in the highly correlated hyper-spectral bands and has been comprehensively evaluated in the context of non-ferrous material classification.
Reducing Current Spread using Current Focusing in Cochlear Implant Users
Landsberger, David M.; Padilla, Monica; Srinivasan, Arthi G.
2012-01-01
Cochlear implant performance in difficult listening situations is limited by channel interactions. It is known that partial tripolar (PTP) stimulation reduces the spread of excitation (SOE). However, the greater the degree of current focusing, the greater the absolute current required to maintain a fixed loudness. As current increases, so does SOE. In experiment 1, the SOE for equally loud stimuli with different degrees of current focusing was measured via a forward-masking procedure. Results suggest that at a fixed loudness, some but not all patients have a reduced SOE with PTP stimulation. Therefore, it seems likely that a PTP speech processing strategy could improve spectral resolution for only those patients with a reduced SOE. In experiment 2, the ability to discriminate different levels of current focusing was measured. In experiment 3, patients subjectively scaled verbal descriptors of stimuli of various levels of current focusing. Both discrimination and scaling of verbal descriptors correlated well with SOE reduction, suggesting that either technique has the potential to be used clinically to quickly predict which patients would receive benefit from a current focusing strategy. PMID:22230370
Stevens, V G; Hibbert, C L; Edbrooke, D L
1998-10-01
This study analyses the relationship between the actual patient-related costs of care calculated for 145 patients admitted sequentially to an adult general intensive care unit and a number of factors obtained from a previously described consensus of opinion study. The factors identified in the study were suggested as potential descriptors for the casemix in an intensive care unit that could be used to predict the costs of care. Significant correlations between the costs of care and severity of illness, workload and length of stay were found but these failed to predict the costs of care with sufficient accuracy to be used in isolation to define isoresource groups in the intensive care unit. No associations between intensive care unit mortality, reason for admission or intensive care unit treatments and costs of care were found. Based on these results, it seems that casemix descriptors and isoresource groups for the intensive care unit that would allow costs to be predicted cannot be defined in terms of single factors.
Chemometric studies on potential larvicidal compounds against Aedes aegypti.
Scotti, Luciana; Scotti, Marcus Tullius; Silva, Viviane Barros; Santos, Sandra Regina Lima; Cavalcanti, Sócrates C H; Mendonça, Francisco J B
2014-03-01
The mosquito Aedes aegypti (Diptera, Culicidae) is the vector of yellow fever and dengue fever. In this study, chemometric tools such as Principal Component Analysis (PCA), Consensus PCA (CPCA), and Partial Least Squares Regression (PLS) were applied to a set of fifty-five compounds active against Ae. aegypti larvae, which includes terpenes, cyclic alcohols, phenolic compounds, and their synthetic derivatives. The calculations were performed using the VolSurf+ program. CPCA analysis suggests that the higher weight blocks of descriptors were SIZE/SHAPE, DRY, and H2O. The PCA was generated with 48 descriptors selected from the previous blocks. The scores plot showed good separation between more and less potent compounds. The first two PCs accounted for over 60% of the data variance. The best model obtained in PLS, after leave-one-out validation, exhibited q² = 0.679 and r² = 0.714. The external prediction yielded R² = 0.623. The independent variables having a hydrophobic profile were strongly correlated to the biological data. The interaction maps generated with the GRID force field showed that the most active compounds exhibit more interaction with the DRY probe.
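The distinction between the fitted r² and the leave-one-out cross-validated q² reported above can be illustrated with a small calculation. The sketch below substitutes a single-descriptor ordinary least-squares line for the PLS model of the study, purely to keep the example short; the data are invented, the function names are hypothetical, and q² is computed as 1 - PRESS/SS_tot with SS_tot taken around the mean of the full data set (one of several conventions in use).

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sum_of_squares(y):
    mean_y = sum(y) / len(y)
    return sum((yi - mean_y) ** 2 for yi in y)


def r_squared(x, y):
    """Coefficient of determination of the model fitted on all compounds."""
    slope, intercept = fit_line(x, y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_res / total_sum_of_squares(y)


def q_squared_loo(x, y):
    """Leave-one-out q2: each compound is predicted by a model fitted without it."""
    press = 0.0
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    return 1 - press / total_sum_of_squares(y)


# Invented descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]

r2 = r_squared(descriptor, activity)
q2 = q_squared_loo(descriptor, activity)
print(f"r2 = {r2:.3f}, q2 = {q2:.3f}")
```

For a least-squares fit, q² computed this way cannot exceed r², so a large gap between the two is usually read as a sign of overfitting.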
Roivainen, Eka
2015-12-01
Vocabularies of natural languages evolve over time. Useful words become more popular and useless concepts disappear. In this study, the frequency of the use of 295 English, 100 German, and 114 French personality adjectives in book texts and Twitter messages as qualifiers of the words person, woman, homme, femme, and Person was studied. Word frequency data were compared to factor loadings from previous factor analytic studies on personality terms. The correlation between the popularity of an adjective and its highest primary loading in five- and six-factor models was low (-0.12 to 0.17). The Big five (six) marker adjectives were not more popular than "blended" adjectives that had moderate loadings on several factors. This finding implies that laymen consider "blended" adjectives as equally useful descriptors compared to adjectives that represent core features of the five (six) factors. These results are compatible with three hypotheses: 1) laymen are not good at describing personality, 2) the five (six) factors are artifacts of research methods, 3) the interaction of the five (six) factors is not well understood.
QSAR models for anti-malarial activity of 4-aminoquinolines.
Masand, Vijay H; Toropov, Andrey A; Toropova, Alla P; Mahajan, Devidas T
2014-03-01
In the present study, predictive quantitative structure-activity relationship (QSAR) models for anti-malarial activity of 4-aminoquinolines have been developed. CORAL, which is freely available on the Internet (http://www.insilico.eu/coral), has been used as a tool of QSAR analysis to establish a statistically robust QSAR model of anti-malarial activity of 4-aminoquinolines. Six random splits into the visible sub-system of training and the invisible sub-system of validation were examined. Statistical qualities for these splits vary, but in all these cases, the statistical quality of prediction for anti-malarial activity was quite good. The optimal SMILES-based descriptor was used to derive the single-descriptor QSAR model for a data set of 112 aminoquinolines. All the splits had r² > 0.85 and r² > 0.78 for the subtraining and validation sets, respectively. The three-parameter multilinear regression (MLR) QSAR model has Q² = 0.83, R² = 0.84 and F = 190.39. The anti-malarial activity correlates strongly with the presence or absence of nitrogen and oxygen at a topological distance of six.
Joint sparse coding based spatial pyramid matching for classification of color medical image.
Shi, Jun; Li, Yi; Zhu, Jie; Sun, Haojie; Cai, Yin
2015-04-01
Although color medical images are important in clinical practice, they are usually converted to grayscale for further processing in pattern recognition, resulting in loss of rich color information. The sparse coding based linear spatial pyramid matching (ScSPM) and its variants are popular for grayscale image classification, but cannot extract color information. In this paper, we propose a joint sparse coding based SPM (JScSPM) method for the classification of color medical images. A joint dictionary can represent both the color information in each color channel and the correlation between channels. Consequently, the joint sparse codes calculated from a joint dictionary can carry color information, and therefore this method can easily transform a feature descriptor originally designed for grayscale images to a color descriptor. A color hepatocellular carcinoma histological image dataset was used to evaluate the performance of the proposed JScSPM algorithm. Experimental results show that JScSPM provides significant improvements as compared with the majority voting based ScSPM and the original ScSPM for color medical image classification. Copyright © 2014 Elsevier Ltd. All rights reserved.
Interplay of heritage and habitat in the distribution of bacterial signal transduction systems.
Galperin, Michael Y; Higdon, Roger; Kolker, Eugene
2010-04-01
Comparative analysis of the complete genome sequences from a variety of poorly studied organisms aims at predicting ecological and behavioral properties of these organisms and helping in characterizing their habitats. This task requires finding appropriate descriptors that could be correlated with the core traits of each system and would allow meaningful comparisons. Using relatively simple bacterial models, first attempts have been made to introduce suitable metrics to describe the complexity of an organism's signaling machinery, which included introducing the "bacterial IQ" score. Here, we use an updated census of prokaryotic signal transduction systems to improve this parameter and evaluate its consistency within selected bacterial phyla. We also introduce a more elaborate descriptor, a set of profiles of relative abundance of members of each family of signal transduction proteins encoded in each genome. We show that these family profiles are well conserved within each genus and are often consistent within families of bacteria. Thus, they reflect evolutionary relationships between organisms as well as individual adaptations of each organism to its specific ecological niche.
2011-01-01
areas. We quantified morphometric features by geometric and fractal analysis of traced lesion boundaries. Although no single parameter can reliably...These include acoustic descriptors (“echogenicity,” “heterogeneity,” “shadowing”) and morphometric descriptors (“area,” “aspect ratio,” “border...quantitative descriptors; some morphometric features (such as border irregularity) also were particularly effective in lesion classification. Our
Kirchmair, Johannes; Markt, Patrick; Distinto, Simona; Wolber, Gerhard; Langer, Thierry
2008-01-01
Within the last few years a considerable number of evaluative studies have been published that investigate the performance of 3D virtual screening approaches. In particular, assessments of protein-ligand docking are attracting remarkable interest in the scientific community. However, comparing virtual screening approaches is a non-trivial task. Several publications, especially in the field of molecular docking, suffer from shortcomings that are likely to affect the significance of the results considerably. These quality issues often arise from poor study design, bias, the use of improper or inexpressive enrichment descriptors, and errors in the interpretation of the data output. In this review we analyze recent literature evaluating 3D virtual screening methods, with a focus on molecular docking. We highlight problematic issues and provide guidelines on how to improve the quality of computational studies. Since 3D virtual screening protocols are in general assessed by their ability to discriminate between active and inactive compounds, we summarize the impact of the composition and preparation of test sets on the outcome of evaluations. Moreover, we investigate the significance of both classic enrichment parameters and advanced descriptors for the performance of 3D virtual screening methods. Furthermore, we review the significance and suitability of RMSD as a measure for the accuracy of protein-ligand docking algorithms and of conformational space subsampling algorithms.
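As one concrete instance of a classic enrichment parameter of the kind discussed above, the enrichment factor compares the hit rate among the top-ranked fraction of a screened library with the hit rate of the whole library. The sketch below is an illustration, not the review's own protocol: the function name, the scores and the activity labels are hypothetical, and a higher score is assumed to mean a better rank (docking energies would need their sign flipped).

```python
def enrichment_factor(scores, is_active, fraction):
    """Hit rate in the top-ranked fraction divided by the overall hit rate."""
    # Rank compounds from best to worst; higher score = better (assumption).
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    hits_in_top = sum(1 for _, active in ranked[:n_top] if active)
    overall_rate = sum(is_active) / len(ranked)
    return (hits_in_top / n_top) / overall_rate


# Hypothetical screening result: ten compounds, two of them known actives.
scores = [9.1, 8.7, 7.5, 7.0, 6.2, 5.9, 5.1, 4.4, 3.0, 2.2]
is_active = [True, False, True, False, False, False, False, False, False, False]

ef_20 = enrichment_factor(scores, is_active, fraction=0.2)
print(f"EF(20%) = {ef_20:.2f}")
```

The value depends strongly on the ratio of actives to inactives in the test set, which is one reason the composition of test sets matters when such numbers are compared across studies.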
Automatic detection of spiculation of pulmonary nodules in computed tomography images
NASA Astrophysics Data System (ADS)
Ciompi, F.; Jacobs, C.; Scholten, E. T.; van Riel, S. J.; W. Wille, M. M.; Prokop, M.; van Ginneken, B.
2015-03-01
We present a fully automatic method for the assessment of spiculation of pulmonary nodules in low-dose Computed Tomography (CT) images. Spiculation is considered one of the indicators of nodule malignancy and an important feature to assess in order to decide on a patient-tailored follow-up procedure. For this reason, a lung cancer screening scenario would benefit from the presence of a fully automatic system for the assessment of spiculation. The presented framework relies on the fact that spiculated nodules mainly differ from non-spiculated ones in their morphology. In order to discriminate the two categories, information on morphology is captured by sampling intensity profiles along circular patterns on spherical surfaces centered on the nodule, in a multi-scale fashion. Each intensity profile is interpreted as a periodic signal, to which the Fourier transform is applied, obtaining a spectrum. A library of spectra is created by clustering data via unsupervised learning. The centroids of the clusters are used to label back each spectrum in the sampling pattern. A compact descriptor encoding the nodule morphology is obtained as the histogram of labels along all the spherical surfaces and used to classify spiculated nodules via supervised learning. We tested our approach on a set of nodules from the Danish Lung Cancer Screening Trial (DLCST) dataset. Our results show that the proposed method outperforms other 3-D descriptors of morphology in the automatic assessment of spiculation.
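The descriptor pipeline outlined above (periodic intensity profile, Fourier spectrum, label from the nearest cluster centroid, histogram of labels) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: the profiles are invented eight-sample signals, the two centroids are hand-picked rather than learned by clustering, and the function names are hypothetical.

```python
import cmath
import math
from collections import Counter


def magnitude_spectrum(profile):
    """Magnitudes of the discrete Fourier transform of a periodic profile."""
    n = len(profile)
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * t / n)
                for t, v in enumerate(profile)))
        for k in range(n // 2 + 1)
    ]


def nearest_centroid(spectrum, centroids):
    """Index of the centroid closest to the spectrum (Euclidean distance)."""
    distances = [math.dist(spectrum, c) for c in centroids]
    return distances.index(min(distances))


def label_histogram(profiles, centroids):
    """Normalized histogram of centroid labels over all sampled profiles."""
    counts = Counter(
        nearest_centroid(magnitude_spectrum(p), centroids) for p in profiles
    )
    return [counts[i] / len(profiles) for i in range(len(centroids))]


# Invented circular intensity profiles: one flat, two rapidly alternating.
flat = [1, 1, 1, 1, 1, 1, 1, 1]
spiky = [1, 0, 1, 0, 1, 0, 1, 0]
spiky_shifted = [0, 1, 0, 1, 0, 1, 0, 1]  # same pattern, rotated by one sample

# Hand-picked stand-ins for centroids that clustering would normally provide.
centroids = [[8, 0, 0, 0, 0], [4, 0, 0, 0, 4]]

descriptor = label_histogram([flat, spiky, spiky_shifted], centroids)
print(descriptor)
```

Because only the magnitude of the spectrum is kept, rotating a profile around its circle leaves its label unchanged, which is the property that makes a circular sampling pattern independent of its starting point in this sketch.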
LQTA-QSAR: a new 4D-QSAR methodology.
Martins, João Paulo A; Barbosa, Euzébio G; Pasqualoto, Kerly F M; Ferreira, Márcia M C
2009-06-01
A novel 4D-QSAR approach which makes use of the molecular dynamics (MD) trajectories and topology information retrieved from the GROMACS package is presented in this study. This new methodology, named LQTA-QSAR (LQTA, Laboratório de Quimiometria Teórica e Aplicada), has a module (LQTAgrid) that calculates intermolecular interaction energies at each grid point considering probes and all aligned conformations resulting from MD simulations. These interaction energies are the independent variables or descriptors employed in a QSAR analysis. The comparison of the proposed methodology to other 4D-QSAR and CoMFA formalisms was performed using a set of forty-seven glycogen phosphorylase b inhibitors (data set 1) and a set of forty-four MAP p38 kinase inhibitors (data set 2). The QSAR models for both data sets were built using the ordered predictor selection (OPS) algorithm for variable selection. Model validation was carried out applying y-randomization and leave-N-out cross-validation in addition to the external validation. PLS models for data set 1 and 2 provided the following statistics: q(2) = 0.72, r(2) = 0.81 for 12 variables selected and 2 latent variables and q(2) = 0.82, r(2) = 0.90 for 10 variables selected and 5 latent variables, respectively. Visualization of the descriptors in 3D space was successfully interpreted from the chemical point of view, supporting the applicability of this new approach in rational drug design.
Martínez-Araya, Jorge I; Glossman-Mitnik, Daniel
2018-01-18
Ten functionals were assessed for their capability to compute a local reactivity descriptor from Conceptual Density Functional Theory on a group of iron-based organometallic compounds synthesized by Zohuri, G.H. et al. in 2010. These compounds bear the substituent groups H-, O2N- and CH3O- at the para position of the pyridine ring, and their catalytic activities were measured experimentally by those authors. The present theoretical analysis of these compounds leads to the suggestion of a new iron(II)-based 2,6-bis(imino)pyridine catalyst bearing a fluorine atom. By means of the so-called local hyper-softness (LHS), its catalytic activity is suggested to be close to that of the complex bearing a hydrogen atom as substituent. This opens the possibility of estimating the catalytic activity of a catalyst that has not yet been synthesized, without simulating the entire process of ethylene polymerization. Since Conceptual DFT is an interpretative rather than a predictive theory, the dependence of the reactivity descriptor on the level of theory was also analyzed, revealing that care should be taken when DFT calculations are used for these purposes.
Compilation and physicochemical classification analysis of a diverse hERG inhibition database
NASA Astrophysics Data System (ADS)
Didziapetris, Remigijus; Lanevskij, Kiril
2016-12-01
A large and chemically diverse hERG inhibition data set comprised of 6690 compounds was constructed on the basis of ChEMBL bioactivity database and original publications dealing with experimental determination of hERG activities using patch-clamp and competitive displacement assays. The collected data were converted to binary format at 10 µM activity threshold and subjected to gradient boosting machine classification analysis using a minimal set of physicochemical and topological descriptors. The tested parameters involved lipophilicity (log P), ionization (pKa), polar surface area, aromaticity, molecular size and flexibility. The employed approach allowed classifying the compounds with an overall 75-80% accuracy, even though it only accounted for non-specific interactions between hERG and ligand molecules. The observed descriptor-response profiles were consistent with common knowledge about hERG ligand binding site, but also revealed several important quantitative trends, as well as slight inter-assay variability in hERG inhibition data. The results suggest that even weakly basic groups (pKa < 6) might substantially contribute to hERG inhibition potential, whereas the role of lipophilicity depends on the compound's ionization state, and the influence of log P decreases in the order of bases > zwitterions > neutrals > acids. Given its robust performance and clear physicochemical interpretation, the proposed model may provide valuable information to direct drug discovery efforts towards compounds with reduced risk of hERG-related cardiotoxicity.
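The binarisation step mentioned in the abstract above can be illustrated with a small sketch. It assumes that activity is recorded as an IC50 in µM and that values at or below the 10 µM threshold count as "inhibitor"; whether the boundary itself is inclusive is not stated in the abstract, and the function names are hypothetical.

```python
import math

THRESHOLD_UM = 10.0  # activity threshold quoted in the abstract


def pic50_from_um(ic50_um):
    """Convert an IC50 given in micromolar to pIC50 (-log10 of molar IC50)."""
    return 6.0 - math.log10(ic50_um)


def is_herg_inhibitor(ic50_um, threshold_um=THRESHOLD_UM):
    """Binary label: 1 if IC50 <= threshold (inclusive bound is an assumption)."""
    return 1 if ic50_um <= threshold_um else 0


if __name__ == "__main__":
    for ic50 in (0.5, 10.0, 42.0):
        print(ic50, pic50_from_um(ic50), is_herg_inhibitor(ic50))
```

On the pIC50 scale the same 10 µM cut-off corresponds to a value of 5, which is often the more convenient form when pooling data from several assays.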
Whittaker, Rachel L; Park, Woojin; Dickerson, Clark R
2018-04-27
Efficient and holistic identification of fatigue-induced movement strategies can be limited by large between-subject variability in descriptors of joint angle data. One promising alternative to traditional, or computationally intensive methods is the symbolic motion structure representation algorithm (SMSR), which identifies the basic spatial-temporal structure of joint angle data using string descriptors of temporal joint angle trajectories. This study attempted to use the SMSR to identify changes in upper extremity time series joint angle data during a repetitive goal directed task causing muscle fatigue. Twenty-eight participants (15 M, 13 F) performed a seated repetitive task until fatigued. Upper extremity joint angles were extracted from motion capture for representative task cycles. SMSRs, averages and ranges of several joint angles were compared at the start and end of the repetitive task to identify kinematic changes with fatigue. At the group level, significant increases in the range of all joint angle data existed with large between-subject variability that posed a challenge to the interpretation of these fatigue-related changes. However, changes in the SMSRs across participants effectively summarized the adoption of adaptive movement strategies. This establishes SMSR as a viable, logical, and sensitive method of fatigue identification via kinematic changes, with novel application and pragmatism for visual assessment of fatigue development. Copyright © 2018 Elsevier Ltd. All rights reserved.
Lou, Wutao; Xu, Jin; Sheng, Hengsong; Zhao, Songzhen
2011-11-01
Multichannel EEG recorded in a task condition could contain more information about cognition. However, this has not been widely investigated in vascular dementia (VaD)-related studies. The purpose of this study was to explore the differences in brain functional states between VaD patients and normal controls while performing a detection task. Three multichannel linear descriptors, i.e. spatial complexity (Ω), field strength (Σ) and frequency of field changes (Φ), were applied to analyse four frequency bands (delta, theta, alpha and beta) of multichannel event-related EEG signals for 12 VaD patients (mean age ± SD: 69.25 ± 10.56 years; MMSE score ± SD: 22.58 ± 4.42) and 12 age-matched healthy subjects (mean age ± SD: 67.17 ± 5.97 years; MMSE score ± SD: 29.08 ± 0.9). The correlations between the three measures and MMSE scores were also analysed. VaD patients showed a significantly higher Ω value in the delta (p = 0.013) and theta (p = 0.021) frequency bands, a lower Σ value (p = 0.011) and a higher Φ value (p = 0.008) in the delta frequency band compared with normal controls. The MMSE scores were negatively correlated with the Ω (r = -0.52, p = 0.01) and Φ (r = -0.47, p = 0.02) values in the delta frequency band. The results indicated that the VaD patients presented a reduction of synchronization in the slow frequency band during target detection, and suggested that more neurons might be activated in VaD patients compared with normal controls. The Ω and Φ measures in the delta frequency band might be used to evaluate the degree of cognitive dysfunction. The multichannel linear descriptors are promising measures to reveal the differences in brain functions between VaD patients and normal subjects, and could potentially be used to evaluate the degree of cognitive dysfunction in VaD patients. Copyright © 2011 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Global quantitative indices reflecting provider process-of-care: data-base derivation.
Moran, John L; Solomon, Patricia J
2010-04-19
Controversy has attended the relationship between risk-adjusted mortality and process-of-care. There would be advantage in the establishment, at the data-base level, of global quantitative indices subsuming the diversity of process-of-care. A retrospective, cohort study of patients identified in the Australian and New Zealand Intensive Care Society Adult Patient Database, 1993-2003, at the level of geographic and ICU-level descriptors (n = 35), for both hospital survivors and non-survivors. Process-of-care indices were established by analysis of: (i) the smoothed time-hazard curve of individual patient discharge and determined by pharmaco-kinetic methods as area under the hazard-curve (AUC), reflecting the integrated experience of the discharge process, and time-to-peak-hazard (TMAX, in days), reflecting the time to maximum rate of hospital discharge; and (ii) individual patient ability to optimize output (as length-of-stay) for recorded data-base physiological inputs; estimated as a technical production-efficiency (TE, scaled [0,(maximum)1]), via the econometric technique of stochastic frontier analysis. For each descriptor, multivariate correlation-relationships between indices and summed mortality probability were determined. The data-set consisted of 223129 patients from 99 ICUs with mean (SD) age and APACHE III score of 59.2(18.9) years and 52.7(30.6) respectively; 41.7% were female and 45.7% were mechanically ventilated within the first 24 hours post-admission. For survivors, AUC was maximal in rural and for-profit ICUs, whereas TMAX (≥ 7.8 days) and TE (≥ 0.74) were maximal in tertiary-ICUs. For non-survivors, AUC was maximal in tertiary-ICUs, but TMAX (≥ 4.2 days) and TE (≥ 0.69) were maximal in for-profit ICUs. Across descriptors, significant differences in indices were demonstrated (analysis-of-variance, P
DOE Office of Scientific and Technical Information (OSTI.GOV)
Singh, Kunwar P., E-mail: kpsingh_52@yahoo.com; Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow 226 001; Gupta, Shikha
Robust global models capable of discriminating positive and non-positive carcinogens; and predicting carcinogenic potency of chemicals in rodents were developed. The dataset of 834 structurally diverse chemicals extracted from Carcinogenic Potency Database (CPDB) was used which contained 466 positive and 368 non-positive carcinogens. Twelve non-quantum mechanical molecular descriptors were derived. Structural diversity of the chemicals and nonlinearity in the data were evaluated using Tanimoto similarity index and Brock–Dechert–Scheinkman statistics. Probabilistic neural network (PNN) and generalized regression neural network (GRNN) models were constructed for classification and function optimization problems using the carcinogenicity end point in rat. Validation of the models was performed using the internal and external procedures employing a wide series of statistical checks. PNN constructed using five descriptors rendered classification accuracy of 92.09% in complete rat data. The PNN model rendered classification accuracies of 91.77%, 80.70% and 92.08% in mouse, hamster and pesticide data, respectively. The GRNN constructed with nine descriptors yielded correlation coefficient of 0.896 between the measured and predicted carcinogenic potency with mean squared error (MSE) of 0.44 in complete rat data. The rat carcinogenicity model (GRNN) applied to the mouse and hamster data yielded correlation coefficient and MSE of 0.758, 0.71 and 0.760, 0.46, respectively. The results suggest wide applicability of the inter-species models in predicting carcinogenic potency of chemicals. Both the PNN and GRNN (inter-species) models constructed here can be useful tools in predicting the carcinogenicity of new chemicals for regulatory purposes. - Graphical abstract: Figure (a) shows classification accuracies (positive and non-positive carcinogens) in rat, mouse, hamster, and pesticide data yielded by optimal PNN model.
Figure (b) shows generalization and predictive abilities of the interspecies GRNN model to predict the carcinogenic potency of diverse chemicals. - Highlights: • Global robust models constructed for carcinogenicity prediction of diverse chemicals. • Tanimoto/BDS test revealed structural diversity of chemicals and nonlinearity in data. • PNN/GRNN successfully predicted carcinogenicity/carcinogenic potency of chemicals. • Developed interspecies PNN/GRNN models for carcinogenicity prediction. • Proposed models can be used as tool to predict carcinogenicity of new chemicals.
Language and the pain experience.
Wilson, Dianne; Williams, Marie; Butler, David
2009-03-01
People in persistent pain have been reported to pay increased attention to specific words or descriptors of pain. The amount of attention paid to pain or cues for pain (such as pain descriptors), has been shown to be a major factor in the modulation of persistent pain. This relationship suggests the possibility that language may have a role both in understanding and managing the persistent pain experience. The aim of this paper is to describe current models of neuromatrices for pain and language, consider the role of attention in persistent pain states and highlight discrepancies, in previous studies based on the McGill Pain Questionnaire (MPQ), of the role of attention on pain descriptors. The existence of a pain neuromatrix originally proposed by Melzack (1990) has been supported by emerging technologies. Similar technologies have recently allowed identification of multiple areas of involvement for the processing of auditory input and the construction of language. As with the construction of pain, this neuromatrix for speech and language may intersect with neural systems for broader cognitive functions such as attention, memory and emotion. A systematic search was undertaken to identify experimental or review studies, which specifically investigated the role of attention on pain descriptors (as cues for pain) in persistent pain patients. A total of 99 articles were retrieved from six databases, with 66 articles meeting the inclusion criteria. After duplicated articles were eliminated, the remaining 41 articles were reviewed in order to support a link between persistent pain, pain descriptors and attention. This review revealed a diverse range of specific pain descriptors, the majority of which were derived from the MPQ. Increased attention to pain descriptors was consistently reported to be associated with emotional state as well as being a significant factor in maintaining persistent pain. 
However, attempts to investigate the attentional bias of specific pain descriptors highlighted discrepancies between the studies. As well as the diversity of pain descriptors used in studies, they were inconsistently categorized into domains of pain. A lack of consistent bias towards certain pain descriptors was observed, and may be explained simply by the fact that the words provided are not those which subjects themselves would use. These findings suggest that the multidimensional and individual nature of the persistent pain experience may not be adequately explained by pain questionnaires such as the MPQ. Personalized pain descriptors may communicate the pain experience more appropriately, but may also contribute to an increased sensitivity of cortical pain processing areas by capturing increased attention for that individual. The language used as part of communication between therapists and people with persistent pain may provide an, as yet, unexplored adjunct strategy in management. Copyright (c) 2008 John Wiley & Sons, Ltd.
Staging Lung Cancer: Metastasis.
Shroff, Girish S; Viswanathan, Chitra; Carter, Brett W; Benveniste, Marcelo F; Truong, Mylene T; Sabloff, Bradley S
2018-05-01
The updated eighth edition of the tumor, node, metastasis (TNM) classification for lung cancer includes revisions to T and M descriptors. In terms of the M descriptor, the classification of intrathoracic metastatic disease as M1a is unchanged from TNM-7. Extrathoracic metastatic disease, which was classified as M1b in TNM-7, is now subdivided into M1b (single metastasis, single organ) and M1c (multiple metastases in one or multiple organs) descriptors. In this article, the rationale for changes in the M descriptors, the utility of preoperative staging with PET/computed tomography, and the treatment options available for patients with oligometastatic disease are discussed. Copyright © 2018 Elsevier Inc. All rights reserved.
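The M-descriptor rules summarised in the abstract above can be written as a small decision function. This is a sketch for illustration, not a clinical tool: the input representation (a flag for intrathoracic metastatic disease plus a list naming the organ of each extrathoracic metastasis) and the function name are hypothetical, "M0" for no metastasis is standard TNM usage rather than something stated in the abstract, and the sketch assumes that extrathoracic disease determines the category when both kinds are present.

```python
def m_descriptor(intrathoracic_metastasis, extrathoracic_lesion_organs):
    """Return the TNM-8 M category for lung cancer.

    intrathoracic_metastasis: True if intrathoracic metastatic disease is present.
    extrathoracic_lesion_organs: one organ name per extrathoracic metastasis,
        e.g. ["liver", "liver", "bone"].
    """
    lesions = list(extrathoracic_lesion_organs)
    if not lesions:
        # No extrathoracic disease: M1a if intrathoracic metastasis, else M0.
        return "M1a" if intrathoracic_metastasis else "M0"
    if len(lesions) == 1:
        # Single metastasis in a single organ.
        return "M1b"
    # Multiple metastases, whether in one organ or in several.
    return "M1c"


if __name__ == "__main__":
    print(m_descriptor(True, []))
    print(m_descriptor(False, ["adrenal"]))
    print(m_descriptor(False, ["brain", "bone"]))
```

Passing one list entry per lesion, rather than a lesion count and an organ count, keeps the "multiple metastases in one organ" case from being confused with a single metastasis.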
Item Selection, Evaluation, and Simple Structure in Personality Data
Pettersson, Erik; Turkheimer, Eric
2010-01-01
We report an investigation of the genesis and interpretation of simple structure in personality data using two very different self-reported data sets. The first consists of a set of relatively unselected lexical descriptors, whereas the second is based on responses to a carefully constructed instrument. In both data sets, we explore the degree of simple structure by comparing factor solutions to solutions from simulated data constructed to have either strong or weak simple structure. The analysis demonstrates that there is little evidence of simple structure in the unselected items, and a moderate degree among the selected items. In both instruments, however, much of the simple structure that could be observed originated in a strong dimension of positive vs. negative evaluation. PMID:20694168
García-Jacas, César R; Contreras-Torres, Ernesto; Marrero-Ponce, Yovani; Pupo-Meriño, Mario; Barigye, Stephen J; Cabrera-Leyva, Lisset
2016-01-01
Recently, novel 3D alignment-free molecular descriptors (also known as QuBiLS-MIDAS) based on two-linear, three-linear and four-linear algebraic forms have been introduced. These descriptors codify chemical information for relations between two, three and four atoms by using several (dis-)similarity metrics and multi-metrics. Several studies aimed at assessing the quality of these novel descriptors have been performed. However, a deeper analysis of their performance is necessary. Therefore, in the present manuscript an assessment and statistical validation of the performance of these novel descriptors in QSAR studies is performed. To this end, eight molecular datasets (angiotensin converting enzyme, acetylcholinesterase inhibitors, benzodiazepine receptor, cyclooxygenase-2 inhibitors, dihydrofolate reductase inhibitors, glycogen phosphorylase b, thermolysin inhibitors, thrombin inhibitors) widely used as benchmarks in the evaluation of several procedures are utilized. Three to nine variable QSAR models based on Multiple Linear Regression are built for each chemical dataset according to the original division into training/test sets. Comparisons with respect to leave-one-out cross-validation correlation coefficients [Formula: see text] reveal that the models based on QuBiLS-MIDAS indices possess superior predictive ability in 7 of the 8 datasets analyzed, outperforming methodologies based on similar or more complex techniques such as: Partial Least Square, Neural Networks, Support Vector Machine and others. On the other hand, superior external correlation coefficients [Formula: see text] are attained in 6 of the 8 test sets considered, confirming the good predictive power of the obtained models. For the [Formula: see text] values non-parametric statistic tests were performed, which demonstrated that the models based on QuBiLS-MIDAS indices have the best global performance and yield significantly better predictions in 11 of the 12 QSAR procedures used in the comparison.
Lastly, a study concerning the performance of the indices according to several conformer generation methods was performed. This demonstrated that the quality of predictions of the QSAR models based on QuBiLS-MIDAS indices depends on the 3D structure generation method considered, although in this preliminary study the results achieved do not present significant statistical differences among them. In conclusion, it can be stated that the QuBiLS-MIDAS indices are suitable for extracting structural information of the molecules and thus constitute a promising alternative to build models that contribute to the prediction of pharmacokinetic, pharmacodynamic and toxicological properties of novel compounds. Graphical abstract: Comparative graphical representation of the performance of the novel QuBiLS-MIDAS 3D-MDs with respect to other methodologies in QSAR modeling of eight chemical datasets.
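The leave-one-out cross-validation coefficient used in the abstract above to compare models can be sketched as follows. The exact formula is elided in this record, so the definition below (one minus PRESS over the total sum of squares about the overall mean) is a common convention and an assumption here. To keep the sketch dependency-free, a single-descriptor least-squares line stands in for the three-to-nine-variable Multiple Linear Regression models of the study, and the function names and toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one descriptor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_loo(xs, ys):
    """Leave-one-out Q^2 = 1 - PRESS / SS_tot (assumed convention)."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


if __name__ == "__main__":
    descriptor = [0.0, 1.0, 2.0, 3.0, 4.0]   # toy descriptor values
    activity = [1.1, 2.9, 5.2, 6.8, 9.1]     # toy activities
    print(q2_loo(descriptor, activity))
```

Because every prediction comes from a model that never saw the compound being predicted, this quantity is generally lower than the fitted correlation coefficient and can even be negative for a poor model.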
Using Theoretical Descriptions in Structure Activity Relations. 3. Electronic Descriptors
1988-08-01
Activity Relationships (QSAR) have been used successfully in the past to develop predictive equations for several biological and physical properties...Linear Free Energy Relationships (LFER) and is based on work by Hammett in which he derived electronic descriptors for the dissociation of substituted...structure of a compound and its activity in a system. Several different structural descriptors have been used in QSAR equations. These range from
Building Scientific Confidence in the Development and ...
Read-across remains a popular data gap filling technique within category and analogue approaches for regulatory purposes. Acceptance of read-across is an ongoing challenge with several efforts underway for identifying and addressing uncertainties. Here we demonstrate an algorithmic approach to facilitate read-across using ToxCast in vitro bioactivity data in conjunction with chemical descriptor information to predict in vivo outcomes in guideline testing studies from ToxRefDB. Over 3400 different chemical structure descriptors were generated for a set of 976 chemicals and supplemented with the outcomes from 821 in vitro assays. The read-across prediction for a given chemical was based on the similarity weighted endpoint outcomes of its nearest neighbors calculated using in vitro bioactivity and chemical structure descriptors, called GenRA. GenRA is based on a computational approach for: (i) defining local validity domains using chemical and bioactivity descriptors, (ii) systematically deriving endpoint read-across predictions within these domains using similarity weighted activity of nearest neighbours, (iii) objectively evaluating predicted performance using tested chemicals, and (iv) assigning read-across predictions to untested chemicals along with estimates of uncertainty. We found that in vitro bioactivity descriptors were often more predictive of in vivo toxicity outcomes than chemical structure descriptors. We believe GenRA is an important first st
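The similarity-weighted nearest-neighbour idea described in the abstract above can be illustrated with a short sketch. It is not the GenRA implementation: the Jaccard index on sets of "on" descriptor bits is assumed as the similarity measure, the neighbourhood size k is arbitrary, and the function names and toy fingerprints are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of active descriptor bits."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def read_across(target, tested, k=2):
    """Similarity-weighted mean outcome of the k most similar tested chemicals.

    target: set of descriptor bits for the untested chemical.
    tested: list of (descriptor_bits, outcome) pairs, outcome in {0, 1}.
    Returns a score in [0, 1], or None when no neighbour shares any bit.
    """
    scored = sorted(
        ((jaccard(target, bits), outcome) for bits, outcome in tested),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0:
        return None
    return sum(sim * outcome for sim, outcome in scored) / total


if __name__ == "__main__":
    tested = [({1, 2, 3}, 1), ({1, 2}, 0), ({9}, 1)]
    print(read_across({1, 2, 3}, tested, k=2))
```

The same function works whether the bits encode chemical structure, in vitro assay hits, or both, which is the comparison the abstract reports.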
NASA Astrophysics Data System (ADS)
Rotini, Alice; Belmonte, Alessandro; Barrote, Isabel; Micheli, Carla; Peirano, Andrea; Santos, Rui O.; Silva, João; Migliore, Luciana
2013-09-01
The increasing rate of human-induced environmental changes on coastal marine ecosystems has created a demand for effective descriptors, in particular for those suitable for monitoring the status of seagrass meadows. Growing evidence has supported the useful application of biochemical and genetic descriptors such as secondary metabolite synthesis, photosynthetic activity and genetic diversity. In the present study, we have investigated the effectiveness of different descriptors (traditional, biochemical and genetic) in monitoring seagrass meadow conservation status. The Posidonia oceanica meadow of Monterosso al Mare (Ligurian sea, NW Mediterranean) was subjected to the measurement of bed density, leaf biometry, total phenols, soluble protein and photosynthetic pigment content as well as to RAPD marker analysis. This suite of descriptors provided evidence of their effectiveness and convenient application as markers of the conservation status of P. oceanica and/or other seagrasses. Biochemical/genetic descriptors and those obtained by traditional methods depicted a well conserved meadow with seasonal variability and, particularly in summer, indicated a healthier condition in a portion of the bed (station C), which was in agreement with the physical and sedimentological features of the station. Our results support the usefulness of introducing biochemical and genetic approaches to seagrass monitoring programs since they are effective indicators of plant physiological stress and environmental disturbance.
Developing descriptors to predict mechanical properties of nanotubes.
Borders, Tammie L; Fonseca, Alexandre F; Zhang, Hengji; Cho, Kyeongjae; Rusinko, Andrew
2013-04-22
Descriptors and quantitative structure property relationships (QSPR) were investigated for mechanical property prediction of carbon nanotubes (CNTs). 78 molecular dynamics (MD) simulations were carried out, and 20 descriptors were calculated to build quantitative structure property relationships (QSPRs) for Young's modulus and Poisson's ratio in two separate analyses: vacancy only and vacancy plus methyl functionalization. In the first analysis, C(N2)/C(T) (number of non-sp2 hybridized carbons per the total carbons) and chiral angle were identified as critical descriptors for both Young's modulus and Poisson's ratio. Further analysis and literature findings indicate the effect of chiral angle is negligible at larger CNT radii for both properties. Raman spectroscopy can be used to measure C(N2)/C(T), providing a direct link between experimental and computational results. Poisson's ratio approaches two different limiting values as CNT radii increases: 0.23-0.25 for chiral and armchair CNTs and 0.10 for zigzag CNTs (surface defects <3%). In the second analysis, the critical descriptors were C(N2)/C(T), chiral angle, and M(N)/C(T) (number of methyl groups per total carbons). These results imply new types of defects can be represented as a new descriptor in QSPR models. Finally, results are qualified and quantified against experimental data.
Pain Quality Descriptors in Community-Dwelling Older Adults with Nonmalignant Pain
Thakral, Manu; Shi, Ling; Foust, Janice B.; Patel, Kushang V.; Shmerling, Robert H.; Bean, Jonathan F.; Leveille, Suzanne G.
2016-01-01
This study aimed to characterize the prevalence of various pain qualities in older adults with chronic non-malignant pain and determine the association of pain quality to other pain characteristics, namely: severity, interference, distribution, and pain-associated conditions. In the population-based MOBILIZE Boston Study, 560 participants aged ≥70 years reported chronic pain in the baseline assessment, which included a home interview and clinic exam. Pain quality was assessed using a modified version of the McGill Pain Questionnaire (MPQ) consisting of 20 descriptors, from which 3 categories were derived: cognitive/affective, sensory and neuropathic. Presence of ≥2 pain-associated conditions was significantly associated with 18 of the 20 pain quality descriptors. Sensory descriptors were endorsed by nearly all older adults with chronic pain (93%), followed by cognitive/affective (83.4%) and neuropathic descriptors (68.6%). Neuropathic descriptors were associated with the greatest number of pain-associated conditions including osteoarthritis of the hand and knee. More than half of participants (59%) endorsed descriptors in all 3 categories and had more severe pain and interference, and multi-site or widespread pain than those endorsing 1 or 2 categories. Strong associations were observed between pain quality and measures of pain severity, interference, and distribution (p<.0001). Findings from this study indicate that older adults have multiple pain-associated conditions which likely reflect multiple physiological mechanisms for pain. Linking pain qualities with other associated pain characteristics serves to develop a multidimensional approach to geriatric pain assessment. Future research is needed to investigate the physiological mechanisms responsible for the variability in pain qualities endorsed by older adults. PMID:27842050
Haranosono, Yu; Kurata, Masaaki; Sakaki, Hideyuki
2014-08-01
One of the mechanisms of phototoxicity is photo-reaction, such as reactive oxygen species (ROS) generation following photo-absorption. We focused on ROS generation and photo-absorption as key steps, because they can be described by photochemical properties, and these properties depend on chemical structure. The photo-reactivity of a compound is generally described by the HOMO-LUMO gap (HLG). Herein, we showed that HLG can be used as a descriptor of ROS generation. Moreover, the maximum-conjugated π electron number (PENMC), which we identified as a descriptor of photo-absorption, could also predict in vitro phototoxicity. Each descriptor could predict in vitro phototoxicity with 70.0% concordance, but each left an unpredicted area (gray zone). Interestingly, some compounds in the gray zone of one descriptor were not in the gray zone of the other, indicating that combining the two descriptors could improve predictive potential. We reset the cut-off lines to define a positive zone, a negative zone and a gray zone for each descriptor. We then plotted HLG against PENMC and divided the total area into nine zones using the cut-off lines of each descriptor. The prediction rules were chosen to achieve the best concordance, and the concordance improved to 82.8% for self-validation and 81.6% for cross-validation. We found common properties among the false positive and false negative compounds: photo-reactive structures and photo-allergenicity, respectively. In addition, our method could be applied to structurally diverse compounds using only chemical structure, without any statistical analysis or complicated calculation.
Pain quality descriptors in community-dwelling older adults with nonmalignant pain.
Thakral, Manu; Shi, Ling; Foust, Janice B; Patel, Kushang V; Shmerling, Robert H; Bean, Jonathan F; Leveille, Suzanne G
2016-12-01
This study aimed to characterize the prevalence of various pain qualities in older adults with chronic nonmalignant pain and determine the association of pain quality to other pain characteristics, namely: severity, interference, distribution, and pain-associated conditions. In the population-based MOBILIZE Boston Study, 560 participants aged ≥70 years reported chronic pain in the baseline assessment, which included a home interview and clinic exam. Pain quality was assessed using a modified version of the McGill Pain Questionnaire (MPQ) consisting of 20 descriptors from which 3 categories were derived: cognitive/affective, sensory, and neuropathic. Presence of ≥2 pain-associated conditions was significantly associated with 18 of the 20 pain quality descriptors. Sensory descriptors were endorsed by nearly all older adults with chronic pain (93%), followed by cognitive/affective (83.4%) and neuropathic descriptors (68.6%). Neuropathic descriptors were associated with the greatest number of pain-associated conditions including osteoarthritis of the hand and knee. More than half of participants (59%) endorsed descriptors in all 3 categories and had more severe pain and interference, and multisite or widespread pain than those endorsing 1 or 2 categories. Strong associations were observed between pain quality and measures of pain severity, interference, and distribution (P < 0.0001). Findings from this study indicate that older adults have multiple pain-associated conditions that likely reflect multiple physiological mechanisms for pain. Linking pain qualities with other associated pain characteristics serves to develop a multidimensional approach to geriatric pain assessment. Future research is needed to investigate the physiological mechanisms responsible for the variability in pain qualities endorsed by older adults.
Beretta-Piccoli, Matteo; D’Antona, Giuseppe; Barbero, Marco; Fisher, Beth; Dieli-Conwright, Christina M.; Clijsen, Ron; Cescon, Corrado
2015-01-01
Purpose Over the past decade, linear and non-linear surface electromyography descriptors for central and peripheral components of fatigue have been developed. In the current study, we tested fractal dimension (FD) and conduction velocity (CV) as myoelectric descriptors of central and peripheral fatigue, respectively. To this aim, we analyzed FD and CV slopes during sustained fatiguing contractions of the quadriceps femoris in healthy humans. Methods A total of 29 recreationally active women (mean age±standard deviation: 24±4 years) and two female elite athletes (one power athlete, age 24 and one endurance athlete, age 30 years) performed two knee extensions: (1) at 20% maximal voluntary contraction (MVC) for 30 s, and (2) at 60% MVC held until exhaustion. Surface EMG signals were detected from the vastus lateralis and vastus medialis using bidimensional arrays. Results Central and peripheral fatigue were described as decreases in FD and CV, respectively. A positive correlation between FD and CV (R=0.51, p<0.01) was found during the sustained 60% MVC, probably as a result of simultaneous motor unit synchronization and a decrease in muscle fiber CV during the fatiguing task. Conclusions Central and peripheral fatigue can be described as changes in FD and CV, at least in young, healthy women. The significant correlation between FD and CV observed at 60% MVC suggests that a mutual interaction between central and peripheral fatigue can arise during submaximal isometric contractions. PMID:25880369
Towards interoperable and reproducible QSAR analyses: Exchange of datasets.
Spjuth, Ola; Willighagen, Egon L; Guha, Rajarshi; Eklund, Martin; Wikberg, Jarl Es
2010-06-30
QSAR is a widely used method to relate chemical structures to responses or properties based on experimental observations. Much effort has been made to evaluate and validate the statistical modeling in QSAR, but these analyses treat the dataset as fixed. An overlooked but highly important issue is the validation of the setup of the dataset, which comprises addition of chemical structures as well as selection of descriptors and software implementations prior to calculations. This process is hampered by the lack of standards and exchange formats in the field, making it virtually impossible to reproduce and validate analyses and drastically constrain collaborations and re-use of data. We present a step towards standardizing QSAR analyses by defining interoperable and reproducible QSAR datasets, consisting of an open XML format (QSAR-ML) which builds on an open and extensible descriptor ontology. The ontology provides an extensible way of uniquely defining descriptors for use in QSAR experiments, and the exchange format supports multiple versioned implementations of these descriptors. Hence, a dataset described by QSAR-ML makes its setup completely reproducible. We also provide a reference implementation as a set of plugins for Bioclipse which simplifies setup of QSAR datasets, and allows for exporting in QSAR-ML as well as old-fashioned CSV formats. The implementation facilitates addition of new descriptor implementations from locally installed software and remote Web services; the latter is demonstrated with REST and XMPP Web services. Standardized QSAR datasets open up new ways to store, query, and exchange data for subsequent analyses. QSAR-ML supports completely reproducible creation of datasets, solving the problems of defining which software components were used and their versions, and the descriptor ontology eliminates confusions regarding descriptors by defining them crisply. 
This makes it easy to join, extend, and combine datasets and hence work collectively, but also allows for analyzing the effect descriptors have on the statistical model's performance. The presented Bioclipse plugins equip scientists with graphical tools that make QSAR-ML easily accessible for the community.
Towards interoperable and reproducible QSAR analyses: Exchange of datasets
2010-01-01
Background QSAR is a widely used method to relate chemical structures to responses or properties based on experimental observations. Much effort has been made to evaluate and validate the statistical modeling in QSAR, but these analyses treat the dataset as fixed. An overlooked but highly important issue is the validation of the setup of the dataset, which comprises addition of chemical structures as well as selection of descriptors and software implementations prior to calculations. This process is hampered by the lack of standards and exchange formats in the field, making it virtually impossible to reproduce and validate analyses and drastically constrain collaborations and re-use of data. Results We present a step towards standardizing QSAR analyses by defining interoperable and reproducible QSAR datasets, consisting of an open XML format (QSAR-ML) which builds on an open and extensible descriptor ontology. The ontology provides an extensible way of uniquely defining descriptors for use in QSAR experiments, and the exchange format supports multiple versioned implementations of these descriptors. Hence, a dataset described by QSAR-ML makes its setup completely reproducible. We also provide a reference implementation as a set of plugins for Bioclipse which simplifies setup of QSAR datasets, and allows for exporting in QSAR-ML as well as old-fashioned CSV formats. The implementation facilitates addition of new descriptor implementations from locally installed software and remote Web services; the latter is demonstrated with REST and XMPP Web services. Conclusions Standardized QSAR datasets open up new ways to store, query, and exchange data for subsequent analyses. QSAR-ML supports completely reproducible creation of datasets, solving the problems of defining which software components were used and their versions, and the descriptor ontology eliminates confusions regarding descriptors by defining them crisply. 
This makes it easy to join, extend, and combine datasets and hence work collectively, but also allows for analyzing the effect descriptors have on the statistical model's performance. The presented Bioclipse plugins equip scientists with graphical tools that make QSAR-ML easily accessible for the community. PMID:20591161
Microscopic structural descriptor of liquid water
NASA Astrophysics Data System (ADS)
Shi, Rui; Tanaka, Hajime
2018-03-01
The microscopic structure of liquid water has been believed to be the key to the understanding of the unique properties of this extremely important substance. Many structural descriptors have been developed for revealing local structural order in water, but their properties are still not well understood. The essential difficulty comes from structural fluctuations due to thermal noise, which are intrinsic to the liquid state. The most popular and widely used descriptors are the local structure index (LSI) and d5. Recently, Russo and Tanaka [Nat. Commun. 3, 3556 (2014)] introduced a new descriptor ζ which measures the translational order between the first and second shells considering hydrogen bonding (H-bonding) in the first shell. In this work, we compare the performance of these three structural descriptors for a popular water model known as TIP5P water. We show that local structural ordering can be properly captured only by the structural descriptor ζ, but not by the other two descriptors particularly at a high temperature, where thermal noise effects are severe. The key difference of ζ from LSI and d5 is that only ζ considers H-bonding which is crucial to detect high translational and tetrahedral order of not only oxygen but also hydrogen atoms. The importance of H-bonding is very natural, considering the fact that the locally favored structures are stabilized by energy gain due to the formation of four hydrogen bonds between the central water molecule and its neighboring ones in the first shell. Our analysis of the water structure by using ζ strongly supports the two-state model of water: water is a dynamic mixture of locally favored (ordered) and normal-liquid (disordered) structures. This work demonstrates the importance of H-bonding in the characterization of water's structures and provides a useful structural descriptor for water-type tetrahedral liquids to study their structure and dynamics.
Primary pre-service teachers' skills in planning a guided scientific inquiry
NASA Astrophysics Data System (ADS)
García-Carmona, Antonio; Criado, Ana M.; Cruz-Guzmán, Marta
2017-10-01
A study is presented of the skills that primary pre-service teachers (PPTs) have in completing the planning of a scientific inquiry on the basis of a guiding script. The sample comprised 66 PPTs who constituted a group-class of the subject Science Teaching, taught in the second year of an undergraduate degree in primary education at a Spanish university. The data was acquired from the responses of the PPTs (working in teams) to open-ended questions posed to them in the script concerning the various tasks involved in a scientific inquiry (formulation of hypotheses, design of the experiment, data collection, interpretation of results, drawing conclusions). Data were analyzed within the framework of a descriptive-interpretive qualitative research study with a combination of inter- and intra-rater methods, and the use of low-inference descriptors. The results showed that the PPTs have major shortcomings in planning the complete development of a guided scientific inquiry. The discussion of the results includes a number of implications for rethinking the Science Teaching course so that PPTs can attain a basic level of training in inquiry-based science education.
New formulae for Zagreb indices
NASA Astrophysics Data System (ADS)
Cangul, Ismail Naci; Yurttas, Aysun; Togan, Muge; Cevik, Ahmet Sinan
2017-07-01
In this paper, we study some graph descriptors, also called topological indices. These descriptors are useful in determining some properties of chemical structures and are preferred to some earlier descriptors because they are more practical. In particular, the first and second Zagreb indices, together with the first and second multiplicative Zagreb indices, are considered and calculated in terms of the smallest and largest vertex degrees and the number of vertices for some well-known classes of graphs.
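As a rough illustration of the indices named in this abstract, the sketch below computes them from an edge list. The abstract does not state the formulas, so the code assumes the standard definitions: M1 is the sum of squared vertex degrees, M2 is the sum of degree products over edges, and the multiplicative indices replace the sums with products. The function names and graph builders are hypothetical helpers, not taken from the paper.

```python
from math import prod


def vertex_degrees(n, edges):
    """Degree of each vertex 0..n-1 in a simple undirected graph."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def zagreb_indices(n, edges):
    """Additive and multiplicative Zagreb indices (standard definitions assumed)."""
    deg = vertex_degrees(n, edges)
    return {
        "M1": sum(d * d for d in deg),                   # sum of squared degrees
        "M2": sum(deg[u] * deg[v] for u, v in edges),    # sum over edges of degree products
        "Pi1": prod(d * d for d in deg),                 # product of squared degrees
        "Pi2": prod(deg[u] * deg[v] for u, v in edges),  # product over edges of degree products
    }


# Edge lists for a few well-known graph classes on n vertices.
def path_graph(n):
    return [(i, i + 1) for i in range(n - 1)]


def cycle_graph(n):
    return [(i, (i + 1) % n) for i in range(n)]


def star_graph(n):
    return [(0, i) for i in range(1, n)]


def complete_graph(n):
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


if __name__ == "__main__":
    for name, build in [("path", path_graph), ("cycle", cycle_graph),
                        ("star", star_graph), ("complete", complete_graph)]:
        print(name, zagreb_indices(5, build(5)))
```

For these regular or near-regular classes the indices depend only on the vertex count and the smallest and largest degrees, which is the kind of closed form the abstract refers to; for example, a cycle on n vertices has M1 = M2 = 4n under these definitions.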
Analysis of A Drug Target-based Classification System using Molecular Descriptors.
Lu, Jing; Zhang, Pin; Bi, Yi; Luo, Xiaomin
2016-01-01
Drug-target interaction is an important topic in drug discovery and drug repositioning. The KEGG database offers drug annotation and classification using a target-based classification system. In this study, we investigated five target-based classes, (I) G protein-coupled receptors, (II) nuclear receptors, (III) ion channels, (IV) enzymes, and (V) pathogens, using molecular descriptors to represent each drug compound. Two popular feature selection methods, maximum relevance minimum redundancy and incremental feature selection, were adopted to extract the important descriptors. An optimal prediction model based on the nearest neighbor algorithm was then constructed, which gave the best result in identifying drug target-based classes. Finally, some key descriptors were discussed to uncover their important roles in the identification of drug-target classes.
A blur-invariant local feature for motion blurred image matching
NASA Astrophysics Data System (ADS)
Tong, Qiang; Aoki, Terumasa
2017-07-01
Image matching between a blurred image (caused by camera motion, defocus, etc.) and a non-blurred image is a critical task for many image/video applications. However, most existing local feature schemes fail at this task. This paper presents a blur-invariant descriptor and a novel local feature scheme that combines the descriptor with the interest point detector based on moment symmetry from the authors' previous work. The descriptor is based on a new concept, the center peak moment-like element (CPME), which is robust to blur and boundary effects. Because it is constructed from CPMEs, the descriptor is also distinctive and suitable for image matching. Experimental results show that our scheme outperforms state-of-the-art methods for blurred image matching.
Sharma, Mukesh C; Sharma, S
2016-12-01
A series of 2-dihydro-4-quinazolin derivatives that are potent, highly selective inhibitors of inducible nitric oxide synthase was subjected to quantitative structure-activity relationship (QSAR) analysis. Statistically significant equations with a high correlation coefficient (r² = 0.8219) were developed. The k-nearest neighbor model showed good cross-validated correlation coefficient and external validation values of 0.7866 and 0.7133, respectively. The selected electrostatic field descriptors, shown as blue balls around R1 and R4 in the quinazolinamine moiety, indicated that electronegative groups are favorable for nitric oxide synthase activity. The QSAR models may clarify the structural requirements of inducible nitric oxide synthase inhibitors and help in the design of new compounds.
Seeing and Reading Red: Hue and Color-word Correlation in Images and Attendant Text on the WWW
DOE Office of Scientific and Technical Information (OSTI.GOV)
Newsam, S
2004-07-12
This work represents an initial investigation into determining whether correlations actually exist between metadata and content descriptors in multimedia datasets. We provide a quantitative method for evaluating whether the hue of images on the WWW is correlated with the occurrence of color-words in metadata such as URLs, image names, and attendant text. It turns out that such a correlation does exist: the likelihood that a particular color appears in an image whose URL, name, and/or attendant text contains the corresponding color-word is generally at least twice the likelihood that the color appears in a randomly chosen image on the WWW. While this finding might not be significant in and of itself, it represents an initial step towards quantitatively establishing that other, perhaps more useful correlations exist. These correlations form the basis for exciting novel approaches that leverage semi-supervised datasets, such as the WWW, to overcome the semantic gap that has hampered progress in multimedia information retrieval for some time now.
Sedykh, Alexander; Zhu, Hao; Tang, Hao; Zhang, Liying; Richard, Ann; Rusyn, Ivan; Tropsha, Alexander
2011-01-01
Background Quantitative high-throughput screening (qHTS) assays are increasingly being used to inform chemical hazard identification. Hundreds of chemicals have been tested in dozens of cell lines across extensive concentration ranges by the National Toxicology Program in collaboration with the National Institutes of Health Chemical Genomics Center. Objectives Our goal was to test a hypothesis that dose–response data points of the qHTS assays can serve as biological descriptors of assayed chemicals and, when combined with conventional chemical descriptors, improve the accuracy of quantitative structure–activity relationship (QSAR) models applied to prediction of in vivo toxicity end points. Methods We obtained cell viability qHTS concentration–response data for 1,408 substances assayed in 13 cell lines from PubChem; for a subset of these compounds, rodent acute toxicity half-maximal lethal dose (LD50) data were also available. We used the k nearest neighbor classification and random forest QSAR methods to model LD50 data using chemical descriptors either alone (conventional models) or combined with biological descriptors derived from the concentration–response qHTS data (hybrid models). Critical to our approach was the use of a novel noise-filtering algorithm to treat qHTS data. Results Both the external classification accuracy and coverage (i.e., fraction of compounds in the external set that fall within the applicability domain) of the hybrid QSAR models were superior to conventional models. Conclusions Concentration–response qHTS data may serve as informative biological descriptors of molecules that, when combined with conventional chemical descriptors, may considerably improve the accuracy and utility of computational approaches for predicting in vivo animal toxicity end points. PMID:20980217
Towards a metadata scheme for the description of materials - the description of microstructures
NASA Astrophysics Data System (ADS)
Schmitz, Georg J.; Böttger, Bernd; Apel, Markus; Eiken, Janin; Laschet, Gottfried; Altenfeld, Ralph; Berger, Ralf; Boussinot, Guillaume; Viardin, Alexandre
2016-01-01
The property of any material is essentially determined by its microstructure. Numerical models are increasingly the focus of modern engineering as helpful tools for tailoring and optimization of custom-designed microstructures by suitable processing and alloy design. A huge variety of software tools is available to predict various microstructural aspects for different materials. In the general frame of an integrated computational materials engineering (ICME) approach, these microstructure models provide the link between models operating at the atomistic or electronic scales, and models operating on the macroscopic scale of the component and its processing. In view of an improved interoperability of all these different tools it is highly desirable to establish a standardized nomenclature and methodology for the exchange of microstructure data. The scope of this article is to provide a comprehensive system of metadata descriptors for the description of a 3D microstructure. The presented descriptors are limited to a mere geometric description of a static microstructure and have to be complemented by further descriptors, e.g. for properties, numerical representations, kinetic data, and others in the future. Further attributes to each descriptor, e.g. on data origin, data uncertainty, and data validity range are being defined in ongoing work. The proposed descriptors are intended to be independent of any specific numerical representation. The descriptors defined in this article may serve as a first basis for standardization and will simplify the data exchange between different numerical models, as well as promote the integration of experimental data into numerical models of microstructures. An HDF5 template data file for a simple, three phase Al-Cu microstructure being based on the defined descriptors complements this article.
Toxicity prediction of ionic liquids based on Daphnia magna by using density functional theory
NASA Astrophysics Data System (ADS)
Nu’aim, M. N.; Bustam, M. A.
2018-04-01
The toxicity of ionic liquids can be predicted using density functional theory (DFT). DFT gives researchers a substantial tool for computing the quantum state of atoms, molecules, and solids, and for molecular dynamics, which is also known as a computer simulation method. The prediction uses structural-feature-based quantum chemical reactivity descriptors. The ionic liquids and their Log[EC50] data are taken from the literature, specifically Ismail Hossain's thesis entitled “Synthesis, Characterization and Quantitative Structure Toxicity Relationship of Imidazolium, Pyridinium and Ammonium Based Ionic Liquids”. Each cation and anion of the ionic liquids was optimized and calculated. The geometry optimization and calculation yield the energies of the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO). From the HOMO and LUMO values, the other toxicity descriptors were obtained according to their formulas. The toxicity descriptors involved are the electrophilicity index, HOMO, LUMO, energy gap, chemical potential, hardness, and electronegativity. The interrelation between the descriptors was determined using multiple linear regression (MLR). All descriptors were analyzed with MLR, and those that were significant were selected and used to develop the best model equation for toxicity prediction of ionic liquids. The model equation was validated against the Log[EC50] data from the literature, and the final model equation was developed. A wider range of ionic liquids, nearly 108, can be predicted from this model equation.
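The abstract says the remaining descriptors follow from the HOMO and LUMO energies "according to their formulas" without listing them. The sketch below assumes the commonly used conceptual-DFT expressions based on a Koopmans-type approximation; the function name and the input energies are hypothetical, and some authors define hardness without the factor of one half, so the exact convention used in the paper is not confirmed here.

```python
def reactivity_descriptors(e_homo, e_lumo):
    """Derive global reactivity descriptors from frontier orbital energies.

    Assumed conventions (not confirmed by the abstract):
      energy gap          = E_LUMO - E_HOMO
      chemical potential  = (E_HOMO + E_LUMO) / 2
      electronegativity   = -chemical potential
      hardness            = (E_LUMO - E_HOMO) / 2
      electrophilicity    = chemical_potential**2 / (2 * hardness)
    All values share the energy unit of the inputs (e.g. eV).
    """
    gap = e_lumo - e_homo
    if gap <= 0:
        raise ValueError("E_LUMO must be greater than E_HOMO")
    mu = (e_homo + e_lumo) / 2.0
    eta = gap / 2.0
    return {
        "homo": e_homo,
        "lumo": e_lumo,
        "energy_gap": gap,
        "chemical_potential": mu,
        "electronegativity": -mu,
        "hardness": eta,
        "electrophilicity_index": mu ** 2 / (2.0 * eta),
    }


if __name__ == "__main__":
    # Illustrative orbital energies in eV; not taken from the study.
    for name, value in reactivity_descriptors(-6.0, -2.0).items():
        print(f"{name}: {value}")
```

Descriptor vectors of this kind, one per ion, would then be the candidate inputs to the multiple linear regression against Log[EC50] described in the abstract.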
Towards a metadata scheme for the description of materials - the description of microstructures.
Schmitz, Georg J; Böttger, Bernd; Apel, Markus; Eiken, Janin; Laschet, Gottfried; Altenfeld, Ralph; Berger, Ralf; Boussinot, Guillaume; Viardin, Alexandre
2016-01-01
The property of any material is essentially determined by its microstructure. Numerical models are increasingly the focus of modern engineering as helpful tools for tailoring and optimization of custom-designed microstructures by suitable processing and alloy design. A huge variety of software tools is available to predict various microstructural aspects for different materials. In the general frame of an integrated computational materials engineering (ICME) approach, these microstructure models provide the link between models operating at the atomistic or electronic scales, and models operating on the macroscopic scale of the component and its processing. In view of an improved interoperability of all these different tools it is highly desirable to establish a standardized nomenclature and methodology for the exchange of microstructure data. The scope of this article is to provide a comprehensive system of metadata descriptors for the description of a 3D microstructure. The presented descriptors are limited to a mere geometric description of a static microstructure and have to be complemented by further descriptors, e.g. for properties, numerical representations, kinetic data, and others in the future. Further attributes to each descriptor, e.g. on data origin, data uncertainty, and data validity range are being defined in ongoing work. The proposed descriptors are intended to be independent of any specific numerical representation. The descriptors defined in this article may serve as a first basis for standardization and will simplify the data exchange between different numerical models, as well as promote the integration of experimental data into numerical models of microstructures. An HDF5 template data file for a simple, three phase Al-Cu microstructure being based on the defined descriptors complements this article.
NASA Astrophysics Data System (ADS)
Doytchinova, Irini A.; Walshe, Valerie; Borrow, Persephone; Flower, Darren R.
2005-03-01
The affinities of 177 nonameric peptides binding to the HLA-A*0201 molecule were measured using a FACS-based MHC stabilisation assay and analysed using chemometrics. Their structures were described by global and local descriptors, and QSAR models were derived by genetic algorithm, stepwise regression and PLS. The global molecular descriptors included molecular connectivity χ indices, κ shape indices, E-state indices, molecular properties like molecular weight and log P, and three-dimensional descriptors like polarizability, surface area and volume. The local descriptors were of two types. The first used a binary string to indicate the presence of each amino acid type at each position of the peptide. The second was also position-dependent but used five z-scales to describe the main physicochemical properties of the amino acids forming the peptides. The models were developed using a representative training set of 131 peptides and validated using an independent test set of 46 peptides. It was found that the global descriptors could neither explain the variance in the training set nor predict the affinities of the test set accurately. Both types of local descriptors gave QSAR models with better explained variance and predictive ability. The results suggest that, in their interactions with the MHC molecule, the peptide acts as a complicated ensemble of multiple amino acids mutually potentiating each other.
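The first type of local descriptor mentioned above, a binary string marking which amino acid occupies each position, can be sketched as a position-wise one-hot encoding. This is a minimal sketch: the alphabet ordering, the function name, and the example sequence are assumptions made for illustration, and the study's exact bit layout is not given in the abstract.

```python
# The 20 standard amino acids in one-letter code; the ordering is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode_peptide(peptide, length=9):
    """Encode a peptide as a position-dependent binary vector.

    Each position contributes 20 bits, exactly one of which is set,
    so a nonamer becomes a 9 * 20 = 180-bit descriptor.
    """
    if len(peptide) != length:
        raise ValueError(f"expected a peptide of length {length}")
    bits = []
    for residue in peptide:
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unknown amino acid: {residue!r}")
        bits.extend(1 if aa == residue else 0 for aa in AMINO_ACIDS)
    return bits


if __name__ == "__main__":
    # Arbitrary nonamer used only to demonstrate the encoding.
    vector = encode_peptide("LLFGYPVYV")
    print(len(vector), sum(vector))
```

The z-scale variant described in the abstract would replace each 20-bit block with five real-valued physicochemical scores per position, giving 45 columns instead of 180.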
Stenzel, Angelika; Goss, Kai-Uwe; Endo, Satoshi
2013-02-05
Polyparameter linear free energy relationships (pp-LFERs) can predict partition coefficients for a multitude of environmental and biological phases with high accuracy. In this work, the pp-LFER substance descriptors of 40 established and alternative flame retardants (e.g., polybrominated diphenyl ethers, hexabromocyclododecane, bromobenzenes, trialkyl phosphates) were determined experimentally. In total, 251 data for gas-chromatographic (GC) retention times and liquid/liquid partition coefficients (K) were measured and used to calibrate the pp-LFER substance descriptors. Substance descriptors were validated through a comparison between predicted and experimental log K for the systems octanol/water (K(ow)), water/air (K(wa)), organic carbon/water (K(oc)) and liposome/water (K(lipw)), revealing a high reliability of pp-LFER predictions based on our descriptors. For instance, the difference between predicted and experimental log K(ow) was <0.3 log units for 17 out of 21 compounds for which experimental values were available. Moreover, we found an indication that the H-bond acceptor value (B) depends on the solvent for some compounds. Thus, for predicting environmentally relevant partition coefficients it is important to determine B values using measurements in aqueous systems. The pp-LFER descriptors calibrated in this study can be used to predict partition coefficients for which experimental data are unavailable, and the predicted values can serve as references for further experimental measurements.
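To make the prediction step concrete, the sketch below evaluates one common form of a pp-LFER, log K = c + eE + sS + aA + bB + vV, where the capital letters are substance descriptors and the lowercase letters are coefficients of the partitioning system. The abstract does not state which equation form was used, so this form, the class names, and every numeric value in the example are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoluteDescriptors:
    """Substance descriptors (values are compound-specific)."""
    E: float  # excess molar refraction
    S: float  # polarizability/dipolarity
    A: float  # H-bond donor ability
    B: float  # H-bond acceptor ability
    V: float  # McGowan characteristic volume


@dataclass(frozen=True)
class SystemCoefficients:
    """Fitted coefficients describing one partitioning system."""
    c: float
    e: float
    s: float
    a: float
    b: float
    v: float


def predict_log_k(solute, system):
    """Evaluate log K = c + eE + sS + aA + bB + vV."""
    return (system.c
            + system.e * solute.E
            + system.s * solute.S
            + system.a * solute.A
            + system.b * solute.B
            + system.v * solute.V)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the formula.
    solute = SoluteDescriptors(E=1.0, S=2.0, A=0.0, B=0.5, V=1.0)
    system = SystemCoefficients(c=1.0, e=2.0, s=-1.0, a=0.5, b=-3.0, v=4.0)
    print(predict_log_k(solute, system))
```

Because B enters linearly, a solvent-dependent B value shifts the predicted log K by b times the change in B, which is why the abstract stresses determining B from measurements in aqueous systems.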
Stargate GTM: Bridging Descriptor and Activity Spaces.
Gaspar, Héléna A; Baskin, Igor I; Marcou, Gilles; Horvath, Dragos; Varnek, Alexandre
2015-11-23
Predicting the activity profile of a molecule or discovering structures possessing a specific activity profile are two important goals in chemoinformatics, which could be achieved by bridging activity and molecular descriptor spaces. In this paper, we introduce the "Stargate" version of the Generative Topographic Mapping approach (S-GTM) in which two different multidimensional spaces (e.g., structural descriptor space and activity space) are linked through a common 2D latent space. In the S-GTM algorithm, the manifolds are trained simultaneously in two initial spaces using the probabilities in the 2D latent space calculated as a weighted geometric mean of probability distributions in both spaces. S-GTM has the following interesting features: (1) activities are involved during the training procedure; therefore, the method is supervised, unlike conventional GTM; (2) using molecular descriptors of a given compound as input, the model predicts a whole activity profile, and (3) using an activity profile as input, areas populated by relevant chemical structures can be detected. To assess the performance of S-GTM prediction models, a descriptor space (ISIDA descriptors) of a set of 1325 GPCR ligands was related to a B-dimensional (B = 1 or 8) activity space corresponding to pKi values for eight different targets. S-GTM outperforms conventional GTM for individual activities and performs similarly to the Lasso multitask learning algorithm, although it is still slightly less accurate than the Random Forest method.
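The coupling step described in this abstract, a weighted geometric mean of the probability distributions from the two spaces over the shared latent nodes, can be sketched as follows. This illustrates only the combination rule, not the S-GTM training algorithm; the weight name `w`, the renormalization, and the function name are assumptions.

```python
def combine_responsibilities(p_descriptor, p_activity, w=0.5):
    """Weighted geometric mean of two distributions over the same latent nodes.

    p_descriptor, p_activity: probabilities per latent node from each space.
    w: weight of the descriptor space (1 - w goes to the activity space).
    Returns a distribution renormalized to sum to 1.
    """
    if len(p_descriptor) != len(p_activity):
        raise ValueError("both distributions must cover the same latent nodes")
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    raw = [(a ** w) * (b ** (1.0 - w)) for a, b in zip(p_descriptor, p_activity)]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("distributions have no overlapping support")
    return [r / total for r in raw]


if __name__ == "__main__":
    # Two toy distributions over two latent nodes.
    print(combine_responsibilities([0.5, 0.5], [0.9, 0.1], w=0.5))
```

Setting `w` to 1 or 0 recovers the distribution of a single space, so the weight controls how strongly each space shapes the shared 2D map.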
Kennicutt, A R; Morkowchuk, L; Krein, M; Breneman, C M; Kilduff, J E
2016-08-01
A quantitative structure-activity relationship was developed to predict the efficacy of carbon adsorption as a control technology for endocrine-disrupting compounds, pharmaceuticals, and components of personal care products, as a tool for water quality professionals to protect public health. Here, we expand previous work to investigate a broad spectrum of molecular descriptors including subdivided surface areas, adjacency and distance matrix descriptors, electrostatic partial charges, potential energy descriptors, conformation-dependent charge descriptors, and Transferable Atom Equivalent (TAE) descriptors that characterize the regional electronic properties of molecules. We compare the efficacy of linear (Partial Least Squares) and non-linear (Support Vector Machine) machine learning methods to describe a broad chemical space and produce a user-friendly model. We employ cross-validation, y-scrambling, and external validation for quality control. The recommended Support Vector Machine model trained on 95 compounds having 23 descriptors offered a good balance between good performance statistics, low error, and low probability of over-fitting while describing a wide range of chemical features. The cross-validated model using a log-uptake (qe) response calculated at an aqueous equilibrium concentration (Ce) of 1 μM described the training dataset with an r(2) of 0.932, had a cross-validated r(2) of 0.833, and an average residual of 0.14 log units.
2014-01-01
We present four models of solution free-energy prediction for druglike molecules utilizing cheminformatics descriptors and theoretically calculated thermodynamic values. We make predictions of solution free energy using physics-based theory alone and using machine learning/quantitative structure–property relationship (QSPR) models. We also develop machine learning models where the theoretical energies and cheminformatics descriptors are used as combined input. These models are used to predict solvation free energy. While direct theoretical calculation does not give accurate results in this approach, machine learning is able to give predictions with a root mean squared error (RMSE) of ∼1.1 log S units in a 10-fold cross-validation for our Drug-Like-Solubility-100 (DLS-100) dataset of 100 druglike molecules. We find that a model built using energy terms from our theoretical methodology as descriptors is marginally less predictive than one built on Chemistry Development Kit (CDK) descriptors. Combining both sets of descriptors allows a further but very modest improvement in the predictions. However, in some cases, this is a statistically significant enhancement. These results suggest that there is little complementarity between the chemical information provided by these two sets of descriptors, despite their different sources and methods of calculation. Our machine learning models are also able to predict the well-known Solubility Challenge dataset with an RMSE value of 0.9–1.0 log S units. PMID:24564264
Cardio-vascular safety beyond hERG: in silico modelling of a guinea pig right atrium assay
NASA Astrophysics Data System (ADS)
Fenu, Luca A.; Teisman, Ard; De Buck, Stefan S.; Sinha, Vikash K.; Gilissen, Ron A. H. J.; Nijsen, Marjoleen J. M. A.; Mackie, Claire E.; Sanderson, Wendy E.
2009-12-01
As chemists can easily produce large numbers of new potential drug candidates, there is growing demand for high capacity models that can help in driving the chemistry towards efficacious and safe candidates before progressing towards more complex models. Traditionally, the cardiovascular (CV) safety domain plays an important role in this process, as many preclinical CV biomarkers seem to have high prognostic value for the clinical outcome. Throughout the industry, traditional ion channel binding data are generated to drive the early selection process. Although this assay can generate data at high capacity, it has the disadvantage of producing high numbers of false negatives. Therefore, our company applies the isolated guinea pig right atrium (GPRA) assay early on in discovery. This functional multi-channel/multi-receptor model seems much more predictive in identifying potential CV liabilities. Unfortunately, however, its capacity is limited, and there is no room for full automation. We assessed the correlation between ion channel binding and the GPRA's Rate of Contraction (RC), Contractile Force (CF), and effective refractory frequency (ERF) measures using over six thousand different data points. Furthermore, the existing experimental knowledge base was used to develop a set of in silico classification models attempting to mimic the GPRA inhibitory activity. The Naïve Bayesian classifier was used to build several models, using the ion channel binding data or in silico computed properties and structural fingerprints as descriptors. The models were validated on an independent and diverse test set of 200 reference compounds. Performances were assessed on the basis of their overall accuracy, sensitivity and specificity in detecting both active and inactive molecules. Our data show that all in silico models are highly predictive of actual GPRA data, at a level equivalent or superior to the ion channel binding assays.
Furthermore, the models were interpreted in terms of the descriptors used to highlight the undesirable areas in the explored chemical space, specifically regions of low polarity, high lipophilicity and high molecular weight. In conclusion, we developed a predictive in silico model of a complex physiological assay based on a large and high quality set of experimental data. This model allows high throughput in silico safety screening based on chemical structure within a given chemical space.
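The three performance measures named in this abstract (overall accuracy, sensitivity and specificity) can be illustrated with a minimal sketch. The function name and the example labels are hypothetical and are not taken from the paper; a label of 1 is assumed to mean "active" in the GPRA assay.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels, 1 = active."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # actives found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # inactives found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed actives
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
print(classification_metrics(y_true, y_pred))
```

Reporting sensitivity and specificity separately matters here because the abstract's stated concern with binding assays is false negatives, which overall accuracy alone can hide.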
Lenca, Nicole; Atapattu, Sanka N; Poole, Colin F
2017-12-01
Retention factors obtained by gas chromatography and reversed-phase liquid chromatography on varied columns and partition constants in different liquid-liquid partition systems are used to estimate WSU descriptor values for 36 anilines and N-heterocyclic compounds, 13 amides and related compounds, and 45 phenols and alcohols. These compounds are suitable for use as calibration compounds to characterize separation systems covering the descriptor space E=0.2-3, S=0.4-2.1, A=0-1.5, B=0.1-1.5, L=2.5-10.0 and V=0.5-2.2. Hydrogen-bonding properties are discussed in terms of structure, the possibility of induction effects, intramolecular hydrogen bonding and steric factors for anilines, amides, phenols and alcohols. The relationships between these parameters and observed descriptor values are difficult to predict from structure but facilitate improving the general occupancy of the descriptor space by creating incremental changes in hydrogen-bonding properties. It is verified that the compounds included in this study can be merged with an existing database of compounds recommended for characterizing separation systems.
Locator-Checker-Scaler Object Tracking Using Spatially Ordered and Weighted Patch Descriptor.
Kim, Han-Ul; Kim, Chang-Su
2017-08-01
In this paper, we propose a simple yet effective object descriptor and a novel tracking algorithm to track a target object accurately. For the object description, we divide the bounding box of a target object into multiple patches and describe them with color and gradient histograms. Then, we determine the foreground weight of each patch to alleviate the impacts of background information in the bounding box. To this end, we perform random walk with restart (RWR) simulation. We then concatenate the weighted patch descriptors to yield the spatially ordered and weighted patch (SOWP) descriptor. For the object tracking, we incorporate the proposed SOWP descriptor into a novel tracking algorithm, which has three components: locator, checker, and scaler (LCS). The locator and the scaler estimate the center location and the size of a target, respectively. The checker determines whether it is safe to adjust the target scale in a current frame. These three components cooperate with one another to achieve robust tracking. Experimental results demonstrate that the proposed LCS tracker achieves excellent performance on recent benchmarks.
Artificial Intelligence Methods Applied to Parameter Detection of Atrial Fibrillation
NASA Astrophysics Data System (ADS)
Arotaritei, D.; Rotariu, C.
2015-09-01
In this paper we present a novel method to develop an atrial fibrillation (AF) detector based on statistical descriptors and a hybrid neuro-fuzzy and crisp system. The system's inference produces if-then-else rules that are extracted to construct a binary decision system: normal or atrial fibrillation. We use TPR (Turning Point Ratio), SE (Shannon Entropy) and RMSSD (Root Mean Square of Successive Differences) along with a new descriptor, Teager-Kaiser energy, in order to improve the accuracy of detection. The descriptors are calculated over a sliding window that produces a very large number of vectors (a massive dataset) used by the classifier. The length of the window is a crisp descriptor, while the rest of the descriptors are interval-valued. The parameters of the hybrid system are adapted using a Genetic Algorithm (GA) with a single-objective fitness target: the highest values of sensitivity and specificity. The rules are extracted and they are part of the decision system. The proposed method was tested using the Physionet MIT-BIH Atrial Fibrillation Database and the experimental results revealed a good accuracy of AF detection in terms of sensitivity and specificity (above 90%).
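The four descriptors named in this abstract have standard textbook definitions, sketched below for a series of RR intervals. This is an illustration under stated assumptions, not the authors' implementation: the function names, the histogram bin count, and the choice of RR intervals (in seconds) as input are all hypothetical, and the fuzzy/GA stages are not shown.

```python
import math


def rmssd(rr):
    """Root mean square of successive differences."""
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def turning_point_ratio(rr):
    """Fraction of interior samples that are local maxima or minima."""
    turns = sum(
        1
        for a, b, c in zip(rr, rr[1:], rr[2:])
        if (b > a and b > c) or (b < a and b < c)
    )
    return turns / (len(rr) - 2)


def shannon_entropy(rr, bins=8):
    """Entropy (bits) of a fixed-width histogram; bin count is an assumption."""
    lo, hi = min(rr), max(rr)
    if hi == lo:
        return 0.0
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in rr:
        counts[min(int((x - lo) / width), bins - 1)] += 1
    n = len(rr)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def teager_kaiser_energy(x):
    """Discrete Teager-Kaiser operator: x[n]^2 - x[n-1] * x[n+1]."""
    return [x[i] ** 2 - x[i - 1] * x[i + 1] for i in range(1, len(x) - 1)]


def sliding_descriptors(rr, window):
    """One descriptor vector per window position (window must be >= 3)."""
    return [
        (rmssd(w), turning_point_ratio(w), shannon_entropy(w))
        for w in (rr[i:i + window] for i in range(len(rr) - window + 1))
    ]
```

For example, `sliding_descriptors(rr, window=4)` on a five-sample series yields two descriptor vectors, which shows how a long recording turns into the large set of vectors the abstract mentions.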
Clar theory and resonance energy
NASA Astrophysics Data System (ADS)
Gutman, Ivan; Gojak, Sabina; Furtula, Boris
2005-09-01
A mathematical model, referred to here as the Zhang-Zhang polynomial ζ(x), that embraces all the main concepts encountered in the Clar aromatic sextet theory of benzenoid hydrocarbons, was recently put forward by Zhang and Zhang. We now show that ζ(x) is related to resonance energy, and that ln ζ(x) and RE are best correlated when x ≈ 1. This indicates that ζ(1) could be viewed as a (novel) structure-descriptor, playing a role analogous to the Kekulé structure count in Kekulé-structure-based theories. Some basic properties of ζ(1) are established.
Duan, Lingfeng; Han, Jiwan; Guo, Zilong; Tu, Haifu; Yang, Peng; Zhang, Dong; Fan, Yuan; Chen, Guoxing; Xiong, Lizhong; Dai, Mingqiu; Williams, Kevin; Corke, Fiona; Doonan, John H; Yang, Wanneng
2018-01-01
Dynamic quantification of drought response is a key issue both for variety selection and for functional genetic study of rice drought resistance. Traditional assessment of drought resistance traits, such as stay-green and leaf-rolling, has utilized manual measurements, that are often subjective, error-prone, poorly quantified and time consuming. To relieve this phenotyping bottleneck, we demonstrate a feasible, robust and non-destructive method that dynamically quantifies response to drought, under both controlled and field conditions. Firstly, RGB images of individual rice plants at different growth points were analyzed to derive 4 features that were influenced by imposition of drought. These include a feature related to the ability to stay green, which we termed greenness plant area ratio (GPAR) and 3 shape descriptors [total plant area/bounding rectangle area ratio (TBR), perimeter area ratio (PAR) and total plant area/convex hull area ratio (TCR)]. Experiments showed that these 4 features were capable of discriminating reliably between drought resistant and drought sensitive accessions, and dynamically quantifying the drought response under controlled conditions across time (at either daily or half hourly time intervals). We compared the 3 shape descriptors and concluded that PAR was more robust and sensitive to leaf-rolling than the other shape descriptors. In addition, PAR and GPAR proved to be effective in quantification of drought response in the field. Moreover, the values obtained in field experiments using the collection of rice varieties were correlated with those derived from pot-based experiments. The general applicability of the algorithms is demonstrated by their ability to probe archival Miscanthus data previously collected on an independent platform. 
In conclusion, this image-based technology is robust, providing a platform-independent tool for quantifying drought response that should be of general utility for breeding and functional genomics in the future.
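Two of the shape descriptors named in this abstract, TBR (total plant area / bounding rectangle area) and PAR (perimeter / area), can be sketched on a binary plant mask as follows. The pixel-level definitions used here are assumptions, since the abstract does not give them: perimeter is counted as the number of exposed pixel edges, and the mask is assumed to be a non-empty list of rows of 0/1 values. TCR is omitted because it additionally needs a convex hull.

```python
def shape_descriptors(mask):
    """Return TBR and PAR for a non-empty binary mask (list of 0/1 rows)."""
    h, w = len(mask), len(mask[0])
    rows = [r for r in range(h) if any(mask[r])]
    cols = [c for c in range(w) if any(mask[r][c] for r in range(h))]
    area = sum(1 for row in mask for v in row if v)
    bbox_area = (rows[-1] - rows[0] + 1) * (cols[-1] - cols[0] + 1)

    # Perimeter: count pixel edges that face background or the image border.
    perimeter = 0
    for r in range(h):
        for c in range(w):
            if not mask[r][c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < h and 0 <= nc < w)
                if outside or not mask[nr][nc]:
                    perimeter += 1

    return {"TBR": area / bbox_area, "PAR": perimeter / area}


# Hypothetical plus-shaped "plant", for illustration only.
plus = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
print(shape_descriptors(plus))
```

With this edge-counting perimeter, thin elongated shapes give a higher PAR than compact ones of the same area, which is the kind of shape change the abstract associates with leaf rolling.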
Arimura, Tatsuyuki; Hosoi, Masako; Tsukiyama, Yoshihiro; Yoshida, Toshiyuki; Fujiwara, Daiki; Tanaka, Masanori; Tamura, Ryuichi; Nakashima, Yasunori; Sudo, Nobuyuki; Kubo, Chiharu
2012-04-01
The present study aimed to develop a Japanese version of the Short-Form McGill Pain Questionnaire (SF-MPQ-J) that focuses on cross-cultural equivalence to the original English version and to test its reliability and validity. Cross-sectional design. In study 1, SF-MPQ was translated and adapted into Japanese. It included construction of response scales equivalent to the original using a variation of the Thurstone method of equal-appearing intervals. A total of 147 undergraduate students and 44 pain patients participated in the development of the Japanese response scales. To measure the equivalence of pain descriptors, 62 pain patients in four diagnostic groups were asked to choose pain descriptors that described their pain. In study 2, chronic pain patients (N=126) completed the SF-MPQ-J, the Long-Form McGill Pain Questionnaire Japanese version (LF-MPQ-J), and the 11-point numerical rating scale of pain intensity. Correlation analysis examined the construct validity of the SF-MPQ-J. The results from study 1 were used to develop SF-MPQ-J, which is linguistically equivalent to the original questionnaire. Response scales from SF-MPQ-J represented the original scale values. All pain descriptors, except one, were used by >33% in at least one of the four diagnostic groups. Study 2 exhibited adequate internal consistency and test-retest reliability, with the construct validity of SF-MPQ-J comparable to the original. These findings suggested that SF-MPQ-J is reliable, valid, and cross-culturally equivalent to the original questionnaire. Researchers might consider using this scale in multicenter, multi-ethnic trials or cross-cultural studies that include Japanese-speaking patients.
Data-Driven Neural Network Model for Robust Reconstruction of Automobile Casting
NASA Astrophysics Data System (ADS)
Lin, Jinhua; Wang, Yanjie; Li, Xin; Wang, Lu
2017-09-01
In computer vision systems, robustly reconstructing the complex 3D geometries of automobile castings is a challenging task. 3D scanning data are usually corrupted by noise and the scanning resolution is low; these effects normally lead to incomplete matching and drift. To solve these problems, a data-driven local geometric learning model is proposed to achieve robust reconstruction of automobile castings. To reduce the interference of sensor noise and to remain compatible with incomplete scanning data, a 3D convolutional neural network is established to match the local geometric features of automobile castings. The proposed neural network combines the geometric feature representation with the correlation metric function to robustly match local correspondences. We use the truncated distance field (TDF) around the key point to represent the 3D surface of the casting geometry, so that the model can be directly embedded into 3D space to learn the geometric feature representation. Finally, the training labels are automatically generated for deep learning based on an existing RGB-D reconstruction algorithm, which accesses the same global key matching descriptor. The experimental results show that the matching accuracy of our network is 92.2% for automobile castings, and the closed loop rate is about 74.0% when the matching tolerance threshold τ is 0.2. The matching descriptors performed well and retained 81.6% matching accuracy at 95% closed loop. For sparse geometric castings with initial matching failure, the 3D matching object can be reconstructed robustly by training the key descriptors. Our method performs 3D reconstruction robustly for complex automobile castings.
Moore, R.; Brødsgaard, I.; Mao, T. K.; Miller, M. L.; Dworkin, S. F.
1998-01-01
Differences in ethnic beliefs about the perceived need for local anesthesia for tooth drilling and childbirth labor were surveyed among Anglo-Americans, Mandarin Chinese, and Scandinavians (89 dentists and 251 patients) matched for age, gender, and occupation. Subjects matched survey questionnaire items selected from previously reported interview results to estimate (a) their beliefs about the possible use of anesthetic for tooth drilling and labor pain compared with other possible remedies and (b) the choice of pain descriptors associated with the use or nonuse of anesthetic, including descriptions of injection pain. Multidimensional scaling, Gamma, and Chi-square statistics as well as odds ratios and Spearman's correlations were employed in the analysis. Seventy-seven percent of American informants reported the use of anesthetics as possible remedies for drilling and 51% reported the use of anesthetics for labor pain compared with 34% that reported the use of anesthetics among Chinese for drilling and 5% for labor pain and 70% among Scandinavians for drilling and 35% for labor pain. Most Americans and Swedes described tooth-drilling sensations as sharp, most Chinese used descriptors such as sharp and "sourish" (suan), and most Danes used words like shooting (jagende). By rank, Americans described labor pain as cramping, sharp, and excruciating, Chinese used words like sharp, intermittent, and horrible, Danes used words like shooting, tiring, and sharp, and Swedes used words like tiring, "good," yet horrible. Preferred pain descriptors for drilling, birth, and injection pains varied significantly by ethnicity. Results corroborated conclusions of a qualitative study about pain beliefs in relation to perceived needs for anesthetic in tooth drilling. Samples used to obtain the results were estimated to approach qualitative representativity for these urban ethnic groups. PMID:9790007
Postharvest Monitoring of Tomato Ripening Using the Dynamic Laser Speckle
Pieczywek, Piotr Mariusz; Nowacka, Małgorzata; Dadan, Magdalena; Wiktor, Artur; Rybak, Katarzyna; Witrowa-Rajchert, Dorota; Zdunek, Artur
2018-01-01
The dynamic laser speckle (biospeckle) method was tested as a potential tool for the assessment and monitoring of the maturity stage of tomatoes. Two tomato cultivars, Admiro and Starbuck, were tested. The process of climacteric maturation of tomatoes was monitored during a shelf life storage experiment. The biospeckle phenomena were captured using 640 nm and 830 nm laser light wavelengths, and analysed using two activity descriptors based on biospeckle pattern decorrelation, C4 and ε. The well-established optical parameters of tomato skin were used as a reference method (luminosity, a*/b*, chroma). Both methods were tested with respect to their ability to predict the maturity and destructive indicators of tomatoes: firmness, chlorophyll and carotenoids content. The statistical significance of the tested relationships was investigated by means of linear regression models. The climacteric maturation of tomato fruit was associated with an increase in biospeckle activity. Compared to the 830 nm laser wavelength, the biospeckle activity measured at 640 nm enabled more accurate predictions of firmness, chlorophyll and carotenoids content. At 640 nm laser wavelength both activity descriptors (C4 and ε) provided similar results, while at 830 nm the ε showed slightly better performance. The linear regression models showed that biospeckle activity descriptors had a higher correlation with chlorophyll and carotenoids content than the a*/b* ratio and luminosity. The results for chroma were comparable with the results for both biospeckle activity indicators. The biospeckle method showed very good results in terms of maturation monitoring and the prediction of the maturity indices of tomatoes, proving the possibility of practical implementation of this method for the determination of the maturity stage of tomatoes. PMID:29617343
Wang, Wenyi; Kim, Marlene T.; Sedykh, Alexander
2015-01-01
Purpose: Experimental Blood–Brain Barrier (BBB) permeability models for drug molecules are expensive and time-consuming. As alternative methods, several traditional Quantitative Structure-Activity Relationship (QSAR) models have been developed previously. In this study, we aimed to improve the predictivity of traditional QSAR BBB permeability models by employing relevant public bio-assay data in the modeling process. Methods: We compiled a BBB permeability database consisting of 439 unique compounds from various resources. The database was split into a modeling set of 341 compounds and a validation set of 98 compounds. Consensus QSAR modeling workflow was employed on the modeling set to develop various QSAR models. A five-fold cross-validation approach was used to validate the developed models, and the resulting models were used to predict the external validation set compounds. Furthermore, we used previously published membrane transporter models to generate relevant transporter profiles for target compounds. The transporter profiles were used as additional biological descriptors to develop hybrid QSAR BBB models. Results: The consensus QSAR models have R2=0.638 for five-fold cross-validation and R2=0.504 for external validation. The consensus model developed by pooling chemical and transporter descriptors showed better predictivity (R2=0.646 for five-fold cross-validation and R2=0.526 for external validation). Moreover, several external bio-assays that correlate with BBB permeability were identified using our automatic profiling tool. Conclusions: The BBB permeability models developed in this study can be useful for early evaluation of new compounds (e.g., new drug candidates). The combination of chemical and biological descriptors shows a promising direction to improve the current traditional QSAR models. PMID:25862462
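The validation scheme described in this abstract (five-fold cross-validation of a consensus model, scored by R2) can be sketched in a few lines. Everything below is a simplified stand-in and not the authors' workflow: each "model" is a one-descriptor least-squares line, the consensus is the plain mean of the individual predictions, folds are assigned by index modulo k, and R2 is computed as the coefficient of determination, which may differ from the definition used in the paper.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    my = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def cross_validated_consensus_r2(descriptor_columns, y, k=5):
    """k-fold CV: fit one model per descriptor column, average predictions."""
    n = len(y)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        models = [
            fit_line([col[i] for i in train], [y[i] for i in train])
            for col in descriptor_columns
        ]
        for i in test:
            # Consensus prediction: mean over the individual models.
            preds[i] = sum(
                m * col[i] + b for (m, b), col in zip(models, descriptor_columns)
            ) / len(models)
    return r_squared(y, preds)
```

Because every compound is predicted only by models that never saw it, the cross-validated R2 is typically lower than the fitted R2; an external validation set, as used in the abstract, is a further and stricter check.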
Describing a Strongly Correlated Model System with Density Functional Theory.
Kong, Jing; Proynov, Emil; Yu, Jianguo; Pachter, Ruth
2017-07-06
The linear chain of hydrogen atoms, a basic prototype for the transition from a metal to Mott insulator, is studied with a recent density functional theory model functional for nondynamic and strong correlation. The computed cohesive energy curve for the transition agrees well with accurate literature results. The variation of the electronic structure in this transition is characterized with a density functional descriptor that yields the atomic population of effectively localized electrons. These new methods are also applied to the study of the Peierls dimerization of the stretched even-spaced Mott insulator to a chain of H2 molecules, a different insulator. The transitions among the two insulating states and the metallic state of the hydrogen chain system are depicted in a semiquantitative phase diagram. Overall, we demonstrate the capability of studying strongly correlated materials with a mean-field model at the fundamental level, in contrast to the general pessimistic view on such a feasibility.
Innovative design method of automobile profile based on Fourier descriptor
NASA Astrophysics Data System (ADS)
Gao, Shuyong; Fu, Chaoxing; Xia, Fan; Shen, Wei
2017-10-01
Aiming at innovation in the side contours of automobiles, this paper presents an innovative design method for the vehicle side profile based on Fourier descriptors. The design flow of this method is: pre-processing, coordinate extraction, standardization, discrete Fourier transform, Fourier descriptor simplification, descriptor exchange, and inverse Fourier transform to obtain the outline of the innovative design. Innovative concepts of the innovative methods of gene exchange among species and the innovative methods of gene exchange among different species are presented, and the contours of the innovative design are obtained separately. A three-dimensional model of a car is obtained by referring to the profile curve which is obtained by exchanging xenogeneic genes. The feasibility of the proposed method is verified from various aspects.
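The core of the design flow listed in this abstract (discrete Fourier transform, descriptor simplification, descriptor exchange, inverse transform) can be sketched with contour points encoded as complex numbers x + iy. This is a minimal illustration under assumptions, not the authors' implementation: the function names are hypothetical, the pre-processing and standardization steps are not shown, and a naive O(n^2) transform is used to stay within the standard library.

```python
import cmath


def dft(points):
    """Fourier descriptors of a closed contour given as complex points."""
    n = len(points)
    return [
        sum(points[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n)) / n
        for u in range(n)
    ]


def idft(coeffs):
    """Rebuild contour points from Fourier descriptors."""
    n = len(coeffs)
    return [
        sum(coeffs[u] * cmath.exp(2j * cmath.pi * u * k / n) for u in range(n))
        for k in range(n)
    ]


def simplify(coeffs, keep):
    """Zero all but the `keep` lowest positive and negative frequencies (and DC)."""
    n = len(coeffs)
    return [c if (u <= keep or u >= n - keep) else 0j for u, c in enumerate(coeffs)]


def exchange(coeffs_a, coeffs_b, indices):
    """Copy the selected descriptors of contour B into contour A."""
    out = list(coeffs_a)
    for u in indices:
        out[u] = coeffs_b[u]
    return out


# Hypothetical contours with the same number of sample points.
square = [0 + 0j, 1 + 0j, 1 + 1j, 0 + 1j]
diamond = [0.5 + 0j, 1 + 0.5j, 0.5 + 1j, 0 + 0.5j]
hybrid = idft(exchange(dft(square), dft(diamond), indices=[1]))
```

Low-order descriptors carry the overall shape and high-order ones the fine detail, so which indices are exchanged controls how much of one profile is blended into the other; both contours must be sampled with the same number of points for the exchange to be meaningful.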
Application of the QSPR approach to the boiling points of azeotropes.
Katritzky, Alan R; Stoyanova-Slavova, Iva B; Tämm, Kaido; Tamm, Tarmo; Karelson, Mati
2011-04-21
CODESSA Pro derivative descriptors were calculated for a data set of 426 azeotropic mixtures by the centroid approximation and the weighted-contribution-factor approximation. The two approximations produced almost identical four-descriptor QSPR models relating the structural characteristics of the individual components of azeotropes to the azeotropic boiling points. These models were supported by internal and external validations. The descriptors contributing to the QSPR models are directly related to the three components of the enthalpy (heat) of vaporization.
TOXICO-CHEMINFORMATICS AND QSAR MODELING OF ...
This abstract concludes that QSAR approaches combined with toxico-chemoinformatics descriptors can enhance predictive toxicology models.
Kiang, Michael; Light, Gregory A; Prugh, Jocelyn; Coulson, Seana; Braff, David L; Kutas, Marta
2007-07-01
A hallmark of schizophrenia is impaired proverb interpretation, which could be due to: (1) aberrant activation of disorganized semantic associations, or (2) working memory (WM) deficits. We assessed 18 schizophrenia patients and 18 normal control participants on proverb interpretation, and evaluated these two hypotheses by examining within patients the correlations of proverb interpretation with disorganized symptoms and auditory WM, respectively. Secondarily, we also explored the relationships between proverb interpretation and a spectrum of cognitive functions including auditory sensory-memory encoding (as indexed by the mismatch negativity (MMN) event-related brain potential (ERP)); executive function; and social/occupational function. As expected, schizophrenia patients produced less accurate and less abstract descriptions of proverbs than did controls. These proverb interpretation difficulties in patients were not significantly correlated with disorganization or other symptom factors, but were significantly correlated (p < .05) with WM impairment, as well as with impairments in sensory-memory encoding, executive function, and social/occupational function. These results offer no support for disorganized associations in abnormal proverb interpretation in schizophrenia, but implicate WM deficits, perhaps as a part of a syndrome related to generalized frontal cortical dysfunction.
Short memory fuzzy fusion image recognition schema employing spatial and Fourier descriptors
NASA Astrophysics Data System (ADS)
Raptis, Sotiris N.; Tzafestas, Spyros G.
2001-03-01
Single images quite often do not bear enough information for precise interpretation, for a variety of reasons. Multiple image fusion and adequate integration recently became the state of the art in the pattern recognition field. In this paper an enhanced multiple-observation schema is discussed, investigating improvements to the baseline fuzzy-probabilistic image fusion methodology. The first innovation consists in considering only a limited but seemingly more effective part of the uncertainty information obtained by a certain time, restricting older uncertainty dependencies and alleviating the computational burden, which is now needed only for a short sequence of samples (stored in memory). The second innovation consists essentially in grouping them into feature-blind object hypotheses. Experiment settings include a sequence of independent views obtained by a camera being moved around the investigated object.
Perspectives on Porous Media MR in Clinical MRI
NASA Astrophysics Data System (ADS)
Sigmund, E. E.
2011-03-01
Many goals and challenges of research in natural or synthetic porous media are mirrored in quantitative medical MRI. This review will describe examples where MR techniques used in porous media (particularly diffusion-weighted imaging (DWI)) are applied to physiological pathologies. Tissue microstructure is one area with great overlap with porous media science. Diffusion-weighting (esp. in neurological tissue) has motivated models with explicit physical dimensions, statistical parameters, empirical descriptors, or hybrids thereof. Another clinically relevant microscopic process is active flow. Renal (kidney) tissue possesses significant active vascular / tubular transport that manifests as "pseudodiffusion." Cancerous lesions involve anomalies in both structure and flow. The tools of magnetic resonance and their interpretation in porous media have had great impact on clinical MRI, and continued cross-fertilization of ideas can only enhance the progress of both fields.
Du, Hongying; Wang, Jie; Yao, Xiaojun; Hu, Zhide
2009-01-01
The heuristic method (HM) and support vector machine (SVM) were used to construct quantitative structure-retention relationship models for a series of compounds to predict the gradient retention times of reversed-phase high-performance liquid chromatography (HPLC) in three different columns. The aims of this investigation were to predict the retention times of multifarious compounds, to find the main properties of the three columns, and to indicate the theory of separation procedures. In our method, we correlated the retention times of many structurally diverse analytes in three columns (Symmetry C18, Chromolith, and SG-MIX) with their representative molecular descriptors, calculated from the molecular structures alone. HM was used to select the most important molecular descriptors and build linear regression models. Furthermore, non-linear regression models were built using the SVM method; the performance of the SVM models was better than that of the HM models, and the prediction results were in good agreement with the experimental values. This paper could give some insights into the factors that were likely to govern the gradient retention process of the three investigated HPLC columns, which could theoretically guide practical experiments.
The Impact of DSM-5 A-Criteria Changes on Parent Ratings of ADHD in Adolescents.
Sibley, Margaret H; Yeguez, Carlos E
2018-01-01
Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5) A-criteria for ADHD were expanded to include new descriptors referencing adolescent and adult symptom manifestations. This study examines the effect of these changes on symptom endorsement in a sample of adolescents with ADHD (N = 259; age range = 10.72-16.70). Parent ratings were collected and Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR) and DSM-5 endorsement of ADHD symptoms were compared. Under the DSM-5, there were significant increases in reported inattention, but not hyperactivity/impulsivity (H/I) symptoms, with specific elevations for certain symptoms. The average adolescent met criteria for less than one additional symptom under the DSM-5, but the correlation between ADHD symptoms and impairment was attenuated when using the DSM-5 items. Impulsivity items appeared to represent adolescent deficits better than hyperactivity items. Results were not moderated by demographic factors. In a sample of adolescents with well-diagnosed DSM-IV-TR ADHD, developmental symptom descriptors led parents to endorse slightly more symptoms of inattention, but this elevation is unlikely to be clinically meaningful.
Abdullah, Nor Hayati; Thomas, Noel Francis; Sivasothy, Yasodha; Lee, Vannajan Sanghiran; Liew, Sook Yee; Noorbatcha, Ibrahim Ali; Awang, Khalijah
2016-01-01
The mammalian hyaluronidase degrades hyaluronic acid by the cleavage of the β-1,4-glycosidic bond, furnishing a tetrasaccharide molecule as the main product, which is highly angiogenic and a potent inducer of inflammatory cytokines. Ursolic acid 1, isolated from Prismatomeris tetrandra, was identified as having potential for the development of hyaluronidase inhibitors. A series of ursolic acid analogues were either synthesized via structure modification of ursolic acid 1 or commercially obtained. The evaluation of the inhibitory activity of these compounds on the hyaluronidase enzyme was conducted. Several structural, topological and quantum chemical descriptors for these compounds were calculated using semiempirical quantum chemical methods. A quantitative structure-activity relationship (QSAR) study was performed to correlate these descriptors with the hyaluronidase inhibitory activity. The statistical characteristics provided by the best multi-linear model (BML) (R2 = 0.9717, R2cv = 0.9506) indicated satisfactory stability and predictive ability of the developed model. The in silico molecular docking study, which was used to determine the binding interactions, revealed that the ursolic acid analog 22 had a strong affinity towards human hyaluronidase. PMID:26907251
NASA Astrophysics Data System (ADS)
Ghavami, Raouf; Sadeghi, Faridoon; Rasouli, Zolikha; Djannati, Farhad
2012-12-01
Experimental values for the 13C NMR chemical shifts (ppm, TMS = 0) at 300 K ranging from 96.28 ppm (C4' of indole derivative 17) to 159.93 ppm (C4' of indole derivative 23) relative to deuterated chloroform (CDCl3, 77.0 ppm) or dimethylsulfoxide (DMSO, 39.50 ppm) as internal reference in CDCl3 or DMSO-d6 solutions have been collected from the literature for thirty 2-functionalized 5-(methylsulfonyl)-1-phenyl-1H-indole derivatives containing different substituent groups. Effective quantitative structure-property relationship (QSPR) models were built using a hybrid method combining a genetic algorithm (GA) based on stepwise selection multiple linear regression (SWS-MLR) as feature-selection tools and correlation models between each carbon atom of the indole derivatives and the calculated descriptors. Each compound was depicted by molecular structural descriptors that encode constitutional, topological, geometrical, electrostatic, and quantum chemical features. The accuracy of all developed models was confirmed using different types of internal and external procedures and various statistical tests. Furthermore, the domain of applicability for each model, which indicates the area of reliable predictions, was defined.