algorithm based descriptor: Topics by Science.gov

Sample records for algorithm based descriptor

Systems Biological Approach of Molecular Descriptors Connectivity: Optimal Descriptors for Oral Bioavailability Prediction

PubMed Central

Ahmed, Shiek S. S. J.; Ramakrishnan, V.

2012-01-01

Background Poor oral bioavailability is an important parameter accounting for the failure of the drug candidates. Approximately, 50% of developing drugs fail because of unfavorable oral bioavailability. In silico prediction of oral bioavailability (%F) based on physiochemical properties are highly needed. Although many computational models have been developed to predict oral bioavailability, their accuracy remains low with a significant number of false positives. In this study, we present an oral bioavailability model based on systems biological approach, using a machine learning algorithm coupled with an optimal discriminative set of physiochemical properties. Results The models were developed based on computationally derived 247 physicochemical descriptors from 2279 molecules, among which 969, 605 and 705 molecules were corresponds to oral bioavailability, intestinal absorption (HIA) and caco-2 permeability data set, respectively. The partial least squares discriminate analysis showed 49 descriptors of HIA and 50 descriptors of caco-2 are the major contributing descriptors in classifying into groups. Of these descriptors, 47 descriptors were commonly associated to HIA and caco-2, which suggests to play a vital role in classifying oral bioavailability. To determine the best machine learning algorithm, 21 classifiers were compared using a bioavailability data set of 969 molecules with 47 descriptors. Each molecule in the data set was represented by a set of 47 physiochemical properties with the functional relevance labeled as (+bioavailability/−bioavailability) to indicate good-bioavailability/poor-bioavailability molecules. The best-performing algorithm was the logistic algorithm. The correlation based feature selection (CFS) algorithm was implemented, which confirms that these 47 descriptors are the fundamental descriptors for oral bioavailability prediction. Conclusion The logistic algorithm with 47 selected descriptors correctly predicted the oral bioavailability, with a predictive accuracy of more than 71%. Overall, the method captures the fundamental molecular descriptors, that can be used as an entity to facilitate prediction of oral bioavailability. PMID:22815781
Systems biological approach of molecular descriptors connectivity: optimal descriptors for oral bioavailability prediction.

PubMed

Ahmed, Shiek S S J; Ramakrishnan, V

2012-01-01

Poor oral bioavailability is an important parameter accounting for the failure of the drug candidates. Approximately, 50% of developing drugs fail because of unfavorable oral bioavailability. In silico prediction of oral bioavailability (%F) based on physiochemical properties are highly needed. Although many computational models have been developed to predict oral bioavailability, their accuracy remains low with a significant number of false positives. In this study, we present an oral bioavailability model based on systems biological approach, using a machine learning algorithm coupled with an optimal discriminative set of physiochemical properties. The models were developed based on computationally derived 247 physicochemical descriptors from 2279 molecules, among which 969, 605 and 705 molecules were corresponds to oral bioavailability, intestinal absorption (HIA) and caco-2 permeability data set, respectively. The partial least squares discriminate analysis showed 49 descriptors of HIA and 50 descriptors of caco-2 are the major contributing descriptors in classifying into groups. Of these descriptors, 47 descriptors were commonly associated to HIA and caco-2, which suggests to play a vital role in classifying oral bioavailability. To determine the best machine learning algorithm, 21 classifiers were compared using a bioavailability data set of 969 molecules with 47 descriptors. Each molecule in the data set was represented by a set of 47 physiochemical properties with the functional relevance labeled as (+bioavailability/-bioavailability) to indicate good-bioavailability/poor-bioavailability molecules. The best-performing algorithm was the logistic algorithm. The correlation based feature selection (CFS) algorithm was implemented, which confirms that these 47 descriptors are the fundamental descriptors for oral bioavailability prediction. The logistic algorithm with 47 selected descriptors correctly predicted the oral bioavailability, with a predictive accuracy of more than 71%. Overall, the method captures the fundamental molecular descriptors, that can be used as an entity to facilitate prediction of oral bioavailability.
Fingerprint Identification Using SIFT-Based Minutia Descriptors and Improved All Descriptor-Pair Matching

PubMed Central

Zhou, Ru; Zhong, Dexing; Han, Jiuqiang

2013-01-01

The performance of conventional minutiae-based fingerprint authentication algorithms degrades significantly when dealing with low quality fingerprints with lots of cuts or scratches. A similar degradation of the minutiae-based algorithms is observed when small overlapping areas appear because of the quite narrow width of the sensors. Based on the detection of minutiae, Scale Invariant Feature Transformation (SIFT) descriptors are employed to fulfill verification tasks in the above difficult scenarios. However, the original SIFT algorithm is not suitable for fingerprint because of: (1) the similar patterns of parallel ridges; and (2) high computational resource consumption. To enhance the efficiency and effectiveness of the algorithm for fingerprint verification, we propose a SIFT-based Minutia Descriptor (SMD) to improve the SIFT algorithm through image processing, descriptor extraction and matcher. A two-step fast matcher, named improved All Descriptor-Pair Matching (iADM), is also proposed to implement the 1:N verifications in real-time. Fingerprint Identification using SMD and iADM (FISiA) achieved a significant improvement with respect to accuracy in representative databases compared with the conventional minutiae-based method. The speed of FISiA also can meet real-time requirements. PMID:23467056
The implementation of aerial object recognition algorithm based on contour descriptor in FPGA-based on-board vision system

NASA Astrophysics Data System (ADS)

Babayan, Pavel; Smirnov, Sergey; Strotov, Valery

2017-10-01

This paper describes the aerial object recognition algorithm for on-board and stationary vision system. Suggested algorithm is intended to recognize the objects of a specific kind using the set of the reference objects defined by 3D models. The proposed algorithm based on the outer contour descriptor building. The algorithm consists of two stages: learning and recognition. Learning stage is devoted to the exploring of reference objects. Using 3D models we can build the database containing training images by rendering the 3D model from viewpoints evenly distributed on a sphere. Sphere points distribution is made by the geosphere principle. Gathered training image set is used for calculating descriptors, which will be used in the recognition stage of the algorithm. The recognition stage is focusing on estimating the similarity of the captured object and the reference objects by matching an observed image descriptor and the reference object descriptors. The experimental research was performed using a set of the models of the aircraft of the different types (airplanes, helicopters, UAVs). The proposed orientation estimation algorithm showed good accuracy in all case studies. The real-time performance of the algorithm in FPGA-based vision system was demonstrated.
The implementation of contour-based object orientation estimation algorithm in FPGA-based on-board vision system

NASA Astrophysics Data System (ADS)

Alpatov, Boris; Babayan, Pavel; Ershov, Maksim; Strotov, Valery

2016-10-01

This paper describes the implementation of the orientation estimation algorithm in FPGA-based vision system. An approach to estimate an orientation of objects lacking axial symmetry is proposed. Suggested algorithm is intended to estimate orientation of a specific known 3D object based on object 3D model. The proposed orientation estimation algorithm consists of two stages: learning and estimation. Learning stage is devoted to the exploring of studied object. Using 3D model we can gather set of training images by capturing 3D model from viewpoints evenly distributed on a sphere. Sphere points distribution is made by the geosphere principle. Gathered training image set is used for calculating descriptors, which will be used in the estimation stage of the algorithm. The estimation stage is focusing on matching process between an observed image descriptor and the training image descriptors. The experimental research was performed using a set of images of Airbus A380. The proposed orientation estimation algorithm showed good accuracy in all case studies. The real-time performance of the algorithm in FPGA-based vision system was demonstrated.
Effective structural descriptors for natural and engineered radioactive waste confinement barriers

NASA Astrophysics Data System (ADS)

Lemmens, Laurent; Rogiers, Bart; De Craen, Mieke; Laloy, Eric; Jacques, Diederik; Huysmans, Marijke; Swennen, Rudy; Urai, Janos L.; Desbois, Guillaume

2017-04-01

The microstructure of a radioactive waste confinement barrier strongly influences its flow and transport properties. Numerical flow and transport simulations for these porous media at the pore scale therefore require input data that describe the microstructure as accurately as possible. To date, no imaging method can resolve all heterogeneities within important radioactive waste confinement barrier materials as hardened cement paste and natural clays at the micro scale (nm-cm). Therefore, it is necessary to merge information from different 2D and 3D imaging methods using porous media reconstruction techniques. To qualitatively compare the results of different reconstruction techniques, visual inspection might suffice. To quantitatively compare training-image based algorithms, Tan et al. (2014) proposed an algorithm using an analysis of distance. However, the ranking of the algorithm depends on the choice of the structural descriptor, in their case multiple-point or cluster-based histograms. We present here preliminary work in which we will review different structural descriptors and test their effectiveness, for capturing the main structural characteristics of radioactive waste confinement barrier materials, to determine the descriptors to use in the analysis of distance. The investigated descriptors are particle size distributions, surface area distributions, two point probability functions, multiple point histograms, linear functions and two point cluster functions. The descriptor testing consists of stochastically generating realizations from a reference image using the simulated annealing optimization procedure introduced by Karsanina et al. (2015). This procedure basically minimizes the differences between pre-specified descriptor values associated with the training image and the image being produced. The most efficient descriptor set can therefore be identified by comparing the image generation quality among the tested descriptor combinations. The assessment of the quality of the simulations will be made by combining all considered descriptors. Once the set of the most efficient descriptors is determined, they can be used in the analysis of distance, to rank different reconstruction algorithms in a more objective way in future work. Karsanina MV, Gerke KM, Skvortsova EB, Mallants D (2015) Universal Spatial Correlation Functions for Describing and Reconstructing Soil Microstructure. PLoS ONE 10(5): e0126515. doi:10.1371/journal.pone.0126515 Tan, Xiaojin, Pejman Tahmasebi, and Jef Caers. "Comparing training-image based algorithms using an analysis of distance." Mathematical Geosciences 46.2 (2014): 149-169.
RGB-D SLAM Combining Visual Odometry and Extended Information Filter

PubMed Central

Zhang, Heng; Liu, Yanli; Tan, Jindong; Xiong, Naixue

2015-01-01

In this paper, we present a novel RGB-D SLAM system based on visual odometry and an extended information filter, which does not require any other sensors or odometry. In contrast to the graph optimization approaches, this is more suitable for online applications. A visual dead reckoning algorithm based on visual residuals is devised, which is used to estimate motion control input. In addition, we use a novel descriptor called binary robust appearance and normals descriptor (BRAND) to extract features from the RGB-D frame and use them as landmarks. Furthermore, considering both the 3D positions and the BRAND descriptors of the landmarks, our observation model avoids explicit data association between the observations and the map by marginalizing the observation likelihood over all possible associations. Experimental validation is provided, which compares the proposed RGB-D SLAM algorithm with just RGB-D visual odometry and a graph-based RGB-D SLAM algorithm using the publicly-available RGB-D dataset. The results of the experiments demonstrate that our system is quicker than the graph-based RGB-D SLAM algorithm. PMID:26263990
Molecular descriptor subset selection in theoretical peptide quantitative structure-retention relationship model development using nature-inspired optimization algorithms.

PubMed

Žuvela, Petar; Liu, J Jay; Macur, Katarzyna; Bączek, Tomasz

2015-10-06

In this work, performance of five nature-inspired optimization algorithms, genetic algorithm (GA), particle swarm optimization (PSO), artificial bee colony (ABC), firefly algorithm (FA), and flower pollination algorithm (FPA), was compared in molecular descriptor selection for development of quantitative structure-retention relationship (QSRR) models for 83 peptides that originate from eight model proteins. The matrix with 423 descriptors was used as input, and QSRR models based on selected descriptors were built using partial least squares (PLS), whereas root mean square error of prediction (RMSEP) was used as a fitness function for their selection. Three performance criteria, prediction accuracy, computational cost, and the number of selected descriptors, were used to evaluate the developed QSRR models. The results show that all five variable selection methods outperform interval PLS (iPLS), sparse PLS (sPLS), and the full PLS model, whereas GA is superior because of its lowest computational cost and higher accuracy (RMSEP of 5.534%) with a smaller number of variables (nine descriptors). The GA-QSRR model was validated initially through Y-randomization. In addition, it was successfully validated with an external testing set out of 102 peptides originating from Bacillus subtilis proteomes (RMSEP of 22.030%). Its applicability domain was defined, from which it was evident that the developed GA-QSRR exhibited strong robustness. All the sources of the model's error were identified, thus allowing for further application of the developed methodology in proteomics.
A 3D model retrieval approach based on Bayesian networks lightfield descriptor

NASA Astrophysics Data System (ADS)

Xiao, Qinhan; Li, Yanjun

2009-12-01

A new 3D model retrieval methodology is proposed by exploiting a novel Bayesian networks lightfield descriptor (BNLD). There are two key novelties in our approach: (1) a BN-based method for building lightfield descriptor; and (2) a 3D model retrieval scheme based on the proposed BNLD. To overcome the disadvantages of the existing 3D model retrieval methods, we explore BN for building a new lightfield descriptor. Firstly, 3D model is put into lightfield, about 300 binary-views can be obtained along a sphere, then Fourier descriptors and Zernike moments descriptors can be calculated out from binaryviews. Then shape feature sequence would be learned into a BN model based on BN learning algorithm; Secondly, we propose a new 3D model retrieval method by calculating Kullback-Leibler Divergence (KLD) between BNLDs. Beneficial from the statistical learning, our BNLD is noise robustness as compared to the existing methods. The comparison between our method and the lightfield descriptor-based approach is conducted to demonstrate the effectiveness of our proposed methodology.
Unobtrusive Multi-Static Serial LiDAR Imager (UMSLI) First Generation Shape-Matching Based Classifier for 2D Contours

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cao, Zheng; Ouyang, Bing; Principe, Jose

A multi-static serial LiDAR system prototype was developed under DE-EE0006787 to detect, classify, and record interactions of marine life with marine hydrokinetic generation equipment. This software implements a shape-matching based classifier algorithm for the underwater automated detection of marine life for that system. In addition to applying shape descriptors, the algorithm also adopts information theoretical learning based affine shape registration, improving point correspondences found by shape descriptors as well as the final similarity measure.
Deep Learning for Lowtextured Image Matching

NASA Astrophysics Data System (ADS)

Kniaz, V. V.; Fedorenko, V. V.; Fomin, N. A.

2018-05-01

Low-textured objects pose challenges for an automatic 3D model reconstruction. Such objects are common in archeological applications of photogrammetry. Most of the common feature point descriptors fail to match local patches in featureless regions of an object. Hence, automatic documentation of the archeological process using Structure from Motion (SfM) methods is challenging. Nevertheless, such documentation is possible with the aid of a human operator. Deep learning-based descriptors have outperformed most of common feature point descriptors recently. This paper is focused on the development of a new Wide Image Zone Adaptive Robust feature Descriptor (WIZARD) based on the deep learning. We use a convolutional auto-encoder to compress discriminative features of a local path into a descriptor code. We build a codebook to perform point matching on multiple images. The matching is performed using the nearest neighbor search and a modified voting algorithm. We present a new "Multi-view Amphora" (Amphora) dataset for evaluation of point matching algorithms. The dataset includes images of an Ancient Greek vase found at Taman Peninsula in Southern Russia. The dataset provides color images, a ground truth 3D model, and a ground truth optical flow. We evaluated the WIZARD descriptor on the "Amphora" dataset to show that it outperforms the SIFT and SURF descriptors on the complex patch pairs.
Mobile Visual Search Based on Histogram Matching and Zone Weight Learning

NASA Astrophysics Data System (ADS)

Zhu, Chuang; Tao, Li; Yang, Fan; Lu, Tao; Jia, Huizhu; Xie, Xiaodong

2018-01-01

In this paper, we propose a novel image retrieval algorithm for mobile visual search. At first, a short visual codebook is generated based on the descriptor database to represent the statistical information of the dataset. Then, an accurate local descriptor similarity score is computed by merging the tf-idf weighted histogram matching and the weighting strategy in compact descriptors for visual search (CDVS). At last, both the global descriptor matching score and the local descriptor similarity score are summed up to rerank the retrieval results according to the learned zone weights. The results show that the proposed approach outperforms the state-of-the-art image retrieval method in CDVS.
Analysis of A Drug Target-based Classification System using Molecular Descriptors.

PubMed

Lu, Jing; Zhang, Pin; Bi, Yi; Luo, Xiaomin

2016-01-01

Drug-target interaction is an important topic in drug discovery and drug repositioning. KEGG database offers a drug annotation and classification using a target-based classification system. In this study, we gave an investigation on five target-based classes: (I) G protein-coupled receptors; (II) Nuclear receptors; (III) Ion channels; (IV) Enzymes; (V) Pathogens, using molecular descriptors to represent each drug compound. Two popular feature selection methods, maximum relevance minimum redundancy and incremental feature selection, were adopted to extract the important descriptors. Meanwhile, an optimal prediction model based on nearest neighbor algorithm was constructed, which got the best result in identifying drug target-based classes. Finally, some key descriptors were discussed to uncover their important roles in the identification of drug-target classes.
Fast human pose estimation using 3D Zernike descriptors

NASA Astrophysics Data System (ADS)

Berjón, Daniel; Morán, Francisco

2012-03-01

Markerless video-based human pose estimation algorithms face a high-dimensional problem that is frequently broken down into several lower-dimensional ones by estimating the pose of each limb separately. However, in order to do so they need to reliably locate the torso, for which they typically rely on time coherence and tracking algorithms. Their losing track usually results in catastrophic failure of the process, requiring human intervention and thus precluding their usage in real-time applications. We propose a very fast rough pose estimation scheme based on global shape descriptors built on 3D Zernike moments. Using an articulated model that we configure in many poses, a large database of descriptor/pose pairs can be computed off-line. Thus, the only steps that must be done on-line are the extraction of the descriptors for each input volume and a search against the database to get the most likely poses. While the result of such process is not a fine pose estimation, it can be useful to help more sophisticated algorithms to regain track or make more educated guesses when creating new particles in particle-filter-based tracking schemes. We have achieved a performance of about ten fps on a single computer using a database of about one million entries.
Graphic matching based on shape contexts and reweighted random walks

NASA Astrophysics Data System (ADS)

Zhang, Mingxuan; Niu, Dongmei; Zhao, Xiuyang; Liu, Mingjun

2018-04-01

Graphic matching is a very critical issue in all aspects of computer vision. In this paper, a new graphics matching algorithm combining shape contexts and reweighted random walks was proposed. On the basis of the local descriptor, shape contexts, the reweighted random walks algorithm was modified to possess stronger robustness and correctness in the final result. Our main process is to use the descriptor of the shape contexts for the random walk on the iteration, of which purpose is to control the random walk probability matrix. We calculate bias matrix by using descriptors and then in the iteration we use it to enhance random walks' and random jumps' accuracy, finally we get the one-to-one registration result by discretization of the matrix. The algorithm not only preserves the noise robustness of reweighted random walks but also possesses the rotation, translation, scale invariance of shape contexts. Through extensive experiments, based on real images and random synthetic point sets, and comparisons with other algorithms, it is confirmed that this new method can produce excellent results in graphic matching.
Descriptor selection for banana accessions based on univariate and multivariate analysis.

PubMed

Brandão, L P; Souza, C P F; Pereira, V M; Silva, S O; Santos-Serejo, J A; Ledo, C A S; Amorim, E P

2013-05-14

Our objective was to establish a minimum number of morphological descriptors for the characterization of banana germplasm and evaluate the efficiency of removal of redundant characters, based on univariate and multivariate statistical analyses. Phenotypic characterization was made of 77 accessions from Bahia, Brazil, using 92 descriptors. The selection of the descriptors was carried out by principal components analysis (quantitative) and by entropy (multi-category). Efficiency of elimination was analyzed by a comparative study between the clusters formed, taking into consideration all 92 descriptors and smaller groups. The selected descriptors were analyzed with the Ward-MLM procedure and a combined matrix formed by the Gower algorithm. We were able to reduce the number of descriptors used for characterizing the banana germplasm (42%). The correlation between the matrices considering the 92 descriptors and the selected ones was 0.82, showing that the reduction in the number of descriptors did not influence estimation of genetic variability between the banana accessions. We conclude that removing these descriptors caused no loss of information, considering the groups formed from pre-established criteria, including subgroup/subspecies.
Harmony Search as a Powerful Tool for Feature Selection in QSPR Study of the Drugs Lipophilicity.

PubMed

Bahadori, Behnoosh; Atabati, Morteza

2017-01-01

Aims & Scope: Lipophilicity represents one of the most studied and most frequently used fundamental physicochemical properties. In the present work, harmony search (HS) algorithm is suggested to feature selection in quantitative structure-property relationship (QSPR) modeling to predict lipophilicity of neutral, acidic, basic and amphotheric drugs that were determined by UHPLC. Harmony search is a music-based metaheuristic optimization algorithm. It was affected by the observation that the aim of music is to search for a perfect state of harmony. Semi-empirical quantum-chemical calculations at AM1 level were used to find the optimum 3D geometry of the studied molecules and variant descriptors (1497 descriptors) were calculated by the Dragon software. The selected descriptors by harmony search algorithm (9 descriptors) were applied for model development using multiple linear regression (MLR). In comparison with other feature selection methods such as genetic algorithm and simulated annealing, harmony search algorithm has better results. The root mean square error (RMSE) with and without leave-one out cross validation (LOOCV) were obtained 0.417 and 0.302, respectively. The results were compared with those obtained from the genetic algorithm and simulated annealing methods and it showed that the HS is a helpful tool for feature selection with fine performance. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Structural alignment of protein descriptors - a combinatorial model.

PubMed

Antczak, Maciej; Kasprzak, Marta; Lukasiak, Piotr; Blazewicz, Jacek

2016-09-17

Structural alignment of proteins is one of the most challenging problems in molecular biology. The tertiary structure of a protein strictly correlates with its function and computationally predicted structures are nowadays a main premise for understanding the latter. However, computationally derived 3D models often exhibit deviations from the native structure. A way to confirm a model is a comparison with other structures. The structural alignment of a pair of proteins can be defined with the use of a concept of protein descriptors. The protein descriptors are local substructures of protein molecules, which allow us to divide the original problem into a set of subproblems and, consequently, to propose a more efficient algorithmic solution. In the literature, one can find many applications of the descriptors concept that prove its usefulness for insight into protein 3D structures, but the proposed approaches are presented rather from the biological perspective than from the computational or algorithmic point of view. Efficient algorithms for identification and structural comparison of descriptors can become crucial components of methods for structural quality assessment as well as tertiary structure prediction. In this paper, we propose a new combinatorial model and new polynomial-time algorithms for the structural alignment of descriptors. The model is based on the maximum-size assignment problem, which we define here and prove that it can be solved in polynomial time. We demonstrate suitability of this approach by comparison with an exact backtracking algorithm. Besides a simplification coming from the combinatorial modeling, both on the conceptual and complexity level, we gain with this approach high quality of obtained results, in terms of 3D alignment accuracy and processing efficiency. All the proposed algorithms were developed and integrated in a computationally efficient tool descs-standalone, which allows the user to identify and structurally compare descriptors of biological molecules, such as proteins and RNAs. Both PDB (Protein Data Bank) and mmCIF (macromolecular Crystallographic Information File) formats are supported. The proposed tool is available as an open source project stored on GitHub ( https://github.com/mantczak/descs-standalone ).
Registration algorithm of point clouds based on multiscale normal features

NASA Astrophysics Data System (ADS)

Lu, Jun; Peng, Zhongtao; Su, Hang; Xia, GuiHua

2015-01-01

The point cloud registration technology for obtaining a three-dimensional digital model is widely applied in many areas. To improve the accuracy and speed of point cloud registration, a registration method based on multiscale normal vectors is proposed. The proposed registration method mainly includes three parts: the selection of key points, the calculation of feature descriptors, and the determining and optimization of correspondences. First, key points are selected from the point cloud based on the changes of magnitude of multiscale curvatures obtained by using principal components analysis. Then the feature descriptor of each key point is proposed, which consists of 21 elements based on multiscale normal vectors and curvatures. The correspondences in a pair of two point clouds are determined according to the descriptor's similarity of key points in the source point cloud and target point cloud. Correspondences are optimized by using a random sampling consistency algorithm and clustering technology. Finally, singular value decomposition is applied to optimized correspondences so that the rigid transformation matrix between two point clouds is obtained. Experimental results show that the proposed point cloud registration algorithm has a faster calculation speed, higher registration accuracy, and better antinoise performance.
Contour-based object orientation estimation

NASA Astrophysics Data System (ADS)

Alpatov, Boris; Babayan, Pavel

2016-04-01

Real-time object orientation estimation is an actual problem of computer vision nowadays. In this paper we propose an approach to estimate an orientation of objects lacking axial symmetry. Proposed algorithm is intended to estimate orientation of a specific known 3D object, so 3D model is required for learning. The proposed orientation estimation algorithm consists of 2 stages: learning and estimation. Learning stage is devoted to the exploring of studied object. Using 3D model we can gather set of training images by capturing 3D model from viewpoints evenly distributed on a sphere. Sphere points distribution is made by the geosphere principle. It minimizes the training image set. Gathered training image set is used for calculating descriptors, which will be used in the estimation stage of the algorithm. The estimation stage is focusing on matching process between an observed image descriptor and the training image descriptors. The experimental research was performed using a set of images of Airbus A380. The proposed orientation estimation algorithm showed good accuracy (mean error value less than 6°) in all case studies. The real-time performance of the algorithm was also demonstrated.

LSAH: a fast and efficient local surface feature for point cloud registration

NASA Astrophysics Data System (ADS)

Lu, Rongrong; Zhu, Feng; Wu, Qingxiao; Kong, Yanzi

2018-04-01

Point cloud registration is a fundamental task in high level three dimensional applications. Noise, uneven point density and varying point cloud resolutions are the three main challenges for point cloud registration. In this paper, we design a robust and compact local surface descriptor called Local Surface Angles Histogram (LSAH) and propose an effectively coarse to fine algorithm for point cloud registration. The LSAH descriptor is formed by concatenating five normalized sub-histograms into one histogram. The five sub-histograms are created by accumulating a different type of angle from a local surface patch respectively. The experimental results show that our LSAH is more robust to uneven point density and point cloud resolutions than four state-of-the-art local descriptors in terms of feature matching. Moreover, we tested our LSAH based coarse to fine algorithm for point cloud registration. The experimental results demonstrate that our algorithm is robust and efficient as well.
Multiple feature fusion via covariance matrix for visual tracking

NASA Astrophysics Data System (ADS)

Jin, Zefenfen; Hou, Zhiqiang; Yu, Wangsheng; Wang, Xin; Sun, Hui

2018-04-01

Aiming at the problem of complicated dynamic scenes in visual target tracking, a multi-feature fusion tracking algorithm based on covariance matrix is proposed to improve the robustness of the tracking algorithm. In the frame-work of quantum genetic algorithm, this paper uses the region covariance descriptor to fuse the color, edge and texture features. It also uses a fast covariance intersection algorithm to update the model. The low dimension of region covariance descriptor, the fast convergence speed and strong global optimization ability of quantum genetic algorithm, and the fast computation of fast covariance intersection algorithm are used to improve the computational efficiency of fusion, matching, and updating process, so that the algorithm achieves a fast and effective multi-feature fusion tracking. The experiments prove that the proposed algorithm can not only achieve fast and robust tracking but also effectively handle interference of occlusion, rotation, deformation, motion blur and so on.
SU-F-303-11: Implementation and Applications of Rapid, SIFT-Based Cine MR Image Binning and Region Tracking

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mazur, T; Wang, Y; Fischer-Valuck, B

2015-06-15

Purpose: To develop a novel and rapid, SIFT-based algorithm for assessing feature motion on cine MR images acquired during MRI-guided radiotherapy treatments. In particular, we apply SIFT descriptors toward both partitioning cine images into respiratory states and tracking regions across frames. Methods: Among a training set of images acquired during a fraction, we densely assign SIFT descriptors to pixels within the images. We cluster these descriptors across all frames in order to produce a dictionary of trackable features. Associating the best-matching descriptors at every frame among the training images to these features, we construct motion traces for the features. Wemore » use these traces to define respiratory bins for sorting images in order to facilitate robust pixel-by-pixel tracking. Instead of applying conventional methods for identifying pixel correspondences across frames we utilize a recently-developed algorithm that derives correspondences via a matching objective for SIFT descriptors. Results: We apply these methods to a collection of lung, abdominal, and breast patients. We evaluate the procedure for respiratory binning using target sites exhibiting high-amplitude motion among 20 lung and abdominal patients. In particular, we investigate whether these methods yield minimal variation between images within a bin by perturbing the resulting image distributions among bins. Moreover, we compare the motion between averaged images across respiratory states to 4DCT data for these patients. We evaluate the algorithm for obtaining pixel correspondences between frames by tracking contours among a set of breast patients. As an initial case, we track easily-identifiable edges of lumpectomy cavities that show minimal motion over treatment. Conclusions: These SIFT-based methods reliably extract motion information from cine MR images acquired during patient treatments. While we performed our analysis retrospectively, the algorithm lends itself to prospective motion assessment. Applications of these methods include motion assessment, identifying treatment windows for gating, and determining optimal margins for treatment.« less
Artificial Intelligence Methods Applied to Parameter Detection of Atrial Fibrillation

NASA Astrophysics Data System (ADS)

Arotaritei, D.; Rotariu, C.

2015-09-01

In this paper we present a novel method to develop an atrial fibrillation (AF) based on statistical descriptors and hybrid neuro-fuzzy and crisp system. The inference of system produce rules of type if-then-else that care extracted to construct a binary decision system: normal of atrial fibrillation. We use TPR (Turning Point Ratio), SE (Shannon Entropy) and RMSSD (Root Mean Square of Successive Differences) along with a new descriptor, Teager- Kaiser energy, in order to improve the accuracy of detection. The descriptors are calculated over a sliding window that produce very large number of vectors (massive dataset) used by classifier. The length of window is a crisp descriptor meanwhile the rest of descriptors are interval-valued type. The parameters of hybrid system are adapted using Genetic Algorithm (GA) algorithm with fitness single objective target: highest values for sensibility and sensitivity. The rules are extracted and they are part of the decision system. The proposed method was tested using the Physionet MIT-BIH Atrial Fibrillation Database and the experimental results revealed a good accuracy of AF detection in terms of sensitivity and specificity (above 90%).
Neural network-based feature point descriptors for registration of optical and SAR images

NASA Astrophysics Data System (ADS)

Abulkhanov, Dmitry; Konovalenko, Ivan; Nikolaev, Dmitry; Savchik, Alexey; Shvets, Evgeny; Sidorchuk, Dmitry

2018-04-01

Registration of images of different nature is an important technique used in image fusion, change detection, efficient information representation and other problems of computer vision. Solving this task using feature-based approaches is usually more complex than registration of several optical images because traditional feature descriptors (SIFT, SURF, etc.) perform poorly when images have different nature. In this paper we consider the problem of registration of SAR and optical images. We train neural network to build feature point descriptors and use RANSAC algorithm to align found matches. Experimental results are presented that confirm the method's effectiveness.
Parametric classification of handvein patterns based on texture features

NASA Astrophysics Data System (ADS)

Al Mahafzah, Harbi; Imran, Mohammad; Supreetha Gowda H., D.

2018-04-01

In this paper, we have developed Biometric recognition system adopting hand based modality Handvein,which has the unique pattern for each individual and it is impossible to counterfeit and fabricate as it is an internal feature. We have opted in choosing feature extraction algorithms such as LBP-visual descriptor, LPQ-blur insensitive texture operator, Log-Gabor-Texture descriptor. We have chosen well known classifiers such as KNN and SVM for classification. We have experimented and tabulated results of single algorithm recognition rate for Handvein under different distance measures and kernel options. The feature level fusion is carried out which increased the performance level.
On three dimensional object recognition and pose-determination: An abstraction based approach. Ph.D. Thesis - Michigan Univ. Final Report

NASA Technical Reports Server (NTRS)

Quek, Kok How Francis

1990-01-01

A method of computing reliable Gaussian and mean curvature sign-map descriptors from the polynomial approximation of surfaces was demonstrated. Such descriptors which are invariant under perspective variation are suitable for hypothesis generation. A means for determining the pose of constructed geometric forms whose algebraic surface descriptors are nonlinear in terms of their orienting parameters was developed. This was done by means of linear functions which are capable of approximating nonlinear forms and determining their parameters. It was shown that biquadratic surfaces are suitable companion linear forms for cylindrical approximation and parameter estimation. The estimates provided the initial parametric approximations necessary for a nonlinear regression stage to fine tune the estimates by fitting the actual nonlinear form to the data. A hypothesis-based split-merge algorithm for extraction and pose determination of cylinders and planes which merge smoothly into other surfaces was developed. It was shown that all split-merge algorithms are hypothesis-based. A finite-state algorithm for the extraction of the boundaries of run-length regions was developed. The computation takes advantage of the run list topology and boundary direction constraints implicit in the run-length encoding.
Chi-square-based scoring function for categorization of MEDLINE citations.

PubMed

Kastrin, A; Peterlin, B; Hristovski, D

2010-01-01

Text categorization has been used in biomedical informatics for identifying documents containing relevant topics of interest. We developed a simple method that uses a chi-square-based scoring function to determine the likelihood of MEDLINE citations containing genetic relevant topic. Our procedure requires construction of a genetic and a nongenetic domain document corpus. We used MeSH descriptors assigned to MEDLINE citations for this categorization task. We compared frequencies of MeSH descriptors between two corpora applying chi-square test. A MeSH descriptor was considered to be a positive indicator if its relative observed frequency in the genetic domain corpus was greater than its relative observed frequency in the nongenetic domain corpus. The output of the proposed method is a list of scores for all the citations, with the highest score given to those citations containing MeSH descriptors typical for the genetic domain. Validation was done on a set of 734 manually annotated MEDLINE citations. It achieved predictive accuracy of 0.87 with 0.69 recall and 0.64 precision. We evaluated the method by comparing it to three machine-learning algorithms (support vector machines, decision trees, naïve Bayes). Although the differences were not statistically significantly different, results showed that our chi-square scoring performs as good as compared machine-learning algorithms. We suggest that the chi-square scoring is an effective solution to help categorize MEDLINE citations. The algorithm is implemented in the BITOLA literature-based discovery support system as a preprocessor for gene symbol disambiguation process.
Gun bore flaw image matching based on improved SIFT descriptor

NASA Astrophysics Data System (ADS)

Zeng, Luan; Xiong, Wei; Zhai, You

2013-01-01

In order to increase the operation speed and matching ability of SIFT algorithm, the SIFT descriptor and matching strategy are improved. First, a method of constructing feature descriptor based on sector area is proposed. By computing the gradients histogram of location bins which are parted into 6 sector areas, a descriptor with 48 dimensions is constituted. It can reduce the dimension of feature vector and decrease the complexity of structuring descriptor. Second, it introduce a strategy that partitions the circular region into 6 identical sector areas starting from the dominate orientation. Consequently, the computational complexity is reduced due to cancellation of rotation operation for the area. The experimental results indicate that comparing with the OpenCV SIFT arithmetic, the average matching speed of the new method increase by about 55.86%. The matching veracity can be increased even under some variation of view point, illumination, rotation, scale and out of focus. The new method got satisfied results in gun bore flaw image matching. Keywords: Metrology, Flaw image matching, Gun bore, Feature descriptor
ADME evaluation in drug discovery. 1. Applications of genetic algorithms to the prediction of blood-brain partitioning of a large set of drugs.

PubMed

Hou, Tingjun; Xu, Xiaojie

2002-12-01

In this study, the relationships between the brain-blood concentration ratio of 96 structurally diverse compounds with a large number of structurally derived descriptors were investigated. The linear models were based on molecular descriptors that can be calculated for any compound simply from a knowledge of its molecular structure. The linear correlation coefficients of the models were optimized by genetic algorithms (GAs), and the descriptors used in the linear models were automatically selected from 27 structurally derived descriptors. The GA optimizations resulted in a group of linear models with three or four molecular descriptors with good statistical significance. The change of descriptor use as the evolution proceeds demonstrates that the octane/water partition coefficient and the partial negative solvent-accessible surface area multiplied by the negative charge are crucial to brain-blood barrier permeability. Moreover, we found that the predictions using multiple QSPR models from GA optimization gave quite good results in spite of the diversity of structures, which was better than the predictions using the best single model. The predictions for the two external sets with 37 diverse compounds using multiple QSPR models indicate that the best linear models with four descriptors are sufficiently effective for predictive use. Considering the ease of computation of the descriptors, the linear models may be used as general utilities to screen the blood-brain barrier partitioning of drugs in a high-throughput fashion.
External validation of a publicly available computer assisted diagnostic tool for mammographic mass lesions with two high prevalence research datasets.

PubMed

Benndorf, Matthias; Burnside, Elizabeth S; Herda, Christoph; Langer, Mathias; Kotter, Elmar

2015-08-01

Lesions detected at mammography are described with a highly standardized terminology: the breast imaging-reporting and data system (BI-RADS) lexicon. Up to now, no validated semantic computer assisted classification algorithm exists to interactively link combinations of morphological descriptors from the lexicon to a probabilistic risk estimate of malignancy. The authors therefore aim at the external validation of the mammographic mass diagnosis (MMassDx) algorithm. A classification algorithm like MMassDx must perform well in a variety of clinical circumstances and in datasets that were not used to generate the algorithm in order to ultimately become accepted in clinical routine. The MMassDx algorithm uses a naïve Bayes network and calculates post-test probabilities of malignancy based on two distinct sets of variables, (a) BI-RADS descriptors and age ("descriptor model") and (b) BI-RADS descriptors, age, and BI-RADS assessment categories ("inclusive model"). The authors evaluate both the MMassDx (descriptor) and MMassDx (inclusive) models using two large publicly available datasets of mammographic mass lesions: the digital database for screening mammography (DDSM) dataset, which contains two subsets from the same examinations-a medio-lateral oblique (MLO) view and cranio-caudal (CC) view dataset-and the mammographic mass (MM) dataset. The DDSM contains 1220 mass lesions and the MM dataset contains 961 mass lesions. The authors evaluate discriminative performance using area under the receiver-operating-characteristic curve (AUC) and compare this to the BI-RADS assessment categories alone (i.e., the clinical performance) using the DeLong method. The authors also evaluate whether assigned probabilistic risk estimates reflect the lesions' true risk of malignancy using calibration curves. The authors demonstrate that the MMassDx algorithms show good discriminatory performance. AUC for the MMassDx (descriptor) model in the DDSM data is 0.876/0.895 (MLO/CC view) and AUC for the MMassDx (inclusive) model in the DDSM data is 0.891/0.900 (MLO/CC view). AUC for the MMassDx (descriptor) model in the MM data is 0.862 and AUC for the MMassDx (inclusive) model in the MM data is 0.900. In all scenarios, MMassDx performs significantly better than clinical performance, P < 0.05 each. The authors furthermore demonstrate that the MMassDx algorithm systematically underestimates the risk of malignancy in the DDSM and MM datasets, especially when low probabilities of malignancy are assigned. The authors' results reveal that the MMassDx algorithms have good discriminatory performance but less accurate calibration when tested on two independent validation datasets. Improvement in calibration and testing in a prospective clinical population will be important steps in the pursuit of translation of these algorithms to the clinic.
Detection of Abnormal Events via Optical Flow Feature Analysis

PubMed Central

Wang, Tian; Snoussi, Hichem

2015-01-01

In this paper, a novel algorithm is proposed to detect abnormal events in video streams. The algorithm is based on the histogram of the optical flow orientation descriptor and the classification method. The details of the histogram of the optical flow orientation descriptor are illustrated for describing movement information of the global video frame or foreground frame. By combining one-class support vector machine and kernel principal component analysis methods, the abnormal events in the current frame can be detected after a learning period characterizing normal behaviors. The difference abnormal detection results are analyzed and explained. The proposed detection method is tested on benchmark datasets, then the experimental results show the effectiveness of the algorithm. PMID:25811227
A Universal 3D Voxel Descriptor for Solid-State Material Informatics with Deep Convolutional Neural Networks.

PubMed

Kajita, Seiji; Ohba, Nobuko; Jinnouchi, Ryosuke; Asahi, Ryoji

2017-12-05

Material informatics (MI) is a promising approach to liberate us from the time-consuming Edisonian (trial and error) process for material discoveries, driven by machine-learning algorithms. Several descriptors, which are encoded material features to feed computers, were proposed in the last few decades. Especially to solid systems, however, their insufficient representations of three dimensionality of field quantities such as electron distributions and local potentials have critically hindered broad and practical successes of the solid-state MI. We develop a simple, generic 3D voxel descriptor that compacts any field quantities, in such a suitable way to implement convolutional neural networks (CNNs). We examine the 3D voxel descriptor encoded from the electron distribution by a regression test with 680 oxides data. The present scheme outperforms other existing descriptors in the prediction of Hartree energies that are significantly relevant to the long-wavelength distribution of the valence electrons. The results indicate that this scheme can forecast any functionals of field quantities just by learning sufficient amount of data, if there is an explicit correlation between the target properties and field quantities. This 3D descriptor opens a way to import prominent CNNs-based algorithms of supervised, semi-supervised and reinforcement learnings into the solid-state MI.
THTM: A template matching algorithm based on HOG descriptor and two-stage matching

NASA Astrophysics Data System (ADS)

Jiang, Yuanjie; Ruan, Li; Xiao, Limin; Liu, Xi; Yuan, Feng; Wang, Haitao

2018-04-01

We propose a novel method for template matching named THTM - a template matching algorithm based on HOG (histogram of gradient) and two-stage matching. We rely on the fast construction of HOG and the two-stage matching that jointly lead to a high accuracy approach for matching. TMTM give enough attention on HOG and creatively propose a twice-stage matching while traditional method only matches once. Our contribution is to apply HOG to template matching successfully and present two-stage matching, which is prominent to improve the matching accuracy based on HOG descriptor. We analyze key features of THTM and perform compared to other commonly used alternatives on a challenging real-world datasets. Experiments show that our method outperforms the comparison method.
An Integrated Ransac and Graph Based Mismatch Elimination Approach for Wide-Baseline Image Matching

NASA Astrophysics Data System (ADS)

Hasheminasab, M.; Ebadi, H.; Sedaghat, A.

2015-12-01

In this paper we propose an integrated approach in order to increase the precision of feature point matching. Many different algorithms have been developed as to optimizing the short-baseline image matching while because of illumination differences and viewpoints changes, wide-baseline image matching is so difficult to handle. Fortunately, the recent developments in the automatic extraction of local invariant features make wide-baseline image matching possible. The matching algorithms which are based on local feature similarity principle, using feature descriptor as to establish correspondence between feature point sets. To date, the most remarkable descriptor is the scale-invariant feature transform (SIFT) descriptor , which is invariant to image rotation and scale, and it remains robust across a substantial range of affine distortion, presence of noise, and changes in illumination. The epipolar constraint based on RANSAC (random sample consensus) method is a conventional model for mismatch elimination, particularly in computer vision. Because only the distance from the epipolar line is considered, there are a few false matches in the selected matching results based on epipolar geometry and RANSAC. Aguilariu et al. proposed Graph Transformation Matching (GTM) algorithm to remove outliers which has some difficulties when the mismatched points surrounded by the same local neighbor structure. In this study to overcome these limitations, which mentioned above, a new three step matching scheme is presented where the SIFT algorithm is used to obtain initial corresponding point sets. In the second step, in order to reduce the outliers, RANSAC algorithm is applied. Finally, to remove the remained mismatches, based on the adjacent K-NN graph, the GTM is implemented. Four different close range image datasets with changes in viewpoint are utilized to evaluate the performance of the proposed method and the experimental results indicate its robustness and capability.
Shuffling cross-validation-bee algorithm as a new descriptor selection method for retention studies of pesticides in biopartitioning micellar chromatography.

PubMed

Zarei, Kobra; Atabati, Morteza; Ahmadi, Monire

2017-05-04

Bee algorithm (BA) is an optimization algorithm inspired by the natural foraging behaviour of honey bees to find the optimal solution which can be proposed to feature selection. In this paper, shuffling cross-validation-BA (CV-BA) was applied to select the best descriptors that could describe the retention factor (log k) in the biopartitioning micellar chromatography (BMC) of 79 heterogeneous pesticides. Six descriptors were obtained using BA and then the selected descriptors were applied for model development using multiple linear regression (MLR). The descriptor selection was also performed using stepwise, genetic algorithm and simulated annealing methods and MLR was applied to model development and then the results were compared with those obtained from shuffling CV-BA. The results showed that shuffling CV-BA can be applied as a powerful descriptor selection method. Support vector machine (SVM) was also applied for model development using six selected descriptors by BA. The obtained statistical results using SVM were better than those obtained using MLR, as the root mean square error (RMSE) and correlation coefficient (R) for whole data set (training and test), using shuffling CV-BA-MLR, were obtained as 0.1863 and 0.9426, respectively, while these amounts for the shuffling CV-BA-SVM method were obtained as 0.0704 and 0.9922, respectively.
Protein-protein docking using region-based 3D Zernike descriptors

PubMed Central

2009-01-01

Background Protein-protein interactions are a pivotal component of many biological processes and mediate a variety of functions. Knowing the tertiary structure of a protein complex is therefore essential for understanding the interaction mechanism. However, experimental techniques to solve the structure of the complex are often found to be difficult. To this end, computational protein-protein docking approaches can provide a useful alternative to address this issue. Prediction of docking conformations relies on methods that effectively capture shape features of the participating proteins while giving due consideration to conformational changes that may occur. Results We present a novel protein docking algorithm based on the use of 3D Zernike descriptors as regional features of molecular shape. The key motivation of using these descriptors is their invariance to transformation, in addition to a compact representation of local surface shape characteristics. Docking decoys are generated using geometric hashing, which are then ranked by a scoring function that incorporates a buried surface area and a novel geometric complementarity term based on normals associated with the 3D Zernike shape description. Our docking algorithm was tested on both bound and unbound cases in the ZDOCK benchmark 2.0 dataset. In 74% of the bound docking predictions, our method was able to find a near-native solution (interface C-αRMSD ≤ 2.5 Å) within the top 1000 ranks. For unbound docking, among the 60 complexes for which our algorithm returned at least one hit, 60% of the cases were ranked within the top 2000. Comparison with existing shape-based docking algorithms shows that our method has a better performance than the others in unbound docking while remaining competitive for bound docking cases. Conclusion We show for the first time that the 3D Zernike descriptors are adept in capturing shape complementarity at the protein-protein interface and useful for protein docking prediction. Rigorous benchmark studies show that our docking approach has a superior performance compared to existing methods. PMID:20003235
Protein-protein docking using region-based 3D Zernike descriptors.

PubMed

Venkatraman, Vishwesh; Yang, Yifeng D; Sael, Lee; Kihara, Daisuke

2009-12-09

Protein-protein interactions are a pivotal component of many biological processes and mediate a variety of functions. Knowing the tertiary structure of a protein complex is therefore essential for understanding the interaction mechanism. However, experimental techniques to solve the structure of the complex are often found to be difficult. To this end, computational protein-protein docking approaches can provide a useful alternative to address this issue. Prediction of docking conformations relies on methods that effectively capture shape features of the participating proteins while giving due consideration to conformational changes that may occur. We present a novel protein docking algorithm based on the use of 3D Zernike descriptors as regional features of molecular shape. The key motivation of using these descriptors is their invariance to transformation, in addition to a compact representation of local surface shape characteristics. Docking decoys are generated using geometric hashing, which are then ranked by a scoring function that incorporates a buried surface area and a novel geometric complementarity term based on normals associated with the 3D Zernike shape description. Our docking algorithm was tested on both bound and unbound cases in the ZDOCK benchmark 2.0 dataset. In 74% of the bound docking predictions, our method was able to find a near-native solution (interface C-alphaRMSD < or = 2.5 A) within the top 1000 ranks. For unbound docking, among the 60 complexes for which our algorithm returned at least one hit, 60% of the cases were ranked within the top 2000. Comparison with existing shape-based docking algorithms shows that our method has a better performance than the others in unbound docking while remaining competitive for bound docking cases. We show for the first time that the 3D Zernike descriptors are adept in capturing shape complementarity at the protein-protein interface and useful for protein docking prediction. Rigorous benchmark studies show that our docking approach has a superior performance compared to existing methods.
AllergenFP: allergenicity prediction by descriptor fingerprints.

PubMed

Dimitrov, Ivan; Naneva, Lyudmila; Doytchinova, Irini; Bangov, Ivan

2014-03-15

Allergenicity, like antigenicity and immunogenicity, is a property encoded linearly and non-linearly, and therefore the alignment-based approaches are not able to identify this property unambiguously. A novel alignment-free descriptor-based fingerprint approach is presented here and applied to identify allergens and non-allergens. The approach was implemented into a four step algorithm. Initially, the protein sequences are described by amino acid principal properties as hydrophobicity, size, relative abundance, helix and β-strand forming propensities. Then, the generated strings of different length are converted into vectors with equal length by auto- and cross-covariance (ACC). The vectors were transformed into binary fingerprints and compared in terms of Tanimoto coefficient. The approach was applied to a set of 2427 known allergens and 2427 non-allergens and identified correctly 88% of them with Matthews correlation coefficient of 0.759. The descriptor fingerprint approach presented here is universal. It could be applied for any classification problem in computational biology. The set of E-descriptors is able to capture the main structural and physicochemical properties of amino acids building the proteins. The ACC transformation overcomes the main problem in the alignment-based comparative studies arising from the different length of the aligned protein sequences. The conversion of protein ACC values into binary descriptor fingerprints allows similarity search and classification. The algorithm described in the present study was implemented in a specially designed Web site, named AllergenFP (FP stands for FingerPrint). AllergenFP is written in Python, with GIU in HTML. It is freely accessible at http://ddg-pharmfac.net/Allergen FP. idoytchinova@pharmfac.net or ivanbangov@shu-bg.net.
Scene text recognition in mobile applications by character descriptor and structure configuration.

PubMed

Yi, Chucai; Tian, Yingli

2014-07-01

Text characters and strings in natural scene can provide valuable information for many applications. Extracting text directly from natural scene images or videos is a challenging task because of diverse text patterns and variant background interferences. This paper proposes a method of scene text recognition from detected text regions. In text detection, our previously proposed algorithms are applied to obtain text regions from scene image. First, we design a discriminative character descriptor by combining several state-of-the-art feature detectors and descriptors. Second, we model character structure at each character class by designing stroke configuration maps. Our algorithm design is compatible with the application of scene text extraction in smart mobile devices. An Android-based demo system is developed to show the effectiveness of our proposed method on scene text information extraction from nearby objects. The demo system also provides us some insight into algorithm design and performance improvement of scene text extraction. The evaluation results on benchmark data sets demonstrate that our proposed scheme of text recognition is comparable with the best existing methods.

Multi-sensor image registration based on algebraic projective invariants.

PubMed

Li, Bin; Wang, Wei; Ye, Hao

2013-04-22

A new automatic feature-based registration algorithm is presented for multi-sensor images with projective deformation. Contours are firstly extracted from both reference and sensed images as basic features in the proposed method. Since it is difficult to design a projective-invariant descriptor from the contour information directly, a new feature named Five Sequential Corners (FSC) is constructed based on the corners detected from the extracted contours. By introducing algebraic projective invariants, we design a descriptor for each FSC that is ensured to be robust against projective deformation. Further, no gray scale related information is required in calculating the descriptor, thus it is also robust against the gray scale discrepancy between the multi-sensor image pairs. Experimental results utilizing real image pairs are presented to show the merits of the proposed registration method.
Text Extraction from Scene Images by Character Appearance and Structure Modeling

PubMed Central

Yi, Chucai; Tian, Yingli

2012-01-01

In this paper, we propose a novel algorithm to detect text information from natural scene images. Scene text classification and detection are still open research topics. Our proposed algorithm is able to model both character appearance and structure to generate representative and discriminative text descriptors. The contributions of this paper include three aspects: 1) a new character appearance model by a structure correlation algorithm which extracts discriminative appearance features from detected interest points of character samples; 2) a new text descriptor based on structons and correlatons, which model character structure by structure differences among character samples and structure component co-occurrence; and 3) a new text region localization method by combining color decomposition, character contour refinement, and string line alignment to localize character candidates and refine detected text regions. We perform three groups of experiments to evaluate the effectiveness of our proposed algorithm, including text classification, text detection, and character identification. The evaluation results on benchmark datasets demonstrate that our algorithm achieves the state-of-the-art performance on scene text classification and detection, and significantly outperforms the existing algorithms for character identification. PMID:23316111
Real-time, resource-constrained object classification on a micro-air vehicle

NASA Astrophysics Data System (ADS)

Buck, Louis; Ray, Laura

2013-12-01

A real-time embedded object classification algorithm is developed through the novel combination of binary feature descriptors, a bag-of-visual-words object model and the cortico-striatal loop (CSL) learning algorithm. The BRIEF, ORB and FREAK binary descriptors are tested and compared to SIFT descriptors with regard to their respective classification accuracies, execution times, and memory requirements when used with CSL on a 12.6 g ARM Cortex embedded processor running at 800 MHz. Additionally, the effect of x2 feature mapping and opponent-color representations used with these descriptors is examined. These tests are performed on four data sets of varying sizes and difficulty, and the BRIEF descriptor is found to yield the best combination of speed and classification accuracy. Its use with CSL achieves accuracies between 67% and 95% of those achieved with SIFT descriptors and allows for the embedded classification of a 128x192 pixel image in 0.15 seconds, 60 times faster than classification with SIFT. X2 mapping is found to provide substantial improvements in classification accuracy for all of the descriptors at little cost, while opponent-color descriptors are offer accuracy improvements only on colorful datasets.
Combining point context and dynamic time warping for online gesture recognition

NASA Astrophysics Data System (ADS)

Mao, Xia; Li, Chen

2017-05-01

Previous gesture recognition methods usually focused on recognizing gestures after the entire gesture sequences were obtained. However, in many practical applications, a system has to identify gestures before they end to give instant feedback. We present an online gesture recognition approach that can realize early recognition of unfinished gestures with low latency. First, a curvature buffer-based point context (CBPC) descriptor is proposed to extract the shape feature of a gesture trajectory. The CBPC descriptor is a complete descriptor with a simple computation, and thus has its superiority in online scenarios. Then, we introduce an online windowed dynamic time warping algorithm to realize online matching between the ongoing gesture and the template gestures. In the algorithm, computational complexity is effectively decreased by adding a sliding window to the accumulative distance matrix. Lastly, the experiments are conducted on the Australian sign language data set and the Kinect hand gesture (KHG) data set. Results show that the proposed method outperforms other state-of-the-art methods especially when gesture information is incomplete.
Modular Chemical Descriptor Language (MCDL): Stereochemical modules

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gakh, Andrei A; Burnett, Michael N; Trepalin, Sergei V.

2011-01-01

In our previous papers we introduced the Modular Chemical Descriptor Language (MCDL) for providing a linear representation of chemical information. A subsequent development was the MCDL Java Chemical Structure Editor which is capable of drawing chemical structures from linear representations and generating MCDL descriptors from structures. In this paper we present MCDL modules and accompanying software that incorporate unique representation of molecular stereochemistry based on Cahn-Ingold-Prelog and Fischer ideas in constructing stereoisomer descriptors. The paper also contains additional discussions regarding canonical representation of stereochemical isomers, and brief algorithm descriptions of the open source LINDES, Java applet, and Open Babel MCDLmore » processing module software packages. Testing of the upgraded MCDL Java Chemical Structure Editor on compounds taken from several large and diverse chemical databases demonstrated satisfactory performance for storage and processing of stereochemical information in MCDL format.« less
Building Scientific Confidence in the Development and ...

EPA Pesticide Factsheets

Read-across remains a popular data gap filling technique within category and analogue approaches for regulatory purposes. Acceptance of read-across is an ongoing challenge with several efforts underway for identifying and addressing uncertainties. Here we demonstrate an algorithmic approach to facilitate read-across using ToxCast in vitro bioactivity data in conjunction with chemical descriptor information to predict in vivo outcomes in guideline testing studies from ToxRefDB. Over 3400 different chemical structure descriptors were generated for a set of 976 chemicals and supplemented with the outcomes from 821 in vitro assays. The read-across prediction for a given chemical was based on the similarity weighted endpoint outcomes of its nearest neighbors calculated using in vitro bioactivity and chemical structure descriptors, called GenRA. GenRA is based on a computational approach for: (i) defining local validity domains using chemical and bioactivity descriptors, (ii) systematically deriving endpoint read-across predictions within these domains using similarity weighted activity of nearest neighbours, (iii) objectively evaluating predicted performance using tested chemicals, and (iv) assigning read-across predictions to untested chemicals along with estimates of uncertainty. We found in vitro bioactivity descriptors were often found to be more predictive of in vivo toxicity outcomes than chemical structure descriptors. We believe GenRA is an important first st
SVM prediction of ligand-binding sites in bacterial lipoproteins employing shape and physio-chemical descriptors.

PubMed

Kadam, Kiran; Prabhakar, Prashant; Jayaraman, V K

2012-11-01

Bacterial lipoproteins play critical roles in various physiological processes including the maintenance of pathogenicity and numbers of them are being considered as potential candidates for generating novel vaccines. In this work, we put forth an algorithm to identify and predict ligand-binding sites in bacterial lipoproteins. The method uses three types of pocket descriptors, namely fpocket descriptors, 3D Zernike descriptors and shell descriptors, and combines them with Support Vector Machine (SVM) method for the classification. The three types of descriptors represent shape-based properties of the pocket as well as its local physio-chemical features. All three types of descriptors, along with their hybrid combinations are evaluated with SVM and to improve classification performance, WEKA-InfoGain feature selection is applied. Results obtained in the study show that the classifier successfully differentiates between ligand-binding and non-binding pockets. For the combination of three types of descriptors, 10 fold cross-validation accuracy of 86.83% is obtained for training while the selected model achieved test Matthews Correlation Coefficient (MCC) of 0.534. Individually or in combination with new and existing methods, our model can be a very useful tool for the prediction of potential ligand-binding sites in bacterial lipoproteins.
Genetic algorithm applied to the selection of factors in principal component-artificial neural networks: application to QSAR study of calcium channel antagonist activity of 1,4-dihydropyridines (nifedipine analogous).

PubMed

Hemmateenejad, Bahram; Akhond, Morteza; Miri, Ramin; Shamsipur, Mojtaba

2003-01-01

A QSAR algorithm, principal component-genetic algorithm-artificial neural network (PC-GA-ANN), has been applied to a set of newly synthesized calcium channel blockers, which are of special interest because of their role in cardiac diseases. A data set of 124 1,4-dihydropyridines bearing different ester substituents at the C-3 and C-5 positions of the dihydropyridine ring and nitroimidazolyl, phenylimidazolyl, and methylsulfonylimidazolyl groups at the C-4 position with known Ca(2+) channel binding affinities was employed in this study. Ten different sets of descriptors (837 descriptors) were calculated for each molecule. The principal component analysis was used to compress the descriptor groups into principal components. The most significant descriptors of each set were selected and used as input for the ANN. The genetic algorithm (GA) was used for the selection of the best set of extracted principal components. A feed forward artificial neural network with a back-propagation of error algorithm was used to process the nonlinear relationship between the selected principal components and biological activity of the dihydropyridines. A comparison between PC-GA-ANN and routine PC-ANN shows that the first model yields better prediction ability.
Learning structure-property relationship in crystalline materials: A study of lanthanide-transition metal alloys

NASA Astrophysics Data System (ADS)

Pham, Tien-Lam; Nguyen, Nguyen-Duong; Nguyen, Van-Doan; Kino, Hiori; Miyake, Takashi; Dam, Hieu-Chi

2018-05-01

We have developed a descriptor named Orbital Field Matrix (OFM) for representing material structures in datasets of multi-element materials. The descriptor is based on the information regarding atomic valence shell electrons and their coordination. In this work, we develop an extension of OFM called OFM1. We have shown that these descriptors are highly applicable in predicting the physical properties of materials and in providing insights on the materials space by mapping into a low embedded dimensional space. Our experiments with transition metal/lanthanide metal alloys show that the local magnetic moments and formation energies can be accurately reproduced using simple nearest-neighbor regression, thus confirming the relevance of our descriptors. Using kernel ridge regressions, we could accurately reproduce formation energies and local magnetic moments calculated based on first-principles, with mean absolute errors of 0.03 μB and 0.10 eV/atom, respectively. We show that meaningful low-dimensional representations can be extracted from the original descriptor using descriptive learning algorithms. Intuitive prehension on the materials space, qualitative evaluation on the similarities in local structures or crystalline materials, and inference in the designing of new materials by element substitution can be performed effectively based on these low-dimensional representations.
SAR image segmentation using skeleton-based fuzzy clustering

NASA Astrophysics Data System (ADS)

Cao, Yun Yi; Chen, Yan Qiu

2003-06-01

SAR image segmentation can be converted to a clustering problem in which pixels or small patches are grouped together based on local feature information. In this paper, we present a novel framework for segmentation. The segmentation goal is achieved by unsupervised clustering upon characteristic descriptors extracted from local patches. The mixture model of characteristic descriptor, which combines intensity and texture feature, is investigated. The unsupervised algorithm is derived from the recently proposed Skeleton-Based Data Labeling method. Skeletons are constructed as prototypes of clusters to represent arbitrary latent structures in image data. Segmentation using Skeleton-Based Fuzzy Clustering is able to detect the types of surfaces appeared in SAR images automatically without any user input.
Human action recognition based on point context tensor shape descriptor

NASA Astrophysics Data System (ADS)

Li, Jianjun; Mao, Xia; Chen, Lijiang; Wang, Lan

2017-07-01

Motion trajectory recognition is one of the most important means to determine the identity of a moving object. A compact and discriminative feature representation method can improve the trajectory recognition accuracy. This paper presents an efficient framework for action recognition using a three-dimensional skeleton kinematic joint model. First, we put forward a rotation-scale-translation-invariant shape descriptor based on point context (PC) and the normal vector of hypersurface to jointly characterize local motion and shape information. Meanwhile, an algorithm for extracting the key trajectory based on the confidence coefficient is proposed to reduce the randomness and computational complexity. Second, to decrease the eigenvalue decomposition time complexity, a tensor shape descriptor (TSD) based on PC that can globally capture the spatial layout and temporal order to preserve the spatial information of each frame is proposed. Then, a multilinear projection process is achieved by tensor dynamic time warping to map the TSD to a low-dimensional tensor subspace of the same size. Experimental results show that the proposed shape descriptor is effective and feasible, and the proposed approach obtains considerable performance improvement over the state-of-the-art approaches with respect to accuracy on a public action dataset.
Data-Driven Property Estimation for Protective Clothing

DTIC Science & Technology

2014-09-01

reliable predictions falls under the rubric “machine learning”. Inspired by the applications of machine learning in pharmaceutical drug design and...using genetic algorithms, for instance— descriptor selection can be automated as well. A well-known structured learning technique—Artificial Neural...descriptors automatically, by iteration, e.g., using a genetic algorithm [49]. 4.2.4 Avoiding Overfitting A peril of all regression—least squares as
Application of ant colony optimization in development of models for prediction of anti-HIV-1 activity of HEPT derivatives.

PubMed

Zare-Shahabadi, Vali; Abbasitabar, Fatemeh

2010-09-01

Quantitative structure-activity relationship models were derived for 107 analogs of 1-[(2-hydroxyethoxy) methyl]-6-(phenylthio)thymine, a potent inhibitor of the HIV-1 reverse transcriptase. The activities of these compounds were investigated by means of multiple linear regression (MLR) technique. An ant colony optimization algorithm, called Memorized_ACS, was applied for selecting relevant descriptors and detecting outliers. This algorithm uses an external memory based upon knowledge incorporation from previous iterations. At first, the memory is empty, and then it is filled by running several ACS algorithms. In this respect, after each ACS run, the elite ant is stored in the memory and the process is continued to fill the memory. Here, pheromone updating is performed by all elite ants collected in the memory; this results in improvements in both exploration and exploitation behaviors of the ACS algorithm. The memory is then made empty and is filled again by performing several ACS algorithms using updated pheromone trails. This process is repeated for several iterations. At the end, the memory contains several top solutions for the problem. Number of appearance of each descriptor in the external memory is a good criterion for its importance. Finally, prediction is performed by the elitist ant, and interpretation is carried out by considering the importance of each descriptor. The best MLR model has a training error of 0.47 log (1/EC(50)) units (R(2) = 0.90) and a prediction error of 0.76 log (1/EC(50)) units (R(2) = 0.88). Copyright 2010 Wiley Periodicals, Inc.
Discovery of Novel HIV-1 Integrase Inhibitors Using QSAR-Based Virtual Screening of the NCI Open Database.

PubMed

Ko, Gene M; Garg, Rajni; Bailey, Barbara A; Kumar, Sunil

2016-01-01

Quantitative structure-activity relationship (QSAR) models can be used as a predictive tool for virtual screening of chemical libraries to identify novel drug candidates. The aims of this paper were to report the results of a study performed for descriptor selection, QSAR model development, and virtual screening for identifying novel HIV-1 integrase inhibitor drug candidates. First, three evolutionary algorithms were compared for descriptor selection: differential evolution-binary particle swarm optimization (DE-BPSO), binary particle swarm optimization, and genetic algorithms. Next, three QSAR models were developed from an ensemble of multiple linear regression, partial least squares, and extremely randomized trees models. A comparison of the performances of three evolutionary algorithms showed that DE-BPSO has a significant improvement over the other two algorithms. QSAR models developed in this study were used in consensus as a predictive tool for virtual screening of the NCI Open Database containing 265,242 compounds to identify potential novel HIV-1 integrase inhibitors. Six compounds were predicted to be highly active (plC50 > 6) by each of the three models. The use of a hybrid evolutionary algorithm (DE-BPSO) for descriptor selection and QSAR model development in drug design is a novel approach. Consensus modeling may provide better predictivity by taking into account a broader range of chemical properties within the data set conducive for inhibition that may be missed by an individual model. The six compounds identified provide novel drug candidate leads in the design of next generation HIV- 1 integrase inhibitors targeting drug resistant mutant viruses.
Interactive searching of facial image databases

NASA Astrophysics Data System (ADS)

Nicholls, Robert A.; Shepherd, John W.; Shepherd, Jean

1995-09-01

A set of psychological facial descriptors has been devised to enable computerized searching of criminal photograph albums. The descriptors have been used to encode image databased of up to twelve thousand images. Using a system called FACES, the databases are searched by translating a witness' verbal description into corresponding facial descriptors. Trials of FACES have shown that this coding scheme is more productive and efficient than searching traditional photograph albums. An alternative method of searching the encoded database using a genetic algorithm is currenly being tested. The genetic search method does not require the witness to verbalize a description of the target but merely to indicate a degree of similarity between the target and a limited selection of images from the database. The major drawback of FACES is that is requires a manual encoding of images. Research is being undertaken to automate the process, however, it will require an algorithm which can predict human descriptive values. Alternatives to human derived coding schemes exist using statistical classifications of images. Since databases encoded using statistical classifiers do not have an obvious direct mapping to human derived descriptors, a search method which does not require the entry of human descriptors is required. A genetic search algorithm is being tested for such a purpose.
Applications of genetic algorithms on the structure-activity relationship analysis of some cinnamamides.

PubMed

Hou, T J; Wang, J M; Liao, N; Xu, X J

1999-01-01

Quantitative structure-activity relationships (QSARs) for 35 cinnamamides were studied. By using a genetic algorithm (GA), a group of multiple regression models with high fitness scores was generated. From the statistical analyses of the descriptors used in the evolution procedure, the principal features affecting the anticonvulsant activity were found. The significant descriptors include the partition coefficient, the molar refraction, the Hammet sigma constant of the substituents on the benzene ring, and the formation energy of the molecules. It could be found that the steric complementarity and the hydrophobic interaction between the inhibitors and the receptor were very important to the biological activity, while the contribution of the electronic effect was not so obvious. Moreover, by construction of the spline models for these four principal descriptors, the effective range for each descriptor was identified.
Feature combination networks for the interpretation of statistical machine learning models: application to Ames mutagenicity.

PubMed

Webb, Samuel J; Hanser, Thierry; Howlin, Brendan; Krause, Paul; Vessey, Jonathan D

2014-03-25

A new algorithm has been developed to enable the interpretation of black box models. The developed algorithm is agnostic to learning algorithm and open to all structural based descriptors such as fragments, keys and hashed fingerprints. The algorithm has provided meaningful interpretation of Ames mutagenicity predictions from both random forest and support vector machine models built on a variety of structural fingerprints.A fragmentation algorithm is utilised to investigate the model's behaviour on specific substructures present in the query. An output is formulated summarising causes of activation and deactivation. The algorithm is able to identify multiple causes of activation or deactivation in addition to identifying localised deactivations where the prediction for the query is active overall. No loss in performance is seen as there is no change in the prediction; the interpretation is produced directly on the model's behaviour for the specific query. Models have been built using multiple learning algorithms including support vector machine and random forest. The models were built on public Ames mutagenicity data and a variety of fingerprint descriptors were used. These models produced a good performance in both internal and external validation with accuracies around 82%. The models were used to evaluate the interpretation algorithm. Interpretation was revealed that links closely with understood mechanisms for Ames mutagenicity. This methodology allows for a greater utilisation of the predictions made by black box models and can expedite further study based on the output for a (quantitative) structure activity model. Additionally the algorithm could be utilised for chemical dataset investigation and knowledge extraction/human SAR development.
A Multistage Approach for Image Registration.

PubMed

Bowen, Francis; Hu, Jianghai; Du, Eliza Yingzi

2016-09-01

Successful image registration is an important step for object recognition, target detection, remote sensing, multimodal content fusion, scene blending, and disaster assessment and management. The geometric and photometric variations between images adversely affect the ability for an algorithm to estimate the transformation parameters that relate the two images. Local deformations, lighting conditions, object obstructions, and perspective differences all contribute to the challenges faced by traditional registration techniques. In this paper, a novel multistage registration approach is proposed that is resilient to view point differences, image content variations, and lighting conditions. Robust registration is realized through the utilization of a novel region descriptor which couples with the spatial and texture characteristics of invariant feature points. The proposed region descriptor is exploited in a multistage approach. A multistage process allows the utilization of the graph-based descriptor in many scenarios thus allowing the algorithm to be applied to a broader set of images. Each successive stage of the registration technique is evaluated through an effective similarity metric which determines subsequent action. The registration of aerial and street view images from pre- and post-disaster provide strong evidence that the proposed method estimates more accurate global transformation parameters than traditional feature-based methods. Experimental results show the robustness and accuracy of the proposed multistage image registration methodology.
Target recognition of ladar range images using slice image: comparison of four improved algorithms

NASA Astrophysics Data System (ADS)

Xia, Wenze; Han, Shaokun; Cao, Jingya; Wang, Liang; Zhai, Yu; Cheng, Yang

2017-07-01

Compared with traditional 3-D shape data, ladar range images possess properties of strong noise, shape degeneracy, and sparsity, which make feature extraction and representation difficult. The slice image is an effective feature descriptor to resolve this problem. We propose four improved algorithms on target recognition of ladar range images using slice image. In order to improve resolution invariance of the slice image, mean value detection instead of maximum value detection is applied in these four improved algorithms. In order to improve rotation invariance of the slice image, three new improved feature descriptors-which are feature slice image, slice-Zernike moments, and slice-Fourier moments-are applied to the last three improved algorithms, respectively. Backpropagation neural networks are used as feature classifiers in the last two improved algorithms. The performance of these four improved recognition systems is analyzed comprehensively in the aspects of the three invariances, recognition rate, and execution time. The final experiment results show that the improvements for these four algorithms reach the desired effect, the three invariances of feature descriptors are not directly related to the final recognition performance of recognition systems, and these four improved recognition systems have different performances under different conditions.
Mining and Querying Multimedia Data

DTIC Science & Technology

2011-09-29

able to capture more subtle spatial variations such as repetitiveness. Local feature descriptors such as SIFT [74] and SURF [12] have also been widely...empirically set to s = 90%, r = 50%, K = 20, where small variations lead to little perturbation of the output. The pseudo-code of the algorithm is...by constructing a three-layer graph based on clustering outputs, and executing a slight variation of random walk with restart algorithm. It provided

A fast and automatic fusion algorithm for unregistered multi-exposure image sequence

NASA Astrophysics Data System (ADS)

Liu, Yan; Yu, Feihong

2014-09-01

Human visual system (HVS) can visualize all the brightness levels of the scene through visual adaptation. However, the dynamic range of most commercial digital cameras and display devices are smaller than the dynamic range of human eye. This implies low dynamic range (LDR) images captured by normal digital camera may lose image details. We propose an efficient approach to high dynamic (HDR) image fusion that copes with image displacement and image blur degradation in a computationally efficient manner, which is suitable for implementation on mobile devices. The various image registration algorithms proposed in the previous literatures are unable to meet the efficiency and performance requirements in the application of mobile devices. In this paper, we selected Oriented Brief (ORB) detector to extract local image structures. The descriptor selected in multi-exposure image fusion algorithm has to be fast and robust to illumination variations and geometric deformations. ORB descriptor is the best candidate in our algorithm. Further, we perform an improved RANdom Sample Consensus (RANSAC) algorithm to reject incorrect matches. For the fusion of images, a new approach based on Stationary Wavelet Transform (SWT) is used. The experimental results demonstrate that the proposed algorithm generates high quality images at low computational cost. Comparisons with a number of other feature matching methods show that our method gets better performance.
A scale-invariant keypoint detector in log-polar space

NASA Astrophysics Data System (ADS)

Tao, Tao; Zhang, Yun

2017-02-01

The scale-invariant feature transform (SIFT) algorithm is devised to detect keypoints via the difference of Gaussian (DoG) images. However, the DoG data lacks the high-frequency information, which can lead to a performance drop of the algorithm. To address this issue, this paper proposes a novel log-polar feature detector (LPFD) to detect scale-invariant blubs (keypoints) in log-polar space, which, in contrast, can retain all the image information. The algorithm consists of three components, viz. keypoint detection, descriptor extraction and descriptor matching. Besides, the algorithm is evaluated in detecting keypoints from the INRIA dataset by comparing with the SIFT algorithm and one of its fast versions, the speed up robust features (SURF) algorithm in terms of three performance measures, viz. correspondences, repeatability, correct matches and matching score.
Locator-Checker-Scaler Object Tracking Using Spatially Ordered and Weighted Patch Descriptor.

PubMed

Kim, Han-Ul; Kim, Chang-Su

2017-08-01

In this paper, we propose a simple yet effective object descriptor and a novel tracking algorithm to track a target object accurately. For the object description, we divide the bounding box of a target object into multiple patches and describe them with color and gradient histograms. Then, we determine the foreground weight of each patch to alleviate the impacts of background information in the bounding box. To this end, we perform random walk with restart (RWR) simulation. We then concatenate the weighted patch descriptors to yield the spatially ordered and weighted patch (SOWP) descriptor. For the object tracking, we incorporate the proposed SOWP descriptor into a novel tracking algorithm, which has three components: locator, checker, and scaler (LCS). The locator and the scaler estimate the center location and the size of a target, respectively. The checker determines whether it is safe to adjust the target scale in a current frame. These three components cooperate with one another to achieve robust tracking. Experimental results demonstrate that the proposed LCS tracker achieves excellent performance on recent benchmarks.
Using probabilistic model as feature descriptor on a smartphone device for autonomous navigation of unmanned ground vehicles

NASA Astrophysics Data System (ADS)

Desai, Alok; Lee, Dah-Jye

2013-12-01

There has been significant research on the development of feature descriptors in the past few years. Most of them do not emphasize real-time applications. This paper presents the development of an affine invariant feature descriptor for low resource applications such as UAV and UGV that are equipped with an embedded system with a small microprocessor, a field programmable gate array (FPGA), or a smart phone device. UAV and UGV have proven suitable for many promising applications such as unknown environment exploration, search and rescue operations. These applications required on board image processing for obstacle detection, avoidance and navigation. All these real-time vision applications require a camera to grab images and match features using a feature descriptor. A good feature descriptor will uniquely describe a feature point thus allowing it to be correctly identified and matched with its corresponding feature point in another image. A few feature description algorithms are available for a resource limited system. They either require too much of the device's resource or too much simplification on the algorithm, which results in reduction in performance. This research is aimed at meeting the needs of these systems without sacrificing accuracy. This paper introduces a new feature descriptor called PRObabilistic model (PRO) for UGV navigation applications. It is a compact and efficient binary descriptor that is hardware-friendly and easy for implementation.
Magnostics: Image-Based Search of Interesting Matrix Views for Guided Network Exploration.

PubMed

Behrisch, Michael; Bach, Benjamin; Hund, Michael; Delz, Michael; Von Ruden, Laura; Fekete, Jean-Daniel; Schreck, Tobias

2017-01-01

In this work we address the problem of retrieving potentially interesting matrix views to support the exploration of networks. We introduce Matrix Diagnostics (or Magnostics), following in spirit related approaches for rating and ranking other visualization techniques, such as Scagnostics for scatter plots. Our approach ranks matrix views according to the appearance of specific visual patterns, such as blocks and lines, indicating the existence of topological motifs in the data, such as clusters, bi-graphs, or central nodes. Magnostics can be used to analyze, query, or search for visually similar matrices in large collections, or to assess the quality of matrix reordering algorithms. While many feature descriptors for image analyzes exist, there is no evidence how they perform for detecting patterns in matrices. In order to make an informed choice of feature descriptors for matrix diagnostics, we evaluate 30 feature descriptors-27 existing ones and three new descriptors that we designed specifically for MAGNOSTICS-with respect to four criteria: pattern response, pattern variability, pattern sensibility, and pattern discrimination. We conclude with an informed set of six descriptors as most appropriate for Magnostics and demonstrate their application in two scenarios; exploring a large collection of matrices and analyzing temporal networks.
Morphological classification of odontogenic keratocysts using Bouligand-Minkowski fractal descriptors.

PubMed

Florindo, Joao B; Bruno, Odemir M; Landini, Gabriel

2017-02-01

The Odontogenic keratocyst (OKC) is a cystic lesion of the jaws, which has high growth and recurrence rates compared to other cysts of the jaws (for instance, radicular cyst, which is the most common jaw cyst type). For this reason OKCs are considered by some to be benign neoplasms. There exist two sub-types of OKCs (sporadic and syndromic) and the ability to discriminate between these sub-types, as well as other jaw cysts, is an important task in terms of disease diagnosis and prognosis. With the development of digital pathology, computational algorithms have become central to addressing this type of problem. Considering that only basic feature-based methods have been investigated in this problem before, we propose to use a different approach (the Bouligand-Minkowski descriptors) to assess the success rates achieved on the classification of a database of histological images of the epithelial lining of these cysts. This does not require the level of abstraction necessary to extract histologically-relevant features and therefore has the potential of being more robust than previous approaches. The descriptors were obtained by mapping pixel intensities into a three dimensional cloud of points in discrete space and applying morphological dilations with spheres of increasing radii. The descriptors were computed from the volume of the dilated set and submitted to a machine learning algorithm to classify the samples into diagnostic groups. This approach was capable of discriminating between OKCs and radicular cysts in 98% of images (100% of cases) and between the two sub-types of OKCs in 68% of images (71% of cases). These results improve over previously reported classification rates reported elsewhere and suggest that Bouligand-Minkowski descriptors are useful features to be used in histopathological images of these cysts. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.
Using open source computational tools for predicting human metabolic stability and additional absorption, distribution, metabolism, excretion, and toxicity properties.

PubMed

Gupta, Rishi R; Gifford, Eric M; Liston, Ted; Waller, Chris L; Hohman, Moses; Bunin, Barry A; Ekins, Sean

2010-11-01

Ligand-based computational models could be more readily shared between researchers and organizations if they were generated with open source molecular descriptors [e.g., chemistry development kit (CDK)] and modeling algorithms, because this would negate the requirement for proprietary commercial software. We initially evaluated open source descriptors and model building algorithms using a training set of approximately 50,000 molecules and a test set of approximately 25,000 molecules with human liver microsomal metabolic stability data. A C5.0 decision tree model demonstrated that CDK descriptors together with a set of Smiles Arbitrary Target Specification (SMARTS) keys had good statistics [κ = 0.43, sensitivity = 0.57, specificity = 0.91, and positive predicted value (PPV) = 0.64], equivalent to those of models built with commercial Molecular Operating Environment 2D (MOE2D) and the same set of SMARTS keys (κ = 0.43, sensitivity = 0.58, specificity = 0.91, and PPV = 0.63). Extending the dataset to ∼193,000 molecules and generating a continuous model using Cubist with a combination of CDK and SMARTS keys or MOE2D and SMARTS keys confirmed this observation. When the continuous predictions and actual values were binned to get a categorical score we observed a similar κ statistic (0.42). The same combination of descriptor set and modeling method was applied to passive permeability and P-glycoprotein efflux data with similar model testing statistics. In summary, open source tools demonstrated predictive results comparable to those of commercial software with attendant cost savings. We discuss the advantages and disadvantages of open source descriptors and the opportunity for their use as a tool for organizations to share data precompetitively, avoiding repetition and assisting drug discovery.
Local functional descriptors for surface comparison based binding prediction

PubMed Central

2012-01-01

Background Molecular recognition in proteins occurs due to appropriate arrangements of physical, chemical, and geometric properties of an atomic surface. Similar surface regions should create similar binding interfaces. Effective methods for comparing surface regions can be used in identifying similar regions, and to predict interactions without regard to the underlying structural scaffold that creates the surface. Results We present a new descriptor for protein functional surfaces and algorithms for using these descriptors to compare protein surface regions to identify ligand binding interfaces. Our approach uses descriptors of local regions of the surface, and assembles collections of matches to compare larger regions. Our approach uses a variety of physical, chemical, and geometric properties, adaptively weighting these properties as appropriate for different regions of the interface. Our approach builds a classifier based on a training corpus of examples of binding sites of the target ligand. The constructed classifiers can be applied to a query protein providing a probability for each position on the protein that the position is part of a binding interface. We demonstrate the effectiveness of the approach on a number of benchmarks, demonstrating performance that is comparable to the state-of-the-art, with an approach with more generality than these prior methods. Conclusions Local functional descriptors offer a new method for protein surface comparison that is sufficiently flexible to serve in a variety of applications. PMID:23176080
Integration of QSAR and SAR methods for the mechanistic interpretation of predictive models for carcinogenicity

PubMed Central

Fjodorova, Natalja; Novič, Marjana

2012-01-01

The knowledge-based Toxtree expert system (SAR approach) was integrated with the statistically based counter propagation artificial neural network (CP ANN) model (QSAR approach) to contribute to a better mechanistic understanding of a carcinogenicity model for non-congeneric chemicals using Dragon descriptors and carcinogenic potency for rats as a response. The transparency of the CP ANN algorithm was demonstrated using intrinsic mapping technique specifically Kohonen maps. Chemical structures were represented by Dragon descriptors that express the structural and electronic features of molecules such as their shape and electronic surrounding related to reactivity of molecules. It was illustrated how the descriptors are correlated with particular structural alerts (SAs) for carcinogenicity with recognized mechanistic link to carcinogenic activity. Moreover, the Kohonen mapping technique enables one to examine the separation of carcinogens and non-carcinogens (for rats) within a family of chemicals with a particular SA for carcinogenicity. The mechanistic interpretation of models is important for the evaluation of safety of chemicals. PMID:24688639
RANdom SAmple Consensus (RANSAC) algorithm for material-informatics: application to photovoltaic solar cells.

PubMed

Kaspi, Omer; Yosipof, Abraham; Senderowitz, Hanoch

2017-06-06

An important aspect of chemoinformatics and material-informatics is the usage of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for cleaning datasets from noise. RANSAC could be used as a "one stop shop" algorithm for developing and validating QSAR models, performing outlier removal, descriptors selection, model development and predictions for test set samples using applicability domain. For "future" predictions (i.e., for samples not included in the original test set) RANSAC provides a statistical estimate for the probability of obtaining reliable predictions, i.e., predictions within a pre-defined number of standard deviations from the true values. In this work we describe the first application of RNASAC in material informatics, focusing on the analysis of solar cells. We demonstrate that for three datasets representing different metal oxide (MO) based solar cell libraries RANSAC-derived models select descriptors previously shown to correlate with key photovoltaic properties and lead to good predictive statistics for these properties. These models were subsequently used to predict the properties of virtual solar cells libraries highlighting interesting dependencies of PV properties on MO compositions.
A “loop” shape descriptor and its application to automated segmentation of airways from CT scans

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pu, Jiantao; Jin, Chenwang, E-mail: jcw76@163.com; Yu, Nan

2015-06-15

Purpose: A novel shape descriptor is presented to aid an automated identification of the airways depicted on computed tomography (CT) images. Methods: Instead of simplifying the tubular characteristic of the airways as an ideal mathematical cylindrical or circular shape, the proposed “loop” shape descriptor exploits the fact that the cross sections of any tubular structure (regardless of its regularity) always appear as a loop. In implementation, the authors first reconstruct the anatomical structures in volumetric CT as a three-dimensional surface model using the classical marching cubes algorithm. Then, the loop descriptor is applied to locate the airways with a concavemore » loop cross section. To deal with the variation of the airway walls in density as depicted on CT images, a multiple threshold strategy is proposed. A publicly available chest CT database consisting of 20 CT scans, which was designed specifically for evaluating an airway segmentation algorithm, was used for quantitative performance assessment. Measures, including length, branch count, and generations, were computed under the aid of a skeletonization operation. Results: For the test dataset, the airway length ranged from 64.6 to 429.8 cm, the generation ranged from 7 to 11, and the branch number ranged from 48 to 312. These results were comparable to the performance of the state-of-the-art algorithms validated on the same dataset. Conclusions: The authors’ quantitative experiment demonstrated the feasibility and reliability of the developed shape descriptor in identifying lung airways.« less
A novel class sensitive hashing technique for large-scale content-based remote sensing image retrieval

NASA Astrophysics Data System (ADS)

Reato, Thomas; Demir, Begüm; Bruzzone, Lorenzo

2017-10-01

This paper presents a novel class sensitive hashing technique in the framework of large-scale content-based remote sensing (RS) image retrieval. The proposed technique aims at representing each image with multi-hash codes, each of which corresponds to a primitive (i.e., land cover class) present in the image. To this end, the proposed method consists of a three-steps algorithm. The first step is devoted to characterize each image by primitive class descriptors. These descriptors are obtained through a supervised approach, which initially extracts the image regions and their descriptors that are then associated with primitives present in the images. This step requires a set of annotated training regions to define primitive classes. A correspondence between the regions of an image and the primitive classes is built based on the probability of each primitive class to be present at each region. All the regions belonging to the specific primitive class with a probability higher than a given threshold are highly representative of that class. Thus, the average value of the descriptors of these regions is used to characterize that primitive. In the second step, the descriptors of primitive classes are transformed into multi-hash codes to represent each image. This is achieved by adapting the kernel-based supervised locality sensitive hashing method to multi-code hashing problems. The first two steps of the proposed technique, unlike the standard hashing methods, allow one to represent each image by a set of primitive class sensitive descriptors and their hash codes. Then, in the last step, the images in the archive that are very similar to a query image are retrieved based on a multi-hash-code-matching scheme. Experimental results obtained on an archive of aerial images confirm the effectiveness of the proposed technique in terms of retrieval accuracy when compared to the standard hashing methods.
Resemblance profiles as clustering decision criteria: Estimating statistical power, error, and correspondence for a hypothesis test for multivariate structure.

PubMed

Kilborn, Joshua P; Jones, David L; Peebles, Ernst B; Naar, David F

2017-04-01

Clustering data continues to be a highly active area of data analysis, and resemblance profiles are being incorporated into ecological methodologies as a hypothesis testing-based approach to clustering multivariate data. However, these new clustering techniques have not been rigorously tested to determine the performance variability based on the algorithm's assumptions or any underlying data structures. Here, we use simulation studies to estimate the statistical error rates for the hypothesis test for multivariate structure based on dissimilarity profiles (DISPROF). We concurrently tested a widely used algorithm that employs the unweighted pair group method with arithmetic mean (UPGMA) to estimate the proficiency of clustering with DISPROF as a decision criterion. We simulated unstructured multivariate data from different probability distributions with increasing numbers of objects and descriptors, and grouped data with increasing overlap, overdispersion for ecological data, and correlation among descriptors within groups. Using simulated data, we measured the resolution and correspondence of clustering solutions achieved by DISPROF with UPGMA against the reference grouping partitions used to simulate the structured test datasets. Our results highlight the dynamic interactions between dataset dimensionality, group overlap, and the properties of the descriptors within a group (i.e., overdispersion or correlation structure) that are relevant to resemblance profiles as a clustering criterion for multivariate data. These methods are particularly useful for multivariate ecological datasets that benefit from distance-based statistical analyses. We propose guidelines for using DISPROF as a clustering decision tool that will help future users avoid potential pitfalls during the application of methods and the interpretation of results.
A macrophysical life cycle description for precipitating systems

NASA Astrophysics Data System (ADS)

Evaristo, Raquel; Xie, Xinxin; Troemel, Silke; Diederich, Malte; Simon, Juergen; Simmer, Clemens

2014-05-01

The lack of understanding of cloud and precipitation processes is still the overarching problem of climate simulation, and prediction. The work presented is part of the HD(CP)2 project (High Definition Clouds and Precipitation for Advancing Climate Predictions) which aims at building a very high resolution model in order to evaluate and exploit regional hindcasts for the purpose of parameterization development. To this end, an observational object-based climatology for precipitation systems will be built, and shall later be compared with a twin model-based climatological data base for pseudo precipitation events within an event-based model validation approach. This is done by identifying internal structures, described by means of macrophysical descriptors used to characterize the temporal development of tracked rain events. 2 pre-requisites are necessary for this: 1) a tracking algorithm, and 2) 3D radar/satellite composite. Both prerequisites are ready to be used, and have already been applied to a few case studies. Some examples of these macrophysical descriptors are differential reflectivity columns, bright band fraction and trend, cloud top heights, the spatial extent of updrafts or downdrafts or the ice content. We will show one case study from August 5th 2012, when convective precipitation was observed simultaneously by the BOXPOL and JUXPOL X-band polarimetric radars. We will follow the main paths identified by the tracking algorithm during this event and identify in the 3D composite the descriptors that characterize precipitation development, their temporal evolution, and the different macrophysical processes that are ultimately related to the precipitation observed. In a later stage these observations will be compared to the results of hydrometeor classification algorithm, in order to link the macrophysical and microphysical aspects of the storm evolution. The detailed microphysical processes are the subject of a closely related work also presented in this session: Microphysical processes observed by X band polarimetric radars during the evolution of storm systems, by Xinxin Xie et al.
Salient aspects of PBP2A-inhibition; A QSAR Study.

PubMed

Ogunleye, Adewale J; Eniafe, Gabriel O; Inyang, Olumide K; Adewumi, Benjamin; Omotuyi, Olaposi I

2018-05-15

Backgound: Inhibition of penicillin binding protein 2A (PBP2A) represents a sound drug design strategy in combatting Methicillin resistant Staphylococcus aureus (MRSA). Considering the urgent need for effective antimicrobials in combatting MRSA infections, we have developed a statistically robust ensemble of molecular descriptors (1, 2, & 3-D) from compounds targeting PBP2A in vivo. 37 (training set: 26, test set: 11) PBP2A-inhibitors were submitted for descriptor generation after which an unsupervised, non-exhaustive genetic algorithm (GA) was deployed for fishing out the best descriptor subset. Assignment of descriptors to a regression model was accomplished with the Partial Least Square (PLS) algorithm. At the end, an ensemble of 30 descriptors accurately predicted the ligand bioactivity, IC50 (R = 0.9996, R2 = 0.9992, R2a = 0.9949, SEE =, 0.2297 Q2LOO = 0.9741). Inferentially, we noticed that the overall efficacy of this model greatly depends on atomic polarizability and negative charge (electron) density. Besides the formula derived, the high dimensional model also offers critical insights into salient cheminformatics parameter to note during hit-to-lead PBP2A-antagonist optimization. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Image-based aircraft pose estimation: a comparison of simulations and real-world data

NASA Astrophysics Data System (ADS)

Breuers, Marcel G. J.; de Reus, Nico

2001-10-01

The problem of estimating aircraft pose information from mono-ocular image data is considered using a Fourier descriptor based algorithm. The dependence of pose estimation accuracy on image resolution and aspect angle is investigated through simulations using sets of synthetic aircraft images. Further evaluation shows that god pose estimation accuracy can be obtained in real world image sequences.
Towards the chemometric dissection of peptide - HLA-A*0201 binding affinity: comparison of local and global QSAR models

NASA Astrophysics Data System (ADS)

Doytchinova, Irini A.; Walshe, Valerie; Borrow, Persephone; Flower, Darren R.

2005-03-01

The affinities of 177 nonameric peptides binding to the HLA-A*0201 molecule were measured using a FACS-based MHC stabilisation assay and analysed using chemometrics. Their structures were described by global and local descriptors, QSAR models were derived by genetic algorithm, stepwise regression and PLS. The global molecular descriptors included molecular connectivity χ indices, κ shape indices, E-state indices, molecular properties like molecular weight and log P, and three-dimensional descriptors like polarizability, surface area and volume. The local descriptors were of two types. The first used a binary string to indicate the presence of each amino acid type at each position of the peptide. The second was also position-dependent but used five z-scales to describe the main physicochemical properties of the amino acids forming the peptides. The models were developed using a representative training set of 131 peptides and validated using an independent test set of 46 peptides. It was found that the global descriptors could not explain the variance in the training set nor predict the affinities of the test set accurately. Both types of local descriptors gave QSAR models with better explained variance and predictive ability. The results suggest that, in their interactions with the MHC molecule, the peptide acts as a complicated ensemble of multiple amino acids mutually potentiating each other.
The DSFPN, a new neural network for optical character recognition.

PubMed

Morns, L P; Dlay, S S

1999-01-01

A new type of neural network for recognition tasks is presented in this paper. The network, called the dynamic supervised forward-propagation network (DSFPN), is based on the forward only version of the counterpropagation network (CPN). The DSFPN, trains using a supervised algorithm and can grow dynamically during training, allowing subclasses in the training data to be learnt in an unsupervised manner. It is shown to train in times comparable to the CPN while giving better classification accuracies than the popular backpropagation network. Both Fourier descriptors and wavelet descriptors are used for image preprocessing and the wavelets are proven to give a far better performance.
Stargate GTM: Bridging Descriptor and Activity Spaces.

PubMed

Gaspar, Héléna A; Baskin, Igor I; Marcou, Gilles; Horvath, Dragos; Varnek, Alexandre

2015-11-23

Predicting the activity profile of a molecule or discovering structures possessing a specific activity profile are two important goals in chemoinformatics, which could be achieved by bridging activity and molecular descriptor spaces. In this paper, we introduce the "Stargate" version of the Generative Topographic Mapping approach (S-GTM) in which two different multidimensional spaces (e.g., structural descriptor space and activity space) are linked through a common 2D latent space. In the S-GTM algorithm, the manifolds are trained simultaneously in two initial spaces using the probabilities in the 2D latent space calculated as a weighted geometric mean of probability distributions in both spaces. S-GTM has the following interesting features: (1) activities are involved during the training procedure; therefore, the method is supervised, unlike conventional GTM; (2) using molecular descriptors of a given compound as input, the model predicts a whole activity profile, and (3) using an activity profile as input, areas populated by relevant chemical structures can be detected. To assess the performance of S-GTM prediction models, a descriptor space (ISIDA descriptors) of a set of 1325 GPCR ligands was related to a B-dimensional (B = 1 or 8) activity space corresponding to pKi values for eight different targets. S-GTM outperforms conventional GTM for individual activities and performs similarly to the Lasso multitask learning algorithm, although it is still slightly less accurate than the Random Forest method.
Fast protein tertiary structure retrieval based on global surface shape similarity.

PubMed

Sael, Lee; Li, Bin; La, David; Fang, Yi; Ramani, Karthik; Rustamov, Raif; Kihara, Daisuke

2008-09-01

Characterization and identification of similar tertiary structure of proteins provides rich information for investigating function and evolution. The importance of structure similarity searches is increasing as structure databases continue to expand, partly due to the structural genomics projects. A crucial drawback of conventional protein structure comparison methods, which compare structures by their main-chain orientation or the spatial arrangement of secondary structure, is that a database search is too slow to be done in real-time. Here we introduce a global surface shape representation by three-dimensional (3D) Zernike descriptors, which represent a protein structure compactly as a series expansion of 3D functions. With this simplified representation, the search speed against a few thousand structures takes less than a minute. To investigate the agreement between surface representation defined by 3D Zernike descriptor and conventional main-chain based representation, a benchmark was performed against a protein classification generated by the combinatorial extension algorithm. Despite the different representation, 3D Zernike descriptor retrieved proteins of the same conformation defined by combinatorial extension in 89.6% of the cases within the top five closest structures. The real-time protein structure search by 3D Zernike descriptor will open up new possibility of large-scale global and local protein surface shape comparison. 2008 Wiley-Liss, Inc.

Modeling of adipose/blood partition coefficient for environmental chemicals.

PubMed

Papadaki, K C; Karakitsios, S P; Sarigiannis, D A

2017-12-01

A Quantitative Structure Activity Relationship (QSAR) model was developed in order to predict the adipose/blood partition coefficient of environmental chemical compounds. The first step of QSAR modeling was the collection of inputs. Input data included the experimental values of adipose/blood partition coefficient and two sets of molecular descriptors for 67 organic chemical compounds; a) the descriptors from Linear Free Energy Relationship (LFER) and b) the PaDEL descriptors. The datasets were split to training and prediction set and were analysed using two statistical methods; Genetic Algorithm based Multiple Linear Regression (GA-MLR) and Artificial Neural Networks (ANN). The models with LFER and PaDEL descriptors, coupled with ANN, produced satisfying performance results. The fitting performance (R 2 ) of the models, using LFER and PaDEL descriptors, was 0.94 and 0.96, respectively. The Applicability Domain (AD) of the models was assessed and then the models were applied to a large number of chemical compounds with unknown values of adipose/blood partition coefficient. In conclusion, the proposed models were checked for fitting, validity and applicability. It was demonstrated that they are stable, reliable and capable to predict the values of adipose/blood partition coefficient of "data poor" chemical compounds that fall within the applicability domain. Copyright © 2017. Published by Elsevier Ltd.
An infrastructure to mine molecular descriptors for ligand selection on virtual screening.

PubMed

Seus, Vinicius Rosa; Perazzo, Giovanni Xavier; Winck, Ana T; Werhli, Adriano V; Machado, Karina S

2014-01-01

The receptor-ligand interaction evaluation is one important step in rational drug design. The databases that provide the structures of the ligands are growing on a daily basis. This makes it impossible to test all the ligands for a target receptor. Hence, a ligand selection before testing the ligands is needed. One possible approach is to evaluate a set of molecular descriptors. With the aim of describing the characteristics of promising compounds for a specific receptor we introduce a data warehouse-based infrastructure to mine molecular descriptors for virtual screening (VS). We performed experiments that consider as target the receptor HIV-1 protease and different compounds for this protein. A set of 9 molecular descriptors are taken as the predictive attributes and the free energy of binding is taken as a target attribute. By applying the J48 algorithm over the data we obtain decision tree models that achieved up to 84% of accuracy. The models indicate which molecular descriptors and their respective values are relevant to influence good FEB results. Using their rules we performed ligand selection on ZINC database. Our results show important reduction in ligands selection to be applied in VS experiments; for instance, the best selection model picked only 0.21% of the total amount of drug-like ligands.
Face recognition from unconstrained three-dimensional face images using multitask sparse representation

NASA Astrophysics Data System (ADS)

Bentaieb, Samia; Ouamri, Abdelaziz; Nait-Ali, Amine; Keche, Mokhtar

2018-01-01

We propose and evaluate a three-dimensional (3D) face recognition approach that applies the speeded up robust feature (SURF) algorithm to the depth representation of shape index map, under real-world conditions, using only a single gallery sample for each subject. First, the 3D scans are preprocessed, then SURF is applied on the shape index map to find interest points and their descriptors. Each 3D face scan is represented by keypoints descriptors, and a large dictionary is built from all the gallery descriptors. At the recognition step, descriptors of a probe face scan are sparsely represented by the dictionary. A multitask sparse representation classification is used to determine the identity of each probe face. The feasibility of the approach that uses the SURF algorithm on the shape index map for face identification/authentication is checked through an experimental investigation conducted on Bosphorus, University of Milano Bicocca, and CASIA 3D datasets. It achieves an overall rank one recognition rate of 97.75%, 80.85%, and 95.12%, respectively, on these datasets.
Learning Optimized Local Difference Binaries for Scalable Augmented Reality on Mobile Devices.

PubMed

Xin Yang; Kwang-Ting Cheng

2014-06-01

The efficiency, robustness and distinctiveness of a feature descriptor are critical to the user experience and scalability of a mobile augmented reality (AR) system. However, existing descriptors are either too computationally expensive to achieve real-time performance on a mobile device such as a smartphone or tablet, or not sufficiently robust and distinctive to identify correct matches from a large database. As a result, current mobile AR systems still only have limited capabilities, which greatly restrict their deployment in practice. In this paper, we propose a highly efficient, robust and distinctive binary descriptor, called Learning-based Local Difference Binary (LLDB). LLDB directly computes a binary string for an image patch using simple intensity and gradient difference tests on pairwise grid cells within the patch. To select an optimized set of grid cell pairs, we densely sample grid cells from an image patch and then leverage a modified AdaBoost algorithm to automatically extract a small set of critical ones with the goal of maximizing the Hamming distance between mismatches while minimizing it between matches. Experimental results demonstrate that LLDB is extremely fast to compute and to match against a large database due to its high robustness and distinctiveness. Compared to the state-of-the-art binary descriptors, primarily designed for speed, LLDB has similar efficiency for descriptor construction, while achieving a greater accuracy and faster matching speed when matching over a large database with 2.3M descriptors on mobile devices.
Improved nucleic acid descriptors for siRNA efficacy prediction.

PubMed

Sciabola, Simone; Cao, Qing; Orozco, Modesto; Faustino, Ignacio; Stanton, Robert V

2013-02-01

Although considerable progress has been made recently in understanding how gene silencing is mediated by the RNAi pathway, the rational design of effective sequences is still a challenging task. In this article, we demonstrate that including three-dimensional descriptors improved the discrimination between active and inactive small interfering RNAs (siRNAs) in a statistical model. Five descriptor types were used: (i) nucleotide position along the siRNA sequence, (ii) nucleotide composition in terms of presence/absence of specific combinations of di- and trinucleotides, (iii) nucleotide interactions by means of a modified auto- and cross-covariance function, (iv) nucleotide thermodynamic stability derived by the nearest neighbor model representation and (v) nucleic acid structure flexibility. The duplex flexibility descriptors are derived from extended molecular dynamics simulations, which are able to describe the sequence-dependent elastic properties of RNA duplexes, even for non-standard oligonucleotides. The matrix of descriptors was analysed using three statistical packages in R (partial least squares, random forest, and support vector machine), and the most predictive model was implemented in a modeling tool we have made publicly available through SourceForge. Our implementation of new RNA descriptors coupled with appropriate statistical algorithms resulted in improved model performance for the selection of siRNA candidates when compared with publicly available siRNA prediction tools and previously published test sets. Additional validation studies based on in-house RNA interference projects confirmed the robustness of the scoring procedure in prospective studies.
Secure access control and large scale robust representation for online multimedia event detection.

PubMed

Liu, Changyu; Lu, Bin; Li, Huiling

2014-01-01

We developed an online multimedia event detection (MED) system. However, there are a secure access control issue and a large scale robust representation issue when we want to integrate traditional event detection algorithms into the online environment. For the first issue, we proposed a tree proxy-based and service-oriented access control (TPSAC) model based on the traditional role based access control model. Verification experiments were conducted on the CloudSim simulation platform, and the results showed that the TPSAC model is suitable for the access control of dynamic online environments. For the second issue, inspired by the object-bank scene descriptor, we proposed a 1000-object-bank (1000OBK) event descriptor. Feature vectors of the 1000OBK were extracted from response pyramids of 1000 generic object detectors which were trained on standard annotated image datasets, such as the ImageNet dataset. A spatial bag of words tiling approach was then adopted to encode these feature vectors for bridging the gap between the objects and events. Furthermore, we performed experiments in the context of event classification on the challenging TRECVID MED 2012 dataset, and the results showed that the robust 1000OBK event descriptor outperforms the state-of-the-art approaches.
Augmented Topological Descriptors of Pore Networks for Material Science.

PubMed

Ushizima, D; Morozov, D; Weber, G H; Bianchi, A G C; Sethian, J A; Bethel, E W

2012-12-01

One potential solution to reduce the concentration of carbon dioxide in the atmosphere is the geologic storage of captured CO2 in underground rock formations, also known as carbon sequestration. There is ongoing research to guarantee that this process is both efficient and safe. We describe tools that provide measurements of media porosity, and permeability estimates, including visualization of pore structures. Existing standard algorithms make limited use of geometric information in calculating permeability of complex microstructures. This quantity is important for the analysis of biomineralization, a subsurface process that can affect physical properties of porous media. This paper introduces geometric and topological descriptors that enhance the estimation of material permeability. Our analysis framework includes the processing of experimental data, segmentation, and feature extraction and making novel use of multiscale topological analysis to quantify maximum flow through porous networks. We illustrate our results using synchrotron-based X-ray computed microtomography of glass beads during biomineralization. We also benchmark the proposed algorithms using simulated data sets modeling jammed packed bead beds of a monodispersive material.
MIND: modality independent neighbourhood descriptor for multi-modal deformable registration.

PubMed

Heinrich, Mattias P; Jenkinson, Mark; Bhushan, Manav; Matin, Tahreema; Gleeson, Fergus V; Brady, Sir Michael; Schnabel, Julia A

2012-10-01

Deformable registration of images obtained from different modalities remains a challenging task in medical image analysis. This paper addresses this important problem and proposes a modality independent neighbourhood descriptor (MIND) for both linear and deformable multi-modal registration. Based on the similarity of small image patches within one image, it aims to extract the distinctive structure in a local neighbourhood, which is preserved across modalities. The descriptor is based on the concept of image self-similarity, which has been introduced for non-local means filtering for image denoising. It is able to distinguish between different types of features such as corners, edges and homogeneously textured regions. MIND is robust to the most considerable differences between modalities: non-functional intensity relations, image noise and non-uniform bias fields. The multi-dimensional descriptor can be efficiently computed in a dense fashion across the whole image and provides point-wise local similarity across modalities based on the absolute or squared difference between descriptors, making it applicable for a wide range of transformation models and optimisation algorithms. We use the sum of squared differences of the MIND representations of the images as a similarity metric within a symmetric non-parametric Gauss-Newton registration framework. In principle, MIND would be applicable to the registration of arbitrary modalities. In this work, we apply and validate it for the registration of clinical 3D thoracic CT scans between inhale and exhale as well as the alignment of 3D CT and MRI scans. Experimental results show the advantages of MIND over state-of-the-art techniques such as conditional mutual information and entropy images, with respect to clinically annotated landmark locations. Copyright © 2012 Elsevier B.V. All rights reserved.
Combining semi-automated image analysis techniques with machine learning algorithms to accelerate large-scale genetic studies.

PubMed

Atkinson, Jonathan A; Lobet, Guillaume; Noll, Manuel; Meyer, Patrick E; Griffiths, Marcus; Wells, Darren M

2017-10-01

Genetic analyses of plant root systems require large datasets of extracted architectural traits. To quantify such traits from images of root systems, researchers often have to choose between automated tools (that are prone to error and extract only a limited number of architectural traits) or semi-automated ones (that are highly time consuming). We trained a Random Forest algorithm to infer architectural traits from automatically extracted image descriptors. The training was performed on a subset of the dataset, then applied to its entirety. This strategy allowed us to (i) decrease the image analysis time by 73% and (ii) extract meaningful architectural traits based on image descriptors. We also show that these traits are sufficient to identify the quantitative trait loci that had previously been discovered using a semi-automated method. We have shown that combining semi-automated image analysis with machine learning algorithms has the power to increase the throughput of large-scale root studies. We expect that such an approach will enable the quantification of more complex root systems for genetic studies. We also believe that our approach could be extended to other areas of plant phenotyping. © The Authors 2017. Published by Oxford University Press.
Combining semi-automated image analysis techniques with machine learning algorithms to accelerate large-scale genetic studies

PubMed Central

Atkinson, Jonathan A.; Lobet, Guillaume; Noll, Manuel; Meyer, Patrick E.; Griffiths, Marcus

2017-01-01

Abstract Genetic analyses of plant root systems require large datasets of extracted architectural traits. To quantify such traits from images of root systems, researchers often have to choose between automated tools (that are prone to error and extract only a limited number of architectural traits) or semi-automated ones (that are highly time consuming). We trained a Random Forest algorithm to infer architectural traits from automatically extracted image descriptors. The training was performed on a subset of the dataset, then applied to its entirety. This strategy allowed us to (i) decrease the image analysis time by 73% and (ii) extract meaningful architectural traits based on image descriptors. We also show that these traits are sufficient to identify the quantitative trait loci that had previously been discovered using a semi-automated method. We have shown that combining semi-automated image analysis with machine learning algorithms has the power to increase the throughput of large-scale root studies. We expect that such an approach will enable the quantification of more complex root systems for genetic studies. We also believe that our approach could be extended to other areas of plant phenotyping. PMID:29020748
A learning framework for age rank estimation based on face images with scattering transform.

PubMed

Chang, Kuang-Yu; Chen, Chu-Song

2015-03-01

This paper presents a cost-sensitive ordinal hyperplanes ranking algorithm for human age estimation based on face images. The proposed approach exploits relative-order information among the age labels for rank prediction. In our approach, the age rank is obtained by aggregating a series of binary classification results, where cost sensitivities among the labels are introduced to improve the aggregating performance. In addition, we give a theoretical analysis on designing the cost of individual binary classifier so that the misranking cost can be bounded by the total misclassification costs. An efficient descriptor, scattering transform, which scatters the Gabor coefficients and pooled with Gaussian smoothing in multiple layers, is evaluated for facial feature extraction. We show that this descriptor is a generalization of conventional bioinspired features and is more effective for face-based age inference. Experimental results demonstrate that our method outperforms the state-of-the-art age estimation approaches.
Feature extraction and descriptor calculation methods for automatic georeferencing of Philippines' first microsatellite imagery

NASA Astrophysics Data System (ADS)

Tupas, M. E. A.; Dasallas, J. A.; Jiao, B. J. D.; Magallon, B. J. P.; Sempio, J. N. H.; Ramos, M. K. F.; Aranas, R. K. D.; Tamondong, A. M.

2017-10-01

The FAST-SIFT corner detector and descriptor extractor combination was used to automatically georeference DIWATA-1 Spaceborne Multispectral Imager images. Features from the Fast Accelerated Segment Test (FAST) algorithm detects corners or keypoints in an image, and these robustly detected keypoints have well-defined positions. Descriptors were computed using Scale-Invariant Feature Transform (SIFT) extractor. FAST-SIFT method effectively SMI same-subscene images detected by the NIR sensor. The method was also tested in stitching NIR images with varying subscene swept by the camera. The slave images were matched to the master image. The keypoints served as the ground control points. Random sample consensus was used to eliminate fall-out matches and ensure accuracy of the feature points from which the transformation parameters were derived. Keypoints are matched based on their descriptor vector. Nearest-neighbor matching is employed based on a metric distance between the descriptors. The metrics include Euclidean and city block, among others. Rough matching outputs not only the correct matches but also the faulty matches. A previous work in automatic georeferencing incorporates a geometric restriction. In this work, we applied a simplified version of the learning method. RANSAC was used to eliminate fall-out matches and ensure accuracy of the feature points. This method identifies if a point fits the transformation function and returns inlier matches. The transformation matrix was solved by Affine, Projective, and Polynomial models. The accuracy of the automatic georeferencing method were determined by calculating the RMSE of interest points, selected randomly, between the master image and transformed slave image.
Simultaneous feature selection and parameter optimisation using an artificial ant colony: case study of melting point prediction.

PubMed

O'Boyle, Noel M; Palmer, David S; Nigsch, Florian; Mitchell, John Bo

2008-10-29

We present a novel feature selection algorithm, Winnowing Artificial Ant Colony (WAAC), that performs simultaneous feature selection and model parameter optimisation for the development of predictive quantitative structure-property relationship (QSPR) models. The WAAC algorithm is an extension of the modified ant colony algorithm of Shen et al. (J Chem Inf Model 2005, 45: 1024-1029). We test the ability of the algorithm to develop a predictive partial least squares model for the Karthikeyan dataset (J Chem Inf Model 2005, 45: 581-590) of melting point values. We also test its ability to perform feature selection on a support vector machine model for the same dataset. Starting from an initial set of 203 descriptors, the WAAC algorithm selected a PLS model with 68 descriptors which has an RMSE on an external test set of 46.6 degrees C and R2 of 0.51. The number of components chosen for the model was 49, which was close to optimal for this feature selection. The selected SVM model has 28 descriptors (cost of 5, epsilon of 0.21) and an RMSE of 45.1 degrees C and R2 of 0.54. This model outperforms a kNN model (RMSE of 48.3 degrees C, R2 of 0.47) for the same data and has similar performance to a Random Forest model (RMSE of 44.5 degrees C, R2 of 0.55). However it is much less prone to bias at the extremes of the range of melting points as shown by the slope of the line through the residuals: -0.43 for WAAC/SVM, -0.53 for Random Forest. With a careful choice of objective function, the WAAC algorithm can be used to optimise machine learning and regression models that suffer from overfitting. Where model parameters also need to be tuned, as is the case with support vector machine and partial least squares models, it can optimise these simultaneously. The moving probabilities used by the algorithm are easily interpreted in terms of the best and current models of the ants, and the winnowing procedure promotes the removal of irrelevant descriptors.
Dynamic clustering detection through multi-valued descriptors of dermoscopic images.

PubMed

Cozza, Valentina; Guarracino, Maria Rosario; Maddalena, Lucia; Baroni, Adone

2011-09-10

This paper introduces a dynamic clustering methodology based on multi-valued descriptors of dermoscopic images. The main idea is to support medical diagnosis to decide if pigmented skin lesions belonging to an uncertain set are nearer to malignant melanoma or to benign nevi. Melanoma is the most deadly skin cancer, and early diagnosis is a current challenge for clinicians. Most data analysis algorithms for skin lesions discrimination focus on segmentation and extraction of features of categorical or numerical type. As an alternative approach, this paper introduces two new concepts: first, it considers multi-valued data that scalar variables not only describe but also intervals or histogram variables; second, it introduces a dynamic clustering method based on Wasserstein distance to compare multi-valued data. The overall strategy of analysis can be summarized into the following steps: first, a segmentation of dermoscopic images allows to identify a set of multi-valued descriptors; second, we performed a discriminant analysis on a set of images where there is an a priori classification so that it is possible to detect which features discriminate the benign and malignant lesions; and third, we performed the proposed dynamic clustering method on the uncertain cases, which need to be associated to one of the two previously mentioned groups. Results based on clinical data show that the grading of specific descriptors associated to dermoscopic characteristics provides a novel way to characterize uncertain lesions that can help the dermatologist's diagnosis. Copyright © 2011 John Wiley & Sons, Ltd.
Machine learning-based in-line holographic sensing of unstained malaria-infected red blood cells.

PubMed

Go, Taesik; Kim, Jun H; Byeon, Hyeokjun; Lee, Sang J

2018-04-19

Accurate and immediate diagnosis of malaria is important for medication of the infectious disease. Conventional methods for diagnosing malaria are time consuming and rely on the skill of experts. Therefore, an automatic and simple diagnostic modality is essential for healthcare in developing countries that lack the expertise of trained microscopists. In the present study, a new automatic sensing method using digital in-line holographic microscopy (DIHM) combined with machine learning algorithms was proposed to sensitively detect unstained malaria-infected red blood cells (iRBCs). To identify the RBC characteristics, 13 descriptors were extracted from segmented holograms of individual RBCs. Among the 13 descriptors, 10 features were highly statistically different between healthy RBCs (hRBCs) and iRBCs. Six machine learning algorithms were applied to effectively combine the dominant features and to greatly improve the diagnostic capacity of the present method. Among the classification models trained by the 6 tested algorithms, the model trained by the support vector machine (SVM) showed the best accuracy in separating hRBCs and iRBCs for training (n = 280, 96.78%) and testing sets (n = 120, 97.50%). This DIHM-based artificial intelligence methodology is simple and does not require blood staining. Thus, it will be beneficial and valuable in the diagnosis of malaria. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The evaluation of distributed damage in concrete based on sinusoidal modeling of the ultrasonic response.

PubMed

Sepehrinezhad, Alireza; Toufigh, Vahab

2018-05-25

Ultrasonic wave attenuation is an effective descriptor of distributed damage in inhomogeneous materials. Methods developed to measure wave attenuation have the potential to provide an in-site evaluation of existing concrete structures insofar as they are accurate and time-efficient. In this study, material classification and distributed damage evaluation were investigated based on the sinusoidal modeling of the response from the through-transmission ultrasonic tests on polymer concrete specimens. The response signal was modeled as single or the sum of damping sinusoids. Due to the inhomogeneous nature of concrete materials, model parameters may vary from one specimen to another. Therefore, these parameters are not known in advance and should be estimated while the response signal is being received. The modeling procedure used in this study involves a data-adaptive algorithm to estimate the parameters online. Data-adaptive algorithms are used due to a lack of knowledge of the model parameters. The damping factor was estimated as a descriptor of the distributed damage. The results were compared in two different cases as follows: (1) constant excitation frequency with varying concrete mixtures and (2) constant mixture with varying excitation frequencies. The specimens were also loaded up to their ultimate compressive strength to investigate the effect of distributed damage in the response signal. The results of the estimation indicated that the damping was highly sensitive to the change in material inhomogeneity, even in comparable mixtures. In addition to the proposed method, three methods were employed to compare the results based on their accuracy in the classification of materials and the evaluation of the distributed damage. It is shown that the estimated damping factor is not only sensitive to damage in the final stages of loading, but it is also applicable in evaluating micro damages in the earlier stages providing a reliable descriptor of damage. In addition, the modified amplitude ratio method is introduced as an improvement of the classical method. The proposed methods were validated to be effective descriptors of distributed damage. The presented models were also in good agreement with the experimental data. Copyright © 2018 Elsevier B.V. All rights reserved.
Research on sparse feature matching of improved RANSAC algorithm

NASA Astrophysics Data System (ADS)

Kong, Xiangsi; Zhao, Xian

2018-04-01

In this paper, a sparse feature matching method based on modified RANSAC algorithm is proposed to improve the precision and speed. Firstly, the feature points of the images are extracted using the SIFT algorithm. Then, the image pair is matched roughly by generating SIFT feature descriptor. At last, the precision of image matching is optimized by the modified RANSAC algorithm,. The RANSAC algorithm is improved from three aspects: instead of the homography matrix, this paper uses the fundamental matrix generated by the 8 point algorithm as the model; the sample is selected by a random block selecting method, which ensures the uniform distribution and the accuracy; adds sequential probability ratio test(SPRT) on the basis of standard RANSAC, which cut down the overall running time of the algorithm. The experimental results show that this method can not only get higher matching accuracy, but also greatly reduce the computation and improve the matching speed.
NMF-Based Image Quality Assessment Using Extreme Learning Machine.

PubMed

Wang, Shuigen; Deng, Chenwei; Lin, Weisi; Huang, Guang-Bin; Zhao, Baojun

2017-01-01

Numerous state-of-the-art perceptual image quality assessment (IQA) algorithms share a common two-stage process: distortion description followed by distortion effects pooling. As for the first stage, the distortion descriptors or measurements are expected to be effective representatives of human visual variations, while the second stage should well express the relationship among quality descriptors and the perceptual visual quality. However, most of the existing quality descriptors (e.g., luminance, contrast, and gradient) do not seem to be consistent with human perception, and the effects pooling is often done in ad-hoc ways. In this paper, we propose a novel full-reference IQA metric. It applies non-negative matrix factorization (NMF) to measure image degradations by making use of the parts-based representation of NMF. On the other hand, a new machine learning technique [extreme learning machine (ELM)] is employed to address the limitations of the existing pooling techniques. Compared with neural networks and support vector regression, ELM can achieve higher learning accuracy with faster learning speed. Extensive experimental results demonstrate that the proposed metric has better performance and lower computational complexity in comparison with the relevant state-of-the-art approaches.
Secure Access Control and Large Scale Robust Representation for Online Multimedia Event Detection

PubMed Central

Liu, Changyu; Li, Huiling

2014-01-01

We developed an online multimedia event detection (MED) system. However, there are a secure access control issue and a large scale robust representation issue when we want to integrate traditional event detection algorithms into the online environment. For the first issue, we proposed a tree proxy-based and service-oriented access control (TPSAC) model based on the traditional role based access control model. Verification experiments were conducted on the CloudSim simulation platform, and the results showed that the TPSAC model is suitable for the access control of dynamic online environments. For the second issue, inspired by the object-bank scene descriptor, we proposed a 1000-object-bank (1000OBK) event descriptor. Feature vectors of the 1000OBK were extracted from response pyramids of 1000 generic object detectors which were trained on standard annotated image datasets, such as the ImageNet dataset. A spatial bag of words tiling approach was then adopted to encode these feature vectors for bridging the gap between the objects and events. Furthermore, we performed experiments in the context of event classification on the challenging TRECVID MED 2012 dataset, and the results showed that the robust 1000OBK event descriptor outperforms the state-of-the-art approaches. PMID:25147840
Gold-standard for computer-assisted morphological sperm analysis.

PubMed

Chang, Violeta; Garcia, Alejandra; Hitschfeld, Nancy; Härtel, Steffen

2017-04-01

Published algorithms for classification of human sperm heads are based on relatively small image databases that are not open to the public, and thus no direct comparison is available for competing methods. We describe a gold-standard for morphological sperm analysis (SCIAN-MorphoSpermGS), a dataset of sperm head images with expert-classification labels in one of the following classes: normal, tapered, pyriform, small or amorphous. This gold-standard is for evaluating and comparing known techniques and future improvements to present approaches for classification of human sperm heads for semen analysis. Although this paper does not provide a computational tool for morphological sperm analysis, we present a set of experiments for comparing sperm head description and classification common techniques. This classification base-line is aimed to be used as a reference for future improvements to present approaches for human sperm head classification. The gold-standard provides a label for each sperm head, which is achieved by majority voting among experts. The classification base-line compares four supervised learning methods (1- Nearest Neighbor, naive Bayes, decision trees and Support Vector Machine (SVM)) and three shape-based descriptors (Hu moments, Zernike moments and Fourier descriptors), reporting the accuracy and the true positive rate for each experiment. We used Fleiss' Kappa Coefficient to evaluate the inter-expert agreement and Fisher's exact test for inter-expert variability and statistical significant differences between descriptors and learning techniques. Our results confirm the high degree of inter-expert variability in the morphological sperm analysis. Regarding the classification base line, we show that none of the standard descriptors or classification approaches is best suitable for tackling the problem of sperm head classification. We discovered that the correct classification rate was highly variable when trying to discriminate among non-normal sperm heads. By using the Fourier descriptor and SVM, we achieved the best mean correct classification: only 49%. We conclude that the SCIAN-MorphoSpermGS will provide a standard tool for evaluation of characterization and classification approaches for human sperm heads. Indeed, there is a clear need for a specific shape-based descriptor for human sperm heads and a specific classification approach to tackle the problem of high variability within subcategories of abnormal sperm cells. Copyright © 2017 Elsevier Ltd. All rights reserved.

Automated detection of microaneurysms using robust blob descriptors

NASA Astrophysics Data System (ADS)

Adal, K.; Ali, S.; Sidibé, D.; Karnowski, T.; Chaum, E.; Mériaudeau, F.

2013-03-01

Microaneurysms (MAs) are among the first signs of diabetic retinopathy (DR) that can be seen as round dark-red structures in digital color fundus photographs of retina. In recent years, automated computer-aided detection and diagnosis (CAD) of MAs has attracted many researchers due to its low-cost and versatile nature. In this paper, the MA detection problem is modeled as finding interest points from a given image and several interest point descriptors are introduced and integrated with machine learning techniques to detect MAs. The proposed approach starts by applying a novel fundus image contrast enhancement technique using Singular Value Decomposition (SVD) of fundus images. Then, Hessian-based candidate selection algorithm is applied to extract image regions which are more likely to be MAs. For each candidate region, robust low-level blob descriptors such as Speeded Up Robust Features (SURF) and Intensity Normalized Radon Transform are extracted to characterize candidate MA regions. The combined features are then classified using SVM which has been trained using ten manually annotated training images. The performance of the overall system is evaluated on Retinopathy Online Challenge (ROC) competition database. Preliminary results show the competitiveness of the proposed candidate selection techniques against state-of-the art methods as well as the promising future for the proposed descriptors to be used in the localization of MAs from fundus images.
Systematically evaluating read-across prediction and ...

EPA Pesticide Factsheets

Read-across is a popular data gap filling technique within category and analogue approaches for regulatory purposes. Acceptance of read-across remains an ongoing challenge with several efforts underway for identifying and addressing uncertainties. Here we demonstrate an algorithmic, automated approach to evaluate the utility of using in vitro bioactivity data (“bioactivity descriptors”, from EPA’s ToxCast program) in conjunction with chemical descriptor information to derive local validity domains (specific sets of nearest neighbors) to facilitate read-across for a number of in vivo repeated dose toxicity study types. Over 3400 different chemical structure descriptors were generated for a set of 976 chemicals and supplemented with the outcomes from 821 in vitro assays. The read-across prediction for a given chemical was based on the similarity weighted endpoint outcomes of its nearest neighbors. The approach enabled a performance baseline for read-across predictions of specific study outcomes to be established. Bioactivity descriptors were often found to be more predictive of in vivo toxicity outcomes than chemical descriptors or a combination of both. The approach shows promise as part of a screening assessment in the absence of prior knowledge. Future work will investigate to what extent encoding expert knowledge leads to an improvement in read-across prediction. Read-across is a popular data gap filling technique within category and analogue approaches
wACSF—Weighted atom-centered symmetry functions as descriptors in machine learning potentials

NASA Astrophysics Data System (ADS)

Gastegger, M.; Schwiedrzik, L.; Bittermann, M.; Berzsenyi, F.; Marquetand, P.

2018-06-01

We introduce weighted atom-centered symmetry functions (wACSFs) as descriptors of a chemical system's geometry for use in the prediction of chemical properties such as enthalpies or potential energies via machine learning. The wACSFs are based on conventional atom-centered symmetry functions (ACSFs) but overcome the undesirable scaling of the latter with an increasing number of different elements in a chemical system. The performance of these two descriptors is compared using them as inputs in high-dimensional neural network potentials (HDNNPs), employing the molecular structures and associated enthalpies of the 133 855 molecules containing up to five different elements reported in the QM9 database as reference data. A substantially smaller number of wACSFs than ACSFs is needed to obtain a comparable spatial resolution of the molecular structures. At the same time, this smaller set of wACSFs leads to a significantly better generalization performance in the machine learning potential than the large set of conventional ACSFs. Furthermore, we show that the intrinsic parameters of the descriptors can in principle be optimized with a genetic algorithm in a highly automated manner. For the wACSFs employed here, we find however that using a simple empirical parametrization scheme is sufficient in order to obtain HDNNPs with high accuracy.
Simultaneous feature selection and parameter optimisation using an artificial ant colony: case study of melting point prediction

PubMed Central

O'Boyle, Noel M; Palmer, David S; Nigsch, Florian; Mitchell, John BO

2008-01-01

Background We present a novel feature selection algorithm, Winnowing Artificial Ant Colony (WAAC), that performs simultaneous feature selection and model parameter optimisation for the development of predictive quantitative structure-property relationship (QSPR) models. The WAAC algorithm is an extension of the modified ant colony algorithm of Shen et al. (J Chem Inf Model 2005, 45: 1024–1029). We test the ability of the algorithm to develop a predictive partial least squares model for the Karthikeyan dataset (J Chem Inf Model 2005, 45: 581–590) of melting point values. We also test its ability to perform feature selection on a support vector machine model for the same dataset. Results Starting from an initial set of 203 descriptors, the WAAC algorithm selected a PLS model with 68 descriptors which has an RMSE on an external test set of 46.6°C and R2 of 0.51. The number of components chosen for the model was 49, which was close to optimal for this feature selection. The selected SVM model has 28 descriptors (cost of 5, ε of 0.21) and an RMSE of 45.1°C and R2 of 0.54. This model outperforms a kNN model (RMSE of 48.3°C, R2 of 0.47) for the same data and has similar performance to a Random Forest model (RMSE of 44.5°C, R2 of 0.55). However it is much less prone to bias at the extremes of the range of melting points as shown by the slope of the line through the residuals: -0.43 for WAAC/SVM, -0.53 for Random Forest. Conclusion With a careful choice of objective function, the WAAC algorithm can be used to optimise machine learning and regression models that suffer from overfitting. Where model parameters also need to be tuned, as is the case with support vector machine and partial least squares models, it can optimise these simultaneously. The moving probabilities used by the algorithm are easily interpreted in terms of the best and current models of the ants, and the winnowing procedure promotes the removal of irrelevant descriptors. PMID:18959785
Predicting the outcomes of organic reactions via machine learning: are current descriptors sufficient?

PubMed

Skoraczyński, G; Dittwald, P; Miasojedow, B; Szymkuć, S; Gajewska, E P; Grzybowski, B A; Gambin, A

2017-06-15

As machine learning/artificial intelligence algorithms are defeating chess masters and, most recently, GO champions, there is interest - and hope - that they will prove equally useful in assisting chemists in predicting outcomes of organic reactions. This paper demonstrates, however, that the applicability of machine learning to the problems of chemical reactivity over diverse types of chemistries remains limited - in particular, with the currently available chemical descriptors, fundamental mathematical theorems impose upper bounds on the accuracy with which raction yields and times can be predicted. Improving the performance of machine-learning methods calls for the development of fundamentally new chemical descriptors.
Rotation and scale invariant shape context registration for remote sensing images with background variations

NASA Astrophysics Data System (ADS)

Jiang, Jie; Zhang, Shumei; Cao, Shixiang

2015-01-01

Multitemporal remote sensing images generally suffer from background variations, which significantly disrupt traditional region feature and descriptor abstracts, especially between pre and postdisasters, making registration by local features unreliable. Because shapes hold relatively stable information, a rotation and scale invariant shape context based on multiscale edge features is proposed. A multiscale morphological operator is adapted to detect edges of shapes, and an equivalent difference of Gaussian scale space is built to detect local scale invariant feature points along the detected edges. Then, a rotation invariant shape context with improved distance discrimination serves as a feature descriptor. For a distance shape context, a self-adaptive threshold (SAT) distance division coordinate system is proposed, which improves the discriminative property of the feature descriptor in mid-long pixel distances from the central point while maintaining it in shorter ones. To achieve rotation invariance, the magnitude of Fourier transform in one-dimension is applied to calculate angle shape context. Finally, the residual error is evaluated after obtaining thin-plate spline transformation between reference and sensed images. Experimental results demonstrate the robustness, efficiency, and accuracy of this automatic algorithm.
Quantitative property-structural relation modeling on polymeric dielectric materials

NASA Astrophysics Data System (ADS)

Wu, Ke

Nowadays, polymeric materials have attracted more and more attention in dielectric applications. But searching for a material with desired properties is still largely based on trial and error. To facilitate the development of new polymeric materials, heuristic models built using the Quantitative Structure Property Relationships (QSPR) techniques can provide reliable "working solutions". In this thesis, the application of QSPR on polymeric materials is studied from two angles: descriptors and algorithms. A novel set of descriptors, called infinite chain descriptors (ICD), are developed to encode the chemical features of pure polymers. ICD is designed to eliminate the uncertainty of polymer conformations and inconsistency of molecular representation of polymers. Models for the dielectric constant, band gap, dielectric loss tangent and glass transition temperatures of organic polymers are built with high prediction accuracy. Two new algorithms, the physics-enlightened learning method (PELM) and multi-mechanism detection, are designed to deal with two typical challenges in material QSPR. PELM is a meta-algorithm that utilizes the classic physical theory as guidance to construct the candidate learning function. It shows better out-of-domain prediction accuracy compared to the classic machine learning algorithm (support vector machine). Multi-mechanism detection is built based on a cluster-weighted mixing model similar to a Gaussian mixture model. The idea is to separate the data into subsets where each subset can be modeled by a much simpler model. The case study on glass transition temperature shows that this method can provide better overall prediction accuracy even though less data is available for each subset model. In addition, the techniques developed in this work are also applied to polymer nanocomposites (PNC). PNC are new materials with outstanding dielectric properties. As a key factor in determining the dispersion state of nanoparticles in the polymer matrix, the surface tension components of polymers are modeled using ICD. Compared to the 3D surface descriptors used in a previous study, the model with ICD has a much improved prediction accuracy and stability particularly for the polar component. In predicting the enhancement effect of grafting functional groups on the breakdown strength of PNC, a simple local charge transfer model is proposed where the electron affinity (EA) and ionization energy (IE) determines the main charge trap depth in the system. This physical model is supported by first principle computation. QSPR models for EA and IE are also built, decreasing the computation time of EA and IE for a single molecule from several hours to less than one second. Furthermore, the designs of two web-based tools are introduced. The tools represent two commonly used applications for QSPR studies: data inquiry and prediction. Making models and data public available and easy to use is particularly crucial for QSPR research. The web tools described in this work should provide a good guidance and starting point for the further development of information tools enabling more efficient cooperation between computational and experimental communities.
SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues.

PubMed

Yang, Xiaoxia; Wang, Jia; Sun, Jun; Liu, Rong

2015-01-01

Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder) by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.
Fully automated detection of diabetic macular edema and dry age-related macular degeneration from optical coherence tomography images

PubMed Central

Srinivasan, Pratul P.; Kim, Leo A.; Mettu, Priyatham S.; Cousins, Scott W.; Comer, Grant M.; Izatt, Joseph A.; Farsiu, Sina

2014-01-01

We present a novel fully automated algorithm for the detection of retinal diseases via optical coherence tomography (OCT) imaging. Our algorithm utilizes multiscale histograms of oriented gradient descriptors as feature vectors of a support vector machine based classifier. The spectral domain OCT data sets used for cross-validation consisted of volumetric scans acquired from 45 subjects: 15 normal subjects, 15 patients with dry age-related macular degeneration (AMD), and 15 patients with diabetic macular edema (DME). Our classifier correctly identified 100% of cases with AMD, 100% cases with DME, and 86.67% cases of normal subjects. This algorithm is a potentially impactful tool for the remote diagnosis of ophthalmic diseases. PMID:25360373
Application of the artificial neural network in quantitative structure-gradient elution retention relationship of phenylthiocarbamyl amino acids derivatives.

PubMed

Tham, S Y; Agatonovic-Kustrin, S

2002-05-15

Quantitative structure-retention relationship(QSRR) method was used to model reversed-phase high-performance liquid chromatography (RP-HPLC) separation of 18 selected amino acids. Retention data for phenylthiocarbamyl (PTC) amino acids derivatives were obtained using gradient elution on ODS column with mobile phase of varying acetonitrile, acetate buffer and containing 0.5 ml/l of triethylamine (TEA). Molecular structure of each amino acid was encoded with 36 calculated molecular descriptors. The correlation between the molecular descriptors and the retention time of the compounds in the calibration set was established using the genetic neural network method. A genetic algorithm (GA) was used to select important molecular descriptors and supervised artificial neural network (ANN) was used to correlate mobile phase composition and selected descriptors with the experimentally derived retention times. Retention time values were used as the network's output and calculated molecular descriptors and mobile phase composition as the inputs. The best model with five input descriptors was chosen, and the significance of the selected descriptors for amino acid separation was examined. Results confirmed the dominant role of the organic modifier in such chromatographic systems in addition to lipophilicity (log P) and molecular size and shape (topological indices) of investigated solutes.
A review on principles, theory and practices of 2D-QSAR.

PubMed

Roy, Kunal; Das, Rudra Narayan

2014-01-01

The central axiom of science purports the explanation of every natural phenomenon using all possible logics coming from pure as well as mixed scientific background. The quantitative structure-activity relationship (QSAR) analysis is a study correlating the behavioral manifestation of compounds with their structures employing the interdisciplinary knowledge of chemistry, mathematics, biology as well as physics. Several studies have attempted to mathematically correlate the chemistry and property (physicochemical/ biological/toxicological) of molecules using various computationally or experimentally derived quantitative parameters termed as descriptors. The dimensionality of the descriptors depends on the type of algorithm employed and defines the nature of QSAR analysis. The most interesting feature of predictive QSAR models is that the behavior of any new or even hypothesized molecule can be predicted by the use of the mathematical equations. The phrase "2D-QSAR" signifies development of QSAR models using 2D-descriptors. Such predictor variables are the most widely practised ones because of their simple and direct mathematical algorithmic nature involving no time consuming energy computations and having reproducible operability. 2D-descriptors have a deluge of contributions in extracting chemical attributes and they are also capable of representing the 3D molecular features to some extent; although in no case they should be considered as the ultimate one, since they often suffer from the problems of intercorrelation, insufficient chemical information as well as lack of interpretation. However, by following rational approaches, novel 2D-descriptors may be developed to obviate various existing problems giving potential 2D-QSAR equations, thereby solving the innumerable chemical mysteries still unexplored.
New molecular descriptors based on local properties at the molecular surface and a boiling-point model derived from them.

PubMed

Ehresmann, Bernd; de Groot, Marcel J; Alex, Alexander; Clark, Timothy

2004-01-01

New molecular descriptors based on statistical descriptions of the local ionization potential, local electron affinity, and the local polarizability at the surface of the molecule are proposed. The significance of these descriptors has been tested by calculating them for the Maybridge database in addition to our set of 26 descriptors reported previously. The new descriptors show little correlation with those already in use. Furthermore, the principal components of the extended set of descriptors for the Maybridge data show that especially the descriptors based on the local electron affinity extend the variance in our set of descriptors, which we have previously shown to be relevant to physical properties. The first nine principal components are shown to be most significant. As an example of the usefulness of the new descriptors, we have set up a QSPR model for boiling points using both the old and new descriptors.
Predicting hepatotoxicity using ToxCast in vitro bioactivity and ...

EPA Pesticide Factsheets

Background: The U.S. EPA ToxCastTM program is screening thousands of environmental chemicals for bioactivity using hundreds of high-throughput in vitro assays to build predictive models of toxicity. We represented chemicals based on bioactivity and chemical structure descriptors then used supervised machine learning to predict their hepatotoxic effects.Results: A set of 677 chemicals were represented by 711 in vitro bioactivity descriptors (from ToxCast assays), 4,376 chemical structure descriptors (from QikProp, OpenBabel, PADEL, and PubChem), and three hepatotoxicity categories (from animal studies). Hepatotoxicants were defined by rat liver histopathology observed after chronic chemical testing and grouped into hypertrophy (161), injury (101) and proliferative lesions (99). Classifiers were built using six machine learning algorithms: linear discriminant analysis (LDA), Naïve Bayes (NB), support vector classification (SVM), classification and regression trees (CART), k-nearest neighbors (KNN) and an ensemble of classifiers (ENSMB). Classifiers of hepatotoxicity were built using chemical structure, ToxCast bioactivity, and a hybrid representation. Predictive performance was evaluated using 10-fold cross-validation testing and in-loop, filter-based, feature subset selection. Hybrid classifiers had the best balanced accuracy for predicting hypertrophy (0.78±0.08), injury (0.73±0.10) and proliferative lesions (0.72±0.09). Though chemical and bioactivity class
Application of a Cloud Model-Set Pair Analysis in Hazard Assessment for Biomass Gasification Stations.

PubMed

Yan, Fang; Xu, Kaili

2017-01-01

Because a biomass gasification station includes various hazard factors, hazard assessment is needed and significant. In this article, the cloud model (CM) is employed to improve set pair analysis (SPA), and a novel hazard assessment method for a biomass gasification station is proposed based on the cloud model-set pair analysis (CM-SPA). In this method, cloud weight is proposed to be the weight of index. In contrast to the index weight of other methods, cloud weight is shown by cloud descriptors; hence, the randomness and fuzziness of cloud weight will make it effective to reflect the linguistic variables of experts. Then, the cloud connection degree (CCD) is proposed to replace the connection degree (CD); the calculation algorithm of CCD is also worked out. By utilizing the CCD, the hazard assessment results are shown by some normal clouds, and the normal clouds are reflected by cloud descriptors; meanwhile, the hazard grade is confirmed by analyzing the cloud descriptors. After that, two biomass gasification stations undergo hazard assessment via CM-SPA and AHP based SPA, respectively. The comparison of assessment results illustrates that the CM-SPA is suitable and effective for the hazard assessment of a biomass gasification station and that CM-SPA will make the assessment results more reasonable and scientific.
Multiple QSAR models, pharmacophore pattern and molecular docking analysis for anticancer activity of α, β-unsaturated carbonyl-based compounds, oxime and oxime ether analogues

NASA Astrophysics Data System (ADS)

Masand, Vijay H.; El-Sayed, Nahed N. E.; Bambole, Mukesh U.; Quazi, Syed A.

2018-04-01

Multiple discrete quantitative structure-activity relationships (QSARs) models were constructed for the anticancer activity of α, β-unsaturated carbonyl-based compounds, oxime and oxime ether analogues with a variety of substituents like sbnd Br, sbnd OH, -OMe, etc. at different positions. A big pool of descriptors was considered for QSAR model building. Genetic algorithm (GA), available in QSARINS-Chem, was executed to choose optimum number and set of descriptors to create the multi-linear regression equations for a dataset of sixty-nine compounds. The newly developed five parametric models were subjected to exhaustive internal and external validation along with Y-scrambling using QSARINS-Chem, according to the OECD principles for QSAR model validation. The models were built using easily interpretable descriptors and accepted after confirming statistically robustness with high external predictive ability. The five parametric models were found to have R2 = 0.80 to 0.86, R2ex = 0.75 to 0.84, and CCCex = 0.85 to 0.90. The models indicate that frequency of nitrogen and oxygen atoms separated by five bonds from each other and internal electronic environment of the molecule have correlation with the anticancer activity.
Application of a Cloud Model-Set Pair Analysis in Hazard Assessment for Biomass Gasification Stations

PubMed Central

Yan, Fang; Xu, Kaili

2017-01-01

Because a biomass gasification station includes various hazard factors, hazard assessment is needed and significant. In this article, the cloud model (CM) is employed to improve set pair analysis (SPA), and a novel hazard assessment method for a biomass gasification station is proposed based on the cloud model-set pair analysis (CM-SPA). In this method, cloud weight is proposed to be the weight of index. In contrast to the index weight of other methods, cloud weight is shown by cloud descriptors; hence, the randomness and fuzziness of cloud weight will make it effective to reflect the linguistic variables of experts. Then, the cloud connection degree (CCD) is proposed to replace the connection degree (CD); the calculation algorithm of CCD is also worked out. By utilizing the CCD, the hazard assessment results are shown by some normal clouds, and the normal clouds are reflected by cloud descriptors; meanwhile, the hazard grade is confirmed by analyzing the cloud descriptors. After that, two biomass gasification stations undergo hazard assessment via CM-SPA and AHP based SPA, respectively. The comparison of assessment results illustrates that the CM-SPA is suitable and effective for the hazard assessment of a biomass gasification station and that CM-SPA will make the assessment results more reasonable and scientific. PMID:28076440
Retro-regression--another important multivariate regression improvement.

PubMed

Randić, M

2001-01-01

We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
Synthesis and quantitative structure-activity relationship (QSAR) study of novel isoxazoline and oxime derivatives of podophyllotoxin as insecticidal agents.

PubMed

Wang, Yi; Shao, Yonghua; Wang, Yangyang; Fan, Lingling; Yu, Xiang; Zhi, Xiaoyan; Yang, Chun; Qu, Huan; Yao, Xiaojun; Xu, Hui

2012-08-29

In continuation of our program aimed at the discovery and development of natural-product-based insecticidal agents, 33 isoxazoline and oxime derivatives of podophyllotoxin modified in the C and D rings were synthesized and their structures were characterized by Proton nuclear magnetic resonance ((1)H NMR), high-resolution mass spectrometry (HRMS), electrospray ionization-mass spectrometry (ESI-MS), optical rotation, melting point (mp), and infrared (IR) spectroscopy. The stereochemical configurations of compounds 5e, 5f, and 9f were unambiguously determined by X-ray crystallography. Their insecticidal activity was evaluated against the pre-third-instar larvae of northern armyworm, Mythimna separata (Walker), in vivo. Compounds 5e, 9c, 11g, and 11h especially exhibited more promising insecticidal activity than toosendanin, a commercial botanical insecticide extracted from Melia azedarach . A genetic algorithm combined with multiple linear regression (GA-MLR) calculation is performed by the MOBY DIGS package. Five selected descriptors are as follows: one two-dimensional (2D) autocorrelation descriptor (GATS4e), one edge adjacency indice (EEig06x), one RDF descriptor (RDF080v), one three-dimensional (3D) MoRSE descriptor (Mor09v), and one atom-centered fragment (H-052) descriptor. Quantitative structure-activity relationship studies demonstrated that the insecticidal activity of these compounds was mainly influenced by many factors, such as electronic distribution, steric factors, etc. For this model, the standard deviation error in prediction (SDEP) is 0.0592, the correlation coefficient (R(2)) is 0.861, and the leave-one-out cross-validation correlation coefficient (Q(2)loo) is 0.797.
Mining semantic networks of bioinformatics e-resources from the literature

PubMed Central

2011-01-01

Background There have been a number of recent efforts (e.g. BioCatalogue, BioMoby) to systematically catalogue bioinformatics tools, services and datasets. These efforts rely on manual curation, making it difficult to cope with the huge influx of various electronic resources that have been provided by the bioinformatics community. We present a text mining approach that utilises the literature to automatically extract descriptions and semantically profile bioinformatics resources to make them available for resource discovery and exploration through semantic networks that contain related resources. Results The method identifies the mentions of resources in the literature and assigns a set of co-occurring terminological entities (descriptors) to represent them. We have processed 2,691 full-text bioinformatics articles and extracted profiles of 12,452 resources containing associated descriptors with binary and tf*idf weights. Since such representations are typically sparse (on average 13.77 features per resource), we used lexical kernel metrics to identify semantically related resources via descriptor smoothing. Resources are then clustered or linked into semantic networks, providing the users (bioinformaticians, curators and service/tool crawlers) with a possibility to explore algorithms, tools, services and datasets based on their relatedness. Manual exploration of links between a set of 18 well-known bioinformatics resources suggests that the method was able to identify and group semantically related entities. Conclusions The results have shown that the method can reconstruct interesting functional links between resources (e.g. linking data types and algorithms), in particular when tf*idf-like weights are used for profiling. This demonstrates the potential of combining literature mining and simple lexical kernel methods to model relatedness between resource descriptors in particular when there are few features, thus potentially improving the resource description, discovery and exploration process. The resource profiles are available at http://gnode1.mib.man.ac.uk/bioinf/semnets.html PMID:21388573
Shadow Areas Robust Matching Among Image Sequence in Planetary Landing

NASA Astrophysics Data System (ADS)

Ruoyan, Wei; Xiaogang, Ruan; Naigong, Yu; Xiaoqing, Zhu; Jia, Lin

2017-01-01

In this paper, an approach for robust matching shadow areas in autonomous visual navigation and planetary landing is proposed. The approach begins with detecting shadow areas, which are extracted by Maximally Stable Extremal Regions (MSER). Then, an affine normalization algorithm is applied to normalize the areas. Thirdly, a descriptor called Multiple Angles-SIFT (MA-SIFT) that coming from SIFT is proposed, the descriptor can extract more features of an area. Finally, for eliminating the influence of outliers, a method of improved RANSAC based on Skinner Operation Condition is proposed to extract inliers. At last, series of experiments are conducted to test the performance of the approach this paper proposed, the results show that the approach can maintain the matching accuracy at a high level even the differences among the images are obvious with no attitude measurements supplied.

New descriptor for skeletons of planar shapes: the calypter

NASA Astrophysics Data System (ADS)

Pirard, Eric; Nivart, Jean-Francois

1994-05-01

The mathematical definition of the skeleton as the locus of centers of maximal inscribed discs is a nondigitizable one. The idea presented in this paper is to incorporate the skeleton information and the chain-code of the contour into a single descriptor by associating to each point of a contour the center and radius of the maximum inscribed disc tangent at that point. This new descriptor is called calypter. The encoding of a calypter is a three stage algorithm: (1) chain coding of the contour; (2) euclidean distance transformation, (3) climbing on the distance relief from each point of the contour towards the corresponding maximal inscribed disc center. Here we introduce an integer euclidean distance transform called the holodisc distance transform. The major interest of this holodisc transform is to confer 8-connexity to the isolevels of the generated distance relief thereby allowing a climbing algorithm to proceed step by step towards the centers of the maximal inscribed discs. The calypter has a cyclic structure delivering high speed access to the skeleton data. Its potential uses are in high speed euclidean mathematical morphology, shape processing, and analysis.
Segmentation of anatomical branching structures based on texture features and conditional random field

NASA Astrophysics Data System (ADS)

Nuzhnaya, Tatyana; Bakic, Predrag; Kontos, Despina; Megalooikonomou, Vasileios; Ling, Haibin

2012-02-01

This work is a part of our ongoing study aimed at understanding a relation between the topology of anatomical branching structures with the underlying image texture. Morphological variability of the breast ductal network is associated with subsequent development of abnormalities in patients with nipple discharge such as papilloma, breast cancer and atypia. In this work, we investigate complex dependence among ductal components to perform segmentation, the first step for analyzing topology of ductal lobes. Our automated framework is based on incorporating a conditional random field with texture descriptors of skewness, coarseness, contrast, energy and fractal dimension. These features are selected to capture the architectural variability of the enhanced ducts by encoding spatial variations between pixel patches in galactographic image. The segmentation algorithm was applied to a dataset of 20 x-ray galactograms obtained at the Hospital of the University of Pennsylvania. We compared the performance of the proposed approach with fully and semi automated segmentation algorithms based on neural network classification, fuzzy-connectedness, vesselness filter and graph cuts. Global consistency error and confusion matrix analysis were used as accuracy measurements. For the proposed approach, the true positive rate was higher and the false negative rate was significantly lower compared to other fully automated methods. This indicates that segmentation based on CRF incorporated with texture descriptors has potential to efficiently support the analysis of complex topology of the ducts and aid in development of realistic breast anatomy phantoms.
Joint sparse coding based spatial pyramid matching for classification of color medical image.

PubMed

Shi, Jun; Li, Yi; Zhu, Jie; Sun, Haojie; Cai, Yin

2015-04-01

Although color medical images are important in clinical practice, they are usually converted to grayscale for further processing in pattern recognition, resulting in loss of rich color information. The sparse coding based linear spatial pyramid matching (ScSPM) and its variants are popular for grayscale image classification, but cannot extract color information. In this paper, we propose a joint sparse coding based SPM (JScSPM) method for the classification of color medical images. A joint dictionary can represent both the color information in each color channel and the correlation between channels. Consequently, the joint sparse codes calculated from a joint dictionary can carry color information, and therefore this method can easily transform a feature descriptor originally designed for grayscale images to a color descriptor. A color hepatocellular carcinoma histological image dataset was used to evaluate the performance of the proposed JScSPM algorithm. Experimental results show that JScSPM provides significant improvements as compared with the majority voting based ScSPM and the original ScSPM for color medical image classification. Copyright © 2014 Elsevier Ltd. All rights reserved.
RED: a set of molecular descriptors based on Renyi entropy.

PubMed

Delgado-Soler, Laura; Toral, Raul; Tomás, M Santos; Rubio-Martinez, Jaime

2009-11-01

New molecular descriptors, RED (Renyi entropy descriptors), based on the generalized entropies introduced by Renyi are presented. Topological descriptors based on molecular features have proven to be useful for describing molecular profiles. Renyi entropy is used as a variability measure to contract a feature-pair distribution composing the descriptor vector. The performance of RED descriptors was tested for the analysis of different sets of molecular distances, virtual screening, and pharmacological profiling. A free parameter of the Renyi entropy has been optimized for all the considered applications.
Stochastic Analysis and Design of Heterogeneous Microstructural Materials System

NASA Astrophysics Data System (ADS)

Xu, Hongyi

Advanced materials system refers to new materials that are comprised of multiple traditional constituents but complex microstructure morphologies, which lead to superior properties over the conventional materials. To accelerate the development of new advanced materials system, the objective of this dissertation is to develop a computational design framework and the associated techniques for design automation of microstructure materials systems, with an emphasis on addressing the uncertainties associated with the heterogeneity of microstructural materials. Five key research tasks are identified: design representation, design evaluation, design synthesis, material informatics and uncertainty quantification. Design representation of microstructure includes statistical characterization and stochastic reconstruction. This dissertation develops a new descriptor-based methodology, which characterizes 2D microstructures using descriptors of composition, dispersion and geometry. Statistics of 3D descriptors are predicted based on 2D information to enable 2D-to-3D reconstruction. An efficient sequential reconstruction algorithm is developed to reconstruct statistically equivalent random 3D digital microstructures. In design evaluation, a stochastic decomposition and reassembly strategy is developed to deal with the high computational costs and uncertainties induced by material heterogeneity. The properties of Representative Volume Elements (RVE) are predicted by stochastically reassembling SVE elements with stochastic properties into a coarse representation of the RVE. In design synthesis, a new descriptor-based design framework is developed, which integrates computational methods of microstructure characterization and reconstruction, sensitivity analysis, Design of Experiments (DOE), metamodeling and optimization the enable parametric optimization of the microstructure for achieving the desired material properties. Material informatics is studied to efficiently reduce the dimension of microstructure design space. This dissertation develops a machine learning-based methodology to identify the key microstructure descriptors that highly impact properties of interest. In uncertainty quantification, a comparative study on data-driven random process models is conducted to provide guidance for choosing the most accurate model in statistical uncertainty quantification. Two new goodness-of-fit metrics are developed to provide quantitative measurements of random process models' accuracy. The benefits of the proposed methods are demonstrated by the example of designing the microstructure of polymer nanocomposites. This dissertation provides material-generic, intelligent modeling/design methodologies and techniques to accelerate the process of analyzing and designing new microstructural materials system.
A stochastic context free grammar based framework for analysis of protein sequences

PubMed Central

Dyrka, Witold; Nebel, Jean-Christophe

2009-01-01

Background In the last decade, there have been many applications of formal language theory in bioinformatics such as RNA structure prediction and detection of patterns in DNA. However, in the field of proteomics, the size of the protein alphabet and the complexity of relationship between amino acids have mainly limited the application of formal language theory to the production of grammars whose expressive power is not higher than stochastic regular grammars. However, these grammars, like other state of the art methods, cannot cover any higher-order dependencies such as nested and crossing relationships that are common in proteins. In order to overcome some of these limitations, we propose a Stochastic Context Free Grammar based framework for the analysis of protein sequences where grammars are induced using a genetic algorithm. Results This framework was implemented in a system aiming at the production of binding site descriptors. These descriptors not only allow detection of protein regions that are involved in these sites, but also provide insight in their structure. Grammars were induced using quantitative properties of amino acids to deal with the size of the protein alphabet. Moreover, we imposed some structural constraints on grammars to reduce the extent of the rule search space. Finally, grammars based on different properties were combined to convey as much information as possible. Evaluation was performed on sites of various sizes and complexity described either by PROSITE patterns, domain profiles or a set of patterns. Results show the produced binding site descriptors are human-readable and, hence, highlight biologically meaningful features. Moreover, they achieve good accuracy in both annotation and detection. In addition, findings suggest that, unlike current state-of-the-art methods, our system may be particularly suited to deal with patterns shared by non-homologous proteins. Conclusion A new Stochastic Context Free Grammar based framework has been introduced allowing the production of binding site descriptors for analysis of protein sequences. Experiments have shown that not only is this new approach valid, but produces human-readable descriptors for binding sites which have been beyond the capability of current machine learning techniques. PMID:19814800
How diverse are diversity assessment methods? A comparative analysis and benchmarking of molecular descriptor space.

PubMed

Koutsoukas, Alexios; Paricharak, Shardul; Galloway, Warren R J D; Spring, David R; Ijzerman, Adriaan P; Glen, Robert C; Marcus, David; Bender, Andreas

2014-01-27

Chemical diversity is a widely applied approach to select structurally diverse subsets of molecules, often with the objective of maximizing the number of hits in biological screening. While many methods exist in the area, few systematic comparisons using current descriptors in particular with the objective of assessing diversity in bioactivity space have been published, and this shortage is what the current study is aiming to address. In this work, 13 widely used molecular descriptors were compared, including fingerprint-based descriptors (ECFP4, FCFP4, MACCS keys), pharmacophore-based descriptors (TAT, TAD, TGT, TGD, GpiDAPH3), shape-based descriptors (rapid overlay of chemical structures (ROCS) and principal moments of inertia (PMI)), a connectivity-matrix-based descriptor (BCUT), physicochemical-property-based descriptors (prop2D), and a more recently introduced molecular descriptor type (namely, "Bayes Affinity Fingerprints"). We assessed both the similar behavior of the descriptors in assessing the diversity of chemical libraries, and their ability to select compounds from libraries that are diverse in bioactivity space, which is a property of much practical relevance in screening library design. This is particularly evident, given that many future targets to be screened are not known in advance, but that the library should still maximize the likelihood of containing bioactive matter also for future screening campaigns. Overall, our results showed that descriptors based on atom topology (i.e., fingerprint-based descriptors and pharmacophore-based descriptors) correlate well in rank-ordering compounds, both within and between descriptor types. On the other hand, shape-based descriptors such as ROCS and PMI showed weak correlation with the other descriptors utilized in this study, demonstrating significantly different behavior. We then applied eight of the molecular descriptors compared in this study to sample a diverse subset of sample compounds (4%) from an initial population of 2587 compounds, covering the 25 largest human activity classes from ChEMBL and measured the coverage of activity classes by the subsets. Here, it was found that "Bayes Affinity Fingerprints" achieved an average coverage of 92% of activity classes. Using the descriptors ECFP4, GpiDAPH3, TGT, and random sampling, 91%, 84%, 84%, and 84% of the activity classes were represented in the selected compounds respectively, followed by BCUT, prop2D, MACCS, and PMI (in order of decreasing performance). In addition, we were able to show that there is no visible correlation between compound diversity in PMI space and in bioactivity space, despite frequent utilization of PMI plots to this end. To summarize, in this work, we assessed which descriptors select compounds with high coverage of bioactivity space, and can hence be used for diverse compound selection for biological screening. In cases where multiple descriptors are to be used for diversity selection, this work describes which descriptors behave complementarily, and can hence be used jointly to focus on different aspects of diversity in chemical space.
Difet: Distributed Feature Extraction Tool for High Spatial Resolution Remote Sensing Images

NASA Astrophysics Data System (ADS)

Eken, S.; Aydın, E.; Sayar, A.

2017-11-01

In this paper, we propose distributed feature extraction tool from high spatial resolution remote sensing images. Tool is based on Apache Hadoop framework and Hadoop Image Processing Interface. Two corner detection (Harris and Shi-Tomasi) algorithms and five feature descriptors (SIFT, SURF, FAST, BRIEF, and ORB) are considered. Robustness of the tool in the task of feature extraction from LandSat-8 imageries are evaluated in terms of horizontal scalability.
Molecular Descriptors

NASA Astrophysics Data System (ADS)

Consonni, Viviana; Todeschini, Roberto

In the last decades, several scientific researches have been focused on studying how to encompass and convert - by a theoretical pathway - the information encoded in the molecular structure into one or more numbers used to establish quantitative relationships between structures and properties, biological activities, or other experimental properties. Molecular descriptors are formally mathematical representations of a molecule obtained by a well-specified algorithm applied to a defined molecular representation or a well-specified experimental procedure. They play a fundamental role in chemistry, pharmaceutical sciences, environmental protection policy, toxicology, ecotoxicology, health research, and quality control. Evidence of the interest of the scientific community in the molecular descriptors is provided by the huge number of descriptors proposed up today: more than 5000 descriptors derived from different theories and approaches are defined in the literature and most of them can be calculated by means of dedicated software applications. Molecular descriptors are of outstanding importance in the research fields of quantitative structure-activity relationships (QSARs) and quantitative structure-property relationships (QSPRs), where they are the independent chemical information used to predict the properties of interest. Along with the definition of appropriate molecular descriptors, the molecular structure representation and the mathematical tools for deriving and assessing models are other fundamental components of the QSAR/QSPR approach. The remarkable progress during the last few years in chemometrics and chemoinformatics has led to new strategies for finding mathematical meaningful relationships between the molecular structure and biological activities, physico-chemical, toxicological, and environmental properties of chemicals. Different approaches for deriving molecular descriptors here reviewed and some of the most relevant descriptors are presented in detail with numerical examples.
An effective content-based image retrieval technique for image visuals representation based on the bag-of-visual-words model

PubMed Central

Jabeen, Safia; Mehmood, Zahid; Mahmood, Toqeer; Saba, Tanzila; Rehman, Amjad; Mahmood, Muhammad Tariq

2018-01-01

For the last three decades, content-based image retrieval (CBIR) has been an active research area, representing a viable solution for retrieving similar images from an image repository. In this article, we propose a novel CBIR technique based on the visual words fusion of speeded-up robust features (SURF) and fast retina keypoint (FREAK) feature descriptors. SURF is a sparse descriptor whereas FREAK is a dense descriptor. Moreover, SURF is a scale and rotation-invariant descriptor that performs better in the case of repeatability, distinctiveness, and robustness. It is robust to noise, detection errors, geometric, and photometric deformations. It also performs better at low illumination within an image as compared to the FREAK descriptor. In contrast, FREAK is a retina-inspired speedy descriptor that performs better for classification-based problems as compared to the SURF descriptor. Experimental results show that the proposed technique based on the visual words fusion of SURF-FREAK descriptors combines the features of both descriptors and resolves the aforementioned issues. The qualitative and quantitative analysis performed on three image collections, namely Corel-1000, Corel-1500, and Caltech-256, shows that proposed technique based on visual words fusion significantly improved the performance of the CBIR as compared to the feature fusion of both descriptors and state-of-the-art image retrieval techniques. PMID:29694429
An effective content-based image retrieval technique for image visuals representation based on the bag-of-visual-words model.

PubMed

Jabeen, Safia; Mehmood, Zahid; Mahmood, Toqeer; Saba, Tanzila; Rehman, Amjad; Mahmood, Muhammad Tariq

2018-01-01

For the last three decades, content-based image retrieval (CBIR) has been an active research area, representing a viable solution for retrieving similar images from an image repository. In this article, we propose a novel CBIR technique based on the visual words fusion of speeded-up robust features (SURF) and fast retina keypoint (FREAK) feature descriptors. SURF is a sparse descriptor whereas FREAK is a dense descriptor. Moreover, SURF is a scale and rotation-invariant descriptor that performs better in the case of repeatability, distinctiveness, and robustness. It is robust to noise, detection errors, geometric, and photometric deformations. It also performs better at low illumination within an image as compared to the FREAK descriptor. In contrast, FREAK is a retina-inspired speedy descriptor that performs better for classification-based problems as compared to the SURF descriptor. Experimental results show that the proposed technique based on the visual words fusion of SURF-FREAK descriptors combines the features of both descriptors and resolves the aforementioned issues. The qualitative and quantitative analysis performed on three image collections, namely Corel-1000, Corel-1500, and Caltech-256, shows that proposed technique based on visual words fusion significantly improved the performance of the CBIR as compared to the feature fusion of both descriptors and state-of-the-art image retrieval techniques.
Score-level fusion of two-dimensional and three-dimensional palmprint for personal recognition systems

NASA Astrophysics Data System (ADS)

Chaa, Mourad; Boukezzoula, Naceur-Eddine; Attia, Abdelouahab

2017-01-01

Two types of scores extracted from two-dimensional (2-D) and three-dimensional (3-D) palmprint for personal recognition systems are merged, introducing a local image descriptor for 2-D palmprint-based recognition systems, named bank of binarized statistical image features (B-BSIF). The main idea of B-BSIF is that the extracted histograms from the binarized statistical image features (BSIF) code images (the results of applying the different BSIF descriptor size with the length 12) are concatenated into one to produce a large feature vector. 3-D palmprint contains the depth information of the palm surface. The self-quotient image (SQI) algorithm is applied for reconstructing illumination-invariant 3-D palmprint images. To extract discriminative Gabor features from SQI images, Gabor wavelets are defined and used. Indeed, the dimensionality reduction methods have shown their ability in biometrics systems. Given this, a principal component analysis (PCA)+linear discriminant analysis (LDA) technique is employed. For the matching process, the cosine Mahalanobis distance is applied. Extensive experiments were conducted on a 2-D and 3-D palmprint database with 10,400 range images from 260 individuals. Then, a comparison was made between the proposed algorithm and other existing methods in the literature. Results clearly show that the proposed framework provides a higher correct recognition rate. Furthermore, the best results were obtained by merging the score of B-BSIF descriptor with the score of the SQI+Gabor wavelets+PCA+LDA method, yielding an equal error rate of 0.00% and a recognition rate of rank-1=100.00%.
Recognizing characters of ancient manuscripts

NASA Astrophysics Data System (ADS)

Diem, Markus; Sablatnig, Robert

2010-02-01

Considering printed Latin text, the main issues of Optical Character Recognition (OCR) systems are solved. However, for degraded handwritten document images, basic preprocessing steps such as binarization, gain poor results with state-of-the-art methods. In this paper ancient Slavonic manuscripts from the 11th century are investigated. In order to minimize the consequences of false character segmentation, a binarization-free approach based on local descriptors is proposed. Additionally local information allows the recognition of partially visible or washed out characters. The proposed algorithm consists of two steps: character classification and character localization. Initially Scale Invariant Feature Transform (SIFT) features are extracted which are subsequently classified using Support Vector Machines (SVM). Afterwards, the interest points are clustered according to their spatial information. Thereby, characters are localized and finally recognized based on a weighted voting scheme of pre-classified local descriptors. Preliminary results show that the proposed system can handle highly degraded manuscript images with background clutter (e.g. stains, tears) and faded out characters.
The registration of non-cooperative moving targets laser point cloud in different view point

NASA Astrophysics Data System (ADS)

Wang, Shuai; Sun, Huayan; Guo, Huichao

2018-01-01

Non-cooperative moving target multi-view cloud registration is the key technology of 3D reconstruction of laser threedimension imaging. The main problem is that the density changes greatly and noise exists under different acquisition conditions of point cloud. In this paper, firstly, the feature descriptor is used to find the most similar point cloud, and then based on the registration algorithm of region segmentation, the geometric structure of the point is extracted by the geometric similarity between point and point, The point cloud is divided into regions based on spectral clustering, feature descriptors are created for each region, searching to find the most similar regions in the most similar point of view cloud, and then aligning the pair of point clouds by aligning their minimum bounding boxes. Repeat the above steps again until registration of all point clouds is completed. Experiments show that this method is insensitive to the density of point clouds and performs well on the noise of laser three-dimension imaging.
A hierarchical clustering methodology for the estimation of toxicity.

PubMed

Martin, Todd M; Harten, Paul; Venkatapathy, Raghuraman; Das, Shashikala; Young, Douglas M

2008-01-01

ABSTRACT A quantitative structure-activity relationship (QSAR) methodology based on hierarchical clustering was developed to predict toxicological endpoints. This methodology utilizes Ward's method to divide a training set into a series of structurally similar clusters. The structural similarity is defined in terms of 2-D physicochemical descriptors (such as connectivity and E-state indices). A genetic algorithm-based technique is used to generate statistically valid QSAR models for each cluster (using the pool of descriptors described above). The toxicity for a given query compound is estimated using the weighted average of the predictions from the closest cluster from each step in the hierarchical clustering assuming that the compound is within the domain of applicability of the cluster. The hierarchical clustering methodology was tested using a Tetrahymena pyriformis acute toxicity data set containing 644 chemicals in the training set and with two prediction sets containing 339 and 110 chemicals. The results from the hierarchical clustering methodology were compared to the results from several different QSAR methodologies.
Communication target object recognition for D2D connection with feature size limit

NASA Astrophysics Data System (ADS)

Ok, Jiheon; Kim, Soochang; Kim, Young-hoon; Lee, Chulhee

2015-03-01

Recently, a new concept of device-to-device (D2D) communication, which is called "point-and-link communication" has attracted great attentions due to its intuitive and simple operation. This approach enables user to communicate with target devices without any pre-identification information such as SSIDs, MAC addresses by selecting the target image displayed on the user's own device. In this paper, we present an efficient object matching algorithm that can be applied to look(point)-and-link communications for mobile services. Due to the limited channel bandwidth and low computational power of mobile terminals, the matching algorithm should satisfy low-complexity, low-memory and realtime requirements. To meet these requirements, we propose fast and robust feature extraction by considering the descriptor size and processing time. The proposed algorithm utilizes a HSV color histogram, SIFT (Scale Invariant Feature Transform) features and object aspect ratios. To reduce the descriptor size under 300 bytes, a limited number of SIFT key points were chosen as feature points and histograms were binarized while maintaining required performance. Experimental results show the robustness and the efficiency of the proposed algorithm.
Curve Set Feature-Based Robust and Fast Pose Estimation Algorithm

PubMed Central

Hashimoto, Koichi

2017-01-01

Bin picking refers to picking the randomly-piled objects from a bin for industrial production purposes, and robotic bin picking is always used in automated assembly lines. In order to achieve a higher productivity, a fast and robust pose estimation algorithm is necessary to recognize and localize the randomly-piled parts. This paper proposes a pose estimation algorithm for bin picking tasks using point cloud data. A novel descriptor Curve Set Feature (CSF) is proposed to describe a point by the surface fluctuation around this point and is also capable of evaluating poses. The Rotation Match Feature (RMF) is proposed to match CSF efficiently. The matching process combines the idea of the matching in 2D space of origin Point Pair Feature (PPF) algorithm with nearest neighbor search. A voxel-based pose verification method is introduced to evaluate the poses and proved to be more than 30-times faster than the kd-tree-based verification method. Our algorithm is evaluated against a large number of synthetic and real scenes and proven to be robust to noise, able to detect metal parts, more accurately and more than 10-times faster than PPF and Oriented, Unique and Repeatable (OUR)-Clustered Viewpoint Feature Histogram (CVFH). PMID:28771216
Calculation of Five Thermodynamic Molecular Descriptors by Means of a General Computer Algorithm Based on the Group-Additivity Method: Standard Enthalpies of Vaporization, Sublimation and Solvation, and Entropy of Fusion of Ordinary Organic Molecules and Total Phase-Change Entropy of Liquid Crystals.

PubMed

Naef, Rudolf; Acree, William E

2017-06-25

The calculation of the standard enthalpies of vaporization, sublimation and solvation of organic molecules is presented using a common computer algorithm on the basis of a group-additivity method. The same algorithm is also shown to enable the calculation of their entropy of fusion as well as the total phase-change entropy of liquid crystals. The present method is based on the complete breakdown of the molecules into their constituting atoms and their immediate neighbourhood; the respective calculations of the contribution of the atomic groups by means of the Gauss-Seidel fitting method is based on experimental data collected from literature. The feasibility of the calculations for each of the mentioned descriptors was verified by means of a 10-fold cross-validation procedure proving the good to high quality of the predicted values for the three mentioned enthalpies and for the entropy of fusion, whereas the predictive quality for the total phase-change entropy of liquid crystals was poor. The goodness of fit ( Q ²) and the standard deviation (σ) of the cross-validation calculations for the five descriptors was as follows: 0.9641 and 4.56 kJ/mol ( N = 3386 test molecules) for the enthalpy of vaporization, 0.8657 and 11.39 kJ/mol ( N = 1791) for the enthalpy of sublimation, 0.9546 and 4.34 kJ/mol ( N = 373) for the enthalpy of solvation, 0.8727 and 17.93 J/mol/K ( N = 2637) for the entropy of fusion and 0.5804 and 32.79 J/mol/K ( N = 2643) for the total phase-change entropy of liquid crystals. The large discrepancy between the results of the two closely related entropies is discussed in detail. Molecules for which both the standard enthalpies of vaporization and sublimation were calculable, enabled the estimation of their standard enthalpy of fusion by simple subtraction of the former from the latter enthalpy. For 990 of them the experimental enthalpy-of-fusion values are also known, allowing their comparison with predictions, yielding a correlation coefficient R ² of 0.6066.
Application of machine vision in inspecting stem and shape of fruits

NASA Astrophysics Data System (ADS)

Ying, Yibin; Jing, Hansong; Tao, Yang; Jin, Juanqin; Ibarra, Juan G.; Chen, Zhikuan

2000-12-01

The shape and the condition of stem are important features in classification of Huanghua pears. As the commonly used thinning and erosion-dilation algorithm in judging the presence of the stem is too slow, a new fast algorithm was put forward. Compared with other part of the pear, the stem is obviously thin and long, with the help of various sized templates, the judgment of whether the stem is present was easily made, meanwhile the stem head and the intersection point of stem bottom and pear were labeled. Furthermore, after the slopes of the tangential line of stem head and tangential line of stem bottom were found, the included angle of these two lines was calculated. It was found that the included angle of the broken stem was obviously different from that of the good stem. After the analysis of 53 pictures of pears, the accuracy to judge whether the stem is present is 100% and whether the stem is good reaches 93%. Also, the algorithm is of robustness and can be made invariant to translation and rotation Meanwhile, the method to describe the shape of irregular fruits was studied. Fourier transformation and inverse Fourier transformation pair were adopted to describe the shape of Huanghua pears, and the algorithm for shape identification, which was based on artificial neural network, was developed. The first sixteen harmonic components of the Fourier descriptor were enough to represent the primary shape of pear, and the identification accuracy could reach 90% by applying the Fourier descriptor in combination with artificial neural network.
Novel Spectral Representations and Sparsity-Driven Algorithms for Shape Modeling and Analysis

NASA Astrophysics Data System (ADS)

Zhong, Ming

In this dissertation, we focus on extending classical spectral shape analysis by incorporating spectral graph wavelets and sparsity-seeking algorithms. Defined with the graph Laplacian eigenbasis, the spectral graph wavelets are localized both in the vertex domain and graph spectral domain, and thus are very effective in describing local geometry. With a rich dictionary of elementary vectors and forcing certain sparsity constraints, a real life signal can often be well approximated by a very sparse coefficient representation. The many successful applications of sparse signal representation in computer vision and image processing inspire us to explore the idea of employing sparse modeling techniques with dictionary of spectral basis to solve various shape modeling problems. Conventional spectral mesh compression uses the eigenfunctions of mesh Laplacian as shape bases, which are highly inefficient in representing local geometry. To ameliorate, we advocate an innovative approach to 3D mesh compression using spectral graph wavelets as dictionary to encode mesh geometry. The spectral graph wavelets are locally defined at individual vertices and can better capture local shape information than Laplacian eigenbasis. The multi-scale SGWs form a redundant dictionary as shape basis, so we formulate the compression of 3D shape as a sparse approximation problem that can be readily handled by greedy pursuit algorithms. Surface inpainting refers to the completion or recovery of missing shape geometry based on the shape information that is currently available. We devise a new surface inpainting algorithm founded upon the theory and techniques of sparse signal recovery. Instead of estimating the missing geometry directly, our novel method is to find this low-dimensional representation which describes the entire original shape. More specifically, we find that, for many shapes, the vertex coordinate function can be well approximated by a very sparse coefficient representation with respect to the dictionary comprising its Laplacian eigenbasis, and it is then possible to recover this sparse representation from partial measurements of the original shape. Taking advantage of the sparsity cue, we advocate a novel variational approach for surface inpainting, integrating data fidelity constraints on the shape domain with coefficient sparsity constraints on the transformed domain. Because of the powerful properties of Laplacian eigenbasis, the inpainting results of our method tend to be globally coherent with the remaining shape. Informative and discriminative feature descriptors are vital in qualitative and quantitative shape analysis for a large variety of graphics applications. We advocate novel strategies to define generalized, user-specified features on shapes. Our new region descriptors are primarily built upon the coefficients of spectral graph wavelets that are both multi-scale and multi-level in nature, consisting of both local and global information. Based on our novel spectral feature descriptor, we developed a user-specified feature detection framework and a tensor-based shape matching algorithm. Through various experiments, we demonstrate the competitive performance of our proposed methods and the great potential of spectral basis and sparsity-driven methods for shape modeling.

Anomaly Detection Based on Local Nearest Neighbor Distance Descriptor in Crowded Scenes

PubMed Central

Hu, Shiqiang; Zhang, Huanlong; Luo, Lingkun

2014-01-01

We propose a novel local nearest neighbor distance (LNND) descriptor for anomaly detection in crowded scenes. Comparing with the commonly used low-level feature descriptors in previous works, LNND descriptor has two major advantages. First, LNND descriptor efficiently incorporates spatial and temporal contextual information around the video event that is important for detecting anomalous interaction among multiple events, while most existing feature descriptors only contain the information of single event. Second, LNND descriptor is a compact representation and its dimensionality is typically much lower than the low-level feature descriptor. Therefore, not only the computation time and storage requirement can be accordingly saved by using LNND descriptor for the anomaly detection method with offline training fashion, but also the negative aspects caused by using high-dimensional feature descriptor can be avoided. We validate the effectiveness of LNND descriptor by conducting extensive experiments on different benchmark datasets. Experimental results show the promising performance of LNND-based method against the state-of-the-art methods. It is worthwhile to notice that the LNND-based approach requires less intermediate processing steps without any subsequent processing such as smoothing but achieves comparable event better performance. PMID:25105164
Convenient QSAR model for predicting the complexation of structurally diverse compounds with beta-cyclodextrins.

PubMed

Pérez-Garrido, Alfonso; Morales Helguera, Aliuska; Abellán Guillén, Adela; Cordeiro, M Natália D S; Garrido Escudero, Amalio

2009-01-15

This paper reports a QSAR study for predicting the complexation of a large and heterogeneous variety of substances (233 organic compounds) with beta-cyclodextrins (beta-CDs). Several different theoretical molecular descriptors, calculated solely from the molecular structure of the compounds under investigation, and an efficient variable selection procedure, like the Genetic Algorithm, led to models with satisfactory global accuracy and predictivity. But the best-final QSAR model is based on Topological descriptors meanwhile offering a reasonable interpretation. This QSAR model was able to explain ca. 84% of the variance in the experimental activity, and displayed very good internal cross-validation statistics and predictivity on external data. It shows that the driving forces for CD complexation are mainly hydrophobic and steric (van der Waals) interactions. Thus, the results of our study provide a valuable tool for future screening and priority testing of beta-CDs guest molecules.
FLASHFLOOD: A 3D Field-based similarity search and alignment method for flexible molecules

NASA Astrophysics Data System (ADS)

Pitman, Michael C.; Huber, Wolfgang K.; Horn, Hans; Krämer, Andreas; Rice, Julia E.; Swope, William C.

2001-07-01

A three-dimensional field-based similarity search and alignment method for flexible molecules is introduced. The conformational space of a flexible molecule is represented in terms of fragments and torsional angles of allowed conformations. A user-definable property field is used to compute features of fragment pairs. Features are generalizations of CoMMA descriptors (Silverman, B.D. and Platt, D.E., J. Med. Chem., 39 (1996) 2129.) that characterize local regions of the property field by its local moments. The features are invariant under coordinate system transformations. Features taken from a query molecule are used to form alignments with fragment pairs in the database. An assembly algorithm is then used to merge the fragment pairs into full structures, aligned to the query. Key to the method is the use of a context adaptive descriptor scaling procedure as the basis for similarity. This allows the user to tune the weights of the various feature components based on examples relevant to the particular context under investigation. The property fields may range from simple, phenomenological fields, to fields derived from quantum mechanical calculations. We apply the method to the dihydrofolate/methotrexate benchmark system, and show that when one injects relevant contextual information into the descriptor scaling procedure, better results are obtained more efficiently. We also show how the method works and include computer times for a query from a database that represents approximately 23 million conformers of seventeen flexible molecules.
Real-time pose invariant logo and pattern detection

NASA Astrophysics Data System (ADS)

Sidla, Oliver; Kottmann, Michal; Benesova, Wanda

2011-01-01

The detection of pose invariant planar patterns has many practical applications in computer vision and surveillance systems. The recognition of company logos is used in market studies to examine the visibility and frequency of logos in advertisement. Danger signs on vehicles could be detected to trigger warning systems in tunnels, or brand detection on transport vehicles can be used to count company-specific traffic. We present the results of a study on planar pattern detection which is based on keypoint detection and matching of distortion invariant 2d feature descriptors. Specifically we look at the keypoint detectors of type: i) Lowe's DoG approximation from the SURF algorithm, ii) the Harris Corner Detector, iii) the FAST Corner Detector and iv) Lepetit's keypoint detector. Our study then compares the feature descriptors SURF and compact signatures based on Random Ferns: we use 3 sets of sample images to detect and match 3 logos of different structure to find out which combinations of keypoint detector/feature descriptors work well. A real-world test tries to detect vehicles with a distinctive logo in an outdoor environment under realistic lighting and weather conditions: a camera was mounted on a suitable location for observing the entrance to a parking area so that incoming vehicles could be monitored. In this 2 hour long recording we can successfully detect a specific company logo without false positives.
Mixture of learners for cancer stem cell detection using CD13 and H and E stained images

NASA Astrophysics Data System (ADS)

Oǧuz, Oǧuzhan; Akbaş, Cem Emre; Mallah, Maen; Taşdemir, Kasım.; Akhan Güzelcan, Ece; Muenzenmayer, Christian; Wittenberg, Thomas; Üner, Ayşegül; Cetin, A. E.; ćetin Atalay, Rengül

2016-03-01

In this article, algorithms for cancer stem cell (CSC) detection in liver cancer tissue images are developed. Conventionally, a pathologist examines of cancer cell morphologies under microscope. Computer aided diagnosis systems (CAD) aims to help pathologists in this tedious and repetitive work. The first algorithm locates CSCs in CD13 stained liver tissue images. The method has also an online learning algorithm to improve the accuracy of detection. The second family of algorithms classify the cancer tissues stained with H and E which is clinically routine and cost effective than immunohistochemistry (IHC) procedure. The algorithms utilize 1D-SIFT and Eigen-analysis based feature sets as descriptors. Normal and cancerous tissues can be classified with 92.1% accuracy in H and E stained images. Classification accuracy of low and high-grade cancerous tissue images is 70.4%. Therefore, this study paves the way for diagnosing the cancerous tissue and grading the level of it using H and E stained microscopic tissue images.
Fusion of Visible and Thermal Descriptors Using Genetic Algorithms for Face Recognition Systems.

PubMed

Hermosilla, Gabriel; Gallardo, Francisco; Farias, Gonzalo; San Martin, Cesar

2015-07-23

The aim of this article is to present a new face recognition system based on the fusion of visible and thermal features obtained from the most current local matching descriptors by maximizing face recognition rates through the use of genetic algorithms. The article considers a comparison of the performance of the proposed fusion methodology against five current face recognition methods and classic fusion techniques used commonly in the literature. These were selected by considering their performance in face recognition. The five local matching methods and the proposed fusion methodology are evaluated using the standard visible/thermal database, the Equinox database, along with a new database, the PUCV-VTF, designed for visible-thermal studies in face recognition and described for the first time in this work. The latter is created considering visible and thermal image sensors with different real-world conditions, such as variations in illumination, facial expression, pose, occlusion, etc. The main conclusions of this article are that two variants of the proposed fusion methodology surpass current face recognition methods and the classic fusion techniques reported in the literature, attaining recognition rates of over 97% and 99% for the Equinox and PUCV-VTF databases, respectively. The fusion methodology is very robust to illumination and expression changes, as it combines thermal and visible information efficiently by using genetic algorithms, thus allowing it to choose optimal face areas where one spectrum is more representative than the other.
Clustering-based Feature Learning on Variable Stars

NASA Astrophysics Data System (ADS)

Mackenzie, Cristóbal; Pichara, Karim; Protopapas, Pavlos

2016-04-01

The success of automatic classification of variable stars depends strongly on the lightcurve representation. Usually, lightcurves are represented as a vector of many descriptors designed by astronomers called features. These descriptors are expensive in terms of computing, require substantial research effort to develop, and do not guarantee a good classification. Today, lightcurve representation is not entirely automatic; algorithms must be designed and manually tuned up for every survey. The amounts of data that will be generated in the future mean astronomers must develop scalable and automated analysis pipelines. In this work we present a feature learning algorithm designed for variable objects. Our method works by extracting a large number of lightcurve subsequences from a given set, which are then clustered to find common local patterns in the time series. Representatives of these common patterns are then used to transform lightcurves of a labeled set into a new representation that can be used to train a classifier. The proposed algorithm learns the features from both labeled and unlabeled lightcurves, overcoming the bias using only labeled data. We test our method on data sets from the Massive Compact Halo Object survey and the Optical Gravitational Lensing Experiment; the results show that our classification performance is as good as and in some cases better than the performance achieved using traditional statistical features, while the computational cost is significantly lower. With these promising results, we believe that our method constitutes a significant step toward the automation of the lightcurve classification pipeline.
Fusion of Visible and Thermal Descriptors Using Genetic Algorithms for Face Recognition Systems

PubMed Central

Hermosilla, Gabriel; Gallardo, Francisco; Farias, Gonzalo; San Martin, Cesar

2015-01-01

The aim of this article is to present a new face recognition system based on the fusion of visible and thermal features obtained from the most current local matching descriptors by maximizing face recognition rates through the use of genetic algorithms. The article considers a comparison of the performance of the proposed fusion methodology against five current face recognition methods and classic fusion techniques used commonly in the literature. These were selected by considering their performance in face recognition. The five local matching methods and the proposed fusion methodology are evaluated using the standard visible/thermal database, the Equinox database, along with a new database, the PUCV-VTF, designed for visible-thermal studies in face recognition and described for the first time in this work. The latter is created considering visible and thermal image sensors with different real-world conditions, such as variations in illumination, facial expression, pose, occlusion, etc. The main conclusions of this article are that two variants of the proposed fusion methodology surpass current face recognition methods and the classic fusion techniques reported in the literature, attaining recognition rates of over 97% and 99% for the Equinox and PUCV-VTF databases, respectively. The fusion methodology is very robust to illumination and expression changes, as it combines thermal and visible information efficiently by using genetic algorithms, thus allowing it to choose optimal face areas where one spectrum is more representative than the other. PMID:26213932
Automated Detection of P. falciparum Using Machine Learning Algorithms with Quantitative Phase Images of Unstained Cells.

PubMed

Park, Han Sang; Rinehart, Matthew T; Walzer, Katelyn A; Chi, Jen-Tsan Ashley; Wax, Adam

2016-01-01

Malaria detection through microscopic examination of stained blood smears is a diagnostic challenge that heavily relies on the expertise of trained microscopists. This paper presents an automated analysis method for detection and staging of red blood cells infected by the malaria parasite Plasmodium falciparum at trophozoite or schizont stage. Unlike previous efforts in this area, this study uses quantitative phase images of unstained cells. Erythrocytes are automatically segmented using thresholds of optical phase and refocused to enable quantitative comparison of phase images. Refocused images are analyzed to extract 23 morphological descriptors based on the phase information. While all individual descriptors are highly statistically different between infected and uninfected cells, each descriptor does not enable separation of populations at a level satisfactory for clinical utility. To improve the diagnostic capacity, we applied various machine learning techniques, including linear discriminant classification (LDC), logistic regression (LR), and k-nearest neighbor classification (NNC), to formulate algorithms that combine all of the calculated physical parameters to distinguish cells more effectively. Results show that LDC provides the highest accuracy of up to 99.7% in detecting schizont stage infected cells compared to uninfected RBCs. NNC showed slightly better accuracy (99.5%) than either LDC (99.0%) or LR (99.1%) for discriminating late trophozoites from uninfected RBCs. However, for early trophozoites, LDC produced the best accuracy of 98%. Discrimination of infection stage was less accurate, producing high specificity (99.8%) but only 45.0%-66.8% sensitivity with early trophozoites most often mistaken for late trophozoite or schizont stage and late trophozoite and schizont stage most often confused for each other. Overall, this methodology points to a significant clinical potential of using quantitative phase imaging to detect and stage malaria infection without staining or expert analysis.
Automated Detection of P. falciparum Using Machine Learning Algorithms with Quantitative Phase Images of Unstained Cells

PubMed Central

Park, Han Sang; Rinehart, Matthew T.; Walzer, Katelyn A.; Chi, Jen-Tsan Ashley; Wax, Adam

2016-01-01

Malaria detection through microscopic examination of stained blood smears is a diagnostic challenge that heavily relies on the expertise of trained microscopists. This paper presents an automated analysis method for detection and staging of red blood cells infected by the malaria parasite Plasmodium falciparum at trophozoite or schizont stage. Unlike previous efforts in this area, this study uses quantitative phase images of unstained cells. Erythrocytes are automatically segmented using thresholds of optical phase and refocused to enable quantitative comparison of phase images. Refocused images are analyzed to extract 23 morphological descriptors based on the phase information. While all individual descriptors are highly statistically different between infected and uninfected cells, each descriptor does not enable separation of populations at a level satisfactory for clinical utility. To improve the diagnostic capacity, we applied various machine learning techniques, including linear discriminant classification (LDC), logistic regression (LR), and k-nearest neighbor classification (NNC), to formulate algorithms that combine all of the calculated physical parameters to distinguish cells more effectively. Results show that LDC provides the highest accuracy of up to 99.7% in detecting schizont stage infected cells compared to uninfected RBCs. NNC showed slightly better accuracy (99.5%) than either LDC (99.0%) or LR (99.1%) for discriminating late trophozoites from uninfected RBCs. However, for early trophozoites, LDC produced the best accuracy of 98%. Discrimination of infection stage was less accurate, producing high specificity (99.8%) but only 45.0%-66.8% sensitivity with early trophozoites most often mistaken for late trophozoite or schizont stage and late trophozoite and schizont stage most often confused for each other. Overall, this methodology points to a significant clinical potential of using quantitative phase imaging to detect and stage malaria infection without staining or expert analysis. PMID:27636719
MBR-SIFT: A mirror reflected invariant feature descriptor using a binary representation for image matching.

PubMed

Su, Mingzhe; Ma, Yan; Zhang, Xiangfen; Wang, Yan; Zhang, Yuping

2017-01-01

The traditional scale invariant feature transform (SIFT) method can extract distinctive features for image matching. However, it is extremely time-consuming in SIFT matching because of the use of the Euclidean distance measure. Recently, many binary SIFT (BSIFT) methods have been developed to improve matching efficiency; however, none of them is invariant to mirror reflection. To address these problems, in this paper, we present a horizontal or vertical mirror reflection invariant binary descriptor named MBR-SIFT, in addition to a novel image matching approach. First, 16 cells in the local region around the SIFT keypoint are reorganized, and then the 128-dimensional vector of the SIFT descriptor is transformed into a reconstructed vector according to eight directions. Finally, the MBR-SIFT descriptor is obtained after binarization and reverse coding. To improve the matching speed and accuracy, a fast matching algorithm that includes a coarse-to-fine two-step matching strategy in addition to two similarity measures for the MBR-SIFT descriptor are proposed. Experimental results on the UKBench dataset show that the proposed method not only solves the problem of mirror reflection, but also ensures desirable matching accuracy and speed.
MBR-SIFT: A mirror reflected invariant feature descriptor using a binary representation for image matching

PubMed Central

Su, Mingzhe; Ma, Yan; Zhang, Xiangfen; Wang, Yan; Zhang, Yuping

2017-01-01

The traditional scale invariant feature transform (SIFT) method can extract distinctive features for image matching. However, it is extremely time-consuming in SIFT matching because of the use of the Euclidean distance measure. Recently, many binary SIFT (BSIFT) methods have been developed to improve matching efficiency; however, none of them is invariant to mirror reflection. To address these problems, in this paper, we present a horizontal or vertical mirror reflection invariant binary descriptor named MBR-SIFT, in addition to a novel image matching approach. First, 16 cells in the local region around the SIFT keypoint are reorganized, and then the 128-dimensional vector of the SIFT descriptor is transformed into a reconstructed vector according to eight directions. Finally, the MBR-SIFT descriptor is obtained after binarization and reverse coding. To improve the matching speed and accuracy, a fast matching algorithm that includes a coarse-to-fine two-step matching strategy in addition to two similarity measures for the MBR-SIFT descriptor are proposed. Experimental results on the UKBench dataset show that the proposed method not only solves the problem of mirror reflection, but also ensures desirable matching accuracy and speed. PMID:28542537
A simple and robust classification tree for differentiation between benign and malignant lesions in MR-mammography.

PubMed

Baltzer, Pascal A T; Dietzel, Matthias; Kaiser, Werner A

2013-08-01

In the face of multiple available diagnostic criteria in MR-mammography (MRM), a practical algorithm for lesion classification is needed. Such an algorithm should be as simple as possible and include only important independent lesion features to differentiate benign from malignant lesions. This investigation aimed to develop a simple classification tree for differential diagnosis in MRM. A total of 1,084 lesions in standardised MRM with subsequent histological verification (648 malignant, 436 benign) were investigated. Seventeen lesion criteria were assessed by 2 readers in consensus. Classification analysis was performed using the chi-squared automatic interaction detection (CHAID) method. Results include the probability for malignancy for every descriptor combination in the classification tree. A classification tree incorporating 5 lesion descriptors with a depth of 3 ramifications (1, root sign; 2, delayed enhancement pattern; 3, border, internal enhancement and oedema) was calculated. Of all 1,084 lesions, 262 (40.4 %) and 106 (24.3 %) could be classified as malignant and benign with an accuracy above 95 %, respectively. Overall diagnostic accuracy was 88.4 %. The classification algorithm reduced the number of categorical descriptors from 17 to 5 (29.4 %), resulting in a high classification accuracy. More than one third of all lesions could be classified with accuracy above 95 %. • A practical algorithm has been developed to classify lesions found in MR-mammography. • A simple decision tree consisting of five criteria reaches high accuracy of 88.4 %. • Unique to this approach, each classification is associated with a diagnostic certainty. • Diagnostic certainty of greater than 95 % is achieved in 34 % of all cases.
Visualizing and enhancing a deep learning framework using patients age and gender for chest x-ray image retrieval

NASA Astrophysics Data System (ADS)

Anavi, Yaron; Kogan, Ilya; Gelbart, Elad; Geva, Ofer; Greenspan, Hayit

2016-03-01

We explore the combination of text metadata, such as patients' age and gender, with image-based features, for X-ray chest pathology image retrieval. We focus on a feature set extracted from a pre-trained deep convolutional network shown in earlier work to achieve state-of-the-art results. Two distance measures are explored: a descriptor-based measure, which computes the distance between image descriptors, and a classification-based measure, which performed by a comparison of the corresponding SVM classification probabilities. We show that retrieval results increase once the age and gender information combined with the features extracted from the last layers of the network, with best results using the classification-based scheme. Visualization of the X-ray data is presented by embedding the high dimensional deep learning features in a 2-D dimensional space while preserving the pairwise distances using the t-SNE algorithm. The 2-D visualization gives the unique ability to find groups of X-ray images that are similar to the query image and among themselves, which is a characteristic we do not see in a 1-D traditional ranking.
A Distributed Wireless Camera System for the Management of Parking Spaces.

PubMed

Vítek, Stanislav; Melničuk, Petr

2017-12-28

The importance of detection of parking space availability is still growing, particularly in major cities. This paper deals with the design of a distributed wireless camera system for the management of parking spaces, which can determine occupancy of the parking space based on the information from multiple cameras. The proposed system uses small camera modules based on Raspberry Pi Zero and computationally efficient algorithm for the occupancy detection based on the histogram of oriented gradients (HOG) feature descriptor and support vector machine (SVM) classifier. We have included information about the orientation of the vehicle as a supporting feature, which has enabled us to achieve better accuracy. The described solution can deliver occupancy information at the rate of 10 parking spaces per second with more than 90% accuracy in a wide range of conditions. Reliability of the implemented algorithm is evaluated with three different test sets which altogether contain over 700,000 samples of parking spaces.
Discovering collectively informative descriptors from high-throughput experiments

PubMed Central

2009-01-01

Background Improvements in high-throughput technology and its increasing use have led to the generation of many highly complex datasets that often address similar biological questions. Combining information from these studies can increase the reliability and generalizability of results and also yield new insights that guide future research. Results This paper describes a novel algorithm called BLANKET for symmetric analysis of two experiments that assess informativeness of descriptors. The experiments are required to be related only in that their descriptor sets intersect substantially and their definitions of case and control are consistent. From resulting lists of n descriptors ranked by informativeness, BLANKET determines shortlists of descriptors from each experiment, generally of different lengths p and q. For any pair of shortlists, four numbers are evident: the number of descriptors appearing in both shortlists, in exactly one shortlist, or in neither shortlist. From the associated contingency table, BLANKET computes Right Fisher Exact Test (RFET) values used as scores over a plane of possible pairs of shortlist lengths [1,2]. BLANKET then chooses a pair or pairs with RFET score less than a threshold; the threshold depends upon n and shortlist length limits and represents a quality of intersection achieved by less than 5% of random lists. Conclusions Researchers seek within a universe of descriptors some minimal subset that collectively and efficiently predicts experimental outcomes. Ideally, any smaller subset should be insufficient for reliable prediction and any larger subset should have little additional accuracy. As a method, BLANKET is easy to conceptualize and presents only moderate computational complexity. Many existing databases could be mined using BLANKET to suggest optimal sets of predictive descriptors. PMID:20021653
Improved Prediction of Blood-Brain Barrier Permeability Through Machine Learning with Combined Use of Molecular Property-Based Descriptors and Fingerprints.

PubMed

Yuan, Yaxia; Zheng, Fang; Zhan, Chang-Guo

2018-03-21

Blood-brain barrier (BBB) permeability of a compound determines whether the compound can effectively enter the brain. It is an essential property which must be accounted for in drug discovery with a target in the brain. Several computational methods have been used to predict the BBB permeability. In particular, support vector machine (SVM), which is a kernel-based machine learning method, has been used popularly in this field. For SVM training and prediction, the compounds are characterized by molecular descriptors. Some SVM models were based on the use of molecular property-based descriptors (including 1D, 2D, and 3D descriptors) or fragment-based descriptors (known as the fingerprints of a molecule). The selection of descriptors is critical for the performance of a SVM model. In this study, we aimed to develop a generally applicable new SVM model by combining all of the features of the molecular property-based descriptors and fingerprints to improve the accuracy for the BBB permeability prediction. The results indicate that our SVM model has improved accuracy compared to the currently available models of the BBB permeability prediction.
A Global Covariance Descriptor for Nuclear Atypia Scoring in Breast Histopathology Images.

PubMed

Khan, Adnan Mujahid; Sirinukunwattana, Korsuk; Rajpoot, Nasir

2015-09-01

Nuclear atypia scoring is a diagnostic measure commonly used to assess tumor grade of various cancers, including breast cancer. It provides a quantitative measure of deviation in visual appearance of cell nuclei from those in normal epithelial cells. In this paper, we present a novel image-level descriptor for nuclear atypia scoring in breast cancer histopathology images. The method is based on the region covariance descriptor that has recently become a popular method in various computer vision applications. The descriptor in its original form is not suitable for classification of histopathology images as cancerous histopathology images tend to possess diversely heterogeneous regions in a single field of view. Our proposed image-level descriptor, which we term as the geodesic mean of region covariance descriptors, possesses all the attractive properties of covariance descriptors lending itself to tractable geodesic-distance-based k-nearest neighbor classification using efficient kernels. The experimental results suggest that the proposed image descriptor yields high classification accuracy compared to a variety of widely used image-level descriptors.
Deep-Learning-Based Drug-Target Interaction Prediction.

PubMed

Wen, Ming; Zhang, Zhimin; Niu, Shaoyu; Sha, Haozhi; Yang, Ruihan; Yun, Yonghuan; Lu, Hongmei

2017-04-07

Identifying interactions between known drugs and targets is a major challenge in drug repositioning. In silico prediction of drug-target interaction (DTI) can speed up the expensive and time-consuming experimental work by providing the most potent DTIs. In silico prediction of DTI can also provide insights about the potential drug-drug interaction and promote the exploration of drug side effects. Traditionally, the performance of DTI prediction depends heavily on the descriptors used to represent the drugs and the target proteins. In this paper, to accurately predict new DTIs between approved drugs and targets without separating the targets into different classes, we developed a deep-learning-based algorithmic framework named DeepDTIs. It first abstracts representations from raw input descriptors using unsupervised pretraining and then applies known label pairs of interaction to build a classification model. Compared with other methods, it is found that DeepDTIs reaches or outperforms other state-of-the-art methods. The DeepDTIs can be further used to predict whether a new drug targets to some existing targets or whether a new target interacts with some existing drugs.
Alignment-independent comparison of binding sites based on DrugScore potential fields encoded by 3D Zernike descriptors.

PubMed

Nisius, Britta; Gohlke, Holger

2012-09-24

Analyzing protein binding sites provides detailed insights into the biological processes proteins are involved in, e.g., into drug-target interactions, and so is of crucial importance in drug discovery. Herein, we present novel alignment-independent binding site descriptors based on DrugScore potential fields. The potential fields are transformed to a set of information-rich descriptors using a series expansion in 3D Zernike polynomials. The resulting Zernike descriptors show a promising performance in detecting similarities among proteins with low pairwise sequence identities that bind identical ligands, as well as within subfamilies of one target class. Furthermore, the Zernike descriptors are robust against structural variations among protein binding sites. Finally, the Zernike descriptors show a high data compression power, and computing similarities between binding sites based on these descriptors is highly efficient. Consequently, the Zernike descriptors are a useful tool for computational binding site analysis, e.g., to predict the function of novel proteins, off-targets for drug candidates, or novel targets for known drugs.

A new method for shape and texture classification of orthopedic wear nanoparticles.

PubMed

Zhang, Dongning; Page, Janet R; Kavanaugh, Aaron E; Billi, Fabrizio

2012-09-27

Detailed morphologic analysis of particles produced during wear of orthopedic implants is important in determining a correlation among material, wear, and biological effects. However, the use of simple shape descriptors is insufficient to categorize the data and to compare the nature of wear particles generated by different implants. An approach based on Discrete Fourier Transform (DFT) is presented for describing particle shape and surface texture. Four metal-on-metal bearing couples were tested in an orbital wear simulator under standard and adverse (steep-angled cups) wear simulator conditions. Digitized Scanning Electron Microscope (SEM) images of the wear particles were imported into MATLAB to carry out Fourier descriptor calculations via a specifically developed algorithm. The descriptors were then used for studying particle characteristics (shape and texture) as well as for cluster classification. Analysis of the particles demonstrated the validity of the proposed model by showing that steep-angle Co-Cr wear particles were more asymmetric, compressed, extended, triangular, square, and roughened at 3 Mc than after 0.25 Mc. In contrast, particles from standard angle samples were only more compressed and extended after 3 Mc compared to 0.25 Mc. Cluster analysis revealed that the 0.25 Mc steep-angle particle distribution was a subset of the 3 Mc distribution.
Cross-indexing of binary SIFT codes for large-scale image search.

PubMed

Liu, Zhen; Li, Houqiang; Zhang, Liyan; Zhou, Wengang; Tian, Qi

2014-05-01

In recent years, there has been growing interest in mapping visual features into compact binary codes for applications on large-scale image collections. Encoding high-dimensional data as compact binary codes reduces the memory cost for storage. Besides, it benefits the computational efficiency since the computation of similarity can be efficiently measured by Hamming distance. In this paper, we propose a novel flexible scale invariant feature transform (SIFT) binarization (FSB) algorithm for large-scale image search. The FSB algorithm explores the magnitude patterns of SIFT descriptor. It is unsupervised and the generated binary codes are demonstrated to be dispreserving. Besides, we propose a new searching strategy to find target features based on the cross-indexing in the binary SIFT space and original SIFT space. We evaluate our approach on two publicly released data sets. The experiments on large-scale partial duplicate image retrieval system demonstrate the effectiveness and efficiency of the proposed algorithm.
Structure-based predictions of 13C-NMR chemical shifts for a series of 2-functionalized 5-(methylsulfonyl)-1-phenyl-1H-indoles derivatives using GA-based MLR method

NASA Astrophysics Data System (ADS)

Ghavami, Raouf; Sadeghi, Faridoon; Rasouli, Zolikha; Djannati, Farhad

2012-12-01

Experimental values for the 13C NMR chemical shifts (ppm, TMS = 0) at 300 K ranging from 96.28 ppm (C4' of indole derivative 17) to 159.93 ppm (C4' of indole derivative 23) relative to deuteride chloroform (CDCl3, 77.0 ppm) or dimethylsulfoxide (DMSO, 39.50 ppm) as internal reference in CDCl3 or DMSO-d6 solutions have been collected from literature for thirty 2-functionalized 5-(methylsulfonyl)-1-phenyl-1H-indole derivatives containing different substituted groups. An effective quantitative structure-property relationship (QSPR) models were built using hybrid method combining genetic algorithm (GA) based on stepwise selection multiple linear regression (SWS-MLR) as feature-selection tools and correlation models between each carbon atom of indole derivative and calculated descriptors. Each compound was depicted by molecular structural descriptors that encode constitutional, topological, geometrical, electrostatic, and quantum chemical features. The accuracy of all developed models were confirmed using different types of internal and external procedures and various statistical tests. Furthermore, the domain of applicability for each model which indicates the area of reliable predictions was defined.
Statistical Analysis of Interactive Surgical Planning Using Shape Descriptors in Mandibular Reconstruction with Fibular Segments

PubMed Central

2016-01-01

This study was performed to quantitatively analyze medical knowledge of, and experience with, decision-making in preoperative virtual planning of mandibular reconstruction. Three shape descriptors were designed to evaluate local differences between reconstructed mandibles and patients’ original mandibles. We targeted an asymmetrical, wide range of cutting areas including the mandibular sidepiece, and defined a unique three-dimensional coordinate system for each mandibular image. The generalized algorithms for computing the shape descriptors were integrated into interactive planning software, where the user can refine the preoperative plan using the spatial map of the local shape distance as a visual guide. A retrospective study was conducted with two oral surgeons and two dental technicians using the developed software. The obtained 120 reconstruction plans show that the participants preferred a moderate shape distance rather than optimization to the smallest. We observed that a visually plausible shape could be obtained when considering specific anatomical features (e.g., mental foramen. mandibular midline). The proposed descriptors can be used to multilaterally evaluate reconstruction plans and systematically learn surgical procedures. PMID:27583465
QuBiLS-MIDAS: a parallel free-software for molecular descriptors computation based on multilinear algebraic maps.

PubMed

García-Jacas, César R; Marrero-Ponce, Yovani; Acevedo-Martínez, Liesner; Barigye, Stephen J; Valdés-Martiní, José R; Contreras-Torres, Ernesto

2014-07-05

The present report introduces the QuBiLS-MIDAS software belonging to the ToMoCoMD-CARDD suite for the calculation of three-dimensional molecular descriptors (MDs) based on the two-linear (bilinear), three-linear, and four-linear (multilinear or N-linear) algebraic forms. Thus, it is unique software that computes these tensor-based indices. These descriptors, establish relations for two, three, and four atoms by using several (dis-)similarity metrics or multimetrics, matrix transformations, cutoffs, local calculations and aggregation operators. The theoretical background of these N-linear indices is also presented. The QuBiLS-MIDAS software was developed in the Java programming language and employs the Chemical Development Kit library for the manipulation of the chemical structures and the calculation of the atomic properties. This software is composed by a desktop user-friendly interface and an Abstract Programming Interface library. The former was created to simplify the configuration of the different options of the MDs, whereas the library was designed to allow its easy integration to other software for chemoinformatics applications. This program provides functionalities for data cleaning tasks and for batch processing of the molecular indices. In addition, it offers parallel calculation of the MDs through the use of all available processors in current computers. The studies of complexity of the main algorithms demonstrate that these were efficiently implemented with respect to their trivial implementation. Lastly, the performance tests reveal that this software has a suitable behavior when the amount of processors is increased. Therefore, the QuBiLS-MIDAS software constitutes a useful application for the computation of the molecular indices based on N-linear algebraic maps and it can be used freely to perform chemoinformatics studies. Copyright © 2014 Wiley Periodicals, Inc.
Reproducibility of the NEPTUNE descriptor-based scoring system on whole-slide images and histologic and ultrastructural digital images.

PubMed

Barisoni, Laura; Troost, Jonathan P; Nast, Cynthia; Bagnasco, Serena; Avila-Casado, Carmen; Hodgin, Jeffrey; Palmer, Matthew; Rosenberg, Avi; Gasim, Adil; Liensziewski, Chrysta; Merlino, Lino; Chien, Hui-Ping; Chang, Anthony; Meehan, Shane M; Gaut, Joseph; Song, Peter; Holzman, Lawrence; Gibson, Debbie; Kretzler, Matthias; Gillespie, Brenda W; Hewitt, Stephen M

2016-07-01

The multicenter Nephrotic Syndrome Study Network (NEPTUNE) digital pathology scoring system employs a novel and comprehensive methodology to document pathologic features from whole-slide images, immunofluorescence and ultrastructural digital images. To estimate inter- and intra-reader concordance of this descriptor-based approach, data from 12 pathologists (eight NEPTUNE and four non-NEPTUNE) with experience from training to 30 years were collected. A descriptor reference manual was generated and a webinar-based protocol for consensus/cross-training implemented. Intra-reader concordance for 51 glomerular descriptors was evaluated on jpeg images by seven NEPTUNE pathologists scoring 131 glomeruli three times (Tests I, II, and III), each test following a consensus webinar review. Inter-reader concordance of glomerular descriptors was evaluated in 315 glomeruli by all pathologists; interstitial fibrosis and tubular atrophy (244 cases, whole-slide images) and four ultrastructural podocyte descriptors (178 cases, jpeg images) were evaluated once by six and five pathologists, respectively. Cohen's kappa for inter-reader concordance for 48/51 glomerular descriptors with sufficient observations was moderate (0.40
Phi-s correlation and dynamic time warping - Two methods for tracking ice floes in SAR images

NASA Technical Reports Server (NTRS)

Mcconnell, Ross; Kober, Wolfgang; Kwok, Ronald; Curlander, John C.; Pang, Shirley S.

1991-01-01

The authors present two algorithms for performing shape matching on ice floe boundaries in SAR (synthetic aperture radar) images. These algorithms quickly produce a set of ice motion and rotation vectors that can be used to guide a pixel value correlator. The algorithms match a shape descriptor known as the Phi-s curve. The first algorithm uses normalized correlation to match the Phi-s curves, while the second uses dynamic programming to compute an elastic match that better accommodates ice floe deformation. Some empirical data on the performance of the algorithms on Seasat SAR images are presented.
PyBioMed: a python library for various molecular representations of chemicals, proteins and DNAs and their interactions.

PubMed

Dong, Jie; Yao, Zhi-Jiang; Zhang, Lin; Luo, Feijun; Lin, Qinlu; Lu, Ai-Ping; Chen, Alex F; Cao, Dong-Sheng

2018-03-20

With the increasing development of biotechnology and informatics technology, publicly available data in chemistry and biology are undergoing explosive growth. Such wealthy information in these data needs to be extracted and transformed to useful knowledge by various data mining methods. Considering the amazing rate at which data are accumulated in chemistry and biology fields, new tools that process and interpret large and complex interaction data are increasingly important. So far, there are no suitable toolkits that can effectively link the chemical and biological space in view of molecular representation. To further explore these complex data, an integrated toolkit for various molecular representation is urgently needed which could be easily integrated with data mining algorithms to start a full data analysis pipeline. Herein, the python library PyBioMed is presented, which comprises functionalities for online download for various molecular objects by providing different IDs, the pretreatment of molecular structures, the computation of various molecular descriptors for chemicals, proteins, DNAs and their interactions. PyBioMed is a feature-rich and highly customized python library used for the characterization of various complex chemical and biological molecules and interaction samples. The current version of PyBioMed could calculate 775 chemical descriptors and 19 kinds of chemical fingerprints, 9920 protein descriptors based on protein sequences, more than 6000 DNA descriptors from nucleotide sequences, and interaction descriptors from pairwise samples using three different combining strategies. Several examples and five real-life applications were provided to clearly guide the users how to use PyBioMed as an integral part of data analysis projects. By using PyBioMed, users are able to start a full pipelining from getting molecular data, pretreating molecules, molecular representation to constructing machine learning models conveniently. PyBioMed provides various user-friendly and highly customized APIs to calculate various features of biological molecules and complex interaction samples conveniently, which aims at building integrated analysis pipelines from data acquisition, data checking, and descriptor calculation to modeling. PyBioMed is freely available at http://projects.scbdd.com/pybiomed.html .
Hybrid Histogram Descriptor: A Fusion Feature Representation for Image Retrieval.

PubMed

Feng, Qinghe; Hao, Qiaohong; Chen, Yuqi; Yi, Yugen; Wei, Ying; Dai, Jiangyan

2018-06-15

Currently, visual sensors are becoming increasingly affordable and fashionable, acceleratingly the increasing number of image data. Image retrieval has attracted increasing interest due to space exploration, industrial, and biomedical applications. Nevertheless, designing effective feature representation is acknowledged as a hard yet fundamental issue. This paper presents a fusion feature representation called a hybrid histogram descriptor (HHD) for image retrieval. The proposed descriptor comprises two histograms jointly: a perceptually uniform histogram which is extracted by exploiting the color and edge orientation information in perceptually uniform regions; and a motif co-occurrence histogram which is acquired by calculating the probability of a pair of motif patterns. To evaluate the performance, we benchmarked the proposed descriptor on RSSCN7, AID, Outex-00013, Outex-00014 and ETHZ-53 datasets. Experimental results suggest that the proposed descriptor is more effective and robust than ten recent fusion-based descriptors under the content-based image retrieval framework. The computational complexity was also analyzed to give an in-depth evaluation. Furthermore, compared with the state-of-the-art convolutional neural network (CNN)-based descriptors, the proposed descriptor also achieves comparable performance, but does not require any training process.
Multi-texture local ternary pattern for face recognition

NASA Astrophysics Data System (ADS)

Essa, Almabrok; Asari, Vijayan

2017-05-01

In imagery and pattern analysis domain a variety of descriptors have been proposed and employed for different computer vision applications like face detection and recognition. Many of them are affected under different conditions during the image acquisition process such as variations in illumination and presence of noise, because they totally rely on the image intensity values to encode the image information. To overcome these problems, a novel technique named Multi-Texture Local Ternary Pattern (MTLTP) is proposed in this paper. MTLTP combines the edges and corners based on the local ternary pattern strategy to extract the local texture features of the input image. Then returns a spatial histogram feature vector which is the descriptor for each image that we use to recognize a human being. Experimental results using a k-nearest neighbors classifier (k-NN) on two publicly available datasets justify our algorithm for efficient face recognition in the presence of extreme variations of illumination/lighting environments and slight variation of pose conditions.
Food product design: emerging evidence for food policy.

PubMed

Al-Hamdani, Mohammed; Smith, Steven

2017-03-01

The research on the impact of specific brand elements such as food descriptors and package colors is underexplored. We tested whether a "light" color and a "low-calorie" descriptor on food packages gain favorable consumer perception ratings as compared with regular packages. Our online experiment recruited 406 adults in a 3 (product type: Chips versus Juice versus Yoghurt) × 2 (descriptor type: regular versus low-calorie) × 2 (color type: regular versus light) mixed design. Dependent variables were sensory (evaluations of the product's nutritional value and quality), product-based (evaluations of the product's physical appeal), and consumer-based (evaluations of the potential consumers of the product) scales. "Low-calorie" descriptors were found to increase sensory ratings as compared with regular descriptors and light-colored packages received higher product-based ratings as compared with their regular-colored counterparts. Food package color and descriptors present a promising venue for understanding preventative measures against obesity.[Formula: see text].
Localization of the lumbar discs using machine learning and exact probabilistic inference.

PubMed

Oktay, Ayse Betul; Akgul, Yusuf Sinan

2011-01-01

We propose a novel fully automatic approach to localize the lumbar intervertebral discs in MR images with PHOG based SVM and a probabilistic graphical model. At the local level, our method assigns a score to each pixel in target image that indicates whether it is a disc center or not. At the global level, we define a chain-like graphical model that represents the lumbar intervertebral discs and we use an exact inference algorithm to localize the discs. Our main contributions are the employment of the SVM with the PHOG based descriptor which is robust against variations of the discs and a graphical model that reflects the linear nature of the vertebral column. Our inference algorithm runs in polynomial time and produces globally optimal results. The developed system is validated on a real spine MRI dataset and the final localization results are favorable compared to the results reported in the literature.
Cell segmentation in time-lapse fluorescence microscopy with temporally varying sub-cellular fusion protein patterns.

PubMed

Bunyak, Filiz; Palaniappan, Kannappan; Chagin, Vadim; Cardoso, M

2009-01-01

Fluorescently tagged proteins such as GFP-PCNA produce rich dynamically varying textural patterns of foci distributed in the nucleus. This enables the behavioral study of sub-cellular structures during different phases of the cell cycle. The varying punctuate patterns of fluorescence, drastic changes in SNR, shape and position during mitosis and abundance of touching cells, however, require more sophisticated algorithms for reliable automatic cell segmentation and lineage analysis. Since the cell nuclei are non-uniform in appearance, a distribution-based modeling of foreground classes is essential. The recently proposed graph partitioning active contours (GPAC) algorithm supports region descriptors and flexible distance metrics. We extend GPAC for fluorescence-based cell segmentation using regional density functions and dramatically improve its efficiency for segmentation from O(N(4)) to O(N(2)), for an image with N(2) pixels, making it practical and scalable for high throughput microscopy imaging studies.
Assessment of Beer Quality Based on a Robotic Pourer, Computer Vision, and Machine Learning Algorithms Using Commercial Beers.

PubMed

Gonzalez Viejo, Claudia; Fuentes, Sigfredo; Torrico, Damir D; Howell, Kate; Dunshea, Frank R

2018-05-01

Sensory attributes of beer are directly linked to perceived foam-related parameters and beer color. The aim of this study was to develop an objective predictive model using machine learning modeling to assess the intensity levels of sensory descriptors in beer using the physical measurements of color and foam-related parameters. A robotic pourer (RoboBEER), was used to obtain 15 color and foam-related parameters from 22 different commercial beer samples. A sensory session using quantitative descriptive analysis (QDA ® ) with trained panelists was conducted to assess the intensity of 10 beer descriptors. Results showed that the principal component analysis explained 64% of data variability with correlations found between foam-related descriptors from sensory and RoboBEER such as the positive and significant correlation between carbon dioxide and carbonation mouthfeel (R = 0.62), correlation of viscosity to sensory, and maximum volume of foam and total lifetime of foam (R = 0.75, R = 0.77, respectively). Using the RoboBEER parameters as inputs, an artificial neural network (ANN) regression model showed high correlation (R = 0.91) to predict the intensity levels of 10 related sensory descriptors such as yeast, grains and hops aromas, hops flavor, bitter, sour and sweet tastes, viscosity, carbonation, and astringency. This paper is a novel approach for food science using machine modeling techniques that could contribute significantly to rapid screenings of food and brewage products for the food industry and the implementation of Artificial Intelligence (AI). The use of RoboBEER to assess beer quality showed to be a reliable, objective, accurate, and less time-consuming method to predict sensory descriptors compared to trained sensory panels. Hence, this method could be useful as a rapid screening procedure to evaluate beer quality at the end of the production line for industry applications. © 2018 Institute of Food Technologists®.
Use of in Vitro HTS-Derived Concentration–Response Data as Biological Descriptors Improves the Accuracy of QSAR Models of in Vivo Toxicity

PubMed Central

Sedykh, Alexander; Zhu, Hao; Tang, Hao; Zhang, Liying; Richard, Ann; Rusyn, Ivan; Tropsha, Alexander

2011-01-01

Background Quantitative high-throughput screening (qHTS) assays are increasingly being used to inform chemical hazard identification. Hundreds of chemicals have been tested in dozens of cell lines across extensive concentration ranges by the National Toxicology Program in collaboration with the National Institutes of Health Chemical Genomics Center. Objectives Our goal was to test a hypothesis that dose–response data points of the qHTS assays can serve as biological descriptors of assayed chemicals and, when combined with conventional chemical descriptors, improve the accuracy of quantitative structure–activity relationship (QSAR) models applied to prediction of in vivo toxicity end points. Methods We obtained cell viability qHTS concentration–response data for 1,408 substances assayed in 13 cell lines from PubChem; for a subset of these compounds, rodent acute toxicity half-maximal lethal dose (LD50) data were also available. We used the k nearest neighbor classification and random forest QSAR methods to model LD50 data using chemical descriptors either alone (conventional models) or combined with biological descriptors derived from the concentration–response qHTS data (hybrid models). Critical to our approach was the use of a novel noise-filtering algorithm to treat qHTS data. Results Both the external classification accuracy and coverage (i.e., fraction of compounds in the external set that fall within the applicability domain) of the hybrid QSAR models were superior to conventional models. Conclusions Concentration–response qHTS data may serve as informative biological descriptors of molecules that, when combined with conventional chemical descriptors, may considerably improve the accuracy and utility of computational approaches for predicting in vivo animal toxicity end points. PMID:20980217
Modeling complex metabolic reactions, ecological systems, and financial and legal networks with MIANN models based on Markov-Wiener node descriptors.

PubMed

Duardo-Sánchez, Aliuska; Munteanu, Cristian R; Riera-Fernández, Pablo; López-Díaz, Antonio; Pazos, Alejandro; González-Díaz, Humberto

2014-01-27

The use of numerical parameters in Complex Network analysis is expanding to new fields of application. At a molecular level, we can use them to describe the molecular structure of chemical entities, protein interactions, or metabolic networks. However, the applications are not restricted to the world of molecules and can be extended to the study of macroscopic nonliving systems, organisms, or even legal or social networks. On the other hand, the development of the field of Artificial Intelligence has led to the formulation of computational algorithms whose design is based on the structure and functioning of networks of biological neurons. These algorithms, called Artificial Neural Networks (ANNs), can be useful for the study of complex networks, since the numerical parameters that encode information of the network (for example centralities/node descriptors) can be used as inputs for the ANNs. The Wiener index (W) is a graph invariant widely used in chemoinformatics to quantify the molecular structure of drugs and to study complex networks. In this work, we explore for the first time the possibility of using Markov chains to calculate analogues of node distance numbers/W to describe complex networks from the point of view of their nodes. These parameters are called Markov-Wiener node descriptors of order k(th) (W(k)). Please, note that these descriptors are not related to Markov-Wiener stochastic processes. Here, we calculated the W(k)(i) values for a very high number of nodes (>100,000) in more than 100 different complex networks using the software MI-NODES. These networks were grouped according to the field of application. Molecular networks include the Metabolic Reaction Networks (MRNs) of 40 different organisms. In addition, we analyzed other biological and legal and social networks. These include the Interaction Web Database Biological Networks (IWDBNs), with 75 food webs or ecological systems and the Spanish Financial Law Network (SFLN). The calculated W(k)(i) values were used as inputs for different ANNs in order to discriminate correct node connectivity patterns from incorrect random patterns. The MIANN models obtained present good values of Sensitivity/Specificity (%): MRNs (78/78), IWDBNs (90/88), and SFLN (86/84). These preliminary results are very promising from the point of view of a first exploratory study and suggest that the use of these models could be extended to the high-throughput re-evaluation of connectivity in known complex networks (collation).
Methods for investigating the local spatial anisotropy and the preferred orientation of cones in adaptive optics retinal images

PubMed Central

Cooper, Robert F.; Lombardo, Marco; Carroll, Joseph; Sloan, Kenneth R.; Lombardo, Giuseppe

2016-01-01

The ability to non-invasively image the cone photoreceptor mosaic holds significant potential as a diagnostic for retinal disease. Central to the realization of this potential is the development of sensitive metrics for characterizing the organization of the mosaic. Here we evaluated previously-described (Pum et al., 1990) and newly-developed (Fourier- and Radon-based) methods of measuring cone orientation in both simulated and real images of the parafoveal cone mosaic. The proposed algorithms correlated well across both simulated and real mosaics, suggesting that each algorithm would provide an accurate description of individual photoreceptor orientation. Despite the high agreement between algorithms, each performed differently in response to image intensity variation and cone coordinate jitter. The integration property of the Fourier transform allowed the Fourier-based method to be resistant to cone coordinate jitter and perform the most robustly of all three algorithms. Conversely, when there is good image quality but unreliable cone identification, the Radon algorithm performed best. Finally, in cases where both the image and cone coordinate reliability was excellent, the method of Pum et al. (1990) performed best. These descriptors are complementary to conventional descriptive metrics of the cone mosaic, such as cell density and spacing, and have the potential to aid in the detection of photoreceptor pathology. PMID:27484961
Fourier-based quantification of renal glomeruli size using Hough transform and shape descriptors.

PubMed

Najafian, Sohrab; Beigzadeh, Borhan; Riahi, Mohammad; Khadir Chamazkoti, Fatemeh; Pouramir, Mahdi

2017-11-01

Analysis of glomeruli geometry is important in histopathological evaluation of renal microscopic images. Due to the shape and size disparity of even glomeruli of same kidney, automatic detection of these renal objects is not an easy task. Although manual measurements are time consuming and at times are not very accurate, it is commonly used in medical centers. In this paper, a new method based on Fourier transform following usage of some shape descriptors is proposed to detect these objects and their geometrical parameters. Reaching the goal, a database of 400 regions are selected randomly. 200 regions of which are part of glomeruli and the other 200 regions are not belong to renal corpuscles. ROC curve is used to decide which descriptor could classify two groups better. f_measure, which is a combination of both tpr (true positive rate) and fpr (false positive rate), is also proposed to select optimal threshold for descriptors. Combination of three parameters (solidity, eccentricity, and also mean squared error of fitted ellipse) provided better result in terms of f_measure to distinguish desired regions. Then, Fourier transform of outer edges is calculated to form a complete curve out of separated region(s). The generality of proposed model is verified by use of cross validation method, which resulted tpr of 94%, and fpr of 5%. Calculation of glomerulus' and Bowman's space with use of the algorithm are also compared with a non-automatic measurement done by a renal pathologist, and errors of 5.9%, 5.4%, and 6.26% are resulted in calculation of Capsule area, Bowman space, and glomeruli area, respectively. Having tested different glomeruli with various shapes, the experimental consequences show robustness and reliability of our method. Therefore, it could be used to illustrate renal diseases and glomerular disorders by measuring the morphological changes accurately and expeditiously. Copyright © 2017 Elsevier B.V. All rights reserved.
Mobile visual object identification: from SIFT-BoF-RANSAC to Sketchprint

NASA Astrophysics Data System (ADS)

Voloshynovskiy, Sviatoslav; Diephuis, Maurits; Holotyak, Taras

2015-03-01

Mobile object identification based on its visual features find many applications in the interaction with physical objects and security. Discriminative and robust content representation plays a central role in object and content identification. Complex post-processing methods are used to compress descriptors and their geometrical information, aggregate them into more compact and discriminative representations and finally re-rank the results based on the similarity geometries of descriptors. Unfortunately, most of the existing descriptors are not very robust and discriminative once applied to the various contend such as real images, text or noise-like microstructures next to requiring at least 500-1'000 descriptors per image for reliable identification. At the same time, the geometric re-ranking procedures are still too complex to be applied to the numerous candidates obtained from the feature similarity based search only. This restricts that list of candidates to be less than 1'000 which obviously causes a higher probability of miss. In addition, the security and privacy of content representation has become a hot research topic in multimedia and security communities. In this paper, we introduce a new framework for non- local content representation based on SketchPrint descriptors. It extends the properties of local descriptors to a more informative and discriminative, yet geometrically invariant content representation. In particular it allows images to be compactly represented by 100 SketchPrint descriptors without being fully dependent on re-ranking methods. We consider several use cases, applying SketchPrint descriptors to natural images, text documents, packages and micro-structures and compare them with the traditional local descriptors.
Prediction of pKa Values for Neutral and Basic Drugs based on Hybrid Artificial Intelligence Methods.

PubMed

Li, Mengshan; Zhang, Huaijing; Chen, Bingsheng; Wu, Yan; Guan, Lixin

2018-03-05

The pKa value of drugs is an important parameter in drug design and pharmacology. In this paper, an improved particle swarm optimization (PSO) algorithm was proposed based on the population entropy diversity. In the improved algorithm, when the population entropy was higher than the set maximum threshold, the convergence strategy was adopted; when the population entropy was lower than the set minimum threshold the divergence strategy was adopted; when the population entropy was between the maximum and minimum threshold, the self-adaptive adjustment strategy was maintained. The improved PSO algorithm was applied in the training of radial basis function artificial neural network (RBF ANN) model and the selection of molecular descriptors. A quantitative structure-activity relationship model based on RBF ANN trained by the improved PSO algorithm was proposed to predict the pKa values of 74 kinds of neutral and basic drugs and then validated by another database containing 20 molecules. The validation results showed that the model had a good prediction performance. The absolute average relative error, root mean square error, and squared correlation coefficient were 0.3105, 0.0411, and 0.9685, respectively. The model can be used as a reference for exploring other quantitative structure-activity relationships.

Machine learning in materials informatics: recent applications and prospects

NASA Astrophysics Data System (ADS)

Ramprasad, Rampi; Batra, Rohit; Pilania, Ghanshyam; Mannodi-Kanakkithodi, Arun; Kim, Chiho

2017-12-01

Propelled partly by the Materials Genome Initiative, and partly by the algorithmic developments and the resounding successes of data-driven efforts in other domains, informatics strategies are beginning to take shape within materials science. These approaches lead to surrogate machine learning models that enable rapid predictions based purely on past data rather than by direct experimentation or by computations/simulations in which fundamental equations are explicitly solved. Data-centric informatics methods are becoming useful to determine material properties that are hard to measure or compute using traditional methods—due to the cost, time or effort involved—but for which reliable data either already exists or can be generated for at least a subset of the critical cases. Predictions are typically interpolative, involving fingerprinting a material numerically first, and then following a mapping (established via a learning algorithm) between the fingerprint and the property of interest. Fingerprints, also referred to as "descriptors", may be of many types and scales, as dictated by the application domain and needs. Predictions may also be extrapolative—extending into new materials spaces—provided prediction uncertainties are properly taken into account. This article attempts to provide an overview of some of the recent successful data-driven "materials informatics" strategies undertaken in the last decade, with particular emphasis on the fingerprint or descriptor choices. The review also identifies some challenges the community is facing and those that should be overcome in the near future.
Receptive fields selection for binary feature description.

PubMed

Fan, Bin; Kong, Qingqun; Trzcinski, Tomasz; Wang, Zhiheng; Pan, Chunhong; Fua, Pascal

2014-06-01

Feature description for local image patch is widely used in computer vision. While the conventional way to design local descriptor is based on expert experience and knowledge, learning-based methods for designing local descriptor become more and more popular because of their good performance and data-driven property. This paper proposes a novel data-driven method for designing binary feature descriptor, which we call receptive fields descriptor (RFD). Technically, RFD is constructed by thresholding responses of a set of receptive fields, which are selected from a large number of candidates according to their distinctiveness and correlations in a greedy way. Using two different kinds of receptive fields (namely rectangular pooling area and Gaussian pooling area) for selection, we obtain two binary descriptors RFDR and RFDG .accordingly. Image matching experiments on the well-known patch data set and Oxford data set demonstrate that RFD significantly outperforms the state-of-the-art binary descriptors, and is comparable with the best float-valued descriptors at a fraction of processing time. Finally, experiments on object recognition tasks confirm that both RFDR and RFDG successfully bridge the performance gap between binary descriptors and their floating-point competitors.
A Quantum Chemical and Statistical Study of Phenolic Schiff Bases with Antioxidant Activity against DPPH Free Radical

PubMed Central

Anouar, El Hassane

2014-01-01

Phenolic Schiff bases are known as powerful antioxidants. To select the electronic, 2D and 3D descriptors responsible for the free radical scavenging ability of a series of 30 phenolic Schiff bases, a set of molecular descriptors were calculated by using B3P86 (Becke’s three parameter hybrid functional with Perdew 86 correlation functional) combined with 6-31 + G(d,p) basis set (i.e., at the B3P86/6-31 + G(d,p) level of theory). The chemometric methods, simple and multiple linear regressions (SLR and MLR), principal component analysis (PCA) and hierarchical cluster analysis (HCA) were employed to reduce the dimensionality and to investigate the relationship between the calculated descriptors and the antioxidant activity. The results showed that the antioxidant activity mainly depends on the first and second bond dissociation enthalpies of phenolic hydroxyl groups, the dipole moment and the hydrophobicity descriptors. The antioxidant activity is inversely proportional to the main descriptors. The selected descriptors discriminate the Schiff bases into active and inactive antioxidants. PMID:26784873
A Distributed Wireless Camera System for the Management of Parking Spaces

PubMed Central

Melničuk, Petr

2017-01-01

The importance of detection of parking space availability is still growing, particularly in major cities. This paper deals with the design of a distributed wireless camera system for the management of parking spaces, which can determine occupancy of the parking space based on the information from multiple cameras. The proposed system uses small camera modules based on Raspberry Pi Zero and computationally efficient algorithm for the occupancy detection based on the histogram of oriented gradients (HOG) feature descriptor and support vector machine (SVM) classifier. We have included information about the orientation of the vehicle as a supporting feature, which has enabled us to achieve better accuracy. The described solution can deliver occupancy information at the rate of 10 parking spaces per second with more than 90% accuracy in a wide range of conditions. Reliability of the implemented algorithm is evaluated with three different test sets which altogether contain over 700,000 samples of parking spaces. PMID:29283371
Novel Uses of In Vitro Data to Develop Quantitative Biological Activity Relationship Models for in Vivo Carcinogenicity Prediction.

PubMed

Pradeep, Prachi; Povinelli, Richard J; Merrill, Stephen J; Bozdag, Serdar; Sem, Daniel S

2015-04-01

The availability of large in vitro datasets enables better insight into the mode of action of chemicals and better identification of potential mechanism(s) of toxicity. Several studies have shown that not all in vitro assays can contribute as equal predictors of in vivo carcinogenicity for development of hybrid Quantitative Structure Activity Relationship (QSAR) models. We propose two novel approaches for the use of mechanistically relevant in vitro assay data in the identification of relevant biological descriptors and development of Quantitative Biological Activity Relationship (QBAR) models for carcinogenicity prediction. We demonstrate that in vitro assay data can be used to develop QBAR models for in vivo carcinogenicity prediction via two case studies corroborated with firm scientific rationale. The case studies demonstrate the similarities between QBAR and QSAR modeling in: (i) the selection of relevant descriptors to be used in the machine learning algorithm, and (ii) the development of a computational model that maps chemical or biological descriptors to a toxic endpoint. The results of both the case studies show: (i) improved accuracy and sensitivity which is especially desirable under regulatory requirements, and (ii) overall adherence with the OECD/REACH guidelines. Such mechanism based models can be used along with QSAR models for prediction of mechanistically complex toxic endpoints. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Spherical harmonics coefficients for ligand-based virtual screening of cyclooxygenase inhibitors.

PubMed

Wang, Quan; Birod, Kerstin; Angioni, Carlo; Grösch, Sabine; Geppert, Tim; Schneider, Petra; Rupp, Matthias; Schneider, Gisbert

2011-01-01

Molecular descriptors are essential for many applications in computational chemistry, such as ligand-based similarity searching. Spherical harmonics have previously been suggested as comprehensive descriptors of molecular structure and properties. We investigate a spherical harmonics descriptor for shape-based virtual screening. We introduce and validate a partially rotation-invariant three-dimensional molecular shape descriptor based on the norm of spherical harmonics expansion coefficients. Using this molecular representation, we parameterize molecular surfaces, i.e., isosurfaces of spatial molecular property distributions. We validate the shape descriptor in a comprehensive retrospective virtual screening experiment. In a prospective study, we virtually screen a large compound library for cyclooxygenase inhibitors, using a self-organizing map as a pre-filter and the shape descriptor for candidate prioritization. 12 compounds were tested in vitro for direct enzyme inhibition and in a whole blood assay. Active compounds containing a triazole scaffold were identified as direct cyclooxygenase-1 inhibitors. This outcome corroborates the usefulness of spherical harmonics for representation of molecular shape in virtual screening of large compound collections. The combination of pharmacophore and shape-based filtering of screening candidates proved to be a straightforward approach to finding novel bioactive chemotypes with minimal experimental effort.
Advancing Bag-of-Visual-Words Representations for Lesion Classification in Retinal Images

PubMed Central

Pires, Ramon; Jelinek, Herbert F.; Wainer, Jacques; Valle, Eduardo; Rocha, Anderson

2014-01-01

Diabetic Retinopathy (DR) is a complication of diabetes that can lead to blindness if not readily discovered. Automated screening algorithms have the potential to improve identification of patients who need further medical attention. However, the identification of lesions must be accurate to be useful for clinical application. The bag-of-visual-words (BoVW) algorithm employs a maximum-margin classifier in a flexible framework that is able to detect the most common DR-related lesions such as microaneurysms, cotton-wool spots and hard exudates. BoVW allows to bypass the need for pre- and post-processing of the retinographic images, as well as the need of specific ad hoc techniques for identification of each type of lesion. An extensive evaluation of the BoVW model, using three large retinograph datasets (DR1, DR2 and Messidor) with different resolution and collected by different healthcare personnel, was performed. The results demonstrate that the BoVW classification approach can identify different lesions within an image without having to utilize different algorithms for each lesion reducing processing time and providing a more flexible diagnostic system. Our BoVW scheme is based on sparse low-level feature detection with a Speeded-Up Robust Features (SURF) local descriptor, and mid-level features based on semi-soft coding with max pooling. The best BoVW representation for retinal image classification was an area under the receiver operating characteristic curve (AUC-ROC) of 97.8% (exudates) and 93.5% (red lesions), applying a cross-dataset validation protocol. To assess the accuracy for detecting cases that require referral within one year, the sparse extraction technique associated with semi-soft coding and max pooling obtained an AUC of 94.22.0%, outperforming current methods. Those results indicate that, for retinal image classification tasks in clinical practice, BoVW is equal and, in some instances, surpasses results obtained using dense detection (widely believed to be the best choice in many vision problems) for the low-level descriptors. PMID:24886780
In silico prediction of toxicity of phenols to Tetrahymena pyriformis by using genetic algorithm and decision tree-based modeling approach.

PubMed

Abbasitabar, Fatemeh; Zare-Shahabadi, Vahid

2017-04-01

Risk assessment of chemicals is an important issue in environmental protection; however, there is a huge lack of experimental data for a large number of end-points. The experimental determination of toxicity of chemicals involves high costs and time-consuming process. In silico tools such as quantitative structure-toxicity relationship (QSTR) models, which are constructed on the basis of computational molecular descriptors, can predict missing data for toxic end-points for existing or even not yet synthesized chemicals. Phenol derivatives are known to be aquatic pollutants. With this background, we aimed to develop an accurate and reliable QSTR model for the prediction of toxicity of 206 phenols to Tetrahymena pyriformis. A multiple linear regression (MLR)-based QSTR was obtained using a powerful descriptor selection tool named Memorized_ACO algorithm. Statistical parameters of the model were 0.72 and 0.68 for R training 2 and R test 2 , respectively. To develop a high-quality QSTR model, classification and regression tree (CART) was employed. Two approaches were considered: (1) phenols were classified into different modes of action using CART and (2) the phenols in the training set were partitioned to several subsets by a tree in such a manner that in each subset, a high-quality MLR could be developed. For the first approach, the statistical parameters of the resultant QSTR model were improved to 0.83 and 0.75 for R training 2 and R test 2 , respectively. Genetic algorithm was employed in the second approach to obtain an optimal tree, and it was shown that the final QSTR model provided excellent prediction accuracy for the training and test sets (R training 2 and R test 2 were 0.91 and 0.93, respectively). The mean absolute error for the test set was computed as 0.1615. Copyright © 2016 Elsevier Ltd. All rights reserved.
Automatic orientation and 3D modelling from markerless rock art imagery

NASA Astrophysics Data System (ADS)

Lerma, J. L.; Navarro, S.; Cabrelles, M.; Seguí, A. E.; Hernández, D.

2013-02-01

This paper investigates the use of two detectors and descriptors on image pyramids for automatic image orientation and generation of 3D models. The detectors and descriptors replace manual measurements and are used to detect, extract and match features across multiple imagery. The Scale-Invariant Feature Transform (SIFT) and the Speeded Up Robust Features (SURF) will be assessed based on speed, number of features, matched features, and precision in image and object space depending on the adopted hierarchical matching scheme. The influence of applying in addition Area Based Matching (ABM) with normalised cross-correlation (NCC) and least squares matching (LSM) is also investigated. The pipeline makes use of photogrammetric and computer vision algorithms aiming minimum interaction and maximum accuracy from a calibrated camera. Both the exterior orientation parameters and the 3D coordinates in object space are sequentially estimated combining relative orientation, single space resection and bundle adjustment. The fully automatic image-based pipeline presented herein to automate the image orientation step of a sequence of terrestrial markerless imagery is compared with manual bundle block adjustment and terrestrial laser scanning (TLS) which serves as ground truth. The benefits of applying ABM after FBM will be assessed both in image and object space for the 3D modelling of a complex rock art shelter.
Visual Semantic Based 3D Video Retrieval System Using HDFS.

PubMed

Kumar, C Ranjith; Suguna, S

2016-08-01

This paper brings out a neoteric frame of reference for visual semantic based 3d video search and retrieval applications. Newfangled 3D retrieval application spotlight on shape analysis like object matching, classification and retrieval not only sticking up entirely with video retrieval. In this ambit, we delve into 3D-CBVR (Content Based Video Retrieval) concept for the first time. For this purpose, we intent to hitch on BOVW and Mapreduce in 3D framework. Instead of conventional shape based local descriptors, we tried to coalesce shape, color and texture for feature extraction. For this purpose, we have used combination of geometric & topological features for shape and 3D co-occurrence matrix for color and texture. After thriving extraction of local descriptors, TB-PCT (Threshold Based- Predictive Clustering Tree) algorithm is used to generate visual codebook and histogram is produced. Further, matching is performed using soft weighting scheme with L 2 distance function. As a final step, retrieved results are ranked according to the Index value and acknowledged to the user as a feedback .In order to handle prodigious amount of data and Efficacious retrieval, we have incorporated HDFS in our Intellection. Using 3D video dataset, we future the performance of our proposed system which can pan out that the proposed work gives meticulous result and also reduce the time intricacy.
Application of 3D Zernike descriptors to shape-based ligand similarity searching.

PubMed

Venkatraman, Vishwesh; Chakravarthy, Padmasini Ramji; Kihara, Daisuke

2009-12-17

The identification of promising drug leads from a large database of compounds is an important step in the preliminary stages of drug design. Although shape is known to play a key role in the molecular recognition process, its application to virtual screening poses significant hurdles both in terms of the encoding scheme and speed. In this study, we have examined the efficacy of the alignment independent three-dimensional Zernike descriptor (3DZD) for fast shape based similarity searching. Performance of this approach was compared with several other methods including the statistical moments based ultrafast shape recognition scheme (USR) and SIMCOMP, a graph matching algorithm that compares atom environments. Three benchmark datasets are used to thoroughly test the methods in terms of their ability for molecular classification, retrieval rate, and performance under the situation that simulates actual virtual screening tasks over a large pharmaceutical database. The 3DZD performed better than or comparable to the other methods examined, depending on the datasets and evaluation metrics used. Reasons for the success and the failure of the shape based methods for specific cases are investigated. Based on the results for the three datasets, general conclusions are drawn with regard to their efficiency and applicability. The 3DZD has unique ability for fast comparison of three-dimensional shape of compounds. Examples analyzed illustrate the advantages and the room for improvements for the 3DZD.
Application of 3D Zernike descriptors to shape-based ligand similarity searching

PubMed Central

2009-01-01

Background The identification of promising drug leads from a large database of compounds is an important step in the preliminary stages of drug design. Although shape is known to play a key role in the molecular recognition process, its application to virtual screening poses significant hurdles both in terms of the encoding scheme and speed. Results In this study, we have examined the efficacy of the alignment independent three-dimensional Zernike descriptor (3DZD) for fast shape based similarity searching. Performance of this approach was compared with several other methods including the statistical moments based ultrafast shape recognition scheme (USR) and SIMCOMP, a graph matching algorithm that compares atom environments. Three benchmark datasets are used to thoroughly test the methods in terms of their ability for molecular classification, retrieval rate, and performance under the situation that simulates actual virtual screening tasks over a large pharmaceutical database. The 3DZD performed better than or comparable to the other methods examined, depending on the datasets and evaluation metrics used. Reasons for the success and the failure of the shape based methods for specific cases are investigated. Based on the results for the three datasets, general conclusions are drawn with regard to their efficiency and applicability. Conclusion The 3DZD has unique ability for fast comparison of three-dimensional shape of compounds. Examples analyzed illustrate the advantages and the room for improvements for the 3DZD. PMID:20150998
Visual feature discrimination versus compression ratio for polygonal shape descriptors

NASA Astrophysics Data System (ADS)

Heuer, Joerg; Sanahuja, Francesc; Kaup, Andre

2000-10-01

In the last decade several methods for low level indexing of visual features appeared. Most often these were evaluated with respect to their discrimination power using measures like precision and recall. Accordingly, the targeted application was indexing of visual data within databases. During the standardization process of MPEG-7 the view on indexing of visual data changed, taking also communication aspects into account where coding efficiency is important. Even if the descriptors used for indexing are small compared to the size of images, it is recognized that there can be several descriptors linked to an image, characterizing different features and regions. Beside the importance of a small memory footprint for the transmission of the descriptor and the memory footprint in a database, eventually the search and filtering can be sped up by reducing the dimensionality of the descriptor if the metric of the matching can be adjusted. Based on a polygon shape descriptor presented for MPEG-7 this paper compares the discrimination power versus memory consumption of the descriptor. Different methods based on quantization are presented and their effect on the retrieval performance are measured. Finally an optimized computation of the descriptor is presented.
Automatic Marker-free Longitudinal Infrared Image Registration by Shape Context Based Matching and Competitive Winner-guided Optimal Corresponding

PubMed Central

Lee, Chia-Yen; Wang, Hao-Jen; Lai, Jhih-Hao; Chang, Yeun-Chung; Huang, Chiun-Sheng

2017-01-01

Long-term comparisons of infrared image can facilitate the assessment of breast cancer tissue growth and early tumor detection, in which longitudinal infrared image registration is a necessary step. However, it is hard to keep markers attached on a body surface for weeks, and rather difficult to detect anatomic fiducial markers and match them in the infrared image during registration process. The proposed study, automatic longitudinal infrared registration algorithm, develops an automatic vascular intersection detection method and establishes feature descriptors by shape context to achieve robust matching, as well as to obtain control points for the deformation model. In addition, competitive winner-guided mechanism is developed for optimal corresponding. The proposed algorithm is evaluated in two ways. Results show that the algorithm can quickly lead to accurate image registration and that the effectiveness is superior to manual registration with a mean error being 0.91 pixels. These findings demonstrate that the proposed registration algorithm is reasonably accurate and provide a novel method of extracting a greater amount of useful data from infrared images. PMID:28145474
Study on Low Illumination Simultaneous Polarization Image Registration Based on Improved SURF Algorithm

NASA Astrophysics Data System (ADS)

Zhang, Wanjun; Yang, Xu

2017-12-01

Registration of simultaneous polarization images is the premise of subsequent image fusion operations. However, in the process of shooting all-weather, the polarized camera exposure time need to be kept unchanged, sometimes polarization images under low illumination conditions due to too dark result in SURF algorithm can not extract feature points, thus unable to complete the registration, therefore this paper proposes an improved SURF algorithm. Firstly, the luminance operator is used to improve overall brightness of low illumination image, and then create integral image, using Hession matrix to extract the points of interest to get the main direction of characteristic points, calculate Haar wavelet response in X and Y directions to get the SURF descriptor information, then use the RANSAC function to make precise matching, the function can eliminate wrong matching points and improve accuracy rate. And finally resume the brightness of the polarized image after registration, the effect of the polarized image is not affected. Results show that the improved SURF algorithm can be applied well under low illumination conditions.
Infrared and visible images registration with adaptable local-global feature integration for rail inspection

NASA Astrophysics Data System (ADS)

Tang, Chaoqing; Tian, Gui Yun; Chen, Xiaotian; Wu, Jianbo; Li, Kongjing; Meng, Hongying

2017-12-01

Active thermography provides infrared images that contain sub-surface defect information, while visible images only reveal surface information. Mapping infrared information to visible images offers more comprehensive visualization for decision-making in rail inspection. However, the common information for registration is limited due to different modalities in both local and global level. For example, rail track which has low temperature contrast reveals rich details in visible images, but turns blurry in the infrared counterparts. This paper proposes a registration algorithm called Edge-Guided Speeded-Up-Robust-Features (EG-SURF) to address this issue. Rather than sequentially integrating local and global information in matching stage which suffered from buckets effect, this algorithm adaptively integrates local and global information into a descriptor to gather more common information before matching. This adaptability consists of two facets, an adaptable weighting factor between local and global information, and an adaptable main direction accuracy. The local information is extracted using SURF while the global information is represented by shape context from edges. Meanwhile, in shape context generation process, edges are weighted according to local scale and decomposed into bins using a vector decomposition manner to provide more accurate descriptor. The proposed algorithm is qualitatively and quantitatively validated using eddy current pulsed thermography scene in the experiments. In comparison with other algorithms, better performance has been achieved.
Vehicle Detection for RCTA/ANS (Autonomous Navigation System)

NASA Technical Reports Server (NTRS)

Brennan, Shane; Bajracharya, Max; Matthies, Larry H.; Howard, Andrew B.

2012-01-01

Using a stereo camera pair, imagery is acquired and processed through the JPLV stereo processing pipeline. From this stereo data, large 3D blobs are found. These blobs are then described and classified by their shape to determine which are vehicles and which are not. Prior vehicle detection algorithms are either targeted to specific domains, such as following lead cars, or are intensity- based methods that involve learning typical vehicle appearances from a large corpus of training data. In order to detect vehicles, the JPL Vehicle Detection (JVD) algorithm goes through the following steps: 1. Take as input a left disparity image and left rectified image from JPLV stereo. 2. Project the disparity data onto a two-dimensional Cartesian map. 3. Perform some post-processing of the map built in the previous step in order to clean it up. 4. Take the processed map and find peaks. For each peak, grow it out into a map blob. These map blobs represent large, roughly vehicle-sized objects in the scene. 5. Take these map blobs and reject those that do not meet certain criteria. Build descriptors for the ones that remain. Pass these descriptors onto a classifier, which determines if the blob is a vehicle or not. The probability of detection is the probability that if a vehicle is present in the image, is visible, and un-occluded, then it will be detected by the JVD algorithm. In order to estimate this probability, eight sequences were ground-truthed from the RCTA (Robotics Collaborative Technology Alliances) program, totaling over 4,000 frames with 15 unique vehicles. Since these vehicles were observed at varying ranges, one is able to find the probability of detection as a function of range. At the time of this reporting, the JVD algorithm was tuned to perform best at cars seen from the front, rear, or either side, and perform poorly on vehicles seen from oblique angles.
Algorithms for Autonomous Plume Detection on Outer Planet Satellites

NASA Astrophysics Data System (ADS)

Lin, Y.; Bunte, M. K.; Saripalli, S.; Greeley, R.

2011-12-01

We investigate techniques for automated detection of geophysical events (i.e., volcanic plumes) from spacecraft images. The algorithms presented here have not been previously applied to detection of transient events on outer planet satellites. We apply Scale Invariant Feature Transform (SIFT) to raw images of Io and Enceladus from the Voyager, Galileo, Cassini, and New Horizons missions. SIFT produces distinct interest points in every image; feature descriptors are reasonably invariant to changes in illumination, image noise, rotation, scaling, and small changes in viewpoint. We classified these descriptors as plumes using the k-nearest neighbor (KNN) algorithm. In KNN, an object is classified by its similarity to examples in a training set of images based on user defined thresholds. Using the complete database of Io images and a selection of Enceladus images where 1-3 plumes were manually detected in each image, we successfully detected 74% of plumes in Galileo and New Horizons images, 95% in Voyager images, and 93% in Cassini images. Preliminary tests yielded some false positive detections; further iterations will improve performance. In images where detections fail, plumes are less than 9 pixels in size or are lost in image glare. We compared the appearance of plumes and illuminated mountain slopes to determine the potential for feature classification. We successfully differentiated features. An advantage over other methods is the ability to detect plumes in non-limb views where they appear in the shadowed part of the surface; improvements will enable detection against the illuminated background surface where gradient changes would otherwise preclude detection. This detection method has potential applications to future outer planet missions for sustained plume monitoring campaigns and onboard automated prioritization of all spacecraft data. The complementary nature of this method is such that it could be used in conjunction with edge detection algorithms to increase effectiveness. We have demonstrated an ability to detect transient events above the planetary limb and on the surface and to distinguish feature classes in spacecraft images.
Adjacent bin stability evaluating for feature description

NASA Astrophysics Data System (ADS)

Nie, Dongdong; Ma, Qinyong

2018-04-01

Recent study improves descriptor performance by accumulating stability votes for all scale pairs to compose the local descriptor. We argue that the stability of a bin depends on the differences across adjacent pairs more than the differences across all scale pairs, and a new local descriptor is composed based on the hypothesis. A series of SIFT descriptors are extracted from multiple scales firstly. Then the difference value of the bin across adjacent scales is calculated, and the stability value of a bin is calculated based on it and accumulated to compose the final descriptor. The performance of the proposed method is evaluated with two popular matching datasets, and compared with other state-of-the-art works. Experimental results show that the proposed method performs satisfactorily.
A comparison between space-time video descriptors

NASA Astrophysics Data System (ADS)

Costantini, Luca; Capodiferro, Licia; Neri, Alessandro

2013-02-01

The description of space-time patches is a fundamental task in many applications such as video retrieval or classification. Each space-time patch can be described by using a set of orthogonal functions that represent a subspace, for example a sphere or a cylinder, within the patch. In this work, our aim is to investigate the differences between the spherical descriptors and the cylindrical descriptors. In order to compute the descriptors, the 3D spherical and cylindrical Zernike polynomials are employed. This is important because both the functions are based on the same family of polynomials, and only the symmetry is different. Our experimental results show that the cylindrical descriptor outperforms the spherical descriptor. However, the performances of the two descriptors are similar.

3D Riesz-wavelet based Covariance descriptors for texture classification of lung nodule tissue in CT.

PubMed

Cirujeda, Pol; Muller, Henning; Rubin, Daniel; Aguilera, Todd A; Loo, Billy W; Diehn, Maximilian; Binefa, Xavier; Depeursinge, Adrien

2015-01-01

In this paper we present a novel technique for characterizing and classifying 3D textured volumes belonging to different lung tissue types in 3D CT images. We build a volume-based 3D descriptor, robust to changes of size, rigid spatial transformations and texture variability, thanks to the integration of Riesz-wavelet features within a Covariance-based descriptor formulation. 3D Riesz features characterize the morphology of tissue density due to their response to changes in intensity in CT images. These features are encoded in a Covariance-based descriptor formulation: this provides a compact and flexible representation thanks to the use of feature variations rather than dense features themselves and adds robustness to spatial changes. Furthermore, the particular symmetric definite positive matrix form of these descriptors causes them to lay in a Riemannian manifold. Thus, descriptors can be compared with analytical measures, and accurate techniques from machine learning and clustering can be adapted to their spatial domain. Additionally we present a classification model following a "Bag of Covariance Descriptors" paradigm in order to distinguish three different nodule tissue types in CT: solid, ground-glass opacity, and healthy lung. The method is evaluated on top of an acquired dataset of 95 patients with manually delineated ground truth by radiation oncology specialists in 3D, and quantitative sensitivity and specificity values are presented.
Face recognition based on matching of local features on 3D dynamic range sequences

NASA Astrophysics Data System (ADS)

Echeagaray-Patrón, B. A.; Kober, Vitaly

2016-09-01

3D face recognition has attracted attention in the last decade due to improvement of technology of 3D image acquisition and its wide range of applications such as access control, surveillance, human-computer interaction and biometric identification systems. Most research on 3D face recognition has focused on analysis of 3D still data. In this work, a new method for face recognition using dynamic 3D range sequences is proposed. Experimental results are presented and discussed using 3D sequences in the presence of pose variation. The performance of the proposed method is compared with that of conventional face recognition algorithms based on descriptors.
Integrating fuzzy object based image analysis and ant colony optimization for road extraction from remotely sensed images

NASA Astrophysics Data System (ADS)

Maboudi, Mehdi; Amini, Jalal; Malihi, Shirin; Hahn, Michael

2018-04-01

Updated road network as a crucial part of the transportation database plays an important role in various applications. Thus, increasing the automation of the road extraction approaches from remote sensing images has been the subject of extensive research. In this paper, we propose an object based road extraction approach from very high resolution satellite images. Based on the object based image analysis, our approach incorporates various spatial, spectral, and textural objects' descriptors, the capabilities of the fuzzy logic system for handling the uncertainties in road modelling, and the effectiveness and suitability of ant colony algorithm for optimization of network related problems. Four VHR optical satellite images which are acquired by Worldview-2 and IKONOS satellites are used in order to evaluate the proposed approach. Evaluation of the extracted road networks shows that the average completeness, correctness, and quality of the results can reach 89%, 93% and 83% respectively, indicating that the proposed approach is applicable for urban road extraction. We also analyzed the sensitivity of our algorithm to different ant colony optimization parameter values. Comparison of the achieved results with the results of four state-of-the-art algorithms and quantifying the robustness of the fuzzy rule set demonstrate that the proposed approach is both efficient and transferable to other comparable images.
QSAR modeling of acute toxicity on mammals caused by aromatic compounds: the case study using oral LD50 for rats.

PubMed

Rasulev, Bakhtiyor; Kusić, Hrvoje; Leszczynska, Danuta; Leszczynski, Jerzy; Koprivanac, Natalija

2010-05-01

The goal of the study was to predict toxicity in vivo caused by aromatic compounds structured with a single benzene ring and the presence or absence of different substituent groups such as hydroxyl-, nitro-, amino-, methyl-, methoxy-, etc., by using QSAR/QSPR tools. A Genetic Algorithm and multiple regression analysis were applied to select the descriptors and to generate the correlation models. The most predictive model is shown to be the 3-variable model which also has a good ratio of the number of descriptors and their predictive ability to avoid overfitting. The main contributions to the toxicity were shown to be the polarizability weighted MATS2p and the number of certain groups C-026 descriptors. The GA-MLRA approach showed good results in this study, which allows the building of a simple, interpretable and transparent model that can be used for future studies of predicting toxicity of organic compounds to mammals.
iFeature: a python package and web server for features extraction and selection from protein and peptide sequences.

PubMed

Chen, Zhen; Zhao, Pei; Li, Fuyi; Leier, André; Marquez-Lago, Tatiana T; Wang, Yanan; Webb, Geoffrey I; Smith, A Ian; Daly, Roger J; Chou, Kuo-Chen; Song, Jiangning

2018-03-08

Structural and physiochemical descriptors extracted from sequence data have been widely used to represent sequences and predict structural, functional, expression and interaction profiles of proteins and peptides as well as DNAs/RNAs. Here, we present iFeature, a versatile Python-based toolkit for generating various numerical feature representation schemes for both protein and peptide sequences. iFeature is capable of calculating and extracting a comprehensive spectrum of 18 major sequence encoding schemes that encompass 53 different types of feature descriptors. It also allows users to extract specific amino acid properties from the AAindex database. Furthermore, iFeature integrates 12 different types of commonly used feature clustering, selection, and dimensionality reduction algorithms, greatly facilitating training, analysis, and benchmarking of machine-learning models. The functionality of iFeature is made freely available via an online web server and a stand-alone toolkit. http://iFeature.erc.monash.edu/; https://github.com/Superzchen/iFeature/. jiangning.song@monash.edu; kcchou@gordonlifescience.org; roger.daly@monash.edu. Supplementary data are available at Bioinformatics online.
Evaluation of a simple method for the automatic assignment of MeSH descriptors to health resources in a French online catalogue.

PubMed

Névéol, Aurélie; Pereira, Suzanne; Kerdelhué, Gaetan; Dahamna, Badisse; Joubert, Michel; Darmoni, Stéfan J

2007-01-01

The growing number of resources to be indexed in the catalogue of online health resources in French (CISMeF) calls for curating strategies involving automatic indexing tools while maintaining the catalogue's high indexing quality standards. To develop a simple automatic tool that retrieves MeSH descriptors from documents titles. In parallel to research on advanced indexing methods, a bag-of-words tool was developed for timely inclusion in CISMeF's maintenance system. An evaluation was carried out on a corpus of 99 documents. The indexing sets retrieved by the automatic tool were compared to manual indexing based on the title and on the full text of resources. 58% of the major main headings were retrieved by the bag-of-words algorithm and the precision on main heading retrieval was 69%. Bag-of-words indexing has effectively been used on selected resources to be included in CISMeF since August 2006. Meanwhile, on going work aims at improving the current version of the tool.
A flexible motif search technique based on generalized profiles.

PubMed

Bucher, P; Karplus, K; Moeri, N; Hofmann, K

1996-03-01

A flexible motif search technique is presented which has two major components: (1) a generalized profile syntax serving as a motif definition language; and (2) a motif search method specifically adapted to the problem of finding multiple instances of a motif in the same sequence. The new profile structure, which is the core of the generalized profile syntax, combines the functions of a variety of motif descriptors implemented in other methods, including regular expression-like patterns, weight matrices, previously used profiles, and certain types of hidden Markov models (HMMs). The relationship between generalized profiles and other biomolecular motif descriptors is analyzed in detail, with special attention to HMMs. Generalized profiles are shown to be equivalent to a particular class of HMMs, and conversion procedures in both directions are given. The conversion procedures provide an interpretation for local alignment in the framework of stochastic models, allowing for clear, simple significance tests. A mathematical statement of the motif search problem defines the new method exactly without linking it to a specific algorithmic solution. Part of the definition includes a new definition of disjointness of alignments.
Hyper-spectral image segmentation using spectral clustering with covariance descriptors

NASA Astrophysics Data System (ADS)

Kursun, Olcay; Karabiber, Fethullah; Koc, Cemalettin; Bal, Abdullah

2009-02-01

Image segmentation is an important and difficult computer vision problem. Hyper-spectral images pose even more difficulty due to their high-dimensionality. Spectral clustering (SC) is a recently popular clustering/segmentation algorithm. In general, SC lifts the data to a high dimensional space, also known as the kernel trick, then derive eigenvectors in this new space, and finally using these new dimensions partition the data into clusters. We demonstrate that SC works efficiently when combined with covariance descriptors that can be used to assess pixelwise similarities rather than in the high-dimensional Euclidean space. We present the formulations and some preliminary results of the proposed hybrid image segmentation method for hyper-spectral images.
Renal Function Descriptors in Neonates: Which Creatinine-Based Formula Best Describes Vancomycin Clearance?

PubMed

Bhongsatiern, Jiraganya; Stockmann, Chris; Yu, Tian; Constance, Jonathan E; Moorthy, Ganesh; Spigarelli, Michael G; Desai, Pankaj B; Sherwin, Catherine M T

2016-05-01

Growth and maturational changes have been identified as significant covariates in describing variability in clearance of renally excreted drugs such as vancomycin. Because of immaturity of clearance mechanisms, quantification of renal function in neonates is of importance. Several serum creatinine (SCr)-based renal function descriptors have been developed in adults and children, but none are selectively derived for neonates. This review summarizes development of the neonatal kidney and discusses assessment of the renal function regarding estimation of glomerular filtration rate using renal function descriptors. Furthermore, identification of the renal function descriptors that best describe the variability of vancomycin clearance was performed in a sample study of a septic neonatal cohort. Population pharmacokinetic models were developed applying a combination of age-weight, renal function descriptors, or SCr alone. In addition to age and weight, SCr or renal function descriptors significantly reduced variability of vancomycin clearance. The population pharmacokinetic models with Léger and modified Schwartz formulas were selected as the optimal final models, although the other renal function descriptors and SCr provided reasonably good fit to the data, suggesting further evaluation of the final models using external data sets and cross validation. The present study supports incorporation of renal function descriptors in the estimation of vancomycin clearance in neonates. © 2015, The American College of Clinical Pharmacology.
Health sciences descriptors in the brazilian speech-language and hearing science.

PubMed

Campanatti-Ostiz, Heliane; Andrade, Claudia Regina Furquim de

2010-01-01

Terminology in Speech-Language and Hearing Science. To propose a specific thesaurus about the Speech-Language and Hearing Science, for the English, Portuguese and Spanish languages, based on the existing keywords available on the Health Sciences Descriptors (DeCS). Methodology was based on the pilot study developed by Campanatti-Ostiz and Andrade; that had as a purpose to verify the methodological viability for the creation of a Speech-Language and Hearing Science category in the DeCS. The scientific journals selected for analyses of the titles, abstracts and keywords of all scientific articles were those in the field of the Speech-Language and Hearing Science, indexed on the SciELO. 1. Recovery of the Descriptors in the English language (Medical Subject Headings--MeSH); 2. Recovery and hierarchic organization of the descriptors in the Portuguese language was done (DeCS). The obtained data was analyzed as follows: descriptive analyses and relative relevance analyses of the DeCS areas. Based on the first analyses, we decided to select all 761 descriptors, with all the hierarchic numbers, independently of their occurrence (occurrence number--ON), and based on the second analyses, we decided to propose to exclude the less relevant areas and the exclusive DeCS areas. The proposal was finished with a total of 1676 occurrences of DeCS descriptors, distributed in the following areas: Anatomy; Diseases; Analytical, Diagnostic and Therapeutic Techniques and Equipments; Psychiatry and Psychology; Phenomena and Processes; Health Care. The presented proposal of a thesaurus contains the specific terminology of the Brazilian Speech-Language and Hearing Sciences and reflects the descriptors of the published scientific production. Being the DeCS a trilingual vocabulary (Portuguese, English and Spanish), the present descriptors organization proposition can be used in these three languages, allowing greater cultural interchange between different nations.
Performance-scalable volumetric data classification for online industrial inspection

NASA Astrophysics Data System (ADS)

Abraham, Aby J.; Sadki, Mustapha; Lea, R. M.

2002-03-01

Non-intrusive inspection and non-destructive testing of manufactured objects with complex internal structures typically requires the enhancement, analysis and visualization of high-resolution volumetric data. Given the increasing availability of fast 3D scanning technology (e.g. cone-beam CT), enabling on-line detection and accurate discrimination of components or sub-structures, the inherent complexity of classification algorithms inevitably leads to throughput bottlenecks. Indeed, whereas typical inspection throughput requirements range from 1 to 1000 volumes per hour, depending on density and resolution, current computational capability is one to two orders-of-magnitude less. Accordingly, speeding up classification algorithms requires both reduction of algorithm complexity and acceleration of computer performance. A shape-based classification algorithm, offering algorithm complexity reduction, by using ellipses as generic descriptors of solids-of-revolution, and supporting performance-scalability, by exploiting the inherent parallelism of volumetric data, is presented. A two-stage variant of the classical Hough transform is used for ellipse detection and correlation of the detected ellipses facilitates position-, scale- and orientation-invariant component classification. Performance-scalability is achieved cost-effectively by accelerating a PC host with one or more COTS (Commercial-Off-The-Shelf) PCI multiprocessor cards. Experimental results are reported to demonstrate the feasibility and cost-effectiveness of the data-parallel classification algorithm for on-line industrial inspection applications.
Bio-activity of aminosulfonyl ureas in the light of nucleic acid bases and DNA base pair interaction.

PubMed

Mondal Roy, Sutapa

2018-08-01

The quantum chemical descriptors based on density functional theory (DFT) are applied to predict the biological activity (log IC 50 ) of one class of acyl-CoA: cholesterol O-acyltransferase (ACAT) inhibitors, viz. aminosulfonyl ureas. ACAT are very effective agents for reduction of triglyceride and cholesterol levels in human body. Successful two parameter quantitative structure-activity relationship (QSAR) models are developed with a combination of relevant global and local DFT based descriptors for prediction of biological activity of aminosulfonyl ureas. The global descriptors, electron affinity of the ACAT inhibitors (EA) and/or charge transfer (ΔN) between inhibitors and model biosystems (NA bases and DNA base pairs) along with the local group atomic charge on sulfonyl moiety (∑Q Sul ) of the inhibitors reveals more than 90% efficacy of the selected descriptors for predicting the experimental log (IC 50 ) values. Copyright © 2018 Elsevier Ltd. All rights reserved.
Improved dense trajectories for action recognition based on random projection and Fisher vectors

NASA Astrophysics Data System (ADS)

Ai, Shihui; Lu, Tongwei; Xiong, Yudian

2018-03-01

As an important application of intelligent monitoring system, the action recognition in video has become a very important research area of computer vision. In order to improve the accuracy rate of the action recognition in video with improved dense trajectories, one advanced vector method is introduced. Improved dense trajectories combine Fisher Vector with Random Projection. The method realizes the reduction of the characteristic trajectory though projecting the high-dimensional trajectory descriptor into the low-dimensional subspace based on defining and analyzing Gaussian mixture model by Random Projection. And a GMM-FV hybrid model is introduced to encode the trajectory feature vector and reduce dimension. The computational complexity is reduced by Random Projection which can drop Fisher coding vector. Finally, a Linear SVM is used to classifier to predict labels. We tested the algorithm in UCF101 dataset and KTH dataset. Compared with existed some others algorithm, the result showed that the method not only reduce the computational complexity but also improved the accuracy of action recognition.
Analyzing Molecular Clouds with the Spectral Correlation Function

NASA Astrophysics Data System (ADS)

Rosolowsky, E. W.; Goodman, A. A.; Williams, J. P.; Wilner, D. J.

1997-12-01

The Spectral Correlation Function (SCF) is a new data analysis algorithm that measures how the properites of spectra vary from position to position in a spectral-line map. For each spectrum in a data cube, the SCF measures the ``difference" between that spectrum and a specified subset of its neighbors. This algorithm is intended for use on both simulated and observed position-position-velocity data cubes. In initial tests of the SCF, we have shown that a histogram of the SCF for a map is a good descriptor of the spatial-velocity distribution of material. In one test, we compare the SCF distributions for: 1) a real data cube; 2) a cube made from the real cube's spectra with randomized positions; and 3) the results of a preliminary MHD simulation by Gammie, Ostriker, and Stone. The results of the test show that the real cloud and the simulation are much closer to each other in their SCF distributions than is either to the randomized cube. We are now in the process of applying the SCF to a larger set of observed and simulated data cubes. Our ultimate aim is to use the SCF both on its own, as a descriptor of the spatial-kinetic properties of interstellar gas, and also as a tool for evaluating how well simulations resemble observations. Our expectation is that the SCF will be more discriminatory (less likely to produce a false match) than the data cube descriptors currently available.
SigVox - A 3D feature matching algorithm for automatic street object recognition in mobile laser scanning point clouds

NASA Astrophysics Data System (ADS)

Wang, Jinhu; Lindenbergh, Roderik; Menenti, Massimo

2017-06-01

Urban road environments contain a variety of objects including different types of lamp poles and traffic signs. Its monitoring is traditionally conducted by visual inspection, which is time consuming and expensive. Mobile laser scanning (MLS) systems sample the road environment efficiently by acquiring large and accurate point clouds. This work proposes a methodology for urban road object recognition from MLS point clouds. The proposed method uses, for the first time, shape descriptors of complete objects to match repetitive objects in large point clouds. To do so, a novel 3D multi-scale shape descriptor is introduced, that is embedded in a workflow that efficiently and automatically identifies different types of lamp poles and traffic signs. The workflow starts by tiling the raw point clouds along the scanning trajectory and by identifying non-ground points. After voxelization of the non-ground points, connected voxels are clustered to form candidate objects. For automatic recognition of lamp poles and street signs, a 3D significant eigenvector based shape descriptor using voxels (SigVox) is introduced. The 3D SigVox descriptor is constructed by first subdividing the points with an octree into several levels. Next, significant eigenvectors of the points in each voxel are determined by principal component analysis (PCA) and mapped onto the appropriate triangle of a sphere approximating icosahedron. This step is repeated for different scales. By determining the similarity of 3D SigVox descriptors between candidate point clusters and training objects, street furniture is automatically identified. The feasibility and quality of the proposed method is verified on two point clouds obtained in opposite direction of a stretch of road of 4 km. 6 types of lamp pole and 4 types of road sign were selected as objects of interest. Ground truth validation showed that the overall accuracy of the ∼170 automatically recognized objects is approximately 95%. The results demonstrate that the proposed method is able to recognize street furniture in a practical scenario. Remaining difficult cases are touching objects, like a lamp pole close to a tree.
Analysis, Thematic Maps and Data Mining from Point Cloud to Ontology for Software Development

NASA Astrophysics Data System (ADS)

Nespeca, R.; De Luca, L.

2016-06-01

The primary purpose of the survey for the restoration of Cultural Heritage is the interpretation of the state of building preservation. For this, the advantages of the remote sensing systems that generate dense point cloud (range-based or image-based) are not limited only to the acquired data. The paper shows that it is possible to extrapolate very useful information in diagnostics using spatial annotation, with the use of algorithms already implemented in open-source software. Generally, the drawing of degradation maps is the result of manual work, so dependent on the subjectivity of the operator. This paper describes a method of extraction and visualization of information, obtained by mathematical procedures, quantitative, repeatable and verifiable. The case study is a part of the east facade of the Eglise collégiale Saint-Maurice also called Notre Dame des Grâces, in Caromb, in southern France. The work was conducted on the matrix of information contained in the point cloud asci format. The first result is the extrapolation of new geometric descriptors. First, we create the digital maps with the calculated quantities. Subsequently, we have moved to semi-quantitative analyses that transform new data into useful information. We have written the algorithms for accurate selection, for the segmentation of point cloud, for automatic calculation of the real surface and the volume. Furthermore, we have created the graph of spatial distribution of the descriptors. This work shows that if we work during the data processing we can transform the point cloud into an enriched database: the use, the management and the data mining is easy, fast and effective for everyone involved in the restoration process.
Characteristics of genomic signatures derived using univariate methods and mechanistically anchored functional descriptors for predicting drug- and xenobiotic-induced nephrotoxicity.

PubMed

Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J

2008-01-01

ABSTRACT The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, Hotelling T-square test, and, finally out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps-sets of genes coordinately involved in key biological processes-with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.
Novel Digital Features Discriminate Between Drought Resistant and Drought Sensitive Rice Under Controlled and Field Conditions.

PubMed

Duan, Lingfeng; Han, Jiwan; Guo, Zilong; Tu, Haifu; Yang, Peng; Zhang, Dong; Fan, Yuan; Chen, Guoxing; Xiong, Lizhong; Dai, Mingqiu; Williams, Kevin; Corke, Fiona; Doonan, John H; Yang, Wanneng

2018-01-01

Dynamic quantification of drought response is a key issue both for variety selection and for functional genetic study of rice drought resistance. Traditional assessment of drought resistance traits, such as stay-green and leaf-rolling, has utilized manual measurements, that are often subjective, error-prone, poorly quantified and time consuming. To relieve this phenotyping bottleneck, we demonstrate a feasible, robust and non-destructive method that dynamically quantifies response to drought, under both controlled and field conditions. Firstly, RGB images of individual rice plants at different growth points were analyzed to derive 4 features that were influenced by imposition of drought. These include a feature related to the ability to stay green, which we termed greenness plant area ratio (GPAR) and 3 shape descriptors [total plant area/bounding rectangle area ratio (TBR), perimeter area ratio (PAR) and total plant area/convex hull area ratio (TCR)]. Experiments showed that these 4 features were capable of discriminating reliably between drought resistant and drought sensitive accessions, and dynamically quantifying the drought response under controlled conditions across time (at either daily or half hourly time intervals). We compared the 3 shape descriptors and concluded that PAR was more robust and sensitive to leaf-rolling than the other shape descriptors. In addition, PAR and GPAR proved to be effective in quantification of drought response in the field. Moreover, the values obtained in field experiments using the collection of rice varieties were correlated with those derived from pot-based experiments. The general applicability of the algorithms is demonstrated by their ability to probe archival Miscanthus data previously collected on an independent platform. In conclusion, this image-based technology is robust providing a platform-independent tool for quantifying drought response that should be of general utility for breeding and functional genomics in future.
Characterizing region of interest in image using MPEG-7 visual descriptors

NASA Astrophysics Data System (ADS)

Ryu, Min-Sung; Park, Soo-Jun; Won, Chee Sun

2005-08-01

In this paper, we propose a region-based image retrieval system using EHD (Edge Histogram Descriptor) and CLD (Color Layout Descriptor) of MPEG-7 descriptors. The combined descriptor can efficiently describe edge and color features in terms of sub-image regions. That is, the basic unit for the selection of the region-of-interest (ROI) in the image is the sub-image block of the EHD, which corresponds to 16 (i.e., 4x4) non-overlapping image blocks in the image space. This implies that, to have a one-to-one region correspondence between EHD and CLD, we need to take an 8x8 inverse DCT (IDCT) for the CLD. Experimental results show that the proposed retrieval scheme can be used for image retrieval with the ROI based image retrieval for MPEG-7 indexed images.
A stereo remote sensing feature selection method based on artificial bee colony algorithm

NASA Astrophysics Data System (ADS)

Yan, Yiming; Liu, Pigang; Zhang, Ye; Su, Nan; Tian, Shu; Gao, Fengjiao; Shen, Yi

2014-05-01

To improve the efficiency of stereo information for remote sensing classification, a stereo remote sensing feature selection method is proposed in this paper presents, which is based on artificial bee colony algorithm. Remote sensing stereo information could be described by digital surface model (DSM) and optical image, which contain information of the three-dimensional structure and optical characteristics, respectively. Firstly, three-dimensional structure characteristic could be analyzed by 3D-Zernike descriptors (3DZD). However, different parameters of 3DZD could descript different complexity of three-dimensional structure, and it needs to be better optimized selected for various objects on the ground. Secondly, features for representing optical characteristic also need to be optimized. If not properly handled, when a stereo feature vector composed of 3DZD and image features, that would be a lot of redundant information, and the redundant information may not improve the classification accuracy, even cause adverse effects. To reduce information redundancy while maintaining or improving the classification accuracy, an optimized frame for this stereo feature selection problem is created, and artificial bee colony algorithm is introduced for solving this optimization problem. Experimental results show that the proposed method can effectively improve the computational efficiency, improve the classification accuracy.

Local Multi-Grouped Binary Descriptor With Ring-Based Pooling Configuration and Optimization.

PubMed

Gao, Yongqiang; Huang, Weilin; Qiao, Yu

2015-12-01

Local binary descriptors are attracting increasingly attention due to their great advantages in computational speed, which are able to achieve real-time performance in numerous image/vision applications. Various methods have been proposed to learn data-dependent binary descriptors. However, most existing binary descriptors aim overly at computational simplicity at the expense of significant information loss which causes ambiguity in similarity measure using Hamming distance. In this paper, by considering multiple features might share complementary information, we present a novel local binary descriptor, referred as ring-based multi-grouped descriptor (RMGD), to successfully bridge the performance gap between current binary and floated-point descriptors. Our contributions are twofold. First, we introduce a new pooling configuration based on spatial ring-region sampling, allowing for involving binary tests on the full set of pairwise regions with different shapes, scales, and distances. This leads to a more meaningful description than the existing methods which normally apply a limited set of pooling configurations. Then, an extended Adaboost is proposed for an efficient bit selection by emphasizing high variance and low correlation, achieving a highly compact representation. Second, the RMGD is computed from multiple image properties where binary strings are extracted. We cast multi-grouped features integration as rankSVM or sparse support vector machine learning problem, so that different features can compensate strongly for each other, which is the key to discriminativeness and robustness. The performance of the RMGD was evaluated on a number of publicly available benchmarks, where the RMGD outperforms the state-of-the-art binary descriptors significantly.
Improving Large-Scale Image Retrieval Through Robust Aggregation of Local Descriptors.

PubMed

Husain, Syed Sameed; Bober, Miroslaw

2017-09-01

Visual search and image retrieval underpin numerous applications, however the task is still challenging predominantly due to the variability of object appearance and ever increasing size of the databases, often exceeding billions of images. Prior art methods rely on aggregation of local scale-invariant descriptors, such as SIFT, via mechanisms including Bag of Visual Words (BoW), Vector of Locally Aggregated Descriptors (VLAD) and Fisher Vectors (FV). However, their performance is still short of what is required. This paper presents a novel method for deriving a compact and distinctive representation of image content called Robust Visual Descriptor with Whitening (RVD-W). It significantly advances the state of the art and delivers world-class performance. In our approach local descriptors are rank-assigned to multiple clusters. Residual vectors are then computed in each cluster, normalized using a direction-preserving normalization function and aggregated based on the neighborhood rank. Importantly, the residual vectors are de-correlated and whitened in each cluster before aggregation, leading to a balanced energy distribution in each dimension and significantly improved performance. We also propose a new post-PCA normalization approach which improves separability between the matching and non-matching global descriptors. This new normalization benefits not only our RVD-W descriptor but also improves existing approaches based on FV and VLAD aggregation. Furthermore, we show that the aggregation framework developed using hand-crafted SIFT features also performs exceptionally well with Convolutional Neural Network (CNN) based features. The RVD-W pipeline outperforms state-of-the-art global descriptors on both the Holidays and Oxford datasets. On the large scale datasets, Holidays1M and Oxford1M, SIFT-based RVD-W representation obtains a mAP of 45.1 and 35.1 percent, while CNN-based RVD-W achieve a mAP of 63.5 and 44.8 percent, all yielding superior performance to the state-of-the-art.
A contour-based shape descriptor for biomedical image classification and retrieval

NASA Astrophysics Data System (ADS)

You, Daekeun; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.

2013-12-01

Contours, object blobs, and specific feature points are utilized to represent object shapes and extract shape descriptors that can then be used for object detection or image classification. In this research we develop a shape descriptor for biomedical image type (or, modality) classification. We adapt a feature extraction method used in optical character recognition (OCR) for character shape representation, and apply various image preprocessing methods to successfully adapt the method to our application. The proposed shape descriptor is applied to radiology images (e.g., MRI, CT, ultrasound, X-ray, etc.) to assess its usefulness for modality classification. In our experiment we compare our method with other visual descriptors such as CEDD, CLD, Tamura, and PHOG that extract color, texture, or shape information from images. The proposed method achieved the highest classification accuracy of 74.1% among all other individual descriptors in the test, and when combined with CSD (color structure descriptor) showed better performance (78.9%) than using the shape descriptor alone.
Smartphone-based quantitative measurements on holographic sensors.

PubMed

Khalili Moghaddam, Gita; Lowe, Christopher Robin

2017-01-01

The research reported herein integrates a generic holographic sensor platform and a smartphone-based colour quantification algorithm in order to standardise and improve the determination of the concentration of analytes of interest. The utility of this approach has been exemplified by analysing the replay colour of the captured image of a holographic pH sensor in near real-time. Personalised image encryption followed by a wavelet-based image compression method were applied to secure the image transfer across a bandwidth-limited network to the cloud. The decrypted and decompressed image was processed through four principal steps: Recognition of the hologram in the image with a complex background using a template-based approach, conversion of device-dependent RGB values to device-independent CIEXYZ values using a polynomial model of the camera and computation of the CIEL*a*b* values, use of the colour coordinates of the captured image to segment the image, select the appropriate colour descriptors and, ultimately, locate the region of interest (ROI), i.e. the hologram in this case, and finally, application of a machine learning-based algorithm to correlate the colour coordinates of the ROI to the analyte concentration. Integrating holographic sensors and the colour image processing algorithm potentially offers a cost-effective platform for the remote monitoring of analytes in real time in readily accessible body fluids by minimally trained individuals.
Smartphone-based quantitative measurements on holographic sensors

PubMed Central

Khalili Moghaddam, Gita

2017-01-01

The research reported herein integrates a generic holographic sensor platform and a smartphone-based colour quantification algorithm in order to standardise and improve the determination of the concentration of analytes of interest. The utility of this approach has been exemplified by analysing the replay colour of the captured image of a holographic pH sensor in near real-time. Personalised image encryption followed by a wavelet-based image compression method were applied to secure the image transfer across a bandwidth-limited network to the cloud. The decrypted and decompressed image was processed through four principal steps: Recognition of the hologram in the image with a complex background using a template-based approach, conversion of device-dependent RGB values to device-independent CIEXYZ values using a polynomial model of the camera and computation of the CIEL*a*b* values, use of the colour coordinates of the captured image to segment the image, select the appropriate colour descriptors and, ultimately, locate the region of interest (ROI), i.e. the hologram in this case, and finally, application of a machine learning-based algorithm to correlate the colour coordinates of the ROI to the analyte concentration. Integrating holographic sensors and the colour image processing algorithm potentially offers a cost-effective platform for the remote monitoring of analytes in real time in readily accessible body fluids by minimally trained individuals. PMID:29141008
Skin image illumination modeling and chromophore identification for melanoma diagnosis

NASA Astrophysics Data System (ADS)

Liu, Zhao; Zerubia, Josiane

2015-05-01

The presence of illumination variation in dermatological images has a negative impact on the automatic detection and analysis of cutaneous lesions. This paper proposes a new illumination modeling and chromophore identification method to correct lighting variation in skin lesion images, as well as to extract melanin and hemoglobin concentrations of human skin, based on an adaptive bilateral decomposition and a weighted polynomial curve fitting, with the knowledge of a multi-layered skin model. Different from state-of-the-art approaches based on the Lambert law, the proposed method, considering both specular reflection and diffuse reflection of the skin, enables us to address highlight and strong shading effects usually existing in skin color images captured in an uncontrolled environment. The derived melanin and hemoglobin indices, directly relating to the pathological tissue conditions, tend to be less influenced by external imaging factors and are more efficient in describing pigmentation distributions. Experiments show that the proposed method gave better visual results and superior lesion segmentation, when compared to two other illumination correction algorithms, both designed specifically for dermatological images. For computer-aided diagnosis of melanoma, sensitivity achieves 85.52% when using our chromophore descriptors, which is 8~20% higher than those derived from other color descriptors. This demonstrates the benefit of the proposed method for automatic skin disease analysis.
Development of an online, publicly accessible naive Bayesian decision support tool for mammographic mass lesions based on the American College of Radiology (ACR) BI-RADS lexicon.

PubMed

Benndorf, Matthias; Kotter, Elmar; Langer, Mathias; Herda, Christoph; Wu, Yirong; Burnside, Elizabeth S

2015-06-01

To develop and validate a decision support tool for mammographic mass lesions based on a standardized descriptor terminology (BI-RADS lexicon) to reduce variability of practice. We used separate training data (1,276 lesions, 138 malignant) and validation data (1,177 lesions, 175 malignant). We created naïve Bayes (NB) classifiers from the training data with tenfold cross-validation. Our "inclusive model" comprised BI-RADS categories, BI-RADS descriptors, and age as predictive variables; our "descriptor model" comprised BI-RADS descriptors and age. The resulting NB classifiers were applied to the validation data. We evaluated and compared classifier performance with ROC-analysis. In the training data, the inclusive model yields an AUC of 0.959; the descriptor model yields an AUC of 0.910 (P < 0.001). The inclusive model is superior to the clinical performance (BI-RADS categories alone, P < 0.001); the descriptor model performs similarly. When applied to the validation data, the inclusive model yields an AUC of 0.935; the descriptor model yields an AUC of 0.876 (P < 0.001). Again, the inclusive model is superior to the clinical performance (P < 0.001); the descriptor model performs similarly. We consider our classifier a step towards a more uniform interpretation of combinations of BI-RADS descriptors. We provide our classifier at www.ebm-radiology.com/nbmm/index.html . • We provide a decision support tool for mammographic masses at www.ebm-radiology.com/nbmm/index.html . • Our tool may reduce variability of practice in BI-RADS category assignment. • A formal analysis of BI-RADS descriptors may enhance radiologists' diagnostic performance.
An Analysis of Descriptors of Volatile Organic Compounds and Their Impact on Rate Constant for Reaction with Hydroxyl Radicals

DTIC Science & Technology

2018-05-01

the descriptors were correlated to experimental rate constants. The five descriptors fell into one of two categories: whole molecule descriptors or...model based on these correlations . Although that goal was not achieved in full, considerable progress has been made, and there is potential for a...readme.txt) and compiled. We then searched for correlations between the calculated properties from theory and the experimental measurements of reaction rate
Structural protein descriptors in 1-dimension and their sequence-based predictions.

PubMed

Kurgan, Lukasz; Disfani, Fatemeh Miri

2011-09-01

The last few decades observed an increasing interest in development and application of 1-dimensional (1D) descriptors of protein structure. These descriptors project 3D structural features onto 1D strings of residue-wise structural assignments. They cover a wide-range of structural aspects including conformation of the backbone, burying depth/solvent exposure and flexibility of residues, and inter-chain residue-residue contacts. We perform first-of-its-kind comprehensive comparative review of the existing 1D structural descriptors. We define, review and categorize ten structural descriptors and we also describe, summarize and contrast over eighty computational models that are used to predict these descriptors from the protein sequences. We show that the majority of the recent sequence-based predictors utilize machine learning models, with the most popular being neural networks, support vector machines, hidden Markov models, and support vector and linear regressions. These methods provide high-throughput predictions and most of them are accessible to a non-expert user via web servers and/or stand-alone software packages. We empirically evaluate several recent sequence-based predictors of secondary structure, disorder, and solvent accessibility descriptors using a benchmark set based on CASP8 targets. Our analysis shows that the secondary structure can be predicted with over 80% accuracy and segment overlap (SOV), disorder with over 0.9 AUC, 0.6 Matthews Correlation Coefficient (MCC), and 75% SOV, and relative solvent accessibility with PCC of 0.7 and MCC of 0.6 (0.86 when homology is used). We demonstrate that the secondary structure predicted from sequence without the use of homology modeling is as good as the structure extracted from the 3D folds predicted by top-performing template-based methods.
Development of a New De Novo Design Algorithm for Exploring Chemical Space.

PubMed

Mishima, Kazuaki; Kaneko, Hiromasa; Funatsu, Kimito

2014-12-01

In the first stage of development of new drugs, various lead compounds with high activity are required. To design such compounds, we focus on chemical space defined by structural descriptors. New compounds close to areas where highly active compounds exist will show the same degree of activity. We have developed a new de novo design system to search a target area in chemical space. First, highly active compounds are manually selected as initial seeds. Then, the seeds are entered into our system, and structures slightly different from the seeds are generated and pooled. Next, seeds are selected from the new structure pool based on the distance from target coordinates on the map. To test the algorithm, we used two datasets of ligand binding affinity and showed that the proposed generator could produce diverse virtual compounds that had high activity in docking simulations. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sub-Selective Quantization for Learning Binary Codes in Large-Scale Image Search.

PubMed

Li, Yeqing; Liu, Wei; Huang, Junzhou

2018-06-01

Recently with the explosive growth of visual content on the Internet, large-scale image search has attracted intensive attention. It has been shown that mapping high-dimensional image descriptors to compact binary codes can lead to considerable efficiency gains in both storage and performing similarity computation of images. However, most existing methods still suffer from expensive training devoted to large-scale binary code learning. To address this issue, we propose a sub-selection based matrix manipulation algorithm, which can significantly reduce the computational cost of code learning. As case studies, we apply the sub-selection algorithm to several popular quantization techniques including cases using linear and nonlinear mappings. Crucially, we can justify the resulting sub-selective quantization by proving its theoretic properties. Extensive experiments are carried out on three image benchmarks with up to one million samples, corroborating the efficacy of the sub-selective quantization method in terms of image retrieval.
Monitoring methods and predictive models for water status in Jonathan apples.

PubMed

Trincă, Lucia Carmen; Căpraru, Adina-Mirela; Arotăriţei, Dragoş; Volf, Irina; Chiruţă, Ciprian

2014-02-01

Evaluation of water status in Jonathan apples was performed for 20 days. Loss moisture content (LMC) was carried out through slow drying of wholes apples and the moisture content (MC) was carried out through oven drying and lyophilisation for apple samples (chunks, crushed and juice). We approached a non-destructive method to evaluate LMC and MC of apples using image processing and multilayer neural networks (NN) predictor. We proposed a new simple algorithm that selects the texture descriptors based on initial set heuristically chosen. Both structure and weights of NN are optimised by a genetic algorithm with variable length genotype that led to a high precision of the predictive model (R(2)=0.9534). In our opinion, the developing of this non-destructive method for the assessment of LMC and MC (and of other chemical parameters) seems to be very promising in online inspection of food quality. Copyright © 2013 Elsevier Ltd. All rights reserved.
Joint Transform Correlation for face tracking: elderly fall detection application

NASA Astrophysics Data System (ADS)

Katz, Philippe; Aron, Michael; Alfalou, Ayman

2013-03-01

In this paper, an iterative tracking algorithm based on a non-linear JTC (Joint Transform Correlator) architecture and enhanced by a digital image processing method is proposed and validated. This algorithm is based on the computation of a correlation plane where the reference image is updated at each frame. For that purpose, we use the JTC technique in real time to track a patient (target image) in a room fitted with a video camera. The correlation plane is used to localize the target image in the current video frame (frame i). Then, the reference image to be exploited in the next frame (frame i+1) is updated according to the previous one (frame i). In an effort to validate our algorithm, our work is divided into two parts: (i) a large study based on different sequences with several situations and different JTC parameters is achieved in order to quantify their effects on the tracking performances (decimation, non-linearity coefficient, size of the correlation plane, size of the region of interest...). (ii) the tracking algorithm is integrated into an application of elderly fall detection. The first reference image is a face detected by means of Haar descriptors, and then localized into the new video image thanks to our tracking method. In order to avoid a bad update of the reference frame, a method based on a comparison of image intensity histograms is proposed and integrated in our algorithm. This step ensures a robust tracking of the reference frame. This article focuses on face tracking step optimisation and evalutation. A supplementary step of fall detection, based on vertical acceleration and position, will be added and studied in further work.
SVM based colon polyps classifier in a wireless active stereo endoscope.

PubMed

Ayoub, J; Granado, B; Mhanna, Y; Romain, O

2010-01-01

This work focuses on the recognition of three-dimensional colon polyps captured by an active stereo vision sensor. The detection algorithm consists of SVM classifier trained on robust feature descriptors. The study is related to Cyclope, this prototype sensor allows real time 3D object reconstruction and continues to be optimized technically to improve its classification task by differentiation between hyperplastic and adenomatous polyps. Experimental results were encouraging and show correct classification rate of approximately 97%. The work contains detailed statistics about the detection rate and the computing complexity. Inspired by intensity histogram, the work shows a new approach that extracts a set of features based on depth histogram and combines stereo measurement with SVM classifiers to correctly classify benign and malignant polyps.
Reinforcement learning interfaces for biomedical database systems.

PubMed

Rudowsky, I; Kulyba, O; Kunin, M; Parsons, S; Raphan, T

2006-01-01

Studies of neural function that are carried out in different laboratories and that address different questions use a wide range of descriptors for data storage, depending on the laboratory and the individuals that input the data. A common approach to describe non-textual data that are referenced through a relational database is to use metadata descriptors. We have recently designed such a prototype system, but to maintain efficiency and a manageable metadata table, free formatted fields were designed as table entries. The database interface application utilizes an intelligent agent to improve integrity of operation. The purpose of this study was to investigate how reinforcement learning algorithms can assist the user in interacting with the database interface application that has been developed to improve the performance of the system.
[The use of Cantonese pain descriptors among healthy young adults in Hong Kong].

PubMed

Chung, W Y; Wong, C H; Yang, J C; Tan, P P

1998-12-01

The interpretation and expression of pain are closely related to an individual's social and cultural background. To convey messages on pain, language and words (pain descriptors) is particularly significant in assessment and evaluation of pain severity and its management. Therefore, the study of pain descriptors is crucial in clinical practice. It was of exploratory-descriptive design. Samples were recruited by convenience. Data were collected by structured self-administered questionnaire. Data obtained included demographic information and pain descriptors used by the subjects in various pain conditions. Data were analyzed by descriptive statistics. Pain descriptors were categorized according to nature, process, intensity, aggravating factors, accompanying symptoms and behavioral manifestation. Total number of pain descriptors (in Cantonese) based on real pain experience was 3017, mean was 3 (n = 986). The commonest used descriptors was the nature of pain (41%). The intensity of pain constituted 20%. There was no significant difference in the number of pain descriptors between male and female. However, there was a significant difference between the type of pain descriptors used (Mfemale = 526, Mmale = 453, Z = -2.9729, p = 0.0029). There were also significant differences in the use of pain descriptors among the various age groups (X2 = 15.0157, df = 4, P = 0.0047) and educational levels (X2 = 11.2443, df = 4, P = 0.0240). The types of descriptors used increased with an increase in age and education levels. This exploratory-descriptive study explores the use of pain descriptors among Chinese young adults in Hong Kong. The result shows that female use more pain descriptors than male. The pain descriptors that female used are mostly of nature type. The similarities and differences in findings with those of the Ho's (1991) are compared.
Artificial intelligence systems based on texture descriptors for vaccine development.

PubMed

Nanni, Loris; Brahnam, Sheryl; Lumini, Alessandra

2011-02-01

The aim of this work is to analyze and compare several feature extraction methods for peptide classification that are based on the calculation of texture descriptors starting from a matrix representation of the peptide. This texture-based representation of the peptide is then used to train a support vector machine classifier. In our experiments, the best results are obtained using local binary patterns variants and the discrete cosine transform with selected coefficients. These results are better than those previously reported that employed texture descriptors for peptide representation. In addition, we perform experiments that combine standard approaches based on amino acid sequence. The experimental section reports several tests performed on a vaccine dataset for the prediction of peptides that bind human leukocyte antigens and on a human immunodeficiency virus (HIV-1). Experimental results confirm the usefulness of our novel descriptors. The matlab implementation of our approaches is available at http://bias.csr.unibo.it/nanni/TexturePeptide.zip.
Covariance descriptor fusion for target detection

NASA Astrophysics Data System (ADS)

Cukur, Huseyin; Binol, Hamidullah; Bal, Abdullah; Yavuz, Fatih

2016-05-01

Target detection is one of the most important topics for military or civilian applications. In order to address such detection tasks, hyperspectral imaging sensors provide useful images data containing both spatial and spectral information. Target detection has various challenging scenarios for hyperspectral images. To overcome these challenges, covariance descriptor presents many advantages. Detection capability of the conventional covariance descriptor technique can be improved by fusion methods. In this paper, hyperspectral bands are clustered according to inter-bands correlation. Target detection is then realized by fusion of covariance descriptor results based on the band clusters. The proposed combination technique is denoted Covariance Descriptor Fusion (CDF). The efficiency of the CDF is evaluated by applying to hyperspectral imagery to detect man-made objects. The obtained results show that the CDF presents better performance than the conventional covariance descriptor.
How good is good? Students and assessors' perceptions of qualitative markers of performance.

PubMed

Ma, Heung Kan; Min, Cynthia; Neville, Alan; Eva, Kevin

2013-01-01

Qualitative markers of performance are routinely used for medical student assessment, though the extent to which such markers can be readily translated to actionable pieces of information remains uncertain. To explore (a) the perceived value to be indicated by descriptor phrases commonly used for describing student performance, (b) the perceived weight of the different performance domains (e.g. communication skills, work ethic, knowledge base, etc), and (c) whether or not the perceived value of the descriptors changes as a function of the performance domains. Five domains of performance were identified from the thematic coding of past medical student transcripts (N = 156). From the transcripts, 91 distinct descriptors indicating the language commonly used by assessors were also identified. From the list of 91 descriptors, Thurstone's method of equal-appearing intervals was used to extract 10 descriptors that were representative of the continuum of student performance. A modified paired comparisons method was then used to enable the relative ranking of each of 10 descriptors combined with each of 5 different domains of performance. A web-based survey was used to collect responses from participants (N = 209), which consisted of medical students and faculty members who were previously involved in student assessment. Results demonstrated that respondents did not simply sum positive and negative descriptors in a uniform manner. Rather, comments on some domains (e.g., "ability to apply patient centred medicine") were seen as particularly positive when associated with positive descriptors but not particularly negative when associated with negative descriptors. For others (e.g., "receptivity and responsiveness to feedback") the reverse was true. Comments on "knowledge-base" elicited a relatively muted perception at both ends of the scale. Finally, the results also revealed moderate misalignment in the perceptions of assessors and students. The findings from this study suggest that the use of any given descriptor conveys slightly different meaning dependent on the context in which it is used. This helps to address some key issues surrounding the application of qualitative markers to performance assessment in medical education.
Diagnostic performance and reproducibility of T2w based and diffusion weighted imaging (DWI) based PI-RADSv2 lexicon descriptors for prostate MRI.

PubMed

Benndorf, Matthias; Hahn, Felix; Krönig, Malte; Jilg, Cordula Annette; Krauss, Tobias; Langer, Mathias; Dovi-Akué, Philippe

2017-08-01

To examine the diagnostic performance of PI-RADSv2 T2w and diffusion weighted imaging (DWI) based lexicon descriptors, inter-observer agreement for descriptor assignment and diagnostic accuracy of the PI-RADSv2 assessment categories for multiparametric prostate MRI. 176 lesions in 79 consecutive patients are analyzed, lesions are histopathologically verified by MRI-ultrasound fusion biopsy. All lesions are rated according to the PI-RADSv2 lexicon, descriptors for T2w and DWI sequences and resulting assessment categories are assigned by two independent blinded radiologists. We perform receiver-operating-characteristic analysis using the assessment categories. To analyze inter-observer agreement, we calculate weighted kappa values for assessment category assignment and unweighted kappa values for descriptor assignment. PI-RADSv2 assessment categories yield an area under the curve of 0.76/0.74 (radiologist 1/radiologist 2), P >0.05. Weighted kappa for agreement is 0.601 in the peripheral zone and 0.580 in the transition zone. We detect a difference in the cancer rate for PI-RADSv2 category 3 between peripheral zone (32%) and transition zone (12%), P <0.05. We obtain moderate agreement at most for descriptor assignment with kappa values ranging from 0.082 (T2w shape in the transition zone) to 0.407 (T2w signal intensity in the peripheral zone) and 0.493 (ADC pattern in the peripheral zone). Our analysis corroborates typical descriptors for benign/malignant lesions, but also reveals insights into potential pitfalls - T2w wedge shaped lesions in the peripheral zone have a considerable cancer rate, despite being labelled category 2 in the lexicon. Agreement for descriptor assignment in the PI-RADSv2 lexicon is at most moderate in our study. Typical descriptors for benign and malignant lesions are validated, whereas the discriminatory power of some descriptors is challenged. The difference in the cancer rate for PI-RADSv2 category 3 between peripheral zone and transition zone should be considered when management recommendations are linked to assessment categories in the future. Copyright © 2017 Elsevier B.V. All rights reserved.

Web-4D-QSAR: A web-based application to generate 4D-QSAR descriptors.

PubMed

Ataide Martins, João Paulo; Rougeth de Oliveira, Marco Antônio; Oliveira de Queiroz, Mário Sérgio

2018-06-05

A web-based application is developed to generate 4D-QSAR descriptors using the LQTA-QSAR methodology, based on molecular dynamics (MD) trajectories and topology information retrieved from the GROMACS package. The LQTAGrid module calculates the intermolecular interaction energies at each grid point, considering probes and all aligned conformations resulting from MD simulations. These interaction energies are the independent variables or descriptors employed in a QSAR analysis. A friendly front end web interface, built using the Django framework and Python programming language, integrates all steps of the LQTA-QSAR methodology in a way that is transparent to the user, and in the backend, GROMACS and LQTAGrid are executed to generate 4D-QSAR descriptors to be used later in the process of QSAR model building. © 2018 Wiley Periodicals, Inc. © 2018 Wiley Periodicals, Inc.
QSRR modeling for diverse drugs using different feature selection methods coupled with linear and nonlinear regressions.

PubMed

Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan

2012-12-01

A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere poly butadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (logk(w)). The overall best model was the SVM one built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.
Object tracking on mobile devices using binary descriptors

NASA Astrophysics Data System (ADS)

Savakis, Andreas; Quraishi, Mohammad Faiz; Minnehan, Breton

2015-03-01

With the growing ubiquity of mobile devices, advanced applications are relying on computer vision techniques to provide novel experiences for users. Currently, few tracking approaches take into consideration the resource constraints on mobile devices. Designing efficient tracking algorithms and optimizing performance for mobile devices can result in better and more efficient tracking for applications, such as augmented reality. In this paper, we use binary descriptors, including Fast Retina Keypoint (FREAK), Oriented FAST and Rotated BRIEF (ORB), Binary Robust Independent Features (BRIEF), and Binary Robust Invariant Scalable Keypoints (BRISK) to obtain real time tracking performance on mobile devices. We consider both Google's Android and Apple's iOS operating systems to implement our tracking approach. The Android implementation is done using Android's Native Development Kit (NDK), which gives the performance benefits of using native code as well as access to legacy libraries. The iOS implementation was created using both the native Objective-C and the C++ programing languages. We also introduce simplified versions of the BRIEF and BRISK descriptors that improve processing speed without compromising tracking accuracy.
A new texture descriptor based on local micro-pattern for detection of architectural distortion in mammographic images

NASA Astrophysics Data System (ADS)

de Oliveira, Helder C. R.; Moraes, Diego R.; Reche, Gustavo A.; Borges, Lucas R.; Catani, Juliana H.; de Barros, Nestor; Melo, Carlos F. E.; Gonzaga, Adilson; Vieira, Marcelo A. C.

2017-03-01

This paper presents a new local micro-pattern texture descriptor for the detection of Architectural Distortion (AD) in digital mammography images. AD is a subtle contraction of breast parenchyma that may represent an early sign of breast cancer. Due to its subtlety and variability, AD is more difficult to detect compared to microcalcifications and masses, and is commonly found in retrospective evaluations of false-negative mammograms. Several computer-based systems have been proposed for automatic detection of AD, but their performance are still unsatisfactory. The proposed descriptor, Local Mapped Pattern (LMP), is a generalization of the Local Binary Pattern (LBP), which is considered one of the most powerful feature descriptor for texture classification in digital images. Compared to LBP, the LMP descriptor captures more effectively the minor differences between the local image pixels. Moreover, LMP is a parametric model which can be optimized for the desired application. In our work, the LMP performance was compared to the LBP and four Haralick's texture descriptors for the classification of 400 regions of interest (ROIs) extracted from clinical mammograms. ROIs were selected and divided into four classes: AD, normal tissue, microcalcifications and masses. Feature vectors were used as input to a multilayer perceptron neural network, with a single hidden layer. Results showed that LMP is a good descriptor to distinguish AD from other anomalies in digital mammography. LMP performance was slightly better than the LBP and comparable to Haralick's descriptors (mean classification accuracy = 83%).
Automatic classification of protein structures using physicochemical parameters.

PubMed

Mohan, Abhilash; Rao, M Divya; Sunderrajan, Shruthi; Pennathur, Gautam

2014-09-01

Protein classification is the first step to functional annotation; SCOP and Pfam databases are currently the most relevant protein classification schemes. However, the disproportion in the number of three dimensional (3D) protein structures generated versus their classification into relevant superfamilies/families emphasizes the need for automated classification schemes. Predicting function of novel proteins based on sequence information alone has proven to be a major challenge. The present study focuses on the use of physicochemical parameters in conjunction with machine learning algorithms (Naive Bayes, Decision Trees, Random Forest and Support Vector Machines) to classify proteins into their respective SCOP superfamily/Pfam family, using sequence derived information. Spectrophores™, a 1D descriptor of the 3D molecular field surrounding a structure was used as a benchmark to compare the performance of the physicochemical parameters. The machine learning algorithms were modified to select features based on information gain for each SCOP superfamily/Pfam family. The effect of combining physicochemical parameters and spectrophores on classification accuracy (CA) was studied. Machine learning algorithms trained with the physicochemical parameters consistently classified SCOP superfamilies and Pfam families with a classification accuracy above 90%, while spectrophores performed with a CA of around 85%. Feature selection improved classification accuracy for both physicochemical parameters and spectrophores based machine learning algorithms. Combining both attributes resulted in a marginal loss of performance. Physicochemical parameters were able to classify proteins from both schemes with classification accuracy ranging from 90-96%. These results suggest the usefulness of this method in classifying proteins from amino acid sequences.
Automatic localization of cerebral cortical malformations using fractal analysis.

PubMed

De Luca, A; Arrigoni, F; Romaniello, R; Triulzi, F M; Peruzzo, D; Bertoldo, A

2016-08-21

Malformations of cortical development (MCDs) encompass a variety of brain disorders affecting the normal development and organization of the brain cortex. The relatively low incidence and the extreme heterogeneity of these disorders hamper the application of classical group level approaches for the detection of lesions. Here, we present a geometrical descriptor for a voxel level analysis based on fractal geometry, then define two similarity measures to detect the lesions at single subject level. The pipeline was applied to 15 normal children and nine pediatric patients affected by MCDs following two criteria, maximum accuracy (WACC) and minimization of false positives (FPR), and proved that our lesion detection algorithm is able to detect and locate abnormalities of the brain cortex with high specificity (WACC = 85%, FPR = 96%), sensitivity (WACC = 83%, FPR = 63%) and accuracy (WACC = 85%, FPR = 90%). The combination of global and local features proves to be effective, making the algorithm suitable for the detection of both focal and diffused malformations. Compared to other existing algorithms, this method shows higher accuracy and sensitivity.
Automatic localization of cerebral cortical malformations using fractal analysis

NASA Astrophysics Data System (ADS)

De Luca, A.; Arrigoni, F.; Romaniello, R.; Triulzi, F. M.; Peruzzo, D.; Bertoldo, A.

2016-08-01

Malformations of cortical development (MCDs) encompass a variety of brain disorders affecting the normal development and organization of the brain cortex. The relatively low incidence and the extreme heterogeneity of these disorders hamper the application of classical group level approaches for the detection of lesions. Here, we present a geometrical descriptor for a voxel level analysis based on fractal geometry, then define two similarity measures to detect the lesions at single subject level. The pipeline was applied to 15 normal children and nine pediatric patients affected by MCDs following two criteria, maximum accuracy (WACC) and minimization of false positives (FPR), and proved that our lesion detection algorithm is able to detect and locate abnormalities of the brain cortex with high specificity (WACC = 85%, FPR = 96%), sensitivity (WACC = 83%, FPR = 63%) and accuracy (WACC = 85%, FPR = 90%). The combination of global and local features proves to be effective, making the algorithm suitable for the detection of both focal and diffused malformations. Compared to other existing algorithms, this method shows higher accuracy and sensitivity.
In silico prediction of ROCK II inhibitors by different classification approaches.

PubMed

Cai, Chuipu; Wu, Qihui; Luo, Yunxia; Ma, Huili; Shen, Jiangang; Zhang, Yongbin; Yang, Lei; Chen, Yunbo; Wen, Zehuai; Wang, Qi

2017-11-01

ROCK II is an important pharmacological target linked to central nervous system disorders such as Alzheimer's disease. The purpose of this research is to generate ROCK II inhibitor prediction models by machine learning approaches. Firstly, four sets of descriptors were calculated with MOE 2010 and PaDEL-Descriptor, and optimized by F-score and linear forward selection methods. In addition, four classification algorithms were used to initially build 16 classifiers with k-nearest neighbors [Formula: see text], naïve Bayes, Random forest, and support vector machine. Furthermore, three sets of structural fingerprint descriptors were introduced to enhance the predictive capacity of classifiers, which were assessed with fivefold cross-validation, test set validation and external test set validation. The best two models, MFK + MACCS and MLR + SubFP, have both MCC values of 0.925 for external test set. After that, a privileged substructure analysis was performed to reveal common chemical features of ROCK II inhibitors. Finally, binding modes were analyzed to identify relationships between molecular descriptors and activity, while main interactions were revealed by comparing the docking interaction of the most potent and the weakest ROCK II inhibitors. To the best of our knowledge, this is the first report on ROCK II inhibitors utilizing machine learning approaches that provides a new method for discovering novel ROCK II inhibitors.
Use of in Vitro HTS-Derived Concentration-Response Data as ...

EPA Pesticide Factsheets

Background: Quantitative high-throughput screening (qHTS) assays are increasingly being employed to inform chemical hazard identification. Hundreds of chemicals have been tested in dozens of cell lines across extensive concentration ranges by the National Toxicology Program in collaboration with the NIH Chemical Genomics Center. Objectives: To test a hypothesis that dose-response data points of the qHTS assays can serve as biological descriptors of assayed chemicals and, when combined with conventional chemical descriptors, may improve the accuracy of Quantitative Structure-Activity Relationship (QSAR) models applied to prediction of in vivo toxicity endpoints. Methods and Results: The cell viability qHTS concentration-response data for 1,408 substances assayed in 13 cell lines were obtained from PubChem; for a subset of these compounds rodent acute toxicity LD50 data were also available. The classification k Nearest Neighbor and Random Forest QSAR methods were employed for modeling LD50 data using either chemical descriptors alone (conventional models) or in combination with biological descriptors derived from the concentration-response qHTS data (hybrid models). Critical to our approach was the use of a novel noise-filtering algorithm to treat qHTS data. We show that both the external classification accuracy and coverage (i.e., fraction of compounds in the external set that fall within the applicability domain) of the hybrid QSAR models was superior to convent
A blur-invariant local feature for motion blurred image matching

NASA Astrophysics Data System (ADS)

Tong, Qiang; Aoki, Terumasa

2017-07-01

Image matching between a blurred (caused by camera motion, out of focus, etc.) image and a non-blurred image is a critical task for many image/video applications. However, most of the existing local feature schemes fail to achieve this work. This paper presents a blur-invariant descriptor and a novel local feature scheme including the descriptor and the interest point detector based on moment symmetry - the authors' previous work. The descriptor is based on a new concept - center peak moment-like element (CPME) which is robust to blur and boundary effect. Then by constructing CPMEs, the descriptor is also distinctive and suitable for image matching. Experimental results show our scheme outperforms state of the art methods for blurred image matching
Selection of morphoagronomic descriptors for the characterization of accessions of cassava of the Eastern Brazilian Amazon.

PubMed

Silva, R S; Moura, E F; Farias-Neto, J T; Ledo, C A S; Sampaio, J E

2017-04-13

The aim of this study was to select morphoagronomic descriptors to characterize cassava accessions representative of Eastern Brazilian Amazonia. It was characterized 262 accessions using 21 qualitative descriptors. The multiple-correspondence analysis (MCA) technique was applied using the criteria: contribution of the descriptor in the last factorial axis of analysis in successive cycles (SMCA); reverse order of the descriptor's contribution in the last factorial axis of analysis with all descriptors ('O'´p') of Jolliffe's method; mean of the contribution orders of the descriptor in the first three factorial axes in the analysis with all descriptors ('Os') together with ('O'´p'); and order of contribution of weighted mean in the first three factorial axes in the analysis of all descriptors ('Oz'). The dissimilarity coefficient was measured by the method of multicategorical variables. The correlation among the matrix generated with all descriptors and matrices based on each criteria varied (r = 0.21, r = 0.97, r = 0.98, r = 0.13 for SMCA, 'Os', 'Oz' and 'O'´p', respectively). The least informative descriptors were discarded independently and according to both 'Os' and 'Oz' criteria. Thirteen descriptors were capable to discriminate the accessions and to represent the morphological variability of accessions sampled in Brazilian Eastern Amazonia: color of apical leaves, petiole color, color of stem exterior, external color of storage root, color of stem cortex, color of root pulp, texture of root epidermis, color of leaf vein, color of stem epidermis, color of end branches of adult plant, branching habit, root shape, and constriction of root.
Oral LD50 toxicity modeling and prediction of per- and polyfluorinated chemicals on rat and mouse.

PubMed

Bhhatarai, Barun; Gramatica, Paola

2011-05-01

Quantitative structure-activity relationship (QSAR) analyses were performed using the LD(50) oral toxicity data of per- and polyfluorinated chemicals (PFCs) on rodents: rat and mouse. PFCs are studied under the EU project CADASTER which uses the available experimental data for prediction and prioritization of toxic chemicals for risk assessment by using the in silico tools. The methodology presented here applies chemometrical analysis on the existing experimental data and predicts the toxicity of new compounds. QSAR analyses were performed on the available 58 mouse and 50 rat LD(50) oral data using multiple linear regression (MLR) based on theoretical molecular descriptors selected by genetic algorithm (GA). Training and prediction sets were prepared a priori from available experimental datasets in terms of structure and response. These sets were used to derive statistically robust and predictive (both internally and externally) models. The structural applicability domain (AD) of the models were verified on 376 per- and polyfluorinated chemicals including those in REACH preregistration list. The rat and mouse endpoints were predicted by each model for the studied compounds, and finally 30 compounds, all perfluorinated, were prioritized as most important for experimental toxicity analysis under the project. In addition, cumulative study on compounds within the AD of all four models, including two earlier published models on LC(50) rodent analysis was studied and the cumulative toxicity trend was observed using principal component analysis (PCA). The similarities and the differences observed in terms of descriptors and chemical/mechanistic meaning encoded by descriptors to prioritize the most toxic compounds are highlighted.
Quantitative structure-activity relationship (QSAR) for insecticides: development of predictive in vivo insecticide activity models.

PubMed

Naik, P K; Singh, T; Singh, H

2009-07-01

Quantitative structure-activity relationship (QSAR) analyses were performed independently on data sets belonging to two groups of insecticides, namely the organophosphates and carbamates. Several types of descriptors including topological, spatial, thermodynamic, information content, lead likeness and E-state indices were used to derive quantitative relationships between insecticide activities and structural properties of chemicals. A systematic search approach based on missing value, zero value, simple correlation and multi-collinearity tests as well as the use of a genetic algorithm allowed the optimal selection of the descriptors used to generate the models. The QSAR models developed for both organophosphate and carbamate groups revealed good predictability with r(2) values of 0.949 and 0.838 as well as [image omitted] values of 0.890 and 0.765, respectively. In addition, a linear correlation was observed between the predicted and experimental LD(50) values for the test set data with r(2) of 0.871 and 0.788 for both the organophosphate and carbamate groups, indicating that the prediction accuracy of the QSAR models was acceptable. The models were also tested successfully from external validation criteria. QSAR models developed in this study should help further design of novel potent insecticides.
Cellular automata rule characterization and classification using texture descriptors

NASA Astrophysics Data System (ADS)

Machicao, Jeaneth; Ribas, Lucas C.; Scabini, Leonardo F. S.; Bruno, Odermir M.

2018-05-01

The cellular automata (CA) spatio-temporal patterns have attracted the attention from many researchers since it can provide emergent behavior resulting from the dynamics of each individual cell. In this manuscript, we propose an approach of texture image analysis to characterize and classify CA rules. The proposed method converts the CA spatio-temporal patterns into a gray-scale image. The gray-scale is obtained by creating a binary number based on the 8-connected neighborhood of each dot of the CA spatio-temporal pattern. We demonstrate that this technique enhances the CA rule characterization and allow to use different texture image analysis algorithms. Thus, various texture descriptors were evaluated in a supervised training approach aiming to characterize the CA's global evolution. Our results show the efficiency of the proposed method for the classification of the elementary CA (ECAs), reaching a maximum of 99.57% of accuracy rate according to the Li-Packard scheme (6 classes) and 94.36% for the classification of the 88 rules scheme. Moreover, within the image analysis context, we found a better performance of the method by means of a transformation of the binary states to a gray-scale.
QSAR models based on quantum topological molecular similarity.

PubMed

Popelier, P L A; Smith, P J

2006-07-01

A new method called quantum topological molecular similarity (QTMS) was fairly recently proposed [J. Chem. Inf. Comp. Sc., 41, 2001, 764] to construct a variety of medicinal, ecological and physical organic QSAR/QSPRs. QTMS method uses quantum chemical topology (QCT) to define electronic descriptors drawn from modern ab initio wave functions of geometry-optimised molecules. It was shown that the current abundance of computing power can be utilised to inject realistic descriptors into QSAR/QSPRs. In this article we study seven datasets of medicinal interest : the dissociation constants (pK(a)) for a set of substituted imidazolines , the pK(a) of imidazoles , the ability of a set of indole derivatives to displace [(3)H] flunitrazepam from binding to bovine cortical membranes , the influenza inhibition constants for a set of benzimidazoles , the interaction constants for a set of amides and the enzyme liver alcohol dehydrogenase , the natriuretic activity of sulphonamide carbonic anhydrase inhibitors and the toxicity of a series of benzyl alcohols. A partial least square analysis in conjunction with a genetic algorithm delivered excellent models. They are also able to highlight the active site, of the ligand or the molecule whose structure determines the activity. The advantages and limitations of QTMS are discussed.
Chemical graphs, molecular matrices and topological indices in chemoinformatics and quantitative structure-activity relationships.

PubMed

Ivanciuc, Ovidiu

2013-06-01

Chemical and molecular graphs have fundamental applications in chemoinformatics, quantitative structureproperty relationships (QSPR), quantitative structure-activity relationships (QSAR), virtual screening of chemical libraries, and computational drug design. Chemoinformatics applications of graphs include chemical structure representation and coding, database search and retrieval, and physicochemical property prediction. QSPR, QSAR and virtual screening are based on the structure-property principle, which states that the physicochemical and biological properties of chemical compounds can be predicted from their chemical structure. Such structure-property correlations are usually developed from topological indices and fingerprints computed from the molecular graph and from molecular descriptors computed from the three-dimensional chemical structure. We present here a selection of the most important graph descriptors and topological indices, including molecular matrices, graph spectra, spectral moments, graph polynomials, and vertex topological indices. These graph descriptors are used to define several topological indices based on molecular connectivity, graph distance, reciprocal distance, distance-degree, distance-valency, spectra, polynomials, and information theory concepts. The molecular descriptors and topological indices can be developed with a more general approach, based on molecular graph operators, which define a family of graph indices related by a common formula. Graph descriptors and topological indices for molecules containing heteroatoms and multiple bonds are computed with weighting schemes based on atomic properties, such as the atomic number, covalent radius, or electronegativity. The correlation in QSPR and QSAR models can be improved by optimizing some parameters in the formula of topological indices, as demonstrated for structural descriptors based on atomic connectivity and graph distance.
Automatic Depth Extraction from 2D Images Using a Cluster-Based Learning Framework.

PubMed

Herrera, Jose L; Del-Blanco, Carlos R; Garcia, Narciso

2018-07-01

There has been a significant increase in the availability of 3D players and displays in the last years. Nonetheless, the amount of 3D content has not experimented an increment of such magnitude. To alleviate this problem, many algorithms for converting images and videos from 2D to 3D have been proposed. Here, we present an automatic learning-based 2D-3D image conversion approach, based on the key hypothesis that color images with similar structure likely present a similar depth structure. The presented algorithm estimates the depth of a color query image using the prior knowledge provided by a repository of color + depth images. The algorithm clusters this database attending to their structural similarity, and then creates a representative of each color-depth image cluster that will be used as prior depth map. The selection of the appropriate prior depth map corresponding to one given color query image is accomplished by comparing the structural similarity in the color domain between the query image and the database. The comparison is based on a K-Nearest Neighbor framework that uses a learning procedure to build an adaptive combination of image feature descriptors. The best correspondences determine the cluster, and in turn the associated prior depth map. Finally, this prior estimation is enhanced through a segmentation-guided filtering that obtains the final depth map estimation. This approach has been tested using two publicly available databases, and compared with several state-of-the-art algorithms in order to prove its efficiency.
A novel scheme for automatic nonrigid image registration using deformation invariant feature and geometric constraint

NASA Astrophysics Data System (ADS)

Deng, Zhipeng; Lei, Lin; Zhou, Shilin

2015-10-01

Automatic image registration is a vital yet challenging task, particularly for non-rigid deformation images which are more complicated and common in remote sensing images, such as distorted UAV (unmanned aerial vehicle) images or scanning imaging images caused by flutter. Traditional non-rigid image registration methods are based on the correctly matched corresponding landmarks, which usually needs artificial markers. It is a rather challenging task to locate the accurate position of the points and get accurate homonymy point sets. In this paper, we proposed an automatic non-rigid image registration algorithm which mainly consists of three steps: To begin with, we introduce an automatic feature point extraction method based on non-linear scale space and uniform distribution strategy to extract the points which are uniform distributed along the edge of the image. Next, we propose a hybrid point matching algorithm using DaLI (Deformation and Light Invariant) descriptor and local affine invariant geometric constraint based on triangulation which is constructed by K-nearest neighbor algorithm. Based on the accurate homonymy point sets, the two images are registrated by the model of TPS (Thin Plate Spline). Our method is demonstrated by three deliberately designed experiments. The first two experiments are designed to evaluate the distribution of point set and the correctly matching rate on synthetic data and real data respectively. The last experiment is designed on the non-rigid deformation remote sensing images and the three experimental results demonstrate the accuracy, robustness, and efficiency of the proposed algorithm compared with other traditional methods.
Quantitative morphometric analysis of hepatocellular carcinoma: development of a programmed algorithm and preliminary application.

PubMed

Yap, Felix Y; Bui, James T; Knuttinen, M Grace; Walzer, Natasha M; Cotler, Scott J; Owens, Charles A; Berkes, Jamie L; Gaba, Ron C

2013-01-01

The quantitative relationship between tumor morphology and malignant potential has not been explored in liver tumors. We designed a computer algorithm to analyze shape features of hepatocellular carcinoma (HCC) and tested feasibility of morphologic analysis. Cross-sectional images from 118 patients diagnosed with HCC between 2007 and 2010 were extracted at the widest index tumor diameter. The tumor margins were outlined, and point coordinates were input into a MATLAB (MathWorks Inc., Natick, Massachusetts, USA) algorithm. Twelve shape descriptors were calculated per tumor: the compactness, the mean radial distance (MRD), the RD standard deviation (RDSD), the RD area ratio (RDAR), the zero crossings, entropy, the mean Feret diameter (MFD), the Feret ratio, the convex hull area (CHA) and perimeter (CHP) ratios, the elliptic compactness (EC), and the elliptic irregularity (EI). The parameters were correlated with the levels of alpha-fetoprotein (AFP) as an indicator of tumor aggressiveness. The quantitative morphometric analysis was technically successful in all cases. The mean parameters were as follows: compactness 0.88±0.086, MRD 0.83±0.056, RDSD 0.087±0.037, RDAR 0.045±0.023, zero crossings 6±2.2, entropy 1.43±0.16, MFD 4.40±3.14 cm, Feret ratio 0.78±0.089, CHA 0.98±0.027, CHP 0.98±0.030, EC 0.95±0.043, and EI 0.95±0.023. MFD and RDAR provided the widest value range for the best shape discrimination. The larger tumors were less compact, more concave, and less ellipsoid than the smaller tumors (P < 0.0001). AFP-producing tumors displayed greater morphologic irregularity based on several parameters, including compactness, MRD, RDSD, RDAR, entropy, and EI (P < 0.05 for all). Computerized HCC image analysis using shape descriptors is technically feasible. Aggressively growing tumors have wider diameters and more irregular margins. Future studies will determine further clinical applications for this morphologic analysis.
Lattice enumeration for inverse molecular design using the signature descriptor.

PubMed

Martin, Shawn

2012-07-23

We describe an inverse quantitative structure-activity relationship (QSAR) framework developed for the design of molecular structures with desired properties. This framework uses chemical fragments encoded with a molecular descriptor known as a signature. It solves a system of linear constrained Diophantine equations to reorganize the fragments into novel molecular structures. The method has been previously applied to problems in drug and materials design but has inherent computational limitations due to the necessity of solving the Diophantine constraints. We propose a new approach to overcome these limitations using the Fincke-Pohst algorithm for lattice enumeration. We benchmark the new approach against previous results on LFA-1/ICAM-1 inhibitory peptides, linear homopolymers, and hydrofluoroether foam blowing agents. Software implementing the new approach is available at www.cs.otago.ac.nz/homepages/smartin.

Descriptors for ions and ion-pairs for use in linear free energy relationships.

PubMed

Abraham, Michael H; Acree, William E

2016-01-22

The determination of Abraham descriptors for single ions is reviewed, and equations are given for the partition of single ions from water to a number of solvents. These ions include permanent anions and cations and ionic species such as carboxylic acid anions, phenoxide anions and protonated base cations. Descriptors for a large number of ions and ionic species are listed, and equations for the prediction of Abraham descriptors for ionic species are given. The application of descriptors for ions and ionic species to physicochemical processes is given; these are to water-solvent partitions, HPLC retention data, immobilised artificial membranes, the Finkelstein reaction and diffusion in water. Applications to biological processes include brain permeation, microsomal degradation of drugs, skin permeation and human intestinal absorption. The review concludes with a section on the determination of descriptors for ion-pairs. Copyright © 2015 Elsevier B.V. All rights reserved.
Design of an optimal preview controller for linear discrete-time descriptor systems with state delay

NASA Astrophysics Data System (ADS)

Cao, Mengjuan; Liao, Fucheng

2015-04-01

In this paper, the linear discrete-time descriptor system with state delay is studied, and a design method for an optimal preview controller is proposed. First, by using the discrete lifting technique, the original system is transformed into a general descriptor system without state delay in form. Then, taking advantage of the first-order forward difference operator, we construct a descriptor augmented error system, including the state vectors of the lifted system, error vectors, and desired target signals. Rigorous mathematical proofs are given for the regularity, stabilisability, causal controllability, and causal observability of the descriptor augmented error system. Based on these, the optimal preview controller with preview feedforward compensation for the original system is obtained by using the standard optimal regulator theory of the descriptor system. The effectiveness of the proposed method is shown by numerical simulation.
RANZCR Body Systems Framework of diagnostic imaging examination descriptors.

PubMed

Pitman, Alexander G; Penlington, Lisa; Doromal, Darren; Slater, Gregory; Vukolova, Natalia

2014-08-01

A unified and logical system of descriptors for diagnostic imaging examinations and procedures is a desirable resource for radiology in Australia and New Zealand and is needed to support core activities of RANZCR. Existing descriptor systems available in Australia and New Zealand (including the Medicare DIST and the ACC Schedule) have significant limitations and are inappropriate for broader clinical application. An anatomically based grid was constructed, with anatomical structures arranged in rows and diagnostic imaging modalities arranged in columns (including nuclear medicine and positron emission tomography). The grid was segregated into five body systems. The cells at the intersection of an anatomical structure row and an imaging modality column were populated with short, formulaic descriptors of the applicable diagnostic imaging examinations. Clinically illogical or physically impossible combinations were 'greyed out'. Where the same examination applied to different anatomical structures, the descriptor was kept identical for the purposes of streamlining. The resulting Body Systems Framework of diagnostic imaging examination descriptors lists all the reasonably common diagnostic imaging examinations currently performed in Australia and New Zealand using a unified grid structure allowing navigation by both referrers and radiologists. The Framework has been placed on the RANZCR website and is available for access free of charge by registered users. The Body Systems Framework of diagnostic imaging examination descriptors is a system of descriptors based on relationships between anatomical structures and imaging modalities. The Framework is now available as a resource and reference point for the radiology profession and to support core College activities. © 2014 The Royal Australian and New Zealand College of Radiologists.
Character context: a shape descriptor for Arabic handwriting recognition

NASA Astrophysics Data System (ADS)

Mudhsh, Mohammed; Almodfer, Rolla; Duan, Pengfei; Xiong, Shengwu

2017-11-01

In the handwriting recognition field, designing good descriptors are substantial to obtain rich information of the data. However, the handwriting recognition research of a good descriptor is still an open issue due to unlimited variation in human handwriting. We introduce a "character context descriptor" that efficiently dealt with the structural characteristics of Arabic handwritten characters. First, the character image is smoothed and normalized, then the character context descriptor of 32 feature bins is built based on the proposed "distance function." Finally, a multilayer perceptron with regularization is used as a classifier. On experimentation with a handwritten Arabic characters database, the proposed method achieved a state-of-the-art performance with recognition rate equal to 98.93% and 99.06% for the 66 and 24 classes, respectively.
A method of 3D object recognition and localization in a cloud of points

NASA Astrophysics Data System (ADS)

Bielicki, Jerzy; Sitnik, Robert

2013-12-01

The proposed method given in this article is prepared for analysis of data in the form of cloud of points directly from 3D measurements. It is designed for use in the end-user applications that can directly be integrated with 3D scanning software. The method utilizes locally calculated feature vectors (FVs) in point cloud data. Recognition is based on comparison of the analyzed scene with reference object library. A global descriptor in the form of a set of spatially distributed FVs is created for each reference model. During the detection process, correlation of subsets of reference FVs with FVs calculated in the scene is computed. Features utilized in the algorithm are based on parameters, which qualitatively estimate mean and Gaussian curvatures. Replacement of differentiation with averaging in the curvatures estimation makes the algorithm more resistant to discontinuities and poor quality of the input data. Utilization of the FV subsets allows to detect partially occluded and cluttered objects in the scene, while additional spatial information maintains false positive rate at a reasonably low level.
Constrained dictionary learning and probabilistic hypergraph ranking for person re-identification

NASA Astrophysics Data System (ADS)

He, You; Wu, Song; Pu, Nan; Qian, Li; Xiao, Guoqiang

2018-04-01

Person re-identification is a fundamental and inevitable task in public security. In this paper, we propose a novel framework to improve the performance of this task. First, two different types of descriptors are extracted to represent a pedestrian: (1) appearance-based superpixel features, which are constituted mainly by conventional color features and extracted from the supepixel rather than a whole picture and (2) due to the limitation of discrimination of appearance features, the deep features extracted by feature fusion Network are also used. Second, a view invariant subspace is learned by dictionary learning constrained by the minimum negative sample (termed as DL-cMN) to reduce the noise in appearance-based superpixel feature domain. Then, we use deep features and sparse codes transformed by appearancebased features to establish the hyperedges respectively by k-nearest neighbor, rather than jointing different features simply. Finally, a final ranking is performed by probabilistic hypergraph ranking algorithm. Extensive experiments on three challenging datasets (VIPeR, PRID450S and CUHK01) demonstrate the advantages and effectiveness of our proposed algorithm.
Innovative design method of automobile profile based on Fourier descriptor

NASA Astrophysics Data System (ADS)

Gao, Shuyong; Fu, Chaoxing; Xia, Fan; Shen, Wei

2017-10-01

Aiming at the innovation of the contours of automobile side, this paper presents an innovative design method of vehicle side profile based on Fourier descriptor. The design flow of this design method is: pre-processing, coordinate extraction, standardization, discrete Fourier transform, simplified Fourier descriptor, exchange descriptor innovation, inverse Fourier transform to get the outline of innovative design. Innovative concepts of the innovative methods of gene exchange among species and the innovative methods of gene exchange among different species are presented, and the contours of the innovative design are obtained separately. A three-dimensional model of a car is obtained by referring to the profile curve which is obtained by exchanging xenogeneic genes. The feasibility of the method proposed in this paper is verified by various aspects.
gWEGA: GPU-accelerated WEGA for molecular superposition and shape comparison.

PubMed

Yan, Xin; Li, Jiabo; Gu, Qiong; Xu, Jun

2014-06-05

Virtual screening of a large chemical library for drug lead identification requires searching/superimposing a large number of three-dimensional (3D) chemical structures. This article reports a graphic processing unit (GPU)-accelerated weighted Gaussian algorithm (gWEGA) that expedites shape or shape-feature similarity score-based virtual screening. With 86 GPU nodes (each node has one GPU card), gWEGA can screen 110 million conformations derived from an entire ZINC drug-like database with diverse antidiabetic agents as query structures within 2 s (i.e., screening more than 55 million conformations per second). The rapid screening speed was accomplished through the massive parallelization on multiple GPU nodes and rapid prescreening of 3D structures (based on their shape descriptors and pharmacophore feature compositions). Copyright © 2014 Wiley Periodicals, Inc.
PyGlobal: A toolkit for automated compilation of DFT-based descriptors.

PubMed

Nath, Shilpa R; Kurup, Sudheer S; Joshi, Kaustubh A

2016-06-15

Density Functional Theory (DFT)-based Global reactivity descriptor calculations have emerged as powerful tools for studying the reactivity, selectivity, and stability of chemical and biological systems. A Python-based module, PyGlobal has been developed for systematically parsing a typical Gaussian outfile and extracting the relevant energies of the HOMO and LUMO. Corresponding global reactivity descriptors are further calculated and the data is saved into a spreadsheet compatible with applications like Microsoft Excel and LibreOffice. The efficiency of the module has been accounted by measuring the time interval for randomly selected Gaussian outfiles for 1000 molecules. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
An Automatic Registration Algorithm for 3D Maxillofacial Model

NASA Astrophysics Data System (ADS)

Qiu, Luwen; Zhou, Zhongwei; Guo, Jixiang; Lv, Jiancheng

2016-09-01

3D image registration aims at aligning two 3D data sets in a common coordinate system, which has been widely used in computer vision, pattern recognition and computer assisted surgery. One challenging problem in 3D registration is that point-wise correspondences between two point sets are often unknown apriori. In this work, we develop an automatic algorithm for 3D maxillofacial models registration including facial surface model and skull model. Our proposed registration algorithm can achieve a good alignment result between partial and whole maxillofacial model in spite of ambiguous matching, which has a potential application in the oral and maxillofacial reparative and reconstructive surgery. The proposed algorithm includes three steps: (1) 3D-SIFT features extraction and FPFH descriptors construction; (2) feature matching using SAC-IA; (3) coarse rigid alignment and refinement by ICP. Experiments on facial surfaces and mandible skull models demonstrate the efficiency and robustness of our algorithm.
Detection of gait characteristics for scene registration in video surveillance system.

PubMed

Havasi, László; Szlávik, Zoltán; Szirányi, Tamás

2007-02-01

This paper presents a robust walk-detection algorithm, based on our symmetry approach which can be used to extract gait characteristics from video-image sequences. To obtain a useful descriptor of a walking person, we temporally track the symmetries of a person's legs. Our method is suitable for use in indoor or outdoor surveillance scenes. Determining the leading leg of the walking subject is important, and the presented method can identify this from two successive walk steps (one walk cycle). We tested the accuracy of the presented walk-detection method in a possible application: Image registration methods are presented which are applicable to multicamera systems viewing human subjects in motion.
Eyeglasses Lens Contour Extraction from Facial Images Using an Efficient Shape Description

PubMed Central

Borza, Diana; Darabant, Adrian Sergiu; Danescu, Radu

2013-01-01

This paper presents a system that automatically extracts the position of the eyeglasses and the accurate shape and size of the frame lenses in facial images. The novelty brought by this paper consists in three key contributions. The first one is an original model for representing the shape of the eyeglasses lens, using Fourier descriptors. The second one is a method for generating the search space starting from a finite, relatively small number of representative lens shapes based on Fourier morphing. Finally, we propose an accurate lens contour extraction algorithm using a multi-stage Monte Carlo sampling technique. Multiple experiments demonstrate the effectiveness of our approach. PMID:24152926
Thermal-depth matching in dynamic scene based on affine projection and feature registration

NASA Astrophysics Data System (ADS)

Wang, Hongyu; Jia, Tong; Wu, Chengdong; Li, Yongqiang

2018-03-01

This paper aims to study the construction of 3D temperature distribution reconstruction system based on depth and thermal infrared information. Initially, a traditional calibration method cannot be directly used, because the depth and thermal infrared camera is not sensitive to the color calibration board. Therefore, this paper aims to design a depth and thermal infrared camera calibration board to complete the calibration of the depth and thermal infrared camera. Meanwhile a local feature descriptors in thermal and depth images is proposed. The belief propagation matching algorithm is also investigated based on the space affine transformation matching and local feature matching. The 3D temperature distribution model is built based on the matching of 3D point cloud and 2D thermal infrared information. Experimental results show that the method can accurately construct the 3D temperature distribution model, and has strong robustness.
A machine learning approach for ranking clusters of docked protein‐protein complexes by pairwise cluster comparison

PubMed Central

Pfeiffenberger, Erik; Chaleil, Raphael A.G.; Moal, Iain H.

2017-01-01

ABSTRACT Reliable identification of near‐native poses of docked protein–protein complexes is still an unsolved problem. The intrinsic heterogeneity of protein–protein interactions is challenging for traditional biophysical or knowledge based potentials and the identification of many false positive binding sites is not unusual. Often, ranking protocols are based on initial clustering of docked poses followed by the application of an energy function to rank each cluster according to its lowest energy member. Here, we present an approach of cluster ranking based not only on one molecular descriptor (e.g., an energy function) but also employing a large number of descriptors that are integrated in a machine learning model, whereby, an extremely randomized tree classifier based on 109 molecular descriptors is trained. The protocol is based on first locally enriching clusters with additional poses, the clusters are then characterized using features describing the distribution of molecular descriptors within the cluster, which are combined into a pairwise cluster comparison model to discriminate near‐native from incorrect clusters. The results show that our approach is able to identify clusters containing near‐native protein–protein complexes. In addition, we present an analysis of the descriptors with respect to their power to discriminate near native from incorrect clusters and how data transformations and recursive feature elimination can improve the ranking performance. Proteins 2017; 85:528–543. © 2016 Wiley Periodicals, Inc. PMID:27935158
SVM Based Descriptor Selection and Classification of Neurodegenerative Disease Drugs for Pharmacological Modeling.

PubMed

Shahid, Mohammad; Shahzad Cheema, Muhammad; Klenner, Alexander; Younesi, Erfan; Hofmann-Apitius, Martin

2013-03-01

Systems pharmacological modeling of drug mode of action for the next generation of multitarget drugs may open new routes for drug design and discovery. Computational methods are widely used in this context amongst which support vector machines (SVM) have proven successful in addressing the challenge of classifying drugs with similar features. We have applied a variety of such SVM-based approaches, namely SVM-based recursive feature elimination (SVM-RFE). We use the approach to predict the pharmacological properties of drugs widely used against complex neurodegenerative disorders (NDD) and to build an in-silico computational model for the binary classification of NDD drugs from other drugs. Application of an SVM-RFE model to a set of drugs successfully classified NDD drugs from non-NDD drugs and resulted in overall accuracy of ∼80 % with 10 fold cross validation using 40 top ranked molecular descriptors selected out of total 314 descriptors. Moreover, SVM-RFE method outperformed linear discriminant analysis (LDA) based feature selection and classification. The model reduced the multidimensional descriptors space of drugs dramatically and predicted NDD drugs with high accuracy, while avoiding over fitting. Based on these results, NDD-specific focused libraries of drug-like compounds can be designed and existing NDD-specific drugs can be characterized by a well-characterized set of molecular descriptors. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Annotations of Mexican bullfighting videos for semantic index

NASA Astrophysics Data System (ADS)

Montoya Obeso, Abraham; Oropesa Morales, Lester Arturo; Fernando Vázquez, Luis; Cocolán Almeda, Sara Ivonne; Stoian, Andrei; García Vázquez, Mireya Saraí; Zamudio Fuentes, Luis Miguel; Montiel Perez, Jesús Yalja; de la O Torres, Saul; Ramírez Acosta, Alejandro Alvaro

2015-09-01

The video annotation is important for web indexing and browsing systems. Indeed, in order to evaluate the performance of video query and mining techniques, databases with concept annotations are required. Therefore, it is necessary generate a database with a semantic indexing that represents the digital content of the Mexican bullfighting atmosphere. This paper proposes a scheme to make complex annotations in a video in the frame of multimedia search engine project. Each video is partitioned using our segmentation algorithm that creates shots of different length and different number of frames. In order to make complex annotations about the video, we use ELAN software. The annotations are done in two steps: First, we take note about the whole content in each shot. Second, we describe the actions as parameters of the camera like direction, position and deepness. As a consequence, we obtain a more complete descriptor of every action. In both cases we use the concepts of the TRECVid 2014 dataset. We also propose new concepts. This methodology allows to generate a database with the necessary information to create descriptors and algorithms capable to detect actions to automatically index and classify new bullfighting multimedia content.
Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis

PubMed Central

Galván-Tejada, Carlos E.; Zanella-Calzada, Laura A.; Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L.

2017-01-01

Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions. PMID:28216571
Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis.

PubMed

Galván-Tejada, Carlos E; Zanella-Calzada, Laura A; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L

2017-02-14

Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions.
Visual analytics in cheminformatics: user-supervised descriptor selection for QSAR methods.

PubMed

Martínez, María Jimena; Ponzoni, Ignacio; Díaz, Mónica F; Vazquez, Gustavo E; Soto, Axel J

2015-01-01

The design of QSAR/QSPR models is a challenging problem, where the selection of the most relevant descriptors constitutes a key step of the process. Several feature selection methods that address this step are concentrated on statistical associations among descriptors and target properties, whereas the chemical knowledge is left out of the analysis. For this reason, the interpretability and generality of the QSAR/QSPR models obtained by these feature selection methods are drastically affected. Therefore, an approach for integrating domain expert's knowledge in the selection process is needed for increase the confidence in the final set of descriptors. In this paper a software tool, which we named Visual and Interactive DEscriptor ANalysis (VIDEAN), that combines statistical methods with interactive visualizations for choosing a set of descriptors for predicting a target property is proposed. Domain expertise can be added to the feature selection process by means of an interactive visual exploration of data, and aided by statistical tools and metrics based on information theory. Coordinated visual representations are presented for capturing different relationships and interactions among descriptors, target properties and candidate subsets of descriptors. The competencies of the proposed software were assessed through different scenarios. These scenarios reveal how an expert can use this tool to choose one subset of descriptors from a group of candidate subsets or how to modify existing descriptor subsets and even incorporate new descriptors according to his or her own knowledge of the target property. The reported experiences showed the suitability of our software for selecting sets of descriptors with low cardinality, high interpretability, low redundancy and high statistical performance in a visual exploratory way. Therefore, it is possible to conclude that the resulting tool allows the integration of a chemist's expertise in the descriptor selection process with a low cognitive effort in contrast with the alternative of using an ad-hoc manual analysis of the selected descriptors. Graphical abstractVIDEAN allows the visual analysis of candidate subsets of descriptors for QSAR/QSPR. In the two panels on the top, users can interactively explore numerical correlations as well as co-occurrences in the candidate subsets through two interactive graphs.
Toxicity prediction of ionic liquids based on Daphnia magna by using density functional theory

NASA Astrophysics Data System (ADS)

Nu’aim, M. N.; Bustam, M. A.

2018-04-01

By using a model called density functional theory, the toxicity of ionic liquids can be predicted and forecast. It is a theory that allowing the researcher to have a substantial tool for computation of the quantum state of atoms, molecules and solids, and molecular dynamics which also known as computer simulation method. It can be done by using structural feature based quantum chemical reactivity descriptor. The identification of ionic liquids and its Log[EC50] data are from literature data that available in Ismail Hossain thesis entitled “Synthesis, Characterization and Quantitative Structure Toxicity Relationship of Imidazolium, Pyridinium and Ammonium Based Ionic Liquids”. Each cation and anion of the ionic liquids were optimized and calculated. The geometry optimization and calculation from the software, produce the value of highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO). From the value of HOMO and LUMO, the value for other toxicity descriptors were obtained according to their formulas. The toxicity descriptor that involves are electrophilicity index, HOMO, LUMO, energy gap, chemical potential, hardness and electronegativity. The interrelation between the descriptors are being determined by using a multiple linear regression (MLR). From this MLR, all descriptors being analyzed and the descriptors that are significant were chosen. In order to develop the finest model equation for toxicity prediction of ionic liquids, the selected descriptors that are significant were used. The validation of model equation was performed with the Log[EC50] data from the literature and the final model equation was developed. A bigger range of ionic liquids which nearly 108 of ionic liquids can be predicted from this model equation.

An Integrative Object-Based Image Analysis Workflow for Uav Images

NASA Astrophysics Data System (ADS)

Yu, Huai; Yan, Tianheng; Yang, Wen; Zheng, Hong

2016-06-01

In this work, we propose an integrative framework to process UAV images. The overall process can be viewed as a pipeline consisting of the geometric and radiometric corrections, subsequent panoramic mosaicking and hierarchical image segmentation for later Object Based Image Analysis (OBIA). More precisely, we first introduce an efficient image stitching algorithm after the geometric calibration and radiometric correction, which employs a fast feature extraction and matching by combining the local difference binary descriptor and the local sensitive hashing. We then use a Binary Partition Tree (BPT) representation for the large mosaicked panoramic image, which starts by the definition of an initial partition obtained by an over-segmentation algorithm, i.e., the simple linear iterative clustering (SLIC). Finally, we build an object-based hierarchical structure by fully considering the spectral and spatial information of the super-pixels and their topological relationships. Moreover, an optimal segmentation is obtained by filtering the complex hierarchies into simpler ones according to some criterions, such as the uniform homogeneity and semantic consistency. Experimental results on processing the post-seismic UAV images of the 2013 Ya'an earthquake demonstrate the effectiveness and efficiency of our proposed method.
Automatic Detection of Galaxy Type From Datasets of Galaxies Image Based on Image Retrieval Approach.

PubMed

Abd El Aziz, Mohamed; Selim, I M; Xiong, Shengwu

2017-06-30

This paper presents a new approach for the automatic detection of galaxy morphology from datasets based on an image-retrieval approach. Currently, there are several classification methods proposed to detect galaxy types within an image. However, in some situations, the aim is not only to determine the type of galaxy within the queried image, but also to determine the most similar images for query image. Therefore, this paper proposes an image-retrieval method to detect the type of galaxies within an image and return with the most similar image. The proposed method consists of two stages, in the first stage, a set of features is extracted based on shape, color and texture descriptors, then a binary sine cosine algorithm selects the most relevant features. In the second stage, the similarity between the features of the queried galaxy image and the features of other galaxy images is computed. Our experiments were performed using the EFIGI catalogue, which contains about 5000 galaxies images with different types (edge-on spiral, spiral, elliptical and irregular). We demonstrate that our proposed approach has better performance compared with the particle swarm optimization (PSO) and genetic algorithm (GA) methods.
Using Theoretical Descriptions in Structure Activity Relations. 3. Electronic Descriptors

DTIC Science & Technology

1988-08-01

Activity Relationships (QSAR) have been used successfully in the past to develop predictive equations for several biological and physical properties...Linear Free Energy Relationships (,FF.3) and is based on work by Hammet in which he derived electronic descriptors for the dissociation of substituted...structure of a compound and its activity in a system. Several different structural descriptors have been used in QSAR equations . These range from
Dyspnea descriptors developed in Brazil: application in obese patients and in patients with cardiorespiratory diseases.

PubMed

Teixeira, Christiane Aires; Rodrigues Júnior, Antonio Luiz; Straccia, Luciana Cristina; Vianna, Elcio Dos Santos Oliveira; Silva, Geruza Alves da; Martinez, José Antônio Baddini

2011-01-01

To develop a set of descriptive terms applied to the sensation of dyspnea (dyspnea descriptors) for use in Brazil and to investigate the usefulness of these descriptors in four distinct clinical conditions that can be accompanied by dyspnea. We collected 111 dyspnea descriptors from 67 patients and 10 health professionals. These descriptors were analyzed and reduced to 15 based on their frequency of use, similarity of meaning, and potential pathophysiological value. Those 15 descriptors were applied in 50 asthma patients, 50 COPD patients, 30 patients with heart failure, and 50 patients with class II or III obesity. The three best descriptors, as selected by the patients, were studied by cluster analysis. Potential associations between the identified clusters and the four clinical conditions were also investigated. The use of this set of descriptors led to a solution with seven clusters, designated sufoco (suffocating), aperto (tight), rápido (rapid), fadiga (fatigue), abafado (stuffy), trabalho/inspiração (work/inhalation), and falta de ar (shortness of breath). Overlapping of descriptors was quite common among the patients, regardless of their clinical condition. Asthma was significantly associated with the sufoco and trabalho/inspiração clusters, whereas COPD and heart failure were associated with the sufoco, trabalho/inspiração, and falta de ar clusters. Obesity was associated only with the falta de ar cluster. In Brazil, patients who are accustomed to perceiving dyspnea employ various descriptors in order to describe the symptom, and these descriptors can be grouped into similar clusters. In our study sample, such clusters showed no usefulness in differentiating among the four clinical conditions evaluated.
Protein docking prediction using predicted protein-protein interface.

PubMed

Li, Bin; Kihara, Daisuke

2012-01-10

Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pair wise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.
Comparison Between Supervised and Unsupervised Classifications of Neuronal Cell Types: A Case Study

PubMed Central

Guerra, Luis; McGarry, Laura M; Robles, Víctor; Bielza, Concha; Larrañaga, Pedro; Yuste, Rafael

2011-01-01

In the study of neural circuits, it becomes essential to discern the different neuronal cell types that build the circuit. Traditionally, neuronal cell types have been classified using qualitative descriptors. More recently, several attempts have been made to classify neurons quantitatively, using unsupervised clustering methods. While useful, these algorithms do not take advantage of previous information known to the investigator, which could improve the classification task. For neocortical GABAergic interneurons, the problem to discern among different cell types is particularly difficult and better methods are needed to perform objective classifications. Here we explore the use of supervised classification algorithms to classify neurons based on their morphological features, using a database of 128 pyramidal cells and 199 interneurons from mouse neocortex. To evaluate the performance of different algorithms we used, as a “benchmark,” the test to automatically distinguish between pyramidal cells and interneurons, defining “ground truth” by the presence or absence of an apical dendrite. We compared hierarchical clustering with a battery of different supervised classification algorithms, finding that supervised classifications outperformed hierarchical clustering. In addition, the selection of subsets of distinguishing features enhanced the classification accuracy for both sets of algorithms. The analysis of selected variables indicates that dendritic features were most useful to distinguish pyramidal cells from interneurons when compared with somatic and axonal morphological variables. We conclude that supervised classification algorithms are better matched to the general problem of distinguishing neuronal cell types when some information on these cell groups, in our case being pyramidal or interneuron, is known a priori. As a spin-off of this methodological study, we provide several methods to automatically distinguish neocortical pyramidal cells from interneurons, based on their morphologies. © 2010 Wiley Periodicals, Inc. Develop Neurobiol 71: 71–82, 2011 PMID:21154911
Multi-resolution cell orientation congruence descriptors for epithelium segmentation in endometrial histology images.

PubMed

Li, Guannan; Raza, Shan E Ahmed; Rajpoot, Nasir M

2017-04-01

It has been recently shown that recurrent miscarriage can be caused by abnormally high ratio of number of uterine natural killer (UNK) cells to the number of stromal cells in human female uterus lining. Due to high workload, the counting of UNK and stromal cells needs to be automated using computer algorithms. However, stromal cells are very similar in appearance to epithelial cells which must be excluded in the counting process. To exclude the epithelial cells from the counting process it is necessary to identify epithelial regions. There are two types of epithelial layers that can be encountered in the endometrium: luminal epithelium and glandular epithelium. To the best of our knowledge, there is no existing method that addresses the segmentation of both types of epithelium simultaneously in endometrial histology images. In this paper, we propose a multi-resolution Cell Orientation Congruence (COCo) descriptor which exploits the fact that neighbouring epithelial cells exhibit similarity in terms of their orientations. Our experimental results show that the proposed descriptors yield accurate results in simultaneously segmenting both luminal and glandular epithelium. Copyright © 2017 Elsevier B.V. All rights reserved.
Predicting allergic contact dermatitis: a hierarchical structure activity relationship (SAR) approach to chemical classification using topological and quantum chemical descriptors

NASA Astrophysics Data System (ADS)

Basak, Subhash C.; Mills, Denise; Hawkins, Douglas M.

2008-06-01

A hierarchical classification study was carried out based on a set of 70 chemicals—35 which produce allergic contact dermatitis (ACD) and 35 which do not. This approach was implemented using a regular ridge regression computer code, followed by conversion of regression output to binary data values. The hierarchical descriptor classes used in the modeling include topostructural (TS), topochemical (TC), and quantum chemical (QC), all of which are based solely on chemical structure. The concordance, sensitivity, and specificity are reported. The model based on the TC descriptors was found to be the best, while the TS model was extremely poor.
Branch length similarity entropy-based descriptors for shape representation

NASA Astrophysics Data System (ADS)

Kwon, Ohsung; Lee, Sang-Hee

2017-11-01

In previous studies, we showed that the branch length similarity (BLS) entropy profile could be successfully used for the shape recognition such as battle tanks, facial expressions, and butterflies. In the present study, we proposed new descriptors, roundness, symmetry, and surface roughness, for the recognition, which are more accurate and fast in the computation than the previous descriptors. The roundness represents how closely a shape resembles to a circle, the symmetry characterizes how much one shape is similar with another when the shape is moved in flip, and the surface roughness quantifies the degree of vertical deviations of a shape boundary. To evaluate the performance of the descriptors, we used the database of leaf images with 12 species. Each species consisted of 10 - 20 leaf images and the total number of images were 160. The evaluation showed that the new descriptors successfully discriminated the leaf species. We believe that the descriptors can be a useful tool in the field of pattern recognition.
Rapid prediction of chemical metabolism by human UDP-glucuronosyltransferase isoforms using quantum chemical descriptors derived with the electronegativity equalization method.

PubMed

Sorich, Michael J; McKinnon, Ross A; Miners, John O; Winkler, David A; Smith, Paul A

2004-10-07

This study aimed to evaluate in silico models based on quantum chemical (QC) descriptors derived using the electronegativity equalization method (EEM) and to assess the use of QC properties to predict chemical metabolism by human UDP-glucuronosyltransferase (UGT) isoforms. Various EEM-derived QC molecular descriptors were calculated for known UGT substrates and nonsubstrates. Classification models were developed using support vector machine and partial least squares discriminant analysis. In general, the most predictive models were generated with the support vector machine. Combining QC and 2D descriptors (from previous work) using a consensus approach resulted in a statistically significant improvement in predictivity (to 84%) over both the QC and 2D models and the other methods of combining the descriptors. EEM-derived QC descriptors were shown to be both highly predictive and computationally efficient. It is likely that EEM-derived QC properties will be generally useful for predicting ADMET and physicochemical properties during drug discovery.
Description of 3D digital curves using the theory free groups

NASA Astrophysics Data System (ADS)

Imiya, Atsushi; Oosawa, Muneaki

1999-09-01

In this paper, we propose a new descriptor for two- and three- dimensional digital curves using the theory of free groups. A spatial digital curve is expressed as a word which is an element of the free group which consists from three elements. These three symbols correspond to the directions of the orthogonal coordinates, respectively. Since a digital curve is treated as a word which is a sequence of alphabetical symbols, this expression permits us to describe any geometric operation as rewriting rules for words. Furthermore, the symbolic derivative of words yields geometric invariants of digital curves for digital Euclidean motion. These invariants enable us to design algorithms for the matching and searching procedures of partial structures of digital curves. Moreover, these symbolic descriptors define the global and local distances for digital curves as an editing distance.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Mackenzie, Cristóbal; Pichara, Karim; Protopapas, Pavlos

The success of automatic classification of variable stars depends strongly on the lightcurve representation. Usually, lightcurves are represented as a vector of many descriptors designed by astronomers called features. These descriptors are expensive in terms of computing, require substantial research effort to develop, and do not guarantee a good classification. Today, lightcurve representation is not entirely automatic; algorithms must be designed and manually tuned up for every survey. The amounts of data that will be generated in the future mean astronomers must develop scalable and automated analysis pipelines. In this work we present a feature learning algorithm designed for variablemore » objects. Our method works by extracting a large number of lightcurve subsequences from a given set, which are then clustered to find common local patterns in the time series. Representatives of these common patterns are then used to transform lightcurves of a labeled set into a new representation that can be used to train a classifier. The proposed algorithm learns the features from both labeled and unlabeled lightcurves, overcoming the bias using only labeled data. We test our method on data sets from the Massive Compact Halo Object survey and the Optical Gravitational Lensing Experiment; the results show that our classification performance is as good as and in some cases better than the performance achieved using traditional statistical features, while the computational cost is significantly lower. With these promising results, we believe that our method constitutes a significant step toward the automation of the lightcurve classification pipeline.« less
Automatic Mrf-Based Registration of High Resolution Satellite Video Data

NASA Astrophysics Data System (ADS)

Platias, C.; Vakalopoulou, M.; Karantzalos, K.

2016-06-01

In this paper we propose a deformable registration framework for high resolution satellite video data able to automatically and accurately co-register satellite video frames and/or register them to a reference map/image. The proposed approach performs non-rigid registration, formulates a Markov Random Fields (MRF) model, while efficient linear programming is employed for reaching the lowest potential of the cost function. The developed approach has been applied and validated on satellite video sequences from Skybox Imaging and compared with a rigid, descriptor-based registration method. Regarding the computational performance, both the MRF-based and the descriptor-based methods were quite efficient, with the first one converging in some minutes and the second in some seconds. Regarding the registration accuracy the proposed MRF-based method significantly outperformed the descriptor-based one in all the performing experiments.
Cutting Materials in Half: A Graph Theory Approach for Generating Crystal Surfaces and Its Prediction of 2D Zeolites

DOE Office of Scientific and Technical Information (OSTI.GOV)

Witman, Matthew; Ling, Sanliang; Boyd, Peter

Scientific interest in two-dimensional (2D) materials, ranging from graphene and other single layer materials to atomically thin crystals, is quickly increasing for a large variety of technological applications. While in silico design approaches have made a large impact in the study of 3D crystals, algorithms designed to discover atomically thin 2D materials from their parent 3D materials are by comparison more sparse. Here, we hypothesize that determining how to cut a 3D material in half (i.e., which Miller surface is formed) by severing a minimal number of bonds or a minimal amount of total bond energy per unit area canmore » yield insight into preferred crystal faces. We answer this question by implementing a graph theory technique to mathematically formalize the enumeration of minimum cut surfaces of crystals. While the algorithm is generally applicable to different classes of materials, we focus on zeolitic materials due to their diverse structural topology and because 2D zeolites have promising catalytic and separation performance compared to their 3D counterparts. We report here a simple descriptor based only on structural information that predicts whether a zeolite is likely to be synthesizable in the 2D form and correctly identifies the expressed surface in known layered 2D zeolites. The discovery of this descriptor allows us to highlight other zeolites that may also be synthesized in the 2D form that have not been experimentally realized yet. Finally, our method is general since the mathematical formalism can be applied to find the minimum cut surfaces of other crystallographic materials such as metal-organic frameworks, covalent-organic frameworks, zeolitic-imidazolate frameworks, metal oxides, etc.« less
Cutting Materials in Half: A Graph Theory Approach for Generating Crystal Surfaces and Its Prediction of 2D Zeolites.

PubMed

Witman, Matthew; Ling, Sanliang; Boyd, Peter; Barthel, Senja; Haranczyk, Maciej; Slater, Ben; Smit, Berend

2018-02-28

Scientific interest in two-dimensional (2D) materials, ranging from graphene and other single layer materials to atomically thin crystals, is quickly increasing for a large variety of technological applications. While in silico design approaches have made a large impact in the study of 3D crystals, algorithms designed to discover atomically thin 2D materials from their parent 3D materials are by comparison more sparse. We hypothesize that determining how to cut a 3D material in half (i.e., which Miller surface is formed) by severing a minimal number of bonds or a minimal amount of total bond energy per unit area can yield insight into preferred crystal faces. We answer this question by implementing a graph theory technique to mathematically formalize the enumeration of minimum cut surfaces of crystals. While the algorithm is generally applicable to different classes of materials, we focus on zeolitic materials due to their diverse structural topology and because 2D zeolites have promising catalytic and separation performance compared to their 3D counterparts. We report here a simple descriptor based only on structural information that predicts whether a zeolite is likely to be synthesizable in the 2D form and correctly identifies the expressed surface in known layered 2D zeolites. The discovery of this descriptor allows us to highlight other zeolites that may also be synthesized in the 2D form that have not been experimentally realized yet. Finally, our method is general since the mathematical formalism can be applied to find the minimum cut surfaces of other crystallographic materials such as metal-organic frameworks, covalent-organic frameworks, zeolitic-imidazolate frameworks, metal oxides, etc.
Cutting Materials in Half: A Graph Theory Approach for Generating Crystal Surfaces and Its Prediction of 2D Zeolites

PubMed Central

2018-01-01

Scientific interest in two-dimensional (2D) materials, ranging from graphene and other single layer materials to atomically thin crystals, is quickly increasing for a large variety of technological applications. While in silico design approaches have made a large impact in the study of 3D crystals, algorithms designed to discover atomically thin 2D materials from their parent 3D materials are by comparison more sparse. We hypothesize that determining how to cut a 3D material in half (i.e., which Miller surface is formed) by severing a minimal number of bonds or a minimal amount of total bond energy per unit area can yield insight into preferred crystal faces. We answer this question by implementing a graph theory technique to mathematically formalize the enumeration of minimum cut surfaces of crystals. While the algorithm is generally applicable to different classes of materials, we focus on zeolitic materials due to their diverse structural topology and because 2D zeolites have promising catalytic and separation performance compared to their 3D counterparts. We report here a simple descriptor based only on structural information that predicts whether a zeolite is likely to be synthesizable in the 2D form and correctly identifies the expressed surface in known layered 2D zeolites. The discovery of this descriptor allows us to highlight other zeolites that may also be synthesized in the 2D form that have not been experimentally realized yet. Finally, our method is general since the mathematical formalism can be applied to find the minimum cut surfaces of other crystallographic materials such as metal–organic frameworks, covalent-organic frameworks, zeolitic-imidazolate frameworks, metal oxides, etc. PMID:29532024
Clinical phenotype-based gene prioritization: an initial study using semantic similarity and the human phenotype ontology.

PubMed

Masino, Aaron J; Dechene, Elizabeth T; Dulik, Matthew C; Wilkens, Alisha; Spinner, Nancy B; Krantz, Ian D; Pennington, Jeffrey W; Robinson, Peter N; White, Peter S

2014-07-21

Exome sequencing is a promising method for diagnosing patients with a complex phenotype. However, variant interpretation relative to patient phenotype can be challenging in some scenarios, particularly clinical assessment of rare complex phenotypes. Each patient's sequence reveals many possibly damaging variants that must be individually assessed to establish clear association with patient phenotype. To assist interpretation, we implemented an algorithm that ranks a given set of genes relative to patient phenotype. The algorithm orders genes by the semantic similarity computed between phenotypic descriptors associated with each gene and those describing the patient. Phenotypic descriptor terms are taken from the Human Phenotype Ontology (HPO) and semantic similarity is derived from each term's information content. Model validation was performed via simulation and with clinical data. We simulated 33 Mendelian diseases with 100 patients per disease. We modeled clinical conditions by adding noise and imprecision, i.e. phenotypic terms unrelated to the disease and terms less specific than the actual disease terms. We ranked the causative gene against all 2488 HPO annotated genes. The median causative gene rank was 1 for the optimal and noise cases, 12 for the imprecision case, and 60 for the imprecision with noise case. Additionally, we examined a clinical cohort of subjects with hearing impairment. The disease gene median rank was 22. However, when also considering the patient's exome data and filtering non-exomic and common variants, the median rank improved to 3. Semantic similarity can rank a causative gene highly within a gene list relative to patient phenotype characteristics, provided that imprecision is mitigated. The clinical case results suggest that phenotype rank combined with variant analysis provides significant improvement over the individual approaches. We expect that this combined prioritization approach may increase accuracy and decrease effort for clinical genetic diagnosis.
Cutting Materials in Half: A Graph Theory Approach for Generating Crystal Surfaces and Its Prediction of 2D Zeolites

DOE PAGES

Witman, Matthew; Ling, Sanliang; Boyd, Peter; ...

2018-02-06

Scientific interest in two-dimensional (2D) materials, ranging from graphene and other single layer materials to atomically thin crystals, is quickly increasing for a large variety of technological applications. While in silico design approaches have made a large impact in the study of 3D crystals, algorithms designed to discover atomically thin 2D materials from their parent 3D materials are by comparison more sparse. Here, we hypothesize that determining how to cut a 3D material in half (i.e., which Miller surface is formed) by severing a minimal number of bonds or a minimal amount of total bond energy per unit area canmore » yield insight into preferred crystal faces. We answer this question by implementing a graph theory technique to mathematically formalize the enumeration of minimum cut surfaces of crystals. While the algorithm is generally applicable to different classes of materials, we focus on zeolitic materials due to their diverse structural topology and because 2D zeolites have promising catalytic and separation performance compared to their 3D counterparts. We report here a simple descriptor based only on structural information that predicts whether a zeolite is likely to be synthesizable in the 2D form and correctly identifies the expressed surface in known layered 2D zeolites. The discovery of this descriptor allows us to highlight other zeolites that may also be synthesized in the 2D form that have not been experimentally realized yet. Finally, our method is general since the mathematical formalism can be applied to find the minimum cut surfaces of other crystallographic materials such as metal-organic frameworks, covalent-organic frameworks, zeolitic-imidazolate frameworks, metal oxides, etc.« less
Multi-view video segmentation and tracking for video surveillance

NASA Astrophysics Data System (ADS)

Mohammadi, Gelareh; Dufaux, Frederic; Minh, Thien Ha; Ebrahimi, Touradj

2009-05-01

Tracking moving objects is a critical step for smart video surveillance systems. Despite the complexity increase, multiple camera systems exhibit the undoubted advantages of covering wide areas and handling the occurrence of occlusions by exploiting the different viewpoints. The technical problems in multiple camera systems are several: installation, calibration, objects matching, switching, data fusion, and occlusion handling. In this paper, we address the issue of tracking moving objects in an environment covered by multiple un-calibrated cameras with overlapping fields of view, typical of most surveillance setups. Our main objective is to create a framework that can be used to integrate objecttracking information from multiple video sources. Basically, the proposed technique consists of the following steps. We first perform a single-view tracking algorithm on each camera view, and then apply a consistent object labeling algorithm on all views. In the next step, we verify objects in each view separately for inconsistencies. Correspondent objects are extracted through a Homography transform from one view to the other and vice versa. Having found the correspondent objects of different views, we partition each object into homogeneous regions. In the last step, we apply the Homography transform to find the region map of first view in the second view and vice versa. For each region (in the main frame and mapped frame) a set of descriptors are extracted to find the best match between two views based on region descriptors similarity. This method is able to deal with multiple objects. Track management issues such as occlusion, appearance and disappearance of objects are resolved using information from all views. This method is capable of tracking rigid and deformable objects and this versatility lets it to be suitable for different application scenarios.
Data-Driven Neural Network Model for Robust Reconstruction of Automobile Casting

NASA Astrophysics Data System (ADS)

Lin, Jinhua; Wang, Yanjie; Li, Xin; Wang, Lu

2017-09-01

In computer vision system, it is a challenging task to robustly reconstruct complex 3D geometries of automobile castings. However, 3D scanning data is usually interfered by noises, the scanning resolution is low, these effects normally lead to incomplete matching and drift phenomenon. In order to solve these problems, a data-driven local geometric learning model is proposed to achieve robust reconstruction of automobile casting. In order to relieve the interference of sensor noise and to be compatible with incomplete scanning data, a 3D convolution neural network is established to match the local geometric features of automobile casting. The proposed neural network combines the geometric feature representation with the correlation metric function to robustly match the local correspondence. We use the truncated distance field(TDF) around the key point to represent the 3D surface of casting geometry, so that the model can be directly embedded into the 3D space to learn the geometric feature representation; Finally, the training labels is automatically generated for depth learning based on the existing RGB-D reconstruction algorithm, which accesses to the same global key matching descriptor. The experimental results show that the matching accuracy of our network is 92.2% for automobile castings, the closed loop rate is about 74.0% when the matching tolerance threshold τ is 0.2. The matching descriptors performed well and retained 81.6% matching accuracy at 95% closed loop. For the sparse geometric castings with initial matching failure, the 3D matching object can be reconstructed robustly by training the key descriptors. Our method performs 3D reconstruction robustly for complex automobile castings.

Rotation invariant fast features for large-scale recognition

NASA Astrophysics Data System (ADS)

Takacs, Gabriel; Chandrasekhar, Vijay; Tsai, Sam; Chen, David; Grzeszczuk, Radek; Girod, Bernd

2012-10-01

We present an end-to-end feature description pipeline which uses a novel interest point detector and Rotation- Invariant Fast Feature (RIFF) descriptors. The proposed RIFF algorithm is 15× faster than SURF1 while producing large-scale retrieval results that are comparable to SIFT.2 Such high-speed features benefit a range of applications from Mobile Augmented Reality (MAR) to web-scale image retrieval and analysis.
Sparse and Large-Scale Learning Models and Algorithms for Mining Heterogeneous Big Data

ERIC Educational Resources Information Center

Cai, Xiao

2013-01-01

With the development of PC, internet as well as mobile devices, we are facing a data exploding era. On one hand, more and more features can be collected to describe the data, making the size of the data descriptor larger and larger. On the other hand, the number of data itself explodes and can be collected from multiple resources. When the data…
Landmark matching based retinal image alignment by enforcing sparsity in correspondence matrix.

PubMed

Zheng, Yuanjie; Daniel, Ebenezer; Hunter, Allan A; Xiao, Rui; Gao, Jianbin; Li, Hongsheng; Maguire, Maureen G; Brainard, David H; Gee, James C

2014-08-01

Retinal image alignment is fundamental to many applications in diagnosis of eye diseases. In this paper, we address the problem of landmark matching based retinal image alignment. We propose a novel landmark matching formulation by enforcing sparsity in the correspondence matrix and offer its solutions based on linear programming. The proposed formulation not only enables a joint estimation of the landmark correspondences and a predefined transformation model but also combines the benefits of the softassign strategy (Chui and Rangarajan, 2003) and the combinatorial optimization of linear programming. We also introduced a set of reinforced self-similarities descriptors which can better characterize local photometric and geometric properties of the retinal image. Theoretical analysis and experimental results with both fundus color images and angiogram images show the superior performances of our algorithms to several state-of-the-art techniques. Copyright © 2013 Elsevier B.V. All rights reserved.
Texture Descriptors Ensembles Enable Image-Based Classification of Maturation of Human Stem Cell-Derived Retinal Pigmented Epithelium

PubMed Central

Caetano dos Santos, Florentino Luciano; Skottman, Heli; Juuti-Uusitalo, Kati; Hyttinen, Jari

2016-01-01

Aims A fast, non-invasive and observer-independent method to analyze the homogeneity and maturity of human pluripotent stem cell (hPSC) derived retinal pigment epithelial (RPE) cells is warranted to assess the suitability of hPSC-RPE cells for implantation or in vitro use. The aim of this work was to develop and validate methods to create ensembles of state-of-the-art texture descriptors and to provide a robust classification tool to separate three different maturation stages of RPE cells by using phase contrast microscopy images. The same methods were also validated on a wide variety of biological image classification problems, such as histological or virus image classification. Methods For image classification we used different texture descriptors, descriptor ensembles and preprocessing techniques. Also, three new methods were tested. The first approach was an ensemble of preprocessing methods, to create an additional set of images. The second was the region-based approach, where saliency detection and wavelet decomposition divide each image in two different regions, from which features were extracted through different descriptors. The third method was an ensemble of Binarized Statistical Image Features, based on different sizes and thresholds. A Support Vector Machine (SVM) was trained for each descriptor histogram and the set of SVMs combined by sum rule. The accuracy of the computer vision tool was verified in classifying the hPSC-RPE cell maturation level. Dataset and Results The RPE dataset contains 1862 subwindows from 195 phase contrast images. The final descriptor ensemble outperformed the most recent stand-alone texture descriptors, obtaining, for the RPE dataset, an area under ROC curve (AUC) of 86.49% with the 10-fold cross validation and 91.98% with the leave-one-image-out protocol. The generality of the three proposed approaches was ascertained with 10 more biological image datasets, obtaining an average AUC greater than 97%. Conclusions Here we showed that the developed ensembles of texture descriptors are able to classify the RPE cell maturation stage. Moreover, we proved that preprocessing and region-based decomposition improves many descriptors’ accuracy in biological dataset classification. Finally, we built the first public dataset of stem cell-derived RPE cells, which is publicly available to the scientific community for classification studies. The proposed tool is available at https://www.dei.unipd.it/node/2357 and the RPE dataset at http://www.biomeditech.fi/data/RPE_dataset/. Both are available at https://figshare.com/s/d6fb591f1beb4f8efa6f. PMID:26895509
Cardiac arrhythmia beat classification using DOST and PSO tuned SVM.

PubMed

Raj, Sandeep; Ray, Kailash Chandra; Shankar, Om

2016-11-01

The increase in the number of deaths due to cardiovascular diseases (CVDs) has gained significant attention from the study of electrocardiogram (ECG) signals. These ECG signals are studied by the experienced cardiologist for accurate and proper diagnosis, but it becomes difficult and time-consuming for long-term recordings. Various signal processing techniques are studied to analyze the ECG signal, but they bear limitations due to the non-stationary behavior of ECG signals. Hence, this study aims to improve the classification accuracy rate and provide an automated diagnostic solution for the detection of cardiac arrhythmias. The proposed methodology consists of four stages, i.e. filtering, R-peak detection, feature extraction and classification stages. In this study, Wavelet based approach is used to filter the raw ECG signal, whereas Pan-Tompkins algorithm is used for detecting the R-peak inside the ECG signal. In the feature extraction stage, discrete orthogonal Stockwell transform (DOST) approach is presented for an efficient time-frequency representation (i.e. morphological descriptors) of a time domain signal and retains the absolute phase information to distinguish the various non-stationary behavior ECG signals. Moreover, these morphological descriptors are further reduced in lower dimensional space by using principal component analysis and combined with the dynamic features (i.e based on RR-interval of the ECG signals) of the input signal. This combination of two different kinds of descriptors represents each feature set of an input signal that is utilized for classification into subsequent categories by employing PSO tuned support vector machines (SVM). The proposed methodology is validated on the baseline MIT-BIH arrhythmia database and evaluated under two assessment schemes, yielding an improved overall accuracy of 99.18% for sixteen classes in the category-based and 89.10% for five classes (mapped according to AAMI standard) in the patient-based assessment scheme respectively to the state-of-art diagnosis. The results reported are further compared to the existing methodologies in literature. The proposed feature representation of cardiac signals based on symmetrical features along with PSO based optimization technique for the SVM classifier reported an improved classification accuracy in both the assessment schemes evaluated on the benchmark MIT-BIH arrhythmia database and hence can be utilized for automated computer-aided diagnosis of cardiac arrhythmia beats. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
A comparative study for chest radiograph image retrieval using binary texture and deep learning classification.

PubMed

Anavi, Yaron; Kogan, Ilya; Gelbart, Elad; Geva, Ofer; Greenspan, Hayit

2015-08-01

In this work various approaches are investigated for X-ray image retrieval and specifically chest pathology retrieval. Given a query image taken from a data set of 443 images, the objective is to rank images according to similarity. Different features, including binary features, texture features, and deep learning (CNN) features are examined. In addition, two approaches are investigated for the retrieval task. One approach is based on the distance of image descriptors using the above features (hereon termed the "descriptor"-based approach); the second approach ("classification"-based approach) is based on a probability descriptor, generated by a pair-wise classification of each two classes (pathologies) and their decision values using an SVM classifier. Best results are achieved using deep learning features in a classification scheme.
Predicting human olfactory perception from chemical features of odor molecules.

PubMed

Keller, Andreas; Gerkin, Richard C; Guan, Yuanfang; Dhurandhar, Amit; Turu, Gabor; Szalai, Bence; Mainland, Joel D; Ihara, Yusuke; Yu, Chung Wen; Wolfinger, Russ; Vens, Celine; Schietgat, Leander; De Grave, Kurt; Norel, Raquel; Stolovitzky, Gustavo; Cecchi, Guillermo A; Vosshall, Leslie B; Meyer, Pablo

2017-02-24

It is still not possible to predict whether a given molecule will have a perceived odor or what olfactory percept it will produce. We therefore organized the crowd-sourced DREAM Olfaction Prediction Challenge. Using a large olfactory psychophysical data set, teams developed machine-learning algorithms to predict sensory attributes of molecules based on their chemoinformatic features. The resulting models accurately predicted odor intensity and pleasantness and also successfully predicted 8 among 19 rated semantic descriptors ("garlic," "fish," "sweet," "fruit," "burnt," "spices," "flower," and "sour"). Regularized linear models performed nearly as well as random forest-based ones, with a predictive accuracy that closely approaches a key theoretical limit. These models help to predict the perceptual qualities of virtually any molecule with high accuracy and also reverse-engineer the smell of a molecule. Copyright © 2017, American Association for the Advancement of Science.
Signature molecular descriptor : advanced applications.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Visco, Donald Patrick, Jr.

In this work we report on the development of the Signature Molecular Descriptor (or Signature) for use in the solution of inverse design problems as well as in highthroughput screening applications. The ultimate goal of using Signature is to identify novel and non-intuitive chemical structures with optimal predicted properties for a given application. We demonstrate this in three studies: green solvent design, glucocorticoid receptor ligand design and the design of inhibitors for Factor XIa. In many areas of engineering, compounds are designed and/or modified in incremental ways which rely upon heuristics or institutional knowledge. Often multiple experiments are performed andmore » the optimal compound is identified in this brute-force fashion. Perhaps a traditional chemical scaffold is identified and movement of a substituent group around a ring constitutes the whole of the design process. Also notably, a chemical being evaluated in one area might demonstrate properties very attractive in another area and serendipity was the mechanism for solution. In contrast to such approaches, computer-aided molecular design (CAMD) looks to encompass both experimental and heuristic-based knowledge into a strategy that will design a molecule on a computer to meet a given target. Depending on the algorithm employed, the molecule which is designed might be quite novel (re: no CAS registration number) and/or non-intuitive relative to what is known about the problem at hand. While CAMD is a fairly recent strategy (dating to the early 1980s), it contains a variety of bottlenecks and limitations which have prevented the technique from garnering more attention in the academic, governmental and industrial institutions. A main reason for this is how the molecules are described in the computer. This step can control how models are developed for the properties of interest on a given problem as well as how to go from an output of the algorithm to an actual chemical structure. This report provides details on a technique to describe molecules on a computer, called Signature, as well as the computer-aided molecule design algorithm built around Signature. Two applications are provided of the CAMD algorithm with Signature. The first describes the design of green solvents based on data in the GlaxoSmithKline (GSK) Solvent Selection Guide. The second provides novel non-steroidal glucocorticoid receptor ligands with some optimally predicted properties. In addition to using the CAMD algorithm with Signature, it is demonstrated how to employ Signature in a high-throughput screening study. Here, after classifying both active and inactive inhibitors for the protein Factor XIa using Signature, the model developed is used to screen a large, publicly-available database called PubChem for the most active compounds.« less
OPERA models for predicting physicochemical properties and environmental fate endpoints.

PubMed

Mansouri, Kamel; Grulke, Chris M; Judson, Richard S; Williams, Antony J

2018-03-08

The collection of chemical structure information and associated experimental data for quantitative structure-activity/property relationship (QSAR/QSPR) modeling is facilitated by an increasing number of public databases containing large amounts of useful data. However, the performance of QSAR models highly depends on the quality of the data and modeling methodology used. This study aims to develop robust QSAR/QSPR models for chemical properties of environmental interest that can be used for regulatory purposes. This study primarily uses data from the publicly available PHYSPROP database consisting of a set of 13 common physicochemical and environmental fate properties. These datasets have undergone extensive curation using an automated workflow to select only high-quality data, and the chemical structures were standardized prior to calculation of the molecular descriptors. The modeling procedure was developed based on the five Organization for Economic Cooperation and Development (OECD) principles for QSAR models. A weighted k-nearest neighbor approach was adopted using a minimum number of required descriptors calculated using PaDEL, an open-source software. The genetic algorithms selected only the most pertinent and mechanistically interpretable descriptors (2-15, with an average of 11 descriptors). The sizes of the modeled datasets varied from 150 chemicals for biodegradability half-life to 14,050 chemicals for logP, with an average of 3222 chemicals across all endpoints. The optimal models were built on randomly selected training sets (75%) and validated using fivefold cross-validation (CV) and test sets (25%). The CV Q 2 of the models varied from 0.72 to 0.95, with an average of 0.86 and an R 2 test value from 0.71 to 0.96, with an average of 0.82. Modeling and performance details are described in QSAR model reporting format and were validated by the European Commission's Joint Research Center to be OECD compliant. All models are freely available as an open-source, command-line application called OPEn structure-activity/property Relationship App (OPERA). OPERA models were applied to more than 750,000 chemicals to produce freely available predicted data on the U.S. Environmental Protection Agency's CompTox Chemistry Dashboard.
3D molecular descriptors important for clinical success.

PubMed

Kombo, David C; Tallapragada, Kartik; Jain, Rachit; Chewning, Joseph; Mazurov, Anatoly A; Speake, Jason D; Hauser, Terry A; Toler, Steve

2013-02-25

The pharmacokinetic and safety profiles of clinical drug candidates are greatly influenced by their requisite physicochemical properties. In particular, it has been shown that 2D molecular descriptors such as fraction of Sp3 carbon atoms (Fsp3) and number of stereo centers correlate with clinical success. Using the proteomic off-target hit rate of nicotinic ligands, we found that shape-based 3D descriptors such as the radius of gyration and shadow indices discriminate off-target promiscuity better than do Fsp3 and the number of stereo centers. We have deduced the relevant descriptor values required for a ligand to be nonpromiscuous. Investigating the MDL Drug Data Report (MDDR) database as compounds move from the preclinical stage toward the market, we have found that these shape-based 3D descriptors predict clinical success of compounds at preclinical and phase1 stages vs compounds withdrawn from the market better than do Fsp3 and LogD. Further, these computed 3D molecular descriptors correlate well with experimentally observed solubility, which is among well-known physicochemical properties that drive clinical success. We also found that about 84% of launched drugs satisfy either Shadow index or Fsp3 criteria, whereas withdrawn and discontinued compounds fail to meet the same criteria. Our studies suggest that spherical compounds (rather than their elongated counterparts) with a minimal number of aromatic rings may exhibit a high propensity to advance from clinical trials to market.
Influence of Texture and Colour in Breast TMA Classification

PubMed Central

Fernández-Carrobles, M. Milagro; Bueno, Gloria; Déniz, Oscar; Salido, Jesús; García-Rojo, Marcial; González-López, Lucía

2015-01-01

Breast cancer diagnosis is still done by observation of biopsies under the microscope. The development of automated methods for breast TMA classification would reduce diagnostic time. This paper is a step towards the solution for this problem and shows a complete study of breast TMA classification based on colour models and texture descriptors. The TMA images were divided into four classes: i) benign stromal tissue with cellularity, ii) adipose tissue, iii) benign and benign anomalous structures, and iv) ductal and lobular carcinomas. A relevant set of features was obtained on eight different colour models from first and second order Haralick statistical descriptors obtained from the intensity image, Fourier, Wavelets, Multiresolution Gabor, M-LBP and textons descriptors. Furthermore, four types of classification experiments were performed using six different classifiers: (1) classification per colour model individually, (2) classification by combination of colour models, (3) classification by combination of colour models and descriptors, and (4) classification by combination of colour models and descriptors with a previous feature set reduction. The best result shows an average of 99.05% accuracy and 98.34% positive predictive value. These results have been obtained by means of a bagging tree classifier with combination of six colour models and the use of 1719 non-correlated (correlation threshold of 97%) textural features based on Statistical, M-LBP, Gabor and Spatial textons descriptors. PMID:26513238
Learning Rotation-Invariant Local Binary Descriptor.

PubMed

Duan, Yueqi; Lu, Jiwen; Feng, Jianjiang; Zhou, Jie

2017-08-01

In this paper, we propose a rotation-invariant local binary descriptor (RI-LBD) learning method for visual recognition. Compared with hand-crafted local binary descriptors, such as local binary pattern and its variants, which require strong prior knowledge, local binary feature learning methods are more efficient and data-adaptive. Unlike existing learning-based local binary descriptors, such as compact binary face descriptor and simultaneous local binary feature learning and encoding, which are susceptible to rotations, our RI-LBD first categorizes each local patch into a rotational binary pattern (RBP), and then jointly learns the orientation for each pattern and the projection matrix to obtain RI-LBDs. As all the rotation variants of a patch belong to the same RBP, they are rotated into the same orientation and projected into the same binary descriptor. Then, we construct a codebook by a clustering method on the learned binary codes, and obtain a histogram feature for each image as the final representation. In order to exploit higher order statistical information, we extend our RI-LBD to the triple rotation-invariant co-occurrence local binary descriptor (TRICo-LBD) learning method, which learns a triple co-occurrence binary code for each local patch. Extensive experimental results on four different visual recognition tasks, including image patch matching, texture classification, face recognition, and scene classification, show that our RI-LBD and TRICo-LBD outperform most existing local descriptors.
Analysis of breast thermograms for ROI extraction and description using mathematical morphology

NASA Astrophysics Data System (ADS)

Zermeño-Loreto, O. A.; Toxqui-Quitl, C.; Orozco Guillén, E. E.; Padilla-Vivanco, A.

2017-09-01

The detection of a temperature increase or hot spots in breast thermograms can be related with high metabolic activity of disease cells. Image processing algorithms to seek mainly temperature increases above 3°C which have a high probability of being a malignancy are proposed. Also a derivative operator is used to highlights breast regions of interest (ROI). In order to determinate a medical alert, a feature descriptor of the ROI is constructed using its maximum temperature, maximum increase of temperature, sector/quadrant position in the breast, and area. The proposed algorithms are tested in a home database and a public database for mastology research.
Graph Theoretical Representation of Atomic Asymmetry and Molecular Chirality of Benzenoids in Two-Dimensional Space

PubMed Central

Zhao, Tanfeng; Zhang, Qingyou; Long, Hailin; Xu, Lu

2014-01-01

In order to explore atomic asymmetry and molecular chirality in 2D space, benzenoids composed of 3 to 11 hexagons in 2D space were enumerated in our laboratory. These benzenoids are regarded as planar connected polyhexes and have no internal holes; that is, their internal regions are filled with hexagons. The produced dataset was composed of 357,968 benzenoids, including more than 14 million atoms. Rather than simply labeling the huge number of atoms as being either symmetric or asymmetric, this investigation aims at exploring a quantitative graph theoretical descriptor of atomic asymmetry. Based on the particular characteristics in the 2D plane, we suggested the weighted atomic sum as the descriptor of atomic asymmetry. This descriptor is measured by circulating around the molecule going in opposite directions. The investigation demonstrates that the weighted atomic sums are superior to the previously reported quantitative descriptor, atomic sums. The investigation of quantitative descriptors also reveals that the most asymmetric atom is in a structure with a spiral ring with the convex shape going in clockwise direction and concave shape going in anticlockwise direction from the atom. Based on weighted atomic sums, a weighted F index is introduced to quantitatively represent molecular chirality in the plane, rather than merely regarding benzenoids as being either chiral or achiral. By validating with enumerated benzenoids, the results indicate that the weighted F indexes were in accordance with their chiral classification (achiral or chiral) over the whole benzenoids dataset. Furthermore, weighted F indexes were superior to previously available descriptors. Benzenoids possess a variety of shapes and can be extended to practically represent any shape in 2D space—our proposed descriptor has thus the potential to be a general method to represent 2D molecular chirality based on the difference between clockwise and anticlockwise sums around a molecule. PMID:25032832
Object Tracking Using Adaptive Covariance Descriptor and Clustering-Based Model Updating for Visual Surveillance

PubMed Central

Qin, Lei; Snoussi, Hichem; Abdallah, Fahed

2014-01-01

We propose a novel approach for tracking an arbitrary object in video sequences for visual surveillance. The first contribution of this work is an automatic feature extraction method that is able to extract compact discriminative features from a feature pool before computing the region covariance descriptor. As the feature extraction method is adaptive to a specific object of interest, we refer to the region covariance descriptor computed using the extracted features as the adaptive covariance descriptor. The second contribution is to propose a weakly supervised method for updating the object appearance model during tracking. The method performs a mean-shift clustering procedure among the tracking result samples accumulated during a period of time and selects a group of reliable samples for updating the object appearance model. As such, the object appearance model is kept up-to-date and is prevented from contamination even in case of tracking mistakes. We conducted comparing experiments on real-world video sequences, which confirmed the effectiveness of the proposed approaches. The tracking system that integrates the adaptive covariance descriptor and the clustering-based model updating method accomplished stable object tracking on challenging video sequences. PMID:24865883
Automatic summarization of soccer highlights using audio-visual descriptors.

PubMed

Raventós, A; Quijada, R; Torres, Luis; Tarrés, Francesc

2015-01-01

Automatic summarization generation of sports video content has been object of great interest for many years. Although semantic descriptions techniques have been proposed, many of the approaches still rely on low-level video descriptors that render quite limited results due to the complexity of the problem and to the low capability of the descriptors to represent semantic content. In this paper, a new approach for automatic highlights summarization generation of soccer videos using audio-visual descriptors is presented. The approach is based on the segmentation of the video sequence into shots that will be further analyzed to determine its relevance and interest. Of special interest in the approach is the use of the audio information that provides additional robustness to the overall performance of the summarization system. For every video shot a set of low and mid level audio-visual descriptors are computed and lately adequately combined in order to obtain different relevance measures based on empirical knowledge rules. The final summary is generated by selecting those shots with highest interest according to the specifications of the user and the results of relevance measures. A variety of results are presented with real soccer video sequences that prove the validity of the approach.
Plant Identification Based on Leaf Midrib Cross-Section Images Using Fractal Descriptors.

PubMed

da Silva, Núbia Rosa; Florindo, João Batista; Gómez, María Cecilia; Rossatto, Davi Rodrigo; Kolb, Rosana Marta; Bruno, Odemir Martinez

2015-01-01

The correct identification of plants is a common necessity not only to researchers but also to the lay public. Recently, computational methods have been employed to facilitate this task, however, there are few studies front of the wide diversity of plants occurring in the world. This study proposes to analyse images obtained from cross-sections of leaf midrib using fractal descriptors. These descriptors are obtained from the fractal dimension of the object computed at a range of scales. In this way, they provide rich information regarding the spatial distribution of the analysed structure and, as a consequence, they measure the multiscale morphology of the object of interest. In Biology, such morphology is of great importance because it is related to evolutionary aspects and is successfully employed to characterize and discriminate among different biological structures. Here, the fractal descriptors are used to identify the species of plants based on the image of their leaves. A large number of samples are examined, being 606 leaf samples of 50 species from Brazilian flora. The results are compared to other imaging methods in the literature and demonstrate that fractal descriptors are precise and reliable in the taxonomic process of plant species identification.
Accurate characterisation of hole size and location by projected fringe profilometry

NASA Astrophysics Data System (ADS)

Wu, Yuxiang; Dantanarayana, Harshana G.; Yue, Huimin; Huntley, Jonathan M.

2018-06-01

The ability to accurately estimate the location and geometry of holes is often required in the field of quality control and automated assembly. Projected fringe profilometry is a potentially attractive technique on account of being non-contacting, of lower cost, and orders of magnitude faster than the traditional coordinate measuring machine. However, we demonstrate in this paper that fringe projection is susceptible to significant (hundreds of µm) measurement artefacts in the neighbourhood of hole edges, which give rise to errors of a similar magnitude in the estimated hole geometry. A mechanism for the phenomenon is identified based on the finite size of the imaging system’s point spread function and the resulting bias produced near to sample discontinuities in geometry and reflectivity. A mathematical model is proposed, from which a post-processing compensation algorithm is developed to suppress such errors around the holes. The algorithm includes a robust and accurate sub-pixel edge detection method based on a Fourier descriptor of the hole contour. The proposed algorithm was found to reduce significantly the measurement artefacts near the hole edges. As a result, the errors in estimated hole radius were reduced by up to one order of magnitude, to a few tens of µm for hole radii in the range 2–15 mm, compared to those from the uncompensated measurements.
Texture classification using non-Euclidean Minkowski dilation

NASA Astrophysics Data System (ADS)

Florindo, Joao B.; Bruno, Odemir M.

2018-03-01

This study presents a new method to extract meaningful descriptors of gray-scale texture images using Minkowski morphological dilation based on the Lp metric. The proposed approach is motivated by the success previously achieved by Bouligand-Minkowski fractal descriptors on texture classification. In essence, such descriptors are directly derived from the morphological dilation of a three-dimensional representation of the gray-level pixels using the classical Euclidean metric. In this way, we generalize the dilation for different values of p in the Lp metric (Euclidean is a particular case when p = 2) and obtain the descriptors from the cumulated distribution of the distance transform computed over the texture image. The proposed method is compared to other state-of-the-art approaches (such as local binary patterns and textons for example) in the classification of two benchmark data sets (UIUC and Outex). The proposed descriptors outperformed all the other approaches in terms of rate of images correctly classified. The interesting results suggest the potential of these descriptors in this type of task, with a wide range of possible applications to real-world problems.
An Effective 3D Shape Descriptor for Object Recognition with RGB-D Sensors

PubMed Central

Liu, Zhong; Zhao, Changchen; Wu, Xingming; Chen, Weihai

2017-01-01

RGB-D sensors have been widely used in various areas of computer vision and graphics. A good descriptor will effectively improve the performance of operation. This article further analyzes the recognition performance of shape features extracted from multi-modality source data using RGB-D sensors. A hybrid shape descriptor is proposed as a representation of objects for recognition. We first extracted five 2D shape features from contour-based images and five 3D shape features over point cloud data to capture the global and local shape characteristics of an object. The recognition performance was tested for category recognition and instance recognition. Experimental results show that the proposed shape descriptor outperforms several common global-to-global shape descriptors and is comparable to some partial-to-global shape descriptors that achieved the best accuracies in category and instance recognition. Contribution of partial features and computational complexity were also analyzed. The results indicate that the proposed shape features are strong cues for object recognition and can be combined with other features to boost accuracy. PMID:28245553

Bayesian screening for active compounds in high-dimensional chemical spaces combining property descriptors and molecular fingerprints.

PubMed

Vogt, Martin; Bajorath, Jürgen

2008-01-01

Bayesian classifiers are increasingly being used to distinguish active from inactive compounds and search large databases for novel active molecules. We introduce an approach to directly combine the contributions of property descriptors and molecular fingerprints in the search for active compounds that is based on a Bayesian framework. Conventionally, property descriptors and fingerprints are used as alternative features for virtual screening methods. Following the approach introduced here, probability distributions of descriptor values and fingerprint bit settings are calculated for active and database molecules and the divergence between the resulting combined distributions is determined as a measure of biological activity. In test calculations on a large number of compound activity classes, this methodology was found to consistently perform better than similarity searching using fingerprints and multiple reference compounds or Bayesian screening calculations using probability distributions calculated only from property descriptors. These findings demonstrate that there is considerable synergy between different types of property descriptors and fingerprints in recognizing diverse structure-activity relationships, at least in the context of Bayesian modeling.
A Fast Robot Identification and Mapping Algorithm Based on Kinect Sensor.

PubMed

Zhang, Liang; Shen, Peiyi; Zhu, Guangming; Wei, Wei; Song, Houbing

2015-08-14

Internet of Things (IoT) is driving innovation in an ever-growing set of application domains such as intelligent processing for autonomous robots. For an autonomous robot, one grand challenge is how to sense its surrounding environment effectively. The Simultaneous Localization and Mapping with RGB-D Kinect camera sensor on robot, called RGB-D SLAM, has been developed for this purpose but some technical challenges must be addressed. Firstly, the efficiency of the algorithm cannot satisfy real-time requirements; secondly, the accuracy of the algorithm is unacceptable. In order to address these challenges, this paper proposes a set of novel improvement methods as follows. Firstly, the ORiented Brief (ORB) method is used in feature detection and descriptor extraction. Secondly, a bidirectional Fast Library for Approximate Nearest Neighbors (FLANN) k-Nearest Neighbor (KNN) algorithm is applied to feature match. Then, the improved RANdom SAmple Consensus (RANSAC) estimation method is adopted in the motion transformation. In the meantime, high precision General Iterative Closest Points (GICP) is utilized to register a point cloud in the motion transformation optimization. To improve the accuracy of SLAM, the reduced dynamic covariance scaling (DCS) algorithm is formulated as a global optimization problem under the G2O framework. The effectiveness of the improved algorithm has been verified by testing on standard data and comparing with the ground truth obtained on Freiburg University's datasets. The Dr Robot X80 equipped with a Kinect camera is also applied in a building corridor to verify the correctness of the improved RGB-D SLAM algorithm. With the above experiments, it can be seen that the proposed algorithm achieves higher processing speed and better accuracy.
A combined Fisher and Laplacian score for feature selection in QSAR based drug design using compounds with known and unknown activities.

PubMed

Valizade Hasanloei, Mohammad Amin; Sheikhpour, Razieh; Sarram, Mehdi Agha; Sheikhpour, Elnaz; Sharifi, Hamdollah

2018-02-01

Quantitative structure-activity relationship (QSAR) is an effective computational technique for drug design that relates the chemical structures of compounds to their biological activities. Feature selection is an important step in QSAR based drug design to select the most relevant descriptors. One of the most popular feature selection methods for classification problems is Fisher score which aim is to minimize the within-class distance and maximize the between-class distance. In this study, the properties of Fisher criterion were extended for QSAR models to define the new distance metrics based on the continuous activity values of compounds with known activities. Then, a semi-supervised feature selection method was proposed based on the combination of Fisher and Laplacian criteria which exploits both compounds with known and unknown activities to select the relevant descriptors. To demonstrate the efficiency of the proposed semi-supervised feature selection method in selecting the relevant descriptors, we applied the method and other feature selection methods on three QSAR data sets such as serine/threonine-protein kinase PLK3 inhibitors, ROCK inhibitors and phenol compounds. The results demonstrated that the QSAR models built on the selected descriptors by the proposed semi-supervised method have better performance than other models. This indicates the efficiency of the proposed method in selecting the relevant descriptors using the compounds with known and unknown activities. The results of this study showed that the compounds with known and unknown activities can be helpful to improve the performance of the combined Fisher and Laplacian based feature selection methods.
A combined Fisher and Laplacian score for feature selection in QSAR based drug design using compounds with known and unknown activities

NASA Astrophysics Data System (ADS)

Valizade Hasanloei, Mohammad Amin; Sheikhpour, Razieh; Sarram, Mehdi Agha; Sheikhpour, Elnaz; Sharifi, Hamdollah

2018-02-01

Quantitative structure-activity relationship (QSAR) is an effective computational technique for drug design that relates the chemical structures of compounds to their biological activities. Feature selection is an important step in QSAR based drug design to select the most relevant descriptors. One of the most popular feature selection methods for classification problems is Fisher score which aim is to minimize the within-class distance and maximize the between-class distance. In this study, the properties of Fisher criterion were extended for QSAR models to define the new distance metrics based on the continuous activity values of compounds with known activities. Then, a semi-supervised feature selection method was proposed based on the combination of Fisher and Laplacian criteria which exploits both compounds with known and unknown activities to select the relevant descriptors. To demonstrate the efficiency of the proposed semi-supervised feature selection method in selecting the relevant descriptors, we applied the method and other feature selection methods on three QSAR data sets such as serine/threonine-protein kinase PLK3 inhibitors, ROCK inhibitors and phenol compounds. The results demonstrated that the QSAR models built on the selected descriptors by the proposed semi-supervised method have better performance than other models. This indicates the efficiency of the proposed method in selecting the relevant descriptors using the compounds with known and unknown activities. The results of this study showed that the compounds with known and unknown activities can be helpful to improve the performance of the combined Fisher and Laplacian based feature selection methods.
Adaptive gamma correction-based expert system for nonuniform illumination face enhancement

NASA Astrophysics Data System (ADS)

Abdelhamid, Iratni; Mustapha, Aouache; Adel, Oulefki

2018-03-01

The image quality of a face recognition system suffers under severe lighting conditions. Thus, this study aims to develop an approach for nonuniform illumination adjustment based on an adaptive gamma correction (AdaptGC) filter that can solve the aforementioned issue. An approach for adaptive gain factor prediction was developed via neural network model-based cross-validation (NN-CV). To achieve this objective, a gamma correction function and its effects on the face image quality with different gain values were examined first. Second, an orientation histogram (OH) algorithm was assessed as a face's feature descriptor. Subsequently, a density histogram module was developed for face label generation. During the NN-CV construction, the model was assessed to recognize the OH descriptor and predict the face label. The performance of the NN-CV model was evaluated by examining the statistical measures of root mean square error and coefficient of efficiency. Third, to evaluate the AdaptGC enhancement approach, an image quality metric was adopted using enhancement by entropy, contrast per pixel, second-derivative-like measure of enhancement, and sharpness, then supported by visual inspection. The experiment results were examined using five face's databases, namely, extended Yale-B, Carnegie Mellon University-Pose, Illumination, and Expression, Mobio, FERET, and Oulu-CASIA-NIR-VIS. The final results prove that AdaptGC filter implementation compared with state-of-the-art methods is the best choice in terms of contrast and nonuniform illumination adjustment. In summary, the benefits attained prove that AdaptGC is driven by a profitable enhancement rate, which provides satisfying features for high rate face recognition systems.
Classification of drug molecules considering their IC50 values using mixed-integer linear programming based hyper-boxes method.

PubMed

Armutlu, Pelin; Ozdemir, Muhittin E; Uney-Yuksektepe, Fadime; Kavakli, I Halil; Turkay, Metin

2008-10-03

A priori analysis of the activity of drugs on the target protein by computational approaches can be useful in narrowing down drug candidates for further experimental tests. Currently, there are a large number of computational methods that predict the activity of drugs on proteins. In this study, we approach the activity prediction problem as a classification problem and, we aim to improve the classification accuracy by introducing an algorithm that combines partial least squares regression with mixed-integer programming based hyper-boxes classification method, where drug molecules are classified as low active or high active regarding their binding activity (IC50 values) on target proteins. We also aim to determine the most significant molecular descriptors for the drug molecules. We first apply our approach by analyzing the activities of widely known inhibitor datasets including Acetylcholinesterase (ACHE), Benzodiazepine Receptor (BZR), Dihydrofolate Reductase (DHFR), Cyclooxygenase-2 (COX-2) with known IC50 values. The results at this stage proved that our approach consistently gives better classification accuracies compared to 63 other reported classification methods such as SVM, Naïve Bayes, where we were able to predict the experimentally determined IC50 values with a worst case accuracy of 96%. To further test applicability of this approach we first created dataset for Cytochrome P450 C17 inhibitors and then predicted their activities with 100% accuracy. Our results indicate that this approach can be utilized to predict the inhibitory effects of inhibitors based on their molecular descriptors. This approach will not only enhance drug discovery process, but also save time and resources committed.
Improving Zernike moments comparison for optimal similarity and rotation angle retrieval.

PubMed

Revaud, Jérôme; Lavoué, Guillaume; Baskurt, Atilla

2009-04-01

Zernike moments constitute a powerful shape descriptor in terms of robustness and description capability. However the classical way of comparing two Zernike descriptors only takes into account the magnitude of the moments and loses the phase information. The novelty of our approach is to take advantage of the phase information in the comparison process while still preserving the invariance to rotation. This new Zernike comparator provides a more accurate similarity measure together with the optimal rotation angle between the patterns, while keeping the same complexity as the classical approach. This angle information is particularly of interest for many applications, including 3D scene understanding through images. Experiments demonstrate that our comparator outperforms the classical one in terms of similarity measure. In particular the robustness of the retrieval against noise and geometric deformation is greatly improved. Moreover, the rotation angle estimation is also more accurate than state-of-the-art algorithms.
Predicting membrane protein types by the LLDA algorithm.

PubMed

Wang, Tong; Yang, Jie; Shen, Hong-Bin; Chou, Kuo-Chen

2008-01-01

Membrane proteins are generally classified into the following eight types: (1) type I transmembrane, (2) type II, (3) type III, (4) type IV, (5) multipass transmembrane, (6) lipid-chain-anchored membrane, (7) GPI-anchored membrane, and (8) peripheral membrane (K.C. Chou and H.B. Shen: BBRC, 2007, 360: 339-345). Knowing the type of an uncharacterized membrane protein often provides useful clues for finding its biological function and interaction process with other molecules in a biological system. With the explosion of protein sequences generated in the Post-Genomic Age, it is urgent to develop an automated method to deal with such a challenge. Recently, the PsePSSM (Pseudo Position-Specific Score Matrix) descriptor is proposed by Chou and Shen (Biochem. Biophys. Res. Comm. 2007, 360, 339-345) to represent a protein sample. The advantage of the PsePSSM descriptor is that it can combine the evolution information and sequence-correlated information. However, incorporating all these effects into a descriptor may cause the "high dimension disaster". To overcome such a problem, the fusion approach was adopted by Chou and Shen. Here, a completely different approach, the so-called LLDA (Local Linear Discriminant Analysis) is introduced to extract the key features from the high-dimensional PsePSSM space. The dimension-reduced descriptor vector thus obtained is a compact representation of the original high dimensional vector. Our jackknife and independent dataset test results indicate that it is very promising to use the LLDA approach to cope with complicated problems in biological systems, such as predicting the membrane protein type.
Finding Chemical Structures Corresponding to a Set of Coordinates in Chemical Descriptor Space.

PubMed

Miyao, Tomoyuki; Funatsu, Kimito

2017-08-01

When chemical structures are searched based on descriptor values, or descriptors are interpreted based on values, it is important that corresponding chemical structures actually exist. In order to consider the existence of chemical structures located in a specific region in the chemical space, we propose to search them inside training data domains (TDDs), which are dense areas of a training dataset in the chemical space. We investigated TDDs' features using diverse and local datasets, assuming that GDB11 is the chemical universe. These two analyses showed that considering TDDs gives higher chance of finding chemical structures than a random search-based method, and that novel chemical structures actually exist inside TDDs. In addition to those findings, we tested the hypothesis that chemical structures were distributed on the limited areas of chemical space. This hypothesis was confirmed by the fact that distances among chemical structures in several descriptor spaces were much shorter than those among randomly generated coordinates in the training data range. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
3D-QSAR studies of some reversible Acetyl cholinesterase inhibitors based on CoMFA and ligand protein interaction fingerprints using PC-LS-SVM and PLS-LS-SVM.

PubMed

Ghafouri, Hamidreza; Ranjbar, Mohsen; Sakhteman, Amirhossein

2017-08-01

A great challenge in medicinal chemistry is to develop different methods for structural design based on the pattern of the previously synthesized compounds. In this study two different QSAR methods were established and compared for a series of piperidine acetylcholinesterase inhibitors. In one novel approach, PC-LS-SVM and PLS-LS-SVM was used for modeling 3D interaction descriptors, and in the other method the same nonlinear techniques were used to build QSAR equations based on field descriptors. Different validation methods were used to evaluate the models and the results revealed the more applicability and predictive ability of the model generated by field descriptors (Q 2 LOO-CV =1, R 2 ext =0.97). External validation criteria revealed that both methods can be used in generating reasonable QSAR models. It was concluded that due to ability of interaction descriptors in prediction of binding mode, using this approach can be implemented in future 3D-QSAR softwares. Copyright © 2017 Elsevier Ltd. All rights reserved.
Molecule kernels: a descriptor- and alignment-free quantitative structure-activity relationship approach.

PubMed

Mohr, Johannes A; Jain, Brijnesh J; Obermayer, Klaus

2008-09-01

Quantitative structure activity relationship (QSAR) analysis is traditionally based on extracting a set of molecular descriptors and using them to build a predictive model. In this work, we propose a QSAR approach based directly on the similarity between the 3D structures of a set of molecules measured by a so-called molecule kernel, which is independent of the spatial prealignment of the compounds. Predictors can be build using the molecule kernel in conjunction with the potential support vector machine (P-SVM), a recently proposed machine learning method for dyadic data. The resulting models make direct use of the structural similarities between the compounds in the test set and a subset of the training set and do not require an explicit descriptor construction. We evaluated the predictive performance of the proposed method on one classification and four regression QSAR datasets and compared its results to the results reported in the literature for several state-of-the-art descriptor-based and 3D QSAR approaches. In this comparison, the proposed molecule kernel method performed better than the other QSAR methods.
Human action recognition based on kinematic similarity in real time

PubMed Central

Chen, Longting; Luo, Ailing; Zhang, Sicong

2017-01-01

Human action recognition using 3D pose data has gained a growing interest in the field of computer robotic interfaces and pattern recognition since the availability of hardware to capture human pose. In this paper, we propose a fast, simple, and powerful method of human action recognition based on human kinematic similarity. The key to this method is that the action descriptor consists of joints position, angular velocity and angular acceleration, which can meet the different individual sizes and eliminate the complex normalization. The angular parameters of joints within a short sliding time window (approximately 5 frames) around the current frame are used to express each pose frame of human action sequence. Moreover, three modified KNN (k-nearest-neighbors algorithm) classifiers are employed in our method: one for achieving the confidence of every frame in the training step, one for estimating the frame label of each descriptor, and one for classifying actions. Additional estimating of the frame’s time label makes it possible to address single input frames. This approach can be used on difficult, unsegmented sequences. The proposed method is efficient and can be run in real time. The research shows that many public datasets are irregularly segmented, and a simple method is provided to regularize the datasets. The approach is tested on some challenging datasets such as MSR-Action3D, MSRDailyActivity3D, and UTD-MHAD. The results indicate our method achieves a higher accuracy. PMID:29073131
Laban movement analysis to classify emotions from motion

NASA Astrophysics Data System (ADS)

Dewan, Swati; Agarwal, Shubham; Singh, Navjyoti

2018-04-01

In this paper, we present the study of Laban Movement Analysis (LMA) to understand basic human emotions from nonverbal human behaviors. While there are a lot of studies on understanding behavioral patterns based on natural language processing and speech processing applications, understanding emotions or behavior from non-verbal human motion is still a very challenging and unexplored field. LMA provides a rich overview of the scope of movement possibilities. These basic elements can be used for generating movement or for describing movement. They provide an inroad to understanding movement and for developing movement efficiency and expressiveness. Each human being combines these movement factors in his/her own unique way and organizes them to create phrases and relationships which reveal personal, artistic, or cultural style. In this work, we build a motion descriptor based on a deep understanding of Laban theory. The proposed descriptor builds up on previous works and encodes experiential features by using temporal windows. We present a more conceptually elaborate formulation of Laban theory and test it in a relatively new domain of behavioral research with applications in human-machine interaction. The recognition of affective human communication may be used to provide developers with a rich source of information for creating systems that are capable of interacting well with humans. We test our algorithm on UCLIC dataset which consists of body motions of 13 non-professional actors portraying angry, fear, happy and sad emotions. We achieve an accuracy of 87.30% on this dataset.
The Development of Novel Chemical Fragment-Based Descriptors Using Frequent Common Subgraph Mining Approach and Their Application in QSAR Modeling.

PubMed

Khashan, Raed; Zheng, Weifan; Tropsha, Alexander

2014-03-01

We present a novel approach to generating fragment-based molecular descriptors. The molecules are represented by labeled undirected chemical graph. Fast Frequent Subgraph Mining (FFSM) is used to find chemical-fragments (subgraphs) that occur in at least a subset of all molecules in a dataset. The collection of frequent subgraphs (FSG) forms a dataset-specific descriptors whose values for each molecule are defined by the number of times each frequent fragment occurs in this molecule. We have employed the FSG descriptors to develop variable selection k Nearest Neighbor (kNN) QSAR models of several datasets with binary target property including Maximum Recommended Therapeutic Dose (MRTD), Salmonella Mutagenicity (Ames Genotoxicity), and P-Glycoprotein (PGP) data. Each dataset was divided into training, test, and validation sets to establish the statistical figures of merit reflecting the model validated predictive power. The classification accuracies of models for both training and test sets for all datasets exceeded 75 %, and the accuracy for the external validation sets exceeded 72 %. The model accuracies were comparable or better than those reported earlier in the literature for the same datasets. Furthermore, the use of fragment-based descriptors affords mechanistic interpretation of validated QSAR models in terms of essential chemical fragments responsible for the compounds' target property. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Relational Agreement Measures for Similarity Searching of Cheminformatic Data Sets.

PubMed

Rivera-Borroto, Oscar Miguel; García-de la Vega, José Manuel; Marrero-Ponce, Yovani; Grau, Ricardo

2016-01-01

Research on similarity searching of cheminformatic data sets has been focused on similarity measures using fingerprints. However, nominal scales are the least informative of all metric scales, increasing the tied similarity scores, and decreasing the effectivity of the retrieval engines. Tanimoto's coefficient has been claimed to be the most prominent measure for this task. Nevertheless, this field is far from being exhausted since the computer science no free lunch theorem predicts that "no similarity measure has overall superiority over the population of data sets". We introduce 12 relational agreement (RA) coefficients for seven metric scales, which are integrated within a group fusion-based similarity searching algorithm. These similarity measures are compared to a reference panel of 21 proximity quantifiers over 17 benchmark data sets (MUV), by using informative descriptors, a feature selection stage, a suitable performance metric, and powerful comparison tests. In this stage, RA coefficients perform favourably with repect to the state-of-the-art proximity measures. Afterward, the RA-based method outperform another four nearest neighbor searching algorithms over the same data domains. In a third validation stage, RA measures are successfully applied to the virtual screening of the NCI data set. Finally, we discuss a possible molecular interpretation for these similarity variants.
Method of data communications with reduced latency

DOEpatents

Blocksome, Michael A; Parker, Jeffrey J

2013-11-05

Data communications with reduced latency, including: writing, by a producer, a descriptor and message data into at least two descriptor slots of a descriptor buffer, the descriptor buffer comprising allocated computer memory segmented into descriptor slots, each descriptor slot having a fixed size, the descriptor buffer having a header pointer that identifies a next descriptor slot to be processed by a DMA controller, the descriptor buffer having a tail pointer that identifies a descriptor slot for entry of a next descriptor in the descriptor buffer; recording, by the producer, in the descriptor a value signifying that message data has been written into descriptor slots; and setting, by the producer, in dependence upon the recorded value, a tail pointer to point to a next open descriptor slot.
Clinical descriptors for the recognition of central sensitization pain in patients with knee osteoarthritis.

PubMed

Lluch, Enrique; Nijs, Jo; Courtney, Carol A; Rebbeck, Trudy; Wylde, Vikki; Baert, Isabel; Wideman, Timothy H; Howells, Nick; Skou, Søren T

2017-08-02

Despite growing awareness of the contribution of central pain mechanisms to knee osteoarthritis pain in a subgroup of patients, routine evaluation of central sensitization is yet to be incorporated into clinical practice. The objective of this perspective is to design a set of clinical descriptors for the recognition of central sensitization in patients with knee osteoarthritis that can be implemented in clinical practice. A narrative review of original research papers was conducted by nine clinicians and researchers from seven different countries to reach agreement on clinically relevant descriptors. It is proposed that identification of a dominance of central sensitization pain is based on descriptors derived from the subjective assessment and the physical examination. In the former, clinicians are recommended to inquire about intensity and duration of pain and its association with structural joint changes, pain distribution, behavior of knee pain, presence of neuropathic-like or centrally mediated symptoms and responsiveness to previous treatment. The latter includes assessment of response to clinical test, mechanical hyperalgesia and allodynia, thermal hyperalgesia, hypoesthesia and reduced vibration sense. This article describes a set of clinically relevant descriptors that might indicate the presence of central sensitization in patients with knee osteoarthritis in clinical practice. Although based on research data, the descriptors proposed in this review require experimental testing in future studies. Implications for Rehabilitation Laboratory evaluation of central sensitization for people with knee osteoarthritis is yet to be incorporated into clinical practice. A set of clinical indicators for the recognition of central sensitization in patients with knee osteoarthritis is proposed. Although based on research data, the clinical indicators proposed require further experimental testing of psychometric properties.
The effects of variant descriptors on the potential effectiveness of plain packaging.

PubMed

Borland, Ron; Savvas, Steven

2014-01-01

To examine the effects that variant descriptor labels on cigarette packs have on smokers' perceptions of those packs and the cigarettes contained within. As part of two larger web-based studies (each involved 160 young adult ever-smokers 18-29 years old), respondents were shown a computer image of a plain cigarette pack and sets of related variant descriptors. The sets included terms that varied in terms of descriptors of colours as names, flavour strength, degrees of filter venting, filter types, quality, type of cigarette and numbers. For each set, respondents rated the highest and lowest of two or three of the following four characteristics: quality, strongest or weakest in taste, delivers most or least tar/nicotine, and most or least level of harm. There were significant differences on all four ratings. Quality ratings were the least differentiated. Except for colour descriptors, where 'Gold' rated high in quality but medium in other ratings, ratings of quality, harm, strength and delivery were all positively associated when rated on the same descriptors. Descriptor labels on cigarette packs, can affect smokers' perceptions of the characteristics of the cigarettes contained within. Therefore, they are a potential means by which product differentiation can occur. In particular, having variants differing in perceived strength while not differing in deliveries of harmful ingredients is particularly problematic. Any packaging policy should take into account the possibility that variant descriptors can mislead smokers into making inappropriate product attributions.
Automated planning of ablation targets in atrial fibrillation treatment

NASA Astrophysics Data System (ADS)

Keustermans, Johannes; De Buck, Stijn; Heidbüchel, Hein; Suetens, Paul

2011-03-01

Catheter based radio-frequency ablation is used as an invasive treatment of atrial fibrillation. This procedure is often guided by the use of 3D anatomical models obtained from CT, MRI or rotational angiography. During the intervention the operator accurately guides the catheter to prespecified target ablation lines. The planning stage, however, can be time consuming and operator dependent which is suboptimal both from a cost and health perspective. Therefore, we present a novel statistical model-based algorithm for locating ablation targets from 3D rotational angiography images. Based on a training data set of 20 patients, consisting of 3D rotational angiography images with 30 manually indicated ablation points, a statistical local appearance and shape model is built. The local appearance model is based on local image descriptors to capture the intensity patterns around each ablation point. The local shape model is constructed by embedding the ablation points in an undirected graph and imposing that each ablation point only interacts with its neighbors. Identifying the ablation points on a new 3D rotational angiography image is performed by proposing a set of possible candidate locations for each ablation point, as such, converting the problem into a labeling problem. The algorithm is validated using a leave-one-out-approach on the training data set, by computing the distance between the ablation lines obtained by the algorithm and the manually identified ablation points. The distance error is equal to 3.8+/-2.9 mm. As ablation lesion size is around 5-7 mm, automated planning of ablation targets by the presented approach is sufficiently accurate.
Probabilistic Elastic Part Model: A Pose-Invariant Representation for Real-World Face Verification.

PubMed

Li, Haoxiang; Hua, Gang

2018-04-01

Pose variation remains to be a major challenge for real-world face recognition. We approach this problem through a probabilistic elastic part model. We extract local descriptors (e.g., LBP or SIFT) from densely sampled multi-scale image patches. By augmenting each descriptor with its location, a Gaussian mixture model (GMM) is trained to capture the spatial-appearance distribution of the face parts of all face images in the training corpus, namely the probabilistic elastic part (PEP) model. Each mixture component of the GMM is confined to be a spherical Gaussian to balance the influence of the appearance and the location terms, which naturally defines a part. Given one or multiple face images of the same subject, the PEP-model builds its PEP representation by sequentially concatenating descriptors identified by each Gaussian component in a maximum likelihood sense. We further propose a joint Bayesian adaptation algorithm to adapt the universally trained GMM to better model the pose variations between the target pair of faces/face tracks, which consistently improves face verification accuracy. Our experiments show that we achieve state-of-the-art face verification accuracy with the proposed representations on the Labeled Face in the Wild (LFW) dataset, the YouTube video face database, and the CMU MultiPIE dataset.

QSRR modeling for the chromatographic retention behavior of some β-lactam antibiotics using forward and firefly variable selection algorithms coupled with multiple linear regression.

PubMed

Fouad, Marwa A; Tolba, Enas H; El-Shal, Manal A; El Kerdawy, Ahmed M

2018-05-11

The justified continuous emerging of new β-lactam antibiotics provokes the need for developing suitable analytical methods that accelerate and facilitate their analysis. A face central composite experimental design was adopted using different levels of phosphate buffer pH, acetonitrile percentage at zero time and after 15 min in a gradient program to obtain the optimum chromatographic conditions for the elution of 31 β-lactam antibiotics. Retention factors were used as the target property to build two QSRR models utilizing the conventional forward selection and the advanced nature-inspired firefly algorithm for descriptor selection, coupled with multiple linear regression. The obtained models showed high performance in both internal and external validation indicating their robustness and predictive ability. Williams-Hotelling test and student's t-test showed that there is no statistical significant difference between the models' results. Y-randomization validation showed that the obtained models are due to significant correlation between the selected molecular descriptors and the analytes' chromatographic retention. These results indicate that the generated FS-MLR and FFA-MLR models are showing comparable quality on both the training and validation levels. They also gave comparable information about the molecular features that influence the retention behavior of β-lactams under the current chromatographic conditions. We can conclude that in some cases simple conventional feature selection algorithm can be used to generate robust and predictive models comparable to that are generated using advanced ones. Copyright © 2018 Elsevier B.V. All rights reserved.
A robustness test of the braided device foreshortening algorithm

NASA Astrophysics Data System (ADS)

Moyano, Raquel Kale; Fernandez, Hector; Macho, Juan M.; Blasco, Jordi; San Roman, Luis; Narata, Ana Paula; Larrabide, Ignacio

2017-11-01

Different computational methods have been recently proposed to simulate the virtual deployment of a braided stent inside a patient vasculature. Those methods are primarily based on the segmentation of the region of interest to obtain the local vessel morphology descriptors. The goal of this work is to evaluate the influence of the segmentation quality on the method named "Braided Device Foreshortening" (BDF). METHODS: We used the 3DRA images of 10 aneurysmatic patients (cases). The cases were segmented by applying a marching cubes algorithm with a broad range of thresholds in order to generate 10 surface models each. We selected a braided device to apply the BDF algorithm to each surface model. The range of the computed flow diverter lengths for each case was obtained to calculate the variability of the method against the threshold segmentation values. RESULTS: An evaluation study over 10 clinical cases indicates that the final length of the deployed flow diverter in each vessel model is stable, shielding maximum difference of 11.19% in vessel diameter and maximum of 9.14% in the simulated stent length for the threshold values. The average coefficient of variation was found to be 4.08 %. CONCLUSION: A study evaluating how the threshold segmentation affects the simulated length of the deployed FD, was presented. The segmentation algorithm used to segment intracranial aneurysm 3D angiography images presents small variation in the resulting stent simulation.
AMERICAN ASSOCIATION OF CLINICAL ENDOCRINOLOGISTS AND AMERICAN COLLEGE OF ENDOCRINOLOGY PROTOCOL FOR STANDARDIZED PRODUCTION OF CLINICAL PRACTICE GUIDELINES, ALGORITHMS, AND CHECKLISTS - 2017 UPDATE.

PubMed

Mechanick, Jeffrey I; Pessah-Pollack, Rachel; Camacho, Pauline; Correa, Ricardo; Figaro, M Kathleen; Garber, Jeffrey R; Jasim, Sina; Pantalone, Kevin M; Trence, Dace; Upala, Sikarin

2017-08-01

Clinical practice guideline (CPG), clinical practice algorithm (CPA), and clinical checklist (CC, collectively CPGAC) development is a high priority of the American Association of Clinical Endocrinologists (AACE) and American College of Endocrinology (ACE). This 2017 update in CPG development consists of (1) a paradigm change wherein first, environmental scans identify important clinical issues and needs, second, CPA construction focuses on these clinical issues and needs, and third, CPG provide CPA node/edge-specific scientific substantiation and appended CC; (2) inclusion of new technical semantic and numerical descriptors for evidence types, subjective factors, and qualifiers; and (3) incorporation of patient-centered care components such as economics and transcultural adaptations, as well as implementation, validation, and evaluation strategies. This third point highlights the dominating factors of personal finances, governmental influences, and third-party payer dictates on CPGAC implementation, which ultimately impact CPGAC development. The AACE/ACE guidelines for the CPGAC program is a successful and ongoing iterative exercise to optimize endocrine care in a changing and challenging healthcare environment. AACE = American Association of Clinical Endocrinologists ACC = American College of Cardiology ACE = American College of Endocrinology ASeRT = ACE Scientific Referencing Team BEL = best evidence level CC = clinical checklist CPA = clinical practice algorithm CPG = clinical practice guideline CPGAC = clinical practice guideline, algorithm, and checklist EBM = evidence-based medicine EHR = electronic health record EL = evidence level G4GAC = Guidelines for Guidelines, Algorithms, and Checklists GAC = guidelines, algorithms, and checklists HCP = healthcare professional(s) POEMS = patient-oriented evidence that matters PRCT = prospective randomized controlled trial.
Efficient content-based low-altitude images correlated network and strips reconstruction

NASA Astrophysics Data System (ADS)

He, Haiqing; You, Qi; Chen, Xiaoyong

2017-01-01

The manual intervention method is widely used to reconstruct strips for further aerial triangulation in low-altitude photogrammetry. Clearly the method for fully automatic photogrammetric data processing is not an expected way. In this paper, we explore a content-based approach without manual intervention or external information for strips reconstruction. Feature descriptors in the local spatial patterns are extracted by SIFT to construct vocabulary tree, in which these features are encoded in terms of TF-IDF numerical statistical algorithm to generate new representation for each low-altitude image. Then images correlated network is reconstructed by similarity measure, image matching and geometric graph theory. Finally, strips are reconstructed automatically by tracing straight lines and growing adjacent images gradually. Experimental results show that the proposed approach is highly effective in automatically rearranging strips of lowaltitude images and can provide rough relative orientation for further aerial triangulation.
Distributed Position-Based Consensus of Second-Order Multiagent Systems With Continuous/Intermittent Communication.

PubMed

Song, Qiang; Liu, Fang; Wen, Guanghui; Cao, Jinde; Yang, Xinsong

2017-04-24

This paper considers the position-based consensus in a network of agents with double-integrator dynamics and directed topology. Two types of distributed observer algorithms are proposed to solve the consensus problem by utilizing continuous and intermittent position measurements, respectively, where each observer does not interact with any other observers. For the case of continuous communication between network agents, some convergence conditions are derived for reaching consensus in the network with a single constant delay or multiple time-varying delays on the basis of the eigenvalue analysis and the descriptor method. When the network agents can only obtain intermittent position data from local neighbors at discrete time instants, the consensus in the network without time delay or with nonuniform delays is investigated by using the Wirtinger's inequality and the delayed-input approach. Numerical examples are given to illustrate the theoretical analysis.
Content-aware network storage system supporting metadata retrieval

NASA Astrophysics Data System (ADS)

Liu, Ke; Qin, Leihua; Zhou, Jingli; Nie, Xuejun

2008-12-01

Nowadays, content-based network storage has become the hot research spot of academy and corporation[1]. In order to solve the problem of hit rate decline causing by migration and achieve the content-based query, we exploit a new content-aware storage system which supports metadata retrieval to improve the query performance. Firstly, we extend the SCSI command descriptor block to enable system understand those self-defined query requests. Secondly, the extracted metadata is encoded by extensible markup language to improve the universality. Thirdly, according to the demand of information lifecycle management (ILM), we store those data in different storage level and use corresponding query strategy to retrieval them. Fourthly, as the file content identifier plays an important role in locating data and calculating block correlation, we use it to fetch files and sort query results through friendly user interface. Finally, the experiments indicate that the retrieval strategy and sort algorithm have enhanced the retrieval efficiency and precision.
Segmentation and classification of cell cycle phases in fluorescence imaging.

PubMed

Ersoy, Ilker; Bunyak, Filiz; Chagin, Vadim; Cardoso, M Christina; Palaniappan, Kannappan

2009-01-01

Current chemical biology methods for studying spatiotemporal correlation between biochemical networks and cell cycle phase progression in live-cells typically use fluorescence-based imaging of fusion proteins. Stable cell lines expressing fluorescently tagged protein GFP-PCNA produce rich, dynamically varying sub-cellular foci patterns characterizing the cell cycle phases, including the progress during the S-phase. Variable fluorescence patterns, drastic changes in SNR, shape and position changes and abundance of touching cells require sophisticated algorithms for reliable automatic segmentation and cell cycle classification. We extend the recently proposed graph partitioning active contours (GPAC) for fluorescence-based nucleus segmentation using regional density functions and dramatically improve its efficiency, making it scalable for high content microscopy imaging. We utilize surface shape properties of GFP-PCNA intensity field to obtain descriptors of foci patterns and perform automated cell cycle phase classification, and give quantitative performance by comparing our results to manually labeled data.
Binding ligand prediction for proteins using partial matching of local surface patches.

PubMed

Sael, Lee; Kihara, Daisuke

2010-01-01

Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group.
Binding Ligand Prediction for Proteins Using Partial Matching of Local Surface Patches

PubMed Central

Sael, Lee; Kihara, Daisuke

2010-01-01

Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group. PMID:21614188
Validating Performance Level Descriptors (PLDs) for the AP® Environmental Science Exam

ERIC Educational Resources Information Center

Reshetar, Rosemary; Kaliski, Pamela; Chajewski, Michael; Lionberger, Karen

2012-01-01

This presentation summarizes a pilot study conducted after the May 2011 administration of the AP Environmental Science Exam. The study used analytical methods based on scaled anchoring as input to a Performance Level Descriptor validation process that solicited systematic input from subject matter experts.
Ability of bottle cap color to facilitate accurate glaucoma patient-physician communication regarding medication identity

PubMed Central

Dave, Pujan; Villarreal, Guadalupe; Friedman, David S.; Kahook, Malik Y.; Ramulu, Pradeep Y.

2015-01-01

Objective To determine the accuracy of patient-physician communication regarding topical ophthalmic medication use based on bottle cap color, particularly amongst individuals who may have acquired color vision deficiency from glaucoma. Design Cross-sectional, clinical study. Participants Patients ≥ 18 years old with primary open-angle, primary angle-closure, pseudoexfoliation, or pigment dispersion glaucoma, bilateral visual acuity of 20/400 or better, and no concurrent conditions that may affect color vision. Methods One hundred patients provided color descriptions of 11 distinct medication bottle caps. Patient-produced color descriptors were then presented to three physicians. Each physician matched each color descriptor to the medication they thought the descriptor was describing. Main Outcome Measures Frequency of patient-physician agreement, occurring when all three physicians accurately matched the patient-produced color descriptor to the correct medication. Multivariate regression models evaluated whether patient-physician agreement decreased with degree of better-eye visual field (VF) damage, color descriptor heterogeneity, and/or color vision deficiency, as determined by Hardy-Rand-Rittler (HRR) score and the Lanthony D15 testing index (D15 CCI). Results Subjects had a mean age of 69 (±11) years, with mean VF mean deviation of −4.7 (±6.0) and −10.9 (±8.4) dB in the better- and worse-seeing eyes, respectively. Patients produced 102 unique color descriptors to describe the colors of the 11 tested bottle caps. Among individual patients, the mean number of medications demonstrating patient-physician agreement was 6.1/11 (55.5%). Agreement was less than 15% for 4 medications (prednisolone acetate [generic], betaxolol HCl [Betoptic], brinzolamide/brimonidine [Simbrinza], and latanoprost [Xalatan]). Lower HRR scores and higher D15 CCI (both indicating worse color vision) were associated with greater VF damage (p<0.001). Extent of color vision deficiency and color descriptor heterogeneity were the only significant predictors of patient-physician agreement in multivariate models (odds of agreement = 0.90 per 1 point decrement in HRR score, p<0.001; odds of agreement = 0.30 for medications exhibiting high heterogeneity [≥ 11 descriptors], p=0.007). Conclusions Physician understanding of patient medication usage based solely on bottle cap color is frequently incorrect, particularly in glaucoma patients who may have color vision deficiency. Errors based on communication using bottle cap color alone may be common and could lead to confusion and harm. PMID:26260280
Periodic table-based descriptors to encode cytotoxicity profile of metal oxide nanoparticles: a mechanistic QSTR approach.

PubMed

Kar, Supratik; Gajewicz, Agnieszka; Puzyn, Tomasz; Roy, Kunal; Leszczynski, Jerzy

2014-09-01

Nanotechnology has evolved as a frontrunner in the development of modern science. Current studies have established toxicity of some nanoparticles to human and environment. Lack of sufficient data and low adequacy of experimental protocols hinder comprehensive risk assessment of nanoparticles (NPs). In the present work, metal electronegativity (χ), the charge of the metal cation corresponding to a given oxide (χox), atomic number and valence electron number of the metal have been used as simple molecular descriptors to build up quantitative structure-toxicity relationship (QSTR) models for prediction of cytotoxicity of metal oxide NPs to bacteria Escherichia coli. These descriptors can be easily obtained from molecular formula and information acquired from periodic table in no time. It has been shown that a simple molecular descriptor χox can efficiently encode cytotoxicity of metal oxides leading to models with high statistical quality as well as interpretability. Based on this model and previously published experimental results, we have hypothesized the most probable mechanism of the cytotoxicity of metal oxide nanoparticles to E. coli. Moreover, the required information for descriptor calculation is independent of size range of NPs, nullifying a significant problem that various physical properties of NPs change for different size ranges. Copyright © 2014 Elsevier Inc. All rights reserved.
A Concise Guide to Feature Histograms with Applications to LIDAR-Based Spacecraft Relative Navigation

NASA Astrophysics Data System (ADS)

Rhodes, Andrew P.; Christian, John A.; Evans, Thomas

2017-12-01

With the availability and popularity of 3D sensors, it is advantageous to re-examine the use of point cloud descriptors for the purpose of pose estimation and spacecraft relative navigation. One popular descriptor is the oriented unique repeatable clustered viewpoint feature histogram (OUR-CVFH), which is most often utilized in personal and industrial robotics to simultaneously recognize and navigate relative to an object. Recent research into using the OUR-CVFH descriptor for spacecraft navigation has produced favorable results. Since OUR-CVFH is the most recent innovation in a large family of feature histogram point cloud descriptors, discussions of parameter settings and insights into its functionality are spread among various publications and online resources. This paper organizes the history of feature histogram point cloud descriptors for a straightforward explanation of their evolution. This article compiles all the requisite information needed to implement OUR-CVFH into one location, as well as providing useful suggestions on how to tune the generation parameters. This work is beneficial for anyone interested in using this histogram descriptor for object recognition or navigation - may it be personal robotics or spacecraft navigation.
ANN expert system screening for illicit amphetamines using molecular descriptors

NASA Astrophysics Data System (ADS)

Gosav, S.; Praisler, M.; Dorohoi, D. O.

2007-05-01

The goal of this study was to develop and an artificial neural network (ANN) based on computed descriptors, which would be able to classify the molecular structures of potential illicit amphetamines and to derive their biological activity according to the similarity of their molecular structure with amphetamines of known toxicity. The system is necessary for testing new molecular structures for epidemiological, clinical, and forensic purposes. It was built using a database formed by 146 compounds representing drugs of abuse (mainly central stimulants, hallucinogens, sympathomimetic amines, narcotics and other potent analgesics), precursors, or derivatized counterparts. Their molecular structures were characterized by computing three types of descriptors: 38 constitutional descriptors (CDs), 69 topological descriptors (TDs) and 160 3D-MoRSE descriptors (3DDs). An ANN system was built for each category of variables. All three networks (CD-NN, TD-NN and 3DD-NN) were trained to distinguish between stimulant amphetamines, hallucinogenic amphetamines, and nonamphetamines. A selection of variables was performed when necessary. The efficiency with which each network identifies the class identity of an unknown sample was evaluated by calculating several figures of merit. The results of the comparative analysis are presented.
Ligand Electron Density Shape Recognition Using 3D Zernike Descriptors

NASA Astrophysics Data System (ADS)

Gunasekaran, Prasad; Grandison, Scott; Cowtan, Kevin; Mak, Lora; Lawson, David M.; Morris, Richard J.

We present a novel approach to crystallographic ligand density interpretation based on Zernike shape descriptors. Electron density for a bound ligand is expanded in an orthogonal polynomial series (3D Zernike polynomials) and the coefficients from this expansion are employed to construct rotation-invariant descriptors. These descriptors can be compared highly efficiently against large databases of descriptors computed from other molecules. In this manuscript we describe this process and show initial results from an electron density interpretation study on a dataset containing over a hundred OMIT maps. We could identify the correct ligand as the first hit in about 30 % of the cases, within the top five in a further 30 % of the cases, and giving rise to an 80 % probability of getting the correct ligand within the top ten matches. In all but a few examples, the top hit was highly similar to the correct ligand in both shape and chemistry. Further extensions and intrinsic limitations of the method are discussed.
The effects of geometric uncertainties on computational modelling of knee biomechanics

NASA Astrophysics Data System (ADS)

Meng, Qingen; Fisher, John; Wilcox, Ruth

2017-08-01

The geometry of the articular components of the knee is an important factor in predicting joint mechanics in computational models. There are a number of uncertainties in the definition of the geometry of cartilage and meniscus, and evaluating the effects of these uncertainties is fundamental to understanding the level of reliability of the models. In this study, the sensitivity of knee mechanics to geometric uncertainties was investigated by comparing polynomial-based and image-based knee models and varying the size of meniscus. The results suggested that the geometric uncertainties in cartilage and meniscus resulting from the resolution of MRI and the accuracy of segmentation caused considerable effects on the predicted knee mechanics. Moreover, even if the mathematical geometric descriptors can be very close to the imaged-based articular surfaces, the detailed contact pressure distribution produced by the mathematical geometric descriptors was not the same as that of the image-based model. However, the trends predicted by the models based on mathematical geometric descriptors were similar to those of the imaged-based models.
Calculating the dermal flux of chemicals with OELs based on their molecular structure: An attempt to assign the skin notation.

PubMed

Kupczewska-Dobecka, Małgorzata; Jakubowski, Marek; Czerczak, Sławomir

2010-09-01

Our objectives included calculating the permeability coefficient and dermal penetration rates (flux value) for 112 chemicals with occupational exposure limits (OELs) according to the LFER (linear free-energy relationship) model developed using published methods. We also attempted to assign skin notations based on each chemical's molecular structure. There are many studies available where formulae for coefficients of permeability from saturated aqueous solutions (K(p)) have been related to physicochemical characteristics of chemicals. The LFER model is based on the solvation equation, which contains five main descriptors predicted from chemical structure: solute excess molar refractivity, dipolarity/polarisability, summation hydrogen bond acidity and basicity, and the McGowan characteristic volume. Descriptor values, available for about 5000 compounds in the Pharma Algorithms Database were used to calculate permeability coefficients. Dermal penetration rate was estimated as a ratio of permeability coefficient and concentration of chemical in saturated aqueous solution. Finally, estimated dermal penetration rates were used to assign the skin notation to chemicals. Defined critical fluxes defined from the literature were recommended as reference values for skin notation. The application of Abraham descriptors predicted from chemical structure and LFER analysis in calculation of permeability coefficients and flux values for chemicals with OELs was successful. Comparison of calculated K(p) values with data obtained earlier from other models showed that LFER predictions were comparable to those obtained by some previously published models, but the differences were much more significant for others. It seems reasonable to conclude that skin should not be characterised as a simple lipophilic barrier alone. Both lipophilic and polar pathways of permeation exist across the stratum corneum. It is feasible to predict skin notation on the basis of the LFER and other published models; from among 112 chemicals 94 (84%) should have the skin notation in the OEL list based on the LFER calculations. The skin notation had been estimated by other published models for almost 94% of the chemicals. Twenty-nine (25.8%) chemicals were identified to have significant absorption and 65 (58%) the potential for dermal toxicity. We found major differences between alternative published analytical models and their ability to determine whether particular chemicals were potentially dermotoxic. Copyright © 2010 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Singh, Kunwar P., E-mail: kpsingh_52@yahoo.com; Gupta, Shikha

Ensemble learning approach based decision treeboost (DTB) and decision tree forest (DTF) models are introduced in order to establish quantitative structure–toxicity relationship (QSTR) for the prediction of toxicity of 1450 diverse chemicals. Eight non-quantum mechanical molecular descriptors were derived. Structural diversity of the chemicals was evaluated using Tanimoto similarity index. Stochastic gradient boosting and bagging algorithms supplemented DTB and DTF models were constructed for classification and function optimization problems using the toxicity end-point in T. pyriformis. Special attention was drawn to prediction ability and robustness of the models, investigated both in external and 10-fold cross validation processes. In complete data,more » optimal DTB and DTF models rendered accuracies of 98.90%, 98.83% in two-category and 98.14%, 98.14% in four-category toxicity classifications. Both the models further yielded classification accuracies of 100% in external toxicity data of T. pyriformis. The constructed regression models (DTB and DTF) using five descriptors yielded correlation coefficients (R{sup 2}) of 0.945, 0.944 between the measured and predicted toxicities with mean squared errors (MSEs) of 0.059, and 0.064 in complete T. pyriformis data. The T. pyriformis regression models (DTB and DTF) applied to the external toxicity data sets yielded R{sup 2} and MSE values of 0.637, 0.655; 0.534, 0.507 (marine bacteria) and 0.741, 0.691; 0.155, 0.173 (algae). The results suggest for wide applicability of the inter-species models in predicting toxicity of new chemicals for regulatory purposes. These approaches provide useful strategy and robust tools in the screening of ecotoxicological risk or environmental hazard potential of chemicals. - Graphical abstract: Importance of input variables in DTB and DTF classification models for (a) two-category, and (b) four-category toxicity intervals in T. pyriformis data. Generalization and predictive abilities of the constructed (c) DTB and (d) DTF regression models to predict the T. pyriformis toxicity of diverse chemicals. - Highlights: • Ensemble learning (EL) based models constructed for toxicity prediction of chemicals • Predictive models used a few simple non-quantum mechanical molecular descriptors. • EL-based DTB/DTF models successfully discriminated toxic and non-toxic chemicals. • DTB/DTF regression models precisely predicted toxicity of chemicals in multi-species. • Proposed EL based models can be used as tool to predict toxicity of new chemicals.« less
Content Based Image Retrieval by Using Color Descriptor and Discrete Wavelet Transform.

PubMed

Ashraf, Rehan; Ahmed, Mudassar; Jabbar, Sohail; Khalid, Shehzad; Ahmad, Awais; Din, Sadia; Jeon, Gwangil

2018-01-25

Due to recent development in technology, the complexity of multimedia is significantly increased and the retrieval of similar multimedia content is a open research problem. Content-Based Image Retrieval (CBIR) is a process that provides a framework for image search and low-level visual features are commonly used to retrieve the images from the image database. The basic requirement in any image retrieval process is to sort the images with a close similarity in term of visually appearance. The color, shape and texture are the examples of low-level image features. The feature plays a significant role in image processing. The powerful representation of an image is known as feature vector and feature extraction techniques are applied to get features that will be useful in classifying and recognition of images. As features define the behavior of an image, they show its place in terms of storage taken, efficiency in classification and obviously in time consumption also. In this paper, we are going to discuss various types of features, feature extraction techniques and explaining in what scenario, which features extraction technique will be better. The effectiveness of the CBIR approach is fundamentally based on feature extraction. In image processing errands like object recognition and image retrieval feature descriptor is an immense among the most essential step. The main idea of CBIR is that it can search related images to an image passed as query from a dataset got by using distance metrics. The proposed method is explained for image retrieval constructed on YCbCr color with canny edge histogram and discrete wavelet transform. The combination of edge of histogram and discrete wavelet transform increase the performance of image retrieval framework for content based search. The execution of different wavelets is additionally contrasted with discover the suitability of specific wavelet work for image retrieval. The proposed algorithm is prepared and tried to implement for Wang image database. For Image Retrieval Purpose, Artificial Neural Networks (ANN) is used and applied on standard dataset in CBIR domain. The execution of the recommended descriptors is assessed by computing both Precision and Recall values and compared with different other proposed methods with demonstrate the predominance of our method. The efficiency and effectiveness of the proposed approach outperforms the existing research in term of average precision and recall values.
Learning moment-based fast local binary descriptor

NASA Astrophysics Data System (ADS)

Bellarbi, Abdelkader; Zenati, Nadia; Otmane, Samir; Belghit, Hayet

2017-03-01

Recently, binary descriptors have attracted significant attention due to their speed and low memory consumption; however, using intensity differences to calculate the binary descriptive vector is not efficient enough. We propose an approach to binary description called POLAR_MOBIL, in which we perform binary tests between geometrical and statistical information using moments in the patch instead of the classical intensity binary test. In addition, we introduce a learning technique used to select an optimized set of binary tests with low correlation and high variance. This approach offers high distinctiveness against affine transformations and appearance changes. An extensive evaluation on well-known benchmark datasets reveals the robustness and the effectiveness of the proposed descriptor, as well as its good performance in terms of low computation complexity when compared with state-of-the-art real-time local descriptors.

QSAR modeling for anti-human African trypanosomiasis activity of substituted 2-Phenylimidazopyridines

NASA Astrophysics Data System (ADS)

Masand, Vijay H.; El-Sayed, Nahed N. E.; Mahajan, Devidas T.; Mercader, Andrew G.; Alafeefy, Ahmed M.; Shibi, I. G.

2017-02-01

In the present work, sixty substituted 2-Phenylimidazopyridines previously reported with potent anti-human African trypanosomiasis (HAT) activity were selected to build genetic algorithm (GA) based QSAR models to determine the structural features that have significant correlation with the activity. Multiple QSAR models were built using easily interpretable descriptors that are directly associated with the presence or the absence of a structural scaffold, or a specific atom. All the QSAR models have been thoroughly validated according to the OECD principles. All the QSAR models are statistically very robust (R2 = 0.80-0.87) with high external predictive ability (CCCex = 0.81-0.92). The QSAR analysis reveals that the HAT activity has good correlation with the presence of five membered rings in the molecule.
Introducing a new methodology for the calculation of local philicity and multiphilic descriptor: an alternative to the finite difference approximation

NASA Astrophysics Data System (ADS)

Sánchez-Márquez, Jesús; Zorrilla, David; García, Víctor; Fernández, Manuel

2018-07-01

This work presents a new development based on the condensation scheme proposed by Chamorro and Pérez, in which new terms to correct the frozen molecular orbital approximation have been introduced (improved frontier molecular orbital approximation). The changes performed on the original development allow taking into account the orbital relaxation effects, providing equivalent results to those achieved by the finite difference approximation and leading also to a methodology with great advantages. Local reactivity indices based on this new development have been obtained for a sample set of molecules and they have been compared with those indices based on the frontier molecular orbital and finite difference approximations. A new definition based on the improved frontier molecular orbital methodology for the dual descriptor index is also shown. In addition, taking advantage of the characteristics of the definitions obtained with the new condensation scheme, the descriptor local philicity is analysed by separating the components corresponding to the frontier molecular orbital approximation and orbital relaxation effects, analysing also the local parameter multiphilic descriptor in the same way. Finally, the effect of using the basis set is studied and calculations using DFT, CI and Möller-Plesset methodologies are performed to analyse the consequence of different electronic-correlation levels.
Prediction of blood-brain partitioning: a model based on molecular electronegativity distance vector descriptors.

PubMed

Zhang, Yong-Hong; Xia, Zhi-Ning; Qin, Li-Tang; Liu, Shu-Shen

2010-09-01

The objective of this paper is to build a reliable model based on the molecular electronegativity distance vector (MEDV) descriptors for predicting the blood-brain barrier (BBB) permeability and to reveal the effects of the molecular structural segments on the BBB permeability. Using 70 structurally diverse compounds, the partial least squares regression (PLSR) models between the BBB permeability and the MEDV descriptors were developed and validated by the variable selection and modeling based on prediction (VSMP) technique. The estimation ability, stability, and predictive power of a model are evaluated by the estimated correlation coefficient (r), leave-one-out (LOO) cross-validation correlation coefficient (q), and predictive correlation coefficient (R(p)). It has been found that PLSR model has good quality, r=0.9202, q=0.7956, and R(p)=0.6649 for M1 model based on the training set of 57 samples. To search the most important structural factors affecting the BBB permeability of compounds, we performed the values of the variable importance in projection (VIP) analysis for MEDV descriptors. It was found that some structural fragments in compounds, such as -CH(3), -CH(2)-, =CH-, =C, triple bond C-, -CH<, =C<, =N-, -NH-, =O, and -OH, are the most important factors affecting the BBB permeability. (c) 2010. Published by Elsevier Inc.
A New FPGA Architecture of FAST and BRIEF Algorithm for On-Board Corner Detection and Matching.

PubMed

Huang, Jingjin; Zhou, Guoqing; Zhou, Xiang; Zhang, Rongting

2018-03-28

Although some researchers have proposed the Field Programmable Gate Array (FPGA) architectures of Feature From Accelerated Segment Test (FAST) and Binary Robust Independent Elementary Features (BRIEF) algorithm, there is no consideration of image data storage in these traditional architectures that will result in no image data that can be reused by the follow-up algorithms. This paper proposes a new FPGA architecture that considers the reuse of sub-image data. In the proposed architecture, a remainder-based method is firstly designed for reading the sub-image, a FAST detector and a BRIEF descriptor are combined for corner detection and matching. Six pairs of satellite images with different textures, which are located in the Mentougou district, Beijing, China, are used to evaluate the performance of the proposed architecture. The Modelsim simulation results found that: (i) the proposed architecture is effective for sub-image reading from DDR3 at a minimum cost; (ii) the FPGA implementation is corrected and efficient for corner detection and matching, such as the average value of matching rate of natural areas and artificial areas are approximately 67% and 83%, respectively, which are close to PC's and the processing speed by FPGA is approximately 31 and 2.5 times faster than those by PC processing and by GPU processing, respectively.
Object tracking with adaptive HOG detector and adaptive Rao-Blackwellised particle filter

NASA Astrophysics Data System (ADS)

Rosa, Stefano; Paleari, Marco; Ariano, Paolo; Bona, Basilio

2012-01-01

Scenarios for a manned mission to the Moon or Mars call for astronaut teams to be accompanied by semiautonomous robots. A prerequisite for human-robot interaction is the capability of successfully tracking humans and objects in the environment. In this paper we present a system for real-time visual object tracking in 2D images for mobile robotic systems. The proposed algorithm is able to specialize to individual objects and to adapt to substantial changes in illumination and object appearance during tracking. The algorithm is composed by two main blocks: a detector based on Histogram of Oriented Gradient (HOG) descriptors and linear Support Vector Machines (SVM), and a tracker which is implemented by an adaptive Rao-Blackwellised particle filter (RBPF). The SVM is re-trained online on new samples taken from previous predicted positions. We use the effective sample size to decide when the classifier needs to be re-trained. Position hypotheses for the tracked object are the result of a clustering procedure applied on the set of particles. The algorithm has been tested on challenging video sequences presenting strong changes in object appearance, illumination, and occlusion. Experimental tests show that the presented method is able to achieve near real-time performances with a precision of about 7 pixels on standard video sequences of dimensions 320 × 240.
Discrimination Power of Polynomial-Based Descriptors for Graphs by Using Functional Matrices.

PubMed

Dehmer, Matthias; Emmert-Streib, Frank; Shi, Yongtang; Stefu, Monica; Tripathi, Shailesh

2015-01-01

In this paper, we study the discrimination power of graph measures that are based on graph-theoretical matrices. The paper generalizes the work of [M. Dehmer, M. Moosbrugger. Y. Shi, Encoding structural information uniquely with polynomial-based descriptors by employing the Randić matrix, Applied Mathematics and Computation, 268(2015), 164-168]. We demonstrate that by using the new functional matrix approach, exhaustively generated graphs can be discriminated more uniquely than shown in the mentioned previous work.
Discrimination Power of Polynomial-Based Descriptors for Graphs by Using Functional Matrices

PubMed Central

Dehmer, Matthias; Emmert-Streib, Frank; Shi, Yongtang; Stefu, Monica; Tripathi, Shailesh

2015-01-01

In this paper, we study the discrimination power of graph measures that are based on graph-theoretical matrices. The paper generalizes the work of [M. Dehmer, M. Moosbrugger. Y. Shi, Encoding structural information uniquely with polynomial-based descriptors by employing the Randić matrix, Applied Mathematics and Computation, 268(2015), 164–168]. We demonstrate that by using the new functional matrix approach, exhaustively generated graphs can be discriminated more uniquely than shown in the mentioned previous work. PMID:26479495
A rotation-translation invariant molecular descriptor of partial charges and its use in ligand-based virtual screening

PubMed Central

2014-01-01

Background Measures of similarity for chemical molecules have been developed since the dawn of chemoinformatics. Molecular similarity has been measured by a variety of methods including molecular descriptor based similarity, common molecular fragments, graph matching and 3D methods such as shape matching. Similarity measures are widespread in practice and have proven to be useful in drug discovery. Because of our interest in electrostatics and high throughput ligand-based virtual screening, we sought to exploit the information contained in atomic coordinates and partial charges of a molecule. Results A new molecular descriptor based on partial charges is proposed. It uses the autocorrelation function and linear binning to encode all atoms of a molecule into two rotation-translation invariant vectors. Combined with a scoring function, the descriptor allows to rank-order a database of compounds versus a query molecule. The proposed implementation is called ACPC (AutoCorrelation of Partial Charges) and released in open source. Extensive retrospective ligand-based virtual screening experiments were performed and other methods were compared with in order to validate the method and associated protocol. Conclusions While it is a simple method, it performed remarkably well in experiments. At an average speed of 1649 molecules per second, it reached an average median area under the curve of 0.81 on 40 different targets; hence validating the proposed protocol and implementation. PMID:24887178
Multicomponent ionic liquid CMC prediction.

PubMed

Kłosowska-Chomiczewska, I E; Artichowicz, W; Preiss, U; Jungnickel, C

2017-09-27

We created a model to predict CMC of ILs based on 704 experimental values published in 43 publications since 2000. Our model was able to predict CMC of variety of ILs in binary or ternary system in a presence of salt or alcohol. The molecular volume of IL (V m ), solvent-accessible surface (Ŝ), solvation enthalpy (Δ solv G ∞ ), concentration of salt (C s ) or alcohol (C a ) and their molecular volumes (V ms and V ma , respectively) were chosen as descriptors, and Kernel Support Vector Machine (KSVM) and Evolutionary Algorithm (EA) as regression methodologies to create the models. Data was split into training and validation set (80/20) and subjected to bootstrap aggregation. KSVM provided better fit with average R 2 of 0.843, and MSE of 0.608, whereas EA resulted in R 2 of 0.794 and MSE of 0.973. From the sensitivity analysis it was shown that V m and Ŝ have the highest impact on ILs micellization in both binary and ternary systems, however surprisingly in the presence of alcohol the V m becomes insignificant/irrelevant. Micelle stabilizing or destabilizing influence of the descriptors depends upon the additives. Previous attempts at modelling the CMC of ILs was generally limited to small number of ILs in simplified (binary) systems. We however showed successful prediction of the CMC over a range of different systems (binary and ternary).
Major Source of Error in QSPR Prediction of Intrinsic Thermodynamic Solubility of Drugs: Solid vs Nonsolid State Contributions?

PubMed

Abramov, Yuriy A

2015-06-01

The main purpose of this study is to define the major limiting factor in the accuracy of the quantitative structure-property relationship (QSPR) models of the thermodynamic intrinsic aqueous solubility of the drug-like compounds. For doing this, the thermodynamic intrinsic aqueous solubility property was suggested to be indirectly "measured" from the contributions of solid state, ΔGfus, and nonsolid state, ΔGmix, properties, which are estimated by the corresponding QSPR models. The QSPR models of ΔGfus and ΔGmix properties were built based on a set of drug-like compounds with available accurate measurements of fusion and thermodynamic solubility properties. For consistency ΔGfus and ΔGmix models were developed using similar algorithms and descriptor sets, and validated against the similar test compounds. Analysis of the relative performances of these two QSPR models clearly demonstrates that it is the solid state contribution which is the limiting factor in the accuracy and predictive power of the QSPR models of the thermodynamic intrinsic solubility. The performed analysis outlines a necessity of development of new descriptor sets for an accurate description of the long-range order (periodicity) phenomenon in the crystalline state. The proposed approach to the analysis of limitations and suggestions for improvement of QSPR-type models may be generalized to other applications in the pharmaceutical industry.
Texture functions in image analysis: A computationally efficient solution

NASA Technical Reports Server (NTRS)

Cox, S. C.; Rose, J. F.

1983-01-01

A computationally efficient means for calculating texture measurements from digital images by use of the co-occurrence technique is presented. The calculation of the statistical descriptors of image texture and a solution that circumvents the need for calculating and storing a co-occurrence matrix are discussed. The results show that existing efficient algorithms for calculating sums, sums of squares, and cross products can be used to compute complex co-occurrence relationships directly from the digital image input.
lazar: a modular predictive toxicology framework

PubMed Central

Maunz, Andreas; Gütlein, Martin; Rautenberg, Micha; Vorgrimmler, David; Gebele, Denis; Helma, Christoph

2013-01-01

lazar (lazy structure–activity relationships) is a modular framework for predictive toxicology. Similar to the read across procedure in toxicological risk assessment, lazar creates local QSAR (quantitative structure–activity relationship) models for each compound to be predicted. Model developers can choose between a large variety of algorithms for descriptor calculation and selection, chemical similarity indices, and model building. This paper presents a high level description of the lazar framework and discusses the performance of example classification and regression models. PMID:23761761
m-BIRCH: an online clustering approach for computer vision applications

NASA Astrophysics Data System (ADS)

Madan, Siddharth K.; Dana, Kristin J.

2015-03-01

We adapt a classic online clustering algorithm called Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH), to incrementally cluster large datasets of features commonly used in multimedia and computer vision. We call the adapted version modified-BIRCH (m-BIRCH). The algorithm uses only a fraction of the dataset memory to perform clustering, and updates the clustering decisions when new data comes in. Modifications made in m-BIRCH enable data driven parameter selection and effectively handle varying density regions in the feature space. Data driven parameter selection automatically controls the level of coarseness of the data summarization. Effective handling of varying density regions is necessary to well represent the different density regions in data summarization. We use m-BIRCH to cluster 840K color SIFT descriptors, and 60K outlier corrupted grayscale patches. We use the algorithm to cluster datasets consisting of challenging non-convex clustering patterns. Our implementation of the algorithm provides an useful clustering tool and is made publicly available.
Evaluation of the performance of 3D virtual screening protocols: RMSD comparisons, enrichment assessments, and decoy selection--what can we learn from earlier mistakes?

PubMed

Kirchmair, Johannes; Markt, Patrick; Distinto, Simona; Wolber, Gerhard; Langer, Thierry

2008-01-01

Within the last few years a considerable amount of evaluative studies has been published that investigate the performance of 3D virtual screening approaches. Thereby, in particular assessments of protein-ligand docking are facing remarkable interest in the scientific community. However, comparing virtual screening approaches is a non-trivial task. Several publications, especially in the field of molecular docking, suffer from shortcomings that are likely to affect the significance of the results considerably. These quality issues often arise from poor study design, biasing, by using improper or inexpressive enrichment descriptors, and from errors in interpretation of the data output. In this review we analyze recent literature evaluating 3D virtual screening methods, with focus on molecular docking. We highlight problematic issues and provide guidelines on how to improve the quality of computational studies. Since 3D virtual screening protocols are in general assessed by their ability to discriminate between active and inactive compounds, we summarize the impact of the composition and preparation of test sets on the outcome of evaluations. Moreover, we investigate the significance of both classic enrichment parameters and advanced descriptors for the performance of 3D virtual screening methods. Furthermore, we review the significance and suitability of RMSD as a measure for the accuracy of protein-ligand docking algorithms and of conformational space sub sampling algorithms.
Packing of nonoverlapping cubic particles: Computational algorithms and microstructural characteristics

NASA Astrophysics Data System (ADS)

Malmir, Hessam; Sahimi, Muhammad; Tabar, M. Reza Rahimi

2016-12-01

Packing of cubic particles arises in a variety of problems, ranging from biological materials to colloids and the fabrication of new types of porous materials with controlled morphology. The properties of such packings may also be relevant to problems involving suspensions of cubic zeolites, precipitation of salt crystals during CO2 sequestration in rock, and intrusion of fresh water in aquifers by saline water. Not much is known, however, about the structure and statistical descriptors of such packings. We present a detailed simulation and microstructural characterization of packings of nonoverlapping monodisperse cubic particles, following up on our preliminary results [H. Malmir et al., Sci. Rep. 6, 35024 (2016), 10.1038/srep35024]. A modification of the random sequential addition (RSA) algorithm has been developed to generate such packings, and a variety of microstructural descriptors, including the radial distribution function, the face-normal correlation function, two-point probability and cluster functions, the lineal-path function, the pore-size distribution function, and surface-surface and surface-void correlation functions, have been computed, along with the specific surface and mean chord length of the packings. The results indicate the existence of both spatial and orientational long-range order as the the packing density increases. The maximum packing fraction achievable with the RSA method is about 0.57, which represents the limit for a structure similar to liquid crystals.
Application of a symbolic motion structure representation algorithm to identify upper extremity kinematic changes during a repetitive task.

PubMed

Whittaker, Rachel L; Park, Woojin; Dickerson, Clark R

2018-04-27

Efficient and holistic identification of fatigue-induced movement strategies can be limited by large between-subject variability in descriptors of joint angle data. One promising alternative to traditional, or computationally intensive methods is the symbolic motion structure representation algorithm (SMSR), which identifies the basic spatial-temporal structure of joint angle data using string descriptors of temporal joint angle trajectories. This study attempted to use the SMSR to identify changes in upper extremity time series joint angle data during a repetitive goal directed task causing muscle fatigue. Twenty-eight participants (15 M, 13 F) performed a seated repetitive task until fatigued. Upper extremity joint angles were extracted from motion capture for representative task cycles. SMSRs, averages and ranges of several joint angles were compared at the start and end of the repetitive task to identify kinematic changes with fatigue. At the group level, significant increases in the range of all joint angle data existed with large between-subject variability that posed a challenge to the interpretation of these fatigue-related changes. However, changes in the SMSRs across participants effectively summarized the adoption of adaptive movement strategies. This establishes SMSR as a viable, logical, and sensitive method of fatigue identification via kinematic changes, with novel application and pragmatism for visual assessment of fatigue development. Copyright © 2018 Elsevier Ltd. All rights reserved.
A vision-based fall detection algorithm of human in indoor environment

NASA Astrophysics Data System (ADS)

Liu, Hao; Guo, Yongcai

2017-02-01

Elderly care becomes more and more prominent in China as the population is aging fast and the number of aging population is large. Falls, as one of the biggest challenges in elderly guardianship system, have a serious impact on both physical health and mental health of the aged. Based on feature descriptors, such as aspect ratio of human silhouette, velocity of mass center, moving distance of head and angle of the ultimate posture, a novel vision-based fall detection method was proposed in this paper. A fast median method of background modeling with three frames was also suggested. Compared with the conventional bounding box and ellipse method, the novel fall detection technique is not only applicable for recognizing the fall behaviors end of lying down but also suitable for detecting the fall behaviors end of kneeling down and sitting down. In addition, numerous experiment results showed that the method had a good performance in recognition accuracy on the premise of not adding the cost of time.
Music Identification System Using MPEG-7 Audio Signature Descriptors

PubMed Central

You, Shingchern D.; Chen, Wei-Hwa; Chen, Woei-Kae

2013-01-01

This paper describes a multiresolution system based on MPEG-7 audio signature descriptors for music identification. Such an identification system may be used to detect illegally copied music circulated over the Internet. In the proposed system, low-resolution descriptors are used to search likely candidates, and then full-resolution descriptors are used to identify the unknown (query) audio. With this arrangement, the proposed system achieves both high speed and high accuracy. To deal with the problem that a piece of query audio may not be inside the system's database, we suggest two different methods to find the decision threshold. Simulation results show that the proposed method II can achieve an accuracy of 99.4% for query inputs both inside and outside the database. Overall, it is highly possible to use the proposed system for copyright control. PMID:23533359
Chemical reactivity indices for the complete series of chlorinated benzenes: solvent effect.

PubMed

Padmanabhan, J; Parthasarathi, R; Subramanian, V; Chattaraj, P K

2006-03-02

We present a comprehensive analysis to probe the effect of solvation on the reactivity of the complete series of chlorobenzenes through the conceptual density functional theory (DFT)-based global and local descriptors. We propose a multiphilic descriptor in this study to explore the nature of attack at a particular site in a molecule. It is defined as the difference between nucleophilic and electrophilic condensed philicity functions. This descriptor is capable of explaining both the nucleophilicity and electrophilicity of the given atomic sites in the molecule simultaneously. The predictive ability of this descriptor is tested on the complete series of chlorobenzenes in gas and solvent media. A structure-toxicity analysis of these entire sets of chlorobenzenes toward aquatic organisms demonstrates the importance of the electrophilicity index in the prediction of the reactivity/toxicity.
Prediction on the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase based on gene expression programming.

PubMed

Li, Yuqin; You, Guirong; Jia, Baoxiu; Si, Hongzong; Yao, Xiaojun

2014-01-01

Quantitative structure-activity relationships (QSAR) were developed to predict the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase via heuristic method (HM) and gene expression programming (GEP). The descriptors of 33 pyrrolidine derivatives were calculated by the software CODESSA, which can calculate quantum chemical, topological, geometrical, constitutional, and electrostatic descriptors. HM was also used for the preselection of 5 appropriate molecular descriptors. Linear and nonlinear QSAR models were developed based on the HM and GEP separately and two prediction models lead to a good correlation coefficient (R (2)) of 0.93 and 0.94. The two QSAR models are useful in predicting the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase during the discovery of new anticancer drugs and providing theory information for studying the new drugs.

Data-Driven Information Extraction from Chinese Electronic Medical Records

PubMed Central

Zhao, Tianwan; Ge, Chen; Gao, Weiguo; Wei, Jia; Zhu, Kenny Q.

2015-01-01

Objective This study aims to propose a data-driven framework that takes unstructured free text narratives in Chinese Electronic Medical Records (EMRs) as input and converts them into structured time-event-description triples, where the description is either an elaboration or an outcome of the medical event. Materials and Methods Our framework uses a hybrid approach. It consists of constructing cross-domain core medical lexica, an unsupervised, iterative algorithm to accrue more accurate terms into the lexica, rules to address Chinese writing conventions and temporal descriptors, and a Support Vector Machine (SVM) algorithm that innovatively utilizes Normalized Google Distance (NGD) to estimate the correlation between medical events and their descriptions. Results The effectiveness of the framework was demonstrated with a dataset of 24,817 de-identified Chinese EMRs. The cross-domain medical lexica were capable of recognizing terms with an F1-score of 0.896. 98.5% of recorded medical events were linked to temporal descriptors. The NGD SVM description-event matching achieved an F1-score of 0.874. The end-to-end time-event-description extraction of our framework achieved an F1-score of 0.846. Discussion In terms of named entity recognition, the proposed framework outperforms state-of-the-art supervised learning algorithms (F1-score: 0.896 vs. 0.886). In event-description association, the NGD SVM is superior to SVM using only local context and semantic features (F1-score: 0.874 vs. 0.838). Conclusions The framework is data-driven, weakly supervised, and robust against the variations and noises that tend to occur in a large corpus. It addresses Chinese medical writing conventions and variations in writing styles through patterns used for discovering new terms and rules for updating the lexica. PMID:26295801
Data-Driven Information Extraction from Chinese Electronic Medical Records.

PubMed

Xu, Dong; Zhang, Meizhuo; Zhao, Tianwan; Ge, Chen; Gao, Weiguo; Wei, Jia; Zhu, Kenny Q

2015-01-01

This study aims to propose a data-driven framework that takes unstructured free text narratives in Chinese Electronic Medical Records (EMRs) as input and converts them into structured time-event-description triples, where the description is either an elaboration or an outcome of the medical event. Our framework uses a hybrid approach. It consists of constructing cross-domain core medical lexica, an unsupervised, iterative algorithm to accrue more accurate terms into the lexica, rules to address Chinese writing conventions and temporal descriptors, and a Support Vector Machine (SVM) algorithm that innovatively utilizes Normalized Google Distance (NGD) to estimate the correlation between medical events and their descriptions. The effectiveness of the framework was demonstrated with a dataset of 24,817 de-identified Chinese EMRs. The cross-domain medical lexica were capable of recognizing terms with an F1-score of 0.896. 98.5% of recorded medical events were linked to temporal descriptors. The NGD SVM description-event matching achieved an F1-score of 0.874. The end-to-end time-event-description extraction of our framework achieved an F1-score of 0.846. In terms of named entity recognition, the proposed framework outperforms state-of-the-art supervised learning algorithms (F1-score: 0.896 vs. 0.886). In event-description association, the NGD SVM is superior to SVM using only local context and semantic features (F1-score: 0.874 vs. 0.838). The framework is data-driven, weakly supervised, and robust against the variations and noises that tend to occur in a large corpus. It addresses Chinese medical writing conventions and variations in writing styles through patterns used for discovering new terms and rules for updating the lexica.
Novel texture-based descriptors for tool wear condition monitoring

NASA Astrophysics Data System (ADS)

Antić, Aco; Popović, Branislav; Krstanović, Lidija; Obradović, Ratko; Milošević, Mijodrag

2018-01-01

All state-of-the-art tool condition monitoring systems (TCM) in the tool wear recognition task, especially those that use vibration sensors, heavily depend on the choice of descriptors containing information about the tool wear state which are extracted from the particular sensor signals. All other post-processing techniques do not manage to increase the recognition precision if those descriptors are not discriminative enough. In this work, we propose a tool wear monitoring strategy which relies on the novel texture based descriptors. We consider the module of the Short Term Discrete Fourier Transform (STDFT) spectra obtained from the particular vibration sensors signal utterance as the 2D textured image. This is done by identifying the time scale of STDFT as the first dimension, and the frequency scale as the second dimension of the particular textured image. The obtained textured image is then divided into particular 2D texture patches, covering a part of the frequency range of interest. After applying the appropriate filter bank, 2D textons are extracted for each predefined frequency band. By averaging in time, we extract from the textons for each band of interest the information regarding the Probability Density Function (PDF) in the form of lower order moments, thus obtaining robust tool wear state descriptors. We validate the proposed features by the experiments conducted on the real TCM system, obtaining the high recognition accuracy.
Functional Constructivism: In Search of Formal Descriptors.

PubMed

Trofimova, Irina

2017-10-01

The Functional Constructivism (FC) paradigm is an alternative to behaviorism and considers behavior as being generated every time anew, based on an individual's capacities, environmental resources and demands. Walter Freeman's work provided us with evidence supporting the FC principles. In this paper we make parallels between gradual construction processes leading to the formation of individual behavior and habits, and evolutionary processes leading to the establishment of biological systems. Referencing evolutionary theory, several formal descriptors of such processes are proposed. These FC descriptors refer to the most universal aspects for constructing consistent structures: expansion of degrees of freedom, integration processes based on internal and external compatibility between systems and maintenance processes, all given in four different classes of systems: (a) Zone of Proximate Development (poorly defined) systems; (b) peer systems with emerging reproduction of multiple siblings; (c) systems with internalized integration of behavioral elements ('cruise controls'); and (d) systems capable of handling low-probability, not yet present events. The recursive dynamics within this set of descriptors acting on (traditional) downward, upward and horizontal directions of evolution, is conceptualized as diagonal evolution, or di-evolution. Two examples applying these FC descriptors to taxonomy are given: classification of the functionality of neuro-transmitters and temperament traits; classification of mental disorders. The paper is an early step towards finding a formal language describing universal tendencies in highly diverse, complex and multi-level transient systems known in ecology and biology as 'contingency cycles.'
3D face recognition based on multiple keypoint descriptors and sparse representation.

PubMed

Zhang, Lin; Ding, Zhixuan; Li, Hongyu; Shen, Ying; Lu, Jianwei

2014-01-01

Recent years have witnessed a growing interest in developing methods for 3D face recognition. However, 3D scans often suffer from the problems of missing parts, large facial expressions, and occlusions. To be useful in real-world applications, a 3D face recognition approach should be able to handle these challenges. In this paper, we propose a novel general approach to deal with the 3D face recognition problem by making use of multiple keypoint descriptors (MKD) and the sparse representation-based classification (SRC). We call the proposed method 3DMKDSRC for short. Specifically, with 3DMKDSRC, each 3D face scan is represented as a set of descriptor vectors extracted from keypoints by meshSIFT. Descriptor vectors of gallery samples form the gallery dictionary. Given a probe 3D face scan, its descriptors are extracted at first and then its identity can be determined by using a multitask SRC. The proposed 3DMKDSRC approach does not require the pre-alignment between two face scans and is quite robust to the problems of missing data, occlusions and expressions. Its superiority over the other leading 3D face recognition schemes has been corroborated by extensive experiments conducted on three benchmark databases, Bosphorus, GavabDB, and FRGC2.0. The Matlab source code for 3DMKDSRC and the related evaluation results are publicly available at http://sse.tongji.edu.cn/linzhang/3dmkdsrcface/3dmkdsrc.htm.
Predicting volume of distribution with decision tree-based regression methods using predicted tissue:plasma partition coefficients.

PubMed

Freitas, Alex A; Limbu, Kriti; Ghafourian, Taravat

2015-01-01

Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). Hence, the pros and cons of several different types of decision tree-based regression methods have been discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p) - often used in physiologically-based pharmacokinetics. Therefore, this work has assessed whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Comparison of the models that used only molecular descriptors, in particular, the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. Decision tree based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and flexible non-linear data mining methods such as decision trees, which can produce interpretable models. Graphical AbstractDecision trees for the prediction of tissue partition coefficient and volume of distribution of drugs.
Design of bactericidal peptides against Escherichia coli O157:H7, Pseudomonas aeruginosa and methicillin-resistant Staphylococcus aureus.

PubMed

Cruz, Jenniffer; Rondon, Paola; Torres, Rodrigo; Urquiza, Mauricio; Guzman, Fanny; Alvarez, Claudio; Abengozar, Maria Angeles; Sierra, Daniel A; Rivas, Luis; Fernandez-Lafuente, Roberto; Ortiz, Claudia

2018-05-08

Antimicrobial peptides are on the first line of defense against pathogenic microorganisms of many living beings. These compounds are considered natural antibiotics that can overcome bacterial resistance to conventional antibiotics. Due to this characteristic, new peptides with improved properties are quite appealing for designing new strategies for fighting pathogenic bacteria Methods: Sixteen designed peptides were synthesized using Fmoc chemistry; five of them are new cationic antimicrobial peptides (CAMPs) designed using a genetic algorithm that optimizes the antibacterial activity based on selected physicochemical descriptors and 11 analog peptides derived from these five peptides were designed and constructed by single amino acid substitutions. These 16 peptides were structurally characterized and their biological activity was determined against Escherichia coli O157:H7 (E. coli O157:H7), and methicillin-resistant strains of Staphylococcus aureus (MRSA) and Pseudomonas aeruginosa (P. aeruginosa) were determined Results: These 16 peptides were folded into an α-helix structure in membrane-mimicking environment. Among these 16 peptides, GIBIM-P5S9K (ATKKCGLFKILKGVGKI) showed the highest antimicrobial activity against E. coli O157:H7 (MIC=10µM), methicillin resistant Staphylococcus aureus (MRSA) (MIC=25µM) and Pseudomonas aeruginosa (MIC=10 µM). Peptide GIBIM-P5S9K caused permeabilization of the bacterial membrane at 25 µM as determined by the Sytox Green uptake assay and the labelling of these bacteria by using the fluoresceinated peptide. GIBIM-P5S9K seems to be specific for these bacteria because at 50 µM provoked lower than 40% of erythrocyte hemolysis. New CAMPs have been designed using a genetic algorithm based on selected physicochemical descriptors and single amino acid substitution. These CAMPs interacted quite specifically with the bacterial cell membrane, GIBIM-P5S9K exhibiting high antibacterial activity on Escherichia coli O157:H7, methicillin resistant strains of Staphylococcus aureus and P. aeruginosa. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Categorical QSAR models for skin sensitization based on local lymph node assay measures and both ground and excited state 4D-fingerprint descriptors

NASA Astrophysics Data System (ADS)

Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Santos-Filho, Osvaldo A.; Esposito, Emilio X.; Hopfinger, Anton J.; Tseng, Yufeng J.

2008-06-01

In previous studies we have developed categorical QSAR models for predicting skin-sensitization potency based on 4D-fingerprint (4D-FP) descriptors and in vivo murine local lymph node assay (LLNA) measures. Only 4D-FP derived from the ground state (GMAX) structures of the molecules were used to build the QSAR models. In this study we have generated 4D-FP descriptors from the first excited state (EMAX) structures of the molecules. The GMAX, EMAX and the combined ground and excited state 4D-FP descriptors (GEMAX) were employed in building categorical QSAR models. Logistic regression (LR) and partial least square coupled logistic regression (PLS-CLR), found to be effective model building for the LLNA skin-sensitization measures in our previous studies, were used again in this study. This also permitted comparison of the prior ground state models to those involving first excited state 4D-FP descriptors. Three types of categorical QSAR models were constructed for each of the GMAX, EMAX and GEMAX datasets: a binary model (2-state), an ordinal model (3-state) and a binary-binary model (two-2-state). No significant differences exist among the LR 2-state model constructed for each of the three datasets. However, the PLS-CLR 3-state and 2-state models based on the EMAX and GEMAX datasets have higher predictivity than those constructed using only the GMAX dataset. These EMAX and GMAX categorical models are also more significant and predictive than corresponding models built in our previous QSAR studies of LLNA skin-sensitization measures.
Aconitum and Delphinium sp. Alkaloids as Antagonist Modulators of Voltage-Gated Na+ Channels. AM1/DFT Electronic Structure Investigations and QSAR Studies

PubMed Central

Turabekova, Malakhat A.; Rasulev, Bakhtiyor F.; Levkovich, Mikhail G.; Abdullaev, Nasrulla D.; Leszczynski, Jerzy

2015-01-01

Early pharmacological studies of Aconitum and Delphinium sp. alkaloids suggested that these neurotoxins act at site 2 of voltage-gated Na+ channel and allosterically modulate its function. Understanding structural requirements for these compounds to exhibit binding activity at voltage-gated Na+ channel has been important in various fields. This paper reports quantum-chemical studies and quantitative structure-activity relationships (QSARs) based on a total of 65 natural alkaloids from two plant species, which includes both blockers and openers of sodium ion channel. A series of 18 antagonist alkaloids (9 blockers and 9 openers) have been studied using AM1 and DFT computational methods in order to reveal their structure-activity (structure-toxicity) relationship at electronic level. An examination of frontier orbitals obtained for ground and protonated forms of the compounds revealed that HOMOs and LUMOs were mainly represented by nitrogen atom and benzyl/benzoylester orbitals with –OH and –OCOCH3 contributions. The results obtained from this research have confirmed the experimental findings suggesting that neurotoxins acting at type 2 receptor site of voltage-dependent sodium channel are activators and blockers with common structural features and differ only in efficacy. The energetic tendency of HOMO-LUMO energy gap can probably distinguish activators and blockers that have been observed. Genetic Algorithm with Multiple Linear Regression Analysis (GA-MLRA) technique was also applied for the generation of two-descriptor QSAR models for the set of 65 blockers. Additionally to the computational studies, the HOMO-LUMO gap descriptor in each obtained QSAR model has confirmed the crucial role of charge transfer in receptor-ligand interactions. A number of other descriptors such as logP, IBEG, nNH2, nHDon, nCO have been selected as complementary ones to LUMO and their role in activity alteration has also been discussed. PMID:18201930
Quantum descriptors for predictive toxicology of halogenated aliphatic hydrocarbons.

PubMed

Trohalaki, S; Pachter, R

2003-04-01

In order to improve Quantitative Structure-Activity Relationships (QSARs) for halogenated aliphatics (HA) and to better understand the biophysical mechanism of toxic response to these ubiquitous chemicals, we employ improved quantum-mechanical descriptors to account for HA electrophilicity. We demonstrate that, unlike the lowest unoccupied molecular orbital energy, ELUMO, which was previously used as a descriptor, the electron affinity can be systematically improved by application of higher levels of theory. We also show that employing the reciprocal of ELUMO, which is more consistent with frontier molecular orbital (FMO) theory, improves the correlations with in vitro toxicity data. We offer explanations based on FMO theory for a result from our previous work, in which the LUMO energies of HA anions correlated surprisingly well with in vitro toxicity data. Additional descriptors are also suggested and interpreted in terms of the accepted biophysical mechanism of toxic response to HAs and new QSARs are derived for various chemical categories that compose the data set employed. These alternate descriptors provide important insight and could benefit other classes of compounds where the biophysical mechanism of toxic response involves dissociative attachment.
A similarity based approach to identify homogeneous regions for seasonal forecasting

NASA Astrophysics Data System (ADS)

Schick, Simon; Rössler, Ole; Weingartner, Rolf

2015-04-01

Seasonal runoff forecasting using statistical models is challenged by a large number of candidate predictors and a general weak predictor-predictand relationship. As the area of the target basin increases, often also the available data sets do, thus reinforcing the predictor selection challenge. We propose an approach which follows the idea of 'divide and conquer' as developed in computational sciences and machine learning: First, the macroscale target basin is partitioned into homogeneous regions using all its gauged mesoscale subbasins. Second, one representative subbasin per homogeneous region is identified, for which models are fitted and applied. Third, the resulting forecasts are combined at the scale of the macroscale target basin. This approach requires a suitable method to identify homogeneous regions and representative subbasins. We suggest a way based on hydrological similarity, as catchment similarity estimated with respect to physiographic-climatic descriptors does not necessarily imply similar runoff response. Each descriptor is derived from daily runoff series and aimed to reflect a specific catchment characteristic: autocorrelation coefficient, parameters of fitted Gamma distribution and low/high flow indices (based on daily runoff values) fluctuation of the standard deviation within the yearly cycle (based on weekly runoff values) dominant harmonics obtained from the discrete Fourier transform (based on monthly runoff values) long term trend (based on yearly runoff values) Where necessary, the runoff series first need to be standardized, aggregated, detrended or deseasonalized. As a preliminary study we present the results of a cluster analysis for the Swiss Rhine River as macroscale target basin, which leads to about 40 mesoscale subbasins with runoff series for the period 1991-2010. Problems we have to address include the choice of a clustering algorithm, the identification of an appropriate number of regions and the selection of representative subbasins per region. The results are finally discussed with respect to the runoff regimes as defined in the Hydrological Atlas of Switzerland.
An object-based approach to weather analysis and its applications

NASA Astrophysics Data System (ADS)

Troemel, Silke; Diederich, Malte; Horvath, Akos; Simmer, Clemens; Kumjian, Matthew

2013-04-01

The research group 'Object-based Analysis and SEamless prediction' (OASE) within the Hans Ertel Centre for Weather Research programme (HErZ) pursues an object-based approach to weather analysis. The object-based tracking approach adopts the Lagrange perspective by identifying and following the development of convective events over the course of their lifetime. Prerequisites of the object-based analysis are a high-resolved observational data base and a tracking algorithm. A near real-time radar and satellite remote sensing-driven 3D observation-microphysics composite covering Germany, currently under development, contains gridded observations and estimated microphysical quantities. A 3D scale-space tracking identifies convective rain events in the dual-composite and monitors the development over the course of their lifetime. The OASE-group exploits the object-based approach in several fields of application: (1) For a better understanding and analysis of precipitation processes responsible for extreme weather events, (2) in nowcasting, (3) as a novel approach for validation of meso-γ atmospheric models, and (4) in data assimilation. Results from the different fields of application will be presented. The basic idea of the object-based approach is to identify a small set of radar- and satellite derived descriptors which characterize the temporal development of precipitation systems which constitute the objects. So-called proxies of the precipitation process are e.g. the temporal change of the brightband, vertically extensive columns of enhanced differential reflectivity ZDR or the cloud top temperature and heights identified in the 4D field of ground-based radar reflectivities and satellite retrievals generated by a cell during its life time. They quantify (micro-) physical differences among rain events and relate to the precipitation yield. Analyses on the informative content of ZDR columns as precursor for storm evolution for example will be presented to demonstrate the use of such system-oriented predictors for nowcasting. Columns of differential reflectivity ZDR measured by polarimetric weather radars are prominent signatures associated with thunderstorm updrafts. Since greater vertical velocities can loft larger drops and water-coated ice particles to higher altitudes above the environmental freezing level, the integrated ZDR column above the freezing level increases with increasing updraft intensity. Validation of atmospheric models concerning precipitation representation or prediction is usually confined to comparisons of precipitation fields or their temporal and spatial statistics. A comparison of the rain rates alone, however, does not immediately explain discrepancies between models and observations, because similar rain rates might be produced by different processes. Within the event-based approach for validation of models both observed and modeled rain events are analyzed by means of proxies of the precipitation process. Both sets of descriptors represent the basis for model validation since different leading descriptors - in a statistical sense- hint at process formulations potentially responsible for model failures.
Ability of Bottle Cap Color to Facilitate Accurate Patient-Physician Communication Regarding Medication Identity in Patients with Glaucoma.

PubMed

Dave, Pujan; Villarreal, Guadalupe; Friedman, David S; Kahook, Malik Y; Ramulu, Pradeep Y

2015-12-01

To determine the accuracy of patient-physician communication regarding topical ophthalmic medication use based on bottle cap color, particularly among individuals who may have acquired color vision deficiency from glaucoma. Cross-sectional, clinical study. Patients aged ≥18 years with primary open-angle, primary angle-closure, pseudoexfoliation, or pigment dispersion glaucoma, bilateral visual acuity of ≥20/400, and no concurrent conditions that may affect color vision. A total of 100 patients provided color descriptions of 11 distinct medication bottle caps. Color descriptors were then presented to 3 physicians. Physicians matched each color descriptor to the medication they thought the descriptor was describing. Frequency of patient-physician agreement, occurring when all 3 physicians accurately matched the color descriptor to the correct medication. Multivariate regression models evaluated whether patient-physician agreement decreased with degree of better-eye visual field (VF) damage, color descriptor heterogeneity, or color vision deficiency, as determined by the Hardy-Rand-Rittler (HRR) score and Lanthony D15 color confusion index (D15 CCI). Subjects had a mean age of 69 (±11) years, with VF mean deviation of -4.7 (±6.0) and -10.9 (±8.4) decibels (dB) in the better- and worse-seeing eyes, respectively. Patients produced 102 unique color descriptors to describe the colors of the 11 bottle caps. Among individual patients, the mean number of medications demonstrating agreement was 6.1/11 (55.5%). Agreement was less than 15% for 4 medications (prednisolone acetate [generic], betaxolol HCl [Betoptic; Alcon Laboratories Inc., Fort Worth, TX], brinzolamide/brimonidine [Simbrinza; Alcon Laboratories Inc.], and latanoprost [Xalatan; Pfizer, Inc., New York, NY]). Lower HRR scores and higher D15 CCI (both indicating worse color vision) were associated with greater VF damage (P < 0.001). Extent of color vision deficiency and color descriptor heterogeneity significantly predicted agreement in multivariate models (odds of agreement = 0.90 per 1 point decrement in HRR score, P < 0.001; odds of agreement = 0.30 for medications exhibiting high heterogeneity [≥11 descriptors], P = 0.007). Physician understanding of patient medication use based solely on bottle cap color is frequently incorrect, particularly in patients with glaucoma who may have color vision deficiency. Errors based on communication using bottle cap color alone may be common and could lead to confusion and harm. Copyright © 2015 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
Structure-activity relationships between sterols and their thermal stability in oil matrix.

PubMed

Hu, Yinzhou; Xu, Junli; Huang, Weisu; Zhao, Yajing; Li, Maiquan; Wang, Mengmeng; Zheng, Lufei; Lu, Baiyi

2018-08-30

Structure-activity relationships between 20 sterols and their thermal stabilities were studied in a model oil system. All sterol degradations were found to be consistent with a first-order kinetic model with determination of coefficient (R 2 ) higher than 0.9444. The number of double bonds in the sterol structure was negatively correlated with the thermal stability of sterol, whereas the length of the branch chain was positively correlated with the thermal stability of sterol. A quantitative structure-activity relationship (QSAR) model to predict thermal stability of sterol was developed by using partial least squares regression (PLSR) combined with genetic algorithm (GA). A regression model was built with R 2 of 0.806. Almost all sterol degradation constants can be predicted accurately with R 2 of cross-validation equals to 0.680. Four important variables were selected in optimal QSAR model and the selected variables were observed to be related with information indices, RDF descriptors, and 3D-MoRSE descriptors. Copyright © 2018 Elsevier Ltd. All rights reserved.
A parallel implementation of 3D Zernike moment analysis

NASA Astrophysics Data System (ADS)

Berjón, Daniel; Arnaldo, Sergio; Morán, Francisco

2011-01-01

Zernike polynomials are a well known set of functions that find many applications in image or pattern characterization because they allow to construct shape descriptors that are invariant against translations, rotations or scale changes. The concepts behind them can be extended to higher dimension spaces, making them also fit to describe volumetric data. They have been less used than their properties might suggest due to their high computational cost. We present a parallel implementation of 3D Zernike moments analysis, written in C with CUDA extensions, which makes it practical to employ Zernike descriptors in interactive applications, yielding a performance of several frames per second in voxel datasets about 2003 in size. In our contribution, we describe the challenges of implementing 3D Zernike analysis in a general-purpose GPU. These include how to deal with numerical inaccuracies, due to the high precision demands of the algorithm, or how to deal with the high volume of input data so that it does not become a bottleneck for the system.
The effects of geometric uncertainties on computational modelling of knee biomechanics

PubMed Central

Fisher, John; Wilcox, Ruth

2017-01-01

The geometry of the articular components of the knee is an important factor in predicting joint mechanics in computational models. There are a number of uncertainties in the definition of the geometry of cartilage and meniscus, and evaluating the effects of these uncertainties is fundamental to understanding the level of reliability of the models. In this study, the sensitivity of knee mechanics to geometric uncertainties was investigated by comparing polynomial-based and image-based knee models and varying the size of meniscus. The results suggested that the geometric uncertainties in cartilage and meniscus resulting from the resolution of MRI and the accuracy of segmentation caused considerable effects on the predicted knee mechanics. Moreover, even if the mathematical geometric descriptors can be very close to the imaged-based articular surfaces, the detailed contact pressure distribution produced by the mathematical geometric descriptors was not the same as that of the image-based model. However, the trends predicted by the models based on mathematical geometric descriptors were similar to those of the imaged-based models. PMID:28879008
Novel Automated Approach to Predict the Outcome of Laser Peripheral Iridotomy for Primary Angle Closure Suspect Eyes Using Anterior Segment Optical Coherence Tomography.

PubMed

Koh, Victor; Swamidoss, Issac Niwas; Aquino, Maria Cecilia D; Chew, Paul T; Sng, Chelvin

2018-04-27

Develop an algorithm to predict the success of laser peripheral iridotomy (LPI) in primary angle closure suspect (PACS), using pre-treatment anterior segment optical coherence tomography (ASOCT) scans. A total of 116 eyes with PACS underwent LPI and time-domain ASOCT scans (temporal and nasal cuts) were performed before and 1 month after LPI. All the post-treatment scans were classified to one of the following categories: (a) both angles open, (b) one of two angles open and (c) both angles closed. After LPI, success is defined as one or more angles changed from close to open. In this proposed method, the pre and post-LPI ASOCT scans were registered at the corresponding angles based on similarities between the respective local descriptor features and random sample consensus technique was used to identify the largest consensus set of correspondences between the pre and post-LPI ASOCT scans. Subsequently, features such as correlation co-efficient (CC) and structural similarity index (SSIM) were extracted and correlated with the success of LPI. We included 116 eyes and 91 (78.44%) eyes fulfilled the criteria for success after LPI. Using the CC and SSIM index scores from this training set of ASOCT images, our algorithm showed that the success of LPI in eyes with narrow angles can be predicted with 89.7% accuracy, specificity of 95.2% and sensitivity of 36.4% based on pre-LPI ASOCT scans only. Using pre-LPI ASOCT scans, our proposed algorithm showed good accuracy in predicting the success of LPI for PACS eyes. This fully-automated algorithm could aid decision making in offering LPI as a prophylactic treatment for PACS.
Noninvasive forward-scattering system for rapid detection, characterization, and identification of Listeria colonies: image processing and data analysis

NASA Astrophysics Data System (ADS)

Rajwa, Bartek; Bayraktar, Bulent; Banada, Padmapriya P.; Huff, Karleigh; Bae, Euiwon; Hirleman, E. Daniel; Bhunia, Arun K.; Robinson, J. Paul

2006-10-01

Bacterial contamination by Listeria monocytogenes puts the public at risk and is also costly for the food-processing industry. Traditional methods for pathogen identification require complicated sample preparation for reliable results. Previously, we have reported development of a noninvasive optical forward-scattering system for rapid identification of Listeria colonies grown on solid surfaces. The presented system included application of computer-vision and patternrecognition techniques to classify scatter pattern formed by bacterial colonies irradiated with laser light. This report shows an extension of the proposed method. A new scatterometer equipped with a high-resolution CCD chip and application of two additional sets of image features for classification allow for higher accuracy and lower error rates. Features based on Zernike moments are supplemented by Tchebichef moments, and Haralick texture descriptors in the new version of the algorithm. Fisher's criterion has been used for feature selection to decrease the training time of machine learning systems. An algorithm based on support vector machines was used for classification of patterns. Low error rates determined by cross-validation, reproducibility of the measurements, and robustness of the system prove that the proposed technology can be implemented in automated devices for detection and classification of pathogenic bacteria.
QSAR as a random event: modeling of nanoparticles uptake in PaCa2 cancer cells.

PubMed

Toropov, Andrey A; Toropova, Alla P; Puzyn, Tomasz; Benfenati, Emilio; Gini, Giuseppina; Leszczynska, Danuta; Leszczynski, Jerzy

2013-06-01

Quantitative structure-property/activity relationships (QSPRs/QSARs) are a tool to predict various endpoints for various substances. The "classic" QSPR/QSAR analysis is based on the representation of the molecular structure by the molecular graph. However, simplified molecular input-line entry system (SMILES) gradually becomes most popular representation of the molecular structure in the databases available on the Internet. Under such circumstances, the development of molecular descriptors calculated directly from SMILES becomes attractive alternative to "classic" descriptors. The CORAL software (http://www.insilico.eu/coral) is provider of SMILES-based optimal molecular descriptors which are aimed to correlate with various endpoints. We analyzed data set on nanoparticles uptake in PaCa2 pancreatic cancer cells. The data set includes 109 nanoparticles with the same core but different surface modifiers (small organic molecules). The concept of a QSAR as a random event is suggested in opposition to "classic" QSARs which are based on the only one distribution of available data into the training and the validation sets. In other words, five random splits into the "visible" training set and the "invisible" validation set were examined. The SMILES-based optimal descriptors (obtained by the Monte Carlo technique) for these splits are calculated with the CORAL software. The statistical quality of all these models is good. Copyright © 2013 Elsevier Ltd. All rights reserved.
Using LabView for real-time monitoring and tracking of multiple biological objects

NASA Astrophysics Data System (ADS)

Nikolskyy, Aleksandr I.; Krasilenko, Vladimir G.; Bilynsky, Yosyp Y.; Starovier, Anzhelika

2017-04-01

Today real-time studying and tracking of movement dynamics of various biological objects is important and widely researched. Features of objects, conditions of their visualization and model parameters strongly influence the choice of optimal methods and algorithms for a specific task. Therefore, to automate the processes of adaptation of recognition tracking algorithms, several Labview project trackers are considered in the article. Projects allow changing templates for training and retraining the system quickly. They adapt to the speed of objects and statistical characteristics of noise in images. New functions of comparison of images or their features, descriptors and pre-processing methods will be discussed. The experiments carried out to test the trackers on real video files will be presented and analyzed.

Fourier Descriptor Analysis and Unification of Voice Range Profile Contours: Method and Applications

ERIC Educational Resources Information Center

Pabon, Peter; Ternstrom, Sten; Lamarche, Anick

2011-01-01

Purpose: To describe a method for unified description, statistical modeling, and comparison of voice range profile (VRP) contours, even from diverse sources. Method: A morphologic modeling technique, which is based on Fourier descriptors (FDs), is applied to the VRP contour. The technique, which essentially involves resampling of the curve of the…
Australian Thesaurus of Education Descriptors. A Word-Stock for Indexing and Retrieving Australian Educational Literature.

ERIC Educational Resources Information Center

Lavender, G. B.; Findlay, Margaret A.

This core thesaurus of terms suitable for indexing Australian educational literature was developed by the Australian Council for Educational Research by means of a systematic and thorough revision of the "Thesaurus of ERIC Descriptors." Based on the actual terminology of education in Australia, this thesaurus includes: key words and…
Automated classification of tropical shrub species: a hybrid of leaf shape and machine learning approach

PubMed Central

Murat, Miraemiliana; Abu, Arpah; Yap, Hwa Jen; Yong, Kien-Thai

2017-01-01

Plants play a crucial role in foodstuff, medicine, industry, and environmental protection. The skill of recognising plants is very important in some applications, including conservation of endangered species and rehabilitation of lands after mining activities. However, it is a difficult task to identify plant species because it requires specialized knowledge. Developing an automated classification system for plant species is necessary and valuable since it can help specialists as well as the public in identifying plant species easily. Shape descriptors were applied on the myDAUN dataset that contains 45 tropical shrub species collected from the University of Malaya (UM), Malaysia. Based on literature review, this is the first study in the development of tropical shrub species image dataset and classification using a hybrid of leaf shape and machine learning approach. Four types of shape descriptors were used in this study namely morphological shape descriptors (MSD), Histogram of Oriented Gradients (HOG), Hu invariant moments (Hu) and Zernike moments (ZM). Single descriptor, as well as the combination of hybrid descriptors were tested and compared. The tropical shrub species are classified using six different classifiers, which are artificial neural network (ANN), random forest (RF), support vector machine (SVM), k-nearest neighbour (k-NN), linear discriminant analysis (LDA) and directed acyclic graph multiclass least squares twin support vector machine (DAG MLSTSVM). In addition, three types of feature selection methods were tested in the myDAUN dataset, Relief, Correlation-based feature selection (CFS) and Pearson’s coefficient correlation (PCC). The well-known Flavia dataset and Swedish Leaf dataset were used as the validation dataset on the proposed methods. The results showed that the hybrid of all descriptors of ANN outperformed the other classifiers with an average classification accuracy of 98.23% for the myDAUN dataset, 95.25% for the Flavia dataset and 99.89% for the Swedish Leaf dataset. In addition, the Relief feature selection method achieved the highest classification accuracy of 98.13% after 80 (or 60%) of the original features were reduced, from 133 to 53 descriptors in the myDAUN dataset with the reduction in computational time. Subsequently, the hybridisation of four descriptors gave the best results compared to others. It is proven that the combination MSD and HOG were good enough for tropical shrubs species classification. Hu and ZM descriptors also improved the accuracy in tropical shrubs species classification in terms of invariant to translation, rotation and scale. ANN outperformed the others for tropical shrub species classification in this study. Feature selection methods can be used in the classification of tropical shrub species, as the comparable results could be obtained with the reduced descriptors and reduced in computational time and cost. PMID:28924506
Automated classification of tropical shrub species: a hybrid of leaf shape and machine learning approach.

PubMed

Murat, Miraemiliana; Chang, Siow-Wee; Abu, Arpah; Yap, Hwa Jen; Yong, Kien-Thai

2017-01-01

Plants play a crucial role in foodstuff, medicine, industry, and environmental protection. The skill of recognising plants is very important in some applications, including conservation of endangered species and rehabilitation of lands after mining activities. However, it is a difficult task to identify plant species because it requires specialized knowledge. Developing an automated classification system for plant species is necessary and valuable since it can help specialists as well as the public in identifying plant species easily. Shape descriptors were applied on the myDAUN dataset that contains 45 tropical shrub species collected from the University of Malaya (UM), Malaysia. Based on literature review, this is the first study in the development of tropical shrub species image dataset and classification using a hybrid of leaf shape and machine learning approach. Four types of shape descriptors were used in this study namely morphological shape descriptors (MSD), Histogram of Oriented Gradients (HOG), Hu invariant moments (Hu) and Zernike moments (ZM). Single descriptor, as well as the combination of hybrid descriptors were tested and compared. The tropical shrub species are classified using six different classifiers, which are artificial neural network (ANN), random forest (RF), support vector machine (SVM), k-nearest neighbour (k-NN), linear discriminant analysis (LDA) and directed acyclic graph multiclass least squares twin support vector machine (DAG MLSTSVM). In addition, three types of feature selection methods were tested in the myDAUN dataset, Relief, Correlation-based feature selection (CFS) and Pearson's coefficient correlation (PCC). The well-known Flavia dataset and Swedish Leaf dataset were used as the validation dataset on the proposed methods. The results showed that the hybrid of all descriptors of ANN outperformed the other classifiers with an average classification accuracy of 98.23% for the myDAUN dataset, 95.25% for the Flavia dataset and 99.89% for the Swedish Leaf dataset. In addition, the Relief feature selection method achieved the highest classification accuracy of 98.13% after 80 (or 60%) of the original features were reduced, from 133 to 53 descriptors in the myDAUN dataset with the reduction in computational time. Subsequently, the hybridisation of four descriptors gave the best results compared to others. It is proven that the combination MSD and HOG were good enough for tropical shrubs species classification. Hu and ZM descriptors also improved the accuracy in tropical shrubs species classification in terms of invariant to translation, rotation and scale. ANN outperformed the others for tropical shrub species classification in this study. Feature selection methods can be used in the classification of tropical shrub species, as the comparable results could be obtained with the reduced descriptors and reduced in computational time and cost.
A Novel Approach for Weed Type Classification Based on Shape Descriptors and a Fuzzy Decision-Making Method

PubMed Central

Herrera, Pedro Javier; Dorado, José.; Ribeiro, Ángela.

2014-01-01

An important objective in weed management is the discrimination between grasses (monocots) and broad-leaved weeds (dicots), because these two weed groups can be appropriately controlled by specific herbicides. In fact, efficiency is higher if selective treatment is performed for each type of infestation instead of using a broadcast herbicide on the whole surface. This work proposes a strategy where weeds are characterised by a set of shape descriptors (the seven Hu moments and six geometric shape descriptors). Weeds appear in outdoor field images which display real situations obtained from a RGB camera. Thus, images present a mixture of both weed species under varying conditions of lighting. In the presented approach, four decision-making methods were adapted to use the best shape descriptors as attributes and a choice was taken. This proposal establishes a novel methodology with a high success rate in weed species discrimination. PMID:25195854
Image characterization by fractal descriptors in variational mode decomposition domain: Application to brain magnetic resonance

NASA Astrophysics Data System (ADS)

Lahmiri, Salim

2016-08-01

The main purpose of this work is to explore the usefulness of fractal descriptors estimated in multi-resolution domains to characterize biomedical digital image texture. In this regard, three multi-resolution techniques are considered: the well-known discrete wavelet transform (DWT) and the empirical mode decomposition (EMD), and; the newly introduced; variational mode decomposition mode (VMD). The original image is decomposed by the DWT, EMD, and VMD into different scales. Then, Fourier spectrum based fractal descriptors is estimated at specific scales and directions to characterize the image. The support vector machine (SVM) was used to perform supervised classification. The empirical study was applied to the problem of distinguishing between normal and abnormal brain magnetic resonance images (MRI) affected with Alzheimer disease (AD). Our results demonstrate that fractal descriptors estimated in VMD domain outperform those estimated in DWT and EMD domains; and also those directly estimated from the original image.
Observer synthesis for a class of Takagi-Sugeno descriptor system with unmeasurable premise variable. Application to fault diagnosis

NASA Astrophysics Data System (ADS)

López-Estrada, F. R.; Astorga-Zaragoza, C. M.; Theilliol, D.; Ponsart, J. C.; Valencia-Palomo, G.; Torres, L.

2017-12-01

This paper proposes a methodology to design a Takagi-Sugeno (TS) descriptor observer for a class of TS descriptor systems. Unlike the popular approach that considers measurable premise variables, this paper considers the premise variables depending on unmeasurable vectors, e.g. the system states. This consideration covers a large class of nonlinear systems and represents a real challenge for the observer synthesis. Sufficient conditions to guarantee robustness against the unmeasurable premise variables and asymptotic convergence of the TS descriptor observer are obtained based on the H∞ approach together with the Lyapunov method. As a result, the designing conditions are given in terms of linear matrix inequalities (LMIs). In addition, sensor fault detection and isolation are performed by means of a generalised observer bank. Two numerical experiments, an electrical circuit and a rolling disc system, are presented in order to illustrate the effectiveness of the proposed method.
Aerial images visual localization on a vector map using color-texture segmentation

NASA Astrophysics Data System (ADS)

Kunina, I. A.; Teplyakov, L. M.; Gladkov, A. P.; Khanipov, T. M.; Nikolaev, D. P.

2018-04-01

In this paper we study the problem of combining UAV obtained optical data and a coastal vector map in absence of satellite navigation data. The method is based on presenting the territory as a set of segments produced by color-texture image segmentation. We then find such geometric transform which gives the best match between these segments and land and water areas of the georeferenced vector map. We calculate transform consisting of an arbitrary shift relatively to the vector map and bound rotation and scaling. These parameters are estimated using the RANSAC algorithm which matches the segments contours and the contours of land and water areas of the vector map. To implement this matching we suggest computing shape descriptors robust to rotation and scaling. We performed numerical experiments demonstrating the practical applicability of the proposed method.
Onboard Image Registration from Invariant Features

NASA Technical Reports Server (NTRS)

Wang, Yi; Ng, Justin; Garay, Michael J.; Burl, Michael C

2008-01-01

This paper describes a feature-based image registration technique that is potentially well-suited for onboard deployment. The overall goal is to provide a fast, robust method for dynamically combining observations from multiple platforms into sensors webs that respond quickly to short-lived events and provide rich observations of objects that evolve in space and time. The approach, which has enjoyed considerable success in mainstream computer vision applications, uses invariant SIFT descriptors extracted at image interest points together with the RANSAC algorithm to robustly estimate transformation parameters that relate one image to another. Experimental results for two satellite image registration tasks are presented: (1) automatic registration of images from the MODIS instrument on Terra to the MODIS instrument on Aqua and (2) automatic stabilization of a multi-day sequence of GOES-West images collected during the October 2007 Southern California wildfires.
Recognition of Simple 3D Geometrical Objects under Partial Occlusion

NASA Astrophysics Data System (ADS)

Barchunova, Alexandra; Sommer, Gerald

In this paper we present a novel procedure for contour-based recognition of partially occluded three-dimensional objects. In our approach we use images of real and rendered objects whose contours have been deformed by a restricted change of the viewpoint. The preparatory part consists of contour extraction, preprocessing, local structure analysis and feature extraction. The main part deals with an extended construction and functionality of the classifier ensemble Adaptive Occlusion Classifier (AOC). It relies on a hierarchical fragmenting algorithm to perform a local structure analysis which is essential when dealing with occlusions. In the experimental part of this paper we present classification results for five classes of simple geometrical figures: prism, cylinder, half cylinder, a cube, and a bridge. We compare classification results for three classical feature extractors: Fourier descriptors, pseudo Zernike and Zernike moments.
Pain Quality Descriptors in Community-Dwelling Older Adults with Nonmalignant Pain

PubMed Central

Thakral, Manu; Shi, Ling; Foust, Janice B.; Patel, Kushang V.; Shmerling, Robert H.; Bean, Jonathan F.; Leveille, Suzanne G.

2016-01-01

This study aimed to characterize the prevalence of various pain qualities in older adults with chronic non-malignant pain and determine the association of pain quality to other pain characteristics namely: severity, interference distribution, and pain-associated conditions. In the population-based MOBILIZE Boston Study, 560 participants aged≥70 years reported chronic pain in the baseline assessment, which included a home interview and clinic exam. Pain quality was assessed using a modified version of the McGill Pain Questionnaire (MPQ) consisting of 20 descriptors, from which 3 categories were derived: cognitive/affective, sensory and neuropathic. Presence of ≥2 pain-associated conditions was significantly associated with 18 of the 20 pain quality descriptors. Sensory descriptors were endorsed by nearly all older adults with chronic pain (93%), followed by cognitive/affective (83.4%) and neuropathic descriptors (68.6%). Neuropathic descriptors were associated with the greatest number of pain-associated conditions including osteoarthritis of the hand and knee. More than half of participants (59%) endorsed descriptors in all 3 categories and had more severe pain and interference, and multi-site or widespread pain than those endorsing 1 or 2 categories. Strong associations were observed between pain quality and measures of pain severity, interference, and distribution (p<.0001). Findings from this study indicate that older adults have multiple pain-associated conditions which likely reflect multiple physiological mechanisms for pain. Linking pain qualities with other associated pain characteristics serves to develop a multidimensional approach to geriatric pain assessment. Future research is needed to investigate the physiological mechanisms responsible for the variability in pain qualities endorsed by older adults. PMID:27842050
Pain quality descriptors in community-dwelling older adults with nonmalignant pain.

PubMed

Thakral, Manu; Shi, Ling; Foust, Janice B; Patel, Kushang V; Shmerling, Robert H; Bean, Jonathan F; Leveille, Suzanne G

2016-12-01

This study aimed to characterize the prevalence of various pain qualities in older adults with chronic nonmalignant pain and determine the association of pain quality to other pain characteristics namely: severity, interference, distribution, and pain-associated conditions. In the population-based MOBILIZE Boston Study, 560 participants aged ≥70 years reported chronic pain in the baseline assessment, which included a home interview and clinic exam. Pain quality was assessed using a modified version of the McGill Pain Questionnaire (MPQ) consisting of 20 descriptors from which 3 categories were derived: cognitive/affective, sensory, and neuropathic. Presence of ≥2 pain-associated conditions was significantly associated with 18 of the 20 pain quality descriptors. Sensory descriptors were endorsed by nearly all older adults with chronic pain (93%), followed by cognitive/affective (83.4%) and neuropathic descriptors (68.6%). Neuropathic descriptors were associated with the greatest number of pain-associated conditions including osteoarthritis of the hand and knee. More than half of participants (59%) endorsed descriptors in all 3 categories and had more severe pain and interference, and multisite or widespread pain than those endorsing 1 or 2 categories. Strong associations were observed between pain quality and measures of pain severity, interference, and distribution (P < 0.0001). Findings from this study indicate that older adults have multiple pain-associated conditions that likely reflect multiple physiological mechanisms for pain. Linking pain qualities with other associated pain characteristics serve to develop a multidimensional approach to geriatric pain assessment. Future research is needed to investigate the physiological mechanisms responsible for the variability in pain qualities endorsed by older adults.
Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes.

PubMed

Srinivasulu, Yerukala Sathipati; Wang, Jyun-Rong; Hsu, Kai-Ti; Tsai, Ming-Ju; Charoenkwan, Phasit; Huang, Wen-Lin; Huang, Hui-Ling; Ho, Shinn-Ying

2015-01-01

Protein-protein interactions (PPIs) are involved in various biological processes, and underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded the training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (Pkd) of 200 heterodimeric protein complexes. Prediction performance of a Jackknife test was the correlation coefficient of 0.34 and mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn. The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes.
Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes

PubMed Central

2015-01-01

Background Protein-protein interactions (PPIs) are involved in various biological processes, and underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. Results This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded the training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (Pkd) of 200 heterodimeric protein complexes. Prediction performance of a Jackknife test was the correlation coefficient of 0.34 and mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn. Conclusions The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes. PMID:26681483
OTIS Basic Index Access System (OBIAS); A System for Retrieval of Information From the ERIC and CIJE Data Bases Utilizing a Direct Access Inverted Index of Descriptors and a Reformatted Direct Access ERIC-CIJE File.

ERIC Educational Resources Information Center

Bracken, Paula

The OTIS Basic Index Access System (OBIAS) for searching the ERIC data base is described. This system offers two advantages over the previous system. First, search time has been halved, reducing the cost per search to an estimated $10 on a batch basis. Second, the "OTIS ERIC Descripter Catalog" which contains all descriptors used in the…
Determination of polyparameter linear free energy relationship (pp-LFER) substance descriptors for established and alternative flame retardants.

PubMed

Stenzel, Angelika; Goss, Kai-Uwe; Endo, Satoshi

2013-02-05

Polyparameter linear free energy relationships (pp-LFERs) can predict partition coefficients for a multitude of environmental and biological phases with high accuracy. In this work, the pp-LFER substance descriptors of 40 established and alternative flame retardants (e.g., polybrominated diphenyl ethers, hexabromocyclododecane, bromobenzenes, trialkyl phosphates) were determined experimentally. In total, 251 data for gas-chromatographic (GC) retention times and liquid/liquid partition coefficients (K) were measured and used to calibrate the pp-LFER substance descriptors. Substance descriptors were validated through a comparison between predicted and experimental log K for the systems octanol/water (K(ow)), water/air (K(wa)), organic carbon/water (K(oc)) and liposome/water (K(lipw)), revealing a high reliability of pp-LFER predictions based on our descriptors. For instance, the difference between predicted and experimental log K(ow) was <0.3 log units for 17 out of 21 compounds for which experimental values were available. Moreover, we found an indication that the H-bond acceptor value (B) depends on the solvent for some compounds. Thus, for predicting environmentally relevant partition coefficients it is important to determine B values using measurements in aqueous systems. The pp-LFER descriptors calibrated in this study can be used to predict partition coefficients for which experimental data are unavailable, and the predicted values can serve as references for further experimental measurements.
Uniting Cheminformatics and Chemical Theory To Predict the Intrinsic Aqueous Solubility of Crystalline Druglike Molecules

PubMed Central

2014-01-01

We present four models of solution free-energy prediction for druglike molecules utilizing cheminformatics descriptors and theoretically calculated thermodynamic values. We make predictions of solution free energy using physics-based theory alone and using machine learning/quantitative structure–property relationship (QSPR) models. We also develop machine learning models where the theoretical energies and cheminformatics descriptors are used as combined input. These models are used to predict solvation free energy. While direct theoretical calculation does not give accurate results in this approach, machine learning is able to give predictions with a root mean squared error (RMSE) of ∼1.1 log S units in a 10-fold cross-validation for our Drug-Like-Solubility-100 (DLS-100) dataset of 100 druglike molecules. We find that a model built using energy terms from our theoretical methodology as descriptors is marginally less predictive than one built on Chemistry Development Kit (CDK) descriptors. Combining both sets of descriptors allows a further but very modest improvement in the predictions. However, in some cases, this is a statistically significant enhancement. These results suggest that there is little complementarity between the chemical information provided by these two sets of descriptors, despite their different sources and methods of calculation. Our machine learning models are also able to predict the well-known Solubility Challenge dataset with an RMSE value of 0.9–1.0 log S units. PMID:24564264
3D Face Recognition Based on Multiple Keypoint Descriptors and Sparse Representation

PubMed Central

Zhang, Lin; Ding, Zhixuan; Li, Hongyu; Shen, Ying; Lu, Jianwei

2014-01-01

Recent years have witnessed a growing interest in developing methods for 3D face recognition. However, 3D scans often suffer from the problems of missing parts, large facial expressions, and occlusions. To be useful in real-world applications, a 3D face recognition approach should be able to handle these challenges. In this paper, we propose a novel general approach to deal with the 3D face recognition problem by making use of multiple keypoint descriptors (MKD) and the sparse representation-based classification (SRC). We call the proposed method 3DMKDSRC for short. Specifically, with 3DMKDSRC, each 3D face scan is represented as a set of descriptor vectors extracted from keypoints by meshSIFT. Descriptor vectors of gallery samples form the gallery dictionary. Given a probe 3D face scan, its descriptors are extracted at first and then its identity can be determined by using a multitask SRC. The proposed 3DMKDSRC approach does not require the pre-alignment between two face scans and is quite robust to the problems of missing data, occlusions and expressions. Its superiority over the other leading 3D face recognition schemes has been corroborated by extensive experiments conducted on three benchmark databases, Bosphorus, GavabDB, and FRGC2.0. The Matlab source code for 3DMKDSRC and the related evaluation results are publicly available at http://sse.tongji.edu.cn/linzhang/3dmkdsrcface/3dmkdsrc.htm. PMID:24940876
Revealing cell cycle control by combining model-based detection of periodic expression with novel cis-regulatory descriptors

PubMed Central

Andersson, Claes R; Hvidsten, Torgeir R; Isaksson, Anders; Gustafsson, Mats G; Komorowski, Jan

2007-01-01

Background We address the issue of explaining the presence or absence of phase-specific transcription in budding yeast cultures under different conditions. To this end we use a model-based detector of gene expression periodicity to divide genes into classes depending on their behavior in experiments using different synchronization methods. While computational inference of gene regulatory circuits typically relies on expression similarity (clustering) in order to find classes of potentially co-regulated genes, this method instead takes advantage of known time profile signatures related to the studied process. Results We explain the regulatory mechanisms of the inferred periodic classes with cis-regulatory descriptors that combine upstream sequence motifs with experimentally determined binding of transcription factors. By systematic statistical analysis we show that periodic classes are best explained by combinations of descriptors rather than single descriptors, and that different combinations correspond to periodic expression in different classes. We also find evidence for additive regulation in that the combinations of cis-regulatory descriptors associated with genes periodically expressed in fewer conditions are frequently subsets of combinations associated with genes periodically expression in more conditions. Finally, we demonstrate that our approach retrieves combinations that are more specific towards known cell-cycle related regulators than the frequently used clustering approach. Conclusion The results illustrate how a model-based approach to expression analysis may be particularly well suited to detect biologically relevant mechanisms. Our new approach makes it possible to provide more refined hypotheses about regulatory mechanisms of the cell cycle and it can easily be adjusted to reveal regulation of other, non-periodic, cellular processes. PMID:17939860
Cell Motility Dynamics: A Novel Segmentation Algorithm to Quantify Multi-Cellular Bright Field Microscopy Images

PubMed Central

Zaritsky, Assaf; Natan, Sari; Horev, Judith; Hecht, Inbal; Wolf, Lior; Ben-Jacob, Eshel; Tsarfaty, Ilan

2011-01-01

Confocal microscopy analysis of fluorescence and morphology is becoming the standard tool in cell biology and molecular imaging. Accurate quantification algorithms are required to enhance the understanding of different biological phenomena. We present a novel approach based on image-segmentation of multi-cellular regions in bright field images demonstrating enhanced quantitative analyses and better understanding of cell motility. We present MultiCellSeg, a segmentation algorithm to separate between multi-cellular and background regions for bright field images, which is based on classification of local patches within an image: a cascade of Support Vector Machines (SVMs) is applied using basic image features. Post processing includes additional classification and graph-cut segmentation to reclassify erroneous regions and refine the segmentation. This approach leads to a parameter-free and robust algorithm. Comparison to an alternative algorithm on wound healing assay images demonstrates its superiority. The proposed approach was used to evaluate common cell migration models such as wound healing and scatter assay. It was applied to quantify the acceleration effect of Hepatocyte growth factor/scatter factor (HGF/SF) on healing rate in a time lapse confocal microscopy wound healing assay and demonstrated that the healing rate is linear in both treated and untreated cells, and that HGF/SF accelerates the healing rate by approximately two-fold. A novel fully automated, accurate, zero-parameters method to classify and score scatter-assay images was developed and demonstrated that multi-cellular texture is an excellent descriptor to measure HGF/SF-induced cell scattering. We show that exploitation of textural information from differential interference contrast (DIC) images on the multi-cellular level can prove beneficial for the analyses of wound healing and scatter assays. The proposed approach is generic and can be used alone or alongside traditional fluorescence single-cell processing to perform objective, accurate quantitative analyses for various biological applications. PMID:22096600

Cell motility dynamics: a novel segmentation algorithm to quantify multi-cellular bright field microscopy images.

PubMed

Zaritsky, Assaf; Natan, Sari; Horev, Judith; Hecht, Inbal; Wolf, Lior; Ben-Jacob, Eshel; Tsarfaty, Ilan

2011-01-01

Confocal microscopy analysis of fluorescence and morphology is becoming the standard tool in cell biology and molecular imaging. Accurate quantification algorithms are required to enhance the understanding of different biological phenomena. We present a novel approach based on image-segmentation of multi-cellular regions in bright field images demonstrating enhanced quantitative analyses and better understanding of cell motility. We present MultiCellSeg, a segmentation algorithm to separate between multi-cellular and background regions for bright field images, which is based on classification of local patches within an image: a cascade of Support Vector Machines (SVMs) is applied using basic image features. Post processing includes additional classification and graph-cut segmentation to reclassify erroneous regions and refine the segmentation. This approach leads to a parameter-free and robust algorithm. Comparison to an alternative algorithm on wound healing assay images demonstrates its superiority. The proposed approach was used to evaluate common cell migration models such as wound healing and scatter assay. It was applied to quantify the acceleration effect of Hepatocyte growth factor/scatter factor (HGF/SF) on healing rate in a time lapse confocal microscopy wound healing assay and demonstrated that the healing rate is linear in both treated and untreated cells, and that HGF/SF accelerates the healing rate by approximately two-fold. A novel fully automated, accurate, zero-parameters method to classify and score scatter-assay images was developed and demonstrated that multi-cellular texture is an excellent descriptor to measure HGF/SF-induced cell scattering. We show that exploitation of textural information from differential interference contrast (DIC) images on the multi-cellular level can prove beneficial for the analyses of wound healing and scatter assays. The proposed approach is generic and can be used alone or alongside traditional fluorescence single-cell processing to perform objective, accurate quantitative analyses for various biological applications.
The Marine Strategy Framework Directive and the ecosystem-based approach – pitfalls and solutions.

PubMed

Berg, Torsten; Fürhaupter, Karin; Teixeira, Heliana; Uusitalo, Laura; Zampoukas, Nikolaos

2015-07-15

The European Marine Strategy Framework Directive aims at good environmental status (GES) in marine waters, following an ecosystem-based approach, focused on 11 descriptors related to ecosystem features, human drivers and pressures. Furthermore, 29 subordinate criteria and 56 attributes are detailed in an EU Commission Decision. The analysis of the Decision and the associated operational indicators revealed ambiguity in the use of terms, such as indicator, impact and habitat and considerable overlap of indicators assigned to various descriptors and criteria. We suggest re-arrangement and elimination of redundant criteria and attributes avoiding double counting in the subsequent indicator synthesis, a clear distinction between pressure and state descriptors and addition of criteria on ecosystem services and functioning. Moreover, we suggest the precautionary principle should be followed for the management of pressures and an evidence-based approach for monitoring state as well as reaching and maintaining GES. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
Chaining direct memory access data transfer operations for compute nodes in a parallel computer

DOEpatents

Archer, Charles J.; Blocksome, Michael A.

2010-09-28

Methods, systems, and products are disclosed for chaining DMA data transfer operations for compute nodes in a parallel computer that include: receiving, by an origin DMA engine on an origin node in an origin injection FIFO buffer for the origin DMA engine, a RGET data descriptor specifying a DMA transfer operation data descriptor on the origin node and a second RGET data descriptor on the origin node, the second RGET data descriptor specifying a target RGET data descriptor on the target node, the target RGET data descriptor specifying an additional DMA transfer operation data descriptor on the origin node; creating, by the origin DMA engine, an RGET packet in dependence upon the RGET data descriptor, the RGET packet containing the DMA transfer operation data descriptor and the second RGET data descriptor; and transferring, by the origin DMA engine to a target DMA engine on the target node, the RGET packet.
Replenishing data descriptors in a DMA injection FIFO buffer

DOEpatents

Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Cernohous, Bob R [Rochester, MN; Heidelberger, Philip [Cortlandt Manor, NY; Kumar, Sameer [White Plains, NY; Parker, Jeffrey J [Rochester, MN

2011-10-11

Methods, apparatus, and products are disclosed for replenishing data descriptors in a Direct Memory Access (`DMA`) injection first-in-first-out (`FIFO`) buffer that include: determining, by a messaging module on an origin compute node, whether a number of data descriptors in a DMA injection FIFO buffer exceeds a predetermined threshold, each data descriptor specifying an application message for transmission to a target compute node; queuing, by the messaging module, a plurality of new data descriptors in a pending descriptor queue if the number of the data descriptors in the DMA injection FIFO buffer exceeds the predetermined threshold; establishing, by the messaging module, interrupt criteria that specify when to replenish the injection FIFO buffer with the plurality of new data descriptors in the pending descriptor queue; and injecting, by the messaging module, the plurality of new data descriptors into the injection FIFO buffer in dependence upon the interrupt criteria.
Chemical Sensor Array Response Modeling Using Quantitative Structure-Activity Relationships Technique

NASA Astrophysics Data System (ADS)

Shevade, Abhijit V.; Ryan, Margaret A.; Homer, Margie L.; Zhou, Hanying; Manfreda, Allison M.; Lara, Liana M.; Yen, Shiao-Pin S.; Jewell, April D.; Manatt, Kenneth S.; Kisor, Adam K.

We have developed a Quantitative Structure-Activity Relationships (QSAR) based approach to correlate the response of chemical sensors in an array with molecular descriptors. A novel molecular descriptor set has been developed; this set combines descriptors of sensing film-analyte interactions, representing sensor response, with a basic analyte descriptor set commonly used in QSAR studies. The descriptors are obtained using a combination of molecular modeling tools and empirical and semi-empirical Quantitative Structure-Property Relationships (QSPR) methods. The sensors under investigation are polymer-carbon sensing films which have been exposed to analyte vapors at parts-per-million (ppm) concentrations; response is measured as change in film resistance. Statistically validated QSAR models have been developed using Genetic Function Approximations (GFA) for a sensor array for a given training data set. The applicability of the sensor response models has been tested by using it to predict the sensor activities for test analytes not considered in the training set for the model development. The validated QSAR sensor response models show good predictive ability. The QSAR approach is a promising computational tool for sensing materials evaluation and selection. It can also be used to predict response of an existing sensing film to new target analytes.
Heritabilities and genetic correlations in the same traits across different strata of herds created according to continuous genomic, genetic, and phenotypic descriptors.

PubMed

Yin, Tong; König, Sven

2018-03-01

The most common approach in dairy cattle to prove genotype by environment interactions is a multiple-trait model application, and considering the same traits in different environments as different traits. We enhanced such concepts by defining continuous phenotypic, genetic, and genomic herd descriptors, and applying random regression sire models. Traits of interest were test-day traits for milk yield, fat percentage, protein percentage, and somatic cell score, considering 267,393 records from 32,707 first-lactation Holstein cows. Cows were born in the years 2010 to 2013, and kept in 52 large-scale herds from 2 federal states of north-east Germany. The average number of genotyped cows per herd (45,613 single nucleotide polymorphism markers per cow) was 133.5 (range: 45 to 415 genotyped cows). Genomic herd descriptors were (1) the level of linkage disequilibrium (r 2 ) within specific chromosome segments, and (2) the average allele frequency for single nucleotide polymorphisms in close distance to a functional mutation. Genetic herd descriptors were the (1) intra-herd inbreeding coefficient, and (2) the percentage of daughters from foreign sires. Phenotypic herd descriptors were (1) herd size, and (2) the herd mean for nonreturn rate. Most correlations among herd descriptors were close to 0, indicating independence of genomic, genetic, and phenotypic characteristics. Heritabilities for milk yield increased with increasing intra-herd linkage disequilibrium, inbreeding, and herd size. Genetic correlations in same traits between adjacent levels of herd descriptors were close to 1, but declined for descriptor levels in greater distance. Genetic correlation declines were more obvious for somatic cell score, compared with test-day traits with larger heritabilities (fat percentage and protein percentage). Also, for milk yield, alterations of herd descriptor levels had an obvious effect on heritabilities and genetic correlations. By trend, multiple trait model results (based on created discrete herd classes) confirmed the random regression estimates. Identified alterations of breeding values in dependency of herd descriptors suggest utilization of specific sires for specific herd structures, offering new possibilities to improve sire selection strategies. Regarding genomic selection designs and genetic gain transfer into commercial herds, cow herds for the utilization in cow training sets should reflect the genomic, genetic, and phenotypic pattern of the broad population. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Towards interoperable and reproducible QSAR analyses: Exchange of datasets.

PubMed

Spjuth, Ola; Willighagen, Egon L; Guha, Rajarshi; Eklund, Martin; Wikberg, Jarl Es

2010-06-30

QSAR is a widely used method to relate chemical structures to responses or properties based on experimental observations. Much effort has been made to evaluate and validate the statistical modeling in QSAR, but these analyses treat the dataset as fixed. An overlooked but highly important issue is the validation of the setup of the dataset, which comprises addition of chemical structures as well as selection of descriptors and software implementations prior to calculations. This process is hampered by the lack of standards and exchange formats in the field, making it virtually impossible to reproduce and validate analyses and drastically constrain collaborations and re-use of data. We present a step towards standardizing QSAR analyses by defining interoperable and reproducible QSAR datasets, consisting of an open XML format (QSAR-ML) which builds on an open and extensible descriptor ontology. The ontology provides an extensible way of uniquely defining descriptors for use in QSAR experiments, and the exchange format supports multiple versioned implementations of these descriptors. Hence, a dataset described by QSAR-ML makes its setup completely reproducible. We also provide a reference implementation as a set of plugins for Bioclipse which simplifies setup of QSAR datasets, and allows for exporting in QSAR-ML as well as old-fashioned CSV formats. The implementation facilitates addition of new descriptor implementations from locally installed software and remote Web services; the latter is demonstrated with REST and XMPP Web services. Standardized QSAR datasets open up new ways to store, query, and exchange data for subsequent analyses. QSAR-ML supports completely reproducible creation of datasets, solving the problems of defining which software components were used and their versions, and the descriptor ontology eliminates confusions regarding descriptors by defining them crisply. This makes is easy to join, extend, combine datasets and hence work collectively, but also allows for analyzing the effect descriptors have on the statistical model's performance. The presented Bioclipse plugins equip scientists with graphical tools that make QSAR-ML easily accessible for the community.
Predicting Drug-induced Hepatotoxicity Using QSAR and Toxicogenomics Approaches

PubMed Central

Low, Yen; Uehara, Takeki; Minowa, Yohsuke; Yamada, Hiroshi; Ohno, Yasuo; Urushidani, Tetsuro; Sedykh, Alexander; Muratov, Eugene; Fourches, Denis; Zhu, Hao; Rusyn, Ivan; Tropsha, Alexander

2014-01-01

Quantitative Structure-Activity Relationship (QSAR) modeling and toxicogenomics are used independently as predictive tools in toxicology. In this study, we evaluated the power of several statistical models for predicting drug hepatotoxicity in rats using different descriptors of drug molecules, namely their chemical descriptors and toxicogenomic profiles. The records were taken from the Toxicogenomics Project rat liver microarray database containing information on 127 drugs (http://toxico.nibio.go.jp/datalist.html). The model endpoint was hepatotoxicity in the rat following 28 days of exposure, established by liver histopathology and serum chemistry. First, we developed multiple conventional QSAR classification models using a comprehensive set of chemical descriptors and several classification methods (k nearest neighbor, support vector machines, random forests, and distance weighted discrimination). With chemical descriptors alone, external predictivity (Correct Classification Rate, CCR) from 5-fold external cross-validation was 61%. Next, the same classification methods were employed to build models using only toxicogenomic data (24h after a single exposure) treated as biological descriptors. The optimized models used only 85 selected toxicogenomic descriptors and had CCR as high as 76%. Finally, hybrid models combining both chemical descriptors and transcripts were developed; their CCRs were between 68 and 77%. Although the accuracy of hybrid models did not exceed that of the models based on toxicogenomic data alone, the use of both chemical and biological descriptors enriched the interpretation of the models. In addition to finding 85 transcripts that were predictive and highly relevant to the mechanisms of drug-induced liver injury, chemical structural alerts for hepatotoxicity were also identified. These results suggest that concurrent exploration of the chemical features and acute treatment-induced changes in transcript levels will both enrich the mechanistic understanding of sub-chronic liver injury and afford models capable of accurate prediction of hepatotoxicity from chemical structure and short-term assay results. PMID:21699217
Towards interoperable and reproducible QSAR analyses: Exchange of datasets

PubMed Central

2010-01-01

Background QSAR is a widely used method to relate chemical structures to responses or properties based on experimental observations. Much effort has been made to evaluate and validate the statistical modeling in QSAR, but these analyses treat the dataset as fixed. An overlooked but highly important issue is the validation of the setup of the dataset, which comprises addition of chemical structures as well as selection of descriptors and software implementations prior to calculations. This process is hampered by the lack of standards and exchange formats in the field, making it virtually impossible to reproduce and validate analyses and drastically constrain collaborations and re-use of data. Results We present a step towards standardizing QSAR analyses by defining interoperable and reproducible QSAR datasets, consisting of an open XML format (QSAR-ML) which builds on an open and extensible descriptor ontology. The ontology provides an extensible way of uniquely defining descriptors for use in QSAR experiments, and the exchange format supports multiple versioned implementations of these descriptors. Hence, a dataset described by QSAR-ML makes its setup completely reproducible. We also provide a reference implementation as a set of plugins for Bioclipse which simplifies setup of QSAR datasets, and allows for exporting in QSAR-ML as well as old-fashioned CSV formats. The implementation facilitates addition of new descriptor implementations from locally installed software and remote Web services; the latter is demonstrated with REST and XMPP Web services. Conclusions Standardized QSAR datasets open up new ways to store, query, and exchange data for subsequent analyses. QSAR-ML supports completely reproducible creation of datasets, solving the problems of defining which software components were used and their versions, and the descriptor ontology eliminates confusions regarding descriptors by defining them crisply. This makes is easy to join, extend, combine datasets and hence work collectively, but also allows for analyzing the effect descriptors have on the statistical model's performance. The presented Bioclipse plugins equip scientists with graphical tools that make QSAR-ML easily accessible for the community. PMID:20591161
Learned Compact Local Feature Descriptor for Tls-Based Geodetic Monitoring of Natural Outdoor Scenes

NASA Astrophysics Data System (ADS)

Gojcic, Z.; Zhou, C.; Wieser, A.

2018-05-01

The advantages of terrestrial laser scanning (TLS) for geodetic monitoring of man-made and natural objects are not yet fully exploited. Herein we address one of the open challenges by proposing feature-based methods for identification of corresponding points in point clouds of two or more epochs. We propose a learned compact feature descriptor tailored for point clouds of natural outdoor scenes obtained using TLS. We evaluate our method both on a benchmark data set and on a specially acquired outdoor dataset resembling a simplified monitoring scenario where we successfully estimate 3D displacement vectors of a rock that has been displaced between the scans. We show that the proposed descriptor has the capacity to generalize to unseen data and achieves state-of-the-art performance while being time efficient at the matching step due the low dimension.
Compositional descriptor-based recommender system for the materials discovery

NASA Astrophysics Data System (ADS)

Seko, Atsuto; Hayashi, Hiroyuki; Tanaka, Isao

2018-06-01

Structures and properties of many inorganic compounds have been collected historically. However, it only covers a very small portion of possible inorganic crystals, which implies the presence of numerous currently unknown compounds. A powerful machine-learning strategy is mandatory to discover new inorganic compounds from all chemical combinations. Herein we propose a descriptor-based recommender-system approach to estimate the relevance of chemical compositions where crystals can be formed [i.e., chemically relevant compositions (CRCs)]. In addition to data-driven compositional similarity used in the literature, the use of compositional descriptors as a prior knowledge is helpful for the discovery of new compounds. We validate our recommender systems in two ways. First, one database is used to construct a model, while another is used for the validation. Second, we estimate the phase stability for compounds at expected CRCs using density functional theory calculations.
Optimized Multi-Spectral Filter Array Based Imaging of Natural Scenes.

PubMed

Li, Yuqi; Majumder, Aditi; Zhang, Hao; Gopi, M

2018-04-12

Multi-spectral imaging using a camera with more than three channels is an efficient method to acquire and reconstruct spectral data and is used extensively in tasks like object recognition, relighted rendering, and color constancy. Recently developed methods are used to only guide content-dependent filter selection where the set of spectral reflectances to be recovered are known a priori. We present the first content-independent spectral imaging pipeline that allows optimal selection of multiple channels. We also present algorithms for optimal placement of the channels in the color filter array yielding an efficient demosaicing order resulting in accurate spectral recovery of natural reflectance functions. These reflectance functions have the property that their power spectrum statistically exhibits a power-law behavior. Using this property, we propose power-law based error descriptors that are minimized to optimize the imaging pipeline. We extensively verify our models and optimizations using large sets of commercially available wide-band filters to demonstrate the greater accuracy and efficiency of our multi-spectral imaging pipeline over existing methods.
Optimized Multi-Spectral Filter Array Based Imaging of Natural Scenes

PubMed Central

Li, Yuqi; Majumder, Aditi; Zhang, Hao; Gopi, M.

2018-01-01

Multi-spectral imaging using a camera with more than three channels is an efficient method to acquire and reconstruct spectral data and is used extensively in tasks like object recognition, relighted rendering, and color constancy. Recently developed methods are used to only guide content-dependent filter selection where the set of spectral reflectances to be recovered are known a priori. We present the first content-independent spectral imaging pipeline that allows optimal selection of multiple channels. We also present algorithms for optimal placement of the channels in the color filter array yielding an efficient demosaicing order resulting in accurate spectral recovery of natural reflectance functions. These reflectance functions have the property that their power spectrum statistically exhibits a power-law behavior. Using this property, we propose power-law based error descriptors that are minimized to optimize the imaging pipeline. We extensively verify our models and optimizations using large sets of commercially available wide-band filters to demonstrate the greater accuracy and efficiency of our multi-spectral imaging pipeline over existing methods. PMID:29649114
DataWarrior: an open-source program for chemistry aware data visualization and analysis.

PubMed

Sander, Thomas; Freyss, Joel; von Korff, Modest; Rufener, Christian

2015-02-23

Drug discovery projects in the pharmaceutical industry accumulate thousands of chemical structures and ten-thousands of data points from a dozen or more biological and pharmacological assays. A sufficient interpretation of the data requires understanding, which molecular families are present, which structural motifs correlate with measured properties, and which tiny structural changes cause large property changes. Data visualization and analysis software with sufficient chemical intelligence to support chemists in this task is rare. In an attempt to contribute to filling the gap, we released our in-house developed chemistry aware data analysis program DataWarrior for free public use. This paper gives an overview of DataWarrior's functionality and architecture. Exemplarily, a new unsupervised, 2-dimensional scaling algorithm is presented, which employs vector-based or nonvector-based descriptors to visualize the chemical or pharmacophore space of even large data sets. DataWarrior uses this method to interactively explore chemical space, activity landscapes, and activity cliffs.
A focus of attention mechanism for gaze control within a framework for intelligent image analysis tools

NASA Astrophysics Data System (ADS)

Rodrigo, Ranga P.; Ranaweera, Kamal; Samarabandu, Jagath K.

2004-05-01

Focus of attention is often attributed to biological vision system where the entire field of view is first monitored and then the attention is focused to the object of interest. We propose using a similar approach for object recognition in a color image sequence. The intention is to locate an object based on a prior motive, concentrate on the detected object so that the imaging device can be guided toward it. We use the abilities of the intelligent image analysis framework developed in our laboratory to generate an algorithm dynamically to detect the particular type of object based on the user's object description. The proposed method uses color clustering along with segmentation. The segmented image with labeled regions is used to calculate the shape descriptor parameters. These and the color information are matched with the input description. Gaze is then controlled by issuing camera movement commands as appropriate. We present some preliminary results that demonstrate the success of this approach.
Fusion of LBP and SWLD using spatio-spectral information for hyperspectral face recognition

NASA Astrophysics Data System (ADS)

Xie, Zhihua; Jiang, Peng; Zhang, Shuai; Xiong, Jinquan

2018-01-01

Hyperspectral imaging, recording intrinsic spectral information of the skin cross different spectral bands, become an important issue for robust face recognition. However, the main challenges for hyperspectral face recognition are high data dimensionality, low signal to noise ratio and inter band misalignment. In this paper, hyperspectral face recognition based on LBP (Local binary pattern) and SWLD (Simplified Weber local descriptor) is proposed to extract discriminative local features from spatio-spectral fusion information. Firstly, the spatio-spectral fusion strategy based on statistical information is used to attain discriminative features of hyperspectral face images. Secondly, LBP is applied to extract the orientation of the fusion face edges. Thirdly, SWLD is proposed to encode the intensity information in hyperspectral images. Finally, we adopt a symmetric Kullback-Leibler distance to compute the encoded face images. The hyperspectral face recognition is tested on Hong Kong Polytechnic University Hyperspectral Face database (PolyUHSFD). Experimental results show that the proposed method has higher recognition rate (92.8%) than the state of the art hyperspectral face recognition algorithms.
a Clustering-Based Approach for Evaluation of EO Image Indexing

NASA Astrophysics Data System (ADS)

Bahmanyar, R.; Rigoll, G.; Datcu, M.

2013-09-01

The volume of Earth Observation data is increasing immensely in order of several Terabytes a day. Therefore, to explore and investigate the content of this huge amount of data, developing more sophisticated Content-Based Information Retrieval (CBIR) systems are highly demanded. These systems should be able to not only discover unknown structures behind the data, but also provide relevant results to the users' queries. Since in any retrieval system the images are processed based on a discrete set of their features (i.e., feature descriptors), study and assessment of the structure of feature space, build by different feature descriptors, is of high importance. In this paper, we introduce a clustering-based approach to study the content of image collections. In our approach, we claim that using both internal and external evaluation of clusters for different feature descriptors, helps to understand the structure of feature space. Moreover, the semantic understanding of users about the images also can be assessed. To validate the performance of our approach, we used an annotated Synthetic Aperture Radar (SAR) image collection. Quantitative results besides the visualization of feature space demonstrate the applicability of our approach.
Model of twelve properties of a set of organic solvents with graph-theoretical and/or experimental parameters.

PubMed

Pogliani, Lionello

2010-01-30

Twelve properties of a highly heterogeneous class of organic solvents have been modeled with a graph-theoretical molecular connectivity modified (MC) method, which allows to encode the core electrons and the hydrogen atoms. The graph-theoretical method uses the concepts of simple, general, and complete graphs, where these last types of graphs are used to encode the core electrons. The hydrogen atoms have been encoded by the aid of a graph-theoretical perturbation parameter, which contributes to the definition of the valence delta, delta(v), a key parameter in molecular connectivity studies. The model of the twelve properties done with a stepwise search algorithm is always satisfactory, and it allows to check the influence of the hydrogen content of the solvent molecules on the choice of the type of descriptor. A similar argument holds for the influence of the halogen atoms on the type of core electron representation. In some cases the molar mass, and in a minor way, special "ad hoc" parameters have been used to improve the model. A very good model of the surface tension could be obtained by the aid of five experimental parameters. A mixed model method based on experimental parameters plus molecular connectivity indices achieved, instead, to consistently improve the model quality of five properties. To underline is the importance of the boiling point temperatures as descriptors in these last two model methodologies. Copyright 2009 Wiley Periodicals, Inc.
A Bayesian cluster analysis method for single-molecule localization microscopy data.

PubMed

Griffié, Juliette; Shannon, Michael; Bromley, Claire L; Boelen, Lies; Burn, Garth L; Williamson, David J; Heard, Nicholas A; Cope, Andrew P; Owen, Dylan M; Rubin-Delanchy, Patrick

2016-12-01

Cell function is regulated by the spatiotemporal organization of the signaling machinery, and a key facet of this is molecular clustering. Here, we present a protocol for the analysis of clustering in data generated by 2D single-molecule localization microscopy (SMLM)-for example, photoactivated localization microscopy (PALM) or stochastic optical reconstruction microscopy (STORM). Three features of such data can cause standard cluster analysis approaches to be ineffective: (i) the data take the form of a list of points rather than a pixel array; (ii) there is a non-negligible unclustered background density of points that must be accounted for; and (iii) each localization has an associated uncertainty in regard to its position. These issues are overcome using a Bayesian, model-based approach. Many possible cluster configurations are proposed and scored against a generative model, which assumes Gaussian clusters overlaid on a completely spatially random (CSR) background, before every point is scrambled by its localization precision. We present the process of generating simulated and experimental data that are suitable to our algorithm, the analysis itself, and the extraction and interpretation of key cluster descriptors such as the number of clusters, cluster radii and the number of localizations per cluster. Variations in these descriptors can be interpreted as arising from changes in the organization of the cellular nanoarchitecture. The protocol requires no specific programming ability, and the processing time for one data set, typically containing 30 regions of interest, is ∼18 h; user input takes ∼1 h.
Quantitative structure activity relationship (QSAR) of piperine analogs for bacterial NorA efflux pump inhibitors.

PubMed

Nargotra, Amit; Sharma, Sujata; Koul, Jawahir Lal; Sangwan, Pyare Lal; Khan, Inshad Ali; Kumar, Ashwani; Taneja, Subhash Chander; Koul, Surrinder

2009-10-01

Quantitative structure activity relationship (QSAR) analysis of piperine analogs as inhibitors of efflux pump NorA from Staphylococcus aureus has been performed in order to obtain a highly accurate model enabling prediction of inhibition of S. aureus NorA of new chemical entities from natural sources as well as synthetic ones. Algorithm based on genetic function approximation method of variable selection in Cerius2 was used to generate the model. Among several types of descriptors viz., topological, spatial, thermodynamic, information content and E-state indices that were considered in generating the QSAR model, three descriptors such as partial negative surface area of the compounds, area of the molecular shadow in the XZ plane and heat of formation of the molecules resulted in a statistically significant model with r(2)=0.962 and cross-validation parameter q(2)=0.917. The validation of the QSAR models was done by cross-validation, leave-25%-out and external test set prediction. The theoretical approach indicates that the increase in the exposed partial negative surface area increases the inhibitory activity of the compound against NorA whereas the area of the molecular shadow in the XZ plane is inversely proportional to the inhibitory activity. This model also explains the relationship of the heat of formation of the compound with the inhibitory activity. The model is not only able to predict the activity of new compounds but also explains the important regions in the molecules in quantitative manner.

Arabidopsis phenotyping through Geometric Morphometrics.

PubMed

Manacorda, Carlos A; Asurmendi, Sebastian

2018-06-18

Recently, much technical progress was achieved in the field of plant phenotyping. High-throughput platforms and the development of improved algorithms for rosette image segmentation make it now possible to extract shape and size parameters for genetic, physiological and environmental studies on a large scale. The development of low-cost phenotyping platforms and freeware resources make it possible to widely expand phenotypic analysis tools for Arabidopsis. However, objective descriptors of shape parameters that could be used independently of platform and segmentation software used are still lacking and shape descriptions still rely on ad hoc or even sometimes contradictory descriptors, which could make comparisons difficult and perhaps inaccurate. Modern geometric morphometrics is a family of methods in quantitative biology proposed to be the main source of data and analytical tools in the emerging field of phenomics studies. Based on the location of landmarks (corresponding points) over imaged specimens and by combining geometry, multivariate analysis and powerful statistical techniques, these tools offer the possibility to reproducibly and accurately account for shape variations amongst groups and measure them in shape distance units. Here, a particular scheme of landmarks placement on Arabidopsis rosette images is proposed to study shape variation in the case of viral infection processes. Shape differences between controls and infected plants are quantified throughout the infectious process and visualized. Quantitative comparisons between two unrelated ssRNA+ viruses are shown and reproducibility issues are assessed. Combined with the newest automated platforms and plant segmentation procedures, geometric morphometric tools could boost phenotypic features extraction and processing in an objective, reproducible manner.
Predicting the activity of drugs for a group of imidazopyridine anticoccidial compounds.

PubMed

Si, Hongzong; Lian, Ning; Yuan, Shuping; Fu, Aiping; Duan, Yun-Bo; Zhang, Kejun; Yao, Xiaojun

2009-10-01

Gene expression programming (GEP) is a novel machine learning technique. The GEP is used to build nonlinear quantitative structure-activity relationship model for the prediction of the IC(50) for the imidazopyridine anticoccidial compounds. This model is based on descriptors which are calculated from the molecular structure. Four descriptors are selected from the descriptors' pool by heuristic method (HM) to build multivariable linear model. The GEP method produced a nonlinear quantitative model with a correlation coefficient and a mean error of 0.96 and 0.24 for the training set, 0.91 and 0.52 for the test set, respectively. It is shown that the GEP predicted results are in good agreement with experimental ones.
Prediction of Drug-Plasma Protein Binding Using Artificial Intelligence Based Algorithms.

PubMed

Kumar, Rajnish; Sharma, Anju; Siddiqui, Mohammed Haris; Tiwari, Rajesh Kumar

2018-01-01

Plasma protein binding (PPB) has vital importance in the characterization of drug distribution in the systemic circulation. Unfavorable PPB can pose a negative effect on clinical development of promising drug candidates. The drug distribution properties should be considered at the initial phases of the drug design and development. Therefore, PPB prediction models are receiving an increased attention. In the current study, we present a systematic approach using Support vector machine, Artificial neural network, k- nearest neighbor, Probabilistic neural network, Partial least square and Linear discriminant analysis to relate various in vitro and in silico molecular descriptors to a diverse dataset of 736 drugs/drug-like compounds. The overall accuracy of Support vector machine with Radial basis function kernel came out to be comparatively better than the rest of the applied algorithms. The training set accuracy, validation set accuracy, precision, sensitivity, specificity and F1 score for the Suprort vector machine was found to be 89.73%, 89.97%, 92.56%, 87.26%, 91.97% and 0.898, respectively. This model can potentially be useful in screening of relevant drug candidates at the preliminary stages of drug design and development. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Prediction protein structural classes with pseudo-amino acid composition: approximate entropy and hydrophobicity pattern.

PubMed

Zhang, Tong-Liang; Ding, Yong-Sheng; Chou, Kuo-Chen

2008-01-07

Compared with the conventional amino acid (AA) composition, the pseudo-amino acid (PseAA) composition as originally introduced for protein subcellular location prediction can incorporate much more information of a protein sequence, so as to remarkably enhance the power of using a discrete model to predict various attributes of a protein. In this study, based on the concept of PseAA composition, the approximate entropy and hydrophobicity pattern of a protein sequence are used to characterize the PseAA components. Also, the immune genetic algorithm (IGA) is applied to search the optimal weight factors in generating the PseAA composition. Thus, for a given protein sequence sample, a 27-D (dimensional) PseAA composition is generated as its descriptor. The fuzzy K nearest neighbors (FKNN) classifier is adopted as the prediction engine. The results thus obtained in predicting protein structural classification are quite encouraging, indicating that the current approach may also be used to improve the prediction quality of other protein attributes, or at least can play a complimentary role to the existing methods in the relevant areas. Our algorithm is written in Matlab that is available by contacting the corresponding author.
Towards a metadata scheme for the description of materials - the description of microstructures

NASA Astrophysics Data System (ADS)

Schmitz, Georg J.; Böttger, Bernd; Apel, Markus; Eiken, Janin; Laschet, Gottfried; Altenfeld, Ralph; Berger, Ralf; Boussinot, Guillaume; Viardin, Alexandre

2016-01-01

The property of any material is essentially determined by its microstructure. Numerical models are increasingly the focus of modern engineering as helpful tools for tailoring and optimization of custom-designed microstructures by suitable processing and alloy design. A huge variety of software tools is available to predict various microstructural aspects for different materials. In the general frame of an integrated computational materials engineering (ICME) approach, these microstructure models provide the link between models operating at the atomistic or electronic scales, and models operating on the macroscopic scale of the component and its processing. In view of an improved interoperability of all these different tools it is highly desirable to establish a standardized nomenclature and methodology for the exchange of microstructure data. The scope of this article is to provide a comprehensive system of metadata descriptors for the description of a 3D microstructure. The presented descriptors are limited to a mere geometric description of a static microstructure and have to be complemented by further descriptors, e.g. for properties, numerical representations, kinetic data, and others in the future. Further attributes to each descriptor, e.g. on data origin, data uncertainty, and data validity range are being defined in ongoing work. The proposed descriptors are intended to be independent of any specific numerical representation. The descriptors defined in this article may serve as a first basis for standardization and will simplify the data exchange between different numerical models, as well as promote the integration of experimental data into numerical models of microstructures. An HDF5 template data file for a simple, three phase Al-Cu microstructure being based on the defined descriptors complements this article.
Towards a metadata scheme for the description of materials - the description of microstructures.

PubMed

Schmitz, Georg J; Böttger, Bernd; Apel, Markus; Eiken, Janin; Laschet, Gottfried; Altenfeld, Ralph; Berger, Ralf; Boussinot, Guillaume; Viardin, Alexandre

2016-01-01

The property of any material is essentially determined by its microstructure. Numerical models are increasingly the focus of modern engineering as helpful tools for tailoring and optimization of custom-designed microstructures by suitable processing and alloy design. A huge variety of software tools is available to predict various microstructural aspects for different materials. In the general frame of an integrated computational materials engineering (ICME) approach, these microstructure models provide the link between models operating at the atomistic or electronic scales, and models operating on the macroscopic scale of the component and its processing. In view of an improved interoperability of all these different tools it is highly desirable to establish a standardized nomenclature and methodology for the exchange of microstructure data. The scope of this article is to provide a comprehensive system of metadata descriptors for the description of a 3D microstructure. The presented descriptors are limited to a mere geometric description of a static microstructure and have to be complemented by further descriptors, e.g. for properties, numerical representations, kinetic data, and others in the future. Further attributes to each descriptor, e.g. on data origin, data uncertainty, and data validity range are being defined in ongoing work. The proposed descriptors are intended to be independent of any specific numerical representation. The descriptors defined in this article may serve as a first basis for standardization and will simplify the data exchange between different numerical models, as well as promote the integration of experimental data into numerical models of microstructures. An HDF5 template data file for a simple, three phase Al-Cu microstructure being based on the defined descriptors complements this article.
Descriptors of Oxygen-Evolution Activity for Oxides: A Statistical Evaluation

DOE PAGES

Hong, Wesley T.; Welsch, Roy E.; Shao-Horn, Yang

2015-12-16

Catalysts for oxygen electrochemical processes are critical for the commercial viability of renewable energy storage and conversion devices such as fuel cells, artificial photosynthesis, and metal-air batteries. Transition metal oxides are an excellent system for developing scalable, non-noble-metal-based catalysts, especially for the oxygen evolution reaction (OER). Central to the rational design of novel catalysts is the development of quantitative structure-activity relation-ships, which correlate the desired catalytic behavior to structural and/or elemental descriptors of materials. The ultimate goal is to use these relationships to guide materials design. In this study, 101 intrinsic OER activities of 51 perovskites were compiled from fivemore » studies in literature and additional measurements made for this work. We explored the behavior and performance of 14 descriptors of the metal-oxygen bond strength using a number of statistical approaches, including factor analysis and linear regression models. We found that these descriptors can be classified into five descriptor families and identify electron occupancy and metal-oxygen covalency as the dominant influences on the OER activity. However, multiple descriptors still need to be considered in order to develop strong predictive relationships, largely outperforming the use of only one or two descriptors (as conventionally done in the field). Here, we confirmed that the number of d electrons, charge-transfer energy (covalency), and optimality of eg occupancy play the important roles, but found that structural factors such as M-O-M bond angle and tolerance factor are relevant as well. With these tools, we demonstrate how statistical learning can be used to draw novel physical insights and combined with data mining to rapidly screen OER electrocatalysts across a wide chemical space.« less
On the Development and Use of Large Chemical Similarity Networks, Informatics Best Practices and Novel Chemical Descriptors towards Materials Quantitative Structure Property Relationships

ERIC Educational Resources Information Center

Krein, Michael

2011-01-01

After decades of development and use in a variety of application areas, Quantitative Structure Property Relationships (QSPRs) and related descriptor-based statistical learning methods have achieved a level of infamy due to their misuse. The field is rife with past examples of overtrained models, overoptimistic performance assessment, and outright…
Adapting CEF-Descriptors for Rating Purposes: Validation by a Combined Rater Training and Scale Revision Approach

ERIC Educational Resources Information Center

Harsch, Claudia; Martin, Guido

2012-01-01

We explore how a local rating scale can be based on the Common European Framework CEF-proficiency scales. As part of the scale validation (Alderson, 1991; Lumley, 2002), we examine which adaptations are needed to turn CEF-proficiency descriptors into a rating scale for a local context, and to establish a practicable method to revise the initial…
Build a Robust Learning Feature Descriptor by Using a New Image Visualization Method for Indoor Scenario Recognition

PubMed Central

Wang, Xin; Deng, Zhongliang

2017-01-01

In order to recognize indoor scenarios, we extract image features for detecting objects, however, computers can make some unexpected mistakes. After visualizing the histogram of oriented gradient (HOG) features, we find that the world through the eyes of a computer is indeed different from human eyes, which assists researchers to see the reasons that cause a computer to make errors. Additionally, according to the visualization, we notice that the HOG features can obtain rich texture information. However, a large amount of background interference is also introduced. In order to enhance the robustness of the HOG feature, we propose an improved method for suppressing the background interference. On the basis of the original HOG feature, we introduce a principal component analysis (PCA) to extract the principal components of the image colour information. Then, a new hybrid feature descriptor, which is named HOG–PCA (HOGP), is made by deeply fusing these two features. Finally, the HOGP is compared to the state-of-the-art HOG feature descriptor in four scenes under different illumination. In the simulation and experimental tests, the qualitative and quantitative assessments indicate that the visualizing images of the HOGP feature are close to the observation results obtained by human eyes, which is better than the original HOG feature for object detection. Furthermore, the runtime of our proposed algorithm is hardly increased in comparison to the classic HOG feature. PMID:28677635
Artificial neural networks and the study of the psychoactivity of cannabinoid compounds.

PubMed

Honório, Káthia M; de Lima, Emmanuela F; Quiles, Marcos G; Romero, Roseli A F; Molfetta, Fábio A; da Silva, Albérico B F

2010-06-01

Cannabinoid compounds have widely been employed because of its medicinal and psychotropic properties. These compounds are isolated from Cannabis sativa (or marijuana) and are used in several medical treatments, such as glaucoma, nausea associated to chemotherapy, pain and many other situations. More recently, its use as appetite stimulant has been indicated in patients with cachexia or AIDS. In this work, the influence of several molecular descriptors on the psychoactivity of 50 cannabinoid compounds is analyzed aiming one obtain a model able to predict the psychoactivity of new cannabinoids. For this purpose, initially, the selection of descriptors was carried out using the Fisher's weight, the correlation matrix among the calculated variables and principal component analysis. From these analyses, the following descriptors have been considered more relevant: E(LUMO) (energy of the lowest unoccupied molecular orbital), Log P (logarithm of the partition coefficient), VC4 (volume of the substituent at the C4 position) and LP1 (Lovasz-Pelikan index, a molecular branching index). To follow, two neural network models were used to construct a more adequate model for classifying new cannabinoid compounds. The first model employed was multi-layer perceptrons, with algorithm back-propagation, and the second model used was the Kohonen network. The results obtained from both networks were compared and showed that both techniques presented a high percentage of correctness to discriminate psychoactive and psychoinactive compounds. However, the Kohonen network was superior to multi-layer perceptrons.
Illumination invariant feature point matching for high-resolution planetary remote sensing images

NASA Astrophysics Data System (ADS)

Wu, Bo; Zeng, Hai; Hu, Han

2018-03-01

Despite its success with regular close-range and remote-sensing images, the scale-invariant feature transform (SIFT) algorithm is essentially not invariant to illumination differences due to the use of gradients for feature description. In planetary remote sensing imagery, which normally lacks sufficient textural information, salient regions are generally triggered by the shadow effects of keypoints, reducing the matching performance of classical SIFT. Based on the observation of dual peaks in a histogram of the dominant orientations of SIFT keypoints, this paper proposes an illumination-invariant SIFT matching method for high-resolution planetary remote sensing images. First, as the peaks in the orientation histogram are generally aligned closely with the sub-solar azimuth angle at the time of image collection, an adaptive suppression Gaussian function is tuned to level the histogram and thereby alleviate the differences in illumination caused by a changing solar angle. Next, the suppression function is incorporated into the original SIFT procedure for obtaining feature descriptors, which are used for initial image matching. Finally, as the distribution of feature descriptors changes after anisotropic suppression, and the ratio check used for matching and outlier removal in classical SIFT may produce inferior results, this paper proposes an improved matching procedure based on cross-checking and template image matching. The experimental results for several high-resolution remote sensing images from both the Moon and Mars, with illumination differences of 20°-180°, reveal that the proposed method retrieves about 40%-60% more matches than the classical SIFT method. The proposed method is of significance for matching or co-registration of planetary remote sensing images for their synergistic use in various applications. It also has the potential to be useful for flyby and rover images by integrating with the affine invariant feature detectors.
On the use of INS to improve Feature Matching

NASA Astrophysics Data System (ADS)

Masiero, A.; Guarnieri, A.; Vettore, A.; Pirotti, F.

2014-11-01

The continuous technological improvement of mobile devices opens the frontiers of Mobile Mapping systems to very compact systems, i.e. a smartphone or a tablet. This motivates the development of efficient 3D reconstruction techniques based on the sensors typically embedded in such devices, i.e. imaging sensors, GPS and Inertial Navigation System (INS). Such methods usually exploits photogrammetry techniques (structure from motion) to provide an estimation of the geometry of the scene. Actually, 3D reconstruction techniques (e.g. structure from motion) rely on use of features properly matched in different images to compute the 3D positions of objects by means of triangulation. Hence, correct feature matching is of fundamental importance to ensure good quality 3D reconstructions. Matching methods are based on the appearance of features, that can change as a consequence of variations of camera position and orientation, and environment illumination. For this reason, several methods have been developed in recent years in order to provide feature descriptors robust (ideally invariant) to such variations, e.g. Scale-Invariant Feature Transform (SIFT), Affine SIFT, Hessian affine and Harris affine detectors, Maximally Stable Extremal Regions (MSER). This work deals with the integration of information provided by the INS in the feature matching procedure: a previously developed navigation algorithm is used to constantly estimate the device position and orientation. Then, such information is exploited to estimate the transformation of feature regions between two camera views. This allows to compare regions from different images but associated to the same feature as seen by the same point of view, hence significantly easing the comparison of feature characteristics and, consequently, improving matching. SIFT-like descriptors are used in order to ensure good matching results in presence of illumination variations and to compensate the approximations related to the estimation process.
The comparison of automated clustering algorithms for resampling representative conformer ensembles with RMSD matrix.

PubMed

Kim, Hyoungrae; Jang, Cheongyun; Yadav, Dharmendra K; Kim, Mi-Hyun

2017-03-23

The accuracy of any 3D-QSAR, Pharmacophore and 3D-similarity based chemometric target fishing models are highly dependent on a reasonable sample of active conformations. Since a number of diverse conformational sampling algorithm exist, which exhaustively generate enough conformers, however model building methods relies on explicit number of common conformers. In this work, we have attempted to make clustering algorithms, which could find reasonable number of representative conformer ensembles automatically with asymmetric dissimilarity matrix generated from openeye tool kit. RMSD was the important descriptor (variable) of each column of the N × N matrix considered as N variables describing the relationship (network) between the conformer (in a row) and the other N conformers. This approach used to evaluate the performance of the well-known clustering algorithms by comparison in terms of generating representative conformer ensembles and test them over different matrix transformation functions considering the stability. In the network, the representative conformer group could be resampled for four kinds of algorithms with implicit parameters. The directed dissimilarity matrix becomes the only input to the clustering algorithms. Dunn index, Davies-Bouldin index, Eta-squared values and omega-squared values were used to evaluate the clustering algorithms with respect to the compactness and the explanatory power. The evaluation includes the reduction (abstraction) rate of the data, correlation between the sizes of the population and the samples, the computational complexity and the memory usage as well. Every algorithm could find representative conformers automatically without any user intervention, and they reduced the data to 14-19% of the original values within 1.13 s per sample at the most. The clustering methods are simple and practical as they are fast and do not ask for any explicit parameters. RCDTC presented the maximum Dunn and omega-squared values of the four algorithms in addition to consistent reduction rate between the population size and the sample size. The performance of the clustering algorithms was consistent over different transformation functions. Moreover, the clustering method can also be applied to molecular dynamics sampling simulation results.
Development of bovine serum albumin-water partition coefficients predictive models for ionogenic organic chemicals based on chemical form adjusted descriptors.

PubMed

Ding, Feng; Yang, Xianhai; Chen, Guosong; Liu, Jining; Shi, Lili; Chen, Jingwen

2017-10-01

The partition coefficients between bovine serum albumin (BSA) and water (K BSA/w ) for ionogenic organic chemicals (IOCs) were different greatly from those of neutral organic chemicals (NOCs). For NOCs, several excellent models were developed to predict their logK BSA/w . However, it was found that the conventional descriptors are inappropriate for modeling logK BSA/w of IOCs. Thus, alternative approaches are urgently needed to develop predictive models for K BSA/w of IOCs. In this study, molecular descriptors that can be used to characterize the ionization effects (e.g. chemical form adjusted descriptors) were calculated and used to develop predictive models for logK BSA/w of IOCs. The models developed had high goodness-of-fit, robustness, and predictive ability. The predictor variables selected to construct the models included the chemical form adjusted averages of the negative potentials on the molecular surface (V s-adj - ), the chemical form adjusted molecular dipole moment (dipolemoment adj ), the logarithm of the n-octanol/water distribution coefficient (logD). As these molecular descriptors can be calculated from their molecular structures directly, the developed model can be easily used to fill the logK BSA/w data gap for other IOCs within the applicability domain. Furthermore, the chemical form adjusted descriptors calculated in this study also could be used to construct predictive models on other endpoints of IOCs. Copyright © 2017 Elsevier Inc. All rights reserved.
A short feature vector for image matching: The Log-Polar Magnitude feature descriptor

PubMed Central

Hast, Anders; Wählby, Carolina; Sintorn, Ida-Maria

2017-01-01

The choice of an optimal feature detector-descriptor combination for image matching often depends on the application and the image type. In this paper, we propose the Log-Polar Magnitude feature descriptor—a rotation, scale, and illumination invariant descriptor that achieves comparable performance to SIFT on a large variety of image registration problems but with much shorter feature vectors. The descriptor is based on the Log-Polar Transform followed by a Fourier Transform and selection of the magnitude spectrum components. Selecting different frequency components allows optimizing for image patterns specific for a particular application. In addition, by relying only on coordinates of the found features and (optionally) feature sizes our descriptor is completely detector independent. We propose 48- or 56-long feature vectors that potentially can be shortened even further depending on the application. Shorter feature vectors result in better memory usage and faster matching. This combined with the fact that the descriptor does not require a time-consuming feature orientation estimation (the rotation invariance is achieved solely by using the magnitude spectrum of the Log-Polar Transform) makes it particularly attractive to applications with limited hardware capacity. Evaluation is performed on the standard Oxford dataset and two different microscopy datasets; one with fluorescence and one with transmission electron microscopy images. Our method performs better than SURF and comparable to SIFT on the Oxford dataset, and better than SIFT on both microscopy datasets indicating that it is particularly useful in applications with microscopy images. PMID:29190737
An image retrieval framework for real-time endoscopic image retargeting.

PubMed

Ye, Menglong; Johns, Edward; Walter, Benjamin; Meining, Alexander; Yang, Guang-Zhong

2017-08-01

Serial endoscopic examinations of a patient are important for early diagnosis of malignancies in the gastrointestinal tract. However, retargeting for optical biopsy is challenging due to extensive tissue variations between examinations, requiring the method to be tolerant to these changes whilst enabling real-time retargeting. This work presents an image retrieval framework for inter-examination retargeting. We propose both a novel image descriptor tolerant of long-term tissue changes and a novel descriptor matching method in real time. The descriptor is based on histograms generated from regional intensity comparisons over multiple scales, offering stability over long-term appearance changes at the higher levels, whilst remaining discriminative at the lower levels. The matching method then learns a hashing function using random forests, to compress the string and allow for fast image comparison by a simple Hamming distance metric. A dataset that contains 13 in vivo gastrointestinal videos was collected from six patients, representing serial examinations of each patient, which includes videos captured with significant time intervals. Precision-recall for retargeting shows that our new descriptor outperforms a number of alternative descriptors, whilst our hashing method outperforms a number of alternative hashing approaches. We have proposed a novel framework for optical biopsy in serial endoscopic examinations. A new descriptor, combined with a novel hashing method, achieves state-of-the-art retargeting, with validation on in vivo videos from six patients. Real-time performance also allows for practical integration without disturbing the existing clinical workflow.
Prediction of boiling points of organic compounds by QSPR tools.

PubMed

Dai, Yi-min; Zhu, Zhi-ping; Cao, Zhong; Zhang, Yue-fei; Zeng, Ju-lan; Li, Xun

2013-07-01

The novel electro-negativity topological descriptors of YC, WC were derived from molecular structure by equilibrium electro-negativity of atom and relative bond length of molecule. The quantitative structure-property relationships (QSPR) between descriptors of YC, WC as well as path number parameter P3 and the normal boiling points of 80 alkanes, 65 unsaturated hydrocarbons and 70 alcohols were obtained separately. The high-quality prediction models were evidenced by coefficient of determination (R(2)), the standard error (S), average absolute errors (AAE) and predictive parameters (Qext(2),RCV(2),Rm(2)). According to the regression equations, the influences of the length of carbon backbone, the size, the degree of branching of a molecule and the role of functional groups on the normal boiling point were analyzed. Comparison results with reference models demonstrated that novel topological descriptors based on the equilibrium electro-negativity of atom and the relative bond length were useful molecular descriptors for predicting the normal boiling points of organic compounds. Copyright © 2013 Elsevier Inc. All rights reserved.
Real-Time Visual Tracking through Fusion Features

PubMed Central

Ruan, Yang; Wei, Zhenzhong

2016-01-01

Due to their high-speed, correlation filters for object tracking have begun to receive increasing attention. Traditional object trackers based on correlation filters typically use a single type of feature. In this paper, we attempt to integrate multiple feature types to improve the performance, and we propose a new DD-HOG fusion feature that consists of discriminative descriptors (DDs) and histograms of oriented gradients (HOG). However, fusion features as multi-vector descriptors cannot be directly used in prior correlation filters. To overcome this difficulty, we propose a multi-vector correlation filter (MVCF) that can directly convolve with a multi-vector descriptor to obtain a single-channel response that indicates the location of an object. Experiments on the CVPR2013 tracking benchmark with the evaluation of state-of-the-art trackers show the effectiveness and speed of the proposed method. Moreover, we show that our MVCF tracker, which uses the DD-HOG descriptor, outperforms the structure-preserving object tracker (SPOT) in multi-object tracking because of its high-speed and ability to address heavy occlusion. PMID:27347951
a Robust Descriptor Based on Spatial and Frequency Structural Information for Visible and Thermal Infrared Image Matching

NASA Astrophysics Data System (ADS)

Fu, Z.; Qin, Q.; Wu, C.; Chang, Y.; Luo, B.

2017-09-01

Due to the differences of imaging principles, image matching between visible and thermal infrared images still exist new challenges and difficulties. Inspired by the complementary spatial and frequency information of geometric structural features, a robust descriptor is proposed for visible and thermal infrared images matching. We first divide two different spatial regions to the region around point of interest, using the histogram of oriented magnitudes, which corresponds to the 2-D structural shape information to describe the larger region and the edge oriented histogram to describe the spatial distribution for the smaller region. Then the two vectors are normalized and combined to a higher feature vector. Finally, our proposed descriptor is obtained by applying principal component analysis (PCA) to reduce the dimension of the combined high feature vector to make our descriptor more robust. Experimental results showed that our proposed method was provided with significant improvements in correct matching numbers and obvious advantages by complementing information within spatial and frequency structural information.

Multi-channel linear descriptors for event-related EEG collected in brain computer interface.

PubMed

Pei, Xiao-mei; Zheng, Chong-xun; Xu, Jin; Bin, Guang-yu; Wang, Hong-wu

2006-03-01

By three multi-channel linear descriptors, i.e. spatial complexity (omega), field power (sigma) and frequency of field changes (phi), event-related EEG data within 8-30 Hz were investigated during imagination of left or right hand movement. Studies on the event-related EEG data indicate that a two-channel version of omega, sigma and phi could reflect the antagonistic ERD/ERS patterns over contralateral and ipsilateral areas and also characterize different phases of the changing brain states in the event-related paradigm. Based on the selective two-channel linear descriptors, the left and right hand motor imagery tasks are classified to obtain satisfactory results, which testify the validity of the three linear descriptors omega, sigma and phi for characterizing event-related EEG. The preliminary results show that omega, sigma together with phi have good separability for left and right hand motor imagery tasks, which could be considered for classification of two classes of EEG patterns in the application of brain computer interfaces.
Quantitative structure-activity relationship of the curcumin-related compounds using various regression methods

NASA Astrophysics Data System (ADS)

Khazaei, Ardeshir; Sarmasti, Negin; Seyf, Jaber Yousefi

2016-03-01

Quantitative structure activity relationship were used to study a series of curcumin-related compounds with inhibitory effect on prostate cancer PC-3 cells, pancreas cancer Panc-1 cells, and colon cancer HT-29 cells. Sphere exclusion method was used to split data set in two categories of train and test set. Multiple linear regression, principal component regression and partial least squares were used as the regression methods. In other hand, to investigate the effect of feature selection methods, stepwise, Genetic algorithm, and simulated annealing were used. In two cases (PC-3 cells and Panc-1 cells), the best models were generated by a combination of multiple linear regression and stepwise (PC-3 cells: r2 = 0.86, q2 = 0.82, pred_r2 = 0.93, and r2m (test) = 0.43, Panc-1 cells: r2 = 0.85, q2 = 0.80, pred_r2 = 0.71, and r2m (test) = 0.68). For the HT-29 cells, principal component regression with stepwise (r2 = 0.69, q2 = 0.62, pred_r2 = 0.54, and r2m (test) = 0.41) is the best method. The QSAR study reveals descriptors which have crucial role in the inhibitory property of curcumin-like compounds. 6ChainCount, T_C_C_1, and T_O_O_7 are the most important descriptors that have the greatest effect. With a specific end goal to design and optimization of novel efficient curcumin-related compounds it is useful to introduce heteroatoms such as nitrogen, oxygen, and sulfur atoms in the chemical structure (reduce the contribution of T_C_C_1 descriptor) and increase the contribution of 6ChainCount and T_O_O_7 descriptors. Models can be useful in the better design of some novel curcumin-related compounds that can be used in the treatment of prostate, pancreas, and colon cancers.
Computer-assisted shape descriptors for skull morphology in craniosynostosis.

PubMed

Shim, Kyu Won; Lee, Min Jin; Lee, Myung Chul; Park, Eun Kyung; Kim, Dong Seok; Hong, Helen; Kim, Yong Oock

2016-03-01

Our aim was to develop a novel method for characterizing common skull deformities with high sensitivity and specificity, based on two-dimensional (2D) shape descriptors in computed tomography (CT) images. Between 2003 and 2014, 44 normal subjects and 39 infants with craniosynostosis (sagittal, 29; bicoronal, 10) enrolled for analysis. Mean age overall was 16 months (range, 1-120 months), with a male:female ratio of 56:29. Two reference planes, sagittal (S-plane: through top of lateral ventricle) and coronal (C-plane: at maximum dimension of fourth ventricle), were utilized to formulate three 2D shape descriptors (cranial index [CI], cranial radius index [CR], and cranial extreme spot index [CES]), which were then applied to S- and C-plane target images of both groups. In infants with sagittal craniosynostosis, CI in S-plane (S-CI) usually was <1.0 (mean, 0.78; range, 0.67-0.95), with CR consistently at 3 and a characteristic CES pattern of two discrete hot spots oriented diagonally. In the bicoronal craniosynostosis subset, CI was >1.0 (mean 1.11; range, 1.04-1.25), with CR at -3 and a CES pattern of four discrete diagonally oriented hot spots. Scatter plots underscored the highly intuitive joint performance of CI and CES in distinguishing normal and deformed states. Altogether, these novel 2D shape descriptors enabled effective discrimination of sagittal and bicoronal skull deformities. Newly developed 2D shape descriptors for cranial CT imaging enabled recognition of common skull deformities with statistical significance, perhaps providing impetus for automated CT-based diagnosis of craniosynostosis.
Genetic divergence among accessions of melon from traditional agriculture of the Brazilian Northeast.

PubMed

Aragão, F A S; Torres Filho, J; Nunes, G H S; Queiróz, M A; Bordallo, P N; Buso, G S C; Ferreira, M A; Costa, Z P; Bezerra Neto, F

2013-12-06

The genetic divergence of 38 melon accessions from traditional agriculture of the Brazilian Northeast and three commercial hybrids were evaluated using fruit descriptors and microsatellite markers. The melon germplasm belongs to the botanic varieties cantalupensis (19), momordica (7), conomon (4), and inodorus (3), and to eight genotypes that were identified only at the species level. The fruit descriptors evaluated were: number of fruits per plant (NPF), fruit mass (FM; kg), fruit longitudinal diameter (LD; cm), fruit transversal diameter (TD; cm), shape index based on the LD/TD ratio, flesh pulp thickness, cavity thickness (CT; cm), firmness fruit pulp (N), and soluble solids (SS; °Brix). The results showed high variability for all descriptors, especially for NPF, LD, and FM. The grouping analysis based on fruit descriptors produced eight groups without taxonomic criteria. The LD (22.52%), NPF (19.70%), CT (16.13%), and SS (9.57%) characteristics were the descriptors that contributed the most to genotype dissimilarity. The 17 simple sequence repeat polymorphic markers amplified 41 alleles with an average of 2.41 alleles and three genotypes per locus. Some markers presented a high frequency for the main allele. The genetic diversity ranged from 0.07 to 0.60, the observed heterozygosity had very low values, and the mean polymorphism information content was 0.32. Molecular genetic similarity analyses clustered the accessions in 13 groups, also not following taxonomic ranks. There was no association between morphoagronomic and molecular groupings. In conclusion, there was great variability among the accessions and among and within botanic groups.
Quantitative structure-activation barrier relationship modeling for Diels-Alder ligations utilizing quantum chemical structural descriptors.

PubMed

Nandi, Sisir; Monesi, Alessandro; Drgan, Viktor; Merzel, Franci; Novič, Marjana

2013-10-30

In the present study, we show the correlation of quantum chemical structural descriptors with the activation barriers of the Diels-Alder ligations. A set of 72 non-catalysed Diels-Alder reactions were subjected to quantitative structure-activation barrier relationship (QSABR) under the framework of theoretical quantum chemical descriptors calculated solely from the structures of diene and dienophile reactants. Experimental activation barrier data were obtained from literature. Descriptors were computed using Hartree-Fock theory using 6-31G(d) basis set as implemented in Gaussian 09 software. Variable selection and model development were carried out by stepwise multiple linear regression methodology. Predictive performance of the quantitative structure-activation barrier relationship (QSABR) model was assessed by training and test set concept and by calculating leave-one-out cross-validated Q2 and predictive R2 values. The QSABR model can explain and predict 86.5% and 80% of the variances, respectively, in the activation energy barrier training data. Alternatively, a neural network model based on back propagation of errors was developed to assess the nonlinearity of the sought correlations between theoretical descriptors and experimental reaction barriers. A reasonable predictability for the activation barrier of the test set reactions was obtained, which enabled an exploration and interpretation of the significant variables responsible for Diels-Alder interaction between dienes and dienophiles. Thus, studies in the direction of QSABR modelling that provide efficient and fast prediction of activation barriers of the Diels-Alder reactions turn out to be a meaningful alternative to transition state theory based computation.
A general representation scheme for crystalline solids based on Voronoi-tessellation real feature values and atomic property data

PubMed Central

Jalem, Randy; Nakayama, Masanobu; Noda, Yusuke; Le, Tam; Takeuchi, Ichiro; Tateyama, Yoshitaka; Yamazaki, Hisatsugu

2018-01-01

Abstract Increasing attention has been paid to materials informatics approaches that promise efficient and fast discovery and optimization of functional inorganic materials. Technical breakthrough is urgently requested to advance this field and efforts have been made in the development of materials descriptors to encode or represent characteristics of crystalline solids, such as chemical composition, crystal structure, electronic structure, etc. We propose a general representation scheme for crystalline solids that lifts restrictions on atom ordering, cell periodicity, and system cell size based on structural descriptors of directly binned Voronoi-tessellation real feature values and atomic/chemical descriptors based on the electronegativity of elements in the crystal. Comparison was made vs. radial distribution function (RDF) feature vector, in terms of predictive accuracy on density functional theory (DFT) material properties: cohesive energy (CE), density (d), electronic band gap (BG), and decomposition energy (Ed). It was confirmed that the proposed feature vector from Voronoi real value binning generally outperforms the RDF-based one for the prediction of aforementioned properties. Together with electronegativity-based features, Voronoi-tessellation features from a given crystal structure that are derived from second-nearest neighbor information contribute significantly towards prediction. PMID:29707064
A general representation scheme for crystalline solids based on Voronoi-tessellation real feature values and atomic property data.

PubMed

Jalem, Randy; Nakayama, Masanobu; Noda, Yusuke; Le, Tam; Takeuchi, Ichiro; Tateyama, Yoshitaka; Yamazaki, Hisatsugu

2018-01-01

Increasing attention has been paid to materials informatics approaches that promise efficient and fast discovery and optimization of functional inorganic materials. Technical breakthrough is urgently requested to advance this field and efforts have been made in the development of materials descriptors to encode or represent characteristics of crystalline solids, such as chemical composition, crystal structure, electronic structure, etc. We propose a general representation scheme for crystalline solids that lifts restrictions on atom ordering, cell periodicity, and system cell size based on structural descriptors of directly binned Voronoi-tessellation real feature values and atomic/chemical descriptors based on the electronegativity of elements in the crystal. Comparison was made vs. radial distribution function (RDF) feature vector, in terms of predictive accuracy on density functional theory (DFT) material properties: cohesive energy (CE), density ( d ), electronic band gap (BG), and decomposition energy (Ed). It was confirmed that the proposed feature vector from Voronoi real value binning generally outperforms the RDF-based one for the prediction of aforementioned properties. Together with electronegativity-based features, Voronoi-tessellation features from a given crystal structure that are derived from second-nearest neighbor information contribute significantly towards prediction.
Food Recognition: A New Dataset, Experiments, and Results.

PubMed

Ciocca, Gianluigi; Napoletano, Paolo; Schettini, Raimondo

2017-05-01

We propose a new dataset for the evaluation of food recognition algorithms that can be used in dietary monitoring applications. Each image depicts a real canteen tray with dishes and foods arranged in different ways. Each tray contains multiple instances of food classes. The dataset contains 1027 canteen trays for a total of 3616 food instances belonging to 73 food classes. The food on the tray images has been manually segmented using carefully drawn polygonal boundaries. We have benchmarked the dataset by designing an automatic tray analysis pipeline that takes a tray image as input, finds the regions of interest, and predicts for each region the corresponding food class. We have experimented with three different classification strategies using also several visual descriptors. We achieve about 79% of food and tray recognition accuracy using convolutional-neural-networks-based features. The dataset, as well as the benchmark framework, are available to the research community.
Object Detection Techniques Applied on Mobile Robot Semantic Navigation

PubMed Central

Astua, Carlos; Barber, Ramon; Crespo, Jonathan; Jardon, Alberto

2014-01-01

The future of robotics predicts that robots will integrate themselves more every day with human beings and their environments. To achieve this integration, robots need to acquire information about the environment and its objects. There is a big need for algorithms to provide robots with these sort of skills, from the location where objects are needed to accomplish a task up to where these objects are considered as information about the environment. This paper presents a way to provide mobile robots with the ability-skill to detect objets for semantic navigation. This paper aims to use current trends in robotics and at the same time, that can be exported to other platforms. Two methods to detect objects are proposed, contour detection and a descriptor based technique, and both of them are combined to overcome their respective limitations. Finally, the code is tested on a real robot, to prove its accuracy and efficiency. PMID:24732101
Iris recognition using possibilistic fuzzy matching on local features.

PubMed

Tsai, Chung-Chih; Lin, Heng-Yi; Taur, Jinshiuh; Tao, Chin-Wang

2012-02-01

In this paper, we propose a novel possibilistic fuzzy matching strategy with invariant properties, which can provide a robust and effective matching scheme for two sets of iris feature points. In addition, the nonlinear normalization model is adopted to provide more accurate position before matching. Moreover, an effective iris segmentation method is proposed to refine the detected inner and outer boundaries to smooth curves. For feature extraction, the Gabor filters are adopted to detect the local feature points from the segmented iris image in the Cartesian coordinate system and to generate a rotation-invariant descriptor for each detected point. After that, the proposed matching algorithm is used to compute a similarity score for two sets of feature points from a pair of iris images. The experimental results show that the performance of our system is better than those of the systems based on the local features and is comparable to those of the typical systems.
Parametric Human Body Reconstruction Based on Sparse Key Points.

PubMed

Cheng, Ke-Li; Tong, Ruo-Feng; Tang, Min; Qian, Jing-Ye; Sarkis, Michel

2016-11-01

We propose an automatic parametric human body reconstruction algorithm which can efficiently construct a model using a single Kinect sensor. A user needs to stand still in front of the sensor for a couple of seconds to measure the range data. The user's body shape and pose will then be automatically constructed in several seconds. Traditional methods optimize dense correspondences between range data and meshes. In contrast, our proposed scheme relies on sparse key points for the reconstruction. It employs regression to find the corresponding key points between the scanned range data and some annotated training data. We design two kinds of feature descriptors as well as corresponding regression stages to make the regression robust and accurate. Our scheme follows with dense refinement where a pre-factorization method is applied to improve the computational efficiency. Compared with other methods, our scheme achieves similar reconstruction accuracy but significantly reduces runtime.
SABRE: ligand/structure-based virtual screening approach using consensus molecular-shape pattern recognition.

PubMed

Wei, Ning-Ning; Hamza, Adel

2014-01-27

We present an efficient and rational ligand/structure shape-based virtual screening approach combining our previous ligand shape-based similarity SABRE (shape-approach-based routines enhanced) and the 3D shape of the receptor binding site. Our approach exploits the pharmacological preferences of a number of known active ligands to take advantage of the structural diversities and chemical similarities, using a linear combination of weighted molecular shape density. Furthermore, the algorithm generates a consensus molecular-shape pattern recognition that is used to filter and place the candidate structure into the binding pocket. The descriptor pool used to construct the consensus molecular-shape pattern consists of four dimensional (4D) fingerprints generated from the distribution of conformer states available to a molecule and the 3D shapes of a set of active ligands computed using SABRE software. The virtual screening efficiency of SABRE was validated using the Database of Useful Decoys (DUD) and the filtered version (WOMBAT) of 10 DUD targets. The ligand/structure shape-based similarity SABRE algorithm outperforms several other widely used virtual screening methods which uses the data fusion of multiscreening tools (2D and 3D fingerprints) and demonstrates a superior early retrieval rate of active compounds (EF(0.1%) = 69.0% and EF(1%) = 98.7%) from a large size of ligand database (∼95,000 structures). Therefore, our developed similarity approach can be of particular use for identifying active compounds that are similar to reference molecules and predicting activity against other targets (chemogenomics). An academic license of the SABRE program is available on request.
Calculation of the octanol-water partition coefficient of armchair polyhex BN nanotubes

NASA Astrophysics Data System (ADS)

Mohammadinasab, E.; Pérez-Sánchez, H.; Goodarzi, M.

2017-12-01

A predictive model for determination partition coefficient (log P) of armchair polyhex BN nanotubes by using simple descriptors was built. The relationship between the octanol-water log P and quantum chemical descriptors, electric moments, and topological indices of some armchair polyhex BN nanotubes with various lengths and fixed circumference are represented. Based on density functional theory electric moments and physico-chemical properties of those nanotubes are calculated.
Multiple local feature representations and their fusion based on an SVR model for iris recognition using optimized Gabor filters

NASA Astrophysics Data System (ADS)

He, Fei; Liu, Yuanning; Zhu, Xiaodong; Huang, Chun; Han, Ye; Dong, Hongxing

2014-12-01

Gabor descriptors have been widely used in iris texture representations. However, fixed basic Gabor functions cannot match the changing nature of diverse iris datasets. Furthermore, a single form of iris feature cannot overcome difficulties in iris recognition, such as illumination variations, environmental conditions, and device variations. This paper provides multiple local feature representations and their fusion scheme based on a support vector regression (SVR) model for iris recognition using optimized Gabor filters. In our iris system, a particle swarm optimization (PSO)- and a Boolean particle swarm optimization (BPSO)-based algorithm is proposed to provide suitable Gabor filters for each involved test dataset without predefinition or manual modulation. Several comparative experiments on JLUBR-IRIS, CASIA-I, and CASIA-V4-Interval iris datasets are conducted, and the results show that our work can generate improved local Gabor features by using optimized Gabor filters for each dataset. In addition, our SVR fusion strategy may make full use of their discriminative ability to improve accuracy and reliability. Other comparative experiments show that our approach may outperform other popular iris systems.
High-throughput screening for thermoelectric sulphides by using crystal structure features as descriptors

NASA Astrophysics Data System (ADS)

Zhang, Ruizhi; Du, Baoli; Chen, Kan; Reece, Mike; Materials Research Insititute Team

With the increasing computational power and reliable databases, high-throughput screening is playing a more and more important role in the search of new thermoelectric materials. Rather than the well established density functional theory (DFT) calculation based methods, we propose an alternative approach to screen for new TE materials: using crystal structural features as 'descriptors'. We show that a non-distorted transition metal sulphide polyhedral network can be a good descriptor for high power factor according to crystal filed theory. By using Cu/S containing compounds as an example, 1600+ Cu/S containing entries in the Inorganic Crystal Structure Database (ICSD) were screened, and of those 84 phases are identified as promising thermoelectric materials. The screening results are validated by both electronic structure calculations and experimental results from the literature. We also fabricated some new compounds to test our screening results. Another advantage of using crystal structure features as descriptors is that we can easily establish structural relationships between the identified phases. Based on this, two material design approaches are discussed: 1) High-pressure synthesis of metastable phase; 2) In-situ 2-phase composites with coherent interface. This work was supported by a Marie Curie International Incoming Fellowship of the European Community Human Potential Program.
Physicochemical properties/descriptors governing the solubility and partitioning of chemicals in water-solvent-gas systems. Part 1. Partitioning between octanol and air.

PubMed

Raevsky, O A; Grigor'ev, V J; Raevskaja, O E; Schaper, K-J

2006-06-01

QSPR analyses of a data set containing experimental partition coefficients in the three systems octanol-water, water-gas, and octanol-gas for 98 chemicals have shown that it is possible to calculate any partition coefficient in the system 'gas phase/octanol/water' by three different approaches: (1) from experimental partition coefficients obtained in the corresponding two other subsystems. However, in many cases these data may not be available. Therefore, a solution may be approached (2), a traditional QSPR analysis based on e.g. HYBOT descriptors (hydrogen bond acceptor and donor factors, SigmaCa and SigmaCd, together with polarisability alpha, a steric bulk effect descriptor) and supplemented with substructural indicator variables. (3) A very promising approach which is a combination of the similarity concept and QSPR based on HYBOT descriptors. In this approach observed partition coefficients of structurally nearest neighbours of a compound-of-interest are used. In addition, contributions arising from differences in alpha, SigmaCa, and SigmaCd values between the compound-of-interest and its nearest neighbour(s), respectively, are considered. In this investigation highly significant relationships were obtained by approaches (1) and (3) for the octanol/gas phase partition coefficient (log Log).
Computational strategies for the automated design of RNA nanoscale structures from building blocks using NanoTiler.

PubMed

Bindewald, Eckart; Grunewald, Calvin; Boyle, Brett; O'Connor, Mary; Shapiro, Bruce A

2008-10-01

One approach to designing RNA nanoscale structures is to use known RNA structural motifs such as junctions, kissing loops or bulges and to construct a molecular model by connecting these building blocks with helical struts. We previously developed an algorithm for detecting internal loops, junctions and kissing loops in RNA structures. Here we present algorithms for automating or assisting many of the steps that are involved in creating RNA structures from building blocks: (1) assembling building blocks into nanostructures using either a combinatorial search or constraint satisfaction; (2) optimizing RNA 3D ring structures to improve ring closure; (3) sequence optimisation; (4) creating a unique non-degenerate RNA topology descriptor. This effectively creates a computational pipeline for generating molecular models of RNA nanostructures and more specifically RNA ring structures with optimized sequences from RNA building blocks. We show several examples of how the algorithms can be utilized to generate RNA tecto-shapes.
Computational strategies for the automated design of RNA nanoscale structures from building blocks using NanoTiler☆

PubMed Central

Bindewald, Eckart; Grunewald, Calvin; Boyle, Brett; O’Connor, Mary; Shapiro, Bruce A.

2013-01-01

One approach to designing RNA nanoscale structures is to use known RNA structural motifs such as junctions, kissing loops or bulges and to construct a molecular model by connecting these building blocks with helical struts. We previously developed an algorithm for detecting internal loops, junctions and kissing loops in RNA structures. Here we present algorithms for automating or assisting many of the steps that are involved in creating RNA structures from building blocks: (1) assembling building blocks into nanostructures using either a combinatorial search or constraint satisfaction; (2) optimizing RNA 3D ring structures to improve ring closure; (3) sequence optimisation; (4) creating a unique non-degenerate RNA topology descriptor. This effectively creates a computational pipeline for generating molecular models of RNA nanostructures and more specifically RNA ring structures with optimized sequences from RNA building blocks. We show several examples of how the algorithms can be utilized to generate RNA tecto-shapes. PMID:18838281
Recognizing Disguised Faces: Human and Machine Evaluation

PubMed Central

Dhamecha, Tejas Indulal; Singh, Richa; Vatsa, Mayank; Kumar, Ajay

2014-01-01

Face verification, though an easy task for humans, is a long-standing open research area. This is largely due to the challenging covariates, such as disguise and aging, which make it very hard to accurately verify the identity of a person. This paper investigates human and machine performance for recognizing/verifying disguised faces. Performance is also evaluated under familiarity and match/mismatch with the ethnicity of observers. The findings of this study are used to develop an automated algorithm to verify the faces presented under disguise variations. We use automatically localized feature descriptors which can identify disguised face patches and account for this information to achieve improved matching accuracy. The performance of the proposed algorithm is evaluated on the IIIT-Delhi Disguise database that contains images pertaining to 75 subjects with different kinds of disguise variations. The experiments suggest that the proposed algorithm can outperform a popular commercial system and evaluates them against humans in matching disguised face images. PMID:25029188
Automation of aggregate characterization using laser profiling and digital image analysis

NASA Astrophysics Data System (ADS)

Kim, Hyoungkwan

2002-08-01

Particle morphological properties such as size, shape, angularity, and texture are key properties that are frequently used to characterize aggregates. The characteristics of aggregates are crucial to the strength, durability, and serviceability of the structure in which they are used. Thus, it is important to select aggregates that have proper characteristics for each specific application. Use of improper aggregate can cause rapid deterioration or even failure of the structure. The current standard aggregate test methods are generally labor-intensive, time-consuming, and subject to human errors. Moreover, important properties of aggregates may not be captured by the standard methods due to a lack of an objective way of quantifying critical aggregate properties. Increased quality expectations of products along with recent technological advances in information technology are motivating new developments to provide fast and accurate aggregate characterization. The resulting information can enable a real time quality control of aggregate production as well as lead to better design and construction methods of portland cement concrete and hot mix asphalt. This dissertation presents a system to measure various morphological characteristics of construction aggregates effectively. Automatic measurement of various particle properties is of great interest because it has the potential to solve such problems in manual measurements as subjectivity, labor intensity, and slow speed. The main efforts of this research are placed on three-dimensional (3D) laser profiling, particle segmentation algorithms, particle measurement algorithms, and generalized particle descriptors. First, true 3D data of aggregate particles obtained by laser profiling are transformed into digital images. Second, a segmentation algorithm and a particle measurement algorithm are developed to separate particles and process each particle data individually with the aid of various kinds of digital image technologies. Finally, in order to provide a generalized, quantitative, and representative way to characterize aggregate particles, 3D particle descriptors are developed using the multi-resolution analysis feature of wavelet transforms. Verification tests show that this approach could characterize various aggregate properties in a fast, accurate, and reliable way. When implemented, this ability to automatically analyze multiple characteristics of an aggregate sample is expected to provide not only economic but also intangible strategic gains.

Prediction of octanol-water partition coefficients of organic compounds by multiple linear regression, partial least squares, and artificial neural network.

PubMed

Golmohammadi, Hassan

2009-11-30

A quantitative structure-property relationship (QSPR) study was performed to develop models those relate the structure of 141 organic compounds to their octanol-water partition coefficients (log P(o/w)). A genetic algorithm was applied as a variable selection tool. Modeling of log P(o/w) of these compounds as a function of theoretically derived descriptors was established by multiple linear regression (MLR), partial least squares (PLS), and artificial neural network (ANN). The best selected descriptors that appear in the models are: atomic charge weighted partial positively charged surface area (PPSA-3), fractional atomic charge weighted partial positive surface area (FPSA-3), minimum atomic partial charge (Qmin), molecular volume (MV), total dipole moment of molecule (mu), maximum antibonding contribution of a molecule orbital in the molecule (MAC), and maximum free valency of a C atom in the molecule (MFV). The result obtained showed the ability of developed artificial neural network to prediction of partition coefficients of organic compounds. Also, the results revealed the superiority of ANN over the MLR and PLS models. Copyright 2009 Wiley Periodicals, Inc.
Design of focused and restrained subsets from extremely large virtual libraries.

PubMed

Jamois, Eric A; Lin, Chien T; Waldman, Marvin

2003-11-01

With the current and ever-growing offering of reagents along with the vast palette of organic reactions, virtual libraries accessible to combinatorial chemists can reach sizes of billions of compounds or more. Extracting practical size subsets for experimentation has remained an essential step in the design of combinatorial libraries. A typical approach to computational library design involves enumeration of structures and properties for the entire virtual library, which may be unpractical for such large libraries. This study describes a new approach termed as on the fly optimization (OTFO) where descriptors are computed as needed within the subset optimization cycle and without intermediate enumeration of structures. Results reported herein highlight the advantages of coupling an ultra-fast descriptor calculation engine to subset optimization capabilities. We also show that enumeration of properties for the entire virtual library may not only be unpractical but also wasteful. Successful design of focused and restrained subsets can be achieved while sampling only a small fraction of the virtual library. We also investigate the stability of the method and compare results obtained from simulated annealing (SA) and genetic algorithms (GA).
High-order statistics of weber local descriptors for image representation.

PubMed

Han, Xian-Hua; Chen, Yen-Wei; Xu, Gang

2015-06-01

Highly discriminant visual features play a key role in different image classification applications. This study aims to realize a method for extracting highly-discriminant features from images by exploring a robust local descriptor inspired by Weber's law. The investigated local descriptor is based on the fact that human perception for distinguishing a pattern depends not only on the absolute intensity of the stimulus but also on the relative variance of the stimulus. Therefore, we firstly transform the original stimulus (the images in our study) into a differential excitation-domain according to Weber's law, and then explore a local patch, called micro-Texton, in the transformed domain as Weber local descriptor (WLD). Furthermore, we propose to employ a parametric probability process to model the Weber local descriptors, and extract the higher-order statistics to the model parameters for image representation. The proposed strategy can adaptively characterize the WLD space using generative probability model, and then learn the parameters for better fitting the training space, which would lead to more discriminant representation for images. In order to validate the efficiency of the proposed strategy, we apply three different image classification applications including texture, food images and HEp-2 cell pattern recognition, which validates that our proposed strategy has advantages over the state-of-the-art approaches.
Classification of Strawberry Fruit Shape by Machine Learning

NASA Astrophysics Data System (ADS)

Ishikawa, T.; Hayashi, A.; Nagamatsu, S.; Kyutoku, Y.; Dan, I.; Wada, T.; Oku, K.; Saeki, Y.; Uto, T.; Tanabata, T.; Isobe, S.; Kochi, N.

2018-05-01

Shape is one of the most important traits of agricultural products due to its relationships with the quality, quantity, and value of the products. For strawberries, the nine types of fruit shape were defined and classified by humans based on the sampler patterns of the nine types. In this study, we tested the classification of strawberry shapes by machine learning in order to increase the accuracy of the classification, and we introduce the concept of computerization into this field. Four types of descriptors were extracted from the digital images of strawberries: (1) the Measured Values (MVs) including the length of the contour line, the area, the fruit length and width, and the fruit width/length ratio; (2) the Ellipse Similarity Index (ESI); (3) Elliptic Fourier Descriptors (EFDs), and (4) Chain Code Subtraction (CCS). We used these descriptors for the classification test along with the random forest approach, and eight of the nine shape types were classified with combinations of MVs + CCS + EFDs. CCS is a descriptor that adds human knowledge to the chain codes, and it showed higher robustness in classification than the other descriptors. Our results suggest machine learning's high ability to classify fruit shapes accurately. We will attempt to increase the classification accuracy and apply the machine learning methods to other plant species.
Output feedback control for a class of nonlinear systems with actuator degradation and sensor noise.

PubMed

Ai, Weiqing; Lu, Zhenli; Li, Bin; Fei, Shumin

2016-11-01

This paper investigates the output feedback control problem of a class of nonlinear systems with sensor noise and actuator degradation. Firstly, by using the descriptor observer approach, the origin system is transformed into a descriptor system. On the basis of the descriptor system, a novel Proportional Derivative (PD) observer is developed to asymptotically estimate sensor noise and system state simultaneously. Then, by designing an adaptive law to estimate the effectiveness of actuator, an adaptive observer-based controller is constructed to ensure that system state can be regulated to the origin asymptotically. Finally, the design scheme is applied to address a flexible joint robot link problem. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
Predictive Modeling of Human Perception Subjectivity: Feasibility Study of Mammographic Lesion Similarity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, Songhua; Tourassi, Georgia

2012-01-01

The majority of clinical content-based image retrieval (CBIR) studies disregard human perception subjectivity, aiming to duplicate the consensus expert assessment of the visual similarity on example cases. The purpose of our study is twofold: (i) discern better the extent of human perception subjectivity when assessing the visual similarity of two images with similar semantic content, and (ii) explore the feasibility of personalized predictive modeling of visual similarity. We conducted a human observer study in which five observers of various expertise were shown ninety-nine triplets of mammographic masses with similar BI-RADS descriptors and were asked to select the two masses withmore » the highest visual relevance. Pairwise agreement ranged between poor and fair among the five observers, as assessed by the kappa statistic. The observers' self-consistency rate was remarkably low, based on repeated questions where either the orientation or the presentation order of a mass was changed. Various machine learning algorithms were explored to determine whether they can predict each observer's personalized selection using textural features. Many algorithms performed with accuracy that exceeded each observer's self-consistency rate, as determined using a cross-validation scheme. This accuracy was statistically significantly higher than would be expected by chance alone (two-tailed p-value ranged between 0.001 and 0.01 for all five personalized models). The study confirmed that human perception subjectivity should be taken into account when developing CBIR-based medical applications.« less
Robust feature matching via support-line voting and affine-invariant ratios

NASA Astrophysics Data System (ADS)

Li, Jiayuan; Hu, Qingwu; Ai, Mingyao; Zhong, Ruofei

2017-10-01

Robust image matching is crucial for many applications of remote sensing and photogrammetry, such as image fusion, image registration, and change detection. In this paper, we propose a robust feature matching method based on support-line voting and affine-invariant ratios. We first use popular feature matching algorithms, such as SIFT, to obtain a set of initial matches. A support-line descriptor based on multiple adaptive binning gradient histograms is subsequently applied in the support-line voting stage to filter outliers. In addition, we use affine-invariant ratios computed by a two-line structure to refine the matching results and estimate the local affine transformation. The local affine model is more robust to distortions caused by elevation differences than the global affine transformation, especially for high-resolution remote sensing images and UAV images. Thus, the proposed method is suitable for both rigid and non-rigid image matching problems. Finally, we extract as many high-precision correspondences as possible based on the local affine extension and build a grid-wise affine model for remote sensing image registration. We compare the proposed method with six state-of-the-art algorithms on several data sets and show that our method significantly outperforms the other methods. The proposed method achieves 94.46% average precision on 15 challenging remote sensing image pairs, while the second-best method, RANSAC, only achieves 70.3%. In addition, the number of detected correct matches of the proposed method is approximately four times the number of initial SIFT matches.
Language and the pain experience.

PubMed

Wilson, Dianne; Williams, Marie; Butler, David

2009-03-01

People in persistent pain have been reported to pay increased attention to specific words or descriptors of pain. The amount of attention paid to pain or cues for pain (such as pain descriptors), has been shown to be a major factor in the modulation of persistent pain. This relationship suggests the possibility that language may have a role both in understanding and managing the persistent pain experience. The aim of this paper is to describe current models of neuromatrices for pain and language, consider the role of attention in persistent pain states and highlight discrepancies, in previous studies based on the McGill Pain Questionnaire (MPQ), of the role of attention on pain descriptors. The existence of a pain neuromatrix originally proposed by Melzack (1990) has been supported by emerging technologies. Similar technologies have recently allowed identification of multiple areas of involvement for the processing of auditory input and the construction of language. As with the construction of pain, this neuromatrix for speech and language may intersect with neural systems for broader cognitive functions such as attention, memory and emotion. A systematic search was undertaken to identify experimental or review studies, which specifically investigated the role of attention on pain descriptors (as cues for pain) in persistent pain patients. A total of 99 articles were retrieved from six databases, with 66 articles meeting the inclusion criteria. After duplicated articles were eliminated, the remaining 41 articles were reviewed in order to support a link between persistent pain, pain descriptors and attention. This review revealed a diverse range of specific pain descriptors, the majority of which were derived from the MPQ. Increased attention to pain descriptors was consistently reported to be associated with emotional state as well as being a significant factor in maintaining persistent pain. However, attempts to investigate the attentional bias of specific pain descriptors highlighted discrepancies between the studies. As well as the diversity of pain descriptors used in studies, they were inconsistently categorized into domains of pain. A lack of consistent bias towards certain pain descriptors was observed, and may be explained simply by the fact that the words provided are not those which subjects themselves would use. These findings suggest that the multidimensional and individual nature of the persistent pain experience may not be adequately explained by pain questionnaires such as the MPQ. Personalized pain descriptors may communicate the pain experience more appropriately, but may also contribute to an increased sensitivity of cortical pain processing areas by capturing increased attention for that individual. The language used as part of communication between therapists and people with persistent pain may provide an, as yet, unexplored adjunct strategy in management. Copyright (c) 2008 John Wiley & Sons, Ltd.
Action Recognition Using Nonnegative Action Component Representation and Sparse Basis Selection.

PubMed

Wang, Haoran; Yuan, Chunfeng; Hu, Weiming; Ling, Haibin; Yang, Wankou; Sun, Changyin

2014-02-01

In this paper, we propose using high-level action units to represent human actions in videos and, based on such units, a novel sparse model is developed for human action recognition. There are three interconnected components in our approach. First, we propose a new context-aware spatial-temporal descriptor, named locally weighted word context, to improve the discriminability of the traditionally used local spatial-temporal descriptors. Second, from the statistics of the context-aware descriptors, we learn action units using the graph regularized nonnegative matrix factorization, which leads to a part-based representation and encodes the geometrical information. These units effectively bridge the semantic gap in action recognition. Third, we propose a sparse model based on a joint l2,1-norm to preserve the representative items and suppress noise in the action units. Intuitively, when learning the dictionary for action representation, the sparse model captures the fact that actions from the same class share similar units. The proposed approach is evaluated on several publicly available data sets. The experimental results and analysis clearly demonstrate the effectiveness of the proposed approach.
View subspaces for indexing and retrieval of 3D models

NASA Astrophysics Data System (ADS)

Dutagaci, Helin; Godil, Afzal; Sankur, Bülent; Yemez, Yücel

2010-02-01

View-based indexing schemes for 3D object retrieval are gaining popularity since they provide good retrieval results. These schemes are coherent with the theory that humans recognize objects based on their 2D appearances. The viewbased techniques also allow users to search with various queries such as binary images, range images and even 2D sketches. The previous view-based techniques use classical 2D shape descriptors such as Fourier invariants, Zernike moments, Scale Invariant Feature Transform-based local features and 2D Digital Fourier Transform coefficients. These methods describe each object independent of others. In this work, we explore data driven subspace models, such as Principal Component Analysis, Independent Component Analysis and Nonnegative Matrix Factorization to describe the shape information of the views. We treat the depth images obtained from various points of the view sphere as 2D intensity images and train a subspace to extract the inherent structure of the views within a database. We also show the benefit of categorizing shapes according to their eigenvalue spread. Both the shape categorization and data-driven feature set conjectures are tested on the PSB database and compared with the competitor view-based 3D shape retrieval algorithms.
Many-Body Descriptors for Predicting Molecular Properties with Machine Learning: Analysis of Pairwise and Three-Body Interactions in Molecules.

PubMed

Pronobis, Wiktor; Tkatchenko, Alexandre; Müller, Klaus-Robert

2018-06-12

Machine learning (ML) based prediction of molecular properties across chemical compound space is an important and alternative approach to efficiently estimate the solutions of highly complex many-electron problems in chemistry and physics. Statistical methods represent molecules as descriptors that should encode molecular symmetries and interactions between atoms. Many such descriptors have been proposed; all of them have advantages and limitations. Here, we propose a set of general two-body and three-body interaction descriptors which are invariant to translation, rotation, and atomic indexing. By adapting the successfully used kernel ridge regression methods of machine learning, we evaluate our descriptors on predicting several properties of small organic molecules calculated using density-functional theory. We use two data sets. The GDB-7 set contains 6868 molecules with up to 7 heavy atoms of type CNO. The GDB-9 set is composed of 131722 molecules with up to 9 heavy atoms containing CNO. When trained on 5000 random molecules, our best model achieves an accuracy of 0.8 kcal/mol (on the remaining 1868 molecules of GDB-7) and 1.5 kcal/mol (on the remaining 126722 molecules of GDB-9) respectively. Applying a linear regression model on our novel many-body descriptors performs almost equal to a nonlinear kernelized model. Linear models are readily interpretable: a feature importance ranking measure helps to obtain qualitative and quantitative insights on the importance of two- and three-body molecular interactions for predicting molecular properties computed with quantum-mechanical methods.
Blast and ballistic trajectories in combat casualties: a preliminary analysis using a cartesian positioning system with MDCT.

PubMed

Folio, Les R; Fischer, Tatjana; Shogan, Paul; Frew, Michael; Dwyer, Andrew; Provenzale, James M

2011-08-01

The purpose of this study is to determine the agreement with which radiologists identify wound paths in vivo on MDCT and calculate missile trajectories on the basis of Cartesian coordinates using a Cartesian positioning system (CPS). Three radiologists retrospectively identified 25 trajectories on MDCT in 19 casualties who sustained penetrating trauma in Iraq. Trajectories were described qualitatively in terms of directional path descriptors and quantitatively as trajectory vectors. Directional descriptors, trajectory angles, and angles between trajectories were calculated based on Cartesian coordinates of entrance and terminus or exit recorded in x, y image and table space (z) using a Trajectory Calculator created using spreadsheet software. The consistency of qualitative descriptor determinations was assessed in terms of frequency of observer agreement and multirater kappa statistics. Consistency of trajectory vectors was evaluated in terms of distribution of magnitude of the angles between vectors and the differences between their paraaxial and parasagittal angles. In 68% of trajectories, the observers' visual assessment of qualitative descriptors was congruent. Calculated descriptors agreed across observers in 60% of the trajectories. Estimated kappa also showed good agreement (0.65-0.79, p < 0.001); 70% of calculated paraaxial and parasagittal angles were within 20° across observers, and 61.3% of angles between trajectory vectors were within 20° across observers. Results show agreement of visually assessed and calculated qualitative descriptors and trajectory angles among observers. The Trajectory Calculator describes trajectories qualitatively similar to radiologists' visual assessment, showing the potential feasibility of automated trajectory analysis.
Hearing aid fine-tuning based on Dutch descriptions.

PubMed

Thielemans, Thijs; Pans, Donné; Chenault, Michelene; Anteunis, Lucien

2017-07-01

The aim of this study was to derive an independent fitting assistant based on expert consensus. Two questions were asked: (1) what (Dutch) terms do hearing impaired listeners use nowadays to describe their specific hearing aid fitting problems? (2) What is the expert consensus on how to resolve these complaints by adjusting hearing aid parameters? Hearing aid dispensers provided descriptors that impaired listeners use to describe their reactions to specific hearing aid fitting problems. Hearing aid fitting experts were asked "How would you adjust the hearing aid if its user reports that the aid sounds…?" with the blank filled with each of the 40 most frequently mentioned descriptors. 112 hearing aid dispensers and 15 hearing aid experts. The expert solution with the highest weight value was considered the best solution for that descriptor. Principal component analysis (PCA) was performed to identify a factor structure in fitting problems. Nine fitting problems could be identified resulting in an expert-based, hearing aid manufacturer independent, fine-tuning fitting assistant for clinical use. The construction of an expert-based, hearing aid manufacturer independent, fine-tuning fitting assistant to be used as an additional tool in the iterative fitting process is feasible.
An object-based approach for areal rainfall estimation and validation of atmospheric models

NASA Astrophysics Data System (ADS)

Troemel, Silke; Simmer, Clemens

2010-05-01

An object-based approach for areal rainfall estimation is applied to pseudo-radar data simulated of a weatherforecast model as well as to real radar volume data. The method aims at an as fully as possible exploitation of three-dimensional radar signals produced by precipitation generating systems during their lifetime to enhance areal rainfall estimation. Therefore tracking of radar-detected precipitation-centroids is performed and rain events are investigated using so-called Integral Radar Volume Descriptors (IRVD) containing relevant information of the underlying precipitation process. Some investigated descriptors are statistical quantities from the radar reflectivities within the boundary of a tracked rain cell like the area mean reflectivity or the compactness of a cell; others evaluate the mean vertical structure during the tracking period at the near surface reflectivity-weighted center of the cell like the mean effective efficiency or the mean echo top height. The stage of evolution of a system is given by the trend in the brightband fraction or related quantities. Furthermore, two descriptors not directly derived from radar data are considered: the mean wind shear and an orographic rainfall amplifier. While in case of pseudo-radar data a model based on a small set of IRVDs alone provides rainfall estimates of high accuracy, the application of such a model to the real world remains within the accuracies achievable with a constant Z-R-relationship. However, a combined model based on single IRVDs and the Marshall-Palmer Z-R-estimator already provides considerable enhancements even though the resolution of the data base used has room for improvement. The mean echo top height, the mean effective efficiency, the empirical standard deviation and the Marshall-Palmer estimator are detected for the final rainfall estimator. High correlations between storm height and rain rates, a shift of the probability distribution to higher values with increasing effective efficiency, and the possibility to classify continental and maritime systems using the effective efficiency confirm the informative value of the qualified descriptors. The IRVDs especially correct for the underestimation in case of intense rain events, and the information content of descriptors is most likely higher than demonstrated so far. We used quite sparse information about meteorological variables needed for the calculation of some IRVDs from single radiosoundings, and several descriptors suffered from the range-dependent vertical resolution of the reflectivity profile. Inclusion of neighbouring radars and assimilation runs of weather forecasting models will further enhance the accuracy of rainfall estimates. Finally, the clear difference between the IRVD selection from the pseudo-radar data and from the real world data hint to a new object-based avenue for the validation of higher resolution atmospheric models and for evaluating their potential to digest radar observations in data assimilation schemes.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Plessow, Philipp N.; Bajdich, Michal; Greene, Joshua

The formation of thin oxide films on metal supports is an important phenomenon, especially in the context of strong metal support interaction (SMSI). Computational predictions of the stability of these films are hampered by their structural complexity and a varying lattice mismatch with different supports. In this study, we report a large combination of supports and ultrathin oxide films studied with density functional theory (DFT). Trends in stability are investigated through a descriptor-based analysis. Since the studied films are bound to the support exclusively through metal–metal interaction, the adsorption energy of the oxide-constituting metal atom can be expected to bemore » a reasonable descriptor for the stability of the overlayers. If the same supercell is used for all supports, the overlayers experience different amounts of stress. Using supercells with small lattice mismatch for each system leads to significantly improved scaling relations for the stability of the overlayers. Finally, this approach works well for the studied systems and therefore allows the descriptor-based exploration of the thermodynamic stability of supported thin oxide layers.« less
Real-time action recognition using a multilayer descriptor with variable size

NASA Astrophysics Data System (ADS)

Alcantara, Marlon F.; Moreira, Thierry P.; Pedrini, Helio

2016-01-01

Video analysis technology has become less expensive and more powerful in terms of storage resources and resolution capacity, promoting progress in a wide range of applications. Video-based human action detection has been used for several tasks in surveillance environments, such as forensic investigation, patient monitoring, medical training, accident prevention, and traffic monitoring, among others. We present a method for action identification based on adaptive training of a multilayer descriptor applied to a single classifier. Cumulative motion shapes (CMSs) are extracted according to the number of frames present in the video. Each CMS is employed as a self-sufficient layer in the training stage but belongs to the same descriptor. A robust classification is achieved through individual responses of classifiers for each layer, and the dominant result is used as a final outcome. Experiments are conducted on five public datasets (Weizmann, KTH, MuHAVi, IXMAS, and URADL) to demonstrate the effectiveness of the method in terms of accuracy in real time.
Self-pacing direct memory access data transfer operations for compute nodes in a parallel computer

DOEpatents

Blocksome, Michael A

2015-02-17

Methods, apparatus, and products are disclosed for self-pacing DMA data transfer operations for nodes in a parallel computer that include: transferring, by an origin DMA on an origin node, a RTS message to a target node, the RTS message specifying an message on the origin node for transfer to the target node; receiving, in an origin injection FIFO for the origin DMA from a target DMA on the target node in response to transferring the RTS message, a target RGET descriptor followed by a DMA transfer operation descriptor, the DMA descriptor for transmitting a message portion to the target node, the target RGET descriptor specifying an origin RGET descriptor on the origin node that specifies an additional DMA descriptor for transmitting an additional message portion to the target node; processing, by the origin DMA, the target RGET descriptor; and processing, by the origin DMA, the DMA transfer operation descriptor.
Visual analytics for semantic queries of TerraSAR-X image content

NASA Astrophysics Data System (ADS)

Espinoza-Molina, Daniela; Alonso, Kevin; Datcu, Mihai

2015-10-01

With the continuous image product acquisition of satellite missions, the size of the image archives is considerably increasing every day as well as the variety and complexity of their content, surpassing the end-user capacity to analyse and exploit them. Advances in the image retrieval field have contributed to the development of tools for interactive exploration and extraction of the images from huge archives using different parameters like metadata, key-words, and basic image descriptors. Even though we count on more powerful tools for automated image retrieval and data analysis, we still face the problem of understanding and analyzing the results. Thus, a systematic computational analysis of these results is required in order to provide to the end-user a summary of the archive content in comprehensible terms. In this context, visual analytics combines automated analysis with interactive visualizations analysis techniques for an effective understanding, reasoning and decision making on the basis of very large and complex datasets. Moreover, currently several researches are focused on associating the content of the images with semantic definitions for describing the data in a format to be easily understood by the end-user. In this paper, we present our approach for computing visual analytics and semantically querying the TerraSAR-X archive. Our approach is mainly composed of four steps: 1) the generation of a data model that explains the information contained in a TerraSAR-X product. The model is formed by primitive descriptors and metadata entries, 2) the storage of this model in a database system, 3) the semantic definition of the image content based on machine learning algorithms and relevance feedback, and 4) querying the image archive using semantic descriptors as query parameters and computing the statistical analysis of the query results. The experimental results shows that with the help of visual analytics and semantic definitions we are able to explain the image content using semantic terms and the relations between them answering questions such as what is the percentage of urban area in a region? or what is the distribution of water bodies in a city?
High-Throughput Design of Two-Dimensional Electron Gas Systems Based on Polar/Nonpolar Perovskite Oxide Heterostructures

PubMed Central

Yang, Kesong; Nazir, Safdar; Behtash, Maziar; Cheng, Jianli

2016-01-01

The two-dimensional electron gas (2DEG) formed at the interface between two insulating oxides such as LaAlO3 and SrTiO3 (STO) is of fundamental and practical interest because of its novel interfacial conductivity and its promising applications in next-generation nanoelectronic devices. Here we show that a group of combinatorial descriptors that characterize the polar character, lattice mismatch, band gap, and the band alignment between the perovskite-oxide-based band insulators and the STO substrate, can be introduced to realize a high-throughput (HT) design of SrTiO3-based 2DEG systems from perovskite oxide quantum database. Equipped with these combinatorial descriptors, we have carried out a HT screening of all the polar perovskite compounds, uncovering 42 compounds of potential interests. Of these, Al-, Ga-, Sc-, and Ta-based compounds can form a 2DEG with STO, while In-based compounds exhibit a strain-induced strong polarization when deposited on STO substrate. In particular, the Ta-based compounds can form 2DEG with potentially high electron mobility at (TaO2)+/(SrO)0 interface. Our approach, by defining materials descriptors solely based on the bulk materials properties, and by relying on the perovskite-oriented quantum materials repository, opens new avenues for the discovery of perovskite-oxide-based functional interface materials in a HT fashion. PMID:27708415
Physicochemical descriptors of aromatic character and their use in drug discovery.

PubMed

Ritchie, Timothy J; Macdonald, Simon J F

2014-09-11

Published physicochemical descriptors of molecules that convey aromaticity-related character are reviewed in the context of drug design and discovery. Studies that have employed aromatic descriptors are discussed, and several descriptors are compared and contrasted.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.