Sample records for kernel learning mkl

  1. Efficient Multiple Kernel Learning Algorithms Using Low-Rank Representation.

    PubMed

    Niu, Wenjia; Xia, Kewen; Zu, Baokai; Bai, Jianchuan

    2017-01-01

    Unlike Support Vector Machine (SVM), Multiple Kernel Learning (MKL) allows datasets to be free to choose the useful kernels based on their distribution characteristics rather than a precise one. It has been shown in the literature that MKL holds superior recognition accuracy compared with SVM, however, at the expense of time consuming computations. This creates analytical and computational difficulties in solving MKL algorithms. To overcome this issue, we first develop a novel kernel approximation approach for MKL and then propose an efficient Low-Rank MKL (LR-MKL) algorithm by using the Low-Rank Representation (LRR). It is well-acknowledged that LRR can reduce dimension while retaining the data features under a global low-rank constraint. Furthermore, we redesign the binary-class MKL as the multiclass MKL based on pairwise strategy. Finally, the recognition effect and efficiency of LR-MKL are verified on the datasets Yale, ORL, LSVT, and Digit. Experimental results show that the proposed LR-MKL algorithm is an efficient kernel weights allocation method in MKL and boosts the performance of MKL largely.

  2. Insights from Classifying Visual Concepts with Multiple Kernel Learning

    PubMed Central

    Binder, Alexander; Nakajima, Shinichi; Kloft, Marius; Müller, Christina; Samek, Wojciech; Brefeld, Ulf; Müller, Klaus-Robert; Kawanabe, Motoaki

    2012-01-01

    Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques allow to determine an optimal linear combination of such similarity matrices. Classical approaches to MKL promote sparse mixtures. Unfortunately, 1-norm regularized MKL variants are often observed to be outperformed by an unweighted sum kernel. The main contributions of this paper are the following: we apply a recently developed non-sparse MKL variant to state-of-the-art concept recognition tasks from the application domain of computer vision. We provide insights on benefits and limits of non-sparse MKL and compare it against its direct competitors, the sum-kernel SVM and sparse MKL. We report empirical results for the PASCAL VOC 2009 Classification and ImageCLEF2010 Photo Annotation challenge data sets. Data sets (kernel matrices) as well as further information are available at http://doc.ml.tu-berlin.de/image_mkl/(Accessed 2012 Jun 25). PMID:22936970

  3. Using the Intel Math Kernel Library on Peregrine | High-Performance

    Science.gov Websites

    Computing | NREL the Intel Math Kernel Library on Peregrine Using the Intel Math Kernel Library on Peregrine Learn how to use the Intel Math Kernel Library (MKL) with Peregrine system software. MKL architectures. Core math functions in MKL include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier

  4. L2-norm multiple kernel learning and its application to biomedical data fusion

    PubMed Central

    2010-01-01

    Background This paper introduces the notion of optimizing different norms in the dual problem of support vector machines with multiple kernels. The selection of norms yields different extensions of multiple kernel learning (MKL) such as L∞, L1, and L2 MKL. In particular, L2 MKL is a novel method that leads to non-sparse optimal kernel coefficients, which is different from the sparse kernel coefficients optimized by the existing L∞ MKL method. In real biomedical applications, L2 MKL may have more advantages over sparse integration method for thoroughly combining complementary information in heterogeneous data sources. Results We provide a theoretical analysis of the relationship between the L2 optimization of kernels in the dual problem with the L2 coefficient regularization in the primal problem. Understanding the dual L2 problem grants a unified view on MKL and enables us to extend the L2 method to a wide range of machine learning problems. We implement L2 MKL for ranking and classification problems and compare its performance with the sparse L∞ and the averaging L1 MKL methods. The experiments are carried out on six real biomedical data sets and two large scale UCI data sets. L2 MKL yields better performance on most of the benchmark data sets. In particular, we propose a novel L2 MKL least squares support vector machine (LSSVM) algorithm, which is shown to be an efficient and promising classifier for large scale data sets processing. Conclusions This paper extends the statistical framework of genomic data fusion based on MKL. Allowing non-sparse weights on the data sources is an attractive option in settings where we believe most data sources to be relevant to the problem at hand and want to avoid a "winner-takes-all" effect seen in L∞ MKL, which can be detrimental to the performance in prospective studies. The notion of optimizing L2 kernels can be straightforwardly extended to ranking, classification, regression, and clustering algorithms. To tackle the computational burden of MKL, this paper proposes several novel LSSVM based MKL algorithms. Systematic comparison on real data sets shows that LSSVM MKL has comparable performance as the conventional SVM MKL algorithms. Moreover, large scale numerical experiments indicate that when cast as semi-infinite programming, LSSVM MKL can be solved more efficiently than SVM MKL. Availability The MATLAB code of algorithms implemented in this paper is downloadable from http://homes.esat.kuleuven.be/~sistawww/bioi/syu/l2lssvm.html. PMID:20529363

  5. Reduced multiple empirical kernel learning machine.

    PubMed

    Wang, Zhe; Lu, MingZhe; Gao, Daqi

    2015-02-01

    Multiple kernel learning (MKL) is demonstrated to be flexible and effective in depicting heterogeneous data sources since MKL can introduce multiple kernels rather than a single fixed kernel into applications. However, MKL would get a high time and space complexity in contrast to single kernel learning, which is not expected in real-world applications. Meanwhile, it is known that the kernel mapping ways of MKL generally have two forms including implicit kernel mapping and empirical kernel mapping (EKM), where the latter is less attracted. In this paper, we focus on the MKL with the EKM, and propose a reduced multiple empirical kernel learning machine named RMEKLM for short. To the best of our knowledge, it is the first to reduce both time and space complexity of the MKL with EKM. Different from the existing MKL, the proposed RMEKLM adopts the Gauss Elimination technique to extract a set of feature vectors, which is validated that doing so does not lose much information of the original feature space. Then RMEKLM adopts the extracted feature vectors to span a reduced orthonormal subspace of the feature space, which is visualized in terms of the geometry structure. It can be demonstrated that the spanned subspace is isomorphic to the original feature space, which means that the dot product of two vectors in the original feature space is equal to that of the two corresponding vectors in the generated orthonormal subspace. More importantly, the proposed RMEKLM brings a simpler computation and meanwhile needs a less storage space, especially in the processing of testing. Finally, the experimental results show that RMEKLM owns a much efficient and effective performance in terms of both complexity and classification. The contributions of this paper can be given as follows: (1) by mapping the input space into an orthonormal subspace, the geometry of the generated subspace is visualized; (2) this paper first reduces both the time and space complexity of the EKM-based MKL; (3) this paper adopts the Gauss Elimination, one of the on-the-shelf techniques, to generate a basis of the original feature space, which is stable and efficient.

  6. Generalized multiple kernel learning with data-dependent priors.

    PubMed

    Mao, Qi; Tsang, Ivor W; Gao, Shenghua; Wang, Li

    2015-06-01

    Multiple kernel learning (MKL) and classifier ensemble are two mainstream methods for solving learning problems in which some sets of features/views are more informative than others, or the features/views within a given set are inconsistent. In this paper, we first present a novel probabilistic interpretation of MKL such that maximum entropy discrimination with a noninformative prior over multiple views is equivalent to the formulation of MKL. Instead of using the noninformative prior, we introduce a novel data-dependent prior based on an ensemble of kernel predictors, which enhances the prediction performance of MKL by leveraging the merits of the classifier ensemble. With the proposed probabilistic framework of MKL, we propose a hierarchical Bayesian model to learn the proposed data-dependent prior and classification model simultaneously. The resultant problem is convex and other information (e.g., instances with either missing views or missing labels) can be seamlessly incorporated into the data-dependent priors. Furthermore, a variety of existing MKL models can be recovered under the proposed MKL framework and can be readily extended to incorporate these priors. Extensive experiments demonstrate the benefits of our proposed framework in supervised and semisupervised settings, as well as in tasks with partial correspondence among multiple views.

  7. Evaluation of Multiple Kernel Learning Algorithms for Crop Mapping Using Satellite Image Time-Series Data

    NASA Astrophysics Data System (ADS)

    Niazmardi, S.; Safari, A.; Homayouni, S.

    2017-09-01

    Crop mapping through classification of Satellite Image Time-Series (SITS) data can provide very valuable information for several agricultural applications, such as crop monitoring, yield estimation, and crop inventory. However, the SITS data classification is not straightforward. Because different images of a SITS data have different levels of information regarding the classification problems. Moreover, the SITS data is a four-dimensional data that cannot be classified using the conventional classification algorithms. To address these issues in this paper, we presented a classification strategy based on Multiple Kernel Learning (MKL) algorithms for SITS data classification. In this strategy, initially different kernels are constructed from different images of the SITS data and then they are combined into a composite kernel using the MKL algorithms. The composite kernel, once constructed, can be used for the classification of the data using the kernel-based classification algorithms. We compared the computational time and the classification performances of the proposed classification strategy using different MKL algorithms for the purpose of crop mapping. The considered MKL algorithms are: MKL-Sum, SimpleMKL, LPMKL and Group-Lasso MKL algorithms. The experimental tests of the proposed strategy on two SITS data sets, acquired by SPOT satellite sensors, showed that this strategy was able to provide better performances when compared to the standard classification algorithm. The results also showed that the optimization method of the used MKL algorithms affects both the computational time and classification accuracy of this strategy.

  8. A trace ratio maximization approach to multiple kernel-based dimensionality reduction.

    PubMed

    Jiang, Wenhao; Chung, Fu-lai

    2014-01-01

    Most dimensionality reduction techniques are based on one metric or one kernel, hence it is necessary to select an appropriate kernel for kernel-based dimensionality reduction. Multiple kernel learning for dimensionality reduction (MKL-DR) has been recently proposed to learn a kernel from a set of base kernels which are seen as different descriptions of data. As MKL-DR does not involve regularization, it might be ill-posed under some conditions and consequently its applications are hindered. This paper proposes a multiple kernel learning framework for dimensionality reduction based on regularized trace ratio, termed as MKL-TR. Our method aims at learning a transformation into a space of lower dimension and a corresponding kernel from the given base kernels among which some may not be suitable for the given data. The solutions for the proposed framework can be found based on trace ratio maximization. The experimental results demonstrate its effectiveness in benchmark datasets, which include text, image and sound datasets, for supervised, unsupervised as well as semi-supervised settings. Copyright © 2013 Elsevier Ltd. All rights reserved.

  9. Integrated model of multiple kernel learning and differential evolution for EUR/USD trading.

    PubMed

    Deng, Shangkun; Sakurai, Akito

    2014-01-01

    Currency trading is an important area for individual investors, government policy decisions, and organization investments. In this study, we propose a hybrid approach referred to as MKL-DE, which combines multiple kernel learning (MKL) with differential evolution (DE) for trading a currency pair. MKL is used to learn a model that predicts changes in the target currency pair, whereas DE is used to generate the buy and sell signals for the target currency pair based on the relative strength index (RSI), while it is also combined with MKL as a trading signal. The new hybrid implementation is applied to EUR/USD trading, which is the most traded foreign exchange (FX) currency pair. MKL is essential for utilizing information from multiple information sources and DE is essential for formulating a trading rule based on a mixture of discrete structures and continuous parameters. Initially, the prediction model optimized by MKL predicts the returns based on a technical indicator called the moving average convergence and divergence. Next, a combined trading signal is optimized by DE using the inputs from the prediction model and technical indicator RSI obtained from multiple timeframes. The experimental results showed that trading using the prediction learned by MKL yielded consistent profits.

  10. Integrated Model of Multiple Kernel Learning and Differential Evolution for EUR/USD Trading

    PubMed Central

    Deng, Shangkun; Sakurai, Akito

    2014-01-01

    Currency trading is an important area for individual investors, government policy decisions, and organization investments. In this study, we propose a hybrid approach referred to as MKL-DE, which combines multiple kernel learning (MKL) with differential evolution (DE) for trading a currency pair. MKL is used to learn a model that predicts changes in the target currency pair, whereas DE is used to generate the buy and sell signals for the target currency pair based on the relative strength index (RSI), while it is also combined with MKL as a trading signal. The new hybrid implementation is applied to EUR/USD trading, which is the most traded foreign exchange (FX) currency pair. MKL is essential for utilizing information from multiple information sources and DE is essential for formulating a trading rule based on a mixture of discrete structures and continuous parameters. Initially, the prediction model optimized by MKL predicts the returns based on a technical indicator called the moving average convergence and divergence. Next, a combined trading signal is optimized by DE using the inputs from the prediction model and technical indicator RSI obtained from multiple timeframes. The experimental results showed that trading using the prediction learned by MKL yielded consistent profits. PMID:25097891

  11. Multiple kernel learning using single stage function approximation for binary classification problems

    NASA Astrophysics Data System (ADS)

    Shiju, S.; Sumitra, S.

    2017-12-01

    In this paper, the multiple kernel learning (MKL) is formulated as a supervised classification problem. We dealt with binary classification data and hence the data modelling problem involves the computation of two decision boundaries of which one related with that of kernel learning and the other with that of input data. In our approach, they are found with the aid of a single cost function by constructing a global reproducing kernel Hilbert space (RKHS) as the direct sum of the RKHSs corresponding to the decision boundaries of kernel learning and input data and searching that function from the global RKHS, which can be represented as the direct sum of the decision boundaries under consideration. In our experimental analysis, the proposed model had shown superior performance in comparison with that of existing two stage function approximation formulation of MKL, where the decision functions of kernel learning and input data are found separately using two different cost functions. This is due to the fact that single stage representation helps the knowledge transfer between the computation procedures for finding the decision boundaries of kernel learning and input data, which inturn boosts the generalisation capacity of the model.

  12. Drug related webpages classification using images and text information based on multi-kernel learning

    NASA Astrophysics Data System (ADS)

    Hu, Ruiguang; Xiao, Liping; Zheng, Wenjuan

    2015-12-01

    In this paper, multi-kernel learning(MKL) is used for drug-related webpages classification. First, body text and image-label text are extracted through HTML parsing, and valid images are chosen by the FOCARSS algorithm. Second, text based BOW model is used to generate text representation, and image-based BOW model is used to generate images representation. Last, text and images representation are fused with a few methods. Experimental results demonstrate that the classification accuracy of MKL is higher than those of all other fusion methods in decision level and feature level, and much higher than the accuracy of single-modal classification.

  13. Pulmonary Nodule Recognition Based on Multiple Kernel Learning Support Vector Machine-PSO

    PubMed Central

    Zhu, Zhichuan; Zhao, Qingdong; Liu, Liwei; Zhang, Lijuan

    2018-01-01

    Pulmonary nodule recognition is the core module of lung CAD. The Support Vector Machine (SVM) algorithm has been widely used in pulmonary nodule recognition, and the algorithm of Multiple Kernel Learning Support Vector Machine (MKL-SVM) has achieved good results therein. Based on grid search, however, the MKL-SVM algorithm needs long optimization time in course of parameter optimization; also its identification accuracy depends on the fineness of grid. In the paper, swarm intelligence is introduced and the Particle Swarm Optimization (PSO) is combined with MKL-SVM algorithm to be MKL-SVM-PSO algorithm so as to realize global optimization of parameters rapidly. In order to obtain the global optimal solution, different inertia weights such as constant inertia weight, linear inertia weight, and nonlinear inertia weight are applied to pulmonary nodules recognition. The experimental results show that the model training time of the proposed MKL-SVM-PSO algorithm is only 1/7 of the training time of the MKL-SVM grid search algorithm, achieving better recognition effect. Moreover, Euclidean norm of normalized error vector is proposed to measure the proximity between the average fitness curve and the optimal fitness curve after convergence. Through statistical analysis of the average of 20 times operation results with different inertial weights, it can be seen that the dynamic inertial weight is superior to the constant inertia weight in the MKL-SVM-PSO algorithm. In the dynamic inertial weight algorithm, the parameter optimization time of nonlinear inertia weight is shorter; the average fitness value after convergence is much closer to the optimal fitness value, which is better than the linear inertial weight. Besides, a better nonlinear inertial weight is verified. PMID:29853983

  14. Pulmonary Nodule Recognition Based on Multiple Kernel Learning Support Vector Machine-PSO.

    PubMed

    Li, Yang; Zhu, Zhichuan; Hou, Alin; Zhao, Qingdong; Liu, Liwei; Zhang, Lijuan

    2018-01-01

    Pulmonary nodule recognition is the core module of lung CAD. The Support Vector Machine (SVM) algorithm has been widely used in pulmonary nodule recognition, and the algorithm of Multiple Kernel Learning Support Vector Machine (MKL-SVM) has achieved good results therein. Based on grid search, however, the MKL-SVM algorithm needs long optimization time in course of parameter optimization; also its identification accuracy depends on the fineness of grid. In the paper, swarm intelligence is introduced and the Particle Swarm Optimization (PSO) is combined with MKL-SVM algorithm to be MKL-SVM-PSO algorithm so as to realize global optimization of parameters rapidly. In order to obtain the global optimal solution, different inertia weights such as constant inertia weight, linear inertia weight, and nonlinear inertia weight are applied to pulmonary nodules recognition. The experimental results show that the model training time of the proposed MKL-SVM-PSO algorithm is only 1/7 of the training time of the MKL-SVM grid search algorithm, achieving better recognition effect. Moreover, Euclidean norm of normalized error vector is proposed to measure the proximity between the average fitness curve and the optimal fitness curve after convergence. Through statistical analysis of the average of 20 times operation results with different inertial weights, it can be seen that the dynamic inertial weight is superior to the constant inertia weight in the MKL-SVM-PSO algorithm. In the dynamic inertial weight algorithm, the parameter optimization time of nonlinear inertia weight is shorter; the average fitness value after convergence is much closer to the optimal fitness value, which is better than the linear inertial weight. Besides, a better nonlinear inertial weight is verified.

  15. Multiple Kernel Sparse Representation based Orthogonal Discriminative Projection and Its Cost-Sensitive Extension.

    PubMed

    Zhang, Guoqing; Sun, Huaijiang; Xia, Guiyu; Sun, Quansen

    2016-07-07

    Sparse representation based classification (SRC) has been developed and shown great potential for real-world application. Based on SRC, Yang et al. [10] devised a SRC steered discriminative projection (SRC-DP) method. However, as a linear algorithm, SRC-DP cannot handle the data with highly nonlinear distribution. Kernel sparse representation-based classifier (KSRC) is a non-linear extension of SRC and can remedy the drawback of SRC. KSRC requires the use of a predetermined kernel function and selection of the kernel function and its parameters is difficult. Recently, multiple kernel learning for SRC (MKL-SRC) [22] has been proposed to learn a kernel from a set of base kernels. However, MKL-SRC only considers the within-class reconstruction residual while ignoring the between-class relationship, when learning the kernel weights. In this paper, we propose a novel multiple kernel sparse representation-based classifier (MKSRC), and then we use it as a criterion to design a multiple kernel sparse representation based orthogonal discriminative projection method (MK-SR-ODP). The proposed algorithm aims at learning a projection matrix and a corresponding kernel from the given base kernels such that in the low dimension subspace the between-class reconstruction residual is maximized and the within-class reconstruction residual is minimized. Furthermore, to achieve a minimum overall loss by performing recognition in the learned low-dimensional subspace, we introduce cost information into the dimensionality reduction method. The solutions for the proposed method can be efficiently found based on trace ratio optimization method [33]. Extensive experimental results demonstrate the superiority of the proposed algorithm when compared with the state-of-the-art methods.

  16. Lesion classification using clinical and visual data fusion by multiple kernel learning

    NASA Astrophysics Data System (ADS)

    Kisilev, Pavel; Hashoul, Sharbell; Walach, Eugene; Tzadok, Asaf

    2014-03-01

    To overcome operator dependency and to increase diagnosis accuracy in breast ultrasound (US), a lot of effort has been devoted to developing computer-aided diagnosis (CAD) systems for breast cancer detection and classification. Unfortunately, the efficacy of such CAD systems is limited since they rely on correct automatic lesions detection and localization, and on robustness of features computed based on the detected areas. In this paper we propose a new approach to boost the performance of a Machine Learning based CAD system, by combining visual and clinical data from patient files. We compute a set of visual features from breast ultrasound images, and construct the textual descriptor of patients by extracting relevant keywords from patients' clinical data files. We then use the Multiple Kernel Learning (MKL) framework to train SVM based classifier to discriminate between benign and malignant cases. We investigate different types of data fusion methods, namely, early, late, and intermediate (MKL-based) fusion. Our database consists of 408 patient cases, each containing US images, textual description of complaints and symptoms filled by physicians, and confirmed diagnoses. We show experimentally that the proposed MKL-based approach is superior to other classification methods. Even though the clinical data is very sparse and noisy, its MKL-based fusion with visual features yields significant improvement of the classification accuracy, as compared to the image features only based classifier.

  17. Selecting the most relevant brain regions to discriminate Alzheimer's disease patients from healthy controls using multiple kernel learning: A comparison across functional and structural imaging modalities and atlases.

    PubMed

    Rondina, Jane Maryam; Ferreira, Luiz Kobuti; de Souza Duran, Fabio Luis; Kubo, Rodrigo; Ono, Carla Rachel; Leite, Claudia Costa; Smid, Jerusa; Nitrini, Ricardo; Buchpiguel, Carlos Alberto; Busatto, Geraldo F

    2018-01-01

    Machine learning techniques such as support vector machine (SVM) have been applied recently in order to accurately classify individuals with neuropsychiatric disorders such as Alzheimer's disease (AD) based on neuroimaging data. However, the multivariate nature of the SVM approach often precludes the identification of the brain regions that contribute most to classification accuracy. Multiple kernel learning (MKL) is a sparse machine learning method that allows the identification of the most relevant sources for the classification. By parcelating the brain into regions of interest (ROI) it is possible to use each ROI as a source to MKL (ROI-MKL). We applied MKL to multimodal neuroimaging data in order to: 1) compare the diagnostic performance of ROI-MKL and whole-brain SVM in discriminating patients with AD from demographically matched healthy controls and 2) identify the most relevant brain regions to the classification. We used two atlases (AAL and Brodmann's) to parcelate the brain into ROIs and applied ROI-MKL to structural (T1) MRI, 18 F-FDG-PET and regional cerebral blood flow SPECT (rCBF-SPECT) data acquired from the same subjects (20 patients with early AD and 18 controls). In ROI-MKL, each ROI received a weight (ROI-weight) that indicated the region's relevance to the classification. For each ROI, we also calculated whether there was a predominance of voxels indicating decreased or increased regional activity (for 18 F-FDG-PET and rCBF-SPECT) or volume (for T1-MRI) in AD patients. Compared to whole-brain SVM, the ROI-MKL approach resulted in better accuracies (with either atlas) for classification using 18 F-FDG-PET (92.5% accuracy for ROI-MKL versus 84% for whole-brain), but not when using rCBF-SPECT or T1-MRI. Although several cortical and subcortical regions contributed to discrimination, high ROI-weights and predominance of hypometabolism and atrophy were identified specially in medial parietal and temporo-limbic cortical regions. Also, the weight of discrimination due to a pattern of increased voxel-weight values in AD individuals was surprisingly high (ranging from approximately 20% to 40% depending on the imaging modality), located mainly in primary sensorimotor and visual cortices and subcortical nuclei. The MKL-ROI approach highlights the high discriminative weight of a subset of brain regions of known relevance to AD, the selection of which contributes to increased classification accuracy when applied to 18 F-FDG-PET data. Moreover, the MKL-ROI approach demonstrates that brain regions typically spared in mild stages of AD also contribute substantially in the individual discrimination of AD patients from controls.

  18. An algorithm of improving speech emotional perception for hearing aid

    NASA Astrophysics Data System (ADS)

    Xi, Ji; Liang, Ruiyu; Fei, Xianju

    2017-07-01

    In this paper, a speech emotion recognition (SER) algorithm was proposed to improve the emotional perception of hearing-impaired people. The algorithm utilizes multiple kernel technology to overcome the drawback of SVM: slow training speed. Firstly, in order to improve the adaptive performance of Gaussian Radial Basis Function (RBF), the parameter determining the nonlinear mapping was optimized on the basis of Kernel target alignment. Then, the obtained Kernel Function was used as the basis kernel of Multiple Kernel Learning (MKL) with slack variable that could solve the over-fitting problem. However, the slack variable also brings the error into the result. Therefore, a soft-margin MKL was proposed to balance the margin against the error. Moreover, the relatively iterative algorithm was used to solve the combination coefficients and hyper-plane equations. Experimental results show that the proposed algorithm can acquire an accuracy of 90% for five kinds of emotions including happiness, sadness, anger, fear and neutral. Compared with KPCA+CCA and PIM-FSVM, the proposed algorithm has the highest accuracy.

  19. Localized Multiple Kernel Learning A Convex Approach

    DTIC Science & Technology

    2016-11-22

    data. All the aforementioned approaches to localized MKL are formulated in terms of non-convex optimization problems, and deep the- oretical...learning. IEEE Transactions on Neural Networks, 22(3):433–446, 2011. Jingjing Yang, Yuanning Li, Yonghong Tian, Lingyu Duan, and Wen Gao. Group-sensitive

  20. Nonlinear Deep Kernel Learning for Image Annotation.

    PubMed

    Jiu, Mingyuan; Sahbi, Hichem

    2017-02-08

    Multiple kernel learning (MKL) is a widely used technique for kernel design. Its principle consists in learning, for a given support vector classifier, the most suitable convex (or sparse) linear combination of standard elementary kernels. However, these combinations are shallow and often powerless to capture the actual similarity between highly semantic data, especially for challenging classification tasks such as image annotation. In this paper, we redefine multiple kernels using deep multi-layer networks. In this new contribution, a deep multiple kernel is recursively defined as a multi-layered combination of nonlinear activation functions, each one involves a combination of several elementary or intermediate kernels, and results into a positive semi-definite deep kernel. We propose four different frameworks in order to learn the weights of these networks: supervised, unsupervised, kernel-based semisupervised and Laplacian-based semi-supervised. When plugged into support vector machines (SVMs), the resulting deep kernel networks show clear gain, compared to several shallow kernels for the task of image annotation. Extensive experiments and analysis on the challenging ImageCLEF photo annotation benchmark, the COREL5k database and the Banana dataset validate the effectiveness of the proposed method.

  1. Approach to explosive hazard detection using sensor fusion and multiple kernel learning with downward-looking GPR and EMI sensor data

    NASA Astrophysics Data System (ADS)

    Pinar, Anthony; Masarik, Matthew; Havens, Timothy C.; Burns, Joseph; Thelen, Brian; Becker, John

    2015-05-01

    This paper explores the effectiveness of an anomaly detection algorithm for downward-looking ground penetrating radar (GPR) and electromagnetic inductance (EMI) data. Threat detection with GPR is challenged by high responses to non-target/clutter objects, leading to a large number of false alarms (FAs), and since the responses of target and clutter signatures are so similar, classifier design is not trivial. We suggest a method based on a Run Packing (RP) algorithm to fuse GPR and EMI data into a composite confidence map to improve detection as measured by the area-under-ROC (NAUC) metric. We examine the value of a multiple kernel learning (MKL) support vector machine (SVM) classifier using image features such as histogram of oriented gradients (HOG), local binary patterns (LBP), and local statistics. Experimental results on government furnished data show that use of our proposed fusion and classification methods improves the NAUC when compared with the results from individual sensors and a single kernel SVM classifier.

  2. Stock price change rate prediction by utilizing social network activities.

    PubMed

    Deng, Shangkun; Mitsubuchi, Takashi; Sakurai, Akito

    2014-01-01

    Predicting stock price change rates for providing valuable information to investors is a challenging task. Individual participants may express their opinions in social network service (SNS) before or after their transactions in the market; we hypothesize that stock price change rate is better predicted by a function of social network service activities and technical indicators than by a function of just stock market activities. The hypothesis is tested by accuracy of predictions as well as performance of simulated trading because success or failure of prediction is better measured by profits or losses the investors gain or suffer. In this paper, we propose a hybrid model that combines multiple kernel learning (MKL) and genetic algorithm (GA). MKL is adopted to optimize the stock price change rate prediction models that are expressed in a multiple kernel linear function of different types of features extracted from different sources. GA is used to optimize the trading rules used in the simulated trading by fusing the return predictions and values of three well-known overbought and oversold technical indicators. Accumulated return and Sharpe ratio were used to test the goodness of performance of the simulated trading. Experimental results show that our proposed model performed better than other models including ones using state of the art techniques.

  3. Stock Price Change Rate Prediction by Utilizing Social Network Activities

    PubMed Central

    Mitsubuchi, Takashi; Sakurai, Akito

    2014-01-01

    Predicting stock price change rates for providing valuable information to investors is a challenging task. Individual participants may express their opinions in social network service (SNS) before or after their transactions in the market; we hypothesize that stock price change rate is better predicted by a function of social network service activities and technical indicators than by a function of just stock market activities. The hypothesis is tested by accuracy of predictions as well as performance of simulated trading because success or failure of prediction is better measured by profits or losses the investors gain or suffer. In this paper, we propose a hybrid model that combines multiple kernel learning (MKL) and genetic algorithm (GA). MKL is adopted to optimize the stock price change rate prediction models that are expressed in a multiple kernel linear function of different types of features extracted from different sources. GA is used to optimize the trading rules used in the simulated trading by fusing the return predictions and values of three well-known overbought and oversold technical indicators. Accumulated return and Sharpe ratio were used to test the goodness of performance of the simulated trading. Experimental results show that our proposed model performed better than other models including ones using state of the art techniques. PMID:24790586

  4. Integration of heterogeneous features for remote sensing scene classification

    NASA Astrophysics Data System (ADS)

    Wang, Xin; Xiong, Xingnan; Ning, Chen; Shi, Aiye; Lv, Guofang

    2018-01-01

    Scene classification is one of the most important issues in remote sensing (RS) image processing. We find that features from different channels (shape, spectral, texture, etc.), levels (low-level and middle-level), or perspectives (local and global) could provide various properties for RS images, and then propose a heterogeneous feature framework to extract and integrate heterogeneous features with different types for RS scene classification. The proposed method is composed of three modules (1) heterogeneous features extraction, where three heterogeneous feature types, called DS-SURF-LLC, mean-Std-LLC, and MS-CLBP, are calculated, (2) heterogeneous features fusion, where the multiple kernel learning (MKL) is utilized to integrate the heterogeneous features, and (3) an MKL support vector machine classifier for RS scene classification. The proposed method is extensively evaluated on three challenging benchmark datasets (a 6-class dataset, a 12-class dataset, and a 21-class dataset), and the experimental results show that the proposed method leads to good classification performance. It produces good informative features to describe the RS image scenes. Moreover, the integration of heterogeneous features outperforms some state-of-the-art features on RS scene classification tasks.

  5. Automatic plankton image classification combining multiple view features via multiple kernel learning.

    PubMed

    Zheng, Haiyong; Wang, Ruchen; Yu, Zhibin; Wang, Nan; Gu, Zhaorui; Zheng, Bing

    2017-12-28

    Plankton, including phytoplankton and zooplankton, are the main source of food for organisms in the ocean and form the base of marine food chain. As the fundamental components of marine ecosystems, plankton is very sensitive to environment changes, and the study of plankton abundance and distribution is crucial, in order to understand environment changes and protect marine ecosystems. This study was carried out to develop an extensive applicable plankton classification system with high accuracy for the increasing number of various imaging devices. Literature shows that most plankton image classification systems were limited to only one specific imaging device and a relatively narrow taxonomic scope. The real practical system for automatic plankton classification is even non-existent and this study is partly to fill this gap. Inspired by the analysis of literature and development of technology, we focused on the requirements of practical application and proposed an automatic system for plankton image classification combining multiple view features via multiple kernel learning (MKL). For one thing, in order to describe the biomorphic characteristics of plankton more completely and comprehensively, we combined general features with robust features, especially by adding features like Inner-Distance Shape Context for morphological representation. For another, we divided all the features into different types from multiple views and feed them to multiple classifiers instead of only one by combining different kernel matrices computed from different types of features optimally via multiple kernel learning. Moreover, we also applied feature selection method to choose the optimal feature subsets from redundant features for satisfying different datasets from different imaging devices. We implemented our proposed classification system on three different datasets across more than 20 categories from phytoplankton to zooplankton. The experimental results validated that our system outperforms state-of-the-art plankton image classification systems in terms of accuracy and robustness. This study demonstrated automatic plankton image classification system combining multiple view features using multiple kernel learning. The results indicated that multiple view features combined by NLMKL using three kernel functions (linear, polynomial and Gaussian kernel functions) can describe and use information of features better so that achieve a higher classification accuracy.

  6. Decoding negative affect personality trait from patterns of brain activation to threat stimuli.

    PubMed

    Fernandes, Orlando; Portugal, Liana C L; Alves, Rita de Cássia S; Arruda-Sanchez, Tiago; Rao, Anil; Volchan, Eliane; Pereira, Mirtes; Oliveira, Letícia; Mourao-Miranda, Janaina

    2017-01-15

    Pattern recognition analysis (PRA) applied to functional magnetic resonance imaging (fMRI) has been used to decode cognitive processes and identify possible biomarkers for mental illness. In the present study, we investigated whether the positive affect (PA) or negative affect (NA) personality traits could be decoded from patterns of brain activation in response to a human threat using a healthy sample. fMRI data from 34 volunteers (15 women) were acquired during a simple motor task while the volunteers viewed a set of threat stimuli that were directed either toward them or away from them and matched neutral pictures. For each participant, contrast images from a General Linear Model (GLM) between the threat versus neutral stimuli defined the spatial patterns used as input to the regression model. We applied a multiple kernel learning (MKL) regression combining information from different brain regions hierarchically in a whole brain model to decode the NA and PA from patterns of brain activation in response to threat stimuli. The MKL model was able to decode NA but not PA from the contrast images between threat stimuli directed away versus neutral with a significance above chance. The correlation and the mean squared error (MSE) between predicted and actual NA were 0.52 (p-value=0.01) and 24.43 (p-value=0.01), respectively. The MKL pattern regression model identified a network with 37 regions that contributed to the predictions. Some of the regions were related to perception (e.g., occipital and temporal regions) while others were related to emotional evaluation (e.g., caudate and prefrontal regions). These results suggest that there was an interaction between the individuals' NA and the brain response to the threat stimuli directed away, which enabled the MKL model to decode NA from the brain patterns. To our knowledge, this is the first evidence that PRA can be used to decode a personality trait from patterns of brain activation during emotional contexts. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  7. Integration of Network Topological and Connectivity Properties for Neuroimaging Classification

    PubMed Central

    Jie, Biao; Gao, Wei; Wang, Qian; Wee, Chong-Yaw

    2014-01-01

    Rapid advances in neuroimaging techniques have provided an efficient and noninvasive way for exploring the structural and functional connectivity of the human brain. Quantitative measurement of abnormality of brain connectivity in patients with neurodegenerative diseases, such as mild cognitive impairment (MCI) and Alzheimer’s disease (AD), have also been widely reported, especially at a group level. Recently, machine learning techniques have been applied to the study of AD and MCI, i.e., to identify the individuals with AD/MCI from the healthy controls (HCs). However, most existing methods focus on using only a single property of a connectivity network, although multiple network properties, such as local connectivity and global topological properties, can potentially be used. In this paper, by employing multikernel based approach, we propose a novel connectivity based framework to integrate multiple properties of connectivity network for improving the classification performance. Specifically, two different types of kernels (i.e., vector-based kernel and graph kernel) are used to quantify two different yet complementary properties of the network, i.e., local connectivity and global topological properties. Then, multikernel learning (MKL) technique is adopted to fuse these heterogeneous kernels for neuroimaging classification. We test the performance of our proposed method on two different data sets. First, we test it on the functional connectivity networks of 12 MCI and 25 HC subjects. The results show that our method achieves significant performance improvement over those using only one type of network property. Specifically, our method achieves a classification accuracy of 91.9%, which is 10.8% better than those by single network-property-based methods. Then, we test our method for gender classification on a large set of functional connectivity networks with 133 infants scanned at birth, 1 year, and 2 years, also demonstrating very promising results. PMID:24108708

  8. Surgical gesture classification from video and kinematic data.

    PubMed

    Zappella, Luca; Béjar, Benjamín; Hager, Gregory; Vidal, René

    2013-10-01

    Much of the existing work on automatic classification of gestures and skill in robotic surgery is based on dynamic cues (e.g., time to completion, speed, forces, torque) or kinematic data (e.g., robot trajectories and velocities). While videos could be equally or more discriminative (e.g., videos contain semantic information not present in kinematic data), they are typically not used because of the difficulties associated with automatic video interpretation. In this paper, we propose several methods for automatic surgical gesture classification from video data. We assume that the video of a surgical task (e.g., suturing) has been segmented into video clips corresponding to a single gesture (e.g., grabbing the needle, passing the needle) and propose three methods to classify the gesture of each video clip. In the first one, we model each video clip as the output of a linear dynamical system (LDS) and use metrics in the space of LDSs to classify new video clips. In the second one, we use spatio-temporal features extracted from each video clip to learn a dictionary of spatio-temporal words, and use a bag-of-features (BoF) approach to classify new video clips. In the third one, we use multiple kernel learning (MKL) to combine the LDS and BoF approaches. Since the LDS approach is also applicable to kinematic data, we also use MKL to combine both types of data in order to exploit their complementarity. Our experiments on a typical surgical training setup show that methods based on video data perform equally well, if not better, than state-of-the-art approaches based on kinematic data. In turn, the combination of both kinematic and video data outperforms any other algorithm based on one type of data alone. Copyright © 2013 Elsevier B.V. All rights reserved.

  9. Protein subcellular localization prediction using multiple kernel learning based support vector machine.

    PubMed

    Hasan, Md Al Mehedi; Ahmad, Shamim; Molla, Md Khademul Islam

    2017-03-28

    Predicting the subcellular locations of proteins can provide useful hints that reveal their functions, increase our understanding of the mechanisms of some diseases, and finally aid in the development of novel drugs. As the number of newly discovered proteins has been growing exponentially, which in turns, makes the subcellular localization prediction by purely laboratory tests prohibitively laborious and expensive. In this context, to tackle the challenges, computational methods are being developed as an alternative choice to aid biologists in selecting target proteins and designing related experiments. However, the success of protein subcellular localization prediction is still a complicated and challenging issue, particularly, when query proteins have multi-label characteristics, i.e., if they exist simultaneously in more than one subcellular location or if they move between two or more different subcellular locations. To date, to address this problem, several types of subcellular localization prediction methods with different levels of accuracy have been proposed. The support vector machine (SVM) has been employed to provide potential solutions to the protein subcellular localization prediction problem. However, the practicability of an SVM is affected by the challenges of selecting an appropriate kernel and selecting the parameters of the selected kernel. To address this difficulty, in this study, we aimed to develop an efficient multi-label protein subcellular localization prediction system, named as MKLoc, by introducing multiple kernel learning (MKL) based SVM. We evaluated MKLoc using a combined dataset containing 5447 single-localized proteins (originally published as part of the Höglund dataset) and 3056 multi-localized proteins (originally published as part of the DBMLoc set). Note that this dataset was used by Briesemeister et al. in their extensive comparison of multi-localization prediction systems. Finally, our experimental results indicate that MKLoc not only achieves higher accuracy than a single kernel based SVM system but also shows significantly better results than those obtained from other top systems (MDLoc, BNCs, YLoc+). Moreover, MKLoc requires less computation time to tune and train the system than that required for BNCs and single kernel based SVM.

  10. Transient α-helices in the disordered RPEL motifs of the serum response factor coactivator MKL1

    NASA Astrophysics Data System (ADS)

    Mizuguchi, Mineyuki; Fuju, Takahiro; Obita, Takayuki; Ishikawa, Mitsuru; Tsuda, Masaaki; Tabuchi, Akiko

    2014-06-01

    The megakaryoblastic leukemia 1 (MKL1) protein functions as a transcriptional coactivator of the serum response factor. MKL1 has three RPEL motifs (RPEL1, RPEL2, and RPEL3) in its N-terminal region. MKL1 binds to monomeric G-actin through RPEL motifs, and the dissociation of MKL1 from G-actin promotes the translocation of MKL1 to the nucleus. Although structural data are available for RPEL motifs of MKL1 in complex with G-actin, the structural characteristics of RPEL motifs in the free state have been poorly defined. Here we characterized the structures of free RPEL motifs using NMR and CD spectroscopy. NMR and CD measurements showed that free RPEL motifs are largely unstructured in solution. However, NMR analysis identified transient α-helices in the regions where helices α1 and α2 are induced upon binding to G-actin. Proline mutagenesis showed that the transient α-helices are locally formed without helix-helix interactions. The helix content is higher in the order of RPEL1, RPEL2, and RPEL3. The amount of preformed structure may correlate with the binding affinity between the intrinsically disordered protein and its target molecule.

  11. C11orf95-MKL2 is the Resulting Fusion Oncogene of t(11;16)(q13;p13) in Chondroid Lipoma

    PubMed Central

    Huang, Dali; Sumegi, Janos; Cin, Paola Dal; Reith, John D.; Yasuda, Taketoshi; Nelson, Marilu; Muirhead, David; Bridge, Julia A.

    2010-01-01

    Chondroid lipoma, a rare benign adipose tissue tumor, may histologically resemble myxoid liposarcoma or extraskeletal myxoid chondrosarcoma, but is genetically distinct. In the current study, an identical reciprocal translocation, t(11;16)(q13;p13) was identified in three chondroid lipomas, a finding consistent with previous isolated reports. A fluorescence in situ hybridization (FISH)-based positional cloning strategy using a series of bacterial artificial chromosome (BAC) probe combinations designed to narrow the 16p13 breakpoint revealed MKL2 as the candidate gene. Subsequent 5′ RACE studies demonstrated C11orf95 as the MKL2 fusion gene partner. MKL/myocardin-like 2 (MKL2) encodes myocardin-related transcription factor B (MRTF-B) in a megakaryoblastic leukemia gene family, and C11orf95 (chromosome 11 open reading frame 95) is a hypothetical protein. Sequencing analysis of RT-PCR generated transcripts from all three chondroid lipomas defined the fusion as occurring between exons 5 and 9 of C11orf95 and MKL2, respectively. Dual-color breakpoint spanning probe sets custom-designed for recognition of the translocation event in interphase cells confirmed the anticipated rearrangements of the C11orf95 and MKL2 loci in all cases. The FISH and RT-PCR assays developed in this study can serve as diagnostic adjuncts for identification of this novel C11orf95-MKL2 fusion oncogene in chondroid lipoma. PMID:20607705

  12. Combining heterogenous features for 3D hand-held object recognition

    NASA Astrophysics Data System (ADS)

    Lv, Xiong; Wang, Shuang; Li, Xiangyang; Jiang, Shuqiang

    2014-10-01

    Object recognition has wide applications in the area of human-machine interaction and multimedia retrieval. However, due to the problem of visual polysemous and concept polymorphism, it is still a great challenge to obtain reliable recognition result for the 2D images. Recently, with the emergence and easy availability of RGB-D equipment such as Kinect, this challenge could be relieved because the depth channel could bring more information. A very special and important case of object recognition is hand-held object recognition, as hand is a straight and natural way for both human-human interaction and human-machine interaction. In this paper, we study the problem of 3D object recognition by combining heterogenous features with different modalities and extraction techniques. For hand-craft feature, although it reserves the low-level information such as shape and color, it has shown weakness in representing hiconvolutionalgh-level semantic information compared with the automatic learned feature, especially deep feature. Deep feature has shown its great advantages in large scale dataset recognition but is not always robust to rotation or scale variance compared with hand-craft feature. In this paper, we propose a method to combine hand-craft point cloud features and deep learned features in RGB and depth channle. First, hand-held object segmentation is implemented by using depth cues and human skeleton information. Second, we combine the extracted hetegerogenous 3D features in different stages using linear concatenation and multiple kernel learning (MKL). Then a training model is used to recognize 3D handheld objects. Experimental results validate the effectiveness and gerneralization ability of the proposed method.

  13. Common Variants in the MKL1 Gene Confer Risk of Schizophrenia

    PubMed Central

    Luo, Xiong-jian; Huang, Liang; van den Oord, Edwin J.; Aberg, Karolina A.; Gan, Lin; Zhao, Zhongming; Yao, Yong-Gang

    2015-01-01

    Genome-wide association studies (GWAS) of schizophrenia have identified multiple risk variants with robust association signals for schizophrenia. However, these variants could explain only a small proportion of schizophrenia heritability. Furthermore, the effect size of these risk variants is relatively small (eg, most of them had an OR less than 1.2), suggesting that additional risk variants may be detected when increasing sample size in analysis. Here, we report the identification of a genome-wide significant schizophrenia risk locus at 22q13.1 by combining 2 large-scale schizophrenia cohort studies. Our meta-analysis revealed that 7 single nucleotide polymorphism (SNPs) on chromosome 22q13.1 reached the genome-wide significance level (P < 5.0×10–8) in the combined samples (a total of 38441 individuals). Among them, SNP rs6001946 had the most significant association with schizophrenia (P = 2.04×10–8). Interestingly, all 7 SNPs are in high linkage disequilibrium and located in the MKL1 gene. Expression analysis showed that MKL1 is highly expressed in human and mouse brains. We further investigated functional links between MKL1 and proteins encoded by other schizophrenia susceptibility genes in the whole human protein interaction network. We found that MKL1 physically interacts with GSK3B, a protein encoded by a well-characterized schizophrenia susceptibility gene. Collectively, our results revealed that genetic variants in MKL1 might confer risk to schizophrenia. Further investigation of the roles of MKL1 in the pathogenesis of schizophrenia is warranted. PMID:25380769

  14. Approximate kernel competitive learning.

    PubMed

    Wu, Jian-Sheng; Zheng, Wei-Shi; Lai, Jian-Huang

    2015-03-01

    Kernel competitive learning has been successfully used to achieve robust clustering. However, kernel competitive learning (KCL) is not scalable for large scale data processing, because (1) it has to calculate and store the full kernel matrix that is too large to be calculated and kept in the memory and (2) it cannot be computed in parallel. In this paper we develop a framework of approximate kernel competitive learning for processing large scale dataset. The proposed framework consists of two parts. First, it derives an approximate kernel competitive learning (AKCL), which learns kernel competitive learning in a subspace via sampling. We provide solid theoretical analysis on why the proposed approximation modelling would work for kernel competitive learning, and furthermore, we show that the computational complexity of AKCL is largely reduced. Second, we propose a pseudo-parallelled approximate kernel competitive learning (PAKCL) based on a set-based kernel competitive learning strategy, which overcomes the obstacle of using parallel programming in kernel competitive learning and significantly accelerates the approximate kernel competitive learning for large scale clustering. The empirical evaluation on publicly available datasets shows that the proposed AKCL and PAKCL can perform comparably as KCL, with a large reduction on computational cost. Also, the proposed methods achieve more effective clustering performance in terms of clustering precision against related approximate clustering approaches. Copyright © 2014 Elsevier Ltd. All rights reserved.

  15. Klotho Up-regulates Renal Calcium Channel Transient Receptor Potential Vanilloid 5 (TRPV5) by Intra- and Extracellular N-glycosylation-dependent Mechanisms*

    PubMed Central

    Wolf, Matthias T. F.; An, Sung-Wan; Nie, Mingzhu; Bal, Manjot S.; Huang, Chou-Long

    2014-01-01

    The anti-aging protein Klotho is a type 1 membrane protein produced predominantly in the distal convoluted tubule. The ectodomain of Klotho is cleaved and secreted into the urine to regulate several ion channels and transporters. Secreted Klotho (sKL) up-regulates the TRPV5 calcium channel from the cell exterior by removing sialic acids from N-glycan of the channel and inhibiting its endocytosis. Because TRPV5 and Klotho coexpress in the distal convoluted tubule, we investigated whether Klotho regulates TRPV5 action from inside the cell. Whole-cell TRPV5-mediated channel activity was recorded in HEK cells coexpressing TRPV5 and sKL or membranous Klotho (mKL). Transfection of sKL, but not mKL, produced detectable Klotho protein in cell culture media. As for sKL, mKL increased TRPV5 current density. The role of sialidase activity of mKL acting inside is supported by findings that mutations of putative sialidase activity sites in sKL and mKL abrogated the regulation of TRPV5 but that the extracellular application of a sialidase inhibitor prevented the regulation of TRPV5 by sKL only. Mechanistically, coexpression with a dominant-negative dynamin II prevented the regulation of TRPV5 by sKL but not by mKL. In contrast, blocking forward trafficking by brefeldin A prevented the effect with mKL but not with sKL. Therefore, Klotho up-regulates TRPV5 from both the inside and outside of cells. The intracellular action of Klotho is likely due to enhanced forward trafficking of channel proteins, whereas the extracellular action is due to inhibition of endocytosis. Both effects involve putative Klotho sialidase activity. These effects of Klotho may play important roles regarding calcium reabsorption in the kidney. PMID:25378396

  16. A Novel Extreme Learning Machine Classification Model for e-Nose Application Based on the Multiple Kernel Approach.

    PubMed

    Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong

    2017-06-19

    A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Being different from the existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radical basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification.

  17. A Novel Extreme Learning Machine Classification Model for e-Nose Application Based on the Multiple Kernel Approach

    PubMed Central

    Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong

    2017-01-01

    A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Being different from the existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radical basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification. PMID:28629202

  18. An introduction to kernel-based learning algorithms.

    PubMed

    Müller, K R; Mika, S; Rätsch, G; Tsuda, K; Schölkopf, B

    2001-01-01

    This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples for successful kernel-based learning methods. We first give a short background about Vapnik-Chervonenkis theory and kernel feature spaces and then proceed to kernel based learning in supervised and unsupervised scenarios including practical and algorithmic considerations. We illustrate the usefulness of kernel algorithms by discussing applications such as optical character recognition and DNA analysis.

  19. Ideal regularization for learning kernels from labels.

    PubMed

    Pan, Binbin; Lai, Jianhuang; Shen, Lixin

    2014-08-01

    In this paper, we propose a new form of regularization that is able to utilize the label information of a data set for learning kernels. The proposed regularization, referred to as ideal regularization, is a linear function of the kernel matrix to be learned. The ideal regularization allows us to develop efficient algorithms to exploit labels. Three applications of the ideal regularization are considered. Firstly, we use the ideal regularization to incorporate the labels into a standard kernel, making the resulting kernel more appropriate for learning tasks. Next, we employ the ideal regularization to learn a data-dependent kernel matrix from an initial kernel matrix (which contains prior similarity information, geometric structures, and labels of the data). Finally, we incorporate the ideal regularization to some state-of-the-art kernel learning problems. With this regularization, these learning problems can be formulated as simpler ones which permit more efficient solvers. Empirical results show that the ideal regularization exploits the labels effectively and efficiently. Copyright © 2014 Elsevier Ltd. All rights reserved.

  20. Out-of-Sample Extensions for Non-Parametric Kernel Methods.

    PubMed

    Pan, Binbin; Chen, Wen-Sheng; Chen, Bo; Xu, Chen; Lai, Jianhuang

    2017-02-01

    Choosing suitable kernels plays an important role in the performance of kernel methods. Recently, a number of studies were devoted to developing nonparametric kernels. Without assuming any parametric form of the target kernel, nonparametric kernel learning offers a flexible scheme to utilize the information of the data, which may potentially characterize the data similarity better. The kernel methods using nonparametric kernels are referred to as nonparametric kernel methods. However, many nonparametric kernel methods are restricted to transductive learning, where the prediction function is defined only over the data points given beforehand. They have no straightforward extension for the out-of-sample data points, and thus cannot be applied to inductive learning. In this paper, we show how to make the nonparametric kernel methods applicable to inductive learning. The key problem of out-of-sample extension is how to extend the nonparametric kernel matrix to the corresponding kernel function. A regression approach in the hyper reproducing kernel Hilbert space is proposed to solve this problem. Empirical results indicate that the out-of-sample performance is comparable to the in-sample performance in most cases. Experiments on face recognition demonstrate the superiority of our nonparametric kernel method over the state-of-the-art parametric kernel methods.

  1. A Comparative Study of Pairwise Learning Methods Based on Kernel Ridge Regression.

    PubMed

    Stock, Michiel; Pahikkala, Tapio; Airola, Antti; De Baets, Bernard; Waegeman, Willem

    2018-06-12

    Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction, or network inference problems. During the past decade, kernel methods have played a dominant role in pairwise learning. They still obtain a state-of-the-art predictive performance, but a theoretical analysis of their behavior has been underexplored in the machine learning literature. In this work we review and unify kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form efficient instantiations of Kronecker kernel ridge regression. We show that independent task kernel ridge regression, two-step kernel ridge regression, and a linear matrix filter arise naturally as a special case of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency, and spectral filtering properties. Our theoretical results provide valuable insights into assessing the advantages and limitations of existing pairwise learning methods.

  2. Multiscale Support Vector Learning With Projection Operator Wavelet Kernel for Nonlinear Dynamical System Identification.

    PubMed

    Lu, Zhao; Sun, Jing; Butts, Kenneth

    2016-02-03

    A giant leap has been made in the past couple of decades with the introduction of kernel-based learning as a mainstay for designing effective nonlinear computational learning algorithms. In view of the geometric interpretation of conditional expectation and the ubiquity of multiscale characteristics in highly complex nonlinear dynamic systems [1]-[3], this paper presents a new orthogonal projection operator wavelet kernel, aiming at developing an efficient computational learning approach for nonlinear dynamical system identification. In the framework of multiresolution analysis, the proposed projection operator wavelet kernel can fulfill the multiscale, multidimensional learning to estimate complex dependencies. The special advantage of the projection operator wavelet kernel developed in this paper lies in the fact that it has a closed-form expression, which greatly facilitates its application in kernel learning. To the best of our knowledge, it is the first closed-form orthogonal projection wavelet kernel reported in the literature. It provides a link between grid-based wavelets and mesh-free kernel-based methods. Simulation studies for identifying the parallel models of two benchmark nonlinear dynamical systems confirm its superiority in model accuracy and sparsity.

  3. Multiscale asymmetric orthogonal wavelet kernel for linear programming support vector learning and nonlinear dynamic systems identification.

    PubMed

    Lu, Zhao; Sun, Jing; Butts, Kenneth

    2014-05-01

    Support vector regression for approximating nonlinear dynamic systems is more delicate than the approximation of indicator functions in support vector classification, particularly for systems that involve multitudes of time scales in their sampled data. The kernel used for support vector learning determines the class of functions from which a support vector machine can draw its solution, and the choice of kernel significantly influences the performance of a support vector machine. In this paper, to bridge the gap between wavelet multiresolution analysis and kernel learning, the closed-form orthogonal wavelet is exploited to construct new multiscale asymmetric orthogonal wavelet kernels for linear programming support vector learning. The closed-form multiscale orthogonal wavelet kernel provides a systematic framework to implement multiscale kernel learning via dyadic dilations and also enables us to represent complex nonlinear dynamics effectively. To demonstrate the superiority of the proposed multiscale wavelet kernel in identifying complex nonlinear dynamic systems, two case studies are presented that aim at building parallel models on benchmark datasets. The development of parallel models that address the long-term/mid-term prediction issue is more intricate and challenging than the identification of series-parallel models where only one-step ahead prediction is required. Simulation results illustrate the effectiveness of the proposed multiscale kernel learning.

  4. Deep kernel learning method for SAR image target recognition

    NASA Astrophysics Data System (ADS)

    Chen, Xiuyuan; Peng, Xiyuan; Duan, Ran; Li, Junbao

    2017-10-01

    With the development of deep learning, research on image target recognition has made great progress in recent years. Remote sensing detection urgently requires target recognition for military, geographic, and other scientific research. This paper aims to solve the synthetic aperture radar image target recognition problem by combining deep and kernel learning. The model, which has a multilayer multiple kernel structure, is optimized layer by layer with the parameters of Support Vector Machine and a gradient descent algorithm. This new deep kernel learning method improves accuracy and achieves competitive recognition results compared with other learning methods.

  5. Online learning control using adaptive critic designs with sparse kernel machines.

    PubMed

    Xu, Xin; Hou, Zhongsheng; Lian, Chuanqiang; He, Haibo

    2013-05-01

    In the past decade, adaptive critic designs (ACDs), including heuristic dynamic programming (HDP), dual heuristic programming (DHP), and their action-dependent ones, have been widely studied to realize online learning control of dynamical systems. However, because neural networks with manually designed features are commonly used to deal with continuous state and action spaces, the generalization capability and learning efficiency of previous ACDs still need to be improved. In this paper, a novel framework of ACDs with sparse kernel machines is presented by integrating kernel methods into the critic of ACDs. To improve the generalization capability as well as the computational efficiency of kernel machines, a sparsification method based on the approximately linear dependence analysis is used. Using the sparse kernel machines, two kernel-based ACD algorithms, that is, kernel HDP (KHDP) and kernel DHP (KDHP), are proposed and their performance is analyzed both theoretically and empirically. Because of the representation learning and generalization capability of sparse kernel machines, KHDP and KDHP can obtain much better performance than previous HDP and DHP with manually designed neural networks. Simulation and experimental results of two nonlinear control problems, that is, a continuous-action inverted pendulum problem and a ball and plate control problem, demonstrate the effectiveness of the proposed kernel ACD methods.

  6. Kernel learning at the first level of inference.

    PubMed

    Cawley, Gavin C; Talbot, Nicola L C

    2014-05-01

    Kernel learning methods, whether Bayesian or frequentist, typically involve multiple levels of inference, with the coefficients of the kernel expansion being determined at the first level and the kernel and regularisation parameters carefully tuned at the second level, a process known as model selection. Model selection for kernel machines is commonly performed via optimisation of a suitable model selection criterion, often based on cross-validation or theoretical performance bounds. However, if there are a large number of kernel parameters, as for instance in the case of automatic relevance determination (ARD), there is a substantial risk of over-fitting the model selection criterion, resulting in poor generalisation performance. In this paper we investigate the possibility of learning the kernel, for the Least-Squares Support Vector Machine (LS-SVM) classifier, at the first level of inference, i.e. parameter optimisation. The kernel parameters and the coefficients of the kernel expansion are jointly optimised at the first level of inference, minimising a training criterion with an additional regularisation term acting on the kernel parameters. The key advantage of this approach is that the values of only two regularisation parameters need be determined in model selection, substantially alleviating the problem of over-fitting the model selection criterion. The benefits of this approach are demonstrated using a suite of synthetic and real-world binary classification benchmark problems, where kernel learning at the first level of inference is shown to be statistically superior to the conventional approach, improves on our previous work (Cawley and Talbot, 2007) and is competitive with Multiple Kernel Learning approaches, but with reduced computational expense. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Kernel Methods for Mining Instance Data in Ontologies

    NASA Astrophysics Data System (ADS)

    Bloehdorn, Stephan; Sure, York

    The amount of ontologies and meta data available on the Web is constantly growing. The successful application of machine learning techniques for learning of ontologies from textual data, i.e. mining for the Semantic Web, contributes to this trend. However, no principal approaches exist so far for mining from the Semantic Web. We investigate how machine learning algorithms can be made amenable for directly taking advantage of the rich knowledge expressed in ontologies and associated instance data. Kernel methods have been successfully employed in various learning tasks and provide a clean framework for interfacing between non-vectorial data and machine learning algorithms. In this spirit, we express the problem of mining instances in ontologies as the problem of defining valid corresponding kernels. We present a principled framework for designing such kernels by means of decomposing the kernel computation into specialized kernels for selected characteristics of an ontology which can be flexibly assembled and tuned. Initial experiments on real world Semantic Web data enjoy promising results and show the usefulness of our approach.

  8. Brain tumor image segmentation using kernel dictionary learning.

    PubMed

    Jeon Lee; Seung-Jun Kim; Rong Chen; Herskovits, Edward H

    2015-08-01

    Automated brain tumor image segmentation with high accuracy and reproducibility holds a big potential to enhance the current clinical practice. Dictionary learning (DL) techniques have been applied successfully to various image processing tasks recently. In this work, kernel extensions of the DL approach are adopted. Both reconstructive and discriminative versions of the kernel DL technique are considered, which can efficiently incorporate multi-modal nonlinear feature mappings based on the kernel trick. Our novel discriminative kernel DL formulation allows joint learning of a task-driven kernel-based dictionary and a linear classifier using a K-SVD-type algorithm. The proposed approaches were tested using real brain magnetic resonance (MR) images of patients with high-grade glioma. The obtained preliminary performances are competitive with the state of the art. The discriminative kernel DL approach is seen to reduce computational burden without much sacrifice in performance.

  9. Impact of deep learning on the normalization of reconstruction kernel effects in imaging biomarker quantification: a pilot study in CT emphysema

    NASA Astrophysics Data System (ADS)

    Jin, Hyeongmin; Heo, Changyong; Kim, Jong Hyo

    2018-02-01

    Differing reconstruction kernels are known to strongly affect the variability of imaging biomarkers and thus remain as a barrier in translating the computer aided quantification techniques into clinical practice. This study presents a deep learning application to CT kernel conversion which converts a CT image of sharp kernel to that of standard kernel and evaluates its impact on variability reduction of a pulmonary imaging biomarker, the emphysema index (EI). Forty cases of low-dose chest CT exams obtained with 120kVp, 40mAs, 1mm thickness, of 2 reconstruction kernels (B30f, B50f) were selected from the low dose lung cancer screening database of our institution. A Fully convolutional network was implemented with Keras deep learning library. The model consisted of symmetric layers to capture the context and fine structure characteristics of CT images from the standard and sharp reconstruction kernels. Pairs of the full-resolution CT data set were fed to input and output nodes to train the convolutional network to learn the appropriate filter kernels for converting the CT images of sharp kernel to standard kernel with a criterion of measuring the mean squared error between the input and target images. EIs (RA950 and Perc15) were measured with a software package (ImagePrism Pulmo, Seoul, South Korea) and compared for the data sets of B50f, B30f, and the converted B50f. The effect of kernel conversion was evaluated with the mean and standard deviation of pair-wise differences in EI. The population mean of RA950 was 27.65 +/- 7.28% for B50f data set, 10.82 +/- 6.71% for the B30f data set, and 8.87 +/- 6.20% for the converted B50f data set. The mean of pair-wise absolute differences in RA950 between B30f and B50f is reduced from 16.83% to 1.95% using kernel conversion. Our study demonstrates the feasibility of applying the deep learning technique for CT kernel conversion and reducing the kernel-induced variability of EI quantification. The deep learning model has a potential to improve the reliability of imaging biomarker, especially in evaluating the longitudinal changes of EI even when the patient CT scans were performed with different kernels.

  10. Study of the convergence behavior of the complex kernel least mean square algorithm.

    PubMed

    Paul, Thomas K; Ogunfunmi, Tokunbo

    2013-09-01

    The complex kernel least mean square (CKLMS) algorithm is recently derived and allows for online kernel adaptive learning for complex data. Kernel adaptive methods can be used in finding solutions for neural network and machine learning applications. The derivation of CKLMS involved the development of a modified Wirtinger calculus for Hilbert spaces to obtain the cost function gradient. We analyze the convergence of the CKLMS with different kernel forms for complex data. The expressions obtained enable us to generate theory-predicted mean-square error curves considering the circularity of the complex input signals and their effect on nonlinear learning. Simulations are used for verifying the analysis results.

  11. Kernel Temporal Differences for Neural Decoding

    PubMed Central

    Bae, Jihye; Sanchez Giraldo, Luis G.; Pohlmeyer, Eric A.; Francis, Joseph T.; Sanchez, Justin C.; Príncipe, José C.

    2015-01-01

    We study the feasibility and capability of the kernel temporal difference (KTD)(λ) algorithm for neural decoding. KTD(λ) is an online, kernel-based learning algorithm, which has been introduced to estimate value functions in reinforcement learning. This algorithm combines kernel-based representations with the temporal difference approach to learning. One of our key observations is that by using strictly positive definite kernels, algorithm's convergence can be guaranteed for policy evaluation. The algorithm's nonlinear functional approximation capabilities are shown in both simulations of policy evaluation and neural decoding problems (policy improvement). KTD can handle high-dimensional neural states containing spatial-temporal information at a reasonable computational complexity allowing real-time applications. When the algorithm seeks a proper mapping between a monkey's neural states and desired positions of a computer cursor or a robot arm, in both open-loop and closed-loop experiments, it can effectively learn the neural state to action mapping. Finally, a visualization of the coadaptation process between the decoder and the subject shows the algorithm's capabilities in reinforcement learning brain machine interfaces. PMID:25866504

  12. Generalization Performance of Regularized Ranking With Multiscale Kernels.

    PubMed

    Zhou, Yicong; Chen, Hong; Lan, Rushi; Pan, Zhibin

    2016-05-01

    The regularized kernel method for the ranking problem has attracted increasing attentions in machine learning. The previous regularized ranking algorithms are usually based on reproducing kernel Hilbert spaces with a single kernel. In this paper, we go beyond this framework by investigating the generalization performance of the regularized ranking with multiscale kernels. A novel ranking algorithm with multiscale kernels is proposed and its representer theorem is proved. We establish the upper bound of the generalization error in terms of the complexity of hypothesis spaces. It shows that the multiscale ranking algorithm can achieve satisfactory learning rates under mild conditions. Experiments demonstrate the effectiveness of the proposed method for drug discovery and recommendation tasks.

  13. Design of a multiple kernel learning algorithm for LS-SVM by convex programming.

    PubMed

    Jian, Ling; Xia, Zhonghang; Liang, Xijun; Gao, Chuanhou

    2011-06-01

    As a kernel based method, the performance of least squares support vector machine (LS-SVM) depends on the selection of the kernel as well as the regularization parameter (Duan, Keerthi, & Poo, 2003). Cross-validation is efficient in selecting a single kernel and the regularization parameter; however, it suffers from heavy computational cost and is not flexible to deal with multiple kernels. In this paper, we address the issue of multiple kernel learning for LS-SVM by formulating it as semidefinite programming (SDP). Furthermore, we show that the regularization parameter can be optimized in a unified framework with the kernel, which leads to an automatic process for model selection. Extensive experimental validations are performed and analyzed. Copyright © 2011 Elsevier Ltd. All rights reserved.

  14. Fast Gaussian kernel learning for classification tasks based on specially structured global optimization.

    PubMed

    Zhong, Shangping; Chen, Tianshun; He, Fengying; Niu, Yuzhen

    2014-09-01

    For a practical pattern classification task solved by kernel methods, the computing time is mainly spent on kernel learning (or training). However, the current kernel learning approaches are based on local optimization techniques, and hard to have good time performances, especially for large datasets. Thus the existing algorithms cannot be easily extended to large-scale tasks. In this paper, we present a fast Gaussian kernel learning method by solving a specially structured global optimization (SSGO) problem. We optimize the Gaussian kernel function by using the formulated kernel target alignment criterion, which is a difference of increasing (d.i.) functions. Through using a power-transformation based convexification method, the objective criterion can be represented as a difference of convex (d.c.) functions with a fixed power-transformation parameter. And the objective programming problem can then be converted to a SSGO problem: globally minimizing a concave function over a convex set. The SSGO problem is classical and has good solvability. Thus, to find the global optimal solution efficiently, we can adopt the improved Hoffman's outer approximation method, which need not repeat the searching procedure with different starting points to locate the best local minimum. Also, the proposed method can be proven to converge to the global solution for any classification task. We evaluate the proposed method on twenty benchmark datasets, and compare it with four other Gaussian kernel learning methods. Experimental results show that the proposed method stably achieves both good time-efficiency performance and good classification performance. Copyright © 2014 Elsevier Ltd. All rights reserved.

  15. A phylogenomic analysis of the Actinomycetales mce operons.

    PubMed

    Casali, Nicola; Riley, Lee W

    2007-02-26

    The genome of Mycobacterium tuberculosis harbors four copies of a cluster of genes termed mce operons. Despite extensive research that has demonstrated the importance of these operons on infection outcome, their physiological function remains obscure. Expanding databases of complete microbial genome sequences facilitate a comparative genomic approach that can provide valuable insight into the role of uncharacterized proteins. The M. tuberculosis mce loci each include two yrbE and six mce genes, which have homology to ABC transporter permeases and substrate-binding proteins, respectively. Operons with an identical structure were identified in all Mycobacterium species examined, as well as in five other Actinomycetales genera. Some of the Actinomycetales mce operons include an mkl gene, which encodes an ATPase resembling those of ABC uptake transporters. The phylogenetic profile of Mkl orthologs exactly matched that of the Mce and YrbE proteins. Through topology and motif analyses of YrbE homologs, we identified a region within the penultimate cytoplasmic loop that may serve as the site of interaction with the putative cognate Mkl ATPase. Homologs of the exported proteins encoded adjacent to the M. tuberculosis mce operons were detected in a conserved chromosomal location downstream of the majority of Actinomycetales operons. Operons containing linked mkl, yrbE and mce genes, resembling the classic organization of an ABC importer, were found to be common in Gram-negative bacteria and appear to be associated with changes in properties of the cell surface. Evidence presented suggests that the mce operons of Actinomycetales species and related operons in Gram-negative bacteria encode a subfamily of ABC uptake transporters with a possible role in remodeling the cell envelope.

  16. Semi-supervised learning for ordinal Kernel Discriminant Analysis.

    PubMed

    Pérez-Ortiz, M; Gutiérrez, P A; Carbonero-Ruz, M; Hervás-Martínez, C

    2016-12-01

    Ordinal classification considers those classification problems where the labels of the variable to predict follow a given order. Naturally, labelled data is scarce or difficult to obtain in this type of problems because, in many cases, ordinal labels are given by a user or expert (e.g. in recommendation systems). Firstly, this paper develops a new strategy for ordinal classification where both labelled and unlabelled data are used in the model construction step (a scheme which is referred to as semi-supervised learning). More specifically, the ordinal version of kernel discriminant learning is extended for this setting considering the neighbourhood information of unlabelled data, which is proposed to be computed in the feature space induced by the kernel function. Secondly, a new method for semi-supervised kernel learning is devised in the context of ordinal classification, which is combined with our developed classification strategy to optimise the kernel parameters. The experiments conducted compare 6 different approaches for semi-supervised learning in the context of ordinal classification in a battery of 30 datasets, showing (1) the good synergy of the ordinal version of discriminant analysis and the use of unlabelled data and (2) the advantage of computing distances in the feature space induced by the kernel function. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. Metabolite identification through multiple kernel learning on fragmentation trees.

    PubMed

    Shen, Huibin; Dührkop, Kai; Böcker, Sebastian; Rousu, Juho

    2014-06-15

    Metabolite identification from tandem mass spectrometric data is a key task in metabolomics. Various computational methods have been proposed for the identification of metabolites from tandem mass spectra. Fragmentation tree methods explore the space of possible ways in which the metabolite can fragment, and base the metabolite identification on scoring of these fragmentation trees. Machine learning methods have been used to map mass spectra to molecular fingerprints; predicted fingerprints, in turn, can be used to score candidate molecular structures. Here, we combine fragmentation tree computations with kernel-based machine learning to predict molecular fingerprints and identify molecular structures. We introduce a family of kernels capturing the similarity of fragmentation trees, and combine these kernels using recently proposed multiple kernel learning approaches. Experiments on two large reference datasets show that the new methods significantly improve molecular fingerprint prediction accuracy. These improvements result in better metabolite identification, doubling the number of metabolites ranked at the top position of the candidates list. © The Author 2014. Published by Oxford University Press.

  18. Learn the Lagrangian: A Vector-Valued RKHS Approach to Identifying Lagrangian Systems.

    PubMed

    Cheng, Ching-An; Huang, Han-Pang

    2016-12-01

    We study the modeling of Lagrangian systems with multiple degrees of freedom. Based on system dynamics, canonical parametric models require ad hoc derivations and sometimes simplification for a computable solution; on the other hand, due to the lack of prior knowledge in the system's structure, modern nonparametric models in machine learning face the curse of dimensionality, especially in learning large systems. In this paper, we bridge this gap by unifying the theories of Lagrangian systems and vector-valued reproducing kernel Hilbert space. We reformulate Lagrangian systems with kernels that embed the governing Euler-Lagrange equation-the Lagrangian kernels-and show that these kernels span a subspace capturing the Lagrangian's projection as inverse dynamics. By such property, our model uses only inputs and outputs as in machine learning and inherits the structured form as in system dynamics, thereby removing the need for the mundane derivations for new systems as well as the generalization problem in learning from scratches. In effect, it learns the system's Lagrangian, a simpler task than directly learning the dynamics. To demonstrate, we applied the proposed kernel to identify the robot inverse dynamics in simulations and experiments. Our results present a competitive novel approach to identifying Lagrangian systems, despite using only inputs and outputs.

  19. Structured Kernel Dictionary Learning with Correlation Constraint for Object Recognition.

    PubMed

    Wang, Zhengjue; Wang, Yinghua; Liu, Hongwei; Zhang, Hao

    2017-06-21

    In this paper, we propose a new discriminative non-linear dictionary learning approach, called correlation constrained structured kernel KSVD, for object recognition. The objective function for dictionary learning contains a reconstructive term and a discriminative term. In the reconstructive term, signals are implicitly non-linearly mapped into a space, where a structured kernel dictionary, each sub-dictionary of which lies in the span of the mapped signals from the corresponding class, is established. In the discriminative term, by analyzing the classification mechanism, the correlation constraint is proposed in kernel form, constraining the correlations between different discriminative codes, and restricting the coefficient vectors to be transformed into a feature space, where the features are highly correlated inner-class and nearly independent between-classes. The objective function is optimized by the proposed structured kernel KSVD. During the classification stage, the specific form of the discriminative feature is needless to be known, while the inner product of the discriminative feature with kernel matrix embedded is available, and is suitable for a linear SVM classifier. Experimental results demonstrate that the proposed approach outperforms many state-of-the-art dictionary learning approaches for face, scene and synthetic aperture radar (SAR) vehicle target recognition.

  20. Online selective kernel-based temporal difference learning.

    PubMed

    Chen, Xingguo; Gao, Yang; Wang, Ruili

    2013-12-01

    In this paper, an online selective kernel-based temporal difference (OSKTD) learning algorithm is proposed to deal with large scale and/or continuous reinforcement learning problems. OSKTD includes two online procedures: online sparsification and parameter updating for the selective kernel-based value function. A new sparsification method (i.e., a kernel distance-based online sparsification method) is proposed based on selective ensemble learning, which is computationally less complex compared with other sparsification methods. With the proposed sparsification method, the sparsified dictionary of samples is constructed online by checking if a sample needs to be added to the sparsified dictionary. In addition, based on local validity, a selective kernel-based value function is proposed to select the best samples from the sample dictionary for the selective kernel-based value function approximator. The parameters of the selective kernel-based value function are iteratively updated by using the temporal difference (TD) learning algorithm combined with the gradient descent technique. The complexity of the online sparsification procedure in the OSKTD algorithm is O(n). In addition, two typical experiments (Maze and Mountain Car) are used to compare with both traditional and up-to-date O(n) algorithms (GTD, GTD2, and TDC using the kernel-based value function), and the results demonstrate the effectiveness of our proposed algorithm. In the Maze problem, OSKTD converges to an optimal policy and converges faster than both traditional and up-to-date algorithms. In the Mountain Car problem, OSKTD converges, requires less computation time compared with other sparsification methods, gets a better local optima than the traditional algorithms, and converges much faster than the up-to-date algorithms. In addition, OSKTD can reach a competitive ultimate optima compared with the up-to-date algorithms.

  1. Deep Restricted Kernel Machines Using Conjugate Feature Duality.

    PubMed

    Suykens, Johan A K

    2017-08-01

    The aim of this letter is to propose a theory of deep restricted kernel machines offering new foundations for deep learning with kernel machines. From the viewpoint of deep learning, it is partially related to restricted Boltzmann machines, which are characterized by visible and hidden units in a bipartite graph without hidden-to-hidden connections and deep learning extensions as deep belief networks and deep Boltzmann machines. From the viewpoint of kernel machines, it includes least squares support vector machines for classification and regression, kernel principal component analysis (PCA), matrix singular value decomposition, and Parzen-type models. A key element is to first characterize these kernel machines in terms of so-called conjugate feature duality, yielding a representation with visible and hidden units. It is shown how this is related to the energy form in restricted Boltzmann machines, with continuous variables in a nonprobabilistic setting. In this new framework of so-called restricted kernel machine (RKM) representations, the dual variables correspond to hidden features. Deep RKM are obtained by coupling the RKMs. The method is illustrated for deep RKM, consisting of three levels with a least squares support vector machine regression level and two kernel PCA levels. In its primal form also deep feedforward neural networks can be trained within this framework.

  2. A phylogenomic analysis of the Actinomycetales mce operons

    PubMed Central

    Casali, Nicola; Riley, Lee W

    2007-01-01

    Background The genome of Mycobacterium tuberculosis harbors four copies of a cluster of genes termed mce operons. Despite extensive research that has demonstrated the importance of these operons on infection outcome, their physiological function remains obscure. Expanding databases of complete microbial genome sequences facilitate a comparative genomic approach that can provide valuable insight into the role of uncharacterized proteins. Results The M. tuberculosis mce loci each include two yrbE and six mce genes, which have homology to ABC transporter permeases and substrate-binding proteins, respectively. Operons with an identical structure were identified in all Mycobacterium species examined, as well as in five other Actinomycetales genera. Some of the Actinomycetales mce operons include an mkl gene, which encodes an ATPase resembling those of ABC uptake transporters. The phylogenetic profile of Mkl orthologs exactly matched that of the Mce and YrbE proteins. Through topology and motif analyses of YrbE homologs, we identified a region within the penultimate cytoplasmic loop that may serve as the site of interaction with the putative cognate Mkl ATPase. Homologs of the exported proteins encoded adjacent to the M. tuberculosis mce operons were detected in a conserved chromosomal location downstream of the majority of Actinomycetales operons. Operons containing linked mkl, yrbE and mce genes, resembling the classic organization of an ABC importer, were found to be common in Gram-negative bacteria and appear to be associated with changes in properties of the cell surface. Conclusion Evidence presented suggests that the mce operons of Actinomycetales species and related operons in Gram-negative bacteria encode a subfamily of ABC uptake transporters with a possible role in remodeling the cell envelope. PMID:17324287

  3. miR-206 Inhibits Stemness and Metastasis of Breast Cancer by Targeting MKL1/IL11 Pathway.

    PubMed

    Samaeekia, Ravand; Adorno-Cruz, Valery; Bockhorn, Jessica; Chang, Ya-Fang; Huang, Simo; Prat, Aleix; Ha, Nahun; Kibria, Golam; Huo, Dezheng; Zheng, Hui; Dalton, Rachel; Wang, Yuhao; Moskalenko, Grigoriy Y; Liu, Huiping

    2017-02-15

    Purpose: Effective targeting of cancer stem cells is necessary and important for eradicating cancer and reducing metastasis-related mortality. Understanding of cancer stemness-related signaling pathways at the molecular level will help control cancer and stop metastasis in the clinic. Experimental Design: By analyzing miRNA profiles and functions in cancer development, we aimed to identify regulators of breast tumor stemness and metastasis in human xenograft models in vivo and examined their effects on self-renewal and invasion of breast cancer cells in vitro To discover the direct targets and essential signaling pathways responsible for miRNA functions in breast cancer progression, we performed microarray analysis and target gene prediction in combination with functional studies on candidate genes (overexpression rescues and pheno-copying knockdowns). Results: In this study, we report that hsa-miR-206 suppresses breast tumor stemness and metastasis by inhibiting both self-renewal and invasion. We identified that among the candidate targets, twinfilin ( TWF1 ) rescues the miR-206 phenotype in invasion by enhancing the actin cytoskeleton dynamics and the activity of the mesenchymal lineage transcription factors, megakaryoblastic leukemia (translocation) 1 (MKL1), and serum response factor (SRF). MKL1 and SRF were further demonstrated to promote the expression of IL11 , which is essential for miR-206's function in inhibiting both invasion and stemness of breast cancer. Conclusions: The identification of the miR-206/TWF1/MKL1-SRF/IL11 signaling pathway sheds lights on the understanding of breast cancer initiation and progression, unveils new therapeutic targets, and facilitates innovative drug development to control cancer and block metastasis. Clin Cancer Res; 23(4); 1091-103. ©2016 AACR . ©2016 American Association for Cancer Research.

  4. Structured Kernel Subspace Learning for Autonomous Robot Navigation.

    PubMed

    Kim, Eunwoo; Choi, Sungjoon; Oh, Songhwai

    2018-02-14

    This paper considers two important problems for autonomous robot navigation in a dynamic environment, where the goal is to predict pedestrian motion and control a robot with the prediction for safe navigation. While there are several methods for predicting the motion of a pedestrian and controlling a robot to avoid incoming pedestrians, it is still difficult to safely navigate in a dynamic environment due to challenges, such as the varying quality and complexity of training data with unwanted noises. This paper addresses these challenges simultaneously by proposing a robust kernel subspace learning algorithm based on the recent advances in nuclear-norm and l 1 -norm minimization. We model the motion of a pedestrian and the robot controller using Gaussian processes. The proposed method efficiently approximates a kernel matrix used in Gaussian process regression by learning low-rank structured matrix (with symmetric positive semi-definiteness) to find an orthogonal basis, which eliminates the effects of erroneous and inconsistent data. Based on structured kernel subspace learning, we propose a robust motion model and motion controller for safe navigation in dynamic environments. We evaluate the proposed robust kernel learning in various tasks, including regression, motion prediction, and motion control problems, and demonstrate that the proposed learning-based systems are robust against outliers and outperform existing regression and navigation methods.

  5. Ranking Support Vector Machine with Kernel Approximation

    PubMed Central

    Dou, Yong

    2017-01-01

    Learning to rank algorithm has become important in recent years due to its successful application in information retrieval, recommender system, and computational biology, and so forth. Ranking support vector machine (RankSVM) is one of the state-of-art ranking models and has been favorably used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problem. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. Primal truncated Newton method is used to optimize the pairwise L2-loss (squared Hinge-loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method gets a much faster training speed than kernel RankSVM and achieves comparable or better performance over state-of-the-art ranking algorithms. PMID:28293256

  6. Ranking Support Vector Machine with Kernel Approximation.

    PubMed

    Chen, Kai; Li, Rongchun; Dou, Yong; Liang, Zhengfa; Lv, Qi

    2017-01-01

    Learning to rank algorithm has become important in recent years due to its successful application in information retrieval, recommender system, and computational biology, and so forth. Ranking support vector machine (RankSVM) is one of the state-of-art ranking models and has been favorably used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problem. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. Primal truncated Newton method is used to optimize the pairwise L2-loss (squared Hinge-loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method gets a much faster training speed than kernel RankSVM and achieves comparable or better performance over state-of-the-art ranking algorithms.

  7. CW-SSIM kernel based random forest for image classification

    NASA Astrophysics Data System (ADS)

    Fan, Guangzhe; Wang, Zhou; Wang, Jiheng

    2010-07-01

    Complex wavelet structural similarity (CW-SSIM) index has been proposed as a powerful image similarity metric that is robust to translation, scaling and rotation of images, but how to employ it in image classification applications has not been deeply investigated. In this paper, we incorporate CW-SSIM as a kernel function into a random forest learning algorithm. This leads to a novel image classification approach that does not require a feature extraction or dimension reduction stage at the front end. We use hand-written digit recognition as an example to demonstrate our algorithm. We compare the performance of the proposed approach with random forest learning based on other kernels, including the widely adopted Gaussian and the inner product kernels. Empirical evidences show that the proposed method is superior in its classification power. We also compared our proposed approach with the direct random forest method without kernel and the popular kernel-learning method support vector machine. Our test results based on both simulated and realworld data suggest that the proposed approach works superior to traditional methods without the feature selection procedure.

  8. Deep neural mapping support vector machines.

    PubMed

    Li, Yujian; Zhang, Ting

    2017-09-01

    The choice of kernel has an important effect on the performance of a support vector machine (SVM). The effect could be reduced by NEUROSVM, an architecture using multilayer perceptron for feature extraction and SVM for classification. In binary classification, a general linear kernel NEUROSVM can be theoretically simplified as an input layer, many hidden layers, and an SVM output layer. As a feature extractor, the sub-network composed of the input and hidden layers is first trained together with a virtual ordinary output layer by backpropagation, then with the output of its last hidden layer taken as input of the SVM classifier for further training separately. By taking the sub-network as a kernel mapping from the original input space into a feature space, we present a novel model, called deep neural mapping support vector machine (DNMSVM), from the viewpoint of deep learning. This model is also a new and general kernel learning method, where the kernel mapping is indeed an explicit function expressed as a sub-network, different from an implicit function induced by a kernel function traditionally. Moreover, we exploit a two-stage procedure of contrastive divergence learning and gradient descent for DNMSVM to jointly training an adaptive kernel mapping instead of a kernel function, without requirement of kernel tricks. As a whole of the sub-network and the SVM classifier, the joint training of DNMSVM is done by using gradient descent to optimize the objective function with the sub-network layer-wise pre-trained via contrastive divergence learning of restricted Boltzmann machines. Compared to the separate training of NEUROSVM, the joint training is a new algorithm for DNMSVM to have advantages over NEUROSVM. Experimental results show that DNMSVM can outperform NEUROSVM and RBFSVM (i.e., SVM with the kernel of radial basis function), demonstrating its effectiveness. Copyright © 2017 Elsevier Ltd. All rights reserved.

  9. Feature Weighting via Optimal Thresholding for Video Analysis (Open Access)

    DTIC Science & Technology

    2014-03-03

    EKF MKL LPBoost ALF FWOT BP CaVT FMG GaVU GaA MaS PR PK RaA WaSP Mean 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 P m is s@ T E R = 1 2 .5 Trajectory EKF MKL...processing, and finance. SIAM Journal on Op- timization, 19(3):1344–1367, 2008. [8] H. Kuehne, H. Jhuang, E . Garrote, T. Poggio, and T. Serre. HMDB: a large...Vitaladevuni, X. Zhuang, S. Tsakalidis , U. Park, and R. Prasad. Multimodal feature fusion for robust event detection in web videos. In CVPR, 2012. [16] A

  10. SRF phosphorylation by glycogen synthase kinase-3 promotes axon growth in hippocampal neurons.

    PubMed

    Li, Cong L; Sathyamurthy, Aruna; Oldenborg, Anna; Tank, Dharmesh; Ramanan, Narendrakumar

    2014-03-12

    The growth of axons is an intricately regulated process involving intracellular signaling cascades and gene transcription. We had previously shown that the stimulus-dependent transcription factor, serum response factor (SRF), plays a critical role in regulating axon growth in the mammalian brain. However, the molecular mechanisms underlying SRF-dependent axon growth remains unknown. Here we report that SRF is phosphorylated and activated by GSK-3 to promote axon outgrowth in mouse hippocampal neurons. GSK-3 binds to and directly phosphorylates SRF on a highly conserved serine residue. This serine phosphorylation is necessary for SRF activity and for its interaction with MKL-family cofactors, MKL1 and MKL2, but not with TCF-family cofactor, ELK-1. Axonal growth deficits caused by GSK-3 inhibition could be rescued by expression of a constitutively active SRF. The SRF target gene and actin-binding protein, vinculin, is sufficient to overcome the axonal growth deficits of SRF-deficient and GSK-3-inhibited neurons. Furthermore, short hairpin RNA-mediated knockdown of vinculin also attenuated axonal growth. Thus, our findings reveal a novel phosphorylation and activation of SRF by GSK-3 that is critical for SRF-dependent axon growth in mammalian central neurons.

  11. A multi-label learning based kernel automatic recommendation method for support vector machine.

    PubMed

    Zhang, Xueying; Song, Qinbao

    2015-01-01

    Choosing an appropriate kernel is very important and critical when classifying a new problem with Support Vector Machine. So far, more attention has been paid on constructing new kernels and choosing suitable parameter values for a specific kernel function, but less on kernel selection. Furthermore, most of current kernel selection methods focus on seeking a best kernel with the highest classification accuracy via cross-validation, they are time consuming and ignore the differences among the number of support vectors and the CPU time of SVM with different kernels. Considering the tradeoff between classification success ratio and CPU time, there may be multiple kernel functions performing equally well on the same classification problem. Aiming to automatically select those appropriate kernel functions for a given data set, we propose a multi-label learning based kernel recommendation method built on the data characteristics. For each data set, the meta-knowledge data base is first created by extracting the feature vector of data characteristics and identifying the corresponding applicable kernel set. Then the kernel recommendation model is constructed on the generated meta-knowledge data base with the multi-label classification method. Finally, the appropriate kernel functions are recommended to a new data set by the recommendation model according to the characteristics of the new data set. Extensive experiments over 132 UCI benchmark data sets, with five different types of data set characteristics, eleven typical kernels (Linear, Polynomial, Radial Basis Function, Sigmoidal function, Laplace, Multiquadric, Rational Quadratic, Spherical, Spline, Wave and Circular), and five multi-label classification methods demonstrate that, compared with the existing kernel selection methods and the most widely used RBF kernel function, SVM with the kernel function recommended by our proposed method achieved the highest classification performance.

  12. A Multi-Label Learning Based Kernel Automatic Recommendation Method for Support Vector Machine

    PubMed Central

    Zhang, Xueying; Song, Qinbao

    2015-01-01

    Choosing an appropriate kernel is very important and critical when classifying a new problem with Support Vector Machine. So far, more attention has been paid on constructing new kernels and choosing suitable parameter values for a specific kernel function, but less on kernel selection. Furthermore, most of current kernel selection methods focus on seeking a best kernel with the highest classification accuracy via cross-validation, they are time consuming and ignore the differences among the number of support vectors and the CPU time of SVM with different kernels. Considering the tradeoff between classification success ratio and CPU time, there may be multiple kernel functions performing equally well on the same classification problem. Aiming to automatically select those appropriate kernel functions for a given data set, we propose a multi-label learning based kernel recommendation method built on the data characteristics. For each data set, the meta-knowledge data base is first created by extracting the feature vector of data characteristics and identifying the corresponding applicable kernel set. Then the kernel recommendation model is constructed on the generated meta-knowledge data base with the multi-label classification method. Finally, the appropriate kernel functions are recommended to a new data set by the recommendation model according to the characteristics of the new data set. Extensive experiments over 132 UCI benchmark data sets, with five different types of data set characteristics, eleven typical kernels (Linear, Polynomial, Radial Basis Function, Sigmoidal function, Laplace, Multiquadric, Rational Quadratic, Spherical, Spline, Wave and Circular), and five multi-label classification methods demonstrate that, compared with the existing kernel selection methods and the most widely used RBF kernel function, SVM with the kernel function recommended by our proposed method achieved the highest classification performance. PMID:25893896

  13. A scalable kernel-based semisupervised metric learning algorithm with out-of-sample generalization ability.

    PubMed

    Yeung, Dit-Yan; Chang, Hong; Dai, Guang

    2008-11-01

    In recent years, metric learning in the semisupervised setting has aroused a lot of research interest. One type of semisupervised metric learning utilizes supervisory information in the form of pairwise similarity or dissimilarity constraints. However, most methods proposed so far are either limited to linear metric learning or unable to scale well with the data set size. In this letter, we propose a nonlinear metric learning method based on the kernel approach. By applying low-rank approximation to the kernel matrix, our method can handle significantly larger data sets. Moreover, our low-rank approximation scheme can naturally lead to out-of-sample generalization. Experiments performed on both artificial and real-world data show very promising results.

  14. Generalization Analysis of Fredholm Kernel Regularized Classifiers.

    PubMed

    Gong, Tieliang; Xu, Zongben; Chen, Hong

    2017-07-01

    Recently, a new framework, Fredholm learning, was proposed for semisupervised learning problems based on solving a regularized Fredholm integral equation. It allows a natural way to incorporate unlabeled data into learning algorithms to improve their prediction performance. Despite rapid progress on implementable algorithms with theoretical guarantees, the generalization ability of Fredholm kernel learning has not been studied. In this letter, we focus on investigating the generalization performance of a family of classification algorithms, referred to as Fredholm kernel regularized classifiers. We prove that the corresponding learning rate can achieve [Formula: see text] ([Formula: see text] is the number of labeled samples) in a limiting case. In addition, a representer theorem is provided for the proposed regularized scheme, which underlies its applications.

  15. A Classification of Remote Sensing Image Based on Improved Compound Kernels of Svm

    NASA Astrophysics Data System (ADS)

    Zhao, Jianing; Gao, Wanlin; Liu, Zili; Mou, Guifen; Lu, Lin; Yu, Lina

    The accuracy of RS classification based on SVM which is developed from statistical learning theory is high under small number of train samples, which results in satisfaction of classification on RS using SVM methods. The traditional RS classification method combines visual interpretation with computer classification. The accuracy of the RS classification, however, is improved a lot based on SVM method, because it saves much labor and time which is used to interpret images and collect training samples. Kernel functions play an important part in the SVM algorithm. It uses improved compound kernel function and therefore has a higher accuracy of classification on RS images. Moreover, compound kernel improves the generalization and learning ability of the kernel.

  16. Integrating semantic information into multiple kernels for protein-protein interaction extraction from biomedical literatures.

    PubMed

    Li, Lishuang; Zhang, Panpan; Zheng, Tianfu; Zhang, Hongying; Jiang, Zhenchao; Huang, Degen

    2014-01-01

    Protein-Protein Interaction (PPI) extraction is an important task in the biomedical information extraction. Presently, many machine learning methods for PPI extraction have achieved promising results. However, the performance is still not satisfactory. One reason is that the semantic resources were basically ignored. In this paper, we propose a multiple-kernel learning-based approach to extract PPIs, combining the feature-based kernel, tree kernel and semantic kernel. Particularly, we extend the shortest path-enclosed tree kernel (SPT) by a dynamic extended strategy to retrieve the richer syntactic information. Our semantic kernel calculates the protein-protein pair similarity and the context similarity based on two semantic resources: WordNet and Medical Subject Heading (MeSH). We evaluate our method with Support Vector Machine (SVM) and achieve an F-score of 69.40% and an AUC of 92.00%, which show that our method outperforms most of the state-of-the-art systems by integrating semantic information.

  17. Deep Learning @15 Petaflops/second: Semi-supervised pattern detection for 15 Terabytes of climate data

    NASA Astrophysics Data System (ADS)

    Collins, W. D.; Wehner, M. F.; Prabhat, M.; Kurth, T.; Satish, N.; Mitliagkas, I.; Zhang, J.; Racah, E.; Patwary, M.; Sundaram, N.; Dubey, P.

    2017-12-01

    Anthropogenically-forced climate changes in the number and character of extreme storms have the potential to significantly impact human and natural systems. Current high-performance computing enables multidecadal simulations with global climate models at resolutions of 25km or finer. Such high-resolution simulations are demonstrably superior in simulating extreme storms such as tropical cyclones than the coarser simulations available in the Coupled Model Intercomparison Project (CMIP5) and provide the capability to more credibly project future changes in extreme storm statistics and properties. The identification and tracking of storms in the voluminous model output is very challenging as it is impractical to manually identify storms due to the enormous size of the datasets, and therefore automated procedures are used. Traditionally, these procedures are based on a multi-variate set of physical conditions based on known properties of the class of storms in question. In recent years, we have successfully demonstrated that Deep Learning produces state of the art results for pattern detection in climate data. We have developed supervised and semi-supervised convolutional architectures for detecting and localizing tropical cyclones, extra-tropical cyclones and atmospheric rivers in simulation data. One of the primary challenges in the applicability of Deep Learning to climate data is in the expensive training phase. Typical networks may take days to converge on 10GB-sized datasets, while the climate science community has ready access to O(10 TB)-O(PB) sized datasets. In this work, we present the most scalable implementation of Deep Learning to date. We successfully scale a unified, semi-supervised convolutional architecture on all of the Cori Phase II supercomputer at NERSC. We use IntelCaffe, MKL and MLSL libraries. We have optimized single node MKL libraries to obtain 1-4 TF on single KNL nodes. We have developed a novel hybrid parameter update strategy to improve scaling to 9600 KNL nodes (600,000 cores). We obtain 15PF performance over the course of the training run; setting a new watermark for the HPC and Deep Learning communities. This talk will share insights on how to obtain this extreme level of performance, current gaps/challenges and implications for the climate science community.

  18. Scalable Nonparametric Low-Rank Kernel Learning Using Block Coordinate Descent.

    PubMed

    Hu, En-Liang; Kwok, James T

    2015-09-01

    Nonparametric kernel learning (NPKL) is a flexible approach to learn the kernel matrix directly without assuming any parametric form. It can be naturally formulated as a semidefinite program (SDP), which, however, is not very scalable. To address this problem, we propose the combined use of low-rank approximation and block coordinate descent (BCD). Low-rank approximation avoids the expensive positive semidefinite constraint in the SDP by replacing the kernel matrix variable with V(T)V, where V is a low-rank matrix. The resultant nonlinear optimization problem is then solved by BCD, which optimizes each column of V sequentially. It can be shown that the proposed algorithm has nice convergence properties and low computational complexities. Experiments on a number of real-world data sets show that the proposed algorithm outperforms state-of-the-art NPKL solvers.

  19. The Classification of Diabetes Mellitus Using Kernel k-means

    NASA Astrophysics Data System (ADS)

    Alamsyah, M.; Nafisah, Z.; Prayitno, E.; Afida, A. M.; Imah, E. M.

    2018-01-01

    Diabetes Mellitus is a metabolic disorder which is characterized by chronicle hypertensive glucose. Automatics detection of diabetes mellitus is still challenging. This study detected diabetes mellitus by using kernel k-Means algorithm. Kernel k-means is an algorithm which was developed from k-means algorithm. Kernel k-means used kernel learning that is able to handle non linear separable data; where it differs with a common k-means. The performance of kernel k-means in detecting diabetes mellitus is also compared with SOM algorithms. The experiment result shows that kernel k-means has good performance and a way much better than SOM.

  20. Kernelized Elastic Net Regularization: Generalization Bounds, and Sparse Recovery.

    PubMed

    Feng, Yunlong; Lv, Shao-Gao; Hang, Hanyuan; Suykens, Johan A K

    2016-03-01

    Kernelized elastic net regularization (KENReg) is a kernelization of the well-known elastic net regularization (Zou & Hastie, 2005). The kernel in KENReg is not required to be a Mercer kernel since it learns from a kernelized dictionary in the coefficient space. Feng, Yang, Zhao, Lv, and Suykens (2014) showed that KENReg has some nice properties including stability, sparseness, and generalization. In this letter, we continue our study on KENReg by conducting a refined learning theory analysis. This letter makes the following three main contributions. First, we present refined error analysis on the generalization performance of KENReg. The main difficulty of analyzing the generalization error of KENReg lies in characterizing the population version of its empirical target function. We overcome this by introducing a weighted Banach space associated with the elastic net regularization. We are then able to conduct elaborated learning theory analysis and obtain fast convergence rates under proper complexity and regularity assumptions. Second, we study the sparse recovery problem in KENReg with fixed design and show that the kernelization may improve the sparse recovery ability compared to the classical elastic net regularization. Finally, we discuss the interplay among different properties of KENReg that include sparseness, stability, and generalization. We show that the stability of KENReg leads to generalization, and its sparseness confidence can be derived from generalization. Moreover, KENReg is stable and can be simultaneously sparse, which makes it attractive theoretically and practically.

  1. Inference of Spatio-Temporal Functions Over Graphs via Multikernel Kriged Kalman Filtering

    NASA Astrophysics Data System (ADS)

    Ioannidis, Vassilis N.; Romero, Daniel; Giannakis, Georgios B.

    2018-06-01

    Inference of space-time varying signals on graphs emerges naturally in a plethora of network science related applications. A frequently encountered challenge pertains to reconstructing such dynamic processes, given their values over a subset of vertices and time instants. The present paper develops a graph-aware kernel-based kriged Kalman filter that accounts for the spatio-temporal variations, and offers efficient online reconstruction, even for dynamically evolving network topologies. The kernel-based learning framework bypasses the need for statistical information by capitalizing on the smoothness that graph signals exhibit with respect to the underlying graph. To address the challenge of selecting the appropriate kernel, the proposed filter is combined with a multi-kernel selection module. Such a data-driven method selects a kernel attuned to the signal dynamics on-the-fly within the linear span of a pre-selected dictionary. The novel multi-kernel learning algorithm exploits the eigenstructure of Laplacian kernel matrices to reduce computational complexity. Numerical tests with synthetic and real data demonstrate the superior reconstruction performance of the novel approach relative to state-of-the-art alternatives.

  2. CLAss-Specific Subspace Kernel Representations and Adaptive Margin Slack Minimization for Large Scale Classification.

    PubMed

    Yu, Yinan; Diamantaras, Konstantinos I; McKelvey, Tomas; Kung, Sun-Yuan

    2018-02-01

    In kernel-based classification models, given limited computational power and storage capacity, operations over the full kernel matrix becomes prohibitive. In this paper, we propose a new supervised learning framework using kernel models for sequential data processing. The framework is based on two components that both aim at enhancing the classification capability with a subset selection scheme. The first part is a subspace projection technique in the reproducing kernel Hilbert space using a CLAss-specific Subspace Kernel representation for kernel approximation. In the second part, we propose a novel structural risk minimization algorithm called the adaptive margin slack minimization to iteratively improve the classification accuracy by an adaptive data selection. We motivate each part separately, and then integrate them into learning frameworks for large scale data. We propose two such frameworks: the memory efficient sequential processing for sequential data processing and the parallelized sequential processing for distributed computing with sequential data acquisition. We test our methods on several benchmark data sets and compared with the state-of-the-art techniques to verify the validity of the proposed techniques.

  3. An Approximate Approach to Automatic Kernel Selection.

    PubMed

    Ding, Lizhong; Liao, Shizhong

    2016-02-02

    Kernel selection is a fundamental problem of kernel-based learning algorithms. In this paper, we propose an approximate approach to automatic kernel selection for regression from the perspective of kernel matrix approximation. We first introduce multilevel circulant matrices into automatic kernel selection, and develop two approximate kernel selection algorithms by exploiting the computational virtues of multilevel circulant matrices. The complexity of the proposed algorithms is quasi-linear in the number of data points. Then, we prove an approximation error bound to measure the effect of the approximation in kernel matrices by multilevel circulant matrices on the hypothesis and further show that the approximate hypothesis produced with multilevel circulant matrices converges to the accurate hypothesis produced with kernel matrices. Experimental evaluations on benchmark datasets demonstrate the effectiveness of approximate kernel selection.

  4. Classification With Truncated Distance Kernel.

    PubMed

    Huang, Xiaolin; Suykens, Johan A K; Wang, Shuning; Hornegger, Joachim; Maier, Andreas

    2018-05-01

    This brief proposes a truncated distance (TL1) kernel, which results in a classifier that is nonlinear in the global region but is linear in each subregion. With this kernel, the subregion structure can be trained using all the training data and local linear classifiers can be established simultaneously. The TL1 kernel has good adaptiveness to nonlinearity and is suitable for problems which require different nonlinearities in different areas. Though the TL1 kernel is not positive semidefinite, some classical kernel learning methods are still applicable which means that the TL1 kernel can be directly used in standard toolboxes by replacing the kernel evaluation. In numerical experiments, the TL1 kernel with a pregiven parameter achieves similar or better performance than the radial basis function kernel with the parameter tuned by cross validation, implying the TL1 kernel a promising nonlinear kernel for classification tasks.

  5. A ℓ2, 1 norm regularized multi-kernel learning for false positive reduction in Lung nodule CAD.

    PubMed

    Cao, Peng; Liu, Xiaoli; Zhang, Jian; Li, Wei; Zhao, Dazhe; Huang, Min; Zaiane, Osmar

    2017-03-01

    The aim of this paper is to describe a novel algorithm for False Positive Reduction in lung nodule Computer Aided Detection(CAD). In this paper, we describes a new CT lung CAD method which aims to detect solid nodules. Specially, we proposed a multi-kernel classifier with a ℓ 2, 1 norm regularizer for heterogeneous feature fusion and selection from the feature subset level, and designed two efficient strategies to optimize the parameters of kernel weights in non-smooth ℓ 2, 1 regularized multiple kernel learning algorithm. The first optimization algorithm adapts a proximal gradient method for solving the ℓ 2, 1 norm of kernel weights, and use an accelerated method based on FISTA; the second one employs an iterative scheme based on an approximate gradient descent method. The results demonstrates that the FISTA-style accelerated proximal descent method is efficient for the ℓ 2, 1 norm formulation of multiple kernel learning with the theoretical guarantee of the convergence rate. Moreover, the experimental results demonstrate the effectiveness of the proposed methods in terms of Geometric mean (G-mean) and Area under the ROC curve (AUC), and significantly outperforms the competing methods. The proposed approach exhibits some remarkable advantages both in heterogeneous feature subsets fusion and classification phases. Compared with the fusion strategies of feature-level and decision level, the proposed ℓ 2, 1 norm multi-kernel learning algorithm is able to accurately fuse the complementary and heterogeneous feature sets, and automatically prune the irrelevant and redundant feature subsets to form a more discriminative feature set, leading a promising classification performance. Moreover, the proposed algorithm consistently outperforms the comparable classification approaches in the literature. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  6. Multiple Kernel Learning with Random Effects for Predicting Longitudinal Outcomes and Data Integration

    PubMed Central

    Chen, Tianle; Zeng, Donglin

    2015-01-01

    Summary Predicting disease risk and progression is one of the main goals in many clinical research studies. Cohort studies on the natural history and etiology of chronic diseases span years and data are collected at multiple visits. Although kernel-based statistical learning methods are proven to be powerful for a wide range of disease prediction problems, these methods are only well studied for independent data but not for longitudinal data. It is thus important to develop time-sensitive prediction rules that make use of the longitudinal nature of the data. In this paper, we develop a novel statistical learning method for longitudinal data by introducing subject-specific short-term and long-term latent effects through a designed kernel to account for within-subject correlation of longitudinal measurements. Since the presence of multiple sources of data is increasingly common, we embed our method in a multiple kernel learning framework and propose a regularized multiple kernel statistical learning with random effects to construct effective nonparametric prediction rules. Our method allows easy integration of various heterogeneous data sources and takes advantage of correlation among longitudinal measures to increase prediction power. We use different kernels for each data source taking advantage of the distinctive feature of each data modality, and then optimally combine data across modalities. We apply the developed methods to two large epidemiological studies, one on Huntington's disease and the other on Alzheimer's Disease (Alzheimer's Disease Neuroimaging Initiative, ADNI) where we explore a unique opportunity to combine imaging and genetic data to study prediction of mild cognitive impairment, and show a substantial gain in performance while accounting for the longitudinal aspect of the data. PMID:26177419

  7. An information theoretic approach of designing sparse kernel adaptive filters.

    PubMed

    Liu, Weifeng; Park, Il; Principe, José C

    2009-12-01

    This paper discusses an information theoretic approach of designing sparse kernel adaptive filters. To determine useful data to be learned and remove redundant ones, a subjective information measure called surprise is introduced. Surprise captures the amount of information a datum contains which is transferable to a learning system. Based on this concept, we propose a systematic sparsification scheme, which can drastically reduce the time and space complexity without harming the performance of kernel adaptive filters. Nonlinear regression, short term chaotic time-series prediction, and long term time-series forecasting examples are presented.

  8. Automatic classification of retinal three-dimensional optical coherence tomography images using principal component analysis network with composite kernels

    NASA Astrophysics Data System (ADS)

    Fang, Leyuan; Wang, Chong; Li, Shutao; Yan, Jun; Chen, Xiangdong; Rabbani, Hossein

    2017-11-01

    We present an automatic method, termed as the principal component analysis network with composite kernel (PCANet-CK), for the classification of three-dimensional (3-D) retinal optical coherence tomography (OCT) images. Specifically, the proposed PCANet-CK method first utilizes the PCANet to automatically learn features from each B-scan of the 3-D retinal OCT images. Then, multiple kernels are separately applied to a set of very important features of the B-scans and these kernels are fused together, which can jointly exploit the correlations among features of the 3-D OCT images. Finally, the fused (composite) kernel is incorporated into an extreme learning machine for the OCT image classification. We tested our proposed algorithm on two real 3-D spectral domain OCT (SD-OCT) datasets (of normal subjects and subjects with the macular edema and age-related macular degeneration), which demonstrated its effectiveness.

  9. Adaptive Shape Kernel-Based Mean Shift Tracker in Robot Vision System

    PubMed Central

    2016-01-01

    This paper proposes an adaptive shape kernel-based mean shift tracker using a single static camera for the robot vision system. The question that we address in this paper is how to construct such a kernel shape that is adaptive to the object shape. We perform nonlinear manifold learning technique to obtain the low-dimensional shape space which is trained by training data with the same view as the tracking video. The proposed kernel searches the shape in the low-dimensional shape space obtained by nonlinear manifold learning technique and constructs the adaptive kernel shape in the high-dimensional shape space. It can improve mean shift tracker performance to track object position and object contour and avoid the background clutter. In the experimental part, we take the walking human as example to validate that our method is accurate and robust to track human position and describe human contour. PMID:27379165

  10. Semisupervised kernel marginal Fisher analysis for face recognition.

    PubMed

    Wang, Ziqiang; Sun, Xia; Sun, Lijun; Huang, Yuchun

    2013-01-01

    Dimensionality reduction is a key problem in face recognition due to the high-dimensionality of face image. To effectively cope with this problem, a novel dimensionality reduction algorithm called semisupervised kernel marginal Fisher analysis (SKMFA) for face recognition is proposed in this paper. SKMFA can make use of both labelled and unlabeled samples to learn the projection matrix for nonlinear dimensionality reduction. Meanwhile, it can successfully avoid the singularity problem by not calculating the matrix inverse. In addition, in order to make the nonlinear structure captured by the data-dependent kernel consistent with the intrinsic manifold structure, a manifold adaptive nonparameter kernel is incorporated into the learning process of SKMFA. Experimental results on three face image databases demonstrate the effectiveness of our proposed algorithm.

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Allada, Veerendra, Benjegerdes, Troy; Bode, Brett

    Commodity clusters augmented with application accelerators are evolving as competitive high performance computing systems. The Graphical Processing Unit (GPU) with a very high arithmetic density and performance per price ratio is a good platform for the scientific application acceleration. In addition to the interconnect bottlenecks among the cluster compute nodes, the cost of memory copies between the host and the GPU device have to be carefully amortized to improve the overall efficiency of the application. Scientific applications also rely on efficient implementation of the BAsic Linear Algebra Subroutines (BLAS), among which the General Matrix Multiply (GEMM) is considered as themore » workhorse subroutine. In this paper, they study the performance of the memory copies and GEMM subroutines that are critical to port the computational chemistry algorithms to the GPU clusters. To that end, a benchmark based on the NetPIPE framework is developed to evaluate the latency and bandwidth of the memory copies between the host and the GPU device. The performance of the single and double precision GEMM subroutines from the NVIDIA CUBLAS 2.0 library are studied. The results have been compared with that of the BLAS routines from the Intel Math Kernel Library (MKL) to understand the computational trade-offs. The test bed is a Intel Xeon cluster equipped with NVIDIA Tesla GPUs.« less

  12. Improving prediction of heterodimeric protein complexes using combination with pairwise kernel.

    PubMed

    Ruan, Peiying; Hayashida, Morihiro; Akutsu, Tatsuya; Vert, Jean-Philippe

    2018-02-19

    Since many proteins become functional only after they interact with their partner proteins and form protein complexes, it is essential to identify the sets of proteins that form complexes. Therefore, several computational methods have been proposed to predict complexes from the topology and structure of experimental protein-protein interaction (PPI) network. These methods work well to predict complexes involving at least three proteins, but generally fail at identifying complexes involving only two different proteins, called heterodimeric complexes or heterodimers. There is however an urgent need for efficient methods to predict heterodimers, since the majority of known protein complexes are precisely heterodimers. In this paper, we use three promising kernel functions, Min kernel and two pairwise kernels, which are Metric Learning Pairwise Kernel (MLPK) and Tensor Product Pairwise Kernel (TPPK). We also consider the normalization forms of Min kernel. Then, we combine Min kernel or its normalization form and one of the pairwise kernels by plugging. We applied kernels based on PPI, domain, phylogenetic profile, and subcellular localization properties to predicting heterodimers. Then, we evaluate our method by employing C-Support Vector Classification (C-SVC), carrying out 10-fold cross-validation, and calculating the average F-measures. The results suggest that the combination of normalized-Min-kernel and MLPK leads to the best F-measure and improved the performance of our previous work, which had been the best existing method so far. We propose new methods to predict heterodimers, using a machine learning-based approach. We train a support vector machine (SVM) to discriminate interacting vs non-interacting protein pairs, based on informations extracted from PPI, domain, phylogenetic profiles and subcellular localization. We evaluate in detail new kernel functions to encode these data, and report prediction performance that outperforms the state-of-the-art.

  13. Protein fold recognition using geometric kernel data fusion.

    PubMed

    Zakeri, Pooya; Jeuris, Ben; Vandebril, Raf; Moreau, Yves

    2014-07-01

    Various approaches based on features extracted from protein sequences and often machine learning methods have been used in the prediction of protein folds. Finding an efficient technique for integrating these different protein features has received increasing attention. In particular, kernel methods are an interesting class of techniques for integrating heterogeneous data. Various methods have been proposed to fuse multiple kernels. Most techniques for multiple kernel learning focus on learning a convex linear combination of base kernels. In addition to the limitation of linear combinations, working with such approaches could cause a loss of potentially useful information. We design several techniques to combine kernel matrices by taking more involved, geometry inspired means of these matrices instead of convex linear combinations. We consider various sequence-based protein features including information extracted directly from position-specific scoring matrices and local sequence alignment. We evaluate our methods for classification on the SCOP PDB-40D benchmark dataset for protein fold recognition. The best overall accuracy on the protein fold recognition test set obtained by our methods is ∼ 86.7%. This is an improvement over the results of the best existing approach. Moreover, our computational model has been developed by incorporating the functional domain composition of proteins through a hybridization model. It is observed that by using our proposed hybridization model, the protein fold recognition accuracy is further improved to 89.30%. Furthermore, we investigate the performance of our approach on the protein remote homology detection problem by fusing multiple string kernels. The MATLAB code used for our proposed geometric kernel fusion frameworks are publicly available at http://people.cs.kuleuven.be/∼raf.vandebril/homepage/software/geomean.php?menu=5/. © The Author 2014. Published by Oxford University Press.

  14. Efficient nonparametric n -body force fields from machine learning

    NASA Astrophysics Data System (ADS)

    Glielmo, Aldo; Zeni, Claudio; De Vita, Alessandro

    2018-05-01

    We provide a definition and explicit expressions for n -body Gaussian process (GP) kernels, which can learn any interatomic interaction occurring in a physical system, up to n -body contributions, for any value of n . The series is complete, as it can be shown that the "universal approximator" squared exponential kernel can be written as a sum of n -body kernels. These recipes enable the choice of optimally efficient force models for each target system, as confirmed by extensive testing on various materials. We furthermore describe how the n -body kernels can be "mapped" on equivalent representations that provide database-size-independent predictions and are thus crucially more efficient. We explicitly carry out this mapping procedure for the first nontrivial (three-body) kernel of the series, and we show that this reproduces the GP-predicted forces with meV /Å accuracy while being orders of magnitude faster. These results pave the way to using novel force models (here named "M-FFs") that are computationally as fast as their corresponding standard parametrized n -body force fields, while retaining the nonparametric character, the ease of training and validation, and the accuracy of the best recently proposed machine-learning potentials.

  15. On Quantile Regression in Reproducing Kernel Hilbert Spaces with Data Sparsity Constraint

    PubMed Central

    Zhang, Chong; Liu, Yufeng; Wu, Yichao

    2015-01-01

    For spline regressions, it is well known that the choice of knots is crucial for the performance of the estimator. As a general learning framework covering the smoothing splines, learning in a Reproducing Kernel Hilbert Space (RKHS) has a similar issue. However, the selection of training data points for kernel functions in the RKHS representation has not been carefully studied in the literature. In this paper we study quantile regression as an example of learning in a RKHS. In this case, the regular squared norm penalty does not perform training data selection. We propose a data sparsity constraint that imposes thresholding on the kernel function coefficients to achieve a sparse kernel function representation. We demonstrate that the proposed data sparsity method can have competitive prediction performance for certain situations, and have comparable performance in other cases compared to that of the traditional squared norm penalty. Therefore, the data sparsity method can serve as a competitive alternative to the squared norm penalty method. Some theoretical properties of our proposed method using the data sparsity constraint are obtained. Both simulated and real data sets are used to demonstrate the usefulness of our data sparsity constraint. PMID:27134575

  16. Protein Analysis Meets Visual Word Recognition: A Case for String Kernels in the Brain

    ERIC Educational Resources Information Center

    Hannagan, Thomas; Grainger, Jonathan

    2012-01-01

    It has been recently argued that some machine learning techniques known as Kernel methods could be relevant for capturing cognitive and neural mechanisms (Jakel, Scholkopf, & Wichmann, 2009). We point out that "String kernels," initially designed for protein function prediction and spam detection, are virtually identical to one contending proposal…

  17. Partial Deconvolution with Inaccurate Blur Kernel.

    PubMed

    Ren, Dongwei; Zuo, Wangmeng; Zhang, David; Xu, Jun; Zhang, Lei

    2017-10-17

    Most non-blind deconvolution methods are developed under the error-free kernel assumption, and are not robust to inaccurate blur kernel. Unfortunately, despite the great progress in blind deconvolution, estimation error remains inevitable during blur kernel estimation. Consequently, severe artifacts such as ringing effects and distortions are likely to be introduced in the non-blind deconvolution stage. In this paper, we tackle this issue by suggesting: (i) a partial map in the Fourier domain for modeling kernel estimation error, and (ii) a partial deconvolution model for robust deblurring with inaccurate blur kernel. The partial map is constructed by detecting the reliable Fourier entries of estimated blur kernel. And partial deconvolution is applied to wavelet-based and learning-based models to suppress the adverse effect of kernel estimation error. Furthermore, an E-M algorithm is developed for estimating the partial map and recovering the latent sharp image alternatively. Experimental results show that our partial deconvolution model is effective in relieving artifacts caused by inaccurate blur kernel, and can achieve favorable deblurring quality on synthetic and real blurry images.Most non-blind deconvolution methods are developed under the error-free kernel assumption, and are not robust to inaccurate blur kernel. Unfortunately, despite the great progress in blind deconvolution, estimation error remains inevitable during blur kernel estimation. Consequently, severe artifacts such as ringing effects and distortions are likely to be introduced in the non-blind deconvolution stage. In this paper, we tackle this issue by suggesting: (i) a partial map in the Fourier domain for modeling kernel estimation error, and (ii) a partial deconvolution model for robust deblurring with inaccurate blur kernel. The partial map is constructed by detecting the reliable Fourier entries of estimated blur kernel. And partial deconvolution is applied to wavelet-based and learning-based models to suppress the adverse effect of kernel estimation error. Furthermore, an E-M algorithm is developed for estimating the partial map and recovering the latent sharp image alternatively. Experimental results show that our partial deconvolution model is effective in relieving artifacts caused by inaccurate blur kernel, and can achieve favorable deblurring quality on synthetic and real blurry images.

  18. Vowel Imagery Decoding toward Silent Speech BCI Using Extreme Learning Machine with Electroencephalogram

    PubMed Central

    Kim, Jongin; Park, Hyeong-jun

    2016-01-01

    The purpose of this study is to classify EEG data on imagined speech in a single trial. We recorded EEG data while five subjects imagined different vowels, /a/, /e/, /i/, /o/, and /u/. We divided each single trial dataset into thirty segments and extracted features (mean, variance, standard deviation, and skewness) from all segments. To reduce the dimension of the feature vector, we applied a feature selection algorithm based on the sparse regression model. These features were classified using a support vector machine with a radial basis function kernel, an extreme learning machine, and two variants of an extreme learning machine with different kernels. Because each single trial consisted of thirty segments, our algorithm decided the label of the single trial by selecting the most frequent output among the outputs of the thirty segments. As a result, we observed that the extreme learning machine and its variants achieved better classification rates than the support vector machine with a radial basis function kernel and linear discrimination analysis. Thus, our results suggested that EEG responses to imagined speech could be successfully classified in a single trial using an extreme learning machine with a radial basis function and linear kernel. This study with classification of imagined speech might contribute to the development of silent speech BCI systems. PMID:28097128

  19. Implementing Kernel Methods Incrementally by Incremental Nonlinear Projection Trick.

    PubMed

    Kwak, Nojun

    2016-05-20

    Recently, the nonlinear projection trick (NPT) was introduced enabling direct computation of coordinates of samples in a reproducing kernel Hilbert space. With NPT, any machine learning algorithm can be extended to a kernel version without relying on the so called kernel trick. However, NPT is inherently difficult to be implemented incrementally because an ever increasing kernel matrix should be treated as additional training samples are introduced. In this paper, an incremental version of the NPT (INPT) is proposed based on the observation that the centerization step in NPT is unnecessary. Because the proposed INPT does not change the coordinates of the old data, the coordinates obtained by INPT can directly be used in any incremental methods to implement a kernel version of the incremental methods. The effectiveness of the INPT is shown by applying it to implement incremental versions of kernel methods such as, kernel singular value decomposition, kernel principal component analysis, and kernel discriminant analysis which are utilized for problems of kernel matrix reconstruction, letter classification, and face image retrieval, respectively.

  20. Putting Priors in Mixture Density Mercer Kernels

    NASA Technical Reports Server (NTRS)

    Srivastava, Ashok N.; Schumann, Johann; Fischer, Bernd

    2004-01-01

    This paper presents a new methodology for automatic knowledge driven data mining based on the theory of Mercer Kernels, which are highly nonlinear symmetric positive definite mappings from the original image space to a very high, possibly infinite dimensional feature space. We describe a new method called Mixture Density Mercer Kernels to learn kernel function directly from data, rather than using predefined kernels. These data adaptive kernels can en- code prior knowledge in the kernel using a Bayesian formulation, thus allowing for physical information to be encoded in the model. We compare the results with existing algorithms on data from the Sloan Digital Sky Survey (SDSS). The code for these experiments has been generated with the AUTOBAYES tool, which automatically generates efficient and documented C/C++ code from abstract statistical model specifications. The core of the system is a schema library which contains template for learning and knowledge discovery algorithms like different versions of EM, or numeric optimization methods like conjugate gradient methods. The template instantiation is supported by symbolic- algebraic computations, which allows AUTOBAYES to find closed-form solutions and, where possible, to integrate them into the code. The results show that the Mixture Density Mercer-Kernel described here outperforms tree-based classification in distinguishing high-redshift galaxies from low- redshift galaxies by approximately 16% on test data, bagged trees by approximately 7%, and bagged trees built on a much larger sample of data by approximately 2%.

  1. Automatic classification of retinal three-dimensional optical coherence tomography images using principal component analysis network with composite kernels.

    PubMed

    Fang, Leyuan; Wang, Chong; Li, Shutao; Yan, Jun; Chen, Xiangdong; Rabbani, Hossein

    2017-11-01

    We present an automatic method, termed as the principal component analysis network with composite kernel (PCANet-CK), for the classification of three-dimensional (3-D) retinal optical coherence tomography (OCT) images. Specifically, the proposed PCANet-CK method first utilizes the PCANet to automatically learn features from each B-scan of the 3-D retinal OCT images. Then, multiple kernels are separately applied to a set of very important features of the B-scans and these kernels are fused together, which can jointly exploit the correlations among features of the 3-D OCT images. Finally, the fused (composite) kernel is incorporated into an extreme learning machine for the OCT image classification. We tested our proposed algorithm on two real 3-D spectral domain OCT (SD-OCT) datasets (of normal subjects and subjects with the macular edema and age-related macular degeneration), which demonstrated its effectiveness. (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).

  2. Kernel-based least squares policy iteration for reinforcement learning.

    PubMed

    Xu, Xin; Hu, Dewen; Lu, Xicheng

    2007-07-01

    In this paper, we present a kernel-based least squares policy iteration (KLSPI) algorithm for reinforcement learning (RL) in large or continuous state spaces, which can be used to realize adaptive feedback control of uncertain dynamic systems. By using KLSPI, near-optimal control policies can be obtained without much a priori knowledge on dynamic models of control plants. In KLSPI, Mercer kernels are used in the policy evaluation of a policy iteration process, where a new kernel-based least squares temporal-difference algorithm called KLSTD-Q is proposed for efficient policy evaluation. To keep the sparsity and improve the generalization ability of KLSTD-Q solutions, a kernel sparsification procedure based on approximate linear dependency (ALD) is performed. Compared to the previous works on approximate RL methods, KLSPI makes two progresses to eliminate the main difficulties of existing results. One is the better convergence and (near) optimality guarantee by using the KLSTD-Q algorithm for policy evaluation with high precision. The other is the automatic feature selection using the ALD-based kernel sparsification. Therefore, the KLSPI algorithm provides a general RL method with generalization performance and convergence guarantee for large-scale Markov decision problems (MDPs). Experimental results on a typical RL task for a stochastic chain problem demonstrate that KLSPI can consistently achieve better learning efficiency and policy quality than the previous least squares policy iteration (LSPI) algorithm. Furthermore, the KLSPI method was also evaluated on two nonlinear feedback control problems, including a ship heading control problem and the swing up control of a double-link underactuated pendulum called acrobot. Simulation results illustrate that the proposed method can optimize controller performance using little a priori information of uncertain dynamic systems. It is also demonstrated that KLSPI can be applied to online learning control by incorporating an initial controller to ensure online performance.

  3. MultiDK: A Multiple Descriptor Multiple Kernel Approach for Molecular Discovery and Its Application to Organic Flow Battery Electrolytes.

    PubMed

    Kim, Sungjin; Jinich, Adrián; Aspuru-Guzik, Alán

    2017-04-24

    We propose a multiple descriptor multiple kernel (MultiDK) method for efficient molecular discovery using machine learning. We show that the MultiDK method improves both the speed and accuracy of molecular property prediction. We apply the method to the discovery of electrolyte molecules for aqueous redox flow batteries. Using multiple-type-as opposed to single-type-descriptors, we obtain more relevant features for machine learning. Following the principle of "wisdom of the crowds", the combination of multiple-type descriptors significantly boosts prediction performance. Moreover, by employing multiple kernels-more than one kernel function for a set of the input descriptors-MultiDK exploits nonlinear relations between molecular structure and properties better than a linear regression approach. The multiple kernels consist of a Tanimoto similarity kernel and a linear kernel for a set of binary descriptors and a set of nonbinary descriptors, respectively. Using MultiDK, we achieve an average performance of r 2 = 0.92 with a test set of molecules for solubility prediction. We also extend MultiDK to predict pH-dependent solubility and apply it to a set of quinone molecules with different ionizable functional groups to assess their performance as flow battery electrolytes.

  4. An Ensemble Approach to Building Mercer Kernels with Prior Information

    NASA Technical Reports Server (NTRS)

    Srivastava, Ashok N.; Schumann, Johann; Fischer, Bernd

    2005-01-01

    This paper presents a new methodology for automatic knowledge driven data mining based on the theory of Mercer Kernels, which are highly nonlinear symmetric positive definite mappings from the original image space to a very high, possibly dimensional feature space. we describe a new method called Mixture Density Mercer Kernels to learn kernel function directly from data, rather than using pre-defined kernels. These data adaptive kernels can encode prior knowledge in the kernel using a Bayesian formulation, thus allowing for physical information to be encoded in the model. Specifically, we demonstrate the use of the algorithm in situations with extremely small samples of data. We compare the results with existing algorithms on data from the Sloan Digital Sky Survey (SDSS) and demonstrate the method's superior performance against standard methods. The code for these experiments has been generated with the AUTOBAYES tool, which automatically generates efficient and documented C/C++ code from abstract statistical model specifications. The core of the system is a schema library which contains templates for learning and knowledge discovery algorithms like different versions of EM, or numeric optimization methods like conjugate gradient methods. The template instantiation is supported by symbolic-algebraic computations, which allows AUTOBAYES to find closed-form solutions and, where possible, to integrate them into the code.

  5. MO-G-17A-05: PET Image Deblurring Using Adaptive Dictionary Learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Valiollahzadeh, S; Clark, J; Mawlawi, O

    2014-06-15

    Purpose: The aim of this work is to deblur PET images while suppressing Poisson noise effects using adaptive dictionary learning (DL) techniques. Methods: The model that relates a blurred and noisy PET image to the desired image is described as a linear transform y=Hm+n where m is the desired image, H is a blur kernel, n is Poisson noise and y is the blurred image. The approach we follow to recover m involves the sparse representation of y over a learned dictionary, since the image has lots of repeated patterns, edges, textures and smooth regions. The recovery is based onmore » an optimization of a cost function having four major terms: adaptive dictionary learning term, sparsity term, regularization term, and MLEM Poisson noise estimation term. The optimization is solved by a variable splitting method that introduces additional variables. We simulated a 128×128 Hoffman brain PET image (baseline) with varying kernel types and sizes (Gaussian 9×9, σ=5.4mm; Uniform 5×5, σ=2.9mm) with additive Poisson noise (Blurred). Image recovery was performed once when the kernel type was included in the model optimization and once with the model blinded to kernel type. The recovered image was compared to the baseline as well as another recovery algorithm PIDSPLIT+ (Setzer et. al.) by calculating PSNR (Peak SNR) and normalized average differences in pixel intensities (NADPI) of line profiles across the images. Results: For known kernel types, the PSNR of the Gaussian (Uniform) was 28.73 (25.1) and 25.18 (23.4) for DL and PIDSPLIT+ respectively. For blinded deblurring the PSNRs were 25.32 and 22.86 for DL and PIDSPLIT+ respectively. NADPI between baseline and DL, and baseline and blurred for the Gaussian kernel was 2.5 and 10.8 respectively. Conclusion: PET image deblurring using dictionary learning seems to be a good approach to restore image resolution in presence of Poisson noise. GE Health Care.« less

  6. Online Feature Transformation Learning for Cross-Domain Object Category Recognition.

    PubMed

    Zhang, Xuesong; Zhuang, Yan; Wang, Wei; Pedrycz, Witold

    2017-06-09

    In this paper, we introduce a new research problem termed online feature transformation learning in the context of multiclass object category recognition. The learning of a feature transformation is viewed as learning a global similarity metric function in an online manner. We first consider the problem of online learning a feature transformation matrix expressed in the original feature space and propose an online passive aggressive feature transformation algorithm. Then these original features are mapped to kernel space and an online single kernel feature transformation (OSKFT) algorithm is developed to learn a nonlinear feature transformation. Based on the OSKFT and the existing Hedge algorithm, a novel online multiple kernel feature transformation algorithm is also proposed, which can further improve the performance of online feature transformation learning in large-scale application. The classifier is trained with k nearest neighbor algorithm together with the learned similarity metric function. Finally, we experimentally examined the effect of setting different parameter values in the proposed algorithms and evaluate the model performance on several multiclass object recognition data sets. The experimental results demonstrate the validity and good performance of our methods on cross-domain and multiclass object recognition application.

  7. Differential evolution algorithm-based kernel parameter selection for Fukunaga-Koontz Transform subspaces construction

    NASA Astrophysics Data System (ADS)

    Binol, Hamidullah; Bal, Abdullah; Cukur, Huseyin

    2015-10-01

    The performance of the kernel based techniques depends on the selection of kernel parameters. That's why; suitable parameter selection is an important problem for many kernel based techniques. This article presents a novel technique to learn the kernel parameters in kernel Fukunaga-Koontz Transform based (KFKT) classifier. The proposed approach determines the appropriate values of kernel parameters through optimizing an objective function constructed based on discrimination ability of KFKT. For this purpose we have utilized differential evolution algorithm (DEA). The new technique overcomes some disadvantages such as high time consumption existing in the traditional cross-validation method, and it can be utilized in any type of data. The experiments for target detection applications on the hyperspectral images verify the effectiveness of the proposed method.

  8. A Fast Reduced Kernel Extreme Learning Machine.

    PubMed

    Deng, Wan-Yu; Ong, Yew-Soon; Zheng, Qing-Hua

    2016-04-01

    In this paper, we present a fast and accurate kernel-based supervised algorithm referred to as the Reduced Kernel Extreme Learning Machine (RKELM). In contrast to the work on Support Vector Machine (SVM) or Least Square SVM (LS-SVM), which identifies the support vectors or weight vectors iteratively, the proposed RKELM randomly selects a subset of the available data samples as support vectors (or mapping samples). By avoiding the iterative steps of SVM, significant cost savings in the training process can be readily attained, especially on Big datasets. RKELM is established based on the rigorous proof of universal learning involving reduced kernel-based SLFN. In particular, we prove that RKELM can approximate any nonlinear functions accurately under the condition of support vectors sufficiency. Experimental results on a wide variety of real world small instance size and large instance size applications in the context of binary classification, multi-class problem and regression are then reported to show that RKELM can perform at competitive level of generalized performance as the SVM/LS-SVM at only a fraction of the computational effort incurred. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. Spontaneous lung metastasis formation of human Merkel cell carcinoma cell lines transplanted into scid mice.

    PubMed

    Knips, Jill; Czech-Sioli, Manja; Spohn, Michael; Heiland, Max; Moll, Ingrid; Grundhoff, Adam; Schumacher, Udo; Fischer, Nicole

    2017-07-01

    Merkel cell carcinoma (MCC) is an aggressive skin cancer entity that frequently leads to rapid death due to its high propensity to metastasize. The etiology of most MCC cases is linked to Merkel cell polyomavirus (MCPyV), a virus which is monoclonally integrated in up to 95% of tumors. While there are presently no animal models to study the role of authentic MCPyV infection on transformation, tumorigenesis or metastasis formation, xenograft mouse models employing engrafted MCC-derived cell lines (MCCL) represent a promising approach to study certain aspects of MCC pathogenesis. Here, the two MCPyV-positive MCC cell lines WaGa and MKL-1 were subcutaneously engrafted in scid mice. Engraftment of both MCC cell lines resulted in the appearance of circulating tumor cells and metastasis formation, with WaGa-engrafted mice showing a significantly shorter survival time as well as increased numbers of spontaneous lung metastases compared to MKL-1 mice. Interestingly, explanted tumors compared to parental cell lines exhibit an upregulation of MCPyV sT-Antigen expression in all tumors, with WaGa tumors showing significantly higher sT-Antigen expression than MKL-1 tumors. RNA-Seq analysis of explanted tumors and parental cell lines furthermore revealed that in the more aggressive WaGa tumors, genes involved in inflammatory response, growth factor activity and Wnt signalling pathway are significantly upregulated, suggesting that sT-Antigen is the driver of the observed differences in metastasis formation. © 2017 UICC.

  10. Stochastic subset selection for learning with kernel machines.

    PubMed

    Rhinelander, Jason; Liu, Xiaoping P

    2012-06-01

    Kernel machines have gained much popularity in applications of machine learning. Support vector machines (SVMs) are a subset of kernel machines and generalize well for classification, regression, and anomaly detection tasks. The training procedure for traditional SVMs involves solving a quadratic programming (QP) problem. The QP problem scales super linearly in computational effort with the number of training samples and is often used for the offline batch processing of data. Kernel machines operate by retaining a subset of observed data during training. The data vectors contained within this subset are referred to as support vectors (SVs). The work presented in this paper introduces a subset selection method for the use of kernel machines in online, changing environments. Our algorithm works by using a stochastic indexing technique when selecting a subset of SVs when computing the kernel expansion. The work described here is novel because it separates the selection of kernel basis functions from the training algorithm used. The subset selection algorithm presented here can be used in conjunction with any online training technique. It is important for online kernel machines to be computationally efficient due to the real-time requirements of online environments. Our algorithm is an important contribution because it scales linearly with the number of training samples and is compatible with current training techniques. Our algorithm outperforms standard techniques in terms of computational efficiency and provides increased recognition accuracy in our experiments. We provide results from experiments using both simulated and real-world data sets to verify our algorithm.

  11. Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment

    NASA Astrophysics Data System (ADS)

    Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty

    2017-12-01

    Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can solve the problem of classification and linear regression or nonlinear kernel which can be a learning algorithm for the ability of classification and regression. However, SVM also has a weakness that is difficult to determine the optimal parameter value. SVM calculates the best linear separator on the input feature space according to the training data. To classify data which are non-linearly separable, SVM uses kernel tricks to transform the data into a linearly separable data on a higher dimension feature space. The kernel trick using various kinds of kernel functions, such as : linear kernel, polynomial, radial base function (RBF) and sigmoid. Each function has parameters which affect the accuracy of SVM classification. To solve the problem genetic algorithms are proposed to be applied as the optimal parameter value search algorithm thus increasing the best classification accuracy on SVM. Data taken from UCI repository of machine learning database: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy. Genetic algorithms has been shown to be effective in systematically finding optimal kernel parameters for SVM, instead of randomly selected kernel parameters. The best accuracy for data has been upgraded from kernel Linear: 85.12%, polynomial: 81.76%, RBF: 77.22% Sigmoid: 78.70%. However, for bigger data sizes, this method is not practical because it takes a lot of time.

  12. Graph Kernels for Molecular Similarity.

    PubMed

    Rupp, Matthias; Schneider, Gisbert

    2010-04-12

    Molecular similarity measures are important for many cheminformatics applications like ligand-based virtual screening and quantitative structure-property relationships. Graph kernels are formal similarity measures defined directly on graphs, such as the (annotated) molecular structure graph. Graph kernels are positive semi-definite functions, i.e., they correspond to inner products. This property makes them suitable for use with kernel-based machine learning algorithms such as support vector machines and Gaussian processes. We review the major types of kernels between graphs (based on random walks, subgraphs, and optimal assignments, respectively), and discuss their advantages, limitations, and successful applications in cheminformatics. Copyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data

    PubMed Central

    Zeng, Beiyan; Chen, Yiping P.; Smith, Oscar H.

    2003-01-01

    Kernel density smoothing techniques have been used in classification or supervised learning of gene expression profile (GEP) data, but their applications to clustering or unsupervised learning of those data have not been explored and assessed. Here we report a kernel density clustering method for analysing GEP data and compare its performance with the three most widely-used clustering methods: hierarchical clustering, K-means clustering, and multivariate mixture model-based clustering. Using several methods to measure agreement, between-cluster isolation, and withincluster coherence, such as the Adjusted Rand Index, the Pseudo F test, the r2 test, and the profile plot, we have assessed the effectiveness of kernel density clustering for recovering clusters, and its robustness against noise on clustering both simulated and real GEP data. Our results show that the kernel density clustering method has excellent performance in recovering clusters from simulated data and in grouping large real expression profile data sets into compact and well-isolated clusters, and that it is the most robust clustering method for analysing noisy expression profile data compared to the other three methods assessed. PMID:18629292

  14. Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration

    PubMed Central

    Liu, Bo; Chen, Sanfeng; Li, Shuai; Liang, Yongsheng

    2012-01-01

    In this paper a new framework, called Compressive Kernelized Reinforcement Learning (CKRL), for computing near-optimal policies in sequential decision making with uncertainty is proposed via incorporating the non-adaptive data-independent Random Projections and nonparametric Kernelized Least-squares Policy Iteration (KLSPI). Random Projections are a fast, non-adaptive dimensionality reduction framework in which high-dimensionality data is projected onto a random lower-dimension subspace via spherically random rotation and coordination sampling. KLSPI introduce kernel trick into the LSPI framework for Reinforcement Learning, often achieving faster convergence and providing automatic feature selection via various kernel sparsification approaches. In this approach, policies are computed in a low-dimensional subspace generated by projecting the high-dimensional features onto a set of random basis. We first show how Random Projections constitute an efficient sparsification technique and how our method often converges faster than regular LSPI, while at lower computational costs. Theoretical foundation underlying this approach is a fast approximation of Singular Value Decomposition (SVD). Finally, simulation results are exhibited on benchmark MDP domains, which confirm gains both in computation time and in performance in large feature spaces. PMID:22736969

  15. ℓ(p)-Norm multikernel learning approach for stock market price forecasting.

    PubMed

    Shao, Xigao; Wu, Kun; Liao, Bifeng

    2012-01-01

    Linear multiple kernel learning model has been used for predicting financial time series. However, ℓ(1)-norm multiple support vector regression is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we adopt ℓ(p)-norm multiple kernel support vector regression (1 ≤ p < ∞) as a stock price prediction model. The optimization problem is decomposed into smaller subproblems, and the interleaved optimization strategy is employed to solve the regression model. The model is evaluated on forecasting the daily stock closing prices of Shanghai Stock Index in China. Experimental results show that our proposed model performs better than ℓ(1)-norm multiple support vector regression model.

  16. Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature

    PubMed Central

    Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar

    2017-01-01

    Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel based approaches such as linear kernel, tree kernel, graph kernel and combination of multiple kernels has achieved promising results in PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction which exploits both syntactic (structural) and semantic vectors information known as Distributed Smoothed Tree kernel (DSTK). DSTK comprises of distributed trees with syntactic information along with distributional semantic vectors representing semantic information of the sentences or phrases. To generate robust machine learning model composition of feature based kernel and DSTK were combined using ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used for evaluating the performance of our system. Experimental results show that our system achieves better f-score with five different corpora compared to other state-of-the-art systems. PMID:29099838

  17. Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature.

    PubMed

    Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar

    2017-01-01

    Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel based approaches such as linear kernel, tree kernel, graph kernel and combination of multiple kernels has achieved promising results in PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction which exploits both syntactic (structural) and semantic vectors information known as Distributed Smoothed Tree kernel (DSTK). DSTK comprises of distributed trees with syntactic information along with distributional semantic vectors representing semantic information of the sentences or phrases. To generate robust machine learning model composition of feature based kernel and DSTK were combined using ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used for evaluating the performance of our system. Experimental results show that our system achieves better f-score with five different corpora compared to other state-of-the-art systems.

  18. Online Pairwise Learning Algorithms.

    PubMed

    Ying, Yiming; Zhou, Ding-Xuan

    2016-04-01

    Pairwise learning usually refers to a learning task that involves a loss function depending on pairs of examples, among which the most notable ones are bipartite ranking, metric learning, and AUC maximization. In this letter we study an online algorithm for pairwise learning with a least-square loss function in an unconstrained setting of a reproducing kernel Hilbert space (RKHS) that we refer to as the Online Pairwise lEaRning Algorithm (OPERA). In contrast to existing works (Kar, Sriperumbudur, Jain, & Karnick, 2013 ; Wang, Khardon, Pechyony, & Jones, 2012 ), which require that the iterates are restricted to a bounded domain or the loss function is strongly convex, OPERA is associated with a non-strongly convex objective function and learns the target function in an unconstrained RKHS. Specifically, we establish a general theorem that guarantees the almost sure convergence for the last iterate of OPERA without any assumptions on the underlying distribution. Explicit convergence rates are derived under the condition of polynomially decaying step sizes. We also establish an interesting property for a family of widely used kernels in the setting of pairwise learning and illustrate the convergence results using such kernels. Our methodology mainly depends on the characterization of RKHSs using its associated integral operators and probability inequalities for random variables with values in a Hilbert space.

  19. Metabolic network prediction through pairwise rational kernels.

    PubMed

    Roche-Lima, Abiel; Domaratzki, Michael; Fristensky, Brian

    2014-09-26

    Metabolic networks are represented by the set of metabolic pathways. Metabolic pathways are a series of biochemical reactions, in which the product (output) from one reaction serves as the substrate (input) to another reaction. Many pathways remain incompletely characterized. One of the major challenges of computational biology is to obtain better models of metabolic pathways. Existing models are dependent on the annotation of the genes. This propagates error accumulation when the pathways are predicted by incorrectly annotated genes. Pairwise classification methods are supervised learning methods used to classify new pair of entities. Some of these classification methods, e.g., Pairwise Support Vector Machines (SVMs), use pairwise kernels. Pairwise kernels describe similarity measures between two pairs of entities. Using pairwise kernels to handle sequence data requires long processing times and large storage. Rational kernels are kernels based on weighted finite-state transducers that represent similarity measures between sequences or automata. They have been effectively used in problems that handle large amount of sequence information such as protein essentiality, natural language processing and machine translations. We create a new family of pairwise kernels using weighted finite-state transducers (called Pairwise Rational Kernel (PRK)) to predict metabolic pathways from a variety of biological data. PRKs take advantage of the simpler representations and faster algorithms of transducers. Because raw sequence data can be used, the predictor model avoids the errors introduced by incorrect gene annotations. We then developed several experiments with PRKs and Pairwise SVM to validate our methods using the metabolic network of Saccharomyces cerevisiae. As a result, when PRKs are used, our method executes faster in comparison with other pairwise kernels. Also, when we use PRKs combined with other simple kernels that include evolutionary information, the accuracy values have been improved, while maintaining lower construction and execution times. The power of using kernels is that almost any sort of data can be represented using kernels. Therefore, completely disparate types of data can be combined to add power to kernel-based machine learning methods. When we compared our proposal using PRKs with other similar kernel, the execution times were decreased, with no compromise of accuracy. We also proved that by combining PRKs with other kernels that include evolutionary information, the accuracy can also also be improved. As our proposal can use any type of sequence data, genes do not need to be properly annotated, avoiding accumulation errors because of incorrect previous annotations.

  20. Multiple kernel learning in protein-protein interaction extraction from biomedical literature.

    PubMed

    Yang, Zhihao; Tang, Nan; Zhang, Xiao; Lin, Hongfei; Li, Yanpeng; Yang, Zhiwei

    2011-03-01

    Knowledge about protein-protein interactions (PPIs) unveils the molecular mechanisms of biological processes. The volume and content of published biomedical literature on protein interactions is expanding rapidly, making it increasingly difficult for interaction database administrators, responsible for content input and maintenance to detect and manually update protein interaction information. The objective of this work is to develop an effective approach to automatic extraction of PPI information from biomedical literature. We present a weighted multiple kernel learning-based approach for automatic PPI extraction from biomedical literature. The approach combines the following kernels: feature-based, tree, graph and part-of-speech (POS) path. In particular, we extend the shortest path-enclosed tree (SPT) and dependency path tree to capture richer contextual information. Our experimental results show that the combination of SPT and dependency path tree extensions contributes to the improvement of performance by almost 0.7 percentage units in F-score and 2 percentage units in area under the receiver operating characteristics curve (AUC). Combining two or more appropriately weighed individual will further improve the performance. Both on the individual corpus and cross-corpus evaluation our combined kernel can achieve state-of-the-art performance with respect to comparable evaluations, with 64.41% F-score and 88.46% AUC on the AImed corpus. As different kernels calculate the similarity between two sentences from different aspects. Our combined kernel can reduce the risk of missing important features. More specifically, we use a weighted linear combination of individual kernels instead of assigning the same weight to each individual kernel, thus allowing the introduction of each kernel to incrementally contribute to the performance improvement. In addition, SPT and dependency path tree extensions can improve the performance by including richer context information. Copyright © 2010 Elsevier B.V. All rights reserved.

  1. Modeling adaptive kernels from probabilistic phylogenetic trees.

    PubMed

    Nicotra, Luca; Micheli, Alessio

    2009-01-01

    Modeling phylogenetic interactions is an open issue in many computational biology problems. In the context of gene function prediction we introduce a class of kernels for structured data leveraging on a hierarchical probabilistic modeling of phylogeny among species. We derive three kernels belonging to this setting: a sufficient statistics kernel, a Fisher kernel, and a probability product kernel. The new kernels are used in the context of support vector machine learning. The kernels adaptivity is obtained through the estimation of the parameters of a tree structured model of evolution using as observed data phylogenetic profiles encoding the presence or absence of specific genes in a set of fully sequenced genomes. We report results obtained in the prediction of the functional class of the proteins of the budding yeast Saccharomyces cerevisae which favorably compare to a standard vector based kernel and to a non-adaptive tree kernel function. A further comparative analysis is performed in order to assess the impact of the different components of the proposed approach. We show that the key features of the proposed kernels are the adaptivity to the input domain and the ability to deal with structured data interpreted through a graphical model representation.

  2. Kernel-aligned multi-view canonical correlation analysis for image recognition

    NASA Astrophysics Data System (ADS)

    Su, Shuzhi; Ge, Hongwei; Yuan, Yun-Hao

    2016-09-01

    Existing kernel-based correlation analysis methods mainly adopt a single kernel in each view. However, only a single kernel is usually insufficient to characterize nonlinear distribution information of a view. To solve the problem, we transform each original feature vector into a 2-dimensional feature matrix by means of kernel alignment, and then propose a novel kernel-aligned multi-view canonical correlation analysis (KAMCCA) method on the basis of the feature matrices. Our proposed method can simultaneously employ multiple kernels to better capture the nonlinear distribution information of each view, so that correlation features learned by KAMCCA can have well discriminating power in real-world image recognition. Extensive experiments are designed on five real-world image datasets, including NIR face images, thermal face images, visible face images, handwritten digit images, and object images. Promising experimental results on the datasets have manifested the effectiveness of our proposed method.

  3. Providing Observation Context via Kernel Visualization and Informatics for Planning and Data Analysis

    NASA Astrophysics Data System (ADS)

    Kidd, J. N.; Selznick, S.; Hergenrother, C. W.

    2018-04-01

    From our lessons learned and SPICE expertise, we lay out the features and capabilities of a new web-based tool to provide an accessible platform to obtain context and informatics from a planetary mission's SPICE kernels.

  4. Adaptive learning in complex reproducing kernel Hilbert spaces employing Wirtinger's subgradients.

    PubMed

    Bouboulis, Pantelis; Slavakis, Konstantinos; Theodoridis, Sergios

    2012-03-01

    This paper presents a wide framework for non-linear online supervised learning tasks in the context of complex valued signal processing. The (complex) input data are mapped into a complex reproducing kernel Hilbert space (RKHS), where the learning phase is taking place. Both pure complex kernels and real kernels (via the complexification trick) can be employed. Moreover, any convex, continuous and not necessarily differentiable function can be used to measure the loss between the output of the specific system and the desired response. The only requirement is the subgradient of the adopted loss function to be available in an analytic form. In order to derive analytically the subgradients, the principles of the (recently developed) Wirtinger's calculus in complex RKHS are exploited. Furthermore, both linear and widely linear (in RKHS) estimation filters are considered. To cope with the problem of increasing memory requirements, which is present in almost all online schemes in RKHS, the sparsification scheme, based on projection onto closed balls, has been adopted. We demonstrate the effectiveness of the proposed framework in a non-linear channel identification task, a non-linear channel equalization problem and a quadrature phase shift keying equalization scheme, using both circular and non circular synthetic signal sources.

  5. Compound analysis via graph kernels incorporating chirality.

    PubMed

    Brown, J B; Urata, Takashi; Tamura, Takeyuki; Arai, Midori A; Kawabata, Takeo; Akutsu, Tatsuya

    2010-12-01

    High accuracy is paramount when predicting biochemical characteristics using Quantitative Structural-Property Relationships (QSPRs). Although existing graph-theoretic kernel methods combined with machine learning techniques are efficient for QSPR model construction, they cannot distinguish topologically identical chiral compounds which often exhibit different biological characteristics. In this paper, we propose a new method that extends the recently developed tree pattern graph kernel to accommodate stereoisomers. We show that Support Vector Regression (SVR) with a chiral graph kernel is useful for target property prediction by demonstrating its application to a set of human vitamin D receptor ligands currently under consideration for their potential anti-cancer effects.

  6. ℓ p-Norm Multikernel Learning Approach for Stock Market Price Forecasting

    PubMed Central

    Shao, Xigao; Wu, Kun; Liao, Bifeng

    2012-01-01

    Linear multiple kernel learning model has been used for predicting financial time series. However, ℓ 1-norm multiple support vector regression is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we adopt ℓ p-norm multiple kernel support vector regression (1 ≤ p < ∞) as a stock price prediction model. The optimization problem is decomposed into smaller subproblems, and the interleaved optimization strategy is employed to solve the regression model. The model is evaluated on forecasting the daily stock closing prices of Shanghai Stock Index in China. Experimental results show that our proposed model performs better than ℓ 1-norm multiple support vector regression model. PMID:23365561

  7. Biologically-Inspired Spike-Based Automatic Speech Recognition of Isolated Digits Over a Reproducing Kernel Hilbert Space

    PubMed Central

    Li, Kan; Príncipe, José C.

    2018-01-01

    This paper presents a novel real-time dynamic framework for quantifying time-series structure in spoken words using spikes. Audio signals are converted into multi-channel spike trains using a biologically-inspired leaky integrate-and-fire (LIF) spike generator. These spike trains are mapped into a function space of infinite dimension, i.e., a Reproducing Kernel Hilbert Space (RKHS) using point-process kernels, where a state-space model learns the dynamics of the multidimensional spike input using gradient descent learning. This kernelized recurrent system is very parsimonious and achieves the necessary memory depth via feedback of its internal states when trained discriminatively, utilizing the full context of the phoneme sequence. A main advantage of modeling nonlinear dynamics using state-space trajectories in the RKHS is that it imposes no restriction on the relationship between the exogenous input and its internal state. We are free to choose the input representation with an appropriate kernel, and changing the kernel does not impact the system nor the learning algorithm. Moreover, we show that this novel framework can outperform both traditional hidden Markov model (HMM) speech processing as well as neuromorphic implementations based on spiking neural network (SNN), yielding accurate and ultra-low power word spotters. As a proof of concept, we demonstrate its capabilities using the benchmark TI-46 digit corpus for isolated-word automatic speech recognition (ASR) or keyword spotting. Compared to HMM using Mel-frequency cepstral coefficient (MFCC) front-end without time-derivatives, our MFCC-KAARMA offered improved performance. For spike-train front-end, spike-KAARMA also outperformed state-of-the-art SNN solutions. Furthermore, compared to MFCCs, spike trains provided enhanced noise robustness in certain low signal-to-noise ratio (SNR) regime. PMID:29666568

  8. Biologically-Inspired Spike-Based Automatic Speech Recognition of Isolated Digits Over a Reproducing Kernel Hilbert Space.

    PubMed

    Li, Kan; Príncipe, José C

    2018-01-01

    This paper presents a novel real-time dynamic framework for quantifying time-series structure in spoken words using spikes. Audio signals are converted into multi-channel spike trains using a biologically-inspired leaky integrate-and-fire (LIF) spike generator. These spike trains are mapped into a function space of infinite dimension, i.e., a Reproducing Kernel Hilbert Space (RKHS) using point-process kernels, where a state-space model learns the dynamics of the multidimensional spike input using gradient descent learning. This kernelized recurrent system is very parsimonious and achieves the necessary memory depth via feedback of its internal states when trained discriminatively, utilizing the full context of the phoneme sequence. A main advantage of modeling nonlinear dynamics using state-space trajectories in the RKHS is that it imposes no restriction on the relationship between the exogenous input and its internal state. We are free to choose the input representation with an appropriate kernel, and changing the kernel does not impact the system nor the learning algorithm. Moreover, we show that this novel framework can outperform both traditional hidden Markov model (HMM) speech processing as well as neuromorphic implementations based on spiking neural network (SNN), yielding accurate and ultra-low power word spotters. As a proof of concept, we demonstrate its capabilities using the benchmark TI-46 digit corpus for isolated-word automatic speech recognition (ASR) or keyword spotting. Compared to HMM using Mel-frequency cepstral coefficient (MFCC) front-end without time-derivatives, our MFCC-KAARMA offered improved performance. For spike-train front-end, spike-KAARMA also outperformed state-of-the-art SNN solutions. Furthermore, compared to MFCCs, spike trains provided enhanced noise robustness in certain low signal-to-noise ratio (SNR) regime.

  9. A Novel Weighted Kernel PCA-Based Method for Optimization and Uncertainty Quantification

    NASA Astrophysics Data System (ADS)

    Thimmisetty, C.; Talbot, C.; Chen, X.; Tong, C. H.

    2016-12-01

    It has been demonstrated that machine learning methods can be successfully applied to uncertainty quantification for geophysical systems through the use of the adjoint method coupled with kernel PCA-based optimization. In addition, it has been shown through weighted linear PCA how optimization with respect to both observation weights and feature space control variables can accelerate convergence of such methods. Linear machine learning methods, however, are inherently limited in their ability to represent features of non-Gaussian stochastic random fields, as they are based on only the first two statistical moments of the original data. Nonlinear spatial relationships and multipoint statistics leading to the tortuosity characteristic of channelized media, for example, are captured only to a limited extent by linear PCA. With the aim of coupling the kernel-based and weighted methods discussed, we present a novel mathematical formulation of kernel PCA, Weighted Kernel Principal Component Analysis (WKPCA), that both captures nonlinear relationships and incorporates the attribution of significance levels to different realizations of the stochastic random field of interest. We also demonstrate how new instantiations retaining defining characteristics of the random field can be generated using Bayesian methods. In particular, we present a novel WKPCA-based optimization method that minimizes a given objective function with respect to both feature space random variables and observation weights through which optimal snapshot significance levels and optimal features are learned. We showcase how WKPCA can be applied to nonlinear optimal control problems involving channelized media, and in particular demonstrate an application of the method to learning the spatial distribution of material parameter values in the context of linear elasticity, and discuss further extensions of the method to stochastic inversion.

  10. Utilization of spectral-spatial characteristics in shortwave infrared hyperspectral images to classify and identify fungi-contaminated peanuts.

    PubMed

    Qiao, Xiaojun; Jiang, Jinbao; Qi, Xiaotong; Guo, Haiqiang; Yuan, Deshuai

    2017-04-01

    It's well-known fungi-contaminated peanuts contain potent carcinogen. Efficiently identifying and separating the contaminated can help prevent aflatoxin entering in food chain. In this study, shortwave infrared (SWIR) hyperspectral images for identifying the prepared contaminated kernels. Feature selection method of analysis of variance (ANOVA) and feature extraction method of nonparametric weighted feature extraction (NWFE) were used to concentrate spectral information into a subspace where contaminated and healthy peanuts can have favorable separability. Then, peanut pixels were classified using SVM. Moreover, image segmentation method of region growing was applied to segment the image as kernel-scale patches and meanwhile to number the kernels. The result shows that pixel-wise classification accuracies are 99.13% for breed A, 96.72% for B and 99.73% for C in learning images, and are 96.32%, 94.2% and 97.51% in validation images. Contaminated peanuts were correctly marked as aberrant kernels in both learning images and validation images. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Kernel Recursive Least-Squares Temporal Difference Algorithms with Sparsification and Regularization

    PubMed Central

    Zhu, Qingxin; Niu, Xinzheng

    2016-01-01

    By combining with sparse kernel methods, least-squares temporal difference (LSTD) algorithms can construct the feature dictionary automatically and obtain a better generalization ability. However, the previous kernel-based LSTD algorithms do not consider regularization and their sparsification processes are batch or offline, which hinder their widespread applications in online learning problems. In this paper, we combine the following five techniques and propose two novel kernel recursive LSTD algorithms: (i) online sparsification, which can cope with unknown state regions and be used for online learning, (ii) L 2 and L 1 regularization, which can avoid overfitting and eliminate the influence of noise, (iii) recursive least squares, which can eliminate matrix-inversion operations and reduce computational complexity, (iv) a sliding-window approach, which can avoid caching all history samples and reduce the computational cost, and (v) the fixed-point subiteration and online pruning, which can make L 1 regularization easy to implement. Finally, simulation results on two 50-state chain problems demonstrate the effectiveness of our algorithms. PMID:27436996

  12. Kernel Recursive Least-Squares Temporal Difference Algorithms with Sparsification and Regularization.

    PubMed

    Zhang, Chunyuan; Zhu, Qingxin; Niu, Xinzheng

    2016-01-01

    By combining with sparse kernel methods, least-squares temporal difference (LSTD) algorithms can construct the feature dictionary automatically and obtain a better generalization ability. However, the previous kernel-based LSTD algorithms do not consider regularization and their sparsification processes are batch or offline, which hinder their widespread applications in online learning problems. In this paper, we combine the following five techniques and propose two novel kernel recursive LSTD algorithms: (i) online sparsification, which can cope with unknown state regions and be used for online learning, (ii) L 2 and L 1 regularization, which can avoid overfitting and eliminate the influence of noise, (iii) recursive least squares, which can eliminate matrix-inversion operations and reduce computational complexity, (iv) a sliding-window approach, which can avoid caching all history samples and reduce the computational cost, and (v) the fixed-point subiteration and online pruning, which can make L 1 regularization easy to implement. Finally, simulation results on two 50-state chain problems demonstrate the effectiveness of our algorithms.

  13. Prediction and early detection of delirium in the intensive care unit by using heart rate variability and machine learning.

    PubMed

    Oh, Jooyoung; Cho, Dongrae; Park, Jaesub; Na, Se Hee; Kim, Jongin; Heo, Jaeseok; Shin, Cheung Soo; Kim, Jae-Jin; Park, Jin Young; Lee, Boreom

    2018-03-27

    Delirium is an important syndrome found in patients in the intensive care unit (ICU), however, it is usually under-recognized during treatment. This study was performed to investigate whether delirious patients can be successfully distinguished from non-delirious patients by using heart rate variability (HRV) and machine learning. Electrocardiography data of 140 patients was acquired during daily ICU care, and HRV data were analyzed. Delirium, including its type, severity, and etiologies, was evaluated daily by trained psychiatrists. HRV data and various machine learning algorithms including linear support vector machine (SVM), SVM with radial basis function (RBF) kernels, linear extreme learning machine (ELM), ELM with RBF kernels, linear discriminant analysis, and quadratic discriminant analysis were utilized to distinguish delirium patients from non-delirium patients. HRV data of 4797 ECGs were included, and 39 patients had delirium at least once during their ICU stay. The maximum classification accuracy was acquired using SVM with RBF kernels. Our prediction method based on HRV with machine learning was comparable to previous delirium prediction models using massive amounts of clinical information. Our results show that autonomic alterations could be a significant feature of patients with delirium in the ICU, suggesting the potential for the automatic prediction and early detection of delirium based on HRV with machine learning.

  14. Oligo kernels for datamining on biological sequences: a case study on prokaryotic translation initiation sites

    PubMed Central

    Meinicke, Peter; Tech, Maike; Morgenstern, Burkhard; Merkl, Rainer

    2004-01-01

    Background Kernel-based learning algorithms are among the most advanced machine learning methods and have been successfully applied to a variety of sequence classification tasks within the field of bioinformatics. Conventional kernels utilized so far do not provide an easy interpretation of the learnt representations in terms of positional and compositional variability of the underlying biological signals. Results We propose a kernel-based approach to datamining on biological sequences. With our method it is possible to model and analyze positional variability of oligomers of any length in a natural way. On one hand this is achieved by mapping the sequences to an intuitive but high-dimensional feature space, well-suited for interpretation of the learnt models. On the other hand, by means of the kernel trick we can provide a general learning algorithm for that high-dimensional representation because all required statistics can be computed without performing an explicit feature space mapping of the sequences. By introducing a kernel parameter that controls the degree of position-dependency, our feature space representation can be tailored to the characteristics of the biological problem at hand. A regularized learning scheme enables application even to biological problems for which only small sets of example sequences are available. Our approach includes a visualization method for transparent representation of characteristic sequence features. Thereby importance of features can be measured in terms of discriminative strength with respect to classification of the underlying sequences. To demonstrate and validate our concept on a biochemically well-defined case, we analyze E. coli translation initiation sites in order to show that we can find biologically relevant signals. For that case, our results clearly show that the Shine-Dalgarno sequence is the most important signal upstream a start codon. The variability in position and composition we found for that signal is in accordance with previous biological knowledge. We also find evidence for signals downstream of the start codon, previously introduced as transcriptional enhancers. These signals are mainly characterized by occurrences of adenine in a region of about 4 nucleotides next to the start codon. Conclusions We showed that the oligo kernel can provide a valuable tool for the analysis of relevant signals in biological sequences. In the case of translation initiation sites we could clearly deduce the most discriminative motifs and their positional variation from example sequences. Attractive features of our approach are its flexibility with respect to oligomer length and position conservation. By means of these two parameters oligo kernels can easily be adapted to different biological problems. PMID:15511290

  15. Gaussian processes with optimal kernel construction for neuro-degenerative clinical onset prediction

    NASA Astrophysics Data System (ADS)

    Canas, Liane S.; Yvernault, Benjamin; Cash, David M.; Molteni, Erika; Veale, Tom; Benzinger, Tammie; Ourselin, Sébastien; Mead, Simon; Modat, Marc

    2018-02-01

    Gaussian Processes (GP) are a powerful tool to capture the complex time-variations of a dataset. In the context of medical imaging analysis, they allow a robust modelling even in case of highly uncertain or incomplete datasets. Predictions from GP are dependent of the covariance kernel function selected to explain the data variance. To overcome this limitation, we propose a framework to identify the optimal covariance kernel function to model the data.The optimal kernel is defined as a composition of base kernel functions used to identify correlation patterns between data points. Our approach includes a modified version of the Compositional Kernel Learning (CKL) algorithm, in which we score the kernel families using a new energy function that depends both the Bayesian Information Criterion (BIC) and the explained variance score. We applied the proposed framework to model the progression of neurodegenerative diseases over time, in particular the progression of autosomal dominantly-inherited Alzheimer's disease, and use it to predict the time to clinical onset of subjects carrying genetic mutation.

  16. Integrating different data types by regularized unsupervised multiple kernel learning with application to cancer subtype discovery.

    PubMed

    Speicher, Nora K; Pfeifer, Nico

    2015-06-15

    Despite ongoing cancer research, available therapies are still limited in quantity and effectiveness, and making treatment decisions for individual patients remains a hard problem. Established subtypes, which help guide these decisions, are mainly based on individual data types. However, the analysis of multidimensional patient data involving the measurements of various molecular features could reveal intrinsic characteristics of the tumor. Large-scale projects accumulate this kind of data for various cancer types, but we still lack the computational methods to reliably integrate this information in a meaningful manner. Therefore, we apply and extend current multiple kernel learning for dimensionality reduction approaches. On the one hand, we add a regularization term to avoid overfitting during the optimization procedure, and on the other hand, we show that one can even use several kernels per data type and thereby alleviate the user from having to choose the best kernel functions and kernel parameters for each data type beforehand. We have identified biologically meaningful subgroups for five different cancer types. Survival analysis has revealed significant differences between the survival times of the identified subtypes, with P values comparable or even better than state-of-the-art methods. Moreover, our resulting subtypes reflect combined patterns from the different data sources, and we demonstrate that input kernel matrices with only little information have less impact on the integrated kernel matrix. Our subtypes show different responses to specific therapies, which could eventually assist in treatment decision making. An executable is available upon request. © The Author 2015. Published by Oxford University Press.

  17. A kernel adaptive algorithm for quaternion-valued inputs.

    PubMed

    Paul, Thomas K; Ogunfunmi, Tokunbo

    2015-10-01

    The use of quaternion data can provide benefit in applications like robotics and image recognition, and particularly for performing transforms in 3-D space. Here, we describe a kernel adaptive algorithm for quaternions. A least mean square (LMS)-based method was used, resulting in the derivation of the quaternion kernel LMS (Quat-KLMS) algorithm. Deriving this algorithm required describing the idea of a quaternion reproducing kernel Hilbert space (RKHS), as well as kernel functions suitable with quaternions. A modified HR calculus for Hilbert spaces was used to find the gradient of cost functions defined on a quaternion RKHS. In addition, the use of widely linear (or augmented) filtering is proposed to improve performance. The benefit of the Quat-KLMS and widely linear forms in learning nonlinear transformations of quaternion data are illustrated with simulations.

  18. Multiple kernels learning-based biological entity relationship extraction method.

    PubMed

    Dongliang, Xu; Jingchang, Pan; Bailing, Wang

    2017-09-20

    Automatic extracting protein entity interaction information from biomedical literature can help to build protein relation network and design new drugs. There are more than 20 million literature abstracts included in MEDLINE, which is the most authoritative textual database in the field of biomedicine, and follow an exponential growth over time. This frantic expansion of the biomedical literature can often be difficult to absorb or manually analyze. Thus efficient and automated search engines are necessary to efficiently explore the biomedical literature using text mining techniques. The P, R, and F value of tag graph method in Aimed corpus are 50.82, 69.76, and 58.61%, respectively. The P, R, and F value of tag graph kernel method in other four evaluation corpuses are 2-5% higher than that of all-paths graph kernel. And The P, R and F value of feature kernel and tag graph kernel fuse methods is 53.43, 71.62 and 61.30%, respectively. The P, R and F value of feature kernel and tag graph kernel fuse methods is 55.47, 70.29 and 60.37%, respectively. It indicated that the performance of the two kinds of kernel fusion methods is better than that of simple kernel. In comparison with the all-paths graph kernel method, the tag graph kernel method is superior in terms of overall performance. Experiments show that the performance of the multi-kernels method is better than that of the three separate single-kernel method and the dual-mutually fused kernel method used hereof in five corpus sets.

  19. Application of kernel method in fluorescence molecular tomography

    NASA Astrophysics Data System (ADS)

    Zhao, Yue; Baikejiang, Reheman; Li, Changqing

    2017-02-01

    Reconstruction of fluorescence molecular tomography (FMT) is an ill-posed inverse problem. Anatomical guidance in the FMT reconstruction can improve FMT reconstruction efficiently. We have developed a kernel method to introduce the anatomical guidance into FMT robustly and easily. The kernel method is from machine learning for pattern analysis and is an efficient way to represent anatomical features. For the finite element method based FMT reconstruction, we calculate a kernel function for each finite element node from an anatomical image, such as a micro-CT image. Then the fluorophore concentration at each node is represented by a kernel coefficient vector and the corresponding kernel function. In the FMT forward model, we have a new system matrix by multiplying the sensitivity matrix with the kernel matrix. Thus, the kernel coefficient vector is the unknown to be reconstructed following a standard iterative reconstruction process. We convert the FMT reconstruction problem into the kernel coefficient reconstruction problem. The desired fluorophore concentration at each node can be calculated accordingly. Numerical simulation studies have demonstrated that the proposed kernel-based algorithm can improve the spatial resolution of the reconstructed FMT images. In the proposed kernel method, the anatomical guidance can be obtained directly from the anatomical image and is included in the forward modeling. One of the advantages is that we do not need to segment the anatomical image for the targets and background.

  20. Predicting complex traits using a diffusion kernel on genetic markers with an application to dairy cattle and wheat data

    PubMed Central

    2013-01-01

    Background Arguably, genotypes and phenotypes may be linked in functional forms that are not well addressed by the linear additive models that are standard in quantitative genetics. Therefore, developing statistical learning models for predicting phenotypic values from all available molecular information that are capable of capturing complex genetic network architectures is of great importance. Bayesian kernel ridge regression is a non-parametric prediction model proposed for this purpose. Its essence is to create a spatial distance-based relationship matrix called a kernel. Although the set of all single nucleotide polymorphism genotype configurations on which a model is built is finite, past research has mainly used a Gaussian kernel. Results We sought to investigate the performance of a diffusion kernel, which was specifically developed to model discrete marker inputs, using Holstein cattle and wheat data. This kernel can be viewed as a discretization of the Gaussian kernel. The predictive ability of the diffusion kernel was similar to that of non-spatial distance-based additive genomic relationship kernels in the Holstein data, but outperformed the latter in the wheat data. However, the difference in performance between the diffusion and Gaussian kernels was negligible. Conclusions It is concluded that the ability of a diffusion kernel to capture the total genetic variance is not better than that of a Gaussian kernel, at least for these data. Although the diffusion kernel as a choice of basis function may have potential for use in whole-genome prediction, our results imply that embedding genetic markers into a non-Euclidean metric space has very small impact on prediction. Our results suggest that use of the black box Gaussian kernel is justified, given its connection to the diffusion kernel and its similar predictive performance. PMID:23763755

  1. A Distributed Learning Method for ℓ1-Regularized Kernel Machine over Wireless Sensor Networks

    PubMed Central

    Ji, Xinrong; Hou, Cuiqin; Hou, Yibin; Gao, Fang; Wang, Shulong

    2016-01-01

    In wireless sensor networks, centralized learning methods have very high communication costs and energy consumption. These are caused by the need to transmit scattered training examples from various sensor nodes to the central fusion center where a classifier or a regression machine is trained. To reduce the communication cost, a distributed learning method for a kernel machine that incorporates ℓ1 norm regularization (ℓ1-regularized) is investigated, and a novel distributed learning algorithm for the ℓ1-regularized kernel minimum mean squared error (KMSE) machine is proposed. The proposed algorithm relies on in-network processing and a collaboration that transmits the sparse model only between single-hop neighboring nodes. This paper evaluates the proposed algorithm with respect to the prediction accuracy, the sparse rate of model, the communication cost and the number of iterations on synthetic and real datasets. The simulation results show that the proposed algorithm can obtain approximately the same prediction accuracy as that obtained by the batch learning method. Moreover, it is significantly superior in terms of the sparse rate of model and communication cost, and it can converge with fewer iterations. Finally, an experiment conducted on a wireless sensor network (WSN) test platform further shows the advantages of the proposed algorithm with respect to communication cost. PMID:27376298

  2. Optimized extreme learning machine for urban land cover classification using hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Su, Hongjun; Tian, Shufang; Cai, Yue; Sheng, Yehua; Chen, Chen; Najafian, Maryam

    2017-12-01

    This work presents a new urban land cover classification framework using the firefly algorithm (FA) optimized extreme learning machine (ELM). FA is adopted to optimize the regularization coefficient C and Gaussian kernel σ for kernel ELM. Additionally, effectiveness of spectral features derived from an FA-based band selection algorithm is studied for the proposed classification task. Three sets of hyperspectral databases were recorded using different sensors, namely HYDICE, HyMap, and AVIRIS. Our study shows that the proposed method outperforms traditional classification algorithms such as SVM and reduces computational cost significantly.

  3. Detecting epileptic seizure with different feature extracting strategies using robust machine learning classification techniques by applying advance parameter optimization approach.

    PubMed

    Hussain, Lal

    2018-06-01

    Epilepsy is a neurological disorder produced due to abnormal excitability of neurons in the brain. The research reveals that brain activity is monitored through electroencephalogram (EEG) of patients suffered from seizure to detect the epileptic seizure. The performance of EEG detection based epilepsy require feature extracting strategies. In this research, we have extracted varying features extracting strategies based on time and frequency domain characteristics, nonlinear, wavelet based entropy and few statistical features. A deeper study was undertaken using novel machine learning classifiers by considering multiple factors. The support vector machine kernels are evaluated based on multiclass kernel and box constraint level. Likewise, for K-nearest neighbors (KNN), we computed the different distance metrics, Neighbor weights and Neighbors. Similarly, the decision trees we tuned the paramours based on maximum splits and split criteria and ensemble classifiers are evaluated based on different ensemble methods and learning rate. For training/testing tenfold Cross validation was employed and performance was evaluated in form of TPR, NPR, PPV, accuracy and AUC. In this research, a deeper analysis approach was performed using diverse features extracting strategies using robust machine learning classifiers with more advanced optimal options. Support Vector Machine linear kernel and KNN with City block distance metric give the overall highest accuracy of 99.5% which was higher than using the default parameters for these classifiers. Moreover, highest separation (AUC = 0.9991, 0.9990) were obtained at different kernel scales using SVM. Additionally, the K-nearest neighbors with inverse squared distance weight give higher performance at different Neighbors. Moreover, to distinguish the postictal heart rate oscillations from epileptic ictal subjects, and highest performance of 100% was obtained using different machine learning classifiers.

  4. Alumina Concentration Detection Based on the Kernel Extreme Learning Machine.

    PubMed

    Zhang, Sen; Zhang, Tao; Yin, Yixin; Xiao, Wendong

    2017-09-01

    The concentration of alumina in the electrolyte is of great significance during the production of aluminum. The amount of the alumina concentration may lead to unbalanced material distribution and low production efficiency and affect the stability of the aluminum reduction cell and current efficiency. The existing methods cannot meet the needs for online measurement because industrial aluminum electrolysis has the characteristics of high temperature, strong magnetic field, coupled parameters, and high nonlinearity. Currently, there are no sensors or equipment that can detect the alumina concentration on line. Most companies acquire the alumina concentration from the electrolyte samples which are analyzed through an X-ray fluorescence spectrometer. To solve the problem, the paper proposes a soft sensing model based on a kernel extreme learning machine algorithm that takes the kernel function into the extreme learning machine. K-fold cross validation is used to estimate the generalization error. The proposed soft sensing algorithm can detect alumina concentration by the electrical signals such as voltages and currents of the anode rods. The predicted results show that the proposed approach can give more accurate estimations of alumina concentration with faster learning speed compared with the other methods such as the basic ELM, BP, and SVM.

  5. Osteoarthritis Severity Determination using Self Organizing Map Based Gabor Kernel

    NASA Astrophysics Data System (ADS)

    Anifah, L.; Purnomo, M. H.; Mengko, T. L. R.; Purnama, I. K. E.

    2018-02-01

    The number of osteoarthritis patients in Indonesia is enormous, so early action is needed in order for this disease to be handled. The aim of this paper to determine osteoarthritis severity based on x-ray image template based on gabor kernel. This research is divided into 3 stages, the first step is image processing that is using gabor kernel. The second stage is the learning stage, and the third stage is the testing phase. The image processing stage is by normalizing the image dimension to be template to 50 □ 200 image. Learning stage is done with parameters initial learning rate of 0.5 and the total number of iterations of 1000. The testing stage is performed using the weights generated at the learning stage. The testing phase has been done and the results were obtained. The result shows KL-Grade 0 has an accuracy of 36.21%, accuracy for KL-Grade 2 is 40,52%, while accuracy for KL-Grade 2 and KL-Grade 3 are 15,52%, and 25,86%. The implication of this research is expected that this research as decision support system for medical practitioners in determining KL-Grade on X-ray images of knee osteoarthritis.

  6. A framework for optimal kernel-based manifold embedding of medical image data.

    PubMed

    Zimmer, Veronika A; Lekadir, Karim; Hoogendoorn, Corné; Frangi, Alejandro F; Piella, Gemma

    2015-04-01

    Kernel-based dimensionality reduction is a widely used technique in medical image analysis. To fully unravel the underlying nonlinear manifold the selection of an adequate kernel function and of its free parameters is critical. In practice, however, the kernel function is generally chosen as Gaussian or polynomial and such standard kernels might not always be optimal for a given image dataset or application. In this paper, we present a study on the effect of the kernel functions in nonlinear manifold embedding of medical image data. To this end, we first carry out a literature review on existing advanced kernels developed in the statistics, machine learning, and signal processing communities. In addition, we implement kernel-based formulations of well-known nonlinear dimensional reduction techniques such as Isomap and Locally Linear Embedding, thus obtaining a unified framework for manifold embedding using kernels. Subsequently, we present a method to automatically choose a kernel function and its associated parameters from a pool of kernel candidates, with the aim to generate the most optimal manifold embeddings. Furthermore, we show how the calculated selection measures can be extended to take into account the spatial relationships in images, or used to combine several kernels to further improve the embedding results. Experiments are then carried out on various synthetic and phantom datasets for numerical assessment of the methods. Furthermore, the workflow is applied to real data that include brain manifolds and multispectral images to demonstrate the importance of the kernel selection in the analysis of high-dimensional medical images. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. A Kernel-Based Low-Rank (KLR) Model for Low-Dimensional Manifold Recovery in Highly Accelerated Dynamic MRI.

    PubMed

    Nakarmi, Ukash; Wang, Yanhua; Lyu, Jingyuan; Liang, Dong; Ying, Leslie

    2017-11-01

    While many low rank and sparsity-based approaches have been developed for accelerated dynamic magnetic resonance imaging (dMRI), they all use low rankness or sparsity in input space, overlooking the intrinsic nonlinear correlation in most dMRI data. In this paper, we propose a kernel-based framework to allow nonlinear manifold models in reconstruction from sub-Nyquist data. Within this framework, many existing algorithms can be extended to kernel framework with nonlinear models. In particular, we have developed a novel algorithm with a kernel-based low-rank model generalizing the conventional low rank formulation. The algorithm consists of manifold learning using kernel, low rank enforcement in feature space, and preimaging with data consistency. Extensive simulation and experiment results show that the proposed method surpasses the conventional low-rank-modeled approaches for dMRI.

  8. Single image super-resolution based on convolutional neural networks

    NASA Astrophysics Data System (ADS)

    Zou, Lamei; Luo, Ming; Yang, Weidong; Li, Peng; Jin, Liujia

    2018-03-01

    We present a deep learning method for single image super-resolution (SISR). The proposed approach learns end-to-end mapping between low-resolution (LR) images and high-resolution (HR) images. The mapping is represented as a deep convolutional neural network which inputs the LR image and outputs the HR image. Our network uses 5 convolution layers, which kernels size include 5×5, 3×3 and 1×1. In our proposed network, we use residual-learning and combine different sizes of convolution kernels at the same layer. The experiment results show that our proposed method performs better than the existing methods in reconstructing quality index and human visual effects on benchmarked images.

  9. Lung dynamic MRI deblurring using low-rank decomposition and dictionary learning.

    PubMed

    Gou, Shuiping; Wang, Yueyue; Wu, Jiaolong; Lee, Percy; Sheng, Ke

    2015-04-01

    Lung dynamic MRI (dMRI) has emerged to be an appealing tool to quantify lung motion for both planning and treatment guidance purposes. However, this modality can result in blurry images due to intrinsically low signal-to-noise ratio in the lung and spatial/temporal interpolation. The image blurring could adversely affect the image processing that depends on the availability of fine landmarks. The purpose of this study is to reduce dMRI blurring using image postprocessing. To enhance the image quality and exploit the spatiotemporal continuity of dMRI sequences, a low-rank decomposition and dictionary learning (LDDL) method was employed to deblur lung dMRI and enhance the conspicuity of lung blood vessels. Fifty frames of continuous 2D coronal dMRI frames using a steady state free precession sequence were obtained from five subjects including two healthy volunteer and three lung cancer patients. In LDDL, the lung dMRI was decomposed into sparse and low-rank components. Dictionary learning was employed to estimate the blurring kernel based on the whole image, low-rank or sparse component of the first image in the lung MRI sequence. Deblurring was performed on the whole image sequences using deconvolution based on the estimated blur kernel. The deblurring results were quantified using an automated blood vessel extraction method based on the classification of Hessian matrix filtered images. Accuracy of automated extraction was calculated using manual segmentation of the blood vessels as the ground truth. In the pilot study, LDDL based on the blurring kernel estimated from the sparse component led to performance superior to the other ways of kernel estimation. LDDL consistently improved image contrast and fine feature conspicuity of the original MRI without introducing artifacts. The accuracy of automated blood vessel extraction was on average increased by 16% using manual segmentation as the ground truth. Image blurring in dMRI images can be effectively reduced using a low-rank decomposition and dictionary learning method using kernels estimated by the sparse component.

  10. High-throughput method for ear phenotyping and kernel weight estimation in maize using ear digital imaging.

    PubMed

    Makanza, R; Zaman-Allah, M; Cairns, J E; Eyre, J; Burgueño, J; Pacheco, Ángela; Diepenbrock, C; Magorokosho, C; Tarekegne, A; Olsen, M; Prasanna, B M

    2018-01-01

    Grain yield, ear and kernel attributes can assist to understand the performance of maize plant under different environmental conditions and can be used in the variety development process to address farmer's preferences. These parameters are however still laborious and expensive to measure. A low-cost ear digital imaging method was developed that provides estimates of ear and kernel attributes i.e., ear number and size, kernel number and size as well as kernel weight from photos of ears harvested from field trial plots. The image processing method uses a script that runs in a batch mode on ImageJ; an open source software. Kernel weight was estimated using the total kernel number derived from the number of kernels visible on the image and the average kernel size. Data showed a good agreement in terms of accuracy and precision between ground truth measurements and data generated through image processing. Broad-sense heritability of the estimated parameters was in the range or higher than that for measured grain weight. Limitation of the method for kernel weight estimation is discussed. The method developed in this work provides an opportunity to significantly reduce the cost of selection in the breeding process, especially for resource constrained crop improvement programs and can be used to learn more about the genetic bases of grain yield determinants.

  11. Ranking support vector machine for multiple kernels output combination in protein-protein interaction extraction from biomedical literature.

    PubMed

    Yang, Zhihao; Lin, Yuan; Wu, Jiajin; Tang, Nan; Lin, Hongfei; Li, Yanpeng

    2011-10-01

    Knowledge about protein-protein interactions (PPIs) unveils the molecular mechanisms of biological processes. However, the volume and content of published biomedical literature on protein interactions is expanding rapidly, making it increasingly difficult for interaction database curators to detect and curate protein interaction information manually. We present a multiple kernel learning-based approach for automatic PPI extraction from biomedical literature. The approach combines the following kernels: feature-based, tree, and graph and combines their output with Ranking support vector machine (SVM). Experimental evaluations show that the features in individual kernels are complementary and the kernel combined with Ranking SVM achieves better performance than those of the individual kernels, equal weight combination and optimal weight combination. Our approach can achieve state-of-the-art performance with respect to the comparable evaluations, with 64.88% F-score and 88.02% AUC on the AImed corpus. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. Unsupervised multiple kernel learning for heterogeneous data integration.

    PubMed

    Mariette, Jérôme; Villa-Vialaneix, Nathalie

    2018-03-15

    Recent high-throughput sequencing advances have expanded the breadth of available omics datasets and the integrated analysis of multiple datasets obtained on the same samples has allowed to gain important insights in a wide range of applications. However, the integration of various sources of information remains a challenge for systems biology since produced datasets are often of heterogeneous types, with the need of developing generic methods to take their different specificities into account. We propose a multiple kernel framework that allows to integrate multiple datasets of various types into a single exploratory analysis. Several solutions are provided to learn either a consensus meta-kernel or a meta-kernel that preserves the original topology of the datasets. We applied our framework to analyse two public multi-omics datasets. First, the multiple metagenomic datasets, collected during the TARA Oceans expedition, was explored to demonstrate that our method is able to retrieve previous findings in a single kernel PCA as well as to provide a new image of the sample structures when a larger number of datasets are included in the analysis. To perform this analysis, a generic procedure is also proposed to improve the interpretability of the kernel PCA in regards with the original data. Second, the multi-omics breast cancer datasets, provided by The Cancer Genome Atlas, is analysed using a kernel Self-Organizing Maps with both single and multi-omics strategies. The comparison of these two approaches demonstrates the benefit of our integration method to improve the representation of the studied biological system. Proposed methods are available in the R package mixKernel, released on CRAN. It is fully compatible with the mixOmics package and a tutorial describing the approach can be found on mixOmics web site http://mixomics.org/mixkernel/. jerome.mariette@inra.fr or nathalie.villa-vialaneix@inra.fr. Supplementary data are available at Bioinformatics online.

  13. All-paths graph kernel for protein-protein interaction extraction with evaluation of cross-corpus learning.

    PubMed

    Airola, Antti; Pyysalo, Sampo; Björne, Jari; Pahikkala, Tapio; Ginter, Filip; Salakoski, Tapio

    2008-11-19

    Automated extraction of protein-protein interactions (PPI) is an important and widely studied task in biomedical text mining. We propose a graph kernel based approach for this task. In contrast to earlier approaches to PPI extraction, the introduced all-paths graph kernel has the capability to make use of full, general dependency graphs representing the sentence structure. We evaluate the proposed method on five publicly available PPI corpora, providing the most comprehensive evaluation done for a machine learning based PPI-extraction system. We additionally perform a detailed evaluation of the effects of training and testing on different resources, providing insight into the challenges involved in applying a system beyond the data it was trained on. Our method is shown to achieve state-of-the-art performance with respect to comparable evaluations, with 56.4 F-score and 84.8 AUC on the AImed corpus. We show that the graph kernel approach performs on state-of-the-art level in PPI extraction, and note the possible extension to the task of extracting complex interactions. Cross-corpus results provide further insight into how the learning generalizes beyond individual corpora. Further, we identify several pitfalls that can make evaluations of PPI-extraction systems incomparable, or even invalid. These include incorrect cross-validation strategies and problems related to comparing F-score results achieved on different evaluation resources. Recommendations for avoiding these pitfalls are provided.

  14. Classification of Phylogenetic Profiles for Protein Function Prediction: An SVM Approach

    NASA Astrophysics Data System (ADS)

    Kotaru, Appala Raju; Joshi, Ramesh C.

    Predicting the function of an uncharacterized protein is a major challenge in post-genomic era due to problems complexity and scale. Having knowledge of protein function is a crucial link in the development of new drugs, better crops, and even the development of biochemicals such as biofuels. Recently numerous high-throughput experimental procedures have been invented to investigate the mechanisms leading to the accomplishment of a protein’s function and Phylogenetic profile is one of them. Phylogenetic profile is a way of representing a protein which encodes evolutionary history of proteins. In this paper we proposed a method for classification of phylogenetic profiles using supervised machine learning method, support vector machine classification along with radial basis function as kernel for identifying functionally linked proteins. We experimentally evaluated the performance of the classifier with the linear kernel, polynomial kernel and compared the results with the existing tree kernel. In our study we have used proteins of the budding yeast saccharomyces cerevisiae genome. We generated the phylogenetic profiles of 2465 yeast genes and for our study we used the functional annotations that are available in the MIPS database. Our experiments show that the performance of the radial basis kernel is similar to polynomial kernel is some functional classes together are better than linear, tree kernel and over all radial basis kernel outperformed the polynomial kernel, linear kernel and tree kernel. In analyzing these results we show that it will be feasible to make use of SVM classifier with radial basis function as kernel to predict the gene functionality using phylogenetic profiles.

  15. A fuzzy pattern matching method based on graph kernel for lithography hotspot detection

    NASA Astrophysics Data System (ADS)

    Nitta, Izumi; Kanazawa, Yuzi; Ishida, Tsutomu; Banno, Koji

    2017-03-01

    In advanced technology nodes, lithography hotspot detection has become one of the most significant issues in design for manufacturability. Recently, machine learning based lithography hotspot detection has been widely investigated, but it has trade-off between detection accuracy and false alarm. To apply machine learning based technique to the physical verification phase, designers require minimizing undetected hotspots to avoid yield degradation. They also need a ranking of similar known patterns with a detected hotspot to prioritize layout pattern to be corrected. To achieve high detection accuracy and to prioritize detected hotspots, we propose a novel lithography hotspot detection method using Delaunay triangulation and graph kernel based machine learning. Delaunay triangulation extracts features of hotspot patterns where polygons locate irregularly and closely one another, and graph kernel expresses inner structure of graphs. Additionally, our method provides similarity between two patterns and creates a list of similar training patterns with a detected hotspot. Experiments results on ICCAD 2012 benchmarks show that our method achieves high accuracy with allowable range of false alarm. We also show the ranking of the similar known patterns with a detected hotspot.

  16. Learning molecular energies using localized graph kernels.

    PubMed

    Ferré, Grégoire; Haut, Terry; Barros, Kipton

    2017-03-21

    Recent machine learning methods make it possible to model potential energy of atomic configurations with chemical-level accuracy (as calculated from ab initio calculations) and at speeds suitable for molecular dynamics simulation. Best performance is achieved when the known physical constraints are encoded in the machine learning models. For example, the atomic energy is invariant under global translations and rotations; it is also invariant to permutations of same-species atoms. Although simple to state, these symmetries are complicated to encode into machine learning algorithms. In this paper, we present a machine learning approach based on graph theory that naturally incorporates translation, rotation, and permutation symmetries. Specifically, we use a random walk graph kernel to measure the similarity of two adjacency matrices, each of which represents a local atomic environment. This Graph Approximated Energy (GRAPE) approach is flexible and admits many possible extensions. We benchmark a simple version of GRAPE by predicting atomization energies on a standard dataset of organic molecules.

  17. Learning molecular energies using localized graph kernels

    NASA Astrophysics Data System (ADS)

    Ferré, Grégoire; Haut, Terry; Barros, Kipton

    2017-03-01

    Recent machine learning methods make it possible to model potential energy of atomic configurations with chemical-level accuracy (as calculated from ab initio calculations) and at speeds suitable for molecular dynamics simulation. Best performance is achieved when the known physical constraints are encoded in the machine learning models. For example, the atomic energy is invariant under global translations and rotations; it is also invariant to permutations of same-species atoms. Although simple to state, these symmetries are complicated to encode into machine learning algorithms. In this paper, we present a machine learning approach based on graph theory that naturally incorporates translation, rotation, and permutation symmetries. Specifically, we use a random walk graph kernel to measure the similarity of two adjacency matrices, each of which represents a local atomic environment. This Graph Approximated Energy (GRAPE) approach is flexible and admits many possible extensions. We benchmark a simple version of GRAPE by predicting atomization energies on a standard dataset of organic molecules.

  18. Least square regularized regression in sum space.

    PubMed

    Xu, Yong-Li; Chen, Di-Rong; Li, Han-Xiong; Liu, Lu

    2013-04-01

    This paper proposes a least square regularized regression algorithm in sum space of reproducing kernel Hilbert spaces (RKHSs) for nonflat function approximation, and obtains the solution of the algorithm by solving a system of linear equations. This algorithm can approximate the low- and high-frequency component of the target function with large and small scale kernels, respectively. The convergence and learning rate are analyzed. We measure the complexity of the sum space by its covering number and demonstrate that the covering number can be bounded by the product of the covering numbers of basic RKHSs. For sum space of RKHSs with Gaussian kernels, by choosing appropriate parameters, we tradeoff the sample error and regularization error, and obtain a polynomial learning rate, which is better than that in any single RKHS. The utility of this method is illustrated with two simulated data sets and five real-life databases.

  19. Kernel methods for large-scale genomic data analysis

    PubMed Central

    Xing, Eric P.; Schaid, Daniel J.

    2015-01-01

    Machine learning, particularly kernel methods, has been demonstrated as a promising new tool to tackle the challenges imposed by today’s explosive data growth in genomics. They provide a practical and principled approach to learning how a large number of genetic variants are associated with complex phenotypes, to help reveal the complexity in the relationship between the genetic markers and the outcome of interest. In this review, we highlight the potential key role it will have in modern genomic data processing, especially with regard to integration with classical methods for gene prioritizing, prediction and data fusion. PMID:25053743

  20. Kernelized rank learning for personalized drug recommendation.

    PubMed

    He, Xiao; Folkman, Lukas; Borgwardt, Karsten

    2018-03-08

    Large-scale screenings of cancer cell lines with detailed molecular profiles against libraries of pharmacological compounds are currently being performed in order to gain a better understanding of the genetic component of drug response and to enhance our ability to recommend therapies given a patient's molecular profile. These comprehensive screens differ from the clinical setting in which (1) medical records only contain the response of a patient to very few drugs, (2) drugs are recommended by doctors based on their expert judgment, and (3) selecting the most promising therapy is often more important than accurately predicting the sensitivity to all potential drugs. Current regression models for drug sensitivity prediction fail to account for these three properties. We present a machine learning approach, named Kernelized Rank Learning (KRL), that ranks drugs based on their predicted effect per cell line (patient), circumventing the difficult problem of precisely predicting the sensitivity to the given drug. Our approach outperforms several state-of-the-art predictors in drug recommendation, particularly if the training dataset is sparse, and generalizes to patient data. Our work phrases personalized drug recommendation as a new type of machine learning problem with translational potential to the clinic. The Python implementation of KRL and scripts for running our experiments are available at https://github.com/BorgwardtLab/Kernelized-Rank-Learning. xiao.he@bsse.ethz.ch, lukas.folkman@bsse.ethz.ch. Supplementary data are available at Bioinformatics online.

  1. Nonlinear Semi-Supervised Metric Learning Via Multiple Kernels and Local Topology.

    PubMed

    Li, Xin; Bai, Yanqin; Peng, Yaxin; Du, Shaoyi; Ying, Shihui

    2018-03-01

    Changing the metric on the data may change the data distribution, hence a good distance metric can promote the performance of learning algorithm. In this paper, we address the semi-supervised distance metric learning (ML) problem to obtain the best nonlinear metric for the data. First, we describe the nonlinear metric by the multiple kernel representation. By this approach, we project the data into a high dimensional space, where the data can be well represented by linear ML. Then, we reformulate the linear ML by a minimization problem on the positive definite matrix group. Finally, we develop a two-step algorithm for solving this model and design an intrinsic steepest descent algorithm to learn the positive definite metric matrix. Experimental results validate that our proposed method is effective and outperforms several state-of-the-art ML methods.

  2. Feature and Region Selection for Visual Learning.

    PubMed

    Zhao, Ji; Wang, Liantao; Cabral, Ricardo; De la Torre, Fernando

    2016-03-01

    Visual learning problems, such as object classification and action recognition, are typically approached using extensions of the popular bag-of-words (BoWs) model. Despite its great success, it is unclear what visual features the BoW model is learning. Which regions in the image or video are used to discriminate among classes? Which are the most discriminative visual words? Answering these questions is fundamental for understanding existing BoW models and inspiring better models for visual recognition. To answer these questions, this paper presents a method for feature selection and region selection in the visual BoW model. This allows for an intermediate visualization of the features and regions that are important for visual learning. The main idea is to assign latent weights to the features or regions, and jointly optimize these latent variables with the parameters of a classifier (e.g., support vector machine). There are four main benefits of our approach: 1) our approach accommodates non-linear additive kernels, such as the popular χ(2) and intersection kernel; 2) our approach is able to handle both regions in images and spatio-temporal regions in videos in a unified way; 3) the feature selection problem is convex, and both problems can be solved using a scalable reduced gradient method; and 4) we point out strong connections with multiple kernel learning and multiple instance learning approaches. Experimental results in the PASCAL VOC 2007, MSR Action Dataset II and YouTube illustrate the benefits of our approach.

  3. Learning a peptide-protein binding affinity predictor with kernel ridge regression

    PubMed Central

    2013-01-01

    Background The cellular function of a vast majority of proteins is performed through physical interactions with other biomolecules, which, most of the time, are other proteins. Peptides represent templates of choice for mimicking a secondary structure in order to modulate protein-protein interaction. They are thus an interesting class of therapeutics since they also display strong activity, high selectivity, low toxicity and few drug-drug interactions. Furthermore, predicting peptides that would bind to a specific MHC alleles would be of tremendous benefit to improve vaccine based therapy and possibly generate antibodies with greater affinity. Modern computational methods have the potential to accelerate and lower the cost of drug and vaccine discovery by selecting potential compounds for testing in silico prior to biological validation. Results We propose a specialized string kernel for small bio-molecules, peptides and pseudo-sequences of binding interfaces. The kernel incorporates physico-chemical properties of amino acids and elegantly generalizes eight kernels, comprised of the Oligo, the Weighted Degree, the Blended Spectrum, and the Radial Basis Function. We provide a low complexity dynamic programming algorithm for the exact computation of the kernel and a linear time algorithm for it’s approximation. Combined with kernel ridge regression and SupCK, a novel binding pocket kernel, the proposed kernel yields biologically relevant and good prediction accuracy on the PepX database. For the first time, a machine learning predictor is capable of predicting the binding affinity of any peptide to any protein with reasonable accuracy. The method was also applied to both single-target and pan-specific Major Histocompatibility Complex class II benchmark datasets and three Quantitative Structure Affinity Model benchmark datasets. Conclusion On all benchmarks, our method significantly (p-value ≤ 0.057) outperforms the current state-of-the-art methods at predicting peptide-protein binding affinities. The proposed approach is flexible and can be applied to predict any quantitative biological activity. Moreover, generating reliable peptide-protein binding affinities will also improve system biology modelling of interaction pathways. Lastly, the method should be of value to a large segment of the research community with the potential to accelerate the discovery of peptide-based drugs and facilitate vaccine development. The proposed kernel is freely available at http://graal.ift.ulaval.ca/downloads/gs-kernel/. PMID:23497081

  4. Predicting activity approach based on new atoms similarity kernel function.

    PubMed

    Abu El-Atta, Ahmed H; Moussa, M I; Hassanien, Aboul Ella

    2015-07-01

    Drug design is a high cost and long term process. To reduce time and costs for drugs discoveries, new techniques are needed. Chemoinformatics field implements the informational techniques and computer science like machine learning and graph theory to discover the chemical compounds properties, such as toxicity or biological activity. This is done through analyzing their molecular structure (molecular graph). To overcome this problem there is an increasing need for algorithms to analyze and classify graph data to predict the activity of molecules. Kernels methods provide a powerful framework which combines machine learning with graph theory techniques. These kernels methods have led to impressive performance results in many several chemoinformatics problems like biological activity prediction. This paper presents a new approach based on kernel functions to solve activity prediction problem for chemical compounds. First we encode all atoms depending on their neighbors then we use these codes to find a relationship between those atoms each other. Then we use relation between different atoms to find similarity between chemical compounds. The proposed approach was compared with many other classification methods and the results show competitive accuracy with these methods. Copyright © 2015 Elsevier Inc. All rights reserved.

  5. Oversampling the Minority Class in the Feature Space.

    PubMed

    Perez-Ortiz, Maria; Gutierrez, Pedro Antonio; Tino, Peter; Hervas-Martinez, Cesar

    2016-09-01

    The imbalanced nature of some real-world data is one of the current challenges for machine learning researchers. One common approach oversamples the minority class through convex combination of its patterns. We explore the general idea of synthetic oversampling in the feature space induced by a kernel function (as opposed to input space). If the kernel function matches the underlying problem, the classes will be linearly separable and synthetically generated patterns will lie on the minority class region. Since the feature space is not directly accessible, we use the empirical feature space (EFS) (a Euclidean space isomorphic to the feature space) for oversampling purposes. The proposed method is framed in the context of support vector machines, where the imbalanced data sets can pose a serious hindrance. The idea is investigated in three scenarios: 1) oversampling in the full and reduced-rank EFSs; 2) a kernel learning technique maximizing the data class separation to study the influence of the feature space structure (implicitly defined by the kernel function); and 3) a unified framework for preferential oversampling that spans some of the previous approaches in the literature. We support our investigation with extensive experiments over 50 imbalanced data sets.

  6. A Temperature Compensation Method for Piezo-Resistive Pressure Sensor Utilizing Chaotic Ions Motion Algorithm Optimized Hybrid Kernel LSSVM.

    PubMed

    Li, Ji; Hu, Guoqing; Zhou, Yonghong; Zou, Chong; Peng, Wei; Alam Sm, Jahangir

    2016-10-14

    A piezo-resistive pressure sensor is made of silicon, the nature of which is considerably influenced by ambient temperature. The effect of temperature should be eliminated during the working period in expectation of linear output. To deal with this issue, an approach consists of a hybrid kernel Least Squares Support Vector Machine (LSSVM) optimized by a chaotic ions motion algorithm presented. To achieve the learning and generalization for excellent performance, a hybrid kernel function, constructed by a local kernel as Radial Basis Function (RBF) kernel, and a global kernel as polynomial kernel is incorporated into the Least Squares Support Vector Machine. The chaotic ions motion algorithm is introduced to find the best hyper-parameters of the Least Squares Support Vector Machine. The temperature data from a calibration experiment is conducted to validate the proposed method. With attention on algorithm robustness and engineering applications, the compensation result shows the proposed scheme outperforms other compared methods on several performance measures as maximum absolute relative error, minimum absolute relative error mean and variance of the averaged value on fifty runs. Furthermore, the proposed temperature compensation approach lays a foundation for more extensive research.

  7. Using LAMMPS Software on the Peregrine System | High-Performance Computing

    Science.gov Websites

    -l walltime=4:00:00 # WALLTIME #PBS -l nodes=2:ppn=16 # Number of nodes and processes per node #PBS module purge module load impi-intel/2017.0.5 mkl/2017.0.5 lammps/11Aug17 mpirun -np 32 lmp -in lmp.in -l

  8. Regularized Embedded Multiple Kernel Dimensionality Reduction for Mine Signal Processing.

    PubMed

    Li, Shuang; Liu, Bing; Zhang, Chen

    2016-01-01

    Traditional multiple kernel dimensionality reduction models are generally based on graph embedding and manifold assumption. But such assumption might be invalid for some high-dimensional or sparse data due to the curse of dimensionality, which has a negative influence on the performance of multiple kernel learning. In addition, some models might be ill-posed if the rank of matrices in their objective functions was not high enough. To address these issues, we extend the traditional graph embedding framework and propose a novel regularized embedded multiple kernel dimensionality reduction method. Different from the conventional convex relaxation technique, the proposed algorithm directly takes advantage of a binary search and an alternative optimization scheme to obtain optimal solutions efficiently. The experimental results demonstrate the effectiveness of the proposed method for supervised, unsupervised, and semisupervised scenarios.

  9. Online Distributed Learning Over Networks in RKH Spaces Using Random Fourier Features

    NASA Astrophysics Data System (ADS)

    Bouboulis, Pantelis; Chouvardas, Symeon; Theodoridis, Sergios

    2018-04-01

    We present a novel diffusion scheme for online kernel-based learning over networks. So far, a major drawback of any online learning algorithm, operating in a reproducing kernel Hilbert space (RKHS), is the need for updating a growing number of parameters as time iterations evolve. Besides complexity, this leads to an increased need of communication resources, in a distributed setting. In contrast, the proposed method approximates the solution as a fixed-size vector (of larger dimension than the input space) using Random Fourier Features. This paves the way to use standard linear combine-then-adapt techniques. To the best of our knowledge, this is the first time that a complete protocol for distributed online learning in RKHS is presented. Conditions for asymptotic convergence and boundness of the networkwise regret are also provided. The simulated tests illustrate the performance of the proposed scheme.

  10. Parsimonious kernel extreme learning machine in primal via Cholesky factorization.

    PubMed

    Zhao, Yong-Ping

    2016-08-01

    Recently, extreme learning machine (ELM) has become a popular topic in machine learning community. By replacing the so-called ELM feature mappings with the nonlinear mappings induced by kernel functions, two kernel ELMs, i.e., P-KELM and D-KELM, are obtained from primal and dual perspectives, respectively. Unfortunately, both P-KELM and D-KELM possess the dense solutions in direct proportion to the number of training data. To this end, a constructive algorithm for P-KELM (CCP-KELM) is first proposed by virtue of Cholesky factorization, in which the training data incurring the largest reductions on the objective function are recruited as significant vectors. To reduce its training cost further, PCCP-KELM is then obtained with the application of a probabilistic speedup scheme into CCP-KELM. Corresponding to CCP-KELM, a destructive P-KELM (CDP-KELM) is presented using a partial Cholesky factorization strategy, where the training data incurring the smallest reductions on the objective function after their removals are pruned from the current set of significant vectors. Finally, to verify the efficacy and feasibility of the proposed algorithms in this paper, experiments on both small and large benchmark data sets are investigated. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Prioritizing individual genetic variants after kernel machine testing using variable selection.

    PubMed

    He, Qianchuan; Cai, Tianxi; Liu, Yang; Zhao, Ni; Harmon, Quaker E; Almli, Lynn M; Binder, Elisabeth B; Engel, Stephanie M; Ressler, Kerry J; Conneely, Karen N; Lin, Xihong; Wu, Michael C

    2016-12-01

    Kernel machine learning methods, such as the SNP-set kernel association test (SKAT), have been widely used to test associations between traits and genetic polymorphisms. In contrast to traditional single-SNP analysis methods, these methods are designed to examine the joint effect of a set of related SNPs (such as a group of SNPs within a gene or a pathway) and are able to identify sets of SNPs that are associated with the trait of interest. However, as with many multi-SNP testing approaches, kernel machine testing can draw conclusion only at the SNP-set level, and does not directly inform on which one(s) of the identified SNP set is actually driving the associations. A recently proposed procedure, KerNel Iterative Feature Extraction (KNIFE), provides a general framework for incorporating variable selection into kernel machine methods. In this article, we focus on quantitative traits and relatively common SNPs, and adapt the KNIFE procedure to genetic association studies and propose an approach to identify driver SNPs after the application of SKAT to gene set analysis. Our approach accommodates several kernels that are widely used in SNP analysis, such as the linear kernel and the Identity by State (IBS) kernel. The proposed approach provides practically useful utilities to prioritize SNPs, and fills the gap between SNP set analysis and biological functional studies. Both simulation studies and real data application are used to demonstrate the proposed approach. © 2016 WILEY PERIODICALS, INC.

  12. A Framework for the Ethical Practice of Action Learning

    ERIC Educational Resources Information Center

    Johnson, Craig

    2010-01-01

    By tradition the action learning community has encouraged an eclectic view of practice. This involves a number of different permutations around a kernel of nebulous ideas. However, the disadvantages of such an open philosophy have never been considered. In particular consumer protection against inauthentic action learning experiences has been…

  13. Learning molecular energies using localized graph kernels

    DOE PAGES

    Ferré, Grégoire; Haut, Terry Scot; Barros, Kipton Marcos

    2017-03-21

    We report that recent machine learning methods make it possible to model potential energy of atomic configurations with chemical-level accuracy (as calculated from ab initio calculations) and at speeds suitable for molecular dynamics simulation. Best performance is achieved when the known physical constraints are encoded in the machine learning models. For example, the atomic energy is invariant under global translations and rotations; it is also invariant to permutations of same-species atoms. Although simple to state, these symmetries are complicated to encode into machine learning algorithms. In this paper, we present a machine learning approach based on graph theory that naturallymore » incorporates translation, rotation, and permutation symmetries. Specifically, we use a random walk graph kernel to measure the similarity of two adjacency matrices, each of which represents a local atomic environment. This Graph Approximated Energy (GRAPE) approach is flexible and admits many possible extensions. Finally, we benchmark a simple version of GRAPE by predicting atomization energies on a standard dataset of organic molecules.« less

  14. Learning molecular energies using localized graph kernels

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ferré, Grégoire; Haut, Terry Scot; Barros, Kipton Marcos

    We report that recent machine learning methods make it possible to model potential energy of atomic configurations with chemical-level accuracy (as calculated from ab initio calculations) and at speeds suitable for molecular dynamics simulation. Best performance is achieved when the known physical constraints are encoded in the machine learning models. For example, the atomic energy is invariant under global translations and rotations; it is also invariant to permutations of same-species atoms. Although simple to state, these symmetries are complicated to encode into machine learning algorithms. In this paper, we present a machine learning approach based on graph theory that naturallymore » incorporates translation, rotation, and permutation symmetries. Specifically, we use a random walk graph kernel to measure the similarity of two adjacency matrices, each of which represents a local atomic environment. This Graph Approximated Energy (GRAPE) approach is flexible and admits many possible extensions. Finally, we benchmark a simple version of GRAPE by predicting atomization energies on a standard dataset of organic molecules.« less

  15. A linear recurrent kernel online learning algorithm with sparse updates.

    PubMed

    Fan, Haijin; Song, Qing

    2014-02-01

    In this paper, we propose a recurrent kernel algorithm with selectively sparse updates for online learning. The algorithm introduces a linear recurrent term in the estimation of the current output. This makes the past information reusable for updating of the algorithm in the form of a recurrent gradient term. To ensure that the reuse of this recurrent gradient indeed accelerates the convergence speed, a novel hybrid recurrent training is proposed to switch on or off learning the recurrent information according to the magnitude of the current training error. Furthermore, the algorithm includes a data-dependent adaptive learning rate which can provide guaranteed system weight convergence at each training iteration. The learning rate is set as zero when the training violates the derived convergence conditions, which makes the algorithm updating process sparse. Theoretical analyses of the weight convergence are presented and experimental results show the good performance of the proposed algorithm in terms of convergence speed and estimation accuracy. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. Electron beam lithographic modeling assisted by artificial intelligence technology

    NASA Astrophysics Data System (ADS)

    Nakayamada, Noriaki; Nishimura, Rieko; Miura, Satoru; Nomura, Haruyuki; Kamikubo, Takashi

    2017-07-01

    We propose a new concept of tuning a point-spread function (a "kernel" function) in the modeling of electron beam lithography using the machine learning scheme. Normally in the work of artificial intelligence, the researchers focus on the output results from a neural network, such as success ratio in image recognition or improved production yield, etc. In this work, we put more focus on the weights connecting the nodes in a convolutional neural network, which are naturally the fractions of a point-spread function, and take out those weighted fractions after learning to be utilized as a tuned kernel. Proof-of-concept of the kernel tuning has been demonstrated using the examples of proximity effect correction with 2-layer network, and charging effect correction with 3-layer network. This type of new tuning method can be beneficial to give researchers more insights to come up with a better model, yet it might be too early to be deployed to production to give better critical dimension (CD) and positional accuracy almost instantly.

  17. Dimensional feature weighting utilizing multiple kernel learning for single-channel talker location discrimination using the acoustic transfer function.

    PubMed

    Takashima, Ryoichi; Takiguchi, Tetsuya; Ariki, Yasuo

    2013-02-01

    This paper presents a method for discriminating the location of the sound source (talker) using only a single microphone. In a previous work, the single-channel approach for discriminating the location of the sound source was discussed, where the acoustic transfer function from a user's position is estimated by using a hidden Markov model of clean speech in the cepstral domain. In this paper, each cepstral dimension of the acoustic transfer function is newly weighted, in order to obtain the cepstral dimensions having information that is useful for classifying the user's position. Then, this paper proposes a feature-weighting method for the cepstral parameter using multiple kernel learning, defining the base kernels for each cepstral dimension of the acoustic transfer function. The user's position is trained and classified by support vector machine. The effectiveness of this method has been confirmed by sound source (talker) localization experiments performed in different room environments.

  18. Automatic sleep staging using multi-dimensional feature extraction and multi-kernel fuzzy support vector machine.

    PubMed

    Zhang, Yanjun; Zhang, Xiangmin; Liu, Wenhui; Luo, Yuxi; Yu, Enjia; Zou, Keju; Liu, Xiaoliang

    2014-01-01

    This paper employed the clinical Polysomnographic (PSG) data, mainly including all-night Electroencephalogram (EEG), Electrooculogram (EOG) and Electromyogram (EMG) signals of subjects, and adopted the American Academy of Sleep Medicine (AASM) clinical staging manual as standards to realize automatic sleep staging. Authors extracted eighteen different features of EEG, EOG and EMG in time domains and frequency domains to construct the vectors according to the existing literatures as well as clinical experience. By adopting sleep samples self-learning, the linear combination of weights and parameters of multiple kernels of the fuzzy support vector machine (FSVM) were learned and the multi-kernel FSVM (MK-FSVM) was constructed. The overall agreement between the experts' scores and the results presented was 82.53%. Compared with previous results, the accuracy of N1 was improved to some extent while the accuracies of other stages were approximate, which well reflected the sleep structure. The staging algorithm proposed in this paper is transparent, and worth further investigation.

  19. Many Molecular Properties from One Kernel in Chemical Space

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ramakrishnan, Raghunathan; von Lilienfeld, O. Anatole

    We introduce property-independent kernels for machine learning modeling of arbitrarily many molecular properties. The kernels encode molecular structures for training sets of varying size, as well as similarity measures sufficiently diffuse in chemical space to sample over all training molecules. Corresponding molecular reference properties provided, they enable the instantaneous generation of ML models which can systematically be improved through the addition of more data. This idea is exemplified for single kernel based modeling of internal energy, enthalpy, free energy, heat capacity, polarizability, electronic spread, zero-point vibrational energy, energies of frontier orbitals, HOMOLUMO gap, and the highest fundamental vibrational wavenumber. Modelsmore » of these properties are trained and tested using 112 kilo organic molecules of similar size. Resulting models are discussed as well as the kernels’ use for generating and using other property models.« less

  20. Gene function prediction with gene interaction networks: a context graph kernel approach.

    PubMed

    Li, Xin; Chen, Hsinchun; Li, Jiexun; Zhang, Zhu

    2010-01-01

    Predicting gene functions is a challenge for biologists in the postgenomic era. Interactions among genes and their products compose networks that can be used to infer gene functions. Most previous studies adopt a linkage assumption, i.e., they assume that gene interactions indicate functional similarities between connected genes. In this study, we propose to use a gene's context graph, i.e., the gene interaction network associated with the focal gene, to infer its functions. In a kernel-based machine-learning framework, we design a context graph kernel to capture the information in context graphs. Our experimental study on a testbed of p53-related genes demonstrates the advantage of using indirect gene interactions and shows the empirical superiority of the proposed approach over linkage-assumption-based methods, such as the algorithm to minimize inconsistent connected genes and diffusion kernels.

  1. Exploiting graph kernels for high performance biomedical relation extraction.

    PubMed

    Panyam, Nagesh C; Verspoor, Karin; Cohn, Trevor; Ramamohanarao, Kotagiri

    2018-01-30

    Relation extraction from biomedical publications is an important task in the area of semantic mining of text. Kernel methods for supervised relation extraction are often preferred over manual feature engineering methods, when classifying highly ordered structures such as trees and graphs obtained from syntactic parsing of a sentence. Tree kernels such as the Subset Tree Kernel and Partial Tree Kernel have been shown to be effective for classifying constituency parse trees and basic dependency parse graphs of a sentence. Graph kernels such as the All Path Graph kernel (APG) and Approximate Subgraph Matching (ASM) kernel have been shown to be suitable for classifying general graphs with cycles, such as the enhanced dependency parse graph of a sentence. In this work, we present a high performance Chemical-Induced Disease (CID) relation extraction system. We present a comparative study of kernel methods for the CID task and also extend our study to the Protein-Protein Interaction (PPI) extraction task, an important biomedical relation extraction task. We discuss novel modifications to the ASM kernel to boost its performance and a method to apply graph kernels for extracting relations expressed in multiple sentences. Our system for CID relation extraction attains an F-score of 60%, without using external knowledge sources or task specific heuristic or rules. In comparison, the state of the art Chemical-Disease Relation Extraction system achieves an F-score of 56% using an ensemble of multiple machine learning methods, which is then boosted to 61% with a rule based system employing task specific post processing rules. For the CID task, graph kernels outperform tree kernels substantially, and the best performance is obtained with APG kernel that attains an F-score of 60%, followed by the ASM kernel at 57%. The performance difference between the ASM and APG kernels for CID sentence level relation extraction is not significant. In our evaluation of ASM for the PPI task, ASM performed better than APG kernel for the BioInfer dataset, in the Area Under Curve (AUC) measure (74% vs 69%). However, for all the other PPI datasets, namely AIMed, HPRD50, IEPA and LLL, ASM is substantially outperformed by the APG kernel in F-score and AUC measures. We demonstrate a high performance Chemical Induced Disease relation extraction, without employing external knowledge sources or task specific heuristics. Our work shows that graph kernels are effective in extracting relations that are expressed in multiple sentences. We also show that the graph kernels, namely the ASM and APG kernels, substantially outperform the tree kernels. Among the graph kernels, we showed the ASM kernel as effective for biomedical relation extraction, with comparable performance to the APG kernel for datasets such as the CID-sentence level relation extraction and BioInfer in PPI. Overall, the APG kernel is shown to be significantly more accurate than the ASM kernel, achieving better performance on most datasets.

  2. An Efficient Implementation of Deep Convolutional Neural Networks for MRI Segmentation.

    PubMed

    Hoseini, Farnaz; Shahbahrami, Asadollah; Bayat, Peyman

    2018-02-27

    Image segmentation is one of the most common steps in digital image processing, classifying a digital image into different segments. The main goal of this paper is to segment brain tumors in magnetic resonance images (MRI) using deep learning. Tumors having different shapes, sizes, brightness and textures can appear anywhere in the brain. These complexities are the reasons to choose a high-capacity Deep Convolutional Neural Network (DCNN) containing more than one layer. The proposed DCNN contains two parts: architecture and learning algorithms. The architecture and the learning algorithms are used to design a network model and to optimize parameters for the network training phase, respectively. The architecture contains five convolutional layers, all using 3 × 3 kernels, and one fully connected layer. Due to the advantage of using small kernels with fold, it allows making the effect of larger kernels with smaller number of parameters and fewer computations. Using the Dice Similarity Coefficient metric, we report accuracy results on the BRATS 2016, brain tumor segmentation challenge dataset, for the complete, core, and enhancing regions as 0.90, 0.85, and 0.84 respectively. The learning algorithm includes the task-level parallelism. All the pixels of an MR image are classified using a patch-based approach for segmentation. We attain a good performance and the experimental results show that the proposed DCNN increases the segmentation accuracy compared to previous techniques.

  3. What Would a Graph Look Like in this Layout? A Machine Learning Approach to Large Graph Visualization.

    PubMed

    Kwon, Oh-Hyun; Crnovrsanin, Tarik; Ma, Kwan-Liu

    2018-01-01

    Using different methods for laying out a graph can lead to very different visual appearances, with which the viewer perceives different information. Selecting a "good" layout method is thus important for visualizing a graph. The selection can be highly subjective and dependent on the given task. A common approach to selecting a good layout is to use aesthetic criteria and visual inspection. However, fully calculating various layouts and their associated aesthetic metrics is computationally expensive. In this paper, we present a machine learning approach to large graph visualization based on computing the topological similarity of graphs using graph kernels. For a given graph, our approach can show what the graph would look like in different layouts and estimate their corresponding aesthetic metrics. An important contribution of our work is the development of a new framework to design graph kernels. Our experimental study shows that our estimation calculation is considerably faster than computing the actual layouts and their aesthetic metrics. Also, our graph kernels outperform the state-of-the-art ones in both time and accuracy. In addition, we conducted a user study to demonstrate that the topological similarity computed with our graph kernel matches perceptual similarity assessed by human users.

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Womeldorff, Geoffrey Alan; Payne, Joshua Estes; Bergen, Benjamin Karl

    These are slides for a presentation on PARTISN Research and FleCSI Updates. The following topics are covered: SNAP vs PARTISN, Background Research, Production Code (structural design and changes, kernel design and implementation, lessons learned), NuT IMC Proxy, FleCSI Update (design and lessons learned). It can all be summarized in the following manner: Kokkos was shown to be effective in FY15 in implementing a C++ version of SNAP's kernel. This same methodology was applied to a production IC code, PARTISN. This was a much more complex endeavour than in FY15 for many reasons; a C++ kernel embedded in Fortran, overloading Fortranmore » memory allocations, general language interoperability, and a fully fleshed out production code versus a simplified proxy code. Lessons learned are Legion. In no particular order: Interoperability between Fortran and C++ was really not that hard, and a useful engineering effort. Tracking down all necessary memory allocations for a kernel in a production code is pretty hard. Modifying a production code to work for more than a handful of use cases is also pretty hard. Figuring out the toolchain that will allow a successful implementation of design decisions is quite hard, if making use of "bleeding edge" design choices. In terms of performance, production code concurrency architecture can be a virtual showstopper; being too complex to easily rewrite and test in a short period of time, or depending on tool features which do not exist yet. Ultimately, while the tools used in this work were not successful in speeding up the production code, they helped to identify how work would be done, and provide requirements to tools.« less

  5. Optimal projection method determination by Logdet Divergence and perturbed von-Neumann Divergence.

    PubMed

    Jiang, Hao; Ching, Wai-Ki; Qiu, Yushan; Cheng, Xiao-Qing

    2017-12-14

    Positive semi-definiteness is a critical property in kernel methods for Support Vector Machine (SVM) by which efficient solutions can be guaranteed through convex quadratic programming. However, a lot of similarity functions in applications do not produce positive semi-definite kernels. We propose projection method by constructing projection matrix on indefinite kernels. As a generalization of the spectrum method (denoising method and flipping method), the projection method shows better or comparable performance comparing to the corresponding indefinite kernel methods on a number of real world data sets. Under the Bregman matrix divergence theory, we can find suggested optimal λ in projection method using unconstrained optimization in kernel learning. In this paper we focus on optimal λ determination, in the pursuit of precise optimal λ determination method in unconstrained optimization framework. We developed a perturbed von-Neumann divergence to measure kernel relationships. We compared optimal λ determination with Logdet Divergence and perturbed von-Neumann Divergence, aiming at finding better λ in projection method. Results on a number of real world data sets show that projection method with optimal λ by Logdet divergence demonstrate near optimal performance. And the perturbed von-Neumann Divergence can help determine a relatively better optimal projection method. Projection method ia easy to use for dealing with indefinite kernels. And the parameter embedded in the method can be determined through unconstrained optimization under Bregman matrix divergence theory. This may provide a new way in kernel SVMs for varied objectives.

  6. Multiple kernel SVR based on the MRE for remote sensing water depth fusion detection

    NASA Astrophysics Data System (ADS)

    Wang, Jinjin; Ma, Yi; Zhang, Jingyu

    2018-03-01

    Remote sensing has an important means of water depth detection in coastal shallow waters and reefs. Support vector regression (SVR) is a machine learning method which is widely used in data regression. In this paper, SVR is used to remote sensing multispectral bathymetry. Aiming at the problem that the single-kernel SVR method has a large error in shallow water depth inversion, the mean relative error (MRE) of different water depth is retrieved as a decision fusion factor with single kernel SVR method, a multi kernel SVR fusion method based on the MRE is put forward. And taking the North Island of the Xisha Islands in China as an experimentation area, the comparison experiments with the single kernel SVR method and the traditional multi-bands bathymetric method are carried out. The results show that: 1) In range of 0 to 25 meters, the mean absolute error(MAE)of the multi kernel SVR fusion method is 1.5m,the MRE is 13.2%; 2) Compared to the 4 single kernel SVR method, the MRE of the fusion method reduced 1.2% (1.9%) 3.4% (1.8%), and compared to traditional multi-bands method, the MRE reduced 1.9%; 3) In 0-5m depth section, compared to the single kernel method and the multi-bands method, the MRE of fusion method reduced 13.5% to 44.4%, and the distribution of points is more concentrated relative to y=x.

  7. Optimization of fixture layouts of glass laser optics using multiple kernel regression.

    PubMed

    Su, Jianhua; Cao, Enhua; Qiao, Hong

    2014-05-10

    We aim to build an integrated fixturing model to describe the structural properties and thermal properties of the support frame of glass laser optics. Therefore, (a) a near global optimal set of clamps can be computed to minimize the surface shape error of the glass laser optic based on the proposed model, and (b) a desired surface shape error can be obtained by adjusting the clamping forces under various environmental temperatures based on the model. To construct the model, we develop a new multiple kernel learning method and call it multiple kernel support vector functional regression. The proposed method uses two layer regressions to group and order the data sources by the weights of the kernels and the factors of the layers. Because of that, the influences of the clamps and the temperature can be evaluated by grouping them into different layers.

  8. Racing to learn: statistical inference and learning in a single spiking neuron with adaptive kernels

    PubMed Central

    Afshar, Saeed; George, Libin; Tapson, Jonathan; van Schaik, André; Hamilton, Tara J.

    2014-01-01

    This paper describes the Synapto-dendritic Kernel Adapting Neuron (SKAN), a simple spiking neuron model that performs statistical inference and unsupervised learning of spatiotemporal spike patterns. SKAN is the first proposed neuron model to investigate the effects of dynamic synapto-dendritic kernels and demonstrate their computational power even at the single neuron scale. The rule-set defining the neuron is simple: there are no complex mathematical operations such as normalization, exponentiation or even multiplication. The functionalities of SKAN emerge from the real-time interaction of simple additive and binary processes. Like a biological neuron, SKAN is robust to signal and parameter noise, and can utilize both in its operations. At the network scale neurons are locked in a race with each other with the fastest neuron to spike effectively “hiding” its learnt pattern from its neighbors. The robustness to noise, high speed, and simple building blocks not only make SKAN an interesting neuron model in computational neuroscience, but also make it ideal for implementation in digital and analog neuromorphic systems which is demonstrated through an implementation in a Field Programmable Gate Array (FPGA). Matlab, Python, and Verilog implementations of SKAN are available at: http://www.uws.edu.au/bioelectronics_neuroscience/bens/reproducible_research. PMID:25505378

  9. Transfer Kernel Common Spatial Patterns for Motor Imagery Brain-Computer Interface Classification.

    PubMed

    Dai, Mengxi; Zheng, Dezhi; Liu, Shucong; Zhang, Pengju

    2018-01-01

    Motor-imagery-based brain-computer interfaces (BCIs) commonly use the common spatial pattern (CSP) as preprocessing step before classification. The CSP method is a supervised algorithm. Therefore a lot of time-consuming training data is needed to build the model. To address this issue, one promising approach is transfer learning, which generalizes a learning model can extract discriminative information from other subjects for target classification task. To this end, we propose a transfer kernel CSP (TKCSP) approach to learn a domain-invariant kernel by directly matching distributions of source subjects and target subjects. The dataset IVa of BCI Competition III is used to demonstrate the validity by our proposed methods. In the experiment, we compare the classification performance of the TKCSP against CSP, CSP for subject-to-subject transfer (CSP SJ-to-SJ), regularizing CSP (RCSP), stationary subspace CSP (ssCSP), multitask CSP (mtCSP), and the combined mtCSP and ssCSP (ss + mtCSP) method. The results indicate that the superior mean classification performance of TKCSP can achieve 81.14%, especially in case of source subjects with fewer number of training samples. Comprehensive experimental evidence on the dataset verifies the effectiveness and efficiency of the proposed TKCSP approach over several state-of-the-art methods.

  10. Transfer Kernel Common Spatial Patterns for Motor Imagery Brain-Computer Interface Classification

    PubMed Central

    Dai, Mengxi; Liu, Shucong; Zhang, Pengju

    2018-01-01

    Motor-imagery-based brain-computer interfaces (BCIs) commonly use the common spatial pattern (CSP) as preprocessing step before classification. The CSP method is a supervised algorithm. Therefore a lot of time-consuming training data is needed to build the model. To address this issue, one promising approach is transfer learning, which generalizes a learning model can extract discriminative information from other subjects for target classification task. To this end, we propose a transfer kernel CSP (TKCSP) approach to learn a domain-invariant kernel by directly matching distributions of source subjects and target subjects. The dataset IVa of BCI Competition III is used to demonstrate the validity by our proposed methods. In the experiment, we compare the classification performance of the TKCSP against CSP, CSP for subject-to-subject transfer (CSP SJ-to-SJ), regularizing CSP (RCSP), stationary subspace CSP (ssCSP), multitask CSP (mtCSP), and the combined mtCSP and ssCSP (ss + mtCSP) method. The results indicate that the superior mean classification performance of TKCSP can achieve 81.14%, especially in case of source subjects with fewer number of training samples. Comprehensive experimental evidence on the dataset verifies the effectiveness and efficiency of the proposed TKCSP approach over several state-of-the-art methods. PMID:29743934

  11. Racing to learn: statistical inference and learning in a single spiking neuron with adaptive kernels.

    PubMed

    Afshar, Saeed; George, Libin; Tapson, Jonathan; van Schaik, André; Hamilton, Tara J

    2014-01-01

    This paper describes the Synapto-dendritic Kernel Adapting Neuron (SKAN), a simple spiking neuron model that performs statistical inference and unsupervised learning of spatiotemporal spike patterns. SKAN is the first proposed neuron model to investigate the effects of dynamic synapto-dendritic kernels and demonstrate their computational power even at the single neuron scale. The rule-set defining the neuron is simple: there are no complex mathematical operations such as normalization, exponentiation or even multiplication. The functionalities of SKAN emerge from the real-time interaction of simple additive and binary processes. Like a biological neuron, SKAN is robust to signal and parameter noise, and can utilize both in its operations. At the network scale neurons are locked in a race with each other with the fastest neuron to spike effectively "hiding" its learnt pattern from its neighbors. The robustness to noise, high speed, and simple building blocks not only make SKAN an interesting neuron model in computational neuroscience, but also make it ideal for implementation in digital and analog neuromorphic systems which is demonstrated through an implementation in a Field Programmable Gate Array (FPGA). Matlab, Python, and Verilog implementations of SKAN are available at: http://www.uws.edu.au/bioelectronics_neuroscience/bens/reproducible_research.

  12. A scalable approach to solving dense linear algebra problems on hybrid CPU-GPU systems

    DOE PAGES

    Song, Fengguang; Dongarra, Jack

    2014-10-01

    Aiming to fully exploit the computing power of all CPUs and all graphics processing units (GPUs) on hybrid CPU-GPU systems to solve dense linear algebra problems, in this paper we design a class of heterogeneous tile algorithms to maximize the degree of parallelism, to minimize the communication volume, and to accommodate the heterogeneity between CPUs and GPUs. The new heterogeneous tile algorithms are executed upon our decentralized dynamic scheduling runtime system, which schedules a task graph dynamically and transfers data between compute nodes automatically. The runtime system uses a new distributed task assignment protocol to solve data dependencies between tasksmore » without any coordination between processing units. By overlapping computation and communication through dynamic scheduling, we are able to attain scalable performance for the double-precision Cholesky factorization and QR factorization. Finally, our approach demonstrates a performance comparable to Intel MKL on shared-memory multicore systems and better performance than both vendor (e.g., Intel MKL) and open source libraries (e.g., StarPU) in the following three environments: heterogeneous clusters with GPUs, conventional clusters without GPUs, and shared-memory systems with multiple GPUs.« less

  13. A scalable approach to solving dense linear algebra problems on hybrid CPU-GPU systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Song, Fengguang; Dongarra, Jack

    Aiming to fully exploit the computing power of all CPUs and all graphics processing units (GPUs) on hybrid CPU-GPU systems to solve dense linear algebra problems, in this paper we design a class of heterogeneous tile algorithms to maximize the degree of parallelism, to minimize the communication volume, and to accommodate the heterogeneity between CPUs and GPUs. The new heterogeneous tile algorithms are executed upon our decentralized dynamic scheduling runtime system, which schedules a task graph dynamically and transfers data between compute nodes automatically. The runtime system uses a new distributed task assignment protocol to solve data dependencies between tasksmore » without any coordination between processing units. By overlapping computation and communication through dynamic scheduling, we are able to attain scalable performance for the double-precision Cholesky factorization and QR factorization. Finally, our approach demonstrates a performance comparable to Intel MKL on shared-memory multicore systems and better performance than both vendor (e.g., Intel MKL) and open source libraries (e.g., StarPU) in the following three environments: heterogeneous clusters with GPUs, conventional clusters without GPUs, and shared-memory systems with multiple GPUs.« less

  14. Candidate gene prioritization by network analysis of differential expression using machine learning approaches

    PubMed Central

    2010-01-01

    Background Discovering novel disease genes is still challenging for diseases for which no prior knowledge - such as known disease genes or disease-related pathways - is available. Performing genetic studies frequently results in large lists of candidate genes of which only few can be followed up for further investigation. We have recently developed a computational method for constitutional genetic disorders that identifies the most promising candidate genes by replacing prior knowledge by experimental data of differential gene expression between affected and healthy individuals. To improve the performance of our prioritization strategy, we have extended our previous work by applying different machine learning approaches that identify promising candidate genes by determining whether a gene is surrounded by highly differentially expressed genes in a functional association or protein-protein interaction network. Results We have proposed three strategies scoring disease candidate genes relying on network-based machine learning approaches, such as kernel ridge regression, heat kernel, and Arnoldi kernel approximation. For comparison purposes, a local measure based on the expression of the direct neighbors is also computed. We have benchmarked these strategies on 40 publicly available knockout experiments in mice, and performance was assessed against results obtained using a standard procedure in genetics that ranks candidate genes based solely on their differential expression levels (Simple Expression Ranking). Our results showed that our four strategies could outperform this standard procedure and that the best results were obtained using the Heat Kernel Diffusion Ranking leading to an average ranking position of 8 out of 100 genes, an AUC value of 92.3% and an error reduction of 52.8% relative to the standard procedure approach which ranked the knockout gene on average at position 17 with an AUC value of 83.7%. Conclusion In this study we could identify promising candidate genes using network based machine learning approaches even if no knowledge is available about the disease or phenotype. PMID:20840752

  15. Predicting plant protein subcellular multi-localization by Chou's PseAAC formulation based multi-label homolog knowledge transfer learning.

    PubMed

    Mei, Suyu

    2012-10-07

    Recent years have witnessed much progress in computational modeling for protein subcellular localization. However, there are far few computational models for predicting plant protein subcellular multi-localization. In this paper, we propose a multi-label multi-kernel transfer learning model for predicting multiple subcellular locations of plant proteins (MLMK-TLM). The method proposes a multi-label confusion matrix and adapts one-against-all multi-class probabilistic outputs to multi-label learning scenario, based on which we further extend our published work MK-TLM (multi-kernel transfer learning based on Chou's PseAAC formulation for protein submitochondria localization) for plant protein subcellular multi-localization. By proper homolog knowledge transfer, MLMK-TLM is applicable to novel plant protein subcellular localization in multi-label learning scenario. The experiments on plant protein benchmark dataset show that MLMK-TLM outperforms the baseline model. Unlike the existing models, MLMK-TLM also reports its misleading tendency, which is important for comprehensive survey of model's multi-labeling performance. Copyright © 2012 Elsevier Ltd. All rights reserved.

  16. Symmetry-Adapted Machine Learning for Tensorial Properties of Atomistic Systems.

    PubMed

    Grisafi, Andrea; Wilkins, David M; Csányi, Gábor; Ceriotti, Michele

    2018-01-19

    Statistical learning methods show great promise in providing an accurate prediction of materials and molecular properties, while minimizing the need for computationally demanding electronic structure calculations. The accuracy and transferability of these models are increased significantly by encoding into the learning procedure the fundamental symmetries of rotational and permutational invariance of scalar properties. However, the prediction of tensorial properties requires that the model respects the appropriate geometric transformations, rather than invariance, when the reference frame is rotated. We introduce a formalism that extends existing schemes and makes it possible to perform machine learning of tensorial properties of arbitrary rank, and for general molecular geometries. To demonstrate it, we derive a tensor kernel adapted to rotational symmetry, which is the natural generalization of the smooth overlap of atomic positions kernel commonly used for the prediction of scalar properties at the atomic scale. The performance and generality of the approach is demonstrated by learning the instantaneous response to an external electric field of water oligomers of increasing complexity, from the isolated molecule to the condensed phase.

  17. Symmetry-Adapted Machine Learning for Tensorial Properties of Atomistic Systems

    NASA Astrophysics Data System (ADS)

    Grisafi, Andrea; Wilkins, David M.; Csányi, Gábor; Ceriotti, Michele

    2018-01-01

    Statistical learning methods show great promise in providing an accurate prediction of materials and molecular properties, while minimizing the need for computationally demanding electronic structure calculations. The accuracy and transferability of these models are increased significantly by encoding into the learning procedure the fundamental symmetries of rotational and permutational invariance of scalar properties. However, the prediction of tensorial properties requires that the model respects the appropriate geometric transformations, rather than invariance, when the reference frame is rotated. We introduce a formalism that extends existing schemes and makes it possible to perform machine learning of tensorial properties of arbitrary rank, and for general molecular geometries. To demonstrate it, we derive a tensor kernel adapted to rotational symmetry, which is the natural generalization of the smooth overlap of atomic positions kernel commonly used for the prediction of scalar properties at the atomic scale. The performance and generality of the approach is demonstrated by learning the instantaneous response to an external electric field of water oligomers of increasing complexity, from the isolated molecule to the condensed phase.

  18. Alexnet Feature Extraction and Multi-Kernel Learning for Objectoriented Classification

    NASA Astrophysics Data System (ADS)

    Ding, L.; Li, H.; Hu, C.; Zhang, W.; Wang, S.

    2018-04-01

    In view of the fact that the deep convolutional neural network has stronger ability of feature learning and feature expression, an exploratory research is done on feature extraction and classification for high resolution remote sensing images. Taking the Google image with 0.3 meter spatial resolution in Ludian area of Yunnan Province as an example, the image segmentation object was taken as the basic unit, and the pre-trained AlexNet deep convolution neural network model was used for feature extraction. And the spectral features, AlexNet features and GLCM texture features are combined with multi-kernel learning and SVM classifier, finally the classification results were compared and analyzed. The results show that the deep convolution neural network can extract more accurate remote sensing image features, and significantly improve the overall accuracy of classification, and provide a reference value for earthquake disaster investigation and remote sensing disaster evaluation.

  19. Combining Statistical and Geometric Features for Colonic Polyp Detection in CTC Based on Multiple Kernel Learning

    PubMed Central

    Wang, Shijun; Yao, Jianhua; Petrick, Nicholas; Summers, Ronald M.

    2010-01-01

    Colon cancer is the second leading cause of cancer-related deaths in the United States. Computed tomographic colonography (CTC) combined with a computer aided detection system provides a feasible approach for improving colonic polyps detection and increasing the use of CTC for colon cancer screening. To distinguish true polyps from false positives, various features extracted from polyp candidates have been proposed. Most of these traditional features try to capture the shape information of polyp candidates or neighborhood knowledge about the surrounding structures (fold, colon wall, etc.). In this paper, we propose a new set of shape descriptors for polyp candidates based on statistical curvature information. These features called histograms of curvature features are rotation, translation and scale invariant and can be treated as complementing existing feature set. Then in order to make full use of the traditional geometric features (defined as group A) and the new statistical features (group B) which are highly heterogeneous, we employed a multiple kernel learning method based on semi-definite programming to learn an optimized classification kernel from the two groups of features. We conducted leave-one-patient-out test on a CTC dataset which contained scans from 66 patients. Experimental results show that a support vector machine (SVM) based on the combined feature set and the semi-definite optimization kernel achieved higher FROC performance compared to SVMs using the two groups of features separately. At a false positive per scan rate of 5, the sensitivity of the SVM using the combined features improved from 0.77 (Group A) and 0.73 (Group B) to 0.83 (p ≤ 0.01). PMID:20953299

  20. KernelADASYN: Kernel Based Adaptive Synthetic Data Generation for Imbalanced Learning

    DTIC Science & Technology

    2015-08-17

    eases [35], Indian liver patient dataset (ILPD) [36], Parkinsons dataset [37], Vertebral Column dataset [38], breast cancer dataset [39], breast tissue...Both the data set of Breast Cancer and Breast Tissue aim to predict the patient is normal or abnormal according to the measurements. The data set SPECT...9, pp. 1263–1284, 2009. [3] M. Elter, R. Schulz-Wendtland, and T. Wittenberg, “The prediction of breast cancer biopsy outcomes using two cad

  1. Fast metabolite identification with Input Output Kernel Regression.

    PubMed

    Brouard, Céline; Shen, Huibin; Dührkop, Kai; d'Alché-Buc, Florence; Böcker, Sebastian; Rousu, Juho

    2016-06-15

    An important problematic of metabolomics is to identify metabolites using tandem mass spectrometry data. Machine learning methods have been proposed recently to solve this problem by predicting molecular fingerprint vectors and matching these fingerprints against existing molecular structure databases. In this work we propose to address the metabolite identification problem using a structured output prediction approach. This type of approach is not limited to vector output space and can handle structured output space such as the molecule space. We use the Input Output Kernel Regression method to learn the mapping between tandem mass spectra and molecular structures. The principle of this method is to encode the similarities in the input (spectra) space and the similarities in the output (molecule) space using two kernel functions. This method approximates the spectra-molecule mapping in two phases. The first phase corresponds to a regression problem from the input space to the feature space associated to the output kernel. The second phase is a preimage problem, consisting in mapping back the predicted output feature vectors to the molecule space. We show that our approach achieves state-of-the-art accuracy in metabolite identification. Moreover, our method has the advantage of decreasing the running times for the training step and the test step by several orders of magnitude over the preceding methods. celine.brouard@aalto.fi Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  2. Predicting receptor-ligand pairs through kernel learning

    PubMed Central

    2011-01-01

    Background Regulation of cellular events is, often, initiated via extracellular signaling. Extracellular signaling occurs when a circulating ligand interacts with one or more membrane-bound receptors. Identification of receptor-ligand pairs is thus an important and specific form of PPI prediction. Results Given a set of disparate data sources (expression data, domain content, and phylogenetic profile) we seek to predict new receptor-ligand pairs. We create a combined kernel classifier and assess its performance with respect to the Database of Ligand-Receptor Partners (DLRP) 'golden standard' as well as the method proposed by Gertz et al. Among our findings, we discover that our predictions for the tgfβ family accurately reconstruct over 76% of the supported edges (0.76 recall and 0.67 precision) of the receptor-ligand bipartite graph defined by the DLRP "golden standard". In addition, for the tgfβ family, the combined kernel classifier is able to relatively improve upon the Gertz et al. work by a factor of approximately 1.5 when considering that our method has an F-measure of 0.71 while that of Gertz et al. has a value of 0.48. Conclusions The prediction of receptor-ligand pairings is a difficult and complex task. We have demonstrated that using kernel learning on multiple data sources provides a stronger alternative to the existing method in solving this task. PMID:21834994

  3. Fast metabolite identification with Input Output Kernel Regression

    PubMed Central

    Brouard, Céline; Shen, Huibin; Dührkop, Kai; d'Alché-Buc, Florence; Böcker, Sebastian; Rousu, Juho

    2016-01-01

    Motivation: An important problematic of metabolomics is to identify metabolites using tandem mass spectrometry data. Machine learning methods have been proposed recently to solve this problem by predicting molecular fingerprint vectors and matching these fingerprints against existing molecular structure databases. In this work we propose to address the metabolite identification problem using a structured output prediction approach. This type of approach is not limited to vector output space and can handle structured output space such as the molecule space. Results: We use the Input Output Kernel Regression method to learn the mapping between tandem mass spectra and molecular structures. The principle of this method is to encode the similarities in the input (spectra) space and the similarities in the output (molecule) space using two kernel functions. This method approximates the spectra-molecule mapping in two phases. The first phase corresponds to a regression problem from the input space to the feature space associated to the output kernel. The second phase is a preimage problem, consisting in mapping back the predicted output feature vectors to the molecule space. We show that our approach achieves state-of-the-art accuracy in metabolite identification. Moreover, our method has the advantage of decreasing the running times for the training step and the test step by several orders of magnitude over the preceding methods. Availability and implementation: Contact: celine.brouard@aalto.fi Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27307628

  4. Detection of Splice Sites Using Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Varadwaj, Pritish; Purohit, Neetesh; Arora, Bhumika

    Automatic identification and annotation of exon and intron region of gene, from DNA sequences has been an important research area in field of computational biology. Several approaches viz. Hidden Markov Model (HMM), Artificial Intelligence (AI) based machine learning and Digital Signal Processing (DSP) techniques have extensively and independently been used by various researchers to cater this challenging task. In this work, we propose a Support Vector Machine based kernel learning approach for detection of splice sites (the exon-intron boundary) in a gene. Electron-Ion Interaction Potential (EIIP) values of nucleotides have been used for mapping character sequences to corresponding numeric sequences. Radial Basis Function (RBF) SVM kernel is trained using EIIP numeric sequences. Furthermore this was tested on test gene dataset for detection of splice site by window (of 12 residues) shifting. Optimum values of window size, various important parameters of SVM kernel have been optimized for a better accuracy. Receiver Operating Characteristic (ROC) curves have been utilized for displaying the sensitivity rate of the classifier and results showed 94.82% accuracy for splice site detection on test dataset.

  5. Direct Patlak Reconstruction From Dynamic PET Data Using the Kernel Method With MRI Information Based on Structural Similarity.

    PubMed

    Gong, Kuang; Cheng-Liao, Jinxiu; Wang, Guobao; Chen, Kevin T; Catana, Ciprian; Qi, Jinyi

    2018-04-01

    Positron emission tomography (PET) is a functional imaging modality widely used in oncology, cardiology, and neuroscience. It is highly sensitive, but suffers from relatively poor spatial resolution, as compared with anatomical imaging modalities, such as magnetic resonance imaging (MRI). With the recent development of combined PET/MR systems, we can improve the PET image quality by incorporating MR information into image reconstruction. Previously, kernel learning has been successfully embedded into static and dynamic PET image reconstruction using either PET temporal or MRI information. Here, we combine both PET temporal and MRI information adaptively to improve the quality of direct Patlak reconstruction. We examined different approaches to combine the PET and MRI information in kernel learning to address the issue of potential mismatches between MRI and PET signals. Computer simulations and hybrid real-patient data acquired on a simultaneous PET/MR scanner were used to evaluate the proposed methods. Results show that the method that combines PET temporal information and MRI spatial information adaptively based on the structure similarity index has the best performance in terms of noise reduction and resolution improvement.

  6. A New Deep Learning Model for Fault Diagnosis with Good Anti-Noise and Domain Adaptation Ability on Raw Vibration Signals

    PubMed Central

    Zhang, Wei; Peng, Gaoliang; Li, Chuanhao; Chen, Yuanhang; Zhang, Zhujun

    2017-01-01

    Intelligent fault diagnosis techniques have replaced time-consuming and unreliable human analysis, increasing the efficiency of fault diagnosis. Deep learning models can improve the accuracy of intelligent fault diagnosis with the help of their multilayer nonlinear mapping ability. This paper proposes a novel method named Deep Convolutional Neural Networks with Wide First-layer Kernels (WDCNN). The proposed method uses raw vibration signals as input (data augmentation is used to generate more inputs), and uses the wide kernels in the first convolutional layer for extracting features and suppressing high frequency noise. Small convolutional kernels in the preceding layers are used for multilayer nonlinear mapping. AdaBN is implemented to improve the domain adaptation ability of the model. The proposed model addresses the problem that currently, the accuracy of CNN applied to fault diagnosis is not very high. WDCNN can not only achieve 100% classification accuracy on normal signals, but also outperform the state-of-the-art DNN model which is based on frequency features under different working load and noisy environment conditions. PMID:28241451

  7. Omnibus Risk Assessment via Accelerated Failure Time Kernel Machine Modeling

    PubMed Central

    Sinnott, Jennifer A.; Cai, Tianxi

    2013-01-01

    Summary Integrating genomic information with traditional clinical risk factors to improve the prediction of disease outcomes could profoundly change the practice of medicine. However, the large number of potential markers and possible complexity of the relationship between markers and disease make it difficult to construct accurate risk prediction models. Standard approaches for identifying important markers often rely on marginal associations or linearity assumptions and may not capture non-linear or interactive effects. In recent years, much work has been done to group genes into pathways and networks. Integrating such biological knowledge into statistical learning could potentially improve model interpretability and reliability. One effective approach is to employ a kernel machine (KM) framework, which can capture nonlinear effects if nonlinear kernels are used (Scholkopf and Smola, 2002; Liu et al., 2007, 2008). For survival outcomes, KM regression modeling and testing procedures have been derived under a proportional hazards (PH) assumption (Li and Luan, 2003; Cai et al., 2011). In this paper, we derive testing and prediction methods for KM regression under the accelerated failure time model, a useful alternative to the PH model. We approximate the null distribution of our test statistic using resampling procedures. When multiple kernels are of potential interest, it may be unclear in advance which kernel to use for testing and estimation. We propose a robust Omnibus Test that combines information across kernels, and an approach for selecting the best kernel for estimation. The methods are illustrated with an application in breast cancer. PMID:24328713

  8. Salient Feature Identification and Analysis using Kernel-Based Classification Techniques for Synthetic Aperture Radar Automatic Target Recognition

    DTIC Science & Technology

    2014-03-27

    and machine learning for a range of research including such topics as medical imaging [10] and handwriting recognition [11]. The type of feature...1989. [11] C. Bahlmann, B. Haasdonk, and H. Burkhardt, “Online handwriting recognition with support vector machines-a kernel approach,” in Eighth...International Workshop on Frontiers in Handwriting Recognition, pp. 49–54, IEEE, 2002. [12] C. Cortes and V. Vapnik, “Support-vector networks,” Machine

  9. Numerical performance and throughput benchmark for electronic structure calculations in PC-Linux systems with new architectures, updated compilers, and libraries.

    PubMed

    Yu, Jen-Shiang K; Hwang, Jenn-Kang; Tang, Chuan Yi; Yu, Chin-Hui

    2004-01-01

    A number of recently released numerical libraries including Automatically Tuned Linear Algebra Subroutines (ATLAS) library, Intel Math Kernel Library (MKL), GOTO numerical library, and AMD Core Math Library (ACML) for AMD Opteron processors, are linked against the executables of the Gaussian 98 electronic structure calculation package, which is compiled by updated versions of Fortran compilers such as Intel Fortran compiler (ifc/efc) 7.1 and PGI Fortran compiler (pgf77/pgf90) 5.0. The ifc 7.1 delivers about 3% of improvement on 32-bit machines compared to the former version 6.0. Performance improved from pgf77 3.3 to 5.0 is also around 3% when utilizing the original unmodified optimization options of the compiler enclosed in the software. Nevertheless, if extensive compiler tuning options are used, the speed can be further accelerated to about 25%. The performances of these fully optimized numerical libraries are similar. The double-precision floating-point (FP) instruction sets (SSE2) are also functional on AMD Opteron processors operated in 32-bit compilation, and Intel Fortran compiler has performed better optimization. Hardware-level tuning is able to improve memory bandwidth by adjusting the DRAM timing, and the efficiency in the CL2 mode is further accelerated by 2.6% compared to that of the CL2.5 mode. The FP throughput is measured by simultaneous execution of two identical copies of each of the test jobs. Resultant performance impact suggests that IA64 and AMD64 architectures are able to fulfill significantly higher throughput than the IA32, which is consistent with the SpecFPrate2000 benchmarks.

  10. Study on Temperature and Synthetic Compensation of Piezo-Resistive Differential Pressure Sensors by Coupled Simulated Annealing and Simplex Optimized Kernel Extreme Learning Machine

    PubMed Central

    Li, Ji; Hu, Guoqing; Zhou, Yonghong; Zou, Chong; Peng, Wei; Alam SM, Jahangir

    2017-01-01

    As a high performance-cost ratio solution for differential pressure measurement, piezo-resistive differential pressure sensors are widely used in engineering processes. However, their performance is severely affected by the environmental temperature and the static pressure applied to them. In order to modify the non-linear measuring characteristics of the piezo-resistive differential pressure sensor, compensation actions should synthetically consider these two aspects. Advantages such as nonlinear approximation capability, highly desirable generalization ability and computational efficiency make the kernel extreme learning machine (KELM) a practical approach for this critical task. Since the KELM model is intrinsically sensitive to the regularization parameter and the kernel parameter, a searching scheme combining the coupled simulated annealing (CSA) algorithm and the Nelder-Mead simplex algorithm is adopted to find an optimal KLEM parameter set. A calibration experiment at different working pressure levels was conducted within the temperature range to assess the proposed method. In comparison with other compensation models such as the back-propagation neural network (BP), radius basis neural network (RBF), particle swarm optimization optimized support vector machine (PSO-SVM), particle swarm optimization optimized least squares support vector machine (PSO-LSSVM) and extreme learning machine (ELM), the compensation results show that the presented compensation algorithm exhibits a more satisfactory performance with respect to temperature compensation and synthetic compensation problems. PMID:28422080

  11. Study on Temperature and Synthetic Compensation of Piezo-Resistive Differential Pressure Sensors by Coupled Simulated Annealing and Simplex Optimized Kernel Extreme Learning Machine.

    PubMed

    Li, Ji; Hu, Guoqing; Zhou, Yonghong; Zou, Chong; Peng, Wei; Alam Sm, Jahangir

    2017-04-19

    As a high performance-cost ratio solution for differential pressure measurement, piezo-resistive differential pressure sensors are widely used in engineering processes. However, their performance is severely affected by the environmental temperature and the static pressure applied to them. In order to modify the non-linear measuring characteristics of the piezo-resistive differential pressure sensor, compensation actions should synthetically consider these two aspects. Advantages such as nonlinear approximation capability, highly desirable generalization ability and computational efficiency make the kernel extreme learning machine (KELM) a practical approach for this critical task. Since the KELM model is intrinsically sensitive to the regularization parameter and the kernel parameter, a searching scheme combining the coupled simulated annealing (CSA) algorithm and the Nelder-Mead simplex algorithm is adopted to find an optimal KLEM parameter set. A calibration experiment at different working pressure levels was conducted within the temperature range to assess the proposed method. In comparison with other compensation models such as the back-propagation neural network (BP), radius basis neural network (RBF), particle swarm optimization optimized support vector machine (PSO-SVM), particle swarm optimization optimized least squares support vector machine (PSO-LSSVM) and extreme learning machine (ELM), the compensation results show that the presented compensation algorithm exhibits a more satisfactory performance with respect to temperature compensation and synthetic compensation problems.

  12. Prostate cancer detection using machine learning techniques by employing combination of features extracting strategies.

    PubMed

    Hussain, Lal; Ahmed, Adeel; Saeed, Sharjil; Rathore, Saima; Awan, Imtiaz Ahmed; Shah, Saeed Arif; Majid, Abdul; Idris, Adnan; Awan, Anees Ahmed

    2018-02-06

    Prostate is a second leading causes of cancer deaths among men. Early detection of cancer can effectively reduce the rate of mortality caused by Prostate cancer. Due to high and multiresolution of MRIs from prostate cancer require a proper diagnostic systems and tools. In the past researchers developed Computer aided diagnosis (CAD) systems that help the radiologist to detect the abnormalities. In this research paper, we have employed novel Machine learning techniques such as Bayesian approach, Support vector machine (SVM) kernels: polynomial, radial base function (RBF) and Gaussian and Decision Tree for detecting prostate cancer. Moreover, different features extracting strategies are proposed to improve the detection performance. The features extracting strategies are based on texture, morphological, scale invariant feature transform (SIFT), and elliptic Fourier descriptors (EFDs) features. The performance was evaluated based on single as well as combination of features using Machine Learning Classification techniques. The Cross validation (Jack-knife k-fold) was performed and performance was evaluated in term of receiver operating curve (ROC) and specificity, sensitivity, Positive predictive value (PPV), negative predictive value (NPV), false positive rate (FPR). Based on single features extracting strategies, SVM Gaussian Kernel gives the highest accuracy of 98.34% with AUC of 0.999. While, using combination of features extracting strategies, SVM Gaussian kernel with texture + morphological, and EFDs + morphological features give the highest accuracy of 99.71% and AUC of 1.00.

  13. Accurate interatomic force fields via machine learning with covariant kernels

    NASA Astrophysics Data System (ADS)

    Glielmo, Aldo; Sollich, Peter; De Vita, Alessandro

    2017-06-01

    We present a novel scheme to accurately predict atomic forces as vector quantities, rather than sets of scalar components, by Gaussian process (GP) regression. This is based on matrix-valued kernel functions, on which we impose the requirements that the predicted force rotates with the target configuration and is independent of any rotations applied to the configuration database entries. We show that such covariant GP kernels can be obtained by integration over the elements of the rotation group SO (d ) for the relevant dimensionality d . Remarkably, in specific cases the integration can be carried out analytically and yields a conservative force field that can be recast into a pair interaction form. Finally, we show that restricting the integration to a summation over the elements of a finite point group relevant to the target system is sufficient to recover an accurate GP. The accuracy of our kernels in predicting quantum-mechanical forces in real materials is investigated by tests on pure and defective Ni, Fe, and Si crystalline systems.

  14. Learning Circulant Sensing Kernels

    DTIC Science & Technology

    2014-03-01

    Furthermore, we test learning the circulant sensing matrix/operator and the nonparametric dictionary altogether and obtain even better performance. We...scale. Furthermore, we test learning the circulant sensing matrix/operator and the nonparametric dictionary altogether and obtain even better performance...matrices, Tropp et al.[28] de - scribes a random filter for acquiring a signal x̄; Haupt et al.[12] describes a channel estimation problem to identify a

  15. Spectral methods in machine learning and new strategies for very large datasets

    PubMed Central

    Belabbas, Mohamed-Ali; Wolfe, Patrick J.

    2009-01-01

    Spectral methods are of fundamental importance in statistics and machine learning, because they underlie algorithms from classical principal components analysis to more recent approaches that exploit manifold structure. In most cases, the core technical problem can be reduced to computing a low-rank approximation to a positive-definite kernel. For the growing number of applications dealing with very large or high-dimensional datasets, however, the optimal approximation afforded by an exact spectral decomposition is too costly, because its complexity scales as the cube of either the number of training examples or their dimensionality. Motivated by such applications, we present here 2 new algorithms for the approximation of positive-semidefinite kernels, together with error bounds that improve on results in the literature. We approach this problem by seeking to determine, in an efficient manner, the most informative subset of our data relative to the kernel approximation task at hand. This leads to two new strategies based on the Nyström method that are directly applicable to massive datasets. The first of these—based on sampling—leads to a randomized algorithm whereupon the kernel induces a probability distribution on its set of partitions, whereas the latter approach—based on sorting—provides for the selection of a partition in a deterministic way. We detail their numerical implementation and provide simulation results for a variety of representative problems in statistical data analysis, each of which demonstrates the improved performance of our approach relative to existing methods. PMID:19129490

  16. Kernel-based whole-genome prediction of complex traits: a review.

    PubMed

    Morota, Gota; Gianola, Daniel

    2014-01-01

    Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.

  17. SVM and SVM Ensembles in Breast Cancer Prediction.

    PubMed

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.

  18. SVM and SVM Ensembles in Breast Cancer Prediction

    PubMed Central

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers. PMID:28060807

  19. On the role of cost-sensitive learning in multi-class brain-computer interfaces.

    PubMed

    Devlaminck, Dieter; Waegeman, Willem; Wyns, Bart; Otte, Georges; Santens, Patrick

    2010-06-01

    Brain-computer interfaces (BCIs) present an alternative way of communication for people with severe disabilities. One of the shortcomings in current BCI systems, recently put forward in the fourth BCI competition, is the asynchronous detection of motor imagery versus resting state. We investigated this extension to the three-class case, in which the resting state is considered virtually lying between two motor classes, resulting in a large penalty when one motor task is misclassified into the other motor class. We particularly focus on the behavior of different machine-learning techniques and on the role of multi-class cost-sensitive learning in such a context. To this end, four different kernel methods are empirically compared, namely pairwise multi-class support vector machines (SVMs), two cost-sensitive multi-class SVMs and kernel-based ordinal regression. The experimental results illustrate that ordinal regression performs better than the other three approaches when a cost-sensitive performance measure such as the mean-squared error is considered. By contrast, multi-class cost-sensitive learning enables us to control the number of large errors made between two motor tasks.

  20. Learning Midlevel Auditory Codes from Natural Sound Statistics.

    PubMed

    Młynarski, Wiktor; McDermott, Josh H

    2018-03-01

    Interaction with the world requires an organism to transform sensory signals into representations in which behaviorally meaningful properties of the environment are made explicit. These representations are derived through cascades of neuronal processing stages in which neurons at each stage recode the output of preceding stages. Explanations of sensory coding may thus involve understanding how low-level patterns are combined into more complex structures. To gain insight into such midlevel representations for sound, we designed a hierarchical generative model of natural sounds that learns combinations of spectrotemporal features from natural stimulus statistics. In the first layer, the model forms a sparse convolutional code of spectrograms using a dictionary of learned spectrotemporal kernels. To generalize from specific kernel activation patterns, the second layer encodes patterns of time-varying magnitude of multiple first-layer coefficients. When trained on corpora of speech and environmental sounds, some second-layer units learned to group similar spectrotemporal features. Others instantiate opponency between distinct sets of features. Such groupings might be instantiated by neurons in the auditory cortex, providing a hypothesis for midlevel neuronal computation.

  1. Mexican Hat Wavelet Kernel ELM for Multiclass Classification.

    PubMed

    Wang, Jie; Song, Yi-Fan; Ma, Tian-Lei

    2017-01-01

    Kernel extreme learning machine (KELM) is a novel feedforward neural network, which is widely used in classification problems. To some extent, it solves the existing problems of the invalid nodes and the large computational complexity in ELM. However, the traditional KELM classifier usually has a low test accuracy when it faces multiclass classification problems. In order to solve the above problem, a new classifier, Mexican Hat wavelet KELM classifier, is proposed in this paper. The proposed classifier successfully improves the training accuracy and reduces the training time in the multiclass classification problems. Moreover, the validity of the Mexican Hat wavelet as a kernel function of ELM is rigorously proved. Experimental results on different data sets show that the performance of the proposed classifier is significantly superior to the compared classifiers.

  2. Discriminative graph embedding for label propagation.

    PubMed

    Nguyen, Canh Hao; Mamitsuka, Hiroshi

    2011-09-01

    In many applications, the available information is encoded in graph structures. This is a common problem in biological networks, social networks, web communities and document citations. We investigate the problem of classifying nodes' labels on a similarity graph given only a graph structure on the nodes. Conventional machine learning methods usually require data to reside in some Euclidean spaces or to have a kernel representation. Applying these methods to nodes on graphs would require embedding the graphs into these spaces. By embedding and then learning the nodes on graphs, most methods are either flexible with different learning objectives or efficient enough for large scale applications. We propose a method to embed a graph into a feature space for a discriminative purpose. Our idea is to include label information into the embedding process, making the space representation tailored to the task. We design embedding objective functions that the following learning formulations become spectral transforms. We then reformulate these spectral transforms into multiple kernel learning problems. Our method, while being tailored to the discriminative tasks, is efficient and can scale to massive data sets. We show the need of discriminative embedding on some simulations. Applying to biological network problems, our method is shown to outperform baselines.

  3. Encoding Dissimilarity Data for Statistical Model Building.

    PubMed

    Wahba, Grace

    2010-12-01

    We summarize, review and comment upon three papers which discuss the use of discrete, noisy, incomplete, scattered pairwise dissimilarity data in statistical model building. Convex cone optimization codes are used to embed the objects into a Euclidean space which respects the dissimilarity information while controlling the dimension of the space. A "newbie" algorithm is provided for embedding new objects into this space. This allows the dissimilarity information to be incorporated into a Smoothing Spline ANOVA penalized likelihood model, a Support Vector Machine, or any model that will admit Reproducing Kernel Hilbert Space components, for nonparametric regression, supervised learning, or semi-supervised learning. Future work and open questions are discussed. The papers are: F. Lu, S. Keles, S. Wright and G. Wahba 2005. A framework for kernel regularization with application to protein clustering. Proceedings of the National Academy of Sciences 102, 12332-1233.G. Corrada Bravo, G. Wahba, K. Lee, B. Klein, R. Klein and S. Iyengar 2009. Examining the relative influence of familial, genetic and environmental covariate information in flexible risk models. Proceedings of the National Academy of Sciences 106, 8128-8133F. Lu, Y. Lin and G. Wahba. Robust manifold unfolding with kernel regularization. TR 1008, Department of Statistics, University of Wisconsin-Madison.

  4. Explaining Support Vector Machines: A Color Based Nomogram

    PubMed Central

    Van Belle, Vanya; Van Calster, Ben; Van Huffel, Sabine; Suykens, Johan A. K.; Lisboa, Paulo

    2016-01-01

    Problem setting Support vector machines (SVMs) are very popular tools for classification, regression and other problems. Due to the large choice of kernels they can be applied with, a large variety of data can be analysed using these tools. Machine learning thanks its popularity to the good performance of the resulting models. However, interpreting the models is far from obvious, especially when non-linear kernels are used. Hence, the methods are used as black boxes. As a consequence, the use of SVMs is less supported in areas where interpretability is important and where people are held responsible for the decisions made by models. Objective In this work, we investigate whether SVMs using linear, polynomial and RBF kernels can be explained such that interpretations for model-based decisions can be provided. We further indicate when SVMs can be explained and in which situations interpretation of SVMs is (hitherto) not possible. Here, explainability is defined as the ability to produce the final decision based on a sum of contributions which depend on one single or at most two input variables. Results Our experiments on simulated and real-life data show that explainability of an SVM depends on the chosen parameter values (degree of polynomial kernel, width of RBF kernel and regularization constant). When several combinations of parameter values yield the same cross-validation performance, combinations with a lower polynomial degree or a larger kernel width have a higher chance of being explainable. Conclusions This work summarizes SVM classifiers obtained with linear, polynomial and RBF kernels in a single plot. Linear and polynomial kernels up to the second degree are represented exactly. For other kernels an indication of the reliability of the approximation is presented. The complete methodology is available as an R package and two apps and a movie are provided to illustrate the possibilities offered by the method. PMID:27723811

  5. Data-Driven Hierarchical Structure Kernel for Multiscale Part-Based Object Recognition

    PubMed Central

    Wang, Botao; Xiong, Hongkai; Jiang, Xiaoqian; Zheng, Yuan F.

    2017-01-01

    Detecting generic object categories in images and videos are a fundamental issue in computer vision. However, it faces the challenges from inter and intraclass diversity, as well as distortions caused by viewpoints, poses, deformations, and so on. To solve object variations, this paper constructs a structure kernel and proposes a multiscale part-based model incorporating the discriminative power of kernels. The structure kernel would measure the resemblance of part-based objects in three aspects: 1) the global similarity term to measure the resemblance of the global visual appearance of relevant objects; 2) the part similarity term to measure the resemblance of the visual appearance of distinctive parts; and 3) the spatial similarity term to measure the resemblance of the spatial layout of parts. In essence, the deformation of parts in the structure kernel is penalized in a multiscale space with respect to horizontal displacement, vertical displacement, and scale difference. Part similarities are combined with different weights, which are optimized efficiently to maximize the intraclass similarities and minimize the interclass similarities by the normalized stochastic gradient ascent algorithm. In addition, the parameters of the structure kernel are learned during the training process with regard to the distribution of the data in a more discriminative way. With flexible part sizes on scale and displacement, it can be more robust to the intraclass variations, poses, and viewpoints. Theoretical analysis and experimental evaluations demonstrate that the proposed multiscale part-based representation model with structure kernel exhibits accurate and robust performance, and outperforms state-of-the-art object classification approaches. PMID:24808345

  6. Omnibus risk assessment via accelerated failure time kernel machine modeling.

    PubMed

    Sinnott, Jennifer A; Cai, Tianxi

    2013-12-01

    Integrating genomic information with traditional clinical risk factors to improve the prediction of disease outcomes could profoundly change the practice of medicine. However, the large number of potential markers and possible complexity of the relationship between markers and disease make it difficult to construct accurate risk prediction models. Standard approaches for identifying important markers often rely on marginal associations or linearity assumptions and may not capture non-linear or interactive effects. In recent years, much work has been done to group genes into pathways and networks. Integrating such biological knowledge into statistical learning could potentially improve model interpretability and reliability. One effective approach is to employ a kernel machine (KM) framework, which can capture nonlinear effects if nonlinear kernels are used (Scholkopf and Smola, 2002; Liu et al., 2007, 2008). For survival outcomes, KM regression modeling and testing procedures have been derived under a proportional hazards (PH) assumption (Li and Luan, 2003; Cai, Tonini, and Lin, 2011). In this article, we derive testing and prediction methods for KM regression under the accelerated failure time (AFT) model, a useful alternative to the PH model. We approximate the null distribution of our test statistic using resampling procedures. When multiple kernels are of potential interest, it may be unclear in advance which kernel to use for testing and estimation. We propose a robust Omnibus Test that combines information across kernels, and an approach for selecting the best kernel for estimation. The methods are illustrated with an application in breast cancer. © 2013, The International Biometric Society.

  7. Deep learning architecture for iris recognition based on optimal Gabor filters and deep belief network

    NASA Astrophysics Data System (ADS)

    He, Fei; Han, Ye; Wang, Han; Ji, Jinchao; Liu, Yuanning; Ma, Zhiqiang

    2017-03-01

    Gabor filters are widely utilized to detect iris texture information in several state-of-the-art iris recognition systems. However, the proper Gabor kernels and the generative pattern of iris Gabor features need to be predetermined in application. The traditional empirical Gabor filters and shallow iris encoding ways are incapable of dealing with such complex variations in iris imaging including illumination, aging, deformation, and device variations. Thereby, an adaptive Gabor filter selection strategy and deep learning architecture are presented. We first employ particle swarm optimization approach and its binary version to define a set of data-driven Gabor kernels for fitting the most informative filtering bands, and then capture complex pattern from the optimal Gabor filtered coefficients by a trained deep belief network. A succession of comparative experiments validate that our optimal Gabor filters may produce more distinctive Gabor coefficients and our iris deep representations be more robust and stable than traditional iris Gabor codes. Furthermore, the depth and scales of the deep learning architecture are also discussed.

  8. Multi-Target Regression via Robust Low-Rank Learning.

    PubMed

    Zhen, Xiantong; Yu, Mengyang; He, Xiaofei; Li, Shuo

    2018-02-01

    Multi-target regression has recently regained great popularity due to its capability of simultaneously learning multiple relevant regression tasks and its wide applications in data mining, computer vision and medical image analysis, while great challenges arise from jointly handling inter-target correlations and input-output relationships. In this paper, we propose Multi-layer Multi-target Regression (MMR) which enables simultaneously modeling intrinsic inter-target correlations and nonlinear input-output relationships in a general framework via robust low-rank learning. Specifically, the MMR can explicitly encode inter-target correlations in a structure matrix by matrix elastic nets (MEN); the MMR can work in conjunction with the kernel trick to effectively disentangle highly complex nonlinear input-output relationships; the MMR can be efficiently solved by a new alternating optimization algorithm with guaranteed convergence. The MMR leverages the strength of kernel methods for nonlinear feature learning and the structural advantage of multi-layer learning architectures for inter-target correlation modeling. More importantly, it offers a new multi-layer learning paradigm for multi-target regression which is endowed with high generality, flexibility and expressive ability. Extensive experimental evaluation on 18 diverse real-world datasets demonstrates that our MMR can achieve consistently high performance and outperforms representative state-of-the-art algorithms, which shows its great effectiveness and generality for multivariate prediction.

  9. G-Hash: Towards Fast Kernel-based Similarity Search in Large Graph Databases.

    PubMed

    Wang, Xiaohong; Smalter, Aaron; Huan, Jun; Lushington, Gerald H

    2009-01-01

    Structured data including sets, sequences, trees and graphs, pose significant challenges to fundamental aspects of data management such as efficient storage, indexing, and similarity search. With the fast accumulation of graph databases, similarity search in graph databases has emerged as an important research topic. Graph similarity search has applications in a wide range of domains including cheminformatics, bioinformatics, sensor network management, social network management, and XML documents, among others.Most of the current graph indexing methods focus on subgraph query processing, i.e. determining the set of database graphs that contains the query graph and hence do not directly support similarity search. In data mining and machine learning, various graph kernel functions have been designed to capture the intrinsic similarity of graphs. Though successful in constructing accurate predictive and classification models for supervised learning, graph kernel functions have (i) high computational complexity and (ii) non-trivial difficulty to be indexed in a graph database.Our objective is to bridge graph kernel function and similarity search in graph databases by proposing (i) a novel kernel-based similarity measurement and (ii) an efficient indexing structure for graph data management. Our method of similarity measurement builds upon local features extracted from each node and their neighboring nodes in graphs. A hash table is utilized to support efficient storage and fast search of the extracted local features. Using the hash table, a graph kernel function is defined to capture the intrinsic similarity of graphs and for fast similarity query processing. We have implemented our method, which we have named G-hash, and have demonstrated its utility on large chemical graph databases. Our results show that the G-hash method achieves state-of-the-art performance for k-nearest neighbor (k-NN) classification. Most importantly, the new similarity measurement and the index structure is scalable to large database with smaller indexing size, faster indexing construction time, and faster query processing time as compared to state-of-the-art indexing methods such as C-tree, gIndex, and GraphGrep.

  10. Multi-task feature learning by using trace norm regularization

    NASA Astrophysics Data System (ADS)

    Jiangmei, Zhang; Binfeng, Yu; Haibo, Ji; Wang, Kunpeng

    2017-11-01

    Multi-task learning can extract the correlation of multiple related machine learning problems to improve performance. This paper considers applying the multi-task learning method to learn a single task. We propose a new learning approach, which employs the mixture of expert model to divide a learning task into several related sub-tasks, and then uses the trace norm regularization to extract common feature representation of these sub-tasks. A nonlinear extension of this approach by using kernel is also provided. Experiments conducted on both simulated and real data sets demonstrate the advantage of the proposed approach.

  11. Kernel machines for epilepsy diagnosis via EEG signal classification: a comparative study.

    PubMed

    Lima, Clodoaldo A M; Coelho, André L V

    2011-10-01

    We carry out a systematic assessment on a suite of kernel-based learning machines while coping with the task of epilepsy diagnosis through automatic electroencephalogram (EEG) signal classification. The kernel machines investigated include the standard support vector machine (SVM), the least squares SVM, the Lagrangian SVM, the smooth SVM, the proximal SVM, and the relevance vector machine. An extensive series of experiments was conducted on publicly available data, whose clinical EEG recordings were obtained from five normal subjects and five epileptic patients. The performance levels delivered by the different kernel machines are contrasted in terms of the criteria of predictive accuracy, sensitivity to the kernel function/parameter value, and sensitivity to the type of features extracted from the signal. For this purpose, 26 values for the kernel parameter (radius) of two well-known kernel functions (namely, Gaussian and exponential radial basis functions) were considered as well as 21 types of features extracted from the EEG signal, including statistical values derived from the discrete wavelet transform, Lyapunov exponents, and combinations thereof. We first quantitatively assess the impact of the choice of the wavelet basis on the quality of the features extracted. Four wavelet basis functions were considered in this study. Then, we provide the average accuracy (i.e., cross-validation error) values delivered by 252 kernel machine configurations; in particular, 40%/35% of the best-calibrated models of the standard and least squares SVMs reached 100% accuracy rate for the two kernel functions considered. Moreover, we show the sensitivity profiles exhibited by a large sample of the configurations whereby one can visually inspect their levels of sensitiveness to the type of feature and to the kernel function/parameter value. Overall, the results evidence that all kernel machines are competitive in terms of accuracy, with the standard and least squares SVMs prevailing more consistently. Moreover, the choice of the kernel function and parameter value as well as the choice of the feature extractor are critical decisions to be taken, albeit the choice of the wavelet family seems not to be so relevant. Also, the statistical values calculated over the Lyapunov exponents were good sources of signal representation, but not as informative as their wavelet counterparts. Finally, a typical sensitivity profile has emerged among all types of machines, involving some regions of stability separated by zones of sharp variation, with some kernel parameter values clearly associated with better accuracy rates (zones of optimality). Copyright © 2011 Elsevier B.V. All rights reserved.

  12. An implementation of support vector machine on sentiment classification of movie reviews

    NASA Astrophysics Data System (ADS)

    Yulietha, I. M.; Faraby, S. A.; Adiwijaya; Widyaningtyas, W. C.

    2018-03-01

    With technological advances, all information about movie is available on the internet. If the information is processed properly, it will get the quality of the information. This research proposes to the classify sentiments on movie review documents. This research uses Support Vector Machine (SVM) method because it can classify high dimensional data in accordance with the data used in this research in the form of text. Support Vector Machine is a popular machine learning technique for text classification because it can classify by learning from a collection of documents that have been classified previously and can provide good result. Based on number of datasets, the 90-10 composition has the best result that is 85.6%. Based on SVM kernel, kernel linear with constant 1 has the best result that is 84.9%

  13. A low cost implementation of multi-parameter patient monitor using intersection kernel support vector machine classifier

    NASA Astrophysics Data System (ADS)

    Mohan, Dhanya; Kumar, C. Santhosh

    2016-03-01

    Predicting the physiological condition (normal/abnormal) of a patient is highly desirable to enhance the quality of health care. Multi-parameter patient monitors (MPMs) using heart rate, arterial blood pressure, respiration rate and oxygen saturation (S pO2) as input parameters were developed to monitor the condition of patients, with minimum human resource utilization. The Support vector machine (SVM), an advanced machine learning approach popularly used for classification and regression is used for the realization of MPMs. For making MPMs cost effective, we experiment on the hardware implementation of the MPM using support vector machine classifier. The training of the system is done using the matlab environment and the detection of the alarm/noalarm condition is implemented in hardware. We used different kernels for SVM classification and note that the best performance was obtained using intersection kernel SVM (IKSVM). The intersection kernel support vector machine classifier MPM has outperformed the best known MPM using radial basis function kernel by an absoute improvement of 2.74% in accuracy, 1.86% in sensitivity and 3.01% in specificity. The hardware model was developed based on the improved performance system using Verilog Hardware Description Language and was implemented on Altera cyclone-II development board.

  14. Improved object optimal synthetic description, modeling, learning, and discrimination by GEOGINE computational kernel

    NASA Astrophysics Data System (ADS)

    Fiorini, Rodolfo A.; Dacquino, Gianfranco

    2005-03-01

    GEOGINE (GEOmetrical enGINE), a state-of-the-art OMG (Ontological Model Generator) based on n-D Tensor Invariants for n-Dimensional shape/texture optimal synthetic representation, description and learning, was presented in previous conferences elsewhere recently. Improved computational algorithms based on the computational invariant theory of finite groups in Euclidean space and a demo application is presented. Progressive model automatic generation is discussed. GEOGINE can be used as an efficient computational kernel for fast reliable application development and delivery in advanced biomedical engineering, biometric, intelligent computing, target recognition, content image retrieval, data mining technological areas mainly. Ontology can be regarded as a logical theory accounting for the intended meaning of a formal dictionary, i.e., its ontological commitment to a particular conceptualization of the world object. According to this approach, "n-D Tensor Calculus" can be considered a "Formal Language" to reliably compute optimized "n-Dimensional Tensor Invariants" as specific object "invariant parameter and attribute words" for automated n-Dimensional shape/texture optimal synthetic object description by incremental model generation. The class of those "invariant parameter and attribute words" can be thought as a specific "Formal Vocabulary" learned from a "Generalized Formal Dictionary" of the "Computational Tensor Invariants" language. Even object chromatic attributes can be effectively and reliably computed from object geometric parameters into robust colour shape invariant characteristics. As a matter of fact, any highly sophisticated application needing effective, robust object geometric/colour invariant attribute capture and parameterization features, for reliable automated object learning and discrimination can deeply benefit from GEOGINE progressive automated model generation computational kernel performance. Main operational advantages over previous, similar approaches are: 1) Progressive Automated Invariant Model Generation, 2) Invariant Minimal Complete Description Set for computational efficiency, 3) Arbitrary Model Precision for robust object description and identification.

  15. Predicting radiotherapy outcomes using statistical learning techniques

    NASA Astrophysics Data System (ADS)

    El Naqa, Issam; Bradley, Jeffrey D.; Lindsay, Patricia E.; Hope, Andrew J.; Deasy, Joseph O.

    2009-09-01

    Radiotherapy outcomes are determined by complex interactions between treatment, anatomical and patient-related variables. A common obstacle to building maximally predictive outcome models for clinical practice is the failure to capture potential complexity of heterogeneous variable interactions and applicability beyond institutional data. We describe a statistical learning methodology that can automatically screen for nonlinear relations among prognostic variables and generalize to unseen data before. In this work, several types of linear and nonlinear kernels to generate interaction terms and approximate the treatment-response function are evaluated. Examples of institutional datasets of esophagitis, pneumonitis and xerostomia endpoints were used. Furthermore, an independent RTOG dataset was used for 'generalizabilty' validation. We formulated the discrimination between risk groups as a supervised learning problem. The distribution of patient groups was initially analyzed using principle components analysis (PCA) to uncover potential nonlinear behavior. The performance of the different methods was evaluated using bivariate correlations and actuarial analysis. Over-fitting was controlled via cross-validation resampling. Our results suggest that a modified support vector machine (SVM) kernel method provided superior performance on leave-one-out testing compared to logistic regression and neural networks in cases where the data exhibited nonlinear behavior on PCA. For instance, in prediction of esophagitis and pneumonitis endpoints, which exhibited nonlinear behavior on PCA, the method provided 21% and 60% improvements, respectively. Furthermore, evaluation on the independent pneumonitis RTOG dataset demonstrated good generalizabilty beyond institutional data in contrast with other models. This indicates that the prediction of treatment response can be improved by utilizing nonlinear kernel methods for discovering important nonlinear interactions among model variables. These models have the capacity to predict on unseen data. Part of this work was first presented at the Seventh International Conference on Machine Learning and Applications, San Diego, CA, USA, 11-13 December 2008.

  16. Scuba: scalable kernel-based gene prioritization.

    PubMed

    Zampieri, Guido; Tran, Dinh Van; Donini, Michele; Navarin, Nicolò; Aiolli, Fabio; Sperduti, Alessandro; Valle, Giorgio

    2018-01-25

    The uncovering of genes linked to human diseases is a pressing challenge in molecular biology and precision medicine. This task is often hindered by the large number of candidate genes and by the heterogeneity of the available information. Computational methods for the prioritization of candidate genes can help to cope with these problems. In particular, kernel-based methods are a powerful resource for the integration of heterogeneous biological knowledge, however, their practical implementation is often precluded by their limited scalability. We propose Scuba, a scalable kernel-based method for gene prioritization. It implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimization of the margin distribution. Scuba is optimized to cope with strongly unbalanced settings where known disease genes are few and large scale predictions are required. Importantly, it is able to efficiently deal both with a large amount of candidate genes and with an arbitrary number of data sources. As a direct consequence of scalability, Scuba integrates also a new efficient strategy to select optimal kernel parameters for each data source. We performed cross-validation experiments and simulated a realistic usage setting, showing that Scuba outperforms a wide range of state-of-the-art methods. Scuba achieves state-of-the-art performance and has enhanced scalability compared to existing kernel-based approaches for genomic data. This method can be useful to prioritize candidate genes, particularly when their number is large or when input data is highly heterogeneous. The code is freely available at https://github.com/gzampieri/Scuba .

  17. Molecule kernels: a descriptor- and alignment-free quantitative structure-activity relationship approach.

    PubMed

    Mohr, Johannes A; Jain, Brijnesh J; Obermayer, Klaus

    2008-09-01

    Quantitative structure activity relationship (QSAR) analysis is traditionally based on extracting a set of molecular descriptors and using them to build a predictive model. In this work, we propose a QSAR approach based directly on the similarity between the 3D structures of a set of molecules measured by a so-called molecule kernel, which is independent of the spatial prealignment of the compounds. Predictors can be build using the molecule kernel in conjunction with the potential support vector machine (P-SVM), a recently proposed machine learning method for dyadic data. The resulting models make direct use of the structural similarities between the compounds in the test set and a subset of the training set and do not require an explicit descriptor construction. We evaluated the predictive performance of the proposed method on one classification and four regression QSAR datasets and compared its results to the results reported in the literature for several state-of-the-art descriptor-based and 3D QSAR approaches. In this comparison, the proposed molecule kernel method performed better than the other QSAR methods.

  18. Simultaneous multiple non-crossing quantile regression estimation using kernel constraints

    PubMed Central

    Liu, Yufeng; Wu, Yichao

    2011-01-01

    Quantile regression (QR) is a very useful statistical tool for learning the relationship between the response variable and covariates. For many applications, one often needs to estimate multiple conditional quantile functions of the response variable given covariates. Although one can estimate multiple quantiles separately, it is of great interest to estimate them simultaneously. One advantage of simultaneous estimation is that multiple quantiles can share strength among them to gain better estimation accuracy than individually estimated quantile functions. Another important advantage of joint estimation is the feasibility of incorporating simultaneous non-crossing constraints of QR functions. In this paper, we propose a new kernel-based multiple QR estimation technique, namely simultaneous non-crossing quantile regression (SNQR). We use kernel representations for QR functions and apply constraints on the kernel coefficients to avoid crossing. Both unregularised and regularised SNQR techniques are considered. Asymptotic properties such as asymptotic normality of linear SNQR and oracle properties of the sparse linear SNQR are developed. Our numerical results demonstrate the competitive performance of our SNQR over the original individual QR estimation. PMID:22190842

  19. Nonlinearity-aware based dimensionality reduction and over-sampling for AD/MCI classification from MRI measures.

    PubMed

    Cao, Peng; Liu, Xiaoli; Yang, Jinzhu; Zhao, Dazhe; Huang, Min; Zhang, Jian; Zaiane, Osmar

    2017-12-01

    Alzheimer's disease (AD) has been not only a substantial financial burden to the health care system but also an emotional burden to patients and their families. Making accurate diagnosis of AD based on brain magnetic resonance imaging (MRI) is becoming more and more critical and emphasized at the earliest stages. However, the high dimensionality and imbalanced data issues are two major challenges in the study of computer aided AD diagnosis. The greatest limitations of existing dimensionality reduction and over-sampling methods are that they assume a linear relationship between the MRI features (predictor) and the disease status (response). To better capture the complicated but more flexible relationship, we propose a multi-kernel based dimensionality reduction and over-sampling approaches. We combined Marginal Fisher Analysis with ℓ 2,1 -norm based multi-kernel learning (MKMFA) to achieve the sparsity of region-of-interest (ROI), which leads to simultaneously selecting a subset of the relevant brain regions and learning a dimensionality transformation. Meanwhile, a multi-kernel over-sampling (MKOS) was developed to generate synthetic instances in the optimal kernel space induced by MKMFA, so as to compensate for the class imbalanced distribution. We comprehensively evaluate the proposed models for the diagnostic classification (binary class and multi-class classification) including all subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. The experimental results not only demonstrate the proposed method has superior performance over multiple comparable methods, but also identifies relevant imaging biomarkers that are consistent with prior medical knowledge. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Detection of subjects and brain regions related to Alzheimer's disease using 3D MRI scans based on eigenbrain and machine learning

    PubMed Central

    Zhang, Yudong; Dong, Zhengchao; Phillips, Preetha; Wang, Shuihua; Ji, Genlin; Yang, Jiquan; Yuan, Ti-Fei

    2015-01-01

    Purpose: Early diagnosis or detection of Alzheimer's disease (AD) from the normal elder control (NC) is very important. However, the computer-aided diagnosis (CAD) was not widely used, and the classification performance did not reach the standard of practical use. We proposed a novel CAD system for MR brain images based on eigenbrains and machine learning with two goals: accurate detection of both AD subjects and AD-related brain regions. Method: First, we used maximum inter-class variance (ICV) to select key slices from 3D volumetric data. Second, we generated an eigenbrain set for each subject. Third, the most important eigenbrain (MIE) was obtained by Welch's t-test (WTT). Finally, kernel support-vector-machines with different kernels that were trained by particle swarm optimization, were used to make an accurate prediction of AD subjects. Coefficients of MIE with values higher than 0.98 quantile were highlighted to obtain the discriminant regions that distinguish AD from NC. Results: The experiments showed that the proposed method can predict AD subjects with a competitive performance with existing methods, especially the accuracy of the polynomial kernel (92.36 ± 0.94) was better than the linear kernel of 91.47 ± 1.02 and the radial basis function (RBF) kernel of 86.71 ± 1.93. The proposed eigenbrain-based CAD system detected 30 AD-related brain regions (Anterior Cingulate, Caudate Nucleus, Cerebellum, Cingulate Gyrus, Claustrum, Inferior Frontal Gyrus, Inferior Parietal Lobule, Insula, Lateral Ventricle, Lentiform Nucleus, Lingual Gyrus, Medial Frontal Gyrus, Middle Frontal Gyrus, Middle Occipital Gyrus, Middle Temporal Gyrus, Paracentral Lobule, Parahippocampal Gyrus, Postcentral Gyrus, Posterial Cingulate, Precentral Gyrus, Precuneus, Subcallosal Gyrus, Sub-Gyral, Superior Frontal Gyrus, Superior Parietal Lobule, Superior Temporal Gyrus, Supramarginal Gyrus, Thalamus, Transverse Temporal Gyrus, and Uncus). The results were coherent with existing literatures. Conclusion: The eigenbrain method was effective in AD subject prediction and discriminant brain-region detection in MRI scanning. PMID:26082713

  1. Interaction with Machine Improvisation

    NASA Astrophysics Data System (ADS)

    Assayag, Gerard; Bloch, George; Cont, Arshia; Dubnov, Shlomo

    We describe two multi-agent architectures for an improvisation oriented musician-machine interaction systems that learn in real time from human performers. The improvisation kernel is based on sequence modeling and statistical learning. We present two frameworks of interaction with this kernel. In the first, the stylistic interaction is guided by a human operator in front of an interactive computer environment. In the second framework, the stylistic interaction is delegated to machine intelligence and therefore, knowledge propagation and decision are taken care of by the computer alone. The first framework involves a hybrid architecture using two popular composition/performance environments, Max and OpenMusic, that are put to work and communicate together, each one handling the process at a different time/memory scale. The second framework shares the same representational schemes with the first but uses an Active Learning architecture based on collaborative, competitive and memory-based learning to handle stylistic interactions. Both systems are capable of processing real-time audio/video as well as MIDI. After discussing the general cognitive background of improvisation practices, the statistical modelling tools and the concurrent agent architecture are presented. Then, an Active Learning scheme is described and considered in terms of using different improvisation regimes for improvisation planning. Finally, we provide more details about the different system implementations and describe several performances with the system.

  2. Integrating concept ontology and multitask learning to achieve more effective classifier training for multilevel image annotation.

    PubMed

    Fan, Jianping; Gao, Yuli; Luo, Hangzai

    2008-03-01

    In this paper, we have developed a new scheme for achieving multilevel annotations of large-scale images automatically. To achieve more sufficient representation of various visual properties of the images, both the global visual features and the local visual features are extracted for image content representation. To tackle the problem of huge intraconcept visual diversity, multiple types of kernels are integrated to characterize the diverse visual similarity relationships between the images more precisely, and a multiple kernel learning algorithm is developed for SVM image classifier training. To address the problem of huge interconcept visual similarity, a novel multitask learning algorithm is developed to learn the correlated classifiers for the sibling image concepts under the same parent concept and enhance their discrimination and adaptation power significantly. To tackle the problem of huge intraconcept visual diversity for the image concepts at the higher levels of the concept ontology, a novel hierarchical boosting algorithm is developed to learn their ensemble classifiers hierarchically. In order to assist users on selecting more effective hypotheses for image classifier training, we have developed a novel hyperbolic framework for large-scale image visualization and interactive hypotheses assessment. Our experiments on large-scale image collections have also obtained very positive results.

  3. Classification and recognition of dynamical models: the role of phase, independent components, kernels and optimal transport.

    PubMed

    Bissacco, Alessandro; Chiuso, Alessandro; Soatto, Stefano

    2007-11-01

    We address the problem of performing decision tasks, and in particular classification and recognition, in the space of dynamical models in order to compare time series of data. Motivated by the application of recognition of human motion in image sequences, we consider a class of models that include linear dynamics, both stable and marginally stable (periodic), both minimum and non-minimum phase, driven by non-Gaussian processes. This requires extending existing learning and system identification algorithms to handle periodic modes and nonminimum phase behavior, while taking into account higher-order statistics of the data. Once a model is identified, we define a kernel-based cord distance between models that includes their dynamics, their initial conditions as well as input distribution. This is made possible by a novel kernel defined between two arbitrary (non-Gaussian) distributions, which is computed by efficiently solving an optimal transport problem. We validate our choice of models, inference algorithm, and distance on the tasks of human motion synthesis (sample paths of the learned models), and recognition (nearest-neighbor classification in the computed distance). However, our work can be applied more broadly where one needs to compare historical data while taking into account periodic trends, non-minimum phase behavior, and non-Gaussian input distributions.

  4. T antigen mutations are a human tumor-specific signature for Merkel cell polyomavirus

    PubMed Central

    Shuda, Masahiro; Feng, Huichen; Kwun, Hyun Jin; Rosen, Steven T.; Gjoerup, Ole; Moore, Patrick S.; Chang, Yuan

    2008-01-01

    Merkel cell polyomavirus (MCV) is a virus discovered in our laboratory at the University of Pittsburgh that is monoclonally integrated into the genome of ≈80% of human Merkel cell carcinomas (MCCs). Transcript mapping was performed to show that MCV expresses transcripts in MCCs similar to large T (LT), small T (ST), and 17kT transcripts of SV40. Nine MCC tumor-derived LT genomic sequences have been examined, and all were found to harbor mutations prematurely truncating the MCV LT helicase. In contrast, four presumed episomal viruses from nontumor sources did not possess this T antigen signature mutation. Using coimmunoprecipitation and origin replication assays, we show that tumor-derived virus mutations do not affect retinoblastoma tumor suppressor protein (Rb) binding by LT but do eliminate viral DNA replication capacity. Identification of an MCC cell line (MKL-1) having monoclonal MCV integration and the signature LT mutation allowed us to functionally test both tumor-derived and wild type (WT) T antigens. Only WT LT expression activates replication of integrated MCV DNA in MKL-1 cells. Our findings suggest that MCV-positive MCC tumor cells undergo selection for LT mutations to prevent autoactivation of integrated virus replication that would be detrimental to cell survival. Because these mutations render the virus replication-incompetent, MCV is not a “passenger virus” that secondarily infects MCC tumors. PMID:18812503

  5. A benchmark study of scoring methods for non-coding mutations.

    PubMed

    Drubay, Damien; Gautheret, Daniel; Michiels, Stefan

    2018-05-15

    Detailed knowledge of coding sequences has led to different candidate models for pathogenic variant prioritization. Several deleteriousness scores have been proposed for the non-coding part of the genome, but no large-scale comparison has been realized to date to assess their performance. We compared the leading scoring tools (CADD, FATHMM-MKL, Funseq2 and GWAVA) and some recent competitors (DANN, SNP and SOM scores) for their ability to discriminate assumed pathogenic variants from assumed benign variants (using the ClinVar, COSMIC and 1000 genomes project databases). Using the ClinVar benchmark, CADD was the best tool for detecting the pathogenic variants that are mainly located in protein coding gene regions. Using the COSMIC benchmark, FATHMM-MKL, GWAVA and SOMliver outperformed the other tools for pathogenic variants that are typically located in lincRNAs, pseudogenes and other parts of the non-coding genome. However, all tools had low precision, which could potentially be improved by future non-coding genome feature discoveries. These results may have been influenced by the presence of potential benign variants in the COSMIC database. The development of a gold standard as consistent as ClinVar for these regions will be necessary to confirm our tool ranking. The Snakemake, C++ and R codes are freely available from https://github.com/Oncostat/BenchmarkNCVTools and supported on Linux. damien.drubay@gustaveroussy.fr or stefan.michiels@gustaveroussy.fr. Supplementary data are available at Bioinformatics online.

  6. Automatic lumbar vertebrae detection based on feature fusion deep learning for partial occluded C-arm X-ray images.

    PubMed

    Yang Li; Wei Liang; Yinlong Zhang; Haibo An; Jindong Tan

    2016-08-01

    Automatic and accurate lumbar vertebrae detection is an essential step of image-guided minimally invasive spine surgery (IG-MISS). However, traditional methods still require human intervention due to the similarity of vertebrae, abnormal pathological conditions and uncertain imaging angle. In this paper, we present a novel convolutional neural network (CNN) model to automatically detect lumbar vertebrae for C-arm X-ray images. Training data is augmented by DRR and automatic segmentation of ROI is able to reduce the computational complexity. Furthermore, a feature fusion deep learning (FFDL) model is introduced to combine two types of features of lumbar vertebrae X-ray images, which uses sobel kernel and Gabor kernel to obtain the contour and texture of lumbar vertebrae, respectively. Comprehensive qualitative and quantitative experiments demonstrate that our proposed model performs more accurate in abnormal cases with pathologies and surgical implants in multi-angle views.

  7. Automated discrimination of dementia spectrum disorders using extreme learning machine and structural T1 MRI features.

    PubMed

    Jongin Kim; Boreom Lee

    2017-07-01

    The classification of neuroimaging data for the diagnosis of Alzheimer's Disease (AD) is one of the main research goals of the neuroscience and clinical fields. In this study, we performed extreme learning machine (ELM) classifier to discriminate the AD, mild cognitive impairment (MCI) from normal control (NC). We compared the performance of ELM with that of a linear kernel support vector machine (SVM) for 718 structural MRI images from Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The data consisted of normal control, MCI converter (MCI-C), MCI non-converter (MCI-NC), and AD. We employed SVM-based recursive feature elimination (RFE-SVM) algorithm to find the optimal subset of features. In this study, we found that the RFE-SVM feature selection approach in combination with ELM shows the superior classification accuracy to that of linear kernel SVM for structural T1 MRI data.

  8. Common spatial pattern combined with kernel linear discriminate and generalized radial basis function for motor imagery-based brain computer interface applications

    NASA Astrophysics Data System (ADS)

    Hekmatmanesh, Amin; Jamaloo, Fatemeh; Wu, Huapeng; Handroos, Heikki; Kilpeläinen, Asko

    2018-04-01

    Brain Computer Interface (BCI) can be a challenge for developing of robotic, prosthesis and human-controlled systems. This work focuses on the implementation of a common spatial pattern (CSP) base algorithm to detect event related desynchronization patterns. Utilizing famous previous work in this area, features are extracted by filter bank with common spatial pattern (FBCSP) method, and then weighted by a sensitive learning vector quantization (SLVQ) algorithm. In the current work, application of the radial basis function (RBF) as a mapping kernel of linear discriminant analysis (KLDA) method on the weighted features, allows the transfer of data into a higher dimension for more discriminated data scattering by RBF kernel. Afterwards, support vector machine (SVM) with generalized radial basis function (GRBF) kernel is employed to improve the efficiency and robustness of the classification. Averagely, 89.60% accuracy and 74.19% robustness are achieved. BCI Competition III, Iva data set is used to evaluate the algorithm for detecting right hand and foot imagery movement patterns. Results show that combination of KLDA with SVM-GRBF classifier makes 8.9% and 14.19% improvements in accuracy and robustness, respectively. For all the subjects, it is concluded that mapping the CSP features into a higher dimension by RBF and utilization GRBF as a kernel of SVM, improve the accuracy and reliability of the proposed method.

  9. Multi-channel EEG-based sleep stage classification with joint collaborative representation and multiple kernel learning.

    PubMed

    Shi, Jun; Liu, Xiao; Li, Yan; Zhang, Qi; Li, Yingjie; Ying, Shihui

    2015-10-30

    Electroencephalography (EEG) based sleep staging is commonly used in clinical routine. Feature extraction and representation plays a crucial role in EEG-based automatic classification of sleep stages. Sparse representation (SR) is a state-of-the-art unsupervised feature learning method suitable for EEG feature representation. Collaborative representation (CR) is an effective data coding method used as a classifier. Here we use CR as a data representation method to learn features from the EEG signal. A joint collaboration model is established to develop a multi-view learning algorithm, and generate joint CR (JCR) codes to fuse and represent multi-channel EEG signals. A two-stage multi-view learning-based sleep staging framework is then constructed, in which JCR and joint sparse representation (JSR) algorithms first fuse and learning the feature representation from multi-channel EEG signals, respectively. Multi-view JCR and JSR features are then integrated and sleep stages recognized by a multiple kernel extreme learning machine (MK-ELM) algorithm with grid search. The proposed two-stage multi-view learning algorithm achieves superior performance for sleep staging. With a K-means clustering based dictionary, the mean classification accuracy, sensitivity and specificity are 81.10 ± 0.15%, 71.42 ± 0.66% and 94.57 ± 0.07%, respectively; while with the dictionary learned using the submodular optimization method, they are 80.29 ± 0.22%, 71.26 ± 0.78% and 94.38 ± 0.10%, respectively. The two-stage multi-view learning based sleep staging framework outperforms all other classification methods compared in this work, while JCR is superior to JSR. The proposed multi-view learning framework has the potential for sleep staging based on multi-channel or multi-modality polysomnography signals. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. An Efficient Method Coupling Kernel Principal Component Analysis with Adjoint-Based Optimal Control and Its Goal-Oriented Extensions

    NASA Astrophysics Data System (ADS)

    Thimmisetty, C.; Talbot, C.; Tong, C. H.; Chen, X.

    2016-12-01

    The representativeness of available data poses a significant fundamental challenge to the quantification of uncertainty in geophysical systems. Furthermore, the successful application of machine learning methods to geophysical problems involving data assimilation is inherently constrained by the extent to which obtainable data represent the problem considered. We show how the adjoint method, coupled with optimization based on methods of machine learning, can facilitate the minimization of an objective function defined on a space of significantly reduced dimension. By considering uncertain parameters as constituting a stochastic process, the Karhunen-Loeve expansion and its nonlinear extensions furnish an optimal basis with respect to which optimization using L-BFGS can be carried out. In particular, we demonstrate that kernel PCA can be coupled with adjoint-based optimal control methods to successfully determine the distribution of material parameter values for problems in the context of channelized deformable media governed by the equations of linear elasticity. Since certain subsets of the original data are characterized by different features, the convergence rate of the method in part depends on, and may be limited by, the observations used to furnish the kernel principal component basis. By determining appropriate weights for realizations of the stochastic random field, then, one may accelerate the convergence of the method. To this end, we present a formulation of Weighted PCA combined with a gradient-based means using automatic differentiation to iteratively re-weight observations concurrent with the determination of an optimal reduced set control variables in the feature space. We demonstrate how improvements in the accuracy and computational efficiency of the weighted linear method can be achieved over existing unweighted kernel methods, and discuss nonlinear extensions of the algorithm.

  11. The Latent Structure of Dictionaries.

    PubMed

    Vincent-Lamarre, Philippe; Massé, Alexandre Blondin; Lopes, Marcos; Lord, Mélanie; Marcotte, Odile; Harnad, Stevan

    2016-07-01

    How many words-and which ones-are sufficient to define all other words? When dictionaries are analyzed as directed graphs with links from defining words to defined words, they reveal a latent structure. Recursively removing all words that are reachable by definition but that do not define any further words reduces the dictionary to a Kernel of about 10% of its size. This is still not the smallest number of words that can define all the rest. About 75% of the Kernel turns out to be its Core, a "Strongly Connected Subset" of words with a definitional path to and from any pair of its words and no word's definition depending on a word outside the set. But the Core cannot define all the rest of the dictionary. The 25% of the Kernel surrounding the Core consists of small strongly connected subsets of words: the Satellites. The size of the smallest set of words that can define all the rest-the graph's "minimum feedback vertex set" or MinSet-is about 1% of the dictionary, about 15% of the Kernel, and part-Core/part-Satellite. But every dictionary has a huge number of MinSets. The Core words are learned earlier, more frequent, and less concrete than the Satellites, which are in turn learned earlier, more frequent, but more concrete than the rest of the Dictionary. In principle, only one MinSet's words would need to be grounded through the sensorimotor capacity to recognize and categorize their referents. In a dual-code sensorimotor/symbolic model of the mental lexicon, the symbolic code could do all the rest through recombinatory definition. Copyright © 2016 Cognitive Science Society, Inc.

  12. Learning Spatially-Smooth Mappings in Non-Rigid Structure from Motion

    PubMed Central

    Hamsici, Onur C.; Gotardo, Paulo F.U.; Martinez, Aleix M.

    2013-01-01

    Non-rigid structure from motion (NRSFM) is a classical underconstrained problem in computer vision. A common approach to make NRSFM more tractable is to constrain 3D shape deformation to be smooth over time. This constraint has been used to compress the deformation model and reduce the number of unknowns that are estimated. However, temporal smoothness cannot be enforced when the data lacks temporal ordering and its benefits are less evident when objects undergo abrupt deformations. This paper proposes a new NRSFM method that addresses these problems by considering deformations as spatial variations in shape space and then enforcing spatial, rather than temporal, smoothness. This is done by modeling each 3D shape coefficient as a function of its input 2D shape. This mapping is learned in the feature space of a rotation invariant kernel, where spatial smoothness is intrinsically defined by the mapping function. As a result, our model represents shape variations compactly using custom-built coefficient bases learned from the input data, rather than a pre-specified set such as the Discrete Cosine Transform. The resulting kernel-based mapping is a by-product of the NRSFM solution and leads to another fundamental advantage of our approach: for a newly observed 2D shape, its 3D shape is recovered by simply evaluating the learned function. PMID:23946937

  13. Learning Spatially-Smooth Mappings in Non-Rigid Structure from Motion.

    PubMed

    Hamsici, Onur C; Gotardo, Paulo F U; Martinez, Aleix M

    2012-01-01

    Non-rigid structure from motion (NRSFM) is a classical underconstrained problem in computer vision. A common approach to make NRSFM more tractable is to constrain 3D shape deformation to be smooth over time. This constraint has been used to compress the deformation model and reduce the number of unknowns that are estimated. However, temporal smoothness cannot be enforced when the data lacks temporal ordering and its benefits are less evident when objects undergo abrupt deformations. This paper proposes a new NRSFM method that addresses these problems by considering deformations as spatial variations in shape space and then enforcing spatial, rather than temporal, smoothness. This is done by modeling each 3D shape coefficient as a function of its input 2D shape. This mapping is learned in the feature space of a rotation invariant kernel, where spatial smoothness is intrinsically defined by the mapping function. As a result, our model represents shape variations compactly using custom-built coefficient bases learned from the input data, rather than a pre-specified set such as the Discrete Cosine Transform. The resulting kernel-based mapping is a by-product of the NRSFM solution and leads to another fundamental advantage of our approach: for a newly observed 2D shape, its 3D shape is recovered by simply evaluating the learned function.

  14. Support vector machines for nuclear reactor state estimation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zavaljevski, N.; Gross, K. C.

    2000-02-14

    Validation of nuclear power reactor signals is often performed by comparing signal prototypes with the actual reactor signals. The signal prototypes are often computed based on empirical data. The implementation of an estimation algorithm which can make predictions on limited data is an important issue. A new machine learning algorithm called support vector machines (SVMS) recently developed by Vladimir Vapnik and his coworkers enables a high level of generalization with finite high-dimensional data. The improved generalization in comparison with standard methods like neural networks is due mainly to the following characteristics of the method. The input data space is transformedmore » into a high-dimensional feature space using a kernel function, and the learning problem is formulated as a convex quadratic programming problem with a unique solution. In this paper the authors have applied the SVM method for data-based state estimation in nuclear power reactors. In particular, they implemented and tested kernels developed at Argonne National Laboratory for the Multivariate State Estimation Technique (MSET), a nonlinear, nonparametric estimation technique with a wide range of applications in nuclear reactors. The methodology has been applied to three data sets from experimental and commercial nuclear power reactor applications. The results are promising. The combination of MSET kernels with the SVM method has better noise reduction and generalization properties than the standard MSET algorithm.« less

  15. An efficient diagnosis system for Parkinson's disease using kernel-based extreme learning machine with subtractive clustering features weighting approach.

    PubMed

    Ma, Chao; Ouyang, Jihong; Chen, Hui-Ling; Zhao, Xue-Hua

    2014-01-01

    A novel hybrid method named SCFW-KELM, which integrates effective subtractive clustering features weighting and a fast classifier kernel-based extreme learning machine (KELM), has been introduced for the diagnosis of PD. In the proposed method, SCFW is used as a data preprocessing tool, which aims at decreasing the variance in features of the PD dataset, in order to further improve the diagnostic accuracy of the KELM classifier. The impact of the type of kernel functions on the performance of KELM has been investigated in detail. The efficiency and effectiveness of the proposed method have been rigorously evaluated against the PD dataset in terms of classification accuracy, sensitivity, specificity, area under the receiver operating characteristic (ROC) curve (AUC), f-measure, and kappa statistics value. Experimental results have demonstrated that the proposed SCFW-KELM significantly outperforms SVM-based, KNN-based, and ELM-based approaches and other methods in the literature and achieved highest classification results reported so far via 10-fold cross validation scheme, with the classification accuracy of 99.49%, the sensitivity of 100%, the specificity of 99.39%, AUC of 99.69%, the f-measure value of 0.9964, and kappa value of 0.9867. Promisingly, the proposed method might serve as a new candidate of powerful methods for the diagnosis of PD with excellent performance.

  16. An Efficient Diagnosis System for Parkinson's Disease Using Kernel-Based Extreme Learning Machine with Subtractive Clustering Features Weighting Approach

    PubMed Central

    Ma, Chao; Ouyang, Jihong; Chen, Hui-Ling; Zhao, Xue-Hua

    2014-01-01

    A novel hybrid method named SCFW-KELM, which integrates effective subtractive clustering features weighting and a fast classifier kernel-based extreme learning machine (KELM), has been introduced for the diagnosis of PD. In the proposed method, SCFW is used as a data preprocessing tool, which aims at decreasing the variance in features of the PD dataset, in order to further improve the diagnostic accuracy of the KELM classifier. The impact of the type of kernel functions on the performance of KELM has been investigated in detail. The efficiency and effectiveness of the proposed method have been rigorously evaluated against the PD dataset in terms of classification accuracy, sensitivity, specificity, area under the receiver operating characteristic (ROC) curve (AUC), f-measure, and kappa statistics value. Experimental results have demonstrated that the proposed SCFW-KELM significantly outperforms SVM-based, KNN-based, and ELM-based approaches and other methods in the literature and achieved highest classification results reported so far via 10-fold cross validation scheme, with the classification accuracy of 99.49%, the sensitivity of 100%, the specificity of 99.39%, AUC of 99.69%, the f-measure value of 0.9964, and kappa value of 0.9867. Promisingly, the proposed method might serve as a new candidate of powerful methods for the diagnosis of PD with excellent performance. PMID:25484912

  17. Identification of Alzheimer's disease and mild cognitive impairment using multimodal sparse hierarchical extreme learning machine.

    PubMed

    Kim, Jongin; Lee, Boreom

    2018-05-07

    Different modalities such as structural MRI, FDG-PET, and CSF have complementary information, which is likely to be very useful for diagnosis of AD and MCI. Therefore, it is possible to develop a more effective and accurate AD/MCI automatic diagnosis method by integrating complementary information of different modalities. In this paper, we propose multi-modal sparse hierarchical extreme leaning machine (MSH-ELM). We used volume and mean intensity extracted from 93 regions of interest (ROIs) as features of MRI and FDG-PET, respectively, and used p-tau, t-tau, and Aβ42 as CSF features. In detail, high-level representation was individually extracted from each of MRI, FDG-PET, and CSF using a stacked sparse extreme learning machine auto-encoder (sELM-AE). Then, another stacked sELM-AE was devised to acquire a joint hierarchical feature representation by fusing the high-level representations obtained from each modality. Finally, we classified joint hierarchical feature representation using a kernel-based extreme learning machine (KELM). The results of MSH-ELM were compared with those of conventional ELM, single kernel support vector machine (SK-SVM), multiple kernel support vector machine (MK-SVM) and stacked auto-encoder (SAE). Performance was evaluated through 10-fold cross-validation. In the classification of AD vs. HC and MCI vs. HC problem, the proposed MSH-ELM method showed mean balanced accuracies of 96.10% and 86.46%, respectively, which is much better than those of competing methods. In summary, the proposed algorithm exhibits consistently better performance than SK-SVM, ELM, MK-SVM and SAE in the two binary classification problems (AD vs. HC and MCI vs. HC). © 2018 Wiley Periodicals, Inc.

  18. A Nonrigid Kernel-Based Framework for 2D-3D Pose Estimation and 2D Image Segmentation

    PubMed Central

    Sandhu, Romeil; Dambreville, Samuel; Yezzi, Anthony; Tannenbaum, Allen

    2013-01-01

    In this work, we present a nonrigid approach to jointly solving the tasks of 2D-3D pose estimation and 2D image segmentation. In general, most frameworks that couple both pose estimation and segmentation assume that one has exact knowledge of the 3D object. However, under nonideal conditions, this assumption may be violated if only a general class to which a given shape belongs is given (e.g., cars, boats, or planes). Thus, we propose to solve the 2D-3D pose estimation and 2D image segmentation via nonlinear manifold learning of 3D embedded shapes for a general class of objects or deformations for which one may not be able to associate a skeleton model. Thus, the novelty of our method is threefold: First, we present and derive a gradient flow for the task of nonrigid pose estimation and segmentation. Second, due to the possible nonlinear structures of one’s training set, we evolve the preimage obtained through kernel PCA for the task of shape analysis. Third, we show that the derivation for shape weights is general. This allows us to use various kernels, as well as other statistical learning methodologies, with only minimal changes needing to be made to the overall shape evolution scheme. In contrast with other techniques, we approach the nonrigid problem, which is an infinite-dimensional task, with a finite-dimensional optimization scheme. More importantly, we do not explicitly need to know the interaction between various shapes such as that needed for skeleton models as this is done implicitly through shape learning. We provide experimental results on several challenging pose estimation and segmentation scenarios. PMID:20733218

  19. Agile convolutional neural network for pulmonary nodule classification using CT images.

    PubMed

    Zhao, Xinzhuo; Liu, Liyao; Qi, Shouliang; Teng, Yueyang; Li, Jianhua; Qian, Wei

    2018-04-01

    To distinguish benign from malignant pulmonary nodules using CT images is critical for their precise diagnosis and treatment. A new Agile convolutional neural network (CNN) framework is proposed to conquer the challenges of a small-scale medical image database and the small size of the nodules, and it improves the performance of pulmonary nodule classification using CT images. A hybrid CNN of LeNet and AlexNet is constructed through combining the layer settings of LeNet and the parameter settings of AlexNet. A dataset with 743 CT image nodule samples is built up based on the 1018 CT scans of LIDC to train and evaluate the Agile CNN model. Through adjusting the parameters of the kernel size, learning rate, and other factors, the effect of these parameters on the performance of the CNN model is investigated, and an optimized setting of the CNN is obtained finally. After finely optimizing the settings of the CNN, the estimation accuracy and the area under the curve can reach 0.822 and 0.877, respectively. The accuracy of the CNN is significantly dependent on the kernel size, learning rate, training batch size, dropout, and weight initializations. The best performance is achieved when the kernel size is set to [Formula: see text], the learning rate is 0.005, the batch size is 32, and dropout and Gaussian initialization are used. This competitive performance demonstrates that our proposed CNN framework and the optimization strategy of the CNN parameters are suitable for pulmonary nodule classification characterized by small medical datasets and small targets. The classification model might help diagnose and treat pulmonary nodules effectively.

  20. Limitations of commonly used internal controls for real-time RT-PCR analysis of renal epithelial-mesenchymal cell transition.

    PubMed

    Elberg, Gerard; Elberg, Dorit; Logan, Charlotte J; Chen, Lijuan; Turman, Martin A

    2006-01-01

    Progressive renal fibrotic disease is accompanied by the massive accumulation of myofibroblasts as defined by alpha smooth muscle actin (alphaSMA) expression. We quantitated gene expression using real-time RT-PCR analysis during conversion of primary cultured human renal tubular cells (RTC) to myofibroblasts after treatment with transforming growth factor-beta1 (TGF-beta1). We report herein the limitations of commonly used reference genes for mRNA quantitation. We determined the expression of alphaSMA and megakaryoblastic leukemia-1 (MKL1), a transcriptional regulator of alphaSMA, by quantitative real-time PCR using three common internal controls, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), cyclophilin A and 18S rRNA. Expression of GAPDH mRNA and cyclophilin A mRNA, and to a lesser extent, 18S rRNA levels varied over time in culture and with exposure to TGF-beta1. Thus, depending on which reference gene was used, TGF-beta1 appeared to have different effects on expression of MKL1 and alphaSMA. RTC converting to myofibroblasts in primary culture is a valuable system to study renal fibrosis in humans. However, variability in expression of reference genes with TGF-beta1 treatment illustrates the need to validate mRNA quantitation with multiple reference genes to provide accurate interpretation of fibrosis studies in the absence of a universal internal standard for mRNA expression. 2006 S. Karger AG, Basel.

  1. Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data

    PubMed Central

    Zhao, Xin; Cheung, Leo Wang-Kit

    2007-01-01

    Background Designing appropriate machine learning methods for identifying genes that have a significant discriminating power for disease outcomes has become more and more important for our understanding of diseases at genomic level. Although many machine learning methods have been developed and applied to the area of microarray gene expression data analysis, the majority of them are based on linear models, which however are not necessarily appropriate for the underlying connection between the target disease and its associated explanatory genes. Linear model based methods usually also bring in false positive significant features more easily. Furthermore, linear model based algorithms often involve calculating the inverse of a matrix that is possibly singular when the number of potentially important genes is relatively large. This leads to problems of numerical instability. To overcome these limitations, a few non-linear methods have recently been introduced to the area. Many of the existing non-linear methods have a couple of critical problems, the model selection problem and the model parameter tuning problem, that remain unsolved or even untouched. In general, a unified framework that allows model parameters of both linear and non-linear models to be easily tuned is always preferred in real-world applications. Kernel-induced learning methods form a class of approaches that show promising potentials to achieve this goal. Results A hierarchical statistical model named kernel-imbedded Gaussian process (KIGP) is developed under a unified Bayesian framework for binary disease classification problems using microarray gene expression data. In particular, based on a probit regression setting, an adaptive algorithm with a cascading structure is designed to find the appropriate kernel, to discover the potentially significant genes, and to make the optimal class prediction accordingly. A Gibbs sampler is built as the core of the algorithm to make Bayesian inferences. Simulation studies showed that, even without any knowledge of the underlying generative model, the KIGP performed very close to the theoretical Bayesian bound not only in the case with a linear Bayesian classifier but also in the case with a very non-linear Bayesian classifier. This sheds light on its broader usability to microarray data analysis problems, especially to those that linear methods work awkwardly. The KIGP was also applied to four published microarray datasets, and the results showed that the KIGP performed better than or at least as well as any of the referred state-of-the-art methods did in all of these cases. Conclusion Mathematically built on the kernel-induced feature space concept under a Bayesian framework, the KIGP method presented in this paper provides a unified machine learning approach to explore both the linear and the possibly non-linear underlying relationship between the target features of a given binary disease classification problem and the related explanatory gene expression data. More importantly, it incorporates the model parameter tuning into the framework. The model selection problem is addressed in the form of selecting a proper kernel type. The KIGP method also gives Bayesian probabilistic predictions for disease classification. These properties and features are beneficial to most real-world applications. The algorithm is naturally robust in numerical computation. The simulation studies and the published data studies demonstrated that the proposed KIGP performs satisfactorily and consistently. PMID:17328811

  2. Stable Local Volatility Calibration Using Kernel Splines

    NASA Astrophysics Data System (ADS)

    Coleman, Thomas F.; Li, Yuying; Wang, Cheng

    2010-09-01

    We propose an optimization formulation using L1 norm to ensure accuracy and stability in calibrating a local volatility function for option pricing. Using a regularization parameter, the proposed objective function balances the calibration accuracy with the model complexity. Motivated by the support vector machine learning, the unknown local volatility function is represented by a kernel function generating splines and the model complexity is controlled by minimizing the 1-norm of the kernel coefficient vector. In the context of the support vector regression for function estimation based on a finite set of observations, this corresponds to minimizing the number of support vectors for predictability. We illustrate the ability of the proposed approach to reconstruct the local volatility function in a synthetic market. In addition, based on S&P 500 market index option data, we demonstrate that the calibrated local volatility surface is simple and resembles the observed implied volatility surface in shape. Stability is illustrated by calibrating local volatility functions using market option data from different dates.

  3. A Hybrid Short-Term Traffic Flow Prediction Model Based on Singular Spectrum Analysis and Kernel Extreme Learning Machine.

    PubMed

    Shang, Qiang; Lin, Ciyun; Yang, Zhaosheng; Bing, Qichun; Zhou, Xiyang

    2016-01-01

    Short-term traffic flow prediction is one of the most important issues in the field of intelligent transport system (ITS). Because of the uncertainty and nonlinearity, short-term traffic flow prediction is a challenging task. In order to improve the accuracy of short-time traffic flow prediction, a hybrid model (SSA-KELM) is proposed based on singular spectrum analysis (SSA) and kernel extreme learning machine (KELM). SSA is used to filter out the noise of traffic flow time series. Then, the filtered traffic flow data is used to train KELM model, the optimal input form of the proposed model is determined by phase space reconstruction, and parameters of the model are optimized by gravitational search algorithm (GSA). Finally, case validation is carried out using the measured data of an expressway in Xiamen, China. And the SSA-KELM model is compared with several well-known prediction models, including support vector machine, extreme learning machine, and single KLEM model. The experimental results demonstrate that performance of the proposed model is superior to that of the comparison models. Apart from accuracy improvement, the proposed model is more robust.

  4. A Hybrid Short-Term Traffic Flow Prediction Model Based on Singular Spectrum Analysis and Kernel Extreme Learning Machine

    PubMed Central

    Lin, Ciyun; Yang, Zhaosheng; Bing, Qichun; Zhou, Xiyang

    2016-01-01

    Short-term traffic flow prediction is one of the most important issues in the field of intelligent transport system (ITS). Because of the uncertainty and nonlinearity, short-term traffic flow prediction is a challenging task. In order to improve the accuracy of short-time traffic flow prediction, a hybrid model (SSA-KELM) is proposed based on singular spectrum analysis (SSA) and kernel extreme learning machine (KELM). SSA is used to filter out the noise of traffic flow time series. Then, the filtered traffic flow data is used to train KELM model, the optimal input form of the proposed model is determined by phase space reconstruction, and parameters of the model are optimized by gravitational search algorithm (GSA). Finally, case validation is carried out using the measured data of an expressway in Xiamen, China. And the SSA-KELM model is compared with several well-known prediction models, including support vector machine, extreme learning machine, and single KLEM model. The experimental results demonstrate that performance of the proposed model is superior to that of the comparison models. Apart from accuracy improvement, the proposed model is more robust. PMID:27551829

  5. Monitoring Hitting Load in Tennis Using Inertial Sensors and Machine Learning.

    PubMed

    Whiteside, David; Cant, Olivia; Connolly, Molly; Reid, Machar

    2017-10-01

    Quantifying external workload is fundamental to training prescription in sport. In tennis, global positioning data are imprecise and fail to capture hitting loads. The current gold standard (manual notation) is time intensive and often not possible given players' heavy travel schedules. To develop an automated stroke-classification system to help quantify hitting load in tennis. Nineteen athletes wore an inertial measurement unit (IMU) on their wrist during 66 video-recorded training sessions. Video footage was manually notated such that known shot type (serve, rally forehand, slice forehand, forehand volley, rally backhand, slice backhand, backhand volley, smash, or false positive) was associated with the corresponding IMU data for 28,582 shots. Six types of machine-learning models were then constructed to classify true shot type from the IMU signals. Across 10-fold cross-validation, a cubic-kernel support vector machine classified binned shots (overhead, forehand, or backhand) with an accuracy of 97.4%. A second cubic-kernel support vector machine achieved 93.2% accuracy when classifying all 9 shot types. With a view to monitoring external load, the combination of miniature inertial sensors and machine learning offers a practical and automated method of quantifying shot counts and discriminating shot types in elite tennis players.

  6. Bayesian Kernel Methods for Non-Gaussian Distributions: Binary and Multi-class Classification Problems

    DTIC Science & Technology

    2013-05-28

    those of the support vector machine and relevance vector machine, and the model runs more quickly than the other algorithms . When one class occurs...incremental support vector machine algorithm for online learning when fewer than 50 data points are available. (a) Papers published in peer-reviewed journals...learning environments, where data processing occurs one observation at a time and the classification algorithm improves over time with new

  7. Building machine learning force fields for nanoclusters

    NASA Astrophysics Data System (ADS)

    Zeni, Claudio; Rossi, Kevin; Glielmo, Aldo; Fekete, Ádám; Gaston, Nicola; Baletto, Francesca; De Vita, Alessandro

    2018-06-01

    We assess Gaussian process (GP) regression as a technique to model interatomic forces in metal nanoclusters by analyzing the performance of 2-body, 3-body, and many-body kernel functions on a set of 19-atom Ni cluster structures. We find that 2-body GP kernels fail to provide faithful force estimates, despite succeeding in bulk Ni systems. However, both 3- and many-body kernels predict forces within an ˜0.1 eV/Å average error even for small training datasets and achieve high accuracy even on out-of-sample, high temperature structures. While training and testing on the same structure always provide satisfactory accuracy, cross-testing on dissimilar structures leads to higher prediction errors, posing an extrapolation problem. This can be cured using heterogeneous training on databases that contain more than one structure, which results in a good trade-off between versatility and overall accuracy. Starting from a 3-body kernel trained this way, we build an efficient non-parametric 3-body force field that allows accurate prediction of structural properties at finite temperatures, following a newly developed scheme [A. Glielmo et al., Phys. Rev. B 95, 214302 (2017)]. We use this to assess the thermal stability of Ni19 nanoclusters at a fractional cost of full ab initio calculations.

  8. Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes

    PubMed Central

    Xia, Xiao-Lei; Xing, Huanlai; Liu, Xueqin

    2013-01-01

    One of the most important applications of microarray data is the class prediction of biological samples. For this purpose, statistical tests have often been applied to identify the differentially expressed genes (DEGs), followed by the employment of the state-of-the-art learning machines including the Support Vector Machines (SVM) in particular. The SVM is a typical sample-based classifier whose performance comes down to how discriminant samples are. However, DEGs identified by statistical tests are not guaranteed to result in a training dataset composed of discriminant samples. To tackle this problem, a novel gene ranking method namely the Kernel Matrix Gene Selection (KMGS) is proposed. The rationale of the method, which roots in the fundamental ideas of the SVM algorithm, is described. The notion of ''the separability of a sample'' which is estimated by performing -like statistics on each column of the kernel matrix, is first introduced. The separability of a classification problem is then measured, from which the significance of a specific gene is deduced. Also described is a method of Kernel Matrix Sequential Forward Selection (KMSFS) which shares the KMGS method's essential ideas but proceeds in a greedy manner. On three public microarray datasets, our proposed algorithms achieved noticeably competitive performance in terms of the B.632+ error rate. PMID:24349110

  9. Margin-maximizing feature elimination methods for linear and nonlinear kernel-based discriminant functions.

    PubMed

    Aksu, Yaman; Miller, David J; Kesidis, George; Yang, Qing X

    2010-05-01

    Feature selection for classification in high-dimensional spaces can improve generalization, reduce classifier complexity, and identify important, discriminating feature "markers." For support vector machine (SVM) classification, a widely used technique is recursive feature elimination (RFE). We demonstrate that RFE is not consistent with margin maximization, central to the SVM learning approach. We thus propose explicit margin-based feature elimination (MFE) for SVMs and demonstrate both improved margin and improved generalization, compared with RFE. Moreover, for the case of a nonlinear kernel, we show that RFE assumes that the squared weight vector 2-norm is strictly decreasing as features are eliminated. We demonstrate this is not true for the Gaussian kernel and, consequently, RFE may give poor results in this case. MFE for nonlinear kernels gives better margin and generalization. We also present an extension which achieves further margin gains, by optimizing only two degrees of freedom--the hyperplane's intercept and its squared 2-norm--with the weight vector orientation fixed. We finally introduce an extension that allows margin slackness. We compare against several alternatives, including RFE and a linear programming method that embeds feature selection within the classifier design. On high-dimensional gene microarray data sets, University of California at Irvine (UCI) repository data sets, and Alzheimer's disease brain image data, MFE methods give promising results.

  10. More than Testing.

    ERIC Educational Resources Information Center

    Badger, Elizabeth

    1992-01-01

    Explains a set of processes that teachers might use to structure their evaluation of students' learning and understanding. Illustrates the processes of setting goals, deciding what to assess, gathering information, and using the results through a measurement task requiring students to estimate the number of popcorn kernels in a container. (MDH)

  11. Enhanced Data Representation by Kernel Metric Learning for Dementia Diagnosis

    PubMed Central

    Cárdenas-Peña, David; Collazos-Huertas, Diego; Castellanos-Dominguez, German

    2017-01-01

    Alzheimer's disease (AD) is the kind of dementia that affects the most people around the world. Therefore, an early identification supporting effective treatments is required to increase the life quality of a wide number of patients. Recently, computer-aided diagnosis tools for dementia using Magnetic Resonance Imaging scans have been successfully proposed to discriminate between patients with AD, mild cognitive impairment, and healthy controls. Most of the attention has been given to the clinical data, provided by initiatives as the ADNI, supporting reliable researches on intervention, prevention, and treatments of AD. Therefore, there is a need for improving the performance of classification machines. In this paper, we propose a kernel framework for learning metrics that enhances conventional machines and supports the diagnosis of dementia. Our framework aims at building discriminative spaces through the maximization of center kernel alignment function, aiming at improving the discrimination of the three considered neurological classes. The proposed metric learning performance is evaluated on the widely-known ADNI database using three supervised classification machines (k-nn, SVM and NNs) for multi-class and bi-class scenarios from structural MRIs. Specifically, from ADNI collection 286 AD patients, 379 MCI patients and 231 healthy controls are used for development and validation of our proposed metric learning framework. For the experimental validation, we split the data into two subsets: 30% of subjects used like a blindfolded assessment and 70% employed for parameter tuning. Then, in the preprocessing stage, each structural MRI scan a total of 310 morphological measurements are automatically extracted from by FreeSurfer software package and concatenated to build an input feature matrix. Obtained test performance results, show that including a supervised metric learning improves the compared baseline classifiers in both scenarios. In the multi-class scenario, we achieve the best performance (accuracy 60.1%) for pretrained 1-layered NN, and we obtain measures over 90% in the average for HC vs. AD task. From the machine learning point of view, our proposal enhances the classifier performance by building spaces with a better class separability. From the clinical application, our enhancement results in a more balanced performance in each class than the compared approaches from the CADDementia challenge by increasing the sensitivity of pathological groups and the specificity of healthy controls. PMID:28798659

  12. Machine learning with quantum relative entropy

    NASA Astrophysics Data System (ADS)

    Tsuda, Koji

    2009-12-01

    Density matrices are a central tool in quantum physics, but it is also used in machine learning. A positive definite matrix called kernel matrix is used to represent the similarities between examples. Positive definiteness assures that the examples are embedded in an Euclidean space. When a positive definite matrix is learned from data, one has to design an update rule that maintains the positive definiteness. Our update rule, called matrix exponentiated gradient update, is motivated by the quantum relative entropy. Notably, the relative entropy is an instance of Bregman divergences, which are asymmetric distance measures specifying theoretical properties of machine learning algorithms. Using the calculus commonly used in quantum physics, we prove an upperbound of the generalization error of online learning.

  13. Replicated computational results (RCR) report for "BLIS: A framework for rapidly instantiating BLAS functionality"

    DOE PAGES

    Willenbring, James Michael

    2015-06-03

    “BLIS: A Framework for Rapidly Instantiating BLAS Functionality” includes single-platform BLIS performance results for both level-2 and level-3 operations that is competitive with OpenBLAS, ATLAS, and Intel MKL. A detailed description of the configuration used to generate the performance results was provided to the reviewer by the authors. All the software components used in the comparison were reinstalled and new performance results were generated and compared to the original results. After completing this process, the published results are deemed replicable by the reviewer.

  14. Learning User Preferences for Sets of Objects

    NASA Technical Reports Server (NTRS)

    desJardins, Marie; Eaton, Eric; Wagstaff, Kiri L.

    2006-01-01

    Most work on preference learning has focused on pairwise preferences or rankings over individual items. In this paper, we present a method for learning preferences over sets of items. Our learning method takes as input a collection of positive examples--that is, one or more sets that have been identified by a user as desirable. Kernel density estimation is used to estimate the value function for individual items, and the desired set diversity is estimated from the average set diversity observed in the collection. Since this is a new learning problem, we introduce a new evaluation methodology and evaluate the learning method on two data collections: synthetic blocks-world data and a new real-world music data collection that we have gathered.

  15. An Analysis of the Nonlinear Spectral Mixing of Didymium and Soda-Lime Glass Beads Using Hyperspectral Imagery (HSI) Microscopy

    DTIC Science & Technology

    2014-05-01

    2001). Learning with Kernels: Support Vector Machines , Regularization, Optimization, and Beyond. The MIT Press, Cambridge, MA, 644 p. [26] Banerjee...A., Burlina, P., and Broadwater, J., (2007). A Machine Learning Approach for finding hyperspectral endmembers. Proceedings of the IEEE International... lime glass beads using hyperspectral imagery (HSI) microscopy Ronald G. Resmini1*, Robert S. Rand2, David W. Allen3, and Christopher J. Deloye1

  16. Machine Learning Intermolecular Potentials for 1,3,5-Triamino-2,4,6-trinitrobenzene (TATB) Using Symmetry-Adapted Perturbation Theory

    DTIC Science & Technology

    2018-04-25

    unlimited. NOTICES Disclaimers The findings in this report are not to be construed as an official Department of the Army position unless so...this report, intermolecular potentials for 1,3,5-triamino-2,4,6-trinitrobenzene (TATB) are developed using machine learning techniques. Three...potentials based on support vector regression, kernel ridge regression, and a neural network are fit using symmetry-adapted perturbation theory. The

  17. Scalable Kernel Methods and Algorithms for General Sequence Analysis

    ERIC Educational Resources Information Center

    Kuksa, Pavel

    2011-01-01

    Analysis of large-scale sequential data has become an important task in machine learning and pattern recognition, inspired in part by numerous scientific and technological applications such as the document and text classification or the analysis of biological sequences. However, current computational methods for sequence comparison still lack…

  18. SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition

    PubMed Central

    Melvin, Iain; Ie, Eugene; Kuang, Rui; Weston, Jason; Stafford, William Noble; Leslie, Christina

    2007-01-01

    Background Predicting a protein's structural class from its amino acid sequence is a fundamental problem in computational biology. Much recent work has focused on developing new representations for protein sequences, called string kernels, for use with support vector machine (SVM) classifiers. However, while some of these approaches exhibit state-of-the-art performance at the binary protein classification problem, i.e. discriminating between a particular protein class and all other classes, few of these studies have addressed the real problem of multi-class superfamily or fold recognition. Moreover, there are only limited software tools and systems for SVM-based protein classification available to the bioinformatics community. Results We present a new multi-class SVM-based protein fold and superfamily recognition system and web server called SVM-Fold, which can be found at . Our system uses an efficient implementation of a state-of-the-art string kernel for sequence profiles, called the profile kernel, where the underlying feature representation is a histogram of inexact matching k-mer frequencies. We also employ a novel machine learning approach to solve the difficult multi-class problem of classifying a sequence of amino acids into one of many known protein structural classes. Binary one-vs-the-rest SVM classifiers that are trained to recognize individual structural classes yield prediction scores that are not comparable, so that standard "one-vs-all" classification fails to perform well. Moreover, SVMs for classes at different levels of the protein structural hierarchy may make useful predictions, but one-vs-all does not try to combine these multiple predictions. To deal with these problems, our method learns relative weights between one-vs-the-rest classifiers and encodes information about the protein structural hierarchy for multi-class prediction. In large-scale benchmark results based on the SCOP database, our code weighting approach significantly improves on the standard one-vs-all method for both the superfamily and fold prediction in the remote homology setting and on the fold recognition problem. Moreover, our code weight learning algorithm strongly outperforms nearest-neighbor methods based on PSI-BLAST in terms of prediction accuracy on every structure classification problem we consider. Conclusion By combining state-of-the-art SVM kernel methods with a novel multi-class algorithm, the SVM-Fold system delivers efficient and accurate protein fold and superfamily recognition. PMID:17570145

  19. Parameterized Micro-benchmarking: An Auto-tuning Approach for Complex Applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ma, Wenjing; Krishnamoorthy, Sriram; Agrawal, Gagan

    2012-05-15

    Auto-tuning has emerged as an important practical method for creating highly optimized implementations of key computational kernels and applications. However, the growing complexity of architectures and applications is creating new challenges for auto-tuning. Complex applications can involve a prohibitively large search space that precludes empirical auto-tuning. Similarly, architectures are becoming increasingly complicated, making it hard to model performance. In this paper, we focus on the challenge to auto-tuning presented by applications with a large number of kernels and kernel instantiations. While these kernels may share a somewhat similar pattern, they differ considerably in problem sizes and the exact computation performed.more » We propose and evaluate a new approach to auto-tuning which we refer to as parameterized micro-benchmarking. It is an alternative to the two existing classes of approaches to auto-tuning: analytical model-based and empirical search-based. Particularly, we argue that the former may not be able to capture all the architectural features that impact performance, whereas the latter might be too expensive for an application that has several different kernels. In our approach, different expressions in the application, different possible implementations of each expression, and the key architectural features, are used to derive a simple micro-benchmark and a small parameter space. This allows us to learn the most significant features of the architecture that can impact the choice of implementation for each kernel. We have evaluated our approach in the context of GPU implementations of tensor contraction expressions encountered in excited state calculations in quantum chemistry. We have focused on two aspects of GPUs that affect tensor contraction execution: memory access patterns and kernel consolidation. Using our parameterized micro-benchmarking approach, we obtain a speedup of up to 2 over the version that used default optimizations, but no auto-tuning. We demonstrate that observations made from microbenchmarks match the behavior seen from real expressions. In the process, we make important observations about the memory hierarchy of two of the most recent NVIDIA GPUs, which can be used in other optimization frameworks as well.« less

  20. fRMSDPred: Predicting Local RMSD Between Structural Fragments Using Sequence Information

    DTIC Science & Technology

    2007-04-04

    machine learning approaches for estimating the RMSD value of a pair of protein fragments. These estimated fragment-level RMSD values can be used to construct the alignment, assess the quality of an alignment, and identify high-quality alignment segments. We present algorithms to solve this fragment-level RMSD prediction problem using a supervised learning framework based on support vector regression and classification that incorporates protein profiles, predicted secondary structure, effective information encoding schemes, and novel second-order pairwise exponential kernel

  1. Nonlinear PET parametric image reconstruction with MRI information using kernel method

    NASA Astrophysics Data System (ADS)

    Gong, Kuang; Wang, Guobao; Chen, Kevin T.; Catana, Ciprian; Qi, Jinyi

    2017-03-01

    Positron Emission Tomography (PET) is a functional imaging modality widely used in oncology, cardiology, and neurology. It is highly sensitive, but suffers from relatively poor spatial resolution, as compared with anatomical imaging modalities, such as magnetic resonance imaging (MRI). With the recent development of combined PET/MR systems, we can improve the PET image quality by incorporating MR information. Previously we have used kernel learning to embed MR information in static PET reconstruction and direct Patlak reconstruction. Here we extend this method to direct reconstruction of nonlinear parameters in a compartment model by using the alternating direction of multiplier method (ADMM) algorithm. Simulation studies show that the proposed method can produce superior parametric images compared with existing methods.

  2. Infrared small target detection with kernel Fukunaga Koontz transform

    NASA Astrophysics Data System (ADS)

    Liu, Rui-ming; Liu, Er-qi; Yang, Jie; Zhang, Tian-hao; Wang, Fang-lin

    2007-09-01

    The Fukunaga-Koontz transform (FKT) has been proposed for many years. It can be used to solve two-pattern classification problems successfully. However, there are few researchers who have definitely extended FKT to kernel FKT (KFKT). In this paper, we first complete this task. Then a method based on KFKT is developed to detect infrared small targets. KFKT is a supervised learning algorithm. How to construct training sets is very important. For automatically detecting targets, the synthetic target images and real background images are used to train KFKT. Because KFKT can represent the higher order statistical properties of images, we expect better detection performance of KFKT than that of FKT. The well-devised experiments verify that KFKT outperforms FKT in detecting infrared small targets.

  3. A multiple-feature and multiple-kernel scene segmentation algorithm for humanoid robot.

    PubMed

    Liu, Zhi; Xu, Shuqiong; Zhang, Yun; Chen, Chun Lung Philip

    2014-11-01

    This technical correspondence presents a multiple-feature and multiple-kernel support vector machine (MFMK-SVM) methodology to achieve a more reliable and robust segmentation performance for humanoid robot. The pixel wise intensity, gradient, and C1 SMF features are extracted via the local homogeneity model and Gabor filter, which would be used as inputs of MFMK-SVM model. It may provide multiple features of the samples for easier implementation and efficient computation of MFMK-SVM model. A new clustering method, which is called feature validity-interval type-2 fuzzy C-means (FV-IT2FCM) clustering algorithm, is proposed by integrating a type-2 fuzzy criterion in the clustering optimization process to improve the robustness and reliability of clustering results by the iterative optimization. Furthermore, the clustering validity is employed to select the training samples for the learning of the MFMK-SVM model. The MFMK-SVM scene segmentation method is able to fully take advantage of the multiple features of scene image and the ability of multiple kernels. Experiments on the BSDS dataset and real natural scene images demonstrate the superior performance of our proposed method.

  4. Correlated Topic Vector for Scene Classification.

    PubMed

    Wei, Pengxu; Qin, Fei; Wan, Fang; Zhu, Yi; Jiao, Jianbin; Ye, Qixiang

    2017-07-01

    Scene images usually involve semantic correlations, particularly when considering large-scale image data sets. This paper proposes a novel generative image representation, correlated topic vector, to model such semantic correlations. Oriented from the correlated topic model, correlated topic vector intends to naturally utilize the correlations among topics, which are seldom considered in the conventional feature encoding, e.g., Fisher vector, but do exist in scene images. It is expected that the involvement of correlations can increase the discriminative capability of the learned generative model and consequently improve the recognition accuracy. Incorporated with the Fisher kernel method, correlated topic vector inherits the advantages of Fisher vector. The contributions to the topics of visual words have been further employed by incorporating the Fisher kernel framework to indicate the differences among scenes. Combined with the deep convolutional neural network (CNN) features and Gibbs sampling solution, correlated topic vector shows great potential when processing large-scale and complex scene image data sets. Experiments on two scene image data sets demonstrate that correlated topic vector improves significantly the deep CNN features, and outperforms existing Fisher kernel-based features.

  5. Kernel-Based Relevance Analysis with Enhanced Interpretability for Detection of Brain Activity Patterns

    PubMed Central

    Alvarez-Meza, Andres M.; Orozco-Gutierrez, Alvaro; Castellanos-Dominguez, German

    2017-01-01

    We introduce Enhanced Kernel-based Relevance Analysis (EKRA) that aims to support the automatic identification of brain activity patterns using electroencephalographic recordings. EKRA is a data-driven strategy that incorporates two kernel functions to take advantage of the available joint information, associating neural responses to a given stimulus condition. Regarding this, a Centered Kernel Alignment functional is adjusted to learning the linear projection that best discriminates the input feature set, optimizing the required free parameters automatically. Our approach is carried out in two scenarios: (i) feature selection by computing a relevance vector from extracted neural features to facilitating the physiological interpretation of a given brain activity task, and (ii) enhanced feature selection to perform an additional transformation of relevant features aiming to improve the overall identification accuracy. Accordingly, we provide an alternative feature relevance analysis strategy that allows improving the system performance while favoring the data interpretability. For the validation purpose, EKRA is tested in two well-known tasks of brain activity: motor imagery discrimination and epileptic seizure detection. The obtained results show that the EKRA approach estimates a relevant representation space extracted from the provided supervised information, emphasizing the salient input features. As a result, our proposal outperforms the state-of-the-art methods regarding brain activity discrimination accuracy with the benefit of enhanced physiological interpretation about the task at hand. PMID:29056897

  6. Initial Kernel Timing Using a Simple PIM Performance Model

    NASA Technical Reports Server (NTRS)

    Katz, Daniel S.; Block, Gary L.; Springer, Paul L.; Sterling, Thomas; Brockman, Jay B.; Callahan, David

    2005-01-01

    This presentation will describe some initial results of paper-and-pencil studies of 4 or 5 application kernels applied to a processor-in-memory (PIM) system roughly similar to the Cascade Lightweight Processor (LWP). The application kernels are: * Linked list traversal * Sun of leaf nodes on a tree * Bitonic sort * Vector sum * Gaussian elimination The intent of this work is to guide and validate work on the Cascade project in the areas of compilers, simulators, and languages. We will first discuss the generic PIM structure. Then, we will explain the concepts needed to program a parallel PIM system (locality, threads, parcels). Next, we will present a simple PIM performance model that will be used in the remainder of the presentation. For each kernel, we will then present a set of codes, including codes for a single PIM node, and codes for multiple PIM nodes that move data to threads and move threads to data. These codes are written at a fairly low level, between assembly and C, but much closer to C than to assembly. For each code, we will present some hand-drafted timing forecasts, based on the simple PIM performance model. Finally, we will conclude by discussing what we have learned from this work, including what programming styles seem to work best, from the point-of-view of both expressiveness and performance.

  7. Semisupervised learning using Bayesian interpretation: application to LS-SVM.

    PubMed

    Adankon, Mathias M; Cheriet, Mohamed; Biem, Alain

    2011-04-01

    Bayesian reasoning provides an ideal basis for representing and manipulating uncertain knowledge, with the result that many interesting algorithms in machine learning are based on Bayesian inference. In this paper, we use the Bayesian approach with one and two levels of inference to model the semisupervised learning problem and give its application to the successful kernel classifier support vector machine (SVM) and its variant least-squares SVM (LS-SVM). Taking advantage of Bayesian interpretation of LS-SVM, we develop a semisupervised learning algorithm for Bayesian LS-SVM using our approach based on two levels of inference. Experimental results on both artificial and real pattern recognition problems show the utility of our method.

  8. Idea Swap: The Best Ideas Come from Teachers Like You!

    ERIC Educational Resources Information Center

    Instructor, 2006

    2006-01-01

    This article presents several activities and teaching ideas shared by teachers. One teacher shared how an Egyptian mummy-making activity can be a great hands-on learning through time. Another teacher shared how a bowl filled with popcorn kernels has made it easy for her to get the attention of her students.

  9. Boundary Kernel Estimation of the Two Sample Comparison Density Function

    DTIC Science & Technology

    1989-05-01

    not for the understand- ing, love, and steadfast support of my wife, Catheryn . She supported my move to statistics a mere fortnight after we were...school one learns things of a narrow and technical nature; 0 Catheryn has shown me much of what is fundamentally true and important in this world. To her

  10. An intelligent fault diagnosis method of rolling bearings based on regularized kernel Marginal Fisher analysis

    NASA Astrophysics Data System (ADS)

    Jiang, Li; Shi, Tielin; Xuan, Jianping

    2012-05-01

    Generally, the vibration signals of fault bearings are non-stationary and highly nonlinear under complicated operating conditions. Thus, it's a big challenge to extract optimal features for improving classification and simultaneously decreasing feature dimension. Kernel Marginal Fisher analysis (KMFA) is a novel supervised manifold learning algorithm for feature extraction and dimensionality reduction. In order to avoid the small sample size problem in KMFA, we propose regularized KMFA (RKMFA). A simple and efficient intelligent fault diagnosis method based on RKMFA is put forward and applied to fault recognition of rolling bearings. So as to directly excavate nonlinear features from the original high-dimensional vibration signals, RKMFA constructs two graphs describing the intra-class compactness and the inter-class separability, by combining traditional manifold learning algorithm with fisher criteria. Therefore, the optimal low-dimensional features are obtained for better classification and finally fed into the simplest K-nearest neighbor (KNN) classifier to recognize different fault categories of bearings. The experimental results demonstrate that the proposed approach improves the fault classification performance and outperforms the other conventional approaches.

  11. Surface electromyography based muscle fatigue detection using high-resolution time-frequency methods and machine learning algorithms.

    PubMed

    Karthick, P A; Ghosh, Diptasree Maitra; Ramakrishnan, S

    2018-02-01

    Surface electromyography (sEMG) based muscle fatigue research is widely preferred in sports science and occupational/rehabilitation studies due to its noninvasiveness. However, these signals are complex, multicomponent and highly nonstationary with large inter-subject variations, particularly during dynamic contractions. Hence, time-frequency based machine learning methodologies can improve the design of automated system for these signals. In this work, the analysis based on high-resolution time-frequency methods, namely, Stockwell transform (S-transform), B-distribution (BD) and extended modified B-distribution (EMBD) are proposed to differentiate the dynamic muscle nonfatigue and fatigue conditions. The nonfatigue and fatigue segments of sEMG signals recorded from the biceps brachii of 52 healthy volunteers are preprocessed and subjected to S-transform, BD and EMBD. Twelve features are extracted from each method and prominent features are selected using genetic algorithm (GA) and binary particle swarm optimization (BPSO). Five machine learning algorithms, namely, naïve Bayes, support vector machine (SVM) of polynomial and radial basis kernel, random forest and rotation forests are used for the classification. The results show that all the proposed time-frequency distributions (TFDs) are able to show the nonstationary variations of sEMG signals. Most of the features exhibit statistically significant difference in the muscle fatigue and nonfatigue conditions. The maximum number of features (66%) is reduced by GA and BPSO for EMBD and BD-TFD respectively. The combination of EMBD- polynomial kernel based SVM is found to be most accurate (91% accuracy) in classifying the conditions with the features selected using GA. The proposed methods are found to be capable of handling the nonstationary and multicomponent variations of sEMG signals recorded in dynamic fatiguing contractions. Particularly, the combination of EMBD- polynomial kernel based SVM could be used to detect the dynamic muscle fatigue conditions. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Sorafenib Tosylate and Chemotherapy in Treating Older Patients With Acute Myeloid Leukemia

    ClinicalTrials.gov

    2018-06-01

    Acute Myeloid Leukemia; Acute Myeloid Leukemia (Megakaryoblastic) With t(1;22)(p13.3;q13.3); RBM15-MKL1; Acute Myeloid Leukemia With a Variant RARA Translocation; Acute Myeloid Leukemia With Inv(3) (q21.3;q26.2) or t(3;3) (q21.3;q26.2); GATA2, MECOM; Acute Myeloid Leukemia With t(6;9) (p23;q34.1); DEK-NUP214; Acute Myeloid Leukemia With t(9;11)(p22.3;q23.3); MLLT3-KMT2A; Acute Myeloid Leukemia With Variant MLL Translocations; Untreated Adult Acute Myeloid Leukemia

  13. Boiler Control Systems Oxygen Trim Systems Manual.

    DTIC Science & Technology

    1983-02-01

    F/ G 5/9 N EhhhhhhhmonsoI smhhhEmhhhhhh smhhhhhhhhhhh Ehh0h0010I-E1E FEmhhhhEmmhhhhhhI LL+4 -.Lm(m~mkl~~d - - ." " - , -, - - ’ - °. ,,_ ". . . b...PAGIES Ik NNITrG A9NC ORS7 Ir- e, 0) 1 SECURITY CLASS. (of K4* MPMf) Office, g &, Washington, DC 20360 Unclassified Naval Facilities Engineering Coand...1NPLJT SoNAI . S141NAL. f* FIR4GIb[CTIQNAL SCkEMIATC DICIAN A’ RLfL FOSITIOWINC- SY<STEM1 (FUUMArIC) PRESSURE PROC.ESS PIIN42 ATI 6 C~rEMASTERO F13UE

  14. PIPE: a protein–protein interaction passage extraction module for BioCreative challenge

    PubMed Central

    Chu, Chun-Han; Su, Yu-Chen; Chen, Chien Chin; Hsu, Wen-Lian

    2016-01-01

    Identifying the interactions between proteins mentioned in biomedical literatures is one of the frequently discussed topics of text mining in the life science field. In this article, we propose PIPE, an interaction pattern generation module used in the Collaborative Biocurator Assistant Task at BioCreative V (http://www.biocreative.org/) to capture frequent protein-protein interaction (PPI) patterns within text. We also present an interaction pattern tree (IPT) kernel method that integrates the PPI patterns with convolution tree kernel (CTK) to extract PPIs. Methods were evaluated on LLL, IEPA, HPRD50, AIMed and BioInfer corpora using cross-validation, cross-learning and cross-corpus evaluation. Empirical evaluations demonstrate that our method is effective and outperforms several well-known PPI extraction methods. Database URL: PMID:27524807

  15. Is extreme learning machine feasible? A theoretical assessment (part II).

    PubMed

    Lin, Shaobo; Liu, Xia; Fang, Jian; Xu, Zongben

    2015-01-01

    An extreme learning machine (ELM) can be regarded as a two-stage feed-forward neural network (FNN) learning system that randomly assigns the connections with and within hidden neurons in the first stage and tunes the connections with output neurons in the second stage. Therefore, ELM training is essentially a linear learning problem, which significantly reduces the computational burden. Numerous applications show that such a computation burden reduction does not degrade the generalization capability. It has, however, been open that whether this is true in theory. The aim of this paper is to study the theoretical feasibility of ELM by analyzing the pros and cons of ELM. In the previous part of this topic, we pointed out that via appropriately selected activation functions, ELM does not degrade the generalization capability in the sense of expectation. In this paper, we launch the study in a different direction and show that the randomness of ELM also leads to certain negative consequences. On one hand, we find that the randomness causes an additional uncertainty problem of ELM, both in approximation and learning. On the other hand, we theoretically justify that there also exist activation functions such that the corresponding ELM degrades the generalization capability. In particular, we prove that the generalization capability of ELM with Gaussian kernel is essentially worse than that of FNN with Gaussian kernel. To facilitate the use of ELM, we also provide a remedy to such a degradation. We find that the well-developed coefficient regularization technique can essentially improve the generalization capability. The obtained results reveal the essential characteristic of ELM in a certain sense and give theoretical guidance concerning how to use ELM.

  16. Kernel abortion in maize : I. Carbohydrate concentration patterns and Acid invertase activity of maize kernels induced to abort in vitro.

    PubMed

    Hanft, J M; Jones, R J

    1986-06-01

    Kernels cultured in vitro were induced to abort by high temperature (35 degrees C) and by culturing six kernels/cob piece. Aborting kernels failed to enter a linear phase of dry mass accumulation and had a final mass that was less than 6% of nonaborting field-grown kernels. Kernels induced to abort by high temperature failed to synthesize starch in the endosperm and had elevated sucrose concentrations and low fructose and glucose concentrations in the pedicel during early growth compared to nonaborting kernels. Kernels induced to abort by high temperature also had much lower pedicel soluble acid invertase activities than did nonaborting kernels. These results suggest that high temperature during the lag phase of kernel growth may impair the process of sucrose unloading in the pedicel by indirectly inhibiting soluble acid invertase activity and prevent starch synthesis in the endosperm. Kernels induced to abort by culturing six kernels/cob piece had reduced pedicel fructose, glucose, and sucrose concentrations compared to kernels from field-grown ears. These aborting kernels also had a lower pedicel soluble acid invertase activity compared to nonaborting kernels from the same cob piece and from field-grown ears. The low invertase activity in pedicel tissue of the aborting kernels was probably caused by a lack of substrate (sucrose) for the invertase to cleave due to the intense competition for available assimilates. In contrast to kernels cultured at 35 degrees C, aborting kernels from cob pieces containing all six kernels accumulated starch in a linear fashion. These results indicate that kernels cultured six/cob piece abort because of an inadequate supply of sugar and are similar to apical kernels from field-grown ears that often abort prior to the onset of linear growth.

  17. DRREP: deep ridge regressed epitope predictor.

    PubMed

    Sher, Gene; Zhi, Degui; Zhang, Shaojie

    2017-10-03

    The ability to predict epitopes plays an enormous role in vaccine development in terms of our ability to zero in on where to do a more thorough in-vivo analysis of the protein in question. Though for the past decade there have been numerous advancements and improvements in epitope prediction, on average the best benchmark prediction accuracies are still only around 60%. New machine learning algorithms have arisen within the domain of deep learning, text mining, and convolutional networks. This paper presents a novel analytically trained and string kernel using deep neural network, which is tailored for continuous epitope prediction, called: Deep Ridge Regressed Epitope Predictor (DRREP). DRREP was tested on long protein sequences from the following datasets: SARS, Pellequer, HIV, AntiJen, and SEQ194. DRREP was compared to numerous state of the art epitope predictors, including the most recently published predictors called LBtope and DMNLBE. Using area under ROC curve (AUC), DRREP achieved a performance improvement over the best performing predictors on SARS (13.7%), HIV (8.9%), Pellequer (1.5%), and SEQ194 (3.1%), with its performance being matched only on the AntiJen dataset, by the LBtope predictor, where both DRREP and LBtope achieved an AUC of 0.702. DRREP is an analytically trained deep neural network, thus capable of learning in a single step through regression. By combining the features of deep learning, string kernels, and convolutional networks, the system is able to perform residue-by-residue prediction of continues epitopes with higher accuracy than the current state of the art predictors.

  18. Training set expansion: an approach to improving the reconstruction of biological networks from limited and uneven reliable interactions

    PubMed Central

    Yip, Kevin Y.; Gerstein, Mark

    2009-01-01

    Motivation: An important problem in systems biology is reconstructing complete networks of interactions between biological objects by extrapolating from a few known interactions as examples. While there are many computational techniques proposed for this network reconstruction task, their accuracy is consistently limited by the small number of high-confidence examples, and the uneven distribution of these examples across the potential interaction space, with some objects having many known interactions and others few. Results: To address this issue, we propose two computational methods based on the concept of training set expansion. They work particularly effectively in conjunction with kernel approaches, which are a popular class of approaches for fusing together many disparate types of features. Both our methods are based on semi-supervised learning and involve augmenting the limited number of gold-standard training instances with carefully chosen and highly confident auxiliary examples. The first method, prediction propagation, propagates highly confident predictions of one local model to another as the auxiliary examples, thus learning from information-rich regions of the training network to help predict the information-poor regions. The second method, kernel initialization, takes the most similar and most dissimilar objects of each object in a global kernel as the auxiliary examples. Using several sets of experimentally verified protein–protein interactions from yeast, we show that training set expansion gives a measurable performance gain over a number of representative, state-of-the-art network reconstruction methods, and it can correctly identify some interactions that are ranked low by other methods due to the lack of training examples of the involved proteins. Contact: mark.gerstein@yale.edu Availability: The datasets and additional materials can be found at http://networks.gersteinlab.org/tse. PMID:19015141

  19. Classifying Lower Extremity Muscle Fatigue during Walking using Machine Learning and Inertial Sensors

    PubMed Central

    Zhang, Jian; Lockhart, Thurmon E.; Soangra, Rahul

    2013-01-01

    Fatigue in lower extremity musculature is associated with decline in postural stability, motor performance and alters normal walking patterns in human subjects. Automated recognition of lower extremity muscle fatigue condition may be advantageous in early detection of fall and injury risks. Supervised machine learning methods such as Support Vector Machines (SVM) have been previously used for classifying healthy and pathological gait patterns and also for separating old and young gait patterns. In this study we explore the classification potential of SVM in recognition of gait patterns utilizing an inertial measurement unit associated with lower extremity muscular fatigue. Both kinematic and kinetic gait patterns of 17 participants (29±11 years) were recorded and analyzed in normal and fatigued state of walking. Lower extremities were fatigued by performance of a squatting exercise until the participants reached 60% of their baseline maximal voluntary exertion level. Feature selection methods were used to classify fatigue and no-fatigue conditions based on temporal and frequency information of the signals. Additionally, influences of three different kernel schemes (i.e., linear, polynomial, and radial basis function) were investigated for SVM classification. The results indicated that lower extremity muscle fatigue condition influenced gait and loading responses. In terms of the SVM classification results, an accuracy of 96% was reached in distinguishing the two gait patterns (fatigue and no-fatigue) within the same subject using the kinematic, time and frequency domain features. It is also found that linear kernel and RBF kernel were equally good to identify intra-individual fatigue characteristics. These results suggest that intra-subject fatigue classification using gait patterns from an inertial sensor holds considerable potential in identifying “at-risk” gait due to muscle fatigue. PMID:24081829

  20. Willingness to Communicate in the Second Language: Understanding the Decision to Speak as a Volitional Process

    ERIC Educational Resources Information Center

    MacIntyre, Peter D.

    2007-01-01

    Previous research has devoted a great deal of attention to describing the long-term patterns and relationships among trait-level or situation-specific variables. The present discussion extracts kernels of wisdom, based on the literatures on language anxiety and language learning motivation, that are used to frame the argument that choosing to…

  1. 7 CFR 810.602 - Definition of other terms.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ...) Damaged kernels. Kernels and pieces of flaxseed kernels that are badly ground-damaged, badly weather... instructions. Also, underdeveloped, shriveled, and small pieces of flaxseed kernels removed in properly... recleaning. (c) Heat-damaged kernels. Kernels and pieces of flaxseed kernels that are materially discolored...

  2. Kernel Abortion in Maize 1

    PubMed Central

    Hanft, Jonathan M.; Jones, Robert J.

    1986-01-01

    Kernels cultured in vitro were induced to abort by high temperature (35°C) and by culturing six kernels/cob piece. Aborting kernels failed to enter a linear phase of dry mass accumulation and had a final mass that was less than 6% of nonaborting field-grown kernels. Kernels induced to abort by high temperature failed to synthesize starch in the endosperm and had elevated sucrose concentrations and low fructose and glucose concentrations in the pedicel during early growth compared to nonaborting kernels. Kernels induced to abort by high temperature also had much lower pedicel soluble acid invertase activities than did nonaborting kernels. These results suggest that high temperature during the lag phase of kernel growth may impair the process of sucrose unloading in the pedicel by indirectly inhibiting soluble acid invertase activity and prevent starch synthesis in the endosperm. Kernels induced to abort by culturing six kernels/cob piece had reduced pedicel fructose, glucose, and sucrose concentrations compared to kernels from field-grown ears. These aborting kernels also had a lower pedicel soluble acid invertase activity compared to nonaborting kernels from the same cob piece and from field-grown ears. The low invertase activity in pedicel tissue of the aborting kernels was probably caused by a lack of substrate (sucrose) for the invertase to cleave due to the intense competition for available assimilates. In contrast to kernels cultured at 35°C, aborting kernels from cob pieces containing all six kernels accumulated starch in a linear fashion. These results indicate that kernels cultured six/cob piece abort because of an inadequate supply of sugar and are similar to apical kernels from field-grown ears that often abort prior to the onset of linear growth. PMID:16664846

  3. Multivariate temporal dictionary learning for EEG.

    PubMed

    Barthélemy, Q; Gouy-Pailler, C; Isaac, Y; Souloumiac, A; Larue, A; Mars, J I

    2013-04-30

    This article addresses the issue of representing electroencephalographic (EEG) signals in an efficient way. While classical approaches use a fixed Gabor dictionary to analyze EEG signals, this article proposes a data-driven method to obtain an adapted dictionary. To reach an efficient dictionary learning, appropriate spatial and temporal modeling is required. Inter-channels links are taken into account in the spatial multivariate model, and shift-invariance is used for the temporal model. Multivariate learned kernels are informative (a few atoms code plentiful energy) and interpretable (the atoms can have a physiological meaning). Using real EEG data, the proposed method is shown to outperform the classical multichannel matching pursuit used with a Gabor dictionary, as measured by the representative power of the learned dictionary and its spatial flexibility. Moreover, dictionary learning can capture interpretable patterns: this ability is illustrated on real data, learning a P300 evoked potential. Copyright © 2013 Elsevier B.V. All rights reserved.

  4. 7 CFR 810.1202 - Definition of other terms.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... kernels. Kernels, pieces of rye kernels, and other grains that are badly ground-damaged, badly weather.... Also, underdeveloped, shriveled, and small pieces of rye kernels removed in properly separating the...-damaged kernels. Kernels, pieces of rye kernels, and other grains that are materially discolored and...

  5. The Genetic Basis of Natural Variation in Kernel Size and Related Traits Using a Four-Way Cross Population in Maize.

    PubMed

    Chen, Jiafa; Zhang, Luyan; Liu, Songtao; Li, Zhimin; Huang, Rongrong; Li, Yongming; Cheng, Hongliang; Li, Xiantang; Zhou, Bo; Wu, Suowei; Chen, Wei; Wu, Jianyu; Ding, Junqiang

    2016-01-01

    Kernel size is an important component of grain yield in maize breeding programs. To extend the understanding on the genetic basis of kernel size traits (i.e., kernel length, kernel width and kernel thickness), we developed a set of four-way cross mapping population derived from four maize inbred lines with varied kernel sizes. In the present study, we investigated the genetic basis of natural variation in seed size and other components of maize yield (e.g., hundred kernel weight, number of rows per ear, number of kernels per row). In total, ten QTL affecting kernel size were identified, three of which (two for kernel length and one for kernel width) had stable expression in other components of maize yield. The possible genetic mechanism behind the trade-off of kernel size and yield components was discussed.

  6. The Genetic Basis of Natural Variation in Kernel Size and Related Traits Using a Four-Way Cross Population in Maize

    PubMed Central

    Liu, Songtao; Li, Zhimin; Huang, Rongrong; Li, Yongming; Cheng, Hongliang; Li, Xiantang; Zhou, Bo; Wu, Suowei; Chen, Wei; Wu, Jianyu; Ding, Junqiang

    2016-01-01

    Kernel size is an important component of grain yield in maize breeding programs. To extend the understanding on the genetic basis of kernel size traits (i.e., kernel length, kernel width and kernel thickness), we developed a set of four-way cross mapping population derived from four maize inbred lines with varied kernel sizes. In the present study, we investigated the genetic basis of natural variation in seed size and other components of maize yield (e.g., hundred kernel weight, number of rows per ear, number of kernels per row). In total, ten QTL affecting kernel size were identified, three of which (two for kernel length and one for kernel width) had stable expression in other components of maize yield. The possible genetic mechanism behind the trade-off of kernel size and yield components was discussed. PMID:27070143

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Collins, Benjamin S.

    The Futility package contains the following: 1) Definition of the size of integers and real numbers; 2) A generic Unit test harness; 3) Definitions for some basic extensions to the Fortran language: arbitrary length strings, a parameter list construct, exception handlers, command line processor, timers; 4) Geometry definitions: point, line, plane, box, cylinder, polyhedron; 5) File wrapper functions: standard Fortran input/output files, Fortran binary files, HDF5 files; 6) Parallel wrapper functions: MPI, and Open MP abstraction layers, partitioning algorithms; 7) Math utilities: BLAS, Matrix and Vector definitions, Linear Solver methods and wrappers for other TPLs (PETSC, MKL, etc), preconditioner classes;more » 8) Misc: random number generator, water saturation properties, sorting algorithms.« less

  8. 7 CFR 810.802 - Definition of other terms.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ...) Damaged kernels. Kernels and pieces of grain kernels for which standards have been established under the.... (d) Heat-damaged kernels. Kernels and pieces of grain kernels for which standards have been...

  9. 7 CFR 981.408 - Inedible kernel.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... kernel is modified to mean a kernel, piece, or particle of almond kernel with any defect scored as... purposes of determining inedible kernels, pieces, or particles of almond kernels. [59 FR 39419, Aug. 3...

  10. 7 CFR 981.408 - Inedible kernel.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... kernel is modified to mean a kernel, piece, or particle of almond kernel with any defect scored as... purposes of determining inedible kernels, pieces, or particles of almond kernels. [59 FR 39419, Aug. 3...

  11. 7 CFR 981.408 - Inedible kernel.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... kernel is modified to mean a kernel, piece, or particle of almond kernel with any defect scored as... purposes of determining inedible kernels, pieces, or particles of almond kernels. [59 FR 39419, Aug. 3...

  12. 7 CFR 981.408 - Inedible kernel.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... kernel is modified to mean a kernel, piece, or particle of almond kernel with any defect scored as... purposes of determining inedible kernels, pieces, or particles of almond kernels. [59 FR 39419, Aug. 3...

  13. Batched matrix computations on hardware accelerators based on GPUs

    DOE PAGES

    Haidar, Azzam; Dong, Tingxing; Luszczek, Piotr; ...

    2015-02-09

    Scientific applications require solvers that work on many small size problems that are independent from each other. At the same time, the high-end hardware evolves rapidly and becomes ever more throughput-oriented and thus there is an increasing need for an effective approach to develop energy-efficient, high-performance codes for these small matrix problems that we call batched factorizations. The many applications that need this functionality could especially benefit from the use of GPUs, which currently are four to five times more energy efficient than multicore CPUs on important scientific workloads. This study, consequently, describes the development of the most common, one-sidedmore » factorizations, Cholesky, LU, and QR, for a set of small dense matrices. The algorithms we present together with their implementations are, by design, inherently parallel. In particular, our approach is based on representing the process as a sequence of batched BLAS routines that are executed entirely on a GPU. Importantly, this is unlike the LAPACK and the hybrid MAGMA factorization algorithms that work under drastically different assumptions of hardware design and efficiency of execution of the various computational kernels involved in the implementation. Thus, our approach is more efficient than what works for a combination of multicore CPUs and GPUs for the problems sizes of interest of the application use cases. The paradigm where upon a single chip (a GPU or a CPU) factorizes a single problem at a time is not at all efficient in our applications’ context. We illustrate all of these claims through a detailed performance analysis. With the help of profiling and tracing tools, we guide our development of batched factorizations to achieve up to two-fold speedup and three-fold better energy efficiency as compared against our highly optimized batched CPU implementations based on MKL library. Finally, the tested system featured two sockets of Intel Sandy Bridge CPUs and we compared with a batched LU factorizations featured in the CUBLAS library for GPUs, we achieve as high as 2.5× speedup on the NVIDIA K40 GPU.« less

  14. Distance Metric Learning via Iterated Support Vector Machines.

    PubMed

    Zuo, Wangmeng; Wang, Faqiang; Zhang, David; Lin, Liang; Huang, Yuchi; Meng, Deyu; Zhang, Lei

    2017-07-11

    Distance metric learning aims to learn from the given training data a valid distance metric, with which the similarity between data samples can be more effectively evaluated for classification. Metric learning is often formulated as a convex or nonconvex optimization problem, while most existing methods are based on customized optimizers and become inefficient for large scale problems. In this paper, we formulate metric learning as a kernel classification problem with the positive semi-definite constraint, and solve it by iterated training of support vector machines (SVMs). The new formulation is easy to implement and efficient in training with the off-the-shelf SVM solvers. Two novel metric learning models, namely Positive-semidefinite Constrained Metric Learning (PCML) and Nonnegative-coefficient Constrained Metric Learning (NCML), are developed. Both PCML and NCML can guarantee the global optimality of their solutions. Experiments are conducted on general classification, face verification and person re-identification to evaluate our methods. Compared with the state-of-the-art approaches, our methods can achieve comparable classification accuracy and are efficient in training.

  15. Prediction of heterotrimeric protein complexes by two-phase learning using neighboring kernels

    PubMed Central

    2014-01-01

    Background Protein complexes play important roles in biological systems such as gene regulatory networks and metabolic pathways. Most methods for predicting protein complexes try to find protein complexes with size more than three. It, however, is known that protein complexes with smaller sizes occupy a large part of whole complexes for several species. In our previous work, we developed a method with several feature space mappings and the domain composition kernel for prediction of heterodimeric protein complexes, which outperforms existing methods. Results We propose methods for prediction of heterotrimeric protein complexes by extending techniques in the previous work on the basis of the idea that most heterotrimeric protein complexes are not likely to share the same protein with each other. We make use of the discriminant function in support vector machines (SVMs), and design novel feature space mappings for the second phase. As the second classifier, we examine SVMs and relevance vector machines (RVMs). We perform 10-fold cross-validation computational experiments. The results suggest that our proposed two-phase methods and SVM with the extended features outperform the existing method NWE, which was reported to outperform other existing methods such as MCL, MCODE, DPClus, CMC, COACH, RRW, and PPSampler for prediction of heterotrimeric protein complexes. Conclusions We propose two-phase prediction methods with the extended features, the domain composition kernel, SVMs and RVMs. The two-phase method with the extended features and the domain composition kernel using SVM as the second classifier is particularly useful for prediction of heterotrimeric protein complexes. PMID:24564744

  16. Gabor-based kernel PCA with fractional power polynomial models for face recognition.

    PubMed

    Liu, Chengjun

    2004-05-01

    This paper presents a novel Gabor-based kernel Principal Component Analysis (PCA) method by integrating the Gabor wavelet representation of face images and the kernel PCA method for face recognition. Gabor wavelets first derive desirable facial features characterized by spatial frequency, spatial locality, and orientation selectivity to cope with the variations due to illumination and facial expression changes. The kernel PCA method is then extended to include fractional power polynomial models for enhanced face recognition performance. A fractional power polynomial, however, does not necessarily define a kernel function, as it might not define a positive semidefinite Gram matrix. Note that the sigmoid kernels, one of the three classes of widely used kernel functions (polynomial kernels, Gaussian kernels, and sigmoid kernels), do not actually define a positive semidefinite Gram matrix either. Nevertheless, the sigmoid kernels have been successfully used in practice, such as in building support vector machines. In order to derive real kernel PCA features, we apply only those kernel PCA eigenvectors that are associated with positive eigenvalues. The feasibility of the Gabor-based kernel PCA method with fractional power polynomial models has been successfully tested on both frontal and pose-angled face recognition, using two data sets from the FERET database and the CMU PIE database, respectively. The FERET data set contains 600 frontal face images of 200 subjects, while the PIE data set consists of 680 images across five poses (left and right profiles, left and right half profiles, and frontal view) with two different facial expressions (neutral and smiling) of 68 subjects. The effectiveness of the Gabor-based kernel PCA method with fractional power polynomial models is shown in terms of both absolute performance indices and comparative performance against the PCA method, the kernel PCA method with polynomial kernels, the kernel PCA method with fractional power polynomial models, the Gabor wavelet-based PCA method, and the Gabor wavelet-based kernel PCA method with polynomial kernels.

  17. Nowcasting Cloud Fields for U.S. Air Force Special Operations

    DTIC Science & Technology

    2017-03-01

    application of Bayes’ Rule offers many advantages over Kernel Density Estimation (KDE) and other commonly used statistical post-processing methods...reflectance and probability of cloud. A statistical post-processing technique is applied using Bayesian estimation to train the system from a set of past...nowcasting, low cloud forecasting, cloud reflectance, ISR, Bayesian estimation, statistical post-processing, machine learning 15. NUMBER OF PAGES

  18. 7 CFR 981.7 - Edible kernel.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Edible kernel. 981.7 Section 981.7 Agriculture... Regulating Handling Definitions § 981.7 Edible kernel. Edible kernel means a kernel, piece, or particle of almond kernel that is not inedible. [41 FR 26852, June 30, 1976] ...

  19. Kernel K-Means Sampling for Nyström Approximation.

    PubMed

    He, Li; Zhang, Hong

    2018-05-01

    A fundamental problem in Nyström-based kernel matrix approximation is the sampling method by which training set is built. In this paper, we suggest to use kernel -means sampling, which is shown in our works to minimize the upper bound of a matrix approximation error. We first propose a unified kernel matrix approximation framework, which is able to describe most existing Nyström approximations under many popular kernels, including Gaussian kernel and polynomial kernel. We then show that, the matrix approximation error upper bound, in terms of the Frobenius norm, is equal to the -means error of data points in kernel space plus a constant. Thus, the -means centers of data in kernel space, or the kernel -means centers, are the optimal representative points with respect to the Frobenius norm error upper bound. Experimental results, with both Gaussian kernel and polynomial kernel, on real-world data sets and image segmentation tasks show the superiority of the proposed method over the state-of-the-art methods.

  20. Optimization of the kernel functions in a probabilistic neural network analyzing the local pattern distribution.

    PubMed

    Galleske, I; Castellanos, J

    2002-05-01

    This article proposes a procedure for the automatic determination of the elements of the covariance matrix of the gaussian kernel function of probabilistic neural networks. Two matrices, a rotation matrix and a matrix of variances, can be calculated by analyzing the local environment of each training pattern. The combination of them will form the covariance matrix of each training pattern. This automation has two advantages: First, it will free the neural network designer from indicating the complete covariance matrix, and second, it will result in a network with better generalization ability than the original model. A variation of the famous two-spiral problem and real-world examples from the UCI Machine Learning Repository will show a classification rate not only better than the original probabilistic neural network but also that this model can outperform other well-known classification techniques.

  1. Kinase Identification with Supervised Laplacian Regularized Least Squares

    PubMed Central

    Zhang, He; Wang, Minghui

    2015-01-01

    Phosphorylation is catalyzed by protein kinases and is irreplaceable in regulating biological processes. Identification of phosphorylation sites with their corresponding kinases contributes to the understanding of molecular mechanisms. Mass spectrometry analysis of phosphor-proteomes generates a large number of phosphorylated sites. However, experimental methods are costly and time-consuming, and most phosphorylation sites determined by experimental methods lack kinase information. Therefore, computational methods are urgently needed to address the kinase identification problem. To this end, we propose a new kernel-based machine learning method called Supervised Laplacian Regularized Least Squares (SLapRLS), which adopts a new method to construct kernels based on the similarity matrix and minimizes both structure risk and overall inconsistency between labels and similarities. The results predicted using both Phospho.ELM and an additional independent test dataset indicate that SLapRLS can more effectively identify kinases compared to other existing algorithms. PMID:26448296

  2. Efficient HIK SVM learning for image classification.

    PubMed

    Wu, Jianxin

    2012-10-01

    Histograms are used in almost every aspect of image processing and computer vision, from visual descriptors to image representations. Histogram intersection kernel (HIK) and support vector machine (SVM) classifiers are shown to be very effective in dealing with histograms. This paper presents contributions concerning HIK SVM for image classification. First, we propose intersection coordinate descent (ICD), a deterministic and scalable HIK SVM solver. ICD is much faster than, and has similar accuracies to, general purpose SVM solvers and other fast HIK SVM training methods. We also extend ICD to the efficient training of a broader family of kernels. Second, we show an important empirical observation that ICD is not sensitive to the C parameter in SVM, and we provide some theoretical analyses to explain this observation. ICD achieves high accuracies in many problems, using its default parameters. This is an attractive property for practitioners, because many image processing tasks are too large to choose SVM parameters using cross-validation.

  3. Kinase Identification with Supervised Laplacian Regularized Least Squares.

    PubMed

    Li, Ao; Xu, Xiaoyi; Zhang, He; Wang, Minghui

    2015-01-01

    Phosphorylation is catalyzed by protein kinases and is irreplaceable in regulating biological processes. Identification of phosphorylation sites with their corresponding kinases contributes to the understanding of molecular mechanisms. Mass spectrometry analysis of phosphor-proteomes generates a large number of phosphorylated sites. However, experimental methods are costly and time-consuming, and most phosphorylation sites determined by experimental methods lack kinase information. Therefore, computational methods are urgently needed to address the kinase identification problem. To this end, we propose a new kernel-based machine learning method called Supervised Laplacian Regularized Least Squares (SLapRLS), which adopts a new method to construct kernels based on the similarity matrix and minimizes both structure risk and overall inconsistency between labels and similarities. The results predicted using both Phospho.ELM and an additional independent test dataset indicate that SLapRLS can more effectively identify kinases compared to other existing algorithms.

  4. 7 CFR 810.2202 - Definition of other terms.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... kernels, foreign material, and shrunken and broken kernels. The sum of these three factors may not exceed... the removal of dockage and shrunken and broken kernels. (g) Heat-damaged kernels. Kernels, pieces of... sample after the removal of dockage and shrunken and broken kernels. (h) Other grains. Barley, corn...

  5. 7 CFR 981.8 - Inedible kernel.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Inedible kernel. 981.8 Section 981.8 Agriculture... Regulating Handling Definitions § 981.8 Inedible kernel. Inedible kernel means a kernel, piece, or particle of almond kernel with any defect scored as serious damage, or damage due to mold, gum, shrivel, or...

  6. 7 CFR 51.1415 - Inedible kernels.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Inedible kernels. 51.1415 Section 51.1415 Agriculture... Standards for Grades of Pecans in the Shell 1 Definitions § 51.1415 Inedible kernels. Inedible kernels means that the kernel or pieces of kernels are rancid, moldy, decayed, injured by insects or otherwise...

  7. Energy scaling advantages of resistive memory crossbar based computation and its application to sparse coding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Agarwal, Sapan; Quach, Tu -Thach; Parekh, Ojas

    In this study, the exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-basedmore » architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.« less

  8. A Genetic Algorithm Based Support Vector Machine Model for Blood-Brain Barrier Penetration Prediction

    PubMed Central

    Zhang, Daqing; Xiao, Jianfeng; Zhou, Nannan; Luo, Xiaomin; Jiang, Hualiang; Chen, Kaixian

    2015-01-01

    Blood-brain barrier (BBB) is a highly complex physical barrier determining what substances are allowed to enter the brain. Support vector machine (SVM) is a kernel-based machine learning method that is widely used in QSAR study. For a successful SVM model, the kernel parameters for SVM and feature subset selection are the most important factors affecting prediction accuracy. In most studies, they are treated as two independent problems, but it has been proven that they could affect each other. We designed and implemented genetic algorithm (GA) to optimize kernel parameters and feature subset selection for SVM regression and applied it to the BBB penetration prediction. The results show that our GA/SVM model is more accurate than other currently available log BB models. Therefore, to optimize both SVM parameters and feature subset simultaneously with genetic algorithm is a better approach than other methods that treat the two problems separately. Analysis of our log BB model suggests that carboxylic acid group, polar surface area (PSA)/hydrogen-bonding ability, lipophilicity, and molecular charge play important role in BBB penetration. Among those properties relevant to BBB penetration, lipophilicity could enhance the BBB penetration while all the others are negatively correlated with BBB penetration. PMID:26504797

  9. Energy scaling advantages of resistive memory crossbar based computation and its application to sparse coding

    DOE PAGES

    Agarwal, Sapan; Quach, Tu -Thach; Parekh, Ojas; ...

    2016-01-06

    In this study, the exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-basedmore » architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.« less

  10. Kernel parameter variation-based selective ensemble support vector data description for oil spill detection on the ocean via hyperspectral imaging

    NASA Astrophysics Data System (ADS)

    Uslu, Faruk Sukru

    2017-07-01

    Oil spills on the ocean surface cause serious environmental, political, and economic problems. Therefore, these catastrophic threats to marine ecosystems require detection and monitoring. Hyperspectral sensors are powerful optical sensors used for oil spill detection with the help of detailed spectral information of materials. However, huge amounts of data in hyperspectral imaging (HSI) require fast and accurate computation methods for detection problems. Support vector data description (SVDD) is one of the most suitable methods for detection, especially for large data sets. Nevertheless, the selection of kernel parameters is one of the main problems in SVDD. This paper presents a method, inspired by ensemble learning, for improving performance of SVDD without tuning its kernel parameters. Additionally, a classifier selection technique is proposed to get more gain. The proposed approach also aims to solve the small sample size problem, which is very important for processing high-dimensional data in HSI. The algorithm is applied to two HSI data sets for detection problems. In the first HSI data set, various targets are detected; in the second HSI data set, oil spill detection in situ is realized. The experimental results demonstrate the feasibility and performance improvement of the proposed algorithm for oil spill detection problems.

  11. Coupling individual kernel-filling processes with source-sink interactions into GREENLAB-Maize.

    PubMed

    Ma, Yuntao; Chen, Youjia; Zhu, Jinyu; Meng, Lei; Guo, Yan; Li, Baoguo; Hoogenboom, Gerrit

    2018-02-13

    Failure to account for the variation of kernel growth in a cereal crop simulation model may cause serious deviations in the estimates of crop yield. The goal of this research was to revise the GREENLAB-Maize model to incorporate source- and sink-limited allocation approaches to simulate the dry matter accumulation of individual kernels of an ear (GREENLAB-Maize-Kernel). The model used potential individual kernel growth rates to characterize the individual potential sink demand. The remobilization of non-structural carbohydrates from reserve organs to kernels was also incorporated. Two years of field experiments were conducted to determine the model parameter values and to evaluate the model using two maize hybrids with different plant densities and pollination treatments. Detailed observations were made on the dimensions and dry weights of individual kernels and other above-ground plant organs throughout the seasons. Three basic traits characterizing an individual kernel were compared on simulated and measured individual kernels: (1) final kernel size; (2) kernel growth rate; and (3) duration of kernel filling. Simulations of individual kernel growth closely corresponded to experimental data. The model was able to reproduce the observed dry weight of plant organs well. Then, the source-sink dynamics and the remobilization of carbohydrates for kernel growth were quantified to show that remobilization processes accompanied source-sink dynamics during the kernel-filling process. We conclude that the model may be used to explore options for optimizing plant kernel yield by matching maize management to the environment, taking into account responses at the level of individual kernels. © The Author(s) 2018. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  12. Unconventional protein sources: apricot seed kernels.

    PubMed

    Gabrial, G N; El-Nahry, F I; Awadalla, M Z; Girgis, S M

    1981-09-01

    Hamawy apricot seed kernels (sweet), Amar apricot seed kernels (bitter) and treated Amar apricot kernels (bitterness removed) were evaluated biochemically. All kernels were found to be high in fat (42.2--50.91%), protein (23.74--25.70%) and fiber (15.08--18.02%). Phosphorus, calcium, and iron were determined in all experimental samples. The three different apricot seed kernels were used for extensive study including the qualitative determination of the amino acid constituents by acid hydrolysis, quantitative determination of some amino acids, and biological evaluation of the kernel proteins in order to use them as new protein sources. Weanling albino rats failed to grow on diets containing the Amar apricot seed kernels due to low food consumption because of its bitterness. There was no loss in weight in that case. The Protein Efficiency Ratio data and blood analysis results showed the Hamawy apricot seed kernels to be higher in biological value than treated apricot seed kernels. The Net Protein Ratio data which accounts for both weight, maintenance and growth showed the treated apricot seed kernels to be higher in biological value than both Hamawy and Amar kernels. The Net Protein Ratio for the last two kernels were nearly equal.

  13. A feasibility study of automatic lung nodule detection in chest digital tomosynthesis with machine learning based on support vector machine

    NASA Astrophysics Data System (ADS)

    Lee, Donghoon; Kim, Ye-seul; Choi, Sunghoon; Lee, Haenghwa; Jo, Byungdu; Choi, Seungyeon; Shin, Jungwook; Kim, Hee-Joung

    2017-03-01

    The chest digital tomosynthesis(CDT) is recently developed medical device that has several advantage for diagnosing lung disease. For example, CDT provides depth information with relatively low radiation dose compared to computed tomography (CT). However, a major problem with CDT is the image artifacts associated with data incompleteness resulting from limited angle data acquisition in CDT geometry. For this reason, the sensitivity of lung disease was not clear compared to CT. In this study, to improve sensitivity of lung disease detection in CDT, we developed computer aided diagnosis (CAD) systems based on machine learning. For design CAD systems, we used 100 cases of lung nodules cropped images and 100 cases of normal lesion cropped images acquired by lung man phantoms and proto type CDT. We used machine learning techniques based on support vector machine and Gabor filter. The Gabor filter was used for extracting characteristics of lung nodules and we compared performance of feature extraction of Gabor filter with various scale and orientation parameters. We used 3, 4, 5 scales and 4, 6, 8 orientations. After extracting features, support vector machine (SVM) was used for classifying feature of lesions. The linear, polynomial and Gaussian kernels of SVM were compared to decide the best SVM conditions for CDT reconstruction images. The results of CAD system with machine learning showed the capability of automatically lung lesion detection. Furthermore detection performance was the best when Gabor filter with 5 scale and 8 orientation and SVM with Gaussian kernel were used. In conclusion, our suggested CAD system showed improving sensitivity of lung lesion detection in CDT and decide Gabor filter and SVM conditions to achieve higher detection performance of our developed CAD system for CDT.

  14. Incorporating High-Frequency Physiologic Data Using Computational Dictionary Learning Improves Prediction of Delayed Cerebral Ischemia Compared to Existing Methods.

    PubMed

    Megjhani, Murad; Terilli, Kalijah; Frey, Hans-Peter; Velazquez, Angela G; Doyle, Kevin William; Connolly, Edward Sander; Roh, David Jinou; Agarwal, Sachin; Claassen, Jan; Elhadad, Noemie; Park, Soojin

    2018-01-01

    Accurate prediction of delayed cerebral ischemia (DCI) after subarachnoid hemorrhage (SAH) can be critical for planning interventions to prevent poor neurological outcome. This paper presents a model using convolution dictionary learning to extract features from physiological data available from bedside monitors. We develop and validate a prediction model for DCI after SAH, demonstrating improved precision over standard methods alone. 488 consecutive SAH admissions from 2006 to 2014 to a tertiary care hospital were included. Models were trained on 80%, while 20% were set aside for validation testing. Modified Fisher Scale was considered the standard grading scale in clinical use; baseline features also analyzed included age, sex, Hunt-Hess, and Glasgow Coma Scales. An unsupervised approach using convolution dictionary learning was used to extract features from physiological time series (systolic blood pressure and diastolic blood pressure, heart rate, respiratory rate, and oxygen saturation). Classifiers (partial least squares and linear and kernel support vector machines) were trained on feature subsets of the derivation dataset. Models were applied to the validation dataset. The performances of the best classifiers on the validation dataset are reported by feature subset. Standard grading scale (mFS): AUC 0.54. Combined demographics and grading scales (baseline features): AUC 0.63. Kernel derived physiologic features: AUC 0.66. Combined baseline and physiologic features with redundant feature reduction: AUC 0.71 on derivation dataset and 0.78 on validation dataset. Current DCI prediction tools rely on admission imaging and are advantageously simple to employ. However, using an agnostic and computationally inexpensive learning approach for high-frequency physiologic time series data, we demonstrated that we could incorporate individual physiologic data to achieve higher classification accuracy.

  15. Statistical Methods in Ai: Rare Event Learning Using Associative Rules and Higher-Order Statistics

    NASA Astrophysics Data System (ADS)

    Iyer, V.; Shetty, S.; Iyengar, S. S.

    2015-07-01

    Rare event learning has not been actively researched since lately due to the unavailability of algorithms which deal with big samples. The research addresses spatio-temporal streams from multi-resolution sensors to find actionable items from a perspective of real-time algorithms. This computing framework is independent of the number of input samples, application domain, labelled or label-less streams. A sampling overlap algorithm such as Brooks-Iyengar is used for dealing with noisy sensor streams. We extend the existing noise pre-processing algorithms using Data-Cleaning trees. Pre-processing using ensemble of trees using bagging and multi-target regression showed robustness to random noise and missing data. As spatio-temporal streams are highly statistically correlated, we prove that a temporal window based sampling from sensor data streams converges after n samples using Hoeffding bounds. Which can be used for fast prediction of new samples in real-time. The Data-cleaning tree model uses a nonparametric node splitting technique, which can be learned in an iterative way which scales linearly in memory consumption for any size input stream. The improved task based ensemble extraction is compared with non-linear computation models using various SVM kernels for speed and accuracy. We show using empirical datasets the explicit rule learning computation is linear in time and is only dependent on the number of leafs present in the tree ensemble. The use of unpruned trees (t) in our proposed ensemble always yields minimum number (m) of leafs keeping pre-processing computation to n × t log m compared to N2 for Gram Matrix. We also show that the task based feature induction yields higher Qualify of Data (QoD) in the feature space compared to kernel methods using Gram Matrix.

  16. Prediction of anti-cancer drug response by kernelized multi-task learning.

    PubMed

    Tan, Mehmet

    2016-10-01

    Chemotherapy or targeted therapy are two of the main treatment options for many types of cancer. Due to the heterogeneous nature of cancer, the success of the therapeutic agents differs among patients. In this sense, determination of chemotherapeutic response of the malign cells is essential for establishing a personalized treatment protocol and designing new drugs. With the recent technological advances in producing large amounts of pharmacogenomic data, in silico methods have become important tools to achieve this aim. Data produced by using cancer cell lines provide a test bed for machine learning algorithms that try to predict the response of cancer cells to different agents. The potential use of these algorithms in drug discovery/repositioning and personalized treatments motivated us in this study to work on predicting drug response by exploiting the recent pharmacogenomic databases. We aim to improve the prediction of drug response of cancer cell lines. We propose to use a method that employs multi-task learning to improve learning by transfer, and kernels to extract non-linear relationships to predict drug response. The method outperforms three state-of-the-art algorithms on three anti-cancer drug screen datasets. We achieved a mean squared error of 3.305 and 0.501 on two different large scale screen data sets. On a recent challenge dataset, we obtained an error of 0.556. We report the methodological comparison results as well as the performance of the proposed algorithm on each single drug. The results show that the proposed method is a strong candidate to predict drug response of cancer cell lines in silico for pre-clinical studies. The source code of the algorithm and data used can be obtained from http://mtan.etu.edu.tr/Supplementary/kMTrace/. Copyright © 2016 Elsevier B.V. All rights reserved.

  17. 7 CFR 981.408 - Inedible kernel.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Inedible kernel. 981.408 Section 981.408 Agriculture... Administrative Rules and Regulations § 981.408 Inedible kernel. Pursuant to § 981.8, the definition of inedible kernel is modified to mean a kernel, piece, or particle of almond kernel with any defect scored as...

  18. Design of CT reconstruction kernel specifically for clinical lung imaging

    NASA Astrophysics Data System (ADS)

    Cody, Dianna D.; Hsieh, Jiang; Gladish, Gregory W.

    2005-04-01

    In this study we developed a new reconstruction kernel specifically for chest CT imaging. An experimental flat-panel CT scanner was used on large dogs to produce 'ground-truth" reference chest CT images. These dogs were also examined using a clinical 16-slice CT scanner. We concluded from the dog images acquired on the clinical scanner that the loss of subtle lung structures was due mostly to the presence of the background noise texture when using currently available reconstruction kernels. This qualitative evaluation of the dog CT images prompted the design of a new recon kernel. This new kernel consisted of the combination of a low-pass and a high-pass kernel to produce a new reconstruction kernel, called the 'Hybrid" kernel. The performance of this Hybrid kernel fell between the two kernels on which it was based, as expected. This Hybrid kernel was also applied to a set of 50 patient data sets; the analysis of these clinical images is underway. We are hopeful that this Hybrid kernel will produce clinical images with an acceptable tradeoff of lung detail, reliable HU, and image noise.

  19. Quality changes in macadamia kernel between harvest and farm-gate.

    PubMed

    Walton, David A; Wallace, Helen M

    2011-02-01

    Macadamia integrifolia, Macadamia tetraphylla and their hybrids are cultivated for their edible kernels. After harvest, nuts-in-shell are partially dried on-farm and sorted to eliminate poor-quality kernels before consignment to a processor. During these operations, kernel quality may be lost. In this study, macadamia nuts-in-shell were sampled at five points of an on-farm postharvest handling chain from dehusking to the final storage silo to assess quality loss prior to consignment. Shoulder damage, weight of pieces and unsound kernel were assessed for raw kernels, and colour, mottled colour and surface damage for roasted kernels. Shoulder damage, weight of pieces and unsound kernel for raw kernels increased significantly between the dehusker and the final silo. Roasted kernels displayed a significant increase in dark colour, mottled colour and surface damage during on-farm handling. Significant loss of macadamia kernel quality occurred on a commercial farm during sorting and storage of nuts-in-shell before nuts were consigned to a processor. Nuts-in-shell should be dried as quickly as possible and on-farm handling minimised to maintain optimum kernel quality. 2010 Society of Chemical Industry.

  20. A new discriminative kernel from probabilistic models.

    PubMed

    Tsuda, Koji; Kawanabe, Motoaki; Rätsch, Gunnar; Sonnenburg, Sören; Müller, Klaus-Robert

    2002-10-01

    Recently, Jaakkola and Haussler (1999) proposed a method for constructing kernel functions from probabilistic models. Their so-called Fisher kernel has been combined with discriminative classifiers such as support vector machines and applied successfully in, for example, DNA and protein analysis. Whereas the Fisher kernel is calculated from the marginal log-likelihood, we propose the TOP kernel derived; from tangent vectors of posterior log-odds. Furthermore, we develop a theoretical framework on feature extractors from probabilistic models and use it for analyzing the TOP kernel. In experiments, our new discriminative TOP kernel compares favorably to the Fisher kernel.

  1. Fast Query-Optimized Kernel-Machine Classification

    NASA Technical Reports Server (NTRS)

    Mazzoni, Dominic; DeCoste, Dennis

    2004-01-01

    A recently developed algorithm performs kernel-machine classification via incremental approximate nearest support vectors. The algorithm implements support-vector machines (SVMs) at speeds 10 to 100 times those attainable by use of conventional SVM algorithms. The algorithm offers potential benefits for classification of images, recognition of speech, recognition of handwriting, and diverse other applications in which there are requirements to discern patterns in large sets of data. SVMs constitute a subset of kernel machines (KMs), which have become popular as models for machine learning and, more specifically, for automated classification of input data on the basis of labeled training data. While similar in many ways to k-nearest-neighbors (k-NN) models and artificial neural networks (ANNs), SVMs tend to be more accurate. Using representations that scale only linearly in the numbers of training examples, while exploring nonlinear (kernelized) feature spaces that are exponentially larger than the original input dimensionality, KMs elegantly and practically overcome the classic curse of dimensionality. However, the price that one must pay for the power of KMs is that query-time complexity scales linearly with the number of training examples, making KMs often orders of magnitude more computationally expensive than are ANNs, decision trees, and other popular machine learning alternatives. The present algorithm treats an SVM classifier as a special form of a k-NN. The algorithm is based partly on an empirical observation that one can often achieve the same classification as that of an exact KM by using only small fraction of the nearest support vectors (SVs) of a query. The exact KM output is a weighted sum over the kernel values between the query and the SVs. In this algorithm, the KM output is approximated with a k-NN classifier, the output of which is a weighted sum only over the kernel values involving k selected SVs. Before query time, there are gathered statistics about how misleading the output of the k-NN model can be, relative to the outputs of the exact KM for a representative set of examples, for each possible k from 1 to the total number of SVs. From these statistics, there are derived upper and lower thresholds for each step k. These thresholds identify output levels for which the particular variant of the k-NN model already leans so strongly positively or negatively that a reversal in sign is unlikely, given the weaker SV neighbors still remaining. At query time, the partial output of each query is incrementally updated, stopping as soon as it exceeds the predetermined statistical thresholds of the current step. For an easy query, stopping can occur as early as step k = 1. For more difficult queries, stopping might not occur until nearly all SVs are touched. A key empirical observation is that this approach can tolerate very approximate nearest-neighbor orderings. In experiments, SVs and queries were projected to a subspace comprising the top few principal- component dimensions and neighbor orderings were computed in that subspace. This approach ensured that the overhead of the nearest-neighbor computations was insignificant, relative to that of the exact KM computation.

  2. Learning Hierarchical Feature Extractors for Image Recognition

    DTIC Science & Technology

    2012-09-01

    space as a natural criterion for devising better pools. Finally, we propose ways to make coding faster and more powerful through fast convolutional...parameter is the set of pools over which the summary statistic is computed. We propose locality in feature configuration space as a natural criterion for...pooling (dotted lines) is consistently higher than average pooling (solid lines), but the gap is much less signif - icant with intersection kernel (closed

  3. Increasing accuracy of dispersal kernels in grid-based population models

    USGS Publications Warehouse

    Slone, D.H.

    2011-01-01

    Dispersal kernels in grid-based population models specify the proportion, distance and direction of movements within the model landscape. Spatial errors in dispersal kernels can have large compounding effects on model accuracy. Circular Gaussian and Laplacian dispersal kernels at a range of spatial resolutions were investigated, and methods for minimizing errors caused by the discretizing process were explored. Kernels of progressively smaller sizes relative to the landscape grid size were calculated using cell-integration and cell-center methods. These kernels were convolved repeatedly, and the final distribution was compared with a reference analytical solution. For large Gaussian kernels (σ > 10 cells), the total kernel error was <10 &sup-11; compared to analytical results. Using an invasion model that tracked the time a population took to reach a defined goal, the discrete model results were comparable to the analytical reference. With Gaussian kernels that had σ ≤ 0.12 using the cell integration method, or σ ≤ 0.22 using the cell center method, the kernel error was greater than 10%, which resulted in invasion times that were orders of magnitude different than theoretical results. A goal-seeking routine was developed to adjust the kernels to minimize overall error. With this, corrections for small kernels were found that decreased overall kernel error to <10-11 and invasion time error to <5%.

  4. Local structure preserving sparse coding for infrared target recognition

    PubMed Central

    Han, Jing; Yue, Jiang; Zhang, Yi; Bai, Lianfa

    2017-01-01

    Sparse coding performs well in image classification. However, robust target recognition requires a lot of comprehensive template images and the sparse learning process is complex. We incorporate sparsity into a template matching concept to construct a local sparse structure matching (LSSM) model for general infrared target recognition. A local structure preserving sparse coding (LSPSc) formulation is proposed to simultaneously preserve the local sparse and structural information of objects. By adding a spatial local structure constraint into the classical sparse coding algorithm, LSPSc can improve the stability of sparse representation for targets and inhibit background interference in infrared images. Furthermore, a kernel LSPSc (K-LSPSc) formulation is proposed, which extends LSPSc to the kernel space to weaken the influence of the linear structure constraint in nonlinear natural data. Because of the anti-interference and fault-tolerant capabilities, both LSPSc- and K-LSPSc-based LSSM can implement target identification based on a simple template set, which just needs several images containing enough local sparse structures to learn a sufficient sparse structure dictionary of a target class. Specifically, this LSSM approach has stable performance in the target detection with scene, shape and occlusions variations. High performance is demonstrated on several datasets, indicating robust infrared target recognition in diverse environments and imaging conditions. PMID:28323824

  5. Analysis of the performance, emission and combustion characteristics of a turbocharged diesel engine fuelled with Jatropha curcas biodiesel-diesel blends using kernel-based extreme learning machine.

    PubMed

    Silitonga, Arridina Susan; Hassan, Masjuki Haji; Ong, Hwai Chyuan; Kusumo, Fitranto

    2017-11-01

    The purpose of this study is to investigate the performance, emission and combustion characteristics of a four-cylinder common-rail turbocharged diesel engine fuelled with Jatropha curcas biodiesel-diesel blends. A kernel-based extreme learning machine (KELM) model is developed in this study using MATLAB software in order to predict the performance, combustion and emission characteristics of the engine. To acquire the data for training and testing the KELM model, the engine speed was selected as the input parameter, whereas the performance, exhaust emissions and combustion characteristics were chosen as the output parameters of the KELM model. The performance, emissions and combustion characteristics predicted by the KELM model were validated by comparing the predicted data with the experimental data. The results show that the coefficient of determination of the parameters is within a range of 0.9805-0.9991 for both the KELM model and the experimental data. The mean absolute percentage error is within a range of 0.1259-2.3838. This study shows that KELM modelling is a useful technique in biodiesel production since it facilitates scientists and researchers to predict the performance, exhaust emissions and combustion characteristics of internal combustion engines with high accuracy.

  6. Tensor manifold-based extreme learning machine for 2.5-D face recognition

    NASA Astrophysics Data System (ADS)

    Chong, Lee Ying; Ong, Thian Song; Teoh, Andrew Beng Jin

    2018-01-01

    We explore the use of the Gabor regional covariance matrix (GRCM), a flexible matrix-based descriptor that embeds the Gabor features in the covariance matrix, as a 2.5-D facial descriptor and an effective means of feature fusion for 2.5-D face recognition problems. Despite its promise, matching is not a trivial problem for GRCM since it is a special instance of a symmetric positive definite (SPD) matrix that resides in non-Euclidean space as a tensor manifold. This implies that GRCM is incompatible with the existing vector-based classifiers and distance matchers. Therefore, we bridge the gap of the GRCM and extreme learning machine (ELM), a vector-based classifier for the 2.5-D face recognition problem. We put forward a tensor manifold-compliant ELM and its two variants by embedding the SPD matrix randomly into reproducing kernel Hilbert space (RKHS) via tensor kernel functions. To preserve the pair-wise distance of the embedded data, we orthogonalize the random-embedded SPD matrix. Hence, classification can be done using a simple ridge regressor, an integrated component of ELM, on the random orthogonal RKHS. Experimental results show that our proposed method is able to improve the recognition performance and further enhance the computational efficiency.

  7. Anthraquinones isolated from the browned Chinese chestnut kernels (Castanea mollissima blume)

    NASA Astrophysics Data System (ADS)

    Zhang, Y. L.; Qi, J. H.; Qin, L.; Wang, F.; Pang, M. X.

    2016-08-01

    Anthraquinones (AQS) represent a group of secondary metallic products in plants. AQS are often naturally occurring in plants and microorganisms. In a previous study, we found that AQS were produced by enzymatic browning reaction in Chinese chestnut kernels. To find out whether non-enzymatic browning reaction in the kernels could produce AQS too, AQS were extracted from three groups of chestnut kernels: fresh kernels, non-enzymatic browned kernels, and browned kernels, and the contents of AQS were determined. High performance liquid chromatography (HPLC) and nuclear magnetic resonance (NMR) methods were used to identify two compounds of AQS, rehein(1) and emodin(2). AQS were barely exists in the fresh kernels, while both browned kernel groups sample contained a high amount of AQS. Thus, we comfirmed that AQS could be produced during both enzymatic and non-enzymatic browning process. Rhein and emodin were the main components of AQS in the browned kernels.

  8. Completing sparse and disconnected protein-protein network by deep learning.

    PubMed

    Huang, Lei; Liao, Li; Wu, Cathy H

    2018-03-22

    Protein-protein interaction (PPI) prediction remains a central task in systems biology to achieve a better and holistic understanding of cellular and intracellular processes. Recently, an increasing number of computational methods have shifted from pair-wise prediction to network level prediction. Many of the existing network level methods predict PPIs under the assumption that the training network should be connected. However, this assumption greatly affects the prediction power and limits the application area because the current golden standard PPI networks are usually very sparse and disconnected. Therefore, how to effectively predict PPIs based on a training network that is sparse and disconnected remains a challenge. In this work, we developed a novel PPI prediction method based on deep learning neural network and regularized Laplacian kernel. We use a neural network with an autoencoder-like architecture to implicitly simulate the evolutionary processes of a PPI network. Neurons of the output layer correspond to proteins and are labeled with values (1 for interaction and 0 for otherwise) from the adjacency matrix of a sparse disconnected training PPI network. Unlike autoencoder, neurons at the input layer are given all zero input, reflecting an assumption of no a priori knowledge about PPIs, and hidden layers of smaller sizes mimic ancient interactome at different times during evolution. After the training step, an evolved PPI network whose rows are outputs of the neural network can be obtained. We then predict PPIs by applying the regularized Laplacian kernel to the transition matrix that is built upon the evolved PPI network. The results from cross-validation experiments show that the PPI prediction accuracies for yeast data and human data measured as AUC are increased by up to 8.4 and 14.9% respectively, as compared to the baseline. Moreover, the evolved PPI network can also help us leverage complementary information from the disconnected training network and multiple heterogeneous data sources. Tested by the yeast data with six heterogeneous feature kernels, the results show our method can further improve the prediction performance by up to 2%, which is very close to an upper bound that is obtained by an Approximate Bayesian Computation based sampling method. The proposed evolution deep neural network, coupled with regularized Laplacian kernel, is an effective tool in completing sparse and disconnected PPI networks and in facilitating integration of heterogeneous data sources.

  9. Broken rice kernels and the kinetics of rice hydration and texture during cooking.

    PubMed

    Saleh, Mohammed; Meullenet, Jean-Francois

    2013-05-01

    During rice milling and processing, broken kernels are inevitably present, although to date it has been unclear as to how the presence of broken kernels affects rice hydration and cooked rice texture. Therefore, this work intended to study the effect of broken kernels in a rice sample on rice hydration and texture during cooking. Two medium-grain and two long-grain rice cultivars were harvested, dried and milled, and the broken kernels were separated from unbroken kernels. Broken rice kernels were subsequently combined with unbroken rice kernels forming treatments of 0, 40, 150, 350 or 1000 g kg(-1) broken kernels ratio. Rice samples were then cooked and the moisture content of the cooked rice, the moisture uptake rate, and rice hardness and stickiness were measured. As the amount of broken rice kernels increased, rice sample texture became increasingly softer (P < 0.05) but the unbroken kernels became significantly harder. Moisture content and moisture uptake rate were positively correlated, and cooked rice hardness was negatively correlated to the percentage of broken kernels in rice samples. Differences in the proportions of broken rice in a milled rice sample play a major role in determining the texture properties of cooked rice. Variations in the moisture migration kinetics between broken and unbroken kernels caused faster hydration of the cores of broken rice kernels, with greater starch leach-out during cooking affecting the texture of the cooked rice. The texture of cooked rice can be controlled, to some extent, by varying the proportion of broken kernels in milled rice. © 2012 Society of Chemical Industry.

  10. A comparison of different chemometrics approaches for the robust classification of electronic nose data.

    PubMed

    Gromski, Piotr S; Correa, Elon; Vaughan, Andrew A; Wedge, David C; Turner, Michael L; Goodacre, Royston

    2014-11-01

    Accurate detection of certain chemical vapours is important, as these may be diagnostic for the presence of weapons, drugs of misuse or disease. In order to achieve this, chemical sensors could be deployed remotely. However, the readout from such sensors is a multivariate pattern, and this needs to be interpreted robustly using powerful supervised learning methods. Therefore, in this study, we compared the classification accuracy of four pattern recognition algorithms which include linear discriminant analysis (LDA), partial least squares-discriminant analysis (PLS-DA), random forests (RF) and support vector machines (SVM) which employed four different kernels. For this purpose, we have used electronic nose (e-nose) sensor data (Wedge et al., Sensors Actuators B Chem 143:365-372, 2009). In order to allow direct comparison between our four different algorithms, we employed two model validation procedures based on either 10-fold cross-validation or bootstrapping. The results show that LDA (91.56% accuracy) and SVM with a polynomial kernel (91.66% accuracy) were very effective at analysing these e-nose data. These two models gave superior prediction accuracy, sensitivity and specificity in comparison to the other techniques employed. With respect to the e-nose sensor data studied here, our findings recommend that SVM with a polynomial kernel should be favoured as a classification method over the other statistical models that we assessed. SVM with non-linear kernels have the advantage that they can be used for classifying non-linear as well as linear mapping from analytical data space to multi-group classifications and would thus be a suitable algorithm for the analysis of most e-nose sensor data.

  11. Multiple Kernel Learning with Data Augmentation

    DTIC Science & Technology

    2016-11-22

    model, we further show how to make inference and learn parameters in the following section. Note that we still need hyper -parameters (i.e., κ0, θ0, µ0...parameter α and sparsity tuning parameter β, these hyper -parameters are not sensitive to data. As in the BEMKL method, we fix the values of these... hyper -parameters for all datasets. αxn β yn wmf λmf µ0 σ0κ0 θ0 N M × F α ∼ G (κ0, θ0) β ∼ N ( µ0, σ 2 0 ) wmf , λmf | α, β ∼ Equation (8) yn | xn,wmf

  12. Multineuron spike train analysis with R-convolution linear combination kernel.

    PubMed

    Tezuka, Taro

    2018-06-01

    A spike train kernel provides an effective way of decoding information represented by a spike train. Some spike train kernels have been extended to multineuron spike trains, which are simultaneously recorded spike trains obtained from multiple neurons. However, most of these multineuron extensions were carried out in a kernel-specific manner. In this paper, a general framework is proposed for extending any single-neuron spike train kernel to multineuron spike trains, based on the R-convolution kernel. Special subclasses of the proposed R-convolution linear combination kernel are explored. These subclasses have a smaller number of parameters and make optimization tractable when the size of data is limited. The proposed kernel was evaluated using Gaussian process regression for multineuron spike trains recorded from an animal brain. It was compared with the sum kernel and the population Spikernel, which are existing ways of decoding multineuron spike trains using kernels. The results showed that the proposed approach performs better than these kernels and also other commonly used neural decoding methods. Copyright © 2018 Elsevier Ltd. All rights reserved.

  13. Study on Energy Productivity Ratio (EPR) at palm kernel oil processing factory: case study on PT-X at Sumatera Utara Plantation

    NASA Astrophysics Data System (ADS)

    Haryanto, B.; Bukit, R. Br; Situmeang, E. M.; Christina, E. P.; Pandiangan, F.

    2018-02-01

    The purpose of this study was to determine the performance, productivity and feasibility of the operation of palm kernel processing plant based on Energy Productivity Ratio (EPR). EPR is expressed as the ratio of output to input energy and by-product. Palm Kernel plan is process in palm kernel to become palm kernel oil. The procedure started from collecting data needed as energy input such as: palm kernel prices, energy demand and depreciation of the factory. The energy output and its by-product comprise the whole production price such as: palm kernel oil price and the remaining products such as shells and pulp price. Calculation the equality of energy of palm kernel oil is to analyze the value of Energy Productivity Ratio (EPR) bases on processing capacity per year. The investigation has been done in Kernel Oil Processing Plant PT-X at Sumatera Utara plantation. The value of EPR was 1.54 (EPR > 1), which indicated that the processing of palm kernel into palm kernel oil is feasible to be operated based on the energy productivity.

  14. 7 CFR 981.9 - Kernel weight.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Kernel weight. 981.9 Section 981.9 Agriculture Regulations of the Department of Agriculture (Continued) AGRICULTURAL MARKETING SERVICE (Marketing Agreements... Regulating Handling Definitions § 981.9 Kernel weight. Kernel weight means the weight of kernels, including...

  15. An SVM model with hybrid kernels for hydrological time series

    NASA Astrophysics Data System (ADS)

    Wang, C.; Wang, H.; Zhao, X.; Xie, Q.

    2017-12-01

    Support Vector Machine (SVM) models have been widely applied to the forecast of climate/weather and its impact on other environmental variables such as hydrologic response to climate/weather. When using SVM, the choice of the kernel function plays the key role. Conventional SVM models mostly use one single type of kernel function, e.g., radial basis kernel function. Provided that there are several featured kernel functions available, each having its own advantages and drawbacks, a combination of these kernel functions may give more flexibility and robustness to SVM approach, making it suitable for a wide range of application scenarios. This paper presents such a linear combination of radial basis kernel and polynomial kernel for the forecast of monthly flowrate in two gaging stations using SVM approach. The results indicate significant improvement in the accuracy of predicted series compared to the approach with either individual kernel function, thus demonstrating the feasibility and advantages of such hybrid kernel approach for SVM applications.

  16. The generalization ability of online SVM classification based on Markov sampling.

    PubMed

    Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang

    2015-03-01

    In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish the bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present the numerical studies on the learning ability of online SVM classification based on Markov sampling for benchmark repository. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling as the size of training samples is larger.

  17. 7 CFR 51.2295 - Half kernel.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Half kernel. 51.2295 Section 51.2295 Agriculture... Standards for Shelled English Walnuts (Juglans Regia) Definitions § 51.2295 Half kernel. Half kernel means the separated half of a kernel with not more than one-eighth broken off. ...

  18. 7 CFR 810.206 - Grades and grade requirements for barley.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... weight per bushel (pounds) Sound barley (percent) Maximum Limits of— Damaged kernels 1 (percent) Heat damaged kernels (percent) Foreign material (percent) Broken kernels (percent) Thin barley (percent) U.S... or otherwise of distinctly low quality. 1 Includes heat-damaged kernels. Injured-by-frost kernels and...

  19. Myocardin Family Members Drive Formation of Caveolae

    PubMed Central

    Krawczyk, Katarzyna K.; Yao Mattisson, Ingrid; Ekman, Mari; Oskolkov, Nikolay; Grantinge, Rebecka; Kotowska, Dorota; Olde, Björn; Hansson, Ola; Albinsson, Sebastian; Miano, Joseph M.; Rippe, Catarina; Swärd, Karl

    2015-01-01

    Caveolae are membrane organelles that play roles in glucose and lipid metabolism and in vascular function. Formation of caveolae requires caveolins and cavins. The make-up of caveolae and their density is considered to reflect cell-specific transcriptional control mechanisms for caveolins and cavins, but knowledge regarding regulation of caveolae genes is incomplete. Myocardin (MYOCD) and its relative MRTF-A (MKL1) are transcriptional coactivators that control genes which promote smooth muscle differentiation. MRTF-A communicates changes in actin polymerization to nuclear gene transcription. Here we tested if myocardin family proteins control biogenesis of caveolae via activation of caveolin and cavin transcription. Using human coronary artery smooth muscle cells we found that jasplakinolide and latrunculin B (LatB), substances that promote and inhibit actin polymerization, increased and decreased protein levels of caveolins and cavins, respectively. The effect of LatB was associated with reduced mRNA levels for these genes and this was replicated by the MRTF inhibitor CCG-1423 which was non-additive with LatB. Overexpression of myocardin and MRTF-A caused 5-10-fold induction of caveolins whereas cavin-1 and cavin-2 were induced 2-3-fold. PACSIN2 also increased, establishing positive regulation of caveolae genes from three families. Full regulation of CAV1 was retained in its proximal promoter. Knock down of the serum response factor (SRF), which mediates many of the effects of myocardin, decreased cavin-1 but increased caveolin-1 and -2 mRNAs. Viral transduction of myocardin increased the density of caveolae 5-fold in vitro. A decrease of CAV1 was observed concomitant with a decrease of the smooth muscle marker calponin in aortic aneurysms from mice (C57Bl/6) infused with angiotensin II. Human expression data disclosed correlations of MYOCD with CAV1 in a majority of human tissues and in the heart, correlation with MKL2 (MRTF-B) was observed. The myocardin family of transcriptional coactivators therefore drives formation of caveolae and this effect is largely independent of SRF. PMID:26244347

  20. Characterization of a Merkel Cell Polyomavirus-Positive Merkel Cell Carcinoma Cell Line CVG-1.

    PubMed

    Velásquez, Celestino; Amako, Yutaka; Harold, Alexis; Toptan, Tuna; Chang, Yuan; Shuda, Masahiro

    2018-01-01

    Merkel cell polyomavirus (MCV) plays a causal role in ∼80% of Merkel cell carcinomas (MCC). MCV is clonally integrated into the MCC tumor genome, which results in persistent expression of large T (LT) and small T (sT) antigen oncoproteins encoded by the early locus. In MCV-positive MCC tumors, LT is truncated by premature stop codons or deletions that lead to loss of the C-terminal origin binding (OBD) and helicase domains important for replication. The N-terminal Rb binding domain remains intact. MCV-positive cell lines derived from MCC explants have been valuable tools to study the molecular mechanism of MCV-induced Merkel cell carcinogenesis. Although all cell lines have integrated MCV and express truncated LT antigens, the molecular sizes of the LT proteins differ between cell lines. The copy number of integrated viral genome also varies across cell lines, leading to significantly different levels of viral protein expression. Nevertheless, these cell lines share phenotypic similarities in cell morphology, growth characteristics, and neuroendocrine marker expression. Several low-passage MCV-positive MCC cell lines have been established since the identification of MCV. We describe a new MCV-positive MCV cell line, CVG-1, with features distinct from previously reported cell lines. CVG-1 tumor cells grow in more discohesive clusters in loose round cell suspension, and individual cells show dramatic size heterogeneity. It is the first cell line to encode an MCV sT polymorphism resulting in a unique leucine (L) to proline (P) substitution mutation at amino acid 144. CVG-1 possesses a LT truncation pattern near identical to that of MKL-1 cells differing by the last two C-terminal amino acids and also shows an LT protein expression level similar to MKL-1. Viral T antigen knockdown reveals that, like other MCV-positive MCC cell lines, CVG-1 requires T antigen expression for cell proliferation.

  1. 7 CFR 51.1449 - Damage.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ...) Kernel which is “dark amber” or darker color; (e) Kernel having more than one dark kernel spot, or one dark kernel spot more than one-eighth inch in greatest dimension; (f) Shriveling when the surface of the kernel is very conspicuously wrinkled; (g) Internal flesh discoloration of a medium shade of gray...

  2. 7 CFR 51.1449 - Damage.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ...) Kernel which is “dark amber” or darker color; (e) Kernel having more than one dark kernel spot, or one dark kernel spot more than one-eighth inch in greatest dimension; (f) Shriveling when the surface of the kernel is very conspicuously wrinkled; (g) Internal flesh discoloration of a medium shade of gray...

  3. 7 CFR 51.2125 - Split or broken kernels.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Split or broken kernels. 51.2125 Section 51.2125 Agriculture Regulations of the Department of Agriculture AGRICULTURAL MARKETING SERVICE (Standards... kernels. Split or broken kernels means seven-eighths or less of complete whole kernels but which will not...

  4. 7 CFR 51.2296 - Three-fourths half kernel.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Three-fourths half kernel. 51.2296 Section 51.2296 Agriculture Regulations of the Department of Agriculture AGRICULTURAL MARKETING SERVICE (Standards...-fourths half kernel. Three-fourths half kernel means a portion of a half of a kernel which has more than...

  5. UNICOS Kernel Internals Application Development

    NASA Technical Reports Server (NTRS)

    Caredo, Nicholas; Craw, James M. (Technical Monitor)

    1995-01-01

    Having an understanding of UNICOS Kernel Internals is valuable information. However, having the knowledge is only half the value. The second half comes with knowing how to use this information and apply it to the development of tools. The kernel contains vast amounts of useful information that can be utilized. This paper discusses the intricacies of developing utilities that utilize kernel information. In addition, algorithms, logic, and code will be discussed for accessing kernel information. Code segments will be provided that demonstrate how to locate and read kernel structures. Types of applications that can utilize kernel information will also be discussed.

  6. Detection of maize kernels breakage rate based on K-means clustering

    NASA Astrophysics Data System (ADS)

    Yang, Liang; Wang, Zhuo; Gao, Lei; Bai, Xiaoping

    2017-04-01

    In order to optimize the recognition accuracy of maize kernels breakage detection and improve the detection efficiency of maize kernels breakage, this paper using computer vision technology and detecting of the maize kernels breakage based on K-means clustering algorithm. First, the collected RGB images are converted into Lab images, then the original images clarity evaluation are evaluated by the energy function of Sobel 8 gradient. Finally, the detection of maize kernels breakage using different pixel acquisition equipments and different shooting angles. In this paper, the broken maize kernels are identified by the color difference between integrity kernels and broken kernels. The original images clarity evaluation and different shooting angles are taken to verify that the clarity and shooting angles of the images have a direct influence on the feature extraction. The results show that K-means clustering algorithm can distinguish the broken maize kernels effectively.

  7. Aflatoxin and nutrient contents of peanut collected from local market and their processed foods

    NASA Astrophysics Data System (ADS)

    Ginting, E.; Rahmianna, A. A.; Yusnawan, E.

    2018-01-01

    Peanut is succeptable to aflatoxin contamination and the sources of peanut as well as processing methods considerably affect aflatoxin content of the products. Therefore, the study on aflatoxin and nutrient contents of peanut collected from local market and their processed foods were performed. Good kernels of peanut were prepared into fried peanut, pressed-fried peanut, peanut sauce, peanut press cake, fermented peanut press cake (tempe) and fried tempe, while blended kernels (good and poor kernels) were processed into peanut sauce and tempe and poor kernels were only processed into tempe. The results showed that good and blended kernels which had high number of sound/intact kernels (82,46% and 62,09%), contained 9.8-9.9 ppb of aflatoxin B1, while slightly higher level was seen in poor kernels (12.1 ppb). However, the moisture, ash, protein, and fat contents of the kernels were similar as well as the products. Peanut tempe and fried tempe showed the highest increase in protein content, while decreased fat contents were seen in all products. The increase in aflatoxin B1 of peanut tempe prepared from poor kernels > blended kernels > good kernels. However, it averagely decreased by 61.2% after deep-fried. Excluding peanut tempe and fried tempe, aflatoxin B1 levels in all products derived from good kernels were below the permitted level (15 ppb). This suggests that sorting peanut kernels as ingredients and followed by heat processing would decrease the aflatoxin content in the products.

  8. Many-Body Descriptors for Predicting Molecular Properties with Machine Learning: Analysis of Pairwise and Three-Body Interactions in Molecules.

    PubMed

    Pronobis, Wiktor; Tkatchenko, Alexandre; Müller, Klaus-Robert

    2018-06-12

    Machine learning (ML) based prediction of molecular properties across chemical compound space is an important and alternative approach to efficiently estimate the solutions of highly complex many-electron problems in chemistry and physics. Statistical methods represent molecules as descriptors that should encode molecular symmetries and interactions between atoms. Many such descriptors have been proposed; all of them have advantages and limitations. Here, we propose a set of general two-body and three-body interaction descriptors which are invariant to translation, rotation, and atomic indexing. By adapting the successfully used kernel ridge regression methods of machine learning, we evaluate our descriptors on predicting several properties of small organic molecules calculated using density-functional theory. We use two data sets. The GDB-7 set contains 6868 molecules with up to 7 heavy atoms of type CNO. The GDB-9 set is composed of 131722 molecules with up to 9 heavy atoms containing CNO. When trained on 5000 random molecules, our best model achieves an accuracy of 0.8 kcal/mol (on the remaining 1868 molecules of GDB-7) and 1.5 kcal/mol (on the remaining 126722 molecules of GDB-9) respectively. Applying a linear regression model on our novel many-body descriptors performs almost equal to a nonlinear kernelized model. Linear models are readily interpretable: a feature importance ranking measure helps to obtain qualitative and quantitative insights on the importance of two- and three-body molecular interactions for predicting molecular properties computed with quantum-mechanical methods.

  9. Machine Learning-based Texture Analysis of Contrast-enhanced MR Imaging to Differentiate between Glioblastoma and Primary Central Nervous System Lymphoma.

    PubMed

    Kunimatsu, Akira; Kunimatsu, Natsuko; Yasaka, Koichiro; Akai, Hiroyuki; Kamiya, Kouhei; Watadani, Takeyuki; Mori, Harushi; Abe, Osamu

    2018-05-16

    Although advanced MRI techniques are increasingly available, imaging differentiation between glioblastoma and primary central nervous system lymphoma (PCNSL) is sometimes confusing. We aimed to evaluate the performance of image classification by support vector machine, a method of traditional machine learning, using texture features computed from contrast-enhanced T 1 -weighted images. This retrospective study on preoperative brain tumor MRI included 76 consecutives, initially treated patients with glioblastoma (n = 55) or PCNSL (n = 21) from one institution, consisting of independent training group (n = 60: 44 glioblastomas and 16 PCNSLs) and test group (n = 16: 11 glioblastomas and 5 PCNSLs) sequentially separated by time periods. A total set of 67 texture features was computed on routine contrast-enhanced T 1 -weighted images of the training group, and the top four most discriminating features were selected as input variables to train support vector machine classifiers. These features were then evaluated on the test group with subsequent image classification. The area under the receiver operating characteristic curves on the training data was calculated at 0.99 (95% confidence interval [CI]: 0.96-1.00) for the classifier with a Gaussian kernel and 0.87 (95% CI: 0.77-0.95) for the classifier with a linear kernel. On the test data, both of the classifiers showed prediction accuracy of 75% (12/16) of the test images. Although further improvement is needed, our preliminary results suggest that machine learning-based image classification may provide complementary diagnostic information on routine brain MRI.

  10. Web document ranking via active learning and kernel principal component analysis

    NASA Astrophysics Data System (ADS)

    Cai, Fei; Chen, Honghui; Shu, Zhen

    2015-09-01

    Web document ranking arises in many information retrieval (IR) applications, such as the search engine, recommendation system and online advertising. A challenging issue is how to select the representative query-document pairs and informative features as well for better learning and exploring new ranking models to produce an acceptable ranking list of candidate documents of each query. In this study, we propose an active sampling (AS) plus kernel principal component analysis (KPCA) based ranking model, viz. AS-KPCA Regression, to study the document ranking for a retrieval system, i.e. how to choose the representative query-document pairs and features for learning. More precisely, we fill those documents gradually into the training set by AS such that each of which will incur the highest expected DCG loss if unselected. Then, the KPCA is performed via projecting the selected query-document pairs onto p-principal components in the feature space to complete the regression. Hence, we can cut down the computational overhead and depress the impact incurred by noise simultaneously. To the best of our knowledge, we are the first to perform the document ranking via dimension reductions in two dimensions, namely, the number of documents and features simultaneously. Our experiments demonstrate that the performance of our approach is better than that of the baseline methods on the public LETOR 4.0 datasets. Our approach brings an improvement against RankBoost as well as other baselines near 20% in terms of MAP metric and less improvements using P@K and NDCG@K, respectively. Moreover, our approach is particularly suitable for document ranking on the noisy dataset in practice.

  11. Techniques for Exploiting Unlabeled Data

    DTIC Science & Technology

    2008-10-01

    Moore, Ar- jit Singh, Jure Leskovec, Stano Funiak, Andreas Krause, Gaurav Veda, John Lang- ford, R. Ravi, Peter Lee, Srinath Sridhar, Virginia Vassilevska...information. However, in the past few decades for many tasks the supply of information has outpaced our ability to effectively utilize it. For example in...function which contains kernel functions as a sub-class and show that effective learning can be done in this framework. Although the work in this area is

  12. Learning for Semantic Parsing with Kernels under Various Forms of Supervision

    DTIC Science & Technology

    2007-08-01

    natural language sentences to their formal executable meaning representations. This is a challenging problem and is critical for developing computing...sentences are semantically tractable. This indi- cates that Geoquery is more challenging domain for semantic parsing than ATIS. In the past, there have been a...Combining parsers. In Proceedings of the Conference on Em- pirical Methods in Natural Language Processing and Very Large Corpora (EMNLP/ VLC -99), pp. 187–194

  13. MKRMDA: multiple kernel learning-based Kronecker regularized least squares for MiRNA-disease association prediction.

    PubMed

    Chen, Xing; Niu, Ya-Wei; Wang, Guang-Hui; Yan, Gui-Ying

    2017-12-12

    Recently, as the research of microRNA (miRNA) continues, there are plenty of experimental evidences indicating that miRNA could be associated with various human complex diseases development and progression. Hence, it is necessary and urgent to pay more attentions to the relevant study of predicting diseases associated miRNAs, which may be helpful for effective prevention, diagnosis and treatment of human diseases. Especially, constructing computational methods to predict potential miRNA-disease associations is worthy of more studies because of the feasibility and effectivity. In this work, we developed a novel computational model of multiple kernels learning-based Kronecker regularized least squares for MiRNA-disease association prediction (MKRMDA), which could reveal potential miRNA-disease associations by automatically optimizing the combination of multiple kernels for disease and miRNA. MKRMDA obtained AUCs of 0.9040 and 0.8446 in global and local leave-one-out cross validation, respectively. Meanwhile, MKRMDA achieved average AUCs of 0.8894 ± 0.0015 in fivefold cross validation. Furthermore, we conducted three different kinds of case studies on some important human cancers for further performance evaluation. In the case studies of colonic cancer, esophageal cancer and lymphoma based on known miRNA-disease associations in HMDDv2.0 database, 76, 94 and 88% of the corresponding top 50 predicted miRNAs were confirmed by experimental reports, respectively. In another two kinds of case studies for new diseases without any known associated miRNAs and diseases only with known associations in HMDDv1.0 database, the verified ratios of two different cancers were 88 and 94%, respectively. All the results mentioned above adequately showed the reliable prediction ability of MKRMDA. We anticipated that MKRMDA could serve to facilitate further developments in the field and the follow-up investigations by biomedical researchers.

  14. 7 CFR 981.401 - Adjusted kernel weight.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... based on the analysis of a 1,000 gram sample taken from a lot of almonds weighing 10,000 pounds with less than 95 percent kernels, and a 1,000 gram sample taken from a lot of almonds weighing 10,000... percent kernels containing the following: Edible kernels, 530 grams; inedible kernels, 120 grams; foreign...

  15. 7 CFR 981.401 - Adjusted kernel weight.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... based on the analysis of a 1,000 gram sample taken from a lot of almonds weighing 10,000 pounds with less than 95 percent kernels, and a 1,000 gram sample taken from a lot of almonds weighing 10,000... percent kernels containing the following: Edible kernels, 530 grams; inedible kernels, 120 grams; foreign...

  16. 7 CFR 981.401 - Adjusted kernel weight.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... based on the analysis of a 1,000 gram sample taken from a lot of almonds weighing 10,000 pounds with less than 95 percent kernels, and a 1,000 gram sample taken from a lot of almonds weighing 10,000... percent kernels containing the following: Edible kernels, 530 grams; inedible kernels, 120 grams; foreign...

  17. 7 CFR 981.401 - Adjusted kernel weight.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... based on the analysis of a 1,000 gram sample taken from a lot of almonds weighing 10,000 pounds with less than 95 percent kernels, and a 1,000 gram sample taken from a lot of almonds weighing 10,000... percent kernels containing the following: Edible kernels, 530 grams; inedible kernels, 120 grams; foreign...

  18. 7 CFR 981.401 - Adjusted kernel weight.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... based on the analysis of a 1,000 gram sample taken from a lot of almonds weighing 10,000 pounds with less than 95 percent kernels, and a 1,000 gram sample taken from a lot of almonds weighing 10,000... percent kernels containing the following: Edible kernels, 530 grams; inedible kernels, 120 grams; foreign...

  19. 7 CFR 51.1441 - Half-kernel.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Half-kernel. 51.1441 Section 51.1441 Agriculture... Standards for Grades of Shelled Pecans Definitions § 51.1441 Half-kernel. Half-kernel means one of the separated halves of an entire pecan kernel with not more than one-eighth of its original volume missing...

  20. 7 CFR 51.1403 - Kernel color classification.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Kernel color classification. 51.1403 Section 51.1403... STANDARDS) United States Standards for Grades of Pecans in the Shell 1 Kernel Color Classification § 51.1403 Kernel color classification. (a) The skin color of pecan kernels may be described in terms of the color...

  1. 7 CFR 51.1450 - Serious damage.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ...; (c) Decay affecting any portion of the kernel; (d) Insects, web, or frass or any distinct evidence of insect feeding on the kernel; (e) Internal discoloration which is dark gray, dark brown, or black and...) Dark kernel spots when more than three are on the kernel, or when any dark kernel spot or the aggregate...

  2. 7 CFR 51.1450 - Serious damage.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ...; (c) Decay affecting any portion of the kernel; (d) Insects, web, or frass or any distinct evidence of insect feeding on the kernel; (e) Internal discoloration which is dark gray, dark brown, or black and...) Dark kernel spots when more than three are on the kernel, or when any dark kernel spot or the aggregate...

  3. 7 CFR 51.1450 - Serious damage.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ...; (c) Decay affecting any portion of the kernel; (d) Insects, web, or frass or any distinct evidence of insect feeding on the kernel; (e) Internal discoloration which is dark gray, dark brown, or black and...) Dark kernel spots when more than three are on the kernel, or when any dark kernel spot or the aggregate...

  4. Wavelet SVM in Reproducing Kernel Hilbert Space for hyperspectral remote sensing image classification

    NASA Astrophysics Data System (ADS)

    Du, Peijun; Tan, Kun; Xing, Xiaoshi

    2010-12-01

    Combining Support Vector Machine (SVM) with wavelet analysis, we constructed wavelet SVM (WSVM) classifier based on wavelet kernel functions in Reproducing Kernel Hilbert Space (RKHS). In conventional kernel theory, SVM is faced with the bottleneck of kernel parameter selection which further results in time-consuming and low classification accuracy. The wavelet kernel in RKHS is a kind of multidimensional wavelet function that can approximate arbitrary nonlinear functions. Implications on semiparametric estimation are proposed in this paper. Airborne Operational Modular Imaging Spectrometer II (OMIS II) hyperspectral remote sensing image with 64 bands and Reflective Optics System Imaging Spectrometer (ROSIS) data with 115 bands were used to experiment the performance and accuracy of the proposed WSVM classifier. The experimental results indicate that the WSVM classifier can obtain the highest accuracy when using the Coiflet Kernel function in wavelet transform. In contrast with some traditional classifiers, including Spectral Angle Mapping (SAM) and Minimum Distance Classification (MDC), and SVM classifier using Radial Basis Function kernel, the proposed wavelet SVM classifier using the wavelet kernel function in Reproducing Kernel Hilbert Space is capable of improving classification accuracy obviously.

  5. Hadamard Kernel SVM with applications for breast cancer outcome predictions.

    PubMed

    Jiang, Hao; Ching, Wai-Ki; Cheung, Wai-Shun; Hou, Wenpin; Yin, Hong

    2017-12-21

    Breast cancer is one of the leading causes of deaths for women. It is of great necessity to develop effective methods for breast cancer detection and diagnosis. Recent studies have focused on gene-based signatures for outcome predictions. Kernel SVM for its discriminative power in dealing with small sample pattern recognition problems has attracted a lot attention. But how to select or construct an appropriate kernel for a specified problem still needs further investigation. Here we propose a novel kernel (Hadamard Kernel) in conjunction with Support Vector Machines (SVMs) to address the problem of breast cancer outcome prediction using gene expression data. Hadamard Kernel outperform the classical kernels and correlation kernel in terms of Area under the ROC Curve (AUC) values where a number of real-world data sets are adopted to test the performance of different methods. Hadamard Kernel SVM is effective for breast cancer predictions, either in terms of prognosis or diagnosis. It may benefit patients by guiding therapeutic options. Apart from that, it would be a valuable addition to the current SVM kernel families. We hope it will contribute to the wider biology and related communities.

  6. LZW-Kernel: fast kernel utilizing variable length code blocks from LZW compressors for protein sequence classification.

    PubMed

    Filatov, Gleb; Bauwens, Bruno; Kertész-Farkas, Attila

    2018-05-07

    Bioinformatics studies often rely on similarity measures between sequence pairs, which often pose a bottleneck in large-scale sequence analysis. Here, we present a new convolutional kernel function for protein sequences called the LZW-Kernel. It is based on code words identified with the Lempel-Ziv-Welch (LZW) universal text compressor. The LZW-Kernel is an alignment-free method, it is always symmetric, is positive, always provides 1.0 for self-similarity and it can directly be used with Support Vector Machines (SVMs) in classification problems, contrary to normalized compression distance (NCD), which often violates the distance metric properties in practice and requires further techniques to be used with SVMs. The LZW-Kernel is a one-pass algorithm, which makes it particularly plausible for big data applications. Our experimental studies on remote protein homology detection and protein classification tasks reveal that the LZW-Kernel closely approaches the performance of the Local Alignment Kernel (LAK) and the SVM-pairwise method combined with Smith-Waterman (SW) scoring at a fraction of the time. Moreover, the LZW-Kernel outperforms the SVM-pairwise method when combined with BLAST scores, which indicates that the LZW code words might be a better basis for similarity measures than local alignment approximations found with BLAST. In addition, the LZW-Kernel outperforms n-gram based mismatch kernels, hidden Markov model based SAM and Fisher kernel, and protein family based PSI-BLAST, among others. Further advantages include the LZW-Kernel's reliance on a simple idea, its ease of implementation, and its high speed, three times faster than BLAST and several magnitudes faster than SW or LAK in our tests. LZW-Kernel is implemented as a standalone C code and is a free open-source program distributed under GPLv3 license and can be downloaded from https://github.com/kfattila/LZW-Kernel. akerteszfarkas@hse.ru. Supplementary data are available at Bioinformatics Online.

  7. Automatic Defect Detection for TFT-LCD Array Process Using Quasiconformal Kernel Support Vector Data Description

    PubMed Central

    Liu, Yi-Hung; Chen, Yan-Jen

    2011-01-01

    Defect detection has been considered an efficient way to increase the yield rate of panels in thin film transistor liquid crystal display (TFT-LCD) manufacturing. In this study we focus on the array process since it is the first and key process in TFT-LCD manufacturing. Various defects occur in the array process, and some of them could cause great damage to the LCD panels. Thus, how to design a method that can robustly detect defects from the images captured from the surface of LCD panels has become crucial. Previously, support vector data description (SVDD) has been successfully applied to LCD defect detection. However, its generalization performance is limited. In this paper, we propose a novel one-class machine learning method, called quasiconformal kernel SVDD (QK-SVDD) to address this issue. The QK-SVDD can significantly improve generalization performance of the traditional SVDD by introducing the quasiconformal transformation into a predefined kernel. Experimental results, carried out on real LCD images provided by an LCD manufacturer in Taiwan, indicate that the proposed QK-SVDD not only obtains a high defect detection rate of 96%, but also greatly improves generalization performance of SVDD. The improvement has shown to be over 30%. In addition, results also show that the QK-SVDD defect detector is able to accomplish the task of defect detection on an LCD image within 60 ms. PMID:22016625

  8. Target oriented dimensionality reduction of hyperspectral data by Kernel Fukunaga-Koontz Transform

    NASA Astrophysics Data System (ADS)

    Binol, Hamidullah; Ochilov, Shuhrat; Alam, Mohammad S.; Bal, Abdullah

    2017-02-01

    Principal component analysis (PCA) is a popular technique in remote sensing for dimensionality reduction. While PCA is suitable for data compression, it is not necessarily an optimal technique for feature extraction, particularly when the features are exploited in supervised learning applications (Cheriyadat and Bruce, 2003) [1]. Preserving features belonging to the target is very crucial to the performance of target detection/recognition techniques. Fukunaga-Koontz Transform (FKT) based supervised band reduction technique can be used to provide this requirement. FKT achieves feature selection by transforming into a new space in where feature classes have complimentary eigenvectors. Analysis of these eigenvectors under two classes, target and background clutter, can be utilized for target oriented band reduction since each basis functions best represent target class while carrying least information of the background class. By selecting few eigenvectors which are the most relevant to the target class, dimension of hyperspectral data can be reduced and thus, it presents significant advantages for near real time target detection applications. The nonlinear properties of the data can be extracted by kernel approach which provides better target features. Thus, we propose constructing kernel FKT (KFKT) to present target oriented band reduction. The performance of the proposed KFKT based target oriented dimensionality reduction algorithm has been tested employing two real-world hyperspectral data and results have been reported consequently.

  9. Automatic defect detection for TFT-LCD array process using quasiconformal kernel support vector data description.

    PubMed

    Liu, Yi-Hung; Chen, Yan-Jen

    2011-01-01

    Defect detection has been considered an efficient way to increase the yield rate of panels in thin film transistor liquid crystal display (TFT-LCD) manufacturing. In this study we focus on the array process since it is the first and key process in TFT-LCD manufacturing. Various defects occur in the array process, and some of them could cause great damage to the LCD panels. Thus, how to design a method that can robustly detect defects from the images captured from the surface of LCD panels has become crucial. Previously, support vector data description (SVDD) has been successfully applied to LCD defect detection. However, its generalization performance is limited. In this paper, we propose a novel one-class machine learning method, called quasiconformal kernel SVDD (QK-SVDD) to address this issue. The QK-SVDD can significantly improve generalization performance of the traditional SVDD by introducing the quasiconformal transformation into a predefined kernel. Experimental results, carried out on real LCD images provided by an LCD manufacturer in Taiwan, indicate that the proposed QK-SVDD not only obtains a high defect detection rate of 96%, but also greatly improves generalization performance of SVDD. The improvement has shown to be over 30%. In addition, results also show that the QK-SVDD defect detector is able to accomplish the task of defect detection on an LCD image within 60 ms.

  10. Novel images extraction model using improved delay vector variance feature extraction and multi-kernel neural network for EEG detection and prediction.

    PubMed

    Ge, Jing; Zhang, Guoping

    2015-01-01

    Advanced intelligent methodologies could help detect and predict diseases from the EEG signals in cases the manual analysis is inefficient available, for instance, the epileptic seizures detection and prediction. This is because the diversity and the evolution of the epileptic seizures make it very difficult in detecting and identifying the undergoing disease. Fortunately, the determinism and nonlinearity in a time series could characterize the state changes. Literature review indicates that the Delay Vector Variance (DVV) could examine the nonlinearity to gain insight into the EEG signals but very limited work has been done to address the quantitative DVV approach. Hence, the outcomes of the quantitative DVV should be evaluated to detect the epileptic seizures. To develop a new epileptic seizure detection method based on quantitative DVV. This new epileptic seizure detection method employed an improved delay vector variance (IDVV) to extract the nonlinearity value as a distinct feature. Then a multi-kernel functions strategy was proposed in the extreme learning machine (ELM) network to provide precise disease detection and prediction. The nonlinearity is more sensitive than the energy and entropy. 87.5% overall accuracy of recognition and 75.0% overall accuracy of forecasting were achieved. The proposed IDVV and multi-kernel ELM based method was feasible and effective for epileptic EEG detection. Hence, the newly proposed method has importance for practical applications.

  11. Evaluating the Gradient of the Thin Wire Kernel

    NASA Technical Reports Server (NTRS)

    Wilton, Donald R.; Champagne, Nathan J.

    2008-01-01

    Recently, a formulation for evaluating the thin wire kernel was developed that employed a change of variable to smooth the kernel integrand, canceling the singularity in the integrand. Hence, the typical expansion of the wire kernel in a series for use in the potential integrals is avoided. The new expression for the kernel is exact and may be used directly to determine the gradient of the wire kernel, which consists of components that are parallel and radial to the wire axis.

  12. Kernel Machine SNP-set Testing under Multiple Candidate Kernels

    PubMed Central

    Wu, Michael C.; Maity, Arnab; Lee, Seunggeun; Simmons, Elizabeth M.; Harmon, Quaker E.; Lin, Xinyi; Engel, Stephanie M.; Molldrem, Jeffrey J.; Armistead, Paul M.

    2013-01-01

    Joint testing for the cumulative effect of multiple single nucleotide polymorphisms grouped on the basis of prior biological knowledge has become a popular and powerful strategy for the analysis of large scale genetic association studies. The kernel machine (KM) testing framework is a useful approach that has been proposed for testing associations between multiple genetic variants and many different types of complex traits by comparing pairwise similarity in phenotype between subjects to pairwise similarity in genotype, with similarity in genotype defined via a kernel function. An advantage of the KM framework is its flexibility: choosing different kernel functions allows for different assumptions concerning the underlying model and can allow for improved power. In practice, it is difficult to know which kernel to use a priori since this depends on the unknown underlying trait architecture and selecting the kernel which gives the lowest p-value can lead to inflated type I error. Therefore, we propose practical strategies for KM testing when multiple candidate kernels are present based on constructing composite kernels and based on efficient perturbation procedures. We demonstrate through simulations and real data applications that the procedures protect the type I error rate and can lead to substantially improved power over poor choices of kernels and only modest differences in power versus using the best candidate kernel. PMID:23471868

  13. Combined multi-kernel head computed tomography images optimized for depicting both brain parenchyma and bone.

    PubMed

    Takagi, Satoshi; Nagase, Hiroyuki; Hayashi, Tatsuya; Kita, Tamotsu; Hayashi, Katsumi; Sanada, Shigeru; Koike, Masayuki

    2014-01-01

    The hybrid convolution kernel technique for computed tomography (CT) is known to enable the depiction of an image set using different window settings. Our purpose was to decrease the number of artifacts in the hybrid convolution kernel technique for head CT and to determine whether our improved combined multi-kernel head CT images enabled diagnosis as a substitute for both brain (low-pass kernel-reconstructed) and bone (high-pass kernel-reconstructed) images. Forty-four patients with nondisplaced skull fractures were included. Our improved multi-kernel images were generated so that pixels of >100 Hounsfield unit in both brain and bone images were composed of CT values of bone images and other pixels were composed of CT values of brain images. Three radiologists compared the improved multi-kernel images with bone images. The improved multi-kernel images and brain images were identically displayed on the brain window settings. All three radiologists agreed that the improved multi-kernel images on the bone window settings were sufficient for diagnosing skull fractures in all patients. This improved multi-kernel technique has a simple algorithm and is practical for clinical use. Thus, simplified head CT examinations and fewer images that need to be stored can be expected.

  14. 7 CFR 810.202 - Definition of other terms.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... barley kernels, other grains, and wild oats that are badly shrunken and distinctly discolored black or... kernels. Kernels and pieces of barley kernels that are distinctly indented, immature or shrunken in...

  15. 7 CFR 810.202 - Definition of other terms.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... barley kernels, other grains, and wild oats that are badly shrunken and distinctly discolored black or... kernels. Kernels and pieces of barley kernels that are distinctly indented, immature or shrunken in...

  16. 7 CFR 810.202 - Definition of other terms.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... barley kernels, other grains, and wild oats that are badly shrunken and distinctly discolored black or... kernels. Kernels and pieces of barley kernels that are distinctly indented, immature or shrunken in...

  17. graphkernels: R and Python packages for graph comparison

    PubMed Central

    Ghisu, M Elisabetta; Llinares-López, Felipe; Borgwardt, Karsten

    2018-01-01

    Abstract Summary Measuring the similarity of graphs is a fundamental step in the analysis of graph-structured data, which is omnipresent in computational biology. Graph kernels have been proposed as a powerful and efficient approach to this problem of graph comparison. Here we provide graphkernels, the first R and Python graph kernel libraries including baseline kernels such as label histogram based kernels, classic graph kernels such as random walk based kernels, and the state-of-the-art Weisfeiler-Lehman graph kernel. The core of all graph kernels is implemented in C ++ for efficiency. Using the kernel matrices computed by the package, we can easily perform tasks such as classification, regression and clustering on graph-structured samples. Availability and implementation The R and Python packages including source code are available at https://CRAN.R-project.org/package=graphkernels and https://pypi.python.org/pypi/graphkernels. Contact mahito@nii.ac.jp or elisabetta.ghisu@bsse.ethz.ch Supplementary information Supplementary data are available online at Bioinformatics. PMID:29028902

  18. Aflatoxin variability in pistachios.

    PubMed Central

    Mahoney, N E; Rodriguez, S B

    1996-01-01

    Pistachio fruit components, including hulls (mesocarps and epicarps), seed coats (testas), and kernels (seeds), all contribute to variable aflatoxin content in pistachios. Fresh pistachio kernels were individually inoculated with Aspergillus flavus and incubated 7 or 10 days. Hulled, shelled kernels were either left intact or wounded prior to inoculation. Wounded kernels, with or without the seed coat, were readily colonized by A. flavus and after 10 days of incubation contained 37 times more aflatoxin than similarly treated unwounded kernels. The aflatoxin levels in the individual wounded pistachios were highly variable. Neither fungal colonization nor aflatoxin was detected in intact kernels without seed coats. Intact kernels with seed coats had limited fungal colonization and low aflatoxin concentrations compared with their wounded counterparts. Despite substantial fungal colonization of wounded hulls, aflatoxin was not detected in hulls. Aflatoxin levels were significantly lower in wounded kernels with hulls than in kernels of hulled pistachios. Both the seed coat and a water-soluble extract of hulls suppressed aflatoxin production by A. flavus. PMID:8919781

  19. graphkernels: R and Python packages for graph comparison.

    PubMed

    Sugiyama, Mahito; Ghisu, M Elisabetta; Llinares-López, Felipe; Borgwardt, Karsten

    2018-02-01

    Measuring the similarity of graphs is a fundamental step in the analysis of graph-structured data, which is omnipresent in computational biology. Graph kernels have been proposed as a powerful and efficient approach to this problem of graph comparison. Here we provide graphkernels, the first R and Python graph kernel libraries including baseline kernels such as label histogram based kernels, classic graph kernels such as random walk based kernels, and the state-of-the-art Weisfeiler-Lehman graph kernel. The core of all graph kernels is implemented in C ++ for efficiency. Using the kernel matrices computed by the package, we can easily perform tasks such as classification, regression and clustering on graph-structured samples. The R and Python packages including source code are available at https://CRAN.R-project.org/package=graphkernels and https://pypi.python.org/pypi/graphkernels. mahito@nii.ac.jp or elisabetta.ghisu@bsse.ethz.ch. Supplementary data are available online at Bioinformatics. © The Author(s) 2017. Published by Oxford University Press.

  20. Investigation of various energy deposition kernel refinements for the convolution/superposition method

    PubMed Central

    Huang, Jessie Y.; Eklund, David; Childress, Nathan L.; Howell, Rebecca M.; Mirkovic, Dragan; Followill, David S.; Kry, Stephen F.

    2013-01-01

    Purpose: Several simplifications used in clinical implementations of the convolution/superposition (C/S) method, specifically, density scaling of water kernels for heterogeneous media and use of a single polyenergetic kernel, lead to dose calculation inaccuracies. Although these weaknesses of the C/S method are known, it is not well known which of these simplifications has the largest effect on dose calculation accuracy in clinical situations. The purpose of this study was to generate and characterize high-resolution, polyenergetic, and material-specific energy deposition kernels (EDKs), as well as to investigate the dosimetric impact of implementing spatially variant polyenergetic and material-specific kernels in a collapsed cone C/S algorithm. Methods: High-resolution, monoenergetic water EDKs and various material-specific EDKs were simulated using the EGSnrc Monte Carlo code. Polyenergetic kernels, reflecting the primary spectrum of a clinical 6 MV photon beam at different locations in a water phantom, were calculated for different depths, field sizes, and off-axis distances. To investigate the dosimetric impact of implementing spatially variant polyenergetic kernels, depth dose curves in water were calculated using two different implementations of the collapsed cone C/S method. The first method uses a single polyenergetic kernel, while the second method fully takes into account spectral changes in the convolution calculation. To investigate the dosimetric impact of implementing material-specific kernels, depth dose curves were calculated for a simplified titanium implant geometry using both a traditional C/S implementation that performs density scaling of water kernels and a novel implementation using material-specific kernels. Results: For our high-resolution kernels, we found good agreement with the Mackie et al. kernels, with some differences near the interaction site for low photon energies (<500 keV). For our spatially variant polyenergetic kernels, we found that depth was the most dominant factor affecting the pattern of energy deposition; however, the effects of field size and off-axis distance were not negligible. For the material-specific kernels, we found that as the density of the material increased, more energy was deposited laterally by charged particles, as opposed to in the forward direction. Thus, density scaling of water kernels becomes a worse approximation as the density and the effective atomic number of the material differ more from water. Implementation of spatially variant, polyenergetic kernels increased the percent depth dose value at 25 cm depth by 2.1%–5.8% depending on the field size, while implementation of titanium kernels gave 4.9% higher dose upstream of the metal cavity (i.e., higher backscatter dose) and 8.2% lower dose downstream of the cavity. Conclusions: Of the various kernel refinements investigated, inclusion of depth-dependent and metal-specific kernels into the C/S method has the greatest potential to improve dose calculation accuracy. Implementation of spatially variant polyenergetic kernels resulted in a harder depth dose curve and thus has the potential to affect beam modeling parameters obtained in the commissioning process. For metal implants, the C/S algorithms generally underestimate the dose upstream and overestimate the dose downstream of the implant. Implementation of a metal-specific kernel mitigated both of these errors. PMID:24320507

  1. Unified heat kernel regression for diffusion, kernel smoothing and wavelets on manifolds and its application to mandible growth modeling in CT images.

    PubMed

    Chung, Moo K; Qiu, Anqi; Seo, Seongho; Vorperian, Houri K

    2015-05-01

    We present a novel kernel regression framework for smoothing scalar surface data using the Laplace-Beltrami eigenfunctions. Starting with the heat kernel constructed from the eigenfunctions, we formulate a new bivariate kernel regression framework as a weighted eigenfunction expansion with the heat kernel as the weights. The new kernel method is mathematically equivalent to isotropic heat diffusion, kernel smoothing and recently popular diffusion wavelets. The numerical implementation is validated on a unit sphere using spherical harmonics. As an illustration, the method is applied to characterize the localized growth pattern of mandible surfaces obtained in CT images between ages 0 and 20 by regressing the length of displacement vectors with respect to a surface template. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Computational algorithms for simulations in atmospheric optics.

    PubMed

    Konyaev, P A; Lukin, V P

    2016-04-20

    A computer simulation technique for atmospheric and adaptive optics based on parallel programing is discussed. A parallel propagation algorithm is designed and a modified spectral-phase method for computer generation of 2D time-variant random fields is developed. Temporal power spectra of Laguerre-Gaussian beam fluctuations are considered as an example to illustrate the applications discussed. Implementation of the proposed algorithms using Intel MKL and IPP libraries and NVIDIA CUDA technology is shown to be very fast and accurate. The hardware system for the computer simulation is an off-the-shelf desktop with an Intel Core i7-4790K CPU operating at a turbo-speed frequency up to 5 GHz and an NVIDIA GeForce GTX-960 graphics accelerator with 1024 1.5 GHz processors.

  3. Open source acceleration of wave optics simulations on energy efficient high-performance computing platforms

    NASA Astrophysics Data System (ADS)

    Beck, Jeffrey; Bos, Jeremy P.

    2017-05-01

    We compare several modifications to the open-source wave optics package, WavePy, intended to improve execution time. Specifically, we compare the relative performance of the Intel MKL, a CPU based OpenCV distribution, and GPU-based version. Performance is compared between distributions both on the same compute platform and between a fully-featured computing workstation and the NVIDIA Jetson TX1 platform. Comparisons are drawn in terms of both execution time and power consumption. We have found that substituting the Fast Fourier Transform operation from OpenCV provides a marked improvement on all platforms. In addition, we show that embedded platforms offer some possibility for extensive improvement in terms of efficiency compared to a fully featured workstation.

  4. Active learning for semi-supervised clustering based on locally linear propagation reconstruction.

    PubMed

    Chang, Chin-Chun; Lin, Po-Yi

    2015-03-01

    The success of semi-supervised clustering relies on the effectiveness of side information. To get effective side information, a new active learner learning pairwise constraints known as must-link and cannot-link constraints is proposed in this paper. Three novel techniques are developed for learning effective pairwise constraints. The first technique is used to identify samples less important to cluster structures. This technique makes use of a kernel version of locally linear embedding for manifold learning. Samples neither important to locally linear propagation reconstructions of other samples nor on flat patches in the learned manifold are regarded as unimportant samples. The second is a novel criterion for query selection. This criterion considers not only the importance of a sample to expanding the space coverage of the learned samples but also the expected number of queries needed to learn the sample. To facilitate semi-supervised clustering, the third technique yields inferred must-links for passing information about flat patches in the learned manifold to semi-supervised clustering algorithms. Experimental results have shown that the learned pairwise constraints can capture the underlying cluster structures and proven the feasibility of the proposed approach. Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. Comparing Alternative Kernels for the Kernel Method of Test Equating: Gaussian, Logistic, and Uniform Kernels. Research Report. ETS RR-08-12

    ERIC Educational Resources Information Center

    Lee, Yi-Hsuan; von Davier, Alina A.

    2008-01-01

    The kernel equating method (von Davier, Holland, & Thayer, 2004) is based on a flexible family of equipercentile-like equating functions that use a Gaussian kernel to continuize the discrete score distributions. While the classical equipercentile, or percentile-rank, equating method carries out the continuization step by linear interpolation,…

  6. 7 CFR 810.204 - Grades and grade requirements for Six-rowed Malting barley and Six-rowed Blue Malting barley.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ...— Damaged kernels 1 (percent) Foreign material (percent) Other grains (percent) Skinned and broken kernels....0 10.0 15.0 1 Injured-by-frost kernels and injured-by-mold kernels are not considered damaged kernels or considered against sound barley. Notes: Malting barley shall not be infested in accordance with...

  7. 7 CFR 51.1413 - Damage.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... well cured; (e) Poorly developed kernels; (f) Kernels which are dark amber in color; (g) Kernel spots when more than one dark spot is present on either half of the kernel, or when any such spot is more...

  8. 7 CFR 51.1413 - Damage.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... well cured; (e) Poorly developed kernels; (f) Kernels which are dark amber in color; (g) Kernel spots when more than one dark spot is present on either half of the kernel, or when any such spot is more...

  9. 7 CFR 810.205 - Grades and grade requirements for Two-rowed Malting barley.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... (percent) Maximum limits of— Wild oats (percent) Foreign material (percent) Skinned and broken kernels... Injured-by-frost kernels and injured-by-mold kernels are not considered damaged kernels or considered...

  10. An assessment of support vector machines for land cover classification

    USGS Publications Warehouse

    Huang, C.; Davis, L.S.; Townshend, J.R.G.

    2002-01-01

    The support vector machine (SVM) is a group of theoretically superior machine learning algorithms. It was found competitive with the best available machine learning algorithms in classifying high-dimensional data sets. This paper gives an introduction to the theoretical development of the SVM and an experimental evaluation of its accuracy, stability and training speed in deriving land cover classifications from satellite images. The SVM was compared to three other popular classifiers, including the maximum likelihood classifier (MLC), neural network classifiers (NNC) and decision tree classifiers (DTC). The impacts of kernel configuration on the performance of the SVM and of the selection of training data and input variables on the four classifiers were also evaluated in this experiment.

  11. Segmentation of the Speaker's Face Region with Audiovisual Correlation

    NASA Astrophysics Data System (ADS)

    Liu, Yuyu; Sato, Yoichi

    The ability to find the speaker's face region in a video is useful for various applications. In this work, we develop a novel technique to find this region within different time windows, which is robust against the changes of view, scale, and background. The main thrust of our technique is to integrate audiovisual correlation analysis into a video segmentation framework. We analyze the audiovisual correlation locally by computing quadratic mutual information between our audiovisual features. The computation of quadratic mutual information is based on the probability density functions estimated by kernel density estimation with adaptive kernel bandwidth. The results of this audiovisual correlation analysis are incorporated into graph cut-based video segmentation to resolve a globally optimum extraction of the speaker's face region. The setting of any heuristic threshold in this segmentation is avoided by learning the correlation distributions of speaker and background by expectation maximization. Experimental results demonstrate that our method can detect the speaker's face region accurately and robustly for different views, scales, and backgrounds.

  12. Support Vector Machines Trained with Evolutionary Algorithms Employing Kernel Adatron for Large Scale Classification of Protein Structures.

    PubMed

    Arana-Daniel, Nancy; Gallegos, Alberto A; López-Franco, Carlos; Alanís, Alma Y; Morales, Jacob; López-Franco, Adriana

    2016-01-01

    With the increasing power of computers, the amount of data that can be processed in small periods of time has grown exponentially, as has the importance of classifying large-scale data efficiently. Support vector machines have shown good results classifying large amounts of high-dimensional data, such as data generated by protein structure prediction, spam recognition, medical diagnosis, optical character recognition and text classification, etc. Most state of the art approaches for large-scale learning use traditional optimization methods, such as quadratic programming or gradient descent, which makes the use of evolutionary algorithms for training support vector machines an area to be explored. The present paper proposes an approach that is simple to implement based on evolutionary algorithms and Kernel-Adatron for solving large-scale classification problems, focusing on protein structure prediction. The functional properties of proteins depend upon their three-dimensional structures. Knowing the structures of proteins is crucial for biology and can lead to improvements in areas such as medicine, agriculture and biofuels.

  13. Learning SVM in Kreĭn Spaces.

    PubMed

    Loosli, Gaelle; Canu, Stephane; Ong, Cheng Soon

    2016-06-01

    This paper presents a theoretical foundation for an SVM solver in Kreĭn spaces. Up to now, all methods are based either on the matrix correction, or on non-convex minimization, or on feature-space embedding. Here we justify and evaluate a solution that uses the original (indefinite) similarity measure, in the original Kreĭn space. This solution is the result of a stabilization procedure. We establish the correspondence between the stabilization problem (which has to be solved) and a classical SVM based on minimization (which is easy to solve). We provide simple equations to go from one to the other (in both directions). This link between stabilization and minimization problems is the key to obtain a solution in the original Kreĭn space. Using KSVM, one can solve SVM with usually troublesome kernels (large negative eigenvalues or large numbers of negative eigenvalues). We show experiments showing that our algorithm KSVM outperforms all previously proposed approaches to deal with indefinite matrices in SVM-like kernel methods.

  14. A boosted optimal linear learner for retinal vessel segmentation

    NASA Astrophysics Data System (ADS)

    Poletti, E.; Grisan, E.

    2014-03-01

    Ocular fundus images provide important information about retinal degeneration, which may be related to acute pathologies or to early signs of systemic diseases. An automatic and quantitative assessment of vessel morphological features, such as diameters and tortuosity, can improve clinical diagnosis and evaluation of retinopathy. At variance with available methods, we propose a data-driven approach, in which the system learns a set of optimal discriminative convolution kernels (linear learner). The set is progressively built based on an ADA-boost sample weighting scheme, providing seamless integration between linear learner estimation and classification. In order to capture the vessel appearance changes at different scales, the kernels are estimated on a pyramidal decomposition of the training samples. The set is employed as a rotating bank of matched filters, whose response is used by the boosted linear classifier to provide a classification of each image pixel into the two classes of interest (vessel/background). We tested the approach fundus images available from the DRIVE dataset. We show that the segmentation performance yields an accuracy of 0.94.

  15. Local Kernel for Brains Classification in Schizophrenia

    NASA Astrophysics Data System (ADS)

    Castellani, U.; Rossato, E.; Murino, V.; Bellani, M.; Rambaldelli, G.; Tansella, M.; Brambilla, P.

    In this paper a novel framework for brain classification is proposed in the context of mental health research. A learning by example method is introduced by combining local measurements with non linear Support Vector Machine. Instead of considering a voxel-by-voxel comparison between patients and controls, we focus on landmark points which are characterized by local region descriptors, namely Scale Invariance Feature Transform (SIFT). Then, matching is obtained by introducing the local kernel for which the samples are represented by unordered set of features. Moreover, a new weighting approach is proposed to take into account the discriminative relevance of the detected groups of features. Experiments have been performed including a set of 54 patients with schizophrenia and 54 normal controls on which region of interest (ROI) have been manually traced by experts. Preliminary results on Dorso-lateral PreFrontal Cortex (DLPFC) region are promising since up to 75% of successful classification rate has been obtained with this technique and the performance has improved up to 85% when the subjects have been stratified by sex.

  16. Mass Estimation and Its Applications

    DTIC Science & Technology

    2012-02-23

    parameters); e.g., the rect- angular kernel function has fixed width or fixed per unit size. But the rectangular function used in mass has no parameter...MassTER is implemented in JAVA , and we use DBSCAN in WEKA [13] and a version of DENCLUE implemented in R (www.r-project.org) in our empirical evaluation...Proceedings of SIGKDD, 2010, 989-998. [13] I.H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations

  17. Multiple Kernel Learning for Explosive Hazard Detection in Forward-Looking Ground-Penetrating Radar

    DTIC Science & Technology

    2012-04-01

    Kevin Stone, Derek T. Anderson, James M. Keller, Dominic K.C. Ho, Tuan T. Ton, David C. Wong , Mehrdad Soumekh University of Missouri - Columbia...James M. Kellera, K.C. Hoa, Tuan T. Tone, David C. Wong\\ and Mehrdad Soumekhd aDept. ofElectrical and Computer Engineering, University of Missouri...2] Cremer , F., Schavemaker, J.G., de Jong, W., and Schutte, K., "Comparison of vehicle-mounted forward-looking polarimetric infrared and downward

  18. Multiple Kernel Learning for Heterogeneous Anomaly Detection: Algorithm and Aviation Safety Case Study

    NASA Technical Reports Server (NTRS)

    Das, Santanu; Srivastava, Ashok N.; Matthews, Bryan L.; Oza, Nikunj C.

    2010-01-01

    The world-wide aviation system is one of the most complex dynamical systems ever developed and is generating data at an extremely rapid rate. Most modern commercial aircraft record several hundred flight parameters including information from the guidance, navigation, and control systems, the avionics and propulsion systems, and the pilot inputs into the aircraft. These parameters may be continuous measurements or binary or categorical measurements recorded in one second intervals for the duration of the flight. Currently, most approaches to aviation safety are reactive, meaning that they are designed to react to an aviation safety incident or accident. In this paper, we discuss a novel approach based on the theory of multiple kernel learning to detect potential safety anomalies in very large data bases of discrete and continuous data from world-wide operations of commercial fleets. We pose a general anomaly detection problem which includes both discrete and continuous data streams, where we assume that the discrete streams have a causal influence on the continuous streams. We also assume that atypical sequence of events in the discrete streams can lead to off-nominal system performance. We discuss the application domain, novel algorithms, and also discuss results on real-world data sets. Our algorithm uncovers operationally significant events in high dimensional data streams in the aviation industry which are not detectable using state of the art methods

  19. Robust visual tracking based on deep convolutional neural networks and kernelized correlation filters

    NASA Astrophysics Data System (ADS)

    Yang, Hua; Zhong, Donghong; Liu, Chenyi; Song, Kaiyou; Yin, Zhouping

    2018-03-01

    Object tracking is still a challenging problem in computer vision, as it entails learning an effective model to account for appearance changes caused by occlusion, out of view, plane rotation, scale change, and background clutter. This paper proposes a robust visual tracking algorithm called deep convolutional neural network (DCNNCT) to simultaneously address these challenges. The proposed DCNNCT algorithm utilizes a DCNN to extract the image feature of a tracked target, and the full range of information regarding each convolutional layer is used to express the image feature. Subsequently, the kernelized correlation filters (CF) in each convolutional layer are adaptively learned, the correlation response maps of that are combined to estimate the location of the tracked target. To avoid the case of tracking failure, an online random ferns classifier is employed to redetect the tracked target, and a dual-threshold scheme is used to obtain the final target location by comparing the tracking result with the detection result. Finally, the change in scale of the target is determined by building scale pyramids and training a CF. Extensive experiments demonstrate that the proposed algorithm is effective at tracking, especially when evaluated using an index called the overlap rate. The DCNNCT algorithm is also highly competitive in terms of robustness with respect to state-of-the-art trackers in various challenging scenarios.

  20. Detection of ochratoxin A contamination in stored wheat using near-infrared hyperspectral imaging

    NASA Astrophysics Data System (ADS)

    Senthilkumar, T.; Jayas, D. S.; White, N. D. G.; Fields, P. G.; Gräfenhan, T.

    2017-03-01

    Near-infrared (NIR) hyperspectral imaging system was used to detect five concentration levels of ochratoxin A (OTA) in contaminated wheat kernels. The wheat kernels artificially inoculated with two different OTA producing Penicillium verrucosum strains, two different non-toxigenic P. verrucosum strains, and sterile control wheat kernels were subjected to NIR hyperspectral imaging. The acquired three-dimensional data were reshaped into readable two-dimensional data. Principal Component Analysis (PCA) was applied to the two dimensional data to identify the key wavelengths which had greater significance in detecting OTA contamination in wheat. Statistical and histogram features extracted at the key wavelengths were used in the linear, quadratic and Mahalanobis statistical discriminant models to differentiate between sterile control, five concentration levels of OTA contamination in wheat kernels, and five infection levels of non-OTA producing P. verrucosum inoculated wheat kernels. The classification models differentiated sterile control samples from OTA contaminated wheat kernels and non-OTA producing P. verrucosum inoculated wheat kernels with a 100% accuracy. The classification models also differentiated between five concentration levels of OTA contaminated wheat kernels and between five infection levels of non-OTA producing P. verrucosum inoculated wheat kernels with a correct classification of more than 98%. The non-OTA producing P. verrucosum inoculated wheat kernels and OTA contaminated wheat kernels subjected to hyperspectral imaging provided different spectral patterns.

  1. Credit scoring analysis using kernel discriminant

    NASA Astrophysics Data System (ADS)

    Widiharih, T.; Mukid, M. A.; Mustafid

    2018-05-01

    Credit scoring model is an important tool for reducing the risk of wrong decisions when granting credit facilities to applicants. This paper investigate the performance of kernel discriminant model in assessing customer credit risk. Kernel discriminant analysis is a non- parametric method which means that it does not require any assumptions about the probability distribution of the input. The main ingredient is a kernel that allows an efficient computation of Fisher discriminant. We use several kernel such as normal, epanechnikov, biweight, and triweight. The models accuracy was compared each other using data from a financial institution in Indonesia. The results show that kernel discriminant can be an alternative method that can be used to determine who is eligible for a credit loan. In the data we use, it shows that a normal kernel is relevant to be selected for credit scoring using kernel discriminant model. Sensitivity and specificity reach to 0.5556 and 0.5488 respectively.

  2. Unified Heat Kernel Regression for Diffusion, Kernel Smoothing and Wavelets on Manifolds and Its Application to Mandible Growth Modeling in CT Images

    PubMed Central

    Chung, Moo K.; Qiu, Anqi; Seo, Seongho; Vorperian, Houri K.

    2014-01-01

    We present a novel kernel regression framework for smoothing scalar surface data using the Laplace-Beltrami eigenfunctions. Starting with the heat kernel constructed from the eigenfunctions, we formulate a new bivariate kernel regression framework as a weighted eigenfunction expansion with the heat kernel as the weights. The new kernel regression is mathematically equivalent to isotropic heat diffusion, kernel smoothing and recently popular diffusion wavelets. Unlike many previous partial differential equation based approaches involving diffusion, our approach represents the solution of diffusion analytically, reducing numerical inaccuracy and slow convergence. The numerical implementation is validated on a unit sphere using spherical harmonics. As an illustration, we have applied the method in characterizing the localized growth pattern of mandible surfaces obtained in CT images from subjects between ages 0 and 20 years by regressing the length of displacement vectors with respect to the template surface. PMID:25791435

  3. Correlation and classification of single kernel fluorescence hyperspectral data with aflatoxin concentration in corn kernels inoculated with Aspergillus flavus spores.

    PubMed

    Yao, H; Hruska, Z; Kincaid, R; Brown, R; Cleveland, T; Bhatnagar, D

    2010-05-01

    The objective of this study was to examine the relationship between fluorescence emissions of corn kernels inoculated with Aspergillus flavus and aflatoxin contamination levels within the kernels. Aflatoxin contamination in corn has been a long-standing problem plaguing the grain industry with potentially devastating consequences to corn growers. In this study, aflatoxin-contaminated corn kernels were produced through artificial inoculation of corn ears in the field with toxigenic A. flavus spores. The kernel fluorescence emission data were taken with a fluorescence hyperspectral imaging system when corn kernels were excited with ultraviolet light. Raw fluorescence image data were preprocessed and regions of interest in each image were created for all kernels. The regions of interest were used to extract spectral signatures and statistical information. The aflatoxin contamination level of single corn kernels was then chemically measured using affinity column chromatography. A fluorescence peak shift phenomenon was noted among different groups of kernels with different aflatoxin contamination levels. The fluorescence peak shift was found to move more toward the longer wavelength in the blue region for the highly contaminated kernels and toward the shorter wavelengths for the clean kernels. Highly contaminated kernels were also found to have a lower fluorescence peak magnitude compared with the less contaminated kernels. It was also noted that a general negative correlation exists between measured aflatoxin and the fluorescence image bands in the blue and green regions. The correlation coefficients of determination, r(2), was 0.72 for the multiple linear regression model. The multivariate analysis of variance found that the fluorescence means of four aflatoxin groups, <1, 1-20, 20-100, and >or=100 ng g(-1) (parts per billion), were significantly different from each other at the 0.01 level of alpha. Classification accuracy under a two-class schema ranged from 0.84 to 0.91 when a threshold of either 20 or 100 ng g(-1) was used. Overall, the results indicate that fluorescence hyperspectral imaging may be applicable in estimating aflatoxin content in individual corn kernels.

  4. Intraear Compensation of Field Corn, Zea mays, from Simulated and Naturally Occurring Injury by Ear-Feeding Larvae.

    PubMed

    Steckel, S; Stewart, S D

    2015-06-01

    Ear-feeding larvae, such as corn earworm, Helicoverpa zea Boddie (Lepidoptera: Noctuidae), can be important insect pests of field corn, Zea mays L., by feeding on kernels. Recently introduced, stacked Bacillus thuringiensis (Bt) traits provide improved protection from ear-feeding larvae. Thus, our objective was to evaluate how injury to kernels in the ear tip might affect yield when this injury was inflicted at the blister and milk stages. In 2010, simulated corn earworm injury reduced total kernel weight (i.e., yield) at both the blister and milk stage. In 2011, injury to ear tips at the milk stage affected total kernel weight. No differences in total kernel weight were found in 2013, regardless of when or how much injury was inflicted. Our data suggested that kernels within the same ear could compensate for injury to ear tips by increasing in size, but this increase was not always statistically significant or sufficient to overcome high levels of kernel injury. For naturally occurring injury observed on multiple corn hybrids during 2011 and 2012, our analyses showed either no or a minimal relationship between number of kernels injured by ear-feeding larvae and the total number of kernels per ear, total kernel weight, or the size of individual kernels. The results indicate that intraear compensation for kernel injury to ear tips can occur under at least some conditions. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  5. Evidence-based Kernels: Fundamental Units of Behavioral Influence

    PubMed Central

    Biglan, Anthony

    2008-01-01

    This paper describes evidence-based kernels, fundamental units of behavioral influence that appear to underlie effective prevention and treatment for children, adults, and families. A kernel is a behavior–influence procedure shown through experimental analysis to affect a specific behavior and that is indivisible in the sense that removing any of its components would render it inert. Existing evidence shows that a variety of kernels can influence behavior in context, and some evidence suggests that frequent use or sufficient use of some kernels may produce longer lasting behavioral shifts. The analysis of kernels could contribute to an empirically based theory of behavioral influence, augment existing prevention or treatment efforts, facilitate the dissemination of effective prevention and treatment practices, clarify the active ingredients in existing interventions, and contribute to efficiently developing interventions that are more effective. Kernels involve one or more of the following mechanisms of behavior influence: reinforcement, altering antecedents, changing verbal relational responding, or changing physiological states directly. The paper describes 52 of these kernels, and details practical, theoretical, and research implications, including calling for a national database of kernels that influence human behavior. PMID:18712600

  6. Integrating the Gradient of the Thin Wire Kernel

    NASA Technical Reports Server (NTRS)

    Champagne, Nathan J.; Wilton, Donald R.

    2008-01-01

    A formulation for integrating the gradient of the thin wire kernel is presented. This approach employs a new expression for the gradient of the thin wire kernel derived from a recent technique for numerically evaluating the exact thin wire kernel. This approach should provide essentially arbitrary accuracy and may be used with higher-order elements and basis functions using the procedure described in [4].When the source and observation points are close, the potential integrals over wire segments involving the wire kernel are split into parts to handle the singular behavior of the integrand [1]. The singularity characteristics of the gradient of the wire kernel are different than those of the wire kernel, and the axial and radial components have different singularities. The characteristics of the gradient of the wire kernel are discussed in [2]. To evaluate the near electric and magnetic fields of a wire, the integration of the gradient of the wire kernel needs to be calculated over the source wire. Since the vector bases for current have constant direction on linear wire segments, these integrals reduce to integrals of the form

  7. Adaptive distance metric learning for diffusion tensor image segmentation.

    PubMed

    Kong, Youyong; Wang, Defeng; Shi, Lin; Hui, Steve C N; Chu, Winnie C W

    2014-01-01

    High quality segmentation of diffusion tensor images (DTI) is of key interest in biomedical research and clinical application. In previous studies, most efforts have been made to construct predefined metrics for different DTI segmentation tasks. These methods require adequate prior knowledge and tuning parameters. To overcome these disadvantages, we proposed to automatically learn an adaptive distance metric by a graph based semi-supervised learning model for DTI segmentation. An original discriminative distance vector was first formulated by combining both geometry and orientation distances derived from diffusion tensors. The kernel metric over the original distance and labels of all voxels were then simultaneously optimized in a graph based semi-supervised learning approach. Finally, the optimization task was efficiently solved with an iterative gradient descent method to achieve the optimal solution. With our approach, an adaptive distance metric could be available for each specific segmentation task. Experiments on synthetic and real brain DTI datasets were performed to demonstrate the effectiveness and robustness of the proposed distance metric learning approach. The performance of our approach was compared with three classical metrics in the graph based semi-supervised learning framework.

  8. Adaptive Distance Metric Learning for Diffusion Tensor Image Segmentation

    PubMed Central

    Kong, Youyong; Wang, Defeng; Shi, Lin; Hui, Steve C. N.; Chu, Winnie C. W.

    2014-01-01

    High quality segmentation of diffusion tensor images (DTI) is of key interest in biomedical research and clinical application. In previous studies, most efforts have been made to construct predefined metrics for different DTI segmentation tasks. These methods require adequate prior knowledge and tuning parameters. To overcome these disadvantages, we proposed to automatically learn an adaptive distance metric by a graph based semi-supervised learning model for DTI segmentation. An original discriminative distance vector was first formulated by combining both geometry and orientation distances derived from diffusion tensors. The kernel metric over the original distance and labels of all voxels were then simultaneously optimized in a graph based semi-supervised learning approach. Finally, the optimization task was efficiently solved with an iterative gradient descent method to achieve the optimal solution. With our approach, an adaptive distance metric could be available for each specific segmentation task. Experiments on synthetic and real brain DTI datasets were performed to demonstrate the effectiveness and robustness of the proposed distance metric learning approach. The performance of our approach was compared with three classical metrics in the graph based semi-supervised learning framework. PMID:24651858

  9. 21 CFR 182.40 - Natural extractives (solvent-free) used in conjunction with spices, seasonings, and flavorings.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... source Apricot kernel (persic oil) Prunus armeniaca L. Peach kernel (persic oil) Prunus persica Sieb. et Zucc. Peanut stearine Arachis hypogaea L. Persic oil (see apricot kernel and peach kernel) Quince seed...

  10. 21 CFR 182.40 - Natural extractives (solvent-free) used in conjunction with spices, seasonings, and flavorings.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... source Apricot kernel (persic oil) Prunus armeniaca L. Peach kernel (persic oil) Prunus persica Sieb. et Zucc. Peanut stearine Arachis hypogaea L. Persic oil (see apricot kernel and peach kernel) Quince seed...

  11. 21 CFR 182.40 - Natural extractives (solvent-free) used in conjunction with spices, seasonings, and flavorings.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... source Apricot kernel (persic oil) Prunus armeniaca L. Peach kernel (persic oil) Prunus persica Sieb. et Zucc. Peanut stearine Arachis hypogaea L. Persic oil (see apricot kernel and peach kernel) Quince seed...

  12. Wigner functions defined with Laplace transform kernels.

    PubMed

    Oh, Se Baek; Petruccelli, Jonathan C; Tian, Lei; Barbastathis, George

    2011-10-24

    We propose a new Wigner-type phase-space function using Laplace transform kernels--Laplace kernel Wigner function. Whereas momentum variables are real in the traditional Wigner function, the Laplace kernel Wigner function may have complex momentum variables. Due to the property of the Laplace transform, a broader range of signals can be represented in complex phase-space. We show that the Laplace kernel Wigner function exhibits similar properties in the marginals as the traditional Wigner function. As an example, we use the Laplace kernel Wigner function to analyze evanescent waves supported by surface plasmon polariton. © 2011 Optical Society of America

  13. Influence of wheat kernel physical properties on the pulverizing process.

    PubMed

    Dziki, Dariusz; Cacak-Pietrzak, Grażyna; Miś, Antoni; Jończyk, Krzysztof; Gawlik-Dziki, Urszula

    2014-10-01

    The physical properties of wheat kernel were determined and related to pulverizing performance by correlation analysis. Nineteen samples of wheat cultivars about similar level of protein content (11.2-12.8 % w.b.) and obtained from organic farming system were used for analysis. The kernel (moisture content 10 % w.b.) was pulverized by using the laboratory hammer mill equipped with round holes 1.0 mm screen. The specific grinding energy ranged from 120 kJkg(-1) to 159 kJkg(-1). On the basis of data obtained many of significant correlations (p < 0.05) were found between wheat kernel physical properties and pulverizing process of wheat kernel, especially wheat kernel hardness index (obtained on the basis of Single Kernel Characterization System) and vitreousness significantly and positively correlated with the grinding energy indices and the mass fraction of coarse particles (> 0.5 mm). Among the kernel mechanical properties determined on the basis of uniaxial compression test only the rapture force was correlated with the impact grinding results. The results showed also positive and significant relationships between kernel ash content and grinding energy requirements. On the basis of wheat physical properties the multiple linear regression was proposed for predicting the average particle size of pulverized kernel.

  14. Relationship between processing score and kernel-fraction particle size in whole-plant corn silage.

    PubMed

    Dias Junior, G S; Ferraretto, L F; Salvati, G G S; de Resende, L C; Hoffman, P C; Pereira, M N; Shaver, R D

    2016-04-01

    Kernel processing increases starch digestibility in whole-plant corn silage (WPCS). Corn silage processing score (CSPS), the percentage of starch passing through a 4.75-mm sieve, is widely used to assess degree of kernel breakage in WPCS. However, the geometric mean particle size (GMPS) of the kernel-fraction that passes through the 4.75-mm sieve has not been well described. Therefore, the objectives of this study were (1) to evaluate particle size distribution and digestibility of kernels cut in varied particle sizes; (2) to propose a method to measure GMPS in WPCS kernels; and (3) to evaluate the relationship between CSPS and GMPS of the kernel fraction in WPCS. Composite samples of unfermented, dried kernels from 110 corn hybrids commonly used for silage production were kept whole (WH) or manually cut in 2, 4, 8, 16, 32 or 64 pieces (2P, 4P, 8P, 16P, 32P, and 64P, respectively). Dry sieving to determine GMPS, surface area, and particle size distribution using 9 sieves with nominal square apertures of 9.50, 6.70, 4.75, 3.35, 2.36, 1.70, 1.18, and 0.59 mm and pan, as well as ruminal in situ dry matter (DM) digestibilities were performed for each kernel particle number treatment. Incubation times were 0, 3, 6, 12, and 24 h. The ruminal in situ DM disappearance of unfermented kernels increased with the reduction in particle size of corn kernels. Kernels kept whole had the lowest ruminal DM disappearance for all time points with maximum DM disappearance of 6.9% at 24 h and the greatest disappearance was observed for 64P, followed by 32P and 16P. Samples of WPCS (n=80) from 3 studies representing varied theoretical length of cut settings and processor types and settings were also evaluated. Each WPCS sample was divided in 2 and then dried at 60 °C for 48 h. The CSPS was determined in duplicate on 1 of the split samples, whereas on the other split sample the kernel and stover fractions were separated using a hydrodynamic separation procedure. After separation, the kernel fraction was redried at 60°C for 48 h in a forced-air oven and dry sieved to determine GMPS and surface area. Linear relationships between CSPS from WPCS (n=80) and kernel fraction GMPS, surface area, and proportion passing through the 4.75-mm screen were poor. Strong quadratic relationships between proportion of kernel fraction passing through the 4.75-mm screen and kernel fraction GMPS and surface area were observed. These findings suggest that hydrodynamic separation and dry sieving of the kernel fraction may provide a better assessment of kernel breakage in WPCS than CSPS. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  15. Classification of corn kernels contaminated with aflatoxins using fluorescence and reflectance hyperspectral images analysis

    NASA Astrophysics Data System (ADS)

    Zhu, Fengle; Yao, Haibo; Hruska, Zuzana; Kincaid, Russell; Brown, Robert; Bhatnagar, Deepak; Cleveland, Thomas

    2015-05-01

    Aflatoxins are secondary metabolites produced by certain fungal species of the Aspergillus genus. Aflatoxin contamination remains a problem in agricultural products due to its toxic and carcinogenic properties. Conventional chemical methods for aflatoxin detection are time-consuming and destructive. This study employed fluorescence and reflectance visible near-infrared (VNIR) hyperspectral images to classify aflatoxin contaminated corn kernels rapidly and non-destructively. Corn ears were artificially inoculated in the field with toxigenic A. flavus spores at the early dough stage of kernel development. After harvest, a total of 300 kernels were collected from the inoculated ears. Fluorescence hyperspectral imagery with UV excitation and reflectance hyperspectral imagery with halogen illumination were acquired on both endosperm and germ sides of kernels. All kernels were then subjected to chemical analysis individually to determine aflatoxin concentrations. A region of interest (ROI) was created for each kernel to extract averaged spectra. Compared with healthy kernels, fluorescence spectral peaks for contaminated kernels shifted to longer wavelengths with lower intensity, and reflectance values for contaminated kernels were lower with a different spectral shape in 700-800 nm region. Principal component analysis was applied for data compression before classifying kernels into contaminated and healthy based on a 20 ppb threshold utilizing the K-nearest neighbors algorithm. The best overall accuracy achieved was 92.67% for germ side in the fluorescence data analysis. The germ side generally performed better than endosperm side. Fluorescence and reflectance image data achieved similar accuracy.

  16. Influence of Kernel Age on Fumonisin B1 Production in Maize by Fusarium moniliforme

    PubMed Central

    Warfield, Colleen Y.; Gilchrist, David G.

    1999-01-01

    Production of fumonisins by Fusarium moniliforme on naturally infected maize ears is an important food safety concern due to the toxic nature of this class of mycotoxins. Assessing the potential risk of fumonisin production in developing maize ears prior to harvest requires an understanding of the regulation of toxin biosynthesis during kernel maturation. We investigated the developmental-stage-dependent relationship between maize kernels and fumonisin B1 production by using kernels collected at the blister (R2), milk (R3), dough (R4), and dent (R5) stages following inoculation in culture at their respective field moisture contents with F. moniliforme. Highly significant differences (P ≤ 0.001) in fumonisin B1 production were found among kernels at the different developmental stages. The highest levels of fumonisin B1 were produced on the dent stage kernels, and the lowest levels were produced on the blister stage kernels. The differences in fumonisin B1 production among kernels at the different developmental stages remained significant (P ≤ 0.001) when the moisture contents of the kernels were adjusted to the same level prior to inoculation. We concluded that toxin production is affected by substrate composition as well as by moisture content. Our study also demonstrated that fumonisin B1 biosynthesis on maize kernels is influenced by factors which vary with the developmental age of the tissue. The risk of fumonisin contamination may begin early in maize ear development and increases as the kernels reach physiological maturity. PMID:10388675

  17. Novel near-infrared sampling apparatus for single kernel analysis of oil content in maize.

    PubMed

    Janni, James; Weinstock, B André; Hagen, Lisa; Wright, Steve

    2008-04-01

    A method of rapid, nondestructive chemical and physical analysis of individual maize (Zea mays L.) kernels is needed for the development of high value food, feed, and fuel traits. Near-infrared (NIR) spectroscopy offers a robust nondestructive method of trait determination. However, traditional NIR bulk sampling techniques cannot be applied successfully to individual kernels. Obtaining optimized single kernel NIR spectra for applied chemometric predictive analysis requires a novel sampling technique that can account for the heterogeneous forms, morphologies, and opacities exhibited in individual maize kernels. In this study such a novel technique is described and compared to less effective means of single kernel NIR analysis. Results of the application of a partial least squares (PLS) derived model for predictive determination of percent oil content per individual kernel are shown.

  18. Multiple kernel based feature and decision level fusion of iECO individuals for explosive hazard detection in FLIR imagery

    NASA Astrophysics Data System (ADS)

    Price, Stanton R.; Murray, Bryce; Hu, Lequn; Anderson, Derek T.; Havens, Timothy C.; Luke, Robert H.; Keller, James M.

    2016-05-01

    A serious threat to civilians and soldiers is buried and above ground explosive hazards. The automatic detection of such threats is highly desired. Many methods exist for explosive hazard detection, e.g., hand-held based sensors, downward and forward looking vehicle mounted platforms, etc. In addition, multiple sensors are used to tackle this extreme problem, such as radar and infrared (IR) imagery. In this article, we explore the utility of feature and decision level fusion of learned features for forward looking explosive hazard detection in IR imagery. Specifically, we investigate different ways to fuse learned iECO features pre and post multiple kernel (MK) support vector machine (SVM) based classification. Three MK strategies are explored; fixed rule, heuristics and optimization-based. Performance is assessed in the context of receiver operating characteristic (ROC) curves on data from a U.S. Army test site that contains multiple target and clutter types, burial depths and times of day. Specifically, the results reveal two interesting things. First, the different MK strategies appear to indicate that the different iECO individuals are all more-or-less important and there is not a dominant feature. This is reinforcing as our hypothesis was that iECO provides different ways to approach target detection. Last, we observe that while optimization-based MK is mathematically appealing, i.e., it connects the learning of the fusion to the underlying classification problem we are trying to solve, it appears to be highly susceptible to over fitting and simpler, e.g., fixed rule and heuristics approaches help us realize more generalizable iECO solutions.

  19. Computed tomography coronary stent imaging with iterative reconstruction: a trade-off study between medium kernel and sharp kernel.

    PubMed

    Zhou, Qijing; Jiang, Biao; Dong, Fei; Huang, Peiyu; Liu, Hongtao; Zhang, Minming

    2014-01-01

    To evaluate the improvement of iterative reconstruction in image space (IRIS) technique in computed tomographic (CT) coronary stent imaging with sharp kernel, and to make a trade-off analysis. Fifty-six patients with 105 stents were examined by 128-slice dual-source CT coronary angiography (CTCA). Images were reconstructed using standard filtered back projection (FBP) and IRIS with both medium kernel and sharp kernel applied. Image noise and the stent diameter were investigated. Image noise was measured both in background vessel and in-stent lumen as objective image evaluation. Image noise score and stent score were performed as subjective image evaluation. The CTCA images reconstructed with IRIS were associated with significant noise reduction compared to that of CTCA images reconstructed using FBP technique in both of background vessel and in-stent lumen (the background noise decreased by approximately 25.4% ± 8.2% in medium kernel (P

  20. Mapping QTLs controlling kernel dimensions in a wheat inter-varietal RIL mapping population.

    PubMed

    Cheng, Ruiru; Kong, Zhongxin; Zhang, Liwei; Xie, Quan; Jia, Haiyan; Yu, Dong; Huang, Yulong; Ma, Zhengqiang

    2017-07-01

    Seven kernel dimension QTLs were identified in wheat, and kernel thickness was found to be the most important dimension for grain weight improvement. Kernel morphology and weight of wheat (Triticum aestivum L.) affect both yield and quality; however, the genetic basis of these traits and their interactions has not been fully understood. In this study, to investigate the genetic factors affecting kernel morphology and the association of kernel morphology traits with kernel weight, kernel length (KL), width (KW) and thickness (KT) were evaluated, together with hundred-grain weight (HGW), in a recombinant inbred line population derived from Nanda2419 × Wangshuibai, with data from five trials (two different locations over 3 years). The results showed that HGW was more closely correlated with KT and KW than with KL. A whole genome scan revealed four QTLs for KL, one for KW and two for KT, distributed on five different chromosomes. Of them, QKl.nau-2D for KL, and QKt.nau-4B and QKt.nau-5A for KT were newly identified major QTLs for the respective traits, explaining up to 32.6 and 41.5% of the phenotypic variations, respectively. Increase of KW and KT and reduction of KL/KT and KW/KT ratios always resulted in significant higher grain weight. Lines combining the Nanda 2419 alleles of the 4B and 5A intervals had wider, thicker, rounder kernels and a 14% higher grain weight in the genotype-based analysis. A strong, negative linear relationship of the KW/KT ratio with grain weight was observed. It thus appears that kernel thickness is the most important kernel dimension factor in wheat improvement for higher yield. Mapping and marker identification of the kernel dimension-related QTLs definitely help realize the breeding goals.

  1. Adaptive kernel function using line transect sampling

    NASA Astrophysics Data System (ADS)

    Albadareen, Baker; Ismail, Noriszura

    2018-04-01

    The estimation of f(0) is crucial in the line transect method which is used for estimating population abundance in wildlife survey's. The classical kernel estimator of f(0) has a high negative bias. Our study proposes an adaptation in the kernel function which is shown to be more efficient than the usual kernel estimator. A simulation study is adopted to compare the performance of the proposed estimators with the classical kernel estimators.

  2. Kernel Partial Least Squares for Nonlinear Regression and Discrimination

    NASA Technical Reports Server (NTRS)

    Rosipal, Roman; Clancy, Daniel (Technical Monitor)

    2002-01-01

    This paper summarizes recent results on applying the method of partial least squares (PLS) in a reproducing kernel Hilbert space (RKHS). A previously proposed kernel PLS regression model was proven to be competitive with other regularized regression methods in RKHS. The family of nonlinear kernel-based PLS models is extended by considering the kernel PLS method for discrimination. Theoretical and experimental results on a two-class discrimination problem indicate usefulness of the method.

  3. Pollen source effects on growth of kernel structures and embryo chemical compounds in maize.

    PubMed

    Tanaka, W; Mantese, A I; Maddonni, G A

    2009-08-01

    Previous studies have reported effects of pollen source on the oil concentration of maize (Zea mays) kernels through modifications to both the embryo/kernel ratio and embryo oil concentration. The present study expands upon previous analyses by addressing pollen source effects on the growth of kernel structures (i.e. pericarp, endosperm and embryo), allocation of embryo chemical constituents (i.e. oil, protein, starch and soluble sugars), and the anatomy and histology of the embryos. Maize kernels with different oil concentration were obtained from pollinations with two parental genotypes of contrasting oil concentration. The dynamics of the growth of kernel structures and allocation of embryo chemical constituents were analysed during the post-flowering period. Mature kernels were dissected to study the anatomy (embryonic axis and scutellum) and histology [cell number and cell size of the scutellums, presence of sub-cellular structures in scutellum tissue (starch granules, oil and protein bodies)] of the embryos. Plants of all crosses exhibited a similar kernel number and kernel weight. Pollen source modified neither the growth period of kernel structures, nor pericarp growth rate. By contrast, pollen source determined a trade-off between embryo and endosperm growth rates, which impacted on the embryo/kernel ratio of mature kernels. Modifications to the embryo size were mediated by scutellum cell number. Pollen source also affected (P < 0.01) allocation of embryo chemical compounds. Negative correlations among embryo oil concentration and those of starch (r = 0.98, P < 0.01) and soluble sugars (r = 0.95, P < 0.05) were found. Coincidently, embryos with low oil concentration had an increased (P < 0.05-0.10) scutellum cell area occupied by starch granules and fewer oil bodies. The effects of pollen source on both embryo/kernel ratio and allocation of embryo chemicals seems to be related to the early established sink strength (i.e. sink size and sink activity) of the embryos.

  4. 7 CFR 868.254 - Broken kernels determination.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 7 2010-01-01 2010-01-01 false Broken kernels determination. 868.254 Section 868.254 Agriculture Regulations of the Department of Agriculture (Continued) GRAIN INSPECTION, PACKERS AND STOCKYARD... Governing Application of Standards § 868.254 Broken kernels determination. Broken kernels shall be...

  5. 7 CFR 51.2090 - Serious damage.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... defect which makes a kernel or piece of kernel unsuitable for human consumption, and includes decay...: Shriveling when the kernel is seriously withered, shrunken, leathery, tough or only partially developed: Provided, that partially developed kernels are not considered seriously damaged if more than one-fourth of...

  6. Anisotropic hydrodynamics with a scalar collisional kernel

    NASA Astrophysics Data System (ADS)

    Almaalol, Dekrayat; Strickland, Michael

    2018-04-01

    Prior studies of nonequilibrium dynamics using anisotropic hydrodynamics have used the relativistic Anderson-Witting scattering kernel or some variant thereof. In this paper, we make the first study of the impact of using a more realistic scattering kernel. For this purpose, we consider a conformal system undergoing transversally homogenous and boost-invariant Bjorken expansion and take the collisional kernel to be given by the leading order 2 ↔2 scattering kernel in scalar λ ϕ4 . We consider both classical and quantum statistics to assess the impact of Bose enhancement on the dynamics. We also determine the anisotropic nonequilibrium attractor of a system subject to this collisional kernel. We find that, when the near-equilibrium relaxation-times in the Anderson-Witting and scalar collisional kernels are matched, the scalar kernel results in a higher degree of momentum-space anisotropy during the system's evolution, given the same initial conditions. Additionally, we find that taking into account Bose enhancement further increases the dynamically generated momentum-space anisotropy.

  7. Straight-chain halocarbon forming fluids for TRISO fuel kernel production - Tests with yttria-stabilized zirconia microspheres

    NASA Astrophysics Data System (ADS)

    Baker, M. P.; King, J. C.; Gorman, B. P.; Braley, J. C.

    2015-03-01

    Current methods of TRISO fuel kernel production in the United States use a sol-gel process with trichloroethylene (TCE) as the forming fluid. After contact with radioactive materials, the spent TCE becomes a mixed hazardous waste, and high costs are associated with its recycling or disposal. Reducing or eliminating this mixed waste stream would not only benefit the environment, but would also enhance the economics of kernel production. Previous research yielded three candidates for testing as alternatives to TCE: 1-bromotetradecane, 1-chlorooctadecane, and 1-iodododecane. This study considers the production of yttria-stabilized zirconia (YSZ) kernels in silicone oil and the three chosen alternative formation fluids, with subsequent characterization of the produced kernels and used forming fluid. Kernels formed in silicone oil and bromotetradecane were comparable to those produced by previous kernel production efforts, while those produced in chlorooctadecane and iodododecane experienced gelation issues leading to poor kernel formation and geometry.

  8. Numerical study of the ignition behavior of a post-discharge kernel injected into a turbulent stratified cross-flow

    NASA Astrophysics Data System (ADS)

    Jaravel, Thomas; Labahn, Jeffrey; Ihme, Matthias

    2017-11-01

    The reliable initiation of flame ignition by high-energy spark kernels is critical for the operability of aviation gas turbines. The evolution of a spark kernel ejected by an igniter into a turbulent stratified environment is investigated using detailed numerical simulations with complex chemistry. At early times post ejection, comparisons of simulation results with high-speed Schlieren data show that the initial trajectory of the kernel is well reproduced, with a significant amount of air entrainment from the surrounding flow that is induced by the kernel ejection. After transiting in a non-flammable mixture, the kernel reaches a second stream of flammable methane-air mixture, where the successful of the kernel ignition was found to depend on the local flow state and operating conditions. By performing parametric studies, the probability of kernel ignition was identified, and compared with experimental observations. The ignition behavior is characterized by analyzing the local chemical structure, and its stochastic variability is also investigated.

  9. The site, size, spatial stability, and energetics of an X-ray flare kernel

    NASA Technical Reports Server (NTRS)

    Petrasso, R.; Gerassimenko, M.; Nolte, J.

    1979-01-01

    The site, size evolution, and energetics of an X-ray kernel that dominated a solar flare during its rise and somewhat during its peak are investigated. The position of the kernel remained stationary to within about 3 arc sec over the 30-min interval of observations, despite pulsations in the kernel X-ray brightness in excess of a factor of 10. This suggests a tightly bound, deeply rooted magnetic structure, more plausibly associated with the near chromosphere or low corona rather than with the high corona. The H-alpha flare onset coincided with the appearance of the kernel, again suggesting a close spatial and temporal coupling between the chromospheric H-alpha event and the X-ray kernel. At the first kernel brightness peak its size was no larger than about 2 arc sec, when it accounted for about 40% of the total flare flux. In the second rise phase of the kernel, a source power input of order 2 times 10 to the 24th ergs/sec is minimally required.

  10. The pre-image problem in kernel methods.

    PubMed

    Kwok, James Tin-yau; Tsang, Ivor Wai-hung

    2004-11-01

    In this paper, we address the problem of finding the pre-image of a feature vector in the feature space induced by a kernel. This is of central importance in some kernel applications, such as on using kernel principal component analysis (PCA) for image denoising. Unlike the traditional method which relies on nonlinear optimization, our proposed method directly finds the location of the pre-image based on distance constraints in the feature space. It is noniterative, involves only linear algebra and does not suffer from numerical instability or local minimum problems. Evaluations on performing kernel PCA and kernel clustering on the USPS data set show much improved performance.

  11. Effects of Amygdaline from Apricot Kernel on Transplanted Tumors in Mice.

    PubMed

    Yamshanov, V A; Kovan'ko, E G; Pustovalov, Yu I

    2016-03-01

    The effects of amygdaline from apricot kernel added to fodder on the growth of transplanted LYO-1 and Ehrlich carcinoma were studied in mice. Apricot kernels inhibited the growth of both tumors. Apricot kernels, raw and after thermal processing, given 2 days before transplantation produced a pronounced antitumor effect. Heat-processed apricot kernels given in 3 days after transplantation modified the tumor growth and prolonged animal lifespan. Thermal treatment did not considerably reduce the antitumor effect of apricot kernels. It was hypothesized that the antitumor effect of amygdaline on Ehrlich carcinoma and LYO-1 lymphosarcoma was associated with the presence of bacterial genome in the tumor.

  12. Development of a kernel function for clinical data.

    PubMed

    Daemen, Anneleen; De Moor, Bart

    2009-01-01

    For most diseases and examinations, clinical data such as age, gender and medical history guides clinical management, despite the rise of high-throughput technologies. To fully exploit such clinical information, appropriate modeling of relevant parameters is required. As the widely used linear kernel function has several disadvantages when applied to clinical data, we propose a new kernel function specifically developed for this data. This "clinical kernel function" more accurately represents similarities between patients. Evidently, three data sets were studied and significantly better performances were obtained with a Least Squares Support Vector Machine when based on the clinical kernel function compared to the linear kernel function.

  13. Manycore Performance-Portability: Kokkos Multidimensional Array Library

    DOE PAGES

    Edwards, H. Carter; Sunderland, Daniel; Porter, Vicki; ...

    2012-01-01

    Large, complex scientific and engineering application code have a significant investment in computational kernels to implement their mathematical models. Porting these computational kernels to the collection of modern manycore accelerator devices is a major challenge in that these devices have diverse programming models, application programming interfaces (APIs), and performance requirements. The Kokkos Array programming model provides library-based approach to implement computational kernels that are performance-portable to CPU-multicore and GPGPU accelerator devices. This programming model is based upon three fundamental concepts: (1) manycore compute devices each with its own memory space, (2) data parallel kernels and (3) multidimensional arrays. Kernel executionmore » performance is, especially for NVIDIA® devices, extremely dependent on data access patterns. Optimal data access pattern can be different for different manycore devices – potentially leading to different implementations of computational kernels specialized for different devices. The Kokkos Array programming model supports performance-portable kernels by (1) separating data access patterns from computational kernels through a multidimensional array API and (2) introduce device-specific data access mappings when a kernel is compiled. An implementation of Kokkos Array is available through Trilinos [Trilinos website, http://trilinos.sandia.gov/, August 2011].« less

  14. Protein Subcellular Localization with Gaussian Kernel Discriminant Analysis and Its Kernel Parameter Selection.

    PubMed

    Wang, Shunfang; Nie, Bing; Yue, Kun; Fei, Yu; Li, Wenjia; Xu, Dongshu

    2017-12-15

    Kernel discriminant analysis (KDA) is a dimension reduction and classification algorithm based on nonlinear kernel trick, which can be novelly used to treat high-dimensional and complex biological data before undergoing classification processes such as protein subcellular localization. Kernel parameters make a great impact on the performance of the KDA model. Specifically, for KDA with the popular Gaussian kernel, to select the scale parameter is still a challenging problem. Thus, this paper introduces the KDA method and proposes a new method for Gaussian kernel parameter selection depending on the fact that the differences between reconstruction errors of edge normal samples and those of interior normal samples should be maximized for certain suitable kernel parameters. Experiments with various standard data sets of protein subcellular localization show that the overall accuracy of protein classification prediction with KDA is much higher than that without KDA. Meanwhile, the kernel parameter of KDA has a great impact on the efficiency, and the proposed method can produce an optimum parameter, which makes the new algorithm not only perform as effectively as the traditional ones, but also reduce the computational time and thus improve efficiency.

  15. Differential metabolome analysis of field-grown maize kernels in response to drought stress

    USDA-ARS?s Scientific Manuscript database

    Drought stress constrains maize kernel development and can exacerbate aflatoxin contamination. In order to identify drought responsive metabolites and explore pathways involved in kernel responses, a metabolomics analysis was conducted on kernels from a drought tolerant line, Lo964, and a sensitive ...

  16. Occurrence of 'super soft' wheat kernel texture in hexaploid and tetraploid wheats

    USDA-ARS?s Scientific Manuscript database

    Wheat kernel texture is a key trait that governs milling performance, flour starch damage, flour particle size, flour hydration properties, and baking quality. Kernel texture is commonly measured using the Perten Single Kernel Characterization System (SKCS). The SKCS returns texture values (Hardness...

  17. 7 CFR 868.203 - Basis of determination.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... FOR CERTAIN AGRICULTURAL COMMODITIES United States Standards for Rough Rice Principles Governing..., heat-damaged kernels, red rice and damaged kernels, chalky kernels, other types, color, and the special grade Parboiled rough rice shall be on the basis of the whole and large broken kernels of milled rice...

  18. 7 CFR 868.203 - Basis of determination.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... FOR CERTAIN AGRICULTURAL COMMODITIES United States Standards for Rough Rice Principles Governing..., heat-damaged kernels, red rice and damaged kernels, chalky kernels, other types, color, and the special grade Parboiled rough rice shall be on the basis of the whole and large broken kernels of milled rice...

  19. 7 CFR 868.304 - Broken kernels determination.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 7 Agriculture 7 2011-01-01 2011-01-01 false Broken kernels determination. 868.304 Section 868.304 Agriculture Regulations of the Department of Agriculture (Continued) GRAIN INSPECTION, PACKERS AND STOCKYARD... Application of Standards § 868.304 Broken kernels determination. Broken kernels shall be determined by the use...

  20. 7 CFR 868.304 - Broken kernels determination.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 7 2010-01-01 2010-01-01 false Broken kernels determination. 868.304 Section 868.304 Agriculture Regulations of the Department of Agriculture (Continued) GRAIN INSPECTION, PACKERS AND STOCKYARD... Application of Standards § 868.304 Broken kernels determination. Broken kernels shall be determined by the use...

  1. Machine Learning Feature Selection for Tuning Memory Page Swapping

    DTIC Science & Technology

    2013-09-01

    environments we set up. 13 Figure 4.1 Updated Feature Vector List. Features we added to the kernel are anno - tated with “(MLVM...Feb. 1966. [2] P. J . Denning, “The working set model for program behavior,” Communications of the ACM, vol. 11, no. 5, pp. 323–333, May 1968. [3] L. A...8] R. W. Cart and J . L. Hennessy, “WSClock — A simple and effective algorithm for virtual memory management,” M.S. thesis, Dept. Computer Science

  2. Biasing anisotropic scattering kernels for deep-penetration Monte Carlo calculations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carter, L.L.; Hendricks, J.S.

    1983-01-01

    The exponential transform is often used to improve the efficiency of deep-penetration Monte Carlo calculations. This technique is usually implemented by biasing the distance-to-collision kernel of the transport equation, but leaving the scattering kernel unchanged. Dwivedi obtained significant improvements in efficiency by biasing an isotropic scattering kernel as well as the distance-to-collision kernel. This idea is extended to anisotropic scattering, particularly the highly forward Klein-Nishina scattering of gamma rays.

  3. Performance Characteristics of a Kernel-Space Packet Capture Module

    DTIC Science & Technology

    2010-03-01

    Defense, or the United States Government . AFIT/GCO/ENG/10-03 PERFORMANCE CHARACTERISTICS OF A KERNEL-SPACE PACKET CAPTURE MODULE THESIS Presented to the...3.1.2.3 Prototype. The proof of concept for this research is the design, development, and comparative performance analysis of a kernel level N2d capture...changes to kernel code 5. Can be used for both user-space and kernel-space capture applications in order to control comparative performance analysis to

  4. A Kernel-based Lagrangian method for imperfectly-mixed chemical reactions

    NASA Astrophysics Data System (ADS)

    Schmidt, Michael J.; Pankavich, Stephen; Benson, David A.

    2017-05-01

    Current Lagrangian (particle-tracking) algorithms used to simulate diffusion-reaction equations must employ a certain number of particles to properly emulate the system dynamics-particularly for imperfectly-mixed systems. The number of particles is tied to the statistics of the initial concentration fields of the system at hand. Systems with shorter-range correlation and/or smaller concentration variance require more particles, potentially limiting the computational feasibility of the method. For the well-known problem of bimolecular reaction, we show that using kernel-based, rather than Dirac delta, particles can significantly reduce the required number of particles. We derive the fixed width of a Gaussian kernel for a given reduced number of particles that analytically eliminates the error between kernel and Dirac solutions at any specified time. We also show how to solve for the fixed kernel size by minimizing the squared differences between solutions over any given time interval. Numerical results show that the width of the kernel should be kept below about 12% of the domain size, and that the analytic equations used to derive kernel width suffer significantly from the neglect of higher-order moments. The simulations with a kernel width given by least squares minimization perform better than those made to match at one specific time. A heuristic time-variable kernel size, based on the previous results, performs on par with the least squares fixed kernel size.

  5. Optimized Kernel Entropy Components.

    PubMed

    Izquierdo-Verdiguier, Emma; Laparra, Valero; Jenssen, Robert; Gomez-Chova, Luis; Camps-Valls, Gustau

    2017-06-01

    This brief addresses two main issues of the standard kernel entropy component analysis (KECA) algorithm: the optimization of the kernel decomposition and the optimization of the Gaussian kernel parameter. KECA roughly reduces to a sorting of the importance of kernel eigenvectors by entropy instead of variance, as in the kernel principal components analysis. In this brief, we propose an extension of the KECA method, named optimized KECA (OKECA), that directly extracts the optimal features retaining most of the data entropy by means of compacting the information in very few features (often in just one or two). The proposed method produces features which have higher expressive power. In particular, it is based on the independent component analysis framework, and introduces an extra rotation to the eigen decomposition, which is optimized via gradient-ascent search. This maximum entropy preservation suggests that OKECA features are more efficient than KECA features for density estimation. In addition, a critical issue in both the methods is the selection of the kernel parameter, since it critically affects the resulting performance. Here, we analyze the most common kernel length-scale selection criteria. The results of both the methods are illustrated in different synthetic and real problems. Results show that OKECA returns projections with more expressive power than KECA, the most successful rule for estimating the kernel parameter is based on maximum likelihood, and OKECA is more robust to the selection of the length-scale parameter in kernel density estimation.

  6. SEMI-SUPERVISED OBJECT RECOGNITION USING STRUCTURE KERNEL

    PubMed Central

    Wang, Botao; Xiong, Hongkai; Jiang, Xiaoqian; Ling, Fan

    2013-01-01

    Object recognition is a fundamental problem in computer vision. Part-based models offer a sparse, flexible representation of objects, but suffer from difficulties in training and often use standard kernels. In this paper, we propose a positive definite kernel called “structure kernel”, which measures the similarity of two part-based represented objects. The structure kernel has three terms: 1) the global term that measures the global visual similarity of two objects; 2) the part term that measures the visual similarity of corresponding parts; 3) the spatial term that measures the spatial similarity of geometric configuration of parts. The contribution of this paper is to generalize the discriminant capability of local kernels to complex part-based object models. Experimental results show that the proposed kernel exhibit higher accuracy than state-of-art approaches using standard kernels. PMID:23666108

  7. Burrower bugs (Heteroptera: Cydnidae) in peanut: seasonal species abundance, tillage effects, grade reduction effects, insecticide efficacy, and management.

    PubMed

    Chapin, Jay W; Thomas, James S

    2003-08-01

    Pitfall traps placed in South Carolina peanut, Arachis hypogaea (L.), fields collected three species of burrower bugs (Cydnidae): Cyrtomenus ciliatus (Palisot de Beauvois), Sehirus cinctus cinctus (Palisot de Beauvois), and Pangaeus bilineatus (Say). Cyrtomenus ciliatus was rarely collected. Sehirus cinctus produced a nymphal cohort in peanut during May and June, probably because of abundant henbit seeds, Lamium amplexicaule L., in strip-till production systems. No S. cinctus were present during peanut pod formation. Pangaeus bilineatus was the most abundant species collected and the only species associated with peanut kernel feeding injury. Overwintering P. bilineatus adults were present in a conservation tillage peanut field before planting and two to three subsequent generations were observed. Few nymphs were collected until the R6 (full seed) growth stage. Tillage and choice of cover crop affected P. bilineatus populations. Peanuts strip-tilled into corn or wheat residue had greater P. bilineatus populations and kernel-feeding than conventional tillage or strip-tillage into rye residue. Fall tillage before planting a wheat cover crop also reduced burrower bug feeding on peanut. At-pegging (early July) granular chlorpyrifos treatments were most consistent in suppressing kernel feeding. Kernels fed on by P. bilineatus were on average 10% lighter than unfed on kernels. Pangaeus bilineatus feeding reduced peanut grade by reducing individual kernel weight, and increasing the percentage damaged kernels. Each 10% increase in kernels fed on by P. bilineatus was associated with a 1.7% decrease in total sound mature kernels, and kernel feeding levels above 30% increase the risk of damaged kernel grade penalties.

  8. Imaging and automated detection of Sitophilus oryzae (Coleoptera: Curculionidae) pupae in hard red winter wheat.

    PubMed

    Toews, Michael D; Pearson, Tom C; Campbell, James F

    2006-04-01

    Computed tomography, an imaging technique commonly used for diagnosing internal human health ailments, uses multiple x-rays and sophisticated software to recreate a cross-sectional representation of a subject. The use of this technique to image hard red winter wheat, Triticum aestivm L., samples infested with pupae of Sitophilus oryzae (L.) was investigated. A software program was developed to rapidly recognize and quantify the infested kernels. Samples were imaged in a 7.6-cm (o.d.) plastic tube containing 0, 50, or 100 infested kernels per kg of wheat. Interkernel spaces were filled with corn oil so as to increase the contrast between voids inside kernels and voids among kernels. Automated image processing, using a custom C language software program, was conducted separately on each 100 g portion of the prepared samples. The average detection accuracy in the five infested kernels per 100-g samples was 94.4 +/- 7.3% (mean +/- SD, n = 10), whereas the average detection accuracy in the 10 infested kernels per 100-g sample was 87.3 +/- 7.9% (n = 10). Detection accuracy in the 10 infested kernels per 100-g samples was slightly less than the five infested kernels per 100-g samples because of some infested kernels overlapping with each other or air bubbles in the oil. A mean of 1.2 +/- 0.9 (n = 10) bubbles (per tube) was incorrectly classed as infested kernels in replicates containing no infested kernels. In light of these positive results, future studies should be conducted using additional grains, insect species, and life stages.

  9. Relationship of source and sink in determining kernel composition of maize

    PubMed Central

    Seebauer, Juliann R.; Singletary, George W.; Krumpelman, Paulette M.; Ruffo, Matías L.; Below, Frederick E.

    2010-01-01

    The relative role of the maternal source and the filial sink in controlling the composition of maize (Zea mays L.) kernels is unclear and may be influenced by the genotype and the N supply. The objective of this study was to determine the influence of assimilate supply from the vegetative source and utilization of assimilates by the grain sink on the final composition of maize kernels. Intermated B73×Mo17 recombinant inbred lines (IBM RILs) which displayed contrasting concentrations of endosperm starch were grown in the field with deficient or sufficient N, and the source supply altered by ear truncation (45% reduction) at 15 d after pollination (DAP). The assimilate supply into the kernels was determined at 19 DAP using the agar trap technique, and the final kernel composition was measured. The influence of N supply and kernel ear position on final kernel composition was also determined for a commercial hybrid. Concentrations of kernel protein and starch could be altered by genotype or the N supply, but remained fairly constant along the length of the ear. Ear truncation also produced a range of variation in endosperm starch and protein concentrations. The C/N ratio of the assimilate supply at 19 DAP was directly related to the final kernel composition, with an inverse relationship between the concentrations of starch and protein in the mature endosperm. The accumulation of kernel starch and protein in maize is uniform along the ear, yet adaptable within genotypic limits, suggesting that kernel composition is source limited in maize. PMID:19917600

  10. Genomic Prediction of Genotype × Environment Interaction Kernel Regression Models.

    PubMed

    Cuevas, Jaime; Crossa, José; Soberanis, Víctor; Pérez-Elizalde, Sergio; Pérez-Rodríguez, Paulino; Campos, Gustavo de Los; Montesinos-López, O A; Burgueño, Juan

    2016-11-01

    In genomic selection (GS), genotype × environment interaction (G × E) can be modeled by a marker × environment interaction (M × E). The G × E may be modeled through a linear kernel or a nonlinear (Gaussian) kernel. In this study, we propose using two nonlinear Gaussian kernels: the reproducing kernel Hilbert space with kernel averaging (RKHS KA) and the Gaussian kernel with the bandwidth estimated through an empirical Bayesian method (RKHS EB). We performed single-environment analyses and extended to account for G × E interaction (GBLUP-G × E, RKHS KA-G × E and RKHS EB-G × E) in wheat ( L.) and maize ( L.) data sets. For single-environment analyses of wheat and maize data sets, RKHS EB and RKHS KA had higher prediction accuracy than GBLUP for all environments. For the wheat data, the RKHS KA-G × E and RKHS EB-G × E models did show up to 60 to 68% superiority over the corresponding single environment for pairs of environments with positive correlations. For the wheat data set, the models with Gaussian kernels had accuracies up to 17% higher than that of GBLUP-G × E. For the maize data set, the prediction accuracy of RKHS EB-G × E and RKHS KA-G × E was, on average, 5 to 6% higher than that of GBLUP-G × E. The superiority of the Gaussian kernel models over the linear kernel is due to more flexible kernels that accounts for small, more complex marker main effects and marker-specific interaction effects. Copyright © 2016 Crop Science Society of America.

  11. Genetic dissection of the maize kernel development process via conditional QTL mapping for three developing kernel-related traits in an immortalized F2 population.

    PubMed

    Zhang, Zhanhui; Wu, Xiangyuan; Shi, Chaonan; Wang, Rongna; Li, Shengfei; Wang, Zhaohui; Liu, Zonghua; Xue, Yadong; Tang, Guiliang; Tang, Jihua

    2016-02-01

    Kernel development is an important dynamic trait that determines the final grain yield in maize. To dissect the genetic basis of maize kernel development process, a conditional quantitative trait locus (QTL) analysis was conducted using an immortalized F2 (IF2) population comprising 243 single crosses at two locations over 2 years. Volume (KV) and density (KD) of dried developing kernels, together with kernel weight (KW) at different developmental stages, were used to describe dynamic changes during kernel development. Phenotypic analysis revealed that final KW and KD were determined at DAP22 and KV at DAP29. Unconditional QTL mapping for KW, KV and KD uncovered 97 QTLs at different kernel development stages, of which qKW6b, qKW7a, qKW7b, qKW10b, qKW10c, qKV10a, qKV10b and qKV7 were identified under multiple kernel developmental stages and environments. Among the 26 QTLs detected by conditional QTL mapping, conqKW7a, conqKV7a, conqKV10a, conqKD2, conqKD7 and conqKD8a were conserved between the two mapping methodologies. Furthermore, most of these QTLs were consistent with QTLs and genes for kernel development/grain filling reported in previous studies. These QTLs probably contain major genes associated with the kernel development process, and can be used to improve grain yield and quality through marker-assisted selection.

  12. Image quality of mixed convolution kernel in thoracic computed tomography.

    PubMed

    Neubauer, Jakob; Spira, Eva Maria; Strube, Juliane; Langer, Mathias; Voss, Christian; Kotter, Elmar

    2016-11-01

    The mixed convolution kernel alters his properties geographically according to the depicted organ structure, especially for the lung. Therefore, we compared the image quality of the mixed convolution kernel to standard soft and hard kernel reconstructions for different organ structures in thoracic computed tomography (CT) images.Our Ethics Committee approved this prospective study. In total, 31 patients who underwent contrast-enhanced thoracic CT studies were included after informed consent. Axial reconstructions were performed with hard, soft, and mixed convolution kernel. Three independent and blinded observers rated the image quality according to the European Guidelines for Quality Criteria of Thoracic CT for 13 organ structures. The observers rated the depiction of the structures in all reconstructions on a 5-point Likert scale. Statistical analysis was performed with the Friedman Test and post hoc analysis with the Wilcoxon rank-sum test.Compared to the soft convolution kernel, the mixed convolution kernel was rated with a higher image quality for lung parenchyma, segmental bronchi, and the border between the pleura and the thoracic wall (P < 0.03). Compared to the hard convolution kernel, the mixed convolution kernel was rated with a higher image quality for aorta, anterior mediastinal structures, paratracheal soft tissue, hilar lymph nodes, esophagus, pleuromediastinal border, large and medium sized pulmonary vessels and abdomen (P < 0.004) but a lower image quality for trachea, segmental bronchi, lung parenchyma, and skeleton (P < 0.001).The mixed convolution kernel cannot fully substitute the standard CT reconstructions. Hard and soft convolution kernel reconstructions still seem to be mandatory for thoracic CT.

  13. Total Ambient Dose Equivalent Buildup Factor Determination for Nbs04 Concrete.

    PubMed

    Duckic, Paulina; Hayes, Robert B

    2018-06-01

    Buildup factors are dimensionless multiplicative factors required by the point kernel method to account for scattered radiation through a shielding material. The accuracy of the point kernel method is strongly affected by the correspondence of analyzed parameters to experimental configurations, which is attempted to be simplified here. The point kernel method has not been found to have widespread practical use for neutron shielding calculations due to the complex neutron transport behavior through shielding materials (i.e. the variety of interaction mechanisms that neutrons may undergo while traversing the shield) as well as non-linear neutron total cross section energy dependence. In this work, total ambient dose buildup factors for NBS04 concrete are calculated in terms of neutron and secondary gamma ray transmission factors. The neutron and secondary gamma ray transmission factors are calculated using MCNP6™ code with updated cross sections. Both transmission factors and buildup factors are given in a tabulated form. Practical use of neutron transmission and buildup factors warrants rigorously calculated results with all associated uncertainties. In this work, sensitivity analysis of neutron transmission factors and total buildup factors with varying water content has been conducted. The analysis showed significant impact of varying water content in concrete on both neutron transmission factors and total buildup factors. Finally, support vector regression, a machine learning technique, has been engaged to make a model based on the calculated data for calculation of the buildup factors. The developed model can predict most of the data with 20% relative error.

  14. CONSTRUCTING A FLEXIBLE LIKELIHOOD FUNCTION FOR SPECTROSCOPIC INFERENCE

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Czekala, Ian; Andrews, Sean M.; Mandel, Kaisey S.

    2015-10-20

    We present a modular, extensible likelihood framework for spectroscopic inference based on synthetic model spectra. The subtraction of an imperfect model from a continuously sampled spectrum introduces covariance between adjacent datapoints (pixels) into the residual spectrum. For the high signal-to-noise data with large spectral range that is commonly employed in stellar astrophysics, that covariant structure can lead to dramatically underestimated parameter uncertainties (and, in some cases, biases). We construct a likelihood function that accounts for the structure of the covariance matrix, utilizing the machinery of Gaussian process kernels. This framework specifically addresses the common problem of mismatches in model spectralmore » line strengths (with respect to data) due to intrinsic model imperfections (e.g., in the atomic/molecular databases or opacity prescriptions) by developing a novel local covariance kernel formalism that identifies and self-consistently downweights pathological spectral line “outliers.” By fitting many spectra in a hierarchical manner, these local kernels provide a mechanism to learn about and build data-driven corrections to synthetic spectral libraries. An open-source software implementation of this approach is available at http://iancze.github.io/Starfish, including a sophisticated probabilistic scheme for spectral interpolation when using model libraries that are sparsely sampled in the stellar parameters. We demonstrate some salient features of the framework by fitting the high-resolution V-band spectrum of WASP-14, an F5 dwarf with a transiting exoplanet, and the moderate-resolution K-band spectrum of Gliese 51, an M5 field dwarf.« less

  15. 21 CFR 176.350 - Tamarind seed kernel powder.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 21 Food and Drugs 3 2014-04-01 2014-04-01 false Tamarind seed kernel powder. 176.350 Section 176... Paperboard § 176.350 Tamarind seed kernel powder. Tamarind seed kernel powder may be safely used as a component of articles intended for use in producing, manufacturing, packing, processing, preparing, treating...

  16. Local Observed-Score Kernel Equating

    ERIC Educational Resources Information Center

    Wiberg, Marie; van der Linden, Wim J.; von Davier, Alina A.

    2014-01-01

    Three local observed-score kernel equating methods that integrate methods from the local equating and kernel equating frameworks are proposed. The new methods were compared with their earlier counterparts with respect to such measures as bias--as defined by Lord's criterion of equity--and percent relative error. The local kernel item response…

  17. 7 CFR 51.1241 - Damage.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... which have been broken to the extent that the kernel within is plainly visible without minute... discoloration beneath, but the peanut shall be judged as it appears with the talc. (c) Kernels which are rancid or decayed. (d) Moldy kernels. (e) Kernels showing sprouts extending more than one-eighth inch from...

  18. 7 CFR 981.61 - Redetermination of kernel weight.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Redetermination of kernel weight. 981.61 Section 981... GROWN IN CALIFORNIA Order Regulating Handling Volume Regulation § 981.61 Redetermination of kernel weight. The Board, on the basis of reports by handlers, shall redetermine the kernel weight of almonds...

  19. 7 CFR 981.60 - Determination of kernel weight.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Determination of kernel weight. 981.60 Section 981.60... Regulating Handling Volume Regulation § 981.60 Determination of kernel weight. (a) Almonds for which settlement is made on kernel weight. All lots of almonds, whether shelled or unshelled, for which settlement...

  20. Genome-wide Association Analysis of Kernel Weight in Hard Winter Wheat

    USDA-ARS?s Scientific Manuscript database

    Wheat kernel weight is an important and heritable component of wheat grain yield and a key predictor of flour extraction. Genome-wide association analysis was conducted to identify genomic regions associated with kernel weight and kernel weight environmental response in 8 trials of 299 hard winter ...

  1. 7 CFR 999.400 - Regulation governing the importation of filberts.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ...) Definitions. (1) Filberts means filberts or hazelnuts. (2) Inshell filberts means filberts, the kernels or edible portions of which are contained in the shell. (3) Shelled filberts means the kernels of filberts... Filbert kernels or portions of filbert kernels shall meet the following requirements: (1) Well dried and...

  2. 7 CFR 51.1404 - Tolerances.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    .... (2) For kernel defects, by count. (i) 12 percent for pecans with kernels which fail to meet the... kernels which are seriously damaged: Provided, That not more than six-sevenths of this amount, or 6 percent, shall be allowed for kernels which are rancid, moldy, decayed or injured by insects: And provided...

  3. Enhanced gluten properties in soft kernel durum wheat

    USDA-ARS?s Scientific Manuscript database

    Soft kernel durum wheat is a relatively recent development (Morris et al. 2011 Crop Sci. 51:114). The soft kernel trait exerts profound effects on kernel texture, flour milling including break flour yield, milling energy, and starch damage, and dough water absorption (DWA). With the caveat of reduce...

  4. End-use quality of soft kernel durum wheat

    USDA-ARS?s Scientific Manuscript database

    Kernel texture is a major determinant of end-use quality of wheat. Durum wheat has very hard kernels. We developed soft kernel durum wheat via Ph1b-mediated homoeologous recombination. The Hardness locus was transferred from Chinese Spring to Svevo durum wheat via back-crossing. ‘Soft Svevo’ had SKC...

  5. 7 CFR 51.2560 - Definitions.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... are excessively thin kernels and can have black, brown or gray surface with a dark interior color and the immaturity has adversely affected the flavor of the kernel. (2) Kernel spotting refers to dark brown or dark gray spots aggregating more than one-eighth of the surface of the kernel. (g) Serious...

  6. 7 CFR 51.2560 - Definitions.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... are excessively thin kernels and can have black, brown or gray surface with a dark interior color and the immaturity has adversely affected the flavor of the kernel. (2) Kernel spotting refers to dark brown or dark gray spots aggregating more than one-eighth of the surface of the kernel. (g) Serious...

  7. 7 CFR 51.1416 - Optional determinations.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... throughout the lot. (a) Edible kernel content. A minimum sample of at least 500 grams of in-shell pecans shall be used for determination of edible kernel content. After the sample is weighed and shelled... determine edible kernel content for the lot. (b) Poorly developed kernel content. A minimum sample of at...

  8. Mapping Fire Severity Using Imaging Spectroscopy and Kernel Based Image Analysis

    NASA Astrophysics Data System (ADS)

    Prasad, S.; Cui, M.; Zhang, Y.; Veraverbeke, S.

    2014-12-01

    Improved spatial representation of within-burn heterogeneity after wildfires is paramount to effective land management decisions and more accurate fire emissions estimates. In this work, we demonstrate feasibility and efficacy of airborne imaging spectroscopy (hyperspectral imagery) for quantifying wildfire burn severity, using kernel based image analysis techniques. Two different airborne hyperspectral datasets, acquired over the 2011 Canyon and 2013 Rim fire in California using the Airborne Visible InfraRed Imaging Spectrometer (AVIRIS) sensor, were used in this study. The Rim Fire, covering parts of the Yosemite National Park started on August 17, 2013, and was the third largest fire in California's history. Canyon Fire occurred in the Tehachapi mountains, and started on September 4, 2011. In addition to post-fire data for both fires, half of the Rim fire was also covered with pre-fire images. Fire severity was measured in the field using Geo Composite Burn Index (GeoCBI). The field data was utilized to train and validate our models, wherein the trained models, in conjunction with imaging spectroscopy data were used for GeoCBI estimation wide geographical regions. This work presents an approach for using remotely sensed imagery combined with GeoCBI field data to map fire scars based on a non-linear (kernel based) epsilon-Support Vector Regression (e-SVR), which was used to learn the relationship between spectra and GeoCBI in a kernel-induced feature space. Classification of healthy vegetation versus fire-affected areas based on morphological multi-attribute profiles was also studied. The availability of pre- and post-fire imaging spectroscopy data over the Rim Fire provided a unique opportunity to evaluate the performance of bi-temporal imaging spectroscopy for assessing post-fire effects. This type of data is currently constrained because of limited airborne acquisitions before a fire, but will become widespread with future spaceborne sensors such as those on the planned NASA HyspIRI mission.

  9. Application Characterization at Scale: Lessons learned from developing a distributed Open Community Runtime system for High Performance Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Landwehr, Joshua B.; Suetterlein, Joshua D.; Marquez, Andres

    2016-05-16

    Since 2012, the U.S. Department of Energy’s X-Stack program has been developing solutions including runtime systems, programming models, languages, compilers, and tools for the Exascale system software to address crucial performance and power requirements. Fine grain programming models and runtime systems show a great potential to efficiently utilize the underlying hardware. Thus, they are essential to many X-Stack efforts. An abundant amount of small tasks can better utilize the vast parallelism available on current and future machines. Moreover, finer tasks can recover faster and adapt better, due to a decrease in state and control. Nevertheless, current applications have been writtenmore » to exploit old paradigms (such as Communicating Sequential Processor and Bulk Synchronous Parallel processing). To fully utilize the advantages of these new systems, applications need to be adapted to these new paradigms. As part of the applications’ porting process, in-depth characterization studies, focused on both application characteristics and runtime features, need to take place to fully understand the application performance bottlenecks and how to resolve them. This paper presents a characterization study for a novel high performance runtime system, called the Open Community Runtime, using key HPC kernels as its vehicle. This study has the following contributions: one of the first high performance, fine grain, distributed memory runtime system implementing the OCR standard (version 0.99a); and a characterization study of key HPC kernels in terms of runtime primitives running on both intra and inter node environments. Running on a general purpose cluster, we have found up to 1635x relative speed-up for a parallel tiled Cholesky Kernels on 128 nodes with 16 cores each and a 1864x relative speed-up for a parallel tiled Smith-Waterman kernel on 128 nodes with 30 cores.« less

  10. An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods.

    PubMed

    Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo

    2014-06-01

    In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both local and global learning strategies, able to exploit the overall topology of the network. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.

  11. Nonparametric Density Estimation Based on Self-Organizing Incremental Neural Network for Large Noisy Data.

    PubMed

    Nakamura, Yoshihiro; Hasegawa, Osamu

    2017-01-01

    With the ongoing development and expansion of communication networks and sensors, massive amounts of data are continuously generated in real time from real environments. Beforehand, prediction of a distribution underlying such data is difficult; furthermore, the data include substantial amounts of noise. These factors make it difficult to estimate probability densities. To handle these issues and massive amounts of data, we propose a nonparametric density estimator that rapidly learns data online and has high robustness. Our approach is an extension of both kernel density estimation (KDE) and a self-organizing incremental neural network (SOINN); therefore, we call our approach KDESOINN. An SOINN provides a clustering method that learns about the given data as networks of prototype of data; more specifically, an SOINN can learn the distribution underlying the given data. Using this information, KDESOINN estimates the probability density function. The results of our experiments show that KDESOINN outperforms or achieves performance comparable to the current state-of-the-art approaches in terms of robustness, learning time, and accuracy.

  12. 3DRT-MPASS

    NASA Technical Reports Server (NTRS)

    Lickly, Ben

    2005-01-01

    Data from all current JPL missions are stored in files called SPICE kernels. At present, animators who want to use data from these kernels have to either read through the kernels looking for the desired data, or write programs themselves to retrieve information about all the needed objects for their animations. In this project, methods of automating the process of importing the data from the SPICE kernels were researched. In particular, tools were developed for creating basic scenes in Maya, a 3D computer graphics software package, from SPICE kernels.

  13. Prototype Vector Machine for Large Scale Semi-Supervised Learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Kai; Kwok, James T.; Parvin, Bahram

    2009-04-29

    Practicaldataminingrarelyfalls exactlyinto the supervisedlearning scenario. Rather, the growing amount of unlabeled data poses a big challenge to large-scale semi-supervised learning (SSL). We note that the computationalintensivenessofgraph-based SSLarises largely from the manifold or graph regularization, which in turn lead to large models that are dificult to handle. To alleviate this, we proposed the prototype vector machine (PVM), a highlyscalable,graph-based algorithm for large-scale SSL. Our key innovation is the use of"prototypes vectors" for effcient approximation on both the graph-based regularizer and model representation. The choice of prototypes are grounded upon two important criteria: they not only perform effective low-rank approximation of themore » kernel matrix, but also span a model suffering the minimum information loss compared with the complete model. We demonstrate encouraging performance and appealing scaling properties of the PVM on a number of machine learning benchmark data sets.« less

  14. Quantum machine learning for quantum anomaly detection

    NASA Astrophysics Data System (ADS)

    Liu, Nana; Rebentrost, Patrick

    2018-04-01

    Anomaly detection is used for identifying data that deviate from "normal" data patterns. Its usage on classical data finds diverse applications in many important areas such as finance, fraud detection, medical diagnoses, data cleaning, and surveillance. With the advent of quantum technologies, anomaly detection of quantum data, in the form of quantum states, may become an important component of quantum applications. Machine-learning algorithms are playing pivotal roles in anomaly detection using classical data. Two widely used algorithms are the kernel principal component analysis and the one-class support vector machine. We find corresponding quantum algorithms to detect anomalies in quantum states. We show that these two quantum algorithms can be performed using resources that are logarithmic in the dimensionality of quantum states. For pure quantum states, these resources can also be logarithmic in the number of quantum states used for training the machine-learning algorithm. This makes these algorithms potentially applicable to big quantum data applications.

  15. Graph wavelet alignment kernels for drug virtual screening.

    PubMed

    Smalter, Aaron; Huan, Jun; Lushington, Gerald

    2009-06-01

    In this paper, we introduce a novel statistical modeling technique for target property prediction, with applications to virtual screening and drug design. In our method, we use graphs to model chemical structures and apply a wavelet analysis of graphs to summarize features capturing graph local topology. We design a novel graph kernel function to utilize the topology features to build predictive models for chemicals via Support Vector Machine classifier. We call the new graph kernel a graph wavelet-alignment kernel. We have evaluated the efficacy of the wavelet-alignment kernel using a set of chemical structure-activity prediction benchmarks. Our results indicate that the use of the kernel function yields performance profiles comparable to, and sometimes exceeding that of the existing state-of-the-art chemical classification approaches. In addition, our results also show that the use of wavelet functions significantly decreases the computational costs for graph kernel computation with more than ten fold speedup.

  16. Oecophylla longinoda (Hymenoptera: Formicidae) Lead to Increased Cashew Kernel Size and Kernel Quality.

    PubMed

    Anato, F M; Sinzogan, A A C; Offenberg, J; Adandonon, A; Wargui, R B; Deguenon, J M; Ayelo, P M; Vayssières, J-F; Kossou, D K

    2017-06-01

    Weaver ants, Oecophylla spp., are known to positively affect cashew, Anacardium occidentale L., raw nut yield, but their effects on the kernels have not been reported. We compared nut size and the proportion of marketable kernels between raw nuts collected from trees with and without ants. Raw nuts collected from trees with weaver ants were 2.9% larger than nuts from control trees (i.e., without weaver ants), leading to 14% higher proportion of marketable kernels. On trees with ants, the kernel: raw nut ratio from nuts damaged by formic acid was 4.8% lower compared with nondamaged nuts from the same trees. Weaver ants provided three benefits to cashew production by increasing yields, yielding larger nuts, and by producing greater proportions of marketable kernel mass. © The Authors 2017. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  17. Small convolution kernels for high-fidelity image restoration

    NASA Technical Reports Server (NTRS)

    Reichenbach, Stephen E.; Park, Stephen K.

    1991-01-01

    An algorithm is developed for computing the mean-square-optimal values for small, image-restoration kernels. The algorithm is based on a comprehensive, end-to-end imaging system model that accounts for the important components of the imaging process: the statistics of the scene, the point-spread function of the image-gathering device, sampling effects, noise, and display reconstruction. Subject to constraints on the spatial support of the kernel, the algorithm generates the kernel values that restore the image with maximum fidelity, that is, the kernel minimizes the expected mean-square restoration error. The algorithm is consistent with the derivation of the spatially unconstrained Wiener filter, but leads to a small, spatially constrained kernel that, unlike the unconstrained filter, can be efficiently implemented by convolution. Simulation experiments demonstrate that for a wide range of imaging systems these small kernels can restore images with fidelity comparable to images restored with the unconstrained Wiener filter.

  18. Kernels, Degrees of Freedom, and Power Properties of Quadratic Distance Goodness-of-Fit Tests

    PubMed Central

    Lindsay, Bruce G.; Markatou, Marianthi; Ray, Surajit

    2014-01-01

    In this article, we study the power properties of quadratic-distance-based goodness-of-fit tests. First, we introduce the concept of a root kernel and discuss the considerations that enter the selection of this kernel. We derive an easy to use normal approximation to the power of quadratic distance goodness-of-fit tests and base the construction of a noncentrality index, an analogue of the traditional noncentrality parameter, on it. This leads to a method akin to the Neyman-Pearson lemma for constructing optimal kernels for specific alternatives. We then introduce a midpower analysis as a device for choosing optimal degrees of freedom for a family of alternatives of interest. Finally, we introduce a new diffusion kernel, called the Pearson-normal kernel, and study the extent to which the normal approximation to the power of tests based on this kernel is valid. Supplementary materials for this article are available online. PMID:24764609

  19. The quantitative properties of three soft X-ray flare kernels observed with the AS&E X-ray telescope on Skylab

    NASA Technical Reports Server (NTRS)

    Kahler, S. W.; Petrasso, R. D.; Kane, S. R.

    1976-01-01

    The physical parameters for the kernels of three solar X-ray flare events have been deduced using photographic data from the S-054 X-ray telescope on Skylab as the primary data source and 1-8 and 8-20 A fluxes from Solrad 9 as the secondary data source. The kernels had diameters of about 5-7 seconds of arc and in two cases electron densities at least as high as 0.3 trillion per cu cm. The lifetimes of the kernels were 5-10 min. The presence of thermal conduction during the decay phases is used to argue: (1) that kernels are entire, not small portions of, coronal loop structures, and (2) that flare heating must continue during the decay phase. We suggest a simple geometric model to explain the role of kernels in flares in which kernels are identified with emerging flux regions.

  20. 21 CFR 176.350 - Tamarind seed kernel powder.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 21 Food and Drugs 3 2011-04-01 2011-04-01 false Tamarind seed kernel powder. 176.350 Section 176... Substances for Use Only as Components of Paper and Paperboard § 176.350 Tamarind seed kernel powder. Tamarind seed kernel powder may be safely used as a component of articles intended for use in producing...

Top