Sample records for classification support vector

  1. Progressive Classification Using Support Vector Machines

    NASA Technical Reports Server (NTRS)

    Wagstaff, Kiri; Kocurek, Michael

    2009-01-01

    An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified. The user

  2. Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.

    PubMed

    Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi

    2013-01-01

    The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification.

  3. A support vector machine approach for classification of welding defects from ultrasonic signals

    NASA Astrophysics Data System (ADS)

    Chen, Yuan; Ma, Hong-Wei; Zhang, Guang-Ming

    2014-07-01

    Defect classification is an important issue in ultrasonic non-destructive evaluation. A layered multi-class support vector machine (LMSVM) classification system, which combines multiple SVM classifiers through a layered architecture, is proposed in this paper. The proposed LMSVM classification system is applied to the classification of welding defects from ultrasonic test signals. The measured ultrasonic defect echo signals are first decomposed into wavelet coefficients by the wavelet packet transform. The energy of the wavelet coefficients at different frequency channels are used to construct the feature vectors. The bees algorithm (BA) is then used for feature selection and SVM parameter optimisation for the LMSVM classification system. The BA-based feature selection optimises the energy feature vectors. The optimised feature vectors are input to the LMSVM classification system for training and testing. Experimental results of classifying welding defects demonstrate that the proposed technique is highly robust, precise and reliable for ultrasonic defect classification.

  4. Quantum Support Vector Machine for Big Data Classification

    NASA Astrophysics Data System (ADS)

    Rebentrost, Patrick; Mohseni, Masoud; Lloyd, Seth

    2014-09-01

    Supervised machine learning is the classification of new data based on already classified training examples. In this work, we show that the support vector machine, an optimized binary classifier, can be implemented on a quantum computer, with complexity logarithmic in the size of the vectors and the number of training examples. In cases where classical sampling algorithms require polynomial time, an exponential speedup is obtained. At the core of this quantum big data algorithm is a nonsparse matrix exponentiation technique for efficiently performing a matrix inversion of the training data inner-product (kernel) matrix.

  5. Testing of the Support Vector Machine for Binary-Class Classification

    NASA Technical Reports Server (NTRS)

    Scholten, Matthew

    2011-01-01

    The Support Vector Machine is a powerful algorithm, useful in classifying data in to species. The Support Vector Machines implemented in this research were used as classifiers for the final stage in a Multistage Autonomous Target Recognition system. A single kernel SVM known as SVMlight, and a modified version known as a Support Vector Machine with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SMV as a method for classification. From trial to trial, SVM produces consistent results

  6. Classification of Stellar Spectra with Fuzzy Minimum Within-Class Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Zhong-bao, Liu; Wen-ai, Song; Jing, Zhang; Wen-juan, Zhao

    2017-06-01

    Classification is one of the important tasks in astronomy, especially in spectra analysis. Support Vector Machine (SVM) is a typical classification method, which is widely used in spectra classification. Although it performs well in practice, its classification accuracies can not be greatly improved because of two limitations. One is it does not take the distribution of the classes into consideration. The other is it is sensitive to noise. In order to solve the above problems, inspired by the maximization of the Fisher's Discriminant Analysis (FDA) and the SVM separability constraints, fuzzy minimum within-class support vector machine (FMWSVM) is proposed in this paper. In FMWSVM, the distribution of the classes is reflected by the within-class scatter in FDA and the fuzzy membership function is introduced to decrease the influence of the noise. The comparative experiments with SVM on the SDSS datasets verify the effectiveness of the proposed classifier FMWSVM.

  7. Support Vector Machines for Hyperspectral Remote Sensing Classification

    NASA Technical Reports Server (NTRS)

    Gualtieri, J. Anthony; Cromp, R. F.

    1998-01-01

    The Support Vector Machine provides a new way to design classification algorithms which learn from examples (supervised learning) and generalize when applied to new data. We demonstrate its success on a difficult classification problem from hyperspectral remote sensing, where we obtain performances of 96%, and 87% correct for a 4 class problem, and a 16 class problem respectively. These results are somewhat better than other recent results on the same data. A key feature of this classifier is its ability to use high-dimensional data without the usual recourse to a feature selection step to reduce the dimensionality of the data. For this application, this is important, as hyperspectral data consists of several hundred contiguous spectral channels for each exemplar. We provide an introduction to this new approach, and demonstrate its application to classification of an agriculture scene.

  8. Clifford support vector machines for classification, regression, and recurrence.

    PubMed

    Bayro-Corrochano, Eduardo Jose; Arana-Daniel, Nancy

    2010-11-01

    This paper introduces the Clifford support vector machines (CSVM) as a generalization of the real and complex-valued support vector machines using the Clifford geometric algebra. In this framework, we handle the design of kernels involving the Clifford or geometric product. In this approach, one redefines the optimization variables as multivectors. This allows us to have a multivector as output. Therefore, we can represent multiple classes according to the dimension of the geometric algebra in which we work. We show that one can apply CSVM for classification and regression and also to build a recurrent CSVM. The CSVM is an attractive approach for the multiple input multiple output processing of high-dimensional geometric entities. We carried out comparisons between CSVM and the current approaches to solve multiclass classification and regression. We also study the performance of the recurrent CSVM with experiments involving time series. The authors believe that this paper can be of great use for researchers and practitioners interested in multiclass hypercomplex computing, particularly for applications in complex and quaternion signal and image processing, satellite control, neurocomputation, pattern recognition, computer vision, augmented virtual reality, robotics, and humanoids.

  9. A comparative study of surface EMG classification by fuzzy relevance vector machine and fuzzy support vector machine.

    PubMed

    Xie, Hong-Bo; Huang, Hu; Wu, Jianhua; Liu, Lei

    2015-02-01

    We present a multiclass fuzzy relevance vector machine (FRVM) learning mechanism and evaluate its performance to classify multiple hand motions using surface electromyographic (sEMG) signals. The relevance vector machine (RVM) is a sparse Bayesian kernel method which avoids some limitations of the support vector machine (SVM). However, RVM still suffers the difficulty of possible unclassifiable regions in multiclass problems. We propose two fuzzy membership function-based FRVM algorithms to solve such problems, based on experiments conducted on seven healthy subjects and two amputees with six hand motions. Two feature sets, namely, AR model coefficients and room mean square value (AR-RMS), and wavelet transform (WT) features, are extracted from the recorded sEMG signals. Fuzzy support vector machine (FSVM) analysis was also conducted for wide comparison in terms of accuracy, sparsity, training and testing time, as well as the effect of training sample sizes. FRVM yielded comparable classification accuracy with dramatically fewer support vectors in comparison with FSVM. Furthermore, the processing delay of FRVM was much less than that of FSVM, whilst training time of FSVM much faster than FRVM. The results indicate that FRVM classifier trained using sufficient samples can achieve comparable generalization capability as FSVM with significant sparsity in multi-channel sEMG classification, which is more suitable for sEMG-based real-time control applications.

  10. Gradient Evolution-based Support Vector Machine Algorithm for Classification

    NASA Astrophysics Data System (ADS)

    Zulvia, Ferani E.; Kuo, R. J.

    2018-03-01

    This paper proposes a classification algorithm based on a support vector machine (SVM) and gradient evolution (GE) algorithms. SVM algorithm has been widely used in classification. However, its result is significantly influenced by the parameters. Therefore, this paper aims to propose an improvement of SVM algorithm which can find the best SVMs’ parameters automatically. The proposed algorithm employs a GE algorithm to automatically determine the SVMs’ parameters. The GE algorithm takes a role as a global optimizer in finding the best parameter which will be used by SVM algorithm. The proposed GE-SVM algorithm is verified using some benchmark datasets and compared with other metaheuristic-based SVM algorithms. The experimental results show that the proposed GE-SVM algorithm obtains better results than other algorithms tested in this paper.

  11. Classification of burn wounds using support vector machines

    NASA Astrophysics Data System (ADS)

    Acha, Begona; Serrano, Carmen; Palencia, Sergio; Murillo, Juan Jose

    2004-05-01

    The purpose of this work is to improve a previous method developed by the authors for the classification of burn wounds into their depths. The inputs of the system are color and texture information, as these are the characteristics observed by physicians in order to give a diagnosis. Our previous work consisted in segmenting the burn wound from the rest of the image and classifying the burn into its depth. In this paper we focus on the classification problem only. We already proposed to use a Fuzzy-ARTMAP neural network (NN). However, we may take advantage of new powerful classification tools such as Support Vector Machines (SVM). We apply the five-folded cross validation scheme to divide the database into training and validating sets. Then, we apply a feature selection method for each classifier, which will give us the set of features that yields the smallest classification error for each classifier. Features used to classify are first-order statistical parameters extracted from the L*, u* and v* color components of the image. The feature selection algorithms used are the Sequential Forward Selection (SFS) and the Sequential Backward Selection (SBS) methods. As data of the problem faced here are not linearly separable, the SVM was trained using some different kernels. The validating process shows that the SVM method, when using a Gaussian kernel of variance 1, outperforms classification results obtained with the rest of the classifiers, yielding an error classification rate of 0.7% whereas the Fuzzy-ARTMAP NN attained 1.6 %.

  12. Support vector machine based classification of fast Fourier transform spectroscopy of proteins

    NASA Astrophysics Data System (ADS)

    Lazarevic, Aleksandar; Pokrajac, Dragoljub; Marcano, Aristides; Melikechi, Noureddine

    2009-02-01

    Fast Fourier transform spectroscopy has proved to be a powerful method for study of the secondary structure of proteins since peak positions and their relative amplitude are affected by the number of hydrogen bridges that sustain this secondary structure. However, to our best knowledge, the method has not been used yet for identification of proteins within a complex matrix like a blood sample. The principal reason is the apparent similarity of protein infrared spectra with actual differences usually masked by the solvent contribution and other interactions. In this paper, we propose a novel machine learning based method that uses protein spectra for classification and identification of such proteins within a given sample. The proposed method uses principal component analysis (PCA) to identify most important linear combinations of original spectral components and then employs support vector machine (SVM) classification model applied on such identified combinations to categorize proteins into one of given groups. Our experiments have been performed on the set of four different proteins, namely: Bovine Serum Albumin, Leptin, Insulin-like Growth Factor 2 and Osteopontin. Our proposed method of applying principal component analysis along with support vector machines exhibits excellent classification accuracy when identifying proteins using their infrared spectra.

  13. Support-vector-machines-based multidimensional signal classification for fetal activity characterization

    NASA Astrophysics Data System (ADS)

    Ribes, S.; Voicu, I.; Girault, J. M.; Fournier, M.; Perrotin, F.; Tranquart, F.; Kouamé, D.

    2011-03-01

    Electronic fetal monitoring may be required during the whole pregnancy to closely monitor specific fetal and maternal disorders. Currently used methods suffer from many limitations and are not sufficient to evaluate fetal asphyxia. Fetal activity parameters such as movements, heart rate and associated parameters are essential indicators of the fetus well being, and no current device gives a simultaneous and sufficient estimation of all these parameters to evaluate the fetus well-being. We built for this purpose, a multi-transducer-multi-gate Doppler system and developed dedicated signal processing techniques for fetal activity parameter extraction in order to investigate fetus's asphyxia or well-being through fetal activity parameters. To reach this goal, this paper shows preliminary feasibility of separating normal and compromised fetuses using our system. To do so, data set consisting of two groups of fetal signals (normal and compromised) has been established and provided by physicians. From estimated parameters an instantaneous Manning-like score, referred to as ultrasonic score was introduced and was used together with movements, heart rate and associated parameters in a classification process using Support Vector Machines (SVM) method. The influence of the fetal activity parameters and the performance of the SVM were evaluated using the computation of sensibility, specificity, percentage of support vectors and total classification accuracy. We showed our ability to separate the data into two sets : normal fetuses and compromised fetuses and obtained an excellent matching with the clinical classification performed by physician.

  14. SVM Classifier - a comprehensive java interface for support vector machine classification of microarray data.

    PubMed

    Pirooznia, Mehdi; Deng, Youping

    2006-12-12

    Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1-BRCA2 samples with RBF kernel of SVM. We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at http://mfgn.usm.edu/ebl/svm/.

  15. Support Vector Machine algorithm for regression and classification

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yu, Chenggang; Zavaljevski, Nela

    2001-08-01

    The software is an implementation of the Support Vector Machine (SVM) algorithm that was invented and developed by Vladimir Vapnik and his co-workers at AT&T Bell Laboratories. The specific implementation reported here is an Active Set method for solving a quadratic optimization problem that forms the major part of any SVM program. The implementation is tuned to specific constraints generated in the SVM learning. Thus, it is more efficient than general-purpose quadratic optimization programs. A decomposition method has been implemented in the software that enables processing large data sets. The size of the learning data is virtually unlimited by themore » capacity of the computer physical memory. The software is flexible and extensible. Two upper bounds are implemented to regulate the SVM learning for classification, which allow users to adjust the false positive and false negative rates. The software can be used either as a standalone, general-purpose SVM regression or classification program, or be embedded into a larger software system.« less

  16. Protein Kinase Classification with 2866 Hidden Markov Models and One Support Vector Machine

    NASA Technical Reports Server (NTRS)

    Weber, Ryan; New, Michael H.; Fonda, Mark (Technical Monitor)

    2002-01-01

    The main application considered in this paper is predicting true kinases from randomly permuted kinases that share the same length and amino acid distributions as the true kinases. Numerous methods already exist for this classification task, such as HMMs, motif-matchers, and sequence comparison algorithms. We build on some of these efforts by creating a vector from the output of thousands of structurally based HMMs, created offline with Pfam-A seed alignments using SAM-T99, which then must be combined into an overall classification for the protein. Then we use a Support Vector Machine for classifying this large ensemble Pfam-Vector, with a polynomial and chisquared kernel. In particular, the chi-squared kernel SVM performs better than the HMMs and better than the BLAST pairwise comparisons, when predicting true from false kinases in some respects, but no one algorithm is best for all purposes or in all instances so we consider the particular strengths and weaknesses of each.

  17. Support-vector-machine tree-based domain knowledge learning toward automated sports video classification

    NASA Astrophysics Data System (ADS)

    Xiao, Guoqiang; Jiang, Yang; Song, Gang; Jiang, Jianmin

    2010-12-01

    We propose a support-vector-machine (SVM) tree to hierarchically learn from domain knowledge represented by low-level features toward automatic classification of sports videos. The proposed SVM tree adopts a binary tree structure to exploit the nature of SVM's binary classification, where each internal node is a single SVM learning unit, and each external node represents the classified output type. Such a SVM tree presents a number of advantages, which include: 1. low computing cost; 2. integrated learning and classification while preserving individual SVM's learning strength; and 3. flexibility in both structure and learning modules, where different numbers of nodes and features can be added to address specific learning requirements, and various learning models can be added as individual nodes, such as neural networks, AdaBoost, hidden Markov models, dynamic Bayesian networks, etc. Experiments support that the proposed SVM tree achieves good performances in sports video classifications.

  18. Optimization of Support Vector Machine (SVM) for Object Classification

    NASA Technical Reports Server (NTRS)

    Scholten, Matthew; Dhingra, Neil; Lu, Thomas T.; Chao, Tien-Hsin

    2012-01-01

    The Support Vector Machine (SVM) is a powerful algorithm, useful in classifying data into species. The SVMs implemented in this research were used as classifiers for the final stage in a Multistage Automatic Target Recognition (ATR) system. A single kernel SVM known as SVMlight, and a modified version known as a SVM with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification. From trial to trial, SVM produces consistent results.

  19. An implementation of support vector machine on sentiment classification of movie reviews

    NASA Astrophysics Data System (ADS)

    Yulietha, I. M.; Faraby, S. A.; Adiwijaya; Widyaningtyas, W. C.

    2018-03-01

    With technological advances, all information about movie is available on the internet. If the information is processed properly, it will get the quality of the information. This research proposes to the classify sentiments on movie review documents. This research uses Support Vector Machine (SVM) method because it can classify high dimensional data in accordance with the data used in this research in the form of text. Support Vector Machine is a popular machine learning technique for text classification because it can classify by learning from a collection of documents that have been classified previously and can provide good result. Based on number of datasets, the 90-10 composition has the best result that is 85.6%. Based on SVM kernel, kernel linear with constant 1 has the best result that is 84.9%

  20. Fabric wrinkle characterization and classification using modified wavelet coefficients and optimized support-vector-machine classifier

    USDA-ARS?s Scientific Manuscript database

    This paper presents a novel wrinkle evaluation method that uses modified wavelet coefficients and an optimized support-vector-machine (SVM) classification scheme to characterize and classify wrinkle appearance of fabric. Fabric images were decomposed with the wavelet transform (WT), and five parame...

  1. A tool for urban soundscape evaluation applying Support Vector Machines for developing a soundscape classification model.

    PubMed

    Torija, Antonio J; Ruiz, Diego P; Ramos-Ridao, Angel F

    2014-06-01

    To ensure appropriate soundscape management in urban environments, the urban-planning authorities need a range of tools that enable such a task to be performed. An essential step during the management of urban areas from a sound standpoint should be the evaluation of the soundscape in such an area. In this sense, it has been widely acknowledged that a subjective and acoustical categorization of a soundscape is the first step to evaluate it, providing a basis for designing or adapting it to match people's expectations as well. In this sense, this work proposes a model for automatic classification of urban soundscapes. This model is intended for the automatic classification of urban soundscapes based on underlying acoustical and perceptual criteria. Thus, this classification model is proposed to be used as a tool for a comprehensive urban soundscape evaluation. Because of the great complexity associated with the problem, two machine learning techniques, Support Vector Machines (SVM) and Support Vector Machines trained with Sequential Minimal Optimization (SMO), are implemented in developing model classification. The results indicate that the SMO model outperforms the SVM model in the specific task of soundscape classification. With the implementation of the SMO algorithm, the classification model achieves an outstanding performance (91.3% of instances correctly classified). © 2013 Elsevier B.V. All rights reserved.

  2. Fuzzy support vector machine for microarray imbalanced data classification

    NASA Astrophysics Data System (ADS)

    Ladayya, Faroh; Purnami, Santi Wulan; Irhamah

    2017-11-01

    DNA microarrays are data containing gene expression with small sample sizes and high number of features. Furthermore, imbalanced classes is a common problem in microarray data. This occurs when a dataset is dominated by a class which have significantly more instances than the other minority classes. Therefore, it is needed a classification method that solve the problem of high dimensional and imbalanced data. Support Vector Machine (SVM) is one of the classification methods that is capable of handling large or small samples, nonlinear, high dimensional, over learning and local minimum issues. SVM has been widely applied to DNA microarray data classification and it has been shown that SVM provides the best performance among other machine learning methods. However, imbalanced data will be a problem because SVM treats all samples in the same importance thus the results is bias for minority class. To overcome the imbalanced data, Fuzzy SVM (FSVM) is proposed. This method apply a fuzzy membership to each input point and reformulate the SVM such that different input points provide different contributions to the classifier. The minority classes have large fuzzy membership so FSVM can pay more attention to the samples with larger fuzzy membership. Given DNA microarray data is a high dimensional data with a very large number of features, it is necessary to do feature selection first using Fast Correlation based Filter (FCBF). In this study will be analyzed by SVM, FSVM and both methods by applying FCBF and get the classification performance of them. Based on the overall results, FSVM on selected features has the best classification performance compared to SVM.

  3. Classification of Regional Ionospheric Disturbances Based on Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Begüm Terzi, Merve; Arikan, Feza; Arikan, Orhan; Karatay, Secil

    2016-07-01

    Ionosphere is an anisotropic, inhomogeneous, time varying and spatio-temporally dispersive medium whose parameters can be estimated almost always by using indirect measurements. Geomagnetic, gravitational, solar or seismic activities cause variations of ionosphere at various spatial and temporal scales. This complex spatio-temporal variability is challenging to be identified due to extensive scales in period, duration, amplitude and frequency of disturbances. Since geomagnetic and solar indices such as Disturbance storm time (Dst), F10.7 solar flux, Sun Spot Number (SSN), Auroral Electrojet (AE), Kp and W-index provide information about variability on a global scale, identification and classification of regional disturbances poses a challenge. The main aim of this study is to classify the regional effects of global geomagnetic storms and classify them according to their risk levels. For this purpose, Total Electron Content (TEC) estimated from GPS receivers, which is one of the major parameters of ionosphere, will be used to model the regional and local variability that differs from global activity along with solar and geomagnetic indices. In this work, for the automated classification of the regional disturbances, a classification technique based on a robust machine learning technique that have found wide spread use, Support Vector Machine (SVM) is proposed. SVM is a supervised learning model used for classification with associated learning algorithm that analyze the data and recognize patterns. In addition to performing linear classification, SVM can efficiently perform nonlinear classification by embedding data into higher dimensional feature spaces. Performance of the developed classification technique is demonstrated for midlatitude ionosphere over Anatolia using TEC estimates generated from the GPS data provided by Turkish National Permanent GPS Network (TNPGN-Active) for solar maximum year of 2011. As a result of implementing the developed classification

  4. Credit Risk Evaluation Using a C-Variable Least Squares Support Vector Classification Model

    NASA Astrophysics Data System (ADS)

    Yu, Lean; Wang, Shouyang; Lai, K. K.

    Credit risk evaluation is one of the most important issues in financial risk management. In this paper, a C-variable least squares support vector classification (C-VLSSVC) model is proposed for credit risk analysis. The main idea of this model is based on the prior knowledge that different classes may have different importance for modeling and more weights should be given to those classes with more importance. The C-VLSSVC model can be constructed by a simple modification of the regularization parameter in LSSVC, whereby more weights are given to the lease squares classification errors with important classes than the lease squares classification errors with unimportant classes while keeping the regularized terms in its original form. For illustration purpose, a real-world credit dataset is used to test the effectiveness of the C-VLSSVC model.

  5. An assessment of support vector machines for land cover classification

    USGS Publications Warehouse

    Huang, C.; Davis, L.S.; Townshend, J.R.G.

    2002-01-01

    The support vector machine (SVM) is a group of theoretically superior machine learning algorithms. It was found competitive with the best available machine learning algorithms in classifying high-dimensional data sets. This paper gives an introduction to the theoretical development of the SVM and an experimental evaluation of its accuracy, stability and training speed in deriving land cover classifications from satellite images. The SVM was compared to three other popular classifiers, including the maximum likelihood classifier (MLC), neural network classifiers (NNC) and decision tree classifiers (DTC). The impacts of kernel configuration on the performance of the SVM and of the selection of training data and input variables on the four classifiers were also evaluated in this experiment.

  6. Stroke localization and classification using microwave tomography with k-means clustering and support vector machine.

    PubMed

    Guo, Lei; Abbosh, Amin

    2018-05-01

    For any chance for stroke patients to survive, the stroke type should be classified to enable giving medication within a few hours of the onset of symptoms. In this paper, a microwave-based stroke localization and classification framework is proposed. It is based on microwave tomography, k-means clustering, and a support vector machine (SVM) method. The dielectric profile of the brain is first calculated using the Born iterative method, whereas the amplitude of the dielectric profile is then taken as the input to k-means clustering. The cluster is selected as the feature vector for constructing and testing the SVM. A database of MRI-derived realistic head phantoms at different signal-to-noise ratios is used in the classification procedure. The performance of the proposed framework is evaluated using the receiver operating characteristic (ROC) curve. The results based on a two-dimensional framework show that 88% classification accuracy, with a sensitivity of 91% and a specificity of 87%, can be achieved. Bioelectromagnetics. 39:312-324, 2018. © 2018 Wiley Periodicals, Inc. © 2018 Wiley Periodicals, Inc.

  7. Support vector machine and principal component analysis for microarray data classification

    NASA Astrophysics Data System (ADS)

    Astuti, Widi; Adiwijaya

    2018-03-01

    Cancer is a leading cause of death worldwide although a significant proportion of it can be cured if it is detected early. In recent decades, technology called microarray takes an important role in the diagnosis of cancer. By using data mining technique, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. The characteristic of microarray data is small sample but it has huge dimension. Since that, there is a challenge for researcher to provide solutions for microarray data classification with high performance in both accuracy and running time. This research proposed the usage of Principal Component Analysis (PCA) as a dimension reduction method along with Support Vector Method (SVM) optimized by kernel functions as a classifier for microarray data classification. The proposed scheme was applied on seven data sets using 5-fold cross validation and then evaluation and analysis conducted on term of both accuracy and running time. The result showed that the scheme can obtained 100% accuracy for Ovarian and Lung Cancer data when Linear and Cubic kernel functions are used. In term of running time, PCA greatly reduced the running time for every data sets.

  8. Experimental Investigation for Fault Diagnosis Based on a Hybrid Approach Using Wavelet Packet and Support Vector Classification

    PubMed Central

    Li, Pengfei; Jiang, Yongying; Xiang, Jiawei

    2014-01-01

    To deal with the difficulty to obtain a large number of fault samples under the practical condition for mechanical fault diagnosis, a hybrid method that combined wavelet packet decomposition and support vector classification (SVC) is proposed. The wavelet packet is employed to decompose the vibration signal to obtain the energy ratio in each frequency band. Taking energy ratios as feature vectors, the pattern recognition results are obtained by the SVC. The rolling bearing and gear fault diagnostic results of the typical experimental platform show that the present approach is robust to noise and has higher classification accuracy and, thus, provides a better way to diagnose mechanical faults under the condition of small fault samples. PMID:24688361

  9. Support Vector Machines Trained with Evolutionary Algorithms Employing Kernel Adatron for Large Scale Classification of Protein Structures.

    PubMed

    Arana-Daniel, Nancy; Gallegos, Alberto A; López-Franco, Carlos; Alanís, Alma Y; Morales, Jacob; López-Franco, Adriana

    2016-01-01

    With the increasing power of computers, the amount of data that can be processed in small periods of time has grown exponentially, as has the importance of classifying large-scale data efficiently. Support vector machines have shown good results classifying large amounts of high-dimensional data, such as data generated by protein structure prediction, spam recognition, medical diagnosis, optical character recognition and text classification, etc. Most state of the art approaches for large-scale learning use traditional optimization methods, such as quadratic programming or gradient descent, which makes the use of evolutionary algorithms for training support vector machines an area to be explored. The present paper proposes an approach that is simple to implement based on evolutionary algorithms and Kernel-Adatron for solving large-scale classification problems, focusing on protein structure prediction. The functional properties of proteins depend upon their three-dimensional structures. Knowing the structures of proteins is crucial for biology and can lead to improvements in areas such as medicine, agriculture and biofuels.

  10. Support Vector Machine Model for Automatic Detection and Classification of Seismic Events

    NASA Astrophysics Data System (ADS)

    Barros, Vesna; Barros, Lucas

    2016-04-01

    The automated processing of multiple seismic signals to detect, localize and classify seismic events is a central tool in both natural hazards monitoring and nuclear treaty verification. However, false detections and missed detections caused by station noise and incorrect classification of arrivals are still an issue and the events are often unclassified or poorly classified. Thus, machine learning techniques can be used in automatic processing for classifying the huge database of seismic recordings and provide more confidence in the final output. Applied in the context of the International Monitoring System (IMS) - a global sensor network developed for the Comprehensive Nuclear-Test-Ban Treaty (CTBT) - we propose a fully automatic method for seismic event detection and classification based on a supervised pattern recognition technique called the Support Vector Machine (SVM). According to Kortström et al., 2015, the advantages of using SVM are handleability of large number of features and effectiveness in high dimensional spaces. Our objective is to detect seismic events from one IMS seismic station located in an area of high seismicity and mining activity and classify them as earthquakes or quarry blasts. It is expected to create a flexible and easily adjustable SVM method that can be applied in different regions and datasets. Taken a step further, accurate results for seismic stations could lead to a modification of the model and its parameters to make it applicable to other waveform technologies used to monitor nuclear explosions such as infrasound and hydroacoustic waveforms. As an authorized user, we have direct access to all IMS data and bulletins through a secure signatory account. A set of significant seismic waveforms containing different types of events (e.g. earthquake, quarry blasts) and noise is being analysed to train the model and learn the typical pattern of the signal from these events. Moreover, comparing the performance of the support-vector

  11. Combined empirical mode decomposition and texture features for skin lesion classification using quadratic support vector machine.

    PubMed

    Wahba, Maram A; Ashour, Amira S; Napoleon, Sameh A; Abd Elnaby, Mustafa M; Guo, Yanhui

    2017-12-01

    Basal cell carcinoma is one of the most common malignant skin lesions. Automated lesion identification and classification using image processing techniques is highly required to reduce the diagnosis errors. In this study, a novel technique is applied to classify skin lesion images into two classes, namely the malignant Basal cell carcinoma and the benign nevus. A hybrid combination of bi-dimensional empirical mode decomposition and gray-level difference method features is proposed after hair removal. The combined features are further classified using quadratic support vector machine (Q-SVM). The proposed system has achieved outstanding performance of 100% accuracy, sensitivity and specificity compared to other support vector machine procedures as well as with different extracted features. Basal Cell Carcinoma is effectively classified using Q-SVM with the proposed combined features.

  12. TWSVR: Regression via Twin Support Vector Machine.

    PubMed

    Khemchandani, Reshma; Goyal, Keshav; Chandra, Suresh

    2016-02-01

    Taking motivation from Twin Support Vector Machine (TWSVM) formulation, Peng (2010) attempted to propose Twin Support Vector Regression (TSVR) where the regressor is obtained via solving a pair of quadratic programming problems (QPPs). In this paper we argue that TSVR formulation is not in the true spirit of TWSVM. Further, taking motivation from Bi and Bennett (2003), we propose an alternative approach to find a formulation for Twin Support Vector Regression (TWSVR) which is in the true spirit of TWSVM. We show that our proposed TWSVR can be derived from TWSVM for an appropriately constructed classification problem. To check the efficacy of our proposed TWSVR we compare its performance with TSVR and classical Support Vector Regression(SVR) on various regression datasets. Copyright © 2015 Elsevier Ltd. All rights reserved.

  13. Comparison of Support Vector Machine, Neural Network, and CART Algorithms for the Land-Cover Classification Using Limited Training Data Points

    EPA Science Inventory

    Support vector machine (SVM) was applied for land-cover characterization using MODIS time-series data. Classification performance was examined with respect to training sample size, sample variability, and landscape homogeneity (purity). The results were compared to two convention...

  14. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics

    PubMed Central

    HUANG, SHUJUN; CAI, NIANGUANG; PACHECO, PEDRO PENZUTI; NARANDES, SHAVIRA; WANG, YANG; XU, WAYNE

    2017-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better understanding of cancer driver genes. Herein we reviewed the recent progress of SVMs in cancer genomic studies. We intend to comprehend the strength of the SVM learning and its future perspective in cancer genomic applications. PMID:29275361

  15. Vision based nutrient deficiency classification in maize plants using multi class support vector machines

    NASA Astrophysics Data System (ADS)

    Leena, N.; Saju, K. K.

    2018-04-01

    Nutritional deficiencies in plants are a major concern for farmers as it affects productivity and thus profit. The work aims to classify nutritional deficiencies in maize plant in a non-destructive mannerusing image processing and machine learning techniques. The colored images of the leaves are analyzed and classified with multi-class support vector machine (SVM) method. Several images of maize leaves with known deficiencies like nitrogen, phosphorous and potassium (NPK) are used to train the SVM classifier prior to the classification of test images. The results show that the method was able to classify and identify nutritional deficiencies.

  16. Predicting primary progressive aphasias with support vector machine approaches in structural MRI data.

    PubMed

    Bisenius, Sandrine; Mueller, Karsten; Diehl-Schmid, Janine; Fassbender, Klaus; Grimmer, Timo; Jessen, Frank; Kassubek, Jan; Kornhuber, Johannes; Landwehrmeyer, Bernhard; Ludolph, Albert; Schneider, Anja; Anderl-Straub, Sarah; Stuke, Katharina; Danek, Adrian; Otto, Markus; Schroeter, Matthias L

    2017-01-01

    Primary progressive aphasia (PPA) encompasses the three subtypes nonfluent/agrammatic variant PPA, semantic variant PPA, and the logopenic variant PPA, which are characterized by distinct patterns of language difficulties and regional brain atrophy. To validate the potential of structural magnetic resonance imaging data for early individual diagnosis, we used support vector machine classification on grey matter density maps obtained by voxel-based morphometry analysis to discriminate PPA subtypes (44 patients: 16 nonfluent/agrammatic variant PPA, 17 semantic variant PPA, 11 logopenic variant PPA) from 20 healthy controls (matched for sample size, age, and gender) in the cohort of the multi-center study of the German consortium for frontotemporal lobar degeneration. Here, we compared a whole-brain with a meta-analysis-based disease-specific regions-of-interest approach for support vector machine classification. We also used support vector machine classification to discriminate the three PPA subtypes from each other. Whole brain support vector machine classification enabled a very high accuracy between 91 and 97% for identifying specific PPA subtypes vs. healthy controls, and 78/95% for the discrimination between semantic variant vs. nonfluent/agrammatic or logopenic PPA variants. Only for the discrimination between nonfluent/agrammatic and logopenic PPA variants accuracy was low with 55%. Interestingly, the regions that contributed the most to the support vector machine classification of patients corresponded largely to the regions that were atrophic in these patients as revealed by group comparisons. Although the whole brain approach took also into account regions that were not covered in the regions-of-interest approach, both approaches showed similar accuracies due to the disease-specificity of the selected networks. Conclusion, support vector machine classification of multi-center structural magnetic resonance imaging data enables prediction of PPA subtypes with

  17. Automatic road sign detecion and classification based on support vector machines and HOG descriptos

    NASA Astrophysics Data System (ADS)

    Adam, A.; Ioannidis, C.

    2014-05-01

    This paper examines the detection and classification of road signs in color-images acquired by a low cost camera mounted on a moving vehicle. A new method for the detection and classification of road signs is proposed based on color based detection, in order to locate regions of interest. Then, a circular Hough transform is applied to complete detection taking advantage of the shape properties of the road signs. The regions of interest are finally represented using HOG descriptors and are fed into trained Support Vector Machines (SVMs) in order to be recognized. For the training procedure, a database with several training examples depicting Greek road sings has been developed. Many experiments have been conducted and are presented, to measure the efficiency of the proposed methodology especially under adverse weather conditions and poor illumination. For the experiments training datasets consisting of different number of examples were used and the results are presented, along with some possible extensions of this work.

  18. Per-field crop classification in irrigated agricultural regions in middle Asia using random forest and support vector machine ensemble

    NASA Astrophysics Data System (ADS)

    Löw, Fabian; Schorcht, Gunther; Michel, Ulrich; Dech, Stefan; Conrad, Christopher

    2012-10-01

    Accurate crop identification and crop area estimation are important for studies on irrigated agricultural systems, yield and water demand modeling, and agrarian policy development. In this study a novel combination of Random Forest (RF) and Support Vector Machine (SVM) classifiers is presented that (i) enhances crop classification accuracy and (ii) provides spatial information on map uncertainty. The methodology was implemented over four distinct irrigated sites in Middle Asia using RapidEye time series data. The RF feature importance statistics was used as feature-selection strategy for the SVM to assess possible negative effects on classification accuracy caused by an oversized feature space. The results of the individual RF and SVM classifications were combined with rules based on posterior classification probability and estimates of classification probability entropy. SVM classification performance was increased by feature selection through RF. Further experimental results indicate that the hybrid classifier improves overall classification accuracy in comparison to the single classifiers as well as useŕs and produceŕs accuracy.

  19. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics.

    PubMed

    Huang, Shujun; Cai, Nianguang; Pacheco, Pedro Penzuti; Narrandes, Shavira; Wang, Yang; Xu, Wayne

    2018-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better understanding of cancer driver genes. Herein we reviewed the recent progress of SVMs in cancer genomic studies. We intend to comprehend the strength of the SVM learning and its future perspective in cancer genomic applications. Copyright© 2018, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.

  20. Speech sound classification and detection of articulation disorders with support vector machines and wavelets.

    PubMed

    Georgoulas, George; Georgopoulos, Voula C; Stylios, Chrysostomos D

    2006-01-01

    This paper proposes a novel integrated methodology to extract features and classify speech sounds with intent to detect the possible existence of a speech articulation disorder in a speaker. Articulation, in effect, is the specific and characteristic way that an individual produces the speech sounds. A methodology to process the speech signal, extract features and finally classify the signal and detect articulation problems in a speaker is presented. The use of support vector machines (SVMs), for the classification of speech sounds and detection of articulation disorders is introduced. The proposed method is implemented on a data set where different sets of features and different schemes of SVMs are tested leading to satisfactory performance.

  1. Ensemble support vector machine classification of dementia using structural MRI and mini-mental state examination.

    PubMed

    Sørensen, Lauge; Nielsen, Mads

    2018-05-15

    The International Challenge for Automated Prediction of MCI from MRI data offered independent, standardized comparison of machine learning algorithms for multi-class classification of normal control (NC), mild cognitive impairment (MCI), converting MCI (cMCI), and Alzheimer's disease (AD) using brain imaging and general cognition. We proposed to use an ensemble of support vector machines (SVMs) that combined bagging without replacement and feature selection. SVM is the most commonly used algorithm in multivariate classification of dementia, and it was therefore valuable to evaluate the potential benefit of ensembling this type of classifier. The ensemble SVM, using either a linear or a radial basis function (RBF) kernel, achieved multi-class classification accuracies of 55.6% and 55.0% in the challenge test set (60 NC, 60 MCI, 60 cMCI, 60 AD), resulting in a third place in the challenge. Similar feature subset sizes were obtained for both kernels, and the most frequently selected MRI features were the volumes of the two hippocampal subregions left presubiculum and right subiculum. Post-challenge analysis revealed that enforcing a minimum number of selected features and increasing the number of ensemble classifiers improved classification accuracy up to 59.1%. The ensemble SVM outperformed single SVM classifications consistently in the challenge test set. Ensemble methods using bagging and feature selection can improve the performance of the commonly applied SVM classifier in dementia classification. This resulted in competitive classification accuracies in the International Challenge for Automated Prediction of MCI from MRI data. Copyright © 2018 Elsevier B.V. All rights reserved.

  2. New fuzzy support vector machine for the class imbalance problem in medical datasets classification.

    PubMed

    Gu, Xiaoqing; Ni, Tongguang; Wang, Hongyuan

    2014-01-01

    In medical datasets classification, support vector machine (SVM) is considered to be one of the most successful methods. However, most of the real-world medical datasets usually contain some outliers/noise and data often have class imbalance problems. In this paper, a fuzzy support machine (FSVM) for the class imbalance problem (called FSVM-CIP) is presented, which can be seen as a modified class of FSVM by extending manifold regularization and assigning two misclassification costs for two classes. The proposed FSVM-CIP can be used to handle the class imbalance problem in the presence of outliers/noise, and enhance the locality maximum margin. Five real-world medical datasets, breast, heart, hepatitis, BUPA liver, and pima diabetes, from the UCI medical database are employed to illustrate the method presented in this paper. Experimental results on these datasets show the outperformed or comparable effectiveness of FSVM-CIP.

  3. Using support vector machines with tract-based spatial statistics for automated classification of Tourette syndrome children

    NASA Astrophysics Data System (ADS)

    Wen, Hongwei; Liu, Yue; Wang, Jieqiong; Zhang, Jishui; Peng, Yun; He, Huiguang

    2016-03-01

    Tourette syndrome (TS) is a developmental neuropsychiatric disorder with the cardinal symptoms of motor and vocal tics which emerges in early childhood and fluctuates in severity in later years. To date, the neural basis of TS is not fully understood yet and TS has a long-term prognosis that is difficult to accurately estimate. Few studies have looked at the potential of using diffusion tensor imaging (DTI) in conjunction with machine learning algorithms in order to automate the classification of healthy children and TS children. Here we apply Tract-Based Spatial Statistics (TBSS) method to 44 TS children and 48 age and gender matched healthy children in order to extract the diffusion values from each voxel in the white matter (WM) skeleton, and a feature selection algorithm (ReliefF) was used to select the most salient voxels for subsequent classification with support vector machine (SVM). We use a nested cross validation to yield an unbiased assessment of the classification method and prevent overestimation. The accuracy (88.04%), sensitivity (88.64%) and specificity (87.50%) were achieved in our method as peak performance of the SVM classifier was achieved using the axial diffusion (AD) metric, demonstrating the potential of a joint TBSS and SVM pipeline for fast, objective classification of healthy and TS children. These results support that our methods may be useful for the early identification of subjects with TS, and hold promise for predicting prognosis and treatment outcome for individuals with TS.

  4. Binary classification of aqueous solubility using support vector machines with reduction and recombination feature selection.

    PubMed

    Cheng, Tiejun; Li, Qingliang; Wang, Yanli; Bryant, Stephen H

    2011-02-28

    Aqueous solubility is recognized as a critical parameter in both the early- and late-stage drug discovery. Therefore, in silico modeling of solubility has attracted extensive interests in recent years. Most previous studies have been limited in using relatively small data sets with limited diversity, which in turn limits the predictability of derived models. In this work, we present a support vector machines model for the binary classification of solubility by taking advantage of the largest known public data set that contains over 46 000 compounds with experimental solubility. Our model was optimized in combination with a reduction and recombination feature selection strategy. The best model demonstrated robust performance in both cross-validation and prediction of two independent test sets, indicating it could be a practical tool to select soluble compounds for screening, purchasing, and synthesizing. Moreover, our work may be used for comparative evaluation of solubility classification studies ascribe to the use of completely public resources.

  5. Antepartum fetal heart rate feature extraction and classification using empirical mode decomposition and support vector machine

    PubMed Central

    2011-01-01

    Background Cardiotocography (CTG) is the most widely used tool for fetal surveillance. The visual analysis of fetal heart rate (FHR) traces largely depends on the expertise and experience of the clinician involved. Several approaches have been proposed for the effective interpretation of FHR. In this paper, a new approach for FHR feature extraction based on empirical mode decomposition (EMD) is proposed, which was used along with support vector machine (SVM) for the classification of FHR recordings as 'normal' or 'at risk'. Methods The FHR were recorded from 15 subjects at a sampling rate of 4 Hz and a dataset consisting of 90 randomly selected records of 20 minutes duration was formed from these. All records were labelled as 'normal' or 'at risk' by two experienced obstetricians. A training set was formed by 60 records, the remaining 30 left as the testing set. The standard deviations of the EMD components are input as features to a support vector machine (SVM) to classify FHR samples. Results For the training set, a five-fold cross validation test resulted in an accuracy of 86% whereas the overall geometric mean of sensitivity and specificity was 94.8%. The Kappa value for the training set was .923. Application of the proposed method to the testing set (30 records) resulted in a geometric mean of 81.5%. The Kappa value for the testing set was .684. Conclusions Based on the overall performance of the system it can be stated that the proposed methodology is a promising new approach for the feature extraction and classification of FHR signals. PMID:21244712

  6. Predicting complications of percutaneous coronary intervention using a novel support vector method.

    PubMed

    Lee, Gyemin; Gurm, Hitinder S; Syed, Zeeshan

    2013-01-01

    To explore the feasibility of a novel approach using an augmented one-class learning algorithm to model in-laboratory complications of percutaneous coronary intervention (PCI). Data from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) multicenter registry for the years 2007 and 2008 (n=41 016) were used to train models to predict 13 different in-laboratory PCI complications using a novel one-plus-class support vector machine (OP-SVM) algorithm. The performance of these models in terms of discrimination and calibration was compared to the performance of models trained using the following classification algorithms on BMC2 data from 2009 (n=20 289): logistic regression (LR), one-class support vector machine classification (OC-SVM), and two-class support vector machine classification (TC-SVM). For the OP-SVM and TC-SVM approaches, variants of the algorithms with cost-sensitive weighting were also considered. The OP-SVM algorithm and its cost-sensitive variant achieved the highest area under the receiver operating characteristic curve for the majority of the PCI complications studied (eight cases). Similar improvements were observed for the Hosmer-Lemeshow χ(2) value (seven cases) and the mean cross-entropy error (eight cases). The OP-SVM algorithm based on an augmented one-class learning problem improved discrimination and calibration across different PCI complications relative to LR and traditional support vector machine classification. Such an approach may have value in a broader range of clinical domains.

  7. Towards automatic lithological classification from remote sensing data using support vector machines

    NASA Astrophysics Data System (ADS)

    Yu, Le; Porwal, Alok; Holden, Eun-Jung; Dentith, Michael

    2010-05-01

    Remote sensing data can be effectively used as a mean to build geological knowledge for poorly mapped terrains. Spectral remote sensing data from space- and air-borne sensors have been widely used to geological mapping, especially in areas of high outcrop density in arid regions. However, spectral remote sensing information by itself cannot be efficiently used for a comprehensive lithological classification of an area due to (1) diagnostic spectral response of a rock within an image pixel is conditioned by several factors including the atmospheric effects, spectral and spatial resolution of the image, sub-pixel level heterogeneity in chemical and mineralogical composition of the rock, presence of soil and vegetation cover; (2) only surface information and is therefore highly sensitive to the noise due to weathering, soil cover, and vegetation. Consequently, for efficient lithological classification, spectral remote sensing data needs to be supplemented with other remote sensing datasets that provide geomorphological and subsurface geological information, such as digital topographic model (DEM) and aeromagnetic data. Each of the datasets contain significant information about geology that, in conjunction, can potentially be used for automated lithological classification using supervised machine learning algorithms. In this study, support vector machine (SVM), which is a kernel-based supervised learning method, was applied to automated lithological classification of a study area in northwestern India using remote sensing data, namely, ASTER, DEM and aeromagnetic data. Several digital image processing techniques were used to produce derivative datasets that contained enhanced information relevant to lithological discrimination. A series of SVMs (trained using k-folder cross-validation with grid search) were tested using various combinations of input datasets selected from among 50 datasets including the original 14 ASTER bands and 36 derivative datasets (including 14

  8. Using Support Vector Machine Ensembles for Target Audience Classification on Twitter

    PubMed Central

    Lo, Siaw Ling; Chiong, Raymond; Cornforth, David

    2015-01-01

    The vast amount and diversity of the content shared on social media can pose a challenge for any business wanting to use it to identify potential customers. In this paper, our aim is to investigate the use of both unsupervised and supervised learning methods for target audience classification on Twitter with minimal annotation efforts. Topic domains were automatically discovered from contents shared by followers of an account owner using Twitter Latent Dirichlet Allocation (LDA). A Support Vector Machine (SVM) ensemble was then trained using contents from different account owners of the various topic domains identified by Twitter LDA. Experimental results show that the methods presented are able to successfully identify a target audience with high accuracy. In addition, we show that using a statistical inference approach such as bootstrapping in over-sampling, instead of using random sampling, to construct training datasets can achieve a better classifier in an SVM ensemble. We conclude that such an ensemble system can take advantage of data diversity, which enables real-world applications for differentiating prospective customers from the general audience, leading to business advantage in the crowded social media space. PMID:25874768

  9. Using support vector machine ensembles for target audience classification on Twitter.

    PubMed

    Lo, Siaw Ling; Chiong, Raymond; Cornforth, David

    2015-01-01

    The vast amount and diversity of the content shared on social media can pose a challenge for any business wanting to use it to identify potential customers. In this paper, our aim is to investigate the use of both unsupervised and supervised learning methods for target audience classification on Twitter with minimal annotation efforts. Topic domains were automatically discovered from contents shared by followers of an account owner using Twitter Latent Dirichlet Allocation (LDA). A Support Vector Machine (SVM) ensemble was then trained using contents from different account owners of the various topic domains identified by Twitter LDA. Experimental results show that the methods presented are able to successfully identify a target audience with high accuracy. In addition, we show that using a statistical inference approach such as bootstrapping in over-sampling, instead of using random sampling, to construct training datasets can achieve a better classifier in an SVM ensemble. We conclude that such an ensemble system can take advantage of data diversity, which enables real-world applications for differentiating prospective customers from the general audience, leading to business advantage in the crowded social media space.

  10. Predicting complications of percutaneous coronary intervention using a novel support vector method

    PubMed Central

    Lee, Gyemin; Gurm, Hitinder S; Syed, Zeeshan

    2013-01-01

    Objective To explore the feasibility of a novel approach using an augmented one-class learning algorithm to model in-laboratory complications of percutaneous coronary intervention (PCI). Materials and methods Data from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) multicenter registry for the years 2007 and 2008 (n=41 016) were used to train models to predict 13 different in-laboratory PCI complications using a novel one-plus-class support vector machine (OP-SVM) algorithm. The performance of these models in terms of discrimination and calibration was compared to the performance of models trained using the following classification algorithms on BMC2 data from 2009 (n=20 289): logistic regression (LR), one-class support vector machine classification (OC-SVM), and two-class support vector machine classification (TC-SVM). For the OP-SVM and TC-SVM approaches, variants of the algorithms with cost-sensitive weighting were also considered. Results The OP-SVM algorithm and its cost-sensitive variant achieved the highest area under the receiver operating characteristic curve for the majority of the PCI complications studied (eight cases). Similar improvements were observed for the Hosmer–Lemeshow χ2 value (seven cases) and the mean cross-entropy error (eight cases). Conclusions The OP-SVM algorithm based on an augmented one-class learning problem improved discrimination and calibration across different PCI complications relative to LR and traditional support vector machine classification. Such an approach may have value in a broader range of clinical domains. PMID:23599229

  11. Classification of diesel pool refinery streams through near infrared spectroscopy and support vector machines using C-SVC and ν-SVC.

    PubMed

    Alves, Julio Cesar L; Henriques, Claudete B; Poppi, Ronei J

    2014-01-03

    The use of near infrared (NIR) spectroscopy combined with chemometric methods have been widely used in petroleum and petrochemical industry and provides suitable methods for process control and quality control. The algorithm support vector machines (SVM) has demonstrated to be a powerful chemometric tool for development of classification models due to its ability to nonlinear modeling and with high generalization capability and these characteristics can be especially important for treating near infrared (NIR) spectroscopy data of complex mixtures such as petroleum refinery streams. In this work, a study on the performance of the support vector machines algorithm for classification was carried out, using C-SVC and ν-SVC, applied to near infrared (NIR) spectroscopy data of different types of streams that make up the diesel pool in a petroleum refinery: light gas oil, heavy gas oil, hydrotreated diesel, kerosene, heavy naphtha and external diesel. In addition to these six streams, the diesel final blend produced in the refinery was added to complete the data set. C-SVC and ν-SVC classification models with 2, 4, 6 and 7 classes were developed for comparison between its results and also for comparison with the soft independent modeling of class analogy (SIMCA) models results. It is demonstrated the superior performance of SVC models especially using ν-SVC for development of classification models for 6 and 7 classes leading to an improvement of sensitivity on validation sample sets of 24% and 15%, respectively, when compared to SIMCA models, providing better identification of chemical compositions of different diesel pool refinery streams. Copyright © 2013 Elsevier B.V. All rights reserved.

  12. Ambulatory activity classification with dendogram-based support vector machine: Application in lower-limb active exoskeleton.

    PubMed

    Mazumder, Oishee; Kundu, Ananda Sankar; Lenka, Prasanna Kumar; Bhaumik, Subhasis

    2016-10-01

    Ambulatory activity classification is an active area of research for controlling and monitoring state initiation, termination, and transition in mobility assistive devices such as lower-limb exoskeletons. State transition of lower-limb exoskeletons reported thus far are achieved mostly through the use of manual switches or state machine-based logic. In this paper, we propose a postural activity classifier using a 'dendogram-based support vector machine' (DSVM) which can be used to control a lower-limb exoskeleton. A pressure sensor-based wearable insole and two six-axis inertial measurement units (IMU) have been used for recognising two static and seven dynamic postural activities: sit, stand, and sit-to-stand, stand-to-sit, level walk, fast walk, slope walk, stair ascent and stair descent. Most of the ambulatory activities are periodic in nature and have unique patterns of response. The proposed classification algorithm involves the recognition of activity patterns on the basis of the periodic shape of trajectories. Polynomial coefficients extracted from the hip angle trajectory and the centre-of-pressure (CoP) trajectory during an activity cycle are used as features to classify dynamic activities. The novelty of this paper lies in finding suitable instrumentation, developing post-processing techniques, and selecting shape-based features for ambulatory activity classification. The proposed activity classifier is used to identify the activity states of a lower-limb exoskeleton. The DSVM classifier algorithm achieved an overall classification accuracy of 95.2%. Copyright © 2016 Elsevier B.V. All rights reserved.

  13. A support vector machine for spectral classification of emission-line galaxies from the Sloan Digital Sky Survey

    NASA Astrophysics Data System (ADS)

    Shi, Fei; Liu, Yu-Yan; Sun, Guang-Lan; Li, Pei-Yu; Lei, Yu-Ming; Wang, Jian

    2015-10-01

    The emission-lines of galaxies originate from massive young stars or supermassive blackholes. As a result, spectral classification of emission-line galaxies into star-forming galaxies, active galactic nucleus (AGN) hosts, or compositions of both relates closely to formation and evolution of galaxy. To find efficient and automatic spectral classification method, especially in large surveys and huge data bases, a support vector machine (SVM) supervised learning algorithm is applied to a sample of emission-line galaxies from the Sloan Digital Sky Survey (SDSS) data release 9 (DR9) provided by the Max Planck Institute and the Johns Hopkins University (MPA/JHU). A two-step approach is adopted. (i) The SVM must be trained with a subset of objects that are known to be AGN hosts, composites or star-forming galaxies, treating the strong emission-line flux measurements as input feature vectors in an n-dimensional space, where n is the number of strong emission-line flux ratios. (ii) After training on a sample of emission-line galaxies, the remaining galaxies are automatically classified. In the classification process, we use a 10-fold cross-validation technique. We show that the classification diagrams based on the [N II]/Hα versus other emission-line ratio, such as [O III]/Hβ, [Ne III]/[O II], ([O III]λ4959+[O III]λ5007)/[O III]λ4363, [O II]/Hβ, [Ar III]/[O III], [S II]/Hα, and [O I]/Hα, plus colour, allows us to separate unambiguously AGN hosts, composites or star-forming galaxies. Among them, the diagram of [N II]/Hα versus [O III]/Hβ achieved an accuracy of 99 per cent to separate the three classes of objects. The other diagrams above give an accuracy of ˜91 per cent.

  14. Signal detection using support vector machines in the presence of ultrasonic speckle

    NASA Astrophysics Data System (ADS)

    Kotropoulos, Constantine L.; Pitas, Ioannis

    2002-04-01

    Support Vector Machines are a general algorithm based on guaranteed risk bounds of statistical learning theory. They have found numerous applications, such as in classification of brain PET images, optical character recognition, object detection, face verification, text categorization and so on. In this paper we propose the use of support vector machines to segment lesions in ultrasound images and we assess thoroughly their lesion detection ability. We demonstrate that trained support vector machines with a Radial Basis Function kernel segment satisfactorily (unseen) ultrasound B-mode images as well as clinical ultrasonic images.

  15. Support vector machine for automatic pain recognition

    NASA Astrophysics Data System (ADS)

    Monwar, Md Maruf; Rezaei, Siamak

    2009-02-01

    Facial expressions are a key index of emotion and the interpretation of such expressions of emotion is critical to everyday social functioning. In this paper, we present an efficient video analysis technique for recognition of a specific expression, pain, from human faces. We employ an automatic face detector which detects face from the stored video frame using skin color modeling technique. For pain recognition, location and shape features of the detected faces are computed. These features are then used as inputs to a support vector machine (SVM) for classification. We compare the results with neural network based and eigenimage based automatic pain recognition systems. The experiment results indicate that using support vector machine as classifier can certainly improve the performance of automatic pain recognition system.

  16. Prediction of B-cell linear epitopes with a combination of support vector machine classification and amino acid propensity identification.

    PubMed

    Wang, Hsin-Wei; Lin, Ya-Chi; Pai, Tun-Wen; Chang, Hao-Teng

    2011-01-01

    Epitopes are antigenic determinants that are useful because they induce B-cell antibody production and stimulate T-cell activation. Bioinformatics can enable rapid, efficient prediction of potential epitopes. Here, we designed a novel B-cell linear epitope prediction system called LEPS, Linear Epitope Prediction by Propensities and Support Vector Machine, that combined physico-chemical propensity identification and support vector machine (SVM) classification. We tested the LEPS on four datasets: AntiJen, HIV, a newly generated PC, and AHP, a combination of these three datasets. Peptides with globally or locally high physicochemical propensities were first identified as primitive linear epitope (LE) candidates. Then, candidates were classified with the SVM based on the unique features of amino acid segments. This reduced the number of predicted epitopes and enhanced the positive prediction value (PPV). Compared to four other well-known LE prediction systems, the LEPS achieved the highest accuracy (72.52%), specificity (84.22%), PPV (32.07%), and Matthews' correlation coefficient (10.36%).

  17. Weighted K-means support vector machine for cancer prediction.

    PubMed

    Kim, SungHwan

    2016-01-01

    To date, the support vector machine (SVM) has been widely applied to diverse bio-medical fields to address disease subtype identification and pathogenicity of genetic variants. In this paper, I propose the weighted K-means support vector machine (wKM-SVM) and weighted support vector machine (wSVM), for which I allow the SVM to impose weights to the loss term. Besides, I demonstrate the numerical relations between the objective function of the SVM and weights. Motivated by general ensemble techniques, which are known to improve accuracy, I directly adopt the boosting algorithm to the newly proposed weighted KM-SVM (and wSVM). For predictive performance, a range of simulation studies demonstrate that the weighted KM-SVM (and wSVM) with boosting outperforms the standard KM-SVM (and SVM) including but not limited to many popular classification rules. I applied the proposed methods to simulated data and two large-scale real applications in the TCGA pan-cancer methylation data of breast and kidney cancer. In conclusion, the weighted KM-SVM (and wSVM) increases accuracy of the classification model, and will facilitate disease diagnosis and clinical treatment decisions to benefit patients. A software package (wSVM) is publicly available at the R-project webpage (https://www.r-project.org).

  18. Classification of toxicity effects of biotransformed hepatic drugs using whale optimized support vector machines.

    PubMed

    Tharwat, Alaa; Moemen, Yasmine S; Hassanien, Aboul Ella

    2017-04-01

    Measuring toxicity is an important step in drug development. Nevertheless, the current experimental methods used to estimate the drug toxicity are expensive and time-consuming, indicating that they are not suitable for large-scale evaluation of drug toxicity in the early stage of drug development. Hence, there is a high demand to develop computational models that can predict the drug toxicity risks. In this study, we used a dataset that consists of 553 drugs that biotransformed in liver. The toxic effects were calculated for the current data, namely, mutagenic, tumorigenic, irritant and reproductive effect. Each drug is represented by 31 chemical descriptors (features). The proposed model consists of three phases. In the first phase, the most discriminative subset of features is selected using rough set-based methods to reduce the classification time while improving the classification performance. In the second phase, different sampling methods such as Random Under-Sampling, Random Over-Sampling and Synthetic Minority Oversampling Technique (SMOTE), BorderLine SMOTE and Safe Level SMOTE are used to solve the problem of imbalanced dataset. In the third phase, the Support Vector Machines (SVM) classifier is used to classify an unknown drug into toxic or non-toxic. SVM parameters such as the penalty parameter and kernel parameter have a great impact on the classification accuracy of the model. In this paper, Whale Optimization Algorithm (WOA) has been proposed to optimize the parameters of SVM, so that the classification error can be reduced. The experimental results proved that the proposed model achieved high sensitivity to all toxic effects. Overall, the high sensitivity of the WOA+SVM model indicates that it could be used for the prediction of drug toxicity in the early stage of drug development. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. Support Vector Machine Classification of Major Depressive Disorder Using Diffusion-Weighted Neuroimaging and Graph Theory

    PubMed Central

    Sacchet, Matthew D.; Prasad, Gautam; Foland-Ross, Lara C.; Thompson, Paul M.; Gotlib, Ian H.

    2015-01-01

    Recently, there has been considerable interest in understanding brain networks in major depressive disorder (MDD). Neural pathways can be tracked in the living brain using diffusion-weighted imaging (DWI); graph theory can then be used to study properties of the resulting fiber networks. To date, global abnormalities have not been reported in tractography-based graph metrics in MDD, so we used a machine learning approach based on “support vector machines” to differentiate depressed from healthy individuals based on multiple brain network properties. We also assessed how important specific graph metrics were for this differentiation. Finally, we conducted a local graph analysis to identify abnormal connectivity at specific nodes of the network. We were able to classify depression using whole-brain graph metrics. Small-worldness was the most useful graph metric for classification. The right pars orbitalis, right inferior parietal cortex, and left rostral anterior cingulate all showed abnormal network connectivity in MDD. This is the first use of structural global graph metrics to classify depressed individuals. These findings highlight the importance of future research to understand network properties in depression across imaging modalities, improve classification results, and relate network alterations to psychiatric symptoms, medication, and comorbidities. PMID:25762941

  20. Support vector machine classification of major depressive disorder using diffusion-weighted neuroimaging and graph theory.

    PubMed

    Sacchet, Matthew D; Prasad, Gautam; Foland-Ross, Lara C; Thompson, Paul M; Gotlib, Ian H

    2015-01-01

    Recently, there has been considerable interest in understanding brain networks in major depressive disorder (MDD). Neural pathways can be tracked in the living brain using diffusion-weighted imaging (DWI); graph theory can then be used to study properties of the resulting fiber networks. To date, global abnormalities have not been reported in tractography-based graph metrics in MDD, so we used a machine learning approach based on "support vector machines" to differentiate depressed from healthy individuals based on multiple brain network properties. We also assessed how important specific graph metrics were for this differentiation. Finally, we conducted a local graph analysis to identify abnormal connectivity at specific nodes of the network. We were able to classify depression using whole-brain graph metrics. Small-worldness was the most useful graph metric for classification. The right pars orbitalis, right inferior parietal cortex, and left rostral anterior cingulate all showed abnormal network connectivity in MDD. This is the first use of structural global graph metrics to classify depressed individuals. These findings highlight the importance of future research to understand network properties in depression across imaging modalities, improve classification results, and relate network alterations to psychiatric symptoms, medication, and comorbidities.

  1. Recursive feature selection with significant variables of support vectors.

    PubMed

    Tsai, Chen-An; Huang, Chien-Hsun; Chang, Ching-Wei; Chen, Chun-Houh

    2012-01-01

    The development of DNA microarray makes researchers screen thousands of genes simultaneously and it also helps determine high- and low-expression level genes in normal and disease tissues. Selecting relevant genes for cancer classification is an important issue. Most of the gene selection methods use univariate ranking criteria and arbitrarily choose a threshold to choose genes. However, the parameter setting may not be compatible to the selected classification algorithms. In this paper, we propose a new gene selection method (SVM-t) based on the use of t-statistics embedded in support vector machine. We compared the performance to two similar SVM-based methods: SVM recursive feature elimination (SVMRFE) and recursive support vector machine (RSVM). The three methods were compared based on extensive simulation experiments and analyses of two published microarray datasets. In the simulation experiments, we found that the proposed method is more robust in selecting informative genes than SVMRFE and RSVM and capable to attain good classification performance when the variations of informative and noninformative genes are different. In the analysis of two microarray datasets, the proposed method yields better performance in identifying fewer genes with good prediction accuracy, compared to SVMRFE and RSVM.

  2. Extraction of inland Nypa fruticans (Nipa Palm) using Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Alberto, R. T.; Serrano, S. C.; Damian, G. B.; Camaso, E. E.; Biagtan, A. R.; Panuyas, N. Z.; Quibuyen, J. S.

    2017-09-01

    Mangroves are considered as one of the major habitats in coastal ecosystem, providing a lot of economic and ecological services in human society. Nypa fruticans (Nipa palm) is one of the important species of mangroves because of its versatility and uniqueness as halophytic palm. However, nipas are not only adaptable in saline areas, they can also managed to thrive away from the coastline depending on the favorable soil types available in the area. Because of this, mapping of this species are not limited alone in the near shore areas, but in areas where this species are present as well. The extraction process of Nypa fruticans were carried out using the available LiDAR data. Support Vector Machine (SVM) classification process was used to extract nipas in inland areas. The SVM classification process in mapping Nypa fruticans produced high accuracy of 95+%. The Support Vector Machine classification process to extract inland nipas was proven to be effective by utilizing different terrain derivatives from LiDAR data.

  3. Applications of Support Vector Machines In Chemo And Bioinformatics

    NASA Astrophysics Data System (ADS)

    Jayaraman, V. K.; Sundararajan, V.

    2010-10-01

    Conventional linear & nonlinear tools for classification, regression & data driven modeling are being replaced on a rapid scale by newer techniques & tools based on artificial intelligence and machine learning. While the linear techniques are not applicable for inherently nonlinear problems, newer methods serve as attractive alternatives for solving real life problems. Support Vector Machine (SVM) classifiers are a set of universal feed-forward network based classification algorithms that have been formulated from statistical learning theory and structural risk minimization principle. SVM regression closely follows the classification methodology. In this work recent applications of SVM in Chemo & Bioinformatics will be described with suitable illustrative examples.

  4. Feature selection using a one dimensional naïve Bayes' classifier increases the accuracy of support vector machine classification of CDR3 repertoires.

    PubMed

    Cinelli, Mattia; Sun, Yuxin; Best, Katharine; Heather, James M; Reich-Zeliger, Shlomit; Shifrut, Eric; Friedman, Nir; Shawe-Taylor, John; Chain, Benny

    2017-04-01

    Somatic DNA recombination, the hallmark of vertebrate adaptive immunity, has the potential to generate a vast diversity of antigen receptor sequences. How this diversity captures antigen specificity remains incompletely understood. In this study we use high throughput sequencing to compare the global changes in T cell receptor β chain complementarity determining region 3 (CDR3β) sequences following immunization with ovalbumin administered with complete Freund's adjuvant (CFA) or CFA alone. The CDR3β sequences were deconstructed into short stretches of overlapping contiguous amino acids. The motifs were ranked according to a one-dimensional Bayesian classifier score comparing their frequency in the repertoires of the two immunization classes. The top ranking motifs were selected and used to create feature vectors which were used to train a support vector machine. The support vector machine achieved high classification scores in a leave-one-out validation test reaching >90% in some cases. The study describes a novel two-stage classification strategy combining a one-dimensional Bayesian classifier with a support vector machine. Using this approach we demonstrate that the frequency of a small number of linear motifs three amino acids in length can accurately identify a CD4 T cell response to ovalbumin against a background response to the complex mixture of antigens which characterize Complete Freund's Adjuvant. The sequence data is available at www.ncbi.nlm.nih.gov/sra/?term¼SRP075893 . The Decombinator package is available at github.com/innate2adaptive/Decombinator . The R package e1071 is available at the CRAN repository https://cran.r-project.org/web/packages/e1071/index.html . b.chain@ucl.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  5. Feature selection using a one dimensional naïve Bayes’ classifier increases the accuracy of support vector machine classification of CDR3 repertoires

    PubMed Central

    Cinelli, Mattia; Sun, , Yuxin; Best, Katharine; Heather, James M.; Reich-Zeliger, Shlomit; Shifrut, Eric; Friedman, Nir; Shawe-Taylor, John; Chain, Benny

    2017-01-01

    Abstract Motivation: Somatic DNA recombination, the hallmark of vertebrate adaptive immunity, has the potential to generate a vast diversity of antigen receptor sequences. How this diversity captures antigen specificity remains incompletely understood. In this study we use high throughput sequencing to compare the global changes in T cell receptor β chain complementarity determining region 3 (CDR3β) sequences following immunization with ovalbumin administered with complete Freund’s adjuvant (CFA) or CFA alone. Results: The CDR3β sequences were deconstructed into short stretches of overlapping contiguous amino acids. The motifs were ranked according to a one-dimensional Bayesian classifier score comparing their frequency in the repertoires of the two immunization classes. The top ranking motifs were selected and used to create feature vectors which were used to train a support vector machine. The support vector machine achieved high classification scores in a leave-one-out validation test reaching >90% in some cases. Summary: The study describes a novel two-stage classification strategy combining a one-dimensional Bayesian classifier with a support vector machine. Using this approach we demonstrate that the frequency of a small number of linear motifs three amino acids in length can accurately identify a CD4 T cell response to ovalbumin against a background response to the complex mixture of antigens which characterize Complete Freund’s Adjuvant. Availability and implementation: The sequence data is available at www.ncbi.nlm.nih.gov/sra/?term¼SRP075893. The Decombinator package is available at github.com/innate2adaptive/Decombinator. The R package e1071 is available at the CRAN repository https://cran.r-project.org/web/packages/e1071/index.html. Contact: b.chain@ucl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28073756

  6. Interpreting support vector machine models for multivariate group wise analysis in neuroimaging

    PubMed Central

    Gaonkar, Bilwaj; Shinohara, Russell T; Davatzikos, Christos

    2015-01-01

    Machine learning based classification algorithms like support vector machines (SVMs) have shown great promise for turning a high dimensional neuroimaging data into clinically useful decision criteria. However, tracing imaging based patterns that contribute significantly to classifier decisions remains an open problem. This is an issue of critical importance in imaging studies seeking to determine which anatomical or physiological imaging features contribute to the classifier’s decision, thereby allowing users to critically evaluate the findings of such machine learning methods and to understand disease mechanisms. The majority of published work addresses the question of statistical inference for support vector classification using permutation tests based on SVM weight vectors. Such permutation testing ignores the SVM margin, which is critical in SVM theory. In this work we emphasize the use of a statistic that explicitly accounts for the SVM margin and show that the null distributions associated with this statistic are asymptotically normal. Further, our experiments show that this statistic is a lot less conservative as compared to weight based permutation tests and yet specific enough to tease out multivariate patterns in the data. Thus, we can better understand the multivariate patterns that the SVM uses for neuroimaging based classification. PMID:26210913

  7. Comparative evaluation of support vector machine classification for computer aided detection of breast masses in mammography

    NASA Astrophysics Data System (ADS)

    Lesniak, J. M.; Hupse, R.; Blanc, R.; Karssemeijer, N.; Székely, G.

    2012-08-01

    False positive (FP) marks represent an obstacle for effective use of computer-aided detection (CADe) of breast masses in mammography. Typically, the problem can be approached either by developing more discriminative features or by employing different classifier designs. In this paper, the usage of support vector machine (SVM) classification for FP reduction in CADe is investigated, presenting a systematic quantitative evaluation against neural networks, k-nearest neighbor classification, linear discriminant analysis and random forests. A large database of 2516 film mammography examinations and 73 input features was used to train the classifiers and evaluate for their performance on correctly diagnosed exams as well as false negatives. Further, classifier robustness was investigated using varying training data and feature sets as input. The evaluation was based on the mean exam sensitivity in 0.05-1 FPs on normals on the free-response receiver operating characteristic curve (FROC), incorporated into a tenfold cross validation framework. It was found that SVM classification using a Gaussian kernel offered significantly increased detection performance (P = 0.0002) compared to the reference methods. Varying training data and input features, SVMs showed improved exploitation of large feature sets. It is concluded that with the SVM-based CADe a significant reduction of FPs is possible outperforming other state-of-the-art approaches for breast mass CADe.

  8. Implementation of support vector machine for classification of speech marked hijaiyah letters based on Mel frequency cepstrum coefficient feature extraction

    NASA Astrophysics Data System (ADS)

    Adhi Pradana, Wisnu; Adiwijaya; Novia Wisesty, Untari

    2018-03-01

    Support Vector Machine or commonly called SVM is one method that can be used to process the classification of a data. SVM classifies data from 2 different classes with hyperplane. In this study, the system was built using SVM to develop Arabic Speech Recognition. In the development of the system, there are 2 kinds of speakers that have been tested that is dependent speakers and independent speakers. The results from this system is an accuracy of 85.32% for speaker dependent and 61.16% for independent speakers.

  9. Quad-polarized synthetic aperture radar and multispectral data classification using classification and regression tree and support vector machine-based data fusion system

    NASA Astrophysics Data System (ADS)

    Bigdeli, Behnaz; Pahlavani, Parham

    2017-01-01

    Interpretation of synthetic aperture radar (SAR) data processing is difficult because the geometry and spectral range of SAR are different from optical imagery. Consequently, SAR imaging can be a complementary data to multispectral (MS) optical remote sensing techniques because it does not depend on solar illumination and weather conditions. This study presents a multisensor fusion of SAR and MS data based on the use of classification and regression tree (CART) and support vector machine (SVM) through a decision fusion system. First, different feature extraction strategies were applied on SAR and MS data to produce more spectral and textural information. To overcome the redundancy and correlation between features, an intrinsic dimension estimation method based on noise-whitened Harsanyi, Farrand, and Chang determines the proper dimension of the features. Then, principal component analysis and independent component analysis were utilized on stacked feature space of two data. Afterward, SVM and CART classified each reduced feature space. Finally, a fusion strategy was utilized to fuse the classification results. To show the effectiveness of the proposed methodology, single classification on each data was compared to the obtained results. A coregistered Radarsat-2 and WorldView-2 data set from San Francisco, USA, was available to examine the effectiveness of the proposed method. The results show that combinations of SAR data with optical sensor based on the proposed methodology improve the classification results for most of the classes. The proposed fusion method provided approximately 93.24% and 95.44% for two different areas of the data.

  10. [MicroRNA Target Prediction Based on Support Vector Machine Ensemble Classification Algorithm of Under-sampling Technique].

    PubMed

    Chen, Zhiru; Hong, Wenxue

    2016-02-01

    Considering the low accuracy of prediction in the positive samples and poor overall classification effects caused by unbalanced sample data of MicroRNA (miRNA) target, we proposes a support vector machine (SVM)-integration of under-sampling and weight (IUSM) algorithm in this paper, an under-sampling based on the ensemble learning algorithm. The algorithm adopts SVM as learning algorithm and AdaBoost as integration framework, and embeds clustering-based under-sampling into the iterative process, aiming at reducing the degree of unbalanced distribution of positive and negative samples. Meanwhile, in the process of adaptive weight adjustment of the samples, the SVM-IUSM algorithm eliminates the abnormal ones in negative samples with robust sample weights smoothing mechanism so as to avoid over-learning. Finally, the prediction of miRNA target integrated classifier is achieved with the combination of multiple weak classifiers through the voting mechanism. The experiment revealed that the SVM-IUSW, compared with other algorithms on unbalanced dataset collection, could not only improve the accuracy of positive targets and the overall effect of classification, but also enhance the generalization ability of miRNA target classifier.

  11. Multi-phase classification by a least-squares support vector machine approach in tomography images of geological samples

    NASA Astrophysics Data System (ADS)

    Khan, Faisal; Enzmann, Frieder; Kersten, Michael

    2016-03-01

    Image processing of X-ray-computed polychromatic cone-beam micro-tomography (μXCT) data of geological samples mainly involves artefact reduction and phase segmentation. For the former, the main beam-hardening (BH) artefact is removed by applying a best-fit quadratic surface algorithm to a given image data set (reconstructed slice), which minimizes the BH offsets of the attenuation data points from that surface. A Matlab code for this approach is provided in the Appendix. The final BH-corrected image is extracted from the residual data or from the difference between the surface elevation values and the original grey-scale values. For the segmentation, we propose a novel least-squares support vector machine (LS-SVM, an algorithm for pixel-based multi-phase classification) approach. A receiver operating characteristic (ROC) analysis was performed on BH-corrected and uncorrected samples to show that BH correction is in fact an important prerequisite for accurate multi-phase classification. The combination of the two approaches was thus used to classify successfully three different more or less complex multi-phase rock core samples.

  12. Multivariate neuroanatomical classification of cognitive subtypes in schizophrenia: A support vector machine learning approach

    PubMed Central

    Gould, Ian C.; Shepherd, Alana M.; Laurens, Kristin R.; Cairns, Murray J.; Carr, Vaughan J.; Green, Melissa J.

    2014-01-01

    Heterogeneity in the structural brain abnormalities associated with schizophrenia has made identification of reliable neuroanatomical markers of the disease difficult. The use of more homogenous clinical phenotypes may improve the accuracy of predicting psychotic disorder/s on the basis of observable brain disturbances. Here we investigate the utility of cognitive subtypes of schizophrenia – ‘cognitive deficit’ and ‘cognitively spared’ – in determining whether multivariate patterns of volumetric brain differences can accurately discriminate these clinical subtypes from healthy controls, and from each other. We applied support vector machine classification to grey- and white-matter volume data from 126 schizophrenia patients previously allocated to the cognitive spared subtype, 74 cognitive deficit schizophrenia patients, and 134 healthy controls. Using this method, cognitive subtypes were distinguished from healthy controls with up to 72% accuracy. Cross-validation analyses between subtypes achieved an accuracy of 71%, suggesting that some common neuroanatomical patterns distinguish both subtypes from healthy controls. Notably, cognitive subtypes were best distinguished from one another when the sample was stratified by sex prior to classification analysis: cognitive subtype classification accuracy was relatively low (<60%) without stratification, and increased to 83% for females with sex stratification. Distinct neuroanatomical patterns predicted cognitive subtype status in each sex: sex-specific multivariate patterns did not predict cognitive subtype status in the other sex above chance, and weight map analyses demonstrated negative correlations between the spatial patterns of weights underlying classification for each sex. These results suggest that in typical mixed-sex samples of schizophrenia patients, the volumetric brain differences between cognitive subtypes are relatively minor in contrast to the large common disease-associated changes

  13. A Discriminant Distance Based Composite Vector Selection Method for Odor Classification

    PubMed Central

    Choi, Sang-Il; Jeong, Gu-Min

    2014-01-01

    We present a composite vector selection method for an effective electronic nose system that performs well even in noisy environments. Each composite vector generated from a electronic nose data sample is evaluated by computing the discriminant distance. By quantitatively measuring the amount of discriminative information in each composite vector, composite vectors containing informative variables can be distinguished and the final composite features for odor classification are extracted using the selected composite vectors. Using the only informative composite vectors can be also helpful to extract better composite features instead of using all the generated composite vectors. Experimental results with different volatile organic compound data show that the proposed system has good classification performance even in a noisy environment compared to other methods. PMID:24747735

  14. Support vector inductive logic programming outperforms the naive Bayes classifier and inductive logic programming for the classification of bioactive chemical compounds.

    PubMed

    Cannon, Edward O; Amini, Ata; Bender, Andreas; Sternberg, Michael J E; Muggleton, Stephen H; Glen, Robert C; Mitchell, John B O

    2007-05-01

    We investigate the classification performance of circular fingerprints in combination with the Naive Bayes Classifier (MP2D), Inductive Logic Programming (ILP) and Support Vector Inductive Logic Programming (SVILP) on a standard molecular benchmark dataset comprising 11 activity classes and about 102,000 structures. The Naive Bayes Classifier treats features independently while ILP combines structural fragments, and then creates new features with higher predictive power. SVILP is a very recently presented method which adds a support vector machine after common ILP procedures. The performance of the methods is evaluated via a number of statistical measures, namely recall, specificity, precision, F-measure, Matthews Correlation Coefficient, area under the Receiver Operating Characteristic (ROC) curve and enrichment factor (EF). According to the F-measure, which takes both recall and precision into account, SVILP is for seven out of the 11 classes the superior method. The results show that the Bayes Classifier gives the best recall performance for eight of the 11 targets, but has a much lower precision, specificity and F-measure. The SVILP model on the other hand has the highest recall for only three of the 11 classes, but generally far superior specificity and precision. To evaluate the statistical significance of the SVILP superiority, we employ McNemar's test which shows that SVILP performs significantly (p < 5%) better than both other methods for six out of 11 activity classes, while being superior with less significance for three of the remaining classes. While previously the Bayes Classifier was shown to perform very well in molecular classification studies, these results suggest that SVILP is able to extract additional knowledge from the data, thus improving classification results further.

  15. Comparison of support vector machine classification to partial least squares dimension reduction with logistic descrimination of hyperspectral data

    NASA Astrophysics Data System (ADS)

    Wilson, Machelle; Ustin, Susan L.; Rocke, David

    2003-03-01

    Remote sensing technologies with high spatial and spectral resolution show a great deal of promise in addressing critical environmental monitoring issues, but the ability to analyze and interpret the data lags behind the technology. Robust analytical methods are required before the wealth of data available through remote sensing can be applied to a wide range of environmental problems for which remote detection is the best method. In this study we compare the classification effectiveness of two relatively new techniques on data consisting of leaf-level reflectance from plants that have been exposed to varying levels of heavy metal toxicity. If these methodologies work well on leaf-level data, then there is some hope that they will also work well on data from airborne and space-borne platforms. The classification methods compared were support vector machine classification of exposed and non-exposed plants based on the reflectance data, and partial east squares compression of the reflectance data followed by classification using logistic discrimination (PLS/LD). PLS/LD was performed in two ways. We used the continuous concentration data as the response during compression, and then used the binary response required during logistic discrimination. We also used a binary response during compression followed by logistic discrimination. The statistics we used to compare the effectiveness of the methodologies was the leave-one-out cross validation estimate of the prediction error.

  16. Optimization of breast mass classification using sequential forward floating selection (SFFS) and a support vector machine (SVM) model

    PubMed Central

    Tan, Maxine; Pu, Jiantao; Zheng, Bin

    2014-01-01

    Purpose: Improving radiologists’ performance in classification between malignant and benign breast lesions is important to increase cancer detection sensitivity and reduce false-positive recalls. For this purpose, developing computer-aided diagnosis (CAD) schemes has been attracting research interest in recent years. In this study, we investigated a new feature selection method for the task of breast mass classification. Methods: We initially computed 181 image features based on mass shape, spiculation, contrast, presence of fat or calcifications, texture, isodensity, and other morphological features. From this large image feature pool, we used a sequential forward floating selection (SFFS)-based feature selection method to select relevant features, and analyzed their performance using a support vector machine (SVM) model trained for the classification task. On a database of 600 benign and 600 malignant mass regions of interest (ROIs), we performed the study using a ten-fold cross-validation method. Feature selection and optimization of the SVM parameters were conducted on the training subsets only. Results: The area under the receiver operating characteristic curve (AUC) = 0.805±0.012 was obtained for the classification task. The results also showed that the most frequently-selected features by the SFFS-based algorithm in 10-fold iterations were those related to mass shape, isodensity and presence of fat, which are consistent with the image features frequently used by radiologists in the clinical environment for mass classification. The study also indicated that accurately computing mass spiculation features from the projection mammograms was difficult, and failed to perform well for the mass classification task due to tissue overlap within the benign mass regions. Conclusions: In conclusion, this comprehensive feature analysis study provided new and valuable information for optimizing computerized mass classification schemes that may have potential to be

  17. Hybrid three-dimensional and support vector machine approach for automatic vehicle tracking and classification using a single camera

    NASA Astrophysics Data System (ADS)

    Kachach, Redouane; Cañas, José María

    2016-05-01

    Using video in traffic monitoring is one of the most active research domains in the computer vision community. TrafficMonitor, a system that employs a hybrid approach for automatic vehicle tracking and classification on highways using a simple stationary calibrated camera, is presented. The proposed system consists of three modules: vehicle detection, vehicle tracking, and vehicle classification. Moving vehicles are detected by an enhanced Gaussian mixture model background estimation algorithm. The design includes a technique to resolve the occlusion problem by using a combination of two-dimensional proximity tracking algorithm and the Kanade-Lucas-Tomasi feature tracking algorithm. The last module classifies the shapes identified into five vehicle categories: motorcycle, car, van, bus, and truck by using three-dimensional templates and an algorithm based on histogram of oriented gradients and the support vector machine classifier. Several experiments have been performed using both real and simulated traffic in order to validate the system. The experiments were conducted on GRAM-RTM dataset and a proper real video dataset which is made publicly available as part of this work.

  18. Fusion of multiple quadratic penalty function support vector machines (QPFSVM) for automated sea mine detection and classification

    NASA Astrophysics Data System (ADS)

    Dobeck, Gerald J.; Cobb, J. Tory

    2002-08-01

    The high-resolution sonar is one of the principal sensors used by the Navy to detect and classify sea mines in minehunting operations. For such sonar systems, substantial effort has been devoted to the development of automated detection and classification (D/C) algorithms. These have been spurred by several factors including (1) aids for operators to reduce work overload, (2) more optimal use of all available data, and (3) the introduction of unmanned minehunting systems. The environments where sea mines are typically laid (harbor areas, shipping lanes, and the littorals) give rise to many false alarms caused by natural, biologic, and man-made clutter. The objective of the automated D/C algorithms is to eliminate most of these false alarms while still maintaining a very high probability of mine detection and classification (PdPc). In recent years, the benefits of fusing the outputs of multiple D/C algorithms have been studied. We refer to this as Algorithm Fusion. The results have been remarkable, including reliable robustness to new environments. The Quadratic Penalty Function Support Vector Machine (QPFSVM) algorithm to aid in the automated detection and classification of sea mines is introduced in this paper. The QPFSVM algorithm is easy to train, simple to implement, and robust to feature space dimension. Outputs of successive SVM algorithms are cascaded in stages (fused) to improve the Probability of Classification (Pc) and reduce the number of false alarms. Even though our experience has been gained in the area of sea mine detection and classification, the principles described herein are general and can be applied to fusion of any D/C problem (e.g., automated medical diagnosis or automatic target recognition for ballistic missile defense).

  19. Optimizing support vector machine learning for semi-arid vegetation mapping by using clustering analysis

    NASA Astrophysics Data System (ADS)

    Su, Lihong

    In remote sensing communities, support vector machine (SVM) learning has recently received increasing attention. SVM learning usually requires large memory and enormous amounts of computation time on large training sets. According to SVM algorithms, the SVM classification decision function is fully determined by support vectors, which compose a subset of the training sets. In this regard, a solution to optimize SVM learning is to efficiently reduce training sets. In this paper, a data reduction method based on agglomerative hierarchical clustering is proposed to obtain smaller training sets for SVM learning. Using a multiple angle remote sensing dataset of a semi-arid region, the effectiveness of the proposed method is evaluated by classification experiments with a series of reduced training sets. The experiments show that there is no loss of SVM accuracy when the original training set is reduced to 34% using the proposed approach. Maximum likelihood classification (MLC) also is applied on the reduced training sets. The results show that MLC can also maintain the classification accuracy. This implies that the most informative data instances can be retained by this approach.

  20. HMM for hyperspectral spectrum representation and classification with endmember entropy vectors

    NASA Astrophysics Data System (ADS)

    Arabi, Samir Y. W.; Fernandes, David; Pizarro, Marco A.

    2015-10-01

    The Hyperspectral images due to its good spectral resolution are extensively used for classification, but its high number of bands requires a higher bandwidth in the transmission data, a higher data storage capability and a higher computational capability in processing systems. This work presents a new methodology for hyperspectral data classification that can work with a reduced number of spectral bands and achieve good results, comparable with processing methods that require all hyperspectral bands. The proposed method for hyperspectral spectra classification is based on the Hidden Markov Model (HMM) associated to each Endmember (EM) of a scene and the conditional probabilities of each EM belongs to each other EM. The EM conditional probability is transformed in EM vector entropy and those vectors are used as reference vectors for the classes in the scene. The conditional probability of a spectrum that will be classified is also transformed in a spectrum entropy vector, which is classified in a given class by the minimum ED (Euclidian Distance) among it and the EM entropy vectors. The methodology was tested with good results using AVIRIS spectra of a scene with 13 EM considering the full 209 bands and the reduced spectral bands of 128, 64 and 32. For the test area its show that can be used only 32 spectral bands instead of the original 209 bands, without significant loss in the classification process.

  1. Learning machines and sleeping brains: Automatic sleep stage classification using decision-tree multi-class support vector machines.

    PubMed

    Lajnef, Tarek; Chaibi, Sahbi; Ruby, Perrine; Aguera, Pierre-Emmanuel; Eichenlaub, Jean-Baptiste; Samet, Mounir; Kachouri, Abdennaceur; Jerbi, Karim

    2015-07-30

    Sleep staging is a critical step in a range of electrophysiological signal processing pipelines used in clinical routine as well as in sleep research. Although the results currently achievable with automatic sleep staging methods are promising, there is need for improvement, especially given the time-consuming and tedious nature of visual sleep scoring. Here we propose a sleep staging framework that consists of a multi-class support vector machine (SVM) classification based on a decision tree approach. The performance of the method was evaluated using polysomnographic data from 15 subjects (electroencephalogram (EEG), electrooculogram (EOG) and electromyogram (EMG) recordings). The decision tree, or dendrogram, was obtained using a hierarchical clustering technique and a wide range of time and frequency-domain features were extracted. Feature selection was carried out using forward sequential selection and classification was evaluated using k-fold cross-validation. The dendrogram-based SVM (DSVM) achieved mean specificity, sensitivity and overall accuracy of 0.92, 0.74 and 0.88 respectively, compared to expert visual scoring. Restricting DSVM classification to data where both experts' scoring was consistent (76.73% of the data) led to a mean specificity, sensitivity and overall accuracy of 0.94, 0.82 and 0.92 respectively. The DSVM framework outperforms classification with more standard multi-class "one-against-all" SVM and linear-discriminant analysis. The promising results of the proposed methodology suggest that it may be a valuable alternative to existing automatic methods and that it could accelerate visual scoring by providing a robust starting hypnogram that can be further fine-tuned by expert inspection. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Research on intrusion detection based on Kohonen network and support vector machine

    NASA Astrophysics Data System (ADS)

    Shuai, Chunyan; Yang, Hengcheng; Gong, Zeweiyi

    2018-05-01

    In view of the problem of low detection accuracy and the long detection time of support vector machine, which directly applied to the network intrusion detection system. Optimization of SVM parameters can greatly improve the detection accuracy, but it can not be applied to high-speed network because of the long detection time. a method based on Kohonen neural network feature selection is proposed to reduce the optimization time of support vector machine parameters. Firstly, this paper is to calculate the weights of the KDD99 network intrusion data by Kohonen network and select feature by weight. Then, after the feature selection is completed, genetic algorithm (GA) and grid search method are used for parameter optimization to find the appropriate parameters and classify them by support vector machines. By comparing experiments, it is concluded that feature selection can reduce the time of parameter optimization, which has little influence on the accuracy of classification. The experiments suggest that the support vector machine can be used in the network intrusion detection system and reduce the missing rate.

  3. Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs.

    PubMed

    Shamim, Mohammad Tabrez Anwar; Anwaruddin, Mohammad; Nagarajaram, H A

    2007-12-15

    Fold recognition is a key step in the protein structure discovery process, especially when traditional sequence comparison methods fail to yield convincing structural homologies. Although many methods have been developed for protein fold recognition, their accuracies remain low. This can be attributed to insufficient exploitation of fold discriminatory features. We have developed a new method for protein fold recognition using structural information of amino acid residues and amino acid residue pairs. Since protein fold recognition can be treated as a protein fold classification problem, we have developed a Support Vector Machine (SVM) based classifier approach that uses secondary structural state and solvent accessibility state frequencies of amino acids and amino acid pairs as feature vectors. Among the individual properties examined secondary structural state frequencies of amino acids gave an overall accuracy of 65.2% for fold discrimination, which is better than the accuracy by any method reported so far in the literature. Combination of secondary structural state frequencies with solvent accessibility state frequencies of amino acids and amino acid pairs further improved the fold discrimination accuracy to more than 70%, which is approximately 8% higher than the best available method. In this study we have also tested, for the first time, an all-together multi-class method known as Crammer and Singer method for protein fold classification. Our studies reveal that the three multi-class classification methods, namely one versus all, one versus one and Crammer and Singer method, yield similar predictions. Dataset and stand-alone program are available upon request.

  4. A multi-label learning based kernel automatic recommendation method for support vector machine.

    PubMed

    Zhang, Xueying; Song, Qinbao

    2015-01-01

    Choosing an appropriate kernel is very important and critical when classifying a new problem with Support Vector Machine. So far, more attention has been paid on constructing new kernels and choosing suitable parameter values for a specific kernel function, but less on kernel selection. Furthermore, most of current kernel selection methods focus on seeking a best kernel with the highest classification accuracy via cross-validation, they are time consuming and ignore the differences among the number of support vectors and the CPU time of SVM with different kernels. Considering the tradeoff between classification success ratio and CPU time, there may be multiple kernel functions performing equally well on the same classification problem. Aiming to automatically select those appropriate kernel functions for a given data set, we propose a multi-label learning based kernel recommendation method built on the data characteristics. For each data set, the meta-knowledge data base is first created by extracting the feature vector of data characteristics and identifying the corresponding applicable kernel set. Then the kernel recommendation model is constructed on the generated meta-knowledge data base with the multi-label classification method. Finally, the appropriate kernel functions are recommended to a new data set by the recommendation model according to the characteristics of the new data set. Extensive experiments over 132 UCI benchmark data sets, with five different types of data set characteristics, eleven typical kernels (Linear, Polynomial, Radial Basis Function, Sigmoidal function, Laplace, Multiquadric, Rational Quadratic, Spherical, Spline, Wave and Circular), and five multi-label classification methods demonstrate that, compared with the existing kernel selection methods and the most widely used RBF kernel function, SVM with the kernel function recommended by our proposed method achieved the highest classification performance.

  5. A Multi-Label Learning Based Kernel Automatic Recommendation Method for Support Vector Machine

    PubMed Central

    Zhang, Xueying; Song, Qinbao

    2015-01-01

    Choosing an appropriate kernel is very important and critical when classifying a new problem with Support Vector Machine. So far, more attention has been paid on constructing new kernels and choosing suitable parameter values for a specific kernel function, but less on kernel selection. Furthermore, most of current kernel selection methods focus on seeking a best kernel with the highest classification accuracy via cross-validation, they are time consuming and ignore the differences among the number of support vectors and the CPU time of SVM with different kernels. Considering the tradeoff between classification success ratio and CPU time, there may be multiple kernel functions performing equally well on the same classification problem. Aiming to automatically select those appropriate kernel functions for a given data set, we propose a multi-label learning based kernel recommendation method built on the data characteristics. For each data set, the meta-knowledge data base is first created by extracting the feature vector of data characteristics and identifying the corresponding applicable kernel set. Then the kernel recommendation model is constructed on the generated meta-knowledge data base with the multi-label classification method. Finally, the appropriate kernel functions are recommended to a new data set by the recommendation model according to the characteristics of the new data set. Extensive experiments over 132 UCI benchmark data sets, with five different types of data set characteristics, eleven typical kernels (Linear, Polynomial, Radial Basis Function, Sigmoidal function, Laplace, Multiquadric, Rational Quadratic, Spherical, Spline, Wave and Circular), and five multi-label classification methods demonstrate that, compared with the existing kernel selection methods and the most widely used RBF kernel function, SVM with the kernel function recommended by our proposed method achieved the highest classification performance. PMID:25893896

  6. Optimal classification for the diagnosis of duchenne muscular dystrophy images using support vector machines.

    PubMed

    Zhang, Ming-Huan; Ma, Jun-Shan; Shen, Ying; Chen, Ying

    2016-09-01

    This study aimed to investigate the optimal support vector machines (SVM)-based classifier of duchenne muscular dystrophy (DMD) magnetic resonance imaging (MRI) images. T1-weighted (T1W) and T2-weighted (T2W) images of the 15 boys with DMD and 15 normal controls were obtained. Textural features of the images were extracted and wavelet decomposed, and then, principal features were selected. Scale transform was then performed for MRI images. Afterward, SVM-based classifiers of MRI images were analyzed based on the radical basis function and decomposition levels. The cost (C) parameter and kernel parameter [Formula: see text] were used for classification. Then, the optimal SVM-based classifier, expressed as [Formula: see text]), was identified by performance evaluation (sensitivity, specificity and accuracy). Eight of 12 textural features were selected as principal features (eigenvalues [Formula: see text]). The 16 SVM-based classifiers were obtained using combination of (C, [Formula: see text]), and those with lower C and [Formula: see text] values showed higher performances, especially classifier of [Formula: see text]). The SVM-based classifiers of T1W images showed higher performance than T1W images at the same decomposition level. The T1W images in classifier of [Formula: see text]) at level 2 decomposition showed the highest performance of all, and its overall correct sensitivity, specificity, and accuracy reached 96.9, 97.3, and 97.1 %, respectively. The T1W images in SVM-based classifier [Formula: see text] at level 2 decomposition showed the highest performance of all, demonstrating that it was the optimal classification for the diagnosis of DMD.

  7. A targeted change-detection procedure by combining change vector analysis and post-classification approach

    NASA Astrophysics Data System (ADS)

    Ye, Su; Chen, Dongmei; Yu, Jie

    2016-04-01

    In remote sensing, conventional supervised change-detection methods usually require effective training data for multiple change types. This paper introduces a more flexible and efficient procedure that seeks to identify only the changes that users are interested in, here after referred to as "targeted change detection". Based on a one-class classifier "Support Vector Domain Description (SVDD)", a novel algorithm named "Three-layer SVDD Fusion (TLSF)" is developed specially for targeted change detection. The proposed algorithm combines one-class classification generated from change vector maps, as well as before- and after-change images in order to get a more reliable detecting result. In addition, this paper introduces a detailed workflow for implementing this algorithm. This workflow has been applied to two case studies with different practical monitoring objectives: urban expansion and forest fire assessment. The experiment results of these two case studies show that the overall accuracy of our proposed algorithm is superior (Kappa statistics are 86.3% and 87.8% for Case 1 and 2, respectively), compared to applying SVDD to change vector analysis and post-classification comparison.

  8. Distance Metric Learning via Iterated Support Vector Machines.

    PubMed

    Zuo, Wangmeng; Wang, Faqiang; Zhang, David; Lin, Liang; Huang, Yuchi; Meng, Deyu; Zhang, Lei

    2017-07-11

    Distance metric learning aims to learn from the given training data a valid distance metric, with which the similarity between data samples can be more effectively evaluated for classification. Metric learning is often formulated as a convex or nonconvex optimization problem, while most existing methods are based on customized optimizers and become inefficient for large scale problems. In this paper, we formulate metric learning as a kernel classification problem with the positive semi-definite constraint, and solve it by iterated training of support vector machines (SVMs). The new formulation is easy to implement and efficient in training with the off-the-shelf SVM solvers. Two novel metric learning models, namely Positive-semidefinite Constrained Metric Learning (PCML) and Nonnegative-coefficient Constrained Metric Learning (NCML), are developed. Both PCML and NCML can guarantee the global optimality of their solutions. Experiments are conducted on general classification, face verification and person re-identification to evaluate our methods. Compared with the state-of-the-art approaches, our methods can achieve comparable classification accuracy and are efficient in training.

  9. Automatic classification of 6-month-old infants at familial risk for language-based learning disorder using a support vector machine.

    PubMed

    Zare, Marzieh; Rezvani, Zahra; Benasich, April A

    2016-07-01

    This study assesses the ability of a novel, "automatic classification" approach to facilitate identification of infants at highest familial risk for language-learning disorders (LLD) and to provide converging assessments to enable earlier detection of developmental disorders that disrupt language acquisition. Network connectivity measures derived from 62-channel electroencephalogram (EEG) recording were used to identify selected features within two infant groups who differed on LLD risk: infants with a family history of LLD (FH+) and typically-developing infants without such a history (FH-). A support vector machine was deployed; global efficiency and global and local clustering coefficients were computed. A novel minimum spanning tree (MST) approach was also applied. Cross-validation was employed to assess the resultant classification. Infants were classified with about 80% accuracy into FH+ and FH- groups with 89% specificity and precision of 92%. Clustering patterns differed by risk group and MST network analysis suggests that FH+ infants' EEG complexity patterns were significantly different from FH- infants. The automatic classification techniques used here were shown to be both robust and reliable and should provide valuable information when applied to early identification of risk or clinical groups. The ability to identify infants at highest risk for LLD using "automatic classification" strategies is a novel convergent approach that may facilitate earlier diagnosis and remediation. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

  10. Deep neural mapping support vector machines.

    PubMed

    Li, Yujian; Zhang, Ting

    2017-09-01

    The choice of kernel has an important effect on the performance of a support vector machine (SVM). The effect could be reduced by NEUROSVM, an architecture using multilayer perceptron for feature extraction and SVM for classification. In binary classification, a general linear kernel NEUROSVM can be theoretically simplified as an input layer, many hidden layers, and an SVM output layer. As a feature extractor, the sub-network composed of the input and hidden layers is first trained together with a virtual ordinary output layer by backpropagation, then with the output of its last hidden layer taken as input of the SVM classifier for further training separately. By taking the sub-network as a kernel mapping from the original input space into a feature space, we present a novel model, called deep neural mapping support vector machine (DNMSVM), from the viewpoint of deep learning. This model is also a new and general kernel learning method, where the kernel mapping is indeed an explicit function expressed as a sub-network, different from an implicit function induced by a kernel function traditionally. Moreover, we exploit a two-stage procedure of contrastive divergence learning and gradient descent for DNMSVM to jointly training an adaptive kernel mapping instead of a kernel function, without requirement of kernel tricks. As a whole of the sub-network and the SVM classifier, the joint training of DNMSVM is done by using gradient descent to optimize the objective function with the sub-network layer-wise pre-trained via contrastive divergence learning of restricted Boltzmann machines. Compared to the separate training of NEUROSVM, the joint training is a new algorithm for DNMSVM to have advantages over NEUROSVM. Experimental results show that DNMSVM can outperform NEUROSVM and RBFSVM (i.e., SVM with the kernel of radial basis function), demonstrating its effectiveness. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Support vector machine for breast cancer classification using diffusion-weighted MRI histogram features: Preliminary study.

    PubMed

    Vidić, Igor; Egnell, Liv; Jerome, Neil P; Teruel, Jose R; Sjøbakk, Torill E; Østlie, Agnes; Fjøsne, Hans E; Bathen, Tone F; Goa, Pål Erik

    2018-05-01

    Diffusion-weighted MRI (DWI) is currently one of the fastest developing MRI-based techniques in oncology. Histogram properties from model fitting of DWI are useful features for differentiation of lesions, and classification can potentially be improved by machine learning. To evaluate classification of malignant and benign tumors and breast cancer subtypes using support vector machine (SVM). Prospective. Fifty-one patients with benign (n = 23) and malignant (n = 28) breast tumors (26 ER+, whereof six were HER2+). Patients were imaged with DW-MRI (3T) using twice refocused spin-echo echo-planar imaging with echo time / repetition time (TR/TE) = 9000/86 msec, 90 × 90 matrix size, 2 × 2 mm in-plane resolution, 2.5 mm slice thickness, and 13 b-values. Apparent diffusion coefficient (ADC), relative enhanced diffusivity (RED), and the intravoxel incoherent motion (IVIM) parameters diffusivity (D), pseudo-diffusivity (D*), and perfusion fraction (f) were calculated. The histogram properties (median, mean, standard deviation, skewness, kurtosis) were used as features in SVM (10-fold cross-validation) for differentiation of lesions and subtyping. Accuracies of the SVM classifications were calculated to find the combination of features with highest prediction accuracy. Mann-Whitney tests were performed for univariate comparisons. For benign versus malignant tumors, univariate analysis found 11 histogram properties to be significant differentiators. Using SVM, the highest accuracy (0.96) was achieved from a single feature (mean of RED), or from three feature combinations of IVIM or ADC. Combining features from all models gave perfect classification. No single feature predicted HER2 status of ER + tumors (univariate or SVM), although high accuracy (0.90) was achieved with SVM combining several features. Importantly, these features had to include higher-order statistics (kurtosis and skewness), indicating the importance to account for heterogeneity. Our

  12. Object-based delineation and classification of alluvial fans by application of mean-shift segmentation and support vector machines

    NASA Astrophysics Data System (ADS)

    Pipaud, Isabel; Lehmkuhl, Frank

    2017-09-01

    In the field of geomorphology, automated extraction and classification of landforms is one of the most active research areas. Until the late 2000s, this task has primarily been tackled using pixel-based approaches. As these methods consider pixels and pixel neighborhoods as the sole basic entities for analysis, they cannot account for the irregular boundaries of real-world objects. Object-based analysis frameworks emerging from the field of remote sensing have been proposed as an alternative approach, and were successfully applied in case studies falling in the domains of both general and specific geomorphology. In this context, the a-priori selection of scale parameters or bandwidths is crucial for the segmentation result, because inappropriate parametrization will either result in over-segmentation or insufficient segmentation. In this study, we describe a novel supervised method for delineation and classification of alluvial fans, and assess its applicability using a SRTM 1‧‧ DEM scene depicting a section of the north-eastern Mongolian Altai, located in northwest Mongolia. The approach is premised on the application of mean-shift segmentation and the use of a one-class support vector machine (SVM) for classification. To consider variability in terms of alluvial fan dimension and shape, segmentation is performed repeatedly for different weightings of the incorporated morphometric parameters as well as different segmentation bandwidths. The final classification layer is obtained by selecting, for each real-world object, the most appropriate segmentation result according to fuzzy membership values derived from the SVM classification. Our results show that mean-shift segmentation and SVM-based classification provide an effective framework for delineation and classification of a particular landform. Variable bandwidths and terrain parameter weightings were identified as being crucial for consideration of intra-class variability, and, in turn, for a constantly

  13. Improvements on ν-Twin Support Vector Machine.

    PubMed

    Khemchandani, Reshma; Saigal, Pooja; Chandra, Suresh

    2016-07-01

    In this paper, we propose two novel binary classifiers termed as "Improvements on ν-Twin Support Vector Machine: Iν-TWSVM and Iν-TWSVM (Fast)" that are motivated by ν-Twin Support Vector Machine (ν-TWSVM). Similar to ν-TWSVM, Iν-TWSVM determines two nonparallel hyperplanes such that they are closer to their respective classes and are at least ρ distance away from the other class. The significant advantage of Iν-TWSVM over ν-TWSVM is that Iν-TWSVM solves one smaller-sized Quadratic Programming Problem (QPP) and one Unconstrained Minimization Problem (UMP); as compared to solving two related QPPs in ν-TWSVM. Further, Iν-TWSVM (Fast) avoids solving a smaller sized QPP and transforms it as a unimodal function, which can be solved using line search methods and similar to Iν-TWSVM, the other problem is solved as a UMP. Due to their novel formulation, the proposed classifiers are faster than ν-TWSVM and have comparable generalization ability. Iν-TWSVM also implements structural risk minimization (SRM) principle by introducing a regularization term, along with minimizing the empirical risk. The other properties of Iν-TWSVM, related to support vectors (SVs), are similar to that of ν-TWSVM. To test the efficacy of the proposed method, experiments have been conducted on a wide range of UCI and a skewed variation of NDC datasets. We have also given the application of Iν-TWSVM as a binary classifier for pixel classification of color images. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Application of support vector machines for copper potential mapping in Kerman region, Iran

    NASA Astrophysics Data System (ADS)

    Shabankareh, Mahdi; Hezarkhani, Ardeshir

    2017-04-01

    The first step in systematic exploration studies is mineral potential mapping, which involves classification of the study area to favorable and unfavorable parts. Support vector machines (SVM) are designed for supervised classification based on statistical learning theory. This method named support vector classification (SVC). This paper describes SVC model, which combine exploration data in the regional-scale for copper potential mapping in Kerman copper bearing belt in south of Iran. Data layers or evidential maps were in six datasets namely lithology, tectonic, airborne geophysics, ferric alteration, hydroxide alteration and geochemistry. The SVC modeling result selected 2220 pixels as favorable zones, approximately 25 percent of the study area. Besides, 66 out of 86 copper indices, approximately 78.6% of all, were located in favorable zones. Other main goal of this study was to determine how each input affects favorable output. For this purpose, the histogram of each normalized input data to its favorable output was drawn. The histograms of each input dataset for favorable output showed that each information layer had a certain pattern. These patterns of SVC results could be considered as regional copper exploration characteristics.

  15. Prediction of Skin Sensitization with a Particle Swarm Optimized Support Vector Machine

    PubMed Central

    Yuan, Hua; Huang, Jianping; Cao, Chenzhong

    2009-01-01

    Skin sensitization is the most commonly reported occupational illness, causing much suffering to a wide range of people. Identification and labeling of environmental allergens is urgently required to protect people from skin sensitization. The guinea pig maximization test (GPMT) and murine local lymph node assay (LLNA) are the two most important in vivo models for identification of skin sensitizers. In order to reduce the number of animal tests, quantitative structure-activity relationships (QSARs) are strongly encouraged in the assessment of skin sensitization of chemicals. This paper has investigated the skin sensitization potential of 162 compounds with LLNA results and 92 compounds with GPMT results using a support vector machine. A particle swarm optimization algorithm was implemented for feature selection from a large number of molecular descriptors calculated by Dragon. For the LLNA data set, the classification accuracies are 95.37% and 88.89% for the training and the test sets, respectively. For the GPMT data set, the classification accuracies are 91.80% and 90.32% for the training and the test sets, respectively. The classification performances were greatly improved compared to those reported in the literature, indicating that the support vector machine optimized by particle swarm in this paper is competent for the identification of skin sensitizers. PMID:19742136

  16. Granular support vector machines with association rules mining for protein homology prediction.

    PubMed

    Tang, Yuchun; Jin, Bo; Zhang, Yan-Qing

    2005-01-01

    Protein homology prediction between protein sequences is one of critical problems in computational biology. Such a complex classification problem is common in medical or biological information processing applications. How to build a model with superior generalization capability from training samples is an essential issue for mining knowledge to accurately predict/classify unseen new samples and to effectively support human experts to make correct decisions. A new learning model called granular support vector machines (GSVM) is proposed based on our previous work. GSVM systematically and formally combines the principles from statistical learning theory and granular computing theory and thus provides an interesting new mechanism to address complex classification problems. It works by building a sequence of information granules and then building support vector machines (SVM) in some of these information granules on demand. A good granulation method to find suitable granules is crucial for modeling a GSVM with good performance. In this paper, we also propose an association rules-based granulation method. For the granules induced by association rules with high enough confidence and significant support, we leave them as they are because of their high "purity" and significant effect on simplifying the classification task. For every other granule, a SVM is modeled to discriminate the corresponding data. In this way, a complex classification problem is divided into multiple smaller problems so that the learning task is simplified. The proposed algorithm, here named GSVM-AR, is compared with SVM by KDDCUP04 protein homology prediction data. The experimental results show that finding the splitting hyperplane is not a trivial task (we should be careful to select the association rules to avoid overfitting) and GSVM-AR does show significant improvement compared to building one single SVM in the whole feature space. Another advantage is that the utility of GSVM-AR is very good

  17. An ensemble approach to protein fold classification by integration of template-based assignment and support vector machine classifier.

    PubMed

    Xia, Jiaqi; Peng, Zhenling; Qi, Dawei; Mu, Hongbo; Yang, Jianyi

    2017-03-15

    Protein fold classification is a critical step in protein structure prediction. There are two possible ways to classify protein folds. One is through template-based fold assignment and the other is ab-initio prediction using machine learning algorithms. Combination of both solutions to improve the prediction accuracy was never explored before. We developed two algorithms, HH-fold and SVM-fold for protein fold classification. HH-fold is a template-based fold assignment algorithm using the HHsearch program. SVM-fold is a support vector machine-based ab-initio classification algorithm, in which a comprehensive set of features are extracted from three complementary sequence profiles. These two algorithms are then combined, resulting to the ensemble approach TA-fold. We performed a comprehensive assessment for the proposed methods by comparing with ab-initio methods and template-based threading methods on six benchmark datasets. An accuracy of 0.799 was achieved by TA-fold on the DD dataset that consists of proteins from 27 folds. This represents improvement of 5.4-11.7% over ab-initio methods. After updating this dataset to include more proteins in the same folds, the accuracy increased to 0.971. In addition, TA-fold achieved >0.9 accuracy on a large dataset consisting of 6451 proteins from 184 folds. Experiments on the LE dataset show that TA-fold consistently outperforms other threading methods at the family, superfamily and fold levels. The success of TA-fold is attributed to the combination of template-based fold assignment and ab-initio classification using features from complementary sequence profiles that contain rich evolution information. http://yanglab.nankai.edu.cn/TA-fold/. yangjy@nankai.edu.cn or mhb-506@163.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  18. Classification of Alzheimer's disease patients with hippocampal shape wrapper-based feature selection and support vector machine

    NASA Astrophysics Data System (ADS)

    Young, Jonathan; Ridgway, Gerard; Leung, Kelvin; Ourselin, Sebastien

    2012-02-01

    It is well known that hippocampal atrophy is a marker of the onset of Alzheimer's disease (AD) and as a result hippocampal volumetry has been used in a number of studies to provide early diagnosis of AD and predict conversion of mild cognitive impairment patients to AD. However, rates of atrophy are not uniform across the hippocampus making shape analysis a potentially more accurate biomarker. This study studies the hippocampi from 226 healthy controls, 148 AD patients and 330 MCI patients obtained from T1 weighted structural MRI images from the ADNI database. The hippocampi are anatomically segmented using the MAPS multi-atlas segmentation method, and the resulting binary images are then processed with SPHARM software to decompose their shapes as a weighted sum of spherical harmonic basis functions. The resulting parameterizations are then used as feature vectors in Support Vector Machine (SVM) classification. A wrapper based feature selection method was used as this considers the utility of features in discriminating classes in combination, fully exploiting the multivariate nature of the data and optimizing the selected set of features for the type of classifier that is used. The leave-one-out cross validated accuracy obtained on training data is 88.6% for classifying AD vs controls and 74% for classifying MCI-converters vs MCI-stable with very compact feature sets, showing that this is a highly promising method. There is currently a considerable fall in accuracy on unseen data indicating that the feature selection is sensitive to the data used, however feature ensemble methods may overcome this.

  19. Justification of Fuzzy Declustering Vector Quantization Modeling in Classification of Genotype-Image Phenotypes

    NASA Astrophysics Data System (ADS)

    Ng, Theam Foo; Pham, Tuan D.; Zhou, Xiaobo

    2010-01-01

    With the fast development of multi-dimensional data compression and pattern classification techniques, vector quantization (VQ) has become a system that allows large reduction of data storage and computational effort. One of the most recent VQ techniques that handle the poor estimation of vector centroids due to biased data from undersampling is to use fuzzy declustering-based vector quantization (FDVQ) technique. Therefore, in this paper, we are motivated to propose a justification of FDVQ based hidden Markov model (HMM) for investigating its effectiveness and efficiency in classification of genotype-image phenotypes. The performance evaluation and comparison of the recognition accuracy between a proposed FDVQ based HMM (FDVQ-HMM) and a well-known LBG (Linde, Buzo, Gray) vector quantization based HMM (LBG-HMM) will be carried out. The experimental results show that the performances of both FDVQ-HMM and LBG-HMM are almost similar. Finally, we have justified the competitiveness of FDVQ-HMM in classification of cellular phenotype image database by using hypotheses t-test. As a result, we have validated that the FDVQ algorithm is a robust and an efficient classification technique in the application of RNAi genome-wide screening image data.

  20. Landslides Identification Using Airborne Laser Scanning Data Derived Topographic Terrain Attributes and Support Vector Machine Classification

    NASA Astrophysics Data System (ADS)

    Pawłuszek, Kamila; Borkowski, Andrzej

    2016-06-01

    Since the availability of high-resolution Airborne Laser Scanning (ALS) data, substantial progress in geomorphological research, especially in landslide analysis, has been carried out. First and second order derivatives of Digital Terrain Model (DTM) have become a popular and powerful tool in landslide inventory mapping. Nevertheless, an automatic landslide mapping based on sophisticated classifiers including Support Vector Machine (SVM), Artificial Neural Network or Random Forests is often computationally time consuming. The objective of this research is to deeply explore topographic information provided by ALS data and overcome computational time limitation. For this reason, an extended set of topographic features and the Principal Component Analysis (PCA) were used to reduce redundant information. The proposed novel approach was tested on a susceptible area affected by more than 50 landslides located on Rożnów Lake in Carpathian Mountains, Poland. The initial seven PCA components with 90% of the total variability in the original topographic attributes were used for SVM classification. Comparing results with landslide inventory map, the average user's accuracy (UA), producer's accuracy (PA), and overall accuracy (OA) were calculated for two models according to the classification results. Thereby, for the PCA-feature-reduced model UA, PA, and OA were found to be 72%, 76%, and 72%, respectively. Similarly, UA, PA, and OA in the non-reduced original topographic model, was 74%, 77% and 74%, respectively. Using the initial seven PCA components instead of the twenty original topographic attributes does not significantly change identification accuracy but reduce computational time.

  1. Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.

    PubMed

    Polat, Huseyin; Danaei Mehr, Homay; Cetin, Aydin

    2017-04-01

    As Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only cure to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because of their classification ability with high accuracy rates. The accuracy of classification algorithms depend on the use of correct feature selection algorithms to reduce the dimension of datasets. In this study, Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. To diagnose the Chronic Kidney Disease, two essential types of feature selection methods namely, wrapper and filter approaches were chosen to reduce the dimension of Chronic Kidney Disease dataset. In wrapper approach, classifier subset evaluator with greedy stepwise search engine and wrapper subset evaluator with the Best First search engine were used. In filter approach, correlation feature selection subset evaluator with greedy stepwise search engine and filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier by using filtered subset evaluator with the Best First search engine feature selection method has higher accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease compared to other selected methods.

  2. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery

    PubMed Central

    Thanh Noi, Phan; Kappas, Martin

    2017-01-01

    In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km2 within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets. PMID:29271909

  3. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery.

    PubMed

    Thanh Noi, Phan; Kappas, Martin

    2017-12-22

    In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km² within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets.

  4. Quantum optimization for training support vector machines.

    PubMed

    Anguita, Davide; Ridella, Sandro; Rivieccio, Fabio; Zunino, Rodolfo

    2003-01-01

    Refined concepts, such as Rademacher estimates of model complexity and nonlinear criteria for weighting empirical classification errors, represent recent and promising approaches to characterize the generalization ability of Support Vector Machines (SVMs). The advantages of those techniques lie in both improving the SVM representation ability and yielding tighter generalization bounds. On the other hand, they often make Quadratic-Programming algorithms no longer applicable, and SVM training cannot benefit from efficient, specialized optimization techniques. The paper considers the application of Quantum Computing to solve the problem of effective SVM training, especially in the case of digital implementations. The presented research compares the behavioral aspects of conventional and enhanced SVMs; experiments in both a synthetic and real-world problems support the theoretical analysis. At the same time, the related differences between Quadratic-Programming and Quantum-based optimization techniques are considered.

  5. Design of Clinical Support Systems Using Integrated Genetic Algorithm and Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Chen, Yung-Fu; Huang, Yung-Fa; Jiang, Xiaoyi; Hsu, Yuan-Nian; Lin, Hsuan-Hung

    Clinical decision support system (CDSS) provides knowledge and specific information for clinicians to enhance diagnostic efficiency and improving healthcare quality. An appropriate CDSS can highly elevate patient safety, improve healthcare quality, and increase cost-effectiveness. Support vector machine (SVM) is believed to be superior to traditional statistical and neural network classifiers. However, it is critical to determine suitable combination of SVM parameters regarding classification performance. Genetic algorithm (GA) can find optimal solution within an acceptable time, and is faster than greedy algorithm with exhaustive searching strategy. By taking the advantage of GA in quickly selecting the salient features and adjusting SVM parameters, a method using integrated GA and SVM (IGS), which is different from the traditional method with GA used for feature selection and SVM for classification, was used to design CDSSs for prediction of successful ventilation weaning, diagnosis of patients with severe obstructive sleep apnea, and discrimination of different cell types form Pap smear. The results show that IGS is better than methods using SVM alone or linear discriminator.

  6. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

    PubMed Central

    2011-01-01

    Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the larger overall classification accuracy (Median (Me) = 0.76) an area under the ROC (Me = 0.90). However this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73) specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72) specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed

  7. Multiscale asymmetric orthogonal wavelet kernel for linear programming support vector learning and nonlinear dynamic systems identification.

    PubMed

    Lu, Zhao; Sun, Jing; Butts, Kenneth

    2014-05-01

    Support vector regression for approximating nonlinear dynamic systems is more delicate than the approximation of indicator functions in support vector classification, particularly for systems that involve multitudes of time scales in their sampled data. The kernel used for support vector learning determines the class of functions from which a support vector machine can draw its solution, and the choice of kernel significantly influences the performance of a support vector machine. In this paper, to bridge the gap between wavelet multiresolution analysis and kernel learning, the closed-form orthogonal wavelet is exploited to construct new multiscale asymmetric orthogonal wavelet kernels for linear programming support vector learning. The closed-form multiscale orthogonal wavelet kernel provides a systematic framework to implement multiscale kernel learning via dyadic dilations and also enables us to represent complex nonlinear dynamics effectively. To demonstrate the superiority of the proposed multiscale wavelet kernel in identifying complex nonlinear dynamic systems, two case studies are presented that aim at building parallel models on benchmark datasets. The development of parallel models that address the long-term/mid-term prediction issue is more intricate and challenging than the identification of series-parallel models where only one-step ahead prediction is required. Simulation results illustrate the effectiveness of the proposed multiscale kernel learning.

  8. CASAnova: a multiclass support vector machine model for the classification of human sperm motility patterns.

    PubMed

    Goodson, Summer G; White, Sarah; Stevans, Alicia M; Bhat, Sanjana; Kao, Chia-Yu; Jaworski, Scott; Marlowe, Tamara R; Kohlmeier, Martin; McMillan, Leonard; Zeisel, Steven H; O'Brien, Deborah A

    2017-11-01

    The ability to accurately monitor alterations in sperm motility is paramount to understanding multiple genetic and biochemical perturbations impacting normal fertilization. Computer-aided sperm analysis (CASA) of human sperm typically reports motile percentage and kinematic parameters at the population level, and uses kinematic gating methods to identify subpopulations such as progressive or hyperactivated sperm. The goal of this study was to develop an automated method that classifies all patterns of human sperm motility during in vitro capacitation following the removal of seminal plasma. We visually classified CASA tracks of 2817 sperm from 18 individuals and used a support vector machine-based decision tree to compute four hyperplanes that separate five classes based on their kinematic parameters. We then developed a web-based program, CASAnova, which applies these equations sequentially to assign a single classification to each motile sperm. Vigorous sperm are classified as progressive, intermediate, or hyperactivated, and nonvigorous sperm as slow or weakly motile. This program correctly classifies sperm motility into one of five classes with an overall accuracy of 89.9%. Application of CASAnova to capacitating sperm populations showed a shift from predominantly linear patterns of motility at initial time points to more vigorous patterns, including hyperactivated motility, as capacitation proceeds. Both intermediate and hyperactivated motility patterns were largely eliminated when sperm were incubated in noncapacitating medium, demonstrating the sensitivity of this method. The five CASAnova classifications are distinctive and reflect kinetic parameters of washed human sperm, providing an accurate, quantitative, and high-throughput method for monitoring alterations in motility. © The Authors 2017. Published by Oxford University Press on behalf of Society for the Study of Reproduction. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  9. Computer-aided classification of Alzheimer's disease based on support vector machine with combination of cerebral image features in MRI

    NASA Astrophysics Data System (ADS)

    Jongkreangkrai, C.; Vichianin, Y.; Tocharoenchai, C.; Arimura, H.; Alzheimer's Disease Neuroimaging Initiative

    2016-03-01

    Several studies have differentiated Alzheimer's disease (AD) using cerebral image features derived from MR brain images. In this study, we were interested in combining hippocampus and amygdala volumes and entorhinal cortex thickness to improve the performance of AD differentiation. Thus, our objective was to investigate the useful features obtained from MRI for classification of AD patients using support vector machine (SVM). T1-weighted MR brain images of 100 AD patients and 100 normal subjects were processed using FreeSurfer software to measure hippocampus and amygdala volumes and entorhinal cortex thicknesses in both brain hemispheres. Relative volumes of hippocampus and amygdala were calculated to correct variation in individual head size. SVM was employed with five combinations of features (H: hippocampus relative volumes, A: amygdala relative volumes, E: entorhinal cortex thicknesses, HA: hippocampus and amygdala relative volumes and ALL: all features). Receiver operating characteristic (ROC) analysis was used to evaluate the method. AUC values of five combinations were 0.8575 (H), 0.8374 (A), 0.8422 (E), 0.8631 (HA) and 0.8906 (ALL). Although “ALL” provided the highest AUC, there were no statistically significant differences among them except for “A” feature. Our results showed that all suggested features may be feasible for computer-aided classification of AD patients.

  10. Analysis of spectrally resolved autofluorescence images by support vector machines

    NASA Astrophysics Data System (ADS)

    Mateasik, A.; Chorvat, D.; Chorvatova, A.

    2013-02-01

    Spectral analysis of the autofluorescence images of isolated cardiac cells was performed to evaluate and to classify the metabolic state of the cells in respect to the responses to metabolic modulators. The classification was done using machine learning approach based on support vector machine with the set of the automatically calculated features from recorded spectral profile of spectral autofluorescence images. This classification method was compared with the classical approach where the individual spectral components contributing to cell autofluorescence were estimated by spectral analysis, namely by blind source separation using non-negative matrix factorization. Comparison of both methods showed that machine learning can effectively classify the spectrally resolved autofluorescence images without the need of detailed knowledge about the sources of autofluorescence and their spectral properties.

  11. Support Vector Machine and Artificial Neural Network Models for the Classification of Grapevine Varieties Using a Portable NIR Spectrophotometer.

    PubMed

    Gutiérrez, Salvador; Tardaguila, Javier; Fernández-Novales, Juan; Diago, María P

    2015-01-01

    The identification of different grapevine varieties, currently attended using visual ampelometry, DNA analysis and very recently, by hyperspectral analysis under laboratory conditions, is an issue of great importance in the wine industry. This work presents support vector machine and artificial neural network's modelling for grapevine varietal classification from in-field leaf spectroscopy. Modelling was attempted at two scales: site-specific and a global scale. Spectral measurements were obtained on the near-infrared (NIR) spectral range between 1600 to 2400 nm under field conditions in a non-destructive way using a portable spectrophotometer. For the site specific approach, spectra were collected from the adaxial side of 400 individual leaves of 20 grapevine (Vitis vinifera L.) varieties one week after veraison. For the global model, two additional sets of spectra were collected one week before harvest from two different vineyards in another vintage, each one consisting on 48 measurement from individual leaves of six varieties. Several combinations of spectra scatter correction and smoothing filtering were studied. For the training of the models, support vector machines and artificial neural networks were employed using the pre-processed spectra as input and the varieties as the classes of the models. The results from the pre-processing study showed that there was no influence whether using scatter correction or not. Also, a second-degree derivative with a window size of 5 Savitzky-Golay filtering yielded the highest outcomes. For the site-specific model, with 20 classes, the best results from the classifiers thrown an overall score of 87.25% of correctly classified samples. These results were compared under the same conditions with a model trained using partial least squares discriminant analysis, which showed a worse performance in every case. For the global model, a 6-class dataset involving samples from three different vineyards, two years and leaves

  12. The construction of support vector machine classifier using the firefly algorithm.

    PubMed

    Chao, Chih-Feng; Horng, Ming-Huwi

    2015-01-01

    The setting of parameters in the support vector machines (SVMs) is very important with regard to its accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM). This tool is not considered the feature selection, because the SVM, together with feature selection, is not suitable for the application in a multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI), machine learning repository are used; additionally the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM). The experimental results advocate the use of firefly-SVM to classify pattern classifications for maximum accuracy.

  13. The Construction of Support Vector Machine Classifier Using the Firefly Algorithm

    PubMed Central

    Chao, Chih-Feng; Horng, Ming-Huwi

    2015-01-01

    The setting of parameters in the support vector machines (SVMs) is very important with regard to its accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM). This tool is not considered the feature selection, because the SVM, together with feature selection, is not suitable for the application in a multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI), machine learning repository are used; additionally the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM). The experimental results advocate the use of firefly-SVM to classify pattern classifications for maximum accuracy. PMID:25802511

  14. Curriculum Assessment Using Artificial Neural Network and Support Vector Machine Modeling Approaches: A Case Study. IR Applications. Volume 29

    ERIC Educational Resources Information Center

    Chen, Chau-Kuang

    2010-01-01

    Artificial Neural Network (ANN) and Support Vector Machine (SVM) approaches have been on the cutting edge of science and technology for pattern recognition and data classification. In the ANN model, classification accuracy can be achieved by using the feed-forward of inputs, back-propagation of errors, and the adjustment of connection weights. In…

  15. A low cost implementation of multi-parameter patient monitor using intersection kernel support vector machine classifier

    NASA Astrophysics Data System (ADS)

    Mohan, Dhanya; Kumar, C. Santhosh

    2016-03-01

    Predicting the physiological condition (normal/abnormal) of a patient is highly desirable to enhance the quality of health care. Multi-parameter patient monitors (MPMs) using heart rate, arterial blood pressure, respiration rate and oxygen saturation (S pO2) as input parameters were developed to monitor the condition of patients, with minimum human resource utilization. The Support vector machine (SVM), an advanced machine learning approach popularly used for classification and regression is used for the realization of MPMs. For making MPMs cost effective, we experiment on the hardware implementation of the MPM using support vector machine classifier. The training of the system is done using the matlab environment and the detection of the alarm/noalarm condition is implemented in hardware. We used different kernels for SVM classification and note that the best performance was obtained using intersection kernel SVM (IKSVM). The intersection kernel support vector machine classifier MPM has outperformed the best known MPM using radial basis function kernel by an absoute improvement of 2.74% in accuracy, 1.86% in sensitivity and 3.01% in specificity. The hardware model was developed based on the improved performance system using Verilog Hardware Description Language and was implemented on Altera cyclone-II development board.

  16. Improved Online Support Vector Machines Spam Filtering Using String Kernels

    NASA Astrophysics Data System (ADS)

    Amayri, Ola; Bouguila, Nizar

    A major bottleneck in electronic communications is the enormous dissemination of spam emails. Developing of suitable filters that can adequately capture those emails and achieve high performance rate become a main concern. Support vector machines (SVMs) have made a large contribution to the development of spam email filtering. Based on SVMs, the crucial problems in email classification are feature mapping of input emails and the choice of the kernels. In this paper, we present thorough investigation of several distance-based kernels and propose the use of string kernels and prove its efficiency in blocking spam emails. We detail a feature mapping variants in text classification (TC) that yield improved performance for the standard SVMs in filtering task. Furthermore, to cope for realtime scenarios we propose an online active framework for spam filtering.

  17. The Low Backscattering Objects Classification in Polsar Image Based on Bag of Words Model Using Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Yang, L.; Shi, L.; Li, P.; Yang, J.; Zhao, L.; Zhao, B.

    2018-04-01

    Due to the forward scattering and block of radar signal, the water, bare soil, shadow, named low backscattering objects (LBOs), often present low backscattering intensity in polarimetric synthetic aperture radar (PolSAR) image. Because the LBOs rise similar backscattering intensity and polarimetric responses, the spectral-based classifiers are inefficient to deal with LBO classification, such as Wishart method. Although some polarimetric features had been exploited to relieve the confusion phenomenon, the backscattering features are still found unstable when the system noise floor varies in the range direction. This paper will introduce a simple but effective scene classification method based on Bag of Words (BoW) model using Support Vector Machine (SVM) to discriminate the LBOs, without relying on any polarimetric features. In the proposed approach, square windows are firstly opened around the LBOs adaptively to determine the scene images, and then the Scale-Invariant Feature Transform (SIFT) points are detected in training and test scenes. The several SIFT features detected are clustered using K-means to obtain certain cluster centers as the visual word lists and scene images are represented using word frequency. At last, the SVM is selected for training and predicting new scenes as some kind of LBOs. The proposed method is executed over two AIRSAR data sets at C band and L band, including water, bare soil and shadow scenes. The experimental results illustrate the effectiveness of the scene method in distinguishing LBOs.

  18. Evaluation and recognition of skin images with aging by support vector machine

    NASA Astrophysics Data System (ADS)

    Hu, Liangjun; Wu, Shulian; Li, Hui

    2016-10-01

    Aging is a very important issue not only in dermatology, but also cosmetic science. Cutaneous aging involves both chronological and photoaging aging process. The evaluation and classification of aging is an important issue with the medical cosmetology workers nowadays. The purpose of this study is to assess chronological-age-related and photo-age-related of human skin. The texture features of skin surface skin, such as coarseness, contrast were analyzed by Fourier transform and Tamura. And the aim of it is to detect the object hidden in the skin texture in difference aging skin. Then, Support vector machine was applied to train the texture feature. The different age's states were distinguished by the support vector machine (SVM) classifier. The results help us to further understand the mechanism of different aging skin from texture feature and help us to distinguish the different aging states.

  19. Facial Expression Recognition using Multiclass Ensemble Least-Square Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Lawi, Armin; Sya'Rani Machrizzandi, M.

    2018-03-01

    Facial expression is one of behavior characteristics of human-being. The use of biometrics technology system with facial expression characteristics makes it possible to recognize a person’s mood or emotion. The basic components of facial expression analysis system are face detection, face image extraction, facial classification and facial expressions recognition. This paper uses Principal Component Analysis (PCA) algorithm to extract facial features with expression parameters, i.e., happy, sad, neutral, angry, fear, and disgusted. Then Multiclass Ensemble Least-Squares Support Vector Machine (MELS-SVM) is used for the classification process of facial expression. The result of MELS-SVM model obtained from our 185 different expression images of 10 persons showed high accuracy level of 99.998% using RBF kernel.

  20. Incorporation of support vector machines in the LIBS toolbox for sensitive and robust classification amidst unexpected sample and system variability.

    PubMed

    Dingari, Narahara Chari; Barman, Ishan; Myakalwar, Ashwin Kumar; Tewari, Surya P; Kumar Gundawar, Manoj

    2012-03-20

    Despite the intrinsic elemental analysis capability and lack of sample preparation requirements, laser-induced breakdown spectroscopy (LIBS) has not been extensively used for real-world applications, e.g., quality assurance and process monitoring. Specifically, variability in sample, system, and experimental parameters in LIBS studies present a substantive hurdle for robust classification, even when standard multivariate chemometric techniques are used for analysis. Considering pharmaceutical sample investigation as an example, we propose the use of support vector machines (SVM) as a nonlinear classification method over conventional linear techniques such as soft independent modeling of class analogy (SIMCA) and partial least-squares discriminant analysis (PLS-DA) for discrimination based on LIBS measurements. Using over-the-counter pharmaceutical samples, we demonstrate that the application of SVM enables statistically significant improvements in prospective classification accuracy (sensitivity), because of its ability to address variability in LIBS sample ablation and plasma self-absorption behavior. Furthermore, our results reveal that SVM provides nearly 10% improvement in correct allocation rate and a concomitant reduction in misclassification rates of 75% (cf. PLS-DA) and 80% (cf. SIMCA)-when measurements from samples not included in the training set are incorporated in the test data-highlighting its robustness. While further studies on a wider matrix of sample types performed using different LIBS systems is needed to fully characterize the capability of SVM to provide superior predictions, we anticipate that the improved sensitivity and robustness observed here will facilitate application of the proposed LIBS-SVM toolbox for screening drugs and detecting counterfeit samples, as well as in related areas of forensic and biological sample analysis.

  1. Extraction and Analysis of Mega Cities’ Impervious Surface on Pixel-based and Object-oriented Support Vector Machine Classification Technology: A case of Bombay

    NASA Astrophysics Data System (ADS)

    Yu, S. S.; Sun, Z. C.; Sun, L.; Wu, M. F.

    2017-02-01

    The object of this paper is to study the impervious surface extraction method using remote sensing imagery and monitor the spatiotemporal changing patterns of mega cities. Megacity Bombay was selected as the interesting area. Firstly, the pixel-based and object-oriented support vector machine (SVM) classification methods were used to acquire the land use/land cover (LULC) products of Bombay in 2010. Consequently, the overall accuracy (OA) and overall Kappa (OK) of the pixel-based method were 94.97% and 0.96 with a running time of 78 minutes, the OA and OK of the object-oriented method were 93.72% and 0.94 with a running time of only 17s. Additionally, OA and OK of the object-oriented method after a post-classification were improved up to 95.8% and 0.94. Then, the dynamic impervious surfaces of Bombay in the period 1973-2015 were extracted and the urbanization pattern of Bombay was analysed. Results told that both the two SVM classification methods could accomplish the impervious surface extraction, but the object-oriented method should be a better choice. Urbanization of Bombay experienced a fast extending during the past 42 years, implying a dramatically urban sprawl of mega cities in the developing countries along the One Belt and One Road (OBOR).

  2. Enhancing spatial resolution of (18)F positron imaging with the Timepix detector by classification of primary fired pixels using support vector machine.

    PubMed

    Wang, Qian; Liu, Zhen; Ziegler, Sibylle I; Shi, Kuangyu

    2015-07-07

    Position-sensitive positron cameras using silicon pixel detectors have been applied for some preclinical and intraoperative clinical applications. However, the spatial resolution of a positron camera is limited by positron multiple scattering in the detector. An incident positron may fire a number of successive pixels on the imaging plane. It is still impossible to capture the primary fired pixel along a particle trajectory by hardware or to perceive the pixel firing sequence by direct observation. Here, we propose a novel data-driven method to improve the spatial resolution by classifying the primary pixels within the detector using support vector machine. A classification model is constructed by learning the features of positron trajectories based on Monte-Carlo simulations using Geant4. Topological and energy features of pixels fired by (18)F positrons were considered for the training and classification. After applying the classification model on measurements, the primary fired pixels of the positron tracks in the silicon detector were estimated. The method was tested and assessed for [(18)F]FDG imaging of an absorbing edge protocol and a leaf sample. The proposed method improved the spatial resolution from 154.6   ±   4.2 µm (energy weighted centroid approximation) to 132.3   ±   3.5 µm in the absorbing edge measurements. For the positron imaging of a leaf sample, the proposed method achieved lower root mean square error relative to phosphor plate imaging, and higher similarity with the reference optical image. The improvements of the preliminary results support further investigation of the proposed algorithm for the enhancement of positron imaging in clinical and preclinical applications.

  3. Enhancing spatial resolution of 18F positron imaging with the Timepix detector by classification of primary fired pixels using support vector machine

    NASA Astrophysics Data System (ADS)

    Wang, Qian; Liu, Zhen; Ziegler, Sibylle I.; Shi, Kuangyu

    2015-07-01

    Position-sensitive positron cameras using silicon pixel detectors have been applied for some preclinical and intraoperative clinical applications. However, the spatial resolution of a positron camera is limited by positron multiple scattering in the detector. An incident positron may fire a number of successive pixels on the imaging plane. It is still impossible to capture the primary fired pixel along a particle trajectory by hardware or to perceive the pixel firing sequence by direct observation. Here, we propose a novel data-driven method to improve the spatial resolution by classifying the primary pixels within the detector using support vector machine. A classification model is constructed by learning the features of positron trajectories based on Monte-Carlo simulations using Geant4. Topological and energy features of pixels fired by 18F positrons were considered for the training and classification. After applying the classification model on measurements, the primary fired pixels of the positron tracks in the silicon detector were estimated. The method was tested and assessed for [18F]FDG imaging of an absorbing edge protocol and a leaf sample. The proposed method improved the spatial resolution from 154.6   ±   4.2 µm (energy weighted centroid approximation) to 132.3   ±   3.5 µm in the absorbing edge measurements. For the positron imaging of a leaf sample, the proposed method achieved lower root mean square error relative to phosphor plate imaging, and higher similarity with the reference optical image. The improvements of the preliminary results support further investigation of the proposed algorithm for the enhancement of positron imaging in clinical and preclinical applications.

  4. A Support Vector Machine-Based Gender Identification Using Speech Signal

    NASA Astrophysics Data System (ADS)

    Lee, Kye-Hwan; Kang, Sang-Ick; Kim, Deok-Hwan; Chang, Joon-Hyuk

    We propose an effective voice-based gender identification method using a support vector machine (SVM). The SVM is a binary classification algorithm that classifies two groups by finding the voluntary nonlinear boundary in a feature space and is known to yield high classification performance. In the present work, we compare the identification performance of the SVM with that of a Gaussian mixture model (GMM)-based method using the mel frequency cepstral coefficients (MFCC). A novel approach of incorporating a features fusion scheme based on a combination of the MFCC and the fundamental frequency is proposed with the aim of improving the performance of gender identification. Experimental results demonstrate that the gender identification performance using the SVM is significantly better than that of the GMM-based scheme. Moreover, the performance is substantially improved when the proposed features fusion technique is applied.

  5. Information extraction with object based support vector machines and vegetation indices

    NASA Astrophysics Data System (ADS)

    Ustuner, Mustafa; Abdikan, Saygin; Balik Sanli, Fusun

    2016-07-01

    Information extraction through remote sensing data is important for policy and decision makers as extracted information provide base layers for many application of real world. Classification of remotely sensed data is the one of the most common methods of extracting information however it is still a challenging issue because several factors are affecting the accuracy of the classification. Resolution of the imagery, number and homogeneity of land cover classes, purity of training data and characteristic of adopted classifiers are just some of these challenging factors. Object based image classification has some superiority than pixel based classification for high resolution images since it uses geometry and structure information besides spectral information. Vegetation indices are also commonly used for the classification process since it provides additional spectral information for vegetation, forestry and agricultural areas. In this study, the impacts of the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Red Edge Index (NDRE) on the classification accuracy of RapidEye imagery were investigated. Object based Support Vector Machines were implemented for the classification of crop types for the study area located in Aegean region of Turkey. Results demonstrated that the incorporation of NDRE increase the classification accuracy from 79,96% to 86,80% as overall accuracy, however NDVI decrease the classification accuracy from 79,96% to 78,90%. Moreover it is proven than object based classification with RapidEye data give promising results for crop type mapping and analysis.

  6. Tuning support vector machines for minimax and Neyman-Pearson classification.

    PubMed

    Davenport, Mark A; Baraniuk, Richard G; Scott, Clayton D

    2010-10-01

    This paper studies the training of support vector machine (SVM) classifiers with respect to the minimax and Neyman-Pearson criteria. In principle, these criteria can be optimized in a straightforward way using a cost-sensitive SVM. In practice, however, because these criteria require especially accurate error estimation, standard techniques for tuning SVM parameters, such as cross-validation, can lead to poor classifier performance. To address this issue, we first prove that the usual cost-sensitive SVM, here called the 2C-SVM, is equivalent to another formulation called the 2nu-SVM. We then exploit a characterization of the 2nu-SVM parameter space to develop a simple yet powerful approach to error estimation based on smoothing. In an extensive experimental study, we demonstrate that smoothing significantly improves the accuracy of cross-validation error estimates, leading to dramatic performance gains. Furthermore, we propose coordinate descent strategies that offer significant gains in computational efficiency, with little to no loss in performance.

  7. Mangrove classification through the use of object oriented classification and support vector machine of lidar datasets: a case study in Naawan and Manticao, Misamis Oriental, Philippines

    NASA Astrophysics Data System (ADS)

    Jalbuena, Rey L.; Peralta, Rudolph V.; Tamondong, Ayin M.

    2016-10-01

    Mangroves are trees or shrubs that grows at the surface between the land and the sea in tropical and sub-tropical latitudes. Mangroves are essential in supporting various marine life, thus, it is important to preserve and manage these areas. There are many approaches in creating Mangroves maps, one of which is through the use of Light Detection and Ranging (LiDAR). It is a remote sensing technique which uses light pulses to measure distances and to generate three-dimensional point clouds of the Earth's surface. In this study, the topographic LiDAR Data will be used to analyze the geophysical features of the terrain and create a Mangrove map. The dataset that we have were first pre-processed using the LAStools software. It is a software that is used to process LiDAR data sets and create different layers such as DSM, DTM, nDSM, Slope, LiDAR Intensity, LiDAR number of first returns, and CHM. All the aforementioned layers together was used to derive the Mangrove class. Then, an Object-based Image Analysis (OBIA) was performed using eCognition. OBIA analyzes a group of pixels with similar properties called objects, as compared to the traditional pixel-based which only examines a single pixel. Multi-threshold and multiresolution segmentation were used to delineate the different classes and split the image into objects. There are four levels of classification, first is the separation of the Land from the Water. Then the Land class was further dived into Ground and Non-ground objects. Furthermore classification of Nonvegetation, Mangroves, and Other Vegetation was done from the Non-ground objects. Lastly Separation of the mangrove class was done through the Use of field verified training points which was then run into a Support Vector Machine (SVM) classification. Different classes were separated using the different layer feature properties, such as mean, mode, standard deviation, geometrical properties, neighbor-related properties, and textural properties. Accuracy

  8. Integration of multi-array sensors and support vector machines for the detection and classification of organophosphate nerve agents

    NASA Astrophysics Data System (ADS)

    Land, Walker H., Jr.; Sadik, Omowunmi A.; Embrechts, Mark J.; Leibensperger, Dale; Wong, Lut; Wanekaya, Adam; Uematsu, Michiko

    2003-08-01

    Due to the increased threats of chemical and biological weapons of mass destruction (WMD) by international terrorist organizations, a significant effort is underway to develop tools that can be used to detect and effectively combat biochemical warfare. Furthermore, recent events have highlighted awareness that chemical and biological agents (CBAs) may become the preferred, cheap alternative WMD, because these agents can effectively attack large populations while leaving infrastructures intact. Despite the availability of numerous sensing devices, intelligent hybrid sensors that can detect and degrade CBAs are virtually nonexistent. This paper reports the integration of multi-array sensors with Support Vector Machines (SVMs) for the detection of organophosphates nerve agents using parathion and dichlorvos as model stimulants compounds. SVMs were used for the design and evaluation of new and more accurate data extraction, preprocessing and classification. Experimental results for the paradigms developed using Structural Risk Minimization, show a significant increase in classification accuracy when compared to the existing AromaScan baseline system. Specifically, the results of this research has demonstrated that, for the Parathion versus Dichlorvos pair, when compared to the AromaScan baseline system: (1) a 23% improvement in the overall ROC Az index using the S2000 kernel, with similar improvements with the Gaussian and polynomial (of degree 2) kernels, (2) a significant 173% improvement in specificity with the S2000 kernel. This means that the number of false negative errors were reduced by 173%, while making no false positive errors, when compared to the AromaScan base line performance. (3) The Gaussian and polynomial kernels demonstrated similar specificity at 100% sensitivity. All SVM classifiers provided essentially perfect classification performance for the Dichlorvos versus Trichlorfon pair. For the most difficult classification task, the Parathion versus

  9. Development of a Support Vector Machine - Based Image Analysis System for Focal Liver Lesions Classification in Magnetic Resonance Images

    NASA Astrophysics Data System (ADS)

    Gatos, I.; Tsantis, S.; Karamesini, M.; Skouroliakou, A.; Kagadis, G.

    2015-09-01

    Purpose: The design and implementation of a computer-based image analysis system employing the support vector machine (SVM) classifier system for the classification of Focal Liver Lesions (FLLs) on routine non-enhanced, T2-weighted Magnetic Resonance (MR) images. Materials and Methods: The study comprised 92 patients; each one of them has undergone MRI performed on a Magnetom Concerto (Siemens). Typical signs on dynamic contrast-enhanced MRI and biopsies were employed towards a three class categorization of the 92 cases: 40-benign FLLs, 25-Hepatocellular Carcinomas (HCC) within Cirrhotic liver parenchyma and 27-liver metastases from Non-Cirrhotic liver. Prior to FLLs classification an automated lesion segmentation algorithm based on Marcov Random Fields was employed in order to acquire each FLL Region of Interest. 42 texture features derived from the gray-level histogram, co-occurrence and run-length matrices and 12 morphological features were obtained from each lesion. Stepwise multi-linear regression analysis was utilized to avoid feature redundancy leading to a feature subset that fed the multiclass SVM classifier designed for lesion classification. SVM System evaluation was performed by means of leave-one-out method and ROC analysis. Results: Maximum accuracy for all three classes (90.0%) was obtained by means of the Radial Basis Kernel Function and three textural features (Inverse- Different-Moment, Sum-Variance and Long-Run-Emphasis) that describe lesion's contrast, variability and shape complexity. Sensitivity values for the three classes were 92.5%, 81.5% and 96.2% respectively, whereas specificity values were 94.2%, 95.3% and 95.5%. The AUC value achieved for the selected subset was 0.89 with 0.81 - 0.94 confidence interval. Conclusion: The proposed SVM system exhibit promising results that could be utilized as a second opinion tool to the radiologist in order to decrease the time/cost of diagnosis and the need for patients to undergo invasive examination.

  10. Assessing the performance of multiple spectral-spatial features of a hyperspectral image for classification of urban land cover classes using support vector machines and artificial neural network

    NASA Astrophysics Data System (ADS)

    Pullanagari, Reddy; Kereszturi, Gábor; Yule, Ian J.; Ghamisi, Pedram

    2017-04-01

    Accurate and spatially detailed mapping of complex urban environments is essential for land managers. Classifying high spectral and spatial resolution hyperspectral images is a challenging task because of its data abundance and computational complexity. Approaches with a combination of spectral and spatial information in a single classification framework have attracted special attention because of their potential to improve the classification accuracy. We extracted multiple features from spectral and spatial domains of hyperspectral images and evaluated them with two supervised classification algorithms; support vector machines (SVM) and an artificial neural network. The spatial features considered are produced by a gray level co-occurrence matrix and extended multiattribute profiles. All of these features were stacked, and the most informative features were selected using a genetic algorithm-based SVM. After selecting the most informative features, the classification model was integrated with a segmentation map derived using a hidden Markov random field. We tested the proposed method on a real application of a hyperspectral image acquired from AisaFENIX and on widely used hyperspectral images. From the results, it can be concluded that the proposed framework significantly improves the results with different spectral and spatial resolutions over different instrumentation.

  11. Support vector machines

    NASA Technical Reports Server (NTRS)

    Garay, Michael J.; Mazzoni, Dominic; Davies, Roger; Wagstaff, Kiri

    2004-01-01

    Support Vector Machines (SVMs) are a type of supervised learning algorith,, other examples of which are Artificial Neural Networks (ANNs), Decision Trees, and Naive Bayesian Classifiers. Supervised learning algorithms are used to classify objects labled by a 'supervisor' - typically a human 'expert.'.

  12. Incorporation of support vector machines in the LIBS toolbox for sensitive and robust classification amidst unexpected sample and system variability

    PubMed Central

    ChariDingari, Narahara; Barman, Ishan; Myakalwar, Ashwin Kumar; Tewari, Surya P.; Kumar, G. Manoj

    2012-01-01

    Despite the intrinsic elemental analysis capability and lack of sample preparation requirements, laser-induced breakdown spectroscopy (LIBS) has not been extensively used for real world applications, e.g. quality assurance and process monitoring. Specifically, variability in sample, system and experimental parameters in LIBS studies present a substantive hurdle for robust classification, even when standard multivariate chemometric techniques are used for analysis. Considering pharmaceutical sample investigation as an example, we propose the use of support vector machines (SVM) as a non-linear classification method over conventional linear techniques such as soft independent modeling of class analogy (SIMCA) and partial least-squares discriminant analysis (PLS-DA) for discrimination based on LIBS measurements. Using over-the-counter pharmaceutical samples, we demonstrate that application of SVM enables statistically significant improvements in prospective classification accuracy (sensitivity), due to its ability to address variability in LIBS sample ablation and plasma self-absorption behavior. Furthermore, our results reveal that SVM provides nearly 10% improvement in correct allocation rate and a concomitant reduction in misclassification rates of 75% (cf. PLS-DA) and 80% (cf. SIMCA)-when measurements from samples not included in the training set are incorporated in the test data – highlighting its robustness. While further studies on a wider matrix of sample types performed using different LIBS systems is needed to fully characterize the capability of SVM to provide superior predictions, we anticipate that the improved sensitivity and robustness observed here will facilitate application of the proposed LIBS-SVM toolbox for screening drugs and detecting counterfeit samples as well as in related areas of forensic and biological sample analysis. PMID:22292496

  13. Color image segmentation with support vector machines: applications to road signs detection.

    PubMed

    Cyganek, Bogusław

    2008-08-01

    In this paper we propose efficient color segmentation method which is based on the Support Vector Machine classifier operating in a one-class mode. The method has been developed especially for the road signs recognition system, although it can be used in other applications. The main advantage of the proposed method comes from the fact that the segmentation of characteristic colors is performed not in the original but in the higher dimensional feature space. By this a better data encapsulation with a linear hypersphere can be usually achieved. Moreover, the classifier does not try to capture the whole distribution of the input data which is often difficult to achieve. Instead, the characteristic data samples, called support vectors, are selected which allow construction of the tightest hypersphere that encloses majority of the input data. Then classification of a test data simply consists in a measurement of its distance to a centre of the found hypersphere. The experimental results show high accuracy and speed of the proposed method.

  14. Support vector machines-based fault diagnosis for turbo-pump rotor

    NASA Astrophysics Data System (ADS)

    Yuan, Sheng-Fa; Chu, Fu-Lei

    2006-05-01

    Most artificial intelligence methods used in fault diagnosis are based on empirical risk minimisation principle and have poor generalisation when fault samples are few. Support vector machines (SVM) is a new general machine-learning tool based on structural risk minimisation principle that exhibits good generalisation even when fault samples are few. Fault diagnosis based on SVM is discussed. Since basic SVM is originally designed for two-class classification, while most of fault diagnosis problems are multi-class cases, a new multi-class classification of SVM named 'one to others' algorithm is presented to solve the multi-class recognition problems. It is a binary tree classifier composed of several two-class classifiers organised by fault priority, which is simple, and has little repeated training amount, and the rate of training and recognition is expedited. The effectiveness of the method is verified by the application to the fault diagnosis for turbo pump rotor.

  15. Agricultural mapping using Support Vector Machine-Based Endmember Extraction (SVM-BEE)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Archibald, Richard K; Filippi, Anthony M; Bhaduri, Budhendra L

    Extracting endmembers from remotely sensed images of vegetated areas can present difficulties. In this research, we applied a recently developed endmember-extraction algorithm based on Support Vector Machines (SVMs) to the problem of semi-autonomous estimation of vegetation endmembers from a hyperspectral image. This algorithm, referred to as Support Vector Machine-Based Endmember Extraction (SVM-BEE), accurately and rapidly yields a computed representation of hyperspectral data that can accommodate multiple distributions. The number of distributions is identified without prior knowledge, based upon this representation. Prior work established that SVM-BEE is robustly noise-tolerant and can semi-automatically and effectively estimate endmembers; synthetic data and a geologicmore » scene were previously analyzed. Here we compared the efficacies of the SVM-BEE and N-FINDR algorithms in extracting endmembers from a predominantly agricultural scene. SVM-BEE was able to estimate vegetation and other endmembers for all classes in the image, which N-FINDR failed to do. Classifications based on SVM-BEE endmembers were markedly more accurate compared with those based on N-FINDR endmembers.« less

  16. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features

    PubMed Central

    Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-01-01

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one-hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets including Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial based kernel provided the best classifier with correct classification rate (CCR) of 82% and Kappa coefficient of 0.65. Although the both LR and RF methods were less accurate than SVM, they achieved good classification with CCR 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as the fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, these was a close association between image-based features and fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during the cultivation by evaluating diet’s effects on fish skin. PMID:29596375

  17. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features.

    PubMed

    Saberioon, Mohammadmehdi; Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-03-29

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one-hundred and sixty rainbow trout ( Oncorhynchus mykiss ) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets including Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k -Nearest neighbours ( k -NN). The SVM with radial based kernel provided the best classifier with correct classification rate (CCR) of 82% and Kappa coefficient of 0.65. Although the both LR and RF methods were less accurate than SVM, they achieved good classification with CCR 75% and 70% respectively. The k -NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as the fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, these was a close association between image-based features and fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during the cultivation by evaluating diet's effects on fish skin.

  18. Multiclass Reduced-Set Support Vector Machines

    NASA Technical Reports Server (NTRS)

    Tang, Benyang; Mazzoni, Dominic

    2006-01-01

    There are well-established methods for reducing the number of support vectors in a trained binary support vector machine, often with minimal impact on accuracy. We show how reduced-set methods can be applied to multiclass SVMs made up of several binary SVMs, with significantly better results than reducing each binary SVM independently. Our approach is based on Burges' approach that constructs each reduced-set vector as the pre-image of a vector in kernel space, but we extend this by recomputing the SVM weights and bias optimally using the original SVM objective function. This leads to greater accuracy for a binary reduced-set SVM, and also allows vectors to be 'shared' between multiple binary SVMs for greater multiclass accuracy with fewer reduced-set vectors. We also propose computing pre-images using differential evolution, which we have found to be more robust than gradient descent alone. We show experimental results on a variety of problems and find that this new approach is consistently better than previous multiclass reduced-set methods, sometimes with a dramatic difference.

  19. Deep learning of support vector machines with class probability output networks.

    PubMed

    Kim, Sangwook; Yu, Zhibin; Kil, Rhee Man; Lee, Minho

    2015-04-01

    Deep learning methods endeavor to learn features automatically at multiple levels and allow systems to learn complex functions mapping from the input space to the output space for the given data. The ability to learn powerful features automatically is increasingly important as the volume of data and range of applications of machine learning methods continues to grow. This paper proposes a new deep architecture that uses support vector machines (SVMs) with class probability output networks (CPONs) to provide better generalization power for pattern classification problems. As a result, deep features are extracted without additional feature engineering steps, using multiple layers of the SVM classifiers with CPONs. The proposed structure closely approaches the ideal Bayes classifier as the number of layers increases. Using a simulation of classification problems, the effectiveness of the proposed method is demonstrated. Copyright © 2014 Elsevier Ltd. All rights reserved.

  20. Combination of support vector machine, artificial neural network and random forest for improving the classification of convective and stratiform rain using spectral features of SEVIRI data

    NASA Astrophysics Data System (ADS)

    Lazri, Mourad; Ameur, Soltane

    2018-05-01

    A model combining three classifiers, namely Support vector machine, Artificial neural network and Random forest (SAR) is designed for improving the classification of convective and stratiform rain. This model (SAR model) has been trained and then tested on a datasets derived from MSG-SEVIRI (Meteosat Second Generation-Spinning Enhanced Visible and Infrared Imager). Well-classified, mid-classified and misclassified pixels are determined from the combination of three classifiers. Mid-classified and misclassified pixels that are considered unreliable pixels are reclassified by using a novel training of the developed scheme. In this novel training, only the input data corresponding to the pixels in question to are used. This whole process is repeated a second time and applied to mid-classified and misclassified pixels separately. Learning and validation of the developed scheme are realized against co-located data observed by ground radar. The developed scheme outperformed different classifiers used separately and reached 97.40% of overall accuracy of classification.

  1. Segmentation and classification of brain images using firefly and hybrid kernel-based support vector machine

    NASA Astrophysics Data System (ADS)

    Selva Bhuvaneswari, K.; Geetha, P.

    2017-05-01

    Magnetic resonance imaging segmentation refers to a process of assigning labels to set of pixels or multiple regions. It plays a major role in the field of biomedical applications as it is widely used by the radiologists to segment the medical images input into meaningful regions. In recent years, various brain tumour detection techniques are presented in the literature. The entire segmentation process of our proposed work comprises three phases: threshold generation with dynamic modified region growing phase, texture feature generation phase and region merging phase. by dynamically changing two thresholds in the modified region growing approach, the first phase of the given input image can be performed as dynamic modified region growing process, in which the optimisation algorithm, firefly algorithm help to optimise the two thresholds in modified region growing. After obtaining the region growth segmented image using modified region growing, the edges can be detected with edge detection algorithm. In the second phase, the texture feature can be extracted using entropy-based operation from the input image. In region merging phase, the results obtained from the texture feature-generation phase are combined with the results of dynamic modified region growing phase and similar regions are merged using a distance comparison between regions. After identifying the abnormal tissues, the classification can be done by hybrid kernel-based SVM (Support Vector Machine). The performance analysis of the proposed method will be carried by K-cross fold validation method. The proposed method will be implemented in MATLAB with various images.

  2. Ebolavirus Classification Based on Natural Vectors

    PubMed Central

    Zheng, Hui; Yin, Changchuan; Hoang, Tung; He, Rong Lucy; Yang, Jie

    2015-01-01

    According to the WHO, ebolaviruses have resulted in 8818 human deaths in West Africa as of January 2015. To better understand the evolutionary relationship of the ebolaviruses and infer virulence from the relationship, we applied the alignment-free natural vector method to classify the newest ebolaviruses. The dataset includes three new Guinea viruses as well as 99 viruses from Sierra Leone. For the viruses of the family of Filoviridae, both genus label classification and species label classification achieve an accuracy rate of 100%. We represented the relationships among Filoviridae viruses by Unweighted Pair Group Method with Arithmetic Mean (UPGMA) phylogenetic trees and found that the filoviruses can be separated well by three genera. We performed the phylogenetic analysis on the relationship among different species of Ebolavirus by their coding-complete genomes and seven viral protein genes (glycoprotein [GP], nucleoprotein [NP], VP24, VP30, VP35, VP40, and RNA polymerase [L]). The topology of the phylogenetic tree by the viral protein VP24 shows consistency with the variations of virulence of ebolaviruses. The result suggests that VP24 be a pharmaceutical target for treating or preventing ebolaviruses. PMID:25803489

  3. Support-vector-based emergent self-organising approach for emotional understanding

    NASA Astrophysics Data System (ADS)

    Nguwi, Yok-Yen; Cho, Siu-Yeung

    2010-12-01

    This study discusses the computational analysis of general emotion understanding from questionnaires methodology. The questionnaires method approaches the subject by investigating the real experience that accompanied the emotions, whereas the other laboratory approaches are generally associated with exaggerated elements. We adopted a connectionist model called support-vector-based emergent self-organising map (SVESOM) to analyse the emotion profiling from the questionnaires method. The SVESOM first identifies the important variables by giving discriminative features with high ranking. The classifier then performs the classification based on the selected features. Experimental results show that the top rank features are in line with the work of Scherer and Wallbott [(1994), 'Evidence for Universality and Cultural Variation of Differential Emotion Response Patterning', Journal of Personality and Social Psychology, 66, 310-328], which approached the emotions physiologically. While the performance measures show that using the full features for classifications can degrade the performance, the selected features provide superior results in terms of accuracy and generalisation.

  4. Recurrence quantification analysis and support vector machines for golf handicap and low back pain EMG classification.

    PubMed

    Silva, Luís; Vaz, João Rocha; Castro, Maria António; Serranho, Pedro; Cabri, Jan; Pezarat-Correia, Pedro

    2015-08-01

    The quantification of non-linear characteristics of electromyography (EMG) must contain information allowing to discriminate neuromuscular strategies during dynamic skills. There are a lack of studies about muscle coordination under motor constrains during dynamic contractions. In golf, both handicap (Hc) and low back pain (LBP) are the main factors associated with the occurrence of injuries. The aim of this study was to analyze the accuracy of support vector machines SVM on EMG-based classification to discriminate Hc (low and high handicap) and LBP (with and without LPB) in the main phases of golf swing. For this purpose recurrence quantification analysis (RQA) features of the trunk and the lower limb muscles were used to feed a SVM classifier. Recurrence rate (RR) and the ratio between determinism (DET) and RR showed a high discriminant power. The Hc accuracy for the swing, backswing, and downswing were 94.4±2.7%, 97.1±2.3%, and 95.3±2.6%, respectively. For LBP, the accuracy was 96.9±3.8% for the swing, and 99.7±0.4% in the backswing. External oblique (EO), biceps femoris (BF), semitendinosus (ST) and rectus femoris (RF) showed high accuracy depending on the laterality within the phase. RQA features and SVM showed a high muscle discriminant capacity within swing phases by Hc and by LBP. Low back pain golfers showed different neuromuscular coordination strategies when compared with asymptomatic. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Automatic EEG artifact removal: a weighted support vector machine approach with error correction.

    PubMed

    Shao, Shi-Yun; Shen, Kai-Quan; Ong, Chong Jin; Wilder-Smith, Einar P V; Li, Xiao-Ping

    2009-02-01

    An automatic electroencephalogram (EEG) artifact removal method is presented in this paper. Compared to past methods, it has two unique features: 1) a weighted version of support vector machine formulation that handles the inherent unbalanced nature of component classification and 2) the ability to accommodate structural information typically found in component classification. The advantages of the proposed method are demonstrated on real-life EEG recordings with comparisons made to several benchmark methods. Results show that the proposed method is preferable to the other methods in the context of artifact removal by achieving a better tradeoff between removing artifacts and preserving inherent brain activities. Qualitative evaluation of the reconstructed EEG epochs also demonstrates that after artifact removal inherent brain activities are largely preserved.

  6. Unresolved Galaxy Classifier for ESA/Gaia mission: Support Vector Machines approach

    NASA Astrophysics Data System (ADS)

    Bellas-Velidis, Ioannis; Kontizas, Mary; Dapergolas, Anastasios; Livanou, Evdokia; Kontizas, Evangelos; Karampelas, Antonios

    A software package Unresolved Galaxy Classifier (UGC) is being developed for the ground-based pipeline of ESA's Gaia mission. It aims to provide an automated taxonomic classification and specific parameters estimation analyzing Gaia BP/RP instrument low-dispersion spectra of unresolved galaxies. The UGC algorithm is based on a supervised learning technique, the Support Vector Machines (SVM). The software is implemented in Java as two separate modules. An offline learning module provides functions for SVM-models training. Once trained, the set of models can be repeatedly applied to unknown galaxy spectra by the pipeline's application module. A library of galaxy models synthetic spectra, simulated for the BP/RP instrument, is used to train and test the modules. Science tests show a very good classification performance of UGC and relatively good regression performance, except for some of the parameters. Possible approaches to improve the performance are discussed.

  7. Optimized spatio-temporal descriptors for real-time fall detection: comparison of support vector machine and Adaboost-based classification

    NASA Astrophysics Data System (ADS)

    Charfi, Imen; Miteran, Johel; Dubois, Julien; Atri, Mohamed; Tourki, Rached

    2013-10-01

    We propose a supervised approach to detect falls in a home environment using an optimized descriptor adapted to real-time tasks. We introduce a realistic dataset of 222 videos, a new metric allowing evaluation of fall detection performance in a video stream, and an automatically optimized set of spatio-temporal descriptors which fed a supervised classifier. We build the initial spatio-temporal descriptor named STHF using several combinations of transformations of geometrical features (height and width of human body bounding box, the user's trajectory with her/his orientation, projection histograms, and moments of orders 0, 1, and 2). We study the combinations of usual transformations of the features (Fourier transform, wavelet transform, first and second derivatives), and we show experimentally that it is possible to achieve high performance using support vector machine and Adaboost classifiers. Automatic feature selection allows to show that the best tradeoff between classification performance and processing time is obtained by combining the original low-level features with their first derivative. Hence, we evaluate the robustness of the fall detection regarding location changes. We propose a realistic and pragmatic protocol that enables performance to be improved by updating the training in the current location with normal activities records.

  8. Research on Classification of Chinese Text Data Based on SVM

    NASA Astrophysics Data System (ADS)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data Mining has important application value in today’s industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification. KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. Support Vector Machine’ (SVM) classification method is a good classifier in machine learning research. This paper will study the classification effect based on the SVM method in the Chinese text data, and use the support vector machine method in the chinese text to achieve the classify chinese text, and to able to combination of academia and practical application.

  9. Classification of ECG signal with Support Vector Machine Method for Arrhythmia Detection

    NASA Astrophysics Data System (ADS)

    Turnip, Arjon; Ilham Rizqywan, M.; Kusumandari, Dwi E.; Turnip, Mardi; Sihombing, Poltak

    2018-03-01

    An electrocardiogram is a potential bioelectric record that occurs as a result of cardiac activity. QRS Detection with zero crossing calculation is one method that can precisely determine peak R of QRS wave as part of arrhythmia detection. In this paper, two experimental scheme (2 minutes duration with different activities: relaxed and, typing) were conducted. From the two experiments it were obtained: accuracy, sensitivity, and positive predictivity about 100% each for the first experiment and about 79%, 93%, 83% for the second experiment, respectively. Furthermore, the feature set of MIT-BIH arrhythmia using the support vector machine (SVM) method on the WEKA software is evaluated. By combining the available attributes on the WEKA algorithm, the result is constant since all classes of SVM goes to the normal class with average 88.49% accuracy.

  10. HYBRID NEURAL NETWORK AND SUPPORT VECTOR MACHINE METHOD FOR OPTIMIZATION

    NASA Technical Reports Server (NTRS)

    Rai, Man Mohan (Inventor)

    2005-01-01

    System and method for optimization of a design associated with a response function, using a hybrid neural net and support vector machine (NN/SVM) analysis to minimize or maximize an objective function, optionally subject to one or more constraints. As a first example, the NN/SVM analysis is applied iteratively to design of an aerodynamic component, such as an airfoil shape, where the objective function measures deviation from a target pressure distribution on the perimeter of the aerodynamic component. As a second example, the NN/SVM analysis is applied to data classification of a sequence of data points in a multidimensional space. The NN/SVM analysis is also applied to data regression.

  11. Hybrid Neural Network and Support Vector Machine Method for Optimization

    NASA Technical Reports Server (NTRS)

    Rai, Man Mohan (Inventor)

    2007-01-01

    System and method for optimization of a design associated with a response function, using a hybrid neural net and support vector machine (NN/SVM) analysis to minimize or maximize an objective function, optionally subject to one or more constraints. As a first example, the NN/SVM analysis is applied iteratively to design of an aerodynamic component, such as an airfoil shape, where the objective function measures deviation from a target pressure distribution on the perimeter of the aerodynamic component. As a second example, the NN/SVM analysis is applied to data classification of a sequence of data points in a multidimensional space. The NN/SVM analysis is also applied to data regression.

  12. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines

    PubMed Central

    del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J.; Raboso, Mariano

    2015-01-01

    Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation—based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking—to reduce the dimensions of images—and binarization—to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392

  13. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.

    PubMed

    del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano

    2015-06-17

    Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation-based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking-to reduce the dimensions of images-and binarization-to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements.

  14. A Comparative Object-Based Sugarcane Classification from Sentinel-2 Data Using Random Forests and Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Chen, C. R.; Chen, C. F.; Nguyen, S. T.; Lau, K.; Lay, J. G.

    2016-12-01

    Sugarcane mostly grown in tropical and subtropical regions is one of the important commercial crops worldwide, providing significant employment, foreign exchange earnings, and other social and environmental benefits. The sugar industry is a vital component of Belize's economy as it provides employment to 15% of the country's population and 60% of the national agricultural exports. Sugarcane mapping is thus an important task due to official initiatives to provide reliable information on sugarcane-growing areas in respect to improved accuracy in monitoring sugarcane production and yield estimates. Policymakers need such monitoring information to formulate timely plans to ensure sustainably socioeconomic development. Sugarcane monitoring in Belize is traditionally carried out through time-consuming and costly field surveys. Remote sensing is an indispensable tool for crop monitoring on national, regional and global scales. The use of high and low resolution satellites for sugarcane monitoring in Belize is often restricted due to cost limitations and mixed pixel problems because sugarcane fields are small and fragmental. With the launch of Sentinel-2 satellite, it is possible to collectively map small patches of sugarcane fields over a large region as the data are free of charge and have high spectral, spatial, and temporal resolutions. This study aims to develop an object-based classification approach to comparatively map sugarcane fields in Belize from Sentinel-2 data using random forests (RF) and support vector machines (SVM). The data were processed through four main steps: (1) data pre-processing, (2) image segmentation, (3) sugarcane classification, and (4) accuracy assessment. The mapping results compared with the ground reference data indicated satisfactory results. The overall accuracies and Kappa coefficients were generally higher than 80% and 0.7, in both cases. The RF produced slightly more accurate mapping results than SVM. This study demonstrates the

  15. Improving Classification of Cancer and Mining Biomarkers from Gene Expression Profiles Using Hybrid Optimization Algorithms and Fuzzy Support Vector Machine

    PubMed Central

    Moteghaed, Niloofar Yousefi; Maghooli, Keivan; Garshasbi, Masoud

    2018-01-01

    Background: Gene expression data are characteristically high dimensional with a small sample size in contrast to the feature size and variability inherent in biological processes that contribute to difficulties in analysis. Selection of highly discriminative features decreases the computational cost and complexity of the classifier and improves its reliability for prediction of a new class of samples. Methods: The present study used hybrid particle swarm optimization and genetic algorithms for gene selection and a fuzzy support vector machine (SVM) as the classifier. Fuzzy logic is used to infer the importance of each sample in the training phase and decrease the outlier sensitivity of the system to increase the ability to generalize the classifier. A decision-tree algorithm was applied to the most frequent genes to develop a set of rules for each type of cancer. This improved the abilities of the algorithm by finding the best parameters for the classifier during the training phase without the need for trial-and-error by the user. The proposed approach was tested on four benchmark gene expression profiles. Results: Good results have been demonstrated for the proposed algorithm. The classification accuracy for leukemia data is 100%, for colon cancer is 96.67% and for breast cancer is 98%. The results show that the best kernel used in training the SVM classifier is the radial basis function. Conclusions: The experimental results show that the proposed algorithm can decrease the dimensionality of the dataset, determine the most informative gene subset, and improve classification accuracy using the optimal parameters of the classifier with no user interface. PMID:29535919

  16. Cervical cancer survival prediction using hybrid of SMOTE, CART and smooth support vector machine

    NASA Astrophysics Data System (ADS)

    Purnami, S. W.; Khasanah, P. M.; Sumartini, S. H.; Chosuvivatwong, V.; Sriplung, H.

    2016-04-01

    According to the WHO, every two minutes there is one patient who died from cervical cancer. The high mortality rate is due to the lack of awareness of women for early detection. There are several factors that supposedly influence the survival of cervical cancer patients, including age, anemia status, stage, type of treatment, complications and secondary disease. This study wants to classify/predict cervical cancer survival based on those factors. Various classifications methods: classification and regression tree (CART), smooth support vector machine (SSVM), three order spline SSVM (TSSVM) were used. Since the data of cervical cancer are imbalanced, synthetic minority oversampling technique (SMOTE) is used for handling imbalanced dataset. Performances of these methods are evaluated using accuracy, sensitivity and specificity. Results of this study show that balancing data using SMOTE as preprocessing can improve performance of classification. The SMOTE-SSVM method provided better result than SMOTE-TSSVM and SMOTE-CART.

  17. A new method for the prediction of chatter stability lobes based on dynamic cutting force simulation model and support vector machine

    NASA Astrophysics Data System (ADS)

    Peng, Chong; Wang, Lun; Liao, T. Warren

    2015-10-01

    Currently, chatter has become the critical factor in hindering machining quality and productivity in machining processes. To avoid cutting chatter, a new method based on dynamic cutting force simulation model and support vector machine (SVM) is presented for the prediction of chatter stability lobes. The cutting force is selected as the monitoring signal, and the wavelet energy entropy theory is used to extract the feature vectors. A support vector machine is constructed using the MATLAB LIBSVM toolbox for pattern classification based on the feature vectors derived from the experimental cutting data. Then combining with the dynamic cutting force simulation model, the stability lobes diagram (SLD) can be estimated. Finally, the predicted results are compared with existing methods such as zero-order analytical (ZOA) and semi-discretization (SD) method as well as actual cutting experimental results to confirm the validity of this new method.

  18. Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment

    NASA Astrophysics Data System (ADS)

    Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty

    2017-12-01

    Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can solve the problem of classification and linear regression or nonlinear kernel which can be a learning algorithm for the ability of classification and regression. However, SVM also has a weakness that is difficult to determine the optimal parameter value. SVM calculates the best linear separator on the input feature space according to the training data. To classify data which are non-linearly separable, SVM uses kernel tricks to transform the data into a linearly separable data on a higher dimension feature space. The kernel trick using various kinds of kernel functions, such as : linear kernel, polynomial, radial base function (RBF) and sigmoid. Each function has parameters which affect the accuracy of SVM classification. To solve the problem genetic algorithms are proposed to be applied as the optimal parameter value search algorithm thus increasing the best classification accuracy on SVM. Data taken from UCI repository of machine learning database: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy. Genetic algorithms has been shown to be effective in systematically finding optimal kernel parameters for SVM, instead of randomly selected kernel parameters. The best accuracy for data has been upgraded from kernel Linear: 85.12%, polynomial: 81.76%, RBF: 77.22% Sigmoid: 78.70%. However, for bigger data sizes, this method is not practical because it takes a lot of time.

  19. Pharmaceutical Raw Material Identification Using Miniature Near-Infrared (MicroNIR) Spectroscopy and Supervised Pattern Recognition Using Support Vector Machine

    PubMed Central

    Hsiung, Chang; Pederson, Christopher G.; Zou, Peng; Smith, Valton; von Gunten, Marc; O’Brien, Nada A.

    2016-01-01

    Near-infrared spectroscopy as a rapid and non-destructive analytical technique offers great advantages for pharmaceutical raw material identification (RMID) to fulfill the quality and safety requirements in pharmaceutical industry. In this study, we demonstrated the use of portable miniature near-infrared (MicroNIR) spectrometers for NIR-based pharmaceutical RMID and solved two challenges in this area, model transferability and large-scale classification, with the aid of support vector machine (SVM) modeling. We used a set of 19 pharmaceutical compounds including various active pharmaceutical ingredients (APIs) and excipients and six MicroNIR spectrometers to test model transferability. For the test of large-scale classification, we used another set of 253 pharmaceutical compounds comprised of both chemically and physically different APIs and excipients. We compared SVM with conventional chemometric modeling techniques, including soft independent modeling of class analogy, partial least squares discriminant analysis, linear discriminant analysis, and quadratic discriminant analysis. Support vector machine modeling using a linear kernel, especially when combined with a hierarchical scheme, exhibited excellent performance in both model transferability and large-scale classification. Hence, ultra-compact, portable and robust MicroNIR spectrometers coupled with SVM modeling can make on-site and in situ pharmaceutical RMID for large-volume applications highly achievable. PMID:27029624

  20. Retinal Microaneurysms Detection Using Gradient Vector Analysis and Class Imbalance Classification.

    PubMed

    Dai, Baisheng; Wu, Xiangqian; Bu, Wei

    2016-01-01

    Retinal microaneurysms (MAs) are the earliest clinically observable lesions of diabetic retinopathy. Reliable automated MAs detection is thus critical for early diagnosis of diabetic retinopathy. This paper proposes a novel method for the automated MAs detection in color fundus images based on gradient vector analysis and class imbalance classification, which is composed of two stages, i.e. candidate MAs extraction and classification. In the first stage, a candidate MAs extraction algorithm is devised by analyzing the gradient field of the image, in which a multi-scale log condition number map is computed based on the gradient vectors for vessel removal, and then the candidate MAs are localized according to the second order directional derivatives computed in different directions. Due to the complexity of fundus image, besides a small number of true MAs, there are also a large amount of non-MAs in the extracted candidates. Classifying the true MAs and the non-MAs is an extremely class imbalanced classification problem. Therefore, in the second stage, several types of features including geometry, contrast, intensity, edge, texture, region descriptors and other features are extracted from the candidate MAs and a class imbalance classifier, i.e., RUSBoost, is trained for the MAs classification. With the Retinopathy Online Challenge (ROC) criterion, the proposed method achieves an average sensitivity of 0.433 at 1/8, 1/4, 1/2, 1, 2, 4 and 8 false positives per image on the ROC database, which is comparable with the state-of-the-art approaches, and 0.321 on the DiaRetDB1 V2.1 database, which outperforms the state-of-the-art approaches.

  1. Balanced VS Imbalanced Training Data: Classifying Rapideye Data with Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Ustuner, M.; Sanli, F. B.; Abdikan, S.

    2016-06-01

    The accuracy of supervised image classification is highly dependent upon several factors such as the design of training set (sample selection, composition, purity and size), resolution of input imagery and landscape heterogeneity. The design of training set is still a challenging issue since the sensitivity of classifier algorithm at learning stage is different for the same dataset. In this paper, the classification of RapidEye imagery with balanced and imbalanced training data for mapping the crop types was addressed. Classification with imbalanced training data may result in low accuracy in some scenarios. Support Vector Machines (SVM), Maximum Likelihood (ML) and Artificial Neural Network (ANN) classifications were implemented here to classify the data. For evaluating the influence of the balanced and imbalanced training data on image classification algorithms, three different training datasets were created. Two different balanced datasets which have 70 and 100 pixels for each class of interest and one imbalanced dataset in which each class has different number of pixels were used in classification stage. Results demonstrate that ML and NN classifications are affected by imbalanced training data in resulting a reduction in accuracy (from 90.94% to 85.94% for ML and from 91.56% to 88.44% for NN) while SVM is not affected significantly (from 94.38% to 94.69%) and slightly improved. Our results highlighted that SVM is proven to be a very robust, consistent and effective classifier as it can perform very well under balanced and imbalanced training data situations. Furthermore, the training stage should be precisely and carefully designed for the need of adopted classifier.

  2. Parameters selection in gene selection using Gaussian kernel support vector machines by genetic algorithm.

    PubMed

    Mao, Yong; Zhou, Xiao-Bo; Pi, Dao-Ying; Sun, You-Xian; Wong, Stephen T C

    2005-10-01

    In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables and small number of samples as well as its non-linearity. It is difficult to get satisfying results by using conventional linear statistical methods. Recursive feature elimination based on support vector machine (SVM RFE) is an effective algorithm for gene selection and cancer classification, which are integrated into a consistent framework. In this paper, we propose a new method to select parameters of the aforementioned algorithm implemented with Gaussian kernel SVMs as better alternatives to the common practice of selecting the apparently best parameters by using a genetic algorithm to search for a couple of optimal parameter. Fast implementation issues for this method are also discussed for pragmatic reasons. The proposed method was tested on two representative hereditary breast cancer and acute leukaemia datasets. The experimental results indicate that the proposed method performs well in selecting genes and achieves high classification accuracies with these genes.

  3. Differentiation of Glioblastoma and Lymphoma Using Feature Extraction and Support Vector Machine.

    PubMed

    Yang, Zhangjing; Feng, Piaopiao; Wen, Tian; Wan, Minghua; Hong, Xunning

    2017-01-01

    Differentiation of glioblastoma multiformes (GBMs) and lymphomas using multi-sequence magnetic resonance imaging (MRI) is an important task that is valuable for treatment planning. However, this task is a challenge because GBMs and lymphomas may have a similar appearance in MRI images. This similarity may lead to misclassification and could affect the treatment results. In this paper, we propose a semi-automatic method based on multi-sequence MRI to differentiate these two types of brain tumors. Our method consists of three steps: 1) the key slice is selected from 3D MRIs and region of interests (ROIs) are drawn around the tumor region; 2) different features are extracted based on prior clinical knowledge and validated using a t-test; and 3) features that are helpful for classification are used to build an original feature vector and a support vector machine is applied to perform classification. In total, 58 GBM cases and 37 lymphoma cases are used to validate our method. A leave-one-out crossvalidation strategy is adopted in our experiments. The global accuracy of our method was determined as 96.84%, which indicates that our method is effective for the differentiation of GBM and lymphoma and can be applied in clinical diagnosis. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  4. Automatic event detection in low SNR microseismic signals based on multi-scale permutation entropy and a support vector machine

    NASA Astrophysics Data System (ADS)

    Jia, Rui-Sheng; Sun, Hong-Mei; Peng, Yan-Jun; Liang, Yong-Quan; Lu, Xin-Ming

    2017-07-01

    Microseismic monitoring is an effective means for providing early warning of rock or coal dynamical disasters, and its first step is microseismic event detection, although low SNR microseismic signals often cannot effectively be detected by routine methods. To solve this problem, this paper presents permutation entropy and a support vector machine to detect low SNR microseismic events. First, an extraction method of signal features based on multi-scale permutation entropy is proposed by studying the influence of the scale factor on the signal permutation entropy. Second, the detection model of low SNR microseismic events based on the least squares support vector machine is built by performing a multi-scale permutation entropy calculation for the collected vibration signals, constructing a feature vector set of signals. Finally, a comparative analysis of the microseismic events and noise signals in the experiment proves that the different characteristics of the two can be fully expressed by using multi-scale permutation entropy. The detection model of microseismic events combined with the support vector machine, which has the features of high classification accuracy and fast real-time algorithms, can meet the requirements of online, real-time extractions of microseismic events.

  5. Prediction of chemical biodegradability using support vector classifier optimized with differential evolution.

    PubMed

    Cao, Qi; Leung, K M

    2014-09-22

    Reliable computer models for the prediction of chemical biodegradability from molecular descriptors and fingerprints are very important for making health and environmental decisions. Coupling of the differential evolution (DE) algorithm with the support vector classifier (SVC) in order to optimize the main parameters of the classifier resulted in an improved classifier called the DE-SVC, which is introduced in this paper for use in chemical biodegradability studies. The DE-SVC was applied to predict the biodegradation of chemicals on the basis of extensive sample data sets and known structural features of molecules. Our optimization experiments showed that DE can efficiently find the proper parameters of the SVC. The resulting classifier possesses strong robustness and reliability compared with grid search, genetic algorithm, and particle swarm optimization methods. The classification experiments conducted here showed that the DE-SVC exhibits better classification performance than models previously used for such studies. It is a more effective and efficient prediction model for chemical biodegradability.

  6. An Automated and Intelligent Medical Decision Support System for Brain MRI Scans Classification.

    PubMed

    Siddiqui, Muhammad Faisal; Reza, Ahmed Wasif; Kanesan, Jeevan

    2015-01-01

    A wide interest has been observed in the medical health care applications that interpret neuroimaging scans by machine learning systems. This research proposes an intelligent, automatic, accurate, and robust classification technique to classify the human brain magnetic resonance image (MRI) as normal or abnormal, to cater down the human error during identifying the diseases in brain MRIs. In this study, fast discrete wavelet transform (DWT), principal component analysis (PCA), and least squares support vector machine (LS-SVM) are used as basic components. Firstly, fast DWT is employed to extract the salient features of brain MRI, followed by PCA, which reduces the dimensions of the features. These reduced feature vectors also shrink the memory storage consumption by 99.5%. At last, an advanced classification technique based on LS-SVM is applied to brain MR image classification using reduced features. For improving the efficiency, LS-SVM is used with non-linear radial basis function (RBF) kernel. The proposed algorithm intelligently determines the optimized values of the hyper-parameters of the RBF kernel and also applied k-fold stratified cross validation to enhance the generalization of the system. The method was tested by 340 patients' benchmark datasets of T1-weighted and T2-weighted scans. From the analysis of experimental results and performance comparisons, it is observed that the proposed medical decision support system outperformed all other modern classifiers and achieves 100% accuracy rate (specificity/sensitivity 100%/100%). Furthermore, in terms of computation time, the proposed technique is significantly faster than the recent well-known methods, and it improves the efficiency by 71%, 3%, and 4% on feature extraction stage, feature reduction stage, and classification stage, respectively. These results indicate that the proposed well-trained machine learning system has the potential to make accurate predictions about brain abnormalities from the

  7. Extended robust support vector machine based on financial risk minimization.

    PubMed

    Takeda, Akiko; Fujiwara, Shuhei; Kanamori, Takafumi

    2014-11-01

    Financial risk measures have been used recently in machine learning. For example, ν-support vector machine ν-SVM) minimizes the conditional value at risk (CVaR) of margin distribution. The measure is popular in finance because of the subadditivity property, but it is very sensitive to a few outliers in the tail of the distribution. We propose a new classification method, extended robust SVM (ER-SVM), which minimizes an intermediate risk measure between the CVaR and value at risk (VaR) by expecting that the resulting model becomes less sensitive than ν-SVM to outliers. We can regard ER-SVM as an extension of robust SVM, which uses a truncated hinge loss. Numerical experiments imply the ER-SVM's possibility of achieving a better prediction performance with proper parameter setting.

  8. Semisupervised Support Vector Machines With Tangent Space Intrinsic Manifold Regularization.

    PubMed

    Sun, Shiliang; Xie, Xijiong

    2016-09-01

    Semisupervised learning has been an active research topic in machine learning and data mining. One main reason is that labeling examples is expensive and time-consuming, while there are large numbers of unlabeled examples available in many practical problems. So far, Laplacian regularization has been widely used in semisupervised learning. In this paper, we propose a new regularization method called tangent space intrinsic manifold regularization. It is intrinsic to data manifold and favors linear functions on the manifold. Fundamental elements involved in the formulation of the regularization are local tangent space representations, which are estimated by local principal component analysis, and the connections that relate adjacent tangent spaces. Simultaneously, we explore its application to semisupervised classification and propose two new learning algorithms called tangent space intrinsic manifold regularized support vector machines (TiSVMs) and tangent space intrinsic manifold regularized twin SVMs (TiTSVMs). They effectively integrate the tangent space intrinsic manifold regularization consideration. The optimization of TiSVMs can be solved by a standard quadratic programming, while the optimization of TiTSVMs can be solved by a pair of standard quadratic programmings. The experimental results of semisupervised classification problems show the effectiveness of the proposed semisupervised learning algorithms.

  9. N-gram support vector machines for scalable procedure and diagnosis classification, with applications to clinical free text data from the intensive care unit.

    PubMed

    Marafino, Ben J; Davies, Jason M; Bardach, Naomi S; Dean, Mitzi L; Dudley, R Adams

    2014-01-01

    Existing risk adjustment models for intensive care unit (ICU) outcomes rely on manual abstraction of patient-level predictors from medical charts. Developing an automated method for abstracting these data from free text might reduce cost and data collection times. To develop a support vector machine (SVM) classifier capable of identifying a range of procedures and diagnoses in ICU clinical notes for use in risk adjustment. We selected notes from 2001-2008 for 4191 neonatal ICU (NICU) and 2198 adult ICU patients from the MIMIC-II database from the Beth Israel Deaconess Medical Center. Using these notes, we developed an implementation of the SVM classifier to identify procedures (mechanical ventilation and phototherapy in NICU notes) and diagnoses (jaundice in NICU and intracranial hemorrhage (ICH) in adult ICU). On the jaundice classification task, we also compared classifier performance using n-gram features to unigrams with application of a negation algorithm (NegEx). Our classifier accurately identified mechanical ventilation (accuracy=0.982, F1=0.954) and phototherapy use (accuracy=0.940, F1=0.912), as well as jaundice (accuracy=0.898, F1=0.884) and ICH diagnoses (accuracy=0.938, F1=0.943). Including bigram features improved performance on the jaundice (accuracy=0.898 vs 0.865) and ICH (0.938 vs 0.927) tasks, and outperformed NegEx-derived unigram features (accuracy=0.898 vs 0.863) on the jaundice task. Overall, a classifier using n-gram support vectors displayed excellent performance characteristics. The classifier generalizes to diverse patient populations, diagnoses, and procedures. SVM-based classifiers can accurately identify procedure status and diagnoses among ICU patients, and including n-gram features improves performance, compared to existing methods. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  10. T-ray relevant frequencies for osteosarcoma classification

    NASA Astrophysics Data System (ADS)

    Withayachumnankul, W.; Ferguson, B.; Rainsford, T.; Findlay, D.; Mickan, S. P.; Abbott, D.

    2006-01-01

    We investigate the classification of the T-ray response of normal human bone cells and human osteosarcoma cells, grown in culture. Given the magnitude and phase responses within a reliable spectral range as features for input vectors, a trained support vector machine can correctly classify the two cell types to some extent. Performance of the support vector machine is deteriorated by the curse of dimensionality, resulting from the comparatively large number of features in the input vectors. Feature subset selection methods are used to select only an optimal number of relevant features for inputs. As a result, an improvement in generalization performance is attainable, and the selected frequencies can be used for further describing different mechanisms of the cells, responding to T-rays. We demonstrate a consistent classification accuracy of 89.6%, while the only one fifth of the original features are retained in the data set.

  11. A support vector machine classifier reduces interscanner variation in the HRCT classification of regional disease pattern in diffuse lung disease: Comparison to a Bayesian classifier

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug

    2013-05-15

    Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 Multiplication-Sign 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of local patterns of lungs-normal lung and five of regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessedmore » using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and a SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse. The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same

  12. Ensemble Feature Learning of Genomic Data Using Support Vector Machine

    PubMed Central

    Anaissi, Ali; Goyal, Madhu; Catchpoole, Daniel R.; Braytee, Ali; Kennedy, Paul J.

    2016-01-01

    The identification of a subset of genes having the ability to capture the necessary information to distinguish classes of patients is crucial in bioinformatics applications. Ensemble and bagging methods have been shown to work effectively in the process of gene selection and classification. Testament to that is random forest which combines random decision trees with bagging to improve overall feature selection and classification accuracy. Surprisingly, the adoption of these methods in support vector machines has only recently received attention but mostly on classification not gene selection. This paper introduces an ensemble SVM-Recursive Feature Elimination (ESVM-RFE) for gene selection that follows the concepts of ensemble and bagging used in random forest but adopts the backward elimination strategy which is the rationale of RFE algorithm. The rationale behind this is, building ensemble SVM models using randomly drawn bootstrap samples from the training set, will produce different feature rankings which will be subsequently aggregated as one feature ranking. As a result, the decision for elimination of features is based upon the ranking of multiple SVM models instead of choosing one particular model. Moreover, this approach will address the problem of imbalanced datasets by constructing a nearly balanced bootstrap sample. Our experiments show that ESVM-RFE for gene selection substantially increased the classification performance on five microarray datasets compared to state-of-the-art methods. Experiments on the childhood leukaemia dataset show that an average 9% better accuracy is achieved by ESVM-RFE over SVM-RFE, and 5% over random forest based approach. The selected genes by the ESVM-RFE algorithm were further explored with Singular Value Decomposition (SVD) which reveals significant clusters with the selected data. PMID:27304923

  13. Using support vector machines to identify literacy skills: Evidence from eye movements.

    PubMed

    Lou, Ya; Liu, Yanping; Kaakinen, Johanna K; Li, Xingshan

    2017-06-01

    Is inferring readers' literacy skills possible by analyzing their eye movements during text reading? This study used Support Vector Machines (SVM) to analyze eye movement data from 61 undergraduate students who read a multiple-paragraph, multiple-topic expository text. Forward fixation time, first-pass rereading time, second-pass fixation time, and regression path reading time on different regions of the text were provided as features. The SVM classification algorithm assisted in distinguishing high-literacy-skilled readers from low-literacy-skilled readers with 80.3 % accuracy. Results demonstrate the effectiveness of combining eye tracking and machine learning techniques to detect readers with low literacy skills, and suggest that such approaches can be potentially used in predicting other cognitive abilities.

  14. Robust support vector regression networks for function approximation with outliers.

    PubMed

    Chuang, Chen-Chia; Su, Shun-Feng; Jeng, Jin-Tsong; Hsiao, Chih-Ching

    2002-01-01

    Support vector regression (SVR) employs the support vector machine (SVM) to tackle problems of function approximation and regression estimation. SVR has been shown to have good robust properties against noise. When the parameters used in SVR are improperly selected, overfitting phenomena may still occur. However, the selection of various parameters is not straightforward. Besides, in SVR, outliers may also possibly be taken as support vectors. Such an inclusion of outliers in support vectors may lead to seriously overfitting phenomena. In this paper, a novel regression approach, termed as the robust support vector regression (RSVR) network, is proposed to enhance the robust capability of SVR. In the approach, traditional robust learning approaches are employed to improve the learning performance for any selected parameters. From the simulation results, our RSVR can always improve the performance of the learned systems for all cases. Besides, it can be found that even the training lasted for a long period, the testing errors would not go up. In other words, the overfitting phenomenon is indeed suppressed.

  15. Breast cancer risk assessment and diagnosis model using fuzzy support vector machine based expert system

    NASA Astrophysics Data System (ADS)

    Dheeba, J.; Jaya, T.; Singh, N. Albert

    2017-09-01

    Classification of cancerous masses is a challenging task in many computerised detection systems. Cancerous masses are difficult to detect because these masses are obscured and subtle in mammograms. This paper investigates an intelligent classifier - fuzzy support vector machine (FSVM) applied to classify the tissues containing masses on mammograms for breast cancer diagnosis. The algorithm utilises texture features extracted using Laws texture energy measures and a FSVM to classify the suspicious masses. The new FSVM treats every feature as both normal and abnormal samples, but with different membership. By this way, the new FSVM have more generalisation ability to classify the masses in mammograms. The classifier analysed 219 clinical mammograms collected from breast cancer screening laboratory. The tests made on the real clinical mammograms shows that the proposed detection system has better discriminating power than the conventional support vector machine. With the best combination of FSVM and Laws texture features, the area under the Receiver operating characteristic curve reached .95, which corresponds to a sensitivity of 93.27% with a specificity of 87.17%. The results suggest that detecting masses using FSVM contribute to computer-aided detection of breast cancer and as a decision support system for radiologists.

  16. I-CAN: the classification and prediction of support needs.

    PubMed

    Arnold, Samuel R C; Riches, Vivienne C; Stancliffe, Roger J

    2014-03-01

    Since 1992, the diagnosis and classification of intellectual disability has been dependent upon three constructs: intelligence, adaptive behaviour and support needs (Luckasson et al. 1992. Mental Retardation: Definition, Classification and Systems of Support. American Association on Intellectual and Developmental Disability, Washington, DC). While the methods and instruments to measure intelligence and adaptive behaviour are well established and generally accepted, the measurement and classification of support needs is still in its infancy. This article explores the measurement and classification of support needs. A study is presented comparing scores on the ICF (WHO, 2001) based I-CAN v4.2 support needs assessment and planning tool with expert clinical judgment using a proposed classification of support needs. A logical classification algorithm was developed and validated on a separate sample. Good internal consistency (range 0.73-0.91, N = 186) and criterion validity (κ = 0.94, n = 49) were found. Further advances in our understanding and measurement of support needs could change the way we assess, describe and classify disability. © 2013 John Wiley & Sons Ltd.

  17. Detection of Dendritic Spines Using Wavelet Packet Entropy and Fuzzy Support Vector Machine.

    PubMed

    Wang, Shuihua; Li, Yang; Shao, Ying; Cattani, Carlo; Zhang, Yudong; Du, Sidan

    2017-01-01

    The morphology of dendritic spines is highly correlated with the neuron function. Therefore, it is of positive influence for the research of the dendritic spines. However, it is tried to manually label the spine types for statistical analysis. In this work, we proposed an approach based on the combination of wavelet contour analysis for the backbone detection, wavelet packet entropy, and fuzzy support vector machine for the spine classification. The experiments show that this approach is promising. The average detection accuracy of "MushRoom" achieves 97.3%, "Stubby" achieves 94.6%, and "Thin" achieves 97.2%. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  18. Transportation Modes Classification Using Sensors on Smartphones.

    PubMed

    Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu

    2016-08-19

    This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user's transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes.

  19. Transportation Modes Classification Using Sensors on Smartphones

    PubMed Central

    Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu

    2016-01-01

    This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user’s transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes. PMID:27548182

  20. Towards human behavior recognition based on spatio temporal features and support vector machines

    NASA Astrophysics Data System (ADS)

    Ghabri, Sawsen; Ouarda, Wael; Alimi, Adel M.

    2017-03-01

    Security and surveillance are vital issues in today's world. The recent acts of terrorism have highlighted the urgent need for efficient surveillance. There is indeed a need for an automated system for video surveillance which can detect identity and activity of person. In this article, we propose a new paradigm to recognize an aggressive human behavior such as boxing action. Our proposed system for human activity detection includes the use of a fusion between Spatio Temporal Interest Point (STIP) and Histogram of Oriented Gradient (HoG) features. The novel feature called Spatio Temporal Histogram Oriented Gradient (STHOG). To evaluate the robustness of our proposed paradigm with a local application of HoG technique on STIP points, we made experiments on KTH human action dataset based on Multi Class Support Vector Machines classification. The proposed scheme outperforms basic descriptors like HoG and STIP to achieve 82.26% us an accuracy value of classification rate.

  1. Variable Selection for Support Vector Machines in Moderately High Dimensions

    PubMed Central

    Zhang, Xiang; Wu, Yichao; Wang, Lan; Li, Runze

    2015-01-01

    Summary The support vector machine (SVM) is a powerful binary classification tool with high accuracy and great flexibility. It has achieved great success, but its performance can be seriously impaired if many redundant covariates are included. Some efforts have been devoted to studying variable selection for SVMs, but asymptotic properties, such as variable selection consistency, are largely unknown when the number of predictors diverges to infinity. In this work, we establish a unified theory for a general class of nonconvex penalized SVMs. We first prove that in ultra-high dimensions, there exists one local minimizer to the objective function of nonconvex penalized SVMs possessing the desired oracle property. We further address the problem of nonunique local minimizers by showing that the local linear approximation algorithm is guaranteed to converge to the oracle estimator even in the ultra-high dimensional setting if an appropriate initial estimator is available. This condition on initial estimator is verified to be automatically valid as long as the dimensions are moderately high. Numerical examples provide supportive evidence. PMID:26778916

  2. Identification of handwriting by using the genetic algorithm (GA) and support vector machine (SVM)

    NASA Astrophysics Data System (ADS)

    Zhang, Qigui; Deng, Kai

    2016-12-01

    As portable digital camera and a camera phone comes more and more popular, and equally pressing is meeting the requirements of people to shoot at any time, to identify and storage handwritten character. In this paper, genetic algorithm(GA) and support vector machine(SVM)are used for identification of handwriting. Compare with parameters-optimized method, this technique overcomes two defects: first, it's easy to trap in the local optimum; second, finding the best parameters in the larger range will affects the efficiency of classification and prediction. As the experimental results suggest, GA-SVM has a higher recognition rate.

  3. Mining protein function from text using term-based support vector machines

    PubMed Central

    Rice, Simon B; Nenadic, Goran; Stapley, Benjamin J

    2005-01-01

    Background Text mining has spurred huge interest in the domain of biology. The goal of the BioCreAtIvE exercise was to evaluate the performance of current text mining systems. We participated in Task 2, which addressed assigning Gene Ontology terms to human proteins and selecting relevant evidence from full-text documents. We approached it as a modified form of the document classification task. We used a supervised machine-learning approach (based on support vector machines) to assign protein function and select passages that support the assignments. As classification features, we used a protein's co-occurring terms that were automatically extracted from documents. Results The results evaluated by curators were modest, and quite variable for different problems: in many cases we have relatively good assignment of GO terms to proteins, but the selected supporting text was typically non-relevant (precision spanning from 3% to 50%). The method appears to work best when a substantial set of relevant documents is obtained, while it works poorly on single documents and/or short passages. The initial results suggest that our approach can also mine annotations from text even when an explicit statement relating a protein to a GO term is absent. Conclusion A machine learning approach to mining protein function predictions from text can yield good performance only if sufficient training data is available, and significant amount of supporting data is used for prediction. The most promising results are for combined document retrieval and GO term assignment, which calls for the integration of methods developed in BioCreAtIvE Task 1 and Task 2. PMID:15960835

  4. Vector-model-supported approach in prostate plan optimization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Eva Sau Fan; Department of Health Technology and Informatics, The Hong Kong Polytechnic University; Wu, Vincent Wing Cheung

    Lengthy time consumed in traditional manual plan optimization can limit the use of step-and-shoot intensity-modulated radiotherapy/volumetric-modulated radiotherapy (S&S IMRT/VMAT). A vector model base, retrieving similar radiotherapy cases, was developed with respect to the structural and physiologic features extracted from the Digital Imaging and Communications in Medicine (DICOM) files. Planning parameters were retrieved from the selected similar reference case and applied to the test case to bypass the gradual adjustment of planning parameters. Therefore, the planning time spent on the traditional trial-and-error manual optimization approach in the beginning of optimization could be reduced. Each S&S IMRT/VMAT prostate reference database comprised 100more » previously treated cases. Prostate cases were replanned with both traditional optimization and vector-model-supported optimization based on the oncologists' clinical dose prescriptions. A total of 360 plans, which consisted of 30 cases of S&S IMRT, 30 cases of 1-arc VMAT, and 30 cases of 2-arc VMAT plans including first optimization and final optimization with/without vector-model-supported optimization, were compared using the 2-sided t-test and paired Wilcoxon signed rank test, with a significance level of 0.05 and a false discovery rate of less than 0.05. For S&S IMRT, 1-arc VMAT, and 2-arc VMAT prostate plans, there was a significant reduction in the planning time and iteration with vector-model-supported optimization by almost 50%. When the first optimization plans were compared, 2-arc VMAT prostate plans had better plan quality than 1-arc VMAT plans. The volume receiving 35 Gy in the femoral head for 2-arc VMAT plans was reduced with the vector-model-supported optimization compared with the traditional manual optimization approach. Otherwise, the quality of plans from both approaches was comparable. Vector-model-supported optimization was shown to offer much shortened planning time and iteration

  5. The maximum vector-angular margin classifier and its fast training on large datasets using a core vector machine.

    PubMed

    Hu, Wenjun; Chung, Fu-Lai; Wang, Shitong

    2012-03-01

    Although pattern classification has been extensively studied in the past decades, how to effectively solve the corresponding training on large datasets is a problem that still requires particular attention. Many kernelized classification methods, such as SVM and SVDD, can be formulated as the corresponding quadratic programming (QP) problems, but computing the associated kernel matrices requires O(n2)(or even up to O(n3)) computational complexity, where n is the size of the training patterns, which heavily limits the applicability of these methods for large datasets. In this paper, a new classification method called the maximum vector-angular margin classifier (MAMC) is first proposed based on the vector-angular margin to find an optimal vector c in the pattern feature space, and all the testing patterns can be classified in terms of the maximum vector-angular margin ρ, between the vector c and all the training data points. Accordingly, it is proved that the kernelized MAMC can be equivalently formulated as the kernelized Minimum Enclosing Ball (MEB), which leads to a distinctive merit of MAMC, i.e., it has the flexibility of controlling the sum of support vectors like v-SVC and may be extended to a maximum vector-angular margin core vector machine (MAMCVM) by connecting the core vector machine (CVM) method with MAMC such that the corresponding fast training on large datasets can be effectively achieved. Experimental results on artificial and real datasets are provided to validate the power of the proposed methods. Copyright © 2011 Elsevier Ltd. All rights reserved.

  6. Fuzzy Nonlinear Proximal Support Vector Machine for Land Extraction Based on Remote Sensing Image

    PubMed Central

    Zhong, Xiaomei; Li, Jianping; Dou, Huacheng; Deng, Shijun; Wang, Guofei; Jiang, Yu; Wang, Yongjie; Zhou, Zebing; Wang, Li; Yan, Fei

    2013-01-01

    Currently, remote sensing technologies were widely employed in the dynamic monitoring of the land. This paper presented an algorithm named fuzzy nonlinear proximal support vector machine (FNPSVM) by basing on ETM+ remote sensing image. This algorithm is applied to extract various types of lands of the city Da’an in northern China. Two multi-category strategies, namely “one-against-one” and “one-against-rest” for this algorithm were described in detail and then compared. A fuzzy membership function was presented to reduce the effects of noises or outliers on the data samples. The approaches of feature extraction, feature selection, and several key parameter settings were also given. Numerous experiments were carried out to evaluate its performances including various accuracies (overall accuracies and kappa coefficient), stability, training speed, and classification speed. The FNPSVM classifier was compared to the other three classifiers including the maximum likelihood classifier (MLC), back propagation neural network (BPN), and the proximal support vector machine (PSVM) under different training conditions. The impacts of the selection of training samples, testing samples and features on the four classifiers were also evaluated in these experiments. PMID:23936016

  7. Support Vector Machine-Based Endmember Extraction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Filippi, Anthony M; Archibald, Richard K

    Introduced in this paper is the utilization of Support Vector Machines (SVMs) to automatically perform endmember extraction from hyperspectral data. The strengths of SVM are exploited to provide a fast and accurate calculated representation of high-dimensional data sets that may consist of multiple distributions. Once this representation is computed, the number of distributions can be determined without prior knowledge. For each distribution, an optimal transform can be determined that preserves informational content while reducing the data dimensionality, and hence, the computational cost. Finally, endmember extraction for the whole data set is accomplished. Results indicate that this Support Vector Machine-Based Endmembermore » Extraction (SVM-BEE) algorithm has the capability of autonomously determining endmembers from multiple clusters with computational speed and accuracy, while maintaining a robust tolerance to noise.« less

  8. Support Vector Machines for Multitemporal and Multisensor Change Detection in a Mining Area

    NASA Astrophysics Data System (ADS)

    Hecheltjen, Antje; Waske, Bjorn; Thonfeld, Frank; Braun, Matthias; Menz, Gunter

    2010-12-01

    Long-term change detection often implies the challenge of incorporating multitemporal data from different sensors. Most of the conventional change detection algorithms are designed for bi-temporal datasets from the same sensors detecting only the existence of changes. The labeling of change areas remains a difficult task. To overcome such drawbacks, much attention has been given lately to algorithms arising from machine learning, such as Support Vector Machines (SVMs). While SVMs have been applied successfully for land cover classifications, the exploitation of this approach for change detection is still in its infancy. Few studies have already proven the applicability of SVMs for bi- and multitemporal change detection using data from one sensor only. In this paper we demonstrate the application of SVM for multitemporal and -sensor change detection. Our study site covers lignite open pit mining areas in the German state North Rhine-Westphalia. The dataset consists of bi-temporal Landsat data and multi-temporal ERS SAR data covering two time slots (2001 and 2009). The SVM is conducted using the IDL program imageSVM. Change is deduced from one time slot to the next resulting in two change maps. In contrast to change detection, which is based on post-classification comparison, change detection is seen here as a specific classification problem. Thus, changes are directly classified from a layer-stack of the two years. To reduce the number of change classes, we created a change mask using the magnitude of Change Vector Analysis (CVA). Training data were selected for different change classes (e.g. forest to mining or mining to agriculture) as well as for the no-change classes (e.g. agriculture). Subsequently, they were divided in two independent sets for training the SVMs and accuracy assessment, respectively. Our study shows the applicability of SVMs to classify changes via SVMs. The proposed method yielded a change map of reclaimed and active mines. The use of ERS SAR

  9. Currency crisis indication by using ensembles of support vector machine classifiers

    NASA Astrophysics Data System (ADS)

    Ramli, Nor Azuana; Ismail, Mohd Tahir; Wooi, Hooy Chee

    2014-07-01

    There are many methods that had been experimented in the analysis of currency crisis. However, not all methods could provide accurate indications. This paper introduces an ensemble of classifiers by using Support Vector Machine that's never been applied in analyses involving currency crisis before with the aim of increasing the indication accuracy. The proposed ensemble classifiers' performances are measured using percentage of accuracy, root mean squared error (RMSE), area under the Receiver Operating Characteristics (ROC) curve and Type II error. The performances of an ensemble of Support Vector Machine classifiers are compared with the single Support Vector Machine classifier and both of classifiers are tested on the data set from 27 countries with 12 macroeconomic indicators for each country. From our analyses, the results show that the ensemble of Support Vector Machine classifiers outperforms single Support Vector Machine classifier on the problem involving indicating a currency crisis in terms of a range of standard measures for comparing the performance of classifiers.

  10. Target specific compound identification using a support vector machine.

    PubMed

    Plewczynski, Dariusz; von Grotthuss, Marcin; Spieser, Stephane A H; Rychlewski, Leszek; Wyrwicz, Lucjan S; Ginalski, Krzysztof; Koch, Uwe

    2007-03-01

    In many cases at the beginning of an HTS-campaign, some information about active molecules is already available. Often known active compounds (such as substrate analogues, natural products, inhibitors of a related protein or ligands published by a pharmaceutical company) are identified in low-throughput validation studies of the biochemical target. In this study we evaluate the effectiveness of a support vector machine applied for those compounds and used to classify a collection with unknown activity. This approach was aimed at reducing the number of compounds to be tested against the given target. Our method predicts the biological activity of chemical compounds based on only the atom pairs (AP) two dimensional topological descriptors. The supervised support vector machine (SVM) method herein is trained on compounds from the MDL drug data report (MDDR) known to be active for specific protein target. For detailed analysis, five different biological targets were selected including cyclooxygenase-2, dihydrofolate reductase, thrombin, HIV-reverse transcriptase and antagonists of the estrogen receptor. The accuracy of compound identification was estimated using the recall and precision values. The sensitivities for all protein targets exceeded 80% and the classification performance reached 100% for selected targets. In another application of the method, we addressed the absence of an initial set of active compounds for a selected protein target at the beginning of an HTS-campaign. In such a case, virtual high-throughput screening (vHTS) is usually applied by using a flexible docking procedure. However, the vHTS experiment typically contains a large percentage of false positives that should be verified by costly and time-consuming experimental follow-up assays. The subsequent use of our machine learning method was found to improve the speed (since the docking procedure was not required for all compounds from the database) and also the accuracy of the HTS hit lists (the

  11. Automated image segmentation using support vector machines

    NASA Astrophysics Data System (ADS)

    Powell, Stephanie; Magnotta, Vincent A.; Andreasen, Nancy C.

    2007-03-01

    Neurodegenerative and neurodevelopmental diseases demonstrate problems associated with brain maturation and aging. Automated methods to delineate brain structures of interest are required to analyze large amounts of imaging data like that being collected in several on going multi-center studies. We have previously reported on using artificial neural networks (ANN) to define subcortical brain structures including the thalamus (0.88), caudate (0.85) and the putamen (0.81). In this work, apriori probability information was generated using Thirion's demons registration algorithm. The input vector consisted of apriori probability, spherical coordinates, and an iris of surrounding signal intensity values. We have applied the support vector machine (SVM) machine learning algorithm to automatically segment subcortical and cerebellar regions using the same input vector information. SVM architecture was derived from the ANN framework. Training was completed using a radial-basis function kernel with gamma equal to 5.5. Training was performed using 15,000 vectors collected from 15 training images in approximately 10 minutes. The resulting support vectors were applied to delineate 10 images not part of the training set. Relative overlap calculated for the subcortical structures was 0.87 for the thalamus, 0.84 for the caudate, 0.84 for the putamen, and 0.72 for the hippocampus. Relative overlap for the cerebellar lobes ranged from 0.76 to 0.86. The reliability of the SVM based algorithm was similar to the inter-rater reliability between manual raters and can be achieved without rater intervention.

  12. Agent Collaborative Target Localization and Classification in Wireless Sensor Networks

    PubMed Central

    Wang, Xue; Bi, Dao-wei; Ding, Liang; Wang, Sheng

    2007-01-01

    Wireless sensor networks (WSNs) are autonomous networks that have been frequently deployed to collaboratively perform target localization and classification tasks. Their autonomous and collaborative features resemble the characteristics of agents. Such similarities inspire the development of heterogeneous agent architecture for WSN in this paper. The proposed agent architecture views WSN as multi-agent systems and mobile agents are employed to reduce in-network communication. According to the architecture, an energy based acoustic localization algorithm is proposed. In localization, estimate of target location is obtained by steepest descent search. The search algorithm adapts to measurement environments by dynamically adjusting its termination condition. With the agent architecture, target classification is accomplished by distributed support vector machine (SVM). Mobile agents are employed for feature extraction and distributed SVM learning to reduce communication load. Desirable learning performance is guaranteed by combining support vectors and convex hull vectors. Fusion algorithms are designed to merge SVM classification decisions made from various modalities. Real world experiments with MICAz sensor nodes are conducted for vehicle localization and classification. Experimental results show the proposed agent architecture remarkably facilitates WSN designs and algorithm implementation. The localization and classification algorithms also prove to be accurate and energy efficient.

  13. Product Quality Modelling Based on Incremental Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Wang, J.; Zhang, W.; Qin, B.; Shi, W.

    2012-05-01

    Incremental Support vector machine (ISVM) is a new learning method developed in recent years based on the foundations of statistical learning theory. It is suitable for the problem of sequentially arriving field data and has been widely used for product quality prediction and production process optimization. However, the traditional ISVM learning does not consider the quality of the incremental data which may contain noise and redundant data; it will affect the learning speed and accuracy to a great extent. In order to improve SVM training speed and accuracy, a modified incremental support vector machine (MISVM) is proposed in this paper. Firstly, the margin vectors are extracted according to the Karush-Kuhn-Tucker (KKT) condition; then the distance from the margin vectors to the final decision hyperplane is calculated to evaluate the importance of margin vectors, where the margin vectors are removed while their distance exceed the specified value; finally, the original SVs and remaining margin vectors are used to update the SVM. The proposed MISVM can not only eliminate the unimportant samples such as noise samples, but also can preserve the important samples. The MISVM has been experimented on two public data and one field data of zinc coating weight in strip hot-dip galvanizing, and the results shows that the proposed method can improve the prediction accuracy and the training speed effectively. Furthermore, it can provide the necessary decision supports and analysis tools for auto control of product quality, and also can extend to other process industries, such as chemical process and manufacturing process.

  14. Control-group feature normalization for multivariate pattern analysis of structural MRI data using the support vector machine.

    PubMed

    Linn, Kristin A; Gaonkar, Bilwaj; Satterthwaite, Theodore D; Doshi, Jimit; Davatzikos, Christos; Shinohara, Russell T

    2016-05-15

    Normalization of feature vector values is a common practice in machine learning. Generally, each feature value is standardized to the unit hypercube or by normalizing to zero mean and unit variance. Classification decisions based on support vector machines (SVMs) or by other methods are sensitive to the specific normalization used on the features. In the context of multivariate pattern analysis using neuroimaging data, standardization effectively up- and down-weights features based on their individual variability. Since the standard approach uses the entire data set to guide the normalization, it utilizes the total variability of these features. This total variation is inevitably dependent on the amount of marginal separation between groups. Thus, such a normalization may attenuate the separability of the data in high dimensional space. In this work we propose an alternate approach that uses an estimate of the control-group standard deviation to normalize features before training. We study our proposed approach in the context of group classification using structural MRI data. We show that control-based normalization leads to better reproducibility of estimated multivariate disease patterns and improves the classifier performance in many cases. Copyright © 2016 Elsevier Inc. All rights reserved.

  15. A multiple kernel support vector machine scheme for feature selection and rule extraction from gene expression data of cancer tissue.

    PubMed

    Chen, Zhenyu; Li, Jianping; Wei, Liwei

    2007-10-01

    Recently, gene expression profiling using microarray techniques has been shown as a promising tool to improve the diagnosis and treatment of cancer. Gene expression data contain high level of noise and the overwhelming number of genes relative to the number of available samples. It brings out a great challenge for machine learning and statistic techniques. Support vector machine (SVM) has been successfully used to classify gene expression data of cancer tissue. In the medical field, it is crucial to deliver the user a transparent decision process. How to explain the computed solutions and present the extracted knowledge becomes a main obstacle for SVM. A multiple kernel support vector machine (MK-SVM) scheme, consisting of feature selection, rule extraction and prediction modeling is proposed to improve the explanation capacity of SVM. In this scheme, we show that the feature selection problem can be translated into an ordinary multiple parameters learning problem. And a shrinkage approach: 1-norm based linear programming is proposed to obtain the sparse parameters and the corresponding selected features. We propose a novel rule extraction approach using the information provided by the separating hyperplane and support vectors to improve the generalization capacity and comprehensibility of rules and reduce the computational complexity. Two public gene expression datasets: leukemia dataset and colon tumor dataset are used to demonstrate the performance of this approach. Using the small number of selected genes, MK-SVM achieves encouraging classification accuracy: more than 90% for both two datasets. Moreover, very simple rules with linguist labels are extracted. The rule sets have high diagnostic power because of their good classification performance.

  16. Automated system for lung nodules classification based on wavelet feature descriptor and support vector machine.

    PubMed

    Madero Orozco, Hiram; Vergara Villegas, Osslan Osiris; Cruz Sánchez, Vianey Guadalupe; Ochoa Domínguez, Humberto de Jesús; Nandayapa Alfaro, Manuel de Jesús

    2015-02-12

    Lung cancer is a leading cause of death worldwide; it refers to the uncontrolled growth of abnormal cells in the lung. A computed tomography (CT) scan of the thorax is the most sensitive method for detecting cancerous lung nodules. A lung nodule is a round lesion which can be either non-cancerous or cancerous. In the CT, the lung cancer is observed as round white shadow nodules. The possibility to obtain a manually accurate interpretation from CT scans demands a big effort by the radiologist and might be a fatiguing process. Therefore, the design of a computer-aided diagnosis (CADx) system would be helpful as a second opinion tool. The stages of the proposed CADx are: a supervised extraction of the region of interest to eliminate the shape differences among CT images. The Daubechies db1, db2, and db4 wavelet transforms are computed with one and two levels of decomposition. After that, 19 features are computed from each wavelet sub-band. Then, the sub-band and attribute selection is performed. As a result, 11 features are selected and combined in pairs as inputs to the support vector machine (SVM), which is used to distinguish CT images containing cancerous nodules from those not containing nodules. The clinical data set used for experiments consists of 45 CT scans from ELCAP and LIDC. For the training stage 61 CT images were used (36 with cancerous lung nodules and 25 without lung nodules). The system performance was tested with 45 CT scans (23 CT scans with lung nodules and 22 without nodules), different from that used for training. The results obtained show that the methodology successfully classifies cancerous nodules with a diameter from 2 mm to 30 mm. The total preciseness obtained was 82%; the sensitivity was 90.90%, whereas the specificity was 73.91%. The CADx system presented is competitive with other literature systems in terms of sensitivity. The system reduces the complexity of classification by not performing the typical segmentation stage of most CADx

  17. Fast support vector data descriptions for novelty detection.

    PubMed

    Liu, Yi-Hung; Liu, Yan-Chen; Chen, Yen-Jen

    2010-08-01

    Support vector data description (SVDD) has become a very attractive kernel method due to its good results in many novelty detection problems. However, the decision function of SVDD is expressed in terms of the kernel expansion, which results in a run-time complexity linear in the number of support vectors. For applications where fast real-time response is needed, how to speed up the decision function is crucial. This paper aims at dealing with the issue of reducing the testing time complexity of SVDD. A method called fast SVDD (F-SVDD) is proposed. Unlike the traditional methods which all try to compress a kernel expansion into one with fewer terms, the proposed F-SVDD directly finds the preimage of a feature vector, and then uses a simple relationship between this feature vector and the SVDD sphere center to re-express the center with a single vector. The decision function of F-SVDD contains only one kernel term, and thus the decision boundary of F-SVDD is only spherical in the original space. Hence, the run-time complexity of the F-SVDD decision function is no longer linear in the support vectors, but is a constant, no matter how large the training set size is. In this paper, we also propose a novel direct preimage-finding method, which is noniterative and involves no free parameters. The unique preimage can be obtained in real time by the proposed direct method without taking trial-and-error. For demonstration, several real-world data sets and a large-scale data set, the extended MIT face data set, are used in experiments. In addition, a practical industry example regarding liquid crystal display micro-defect inspection is also used to compare the applicability of SVDD and our proposed F-SVDD when faced with mass data input. The results are very encouraging.

  18. Arbitrary norm support vector machines.

    PubMed

    Huang, Kaizhu; Zheng, Danian; King, Irwin; Lyu, Michael R

    2009-02-01

    Support vector machines (SVM) are state-of-the-art classifiers. Typically L2-norm or L1-norm is adopted as a regularization term in SVMs, while other norm-based SVMs, for example, the L0-norm SVM or even the L(infinity)-norm SVM, are rarely seen in the literature. The major reason is that L0-norm describes a discontinuous and nonconvex term, leading to a combinatorially NP-hard optimization problem. In this letter, motivated by Bayesian learning, we propose a novel framework that can implement arbitrary norm-based SVMs in polynomial time. One significant feature of this framework is that only a sequence of sequential minimal optimization problems needs to be solved, thus making it practical in many real applications. The proposed framework is important in the sense that Bayesian priors can be efficiently plugged into most learning methods without knowing the explicit form. Hence, this builds a connection between Bayesian learning and the kernel machines. We derive the theoretical framework, demonstrate how our approach works on the L0-norm SVM as a typical example, and perform a series of experiments to validate its advantages. Experimental results on nine benchmark data sets are very encouraging. The implemented L0-norm is competitive with or even better than the standard L2-norm SVM in terms of accuracy but with a reduced number of support vectors, -9.46% of the number on average. When compared with another sparse model, the relevance vector machine, our proposed algorithm also demonstrates better sparse properties with a training speed over seven times faster.

  19. Application of Classification Models to Pharyngeal High-Resolution Manometry

    ERIC Educational Resources Information Center

    Mielens, Jason D.; Hoffman, Matthew R.; Ciucci, Michelle R.; McCulloch, Timothy M.; Jiang, Jack J.

    2012-01-01

    Purpose: The authors present 3 methods of performing pattern recognition on spatiotemporal plots produced by pharyngeal high-resolution manometry (HRM). Method: Classification models, including the artificial neural networks (ANNs) multilayer perceptron (MLP) and learning vector quantization (LVQ), as well as support vector machines (SVM), were…

  20. Application of support vector machine for the separation of mineralised zones in the Takht-e-Gonbad porphyry deposit, SE Iran

    NASA Astrophysics Data System (ADS)

    Mahvash Mohammadi, Neda; Hezarkhani, Ardeshir

    2018-07-01

    Classification of mineralised zones is an important factor for the analysis of economic deposits. In this paper, the support vector machine (SVM), a supervised learning algorithm, based on subsurface data is proposed for classification of mineralised zones in the Takht-e-Gonbad porphyry Cu-deposit (SE Iran). The effects of the input features are evaluated via calculating the accuracy rates on the SVM performance. Ultimately, the SVM model, is developed based on input features namely lithology, alteration, mineralisation, the level and, radial basis function (RBF) as a kernel function. Moreover, the optimal amount of parameters λ and C, using n-fold cross-validation method, are calculated at level 0.001 and 0.01 respectively. The accuracy of this model is 0.931 for classification of mineralised zones in the Takht-e-Gonbad porphyry deposit. The results of the study confirm the efficiency of SVM method for classification the mineralised zones.

  1. Support vector machines-based modelling of seismic liquefaction potential

    NASA Astrophysics Data System (ADS)

    Pal, Mahesh

    2006-08-01

    This paper investigates the potential of support vector machines (SVM)-based classification approach to assess the liquefaction potential from actual standard penetration test (SPT) and cone penetration test (CPT) field data. SVMs are based on statistical learning theory and found to work well in comparison to neural networks in several other applications. Both CPT and SPT field data sets is used with SVMs for predicting the occurrence and non-occurrence of liquefaction based on different input parameter combination. With SPT and CPT test data sets, highest accuracy of 96 and 97%, respectively, was achieved with SVMs. This suggests that SVMs can effectively be used to model the complex relationship between different soil parameter and the liquefaction potential. Several other combinations of input variable were used to assess the influence of different input parameters on liquefaction potential. Proposed approach suggest that neither normalized cone resistance value with CPT data nor the calculation of standardized SPT value is required with SPT data. Further, SVMs required few user-defined parameters and provide better performance in comparison to neural network approach.

  2. Classification VIA Information-Theoretic Fusion of Vector-Magnetic and Acoustic Sensor Data

    DTIC Science & Technology

    2007-04-01

    10) where tBsBtBsBtBsBtsB zzyyxx, . (11) The operation in (10) may be viewed as a vector matched- filter on to estimate )(tB CPARv . In summary...choosing to maximize the classification information in Y are described in Section 3.2. A 3.2. Maximum mutual information ( MMI ) features We begin with a...review of several desirable properties of features that maximize a mutual information ( MMI ) criterion. Then we review a particular algorithm [2

  3. A Scatter-Based Prototype Framework and Multi-Class Extension of Support Vector Machines

    PubMed Central

    Jenssen, Robert; Kloft, Marius; Zien, Alexander; Sonnenburg, Sören; Müller, Klaus-Robert

    2012-01-01

    We provide a novel interpretation of the dual of support vector machines (SVMs) in terms of scatter with respect to class prototypes and their mean. As a key contribution, we extend this framework to multiple classes, providing a new joint Scatter SVM algorithm, at the level of its binary counterpart in the number of optimization variables. This enables us to implement computationally efficient solvers based on sequential minimal and chunking optimization. As a further contribution, the primal problem formulation is developed in terms of regularized risk minimization and the hinge loss, revealing the score function to be used in the actual classification of test patterns. We investigate Scatter SVM properties related to generalization ability, computational efficiency, sparsity and sensitivity maps, and report promising results. PMID:23118845

  4. Correlated Topic Vector for Scene Classification.

    PubMed

    Wei, Pengxu; Qin, Fei; Wan, Fang; Zhu, Yi; Jiao, Jianbin; Ye, Qixiang

    2017-07-01

    Scene images usually involve semantic correlations, particularly when considering large-scale image data sets. This paper proposes a novel generative image representation, correlated topic vector, to model such semantic correlations. Oriented from the correlated topic model, correlated topic vector intends to naturally utilize the correlations among topics, which are seldom considered in the conventional feature encoding, e.g., Fisher vector, but do exist in scene images. It is expected that the involvement of correlations can increase the discriminative capability of the learned generative model and consequently improve the recognition accuracy. Incorporated with the Fisher kernel method, correlated topic vector inherits the advantages of Fisher vector. The contributions to the topics of visual words have been further employed by incorporating the Fisher kernel framework to indicate the differences among scenes. Combined with the deep convolutional neural network (CNN) features and Gibbs sampling solution, correlated topic vector shows great potential when processing large-scale and complex scene image data sets. Experiments on two scene image data sets demonstrate that correlated topic vector improves significantly the deep CNN features, and outperforms existing Fisher kernel-based features.

  5. Gender classification of running subjects using full-body kinematics

    NASA Astrophysics Data System (ADS)

    Williams, Christina M.; Flora, Jeffrey B.; Iftekharuddin, Khan M.

    2016-05-01

    This paper proposes novel automated gender classification of subjects while engaged in running activity. The machine learning techniques include preprocessing steps using principal component analysis followed by classification with linear discriminant analysis, and nonlinear support vector machines, and decision-stump with AdaBoost. The dataset consists of 49 subjects (25 males, 24 females, 2 trials each) all equipped with approximately 80 retroreflective markers. The trials are reflective of the subject's entire body moving unrestrained through a capture volume at a self-selected running speed, thus producing highly realistic data. The classification accuracy using leave-one-out cross validation for the 49 subjects is improved from 66.33% using linear discriminant analysis to 86.74% using the nonlinear support vector machine. Results are further improved to 87.76% by means of implementing a nonlinear decision stump with AdaBoost classifier. The experimental findings suggest that the linear classification approaches are inadequate in classifying gender for a large dataset with subjects running in a moderately uninhibited environment.

  6. ATLS Hypovolemic Shock Classification by Prediction of Blood Loss in Rats Using Regression Models.

    PubMed

    Choi, Soo Beom; Choi, Joon Yul; Park, Jee Soo; Kim, Deok Won

    2016-07-01

    In our previous study, our input data set consisted of 78 rats, the blood loss in percent as a dependent variable, and 11 independent variables (heart rate, systolic blood pressure, diastolic blood pressure, mean arterial pressure, pulse pressure, respiration rate, temperature, perfusion index, lactate concentration, shock index, and new index (lactate concentration/perfusion)). The machine learning methods for multicategory classification were applied to a rat model in acute hemorrhage to predict the four Advanced Trauma Life Support (ATLS) hypovolemic shock classes for triage in our previous study. However, multicategory classification is much more difficult and complicated than binary classification. We introduce a simple approach for classifying ATLS hypovolaemic shock class by predicting blood loss in percent using support vector regression and multivariate linear regression (MLR). We also compared the performance of the classification models using absolute and relative vital signs. The accuracies of support vector regression and MLR models with relative values by predicting blood loss in percent were 88.5% and 84.6%, respectively. These were better than the best accuracy of 80.8% of the direct multicategory classification using the support vector machine one-versus-one model in our previous study for the same validation data set. Moreover, the simple MLR models with both absolute and relative values could provide possibility of the future clinical decision support system for ATLS classification. The perfusion index and new index were more appropriate with relative changes than absolute values.

  7. A Fault Alarm and Diagnosis Method Based on Sensitive Parameters and Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Zhang, Jinjie; Yao, Ziyun; Lv, Zhiquan; Zhu, Qunxiong; Xu, Fengtian; Jiang, Zhinong

    2015-08-01

    Study on the extraction of fault feature and the diagnostic technique of reciprocating compressor is one of the hot research topics in the field of reciprocating machinery fault diagnosis at present. A large number of feature extraction and classification methods have been widely applied in the related research, but the practical fault alarm and the accuracy of diagnosis have not been effectively improved. Developing feature extraction and classification methods to meet the requirements of typical fault alarm and automatic diagnosis in practical engineering is urgent task. The typical mechanical faults of reciprocating compressor are presented in the paper, and the existing data of online monitoring system is used to extract fault feature parameters within 15 types in total; the inner sensitive connection between faults and the feature parameters has been made clear by using the distance evaluation technique, also sensitive characteristic parameters of different faults have been obtained. On this basis, a method based on fault feature parameters and support vector machine (SVM) is developed, which will be applied to practical fault diagnosis. A better ability of early fault warning has been proved by the experiment and the practical fault cases. Automatic classification by using the SVM to the data of fault alarm has obtained better diagnostic accuracy.

  8. I-CAN: The Classification and Prediction of Support Needs

    ERIC Educational Resources Information Center

    Arnold, Samuel R. C.; Riches, Vivienne C.; Stancliffe, Roger J.

    2014-01-01

    Background: Since 1992, the diagnosis and classification of intellectual disability has been dependent upon three constructs: intelligence, adaptive behaviour and support needs (Luckasson "et al." 1992. Mental Retardation: Definition, Classification and Systems of Support. American Association on Intellectual and Developmental…

  9. sw-SVM: sensor weighting support vector machines for EEG-based brain-computer interfaces.

    PubMed

    Jrad, N; Congedo, M; Phlypo, R; Rousseau, S; Flamary, R; Yger, F; Rakotomamonjy, A

    2011-10-01

    In many machine learning applications, like brain-computer interfaces (BCI), high-dimensional sensor array data are available. Sensor measurements are often highly correlated and signal-to-noise ratio is not homogeneously spread across sensors. Thus, collected data are highly variable and discrimination tasks are challenging. In this work, we focus on sensor weighting as an efficient tool to improve the classification procedure. We present an approach integrating sensor weighting in the classification framework. Sensor weights are considered as hyper-parameters to be learned by a support vector machine (SVM). The resulting sensor weighting SVM (sw-SVM) is designed to satisfy a margin criterion, that is, the generalization error. Experimental studies on two data sets are presented, a P300 data set and an error-related potential (ErrP) data set. For the P300 data set (BCI competition III), for which a large number of trials is available, the sw-SVM proves to perform equivalently with respect to the ensemble SVM strategy that won the competition. For the ErrP data set, for which a small number of trials are available, the sw-SVM shows superior performances as compared to three state-of-the art approaches. Results suggest that the sw-SVM promises to be useful in event-related potentials classification, even with a small number of training trials.

  10. Prediction of cell penetrating peptides by support vector machines.

    PubMed

    Sanders, William S; Johnston, C Ian; Bridges, Susan M; Burgess, Shane C; Willeford, Kenneth O

    2011-07-01

    Cell penetrating peptides (CPPs) are those peptides that can transverse cell membranes to enter cells. Once inside the cell, different CPPs can localize to different cellular components and perform different roles. Some generate pore-forming complexes resulting in the destruction of cells while others localize to various organelles. Use of machine learning methods to predict potential new CPPs will enable more rapid screening for applications such as drug delivery. We have investigated the influence of the composition of training datasets on the ability to classify peptides as cell penetrating using support vector machines (SVMs). We identified 111 known CPPs and 34 known non-penetrating peptides from the literature and commercial vendors and used several approaches to build training data sets for the classifiers. Features were calculated from the datasets using a set of basic biochemical properties combined with features from the literature determined to be relevant in the prediction of CPPs. Our results using different training datasets confirm the importance of a balanced training set with approximately equal number of positive and negative examples. The SVM based classifiers have greater classification accuracy than previously reported methods for the prediction of CPPs, and because they use primary biochemical properties of the peptides as features, these classifiers provide insight into the properties needed for cell-penetration. To confirm our SVM classifications, a subset of peptides classified as either penetrating or non-penetrating was selected for synthesis and experimental validation. Of the synthesized peptides predicted to be CPPs, 100% of these peptides were shown to be penetrating.

  11. Bayesian Kernel Methods for Non-Gaussian Distributions: Binary and Multi-class Classification Problems

    DTIC Science & Technology

    2013-05-28

    those of the support vector machine and relevance vector machine, and the model runs more quickly than the other algorithms . When one class occurs...incremental support vector machine algorithm for online learning when fewer than 50 data points are available. (a) Papers published in peer-reviewed journals...learning environments, where data processing occurs one observation at a time and the classification algorithm improves over time with new

  12. Beam-hardening correction by a surface fitting and phase classification by a least square support vector machine approach for tomography images of geological samples

    NASA Astrophysics Data System (ADS)

    Khan, F.; Enzmann, F.; Kersten, M.

    2015-12-01

    In X-ray computed microtomography (μXCT) image processing is the most important operation prior to image analysis. Such processing mainly involves artefact reduction and image segmentation. We propose a new two-stage post-reconstruction procedure of an image of a geological rock core obtained by polychromatic cone-beam μXCT technology. In the first stage, the beam-hardening (BH) is removed applying a best-fit quadratic surface algorithm to a given image data set (reconstructed slice), which minimizes the BH offsets of the attenuation data points from that surface. The final BH-corrected image is extracted from the residual data, or the difference between the surface elevation values and the original grey-scale values. For the second stage, we propose using a least square support vector machine (a non-linear classifier algorithm) to segment the BH-corrected data as a pixel-based multi-classification task. A combination of the two approaches was used to classify a complex multi-mineral rock sample. The Matlab code for this approach is provided in the Appendix. A minor drawback is that the proposed segmentation algorithm may become computationally demanding in the case of a high dimensional training data set.

  13. Epileptic seizure detection from EEG signals with phase-amplitude cross-frequency coupling and support vector machine

    NASA Astrophysics Data System (ADS)

    Liu, Yang; Wang, Jiang; Cai, Lihui; Chen, Yingyuan; Qin, Yingmei

    2018-03-01

    As a pattern of cross-frequency coupling (CFC), phase-amplitude coupling (PAC) depicts the interaction between the phase and amplitude of distinct frequency bands from the same signal, and has been proved to be closely related to the brain’s cognitive and memory activities. This work utilized PAC and support vector machine (SVM) classifier to identify the epileptic seizures from electroencephalogram (EEG) data. The entropy-based modulation index (MI) matrixes are used to express the strength of PAC, from which we extracted features as the input for classifier. Based on the Bonn database, which contains five datasets of EEG segments obtained from healthy volunteers and epileptic subjects, a 100% classification accuracy is achieved for identifying seizure ictal from healthy data, and an accuracy of 97.67% is reached in the classification of ictal EEG signals from inter-ictal EEGs. Based on the CHB-MIT database which is a group of continuously recorded epileptic EEGs by scalp electrodes, a 97.50% classification accuracy is obtained and a raising sign of MI value is found at 6s before seizure onset. The classification performance in this work is effective, and PAC can be considered as a useful tool for detecting and predicting the epileptic seizures and providing reference for clinical diagnosis.

  14. A Real-Time Interference Monitoring Technique for GNSS Based on a Twin Support Vector Machine Method.

    PubMed

    Li, Wutao; Huang, Zhigang; Lang, Rongling; Qin, Honglei; Zhou, Kai; Cao, Yongbin

    2016-03-04

    Interferences can severely degrade the performance of Global Navigation Satellite System (GNSS) receivers. As the first step of GNSS any anti-interference measures, interference monitoring for GNSS is extremely essential and necessary. Since interference monitoring can be considered as a classification problem, a real-time interference monitoring technique based on Twin Support Vector Machine (TWSVM) is proposed in this paper. A TWSVM model is established, and TWSVM is solved by the Least Squares Twin Support Vector Machine (LSTWSVM) algorithm. The interference monitoring indicators are analyzed to extract features from the interfered GNSS signals. The experimental results show that the chosen observations can be used as the interference monitoring indicators. The interference monitoring performance of the proposed method is verified by using GPS L1 C/A code signal and being compared with that of standard SVM. The experimental results indicate that the TWSVM-based interference monitoring is much faster than the conventional SVM. Furthermore, the training time of TWSVM is on millisecond (ms) level and the monitoring time is on microsecond (μs) level, which make the proposed approach usable in practical interference monitoring applications.

  15. A Real-Time Interference Monitoring Technique for GNSS Based on a Twin Support Vector Machine Method

    PubMed Central

    Li, Wutao; Huang, Zhigang; Lang, Rongling; Qin, Honglei; Zhou, Kai; Cao, Yongbin

    2016-01-01

    Interferences can severely degrade the performance of Global Navigation Satellite System (GNSS) receivers. As the first step of GNSS any anti-interference measures, interference monitoring for GNSS is extremely essential and necessary. Since interference monitoring can be considered as a classification problem, a real-time interference monitoring technique based on Twin Support Vector Machine (TWSVM) is proposed in this paper. A TWSVM model is established, and TWSVM is solved by the Least Squares Twin Support Vector Machine (LSTWSVM) algorithm. The interference monitoring indicators are analyzed to extract features from the interfered GNSS signals. The experimental results show that the chosen observations can be used as the interference monitoring indicators. The interference monitoring performance of the proposed method is verified by using GPS L1 C/A code signal and being compared with that of standard SVM. The experimental results indicate that the TWSVM-based interference monitoring is much faster than the conventional SVM. Furthermore, the training time of TWSVM is on millisecond (ms) level and the monitoring time is on microsecond (μs) level, which make the proposed approach usable in practical interference monitoring applications. PMID:26959020

  16. Automated detection of pulmonary nodules in CT images with support vector machines

    NASA Astrophysics Data System (ADS)

    Liu, Lu; Liu, Wanyu; Sun, Xiaoming

    2008-10-01

    Many methods have been proposed to avoid radiologists fail to diagnose small pulmonary nodules. Recently, support vector machines (SVMs) had received an increasing attention for pattern recognition. In this paper, we present a computerized system aimed at pulmonary nodules detection; it identifies the lung field, extracts a set of candidate regions with a high sensitivity ratio and then classifies candidates by the use of SVMs. The Computer Aided Diagnosis (CAD) system presented in this paper supports the diagnosis of pulmonary nodules from Computed Tomography (CT) images as inflammation, tuberculoma, granuloma..sclerosing hemangioma, and malignant tumor. Five texture feature sets were extracted for each lesion, while a genetic algorithm based feature selection method was applied to identify the most robust features. The selected feature set was fed into an ensemble of SVMs classifiers. The achieved classification performance was 100%, 92.75% and 90.23% in the training, validation and testing set, respectively. It is concluded that computerized analysis of medical images in combination with artificial intelligence can be used in clinical practice and may contribute to more efficient diagnosis.

  17. Cloud field classification based upon high spatial resolution textural features. II - Simplified vector approaches

    NASA Technical Reports Server (NTRS)

    Chen, D. W.; Sengupta, S. K.; Welch, R. M.

    1989-01-01

    This paper compares the results of cloud-field classification derived from two simplified vector approaches, the Sum and Difference Histogram (SADH) and the Gray Level Difference Vector (GLDV), with the results produced by the Gray Level Cooccurrence Matrix (GLCM) approach described by Welch et al. (1988). It is shown that the SADH method produces accuracies equivalent to those obtained using the GLCM method, while the GLDV method fails to resolve error clusters. Compared to the GLCM method, the SADH method leads to a 31 percent saving in run time and a 50 percent saving in storage requirements, while the GLVD approach leads to a 40 percent saving in run time and an 87 percent saving in storage requirements.

  18. Evolutionary-driven support vector machines for determining the degree of liver fibrosis in chronic hepatitis C.

    PubMed

    Stoean, Ruxandra; Stoean, Catalin; Lupsor, Monica; Stefanescu, Horia; Badea, Radu

    2011-01-01

    evolutionary method is viewed in comparison to the traditional one. The multifaceted discrimination into all five degrees of fibrosis and the slightly less difficult common separation into solely three related stages are both investigated. The resulting performance proves the superiority over the standard support vector classification and the attained formula is helpful in providing an immediate calculation of the liver stage for new cases, while establishing the presence/absence and comprehending the weight of each medical factor with respect to a certain fibrosis level. The use of the evolutionary technique for fibrosis degree prediction triggers simplicity and offers a direct expression of the influence of dynamically selected indicators on the corresponding stage. Perhaps most importantly, it significantly surpasses the classical support vector machines, which are both widely used and technically sound. All these therefore confirm the promise of the new methodology towards a dependable support within the medical decision-making. Copyright © 2010 Elsevier B.V. All rights reserved.

  19. Boosting specificity of MEG artifact removal by weighted support vector machine.

    PubMed

    Duan, Fang; Phothisonothai, Montri; Kikuchi, Mitsuru; Yoshimura, Yuko; Minabe, Yoshio; Watanabe, Kastumi; Aihara, Kazuyuki

    2013-01-01

    An automatic artifact removal method of magnetoencephalogram (MEG) was presented in this paper. The method proposed is based on independent components analysis (ICA) and support vector machine (SVM). However, different from the previous studies, in this paper we consider two factors which would influence the performance. First, the imbalance factor of independent components (ICs) of MEG is handled by weighted SVM. Second, instead of simply setting a fixed weight to each class, a re-weighting scheme is used for the preservation of useful MEG ICs. Experimental results on manually marked MEG dataset showed that the method proposed could correctly distinguish the artifacts from the MEG ICs. Meanwhile, 99.72% ± 0.67 of MEG ICs were preserved. The classification accuracy was 97.91% ± 1.39. In addition, it was found that this method was not sensitive to individual differences. The cross validation (leave-one-subject-out) results showed an averaged accuracy of 97.41% ± 2.14.

  20. Application of biomonitoring and support vector machine in water quality assessment*

    PubMed Central

    Liao, Yue; Xu, Jian-yu; Wang, Zhu-wei

    2012-01-01

    The behavior of schools of zebrafish (Danio rerio) was studied in acute toxicity environments. Behavioral features were extracted and a method for water quality assessment using support vector machine (SVM) was developed. The behavioral parameters of fish were recorded and analyzed during one hour in an environment of a 24-h half-lethal concentration (LC50) of a pollutant. The data were used to develop a method to evaluate water quality, so as to give an early indication of toxicity. Four kinds of metal ions (Cu2+, Hg2+, Cr6+, and Cd2+) were used for toxicity testing. To enhance the efficiency and accuracy of assessment, a method combining SVM and a genetic algorithm (GA) was used. The results showed that the average prediction accuracy of the method was over 80% and the time cost was acceptable. The method gave satisfactory results for a variety of metal pollutants, demonstrating that this is an effective approach to the classification of water quality. PMID:22467374

  1. Incremental Transductive Learning Approaches to Schistosomiasis Vector Classification

    NASA Astrophysics Data System (ADS)

    Fusco, Terence; Bi, Yaxin; Wang, Haiying; Browne, Fiona

    2016-08-01

    The key issues pertaining to collection of epidemic disease data for our analysis purposes are that it is a labour intensive, time consuming and expensive process resulting in availability of sparse sample data which we use to develop prediction models. To address this sparse data issue, we present the novel Incremental Transductive methods to circumvent the data collection process by applying previously acquired data to provide consistent, confidence-based labelling alternatives to field survey research. We investigated various reasoning approaches for semi-supervised machine learning including Bayesian models for labelling data. The results show that using the proposed methods, we can label instances of data with a class of vector density at a high level of confidence. By applying the Liberal and Strict Training Approaches, we provide a labelling and classification alternative to standalone algorithms. The methods in this paper are components in the process of reducing the proliferation of the Schistosomiasis disease and its effects.

  2. Rapid authentication of adulteration of olive oil by near-infrared spectroscopy using support vector machines

    NASA Astrophysics Data System (ADS)

    Wu, Jingzhu; Dong, Jingjing; Dong, Wenfei; Chen, Yan; Liu, Cuiling

    2016-10-01

    A classification method of support vector machines with linear kernel was employed to authenticate genuine olive oil based on near-infrared spectroscopy. There were three types of adulteration of olive oil experimented in the study. The adulterated oil was respectively soybean oil, rapeseed oil and the mixture of soybean and rapeseed oil. The average recognition rate of second experiment was more than 90% and that of the third experiment was reach to 100%. The results showed the method had good performance in classifying genuine olive oil and the adulteration with small variation range of adulterated concentration and it was a promising and rapid technique for the detection of oil adulteration and fraud in the food industry.

  3. Identification of cigarette smoke inhalations from wearable sensor data using a Support Vector Machine classifier.

    PubMed

    Lopez-Meyer, Paulo; Tiffany, Stephen; Sazonov, Edward

    2012-01-01

    This study presents a subject-independent model for detection of smoke inhalations from wearable sensors capturing characteristic hand-to-mouth gestures and changes in breathing patterns during cigarette smoking. Wearable sensors were used to detect the proximity of the hand to the mouth and to acquire the respiratory patterns. The waveforms of sensor signals were used as features to build a Support Vector Machine classification model. Across a data set of 20 enrolled participants, precision of correct identification of smoke inhalations was found to be >87%, and a resulting recall >80%. These results suggest that it is possible to analyze smoking behavior by means of a wearable and non-invasive sensor system.

  4. Classifying Physical Morphology of Cocoa Beans Digital Images using Multiclass Ensemble Least-Squares Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Lawi, Armin; Adhitya, Yudhi

    2018-03-01

    The objective of this research is to determine the quality of cocoa beans through morphology of their digital images. Samples of cocoa beans were scattered on a bright white paper under a controlled lighting condition. A compact digital camera was used to capture the images. The images were then processed to extract their morphological parameters. Classification process begins with an analysis of cocoa beans image based on morphological feature extraction. Parameters for extraction of morphological or physical feature parameters, i.e., Area, Perimeter, Major Axis Length, Minor Axis Length, Aspect Ratio, Circularity, Roundness, Ferret Diameter. The cocoa beans are classified into 4 groups, i.e.: Normal Beans, Broken Beans, Fractured Beans, and Skin Damaged Beans. The model of classification used in this paper is the Multiclass Ensemble Least-Squares Support Vector Machine (MELS-SVM), a proposed improvement model of SVM using ensemble method in which the separate hyperplanes are obtained by least square approach and the multiclass procedure uses One-Against- All method. The result of our proposed model showed that the classification with morphological feature input parameters were accurately as 99.705% for the four classes, respectively.

  5. An empirical comparison of different approaches for combining multimodal neuroimaging data with support vector machine

    PubMed Central

    Pettersson-Yeo, William; Benetti, Stefania; Marquand, Andre F.; Joules, Richard; Catani, Marco; Williams, Steve C. R.; Allen, Paul; McGuire, Philip; Mechelli, Andrea

    2014-01-01

    In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification; either by generating a single decision function based on an integrated kernel matrix, or, by creating an ensemble of multiple single modality classifiers and integrating their predictions. Here, we describe four integrative approaches: (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We achieve this by integrating structural, functional, and diffusion tensor magnetic resonance imaging data, in order to compare ultra-high risk (n = 19), first episode psychosis (n = 19) and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited, (ii) where classification can be enhanced, simple methods may yield greater increases relative to more computationally complex alternatives, and, (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no “magic bullet” for increasing classification accuracy. However, it remains possible that this conclusion is dependent on the use of neuroimaging modalities that had little, or no, complementary information to offer one another, and that the

  6. An empirical comparison of different approaches for combining multimodal neuroimaging data with support vector machine.

    PubMed

    Pettersson-Yeo, William; Benetti, Stefania; Marquand, Andre F; Joules, Richard; Catani, Marco; Williams, Steve C R; Allen, Paul; McGuire, Philip; Mechelli, Andrea

    2014-01-01

    In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification; either by generating a single decision function based on an integrated kernel matrix, or, by creating an ensemble of multiple single modality classifiers and integrating their predictions. Here, we describe four integrative approaches: (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We achieve this by integrating structural, functional, and diffusion tensor magnetic resonance imaging data, in order to compare ultra-high risk (n = 19), first episode psychosis (n = 19) and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited, (ii) where classification can be enhanced, simple methods may yield greater increases relative to more computationally complex alternatives, and, (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no "magic bullet" for increasing classification accuracy. However, it remains possible that this conclusion is dependent on the use of neuroimaging modalities that had little, or no, complementary information to offer one another, and that the

  7. Support vector machine in machine condition monitoring and fault diagnosis

    NASA Astrophysics Data System (ADS)

    Widodo, Achmad; Yang, Bo-Suk

    2007-08-01

    Recently, the issue of machine condition monitoring and fault diagnosis as a part of maintenance system became global due to the potential advantages to be gained from reduced maintenance costs, improved productivity and increased machine availability. This paper presents a survey of machine condition monitoring and fault diagnosis using support vector machine (SVM). It attempts to summarize and review the recent research and developments of SVM in machine condition monitoring and diagnosis. Numerous methods have been developed based on intelligent systems such as artificial neural network, fuzzy expert system, condition-based reasoning, random forest, etc. However, the use of SVM for machine condition monitoring and fault diagnosis is still rare. SVM has excellent performance in generalization so it can produce high accuracy in classification for machine condition monitoring and diagnosis. Until 2006, the use of SVM in machine condition monitoring and fault diagnosis is tending to develop towards expertise orientation and problem-oriented domain. Finally, the ability to continually change and obtain a novel idea for machine condition monitoring and fault diagnosis using SVM will be future works.

  8. A Two-Layer Least Squares Support Vector Machine Approach to Credit Risk Assessment

    NASA Astrophysics Data System (ADS)

    Liu, Jingli; Li, Jianping; Xu, Weixuan; Shi, Yong

    Least squares support vector machine (LS-SVM) is a revised version of support vector machine (SVM) and has been proved to be a useful tool for pattern recognition. LS-SVM had excellent generalization performance and low computational cost. In this paper, we propose a new method called two-layer least squares support vector machine which combines kernel principle component analysis (KPCA) and linear programming form of least square support vector machine. With this method sparseness and robustness is obtained while solving large dimensional and large scale database. A U.S. commercial credit card database is used to test the efficiency of our method and the result proved to be a satisfactory one.

  9. Condition Assessment of Foundation Piles and Utility Poles Based on Guided Wave Propagation Using a Network of Tactile Transducers and Support Vector Machines

    PubMed Central

    Yu, Yang; Niederleithinger, Ernst; Li, Jianchun; Wiggenhauser, Herbert

    2017-01-01

    This paper presents a novel non-destructive testing and health monitoring system using a network of tactile transducers and accelerometers for the condition assessment and damage classification of foundation piles and utility poles. While in traditional pile integrity testing an impact hammer with broadband frequency excitation is typically used, the proposed testing system utilizes an innovative excitation system based on a network of tactile transducers to induce controlled narrow-band frequency stress waves. Thereby, the simultaneous excitation of multiple stress wave types and modes is avoided (or at least reduced), and targeted wave forms can be generated. The new testing system enables the testing and monitoring of foundation piles and utility poles where the top is inaccessible, making the new testing system suitable, for example, for the condition assessment of pile structures with obstructed heads and of poles with live wires. For system validation, the new system was experimentally tested on nine timber and concrete poles that were inflicted with several types of damage. The tactile transducers were excited with continuous sine wave signals of 1 kHz frequency. Support vector machines were employed together with advanced signal processing algorithms to distinguish recorded stress wave signals from pole structures with different types of damage. The results show that using fast Fourier transform signals, combined with principal component analysis as the input feature vector for support vector machine (SVM) classifiers with different kernel functions, can achieve damage classification with accuracies of 92.5% ± 7.5%. PMID:29258274

  10. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%.

  11. Voltammetric electronic tongue and support vector machines for identification of selected features in Mexican coffee.

    PubMed

    Domínguez, Rocio Berenice; Moreno-Barón, Laura; Muñoz, Roberto; Gutiérrez, Juan Manuel

    2014-09-24

    This paper describes a new method based on a voltammetric electronic tongue (ET) for the recognition of distinctive features in coffee samples. An ET was directly applied to different samples from the main Mexican coffee regions without any pretreatment before the analysis. The resulting electrochemical information was modeled with two different mathematical tools, namely Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). Growing conditions (i.e., organic or non-organic practices and altitude of crops) were considered for a first classification. LDA results showed an average discrimination rate of 88% ± 6.53% while SVM successfully accomplished an overall accuracy of 96.4% ± 3.50% for the same task. A second classification based on geographical origin of samples was carried out. Results showed an overall accuracy of 87.5% ± 7.79% for LDA and a superior performance of 97.5% ± 3.22% for SVM. Given the complexity of coffee samples, the high accuracy percentages achieved by ET coupled with SVM in both classification problems suggested a potential applicability of ET in the assessment of selected coffee features with a simpler and faster methodology along with a null sample pretreatment. In addition, the proposed method can be applied to authentication assessment while improving cost, time and accuracy of the general procedure.

  12. A Method for Extracting Important Segments from Documents Using Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Suzuki, Daisuke; Utsumi, Akira

    In this paper we propose an extraction-based method for automatic summarization. The proposed method consists of two processes: important segment extraction and sentence compaction. The process of important segment extraction classifies each segment in a document as important or not by Support Vector Machines (SVMs). The process of sentence compaction then determines grammatically appropriate portions of a sentence for a summary according to its dependency structure and the classification result by SVMs. To test the performance of our method, we conducted an evaluation experiment using the Text Summarization Challenge (TSC-1) corpus of human-prepared summaries. The result was that our method achieved better performance than a segment-extraction-only method and the Lead method, especially for sentences only a part of which was included in human summaries. Further analysis of the experimental results suggests that a hybrid method that integrates sentence extraction with segment extraction may generate better summaries.

  13. Object recognition of ladar with support vector machine

    NASA Astrophysics Data System (ADS)

    Sun, Jian-Feng; Li, Qi; Wang, Qi

    2005-01-01

    Intensity, range and Doppler images can be obtained by using laser radar. Laser radar can detect much more object information than other detecting sensor, such as passive infrared imaging and synthetic aperture radar (SAR), so it is well suited as the sensor of object recognition. Traditional method of laser radar object recognition is extracting target features, which can be influenced by noise. In this paper, a laser radar recognition method-Support Vector Machine is introduced. Support Vector Machine (SVM) is a new hotspot of recognition research after neural network. It has well performance on digital written and face recognition. Two series experiments about SVM designed for preprocessing and non-preprocessing samples are performed by real laser radar images, and the experiments results are compared.

  14. Semantic classification of business images

    NASA Astrophysics Data System (ADS)

    Erol, Berna; Hull, Jonathan J.

    2006-01-01

    Digital cameras are becoming increasingly common for capturing information in business settings. In this paper, we describe a novel method for classifying images into the following semantic classes: document, whiteboard, business card, slide, and regular images. Our method is based on combining low-level image features, such as text color, layout, and handwriting features with high-level OCR output analysis. Several Support Vector Machine Classifiers are combined for multi-class classification of input images. The system yields 95% accuracy in classification.

  15. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    PubMed Central

    Fernandez-Lozano, C.; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.

    2013-01-01

    Given the background of the use of Neural Networks in problems of apple juice classification, this paper aim at implementing a newly developed method in the field of machine learning: the Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using SVM as a fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected. PMID:24453933

  16. [Discrimination of varieties of borneol using terahertz spectra based on principal component analysis and support vector machine].

    PubMed

    Li, Wu; Hu, Bing; Wang, Ming-wei

    2014-12-01

    In the present paper, the terahertz time-domain spectroscopy (THz-TDS) identification model of borneol based on principal component analysis (PCA) and support vector machine (SVM) was established. As one Chinese common agent, borneol needs a rapid, simple and accurate detection and identification method for its different source and being easily confused in the pharmaceutical and trade links. In order to assure the quality of borneol product and guard the consumer's right, quickly, efficiently and correctly identifying borneol has significant meaning to the production and transaction of borneol. Terahertz time-domain spectroscopy is a new spectroscopy approach to characterize material using terahertz pulse. The absorption terahertz spectra of blumea camphor, borneol camphor and synthetic borneol were measured in the range of 0.2 to 2 THz with the transmission THz-TDS. The PCA scores of 2D plots (PC1 X PC2) and 3D plots (PC1 X PC2 X PC3) of three kinds of borneol samples were obtained through PCA analysis, and both of them have good clustering effect on the 3 different kinds of borneol. The value matrix of the first 10 principal components (PCs) was used to replace the original spectrum data, and the 60 samples of the three kinds of borneol were trained and then the unknown 60 samples were identified. Four kinds of support vector machine model of different kernel functions were set up in this way. Results show that the accuracy of identification and classification of SVM RBF kernel function for three kinds of borneol is 100%, and we selected the SVM with the radial basis kernel function to establish the borneol identification model, in addition, in the noisy case, the classification accuracy rates of four SVM kernel function are above 85%, and this indicates that SVM has strong generalization ability. This study shows that PCA with SVM method of borneol terahertz spectroscopy has good classification and identification effects, and provides a new method for species

  17. Classification of jet fuel properties by near-infrared spectroscopy using fuzzy rule-building expert systems and support vector machines.

    PubMed

    Xu, Zhanfeng; Bunker, Christopher E; Harrington, Peter de B

    2010-11-01

    Monitoring the changes of jet fuel physical properties is important because fuel used in high-performance aircraft must meet rigorous specifications. Near-infrared (NIR) spectroscopy is a fast method to characterize fuels. Because of the complexity of NIR spectral data, chemometric techniques are used to extract relevant information from spectral data to accurately classify physical properties of complex fuel samples. In this work, discrimination of fuel types and classification of flash point, freezing point, boiling point (10%, v/v), boiling point (50%, v/v), and boiling point (90%, v/v) of jet fuels (JP-5, JP-8, Jet A, and Jet A1) were investigated. Each physical property was divided into three classes, low, medium, and high ranges, using two evaluations with different class boundary definitions. The class boundaries function as the threshold to alarm when the fuel properties change. Optimal partial least squares discriminant analysis (oPLS-DA), fuzzy rule-building expert system (FuRES), and support vector machines (SVM) were used to build the calibration models between the NIR spectra and classes of physical property of jet fuels. OPLS-DA, FuRES, and SVM were compared with respect to prediction accuracy. The validation of the calibration model was conducted by applying bootstrap Latin partition (BLP), which gives a measure of precision. Prediction accuracy of 97 ± 2% of the flash point, 94 ± 2% of freezing point, 99 ± 1% of the boiling point (10%, v/v), 98 ± 2% of the boiling point (50%, v/v), and 96 ± 1% of the boiling point (90%, v/v) were obtained by FuRES in one boundaries definition. Both FuRES and SVM obtained statistically better prediction accuracy over those obtained by oPLS-DA. The results indicate that combined with chemometric classifiers NIR spectroscopy could be a fast method to monitor the changes of jet fuel physical properties.

  18. Support Vector Data Description Model to Map Specific Land Cover with Optimal Parameters Determined from a Window-Based Validation Set.

    PubMed

    Zhang, Jinshui; Yuan, Zhoumiqi; Shuai, Guanyuan; Pan, Yaozhong; Zhu, Xiufang

    2017-04-26

    This paper developed an approach, the window-based validation set for support vector data description (WVS-SVDD), to determine optimal parameters for support vector data description (SVDD) model to map specific land cover by integrating training and window-based validation sets. Compared to the conventional approach where the validation set included target and outlier pixels selected visually and randomly, the validation set derived from WVS-SVDD constructed a tightened hypersphere because of the compact constraint by the outlier pixels which were located neighboring to the target class in the spectral feature space. The overall accuracies for wheat and bare land achieved were as high as 89.25% and 83.65%, respectively. However, target class was underestimated because the validation set covers only a small fraction of the heterogeneous spectra of the target class. The different window sizes were then tested to acquire more wheat pixels for validation set. The results showed that classification accuracy increased with the increasing window size and the overall accuracies were higher than 88% at all window size scales. Moreover, WVS-SVDD showed much less sensitivity to the untrained classes than the multi-class support vector machine (SVM) method. Therefore, the developed method showed its merits using the optimal parameters, tradeoff coefficient ( C ) and kernel width ( s ), in mapping homogeneous specific land cover.

  19. Full-motion video analysis for improved gender classification

    NASA Astrophysics Data System (ADS)

    Flora, Jeffrey B.; Lochtefeld, Darrell F.; Iftekharuddin, Khan M.

    2014-06-01

    The ability of computer systems to perform gender classification using the dynamic motion of the human subject has important applications in medicine, human factors, and human-computer interface systems. Previous works in motion analysis have used data from sensors (including gyroscopes, accelerometers, and force plates), radar signatures, and video. However, full-motion video, motion capture, range data provides a higher resolution time and spatial dataset for the analysis of dynamic motion. Works using motion capture data have been limited by small datasets in a controlled environment. In this paper, we explore machine learning techniques to a new dataset that has a larger number of subjects. Additionally, these subjects move unrestricted through a capture volume, representing a more realistic, less controlled environment. We conclude that existing linear classification methods are insufficient for the gender classification for larger dataset captured in relatively uncontrolled environment. A method based on a nonlinear support vector machine classifier is proposed to obtain gender classification for the larger dataset. In experimental testing with a dataset consisting of 98 trials (49 subjects, 2 trials per subject), classification rates using leave-one-out cross-validation are improved from 73% using linear discriminant analysis to 88% using the nonlinear support vector machine classifier.

  20. Active relearning for robust supervised classification of pulmonary emphysema

    NASA Astrophysics Data System (ADS)

    Raghunath, Sushravya; Rajagopalan, Srinivasan; Karwoski, Ronald A.; Bartholmai, Brian J.; Robb, Richard A.

    2012-03-01

    Radiologists are adept at recognizing the appearance of lung parenchymal abnormalities in CT scans. However, the inconsistent differential diagnosis, due to subjective aggregation, mandates supervised classification. Towards optimizing Emphysema classification, we introduce a physician-in-the-loop feedback approach in order to minimize uncertainty in the selected training samples. Using multi-view inductive learning with the training samples, an ensemble of Support Vector Machine (SVM) models, each based on a specific pair-wise dissimilarity metric, was constructed in less than six seconds. In the active relearning phase, the ensemble-expert label conflicts were resolved by an expert. This just-in-time feedback with unoptimized SVMs yielded 15% increase in classification accuracy and 25% reduction in the number of support vectors. The generality of relearning was assessed in the optimized parameter space of six different classifiers across seven dissimilarity metrics. The resultant average accuracy improved to 21%. The co-operative feedback method proposed here could enhance both diagnostic and staging throughput efficiency in chest radiology practice.

  1. Accuracy of land use change detection using support vector machine and maximum likelihood techniques for open-cast coal mining areas.

    PubMed

    Karan, Shivesh Kishore; Samadder, Sukha Ranjan

    2016-08-01

    One objective of the present study was to evaluate the performance of support vector machine (SVM)-based image classification technique with the maximum likelihood classification (MLC) technique for a rapidly changing landscape of an open-cast mine. The other objective was to assess the change in land use pattern due to coal mining from 2006 to 2016. Assessing the change in land use pattern accurately is important for the development and monitoring of coalfields in conjunction with sustainable development. For the present study, Landsat 5 Thematic Mapper (TM) data of 2006 and Landsat 8 Operational Land Imager (OLI)/Thermal Infrared Sensor (TIRS) data of 2016 of a part of Jharia Coalfield, Dhanbad, India, were used. The SVM classification technique provided greater overall classification accuracy when compared to the MLC technique in classifying heterogeneous landscape with limited training dataset. SVM exceeded MLC in handling a difficult challenge of classifying features having near similar reflectance on the mean signature plot, an improvement of over 11 % was observed in classification of built-up area, and an improvement of 24 % was observed in classification of surface water using SVM; similarly, the SVM technique improved the overall land use classification accuracy by almost 6 and 3 % for Landsat 5 and Landsat 8 images, respectively. Results indicated that land degradation increased significantly from 2006 to 2016 in the study area. This study will help in quantifying the changes and can also serve as a basis for further decision support system studies aiding a variety of purposes such as planning and management of mines and environmental impact assessment.

  2. Objective research of auscultation signals in Traditional Chinese Medicine based on wavelet packet energy and support vector machine.

    PubMed

    Yan, Jianjun; Shen, Xiaojing; Wang, Yiqin; Li, Fufeng; Xia, Chunming; Guo, Rui; Chen, Chunfeng; Shen, Qingwei

    2010-01-01

    This study aims at utilising Wavelet Packet Transform (WPT) and Support Vector Machine (SVM) algorithm to make objective analysis and quantitative research for the auscultation in Traditional Chinese Medicine (TCM) diagnosis. First, Wavelet Packet Decomposition (WPD) at level 6 was employed to split more elaborate frequency bands of the auscultation signals. Then statistic analysis was made based on the extracted Wavelet Packet Energy (WPE) features from WPD coefficients. Furthermore, the pattern recognition was used to distinguish mixed subjects' statistical feature values of sample groups through SVM. Finally, the experimental results showed that the classification accuracies were at a high level.

  3. Phytoplankton global mapping from space with a support vector machine algorithm

    NASA Astrophysics Data System (ADS)

    de Boissieu, Florian; Menkes, Christophe; Dupouy, Cécile; Rodier, Martin; Bonnet, Sophie; Mangeas, Morgan; Frouin, Robert J.

    2014-11-01

    In recent years great progress has been made in global mapping of phytoplankton from space. Two main trends have emerged, the recognition of phytoplankton functional types (PFT) based on reflectance normalized to chlorophyll-a concentration, and the recognition of phytoplankton size class (PSC) based on the relationship between cell size and chlorophyll-a concentration. However, PFTs and PSCs are not decorrelated, and one approach can complement the other in a recognition task. In this paper, we explore the recognition of several dominant PFTs by combining reflectance anomalies, chlorophyll-a concentration and other environmental parameters, such as sea surface temperature and wind speed. Remote sensing pixels are labeled thanks to coincident in-situ pigment data from GeP&CO, NOMAD and MAREDAT datasets, covering various oceanographic environments. The recognition is made with a supervised Support Vector Machine classifier trained on the labeled pixels. This algorithm enables a non-linear separation of the classes in the input space and is especially adapted for small training datasets as available here. Moreover, it provides a class probability estimate, allowing one to enhance the robustness of the classification results through the choice of a minimum probability threshold. A greedy feature selection associated to a 10-fold cross-validation procedure is applied to select the most discriminative input features and evaluate the classification performance. The best classifiers are finally applied on daily remote sensing datasets (SeaWIFS, MODISA) and the resulting dominant PFT maps are compared with other studies. Several conclusions are drawn: (1) the feature selection highlights the weight of temperature, chlorophyll-a and wind speed variables in phytoplankton recognition; (2) the classifiers show good results and dominant PFT maps in agreement with phytoplankton distribution knowledge; (3) classification on MODISA data seems to perform better than on SeaWIFS data

  4. A new technique for rapid assessment of eutrophication status of coastal waters using a support vector machine

    NASA Astrophysics Data System (ADS)

    Kong, Xianyu; Che, Xiaowei; Su, Rongguo; Zhang, Chuansong; Yao, Qingzhen; Shi, Xiaoyong

    2017-05-01

    There is an urgent need to develop efficient evaluation tools that use easily measured variables to make rapid and timely eutrophication assessments, which are important for marine health management, and to implement eutrophication monitoring programs. In this study, an approach for rapidly assessing the eutrophication status of coastal waters with three easily measured parameters (turbidity, chlorophyll a and dissolved oxygen) was developed by the grid search (GS) optimized support vector machine (SVM), with trophic index TRIX classification results as the reference. With the optimized penalty parameter C =64 and the kernel parameter γ =1, the classification accuracy rates reached 89.3% for the training data, 88.3% for the cross-validation, and 88.5% for the validation dataset. Because the developed approach only used three easy-to-measure variables, its application could facilitate the rapid assessment of the eutrophication status of coastal waters, resulting in potential cost savings in marine monitoring programs and assisting in the provision of timely advice for marine management.

  5. Evaluating Support for the Current Classification of Eukaryotic Diversity

    PubMed Central

    Parfrey, Laura Wegener; Barbero, Erika; Lasser, Elyse; Dunthorn, Micah; Bhattacharya, Debashish; Patterson, David J; Katz, Laura A

    2006-01-01

    Perspectives on the classification of eukaryotic diversity have changed rapidly in recent years, as the four eukaryotic groups within the five-kingdom classification—plants, animals, fungi, and protists—have been transformed through numerous permutations into the current system of six “supergroups.” The intent of the supergroup classification system is to unite microbial and macroscopic eukaryotes based on phylogenetic inference. This supergroup approach is increasing in popularity in the literature and is appearing in introductory biology textbooks. We evaluate the stability and support for the current six-supergroup classification of eukaryotes based on molecular genealogies. We assess three aspects of each supergroup: (1) the stability of its taxonomy, (2) the support for monophyly (single evolutionary origin) in molecular analyses targeting a supergroup, and (3) the support for monophyly when a supergroup is included as an out-group in phylogenetic studies targeting other taxa. Our analysis demonstrates that supergroup taxonomies are unstable and that support for groups varies tremendously, indicating that the current classification scheme of eukaryotes is likely premature. We highlight several trends contributing to the instability and discuss the requirements for establishing robust clades within the eukaryotic tree of life. PMID:17194223

  6. Online Artifact Removal for Brain-Computer Interfaces Using Support Vector Machines and Blind Source Separation

    PubMed Central

    Halder, Sebastian; Bensch, Michael; Mellinger, Jürgen; Bogdan, Martin; Kübler, Andrea; Birbaumer, Niels; Rosenstiel, Wolfgang

    2007-01-01

    We propose a combination of blind source separation (BSS) and independent component analysis (ICA) (signal decomposition into artifacts and nonartifacts) with support vector machines (SVMs) (automatic classification) that are designed for online usage. In order to select a suitable BSS/ICA method, three ICA algorithms (JADE, Infomax, and FastICA) and one BSS algorithm (AMUSE) are evaluated to determine their ability to isolate electromyographic (EMG) and electrooculographic (EOG) artifacts into individual components. An implementation of the selected BSS/ICA method with SVMs trained to classify EMG and EOG artifacts, which enables the usage of the method as a filter in measurements with online feedback, is described. This filter is evaluated on three BCI datasets as a proof-of-concept of the method. PMID:18288259

  7. Online artifact removal for brain-computer interfaces using support vector machines and blind source separation.

    PubMed

    Halder, Sebastian; Bensch, Michael; Mellinger, Jürgen; Bogdan, Martin; Kübler, Andrea; Birbaumer, Niels; Rosenstiel, Wolfgang

    2007-01-01

    We propose a combination of blind source separation (BSS) and independent component analysis (ICA) (signal decomposition into artifacts and nonartifacts) with support vector machines (SVMs) (automatic classification) that are designed for online usage. In order to select a suitable BSS/ICA method, three ICA algorithms (JADE, Infomax, and FastICA) and one BSS algorithm (AMUSE) are evaluated to determine their ability to isolate electromyographic (EMG) and electrooculographic (EOG) artifacts into individual components. An implementation of the selected BSS/ICA method with SVMs trained to classify EMG and EOG artifacts, which enables the usage of the method as a filter in measurements with online feedback, is described. This filter is evaluated on three BCI datasets as a proof-of-concept of the method.

  8. On the classification techniques in data mining for microarray data classification

    NASA Astrophysics Data System (ADS)

    Aydadenta, Husna; Adiwijaya

    2018-03-01

    Cancer is one of the deadly diseases, according to data from WHO by 2015 there are 8.8 million more deaths caused by cancer, and this will increase every year if not resolved earlier. Microarray data has become one of the most popular cancer-identification studies in the field of health, since microarray data can be used to look at levels of gene expression in certain cell samples that serve to analyze thousands of genes simultaneously. By using data mining technique, we can classify the sample of microarray data thus it can be identified with cancer or not. In this paper we will discuss some research using some data mining techniques using microarray data, such as Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5, and simulation of Random Forest algorithm with technique of reduction dimension using Relief. The result of this paper show performance measure (accuracy) from classification algorithm (SVM, ANN, Naive Bayes, kNN, C4.5, and Random Forets).The results in this paper show the accuracy of Random Forest algorithm higher than other classification algorithms (Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5). It is hoped that this paper can provide some information about the speed, accuracy, performance and computational cost generated from each Data Mining Classification Technique based on microarray data.

  9. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%. PMID:25302338

  10. Wavelet Entropy and Directed Acyclic Graph Support Vector Machine for Detection of Patients with Unilateral Hearing Loss in MRI Scanning.

    PubMed

    Wang, Shuihua; Yang, Ming; Du, Sidan; Yang, Jiquan; Liu, Bin; Gorriz, Juan M; Ramírez, Javier; Yuan, Ti-Fei; Zhang, Yudong

    2016-01-01

    Highlights We develop computer-aided diagnosis system for unilateral hearing loss detection in structural magnetic resonance imaging.Wavelet entropy is introduced to extract image global features from brain images. Directed acyclic graph is employed to endow support vector machine an ability to handle multi-class problems.The developed computer-aided diagnosis system achieves an overall accuracy of 95.1% for this three-class problem of differentiating left-sided and right-sided hearing loss from healthy controls. Aim: Sensorineural hearing loss (SNHL) is correlated to many neurodegenerative disease. Now more and more computer vision based methods are using to detect it in an automatic way. Materials: We have in total 49 subjects, scanned by 3.0T MRI (Siemens Medical Solutions, Erlangen, Germany). The subjects contain 14 patients with right-sided hearing loss (RHL), 15 patients with left-sided hearing loss (LHL), and 20 healthy controls (HC). Method: We treat this as a three-class classification problem: RHL, LHL, and HC. Wavelet entropy (WE) was selected from the magnetic resonance images of each subjects, and then submitted to a directed acyclic graph support vector machine (DAG-SVM). Results: The 10 repetition results of 10-fold cross validation shows 3-level decomposition will yield an overall accuracy of 95.10% for this three-class classification problem, higher than feedforward neural network, decision tree, and naive Bayesian classifier. Conclusions: This computer-aided diagnosis system is promising. We hope this study can attract more computer vision method for detecting hearing loss.

  11. Community detection in complex networks using proximate support vector clustering

    NASA Astrophysics Data System (ADS)

    Wang, Feifan; Zhang, Baihai; Chai, Senchun; Xia, Yuanqing

    2018-03-01

    Community structure, one of the most attention attracting properties in complex networks, has been a cornerstone in advances of various scientific branches. A number of tools have been involved in recent studies concentrating on the community detection algorithms. In this paper, we propose a support vector clustering method based on a proximity graph, owing to which the introduced algorithm surpasses the traditional support vector approach both in accuracy and complexity. Results of extensive experiments undertaken on computer generated networks and real world data sets illustrate competent performances in comparison with the other counterparts.

  12. Relevance Vector Machine and Support Vector Machine Classifier Analysis of Scanning Laser Polarimetry Retinal Nerve Fiber Layer Measurements

    PubMed Central

    Bowd, Christopher; Medeiros, Felipe A.; Zhang, Zuohua; Zangwill, Linda M.; Hao, Jiucang; Lee, Te-Won; Sejnowski, Terrence J.; Weinreb, Robert N.; Goldbaum, Michael H.

    2010-01-01

    Purpose To classify healthy and glaucomatous eyes using relevance vector machine (RVM) and support vector machine (SVM) learning classifiers trained on retinal nerve fiber layer (RNFL) thickness measurements obtained by scanning laser polarimetry (SLP). Methods Seventy-two eyes of 72 healthy control subjects (average age = 64.3 ± 8.8 years, visual field mean deviation =−0.71 ± 1.2 dB) and 92 eyes of 92 patients with glaucoma (average age = 66.9 ± 8.9 years, visual field mean deviation =−5.32 ± 4.0 dB) were imaged with SLP with variable corneal compensation (GDx VCC; Laser Diagnostic Technologies, San Diego, CA). RVM and SVM learning classifiers were trained and tested on SLP-determined RNFL thickness measurements from 14 standard parameters and 64 sectors (approximately 5.6° each) obtained in the circumpapillary area under the instrument-defined measurement ellipse (total 78 parameters). Tenfold cross-validation was used to train and test RVM and SVM classifiers on unique subsets of the full 164-eye data set and areas under the receiver operating characteristic (AUROC) curve for the classification of eyes in the test set were generated. AUROC curve results from RVM and SVM were compared to those for 14 SLP software-generated global and regional RNFL thickness parameters. Also reported was the AUROC curve for the GDx VCC software-generated nerve fiber indicator (NFI). Results The AUROC curves for RVM and SVM were 0.90 and 0.91, respectively, and increased to 0.93 and 0.94 when the training sets were optimized with sequential forward and backward selection (resulting in reduced dimensional data sets). AUROC curves for optimized RVM and SVM were significantly larger than those for all individual SLP parameters. The AUROC curve for the NFI was 0.87. Conclusions Results from RVM and SVM trained on SLP RNFL thickness measurements are similar and provide accurate classification of glaucomatous and healthy eyes. RVM may be preferable to SVM, because it provides a

  13. 1-norm support vector novelty detection and its sparseness.

    PubMed

    Zhang, Li; Zhou, WeiDa

    2013-12-01

    This paper proposes a 1-norm support vector novelty detection (SVND) method and discusses its sparseness. 1-norm SVND is formulated as a linear programming problem and uses two techniques for inducing sparseness, or the 1-norm regularization and the hinge loss function. We also find two upper bounds on the sparseness of 1-norm SVND, or exact support vector (ESV) and kernel Gram matrix rank bounds. The ESV bound indicates that 1-norm SVND has a sparser representation model than SVND. The kernel Gram matrix rank bound can loosely estimate the sparseness of 1-norm SVND. Experimental results show that 1-norm SVND is feasible and effective. Copyright © 2013 Elsevier Ltd. All rights reserved.

  14. Voltammetric Electronic Tongue and Support Vector Machines for Identification of Selected Features in Mexican Coffee

    PubMed Central

    Domínguez, Rocio Berenice; Moreno-Barón, Laura; Muñoz, Roberto; Gutiérrez, Juan Manuel

    2014-01-01

    This paper describes a new method based on a voltammetric electronic tongue (ET) for the recognition of distinctive features in coffee samples. An ET was directly applied to different samples from the main Mexican coffee regions without any pretreatment before the analysis. The resulting electrochemical information was modeled with two different mathematical tools, namely Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). Growing conditions (i.e., organic or non-organic practices and altitude of crops) were considered for a first classification. LDA results showed an average discrimination rate of 88% ± 6.53% while SVM successfully accomplished an overall accuracy of 96.4% ± 3.50% for the same task. A second classification based on geographical origin of samples was carried out. Results showed an overall accuracy of 87.5% ± 7.79% for LDA and a superior performance of 97.5% ± 3.22% for SVM. Given the complexity of coffee samples, the high accuracy percentages achieved by ET coupled with SVM in both classification problems suggested a potential applicability of ET in the assessment of selected coffee features with a simpler and faster methodology along with a null sample pretreatment. In addition, the proposed method can be applied to authentication assessment while improving cost, time and accuracy of the general procedure. PMID:25254303

  15. On the sparseness of 1-norm support vector machines.

    PubMed

    Zhang, Li; Zhou, Weida

    2010-04-01

    There is some empirical evidence available showing that 1-norm Support Vector Machines (1-norm SVMs) have good sparseness; however, both how good sparseness 1-norm SVMs can reach and whether they have a sparser representation than that of standard SVMs are not clear. In this paper we take into account the sparseness of 1-norm SVMs. Two upper bounds on the number of nonzero coefficients in the decision function of 1-norm SVMs are presented. First, the number of nonzero coefficients in 1-norm SVMs is at most equal to the number of only the exact support vectors lying on the +1 and -1 discriminating surfaces, while that in standard SVMs is equal to the number of support vectors, which implies that 1-norm SVMs have better sparseness than that of standard SVMs. Second, the number of nonzero coefficients is at most equal to the rank of the sample matrix. A brief review of the geometry of linear programming and the primal steepest edge pricing simplex method are given, which allows us to provide the proof of the two upper bounds and evaluate their tightness by experiments. Experimental results on toy data sets and the UCI data sets illustrate our analysis. Copyright 2009 Elsevier Ltd. All rights reserved.

  16. Support vector machine incremental learning triggered by wrongly predicted samples

    NASA Astrophysics Data System (ADS)

    Tang, Ting-long; Guan, Qiu; Wu, Yi-rong

    2018-05-01

    According to the classic Karush-Kuhn-Tucker (KKT) theorem, at every step of incremental support vector machine (SVM) learning, the newly adding sample which violates the KKT conditions will be a new support vector (SV) and migrate the old samples between SV set and non-support vector (NSV) set, and at the same time the learning model should be updated based on the SVs. However, it is not exactly clear at this moment that which of the old samples would change between SVs and NSVs. Additionally, the learning model will be unnecessarily updated, which will not greatly increase its accuracy but decrease the training speed. Therefore, how to choose the new SVs from old sets during the incremental stages and when to process incremental steps will greatly influence the accuracy and efficiency of incremental SVM learning. In this work, a new algorithm is proposed to select candidate SVs and use the wrongly predicted sample to trigger the incremental processing simultaneously. Experimental results show that the proposed algorithm can achieve good performance with high efficiency, high speed and good accuracy.

  17. Study of support vector machine and serum surface-enhanced Raman spectroscopy for noninvasive esophageal cancer detection

    NASA Astrophysics Data System (ADS)

    Li, Shao-Xin; Zeng, Qiu-Yao; Li, Lin-Fang; Zhang, Yan-Jiao; Wan, Ming-Ming; Liu, Zhi-Ming; Xiong, Hong-Lian; Guo, Zhou-Yi; Liu, Song-Hao

    2013-02-01

    The ability of combining serum surface-enhanced Raman spectroscopy (SERS) with support vector machine (SVM) for improving classification esophageal cancer patients from normal volunteers is investigated. Two groups of serum SERS spectra based on silver nanoparticles (AgNPs) are obtained: one group from patients with pathologically confirmed esophageal cancer (n=30) and the other group from healthy volunteers (n=31). Principal components analysis (PCA), conventional SVM (C-SVM) and conventional SVM combination with PCA (PCA-SVM) methods are implemented to classify the same spectral dataset. Results show that a diagnostic accuracy of 77.0% is acquired for PCA technique, while diagnostic accuracies of 83.6% and 85.2% are obtained for C-SVM and PCA-SVM methods based on radial basis functions (RBF) models. The results prove that RBF SVM models are superior to PCA algorithm in classification serum SERS spectra. The study demonstrates that serum SERS in combination with SVM technique has great potential to provide an effective and accurate diagnostic schema for noninvasive detection of esophageal cancer.

  18. Epileptic seizure detection in EEG signal with GModPCA and support vector machine.

    PubMed

    Jaiswal, Abeg Kumar; Banka, Haider

    2017-01-01

    Epilepsy is one of the most common neurological disorders caused by recurrent seizures. Electroencephalograms (EEGs) record neural activity and can detect epilepsy. Visual inspection of an EEG signal for epileptic seizure detection is a time-consuming process and may lead to human error; therefore, recently, a number of automated seizure detection frameworks were proposed to replace these traditional methods. Feature extraction and classification are two important steps in these procedures. Feature extraction focuses on finding the informative features that could be used for classification and correct decision-making. Therefore, proposing effective feature extraction techniques for seizure detection is of great significance. Principal Component Analysis (PCA) is a dimensionality reduction technique used in different fields of pattern recognition including EEG signal classification. Global modular PCA (GModPCA) is a variation of PCA. In this paper, an effective framework with GModPCA and Support Vector Machine (SVM) is presented for epileptic seizure detection in EEG signals. The feature extraction is performed with GModPCA, whereas SVM trained with radial basis function kernel performed the classification between seizure and nonseizure EEG signals. Seven different experimental cases were conducted on the benchmark epilepsy EEG dataset. The system performance was evaluated using 10-fold cross-validation. In addition, we prove analytically that GModPCA has less time and space complexities as compared to PCA. The experimental results show that EEG signals have strong inter-sub-pattern correlations. GModPCA and SVM have been able to achieve 100% accuracy for the classification between normal and epileptic signals. Along with this, seven different experimental cases were tested. The classification results of the proposed approach were better than were compared the results of some of the existing methods proposed in literature. It is also found that the time and space

  19. Classification of fMRI independent components using IC-fingerprints and support vector machine classifiers.

    PubMed

    De Martino, Federico; Gentile, Francesco; Esposito, Fabrizio; Balsi, Marco; Di Salle, Francesco; Goebel, Rainer; Formisano, Elia

    2007-01-01

    We present a general method for the classification of independent components (ICs) extracted from functional MRI (fMRI) data sets. The method consists of two steps. In the first step, each fMRI-IC is associated with an IC-fingerprint, i.e., a representation of the component in a multidimensional space of parameters. These parameters are post hoc estimates of global properties of the ICs and are largely independent of a specific experimental design and stimulus timing. In the second step a machine learning algorithm automatically separates the IC-fingerprints into six general classes after preliminary training performed on a small subset of expert-labeled components. We illustrate this approach in a multisubject fMRI study employing visual structure-from-motion stimuli encoding faces and control random shapes. We show that: (1) IC-fingerprints are a valuable tool for the inspection, characterization and selection of fMRI-ICs and (2) automatic classifications of fMRI-ICs in new subjects present a high correspondence with those obtained by expert visual inspection of the components. Importantly, our classification procedure highlights several neurophysiologically interesting processes. The most intriguing of which is reflected, with high intra- and inter-subject reproducibility, in one IC exhibiting a transiently task-related activation in the 'face' region of the primary sensorimotor cortex. This suggests that in addition to or as part of the mirror system, somatotopic regions of the sensorimotor cortex are involved in disambiguating the perception of a moving body part. Finally, we show that the same classification algorithm can be successfully applied, without re-training, to fMRI collected using acquisition parameters, stimulation modality and timing considerably different from those used for training.

  20. Research on Remote Sensing Image Classification Based on Feature Level Fusion

    NASA Astrophysics Data System (ADS)

    Yuan, L.; Zhu, G.

    2018-04-01

    Remote sensing image classification, as an important direction of remote sensing image processing and application, has been widely studied. However, in the process of existing classification algorithms, there still exists the phenomenon of misclassification and missing points, which leads to the final classification accuracy is not high. In this paper, we selected Sentinel-1A and Landsat8 OLI images as data sources, and propose a classification method based on feature level fusion. Compare three kind of feature level fusion algorithms (i.e., Gram-Schmidt spectral sharpening, Principal Component Analysis transform and Brovey transform), and then select the best fused image for the classification experimental. In the classification process, we choose four kinds of image classification algorithms (i.e. Minimum distance, Mahalanobis distance, Support Vector Machine and ISODATA) to do contrast experiment. We use overall classification precision and Kappa coefficient as the classification accuracy evaluation criteria, and the four classification results of fused image are analysed. The experimental results show that the fusion effect of Gram-Schmidt spectral sharpening is better than other methods. In four kinds of classification algorithms, the fused image has the best applicability to Support Vector Machine classification, the overall classification precision is 94.01 % and the Kappa coefficients is 0.91. The fused image with Sentinel-1A and Landsat8 OLI is not only have more spatial information and spectral texture characteristics, but also enhances the distinguishing features of the images. The proposed method is beneficial to improve the accuracy and stability of remote sensing image classification.

  1. Predicting residue-wise contact orders in proteins by support vector regression.

    PubMed

    Song, Jiangning; Burrage, Kevin

    2006-10-03

    improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.

  2. Impact of atmospheric correction and image filtering on hyperspectral classification of tree species using support vector machine

    NASA Astrophysics Data System (ADS)

    Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko

    2015-01-01

    Hyperspectral images can be used to identify savannah tree species at the landscape scale, which is a key step in measuring biomass and carbon, and tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction is often performed, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of classification of tree species at pixel level from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy of six species classes is about 75%.

  3. Vector analysis of postcardiotomy behavioral phenomena.

    PubMed

    Caston, J C; Miller, W C; Felber, W J

    1975-04-01

    The classification of postcardiotomy behavioral phenomena in Figure 1 is proposed for use as a clinical instrument to analyze etiological determinants. The utilization of a vector analysis analogy inherently denies absolutism. Classifications A-P are presented as prototypes of certain ratio imbalances of the metabolic, hemodynamic, environmental, and psychic vectors. Such a system allows for change from one type to another according to the individuality of the patient and the highly specific changes in his clinical presentation. A vector analysis also allows for infinite intermediary ratio imbalances between classification types as a function of time. Thus, postcardiotomy behavioral phenomena could be viewed as the vector summation of hemodynamic, metabolic, environmental, and psychic processes at a given point in time. Elaboration of unknown determinants in this complex syndrome appears to be task for the future.

  4. Explaining Support Vector Machines: A Color Based Nomogram

    PubMed Central

    Van Belle, Vanya; Van Calster, Ben; Van Huffel, Sabine; Suykens, Johan A. K.; Lisboa, Paulo

    2016-01-01

    Problem setting Support vector machines (SVMs) are very popular tools for classification, regression and other problems. Due to the large choice of kernels they can be applied with, a large variety of data can be analysed using these tools. Machine learning thanks its popularity to the good performance of the resulting models. However, interpreting the models is far from obvious, especially when non-linear kernels are used. Hence, the methods are used as black boxes. As a consequence, the use of SVMs is less supported in areas where interpretability is important and where people are held responsible for the decisions made by models. Objective In this work, we investigate whether SVMs using linear, polynomial and RBF kernels can be explained such that interpretations for model-based decisions can be provided. We further indicate when SVMs can be explained and in which situations interpretation of SVMs is (hitherto) not possible. Here, explainability is defined as the ability to produce the final decision based on a sum of contributions which depend on one single or at most two input variables. Results Our experiments on simulated and real-life data show that explainability of an SVM depends on the chosen parameter values (degree of polynomial kernel, width of RBF kernel and regularization constant). When several combinations of parameter values yield the same cross-validation performance, combinations with a lower polynomial degree or a larger kernel width have a higher chance of being explainable. Conclusions This work summarizes SVM classifiers obtained with linear, polynomial and RBF kernels in a single plot. Linear and polynomial kernels up to the second degree are represented exactly. For other kernels an indication of the reliability of the approximation is presented. The complete methodology is available as an R package and two apps and a movie are provided to illustrate the possibilities offered by the method. PMID:27723811

  5. Prediction and analysis of beta-turns in proteins by support vector machine.

    PubMed

    Pham, Tho Hoan; Satou, Kenji; Ho, Tu Bao

    2003-01-01

    Tight turn has long been recognized as one of the three important features of proteins after the alpha-helix and beta-sheet. Tight turns play an important role in globular proteins from both the structural and functional points of view. More than 90% tight turns are beta-turns. Analysis and prediction of beta-turns in particular and tight turns in general are very useful for the design of new molecules such as drugs, pesticides, and antigens. In this paper, we introduce a support vector machine (SVM) approach to prediction and analysis of beta-turns. We have investigated two aspects of applying SVM to the prediction and analysis of beta-turns. First, we developed a new SVM method, called BTSVM, which predicts beta-turns of a protein from its sequence. The prediction results on the dataset of 426 non-homologous protein chains by sevenfold cross-validation technique showed that our method is superior to the other previous methods. Second, we analyzed how amino acid positions support (or prevent) the formation of beta-turns based on the "multivariable" classification model of a linear SVM. This model is more general than the other ones of previous statistical methods. Our analysis results are more comprehensive and easier to use than previously published analysis results.

  6. Wavelet Entropy and Directed Acyclic Graph Support Vector Machine for Detection of Patients with Unilateral Hearing Loss in MRI Scanning

    PubMed Central

    Wang, Shuihua; Yang, Ming; Du, Sidan; Yang, Jiquan; Liu, Bin; Gorriz, Juan M.; Ramírez, Javier; Yuan, Ti-Fei; Zhang, Yudong

    2016-01-01

    Highlights We develop computer-aided diagnosis system for unilateral hearing loss detection in structural magnetic resonance imaging.Wavelet entropy is introduced to extract image global features from brain images. Directed acyclic graph is employed to endow support vector machine an ability to handle multi-class problems.The developed computer-aided diagnosis system achieves an overall accuracy of 95.1% for this three-class problem of differentiating left-sided and right-sided hearing loss from healthy controls. Aim: Sensorineural hearing loss (SNHL) is correlated to many neurodegenerative disease. Now more and more computer vision based methods are using to detect it in an automatic way. Materials: We have in total 49 subjects, scanned by 3.0T MRI (Siemens Medical Solutions, Erlangen, Germany). The subjects contain 14 patients with right-sided hearing loss (RHL), 15 patients with left-sided hearing loss (LHL), and 20 healthy controls (HC). Method: We treat this as a three-class classification problem: RHL, LHL, and HC. Wavelet entropy (WE) was selected from the magnetic resonance images of each subjects, and then submitted to a directed acyclic graph support vector machine (DAG-SVM). Results: The 10 repetition results of 10-fold cross validation shows 3-level decomposition will yield an overall accuracy of 95.10% for this three-class classification problem, higher than feedforward neural network, decision tree, and naive Bayesian classifier. Conclusions: This computer-aided diagnosis system is promising. We hope this study can attract more computer vision method for detecting hearing loss. PMID:27807415

  7. Nondestructive detection of pork comprehensive quality based on spectroscopy and support vector machine

    NASA Astrophysics Data System (ADS)

    Liu, Yuanyuan; Peng, Yankun; Zhang, Leilei; Dhakal, Sagar; Wang, Caiping

    2014-05-01

    Pork is one of the highly consumed meat item in the world. With growing improvement of living standard, concerned stakeholders including consumers and regulatory body pay more attention to comprehensive quality of fresh pork. Different analytical-laboratory based technologies exist to determine quality attributes of pork. However, none of the technologies are able to meet industrial desire of rapid and non-destructive technological development. Current study used optical instrument as a rapid and non-destructive tool to classify 24 h-aged pork longissimus dorsi samples into three kinds of meat (PSE, Normal and DFD), on the basis of color L* and pH24. Total of 66 samples were used in the experiment. Optical system based on Vis/NIR spectral acquisition system (300-1100 nm) was self- developed in laboratory to acquire spectral signal of pork samples. Median smoothing filter (M-filter) and multiplication scatter correction (MSC) was used to remove spectral noise and signal drift. Support vector machine (SVM) prediction model was developed to classify the samples based on their comprehensive qualities. The results showed that the classification model is highly correlated with the actual quality parameters with classification accuracy more than 85%. The system developed in this study being simple and easy to use, results being promising, the system can be used in meat processing industry for real time, non-destructive and rapid detection of pork qualities in future.

  8. Support vector machine for the diagnosis of malignant mesothelioma

    NASA Astrophysics Data System (ADS)

    Ushasukhanya, S.; Nithyakalyani, A.; Sivakumar, V.

    2018-04-01

    Harmful mesothelioma is an illness in which threatening (malignancy) cells shape in the covering of the trunk or stomach area. Being presented to asbestos can influence the danger of threatening mesothelioma. Signs and side effects of threatening mesothelioma incorporate shortness of breath and agony under the rib confine. Tests that inspect within the trunk and belly are utilized to recognize (find) and analyse harmful mesothelioma. Certain elements influence forecast (shot of recuperation) and treatment choices. In this review, Support vector machine (SVM) classifiers were utilized for Mesothelioma sickness conclusion. SVM output is contrasted by concentrating on Mesothelioma’s sickness and findings by utilizing similar information set. The support vector machine algorithm gives 92.5% precision acquired by means of 3-overlap cross-approval. The Mesothelioma illness dataset were taken from an organization reports from Turkey.

  9. Discriminant analysis for fast multiclass data classification through regularized kernel function approximation.

    PubMed

    Ghorai, Santanu; Mukherjee, Anirban; Dutta, Pranab K

    2010-06-01

    In this brief we have proposed the multiclass data classification by computationally inexpensive discriminant analysis through vector-valued regularized kernel function approximation (VVRKFA). VVRKFA being an extension of fast regularized kernel function approximation (FRKFA), provides the vector-valued response at single step. The VVRKFA finds a linear operator and a bias vector by using a reduced kernel that maps a pattern from feature space into the low dimensional label space. The classification of patterns is carried out in this low dimensional label subspace. A test pattern is classified depending on its proximity to class centroids. The effectiveness of the proposed method is experimentally verified and compared with multiclass support vector machine (SVM) on several benchmark data sets as well as on gene microarray data for multi-category cancer classification. The results indicate the significant improvement in both training and testing time compared to that of multiclass SVM with comparable testing accuracy principally in large data sets. Experiments in this brief also serve as comparison of performance of VVRKFA with stratified random sampling and sub-sampling.

  10. On feature augmentation for semantic argument classification of the Quran English translation using support vector machine

    NASA Astrophysics Data System (ADS)

    Khaira Batubara, Dina; Arif Bijaksana, Moch; Adiwijaya

    2018-03-01

    Research on the semantic argument classification requires semantically labeled data in large numbers, called corpus. Because building a corpus is costly and time-consuming, recently many studies have used existing corpus as the training data to conduct semantic argument classification research on new domain. But previous studies have proven that there is a significant decrease in performance when classifying semantic arguments on different domain between the training and the testing data. The main problem is when there is a new argument that found in the testing data but it is not found in the training data. This research carries on semantic argument classification on a new domain that is Quran English Translation by utilizing Propbank corpus as the training data. To recognize the new argument in the training data, this research proposes four new features for extending the argument features in the training data. By using SVM Linear, the experiment has proven that augmenting the proposed features to the baseline system with some combinations option improve the performance of semantic argument classification on Quran data using Propbank Corpus as training data.

  11. Mobile Phonocardiogram Diagnosis in Newborns Using Support Vector Machine

    PubMed Central

    Amiri, Amir Mohammad; Abtahi, Mohammadreza; Constant, Nick; Mankodiya, Kunal

    2017-01-01

    Phonocardiogram (PCG) monitoring on newborns is one of the most important and challenging tasks in the heart assessment in the early ages of life. In this paper, we present a novel approach for cardiac monitoring applied in PCG data. This basic system coupled with denoising, segmentation, cardiac cycle selection and classification of heart sound can be used widely for a large number of the data. This paper describes the problems and additional advantages of the PCG method including the possibility of recording heart sound at home, removing unwanted noises and data reduction on a mobile device, and an intelligent system to diagnose heart diseases on the cloud server. A wide range of physiological features from various analysis domains, including modeling, time/frequency domain analysis, an algorithm, etc., is proposed in order to extract features which will be considered as inputs for the classifier. In order to record the PCG data set from multiple subjects over one year, an electronic stethoscope was used for collecting data that was connected to a mobile device. In this study, we used different types of classifiers in order to distinguish between healthy and pathological heart sounds, and a comparison on the performances revealed that support vector machine (SVM) provides 92.2% accuracy and AUC = 0.98 in a time of 1.14 seconds for training, on a dataset of 116 samples. PMID:28335471

  12. Incremental Support Vector Machine Framework for Visual Sensor Networks

    NASA Astrophysics Data System (ADS)

    Awad, Mariette; Jiang, Xianhua; Motai, Yuichi

    2006-12-01

    Motivated by the emerging requirements of surveillance networks, we present in this paper an incremental multiclassification support vector machine (SVM) technique as a new framework for action classification based on real-time multivideo collected by homogeneous sites. The technique is based on an adaptation of least square SVM (LS-SVM) formulation but extends beyond the static image-based learning of current SVM methodologies. In applying the technique, an initial supervised offline learning phase is followed by a visual behavior data acquisition and an online learning phase during which the cluster head performs an ensemble of model aggregations based on the sensor nodes inputs. The cluster head then selectively switches on designated sensor nodes for future incremental learning. Combining sensor data offers an improvement over single camera sensing especially when the latter has an occluded view of the target object. The optimization involved alleviates the burdens of power consumption and communication bandwidth requirements. The resulting misclassification error rate, the iterative error reduction rate of the proposed incremental learning, and the decision fusion technique prove its validity when applied to visual sensor networks. Furthermore, the enabled online learning allows an adaptive domain knowledge insertion and offers the advantage of reducing both the model training time and the information storage requirements of the overall system which makes it even more attractive for distributed sensor networks communication.

  13. Time-reversal imaging for classification of submerged elastic targets via Gibbs sampling and the Relevance Vector Machine.

    PubMed

    Dasgupta, Nilanjan; Carin, Lawrence

    2005-04-01

    Time-reversal imaging (TRI) is analogous to matched-field processing, although TRI is typically very wideband and is appropriate for subsequent target classification (in addition to localization). Time-reversal techniques, as applied to acoustic target classification, are highly sensitive to channel mismatch. Hence, it is crucial to estimate the channel parameters before time-reversal imaging is performed. The channel-parameter statistics are estimated here by applying a geoacoustic inversion technique based on Gibbs sampling. The maximum a posteriori (MAP) estimate of the channel parameters are then used to perform time-reversal imaging. Time-reversal implementation requires a fast forward model, implemented here by a normal-mode framework. In addition to imaging, extraction of features from the time-reversed images is explored, with these applied to subsequent target classification. The classification of time-reversed signatures is performed by the relevance vector machine (RVM). The efficacy of the technique is analyzed on simulated in-channel data generated by a free-field finite element method (FEM) code, in conjunction with a channel propagation model, wherein the final classification performance is demonstrated to be relatively insensitive to the associated channel parameters. The underlying theory of Gibbs sampling and TRI are presented along with the feature extraction and target classification via the RVM.

  14. a Hyperspectral Image Classification Method Using Isomap and Rvm

    NASA Astrophysics Data System (ADS)

    Chang, H.; Wang, T.; Fang, H.; Su, Y.

    2018-04-01

    Classification is one of the most significant applications of hyperspectral image processing and even remote sensing. Though various algorithms have been proposed to implement and improve this application, there are still drawbacks in traditional classification methods. Thus further investigations on some aspects, such as dimension reduction, data mining, and rational use of spatial information, should be developed. In this paper, we used a widely utilized global manifold learning approach, isometric feature mapping (ISOMAP), to address the intrinsic nonlinearities of hyperspectral image for dimension reduction. Considering the impropriety of Euclidean distance in spectral measurement, we applied spectral angle (SA) for substitute when constructed the neighbourhood graph. Then, relevance vector machines (RVM) was introduced to implement classification instead of support vector machines (SVM) for simplicity, generalization and sparsity. Therefore, a probability result could be obtained rather than a less convincing binary result. Moreover, taking into account the spatial information of the hyperspectral image, we employ a spatial vector formed by different classes' ratios around the pixel. At last, we combined the probability results and spatial factors with a criterion to decide the final classification result. To verify the proposed method, we have implemented multiple experiments with standard hyperspectral images compared with some other methods. The results and different evaluation indexes illustrated the effectiveness of our method.

  15. A discrete wavelet based feature extraction and hybrid classification technique for microarray data analysis.

    PubMed

    Bennet, Jaison; Ganaprakasam, Chilambuchelvan Arul; Arputharaj, Kannan

    2014-01-01

    Cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability in olden days. The recent arrival of DNA microarray technology has led to the concurrent monitoring of thousands of gene expressions in a single chip which stimulates the progress in cancer classification. In this paper, we have proposed a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role and a feature selection technique which combines discrete wavelet transform (DWT) and moving window technique (MWT) is used. The performance of the proposed method is compared with the conventional classifiers like support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This paper serves as an automated system for the classification of cancer and can be applied by doctors in real cases which serve as a boon to the medical community. This work further reduces the misclassification of cancers which is highly not allowed in cancer detection.

  16. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback

    PubMed Central

    Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of the geographic information discovery keeps improving., There are however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery. PMID:27861505

  17. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback.

    PubMed

    Hu, Kai; Gui, Zhipeng; Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of the geographic information discovery keeps improving., There are however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery.

  18. Prediction of plant pre-microRNAs and their microRNAs in genome-scale sequences using structure-sequence features and support vector machine.

    PubMed

    Meng, Jun; Liu, Dong; Sun, Chao; Luan, Yushi

    2014-12-30

    MicroRNAs (miRNAs) are a family of non-coding RNAs approximately 21 nucleotides in length that play pivotal roles at the post-transcriptional level in animals, plants and viruses. These molecules silence their target genes by degrading transcription or suppressing translation. Studies have shown that miRNAs are involved in biological responses to a variety of biotic and abiotic stresses. Identification of these molecules and their targets can aid the understanding of regulatory processes. Recently, prediction methods based on machine learning have been widely used for miRNA prediction. However, most of these methods were designed for mammalian miRNA prediction, and few are available for predicting miRNAs in the pre-miRNAs of specific plant species. Although the complete Solanum lycopersicum genome has been published, only 77 Solanum lycopersicum miRNAs have been identified, far less than the estimated number. Therefore, it is essential to develop a prediction method based on machine learning to identify new plant miRNAs. A novel classification model based on a support vector machine (SVM) was trained to identify real and pseudo plant pre-miRNAs together with their miRNAs. An initial set of 152 novel features related to sequential structures was used to train the model. By applying feature selection, we obtained the best subset of 47 features for use with the Back Support Vector Machine-Recursive Feature Elimination (B-SVM-RFE) method for the classification of plant pre-miRNAs. Using this method, 63 features were obtained for plant miRNA classification. We then developed an integrated classification model, miPlantPreMat, which comprises MiPlantPre and MiPlantMat, to identify plant pre-miRNAs and their miRNAs. This model achieved approximately 90% accuracy using plant datasets from nine plant species, including Arabidopsis thaliana, Glycine max, Oryza sativa, Physcomitrella patens, Medicago truncatula, Sorghum bicolor, Arabidopsis lyrata, Zea mays and Solanum

  19. Fuzzy support vector machines for adaptive Morse code recognition.

    PubMed

    Yang, Cheng-Hong; Jin, Li-Cheng; Chuang, Li-Yeh

    2006-11-01

    Morse code is now being harnessed for use in rehabilitation applications of augmentative-alternative communication and assistive technology, facilitating mobility, environmental control and adapted worksite access. In this paper, Morse code is selected as a communication adaptive device for persons who suffer from muscle atrophy, cerebral palsy or other severe handicaps. A stable typing rate is strictly required for Morse code to be effective as a communication tool. Therefore, an adaptive automatic recognition method with a high recognition rate is needed. The proposed system uses both fuzzy support vector machines and the variable-degree variable-step-size least-mean-square algorithm to achieve these objectives. We apply fuzzy memberships to each point, and provide different contributions to the decision learning function for support vector machines. Statistical analyses demonstrated that the proposed method elicited a higher recognition rate than other algorithms in the literature.

  20. 21 CFR 860.93 - Classification of implants, life-supporting or life-sustaining devices.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 21 Food and Drugs 8 2011-04-01 2011-04-01 false Classification of implants, life-supporting or life-sustaining devices. 860.93 Section 860.93 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT... Classification § 860.93 Classification of implants, life-supporting or life-sustaining devices. (a) The...

  1. 21 CFR 860.93 - Classification of implants, life-supporting or life-sustaining devices.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Classification of implants, life-supporting or life-sustaining devices. 860.93 Section 860.93 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT... Classification § 860.93 Classification of implants, life-supporting or life-sustaining devices. (a) The...

  2. 21 CFR 860.93 - Classification of implants, life-supporting or life-sustaining devices.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 21 Food and Drugs 8 2012-04-01 2012-04-01 false Classification of implants, life-supporting or life-sustaining devices. 860.93 Section 860.93 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT... Classification § 860.93 Classification of implants, life-supporting or life-sustaining devices. (a) The...

  3. 21 CFR 860.93 - Classification of implants, life-supporting or life-sustaining devices.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 21 Food and Drugs 8 2013-04-01 2013-04-01 false Classification of implants, life-supporting or life-sustaining devices. 860.93 Section 860.93 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT... Classification § 860.93 Classification of implants, life-supporting or life-sustaining devices. (a) The...

  4. 21 CFR 860.93 - Classification of implants, life-supporting or life-sustaining devices.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 21 Food and Drugs 8 2014-04-01 2014-04-01 false Classification of implants, life-supporting or life-sustaining devices. 860.93 Section 860.93 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT... Classification § 860.93 Classification of implants, life-supporting or life-sustaining devices. (a) The...

  5. Lysine acetylation sites prediction using an ensemble of support vector machine classifiers.

    PubMed

    Xu, Yan; Wang, Xiao-Bo; Ding, Jun; Wu, Ling-Yun; Deng, Nai-Yang

    2010-05-07

    Lysine acetylation is an essentially reversible and high regulated post-translational modification which regulates diverse protein properties. Experimental identification of acetylation sites is laborious and expensive. Hence, there is significant interest in the development of computational methods for reliable prediction of acetylation sites from amino acid sequences. In this paper we use an ensemble of support vector machine classifiers to perform this work. The experimentally determined acetylation lysine sites are extracted from Swiss-Prot database and scientific literatures. Experiment results show that an ensemble of support vector machine classifiers outperforms single support vector machine classifier and other computational methods such as PAIL and LysAcet on the problem of predicting acetylation lysine sites. The resulting method has been implemented in EnsemblePail, a web server for lysine acetylation sites prediction available at http://www.aporc.org/EnsemblePail/. Copyright (c) 2010 Elsevier Ltd. All rights reserved.

  6. A Wireless Electronic Nose System Using a Fe2O3 Gas Sensing Array and Least Squares Support Vector Regression

    PubMed Central

    Song, Kai; Wang, Qi; Liu, Qi; Zhang, Hongquan; Cheng, Yingguo

    2011-01-01

    This paper describes the design and implementation of a wireless electronic nose (WEN) system which can online detect the combustible gases methane and hydrogen (CH4/H2) and estimate their concentrations, either singly or in mixtures. The system is composed of two wireless sensor nodes—a slave node and a master node. The former comprises a Fe2O3 gas sensing array for the combustible gas detection, a digital signal processor (DSP) system for real-time sampling and processing the sensor array data and a wireless transceiver unit (WTU) by which the detection results can be transmitted to the master node connected with a computer. A type of Fe2O3 gas sensor insensitive to humidity is developed for resistance to environmental influences. A threshold-based least square support vector regression (LS-SVR)estimator is implemented on a DSP for classification and concentration measurements. Experimental results confirm that LS-SVR produces higher accuracy compared with artificial neural networks (ANNs) and a faster convergence rate than the standard support vector regression (SVR). The designed WEN system effectively achieves gas mixture analysis in a real-time process. PMID:22346587

  7. Classification of bifurcations regions in IVOCT images using support vector machine and artificial neural network models

    NASA Astrophysics Data System (ADS)

    Porto, C. D. N.; Costa Filho, C. F. F.; Macedo, M. M. G.; Gutierrez, M. A.; Costa, M. G. F.

    2017-03-01

    Studies in intravascular optical coherence tomography (IV-OCT) have demonstrated the importance of coronary bifurcation regions in intravascular medical imaging analysis, as plaques are more likely to accumulate in this region leading to coronary disease. A typical IV-OCT pullback acquires hundreds of frames, thus developing an automated tool to classify the OCT frames as bifurcation or non-bifurcation can be an important step to speed up OCT pullbacks analysis and assist automated methods for atherosclerotic plaque quantification. In this work, we evaluate the performance of two state-of-the-art classifiers, SVM and Neural Networks in the bifurcation classification task. The study included IV-OCT frames from 9 patients. In order to improve classification performance, we trained and tested the SVM with different parameters by means of a grid search and different stop criteria were applied to the Neural Network classifier: mean square error, early stop and regularization. Different sets of features were tested, using feature selection techniques: PCA, LDA and scalar feature selection with correlation. Training and test were performed in sets with a maximum of 1460 OCT frames. We quantified our results in terms of false positive rate, true positive rate, accuracy, specificity, precision, false alarm, f-measure and area under ROC curve. Neural networks obtained the best classification accuracy, 98.83%, overcoming the results found in literature. Our methods appear to offer a robust and reliable automated classification of OCT frames that might assist physicians indicating potential frames to analyze. Methods for improving neural networks generalization have increased the classification performance.

  8. Classification of radiological errors in chest radiographs, using support vector machine on the spatial frequency features of false- negative and false-positive regions

    NASA Astrophysics Data System (ADS)

    Pietrzyk, Mariusz W.; Donovan, Tim; Brennan, Patrick C.; Dix, Alan; Manning, David J.

    2011-03-01

    Aim: To optimize automated classification of radiological errors during lung nodule detection from chest radiographs (CxR) using a support vector machine (SVM) run on the spatial frequency features extracted from the local background of selected regions. Background: The majority of the unreported pulmonary nodules are visually detected but not recognized; shown by the prolonged dwell time values at false-negative regions. Similarly, overestimated nodule locations are capturing substantial amounts of foveal attention. Spatial frequency properties of selected local backgrounds are correlated with human observer responses either in terms of accuracy in indicating abnormality position or in the precision of visual sampling the medical images. Methods: Seven radiologists participated in the eye tracking experiments conducted under conditions of pulmonary nodule detection from a set of 20 postero-anterior CxR. The most dwelled locations have been identified and subjected to spatial frequency (SF) analysis. The image-based features of selected ROI were extracted with un-decimated Wavelet Packet Transform. An analysis of variance was run to select SF features and a SVM schema was implemented to classify False-Negative and False-Positive from all ROI. Results: A relative high overall accuracy was obtained for each individually developed Wavelet-SVM algorithm, with over 90% average correct ratio for errors recognition from all prolonged dwell locations. Conclusion: The preliminary results show that combined eye-tracking and image-based features can be used for automated detection of radiological error with SVM. The work is still in progress and not all analytical procedures have been completed, which might have an effect on the specificity of the algorithm.

  9. Support Vector Machines: Relevance Feedback and Information Retrieval.

    ERIC Educational Resources Information Center

    Drucker, Harris; Shahrary, Behzad; Gibbon, David C.

    2002-01-01

    Compares support vector machines (SVMs) to Rocchio, Ide regular and Ide dec-hi algorithms in information retrieval (IR) of text documents using relevancy feedback. If the preliminary search is so poor that one has to search through many documents to find at least one relevant document, then SVM is preferred. Includes nine tables. (Contains 24…

  10. nu-Anomica: A Fast Support Vector Based Novelty Detection Technique

    NASA Technical Reports Server (NTRS)

    Das, Santanu; Bhaduri, Kanishka; Oza, Nikunj C.; Srivastava, Ashok N.

    2009-01-01

    In this paper we propose nu-Anomica, a novel anomaly detection technique that can be trained on huge data sets with much reduced running time compared to the benchmark one-class Support Vector Machines algorithm. In -Anomica, the idea is to train the machine such that it can provide a close approximation to the exact decision plane using fewer training points and without losing much of the generalization performance of the classical approach. We have tested the proposed algorithm on a variety of continuous data sets under different conditions. We show that under all test conditions the developed procedure closely preserves the accuracy of standard one-class Support Vector Machines while reducing both the training time and the test time by 5 - 20 times.

  11. Objective Auscultation of TCM Based on Wavelet Packet Fractal Dimension and Support Vector Machine.

    PubMed

    Yan, Jian-Jun; Guo, Rui; Wang, Yi-Qin; Liu, Guo-Ping; Yan, Hai-Xia; Xia, Chun-Ming; Shen, Xiaojing

    2014-01-01

    This study was conducted to illustrate that auscultation features based on the fractal dimension combined with wavelet packet transform (WPT) were conducive to the identification the pattern of syndromes of Traditional Chinese Medicine (TCM). The WPT and the fractal dimension were employed to extract features of auscultation signals of 137 patients with lung Qi-deficient pattern, 49 patients with lung Yin-deficient pattern, and 43 healthy subjects. With these features, the classification model was constructed based on multiclass support vector machine (SVM). When all auscultation signals were trained by SVM to decide the patterns of TCM syndromes, the overall recognition rate of model was 79.49%; when male and female auscultation signals were trained, respectively, to decide the patterns, the overall recognition rate of model reached 86.05%. The results showed that the methods proposed in this paper were effective to analyze auscultation signals, and the performance of model can be greatly improved when the distinction of gender was considered.

  12. Objective Auscultation of TCM Based on Wavelet Packet Fractal Dimension and Support Vector Machine

    PubMed Central

    Yan, Jian-Jun; Wang, Yi-Qin; Liu, Guo-Ping; Yan, Hai-Xia; Xia, Chun-Ming; Shen, Xiaojing

    2014-01-01

    This study was conducted to illustrate that auscultation features based on the fractal dimension combined with wavelet packet transform (WPT) were conducive to the identification the pattern of syndromes of Traditional Chinese Medicine (TCM). The WPT and the fractal dimension were employed to extract features of auscultation signals of 137 patients with lung Qi-deficient pattern, 49 patients with lung Yin-deficient pattern, and 43 healthy subjects. With these features, the classification model was constructed based on multiclass support vector machine (SVM). When all auscultation signals were trained by SVM to decide the patterns of TCM syndromes, the overall recognition rate of model was 79.49%; when male and female auscultation signals were trained, respectively, to decide the patterns, the overall recognition rate of model reached 86.05%. The results showed that the methods proposed in this paper were effective to analyze auscultation signals, and the performance of model can be greatly improved when the distinction of gender was considered. PMID:24883068

  13. Support Vector Machines to improve physiologic hot flash measures: application to the ambulatory setting.

    PubMed

    Thurston, Rebecca C; Hernandez, Javier; Del Rio, Jose M; De La Torre, Fernando

    2011-07-01

    Most midlife women have hot flashes. The conventional criterion (≥2 μmho rise/30 s) for classifying hot flashes physiologically has shown poor performance. We improved this performance in the laboratory with Support Vector Machines (SVMs), a pattern classification method. We aimed to compare conventional to SVM methods to classify hot flashes in the ambulatory setting. Thirty-one women with hot flashes underwent 24 h of ambulatory sternal skin conductance monitoring. Hot flashes were quantified with conventional (≥2 μmho/30 s) and SVM methods. Conventional methods had low sensitivity (sensitivity=.57, specificity=.98, positive predictive value (PPV)=.91, negative predictive value (NPV)=.90, F1=.60), with performance lower with higher body mass index (BMI). SVMs improved this performance (sensitivity=.87, specificity=.97, PPV=.90, NPV=.96, F1=.88) and reduced BMI variation. SVMs can improve ambulatory physiologic hot flash measures. Copyright © 2010 Society for Psychophysiological Research.

  14. High-order distance-based multiview stochastic learning in image classification.

    PubMed

    Yu, Jun; Rui, Yong; Tang, Yuan Yan; Tao, Dacheng

    2014-12-01

    How do we find all images in a larger set of images which have a specific content? Or estimate the position of a specific object relative to the camera? Image classification methods, like support vector machine (supervised) and transductive support vector machine (semi-supervised), are invaluable tools for the applications of content-based image retrieval, pose estimation, and optical character recognition. However, these methods only can handle the images represented by single feature. In many cases, different features (or multiview data) can be obtained, and how to efficiently utilize them is a challenge. It is inappropriate for the traditionally concatenating schema to link features of different views into a long vector. The reason is each view has its specific statistical property and physical interpretation. In this paper, we propose a high-order distance-based multiview stochastic learning (HD-MSL) method for image classification. HD-MSL effectively combines varied features into a unified representation and integrates the labeling information based on a probabilistic framework. In comparison with the existing strategies, our approach adopts the high-order distance obtained from the hypergraph to replace pairwise distance in estimating the probability matrix of data distribution. In addition, the proposed approach can automatically learn a combination coefficient for each view, which plays an important role in utilizing the complementary information of multiview data. An alternative optimization is designed to solve the objective functions of HD-MSL and obtain different views on coefficients and classification scores simultaneously. Experiments on two real world datasets demonstrate the effectiveness of HD-MSL in image classification.

  15. a Gsa-Svm Hybrid System for Classification of Binary Problems

    NASA Astrophysics Data System (ADS)

    Sarafrazi, Soroor; Nezamabadi-pour, Hossein; Barahman, Mojgan

    2011-06-01

    This paperhybridizesgravitational search algorithm (GSA) with support vector machine (SVM) and made a novel GSA-SVM hybrid system to improve the classification accuracy in binary problems. GSA is an optimization heuristic toolused to optimize the value of SVM kernel parameter (in this paper, radial basis function (RBF) is chosen as the kernel function). The experimental results show that this newapproach can achieve high classification accuracy and is comparable to or better than the particle swarm optimization (PSO)-SVM and genetic algorithm (GA)-SVM, which are two hybrid systems for classification.

  16. Identifying saltcedar with hyperspectral data and support vector machines

    USDA-ARS?s Scientific Manuscript database

    Saltcedar (Tamarix spp.) are a group of dense phreatophytic shrubs and trees that are invasive to riparian areas throughout the United States. This study determined the feasibility of using hyperspectral data and a support vector machine (SVM) classifier to discriminate saltcedar from other cover t...

  17. Efficient brain lesion segmentation using multi-modality tissue-based feature selection and support vector machines.

    PubMed

    Fiot, Jean-Baptiste; Cohen, Laurent D; Raniga, Parnesh; Fripp, Jurgen

    2013-09-01

    Support vector machines (SVM) are machine learning techniques that have been used for segmentation and classification of medical images, including segmentation of white matter hyper-intensities (WMH). Current approaches using SVM for WMH segmentation extract features from the brain and classify these followed by complex post-processing steps to remove false positives. The method presented in this paper combines advanced pre-processing, tissue-based feature selection and SVM classification to obtain efficient and accurate WMH segmentation. Features from 125 patients, generated from up to four MR modalities [T1-w, T2-w, proton-density and fluid attenuated inversion recovery(FLAIR)], differing neighbourhood sizes and the use of multi-scale features were compared. We found that although using all four modalities gave the best overall classification (average Dice scores of 0.54  ±  0.12, 0.72  ±  0.06 and 0.82  ±  0.06 respectively for small, moderate and severe lesion loads); this was not significantly different (p = 0.50) from using just T1-w and FLAIR sequences (Dice scores of 0.52  ±  0.13, 0.71  ±  0.08 and 0.81  ±  0.07). Furthermore, there was a negligible difference between using 5 × 5 × 5 and 3 × 3 × 3 features (p = 0.93). Finally, we show that careful consideration of features and pre-processing techniques not only saves storage space and computation time but also leads to more efficient classification, which outperforms the one based on all features with post-processing. Copyright © 2013 John Wiley & Sons, Ltd.

  18. [Support vector machine?assisted diagnosis of human malignant gastric tissues based on dielectric properties].

    PubMed

    Zhang, Sa; Li, Zhou; Xin, Xue-Gang

    2017-12-20

    To achieve differential diagnosis of normal and malignant gastric tissues based on discrepancies in their dielectric properties using support vector machine. The dielectric properties of normal and malignant gastric tissues at the frequency ranging from 42.58 to 500 MHz were measured by coaxial probe method, and the Cole?Cole model was used to fit the measured data. Receiver?operating characteristic (ROC) curve analysis was used to evaluate the discrimination capability with respect to permittivity, conductivity, and Cole?Cole fitting parameters. Support vector machine was used for discriminating normal and malignant gastric tissues, and the discrimination accuracy was calculated using k?fold cross? The area under the ROC curve was above 0.8 for permittivity at the 5 frequencies at the lower end of the measured frequency range. The combination of the support vector machine with the permittivity at all these 5 frequencies combined achieved the highest discrimination accuracy of 84.38% with a MATLAB runtime of 3.40 s. The support vector machine?assisted diagnosis is feasible for human malignant gastric tissues based on the dielectric properties.

  19. Change detection and classification in brain MR images using change vector analysis.

    PubMed

    Simões, Rita; Slump, Cornelis

    2011-01-01

    The automatic detection of longitudinal changes in brain images is valuable in the assessment of disease evolution and treatment efficacy. Most existing change detection methods that are currently used in clinical research to monitor patients suffering from neurodegenerative diseases--such as Alzheimer's--focus on large-scale brain deformations. However, such patients often have other brain impairments, such as infarcts, white matter lesions and hemorrhages, which are typically overlooked by the deformation-based methods. Other unsupervised change detection algorithms have been proposed to detect tissue intensity changes. The outcome of these methods is typically a binary change map, which identifies changed brain regions. However, understanding what types of changes these regions underwent is likely to provide equally important information about lesion evolution. In this paper, we present an unsupervised 3D change detection method based on Change Vector Analysis. We compute and automatically threshold the Generalized Likelihood Ratio map to obtain a binary change map. Subsequently, we perform histogram-based clustering to classify the change vectors. We obtain a Kappa Index of 0.82 using various types of simulated lesions. The classification error is 2%. Finally, we are able to detect and discriminate both small changes and ventricle expansions in datasets from Mild Cognitive Impairment patients.

  20. Pattern Recognition Application of Support Vector Machine for Fault Classification of Thyristor Controlled Series Compensated Transmission Lines

    NASA Astrophysics Data System (ADS)

    Yashvantrai Vyas, Bhargav; Maheshwari, Rudra Prakash; Das, Biswarup

    2016-06-01

    Application of series compensation in extra high voltage (EHV) transmission line makes the protection job difficult for engineers, due to alteration in system parameters and measurements. The problem amplifies with inclusion of electronically controlled compensation like thyristor controlled series compensation (TCSC) as it produce harmonics and rapid change in system parameters during fault associated with TCSC control. This paper presents a pattern recognition based fault type identification approach with support vector machine. The scheme uses only half cycle post fault data of three phase currents to accomplish the task. The change in current signal features during fault has been considered as discriminatory measure. The developed scheme in this paper is tested over a large set of fault data with variation in system and fault parameters. These fault cases have been generated with PSCAD/EMTDC on a 400 kV, 300 km transmission line model. The developed algorithm has proved better for implementation on TCSC compensated line with its improved accuracy and speed.

  1. Improved RMR Rock Mass Classification Using Artificial Intelligence Algorithms

    NASA Astrophysics Data System (ADS)

    Gholami, Raoof; Rasouli, Vamegh; Alimoradi, Andisheh

    2013-09-01

    Rock mass classification systems such as rock mass rating (RMR) are very reliable means to provide information about the quality of rocks surrounding a structure as well as to propose suitable support systems for unstable regions. Many correlations have been proposed to relate measured quantities such as wave velocity to rock mass classification systems to limit the associated time and cost of conducting the sampling and mechanical tests conventionally used to calculate RMR values. However, these empirical correlations have been found to be unreliable, as they usually overestimate or underestimate the RMR value. The aim of this paper is to compare the results of RMR classification obtained from the use of empirical correlations versus machine-learning methodologies based on artificial intelligence algorithms. The proposed methods were verified based on two case studies located in northern Iran. Relevance vector regression (RVR) and support vector regression (SVR), as two robust machine-learning methodologies, were used to predict the RMR for tunnel host rocks. RMR values already obtained by sampling and site investigation at one tunnel were taken into account as the output of the artificial networks during training and testing phases. The results reveal that use of empirical correlations overestimates the predicted RMR values. RVR and SVR, however, showed more reliable results, and are therefore suggested for use in RMR classification for design purposes of rock structures.

  2. Online least squares one-class support vector machines-based abnormal visual event detection.

    PubMed

    Wang, Tian; Chen, Jie; Zhou, Yi; Snoussi, Hichem

    2013-12-12

    The abnormal event detection problem is an important subject in real-time video surveillance. In this paper, we propose a novel online one-class classification algorithm, online least squares one-class support vector machine (online LS-OC-SVM), combined with its sparsified version (sparse online LS-OC-SVM). LS-OC-SVM extracts a hyperplane as an optimal description of training objects in a regularized least squares sense. The online LS-OC-SVM learns a training set with a limited number of samples to provide a basic normal model, then updates the model through remaining data. In the sparse online scheme, the model complexity is controlled by the coherence criterion. The online LS-OC-SVM is adopted to handle the abnormal event detection problem. Each frame of the video is characterized by the covariance matrix descriptor encoding the moving information, then is classified into a normal or an abnormal frame. Experiments are conducted, on a two-dimensional synthetic distribution dataset and a benchmark video surveillance dataset, to demonstrate the promising results of the proposed online LS-OC-SVM method.

  3. Online Least Squares One-Class Support Vector Machines-Based Abnormal Visual Event Detection

    PubMed Central

    Wang, Tian; Chen, Jie; Zhou, Yi; Snoussi, Hichem

    2013-01-01

    The abnormal event detection problem is an important subject in real-time video surveillance. In this paper, we propose a novel online one-class classification algorithm, online least squares one-class support vector machine (online LS-OC-SVM), combined with its sparsified version (sparse online LS-OC-SVM). LS-OC-SVM extracts a hyperplane as an optimal description of training objects in a regularized least squares sense. The online LS-OC-SVM learns a training set with a limited number of samples to provide a basic normal model, then updates the model through remaining data. In the sparse online scheme, the model complexity is controlled by the coherence criterion. The online LS-OC-SVM is adopted to handle the abnormal event detection problem. Each frame of the video is characterized by the covariance matrix descriptor encoding the moving information, then is classified into a normal or an abnormal frame. Experiments are conducted, on a two-dimensional synthetic distribution dataset and a benchmark video surveillance dataset, to demonstrate the promising results of the proposed online LS-OC-SVM method. PMID:24351629

  4. A new range-free localisation in wireless sensor networks using support vector machine

    NASA Astrophysics Data System (ADS)

    Wang, Zengfeng; Zhang, Hao; Lu, Tingting; Sun, Yujuan; Liu, Xing

    2018-02-01

    Location information of sensor nodes is of vital importance for most applications in wireless sensor networks (WSNs). This paper proposes a new range-free localisation algorithm using support vector machine (SVM) and polar coordinate system (PCS), LSVM-PCS. In LSVM-PCS, two sets of classes are first constructed based on sensor nodes' polar coordinates. Using the boundaries of the defined classes, the operation region of WSN field is partitioned into a finite number of polar grids. Each sensor node can be localised into one of the polar grids by executing two localisation algorithms that are developed on the basis of SVM classification. The centre of the resident polar grid is then estimated as the location of the sensor node. In addition, a two-hop mass-spring optimisation (THMSO) is also proposed to further improve the localisation accuracy of LSVM-PCS. In THMSO, both neighbourhood information and non-neighbourhood information are used to refine the sensor node location. The results obtained verify that the proposed algorithm provides a significant improvement over existing localisation methods.

  5. Improved Hierarchical Optimization-Based Classification of Hyperspectral Images Using Shape Analysis

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Tilton, James C.

    2012-01-01

    A new spectral-spatial method for classification of hyperspectral images is proposed. The HSegClas method is based on the integration of probabilistic classification and shape analysis within the hierarchical step-wise optimization algorithm. First, probabilistic support vector machines classification is applied. Then, at each iteration two neighboring regions with the smallest Dissimilarity Criterion (DC) are merged, and classification probabilities are recomputed. The important contribution of this work consists in estimating a DC between regions as a function of statistical, classification and geometrical (area and rectangularity) features. Experimental results are presented on a 102-band ROSIS image of the Center of Pavia, Italy. The developed approach yields more accurate classification results when compared to previously proposed methods.

  6. Segmentation of HER2 protein overexpression in immunohistochemically stained breast cancer images using Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Pezoa, Raquel; Salinas, Luis; Torres, Claudio; Härtel, Steffen; Maureira-Fredes, Cristián; Arce, Paola

    2016-10-01

    Breast cancer is one of the most common cancers in women worldwide. Patient therapy is widely supported by analysis of immunohistochemically (IHC) stained tissue sections. In particular, the analysis of HER2 overexpression by immunohistochemistry helps to determine when patients are suitable to HER2-targeted treatment. Computational HER2 overexpression analysis is still an open problem and a challenging task principally because of the variability of immunohistochemistry tissue samples and the subjectivity of the specialists to assess the samples. In addition, the immunohistochemistry process can produce diverse artifacts that difficult the HER2 overexpression assessment. In this paper we study the segmentation of HER2 overexpression in IHC stained breast cancer tissue images using a support vector machine (SVM) classifier. We asses the SVM performance using diverse color and texture pixel-level features including the RGB, CMYK, HSV, CIE L*a*b* color spaces, color deconvolution filter and Haralick features. We measure classification performance for three datasets containing a total of 153 IHC images that were previously labeled by a pathologist.

  7. An unbalanced spectra classification method based on entropy

    NASA Astrophysics Data System (ADS)

    Liu, Zhong-bao; Zhao, Wen-juan

    2017-05-01

    How to solve the problem of distinguishing the minority spectra from the majority of the spectra is quite important in astronomy. In view of this, an unbalanced spectra classification method based on entropy (USCM) is proposed in this paper to deal with the unbalanced spectra classification problem. USCM greatly improves the performances of the traditional classifiers on distinguishing the minority spectra as it takes the data distribution into consideration in the process of classification. However, its time complexity is exponential with the training size, and therefore, it can only deal with the problem of small- and medium-scale classification. How to solve the large-scale classification problem is quite important to USCM. It can be easily obtained by mathematical computation that the dual form of USCM is equivalent to the minimum enclosing ball (MEB), and core vector machine (CVM) is introduced, USCM based on CVM is proposed to deal with the large-scale classification problem. Several comparative experiments on the 4 subclasses of K-type spectra, 3 subclasses of F-type spectra and 3 subclasses of G-type spectra from Sloan Digital Sky Survey (SDSS) verify USCM and USCM based on CVM perform better than kNN (k nearest neighbor) and SVM (support vector machine) in dealing with the problem of rare spectra mining respectively on the small- and medium-scale datasets and the large-scale datasets.

  8. Hyperspectral imaging with wavelet transform for classification of colon tissue biopsy samples

    NASA Astrophysics Data System (ADS)

    Masood, Khalid

    2008-08-01

    Automatic classification of medical images is a part of our computerised medical imaging programme to support the pathologists in their diagnosis. Hyperspectral data has found its applications in medical imagery. Its usage is increasing significantly in biopsy analysis of medical images. In this paper, we present a histopathological analysis for the classification of colon biopsy samples into benign and malignant classes. The proposed study is based on comparison between 3D spectral/spatial analysis and 2D spatial analysis. Wavelet textural features in the wavelet domain are used in both these approaches for classification of colon biopsy samples. Experimental results indicate that the incorporation of wavelet textural features using a support vector machine, in 2D spatial analysis, achieve best classification accuracy.

  9. Support vector machines classifiers of physical activities in preschoolers

    USDA-ARS?s Scientific Manuscript database

    The goal of this study is to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) in classifying preschool-aged children physical activity data acquired from an accelerometer. In this study, 69 children aged 3-5 years old were asked to participate in a s...

  10. Interpreting linear support vector machine models with heat map molecule coloring

    PubMed Central

    2011-01-01

    Background Model-based virtual screening plays an important role in the early drug discovery stage. The outcomes of high-throughput screenings are a valuable source for machine learning algorithms to infer such models. Besides a strong performance, the interpretability of a machine learning model is a desired property to guide the optimization of a compound in later drug discovery stages. Linear support vector machines showed to have a convincing performance on large-scale data sets. The goal of this study is to present a heat map molecule coloring technique to interpret linear support vector machine models. Based on the weights of a linear model, the visualization approach colors each atom and bond of a compound according to its importance for activity. Results We evaluated our approach on a toxicity data set, a chromosome aberration data set, and the maximum unbiased validation data sets. The experiments show that our method sensibly visualizes structure-property and structure-activity relationships of a linear support vector machine model. The coloring of ligands in the binding pocket of several crystal structures of a maximum unbiased validation data set target indicates that our approach assists to determine the correct ligand orientation in the binding pocket. Additionally, the heat map coloring enables the identification of substructures important for the binding of an inhibitor. Conclusions In combination with heat map coloring, linear support vector machine models can help to guide the modification of a compound in later stages of drug discovery. Particularly substructures identified as important by our method might be a starting point for optimization of a lead compound. The heat map coloring should be considered as complementary to structure based modeling approaches. As such, it helps to get a better understanding of the binding mode of an inhibitor. PMID:21439031

  11. Best Merge Region Growing with Integrated Probabilistic Classification for Hyperspectral Imagery

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Tilton, James C.

    2011-01-01

    A new method for spectral-spatial classification of hyperspectral images is proposed. The method is based on the integration of probabilistic classification within the hierarchical best merge region growing algorithm. For this purpose, preliminary probabilistic support vector machines classification is performed. Then, hierarchical step-wise optimization algorithm is applied, by iteratively merging regions with the smallest Dissimilarity Criterion (DC). The main novelty of this method consists in defining a DC between regions as a function of region statistical and geometrical features along with classification probabilities. Experimental results are presented on a 200-band AVIRIS image of the Northwestern Indiana s vegetation area and compared with those obtained by recently proposed spectral-spatial classification techniques. The proposed method improves classification accuracies when compared to other classification approaches.

  12. Structural analysis of online handwritten mathematical symbols based on support vector machines

    NASA Astrophysics Data System (ADS)

    Simistira, Foteini; Papavassiliou, Vassilis; Katsouros, Vassilis; Carayannis, George

    2013-01-01

    Mathematical expression recognition is still a very challenging task for the research community mainly because of the two-dimensional (2d) structure of mathematical expressions (MEs). In this paper, we present a novel approach for the structural analysis between two on-line handwritten mathematical symbols of a ME, based on spatial features of the symbols. We introduce six features to represent the spatial affinity of the symbols and compare two multi-class classification methods that employ support vector machines (SVMs): one based on the "one-against-one" technique and one based on the "one-against-all", in identifying the relation between a pair of symbols (i.e. subscript, numerator, etc). A dataset containing 1906 spatial relations derived from the Competition on Recognition of Online Handwritten Mathematical Expressions (CROHME) 2012 training dataset is constructed to evaluate the classifiers and compare them with the rule-based classifier of the ILSP-1 system participated in the contest. The experimental results give an overall mean error rate of 2.61% for the "one-against-one" SVM approach, 6.57% for the "one-against-all" SVM technique and 12.31% error rate for the ILSP-1 classifier.

  13. LDA boost classification: boosting by topics

    NASA Astrophysics Data System (ADS)

    Lei, La; Qiao, Guo; Qimin, Cao; Qitao, Li

    2012-12-01

    AdaBoost is an efficacious classification algorithm especially in text categorization (TC) tasks. The methodology of setting up a classifier committee and voting on the documents for classification can achieve high categorization precision. However, traditional Vector Space Model can easily lead to the curse of dimensionality and feature sparsity problems; so it affects classification performance seriously. This article proposed a novel classification algorithm called LDABoost based on boosting ideology which uses Latent Dirichlet Allocation (LDA) to modeling the feature space. Instead of using words or phrase, LDABoost use latent topics as the features. In this way, the feature dimension is significantly reduced. Improved Naïve Bayes (NB) is designed as the weaker classifier which keeps the efficiency advantage of classic NB algorithm and has higher precision. Moreover, a two-stage iterative weighted method called Cute Integration in this article is proposed for improving the accuracy by integrating weak classifiers into strong classifier in a more rational way. Mutual Information is used as metrics of weights allocation. The voting information and the categorization decision made by basis classifiers are fully utilized for generating the strong classifier. Experimental results reveals LDABoost making categorization in a low-dimensional space, it has higher accuracy than traditional AdaBoost algorithms and many other classic classification algorithms. Moreover, its runtime consumption is lower than different versions of AdaBoost, TC algorithms based on support vector machine and Neural Networks.

  14. Bayesian data assimilation provides rapid decision support for vector-borne diseases.

    PubMed

    Jewell, Chris P; Brown, Richard G

    2015-07-06

    Predicting the spread of vector-borne diseases in response to incursions requires knowledge of both host and vector demographics in advance of an outbreak. Although host population data are typically available, for novel disease introductions there is a high chance of the pathogen using a vector for which data are unavailable. This presents a barrier to estimating the parameters of dynamical models representing host-vector-pathogen interaction, and hence limits their ability to provide quantitative risk forecasts. The Theileria orientalis (Ikeda) outbreak in New Zealand cattle demonstrates this problem: even though the vector has received extensive laboratory study, a high degree of uncertainty persists over its national demographic distribution. Addressing this, we develop a Bayesian data assimilation approach whereby indirect observations of vector activity inform a seasonal spatio-temporal risk surface within a stochastic epidemic model. We provide quantitative predictions for the future spread of the epidemic, quantifying uncertainty in the model parameters, case infection times and the disease status of undetected infections. Importantly, we demonstrate how our model learns sequentially as the epidemic unfolds and provide evidence for changing epidemic dynamics through time. Our approach therefore provides a significant advance in rapid decision support for novel vector-borne disease outbreaks. © 2015 The Author(s) Published by the Royal Society. All rights reserved.

  15. Wearable-Sensor-Based Classification Models of Faller Status in Older Adults.

    PubMed

    Howcroft, Jennifer; Lemaire, Edward D; Kofman, Jonathan

    2016-01-01

    Wearable sensors have potential for quantitative, gait-based, point-of-care fall risk assessment that can be easily and quickly implemented in clinical-care and older-adult living environments. This investigation generated models for wearable-sensor based fall-risk classification in older adults and identified the optimal sensor type, location, combination, and modelling method; for walking with and without a cognitive load task. A convenience sample of 100 older individuals (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6 month retrospective fall occurrence) walked 7.62 m under single-task and dual-task conditions while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, and left and right shanks. Participants also completed the Activities-specific Balance Confidence scale, Community Health Activities Model Program for Seniors questionnaire, six minute walk test, and ranked their fear of falling. Fall risk classification models were assessed for all sensor combinations and three model types: multi-layer perceptron neural network, naïve Bayesian, and support vector machine. The best performing model was a multi-layer perceptron neural network with input parameters from pressure-sensing insoles and head, pelvis, and left shank accelerometers (accuracy = 84%, F1 score = 0.600, MCC score = 0.521). Head sensor-based models had the best performance of the single-sensor models for single-task gait assessment. Single-task gait assessment models outperformed models based on dual-task walking or clinical assessment data. Support vector machines and neural networks were the best modelling technique for fall risk classification. Fall risk classification models developed for point-of-care environments should be developed using support vector machines and neural networks, with a multi-sensor single-task gait assessment.

  16. Propensity score estimation: neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression.

    PubMed

    Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson

    2010-08-01

    Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy. We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting). Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART), appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. Copyright (c) 2010 Elsevier Inc. All rights reserved.

  17. About decomposition approach for solving the classification problem

    NASA Astrophysics Data System (ADS)

    Andrianova, A. A.

    2016-11-01

    This article describes the features of the application of an algorithm with using of decomposition methods for solving the binary classification problem of constructing a linear classifier based on Support Vector Machine method. Application of decomposition reduces the volume of calculations, in particular, due to the emerging possibilities to build parallel versions of the algorithm, which is a very important advantage for the solution of problems with big data. The analysis of the results of computational experiments conducted using the decomposition approach. The experiment use known data set for binary classification problem.

  18. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data

    PubMed Central

    Hepworth, Philip J.; Nefedov, Alexey V.; Muchnik, Ilya B.; Morgan, Kenton L.

    2012-01-01

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide. PMID:22319115

  19. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data.

    PubMed

    Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L

    2012-08-07

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.

  20. Sleep versus wake classification from heart rate variability using computational intelligence: consideration of rejection in classification models.

    PubMed

    Lewicke, Aaron; Sazonov, Edward; Corwin, Michael J; Neuman, Michael; Schuckers, Stephanie

    2008-01-01

    Reliability of classification performance is important for many biomedical applications. A classification model which considers reliability in the development of the model such that unreliable segments are rejected would be useful, particularly, in large biomedical data sets. This approach is demonstrated in the development of a technique to reliably determine sleep and wake using only the electrocardiogram (ECG) of infants. Typically, sleep state scoring is a time consuming task in which sleep states are manually derived from many physiological signals. The method was tested with simultaneous 8-h ECG and polysomnogram (PSG) determined sleep scores from 190 infants enrolled in the collaborative home infant monitoring evaluation (CHIME) study. Learning vector quantization (LVQ) neural network, multilayer perceptron (MLP) neural network, and support vector machines (SVMs) are tested as the classifiers. After systematic rejection of difficult to classify segments, the models can achieve 85%-87% correct classification while rejecting only 30% of the data. This corresponds to a Kappa statistic of 0.65-0.68. With rejection, accuracy improves by about 8% over a model without rejection. Additionally, the impact of the PSG scored indeterminate state epochs is analyzed. The advantages of a reliable sleep/wake classifier based only on ECG include high accuracy, simplicity of use, and low intrusiveness. Reliability of the classification can be built directly in the model, such that unreliable segments are rejected.

  1. Associations among hydrologic classifications and fish traits to support environmental flow standards

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McManamay, Ryan A; Bevelhimer, Mark S; Frimpong, Dr. Emmanuel A,

    2014-01-01

    Classification systems are valuable to ecological management in that they organize information into consolidated units thereby providing efficient means to achieve conservation objectives. Of the many ways classifications benefit management, hypothesis generation has been discussed as the most important. However, in order to provide templates for developing and testing ecologically relevant hypotheses, classifications created using environmental variables must be linked to ecological patterns. Herein, we develop associations between a recent US hydrologic classification and fish traits in order to form a template for generating flow ecology hypotheses and supporting environmental flow standard development. Tradeoffs in adaptive strategies for fish weremore » observed across a spectrum of stable, perennial flow to unstable intermittent flow. In accordance with theory, periodic strategists were associated with stable, predictable flow, whereas opportunistic strategists were more affiliated with intermittent, variable flows. We developed linkages between the uniqueness of hydrologic character and ecological distinction among classes, which may translate into predictions between losses in hydrologic uniqueness and ecological community response. Comparisons of classification strength between hydrologic classifications and other frameworks suggested that spatially contiguous classifications with higher regionalization will tend to explain more variation in ecological patterns. Despite explaining less ecological variation than other frameworks, we contend that hydrologic classifications are still useful because they provide a conceptual linkage between hydrologic variation and ecological communities to support flow ecology relationships. Mechanistic associations among fish traits and hydrologic classes support the presumption that environmental flow standards should be developed uniquely for stream classes and ecological communities, therein.« less

  2. Integrating support vector machines and random forests to classify crops in time series of Worldview-2 images

    NASA Astrophysics Data System (ADS)

    Zafari, A.; Zurita-Milla, R.; Izquierdo-Verdiguier, E.

    2017-10-01

    Crop maps are essential inputs for the agricultural planning done at various governmental and agribusinesses agencies. Remote sensing offers timely and costs efficient technologies to identify and map crop types over large areas. Among the plethora of classification methods, Support Vector Machine (SVM) and Random Forest (RF) are widely used because of their proven performance. In this work, we study the synergic use of both methods by introducing a random forest kernel (RFK) in an SVM classifier. A time series of multispectral WorldView-2 images acquired over Mali (West Africa) in 2014 was used to develop our case study. Ground truth containing five common crop classes (cotton, maize, millet, peanut, and sorghum) were collected at 45 farms and used to train and test the classifiers. An SVM with the standard Radial Basis Function (RBF) kernel, a RF, and an SVM-RFK were trained and tested over 10 random training and test subsets generated from the ground data. Results show that the newly proposed SVM-RFK classifier can compete with both RF and SVM-RBF. The overall accuracies based on the spectral bands only are of 83, 82 and 83% respectively. Adding vegetation indices to the analysis result in the classification accuracy of 82, 81 and 84% for SVM-RFK, RF, and SVM-RBF respectively. Overall, it can be observed that the newly tested RFK can compete with SVM-RBF and RF classifiers in terms of classification accuracy.

  3. Efficacy of hidden markov model over support vector machine on multiclass classification of healthy and cancerous cervical tissues

    NASA Astrophysics Data System (ADS)

    Mukhopadhyay, Sabyasachi; Kurmi, Indrajit; Pratiher, Sawon; Mukherjee, Sukanya; Barman, Ritwik; Ghosh, Nirmalya; Panigrahi, Prasanta K.

    2018-02-01

    In this paper, a comparative study between SVM and HMM has been carried out for multiclass classification of cervical healthy and cancerous tissues. In our study, the HMM methodology is more promising to produce higher accuracy in classification.

  4. Expected energy-based restricted Boltzmann machine for classification.

    PubMed

    Elfwing, S; Uchibe, E; Doya, K

    2015-04-01

    In classification tasks, restricted Boltzmann machines (RBMs) have predominantly been used in the first stage, either as feature extractors or to provide initialization of neural networks. In this study, we propose a discriminative learning approach to provide a self-contained RBM method for classification, inspired by free-energy based function approximation (FE-RBM), originally proposed for reinforcement learning. For classification, the FE-RBM method computes the output for an input vector and a class vector by the negative free energy of an RBM. Learning is achieved by stochastic gradient-descent using a mean-squared error training objective. In an earlier study, we demonstrated that the performance and the robustness of FE-RBM function approximation can be improved by scaling the free energy by a constant that is related to the size of network. In this study, we propose that the learning performance of RBM function approximation can be further improved by computing the output by the negative expected energy (EE-RBM), instead of the negative free energy. To create a deep learning architecture, we stack several RBMs on top of each other. We also connect the class nodes to all hidden layers to try to improve the performance even further. We validate the classification performance of EE-RBM using the MNIST data set and the NORB data set, achieving competitive performance compared with other classifiers such as standard neural networks, deep belief networks, classification RBMs, and support vector machines. The purpose of using the NORB data set is to demonstrate that EE-RBM with binary input nodes can achieve high performance in the continuous input domain. Copyright © 2014 The Authors. Published by Elsevier Ltd.. All rights reserved.

  5. Cloud Detection of Optical Satellite Images Using Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Lee, Kuan-Yi; Lin, Chao-Hung

    2016-06-01

    Cloud covers are generally present in optical remote-sensing images, which limit the usage of acquired images and increase the difficulty of data analysis, such as image compositing, correction of atmosphere effects, calculations of vegetation induces, land cover classification, and land cover change detection. In previous studies, thresholding is a common and useful method in cloud detection. However, a selected threshold is usually suitable for certain cases or local study areas, and it may be failed in other cases. In other words, thresholding-based methods are data-sensitive. Besides, there are many exceptions to control, and the environment is changed dynamically. Using the same threshold value on various data is not effective. In this study, a threshold-free method based on Support Vector Machine (SVM) is proposed, which can avoid the abovementioned problems. A statistical model is adopted to detect clouds instead of a subjective thresholding-based method, which is the main idea of this study. The features used in a classifier is the key to a successful classification. As a result, Automatic Cloud Cover Assessment (ACCA) algorithm, which is based on physical characteristics of clouds, is used to distinguish the clouds and other objects. In the same way, the algorithm called Fmask (Zhu et al., 2012) uses a lot of thresholds and criteria to screen clouds, cloud shadows, and snow. Therefore, the algorithm of feature extraction is based on the ACCA algorithm and Fmask. Spatial and temporal information are also important for satellite images. Consequently, co-occurrence matrix and temporal variance with uniformity of the major principal axis are used in proposed method. We aim to classify images into three groups: cloud, non-cloud and the others. In experiments, images acquired by the Landsat 7 Enhanced Thematic Mapper Plus (ETM+) and images containing the landscapes of agriculture, snow area, and island are tested. Experiment results demonstrate the detection

  6. The generalization ability of online SVM classification based on Markov sampling.

    PubMed

    Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang

    2015-03-01

    In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish the bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present the numerical studies on the learning ability of online SVM classification based on Markov sampling for benchmark repository. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling as the size of training samples is larger.

  7. Exploring the impact of wavelet-based denoising in the classification of remote sensing hyperspectral images

    NASA Astrophysics Data System (ADS)

    Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco

    2016-10-01

    The classification of remote sensing hyperspectral images for land cover applications is a very intensive topic. In the case of supervised classification, Support Vector Machines (SVMs) play a dominant role. Recently, the Extreme Learning Machine algorithm (ELM) has been extensively used. The classification scheme previously published by the authors, and called WT-EMP, introduces spatial information in the classification process by means of an Extended Morphological Profile (EMP) that is created from features extracted by wavelets. In addition, the hyperspectral image is denoised in the 2-D spatial domain, also using wavelets and it is joined to the EMP via a stacked vector. In this paper, the scheme is improved achieving two goals. The first one is to reduce the classification time while preserving the accuracy of the classification by using ELM instead of SVM. The second one is to improve the accuracy results by performing not only a 2-D denoising for every spectral band, but also a previous additional 1-D spectral signature denoising applied to each pixel vector of the image. For each denoising the image is transformed by applying a 1-D or 2-D wavelet transform, and then a NeighShrink thresholding is applied. Improvements in terms of classification accuracy are obtained, especially for images with close regions in the classification reference map, because in these cases the accuracy of the classification in the edges between classes is more relevant.

  8. Real Time Monitoring System of Pollution Waste on Musi River Using Support Vector Machine (SVM) Method

    NASA Astrophysics Data System (ADS)

    Fachrurrozi, Muhammad; Saparudin; Erwin

    2017-04-01

    Real-time Monitoring and early detection system which measures the quality standard of waste in Musi River, Palembang, Indonesia is a system for determining air and water pollution level. This system was designed in order to create an integrated monitoring system and provide real time information that can be read. It is designed to measure acidity and water turbidity polluted by industrial waste, as well as to show and provide conditional data integrated in one system. This system consists of inputting and processing the data, and giving output based on processed data. Turbidity, substances, and pH sensor is used as a detector that produce analog electrical direct current voltage (DC). Early detection system works by determining the value of the ammonia threshold, acidity, and turbidity level of water in Musi River. The results is then presented based on the level group pollution by the Support Vector Machine classification method.

  9. Morphological analysis of dendrites and spines by hybridization of ridge detection with twin support vector machine.

    PubMed

    Wang, Shuihua; Chen, Mengmeng; Li, Yang; Shao, Ying; Zhang, Yudong; Du, Sidan; Wu, Jane

    2016-01-01

    Dendritic spines are described as neuronal protrusions. The morphology of dendritic spines and dendrites has a strong relationship to its function, as well as playing an important role in understanding brain function. Quantitative analysis of dendrites and dendritic spines is essential to an understanding of the formation and function of the nervous system. However, highly efficient tools for the quantitative analysis of dendrites and dendritic spines are currently undeveloped. In this paper we propose a novel three-step cascaded algorithm-RTSVM- which is composed of ridge detection as the curvature structure identifier for backbone extraction, boundary location based on differences in density, the Hu moment as features and Twin Support Vector Machine (TSVM) classifiers for spine classification. Our data demonstrates that this newly developed algorithm has performed better than other available techniques used to detect accuracy and false alarm rates. This algorithm will be used effectively in neuroscience research.

  10. Gas Classification Using Deep Convolutional Neural Networks.

    PubMed

    Peng, Pai; Zhao, Xiaojin; Pan, Xiaofang; Ye, Wenbin

    2018-01-08

    In this work, we propose a novel Deep Convolutional Neural Network (DCNN) tailored for gas classification. Inspired by the great success of DCNN in the field of computer vision, we designed a DCNN with up to 38 layers. In general, the proposed gas neural network, named GasNet, consists of: six convolutional blocks, each block consist of six layers; a pooling layer; and a fully-connected layer. Together, these various layers make up a powerful deep model for gas classification. Experimental results show that the proposed DCNN method is an effective technique for classifying electronic nose data. We also demonstrate that the DCNN method can provide higher classification accuracy than comparable Support Vector Machine (SVM) methods and Multiple Layer Perceptron (MLP).

  11. Gas Classification Using Deep Convolutional Neural Networks

    PubMed Central

    Peng, Pai; Zhao, Xiaojin; Pan, Xiaofang; Ye, Wenbin

    2018-01-01

    In this work, we propose a novel Deep Convolutional Neural Network (DCNN) tailored for gas classification. Inspired by the great success of DCNN in the field of computer vision, we designed a DCNN with up to 38 layers. In general, the proposed gas neural network, named GasNet, consists of: six convolutional blocks, each block consist of six layers; a pooling layer; and a fully-connected layer. Together, these various layers make up a powerful deep model for gas classification. Experimental results show that the proposed DCNN method is an effective technique for classifying electronic nose data. We also demonstrate that the DCNN method can provide higher classification accuracy than comparable Support Vector Machine (SVM) methods and Multiple Layer Perceptron (MLP). PMID:29316723

  12. A Wavelet Support Vector Machine Combination Model for Singapore Tourist Arrival to Malaysia

    NASA Astrophysics Data System (ADS)

    Rafidah, A.; Shabri, Ani; Nurulhuda, A.; Suhaila, Y.

    2017-08-01

    In this study, wavelet support vector machine model (WSVM) is proposed and applied for monthly data Singapore tourist time series prediction. The WSVM model is combination between wavelet analysis and support vector machine (SVM). In this study, we have two parts, first part we compare between the kernel function and second part we compare between the developed models with single model, SVM. The result showed that kernel function linear better than RBF while WSVM outperform with single model SVM to forecast monthly Singapore tourist arrival to Malaysia.

  13. Classifier transfer with data selection strategies for online support vector machine classification with class imbalance

    NASA Astrophysics Data System (ADS)

    Krell, Mario Michael; Wilshusen, Nils; Seeland, Anett; Kim, Su Kyoung

    2017-04-01

    Objective. Classifier transfers usually come with dataset shifts. To overcome dataset shifts in practical applications, we consider the limitations in computational resources in this paper for the adaptation of batch learning algorithms, like the support vector machine (SVM). Approach. We focus on data selection strategies which limit the size of the stored training data by different inclusion, exclusion, and further dataset manipulation criteria like handling class imbalance with two new approaches. We provide a comparison of the strategies with linear SVMs on several synthetic datasets with different data shifts as well as on different transfer settings with electroencephalographic (EEG) data. Main results. For the synthetic data, adding only misclassified samples performed astoundingly well. Here, balancing criteria were very important when the other criteria were not well chosen. For the transfer setups, the results show that the best strategy depends on the intensity of the drift during the transfer. Adding all and removing the oldest samples results in the best performance, whereas for smaller drifts, it can be sufficient to only add samples near the decision boundary of the SVM which reduces processing resources. Significance. For brain-computer interfaces based on EEG data, models trained on data from a calibration session, a previous recording session, or even from a recording session with another subject are used. We show, that by using the right combination of data selection criteria, it is possible to adapt the SVM classifier to overcome the performance drop from the transfer.

  14. Protein classification based on text document classification techniques.

    PubMed

    Cheng, Betty Yee Man; Carbonell, Jaime G; Klein-Seetharaman, Judith

    2005-03-01

    The need for accurate, automated protein classification methods continues to increase as advances in biotechnology uncover new proteins. G-protein coupled receptors (GPCRs) are a particularly difficult superfamily of proteins to classify due to extreme diversity among its members. Previous comparisons of BLAST, k-nearest neighbor (k-NN), hidden markov model (HMM) and support vector machine (SVM) using alignment-based features have suggested that classifiers at the complexity of SVM are needed to attain high accuracy. Here, analogous to document classification, we applied Decision Tree and Naive Bayes classifiers with chi-square feature selection on counts of n-grams (i.e. short peptide sequences of length n) to this classification task. Using the GPCR dataset and evaluation protocol from the previous study, the Naive Bayes classifier attained an accuracy of 93.0 and 92.4% in level I and level II subfamily classification respectively, while SVM has a reported accuracy of 88.4 and 86.3%. This is a 39.7 and 44.5% reduction in residual error for level I and level II subfamily classification, respectively. The Decision Tree, while inferior to SVM, outperforms HMM in both level I and level II subfamily classification. For those GPCR families whose profiles are stored in the Protein FAMilies database of alignments and HMMs (PFAM), our method performs comparably to a search against those profiles. Finally, our method can be generalized to other protein families by applying it to the superfamily of nuclear receptors with 94.5, 97.8 and 93.6% accuracy in family, level I and level II subfamily classification respectively. Copyright 2005 Wiley-Liss, Inc.

  15. Bayesian data assimilation provides rapid decision support for vector-borne diseases

    PubMed Central

    Jewell, Chris P.; Brown, Richard G.

    2015-01-01

    Predicting the spread of vector-borne diseases in response to incursions requires knowledge of both host and vector demographics in advance of an outbreak. Although host population data are typically available, for novel disease introductions there is a high chance of the pathogen using a vector for which data are unavailable. This presents a barrier to estimating the parameters of dynamical models representing host–vector–pathogen interaction, and hence limits their ability to provide quantitative risk forecasts. The Theileria orientalis (Ikeda) outbreak in New Zealand cattle demonstrates this problem: even though the vector has received extensive laboratory study, a high degree of uncertainty persists over its national demographic distribution. Addressing this, we develop a Bayesian data assimilation approach whereby indirect observations of vector activity inform a seasonal spatio-temporal risk surface within a stochastic epidemic model. We provide quantitative predictions for the future spread of the epidemic, quantifying uncertainty in the model parameters, case infection times and the disease status of undetected infections. Importantly, we demonstrate how our model learns sequentially as the epidemic unfolds and provide evidence for changing epidemic dynamics through time. Our approach therefore provides a significant advance in rapid decision support for novel vector-borne disease outbreaks. PMID:26136225

  16. Classification of skin cancer images using local binary pattern and SVM classifier

    NASA Astrophysics Data System (ADS)

    Adjed, Faouzi; Faye, Ibrahima; Ababsa, Fakhreddine; Gardezi, Syed Jamal; Dass, Sarat Chandra

    2016-11-01

    In this paper, a classification method for melanoma and non-melanoma skin cancer images has been presented using the local binary patterns (LBP). The LBP computes the local texture information from the skin cancer images, which is later used to compute some statistical features that have capability to discriminate the melanoma and non-melanoma skin tissues. Support vector machine (SVM) is applied on the feature matrix for classification into two skin image classes (malignant and benign). The method achieves good classification accuracy of 76.1% with sensitivity of 75.6% and specificity of 76.7%.

  17. Predictive classification of pediatric bipolar disorder using atlas-based diffusion weighted imaging and support vector machines.

    PubMed

    Mwangi, Benson; Wu, Mon-Ju; Bauer, Isabelle E; Modi, Haina; Zeni, Cristian P; Zunta-Soares, Giovana B; Hasan, Khader M; Soares, Jair C

    2015-11-30

    Previous studies have reported abnormalities of white-matter diffusivity in pediatric bipolar disorder. However, it has not been established whether these abnormalities are able to distinguish individual subjects with pediatric bipolar disorder from healthy controls with a high specificity and sensitivity. Diffusion-weighted imaging scans were acquired from 16 youths diagnosed with DSM-IV bipolar disorder and 16 demographically matched healthy controls. Regional white matter tissue microstructural measurements such as fractional anisotropy, axial diffusivity and radial diffusivity were computed using an atlas-based approach. These measurements were used to 'train' a support vector machine (SVM) algorithm to predict new or 'unseen' subjects' diagnostic labels. The SVM algorithm predicted individual subjects with specificity=87.5%, sensitivity=68.75%, accuracy=78.12%, positive predictive value=84.62%, negative predictive value=73.68%, area under receiver operating characteristic curve (AUROC)=0.7812 and chi-square p-value=0.0012. A pattern of reduced regional white matter fractional anisotropy was observed in pediatric bipolar disorder patients. These results suggest that atlas-based diffusion weighted imaging measurements can distinguish individual pediatric bipolar disorder patients from healthy controls. Notably, from a clinical perspective these findings will contribute to the pathophysiological understanding of pediatric bipolar disorder. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  18. An intraoperative diagnosis of parotid gland tumors using Raman spectroscopy and support vector machine

    NASA Astrophysics Data System (ADS)

    Yan, Bing; Wen, Zhining; Li, Yi; Li, Longjiang; Xue, Lili

    2014-11-01

    The preoperative and intraoperative diagnosis of parotid gland tumors is difficult, but is important for their surgical management. In order to explore an intraoperative diagnostic method, Raman spectroscopy is applied to detect the normal parotid gland and tumors, including pleomorphic adenoma, Warthin’s tumor and mucoepidermoid carcinoma. In the 600-1800 cm-1 region of the Raman shift, there are numerous spectral differences between the parotid gland and tumors. Compared with Raman spectra of the normal parotid gland, the Raman spectra of parotid tumors show an increase of the peaks assigned to nucleic acids and proteins, but a decrease of the peaks related to lipids. Spectral differences also exist between the spectra of parotid tumors. Based on these differences, a remarkable classification and diagnosis of the parotid gland and tumors are carried out by support vector machine (SVM), with high accuracy (96.7~100%), sensitivity (93.3~100%) and specificity (96.7~100%). Raman spectroscopy combined with SVM has a great potential to aid the intraoperative diagnosis of parotid tumors and could provide an accurate and rapid diagnostic approach.

  19. Comparison of Random Forest and Support Vector Machine classifiers using UAV remote sensing imagery

    NASA Astrophysics Data System (ADS)

    Piragnolo, Marco; Masiero, Andrea; Pirotti, Francesco

    2017-04-01

    Since recent years surveying with unmanned aerial vehicles (UAV) is getting a great amount of attention due to decreasing costs, higher precision and flexibility of usage. UAVs have been applied for geomorphological investigations, forestry, precision agriculture, cultural heritage assessment and for archaeological purposes. It can be used for land use and land cover classification (LULC). In literature, there are two main types of approaches for classification of remote sensing imagery: pixel-based and object-based. On one hand, pixel-based approach mostly uses training areas to define classes and respective spectral signatures. On the other hand, object-based classification considers pixels, scale, spatial information and texture information for creating homogeneous objects. Machine learning methods have been applied successfully for classification, and their use is increasing due to the availability of faster computing capabilities. The methods learn and train the model from previous computation. Two machine learning methods which have given good results in previous investigations are Random Forest (RF) and Support Vector Machine (SVM). The goal of this work is to compare RF and SVM methods for classifying LULC using images collected with a fixed wing UAV. The processing chain regarding classification uses packages in R, an open source scripting language for data analysis, which provides all necessary algorithms. The imagery was acquired and processed in November 2015 with cameras providing information over the red, blue, green and near infrared wavelength reflectivity over a testing area in the campus of Agripolis, in Italy. Images were elaborated and ortho-rectified through Agisoft Photoscan. The ortho-rectified image is the full data set, and the test set is derived from partial sub-setting of the full data set. Different tests have been carried out, using a percentage from 2 % to 20 % of the total. Ten training sets and ten validation sets are obtained from

  20. Machine learning in soil classification.

    PubMed

    Bhattacharya, B; Solomatine, D P

    2006-03-01

    In a number of engineering problems, e.g. in geotechnics, petroleum engineering, etc. intervals of measured series data (signals) are to be attributed a class maintaining the constraint of contiguity and standard classification methods could be inadequate. Classification in this case needs involvement of an expert who observes the magnitude and trends of the signals in addition to any a priori information that might be available. In this paper, an approach for automating this classification procedure is presented. Firstly, a segmentation algorithm is developed and applied to segment the measured signals. Secondly, the salient features of these segments are extracted using boundary energy method. Based on the measured data and extracted features to assign classes to the segments classifiers are built; they employ Decision Trees, ANN and Support Vector Machines. The methodology was tested in classifying sub-surface soil using measured data from Cone Penetration Testing and satisfactory results were obtained.

  1. Diagnosing tuberculosis with a novel support vector machine-based artificial immune recognition system.

    PubMed

    Saybani, Mahmoud Reza; Shamshirband, Shahaboddin; Golzari Hormozi, Shahram; Wah, Teh Ying; Aghabozorgi, Saeed; Pourhoseingholi, Mohamad Amin; Olariu, Teodora

    2015-04-01

    Tuberculosis (TB) is a major global health problem, which has been ranked as the second leading cause of death from an infectious disease worldwide. Diagnosis based on cultured specimens is the reference standard, however results take weeks to process. Scientists are looking for early detection strategies, which remain the cornerstone of tuberculosis control. Consequently there is a need to develop an expert system that helps medical professionals to accurately and quickly diagnose the disease. Artificial Immune Recognition System (AIRS) has been used successfully for diagnosing various diseases. However, little effort has been undertaken to improve its classification accuracy. In order to increase the classification accuracy of AIRS, this study introduces a new hybrid system that incorporates a support vector machine into AIRS for diagnosing tuberculosis. Patient epacris reports obtained from the Pasteur laboratory of Iran were used as the benchmark data set, with the sample size of 175 (114 positive samples for TB and 60 samples in the negative group). The strategy of this study was to ensure representativeness, thus it was important to have an adequate number of instances for both TB and non-TB cases. The classification performance was measured through 10-fold cross-validation, Root Mean Squared Error (RMSE), sensitivity and specificity, Youden's Index, and Area Under the Curve (AUC). Statistical analysis was done using the Waikato Environment for Knowledge Analysis (WEKA), a machine learning program for windows. With an accuracy of 100%, sensitivity of 100%, specificity of 100%, Youden's Index of 1, Area Under the Curve of 1, and RMSE of 0, the proposed method was able to successfully classify tuberculosis patients. There have been many researches that aimed at diagnosing tuberculosis faster and more accurately. Our results described a model for diagnosing tuberculosis with 100% sensitivity and 100% specificity. This model can be used as an additional tool for

  2. Support vector machine firefly algorithm based optimization of lens system.

    PubMed

    Shamshirband, Shahaboddin; Petković, Dalibor; Pavlović, Nenad T; Ch, Sudheer; Altameem, Torki A; Gani, Abdullah

    2015-01-01

    Lens system design is an important factor in image quality. The main aspect of the lens system design methodology is the optimization procedure. Since optimization is a complex, nonlinear task, soft computing optimization algorithms can be used. There are many tools that can be employed to measure optical performance, but the spot diagram is the most useful. The spot diagram gives an indication of the image of a point object. In this paper, the spot size radius is considered an optimization criterion. Intelligent soft computing scheme support vector machines (SVMs) coupled with the firefly algorithm (FFA) are implemented. The performance of the proposed estimators is confirmed with the simulation results. The result of the proposed SVM-FFA model has been compared with support vector regression (SVR), artificial neural networks, and generic programming methods. The results show that the SVM-FFA model performs more accurately than the other methodologies. Therefore, SVM-FFA can be used as an efficient soft computing technique in the optimization of lens system designs.

  3. A Comparison Between a Synthetic Over-Sampling Equilibrium and Observed Subsets of Data for Epidemic Vector Classification

    NASA Astrophysics Data System (ADS)

    Fusco, Terence; Bi, Yaxin; Nugent, Chris; Wu, Shengli

    2016-08-01

    We can see that the data imputation approach using the Regression CTA has performed more favourably when compared with the alternative methods on this dataset. We now have the evidence to show that this method is viable moving forward with further research in this area. The weighted distribution experiments have provided us with a more balanced and appropriate ratio for snail density classification purposes when using either the 3 or 5 category combination. The most desirable results are found when using 3 categories of SD with the weighted distribution of classes being 20-60-20. This information reflects the optimum classification accuracy across the data range and can be applied to any novel environment feature dataset pertaining to Schistosomiasis vector classification. ITSVM has provided us with a method of labelling SD data which we can use for classification with epidemic disease prediction research. The confidence level selection enables consistent labelling accuracy for bespoke requirements when classifying the data from each year. The SMOTE Equilibrium proposed method has yielded a slight increase with each multiple of synthetic instances that are compounded to the training dataset. The reduction of overfitting and increase of data instances has shown a gradual classification accuracy increase across the data for each year. We will now test to see what the optimum synthetic instance incremental increase is across our data and apply this to our experiments with this research.

  4. Prediction of Spirometric Forced Expiratory Volume (FEV1) Data Using Support Vector Regression

    NASA Astrophysics Data System (ADS)

    Kavitha, A.; Sujatha, C. M.; Ramakrishnan, S.

    2010-01-01

    In this work, prediction of forced expiratory volume in 1 second (FEV1) in pulmonary function test is carried out using the spirometer and support vector regression analysis. Pulmonary function data are measured with flow volume spirometer from volunteers (N=175) using a standard data acquisition protocol. The acquired data are then used to predict FEV1. Support vector machines with polynomial kernel function with four different orders were employed to predict the values of FEV1. The performance is evaluated by computing the average prediction accuracy for normal and abnormal cases. Results show that support vector machines are capable of predicting FEV1 in both normal and abnormal cases and the average prediction accuracy for normal subjects was higher than that of abnormal subjects. Accuracy in prediction was found to be high for a regularization constant of C=10. Since FEV1 is the most significant parameter in the analysis of spirometric data, it appears that this method of assessment is useful in diagnosing the pulmonary abnormalities with incomplete data and data with poor recording.

  5. Matrix Multiplication Algorithm Selection with Support Vector Machines

    DTIC Science & Technology

    2015-05-01

    libraries that could intelligently choose the optimal algorithm for a particular set of inputs. Users would be oblivious to the underlying algorithmic...SAT.” J. Artif . Intell. Res.(JAIR), vol. 32, pp. 565–606, 2008. [9] M. G. Lagoudakis and M. L. Littman, “Algorithm selection using reinforcement...Artificial Intelligence , vol. 21, no. 05, pp. 961–976, 2007. [15] C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector machines,” ACM

  6. Color Image Classification Using Block Matching and Learning

    NASA Astrophysics Data System (ADS)

    Kondo, Kazuki; Hotta, Seiji

    In this paper, we propose block matching and learning for color image classification. In our method, training images are partitioned into small blocks. Given a test image, it is also partitioned into small blocks, and mean-blocks corresponding to each test block are calculated with neighbor training blocks. Our method classifies a test image into the class that has the shortest total sum of distances between mean blocks and test ones. We also propose a learning method for reducing memory requirement. Experimental results show that our classification outperforms other classifiers such as support vector machine with bag of keypoints.

  7. Spatial-spectral blood cell classification with microscopic hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Ran, Qiong; Chang, Lan; Li, Wei; Xu, Xiaofeng

    2017-10-01

    Microscopic hyperspectral images provide a new way for blood cell examination. The hyperspectral imagery can greatly facilitate the classification of different blood cells. In this paper, the microscopic hyperspectral images are acquired by connecting the microscope and the hyperspectral imager, and then tested for blood cell classification. For combined use of the spectral and spatial information provided by hyperspectral images, a spatial-spectral classification method is improved from the classical extreme learning machine (ELM) by integrating spatial context into the image classification task with Markov random field (MRF) model. Comparisons are done among ELM, ELM-MRF, support vector machines(SVM) and SVMMRF methods. Results show the spatial-spectral classification methods(ELM-MRF, SVM-MRF) perform better than pixel-based methods(ELM, SVM), and the proposed ELM-MRF has higher precision and show more accurate location of cells.

  8. A Fast SVM-Based Tongue's Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis.

    PubMed

    Kamarudin, Nur Diyana; Ooi, Chia Yee; Kawanabe, Tadaaki; Odaguchi, Hiroshi; Kobayashi, Fuminori

    2017-01-01

    In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye's ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue's multicolour classification based on a support vector machine (SVM) whose support vectors are reduced by our proposed k -means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k -means clustering is used to cluster a tongue image into four clusters of image background (black), deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds.

  9. A Fast SVM-Based Tongue's Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis

    PubMed Central

    Ooi, Chia Yee; Kawanabe, Tadaaki; Odaguchi, Hiroshi; Kobayashi, Fuminori

    2017-01-01

    In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye's ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue's multicolour classification based on a support vector machine (SVM) whose support vectors are reduced by our proposed k-means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k-means clustering is used to cluster a tongue image into four clusters of image background (black), deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds. PMID:29065640

  10. Differentiation of several interstitial lung disease patterns in HRCT images using support vector machine: role of databases on performance

    NASA Astrophysics Data System (ADS)

    Kale, Mandar; Mukhopadhyay, Sudipta; Dash, Jatindra K.; Garg, Mandeep; Khandelwal, Niranjan

    2016-03-01

    Interstitial lung disease (ILD) is complicated group of pulmonary disorders. High Resolution Computed Tomography (HRCT) considered to be best imaging technique for analysis of different pulmonary disorders. HRCT findings can be categorised in several patterns viz. Consolidation, Emphysema, Ground Glass Opacity, Nodular, Normal etc. based on their texture like appearance. Clinician often find it difficult to diagnosis these pattern because of their complex nature. In such scenario computer-aided diagnosis system could help clinician to identify patterns. Several approaches had been proposed for classification of ILD patterns. This includes computation of textural feature and training /testing of classifier such as artificial neural network (ANN), support vector machine (SVM) etc. In this paper, wavelet features are calculated from two different ILD database, publically available MedGIFT ILD database and private ILD database, followed by performance evaluation of ANN and SVM classifiers in terms of average accuracy. It is found that average classification accuracy by SVM is greater than ANN where trained and tested on same database. Investigation continued further to test variation in accuracy of classifier when training and testing is performed with alternate database and training and testing of classifier with database formed by merging samples from same class from two individual databases. The average classification accuracy drops when two independent databases used for training and testing respectively. There is significant improvement in average accuracy when classifiers are trained and tested with merged database. It infers dependency of classification accuracy on training data. It is observed that SVM outperforms ANN when same database is used for training and testing.

  11. A support vector machine-based method to identify mild cognitive impairment with multi-level characteristics of magnetic resonance imaging.

    PubMed

    Long, Zhuqing; Jing, Bin; Yan, Huagang; Dong, Jianxin; Liu, Han; Mo, Xiao; Han, Ying; Li, Haiyun

    2016-09-07

    Mild cognitive impairment (MCI) represents a transitional state between normal aging and Alzheimer's disease (AD). Non-invasive diagnostic methods are desirable to identify MCI for early therapeutic interventions. In this study, we proposed a support vector machine (SVM)-based method to discriminate between MCI patients and normal controls (NCs) using multi-level characteristics of magnetic resonance imaging (MRI). This method adopted a radial basis function (RBF) as the kernel function, and a grid search method to optimize the two parameters of SVM. The calculated characteristics, i.e., the Hurst exponent (HE), amplitude of low-frequency fluctuations (ALFF), regional homogeneity (ReHo) and gray matter density (GMD), were adopted as the classification features. A leave-one-out cross-validation (LOOCV) was used to evaluate the classification performance of the method. Applying the proposed method to the experimental data from 29 MCI patients and 33 healthy subjects, we achieved a classification accuracy of up to 96.77%, with a sensitivity of 93.10% and a specificity of 100%, and the area under the curve (AUC) yielded up to 0.97. Furthermore, the most discriminative features for classification were found to predominantly involve default-mode regions, such as hippocampus (HIP), parahippocampal gyrus (PHG), posterior cingulate gyrus (PCG) and middle frontal gyrus (MFG), and subcortical regions such as lentiform nucleus (LN) and amygdala (AMYG). Therefore, our method is promising in distinguishing MCI patients from NCs and may be useful for the diagnosis of MCI. Copyright © 2016 IBRO. Published by Elsevier Ltd. All rights reserved.

  12. Feature selection for elderly faller classification based on wearable sensors.

    PubMed

    Howcroft, Jennifer; Kofman, Jonathan; Lemaire, Edward D

    2017-05-30

    Wearable sensors can be used to derive numerous gait pattern features for elderly fall risk and faller classification; however, an appropriate feature set is required to avoid high computational costs and the inclusion of irrelevant features. The objectives of this study were to identify and evaluate smaller feature sets for faller classification from large feature sets derived from wearable accelerometer and pressure-sensing insole gait data. A convenience sample of 100 older adults (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6 month retrospective fall occurrence) walked 7.62 m while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, left and right shanks. Feature selection was performed using correlation-based feature selection (CFS), fast correlation based filter (FCBF), and Relief-F algorithms. Faller classification was performed using multi-layer perceptron neural network, naïve Bayesian, and support vector machine classifiers, with 75:25 single stratified holdout and repeated random sampling. The best performing model was a support vector machine with 78% accuracy, 26% sensitivity, 95% specificity, 0.36 F1 score, and 0.31 MCC and one posterior pelvis accelerometer input feature (left acceleration standard deviation). The second best model achieved better sensitivity (44%) and used a support vector machine with 74% accuracy, 83% specificity, 0.44 F1 score, and 0.29 MCC. This model had ten input features: maximum, mean and standard deviation posterior acceleration; maximum, mean and standard deviation anterior acceleration; mean superior acceleration; and three impulse features. The best multi-sensor model sensitivity (56%) was achieved using posterior pelvis and both shank accelerometers and a naïve Bayesian classifier. The best single-sensor model sensitivity (41%) was achieved using the posterior pelvis accelerometer and a naïve Bayesian classifier. Feature selection provided models with smaller feature

  13. SVMs for Vibration-Based Terrain Classification

    NASA Astrophysics Data System (ADS)

    Weiss, Christian; Stark, Matthias; Zell, Andreas

    When an outdoor mobile robot traverses different types of ground surfaces, different types of vibrations are induced in the body of the robot. These vibrations can be used to learn a discrimination between different surfaces and to classify the current terrain. Recently, we presented a method that uses Support Vector Machines for classification, and we showed results on data collected with a hand-pulled cart. In this paper, we show that our approach also works well on an outdoor robot. Furthermore, we more closely investigate in which direction the vibration should be measured. Finally, we present a simple but effective method to improve the classification by combining measurements taken in multiple directions.

  14. Machine Learning for Biological Trajectory Classification Applications

    NASA Technical Reports Server (NTRS)

    Sbalzarini, Ivo F.; Theriot, Julie; Koumoutsakos, Petros

    2002-01-01

    Machine-learning techniques, including clustering algorithms, support vector machines and hidden Markov models, are applied to the task of classifying trajectories of moving keratocyte cells. The different algorithms axe compared to each other as well as to expert and non-expert test persons, using concepts from signal-detection theory. The algorithms performed very well as compared to humans, suggesting a robust tool for trajectory classification in biological applications.

  15. A Personalized Electronic Movie Recommendation System Based on Support Vector Machine and Improved Particle Swarm Optimization

    PubMed Central

    Wang, Xibin; Luo, Fengji; Qian, Ying; Ranzi, Gianluca

    2016-01-01

    With the rapid development of ICT and Web technologies, a large an amount of information is becoming available and this is producing, in some instances, a condition of information overload. Under these conditions, it is difficult for a person to locate and access useful information for making decisions. To address this problem, there are information filtering systems, such as the personalized recommendation system (PRS) considered in this paper, that assist a person in identifying possible products or services of interest based on his/her preferences. Among available approaches, collaborative Filtering (CF) is one of the most widely used recommendation techniques. However, CF has some limitations, e.g., the relatively simple similarity calculation, cold start problem, etc. In this context, this paper presents a new regression model based on the support vector machine (SVM) classification and an improved PSO (IPSO) for the development of an electronic movie PRS. In its implementation, a SVM classification model is first established to obtain a preliminary movie recommendation list based on which a SVM regression model is applied to predict movies’ ratings. The proposed PRS not only considers the movie’s content information but also integrates the users’ demographic and behavioral information to better capture the users’ interests and preferences. The efficiency of the proposed method is verified by a series of experiments based on the MovieLens benchmark data set. PMID:27898691

  16. A Personalized Electronic Movie Recommendation System Based on Support Vector Machine and Improved Particle Swarm Optimization.

    PubMed

    Wang, Xibin; Luo, Fengji; Qian, Ying; Ranzi, Gianluca

    2016-01-01

    With the rapid development of ICT and Web technologies, a large an amount of information is becoming available and this is producing, in some instances, a condition of information overload. Under these conditions, it is difficult for a person to locate and access useful information for making decisions. To address this problem, there are information filtering systems, such as the personalized recommendation system (PRS) considered in this paper, that assist a person in identifying possible products or services of interest based on his/her preferences. Among available approaches, collaborative Filtering (CF) is one of the most widely used recommendation techniques. However, CF has some limitations, e.g., the relatively simple similarity calculation, cold start problem, etc. In this context, this paper presents a new regression model based on the support vector machine (SVM) classification and an improved PSO (IPSO) for the development of an electronic movie PRS. In its implementation, a SVM classification model is first established to obtain a preliminary movie recommendation list based on which a SVM regression model is applied to predict movies' ratings. The proposed PRS not only considers the movie's content information but also integrates the users' demographic and behavioral information to better capture the users' interests and preferences. The efficiency of the proposed method is verified by a series of experiments based on the MovieLens benchmark data set.

  17. Classification of subsurface objects using singular values derived from signal frames

    DOEpatents

    Chambers, David H; Paglieroni, David W

    2014-05-06

    The classification system represents a detected object with a feature vector derived from the return signals acquired by an array of N transceivers operating in multistatic mode. The classification system generates the feature vector by transforming the real-valued return signals into complex-valued spectra, using, for example, a Fast Fourier Transform. The classification system then generates a feature vector of singular values for each user-designated spectral sub-band by applying a singular value decomposition (SVD) to the N.times.N square complex-valued matrix formed from sub-band samples associated with all possible transmitter-receiver pairs. The resulting feature vector of singular values may be transformed into a feature vector of singular value likelihoods and then subjected to a multi-category linear or neural network classifier for object classification.

  18. Soft-sensing model of temperature for aluminum reduction cell on improved twin support vector regression

    NASA Astrophysics Data System (ADS)

    Li, Tao

    2018-06-01

    The complexity of aluminum electrolysis process leads the temperature for aluminum reduction cells hard to measure directly. However, temperature is the control center of aluminum production. To solve this problem, combining some aluminum plant's practice data, this paper presents a Soft-sensing model of temperature for aluminum electrolysis process on Improved Twin Support Vector Regression (ITSVR). ITSVR eliminates the slow learning speed of Support Vector Regression (SVR) and the over-fit risk of Twin Support Vector Regression (TSVR) by introducing a regularization term into the objective function of TSVR, which ensures the structural risk minimization principle and lower computational complexity. Finally, the model with some other parameters as auxiliary variable, predicts the temperature by ITSVR. The simulation result shows Soft-sensing model based on ITSVR has short time-consuming and better generalization.

  19. Patient classification as an outlier detection problem: An application of the One-Class Support Vector Machine

    PubMed Central

    Mourão-Miranda, Janaina; Hardoon, David R.; Hahn, Tim; Marquand, Andre F.; Williams, Steve C.R.; Shawe-Taylor, John; Brammer, Michael

    2011-01-01

    Pattern recognition approaches, such as the Support Vector Machine (SVM), have been successfully used to classify groups of individuals based on their patterns of brain activity or structure. However these approaches focus on finding group differences and are not applicable to situations where one is interested in accessing deviations from a specific class or population. In the present work we propose an application of the one-class SVM (OC-SVM) to investigate if patterns of fMRI response to sad facial expressions in depressed patients would be classified as outliers in relation to patterns of healthy control subjects. We defined features based on whole brain voxels and anatomical regions. In both cases we found a significant correlation between the OC-SVM predictions and the patients' Hamilton Rating Scale for Depression (HRSD), i.e. the more depressed the patients were the more of an outlier they were. In addition the OC-SVM split the patient groups into two subgroups whose membership was associated with future response to treatment. When applied to region-based features the OC-SVM classified 52% of patients as outliers. However among the patients classified as outliers 70% did not respond to treatment and among those classified as non-outliers 89% responded to treatment. In addition 89% of the healthy controls were classified as non-outliers. PMID:21723950

  20. Vector-model-supported optimization in volumetric-modulated arc stereotactic radiotherapy planning for brain metastasis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Eva Sau Fan; Department of Health Technology and Informatics, The Hong Kong Polytechnic University; Wu, Vincent Wing Cheung

    Long planning time in volumetric-modulated arc stereotactic radiotherapy (VMA-SRT) cases can limit its clinical efficiency and use. A vector model could retrieve previously successful radiotherapy cases that share various common anatomic features with the current case. The prsent study aimed to develop a vector model that could reduce planning time by applying the optimization parameters from those retrieved reference cases. Thirty-six VMA-SRT cases of brain metastasis (gender, male [n = 23], female [n = 13]; age range, 32 to 81 years old) were collected and used as a reference database. Another 10 VMA-SRT cases were planned with both conventional optimization and vector-model-supported optimization, followingmore » the oncologists' clinical dose prescriptions. Planning time and plan quality measures were compared using the 2-sided paired Wilcoxon signed rank test with a significance level of 0.05, with positive false discovery rate (pFDR) of less than 0.05. With vector-model-supported optimization, there was a significant reduction in the median planning time, a 40% reduction from 3.7 to 2.2 hours (p = 0.002, pFDR = 0.032), and for the number of iterations, a 30% reduction from 8.5 to 6.0 (p = 0.006, pFDR = 0.047). The quality of plans from both approaches was comparable. From these preliminary results, vector-model-supported optimization can expedite the optimization of VMA-SRT for brain metastasis while maintaining plan quality.« less

  1. Efficient boundary hunting via vector quantization

    NASA Astrophysics Data System (ADS)

    Diamantini, Claudia; Panti, Maurizio

    2001-03-01

    A great amount of information about a classification problem is contained in those instances falling near the decision boundary. This intuition dates back to the earliest studies in pattern recognition, and in the more recent adaptive approaches to the so called boundary hunting, such as the work of Aha et alii on Instance Based Learning and the work of Vapnik et alii on Support Vector Machines. The last work is of particular interest, since theoretical and experimental results ensure the accuracy of boundary reconstruction. However, its optimization approach has heavy computational and memory requirements, which limits its application on huge amounts of data. In the paper we describe an alternative approach to boundary hunting based on adaptive labeled quantization architectures. The adaptation is performed by a stochastic gradient algorithm for the minimization of the error probability. Error probability minimization guarantees the accurate approximation of the optimal decision boundary, while the use of a stochastic gradient algorithm defines an efficient method to reach such approximation. In the paper comparisons to Support Vector Machines are considered.

  2. Using an object-based grid system to evaluate a newly developed EP approach to formulate SVMs as applied to the classification of organophosphate nerve agents

    NASA Astrophysics Data System (ADS)

    Land, Walker H., Jr.; Lewis, Michael; Sadik, Omowunmi; Wong, Lut; Wanekaya, Adam; Gonzalez, Richard J.; Balan, Arun

    2004-04-01

    This paper extends the classification approaches described in reference [1] in the following way: (1.) developing and evaluating a new method for evolving organophosphate nerve agent Support Vector Machine (SVM) classifiers using Evolutionary Programming, (2.) conducting research experiments using a larger database of organophosphate nerve agents, and (3.) upgrading the architecture to an object-based grid system for evaluating the classification of EP derived SVMs. Due to the increased threats of chemical and biological weapons of mass destruction (WMD) by international terrorist organizations, a significant effort is underway to develop tools that can be used to detect and effectively combat biochemical warfare. This paper reports the integration of multi-array sensors with Support Vector Machines (SVMs) for the detection of organophosphates nerve agents using a grid computing system called Legion. Grid computing is the use of large collections of heterogeneous, distributed resources (including machines, databases, devices, and users) to support large-scale computations and wide-area data access. Finally, preliminary results using EP derived support vector machines designed to operate on distributed systems have provided accurate classification results. In addition, distributed training time architectures are 50 times faster when compared to standard iterative training time methods.

  3. Application of machine learning using support vector machines for crater detection from Martian digital topography data

    NASA Astrophysics Data System (ADS)

    Salamunićcar, Goran; Lončarić, Sven

    In our previous work, in order to extend the GT-57633 catalogue [PSS, 56 (15), 1992-2008] with still uncatalogued impact-craters, the following has been done [GRS, 48 (5), in press, doi:10.1109/TGRS.2009.2037750]: (1) the crater detection algorithm (CDA) based on digital elevation model (DEM) was developed; (2) using 1/128° MOLA data, this CDA proposed 414631 crater-candidates; (3) each crater-candidate was analyzed manually; and (4) 57592 were confirmed as correct detections. The resulting GT-115225 catalog is the significant result of this effort. However, to check such a large number of crater-candidates manually was a demanding task. This was the main motivation for work on improvement of the CDA in order to provide better classification of craters as true and false detections. To achieve this, we extended the CDA with the machine learning capability, using support vector machines (SVM). In the first step, the CDA (re)calculates numerous terrain morphometric attributes from DEM. For this purpose, already existing modules of the CDA from our previous work were reused in order to be capable to prepare these attributes. In addition, new attributes were introduced such as ellipse eccentricity and tilt. For machine learning purpose, the CDA is additionally extended to provide 2-D topography-profile and 3-D shape for each crater-candidate. The latter two are a performance problem because of the large number of crater-candidates in combination with the large number of attributes. As a solution, we developed a CDA architecture wherein it is possible to combine the SVM with a radial basis function (RBF) or any other kernel (for initial set of attributes), with the SVM with linear kernel (for the cases when 2-D and 3-D data are included as well). Another challenge is that, in addition to diversity of possible crater types, there are numerous morphological differences between the smallest (mostly very circular bowl-shaped craters) and the largest (multi-ring) impact

  4. Classification of postural profiles among mouth-breathing children by learning vector quantization.

    PubMed

    Mancini, F; Sousa, F S; Hummel, A D; Falcão, A E J; Yi, L C; Ortolani, C F; Sigulem, D; Pisa, I T

    2011-01-01

    Mouth breathing is a chronic syndrome that may bring about postural changes. Finding characteristic patterns of changes occurring in the complex musculoskeletal system of mouth-breathing children has been a challenge. Learning vector quantization (LVQ) is an artificial neural network model that can be applied for this purpose. The aim of the present study was to apply LVQ to determine the characteristic postural profiles shown by mouth-breathing children, in order to further understand abnormal posture among mouth breathers. Postural training data on 52 children (30 mouth breathers and 22 nose breathers) and postural validation data on 32 children (22 mouth breathers and 10 nose breathers) were used. The performance of LVQ and other classification models was compared in relation to self-organizing maps, back-propagation applied to multilayer perceptrons, Bayesian networks, naive Bayes, J48 decision trees, k, and k-nearest-neighbor classifiers. Classifier accuracy was assessed by means of leave-one-out cross-validation, area under ROC curve (AUC), and inter-rater agreement (Kappa statistics). By using the LVQ model, five postural profiles for mouth-breathing children could be determined. LVQ showed satisfactory results for mouth-breathing and nose-breathing classification: sensitivity and specificity rates of 0.90 and 0.95, respectively, when using the training dataset, and 0.95 and 0.90, respectively, when using the validation dataset. The five postural profiles for mouth-breathing children suggested by LVQ were incorporated into application software for classifying the severity of mouth breathers' abnormal posture.

  5. A collaborative framework for Distributed Privacy-Preserving Support Vector Machine learning.

    PubMed

    Que, Jialan; Jiang, Xiaoqian; Ohno-Machado, Lucila

    2012-01-01

    A Support Vector Machine (SVM) is a popular tool for decision support. The traditional way to build an SVM model is to estimate parameters based on a centralized repository of data. However, in the field of biomedicine, patient data are sometimes stored in local repositories or institutions where they were collected, and may not be easily shared due to privacy concerns. This creates a substantial barrier for researchers to effectively learn from the distributed data using machine learning tools like SVMs. To overcome this difficulty and promote efficient information exchange without sharing sensitive raw data, we developed a Distributed Privacy Preserving Support Vector Machine (DPP-SVM). The DPP-SVM enables privacy-preserving collaborative learning, in which a trusted server integrates "privacy-insensitive" intermediary results. The globally learned model is guaranteed to be exactly the same as learned from combined data. We also provide a free web-service (http://privacy.ucsd.edu:8080/ppsvm/) for multiple participants to collaborate and complete the SVM-learning task in an efficient and privacy-preserving manner.

  6. A Collaborative Framework for Distributed Privacy-Preserving Support Vector Machine Learning

    PubMed Central

    Que, Jialan; Jiang, Xiaoqian; Ohno-Machado, Lucila

    2012-01-01

    A Support Vector Machine (SVM) is a popular tool for decision support. The traditional way to build an SVM model is to estimate parameters based on a centralized repository of data. However, in the field of biomedicine, patient data are sometimes stored in local repositories or institutions where they were collected, and may not be easily shared due to privacy concerns. This creates a substantial barrier for researchers to effectively learn from the distributed data using machine learning tools like SVMs. To overcome this difficulty and promote efficient information exchange without sharing sensitive raw data, we developed a Distributed Privacy Preserving Support Vector Machine (DPP-SVM). The DPP-SVM enables privacy-preserving collaborative learning, in which a trusted server integrates “privacy-insensitive” intermediary results. The globally learned model is guaranteed to be exactly the same as learned from combined data. We also provide a free web-service (http://privacy.ucsd.edu:8080/ppsvm/) for multiple participants to collaborate and complete the SVM-learning task in an efficient and privacy-preserving manner. PMID:23304414

  7. Support vector machine for day ahead electricity price forecasting

    NASA Astrophysics Data System (ADS)

    Razak, Intan Azmira binti Wan Abdul; Abidin, Izham bin Zainal; Siah, Yap Keem; Rahman, Titik Khawa binti Abdul; Lada, M. Y.; Ramani, Anis Niza binti; Nasir, M. N. M.; Ahmad, Arfah binti

    2015-05-01

    Electricity price forecasting has become an important part of power system operation and planning. In a pool- based electric energy market, producers submit selling bids consisting in energy blocks and their corresponding minimum selling prices to the market operator. Meanwhile, consumers submit buying bids consisting in energy blocks and their corresponding maximum buying prices to the market operator. Hence, both producers and consumers use day ahead price forecasts to derive their respective bidding strategies to the electricity market yet reduce the cost of electricity. However, forecasting electricity prices is a complex task because price series is a non-stationary and highly volatile series. Many factors cause for price spikes such as volatility in load and fuel price as well as power import to and export from outside the market through long term contract. This paper introduces an approach of machine learning algorithm for day ahead electricity price forecasting with Least Square Support Vector Machine (LS-SVM). Previous day data of Hourly Ontario Electricity Price (HOEP), generation's price and demand from Ontario power market are used as the inputs for training data. The simulation is held using LSSVMlab in Matlab with the training and testing data of 2004. SVM that widely used for classification and regression has great generalization ability with structured risk minimization principle rather than empirical risk minimization. Moreover, same parameter settings in trained SVM give same results that absolutely reduce simulation process compared to other techniques such as neural network and time series. The mean absolute percentage error (MAPE) for the proposed model shows that SVM performs well compared to neural network.

  8. A new local-global approach for classification.

    PubMed

    Peres, R T; Pedreira, C E

    2010-09-01

    In this paper, we propose a new local-global pattern classification scheme that combines supervised and unsupervised approaches, taking advantage of both, local and global environments. We understand as global methods the ones concerned with the aim of constructing a model for the whole problem space using the totality of the available observations. Local methods focus into sub regions of the space, possibly using an appropriately selected subset of the sample. In the proposed method, the sample is first divided in local cells by using a Vector Quantization unsupervised algorithm, the LBG (Linde-Buzo-Gray). In a second stage, the generated assemblage of much easier problems is locally solved with a scheme inspired by Bayes' rule. Four classification methods were implemented for comparison purposes with the proposed scheme: Learning Vector Quantization (LVQ); Feedforward Neural Networks; Support Vector Machine (SVM) and k-Nearest Neighbors. These four methods and the proposed scheme were implemented in eleven datasets, two controlled experiments, plus nine public available datasets from the UCI repository. The proposed method has shown a quite competitive performance when compared to these classical and largely used classifiers. Our method is simple concerning understanding and implementation and is based on very intuitive concepts. Copyright 2010 Elsevier Ltd. All rights reserved.

  9. Application of machine learning on brain cancer multiclass classification

    NASA Astrophysics Data System (ADS)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a problem of multiclass classification. One approach to solve this problem is by first transforming it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: extremely many features (genes) and only a few number of samples. The application of machine learning on microarray gene expression dataset mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on support vector machine recursive feature elimination (SVM-RFE) principle which is improved to solve multiclass classification, called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the result of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the selected features on each subset are put on separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the method of the classifier to reduce computational complexity. While ordinary SVM finds single optimum hyperplane, the main objective Twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows this method could classify 71,4% of the overall test data correctly, using 100 and 1000 genes selected from multiple multiclass SVM-RFE feature selection method. Furthermore, the per class results show that this method could classify data of normal and MD class with 100% accuracy.

  10. Dual linear structured support vector machine tracking method via scale correlation filter

    NASA Astrophysics Data System (ADS)

    Li, Weisheng; Chen, Yanquan; Xiao, Bin; Feng, Chen

    2018-01-01

    Adaptive tracking-by-detection methods based on structured support vector machine (SVM) performed well on recent visual tracking benchmarks. However, these methods did not adopt an effective strategy of object scale estimation, which limits the overall tracking performance. We present a tracking method based on a dual linear structured support vector machine (DLSSVM) with a discriminative scale correlation filter. The collaborative tracker comprised of a DLSSVM model and a scale correlation filter obtains good results in tracking target position and scale estimation. The fast Fourier transform is applied for detection. Extensive experiments show that our tracking approach outperforms many popular top-ranking trackers. On a benchmark including 100 challenging video sequences, the average precision of the proposed method is 82.8%.

  11. Support vector machine and fuzzy C-mean clustering-based comparative evaluation of changes in motor cortex electroencephalogram under chronic alcoholism.

    PubMed

    Kumar, Surendra; Ghosh, Subhojit; Tetarway, Suhash; Sinha, Rakesh Kumar

    2015-07-01

    In this study, the magnitude and spatial distribution of frequency spectrum in the resting electroencephalogram (EEG) were examined to address the problem of detecting alcoholism in the cerebral motor cortex. The EEG signals were recorded from chronic alcoholic conditions (n = 20) and the control group (n = 20). Data were taken from motor cortex region and divided into five sub-bands (delta, theta, alpha, beta-1 and beta-2). Three methodologies were adopted for feature extraction: (1) absolute power, (2) relative power and (3) peak power frequency. The dimension of the extracted features is reduced by linear discrimination analysis and classified by support vector machine (SVM) and fuzzy C-mean clustering. The maximum classification accuracy (88 %) with SVM clustering was achieved with the EEG spectral features with absolute power frequency on F4 channel. Among the bands, relatively higher classification accuracy was found over theta band and beta-2 band in most of the channels when computed with the EEG features of relative power. Electrodes wise CZ, C3 and P4 were having more alteration. Considering the good classification accuracy obtained by SVM with relative band power features in most of the EEG channels of motor cortex, it can be suggested that the noninvasive automated online diagnostic system for the chronic alcoholic condition can be developed with the help of EEG signals.

  12. Noninvasive prostate cancer screening based on serum surface-enhanced Raman spectroscopy and support vector machine

    NASA Astrophysics Data System (ADS)

    Li, Shaoxin; Zhang, Yanjiao; Xu, Junfa; Li, Linfang; Zeng, Qiuyao; Lin, Lin; Guo, Zhouyi; Liu, Zhiming; Xiong, Honglian; Liu, Songhao

    2014-09-01

    This study aims to present a noninvasive prostate cancer screening methods using serum surface-enhanced Raman scattering (SERS) and support vector machine (SVM) techniques through peripheral blood sample. SERS measurements are performed using serum samples from 93 prostate cancer patients and 68 healthy volunteers by silver nanoparticles. Three types of kernel functions including linear, polynomial, and Gaussian radial basis function (RBF) are employed to build SVM diagnostic models for classifying measured SERS spectra. For comparably evaluating the performance of SVM classification models, the standard multivariate statistic analysis method of principal component analysis (PCA) is also applied to classify the same datasets. The study results show that for the RBF kernel SVM diagnostic model, the diagnostic accuracy of 98.1% is acquired, which is superior to the results of 91.3% obtained from PCA methods. The receiver operating characteristic curve of diagnostic models further confirm above research results. This study demonstrates that label-free serum SERS analysis technique combined with SVM diagnostic algorithm has great potential for noninvasive prostate cancer screening.

  13. Hybridization between multi-objective genetic algorithm and support vector machine for feature selection in walker-assisted gait.

    PubMed

    Martins, Maria; Costa, Lino; Frizera, Anselmo; Ceres, Ramón; Santos, Cristina

    2014-03-01

    Walker devices are often prescribed incorrectly to patients, leading to the increase of dissatisfaction and occurrence of several problems, such as, discomfort and pain. Thus, it is necessary to objectively evaluate the effects that assisted gait can have on the gait patterns of walker users, comparatively to a non-assisted gait. A gait analysis, focusing on spatiotemporal and kinematics parameters, will be issued for this purpose. However, gait analysis yields redundant information that often is difficult to interpret. This study addresses the problem of selecting the most relevant gait features required to differentiate between assisted and non-assisted gait. For that purpose, it is presented an efficient approach that combines evolutionary techniques, based on genetic algorithms, and support vector machine algorithms, to discriminate differences between assisted and non-assisted gait with a walker with forearm supports. For comparison purposes, other classification algorithms are verified. Results with healthy subjects show that the main differences are characterized by balance and joints excursion in the sagittal plane. These results, confirmed by clinical evidence, allow concluding that this technique is an efficient feature selection approach. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  14. Virus Database and Online Inquiry System Based on Natural Vectors.

    PubMed

    Dong, Rui; Zheng, Hui; Tian, Kun; Yau, Shek-Chung; Mao, Weiguang; Yu, Wenping; Yin, Changchuan; Yu, Chenglong; He, Rong Lucy; Yang, Jie; Yau, Stephen St

    2017-01-01

    We construct a virus database called VirusDB (http://yaulab.math.tsinghua.edu.cn/VirusDB/) and an online inquiry system to serve people who are interested in viral classification and prediction. The database stores all viral genomes, their corresponding natural vectors, and the classification information of the single/multiple-segmented viral reference sequences downloaded from National Center for Biotechnology Information. The online inquiry system serves the purpose of computing natural vectors and their distances based on submitted genomes, providing an online interface for accessing and using the database for viral classification and prediction, and back-end processes for automatic and manual updating of database content to synchronize with GenBank. Submitted genomes data in FASTA format will be carried out and the prediction results with 5 closest neighbors and their classifications will be returned by email. Considering the one-to-one correspondence between sequence and natural vector, time efficiency, and high accuracy, natural vector is a significant advance compared with alignment methods, which makes VirusDB a useful database in further research.

  15. Support vector machine classification and characterization of age-related reorganization of functional brain networks

    PubMed Central

    Meier, Timothy B.; Desphande, Alok S.; Vergun, Svyatoslav; Nair, Veena A.; Song, Jie; Biswal, Bharat B.; Meyerand, Mary E.; Birn, Rasmus M.; Prabhakaran, Vivek

    2012-01-01

    Most of what is known about the reorganization of functional brain networks that accompanies normal aging is based on neuroimaging studies in which participants perform specific tasks. In these studies, reorganization is defined by the differences in task activation between young and old adults. However, task activation differences could be the result of differences in task performance, strategy, or motivation, and not necessarily reflect reorganization. Resting-state fMRI provides a method of investigating functional brain networks without such confounds. Here, a support vector machine (SVM) classifier was used in an attempt to differentiate older adults from younger adults based on their resting-state functional connectivity. In addition, the information used by the SVM was investigated to see what functional connections best differentiated younger adult brains from older adult brains. Three separate resting-state scans from 26 younger adults (18-35 yrs) and 26 older adults (55-85) were obtained from the International Consortium for Brain Mapping (ICBM) dataset made publically available in the 1000 Functional Connectomes project www.nitrc.org/projects/fcon_1000. 100 seed-regions from four functional networks with 5 mm3 radius were defined based on a recent study using machine learning classifiers on adolescent brains. Time-series for every seed-region were averaged and three matrices of z-transformed correlation coefficients were created for each subject corresponding to each individual’s three resting-state scans. SVM was then applied using leave-one-out cross-validation. The SVM classifier was 84% accurate in classifying older and younger adult brains. The majority of the connections used by the classifier to distinguish subjects by age came from seed-regions belonging to the sensorimotor and cingulo-opercular networks. These results suggest that age-related decreases in positive correlations within the cingulo-opercular and default networks, and decreases in

  16. Support vector machine classification and characterization of age-related reorganization of functional brain networks.

    PubMed

    Meier, Timothy B; Desphande, Alok S; Vergun, Svyatoslav; Nair, Veena A; Song, Jie; Biswal, Bharat B; Meyerand, Mary E; Birn, Rasmus M; Prabhakaran, Vivek

    2012-03-01

    Most of what is known about the reorganization of functional brain networks that accompanies normal aging is based on neuroimaging studies in which participants perform specific tasks. In these studies, reorganization is defined by the differences in task activation between young and old adults. However, task activation differences could be the result of differences in task performance, strategy, or motivation, and not necessarily reflect reorganization. Resting-state fMRI provides a method of investigating functional brain networks without such confounds. Here, a support vector machine (SVM) classifier was used in an attempt to differentiate older adults from younger adults based on their resting-state functional connectivity. In addition, the information used by the SVM was investigated to see what functional connections best differentiated younger adult brains from older adult brains. Three separate resting-state scans from 26 younger adults (18-35 yrs) and 26 older adults (55-85) were obtained from the International Consortium for Brain Mapping (ICBM) dataset made publically available in the 1000 Functional Connectomes project www.nitrc.org/projects/fcon_1000. 100 seed-regions from four functional networks with 5mm(3) radius were defined based on a recent study using machine learning classifiers on adolescent brains. Time-series for every seed-region were averaged and three matrices of z-transformed correlation coefficients were created for each subject corresponding to each individual's three resting-state scans. SVM was then applied using leave-one-out cross-validation. The SVM classifier was 84% accurate in classifying older and younger adult brains. The majority of the connections used by the classifier to distinguish subjects by age came from seed-regions belonging to the sensorimotor and cingulo-opercular networks. These results suggest that age-related decreases in positive correlations within the cingulo-opercular and default networks, and decreases in

  17. Nonlinear programming for classification problems in machine learning

    NASA Astrophysics Data System (ADS)

    Astorino, Annabella; Fuduli, Antonio; Gaudioso, Manlio

    2016-10-01

    We survey some nonlinear models for classification problems arising in machine learning. In the last years this field has become more and more relevant due to a lot of practical applications, such as text and web classification, object recognition in machine vision, gene expression profile analysis, DNA and protein analysis, medical diagnosis, customer profiling etc. Classification deals with separation of sets by means of appropriate separation surfaces, which is generally obtained by solving a numerical optimization model. While linear separability is the basis of the most popular approach to classification, the Support Vector Machine (SVM), in the recent years using nonlinear separating surfaces has received some attention. The objective of this work is to recall some of such proposals, mainly in terms of the numerical optimization models. In particular we tackle the polyhedral, ellipsoidal, spherical and conical separation approaches and, for some of them, we also consider the semisupervised versions.

  18. Hierarchical vs non-hierarchical audio indexation and classification for video genres

    NASA Astrophysics Data System (ADS)

    Dammak, Nouha; BenAyed, Yassine

    2018-04-01

    In this paper, Support Vector Machines (SVMs) are used for segmenting and indexing video genres based on only audio features extracted at block level, which has a prominent asset by capturing local temporal information. The main contribution of our study is to show the wide effect on the classification accuracies while using an hierarchical categorization structure based on Mel Frequency Cepstral Coefficients (MFCC) audio descriptor. In fact, the classification consists in three common video genres: sports videos, music clips and news scenes. The sub-classification may divide each genre into several multi-speaker and multi-dialect sub-genres. The validation of this approach was carried out on over 360 minutes of video span yielding a classification accuracy of over 99%.

  19. Voice based gender classification using machine learning

    NASA Astrophysics Data System (ADS)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problem speech analysis today. Tracing the gender from acoustic data i.e., pitch, median, frequency etc. Machine learning gives promising results for classification problem in all the research domains. There are several performance metrics to evaluate algorithms of an area. Our Comparative model algorithm for evaluating 5 different machine learning algorithms based on eight different metrics in gender classification from acoustic data. Agenda is to identify gender, with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM) on basis of eight different metrics. The main parameter in evaluating any algorithms is its performance. Misclassification rate must be less in classification problems, which says that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. Here with this comparative model algorithm, we are trying to assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  20. Fault diagnosis of automobile hydraulic brake system using statistical features and support vector machines

    NASA Astrophysics Data System (ADS)

    Jegadeeshwaran, R.; Sugumaran, V.

    2015-02-01

    Hydraulic brakes in automobiles are important components for the safety of passengers; therefore, the brakes are a good subject for condition monitoring. The condition of the brake components can be monitored by using the vibration characteristics. On-line condition monitoring by using machine learning approach is proposed in this paper as a possible solution to such problems. The vibration signals for both good as well as faulty conditions of brakes were acquired from a hydraulic brake test setup with the help of a piezoelectric transducer and a data acquisition system. Descriptive statistical features were extracted from the acquired vibration signals and the feature selection was carried out using the C4.5 decision tree algorithm. There is no specific method to find the right number of features required for classification for a given problem. Hence an extensive study is needed to find the optimum number of features. The effect of the number of features was also studied, by using the decision tree as well as Support Vector Machines (SVM). The selected features were classified using the C-SVM and Nu-SVM with different kernel functions. The results are discussed and the conclusion of the study is presented.

  1. The optional selection of micro-motion feature based on Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Li, Bo; Ren, Hongmei; Xiao, Zhi-he; Sheng, Jing

    2017-11-01

    Micro-motion form of target is multiple, different micro-motion forms are apt to be modulated, which makes it difficult for feature extraction and recognition. Aiming at feature extraction of cone-shaped objects with different micro-motion forms, this paper proposes the best selection method of micro-motion feature based on support vector machine. After the time-frequency distribution of radar echoes, comparing the time-frequency spectrum of objects with different micro-motion forms, features are extracted based on the differences between the instantaneous frequency variations of different micro-motions. According to the methods based on SVM (Support Vector Machine) features are extracted, then the best features are acquired. Finally, the result shows the method proposed in this paper is feasible under the test condition of certain signal-to-noise ratio(SNR).

  2. Genetic algorithm based feature selection combined with dual classification for the automated detection of proliferative diabetic retinopathy.

    PubMed

    Welikala, R A; Fraz, M M; Dehmeshki, J; Hoppe, A; Tah, V; Mann, S; Williamson, T H; Barman, S A

    2015-07-01

    Proliferative diabetic retinopathy (PDR) is a condition that carries a high risk of severe visual impairment. The hallmark of PDR is the growth of abnormal new vessels. In this paper, an automated method for the detection of new vessels from retinal images is presented. This method is based on a dual classification approach. Two vessel segmentation approaches are applied to create two separate binary vessel map which each hold vital information. Local morphology features are measured from each binary vessel map to produce two separate 4-D feature vectors. Independent classification is performed for each feature vector using a support vector machine (SVM) classifier. The system then combines these individual outcomes to produce a final decision. This is followed by the creation of additional features to generate 21-D feature vectors, which feed into a genetic algorithm based feature selection approach with the objective of finding feature subsets that improve the performance of the classification. Sensitivity and specificity results using a dataset of 60 images are 0.9138 and 0.9600, respectively, on a per patch basis and 1.000 and 0.975, respectively, on a per image basis. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. Gender classification from face images by using local binary pattern and gray-level co-occurrence matrix

    NASA Astrophysics Data System (ADS)

    Uzbaş, Betül; Arslan, Ahmet

    2018-04-01

    Gender is an important step for human computer interactive processes and identification. Human face image is one of the important sources to determine gender. In the present study, gender classification is performed automatically from facial images. In order to classify gender, we propose a combination of features that have been extracted face, eye and lip regions by using a hybrid method of Local Binary Pattern and Gray-Level Co-Occurrence Matrix. The features have been extracted from automatically obtained face, eye and lip regions. All of the extracted features have been combined and given as input parameters to classification methods (Support Vector Machine, Artificial Neural Networks, Naive Bayes and k-Nearest Neighbor methods) for gender classification. The Nottingham Scan face database that consists of the frontal face images of 100 people (50 male and 50 female) is used for this purpose. As the result of the experimental studies, the highest success rate has been achieved as 98% by using Support Vector Machine. The experimental results illustrate the efficacy of our proposed method.

  4. Support Vector Machines Model of Computed Tomography for Assessing Lymph Node Metastasis in Esophageal Cancer with Neoadjuvant Chemotherapy.

    PubMed

    Wang, Zhi-Long; Zhou, Zhi-Guo; Chen, Ying; Li, Xiao-Ting; Sun, Ying-Shi

    The aim of this study was to diagnose lymph node metastasis of esophageal cancer by support vector machines model based on computed tomography. A total of 131 esophageal cancer patients with preoperative chemotherapy and radical surgery were included. Various indicators (tumor thickness, tumor length, tumor CT value, total number of lymph nodes, and long axis and short axis sizes of largest lymph node) on CT images before and after neoadjuvant chemotherapy were recorded. A support vector machines model based on these CT indicators was built to predict lymph node metastasis. Support vector machines model diagnosed lymph node metastasis better than preoperative short axis size of largest lymph node on CT. The area under the receiver operating characteristic curves were 0.887 and 0.705, respectively. The support vector machine model of CT images can help diagnose lymph node metastasis in esophageal cancer with preoperative chemotherapy.

  5. An efficient scheme for automatic web pages categorization using the support vector machine

    NASA Astrophysics Data System (ADS)

    Bhalla, Vinod Kumar; Kumar, Neeraj

    2016-07-01

    In the past few years, with an evolution of the Internet and related technologies, the number of the Internet users grows exponentially. These users demand access to relevant web pages from the Internet within fraction of seconds. To achieve this goal, there is a requirement of an efficient categorization of web page contents. Manual categorization of these billions of web pages to achieve high accuracy is a challenging task. Most of the existing techniques reported in the literature are semi-automatic. Using these techniques, higher level of accuracy cannot be achieved. To achieve these goals, this paper proposes an automatic web pages categorization into the domain category. The proposed scheme is based on the identification of specific and relevant features of the web pages. In the proposed scheme, first extraction and evaluation of features are done followed by filtering the feature set for categorization of domain web pages. A feature extraction tool based on the HTML document object model of the web page is developed in the proposed scheme. Feature extraction and weight assignment are based on the collection of domain-specific keyword list developed by considering various domain pages. Moreover, the keyword list is reduced on the basis of ids of keywords in keyword list. Also, stemming of keywords and tag text is done to achieve a higher accuracy. An extensive feature set is generated to develop a robust classification technique. The proposed scheme was evaluated using a machine learning method in combination with feature extraction and statistical analysis using support vector machine kernel as the classification tool. The results obtained confirm the effectiveness of the proposed scheme in terms of its accuracy in different categories of web pages.

  6. Evaluation of different classification methods for the diagnosis of schizophrenia based on functional near-infrared spectroscopy.

    PubMed

    Li, Zhaohua; Wang, Yuduo; Quan, Wenxiang; Wu, Tongning; Lv, Bin

    2015-02-15

    Based on near-infrared spectroscopy (NIRS), recent converging evidence has been observed that patients with schizophrenia exhibit abnormal functional activities in the prefrontal cortex during a verbal fluency task (VFT). Therefore, some studies have attempted to employ NIRS measurements to differentiate schizophrenia patients from healthy controls with different classification methods. However, no systematic evaluation was conducted to compare their respective classification performances on the same study population. In this study, we evaluated the classification performance of four classification methods (including linear discriminant analysis, k-nearest neighbors, Gaussian process classifier, and support vector machines) on an NIRS-aided schizophrenia diagnosis. We recruited a large sample of 120 schizophrenia patients and 120 healthy controls and measured the hemoglobin response in the prefrontal cortex during the VFT using a multichannel NIRS system. Features for classification were extracted from three types of NIRS data in each channel. We subsequently performed a principal component analysis (PCA) for feature selection prior to comparison of the different classification methods. We achieved a maximum accuracy of 85.83% and an overall mean accuracy of 83.37% using a PCA-based feature selection on oxygenated hemoglobin signals and support vector machine classifier. This is the first comprehensive evaluation of different classification methods for the diagnosis of schizophrenia based on different types of NIRS signals. Our results suggested that, using the appropriate classification method, NIRS has the potential capacity to be an effective objective biomarker for the diagnosis of schizophrenia. Copyright © 2014 Elsevier B.V. All rights reserved.

  7. Emotion recognition from single-trial EEG based on kernel Fisher's emotion pattern and imbalanced quasiconformal kernel support vector machine.

    PubMed

    Liu, Yi-Hung; Wu, Chien-Te; Cheng, Wei-Teng; Hsiao, Yu-Tsung; Chen, Po-Ming; Teng, Jyh-Tong

    2014-07-24

    Electroencephalogram-based emotion recognition (EEG-ER) has received increasing attention in the fields of health care, affective computing, and brain-computer interface (BCI). However, satisfactory ER performance within a bi-dimensional and non-discrete emotional space using single-trial EEG data remains a challenging task. To address this issue, we propose a three-layer scheme for single-trial EEG-ER. In the first layer, a set of spectral powers of different EEG frequency bands are extracted from multi-channel single-trial EEG signals. In the second layer, the kernel Fisher's discriminant analysis method is applied to further extract features with better discrimination ability from the EEG spectral powers. The feature vector produced by layer 2 is called a kernel Fisher's emotion pattern (KFEP), and is sent into layer 3 for further classification where the proposed imbalanced quasiconformal kernel support vector machine (IQK-SVM) serves as the emotion classifier. The outputs of the three layer EEG-ER system include labels of emotional valence and arousal. Furthermore, to collect effective training and testing datasets for the current EEG-ER system, we also use an emotion-induction paradigm in which a set of pictures selected from the International Affective Picture System (IAPS) are employed as emotion induction stimuli. The performance of the proposed three-layer solution is compared with that of other EEG spectral power-based features and emotion classifiers. Results on 10 healthy participants indicate that the proposed KFEP feature performs better than other spectral power features, and IQK-SVM outperforms traditional SVM in terms of the EEG-ER accuracy. Our findings also show that the proposed EEG-ER scheme achieves the highest classification accuracies of valence (82.68%) and arousal (84.79%) among all testing methods.

  8. Emotion Recognition from Single-Trial EEG Based on Kernel Fisher's Emotion Pattern and Imbalanced Quasiconformal Kernel Support Vector Machine

    PubMed Central

    Liu, Yi-Hung; Wu, Chien-Te; Cheng, Wei-Teng; Hsiao, Yu-Tsung; Chen, Po-Ming; Teng, Jyh-Tong

    2014-01-01

    Electroencephalogram-based emotion recognition (EEG-ER) has received increasing attention in the fields of health care, affective computing, and brain-computer interface (BCI). However, satisfactory ER performance within a bi-dimensional and non-discrete emotional space using single-trial EEG data remains a challenging task. To address this issue, we propose a three-layer scheme for single-trial EEG-ER. In the first layer, a set of spectral powers of different EEG frequency bands are extracted from multi-channel single-trial EEG signals. In the second layer, the kernel Fisher's discriminant analysis method is applied to further extract features with better discrimination ability from the EEG spectral powers. The feature vector produced by layer 2 is called a kernel Fisher's emotion pattern (KFEP), and is sent into layer 3 for further classification where the proposed imbalanced quasiconformal kernel support vector machine (IQK-SVM) serves as the emotion classifier. The outputs of the three layer EEG-ER system include labels of emotional valence and arousal. Furthermore, to collect effective training and testing datasets for the current EEG-ER system, we also use an emotion-induction paradigm in which a set of pictures selected from the International Affective Picture System (IAPS) are employed as emotion induction stimuli. The performance of the proposed three-layer solution is compared with that of other EEG spectral power-based features and emotion classifiers. Results on 10 healthy participants indicate that the proposed KFEP feature performs better than other spectral power features, and IQK-SVM outperforms traditional SVM in terms of the EEG-ER accuracy. Our findings also show that the proposed EEG-ER scheme achieves the highest classification accuracies of valence (82.68%) and arousal (84.79%) among all testing methods. PMID:25061837

  9. Discrimination of raw and processed Dipsacus asperoides by near infrared spectroscopy combined with least squares-support vector machine and random forests

    NASA Astrophysics Data System (ADS)

    Xin, Ni; Gu, Xiao-Feng; Wu, Hao; Hu, Yu-Zhu; Yang, Zhong-Lin

    2012-04-01

    Most herbal medicines could be processed to fulfill the different requirements of therapy. The purpose of this study was to discriminate between raw and processed Dipsacus asperoides, a common traditional Chinese medicine, based on their near infrared (NIR) spectra. Least squares-support vector machine (LS-SVM) and random forests (RF) were employed for full-spectrum classification. Three types of kernels, including linear kernel, polynomial kernel and radial basis function kernel (RBF), were checked for optimization of LS-SVM model. For comparison, a linear discriminant analysis (LDA) model was performed for classification, and the successive projections algorithm (SPA) was executed prior to building an LDA model to choose an appropriate subset of wavelengths. The three methods were applied to a dataset containing 40 raw herbs and 40 corresponding processed herbs. We ran 50 runs of 10-fold cross validation to evaluate the model's efficiency. The performance of the LS-SVM with RBF kernel (RBF LS-SVM) was better than the other two kernels. The RF, RBF LS-SVM and SPA-LDA successfully classified all test samples. The mean error rates for the 50 runs of 10-fold cross validation were 1.35% for RBF LS-SVM, 2.87% for RF, and 2.50% for SPA-LDA. The best classification results were obtained by using LS-SVM with RBF kernel, while RF was fast in the training and making predictions.

  10. Bleeding detection in wireless capsule endoscopy using adaptive colour histogram model and support vector classification

    NASA Astrophysics Data System (ADS)

    Mackiewicz, Michal W.; Fisher, Mark; Jamieson, Crawford

    2008-03-01

    Wireless Capsule Endoscopy (WCE) is a colour imaging technology that enables detailed examination of the interior of the gastrointestinal tract. A typical WCE examination takes ~ 8 hours and captures ~ 40,000 useful images. After the examination, the images are viewed as a video sequence, which generally takes a clinician over an hour to analyse. The manufacturers of the WCE provide certain automatic image analysis functions e.g. Given Imaging offers in their Rapid Reader software: The Suspected Blood Indicator (SBI), which is designed to report the location in the video of areas of active bleeding. However, this tool has been reported to have insufficient specificity and sensitivity. Therefore it does not free the specialist from reviewing the entire footage and was suggested only to be used as a fast screening tool. In this paper we propose a method of bleeding detection that uses in its first stage Hue-Saturation-Intensity colour histograms to track a moving background and bleeding colour distributions over time. Such an approach addresses the problem caused by drastic changes in blood colour distribution that occur when it is altered by gastrointestinal fluids and allow detection of other red lesions, which although are usually "less red" than fresh bleeding, they can still be detected when the difference between their colour distributions and the background is large enough. In the second stage of our method, we analyse all candidate blood frames, by extracting colour (HSI) and texture (LBP) features from the suspicious image regions (obtained in the first stage) and their neighbourhoods and classifying them using Support Vector Classifier into Bleeding, Lesion and Normal classes. We show that our algorithm compares favourably with the SBI on the test set of 84 full length videos.

  11. Classification of inflammatory bowel diseases by means of Raman spectroscopic imaging of epithelium cells

    NASA Astrophysics Data System (ADS)

    Bielecki, Christiane; Bocklitz, Thomas W.; Schmitt, Michael; Krafft, Christoph; Marquardt, Claudio; Gharbi, Akram; Knösel, Thomas; Stallmach, Andreas; Popp, Juergen

    2012-07-01

    We report on a Raman microspectroscopic characterization of the inflammatory bowel diseases (IBD) Crohn's disease (CD) and ulcerative colitis (UC). Therefore, Raman maps of human colon tissue sections were analyzed by utilizing innovative chemometric approaches. First, support vector machines were applied to highlight the tissue morphology (=Raman spectroscopic histopathology). In a second step, the biochemical tissue composition has been studied by analyzing the epithelium Raman spectra of sections of healthy control subjects (n=11), subjects with CD (n=14), and subjects with UC (n=13). These three groups exhibit significantly different molecular specific Raman signatures, allowing establishment of a classifier (support-vector-machine). By utilizing this classifier it was possible to separate between healthy control patients, patients with CD, and patients with UC with an accuracy of 98.90%. The automatic design of both classification steps (visualization of the tissue morphology and molecular classification of IBD) paves the way for an objective clinical diagnosis of IBD by means of Raman spectroscopy in combination with chemometric approaches.

  12. The Identification of Hunger Behaviour of Lates Calcarifer through the Integration of Image Processing Technique and Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Taha, Z.; Razman, M. A. M.; Adnan, F. A.; Ghani, A. S. Abdul; Majeed, A. P. P. Abdul; Musa, R. M.; Sallehudin, M. F.; Mukai, Y.

    2018-03-01

    Fish Hunger behaviour is one of the important element in determining the fish feeding routine, especially for farmed fishes. Inaccurate feeding routines (under-feeding or over-feeding) lead the fishes to die and thus, reduces the total production of fishes. The excessive food which is not eaten by fish will be dissolved in the water and thus, reduce the water quality (oxygen quantity in the water will be reduced). The reduction of oxygen (water quality) leads the fish to die and in some cases, may lead to fish diseases. This study correlates Barramundi fish-school behaviour with hunger condition through the hybrid data integration of image processing technique. The behaviour is clustered with respect to the position of the centre of gravity of the school of fish prior feeding, during feeding and after feeding. The clustered fish behaviour is then classified by means of a machine learning technique namely Support vector machine (SVM). It has been shown from the study that the Fine Gaussian variation of SVM is able to provide a reasonably accurate classification of fish feeding behaviour with a classification accuracy of 79.7%. The proposed integration technique may increase the usefulness of the captured data and thus better differentiates the various behaviour of farmed fishes.

  13. Human action recognition with group lasso regularized-support vector machine

    NASA Astrophysics Data System (ADS)

    Luo, Huiwu; Lu, Huanzhang; Wu, Yabei; Zhao, Fei

    2016-05-01

    The bag-of-visual-words (BOVW) and Fisher kernel are two popular models in human action recognition, and support vector machine (SVM) is the most commonly used classifier for the two models. We show two kinds of group structures in the feature representation constructed by BOVW and Fisher kernel, respectively, since the structural information of feature representation can be seen as a prior for the classifier and can improve the performance of the classifier, which has been verified in several areas. However, the standard SVM employs L2-norm regularization in its learning procedure, which penalizes each variable individually and cannot express the structural information of feature representation. We replace the L2-norm regularization with group lasso regularization in standard SVM, and a group lasso regularized-support vector machine (GLRSVM) is proposed. Then, we embed the group structural information of feature representation into GLRSVM. Finally, we introduce an algorithm to solve the optimization problem of GLRSVM by alternating directions method of multipliers. The experiments evaluated on KTH, YouTube, and Hollywood2 datasets show that our method achieves promising results and improves the state-of-the-art methods on KTH and YouTube datasets.

  14. Incremental classification learning for anomaly detection in medical images

    NASA Astrophysics Data System (ADS)

    Giritharan, Balathasan; Yuan, Xiaohui; Liu, Jianguo

    2009-02-01

    Computer-aided diagnosis usually screens thousands of instances to find only a few positive cases that indicate probable presence of disease.The amount of patient data increases consistently all the time. In diagnosis of new instances, disagreement occurs between a CAD system and physicians, which suggests inaccurate classifiers. Intuitively, misclassified instances and the previously acquired data should be used to retrain the classifier. This, however, is very time consuming and, in some cases where dataset is too large, becomes infeasible. In addition, among the patient data, only a small percentile shows positive sign, which is known as imbalanced data.We present an incremental Support Vector Machines(SVM) as a solution for the class imbalance problem in classification of anomaly in medical images. The support vectors provide a concise representation of the distribution of the training data. Here we use bootstrapping to identify potential candidate support vectors for future iterations. Experiments were conducted using images from endoscopy videos, and the sensitivity and specificity were close to that of SVM trained using all samples available at a given incremental step with significantly improved efficiency in training the classifier.

  15. Using Support Vector Machine to identify imaging biomarkers of neurological and psychiatric disease: a critical review.

    PubMed

    Orrù, Graziella; Pettersson-Yeo, William; Marquand, Andre F; Sartori, Giuseppe; Mechelli, Andrea

    2012-04-01

    Standard univariate analysis of neuroimaging data has revealed a host of neuroanatomical and functional differences between healthy individuals and patients suffering a wide range of neurological and psychiatric disorders. Significant only at group level however these findings have had limited clinical translation, and recent attention has turned toward alternative forms of analysis, including Support-Vector-Machine (SVM). A type of machine learning, SVM allows categorisation of an individual's previously unseen data into a predefined group using a classification algorithm, developed on a training data set. In recent years, SVM has been successfully applied in the context of disease diagnosis, transition prediction and treatment prognosis, using both structural and functional neuroimaging data. Here we provide a brief overview of the method and review those studies that applied it to the investigation of Alzheimer's disease, schizophrenia, major depression, bipolar disorder, presymptomatic Huntington's disease, Parkinson's disease and autistic spectrum disorder. We conclude by discussing the main theoretical and practical challenges associated with the implementation of this method into the clinic and possible future directions. Copyright © 2012 Elsevier Ltd. All rights reserved.

  16. Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification.

    PubMed

    Sinclair, Robert M; Ravantti, Janne J; Bamford, Dennis H

    2017-04-15

    Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allow them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. Importantly

  17. Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification

    PubMed Central

    Sinclair, Robert M.; Ravantti, Janne J.

    2017-01-01

    ABSTRACT Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allow them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids

  18. Spatial Mutual Information Based Hyperspectral Band Selection for Classification

    PubMed Central

    2015-01-01

    The amount of information involved in hyperspectral imaging is large. Hyperspectral band selection is a popular method for reducing dimensionality. Several information based measures such as mutual information have been proposed to reduce information redundancy among spectral bands. Unfortunately, mutual information does not take into account the spatial dependency between adjacent pixels in images thus reducing its robustness as a similarity measure. In this paper, we propose a new band selection method based on spatial mutual information. As validation criteria, a supervised classification method using support vector machine (SVM) is used. Experimental results of the classification of hyperspectral datasets show that the proposed method can achieve more accurate results. PMID:25918742

  19. A proposed ecosystem services classification system to support green accounting

    EPA Science Inventory

    There are a multitude of actual or envisioned, complete or incomplete, ecosystem service classification systems being proposed to support Green Accounting. Green Accounting is generally thought to be the formal accounting attempt to factor environmental production into National ...

  20. Predicting domain-domain interaction based on domain profiles with feature selection and support vector machines

    PubMed Central

    2010-01-01

    Background Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI) is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles. Results In this paper, we propose a computational method to predict DDI using support vector machines (SVMs), based on domains represented as interaction profile hidden Markov models (ipHMM) where interacting residues in domains are explicitly modeled according to the three dimensional structural information available at the Protein Data Bank (PDB). Features about the domains are extracted first as the Fisher scores derived from the ipHMM and then selected using singular value decomposition (SVD). Domain pairs are represented by concatenating their selected feature vectors, and classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross validation experiments with a set of interacting protein pairs adopted from the 3DID database. The prediction accuracy has shown significant improvement as compared to InterPreTS (Interaction Prediction through Tertiary Structure), an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure. Conclusions We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines. Datasets and source code are freely available on the web at http

  1. Chinese Sentence Classification Based on Convolutional Neural Network

    NASA Astrophysics Data System (ADS)

    Gu, Chengwei; Wu, Ming; Zhang, Chuang

    2017-10-01

    Sentence classification is one of the significant issues in Natural Language Processing (NLP). Feature extraction is often regarded as the key point for natural language processing. Traditional ways based on machine learning can not take high level features into consideration, such as Naive Bayesian Model. The neural network for sentence classification can make use of contextual information to achieve greater results in sentence classification tasks. In this paper, we focus on classifying Chinese sentences. And the most important is that we post a novel architecture of Convolutional Neural Network (CNN) to apply on Chinese sentence classification. In particular, most of the previous methods often use softmax classifier for prediction, we embed a linear support vector machine to substitute softmax in the deep neural network model, minimizing a margin-based loss to get a better result. And we use tanh as an activation function, instead of ReLU. The CNN model improve the result of Chinese sentence classification tasks. Experimental results on the Chinese news title database validate the effectiveness of our model.

  2. A comparative study of machine learning models for ethnicity classification

    NASA Astrophysics Data System (ADS)

    Trivedi, Advait; Bessie Amali, D. Geraldine

    2017-11-01

    This paper endeavours to adopt a machine learning approach to solve the problem of ethnicity recognition. Ethnicity identification is an important vision problem with its use cases being extended to various domains. Despite the multitude of complexity involved, ethnicity identification comes naturally to humans. This meta information can be leveraged to make several decisions, be it in target marketing or security. With the recent development of intelligent systems a sub module to efficiently capture ethnicity would be useful in several use cases. Several attempts to identify an ideal learning model to represent a multi-ethnic dataset have been recorded. A comparative study of classifiers such as support vector machines, logistic regression has been documented. Experimental results indicate that the logical classifier provides a much accurate classification than the support vector machine.

  3. Evaluating the statistical performance of less applied algorithms in classification of worldview-3 imagery data in an urbanized landscape

    NASA Astrophysics Data System (ADS)

    Ranaie, Mehrdad; Soffianian, Alireza; Pourmanafi, Saeid; Mirghaffari, Noorollah; Tarkesh, Mostafa

    2018-03-01

    In recent decade, analyzing the remotely sensed imagery is considered as one of the most common and widely used procedures in the environmental studies. In this case, supervised image classification techniques play a central role. Hence, taking a high resolution Worldview-3 over a mixed urbanized landscape in Iran, three less applied image classification methods including Bagged CART, Stochastic gradient boosting model and Neural network with feature extraction were tested and compared with two prevalent methods: random forest and support vector machine with linear kernel. To do so, each method was run ten time and three validation techniques was used to estimate the accuracy statistics consist of cross validation, independent validation and validation with total of train data. Moreover, using ANOVA and Tukey test, statistical difference significance between the classification methods was significantly surveyed. In general, the results showed that random forest with marginal difference compared to Bagged CART and stochastic gradient boosting model is the best performing method whilst based on independent validation there was no significant difference between the performances of classification methods. It should be finally noted that neural network with feature extraction and linear support vector machine had better processing speed than other.

  4. Automatic detection of sleep apnea based on EEG detrended fluctuation analysis and support vector machine.

    PubMed

    Zhou, Jing; Wu, Xiao-ming; Zeng, Wei-jie

    2015-12-01

    Sleep apnea syndrome (SAS) is prevalent in individuals and recently, there are many studies focus on using simple and efficient methods for SAS detection instead of polysomnography. However, not much work has been done on using nonlinear behavior of the electroencephalogram (EEG) signals. The purpose of this study is to find a novel and simpler method for detecting apnea patients and to quantify nonlinear characteristics of the sleep apnea. 30 min EEG scaling exponents that quantify power-law correlations were computed using detrended fluctuation analysis (DFA) and compared between six SAS and six healthy subjects during sleep. The mean scaling exponents were calculated every 30 s and 360 control values and 360 apnea values were obtained. These values were compared between the two groups and support vector machine (SVM) was used to classify apnea patients. Significant difference was found between EEG scaling exponents of the two groups (p < 0.001). SVM was used and obtained high and consistent recognition rate: average classification accuracy reached 95.1% corresponding to the sensitivity 93.2% and specificity 98.6%. DFA of EEG is an efficient and practicable method and is helpful clinically in diagnosis of sleep apnea.

  5. Biomarkers of Eating Disorders Using Support Vector Machine Analysis of Structural Neuroimaging Data: Preliminary Results

    PubMed Central

    Cerasa, Antonio; Castiglioni, Isabella; Salvatore, Christian; Funaro, Angela; Martino, Iolanda; Alfano, Stefania; Donzuso, Giulia; Perrotta, Paolo; Gioia, Maria Cecilia; Gilardi, Maria Carla; Quattrone, Aldo

    2015-01-01

    Presently, there are no valid biomarkers to identify individuals with eating disorders (ED). The aim of this work was to assess the feasibility of a machine learning method for extracting reliable neuroimaging features allowing individual categorization of patients with ED. Support Vector Machine (SVM) technique, combined with a pattern recognition method, was employed utilizing structural magnetic resonance images. Seventeen females with ED (six with diagnosis of anorexia nervosa and 11 with bulimia nervosa) were compared against 17 body mass index-matched healthy controls (HC). Machine learning allowed individual diagnosis of ED versus HC with an Accuracy ≥ 0.80. Voxel-based pattern recognition analysis demonstrated that voxels influencing the classification Accuracy involved the occipital cortex, the posterior cerebellar lobule, precuneus, sensorimotor/premotor cortices, and the medial prefrontal cortex, all critical regions known to be strongly involved in the pathophysiological mechanisms of ED. Although these findings should be considered preliminary given the small size investigated, SVM analysis highlights the role of well-known brain regions as possible biomarkers to distinguish ED from HC at an individual level, thus encouraging the translational implementation of this new multivariate approach in the clinical practice. PMID:26648660

  6. Potential of cancer screening with serum surface-enhanced Raman spectroscopy and a support vector machine

    NASA Astrophysics Data System (ADS)

    Li, S. X.; Zhang, Y. J.; Zeng, Q. Y.; Li, L. F.; Guo, Z. Y.; Liu, Z. M.; Xiong, H. L.; Liu, S. H.

    2014-06-01

    Cancer is the most common disease to threaten human health. The ability to screen individuals with malignant tumours with only a blood sample would be greatly advantageous to early diagnosis and intervention. This study explores the possibility of discriminating between cancer patients and normal subjects with serum surface-enhanced Raman spectroscopy (SERS) and a support vector machine (SVM) through a peripheral blood sample. A total of 130 blood samples were obtained from patients with liver cancer, colonic cancer, esophageal cancer, nasopharyngeal cancer, gastric cancer, as well as 113 blood samples from normal volunteers. Several diagnostic models were built with the serum SERS spectra using SVM and principal component analysis (PCA) techniques. The results show that a diagnostic accuracy of 85.5% is acquired with a PCA algorithm, while a diagnostic accuracy of 95.8% is obtained using radial basis function (RBF), PCA-SVM methods. The results prove that a RBF kernel PCA-SVM technique is superior to PCA and conventional SVM (C-SVM) algorithms in classification serum SERS spectra. The study demonstrates that serum SERS, in combination with SVM techniques, has great potential for screening cancerous patients with any solid malignant tumour through a peripheral blood sample.

  7. A Bayesian least squares support vector machines based framework for fault diagnosis and failure prognosis

    NASA Astrophysics Data System (ADS)

    Khawaja, Taimoor Saleem

    A high-belief low-overhead Prognostics and Health Management (PHM) system is desired for online real-time monitoring of complex non-linear systems operating in a complex (possibly non-Gaussian) noise environment. This thesis presents a Bayesian Least Squares Support Vector Machine (LS-SVM) based framework for fault diagnosis and failure prognosis in nonlinear non-Gaussian systems. The methodology assumes the availability of real-time process measurements, definition of a set of fault indicators and the existence of empirical knowledge (or historical data) to characterize both nominal and abnormal operating conditions. An efficient yet powerful Least Squares Support Vector Machine (LS-SVM) algorithm, set within a Bayesian Inference framework, not only allows for the development of real-time algorithms for diagnosis and prognosis but also provides a solid theoretical framework to address key concepts related to classification for diagnosis and regression modeling for prognosis. SVM machines are founded on the principle of Structural Risk Minimization (SRM) which tends to find a good trade-off between low empirical risk and small capacity. The key features in SVM are the use of non-linear kernels, the absence of local minima, the sparseness of the solution and the capacity control obtained by optimizing the margin. The Bayesian Inference framework linked with LS-SVMs allows a probabilistic interpretation of the results for diagnosis and prognosis. Additional levels of inference provide the much coveted features of adaptability and tunability of the modeling parameters. The two main modules considered in this research are fault diagnosis and failure prognosis. With the goal of designing an efficient and reliable fault diagnosis scheme, a novel Anomaly Detector is suggested based on the LS-SVM machines. The proposed scheme uses only baseline data to construct a 1-class LS-SVM machine which, when presented with online data is able to distinguish between normal behavior

  8. Testing the Potential of Vegetation Indices for Land Use/cover Classification Using High Resolution Data

    NASA Astrophysics Data System (ADS)

    Karakacan Kuzucu, A.; Bektas Balcik, F.

    2017-11-01

    Accurate and reliable land use/land cover (LULC) information obtained by remote sensing technology is necessary in many applications such as environmental monitoring, agricultural management, urban planning, hydrological applications, soil management, vegetation condition study and suitability analysis. But this information still remains a challenge especially in heterogeneous landscapes covering urban and rural areas due to spectrally similar LULC features. In parallel with technological developments, supplementary data such as satellite-derived spectral indices have begun to be used as additional bands in classification to produce data with high accuracy. The aim of this research is to test the potential of spectral vegetation indices combination with supervised classification methods and to extract reliable LULC information from SPOT 7 multispectral imagery. The Normalized Difference Vegetation Index (NDVI), the Ratio Vegetation Index (RATIO), the Soil Adjusted Vegetation Index (SAVI) were the three vegetation indices used in this study. The classical maximum likelihood classifier (MLC) and support vector machine (SVM) algorithm were applied to classify SPOT 7 image. Catalca is selected region located in the north west of the Istanbul in Turkey, which has complex landscape covering artificial surface, forest and natural area, agricultural field, quarry/mining area, pasture/scrubland and water body. Accuracy assessment of all classified images was performed through overall accuracy and kappa coefficient. The results indicated that the incorporation of these three different vegetation indices decrease the classification accuracy for the MLC and SVM classification. In addition, the maximum likelihood classification slightly outperformed the support vector machine classification approach in both overall accuracy and kappa statistics.

  9. The identification of high potential archers based on fitness and motor ability variables: A Support Vector Machine approach.

    PubMed

    Taha, Zahari; Musa, Rabiu Muazu; P P Abdul Majeed, Anwar; Alim, Muhammad Muaz; Abdullah, Mohamad Razali

    2018-02-01

    Support Vector Machine (SVM) has been shown to be an effective learning algorithm for classification and prediction. However, the application of SVM for prediction and classification in specific sport has rarely been used to quantify/discriminate low and high-performance athletes. The present study classified and predicted high and low-potential archers from a set of fitness and motor ability variables trained on different SVMs kernel algorithms. 50 youth archers with the mean age and standard deviation of 17.0 ± 0.6 years drawn from various archery programmes completed a six arrows shooting score test. Standard fitness and ability measurements namely hand grip, vertical jump, standing broad jump, static balance, upper muscle strength and the core muscle strength were also recorded. Hierarchical agglomerative cluster analysis (HACA) was used to cluster the archers based on the performance variables tested. SVM models with linear, quadratic, cubic, fine RBF, medium RBF, as well as the coarse RBF kernel functions, were trained based on the measured performance variables. The HACA clustered the archers into high-potential archers (HPA) and low-potential archers (LPA), respectively. The linear, quadratic, cubic, as well as the medium RBF kernel functions models, demonstrated reasonably excellent classification accuracy of 97.5% and 2.5% error rate for the prediction of the HPA and the LPA. The findings of this investigation can be valuable to coaches and sports managers to recognise high potential athletes from a combination of the selected few measured fitness and motor ability performance variables examined which would consequently save cost, time and effort during talent identification programme. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Insect cell transformation vectors that support high level expression and promoter assessment in insect cell culture

    USDA-ARS?s Scientific Manuscript database

    A somatic transformation vector, pDP9, was constructed that provides a simplified means of producing permanently transformed cultured insect cells that support high levels of protein expression of foreign genes. The pDP9 plasmid vector incorporates DNA sequences from the Junonia coenia densovirus th...

  11. Improving urban land use and land cover classification from high-spatial-resolution hyperspectral imagery using contextual information

    USDA-ARS?s Scientific Manuscript database

    In this paper, we propose approaches to improve the pixel-based support vector machine (SVM) classification for urban land use and land cover (LULC) mapping from airborne hyperspectral imagery with high spatial resolution. Class spatial neighborhood relationship is used to correct the misclassified ...

  12. CARSVM: a class association rule-based classification framework and its application to gene expression data.

    PubMed

    Kianmehr, Keivan; Alhajj, Reda

    2008-09-01

    In this study, we aim at building a classification framework, namely the CARSVM model, which integrates association rule mining and support vector machine (SVM). The goal is to benefit from advantages of both, the discriminative knowledge represented by class association rules and the classification power of the SVM algorithm, to construct an efficient and accurate classifier model that improves the interpretability problem of SVM as a traditional machine learning technique and overcomes the efficiency issues of associative classification algorithms. In our proposed framework: instead of using the original training set, a set of rule-based feature vectors, which are generated based on the discriminative ability of class association rules over the training samples, are presented to the learning component of the SVM algorithm. We show that rule-based feature vectors present a high-qualified source of discrimination knowledge that can impact substantially the prediction power of SVM and associative classification techniques. They provide users with more conveniences in terms of understandability and interpretability as well. We have used four datasets from UCI ML repository to evaluate the performance of the developed system in comparison with five well-known existing classification methods. Because of the importance and popularity of gene expression analysis as real world application of the classification model, we present an extension of CARSVM combined with feature selection to be applied to gene expression data. Then, we describe how this combination will provide biologists with an efficient and understandable classifier model. The reported test results and their biological interpretation demonstrate the applicability, efficiency and effectiveness of the proposed model. From the results, it can be concluded that a considerable increase in classification accuracy can be obtained when the rule-based feature vectors are integrated in the learning process of the SVM

  13. NOTE: Fluoroscopic gating without implanted fiducial markers for lung cancer radiotherapy based on support vector machines

    NASA Astrophysics Data System (ADS)

    Cui, Ying; Dy, Jennifer G.; Alexander, Brian; Jiang, Steve B.

    2008-08-01

    Various problems with the current state-of-the-art techniques for gated radiotherapy have prevented this new treatment modality from being widely implemented in clinical routine. These problems are caused mainly by applying various external respiratory surrogates. There might be large uncertainties in deriving the tumor position from external respiratory surrogates. While tracking implanted fiducial markers has sufficient accuracy, this procedure may not be widely accepted due to the risk of pneumothorax. Previously, we have developed a technique to generate gating signals from fluoroscopic images without implanted fiducial markers using template matching methods (Berbeco et al 2005 Phys. Med. Biol. 50 4481-90, Cui et al 2007b Phys. Med. Biol. 52 741-55). In this note, our main contribution is to provide a totally different new view of the gating problem by recasting it as a classification problem. Then, we solve this classification problem by a well-studied powerful classification method called a support vector machine (SVM). Note that the goal of an automated gating tool is to decide when to turn the beam ON or OFF. We treat ON and OFF as the two classes in our classification problem. We create our labeled training data during the patient setup session by utilizing the reference gating signal, manually determined by a radiation oncologist. We then pre-process these labeled training images and build our SVM prediction model. During treatment delivery, fluoroscopic images are continuously acquired, pre-processed and sent as an input to the SVM. Finally, our SVM model will output the predicted labels as gating signals. We test the proposed technique on five sequences of fluoroscopic images from five lung cancer patients against the reference gating signal as ground truth. We compare the performance of the SVM to our previous template matching method (Cui et al 2007b Phys. Med. Biol. 52 741-55). We find that the SVM is slightly more accurate on average (1-3%) than

  14. The classification of the patients with pulmonary diseases using breath air samples spectral analysis

    NASA Astrophysics Data System (ADS)

    Kistenev, Yury V.; Borisov, Alexey V.; Kuzmin, Dmitry A.; Bulanova, Anna A.

    2016-08-01

    Technique of exhaled breath sampling is discussed. The procedure of wavelength auto-calibration is proposed and tested. Comparison of the experimental data with the model absorption spectra of 5% CO2 is conducted. The classification results of three study groups obtained by using support vector machine and principal component analysis methods are presented.

  15. Scorebox extraction from mobile sports videos using Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Kim, Wonjun; Park, Jimin; Kim, Changick

    2008-08-01

    Scorebox plays an important role in understanding contents of sports videos. However, the tiny scorebox may give the small-display-viewers uncomfortable experience in grasping the game situation. In this paper, we propose a novel framework to extract the scorebox from sports video frames. We first extract candidates by using accumulated intensity and edge information after short learning period. Since there are various types of scoreboxes inserted in sports videos, multiple attributes need to be used for efficient extraction. Based on those attributes, the optimal information gain is computed and top three ranked attributes in terms of information gain are selected as a three-dimensional feature vector for Support Vector Machines (SVM) to distinguish the scorebox from other candidates, such as logos and advertisement boards. The proposed method is tested on various videos of sports games and experimental results show the efficiency and robustness of our proposed method.

  16. Classification of the Regional Ionospheric Disturbance Based on Machine Learning Techniques

    NASA Astrophysics Data System (ADS)

    Terzi, Merve Begum; Arikan, Orhan; Karatay, Secil; Arikan, Feza; Gulyaeva, Tamara

    2016-08-01

    In this study, Total Electron Content (TEC) estimated from GPS receivers is used to model the regional and local variability that differs from global activity along with solar and geomagnetic indices. For the automated classification of regional disturbances, a classification technique based on a robust machine learning technique that have found wide spread use, Support Vector Machine (SVM) is proposed. Performance of developed classification technique is demonstrated for midlatitude ionosphere over Anatolia using TEC estimates generated from GPS data provided by Turkish National Permanent GPS Network (TNPGN-Active) for solar maximum year of 2011. As a result of implementing developed classification technique to Global Ionospheric Map (GIM) TEC data, which is provided by the NASA Jet Propulsion Laboratory (JPL), it is shown that SVM can be a suitable learning method to detect anomalies in TEC variations.

  17. Intelligent Design of Metal Oxide Gas Sensor Arrays Using Reciprocal Kernel Support Vector Regression

    NASA Astrophysics Data System (ADS)

    Dougherty, Andrew W.

    Metal oxides are a staple of the sensor industry. The combination of their sensitivity to a number of gases, and the electrical nature of their sensing mechanism, make the particularly attractive in solid state devices. The high temperature stability of the ceramic material also make them ideal for detecting combustion byproducts where exhaust temperatures can be high. However, problems do exist with metal oxide sensors. They are not very selective as they all tend to be sensitive to a number of reduction and oxidation reactions on the oxide's surface. This makes sensors with large numbers of sensors interesting to study as a method for introducing orthogonality to the system. Also, the sensors tend to suffer from long term drift for a number of reasons. In this thesis I will develop a system for intelligently modeling metal oxide sensors and determining their suitability for use in large arrays designed to analyze exhaust gas streams. It will introduce prior knowledge of the metal oxide sensors' response mechanisms in order to produce a response function for each sensor from sparse training data. The system will use the same technique to model and remove any long term drift from the sensor response. It will also provide an efficient means for determining the orthogonality of the sensor to determine whether they are useful in gas sensing arrays. The system is based on least squares support vector regression using the reciprocal kernel. The reciprocal kernel is introduced along with a method of optimizing the free parameters of the reciprocal kernel support vector machine. The reciprocal kernel is shown to be simpler and to perform better than an earlier kernel, the modified reciprocal kernel. Least squares support vector regression is chosen as it uses all of the training points and an emphasis was placed throughout this research for extracting the maximum information from very sparse data. The reciprocal kernel is shown to be effective in modeling the sensor

  18. Highly Accurate Classification of Watson-Crick Basepairs on Termini of Single DNA Molecules

    PubMed Central

    Winters-Hilt, Stephen; Vercoutere, Wenonah; DeGuzman, Veronica S.; Deamer, David; Akeson, Mark; Haussler, David

    2003-01-01

    We introduce a computational method for classification of individual DNA molecules measured by an α-hemolysin channel detector. We show classification with better than 99% accuracy for DNA hairpin molecules that differ only in their terminal Watson-Crick basepairs. Signal classification was done in silico to establish performance metrics (i.e., where train and test data were of known type, via single-species data files). It was then performed in solution to assay real mixtures of DNA hairpins. Hidden Markov Models (HMMs) were used with Expectation/Maximization for denoising and for associating a feature vector with the ionic current blockade of the DNA molecule. Support Vector Machines (SVMs) were used as discriminators, and were the focus of off-line training. A multiclass SVM architecture was designed to place less discriminatory load on weaker discriminators, and novel SVM kernels were used to boost discrimination strength. The tuning on HMMs and SVMs enabled biophysical analysis of the captured molecule states and state transitions; structure revealed in the biophysical analysis was used for better feature selection. PMID:12547778

  19. Support vector machines to detect physiological patterns for EEG and EMG-based human-computer interaction: a review

    NASA Astrophysics Data System (ADS)

    Quitadamo, L. R.; Cavrini, F.; Sbernini, L.; Riillo, F.; Bianchi, L.; Seri, S.; Saggio, G.

    2017-02-01

    Support vector machines (SVMs) are widely used classifiers for detecting physiological patterns in human-computer interaction (HCI). Their success is due to their versatility, robustness and large availability of free dedicated toolboxes. Frequently in the literature, insufficient details about the SVM implementation and/or parameters selection are reported, making it impossible to reproduce study analysis and results. In order to perform an optimized classification and report a proper description of the results, it is necessary to have a comprehensive critical overview of the applications of SVM. The aim of this paper is to provide a review of the usage of SVM in the determination of brain and muscle patterns for HCI, by focusing on electroencephalography (EEG) and electromyography (EMG) techniques. In particular, an overview of the basic principles of SVM theory is outlined, together with a description of several relevant literature implementations. Furthermore, details concerning reviewed papers are listed in tables and statistics of SVM use in the literature are presented. Suitability of SVM for HCI is discussed and critical comparisons with other classifiers are reported.

  20. Prediction of cause of death from forensic autopsy reports using text classification techniques: A comparative study.

    PubMed

    Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa

    2018-07-01

    Automatic text classification techniques are useful for classifying plaintext medical documents. This study aims to automatically predict the cause of death from free text forensic autopsy reports by comparing various schemes for feature extraction, term weighing or feature value representation, text classification, and feature reduction. For experiments, the autopsy reports belonging to eight different causes of death were collected, preprocessed and converted into 43 master feature vectors using various schemes for feature extraction, representation, and reduction. The six different text classification techniques were applied on these 43 master feature vectors to construct a classification model that can predict the cause of death. Finally, classification model performance was evaluated using four performance measures i.e. overall accuracy, macro precision, macro-F-measure, and macro recall. From experiments, it was found that that unigram features obtained the highest performance compared to bigram, trigram, and hybrid-gram features. Furthermore, in feature representation schemes, term frequency, and term frequency with inverse document frequency obtained similar and better results when compared with binary frequency, and normalized term frequency with inverse document frequency. Furthermore, the chi-square feature reduction approach outperformed Pearson correlation, and information gain approaches. Finally, in text classification algorithms, support vector machine classifier outperforms random forest, Naive Bayes, k-nearest neighbor, decision tree, and ensemble-voted classifier. Our results and comparisons hold practical importance and serve as references for future works. Moreover, the comparison outputs will act as state-of-art techniques to compare future proposals with existing automated text classification techniques. Copyright © 2017 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.

  1. Ontology for Vector Surveillance and Management

    PubMed Central

    LOZANO-FUENTES, SAUL; BANDYOPADHYAY, ARITRA; COWELL, LINDSAY G.; GOLDFAIN, ALBERT; EISEN, LARS

    2013-01-01

    Ontologies, which are made up by standardized and defined controlled vocabulary terms and their interrelationships, are comprehensive and readily searchable repositories for knowledge in a given domain. The Open Biomedical Ontologies (OBO) Foundry was initiated in 2001 with the aims of becoming an “umbrella” for life-science ontologies and promoting the use of ontology development best practices. A software application (OBO-Edit; *.obo file format) was developed to facilitate ontology development and editing. The OBO Foundry now comprises over 100 ontologies and candidate ontologies, including the NCBI organismal classification ontology (NCBITaxon), the Mosquito Insecticide Resistance Ontology (MIRO), the Infectious Disease Ontology (IDO), the IDOMAL malaria ontology, and ontologies for mosquito gross anatomy and tick gross anatomy. We previously developed a disease data management system for dengue and malaria control programs, which incorporated a set of information trees built upon ontological principles, including a “term tree” to promote the use of standardized terms. In the course of doing so, we realized that there were substantial gaps in existing ontologies with regards to concepts, processes, and, especially, physical entities (e.g., vector species, pathogen species, and vector surveillance and management equipment) in the domain of surveillance and management of vectors and vector-borne pathogens. We therefore produced an ontology for vector surveillance and management, focusing on arthropod vectors and vector-borne pathogens with relevance to humans or domestic animals, and with special emphasis on content to support operational activities through inclusion in databases, data management systems, or decision support systems. The Vector Surveillance and Management Ontology (VSMO) includes >2,200 unique terms, of which the vast majority (>80%) were newly generated during the development of this ontology. One core feature of the VSMO is the linkage

  2. Ontology for vector surveillance and management.

    PubMed

    Lozano-Fuentes, Saul; Bandyopadhyay, Aritra; Cowell, Lindsay G; Goldfain, Albert; Eisen, Lars

    2013-01-01

    Ontologies, which are made up by standardized and defined controlled vocabulary terms and their interrelationships, are comprehensive and readily searchable repositories for knowledge in a given domain. The Open Biomedical Ontologies (OBO) Foundry was initiated in 2001 with the aims of becoming an "umbrella" for life-science ontologies and promoting the use of ontology development best practices. A software application (OBO-Edit; *.obo file format) was developed to facilitate ontology development and editing. The OBO Foundry now comprises over 100 ontologies and candidate ontologies, including the NCBI organismal classification ontology (NCBITaxon), the Mosquito Insecticide Resistance Ontology (MIRO), the Infectious Disease Ontology (IDO), the IDOMAL malaria ontology, and ontologies for mosquito gross anatomy and tick gross anatomy. We previously developed a disease data management system for dengue and malaria control programs, which incorporated a set of information trees built upon ontological principles, including a "term tree" to promote the use of standardized terms. In the course of doing so, we realized that there were substantial gaps in existing ontologies with regards to concepts, processes, and, especially, physical entities (e.g., vector species, pathogen species, and vector surveillance and management equipment) in the domain of surveillance and management of vectors and vector-borne pathogens. We therefore produced an ontology for vector surveillance and management, focusing on arthropod vectors and vector-borne pathogens with relevance to humans or domestic animals, and with special emphasis on content to support operational activities through inclusion in databases, data management systems, or decision support systems. The Vector Surveillance and Management Ontology (VSMO) includes >2,200 unique terms, of which the vast majority (>80%) were newly generated during the development of this ontology. One core feature of the VSMO is the linkage, through

  3. Diagnostic Classification of Schizophrenia Patients on the Basis of Regional Reward-Related fMRI Signal Patterns

    PubMed Central

    Koch, Stefan P.; Hägele, Claudia; Haynes, John-Dylan; Heinz, Andreas; Schlagenhauf, Florian; Sterzer, Philipp

    2015-01-01

    Functional neuroimaging has provided evidence for altered function of mesolimbic circuits implicated in reward processing, first and foremost the ventral striatum, in patients with schizophrenia. While such findings based on significant group differences in brain activations can provide important insights into the pathomechanisms of mental disorders, the use of neuroimaging results from standard univariate statistical analysis for individual diagnosis has proven difficult. In this proof of concept study, we tested whether the predictive accuracy for the diagnostic classification of schizophrenia patients vs. healthy controls could be improved using multivariate pattern analysis (MVPA) of regional functional magnetic resonance imaging (fMRI) activation patterns for the anticipation of monetary reward. With a searchlight MVPA approach using support vector machine classification, we found that the diagnostic category could be predicted from local activation patterns in frontal, temporal, occipital and midbrain regions, with a maximal cluster peak classification accuracy of 93% for the right pallidum. Region-of-interest based MVPA for the ventral striatum achieved a maximal cluster peak accuracy of 88%, whereas the classification accuracy on the basis of standard univariate analysis reached only 75%. Moreover, using support vector regression we could additionally predict the severity of negative symptoms from ventral striatal activation patterns. These results show that MVPA can be used to substantially increase the accuracy of diagnostic classification on the basis of task-related fMRI signal patterns in a regionally specific way. PMID:25799236

  4. Prediction of hourly PM2.5 using a space-time support vector regression model

    NASA Astrophysics Data System (ADS)

    Yang, Wentao; Deng, Min; Xu, Feng; Wang, Hang

    2018-05-01

    Real-time air quality prediction has been an active field of research in atmospheric environmental science. The existing methods of machine learning are widely used to predict pollutant concentrations because of their enhanced ability to handle complex non-linear relationships. However, because pollutant concentration data, as typical geospatial data, also exhibit spatial heterogeneity and spatial dependence, they may violate the assumptions of independent and identically distributed random variables in most of the machine learning methods. As a result, a space-time support vector regression model is proposed to predict hourly PM2.5 concentrations. First, to address spatial heterogeneity, spatial clustering is executed to divide the study area into several homogeneous or quasi-homogeneous subareas. To handle spatial dependence, a Gauss vector weight function is then developed to determine spatial autocorrelation variables as part of the input features. Finally, a local support vector regression model with spatial autocorrelation variables is established for each subarea. Experimental data on PM2.5 concentrations in Beijing are used to verify whether the results of the proposed model are superior to those of other methods.

  5. Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.

    PubMed

    Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong

    2018-05-24

    This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.

  6. Automatic migraine classification via feature selection committee and machine learning techniques over imaging and questionnaire data.

    PubMed

    Garcia-Chimeno, Yolanda; Garcia-Zapirain, Begonya; Gomez-Beldarrain, Marian; Fernandez-Ruanova, Begonya; Garcia-Monco, Juan Carlos

    2017-04-13

    Feature selection methods are commonly used to identify subsets of relevant features to facilitate the construction of models for classification, yet little is known about how feature selection methods perform in diffusion tensor images (DTIs). In this study, feature selection and machine learning classification methods were tested for the purpose of automating diagnosis of migraines using both DTIs and questionnaire answers related to emotion and cognition - factors that influence of pain perceptions. We select 52 adult subjects for the study divided into three groups: control group (15), subjects with sporadic migraine (19) and subjects with chronic migraine and medication overuse (18). These subjects underwent magnetic resonance with diffusion tensor to see white matter pathway integrity of the regions of interest involved in pain and emotion. The tests also gather data about pathology. The DTI images and test results were then introduced into feature selection algorithms (Gradient Tree Boosting, L1-based, Random Forest and Univariate) to reduce features of the first dataset and classification algorithms (SVM (Support Vector Machine), Boosting (Adaboost) and Naive Bayes) to perform a classification of migraine group. Moreover we implement a committee method to improve the classification accuracy based on feature selection algorithms. When classifying the migraine group, the greatest improvements in accuracy were made using the proposed committee-based feature selection method. Using this approach, the accuracy of classification into three types improved from 67 to 93% when using the Naive Bayes classifier, from 90 to 95% with the support vector machine classifier, 93 to 94% in boosting. The features that were determined to be most useful for classification included are related with the pain, analgesics and left uncinate brain (connected with the pain and emotions). The proposed feature selection committee method improved the performance of migraine diagnosis

  7. The Automation System Censor Speech for the Indonesian Rude Swear Words Based on Support Vector Machine and Pitch Analysis

    NASA Astrophysics Data System (ADS)

    Endah, S. N.; Nugraheni, D. M. K.; Adhy, S.; Sutikno

    2017-04-01

    According to Law No. 32 of 2002 and the Indonesian Broadcasting Commission Regulation No. 02/P/KPI/12/2009 & No. 03/P/KPI/12/2009, stated that broadcast programs should not scold with harsh words, not harass, insult or demean minorities and marginalized groups. However, there are no suitable tools to censor those words automatically. Therefore, researches to develop a system of intelligent software to censor the words automatically are needed. To conduct censor, the system must be able to recognize the words in question. This research proposes the classification of speech divide into two classes using Support Vector Machine (SVM), first class is set of rude words and the second class is set of properly words. The speech pitch values as an input in SVM, it used for the development of the system for the Indonesian rude swear word. The results of the experiment show that SVM is good for this system.

  8. Radar target classification method with high accuracy and decision speed performance using MUSIC spectrum vectors and PCA projection

    NASA Astrophysics Data System (ADS)

    Secmen, Mustafa

    2011-10-01

    This paper introduces the performance of an electromagnetic target recognition method in resonance scattering region, which includes pseudo spectrum Multiple Signal Classification (MUSIC) algorithm and principal component analysis (PCA) technique. The aim of this method is to classify an "unknown" target as one of the "known" targets in an aspect-independent manner. The suggested method initially collects the late-time portion of noise-free time-scattered signals obtained from different reference aspect angles of known targets. Afterward, these signals are used to obtain MUSIC spectrums in real frequency domain having super-resolution ability and noise resistant feature. In the final step, PCA technique is applied to these spectrums in order to reduce dimensionality and obtain only one feature vector per known target. In the decision stage, noise-free or noisy scattered signal of an unknown (test) target from an unknown aspect angle is initially obtained. Subsequently, MUSIC algorithm is processed for this test signal and resulting test vector is compared with feature vectors of known targets one by one. Finally, the highest correlation gives the type of test target. The method is applied to wire models of airplane targets, and it is shown that it can tolerate considerable noise levels although it has a few different reference aspect angles. Besides, the runtime of the method for a test target is sufficiently low, which makes the method suitable for real-time applications.

  9. Identification of Migratory Insects from their Physical Features using a Decision-Tree Support Vector Machine and its Application to Radar Entomology.

    PubMed

    Hu, Cheng; Kong, Shaoyang; Wang, Rui; Long, Teng; Fu, Xiaowei

    2018-04-03

    Migration is a key process in the population dynamics of numerous insect species, including many that are pests or vectors of disease. Identification of insect migrants is critically important to studies of insect migration. Radar is an effective means of monitoring nocturnal insect migrants. However, species identification of migrating insects is often unachievable with current radar technology. Special-purpose entomological radar can measure radar cross-sections (RCSs) from which the insect mass, wingbeat frequency and body length-to-width ratio (a measure of morphological form) can be estimated. These features may be valuable for species identification. This paper explores the identification of insect migrants based on the mass, wingbeat frequency and length-to-width ratio, and body length is also introduced to assess the benefit of adding another variable. A total of 23 species of migratory insects captured by a searchlight trap are used to develop a classification model based on decision-tree support vector machine method. The results reveal that the identification accuracy exceeds 80% for all species if the mass, wingbeat frequency and length-to-width ratio are utilized, and the addition of body length is shown to further increase accuracy. It is also shown that improving the precision of the measurements leads to increased identification accuracy.

  10. Extraction and classification of 3D objects from volumetric CT data

    NASA Astrophysics Data System (ADS)

    Song, Samuel M.; Kwon, Junghyun; Ely, Austin; Enyeart, John; Johnson, Chad; Lee, Jongkyu; Kim, Namho; Boyd, Douglas P.

    2016-05-01

    We propose an Automatic Threat Detection (ATD) algorithm for Explosive Detection System (EDS) using our multistage Segmentation Carving (SC) followed by Support Vector Machine (SVM) classifier. The multi-stage Segmentation and Carving (SC) step extracts all suspect 3-D objects. The feature vector is then constructed for all extracted objects and the feature vector is classified by the Support Vector Machine (SVM) previously learned using a set of ground truth threat and benign objects. The learned SVM classifier has shown to be effective in classification of different types of threat materials. The proposed ATD algorithm robustly deals with CT data that are prone to artifacts due to scatter, beam hardening as well as other systematic idiosyncrasies of the CT data. Furthermore, the proposed ATD algorithm is amenable for including newly emerging threat materials as well as for accommodating data from newly developing sensor technologies. Efficacy of the proposed ATD algorithm with the SVM classifier is demonstrated by the Receiver Operating Characteristics (ROC) curve that relates Probability of Detection (PD) as a function of Probability of False Alarm (PFA). The tests performed using CT data of passenger bags shows excellent performance characteristics.

  11. Vector quantizer designs for joint compression and terrain categorization of multispectral imagery

    NASA Technical Reports Server (NTRS)

    Gorman, John D.; Lyons, Daniel F.

    1994-01-01

    Two vector quantizer designs for compression of multispectral imagery and their impact on terrain categorization performance are evaluated. The mean-squared error (MSE) and classification performance of the two quantizers are compared, and it is shown that a simple two-stage design minimizing MSE subject to a constraint on classification performance has a significantly better classification performance than a standard MSE-based tree-structured vector quantizer followed by maximum likelihood classification. This improvement in classification performance is obtained with minimal loss in MSE performance. The results show that it is advantageous to tailor compression algorithm designs to the required data exploitation tasks. Applications of joint compression/classification include compression for the archival or transmission of Landsat imagery that is later used for land utility surveys and/or radiometric analysis.

  12. T-wave end detection using neural networks and Support Vector Machines.

    PubMed

    Suárez-León, Alexander Alexeis; Varon, Carolina; Willems, Rik; Van Huffel, Sabine; Vázquez-Seisdedos, Carlos Román

    2018-05-01

    In this paper we propose a new approach for detecting the end of the T-wave in the electrocardiogram (ECG) using Neural Networks and Support Vector Machines. Both, Multilayer Perceptron (MLP) neural networks and Fixed-Size Least-Squares Support Vector Machines (FS-LSSVM) were used as regression algorithms to determine the end of the T-wave. Different strategies for selecting the training set such as random selection, k-means, robust clustering and maximum quadratic (Rényi) entropy were evaluated. Individual parameters were tuned for each method during training and the results are given for the evaluation set. A comparison between MLP and FS-LSSVM approaches was performed. Finally, a fair comparison of the FS-LSSVM method with other state-of-the-art algorithms for detecting the end of the T-wave was included. The experimental results show that FS-LSSVM approaches are more suitable as regression algorithms than MLP neural networks. Despite the small training sets used, the FS-LSSVM methods outperformed the state-of-the-art techniques. FS-LSSVM can be successfully used as a T-wave end detection algorithm in ECG even with small training set sizes. Copyright © 2018 Elsevier Ltd. All rights reserved.

  13. Automated novelty detection in the WISE survey with one-class support vector machines

    NASA Astrophysics Data System (ADS)

    Solarz, A.; Bilicki, M.; Gromadzki, M.; Pollo, A.; Durkalec, A.; Wypych, M.

    2017-10-01

    Wide-angle photometric surveys of previously uncharted sky areas or wavelength regimes will always bring in unexpected sources - novelties or even anomalies - whose existence and properties cannot be easily predicted from earlier observations. Such objects can be efficiently located with novelty detection algorithms. Here we present an application of such a method, called one-class support vector machines (OCSVM), to search for anomalous patterns among sources preselected from the mid-infrared AllWISE catalogue covering the whole sky. To create a model of expected data we train the algorithm on a set of objects with spectroscopic identifications from the SDSS DR13 database, present also in AllWISE. The OCSVM method detects as anomalous those sources whose patterns - WISE photometric measurements in this case - are inconsistent with the model. Among the detected anomalies we find artefacts, such as objects with spurious photometry due to blending, but more importantly also real sources of genuine astrophysical interest. Among the latter, OCSVM has identified a sample of heavily reddened AGN/quasar candidates distributed uniformly over the sky and in a large part absent from other WISE-based AGN catalogues. It also allowed us to find a specific group of sources of mixed types, mostly stars and compact galaxies. By combining the semi-supervised OCSVM algorithm with standard classification methods it will be possible to improve the latter by accounting for sources which are not present in the training sample, but are otherwise well-represented in the target set. Anomaly detection adds flexibility to automated source separation procedures and helps verify the reliability and representativeness of the training samples. It should be thus considered as an essential step in supervised classification schemes to ensure completeness and purity of produced catalogues. The catalogues of outlier data are only available at the CDS via anonymous ftp to http

  14. New decision support tool for acute lymphoblastic leukemia classification

    NASA Astrophysics Data System (ADS)

    Madhukar, Monica; Agaian, Sos; Chronopoulos, Anthony T.

    2012-03-01

    In this paper, we build up a new decision support tool to improve treatment intensity choice in childhood ALL. The developed system includes different methods to accurately measure furthermore cell properties in microscope blood film images. The blood images are exposed to series of pre-processing steps which include color correlation, and contrast enhancement. By performing K-means clustering on the resultant images, the nuclei of the cells under consideration are obtained. Shape features and texture features are then extracted for classification. The system is further tested on the classification of spectra measured from the cell nuclei in blood samples in order to distinguish normal cells from those affected by Acute Lymphoblastic Leukemia. The results show that the proposed system robustly segments and classifies acute lymphoblastic leukemia based on complete microscopic blood images.

  15. Gas chimney detection based on improving the performance of combined multilayer perceptron and support vector classifier

    NASA Astrophysics Data System (ADS)

    Hashemi, H.; Tax, D. M. J.; Duin, R. P. W.; Javaherian, A.; de Groot, P.

    2008-11-01

    Seismic object detection is a relatively new field in which 3-D bodies are visualized and spatial relationships between objects of different origins are studied in order to extract geologic information. In this paper, we propose a method for finding an optimal classifier with the help of a statistical feature ranking technique and combining different classifiers. The method, which has general applicability, is demonstrated here on a gas chimney detection problem. First, we evaluate a set of input seismic attributes extracted at locations labeled by a human expert using regularized discriminant analysis (RDA). In order to find the RDA score for each seismic attribute, forward and backward search strategies are used. Subsequently, two non-linear classifiers: multilayer perceptron (MLP) and support vector classifier (SVC) are run on the ranked seismic attributes. Finally, to capitalize on the intrinsic differences between both classifiers, the MLP and SVC results are combined using logical rules of maximum, minimum and mean. The proposed method optimizes the ranked feature space size and yields the lowest classification error in the final combined result. We will show that the logical minimum reveals gas chimneys that exhibit both the softness of MLP and the resolution of SVC classifiers.

  16. PSOFuzzySVM-TMH: identification of transmembrane helix segments using ensemble feature space by incorporated fuzzy support vector machine.

    PubMed

    Hayat, Maqsood; Tahir, Muhammad

    2015-08-01

    Membrane protein is a central component of the cell that manages intra and extracellular processes. Membrane proteins execute a diversity of functions that are vital for the survival of organisms. The topology of transmembrane proteins describes the number of transmembrane (TM) helix segments and its orientation. However, owing to the lack of its recognized structures, the identification of TM helix and its topology through experimental methods is laborious with low throughput. In order to identify TM helix segments reliably, accurately, and effectively from topogenic sequences, we propose the PSOFuzzySVM-TMH model. In this model, evolutionary based information position specific scoring matrix and discrete based information 6-letter exchange group are used to formulate transmembrane protein sequences. The noisy and extraneous attributes are eradicated using an optimization selection technique, particle swarm optimization, from both feature spaces. Finally, the selected feature spaces are combined in order to form ensemble feature space. Fuzzy-support vector Machine is utilized as a classification algorithm. Two benchmark datasets, including low and high resolution datasets, are used. At various levels, the performance of the PSOFuzzySVM-TMH model is assessed through 10-fold cross validation test. The empirical results reveal that the proposed framework PSOFuzzySVM-TMH outperforms in terms of classification performance in the examined datasets. It is ascertained that the proposed model might be a useful and high throughput tool for academia and research community for further structure and functional studies on transmembrane proteins.

  17. The identification of high potential archers based on relative psychological coping skills variables: A Support Vector Machine approach

    NASA Astrophysics Data System (ADS)

    Taha, Zahari; Muazu Musa, Rabiu; Majeed, A. P. P. Abdul; Razali Abdullah, Mohamad; Aizzat Zakaria, Muhammad; Muaz Alim, Muhammad; Arif Mat Jizat, Jessnor; Fauzi Ibrahim, Mohamad

    2018-03-01

    Support Vector Machine (SVM) has been revealed to be a powerful learning algorithm for classification and prediction. However, the use of SVM for prediction and classification in sport is at its inception. The present study classified and predicted high and low potential archers from a collection of psychological coping skills variables trained on different SVMs. 50 youth archers with the average age and standard deviation of (17.0 ±.056) gathered from various archery programmes completed a one end shooting score test. Psychological coping skills inventory which evaluates the archers level of related coping skills were filled out by the archers prior to their shooting tests. k-means cluster analysis was applied to cluster the archers based on their scores on variables assessed. SVM models, i.e. linear and fine radial basis function (RBF) kernel functions, were trained on the psychological variables. The k-means clustered the archers into high psychologically prepared archers (HPPA) and low psychologically prepared archers (LPPA), respectively. It was demonstrated that the linear SVM exhibited good accuracy and precision throughout the exercise with an accuracy of 92% and considerably fewer error rate for the prediction of the HPPA and the LPPA as compared to the fine RBF SVM. The findings of this investigation can be valuable to coaches and sports managers to recognise high potential athletes from the selected psychological coping skills variables examined which would consequently save time and energy during talent identification and development programme.

  18. Identification of temporal variations in mental workload using locally-linear-embedding-based EEG feature reduction and support-vector-machine-based clustering and classification techniques.

    PubMed

    Yin, Zhong; Zhang, Jianhua

    2014-07-01

    Identifying the abnormal changes of mental workload (MWL) over time is quite crucial for preventing the accidents due to cognitive overload and inattention of human operators in safety-critical human-machine systems. It is known that various neuroimaging technologies can be used to identify the MWL variations. In order to classify MWL into a few discrete levels using representative MWL indicators and small-sized training samples, a novel EEG-based approach by combining locally linear embedding (LLE), support vector clustering (SVC) and support vector data description (SVDD) techniques is proposed and evaluated by using the experimentally measured data. The MWL indicators from different cortical regions are first elicited by using the LLE technique. Then, the SVC approach is used to find the clusters of these MWL indicators and thereby to detect MWL variations. It is shown that the clusters can be interpreted as the binary class MWL. Furthermore, a trained binary SVDD classifier is shown to be capable of detecting slight variations of those indicators. By combining the two schemes, a SVC-SVDD framework is proposed, where the clear-cut (smaller) cluster is detected by SVC first and then a subsequent SVDD model is utilized to divide the overlapped (larger) cluster into two classes. Finally, three-class MWL levels (low, normal and high) can be identified automatically. The experimental data analysis results are compared with those of several existing methods. It has been demonstrated that the proposed framework can lead to acceptable computational accuracy and has the advantages of both unsupervised and supervised training strategies. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  19. Mapping Neglected Swimming Pools from Satellite Data for Urban Vector Control

    NASA Astrophysics Data System (ADS)

    Barker, C. M.; Melton, F. S.; Reisen, W. K.

    2010-12-01

    Neglected swimming pools provide suitable breeding habit for mosquitoes, can contain thousands of mosquito larvae, and present both a significant nuisance and public health risk due to their inherent proximity to urban and suburban populations. The rapid increase and sustained rate of foreclosures in California associated with the recent recession presents a challenge for vector control districts seeking to identify, treat, and monitor neglected pools. Commercial high resolution satellite imagery offers some promise for mapping potential neglected pools, and for mapping pools for which routine maintenance has been reestablished. We present progress on unsupervised classification techniques for mapping both neglected pools and clean pools using high resolution commercial satellite data and discuss the potential uses and limitations of this data source in support of vector control efforts. An unsupervised classification scheme that utilizes image segmentation, band thresholds, and a change detection approach was implemented for sample regions in Coachella Valley, CA and the greater Los Angeles area. Comparison with field data collected by vector control personal was used to assess the accuracy of the estimates. The results suggest that the current system may provide some utility for early detection, or cost effective and time efficient annual monitoring, but additional work is required to address spectral and spatial limitations of current commercial satellite sensors for this purpose.

  20. Topic detection using paragraph vectors to support active learning in systematic reviews.

    PubMed

    Hashimoto, Kazuma; Kontonatsios, Georgios; Miwa, Makoto; Ananiadou, Sophia

    2016-08-01

    Systematic reviews require expert reviewers to manually screen thousands of citations in order to identify all relevant articles to the review. Active learning text classification is a supervised machine learning approach that has been shown to significantly reduce the manual annotation workload by semi-automating the citation screening process of systematic reviews. In this paper, we present a new topic detection method that induces an informative representation of studies, to improve the performance of the underlying active learner. Our proposed topic detection method uses a neural network-based vector space model to capture semantic similarities between documents. We firstly represent documents within the vector space, and cluster the documents into a predefined number of clusters. The centroids of the clusters are treated as latent topics. We then represent each document as a mixture of latent topics. For evaluation purposes, we employ the active learning strategy using both our novel topic detection method and a baseline topic model (i.e., Latent Dirichlet Allocation). Results obtained demonstrate that our method is able to achieve a high sensitivity of eligible studies and a significantly reduced manual annotation cost when compared to the baseline method. This observation is consistent across two clinical and three public health reviews. The tool introduced in this work is available from https://nactem.ac.uk/pvtopic/. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  1. Support vector machines for TEC seismo-ionospheric anomalies detection

    NASA Astrophysics Data System (ADS)

    Akhoondzadeh, M.

    2013-02-01

    Using time series prediction methods, it is possible to pursue the behaviors of earthquake precursors in the future and to announce early warnings when the differences between the predicted value and the observed value exceed the predefined threshold value. Support Vector Machines (SVMs) are widely used due to their many advantages for classification and regression tasks. This study is concerned with investigating the Total Electron Content (TEC) time series by using a SVM to detect seismo-ionospheric anomalous variations induced by the three powerful earthquakes of Tohoku (11 March 2011), Haiti (12 January 2010) and Samoa (29 September 2009). The duration of TEC time series dataset is 49, 46 and 71 days, for Tohoku, Haiti and Samoa earthquakes, respectively, with each at time resolution of 2 h. In the case of Tohoku earthquake, the results show that the difference between the predicted value obtained from the SVM method and the observed value reaches the maximum value (i.e., 129.31 TECU) at earthquake time in a period of high geomagnetic activities. The SVM method detected a considerable number of anomalous occurrences 1 and 2 days prior to the Haiti earthquake and also 1 and 5 days before the Samoa earthquake in a period of low geomagnetic activities. In order to show that the method is acting sensibly with regard to the results extracted during nonevent and event TEC data, i.e., to perform some null-hypothesis tests in which the methods would also be calibrated, the same period of data from the previous year of the Samoa earthquake date has been taken into the account. Further to this, in this study, the detected TEC anomalies using the SVM method were compared to the previous results (Akhoondzadeh and Saradjian, 2011; Akhoondzadeh, 2012) obtained from the mean, median, wavelet and Kalman filter methods. The SVM detected anomalies are similar to those detected using the previous methods. It can be concluded that SVM can be a suitable learning method to detect

  2. Improved wavelet packet classification algorithm for vibrational intrusions in distributed fiber-optic monitoring systems

    NASA Astrophysics Data System (ADS)

    Wang, Bingjie; Pi, Shaohua; Sun, Qi; Jia, Bo

    2015-05-01

    An improved classification algorithm that considers multiscale wavelet packet Shannon entropy is proposed. Decomposition coefficients at all levels are obtained to build the initial Shannon entropy feature vector. After subtracting the Shannon entropy map of the background signal, components of the strongest discriminating power in the initial feature vector are picked out to rebuild the Shannon entropy feature vector, which is transferred to radial basis function (RBF) neural network for classification. Four types of man-made vibrational intrusion signals are recorded based on a modified Sagnac interferometer. The performance of the improved classification algorithm has been evaluated by the classification experiments via RBF neural network under different diffusion coefficients. An 85% classification accuracy rate is achieved, which is higher than the other common algorithms. The classification results show that this improved classification algorithm can be used to classify vibrational intrusion signals in an automatic real-time monitoring system.

  3. Advanced Methods for Passive Acoustic Detection, Classification, and Localization of Marine Mammals

    DTIC Science & Technology

    2014-09-30

    floor 1176 Howell St Newport RI 02842 phone: (401) 832-5749 fax: (401) 832-4441 email: David.Moretti@navy.mil Steve W. Martin SPAWAR...APPROACH Odontocete click detection and classification. A multi-class support vector machine (SVM) classifier was previously developed ( Jarvis ...beaked whales, Risso’s dolphins, short-finned pilot whales, and sperm whales. Here Moretti’s group, particularly S. Jarvis , is improving the SVM

  4. Advanced Methods for Passive Acoustic Detection, Classification, and Localization of Marine Mammals

    DTIC Science & Technology

    2011-09-30

    Newport RI 02842 phone: (401) 832-5749 fax: (401) 832-4441 email: David.Moretti@navy.mil Steve W. Martin SPAWAR Systems Center Pacific...APPROACH Odontocete click detection and classification. A multiclass support vector machine (SVM) classifier was previously developed ( Jarvis et...beaked whales, Risso’s dolphins, short-finned pilot whales, and sperm whales. Here Moretti’s group, especially S. Jarvis , will improve the SVM classifier

  5. A Nearest Neighbor Classifier Employing Critical Boundary Vectors for Efficient On-Chip Template Reduction.

    PubMed

    Xia, Wenjun; Mita, Yoshio; Shibata, Tadashi

    2016-05-01

    Aiming at efficient data condensation and improving accuracy, this paper presents a hardware-friendly template reduction (TR) method for the nearest neighbor (NN) classifiers by introducing the concept of critical boundary vectors. A hardware system is also implemented to demonstrate the feasibility of using an field-programmable gate array (FPGA) to accelerate the proposed method. Initially, k -means centers are used as substitutes for the entire template set. Then, to enhance the classification performance, critical boundary vectors are selected by a novel learning algorithm, which is completed within a single iteration. Moreover, to remove noisy boundary vectors that can mislead the classification in a generalized manner, a global categorization scheme has been explored and applied to the algorithm. The global characterization automatically categorizes each classification problem and rapidly selects the boundary vectors according to the nature of the problem. Finally, only critical boundary vectors and k -means centers are used as the new template set for classification. Experimental results for 24 data sets show that the proposed algorithm can effectively reduce the number of template vectors for classification with a high learning speed. At the same time, it improves the accuracy by an average of 2.17% compared with the traditional NN classifiers and also shows greater accuracy than seven other TR methods. We have shown the feasibility of using a proof-of-concept FPGA system of 256 64-D vectors to accelerate the proposed method on hardware. At a 50-MHz clock frequency, the proposed system achieves a 3.86 times higher learning speed than on a 3.4-GHz PC, while consuming only 1% of the power of that used by the PC.

  6. Automated cell analysis tool for a genome-wide RNAi screen with support vector machine based supervised learning

    NASA Astrophysics Data System (ADS)

    Remmele, Steffen; Ritzerfeld, Julia; Nickel, Walter; Hesser, Jürgen

    2011-03-01

    RNAi-based high-throughput microscopy screens have become an important tool in biological sciences in order to decrypt mostly unknown biological functions of human genes. However, manual analysis is impossible for such screens since the amount of image data sets can often be in the hundred thousands. Reliable automated tools are thus required to analyse the fluorescence microscopy image data sets usually containing two or more reaction channels. The herein presented image analysis tool is designed to analyse an RNAi screen investigating the intracellular trafficking and targeting of acylated Src kinases. In this specific screen, a data set consists of three reaction channels and the investigated cells can appear in different phenotypes. The main issue of the image processing task is an automatic cell segmentation which has to be robust and accurate for all different phenotypes and a successive phenotype classification. The cell segmentation is done in two steps by segmenting the cell nuclei first and then using a classifier-enhanced region growing on basis of the cell nuclei to segment the cells. The classification of the cells is realized by a support vector machine which has to be trained manually using supervised learning. Furthermore, the tool is brightness invariant allowing different staining quality and it provides a quality control that copes with typical defects during preparation and acquisition. A first version of the tool has already been successfully applied for an RNAi-screen containing three hundred thousand image data sets and the SVM extended version is designed for additional screens.

  7. Detection of Pathological Voice Using Cepstrum Vectors: A Deep Learning Approach.

    PubMed

    Fang, Shih-Hau; Tsao, Yu; Hsiao, Min-Jing; Chen, Ji-Ying; Lai, Ying-Hui; Lin, Feng-Chuan; Wang, Chi-Te

    2018-03-19

    Computerized detection of voice disorders has attracted considerable academic and clinical interest in the hope of providing an effective screening method for voice diseases before endoscopic confirmation. This study proposes a deep-learning-based approach to detect pathological voice and examines its performance and utility compared with other automatic classification algorithms. This study retrospectively collected 60 normal voice samples and 402 pathological voice samples of 8 common clinical voice disorders in a voice clinic of a tertiary teaching hospital. We extracted Mel frequency cepstral coefficients from 3-second samples of a sustained vowel. The performances of three machine learning algorithms, namely, deep neural network (DNN), support vector machine, and Gaussian mixture model, were evaluated based on a fivefold cross-validation. Collective cases from the voice disorder database of MEEI (Massachusetts Eye and Ear Infirmary) were used to verify the performance of the classification mechanisms. The experimental results demonstrated that DNN outperforms Gaussian mixture model and support vector machine. Its accuracy in detecting voice pathologies reached 94.26% and 90.52% in male and female subjects, based on three representative Mel frequency cepstral coefficient features. When applied to the MEEI database for validation, the DNN also achieved a higher accuracy (99.32%) than the other two classification algorithms. By stacking several layers of neurons with optimized weights, the proposed DNN algorithm can fully utilize the acoustic features and efficiently differentiate between normal and pathological voice samples. Based on this pilot study, future research may proceed to explore more application of DNN from laboratory and clinical perspectives. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  8. Gender classification from video under challenging operating conditions

    NASA Astrophysics Data System (ADS)

    Mendoza-Schrock, Olga; Dong, Guozhu

    2014-06-01

    The literature is abundant with papers on gender classification research. However the majority of such research is based on the assumption that there is enough resolution so that the subject's face can be resolved. Hence the majority of the research is actually in the face recognition and facial feature area. A gap exists for gender classification under challenging operating conditions—different seasonal conditions, different clothing, etc.—and when the subject's face cannot be resolved due to lack of resolution. The Seasonal Weather and Gender (SWAG) Database is a novel database that contains subjects walking through a scene under operating conditions that span a calendar year. This paper exploits a subset of that database—the SWAG One dataset—using data mining techniques, traditional classifiers (ex. Naïve Bayes, Support Vector Machine, etc.) and traditional (canny edge detection, etc.) and non-traditional (height/width ratios, etc.) feature extractors to achieve high correct gender classification rates (greater than 85%). Another novelty includes exploiting frame differentials.

  9. Modified Mahalanobis Taguchi System for Imbalance Data Classification

    PubMed Central

    2017-01-01

    The Mahalanobis Taguchi System (MTS) is considered one of the most promising binary classification algorithms to handle imbalance data. Unfortunately, MTS lacks a method for determining an efficient threshold for the binary classification. In this paper, a nonlinear optimization model is formulated based on minimizing the distance between MTS Receiver Operating Characteristics (ROC) curve and the theoretical optimal point named Modified Mahalanobis Taguchi System (MMTS). To validate the MMTS classification efficacy, it has been benchmarked with Support Vector Machines (SVMs), Naive Bayes (NB), Probabilistic Mahalanobis Taguchi Systems (PTM), Synthetic Minority Oversampling Technique (SMOTE), Adaptive Conformal Transformation (ACT), Kernel Boundary Alignment (KBA), Hidden Naive Bayes (HNB), and other improved Naive Bayes algorithms. MMTS outperforms the benchmarked algorithms especially when the imbalance ratio is greater than 400. A real life case study on manufacturing sector is used to demonstrate the applicability of the proposed model and to compare its performance with Mahalanobis Genetic Algorithm (MGA). PMID:28811820

  10. Decision support system for diabetic retinopathy using discrete wavelet transform.

    PubMed

    Noronha, K; Acharya, U R; Nayak, K P; Kamath, S; Bhandary, S V

    2013-03-01

    Prolonged duration of the diabetes may affect the tiny blood vessels of the retina causing diabetic retinopathy. Routine eye screening of patients with diabetes helps to detect diabetic retinopathy at the early stage. It is very laborious and time-consuming for the doctors to go through many fundus images continuously. Therefore, decision support system for diabetic retinopathy detection can reduce the burden of the ophthalmologists. In this work, we have used discrete wavelet transform and support vector machine classifier for automated detection of normal and diabetic retinopathy classes. The wavelet-based decomposition was performed up to the second level, and eight energy features were extracted. Two energy features from the approximation coefficients of two levels and six energy values from the details in three orientations (horizontal, vertical and diagonal) were evaluated. These features were fed to the support vector machine classifier with various kernel functions (linear, radial basis function, polynomial of orders 2 and 3) to evaluate the highest classification accuracy. We obtained the highest average classification accuracy, sensitivity and specificity of more than 99% with support vector machine classifier (polynomial kernel of order 3) using three discrete wavelet transform features. We have also proposed an integrated index called Diabetic Retinopathy Risk Index using clinically significant wavelet energy features to identify normal and diabetic retinopathy classes using just one number. We believe that this (Diabetic Retinopathy Risk Index) can be used as an adjunct tool by the doctors during the eye screening to cross-check their diagnosis.

  11. Voxel-Based Neighborhood for Spatial Shape Pattern Classification of Lidar Point Clouds with Supervised Learning

    PubMed Central

    Plaza-Leiva, Victoria; Gomez-Ruiz, Jose Antonio; Mandow, Anthony; García-Cerezo, Alfonso

    2017-01-01

    Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood. PMID:28294963

  12. Voxel-Based Neighborhood for Spatial Shape Pattern Classification of Lidar Point Clouds with Supervised Learning.

    PubMed

    Plaza-Leiva, Victoria; Gomez-Ruiz, Jose Antonio; Mandow, Anthony; García-Cerezo, Alfonso

    2017-03-15

    Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood.

  13. Classification of CT examinations for COPD visual severity analysis

    NASA Astrophysics Data System (ADS)

    Tan, Jun; Zheng, Bin; Wang, Xingwei; Pu, Jiantao; Gur, David; Sciurba, Frank C.; Leader, J. Ken

    2012-03-01

    In this study we present a computational method of CT examination classification into visual assessed emphysema severity. The visual severity categories ranged from 0 to 5 and were rated by an experienced radiologist. The six categories were none, trace, mild, moderate, severe and very severe. Lung segmentation was performed for every input image and all image features are extracted from the segmented lung only. We adopted a two-level feature representation method for the classification. Five gray level distribution statistics, six gray level co-occurrence matrix (GLCM), and eleven gray level run-length (GLRL) features were computed for each CT image depicted segment lung. Then we used wavelets decomposition to obtain the low- and high-frequency components of the input image, and again extract from the lung region six GLCM features and eleven GLRL features. Therefore our feature vector length is 56. The CT examinations were classified using the support vector machine (SVM) and k-nearest neighbors (KNN) and the traditional threshold (density mask) approach. The SVM classifier had the highest classification performance of all the methods with an overall sensitivity of 54.4% and a 69.6% sensitivity to discriminate "no" and "trace visually assessed emphysema. We believe this work may lead to an automated, objective method to categorically classify emphysema severity on CT exam.

  14. Unified framework for triaxial accelerometer-based fall event detection and classification using cumulants and hierarchical decision tree classifier.

    PubMed

    Kambhampati, Satya Samyukta; Singh, Vishal; Manikandan, M Sabarimalai; Ramkumar, Barathram

    2015-08-01

    In this Letter, the authors present a unified framework for fall event detection and classification using the cumulants extracted from the acceleration (ACC) signals acquired using a single waist-mounted triaxial accelerometer. The main objective of this Letter is to find suitable representative cumulants and classifiers in effectively detecting and classifying different types of fall and non-fall events. It was discovered that the first level of the proposed hierarchical decision tree algorithm implements fall detection using fifth-order cumulants and support vector machine (SVM) classifier. In the second level, the fall event classification algorithm uses the fifth-order cumulants and SVM. Finally, human activity classification is performed using the second-order cumulants and SVM. The detection and classification results are compared with those of the decision tree, naive Bayes, multilayer perceptron and SVM classifiers with different types of time-domain features including the second-, third-, fourth- and fifth-order cumulants and the signal magnitude vector and signal magnitude area. The experimental results demonstrate that the second- and fifth-order cumulant features and SVM classifier can achieve optimal detection and classification rates of above 95%, as well as the lowest false alarm rate of 1.03%.

  15. Automatic classification of thermal patterns in diabetic foot based on morphological pattern spectrum

    NASA Astrophysics Data System (ADS)

    Hernandez-Contreras, D.; Peregrina-Barreto, H.; Rangel-Magdaleno, J.; Ramirez-Cortes, J.; Renero-Carrillo, F.

    2015-11-01

    This paper presents a novel approach to characterize and identify patterns of temperature in thermographic images of the human foot plant in support of early diagnosis and follow-up of diabetic patients. Composed feature vectors based on 3D morphological pattern spectrum (pecstrum) and relative position, allow the system to quantitatively characterize and discriminate non-diabetic (control) and diabetic (DM) groups. Non-linear classification using neural networks is used for that purpose. A classification rate of 94.33% in average was obtained with the composed feature extraction process proposed in this paper. Performance evaluation and obtained results are presented.

  16. Application of Support Vector Machine to Forex Monitoring

    NASA Astrophysics Data System (ADS)

    Kamruzzaman, Joarder; Sarker, Ruhul A.

    Previous studies have demonstrated superior performance of artificial neural network (ANN) based forex forecasting models over traditional regression models. This paper applies support vector machines to build a forecasting model from the historical data using six simple technical indicators and presents a comparison with an ANN based model trained by scaled conjugate gradient (SCG) learning algorithm. The models are evaluated and compared on the basis of five commonly used performance metrics that measure closeness of prediction as well as correctness in directional change. Forecasting results of six different currencies against Australian dollar reveal superior performance of SVM model using simple linear kernel over ANN-SCG model in terms of all the evaluation metrics. The effect of SVM parameter selection on prediction performance is also investigated and analyzed.

  17. Hidden Markov Model and Support Vector Machine based decoding of finger movements using Electrocorticography

    PubMed Central

    Wissel, Tobias; Pfeiffer, Tim; Frysch, Robert; Knight, Robert T.; Chang, Edward F.; Hinrichs, Hermann; Rieger, Jochem W.; Rose, Georg

    2013-01-01

    Objective Support Vector Machines (SVM) have developed into a gold standard for accurate classification in Brain-Computer-Interfaces (BCI). The choice of the most appropriate classifier for a particular application depends on several characteristics in addition to decoding accuracy. Here we investigate the implementation of Hidden Markov Models (HMM)for online BCIs and discuss strategies to improve their performance. Approach We compare the SVM, serving as a reference, and HMMs for classifying discrete finger movements obtained from the Electrocorticograms of four subjects doing a finger tapping experiment. The classifier decisions are based on a subset of low-frequency time domain and high gamma oscillation features. Main results We show that decoding optimization between the two approaches is due to the way features are extracted and selected and less dependent on the classifier. An additional gain in HMM performance of up to 6% was obtained by introducing model constraints. Comparable accuracies of up to 90% were achieved with both SVM and HMM with the high gamma cortical response providing the most important decoding information for both techniques. Significance We discuss technical HMM characteristics and adaptations in the context of the presented data as well as for general BCI applications. Our findings suggest that HMMs and their characteristics are promising for efficient online brain-computer interfaces. PMID:24045504

  18. Stable Isotope Ratio and Elemental Profile Combined with Support Vector Machine for Provenance Discrimination of Oolong Tea (Wuyi-Rock Tea)

    PubMed Central

    Lou, Yun-xiao; Fu, Xian-shu; Yu, Xiao-ping; Zhang, Ya-fen

    2017-01-01

    This paper focused on an effective method to discriminate the geographical origin of Wuyi-Rock tea by the stable isotope ratio (SIR) and metallic element profiling (MEP) combined with support vector machine (SVM) analysis. Wuyi-Rock tea (n = 99) collected from nine producing areas and non-Wuyi-Rock tea (n = 33) from eleven nonproducing areas were analysed for SIR and MEP by established methods. The SVM model based on coupled data produced the best prediction accuracy (0.9773). This prediction shows that instrumental methods combined with a classification model can provide an effective and stable tool for provenance discrimination. Moreover, every feature variable in stable isotope and metallic element data was ranked by its contribution to the model. The results show that δ2H, δ18O, Cs, Cu, Ca, and Rb contents are significant indications for provenance discrimination and not all of the metallic elements improve the prediction accuracy of the SVM model. PMID:28473941

  19. Computer-aided classification of optical images for diagnosis of osteoarthritis in the finger joints.

    PubMed

    Zhang, Jiang; Wang, James Z; Yuan, Zhen; Sobel, Eric S; Jiang, Huabei

    2011-01-01

    This study presents a computer-aided classification method to distinguish osteoarthritis finger joints from healthy ones based on the functional images captured by x-ray guided diffuse optical tomography. Three imaging features, joint space width, optical absorption, and scattering coefficients, are employed to train a Least Squares Support Vector Machine (LS-SVM) classifier for osteoarthritis classification. The 10-fold validation results show that all osteoarthritis joints are clearly identified and all healthy joints are ruled out by the LS-SVM classifier. The best sensitivity, specificity, and overall accuracy of the classification by experienced technicians based on manual calculation of optical properties and visual examination of optical images are only 85%, 93%, and 90%, respectively. Therefore, our LS-SVM based computer-aided classification is a considerably improved method for osteoarthritis diagnosis.

  20. A novel representation for apoptosis protein subcellular localization prediction using support vector machine.

    PubMed

    Zhang, Li; Liao, Bo; Li, Dachao; Zhu, Wen

    2009-07-21

    Apoptosis, or programmed cell death, plays an important role in development of an organism. Obtaining information on subcellular location of apoptosis proteins is very helpful to understand the apoptosis mechanism. In this paper, based on the concept that the position distribution information of amino acids is closely related with the structure and function of proteins, we introduce the concept of distance frequency [Matsuda, S., Vert, J.P., Ueda, N., Toh, H., Akutsu, T., 2005. A novel representation of protein sequences for prediction of subcellular location using support vector machines. Protein Sci. 14, 2804-2813] and propose a novel way to calculate distance frequencies. In order to calculate the local features, each protein sequence is separated into p parts with the same length in our paper. Then we use the novel representation of protein sequences and adopt support vector machine to predict subcellular location. The overall prediction accuracy is significantly improved by jackknife test.

  1. ASIST SIG/CR Classification Workshop 2000: Classification for User Support and Learning.

    ERIC Educational Resources Information Center

    Soergel, Dagobert

    2001-01-01

    Reports on papers presented at the 62nd Annual Meeting of ASIST (American Society for Information Science and Technology) for the Special Interest Group in Classification Research (SIG/CR). Topics include types of knowledge; developing user-oriented classifications, including domain analysis; classification in the user interface; and automatic…

  2. Face recognition using total margin-based adaptive fuzzy support vector machines.

    PubMed

    Liu, Yi-Hung; Chen, Yen-Ting

    2007-01-01

    This paper presents a new classifier called total margin-based adaptive fuzzy support vector machines (TAF-SVM) that deals with several problems that may occur in support vector machines (SVMs) when applied to the face recognition. The proposed TAF-SVM not only solves the overfitting problem resulted from the outlier with the approach of fuzzification of the penalty, but also corrects the skew of the optimal separating hyperplane due to the very imbalanced data sets by using different cost algorithm. In addition, by introducing the total margin algorithm to replace the conventional soft margin algorithm, a lower generalization error bound can be obtained. Those three functions are embodied into the traditional SVM so that the TAF-SVM is proposed and reformulated in both linear and nonlinear cases. By using two databases, the Chung Yuan Christian University (CYCU) multiview and the facial recognition technology (FERET) face databases, and using the kernel Fisher's discriminant analysis (KFDA) algorithm to extract discriminating face features, experimental results show that the proposed TAF-SVM is superior to SVM in terms of the face-recognition accuracy. The results also indicate that the proposed TAF-SVM can achieve smaller error variances than SVM over a number of tests such that better recognition stability can be obtained.

  3. PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models tomore » curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k -nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.« less

  4. Photometric Supernova Classification with Machine Learning

    NASA Astrophysics Data System (ADS)

    Lochner, Michelle; McEwen, Jason D.; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.

    2016-08-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Sy