Electricity load forecasting using support vector regression with memetic algorithms.
Hu, Zhongyi; Bao, Yukun; Xiong, Tao
2013-01-01
Electricity load forecasting is an important issue that is widely explored and examined in the power systems operation literature as well as in the literature on commercial transactions in electricity markets. Among the existing forecasting models, support vector regression (SVR) has gained much attention. Because the performance of SVR depends heavily on its parameters, this study proposes a firefly algorithm (FA) based memetic algorithm (FA-MA) to appropriately determine the parameters of the SVR forecasting model. In the proposed FA-MA algorithm, the FA is applied to explore the solution space, and pattern search is used to conduct individual learning and thus enhance the exploitation of FA. Experimental results confirm that the proposed FA-MA based SVR model not only yields more accurate forecasting results than the other four evolutionary-algorithm-based SVR models and three well-known forecasting models but also outperforms the hybrid algorithms in the related existing literature.
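The division of labor described above (FA for global exploration, pattern search for individual learning) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the swarm settings, and above all the objective, which is a toy quadratic standing in for a cross-validated SVR error over two tuning parameters. It is not the authors' implementation.

```python
import math
import random

def objective(p):
    # Hypothetical stand-in for a cross-validated SVR error over two
    # tuning parameters; a real study would train and score a model here.
    x, y = p
    return (x - 1.0) ** 2 + (y + 2.0) ** 2

def pattern_search(f, start, step=1.0, tol=1e-4):
    """Compass search: probe +/- step on each axis, halve step on failure."""
    point, best = list(start), f(start)
    while step > tol:
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = f(trial)
                if value < best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return point, best

def firefly_memetic(f, dim=2, n=8, generations=20, seed=0,
                    beta0=1.0, gamma=0.1, alpha=0.2):
    """Firefly moves for exploration, pattern search on the best individual."""
    rng = random.Random(seed)
    swarm = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    for _ in range(generations):
        scores = [f(p) for p in swarm]
        for i in range(n):
            for j in range(n):
                if scores[j] < scores[i]:  # j is "brighter": move i toward j
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    swarm[i] = [a + beta * (b - a) + alpha * (rng.random() - 0.5)
                                for a, b in zip(swarm[i], swarm[j])]
                    scores[i] = f(swarm[i])
        # Individual learning: refine the current best firefly locally.
        k = min(range(n), key=scores.__getitem__)
        swarm[k], _ = pattern_search(f, swarm[k])
    return min(((p, f(p)) for p in swarm), key=lambda t: t[1])

best_point, best_value = firefly_memetic(objective)
```

On this toy objective the refined best individual ends up very close to the minimum at (1, -2); with a real SVR error surface the same loop would simply call a model-training routine inside `objective`.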
[Comparative Efficiency of Algorithms Based on Support Vector Machines for Regression].
Kadyrova, N O; Pavlova, L V
2015-01-01
Methods of construction of support vector machines do not require additional a priori information and can be used to process large-scale data sets. This is especially important for various problems in computational biology. The main set of support vector machine algorithms for regression is presented. The comparative efficiency of a number of support vector algorithms for regression is investigated. A thorough analysis of the study results identified the most efficient support vector algorithms for regression. A description of the presented algorithms, sufficient for their practical implementation, is given.
Support Vector Machine algorithm for regression and classification
Yu, Chenggang; Zavaljevski, Nela
2001-08-01
The software is an implementation of the Support Vector Machine (SVM) algorithm that was invented and developed by Vladimir Vapnik and his co-workers at AT&T Bell Laboratories. The specific implementation reported here is an Active Set method for solving a quadratic optimization problem that forms the major part of any SVM program. The implementation is tuned to specific constraints generated in the SVM learning. Thus, it is more efficient than general-purpose quadratic optimization programs. A decomposition method has been implemented in the software that enables processing large data sets. The size of the learning data is virtually unlimited by the capacity of the computer physical memory. The software is flexible and extensible. Two upper bounds are implemented to regulate the SVM learning for classification, which allow users to adjust the false positive and false negative rates. The software can be used either as a standalone, general-purpose SVM regression or classification program, or be embedded into a larger software system.
[Comparative efficiency of algorithms based on support vector machines for binary classification].
Kadyrova, N O; Pavlova, L V
2015-01-01
Methods of construction of support vector machines require no further a priori information and support big data processing, which is especially important for various problems in computational biology. The question of the quality of learning algorithms is considered. The main algorithms of support vector machines for binary classification are reviewed and their efficiencies are compared. A critical analysis of the results of this study revealed the most effective support vector classifiers. A description of the recommended algorithms, sufficient for their practical implementation, is presented.
Approximation of HRPITS results for SI GaAs by large scale support vector machine algorithms
NASA Astrophysics Data System (ADS)
Jankowski, Stanisław; Wojdan, Konrad; Szymański, Zbigniew; Kozłowski, Roman
2006-10-01
For the first time, large-scale support vector machine algorithms are used to extract defect parameters in semi-insulating (SI) GaAs from high-resolution photoinduced transient spectroscopy (HRPITS) experiments. By smart decomposition of the data set, the SVMTorch algorithm made it possible to obtain a good approximation of the analyzed correlation surface with a parsimonious model (one with a small number of support vectors). The parameters of deep-level defect centers extracted from the SVM approximation are of good quality compared with the reference data.
Noise robust speech recognition with support vector learning algorithms
NASA Astrophysics Data System (ADS)
Namarvar, Hassan H.; Berger, Theodore W.
2001-05-01
We propose a new noise-robust speech recognition system using time-frequency domain analysis and radial basis function (RBF) support vector machines (SVM). Here, we ignore the effects of correlative and nonstationary noise and focus only on continuous additive Gaussian white noise. We then develop an isolated digit/command recognizer and compare its performance to two other systems, in which the SVM classifier has been replaced by multilayer perceptron (MLP) and RBF neural networks. All systems are trained under the low signal-to-noise ratio (SNR) condition. We obtained the best correct classification rates of 83% and 52% for digit recognition on the TI-46 corpus for the SVM and MLP systems, respectively, at SNR = 0 dB, while we could not train the RBF network on the same dataset. The newly developed speech recognition system seems to be noise robust for medium-size speech recognition problems under continuous, stationary background noise. However, the system still needs to be tested in realistic noisy environments to observe whether it keeps its adaptability and robustness under such conditions. [Work supported in part by grants from DARPA CBS, NASA, and ONR.]
Design of Clinical Support Systems Using Integrated Genetic Algorithm and Support Vector Machine
NASA Astrophysics Data System (ADS)
Chen, Yung-Fu; Huang, Yung-Fa; Jiang, Xiaoyi; Hsu, Yuan-Nian; Lin, Hsuan-Hung
A clinical decision support system (CDSS) provides knowledge and specific information for clinicians to enhance diagnostic efficiency and improve healthcare quality. An appropriate CDSS can greatly elevate patient safety, improve healthcare quality, and increase cost-effectiveness. The support vector machine (SVM) is believed to be superior to traditional statistical and neural network classifiers. However, it is critical to determine a suitable combination of SVM parameters with regard to classification performance. A genetic algorithm (GA) can find an optimal solution within an acceptable time, and is faster than a greedy algorithm with an exhaustive searching strategy. By taking advantage of GA in quickly selecting the salient features and adjusting SVM parameters, a method using integrated GA and SVM (IGS), which differs from the traditional method with GA used for feature selection and SVM for classification, was used to design CDSSs for prediction of successful ventilation weaning, diagnosis of patients with severe obstructive sleep apnea, and discrimination of different cell types from Pap smears. The results show that IGS is better than methods using SVM alone or a linear discriminator.
NASA Astrophysics Data System (ADS)
Luo, Wei-Ping; Li, Hong-Qi; Shi, Ning
2016-06-01
At the early stages of deep-water oil exploration and development, fewer and more widely spaced wells are drilled than in onshore oilfields. Supervised least squares support vector machine algorithms are used to predict the reservoir parameters, but the prediction accuracy is low. We combined the least squares support vector machine (LSSVM) algorithm with semi-supervised learning and established a semi-supervised regression model, which we call the semi-supervised least squares support vector machine (SLSSVM) model. Iterative matrix inversion is also introduced to improve the training ability and training time of the model. We use UCI data to test the generalization of the semi-supervised and supervised LSSVM models. The test results suggest that the generalization performance of the LSSVM model greatly improves, and that with fewer training samples the generalization performance is better. Moreover, for small-sample models, the SLSSVM method has higher precision than the semi-supervised K-nearest neighbor (SKNN) method. The new semi-supervised LSSVM algorithm was used to predict the distribution of porosity and sandstone in the Jingzhou study area.
Application of support vector machine and quantum genetic algorithm in infrared target recognition
NASA Astrophysics Data System (ADS)
Wang, Hongliang; Huang, Yangwen; Ding, Haifei
2010-08-01
In this paper, a classifier based on the support vector machine (SVM) is designed for infrared target recognition. To address the problem of how to choose the kernel parameter and error penalty factor, a quantum genetic algorithm (QGA) is used to optimize the parameters of the SVM model, which overcomes the shortcoming of earlier approaches that determined these parameters by trial and error. Classification experiments on infrared target features extracted by this method show that the convergence speed is fast and the recognition accuracy is high.
NASA Technical Reports Server (NTRS)
Garay, Michael J.; Mazzoni, Dominic; Davies, Roger; Wagstaff, Kiri
2004-01-01
Support Vector Machines (SVMs) are a type of supervised learning algorithm, other examples of which are Artificial Neural Networks (ANNs), Decision Trees, and Naive Bayesian Classifiers. Supervised learning algorithms are used to classify objects labeled by a 'supervisor', typically a human 'expert'.
An incremental learning algorithm based on Support Vector Machine for pattern recognition
NASA Astrophysics Data System (ADS)
Zou, Lamei; Zhang, Tianxu; Cao, Zhiguo
2009-10-01
With the advent of the information age, and especially with the rapid development of networks, the "information explosion" problem has emerged. The original idea of incremental learning is to steadily improve a classifier's training precision as samples accumulate. The Support Vector Machine (SVM) has been successfully applied in many pattern recognition fields, but its computational complexity is the bottleneck when dealing with large-scale data. It is therefore important to study incremental learning for SVMs. This article proposes an SVM incremental learning algorithm based on the filtering fixed partition of the data set. The article first presents the algorithm for the two-class problem and then generalizes it to the multiclass problem using the One-vs-One method. Experimental results on the classification of three types of data sets show that the proposed incremental learning technique can greatly improve the efficiency of SVM learning. SVM incremental learning can not only ensure the correct identification rate but also speed up the training process.
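The One-vs-One generalization mentioned above reduces a multiclass decision to a vote among two-class deciders, one per pair of classes. The following is a minimal sketch of that voting step only; the function name and the threshold-based pairwise deciders are hypothetical stand-ins for trained two-class SVMs, and tie-breaking is left unspecified.

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(classes, pairwise, x):
    """Return the class that wins the most pairwise contests for sample x.

    `pairwise[(a, b)]` is any callable that returns either a or b;
    in practice it would wrap a trained two-class SVM.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise[(a, b)](x)] += 1
    return votes.most_common(1)[0][0]

# Toy pairwise deciders on a 1-D input (illustrative thresholds only).
classes = [0, 1, 2]
pairwise = {
    (0, 1): lambda x: 0 if x < 1 else 1,
    (0, 2): lambda x: 0 if x < 2 else 2,
    (1, 2): lambda x: 1 if x < 3 else 2,
}
label = one_vs_one_predict(classes, pairwise, 2.5)
```

With k classes this scheme needs k(k-1)/2 pairwise classifiers, each trained on only the samples of its two classes.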
NASA Astrophysics Data System (ADS)
Zhang, XiaoLi; Liang, DaKai; Zeng, Jie; Asundi, Anand
2012-02-01
Structural Health Monitoring (SHM) based on fiber Bragg grating (FBG) sensor networks has attracted considerable attention in recent years. However, FBG sensor networks are typically embedded in or glued to the structure in simple series or parallel arrangements. In this case, if optical fiber sensors or fiber nodes fail, the sensors behind the failure point can no longer be read. Therefore, to improve the survivability of FBG-based sensor systems in SHM, it is necessary to build highly reliable FBG sensor networks for SHM engineering applications. In this study, a model reconstruction soft computing recognition algorithm based on genetic algorithm-support vector regression (GA-SVR) is proposed to achieve reliability of the FBG-based sensor system. Furthermore, an 8-point FBG sensor system is tested in an aircraft wing box. Prediction of the position of external loading damage is an important subject for SHM systems; as an example, different failure modes are selected to demonstrate the survivability of the FBG-based sensor network in the SHM system. The results are also compared with the non-reconstructed model based on GA-SVR in each failure mode. Results show that the proposed model reconstruction algorithm based on GA-SVR maintains its prediction precision when some sensors in the SHM system fail; thus a highly reliable sensor network for the SHM system is obtained without introducing extra components or noise.
A Novel Classification Algorithm Based on Incremental Semi-Supervised Support Vector Machine
Gao, Fei; Mei, Jingyuan; Sun, Jinping; Wang, Jun; Yang, Erfu; Hussain, Amir
2015-01-01
For current computational intelligence techniques, a major challenge is how to learn new concepts in a changing environment. Traditional learning schemes cannot adequately address this problem because they lack a dynamic data selection mechanism. In this paper, inspired by the human learning process, a novel classification algorithm based on an incremental semi-supervised support vector machine (SVM) is proposed. Through the analysis of the prediction confidence of samples and the data distribution in a changing environment, a “soft-start” approach, a data selection mechanism and a data cleaning mechanism are designed, which complete the construction of our incremental semi-supervised learning system. Notably, the design of the proposed algorithm effectively reduces the computational complexity. In addition, the possible appearance of new labeled samples during the learning process is analyzed in detail. The results show that our algorithm does not rely on a model of the sample distribution, has an extremely low rate of introducing wrong semi-labeled samples, and can effectively make use of the unlabeled samples to enrich the knowledge system of the classifier and improve the accuracy rate. Moreover, our method also has outstanding generalization performance and the ability to overcome concept drift in a changing environment. PMID:26275294
Zhang, Daqing; Xiao, Jianfeng; Zhou, Nannan; Zheng, Mingyue; Luo, Xiaomin; Jiang, Hualiang; Chen, Kaixian
2015-01-01
The blood-brain barrier (BBB) is a highly complex physical barrier determining what substances are allowed to enter the brain. The support vector machine (SVM) is a kernel-based machine learning method that is widely used in QSAR studies. For a successful SVM model, the kernel parameters for SVM and feature subset selection are the most important factors affecting prediction accuracy. In most studies, they are treated as two independent problems, but it has been proven that they can affect each other. We designed and implemented a genetic algorithm (GA) to optimize kernel parameters and feature subset selection for SVM regression and applied it to BBB penetration prediction. The results show that our GA/SVM model is more accurate than other currently available log BB models. Therefore, optimizing both SVM parameters and the feature subset simultaneously with a genetic algorithm is a better approach than other methods that treat the two problems separately. Analysis of our log BB model suggests that the carboxylic acid group, polar surface area (PSA)/hydrogen-bonding ability, lipophilicity, and molecular charge play important roles in BBB penetration. Among those properties relevant to BBB penetration, lipophilicity could enhance BBB penetration while all the others are negatively correlated with it. PMID:26504797
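Optimizing the feature subset and the SVM parameters at the same time usually means placing both in a single chromosome, so that crossover and mutation act on them jointly. The sketch below shows one possible binary encoding and its decoding; the layout, the function name, and the log2 search ranges are assumptions for illustration and are not taken from the paper.

```python
def decode(chromosome, n_features, bits_per_param=8,
           log2_c_range=(-5.0, 15.0), log2_gamma_range=(-15.0, 3.0)):
    """Split a bit list into (selected feature indices, C, gamma).

    Assumed layout: [feature mask | bits for C | bits for gamma].
    Parameter bits are mapped linearly onto a log2 scale.
    """
    mask = chromosome[:n_features]
    c_bits = chromosome[n_features:n_features + bits_per_param]
    g_bits = chromosome[n_features + bits_per_param:
                        n_features + 2 * bits_per_param]

    def scale(bits, lo, hi):
        value = int("".join(str(b) for b in bits), 2)
        fraction = value / (2 ** len(bits) - 1)
        return 2.0 ** (lo + fraction * (hi - lo))

    selected = [i for i, bit in enumerate(mask) if bit]
    return (selected,
            scale(c_bits, *log2_c_range),
            scale(g_bits, *log2_gamma_range))

# Three candidate features, then 8 bits for C and 8 bits for gamma.
chromosome = [1, 0, 1] + [0] * 8 + [1] * 8
selected, c_value, gamma_value = decode(chromosome, n_features=3)
```

A fitness function would then train an SVM regressor on the `selected` columns with `c_value` and `gamma_value` and return its cross-validated error, so a single GA run searches both spaces together.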
Modelling soil water retention using support vector machines with genetic algorithm optimisation.
Lamorski, Krzysztof; Sławiński, Cezary; Moreno, Felix; Barna, Gyöngyi; Skierucha, Wojciech; Arrue, José L
2014-01-01
This work presents point pedotransfer function (PTF) models of the soil water retention curve. The developed models allow estimation of the soil water content for the specified soil water potentials: -0.98, -3.10, -9.81, -31.02, -491.66, and -1554.78 kPa, based on the following soil characteristics: soil granulometric composition, total porosity, and bulk density. Support Vector Machines (SVM) methodology was used for model development. A new methodology for the elaboration of retention function models is proposed. As an alternative to previous attempts known from the literature, the ν-SVM method was used for model development and the results were compared with the formerly used C-SVM method. Genetic algorithms were used as an optimisation framework for the search for model parameters. A new form of the aim function used for the model parameter search is proposed, which allowed for the development of models with better prediction capabilities. This new aim function avoids the overestimation of models that is typically encountered when root mean squared error is used as the aim function. The elaborated models showed good agreement with measured soil water retention data. The coefficients of determination achieved were in the range 0.67-0.92. The studies demonstrated the usability of the ν-SVM methodology together with genetic algorithm optimisation for retention modelling, which gave better performing models than the other tested approaches. PMID:24772030
Automatic ultrasonic breast lesions detection using support vector machine based algorithm
NASA Astrophysics Data System (ADS)
Yeh, Chih-Kuang; Miao, Shan-Jung; Fan, Wei-Che; Chen, Yung-Sheng
2007-03-01
It is difficult to automatically detect tumors and extract lesion boundaries in ultrasound images due to the variance in shape, the interference from speckle noise, and the low contrast between objects and background. Enhancement of the ultrasonic image therefore becomes a significant task before lesion classification, which in previous works was usually done with manual delineation of the tumor boundaries. In this study, a linear support vector machine (SVM) based algorithm is proposed for ultrasound breast image training and classification. A disk expansion algorithm is then applied to automatically detect lesion boundaries. A set of sub-images including smooth and irregular boundaries in tumor objects and those in speckle-noised background are trained by the SVM algorithm to produce an optimal classification function. Based on this classification model, each pixel within an ultrasound image is classified as either an object or a background pixel. This enhanced binary image can highlight the object and suppress the speckle noise, and it can be regarded as a degraded paint character (DPC) image containing closure noise, which is well known in the perceptual organization literature of psychology. An effective scheme for removing closure noise using an iterative disk expansion method has been successfully demonstrated in our previous works. The boundary detection of ultrasonic breast lesions can thus be treated as equivalent to the removal of speckle noise. By applying the disk expansion method to the binary image, we can obtain a radius-based image where the radius for each pixel represents the corresponding disk covering the specific object information. Finally, a signal transmission process is used to search for the complete breast lesion region, and thus the desired lesion boundary can be effectively and automatically determined. Our algorithm can be performed iteratively until all desired objects are detected. Simulations and clinical images were introduced to
NASA Astrophysics Data System (ADS)
Yan, Yiming; Zhang, Ye; Gao, Fengjiao
2012-12-01
This article proposes a 'dynamic' artificial bee colony (D-ABC) algorithm for solving optimization problems. It overcomes the poor performance of the artificial bee colony (ABC) algorithm when applied to multi-parameter optimization. A dynamic 'activity' factor is introduced into the D-ABC algorithm to speed up convergence and improve the quality of the solution. The D-ABC algorithm is employed for multi-parameter optimization of a support vector machine (SVM)-based soft-margin classifier. Parameter optimization is important for improving the classification performance of an SVM-based classifier. Classification accuracy is defined as the objective function, and the parameters, including the 'kernel parameter', 'cost factor', etc., form a solution vector to be optimized. Experiments demonstrate that the D-ABC algorithm performs better than traditional methods on this optimization problem, and that it obtains better SVM parameters, which lead to higher classification accuracy.
The Construction of Support Vector Machine Classifier Using the Firefly Algorithm
Chao, Chih-Feng; Horng, Ming-Huwi
2015-01-01
The setting of parameters in support vector machines (SVMs) is very important with regard to their accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM). This tool does not consider feature selection, because the SVM together with feature selection is not suitable for application in multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI), machine learning repository are used; additionally, the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM). The experimental results support the use of firefly-SVM in pattern classification for maximum accuracy. PMID:25802511
Phytoplankton global mapping from space with a support vector machine algorithm
NASA Astrophysics Data System (ADS)
de Boissieu, Florian; Menkes, Christophe; Dupouy, Cécile; Rodier, Martin; Bonnet, Sophie; Mangeas, Morgan; Frouin, Robert J.
2014-11-01
In recent years great progress has been made in global mapping of phytoplankton from space. Two main trends have emerged, the recognition of phytoplankton functional types (PFT) based on reflectance normalized to chlorophyll-a concentration, and the recognition of phytoplankton size class (PSC) based on the relationship between cell size and chlorophyll-a concentration. However, PFTs and PSCs are not decorrelated, and one approach can complement the other in a recognition task. In this paper, we explore the recognition of several dominant PFTs by combining reflectance anomalies, chlorophyll-a concentration and other environmental parameters, such as sea surface temperature and wind speed. Remote sensing pixels are labeled thanks to coincident in-situ pigment data from GeP&CO, NOMAD and MAREDAT datasets, covering various oceanographic environments. The recognition is made with a supervised Support Vector Machine classifier trained on the labeled pixels. This algorithm enables a non-linear separation of the classes in the input space and is especially adapted for small training datasets as available here. Moreover, it provides a class probability estimate, allowing one to enhance the robustness of the classification results through the choice of a minimum probability threshold. A greedy feature selection associated to a 10-fold cross-validation procedure is applied to select the most discriminative input features and evaluate the classification performance. The best classifiers are finally applied on daily remote sensing datasets (SeaWIFS, MODISA) and the resulting dominant PFT maps are compared with other studies. Several conclusions are drawn: (1) the feature selection highlights the weight of temperature, chlorophyll-a and wind speed variables in phytoplankton recognition; (2) the classifiers show good results and dominant PFT maps in agreement with phytoplankton distribution knowledge; (3) classification on MODISA data seems to perform better than on SeaWIFS data
Automated beam placement for breast radiotherapy using a support vector machine based algorithm
Zhao, Xuan; Kong, Dewen; Jozsef, Gabor; Chang, Jenghwa; Wong, Edward K.; Formenti, Silvia C.; Wang, Yao
2012-05-15
Purpose: To develop an automated beam placement technique for whole breast radiotherapy using tangential beams. We seek to find optimal parameters for tangential beams to cover the whole ipsilateral breast (WB) and minimize the dose to the organs at risk (OARs). Methods: A support vector machine (SVM) based method is proposed to determine the optimal posterior plane of the tangential beams. The relative significance of including or avoiding each volume of interest is incorporated into the cost function of the SVM. After finding the optimal 3-D plane that separates the WB and the included clinical target volumes (CTVs) from the OARs, the gantry angle, collimator angle, and posterior jaw size of the tangential beams are derived from the separating plane equation. Dosimetric measures of the treatment plans determined by the automated method are compared with those obtained by manual beam placement by the physicians. The method can be further extended to use multileaf collimator (MLC) blocking by optimizing posterior MLC positions. Results: The plans for 36 patients (23 prone- and 13 supine-treated) with left breast cancer were analyzed. Our algorithm reduced the volume of the heart that receives >500 cGy dose (V5) from 2.7 to 1.7 cm³ (p = 0.058) on average and the volume of the ipsilateral lung that receives >1000 cGy dose (V10) from 55.2 to 40.7 cm³ (p = 0.0013). The dose coverage as measured by the volume receiving >95% of the prescription dose (V95%) of the WB without a 5 mm superficial layer decreases by only 0.74% (p = 0.0002) and the V95% for the tumor bed with 1.5 cm margin remains unchanged. Conclusions: This study has demonstrated the feasibility of using an SVM-based algorithm to determine optimal beam placement without a physician's intervention. The proposed method reduced the dose to OARs, especially for supine-treated patients, without any relevant degradation of dose homogeneity and coverage in general.
A Nonlinear Adaptive Beamforming Algorithm Based on Least Squares Support Vector Regression
Wang, Lutao; Jin, Gang; Li, Zhengzhou; Xu, Hongbin
2012-01-01
To overcome the performance degradation in the presence of steering vector mismatches, strict restrictions on the number of available snapshots, and numerous interferences, a novel beamforming approach based on the nonlinear least-squares support vector regression machine (LS-SVR) is derived in this paper. In this approach, the conventional linearly constrained minimum variance cost function used by the minimum variance distortionless response (MVDR) beamformer is replaced by a squared-loss function to increase robustness in complex scenarios and provide additional control over the sidelobe level. Gaussian kernels are also used to obtain better generalization capacity. This novel approach has two highlights: one is a recursive regression procedure to estimate the weight vectors in real time, and the other is a sparse model with a novelty criterion to reduce the final size of the beamformer. The analysis and simulation tests show that the proposed approach offers better noise suppression capability and achieves near-optimal signal-to-interference-and-noise ratio (SINR) with a low computational burden, compared with other recently proposed robust beamforming techniques.
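With a squared loss, fitting a least-squares SVR reduces to solving one linear system rather than a quadratic program. As a rough illustration, the sketch below fits the standard batch LS-SVR with a Gaussian kernel on real-valued toy data; it is not the recursive, sparse beamformer derived in the paper, and the function names, the regularization value `gam`, and the kernel width `sigma` are assumptions chosen for the example.

```python
import math

def rbf(a, b, sigma):
    """Gaussian kernel between two points given as lists of floats."""
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)) / (2 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def lssvr_fit(X, y, gam=100.0, sigma=1.0):
    """Solve [[0, 1^T], [1, K + I/gam]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [rbf(X[i], X[j], sigma) + (1.0 / gam if i == j else 0.0)
                          for j in range(n)])
    solution = solve(A, [0.0] + list(y))
    return solution[0], solution[1:]  # bias b, dual weights alpha

def lssvr_predict(X, alpha, b, x, sigma=1.0):
    """Evaluate f(x) = b + sum_i alpha_i * k(x_i, x)."""
    return b + sum(a * rbf(xi, x, sigma) for a, xi in zip(alpha, X))

X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [0.0, 0.8, 0.9, 0.1, -0.8]
b, alpha = lssvr_fit(X, y)
```

Every training point receives a nonzero weight in this formulation, which is why LS-SVR models are dense and why a sparsification step such as the novelty criterion mentioned above is attractive.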
Ju, Zhe; Gu, Hong
2016-08-15
As an important post-translational modification of prokaryotic proteins, pupylation plays a key role in regulating various biological processes. The accurate identification of pupylation sites is crucial for understanding the underlying mechanisms of pupylation. Although several computational methods have been developed for the identification of pupylation sites, their prediction accuracy is still unsatisfactory. Here, a novel bioinformatics tool named IMP-PUP is proposed to improve the prediction of pupylation sites. IMP-PUP is constructed on the composition of k-spaced amino acid pairs and trained with a modified semi-supervised self-training support vector machine (SVM) algorithm. The proposed algorithm iteratively trains a series of support vector machine classifiers on both annotated and non-annotated pupylated proteins. Computational results show that IMP-PUP achieves areas under the receiver operating characteristic curve of 0.91, 0.73, and 0.75 on our training set, Tung's testing set, and our testing set, respectively, which are better than those of the different error costs SVM algorithm and the original self-training SVM algorithm. Independent tests also show that IMP-PUP significantly outperforms three other existing pupylation site predictors: GPS-PUP, iPUP, and pbPUP. Therefore, IMP-PUP can be a useful tool for accurate prediction of pupylation sites. A MATLAB software package for IMP-PUP is available at https://juzhe1120.github.io/. PMID:27197054
A neural support vector machine.
Jändel, Magnus
2010-06-01
Support vector machines are state-of-the-art pattern recognition algorithms that are well founded in optimization and generalization theory but not obviously applicable to the brain. This paper presents Bio-SVM, a biologically feasible support vector machine. An unstable associative memory oscillates between support vectors and interacts with a feed-forward classification pathway. Kernel neurons blend support vectors and sensory input. Downstream temporal integration generates the classification. Instant learning of surprising events and off-line tuning of support vector weights trains the system. Emotion-based learning, forgetting trivia, sleep and brain oscillations are phenomena that agree with the Bio-SVM model. A mapping to the olfactory system is suggested.
Avidan, Shai
2004-08-01
Support Vector Tracking (SVT) integrates the Support Vector Machine (SVM) classifier into an optic-flow-based tracker. Instead of minimizing an intensity difference function between successive frames, SVT maximizes the SVM classification score. To account for large motions between successive frames, we build pyramids from the support vectors and use a coarse-to-fine approach in the classification stage. We show results of using SVT for vehicle tracking in image sequences.
Saberkari, Hamidreza; Shamsi, Mousa; Joroughi, Mahsa; Golabi, Faegheh; Sedaaghi, Mohammad Hossein
2014-10-01
Microarray data play an important role in the identification and classification of cancer tissues. The small number of microarray samples available in cancer research is a persistent concern that causes problems in classifier design. For this reason, gene selection techniques should be applied as preprocessing before classification to remove noninformative genes from the microarray data. An appropriate gene selection method can significantly improve the performance of cancer classification. In this paper, we use selective independent component analysis (SICA) to reduce the dimension of microarray data. This selective algorithm solves the instability problem that occurs with conventional independent component analysis (ICA) methods. First, the reconstruction error is analyzed and a selective set is formed from the independent components of each gene that contribute little to the error when reconstructing a new sample. Then, several sub-classifiers of the modified support vector machine (υ-SVM) algorithm are trained simultaneously. Eventually, the sub-classifier with the highest recognition rate is selected. The proposed algorithm is applied to three cancer datasets (leukemia, breast cancer, and lung cancer), and its results are compared with other existing methods. The results illustrate that the proposed algorithm (SICA + υ-SVM) achieves higher classification accuracy and validity. In particular, it exhibits a relative improvement of 3.3% in correctness rate over the ICA + SVM and SVM algorithms on the lung cancer dataset.
Chen, Zhiru; Hong, Wenxue
2016-02-01
To address the low prediction accuracy on positive samples and the poor overall classification caused by unbalanced sample data of microRNA (miRNA) targets, we propose in this paper a support vector machine (SVM)-integration of under-sampling and weight (IUSM) algorithm, an under-sampling method based on ensemble learning. The algorithm adopts SVM as the learning algorithm and AdaBoost as the integration framework, and embeds clustering-based under-sampling into the iterative process, aiming to reduce the imbalance between positive and negative samples. Meanwhile, during adaptive adjustment of the sample weights, the SVM-IUSM algorithm eliminates abnormal negative samples using a robust sample-weight smoothing mechanism so as to avoid over-learning. Finally, the integrated miRNA target classifier makes its prediction by combining multiple weak classifiers through a voting mechanism. The experiments revealed that SVM-IUSW, compared with other algorithms on a collection of unbalanced datasets, could not only improve the accuracy on positive targets and the overall classification performance, but also enhance the generalization ability of the miRNA target classifier. PMID:27382743
Saberkari, Hamidreza; Shamsi, Mousa; Joroughi, Mahsa; Golabi, Faegheh; Sedaaghi, Mohammad Hossein
2014-01-01
Microarray data play an important role in the identification and classification of cancer tissues. The small number of microarray samples available in cancer research is a persistent concern that causes problems in classifier design. For this reason, gene selection techniques should be applied as preprocessing before classification to remove noninformative genes from the microarray data. An appropriate gene selection method can significantly improve the performance of cancer classification. In this paper, we use selective independent component analysis (SICA) to reduce the dimension of microarray data. This selective algorithm solves the instability problem that occurs with conventional independent component analysis (ICA) methods. First, the reconstruction error is analyzed and a selective set is formed from the independent components of each gene that contribute little to the error when reconstructing a new sample. Then, several sub-classifiers of the modified support vector machine (υ-SVM) algorithm are trained simultaneously. Eventually, the sub-classifier with the highest recognition rate is selected. The proposed algorithm is applied to three cancer datasets (leukemia, breast cancer, and lung cancer), and its results are compared with other existing methods. The results illustrate that the proposed algorithm (SICA + υ-SVM) achieves higher classification accuracy and validity. In particular, it exhibits a relative improvement of 3.3% in correctness rate over the ICA + SVM and SVM algorithms on the lung cancer dataset. PMID:25426433
Dai, Wensheng; Wu, Jui-Yu; Lu, Chi-Jie
2014-01-01
Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales, since an IT chain store has many branches. Integrating a feature extraction method with a prediction tool, such as support vector regression (SVR), is a useful way to construct an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique and has been widely applied to various forecasting problems. Up to now, however, only the basic ICA method (i.e., the temporal ICA model) has been applied to the sales forecasting problem. In this paper, we utilize three different ICA methods, namely spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA), to extract features from the sales data and compare their performance in sales forecasting for an IT chain store. Experimental results on real sales data show that the sales forecasting scheme integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data, and the extracted features can improve the prediction performance of SVR for sales forecasting.
Dai, Wensheng
2014-01-01
Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales, since an IT chain store has many branches. Integrating a feature extraction method with a prediction tool, such as support vector regression (SVR), is a useful way to construct an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique and has been widely applied to various forecasting problems. Up to now, however, only the basic ICA method (i.e., the temporal ICA model) has been applied to the sales forecasting problem. In this paper, we utilize three different ICA methods, namely spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA), to extract features from the sales data and compare their performance in sales forecasting for an IT chain store. Experimental results on real sales data show that the sales forecasting scheme integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data, and the extracted features can improve the prediction performance of SVR for sales forecasting. PMID:25165740
Shahrudin, Shahriza
2015-01-01
This study attempts to establish a new method for predicting antimicrobial peptides (AMPs), which are important to the immune system. Researchers have recently become interested in designing alternative drugs based on AMPs because a large number of bacterial strains have become resistant to available antibiotics. However, researchers have encountered obstacles in the AMP design process, as experiments to extract AMPs from protein sequences are costly and require a long set-up time. Therefore, a computational tool for AMP prediction is needed to resolve this problem. In this study, an integrated algorithm is introduced to predict AMPs by combining sequence alignment with a support vector machine- (SVM-) LZ complexity pairwise algorithm. When all sequences in the training set are used, the sensitivity of the proposed algorithm is 95.28% in the jackknife test and 87.59% in the independent test; when only the sequences that have less than 70% similarity are used, the sensitivity is 88.74% in the jackknife test and 78.70% in the independent test. Applying the proposed algorithm may allow researchers to effectively predict AMPs from unknown protein peptide sequences with higher sensitivity. PMID:25802839
Nakayama, Chikao; Fujiwara, Koichi; Matsuo, Masahiro; Kano, Manabu; Kadotani, Hiroshi
2015-08-01
Although sleep apnea syndrome (SAS) is a common sleep disorder, most patients with sleep apnea are undiagnosed and untreated because it is difficult for patients themselves to notice SAS in daily living. Polysomnography (PSG) is a gold standard test for sleep disorder diagnosis; however, PSG cannot be performed in many hospitals. This fact motivates us to develop an SAS screening system that can be used easily at home. The autonomic nervous function of a patient changes during apnea. Since changes in the autonomic nervous function affect the fluctuation of the R-R interval (RRI) of an electrocardiogram (ECG), called heart rate variability (HRV), SAS can be detected by monitoring HRV. The present work proposes a new HRV-based SAS screening algorithm that utilizes the support vector machine (SVM), a well-known pattern recognition method. In the proposed algorithm, various HRV features are derived from RRI data in both apnea and normal respiration periods of patients and healthy people, and an apnea/normal respiration (A/N) discriminant model is built from the derived HRV features by SVM. Applying the proposed SAS screening algorithm to clinical data demonstrates that it can discriminate appropriately between patients with sleep apnea and healthy people. The sensitivity and the specificity of the proposed algorithm were 100% and 86%, respectively.
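The sensitivity and specificity reported above are standard screening metrics. As a minimal sketch of how they are computed from binary predictions (the function name and the toy labels are hypothetical and are not data from the study; 1 stands for "patient", 0 for "healthy"):

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (hypothetical labels): 3 patients, 4 healthy subjects.
truth = [1, 1, 1, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 0, 0, 1]
print(sensitivity_specificity(truth, preds))  # (1.0, 0.75)
```

A sensitivity of 100% means no patient was missed, while a specificity below 100% means some healthy subjects were flagged, which is usually the acceptable direction of error for a screening tool.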
Ebtehaj, Isa; Bonakdari, Hossein
2016-01-01
Sediment transport without deposition is an essential consideration in the optimum design of sewer pipes. In this study, a novel method based on a combination of support vector regression (SVR) and the firefly algorithm (FFA) is proposed to predict the minimum velocity required to avoid sediment settling in pipe channels, which is expressed as the densimetric Froude number (Fr). The efficiency of support vector machine (SVM) models depends on the suitable selection of SVM parameters. In this study, FFA is used to determine these SVM parameters. The parameters that actually affect the Fr calculation are identified by dimensional analysis. The different dimensionless variables are introduced along with the models. The best performance is attributed to the model that employs the sediment volumetric concentration (C(V)), ratio of relative median diameter of particles to hydraulic radius (d/R), dimensionless particle number (D(gr)) and overall sediment friction factor (λ(s)) parameters to estimate Fr. The performance of the SVR-FFA model is compared with genetic programming, artificial neural network and existing regression-based equations. The results indicate the superior performance of SVR-FFA (mean absolute percentage error = 2.123%; root mean square error = 0.116) compared with other methods. PMID:27148727
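The two error measures quoted above, mean absolute percentage error (MAPE) and root mean square error (RMSE), can be sketched as follows. The function names and the sample values are illustrative assumptions only and are unrelated to the study's data.

```python
import math


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


def rmse(observed, predicted):
    """Root mean square error, in the same units as the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Toy values (hypothetical): two observed Fr values and two model estimates.
obs = [2.0, 4.0]
est = [1.0, 5.0]
print(mape(obs, est))  # 37.5
print(rmse(obs, est))  # 1.0
```

MAPE is scale-free and so is convenient for comparing models across datasets, whereas RMSE penalizes large individual errors more heavily.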
Yilmaz, Nihat; Inan, Onur
2013-01-01
This paper offers a hybrid approach that uses the artificial bee colony (ABC) algorithm for feature selection and support vector machines for classification. The purpose of this paper is to test how eliminating the unimportant and obsolete features of the datasets affects the success of classification with the SVM classifier. The approach is developed for the diagnosis of liver diseases and diabetes, which are commonly observed and reduce the quality of life. For the diagnosis of these diseases, the hepatitis, liver disorders and diabetes datasets from the UCI database were used, and the proposed system reached classification accuracies of 94.92%, 74.81%, and 79.29%, respectively. For these datasets, the classification accuracies were obtained using the 10-fold cross-validation method. The results show that the method performs very well compared to other reported results and seems very promising for pattern recognition applications. PMID:23983632
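The 10-fold cross-validation mentioned above partitions the samples into ten folds and uses each fold once as the test set. A minimal sketch of the index bookkeeping, assuming a simple shuffled (non-stratified) split; the helper name `k_fold_indices` and the seed are hypothetical, and the abstract does not say which splitting variant was used:

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Each sample appears in exactly one test fold across the k rounds.
for train, test in k_fold_indices(25, k=10):
    assert not set(train) & set(test)
```

The reported accuracy would then be the average of the per-fold test accuracies of a classifier trained on each `train` split.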
Wang, Wei; Chen, Xiyuan
2016-08-10
Modeling and compensation of temperature drift is an important method for improving the precision of fiber-optic gyroscopes (FOGs). In this paper, a new method of modeling and compensation for FOGs based on improved particle swarm optimization (PSO) and support vector machine (SVM) algorithms is proposed. The convergence speed and reliability of PSO are improved by introducing a dynamic inertia factor. The regression accuracy of SVM is improved by introducing a combined kernel function with four parameters and piecewise regression with fixed steps. The steps are as follows. First, the parameters of the combined kernel functions are optimized by the improved PSO algorithm. Second, the proposed kernel function of SVM is used to carry out piecewise regression, and the regression model is also obtained. Third, the temperature drift is compensated for by the regression data. The regression accuracy of the proposed method (in the case of mean square percentage error indicators) increased by 83.81% compared to the traditional SVM. PMID:27534465
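The abstract above does not specify the form of the dynamic inertia factor. As a hedged illustration only, the sketch below uses a linearly decreasing inertia weight, which is one common choice and is an assumption here, not the authors' method. The objective `sphere` is a hypothetical stand-in for the regression error that would be evaluated for a candidate set of kernel parameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 w_max=0.9, w_min=0.4, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over box `bounds` with PSO and a decreasing inertia weight."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best so far
    for t in range(n_iter):
        # Assumed schedule: inertia falls linearly from w_max to w_min.
        w = w_max - (w_max - w_min) * t / max(n_iter - 1, 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the updated position to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_val = pso_minimize(sphere, [(-5.0, 5.0)] * 2)
print(best, best_val)
```

A large inertia early on favors exploration of the parameter space, and a small inertia later favors refinement around the best solution found, which is the usual motivation for making the factor vary over iterations.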
NASA Astrophysics Data System (ADS)
Zhuo, Li; Zheng, Jing; Li, Xia; Wang, Fang; Ai, Bin; Qian, Junping
2008-10-01
The high-dimensional feature vectors of hyperspectral data often impose a high computational cost as well as the risk of overfitting when classification is performed. Therefore it is necessary to reduce the dimensionality through methods such as feature selection. Currently, there are two kinds of feature selection methods: filter methods and wrapper methods. The former kind requires no feedback from classifiers and estimates the classification performance indirectly. The latter kind evaluates the "goodness" of the selected feature subset directly based on the classification accuracy. Many experimental results have shown that wrapper methods can yield better performance, although they have the disadvantage of high computational cost. In this paper, we present a Genetic Algorithm (GA) based wrapper method for classification of hyperspectral data using the Support Vector Machine (SVM), a state-of-the-art classifier that has found success in a variety of areas. The GA, which seeks to solve optimization problems using the methods of evolution, specifically survival of the fittest, was used to optimize both the feature subset, i.e. the band subset, of the hyperspectral data and the SVM kernel parameters simultaneously. A special strategy was adopted in designing the feature subset part of the chromosome to reduce the computation cost caused by the high-dimensional feature vectors of hyperspectral data. The GA-SVM method was implemented in the ENVI/IDL language and was then tested on a HYPERION hyperspectral image. Comparison of the optimized and un-optimized results showed that the GA-SVM method could significantly reduce the computation cost while improving the classification accuracy. The number of bands used for classification was reduced from 198 to 13, while the classification accuracy increased from 88.81% to 92.51%. The optimized values of the two SVM kernel parameters were 95.0297 and 0.2021, respectively, which were different from the
GAPS IN SUPPORT VECTOR OPTIMIZATION
STEINWART, INGO; HUSH, DON; SCOVEL, CLINT; LIST, NICOLAS
2007-01-29
We show that the stopping criteria used in many support vector machine (SVM) algorithms working on the dual can be interpreted as primal optimality bounds which in turn are known to be important for the statistical analysis of SVMs. To this end we revisit the duality theory underlying the derivation of the dual and show that in many interesting cases primal optimality bounds are the same as known dual optimality bounds.
Mortaheb, Parinaz; Rezaeian, Mehdi
2016-01-01
Segmentation and three-dimensional (3D) visualization of teeth in dental computerized tomography (CT) images are required by dentists for both the diagnosis of abnormalities and treatments such as dental implant and orthodontic planning. However, dental CT image segmentation is a difficult process because of the specific characteristics of the tooth's structure. This paper presents a multi-step method for automatic segmentation of dental CT images, which starts with a preprocessing phase that reduces the metal artifact using the least squares support vector machine. The integral intensity profile is then applied to detect each tooth's region candidates. Finally, the mean shift algorithm is used to partition the region of each tooth, and all the segmented slices are then used for 3D visualization of the teeth. A set of reliable assessment metrics is used to examine the performance of the proposed approach. We applied the segmentation method to 14 cone-beam CT datasets. Functionality analysis of the proposed method demonstrated precise segmentation results on different sample slices. Accuracy analysis of the proposed method indicates that we can increase the sensitivity, specificity, precision, and accuracy of the segmentation results by 83.24%, 98.35%, 72.77%, and 97.62% and decrease the error rate by 2.34%. The experimental results show that the proposed approach performs well on different types of CT images and has better performance than all existing approaches. Moreover, segmentation results can be made more accurate by using the proposed metal artifact reduction algorithm in the preprocessing phase. PMID:27014607
2010-01-01
Background: Because a priori knowledge about the function of G protein-coupled receptors (GPCRs) can provide useful information for pharmaceutical research, determining their function is a meaningful topic in protein science. However, with the rapid increase of GPCR sequences entering databanks, the gap between the number of known sequences and the number of known functions is widening rapidly, and it is both time-consuming and expensive to determine their function based only on experimental techniques. Therefore, it is vitally important to develop a computational method for quick and accurate classification of GPCRs. Results: In this study, a novel three-layer predictor based on support vector machine (SVM) and feature selection is developed for predicting and classifying GPCRs directly from amino acid sequence data. The maximum relevance minimum redundancy (mRMR) method is applied to pre-evaluate features with discriminative information, while a genetic algorithm (GA) is utilized to find the optimized feature subsets. SVM is used for the construction of classification models. The overall accuracies of the three-layer predictor at the superfamily, family and subfamily levels are obtained by cross-validation tests on two non-redundant datasets. The results are about 0.5% to 16% higher than those of GPCR-CA and GPCRPred. Conclusion: The high success rates indicate that the proposed predictor is a useful automated tool for predicting GPCRs. GPCR-SVMFS, a corresponding executable program for GPCR prediction and classification, can be acquired freely on request from the authors. PMID:20550715
ERIC Educational Resources Information Center
Araya, Roberto; Plana, Francisco; Dartnell, Pablo; Soto-Andrade, Jorge; Luci, Gina; Salinas, Elena; Araya, Marylen
2012-01-01
Teacher practice is normally assessed by observers who watch classes or videos of classes. Here, we analyse an alternative strategy that uses text transcripts and a support vector machine classifier. For each one of the 710 videos of mathematics classes from the 2005 Chilean National Teacher Assessment Programme, a single 4-minute slice was…
Hlihor, Raluca Maria; Diaconu, Mariana; Leon, Florin; Curteanu, Silvia; Tavares, Teresa; Gavrilescu, Maria
2015-05-25
We investigated the bioremoval of Cd(II) in batch mode, using dead and living biomass of Trichoderma viride. Kinetic studies revealed three distinct stages of the biosorption process. The pseudo-second order model and the Langmuir model described the kinetics and equilibrium of the biosorption process well, with a determination coefficient R(2) > 0.99. The value of the mean free energy of adsorption, E, is less than 16 kJ/mol at 25 °C, suggesting that, at low temperature, the dominant process involved in Cd(II) biosorption by dead T. viride is chemical ion exchange. With the temperature increasing to 40-50 °C, E values are above 16 kJ/mol, showing that the particle diffusion mechanism could play an important role in Cd(II) biosorption. The studies on T. viride growth in Cd(II) solutions and its bioaccumulation performance showed that the living biomass was able to bioaccumulate 100% Cd(II) from a 50 mg/L solution at pH 6.0. The influence of pH, biomass dosage, metal concentration, contact time and temperature on the bioremoval efficiency was evaluated to further assess the biosorption capability of the dead biosorbent. These complex influences were correlated by means of a data-driven modeling procedure in which the principles of artificial intelligence were applied with the help of support vector machines (SVM) combined with genetic algorithms (GA). According to our data, the optimal working conditions for the removal of 98.91% Cd(II) by T. viride from an aqueous solution containing 26.11 mg/L Cd(II) were as follows: pH 6.0, contact time of 3833 min, 8 g/L biosorbent, temperature 46.5 °C. The complete characterization of bioremoval parameters indicates that T. viride is an excellent material to treat wastewater containing low concentrations of metal. PMID:25224921
Ocak, Hasan
2013-04-01
A new scheme is presented in this study for the evaluation of fetal well-being from cardiotocogram (CTG) recordings using support vector machines (SVM) and the genetic algorithm (GA). CTG recordings consist of fetal heart rate (FHR) and uterine contraction (UC) signals and are widely used by obstetricians for assessing fetal well-being. Features extracted from normal and pathological FHR and UC signals were used to construct an SVM based classifier. The GA was then used to find the optimal feature subset that maximizes the classification performance of the SVM based normal and pathological CTG classifier. An extensive clinical CTG dataset, classified by three expert obstetricians, was used to test the performance of the new scheme. It was demonstrated that the new scheme was able to predict the fetal state as normal or pathological with 99.3% and 100% accuracy, respectively. The results reveal that the GA can be used to determine the critical features to be used in evaluating fetal well-being and consequently increase the classification performance. When compared to widely used ANN and ANFIS based methods, the proposed scheme performed considerably better. PMID:23321973
NASA Astrophysics Data System (ADS)
Lenhardt, L.; Zeković, I.; Dramićanin, T.; Tešić, Ž.; Milojković-Opsenica, D.; Dramićanin, M. D.
2014-09-01
In recent years, the potential of Fourier-transform infrared spectroscopy coupled with different chemometric tools in food analysis has been established. This technique is rapid, low cost, and reliable and requires little sample preparation. In this work, 130 Serbian unifloral honey samples (linden, acacia, and sunflower types) were analyzed using attenuated total reflectance infrared spectroscopy (ATR-IR). For each spectrum, 64 scans were recorded at wavenumbers between 4000 and 500 cm⁻¹ and at a spectral resolution of 4 cm⁻¹. These spectra were analyzed using principal component analysis (PCA), and the calculated principal components were then used for support vector machine (SVM) training. In this way, a pattern-recognition tool is obtained for building a classification model for determining the botanical origin of honey. The PCA was used to analyze the results and to see whether a separation between groups of different types of honey exists. Using the SVM, the classification model was built and classification errors were acquired. It has been observed that this technique is adequate for determining the botanical origin of honey with a success rate of 98.6%. Based on these results, it can be concluded that this technique offers many possibilities for future rapid qualitative analysis of honey.
Si, Lei; Wang, Zhongbin; Liu, Xinhua; Tan, Chao; Liu, Ze; Xu, Jing
2016-01-01
Shearers play an important role at fully mechanized coal mining faces, and accurately identifying their cutting pattern is very helpful for improving the automation level of shearers and ensuring the safety of coal mining. The least squares support vector machine (LSSVM) has been proven to offer strong potential in prediction and classification problems, particularly when an appropriate meta-heuristic algorithm is employed to determine the values of its two parameters. However, these meta-heuristic algorithms have the drawbacks of being hard to understand and reaching the global optimal solution slowly. In this paper, an improved fly optimization algorithm (IFOA) is presented to optimize the parameters of LSSVM, and the LSSVM coupled with IFOA (IFOA-LSSVM) is used to identify the shearer cutting pattern. The vibration acceleration signals of five cutting patterns were collected and the special state features were extracted based on the ensemble empirical mode decomposition (EEMD) and the kernel function. Some examples of the IFOA-LSSVM model were further presented and the results were compared in detail with the LSSVM, PSO-LSSVM, GA-LSSVM and FOA-LSSVM models. The comparison results indicate that the proposed approach was feasible and efficient and outperformed the others. Finally, an industrial application example at the coal mining face was presented to demonstrate the effect of the proposed system. PMID:26771615
Yang, Qin; Zou, Hong-Yan; Zhang, Yan; Tang, Li-Juan; Shen, Guo-Li; Jiang, Jian-Hui; Yu, Ru-Qin
2016-01-15
Most proteins localize to more than one organelle in a cell. Unmixing the localization patterns of proteins is critical for understanding protein functions and other vital cellular processes. Herein, a non-linear machine learning technique is proposed for the first time for protein pattern unmixing. The variable-weighted support vector machine (VW-SVM) is a demonstrated robust modeling technique with flexible and rational variable selection. Optimization by a global stochastic optimization technique, the particle swarm optimization (PSO) algorithm, makes VW-SVM an adaptive parameter-free method for automated unmixing of protein subcellular patterns. Results obtained by pattern unmixing of a set of fluorescence microscope images of cells indicate that VW-SVM as optimized by PSO is able to extract useful pattern features by optimally rescaling each variable for non-linear SVM modeling, consequently leading to improved performance in multiplex protein pattern unmixing compared with conventional SVM and other existing pattern unmixing methods.
Progressive Classification Using Support Vector Machines
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri; Kocurek, Michael
2009-01-01
An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) a slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified. The user
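The two-pass scheme described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the NASA implementation: the two "models" are hypothetical one-dimensional threshold classifiers standing in for the fast and slow SVMs, the confidence index is assumed to be the distance from the fast model's boundary, and `budget` is an assumed stand-in for the resource constraint.

```python
from typing import Callable, List, Sequence, Tuple


def progressive_classify(
    points: Sequence[float],
    fast: Callable[[float], Tuple[int, float]],
    slow: Callable[[float], int],
    budget: int,
) -> List[int]:
    """Label every point with `fast`, then refine the least confident with `slow`.

    `fast` returns (label, confidence). `budget` caps how many points the slow
    model may reclassify (hypothetical stand-in for a resource constraint).
    """
    first_pass = [fast(p) for p in points]
    labels = [label for label, _ in first_pass]
    # Lowest confidence first: these points are the most likely to be wrong.
    order = sorted(range(len(points)), key=lambda i: first_pass[i][1])
    for i in order[:budget]:
        labels[i] = slow(points[i])
    return labels


# Hypothetical stand-ins for the two SVMs: the accurate boundary is x = 0.5,
# while the fast model uses a slightly misplaced boundary at x = 0.4.
def fast_model(x: float) -> Tuple[int, float]:
    return (1 if x > 0.4 else -1), abs(x - 0.4)


def slow_model(x: float) -> int:
    return 1 if x > 0.5 else -1


data = [0.1, 0.45, 0.9, 0.38, 0.55]
coarse = progressive_classify(data, fast_model, slow_model, budget=0)
refined = progressive_classify(data, fast_model, slow_model, budget=2)
```

With a budget of two, only the two points nearest the fast model's boundary are passed to the slow model, which is enough here to correct the one point the fast model mislabels.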
Automated image segmentation using support vector machines
NASA Astrophysics Data System (ADS)
Powell, Stephanie; Magnotta, Vincent A.; Andreasen, Nancy C.
2007-03-01
Neurodegenerative and neurodevelopmental diseases demonstrate problems associated with brain maturation and aging. Automated methods to delineate brain structures of interest are required to analyze large amounts of imaging data like that being collected in several ongoing multi-center studies. We have previously reported on using artificial neural networks (ANN) to define subcortical brain structures including the thalamus (0.88), caudate (0.85) and the putamen (0.81). In this work, a priori probability information was generated using Thirion's demons registration algorithm. The input vector consisted of a priori probability, spherical coordinates, and an iris of surrounding signal intensity values. We have applied the support vector machine (SVM) machine learning algorithm to automatically segment subcortical and cerebellar regions using the same input vector information. SVM architecture was derived from the ANN framework. Training was completed using a radial-basis function kernel with gamma equal to 5.5. Training was performed using 15,000 vectors collected from 15 training images in approximately 10 minutes. The resulting support vectors were applied to delineate 10 images not part of the training set. Relative overlap calculated for the subcortical structures was 0.87 for the thalamus, 0.84 for the caudate, 0.84 for the putamen, and 0.72 for the hippocampus. Relative overlap for the cerebellar lobes ranged from 0.76 to 0.86. The reliability of the SVM based algorithm was similar to the inter-rater reliability between manual raters and can be achieved without rater intervention.
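For readers unfamiliar with the kernel named in this abstract, the following sketch shows a radial-basis function kernel with gamma = 5.5. The abstract does not say which parameterization was used; the form K(x, y) = exp(-gamma * ||x - y||^2), common in SVM libraries, is assumed here, and the input vectors are made up for illustration.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 5.5) -> float:
    """Assumed parameterization: K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Made-up two-dimensional feature vectors, for illustration only.
same = rbf_kernel([0.2, 0.4], [0.2, 0.4])  # identical vectors give 1.0
near = rbf_kernel([0.2, 0.4], [0.3, 0.4])  # small distance, value close to 1
far = rbf_kernel([0.2, 0.4], [1.2, 0.4])   # large distance, value close to 0
```

Under this parameterization a larger gamma makes the kernel fall off faster with distance, so each support vector influences a smaller neighbourhood.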
Support vector machine (SVM) was applied for land-cover characterization using MODIS time-series data. Classification performance was examined with respect to training sample size, sample variability, and landscape homogeneity (purity). The results were compared to two convention...
Support Vector Machine-Based Endmember Extraction
Filippi, Anthony M; Archibald, Richard K
2009-01-01
Introduced in this paper is the utilization of Support Vector Machines (SVMs) to automatically perform endmember extraction from hyperspectral data. The strengths of SVM are exploited to provide a fast and accurate calculated representation of high-dimensional data sets that may consist of multiple distributions. Once this representation is computed, the number of distributions can be determined without prior knowledge. For each distribution, an optimal transform can be determined that preserves informational content while reducing the data dimensionality, and hence, the computational cost. Finally, endmember extraction for the whole data set is accomplished. Results indicate that this Support Vector Machine-Based Endmember Extraction (SVM-BEE) algorithm has the capability of autonomously determining endmembers from multiple clusters with computational speed and accuracy, while maintaining a robust tolerance to noise.
Ghaedi, M; Dashtian, K; Ghaedi, A M; Dehghanian, N
2016-05-11
The aim of this work is the study of the predictive ability of a hybrid model of support vector regression with genetic algorithm optimization (GA-SVR) for the adsorption of malachite green (MG) onto multi-walled carbon nanotubes (MWCNTs). Various factors were investigated by central composite design and the optimum conditions were set as: pH 8, 0.018 g MWCNTs, 8 mg L⁻¹ dye mixed with 50 mL solution thoroughly for 10 min. The Langmuir, Freundlich, Temkin and D-R isothermal models were applied to fit the experimental data, and the data were well explained by the Langmuir model with a maximum adsorption capacity of 62.11-80.64 mg g⁻¹ in a short time at 25 °C. Kinetic studies at various adsorbent dosages and initial MG concentrations show that maximum MG removal was achieved within 10 min of the start of every experiment under most conditions. The adsorption obeys the pseudo-second-order rate equation in addition to the intraparticle diffusion model. The optimal parameters (C of 0.2509, σ² of 0.1288 and ε of 0.2018) for the SVR model were obtained based on the GA. For the testing data set, an MSE value of 0.0034 and a coefficient of determination (R²) of 0.9195 were achieved. PMID:27119755
Duan, Li; Guo, Long; Liu, Ke; Liu, E-Hu; Li, Ping
2014-04-25
Citrus herbs have been widely used in traditional medicine and cuisine in China and other countries since ancient times. However, the authentication and quality control of Citrus herbs has always been a challenging task due to their similar morphological characteristics and the diversity of the components present in the complicated matrix. In the present investigation, we developed a novel strategy to characterize and classify seven Citrus herbs based on chromatographic analysis and chemometric methods. Firstly, the chemical constituents in seven Citrus herbs were globally characterized by liquid chromatography combined with quadrupole time-of-flight mass spectrometry (LC-QTOF-MS). Based on their retention time, UV spectra and MS fragmentation behavior, a total of 75 compounds were identified or tentatively characterized in these herbal medicines. Secondly, a segmental monitoring method based on LC-variable wavelength detection was developed for simultaneous quantification of ten marker compounds in these Citrus herbs. Thirdly, based on the contents of the ten analytes, genetic algorithm optimized support vector machines (GA-SVM) were employed to differentiate and classify the 64 samples covering these seven herbs. The obtained classifier showed good prediction performance and the overall prediction accuracy reached 96.88%. The proposed strategy is expected to provide new insight for authentication and quality control of traditional herbs.
Vectorized algorithms for spiking neural network simulation.
Brette, Romain; Goodman, Dan F M
2011-06-01
High-level languages (Matlab, Python) are popular in neuroscience because they are flexible and accelerate development. However, for simulating spiking neural networks, the cost of interpretation is a bottleneck. We describe a set of algorithms to simulate large spiking neural networks efficiently with high-level languages using vector-based operations. These algorithms constitute the core of Brian, a spiking neural network simulator written in the Python language. Vectorized simulation makes it possible to combine the flexibility of high-level languages with the computational efficiency usually associated with compiled languages. PMID:21395437
Vector processor algorithms for transonic flow calculations
NASA Technical Reports Server (NTRS)
South, J. C., Jr.; Keller, J. D.; Hafez, M. M.
1979-01-01
This paper discusses a number of algorithms for solving the transonic full-potential equation in conservative form on a vector computer, such as the CDC STAR-100 or the CRAY-1. Recent research with the 'artificial density' method for transonics has led to development of some new iteration schemes which take advantage of vector-computer architecture without suffering significant loss of convergence rate. Several of these more promising schemes are described and 2-D and 3-D results are shown comparing the computational rates on the STAR and CRAY vector computers, and the CYBER-175 serial computer. Schemes included are: (1) Checkerboard SOR, (2) Checkerboard Leapfrog, (3) odd-even vertical line SOR, and (4) odd-even horizontal line SOR.
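The first scheme listed, checkerboard (red-black) SOR, can be illustrated on a much simpler model problem. The sketch below solves the Laplace equation on a small square grid rather than the transonic full-potential equation treated in the paper; the grid size, relaxation factor and boundary values are assumptions chosen only for illustration.

```python
from typing import List


def checkerboard_sor(u: List[List[float]], omega: float = 1.5,
                     sweeps: int = 200) -> List[List[float]]:
    """Red-black SOR for the 5-point Laplace stencil; boundary values stay fixed."""
    n, m = len(u), len(u[0])
    for _ in range(sweeps):
        for colour in (0, 1):  # update one colour of the checkerboard at a time
            for i in range(1, n - 1):
                for j in range(1, m - 1):
                    if (i + j) % 2 != colour:
                        continue
                    # All four neighbours have the opposite colour, so updates
                    # within one colour do not depend on each other.
                    avg = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                  + u[i][j - 1] + u[i][j + 1])
                    u[i][j] += omega * (avg - u[i][j])
    return u


n = 9
grid = [[0.0] * n for _ in range(n)]
for j in range(n):
    grid[0][j] = 1.0  # one edge held at 1, the other three at 0
solution = checkerboard_sor(grid)
centre = solution[n // 2][n // 2]
```

Because every point of one colour reads only points of the other colour, each half-sweep could be written as a single array operation, which is the property that makes this ordering attractive on vector hardware. The nested loops above are kept scalar only for readability.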
Automated Vectorization of Decision-Based Algorithms
NASA Technical Reports Server (NTRS)
James, Mark
2006-01-01
Virtually all existing vectorization algorithms are designed to only analyze the numeric properties of an algorithm and distribute those elements across multiple processors. This software advances the state of the practice because it is the only known system, at the time of this reporting, that takes high-level statements, analyzes them for their decision properties, and converts them to a form that allows them to be executed automatically in parallel. The software takes a high-level source program that describes a complex decision-based condition and rewrites it as a disjunctive set of component Boolean relations that can then be executed in parallel. This is important because parallel architectures are becoming more commonplace in conventional systems and they have always been present in NASA flight systems. This technology allows one to take existing condition-based code and automatically vectorize it so it naturally decomposes across parallel architectures.
Distributed semi-supervised support vector machines.
Scardapane, Simone; Fierimonte, Roberto; Di Lorenzo, Paolo; Panella, Massimo; Uncini, Aurelio
2016-08-01
The semi-supervised support vector machine (S³VM) is a well-known algorithm for performing semi-supervised inference under the large margin principle. In this paper, we are interested in the problem of training an S³VM when the labeled and unlabeled samples are distributed over a network of interconnected agents. In particular, the aim is to design a distributed training protocol over networks, where communication is restricted only to neighboring agents and no coordinating authority is present. Using a standard relaxation of the original S³VM, we formulate the training problem as the distributed minimization of a non-convex social cost function. To find a (stationary) solution in a distributed manner, we employ two different strategies: (i) a distributed gradient descent algorithm; (ii) a recently developed framework for In-Network Nonconvex Optimization (NEXT), which is based on successive convexifications of the original problem, interleaved by state diffusion steps. Our experimental results show that the proposed distributed algorithms have comparable performance with respect to a centralized implementation, while highlighting the pros and cons of the proposed solutions. To date, this is the first work that paves the way toward the broad field of distributed semi-supervised learning over networks. PMID:27179615
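The first strategy, distributed gradient descent with neighbour-only communication, can be sketched on a toy problem. This is not the authors' S³VM objective: each agent here holds a simple convex quadratic cost 0.5 * (w - a_i)^2, the network is assumed to be a ring, and the mixing weights and step-size schedule are illustrative choices.

```python
from typing import List


def ring_dgd(targets: List[float], steps: int = 2000) -> List[float]:
    """Distributed gradient descent on a ring of agents.

    Agent i holds the private cost 0.5 * (w - targets[i]) ** 2 and exchanges
    estimates only with its two ring neighbours; there is no coordinator.
    """
    n = len(targets)
    w = [0.0] * n
    for k in range(steps):
        alpha = 1.0 / (k + 1)  # diminishing step size (assumed schedule)
        new_w = []
        for i in range(n):
            left, right = w[(i - 1) % n], w[(i + 1) % n]
            mixed = 0.5 * w[i] + 0.25 * left + 0.25 * right  # diffusion step
            grad = w[i] - targets[i]                         # local gradient
            new_w.append(mixed - alpha * grad)
        w = new_w
    return w


estimates = ring_dgd([1.0, 2.0, 3.0, 6.0])
```

The sum of the four local costs is minimized at the mean of the targets (3.0), and every agent's estimate approaches that value even though no agent ever sees the other agents' data.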
When do support vector machines work fast?
Steinwart, I.; Scovel, James C.
2004-01-01
The authors establish learning rates to the Bayes risk for support vector machines (SVMs) with hinge loss. Since a theorem of Devroye states that no learning algorithm can learn with a uniform rate to the Bayes risk for all probability distributions, they have to restrict the class of considered distributions: in order to obtain fast rates they assume a noise condition recently proposed by Tsybakov and an approximation condition in terms of the distribution and the reproducing kernel Hilbert space used by the SVM. For Gaussian RBF kernels with varying widths they propose a geometric noise assumption on the distribution which ensures the approximation condition. This geometric assumption is not in terms of smoothness but describes the concentration of the marginal distribution near the decision boundary. In particular they are able to describe nontrivial classes of distributions for which SVMs using a Gaussian kernel can learn with almost linear rate.
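The abstract does not write out the quantities involved. As a reminder, the standard definitions are given below; the symbols L, R and f are notation chosen here, not taken from the paper.

```latex
% Hinge loss for a label y in {-1, +1} and a real-valued decision function f:
L\bigl(y, f(x)\bigr) = \max\bigl\{0,\; 1 - y\,f(x)\bigr\}

% Classification risk of f and the Bayes risk (infimum over measurable f):
R(f) = P\bigl(\operatorname{sign} f(X) \neq Y\bigr),
\qquad
R^{*} = \inf_{f} R(f)
```

A learning rate to the Bayes risk then describes how quickly R(f_n) - R^* shrinks as the number n of training samples grows, where f_n is the decision function learned from n samples.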
Classification SAR targets with support vector machine
NASA Astrophysics Data System (ADS)
Cao, Lanying
2007-02-01
With the development of Synthetic Aperture Radar (SAR) technology, automatic target recognition (ATR) is becoming increasingly important. In this paper, we propose a 3-class target classification system for SAR images. The system is based on invariant wavelet moments and the support vector machine (SVM) algorithm. It is a two-stage approach. The first stage is to extract and select a small set of wavelet invariant moment features to represent target images. The wavelet invariant moments combine the advantages of the inherent multi-resolution analysis property of wavelets and the invariance of moment invariants to translation, scaling and rotation. The second stage is classification of targets with the SVM algorithm. SVM is based on the principle of structural risk minimization (SRM), which has been shown to be better than the principle of empirical risk minimization (ERM) used by many conventional networks. To test the performance and efficiency of the proposed method, we performed experiments on invariant wavelet moments, different kernel functions, 2-class identification, and 3-class identification. Test results show that wavelet invariant moments represent the target effectively, that the linear kernel function achieves better results than other kernel functions, and that the SVM classification approach performs better than the conventional nearest distance approach.
An affine projection algorithm using grouping selection of input vectors
NASA Astrophysics Data System (ADS)
Shin, JaeWook; Kong, NamWoong; Park, PooGyeon
2011-10-01
This paper presents an affine projection algorithm (APA) using grouping selection of input vectors. To improve the performance of the conventional APA, the proposed algorithm adjusts the number of input vectors using two procedures: a grouping procedure and a selection procedure. In the grouping procedure, input vectors that carry overlapping information for the update are grouped using the normalized inner product. Then, in the selection procedure, a few input vectors that carry enough information for the coefficient update are selected using the steady-state mean square error (MSE). Finally, the filter coefficients are updated using the selected input vectors. The experimental results show that the proposed algorithm has smaller steady-state estimation errors than the existing algorithms.
Support vector machines in analysis of top quark production
NASA Astrophysics Data System (ADS)
Vaiciulis, A.
2003-04-01
The Support Vector Machine (SVM) learning algorithm is a new alternative to multivariate methods such as neural networks. Potential applications of SVMs in high energy physics include the common classification problem of signal/background discrimination as well as particle identification. A comparison of a conventional method and an SVM algorithm is presented here for the case of identifying top quark events in Run II physics at the CDF experiment.
Support Vector Machines: Relevance Feedback and Information Retrieval.
ERIC Educational Resources Information Center
Drucker, Harris; Shahrary, Behzad; Gibbon, David C.
2002-01-01
Compares support vector machines (SVMs) to Rocchio, Ide regular and Ide dec-hi algorithms in information retrieval (IR) of text documents using relevancy feedback. If the preliminary search is so poor that one has to search through many documents to find at least one relevant document, then SVM is preferred. Includes nine tables. (Contains 24…
Quantum Support Vector Machine for Big Data Classification
NASA Astrophysics Data System (ADS)
Rebentrost, Patrick; Mohseni, Masoud; Lloyd, Seth
2014-09-01
Supervised machine learning is the classification of new data based on already classified training examples. In this work, we show that the support vector machine, an optimized binary classifier, can be implemented on a quantum computer, with complexity logarithmic in the size of the vectors and the number of training examples. In cases where classical sampling algorithms require polynomial time, an exponential speedup is obtained. At the core of this quantum big data algorithm is a nonsparse matrix exponentiation technique for efficiently performing a matrix inversion of the training data inner-product (kernel) matrix.
TWSVR: Regression via Twin Support Vector Machine.
Khemchandani, Reshma; Goyal, Keshav; Chandra, Suresh
2016-02-01
Taking motivation from the Twin Support Vector Machine (TWSVM) formulation, Peng (2010) attempted to propose Twin Support Vector Regression (TSVR), where the regressor is obtained by solving a pair of quadratic programming problems (QPPs). In this paper we argue that the TSVR formulation is not in the true spirit of TWSVM. Further, taking motivation from Bi and Bennett (2003), we propose an alternative approach to find a formulation for Twin Support Vector Regression (TWSVR) which is in the true spirit of TWSVM. We show that our proposed TWSVR can be derived from TWSVM for an appropriately constructed classification problem. To check the efficacy of our proposed TWSVR we compare its performance with TSVR and classical Support Vector Regression (SVR) on various regression datasets. PMID:26624223
Simplified space vector PWM algorithm for five-level inverter
NASA Astrophysics Data System (ADS)
Lalili, D.; Berkouk, E. M.; Boudjema, F.; Lourci, N.; Taleb, T.; Petzold, J.
2007-12-01
In this work, we present an algorithm for space vector pulse width modulation (SVPWM) applied to a five-level diode-clamped inverter. In this algorithm, the space vector diagram of the five-level inverter is decomposed into six space vector diagrams of three-level inverters. In turn, each of these six three-level space vector diagrams is decomposed into six space vector diagrams of two-level inverters. This idea allows us to generalize the two-level SVPWM algorithm to the case of the five-level inverter.
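The decomposition above reduces the five-level problem to the two-level SVPWM calculation, which the abstract does not spell out. The sketch below shows only that two-level base case, using the textbook dwell-time formulas as an assumption. The function name, argument names and numerical values are hypothetical, and the mapping from five-level to two-level diagrams is not reproduced.

```python
import math
from typing import Tuple


def svpwm_dwell_times(v_ref: float, angle: float, v_dc: float,
                      t_s: float) -> Tuple[int, float, float, float]:
    """Two-level SVPWM: sector (1..6) and dwell times (t1, t2, t0).

    v_ref: reference vector magnitude, angle: its angle in radians,
    v_dc: DC-link voltage, t_s: switching period.
    """
    angle = angle % (2 * math.pi)
    sector = min(int(angle // (math.pi / 3)), 5) + 1
    theta = angle - (sector - 1) * math.pi / 3   # angle inside the sector
    k = math.sqrt(3) * t_s * v_ref / v_dc
    t1 = k * math.sin(math.pi / 3 - theta)       # first adjacent active vector
    t2 = k * math.sin(theta)                     # second adjacent active vector
    t0 = t_s - t1 - t2                           # remaining time on zero vectors
    return sector, t1, t2, t0


# Illustrative values only: 600 V DC link, 300 V reference at 30 degrees.
period = 1e-4
sector, t1, t2, t0 = svpwm_dwell_times(v_ref=300.0, angle=math.pi / 6,
                                       v_dc=600.0, t_s=period)
```

The formulas are valid in the linear modulation range, where the reference magnitude does not exceed v_dc divided by the square root of 3, so that t0 stays non-negative.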
Incremental learning for ν-Support Vector Regression.
Gu, Bin; Sheng, Victor S; Wang, Zhijie; Ho, Derek; Osman, Said; Li, Shuo
2015-07-01
The ν-Support Vector Regression (ν-SVR) is an effective regression learning algorithm, which has the advantage of using a parameter ν to control the number of support vectors and adjust the width of the tube automatically. However, compared to ν-Support Vector Classification (ν-SVC) (Schölkopf et al., 2000), ν-SVR introduces an additional linear term into its objective function. Thus, directly applying the accurate on-line ν-SVC algorithm (AONSVM) to ν-SVR will not generate an effective initial solution. This is the main challenge in designing an incremental ν-SVR learning algorithm. To overcome this challenge, we propose a special procedure called initial adjustments in this paper. This procedure adjusts the weights of ν-SVC based on the Karush-Kuhn-Tucker (KKT) conditions to prepare an initial solution for the incremental learning. Combining the initial adjustments with the two steps of AONSVM produces an exact and effective incremental ν-SVR learning algorithm (INSVR). Theoretical analysis has proven the existence of the three key inverse matrices, which are the cornerstones of the three steps of INSVR (including the initial adjustments), respectively. The experiments on benchmark datasets demonstrate that INSVR can avoid the infeasible updating paths as far as possible, and successfully converges to the optimal solution. The results also show that INSVR is faster than batch ν-SVR algorithms with both cold and warm starts.
Optimized support vector regression for drilling rate of penetration estimation
NASA Astrophysics Data System (ADS)
Bodaghi, Asadollah; Ansari, Hamid Reza; Gholami, Mahsa
2015-12-01
In the petroleum industry, drilling optimization involves the selection of operating conditions for achieving the desired depth with the minimum expenditure while requirements of personal safety, environment protection, adequate information of penetrated formations and productivity are fulfilled. Since drilling optimization is highly dependent on the rate of penetration (ROP), estimation of this parameter is of great importance during well planning. In this research, a novel approach called 'optimized support vector regression' is employed for making a formulation between input variables and ROP. The algorithms used for optimizing the support vector regression are the genetic algorithm (GA) and the cuckoo search algorithm (CS). Optimization improved the support vector regression performance by selecting proper values for its parameters. In order to evaluate the ability of the optimization algorithms to enhance SVR performance, their results were compared to the hybrid of pattern search and grid search (HPG), which is conventionally employed for optimizing SVR. The results demonstrated that the CS algorithm achieved a greater improvement in the prediction accuracy of SVR than both the GA and HPG. Moreover, the predictive model derived from a back propagation neural network (BPNN), which is the traditional approach for estimating ROP, was selected for comparison with CSSVR. The comparative results revealed the superiority of CSSVR. This study inferred that CSSVR is a viable option for precise estimation of ROP.
Support Vector Machine with Ensemble Tree Kernel for Relation Extraction.
Liu, Xiaoyong; Fu, Hui; Du, Zhiguo
2016-01-01
Relation extraction is one of the important research topics in the field of information extraction research. To solve the problem of semantic variation in traditional semisupervised relation extraction algorithms, this paper proposes a novel semisupervised relation extraction algorithm based on ensemble learning (LXRE). The new algorithm mainly uses two kinds of support vector machine classifiers based on tree kernel for integration and integrates the strategy of constrained extension seed set. The new algorithm can reduce the inaccuracy of relation extraction that is caused by the phenomenon of semantic variation. The numerical experimental research based on two benchmark data sets (PropBank and AIMed) shows that the LXRE algorithm proposed in the paper is superior to two other common relation extraction methods in four evaluation indexes (Precision, Recall, F-measure, and Accuracy). It indicates that the new algorithm has good relation extraction ability compared with others. PMID:27118966
Support Vector Machine Implementations for Classification & Clustering
Winters-Hilt, Stephen; Yelundur, Anil; McChesney, Charlie; Landry, Matthew
2006-01-01
Background: We describe Support Vector Machine (SVM) applications to classification and clustering of channel current data. SVMs are variational-calculus based methods that are constrained to have structural risk minimization (SRM), i.e., they provide noise tolerant solutions for pattern recognition. The SVM approach encapsulates a significant amount of model-fitting information in the choice of its kernel. In work thus far, novel, information-theoretic, kernels have been successfully employed for notably better performance over standard kernels. Currently there are two approaches for implementing multiclass SVMs. One is called external multi-class that arranges several binary classifiers as a decision tree such that they perform a single-class decision making function, with each leaf corresponding to a unique class. The second approach, namely internal-multiclass, involves solving a single optimization problem corresponding to the entire data set (with multiple hyperplanes). Results: Each SVM approach encapsulates a significant amount of model-fitting information in its choice of kernel. In work thus far, novel, information-theoretic, kernels were successfully employed for notably better performance over standard kernels. Two SVM approaches to multiclass discrimination are described: (1) internal multiclass (with a single optimization), and (2) external multiclass (using an optimized decision tree). We describe benefits of the internal-SVM approach, along with further refinements to the internal-multiclass SVM algorithms that offer significant improvement in training time without sacrificing accuracy. In situations where the data isn't clearly separable, making for poor discrimination, signal clustering is used to provide robust and useful information; to this end, novel, SVM-based clustering methods are also described. As with the classification, there are Internal and External SVM Clustering algorithms, both of which are briefly described. PMID:17118147
Successive overrelaxation for laplacian support vector machine.
Qi, Zhiquan; Tian, Yingjie; Shi, Yong
2015-04-01
The semisupervised learning (SSL) problem, which makes use of both a large amount of cheap unlabeled data and a small amount of labeled data for training, has attracted considerable attention in machine learning and data mining in the last few years. Exploiting manifold regularization (MR), Belkin et al. proposed a new semisupervised classification algorithm, Laplacian support vector machines (LapSVMs), and showed state-of-the-art performance in the SSL field. To further improve LapSVMs, we propose a fast Laplacian SVM (FLapSVM) solver for classification. Compared with the standard LapSVM, our method has several advantages: 1) FLapSVM does not need to deal with the extra matrix or bear the computational burden related to variable switching, which makes it more suitable for large-scale problems; 2) FLapSVM's dual problem has the same elegant formulation as that of standard SVMs, which means that the kernel trick can be applied directly to the optimization model; and 3) FLapSVM can be effectively solved by the successive overrelaxation technique, which converges linearly to a solution and can process very large data sets that need not reside in memory. In practice, by combining random scheduling of subproblems with two stopping conditions, FLapSVM is strictly faster than LapSVM and is a valid alternative to PLapSVM. PMID:25961091
Surrogate-based Reliability Analysis Using Support Vector Machine
NASA Astrophysics Data System (ADS)
Li, Gang; Liu, Zhiqiang
2010-05-01
An approach of surrogate-based reliability analysis by support vector machine with Monte-Carlo simulation is proposed. The efficient sampling techniques, such as uniform design and Latin Hypercube sampling, are used, and the SVM is trained with the sample pairs of input and output data obtained by the finite element analysis. The trained SVM model, as a solver-surrogate model, is intended to approximate the real performance function. Considering the selection of parameters for SVM affects the learning performance of SVM strongly, the Genetic Algorithm (GA) is integrated to the construction of the SVM, by optimizing the relevant parameters. The influence of the parameters on SVM is discussed and a methodology is proposed for selecting the SVM model. Support Vector Classification (SVC) based and Support Vector Regression (SVR) based reliability analyses are studied. Some numerical examples demonstrate the efficiency and applicability of the proposed method.
Support vector machines for classification and regression.
Brereton, Richard G; Lloyd, Gavin R
2010-02-01
The increasing interest in Support Vector Machines (SVMs) over the past 15 years is described. Methods are illustrated using simulated case studies, and 4 experimental case studies, namely mass spectrometry for studying pollution, near infrared analysis of food, thermal analysis of polymers and UV/visible spectroscopy of polyaromatic hydrocarbons. The basis of SVMs as two-class classifiers is shown with extensive visualisation, including learning machines, kernels and penalty functions. The influence of the penalty error and radial basis function radius on the model is illustrated. Multiclass implementations including one vs. all, one vs. one, fuzzy rules and Directed Acyclic Graph (DAG) trees are described. One-class Support Vector Domain Description (SVDD) is described and contrasted to conventional two- or multi-class classifiers. The use of Support Vector Regression (SVR) is illustrated including its application to multivariate calibration, and why it is useful when there are outliers and non-linearities.
Thrust vector control algorithm design for the Cassini spacecraft
NASA Technical Reports Server (NTRS)
Enright, Paul J.
1993-01-01
This paper describes a preliminary design of the thrust vector control algorithm for the interplanetary spacecraft, Cassini. Topics of discussion include flight software architecture, modeling of sensors, actuators, and vehicle dynamics, and controller design and analysis via classical methods. Special attention is paid to potential interactions with structural flexibilities and propellant dynamics. Controller performance is evaluated in a simulation environment built around a multi-body dynamics model, which contains nonlinear models of the relevant hardware and preliminary versions of supporting attitude determination and control functions.
Multiclass Reduced-Set Support Vector Machines
NASA Technical Reports Server (NTRS)
Tang, Benyang; Mazzoni, Dominic
2006-01-01
There are well-established methods for reducing the number of support vectors in a trained binary support vector machine, often with minimal impact on accuracy. We show how reduced-set methods can be applied to multiclass SVMs made up of several binary SVMs, with significantly better results than reducing each binary SVM independently. Our approach is based on Burges' approach that constructs each reduced-set vector as the pre-image of a vector in kernel space, but we extend this by recomputing the SVM weights and bias optimally using the original SVM objective function. This leads to greater accuracy for a binary reduced-set SVM, and also allows vectors to be 'shared' between multiple binary SVMs for greater multiclass accuracy with fewer reduced-set vectors. We also propose computing pre-images using differential evolution, which we have found to be more robust than gradient descent alone. We show experimental results on a variety of problems and find that this new approach is consistently better than previous multiclass reduced-set methods, sometimes with a dramatic difference.
Evolutionary Support Vector Machines for Transient Stability Monitoring
NASA Astrophysics Data System (ADS)
Dora Arul Selvi, B.; Kamaraj, N.
2012-03-01
Power systems currently need fast and reliable contingency monitoring systems to maintain stability in a deregulated, open-market environment. In this paper, a fast and reliable transient stability monitoring algorithm that considers both symmetrical and unsymmetrical faults is presented. Support vector machines (SVMs) are employed as pattern classifiers to construct fast relation mappings between the transient stability results and the input attributes, which are selected using mutual information. The type of fault is recognized by an SVM classifier, and the critical clearing time of the fault is estimated by a support vector regression machine. The SVM parameters are tuned by an elitist multi-objective non-dominated sorting genetic algorithm so that the best classification and regression performance is achieved. To demonstrate the potential of the scheme, the IEEE 3-generator system and a South Indian grid are used.
Biologically relevant neural network architectures for support vector machines.
Jändel, Magnus
2014-01-01
Neural network architectures that implement support vector machines (SVM) are investigated for the purpose of modeling perceptual one-shot learning in biological organisms. A family of SVM algorithms including variants of maximum margin, 1-norm, 2-norm and ν-SVM is considered. SVM training rules adapted for neural computation are derived. It is found that competitive queuing memory (CQM) is ideal for storing and retrieving support vectors. Several different CQM-based neural architectures are examined for each SVM algorithm. Although most of the sixty-four scanned architectures are unconvincing for biological modeling four feasible candidates are found. The seemingly complex learning rule of a full ν-SVM implementation finds a particularly simple and natural implementation in bisymmetric architectures. Since CQM-like neural structures are thought to encode skilled action sequences and bisymmetry is ubiquitous in motor systems it is speculated that trainable pattern recognition in low-level perception has evolved as an internalized motor programme.
Optimization of Support Vector Machine (SVM) for Object Classification
NASA Technical Reports Server (NTRS)
Scholten, Matthew; Dhingra, Neil; Lu, Thomas T.; Chao, Tien-Hsin
2012-01-01
The Support Vector Machine (SVM) is a powerful algorithm, useful in classifying data into species. The SVMs implemented in this research were used as classifiers for the final stage in a Multistage Automatic Target Recognition (ATR) system. A single kernel SVM known as SVMlight, and a modified version known as a SVM with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification. From trial to trial, SVM produces consistent results.
Reinforced adaboost face detector using support vector machine
NASA Astrophysics Data System (ADS)
Jang, Jaeyoon; Yunkoo, C.; Jaehong, K.; Yoon, Hosub
2014-08-01
We propose a novel face detection algorithm that achieves a higher detection rate than the conventional Haar-AdaBoost face detector. The proposed method not only improves the face detection rate but also decreases the number of false positives. To improve detection ability, we merge two classifiers: AdaBoost and a support vector machine (SVM). Because the SVM and AdaBoost use different features, they complement each other. The proposed method yields a 2~4% performance improvement over our previous detector, which did not apply it. The method thus produces a better-performing detector without replacing the underlying algorithm.
Vector algorithms for geometrically nonlinear 3D finite element analysis
NASA Technical Reports Server (NTRS)
Whitcomb, John D.
1989-01-01
Algorithms for geometrically nonlinear finite element analysis are presented which exploit the vector processing capability of the VPS-32, which is closely related to the CYBER 205. By manipulating vectors (which are long lists of numbers) rather than individual numbers, very high processing speeds are obtained. Long vector lengths are obtained without extensive replication or reordering by storage of intermediate results in strategic patterns at all stages of the computations. Comparisons of execution times with those from programs using either scalar or other vector programming techniques indicate that the algorithms presented are quite efficient.
Neural cell image segmentation method based on support vector machine
NASA Astrophysics Data System (ADS)
Niu, Shiwei; Ren, Kan
2015-10-01
In the analysis of neural cell images acquired by optical microscope, accurate and rapid segmentation is the foundation of a nerve cell detection system. In this paper, a modified image segmentation method based on Support Vector Machine (SVM) is proposed to reduce the adverse impact of the low contrast ratio between objects and background, of interference from adherent and clustered cells, and of similar effects. Firstly, Morphological Filtering and the OTSU Method are applied to preprocess images and roughly extract the neural cells. Secondly, the Stellate Vector, Circularity and Histogram of Oriented Gradient (HOG) features are computed to train the SVM model. Finally, the incremental learning SVM classifier is used to classify the preprocessed images, and the initial recognition areas identified by the SVM classifier are added to the library as positive samples for training the SVM model. Experimental results show that the proposed algorithm achieves much better segmentation results than the classic segmentation algorithms.
Support vector machine for automatic pain recognition
NASA Astrophysics Data System (ADS)
Monwar, Md Maruf; Rezaei, Siamak
2009-02-01
Facial expressions are a key index of emotion and the interpretation of such expressions of emotion is critical to everyday social functioning. In this paper, we present an efficient video analysis technique for recognition of a specific expression, pain, from human faces. We employ an automatic face detector which detects face from the stored video frame using skin color modeling technique. For pain recognition, location and shape features of the detected faces are computed. These features are then used as inputs to a support vector machine (SVM) for classification. We compare the results with neural network based and eigenimage based automatic pain recognition systems. The experiment results indicate that using support vector machine as classifier can certainly improve the performance of automatic pain recognition system.
Adaptive support vector regression for UAV flight control.
Shin, Jongho; Jin Kim, H; Kim, Youdan
2011-01-01
This paper explores an application of support vector regression for adaptive control of an unmanned aerial vehicle (UAV). Unlike neural networks, support vector regression (SVR) generates global solutions, because SVR basically solves quadratic programming (QP) problems. With this advantage, the input-output feedback-linearized inverse dynamic model and the compensation term for the inversion error are identified off-line, which we call I-SVR (inversion SVR) and C-SVR (compensation SVR), respectively. In order to compensate for the inversion error and the unexpected uncertainty, an online adaptation algorithm for the C-SVR is proposed. Then, the stability of the overall error dynamics is analyzed by the uniformly ultimately bounded property in the nonlinear system theory. In order to validate the effectiveness of the proposed adaptive controller, numerical simulations are performed on the UAV model.
A novel support vector machine with globality-locality preserving.
Ma, Cheng-Long; Yuan, Yu-Bo
2014-01-01
The support vector machine (SVM) is regarded as a powerful method for pattern classification. However, the solution of the primal optimization model of SVM is susceptible to the class distribution and may be nonrobust. To overcome this shortcoming, an improved model, the support vector machine with globality-locality preserving (GLPSVM), is proposed. It introduces globality-locality preserving into the standard SVM, which can preserve the manifold structure of the data space. We conduct extensive experiments on the UCI machine learning data sets. The results validate the effectiveness of the proposed model, especially on the Wine and Iris databases; the recognition rate is above 97% and outperforms all the algorithms that were developed from SVM. PMID:25045750
Stereo matching based on adaptive support-weight approach in RGB vector space.
Geng, Yingnan; Zhao, Yan; Chen, Hexin
2012-06-01
Gradient similarity is a simple, yet powerful, data descriptor which shows robustness in stereo matching. In this paper, a RGB vector space is defined for stereo matching. Based on the adaptive support-weight approach, a matching algorithm, which uses the pixel gradient similarity, color similarity, and proximity in RGB vector space to compute the corresponding support-weights and dissimilarity measurements, is proposed. The experimental results are evaluated on the Middlebury stereo benchmark, showing that our algorithm outperforms other stereo matching algorithms and the algorithm with gradient similarity can achieve better results in stereo matching. PMID:22695592
Applications of Support Vector Machines In Chemo And Bioinformatics
NASA Astrophysics Data System (ADS)
Jayaraman, V. K.; Sundararajan, V.
2010-10-01
Conventional linear & nonlinear tools for classification, regression & data driven modeling are being replaced on a rapid scale by newer techniques & tools based on artificial intelligence and machine learning. While the linear techniques are not applicable for inherently nonlinear problems, newer methods serve as attractive alternatives for solving real life problems. Support Vector Machine (SVM) classifiers are a set of universal feed-forward network based classification algorithms that have been formulated from statistical learning theory and structural risk minimization principle. SVM regression closely follows the classification methodology. In this work recent applications of SVM in Chemo & Bioinformatics will be described with suitable illustrative examples.
An efficient parallel algorithm for matrix-vector multiplication
Hendrickson, B.; Leland, R.; Plimpton, S.
1993-03-01
The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is O(n/√p + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in the well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.
Weighted K-means support vector machine for cancer prediction.
Kim, SungHwan
2016-01-01
To date, the support vector machine (SVM) has been widely applied to diverse bio-medical fields to address disease subtype identification and pathogenicity of genetic variants. In this paper, I propose the weighted K-means support vector machine (wKM-SVM) and weighted support vector machine (wSVM), for which I allow the SVM to impose weights to the loss term. Besides, I demonstrate the numerical relations between the objective function of the SVM and weights. Motivated by general ensemble techniques, which are known to improve accuracy, I directly adopt the boosting algorithm to the newly proposed weighted KM-SVM (and wSVM). For predictive performance, a range of simulation studies demonstrate that the weighted KM-SVM (and wSVM) with boosting outperforms the standard KM-SVM (and SVM) including but not limited to many popular classification rules. I applied the proposed methods to simulated data and two large-scale real applications in the TCGA pan-cancer methylation data of breast and kidney cancer. In conclusion, the weighted KM-SVM (and wSVM) increases accuracy of the classification model, and will facilitate disease diagnosis and clinical treatment decisions to benefit patients. A software package (wSVM) is publicly available at the R-project webpage (https://www.r-project.org). PMID:27512621
Vectorization of linear discrete filtering algorithms
NASA Technical Reports Server (NTRS)
Schiess, J. R.
1977-01-01
Linear filters, including the conventional Kalman filter and versions of square root filters devised by Potter and Carlson, are studied for potential application on streaming computers. The square root filters are known to maintain a positive definite covariance matrix in cases in which the Kalman filter diverges due to ill-conditioning of the matrix. Vectorization of the filters is discussed, and comparisons are made of the number of operations and storage locations required by each filter. The Carlson filter is shown to be the most efficient of the filters on the Control Data STAR-100 computer.
A high-performance FFT algorithm for vector supercomputers
NASA Technical Reports Server (NTRS)
Bailey, David H.
1988-01-01
Many traditional algorithms for computing the fast Fourier transform (FFT) on conventional computers are unacceptable for advanced vector and parallel computers because they involve nonunit, power-of-two memory strides. A practical technique for computing the FFT that avoids all such strides and appears to be near-optimal for a variety of current vector and parallel computers is presented. Performance results of a program based on this technique are given. Notable among these results is that a FORTRAN implementation of this algorithm on the CRAY-2 runs up to 77-percent faster than Cray's assembly-coded library routine.
Implicit, nonswitching, vector-oriented algorithm for steady transonic flow
NASA Technical Reports Server (NTRS)
Lottati, I.
1983-01-01
A rapid computation of a sequence of transonic flow solutions has to be performed in many areas of aerodynamic technology. The employment of low-cost vector array processors makes the conduction of such calculations economically feasible. However, for a full utilization of the new hardware, the developed algorithms must take advantage of the special characteristics of the vector array processor. The present investigation has the objective to develop an efficient algorithm for solving transonic flow problems governed by mixed partial differential equations on an array processor.
An assessment of support vector machines for land cover classification
Huang, C.; Davis, L.S.; Townshend, J.R.G.
2002-01-01
The support vector machine (SVM) is a group of theoretically superior machine learning algorithms. It was found competitive with the best available machine learning algorithms in classifying high-dimensional data sets. This paper gives an introduction to the theoretical development of the SVM and an experimental evaluation of its accuracy, stability and training speed in deriving land cover classifications from satellite images. The SVM was compared to three other popular classifiers, including the maximum likelihood classifier (MLC), neural network classifiers (NNC) and decision tree classifiers (DTC). The impacts of kernel configuration on the performance of the SVM and of the selection of training data and input variables on the four classifiers were also evaluated in this experiment.
Fast and Accurate Support Vector Machines on Large Scale Systems
Vishnu, Abhinav; Narasimhan, Jayenthi; Holder, Larry; Kerbyson, Darren J.; Hoisie, Adolfy
2015-09-08
Support Vector Machines (SVM) is a supervised Machine Learning and Data Mining (MLDM) algorithm, which has become ubiquitous largely due to its high accuracy and obliviousness to dimensionality. The objective of SVM is to find an optimal boundary, also known as a hyperplane, which separates the samples (examples in a dataset) of different classes by a maximum margin. Usually, very few samples contribute to the definition of the boundary. However, existing parallel algorithms use the entire dataset for finding the boundary, which is sub-optimal for performance reasons. In this paper, we propose a novel distributed memory algorithm to eliminate the samples which do not contribute to the boundary definition in SVM. We propose several heuristics, which range from early (aggressive) to late (conservative) elimination of the samples, such that the overall time for generating the boundary is reduced considerably. In a few cases, a sample may be eliminated (shrunk) pre-emptively, potentially resulting in an incorrect boundary. We propose a scalable approach to synchronize the necessary data structures such that the proposed algorithm maintains its accuracy. We consider the necessary trade-offs of single/multiple synchronization using in-depth time-space complexity analysis. We implement the proposed algorithm using MPI and compare it with libsvm, the de facto sequential SVM software, which we enhance with OpenMP for multi-core/many-core parallelism. Our proposed approach shows excellent efficiency using up to 4096 processes on several large datasets such as the UCI HIGGS Boson dataset and the Offending URL dataset.
Vectorized Rebinning Algorithm for Fast Data Down-Sampling
NASA Technical Reports Server (NTRS)
Dean, Bruce; Aronstein, David; Smith, Jeffrey
2013-01-01
A vectorized rebinning (down-sampling) algorithm, applicable to N-dimensional data sets, has been developed that offers a significant reduction in computer run time when compared to conventional rebinning algorithms. For clarity, a two-dimensional version of the algorithm is discussed to illustrate some specific details of the algorithm content, and using the language of image processing, 2D data will be referred to as "images," and each value in an image as a "pixel." The new approach is fully vectorized, i.e., the down-sampling procedure is done as a single step over all image rows, and then as a single step over all image columns. Data rebinning (or down-sampling) is a procedure that uses a discretely sampled N-dimensional data set to create a representation of the same data, but with fewer discrete samples. Such data down-sampling is fundamental to digital signal processing, e.g., for data compression applications.
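The two-pass structure described above (one step over all rows, then one step over all columns) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `rebin_rows`, `transpose`, and `rebin_2d` are hypothetical, block averaging is assumed as the combining rule (the abstract does not say whether samples are summed or averaged), and plain Python lists stand in for the array operations that a vectorized implementation would use.

```python
def rebin_rows(image, factor):
    """Average every `factor` consecutive rows of a 2D list into one row.

    Assumption: averaging is the combining rule; the source does not specify it.
    """
    if factor <= 0 or len(image) % factor != 0:
        raise ValueError("row count must be a positive multiple of factor")
    width = len(image[0])
    return [
        [sum(image[r + k][c] for k in range(factor)) / factor for c in range(width)]
        for r in range(0, len(image), factor)
    ]


def transpose(image):
    """Swap rows and columns of a rectangular 2D list."""
    return [list(column) for column in zip(*image)]


def rebin_2d(image, row_factor, col_factor):
    """Down-sample in two passes: first over rows, then over columns."""
    rows_done = rebin_rows(image, row_factor)
    # Reuse the row pass for columns by transposing before and after.
    cols_done = rebin_rows(transpose(rows_done), col_factor)
    return transpose(cols_done)


if __name__ == "__main__":
    pixels = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
    print(rebin_2d(pixels, 2, 2))
```

Because each pass treats whole rows at once, the same structure maps directly onto array-library operations, which is where the run-time benefit of vectorization would come from.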
Privacy preserving RBF kernel support vector machine.
Li, Haoran; Xiong, Li; Ohno-Machado, Lucila; Jiang, Xiaoqian
2014-01-01
Data sharing is challenging but important for healthcare research. Methods for privacy-preserving data dissemination based on the rigorous differential privacy standard have been developed but they did not consider the characteristics of biomedical data and make full use of the available information. This often results in too much noise in the final outputs. We hypothesized that this situation can be alleviated by leveraging a small portion of open-consented data to improve utility without sacrificing privacy. We developed a hybrid privacy-preserving differentially private support vector machine (SVM) model that uses public data and private data together. Our model leverages the RBF kernel and can handle nonlinearly separable cases. Experiments showed that this approach outperforms two baselines: (1) SVMs that only use public data, and (2) differentially private SVMs that are built from private data. Our method demonstrated very close performance metrics compared to nonprivate SVMs trained on the private data. PMID:25013805
Supernova Recognition using Support Vector Machines
Romano, Raquel A.; Aragon, Cecilia R.; Ding, Chris
2006-10-01
We introduce a novel application of Support Vector Machines (SVMs) to the problem of identifying potential supernovae using photometric and geometric features computed from astronomical imagery. The challenges of this supervised learning application are significant: 1) noisy and corrupt imagery resulting in high levels of feature uncertainty, 2) features with heavy-tailed, peaked distributions, 3) extremely imbalanced and overlapping positive and negative data sets, and 4) the need to reach high positive classification rates, i.e. to find all potential supernovae, while reducing the burdensome workload of manually examining false positives. High accuracy is achieved via a sign-preserving, shifted log transform applied to features with peaked, heavy-tailed distributions. The imbalanced data problem is handled by oversampling positive examples, selectively sampling misclassified negative examples, and iteratively training multiple SVMs for improved supernova recognition on unseen test data. We present cross-validation results and demonstrate the impact on a large-scale supernova survey that currently uses the SVM decision value to rank-order 600,000 potential supernovae each night.
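A sign-preserving, shifted log transform of the kind mentioned above might look like the sketch below. The abstract does not give the formula, so the form sign(x) · log(|x| + shift), the default shift of 1, and the name `signed_log` are all assumptions made for illustration.

```python
import math


def signed_log(x, shift=1.0):
    """Compress magnitude logarithmically while keeping the sign of x.

    Assumed form: sign(x) * log(|x| + shift). With shift = 1 the transform
    maps 0 to 0 and stays monotonic across the origin.
    """
    return math.copysign(math.log(abs(x) + shift), x)


if __name__ == "__main__":
    for value in (-1000.0, -1.0, 0.0, 1.0, 1000.0):
        print(value, signed_log(value))
```

The shift keeps the logarithm defined at zero, and the sign is restored afterwards so that negative and positive feature values remain distinguishable after the heavy tails are compressed.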
Support Vector Machines in Fault Tolerance Control
NASA Astrophysics Data System (ADS)
Ribeiro, Bernardete
2002-09-01
This paper presents a new approach for quality monitoring of on-line molded parts in the context of an injection molding problem using Support Vector Machines (SVMs). While the main goal in the industrial framework is to automatically calculate the setpoints, a less important task is to classify plastic molded parts defects efficiently in order to assess multiple quality characteristics. The paper presents a comparison of the performance assessment of SVMs and RBF neural networks as part quality monitoring tools by analyzing complete data patterns. Results show that the classification model using SVMs presents slightly better performance than RBF neural networks mainly due to the superior generalization of the SVMs in high-dimensional spaces. Particularly, when RBF kernels are used, the accuracy of the task increases thus leading to smaller error rates. Besides, the optimization method is a constrained quadratic programming, which is a well studied and understood mathematical programming technique.
Attitude determination using vector observations: A fast optimal matrix algorithm
NASA Technical Reports Server (NTRS)
Markley, F. Landis
1993-01-01
The attitude matrix minimizing Wahba's loss function is computed directly by a method that is competitive with the fastest known algorithm for finding this optimal estimate. The method also provides an estimate of the attitude error covariance matrix. Analysis of the special case of two vector observations identifies those cases for which the TRIAD or algebraic method minimizes Wahba's loss function.
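For orientation, Wahba's loss function is commonly written as L(A) = ½ Σ aᵢ |bᵢ − A rᵢ|², where bᵢ are unit vectors measured in the body frame, rᵢ the corresponding reference-frame vectors, and aᵢ nonnegative weights. The sketch below builds the attitude matrix with the classical two-vector TRIAD construction and evaluates that loss. It is not the fast optimal matrix algorithm of the paper, and the helper names (`cross`, `unit`, `triad`, `wahba_loss`) are hypothetical.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(u):
    """Scale a nonzero 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in u))
    return tuple(c / norm for c in u)


def triad(b1, b2, r1, r2):
    """TRIAD attitude matrix A (3x3 nested lists) with A r1 = b1 exactly.

    b1, b2 are body-frame observations; r1, r2 the matching reference vectors.
    Each pair must be non-parallel.
    """
    t1 = unit(b1)
    t2 = unit(cross(b1, b2))
    t3 = cross(t1, t2)
    s1 = unit(r1)
    s2 = unit(cross(r1, r2))
    s3 = cross(s1, s2)
    body, ref = (t1, t2, t3), (s1, s2, s3)
    # A = sum over k of (body triad vector k) times (reference triad vector k)^T
    return [[sum(t[i] * s[j] for t, s in zip(body, ref)) for j in range(3)]
            for i in range(3)]


def wahba_loss(A, body_vecs, ref_vecs, weights):
    """Evaluate 0.5 * sum_i a_i * |b_i - A r_i|^2."""
    total = 0.0
    for b, r, a in zip(body_vecs, ref_vecs, weights):
        Ar = [sum(A[i][j] * r[j] for j in range(3)) for i in range(3)]
        total += a * sum((b[i] - Ar[i]) ** 2 for i in range(3))
    return 0.5 * total


if __name__ == "__main__":
    # Body frame rotated 90 degrees about z relative to the reference frame.
    A = triad((0, 1, 0), (-1, 0, 0), (1, 0, 0), (0, 1, 0))
    print(A)
    print(wahba_loss(A, [(0, 1, 0), (-1, 0, 0)], [(1, 0, 0), (0, 1, 0)], [1.0, 1.0]))
```

TRIAD fits the first observation exactly and uses the second only to fix the rotation about it, which is why it coincides with the minimizer of Wahba's loss only in special cases such as noise-free data.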
A generalized vector-valued total variation algorithm
Wohlberg, Brendt; Rodriguez, Paul
2009-01-01
We propose a simple but flexible method for solving the generalized vector-valued TV (VTV) functional, which includes both the ℓ²-VTV and ℓ¹-VTV regularizations as special cases, to address the problems of deconvolution and denoising of vector-valued (e.g. color) images with Gaussian or salt-and-pepper noise. This algorithm is the vectorial extension of the Iteratively Reweighted Norm (IRN) algorithm [1] originally developed for scalar (grayscale) images. This method offers competitive computational performance for denoising and deconvolving vector-valued images corrupted with Gaussian (ℓ²-VTV case) and salt-and-pepper noise (ℓ¹-VTV case).
Direct and implicit optical matrix-vector algorithms
NASA Technical Reports Server (NTRS)
Casasent, D.; Ghosh, A.
1983-01-01
New direct and implicit algorithms for optical matrix-vector and systolic array processors are considered. Direct rather than indirect algorithms to solve linear systems and implicit rather than explicit solutions to solve second-order partial differential equations are discussed. In many cases, such approaches more properly utilize the advantageous features of optical systolic array processors. The matrix-decomposition operation (rather than solution of the simplified matrix-vector equation that results) is recognized as the computationally burdensome aspect of such problems that should be computed on an optical system. The Householder QR matrix-decomposition algorithm is considered as a specific example of a direct solution. Extensions to eigenvalue computation and formation of matrices of special structure are also noted.
A sparse matrix algorithm on the Boolean vector machine
NASA Technical Reports Server (NTRS)
Wagner, Robert A.; Patrick, Merrell L.
1988-01-01
VLSI technology is being used to implement a prototype Boolean Vector Machine (BVM), which is a large network of very small processors with equally small memories that operate in SIMD mode; these use bit-serial arithmetic, and communicate via a cube-connected cycles network. The BVM's bit-serial arithmetic and the small memories of individual processors are noted to compromise the system's effectiveness in large numerical problem applications. Attention is presently given to the implementation of a basic matrix-vector iteration algorithm for sparse matrices on the BVM, in order to generate over 1 billion useful floating-point operations/sec for this iteration algorithm. The algorithm is expressed in a novel language designated 'BVM'.
Grid fill algorithm for vector graphics render on mobile devices
NASA Astrophysics Data System (ADS)
Zhang, Jixian; Yue, Kun; Yuan, Guowu; Zhang, Binbin
2015-12-01
The performance of vector graphics rendering has always been a key concern on mobile devices, and the most important step toward improving it is to make polygon fill algorithms more efficient. In this paper, we propose a new and more efficient polygon fill algorithm based on the scan line algorithm and the Grid Fill Algorithm (GFA). First, we elaborate the GFA through solid fill. Second, we describe the techniques for implementing antialiasing and self-intersecting polygon fill with GFA. Then, we discuss the implementation of GFA based on the gradient fill. Generally, compared to other fill algorithms, GFA has better performance and achieves faster fill speed, which suits the inherent characteristics of mobile devices. Experimental results show that better fill effects can be achieved by using GFA.
Tea category classification using morphological characteristics and support vector machines
NASA Astrophysics Data System (ADS)
Li, X. L.; He, Y.; Qiu, Z. J.; Bao, Y. D.
2008-11-01
Classifying tea categories is an important task for quality inspection, and the traditional manual approach is time-consuming and labor-intensive. This study proposes a method for discriminating green tea categories based on multi-spectral imaging. Four tea categories were selected, and a total of 243 multi-spectral images were collected using a common-aperture multi-spectral charge-coupled device camera with three channels (550, 660 and 800 nm). A compound image with the clearest sample outline was produced by combining the three monochrome images (550, 660 and 800 nm). After image preprocessing, 18 morphometric parameters were obtained for each sample, including area, perimeter, centroid and eccentricity. To better understand these parameters, principal component analysis was conducted on them, and a score plot of the first three principal components was obtained. The first three components accounted for 99.02% of the variation of the original 18 parameters. In the score plot, the four tea categories formed dense clusters, but the boundaries among them were not clear, so further discrimination was needed. Three algorithms, namely support vector machines, artificial neural networks and linear discriminant analysis, were used to develop classification models based on 9 optimized features. A very good result was obtained by the support vector machine model, with an accuracy of 93.75% when predicting unknown samples in the testing set. It can be concluded that computer vision is an effective means of classifying tea categories, and that support vector machines are well suited to developing the classification model.
Support vector machines for nuclear reactor state estimation
Zavaljevski, N.; Gross, K. C.
2000-02-14
Validation of nuclear power reactor signals is often performed by comparing signal prototypes with the actual reactor signals. The signal prototypes are often computed based on empirical data. The implementation of an estimation algorithm which can make predictions on limited data is an important issue. A new machine learning algorithm called support vector machines (SVMs) recently developed by Vladimir Vapnik and his coworkers enables a high level of generalization with finite high-dimensional data. The improved generalization in comparison with standard methods like neural networks is due mainly to the following characteristics of the method. The input data space is transformed into a high-dimensional feature space using a kernel function, and the learning problem is formulated as a convex quadratic programming problem with a unique solution. In this paper the authors have applied the SVM method for data-based state estimation in nuclear power reactors. In particular, they implemented and tested kernels developed at Argonne National Laboratory for the Multivariate State Estimation Technique (MSET), a nonlinear, nonparametric estimation technique with a wide range of applications in nuclear reactors. The methodology has been applied to three data sets from experimental and commercial nuclear power reactor applications. The results are promising. The combination of MSET kernels with the SVM method has better noise reduction and generalization properties than the standard MSET algorithm.
Support Vector Machines for Hyperspectral Remote Sensing Classification
NASA Technical Reports Server (NTRS)
Gualtieri, J. Anthony; Cromp, R. F.
1998-01-01
The Support Vector Machine provides a new way to design classification algorithms which learn from examples (supervised learning) and generalize when applied to new data. We demonstrate its success on a difficult classification problem from hyperspectral remote sensing, where we obtain performances of 96%, and 87% correct for a 4 class problem, and a 16 class problem respectively. These results are somewhat better than other recent results on the same data. A key feature of this classifier is its ability to use high-dimensional data without the usual recourse to a feature selection step to reduce the dimensionality of the data. For this application, this is important, as hyperspectral data consists of several hundred contiguous spectral channels for each exemplar. We provide an introduction to this new approach, and demonstrate its application to classification of an agriculture scene.
Intelligent Quality Prediction Using Weighted Least Square Support Vector Regression
NASA Astrophysics Data System (ADS)
Yu, Yaojun
A novel quality prediction method with a moving time window is proposed for small-batch production processes, based on weighted least squares support vector regression (LS-SVR). The design steps and learning algorithm are also addressed. In the method, weighted LS-SVR serves as the intelligent kernel: it handles small-batch learning well, and in the history data nearer samples are given larger weights while farther samples are given smaller weights. A typical machining process, cutting a bearing outer race, is carried out and the real measured data are used in a comparison experiment. The experimental results demonstrate that the prediction error of the weighted LS-SVR based model is only 20%-30% that of the standard LS-SVR based one under the same conditions. It provides a better candidate for quality prediction of small-batch production processes.
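A minimal sketch of the weighting idea, assuming the usual weighted LS-SVM linear system [[0, 1^T], [1, K + diag(1/(gamma*v_i))]] [b; alpha] = [0; y], an RBF kernel on scalar inputs, and exponentially decaying sample weights. The abstract does not state the weighting rule or the kernel, so `time_window_weights`, `decay`, `gamma` and `sigma` are illustrative assumptions, not the paper's design.

```python
import math

def rbf(x, z, sigma):
    return math.exp(-((x - z) ** 2) / (2.0 * sigma ** 2))

def time_window_weights(n, decay=0.8):
    """Assumed rule: the newest sample gets weight 1, older ones decay."""
    return [decay ** (n - 1 - i) for i in range(n)]

def solve(A, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_weighted_lssvr(xs, ys, weights, gamma=100.0, sigma=1.0):
    """Return (b, alphas) of the weighted LS-SVR dual linear system."""
    n = len(xs)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(xs[i], xs[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / (gamma * weights[i])  # larger weight, tighter fit
        A.append(row)
    solution = solve(A, [0.0] + list(ys))
    return solution[0], solution[1:]

def predict(x, xs, b, alphas, sigma=1.0):
    return b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alphas, xs))
```

Setting every weight to 1 recovers standard LS-SVR, which is the baseline the abstract compares against.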
Application of Support Vector Machine to Forex Monitoring
NASA Astrophysics Data System (ADS)
Kamruzzaman, Joarder; Sarker, Ruhul A.
Previous studies have demonstrated superior performance of artificial neural network (ANN) based forex forecasting models over traditional regression models. This paper applies support vector machines to build a forecasting model from the historical data using six simple technical indicators and presents a comparison with an ANN based model trained by scaled conjugate gradient (SCG) learning algorithm. The models are evaluated and compared on the basis of five commonly used performance metrics that measure closeness of prediction as well as correctness in directional change. Forecasting results of six different currencies against Australian dollar reveal superior performance of SVM model using simple linear kernel over ANN-SCG model in terms of all the evaluation metrics. The effect of SVM parameter selection on prediction performance is also investigated and analyzed.
Visualization and Interpretation of Support Vector Machine Activity Predictions.
Balfer, Jenny; Bajorath, Jürgen
2015-06-22
Support vector machines (SVMs) are among the preferred machine learning algorithms for virtual compound screening and activity prediction because of their frequently observed high performance levels. However, a well-known conundrum of SVMs (and other supervised learning methods) is the black box character of their predictions, which makes it difficult to understand why models succeed or fail. Herein we introduce an approach to rationalize the performance of SVM models based upon the Tanimoto kernel compared with the linear kernel. Model comparison and interpretation are facilitated by a visualization technique, making it possible to identify descriptor features that determine compound activity predictions. An implementation of the methodology has been made freely available. PMID:25988274
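For reference, the two kernels compared above can be sketched as follows. This is only the standard definition of the linear and Tanimoto kernels on fingerprint-like vectors, not the authors' released implementation. The value returned for two all-zero vectors is an assumed convention.

```python
def linear_kernel(x, y):
    """Plain inner product <x, y>."""
    return sum(a * b for a, b in zip(x, y))

def tanimoto_kernel(x, y):
    """<x, y> / (<x, x> + <y, y> - <x, y>)."""
    xy = linear_kernel(x, y)
    denominator = linear_kernel(x, x) + linear_kernel(y, y) - xy
    if denominator == 0:
        return 1.0  # assumed convention: two empty fingerprints are identical
    return xy / denominator

# Two binary fingerprints sharing two of their set bits.
fp_a = [1, 1, 0, 1]
fp_b = [1, 0, 0, 1]
similarity = tanimoto_kernel(fp_a, fp_b)  # 2 / (3 + 2 - 2)
```

For binary fingerprints the Tanimoto kernel equals the number of shared set bits divided by the number of bits set in either fingerprint, so it is bounded by 1, whereas the linear kernel grows with the number of set bits.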
Segmentation of mosaicism in cervicographic images using support vector machines
NASA Astrophysics Data System (ADS)
Xue, Zhiyun; Long, L. Rodney; Antani, Sameer; Jeronimo, Jose; Thoma, George R.
2009-02-01
The National Library of Medicine (NLM), in collaboration with the National Cancer Institute (NCI), is creating a large digital repository of cervicographic images for the study of uterine cervix cancer prevention. One of the research goals is to automatically detect diagnostic bio-markers in these images. Reliable bio-marker segmentation in large biomedical image collections is a challenging task due to the large variation in image appearance. Methods described in this paper focus on segmenting mosaicism, which is an important vascular feature used to visually assess the degree of cervical intraepithelial neoplasia. The proposed approach uses support vector machines (SVM) trained on a ground truth dataset annotated by medical experts (which circumvents the need for vascular structure extraction). We have evaluated the performance of the proposed algorithm and experimentally demonstrated its feasibility.
Cloud Detection of Optical Satellite Images Using Support Vector Machine
NASA Astrophysics Data System (ADS)
Lee, Kuan-Yi; Lin, Chao-Hung
2016-06-01
Cloud cover is generally present in optical remote-sensing images, which limits the usage of acquired images and increases the difficulty of data analysis, such as image compositing, correction of atmospheric effects, calculation of vegetation indices, land cover classification, and land cover change detection. In previous studies, thresholding has been a common and useful method for cloud detection. However, a selected threshold is usually suitable only for certain cases or local study areas, and it may fail in other cases. In other words, thresholding-based methods are data-sensitive. Besides, there are many exceptions to control, and the environment changes dynamically. Using the same threshold value on various data is not effective. In this study, a threshold-free method based on the Support Vector Machine (SVM) is proposed, which can avoid the abovementioned problems. The main idea of this study is to adopt a statistical model to detect clouds instead of a subjective thresholding-based method. The features used in a classifier are the key to a successful classification. Accordingly, the Automatic Cloud Cover Assessment (ACCA) algorithm, which is based on physical characteristics of clouds, is used to distinguish clouds from other objects. Similarly, the algorithm called Fmask (Zhu et al., 2012) uses many thresholds and criteria to screen clouds, cloud shadows, and snow. Therefore, the feature extraction algorithm is based on the ACCA algorithm and Fmask. Spatial and temporal information is also important for satellite images. Consequently, the co-occurrence matrix and temporal variance with uniformity of the major principal axis are used in the proposed method. We aim to classify images into three groups: cloud, non-cloud and the others. In experiments, images acquired by the Landsat 7 Enhanced Thematic Mapper Plus (ETM+) and images containing agricultural, snow-covered, and island landscapes are tested. Experiment results demonstrate the detection
Testing of the Support Vector Machine for Binary-Class Classification
NASA Technical Reports Server (NTRS)
Scholten, Matthew
2011-01-01
The Support Vector Machine is a powerful algorithm, useful in classifying data into species. The Support Vector Machines implemented in this research were used as classifiers for the final stage in a Multistage Autonomous Target Recognition system. A single kernel SVM known as SVMlight, and a modified version known as a Support Vector Machine with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification. From trial to trial, SVM produces consistent results.
Xie, Hong-Bo; Huang, Hu; Wu, Jianhua; Liu, Lei
2015-02-01
We present a multiclass fuzzy relevance vector machine (FRVM) learning mechanism and evaluate its performance in classifying multiple hand motions using surface electromyographic (sEMG) signals. The relevance vector machine (RVM) is a sparse Bayesian kernel method which avoids some limitations of the support vector machine (SVM). However, RVM still suffers from possible unclassifiable regions in multiclass problems. We propose two fuzzy membership function-based FRVM algorithms to solve such problems, based on experiments conducted on seven healthy subjects and two amputees with six hand motions. Two feature sets, namely, AR model coefficients and root mean square value (AR-RMS), and wavelet transform (WT) features, are extracted from the recorded sEMG signals. Fuzzy support vector machine (FSVM) analysis was also conducted for a broad comparison in terms of accuracy, sparsity, training and testing time, as well as the effect of training sample sizes. FRVM yielded comparable classification accuracy with dramatically fewer support vectors in comparison with FSVM. Furthermore, the processing delay of FRVM was much less than that of FSVM, whilst FSVM trained much faster than FRVM. The results indicate that an FRVM classifier trained using sufficient samples can achieve generalization capability comparable to that of FSVM with significant sparsity in multi-channel sEMG classification, which makes it more suitable for sEMG-based real-time control applications.
Automatic inspection of textured surfaces by support vector machines
NASA Astrophysics Data System (ADS)
Jahanbin, Sina; Bovik, Alan C.; Pérez, Eduardo; Nair, Dinesh
2009-08-01
Automatic inspection of manufactured products with natural looking textures is a challenging task. Products such as tiles, textile, leather, and lumber project image textures that cannot be modeled as periodic or otherwise regular; therefore, a stochastic modeling of local intensity distribution is required. An inspection system to replace human inspectors should be flexible in detecting flaws such as scratches, cracks, and stains occurring in various shapes and sizes that have never been seen before. A computer vision algorithm is proposed in this paper that extracts local statistical features from grey-level texture images decomposed with wavelet frames into subbands of various orientations and scales. The local features extracted are second order statistics derived from grey-level co-occurrence matrices. Subsequently, a support vector machine (SVM) classifier is trained to learn a general description of normal texture from defect-free samples. This algorithm is implemented in LabVIEW and is capable of processing natural texture images in real-time.
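The second-order statistics mentioned above can be sketched in a few lines. This is an illustrative grey-level co-occurrence matrix (GLCM) for one pixel offset with two common derived features (contrast and energy). It skips the wavelet-frame decomposition and the SVM stage, and the choice of offset and features is an assumption rather than the paper's configuration.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs (r, c) -> (r+dy, c+dx)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[n / total for n in row] for row in counts]

def contrast(p):
    """Sum of p(i, j) * (i - j)^2: large when neighbours differ strongly."""
    return sum(p[i][j] * (i - j) ** 2
               for i in range(len(p)) for j in range(len(p)))

def energy(p):
    """Sum of p(i, j)^2: large when few grey-level pairs dominate."""
    return sum(v * v for row in p for v in row)

# Tiny two-level image, horizontal neighbour offset.
patch = [[0, 0, 1],
         [1, 1, 0]]
p = glcm(patch, levels=2)
features = [contrast(p), energy(p)]
```

In a pipeline like the one described, such a feature vector would be computed per local window and per subband before being passed to the classifier.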
A support vector machine approach for detection of microcalcifications.
El-Naqa, Issam; Yang, Yongyi; Wernick, Miles N; Galatsanos, Nikolas P; Nishikawa, Robert M
2002-12-01
In this paper, we investigate an approach based on support vector machines (SVMs) for detection of microcalcification (MC) clusters in digital mammograms, and propose a successive enhancement learning scheme for improved performance. SVM is a machine-learning method, based on the principle of structural risk minimization, which performs well when applied to data outside the training set. We formulate MC detection as a supervised-learning problem and apply SVM to develop the detection algorithm. We use the SVM to detect at each location in the image whether an MC is present or not. We tested the proposed method using a database of 76 clinical mammograms containing 1120 MCs. We use free-response receiver operating characteristic curves to evaluate detection performance, and compare the proposed algorithm with several existing methods. In our experiments, the proposed SVM framework outperformed all the other methods tested. In particular, a sensitivity as high as 94% was achieved by the SVM method at an error rate of one false-positive cluster per image. The ability of SVM to outperform several well-known methods developed for the widely studied problem of MC detection suggests that SVM is a promising technique for object detection in a medical imaging application. PMID:12588039
Semisupervised Support Vector Machines With Tangent Space Intrinsic Manifold Regularization.
Sun, Shiliang; Xie, Xijiong
2016-09-01
Semisupervised learning has been an active research topic in machine learning and data mining. One main reason is that labeling examples is expensive and time-consuming, while there are large numbers of unlabeled examples available in many practical problems. So far, Laplacian regularization has been widely used in semisupervised learning. In this paper, we propose a new regularization method called tangent space intrinsic manifold regularization. It is intrinsic to data manifold and favors linear functions on the manifold. Fundamental elements involved in the formulation of the regularization are local tangent space representations, which are estimated by local principal component analysis, and the connections that relate adjacent tangent spaces. Simultaneously, we explore its application to semisupervised classification and propose two new learning algorithms called tangent space intrinsic manifold regularized support vector machines (TiSVMs) and tangent space intrinsic manifold regularized twin SVMs (TiTSVMs). They effectively integrate the tangent space intrinsic manifold regularization consideration. The optimization of TiSVMs can be solved by a standard quadratic programming, while the optimization of TiTSVMs can be solved by a pair of standard quadratic programmings. The experimental results of semisupervised classification problems show the effectiveness of the proposed semisupervised learning algorithms.
Using support vector machines in the multivariate state estimation technique
Zavaljevski, N.; Gross, K.C.
1999-07-01
One approach to validate nuclear power plant (NPP) signals makes use of pattern recognition techniques. This approach often assumes that there is a set of signal prototypes that are continuously compared with the actual sensor signals. These signal prototypes are often computed based on empirical models with little or no knowledge about physical processes. A common problem of all data-based models is their limited ability to make predictions on the basis of available training data. Another problem is related to suboptimal training algorithms. Both of these potential shortcomings with conventional approaches to signal validation and sensor operability validation are successfully resolved by adopting a recently proposed learning paradigm called the support vector machine (SVM). The work presented here is a novel application of SVM for data-based modeling of system state variables in an NPP, integrated with a nonlinear, nonparametric technique called the multivariate state estimation technique (MSET), an algorithm developed at Argonne National Laboratory for a wide range of nuclear plant applications.
Multipurpose image watermarking algorithm based on multistage vector quantization.
Lu, Zhe-Ming; Xu, Dian-Guo; Sun, Sheng-He
2005-06-01
The rapid growth of digital multimedia and Internet technologies has made copyright protection, copy protection, and integrity verification three important issues in the digital world. To solve these problems, the digital watermarking technique has been presented and widely researched. Traditional watermarking algorithms are mostly based on discrete transform domains, such as the discrete cosine transform, discrete Fourier transform (DFT), and discrete wavelet transform (DWT). Most of these algorithms are good for only one purpose. Recently, some multipurpose digital watermarking methods have been presented, which can achieve the goal of content authentication and copyright protection simultaneously. However, they are based on DWT or DFT. Lately, several robust watermarking schemes based on vector quantization (VQ) have been presented, but they can only be used for copyright protection. In this paper, we present a novel multipurpose digital image watermarking method based on the multistage vector quantizer structure, which can be applied to image authentication and copyright protection. In the proposed method, the semi-fragile watermark and the robust watermark are embedded in different VQ stages using different techniques, and both of them can be extracted without the original image. Simulation results demonstrate the effectiveness of our algorithm in terms of robustness and fragility. PMID:15971780
Using support vector machines for anomalous change detection
Theiler, James P; Steinwart, Ingo; Llamocca, Daniel
2010-01-01
We cast anomalous change detection as a binary classification problem, and use a support vector machine (SVM) to build a detector that does not depend on assumptions about the underlying data distribution. To speed up the computation, our SVM is implemented, in part, on a graphical processing unit. Results on real and simulated anomalous changes are used to compare performance to algorithms which effectively assume a Gaussian distribution. In this paper, we investigate the use of support vector machines (SVMs) with radial basis kernels for finding anomalous changes. Compared to typical applications of SVMs, we are operating in a regime of very low false alarm rate. This means that even for relatively large training sets, the data are quite meager in the regime of operational interest. This drives us to use larger training sets, which in turn places more of a computational burden on the SVM. We initially considered three different approaches to address the need to work in the very low false alarm rate regime. The first is a standard SVM which is trained at one threshold (where more reliable estimates of false alarm rates are possible) and then re-thresholded for the low false alarm rate regime. The second uses the same thresholding approach, but employs a so-called least squares SVM; here a quadratic (instead of a hinge-based) loss function is employed, and for this model, there are good theoretical arguments in favor of adjusting the threshold in a straightforward manner. The third approach employs a weighted support vector machine, where the weights for the two types of errors (false alarm and missed detection) are automatically adjusted to achieve the desired false alarm rate. We have found in previous experiments (not shown here) that the first two types can in some cases work well, while in other cases they do not. This renders both approaches unreliable for automated change detection. By contrast, the third approach reliably produces good results, but at
SPIDERz: SuPport vector classification for IDEntifying Redshifts
NASA Astrophysics Data System (ADS)
Jones, Evan; Singal, J.
2016-08-01
SPIDERz (SuPport vector classification for IDEntifying Redshifts) applies powerful support vector machine (SVM) optimization and statistical learning techniques to custom data sets to obtain accurate photometric redshift (photo-z) estimations. It is written for the IDL environment and can be applied to traditional data sets consisting of photometric band magnitudes, or alternatively to data sets with additional galaxy parameters (such as shape information) to investigate potential correlations between the extra galaxy parameters and redshift.
Ma, Yuliang; Ding, Xiaohui; She, Qingshan; Luo, Zhizeng; Potter, Thomas; Zhang, Yingchun
2016-01-01
Support vector machines are powerful tools used to solve the small sample and nonlinear classification problems, but their ultimate classification performance depends heavily upon the selection of appropriate kernel and penalty parameters. In this study, we propose using a particle swarm optimization algorithm to optimize the selection of both the kernel and penalty parameters in order to improve the classification performance of support vector machines. The performance of the optimized classifier was evaluated with motor imagery EEG signals in terms of both classification and prediction. Results show that the optimized classifier can significantly improve the classification accuracy of motor imagery EEG signals. PMID:27313656
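A minimal particle swarm sketch of the parameter search described above, assuming a two-dimensional search over (log10 of the penalty parameter, log10 of the kernel width). The objective here is a hypothetical smooth stand-in for a cross-validation error, since the abstract gives no details of the fitness function, and the swarm constants are common textbook values rather than the authors' settings.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=100, seed=0,
                 inertia=0.7298, c_personal=1.4962, c_global=1.4962):
    """Basic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c_global * rng.random() * (g_pos[d] - pos[i][d]))
                # Keep particles inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

def stand_in_cv_error(params):
    """Hypothetical stand-in for cross-validation error; minimum at (1, -2)."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

best_params, best_error = pso_minimize(stand_in_cv_error,
                                       bounds=[(-3.0, 3.0), (-3.0, 3.0)])
```

In practice the stand-in would be replaced by a function that trains the classifier with the candidate parameters and returns its cross-validated error, which makes each objective call far more expensive than in this toy example.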
Predicting complications of percutaneous coronary intervention using a novel support vector method
Lee, Gyemin; Gurm, Hitinder S; Syed, Zeeshan
2013-01-01
Objective: To explore the feasibility of a novel approach using an augmented one-class learning algorithm to model in-laboratory complications of percutaneous coronary intervention (PCI). Materials and methods: Data from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) multicenter registry for the years 2007 and 2008 (n=41 016) were used to train models to predict 13 different in-laboratory PCI complications using a novel one-plus-class support vector machine (OP-SVM) algorithm. The performance of these models in terms of discrimination and calibration was compared to the performance of models trained using the following classification algorithms on BMC2 data from 2009 (n=20 289): logistic regression (LR), one-class support vector machine classification (OC-SVM), and two-class support vector machine classification (TC-SVM). For the OP-SVM and TC-SVM approaches, variants of the algorithms with cost-sensitive weighting were also considered. Results: The OP-SVM algorithm and its cost-sensitive variant achieved the highest area under the receiver operating characteristic curve for the majority of the PCI complications studied (eight cases). Similar improvements were observed for the Hosmer–Lemeshow χ² value (seven cases) and the mean cross-entropy error (eight cases). Conclusions: The OP-SVM algorithm based on an augmented one-class learning problem improved discrimination and calibration across different PCI complications relative to LR and traditional support vector machine classification. Such an approach may have value in a broader range of clinical domains. PMID:23599229
Christodoulou, Christos George (University of New Mexico, Albuquerque, NM); Abdallah, Chaouki T. (University of New Mexico, Albuquerque, NM); Rohwer, Judd Andrew
2003-02-01
The paper presents a multiclass, multilabel implementation of least squares support vector machines (LS-SVM) for direction of arrival (DOA) estimation in a CDMA system. For any estimation or classification system, the algorithm's capabilities and performance must be evaluated. Specifically, for classification algorithms, a high confidence level must exist along with a technique to tag misclassifications automatically. The presented learning algorithm includes error control and validation steps for generating statistics on the multiclass evaluation path and the signal subspace dimension. The error statistics provide a confidence level for the classification accuracy.
Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Support Vector Machine
NASA Astrophysics Data System (ADS)
Kim, Sang-Kyun; Chang, Joon-Hyuk
In this letter, we propose a novel approach to speech/music classification based on the support vector machine (SVM) to improve the performance of the 3GPP2 selectable mode vocoder (SMV) codec. We first analyze the features and the classification method used in real time speech/music classification algorithm in SMV, and then apply the SVM for enhanced speech/music classification. For evaluation of performance, we compare the proposed algorithm and the traditional algorithm of the SMV. The performance of the proposed system is evaluated under the various environments and shows better performance compared to the original method in the SMV.
Ensemble Feature Learning of Genomic Data Using Support Vector Machine
Anaissi, Ali; Goyal, Madhu; Catchpoole, Daniel R.; Braytee, Ali; Kennedy, Paul J.
2016-01-01
The identification of a subset of genes having the ability to capture the necessary information to distinguish classes of patients is crucial in bioinformatics applications. Ensemble and bagging methods have been shown to work effectively in the process of gene selection and classification. Testament to that is random forest which combines random decision trees with bagging to improve overall feature selection and classification accuracy. Surprisingly, the adoption of these methods in support vector machines has only recently received attention but mostly on classification not gene selection. This paper introduces an ensemble SVM-Recursive Feature Elimination (ESVM-RFE) for gene selection that follows the concepts of ensemble and bagging used in random forest but adopts the backward elimination strategy which is the rationale of RFE algorithm. The rationale behind this is, building ensemble SVM models using randomly drawn bootstrap samples from the training set, will produce different feature rankings which will be subsequently aggregated as one feature ranking. As a result, the decision for elimination of features is based upon the ranking of multiple SVM models instead of choosing one particular model. Moreover, this approach will address the problem of imbalanced datasets by constructing a nearly balanced bootstrap sample. Our experiments show that ESVM-RFE for gene selection substantially increased the classification performance on five microarray datasets compared to state-of-the-art methods. Experiments on the childhood leukaemia dataset show that an average 9% better accuracy is achieved by ESVM-RFE over SVM-RFE, and 5% over random forest based approach. The selected genes by the ESVM-RFE algorithm were further explored with Singular Value Decomposition (SVD) which reveals significant clusters with the selected data. PMID:27304923
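The aggregation step described above can be sketched as follows. The sketch assumes each bootstrap model is a linear SVM summarized by its weight vector, ranks features by squared weight (the usual SVM-RFE criterion), and combines rankings by mean rank. The abstract does not say how rankings are aggregated, so the mean-rank rule and the example weight vectors are assumptions, and model training is left out.

```python
def rank_features(weights):
    """Rank 0 is the most important feature, by squared weight."""
    order = sorted(range(len(weights)), key=lambda j: -weights[j] ** 2)
    ranks = [0] * len(weights)
    for position, j in enumerate(order):
        ranks[j] = position
    return ranks

def aggregate_ranks(weight_vectors):
    """Mean rank of every feature across the ensemble (assumed rule)."""
    rankings = [rank_features(w) for w in weight_vectors]
    n_features = len(weight_vectors[0])
    return [sum(r[j] for r in rankings) / len(rankings)
            for j in range(n_features)]

def feature_to_eliminate(weight_vectors):
    """Backward elimination step: drop the feature with the worst mean rank."""
    mean_ranks = aggregate_ranks(weight_vectors)
    return max(range(len(mean_ranks)), key=lambda j: mean_ranks[j])

# Hypothetical weight vectors from three bootstrap-trained linear SVMs.
ensemble_weights = [[0.9, 0.1, -0.5],
                    [0.8, -0.2, 0.4],
                    [0.3, 0.05, 0.7]]
mean_ranks = aggregate_ranks(ensemble_weights)
worst = feature_to_eliminate(ensemble_weights)
```

Repeating this step, retraining the ensemble on the surviving features each time, gives a recursive elimination loop in which no single model decides which feature is removed.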
Building Ultra-Low False Alarm Rate Support Vector Classifier Ensembles Using Random Subspaces
Chen, B Y; Lemmond, T D; Hanley, W G
2008-10-06
This paper presents the Cost-Sensitive Random Subspace Support Vector Classifier (CS-RS-SVC), a new learning algorithm that combines random subspace sampling and bagging with Cost-Sensitive Support Vector Classifiers to more effectively address detection applications burdened by unequal misclassification requirements. When compared to its conventional, non-cost-sensitive counterpart on a two-class signal detection application, random subspace sampling is shown to very effectively leverage the additional flexibility offered by the Cost-Sensitive Support Vector Classifier, yielding a more than four-fold increase in the detection rate at a false alarm rate (FAR) of zero. Moreover, the CS-RS-SVC is shown to be fairly robust to constraints on the feature subspace dimensionality, enabling reductions in computation time of up to 82% with minimal performance degradation.
A support vector machine approach for classification of welding defects from ultrasonic signals
NASA Astrophysics Data System (ADS)
Chen, Yuan; Ma, Hong-Wei; Zhang, Guang-Ming
2014-07-01
Defect classification is an important issue in ultrasonic non-destructive evaluation. A layered multi-class support vector machine (LMSVM) classification system, which combines multiple SVM classifiers through a layered architecture, is proposed in this paper. The proposed LMSVM classification system is applied to the classification of welding defects from ultrasonic test signals. The measured ultrasonic defect echo signals are first decomposed into wavelet coefficients by the wavelet packet transform. The energies of the wavelet coefficients in different frequency channels are used to construct the feature vectors. The bees algorithm (BA) is then used for feature selection and SVM parameter optimisation for the LMSVM classification system. The BA-based feature selection optimises the energy feature vectors. The optimised feature vectors are input to the LMSVM classification system for training and testing. Experimental results of classifying welding defects demonstrate that the proposed technique is highly robust, precise and reliable for ultrasonic defect classification.
Improved algorithm for data conversion from raster to vector
NASA Astrophysics Data System (ADS)
Teng, Junhua; Wang, Fahui
2007-06-01
Transforming Remote Sensing (RS) classification results from raster to vector format (R2V) is a common task in Geographic Information Systems (GIS) and RS image processing. R2V acts as a bridge connecting GIS and RS data integration, and is an important module in many commercial software packages such as ENVI and ArcGIS. Given the inconvenience and inefficiency of current R2V algorithms, there is still room for improvement. In this paper several techniques are presented to improve R2V, including dynamic sub-image separation, fast edge tracing, segment combination and partial topology construction. A new method of two-arm chain edge tracing is introduced. The improved algorithm has several advantages: it transforms all types of RS classification results in a single pass and builds complete topological relationships; the shared edge between two polygons is recorded only once, and diagonal pixels with the same attribute are connected automatically; it is scalable to images of large dimensions, runs fast and has a significant advantage in processing large RS images; and the vectorised map is convenient to edit and modify because of its complete topology information. Based on a case study, the preliminary results show some advantages over ENVI and ArcGIS.
An Algorithm for Converting Static Earth Sensor Measurements into Earth Observation Vectors
NASA Technical Reports Server (NTRS)
Harman, R.; Hashmall, Joseph A.; Sedlak, Joseph
2004-01-01
An algorithm has been developed that converts penetration angles reported by Static Earth Sensors (SESs) into Earth observation vectors. This algorithm allows compensation for variation in the horizon height including that caused by Earth oblateness. It also allows pitch and roll to be computed using any number (greater than 1) of simultaneous sensor penetration angles simplifying processing during periods of Sun and Moon interference. The algorithm computes body frame unit vectors through each SES cluster. It also computes GCI vectors from the spacecraft to the position on the Earth's limb where each cluster detects the Earth's limb. These body frame vectors are used as sensor observation vectors and the GCI vectors are used as reference vectors in an attitude solution. The attitude, with the unobservable yaw discarded, is iteratively refined to provide the Earth observation vector solution.
Support vector machines for classification: a statistical portrait.
Lee, Yoonkyung
2010-01-01
The support vector machine is a supervised learning technique for classification increasingly used in many applications of data mining, engineering, and bioinformatics. This chapter aims to provide an introduction to the method, covering from the basic concept of the optimal separating hyperplane to its nonlinear generalization through kernels. A general framework of kernel methods that encompass the support vector machine as a special case is outlined. In addition, statistical properties that illuminate both advantage and limitation of the method due to its specific mechanism for classification are briefly discussed. For illustration of the method and related practical issues, an application to real data with high-dimensional features is presented.
nu-Anomica: A Fast Support Vector Based Novelty Detection Technique
NASA Technical Reports Server (NTRS)
Das, Santanu; Bhaduri, Kanishka; Oza, Nikunj C.; Srivastava, Ashok N.
2009-01-01
In this paper we propose nu-Anomica, a novel anomaly detection technique that can be trained on huge data sets with much reduced running time compared to the benchmark one-class Support Vector Machines algorithm. In nu-Anomica, the idea is to train the machine such that it can provide a close approximation to the exact decision plane using fewer training points and without losing much of the generalization performance of the classical approach. We have tested the proposed algorithm on a variety of continuous data sets under different conditions. We show that under all test conditions the developed procedure closely preserves the accuracy of standard one-class Support Vector Machines while reducing both the training time and the test time by a factor of 5 to 20.
Optical diagnosis of colon and cervical cancer by support vector machine
NASA Astrophysics Data System (ADS)
Mukhopadhyay, Sabyasachi; Kurmi, Indrajit; Dey, Rajib; Das, Nandan K.; Pradhan, Sanjay; Pradhan, Asima; Ghosh, Nirmalya; Panigrahi, Prasanta K.; Mohanty, Samarendra
2016-05-01
A robust probabilistic diagnostic algorithm is essential for successful cancer diagnosis by optical spectroscopy. We report here support vector machine (SVM) classification to better discriminate colon and cervical cancer tissues from normal tissues based on elastic scattering spectroscopy. The efficacy of SVM-based classification with different kernels has been tested on multifractal parameters such as the Hurst exponent and the singularity spectrum width in order to classify the cancer tissues.
Support vector machine for day ahead electricity price forecasting
NASA Astrophysics Data System (ADS)
Razak, Intan Azmira binti Wan Abdul; Abidin, Izham bin Zainal; Siah, Yap Keem; Rahman, Titik Khawa binti Abdul; Lada, M. Y.; Ramani, Anis Niza binti; Nasir, M. N. M.; Ahmad, Arfah binti
2015-05-01
Electricity price forecasting has become an important part of power system operation and planning. In a pool-based electric energy market, producers submit selling bids consisting of energy blocks and their corresponding minimum selling prices to the market operator. Meanwhile, consumers submit buying bids consisting of energy blocks and their corresponding maximum buying prices to the market operator. Hence, both producers and consumers use day-ahead price forecasts to derive their respective bidding strategies for the electricity market while reducing the cost of electricity. However, forecasting electricity prices is a complex task because the price series is non-stationary and highly volatile. Many factors cause price spikes, such as volatility in load and fuel price as well as power imports to and exports from outside the market through long-term contracts. This paper introduces a machine learning approach for day-ahead electricity price forecasting with the Least Squares Support Vector Machine (LS-SVM). Previous-day data of the Hourly Ontario Electricity Price (HOEP), generation price and demand from the Ontario power market are used as the inputs for training. The simulation is carried out using LSSVMlab in Matlab with training and testing data from 2004. SVM, which is widely used for classification and regression, has great generalization ability because it follows the structural risk minimization principle rather than empirical risk minimization. Moreover, the same parameter settings in a trained SVM give the same results, which reduces the simulation effort compared to other techniques such as neural networks and time series. The mean absolute percentage error (MAPE) for the proposed model shows that SVM performs well compared to a neural network.
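For readers unfamiliar with the error measure named in this abstract, the following is a minimal sketch of the mean absolute percentage error in its usual form, MAPE = (100/n) Σ |(a_t − f_t)/a_t|. The function name `mape` and the sample prices are hypothetical and are not data from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Raises ValueError for empty or mismatched inputs, or when an actual
    value is zero (the percentage error is then undefined).
    """
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical hourly prices and forecasts, for illustration only.
actual_prices = [50.0, 40.0, 80.0]
forecast_prices = [45.0, 44.0, 80.0]
error_percent = mape(actual_prices, forecast_prices)
```

Because each term is divided by the actual value, MAPE can become very large when prices approach zero, which is worth keeping in mind for volatile price series.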
A selective-update affine projection algorithm with selective input vectors
NASA Astrophysics Data System (ADS)
Kong, NamWoong; Shin, JaeWook; Park, PooGyeon
2011-10-01
This paper proposes an affine projection algorithm (APA) with selective input vectors, which is based on the concept of selective update, in order to reduce estimation errors and computations. The algorithm consists of two procedures: input-vector selection and state decision. The input-vector-selection procedure determines the number of input vectors by checking with the mean square error (MSE) whether the input vectors carry enough information for an update. The state-decision procedure determines the current state of the adaptive filter by using the state-decision criterion. While the adaptive filter is in the transient state, the algorithm updates the filter coefficients with the selected input vectors. On the other hand, as soon as the adaptive filter reaches the steady state, the update procedure is not performed. Through these two procedures, the proposed algorithm achieves small steady-state estimation errors, low computational complexity and low update complexity for colored input signals.
Diagnosis of Acute Coronary Syndrome with a Support Vector Machine.
Berikol, Göksu Bozdereli; Yildiz, Oktay; Özcan, I Türkay
2016-04-01
Acute coronary syndrome (ACS) is a serious condition arising from an imbalance between supply and demand in meeting the myocardium's metabolic needs. Patients typically present with retrosternal chest pain radiating to the neck and left arm. Electrocardiography (ECG) and laboratory tests are used in diagnosis. However, in emergency departments it can be difficult for physicians to decide whether to hospitalize, follow up, or discharge the patient. The aim of the study is to diagnose ACS and to help the physician decide whether to discharge or hospitalize via machine learning techniques such as support vector machine (SVM), using patient data including age, sex, risk factors, and cardiac enzymes (CK-MB, Troponin I) of patients presenting to the emergency department with chest pain. Clinical, laboratory, and imaging data of 228 patients presenting to the emergency department with chest pain were reviewed, and the performance of the support vector machine was evaluated. Four different methods (support vector machine (SVM), artificial neural network (ANN), Naïve Bayes and logistic regression) were tested, and the results of SVM, which had the highest accuracy, are reported. Among 228 patients aged 19 to 91 years who were included in the study, 99 (43.4 %) were qualified as ACS, while 129 (56.5 %) had no ACS. The classification model using SVM attained a 99.13 % classification success for ACS diagnosis. This study showed that machine learning techniques may help emergency department staff make decisions by rapidly producing relevant data.
Support vector machines classifiers of physical activities in preschoolers
Technology Transfer Automated Retrieval System (TEKTRAN)
The goal of this study is to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) in classifying preschool-aged children physical activity data acquired from an accelerometer. In this study, 69 children aged 3-5 years old were asked to participate in a s...
Identifying saltcedar with hyperspectral data and support vector machines
Technology Transfer Automated Retrieval System (TEKTRAN)
Saltcedar (Tamarix spp.) are a group of dense phreatophytic shrubs and trees that are invasive to riparian areas throughout the United States. This study determined the feasibility of using hyperspectral data and a support vector machine (SVM) classifier to discriminate saltcedar from other cover t...
Prediction of Machine Tool Condition Using Support Vector Machine
NASA Astrophysics Data System (ADS)
Wang, Peigong; Meng, Qingfeng; Zhao, Jian; Li, Junjie; Wang, Xiufeng
2011-07-01
Condition monitoring and prediction for CNC machine tools are investigated in this paper. Considering that only small numbers of samples are often available for CNC machine tools, a condition prediction method based on support vector machines (SVMs) is proposed, and one-step and multi-step condition prediction models are constructed. The support vector machine prediction models are used to predict the trends of the working condition of a certain type of CNC worm wheel and gear grinding machine from sequence data of the vibration signal collected during machining. The relationship between different eigenvalues of the CNC vibration signal and machining quality is also discussed. The test results show that the trend of the vibration signal peak-to-peak value in the surface normal direction is most relevant to the trend of the surface roughness value. In trend prediction of the working condition, the support vector machine has higher prediction accuracy in both short-term (one-step) and long-term (multi-step) prediction compared to the autoregressive (AR) model and the RBF neural network. Experimental results show that it is feasible to apply support vector machines to CNC machine tool condition prediction.
An Autonomous Star Identification Algorithm Based on One-Dimensional Vector Pattern for Star Sensors
Luo, Liyan; Xu, Luping; Zhang, Hua
2015-01-01
In order to enhance the robustness and accelerate the recognition speed of star identification, an autonomous star identification algorithm for star sensors is proposed based on the one-dimensional vector pattern (one_DVP). In the proposed algorithm, the space geometry information of the observed stars is used to form the one-dimensional vector pattern of the observed star. The one-dimensional vector pattern of the same observed star remains unchanged when the stellar image rotates, so the problem of star identification is simplified as the comparison of the two feature vectors. The one-dimensional vector pattern is adopted to build the feature vector of the star pattern, which makes it possible to identify the observed stars robustly. The characteristics of the feature vector and the proposed search strategy for the matching pattern make it possible to achieve the recognition result as quickly as possible. The simulation results demonstrate that the proposed algorithm can effectively accelerate the star identification. Moreover, the recognition accuracy and robustness by the proposed algorithm are better than those by the pyramid algorithm, the modified grid algorithm, and the LPT algorithm. The theoretical analysis and experimental results show that the proposed algorithm outperforms the other three star identification algorithms. PMID:26198233
SOLAR FLARE PREDICTION USING SDO/HMI VECTOR MAGNETIC FIELD DATA WITH A MACHINE-LEARNING ALGORITHM
Bobra, M. G.; Couvidat, S.
2015-01-10
We attempt to forecast M- and X-class solar flares using a machine-learning algorithm, called support vector machine (SVM), and four years of data from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager, the first instrument to continuously map the full-disk photospheric vector magnetic field from space. Most flare forecasting efforts described in the literature use either line-of-sight magnetograms or a relatively small number of ground-based vector magnetograms. This is the first time a large data set of vector magnetograms has been used to forecast solar flares. We build a catalog of flaring and non-flaring active regions sampled from a database of 2071 active regions, comprised of 1.5 million active region patches of vector magnetic field data, and characterize each active region by 25 parameters. We then train and test the machine-learning algorithm and we estimate its performances using forecast verification metrics with an emphasis on the true skill statistic (TSS). We obtain relatively high TSS scores and overall predictive abilities. We surmise that this is partly due to fine-tuning the SVM for this purpose and also to an advantageous set of features that can only be calculated from vector magnetic field data. We also apply a feature selection algorithm to determine which of our 25 features are useful for discriminating between flaring and non-flaring active regions and conclude that only a handful are needed for good predictive abilities.
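The true skill statistic emphasized in this abstract is commonly computed from a binary contingency table as TSS = TP/(TP+FN) − FP/(FP+TN), that is, the hit rate minus the false alarm rate. The sketch below assumes that standard definition; the function name and the counts are hypothetical and do not come from the paper.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = hit rate - false alarm rate, ranging from -1 to 1.

    tp/fn count flaring regions forecast correctly/missed;
    fp/tn count non-flaring regions forecast wrongly/correctly.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both classes must be present")
    hit_rate = tp / (tp + fn)
    false_alarm_rate = fp / (fp + tn)
    return hit_rate - false_alarm_rate


# Hypothetical contingency table, for illustration only.
tss = true_skill_statistic(tp=40, fn=10, fp=20, tn=180)
```

Unlike plain accuracy, this score does not change when the ratio of non-flaring to flaring examples changes, which is one reason it suits rare-event forecasting.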
Classification of Regional Ionospheric Disturbances Based on Support Vector Machines
NASA Astrophysics Data System (ADS)
Begüm Terzi, Merve; Arikan, Feza; Arikan, Orhan; Karatay, Secil
2016-07-01
The ionosphere is an anisotropic, inhomogeneous, time-varying and spatio-temporally dispersive medium whose parameters can almost always be estimated only by using indirect measurements. Geomagnetic, gravitational, solar or seismic activities cause variations of the ionosphere at various spatial and temporal scales. This complex spatio-temporal variability is challenging to identify due to the extensive scales in period, duration, amplitude and frequency of disturbances. Since geomagnetic and solar indices such as Disturbance storm time (Dst), F10.7 solar flux, Sun Spot Number (SSN), Auroral Electrojet (AE), Kp and W-index provide information about variability on a global scale, identification and classification of regional disturbances poses a challenge. The main aim of this study is to classify the regional effects of global geomagnetic storms according to their risk levels. For this purpose, Total Electron Content (TEC) estimated from GPS receivers, which is one of the major parameters of the ionosphere, will be used along with solar and geomagnetic indices to model the regional and local variability that differs from global activity. In this work, for the automated classification of the regional disturbances, a classification technique based on a robust machine learning technique that has found widespread use, the Support Vector Machine (SVM), is proposed. SVM is a supervised learning model used for classification, with an associated learning algorithm that analyzes the data and recognizes patterns. In addition to performing linear classification, SVM can efficiently perform nonlinear classification by embedding data into higher dimensional feature spaces. Performance of the developed classification technique is demonstrated for the midlatitude ionosphere over Anatolia using TEC estimates generated from the GPS data provided by the Turkish National Permanent GPS Network (TNPGN-Active) for the solar maximum year of 2011. As a result of implementing the developed classification
Chen, Yuantao; Xu, Weihong; Kuang, Fangjun; Gao, Shangbing
2013-01-01
Efficient target tracking algorithms have become a current research focus for intelligent robots. The main problems of target tracking for mobile robots stem from environmental uncertainty: the target states are very difficult to estimate, and illumination changes, target shape changes, complex backgrounds, occlusion, and other factors all affect tracking robustness. To further improve the accuracy and reliability of target tracking, we present a novel target tracking algorithm that uses visual saliency and an adaptive support vector machine (ASVM). The algorithm is based on the mixture saliency of image features, which include color, brightness, and motion. The execution process uses visual saliency features, and their common characteristics are expressed as the target's saliency. Numerous experiments demonstrate the effectiveness and timeliness of the proposed target tracking algorithm in video sequences where the target objects undergo large changes in pose, scale, and illumination. PMID:24363779
Classifier based on support vector machine for JET plasma configurations
Dormido-Canto, S.; Farias, G.; Dormido, R.; Sanchez, J.; Duro, N.; Vargas, H.
2008-10-15
The last flux surface can be used to identify the plasma configuration of discharges. For automated recognition of JET configurations, a learning system based on support vector machines has been developed. Each configuration is described by 12 geometrical parameters. A multiclass system has been developed by means of the one-versus-the-rest approach. Results with eight simultaneous classes (plasma configurations) show a success rate close to 100%.
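The one-versus-the-rest approach mentioned in this abstract can be sketched as follows: one binary scorer is trained per class, and a new sample is assigned to the class whose scorer returns the largest decision value. The toy linear scorers, the two-feature inputs and the three class labels below are hypothetical stand-ins; the study itself uses 12 geometrical parameters and eight classes, and its trained SVMs are not reproduced here.

```python
def linear_scorer(weights, bias):
    """Return a decision function x -> w.x + b (stand-in for a trained binary SVM)."""
    return lambda x: sum(w * xi for w, xi in zip(weights, x)) + bias


def one_vs_rest_predict(x, scorers):
    """Pick the class whose 'this class versus the rest' scorer is most confident."""
    return max(scorers, key=lambda label: scorers[label](x))


# Hypothetical scorers for three classes over two features.
scorers = {
    "A": linear_scorer([1.0, 0.0], 0.0),
    "B": linear_scorer([0.0, 1.0], 0.0),
    "C": linear_scorer([-1.0, -1.0], 0.5),
}
predicted = one_vs_rest_predict([2.0, 0.5], scorers)
```

With K classes this scheme needs K binary classifiers, each trained on the full data set with one class relabeled as positive.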
Optimal control by least squares support vector machines.
Suykens, J A; Vandewalle, J; De Moor, B
2001-01-01
Support vector machines have been very successful in pattern recognition and function estimation problems. In this paper we introduce the use of least squares support vector machines (LS-SVM's) for the optimal control of nonlinear systems. Linear and neural full static state feedback controllers are considered. The problem is formulated in such a way that it incorporates the N-stage optimal control problem as well as a least squares support vector machine approach for mapping the state space into the action space. The solution is characterized by a set of nonlinear equations. An alternative formulation as a constrained nonlinear optimization problem in less unknowns is given, together with a method for imposing local stability in the LS-SVM control scheme. The results are discussed for support vector machines with radial basis function kernel. Advantages of LS-SVM control are that no number of hidden units has to be determined for the controller and that no centers have to be specified for the Gaussian kernels when applying Mercer's condition. The curse of dimensionality is avoided in comparison with defining a regular grid for the centers in classical radial basis function networks. This is at the expense of taking the trajectory of state variables as additional unknowns in the optimization problem, while classical neural network approaches typically lead to parametric optimization problems. In the SVM methodology the number of unknowns equals the number of training data, while in the primal space the number of unknowns can be infinite dimensional. The method is illustrated both on stabilization and tracking problems including examples on swinging up an inverted pendulum with local stabilization at the endpoint and a tracking problem for a ball and beam system.
Support vector based battery state of charge estimator
NASA Astrophysics Data System (ADS)
Hansen, Terry; Wang, Chia-Jiu
This paper investigates the use of a support vector machine (SVM) to estimate the state-of-charge (SOC) of a large-scale lithium-ion-polymer (LiP) battery pack. The SOC of a battery cannot be measured directly and must be estimated from measurable battery parameters such as current and voltage. The coulomb counting SOC estimator has been used in many applications but it has many drawbacks [S. Piller, M. Perrin, Methods for state-of-charge determination and their application, J. Power Sources 96 (2001) 113-120]. The proposed SVM based solution not only removes the drawbacks of the coulomb counting SOC estimator but also produces accurate SOC estimates, using industry standard US06 [V.H. Johnson, A.A. Pesaran, T. Sack, Temperature-dependent battery models for high-power lithium-ion batteries, in: Presented at the 17th Annual Electric Vehicle Symposium Montreal, Canada, October 15-18, 2000. The paper is downloadable at website http://www.nrel.gov/docs/fy01osti/28716.pdf] aggressive driving cycle test procedures. The proposed SOC estimator extracts support vectors from a battery operation history then uses only these support vectors to estimate SOC, resulting in minimal computation load and suitable for real-time embedded system applications.
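As background for the baseline this abstract compares against, the sketch below shows coulomb counting in its simplest form: the state of charge is updated by integrating the measured current over time and dividing by the pack capacity. The sign convention (positive current means discharge), the function name and all numbers are assumptions for illustration. A commonly noted weakness of this approach is that any bias in the current measurement accumulates in the estimate.

```python
def coulomb_count(soc0, currents_a, dt_s, capacity_ah):
    """Track state of charge (0..1) by integrating current samples.

    Assumptions: positive current discharges the pack, samples are
    equally spaced dt_s seconds apart, and capacity is constant.
    """
    capacity_as = capacity_ah * 3600.0  # ampere-hours to ampere-seconds
    soc = soc0
    history = []
    for current in currents_a:
        soc -= current * dt_s / capacity_as
        soc = min(1.0, max(0.0, soc))  # keep the estimate physically meaningful
        history.append(soc)
    return history


# Hypothetical example: a 100 Ah pack discharged at 10 A for one hour.
trace = coulomb_count(soc0=1.0, currents_a=[10.0] * 3600, dt_s=1.0, capacity_ah=100.0)
final_soc = trace[-1]
```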
Lumbar Ultrasound Image Feature Extraction and Classification with Support Vector Machine.
Yu, Shuang; Tan, Kok Kiong; Sng, Ban Leong; Li, Shengjin; Sia, Alex Tiong Heng
2015-10-01
Needle entry site localization remains a challenge for procedures that involve lumbar puncture, for example, epidural anesthesia. To solve the problem, we have developed an image classification algorithm that can automatically identify the bone/interspinous region for ultrasound images obtained from lumbar spine of pregnant patients in the transverse plane. The proposed algorithm consists of feature extraction, feature selection and machine learning procedures. A set of features, including matching values, positions and the appearance of black pixels within pre-defined windows along the midline, were extracted from the ultrasound images using template matching and midline detection methods. A support vector machine was then used to classify the bone images and interspinous images. The support vector machine model was trained with 1,040 images from 26 pregnant subjects and tested on 800 images from a separate set of 20 pregnant patients. A success rate of 95.0% on training set and 93.2% on test set was achieved with the proposed method. The trained support vector machine model was further tested on 46 off-line collected videos, and successfully identified the proper needle insertion site (interspinous region) in 45 of the cases. Therefore, the proposed method is able to process the ultrasound images of lumbar spine in an automatic manner, so as to facilitate the anesthetists' work of identifying the needle entry site.
Simple, fast codebook training algorithm by entropy sequence for vector quantization
NASA Astrophysics Data System (ADS)
Pang, Chao-yang; Yao, Shaowen; Qi, Zhang; Sun, Shi-xin; Liu, Jingde
2001-09-01
Traditional training algorithms for vector quantization, such as the LBG algorithm, use the convergence of the distortion sequence as the stopping condition. In this paper we present a novel training algorithm for vector quantization in which the convergence of the entropy sequence of each region sequence is employed as the stopping condition. Compared with the well-known LBG algorithm, it is simple, fast, and easy to understand and control. We test the performance of the algorithm on the typical test images Lena and Barb. The results show that the PSNR difference between the algorithm and LBG is less than 0.1 dB, but the running time of it is at most one second of LBG.
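The PSNR figure quoted in this abstract is conventionally defined as PSNR = 10 log10(peak² / MSE), where MSE is the mean squared error between the original and reconstructed pixels. The sketch below assumes that definition with 8-bit images (peak 255); the function name and the pixel values are hypothetical.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB for two equal-length pixel sequences."""
    if not original or len(original) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical pixel values before and after quantization.
original_pixels = [0, 255, 128, 64]
decoded_pixels = [0, 250, 128, 69]
quality_db = psnr(original_pixels, decoded_pixels)
```

A difference of less than 0.1 dB between two codebooks, as reported above, corresponds to nearly identical mean squared errors.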
Seismic interpretation using Support Vector Machines implemented on Graphics Processing Units
Kuzma, H A; Rector, J W; Bremer, D
2006-06-22
Support Vector Machines (SVMs) estimate lithologic properties of rock formations from seismic data by interpolating between known models using synthetically generated model/data pairs. SVMs are related to kriging and radial basis function neural networks. In our study, we train an SVM to approximate an inverse to the Zoeppritz equations. Training models are sampled from distributions constructed from well-log statistics. Training data is computed via a physically realistic forward modeling algorithm. In our experiments, each training data vector is a set of seismic traces similar to a 2-d image. The SVM returns a model given by a weighted comparison of the new data to each training data vector. The method of comparison is given by a kernel function which implicitly transforms data into a high-dimensional feature space and performs a dot-product. The feature space of a Gaussian kernel is made up of sines and cosines and so is appropriate for band-limited seismic problems. Training an SVM involves estimating a set of weights from the training model/data pairs. It is designed to be an easy problem; at worst it is a quadratic programming problem on the order of the size of the training set. By implementing the slowest part of our SVM algorithm on a graphics processing unit (GPU), we improve the speed of the algorithm by two orders of magnitude. Our SVM/GPU combination achieves results that are similar to those of conventional iterative inversion in fractions of the time.
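The "weighted comparison of the new data to each training data vector" described in this abstract can be written as f(x) = b + Σ_i w_i k(x, x_i) with a Gaussian kernel k(x, x_i) = exp(−γ‖x − x_i‖²). The sketch below evaluates that expression for given weights; the weights, bias, γ value and vectors are hypothetical, and the training step that would produce the weights is not shown.

```python
import math


def gaussian_kernel(a, b, gamma=1.0):
    """k(a, b) = exp(-gamma * squared Euclidean distance between a and b)."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


def kernel_predict(x, train_vectors, weights, bias=0.0, gamma=1.0):
    """Weighted sum of kernel similarities between x and each training vector."""
    return bias + sum(
        w * gaussian_kernel(x, v, gamma) for w, v in zip(weights, train_vectors)
    )


# Hypothetical training vectors and weights, for illustration only.
train_vectors = [[0.0, 0.0], [1.0, 0.0]]
weights = [2.0, -1.0]
estimate = kernel_predict([0.0, 0.0], train_vectors, weights)
```

Each kernel evaluation is independent of the others, which is the kind of workload that maps well onto parallel hardware such as the GPU mentioned above.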
Scorebox extraction from mobile sports videos using Support Vector Machines
NASA Astrophysics Data System (ADS)
Kim, Wonjun; Park, Jimin; Kim, Changick
2008-08-01
The scorebox plays an important role in understanding the content of sports videos. However, a tiny scorebox may make it uncomfortable for viewers on small displays to grasp the game situation. In this paper, we propose a novel framework to extract the scorebox from sports video frames. We first extract candidates by using accumulated intensity and edge information after a short learning period. Since there are various types of scoreboxes inserted in sports videos, multiple attributes need to be used for efficient extraction. Based on those attributes, the optimal information gain is computed, and the three top-ranked attributes in terms of information gain are selected as a three-dimensional feature vector for Support Vector Machines (SVM) to distinguish the scorebox from other candidates, such as logos and advertisement boards. The proposed method is tested on various videos of sports games, and experimental results show the efficiency and robustness of our proposed method.
Wang, Shuihua; Chen, Mengmeng; Li, Yang; Shao, Ying; Zhang, Yudong; Du, Sidan; Wu, Jane
2016-01-01
Dendritic spines are described as neuronal protrusions. The morphology of dendritic spines and dendrites has a strong relationship to their function and plays an important role in understanding brain function. Quantitative analysis of dendrites and dendritic spines is essential to an understanding of the formation and function of the nervous system. However, highly efficient tools for the quantitative analysis of dendrites and dendritic spines are currently undeveloped. In this paper we propose a novel three-step cascaded algorithm, RTSVM, which is composed of ridge detection as the curvature structure identifier for backbone extraction, boundary location based on differences in density, and the Hu moment as features with Twin Support Vector Machine (TSVM) classifiers for spine classification. Our data demonstrate that this newly developed algorithm performs better than other available techniques in terms of detection accuracy and false alarm rates. This algorithm will be used effectively in neuroscience research. PMID:27547530
Feature Selection and Classification of Hyperspectral Images with Support Vector Machines
Archibald, Richard K; Fann, George I
2007-01-01
Hyperspectral images consist of a large number of bands which require sophisticated analysis to extract. One approach to reduce computational cost and information representation, and to accelerate knowledge discovery, is to eliminate bands that do not add value to the classification and analysis method which is being applied. In particular, algorithms that perform band elimination should be designed to take advantage of the structure of the classification method used. This letter introduces an embedded-feature-selection (EFS) algorithm that is tailored to operate with support vector machines (SVMs) to perform band selection and classification simultaneously. We have successfully applied this algorithm to determine a reasonable subset of bands without any user-defined stopping criteria on some sample AVIRIS images; a problem occurs in benchmarking recursive-feature-elimination methods for the SVMs.
Prediction of neurotoxins by support vector machine based on multiple feature vectors.
Guang, Xuan-Min; Guo, Yan-Zhi; Wang, Xia; Li, Meng-Long
2010-09-01
A neurotoxin is a toxin that acts on nerve cells by interacting with membrane proteins. Different neurotoxins have different functions and sources. More knowledge of neurotoxins would greatly help drug design. The support vector machine (SVM) was used to predict neurotoxins based on multiple feature vector descriptors, including the amino acid composition, length of the protein sequence, weight of the protein and the evolution information described by a position specific scoring matrix (PSSM). After a five-fold cross-validation procedure, the method achieved an accuracy of 100% in discriminating neurotoxins from non-toxins. As for classifying neurotoxins based on their sources and functions, the accuracy was 99.50% and 99.38% respectively. Finally, the method yielded a good performance in sub-classification of ion channel inhibitors with a total accuracy of 87.27%. These results indicate that this method outperforms the previously described NTXpred method.
Improvements on ν-Twin Support Vector Machine.
Khemchandani, Reshma; Saigal, Pooja; Chandra, Suresh
2016-07-01
In this paper, we propose two novel binary classifiers termed as "Improvements on ν-Twin Support Vector Machine: Iν-TWSVM and Iν-TWSVM (Fast)" that are motivated by ν-Twin Support Vector Machine (ν-TWSVM). Similar to ν-TWSVM, Iν-TWSVM determines two nonparallel hyperplanes such that they are closer to their respective classes and are at least ρ distance away from the other class. The significant advantage of Iν-TWSVM over ν-TWSVM is that Iν-TWSVM solves one smaller-sized Quadratic Programming Problem (QPP) and one Unconstrained Minimization Problem (UMP); as compared to solving two related QPPs in ν-TWSVM. Further, Iν-TWSVM (Fast) avoids solving a smaller sized QPP and transforms it as a unimodal function, which can be solved using line search methods and similar to Iν-TWSVM, the other problem is solved as a UMP. Due to their novel formulation, the proposed classifiers are faster than ν-TWSVM and have comparable generalization ability. Iν-TWSVM also implements structural risk minimization (SRM) principle by introducing a regularization term, along with minimizing the empirical risk. The other properties of Iν-TWSVM, related to support vectors (SVs), are similar to that of ν-TWSVM. To test the efficacy of the proposed method, experiments have been conducted on a wide range of UCI and a skewed variation of NDC datasets. We have also given the application of Iν-TWSVM as a binary classifier for pixel classification of color images. PMID:27136663
Near Real-Time Dust Aerosol Detection with Support Vector Machines for Regression
NASA Astrophysics Data System (ADS)
Rivas-Perea, P.; Rivas-Perea, P. E.; Cota-Ruiz, J.; Aragon Franco, R. A.
2015-12-01
Remote sensing instruments operating in the near-infrared spectrum usually provide the necessary information for further dust aerosol spectral analysis using statistical or machine learning algorithms. Such algorithms have proven to be effective in analyzing very specific case studies or dust events. However, very few make the analysis open to the public on a regular basis, fewer are designed specifically to operate in near real-time to higher resolutions, and almost none give a global daily coverage. In this research we investigated a large-scale approach to a machine learning algorithm called "support vector regression". The algorithm uses four near-infrared spectral bands from NASA MODIS instrument: B20 (3.66-3.84μm), B29 (8.40-8.70μm), B31 (10.78-11.28μm), and B32 (11.77-12.27μm). The algorithm is presented with ground truth from more than 30 distinct reported dust events, from different geographical regions, at different seasons, both over land and sea cover, in the presence of clouds and clear sky, and in the presence of fires. The purpose of our algorithm is to learn to distinguish the dust aerosols spectral signature from other spectral signatures, providing as output an estimate of the probability of a data point being consistent with dust aerosol signatures. During modeling with ground truth, our algorithm achieved more than 90% of accuracy, and the current live performance of the algorithm is remarkable. Moreover, our algorithm is currently operating in near real-time using NASA's Land, Atmosphere Near real-time Capability for EOS (LANCE) servers, providing a high resolution global overview including 64, 32, 16, 8, 4, 2, and 1km. The near real-time analysis of our algorithm is now available to the general public at http://dust.reev.us and archives of the results starting from 2012 are available upon request.
Gene classification using codon usage and support vector machines.
Ma, Jianmin; Nguyen, Minh N; Rajapakse, Jagath C
2009-01-01
A novel approach for gene classification, which adopts codon usage bias as the input feature vector for classification by support vector machines (SVM), is proposed. The DNA sequence is first converted to a 59-dimensional feature vector where each element corresponds to the relative synonymous usage frequency of a codon. As the input to the classifier is independent of sequence length and variance, our approach is useful when the sequences to be classified are of different lengths, a condition under which homology-based methods tend to fail. The method is demonstrated by using 1,841 Human Leukocyte Antigen (HLA) sequences which are classified into two major classes: HLA-I and HLA-II; each major class is further subdivided into sub-groups of HLA-I and HLA-II molecules. Using codon usage frequencies, binary SVM achieved an accuracy rate of 99.3% for HLA major class classification and multi-class SVM achieved accuracy rates of 99.73% and 98.38% for sub-class classification of HLA-I and HLA-II molecules, respectively. The results show that gene classification based on codon usage bias is consistent with the molecular structures and biological functions of HLA molecules. PMID:19179707
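The 59-dimensional codon usage vector described above can be illustrated with a minimal sketch. It assumes the standard genetic code and the usual relative synonymous codon usage (RSCU) definition, in which a codon's count is divided by the mean count of its synonymous family; the abstract does not state the exact normalization, and the names `rscu_vector`, `FAMILIES` and `FEATURE_CODONS` are hypothetical. The 59 features arise from the 64 codons minus three stop codons and the two single-codon amino acids (Met, Trp).

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: standard table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons by amino acid; drop stop codons and single-codon families (Met, Trp).
FAMILIES = {}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES.setdefault(aa, []).append(codon)
FAMILIES = {aa: cs for aa, cs in FAMILIES.items() if len(cs) > 1}

# Fixed feature order: the 59 codons that have at least one synonym.
FEATURE_CODONS = sorted(c for cs in FAMILIES.values() for c in cs)


def rscu_vector(dna):
    """Return the 59-element RSCU vector of an in-frame DNA coding sequence."""
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)
    rscu = {}
    for family in FAMILIES.values():
        total = sum(counts[c] for c in family)
        for c in family:
            # Illustrative choice: families absent from the sequence get 0.0.
            rscu[c] = counts[c] * len(family) / total if total else 0.0
    return [rscu[c] for c in FEATURE_CODONS]
```

Because every sequence maps to the same 59 positions regardless of its length, the resulting vectors can be passed directly to a classifier.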
Cardiovascular Response Identification Based on Nonlinear Support Vector Regression
NASA Astrophysics Data System (ADS)
Wang, Lu; Su, Steven W.; Chan, Gregory S. H.; Celler, Branko G.; Cheng, Teddy M.; Savkin, Andrey V.
This study experimentally investigates the relationships between central cardiovascular variables and oxygen uptake based on nonlinear analysis and modeling. Ten healthy subjects were studied using cycle-ergometry exercise tests with constant workloads ranging from 25 Watt to 125 Watt. Breath by breath gas exchange, heart rate, cardiac output, stroke volume and blood pressure were measured at each stage. The modeling results proved that the nonlinear modeling method (Support Vector Regression) outperforms traditional regression method (reducing Estimation Error between 59% and 80%, reducing Testing Error between 53% and 72%) and is the ideal approach in the modeling of physiological data, especially with small training data set.
Support vector machine classifiers for large data sets.
Gertz, E. M.; Griffin, J. D.
2006-01-31
This report concerns the generation of support vector machine classifiers for solving the pattern recognition problem in machine learning. Several methods are proposed based on interior point methods for convex quadratic programming. Software implementations are developed by adapting the object-oriented packaging OOQP to the problem structure and by using the software package PETSc to perform time-intensive computations in a distributed setting. Linear systems arising from classification problems with moderately large numbers of features are solved by using two techniques--one a parallel direct solver, the other a Krylov-subspace method incorporating novel preconditioning strategies. Numerical results are provided, and computational experience is discussed.
Classification of EEG signals using a multiple kernel learning support vector machine.
Li, Xiaoou; Chen, Xun; Yan, Yuning; Wei, Wenshi; Wang, Z Jane
2014-07-17
In this study, a multiple kernel learning support vector machine algorithm is proposed for the identification of EEG signals including mental and cognitive tasks, which is a key component in EEG-based brain computer interface (BCI) systems. The presented BCI approach included three stages: (1) a pre-processing step was performed to improve the general signal quality of the EEG; (2) the features were chosen, including wavelet packet entropy and Granger causality, respectively; (3) a multiple kernel learning support vector machine (MKL-SVM) based on a gradient descent optimization algorithm was investigated to classify EEG signals, in which the kernel was defined as a linear combination of polynomial kernels and radial basis function kernels. Experimental results showed that the proposed method provided better classification performance compared with the SVM based on a single kernel. For mental tasks, the average accuracies for 2-class, 3-class, 4-class, and 5-class classifications were 99.20%, 81.25%, 76.76%, and 75.25% respectively. Comparing stroke patients with healthy controls using the proposed algorithm, we achieved the average classification accuracies of 89.24% and 80.33% for 0-back and 1-back tasks respectively. Our results indicate that the proposed approach is promising for implementing human-computer interaction (HCI), especially for mental task classification and identifying suitable brain impairment candidates.
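The kernel described above, a linear combination of polynomial and radial basis function kernels, can be sketched as follows. This is only an illustration under stated assumptions: the combination weights are fixed by hand here, whereas the study learns them by gradient descent, and the function names, degree, `coef0` and `gamma` values are hypothetical defaults rather than settings from the paper.

```python
import math


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def combined_kernel(x, y, weights=(0.5, 0.5)):
    """Weighted sum of the two base kernels; non-negative weights keep it a valid kernel."""
    w_poly, w_rbf = weights
    return w_poly * poly_kernel(x, y) + w_rbf * rbf_kernel(x, y)


def gram_matrix(samples, weights=(0.5, 0.5)):
    """Pairwise kernel matrix, the form a kernel SVM solver would consume."""
    return [[combined_kernel(a, b, weights) for b in samples] for a in samples]
```

A precomputed matrix of this kind is one common way to hand a custom kernel to an SVM solver; learning the weights would add an outer optimization loop that is not shown.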
Fuzzy Nonlinear Proximal Support Vector Machine for Land Extraction Based on Remote Sensing Image
Zhong, Xiaomei; Li, Jianping; Dou, Huacheng; Deng, Shijun; Wang, Guofei; Jiang, Yu; Wang, Yongjie; Zhou, Zebing; Wang, Li; Yan, Fei
2013-01-01
Currently, remote sensing technologies are widely employed in the dynamic monitoring of the land. This paper presented an algorithm named fuzzy nonlinear proximal support vector machine (FNPSVM) based on ETM+ remote sensing images. This algorithm is applied to extract various types of lands of the city Da’an in northern China. Two multi-category strategies, namely “one-against-one” and “one-against-rest” for this algorithm were described in detail and then compared. A fuzzy membership function was presented to reduce the effects of noise or outliers on the data samples. The approaches of feature extraction, feature selection, and several key parameter settings were also given. Numerous experiments were carried out to evaluate its performance, including various accuracies (overall accuracies and kappa coefficient), stability, training speed, and classification speed. The FNPSVM classifier was compared to three other classifiers, namely the maximum likelihood classifier (MLC), back propagation neural network (BPN), and the proximal support vector machine (PSVM), under different training conditions. The impacts of the selection of training samples, testing samples and features on the four classifiers were also evaluated in these experiments. PMID:23936016
Agricultural mapping using Support Vector Machine-Based Endmember Extraction (SVM-BEE)
Archibald, Richard K; Filippi, Anthony M; Bhaduri, Budhendra L; Bright, Eddie A
2009-01-01
Extracting endmembers from remotely sensed images of vegetated areas can present difficulties. In this research, we applied a recently developed endmember-extraction algorithm based on Support Vector Machines (SVMs) to the problem of semi-autonomous estimation of vegetation endmembers from a hyperspectral image. This algorithm, referred to as Support Vector Machine-Based Endmember Extraction (SVM-BEE), accurately and rapidly yields a computed representation of hyperspectral data that can accommodate multiple distributions. The number of distributions is identified without prior knowledge, based upon this representation. Prior work established that SVM-BEE is robustly noise-tolerant and can semi-automatically and effectively estimate endmembers; synthetic data and a geologic scene were previously analyzed. Here we compared the efficacies of the SVM-BEE and N-FINDR algorithms in extracting endmembers from a predominantly agricultural scene. SVM-BEE was able to estimate vegetation and other endmembers for all classes in the image, which N-FINDR failed to do. Classifications based on SVM-BEE endmembers were markedly more accurate compared with those based on N-FINDR endmembers.
Predictive search algorithm for vector quantization of images
NASA Astrophysics Data System (ADS)
Kuo, Chung-Ming; Hsieh, Chaur-Heh; Weng, Shiuh-Ku
2002-05-01
We present a fast predictive search algorithm for vector quantization (VQ) based on a wavelet transform and weighted average Kalman filter (WAKF). With the proposed algorithm, the minimum distortion code word can be found by searching only a portion of the wavelet transformed code book. If the minimum distortion code word found falls within a predicted search area obtained by the WAKF algorithm, the relative address that is shorter than the absolute address for a full search range is sent to the decoder. Simulation results indicate that the proposed algorithm achieves a significant reduction in computations and about a 30% bit-rate reduction, as compared to conventional full search VQs. In addition, the reconstructed quality is equivalent to that of the full search algorithm.
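For orientation, the conventional full search that the abstract uses as its baseline can be sketched as below, together with the address-length arithmetic behind the bit-rate saving. This is not the proposed predictive method: the wavelet transform and the WAKF prediction are omitted, the squared-error distortion is an assumption, and the names `full_search` and `address_bits` are hypothetical.

```python
def full_search(vector, codebook):
    """Return (index, distortion) of the minimum squared-error code word."""
    best_index, best_dist = -1, float("inf")
    for i, code_word in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(vector, code_word))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index, best_dist


def address_bits(num_entries):
    """Bits needed for a fixed-length address over num_entries code words."""
    return max(1, (num_entries - 1).bit_length())
```

With a hypothetical 256-entry code book an absolute address costs `address_bits(256)` = 8 bits, while an address relative to a 16-entry predicted search area costs `address_bits(16)` = 4 bits, which is the kind of saving a relative address offers whenever the prediction succeeds.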
Support vector machine multiuser receiver for DS-CDMA signals in multipath channels.
Chen, S; Samingan, A K; Hanzo, L
2001-01-01
The problem of constructing an adaptive multiuser detector (MUD) is considered for direct sequence code division multiple access (DS-CDMA) signals transmitted through multipath channels. The emerging learning technique, called support vector machines (SVM), is proposed as a method of obtaining a nonlinear MUD from a relatively small training data block. Computer simulation is used to study this SVM MUD, and the results show that it can closely match the performance of the optimal Bayesian one-shot detector. Comparisons with an adaptive radial basis function (RBF) MUD trained by an unsupervised clustering algorithm are discussed.
Linear and nonlinear structural identifications using the support vector regression
NASA Astrophysics Data System (ADS)
Zhang, Jian; Sato, Tadanobu
2006-03-01
Robust and efficient identification methods are needed in the structural health monitoring field, especially when the I/O data are accompanied by high-level noise and the structure studied is a large-scale one. Support Vector Regression (SVR) is a promising nonlinear modeling method that has been found to work very well in many fields and has strong potential for application in system identification. SVR-based methods are provided in this article for linear large-scale structural identification and nonlinear hysteretic structural identification. The LS estimator is a cornerstone of statistics but is less robust to outliers. Instead of the classical Gaussian loss function without regularization used in the LS method, a novel ε-insensitive loss function is employed in the SVR. Meanwhile, the SVR adopts the 'max-margined' idea to search for an optimum hyper-plane separating the training data into two subsets by maximizing the margin between them. Therefore, the SVR-based structural identification approach is robust and accurate even though the observation data involve noise of different kinds and high levels. By means of the local strategy, the linear large-scale structural identification approach based on the SVR is first investigated. The novel SVR can identify structural parameters directly by writing structural observation equations as linear equations with respect to the unknown structural parameters. Furthermore, the substructural idea employed substantially reduces the number of unknown parameters, so that the SVR works in a low dimension and the identification can focus on an arbitrary local subsystem. It is also crucial to perform nonlinear structural identification, because structures exhibit highly nonlinear behavior under severe loads such as strong seismic excitations. The Bouc-Wen model is often utilized to describe structural nonlinear properties; the power parameter of the model, however, is often assumed as known even though it is unknown
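The contrast drawn above between the least-squares loss and the insensitive loss used in SVR can be made concrete with a small sketch. The tube width `epsilon=0.1` and the function names are illustrative assumptions, not values from the article.

```python
def epsilon_insensitive_loss(residual, epsilon=0.1):
    """Zero inside the epsilon tube, linear in |residual| outside it."""
    return max(0.0, abs(residual) - epsilon)


def squared_loss(residual):
    """Classical least-squares loss, quadratic in the residual."""
    return residual ** 2


def total_losses(residuals, epsilon=0.1):
    """Sum both losses over a list of residuals for side-by-side comparison."""
    insensitive = sum(epsilon_insensitive_loss(r, epsilon) for r in residuals)
    squared = sum(squared_loss(r) for r in residuals)
    return insensitive, squared
```

For the residuals 0.05, -0.2 and 5.0, the last of which plays the role of an outlier, the insensitive loss grows only linearly with the outlier while the squared loss grows quadratically, which is one way to see why the LS estimator is described as less robust.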
Novel cascade FPGA accelerator for support vector machines classification.
Papadonikolakis, Markos; Bouganis, Christos-Savvas
2012-07-01
Support vector machines (SVMs) are a powerful machine learning tool, providing state-of-the-art accuracy to many classification problems. However, SVM classification is a computationally complex task, suffering from linear dependencies on the number of the support vectors and the problem's dimensionality. This paper presents a fully scalable field programmable gate array (FPGA) architecture for the acceleration of SVM classification, which exploits the device heterogeneity and the dynamic range diversities among the dataset attributes. An adaptive and fully-customized processing unit is proposed, which utilizes the available heterogeneous resources of a modern FPGA device in efficient way with respect to the problem's characteristics. The implementation results demonstrate the efficiency of the heterogeneous architecture, presenting a speed-up factor of 2-3 orders of magnitude, compared to the CPU implementation. The proposed architecture outperforms other proposed FPGA and graphic processor unit approaches by more than seven times. Furthermore, based on the special properties of the heterogeneous architecture, this paper introduces the first FPGA-oriented cascade SVM classifier scheme, which exploits the FPGA reconfigurability and intensifies the custom-arithmetic properties of the heterogeneous architecture. The results show that the proposed cascade scheme is able to increase the heterogeneous classifier throughput even further, without introducing any penalty on the resource utilization.
Parameter estimation for support vector anomaly detection in hyperspectral imagery
NASA Astrophysics Data System (ADS)
Meth, Reuven; Ahn, James; Banerjee, Amit; Juang, Radford; Burlina, Philippe
2012-06-01
Hyperspectral Image (HSI) anomaly detectors typically employ local background modeling techniques to facilitate target detection from surrounding clutter. Global background modeling has been challenging due to the multi-modal content that must be automatically modeled to enable target/background separation. We have previously developed a support vector based anomaly detector that does not impose an a priori parametric model on the data and enables multi-modal modeling of large background regions with inhomogeneous content. Effective application of this support vector approach requires the setting of a kernel parameter that controls the tightness of the model fit to the background data. Estimation of the kernel parameter has typically considered Type I / false-positive error optimization due to the availability of background samples, but this approach has not proven effective for general application since these methods only control the false alarm level, without any optimization for maximizing detection. Parameter optimization with respect to Type II / false-negative error has remained elusive due to the lack of sufficient target training exemplars. We present an approach that optimizes parameter selection based on both Type I and Type II error criteria by introducing outliers based on existing hypercube content to guide parameter estimation. The approach has been applied to hyperspectral imagery and has demonstrated automatic estimation of parameters consistent with those that were found to be optimal, thereby providing an automated method for general anomaly detection applications.
A vector-multiplication dominated parallel algorithm for the computation of real eigenvalue spectra
NASA Astrophysics Data System (ADS)
Clint, M.
1982-06-01
In order to exploit effectively the power of array and vector processors for the numerical solution of linear algebraic problems it is desirable to express algorithms principally in terms of vector and matrix operations. Algorithms which manipulate vectors and matrices at component level are best suited for execution on single processor hardware. Often, however, it is difficult, if not impossible, to construct efficient versions of such algorithms which are suitable for execution on parallel hardware. A method for computing the eigenvalues of real unsymmetric matrices with real eigenvalue spectra is presented. The method is an extension of the one described in ref. [1]. The algorithm makes heavy use of vector inner product evaluations. The manipulation of individual components of vectors and matrices is kept to a minimum. Essentially, the method involves the construction of a sequence of biorthogonal transformation matrices the combined effect of which is to diagonalise the matrix. The eigenvalues of the matrix are the diagonal elements of the final diagonalised form. If the eigenvectors of the matrix are also required the algorithm may be extended in a straightforward way. The effectiveness of the algorithm is demonstrated by an application of a sequential version to several small matrices and some comments are made about the time complexity of the parallel version.
Algorithms for solving large sparse systems of simultaneous linear equations on vector processors
NASA Technical Reports Server (NTRS)
David, R. E.
1984-01-01
Very efficient algorithms for solving large sparse systems of simultaneous linear equations have been developed for serial processing computers. These involve a reordering of matrix rows and columns in order to obtain a near triangular pattern of nonzero elements. Then an LU factorization is developed to represent the matrix inverse in terms of a sequence of elementary Gaussian eliminations, or pivots. In this paper it is shown how these algorithms are adapted for efficient implementation on vector processors. Results obtained on the CYBER 200 Model 205 are presented for a series of large test problems which show the comparative advantages of the triangularization and vector processing algorithms.
NASA Astrophysics Data System (ADS)
Na'imi, S. R.; Shadizadeh, S. R.; Riahi, M. A.; Mirzakhanian, M.
2014-08-01
Porosity and fluid saturation distributions are crucial properties of hydrocarbon reservoirs and are involved in almost all calculations related to reservoir and production. True values of these parameters, derived from laboratory measurements, are only available at isolated localities of a reservoir and are also expensive and time-consuming to obtain. Therefore, other methodologies that are robust, simple, and inexpensive are needed. Support Vector Regression is a relatively novel method for function estimation in regression problems. Contrary to conventional neural networks, which minimize the error on the training data by the usual Empirical Risk Minimization principle, Support Vector Regression minimizes an upper bound on the expected risk by means of the Structural Risk Minimization principle. This difference, which reflects the goal of statistical learning, gives the approach a greater ability to generalize. In this study, first, appropriate seismic attributes which have an underlying dependency with reservoir porosity and water saturation are extracted. Subsequently, a non-linear support vector regression algorithm is utilized to obtain a quantitative formulation between the porosity and water saturation parameters and the selected seismic attributes. For an undrilled reservoir, for which sufficient core and log data are not available, it is reasonably possible to characterize the hydrocarbon bearing formation by means of this method.
Robust Support Vector Machines for Classification with Nonconvex and Smooth Losses.
Feng, Yunlong; Yang, Yuning; Huang, Xiaolin; Mehrkanoon, Siamak; Suykens, Johan A K
2016-06-01
This letter addresses the robustness problem when learning a large margin classifier in the presence of label noise. In our study, we achieve this purpose by proposing robustified large margin support vector machines. The robustness of the proposed robust support vector classifiers (RSVC), which is interpreted from a weighted viewpoint in this work, is due to the use of nonconvex classification losses. Besides the robustness, we also show that the proposed RSVC is simultaneously smooth, which again benefits from using smooth classification losses. The idea of proposing RSVC comes from M-estimation in statistics since the proposed robust and smooth classification losses can be taken as one-sided cost functions in robust statistics. Its Fisher consistency property and generalization ability are also investigated. Besides the robustness and smoothness, another nice property of RSVC lies in the fact that its solution can be obtained by solving weighted squared hinge loss-based support vector machine problems iteratively. We further show that in each iteration, it is a quadratic programming problem in its dual space and can be solved by using state-of-the-art methods. We thus propose an iteratively reweighted type algorithm and provide a constructive proof of its convergence to a stationary point. Effectiveness of the proposed classifiers is verified on both artificial and real data sets. PMID:27137357
Deriving statistical significance maps for support vector regression using medical imaging data.
Gaonkar, Bilwaj; Sotiras, Aristeidis; Davatzikos, Christos
2013-01-01
Regression analysis involves predicting a continuous variable using imaging data. The Support Vector Regression (SVR) algorithm has previously been used in addressing regression analysis in neuroimaging. However, identifying the regions of the image that the SVR uses to model the dependence of a target variable remains an open problem. It is an important issue when one wants to biologically interpret the meaning of a pattern that predicts the variable(s) of interest, and therefore to understand normal or pathological process. One possible approach to the identification of these regions is the use of permutation testing. Permutation testing involves 1) generation of a large set of 'null SVR models' using randomly permuted sets of target variables, and 2) comparison of the SVR model trained using the original labels to the set of null models. These permutation tests often require prohibitively long computational time. Recent work in support vector classification shows that it is possible to analytically approximate the results of permutation testing in medical image analysis. We propose an analogous approach to approximate permutation testing based analysis for support vector regression with medical imaging data. In this paper we present 1) the theory behind our approximation, and 2) experimental results using two real datasets.
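The two-step permutation procedure described above (build null models from permuted targets, then compare the original model against them) can be sketched in a few lines. As a loudly labeled simplification, the per-feature statistic here is the covariance between each feature and the target, a stand-in for the SVR weight map that keeps the example dependency-free; the function names, the number of permutations and the seed are hypothetical, and this is the brute-force procedure, not the analytic approximation the paper proposes.

```python
import random


def feature_scores(X, y):
    """Stand-in for model weights: covariance of each feature column with y."""
    n = len(y)
    y_mean = sum(y) / n
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        scores.append(sum((c - c_mean) * (t - y_mean) for c, t in zip(col, y)) / n)
    return scores


def permutation_p_values(X, y, n_perm=200, seed=0):
    """Two-sided permutation p-value for each feature score."""
    rng = random.Random(seed)
    observed = feature_scores(X, y)
    exceed = [0] * len(observed)
    for _ in range(n_perm):
        y_perm = list(y)
        rng.shuffle(y_perm)                # step 1: one null model per permutation
        null = feature_scores(X, y_perm)
        for j, (obs, perm) in enumerate(zip(observed, null)):
            if abs(perm) >= abs(obs):      # step 2: compare original to null
                exceed[j] += 1
    # Add-one correction so that no p-value is exactly zero.
    return [(1 + e) / (1 + n_perm) for e in exceed]
```

The cost is one model fit per permutation for every feature map, which illustrates why such tests become prohibitively slow when each fit is a full SVR on imaging data.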
Protein Kinase Classification with 2866 Hidden Markov Models and One Support Vector Machine
NASA Technical Reports Server (NTRS)
Weber, Ryan; New, Michael H.; Fonda, Mark (Technical Monitor)
2002-01-01
The main application considered in this paper is predicting true kinases from randomly permuted kinases that share the same length and amino acid distributions as the true kinases. Numerous methods already exist for this classification task, such as HMMs, motif-matchers, and sequence comparison algorithms. We build on some of these efforts by creating a vector from the output of thousands of structurally based HMMs, created offline with Pfam-A seed alignments using SAM-T99, which then must be combined into an overall classification for the protein. Then we use a Support Vector Machine for classifying this large ensemble Pfam-Vector, with a polynomial and a chi-squared kernel. In particular, the chi-squared kernel SVM performs better than the HMMs and better than the BLAST pairwise comparisons in some respects when predicting true from false kinases, but no one algorithm is best for all purposes or in all instances, so we consider the particular strengths and weaknesses of each.
Horvath, Dragos; Marcou, Gilles; Varnek, Alexandre; Kayastha, Shilva; de la Vega de León, Antonio; Bajorath, Jürgen
2016-09-26
Activity cliffs (ACs) are formed by structurally similar compounds with large differences in activity. Accordingly, ACs are of high interest for the exploration of structure-activity relationships (SARs). ACs reveal small chemical modifications that result in profound biological effects. The ability to foresee such small chemical changes with significant biological consequences would represent a major advance for drug design. Nevertheless, only few attempts have been made so far to predict whether a pair of analogues is likely to represent an AC-and even fewer went further to quantitatively predict how "deep" a cliff might be. This might be due to the fact that such predictions must focus on compound pairs. Matched molecular pairs (MMPs), defined as pairs of structural analogs that are only distinguished by a chemical modification at a single site, are a preferred representation of ACs. Herein, we report new strategies for AC prediction that are based upon two different approaches: (i) condensed graphs of reactions, which were originally introduced for modeling of chemical reactions and were here adapted to encode MMPs, and, (ii) plain descriptor recombination-a strategy used for quantitative structure-property relationship (QSPR) modeling of nonadditive mixtures (MQSPR). By applying these concepts, ACs were encoded as single descriptor vectors used as input for support vector machine (SVM) classification and support vector regression (SVR), yielding accurate predictions of AC status (i.e., cliff vs noncliff) and potency differences, respectively. The latter were predicted in a compound order-sensitive manner returning the signed value of expected potency differences between AC compounds. PMID:27564682
NASA Astrophysics Data System (ADS)
Wood, Joshua; Wilson, Joseph
2011-06-01
In using GPR images for landmine detection it is often useful to identify the air-ground interface in the GPR signal for alignment purposes. A common simple technique for doing this is to assume that the highest return in an A-scan is from the reflection due to the ground and to use that as the location of the interface. However, there are many situations, such as the presence of noise clutter or shallow sub-surface objects, that can cause the global maximum estimate to be incorrect. A Support Vector Data Description (SVDD) is a one-class classifier related to the SVM which encloses the class in a hyper-sphere as opposed to using a hyper-plane as a decision boundary. We apply SVDD to the problem of detection of the air-ground interface by treating each sample in an A-scan, with some number of leading and trailing samples, as a feature vector. Training is done using a set of feature vectors based on known interfaces and detection is done by creating feature vectors from each of the samples in an A-scan, applying the trained SVDD to them and selecting the one with the least distance from the center of the hyper-sphere. We compare this approach with the global maximum approach, examining both the performance on human truthed data and how each method affects false alarm and true positive rates when used as the alignment method in mine detection algorithms.
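The windowing and selection steps described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: no SVDD is trained here, the `center` vector is an assumed stand-in for a learned hypersphere center in input space, and the A-scan values, function names, and window sizes are all hypothetical.

```python
def window_features(a_scan, lead, trail):
    """Build one feature vector per sample: the sample itself plus `lead`
    samples before it and `trail` samples after it. Positions that fall
    outside the A-scan are padded by repeating the nearest edge value."""
    n = len(a_scan)
    return [
        [a_scan[min(max(j, 0), n - 1)] for j in range(i - lead, i + trail + 1)]
        for i in range(n)
    ]


def global_max_interface(a_scan):
    """Baseline: take the index of the strongest return as the interface."""
    return max(range(len(a_scan)), key=lambda i: a_scan[i])


def nearest_to_center(a_scan, center, lead, trail):
    """Pick the sample whose feature vector is closest to `center`."""
    def sq_dist(vec):
        return sum((x - c) ** 2 for x, c in zip(vec, center))

    feats = window_features(a_scan, lead, trail)
    return min(range(len(feats)), key=lambda i: sq_dist(feats[i]))


# Hypothetical A-scan: ground reflection at index 2, stronger clutter at index 5.
a_scan = [0.0, 0.1, 0.9, 0.2, 0.0, 1.5, 0.1]
center = [0.1, 0.9, 0.2]  # assumed stand-in for a trained hypersphere center

baseline_index = global_max_interface(a_scan)
window_index = nearest_to_center(a_scan, center, lead=1, trail=1)
```

In this toy case the global maximum lands on the clutter return, while the window-based selection picks the sample whose neighborhood resembles the assumed interface shape.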
Automated identification of biomedical article type using support vector machines
NASA Astrophysics Data System (ADS)
Kim, In Cheol; Le, Daniel X.; Thoma, George R.
2011-01-01
Authors of short papers such as letters or editorials often express complementary opinions, and sometimes contradictory ones, on related work in previously published articles. The MEDLINE® citations for such short papers are required to list bibliographic data on these "commented on" articles in a "CON" field. The challenge is to automatically identify the CON articles referred to by the author of the short paper (called "Comment-in" or CIN paper). Our approach is to use support vector machines (SVM) to first classify a paper as either a CIN or a regular full-length article (which is exempt from this requirement), and then to extract from the CIN paper the bibliographic data of the CON articles. A solution to the first part of the problem, identifying CIN articles, is addressed here. We implement and compare the performance of two types of SVM, one with a linear kernel function and the other with a radial basis kernel function (RBF). Input feature vectors for the SVMs are created by combining four types of features based on statistics of words in the article title, words that suggest the article type (letter, correspondence, editorial), size of body text, and cue phrases. Experiments conducted on a set of online biomedical articles show that the SVM with a linear kernel function yields a significantly lower false negative error rate than the one with an RBF. Our experiments also show that the SVM with a linear kernel function achieves a significantly higher level of accuracy, and lower false positive and false negative error rates by using input feature vectors created by combining all four types of features rather than any single type.
Cui, Song; Youn, Eunseog; Lee, Joohyun; Maas, Stephan J
2014-01-01
Biological prediction of transcription factor binding sites and their corresponding transcription factor target genes (TFTGs) makes a great contribution to understanding gene regulatory networks. However, these approaches are based on laborious and time-consuming biological experiments. Numerous computational approaches have shown great potential to circumvent laborious biological methods. However, the majority of these algorithms provide limited performance and fail to consider the structural property of the datasets. We proposed a refined systematic computational approach for predicting TFTGs. Based on previous work done on identifying auxin response factor target genes from Arabidopsis thaliana co-expression data, we adopted a novel reverse-complementary distance-sensitive n-gram profile algorithm. This algorithm converts each upstream sub-sequence into a high-dimensional vector data point and transforms the prediction task into a classification problem using a support vector machine-based classifier. Our approach showed significant improvement compared to other computational methods based on the area under curve value of the receiver operating characteristic curve using 10-fold cross validation. In addition, in light of the highly skewed structure of the dataset, we also evaluated other metrics and their associated curves, such as precision-recall curves and cost curves, which provided highly satisfactory results.
Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L
2012-08-01
Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.
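The area under the receiver operating characteristic curve quoted above can be read as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. The sketch below computes it that way; the scores and labels are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for five farms (1 = high prevalence).
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)  # 5 of 6 positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.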
Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.
del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano
2015-06-17
Drawing on the results of an acoustic biometric system based on an MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation (based on a Gaussian Mixture Model (GMM) to separate the person from the background), masking (to reduce the dimensions of images) and binarization (to reduce the size of each image). An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements.
Support vector machines for spike pattern classification with a leaky integrate-and-fire neuron
Ambard, Maxime; Rotter, Stefan
2012-01-01
Spike pattern classification is a key topic in machine learning, computational neuroscience, and electronic device design. Here, we offer a new supervised learning rule based on Support Vector Machines (SVM) to determine the synaptic weights of a leaky integrate-and-fire (LIF) neuron model for spike pattern classification. We compare classification performance between this algorithm and other methods sharing the same conceptual framework. We consider the effect of postsynaptic potential (PSP) kernel dynamics on patterns separability, and we propose an extension of the method to decrease computational load. The algorithm performs well in generalization tasks. We show that the peak value of spike patterns separability depends on a relation between PSP dynamics and spike pattern duration, and we propose a particular kernel that is well-suited for fast computations and electronic implementations. PMID:23181017
Halder, Sebastian; Bensch, Michael; Mellinger, Jürgen; Bogdan, Martin; Kübler, Andrea; Birbaumer, Niels; Rosenstiel, Wolfgang
2007-01-01
We propose a combination of blind source separation (BSS) and independent component analysis (ICA) (signal decomposition into artifacts and nonartifacts) with support vector machines (SVMs) (automatic classification) that are designed for online usage. In order to select a suitable BSS/ICA method, three ICA algorithms (JADE, Infomax, and FastICA) and one BSS algorithm (AMUSE) are evaluated to determine their ability to isolate electromyographic (EMG) and electrooculographic (EOG) artifacts into individual components. An implementation of the selected BSS/ICA method with SVMs trained to classify EMG and EOG artifacts, which enables the usage of the method as a filter in measurements with online feedback, is described. This filter is evaluated on three BCI datasets as a proof-of-concept of the method. PMID:18288259
Manivannan, K; Aggarwal, P; Devabhaktuni, V; Kumar, A; Nims, D; Bhattacharya, P
2012-07-15
An efficient and highly reliable automatic selection of optimal segmentation algorithm for characterizing particulate matter is presented in this paper. Support vector machines (SVMs) are used as a new self-regulating classifier trained by gray level co-occurrence matrix (GLCM) of the image. This matrix is calculated at various angles and the texture features are evaluated for classifying the images. Results show that the performance of GLCM-based SVMs is drastically improved over the previous histogram-based SVMs. Our proposed GLCM-based approach of training SVM predicts a robust and more accurate segmentation algorithm than the standard histogram technique, as additional information based on the spatial relationship between pixels is incorporated for image classification. Further, the GLCM-based SVM classifiers were more accurate and required less training data when compared to the artificial neural network (ANN) classifiers. PMID:22595545
Wang, Shuihua; Chen, Mengmeng; Li, Yang; Shao, Ying; Zhang, Yudong
2016-01-01
Dendritic spines are described as neuronal protrusions. The morphology of dendritic spines and dendrites has a strong relationship to their function, as well as playing an important role in understanding brain function. Quantitative analysis of dendrites and dendritic spines is essential to an understanding of the formation and function of the nervous system. However, highly efficient tools for the quantitative analysis of dendrites and dendritic spines are currently undeveloped. In this paper we propose a novel three-step cascaded algorithm, RTSVM, which is composed of ridge detection as the curvature structure identifier for backbone extraction, boundary location based on differences in density, the Hu moment as features and Twin Support Vector Machine (TSVM) classifiers for spine classification. Our data demonstrate that this newly developed algorithm performs better than other available techniques in terms of detection accuracy and false alarm rates. This algorithm will be used effectively in neuroscience research. PMID:27547530
Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines
del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J.; Raboso, Mariano
2015-01-01
Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation—based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking—to reduce the dimensions of images—and binarization—to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392
Yoo, Jaehyun; Kim, H Jin
2015-01-01
Machine learning has been successfully used for target localization in wireless sensor networks (WSNs) due to its accurate and robust estimation against highly nonlinear and noisy sensor measurement. For efficient and adaptive learning, this paper introduces online semi-supervised support vector regression (OSS-SVR). The first advantage of the proposed algorithm is that, based on semi-supervised learning framework, it can reduce the requirement on the amount of the labeled training data, maintaining accurate estimation. Second, with an extension to online learning, the proposed OSS-SVR automatically tracks changes of the system to be learned, such as varied noise characteristics. We compare the proposed algorithm with semi-supervised manifold learning, an online Gaussian process and online semi-supervised colocalization. The algorithms are evaluated for estimating the unknown location of a mobile robot in a WSN. The experimental results show that the proposed algorithm is more accurate under the smaller amount of labeled training data and is robust to varying noise. Moreover, the suggested algorithm performs fast computation, maintaining the best localization performance in comparison with the other methods. PMID:26024420
Modeling node bandwidth limits and their effects on vector combining algorithms
Littlefield, R.J.
1992-01-13
Each node in a message-passing multicomputer typically has several communication links. However, the maximum aggregate communication speed of a node is often less than the sum of its individual link speeds. Such computers are called node bandwidth limited (NBL). The NBL constraint is important when choosing algorithms because it can change the relative performance of different algorithms that accomplish the same task. This paper introduces a model of communication performance for NBL computers and uses the model to analyze the overall performance of three algorithms for vector combining (global sum) on the Intel Touchstone DELTA computer. Each of the three algorithms is found to be at least 33% faster than the other two for some combinations of machine size and vector length. The NBL constraint is shown to significantly affect the conditions under which each algorithm is fastest.
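The node bandwidth limit described above can be illustrated with a toy calculation. This is only a sketch of the constraint itself, not the performance model developed in the paper; the function name and all numbers are hypothetical.

```python
def effective_node_bandwidth(link_speed, links_in_use, node_limit):
    """Aggregate rate a node can sustain when driving several links at once.

    Without a node limit the rate would be link_speed * links_in_use; a node
    bandwidth limited (NBL) machine caps it at node_limit instead.
    """
    if links_in_use < 0:
        raise ValueError("links_in_use must be non-negative")
    return min(link_speed * links_in_use, node_limit)


# Hypothetical node: 4 links at 10 units each, but only 25 units in aggregate.
two_links = effective_node_bandwidth(10.0, 2, 25.0)   # below the cap
four_links = effective_node_bandwidth(10.0, 4, 25.0)  # capped by the node limit
```

Under this cap, an algorithm that keeps all four links busy gains less than the link count suggests, which is one way the relative ranking of algorithms can shift.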
Karayiannis, Nicolaos B; Randolph-Gips, Mary M
2005-03-01
This paper presents the development of soft clustering and learning vector quantization (LVQ) algorithms that rely on a weighted norm to measure the distance between the feature vectors and their prototypes. The development of LVQ and clustering algorithms is based on the minimization of a reformulation function under the constraint that the generalized mean of the norm weights be constant. According to the proposed formulation, the norm weights can be computed from the data in an iterative fashion together with the prototypes. An error analysis provides some guidelines for selecting the parameter involved in the definition of the generalized mean in terms of the feature variances. The algorithms produced from this formulation are easy to implement and they are almost as fast as clustering algorithms relying on the Euclidean norm. An experimental evaluation on four data sets indicates that the proposed algorithms outperform consistently clustering algorithms relying on the Euclidean norm and they are strong competitors to non-Euclidean algorithms which are computationally more demanding.
Hybrid Neural Network and Support Vector Machine Method for Optimization
NASA Technical Reports Server (NTRS)
Rai, Man Mohan (Inventor)
2007-01-01
System and method for optimization of a design associated with a response function, using a hybrid neural net and support vector machine (NN/SVM) analysis to minimize or maximize an objective function, optionally subject to one or more constraints. As a first example, the NN/SVM analysis is applied iteratively to design of an aerodynamic component, such as an airfoil shape, where the objective function measures deviation from a target pressure distribution on the perimeter of the aerodynamic component. As a second example, the NN/SVM analysis is applied to data classification of a sequence of data points in a multidimensional space. The NN/SVM analysis is also applied to data regression.
Improved Online Support Vector Machines Spam Filtering Using String Kernels
NASA Astrophysics Data System (ADS)
Amayri, Ola; Bouguila, Nizar
A major bottleneck in electronic communications is the enormous dissemination of spam emails. Developing suitable filters that can adequately capture those emails and achieve a high performance rate has become a main concern. Support vector machines (SVMs) have made a large contribution to the development of spam email filtering. Based on SVMs, the crucial problems in email classification are feature mapping of input emails and the choice of the kernels. In this paper, we present a thorough investigation of several distance-based kernels, propose the use of string kernels, and prove their efficiency in blocking spam emails. We detail feature mapping variants in text classification (TC) that yield improved performance for the standard SVMs in the filtering task. Furthermore, to cope with real-time scenarios, we propose an online active framework for spam filtering.
Multiclass primal support vector machines for breast density classification.
Land, Walker H; Verheggen, Elizabeth A
2009-01-01
Parenchymal patterns defining the density of breast tissue are detected by advanced correlation pattern recognition in an integrated Computer-Aided Detection (CAD) and diagnosis system. Fractal signatures of density are modelled according to four clinical categories. A Support Vector Machine (SVM) in the primal formulation solves the multiclass problem using 'One-Versus-All' (OVA) and 'All-Versus-All' (AVA) decompositions, achieving 85% and 94% accuracy, respectively. Fully automated classification of breast density via a texture model derived from fractal dimension, dispersion, and lacunarity moves current qualitative methods forward to objective quantitative measures, amenable with the overarching vision of substantiating the role of density in epidemiological risk models of breast cancer. PMID:20054985
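The two decompositions named above differ in how many binary problems they generate from a multiclass task. The sketch below only enumerates those binary problems for four categories; it trains no classifier, and the class labels are placeholders rather than the clinical categories used in the study.

```python
from itertools import combinations


def one_versus_all(classes):
    """One binary problem per class: that class against all remaining classes."""
    return [(c, [other for other in classes if other != c]) for c in classes]


def all_versus_all(classes):
    """One binary problem per unordered pair of classes."""
    return list(combinations(classes, 2))


# Placeholder labels for four density categories.
classes = ["D1", "D2", "D3", "D4"]
ova_tasks = one_versus_all(classes)   # K binary problems
ava_tasks = all_versus_all(classes)   # K * (K - 1) / 2 binary problems
```

For K = 4 this gives 4 one-versus-all problems and 6 all-versus-all problems; each all-versus-all problem sees only the samples of its two classes.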
Support vector regression for real-time flood stage forecasting
NASA Astrophysics Data System (ADS)
Yu, Pao-Shan; Chen, Shien-Tsung; Chang, I.-Fan
2006-09-01
Flood forecasting is an important non-structural approach for flood mitigation. The flood stage is chosen as the variable to be forecasted because it is practically useful in flood forecasting. The support vector machine, a novel artificial intelligence-based method developed from statistical learning theory, is adopted herein to establish a real-time stage forecasting model. The lags associated with the input variables are determined by applying the hydrological concept of the time of response, and a two-step grid search method is applied to find the optimal parameters, and thus overcome the difficulties in constructing the learning machine. Two structures of models used to perform multiple-hour-ahead stage forecasts are developed. Validation results from flood events in Lan-Yang River, Taiwan, revealed that the proposed models can effectively predict the flood stage forecasts one-to-six-hours ahead. Moreover, a sensitivity analysis was conducted on the lags associated with the input variables.
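A two-step grid search of the kind mentioned above can be sketched as a coarse pass over a wide, exponentially spaced grid followed by a finer pass around the best coarse point. The abstract does not say which parameters were searched or how the grids were spaced, so the parameter names (`C`, `gamma`), the base-2 spacing, and the stand-in error function below are all assumptions for illustration.

```python
import math
from itertools import product


def grid_search(score, c_values, gamma_values):
    """Return the (C, gamma) pair with the lowest score on the given grid."""
    return min(product(c_values, gamma_values), key=lambda p: score(*p))


def two_step_grid_search(score, coarse_exponents, refine_steps, base=2.0):
    """Coarse search on base**e, then a finer search around the coarse optimum."""
    coarse = [base ** e for e in coarse_exponents]
    c0, g0 = grid_search(score, coarse, coarse)
    fine_c = [c0 * base ** s for s in refine_steps]
    fine_g = [g0 * base ** s for s in refine_steps]
    return grid_search(score, fine_c, fine_g)


def toy_validation_error(c, gamma):
    """Stand-in for a cross-validated forecast error; minimum at C = 2**2.5
    and gamma = 2**-1.5. A real study would train and validate an SVR here."""
    return (math.log2(c) - 2.5) ** 2 + (math.log2(gamma) + 1.5) ** 2


best_c, best_gamma = two_step_grid_search(
    toy_validation_error,
    coarse_exponents=range(-4, 5, 2),          # 2**-4, 2**-2, ..., 2**4
    refine_steps=(-1.0, -0.5, 0.0, 0.5, 1.0),  # half-step refinement
)
```

The coarse pass costs 25 evaluations and the fine pass another 25, compared with 289 evaluations for a single grid at the fine resolution over the same range.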
Aero-Engine Condition Monitoring Based on Support Vector Machine
NASA Astrophysics Data System (ADS)
Zhang, Chunxiao; Wang, Nan
The maintenance and management of civil aero-engines require advanced monitoring approaches to estimate aero-engine performance and health in order to increase the life of the aero-engine and reduce maintenance costs. In this paper, we adopt a support vector machine (SVM) regression approach to monitor aero-engine health and condition by building monitoring models of the main aero-engine performance parameters (EGT, N1, N2 and FF). The accuracy of the nonlinear baseline models of the performance parameters is tested and the maximum relative error does not exceed ±0.3%, which meets the engineering requirements. The results show that SVM nonlinear regression is an effective method in aero-engine monitoring.
HYBRID NEURAL NETWORK AND SUPPORT VECTOR MACHINE METHOD FOR OPTIMIZATION
NASA Technical Reports Server (NTRS)
Rai, Man Mohan (Inventor)
2005-01-01
System and method for optimization of a design associated with a response function, using a hybrid neural net and support vector machine (NN/SVM) analysis to minimize or maximize an objective function, optionally subject to one or more constraints. As a first example, the NN/SVM analysis is applied iteratively to design of an aerodynamic component, such as an airfoil shape, where the objective function measures deviation from a target pressure distribution on the perimeter of the aerodynamic component. As a second example, the NN/SVM analysis is applied to data classification of a sequence of data points in a multidimensional space. The NN/SVM analysis is also applied to data regression.
Prediction of Pork Quality by Fuzzy Support Vector Machine Classifier
NASA Astrophysics Data System (ADS)
Zhang, Jianxi; Yu, Huaizhi; Wang, Jiamin
Existing objective methods to evaluate pork quality in general do not yield satisfactory results and their applications in the meat industry are limited. In this study, a fuzzy support vector machine (FSVM) method was developed to evaluate and predict pork quality rapidly and nondestructively. Firstly, the discrete wavelet transform (DWT) was used to eliminate the noise component in the original spectrum and the new spectrum was reconstructed. Then, because the characteristic variables remain correlated and contain some redundant information, principal component analysis (PCA) was carried out. Lastly, FSVM was developed to differentiate and classify pork samples into different quality grades using the features from PCA. Jackknife tests on the working datasets indicated that the prediction accuracies were higher than those of other methods.
Interpreting support vector machine models for multivariate group wise analysis in neuroimaging.
Gaonkar, Bilwaj; T Shinohara, Russell; Davatzikos, Christos
2015-08-01
Machine learning based classification algorithms like support vector machines (SVMs) have shown great promise for turning a high dimensional neuroimaging data into clinically useful decision criteria. However, tracing imaging based patterns that contribute significantly to classifier decisions remains an open problem. This is an issue of critical importance in imaging studies seeking to determine which anatomical or physiological imaging features contribute to the classifier's decision, thereby allowing users to critically evaluate the findings of such machine learning methods and to understand disease mechanisms. The majority of published work addresses the question of statistical inference for support vector classification using permutation tests based on SVM weight vectors. Such permutation testing ignores the SVM margin, which is critical in SVM theory. In this work we emphasize the use of a statistic that explicitly accounts for the SVM margin and show that the null distributions associated with this statistic are asymptotically normal. Further, our experiments show that this statistic is a lot less conservative as compared to weight based permutation tests and yet specific enough to tease out multivariate patterns in the data. Thus, we can better understand the multivariate patterns that the SVM uses for neuroimaging based classification.
Interpreting support vector machine models for multivariate group wise analysis in neuroimaging
Gaonkar, Bilwaj; Shinohara, Russell T; Davatzikos, Christos
2015-01-01
Machine learning based classification algorithms like support vector machines (SVMs) have shown great promise for turning a high dimensional neuroimaging data into clinically useful decision criteria. However, tracing imaging based patterns that contribute significantly to classifier decisions remains an open problem. This is an issue of critical importance in imaging studies seeking to determine which anatomical or physiological imaging features contribute to the classifier’s decision, thereby allowing users to critically evaluate the findings of such machine learning methods and to understand disease mechanisms. The majority of published work addresses the question of statistical inference for support vector classification using permutation tests based on SVM weight vectors. Such permutation testing ignores the SVM margin, which is critical in SVM theory. In this work we emphasize the use of a statistic that explicitly accounts for the SVM margin and show that the null distributions associated with this statistic are asymptotically normal. Further, our experiments show that this statistic is a lot less conservative as compared to weight based permutation tests and yet specific enough to tease out multivariate patterns in the data. Thus, we can better understand the multivariate patterns that the SVM uses for neuroimaging based classification. PMID:26210913
NASA Astrophysics Data System (ADS)
Yang, Chien-Chun; Nagarajan, Mahesh B.; Huber, Markus B.; Carballido-Gamio, Julio; Bauer, Jan S.; Baum, Thomas; Eckstein, Felix; Lochmüller, Eva-Maria; Link, Thomas M.; Wismüller, Axel
2014-03-01
Regional trabecular bone quality estimation for purposes of femoral bone strength prediction is important for improving the clinical assessment of osteoporotic fracture risk. In this study, we explore the ability of 3D Minkowski Functionals derived from multi-detector computed tomography (MDCT) images of proximal femur specimens in predicting their corresponding biomechanical strength. MDCT scans were acquired for 50 proximal femur specimens harvested from human cadavers. An automated volume of interest (VOI)-fitting algorithm was used to define a consistent volume in the femoral head of each specimen. In these VOIs, the trabecular bone micro-architecture was characterized by statistical moments of its BMD distribution and by topological features derived from Minkowski Functionals. A linear multiregression analysis and a support vector regression (SVR) algorithm with a linear kernel were used to predict the failure load (FL) from the feature sets; the predicted FL was compared to the true FL determined through biomechanical testing. The prediction performance was measured by the root mean square error (RMSE) for each feature set. The best prediction result was obtained from the Minkowski Functional surface used in combination with SVR, which had the lowest prediction error (RMSE = 0.939 ± 0.345) and which was significantly lower than mean BMD (RMSE = 1.075 ± 0.279, p<0.005). Our results indicate that the biomechanical strength prediction can be significantly improved in proximal femur specimens with Minkowski Functionals extracted from MDCT images used in conjunction with support vector regression.
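The root mean square error used above to compare feature sets is the square root of the mean squared difference between predicted and measured values. The sketch below shows the calculation on made-up numbers; they are not failure loads from the study.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical predicted and measured values (errors of 0, 0, 6 and 8).
predicted_fl = [4.0, 5.0, 12.0, 15.0]
measured_fl = [4.0, 5.0, 6.0, 7.0]
error = rmse(predicted_fl, measured_fl)  # sqrt((0 + 0 + 36 + 64) / 4)
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.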
Concurrent and vectorized mixed time, explicit nonlinear structural dynamics algorithms
NASA Technical Reports Server (NTRS)
Belytschko, Ted; Gilbertsen, Noreen
1987-01-01
A nonlinear structural dynamics program with an element library that exploits parallel processing is described. The aim is to exploit scheduling-allocation so that parallel processing and vectorization can effectively be treated in a general purpose program with explicit time integration and different time steps in different parts of the mesh. The program uses an element group scheme, which, as a by-product, also provides an automatic scheme for assigning different time steps to different parts of the mesh. The program has been tested on the Alliant FX/8; it shows a fivefold improvement in speed over compiler optimization.
NASA Astrophysics Data System (ADS)
Gao, Wei; Zhu, Linli; Wang, Kaiyun
2015-12-01
Ontology, a model of knowledge representation and storage, has had extensive applications in pharmaceutics, social science, chemistry and biology. In the age of “big data”, the constructed concepts are often represented as higher-dimensional data by scholars, and thus the sparse learning techniques are introduced into ontology algorithms. In this paper, based on the alternating direction augmented Lagrangian method, we present an ontology optimization algorithm for ontological sparse vector learning, and a fast version of such ontology technologies. The optimal sparse vector is obtained by an iterative procedure, and the ontology function is then obtained from the sparse vector. Four simulation experiments show that our ontological sparse vector learning model has a higher precision ratio on plant ontology, humanoid robotics ontology, biology ontology and physics education ontology data for similarity measuring and ontology mapping applications.
NASA Astrophysics Data System (ADS)
Khawaja, Taimoor Saleem
A high-belief low-overhead Prognostics and Health Management (PHM) system is desired for online real-time monitoring of complex non-linear systems operating in a complex (possibly non-Gaussian) noise environment. This thesis presents a Bayesian Least Squares Support Vector Machine (LS-SVM) based framework for fault diagnosis and failure prognosis in nonlinear non-Gaussian systems. The methodology assumes the availability of real-time process measurements, definition of a set of fault indicators and the existence of empirical knowledge (or historical data) to characterize both nominal and abnormal operating conditions. An efficient yet powerful Least Squares Support Vector Machine (LS-SVM) algorithm, set within a Bayesian Inference framework, not only allows for the development of real-time algorithms for diagnosis and prognosis but also provides a solid theoretical framework to address key concepts related to classification for diagnosis and regression modeling for prognosis. SVM machines are founded on the principle of Structural Risk Minimization (SRM) which tends to find a good trade-off between low empirical risk and small capacity. The key features in SVM are the use of non-linear kernels, the absence of local minima, the sparseness of the solution and the capacity control obtained by optimizing the margin. The Bayesian Inference framework linked with LS-SVMs allows a probabilistic interpretation of the results for diagnosis and prognosis. Additional levels of inference provide the much coveted features of adaptability and tunability of the modeling parameters. The two main modules considered in this research are fault diagnosis and failure prognosis. With the goal of designing an efficient and reliable fault diagnosis scheme, a novel Anomaly Detector is suggested based on the LS-SVM machines. The proposed scheme uses only baseline data to construct a 1-class LS-SVM machine which, when presented with online data is able to distinguish between normal behavior
A 2D vector map watermarking algorithm resistant to simplification attack
NASA Astrophysics Data System (ADS)
Wang, Chuanjian; Liang, Bin; Zhao, Qingzhan; Qiu, Zuqi; Peng, Yuwei; Yu, Liang
2009-12-01
Vector maps are valuable assets of data producers. How to protect the copyright of vector maps effectively using digital watermarking is a hot research issue. In this paper, we propose a new robust and blind watermarking algorithm resilient to simplification attack. We prove that the spatial topological relation between map objects bears an important property of approximate simplification invariance. We choose spatial topological relations as the watermark feature domain and embed watermarks by slightly modifying the spatial topological relation between map objects. Experiments show that our algorithm resists simplification attack well and achieves a trade-off between robustness and data fidelity.
A vector reconstruction based clustering algorithm particularly for large-scale text collection.
Liu, Ming; Wu, Chong; Chen, Lei
2015-03-01
With the rapid development of internet technology, internet users face large amounts of textual data every day. Organizing texts into categories can help users extract useful information from large-scale text collections. Clustering is one of the most promising tools for categorizing texts because it is unsupervised. Unfortunately, most traditional clustering algorithms lose quality on large-scale text collections, mainly because of the high-dimensional vector space and the semantic similarity among texts. To cluster large-scale text collections effectively and efficiently, this paper puts forward a vector reconstruction based clustering algorithm. Only the features that can represent the cluster are preserved in the cluster's representative vector. The algorithm alternates between two sub-processes until it converges. The first is the partial tuning sub-process, in which feature weights are fine-tuned by an iterative process similar to the self-organizing map (SOM) algorithm. To accelerate clustering, an intersection based similarity measure and its corresponding neuron adjustment function are proposed and implemented in this sub-process. The second is the overall tuning sub-process, in which features are reallocated among clusters and features that do not help represent a cluster are removed from its representative vector. Experimental results on three text collections (two small-scale and one large-scale) demonstrate that the algorithm achieves high-quality performance on both small-scale and large-scale text collections.
Assimilation of PFISR Data Using Support Vector Regression and Ground Based Camera Constraints
NASA Astrophysics Data System (ADS)
Clayton, R.; Lynch, K. A.; Nicolls, M. J.; Hampton, D. L.; Michell, R.; Samara, M.; Guinther, J.
2013-12-01
In order to best interpret the information gained from multipoint in situ measurements, a Support Vector Regression algorithm is being developed to interpret the data collected from the instruments in the context of ground observations (such as those from camera or radar array). The idea behind SVR is to construct the simplest function that models the data with the least squared error, subject to constraints given by the user. Constraints can be brought into the algorithm from other data sources or from models. As is often the case with data, a perfect solution to such a problem may be impossible, thus 'slack' may be introduced to control how closely the model adheres to the data. The algorithm employs kernels, and chooses radial basis functions as an appropriate kernel. The current SVR code can take input data as one to three dimensional scalars or vectors, and may also include time. External data can be incorporated and assimilated into a model of the environment. Regions of minimal and maximal values are allowed to relax to the sample average (or a user-supplied model) on size and time scales determined by user input, known as feature sizes. These feature sizes can vary for each degree of freedom if the user desires. The user may also select weights for each data point, if it is desirable to weight parts of the data differently. In order to test the algorithm, Poker Flat Incoherent Scatter Radar (PFISR) and MICA sounding rocket data are being used as sample data. The PFISR data consists of many beams, each with multiple ranges. In addition to analyzing the radar data as it stands, the algorithm is being used to simulate data from a localized ionospheric swarm of Cubesats using existing PFISR data. The sample points of the radar at one altitude slice can serve as surrogates for satellites in a cubeswarm. The number of beams of the PFISR radar can then be used to see what the algorithm would output for a swarm of similar size. By using PFISR data in the 15-beam to
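The per-dimension "feature sizes" and the 'slack' described above can be illustrated with a small sketch. It assumes a Gaussian radial basis function with one length scale per coordinate and the standard epsilon-insensitive SVR loss; the function names and the exact kernel form are illustrative assumptions, not the authors' code.

```python
import math

def rbf_kernel(x, z, feature_sizes):
    """Gaussian RBF kernel with one length scale ("feature size") per coordinate.

    Assumed form: exp(-0.5 * sum(((x_i - z_i) / s_i) ** 2)).
    """
    if not (len(x) == len(z) == len(feature_sizes)):
        raise ValueError("x, z and feature_sizes must have the same length")
    scaled_sq_dist = sum(((a - b) / s) ** 2 for a, b, s in zip(x, z, feature_sizes))
    return math.exp(-0.5 * scaled_sq_dist)

def eps_insensitive_loss(residual, eps):
    """Standard SVR loss: residuals inside the eps tube cost nothing."""
    return max(0.0, abs(residual) - eps)

# Two points 1 unit apart in space and 10 units apart in time, with a
# spatial feature size of 1 and a temporal feature size of 10.
similarity = rbf_kernel([0.0, 0.0], [1.0, 10.0], [1.0, 10.0])
```

A larger feature size along one coordinate makes the model smoother in that direction, which is one way to read the statement that feature sizes may differ for each degree of freedom.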
Support vector machines classifiers of physical activities in preschoolers
Zhao, Wei; Adolph, Anne L; Puyau, Maurice R; Vohra, Firoz A; Butte, Nancy F; Zakeri, Issa F
2013-01-01
The goal of this study is to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) in classifying physical activity data acquired from an accelerometer worn by preschool-aged children. In this study, 69 children aged 3–5 years were asked to participate in a supervised protocol of physical activities while wearing a triaxial accelerometer. Accelerometer counts, steps, and position were obtained from the device. We applied K-means clustering to determine the number of natural groupings present in the data. We used MLR and SVM to classify the six activity types. Using direct observation as the criterion method, the 10-fold cross-validation (CV) error rate was used to compare MLR and SVM classifiers, with and without sleep. Altogether, 58 classification models based on combinations of the accelerometer output variables were developed. In general, the SVM classifiers have a smaller 10-fold CV error rate than their MLR counterparts. Including sleep, an SVM classifier provided the best performance with a 10-fold CV error rate of 24.70%. Without sleep, an SVM classifier based on triaxial accelerometer counts, vector magnitude, steps, position, and 1- and 2-min lag and lead values achieved a 10-fold CV error rate of 20.16% and an overall classification error rate of 15.56%. SVM outperforms the classical classifier MLR in categorizing physical activities in preschool-aged children. Using accelerometer data, SVM can be used to correctly classify physical activities typical of preschool-aged children with an acceptable classification error rate. PMID:24303099
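The 10-fold CV error rate used to compare the classifiers can be sketched as follows. This is a generic illustration: `train_fn` and `predict_fn` are hypothetical callables standing in for any classifier (MLR or SVM), and the folds here are contiguous, so the data would normally be shuffled first.

```python
def kfold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cv_error_rate(X, y, train_fn, predict_fn, k=10):
    """Fraction of samples misclassified when each fold is held out once."""
    n = len(y)
    errors = 0
    for test_idx in kfold_indices(n, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        errors += sum(predict_fn(model, X[i]) != y[i] for i in test_idx)
    return errors / n

# Hypothetical baseline: always predict the most frequent training label.
def train_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)

def predict_majority(model, x):
    return model
```

Every sample is used for testing exactly once, so the returned value is the pooled misclassification rate over all folds.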
Keohane, Bernie M; Mason, Steve M; Baguley, David M
2004-02-01
A novel auditory brainstem response (ABR) detection and scoring algorithm, called the Vector algorithm, is described. An independent clinical evaluation of the algorithm using 464 tests (120 non-stimulated and 344 stimulated tests) on 60 infants, with a mean age of approximately 6.5 weeks, estimated test sensitivity to be greater than 0.99 and test specificity to be 0.87 for one test. Specificity was estimated to be greater than 0.95 for a two-stage screen. Test times were of the order of 1.5 minutes per ear for detection of an ABR and 4.5 minutes per ear in the absence of a clear response. The Vector algorithm is commercially available for both automated screening and threshold estimation in hearing screening devices.
A Hashing-Based Search Algorithm for Coding Digital Images by Vector Quantization
NASA Astrophysics Data System (ADS)
Chu, Chen-Chau
1989-11-01
This paper describes a fast algorithm to compress digital images by vector quantization. Vector quantization relies heavily on searching to build codebooks and to classify blocks of pixels into code indices. The proposed algorithm uses hashing, localized search, and multi-stage search to accelerate the searching process. The average of pixel values in a block is used as the feature for hashing and intermediate screening. Experimental results using monochrome images are presented. This algorithm compares favorably with other methods with regard to processing time, and has comparable or better mean square error measurements than some of them. The major advantages of the proposed algorithm are its speed, good quality of the reconstructed images, and flexibility.
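The idea of hashing on the block mean and then searching locally can be sketched as below. The bucket width, the search radius and the fallback to a full search are illustrative assumptions; the abstract does not specify the paper's actual hash function or multi-stage scheme.

```python
def block_mean(block):
    """Average pixel value of a flattened block."""
    return sum(block) / len(block)

def build_hash_table(codebook, bucket_width):
    """Group codeword indices into buckets keyed by quantized mean."""
    table = {}
    for index, codeword in enumerate(codebook):
        key = int(block_mean(codeword) // bucket_width)
        table.setdefault(key, []).append(index)
    return table

def squared_error(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def encode_block(block, codebook, table, bucket_width, radius=1):
    """Return the index of the nearest codeword among nearby buckets."""
    key = int(block_mean(block) // bucket_width)
    candidates = [i for k in range(key - radius, key + radius + 1)
                  for i in table.get(k, [])]
    if not candidates:
        # No codeword has a similar mean: fall back to an exhaustive search.
        candidates = range(len(codebook))
    return min(candidates, key=lambda i: squared_error(block, codebook[i]))
```

Because only codewords with a similar mean are compared, the localized search can miss the true nearest codeword; the result is an approximation that trades a little distortion for speed.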
2014-01-01
Background: Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results: For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions: Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463
Robust Vision-Based Pose Estimation Algorithm for an UAV with Known Gravity Vector
NASA Astrophysics Data System (ADS)
Kniaz, V. V.
2016-06-01
Accurate estimation of camera external orientation with respect to a known object is one of the central problems in photogrammetry and computer vision. In recent years this problem has gained increasing attention in the field of UAV autonomous flight. Such applications require real-time performance and robustness from the external orientation estimation algorithm. The accuracy of the solution depends strongly on the number of reference points visible in the given image. The problem has an analytical solution only if 3 or more reference points are visible. However, in limited visibility conditions it is often necessary to perform external orientation with only 2 visible reference points. In that case the solution can be found if the direction of the gravity vector in the camera coordinate system is known. A number of algorithms for external orientation estimation from 2 known reference points and a gravity vector have been developed to date. Most of these algorithms provide an analytical solution in the form of a polynomial equation that is subject to large errors for complex reference point configurations. This paper focuses on the development of a new computationally efficient and robust algorithm for external orientation based on the positions of 2 known reference points and a gravity vector. The implementation of the algorithm for guidance of a Parrot AR.Drone 2.0 micro-UAV is discussed. The experimental evaluation of the algorithm demonstrated its computational efficiency and its robustness against errors in reference point positions and against complex configurations.
Fruit fly optimization based least square support vector regression for blind image restoration
NASA Astrophysics Data System (ADS)
Zhang, Jiao; Wang, Rui; Li, Junshan; Yang, Yawei
2014-11-01
The goal of image restoration is to reconstruct the original scene from a degraded observation. It is a critical and challenging task in image processing. Classical restoration methods require explicit knowledge of the point spread function (PSF) and a description of the noise as priors. However, this is not practical for many real image processing tasks, so the recovery has to be treated as a blind image restoration problem. Since blind deconvolution is an ill-posed problem, many blind restoration methods need to make additional assumptions to construct restrictions. Because the PSF and the noise energy differ from case to case, blurred images can be quite different. It is difficult to achieve a good balance between proper assumptions and high restoration quality in blind deconvolution. Recently, machine learning techniques have been applied to blind image restoration. Least square support vector regression (LSSVR) has been shown to offer strong potential in estimation and forecasting problems. Therefore, this paper proposes an LSSVR-based image restoration method. However, selecting the optimal parameters for the support vector machine is essential to the training result. As a novel meta-heuristic algorithm, the fruit fly optimization algorithm (FOA) can be used to handle optimization problems and has the advantage of fast convergence to the global optimal solution. In the proposed method, the training samples are created by pairing a neighborhood in the degraded image with the central pixel in the original image. The mapping between the degraded image and the original image is learned by training the LSSVR. The two parameters of the LSSVR are optimized through FOA. The fitness function of FOA is calculated from the restoration error function. With the acquired mapping, the degraded image can be recovered. Experimental results show that the proposed method obtains a satisfactory restoration effect. Compared with BP neural network regression, SVR method and Lucy-Richardson algorithm, it speeds up the restoration rate and
Monthly evaporation forecasting using artificial neural networks and support vector machines
NASA Astrophysics Data System (ADS)
Tezel, Gulay; Buyukyildiz, Meral
2016-04-01
Evaporation is one of the most important components of the hydrological cycle, but it is relatively difficult to estimate because it is a complex process influenced by numerous factors. Estimation of evaporation is important for the design of reservoirs, especially in arid and semi-arid areas. Artificial neural network methods and support vector machines (SVM) are frequently utilized to estimate evaporation and other hydrological variables. In this study, the usability of artificial neural networks (ANNs) (multilayer perceptron (MLP) and radial basis function network (RBFN)) and ɛ-support vector regression (SVR) artificial intelligence methods was investigated for estimating monthly pan evaporation. For this purpose, temperature, relative humidity, wind speed, and precipitation data for the period 1972 to 2005 from Beysehir meteorology station were used as input variables, while pan evaporation values were used as output. The Romanenko and Meyer methods were also considered for comparison. The results were compared with observed class A pan evaporation data. In the MLP method, four different training algorithms were used: gradient descent with momentum and adaptive learning rule backpropagation (GDX), Levenberg-Marquardt (LVM), scaled conjugate gradient (SCG), and resilient backpropagation (RBP). The ɛ-SVR model was used as the SVR model. The models were designed via 10-fold cross-validation (CV); algorithm performance was assessed via mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²). According to the performance criteria, the ANN algorithms and ɛ-SVR had similar results. The ANN and ɛ-SVR methods were found to perform better than the Romanenko and Meyer methods. Consequently, the best performance using the test data was obtained using SCG(4,2,2,1) with R² = 0.905.
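The three performance criteria named above can be computed as in the following sketch. It assumes the common definition R² = 1 − SS_res / SS_tot; some studies instead report the squared correlation coefficient, and the abstract does not say which was used.

```python
import math

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def r_squared(observed, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

MAE and RMSE are in the units of the target variable (here, evaporation), while R² is dimensionless, which is why the three are usually reported together.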
Khormali, Aminollah; Addeh, Jalil
2016-07-01
Unnatural patterns in control charts can be associated with a specific set of assignable causes of process variation. Hence, pattern recognition is very useful in identifying process problems. In this study, a multiclass SVM based classifier is proposed because of the promising generalization capability of support vector machines. In the proposed method, a type-2 fuzzy c-means (T2FCM) clustering algorithm is used to make the SVM system more effective. The fuzzy support vector machine classifier suggested in this paper is composed of three main sub-networks: a fuzzy classifier sub-network, an SVM sub-network and an optimization sub-network. In SVM training, the hyper-parameters play a very important role in recognition accuracy. Therefore, the cuckoo optimization algorithm (COA) is proposed for selecting appropriate parameters of the classifier. Simulation results showed that the proposed system has very high recognition accuracy. PMID:27101724
Spacebased Estimation of Moisture Transport in Marine Atmosphere Using Support Vector Regression
NASA Technical Reports Server (NTRS)
Xie, Xiaosu; Liu, W. Timothy; Tang, Benyang
2007-01-01
An improved algorithm is developed based on support vector regression (SVR) to estimate horizontal water vapor transport integrated through the depth of the atmosphere (Θ) over the global ocean from observations of the surface wind-stress vector by QuikSCAT, the cloud drift wind vector derived from the Multi-angle Imaging SpectroRadiometer (MISR) and geostationary satellites, and precipitable water from the Special Sensor Microwave/Imager (SSM/I). The statistical relation is established between the input parameters (the surface wind stress, the 850 mb wind, the precipitable water, time and location) and the target data (Θ calculated from rawinsondes and reanalysis of a numerical weather prediction model). The results are validated with independent daily rawinsonde observations, monthly mean reanalysis data, and through regional water balance. This study clearly demonstrates the improvement of Θ derived from satellite data using SVR over previous data sets based on linear regression and neural networks. The SVR methodology reduces both mean bias and standard deviation compared with rawinsonde observations. It agrees better with observations from synoptic to seasonal time scales, and compares more favorably with the reanalysis data on seasonal variations. Only the SVR result can achieve the water balance over South America. The rationale for the advantage of the SVR method and the impact of adding the upper level wind will also be discussed.
2013-01-01
Background: Named entity recognition (NER) is an important task in clinical natural language processing (NLP) research. Machine learning (ML) based NER methods have shown good performance in recognizing entities in clinical text. Algorithms and features are two important factors that largely affect the performance of ML-based NER systems. Conditional Random Fields (CRFs), a sequential labelling algorithm, and Support Vector Machines (SVMs), which are based on large margin theory, are two typical machine learning algorithms that have been widely applied to clinical NER tasks. For features, syntactic and semantic information of context words has often been used in clinical NER systems. However, Structural Support Vector Machines (SSVMs), an algorithm that combines the advantages of both CRFs and SVMs, and word representation features, which contain word-level back-off information over large unlabelled corpus by unsupervised algorithms, have not been extensively investigated for clinical text processing. Therefore, the primary goal of this study is to evaluate the use of SSVMs and word representation features in clinical NER tasks. Methods: In this study, we developed SSVMs-based NER systems to recognize clinical entities in hospital discharge summaries, using the data set from the concept extraction task in the 2010 i2b2 NLP challenge. We compared the performance of CRFs and SSVMs-based NER classifiers with the same feature sets. Furthermore, we extracted two different types of word representation features (clustering-based representation features and distributional representation features) and integrated them with the SSVMs-based clinical NER system. We then reported the performance of SSVM-based NER systems with different types of word representation features. Results and discussion: Using the same training (N = 27,837) and test (N = 45,009) sets in the challenge, our evaluation showed that the SSVMs-based NER systems achieved better performance than the CRFs
Modified Particle Filtering Algorithm for Single Acoustic Vector Sensor DOA Tracking
Li, Xinbo; Sun, Haixin; Jiang, Liangxu; Shi, Yaowu; Wu, Yue
2015-01-01
The conventional direction of arrival (DOA) estimation algorithm, which assumes static sources, usually estimates the source angles at two adjacent moments independently and does not consider the correlation between the moments. In this article, we focus on DOA estimation of moving sources, and a modified particle filtering (MPF) algorithm is proposed with a state space model of a single acoustic vector sensor. Although the particle filtering (PF) algorithm has been introduced for acoustic vector sensor applications, it is not suitable when the angle of the source in one dimension is estimated with a large deviation, because the two angles (pitch angle and azimuth angle) cannot then be employed simultaneously to update the state through the resampling step of the PF algorithm. To solve these problems, the MPF algorithm is proposed, in which the state estimate of the previous moment is introduced into the particle sampling of the present moment to improve the importance function. Moreover, the independence of the pitch angle and the azimuth angle is taken into account, and the two angles are sampled and evaluated separately. Then, the MUSIC spectrum function is used as the likelihood function of the MPF algorithm, and the modified PF-MUSIC (MPF-MUSIC) algorithm is proposed to improve the root mean square error (RMSE) and the probability of convergence. The theoretical analysis and the simulation results validate the effectiveness and feasibility of the two proposed algorithms. PMID:26501280
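The resampling step that a particle filter relies on can be illustrated with a generic sketch. This shows plain systematic resampling only; it is not the MPF modification described above, and the offset `u` is passed in explicitly (normally drawn uniformly from [0, 1)) so the example is deterministic.

```python
def systematic_resample(weights, u):
    """Return particle indices chosen by systematic resampling.

    weights: non-negative particle weights (need not be normalized).
    u: offset in [0, 1), normally a single uniform random draw.
    """
    n = len(weights)
    total = sum(weights)
    indices = []
    j = 0
    cumulative = weights[0] / total
    for i in range(n):
        position = (u + i) / n  # evenly spaced sampling positions
        while position > cumulative and j < n - 1:
            j += 1
            cumulative += weights[j] / total
        indices.append(j)
    return indices

# A particle holding 70% of the weight is duplicated; light particles may vanish.
chosen = systematic_resample([1, 1, 7, 1], u=0.5)
```

In a tracking setting the weights would come from a likelihood function (the abstract uses the MUSIC spectrum for this), and the selected particles would then be propagated through the state model.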
NASA Astrophysics Data System (ADS)
Krys, Sebastian; Jankowski, Stanislaw; Piatkowska-Janko, Ewa
2009-06-01
This paper presents the application of differential evolution, an evolutionary algorithm for solving single-objective optimization problems, to tuning the hyperparameters of a least-squares support vector machine classifier. The goal was to improve the classification of patients with sustained ventricular tachycardia after myocardial infarction based on a signal-averaged electrocardiography dataset received from the Medical University of Warsaw. The applied method attained a classification rate of 96% of the SVT+ group.
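A minimal sketch of differential evolution for tuning two hyperparameters follows. It uses the common DE/rand/1/bin variant, which the abstract does not confirm, and `toy_validation_error` is a hypothetical stand-in for the cross-validation error of a classifier as a function of two log-scaled hyperparameters.

```python
import random

def differential_evolution(objective, bounds, pop_size=20, f=0.6, cr=0.9,
                           generations=200, seed=0):
    """Minimize objective over box bounds with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            if trial_score <= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

def toy_validation_error(params):
    """Hypothetical error surface with its minimum at (1.0, -0.5)."""
    log_gamma, log_sigma = params
    return (log_gamma - 1.0) ** 2 + (log_sigma + 0.5) ** 2

best_params, best_error = differential_evolution(
    toy_validation_error, [(-3.0, 3.0), (-3.0, 3.0)])
```

In a real tuning run the objective would train and validate the classifier for each candidate, so the number of objective evaluations (population size times generations) dominates the cost.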
Detection of Splice Sites Using Support Vector Machine
NASA Astrophysics Data System (ADS)
Varadwaj, Pritish; Purohit, Neetesh; Arora, Bhumika
Automatic identification and annotation of the exon and intron regions of genes from DNA sequences has been an important research area in computational biology. Several approaches, viz. Hidden Markov Models (HMM), Artificial Intelligence (AI) based machine learning and Digital Signal Processing (DSP) techniques, have been used extensively and independently by various researchers to address this challenging task. In this work, we propose a Support Vector Machine based kernel learning approach for detection of splice sites (the exon-intron boundaries) in a gene. Electron-Ion Interaction Potential (EIIP) values of nucleotides are used to map character sequences to corresponding numeric sequences. An SVM with a Radial Basis Function (RBF) kernel is trained on the EIIP numeric sequences. It was then tested on a test gene dataset for detection of splice sites by shifting a window (of 12 residues). The window size and various important parameters of the SVM kernel were optimized for better accuracy. Receiver Operating Characteristic (ROC) curves are used to display the sensitivity of the classifier, and the results showed 94.82% accuracy for splice site detection on the test dataset.
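The mapping from nucleotides to EIIP values and the 12-residue window shifting can be sketched as follows. The EIIP constants are the commonly cited values for A, C, G and T; the step size of one residue and the function names are assumptions made for illustration.

```python
# Commonly cited electron-ion interaction potential values per nucleotide.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}

def eiip_encode(sequence):
    """Map a DNA string to its numeric EIIP sequence."""
    return [EIIP[base] for base in sequence.upper()]

def sliding_windows(values, width=12, step=1):
    """Return every window of the given width, shifted by `step` each time."""
    return [values[i:i + width] for i in range(0, len(values) - width + 1, step)]

# Each window would be one feature vector for the RBF-kernel SVM.
features = sliding_windows(eiip_encode("ACGTACGTACGTAC"))
```

A sequence of length n yields n − 11 windows at step 1, and each window is labeled according to whether it contains a splice site before training.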
Segmentation of multiple sclerosis lesions using support vector machines
NASA Astrophysics Data System (ADS)
Ferrari, Ricardo J.; Wei, Xingchang; Zhang, Yunyan; Scott, James N.; Mitchell, J. R.
2003-05-01
In this paper we present preliminary results on automatically segmenting multiple sclerosis (MS) lesions in multispectral magnetic resonance datasets using support vector machines (SVM). A total of eighteen studies (each composed of T1-, T2-weighted and FLAIR images) acquired from a 3T GE Signa scanner were analyzed. A neuroradiologist used a computer-assisted technique to identify all MS lesions in each study. These results were used later in the training and testing stages of the SVM classifier. A preprocessing stage including anisotropic diffusion filtering, non-uniformity intensity correction, and intensity tissue normalization was applied to the images. The SVM kernel used in this study was the radial basis function (RBF). The kernel parameter (γ) and the penalty value for the errors were determined by using a very loose stopping criterion for the SVM decomposition. Overall, a 5-fold cross-validation accuracy rate of 80% was achieved in the automatic classification of MS lesion voxels using the proposed SVM-RBF classifier.
Classification of masses on mammograms using support vector machine
NASA Astrophysics Data System (ADS)
Chu, Yong; Li, Lihua; Goldgof, Dmitry B.; Qui, Yan; Clark, Robert A.
2003-05-01
Mammography is the most effective method for early detection of breast cancer. However, the positive predictive value for classification of malignant and benign lesions from mammographic images is not very high. Clinical studies have shown that the cancer yield of biopsies is very low, between 15% and 30%. It is important to increase the diagnostic accuracy by improving the positive predictive value in order to reduce the number of unnecessary biopsies. In this paper, a new classification method is proposed to distinguish malignant from benign masses in mammography using a Support Vector Machine (SVM). Thirteen features were selected based on receiver operating characteristic (ROC) analysis of classification using each individual feature. These features include four shape features, two gradient features and seven Laws features. With these features, an SVM with a Gaussian kernel, trained with the sequential minimal optimization learning technique, was used to classify the masses into two categories, benign and malignant. The data set used in this study consists of 193 cases, of which 96 are benign and 97 are malignant. The SVM classifier was evaluated with the leave-one-out method. The results show that the positive predictive value of the presented method is 81.6%, with a sensitivity of 83.7% and a false-positive rate of 30.2%. This demonstrates that the SVM-based classifier is effective in mass classification.
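The three reported quantities follow from the counts of a two-class confusion matrix, as in this sketch. The counts in the usage example are hypothetical and are not the paper's data.

```python
def positive_predictive_value(tp, fp):
    """Fraction of cases called malignant that really are malignant."""
    return tp / (tp + fp)

def sensitivity(tp, fn):
    """Fraction of malignant cases that are detected."""
    return tp / (tp + fn)

def false_positive_rate(fp, tn):
    """Fraction of benign cases wrongly called malignant."""
    return fp / (fp + tn)

# Hypothetical counts for illustration only.
tp, fn, fp, tn = 80, 20, 30, 70
ppv = positive_predictive_value(tp, fp)
```

Positive predictive value depends on the class balance of the test set, whereas sensitivity and false-positive rate do not, so the three values are not redundant.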
Predicting Computer System Failures Using Support Vector Machines
Fulp, Errin W.; Fink, Glenn A.; Haack, Jereme N.
2008-12-07
Mitigating the impact of computer failure is possible if accurate failure predictions are provided. Resources, applications, and services can be scheduled around predicted failures to limit their impact. Such strategies are especially important for multi-computer systems, such as compute clusters, that experience a higher failure rate due to the large number of components. However, providing accurate predictions with sufficient lead time remains a challenging problem. This paper describes a new spectrum-kernel Support Vector Machine (SVM) approach to predict failure events based on system log files. These files contain messages that represent a change of system state. While a single message in the file may not be sufficient for predicting failure, a sequence or pattern of messages may be. The approach described in this paper uses a sliding window (sub-sequence) of messages to predict the likelihood of failure. The frequency representation of the observed message sub-sequences is then used as input to the SVM, which associates the messages with a class of failed or non-failed system. Experimental results using actual system log files from a Linux-based compute cluster indicate that the proposed SVM approach can predict hard disk failure with an accuracy of 76% one day in advance.
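The frequency representation of message sub-sequences and the resulting spectrum kernel can be sketched as below. Message types are represented here as short strings, which is an assumption; the abstract does not describe how log messages were encoded.

```python
from collections import Counter

def subsequence_counts(messages, k):
    """Count every length-k run of consecutive messages (sliding window)."""
    return Counter(tuple(messages[i:i + k]) for i in range(len(messages) - k + 1))

def spectrum_kernel(seq_a, seq_b, k):
    """Inner product of the two length-k sub-sequence count vectors."""
    counts_a = subsequence_counts(seq_a, k)
    counts_b = subsequence_counts(seq_b, k)
    return sum(n * counts_b[gram] for gram, n in counts_a.items())

# Hypothetical message-type sequences taken from two log windows.
log_a = ["A", "B", "A", "B"]
log_b = ["A", "B", "C"]
similarity = spectrum_kernel(log_a, log_b, k=2)
```

Two log windows are similar under this kernel when they share many of the same short message patterns, regardless of where in the window those patterns occur.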
Incremental Support Vector Machine Framework for Visual Sensor Networks
NASA Astrophysics Data System (ADS)
Awad, Mariette; Jiang, Xianhua; Motai, Yuichi
2006-12-01
Motivated by the emerging requirements of surveillance networks, we present in this paper an incremental multiclassification support vector machine (SVM) technique as a new framework for action classification based on real-time multivideo collected by homogeneous sites. The technique is based on an adaptation of least square SVM (LS-SVM) formulation but extends beyond the static image-based learning of current SVM methodologies. In applying the technique, an initial supervised offline learning phase is followed by a visual behavior data acquisition and an online learning phase during which the cluster head performs an ensemble of model aggregations based on the sensor nodes inputs. The cluster head then selectively switches on designated sensor nodes for future incremental learning. Combining sensor data offers an improvement over single camera sensing especially when the latter has an occluded view of the target object. The optimization involved alleviates the burdens of power consumption and communication bandwidth requirements. The resulting misclassification error rate, the iterative error reduction rate of the proposed incremental learning, and the decision fusion technique prove its validity when applied to visual sensor networks. Furthermore, the enabled online learning allows an adaptive domain knowledge insertion and offers the advantage of reducing both the model training time and the information storage requirements of the overall system which makes it even more attractive for distributed sensor networks communication.
River flow time series using least squares support vector machines
NASA Astrophysics Data System (ADS)
Samsudin, R.; Saad, P.; Shabri, A.
2011-06-01
This paper proposes a novel hybrid forecasting model known as GLSSVM, which combines the group method of data handling (GMDH) and the least squares support vector machine (LSSVM). The GMDH is used to determine the useful input variables, which then serve as inputs to the LSSVM model for time series forecasting. Monthly river flow data from two stations, on the Selangor and Bernam rivers in Selangor state of Peninsular Malaysia, were used in the development of this hybrid model. The performance of this model was compared with conventional artificial neural network (ANN), Autoregressive Integrated Moving Average (ARIMA), GMDH and LSSVM models using long-term observations of monthly river flow discharge. The root mean square error (RMSE) and coefficient of correlation (R) are used to evaluate the models' performances. In both cases, the new hybrid model was found to provide more accurate flow forecasts than the other models. The results of the comparison indicate that the new hybrid model is a useful tool and a promising new method for river flow forecasting.
Nonlinear structural damage detection using support vector machines
NASA Astrophysics Data System (ADS)
Xiao, Li; Qu, Wenzhong
2012-04-01
An actual structure, including its connections and interfaces, may behave nonlinearly. Because nonlinear structural health monitoring (SHM) involves many complicated problems, relatively little progress has been made in this area. Statistical pattern recognition techniques have been demonstrated to be competitive with other methods when applied to real engineering datasets. 'Breathing' cracks that open and close under operational loading may cause a linear structural system to respond to its operational and environmental loads in a nonlinear manner. In this paper, vibration-based structural health monitoring of a cracked structure is investigated with an autoregressive support vector machine (AR-SVM). Vibration experiments are carried out with a model frame. Time-series data are acquired in different cases: initial linear structure; linear structure with mass changed; nonlinear structure; and nonlinear structure with mass changed. An AR model of the acceleration time series is established, and different kernel function types and corresponding parameters are chosen and compared in order to locate the damage more accurately and effectively. Different damage states and different damage positions have been recognized successfully. The AR-SVM method is shown to be practical and efficient for nonlinear structural damage detection with insufficient training samples.
Explaining Support Vector Machines: A Color Based Nomogram
Van Belle, Vanya; Van Calster, Ben; Van Huffel, Sabine; Suykens, Johan A. K.; Lisboa, Paulo
2016-01-01
Problem setting: Support vector machines (SVMs) are very popular tools for classification, regression and other problems. Due to the large choice of kernels they can be applied with, a large variety of data can be analysed using these tools. Machine learning owes its popularity to the good performance of the resulting models. However, interpreting the models is far from obvious, especially when non-linear kernels are used. Hence, the methods are used as black boxes. As a consequence, the use of SVMs is less supported in areas where interpretability is important and where people are held responsible for the decisions made by models. Objective: In this work, we investigate whether SVMs using linear, polynomial and RBF kernels can be explained such that interpretations for model-based decisions can be provided. We further indicate when SVMs can be explained and in which situations interpretation of SVMs is (hitherto) not possible. Here, explainability is defined as the ability to produce the final decision based on a sum of contributions which depend on one single or at most two input variables. Results: Our experiments on simulated and real-life data show that explainability of an SVM depends on the chosen parameter values (degree of polynomial kernel, width of RBF kernel and regularization constant). When several combinations of parameter values yield the same cross-validation performance, combinations with a lower polynomial degree or a larger kernel width have a higher chance of being explainable. Conclusions: This work summarizes SVM classifiers obtained with linear, polynomial and RBF kernels in a single plot. Linear and polynomial kernels up to the second degree are represented exactly. For other kernels an indication of the reliability of the approximation is presented. The complete methodology is available as an R package, and two apps and a movie are provided to illustrate the possibilities offered by the method. PMID:27723811
Evaluation of Algorithms for a Miles-in-Trail Decision Support Tool
NASA Technical Reports Server (NTRS)
Bloem, Michael; Hattaway, David; Bambos, Nicholas
2012-01-01
Four machine learning algorithms were prototyped and evaluated for use in a proposed decision support tool that would assist air traffic managers as they set Miles-in-Trail restrictions. The tool would display probabilities that each possible Miles-in-Trail value should be used in a given situation. The algorithms were evaluated with an expected Miles-in-Trail cost that assumes traffic managers set restrictions based on the tool-suggested probabilities. Basic Support Vector Machine, random forest, and decision tree algorithms were evaluated, as was a softmax regression algorithm that was modified to explicitly reduce the expected Miles-in-Trail cost. The algorithms were evaluated with data from the summer of 2011 for air traffic flows bound to the Newark Liberty International Airport (EWR) over the ARD, PENNS, and SHAFF fixes. The algorithms were provided with 18 input features that describe the weather at EWR, the runway configuration at EWR, the scheduled traffic demand at EWR and the fixes, and other traffic management initiatives in place at EWR. Features describing other traffic management initiatives at EWR and the weather at EWR achieved relatively high information gain scores, indicating that they are the most useful for estimating Miles-in-Trail. In spite of a high variance or over-fitting problem, the decision tree algorithm achieved the lowest expected Miles-in-Trail costs when the algorithms were evaluated using 10-fold cross validation with the summer 2011 data for these air traffic flows.
A randomized algorithm for two-cluster partition of a set of vectors
NASA Astrophysics Data System (ADS)
Kel'manov, A. V.; Khandeev, V. I.
2015-02-01
A randomized algorithm is substantiated for the strongly NP-hard problem of partitioning a finite set of vectors of Euclidean space into two clusters of given sizes according to the minimum-of-the sum-of-squared-distances criterion. It is assumed that the centroid of one of the clusters is to be optimized and is determined as the mean value over all vectors in this cluster. The centroid of the other cluster is fixed at the origin. For an established parameter value, the algorithm finds an approximate solution of the problem in time that is linear in the space dimension and the input size of the problem for given values of the relative error and failure probability. The conditions are established under which the algorithm is asymptotically exact and runs in time that is linear in the space dimension and quadratic in the input size of the problem.
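To make the objective in this abstract concrete, the sketch below evaluates the sum-of-squared-distances criterion (one cluster measured from its own mean, the other from the origin) and finds the best cluster of a given size by exhaustive search. It is not the authors' randomized algorithm: it is an exponential-time baseline, usable only on tiny inputs, and the function names are hypothetical.

```python
from itertools import combinations


def partition_cost(vectors, cluster_indices):
    """Cost of a partition: squared distances to the cluster mean for the
    chosen cluster, and to the origin for all remaining vectors."""
    chosen = set(cluster_indices)
    dim = len(vectors[0])
    members = [vectors[i] for i in chosen]
    centroid = [sum(v[d] for v in members) / len(members) for d in range(dim)]
    origin = [0.0] * dim
    cost = 0.0
    for i, v in enumerate(vectors):
        ref = centroid if i in chosen else origin
        cost += sum((a - b) ** 2 for a, b in zip(v, ref))
    return cost


def brute_force_partition(vectors, cluster_size):
    """Exhaustive search over all clusters of the given size (tiny inputs only)."""
    candidates = combinations(range(len(vectors)), cluster_size)
    return min(candidates, key=lambda c: partition_cost(vectors, c))
```

Such a baseline is mainly useful for checking an approximate solver on small instances, since the number of candidate clusters grows combinatorially with the input size.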
Yu, Jinhua; Wang, Yuanyuan; Chen, Ping
2009-01-01
Accurate estimation of fetal weight before delivery is of great benefit to limit the potential complication associated with the low-birth-weight infants. Although the regression analysis has been used as a daily clinical means to estimate the fetal weight on the basis of ultrasound measurements, it still lacks enough accuracy for low-birth-weight fetuses. The ineffectiveness is mainly due to the large inter- or intraobserver variability in measurements and the inappropriateness of the regression analysis. A novel method based on the support vector regression (SVR) is proposed to improve the weight estimation accuracy for fetuses of less than 2500 g. Here, fuzzy logic is introduced into SVR (termed FSVR) to limit the contribution of inaccurate training data to the model establishment, and thus, to enhance the robustness of FSVR to noisy data. To guarantee the generalization performance of the FSVR model, the nondominated sorting genetic algorithm (NSGA) is utilized to obtain the optimal parameters for the FSVR, which is referred to as the evolutionary fuzzy support vector regression (EFSVR) model. Compared with regression formulas, back-propagation neural network, and SVR, EFSVR achieves the lowest mean absolute percent error (6.6%) and the highest correlation coefficient (0.902) between the estimated fetal weight and the actual birth weight. The EFSVR model produces significant improvement (1.9%-4.2%) on the accuracy of fetal weight estimation over several widely used formulas. Experiments show the potential of EFSVR in clinical prenatal care.
Human action recognition with group lasso regularized-support vector machine
NASA Astrophysics Data System (ADS)
Luo, Huiwu; Lu, Huanzhang; Wu, Yabei; Zhao, Fei
2016-05-01
The bag-of-visual-words (BOVW) and Fisher kernel are two popular models in human action recognition, and support vector machine (SVM) is the most commonly used classifier for the two models. We show two kinds of group structures in the feature representation constructed by BOVW and Fisher kernel, respectively, since the structural information of feature representation can be seen as a prior for the classifier and can improve the performance of the classifier, which has been verified in several areas. However, the standard SVM employs L2-norm regularization in its learning procedure, which penalizes each variable individually and cannot express the structural information of feature representation. We replace the L2-norm regularization with group lasso regularization in standard SVM, and a group lasso regularized-support vector machine (GLRSVM) is proposed. Then, we embed the group structural information of feature representation into GLRSVM. Finally, we introduce an algorithm to solve the optimization problem of GLRSVM by alternating directions method of multipliers. The experiments evaluated on KTH, YouTube, and Hollywood2 datasets show that our method achieves promising results and improves the state-of-the-art methods on KTH and YouTube datasets.
Prediction of Skin Sensitization with a Particle Swarm Optimized Support Vector Machine
Yuan, Hua; Huang, Jianping; Cao, Chenzhong
2009-01-01
Skin sensitization is the most commonly reported occupational illness, causing much suffering to a wide range of people. Identification and labeling of environmental allergens is urgently required to protect people from skin sensitization. The guinea pig maximization test (GPMT) and murine local lymph node assay (LLNA) are the two most important in vivo models for identification of skin sensitizers. In order to reduce the number of animal tests, quantitative structure-activity relationships (QSARs) are strongly encouraged in the assessment of skin sensitization of chemicals. This paper has investigated the skin sensitization potential of 162 compounds with LLNA results and 92 compounds with GPMT results using a support vector machine. A particle swarm optimization algorithm was implemented for feature selection from a large number of molecular descriptors calculated by Dragon. For the LLNA data set, the classification accuracies are 95.37% and 88.89% for the training and the test sets, respectively. For the GPMT data set, the classification accuracies are 91.80% and 90.32% for the training and the test sets, respectively. The classification performances were greatly improved compared to those reported in the literature, indicating that the support vector machine optimized by particle swarm in this paper is competent for the identification of skin sensitizers. PMID:19742136
Least squares twin support vector machine with Universum data for classification
NASA Astrophysics Data System (ADS)
Xu, Yitian; Chen, Mei; Li, Guohui
2016-11-01
Universum, a third class not belonging to either class of the classification problem, allows prior knowledge to be incorporated into the learning process. Much previous work has demonstrated that the Universum is helpful for supervised and semi-supervised classification. Moreover, Universum has already been introduced into the support vector machine (SVM) and twin support vector machine (TSVM) to enhance the generalisation performance. To further increase the generalisation performance, we propose a least squares TSVM with Universum data (ULS-TSVM) in this paper. Our ULS-TSVM possesses the following advantages: first, it exploits Universum data to improve generalisation performance. Second, it implements the structural risk minimisation principle by adding a regularisation term to the objective function. Finally, it costs less computing time by solving two small-sized systems of linear equations instead of a single larger-sized quadratic programming problem. To verify the validity of our proposed algorithm, we conduct various experiments around the size of labelled samples and the number of Universum data on data-sets including seven benchmark data-sets, Toy data, MNIST and Face images. Empirical experiments indicate that Universum contributes to making prediction accuracy improved and even stable. Especially when fewer labelled samples are given, ULS-TSVM is far superior to the improved LS-TSVM (ILS-TSVM), and slightly superior to the U-TSVM.
Noninvasive extraction of fetal electrocardiogram based on Support Vector Machine
NASA Astrophysics Data System (ADS)
Fu, Yumei; Xiang, Shihan; Chen, Tianyi; Zhou, Ping; Huang, Weiyan
2015-10-01
The fetal electrocardiogram (FECG) signal has important clinical value for doctors in diagnosing fetal heart diseases and choosing suitable therapeutic schemes. The noninvasive extraction of FECG from electrocardiogram (ECG) signals has therefore become a hot research topic. A new method based on the Support Vector Machine (SVM) is used to extract the FECG from a limited amount of data. Firstly, the theory of the SVM and the principle of the extraction based on the SVM are studied. Secondly, the transformation of the maternal electrocardiogram (MECG) component in the abdominal composite signal is verified to be nonlinear and is fitted with the SVM. Then, the SVM is trained, and the training results are compared with the real data to verify the effect of the training. Meanwhile, the parameters of the SVM are optimized to achieve the best performance so that the learning machine can be used to fit unknown samples. Finally, the FECG is extracted by removing the optimal estimate of the MECG component from the abdominal composite signal. In order to evaluate the performance of FECG extraction based on the SVM, the Signal-to-Noise Ratio (SNR) and a visual test are used. The experimental results show that an FECG of good quality can be extracted: its SNR is significantly increased, to as high as 9.2349 dB, and the time cost is significantly decreased, to as short as 0.802 seconds. Compared with the traditional method, the noninvasive extraction method based on the SVM has a simpler realization, a shorter processing time and better extraction quality under the same conditions.
Features classification using support vector machine for a facial expression recognition system
NASA Astrophysics Data System (ADS)
Patil, Rajesh A.; Sahula, Vineet; Mandal, Atanendu S.
2012-10-01
A methodology for automatic facial expression recognition in image sequences is proposed, which makes use of the Candide wire frame model and an active appearance algorithm for tracking, and a support vector machine (SVM) for classification. A face is detected automatically from the given image sequence and, by adapting the Candide wire frame model properly on the first frame of the face image sequence, facial features in the subsequent frames are tracked using an active appearance algorithm. The algorithm adapts the Candide wire frame model to the face in each of the frames and then automatically tracks the grid in consecutive video frames over time. We require that the first frame of the image sequence corresponds to the neutral facial expression, while the last frame of the image sequence corresponds to the greatest intensity of facial expression. The geometrical displacement of the Candide wire frame nodes, defined as the difference of the node coordinates between the first frame and the frame of greatest facial expression intensity, is used as an input to the SVM, which classifies the facial expression into one of the classes, viz. happy, surprise, sadness, anger, disgust, and fear.
New algorithm for nonlinear vector-based upconversion with center weighted medians
NASA Astrophysics Data System (ADS)
Blume, Holger
1997-07-01
One important task in the field of digital video signal processing is the conversion of one standard into another with different field and scan rates. Therefore a new vector-based nonlinear upconversion algorithm has been developed that applies nonlinear center weighted median filters (CWM). Assuming a two channel model of the human visual system with different spatio-temporal characteristics, there are contrary demands for the CWM filters. One can meet these demands by a vertical band separation and an application of so-called temporally and spatially dominated CWMs. By this means, interpolation errors of the separated channels can be compensated by an adequate splitting of the spectrum. Therefore a very robust vector error tolerant upconversion method can be achieved, which significantly improves the interpolation quality. By an appropriate choice of the CWM filter root structures, main picture elements are interpolated correctly even if faulty vector fields occur. To demonstrate the correctness of the deduced interpolation scheme, picture content is classified. These classes are distinguished by correct or incorrect vector assignment and correlated or noncorrelated picture content. The mode of operation of the new algorithm is portrayed for each class. Whereas the mode of operation for correlated picture content can be shown by object models, this is shown for noncorrelated picture content by the probability distribution function of the applied CWM filters. The new algorithm has been verified by objective evaluation methods [peak signal to noise ratio, and subjective mean square error measurements] and by a comprehensive subjective test series.
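The building block named in this abstract, the center weighted median (CWM) filter, can be illustrated on a one-dimensional signal. The sketch below is a generic CWM, not the vector-based upconversion scheme of the paper: the window half-width, the center weight and the edge handling (replicating border samples) are all assumptions made for the example.

```python
from statistics import median


def center_weighted_median(signal, half_window=1, center_weight=3):
    """1-D center weighted median filter.

    The center sample is counted `center_weight` times in the window, so a
    larger weight preserves more detail and center_weight=1 reduces to a
    plain median filter. The weight must be odd to keep the sample count odd.
    """
    if center_weight < 1 or center_weight % 2 == 0:
        raise ValueError("center_weight must be a positive odd integer")
    n = len(signal)
    filtered = []
    for i in range(n):
        window = []
        for j in range(i - half_window, i + half_window + 1):
            k = min(max(j, 0), n - 1)  # replicate border samples (assumption)
            window.append(signal[k])
        # Add the extra copies of the center sample.
        window.extend([signal[i]] * (center_weight - 1))
        filtered.append(median(window))
    return filtered
```

With a three-sample window, an isolated spike is removed when the center weight is 1 but kept when it is 3, which shows the trade-off between noise suppression and detail preservation that the choice of center weight controls.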
PSO-based support vector machine with cuckoo search technique for clinical disease diagnoses.
Liu, Xiaoyong; Fu, Hui
2014-01-01
Disease diagnosis is conducted with a machine learning method. We have proposed a novel machine learning method that hybridizes support vector machine (SVM), particle swarm optimization (PSO), and cuckoo search (CS). The new method consists of two stages: firstly, a CS based approach for parameter optimization of SVM is developed to find the better initial parameters of kernel function, and then PSO is applied to continue SVM training and find the best parameters of SVM. Experimental results indicate that the proposed CS-PSO-SVM model achieves better classification accuracy and F-measure than PSO-SVM and GA-SVM. Therefore, we can conclude that our proposed method is very efficient compared to the previously reported algorithms. PMID:24971382
Fault diagnosis based on signed directed graph and support vector machine
NASA Astrophysics Data System (ADS)
Han, Xiaoming; Lv, Qing; Xie, Gang; Zheng, Jianxia
2011-12-01
Support Vector Machine (SVM), based on the Structural Risk Minimization (SRM) principle of Statistical Learning Theory, has excellent performance in fault diagnosis. However, its training speed and diagnosis speed are relatively slow. The Signed Directed Graph (SDG), based on a deep knowledge model, has better completeness, that is, knowledge representation ability. However, much quantitative information is not utilized in the qualitative SDG model, which often produces false solutions. In order to speed up the training and diagnosis of SVM and improve the diagnostic resolution of SDG, SDG and SVM are combined in this paper. The dimension of the SVM training samples is reduced by the consistent path of SDG to improve training speed and diagnosis speed; the resolution of SDG is improved by the good classification performance of SVM. A Matlab simulation using the Tennessee-Eastman Process (TEP) simulation system demonstrates the feasibility of the fault diagnosis algorithm proposed in this paper.
A Numerical Comparison of Rule Ensemble Methods and Support Vector Machines
Meza, Juan C.; Woods, Mark
2009-12-18
Machine or statistical learning is a growing field that encompasses many scientific problems including estimating parameters from data, identifying risk factors in health studies, image recognition, and finding clusters within datasets, to name just a few examples. Statistical learning can be described as 'learning from data', with the goal of making a prediction of some outcome of interest. This prediction is usually made on the basis of a computer model that is built using data where the outcomes and a set of features have been previously matched. The computer model is called a learner, hence the name machine learning. In this paper, we present two such algorithms, a support vector machine method and a rule ensemble method. We compared their predictive power on three supernova Type Ia data sets provided by the Nearby Supernova Factory and found that while both methods give accuracies of approximately 95%, the rule ensemble method gives much lower false negative rates.
Barman, Ishan; Dingari, Narahara Chari; Rajaram, Narasimhan; Tunnell, James W.; Dasari, Ramachandra R.; Feld, Michael S.
2011-01-01
Diffuse reflectance spectroscopy (DRS) has been extensively applied for the characterization of biological tissue, especially for dysplasia and cancer detection, by determination of the tissue optical properties. A major challenge in performing routine clinical diagnosis lies in the extraction of the relevant parameters, especially at high absorption levels typically observed in cancerous tissue. Here, we present a new least-squares support vector machine (LS-SVM) based regression algorithm for rapid and accurate determination of the absorption and scattering properties. Using physical tissue models, we demonstrate that the proposed method can be implemented more than two orders of magnitude faster than the state-of-the-art approaches while providing better prediction accuracy. Our results show that the proposed regression method has great potential for clinical applications including in tissue scanners for cancer margin assessment, where rapid quantification of optical properties is critical to the performance. PMID:21412464
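One reason LS-SVM regression can be fast is that training reduces to solving a single linear system rather than a quadratic program. The sketch below shows the standard LS-SVM regression system with an RBF kernel in plain Python. It is a generic illustration, not the authors' implementation: the kernel choice, the regularization constant `gamma`, the kernel width `sigma` and all function names are assumptions.

```python
import math


def rbf(a, b, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)) / (2.0 * sigma ** 2))


def solve(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term on the kernel diagonal
        system.append(row)
    solution = solve(system, [0.0] + list(y))
    return solution[0], solution[1:]  # bias b, dual weights alpha


def lssvm_predict(X_train, b, alpha, x, sigma=1.0):
    """Evaluate f(x) = sum_i alpha_i * K(x, x_i) + b."""
    return b + sum(a * rbf(xi, x, sigma) for a, xi in zip(alpha, X_train))
```

Because every training point receives a nonzero weight, prediction cost grows with the training set size; in practice a numerical library would replace the hand-written solver used here to keep the example self-contained.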
Mao, Yong; Zhou, Xiao Bo; Pi, Dao Ying; Sun, You Xian
2005-11-01
In this study, we present a constructive algorithm for training cooperative support vector machine ensembles (CSVMEs). CSVME combines ensemble architecture design with cooperative training for individual SVMs in ensembles. Unlike most previous studies on training ensembles, CSVME puts emphasis on both accuracy and collaboration among individual SVMs in an ensemble. A group of SVMs selected on the basis of recursive classifier elimination is used in CSVME, and the number of the individual SVMs selected to construct CSVME is determined by 10-fold cross-validation. This kind of SVME has been tested on two ovarian cancer datasets previously obtained by proteomic mass spectrometry. By combining several individual SVMs, the proposed method achieves better performance than the SVME of all base SVMs.
Automatic pathology classification using a single feature machine learning support - vector machines
NASA Astrophysics Data System (ADS)
Yepes-Calderon, Fernando; Pedregosa, Fabian; Thirion, Bertrand; Wang, Yalin; Lepore, Natasha
2014-03-01
Magnetic Resonance Imaging (MRI) has been gaining popularity in the clinic in recent years as a safe in-vivo imaging technique. As a result, large troves of data are being gathered and stored daily that may be used as clinical training sets in hospitals. While numerous machine learning (ML) algorithms have been implemented for Alzheimer's disease classification, their outputs are usually difficult to interpret in the clinical setting. Here, we propose a simple method of rapid diagnostic classification for the clinic using Support Vector Machines (SVM) and easy-to-obtain geometrical measurements that, together with a cortical and sub-cortical brain parcellation, create a robust framework capable of automatic diagnosis with high accuracy. On a significantly large imaging dataset consisting of over 800 subjects taken from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, classification-success indexes of up to 99.2% are reached with a single measurement.
NASA Astrophysics Data System (ADS)
Sadat Hashemipour, Maryam; Soleimani, Seyed Ali
2016-01-01
An artificial immune system (AIS) algorithm based on the clonal selection method can be defined as a soft computing method inspired by the theoretical immune system in order to solve science and engineering problems. Support vector machine (SVM) is a popular pattern classification method with many diverse applications. Kernel parameter setting in the SVM training procedure, along with feature selection, significantly impacts the classification accuracy rate. In this study, an AIS based on the Adaptive Clonal Selection (AISACS) algorithm has been used to optimise the SVM parameters and feature subset selection without degrading the SVM classification accuracy. Several public datasets from the University of California Irvine (UCI) machine learning repository are employed to calculate the classification accuracy rate in order to evaluate the AISACS approach, which was then compared with the grid search algorithm and the Genetic Algorithm (GA) approach. The experimental results show that the feature reduction rate and running time of the AISACS approach are better than those of the GA approach.
A novel retinal vessel extraction algorithm based on matched filtering and gradient vector flow
NASA Astrophysics Data System (ADS)
Yu, Lei; Xia, Mingliang; Xuan, Li
2013-10-01
The microvasculature network of retina plays an important role in the study and diagnosis of retinal diseases (age-related macular degeneration and diabetic retinopathy for example). Although it is possible to noninvasively acquire high-resolution retinal images with modern retinal imaging technologies, non-uniform illumination, the low contrast of thin vessels and the background noises all make it difficult for diagnosis. In this paper, we introduce a novel retinal vessel extraction algorithm based on gradient vector flow and matched filtering to segment retinal vessels with different likelihood. Firstly, we use isotropic Gaussian kernel and adaptive histogram equalization to smooth and enhance the retinal images respectively. Secondly, a multi-scale matched filtering method is adopted to extract the retinal vessels. Then, the gradient vector flow algorithm is introduced to locate the edge of the retinal vessels. Finally, we combine the results of matched filtering method and gradient vector flow algorithm to extract the vessels at different likelihood levels. The experiments demonstrate that our algorithm is efficient and the intensities of vessel images exactly represent the likelihood of the vessels.
A Consistent Information Criterion for Support Vector Machines in Diverging Model Spaces
Zhang, Xiang; Wu, Yichao; Wang, Lan; Li, Runze
2015-01-01
Information criteria have been popularly used in model selection and proved to possess nice theoretical properties. For classification, Claeskens et al. (2008) proposed support vector machine information criterion for feature selection and provided encouraging numerical evidence. Yet no theoretical justification was given there. This work aims to fill the gap and to provide some theoretical justifications for support vector machine information criterion in both fixed and diverging model spaces. We first derive a uniform convergence rate for the support vector machine solution and then show that a modification of the support vector machine information criterion achieves model selection consistency even when the number of features diverges at an exponential rate of the sample size. This consistency result can be further applied to selecting the optimal tuning parameter for various penalized support vector machine methods. Finite-sample performance of the proposed information criterion is investigated using Monte Carlo studies and one real-world gene selection problem. PMID:27239164
Flux-vector splitting algorithm for chain-rule conservation-law form
NASA Technical Reports Server (NTRS)
Shih, T. I.-P.; Nguyen, H. L.; Willis, E. A.; Steinthorsson, E.; Li, Z.
1991-01-01
A flux-vector splitting algorithm with Newton-Raphson iteration was developed for the 'full compressible' Navier-Stokes equations cast in chain-rule conservation-law form. The algorithm is intended for problems with deforming spatial domains and for problems whose governing equations cannot be cast in strong conservation-law form. The usefulness of the algorithm for such problems was demonstrated by applying it to analyze the unsteady, two- and three-dimensional flows inside one combustion chamber of a Wankel engine under nonfiring conditions. Solutions were obtained to examine the algorithm in terms of conservation error, robustness, and ability to handle complex flows on time-dependent grid systems.
Instability of anisotropic cosmological solutions supported by vector fields.
Himmetoglu, Burak; Contaldi, Carlo R; Peloso, Marco
2009-03-20
Models with vector fields acquiring a nonvanishing vacuum expectation value along one spatial direction have been proposed to sustain a prolonged stage of anisotropic accelerated expansion. Such models have been used for realizations of early time inflation, with a possible relation to the large scale cosmic microwave background anomalies, or of the late time dark energy. We show that, quite generally, the concrete realizations proposed so far are plagued by instabilities (either ghosts or unstable growth of the linearized perturbations) which can be ultimately related to the longitudinal vector polarization present in them. Phenomenological results based on these models are therefore unreliable.
Computer-aided diagnosis for prostate cancer using support vector machine
NASA Astrophysics Data System (ADS)
Mohamed, Samar S.; Salama, Magdy M. A.
2005-04-01
The work in this paper aims to analyze texture features of the prostate using Trans-Rectal Ultra-Sound (TRUS) images for tissue characterization. This research is expected to assist beginner radiologists with decision making. Moreover, it will also assist in determining the biopsy locations. Texture feature analysis is composed of four stages. The first stage is automatically identifying Regions Of Interest (ROI), a step that was usually done either by an expert radiologist or by dividing the whole image into smaller squares that represent regions of interest. The second stage is extracting the statistical features from the identified ROIs. Two different statistical feature sets were used in this study: the first is Grey Level Dependence Matrix features and the second is Grey Level Difference Vector features. These constructed features are then ranked using a Mutual Information (MI) feature selection algorithm that maximizes MI between feature and class. The obtained feature sets, the combined feature set as well as the reduced feature subset were examined using a Support Vector Machine (SVM) classifier, a well-established classifier that is suitable for noisy data such as those obtained from ultrasound images. The obtained sensitivity is 83.3%, specificity ranges from 90% to 100% and accuracy ranges from 87.5% to 93.75%.
Fall Detection with the Support Vector Machine during Scripted and Continuous Unscripted Activities
Liu, Shing-Hong; Cheng, Wen-Chang
2012-01-01
In recent years, the number of proposed fall-detection systems that have been developed has increased dramatically. A threshold-based algorithm utilizing an accelerometer has been used to detect low-complexity falling activities. In this study, we defined activities in which the body's center of gravity quickly declines as falling activities of daily life (ADLs). In the non-falling ADLs, we also focused on the body's center of gravity. A hyperplane of the support vector machine (SVM) was used as the separating plane to replace the traditional threshold method for the detection of falling ADLs. The scripted and continuous unscripted activities were performed by two groups of young volunteers (20 subjects) and one group of elderly volunteers (five subjects). The results showed that the four parameters of the input vector had the best accuracy with 99.1% and 98.4% in the training and testing, respectively. For the continuous unscripted test of one hour, there were two and one false positive events among young volunteers and elderly volunteers, respectively. PMID:23112713
Classification of Fruits Using Computer Vision and a Multiclass Support Vector Machine
Zhang, Yudong; Wu, Lenan
2012-01-01
Automatic classification of fruits via computer vision is still a complicated task due to the various properties of numerous types of fruits. We propose a novel classification method based on a multi-class kernel support vector machine (kSVM) with the goal of accurate and fast classification of fruits. First, fruit images were acquired by a digital camera, and the background of each image was removed by a split-and-merge algorithm. Second, the color histogram, texture and shape features of each fruit image were extracted to compose a feature space. Third, principal component analysis (PCA) was used to reduce the dimensions of the feature space. Fourth, three kinds of multi-class SVMs were constructed, i.e., Winner-Takes-All SVM, Max-Wins-Voting SVM, and Directed Acyclic Graph SVM, and three kinds of kernels were chosen, i.e., linear kernel, Homogeneous Polynomial kernel, and Gaussian Radial Basis kernel. Finally, the SVMs were trained using 5-fold stratified cross validation with the reduced feature vectors as input. The experimental results demonstrated that the Max-Wins-Voting SVM with Gaussian Radial Basis kernel achieves the best classification accuracy of 88.2%. In terms of computation time, the Directed Acyclic Graph SVM is the fastest. PMID:23112727
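The Max-Wins-Voting scheme mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pairwise rule `nearest_prototype` and the `PROTOTYPES` table are hypothetical stand-ins for trained binary SVMs, chosen only so the voting logic can run on its own.

```python
from collections import Counter
from itertools import combinations


def max_wins_vote(sample, classes, pairwise_decide):
    """Return the class that wins the most pairwise (one-vs-one) contests.

    pairwise_decide(a, b, sample) is a hypothetical callable standing in
    for a trained binary SVM; it must return either a or b.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(a, b, sample)] += 1
    # Ties are broken by the order of `classes` (an arbitrary choice here).
    return max(classes, key=lambda c: votes[c])


# Toy stand-in: each class has a 1-D prototype; the nearer prototype wins.
PROTOTYPES = {"apple": 1.0, "banana": 5.0, "cherry": 9.0}


def nearest_prototype(a, b, sample):
    """Hypothetical pairwise rule used in place of a binary SVM."""
    if abs(sample - PROTOTYPES[a]) <= abs(sample - PROTOTYPES[b]):
        return a
    return b


label = max_wins_vote(4.2, list(PROTOTYPES), nearest_prototype)
```

With K classes this scheme evaluates K(K-1)/2 pairwise classifiers per sample, which is one reason a directed acyclic graph arrangement (K-1 evaluations) can be faster at prediction time.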
Chanda, Emmanuel; Mukonka, Victor Munyongwe; Mthembu, David; Kamuliwo, Mulakwa; Coetzer, Sarel; Shinondo, Cecilia Jill
2012-01-01
Geographic information systems (GISs) with emerging technologies are being harnessed for studying spatial patterns in vector-borne diseases to reduce transmission. To implement effective vector control, increased knowledge on interactions of epidemiological and entomological malaria transmission determinants in the assessment of impact of interventions is critical. This requires availability of relevant spatial and attribute data to support malaria surveillance, monitoring, and evaluation. Monitoring the impact of vector control through a GIS-based decision support system (DSS) has revealed spatial relative change in prevalence of infection and vector susceptibility to insecticides and has enabled measurement of spatial heterogeneity of trend or impact. The revealed trends and interrelationships have allowed the identification of areas with reduced parasitaemia and increased insecticide resistance, thus demonstrating the impact of resistance on vector control. The GIS-based DSS provides an opportunity for rational policy formulation and cost-effective utilization of limited resources for enhanced malaria vector control.
Ma, Qun; Hao, Gui-qi; Qiao, Yan-jiang; Zhang, Zhuo-yong; Zhang, Xiao-fang
2006-10-01
A method for determining the artificial bezoar powder in bezoar powder using near-infrared (NIR) diffuse reflectance spectrometry is proposed in this paper. The method is based on a support vector machine (SVM). The calibration set was built by adding varying amounts of artificial bezoar powder to the bezoar powder (content range: 0%-100%) and collecting the NIR spectra of the samples in the wavenumber range of 4000-10000 cm⁻¹. The processing algorithm was a wavelet transform with first and second derivatives. A mathematical model based on the support vector machine was established. The model was checked with the leave-one-out method. The sum of the squares of the relative prediction error was 0.00135. This method is reliable and can be used to control the quality of bezoar powder.
Rajesh Sharma, R.; Marikkannu, P.
2015-01-01
A novel hybrid approach for identifying the brain regions accountable for brain tumors using magnetic resonance images is presented in this paper. Classification of medical images is important in both clinical and research areas. The magnetic resonance imaging (MRI) modality performs well in diagnosing brain abnormalities such as brain tumor, multiple sclerosis, hemorrhage, and many more. The primary objective of this work is to propose a novel three-dimensional (3D) brain tumor classification model using MRI images with both micro- and macroscale textures, designed to differentiate brain MRI into two classes of lesion, benign and malignant. The images were initially preprocessed using a 3D Gaussian filter. Based on the VOI (volume of interest) of the image, features were extracted using the 3D volumetric Square Centroid Lines Gray Level Distribution Method (SCLGM) along with 3D run length and cooccurrence matrix. The optimal features are selected using the proposed refined gravitational search algorithm (RGSA). Support vector machines, over backpropagation network, and k-nearest neighbor are used to evaluate the goodness of the classifier approach. The preliminary evaluation of the system is performed using 320 real-time brain MRI images. The system is trained and tested by using a leave-one-case-out method. The performance of the classifier is tested using the receiver operating characteristic curve of 0.986 (±002). The experimental results demonstrate the systematic and efficient feature extraction and feature selection algorithm to the performance of state-of-the-art feature classification methods. PMID:26509188
Support Vector Machine (SVM) based Rain Area Detection from Kalpana-1 Satellite Data
NASA Astrophysics Data System (ADS)
Upadhyaya, S.; Ramsankaran, R. A. A. J.
2014-11-01
Rain is one of the major components of the water cycle; extreme rain events can cause destruction and misery due to flash floods and droughts. Therefore, assessing rainfall at high temporal and spatial resolution is of fundamental importance, which can be achieved only by satellite remote sensing. Though there are many algorithms developed for estimation of rainfall using satellite data, they suffer from various drawbacks. One such challenge in satellite rainfall estimation is to detect rain and no-rain areas properly. To address this problem, in the present study we have used Support Vector Machines (SVM). It is significant to note that this is the first study to report the utility of SVM in detecting rain and no-rain areas. The performance of the developed SVM-based index has been evaluated by comparing it with the two most popular rain detection methods used for Indian regions, i.e. the simple TIR threshold used in the Global Precipitation Index (GPI) technique and the Roca method used in the Insat Multi Spectral Rainfall Algorithm (IMSRA). Performance of the above indices has been analyzed by considering various categorical statistics such as Probability of Detection (POD), Probability of no-rain detection (POND), Accuracy, Bias, False Alarm Ratio (FAR) and Heidke Skill Score (HSS). The obtained results clearly show that the new SVM-based index performs much better than the earlier indices.
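The categorical statistics listed in this abstract are conventionally computed from a 2x2 rain/no-rain contingency table. The sketch below uses the standard forecast-verification definitions; the abstract does not spell out its formulas, so these definitions (in particular POND as correct negatives over all observed no-rain cases) and the example counts are assumptions.

```python
def categorical_scores(hits, false_alarms, misses, correct_negatives):
    """Compute common 2x2 contingency-table scores for rain detection.

    hits: rain detected and observed
    false_alarms: rain detected but not observed
    misses: rain observed but not detected
    correct_negatives: no rain detected and none observed
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    n = a + b + c + d
    return {
        "POD": a / (a + c),            # probability of detection
        "POND": d / (b + d),           # probability of no-rain detection (assumed definition)
        "Accuracy": (a + d) / n,
        "Bias": (a + b) / (a + c),     # frequency bias
        "FAR": b / (a + b),            # false alarm ratio
        "HSS": 2 * (a * d - b * c)
               / ((a + c) * (c + d) + (a + b) * (b + d)),
    }


# Hypothetical counts, for illustration only.
scores = categorical_scores(hits=50, false_alarms=10,
                            misses=20, correct_negatives=120)
```

The function does not guard against empty rows or columns (for example, no observed rain at all), which would raise a division-by-zero error.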
NASA Astrophysics Data System (ADS)
Delgado, Juan A.; Altuve, Miguel; Nabhan Homsi, Masun
2015-12-01
This paper introduces a robust method based on the Support Vector Machine (SVM) algorithm to detect the presence of Fetal QRS (fQRS) complexes in electrocardiogram (ECG) recordings provided by the PhysioNet/CinC challenge 2013. ECG signals are first segmented into contiguous frames of 250 ms duration and then labeled in six classes. Fetal segments are tagged according to the position of fQRS complex within each one. Next, segment features extraction and dimensionality reduction are obtained by applying principal component analysis on Haar-wavelet transform. After that, two sub-datasets are generated to separate representative segments from atypical ones. Imbalanced class problem is dealt by applying sampling without replacement on each sub-dataset. Finally, two SVMs are trained and cross-validated using the two balanced sub-datasets separately. Experimental results show that the proposed approach achieves high performance rates in fetal heartbeats detection that reach up to 90.95% of accuracy, 92.16% of sensitivity, 88.51% of specificity, 94.13% of positive predictive value and 84.96% of negative predictive value. A comparative study is also carried out to show the performance of other two machine learning algorithms for fQRS complex estimation, which are K-nearest neighborhood and Bayesian network.
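The first step described in this abstract, cutting a recording into contiguous 250 ms frames, can be sketched as follows. The sampling rate used in the example and the decision to drop a trailing partial frame are assumptions; the abstract specifies neither.

```python
def segment_frames(signal, sampling_rate_hz, frame_ms=250):
    """Split a 1-D signal into contiguous, non-overlapping frames.

    A trailing partial frame is dropped (an assumption; the abstract
    does not say how the end of a recording is handled).
    """
    samples_per_frame = int(round(sampling_rate_hz * frame_ms / 1000))
    if samples_per_frame <= 0:
        raise ValueError("frame length must cover at least one sample")
    n_frames = len(signal) // samples_per_frame
    return [
        signal[i * samples_per_frame:(i + 1) * samples_per_frame]
        for i in range(n_frames)
    ]


# Hypothetical 1.1 s recording sampled at an assumed 1000 Hz.
recording = list(range(1100))
frames = segment_frames(recording, sampling_rate_hz=1000)
```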
Three-class classification in computer-aided diagnosis of breast cancer by support vector machine
NASA Astrophysics Data System (ADS)
Sun, Xuejun; Qian, Wei; Song, Dansheng
2004-05-01
The design of the classifier in a computer-aided diagnosis (CAD) scheme for breast cancer plays an important role in its overall performance in sensitivity and specificity. Classification of a detected object as malignant lesion, benign lesion, or normal tissue on a mammogram is a typical three-class pattern recognition problem. This paper presents a three-class classification approach that uses a two-stage classifier combined with a support vector machine (SVM) learning algorithm for classification of breast cancer on mammograms. The first classification stage is used to separate abnormal areas from normal breast tissues, and the second stage classifies detected abnormal objects as malignant or benign. A series of spatial, morphology and texture features have been extracted from the detected object areas. By using a genetic algorithm (GA), different feature groups for the different classification stages have been investigated. Computerized free-response receiver operating characteristic (FROC) and receiver operating characteristic (ROC) analyses have been employed in the different classification stages. Results show an obvious performance improvement in both sensitivity and specificity for the proposed classification approach compared with conventional two-class classification approaches, indicating its effectiveness in classification of breast cancer on mammograms.
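The two-stage arrangement described in this abstract amounts to a cascade of two binary decisions. A minimal sketch of the control flow is given below; the two stage functions and the feature names `contrast` and `margin_irregularity` are hypothetical placeholders for the trained SVMs and the paper's features, which the abstract does not detail.

```python
def two_stage_classify(features, is_abnormal, is_malignant):
    """Cascade two binary classifiers into a three-class decision.

    is_abnormal and is_malignant are hypothetical callables standing in
    for the stage-1 and stage-2 SVMs; each returns True or False.
    """
    if not is_abnormal(features):
        return "normal"
    return "malignant" if is_malignant(features) else "benign"


def stage_one(features):
    """Toy stage-1 rule (illustrative threshold, not from the paper)."""
    return features["contrast"] > 0.5


def stage_two(features):
    """Toy stage-2 rule (illustrative threshold, not from the paper)."""
    return features["margin_irregularity"] > 0.7


result = two_stage_classify(
    {"contrast": 0.8, "margin_irregularity": 0.9}, stage_one, stage_two
)
```

Because the second stage only sees objects flagged as abnormal, each stage can use its own feature subset, which matches the abstract's note that different feature groups were investigated per stage.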
NASA Astrophysics Data System (ADS)
Wang, Guochang; Carr, Timothy R.; Ju, Yiwen; Li, Chaofeng
2014-03-01
In unconventional shale reservoirs, as a result of extremely low matrix permeability, higher potential gas productivity requires not only sufficient gas-in-place but also a high concentration of brittle minerals (silica and/or carbonate) that is amenable to hydraulic fracturing. Shale lithofacies is primarily defined by mineral composition and organic matter richness, and its representation as a 3-D model has advantages in recognizing productive zones of shale-gas reservoirs, designing horizontal wells and stimulation strategy, and aiding in understanding the depositional process of organic-rich shale. A challenging and key step is to effectively recognize shale lithofacies from conventional well logs, where the relationship is very complex and nonlinear. In the recognition of shale lithofacies, the support vector machine (SVM), which is based on statistical learning theory and the structural risk minimization principle, is superior to the artificial neural network (ANN), which relies on the traditional empirical risk minimization principle. We propose an SVM classifier combined with learning algorithms, such as grid searching, genetic algorithm and particle swarm optimization, and various kernel functions as the approach to identify Marcellus Shale lithofacies. Compared with ANN classifiers, the experimental results of SVM classifiers showed higher cross-validation accuracy, better stability and less computational time cost. The SVM classifier with a radial basis function kernel worked best when trained by particle swarm optimization. The lithofacies predicted using the SVM classifier are used to build a 3-D Marcellus Shale lithofacies model, which assists in identifying higher productive zones, especially with thermal maturity and natural fractures.
NASA Astrophysics Data System (ADS)
Li, S. X.; Zhang, Y. J.; Zeng, Q. Y.; Li, L. F.; Guo, Z. Y.; Liu, Z. M.; Xiong, H. L.; Liu, S. H.
2014-06-01
Cancer is the most common disease to threaten human health. The ability to screen individuals with malignant tumours with only a blood sample would be greatly advantageous to early diagnosis and intervention. This study explores the possibility of discriminating between cancer patients and normal subjects with serum surface-enhanced Raman spectroscopy (SERS) and a support vector machine (SVM) through a peripheral blood sample. A total of 130 blood samples were obtained from patients with liver cancer, colonic cancer, esophageal cancer, nasopharyngeal cancer, and gastric cancer, as well as 113 blood samples from normal volunteers. Several diagnostic models were built with the serum SERS spectra using SVM and principal component analysis (PCA) techniques. The results show that a diagnostic accuracy of 85.5% is acquired with a PCA algorithm, while a diagnostic accuracy of 95.8% is obtained using the radial basis function (RBF) PCA-SVM method. The results prove that an RBF kernel PCA-SVM technique is superior to PCA and conventional SVM (C-SVM) algorithms in classifying serum SERS spectra. The study demonstrates that serum SERS, in combination with SVM techniques, has great potential for screening cancerous patients with any solid malignant tumour through a peripheral blood sample.
SUPPORT VECTOR MACHINES FOR BROAD AREA FEATURE CLASSIFICATION IN REMOTELY SENSED IMAGES
S. PERKINS; N. HARVEY; ET AL
2001-03-01
Classification of broad area features in satellite imagery is one of the most important applications of remote sensing. It is often difficult and time-consuming to develop classifiers by hand, so many researchers have turned to techniques from the fields of statistics and machine learning to automatically generate classifiers. Common techniques include maximum likelihood classifiers, neural networks and genetic algorithms. We present a new system called Afreet, which uses a recently developed machine learning paradigm called Support Vector Machines (SVMs). In contrast to other techniques, SVMs offer a solid mathematical foundation that provides a probabilistic guarantee on how well the classifier will generalize to unseen data. In addition, the SVM training algorithm is guaranteed to converge to the globally optimal SVM classifier, can learn highly non-linear discrimination functions, copes extremely well with high-dimensional feature spaces (such as hyperspectral data), and scales well to large problem sizes. Afreet combines an SVM with a sophisticated spatio-spectral feature construction mechanism that allows it to classify spectrally ambiguous pixels. We demonstrate the effectiveness of the system by applying Afreet to several broad area classification problems in remote sensing, and provide a comparison with conventional maximum likelihood classification.
Estimating stellar atmospheric parameters based on LASSO and support-vector regression
NASA Astrophysics Data System (ADS)
Lu, Yu; Li, Xiangru
2015-09-01
A scheme for estimating atmospheric parameters Teff, log g and [Fe/H] is proposed on the basis of the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm and Haar wavelet. The proposed scheme consists of three processes. A spectrum is decomposed using the Haar wavelet transform and low-frequency components at the fourth level are considered as candidate features. Then, spectral features from the candidate features are detected using the LASSO algorithm to estimate the atmospheric parameters. Finally, atmospheric parameters are estimated from the extracted spectral features using the support-vector regression (SVR) method. The proposed scheme was evaluated using three sets of stellar spectra from the Sloan Digital Sky Survey (SDSS), Large Sky Area Multi-object Fibre Spectroscopic Telescope (LAMOST) and Kurucz's model, respectively. The mean absolute errors are as follows: for the 40 000 SDSS spectra, 0.0062 dex for log Teff (85.83 K for Teff), 0.2035 dex for log g and 0.1512 dex for [Fe/H]; for the 23 963 LAMOST spectra, 0.0074 dex for log Teff (95.37 K for Teff), 0.1528 dex for log g and 0.1146 dex for [Fe/H]; for the 10 469 synthetic spectra, 0.0010 dex for log Teff (14.42K for Teff), 0.0123 dex for log g and 0.0125 dex for [Fe/H].
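The feature-extraction step in this abstract keeps the low-frequency (approximation) coefficients of a level-4 Haar decomposition. A minimal pure-Python sketch of that step follows; the orthonormal scaling by 1/sqrt(2) and the handling of odd-length inputs are assumptions, since the abstract does not state the authors' conventions.

```python
import math


def haar_approximation(signal, levels=4):
    """Return the Haar approximation coefficients after `levels` steps.

    Each step replaces adjacent pairs (x0, x1) with (x0 + x1) / sqrt(2),
    halving the length. If the current length is odd, the last sample is
    dropped (an assumption made for this sketch).
    """
    approx = [float(x) for x in signal]
    for _ in range(levels):
        if len(approx) < 2:
            raise ValueError("signal too short for requested level")
        if len(approx) % 2:
            approx = approx[:-1]
        approx = [
            (approx[i] + approx[i + 1]) / math.sqrt(2)
            for i in range(0, len(approx), 2)
        ]
    return approx


# Hypothetical flat 32-pixel spectrum: level 4 leaves 2 coefficients.
candidate_features = haar_approximation([1.0] * 32, levels=4)
```

Each level halves the number of coefficients, so a level-4 approximation reduces a spectrum of N pixels to roughly N/16 candidate features before any LASSO-style selection.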
Eddy current characterization of small cracks using least square support vector machine
NASA Astrophysics Data System (ADS)
Chelabi, M.; Hacib, T.; Le Bihan, Y.; Ikhlef, N.; Boughedda, H.; Mekideche, M. R.
2016-04-01
Eddy current (EC) sensors are used for non-destructive testing since they are able to probe conductive materials. Although EC testing is a conventional technique for defect detection and localization, its main weakness is that defect characterization, or the exact determination of the shape and dimension, is still an open question. In this work, we demonstrate the capability of small crack sizing using signals acquired from an EC sensor. We report our effort to develop a systematic approach to estimate the size of rectangular and thin defects (length and depth) in a conductive plate. The approach is achieved by a novel combination of a finite element method (FEM) with a statistical learning method called least squares support vector machines (LS-SVM). First, we use the FEM to design the forward problem. Next, an algorithm is used to find an adaptive database. Finally, the LS-SVM is used to solve the inverse problems, creating polynomial functions able to approximate the correlation between the crack dimension and the signal picked up from the EC sensor. Several methods are used to find the parameters of the LS-SVM. In this study, particle swarm optimization (PSO) and a genetic algorithm (GA) are proposed for tuning the LS-SVM. The results of the design and the inversions were compared to both simulated and experimental data, with accuracy experimentally verified. These results demonstrate the applicability of the presented approach.
Nara, Takaaki; Oohama, Junji; Hashimoto, Masaru; Takeda, Tsunehiro; Ando, Shigeru
2007-07-01
This paper presents a novel algorithm to reconstruct parameters of a sufficient number of current dipoles that describe data (equivalent current dipoles, ECDs, hereafter) from radial/vector magnetoencephalography (MEG) with and without electroencephalography (EEG). We assume a three-compartment head model and arbitrary surfaces on which the MEG sensors and EEG electrodes are placed. Via the multipole expansion of the magnetic field, we obtain algebraic equations relating the dipole parameters to the vector MEG/EEG data. By solving them directly, without providing initial parameter guesses and computing forward solutions iteratively, the dipole positions and moments projected onto the xy-plane (equatorial plane) are reconstructed from a single time shot of the data. In addition, when the head layers and the sensor surfaces are spherically symmetric, we show that the required data reduce to radial MEG only. This clarifies the advantage of vector MEG/EEG measurements and algorithms for a generally-shaped head and sensor surfaces. In the numerical simulations, the centroids of the patch sources are well localized using vector/radial MEG measured on the upper hemisphere. By assuming the model order to be larger than the actual dipole number, the resultant spurious dipole is shown to have a much smaller strength magnetic moment (about 0.05 times smaller when the SNR = 16 dB), so that the number of ECDs is reasonably estimated. We consider that our direct method with greatly reduced computational cost can also be used to provide a good initial guess for conventional dipolar/multipolar fitting algorithms. PMID:17664582
Technology Transfer Automated Retrieval System (TEKTRAN)
A somatic transformation vector, pDP9, was constructed that provides a simplified means of producing permanently transformed cultured insect cells that support high levels of protein expression of foreign genes. The pDP9 plasmid vector incorporates DNA sequences from the Junonia coenia densovirus th...
Evaluation of vectorized Monte Carlo algorithms on GPUs for a neutron Eigenvalue problem
Du, X.; Liu, T.; Ji, W.; Xu, X. G.; Brown, F. B.
2013-07-01
Conventional Monte Carlo (MC) methods for radiation transport computations are 'history-based', which means that one particle history at a time is tracked. Simulations based on such methods suffer from thread divergence on the graphics processing unit (GPU), which severely affects the performance of GPUs. To circumvent this limitation, event-based vectorized MC algorithms can be utilized. A versatile software test-bed, called ARCHER - Accelerated Radiation-transport Computations in Heterogeneous Environments - was used for this study. ARCHER facilitates the development and testing of an MC code based on the vectorized MC algorithm implemented on GPUs by using NVIDIA's Compute Unified Device Architecture (CUDA). The ARCHER_GPU code was designed to solve a neutron eigenvalue problem and was tested on an NVIDIA Tesla M2090 Fermi card. We found that although the vectorized MC method significantly reduces the occurrence of divergent branching and enhances the warp execution efficiency, the overall simulation speed is ten times slower than the conventional history-based MC method on GPUs. By analyzing detailed GPU profiling information from ARCHER, we discovered that the main reason was the large amount of global memory transactions, causing severe memory access latency. Several possible solutions to alleviate the memory latency issue are discussed.
Water flow algorithm decision support tool for travelling salesman problem
NASA Astrophysics Data System (ADS)
Kamarudin, Anis Aklima; Othman, Zulaiha Ali; Sarim, Hafiz Mohd
2016-08-01
This paper discusses the role of a Decision Support Tool (DST) for the Travelling Salesman Problem (TSP) in helping researchers working in the same area obtain better results from the proposed algorithm. A study was conducted using the Rapid Application Development (RAD) model as the methodology, which includes requirement planning, user design, construction and cutover. The Water Flow Algorithm (WFA) with an improved initialization technique is the proposed algorithm, and its effectiveness is evaluated on TSP cases. The DST is evaluated through usability testing covering system use, quality of information, quality of interface and overall satisfaction. The evaluation is needed to determine whether the tool can assist users in deciding how to solve TSP problems with the proposed algorithm. Statistical results show the ability of the tool to help researchers conduct experiments on the WFA with improved TSP initialization.
Wu, Hai-wei; Yu, Hai-ye; Zhang, Lei
2011-05-01
Using K-fold cross validation method and two support vector machine functions, four kernel functions, grid-search, genetic algorithm and particle swarm optimization, the authors constructed the support vector machine model of the best penalty parameter c and the best correlation coefficient. Using information granulation technology, the authors constructed P particle and epsilon particle about those factors affecting net photosynthetic rate, and reduced these dimensions of the determinant. P particle includes the percent of visible spectrum ingredients. Epsilon particle includes leaf temperature, scattering radiation, air temperature, and so on. It is possible to obtain the best correlation coefficient among photosynthetic effective radiation, visible spectrum and individual net photosynthetic rate by this technology. The authors constructed the training set and the forecasting set including photosynthetic effective radiation, P particle and epsilon particle. The result shows that epsilon-SVR-RBF-genetic algorithm model, nu-SVR-linear-grid-search model and nu-SVR-RBF-genetic algorithm model obtain the correlation coefficient of up to 97% about the forecasting set including photosynthetic effective radiation and P particle. The penalty parameter c of nu-SVR-linear-grid-search model is the minimum, so the model's generalization ability is the best. The authors forecasted the forecasting set including photosynthetic effective radiation, P particle and epsilon particle by the model, and the correlation coefficient is up to 96%.
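The K-fold cross validation used in this abstract to select the penalty parameter relies on partitioning the samples into K disjoint test folds. The sketch below shows only that partitioning step, under the assumption of contiguous, unshuffled folds; the abstract does not state how the folds were formed.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs for K folds.

    Folds are contiguous and unshuffled (an assumption for this sketch);
    the first n_samples % k folds receive one extra sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


folds = k_fold_indices(n_samples=10, k=3)
```

In a grid search, each candidate parameter value would be scored by averaging a validation metric over these K train/test pairs, and the best-scoring value kept.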
An algorithm to estimate the object support in truncated images
Hsieh, Scott S.; Nett, Brian E.; Cao, Guangzhi; Pelc, Norbert J.
2014-07-15
Purpose: Truncation artifacts in CT occur if the object to be imaged extends past the scanner field of view (SFOV). These artifacts impede diagnosis and could possibly introduce errors in dose plans for radiation therapy. Several approaches exist for correcting truncation artifacts, but existing correction algorithms do not accurately recover the skin line (or support) of the patient, which is important in some dose planning methods. The purpose of this paper was to develop an iterative algorithm that recovers the support of the object. Methods: The authors assume that the truncated portion of the image is made up of soft tissue of uniform CT number and attempt to find a shape consistent with the measured data. Each known measurement in the sinogram is interpreted as an estimate of missing mass along a line. An initial estimate of the object support is generated by thresholding a reconstruction made using a previous truncation artifact correction algorithm (e.g., water cylinder extrapolation). This object support is iteratively deformed to reduce the inconsistency with the measured data. The missing data are estimated using this object support to complete the dataset. The method was tested on simulated and experimentally truncated CT data. Results: The proposed algorithm produces a better defined skin line than water cylinder extrapolation. On the experimental data, the RMS error of the skin line is reduced by about 60%. For moderately truncated images, some soft tissue contrast is retained near the SFOV. As the extent of truncation increases, the soft tissue contrast outside the SFOV becomes unusable although the skin line remains clearly defined, and in reformatted images it varies smoothly from slice to slice as expected. Conclusions: The support recovery algorithm provides a more accurate estimate of the patient outline than thresholded, basic water cylinder extrapolation, and may be preferred in some radiation therapy applications.
Estimation of Electrically-Evoked Knee Torque from Mechanomyography Using Support Vector Regression.
Ibitoye, Morufu Olusola; Hamzaid, Nur Azah; Abdul Wahab, Ahmad Khairi; Hasnan, Nazirah; Olatunji, Sunday Olusanya; Davis, Glen M
2016-07-19
The difficulty of real-time muscle force or joint torque estimation during neuromuscular electrical stimulation (NMES) in physical therapy and exercise science has motivated recent research interest in torque estimation from other muscle characteristics. This study investigated the accuracy of a computational intelligence technique for estimating NMES-evoked knee extension torque based on the Mechanomyographic signals (MMG) of contracting muscles that were recorded from eight healthy males. Simulation of the knee torque was modelled via Support Vector Regression (SVR) due to its good generalization ability in related fields. Inputs to the proposed model were MMG amplitude characteristics, the level of electrical stimulation or contraction intensity, and knee angle. Gaussian kernel function, as well as its optimal parameters were identified with the best performance measure and were applied as the SVR kernel function to build an effective knee torque estimation model. To train and test the model, the data were partitioned into training (70%) and testing (30%) subsets, respectively. The SVR estimation accuracy, based on the coefficient of determination (R²) between the actual and the estimated torque values was up to 94% and 89% during the training and testing cases, with root mean square errors (RMSE) of 9.48 and 12.95, respectively. The knee torque estimations obtained using SVR modelling agreed well with the experimental data from an isokinetic dynamometer. These findings support the realization of a closed-loop NMES system for functional tasks using MMG as the feedback signal source and an SVR algorithm for joint torque estimation.
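The two accuracy measures quoted in the abstract above, R² and RMSE, can be sketched as follows. This is a minimal illustration in plain Python, assuming the common "1 minus residual over total sum of squares" definition of R²; the abstract does not say which definition the authors used, and the torque values below are made up for the example.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between measured and estimated values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical torque values, not data from the study.
measured = [10, 20, 30, 40]
estimated = [12, 18, 33, 39]
print(rmse(measured, estimated), r_squared(measured, estimated))
```

Computing both measures separately on the training (70%) and testing (30%) subsets gives the kind of paired figures reported in the abstract.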
Automated detection of pulmonary nodules in CT images with support vector machines
NASA Astrophysics Data System (ADS)
Liu, Lu; Liu, Wanyu; Sun, Xiaoming
2008-10-01
Many methods have been proposed to help radiologists avoid missing small pulmonary nodules. Recently, support vector machines (SVMs) have received increasing attention for pattern recognition. In this paper, we present a computerized system aimed at pulmonary nodule detection; it identifies the lung field, extracts a set of candidate regions with a high sensitivity ratio and then classifies candidates by the use of SVMs. The Computer Aided Diagnosis (CAD) system presented in this paper supports the diagnosis of pulmonary nodules from Computed Tomography (CT) images as inflammation, tuberculoma, granuloma, sclerosing hemangioma, and malignant tumor. Five texture feature sets were extracted for each lesion, while a genetic algorithm based feature selection method was applied to identify the most robust features. The selected feature set was fed into an ensemble of SVM classifiers. The achieved classification performance was 100%, 92.75% and 90.23% in the training, validation and testing set, respectively. It is concluded that computerized analysis of medical images in combination with artificial intelligence can be used in clinical practice and may contribute to more efficient diagnosis.
A Support Vector Machine Classification of Thyroid Bioptic Specimens Using MALDI-MSI Data
De Sio, Gabriele; Chinello, Clizia; Pagni, Fabio
2016-01-01
Biomarkers able to characterise and predict multifactorial diseases are still one of the most important targets for all the “omics” investigations. In this context, Matrix-Assisted Laser Desorption/Ionisation-Mass Spectrometry Imaging (MALDI-MSI) has gained considerable attention in recent years, but it also led to a huge amount of complex data to be elaborated and interpreted. For this reason, computational and machine learning procedures for biomarker discovery are important tools to consider, both to reduce data dimension and to provide predictive markers for specific diseases. For instance, the availability of protein and genetic markers to support thyroid lesion diagnoses would impact deeply on society due to the high presence of undetermined reports (THY3) that are generally treated as malignant patients. In this paper we show how an accurate classification of thyroid bioptic specimens can be obtained through the application of a state-of-the-art machine learning approach (i.e., Support Vector Machines) on MALDI-MSI data, together with a particular wrapper feature selection algorithm (i.e., recursive feature elimination). The model is able to provide an accurate discriminatory capability using only 20 out of 144 features, resulting in an increase of the model performances, reliability, and computational efficiency. Finally, tissue areas rather than average proteomic profiles are classified, highlighting potential discriminating areas of clinical interest. PMID:27293431
Estimation of Electrically-Evoked Knee Torque from Mechanomyography Using Support Vector Regression.
Ibitoye, Morufu Olusola; Hamzaid, Nur Azah; Abdul Wahab, Ahmad Khairi; Hasnan, Nazirah; Olatunji, Sunday Olusanya; Davis, Glen M
2016-01-01
The difficulty of real-time muscle force or joint torque estimation during neuromuscular electrical stimulation (NMES) in physical therapy and exercise science has motivated recent research interest in torque estimation from other muscle characteristics. This study investigated the accuracy of a computational intelligence technique for estimating NMES-evoked knee extension torque based on the Mechanomyographic signals (MMG) of contracting muscles that were recorded from eight healthy males. Simulation of the knee torque was modelled via Support Vector Regression (SVR) due to its good generalization ability in related fields. Inputs to the proposed model were MMG amplitude characteristics, the level of electrical stimulation or contraction intensity, and knee angle. Gaussian kernel function, as well as its optimal parameters were identified with the best performance measure and were applied as the SVR kernel function to build an effective knee torque estimation model. To train and test the model, the data were partitioned into training (70%) and testing (30%) subsets, respectively. The SVR estimation accuracy, based on the coefficient of determination (R²) between the actual and the estimated torque values was up to 94% and 89% during the training and testing cases, with root mean square errors (RMSE) of 9.48 and 12.95, respectively. The knee torque estimations obtained using SVR modelling agreed well with the experimental data from an isokinetic dynamometer. These findings support the realization of a closed-loop NMES system for functional tasks using MMG as the feedback signal source and an SVR algorithm for joint torque estimation. PMID:27447638
NASA Astrophysics Data System (ADS)
Guo, Q.; Shao, J.; Ruiz, V.
2005-01-01
This paper investigates detection of architectural distortion in mammographic images using support vector machine. Hausdorff dimension is used to characterise the texture feature of mammographic images. Support vector machine, a learning machine based on statistical learning theory, is trained through supervised learning to detect architectural distortion. Compared to the Radial Basis Function neural networks, SVM produced more accurate classification results in distinguishing architectural distortion abnormality from normal breast parenchyma.
Self-adapting root-MUSIC algorithm and its real-valued formulation for acoustic vector sensor array
NASA Astrophysics Data System (ADS)
Wang, Peng; Zhang, Guo-jun; Xue, Chen-yang; Zhang, Wen-dong; Xiong, Ji-jun
2012-12-01
In this paper, based on the root-MUSIC algorithm for acoustic pressure sensor arrays, a new self-adapting root-MUSIC algorithm for acoustic vector sensor arrays is proposed by self-adaptively selecting the lead orientation vector. Its real-valued formulation, obtained by Forward-Backward (FB) smoothing and a real-valued inverse covariance matrix, is also proposed; it can reduce the computational complexity and distinguish coherent signals. Simulation results show that the two new algorithms perform better than the traditional MUSIC algorithm in direction of arrival (DOA) estimation at low signal-to-noise ratio (SNR), and experimental results using a MEMS vector hydrophone array in lake trials show the engineering practicability of the two new algorithms.
Learning interpretive decision algorithm for severe storm forecasting support
Gaffney, J.E. Jr.; Racer, I.R.
1983-01-01
As part of its ongoing program to develop new and better forecasting procedures and techniques, the National Weather Service has initiated an effort in interpretive processing. Investigation has begun to determine the applicability of artificial intelligence (AI)/expert system technology to interpretive processing. This paper presents an expert system algorithm that is being investigated to support the forecasting of severe thunderstorms. 14 references.
Designing an Algorithm Animation System To Support Instructional Tasks.
ERIC Educational Resources Information Center
Hamilton-Taylor, Ashley George; Kraemer, Eileen
2002-01-01
The authors are conducting a study of instructors teaching data structure and algorithm topics, with a focus on the use of diagrams and tracing. The results of this study are being used to inform the design of the Support Kit for Animation (SKA). This article describes a preliminary version of SKA, and possible usage scenarios. (Author/AEF)
Vectorizing the Monte Carlo algorithm for lattice gauge theory calculations on the CDC cyber 205
NASA Astrophysics Data System (ADS)
Barkai, D.; Moriarty, K. J. M.
1982-06-01
Lattice gauge theory is a technique for studying quantum field theory free of divergences. All the Monte Carlo computer calculations up to now have been performed on scalar machines. A technique has been developed for effectively vectorizing this class of Monte Carlo problems. The key to vectorizing is finding groups of points on the space-time lattice which are independent of each other. This requires a particular ordering of points along diagonals. A technique for matrix multiply is used which enables one to get the whole of the result matrix in one pass. The CDC CYBER 205 is most suitable for this class of problems using random "index-lists" (arising from the ordering algorithm and the use of random numbers) due to the hardware implementation of "GATHER" and "SCATTER" operations performing at a streaming-rate. A preliminary implementation of this method has executed 5 times faster than on the CDC 7600 system.
Adaptive vector quantization of MR images using online k-means algorithm
NASA Astrophysics Data System (ADS)
Shademan, Azad; Zia, Mohammad A.
2001-12-01
The k-means algorithm is widely used to design image codecs using vector quantization (VQ). In this paper, we focus on an adaptive approach to implement a VQ technique using the online version of the k-means algorithm, in which the size of the codebook is adapted continuously to the statistical behavior of the image. Based on the statistical analysis of the feature space, a set of thresholds is designed such that codewords corresponding to low-density clusters are removed from the codebook, resulting in a higher bit-rate efficiency. Applications of this approach include telemedicine, where sequences of highly correlated medical images, e.g. consecutive brain slices, are transmitted over a low bit-rate channel. We have applied this algorithm to magnetic resonance (MR) images and give simulation results on a sample sequence. The proposed method is compared to the standard k-means algorithm in terms of PSNR, MSE, and elapsed time to complete the algorithm.
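The idea of an online k-means codebook with removal of low-density codewords can be sketched as below. This is only an illustrative sketch under stated assumptions: the running-mean update with step 1/count, the single pruning pass at the end of the stream, and the `min_count` threshold are all choices made for the example, whereas the abstract describes continuous adaptation with thresholds derived from a statistical analysis that it does not detail.

```python
def nearest(codebook, x):
    """Index of the codeword closest to x in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((c - v) ** 2 for c, v in zip(codebook[k], x)),
    )


def online_kmeans(vectors, codebook, min_count):
    """Update codewords one input vector at a time, then prune sparse ones.

    min_count is a hypothetical density threshold: codewords that won
    fewer than min_count vectors are dropped from the codebook.
    """
    codebook = [list(c) for c in codebook]
    counts = [0] * len(codebook)
    for x in vectors:
        k = nearest(codebook, x)
        counts[k] += 1
        eta = 1.0 / counts[k]  # running-mean step size (an assumption)
        codebook[k] = [c + eta * (v - c) for c, v in zip(codebook[k], x)]
    return [c for c, n in zip(codebook, counts) if n >= min_count]


# Toy 2-D vectors; real inputs would be image blocks flattened to vectors.
blocks = [(0, 0), (0, 2), (10, 10), (10, 12)]
initial = [(0, 1), (10, 11), (100, 100)]
print(online_kmeans(blocks, initial, min_count=1))
```

In this toy run the third codeword never wins a vector, so it is removed and the codebook shrinks from three entries to two, which is the mechanism behind the bit-rate saving described above.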
AMJoin: An Advanced Join Algorithm for Multiple Data Streams Using a Bit-Vector Hash Table
NASA Astrophysics Data System (ADS)
Kwon, Tae-Hyung; Kim, Hyeon-Gyu; Kim, Myoung-Ho; Son, Jin-Hyun
A multiple stream join is one of the most important but high cost operations in ubiquitous streaming services. In this paper, we propose a newly improved and practical algorithm for joining multiple streams called AMJoin, which improves the multiple join performance by guaranteeing the detection of join failures in constant time. To achieve this goal, we first design a new data structure called BiHT (Bit-vector Hash Table) and present the overall behavior of AMJoin in detail. In addition, we show various experimental results and their analyses for clarifying its efficiency and practicability.
A support vector machine to search for metal-poor galaxies
NASA Astrophysics Data System (ADS)
Shi, Fei; Liu, Yu-Yan; Kong, Xu; Chen, Yang; Li, Zhong-Hua; Zhi, Shu-Teng
2014-10-01
To develop a fast and reliable method for selecting metal-poor galaxies (MPGs), especially in large surveys and huge databases, a support vector machine (SVM) supervised learning algorithm is applied to a sample of star-forming galaxies from the Sloan Digital Sky Survey data release 9 provided by the Max Planck Institute and the Johns Hopkins University (http://www.sdss3.org/dr9/spectro/spectroaccess.php). A two-step approach is adopted: (i) the SVM must be trained with a subset of objects that are known to be either MPGs or metal-rich galaxies (MRGs), treating the strong emission line flux measurements as input feature vectors in n-dimensional space, where n is the number of strong emission line flux ratios. (ii) After training on a sample of star-forming galaxies, the remaining galaxies are classified in the automatic test analysis as either MPGs or MRGs using a 10-fold cross-validation technique. For target selection, we have achieved an acquisition accuracy for MPGs of ~96 and ~95 per cent for an MPG threshold of 12 + log(O/H) = 8.00 and 12 + log(O/H) = 8.39, respectively. Running the code takes minutes in most cases under the MATLAB 2013a software environment. The code in the Letter is available on the web (http://fshi5388.blog.163.com). The SVM method can easily be extended to any MPG target selection task and can be regarded as an efficient classification method particularly suitable for modern large surveys.
Hashing algorithms and data structures for rapid searches of fingerprint vectors.
Nasr, Ramzi; Hirschberg, Daniel S; Baldi, Pierre
2010-08-23
In many large chemoinformatics database systems, molecules are represented by long binary fingerprint vectors whose components record the presence or absence of particular functional groups or combinatorial features. To speed up database searches, we propose to add to each fingerprint a short signature integer vector of length M. For a given fingerprint, the i-th component of the signature vector counts the number of 1-bits in the fingerprint that fall on components congruent to i modulo M. Given two signatures, we show how one can rapidly compute a bound on the Jaccard-Tanimoto similarity measure of the two corresponding fingerprints, using the intersection bound. Thus, these signatures allow one to significantly prune the search space by discarding molecules associated with unfavorable bounds. Analytical methods are developed to predict the resulting amount of pruning as a function of M. Data structures combining different values of M are also developed together with methods for predicting the optimal values of M for a given implementation. Simulations using a particular implementation show that the proposed approach leads to a speedup of one order of magnitude over a linear search and a 3-fold speedup over a previous implementation. All theoretical results and predictions are corroborated by large-scale simulations using molecules from the ChemDB. Several possible algorithmic extensions are discussed.
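The modulo-M signature described above, and one way to bound the Jaccard-Tanimoto similarity from two signatures, can be sketched as follows. The bound used here (sum of component-wise minima over sum of component-wise maxima) follows directly from counting bits per residue class; it is offered as an assumption about what an intersection-style bound looks like, not as the exact formula from the paper. Fingerprints are represented as sets of 1-bit positions, which is also a choice made for the example.

```python
def signature(one_bits, m):
    """Count the 1-bits of a fingerprint in each residue class modulo m."""
    sig = [0] * m
    for position in one_bits:
        sig[position % m] += 1
    return sig


def tanimoto(a_bits, b_bits):
    """Exact Jaccard-Tanimoto similarity of two fingerprints (sets of 1-bits)."""
    union = len(a_bits | b_bits)
    return len(a_bits & b_bits) / union if union else 1.0


def tanimoto_upper_bound(sig_a, sig_b):
    """Upper bound on the similarity computed from the signatures alone.

    Within each residue class the intersection holds at most min(a_i, b_i)
    bits and the union at least max(a_i, b_i) bits.
    """
    inter_max = sum(min(x, y) for x, y in zip(sig_a, sig_b))
    union_min = sum(max(x, y) for x, y in zip(sig_a, sig_b))
    return inter_max / union_min if union_min else 1.0


# Hypothetical fingerprints given by the positions of their 1-bits.
a = {0, 3, 5, 8, 12}
b = {3, 4, 8, 9, 13}
bound = tanimoto_upper_bound(signature(a, 4), signature(b, 4))
print(bound, tanimoto(a, b))
```

During a search with a similarity threshold t, any database molecule whose bound falls below t can be discarded without touching its full fingerprint, which is the pruning effect the abstract refers to.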
Balanced VS Imbalanced Training Data: Classifying Rapideye Data with Support Vector Machines
NASA Astrophysics Data System (ADS)
Ustuner, M.; Sanli, F. B.; Abdikan, S.
2016-06-01
The accuracy of supervised image classification is highly dependent upon several factors such as the design of the training set (sample selection, composition, purity and size), the resolution of the input imagery and landscape heterogeneity. The design of the training set is still a challenging issue since the sensitivity of the classifier algorithm at the learning stage is different for the same dataset. In this paper, the classification of RapidEye imagery with balanced and imbalanced training data for mapping crop types was addressed. Classification with imbalanced training data may result in low accuracy in some scenarios. Support Vector Machines (SVM), Maximum Likelihood (ML) and Artificial Neural Network (ANN) classifications were implemented here to classify the data. To evaluate the influence of balanced and imbalanced training data on image classification algorithms, three different training datasets were created. Two balanced datasets, with 70 and 100 pixels for each class of interest, and one imbalanced dataset, in which each class has a different number of pixels, were used in the classification stage. Results demonstrate that ML and NN classifications are affected by imbalanced training data, which reduces their accuracy (from 90.94% to 85.94% for ML and from 91.56% to 88.44% for NN), while SVM is not affected significantly and even improves slightly (from 94.38% to 94.69%). Our results highlight that SVM is a very robust, consistent and effective classifier, as it can perform very well under both balanced and imbalanced training data situations. Furthermore, the training stage should be precisely and carefully designed for the needs of the adopted classifier.
Trajectory classification in circular restricted three-body problem using support vector machine
NASA Astrophysics Data System (ADS)
Li, Weipeng; Huang, Hai; Peng, Fujun
2015-07-01
In the circular restricted three-body problem (CR3BP), a transit orbit is a class of orbit that can pass through the bottleneck region of the zero velocity curve and escape from the vicinity of the primary or the secondary. This kind of orbit plays a very important role in the design of space exploration missions. A kind of low-energy interplanetary transfer, which is called the Interplanetary Superhighway (IPS), can be realized by utilizing transit orbits. To use transit orbits in actual mission design, a key issue is to find an algorithm that can rapidly separate the states corresponding to transit orbits from the states corresponding to other types of orbits. The distribution of transit orbits in the phase space has been investigated by numerical methods, and a Fourier series approximation method has been introduced to describe the boundary of transit orbits. However, the Fourier series approximation method needs several hundred sets of Fourier series. The coefficients of these Fourier series are neither easy to compute nor convenient to store, which makes the method hard to use in actual mission design. In this paper, the support vector machine (SVM) is used to classify the trajectories in the CR3BP. Using the Gaussian kernel, the 6-dimensional states in the CR3BP are mapped into an infinite-dimensional space, and the boundary of the transit orbits is described by a hyperplane. A training data generation method is introduced, which reduces the size of the training data by generating states near the hyperplane. The numerical results show that the proposed algorithm achieves a good rate of correct classification, and its computing speed is much faster than that of the Fourier series approximation method.
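How a Gaussian-kernel SVM turns a hyperplane in feature space into a classifier on 6-dimensional states can be sketched with the standard kernel decision function. This is a generic illustration, not the authors' trained model: the support vectors, dual coefficients, bias and `gamma` below are hypothetical values chosen only so that the example runs.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """Signed score of state x; the sign gives the predicted class.

    dual_coefs holds alpha_i * y_i for each support vector.
    """
    score = sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )
    return score + bias


# Hypothetical 6-D states (position and velocity) and coefficients.
svs = [(1.0, 0.0, 0.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0, 0.0, 0.0)]
coefs = [1.0, -1.0]
state = (0.9, 0.1, 0.0, 0.0, 0.0, 0.0)
label = "transit" if svm_decision(state, svs, coefs, 0.0, 0.5) > 0 else "non-transit"
print(label)
```

Evaluating the score costs one kernel evaluation per support vector, so keeping the training set small and concentrated near the boundary, as the abstract describes, also keeps classification fast.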
2012-01-01
Background: Members of the phylum Proteobacteria are most prominent among bacteria causing plant diseases that result in a diminution of the quantity and quality of food produced by agriculture. To ameliorate these losses, there is a need to identify infections in early stages. Recent developments in next generation nucleic acid sequencing and mass spectrometry open the door to screening plants by the sequences of their macromolecules. Such an approach requires the ability to recognize the organismal origin of unknown DNA or peptide fragments. There are many ways to approach this problem but none have emerged as the best protocol. Here we attempt a systematic way to determine organismal origins of peptides by using a machine learning algorithm. The algorithm that we implement is a Support Vector Machine (SVM). Results: The amino acid compositions of proteobacterial proteins were found to be different from those of plant proteins. We developed an SVM model based on amino acid and dipeptide compositions to distinguish between a proteobacterial protein and a plant protein. The amino acid composition (AAC) based SVM model had an accuracy of 92.44% with 0.85 Matthews correlation coefficient (MCC) while the dipeptide composition (DC) based SVM model had a maximum accuracy of 94.67% and 0.89 MCC. We also developed SVM models based on a hybrid approach (AAC and DC), which gave a maximum accuracy of 94.86% and a 0.90 MCC. The models were tested on unseen or untrained datasets to assess their validity. Conclusion: The results indicate that the SVM based on the AAC and DC hybrid approach can be used to distinguish proteobacterial from plant protein sequences. PMID:23046503
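The amino acid composition (AAC) and dipeptide composition (DC) features named in the abstract can be sketched as fixed-length vectors of relative frequencies. This is a minimal sketch assuming the usual definitions (20 single-residue fractions and 400 adjacent-pair fractions over the standard alphabet); the abstract does not give the authors' exact normalization, and the peptide string below is made up.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each adjacent residue pair in the sequence (400 values)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence needs at least two residues")
    counts = {}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        counts[pair] = counts.get(pair, 0) + 1
    total = len(seq) - 1
    return [counts.get(a + b, 0) / total for a, b in product(AMINO_ACIDS, repeat=2)]


def hybrid_features(seq):
    """Concatenated AAC and DC vector (420 values) for a hybrid model."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


# Hypothetical peptide fragment, used only to show the vector length.
print(len(hybrid_features("MKVLAAGIV")))
```

Vectors like these can then be passed to any SVM implementation; non-standard residue codes are counted in the sequence length but contribute to no feature in this sketch.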
Watanabe, Takanori; Kessler, Daniel; Scott, Clayton; Angstadt, Michael; Sripada, Chandra
2014-08-01
Substantial evidence indicates that major psychiatric disorders are associated with distributed neural dysconnectivity, leading to a strong interest in using neuroimaging methods to accurately predict disorder status. In this work, we are specifically interested in a multivariate approach that uses features derived from whole-brain resting state functional connectomes. However, functional connectomes reside in a high dimensional space, which complicates model interpretation and introduces numerous statistical and computational challenges. Traditional feature selection techniques are used to reduce data dimensionality, but are blind to the spatial structure of the connectomes. We propose a regularization framework where the 6-D structure of the functional connectome (defined by pairs of points in 3-D space) is explicitly taken into account via the fused Lasso or the GraphNet regularizer. Our method only restricts the loss function to be convex and margin-based, allowing non-differentiable loss functions such as the hinge-loss to be used. Using the fused Lasso or GraphNet regularizer with the hinge-loss leads to a structured sparse support vector machine (SVM) with embedded feature selection. We introduce a novel efficient optimization algorithm based on the augmented Lagrangian and the classical alternating direction method, which can solve both fused Lasso and GraphNet regularized SVM with very little modification. We also demonstrate that the inner subproblems of the algorithm can be solved efficiently in analytic form by coupling the variable splitting strategy with a data augmentation scheme. Experiments on simulated data and resting state scans from a large schizophrenia dataset show that our proposed approach can identify predictive regions that are spatially contiguous in the 6-D "connectome space," offering an additional layer of interpretability that could provide new insights about various disease processes.
Watanabe, Takanori; Kessler, Daniel; Scott, Clayton; Angstadt, Michael; Sripada, Chandra
2014-01-01
Substantial evidence indicates that major psychiatric disorders are associated with distributed neural dysconnectivity, leading to strong interest in using neuroimaging methods to accurately predict disorder status. In this work, we are specifically interested in a multivariate approach that uses features derived from whole-brain resting state functional connectomes. However, functional connectomes reside in a high dimensional space, which complicates model interpretation and introduces numerous statistical and computational challenges. Traditional feature selection techniques are used to reduce data dimensionality, but are blind to the spatial structure of the connectomes. We propose a regularization framework where the 6-D structure of the functional connectome (defined by pairs of points in 3-D space) is explicitly taken into account via the fused Lasso or the GraphNet regularizer. Our method only restricts the loss function to be convex and margin-based, allowing non-differentiable loss functions such as the hinge-loss to be used. Using the fused Lasso or GraphNet regularizer with the hinge-loss leads to a structured sparse support vector machine (SVM) with embedded feature selection. We introduce a novel efficient optimization algorithm based on the augmented Lagrangian and the classical alternating direction method, which can solve both fused Lasso and GraphNet regularized SVM with very little modification. We also demonstrate that the inner subproblems of the algorithm can be solved efficiently in analytic form by coupling the variable splitting strategy with a data augmentation scheme. Experiments on simulated data and resting state scans from a large schizophrenia dataset show that our proposed approach can identify predictive regions that are spatially contiguous in the 6-D “connectome space,” offering an additional layer of interpretability that could provide new insights about various disease processes. PMID:24704268
Goodson, Summer G.; Zhang, Zhaojun; Tsuruta, James K.; Wang, Wei; O'Brien, Deborah A.
2011-01-01
Vigorous sperm motility, including the transition from progressive to hyperactivated motility that occurs in the female reproductive tract, is required for normal fertilization in mammals. We developed an automated, quantitative method that objectively classifies five distinct motility patterns of mouse sperm using Support Vector Machines (SVM), a common method in supervised machine learning. This multiclass SVM model is based on more than 2000 sperm tracks that were captured by computer-assisted sperm analysis (CASA) during in vitro capacitation and visually classified as progressive, intermediate, hyperactivated, slow, or weakly motile. Parameters associated with the classified tracks were incorporated into established SVM algorithms to generate a series of equations. These equations were integrated into a binary decision tree that sequentially sorts uncharacterized tracks into distinct categories. The first equation sorts CASA tracks into vigorous and nonvigorous categories. Additional equations classify vigorous tracks as progressive, intermediate, or hyperactivated and nonvigorous tracks as slow or weakly motile. Our CASAnova software uses these SVM equations to classify individual sperm motility patterns automatically. Comparisons of motility profiles from sperm incubated with and without bicarbonate confirmed the ability of the model to distinguish hyperactivated patterns of motility that develop during in vitro capacitation. The model accurately classifies motility profiles of sperm from a mutant mouse model with severe motility defects. Application of the model to sperm from multiple inbred strains reveals strain-dependent differences in sperm motility profiles. CASAnova provides a rapid and reproducible platform for quantitative comparisons of motility in large, heterogeneous populations of mouse sperm. PMID:21349820
Fuzzy classifier based support vector regression framework for Poisson ratio determination
NASA Astrophysics Data System (ADS)
Asoodeh, Mojtaba; Bagheripour, Parisa
2013-09-01
Poisson ratio is considered one of the most important rock mechanical properties of hydrocarbon reservoirs. Determination of this parameter through laboratory measurement is time-, cost-, and labor-intensive. Furthermore, laboratory measurements do not provide continuous data along the reservoir intervals. Hence, a fast, accurate, and inexpensive way of determining Poisson ratio which produces continuous data over the whole reservoir interval is desirable. For this purpose, the support vector regression (SVR) method based on statistical learning theory (SLT) was employed as a supervised learning algorithm to estimate Poisson ratio from conventional well log data. SVR is capable of accurately extracting the implicit knowledge contained in conventional well logs and converting the gained knowledge into Poisson ratio data. The structural risk minimization (SRM) principle, which is embedded in the SVR structure in addition to the empirical risk minimization (ERM) principle, provides a robust model for finding a quantitative formulation between conventional well log data and Poisson ratio. Although satisfying results were obtained from an individual SVR model, it had flaws of overestimation in low Poisson ratios and underestimation in high Poisson ratios. These errors were eliminated through implementation of fuzzy classifier based SVR (FCBSVR). The FCBSVR significantly improved accuracy of the final prediction. This strategy was successfully applied to data from carbonate reservoir rocks of an Iranian Oil Field. Results indicated that SVR predicted Poisson ratio values are in good agreement with measured values.
NASA Astrophysics Data System (ADS)
Li, Shaoxin; Zhang, Yanjiao; Xu, Junfa; Li, Linfang; Zeng, Qiuyao; Lin, Lin; Guo, Zhouyi; Liu, Zhiming; Xiong, Honglian; Liu, Songhao
2014-09-01
This study aims to present a noninvasive prostate cancer screening method using serum surface-enhanced Raman scattering (SERS) and support vector machine (SVM) techniques on peripheral blood samples. SERS measurements are performed on serum samples from 93 prostate cancer patients and 68 healthy volunteers using silver nanoparticles. Three types of kernel functions, namely linear, polynomial, and Gaussian radial basis function (RBF), are employed to build SVM diagnostic models for classifying the measured SERS spectra. To evaluate the performance of the SVM classification models comparatively, the standard multivariate statistical analysis method of principal component analysis (PCA) is also applied to classify the same datasets. The results show that the RBF kernel SVM diagnostic model achieves a diagnostic accuracy of 98.1%, which is superior to the 91.3% obtained with the PCA method. The receiver operating characteristic curves of the diagnostic models further confirm these results. This study demonstrates that the label-free serum SERS analysis technique combined with an SVM diagnostic algorithm has great potential for noninvasive prostate cancer screening.
Online Least Squares One-Class Support Vector Machines-Based Abnormal Visual Event Detection
Wang, Tian; Chen, Jie; Zhou, Yi; Snoussi, Hichem
2013-01-01
The abnormal event detection problem is an important subject in real-time video surveillance. In this paper, we propose a novel online one-class classification algorithm, online least squares one-class support vector machine (online LS-OC-SVM), combined with its sparsified version (sparse online LS-OC-SVM). LS-OC-SVM extracts a hyperplane as an optimal description of training objects in a regularized least squares sense. The online LS-OC-SVM learns a training set with a limited number of samples to provide a basic normal model, then updates the model through remaining data. In the sparse online scheme, the model complexity is controlled by the coherence criterion. The online LS-OC-SVM is adopted to handle the abnormal event detection problem. Each frame of the video is characterized by the covariance matrix descriptor encoding the moving information, then is classified into a normal or an abnormal frame. Experiments are conducted, on a two-dimensional synthetic distribution dataset and a benchmark video surveillance dataset, to demonstrate the promising results of the proposed online LS-OC-SVM method. PMID:24351629
Online least squares one-class support vector machines-based abnormal visual event detection.
Wang, Tian; Chen, Jie; Zhou, Yi; Snoussi, Hichem
2013-12-12
The abnormal event detection problem is an important subject in real-time video surveillance. In this paper, we propose a novel online one-class classification algorithm, online least squares one-class support vector machine (online LS-OC-SVM), combined with its sparsified version (sparse online LS-OC-SVM). LS-OC-SVM extracts a hyperplane as an optimal description of training objects in a regularized least squares sense. The online LS-OC-SVM learns a training set with a limited number of samples to provide a basic normal model, then updates the model through remaining data. In the sparse online scheme, the model complexity is controlled by the coherence criterion. The online LS-OC-SVM is adopted to handle the abnormal event detection problem. Each frame of the video is characterized by the covariance matrix descriptor encoding the moving information, then is classified into a normal or an abnormal frame. Experiments are conducted, on a two-dimensional synthetic distribution dataset and a benchmark video surveillance dataset, to demonstrate the promising results of the proposed online LS-OC-SVM method.
A faster optimization method based on support vector regression for aerodynamic problems
NASA Astrophysics Data System (ADS)
Yang, Xixiang; Zhang, Weihua
2013-09-01
In this paper, a new strategy for the optimal design of complex aerodynamic configurations with reasonably low computational effort is proposed. To solve the formulated aerodynamic optimization problem, which has heavy computational complexity, two steps are taken: (1) a sequential approximation method based on support vector regression (SVR) and a hybrid cross validation strategy is proposed to predict aerodynamic coefficients, and thus to approximate the objective function and constraint conditions of the originally formulated optimization problem from a limited number of sample points; (2) a sequential optimization algorithm is proposed to ensure that the optimal solution obtained by solving the approximate optimization problem in step (1) is very close to the optimal solution of the originally formulated optimization problem. Finally, we adopt a complex aerodynamic design problem, namely the optimal aerodynamic design of a flight vehicle with grid fins, to demonstrate the proposed optimization methods. Numerical results show that better results can be obtained with significantly lower computational effort than with classical optimization techniques.
Setting up the critical rainfall line for debris flows via support vector machines
NASA Astrophysics Data System (ADS)
Tsai, Y. F.; Chan, C. H.; Chang, C. H.
2015-10-01
The Chi-Chi earthquake in 1999 caused tremendous landslides, which triggered many debris flows and resulted in significant loss of life and property. To prevent debris-flow disasters, setting a critical rainfall line for each debris-flow stream is necessary. Firstly, 8 predisposing factors of debris flow were used to cluster 377 streams with similar rainfall lines into 7 groups via a genetic algorithm. Then, support vector machines (SVM) were applied to set up the critical rainfall line for debris flows. SVM is a machine learning approach based on statistical learning theory and has been widely used in pattern recognition and regression. This theory improves the generalization ability of learning mechanisms according to structural risk minimization. Therefore, the advantage of using SVM is that it can obtain results with minimized error rates without many training samples. Finally, the experimental results confirm that the SVM method performs well in setting a critical rainfall line for each group of debris-flow streams.
NASA Astrophysics Data System (ADS)
Ghaemi, Z.; Farnaghi, M.; Alimohammadi, A.
2015-12-01
The critical impact of air pollution on human health and the environment on the one hand, and the complexity of pollutant concentration behavior on the other, have led scientists to look for advanced techniques for monitoring and predicting urban air quality. Additionally, recent developments in data measurement techniques have led to the collection of various types of data about air quality. Such data are extremely voluminous, and to be useful they must be processed at high velocity. Due to the complexity of big data analysis, especially for dynamic applications, online forecasting of pollutant concentration trends within a reasonable processing time is still an open problem. The purpose of this paper is to present an online forecasting approach based on Support Vector Machine (SVM) to predict the air quality one day in advance. To meet the computational requirements of large-scale data analysis, distributed computing based on the Hadoop platform has been employed to leverage the processing power of multiple processing units. The MapReduce programming model is adopted for massively parallel processing in this study. Based on the online algorithm and the Hadoop framework, an online forecasting system is designed to predict the air pollution of Tehran for the next 24 hours. The results have been assessed on the basis of processing time and efficiency. Quite accurate predictions of air pollutant indicator levels within an acceptable processing time indicate that the presented approach is well suited to large-scale air pollution prediction problems.
NASA Astrophysics Data System (ADS)
Dushyanth, N. D.; Suma, M. N.; Latte, Mrityanjaya V.
2016-03-01
Damage in a structure may incur significant maintenance costs and serious safety problems. Hence, detection of damage at an early stage is of prime importance. The main contribution of this investigation is a generic optimal methodology to improve the accuracy of locating a flaw in a structure. This novel approach involves a two-step process. The first step aims at extracting damage-sensitive features from the received signal; these extracted features are often termed the damage index or damage indices and serve as an indicator of whether damage is present. In particular, a multilevel SVM (support vector machine) plays a vital role in distinguishing faulty from healthy structures. Once a structure is identified as damaged, the position of the damage is identified in the subsequent step using the Hilbert-Huang transform. The proposed algorithm has been evaluated in both simulation and experimental tests on a 6061 aluminum plate with dimensions 300 mm × 300 mm × 5 mm, which yield considerable improvement in the accuracy of estimating the position of the flaw.
Estimation of the laser cutting operating cost by support vector regression methodology
NASA Astrophysics Data System (ADS)
Jović, Srđan; Radović, Aleksandar; Šarkoćević, Živče; Petković, Dalibor; Alizamir, Meysam
2016-09-01
Laser cutting is a popular manufacturing process utilized to cut various types of materials economically. The operating cost is affected by laser power, cutting speed, assist gas pressure, nozzle diameter and focus point position as well as the workpiece material. In this article, the process factors investigated were: laser power, cutting speed, air pressure and focal point position. The aim of this work is to relate the operating cost to the process parameters mentioned above. CO2 laser cutting of stainless steel of medical grade AISI316L has been investigated. The main goal was to analyze the operating cost through the laser power, cutting speed, air pressure, focal point position and material thickness. Since estimating the laser operating cost is a complex, non-linear task, soft computing optimization algorithms can be used. The intelligent soft computing scheme support vector regression (SVR) was implemented. The performance of the proposed estimator was confirmed with the simulation results. The SVR results are then compared with artificial neural network and genetic programming. According to the results, a greater improvement in estimation accuracy can be achieved through the SVR compared to other soft computing methodologies. The new optimization methods benefit from the soft computing capabilities of global optimization and multiobjective optimization rather than choosing a starting point by trial and error and combining multiple criteria into a single criterion.
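For readers unfamiliar with SVR, the loss it is usually built on can be sketched in a few lines. This is the standard epsilon-insensitive loss, not code from the study; the tube width `epsilon=0.1` is an assumed value.

```python
def epsilon_insensitive_loss(actual, predicted, epsilon=0.1):
    """Mean epsilon-insensitive loss: errors inside the tube of half-width
    epsilon cost nothing, larger errors cost their excess over epsilon."""
    total = sum(max(0.0, abs(a - p) - epsilon) for a, p in zip(actual, predicted))
    return total / len(actual)
```

Because small deviations are ignored, only the samples on or outside the tube end up shaping the fitted function, which is one reason SVR can cope with modest amounts of noisy process data.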
Land Cover Classification from Full-Waveform LIDAR Data Based on Support Vector Machines
NASA Astrophysics Data System (ADS)
Zhou, M.; Li, C. R.; Ma, L.; Guan, H. C.
2016-06-01
In this study, a land cover classification method based on multi-class Support Vector Machines (SVM) is presented to predict the types of land cover in Miyun area. The obtained backscattered full-waveforms were processed following a workflow of waveform pre-processing, waveform decomposition and feature extraction. The extracted features, which consist of distance, intensity, Full Width at Half Maximum (FWHM) and back scattering cross-section, were corrected and used as attributes for training data to generate the SVM prediction model. The SVM prediction model was applied to predict the types of land cover in Miyun area as ground, trees, buildings and farmland. The classification results of these four types of land covers were obtained based on the ground truth information according to the CCD image data of Miyun area. It showed that the proposed classification algorithm achieved an overall classification accuracy of 90.63%. In order to better explain the SVM classification results, the classification results of SVM method were compared with that of Artificial Neural Networks (ANNs) method and it showed that SVM method could achieve better classification results.
Stellar Spectral Classification with Locality Preserving Projections and Support Vector Machine
NASA Astrophysics Data System (ADS)
Zhong-bao, Liu
2016-06-01
With the help of computer tools and algorithms, automatic stellar spectral classification has become an area of current interest. The process of stellar spectral classification mainly includes two steps: dimension reduction and classification. As a popular dimensionality reduction technique, Principal Component Analysis (PCA) is widely used in stellar spectra classification. Another dimensionality reduction technique, Locality Preserving Projections (LPP) has not been widely used in astronomy. The advantage of LPP is that it can preserve the local structure of the data after dimensionality reduction. In view of this, we investigate how to apply LPP+SVM in classifying the stellar spectral subclasses. In the comparative experiment, the performance of LPP is compared with PCA. The stellar spectral classification process is composed of the following steps. Firstly, PCA and LPP are respectively applied to reduce the dimension of spectra data. Then, Support Vector Machine (SVM) is used to classify the 4 subclasses of K-type and 3 subclasses of F-type spectra from Sloan Digital Sky Survey (SDSS). Lastly, the performance of LPP+SVM is compared with that of PCA+SVM in stellar spectral classification, and we found that LPP does better than PCA.
Application of biomonitoring and support vector machine in water quality assessment
Liao, Yue; Xu, Jian-yu; Wang, Zhu-wei
2012-01-01
The behavior of schools of zebrafish (Danio rerio) was studied in acute toxicity environments. Behavioral features were extracted and a method for water quality assessment using support vector machine (SVM) was developed. The behavioral parameters of fish were recorded and analyzed during one hour in an environment of a 24-h half-lethal concentration (LC50) of a pollutant. The data were used to develop a method to evaluate water quality, so as to give an early indication of toxicity. Four kinds of metal ions (Cu2+, Hg2+, Cr6+, and Cd2+) were used for toxicity testing. To enhance the efficiency and accuracy of assessment, a method combining SVM and a genetic algorithm (GA) was used. The results showed that the average prediction accuracy of the method was over 80% and the time cost was acceptable. The method gave satisfactory results for a variety of metal pollutants, demonstrating that this is an effective approach to the classification of water quality. PMID:22467374
Terminator Detection by Support Vector Machine Utilizing a Stochastic Context-Free Grammar
Francis-Lyon, Patricia; Cristianini, Nello; Holbrook, Stephen
2006-12-30
A 2-stage detector was designed to find rho-independent transcription terminators in the Escherichia coli genome. The detector includes a Stochastic Context Free Grammar (SCFG) component and a Support Vector Machine (SVM) component. To find terminators, the SCFG searches the intergenic regions of nucleotide sequence for local matches to a terminator grammar that was designed and trained utilizing examples of known terminators. The grammar selects sequences that are the best candidates for terminators and assigns them a prefix, stem-loop, suffix structure using the Cocke-Younger-Kasami (CYK) algorithm, modified to incorporate energy effects of base pairing. The parameters from this inferred structure are passed to the SVM classifier, which distinguishes terminators from non-terminators that score high according to the terminator grammar. The SVM was trained with negative examples drawn from intergenic sequences that include both featureless and RNA gene regions (which were assigned prefix, stem-loop, suffix structure by the SCFG), so that it successfully distinguishes terminators from either of these. The classifier was found to be 96.4% successful during testing.
NASA Astrophysics Data System (ADS)
Lima, Aranildo R.; Cannon, Alex J.; Hsieh, William W.
2013-01-01
A hybrid algorithm combining support vector regression with evolutionary strategy (SVR-ES) is proposed for predictive models in the environmental sciences. SVR-ES uses uncorrelated mutation with p step sizes to find the optimal SVR hyper-parameters. Three environmental forecast datasets used in the WCCI-2006 contest - surface air temperature, precipitation and sulphur dioxide concentration - were tested. We used multiple linear regression (MLR) as benchmark and a variety of machine learning techniques including bootstrap-aggregated ensemble artificial neural network (ANN), SVR-ES, SVR with hyper-parameters given by the Cherkassky-Ma estimate, the M5 regression tree, and random forest (RF). We also tested all techniques using stepwise linear regression (SLR) first to screen out irrelevant predictors. We concluded that SVR-ES is an attractive approach because it tends to outperform the other techniques and can also be implemented in an almost automatic way. The Cherkassky-Ma estimate is a useful approach for minimizing the mean absolute error and saving computational time related to the hyper-parameter search. The ANN and RF are also good options to outperform multiple linear regression (MLR). Finally, the use of SLR for predictor selection can dramatically reduce computational time and often help to enhance accuracy.
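The "uncorrelated mutation with p step sizes" mentioned above is a standard evolution-strategy operator, and a minimal sketch of one mutation step follows. The learning rates use the commonly recommended settings 1/sqrt(2p) and 1/sqrt(2 sqrt(p)); the `min_sigma` floor, the log10 encoding, and the example hyper-parameter values are assumptions for illustration, not details taken from the paper.

```python
import math
import random


def mutate_uncorrelated(values, sigmas, rng, min_sigma=1e-6):
    """One ES mutation where each of the p parameters has its own step size."""
    p = len(values)
    tau_global = 1.0 / math.sqrt(2.0 * p)            # shared learning rate
    tau_local = 1.0 / math.sqrt(2.0 * math.sqrt(p))  # per-coordinate learning rate
    shared = rng.gauss(0.0, 1.0)                     # one draw shared by all coordinates
    new_sigmas = [
        max(min_sigma, s * math.exp(tau_global * shared + tau_local * rng.gauss(0.0, 1.0)))
        for s in sigmas
    ]
    # The object variables are perturbed with the freshly mutated step sizes.
    new_values = [v + s * rng.gauss(0.0, 1.0) for v, s in zip(values, new_sigmas)]
    return new_values, new_sigmas


rng = random.Random(42)
# Hypothetical log10-scaled SVR hyper-parameters: C, epsilon, gamma.
child, child_sigmas = mutate_uncorrelated([0.0, -1.0, -2.0], [0.5, 0.5, 0.5], rng)
```

In a full search loop, each child would be scored by the validation error of an SVR trained with those hyper-parameters, and the best candidates would be kept as parents for the next generation.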
Spatial Downscaling of Remotely Sensed Soil Moisture Using Support Vector Machine in Northeast Asia
NASA Astrophysics Data System (ADS)
Choi, M.; Moon, H.; Kim, D.
2014-12-01
Recent advances in remote sensing of soil moisture have broadened the understanding of the spatiotemporal behavior of soil moisture and contributed to major improvements in the associated research fields. However, despite large spatial coverage and short timescales, the low spatial resolution of passive microwave soil moisture data has frequently been treated as a major research problem in many studies, which have suggested statistical or deterministic downscaling methods as a solution for obtaining targeted spatial resolutions. This study suggests a methodology to downscale 10 km and 25 km daily L3 volumetric soil moisture datasets from the Advanced Microwave Scanning Radiometer 2 (AMSR2) for 2013 in Northeast Asia using Support Vector Machine (SVM). In the presented methodology, hydrometeorological variables observed by satellite remote sensing that have a physically significant relationship with soil moisture are chosen as predictor variables to estimate soil moisture at finer resolution. Separate downscaling algorithms optimized for seasonal conditions are applied to achieve more accurate downscaled soil moisture. A comparative analysis between in-situ and downscaled soil moisture is also conducted to assess accuracy quantitatively. The results of this study can be further applied to hydrological modeling or the prediction of extreme weather phenomena at fine spatial resolution.
Beaumont, Christopher N.; Williams, Jonathan P.; Goodman, Alyssa A.
2011-11-01
We apply Support Vector Machines (SVMs), a machine learning algorithm, to the task of classifying structures in the interstellar medium (ISM). As a case study, we present a position-position-velocity (PPV) data cube of ¹²CO J = 3-2 emission toward G16.05-0.57, a supernova remnant that lies behind the M17 molecular cloud. Despite the fact that these two objects partially overlap in PPV space, the two structures can easily be distinguished by eye based on their distinct morphologies. The SVM algorithm is able to infer these morphological distinctions, and associate individual pixels with each object at >90% accuracy. This case study suggests that similar techniques may be applicable to classifying other structures in the ISM, a task that has thus far proven difficult to automate.
NASA Astrophysics Data System (ADS)
Jansen, P.; Vergossen, D.; Renner, D.; John, W.; Götze, J.
2015-11-01
An alternative method for determining the state of charge (SOC) of lithium iron phosphate cells by impedance spectra classification is given. Methods based on the electric equivalent circuit diagram (ECD), such as the Kalman Filter, the extended Kalman Filter and the state space observer, have reached their limits for this cell chemistry. The new method dispenses with the open circuit voltage curve and the parameters for the electric ECD. Impedance spectra classification is implemented by a Support Vector Machine (SVM). The classes for the SVM algorithm are represented by all the impedance spectra that correspond to the SOC (the SOC classes) for defined temperature and aging states. A divide-and-conquer search algorithm on a binary search tree makes it possible to grade measured impedances using the SVM method. Statistical analysis is used to verify the concept by grading every single impedance from each impedance spectrum corresponding to the SOC by class with different magnitudes of charged error.
NASA Astrophysics Data System (ADS)
Harker, Brian J.
The measurement of vector magnetic fields on the sun is one of the most important diagnostic tools for characterizing solar activity. The ubiquitous solar wind is guided into interplanetary space by open magnetic field lines in the upper solar atmosphere. Highly energetic solar flares and Coronal Mass Ejections (CMEs) are triggered in lower layers of the solar atmosphere by the driving forces at the visible "surface" of the sun, the photosphere. The driving forces there tangle and interweave the vector magnetic fields, ultimately leading to an unstable field topology with large excess magnetic energy, and this excess energy is suddenly and violently released by magnetic reconnection, emitting intense broadband radiation that spans the electromagnetic spectrum, accelerating billions of metric tons of plasma away from the sun, and finally relaxing the magnetic field to lower-energy states. These eruptive flaring events can have severe impacts on the near-Earth environment and the human technology that inhabits it. This dissertation presents a novel inversion method for inferring the properties of the vector magnetic field from telescopic measurements of the polarization states (Stokes vector) of the light received from the sun, in an effort to develop a method that is fast, accurate, and reliable. One of the long-term goals of this work is to develop such a method that is capable of rapidly producing characterizations of the magnetic field from time-sequential data, such that near real-time projections of the complexity and flare productivity of solar active regions can be made. This will be a boon to the field of solar flare forecasting, and should help mitigate the harmful effects of space weather on mankind's space-based endeavors. To this end, I have developed an inversion method based on genetic algorithms (GA) that have the potential for achieving such high-speed analysis.
Amin, Morteza Moradi; Kermani, Saeed; Talebi, Ardeshir; Oghli, Mostafa Ghelich
2015-01-01
Acute lymphoblastic leukemia is the most common form of pediatric cancer; it is categorized into three subtypes, L1, L2, and L3, and can be detected through screening of blood and bone marrow smears by pathologists. Because the procedure is time-consuming and tedious, a computer-based system is required for convenient detection of acute lymphoblastic leukemia. Microscopic images are acquired from blood and bone marrow smears of patients with acute lymphoblastic leukemia and of normal cases. After image preprocessing, cell nuclei are segmented by the k-means algorithm. Geometric and statistical features are then extracted from the nuclei, and finally the cells are classified as cancerous or noncancerous by a support vector machine classifier with 10-fold cross validation. The cells are also classified into their sub-types by a multi-class support vector machine classifier. The classifier is evaluated by sensitivity, specificity, and accuracy, whose values for cancerous and noncancerous cells are 98%, 95%, and 97%, respectively. These parameters are also used to evaluate the classification of cell sub-types, with mean values of 84.3%, 97.3%, and 95.6%, respectively. The results show that the proposed algorithm achieves acceptable performance for the diagnosis of acute lymphoblastic leukemia and its sub-types and can be used as an assistant diagnostic tool for pathologists. PMID:25709941
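The three evaluation measures quoted above follow directly from the counts in a binary confusion matrix. A minimal sketch, in which the function name and the example counts are hypothetical and are not the study's data:

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # share of cancerous cells detected
    specificity = tn / (tn + fp)                # share of normal cells correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all cells classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only.
sens, spec, acc = diagnostic_metrics(tp=8, fn=2, tn=9, fp=1)
```

Reporting all three matters because accuracy alone can look high on imbalanced data even when most cancerous cells are missed.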
Support Vector Machine Model for Automatic Detection and Classification of Seismic Events
NASA Astrophysics Data System (ADS)
Barros, Vesna; Barros, Lucas
2016-04-01
The automated processing of multiple seismic signals to detect, localize and classify seismic events is a central tool in both natural hazards monitoring and nuclear treaty verification. However, false detections and missed detections caused by station noise and incorrect classification of arrivals are still an issue and the events are often unclassified or poorly classified. Thus, machine learning techniques can be used in automatic processing for classifying the huge database of seismic recordings and provide more confidence in the final output. Applied in the context of the International Monitoring System (IMS) - a global sensor network developed for the Comprehensive Nuclear-Test-Ban Treaty (CTBT) - we propose a fully automatic method for seismic event detection and classification based on a supervised pattern recognition technique called the Support Vector Machine (SVM). According to Kortström et al., 2015, the advantages of using SVM are handleability of large number of features and effectiveness in high dimensional spaces. Our objective is to detect seismic events from one IMS seismic station located in an area of high seismicity and mining activity and classify them as earthquakes or quarry blasts. It is expected to create a flexible and easily adjustable SVM method that can be applied in different regions and datasets. Taken a step further, accurate results for seismic stations could lead to a modification of the model and its parameters to make it applicable to other waveform technologies used to monitor nuclear explosions such as infrasound and hydroacoustic waveforms. As an authorized user, we have direct access to all IMS data and bulletins through a secure signatory account. A set of significant seismic waveforms containing different types of events (e.g. earthquake, quarry blasts) and noise is being analysed to train the model and learn the typical pattern of the signal from these events. Moreover, comparing the performance of the support-vector
Estimation of Electrically-Evoked Knee Torque from Mechanomyography Using Support Vector Regression
Ibitoye, Morufu Olusola; Hamzaid, Nur Azah; Abdul Wahab, Ahmad Khairi; Hasnan, Nazirah; Olatunji, Sunday Olusanya; Davis, Glen M.
2016-01-01
The difficulty of real-time muscle force or joint torque estimation during neuromuscular electrical stimulation (NMES) in physical therapy and exercise science has motivated recent research interest in torque estimation from other muscle characteristics. This study investigated the accuracy of a computational intelligence technique for estimating NMES-evoked knee extension torque based on the mechanomyographic (MMG) signals of contracting muscles recorded from eight healthy males. The knee torque was modelled via Support Vector Regression (SVR) due to its good generalization ability in related fields. Inputs to the proposed model were MMG amplitude characteristics, the level of electrical stimulation or contraction intensity, and knee angle. A Gaussian kernel function and its optimal parameters were identified as giving the best performance measure and were applied as the SVR kernel function to build an effective knee torque estimation model. To train and test the model, the data were partitioned into training (70%) and testing (30%) subsets, respectively. The SVR estimation accuracy, based on the coefficient of determination (R²) between the actual and the estimated torque values, was up to 94% and 89% during the training and testing cases, with root mean square errors (RMSE) of 9.48 and 12.95, respectively. The knee torque estimations obtained using SVR modelling agreed well with the experimental data from an isokinetic dynamometer. These findings support the realization of a closed-loop NMES system for functional tasks using MMG as the feedback signal source and an SVR algorithm for joint torque estimation. PMID:27447638
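The evaluation protocol described here (a 70/30 partition, then R² and RMSE on each subset) can be sketched as follows. The shuffling seed and the helper names are assumptions, and R² is computed with one common definition (1 minus residual over total sum of squares); the abstract does not state which definition the authors used.

```python
import math
import random


def split_70_30(samples, seed=0):
    """Shuffle a copy of the samples and cut it into 70% training, 30% testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = (7 * len(shuffled)) // 10
    return shuffled[:cut], shuffled[cut:]


def rmse(actual, predicted):
    """Root mean square error between measured and estimated values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

A gap between training and testing scores, such as the 94% versus 89% reported above, is the usual sign of how much the model's fit depends on the particular samples it was trained on.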
EVALUATION FOR JUDGMENT CRITERIA OF REPAIR ON CIVIL ENGINEERING STRUCTURE BY SUPPORT VECTOR MACHINE
NASA Astrophysics Data System (ADS)
Yuki, Kazunori; Kobayashi, Hiroki; Ohishi, Hiroyuki; Sugimoto, Hiroyuki; Iida, Takeshi; Furukawa, Kohei
In this study, a method for setting judgment criteria for the repair of civil engineering structures is analyzed by applying the Support Vector Machine to inspection and repair records of bridge expansion joints. The Support Vector Machine has previously been applied to setting the degree of disaster risk on natural slopes. However, applying it to this analysis requires an effective method for excluding noisy data. Therefore, noisy data are excluded objectively so that high-confidence data can be extracted from the records, which allows the setting method to be developed. The results show that the setting method based on the Support Vector Machine is effective as a tool for maintenance management planning of civil engineering structures, since it agrees closely with evaluations by professional engineers.
Automated Classification of Epiphyses in the Distal Radius and Ulna using a Support Vector Machine.
Wang, Ya-hui; Liu, Tai-ang; Wei, Hua; Wan, Lei; Ying, Chong-liang; Zhu, Guang-you
2016-03-01
The aim of this study was to automatically classify epiphyses in the distal radius and ulna using a support vector machine (SVM) and to examine the accuracy of the epiphyseal growth grades generated by the support vector machine. X-ray images of distal radii and ulnae were collected from 140 Chinese teenagers aged between 11.0 and 19.0 years. Epiphyseal growth of the two elements was classified into five grades. Features of each element were extracted using a histogram of oriented gradient (HOG), and models were established using support vector classification (SVC). The prediction results and the validity of the models were evaluated with a cross-validation test and an independent test for accuracy (PA). Our findings suggest that this new technique for epiphyseal classification was successful and that an automated technique using an SVM is reliable and feasible, with a relatively high accuracy for the models. PMID:27404614
NASA Astrophysics Data System (ADS)
Kachach, Redouane; Cañas, José María
2016-05-01
Using video in traffic monitoring is one of the most active research domains in the computer vision community. TrafficMonitor, a system that employs a hybrid approach for automatic vehicle tracking and classification on highways using a simple stationary calibrated camera, is presented. The proposed system consists of three modules: vehicle detection, vehicle tracking, and vehicle classification. Moving vehicles are detected by an enhanced Gaussian mixture model background estimation algorithm. The design includes a technique to resolve the occlusion problem by using a combination of two-dimensional proximity tracking algorithm and the Kanade-Lucas-Tomasi feature tracking algorithm. The last module classifies the shapes identified into five vehicle categories: motorcycle, car, van, bus, and truck by using three-dimensional templates and an algorithm based on histogram of oriented gradients and the support vector machine classifier. Several experiments have been performed using both real and simulated traffic in order to validate the system. The experiments were conducted on GRAM-RTM dataset and a proper real video dataset which is made publicly available as part of this work.
Sun, Jun; Liu, Li; Fan, Fangyun; Wu, Xiaojun
2016-01-01
This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms. PMID:27642363
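The way a binary 0-1 position vector is scored with leave-one-out cross validation can be sketched as below. To keep the example self-contained, a nearest-centroid rule stands in for the SVM, so this is explicitly not the authors' classifier; the function names and the mask convention (1 means the gene is kept) are likewise assumptions.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Stand-in classifier (NOT an SVM): pick the class with the closest mean."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((q - c) ** 2 for q, c in zip(query, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(samples, labels, mask):
    """Leave-one-out accuracy using only the features where mask[j] == 1."""
    reduced = [[v for v, keep in zip(row, mask) if keep] for row in samples]
    correct = 0
    for i in range(len(reduced)):
        # Hold out sample i, train on the rest, then test on the held-out sample.
        train_x = reduced[:i] + reduced[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, reduced[i]) == labels[i]:
            correct += 1
    return correct / len(reduced)
```

An optimizer such as a binary particle swarm would call `loocv_accuracy` as its fitness function, preferring masks that keep accuracy high while selecting few genes.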
Long, Jinyi; Yu, Zhuliang
2010-01-01
Parameter setting plays an important role for improving the performance of a brain computer interface (BCI). Currently, parameters (e.g. channels and frequency band) are often manually selected. It is time-consuming and not easy to obtain an optimal combination of parameters for a BCI. In this paper, motor imagery-based BCIs are considered, in which channels and frequency band are key parameters. First, a semi-supervised support vector machine algorithm is proposed for automatically selecting a set of channels with given frequency band. Next, this algorithm is extended for joint channel-frequency selection. In this approach, both training data with labels and test data without labels are used for training a classifier. Hence it can be used in small training data case. Finally, our algorithms are applied to a BCI competition data set. Our data analysis results show that these algorithms are effective for selection of frequency band and channels when the training data set is small. PMID:21886673
Xi, Maolong; Sun, Jun; Liu, Li; Fan, Fangyun; Wu, Xiaojun
2016-01-01
This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms. PMID:27642363
An improved hurricane wind vector retrieval algorithm using SeaWinds scatterometer
NASA Astrophysics Data System (ADS)
Laupattarakasem, Peth
Over the last three decades, microwave remote sensing has played a significant role in ocean surface wind measurement, and several scatterometer missions have flown in space since the early 1990s. Although they have been extremely successful at measuring ocean surface winds with high accuracy for the vast majority of marine weather conditions, conventional scatterometers cannot measure extreme wind conditions such as hurricanes. The SeaWinds scatterometer, onboard the QuikSCAT satellite, is NASA's only operating scatterometer at present. Like its predecessors, it measures global ocean vector winds; however, for a number of reasons, the quality of the measurements in hurricanes is significantly degraded. The most pressing issues are associated with the presence of precipitation and Ku-band saturation effects, especially in extreme wind speed regimes such as tropical cyclones (hurricanes and typhoons). Under this dissertation, an improved hurricane ocean vector wind retrieval approach, named Q-Winds, was developed using existing SeaWinds scatterometer data. This unique data processing algorithm uses combined SeaWinds active and passive measurements to extend the use of SeaWinds for tropical cyclones up to approximately 50 m/s (Hurricane Category-3). Results show that Q-Winds wind speeds are consistently superior to the standard SeaWinds Project Level 2B wind speeds for hurricane wind speed measurement, and that Q-Winds also provides a more reliable rain flagging algorithm for quality assurance purposes. Compared with H*Wind, Q-Winds achieves an error of ~9%, while L2B-12.5km exhibits wind speed saturation at ~30 m/s with an error of ~31% for high wind speeds (>40 m/s).
Incorrect support and missing center tolerances of phasing algorithms
Huang, Xiaojing; Nelson, Johanna; Steinbrener, Jan; Kirz, Janos; Turner, Joshua J.; Jacobsen, Chris
2010-01-01
In x-ray diffraction microscopy, iterative algorithms retrieve reciprocal space phase information, and a real space image, from an object's coherent diffraction intensities through the use of a priori information such as a finite support constraint. In many experiments, the object's shape or support is not well known, and the diffraction pattern is incompletely measured. We describe here computer simulations to look at the effects of both of these possible errors when using several common reconstruction algorithms. Overly tight object supports prevent successful convergence; however, we show that this can often be recognized through pathological behavior of the phase retrieval transfer function. Dynamic range limitations often make it difficult to record the central speckles of the diffraction pattern. We show that this leads to increasing artifacts in the image when the number of missing central speckles exceeds about 10, and that the removal of unconstrained modes from the reconstructed image is helpful only when the number of missing central speckles is less than about 50. In conclusion, this simulation study helps in judging the reconstructability of experimentally recorded coherent diffraction patterns.
Jaya, T; Dheeba, J; Singh, N Albert
2015-12-01
Diabetic retinopathy is a major cause of vision loss in diabetic patients. Currently, there is a need for making decisions using intelligent computer algorithms when screening a large volume of data. This paper presents an expert decision-making system designed using a fuzzy support vector machine (FSVM) classifier to detect hard exudates in fundus images. To avoid false alarms, the optic discs in the colour fundus images are segmented using morphological operations and the circular Hough transform. To discriminate between the exudate and the non-exudate pixels, colour and texture features are extracted from the images. These features are given as input to the FSVM classifier. The classifier analysed 200 retinal images collected from diabetic retinopathy screening programmes. The tests made on the retinal images show that the proposed detection system has better discriminating power than the conventional support vector machine. With the best combination of FSVM and feature sets, the area under the receiver operating characteristic curve reached 0.9606, which corresponds to a sensitivity of 94.1% with a specificity of 90.0%. The results suggest that detecting hard exudates using FSVM contributes to computer-assisted detection of diabetic retinopathy and can serve as a decision support system for ophthalmologists.
Library support for problem-based learning: an algorithmic approach.
Ispahany, Nighat; Torraca, Kathren; Chilov, Marina; Zimbler, Elaine R; Matsoukas, Konstantina; Allen, Tracy Y
2007-01-01
Academic health sciences libraries can take various approaches to support the problem-based learning component of the curriculum. This article presents one such approach taken to integrate information navigation skills into the small group discussion part of the Pathophysiology course in the second year of the Dental school curriculum. Along with presenting general resources for the course, the Library Toolkit introduced an algorithmic approach to finding answers to sample clinical case questions. While elements of Evidence-Based Practice were introduced, the emphasis was on teaching students to navigate relevant resources and apply various database search techniques to find answers to the clinical problems presented.
NASA Technical Reports Server (NTRS)
Morrell, F. R.; Bailey, M. L.; Motyka, P. R.
1988-01-01
Flight test results of a vector-based fault-tolerant algorithm for a redundant strapdown inertial measurement unit are presented. Because the inertial sensors provide flight-critical information for flight control and navigation, failure detection and isolation is developed in terms of a multi-level structure. Threshold compensation techniques for gyros and accelerometers, developed to enhance the sensitivity of the failure detection process to low-level failures, are presented. Four flight tests, conducted in a commercial transport type environment, were used to determine the ability of the failure detection and isolation algorithm to detect failure signals, such as hard-over, null, or bias shifts. The algorithm provided timely detection and correct isolation of flight control- and low-level failures. The flight tests of the vector-based algorithm demonstrated its capability to provide false alarm free dual fail-operational performance for the skewed array of inertial sensors.
Prediction of water surface elevation of Great Salt Lake using Support Vector Machine
NASA Astrophysics Data System (ADS)
Shrestha, N. K.; Urroz, G.
2009-12-01
Record-breaking rises of Great Salt Lake (GSL) water levels observed in the period 1982-1987 had a severe economic impact on the State of Utah. Rising lake levels caused flooding that damaged highways, railways, recreation facilities and industries located on the exposed lake bed. Reducing or preventing economic loss from flooding of the kind that happened in the past requires a model that predicts GSL water levels accurately. A data-driven model, whose intent is to determine the relationship between inputs and outputs without knowing the underlying physical process, was used in this project. A data-driven model can bridge the gap between classical regression-based and physically-based hydrological models. A Support Vector Machine (SVM) was used to predict the water surface elevation of the GSL. The SVM-based reconstruction was used to develop time series forecasts for multiple lead times. The model is able to extract the dynamics of the system by using only a few observed data points for training. The reliability of the algorithm in learning and forecasting the dynamics of the system was tested by changing two parameters: the integer time lag (tau) and the dimension (d) of the system. Parameter tau models the delay in which the dynamics unfolds by creating vectors of dimension d out of single measurements. For a given set of parameters tau and d, the discrepancy between observation and prediction is reduced by changing the cost parameter and a parameter called epsilon that controls the width of the SVM insensitive zone. All the data points within the epsilon-insensitive zone are neglected in the SVM analysis. The analysis was performed for two time periods. The period of 1982 to 1987 was used to test the model performance in predicting the corresponding dramatic rise of GSL elevation. The period of 1987 to 2008 was used to test the performance of the model for the normal water level rise and fall of the GSL. This analysis
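The construction of delay vectors from a time lag tau and a dimension d, as described in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function name `delay_embed`, the toy series, and the choice of the next observation as the regression target are all hypothetical.

```python
def delay_embed(series, tau, d):
    """Build delay vectors of dimension d with integer lag tau.

    Each vector is [x[t], x[t - tau], ..., x[t - (d - 1) * tau]] and is
    paired with the next observation x[t + 1] as a one-step-ahead target
    (the one-step target is an assumption made for this sketch).
    """
    if tau < 1 or d < 1:
        raise ValueError("tau and d must be positive integers")
    start = (d - 1) * tau  # first index that has a full history
    vectors, targets = [], []
    for t in range(start, len(series) - 1):
        vectors.append([series[t - k * tau] for k in range(d)])
        targets.append(series[t + 1])
    return vectors, targets


# Toy series, not real lake-level data.
levels = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
X, y = delay_embed(levels, tau=2, d=3)
print(X)  # [[5.0, 3.0, 1.0], [6.0, 4.0, 2.0]]
print(y)  # [6.0, 7.0]
```

The pairs (X, y) are the kind of input a support vector regressor would then be trained on, with the cost and epsilon parameters tuned separately for each choice of tau and d.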
NASA Astrophysics Data System (ADS)
Alouene, Y.; Petropoulos, G. P.; Kalogrias, A.; Papanikolaou, F.
2012-04-01
Floods are a water-related natural disaster affecting and often threatening different aspects of human life, causing property damage, economic degradation, and in some instances even loss of human lives. Being able to assess flood damage accurately and cost-effectively is essential to both scientists and policy makers, for purposes ranging from mitigation and assessment of damage extent to rehabilitation of affected areas. Remote sensing, often combined with Geographical Information Systems (GIS), has generally shown very promising potential for rapid and cost-effective flood damage assessment, particularly in remote, otherwise inaccessible locations. The progress in remote sensing during the last twenty years or so has resulted in the development of a large number of image processing techniques suitable for use with a range of remote sensing data in performing flood damage assessment. Supervised image classification is regarded as one of the most widely used approaches employed for this purpose. Yet, the use of recently developed image classification algorithms such as the machine learning-based Support Vector Machines (SVMs) classifier has not been adequately investigated for this purpose. The objective of our work has been to quantitatively evaluate the ability of SVMs combined with Landsat TM multispectral imagery to perform a damage assessment of a flood that occurred in a Mediterranean region. A further objective has been to examine whether the inclusion of additional spectral information apart from the original TM bands in SVMs can improve flooded area extraction accuracy. The 2010 flooding of the river Evros in the north of Greece, for which TM imagery from before and shortly after the flooding was available, is used as a case study. Assessment of the flooded area is performed in a GIS environment on the basis of classification accuracy assessment metrics as well as comparisons versus a vector
NASA Astrophysics Data System (ADS)
Kuzma, H. A.; Kappler, K. A.; Rector, J. W.
2006-12-01
Kriging, Wiener filters, support vector machines (SVMs), neural networks, and linear and non-linear inversion are methods for predicting the values of one set of variables given the values of another. They can all be used to estimate a set of model parameters from measured data given that a physical relationship exists between models and data. However, since the methods were developed in different fields, the mathematics used to describe them tends to obscure rather than highlight the links between them. In this poster, we diagram the methods and clarify their connections in hopes that practitioners of one method will be able to understand and learn from the insights developed in another. At the heart of all of the methods is a set of coefficients that must be found by minimizing an objective function. The minimum of the objective function can be found either by inverting a matrix, or by searching through a space of possible answers. We distinguish between direct inversion, in which the desired coefficients are those of the model itself, and indirect inversion, in which examples of models and data are used to estimate the coefficients of an inverse process that, once discovered, can be used to compute new models from new data. Kriging was developed in geostatistics. The model is usually a rock property (such as gold concentration) and the data is a sample location (x,y,z). The desired coefficients are a set of weights which are used to predict the concentration of a sample taken at a new location based on a variogram. The variogram is computed by averaging across a given set of known samples and is manually adjusted to reflect prior knowledge. Wiener filters were developed in signal processing to predict the values of one time series from measurements of another. A Wiener filter can be derived from kriging by replacing variograms with correlation. Support vector machines are an offshoot of statistical learning theory. They can be written as a form of kriging in which
Dilated contour extraction and component labeling algorithm for object vector representation
NASA Astrophysics Data System (ADS)
Skourikhine, Alexei N.
2005-08-01
Object boundary extraction from binary images is important for many applications, e.g., image vectorization, automatic interpretation of images containing segmentation results, printed and handwritten documents and drawings, maps, and AutoCAD drawings. Efficient and reliable contour extraction is also important for pattern recognition due to its impact on shape-based object characterization and recognition. The presented contour tracing and component labeling algorithm produces dilated (sub-pixel) contours associated with corresponding regions. The algorithm has the following features: (1) it always produces non-intersecting, non-degenerate contours, including the case of one-pixel wide objects; (2) it associates the outer and inner (i.e., around hole) contours with the corresponding regions during the process of contour tracing in a single pass over the image; (3) it maintains desired connectivity of object regions as specified by 8-neighbor or 4-neighbor connectivity of adjacent pixels; (4) it avoids degenerate regions in both background and foreground; (5) it allows an easy augmentation that will provide information about the containment relations among regions; (6) it has a time complexity that is dominantly linear in the number of contour points. This early component labeling (contour-region association) enables subsequent efficient object-based processing of the image information.
Shin, Jaeyeon; Park, Hodong; Cho, Sungpil; Nam, Hakhyun; Lee, Kyoung-Joung
2014-09-01
Point-of-care testing glucose meters are widely used, important tools for determining the blood glucose levels of people with diabetes, patients in intensive care units, pregnant women, and newborn infants. However, a number of studies have concluded that a change in hematocrit (Hct) levels can seriously affect the accuracy of glucose measurements. The aim of this study was to develop an algorithm for glucose calculation with improved accuracy using an Hct compensation method that minimizes the effects of Hct on glucose measurements. The glucose concentrations in this study were calculated with an adaptive calibration curve using linear fitting prediction and a support vector machine, which minimized the bias in the glucose concentrations caused by the Hct interference. This was followed by an evaluation of performance according to International Organization for Standardization (ISO) 15197:2013 based on bias with respect to the reference method, the coefficient of variation, and the valid blood samples/total blood samples within the ±20% and 15% error grids. Chronoamperometry was performed to verify the effect of Hct variation and to provide a comparison for the proposed method. As a result, the average coefficients of variation for chronoamperometry and the Hct compensation method were 2.43% and 3.71%, respectively, while the average biases (%) for these methods were 12.08% and 5.69%, respectively. The results of chronoamperometry demonstrated that a decrease in Hct levels increases glucose concentrations, whereas an increase in Hct levels reduces glucose concentrations. Finally, the proposed method has improved the accuracy of glucose measurements compared to existing chronoamperometry methods.
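The two summary statistics named in the abstract above, percentage bias against a reference method and the coefficient of variation, can be computed as in the sketch below. The function names and the toy readings are hypothetical, and the abstract does not say whether bias was averaged as signed or absolute values; the signed form is assumed here.

```python
from statistics import mean, stdev


def percent_bias(measured, reference):
    """Signed bias of one measurement relative to the reference, in percent."""
    return 100.0 * (measured - reference) / reference


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Toy repeated readings of one sample, not data from the study.
readings = [98.0, 100.0, 102.0]
cv = coefficient_of_variation(readings)
bias = percent_bias(105.0, 100.0)
print(cv, bias)
```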
2016-01-01
OBJECTIVES Diabetes is increasing in worldwide prevalence, toward epidemic levels. Diabetic neuropathy, one of the most common complications of diabetes mellitus, is a serious condition that can lead to amputation. This study used a multicategory support vector machine (MSVM) to predict diabetic peripheral neuropathy severity classified into four categories using patients’ demographic characteristics and clinical features. METHODS In this study, the data were collected at the Diabetes Center of Hamadan in Iran. Patients were enrolled by the convenience sampling method. Six hundred patients were recruited. After obtaining informed consent, a questionnaire collecting general information and a neuropathy disability score (NDS) questionnaire were administered. The NDS was used to classify the severity of the disease. We used MSVM with both one-against-all and one-against-one methods and three kernel functions, radial basis function (RBF), linear, and polynomial, to predict the class of disease with an unbalanced dataset. The synthetic minority class oversampling technique algorithm was used to improve model performance. To compare the performance of the models, the mean of accuracy was used. RESULTS For predicting diabetic neuropathy, a classifier built from a balanced dataset and the RBF kernel function with a one-against-one strategy predicted the class to which a patient belonged with about 76% accuracy. CONCLUSIONS The results of this study indicate that, in terms of overall classification accuracy, the MSVM model based on a balanced dataset can be useful for predicting the severity of diabetic neuropathy, and it should be further investigated for the prediction of other diseases. PMID:27032459
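The difference between the one-against-all and one-against-one strategies mentioned above lies in how a multicategory problem is split into binary ones. The sketch below only enumerates those binary subproblems for four classes; it does not train any classifier. The category labels and function names are hypothetical, since the abstract does not name the four severity categories.

```python
from itertools import combinations


def one_against_all(classes):
    """One binary problem per class: that class versus all the others."""
    return [(c, [o for o in classes if o != c]) for c in classes]


def one_against_one(classes):
    """One binary problem per unordered pair of classes."""
    return list(combinations(classes, 2))


# Hypothetical labels for the four severity categories.
severity = ["none", "mild", "moderate", "severe"]
ova = one_against_all(severity)
ovo = one_against_one(severity)
print(len(ova), len(ovo))  # 4 6
```

With k classes, one-against-all needs k binary classifiers and one-against-one needs k(k-1)/2, each trained on a smaller subset of the data.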
NASA Astrophysics Data System (ADS)
Ha, W.; Gowda, P. H.; Oommen, T.; Howell, T. A.; Hernandez, J. E.
2010-12-01
High spatial resolution Land Surface Temperature (LST) images are required to estimate evapotranspiration (ET) at a field scale for irrigation scheduling purposes. Satellite sensors such as Landsat 5 Thematic Mapper (TM) and Moderate Resolution Imaging Spectroradiometer (MODIS) can offer images at several spectral bandwidths including visible, near-infrared (NIR), shortwave-infrared, and thermal-infrared (TIR). The TIR images usually have coarser spatial resolutions than those from non-thermal infrared bands. Due to this technical constraint of the satellite sensors on these platforms, image downscaling has been proposed in the field of ET remote sensing. This paper explores the potential of the Support Vector Machines (SVM) to perform downscaling of LST images derived from aircraft (4 m spatial resolution), TM (120 m), and MODIS (1000 m) using normalized difference vegetation index images derived from simultaneously acquired high resolution visible and NIR data (1 m for aircraft, 30 m for TM, and 250 m for MODIS). The SVM is a new generation machine learning algorithm that has found a wide application in the field of pattern recognition and time series analysis. The SVM would be ideally suited for downscaling problems due to its generalization ability in capturing non-linear regression relationship between the predictand and the multiple predictors. Remote sensing data acquired over the Texas High Plains during the 2008 summer growing season will be used in this study. Accuracy assessment of the downscaled 1, 30, and 250 m LST images will be made by comparing them with LST data measured with infrared thermometers at a small spatial scale, upscaled 30 m aircraft-based LST images, and upscaled 250 m TM-based LST images, respectively.
A divide-and-combine method for large scale nonparallel support vector machines.
Tian, Yingjie; Ju, Xuchan; Shi, Yong
2016-03-01
Nonparallel Support Vector Machine (NPSVM), which is more flexible and generalizes better than the typical SVM, is widely used for classification. Although some methods and toolboxes like SMO and libsvm are used for NPSVM, NPSVM is hard to scale up when facing millions of samples. In this paper, we propose a divide-and-combine method for large scale nonparallel support vector machine (DCNPSVM). In the division step, DCNPSVM divides samples into smaller sub-samples, aiming at solving smaller subproblems independently. We theoretically and experimentally prove that the objective function value, solutions, and support vectors solved by DCNPSVM are close to the objective function value, solutions, and support vectors of the whole NPSVM problem. In the combination step, the sub-solutions are combined as initial iteration points and used to solve the whole problem by global coordinate descent, which converges quickly. In order to balance accuracy and efficiency, we adopt a multi-level structure which outperforms state-of-the-art methods. Moreover, our DCNPSVM can tackle imbalanced problems efficiently by tuning the parameters. Experimental results on many large data sets show the effectiveness of our method in memory usage, classification accuracy and time consumption. PMID:26690682
Technology Transfer Automated Retrieval System (TEKTRAN)
This paper presents a novel wrinkle evaluation method that uses modified wavelet coefficients and an optimized support-vector-machine (SVM) classification scheme to characterize and classify wrinkle appearance of fabric. Fabric images were decomposed with the wavelet transform (WT), and five parame...
Technology Transfer Automated Retrieval System (TEKTRAN)
This study evaluated linear spectral unmixing (LSU), mixture tuned matched filtering (MTMF) and support vector machine (SVM) techniques for detecting and mapping giant reed (Arundo donax L.), an invasive weed that presents a severe threat to agroecosystems and riparian areas throughout the southern ...
Support Vector Machine applied to predict the zoonotic potential of E. coli O157 cattle isolates
Technology Transfer Automated Retrieval System (TEKTRAN)
Methods based on sequence data analysis facilitate the tracking of disease outbreaks, allow relationships between strains to be reconstructed and virulence factors to be identified. However, these methods are used postfactum after an outbreak has happened. Here, we show that support vector machine a...
Zhu, Biyun; Chen, Hui; Chen, Budong; Xu, Yan; Zhang, Kuan
2014-02-01
This study aims to explore the classification ability of decision trees (DTs) and support vector machines (SVMs) to discriminate between the digital chest radiographs (DRs) of pneumoconiosis patients and control subjects. Twenty-eight wavelet-based energy texture features were calculated at the lung fields on DRs of 85 healthy controls and 40 patients with stage I and stage II pneumoconiosis. DTs with algorithm C5.0 and SVMs with four different kernels were trained by samples with two combinations of the texture features to classify a DR as of a healthy subject or of a patient with pneumoconiosis. All of the models were developed with fivefold cross-validation, and the final performances of each model were compared by the area under receiver operating characteristic (ROC) curve. For both SVM (with a radial basis function kernel) and DT (with algorithm C5.0), areas under ROC curves (AUCs) were 0.94 ± 0.02 and 0.86 ± 0.04 (P = 0.02) when using the full feature set and 0.95 ± 0.02 and 0.88 ± 0.04 (P = 0.05) when using the selected feature set, respectively. When built on the selected texture features, the SVM with a polynomial kernel showed a higher diagnostic performance with an AUC value of 0.97 ± 0.02 than SVMs with a linear kernel, a radial basis function kernel and a sigmoid kernel with AUC values of 0.96 ± 0.02 (P = 0.37), 0.95 ± 0.02 (P = 0.24), and 0.90 ± 0.03 (P = 0.01), respectively. The SVM model with a polynomial kernel built on the selected feature set showed the highest diagnostic performance among all tested models when using either all the wavelet texture features or the selected ones. The model has a good potential in diagnosing pneumoconiosis based on digital chest radiographs.
2012-01-01
Background Myocardial ischemia can develop into more serious diseases. Detecting the ischemic syndrome in the electrocardiogram (ECG) early, more accurately, and automatically can prevent it from developing into a catastrophic disease. To this end, we propose a new method, which employs wavelets and simple feature selection. Methods For training and testing, the European ST-T database is used, which comprises 367 ischemic ST episodes in 90 records. We first remove baseline wandering, and detect time positions of QRS complexes by a method based on the discrete wavelet transform. Next, for each heart beat, we extract three features which can be used for differentiating ST episodes from normal: 1) the area between QRS offset and T-peak points, 2) the normalized and signed sum from QRS offset to effective zero voltage point, and 3) the slope from QRS onset to offset point. We average the feature values over five successive beats to reduce the effects of outliers. Finally we apply classifiers to those features. Results We evaluated the algorithm by kernel density estimation (KDE) and support vector machine (SVM) methods. Sensitivity and specificity for KDE were 0.939 and 0.912, respectively. The KDE classifier detects 349 ischemic ST episodes out of a total of 367 ST episodes. Sensitivity and specificity of SVM were 0.941 and 0.923, respectively. The SVM classifier detects 355 ischemic ST episodes. Conclusions We proposed a new method for detecting ischemia in ECG. It contains signal processing techniques of removing baseline wandering and detecting time positions of QRS complexes by discrete wavelet transform, and feature extraction from morphology of ECG waveforms explicitly. It was shown that the number of selected features was sufficient to discriminate ischemic ST episodes from the normal ones. We also showed how the proposed KDE classifier can automatically select kernel bandwidths, meaning that the algorithm does not require any numerical values of the parameters
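The averaging of per-beat feature values over five successive beats, used above to damp outliers, can be illustrated as follows. The abstract does not say whether the windows overlap; this sketch assumes a sliding window that advances one beat at a time, and the function name and toy values are hypothetical.

```python
def average_over_beats(values, window=5):
    """Mean of each run of `window` successive per-beat feature values."""
    if window < 1:
        raise ValueError("window must be positive")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Toy per-beat feature values with a single outlier beat.
feature = [1.0, 1.0, 1.0, 11.0, 1.0, 1.0, 1.0]
smoothed = average_over_beats(feature)
print(smoothed)  # [3.0, 3.0, 3.0]
```

The outlier of 11.0 is reduced to 3.0 in every window that contains it, which is the smoothing effect the averaging step is meant to achieve.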
Towards automatic lithological classification from remote sensing data using support vector machines
NASA Astrophysics Data System (ADS)
Yu, Le; Porwal, Alok; Holden, Eun-Jung; Dentith, Michael
2010-05-01
Remote sensing data can be effectively used as a means to build geological knowledge for poorly mapped terrains. Spectral remote sensing data from space- and air-borne sensors have been widely used for geological mapping, especially in areas of high outcrop density in arid regions. However, spectral remote sensing information by itself cannot be efficiently used for a comprehensive lithological classification of an area because (1) the diagnostic spectral response of a rock within an image pixel is conditioned by several factors including atmospheric effects, spectral and spatial resolution of the image, sub-pixel level heterogeneity in chemical and mineralogical composition of the rock, and the presence of soil and vegetation cover; and (2) it provides only surface information and is therefore highly sensitive to noise due to weathering, soil cover, and vegetation. Consequently, for efficient lithological classification, spectral remote sensing data needs to be supplemented with other remote sensing datasets that provide geomorphological and subsurface geological information, such as digital elevation model (DEM) and aeromagnetic data. Each of the datasets contains significant information about geology, and in conjunction they can potentially be used for automated lithological classification using supervised machine learning algorithms. In this study, support vector machine (SVM), which is a kernel-based supervised learning method, was applied to automated lithological classification of a study area in northwestern India using remote sensing data, namely, ASTER, DEM and aeromagnetic data. Several digital image processing techniques were used to produce derivative datasets that contained enhanced information relevant to lithological discrimination. A series of SVMs (trained using k-fold cross-validation with grid search) were tested using various combinations of input datasets selected from among 50 datasets including the original 14 ASTER bands and 36 derivative datasets (including 14
NASA Astrophysics Data System (ADS)
Salas-Gonzalez, D.; Górriz, J. M.; Ramírez, J.; López, M.; Álvarez, I.; Segovia, F.; Chaves, R.; Puntonet, C. G.
2010-05-01
This paper presents a computer-aided diagnosis technique for improving the accuracy of early diagnosis of Alzheimer-type dementia. The proposed methodology is based on selecting the voxels for which the Welch's t-test statistic between the two classes, normal and Alzheimer images, is greater than a given threshold. The mean and standard deviation of the intensity values are calculated for the selected voxels and used as feature vectors for two different classifiers: support vector machines with a linear kernel and classification trees. The proposed methodology reaches greater than 95% accuracy in the classification task.
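The voxel-selection step described above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function names, the toy intensity values, and the use of the absolute value of the statistic are assumptions, and images are represented as flat lists of voxel intensities.

```python
from statistics import mean, pstdev, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    return (mean(a) - mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)


def select_voxels(normal, patient, threshold):
    """Indices of voxels whose |t| between the two groups exceeds threshold.

    Assumption: the absolute value is used; the abstract only says the
    statistic must be greater than a given threshold.
    """
    selected = []
    for v in range(len(normal[0])):
        t = welch_t([img[v] for img in normal], [img[v] for img in patient])
        if abs(t) > threshold:
            selected.append(v)
    return selected


def features(image, selected):
    """Mean and standard deviation of intensity over the selected voxels."""
    vals = [image[v] for v in selected]
    return mean(vals), pstdev(vals)


# Hypothetical toy data: 3 images per class, 4 voxels per image.
normal = [[10, 5, 7, 20], [11, 5, 8, 22], [12, 6, 7, 21]]
patient = [[4, 5, 7, 15], [5, 6, 8, 14], [3, 5, 7, 16]]

selected = select_voxels(normal, patient, threshold=2.0)
feature_vector = features(normal[0], selected)
```

The resulting two-element feature vector per image is what would then be passed to a classifier such as a linear-kernel SVM or a classification tree.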
Using a Support Vector Machine (SVM) to Improve Generalization Ability of Load Model Parameters
Ma, Jian; Dong, Zhao Yang; Zhang, Pei
2009-04-24
Load modeling plays an important role in power system stability analysis and planning studies. The parameters of load models may experience variations in different application situations. Choosing appropriate parameters is critical for dynamic simulation and stability studies in power system. This paper presents a method to select the parameters with good generalization ability based on a given large number of available parameters that have been identified from dynamic simulation data in different scenarios. Principal component analysis is used to extract the major features of the given parameter sets. Reduced feature vectors are obtained by mapping the given parameter sets into principal component space. Then support vectors are found by implementing a classification problem. Load model parameters based on the obtained support vectors are built to reflect the dynamic property of the load. All of the given parameter sets were identified from simulation data based on the New England 10-machine 39-bus system, by taking into account different situations, such as load types, fault locations, fault types, and fault clearing time. The parameters obtained by support vector machine have good generalization capability, and can represent the load more accurately in most situations.
A Real-Time Interference Monitoring Technique for GNSS Based on a Twin Support Vector Machine Method
Li, Wutao; Huang, Zhigang; Lang, Rongling; Qin, Honglei; Zhou, Kai; Cao, Yongbin
2016-01-01
Interference can severely degrade the performance of Global Navigation Satellite System (GNSS) receivers. As the first step of any GNSS anti-interference measure, interference monitoring is essential. Since interference monitoring can be considered as a classification problem, a real-time interference monitoring technique based on Twin Support Vector Machine (TWSVM) is proposed in this paper. A TWSVM model is established, and TWSVM is solved by the Least Squares Twin Support Vector Machine (LSTWSVM) algorithm. The interference monitoring indicators are analyzed to extract features from the interfered GNSS signals. The experimental results show that the chosen observations can be used as the interference monitoring indicators. The interference monitoring performance of the proposed method is verified using the GPS L1 C/A code signal and compared with that of standard SVM. The experimental results indicate that the TWSVM-based interference monitoring is much faster than the conventional SVM. Furthermore, the training time of TWSVM is at the millisecond (ms) level and the monitoring time is at the microsecond (μs) level, which makes the proposed approach usable in practical interference monitoring applications. PMID:26959020
NASA Astrophysics Data System (ADS)
Dong, Yu; Zhang, Tao; Xi, Ling
2015-01-01
Stego images produced by unknown steganographic algorithms may currently go undetected by steganalysis detectors based on a binary classifier. However, it is difficult to obtain high detection accuracy with universal steganalysis based on a one-class classifier. To solve this problem, a blind detection method for JPEG steganography is proposed from the perspective of information theory. The proposed method combines semisupervised learning and a soft-margin support vector machine with a steganalysis detector based on a one-class classifier, so as to use the information in the test data to improve detection performance. Reliable blind detection for JPEG steganography was realized using only cover images for training. The experimental results show that the proposed method improves the detection accuracy of the one-class steganalysis detector and is robust under different source mismatch conditions.
2014-01-01
Sales forecasting plays an important role in operating a business since it can be used to determine the inventory level required to meet consumer demand and to avoid under- or overstocking. Improving the accuracy of sales forecasting has therefore become an important issue. This study proposes a hybrid sales forecasting scheme that combines independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses ICA to extract hidden information from the observed sales data. The extracted features are then passed to the K-means algorithm, which clusters the sales data into several disjoint clusters. Finally, SVR forecasting models are applied to each cluster to generate the final forecasting results. Experimental results on information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting. PMID:25045738
NASA Astrophysics Data System (ADS)
Ceryan, Nurcihan
2014-12-01
The uniaxial compressive strength (UCS) of intact rocks is an important and pertinent property for characterizing a rock mass. Standard UCS tests are known to be destructive, expensive and time-consuming, particularly for thinly bedded, highly fractured, foliated, highly porous and weak rocks. Consequently, prediction models have become an attractive alternative for engineering geologists. In the last several years, an alternative kernel-based technique, support vector machines (SVMs), has become popular in modeling studies. Despite superior SVM performance, this technique has certain significant practical drawbacks. Hence, the relevance vector machines (RVMs) approach has been proposed to recast the main ideas underlying SVMs in a Bayesian context. The primary purpose of this study is to examine the applicability and capability of RVM and SVM models for predicting the UCS of volcanic rocks from NE Turkey and to compare their performance with that of ANN models. In these models, the porosity and P-durability index, representing microstructural variables, are the input parameters. The study results indicate that these methods can successfully predict the UCS of the volcanic rocks. The SVM and RVM performed better than the ANN model. Among the kernel-based models, the RVM model was the more successful in terms of statistical performance criteria (e.g., the performance index (PI) values for training and testing data are 1.579 and 1.449 for RVM, against 1.509 and 1.307 for SVM). Although SVM and RVM models are both powerful techniques, the RVM run time was considerably shorter, and it yielded the highest accuracy.
NASA Astrophysics Data System (ADS)
Xie, Li; Li, Guangyao; Xiao, Mang; Peng, Lei
2016-04-01
Various kinds of remote sensing image classification algorithms have been developed to adapt to the rapid growth of remote sensing data. Conventional methods typically have restrictions in either classification accuracy or computational efficiency. Aiming to overcome the difficulties, a new solution for remote sensing image classification is presented in this study. A discretization algorithm based on information entropy is applied to extract features from the data set and a vector space model (VSM) method is employed as the feature representation algorithm. Because of the simple structure of the feature space, the training rate is accelerated. The performance of the proposed method is compared with two other algorithms: back propagation neural networks (BPNN) method and ant colony optimization (ACO) method. Experimental results confirm that the proposed method is superior to the other algorithms in terms of classification accuracy and computational efficiency.
NASA Astrophysics Data System (ADS)
Huber, Markus B.; Yang, Chien-Chun; Carballido-Gamio, Julio; Bauer, Jan S.; Baum, Thomas; Nagarajan, Mahesh B.; Eckstein, Felix; Lochmüller, Eva; Majumdar, Sharmila; Link, Thomas M.; Wismüller, Axel
2012-03-01
To improve the clinical assessment of osteoporotic hip fracture risk, recent computer-aided diagnosis systems explore new approaches to estimate the local trabecular bone quality beyond bone density alone to predict femoral bone strength. In this context, statistical bone mineral density (BMD) features extracted from multi-detector computed tomography (MDCT) images of proximal femur specimens and different function approximation methods were compared in their ability to predict the biomechanical strength. MDCT scans were acquired in 146 proximal femur specimens harvested from human cadavers. The femurs' failure load (FL) was determined through biomechanical testing. An automated volume of interest (VOI)-fitting algorithm was used to define a consistent volume in the femoral head of each specimen. In these VOIs, the trabecular bone was represented by statistical moments of the BMD distribution and by pairwise spatial occurrence of BMD values using the gray-level co-occurrence matrix (GLCM) approach. A linear multi-regression analysis (MultiReg) and a support vector regression algorithm with a linear kernel (SVRlin) were used to predict the FL from the image feature sets. The prediction performance was measured by the root mean square error (RMSE) for each image feature on independent test sets; in addition, the coefficient of determination R2 was calculated. The best prediction result was obtained with a GLCM feature set using SVRlin, which had the lowest prediction error (RMSE = 1.040+/-0.143, R2 = 0.544), significantly lower than that of the standard approach of using BMD.mean and MultiReg (RMSE = 1.093+/-0.133, R2 = 0.490, p<0.0001). The combined sets including BMD.mean and GLCM features had a similar or slightly lower performance than using only GLCM features. The results indicate that the performance of high-dimensional BMD features extracted from MDCT images in predicting the biomechanical strength of proximal femur specimens can be significantly improved by
A vectorized algorithm for 3D dynamics of a tethered satellite
NASA Technical Reports Server (NTRS)
Wilson, Howard B.
1989-01-01
Equations of motion characterizing the three-dimensional motion of a tethered satellite during the retrieval phase are studied. The mathematical model involves an arbitrary number of point masses connected by weightless cords. Motion occurs in a gravity gradient field. The formulation presented accounts for general functions describing support point motion, rate of tether retrieval, and arbitrary forces applied to the point masses. The matrix-oriented programming language MATLAB is used to produce an efficient vectorized formulation for computing natural frequencies and mode shapes for small oscillations about the static equilibrium configuration, and for integrating the nonlinear differential equations governing large-amplitude motions. An example of time response pertaining to the skip rope effect is investigated.
NASA Astrophysics Data System (ADS)
Salamunićcar, Goran; Lončarić, Sven
In our previous work, in order to extend the GT-57633 catalogue [PSS, 56 (15), 1992-2008] with still uncatalogued impact-craters, the following has been done [GRS, 48 (5), in press, doi:10.1109/TGRS.2009.2037750]: (1) the crater detection algorithm (CDA) based on digital elevation model (DEM) was developed; (2) using 1/128° MOLA data, this CDA proposed 414631 crater-candidates; (3) each crater-candidate was analyzed manually; and (4) 57592 were confirmed as correct detections. The resulting GT-115225 catalog is the significant result of this effort. However, to check such a large number of crater-candidates manually was a demanding task. This was the main motivation for work on improvement of the CDA in order to provide better classification of craters as true and false detections. To achieve this, we extended the CDA with the machine learning capability, using support vector machines (SVM). In the first step, the CDA (re)calculates numerous terrain morphometric attributes from DEM. For this purpose, already existing modules of the CDA from our previous work were reused in order to be capable to prepare these attributes. In addition, new attributes were introduced such as ellipse eccentricity and tilt. For machine learning purpose, the CDA is additionally extended to provide 2-D topography-profile and 3-D shape for each crater-candidate. The latter two are a performance problem because of the large number of crater-candidates in combination with the large number of attributes. As a solution, we developed a CDA architecture wherein it is possible to combine the SVM with a radial basis function (RBF) or any other kernel (for initial set of attributes), with the SVM with linear kernel (for the cases when 2-D and 3-D data are included as well). Another challenge is that, in addition to diversity of possible crater types, there are numerous morphological differences between the smallest (mostly very circular bowl-shaped craters) and the largest (multi-ring) impact
NASA Astrophysics Data System (ADS)
Wen, Hongwei; Liu, Yue; Wang, Jieqiong; Zhang, Jishui; Peng, Yun; He, Huiguang
2016-03-01
Tourette syndrome (TS) is a developmental neuropsychiatric disorder with the cardinal symptoms of motor and vocal tics, which emerges in early childhood and fluctuates in severity in later years. To date, the neural basis of TS is not fully understood, and TS has a long-term prognosis that is difficult to estimate accurately. Few studies have looked at the potential of using diffusion tensor imaging (DTI) in conjunction with machine learning algorithms to automate the classification of healthy children and TS children. Here we apply the Tract-Based Spatial Statistics (TBSS) method to 44 TS children and 48 age- and gender-matched healthy children in order to extract the diffusion values from each voxel in the white matter (WM) skeleton, and use a feature selection algorithm (ReliefF) to select the most salient voxels for subsequent classification with a support vector machine (SVM). We use nested cross-validation to yield an unbiased assessment of the classification method and prevent overestimation. The SVM classifier reached its peak performance with the axial diffusion (AD) metric, achieving an accuracy of 88.04%, a sensitivity of 88.64% and a specificity of 87.50%, which demonstrates the potential of a joint TBSS and SVM pipeline for fast, objective classification of healthy and TS children. These results suggest that our methods may be useful for the early identification of subjects with TS, and hold promise for predicting prognosis and treatment outcome for individuals with TS.
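As a quick consistency check on the figures above, the three reported rates can be reproduced from a confusion matrix. The counts below are not given in the abstract; they are inferred here as the whole-number counts that match the reported percentages for 44 TS and 48 healthy children, and should be read as an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate (TS detected)
    specificity = tn / (tn + fp)            # true negative rate (healthy detected)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Assumed counts (inferred, not reported): 39 of 44 TS children and
# 42 of 48 healthy children classified correctly.
acc, sens, spec = classification_metrics(tp=39, fn=5, tn=42, fp=6)
print(f"{acc:.2%} {sens:.2%} {spec:.2%}")  # prints "88.04% 88.64% 87.50%"
```

With these counts, 81 of the 92 children are classified correctly, which matches the reported overall accuracy.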
Identification of comment-on sentences in online biomedical documents using support vector machines
NASA Astrophysics Data System (ADS)
Kim, In Cheol; Le, Daniel X.; Thoma, George R.
2007-01-01
MEDLINE (R) is the premier bibliographic online database of the National Library of Medicine, containing approximately 14 million citations and abstracts from over 4,800 biomedical journals. This paper presents an automated method based on support vector machines to identify a "comment-on" list, which is a field in a MEDLINE citation denoting previously published articles commented on by a given article. For comparative study, we also introduce another method based on scoring functions that estimate the significance of each sentence in a given article. Preliminary experiments conducted on HTML-formatted online biomedical documents collected from 24 different journal titles show that the support vector machine with polynomial kernel function performs best in terms of recall and F-measure rates.
Color Image Watermarking Scheme Based on Efficient Preprocessing and Support Vector Machines
NASA Astrophysics Data System (ADS)
Fındık, Oğuz; Bayrak, Mehmet; Babaoğlu, Ismail; Çomak, Emre
This paper suggests a new block-based watermarking technique utilizing preprocessing and a support vector machine (PPSVMW) to protect the intellectual property rights of color images. A binary test set is employed to train the support vector machine (SVM). Before the binary data are added to the original image, the blocks are separated into two parts so that the SVM can be trained with better accuracy: the 1-valued watermark bits are randomly added into the first block part and the 0-valued bits into the second block part. The watermark is embedded by modifying the blue-channel pixel value in the middle of each block, which yields the watermarked image. The SVM is trained with the set bits and three other features, which are averages of the pixel differences in three distinct shapes extracted from each block; hence the watermark can be extracted without the original image. The results of the proposed PPSVMW technique were compared with those of Tsai's technique, and our technique proved to be more efficient.
Support vector machine based IS/OS disruption detection from SD-OCT images
NASA Astrophysics Data System (ADS)
Wang, Liyun; Zhu, Weifang; Liao, Jianping; Xiang, Dehui; Jin, Chao; Chen, Haoyu; Chen, Xinjian
2014-03-01
In this paper, we sought a method to automatically detect the Inner Segment/Outer Segment (IS/OS) disruption region. A novel support vector machine (SVM) based method is proposed for IS/OS disruption detection. The method includes two parts: training and testing. During the training phase, 7 features from the region around the fovea are calculated, and an SVM is used as the classification method. In the testing phase, the trained model is used to classify the disruption and non-disruption regions of the IS/OS, and the accuracy is calculated separately for each. The proposed method was tested on SD-OCT images from 9 patients using a leave-one-out strategy. The preliminary results demonstrate the feasibility and efficiency of the proposed method.
NASA Astrophysics Data System (ADS)
Zheng, Jun; Shao, Xinyu; Gao, Liang; Jiang, Ping; Qiu, Haobo
2015-06-01
Engineering design, especially for complex engineering systems, is usually a time-consuming process involving computation-intensive computer-based simulation and analysis methods. A difference mapping method using least squares support vector regression is developed in this work, as a special metamodelling methodology that includes variable-fidelity data, to replace the computationally expensive computer codes. A general difference mapping framework is proposed in which a surrogate base is first created, and the approximation is then obtained by mapping the difference between the base and the real high-fidelity response surface. Least squares support vector regression is adopted to accomplish the mapping. Two different sampling strategies, nested and non-nested design of experiments, are conducted to explore their respective effects on modelling accuracy. Different sample sizes and three approximation performance measures of accuracy are considered.
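The difference mapping idea can be illustrated with a small sketch: build a cheap base model, learn the difference between it and a handful of high-fidelity samples, and add the learned difference back to the base. Note the substitution: an ordinary least-squares line stands in for the least squares support vector regression used in the paper, purely to keep the example dependency-free. Both toy functions and all names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def low_fidelity(x):
    """Cheap surrogate base (hypothetical)."""
    return x * x


def high_fidelity(x):
    """Expensive simulation, sampled only at a few points (hypothetical)."""
    return x * x + 0.5 * x + 1.0


# A few high-fidelity samples; the difference to the base is what gets mapped.
xs_hi = [0.0, 1.0, 2.0, 3.0]
diffs = [high_fidelity(x) - low_fidelity(x) for x in xs_hi]
slope, intercept = fit_line(xs_hi, diffs)


def surrogate(x):
    """Base prediction plus the mapped difference."""
    return low_fidelity(x) + slope * x + intercept
```

In this toy case the difference is exactly linear, so the surrogate reproduces the high-fidelity function even at unsampled points; a nonlinear difference is where a kernel regressor such as LS-SVR would be needed.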
NASA Astrophysics Data System (ADS)
Chiang, Jie-Lun; Tsai, Kuang-Jung; Chen, Yie-Ruey; Lee, Ming-Hsi; Sun, Jai-Wei
2014-05-01
A strong correlation exists between river discharge and suspended sediment load. In this study, this relationship was used to estimate suspended sediment load with a regression model, an artificial neural network and a support vector machine. Records of river discharges and suspended sediment loads in the Goodwin Creek Experimental Watershed in the United States were investigated as a case study. Seventy percent of the records were used as the training data set to develop the prediction models. The other thirty percent were used as the verification data set. The performance of the models was evaluated by the mean absolute percentage error (MAPE). The MAPEs show that the support vector machine outperforms the artificial neural network and the regression model. The results show that the proposed SVM achieves a MAPE of less than 14% for 120-minute prediction (four time steps). As a result, we believe that the proposed SVM model has high potential for predicting suspended sediment load.
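The evaluation protocol above (a 70/30 split scored by MAPE) can be sketched as follows. The split here is taken in record order, which is an assumption since the abstract does not say how the records were divided, and the numbers are hypothetical illustration values rather than Goodwin Creek data.

```python
def split_records(records, train_percent=70):
    """Split records in order into training and verification sets."""
    cut = len(records) * train_percent // 100
    return records[:cut], records[cut:]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical sediment loads and model predictions for three verification records.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
error = mape(observed, predicted)

train, verify = split_records(list(range(10)))
```

Because MAPE divides by the observed value, records with a zero observed load would need to be excluded or handled separately.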
Wen, Congcong; Wang, Zhiyi; Zhang, Meiling; Wang, Shuanghu; Geng, Peiwu; Sun, Fa; Chen, Mengchun; Lin, Guanyang; Hu, Lufeng; Ma, Jianshe; Wang, Xianqin
2016-01-01
Paraquat is quick-acting and non-selective, killing green plant tissue on contact; it is also toxic to human beings and animals. In this study, we developed a urine metabonomic method based on gas chromatography-mass spectrometry to evaluate the effect of acute paraquat poisoning on rats. Pattern recognition analysis, including both partial least squares discriminant analysis and principal component analysis, revealed that acute paraquat poisoning induced metabolic perturbations. Compared with the control group, the levels of benzeneacetic acid and hexadecanoic acid in the acute paraquat poisoning group (intragastric administration, 36 mg/kg) increased, while the levels of butanedioic acid, pentanedioic acid and altronic acid decreased. Based on these urinary metabolomics data, a support vector machine was applied to discriminate the metabolomic changes of the paraquat groups from the control group, and achieved 100% classification accuracy. In conclusion, the metabonomic method combined with a support vector machine can be used as a diagnostic tool in paraquat-poisoned rats.
Improving Empirical Mode Decomposition Using Support Vector Machines for Multifocus Image Fusion
Chen, Shaohui; Su, Hongbo; Zhang, Renhua; Tian, Jing; Yang, Lihu
2008-01-01
Empirical mode decomposition (EMD) is good at analyzing nonstationary and nonlinear signals while support vector machines (SVMs) are widely used for classification. In this paper, a combination of EMD and SVM is proposed as an improved method for fusing multifocus images. Experimental results show that the proposed method is superior to the fusion methods based on à-trous wavelet transform (AWT) and EMD in terms of quantitative analyses by Root Mean Squared Error (RMSE) and Mutual Information (MI).
Hipp, Jason D.; Cheng, Jerome Y.; Toner, Mehmet; Tompkins, Ronald G.; Balis, Ulysses J.
2011-01-01
Introduction: Historically, effective clinical utilization of image analysis and pattern recognition algorithms in pathology has been hampered by two critical limitations: 1) the availability of digital whole slide imagery data sets and 2) a relative domain knowledge deficit in terms of application of such algorithms, on the part of practicing pathologists. With the advent of the recent and rapid adoption of whole slide imaging solutions, the former limitation has been largely resolved. However, with the expectation that it is unlikely for the general cohort of contemporary pathologists to gain advanced image analysis skills in the short term, the latter problem remains, thus underscoring the need for a class of algorithm that has the concurrent properties of image domain (or organ system) independence and extreme ease of use, without the need for specialized training or expertise. Results: In this report, we present a novel, general case pattern recognition algorithm, Spatially Invariant Vector Quantization (SIVQ), that overcomes the aforementioned knowledge deficit. Fundamentally based on conventional Vector Quantization (VQ) pattern recognition approaches, SIVQ gains its superior performance and essentially zero-training workflow model from its use of ring vectors, which exhibit continuous symmetry, as opposed to square or rectangular vectors, which do not. By use of the stochastic matching properties inherent in continuous symmetry, a single ring vector can exhibit as much as a millionfold improvement in matching possibilities, as opposed to conventional VQ vectors. SIVQ was utilized to demonstrate rapid and highly precise pattern recognition capability in a broad range of gross and microscopic use-case settings. Conclusion: With the performance of SIVQ observed thus far, we find evidence that indeed there exist classes of image analysis/pattern recognition algorithms suitable for deployment in settings where pathologists alone can effectively incorporate their
Liu, Zhong-bao; Gao, Yan-yun; Wang, Jian-zhen
2015-01-01
Support vector machine (SVM), with its good learning ability and generalization, is widely used in star spectra data classification. However, as the scale of the data grows, the shortcomings of SVM appear: the amount of computation is large and classification is slow. To address these problems, the twin support vector machine (TWSVM) was proposed by Jayadeva. The advantage of TWSVM is that the time cost is reduced to 1/4 of that of SVM. However, all the methods mentioned above focus only on the global characteristics and neglect the local characteristics. In view of this, an automatic classification method for star spectra data based on the manifold fuzzy twin support vector machine (MF-TSVM) is proposed in this paper. In MF-TSVM, manifold-based discriminant analysis (MDA) is used to obtain the global and local characteristics of the input data, and fuzzy membership is introduced to reduce the influence of noise and singular data on the classification results. Comparative experiments with current classification methods, such as C-SVM and KNN, on the SDSS star spectra datasets verify the effectiveness of the proposed method. PMID:25993861
Prediction of Spirometric Forced Expiratory Volume (FEV1) Data Using Support Vector Regression
NASA Astrophysics Data System (ADS)
Kavitha, A.; Sujatha, C. M.; Ramakrishnan, S.
2010-01-01
In this work, prediction of forced expiratory volume in 1 second (FEV1) in pulmonary function tests is carried out using spirometry and support vector regression analysis. Pulmonary function data are measured with a flow-volume spirometer from volunteers (N=175) using a standard data acquisition protocol. The acquired data are then used to predict FEV1. Support vector machines with polynomial kernel functions of four different orders were employed to predict the values of FEV1. The performance is evaluated by computing the average prediction accuracy for normal and abnormal cases. Results show that support vector machines are capable of predicting FEV1 in both normal and abnormal cases, and that the average prediction accuracy for normal subjects was higher than that for abnormal subjects. Prediction accuracy was found to be high for a regularization constant of C=10. Since FEV1 is the most significant parameter in the analysis of spirometric data, it appears that this method of assessment is useful in diagnosing pulmonary abnormalities from incomplete data and poorly recorded data.
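To make the role of the kernel order concrete, the sketch below evaluates a polynomial kernel for orders 1 to 4 on one pair of feature vectors. The abstract does not state the exact kernel form or which orders were used, so the common form K(x, y) = (gamma * x.y + coef0)^degree, the default constants, and the input vectors are all assumptions.

```python
def polynomial_kernel(x, y, degree, gamma=1.0, coef0=1.0):
    """Polynomial kernel (gamma * <x, y> + coef0) ** degree (assumed form)."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


# Two hypothetical feature vectors, not actual spirometric measurements.
x = [1.0, 2.0]
y = [0.5, 1.0]

# One kernel value per polynomial order; higher orders give a more flexible fit.
values = [polynomial_kernel(x, y, degree) for degree in (1, 2, 3, 4)]
```

In a support vector regression library, the order would typically be chosen alongside the regularization constant C by comparing prediction accuracy on held-out data.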
Online Support Vector Regression with Varying Parameters for Time-Dependent Data
Omitaomu, Olufemi A; Jeong, Myong K; Badiru, Adedeji B
2011-01-01
Support vector regression (SVR) is a machine learning technique that continues to receive interest in several domains, including manufacturing, engineering, and medicine. In order to extend its application to problems in which data arrive constantly and in which batch processing is infeasible or expensive, an accurate online support vector regression (AOSVR) technique was proposed. The AOSVR technique efficiently updates a trained SVR function whenever a sample is added to or removed from the training set, without retraining on the entire training data. However, the AOSVR technique assumes that the new samples and the training samples have the same characteristics; hence, the same values of the SVR parameters are used for training and prediction. This assumption does not hold for data samples that are inherently noisy and non-stationary, such as sensor data. As a result, we propose Accurate On-line Support Vector Regression with Varying Parameters (AOSVR-VP), which uses varying rather than fixed SVR parameters and hence accounts for the variability that may exist in the samples. To accomplish this objective, we also propose a generalized weight function to automatically update the weights of the SVR parameters in on-line monitoring applications. The proposed function allows for lower and upper bounds on the SVR parameters. We tested our proposed approach and compared the results with the conventional AOSVR approach using two benchmark time series datasets and sensor data from a nuclear power plant. The results show that using varying SVR parameters is more applicable to time-dependent data.
NASA Astrophysics Data System (ADS)
Cui, Ying; Dy, Jennifer G.; Alexander, Brian; Jiang, Steve B.
2008-08-01
Various problems with the current state-of-the-art techniques for gated radiotherapy have prevented this new treatment modality from being widely implemented in clinical routine. These problems are caused mainly by applying various external respiratory surrogates. There might be large uncertainties in deriving the tumor position from external respiratory surrogates. While tracking implanted fiducial markers has sufficient accuracy, this procedure may not be widely accepted due to the risk of pneumothorax. Previously, we have developed a technique to generate gating signals from fluoroscopic images without implanted fiducial markers using template matching methods (Berbeco et al 2005 Phys. Med. Biol. 50 4481-90, Cui et al 2007b Phys. Med. Biol. 52 741-55). In this note, our main contribution is to provide a totally different new view of the gating problem by recasting it as a classification problem. Then, we solve this classification problem by a well-studied powerful classification method called a support vector machine (SVM). Note that the goal of an automated gating tool is to decide when to turn the beam ON or OFF. We treat ON and OFF as the two classes in our classification problem. We create our labeled training data during the patient setup session by utilizing the reference gating signal, manually determined by a radiation oncologist. We then pre-process these labeled training images and build our SVM prediction model. During treatment delivery, fluoroscopic images are continuously acquired, pre-processed and sent as an input to the SVM. Finally, our SVM model will output the predicted labels as gating signals. We test the proposed technique on five sequences of fluoroscopic images from five lung cancer patients against the reference gating signal as ground truth. We compare the performance of the SVM to our previous template matching method (Cui et al 2007b Phys. Med. Biol. 52 741-55). We find that the SVM is slightly more accurate on average (1-3%) than
NASA Astrophysics Data System (ADS)
Gizaw, Mesgana Seyoum; Gan, Thian Yew
2016-07-01
Regional Flood Frequency Analysis (RFFA) is a statistical method widely used to estimate flood quantiles of catchments with limited streamflow data. In addition, when estimating the flood quantile of an ungauged site, only a limited number of stations with complete datasets may be available from hydrologically similar, surrounding catchments. Besides traditional regression based RFFA methods, recent applications of machine learning algorithms such as the artificial neural network (ANN) have shown encouraging results in regional flood quantile estimations. Another novel machine learning technique that is becoming widely applicable in the hydrologic community is the Support Vector Regression (SVR). In this study, an RFFA model based on SVR was developed to estimate regional flood quantiles for two study areas, one with 26 catchments located in southeastern British Columbia (BC) and another with 23 catchments located in southern Ontario (ON), Canada. The SVR-RFFA model for both study sites was developed from 13 sets of physiographic and climatic predictors for the historical period. The Ef (Nash Sutcliffe coefficient) and R2 of the SVR-RFFA model were about 0.7 when estimating flood quantiles of 10, 25, 50 and 100 year return periods, which indicates satisfactory model performance in both study areas. In addition, the SVR-RFFA model also performed well based on other goodness-of-fit statistics such as BIAS (mean bias) and BIASr (relative BIAS). When the amount of data available for training RFFA models is limited, the SVR-RFFA model was found to perform better than an ANN based RFFA model, with a significantly lower median CV (coefficient of variation) of the estimated flood quantiles. The SVR-RFFA model was then used to project changes in flood quantiles over the two study areas under the impact of climate change using the RCP4.5 and RCP8.5 climate projections of five Coupled Model Intercomparison Project (CMIP5) GCMs (Global Climate Models) for the 2041
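The Ef statistic cited above can be illustrated with a minimal sketch, assuming the standard definition of the Nash-Sutcliffe coefficient (one minus the ratio of the squared model error to the variance of the observations about their mean). The function name and the sample values are hypothetical and are not taken from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the model against the observations.
    model_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Squared deviation of the observations from their own mean.
    obs_variance = sum((o - mean_obs) ** 2 for o in observed)
    if obs_variance == 0:
        raise ValueError("observed values are constant; Ef is undefined")
    return 1.0 - model_error / obs_variance


# Hypothetical flood quantiles (arbitrary units), for illustration only.
observed = [1.0, 2.0, 3.0]
perfect = nash_sutcliffe(observed, [1.0, 2.0, 3.0])
mean_only = nash_sutcliffe(observed, [2.0, 2.0, 2.0])
```

Under this definition, a model that only predicts the observed mean scores 0, so a value of about 0.7 means the model explains most of the variance in the observed quantiles.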
Dileep, Aroor Dinesh; Sekhar, Chellu Chandra
2014-08-01
Dynamic kernel (DK)-based support vector machines are used for the classification of varying length patterns. This paper explores the use of intermediate matching kernel (IMK) as a DK for classification of varying length patterns of long duration speech represented as sets of feature vectors. The main issue in construction of IMK is the choice for the set of virtual feature vectors used to select the local feature vectors for matching. This paper proposes to use components of class-independent Gaussian mixture model (CIGMM) as a representation for the set of virtual feature vectors. For every component of CIGMM, a local feature vector each from the two sets of local feature vectors that has the highest probability of belonging to that component is selected and a base kernel is computed between the selected local feature vectors. The IMK is computed as the sum of all the base kernels corresponding to different components of CIGMM. It is proposed to use the responsibility term weighted base kernels in computation of IMK to improve its discrimination ability. This paper also proposes the posterior probability weighted DKs (including the proposed IMKs) to improve their classification performance and reduce the number of support vectors. The performance of the support vector machine (SVM)-based classifiers using the proposed IMKs is studied for speech emotion recognition and speaker identification tasks and compared with that of the SVM-based classifiers using the state-of-the-art DKs. PMID:25050941
Akutekwe, Arinze; Seker, Huseyin
2015-08-01
Comprehensive understanding of gene regulatory networks (GRNs) is a major challenge in systems biology. Most methods for modeling and inferring the dynamics of GRNs, such as those based on state space models, vector autoregressive models and the G1DBN algorithm, assume linear dependencies among genes. However, this strong assumption does not give a true representation of the time-course relationships across genes, which are inherently nonlinear. Nonlinear modeling methods such as S-systems and causal structure identification (CSI) have been proposed, but are known to be statistically inefficient and analytically intractable in high dimensions. To overcome these limitations, we propose an optimized ensemble approach based on support vector regression (SVR) and dynamic Bayesian networks (DBNs). The method, called SVR-DBN, uses nonlinear kernels of the SVR to infer the temporal relationships among genes within the DBN framework. The two-stage ensemble is further improved by SVR parameter optimization using Particle Swarm Optimization. Results on eight in silico-generated datasets and two real-world datasets of Drosophila melanogaster and Escherichia coli show that our method outperformed the G1DBN algorithm by a total average accuracy of 12%. We further applied our method to model the time-course relationships of ovarian carcinoma. From our results, four hub genes were discovered. Stratified analysis further showed that the expression levels of the Prostate differentiation factor and BTG family member 2 genes were significantly increased by the platinum drugs cisplatin and oxaliplatin, while the expression levels of the Polo-like kinase and Cyclin B1 genes were both decreased by the platinum drugs. These hub genes might be potential biomarkers for ovarian carcinoma. PMID:26738192
Applying Support Vector Machine in classifying satellite images for the assessment of urban sprawl
NASA Astrophysics Data System (ADS)
Murgante, Beniamino; Nolè, Gabriele; Lasaponara, Rosa; Lanorte, Antonio; Calamita, Giuseppe
2013-04-01
In recent decades, the spread of new buildings and road infrastructure and a scattered proliferation of houses outside urban areas have produced an unregulated urbanization of the countryside, consuming soil and impoverishing the landscape. This phenomenon has generated a huge environmental impact, diseconomies and a decrease in quality of life. This study analyzes processes of land use change, paying particular attention to the urban sprawl phenomenon. The application is based on the integration of Geographic Information Systems and Remote Sensing, adopting open source technologies. The objective is to understand the size distribution and dynamic expansion of urban areas in order to define a methodology useful to both identify and monitor the phenomenon. In order to classify "urban" pixels, monitor the spread of settlements over time and understand trends in artificial territories, satellite images from different dates have been classified using supervised classification algorithms. More particularly, the Support Vector Machine (SVM) learning algorithm has been applied to multispectral remote sensing data. SVM has several interesting features, such as the capacity to obtain good results even with few training pixels, highly configurable parameters and the ability to discriminate pixels with similar spectral responses. Multi-temporal ASTER satellite data at medium resolution have been adopted because they are very suitable for evaluating such phenomena. The tools adopted for managing and processing data are GRASS GIS, Quantum GIS and the R statistical project. The area of interest is located south of Bari
NASA Astrophysics Data System (ADS)
Yu, Fei; Hui, Mei; Zhao, Yue-jin
2009-08-01
An image block matching algorithm based on motion vectors of correlative pixels in the oblique direction is presented for digital image stabilization. Digital image stabilization is a new generation of image stabilization technique that obtains the relative motion among frames of dynamic image sequences by means of digital image processing. In this method the matching parameters are calculated from the vectors projected in the oblique direction. These matching parameters contain the information of the vectors in both the transverse and vertical directions of the image blocks, so better matching information can be obtained after performing the correlation operation in the oblique direction. An iterative weighted least squares method is used to eliminate block matching errors; the weights are related to the pixels' rotational angle. The center of rotation and the global motion estimate of the shaking image are obtained by weighted least squares from the estimates of blocks chosen evenly from the image. The shaking image can then be stabilized using the center of rotation and the global motion estimate. The algorithm can also run in real time by using simulated annealing as the search method for block matching. An image processing system based on a DSP was used to test this algorithm. The core processor in the DSP system is the TMS320C6416 from TI, and a CCD camera with a resolution of 720×576 pixels was chosen as the input video source. Experimental results show that the algorithm can run on the real-time processing system with accurate matching precision.
Martins, F V C; Carrano, E G; Wanner, E F; Takahashi, R H C; Mateus, G R; Nakamura, F G
2014-01-01
Recent works raised the hypothesis that the assignment of a geometry to the decision variable space of a combinatorial problem could be useful both for providing meaningful descriptions of the fitness landscape and for supporting the systematic construction of evolutionary operators (the geometric operators) that make a consistent usage of the space geometric properties in the search for problem optima. This paper introduces some new geometric operators that constitute the realization of searches along the combinatorial space versions of the geometric entities descent directions and subspaces. The new geometric operators are stated in the specific context of the wireless sensor network dynamic coverage and connectivity problem (WSN-DCCP). A genetic algorithm (GA) is developed for the WSN-DCCP using the proposed operators, being compared with a formulation based on integer linear programming (ILP) which is solved with exact methods. That ILP formulation adopts a proxy objective function based on the minimization of energy consumption in the network, in order to approximate the objective of network lifetime maximization, and a greedy approach for dealing with the system's dynamics. To the authors' knowledge, the proposed GA is the first algorithm to outperform the lifetime of networks as synthesized by the ILP formulation, also running in much smaller computational times for large instances. PMID:24102647
Keasler, J A
2012-03-27
Vectorization is data parallelism (SIMD, SIMT, etc.): an extension of the ISA that enables the same instruction to be performed on multiple data items simultaneously. Many/most CPUs support vectorization in some form. Vectorization is difficult to enable, but can yield large efficiency gains. Extra programmer effort is required because: (1) not all algorithms can be vectorized (regular algorithm structure and fine-grain parallelism must be used); (2) most CPUs have data alignment restrictions for load/store operations (obey or risk incorrect code); (3) special directives are often needed to enable vectorization; and (4) vector instructions are architecture-specific. Vectorization is the best way to optimize for power and performance due to reduced clock cycles. When data is organized properly, a vector load instruction (e.g. movaps) can replace 'normal' load instructions (e.g. movsd). Vector operations can potentially have a smaller footprint in the instruction cache when fewer instructions need to be executed. Hybrid index sets insulate users from architecture-specific details. We have applied hybrid index sets to achieve optimal vectorization. We can extend this concept to handle other programming models.
NASA Astrophysics Data System (ADS)
Khan, Faisal; Enzmann, Frieder; Kersten, Michael
2016-03-01
Image processing of X-ray-computed polychromatic cone-beam micro-tomography (μXCT) data of geological samples mainly involves artefact reduction and phase segmentation. For the former, the main beam-hardening (BH) artefact is removed by applying a best-fit quadratic surface algorithm to a given image data set (reconstructed slice), which minimizes the BH offsets of the attenuation data points from that surface. A Matlab code for this approach is provided in the Appendix. The final BH-corrected image is extracted from the residual data or from the difference between the surface elevation values and the original grey-scale values. For the segmentation, we propose a novel least-squares support vector machine (LS-SVM, an algorithm for pixel-based multi-phase classification) approach. A receiver operating characteristic (ROC) analysis was performed on BH-corrected and uncorrected samples to show that BH correction is in fact an important prerequisite for accurate multi-phase classification. The combination of the two approaches was thus used to classify successfully three different more or less complex multi-phase rock core samples.
NASA Astrophysics Data System (ADS)
Colkesen, Ismail; Sahin, Emrehan Kutlug; Kavzoglu, Taskin
2016-06-01
Identification of landslide prone areas and production of accurate landslide susceptibility zonation maps have been crucial topics for hazard management studies. Since the prediction of susceptibility is one of the main processing steps in landslide susceptibility analysis, selection of a suitable prediction method plays an important role in the success of the susceptibility zonation process. Although simple statistical algorithms (e.g. logistic regression) have been widely used in the literature, the use of advanced non-parametric algorithms in landslide susceptibility zonation has recently become an active research topic. The main purpose of this study is to investigate the possible application of kernel-based Gaussian process regression (GPR) and support vector regression (SVR) for producing landslide susceptibility map of Tonya district of Trabzon, Turkey. Results of these two regression methods were compared with logistic regression (LR) method that is regarded as a benchmark method. Results showed that while kernel-based GPR and SVR methods generally produced similar results (90.46% and 90.37%, respectively), they outperformed the conventional LR method by about 18%. While confirming the superiority of the GPR method, statistical tests based on ROC statistics, success rate and prediction rate curves revealed the significant improvement in susceptibility map accuracy by applying kernel-based GPR and SVR methods.
A Novel Empirical Mode Decomposition With Support Vector Regression for Wind Speed Forecasting.
Ren, Ye; Suganthan, Ponnuthurai Nagaratnam; Srikanth, Narasimalu
2016-08-01
Wind energy is a clean and an abundant renewable energy source. Accurate wind speed forecasting is essential for power dispatch planning, unit commitment decision, maintenance scheduling, and regulation. However, wind is intermittent and wind speed is difficult to predict. This brief proposes a novel wind speed forecasting method by integrating empirical mode decomposition (EMD) and support vector regression (SVR) methods. The EMD is used to decompose the wind speed time series into several intrinsic mode functions (IMFs) and a residue. Subsequently, a vector combining one historical data from each IMF and the residue is generated to train the SVR. The proposed EMD-SVR model is evaluated with a wind speed data set. The proposed EMD-SVR model outperforms several recently reported methods with respect to accuracy or computational complexity.
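The way the training vectors are assembled can be sketched as follows. This is only an illustration under stated assumptions: the decomposition itself is not shown, the components are taken as already computed lists (the IMFs followed by the residue), "one historical data from each IMF" is interpreted here as a single lagged sample per component, and the target is rebuilt as the sum of the components, which is how an EMD decomposition reconstructs the original series. The function name `build_training_pairs` and the toy numbers are hypothetical.

```python
def build_training_pairs(components, lag=1):
    """Build (X, y) pairs from pre-computed EMD components.

    components: list of equal-length sequences (IMFs followed by the residue).
    Each input vector holds one lagged sample from every component; each
    target is the series value at time t, i.e. the sum of the components.
    """
    length = len(components[0])
    if any(len(c) != length for c in components):
        raise ValueError("all components must have the same length")
    inputs, targets = [], []
    for t in range(lag, length):
        inputs.append([c[t - lag] for c in components])
        targets.append(sum(c[t] for c in components))
    return inputs, targets


# Toy components standing in for one IMF and the residue (not real wind data).
toy_components = [[1, 2, 3], [10, 20, 30]]
X, y = build_training_pairs(toy_components)
```

The resulting `X` and `y` could then be passed to any SVR implementation; the choice of lag and of SVR library is left open here because the abstract does not specify them.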
Retinal blood vessel segmentation using line operators and support vector classification.
Ricci, Elisa; Perfetti, Renzo
2007-10-01
In the framework of computer-aided diagnosis of eye diseases, retinal vessel segmentation based on line operators is proposed. A line detector, previously used in mammography, is applied to the green channel of the retinal image. It is based on the evaluation of the average grey level along lines of fixed length passing through the target pixel at different orientations. Two segmentation methods are considered. The first uses the basic line detector whose response is thresholded to obtain unsupervised pixel classification. As a further development, we employ two orthogonal line detectors along with the grey level of the target pixel to construct a feature vector for supervised classification using a support vector machine. The effectiveness of both methods is demonstrated through receiver operating characteristic analysis on two publicly available databases of color fundus images.
A collaborative framework for Distributed Privacy-Preserving Support Vector Machine learning.
Que, Jialan; Jiang, Xiaoqian; Ohno-Machado, Lucila
2012-01-01
A Support Vector Machine (SVM) is a popular tool for decision support. The traditional way to build an SVM model is to estimate parameters based on a centralized repository of data. However, in the field of biomedicine, patient data are sometimes stored in local repositories or institutions where they were collected, and may not be easily shared due to privacy concerns. This creates a substantial barrier for researchers to effectively learn from the distributed data using machine learning tools like SVMs. To overcome this difficulty and promote efficient information exchange without sharing sensitive raw data, we developed a Distributed Privacy Preserving Support Vector Machine (DPP-SVM). The DPP-SVM enables privacy-preserving collaborative learning, in which a trusted server integrates "privacy-insensitive" intermediary results. The globally learned model is guaranteed to be exactly the same as learned from combined data. We also provide a free web-service (http://privacy.ucsd.edu:8080/ppsvm/) for multiple participants to collaborate and complete the SVM-learning task in an efficient and privacy-preserving manner. PMID:23304414
NASA Technical Reports Server (NTRS)
Chima, R. V.; Johnson, G. M.
1983-01-01
A multiple-grid algorithm for use in efficiently obtaining steady solutions to the Euler and Navier-Stokes equations is presented. The convergence of the explicit MacCormack algorithm on a fine grid is accelerated by propagating transients from the domain using a sequence of successively coarser grids. Both the fine and coarse grid schemes are readily vectorizable. The combination of multiple-gridding and vectorization results in substantially reduced computational times for the numerical solution of a wide range of flow problems. Results are presented for subsonic, transonic, and supersonic inviscid flows and for subsonic attached and separated laminar viscous flows. Work reduction factors over a scalar, single-grid algorithm range as high as 76.8. Previously announced in STAR as N83-24467
Webb-Robertson, Bobbie-Jo M.
2009-05-06
Accurate identification of peptides is a current challenge in mass spectrometry (MS) based proteomics. The standard approach uses a search routine to compare tandem mass spectra to a database of peptides associated with the target organism. These database search routines yield multiple metrics associated with the quality of the mapping of the experimental spectrum to the theoretical spectrum of a peptide. The structure of these results makes separating correct from false identifications difficult and has created a false identification problem. Statistical confidence scores are an approach to battling this false positive problem and have led to significant improvements in peptide identification. We have shown that machine learning, specifically the support vector machine (SVM), is an effective approach to separating true peptide identifications from false ones. The SVM-based peptide statistical scoring method transforms a peptide into a vector representation based on database search metrics to train and validate the SVM. In practice, following the database search routine, a peptide is denoted in its vector representation and the SVM generates a single statistical score that is then used to classify presence or absence in the sample
Eyebrows Identity Authentication Based on Wavelet Transform and Support Vector Machines
NASA Astrophysics Data System (ADS)
Jun-bin, CAO; Haitao, Yang; Lili, Ding
In order to study the novel biometric of the eyebrow, this paper presents an eyebrow identity authentication method based on the wavelet transform and support vector machines. The features of the eyebrow image are extracted by the wavelet transform and then classified by an SVM. Verification results on a self-built eyebrow database collected from 100 people demonstrate the effectiveness of the system. The system has a low FAR of 0.22% and an FRR of 28%. Therefore, eyebrow recognition may possibly be applied to personal identification.
A novel approach for short-term load forecasting using support vector machines.
Tian, Liang; Noore, Afzel
2004-10-01
A support vector machine (SVM) modeling approach for short-term load forecasting is proposed. The SVM learning scheme is applied to the power load data, forcing the network to learn the inherent internal temporal property of power load sequence. We also study the performance when other related input variables such as temperature and humidity are considered. The performance of our proposed SVM modeling approach has been tested and compared with feed-forward neural network and cosine radial basis function neural network approaches. Numerical results show that the SVM approach yields better generalization capability and lower prediction error compared to those neural network approaches.
Zhang, Erhu; Wang, Fan; Li, Yongchao; Bai, Xiaonan
2014-01-01
In this paper, we propose a novel method for the detection of microcalcifications using mathematical morphology and a support vector machine (SVM). First, the contrast in the original mammogram was improved by gamma correction and two carefully designed structural elements were used to enhance any microcalcifications. Next, the potential regions were extracted using our proposed dual-threshold technique. Finally, a SVM classifier was used to reduce the number of false positives. The performance of the proposed method was evaluated using the MIAS database. The experimental results demonstrated the efficiency and effectiveness of our method. PMID:24211882
Li, Min; Zhou, Tong; Song, Yanan
2016-07-01
A grain size characterization method based on energy attenuation coefficient spectrum and support vector regression (SVR) is proposed. First, the spectra of the first and second back-wall echoes are cut into several frequency bands to calculate the energy attenuation coefficient spectrum. Second, the frequency band that is sensitive to grain size variation is determined. Finally, a statistical model between the energy attenuation coefficient in the sensitive frequency band and average grain size is established through SVR. Experimental verification is conducted on austenitic stainless steel. The average relative error of the predicted grain size is 5.65%, which is better than that of conventional methods.
Support vector regression for porosity prediction in a heterogeneous reservoir: A comparative study
NASA Astrophysics Data System (ADS)
Al-Anazi, A. F.; Gates, I. D.
2010-12-01
In wells with limited log and core data, porosity, a fundamental and essential property to characterize reservoirs, is challenging to estimate by conventional statistical methods from offset well log and core data in heterogeneous formations. Beyond simple regression, neural networks have been used to develop more accurate porosity correlations. Unfortunately, neural network-based correlations have limited generalization ability and global correlations for a field are usually less accurate compared to local correlations for a sub-region of the reservoir. In this paper, support vector machines are explored as an intelligent technique to correlate porosity to well log data. Recently, support vector regression (SVR), based on statistical learning theory, has been proposed as a new intelligence technique for both prediction and classification tasks. The underlying formulation of support vector machines embodies the structural risk minimization (SRM) principle, which has been shown to be superior to the traditional empirical risk minimization (ERM) principle employed by conventional neural networks and classical statistical methods. This new formulation uses margin-based loss functions to control model complexity independently of the dimensionality of the input space, and kernel functions to project the estimation problem to a higher dimensional space, which enables the solution of more complex nonlinear problems for which optimization methods exist that yield a globally optimal solution. SRM minimizes an upper bound on the expected risk using a margin-based loss function (ɛ-insensitivity loss function for regression), in contrast to ERM, which minimizes the error on the training data. Unlike classical learning methods, SRM, indexed by margin-based loss function, can also control model complexity independent of dimensionality. The SRM inductive principle is designed for statistical estimation with finite data where the ERM inductive principle provides the optimal solution (the
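The ɛ-insensitivity loss named above can be shown in a few lines. This is a minimal sketch of the standard definition used in SVR (errors smaller than ɛ cost nothing, larger errors cost their excess over ɛ); the function name, the default ɛ, and the sample porosity values are hypothetical and not drawn from the paper.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Standard SVR loss: zero inside the epsilon tube, linear outside it."""
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical porosity fractions, for illustration only.
inside_tube = epsilon_insensitive_loss(0.20, 0.25, epsilon=0.1)   # error 0.05 < epsilon
outside_tube = epsilon_insensitive_loss(0.20, 0.50, epsilon=0.1)  # error 0.30 > epsilon
```

Because predictions inside the tube contribute no loss, only the samples on or outside it influence the fitted function, which is one way the formulation limits model complexity.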
Zhang, B; Liang, X L; Gao, H Y; Ye, L S; Wang, Y G
2016-05-13
We evaluated the application of three machine learning algorithms, including logistic regression, support vector machine and back-propagation neural network, for diagnosing congenital heart disease and colorectal cancer. By inspecting related serum tumor marker levels in colorectal cancer patients and healthy subjects, early diagnosis models for colorectal cancer were built using three machine learning algorithms to assess their corresponding diagnostic values. Except for serum alpha-fetoprotein, the levels of 11 other serum markers of patients in the colorectal cancer group were higher than those in the benign colorectal cancer group (P < 0.05). The results of logistic regression analysis indicated that individual detection of serum carcinoembryonic antigens, CA199, CA242, CA125, and CA153 and their combined detection were effective for diagnosing colorectal cancer. Combined detection had a better diagnostic effect, with a sensitivity of 94.2% and a specificity of 97.7%. Combining serum carcinoembryonic antigens, CA199, CA242, CA125, and CA153, a support vector machine diagnosis model and a back-propagation neural network diagnosis model were built, with diagnostic accuracies of 82 and 75%, sensitivities of 85 and 80%, and specificities of 80 and 70%, respectively. Colorectal cancer diagnosis models based on the three machine learning algorithms showed high diagnostic value and can help obtain evidence for the early diagnosis of colorectal cancer.
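For readers comparing the reported figures, the three measures can be computed from a confusion matrix as in the sketch below. It assumes the usual definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), accuracy = correct / total) and binary labels with 1 for cancer and 0 for control. The function name and the label vectors are hypothetical and do not reproduce the study's data.

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels (1 = positive)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Made-up labels: four patients followed by four controls.
metrics = diagnostic_metrics([1, 1, 1, 1, 0, 0, 0, 0],
                             [1, 1, 1, 0, 0, 0, 0, 1])
```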
de Klerk, Helen M; Gilbertson, Jason; Lück-Vogel, Melanie; Kemp, Jaco; Munch, Zahn
2016-11-01
Traditionally, to map environmental features using remote sensing, practitioners will use training data to develop models on various satellite data sets using a number of classification approaches and use test data to select a single 'best performer' from which the final map is made. We use a combination of an omission/commission plot to evaluate various results and compile a probability map based on consistently strong performing models across a range of standard accuracy measures. We suggest that this easy-to-use approach can be applied in any study using remote sensing to map natural features for management action. We demonstrate this approach using optical remote sensing products of different spatial and spectral resolution to map the endemic and threatened flora of quartz patches in the Knersvlakte, South Africa. Quartz patches can be mapped using either SPOT 5 (used due to its relatively fine spatial resolution) or Landsat8 imagery (used because it is freely accessible and has higher spectral resolution). Of the variety of classification algorithms available, we tested maximum likelihood and support vector machine, and applied these to raw spectral data, the first three PCA summaries of the data, and the standard normalised difference vegetation index. We found that there is no 'one size fits all' solution to the choice of a 'best fit' model (i.e. combination of classification algorithm or data sets), which is in agreement with the literature that classifier performance will vary with data properties. We feel this lends support to our suggestion that rather than the identification of a 'single best' model and a map based on this result alone, a probability map based on the range of consistently top performing models provides a rigorous solution to environmental mapping. PMID:27543751
Climate Change, Public Health, and Decision Support: The New Threat of Vector-borne Disease
NASA Astrophysics Data System (ADS)
Grant, F.; Kumar, S.
2011-12-01
Climate change and vector-borne diseases constitute a massive threat to human development. It will not be enough to cut emissions of greenhouse gases; the tide of the future has already been established. Climate change and vector-borne diseases are already undermining the world's efforts to reduce extreme poverty. It is in the best interests of world leaders to think in terms of concerted global actions, but adaptation and mitigation must be accomplished within the context of local community conditions, resources, and needs. Failure to act will continue to consign developed countries to completely avoidable health risks and significant expense. Failure to act will also reduce the poorest of the world's population (some 2.6 billion people) to a future of diminished opportunity. Northrop Grumman has taken significant steps forward to develop the tools needed to assess climate change impacts on public health, collect relevant data for decision making, model projections at regional and local levels, and deliver information and knowledge to local and regional stakeholders. Supporting these tools is an advanced enterprise architecture consisting of high performance computing, GIS visualization, and standards-based architecture. To address current deficiencies in local planning and decision making with respect to regional climate change and its effect on human health, our research is focused on performing a dynamical downscaling with the Weather Research and Forecasting (WRF) model to develop decision aids that translate the regional climate data into actionable information for users. For the present climate, WRF was forced with the Max Planck Institute European Center/Hamburg Model version 5 (ECHAM5) General Circulation Model 20th century simulation. For the 21st century climate, we used an ECHAM5 simulation with the Special Report on Emissions Scenarios (SRES) A1B emissions scenario. WRF was run in nested mode at spatial resolutions of 108 km, 36 km and 12 km and 28 vertical levels
A Multi-Label Learning Based Kernel Automatic Recommendation Method for Support Vector Machine
Zhang, Xueying; Song, Qinbao
2015-01-01
Choosing an appropriate kernel is critical when classifying a new problem with a Support Vector Machine. So far, more attention has been paid to constructing new kernels and choosing suitable parameter values for a specific kernel function, and less to kernel selection. Furthermore, most current kernel selection methods focus on seeking the best kernel with the highest classification accuracy via cross-validation; they are time consuming and ignore the differences among the number of support vectors and the CPU time of SVM with different kernels. Considering the tradeoff between classification success ratio and CPU time, there may be multiple kernel functions performing equally well on the same classification problem. Aiming to automatically select those appropriate kernel functions for a given data set, we propose a multi-label learning based kernel recommendation method built on the data characteristics. For each data set, the meta-knowledge data base is first created by extracting the feature vector of data characteristics and identifying the corresponding applicable kernel set. Then the kernel recommendation model is constructed on the generated meta-knowledge data base with the multi-label classification method. Finally, the appropriate kernel functions are recommended to a new data set by the recommendation model according to the characteristics of the new data set. Extensive experiments over 132 UCI benchmark data sets, with five different types of data set characteristics, eleven typical kernels (Linear, Polynomial, Radial Basis Function, Sigmoidal function, Laplace, Multiquadric, Rational Quadratic, Spherical, Spline, Wave and Circular), and five multi-label classification methods demonstrate that, compared with the existing kernel selection methods and the most widely used RBF kernel function, SVM with the kernel function recommended by our proposed method achieved the highest classification performance. PMID:25893896
Li, Hong-Dong; Liang, Yi-Zeng; Xu, Qing-Song; Cao, Dong-Sheng; Tan, Bin-Bin; Deng, Bai-Chuan; Lin, Chen-Chen
2011-01-01
Selecting a small number of informative genes for microarray-based tumor classification is central to cancer prediction and treatment. Based on model population analysis, here we present a new approach, called Margin Influence Analysis (MIA), designed to work with support vector machines (SVM) for selecting informative genes. The rationale for performing margin influence analysis lies in the fact that the margin of support vector machines is an important factor which underlies the generalization performance of SVM models. Briefly, MIA could reveal genes which have statistically significant influence on the margin by using Mann-Whitney U test. The reason for using the Mann-Whitney U test rather than two-sample t test is that Mann-Whitney U test is a nonparametric test method without any distribution-related assumptions and is also a robust method. Using two publicly available cancerous microarray data sets, it is demonstrated that MIA could typically select a small number of margin-influencing genes and further achieves comparable classification accuracy compared to those reported in the literature. The distinguished features and outstanding performance may make MIA a good alternative for gene selection of high dimensional microarray data. (The source code in MATLAB with GNU General Public License Version 2.0 is freely available at http://code.google.com/p/mia2009/). PMID:21339535
PREDICTION OF SOLAR FLARE SIZE AND TIME-TO-FLARE USING SUPPORT VECTOR MACHINE REGRESSION
Boucheron, Laura E.; Al-Ghraibah, Amani; McAteer, R. T. James
2015-10-10
We study the prediction of solar flare size and time-to-flare using 38 features describing magnetic complexity of the photospheric magnetic field. This work uses support vector regression to formulate a mapping from the 38-dimensional feature space to a continuous-valued label vector representing flare size or time-to-flare. When we consider flaring regions only, we find an average error in estimating flare size of approximately half a geostationary operational environmental satellite (GOES) class. When we additionally consider non-flaring regions, we find an increased average error of approximately three-fourths a GOES class. We also consider thresholding the regressed flare size for the experiment containing both flaring and non-flaring regions and find a true positive rate of 0.69 and a true negative rate of 0.86 for flare prediction. The results for both of these size regression experiments are consistent across a wide range of predictive time windows, indicating that the magnetic complexity features may be persistent in appearance long before flare activity. This is supported by our larger error rates of some 40 hr in the time-to-flare regression problem. The 38 magnetic complexity features considered here appear to have discriminative potential for flare size, but their persistence in time makes them less discriminative for the time-to-flare problem.
Support vector machine based classification of fast Fourier transform spectroscopy of proteins
NASA Astrophysics Data System (ADS)
Lazarevic, Aleksandar; Pokrajac, Dragoljub; Marcano, Aristides; Melikechi, Noureddine
2009-02-01
Fast Fourier transform spectroscopy has proved to be a powerful method for study of the secondary structure of proteins since peak positions and their relative amplitude are affected by the number of hydrogen bridges that sustain this secondary structure. However, to our best knowledge, the method has not been used yet for identification of proteins within a complex matrix like a blood sample. The principal reason is the apparent similarity of protein infrared spectra with actual differences usually masked by the solvent contribution and other interactions. In this paper, we propose a novel machine learning based method that uses protein spectra for classification and identification of such proteins within a given sample. The proposed method uses principal component analysis (PCA) to identify most important linear combinations of original spectral components and then employs support vector machine (SVM) classification model applied on such identified combinations to categorize proteins into one of given groups. Our experiments have been performed on the set of four different proteins, namely: Bovine Serum Albumin, Leptin, Insulin-like Growth Factor 2 and Osteopontin. Our proposed method of applying principal component analysis along with support vector machines exhibits excellent classification accuracy when identifying proteins using their infrared spectra.
Prediction of water quality index in constructed wetlands using support vector machine.
Mohammadpour, Reza; Shaharuddin, Syafiq; Chang, Chun Kiat; Zakaria, Nor Azazi; Ab Ghani, Aminuddin; Chan, Ngai Weng
2015-04-01
Poor water quality is a serious problem in the world which threatens human health, ecosystems, and plant/animal life. Prediction of surface water quality is a main concern in water resource and environmental systems. In this research, the support vector machine and two methods of artificial neural networks (ANNs), namely feed forward back propagation (FFBP) and radial basis function (RBF), were used to predict the water quality index (WQI) in a free constructed wetland. Seventeen points of the wetland were monitored twice a month over a period of 14 months, and an extensive dataset was collected for 11 water quality variables. A detailed comparison of the overall performance showed that prediction of the support vector machine (SVM) model with coefficient of correlation (R²) = 0.9984 and mean absolute error (MAE) = 0.0052 was either better or comparable with neural networks. This research highlights that the SVM and FFBP can be successfully employed for the prediction of water quality in a free surface constructed wetland environment. These methods simplify the calculation of the WQI and reduce substantial efforts and time by optimizing the computations.
NASA Astrophysics Data System (ADS)
Ribes, S.; Voicu, I.; Girault, J. M.; Fournier, M.; Perrotin, F.; Tranquart, F.; Kouamé, D.
2011-03-01
Electronic fetal monitoring may be required during the whole pregnancy to closely monitor specific fetal and maternal disorders. Currently used methods suffer from many limitations and are not sufficient to evaluate fetal asphyxia. Fetal activity parameters such as movements, heart rate and associated parameters are essential indicators of fetal well-being, and no current device gives a simultaneous and sufficient estimation of all these parameters to evaluate it. For this purpose, we built a multi-transducer, multi-gate Doppler system and developed dedicated signal processing techniques for fetal activity parameter extraction in order to investigate fetal asphyxia or well-being through fetal activity parameters. Toward this goal, this paper shows the preliminary feasibility of separating normal and compromised fetuses using our system. To do so, a data set consisting of two groups of fetal signals (normal and compromised) was established and provided by physicians. From the estimated parameters, an instantaneous Manning-like score, referred to as the ultrasonic score, was introduced and used together with movements, heart rate and associated parameters in a classification process using the Support Vector Machines (SVM) method. The influence of the fetal activity parameters and the performance of the SVM were evaluated by computing sensitivity, specificity, percentage of support vectors and total classification accuracy. We showed our ability to separate the data into two sets, normal fetuses and compromised fetuses, and obtained an excellent match with the clinical classification performed by physicians.
Support vector machines for seizure detection in an animal model of chronic epilepsy
NASA Astrophysics Data System (ADS)
Nandan, Manu; Talathi, Sachin S.; Myers, Stephen; Ditto, William L.; Khargonekar, Pramod P.; Carney, Paul R.
2010-06-01
We compare the performance of three support vector machine (SVM) types: weighted SVM, one-class SVM and support vector data description (SVDD) for the application of seizure detection in an animal model of chronic epilepsy. Large EEG datasets (273 h and 91 h respectively, with a sampling rate of 1 kHz) from two groups of rats with chronic epilepsy were used in this study. For each of these EEG datasets, we extracted three energy-based seizure detection features: mean energy, mean curve length and wavelet energy. Using these features we performed twofold cross-validation to obtain the performance statistics: sensitivity (S), specificity (K) and detection latency (τ) as a function of control parameters for the given SVM. Optimal control parameters for each SVM type that produced the best seizure detection statistics were then identified using two independent strategies. Performance of each SVM type is ranked based on the overall seizure detection performance through an optimality index metric (O). We found that SVDD not only performed better than the other SVM types in terms of highest value of the mean optimality index metric (Ō) but also gave a more reliable performance across the two EEG datasets.
Miltiadis Alamaniotis; Vivek Agarwal
2014-10-01
This paper places itself in the realm of anticipatory systems and envisions monitoring and control methods capable of making predictions over system critical parameters. Anticipatory systems allow intelligent control of complex systems by predicting their future state. In the current work, an intelligent model aimed at implementing anticipatory monitoring and control in the energy industry is presented and tested. More particularly, a set of support vector regressors (SVRs) is trained using both historical and observed data. The trained SVRs are used to predict the future value of the system based on the current operational system parameters. The predicted values are then input to a fuzzy logic based module where the values are fused to obtain a single value, i.e., the final system output prediction. The methodology is tested on real turbine degradation datasets. The outcome of the approach presented in this paper highlights its superiority over single support vector regressors. In addition, it is shown that appropriate selection of fuzzy sets and fuzzy rules plays an important role in improving system performance.
Yao, Jianhua; Dwyer, Andrew; Summers, Ronald M.; Mollura, Daniel J.
2010-01-01
Objective: The purpose of this study was to develop and test a computer-assisted detection method for identification and measurement of pulmonary abnormalities on chest CT in cases of infection, such as novel H1N1 influenza. The method developed could be a potentially useful tool for classifying and quantifying pulmonary infectious disease on CT. Subjects and Methods: Forty chest CTs were studied using texture analysis and support vector machine (SVM) classification to differentiate normal from abnormal lung regions on CT, including ten cases of immunohistochemistry proven infection, ten normal controls, and twenty cases of fibrosis. Results: Statistically significant differences in the receiver operator characteristics (ROC) curves for detecting abnormal regions in H1N1 infection were obtained between normal lung and regions of fibrosis, with significant differences in texture features of different infections. These differences enable quantification of abnormal lung volumes in CT imaging. Conclusion: Texture analysis and support vector machine classification can distinguish between areas of abnormality in acute infection and areas of chronic fibrosis, differentiate lesions having consolidative and ground glass appearances, and quantify those texture features to increase the precision of CT scoring as a potential tool for measuring disease progression and severity. PMID:21295734
Morshed, Nader; Echols, Nathaniel; Adams, Paul D.
2015-05-01
A method to automatically identify possible elemental ions in X-ray crystal structures has been extended to use support vector machine (SVM) classifiers trained on selected structures in the PDB, with significantly improved sensitivity over manually encoded heuristics. In the process of macromolecular model building, crystallographers must examine electron density for isolated atoms and differentiate sites containing structured solvent molecules from those containing elemental ions. This task requires specific knowledge of metal-binding chemistry and scattering properties and is prone to error. A method has previously been described to identify ions based on manually chosen criteria for a number of elements. Here, the use of support vector machines (SVMs) to automatically classify isolated atoms as either solvent or one of various ions is described. Two data sets of protein crystal structures, one containing manually curated structures deposited with anomalous diffraction data and another with automatically filtered, high-resolution structures, were constructed. On the manually curated data set, an SVM classifier was able to distinguish calcium from manganese, zinc, iron and nickel, as well as all five of these ions from water molecules, with a high degree of accuracy. Additionally, SVMs trained on the automatically curated set of high-resolution structures were able to successfully classify most common elemental ions in an independent validation test set. This method is readily extensible to other elemental ions and can also be used in conjunction with previous methods based on a priori expectations of the chemical environment and X-ray scattering.
Determination of Meta-Parameters for Support Vector Machine Linear Combinations.
Jasial, Swarit; Balfer, Jenny; Vogt, Martin; Bajorath, Jürgen
2015-02-01
Support vector machines (SVMs) are among the most popular machine learning methods for compound classification and other chemoinformatics tasks such as, for example, the prediction of ligand-target pairs or compound activity profiles. Depending on the specific applications, different SVM strategies can be used. For example, in the context of potency-directed virtual screening, linear combinations of multiple SVM models have been shown to enrich database selection sets with potent compounds compared to individual models. An open question concerning the use of SVM linear combinations (SVM-LCs) is how to best weight the models on a relative scale. Typically, linear weights are subjectively set. Herein, preferred weighting factors for SVM-LC were systematically determined. Therefore, weights were treated as meta-parameters and optimized by machine learning to enrich data set rankings with highly active compounds. The meta-parameter approach has been applied to 10 screening data sets and found to further improve SVM performance over other SVM-LCs and support vector regression (SVR) models. The results show that optimal weights depend on data set characteristics and chosen molecular representations. In addition, individual models often do not contribute to the performance of SVM-LCs. Taken together, these findings emphasize the need for systematic meta-parameter estimation. PMID:27490035
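The weighting idea in the abstract above can be illustrated with a few lines of code: each SVM model assigns a score to every compound, the scores are combined with linear weights, and compounds are ranked by the combined score. This is only a sketch under stated assumptions; the compound names, scores, and weights below are hypothetical, and the study's actual optimization of the weights is not reproduced here.

```python
def combine_scores(score_lists, weights):
    """Weighted sum of per-model scores, one combined value per compound."""
    n = len(score_lists[0])
    return [sum(w * scores[i] for w, scores in zip(weights, score_lists))
            for i in range(n)]


def rank_compounds(names, combined):
    """Return compound names sorted by combined score, highest first."""
    order = sorted(zip(combined, names), key=lambda pair: -pair[0])
    return [name for _, name in order]


# Hypothetical decision values from two SVM models for three compounds.
names = ["cpd_a", "cpd_b", "cpd_c"]
model_high_potency = [0.9, 0.2, 0.4]
model_mid_potency = [0.1, 0.8, 0.5]
weights = [0.7, 0.3]  # assumed weights; the abstract treats these as meta-parameters

combined = combine_scores([model_high_potency, model_mid_potency], weights)
ranking = rank_compounds(names, combined)
```

Treating `weights` as meta-parameters would mean searching over candidate weight vectors and keeping the one whose ranking places the most highly active training compounds near the top.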
Alvarsson, Jonathan; Eklund, Martin; Andersson, Claes; Carlsson, Lars; Spjuth, Ola; Wikberg, Jarl E S
2014-11-24
QSAR modeling using molecular signatures and support vector machines with a radial basis function is increasingly used for virtual screening in the drug discovery field. This method has three free parameters: C, γ, and signature height. C is a penalty parameter that limits overfitting, γ controls the width of the radial basis function kernel, and the signature height determines how much of the molecule is described by each atom signature. Determination of optimal values for these parameters is time-consuming. Good default values could therefore save considerable computational cost. The goal of this project was to investigate whether such default values could be found by using seven public QSAR data sets spanning a wide range of end points and using both a bit version and a count version of the molecular signatures. On the basis of the experiments performed, we recommend a parameter set of heights 0 to 2 for the count version of the signature fingerprints and heights 0 to 3 for the bit version. These are in combination with a support vector machine using C in the range of 1 to 100 and γ in the range of 0.001 to 0.1. When data sets are small or longer run times are not a problem, then there is reason to consider the addition of height 3 to the count fingerprint and a wider grid search. However, marked improvements should not be expected. PMID:25318024
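The recommended ranges in the abstract above translate into a small search grid. The sketch below enumerates one such grid; the logarithmic spacing (powers of ten) is an assumption made here for illustration, since the abstract gives only the range end points, and the helper name `log_grid` is hypothetical.

```python
from itertools import product


def log_grid(low_exp, high_exp):
    """Powers of ten from 10**low_exp to 10**high_exp inclusive."""
    return [10.0 ** e for e in range(low_exp, high_exp + 1)]


C_values = log_grid(0, 2)        # 1, 10, 100 (range from the abstract)
gamma_values = log_grid(-3, -1)  # 0.001, 0.01, 0.1 (range from the abstract)

# Signature heights recommended per fingerprint type.
heights = {
    "count": list(range(0, 3)),  # heights 0 to 2
    "bit": list(range(0, 4)),    # heights 0 to 3
}

# Every (C, gamma) pair that would be cross-validated for a given fingerprint.
grid = [{"C": c, "gamma": g} for c, g in product(C_values, gamma_values)]
```

Each entry of `grid` would then be handed to an RBF-kernel SVM and scored by cross-validation; a finer spacing only multiplies the number of entries.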
Han, Longfei; Luo, Senlin; Yu, Jianmin; Pan, Limin; Chen, Songjing
2015-03-01
Diabetes mellitus is a chronic disease and a worldwide public health challenge. It has been shown that 50-80% of T2DM cases are undiagnosed. In this paper, support vector machines are utilized to screen for diabetes, and an ensemble learning module is added, which turns the "black box" of SVM decisions into comprehensible and transparent rules and is also useful for solving the class imbalance problem. Results on China Health and Nutrition Survey data show that the proposed ensemble learning method generates rule sets with weighted average precision of 94.2% and weighted average recall of 93.9% for all classes. Furthermore, the hybrid system can provide a tool for diagnosis of diabetes, and it supports a second opinion for lay users.
NASA Astrophysics Data System (ADS)
Xiao, Guoqiang; Jiang, Yang; Song, Gang; Jiang, Jianmin
2010-12-01
We propose a support-vector-machine (SVM) tree to hierarchically learn from domain knowledge represented by low-level features toward automatic classification of sports videos. The proposed SVM tree adopts a binary tree structure to exploit the nature of SVM's binary classification, where each internal node is a single SVM learning unit, and each external node represents the classified output type. Such a SVM tree presents a number of advantages, which include: 1. low computing cost; 2. integrated learning and classification while preserving individual SVM's learning strength; and 3. flexibility in both structure and learning modules, where different numbers of nodes and features can be added to address specific learning requirements, and various learning models can be added as individual nodes, such as neural networks, AdaBoost, hidden Markov models, dynamic Bayesian networks, etc. Experiments support that the proposed SVM tree achieves good performances in sports video classifications.
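The tree structure described in the abstract above, with a binary decision unit at each internal node and an output class at each external node, can be sketched as follows. The node classes, feature names, class labels, and threshold functions are hypothetical stand-ins; in the described system each `decide` callable would be a trained binary SVM.

```python
class Leaf:
    """External node: holds the classified output type."""
    def __init__(self, label):
        self.label = label


class Node:
    """Internal node: one binary decision unit routing to left or right."""
    def __init__(self, decide, left, right):
        self.decide = decide  # callable: features -> bool
        self.left = left      # followed when decide(x) is False
        self.right = right    # followed when decide(x) is True


def classify(tree, x):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, Node):
        tree = tree.right if tree.decide(x) else tree.left
    return tree.label


# Hypothetical low-level features and thresholds standing in for trained SVMs.
tree = Node(
    lambda x: x["field_green"] > 0.5,
    left=Leaf("basketball"),
    right=Node(lambda x: x["motion"] > 0.6,
               left=Leaf("golf"),
               right=Leaf("football")),
)

label = classify(tree, {"field_green": 0.8, "motion": 0.9})
```

Because `decide` is any callable, a node could equally wrap a neural network or another learner, which matches the flexibility the abstract claims for the structure.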
New fuzzy support vector machine for the class imbalance problem in medical datasets classification.
Gu, Xiaoqing; Ni, Tongguang; Wang, Hongyuan
2014-01-01
In medical dataset classification, the support vector machine (SVM) is considered to be one of the most successful methods. However, most real-world medical datasets contain some outliers/noise, and the data often have class imbalance problems. In this paper, a fuzzy support vector machine (FSVM) for the class imbalance problem (called FSVM-CIP) is presented, which can be seen as a modified class of FSVM obtained by extending manifold regularization and assigning two misclassification costs for the two classes. The proposed FSVM-CIP can be used to handle the class imbalance problem in the presence of outliers/noise, and to enhance the locality maximum margin. Five real-world medical datasets, breast, heart, hepatitis, BUPA liver, and pima diabetes, from the UCI medical database are employed to illustrate the method presented in this paper. Experimental results on these datasets show that FSVM-CIP achieves better or comparable effectiveness.
Carbon-Mangels, Miriam; Hutter, Michael C
2011-10-01
Classification algorithms suffer from the curse of dimensionality, which leads to overfitting, particularly if the problem is over-determined. Therefore it is of particular interest to identify the most relevant descriptors to reduce the complexity. We applied Bayesian estimates to model the probability distribution of descriptor values used for binary classification using n-fold cross-validation. As a measure for the discriminative power of the classifiers, the symmetric form of the Kullback-Leibler divergence of their probability distributions was computed. We found that the most relevant descriptors possess a Gaussian-like distribution of their values, show the largest divergences, and therefore appear most often in the cross-validation scenario. The results were compared to those of the LASSO feature selection method applied to multiple decision trees and support vector machine approaches for data sets of substrates and nonsubstrates of three Cytochrome P450 isoenzymes, which comprise strongly unbalanced compound distributions. In contrast to decision trees and support vector machines, the performance of Bayesian estimates is less affected by unbalanced data sets. This strategy reveals those descriptors that allow a simple linear separation of the classes, whereas the superior accuracy of decision trees and support vector machines can be attributed to nonlinear separation, which is in turn more prone to overfitting.
NASA Astrophysics Data System (ADS)
Shi, Fei; Liu, Yu-Yan; Sun, Guang-Lan; Li, Pei-Yu; Lei, Yu-Ming; Wang, Jian
2015-10-01
The emission lines of galaxies originate from massive young stars or supermassive black holes. As a result, spectral classification of emission-line galaxies into star-forming galaxies, active galactic nucleus (AGN) hosts, or composites of both relates closely to the formation and evolution of galaxies. To find an efficient and automatic spectral classification method, especially for large surveys and huge data bases, a support vector machine (SVM) supervised learning algorithm is applied to a sample of emission-line galaxies from the Sloan Digital Sky Survey (SDSS) data release 9 (DR9) provided by the Max Planck Institute and the Johns Hopkins University (MPA/JHU). A two-step approach is adopted. (i) The SVM must be trained with a subset of objects that are known to be AGN hosts, composites or star-forming galaxies, treating the strong emission-line flux measurements as input feature vectors in an n-dimensional space, where n is the number of strong emission-line flux ratios. (ii) After training on a sample of emission-line galaxies, the remaining galaxies are automatically classified. In the classification process, we use a 10-fold cross-validation technique. We show that the classification diagrams based on [N II]/Hα versus another emission-line ratio, such as [O III]/Hβ, [Ne III]/[O II], ([O III]λ4959+[O III]λ5007)/[O III]λ4363, [O II]/Hβ, [Ar III]/[O III], [S II]/Hα, and [O I]/Hα, plus colour, allow us to separate unambiguously AGN hosts, composites or star-forming galaxies. Among them, the diagram of [N II]/Hα versus [O III]/Hβ achieved an accuracy of 99 per cent in separating the three classes of objects. The other diagrams above give an accuracy of ~91 per cent.
NASA Astrophysics Data System (ADS)
Qian, Feng; Sun, Fan; Zhong, Weimin; Luo, Na
2013-09-01
An approach that combines genetic algorithm (GA) and control vector parameterization (CVP) is proposed to solve the dynamic optimization problems of chemical processes using numerical methods. In the new CVP method, control variables are approximated with polynomials based on state variables and time in the entire time interval. The iterative method, which reduces redundant expense and improves computing efficiency, is used with GA to reduce the width of the search region. Constrained dynamic optimization problems are even more difficult. A new method that embeds the information of infeasible chromosomes into the evaluation function is introduced in this study to solve dynamic optimization problems with or without constraint. The results demonstrated the feasibility and robustness of the proposed methods. The proposed algorithm can be regarded as a useful optimization tool, especially when gradient information is not available.
Bao, Yidan; Kong, Wenwen; Liu, Fei; Qiu, Zhengjun; He, Yong
2012-01-01
Amino acids are quite important indices to indicate the growth status of oilseed rape under herbicide stress. Near infrared (NIR) spectroscopy combined with chemometrics was applied for fast determination of glutamic acid in oilseed rape leaves. The optimal spectral preprocessing method was obtained after comparing Savitzky-Golay smoothing, standard normal variate, multiplicative scatter correction, first and second derivatives, detrending and direct orthogonal signal correction. Linear and nonlinear calibration methods were developed, including partial least squares (PLS) and least squares-support vector machine (LS-SVM). The most effective wavelengths (EWs) were determined by the successive projections algorithm (SPA), and these wavelengths were used as the inputs of PLS and LS-SVM model. The best prediction results were achieved by SPA-LS-SVM (Raw) model with correlation coefficient r = 0.9943 and root mean squares error of prediction (RMSEP) = 0.0569 for prediction set. These results indicated that NIR spectroscopy combined with SPA-LS-SVM was feasible for the fast and effective detection of glutamic acid in oilseed rape leaves. The selected EWs could be used to develop spectral sensors, and the important and basic amino acid data were helpful to study the function mechanism of herbicide.
Frontiers for the Early Diagnosis of AD by Means of MRI Brain Imaging and Support Vector Machines.
Salvatore, Christian; Battista, Petronilla; Castiglioni, Isabella
2016-01-01
The emergence of Alzheimer's Disease (AD) as a consequence of an increasingly aging population makes methods for early and accurate diagnosis urgently needed. Magnetic Resonance Imaging (MRI) could be used as an in vivo, non-invasive tool to identify sensitive and specific markers of very early AD progression. In recent years, multivariate pattern analysis (MVPA) and machine-learning algorithms have attracted strong interest within the neuroimaging community, as they allow automatic classification of imaging data with higher performance than univariate statistical analysis. An exhaustive search of PubMed, Web of Science and Medline records was performed in this work, in order to retrieve studies focused on the potential role of MRI in aiding the clinician in early diagnosis of AD by using Support Vector Machines (SVMs) as the MVPA automated classification method. A total of 30 studies emerged, published from 2008 to date. This review aims to give a state-of-the-art overview of SVM for the early and differential diagnosis of AD-related pathologies by means of MRI data, starting from preliminary steps such as image pre-processing, feature extraction and feature selection, and ending with classification, validation strategies and extraction of MRI-related biomarkers. The main advantages and drawbacks of the different techniques were explored. Results obtained by the reviewed studies were reported in terms of classification performance and biomarker outcomes, in order to shed light on the parameters that accompany normal and pathological aging. Unresolved issues and possible future directions were finally pointed out. PMID:26567735
Hayat, Maqsood; Tahir, Muhammad
2015-08-01
Membrane protein is a central component of the cell that manages intra and extracellular processes. Membrane proteins execute a diversity of functions that are vital for the survival of organisms. The topology of transmembrane proteins describes the number of transmembrane (TM) helix segments and its orientation. However, owing to the lack of its recognized structures, the identification of TM helix and its topology through experimental methods is laborious with low throughput. In order to identify TM helix segments reliably, accurately, and effectively from topogenic sequences, we propose the PSOFuzzySVM-TMH model. In this model, evolutionary based information position specific scoring matrix and discrete based information 6-letter exchange group are used to formulate transmembrane protein sequences. The noisy and extraneous attributes are eradicated using an optimization selection technique, particle swarm optimization, from both feature spaces. Finally, the selected feature spaces are combined in order to form ensemble feature space. Fuzzy-support vector Machine is utilized as a classification algorithm. Two benchmark datasets, including low and high resolution datasets, are used. At various levels, the performance of the PSOFuzzySVM-TMH model is assessed through 10-fold cross validation test. The empirical results reveal that the proposed framework PSOFuzzySVM-TMH outperforms in terms of classification performance in the examined datasets. It is ascertained that the proposed model might be a useful and high throughput tool for academia and research community for further structure and functional studies on transmembrane proteins. PMID:26054033
Wang, Meng; Yang, Jie; Liu, Guo-Ping; Xu, Zhi-Jie; Chou, Kuo-Chen
2004-06-01
Membrane proteins are generally classified into the following five types: (1) type I membrane proteins, (2) type II membrane proteins, (3) multipass transmembrane proteins, (4) lipid chain-anchored membrane proteins and (5) GPI-anchored membrane proteins. Prediction of membrane protein types has become one of the growing hot topics in bioinformatics. Currently, we are facing two critical challenges in this area: first, how to take into account the extremely complicated sequence-order effects, and second, how to deal with the highly uneven sizes of the subsets in a training dataset. In this paper, stimulated by the concept of using the pseudo-amino acid composition to incorporate the sequence-order effects, the spectral analysis technique is introduced to represent the statistical sample of a protein. Based on such a framework, the weighted support vector machine (SVM) algorithm is applied. The new approach has remarkable power in dealing with the bias caused by the situation when one subset in the training dataset contains many more samples than the other. The new method is particularly useful when our focus is aimed at proteins belonging to small subsets. The results obtained by the self-consistency test, jackknife test and independent dataset test are encouraging, indicating that the current approach may serve as a powerful complementary tool to other existing methods for predicting the types of membrane proteins.
Tian, Peirong; Zhang, Weitao; Zhao, Hongmei; Lei, Yutao; Cui, Long; Wang, Wei; Li, Qingbo; Zhu, Qing; Zhang, Yuanfu; Xu, Zhi
2015-01-01
Background: Fourier transform infrared (FTIR) spectroscopy has shown its unique advantages in distinguishing cancerous tissue from normal one. The aim of this study was to establish a quick and accurate diagnostic method of FTIR spectroscopy to differentiate malignancies from benign breast tissues intraoperatively. Materials and methods: In this study, a total of 100 breast tissue samples obtained from 100 patients were taken on surgery. All tissue samples were scanned for spectra intraoperatively before being processed for histopathological diagnosis. Standard normal variate (SNV) method was adopted to reduce scatter effects. Support vector machine (SVM) classification was used to discriminate spectra between malignant and benign breast tissues. Leave-one-out cross validation (LOOCV) was used to evaluate the discrimination. Results: According to histopathological examination, 50 cases were diagnosed as fibroadenoma and 50 cases as invasive ductal carcinoma. The results of SVM algorithm showed that the sensitivity, specificity and accuracy rate of this method are 90.0%, 98.0% and 94.0%, respectively. Conclusions: FTIR spectroscopy technique in combination with SVM classification could be an accurate, rapid and objective tool to differentiate malignant from benign tumors during operation. Our studies establish the feasibility of FTIR spectroscopy with chemometrics method to guide surgeons during the surgery as an effective supplement for pathological diagnosis on frozen section. PMID:25785083
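The reported rates can be checked against the stated case counts. The sketch below assumes invasive ductal carcinoma is the positive class and infers the confusion-matrix counts (45 true positives, 49 true negatives) from the reported percentages; the abstract gives only the rates and the 50/50 split, so the counts are an inference rather than reported data.

```python
def diagnostic_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # fraction of malignant cases detected
    specificity = tn / (tn + fp)            # fraction of benign cases cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Assumed counts: 50 carcinomas (45 detected), 50 fibroadenomas (49 cleared).
sens, spec, acc = diagnostic_rates(tp=45, fn=5, tn=49, fp=1)
print(f"{sens:.1%} {spec:.1%} {acc:.1%}")  # prints "90.0% 98.0% 94.0%"
```

With equal class sizes, accuracy is simply the mean of sensitivity and specificity, which is consistent with the 94.0% figure.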
NASA Astrophysics Data System (ADS)
Huang, Cong; Liu, Dan-Dan; Wang, Jing-Song
2009-06-01
The 10.7 cm solar radio flux (F10.7), the value of the solar radio emission flux density at a wavelength of 10.7 cm, is a useful index of solar activity as a proxy for solar extreme ultraviolet radiation. It is meaningful and important to predict F10.7 values accurately for both long-term (months-years) and short-term (days) forecasting, which are often used as inputs in space weather models. This study applies a novel neural network technique, support vector regression (SVR), to forecasting daily values of F10.7. The aim of this study is to examine the feasibility of SVR in short-term F10.7 forecasting. The approach, based on SVR, reduces the dimension of feature space in the training process by using a kernel-based learning algorithm. Thus, the complexity of the calculation becomes lower and a small amount of training data will be sufficient. The time series of F10.7 from 2002 to 2006 are employed as the data sets. The performance of the approach is estimated by calculating the norm mean square error and mean absolute percentage error. It is shown that our approach can perform well by using fewer training data points than the traditional neural network.
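The two error measures named in this abstract can be sketched as follows. The exact definition of the "norm mean square error" is not given in the abstract; the version below, which normalizes the squared error by the variance of the observations, is one common convention and should be read as an assumption.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in per cent."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

def nmse(actual, predicted):
    """Squared error normalized by the spread of the observations (assumed form)."""
    mean = sum(actual) / len(actual)
    num = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    den = sum((a - mean) ** 2 for a in actual)
    return num / den
```

Both functions take plain sequences of observed and forecast values, for example daily F10.7 readings and the corresponding predictions.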
NASA Astrophysics Data System (ADS)
Yashvantrai Vyas, Bhargav; Maheshwari, Rudra Prakash; Das, Biswarup
2016-06-01
Application of series compensation in extra high voltage (EHV) transmission lines makes the protection job difficult for engineers, due to alterations in system parameters and measurements. The problem is amplified with the inclusion of electronically controlled compensation such as thyristor controlled series compensation (TCSC), as it produces harmonics and rapid changes in system parameters during faults associated with TCSC control. This paper presents a pattern recognition based fault type identification approach using a support vector machine. The scheme uses only half-cycle post-fault data of the three phase currents to accomplish the task. The change in current signal features during a fault is considered as the discriminatory measure. The developed scheme is tested over a large set of fault data with variation in system and fault parameters. These fault cases have been generated with PSCAD/EMTDC on a 400 kV, 300 km transmission line model. The developed algorithm has proved better suited for implementation on TCSC compensated lines, with improved accuracy and speed.
You, Zhu-Hong; Li, Jianqiang; Gao, Xin; He, Zhou; Zhu, Lin; Lei, Ying-Ke; Ji, Zhiwei
2015-01-01
Proteins and their interactions lie at the heart of most underlying biological processes. Consequently, correct detection of protein-protein interactions (PPIs) is of fundamental importance to understand the molecular mechanisms in biological systems. Although the convenience brought by high-throughput experiment in technological advances makes it possible to detect a large amount of PPIs, the data generated through these methods is unreliable and may not be completely inclusive of all possible PPIs. Targeting at this problem, this study develops a novel computational approach to effectively detect the protein interactions. This approach is proposed based on a novel matrix-based representation of protein sequence combined with the algorithm of support vector machine (SVM), which fully considers the sequence order and dipeptide information of the protein primary sequence. When performed on yeast PPIs datasets, the proposed method can reach 90.06% prediction accuracy with 94.37% specificity at the sensitivity of 85.74%, indicating that this predictor is a useful tool to predict PPIs. Achieved results also demonstrate that our approach can be a helpful supplement for the interactions that have been detected experimentally. PMID:26000305
NASA Astrophysics Data System (ADS)
Ismail, S.; Shabri, A.; Samsudin, R.
2012-11-01
Successful river flow forecasting is a major goal and an essential procedure that is necessary in water resource planning and management. There are many forecasting techniques used for river flow forecasting. This study proposed a hybrid model based on a combination of two methods: Self Organizing Map (SOM) and Least Squares Support Vector Machine (LSSVM) model, referred to as the SOM-LSSVM model for river flow forecasting. The hybrid model uses the SOM algorithm to cluster the entire dataset into several disjointed clusters, where the monthly river flows data with similar input pattern are grouped together from a high dimensional input space onto a low dimensional output layer. By doing this, the data with similar input patterns will be mapped to neighbouring neurons in the SOM's output layer. After the dataset has been decomposed into several disjointed clusters, an individual LSSVM is applied to forecast the river flow. The feasibility of this proposed model is evaluated with respect to the actual river flow data from the Bernam River located in Selangor, Malaysia. The performance of the SOM-LSSVM was compared with other single models such as ARIMA, ANN and LSSVM. The performance of these models was then evaluated using various performance indicators. The experimental results show that the SOM-LSSVM model outperforms the other models and performs better than ANN, LSSVM as well as ARIMA for river flow forecasting. It also indicates that the proposed model can forecast more precisely, and provides a promising alternative technique for river flow forecasting.
NASA Astrophysics Data System (ADS)
Saha, Priya; Bhowmik, Mrinal K.; Bhattacharjee, Debotosh; De, Barin K.; Nasipuri, Mita
2013-03-01
Pose- and illumination-invariant face recognition is nowadays an emerging problem in the field of information security. In this paper, a gradient-based fusion method for gradient visual and corresponding infrared face images is proposed to overcome the problem of varying illumination conditions. This technique mainly extracts illumination-insensitive features under different conditions for effective face recognition. The gradient image is computed from a visible light image. Information fusion is performed in the gradient map domain. The image fusion of the infrared image and the corresponding visual gradient image is done in the wavelet domain by taking the maximum information of approximation and detail coefficients. These fused images are then reduced in dimension using Independent Component Analysis (ICA). The reduced face images, drawn from different classes of different datasets of the IRIS face database, are used for training and testing. The SVM multiclass strategy 'one-vs.-all' is used in the experiment. For training the support vector machine, the Sequential Minimal Optimization (SMO) algorithm is used. Linear and polynomial (degree 3) kernels are used as SVM kernel functions. The experimental results show that the proposed approach achieves good classification accuracies for face images under different lighting conditions.
Kher, Rahul; Pawar, Tanmay; Thakar, Vishvjit; Shah, Hitesh
2015-02-01
The use of wearable recorders for long-term monitoring of physiological parameters has increased in the last few years. The ambulatory electrocardiogram (A-ECG) signals of five healthy subjects performing four body movements or physical activities (PA), namely left arm up-down, right arm up-down, waist twisting and walking, have been recorded using a wearable ECG recorder. The classification of these four PAs has been performed using a neuro-fuzzy classifier (NFC) and support vector machines (SVM). The PA classification is based on the distinct time-frequency features of the extracted motion artifacts contained in the recorded A-ECG signals. The motion artifacts in the A-ECG signals have been separated first by the discrete wavelet transform (DWT), and the time-frequency features of these motion artifacts have then been extracted using the Gabor transform. The Gabor energy feature vectors have been fed to the NFC and SVM classifiers. Both classifiers have achieved a PA classification accuracy of over 95% for all subjects. PMID:25641014
Campanini, Renato; Dongiovanni, Danilo; Iampieri, Emiro; Lanconelli, Nico; Masotti, Matteo; Palermo, Giuseppe; Riccardi, Alessandro; Roffilli, Matteo
2004-03-21
In this work, we present a novel approach to mass detection in digital mammograms. The great variability of the appearance of masses is the main obstacle to building a mass detection method. It is indeed demanding to characterize all the varieties of masses with a reduced set of features. Hence, in our approach we have chosen not to extract any feature, for the detection of the region of interest; in contrast, we exploit all the information available on the image. A multiresolution overcomplete wavelet representation is performed, in order to codify the image with redundancy of information. The vectors of the very-large space obtained are then provided to a first support vector machine (SVM) classifier. The detection task is considered here as a two-class pattern recognition problem: crops are classified as suspect or not, by using this SVM classifier. False candidates are eliminated with a second cascaded SVM. To further reduce the number of false positives, an ensemble of experts is applied: the final suspect regions are achieved by using a voting strategy. The sensitivity of the presented system is nearly 80% with a false-positive rate of 1.1 marks per image, estimated on images coming from the USF DDSM database. PMID:15104319
Population genomics supports baculoviruses as vectors of horizontal transfer of insect transposons
Gilbert, Clément; Chateigner, Aurélien; Ernenwein, Lise; Barbe, Valérie; Bézier, Annie; Herniou, Elisabeth A.; Cordaux, Richard
2014-01-01
Horizontal transfer (HT) of DNA is an important factor shaping eukaryote evolution. Although several hundreds of eukaryote-to-eukaryote HTs of transposable elements (TEs) have been reported, the vectors underlying these transfers remain elusive. Here, we show that multiple copies of two TEs from the cabbage looper (Trichoplusia ni) transposed in vivo into genomes of the baculovirus Autographa californica multiple nucleopolyhedrovirus (AcMNPV) during caterpillar infection. We further demonstrate that both TEs underwent recent HT between several sympatric moth species (T. ni, Manduca sexta, Helicoverpa spp.) showing different degrees of susceptibility to AcMNPV. Based on two independent population genomics data sets (reaching a total coverage >330,000X), we report a frequency of one moth TE in ~8,500 AcMNPV genomes. Together, our results provide strong support for the role of viruses as vectors of TE HT between animals, and they call for a systematic evaluation of the frequency and impact of virus-mediated HT on the evolution of host genomes. PMID:24556639
An adaptive online learning approach for Support Vector Regression: Online-SVR-FID
NASA Astrophysics Data System (ADS)
Liu, Jie; Zio, Enrico
2016-08-01
Support Vector Regression (SVR) is a popular supervised data-driven approach for building empirical models from available data. Like all data-driven methods, under non-stationary environmental and operational conditions it needs to be provided with adaptive learning capabilities, which might become computationally burdensome with large datasets accumulating dynamically. In this paper, a cost-efficient online adaptive learning approach is proposed for SVR by combining Feature Vector Selection (FVS) and Incremental and Decremental Learning. The proposed approach adaptively modifies the model only when different pattern drifts are detected according to proposed criteria. Two tolerance parameters are introduced in the approach to control the computational complexity, reduce the influence of the intrinsic noise in the data and avoid the overfitting problem of SVR. The prediction results are compared with those of other online learning approaches, e.g. NORMA, SOGA, KRLS and Incremental Learning, on several artificial datasets and a real case study concerning time series prediction based on data recorded on a component of a nuclear power generation system. The performance indicators MSE and MARE computed on the test dataset demonstrate the efficiency of the proposed online learning method.
Abibullaev, Berdakh; An, Jinung; Jin, Sang-Hyeon; Lee, Seung Hyun; Moon, Jeon Il
2013-12-01
Brain signal variation across different subjects and sessions significantly impairs the accuracy of most brain-computer interface (BCI) systems. Herein, we present a classification algorithm that minimizes such variation, using linear programming support-vector machines (LP-SVM) and their extension to multiple kernel learning methods. The minimization is based on the decision boundaries formed in classifiers' feature spaces and their relation to BCI variation. Specifically, we estimate subject/session-invariant features in the reproducing kernel Hilbert spaces (RKHS) induced with Gaussian kernels. The idea is to construct multiple subject/session-dependent RKHS and to perform classification with LP-SVMs. To evaluate the performance of the algorithm, we applied it to oxy-hemoglobin data sets acquired from eight sessions and seven subjects as they performed two different mental tasks. Results show that our classifiers maintain good performance when applied to random patterns across varying sessions/subjects.
Intelligent decision support algorithm for distribution system restoration.
Singh, Reetu; Mehfuz, Shabana; Kumar, Parmod
2016-01-01
The distribution system is the means of revenue for an electric utility. It needs to be restored at the earliest if any feeder or the complete system is tripped out due to a fault or any other cause. Further, uncertainty of the loads results in variations in the distribution network's parameters. Thus, an intelligent algorithm incorporating a hybrid fuzzy-grey relation, which can take into account the uncertainties and compare the sequences, is discussed to analyse and restore the distribution system. Simulation studies are carried out to show the utility of the method by ranking the restoration plans for a typical distribution system. This algorithm also meets the smart grid requirements in terms of an automated restoration plan for the partial/full blackout of the network. PMID:27512634
Algorithmic support for commodity-based parallel computing systems.
Leung, Vitus Joseph; Bender, Michael A.; Bunde, David P.; Phillips, Cynthia Ann
2003-10-01
The Computational Plant or Cplant is a commodity-based distributed-memory supercomputer under development at Sandia National Laboratories. Distributed-memory supercomputers run many parallel programs simultaneously. Users submit their programs to a job queue. When a job is scheduled to run, it is assigned to a set of available processors. Job runtime depends not only on the number of processors but also on the particular set of processors assigned to it. Jobs should be allocated to localized clusters of processors to minimize communication costs and to avoid bandwidth contention caused by overlapping jobs. This report introduces new allocation strategies and performance metrics based on space-filling curves and one dimensional allocation strategies. These algorithms are general and simple. Preliminary simulations and Cplant experiments indicate that both space-filling curves and one-dimensional packing improve processor locality compared to the sorted free list strategy previously used on Cplant. These new allocation strategies are implemented in Release 2.0 of the Cplant System Software that was phased into the Cplant systems at Sandia by May 2002. Experimental results then demonstrated that the average number of communication hops between the processors allocated to a job strongly correlates with the job's completion time. This report also gives processor-allocation algorithms for minimizing the average number of communication hops between the assigned processors for grid architectures. The associated clustering problem is as follows: Given n points in R^d, find k points that minimize their average pairwise L1 distance. Exact and approximate algorithms are given for these optimization problems. One of these algorithms has been implemented on Cplant and will be included in Cplant System Software, Version 2.1, to be released. In more preliminary work, we suggest improvements to the scheduler separate from the allocator.
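The clustering problem stated at the end of this abstract (choose k of n points minimizing their average pairwise L1 distance) can be made concrete with an exhaustive search. This is a minimal brute-force sketch for small inputs, not one of the exact or approximate algorithms from the report; the function names are illustrative.

```python
from itertools import combinations

def l1(p, q):
    """Manhattan (L1) distance between two points of equal dimension."""
    return sum(abs(a - b) for a, b in zip(p, q))

def avg_pairwise_l1(points):
    """Average L1 distance over all unordered pairs (needs at least 2 points)."""
    pairs = list(combinations(points, 2))
    return sum(l1(p, q) for p, q in pairs) / len(pairs)

def best_cluster(points, k):
    """Try every k-subset and return the one with the smallest average distance."""
    return min(combinations(points, k), key=avg_pairwise_l1)
```

The search examines every k-subset, so its cost grows combinatorially with n; it is only useful as a reference answer when checking faster heuristics on small grids.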
Kawai, Kentaro; Fujishima, Satoshi; Takahashi, Yoshimasa
2008-06-01
Aiming at the prediction of pleiotropic effects of drugs, we have investigated the multilabel classification of drugs that have one or more of 100 different kinds of activity labels. Structural feature representation of each drug molecule was based on the topological fragment spectra method, which was proposed in our previous work. Support vector machine (SVM) was used for the classification and the prediction of their activity classes. Multilabel classification was carried out by a set of the SVM classifiers. The collective SVM classifiers were trained with a training set of 59,180 compounds and validated by another set (validation set) of 29,590 compounds. For a test set that consists of 9,864 compounds, the classifiers correctly classified 80.8% of the drugs into their own active classes. The SVM classifiers also successfully performed predictions of the activity spectra for multilabel compounds. PMID:18533712
Real time facial expression recognition from image sequences using support vector machines
NASA Astrophysics Data System (ADS)
Kotsia, I.; Pitas, I.
2005-07-01
In this paper, a real-time method is proposed as a solution to the problem of facial expression classification in video sequences. The user manually places some of the Candide grid nodes on the face depicted in the first frame. The grid adaptation system, based on deformable models, tracks the entire Candide grid as the facial expression evolves through time, thus producing a grid that corresponds to the greatest intensity of the facial expression, as shown in the last frame. Certain points that are involved in creating the Facial Action Unit movements are selected. Their geometrical displacement information, defined as the coordinates' difference between the last and the first frame, is extracted to be the input to a six-class Support Vector Machine system. The output of the system is the recognized facial expression. The proposed real-time system recognizes the 6 basic facial expressions with approximately 98% accuracy.
Analysis of dengue infection based on Raman spectroscopy and support vector machine (SVM).
Khan, Saranjam; Ullah, Rahat; Khan, Asifullah; Wahab, Noorul; Bilal, Muhammad; Ahmed, Mushtaq
2016-06-01
The current study presents the use of Raman spectroscopy combined with support vector machine (SVM) for the classification of dengue suspected human blood sera. Raman spectra for 84 clinically dengue suspected patients acquired from Holy Family Hospital, Rawalpindi, Pakistan, have been used in this study. The spectral differences between dengue positive and normal sera have been exploited by using effective machine learning techniques. In this regard, SVM models built on the basis of three different kernel functions, including Gaussian radial basis function (RBF), polynomial function and linear function, have been employed to classify the human blood sera based on features obtained from Raman spectra. The classification model has been evaluated with the 10-fold cross validation method. In the present study, the best performance has been achieved for the polynomial kernel of order 1. A diagnostic accuracy of about 85% with a precision of 90%, sensitivity of 73% and specificity of 93% has been achieved under these conditions.
NASA Astrophysics Data System (ADS)
Bao, Wenxing; Feng, Wei; Ma, Ruishi
2015-12-01
In this paper, we proposed a new classification method based on support vector machine (SVM) combined with multi-scale segmentation. The proposed method obtains satisfactory segmentation results which are based on both the spectral characteristics and the shape parameters of segments. SVM method is used to label all these regions after multiscale segmentation. It can effectively improve the classification results. Firstly, the homogeneity of the object spectra, texture and shape are calculated from the input image. Secondly, multi-scale segmentation method is applied to the RS image. Combining graph theory based optimization with the multi-scale image segmentations, the resulting segments are merged regarding the heterogeneity criteria. Finally, based on the segmentation result, the model of SVM combined with spectrum texture classification is constructed and applied. The results show that the proposed method can effectively improve the remote sensing image classification accuracy and classification efficiency.
Support vector machine-based feature extractor for L/H transitions in JET
NASA Astrophysics Data System (ADS)
González, S.; Vega, J.; Murari, A.; Pereira, A.; Ramírez, J. M.; Dormido-Canto, S.; Jet-Efda Contributors
2010-10-01
Support vector machines (SVM) are machine learning tools originally developed in the field of artificial intelligence to perform both classification and regression. In this paper, we show how SVM can be used to determine the most relevant quantities to characterize the confinement transition from low to high confinement regimes in tokamak plasmas. A set of 27 signals is used as the starting point. The signals are discarded one by one until an optimal number of relevant waveforms is reached, which is the best tradeoff between keeping a limited number of quantities and not losing essential information. The method has been applied to a database of 749 JET discharges, and an additional database of 150 JET discharges has been used to test the results obtained.
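The one-by-one discarding of signals described above resembles greedy backward elimination. The following is a minimal sketch under that assumption, not the authors' procedure: the function name `backward_elimination`, the `score` callback (for example a cross-validated SVM accuracy) and the toy scoring rule are all hypothetical.

```python
def backward_elimination(features, score, min_features=1):
    """Greedy backward elimination (illustrative sketch).

    `score` maps a tuple of feature names to a number (higher is better),
    e.g. a cross-validated classifier accuracy. Returns the best subset
    seen and its score.
    """
    current = list(features)
    best_subset, best_score = tuple(current), score(tuple(current))
    while len(current) > min_features:
        # Score every subset obtained by dropping one remaining feature.
        candidates = [
            (score(tuple(f for f in current if f != drop)), drop)
            for drop in current
        ]
        # Keep the drop that hurts the score least.
        cand_score, drop = max(candidates)
        current.remove(drop)
        # On ties, prefer the smaller subset.
        if cand_score >= best_score:
            best_subset, best_score = tuple(current), cand_score
    return best_subset, best_score


# Toy scoring rule (hypothetical): two informative signals, small cost per signal.
useful = {"a", "b"}

def toy_score(subset):
    return sum(1.0 for f in subset if f in useful) - 0.1 * len(subset)

subset, value = backward_elimination(["a", "b", "c", "d"], toy_score)
```

In a real study the scoring callback would retrain and validate the classifier for every candidate subset, which is where almost all of the cost lies.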
Data on Support Vector Machines (SVM) model to forecast photovoltaic power.
Malvoni, M; De Giorgi, M G; Congedo, P M
2016-12-01
The data concern the photovoltaic (PV) power, forecasted by a hybrid model that considers weather variations and applies a technique to reduce the input data size, as presented in the paper entitled "Photovoltaic forecast based on hybrid pca-lssvm using dimensionality reducted data" (M. Malvoni, M.G. De Giorgi, P.M. Congedo, 2015) [1]. The quadratic Renyi entropy criteria together with the principal component analysis (PCA) are applied to the Least Squares Support Vector Machines (LS-SVM) to predict the PV power in the day-ahead time frame. The data shared here represent the results of the proposed approach. Hourly PV power predictions for 1, 3, 6, 12 and 24 hours ahead and for different data reduction sizes are provided in the Supplementary material. PMID:27622206
Analysis of EEG signals by combining eigenvector methods and multiclass support vector machines.
Derya Ubeyli, Elif
2008-01-01
A new approach based on the implementation of multiclass support vector machine (SVM) with the error correcting output codes (ECOC) is presented for classification of electroencephalogram (EEG) signals. In practical applications of pattern recognition, there are often diverse features extracted from the raw data that need to be recognized. Decision making was performed in two stages: feature extraction by eigenvector methods and classification using the classifiers trained on the extracted features. The aim of the study is classification of the EEG signals by the combination of eigenvector methods and multiclass SVM. The purpose is to determine an optimum classification scheme for this problem and also to infer clues about the extracted features. The present research demonstrated that the features extracted by the eigenvector methods represent the EEG signals well and that the multiclass SVM trained on these features achieved high classification accuracies. PMID:17651716
Ubeyli, Elif Derya
2007-05-01
In this paper, the multiclass support vector machines (SVMs) with the error correcting output codes (ECOC) were presented for detecting variabilities of the multiclass Doppler ultrasound signals. The ophthalmic arterial (OA) Doppler signals were recorded from healthy subjects, subjects suffering from OA stenosis, subjects suffering from ocular Behcet disease. The internal carotid arterial (ICA) Doppler signals were recorded from healthy subjects, subjects suffering from ICA stenosis, subjects suffering from ICA occlusion. Methods of combining multiple classifiers with diverse features are viewed as a general problem in various application areas of pattern recognition. Because of the importance of making the right decision, better classification procedures for Doppler ultrasound signals are searched. Decision making was performed in two stages: feature extraction by eigenvector methods and classification using the SVMs trained on the extracted features. The research demonstrated that the multiclass SVMs trained on extracted features achieved high accuracy rates. PMID:17289211
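Error correcting output codes turn a multiclass problem into several binary ones and decode by finding the nearest codeword. The sketch below illustrates only the decoding step; the three-class codebook, the class labels and the function names are hypothetical and are not taken from the paper.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(bits, codebook):
    """Return the class whose codeword is closest (Hamming) to `bits`."""
    return min(codebook, key=lambda label: hamming(bits, codebook[label]))


# Hypothetical codebook: one row per class, one column per binary SVM.
# Pairwise Hamming distances are at least 3, so one wrong bit is tolerated.
codebook = {
    "healthy":   (0, 0, 0, 0, 0),
    "stenosis":  (1, 1, 1, 0, 0),
    "occlusion": (0, 0, 1, 1, 1),
}

# Outputs of the five binary classifiers, with the third bit flipped by error.
predicted_bits = (1, 1, 0, 0, 0)
label = ecoc_decode(predicted_bits, codebook)
```

Each column of the codebook defines one binary training task, so the number of SVMs to train equals the codeword length.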
Credit Risk Evaluation Using a C-Variable Least Squares Support Vector Classification Model
NASA Astrophysics Data System (ADS)
Yu, Lean; Wang, Shouyang; Lai, K. K.
Credit risk evaluation is one of the most important issues in financial risk management. In this paper, a C-variable least squares support vector classification (C-VLSSVC) model is proposed for credit risk analysis. The main idea of this model is based on the prior knowledge that different classes may have different importance for modeling and more weights should be given to those classes with more importance. The C-VLSSVC model can be constructed by a simple modification of the regularization parameter in LSSVC, whereby more weights are given to the least squares classification errors of important classes than to the least squares classification errors of unimportant classes while keeping the regularized terms in their original form. For illustration purposes, a real-world credit dataset is used to test the effectiveness of the C-VLSSVC model.
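One way to read the modification described above is as the standard LSSVC objective with a class-dependent regularization parameter. The formulation below is a sketch under that assumption; the symbols $C_i$, $C_{+}$, $C_{-}$ and $\xi_i$ are notation chosen here and may differ from the paper's.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \frac{1}{2}\, w^{\top} w \;+\; \frac{1}{2} \sum_{i=1}^{N} C_i\, \xi_i^{2} \\
\text{s.t.}\quad & y_i \left( w^{\top} \varphi(x_i) + b \right) = 1 - \xi_i, \qquad i = 1, \dots, N, \\
& C_i =
\begin{cases}
C_{+}, & y_i = +1 \ \text{(more important class)} \\
C_{-}, & y_i = -1
\end{cases}
\qquad C_{+} > C_{-} > 0 .
\end{aligned}
```

Setting $C_{+} = C_{-}$ recovers the ordinary LSSVC, and the regularization term $\tfrac{1}{2} w^{\top} w$ is unchanged, as the abstract states.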
Electric vehicle state of charge estimation: Nonlinear correlation and fuzzy support vector machine
NASA Astrophysics Data System (ADS)
Sheng, Hanmin; Xiao, Jian
2015-05-01
The aim of this study is to estimate the state of charge (SOC) of the lithium iron phosphate (LiFePO4) battery pack by applying a machine learning strategy. To reduce the noise sensitivity of common machine learning strategies, an SOC estimation method based on the fuzzy least squares support vector machine is proposed. By applying fuzzy inference and nonlinear correlation measurement, the effects of the samples with low confidence can be reduced. Further, a new approach for determining the error interval of regression results is proposed to avoid control system malfunction. Tests are carried out on modified COMS electric vehicles, with two battery packs each consisting of 24 50 Ah LiFePO4 batteries. The effectiveness of the method is demonstrated by the tests and by comparison with other popular methods.
Application of the support vector machine to predict subclinical mastitis in dairy cattle.
Mammadova, Nazira; Keskin, Ismail
2013-01-01
This study presented a potentially useful alternative approach to ascertain the presence of subclinical and clinical mastitis in dairy cows using support vector machine (SVM) techniques. The proposed method detected mastitis in a cross-sectional representative sample of Holstein dairy cattle milked using an automatic milking system. The study used such suspected indicators of mastitis as lactation rank, milk yield, electrical conductivity, average milking duration, and control season as input data. The output variable was somatic cell counts obtained from milk samples collected monthly throughout the 15 months of the control period. Cattle were judged to be healthy or infected based on those somatic cell counts. This study undertook a detailed scrutiny of the SVM methodology, constructing and examining a model which showed 89% sensitivity, 92% specificity, and 50% error in mastitis detection. PMID:24574862
Support vector machine-based feature extractor for L/H transitions in JET
Gonzalez, S.; Vega, J.; Pereira, A.; Ramirez, J. M.; Dormido-Canto, S.; Collaboration: JET-EFDA Contributors
2010-10-15
Support vector machines (SVM) are machine learning tools originally developed in the field of artificial intelligence to perform both classification and regression. In this paper, we show how SVM can be used to determine the most relevant quantities to characterize the confinement transition from low to high confinement regimes in tokamak plasmas. A set of 27 signals is used as the starting point. The signals are discarded one by one until an optimal number of relevant waveforms is reached, which is the best tradeoff between keeping a limited number of quantities and not losing essential information. The method has been applied to a database of 749 JET discharges, and an additional database of 150 JET discharges has been used to test the results obtained.
Li, Yang; Kong, Yue; Zhang, Mengdi; Yan, Aixia; Liu, Zhenming
2016-04-01
Inhibition of neuraminidase is one of the most promising strategies for preventing the spread of influenza virus. 479 neuraminidase inhibitors were collected for dataset 1 and 208 neuraminidase inhibitors for A/P/8/34 were collected for dataset 2. Using support vector machine (SVM), four computational models were built to predict whether a compound is an active or weakly active inhibitor of neuraminidase. Each compound is represented by MACCS fingerprints and ADRIANA.Code descriptors. The prediction accuracies for the test sets of all the models are over 78%. Model 2B, which is the best model, obtains a prediction accuracy and a Matthews Correlation Coefficient (MCC) of 89.71% and 0.81 on the test set, respectively. The molecular polarizability, molecular shape, molecular size and hydrogen bonding are related to the activities of neuraminidase inhibitors. The models can be obtained from the authors. PMID:27491921
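The Matthews Correlation Coefficient reported above is computed from the four cells of a binary confusion matrix. A minimal sketch follows; the function name and the example counts are illustrative and are not data from the study.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (the usual convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Illustrative counts only.
perfect = matthews_corrcoef(tp=50, fp=0, tn=50, fn=0)
chance = matthews_corrcoef(tp=25, fp=25, tn=25, fn=25)
inverted = matthews_corrcoef(tp=0, fp=50, tn=0, fn=50)
```

Unlike plain accuracy, the MCC stays near zero for a classifier that always predicts the majority class, which makes it a useful companion metric on unbalanced activity datasets.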
Using support vector machine and dynamic parameter encoding to enhance global optimization
NASA Astrophysics Data System (ADS)
Zheng, Z.; Chen, X.; Liu, C.; Huang, K.
2016-05-01
This study presents an approach which combines support vector machine (SVM) and dynamic parameter encoding (DPE) to enhance the run-time performance of global optimization with time-consuming fitness function evaluations. SVMs are used as surrogate models to partly substitute for fitness evaluations. To reduce the computation time and guarantee correct convergence, this work proposes a novel strategy to adaptively adjust the number of fitness evaluations needed according to the approximate error of the surrogate model. Meanwhile, DPE is employed to compress the solution space, so that it not only accelerates the convergence but also decreases the approximate error. Numerical results of optimizing a few benchmark functions and an antenna in a practical application are presented, which verify the feasibility, efficiency and robustness of the proposed approach.
Prediction of the solar radiation on the Earth using support vector regression technique
NASA Astrophysics Data System (ADS)
Piri, Jamshid; Shamshirband, Shahaboddin; Petković, Dalibor; Tong, Chong Wen; Rehman, Muhammad Habib ur
2015-01-01
Solar radiation at the surface of the Earth is one of the major factors in water resources, environmental and agricultural modeling. The main environmental factors influencing plant growth are temperature, moisture, and solar radiation. Solar radiation is rarely measured at weather stations; as a result, many empirical approaches have been applied to estimate it from other parameters. In this study, a soft computing technique named support vector regression (SVR) has been used to estimate the solar radiation. The data were collected from two synoptic stations with different climate conditions (Zahedan and Bojnurd) over periods of 5 and 7 years, respectively. These data contain sunshine hours, maximum temperature, minimum temperature, average relative humidity and daily solar radiation. In this study, the polynomial and radial basis functions (RBF) are applied as the SVR kernel function to estimate solar radiation. The performance of the proposed estimators is confirmed by the simulation results.
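For reference, the two kernel families named above can be written in a few lines. This is a sketch using the common textbook definitions; the parameter names `gamma`, `coef0` and `degree` and their default values are assumptions made here, not settings from the study.

```python
import math


def polynomial_kernel(x, z, degree=2, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, z> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, z))
    return (gamma * dot + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian RBF kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Two hypothetical, already scaled input vectors.
x = [1.0, 2.0]
z = [3.0, 4.0]
k_poly = polynomial_kernel(x, z, degree=2)
k_rbf_same = rbf_kernel(x, x)
k_rbf_far = rbf_kernel(x, z)
```

The RBF value depends only on the distance between the two inputs, so it is 1 for identical vectors and decays towards 0 as they move apart; the polynomial kernel grows with the inner product and is therefore more sensitive to input scaling.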
Support-vector-based emergent self-organising approach for emotional understanding
NASA Astrophysics Data System (ADS)
Nguwi, Yok-Yen; Cho, Siu-Yeung
2010-12-01
This study discusses the computational analysis of general emotion understanding from questionnaires methodology. The questionnaires method approaches the subject by investigating the real experience that accompanied the emotions, whereas the other laboratory approaches are generally associated with exaggerated elements. We adopted a connectionist model called support-vector-based emergent self-organising map (SVESOM) to analyse the emotion profiling from the questionnaires method. The SVESOM first identifies the important variables by giving discriminative features with high ranking. The classifier then performs the classification based on the selected features. Experimental results show that the top rank features are in line with the work of Scherer and Wallbott [(1994), 'Evidence for Universality and Cultural Variation of Differential Emotion Response Patterning', Journal of Personality and Social Psychology, 66, 310-328], which approached the emotions physiologically. While the performance measures show that using the full features for classifications can degrade the performance, the selected features provide superior results in terms of accuracy and generalisation.
Prediction of banana quality indices from color features using support vector regression.
Sanaeifar, Alireza; Bakhshipour, Adel; de la Guardia, Miguel
2016-01-01
Banana undergoes significant changes in quality indices and color during the shelf-life process, which in turn affect important chemical and physical characteristics for the organoleptic quality of banana. A computer vision system was implemented in order to evaluate the color of banana in the RGB, L*a*b* and HSV color spaces, and changes in color features of banana during shelf-life were employed for the quantitative prediction of quality indices. The radial basis function (RBF) was applied as the kernel function of support vector regression (SVR), and the color features, in different color spaces, were selected as the inputs of the model, with total soluble solids, pH, titratable acidity and firmness as the outputs. Experimental results showed an improvement in predictive accuracy compared with results obtained using an artificial neural network (ANN). PMID:26653423
On combining support vector machines and simulated annealing in stereovision matching.
Pajares, Gonzalo; de la Cruz, Jesús M
2004-08-01
This paper outlines a method for solving the stereovision matching problem using edge segments as the primitives. In stereovision matching, the following constraints are commonly used: epipolar, similarity, smoothness, ordering, and uniqueness. We propose a new strategy in which such constraints are sequentially combined. The goal is to achieve high performance in terms of correct matches by combining several strategies. The contributions of this paper are reflected in the development of a similarity measure through a support vector machines classification approach; the transformation of the smoothness, ordering and epipolar constraints into the form of an energy function, through an optimization simulated annealing approach, whose minimum value corresponds to a good matching solution and by introducing specific conditions to overcome the violation of the smoothness and ordering constraints. The performance of the proposed method is illustrated by comparative analysis against some recent global matching methods. PMID:15462432
Deep learning of support vector machines with class probability output networks.
Kim, Sangwook; Yu, Zhibin; Kil, Rhee Man; Lee, Minho
2015-04-01
Deep learning methods endeavor to learn features automatically at multiple levels and allow systems to learn complex functions mapping from the input space to the output space for the given data. The ability to learn powerful features automatically is increasingly important as the volume of data and range of applications of machine learning methods continues to grow. This paper proposes a new deep architecture that uses support vector machines (SVMs) with class probability output networks (CPONs) to provide better generalization power for pattern classification problems. As a result, deep features are extracted without additional feature engineering steps, using multiple layers of the SVM classifiers with CPONs. The proposed structure closely approaches the ideal Bayes classifier as the number of layers increases. Using a simulation of classification problems, the effectiveness of the proposed method is demonstrated.
Experimental study on light induced influence model to mice using support vector machine
NASA Astrophysics Data System (ADS)
Ji, Lei; Zhao, Zhimin; Yu, Yinshan; Zhu, Xingyue
2014-08-01
Previous researchers have studied different influences of light irradiation on animals, including retinal damage, changes of inner index and so on. However, a model of light induced damage to animals that uses physiological indicators as features in a machine learning method has not yet been established. This study was designed to evaluate the changes in micro vascular diameter, the serum absorption spectrum and the blood flow influenced by light irradiation of different wavelengths, powers and exposure times with support vector machine (SVM). The micro images of the mice auricle were recorded and the vessel diameters were calculated by a computer program. The serum absorption spectra were analyzed. The results show that training sample rates of 20% and 50% give almost the same correct recognition rate. Better performance and accuracy were achieved by the third-order polynomial kernel SVM with the quadratic optimization method, which worked suitably for predicting the light induced damage to organisms.
A hybrid least squares support vector machines and GMDH approach for river flow forecasting
NASA Astrophysics Data System (ADS)
Samsudin, R.; Saad, P.; Shabri, A.
2010-06-01
This paper proposes a novel hybrid forecasting model, known as GLSSVM, which combines the group method of data handling (GMDH) and the least squares support vector machine (LSSVM). The GMDH is used to determine the useful input variables for the LSSVM model, and the LSSVM model performs the time series forecasting. In this study the application of GLSSVM to monthly river flow forecasting of the Selangor and Bernam Rivers is investigated. The results of the proposed GLSSVM approach are compared with the conventional artificial neural network (ANN) models, the Autoregressive Integrated Moving Average (ARIMA) model, and the GMDH and LSSVM models using the long term observations of monthly river flow discharge. Standard statistical measures, the root mean square error (RMSE) and the coefficient of correlation (R), are employed to evaluate the performance of the various models developed. Experimental results indicate that the hybrid model is a powerful tool for modeling discharge time series and can be applied successfully in complex hydrological modeling.
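The two evaluation measures named above, RMSE and the coefficient of correlation R, can be computed as follows. This is a minimal sketch; the function names and the sample flow values are illustrative and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def correlation(observed, predicted):
    """Pearson coefficient of correlation R."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical monthly flows and forecasts (arbitrary units).
flows = [10.0, 12.0, 15.0, 11.0]
forecasts = [11.0, 12.0, 14.0, 10.0]
error = rmse(flows, forecasts)
r = correlation(flows, forecasts)
```

RMSE is expressed in the units of the forecast variable and penalizes large misses, while R is scale-free and only measures how well the forecasts track the observations, so the two are usually reported together.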
Cervical cancer survival prediction using hybrid of SMOTE, CART and smooth support vector machine
NASA Astrophysics Data System (ADS)
Purnami, S. W.; Khasanah, P. M.; Sumartini, S. H.; Chosuvivatwong, V.; Sriplung, H.
2016-04-01
According to the WHO, one patient dies from cervical cancer every two minutes. The high mortality rate is due to women's lack of awareness of early detection. Several factors are thought to influence the survival of cervical cancer patients, including age, anemia status, stage, type of treatment, complications and secondary disease. This study aims to classify/predict cervical cancer survival based on those factors. Various classification methods were used: classification and regression tree (CART), smooth support vector machine (SSVM) and three order spline SSVM (TSSVM). Since the cervical cancer data are imbalanced, the synthetic minority oversampling technique (SMOTE) is used to handle the imbalanced dataset. The performance of these methods is evaluated using accuracy, sensitivity and specificity. The results of this study show that balancing the data with SMOTE as a preprocessing step can improve classification performance. The SMOTE-SSVM method provided better results than SMOTE-TSSVM and SMOTE-CART.
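SMOTE balances a dataset by creating synthetic minority samples on the line segments between a minority point and one of its nearest minority neighbours. The sketch below shows that interpolation step only, for numeric features; the function name, the parameters `k` and `seed`, and the sample points are assumptions made for illustration, and categorical factors such as stage or treatment type would need separate handling.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate `n_new` synthetic points from a list of minority samples.

    Each synthetic point lies between a randomly chosen minority sample
    and one of its `k` nearest minority neighbours (squared Euclidean
    distance). Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        # k nearest minority neighbours of x.
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical minority-class samples with two numeric features.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote(minority_points, n_new=5)
```

Oversampling should be applied to the training folds only; generating synthetic points before the train/test split lets information from test samples leak into training.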
gamma-Turn types prediction in proteins using the support vector machines.
Jahandideh, Samad; Sarvestani, Amir Sabet; Abdolmaleki, Parviz; Jahandideh, Mina; Barfeie, Mahdyar
2007-12-21
Recently, two different models have been developed for predicting gamma-turns in proteins by Kaur and Raghava [2002. An evaluation of beta-turn prediction methods. Bioinformatics 18, 1508-1514; 2003. A neural-network based method for prediction of gamma-turns in proteins from multiple sequence alignment. Protein Sci. 12, 923-929]. However, the major limitation of these previous methods is their inability to predict gamma-turn types. Thus, there is a need to predict gamma-turn types using an approach that will be useful in overall tertiary structure prediction. In this work, a powerful model, support vector machines (SVMs), is proposed for predicting gamma-turn types in proteins. The high prediction accuracy showed that the formation of gamma-turn types is evidently correlated with the sequence of tripeptides, and hence can be approximately predicted based on the sequence information of the tripeptides alone. PMID:17936305
Support vector machine with adaptive composite kernel for hyperspectral image classification
NASA Astrophysics Data System (ADS)
Li, Wei; Du, Qian
2015-05-01
With the improvement of spatial resolution of hyperspectral imagery, it is more reasonable to include spatial information in classification. The resulting spectral-spatial classification outperforms the traditional hyperspectral image classification with spectral information only. Among many spectral-spatial classifiers, support vector machine with composite kernel (SVM-CK) can provide superior performance, with one kernel for spectral information and the other for spatial information. In the original SVM-CK, the spatial information is retrieved by spatial averaging of pixels in a local neighborhood, and used in classifying the central pixel. Obviously, not all the pixels in such a local neighborhood may belong to the same class. Thus, we investigate the performance of Gaussian lowpass filter and an adaptive filter with weights being assigned based on the similarity to the central pixel. The adaptive filter can significantly improve classification accuracy while the Gaussian lowpass filter is less time-consuming and less sensitive to the window size.
Using support vector machine and evolutionary profiles to predict antifreeze protein sequences.
Zhao, Xiaowei; Ma, Zhiqiang; Yin, Minghao
2012-01-01
Antifreeze proteins (AFPs) are ice-binding proteins. Accurate identification of new AFPs is important in understanding ice-protein interactions and creating novel ice-binding domains in other proteins. In this paper, an accurate method, called AFP_PSSM, has been developed for predicting antifreeze proteins using a support vector machine (SVM) and position specific scoring matrix (PSSM) profiles. This is the first study in which evolutionary information in the form of PSSM profiles has been successfully used for predicting antifreeze proteins. Tested by 10-fold cross validation and independent test, the accuracy of the proposed method reaches 82.67% for the training dataset and 93.01% for the testing dataset, respectively. These results indicate that our predictor is a useful tool for predicting antifreeze proteins. A web server (AFP_PSSM) that implements the proposed predictor is freely available.
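The 10-fold cross validation mentioned above partitions the samples into ten folds and uses each fold once for testing. A minimal index-splitting sketch follows; the function name is hypothetical, and the folds are contiguous for simplicity, whereas in practice the sample order is usually shuffled (and often stratified by class) first.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    # The first n_samples % k folds get one extra sample.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


# Example with a hypothetical dataset of 25 sequences.
folds = list(k_fold_indices(25, k=10))
```

Every sample appears in exactly one test fold, so the averaged score uses all of the data for evaluation without ever testing on a sample the model was trained on.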
Analysis of dengue infection based on Raman spectroscopy and support vector machine (SVM)
Khan, Saranjam; Ullah, Rahat; Khan, Asifullah; Wahab, Noorul; Bilal, Muhammad; Ahmed, Mushtaq
2016-01-01
The current study presents the use of Raman spectroscopy combined with support vector machine (SVM) for the classification of dengue suspected human blood sera. Raman spectra for 84 clinically dengue suspected patients acquired from Holy Family Hospital, Rawalpindi, Pakistan, have been used in this study. The spectral differences between dengue positive and normal sera have been exploited by using effective machine learning techniques. In this regard, SVM models built on the basis of three different kernel functions including Gaussian radial basis function (RBF), polynomial function and linear function have been employed to classify the human blood sera based on features obtained from Raman spectra. The classification models have been evaluated with the 10-fold cross validation method. In the present study, the best performance has been achieved for the polynomial kernel of order 1. A diagnostic accuracy of about 85% with the precision of 90%, sensitivity of 73% and specificity of 93% has been achieved under these conditions. PMID:27375941
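The four figures quoted above (accuracy, precision, sensitivity, specificity) all derive from the same binary confusion matrix. A minimal sketch of the definitions follows; the function name and the counts are hypothetical and are not the study's confusion matrix.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity and specificity from counts.

    tp/fn: positive cases classified correctly/incorrectly;
    tn/fp: negative cases classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=30, fp=5, tn=45, fn=10)
```

A high specificity combined with a lower sensitivity, as in the abstract, means that few healthy sera are flagged as positive while a larger share of positive sera are missed.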
Landslide susceptibility mapping using support vector machine and GIS at the Golestan Province, Iran
NASA Astrophysics Data System (ADS)
Pourghasemi, Hamid Reza; Jirandeh, Abbas Goli; Pradhan, Biswajeet; Xu, Chong; Gokceoglu, Candan
2013-04-01
The main goal of this study is to produce a landslide susceptibility map using a GIS-based support vector machine (SVM) for the Kalaleh Township area of the Golestan province, Iran. In this paper, six different types of kernel classifiers, namely linear, polynomial degree of 2, polynomial degree of 3, polynomial degree of 4, radial basis function (RBF) and sigmoid, were used for landslide susceptibility mapping. At the first stage of the study, landslide locations were identified by aerial photographs and field surveys, and a total of 82 landslide locations were extracted from various sources. Of these, 75% of the landslides (61 landslide locations) were used as the training dataset and the rest (21 landslide locations) were used as the validation dataset. Fourteen input data layers were employed as landslide conditioning factors in the landslide susceptibility modelling. These factors are slope degree, slope aspect, altitude, plan curvature, profile curvature, tangential curvature, surface area ratio (SAR), lithology, land use, distance from faults, distance from rivers, distance from roads, topographic wetness index (TWI) and stream power index (SPI). Using these conditioning factors, landslide susceptibility indices were calculated using support vector machine by employing six types of kernel function classifiers. Subsequently, the results were plotted in ArcGIS and six landslide susceptibility maps were produced. Then, using the success rate and the prediction rate methods, the validation process was performed by comparing the existing landslide data with the six landslide susceptibility maps. The validation results showed that success rates for the six types of kernel models varied from 79% to 87%. Similarly, results of prediction rates showed that RBF (85%) and polynomial degree of 3 (83%) models performed slightly better than other types of kernel (polynomial degree of 2 = 78%, sigmoid = 78%, polynomial degree of 4 = 78%, and linear = 77%) models. Based on our results, the
2011-01-01
Background: Cardiotocography (CTG) is the most widely used tool for fetal surveillance. The visual analysis of fetal heart rate (FHR) traces largely depends on the expertise and experience of the clinician involved. Several approaches have been proposed for the effective interpretation of FHR. In this paper, a new approach for FHR feature extraction based on empirical mode decomposition (EMD) is proposed, which was used along with support vector machine (SVM) for the classification of FHR recordings as 'normal' or 'at risk'. Methods: The FHR were recorded from 15 subjects at a sampling rate of 4 Hz and a dataset consisting of 90 randomly selected records of 20 minutes duration was formed from these. All records were labelled as 'normal' or 'at risk' by two experienced obstetricians. A training set was formed by 60 records, the remaining 30 left as the testing set. The standard deviations of the EMD components are input as features to a support vector machine (SVM) to classify FHR samples. Results: For the training set, a five-fold cross validation test resulted in an accuracy of 86% whereas the overall geometric mean of sensitivity and specificity was 94.8%. The Kappa value for the training set was 0.923. Application of the proposed method to the testing set (30 records) resulted in a geometric mean of 81.5%. The Kappa value for the testing set was 0.684. Conclusions: Based on the overall performance of the system it can be stated that the proposed methodology is a promising new approach for the feature extraction and classification of FHR signals. PMID:21244712
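The geometric mean of sensitivity and specificity and the Kappa value reported above can both be derived from a binary confusion matrix. The sketch below uses the usual definitions (Cohen's kappa for chance-corrected agreement between predicted and reference labels); the function names and counts are illustrative and are not the study's data.

```python
import math


def geometric_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


def cohen_kappa(tp, fp, tn, fn):
    """Cohen's kappa: agreement between predictions and labels beyond chance."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    # Agreement expected if predictions and labels were independent.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts for a 30-record test set.
g = geometric_mean(tp=8, fp=2, tn=18, fn=2)
kappa = cohen_kappa(tp=8, fp=2, tn=18, fn=2)
```

The geometric mean drops sharply if either class is classified poorly, which makes it more informative than accuracy when 'at risk' records are much rarer than 'normal' ones.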
Daily changes of individual gait patterns identified by means of support vector machines.
Horst, F; Kramer, F; Schäfer, B; Eekhoff, A; Hegen, P; Nigg, B M; Schöllhorn, W I
2016-09-01
Despite the common knowledge about the individual character of human gait patterns and about their non-repeatability, little is known about their stability, their interactions and their changes over time. Variations of gait patterns are typically described as random deviations around a stable mean curve derived from groups, which appear due to noise or experimental insufficiencies. The purpose of this study is to examine the nature of intrinsic inter-session variability in more detail by proving separable characteristics of gait patterns between individuals as well as within individuals in repeated measurement sessions. Eight healthy subjects performed 15 gait trials at a self-selected speed on eight days within two weeks. For each trial, the time-continuous ground reaction forces and lower body kinematics were quantified. A total of 960 gait patterns were analysed by means of support vector machines and the coefficient of multiple correlation. The results emphasise the remarkable amount of individual characteristics in human gait. Support vector machines results showed an error-free assignment of gait patterns to the corresponding individual. Thus, differences in gait patterns between individuals seem to be persistent over two weeks. Within the range of individual gait patterns, day specific characteristics could be distinguished by classification rates of 97.3% and 59.5% for the eight-day classification of lower body joint angles and ground reaction forces, respectively. Hence, gait patterns can be assumed not to be constant over time and rather exhibit discernible daily changes within previously stated good repeatability. Advantages for more individual and situational diagnoses or therapy are identified.
Large-scale ligand-based predictive modelling using support vector machines.
Alvarsson, Jonathan; Lampa, Samuel; Schaal, Wesley; Andersson, Claes; Wikberg, Jarl E S; Spjuth, Ola
2016-01-01
The increasing size of datasets in drug discovery makes it challenging to build robust and accurate predictive models within a reasonable amount of time. In order to investigate the effect of dataset sizes on predictive performance and modelling time, ligand-based regression models were trained on open datasets of varying sizes of up to 1.2 million chemical structures. For modelling, two implementations of support vector machines (SVM) were used. Chemical structures were described by the signatures molecular descriptor. Results showed that for the larger datasets, the LIBLINEAR SVM implementation performed on par with the well-established libsvm with a radial basis function kernel, but with dramatically less time for model building even on modest computer resources. Using a non-linear kernel proved to be infeasible for large data sizes, even with substantial computational resources on a computer cluster. To deploy the resulting models, we extended the Bioclipse decision support framework to support models from LIBLINEAR and made our models of logD and solubility available from within Bioclipse. PMID:27516811
AgRISTARS. Supporting research: Algorithms for scene modelling
NASA Technical Reports Server (NTRS)
Rassbach, M. E. (Principal Investigator)
1982-01-01
The requirements for a comprehensive analysis of LANDSAT or other visual data scenes are defined. The development of a general model of a scene and a computer algorithm for finding the particular model for a given scene is discussed. The modelling system includes a boundary analysis subsystem, which detects all the boundaries and lines in the image and builds a boundary graph; a continuous variation analysis subsystem, which finds gradual variations not well approximated by a boundary structure; and a miscellaneous features analysis, which includes texture, line parallelism, etc. The noise reduction capabilities of this method and its use in image rectification and registration are discussed.
NASA Astrophysics Data System (ADS)
Zhang, Yong; Cong, Qian; Xie, Yunfei; Yang, Jingxiu; Zhao, Bing
2008-12-01
It is important to monitor the quality of tobacco during cigarette production. To control the tobacco raw material scientifically and guarantee cigarette quality, fast and accurate determination of the routine chemical constituents of tobacco, including total sugar, reducing sugar, nicotine, and total nitrogen, is needed. In this study, 50 samples of tobacco from different cultivation areas were surveyed by near-infrared (NIR) spectroscopy, and the spectral differences provided enough information for quantitative analysis of the tobacco. Quantitative analysis models for the 50 tobacco samples were compared using partial least squares regression (PLSR), artificial neural network (ANN), and radial basis function (RBF) support vector machine (SVM) regression, and the parameters of the models were also discussed. The spectral variables of the 50 samples were compressed by wavelet transformation before the models were established. The best experimental results were obtained using RBF SVM regression with γ = 1.5, 1.3, 0.9, and 0.1 for total sugar, reducing sugar, nicotine, and total nitrogen, respectively. Finally, compared with the back-propagation ANN (BP-ANN) and PLSR approaches, the SVM algorithm showed excellent generalization in the quantitative analysis results, while the number of samples used to establish the model is smaller. The overall results show that NIR spectroscopy combined with SVM can be used efficiently for rapid and accurate analysis of routine chemical compositions in tobacco. The research can also serve as technical support and a foundation for quantitative analysis in other NIR applications.
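To illustrate what the γ values quoted above control, here is a minimal sketch of an RBF kernel. It assumes the common parameterisation k(x, y) = exp(-γ‖x − y‖²); the abstract does not state which parameterisation the authors used, and the two feature vectors are made up.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel under the assumed form exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Two hypothetical compressed spectra at squared distance 1.
x, y = [0.0, 0.0], [1.0, 0.0]
for gamma in (1.5, 1.3, 0.9, 0.1):
    print(gamma, round(rbf_kernel(x, y, gamma), 4))
```

Under this form a larger γ makes the similarity fall off faster with distance, so the fitted regression function becomes more local.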
A Battery-Aware Algorithm for Supporting Collaborative Applications
NASA Astrophysics Data System (ADS)
Rollins, Sami; Chang-Yit, Cheryl
Battery-powered devices such as laptops, cell phones, and MP3 players are becoming ubiquitous. There are several significant ways in which the ubiquity of battery-powered technology impacts the field of collaborative computing. First, applications such as collaborative data gathering, become possible. Also, existing applications that depend on collaborating devices to maintain the system infrastructure must be reconsidered. Fundamentally, the problem lies in the fact that collaborative applications often require end-user computing devices to perform tasks that happen in the background and are not directly advantageous to the user. In this work, we seek to better understand how laptop users use the batteries attached to their devices and analyze a battery-aware alternative to Gnutella’s ultrapeer selection algorithm. Our algorithm provides insight into how system maintenance tasks can be allocated to battery-powered nodes. The most significant result of our study indicates that a large portion of laptop users can participate in system maintenance without sacrificing any of their battery. These results show great promise for existing collaborative applications as well as new applications, such as collaborative data gathering, that rely upon battery-powered devices.
Sweeney, Elizabeth M; Vogelstein, Joshua T; Cuzzocreo, Jennifer L; Calabresi, Peter A; Reich, Daniel S; Crainiceanu, Ciprian M; Shinohara, Russell T
2014-01-01
Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance. PMID:24781953
MAP Support Detection for Greedy Sparse Signal Recovery Algorithms in Compressive Sensing
NASA Astrophysics Data System (ADS)
Lee, Namyoon
2016-10-01
Reliable support detection is essential for a greedy algorithm to reconstruct a sparse signal accurately from compressed and noisy measurements. This paper proposes a novel support detection method for greedy algorithms, which is referred to as "maximum a posteriori (MAP) support detection". Unlike existing support detection methods that identify support indices with the largest correlation value in magnitude per iteration, the proposed method selects them with the largest likelihood ratios computed under the true and null support hypotheses by simultaneously exploiting the distributions of sensing matrix, sparse signal, and noise. Leveraging this technique, MAP-Matching Pursuit (MAP-MP) is first presented to show the advantages of exploiting the proposed support detection method, and a sufficient condition for perfect signal recovery is derived for the case when the sparse signal is binary. Subsequently, a set of iterative greedy algorithms, called MAP-generalized Orthogonal Matching Pursuit (MAP-gOMP), MAP-Compressive Sampling Matching Pursuit (MAP-CoSaMP), and MAP-Subspace Pursuit (MAP-SP) are presented to demonstrate the applicability of the proposed support detection method to existing greedy algorithms. From empirical results, it is shown that the proposed greedy algorithms with highly reliable support detection can be better, faster, and easier to implement than basis pursuit via linear programming.
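For context, the conventional rule that the abstract contrasts with MAP support detection (pick the column most correlated with the current residual) can be sketched as follows. This is the baseline only, not the proposed MAP rule, whose likelihood ratios are not given in the abstract; the sensing matrix and residual are toy values.

```python
def detect_support_index(columns, residual):
    """Return the index of the column with the largest |correlation| with the residual."""
    def correlation(column):
        return abs(sum(c * r for c, r in zip(column, residual)))
    return max(range(len(columns)), key=lambda j: correlation(columns[j]))

# Toy sensing matrix stored column-wise, and a toy residual.
columns = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
residual = [0.1, -2.0, 0.3]
selected = detect_support_index(columns, residual)
print(selected)
```

A greedy algorithm such as matching pursuit repeats this selection, subtracts the chosen column's contribution from the residual, and iterates.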
Model for noise-induced hearing loss using support vector machine
NASA Astrophysics Data System (ADS)
Qiu, Wei; Ye, Jun; Liu-White, Xiaohong; Hamernik, Roger P.
2005-09-01
Contemporary noise standards are based on the assumption that an energy metric such as the equivalent noise level is sufficient for estimating the potential of a noise stimulus to cause noise-induced hearing loss (NIHL). Available data from laboratory-based experiments (Lei et al., 1994; Hamernik and Qiu, 2001) indicate that while an energy metric may be necessary, it is not sufficient for the prediction of NIHL. A support vector machine (SVM) NIHL prediction model was constructed, based on a 550-subject (noise-exposed chinchillas) database. Training of the model used data from 367 noise-exposed subjects. The model was tested using the remaining 183 subjects. Input variables for the model included acoustic, audiometric, and biological variables, while output variables were PTS and cell loss. The results show that an energy parameter is not sufficient to predict NIHL, especially in complex noise environments. With the kurtosis and other noise and biological parameters included as additional inputs, the performance of the SVM prediction model was significantly improved. The SVM prediction model has the potential to reliably predict noise-induced hearing loss. [Work supported by NIOSH.]
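The point that an energy metric alone cannot separate different noise exposures can be illustrated with kurtosis. The sketch below assumes the usual definition (fourth central moment divided by the squared variance) and uses two made-up signals with identical mean-square energy.

```python
def kurtosis(samples):
    """Fourth central moment divided by the squared second central moment."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / m2 ** 2

def mean_square(samples):
    """Average signal energy per sample."""
    return sum(s * s for s in samples) / len(samples)

# Hypothetical signals: steady noise versus rare, strong impulses.
steady = [2.0, -2.0] * 4
impulsive = [0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0, -4.0]
print(mean_square(steady), mean_square(impulsive))
print(kurtosis(steady), kurtosis(impulsive))
```

Both signals carry the same energy, yet the impulsive one has four times the kurtosis, which is the kind of extra information the model receives when kurtosis is added as an input.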
Real-time nondestructive structural health monitoring using support vector machines and wavelets
NASA Astrophysics Data System (ADS)
Bulut, Ahmet; Singh, Ambuj K.; Shin, Peter; Fountain, Tony; Jasso, Hector; Yan, Linjun; Elgamal, Ahmed
2005-05-01
We present an alternative to visual inspection for detecting damage to civil infrastructure. We describe a real-time decision support system for nondestructive health monitoring. The system is instrumented by an integrated network of wireless sensors mounted on civil infrastructures such as bridges, highways, and commercial and industrial facilities. To address scalability and power consumption issues related to sensor networks, we propose a three-tier system that uses wavelets to adaptively reduce the streaming data spatially and temporally. At the sensor level, measurement data is temporally compressed before being sent upstream to intermediate communication nodes. There, correlated data from multiple sensors is combined and sent to the operation center for further reduction and interpretation. At each level, the compression ratio can be adaptively changed via wavelets. This multi-resolution approach is useful in optimizing total resources in the system. At the operation center, Support Vector Machines (SVMs) are used to detect the location of potential damage from the reduced data. We demonstrate that the SVM is a robust classifier in the presence of noise and that wavelet-based compression gracefully degrades its classification accuracy. We validate the effectiveness of our approach using a finite element model of the Humboldt Bay Bridge. We envision that our approach will prove novel and useful in the design of scalable nondestructive health monitoring systems.
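As a rough illustration of wavelet-based reduction of sensor data, the sketch below applies one level of an unnormalised Haar transform and zeroes small detail coefficients. The abstract does not say which wavelet or thresholding rule the system uses, so both are assumptions, and the readings are invented.

```python
def haar_step(signal):
    """One level of an unnormalised Haar transform on an even-length list."""
    pairs = range(0, len(signal), 2)
    averages = [(signal[i] + signal[i + 1]) / 2 for i in pairs]
    details = [(signal[i] - signal[i + 1]) / 2 for i in pairs]
    return averages, details

def inverse_haar_step(averages, details):
    """Rebuild the signal from averages and (possibly thresholded) details."""
    signal = []
    for a, d in zip(averages, details):
        signal.extend([a + d, a - d])
    return signal

def compress(signal, threshold):
    """Keep only detail coefficients whose magnitude reaches the threshold."""
    averages, details = haar_step(signal)
    kept = [d if abs(d) >= threshold else 0.0 for d in details]
    return averages, kept

readings = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 0.0, 6.0]  # hypothetical samples
averages, kept = compress(readings, threshold=2.0)
approximation = inverse_haar_step(averages, kept)
print(averages, kept, approximation)
```

Raising the threshold discards more coefficients, which is one simple way a compression ratio could be adjusted at each tier.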
Object Recognition System-on-Chip Using the Support Vector Machines
NASA Astrophysics Data System (ADS)
Reyna-Rojas, Roberto; Houzet, Dominique; Dragomirescu, Daniela; Carlier, Florent; Ouadjaout, Salim
2005-12-01
The first aim of this work is to propose the design of a system-on-chip (SoC) platform dedicated to digital image and signal processing, which is tuned to efficiently implement multiply-and-accumulate (MAC) vector/matrix operations. The second aim is to implement a recent and promising neural network method, namely the support vector machine (SVM), for real-time object recognition in order to build a vision machine. With such a reconfigurable and programmable SoC platform, it is possible to implement any SVM function dedicated to any object recognition problem. The final aim is to obtain an automatic reconfiguration of the SoC platform, based on the results of the learning phase on a database of objects, which makes it possible to recognize practically any object without manual programming. Recognition can be of any kind, from image data to signal data. Such a system is a general-purpose automatic classifier. Many applications can be considered as classification problems, but they are usually treated specifically in order to optimize the cost of the implemented solution. The cost of our approach is higher than that of a dedicated one, but in the near future, hundreds of millions of gates will be common and affordable compared to the design cost. What we are proposing here is a general-purpose classification neural network implemented on a reconfigurable SoC platform. The first version presented here is limited in size and thus in object recognition performance, but can be easily upgraded as technology improves.
Liu, Jin; Guo, Ting-ting; Li, Hao-chuan; Jia, Shi-qiang; Yan, Yan-lu; An, Dong; Zhang, Yao; Chen, Shao-jiang
2015-11-01
Doubled haploid (DH) lines are routinely applied in the hybrid maize breeding programs of many institutes and companies for their advantages of complete homozygosity and short breeding cycle length. A key issue in this approach is an efficient screening system to identify haploid kernels from the hybrid kernels crossed with the inducer. At present, haploid kernel selection is carried out manually using the "red-crown" kernel trait (the haploid kernel has a non-pigmented embryo and pigmented endosperm) controlled by the R1-nj gene. Manual selection is time-consuming and unreliable. Furthermore, the color of the kernel embryo is concealed by the pericarp. Here, we establish a novel approach for identifying maize haploid kernels based on visible (Vis) spectroscopy and support vector machine (SVM) pattern recognition technology. The diffuse transmittance spectra of individual kernels (141 haploid kernels and 141 hybrid kernels from 9 genotypes) were collected using a portable UV-Vis spectrometer and integrating sphere. The raw spectral data were preprocessed using smoothing and vector normalization methods. The desired feature wavelengths were selected based on the results of the Kolmogorov-Smirnov test. The wavelengths with p values above 0.05 were eliminated because the distributions of absorbance data at these wavelengths show no significant difference between haploid and hybrid kernels. Principal component analysis was then performed to reduce the number of variables. The SVM model was evaluated by 9-fold cross-validation. In each round, samples of one genotype were used as the testing set, while those of the other genotypes were used as the training set. The mean rate of correct discrimination was 92.06%. This result demonstrates the feasibility of using Vis spectroscopy to identify haploid maize kernels. The method would help develop a rapid and accurate automated screening system for haploid kernels. PMID:26978947
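The evaluation scheme described above, in which each round holds out all samples of one genotype, can be sketched as a leave-one-group-out split. The function name and the genotype labels below are hypothetical; only the splitting logic is shown, not the SVM itself.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each distinct group."""
    for held_out in sorted(set(groups)):
        test = [i for i, g in enumerate(groups) if g == held_out]
        train = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train, test

# Hypothetical genotype label for each kernel sample.
genotypes = ["A", "A", "B", "B", "B", "C"]
splits = list(leave_one_group_out(genotypes))
for held_out, train, test in splits:
    print(held_out, train, test)
```

With nine genotypes this produces nine rounds, so the reported accuracy reflects performance on genotypes the model never saw during training.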
TJ-II wave forms analysis with wavelets and support vector machines
Dormido-Canto, S.; Farias, G.; Dormido, R.; Vega, J.; Sanchez, J.; Santos, M.
2004-10-01
Since the fusion plasma experiment generates hundreds of signals, it is essential to have automatic mechanisms for searching for similarities and retrieving specific data in the wave form database. Wavelet transform (WT) is a transformation that allows one to map signals to spaces of lower dimensionality. Support vector machine (SVM) is a very effective method for general purpose pattern recognition. Given a set of input vectors which belong to two different classes, the SVM maps the inputs into a high-dimensional feature space through some nonlinear mapping, where an optimal separating hyperplane is constructed. In this work, the combined use of WT and SVM is proposed for searching and retrieving similar wave forms in the TJ-II database. In a first stage, plasma signals will be preprocessed by WT to reduce their dimensionality and to extract their main features. In the next stage, using the smoothed signals produced by the WT, SVM will be applied to show the efficiency of the proposed method in dealing with the problem of sorting out thousands of fusion plasma signals. From observation of several experiments, our WT+SVM method is viable, and the results seem promising. However, we have further work to do. We have to finish the development of a Matlab toolbox for WT+SVM processing and to include new relevant features in the SVM inputs to improve the technique. We also have to improve the preprocessing of the input signals and to study the performance of other generic and custom kernels. To this end, and since the preprocessing stages are very time consuming, we are going to study the viability of using DSPs, FPGAs or parallel programming techniques to reduce the execution time.
NASA Astrophysics Data System (ADS)
Zhong-Bao, Liu
2016-06-01
Support Vector Machine (SVM) is one of the important stellar spectral classification methods, and it is widely used in practice. But its classification efficiencies cannot be greatly improved because it does not take the class distribution into consideration. In view of this, a modified SVM named Minimum within-class and Maximum between-class scatter Support Vector Machine (MMSVM) is constructed to deal with the above problem. MMSVM merges the advantages of Fisher's Discriminant Analysis (FDA) and SVM, and the comparative experiments on the Sloan Digital Sky Survey (SDSS) show that MMSVM performs better than SVM.
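The abstract gives no formulas for MMSVM, but the two quantities in its name come from Fisher's Discriminant Analysis. As a sketch under that assumption, the traces of the standard within-class and between-class scatter matrices can be computed as below; the function name and the two toy classes are illustrative only.

```python
def scatter_traces(classes):
    """classes maps a label to a list of feature vectors.

    Returns (within, between): traces of the FDA within-class and
    between-class scatter matrices.
    """
    all_points = [p for points in classes.values() for p in points]
    dim = len(all_points[0])

    def mean(points):
        return [sum(p[k] for p in points) / len(points) for k in range(dim)]

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(all_points)
    within = 0.0
    between = 0.0
    for points in classes.values():
        centre = mean(points)
        within += sum(squared_distance(p, centre) for p in points)
        between += len(points) * squared_distance(centre, overall)
    return within, between

# Hypothetical two-class toy data.
toy = {"A": [[0.0, 0.0], [2.0, 0.0]], "B": [[10.0, 0.0], [12.0, 0.0]]}
within, between = scatter_traces(toy)
print(within, between)
```

A method that minimises within-class scatter while maximising between-class scatter favours projections in which the first number is small relative to the second.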
NASA Technical Reports Server (NTRS)
Lawton, Pat
2004-01-01
The objective of this work was to support the design of improved IUE NEWSIPS high dispersion extraction algorithms. The purpose of this work was to evaluate use of the Linearized Image (LIHI) file versus the Re-Sampled Image (SIHI) file, evaluate various extraction, and design algorithms for evaluation of IUE High Dispersion spectra. It was concluded the use of the Re-Sampled Image (SIHI) file was acceptable. Since the Gaussian profile worked well for the core and the Lorentzian profile worked well for the wings, the Voigt profile was chosen for use in the extraction algorithm. It was found that the gamma and sigma parameters varied significantly across the detector, so gamma and sigma masks for the SWP detector were developed. Extraction code was written.
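The Voigt profile mentioned above is the convolution of a Gaussian (width sigma) with a Lorentzian (half-width gamma), which is why it can match a Gaussian core and Lorentzian wings at once. The sketch below evaluates it by brute-force numerical convolution; this is for illustration only and is not the NEWSIPS extraction code, and the parameter values are arbitrary.

```python
import math

def gaussian(x, sigma):
    """Unit-area Gaussian with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def lorentzian(x, gamma):
    """Unit-area Lorentzian with half-width at half-maximum gamma."""
    return gamma / (math.pi * (x * x + gamma * gamma))

def voigt(x, sigma, gamma, steps=4000):
    """Voigt profile via trapezoidal convolution of the two profiles above."""
    lo, hi = -8.0 * sigma, 8.0 * sigma  # the Gaussian is negligible beyond this
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        t = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * gaussian(t, sigma) * lorentzian(x - t, gamma)
    return total * h

centre = voigt(0.0, 1.0, 1.0)
wing = voigt(6.0, 1.0, 1.0)
print(centre, wing, gaussian(6.0, 1.0))
```

Far from the line centre the Voigt value stays orders of magnitude above the pure Gaussian, reflecting the Lorentzian wings. In practice a closed-form evaluation through the Faddeeva function is usually preferred for speed.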
Polynomial interpretation of multipole vectors
NASA Astrophysics Data System (ADS)
Katz, Gabriel; Weeks, Jeff
2004-09-01
Copi, Huterer, Starkman, and Schwarz introduced multipole vectors in a tensor context and used them to demonstrate that the first-year Wilkinson microwave anisotropy probe (WMAP) quadrupole and octopole planes align at roughly the 99.9% confidence level. In the present article, the language of polynomials provides a new and independent derivation of the multipole vector concept. Bézout’s theorem supports an elementary proof that the multipole vectors exist and are unique (up to rescaling). The constructive nature of the proof leads to a fast, practical algorithm for computing multipole vectors. We illustrate the algorithm by finding exact solutions for some simple toy examples and numerical solutions for the first-year WMAP quadrupole and octopole. We then apply our algorithm to Monte Carlo skies to independently reconfirm the estimate that the WMAP quadrupole and octopole planes align at the 99.9% level.
DocBot: a novel clinical decision support algorithm.
Ninh, Andrew Q
2014-01-01
DocBot is a web-based clinical decision support system (CDSS) that uses patient interaction and electronic health record analytics to assist medical practitioners with decision making. It consists of two distinct HTML interfaces: a preclinical form wherein a patient inputs symptomatic and demographic information, and an interface wherein a medical practitioner views patient information and analysis. DocBot comprises an improved software architecture that uses patient information, electronic health records, and etiologically relevant binary decision questions (stored in a knowledgebase) to provide medical practitioners with information including, but not limited to medical assessments, treatment plans, and specialist referrals.
NASA Astrophysics Data System (ADS)
Salehi, B.; Mahdavi, S.; Brisco, B.; Huang, W.
2015-12-01
Both Synthetic Aperture RADAR (SAR) and optical imagery play a pivotal role in many applications. Thus it is desirable to fuse the two independent sources of data congruously. Many fusion methods, however, fail to consider the different nature of SAR and optical data. Moreover, it is not straightforward to adjust the contribution of the two data sources with respect to the application. Support Vector Machine (SVM) is a classification method that makes it possible to combine the two kinds of images while accounting for their different nature. It is particularly useful when object-based classification is used, in which case features extracted from SAR and optical images can be treated differently. This paper aims to develop an object-based classification method using both optical and SAR data which treats the two data sources independently. For the implementation of the method, a RapidEye image and a RADARSAT-2 quad-polarimetric image over the Avalon Peninsula in Newfoundland, Canada will be used for wetland classification. The RapidEye image will be segmented using the multiresolution algorithm in eCognition. Because of speckle, segmentation of SAR images does not give robust results. Thus the result of the segmentation of the RapidEye image is superimposed on the RADARSAT-2 image. Then useful SAR and optical features are extracted. Integrating features extracted from optical and SAR data, a compound kernel in SVM is applied for classification. This kernel is a weighted combination of two kernels, one for the features of each data source. Using a compound kernel can outperform using the same kernel for both images. The proposed method has two main advantages. First, the different nature of optical and SAR images, which results from dissimilar dynamic range, resolution, etc., is considered. Second, as the two data sources are combined with different weights, it is possible to adjust the role of each data source for varying applications.
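A compound kernel of the kind described above can be sketched as a weighted sum of one kernel per data source. The abstract does not name the component kernels or the weighting scheme, so the RBF components, the single `weight` parameter, and all feature values below are assumptions.

```python
import math

def rbf(x, y, gamma):
    """RBF kernel, assumed form exp(-gamma * squared Euclidean distance)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def compound_kernel(a, b, weight, gamma_optical, gamma_sar):
    """Weighted sum of an optical kernel and a SAR kernel.

    a and b are (optical_features, sar_features) pairs for two image objects;
    weight in [0, 1] is the share given to the optical kernel.
    """
    optical = rbf(a[0], b[0], gamma_optical)
    sar = rbf(a[1], b[1], gamma_sar)
    return weight * optical + (1.0 - weight) * sar

# Hypothetical objects: two optical features and one SAR feature each.
obj1 = ([0.2, 0.4], [1.0])
obj2 = ([0.2, 0.4], [3.0])
print(compound_kernel(obj1, obj2, weight=0.7, gamma_optical=1.0, gamma_sar=0.5))
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, changing `weight` shifts the influence of each sensor without breaking the SVM formulation.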
Kazemi, Fatemeh; Najafabadi, Tooraj Abbasian; Araabi, Babak Nadjar
2016-01-01
Acute myelogenous leukemia (AML) is a subtype of acute leukemia, which is characterized by the accumulation of myeloid blasts in the bone marrow. Careful microscopic examination of a stained blood smear or bone marrow aspirate is still the most significant diagnostic methodology for initial AML screening and is considered the first step toward diagnosis. It is time-consuming, and because of the elusive nature of the signs and symptoms of AML, pathologists may make a wrong diagnosis. Therefore, the need for automation of leukemia detection has arisen. In this paper, an automatic technique for identification and detection of AML and its prevalent subtypes, i.e., M2–M5, is presented. At first, microscopic images are acquired from blood smears of patients with AML and normal cases. After image preprocessing, a color segmentation strategy is applied to segment white blood cells from other blood components, and then discriminative features, i.e., irregularity, nucleus-cytoplasm ratio, Hausdorff dimension, shape, color, and texture features, are extracted from the entire nucleus in whole images containing multiple nuclei. Images are classified as cancerous or noncancerous by a binary support vector machine (SVM) classifier with 10-fold cross validation. Classifier performance is evaluated by three parameters, i.e., sensitivity, specificity, and accuracy. Cancerous images are also classified into their prevalent subtypes by a multi-SVM classifier. The results show that the proposed algorithm has achieved an acceptable performance for diagnosis of AML and its common subtypes. Therefore, it can be used as an assistant diagnostic tool for pathologists. PMID:27563575
Mao, Rui; Raj Kumar, Praveen Kumar; Guo, Cheng; Zhang, Yang; Liang, Chun
2014-01-01
One of the important modes of pre-mRNA post-transcriptional modification is alternative splicing. Alternative splicing allows creation of many distinct mature mRNA transcripts from a single gene by utilizing different splice sites. In plants like Arabidopsis thaliana, the most common type of alternative splicing is intron retention. Many past studies focus on the positional distribution of retained introns (RIs) among different genic regions and on the regulation of their expression, while little systematic classification of RIs from constitutively spliced introns (CSIs) has been conducted using machine learning approaches. We used random forest and support vector machine (SVM) with radial basis kernel function (RBF) to differentiate these two types of introns in Arabidopsis. By comparing coordinates of introns of all annotated mRNAs from TAIR10, we obtained our high-quality experimental data. To distinguish RIs from CSIs, we investigated the unique characteristics of RIs in comparison with CSIs and finally extracted 37 quantitative features: local and global nucleotide sequence features of introns, frequent motifs, the signal strength of splice sites, and the similarity between sequences of introns and their flanking regions. We demonstrated that our proposed feature extraction approach classified RIs from CSIs more accurately than four other approaches. The optimal penalty parameter C and the RBF kernel parameter in SVM were set based on a particle swarm optimization algorithm (PSOSVM). Our classification performance showed an F-Measure of 80.8% (random forest) and 77.4% (PSOSVM). Not only were the basic sequence features and positional distribution characteristics of RIs obtained, but putative regulatory motifs in intron splicing were also predicted based on our feature extraction approach. Our study will facilitate a better understanding of the underlying mechanisms involved in intron retention. PMID:25110928
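The parameter search mentioned above (particle swarm optimization over the SVM penalty C and the RBF kernel parameter) can be sketched as follows. This is a generic textbook PSO, not the authors' PSOSVM; the inertia and acceleration constants, the log10 search bounds, and the stand-in objective (used here in place of a real cross-validated error) are all assumptions.

```python
import random

def pso_minimise(objective, bounds, particles=15, iterations=40, seed=0):
    """Basic particle swarm minimiser; returns (best_position, best_value, initial_best)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    initial_best = g_val
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                lo, hi = bounds[d]
                # Inertia plus pulls toward the personal and global bests.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val, initial_best

def cv_error_stand_in(params):
    """Hypothetical smooth surrogate for cross-validated error in (log10 C, log10 gamma)."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

search_bounds = [(-2.0, 4.0), (-4.0, 1.0)]
best, value, start = pso_minimise(cv_error_stand_in, search_bounds)
print(best, value, start)
```

In a real setting the objective would train and cross-validate an SVM for each candidate pair, which makes each evaluation expensive and is why small swarms and few iterations are typical.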
NASA Astrophysics Data System (ADS)
Gatos, I.; Tsantis, S.; Karamesini, M.; Skouroliakou, A.; Kagadis, G.
2015-09-01
Purpose: The design and implementation of a computer-based image analysis system employing the support vector machine (SVM) classifier system for the classification of Focal Liver Lesions (FLLs) on routine non-enhanced, T2-weighted Magnetic Resonance (MR) images. Materials and Methods: The study comprised 92 patients, each of whom underwent MRI performed on a Magnetom Concerto (Siemens). Typical signs on dynamic contrast-enhanced MRI and biopsies were employed towards a three-class categorization of the 92 cases: 40 benign FLLs, 25 Hepatocellular Carcinomas (HCC) within Cirrhotic liver parenchyma and 27 liver metastases from Non-Cirrhotic liver. Prior to FLL classification, an automated lesion segmentation algorithm based on Markov Random Fields was employed in order to acquire each FLL Region of Interest. 42 texture features derived from the gray-level histogram, co-occurrence and run-length matrices and 12 morphological features were obtained from each lesion. Stepwise multi-linear regression analysis was utilized to avoid feature redundancy, leading to a feature subset that fed the multiclass SVM classifier designed for lesion classification. SVM system evaluation was performed by means of the leave-one-out method and ROC analysis. Results: Maximum accuracy for all three classes (90.0%) was obtained by means of the Radial Basis Kernel Function and three textural features (Inverse-Difference-Moment, Sum-Variance and Long-Run-Emphasis) that describe the lesion's contrast, variability and shape complexity. Sensitivity values for the three classes were 92.5%, 81.5% and 96.2% respectively, whereas specificity values were 94.2%, 95.3% and 95.5%. The AUC value achieved for the selected subset was 0.89 with a 0.81 - 0.94 confidence interval. Conclusion: The proposed SVM system exhibits promising results and could be utilized as a second-opinion tool for the radiologist in order to decrease the time/cost of diagnosis and the need for patients to undergo invasive examination.
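The per-class sensitivity and specificity values reported in this abstract are one-versus-rest quantities derived from a multiclass confusion matrix. A minimal sketch of that calculation follows; the function name and the example counts are hypothetical and are not the study's data.

```python
def per_class_sensitivity_specificity(confusion):
    """confusion[i][j] = number of cases of true class i predicted as class j.

    Returns one (sensitivity, specificity) pair per class, treating each
    class in turn as "positive" and all other classes as "negative".
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # class k missed
        fp = sum(confusion[i][k] for i in range(n)) - tp   # others called k
        tn = total - tp - fn - fp
        results.append((tp / (tp + fn), tn / (tn + fp)))
    return results

# Hypothetical three-class counts, for illustration only.
example = [
    [9, 1, 0],
    [1, 8, 1],
    [0, 2, 8],
]
for label, (sens, spec) in enumerate(per_class_sensitivity_specificity(example)):
    print(f"class {label}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```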
Kazemi, Fatemeh; Najafabadi, Tooraj Abbasian; Araabi, Babak Nadjar
2016-01-01
Acute myelogenous leukemia (AML) is a subtype of acute leukemia, which is characterized by the accumulation of myeloid blasts in the bone marrow. Careful microscopic examination of a stained blood smear or bone marrow aspirate is still the most significant diagnostic methodology for initial AML screening and is considered the first step toward diagnosis. It is time-consuming and, due to the elusive nature of the signs and symptoms of AML, pathologists may reach a wrong diagnosis. Therefore, the need for automation of leukemia detection has arisen. In this paper, an automatic technique for identification and detection of AML and its prevalent subtypes, i.e., M2-M5, is presented. First, microscopic images are acquired from blood smears of patients with AML and of normal cases. After image preprocessing, a color segmentation strategy is applied to segment white blood cells from other blood components, and then discriminative features, i.e., irregularity, nucleus-cytoplasm ratio, Hausdorff dimension, shape, color, and texture features, are extracted from the entire nucleus in the whole images containing multiple nuclei. Images are classified as cancerous or noncancerous by a binary support vector machine (SVM) classifier with a 10-fold cross-validation technique. Classifier performance is evaluated by three parameters, i.e., sensitivity, specificity, and accuracy. Cancerous images are also classified into their prevalent subtypes by a multi-SVM classifier. The results show that the proposed algorithm achieves an acceptable performance for diagnosis of AML and its common subtypes. Therefore, it can be used as an assistant diagnostic tool for pathologists. PMID:27563575
NASA Astrophysics Data System (ADS)
Zahir, N.; Mahdi, H.
2015-12-01
Lake Urmia, one of the most important ecosystems of the country, is on the verge of disappearing. Many factors contribute to this crisis, among which precipitation plays an important role. Precipitation takes many forms, one of which is snow. The snow on Sahand Mountain is one of the main sources of Lake Urmia's water. Snow Depth (SD) is a vital parameter for estimating the water balance for future years. In this regard, this study focuses on the SD parameter using Special Sensor Microwave/Imager (SSM/I) instruments on board the Defence Meteorological Satellite Program (DMSP) F16. The usual statistical methods for retrieving SD include linear and non-linear ones. These methods use a least squares procedure to estimate the SD model. Recently, kernel-based methods have been widely used for modelling statistical problems. Among these methods, support vector regression (SVR) achieves high performance. Examination of the obtained data shows the existence of outliers. To remove these outliers, a wavelet denoising method is applied. After the removal of the outliers, the optimum bands and parameters for SVR need to be selected. To address this, feature selection methods have shown a direct effect on improving regression performance. We used a genetic algorithm (GA) to select suitable features of the SSM/I bands in order to estimate the SD model. The results for the training and testing data on Sahand Mountain are [R²_TEST = 0.9049 and RMSE = 6.9654], which show the high performance of SVR.
Faridi, A; Sakomura, N K; Golian, A; Marcato, S M
2012-12-01
As a new modeling method, support vector regression (SVR) has been regarded as the state-of-the-art technique for regression and approximation. In this study, SVR models were introduced and developed to predict body and carcass-related characteristics of 2 strains of broiler chicken. To evaluate the prediction ability of SVR models, we compared their performance with that of neural network (NN) models. Evaluation of the prediction accuracy of models was based on the R², MS error, and bias. The variables of interest as model output were BW, empty BW, carcass, breast, drumstick, thigh, and wing weight in 2 strains of Ross and Cobb chickens based on intake dietary nutrients, including ME (kcal/bird per week), CP, TSAA, and Lys, all as grams per bird per week. A data set composed of 64 measurements taken from each strain was used for this analysis, where 44 data lines were used for model training, whereas the remaining 20 lines were used to test the created models. The results of this study revealed that it is possible to satisfactorily estimate the BW and carcass parts of the broiler chickens via their dietary nutrient intake. Through statistical criteria used to evaluate the performance of the SVR and NN models, the overall results demonstrate that the discussed models can be effective for accurate prediction of the body and carcass-related characteristics investigated here. However, the SVR method achieved better accuracy and generalization than the NN method. This indicates that the new data mining technique (SVR model) can be used as an alternative modeling tool for NN models. However, further reevaluation of this algorithm in the future is suggested.
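The three evaluation criteria named in this abstract (R², MS error, and bias) can be computed as in the sketch below. The abstract does not give the exact definitions used, so two assumptions are made here: R² is taken as the coefficient of determination (1 - SS_res / SS_tot), and bias is taken as the mean of (observed - predicted). The example values are made up.

```python
def regression_metrics(observed, predicted):
    """Return (r2, mse, bias) for paired observed and predicted values."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    mse = ss_res / n                          # mean squared error
    bias = sum(residuals) / n                 # assumed sign: observed - predicted
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot                # coefficient of determination
    return r2, mse, bias

# Made-up values, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
r2, mse, bias = regression_metrics(obs, pred)
print(f"R2={r2:.3f}  MSE={mse:.3f}  bias={bias:.3f}")
```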
Nie, Guoping; Li, Yong; Wang, Feichi; Wang, Siwen; Hu, Xuehai
2015-01-01
G-protein-coupled receptors (GPCRs) are seven membrane-spanning proteins and regulate many important physiological processes, such as vision, neurotransmission, immune response and so on. GPCR-related pathways are the targets of a large number of marketed drugs. Therefore, the design of a reliable computational model for predicting GPCRs from amino acid sequence has long been a significant biomedical problem. Chaos game representation (CGR) reveals the fractal patterns hidden in protein sequences, and fractal dimension (FD) is an important feature of these highly irregular geometries with a concise mathematical expression. Here, in order to extract important features from GPCR protein sequences, the CGR algorithm, fractal dimension and amino acid composition (AAC) are employed to formulate the numerical features of protein samples. Four groups of features are considered, and each group is evaluated by a support vector machine (SVM) and a 10-fold cross-validation test. To test the performance of the present method, a new non-redundant dataset was built based on the latest GPCRDB database. Comparing the results of numerical experiments, the group of combined features with AAC and FD gets the best result: the accuracy is 99.22% and the Matthews correlation coefficient (MCC) is 0.9845 for identifying GPCRs from non-GPCRs. Moreover, if a sequence is classified as a GPCR, it is passed to a second level, which classifies it into one of the five main subfamilies. At this level, the group of combined features with AAC and FD also gets the best accuracy, 85.73%. Finally, the proposed predictor is also compared with existing methods and shows better performance.
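Two of the quantities named in this abstract have simple closed forms and can be sketched directly: the amino acid composition (AAC) feature vector and the Matthews correlation coefficient (MCC). The function names, the alphabet ordering, and the handling of non-standard residues (ignored here) are assumptions for illustration, not details from the paper.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, assumed ordering

def amino_acid_composition(sequence):
    """Return the 20-dimensional vector of residue frequencies."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for ch in sequence.upper():
        if ch in counts:          # non-standard symbols are skipped (assumption)
            counts[ch] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from binary confusion counts; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

features = amino_acid_composition("ACAC")
print(features[:2], matthews_corrcoef(tp=45, tn=45, fp=5, fn=5))
```

In a pipeline like the one described, the AAC vector would be concatenated with the other feature groups before being passed to the classifier.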
Tan, Maxine; Pu, Jiantao; Zheng, Bin
2014-01-01
Purpose: Improving radiologists’ performance in classification between malignant and benign breast lesions is important to increase cancer detection sensitivity and reduce false-positive recalls. For this purpose, developing computer-aided diagnosis (CAD) schemes has been attracting research interest in recent years. In this study, we investigated a new feature selection method for the task of breast mass classification. Methods: We initially computed 181 image features based on mass shape, spiculation, contrast, presence of fat or calcifications, texture, isodensity, and other morphological features. From this large image feature pool, we used a sequential forward floating selection (SFFS)-based feature selection method to select relevant features, and analyzed their performance using a support vector machine (SVM) model trained for the classification task. On a database of 600 benign and 600 malignant mass regions of interest (ROIs), we performed the study using a ten-fold cross-validation method. Feature selection and optimization of the SVM parameters were conducted on the training subsets only. Results: The area under the receiver operating characteristic curve (AUC) = 0.805±0.012 was obtained for the classification task. The results also showed that the most frequently-selected features by the SFFS-based algorithm in 10-fold iterations were those related to mass shape, isodensity and presence of fat, which are consistent with the image features frequently used by radiologists in the clinical environment for mass classification. The study also indicated that accurately computing mass spiculation features from the projection mammograms was difficult, and failed to perform well for the mass classification task due to tissue overlap within the benign mass regions. Conclusions: In conclusion, this comprehensive feature analysis study provided new and valuable information for optimizing computerized mass classification schemes that may have potential to be
Alves, Julio Cesar L; Poppi, Ronei J
2013-01-30
This work verifies the potential of support vector machine (SVM) algorithm applied to near infrared (NIR) spectroscopy data to develop multivariate calibration models for determination of biodiesel content in diesel fuel blends that are more effective and appropriate for analytical determinations of this type of fuel nowadays, providing the usual extended analytical range with required accuracy. Considering the difficulty to develop suitable models for this type of determination in an extended analytical range and that, in practice, biodiesel/diesel fuel blends are nowadays most often used between 0 and 30% (v/v) of biodiesel content, a calibration model is suggested for the range 0-35% (v/v) of biodiesel in diesel blends. The possibility of using a calibration model for the range 0-100% (v/v) of biodiesel in diesel fuel blends was also investigated and the difficulty in obtaining adequate results for this full analytical range is discussed. The SVM models are compared with those obtained with PLS models. The best result was obtained by the SVM model using the spectral region 4400-4600 cm⁻¹ providing the RMSEP value of 0.11% in 0-35% biodiesel content calibration model. This model provides the determination of biodiesel content in agreement with the accuracy required by ABNT NBR and ASTM reference methods and without interference due to the presence of vegetable oil in the mixture. The best SVM model fit performance for the relationship studied is also verified by providing similar prediction results with the use of 4400-6200 cm⁻¹ spectral range while the PLS results are much worse over this spectral region. PMID:23597903
NASA Astrophysics Data System (ADS)
Heleno, Sandra; Matias, Magda; Pina, Pedro
2015-04-01
Visual interpretation of satellite imagery remains extremely demanding in terms of resources and time, especially when dealing with numerous multi-scale landslides affecting wide areas, as is the case for rainfall-induced shallow landslides. Applying automated methods can contribute to more efficient landslide mapping and updating of existing inventories, and in recent years the number and variety of approaches has been rapidly increasing. Very High Resolution (VHR) images, acquired by space-borne sensors with sub-metric precision, such as Ikonos, Quickbird, Geoeye and Worldview, are increasingly being considered as the best option for landslide mapping, but these new levels of spatial detail also present new challenges to state-of-the-art image analysis tools, calling for automated methods specifically suited to mapping landslide events on VHR optical images. In this work we develop and test a methodology for semi-automatic landslide recognition and mapping of landslide source and transport areas. The method combines object-based image analysis and a Support Vector Machine supervised learning algorithm, and was tested using a GeoEye-1 multispectral image, sensed 3 days after a damaging landslide event in Madeira Island, together with a pre-event LiDAR DEM. Our approach has proved successful in the recognition of landslides on a 15 km² study area, with 81 out of 85 landslides detected in its validation regions. The classifier also showed reasonable performance (true positive rate above 60% and false positive rate below 36% in both validation regions) in the internal mapping of landslide source and transport areas, in particular on the sunnier east-facing slopes. In the less illuminated areas the classifier is still able to accurately map the source areas, but performs poorly in the mapping of landslide transport areas.
Weis, Derick C; Visco, Donald P; Faulon, Jean-Loup
2008-11-01
The amount of high-throughput screening (HTS) data readily available has significantly increased because of the PubChem project (http://pubchem.ncbi.nlm.nih.gov/). There is considerable opportunity for data mining of small molecules for a variety of biological systems using cheminformatic tools and the resources available through PubChem. In this work, we trained a support vector machine (SVM) classifier using the Signature molecular descriptor on factor XIa inhibitor HTS data. The optimal number of Signatures was selected by implementing a feature selection algorithm of highly correlated clusters. Our method included an improvement that allowed clusters to work together for accuracy improvement, where previous methods have scored clusters on an individual basis. The resulting model had a 10-fold cross-validation accuracy of 89%, and additional validation was provided by two independent test sets. We applied the SVM to rapidly predict activity for approximately 12 million compounds also deposited in PubChem. Confidence in these predictions was assessed by considering the number of Signatures within the training set range for a given compound, defined as the overlap metric. To further evaluate compounds identified as active by the SVM, docking studies were performed using AutoDock. A focused database of compounds predicted to be active was obtained with several of the compounds appreciably dissimilar to those used in training the SVM. This focused database is suitable for further study. The data mining technique presented here is not specific to factor XIa inhibitors, and could be applied to other bioassays in PubChem where one is looking to expand the search for small molecules as chemical probes.
A parallel-vector algorithm for rapid structural analysis on high-performance computers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.
1990-01-01
A fast, accurate Choleski method for the solution of symmetric systems of linear equations is presented. This direct method is based on a variable-band storage scheme and takes advantage of column heights to reduce the number of operations in the Choleski factorization. The method employs parallel computation in the outermost DO-loop and vector computation via the loop unrolling technique in the innermost DO-loop. The method avoids computations with zeros outside the column heights, and as an option, zeros inside the band. The close relationship between Choleski and Gauss elimination methods is examined. The minor changes required to convert the Choleski code to a Gauss code to solve non-positive-definite symmetric systems of equations are identified. The results for two large-scale structural analyses performed on supercomputers demonstrate the accuracy and speed of the method.
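The idea of using column heights to skip zeros outside the profile can be sketched in a few lines. The code below is a serial illustration only, assuming a symmetric positive-definite matrix held as a dense list of lists for readability; it does not reproduce the paper's variable-band storage layout, parallel outer loop, or loop unrolling, and all function names are hypothetical.

```python
import math

def column_first_rows(a):
    """First row with a nonzero entry in each column of the upper triangle."""
    first = []
    for j in range(len(a)):
        i = 0
        while i < j and a[i][j] == 0:
            i += 1
        first.append(i)
    return first

def skyline_cholesky(a):
    """Factor a = U^T U, touching only entries inside each column's height."""
    n = len(a)
    first = column_first_rows(a)
    u = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(first[j], j + 1):
            start = max(first[i], first[j])   # skip zeros above either column
            s = a[i][j] - sum(u[k][i] * u[k][j] for k in range(start, i))
            if i < j:
                u[i][j] = s / u[i][i]
            else:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                u[j][j] = math.sqrt(s)
    return u, first

def solve(a, b):
    """Solve a x = b via forward (U^T y = b) and backward (U x = y) passes."""
    u, first = skyline_cholesky(a)
    n = len(b)
    y = list(b)
    for i in range(n):
        y[i] = (y[i] - sum(u[k][i] * y[k] for k in range(first[i], i))) / u[i][i]
    x = y[:]
    for j in range(n - 1, -1, -1):            # column-oriented back substitution
        x[j] /= u[j][j]
        for i in range(first[j], j):
            x[i] -= u[i][j] * x[j]
    return x

A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
print(solve(A, [6.0, 12.0, 14.0]))
```

Fill-in during the factorization stays within each column's height, which is why the inner sums can start at the larger of the two columns' first nonzero rows.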
Automatic retinal vessel classification using a Least Square-Support Vector Machine in VAMPIRE.
Relan, D; MacGillivray, T; Ballerini, L; Trucco, E
2014-01-01
It is important to classify retinal blood vessels into arterioles and venules for computerised analysis of the vasculature and to aid discovery of disease biomarkers. For instance, zone B is the standardised region of a retinal image utilised for the measurement of the arteriole to venule width ratio (AVR), a parameter indicative of microvascular health and systemic disease. We introduce a Least Square-Support Vector Machine (LS-SVM) classifier for the first time (to the best of our knowledge) to automatically label arterioles and venules. We use only 4 image features and consider vessels inside zone B (802 vessels from 70 fundus camera images) and in an extended zone (1,207 vessels, 70 fundus camera images). We achieve an accuracy of 94.88% and 93.96% in zone B and the extended zone, respectively, with a training set of 10 images and a testing set of 60 images. With a smaller training set of only 5 images and the same testing set we achieve an accuracy of 94.16% and 93.95%, respectively. This experiment was repeated five times by randomly choosing 10 and 5 images for the training set. The mean classification accuracies are close to the above-mentioned results. We conclude that the performance of our system is very promising and outperforms most recently reported systems. Our approach requires smaller training data sets compared to others but still results in a similar or higher classification rate.
Taghizadeh-Sarabi, Mitra; Daliri, Mohammad Reza; Niksirat, Kavous Salehzadeh
2015-01-01
Decoding and classification of objects through task-oriented electroencephalographic (EEG) signals are among the most crucial goals of recent research conducted mainly for brain-computer interface applications. In this study we aimed to classify single-trial recorded EEG signals into 12 categories. Ten subjects participated in this study. The task was to select target images among 12 basic object categories including animals, flowers, fruits, transportation devices, body organs, clothing, food, stationery, buildings, electronic devices, dolls and jewelry. In order to decode object categories, we considered several units, namely artifact removal, feature extraction, feature selection, and classification. Data were divided into training, validation, and test sets following the artifact removal process. Features were extracted using three different wavelets, namely Daubechies4, Haar, and Symlet2. Features were selected among training data and were reduced afterward via scalar feature selection using three criteria: t-test, entropy, and Bhattacharyya distance. Selected features were classified by the one-against-one support vector machine (SVM) multi-class classifier. The parameters of the SVM were optimized based on the training and validation sets. The classification performance (measured by means of accuracy) was approximately 80% for the animal and stationery categories. Moreover, Symlet2 and the t-test were selected as the better wavelet and selection criterion, respectively.
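Of the three wavelets named in this abstract, Haar is simple enough to sketch without a library. The example below performs an orthonormal Haar decomposition and summarizes each sub-band by its energy. The abstract does not say which statistics were computed from the wavelet coefficients, so the sub-band energy features, the sign convention of the detail coefficients, and the function names are all assumptions.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_energy_features(signal, levels):
    """Energy of each detail band (finest first), then of the final approximation."""
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in current))
    return features

# Toy signal, for illustration only.
feats = haar_energy_features([1.0, 3.0, 2.0, 2.0], levels=2)
print(feats)
```

Because the transform is orthonormal, the band energies add up to the energy of the original signal, which gives a quick sanity check on any implementation.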
Cerasa, Antonio; Castiglioni, Isabella; Salvatore, Christian; Funaro, Angela; Martino, Iolanda; Alfano, Stefania; Donzuso, Giulia; Perrotta, Paolo; Gioia, Maria Cecilia; Gilardi, Maria Carla; Quattrone, Aldo
2015-01-01
Presently, there are no valid biomarkers to identify individuals with eating disorders (ED). The aim of this work was to assess the feasibility of a machine learning method for extracting reliable neuroimaging features allowing individual categorization of patients with ED. The Support Vector Machine (SVM) technique, combined with a pattern recognition method, was applied to structural magnetic resonance images. Seventeen females with ED (six with a diagnosis of anorexia nervosa and 11 with bulimia nervosa) were compared against 17 body mass index-matched healthy controls (HC). Machine learning allowed individual diagnosis of ED versus HC with an Accuracy ≥ 0.80. Voxel-based pattern recognition analysis demonstrated that voxels influencing the classification Accuracy involved the occipital cortex, the posterior cerebellar lobule, precuneus, sensorimotor/premotor cortices, and the medial prefrontal cortex, all critical regions known to be strongly involved in the pathophysiological mechanisms of ED. Although these findings should be considered preliminary given the small sample size investigated, SVM analysis highlights the role of well-known brain regions as possible biomarkers to distinguish ED from HC at an individual level, thus encouraging the translational implementation of this new multivariate approach in clinical practice. PMID:26648660
Liu, Yi-Hung; Chen, Yan-Jen
2011-01-01
Defect detection has been considered an efficient way to increase the yield rate of panels in thin film transistor liquid crystal display (TFT-LCD) manufacturing. In this study we focus on the array process since it is the first and key process in TFT-LCD manufacturing. Various defects occur in the array process, and some of them could cause great damage to the LCD panels. Thus, designing a method that can robustly detect defects in images captured from the surface of LCD panels has become crucial. Previously, support vector data description (SVDD) has been successfully applied to LCD defect detection. However, its generalization performance is limited. In this paper, we propose a novel one-class machine learning method, called quasiconformal kernel SVDD (QK-SVDD), to address this issue. The QK-SVDD can significantly improve the generalization performance of the traditional SVDD by introducing the quasiconformal transformation into a predefined kernel. Experimental results, carried out on real LCD images provided by an LCD manufacturer in Taiwan, indicate that the proposed QK-SVDD not only obtains a high defect detection rate of 96%, but also greatly improves the generalization performance of SVDD. The improvement has been shown to be over 30%. In addition, results also show that the QK-SVDD defect detector is able to accomplish the task of defect detection on an LCD image within 60 ms. PMID:22016625