Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.
Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi
2013-01-01
The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification.
2013-05-28
those of the support vector machine and relevance vector machine, and the model runs more quickly than the other algorithms . When one class occurs...incremental support vector machine algorithm for online learning when fewer than 50 data points are available. (a) Papers published in peer-reviewed journals...learning environments, where data processing occurs one observation at a time and the classification algorithm improves over time with new
Predicting complications of percutaneous coronary intervention using a novel support vector method.
Lee, Gyemin; Gurm, Hitinder S; Syed, Zeeshan
2013-01-01
To explore the feasibility of a novel approach using an augmented one-class learning algorithm to model in-laboratory complications of percutaneous coronary intervention (PCI). Data from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) multicenter registry for the years 2007 and 2008 (n=41 016) were used to train models to predict 13 different in-laboratory PCI complications using a novel one-plus-class support vector machine (OP-SVM) algorithm. The performance of these models in terms of discrimination and calibration was compared to the performance of models trained using the following classification algorithms on BMC2 data from 2009 (n=20 289): logistic regression (LR), one-class support vector machine classification (OC-SVM), and two-class support vector machine classification (TC-SVM). For the OP-SVM and TC-SVM approaches, variants of the algorithms with cost-sensitive weighting were also considered. The OP-SVM algorithm and its cost-sensitive variant achieved the highest area under the receiver operating characteristic curve for the majority of the PCI complications studied (eight cases). Similar improvements were observed for the Hosmer-Lemeshow χ(2) value (seven cases) and the mean cross-entropy error (eight cases). The OP-SVM algorithm based on an augmented one-class learning problem improved discrimination and calibration across different PCI complications relative to LR and traditional support vector machine classification. Such an approach may have value in a broader range of clinical domains.
Predicting complications of percutaneous coronary intervention using a novel support vector method
Lee, Gyemin; Gurm, Hitinder S; Syed, Zeeshan
2013-01-01
Objective To explore the feasibility of a novel approach using an augmented one-class learning algorithm to model in-laboratory complications of percutaneous coronary intervention (PCI). Materials and methods Data from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) multicenter registry for the years 2007 and 2008 (n=41 016) were used to train models to predict 13 different in-laboratory PCI complications using a novel one-plus-class support vector machine (OP-SVM) algorithm. The performance of these models in terms of discrimination and calibration was compared to the performance of models trained using the following classification algorithms on BMC2 data from 2009 (n=20 289): logistic regression (LR), one-class support vector machine classification (OC-SVM), and two-class support vector machine classification (TC-SVM). For the OP-SVM and TC-SVM approaches, variants of the algorithms with cost-sensitive weighting were also considered. Results The OP-SVM algorithm and its cost-sensitive variant achieved the highest area under the receiver operating characteristic curve for the majority of the PCI complications studied (eight cases). Similar improvements were observed for the Hosmer–Lemeshow χ2 value (seven cases) and the mean cross-entropy error (eight cases). Conclusions The OP-SVM algorithm based on an augmented one-class learning problem improved discrimination and calibration across different PCI complications relative to LR and traditional support vector machine classification. Such an approach may have value in a broader range of clinical domains. PMID:23599229
Matrix Multiplication Algorithm Selection with Support Vector Machines
2015-05-01
libraries that could intelligently choose the optimal algorithm for a particular set of inputs. Users would be oblivious to the underlying algorithmic...SAT.” J. Artif . Intell. Res.(JAIR), vol. 32, pp. 565–606, 2008. [9] M. G. Lagoudakis and M. L. Littman, “Algorithm selection using reinforcement...Artificial Intelligence , vol. 21, no. 05, pp. 961–976, 2007. [15] C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector machines,” ACM
Testing of the Support Vector Machine for Binary-Class Classification
NASA Technical Reports Server (NTRS)
Scholten, Matthew
2011-01-01
The Support Vector Machine is a powerful algorithm, useful in classifying data in to species. The Support Vector Machines implemented in this research were used as classifiers for the final stage in a Multistage Autonomous Target Recognition system. A single kernel SVM known as SVMlight, and a modified version known as a Support Vector Machine with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SMV as a method for classification. From trial to trial, SVM produces consistent results
Algorithm for detection the QRS complexes based on support vector machine
NASA Astrophysics Data System (ADS)
Van, G. V.; Podmasteryev, K. V.
2017-11-01
The efficiency of computer ECG analysis depends on the accurate detection of QRS-complexes. This paper presents an algorithm for QRS complex detection based of support vector machine (SVM). The proposed algorithm is evaluated on annotated standard databases such as MIT-BIH Arrhythmia database. The QRS detector obtained a sensitivity Se = 98.32% and specificity Sp = 95.46% for MIT-BIH Arrhythmia database. This algorithm can be used as the basis for the software to diagnose electrical activity of the heart.
Community detection in complex networks using proximate support vector clustering
NASA Astrophysics Data System (ADS)
Wang, Feifan; Zhang, Baihai; Chai, Senchun; Xia, Yuanqing
2018-03-01
Community structure, one of the most attention attracting properties in complex networks, has been a cornerstone in advances of various scientific branches. A number of tools have been involved in recent studies concentrating on the community detection algorithms. In this paper, we propose a support vector clustering method based on a proximity graph, owing to which the introduced algorithm surpasses the traditional support vector approach both in accuracy and complexity. Results of extensive experiments undertaken on computer generated networks and real world data sets illustrate competent performances in comparison with the other counterparts.
Gradient Evolution-based Support Vector Machine Algorithm for Classification
NASA Astrophysics Data System (ADS)
Zulvia, Ferani E.; Kuo, R. J.
2018-03-01
This paper proposes a classification algorithm based on a support vector machine (SVM) and gradient evolution (GE) algorithms. SVM algorithm has been widely used in classification. However, its result is significantly influenced by the parameters. Therefore, this paper aims to propose an improvement of SVM algorithm which can find the best SVMs’ parameters automatically. The proposed algorithm employs a GE algorithm to automatically determine the SVMs’ parameters. The GE algorithm takes a role as a global optimizer in finding the best parameter which will be used by SVM algorithm. The proposed GE-SVM algorithm is verified using some benchmark datasets and compared with other metaheuristic-based SVM algorithms. The experimental results show that the proposed GE-SVM algorithm obtains better results than other algorithms tested in this paper.
Progressive Classification Using Support Vector Machines
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri; Kocurek, Michael
2009-01-01
An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified. The user can halt this reclassification process at any point, thereby obtaining the best possible result for a given amount of computation time. Alternatively, the results can be displayed as they are generated, providing the user with real-time feedback about the current accuracy of classification.
A hybrid approach to select features and classify diseases based on medical data
NASA Astrophysics Data System (ADS)
AbdelLatif, Hisham; Luo, Jiawei
2018-03-01
Feature selection is popular problem in the classification of diseases in clinical medicine. Here, we developing a hybrid methodology to classify diseases, based on three medical datasets, Arrhythmia, Breast cancer, and Hepatitis datasets. This methodology called k-means ANOVA Support Vector Machine (K-ANOVA-SVM) uses K-means cluster with ANOVA statistical to preprocessing data and selection the significant features, and Support Vector Machines in the classification process. To compare and evaluate the performance, we choice three classification algorithms, decision tree Naïve Bayes, Support Vector Machines and applied the medical datasets direct to these algorithms. Our methodology was a much better classification accuracy is given of 98% in Arrhythmia datasets, 92% in Breast cancer datasets and 88% in Hepatitis datasets, Compare to use the medical data directly with decision tree Naïve Bayes, and Support Vector Machines. Also, the ROC curve and precision with (K-ANOVA-SVM) Achieved best results than other algorithms
Sparse Solutions for Single Class SVMs: A Bi-Criterion Approach
NASA Technical Reports Server (NTRS)
Das, Santanu; Oza, Nikunj C.
2011-01-01
In this paper we propose an innovative learning algorithm - a variation of One-class nu Support Vector Machines (SVMs) learning algorithm to produce sparser solutions with much reduced computational complexities. The proposed technique returns an approximate solution, nearly as good as the solution set obtained by the classical approach, by minimizing the original risk function along with a regularization term. We introduce a bi-criterion optimization that helps guide the search towards the optimal set in much reduced time. The outcome of the proposed learning technique was compared with the benchmark one-class Support Vector machines algorithm which more often leads to solutions with redundant support vectors. Through out the analysis, the problem size for both optimization routines was kept consistent. We have tested the proposed algorithm on a variety of data sources under different conditions to demonstrate the effectiveness. In all cases the proposed algorithm closely preserves the accuracy of standard one-class nu SVMs while reducing both training time and test time by several factors.
Analysis of miRNA expression profile based on SVM algorithm
NASA Astrophysics Data System (ADS)
Ting-ting, Dai; Chang-ji, Shan; Yan-shou, Dong; Yi-duo, Bian
2018-05-01
Based on mirna expression spectrum data set, a new data mining algorithm - tSVM - KNN (t statistic with support vector machine - k nearest neighbor) is proposed. the idea of the algorithm is: firstly, the feature selection of the data set is carried out by the unified measurement method; Secondly, SVM - KNN algorithm, which combines support vector machine (SVM) and k - nearest neighbor (k - nearest neighbor) is used as classifier. Simulation results show that SVM - KNN algorithm has better classification ability than SVM and KNN alone. Tsvm - KNN algorithm only needs 5 mirnas to obtain 96.08 % classification accuracy in terms of the number of mirna " tags" and recognition accuracy. compared with similar algorithms, tsvm - KNN algorithm has obvious advantages.
Quantum Support Vector Machine for Big Data Classification
NASA Astrophysics Data System (ADS)
Rebentrost, Patrick; Mohseni, Masoud; Lloyd, Seth
2014-09-01
Supervised machine learning is the classification of new data based on already classified training examples. In this work, we show that the support vector machine, an optimized binary classifier, can be implemented on a quantum computer, with complexity logarithmic in the size of the vectors and the number of training examples. In cases where classical sampling algorithms require polynomial time, an exponential speedup is obtained. At the core of this quantum big data algorithm is a nonsparse matrix exponentiation technique for efficiently performing a matrix inversion of the training data inner-product (kernel) matrix.
Orthogonal vector algorithm to obtain the solar vector using the single-scattering Rayleigh model.
Wang, Yinlong; Chu, Jinkui; Zhang, Ran; Shi, Chao
2018-02-01
Information obtained from a polarization pattern in the sky provides many animals like insects and birds with vital long-distance navigation cues. The solar vector can be derived from the polarization pattern using the single-scattering Rayleigh model. In this paper, an orthogonal vector algorithm, which utilizes the redundancy of the single-scattering Rayleigh model, is proposed. We use the intersection angles between the polarization vectors as the main criteria in our algorithm. The assumption that all polarization vectors can be considered coplanar is used to simplify the three-dimensional (3D) problem with respect to the polarization vectors in our simulation. The surface-normal vector of the plane, which is determined by the polarization vectors after translation, represents the solar vector. Unfortunately, the two-directionality of the polarization vectors makes the resulting solar vector ambiguous. One important result of this study is, however, that this apparent disadvantage has no effect on the complexity of the algorithm. Furthermore, two other universal least-squares algorithms were investigated and compared. A device was then constructed, which consists of five polarized-light sensors as well as a 3D attitude sensor. Both the simulation and experimental data indicate that the orthogonal vector algorithms, if used with a suitable threshold, perform equally well or better than the other two algorithms. Our experimental data reveal that if the intersection angles between the polarization vectors are close to 90°, the solar-vector angle deviations are small. The data also support the assumption of coplanarity. During the 51 min experiment, the mean of the measured solar-vector angle deviations was about 0.242°, as predicted by our theoretical model.
Fernandez-Lozano, C.; Canto, C.; Gestal, M.; Andrade-Garda, J. M.; Rabuñal, J. R.; Dorado, J.; Pazos, A.
2013-01-01
Given the background of the use of Neural Networks in problems of apple juice classification, this paper aim at implementing a newly developed method in the field of machine learning: the Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using SVM as a fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected. PMID:24453933
USDA-ARS?s Scientific Manuscript database
Tillage management practices have direct impact on water holding capacity, evaporation, carbon sequestration, and water quality. This study examines the feasibility of two statistical learning algorithms, such as Least Square Support Vector Machine (LSSVM) and Relevance Vector Machine (RVM), for cla...
NASA Technical Reports Server (NTRS)
Garay, Michael J.; Mazzoni, Dominic; Davies, Roger; Wagstaff, Kiri
2004-01-01
Support Vector Machines (SVMs) are a type of supervised learning algorith,, other examples of which are Artificial Neural Networks (ANNs), Decision Trees, and Naive Bayesian Classifiers. Supervised learning algorithms are used to classify objects labled by a 'supervisor' - typically a human 'expert.'.
Pulmonary Nodule Recognition Based on Multiple Kernel Learning Support Vector Machine-PSO
Zhu, Zhichuan; Zhao, Qingdong; Liu, Liwei; Zhang, Lijuan
2018-01-01
Pulmonary nodule recognition is the core module of lung CAD. The Support Vector Machine (SVM) algorithm has been widely used in pulmonary nodule recognition, and the algorithm of Multiple Kernel Learning Support Vector Machine (MKL-SVM) has achieved good results therein. Based on grid search, however, the MKL-SVM algorithm needs long optimization time in course of parameter optimization; also its identification accuracy depends on the fineness of grid. In the paper, swarm intelligence is introduced and the Particle Swarm Optimization (PSO) is combined with MKL-SVM algorithm to be MKL-SVM-PSO algorithm so as to realize global optimization of parameters rapidly. In order to obtain the global optimal solution, different inertia weights such as constant inertia weight, linear inertia weight, and nonlinear inertia weight are applied to pulmonary nodules recognition. The experimental results show that the model training time of the proposed MKL-SVM-PSO algorithm is only 1/7 of the training time of the MKL-SVM grid search algorithm, achieving better recognition effect. Moreover, Euclidean norm of normalized error vector is proposed to measure the proximity between the average fitness curve and the optimal fitness curve after convergence. Through statistical analysis of the average of 20 times operation results with different inertial weights, it can be seen that the dynamic inertial weight is superior to the constant inertia weight in the MKL-SVM-PSO algorithm. In the dynamic inertial weight algorithm, the parameter optimization time of nonlinear inertia weight is shorter; the average fitness value after convergence is much closer to the optimal fitness value, which is better than the linear inertial weight. Besides, a better nonlinear inertial weight is verified. PMID:29853983
Pulmonary Nodule Recognition Based on Multiple Kernel Learning Support Vector Machine-PSO.
Li, Yang; Zhu, Zhichuan; Hou, Alin; Zhao, Qingdong; Liu, Liwei; Zhang, Lijuan
2018-01-01
Pulmonary nodule recognition is the core module of lung CAD. The Support Vector Machine (SVM) algorithm has been widely used in pulmonary nodule recognition, and the algorithm of Multiple Kernel Learning Support Vector Machine (MKL-SVM) has achieved good results therein. Based on grid search, however, the MKL-SVM algorithm needs long optimization time in course of parameter optimization; also its identification accuracy depends on the fineness of grid. In the paper, swarm intelligence is introduced and the Particle Swarm Optimization (PSO) is combined with MKL-SVM algorithm to be MKL-SVM-PSO algorithm so as to realize global optimization of parameters rapidly. In order to obtain the global optimal solution, different inertia weights such as constant inertia weight, linear inertia weight, and nonlinear inertia weight are applied to pulmonary nodules recognition. The experimental results show that the model training time of the proposed MKL-SVM-PSO algorithm is only 1/7 of the training time of the MKL-SVM grid search algorithm, achieving better recognition effect. Moreover, Euclidean norm of normalized error vector is proposed to measure the proximity between the average fitness curve and the optimal fitness curve after convergence. Through statistical analysis of the average of 20 times operation results with different inertial weights, it can be seen that the dynamic inertial weight is superior to the constant inertia weight in the MKL-SVM-PSO algorithm. In the dynamic inertial weight algorithm, the parameter optimization time of nonlinear inertia weight is shorter; the average fitness value after convergence is much closer to the optimal fitness value, which is better than the linear inertial weight. Besides, a better nonlinear inertial weight is verified.
NASA Astrophysics Data System (ADS)
Shastri, Niket; Pathak, Kamlesh
2018-05-01
The water vapor content in atmosphere plays very important role in climate. In this paper the application of GPS signal in meteorology is discussed, which is useful technique that is used to estimate the perceptible water vapor of atmosphere. In this paper various algorithms like artificial neural network, support vector machine and multiple linear regression are use to predict perceptible water vapor. The comparative studies in terms of root mean square error and mean absolute errors are also carried out for all the algorithms.
Extrapolation methods for vector sequences
NASA Technical Reports Server (NTRS)
Smith, David A.; Ford, William F.; Sidi, Avram
1987-01-01
This paper derives, describes, and compares five extrapolation methods for accelerating convergence of vector sequences or transforming divergent vector sequences to convergent ones. These methods are the scalar epsilon algorithm (SEA), vector epsilon algorithm (VEA), topological epsilon algorithm (TEA), minimal polynomial extrapolation (MPE), and reduced rank extrapolation (RRE). MPE and RRE are first derived and proven to give the exact solution for the right 'essential degree' k. Then, Brezinski's (1975) generalization of the Shanks-Schmidt transform is presented; the generalized form leads from systems of equations to TEA. The necessary connections are then made with SEA and VEA. The algorithms are extended to the nonlinear case by cycling, the error analysis for MPE and VEA is sketched, and the theoretical support for quadratic convergence is discussed. Strategies for practical implementation of the methods are considered.
nu-Anomica: A Fast Support Vector Based Novelty Detection Technique
NASA Technical Reports Server (NTRS)
Das, Santanu; Bhaduri, Kanishka; Oza, Nikunj C.; Srivastava, Ashok N.
2009-01-01
In this paper we propose nu-Anomica, a novel anomaly detection technique that can be trained on huge data sets with much reduced running time compared to the benchmark one-class Support Vector Machines algorithm. In -Anomica, the idea is to train the machine such that it can provide a close approximation to the exact decision plane using fewer training points and without losing much of the generalization performance of the classical approach. We have tested the proposed algorithm on a variety of continuous data sets under different conditions. We show that under all test conditions the developed procedure closely preserves the accuracy of standard one-class Support Vector Machines while reducing both the training time and the test time by 5 - 20 times.
Electricity Load Forecasting Using Support Vector Regression with Memetic Algorithms
Hu, Zhongyi; Xiong, Tao
2013-01-01
Electricity load forecasting is an important issue that is widely explored and examined in power systems operation literature and commercial transactions in electricity markets literature as well. Among the existing forecasting models, support vector regression (SVR) has gained much attention. Considering the performance of SVR highly depends on its parameters; this study proposed a firefly algorithm (FA) based memetic algorithm (FA-MA) to appropriately determine the parameters of SVR forecasting model. In the proposed FA-MA algorithm, the FA algorithm is applied to explore the solution space, and the pattern search is used to conduct individual learning and thus enhance the exploitation of FA. Experimental results confirm that the proposed FA-MA based SVR model can not only yield more accurate forecasting results than the other four evolutionary algorithms based SVR models and three well-known forecasting models but also outperform the hybrid algorithms in the related existing literature. PMID:24459425
Electricity load forecasting using support vector regression with memetic algorithms.
Hu, Zhongyi; Bao, Yukun; Xiong, Tao
2013-01-01
Electricity load forecasting is an important issue that is widely explored and examined in power systems operation literature and commercial transactions in electricity markets literature as well. Among the existing forecasting models, support vector regression (SVR) has gained much attention. Considering the performance of SVR highly depends on its parameters; this study proposed a firefly algorithm (FA) based memetic algorithm (FA-MA) to appropriately determine the parameters of SVR forecasting model. In the proposed FA-MA algorithm, the FA algorithm is applied to explore the solution space, and the pattern search is used to conduct individual learning and thus enhance the exploitation of FA. Experimental results confirm that the proposed FA-MA based SVR model can not only yield more accurate forecasting results than the other four evolutionary algorithms based SVR models and three well-known forecasting models but also outperform the hybrid algorithms in the related existing literature.
Signal detection using support vector machines in the presence of ultrasonic speckle
NASA Astrophysics Data System (ADS)
Kotropoulos, Constantine L.; Pitas, Ioannis
2002-04-01
Support Vector Machines are a general algorithm based on guaranteed risk bounds of statistical learning theory. They have found numerous applications, such as in classification of brain PET images, optical character recognition, object detection, face verification, text categorization and so on. In this paper we propose the use of support vector machines to segment lesions in ultrasound images and we assess thoroughly their lesion detection ability. We demonstrate that trained support vector machines with a Radial Basis Function kernel segment satisfactorily (unseen) ultrasound B-mode images as well as clinical ultrasonic images.
Integrating image quality in 2nu-SVM biometric match score fusion.
Vatsa, Mayank; Singh, Richa; Noore, Afzel
2007-10-01
This paper proposes an intelligent 2nu-support vector machine based match score fusion algorithm to improve the performance of face and iris recognition by integrating the quality of images. The proposed algorithm applies redundant discrete wavelet transform to evaluate the underlying linear and non-linear features present in the image. A composite quality score is computed to determine the extent of smoothness, sharpness, noise, and other pertinent features present in each subband of the image. The match score and the corresponding quality score of an image are fused using 2nu-support vector machine to improve the verification performance. The proposed algorithm is experimentally validated using the FERET face database and the CASIA iris database. The verification performance and statistical evaluation show that the proposed algorithm outperforms existing fusion algorithms.
Aircraft Engine Thrust Estimator Design Based on GSA-LSSVM
NASA Astrophysics Data System (ADS)
Sheng, Hanlin; Zhang, Tianhong
2017-08-01
In view of the necessity of highly precise and reliable thrust estimator to achieve direct thrust control of aircraft engine, based on support vector regression (SVR), as well as least square support vector machine (LSSVM) and a new optimization algorithm - gravitational search algorithm (GSA), by performing integrated modelling and parameter optimization, a GSA-LSSVM-based thrust estimator design solution is proposed. The results show that compared to particle swarm optimization (PSO) algorithm, GSA can find unknown optimization parameter better and enables the model developed with better prediction and generalization ability. The model can better predict aircraft engine thrust and thus fulfills the need of direct thrust control of aircraft engine.
Arana-Daniel, Nancy; Gallegos, Alberto A; López-Franco, Carlos; Alanís, Alma Y; Morales, Jacob; López-Franco, Adriana
2016-01-01
With the increasing power of computers, the amount of data that can be processed in small periods of time has grown exponentially, as has the importance of classifying large-scale data efficiently. Support vector machines have shown good results classifying large amounts of high-dimensional data, such as data generated by protein structure prediction, spam recognition, medical diagnosis, optical character recognition and text classification, etc. Most state of the art approaches for large-scale learning use traditional optimization methods, such as quadratic programming or gradient descent, which makes the use of evolutionary algorithms for training support vector machines an area to be explored. The present paper proposes an approach that is simple to implement based on evolutionary algorithms and Kernel-Adatron for solving large-scale classification problems, focusing on protein structure prediction. The functional properties of proteins depend upon their three-dimensional structures. Knowing the structures of proteins is crucial for biology and can lead to improvements in areas such as medicine, agriculture and biofuels.
Support vector machine firefly algorithm based optimization of lens system.
Shamshirband, Shahaboddin; Petković, Dalibor; Pavlović, Nenad T; Ch, Sudheer; Altameem, Torki A; Gani, Abdullah
2015-01-01
Lens system design is an important factor in image quality. The main aspect of the lens system design methodology is the optimization procedure. Since optimization is a complex, nonlinear task, soft computing optimization algorithms can be used. There are many tools that can be employed to measure optical performance, but the spot diagram is the most useful. The spot diagram gives an indication of the image of a point object. In this paper, the spot size radius is considered an optimization criterion. Intelligent soft computing scheme support vector machines (SVMs) coupled with the firefly algorithm (FFA) are implemented. The performance of the proposed estimators is confirmed with the simulation results. The result of the proposed SVM-FFA model has been compared with support vector regression (SVR), artificial neural networks, and generic programming methods. The results show that the SVM-FFA model performs more accurately than the other methodologies. Therefore, SVM-FFA can be used as an efficient soft computing technique in the optimization of lens system designs.
Polynomial interpretation of multipole vectors
NASA Astrophysics Data System (ADS)
Katz, Gabriel; Weeks, Jeff
2004-09-01
Copi, Huterer, Starkman, and Schwarz introduced multipole vectors in a tensor context and used them to demonstrate that the first-year Wilkinson microwave anisotropy probe (WMAP) quadrupole and octopole planes align at roughly the 99.9% confidence level. In the present article, the language of polynomials provides a new and independent derivation of the multipole vector concept. Bézout’s theorem supports an elementary proof that the multipole vectors exist and are unique (up to rescaling). The constructive nature of the proof leads to a fast, practical algorithm for computing multipole vectors. We illustrate the algorithm by finding exact solutions for some simple toy examples and numerical solutions for the first-year WMAP quadrupole and octopole. We then apply our algorithm to Monte Carlo skies to independently reconfirm the estimate that the WMAP quadrupole and octopole planes align at the 99.9% level.
A portable approach for PIC on emerging architectures
NASA Astrophysics Data System (ADS)
Decyk, Viktor
2016-03-01
A portable approach for designing Particle-in-Cell (PIC) algorithms on emerging exascale computers, is based on the recognition that 3 distinct programming paradigms are needed. They are: low level vector (SIMD) processing, middle level shared memory parallel programing, and high level distributed memory programming. In addition, there is a memory hierarchy associated with each level. Such algorithms can be initially developed using vectorizing compilers, OpenMP, and MPI. This is the approach recommended by Intel for the Phi processor. These algorithms can then be translated and possibly specialized to other programming models and languages, as needed. For example, the vector processing and shared memory programming might be done with CUDA instead of vectorizing compilers and OpenMP, but generally the algorithm itself is not greatly changed. The UCLA PICKSC web site at http://www.idre.ucla.edu/ contains example open source skeleton codes (mini-apps) illustrating each of these three programming models, individually and in combination. Fortran2003 now supports abstract data types, and design patterns can be used to support a variety of implementations within the same code base. Fortran2003 also supports interoperability with C so that implementations in C languages are also easy to use. Finally, main codes can be translated into dynamic environments such as Python, while still taking advantage of high performing compiled languages. Parallel languages are still evolving with interesting developments in co-Array Fortran, UPC, and OpenACC, among others, and these can also be supported within the same software architecture. Work supported by NSF and DOE Grants.
Support vector machine (SVM) was applied for land-cover characterization using MODIS time-series data. Classification performance was examined with respect to training sample size, sample variability, and landscape homogeneity (purity). The results were compared to two convention...
Extraction and classification of 3D objects from volumetric CT data
NASA Astrophysics Data System (ADS)
Song, Samuel M.; Kwon, Junghyun; Ely, Austin; Enyeart, John; Johnson, Chad; Lee, Jongkyu; Kim, Namho; Boyd, Douglas P.
2016-05-01
We propose an Automatic Threat Detection (ATD) algorithm for Explosive Detection System (EDS) using our multistage Segmentation Carving (SC) followed by Support Vector Machine (SVM) classifier. The multi-stage Segmentation and Carving (SC) step extracts all suspect 3-D objects. The feature vector is then constructed for all extracted objects and the feature vector is classified by the Support Vector Machine (SVM) previously learned using a set of ground truth threat and benign objects. The learned SVM classifier has shown to be effective in classification of different types of threat materials. The proposed ATD algorithm robustly deals with CT data that are prone to artifacts due to scatter, beam hardening as well as other systematic idiosyncrasies of the CT data. Furthermore, the proposed ATD algorithm is amenable for including newly emerging threat materials as well as for accommodating data from newly developing sensor technologies. Efficacy of the proposed ATD algorithm with the SVM classifier is demonstrated by the Receiver Operating Characteristics (ROC) curve that relates Probability of Detection (PD) as a function of Probability of False Alarm (PFA). The tests performed using CT data of passenger bags shows excellent performance characteristics.
Active Learning Using Hint Information.
Li, Chun-Liang; Ferng, Chun-Sung; Lin, Hsuan-Tien
2015-08-01
The abundance of real-world data and limited labeling budget calls for active learning, an important learning paradigm for reducing human labeling efforts. Many recently developed active learning algorithms consider both uncertainty and representativeness when making querying decisions. However, exploiting representativeness with uncertainty concurrently usually requires tackling sophisticated and challenging learning tasks, such as clustering. In this letter, we propose a new active learning framework, called hinted sampling, which takes both uncertainty and representativeness into account in a simpler way. We design a novel active learning algorithm within the hinted sampling framework with an extended support vector machine. Experimental results validate that the novel active learning algorithm can result in a better and more stable performance than that achieved by state-of-the-art algorithms. We also show that the hinted sampling framework allows improving another active learning algorithm designed from the transductive support vector machine.
Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.
Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo
2015-08-01
Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.
T-wave end detection using neural networks and Support Vector Machines.
Suárez-León, Alexander Alexeis; Varon, Carolina; Willems, Rik; Van Huffel, Sabine; Vázquez-Seisdedos, Carlos Román
2018-05-01
In this paper we propose a new approach for detecting the end of the T-wave in the electrocardiogram (ECG) using Neural Networks and Support Vector Machines. Both, Multilayer Perceptron (MLP) neural networks and Fixed-Size Least-Squares Support Vector Machines (FS-LSSVM) were used as regression algorithms to determine the end of the T-wave. Different strategies for selecting the training set such as random selection, k-means, robust clustering and maximum quadratic (Rényi) entropy were evaluated. Individual parameters were tuned for each method during training and the results are given for the evaluation set. A comparison between MLP and FS-LSSVM approaches was performed. Finally, a fair comparison of the FS-LSSVM method with other state-of-the-art algorithms for detecting the end of the T-wave was included. The experimental results show that FS-LSSVM approaches are more suitable as regression algorithms than MLP neural networks. Despite the small training sets used, the FS-LSSVM methods outperformed the state-of-the-art techniques. FS-LSSVM can be successfully used as a T-wave end detection algorithm in ECG even with small training set sizes. Copyright © 2018 Elsevier Ltd. All rights reserved.
Li, Ji; Hu, Guoqing; Zhou, Yonghong; Zou, Chong; Peng, Wei; Alam Sm, Jahangir
2016-10-14
A piezo-resistive pressure sensor is made of silicon, the nature of which is considerably influenced by ambient temperature. The effect of temperature should be eliminated during the working period in expectation of linear output. To deal with this issue, an approach consists of a hybrid kernel Least Squares Support Vector Machine (LSSVM) optimized by a chaotic ions motion algorithm presented. To achieve the learning and generalization for excellent performance, a hybrid kernel function, constructed by a local kernel as Radial Basis Function (RBF) kernel, and a global kernel as polynomial kernel is incorporated into the Least Squares Support Vector Machine. The chaotic ions motion algorithm is introduced to find the best hyper-parameters of the Least Squares Support Vector Machine. The temperature data from a calibration experiment is conducted to validate the proposed method. With attention on algorithm robustness and engineering applications, the compensation result shows the proposed scheme outperforms other compared methods on several performance measures as maximum absolute relative error, minimum absolute relative error mean and variance of the averaged value on fifty runs. Furthermore, the proposed temperature compensation approach lays a foundation for more extensive research.
Automated image segmentation using support vector machines
NASA Astrophysics Data System (ADS)
Powell, Stephanie; Magnotta, Vincent A.; Andreasen, Nancy C.
2007-03-01
Neurodegenerative and neurodevelopmental diseases demonstrate problems associated with brain maturation and aging. Automated methods to delineate brain structures of interest are required to analyze large amounts of imaging data like that being collected in several on going multi-center studies. We have previously reported on using artificial neural networks (ANN) to define subcortical brain structures including the thalamus (0.88), caudate (0.85) and the putamen (0.81). In this work, apriori probability information was generated using Thirion's demons registration algorithm. The input vector consisted of apriori probability, spherical coordinates, and an iris of surrounding signal intensity values. We have applied the support vector machine (SVM) machine learning algorithm to automatically segment subcortical and cerebellar regions using the same input vector information. SVM architecture was derived from the ANN framework. Training was completed using a radial-basis function kernel with gamma equal to 5.5. Training was performed using 15,000 vectors collected from 15 training images in approximately 10 minutes. The resulting support vectors were applied to delineate 10 images not part of the training set. Relative overlap calculated for the subcortical structures was 0.87 for the thalamus, 0.84 for the caudate, 0.84 for the putamen, and 0.72 for the hippocampus. Relative overlap for the cerebellar lobes ranged from 0.76 to 0.86. The reliability of the SVM based algorithm was similar to the inter-rater reliability between manual raters and can be achieved without rater intervention.
Support vector machine incremental learning triggered by wrongly predicted samples
NASA Astrophysics Data System (ADS)
Tang, Ting-long; Guan, Qiu; Wu, Yi-rong
2018-05-01
According to the classic Karush-Kuhn-Tucker (KKT) theorem, at every step of incremental support vector machine (SVM) learning, the newly adding sample which violates the KKT conditions will be a new support vector (SV) and migrate the old samples between SV set and non-support vector (NSV) set, and at the same time the learning model should be updated based on the SVs. However, it is not exactly clear at this moment that which of the old samples would change between SVs and NSVs. Additionally, the learning model will be unnecessarily updated, which will not greatly increase its accuracy but decrease the training speed. Therefore, how to choose the new SVs from old sets during the incremental stages and when to process incremental steps will greatly influence the accuracy and efficiency of incremental SVM learning. In this work, a new algorithm is proposed to select candidate SVs and use the wrongly predicted sample to trigger the incremental processing simultaneously. Experimental results show that the proposed algorithm can achieve good performance with high efficiency, high speed and good accuracy.
An Efficient Distributed Compressed Sensing Algorithm for Decentralized Sensor Network.
Liu, Jing; Huang, Kaiyu; Zhang, Guoxian
2017-04-20
We consider the joint sparsity Model 1 (JSM-1) in a decentralized scenario, where a number of sensors are connected through a network and there is no fusion center. A novel algorithm, named distributed compact sensing matrix pursuit (DCSMP), is proposed to exploit the computational and communication capabilities of the sensor nodes. In contrast to the conventional distributed compressed sensing algorithms adopting a random sensing matrix, the proposed algorithm focuses on the deterministic sensing matrices built directly on the real acquisition systems. The proposed DCSMP algorithm can be divided into two independent parts, the common and innovation support set estimation processes. The goal of the common support set estimation process is to obtain an estimated common support set by fusing the candidate support set information from an individual node and its neighboring nodes. In the following innovation support set estimation process, the measurement vector is projected into a subspace that is perpendicular to the subspace spanned by the columns indexed by the estimated common support set, to remove the impact of the estimated common support set. We can then search the innovation support set using an orthogonal matching pursuit (OMP) algorithm based on the projected measurement vector and projected sensing matrix. In the proposed DCSMP algorithm, the process of estimating the common component/support set is decoupled with that of estimating the innovation component/support set. Thus, the inaccurately estimated common support set will have no impact on estimating the innovation support set. It is proven that under the condition the estimated common support set contains the true common support set, the proposed algorithm can find the true innovation set correctly. Moreover, since the innovation support set estimation process is independent of the common support set estimation process, there is no requirement for the cardinality of both sets; thus, the proposed DCSMP algorithm is capable of tackling the unknown sparsity problem successfully.
Greedy Algorithms for Nonnegativity-Constrained Simultaneous Sparse Recovery
Kim, Daeun; Haldar, Justin P.
2016-01-01
This work proposes a family of greedy algorithms to jointly reconstruct a set of vectors that are (i) nonnegative and (ii) simultaneously sparse with a shared support set. The proposed algorithms generalize previous approaches that were designed to impose these constraints individually. Similar to previous greedy algorithms for sparse recovery, the proposed algorithms iteratively identify promising support indices. In contrast to previous approaches, the support index selection procedure has been adapted to prioritize indices that are consistent with both the nonnegativity and shared support constraints. Empirical results demonstrate for the first time that the combined use of simultaneous sparsity and nonnegativity constraints can substantially improve recovery performance relative to existing greedy algorithms that impose less signal structure. PMID:26973368
Agricultural mapping using Support Vector Machine-Based Endmember Extraction (SVM-BEE)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Archibald, Richard K; Filippi, Anthony M; Bhaduri, Budhendra L
Extracting endmembers from remotely sensed images of vegetated areas can present difficulties. In this research, we applied a recently developed endmember-extraction algorithm based on Support Vector Machines (SVMs) to the problem of semi-autonomous estimation of vegetation endmembers from a hyperspectral image. This algorithm, referred to as Support Vector Machine-Based Endmember Extraction (SVM-BEE), accurately and rapidly yields a computed representation of hyperspectral data that can accommodate multiple distributions. The number of distributions is identified without prior knowledge, based upon this representation. Prior work established that SVM-BEE is robustly noise-tolerant and can semi-automatically and effectively estimate endmembers; synthetic data and a geologicmore » scene were previously analyzed. Here we compared the efficacies of the SVM-BEE and N-FINDR algorithms in extracting endmembers from a predominantly agricultural scene. SVM-BEE was able to estimate vegetation and other endmembers for all classes in the image, which N-FINDR failed to do. Classifications based on SVM-BEE endmembers were markedly more accurate compared with those based on N-FINDR endmembers.« less
García Nieto, P J; Alonso Fernández, J R; de Cos Juez, F J; Sánchez Lasheras, F; Díaz Muñiz, C
2013-04-01
Cyanotoxins, a kind of poisonous substances produced by cyanobacteria, are responsible for health risks in drinking and recreational waters. As a result, anticipate its presence is a matter of importance to prevent risks. The aim of this study is to use a hybrid approach based on support vector regression (SVR) in combination with genetic algorithms (GAs), known as a genetic algorithm support vector regression (GA-SVR) model, in forecasting the cyanotoxins presence in the Trasona reservoir (Northern Spain). The GA-SVR approach is aimed at highly nonlinear biological problems with sharp peaks and the tests carried out proved its high performance. Some physical-chemical parameters have been considered along with the biological ones. The results obtained are two-fold. In the first place, the significance of each biological and physical-chemical variable on the cyanotoxins presence in the reservoir is determined with success. Finally, a predictive model able to forecast the possible presence of cyanotoxins in a short term was obtained. Copyright © 2013 Elsevier Inc. All rights reserved.
Agent Collaborative Target Localization and Classification in Wireless Sensor Networks
Wang, Xue; Bi, Dao-wei; Ding, Liang; Wang, Sheng
2007-01-01
Wireless sensor networks (WSNs) are autonomous networks that have been frequently deployed to collaboratively perform target localization and classification tasks. Their autonomous and collaborative features resemble the characteristics of agents. Such similarities inspire the development of heterogeneous agent architecture for WSN in this paper. The proposed agent architecture views WSN as multi-agent systems and mobile agents are employed to reduce in-network communication. According to the architecture, an energy based acoustic localization algorithm is proposed. In localization, estimate of target location is obtained by steepest descent search. The search algorithm adapts to measurement environments by dynamically adjusting its termination condition. With the agent architecture, target classification is accomplished by distributed support vector machine (SVM). Mobile agents are employed for feature extraction and distributed SVM learning to reduce communication load. Desirable learning performance is guaranteed by combining support vectors and convex hull vectors. Fusion algorithms are designed to merge SVM classification decisions made from various modalities. Real world experiments with MICAz sensor nodes are conducted for vehicle localization and classification. Experimental results show the proposed agent architecture remarkably facilitates WSN designs and algorithm implementation. The localization and classification algorithms also prove to be accurate and energy efficient.
Support Vector Machines: Relevance Feedback and Information Retrieval.
ERIC Educational Resources Information Center
Drucker, Harris; Shahrary, Behzad; Gibbon, David C.
2002-01-01
Compares support vector machines (SVMs) to Rocchio, Ide regular and Ide dec-hi algorithms in information retrieval (IR) of text documents using relevancy feedback. If the preliminary search is so poor that one has to search through many documents to find at least one relevant document, then SVM is preferred. Includes nine tables. (Contains 24…
Fuzzy support vector machines for adaptive Morse code recognition.
Yang, Cheng-Hong; Jin, Li-Cheng; Chuang, Li-Yeh
2006-11-01
Morse code is now being harnessed for use in rehabilitation applications of augmentative-alternative communication and assistive technology, facilitating mobility, environmental control and adapted worksite access. In this paper, Morse code is selected as a communication adaptive device for persons who suffer from muscle atrophy, cerebral palsy or other severe handicaps. A stable typing rate is strictly required for Morse code to be effective as a communication tool. Therefore, an adaptive automatic recognition method with a high recognition rate is needed. The proposed system uses both fuzzy support vector machines and the variable-degree variable-step-size least-mean-square algorithm to achieve these objectives. We apply fuzzy memberships to each point, and provide different contributions to the decision learning function for support vector machines. Statistical analyses demonstrated that the proposed method elicited a higher recognition rate than other algorithms in the literature.
NASA Astrophysics Data System (ADS)
Li, Chao; Yang, Sheng-Chao; Guo, Qiao-Sheng; Zheng, Kai-Yan; Wang, Ping-Li; Meng, Zhen-Gui
2016-01-01
A combination of Fourier transform infrared spectroscopy with chemometrics tools provided an approach for studying Marsdenia tenacissima according to its geographical origin. A total of 128 M. tenacissima samples from four provinces in China were analyzed with FTIR spectroscopy. Six pattern recognition methods were used to construct the discrimination models: support vector machine-genetic algorithms, support vector machine-particle swarm optimization, K-nearest neighbors, radial basis function neural network, random forest and support vector machine-grid search. Experimental results showed that K-nearest neighbors was superior to other mathematical algorithms after data were preprocessed with wavelet de-noising, with a discrimination rate of 100% in both the training and prediction sets. This study demonstrated that FTIR spectroscopy coupled with K-nearest neighbors could be successfully applied to determine the geographical origins of M. tenacissima samples, thereby providing reliable authentication in a rapid, cheap and noninvasive way.
NASA Technical Reports Server (NTRS)
Charlesworth, Arthur
1990-01-01
The nondeterministic divide partitions a vector into two non-empty slices by allowing the point of division to be chosen nondeterministically. Support for high-level divide-and-conquer programming provided by the nondeterministic divide is investigated. A diva algorithm is a recursive divide-and-conquer sequential algorithm on one or more vectors of the same range, whose division point for a new pair of recursive calls is chosen nondeterministically before any computation is performed and whose recursive calls are made immediately after the choice of division point; also, access to vector components is only permitted during activations in which the vector parameters have unit length. The notion of diva algorithm is formulated precisely as a diva call, a restricted call on a sequential procedure. Diva calls are proven to be intimately related to associativity. Numerous applications of diva calls are given and strategies are described for translating a diva call into code for a variety of parallel computers. Thus diva algorithms separate logical correctness concerns from implementation concerns.
Face recognition using total margin-based adaptive fuzzy support vector machines.
Liu, Yi-Hung; Chen, Yen-Ting
2007-01-01
This paper presents a new classifier called total margin-based adaptive fuzzy support vector machines (TAF-SVM) that deals with several problems that may occur in support vector machines (SVMs) when applied to the face recognition. The proposed TAF-SVM not only solves the overfitting problem resulted from the outlier with the approach of fuzzification of the penalty, but also corrects the skew of the optimal separating hyperplane due to the very imbalanced data sets by using different cost algorithm. In addition, by introducing the total margin algorithm to replace the conventional soft margin algorithm, a lower generalization error bound can be obtained. Those three functions are embodied into the traditional SVM so that the TAF-SVM is proposed and reformulated in both linear and nonlinear cases. By using two databases, the Chung Yuan Christian University (CYCU) multiview and the facial recognition technology (FERET) face databases, and using the kernel Fisher's discriminant analysis (KFDA) algorithm to extract discriminating face features, experimental results show that the proposed TAF-SVM is superior to SVM in terms of the face-recognition accuracy. The results also indicate that the proposed TAF-SVM can achieve smaller error variances than SVM over a number of tests such that better recognition stability can be obtained.
ERIC Educational Resources Information Center
Araya, Roberto; Plana, Francisco; Dartnell, Pablo; Soto-Andrade, Jorge; Luci, Gina; Salinas, Elena; Araya, Marylen
2012-01-01
Teacher practice is normally assessed by observers who watch classes or videos of classes. Here, we analyse an alternative strategy that uses text transcripts and a support vector machine classifier. For each one of the 710 videos of mathematics classes from the 2005 Chilean National Teacher Assessment Programme, a single 4-minute slice was…
Developing operation algorithms for vision subsystems in autonomous mobile robots
NASA Astrophysics Data System (ADS)
Shikhman, M. V.; Shidlovskiy, S. V.
2018-05-01
The paper analyzes algorithms for selecting keypoints on the image for the subsequent automatic detection of people and obstacles. The algorithm is based on the histogram of oriented gradients and the support vector method. The combination of these methods allows successful selection of dynamic and static objects. The algorithm can be applied in various autonomous mobile robots.
Chen, Yuantao; Xu, Weihong; Kuang, Fangjun; Gao, Shangbing
2013-01-01
The efficient target tracking algorithm researches have become current research focus of intelligent robots. The main problems of target tracking process in mobile robot face environmental uncertainty. They are very difficult to estimate the target states, illumination change, target shape changes, complex backgrounds, and other factors and all affect the occlusion in tracking robustness. To further improve the target tracking's accuracy and reliability, we present a novel target tracking algorithm to use visual saliency and adaptive support vector machine (ASVM). Furthermore, the paper's algorithm has been based on the mixture saliency of image features. These features include color, brightness, and sport feature. The execution process used visual saliency features and those common characteristics have been expressed as the target's saliency. Numerous experiments demonstrate the effectiveness and timeliness of the proposed target tracking algorithm in video sequences where the target objects undergo large changes in pose, scale, and illumination.
Design of Clinical Support Systems Using Integrated Genetic Algorithm and Support Vector Machine
NASA Astrophysics Data System (ADS)
Chen, Yung-Fu; Huang, Yung-Fa; Jiang, Xiaoyi; Hsu, Yuan-Nian; Lin, Hsuan-Hung
Clinical decision support system (CDSS) provides knowledge and specific information for clinicians to enhance diagnostic efficiency and improving healthcare quality. An appropriate CDSS can highly elevate patient safety, improve healthcare quality, and increase cost-effectiveness. Support vector machine (SVM) is believed to be superior to traditional statistical and neural network classifiers. However, it is critical to determine suitable combination of SVM parameters regarding classification performance. Genetic algorithm (GA) can find optimal solution within an acceptable time, and is faster than greedy algorithm with exhaustive searching strategy. By taking the advantage of GA in quickly selecting the salient features and adjusting SVM parameters, a method using integrated GA and SVM (IGS), which is different from the traditional method with GA used for feature selection and SVM for classification, was used to design CDSSs for prediction of successful ventilation weaning, diagnosis of patients with severe obstructive sleep apnea, and discrimination of different cell types form Pap smear. The results show that IGS is better than methods using SVM alone or linear discriminator.
Identification of handwriting by using the genetic algorithm (GA) and support vector machine (SVM)
NASA Astrophysics Data System (ADS)
Zhang, Qigui; Deng, Kai
2016-12-01
As portable digital camera and a camera phone comes more and more popular, and equally pressing is meeting the requirements of people to shoot at any time, to identify and storage handwritten character. In this paper, genetic algorithm(GA) and support vector machine(SVM)are used for identification of handwriting. Compare with parameters-optimized method, this technique overcomes two defects: first, it's easy to trap in the local optimum; second, finding the best parameters in the larger range will affects the efficiency of classification and prediction. As the experimental results suggest, GA-SVM has a higher recognition rate.
Optimization of Support Vector Machine (SVM) for Object Classification
NASA Technical Reports Server (NTRS)
Scholten, Matthew; Dhingra, Neil; Lu, Thomas T.; Chao, Tien-Hsin
2012-01-01
The Support Vector Machine (SVM) is a powerful algorithm, useful in classifying data into species. The SVMs implemented in this research were used as classifiers for the final stage in a Multistage Automatic Target Recognition (ATR) system. A single kernel SVM known as SVMlight, and a modified version known as a SVM with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification. From trial to trial, SVM produces consistent results.
Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.
Polat, Huseyin; Danaei Mehr, Homay; Cetin, Aydin
2017-04-01
As Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only cure to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because of their classification ability with high accuracy rates. The accuracy of classification algorithms depend on the use of correct feature selection algorithms to reduce the dimension of datasets. In this study, Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. To diagnose the Chronic Kidney Disease, two essential types of feature selection methods namely, wrapper and filter approaches were chosen to reduce the dimension of Chronic Kidney Disease dataset. In wrapper approach, classifier subset evaluator with greedy stepwise search engine and wrapper subset evaluator with the Best First search engine were used. In filter approach, correlation feature selection subset evaluator with greedy stepwise search engine and filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier by using filtered subset evaluator with the Best First search engine feature selection method has higher accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease compared to other selected methods.
Fuzzy Nonlinear Proximal Support Vector Machine for Land Extraction Based on Remote Sensing Image
Zhong, Xiaomei; Li, Jianping; Dou, Huacheng; Deng, Shijun; Wang, Guofei; Jiang, Yu; Wang, Yongjie; Zhou, Zebing; Wang, Li; Yan, Fei
2013-01-01
Currently, remote sensing technologies were widely employed in the dynamic monitoring of the land. This paper presented an algorithm named fuzzy nonlinear proximal support vector machine (FNPSVM) by basing on ETM+ remote sensing image. This algorithm is applied to extract various types of lands of the city Da’an in northern China. Two multi-category strategies, namely “one-against-one” and “one-against-rest” for this algorithm were described in detail and then compared. A fuzzy membership function was presented to reduce the effects of noises or outliers on the data samples. The approaches of feature extraction, feature selection, and several key parameter settings were also given. Numerous experiments were carried out to evaluate its performances including various accuracies (overall accuracies and kappa coefficient), stability, training speed, and classification speed. The FNPSVM classifier was compared to the other three classifiers including the maximum likelihood classifier (MLC), back propagation neural network (BPN), and the proximal support vector machine (PSVM) under different training conditions. The impacts of the selection of training samples, testing samples and features on the four classifiers were also evaluated in these experiments. PMID:23936016
A Power Transformers Fault Diagnosis Model Based on Three DGA Ratios and PSO Optimization SVM
NASA Astrophysics Data System (ADS)
Ma, Hongzhe; Zhang, Wei; Wu, Rongrong; Yang, Chunyan
2018-03-01
In order to make up for the shortcomings of existing transformer fault diagnosis methods in dissolved gas-in-oil analysis (DGA) feature selection and parameter optimization, a transformer fault diagnosis model based on the three DGA ratios and particle swarm optimization (PSO) optimize support vector machine (SVM) is proposed. Using transforming support vector machine to the nonlinear and multi-classification SVM, establishing the particle swarm optimization to optimize the SVM multi classification model, and conducting transformer fault diagnosis combined with the cross validation principle. The fault diagnosis results show that the average accuracy of test method is better than the standard support vector machine and genetic algorithm support vector machine, and the proposed method can effectively improve the accuracy of transformer fault diagnosis is proved.
NASA Technical Reports Server (NTRS)
Buchholz, Peter; Ciardo, Gianfranco; Donatelli, Susanna; Kemper, Peter
1997-01-01
We present a systematic discussion of algorithms to multiply a vector by a matrix expressed as the Kronecker product of sparse matrices, extending previous work in a unified notational framework. Then, we use our results to define new algorithms for the solution of large structured Markov models. In addition to a comprehensive overview of existing approaches, we give new results with respect to: (1) managing certain types of state-dependent behavior without incurring extra cost; (2) supporting both Jacobi-style and Gauss-Seidel-style methods by appropriate multiplication algorithms; (3) speeding up algorithms that consider probability vectors of size equal to the "actual" state space instead of the "potential" state space.
Protein Kinase Classification with 2866 Hidden Markov Models and One Support Vector Machine
NASA Technical Reports Server (NTRS)
Weber, Ryan; New, Michael H.; Fonda, Mark (Technical Monitor)
2002-01-01
The main application considered in this paper is predicting true kinases from randomly permuted kinases that share the same length and amino acid distributions as the true kinases. Numerous methods already exist for this classification task, such as HMMs, motif-matchers, and sequence comparison algorithms. We build on some of these efforts by creating a vector from the output of thousands of structurally based HMMs, created offline with Pfam-A seed alignments using SAM-T99, which then must be combined into an overall classification for the protein. Then we use a Support Vector Machine for classifying this large ensemble Pfam-Vector, with a polynomial and chisquared kernel. In particular, the chi-squared kernel SVM performs better than the HMMs and better than the BLAST pairwise comparisons, when predicting true from false kinases in some respects, but no one algorithm is best for all purposes or in all instances so we consider the particular strengths and weaknesses of each.
NASA Astrophysics Data System (ADS)
Ye, Su; Chen, Dongmei; Yu, Jie
2016-04-01
In remote sensing, conventional supervised change-detection methods usually require effective training data for multiple change types. This paper introduces a more flexible and efficient procedure that seeks to identify only the changes that users are interested in, here after referred to as "targeted change detection". Based on a one-class classifier "Support Vector Domain Description (SVDD)", a novel algorithm named "Three-layer SVDD Fusion (TLSF)" is developed specially for targeted change detection. The proposed algorithm combines one-class classification generated from change vector maps, as well as before- and after-change images in order to get a more reliable detecting result. In addition, this paper introduces a detailed workflow for implementing this algorithm. This workflow has been applied to two case studies with different practical monitoring objectives: urban expansion and forest fire assessment. The experiment results of these two case studies show that the overall accuracy of our proposed algorithm is superior (Kappa statistics are 86.3% and 87.8% for Case 1 and 2, respectively), compared to applying SVDD to change vector analysis and post-classification comparison.
Thrust vector control algorithm design for the Cassini spacecraft
NASA Technical Reports Server (NTRS)
Enright, Paul J.
1993-01-01
This paper describes a preliminary design of the thrust vector control algorithm for the interplanetary spacecraft, Cassini. Topics of discussion include flight software architecture, modeling of sensors, actuators, and vehicle dynamics, and controller design and analysis via classical methods. Special attention is paid to potential interactions with structural flexibilities and propellant dynamics. Controller performance is evaluated in a simulation environment built around a multi-body dynamics model, which contains nonlinear models of the relevant hardware and preliminary versions of supporting attitude determination and control functions.
Chen, Zhiru; Hong, Wenxue
2016-02-01
Considering the low accuracy of prediction in the positive samples and poor overall classification effects caused by unbalanced sample data of MicroRNA (miRNA) target, we proposes a support vector machine (SVM)-integration of under-sampling and weight (IUSM) algorithm in this paper, an under-sampling based on the ensemble learning algorithm. The algorithm adopts SVM as learning algorithm and AdaBoost as integration framework, and embeds clustering-based under-sampling into the iterative process, aiming at reducing the degree of unbalanced distribution of positive and negative samples. Meanwhile, in the process of adaptive weight adjustment of the samples, the SVM-IUSM algorithm eliminates the abnormal ones in negative samples with robust sample weights smoothing mechanism so as to avoid over-learning. Finally, the prediction of miRNA target integrated classifier is achieved with the combination of multiple weak classifiers through the voting mechanism. The experiment revealed that the SVM-IUSW, compared with other algorithms on unbalanced dataset collection, could not only improve the accuracy of positive targets and the overall effect of classification, but also enhance the generalization ability of miRNA target classifier.
An assessment of support vector machines for land cover classification
Huang, C.; Davis, L.S.; Townshend, J.R.G.
2002-01-01
The support vector machine (SVM) is a group of theoretically superior machine learning algorithms. It was found competitive with the best available machine learning algorithms in classifying high-dimensional data sets. This paper gives an introduction to the theoretical development of the SVM and an experimental evaluation of its accuracy, stability and training speed in deriving land cover classifications from satellite images. The SVM was compared to three other popular classifiers, including the maximum likelihood classifier (MLC), neural network classifiers (NNC) and decision tree classifiers (DTC). The impacts of kernel configuration on the performance of the SVM and of the selection of training data and input variables on the four classifiers were also evaluated in this experiment.
SOLAR FLARE PREDICTION USING SDO/HMI VECTOR MAGNETIC FIELD DATA WITH A MACHINE-LEARNING ALGORITHM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bobra, M. G.; Couvidat, S., E-mail: couvidat@stanford.edu
2015-01-10
We attempt to forecast M- and X-class solar flares using a machine-learning algorithm, called support vector machine (SVM), and four years of data from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager, the first instrument to continuously map the full-disk photospheric vector magnetic field from space. Most flare forecasting efforts described in the literature use either line-of-sight magnetograms or a relatively small number of ground-based vector magnetograms. This is the first time a large data set of vector magnetograms has been used to forecast solar flares. We build a catalog of flaring and non-flaring active regions sampled from a databasemore » of 2071 active regions, comprised of 1.5 million active region patches of vector magnetic field data, and characterize each active region by 25 parameters. We then train and test the machine-learning algorithm and we estimate its performances using forecast verification metrics with an emphasis on the true skill statistic (TSS). We obtain relatively high TSS scores and overall predictive abilities. We surmise that this is partly due to fine-tuning the SVM for this purpose and also to an advantageous set of features that can only be calculated from vector magnetic field data. We also apply a feature selection algorithm to determine which of our 25 features are useful for discriminating between flaring and non-flaring active regions and conclude that only a handful are needed for good predictive abilities.« less
Using Time Series Analysis to Predict Cardiac Arrest in a PICU.
Kennedy, Curtis E; Aoki, Noriaki; Mariscalco, Michele; Turley, James P
2015-11-01
To build and test cardiac arrest prediction models in a PICU, using time series analysis as input, and to measure changes in prediction accuracy attributable to different classes of time series data. Retrospective cohort study. Thirty-one bed academic PICU that provides care for medical and general surgical (not congenital heart surgery) patients. Patients experiencing a cardiac arrest in the PICU and requiring external cardiac massage for at least 2 minutes. None. One hundred three cases of cardiac arrest and 109 control cases were used to prepare a baseline dataset that consisted of 1,025 variables in four data classes: multivariate, raw time series, clinical calculations, and time series trend analysis. We trained 20 arrest prediction models using a matrix of five feature sets (combinations of data classes) with four modeling algorithms: linear regression, decision tree, neural network, and support vector machine. The reference model (multivariate data with regression algorithm) had an accuracy of 78% and 87% area under the receiver operating characteristic curve. The best model (multivariate + trend analysis data with support vector machine algorithm) had an accuracy of 94% and 98% area under the receiver operating characteristic curve. Cardiac arrest predictions based on a traditional model built with multivariate data and a regression algorithm misclassified cases 3.7 times more frequently than predictions that included time series trend analysis and built with a support vector machine algorithm. Although the final model lacks the specificity necessary for clinical application, we have demonstrated how information from time series data can be used to increase the accuracy of clinical prediction models.
Stochastic subset selection for learning with kernel machines.
Rhinelander, Jason; Liu, Xiaoping P
2012-06-01
Kernel machines have gained much popularity in applications of machine learning. Support vector machines (SVMs) are a subset of kernel machines and generalize well for classification, regression, and anomaly detection tasks. The training procedure for traditional SVMs involves solving a quadratic programming (QP) problem. The QP problem scales super linearly in computational effort with the number of training samples and is often used for the offline batch processing of data. Kernel machines operate by retaining a subset of observed data during training. The data vectors contained within this subset are referred to as support vectors (SVs). The work presented in this paper introduces a subset selection method for the use of kernel machines in online, changing environments. Our algorithm works by using a stochastic indexing technique when selecting a subset of SVs when computing the kernel expansion. The work described here is novel because it separates the selection of kernel basis functions from the training algorithm used. The subset selection algorithm presented here can be used in conjunction with any online training technique. It is important for online kernel machines to be computationally efficient due to the real-time requirements of online environments. Our algorithm is an important contribution because it scales linearly with the number of training samples and is compatible with current training techniques. Our algorithm outperforms standard techniques in terms of computational efficiency and provides increased recognition accuracy in our experiments. We provide results from experiments using both simulated and real-world data sets to verify our algorithm.
Distributed support vector machine in master-slave mode.
Chen, Qingguo; Cao, Feilong
2018-05-01
It is well known that the support vector machine (SVM) is an effective learning algorithm. The alternating direction method of multipliers (ADMM) algorithm has emerged as a powerful technique for solving distributed optimisation models. This paper proposes a distributed SVM algorithm in a master-slave mode (MS-DSVM), which integrates a distributed SVM and ADMM acting in a master-slave configuration where the master node and slave nodes are connected, meaning the results can be broadcasted. The distributed SVM is regarded as a regularised optimisation problem and modelled as a series of convex optimisation sub-problems that are solved by ADMM. Additionally, the over-relaxation technique is utilised to accelerate the convergence rate of the proposed MS-DSVM. Our theoretical analysis demonstrates that the proposed MS-DSVM has linear convergence, meaning it possesses the fastest convergence rate among existing standard distributed ADMM algorithms. Numerical examples demonstrate that the convergence and accuracy of the proposed MS-DSVM are superior to those of existing methods under the ADMM framework. Copyright © 2018 Elsevier Ltd. All rights reserved.
Mao, Yong; Zhou, Xiao-Bo; Pi, Dao-Ying; Sun, You-Xian; Wong, Stephen T C
2005-10-01
In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables and small number of samples as well as its non-linearity. It is difficult to get satisfying results by using conventional linear statistical methods. Recursive feature elimination based on support vector machine (SVM RFE) is an effective algorithm for gene selection and cancer classification, which are integrated into a consistent framework. In this paper, we propose a new method to select parameters of the aforementioned algorithm implemented with Gaussian kernel SVMs as better alternatives to the common practice of selecting the apparently best parameters by using a genetic algorithm to search for a couple of optimal parameter. Fast implementation issues for this method are also discussed for pragmatic reasons. The proposed method was tested on two representative hereditary breast cancer and acute leukaemia datasets. The experimental results indicate that the proposed method performs well in selecting genes and achieves high classification accuracies with these genes.
Efficient irregular wavefront propagation algorithms on Intel® Xeon Phi™
Gomes, Jeremias M.; Teodoro, George; de Melo, Alba; Kong, Jun; Kurc, Tahsin; Saltz, Joel H.
2016-01-01
We investigate the execution of the Irregular Wavefront Propagation Pattern (IWPP), a fundamental computing structure used in several image analysis operations, on the Intel® Xeon Phi™ co-processor. An efficient implementation of IWPP on the Xeon Phi is a challenging problem because of IWPP’s irregularity and the use of atomic instructions in the original IWPP algorithm to resolve race conditions. On the Xeon Phi, the use of SIMD and vectorization instructions is critical to attain high performance. However, SIMD atomic instructions are not supported. Therefore, we propose a new IWPP algorithm that can take advantage of the supported SIMD instruction set. We also evaluate an alternate storage container (priority queue) to track active elements in the wavefront in an effort to improve the parallel algorithm efficiency. The new IWPP algorithm is evaluated with Morphological Reconstruction and Imfill operations as use cases. Our results show performance improvements of up to 5.63× on top of the original IWPP due to vectorization. Moreover, the new IWPP achieves speedups of 45.7× and 1.62×, respectively, as compared to efficient CPU and GPU implementations. PMID:27298591
Efficient irregular wavefront propagation algorithms on Intel® Xeon Phi™.
Gomes, Jeremias M; Teodoro, George; de Melo, Alba; Kong, Jun; Kurc, Tahsin; Saltz, Joel H
2015-10-01
We investigate the execution of the Irregular Wavefront Propagation Pattern (IWPP), a fundamental computing structure used in several image analysis operations, on the Intel ® Xeon Phi ™ co-processor. An efficient implementation of IWPP on the Xeon Phi is a challenging problem because of IWPP's irregularity and the use of atomic instructions in the original IWPP algorithm to resolve race conditions. On the Xeon Phi, the use of SIMD and vectorization instructions is critical to attain high performance. However, SIMD atomic instructions are not supported. Therefore, we propose a new IWPP algorithm that can take advantage of the supported SIMD instruction set. We also evaluate an alternate storage container (priority queue) to track active elements in the wavefront in an effort to improve the parallel algorithm efficiency. The new IWPP algorithm is evaluated with Morphological Reconstruction and Imfill operations as use cases. Our results show performance improvements of up to 5.63 × on top of the original IWPP due to vectorization. Moreover, the new IWPP achieves speedups of 45.7 × and 1.62 × , respectively, as compared to efficient CPU and GPU implementations.
Virtual screening by a new Clustering-based Weighted Similarity Extreme Learning Machine approach
Kudisthalert, Wasu
2018-01-01
Machine learning techniques are becoming popular in virtual screening tasks. One of the powerful machine learning algorithms is Extreme Learning Machine (ELM) which has been applied to many applications and has recently been applied to virtual screening. We propose the Weighted Similarity ELM (WS-ELM) which is based on a single layer feed-forward neural network in a conjunction of 16 different similarity coefficients as activation function in the hidden layer. It is known that the performance of conventional ELM is not robust due to random weight selection in the hidden layer. Thus, we propose a Clustering-based WS-ELM (CWS-ELM) that deterministically assigns weights by utilising clustering algorithms i.e. k-means clustering and support vector clustering. The experiments were conducted on one of the most challenging datasets–Maximum Unbiased Validation Dataset–which contains 17 activity classes carefully selected from PubChem. The proposed algorithms were then compared with other machine learning techniques such as support vector machine, random forest, and similarity searching. The results show that CWS-ELM in conjunction with support vector clustering yields the best performance when utilised together with Sokal/Sneath(1) coefficient. Furthermore, ECFP_6 fingerprint presents the best results in our framework compared to the other types of fingerprints, namely ECFP_4, FCFP_4, and FCFP_6. PMID:29652912
General Quantum Meet-in-the-Middle Search Algorithm Based on Target Solution of Fixed Weight
NASA Astrophysics Data System (ADS)
Fu, Xiang-Qun; Bao, Wan-Su; Wang, Xiang; Shi, Jian-Hong
2016-10-01
Similar to the classical meet-in-the-middle algorithm, the storage and computation complexity are the key factors that decide the efficiency of the quantum meet-in-the-middle algorithm. Aiming at the target vector of fixed weight, based on the quantum meet-in-the-middle algorithm, the algorithm for searching all n-product vectors with the same weight is presented, whose complexity is better than the exhaustive search algorithm. And the algorithm can reduce the storage complexity of the quantum meet-in-the-middle search algorithm. Then based on the algorithm and the knapsack vector of the Chor-Rivest public-key crypto of fixed weight d, we present a general quantum meet-in-the-middle search algorithm based on the target solution of fixed weight, whose computational complexity is \\sumj = 0d {(O(\\sqrt {Cn - k + 1d - j }) + O(C_kj log C_k^j))} with Σd i =0 Ck i memory cost. And the optimal value of k is given. Compared to the quantum meet-in-the-middle search algorithm for knapsack problem and the quantum algorithm for searching a target solution of fixed weight, the computational complexity of the algorithm is lower. And its storage complexity is smaller than the quantum meet-in-the-middle-algorithm. Supported by the National Basic Research Program of China under Grant No. 2013CB338002 and the National Natural Science Foundation of China under Grant No. 61502526
Support vector machine as a binary classifier for automated object detection in remotely sensed data
NASA Astrophysics Data System (ADS)
Wardaya, P. D.
2014-02-01
In the present paper, author proposes the application of Support Vector Machine (SVM) for the analysis of satellite imagery. One of the advantages of SVM is that, with limited training data, it may generate comparable or even better results than the other methods. The SVM algorithm is used for automated object detection and characterization. Specifically, the SVM is applied in its basic nature as a binary classifier where it classifies two classes namely, object and background. The algorithm aims at effectively detecting an object from its background with the minimum training data. The synthetic image containing noises is used for algorithm testing. Furthermore, it is implemented to perform remote sensing image analysis such as identification of Island vegetation, water body, and oil spill from the satellite imagery. It is indicated that SVM provides the fast and accurate analysis with the acceptable result.
Hybrid approach of selecting hyperparameters of support vector machine for regression.
Jeng, Jin-Tsong
2006-06-01
To select the hyperparameters of the support vector machine for regression (SVR), a hybrid approach is proposed to determine the kernel parameter of the Gaussian kernel function and the epsilon value of Vapnik's epsilon-insensitive loss function. The proposed hybrid approach includes a competitive agglomeration (CA) clustering algorithm and a repeated SVR (RSVR) approach. Since the CA clustering algorithm is used to find the nearly "optimal" number of clusters and the centers of clusters in the clustering process, the CA clustering algorithm is applied to select the Gaussian kernel parameter. Additionally, an RSVR approach that relies on the standard deviation of a training error is proposed to obtain an epsilon in the loss function. Finally, two functions, one real data set (i.e., a time series of quarterly unemployment rate for West Germany) and an identification of nonlinear plant are used to verify the usefulness of the hybrid approach.
Wang, Shuihua; Chen, Mengmeng; Li, Yang; Shao, Ying; Zhang, Yudong; Du, Sidan; Wu, Jane
2016-01-01
Dendritic spines are described as neuronal protrusions. The morphology of dendritic spines and dendrites has a strong relationship to its function, as well as playing an important role in understanding brain function. Quantitative analysis of dendrites and dendritic spines is essential to an understanding of the formation and function of the nervous system. However, highly efficient tools for the quantitative analysis of dendrites and dendritic spines are currently undeveloped. In this paper we propose a novel three-step cascaded algorithm-RTSVM- which is composed of ridge detection as the curvature structure identifier for backbone extraction, boundary location based on differences in density, the Hu moment as features and Twin Support Vector Machine (TSVM) classifiers for spine classification. Our data demonstrates that this newly developed algorithm has performed better than other available techniques used to detect accuracy and false alarm rates. This algorithm will be used effectively in neuroscience research.
NASA Astrophysics Data System (ADS)
Mofavvaz, Shirin; Sohrabi, Mahmoud Reza; Nezamzadeh-Ejhieh, Alireza
2017-07-01
In the present study, artificial neural networks (ANNs) and least squares support vector machines (LS-SVM) as intelligent methods based on absorption spectra in the range of 230-300 nm have been used for determination of antihistamine decongestant contents. In the first step, one type of network (feed-forward back-propagation) from the artificial neural network with two different training algorithms, Levenberg-Marquardt (LM) and gradient descent with momentum and adaptive learning rate back-propagation (GDX) algorithm, were employed and their performance was evaluated. The performance of the LM algorithm was better than the GDX algorithm. In the second one, the radial basis network was utilized and results compared with the previous network. In the last one, the other intelligent method named least squares support vector machine was proposed to construct the antihistamine decongestant prediction model and the results were compared with two of the aforementioned networks. The values of the statistical parameters mean square error (MSE), Regression coefficient (R2), correlation coefficient (r) and also mean recovery (%), relative standard deviation (RSD) used for selecting the best model between these methods. Moreover, the proposed methods were compared to the high- performance liquid chromatography (HPLC) as a reference method. One way analysis of variance (ANOVA) test at the 95% confidence level applied to the comparison results of suggested and reference methods that there were no significant differences between them.
A Semisupervised Support Vector Machines Algorithm for BCI Systems
Qin, Jianzhao; Li, Yuanqing; Sun, Wei
2007-01-01
As an emerging technology, brain-computer interfaces (BCIs) bring us new communication interfaces which translate brain activities into control signals for devices like computers, robots, and so forth. In this study, we propose a semisupervised support vector machine (SVM) algorithm for brain-computer interface (BCI) systems, aiming at reducing the time-consuming training process. In this algorithm, we apply a semisupervised SVM for translating the features extracted from the electrical recordings of brain into control signals. This SVM classifier is built from a small labeled data set and a large unlabeled data set. Meanwhile, to reduce the time for training semisupervised SVM, we propose a batch-mode incremental learning method, which can also be easily applied to the online BCI systems. Additionally, it is suggested in many studies that common spatial pattern (CSP) is very effective in discriminating two different brain states. However, CSP needs a sufficient labeled data set. In order to overcome the drawback of CSP, we suggest a two-stage feature extraction method for the semisupervised learning algorithm. We apply our algorithm to two BCI experimental data sets. The offline data analysis results demonstrate the effectiveness of our algorithm. PMID:18368141
NASA Astrophysics Data System (ADS)
Xu, Lili; Luo, Shuqian
2010-11-01
Microaneurysms (MAs) are the first manifestations of the diabetic retinopathy (DR) as well as an indicator for its progression. Their automatic detection plays a key role for both mass screening and monitoring and is therefore in the core of any system for computer-assisted diagnosis of DR. The algorithm basically comprises the following stages: candidate detection aiming at extracting the patterns possibly corresponding to MAs based on mathematical morphological black top hat, feature extraction to characterize these candidates, and classification based on support vector machine (SVM), to validate MAs. Feature vector and kernel function of SVM selection is very important to the algorithm. We use the receiver operating characteristic (ROC) curve to evaluate the distinguishing performance of different feature vectors and different kernel functions of SVM. The ROC analysis indicates the quadratic polynomial SVM with a combination of features as the input shows the best discriminating performance.
Xu, Lili; Luo, Shuqian
2010-01-01
Microaneurysms (MAs) are the first manifestations of the diabetic retinopathy (DR) as well as an indicator for its progression. Their automatic detection plays a key role for both mass screening and monitoring and is therefore in the core of any system for computer-assisted diagnosis of DR. The algorithm basically comprises the following stages: candidate detection aiming at extracting the patterns possibly corresponding to MAs based on mathematical morphological black top hat, feature extraction to characterize these candidates, and classification based on support vector machine (SVM), to validate MAs. Feature vector and kernel function of SVM selection is very important to the algorithm. We use the receiver operating characteristic (ROC) curve to evaluate the distinguishing performance of different feature vectors and different kernel functions of SVM. The ROC analysis indicates the quadratic polynomial SVM with a combination of features as the input shows the best discriminating performance.
Yan, Kang K; Zhao, Hongyu; Pang, Herbert
2017-12-06
High-throughput sequencing data are widely collected and analyzed in the study of complex diseases in quest of improving human health. Well-studied algorithms mostly deal with single data source, and cannot fully utilize the potential of these multi-omics data sources. In order to provide a holistic understanding of human health and diseases, it is necessary to integrate multiple data sources. Several algorithms have been proposed so far, however, a comprehensive comparison of data integration algorithms for classification of binary traits is currently lacking. In this paper, we focus on two common classes of integration algorithms, graph-based that depict relationships with subjects denoted by nodes and relationships denoted by edges, and kernel-based that can generate a classifier in feature space. Our paper provides a comprehensive comparison of their performance in terms of various measurements of classification accuracy and computation time. Seven different integration algorithms, including graph-based semi-supervised learning, graph sharpening integration, composite association network, Bayesian network, semi-definite programming-support vector machine (SDP-SVM), relevance vector machine (RVM) and Ada-boost relevance vector machine are compared and evaluated with hypertension and two cancer data sets in our study. In general, kernel-based algorithms create more complex models and require longer computation time, but they tend to perform better than graph-based algorithms. The performance of graph-based algorithms has the advantage of being faster computationally. The empirical results demonstrate that composite association network, relevance vector machine, and Ada-boost RVM are the better performers. We provide recommendations on how to choose an appropriate algorithm for integrating data from multiple sources.
Automated Scoring of Chinese Engineering Students' English Essays
ERIC Educational Resources Information Center
Liu, Ming; Wang, Yuqi; Xu, Weiwei; Liu, Li
2017-01-01
The number of Chinese engineering students has increased greatly since 1999. Rating the quality of these students' English essays has thus become time-consuming and challenging. This paper presents a novel automatic essay scoring algorithm called PSOSVR, based on a machine learning algorithm, Support Vector Machine for Regression (SVR), and a…
Martins, Maria; Costa, Lino; Frizera, Anselmo; Ceres, Ramón; Santos, Cristina
2014-03-01
Walker devices are often prescribed incorrectly to patients, leading to the increase of dissatisfaction and occurrence of several problems, such as, discomfort and pain. Thus, it is necessary to objectively evaluate the effects that assisted gait can have on the gait patterns of walker users, comparatively to a non-assisted gait. A gait analysis, focusing on spatiotemporal and kinematics parameters, will be issued for this purpose. However, gait analysis yields redundant information that often is difficult to interpret. This study addresses the problem of selecting the most relevant gait features required to differentiate between assisted and non-assisted gait. For that purpose, it is presented an efficient approach that combines evolutionary techniques, based on genetic algorithms, and support vector machine algorithms, to discriminate differences between assisted and non-assisted gait with a walker with forearm supports. For comparison purposes, other classification algorithms are verified. Results with healthy subjects show that the main differences are characterized by balance and joints excursion in the sagittal plane. These results, confirmed by clinical evidence, allow concluding that this technique is an efficient feature selection approach. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Lee, David; Park, Sang-Hoon; Lee, Sang-Goog
2017-10-07
In this paper, we propose a set of wavelet-based combined feature vectors and a Gaussian mixture model (GMM)-supervector to enhance training speed and classification accuracy in motor imagery brain-computer interfaces. The proposed method is configured as follows: first, wavelet transforms are applied to extract the feature vectors for identification of motor imagery electroencephalography (EEG) and principal component analyses are used to reduce the dimensionality of the feature vectors and linearly combine them. Subsequently, the GMM universal background model is trained by the expectation-maximization (EM) algorithm to purify the training data and reduce its size. Finally, a purified and reduced GMM-supervector is used to train the support vector machine classifier. The performance of the proposed method was evaluated for three different motor imagery datasets in terms of accuracy, kappa, mutual information, and computation time, and compared with the state-of-the-art algorithms. The results from the study indicate that the proposed method achieves high accuracy with a small amount of training data compared with the state-of-the-art algorithms in motor imagery EEG classification.
Research on intrusion detection based on Kohonen network and support vector machine
NASA Astrophysics Data System (ADS)
Shuai, Chunyan; Yang, Hengcheng; Gong, Zeweiyi
2018-05-01
In view of the problem of low detection accuracy and the long detection time of support vector machine, which directly applied to the network intrusion detection system. Optimization of SVM parameters can greatly improve the detection accuracy, but it can not be applied to high-speed network because of the long detection time. a method based on Kohonen neural network feature selection is proposed to reduce the optimization time of support vector machine parameters. Firstly, this paper is to calculate the weights of the KDD99 network intrusion data by Kohonen network and select feature by weight. Then, after the feature selection is completed, genetic algorithm (GA) and grid search method are used for parameter optimization to find the appropriate parameters and classify them by support vector machines. By comparing experiments, it is concluded that feature selection can reduce the time of parameter optimization, which has little influence on the accuracy of classification. The experiments suggest that the support vector machine can be used in the network intrusion detection system and reduce the missing rate.
Machine Learning for Biological Trajectory Classification Applications
NASA Technical Reports Server (NTRS)
Sbalzarini, Ivo F.; Theriot, Julie; Koumoutsakos, Petros
2002-01-01
Machine-learning techniques, including clustering algorithms, support vector machines and hidden Markov models, are applied to the task of classifying trajectories of moving keratocyte cells. The different algorithms axe compared to each other as well as to expert and non-expert test persons, using concepts from signal-detection theory. The algorithms performed very well as compared to humans, suggesting a robust tool for trajectory classification in biological applications.
A Mathematical and Sociological Analysis of Google Search Algorithm
2013-01-16
through the collective intelligence of the web to determine a page’s importance. Let v be a vector of RN with N ≥ 8 billion. Any unit vector in RN is...scrolled up by some artifical hits. Aknowledgment: The authors would like to thank Dr. John Lavery for his encouragement and support which enable them to
Object recognition of real targets using modelled SAR images
NASA Astrophysics Data System (ADS)
Zherdev, D. A.
2017-12-01
In this work the problem of recognition is studied using SAR images. The algorithm of recognition is based on the computation of conjugation indices with vectors of class. The support subspaces for each class are constructed by exception of the most and the less correlated vectors in a class. In the study we examine the ability of a significant feature vector size reduce that leads to recognition time decrease. The images of targets form the feature vectors that are transformed using pre-trained convolutional neural network (CNN).
Efficient boundary hunting via vector quantization
NASA Astrophysics Data System (ADS)
Diamantini, Claudia; Panti, Maurizio
2001-03-01
A great amount of information about a classification problem is contained in those instances falling near the decision boundary. This intuition dates back to the earliest studies in pattern recognition, and in the more recent adaptive approaches to the so called boundary hunting, such as the work of Aha et alii on Instance Based Learning and the work of Vapnik et alii on Support Vector Machines. The last work is of particular interest, since theoretical and experimental results ensure the accuracy of boundary reconstruction. However, its optimization approach has heavy computational and memory requirements, which limits its application on huge amounts of data. In the paper we describe an alternative approach to boundary hunting based on adaptive labeled quantization architectures. The adaptation is performed by a stochastic gradient algorithm for the minimization of the error probability. Error probability minimization guarantees the accurate approximation of the optimal decision boundary, while the use of a stochastic gradient algorithm defines an efficient method to reach such approximation. In the paper comparisons to Support Vector Machines are considered.
NASA Astrophysics Data System (ADS)
Zhou, Xin; Jun, Sun; Zhang, Bing; Jun, Wu
2017-07-01
In order to improve the reliability of the spectrum feature extracted by wavelet transform, a method combining wavelet transform (WT) with bacterial colony chemotaxis algorithm and support vector machine (BCC-SVM) algorithm (WT-BCC-SVM) was proposed in this paper. Besides, we aimed to identify different kinds of pesticide residues on lettuce leaves in a novel and rapid non-destructive way by using fluorescence spectra technology. The fluorescence spectral data of 150 lettuce leaf samples of five different kinds of pesticide residues on the surface of lettuce were obtained using Cary Eclipse fluorescence spectrometer. Standard normalized variable detrending (SNV detrending), Savitzky-Golay coupled with Standard normalized variable detrending (SG-SNV detrending) were used to preprocess the raw spectra, respectively. Bacterial colony chemotaxis combined with support vector machine (BCC-SVM) and support vector machine (SVM) classification models were established based on full spectra (FS) and wavelet transform characteristics (WTC), respectively. Moreover, WTC were selected by WT. The results showed that the accuracy of training set, calibration set and the prediction set of the best optimal classification model (SG-SNV detrending-WT-BCC-SVM) were 100%, 98% and 93.33%, respectively. In addition, the results indicated that it was feasible to use WT-BCC-SVM to establish diagnostic model of different kinds of pesticide residues on lettuce leaves.
Cheng, Jerome; Hipp, Jason; Monaco, James; Lucas, David R; Madabhushi, Anant; Balis, Ulysses J
2011-01-01
Spatially invariant vector quantization (SIVQ) is a texture and color-based image matching algorithm that queries the image space through the use of ring vectors. In prior studies, the selection of one or more optimal vectors for a particular feature of interest required a manual process, with the user initially stochastically selecting candidate vectors and subsequently testing them upon other regions of the image to verify the vector's sensitivity and specificity properties (typically by reviewing a resultant heat map). In carrying out the prior efforts, the SIVQ algorithm was noted to exhibit highly scalable computational properties, where each region of analysis can take place independently of others, making a compelling case for the exploration of its deployment on high-throughput computing platforms, with the hypothesis that such an exercise will result in performance gains that scale linearly with increasing processor count. An automated process was developed for the selection of optimal ring vectors to serve as the predicate matching operator in defining histopathological features of interest. Briefly, candidate vectors were generated from every possible coordinate origin within a user-defined vector selection area (VSA) and subsequently compared against user-identified positive and negative "ground truth" regions on the same image. Each vector from the VSA was assessed for its goodness-of-fit to both the positive and negative areas via the use of the receiver operating characteristic (ROC) transfer function, with each assessment resulting in an associated area-under-the-curve (AUC) figure of merit. Use of the above-mentioned automated vector selection process was demonstrated in two cases of use: First, to identify malignant colonic epithelium, and second, to identify soft tissue sarcoma. For both examples, a very satisfactory optimized vector was identified, as defined by the AUC metric. Finally, as an additional effort directed towards attaining high-throughput capability for the SIVQ algorithm, we demonstrated the successful incorporation of it with the MATrix LABoratory (MATLAB™) application interface. The SIVQ algorithm is suitable for automated vector selection settings and high throughput computation.
NASA Astrophysics Data System (ADS)
Kachach, Redouane; Cañas, José María
2016-05-01
Using video in traffic monitoring is one of the most active research domains in the computer vision community. TrafficMonitor, a system that employs a hybrid approach for automatic vehicle tracking and classification on highways using a simple stationary calibrated camera, is presented. The proposed system consists of three modules: vehicle detection, vehicle tracking, and vehicle classification. Moving vehicles are detected by an enhanced Gaussian mixture model background estimation algorithm. The design includes a technique to resolve the occlusion problem by using a combination of two-dimensional proximity tracking algorithm and the Kanade-Lucas-Tomasi feature tracking algorithm. The last module classifies the shapes identified into five vehicle categories: motorcycle, car, van, bus, and truck by using three-dimensional templates and an algorithm based on histogram of oriented gradients and the support vector machine classifier. Several experiments have been performed using both real and simulated traffic in order to validate the system. The experiments were conducted on GRAM-RTM dataset and a proper real video dataset which is made publicly available as part of this work.
Halder, Sebastian; Bensch, Michael; Mellinger, Jürgen; Bogdan, Martin; Kübler, Andrea; Birbaumer, Niels; Rosenstiel, Wolfgang
2007-01-01
We propose a combination of blind source separation (BSS) and independent component analysis (ICA) (signal decomposition into artifacts and nonartifacts) with support vector machines (SVMs) (automatic classification) that are designed for online usage. In order to select a suitable BSS/ICA method, three ICA algorithms (JADE, Infomax, and FastICA) and one BSS algorithm (AMUSE) are evaluated to determine their ability to isolate electromyographic (EMG) and electrooculographic (EOG) artifacts into individual components. An implementation of the selected BSS/ICA method with SVMs trained to classify EMG and EOG artifacts, which enables the usage of the method as a filter in measurements with online feedback, is described. This filter is evaluated on three BCI datasets as a proof-of-concept of the method. PMID:18288259
Halder, Sebastian; Bensch, Michael; Mellinger, Jürgen; Bogdan, Martin; Kübler, Andrea; Birbaumer, Niels; Rosenstiel, Wolfgang
2007-01-01
We propose a combination of blind source separation (BSS) and independent component analysis (ICA) (signal decomposition into artifacts and nonartifacts) with support vector machines (SVMs) (automatic classification) that are designed for online usage. In order to select a suitable BSS/ICA method, three ICA algorithms (JADE, Infomax, and FastICA) and one BSS algorithm (AMUSE) are evaluated to determine their ability to isolate electromyographic (EMG) and electrooculographic (EOG) artifacts into individual components. An implementation of the selected BSS/ICA method with SVMs trained to classify EMG and EOG artifacts, which enables the usage of the method as a filter in measurements with online feedback, is described. This filter is evaluated on three BCI datasets as a proof-of-concept of the method.
Software tool for data mining and its applications
NASA Astrophysics Data System (ADS)
Yang, Jie; Ye, Chenzhou; Chen, Nianyi
2002-03-01
A software tool for data mining is introduced, which integrates pattern recognition (PCA, Fisher, clustering, hyperenvelop, regression), artificial intelligence (knowledge representation, decision trees), statistical learning (rough set, support vector machine), computational intelligence (neural network, genetic algorithm, fuzzy systems). It consists of nine function models: pattern recognition, decision trees, association rule, fuzzy rule, neural network, genetic algorithm, Hyper Envelop, support vector machine, visualization. The principle and knowledge representation of some function models of data mining are described. The software tool of data mining is realized by Visual C++ under Windows 2000. Nonmonotony in data mining is dealt with by concept hierarchy and layered mining. The software tool of data mining has satisfactorily applied in the prediction of regularities of the formation of ternary intermetallic compounds in alloy systems, and diagnosis of brain glioma.
Srinivasan, Pratul P.; Kim, Leo A.; Mettu, Priyatham S.; Cousins, Scott W.; Comer, Grant M.; Izatt, Joseph A.; Farsiu, Sina
2014-01-01
We present a novel fully automated algorithm for the detection of retinal diseases via optical coherence tomography (OCT) imaging. Our algorithm utilizes multiscale histograms of oriented gradient descriptors as feature vectors of a support vector machine based classifier. The spectral domain OCT data sets used for cross-validation consisted of volumetric scans acquired from 45 subjects: 15 normal subjects, 15 patients with dry age-related macular degeneration (AMD), and 15 patients with diabetic macular edema (DME). Our classifier correctly identified 100% of cases with AMD, 100% cases with DME, and 86.67% cases of normal subjects. This algorithm is a potentially impactful tool for the remote diagnosis of ophthalmic diseases. PMID:25360373
Zimmermann, Karel; Gibrat, Jean-François
2010-01-04
Sequence comparisons make use of a one-letter representation for amino acids, the necessary quantitative information being supplied by the substitution matrices. This paper deals with the problem of finding a representation that provides a comprehensive description of amino acid intrinsic properties consistent with the substitution matrices. We present a Euclidian vector representation of the amino acids, obtained by the singular value decomposition of the substitution matrices. The substitution matrix entries correspond to the dot product of amino acid vectors. We apply this vector encoding to the study of the relative importance of various amino acid physicochemical properties upon the substitution matrices. We also characterize and compare the PAM and BLOSUM series substitution matrices. This vector encoding introduces a Euclidian metric in the amino acid space, consistent with substitution matrices. Such a numerical description of the amino acid is useful when intrinsic properties of amino acids are necessary, for instance, building sequence profiles or finding consensus sequences, using machine learning algorithms such as Support Vector Machine and Neural Networks algorithms.
A Fast Reduced Kernel Extreme Learning Machine.
Deng, Wan-Yu; Ong, Yew-Soon; Zheng, Qing-Hua
2016-04-01
In this paper, we present a fast and accurate kernel-based supervised algorithm referred to as the Reduced Kernel Extreme Learning Machine (RKELM). In contrast to the work on Support Vector Machine (SVM) or Least Square SVM (LS-SVM), which identifies the support vectors or weight vectors iteratively, the proposed RKELM randomly selects a subset of the available data samples as support vectors (or mapping samples). By avoiding the iterative steps of SVM, significant cost savings in the training process can be readily attained, especially on Big datasets. RKELM is established based on the rigorous proof of universal learning involving reduced kernel-based SLFN. In particular, we prove that RKELM can approximate any nonlinear functions accurately under the condition of support vectors sufficiency. Experimental results on a wide variety of real world small instance size and large instance size applications in the context of binary classification, multi-class problem and regression are then reported to show that RKELM can perform at competitive level of generalized performance as the SVM/LS-SVM at only a fraction of the computational effort incurred. Copyright © 2015 Elsevier Ltd. All rights reserved.
Khotanlou, Hassan; Afrasiabi, Mahlagha
2012-10-01
This paper presents a new feature selection approach for automatically extracting multiple sclerosis (MS) lesions in three-dimensional (3D) magnetic resonance (MR) images. Presented method is applicable to different types of MS lesions. In this method, T1, T2, and fluid attenuated inversion recovery (FLAIR) images are firstly preprocessed. In the next phase, effective features to extract MS lesions are selected by using a genetic algorithm (GA). The fitness function of the GA is the Similarity Index (SI) of a support vector machine (SVM) classifier. The results obtained on different types of lesions have been evaluated by comparison with manual segmentations. This algorithm is evaluated on 15 real 3D MR images using several measures. As a result, the SI between MS regions determined by the proposed method and radiologists was 87% on average. Experiments and comparisons with other methods show the effectiveness and the efficiency of the proposed approach.
Detection of anomaly in human retina using Laplacian Eigenmaps and vectorized matched filtering
NASA Astrophysics Data System (ADS)
Yacoubou Djima, Karamatou A.; Simonelli, Lucia D.; Cunningham, Denise; Czaja, Wojciech
2015-03-01
We present a novel method for automated anomaly detection on auto fluorescent data provided by the National Institute of Health (NIH). This is motivated by the need for new tools to improve the capability of diagnosing macular degeneration in its early stages, track the progression over time, and test the effectiveness of new treatment methods. In previous work, macular anomalies have been detected automatically through multiscale analysis procedures such as wavelet analysis or dimensionality reduction algorithms followed by a classification algorithm, e.g., Support Vector Machine. The method that we propose is a Vectorized Matched Filtering (VMF) algorithm combined with Laplacian Eigenmaps (LE), a nonlinear dimensionality reduction algorithm with locality preserving properties. By applying LE, we are able to represent the data in the form of eigenimages, some of which accentuate the visibility of anomalies. We pick significant eigenimages and proceed with the VMF algorithm that classifies anomalies across all of these eigenimages simultaneously. To evaluate our performance, we compare our method to two other schemes: a matched filtering algorithm based on anomaly detection on single images and a combination of PCA and VMF. LE combined with VMF algorithm performs best, yielding a high rate of accurate anomaly detection. This shows the advantage of using a nonlinear approach to represent the data and the effectiveness of VMF, which operates on the images as a data cube rather than individual images.
Ebtehaj, Isa; Bonakdari, Hossein
2016-01-01
Sediment transport without deposition is an essential consideration in the optimum design of sewer pipes. In this study, a novel method based on a combination of support vector regression (SVR) and the firefly algorithm (FFA) is proposed to predict the minimum velocity required to avoid sediment settling in pipe channels, which is expressed as the densimetric Froude number (Fr). The efficiency of support vector machine (SVM) models depends on the suitable selection of SVM parameters. In this particular study, FFA is used by determining these SVM parameters. The actual effective parameters on Fr calculation are generally identified by employing dimensional analysis. The different dimensionless variables along with the models are introduced. The best performance is attributed to the model that employs the sediment volumetric concentration (C(V)), ratio of relative median diameter of particles to hydraulic radius (d/R), dimensionless particle number (D(gr)) and overall sediment friction factor (λ(s)) parameters to estimate Fr. The performance of the SVR-FFA model is compared with genetic programming, artificial neural network and existing regression-based equations. The results indicate the superior performance of SVR-FFA (mean absolute percentage error = 2.123%; root mean square error =0.116) compared with other methods.
A support vector machine approach for classification of welding defects from ultrasonic signals
NASA Astrophysics Data System (ADS)
Chen, Yuan; Ma, Hong-Wei; Zhang, Guang-Ming
2014-07-01
Defect classification is an important issue in ultrasonic non-destructive evaluation. A layered multi-class support vector machine (LMSVM) classification system, which combines multiple SVM classifiers through a layered architecture, is proposed in this paper. The proposed LMSVM classification system is applied to the classification of welding defects from ultrasonic test signals. The measured ultrasonic defect echo signals are first decomposed into wavelet coefficients by the wavelet packet transform. The energy of the wavelet coefficients at different frequency channels are used to construct the feature vectors. The bees algorithm (BA) is then used for feature selection and SVM parameter optimisation for the LMSVM classification system. The BA-based feature selection optimises the energy feature vectors. The optimised feature vectors are input to the LMSVM classification system for training and testing. Experimental results of classifying welding defects demonstrate that the proposed technique is highly robust, precise and reliable for ultrasonic defect classification.
PCA-LBG-based algorithms for VQ codebook generation
NASA Astrophysics Data System (ADS)
Tsai, Jinn-Tsong; Yang, Po-Yuan
2015-04-01
Vector quantisation (VQ) codebooks are generated by combining principal component analysis (PCA) algorithms with Linde-Buzo-Gray (LBG) algorithms. All training vectors are grouped according to the projected values of the principal components. The PCA-LBG-based algorithms include (1) PCA-LBG-Median, which selects the median vector of each group, (2) PCA-LBG-Centroid, which adopts the centroid vector of each group, and (3) PCA-LBG-Random, which randomly selects a vector of each group. The LBG algorithm finds a codebook based on the better vectors sent to an initial codebook by the PCA. The PCA performs an orthogonal transformation to convert a set of potentially correlated variables into a set of variables that are not linearly correlated. Because the orthogonal transformation efficiently distinguishes test image vectors, the proposed PCA-LBG-based algorithm is expected to outperform conventional algorithms in designing VQ codebooks. The experimental results confirm that the proposed PCA-LBG-based algorithms indeed obtain better results compared to existing methods reported in the literature.
Support Vector Machine-Based Endmember Extraction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Filippi, Anthony M; Archibald, Richard K
Introduced in this paper is the utilization of Support Vector Machines (SVMs) to automatically perform endmember extraction from hyperspectral data. The strengths of SVM are exploited to provide a fast and accurate calculated representation of high-dimensional data sets that may consist of multiple distributions. Once this representation is computed, the number of distributions can be determined without prior knowledge. For each distribution, an optimal transform can be determined that preserves informational content while reducing the data dimensionality, and hence, the computational cost. Finally, endmember extraction for the whole data set is accomplished. Results indicate that this Support Vector Machine-Based Endmembermore » Extraction (SVM-BEE) algorithm has the capability of autonomously determining endmembers from multiple clusters with computational speed and accuracy, while maintaining a robust tolerance to noise.« less
An introduction to kernel-based learning algorithms.
Müller, K R; Mika, S; Rätsch, G; Tsuda, K; Schölkopf, B
2001-01-01
This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples for successful kernel-based learning methods. We first give a short background about Vapnik-Chervonenkis theory and kernel feature spaces and then proceed to kernel based learning in supervised and unsupervised scenarios including practical and algorithmic considerations. We illustrate the usefulness of kernel algorithms by discussing applications such as optical character recognition and DNA analysis.
A New Method of Facial Expression Recognition Based on SPE Plus SVM
NASA Astrophysics Data System (ADS)
Ying, Zilu; Huang, Mingwei; Wang, Zhen; Wang, Zhewei
A novel method of facial expression recognition (FER) is presented, which uses stochastic proximity embedding (SPE) for data dimension reduction, and support vector machine (SVM) for expression classification. The proposed algorithm is applied to Japanese Female Facial Expression (JAFFE) database for FER, better performance is obtained compared with some traditional algorithms, such as PCA and LDA etc.. The result have further proved the effectiveness of the proposed algorithm.
An Autonomous Star Identification Algorithm Based on One-Dimensional Vector Pattern for Star Sensors
Luo, Liyan; Xu, Luping; Zhang, Hua
2015-01-01
In order to enhance the robustness and accelerate the recognition speed of star identification, an autonomous star identification algorithm for star sensors is proposed based on the one-dimensional vector pattern (one_DVP). In the proposed algorithm, the space geometry information of the observed stars is used to form the one-dimensional vector pattern of the observed star. The one-dimensional vector pattern of the same observed star remains unchanged when the stellar image rotates, so the problem of star identification is simplified as the comparison of the two feature vectors. The one-dimensional vector pattern is adopted to build the feature vector of the star pattern, which makes it possible to identify the observed stars robustly. The characteristics of the feature vector and the proposed search strategy for the matching pattern make it possible to achieve the recognition result as quickly as possible. The simulation results demonstrate that the proposed algorithm can effectively accelerate the star identification. Moreover, the recognition accuracy and robustness by the proposed algorithm are better than those by the pyramid algorithm, the modified grid algorithm, and the LPT algorithm. The theoretical analysis and experimental results show that the proposed algorithm outperforms the other three star identification algorithms. PMID:26198233
Luo, Liyan; Xu, Luping; Zhang, Hua
2015-07-07
In order to enhance the robustness and accelerate the recognition speed of star identification, an autonomous star identification algorithm for star sensors is proposed based on the one-dimensional vector pattern (one_DVP). In the proposed algorithm, the space geometry information of the observed stars is used to form the one-dimensional vector pattern of the observed star. The one-dimensional vector pattern of the same observed star remains unchanged when the stellar image rotates, so the problem of star identification is simplified as the comparison of the two feature vectors. The one-dimensional vector pattern is adopted to build the feature vector of the star pattern, which makes it possible to identify the observed stars robustly. The characteristics of the feature vector and the proposed search strategy for the matching pattern make it possible to achieve the recognition result as quickly as possible. The simulation results demonstrate that the proposed algorithm can effectively accelerate the star identification. Moreover, the recognition accuracy and robustness by the proposed algorithm are better than those by the pyramid algorithm, the modified grid algorithm, and the LPT algorithm. The theoretical analysis and experimental results show that the proposed algorithm outperforms the other three star identification algorithms.
Zhang, Chenglin; Yan, Lei; Han, Song; Guan, Xinping
2017-01-01
Target localization, which aims to estimate the location of an unknown target, is one of the key issues in applications of underwater acoustic sensor networks (UASNs). However, the constrained property of an underwater environment, such as restricted communication capacity of sensor nodes and sensing noises, makes target localization a challenging problem. This paper relies on fractional sensor nodes to formulate a support vector learning-based particle filter algorithm for the localization problem in communication-constrained underwater acoustic sensor networks. A node-selection strategy is exploited to pick fractional sensor nodes with short-distance pattern to participate in the sensing process at each time frame. Subsequently, we propose a least-square support vector regression (LSSVR)-based observation function, through which an iterative regression strategy is used to deal with the distorted data caused by sensing noises, to improve the observation accuracy. At the same time, we integrate the observation to formulate the likelihood function, which effectively update the weights of particles. Thus, the particle effectiveness is enhanced to avoid “particle degeneracy” problem and improve localization accuracy. In order to validate the performance of the proposed localization algorithm, two different noise scenarios are investigated. The simulation results show that the proposed localization algorithm can efficiently improve the localization accuracy. In addition, the node-selection strategy can effectively select the subset of sensor nodes to improve the communication efficiency of the sensor network. PMID:29267252
Li, Xinbin; Zhang, Chenglin; Yan, Lei; Han, Song; Guan, Xinping
2017-12-21
Target localization, which aims to estimate the location of an unknown target, is one of the key issues in applications of underwater acoustic sensor networks (UASNs). However, the constrained property of an underwater environment, such as restricted communication capacity of sensor nodes and sensing noises, makes target localization a challenging problem. This paper relies on fractional sensor nodes to formulate a support vector learning-based particle filter algorithm for the localization problem in communication-constrained underwater acoustic sensor networks. A node-selection strategy is exploited to pick fractional sensor nodes with short-distance pattern to participate in the sensing process at each time frame. Subsequently, we propose a least-square support vector regression (LSSVR)-based observation function, through which an iterative regression strategy is used to deal with the distorted data caused by sensing noises, to improve the observation accuracy. At the same time, we integrate the observation to formulate the likelihood function, which effectively update the weights of particles. Thus, the particle effectiveness is enhanced to avoid "particle degeneracy" problem and improve localization accuracy. In order to validate the performance of the proposed localization algorithm, two different noise scenarios are investigated. The simulation results show that the proposed localization algorithm can efficiently improve the localization accuracy. In addition, the node-selection strategy can effectively select the subset of sensor nodes to improve the communication efficiency of the sensor network.
Sorting on STAR. [CDC computer algorithm timing comparison
NASA Technical Reports Server (NTRS)
Stone, H. S.
1978-01-01
Timing comparisons are given for three sorting algorithms written for the CDC STAR computer. One algorithm is Hoare's (1962) Quicksort, which is the fastest or nearly the fastest sorting algorithm for most computers. A second algorithm is a vector version of Quicksort that takes advantage of the STAR's vector operations. The third algorithm is an adaptation of Batcher's (1968) sorting algorithm, which makes especially good use of vector operations but has a complexity of N(log N)-squared as compared with a complexity of N log N for the Quicksort algorithms. In spite of its worse complexity, Batcher's sorting algorithm is competitive with the serial version of Quicksort for vectors up to the largest that can be treated by STAR. Vector Quicksort outperforms the other two algorithms and is generally preferred. These results indicate that unusual instruction sets can introduce biases in program execution time that counter results predicted by worst-case asymptotic complexity analysis.
Marek K. Jakubowksi; Qinghua Guo; Brandon Collins; Scott Stephens; Maggi Kelly
2013-01-01
We compared the ability of several classification and regression algorithms to predict forest stand structure metrics and standard surface fuel models. Our study area spans a dense, topographically complex Sierra Nevada mixed-conifer forest. We used clustering, regression trees, and support vector machine algorithms to analyze high density (average 9 pulses/m
2014-01-01
Background Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463
Goudarzi, Shidrokh; Haslina Hassan, Wan; Abdalla Hashim, Aisha-Hassan; Soleymani, Seyed Ahmad; Anisi, Mohammad Hossein; Zakaria, Omar M.
2016-01-01
This study aims to design a vertical handover prediction method to minimize unnecessary handovers for a mobile node (MN) during the vertical handover process. This relies on a novel method for the prediction of a received signal strength indicator (RSSI) referred to as IRBF-FFA, which is designed by utilizing the imperialist competition algorithm (ICA) to train the radial basis function (RBF), and by hybridizing with the firefly algorithm (FFA) to predict the optimal solution. The prediction accuracy of the proposed IRBF–FFA model was validated by comparing it to support vector machines (SVMs) and multilayer perceptron (MLP) models. In order to assess the model’s performance, we measured the coefficient of determination (R2), correlation coefficient (r), root mean square error (RMSE) and mean absolute percentage error (MAPE). The achieved results indicate that the IRBF–FFA model provides more precise predictions compared to different ANNs, namely, support vector machines (SVMs) and multilayer perceptron (MLP). The performance of the proposed model is analyzed through simulated and real-time RSSI measurements. The results also suggest that the IRBF–FFA model can be applied as an efficient technique for the accurate prediction of vertical handover. PMID:27438600
Goudarzi, Shidrokh; Haslina Hassan, Wan; Abdalla Hashim, Aisha-Hassan; Soleymani, Seyed Ahmad; Anisi, Mohammad Hossein; Zakaria, Omar M
2016-01-01
This study aims to design a vertical handover prediction method to minimize unnecessary handovers for a mobile node (MN) during the vertical handover process. This relies on a novel method for the prediction of a received signal strength indicator (RSSI) referred to as IRBF-FFA, which is designed by utilizing the imperialist competition algorithm (ICA) to train the radial basis function (RBF), and by hybridizing with the firefly algorithm (FFA) to predict the optimal solution. The prediction accuracy of the proposed IRBF-FFA model was validated by comparing it to support vector machines (SVMs) and multilayer perceptron (MLP) models. In order to assess the model's performance, we measured the coefficient of determination (R2), correlation coefficient (r), root mean square error (RMSE) and mean absolute percentage error (MAPE). The achieved results indicate that the IRBF-FFA model provides more precise predictions compared to different ANNs, namely, support vector machines (SVMs) and multilayer perceptron (MLP). The performance of the proposed model is analyzed through simulated and real-time RSSI measurements. The results also suggest that the IRBF-FFA model can be applied as an efficient technique for the accurate prediction of vertical handover.
An affine projection algorithm using grouping selection of input vectors
NASA Astrophysics Data System (ADS)
Shin, JaeWook; Kong, NamWoong; Park, PooGyeon
2011-10-01
This paper present an affine projection algorithm (APA) using grouping selection of input vectors. To improve the performance of conventional APA, the proposed algorithm adjusts the number of the input vectors using two procedures: grouping procedure and selection procedure. In grouping procedure, the some input vectors that have overlapping information for update is grouped using normalized inner product. Then, few input vectors that have enough information for for coefficient update is selected using steady-state mean square error (MSE) in selection procedure. Finally, the filter coefficients update using selected input vectors. The experimental results show that the proposed algorithm has small steady-state estimation errors comparing with the existing algorithms.
CAT-PUMA: CME Arrival Time Prediction Using Machine learning Algorithms
NASA Astrophysics Data System (ADS)
Liu, Jiajia; Ye, Yudong; Shen, Chenglong; Wang, Yuming; Erdélyi, Robert
2018-04-01
CAT-PUMA (CME Arrival Time Prediction Using Machine learning Algorithms) quickly and accurately predicts the arrival of Coronal Mass Ejections (CMEs) of CME arrival time. The software was trained via detailed analysis of CME features and solar wind parameters using 182 previously observed geo-effective partial-/full-halo CMEs and uses algorithms of the Support Vector Machine (SVM) to make its predictions, which can be made within minutes of providing the necessary input parameters of a CME.
Zhao, Yu-Xiang; Chou, Chien-Hsing
2016-01-01
In this study, a new feature selection algorithm, the neighborhood-relationship feature selection (NRFS) algorithm, is proposed for identifying rat electroencephalogram signals and recognizing Chinese characters. In these two applications, dependent relationships exist among the feature vectors and their neighboring feature vectors. Therefore, the proposed NRFS algorithm was designed for solving this problem. By applying the NRFS algorithm, unselected feature vectors have a high priority of being added into the feature subset if the neighboring feature vectors have been selected. In addition, selected feature vectors have a high priority of being eliminated if the neighboring feature vectors are not selected. In the experiments conducted in this study, the NRFS algorithm was compared with two feature algorithms. The experimental results indicated that the NRFS algorithm can extract the crucial frequency bands for identifying rat vigilance states and identifying crucial character regions for recognizing Chinese characters. PMID:27314346
NASA Astrophysics Data System (ADS)
Jia, Rui-Sheng; Sun, Hong-Mei; Peng, Yan-Jun; Liang, Yong-Quan; Lu, Xin-Ming
2017-07-01
Microseismic monitoring is an effective means for providing early warning of rock or coal dynamical disasters, and its first step is microseismic event detection, although low SNR microseismic signals often cannot effectively be detected by routine methods. To solve this problem, this paper presents permutation entropy and a support vector machine to detect low SNR microseismic events. First, an extraction method of signal features based on multi-scale permutation entropy is proposed by studying the influence of the scale factor on the signal permutation entropy. Second, the detection model of low SNR microseismic events based on the least squares support vector machine is built by performing a multi-scale permutation entropy calculation for the collected vibration signals, constructing a feature vector set of signals. Finally, a comparative analysis of the microseismic events and noise signals in the experiment proves that the different characteristics of the two can be fully expressed by using multi-scale permutation entropy. The detection model of microseismic events combined with the support vector machine, which has the features of high classification accuracy and fast real-time algorithms, can meet the requirements of online, real-time extractions of microseismic events.
Webb, Samuel J; Hanser, Thierry; Howlin, Brendan; Krause, Paul; Vessey, Jonathan D
2014-03-25
A new algorithm has been developed to enable the interpretation of black box models. The developed algorithm is agnostic to learning algorithm and open to all structural based descriptors such as fragments, keys and hashed fingerprints. The algorithm has provided meaningful interpretation of Ames mutagenicity predictions from both random forest and support vector machine models built on a variety of structural fingerprints.A fragmentation algorithm is utilised to investigate the model's behaviour on specific substructures present in the query. An output is formulated summarising causes of activation and deactivation. The algorithm is able to identify multiple causes of activation or deactivation in addition to identifying localised deactivations where the prediction for the query is active overall. No loss in performance is seen as there is no change in the prediction; the interpretation is produced directly on the model's behaviour for the specific query. Models have been built using multiple learning algorithms including support vector machine and random forest. The models were built on public Ames mutagenicity data and a variety of fingerprint descriptors were used. These models produced a good performance in both internal and external validation with accuracies around 82%. The models were used to evaluate the interpretation algorithm. Interpretation was revealed that links closely with understood mechanisms for Ames mutagenicity. This methodology allows for a greater utilisation of the predictions made by black box models and can expedite further study based on the output for a (quantitative) structure activity model. Additionally the algorithm could be utilised for chemical dataset investigation and knowledge extraction/human SAR development.
Support vector machine for the diagnosis of malignant mesothelioma
NASA Astrophysics Data System (ADS)
Ushasukhanya, S.; Nithyakalyani, A.; Sivakumar, V.
2018-04-01
Harmful mesothelioma is an illness in which threatening (malignancy) cells shape in the covering of the trunk or stomach area. Being presented to asbestos can influence the danger of threatening mesothelioma. Signs and side effects of threatening mesothelioma incorporate shortness of breath and agony under the rib confine. Tests that inspect within the trunk and belly are utilized to recognize (find) and analyse harmful mesothelioma. Certain elements influence forecast (shot of recuperation) and treatment choices. In this review, Support vector machine (SVM) classifiers were utilized for Mesothelioma sickness conclusion. SVM output is contrasted by concentrating on Mesothelioma’s sickness and findings by utilizing similar information set. The support vector machine algorithm gives 92.5% precision acquired by means of 3-overlap cross-approval. The Mesothelioma illness dataset were taken from an organization reports from Turkey.
Xie, Hong-Bo; Huang, Hu; Wu, Jianhua; Liu, Lei
2015-02-01
We present a multiclass fuzzy relevance vector machine (FRVM) learning mechanism and evaluate its performance to classify multiple hand motions using surface electromyographic (sEMG) signals. The relevance vector machine (RVM) is a sparse Bayesian kernel method which avoids some limitations of the support vector machine (SVM). However, RVM still suffers the difficulty of possible unclassifiable regions in multiclass problems. We propose two fuzzy membership function-based FRVM algorithms to solve such problems, based on experiments conducted on seven healthy subjects and two amputees with six hand motions. Two feature sets, namely, AR model coefficients and room mean square value (AR-RMS), and wavelet transform (WT) features, are extracted from the recorded sEMG signals. Fuzzy support vector machine (FSVM) analysis was also conducted for wide comparison in terms of accuracy, sparsity, training and testing time, as well as the effect of training sample sizes. FRVM yielded comparable classification accuracy with dramatically fewer support vectors in comparison with FSVM. Furthermore, the processing delay of FRVM was much less than that of FSVM, whilst training time of FSVM much faster than FRVM. The results indicate that FRVM classifier trained using sufficient samples can achieve comparable generalization capability as FSVM with significant sparsity in multi-channel sEMG classification, which is more suitable for sEMG-based real-time control applications.
Zhang, Yiyan; Xin, Yi; Li, Qin; Ma, Jianshe; Li, Shuai; Lv, Xiaodan; Lv, Weiqi
2017-11-02
Various kinds of data mining algorithms are continuously raised with the development of related disciplines. The applicable scopes and their performances of these algorithms are different. Hence, finding a suitable algorithm for a dataset is becoming an important emphasis for biomedical researchers to solve practical problems promptly. In this paper, seven kinds of sophisticated active algorithms, namely, C4.5, support vector machine, AdaBoost, k-nearest neighbor, naïve Bayes, random forest, and logistic regression, were selected as the research objects. The seven algorithms were applied to the 12 top-click UCI public datasets with the task of classification, and their performances were compared through induction and analysis. The sample size, number of attributes, number of missing values, and the sample size of each class, correlation coefficients between variables, class entropy of task variable, and the ratio of the sample size of the largest class to the least class were calculated to character the 12 research datasets. The two ensemble algorithms reach high accuracy of classification on most datasets. Moreover, random forest performs better than AdaBoost on the unbalanced dataset of the multi-class task. Simple algorithms, such as the naïve Bayes and logistic regression model are suitable for a small dataset with high correlation between the task and other non-task attribute variables. K-nearest neighbor and C4.5 decision tree algorithms perform well on binary- and multi-class task datasets. Support vector machine is more adept on the balanced small dataset of the binary-class task. No algorithm can maintain the best performance in all datasets. The applicability of the seven data mining algorithms on the datasets with different characteristics was summarized to provide a reference for biomedical researchers or beginners in different fields.
NASA Astrophysics Data System (ADS)
Khawaja, Taimoor Saleem
A high-belief low-overhead Prognostics and Health Management (PHM) system is desired for online real-time monitoring of complex non-linear systems operating in a complex (possibly non-Gaussian) noise environment. This thesis presents a Bayesian Least Squares Support Vector Machine (LS-SVM) based framework for fault diagnosis and failure prognosis in nonlinear non-Gaussian systems. The methodology assumes the availability of real-time process measurements, definition of a set of fault indicators and the existence of empirical knowledge (or historical data) to characterize both nominal and abnormal operating conditions. An efficient yet powerful Least Squares Support Vector Machine (LS-SVM) algorithm, set within a Bayesian Inference framework, not only allows for the development of real-time algorithms for diagnosis and prognosis but also provides a solid theoretical framework to address key concepts related to classification for diagnosis and regression modeling for prognosis. SVM machines are founded on the principle of Structural Risk Minimization (SRM) which tends to find a good trade-off between low empirical risk and small capacity. The key features in SVM are the use of non-linear kernels, the absence of local minima, the sparseness of the solution and the capacity control obtained by optimizing the margin. The Bayesian Inference framework linked with LS-SVMs allows a probabilistic interpretation of the results for diagnosis and prognosis. Additional levels of inference provide the much coveted features of adaptability and tunability of the modeling parameters. The two main modules considered in this research are fault diagnosis and failure prognosis. With the goal of designing an efficient and reliable fault diagnosis scheme, a novel Anomaly Detector is suggested based on the LS-SVM machines. The proposed scheme uses only baseline data to construct a 1-class LS-SVM machine which, when presented with online data is able to distinguish between normal behavior and any abnormal or novel data during real-time operation. The results of the scheme are interpreted as a posterior probability of health (1 - probability of fault). As shown through two case studies in Chapter 3, the scheme is well suited for diagnosing imminent faults in dynamical non-linear systems. Finally, the failure prognosis scheme is based on an incremental weighted Bayesian LS-SVR machine. It is particularly suited for online deployment given the incremental nature of the algorithm and the quick optimization problem solved in the LS-SVR algorithm. By way of kernelization and a Gaussian Mixture Modeling (GMM) scheme, the algorithm can estimate "possibly" non-Gaussian posterior distributions for complex non-linear systems. An efficient regression scheme associated with the more rigorous core algorithm allows for long-term predictions, fault growth estimation with confidence bounds and remaining useful life (RUL) estimation after a fault is detected. The leading contributions of this thesis are (a) the development of a novel Bayesian Anomaly Detector for efficient and reliable Fault Detection and Identification (FDI) based on Least Squares Support Vector Machines, (b) the development of a data-driven real-time architecture for long-term Failure Prognosis using Least Squares Support Vector Machines, (c) Uncertainty representation and management using Bayesian Inference for posterior distribution estimation and hyper-parameter tuning, and finally (d) the statistical characterization of the performance of diagnosis and prognosis algorithms in order to relate the efficiency and reliability of the proposed schemes.
Power line identification of millimeter wave radar based on PCA-GS-SVM
NASA Astrophysics Data System (ADS)
Fang, Fang; Zhang, Guifeng; Cheng, Yansheng
2017-12-01
Aiming at the problem that the existing detection method can not effectively solve the security of UAV's ultra low altitude flight caused by power line, a power line recognition method based on grid search (GS) and the principal component analysis and support vector machine (PCA-SVM) is proposed. Firstly, the candidate line of Hough transform is reduced by PCA, and the main feature of candidate line is extracted. Then, upport vector machine (SVM is) optimized by grid search method (GS). Finally, using support vector machine classifier optimized parameters to classify the candidate line. MATLAB simulation results show that this method can effectively identify the power line and noise, and has high recognition accuracy and algorithm efficiency.
NASA Astrophysics Data System (ADS)
Ni, Y. Q.; Fan, K. Q.; Zheng, G.; Chan, T. H. T.; Ko, J. M.
2003-08-01
An automatic modal identification program is developed for continuous extraction of modal parameters of three cable-supported bridges in Hong Kong which are instrumented with a long-term monitoring system. The program employs the Complex Modal Indication Function (CMIF) algorithm to identify modal properties from continuous ambient vibration measurements in an on-line manner. By using the LabVIEW graphical programming language, the software realizes the algorithm in Virtual Instrument (VI) style. The applicability and implementation issues of the developed software are demonstrated by using one-year measurement data acquired from 67 channels of accelerometers deployed on the cable-stayed Ting Kau Bridge. With the continuously identified results, normal variability of modal vectors caused by varying environmental and operational conditions is observed. Such observation is very helpful for selection of appropriate measured modal vectors for structural health monitoring applications.
NASA Astrophysics Data System (ADS)
Cao, Jin; Jiang, Zhibin; Wang, Kangzhou
2017-07-01
Many nonlinear customer satisfaction-related factors significantly influence the future customer demand for service-oriented manufacturing (SOM). To address this issue and enhance the prediction accuracy, this article develops a novel customer demand prediction approach for SOM. The approach combines the phase space reconstruction (PSR) technique with the optimized least square support vector machine (LSSVM). First, the prediction sample space is reconstructed by the PSR to enrich the time-series dynamics of the limited data sample. Then, the generalization and learning ability of the LSSVM are improved by the hybrid polynomial and radial basis function kernel. Finally, the key parameters of the LSSVM are optimized by the particle swarm optimization algorithm. In a real case study, the customer demand prediction of an air conditioner compressor is implemented. Furthermore, the effectiveness and validity of the proposed approach are demonstrated by comparison with other classical predication approaches.
Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines
del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J.; Raboso, Mariano
2015-01-01
Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation—based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking—to reduce the dimensions of images—and binarization—to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392
Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.
del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano
2015-06-17
Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation-based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking-to reduce the dimensions of images-and binarization-to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements.
Ranking Support Vector Machine with Kernel Approximation
Dou, Yong
2017-01-01
Learning to rank algorithm has become important in recent years due to its successful application in information retrieval, recommender system, and computational biology, and so forth. Ranking support vector machine (RankSVM) is one of the state-of-art ranking models and has been favorably used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problem. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. Primal truncated Newton method is used to optimize the pairwise L2-loss (squared Hinge-loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method gets a much faster training speed than kernel RankSVM and achieves comparable or better performance over state-of-the-art ranking algorithms. PMID:28293256
Ranking Support Vector Machine with Kernel Approximation.
Chen, Kai; Li, Rongchun; Dou, Yong; Liang, Zhengfa; Lv, Qi
2017-01-01
Learning to rank algorithm has become important in recent years due to its successful application in information retrieval, recommender system, and computational biology, and so forth. Ranking support vector machine (RankSVM) is one of the state-of-art ranking models and has been favorably used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problem. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. Primal truncated Newton method is used to optimize the pairwise L2-loss (squared Hinge-loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method gets a much faster training speed than kernel RankSVM and achieves comparable or better performance over state-of-the-art ranking algorithms.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jiang, Huaiguang
This work proposes an approach for distribution system load forecasting, which aims to provide highly accurate short-term load forecasting with high resolution utilizing a support vector regression (SVR) based forecaster and a two-step hybrid parameters optimization method. Specifically, because the load profiles in distribution systems contain abrupt deviations, a data normalization is designed as the pretreatment for the collected historical load data. Then an SVR model is trained by the load data to forecast the future load. For better performance of SVR, a two-step hybrid optimization algorithm is proposed to determine the best parameters. In the first step of themore » hybrid optimization algorithm, a designed grid traverse algorithm (GTA) is used to narrow the parameters searching area from a global to local space. In the second step, based on the result of the GTA, particle swarm optimization (PSO) is used to determine the best parameters in the local parameter space. After the best parameters are determined, the SVR model is used to forecast the short-term load deviation in the distribution system.« less
A new range-free localisation in wireless sensor networks using support vector machine
NASA Astrophysics Data System (ADS)
Wang, Zengfeng; Zhang, Hao; Lu, Tingting; Sun, Yujuan; Liu, Xing
2018-02-01
Location information of sensor nodes is of vital importance for most applications in wireless sensor networks (WSNs). This paper proposes a new range-free localisation algorithm using support vector machine (SVM) and polar coordinate system (PCS), LSVM-PCS. In LSVM-PCS, two sets of classes are first constructed based on sensor nodes' polar coordinates. Using the boundaries of the defined classes, the operation region of WSN field is partitioned into a finite number of polar grids. Each sensor node can be localised into one of the polar grids by executing two localisation algorithms that are developed on the basis of SVM classification. The centre of the resident polar grid is then estimated as the location of the sensor node. In addition, a two-hop mass-spring optimisation (THMSO) is also proposed to further improve the localisation accuracy of LSVM-PCS. In THMSO, both neighbourhood information and non-neighbourhood information are used to refine the sensor node location. The results obtained verify that the proposed algorithm provides a significant improvement over existing localisation methods.
An Algorithm for Converting Static Earth Sensor Measurements into Earth Observation Vectors
NASA Technical Reports Server (NTRS)
Harman, R.; Hashmall, Joseph A.; Sedlak, Joseph
2004-01-01
An algorithm has been developed that converts penetration angles reported by Static Earth Sensors (SESs) into Earth observation vectors. This algorithm allows compensation for variation in the horizon height including that caused by Earth oblateness. It also allows pitch and roll to be computed using any number (greater than 1) of simultaneous sensor penetration angles simplifying processing during periods of Sun and Moon interference. The algorithm computes body frame unit vectors through each SES cluster. It also computes GCI vectors from the spacecraft to the position on the Earth's limb where each cluster detects the Earth's limb. These body frame vectors are used as sensor observation vectors and the GCI vectors are used as reference vectors in an attitude solution. The attitude, with the unobservable yaw discarded, is iteratively refined to provide the Earth observation vector solution.
NASA Astrophysics Data System (ADS)
Avetisyan, H.; Bruna, O.; Holub, J.
2016-11-01
A numerous techniques and algorithms are dedicated to extract emotions from input data. In our investigation it was stated that emotion-detection approaches can be classified into 3 following types: Keyword based / lexical-based, learning based, and hybrid. The most commonly used techniques, such as keyword-spotting method, Support Vector Machines, Naïve Bayes Classifier, Hidden Markov Model and hybrid algorithms, have impressive results in this sphere and can reach more than 90% determining accuracy.
Using support vector machine to predict beta- and gamma-turns in proteins.
Hu, Xiuzhen; Li, Qianzhong
2008-09-01
By using the composite vector with increment of diversity, position conservation scoring function, and predictive secondary structures to express the information of sequence, a support vector machine (SVM) algorithm for predicting beta- and gamma-turns in the proteins is proposed. The 426 and 320 nonhomologous protein chains described by Guruprasad and Rajkumar (Guruprasad and Rajkumar J. Biosci 2000, 25,143) are used for training and testing the predictive model of the beta- and gamma-turns, respectively. The overall prediction accuracy and the Matthews correlation coefficient in 7-fold cross-validation are 79.8% and 0.47, respectively, for the beta-turns. The overall prediction accuracy in 5-fold cross-validation is 61.0% for the gamma-turns. These results are significantly higher than the other algorithms in the prediction of beta- and gamma-turns using the same datasets. In addition, the 547 and 823 nonhomologous protein chains described by Fuchs and Alix (Fuchs and Alix Proteins: Struct Funct Bioinform 2005, 59, 828) are used for training and testing the predictive model of the beta- and gamma-turns, and better results are obtained. This algorithm may be helpful to improve the performance of protein turns' prediction. To ensure the ability of the SVM method to correctly classify beta-turn and non-beta-turn (gamma-turn and non-gamma-turn), the receiver operating characteristic threshold independent measure curves are provided. (c) 2008 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Fatehi, Moslem; Asadi, Hooshang H.
2017-04-01
In this study, the application of a transductive support vector machine (TSVM), an innovative semi-supervised learning algorithm, has been proposed for mapping the potential drill targets at a detailed exploration stage. The semi-supervised learning method is a hybrid of supervised and unsupervised learning approach that simultaneously uses both training and non-training data to design a classifier. By using the TSVM algorithm, exploration layers at the Dalli porphyry Cu-Au deposit in the central Iran were integrated to locate the boundary of the Cu-Au mineralization for further drilling. By applying this algorithm on the non-training (unlabeled) and limited training (labeled) Dalli exploration data, the study area was classified in two domains of Cu-Au ore and waste. Then, the results were validated by the earlier block models created, using the available borehole and trench data. In addition to TSVM, the support vector machine (SVM) algorithm was also implemented on the study area for comparison. Thirty percent of the labeled exploration data was used to evaluate the performance of these two algorithms. The results revealed 87 percent correct recognition accuracy for the TSVM algorithm and 82 percent for the SVM algorithm. The deepest inclined borehole, recently drilled in the western part of the Dalli deposit, indicated that the boundary of Cu-Au mineralization, as identified by the TSVM algorithm, was only 15 m off from the actual boundary intersected by this borehole. According to the results of the TSVM algorithm, six new boreholes were suggested for further drilling at the Dalli deposit. This study showed that the TSVM algorithm could be a useful tool for enhancing the mineralization zones and consequently, ensuring a more accurate drill hole planning.
NASA Astrophysics Data System (ADS)
Aghamaleki, Javad Abbasi; Behrad, Alireza
2018-01-01
Double compression detection is a crucial stage in digital image and video forensics. However, the detection of double compressed videos is challenging when the video forger uses the same quantization matrix and synchronized group of pictures (GOP) structure during the recompression history to conceal tampering effects. A passive approach is proposed for detecting double compressed MPEG videos with the same quantization matrix and synchronized GOP structure. To devise the proposed algorithm, the effects of recompression on P frames are mathematically studied. Then, based on the obtained guidelines, a feature vector is proposed to detect double compressed frames on the GOP level. Subsequently, sparse representations of the feature vectors are used for dimensionality reduction and enrich the traces of recompression. Finally, a support vector machine classifier is employed to detect and localize double compression in temporal domain. The experimental results show that the proposed algorithm achieves the accuracy of more than 95%. In addition, the comparisons of the results of the proposed method with those of other methods reveal the efficiency of the proposed algorithm.
Support vector machine multiuser receiver for DS-CDMA signals in multipath channels.
Chen, S; Samingan, A K; Hanzo, L
2001-01-01
The problem of constructing an adaptive multiuser detector (MUD) is considered for direct sequence code division multiple access (DS-CDMA) signals transmitted through multipath channels. The emerging learning technique, called support vector machines (SVM), is proposed as a method of obtaining a nonlinear MUD from a relatively small training data block. Computer simulation is used to study this SVM MUD, and the results show that it can closely match the performance of the optimal Bayesian one-shot detector. Comparisons with an adaptive radial basis function (RBF) MUD trained by an unsupervised clustering algorithm are discussed.
A selective-update affine projection algorithm with selective input vectors
NASA Astrophysics Data System (ADS)
Kong, NamWoong; Shin, JaeWook; Park, PooGyeon
2011-10-01
This paper proposes an affine projection algorithm (APA) with selective input vectors, which based on the concept of selective-update in order to reduce estimation errors and computations. The algorithm consists of two procedures: input- vector-selection and state-decision. The input-vector-selection procedure determines the number of input vectors by checking with mean square error (MSE) whether the input vectors have enough information for update. The state-decision procedure determines the current state of the adaptive filter by using the state-decision criterion. As the adaptive filter is in transient state, the algorithm updates the filter coefficients with the selected input vectors. On the other hand, as soon as the adaptive filter reaches the steady state, the update procedure is not performed. Through these two procedures, the proposed algorithm achieves small steady-state estimation errors, low computational complexity and low update complexity for colored input signals.
SNPs selection using support vector regression and genetic algorithms in GWAS
2014-01-01
Introduction This paper proposes a new methodology to simultaneously select the most relevant SNPs markers for the characterization of any measurable phenotype described by a continuous variable using Support Vector Regression with Pearson Universal kernel as fitness function of a binary genetic algorithm. The proposed methodology is multi-attribute towards considering several markers simultaneously to explain the phenotype and is based jointly on statistical tools, machine learning and computational intelligence. Results The suggested method has shown potential in the simulated database 1, with additive effects only, and real database. In this simulated database, with a total of 1,000 markers, and 7 with major effect on the phenotype and the other 993 SNPs representing the noise, the method identified 21 markers. Of this total, 5 are relevant SNPs between the 7 but 16 are false positives. In real database, initially with 50,752 SNPs, we have reduced to 3,073 markers, increasing the accuracy of the model. In the simulated database 2, with additive effects and interactions (epistasis), the proposed method matched to the methodology most commonly used in GWAS. Conclusions The method suggested in this paper demonstrates the effectiveness in explaining the real phenotype (PTA for milk), because with the application of the wrapper based on genetic algorithm and Support Vector Regression with Pearson Universal, many redundant markers were eliminated, increasing the prediction and accuracy of the model on the real database without quality control filters. The PUK demonstrated that it can replicate the performance of linear and RBF kernels. PMID:25573332
Detection of Pathological Voice Using Cepstrum Vectors: A Deep Learning Approach.
Fang, Shih-Hau; Tsao, Yu; Hsiao, Min-Jing; Chen, Ji-Ying; Lai, Ying-Hui; Lin, Feng-Chuan; Wang, Chi-Te
2018-03-19
Computerized detection of voice disorders has attracted considerable academic and clinical interest in the hope of providing an effective screening method for voice diseases before endoscopic confirmation. This study proposes a deep-learning-based approach to detect pathological voice and examines its performance and utility compared with other automatic classification algorithms. This study retrospectively collected 60 normal voice samples and 402 pathological voice samples of 8 common clinical voice disorders in a voice clinic of a tertiary teaching hospital. We extracted Mel frequency cepstral coefficients from 3-second samples of a sustained vowel. The performances of three machine learning algorithms, namely, deep neural network (DNN), support vector machine, and Gaussian mixture model, were evaluated based on a fivefold cross-validation. Collective cases from the voice disorder database of MEEI (Massachusetts Eye and Ear Infirmary) were used to verify the performance of the classification mechanisms. The experimental results demonstrated that DNN outperforms Gaussian mixture model and support vector machine. Its accuracy in detecting voice pathologies reached 94.26% and 90.52% in male and female subjects, based on three representative Mel frequency cepstral coefficient features. When applied to the MEEI database for validation, the DNN also achieved a higher accuracy (99.32%) than the other two classification algorithms. By stacking several layers of neurons with optimized weights, the proposed DNN algorithm can fully utilize the acoustic features and efficiently differentiate between normal and pathological voice samples. Based on this pilot study, future research may proceed to explore more application of DNN from laboratory and clinical perspectives. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Reconstructing householder vectors from Tall-Skinny QR
Ballard, Grey Malone; Demmel, James; Grigori, Laura; ...
2015-08-05
The Tall-Skinny QR (TSQR) algorithm is more communication efficient than the standard Householder algorithm for QR decomposition of matrices with many more rows than columns. However, TSQR produces a different representation of the orthogonal factor and therefore requires more software development to support the new representation. Further, implicitly applying the orthogonal factor to the trailing matrix in the context of factoring a square matrix is more complicated and costly than with the Householder representation. We show how to perform TSQR and then reconstruct the Householder vector representation with the same asymptotic communication efficiency and little extra computational cost. We demonstratemore » the high performance and numerical stability of this algorithm both theoretically and empirically. The new Householder reconstruction algorithm allows us to design more efficient parallel QR algorithms, with significantly lower latency cost compared to Householder QR and lower bandwidth and latency costs compared with Communication-Avoiding QR (CAQR) algorithm. Experiments on supercomputers demonstrate the benefits of the communication cost improvements: in particular, our experiments show substantial improvements over tuned library implementations for tall-and-skinny matrices. Furthermore, we also provide algorithmic improvements to the Householder QR and CAQR algorithms, and we investigate several alternatives to the Householder reconstruction algorithm that sacrifice guarantees on numerical stability in some cases in order to obtain higher performance.« less
Scattering transform and LSPTSVM based fault diagnosis of rotating machinery
NASA Astrophysics Data System (ADS)
Ma, Shangjun; Cheng, Bo; Shang, Zhaowei; Liu, Geng
2018-05-01
This paper proposes an algorithm for fault diagnosis of rotating machinery to overcome the shortcomings of classical techniques which are noise sensitive in feature extraction and time consuming for training. Based on the scattering transform and the least squares recursive projection twin support vector machine (LSPTSVM), the method has the advantages of high efficiency and insensitivity for noise signal. Using the energy of the scattering coefficients in each sub-band, the features of the vibration signals are obtained. Then, an LSPTSVM classifier is used for fault diagnosis. The new method is compared with other common methods including the proximal support vector machine, the standard support vector machine and multi-scale theory by using fault data for two systems, a motor bearing and a gear box. The results show that the new method proposed in this study is more effective for fault diagnosis of rotating machinery.
A Human Activity Recognition System Using Skeleton Data from RGBD Sensors.
Cippitelli, Enea; Gasparrini, Samuele; Gambi, Ennio; Spinsante, Susanna
2016-01-01
The aim of Active and Assisted Living is to develop tools to promote the ageing in place of elderly people, and human activity recognition algorithms can help to monitor aged people in home environments. Different types of sensors can be used to address this task and the RGBD sensors, especially the ones used for gaming, are cost-effective and provide much information about the environment. This work aims to propose an activity recognition algorithm exploiting skeleton data extracted by RGBD sensors. The system is based on the extraction of key poses to compose a feature vector, and a multiclass Support Vector Machine to perform classification. Computation and association of key poses are carried out using a clustering algorithm, without the need of a learning algorithm. The proposed approach is evaluated on five publicly available datasets for activity recognition, showing promising results especially when applied for the recognition of AAL related actions. Finally, the current applicability of this solution in AAL scenarios and the future improvements needed are discussed.
Segmentation of magnetic resonance images using fuzzy algorithms for learning vector quantization.
Karayiannis, N B; Pai, P I
1999-02-01
This paper evaluates a segmentation technique for magnetic resonance (MR) images of the brain based on fuzzy algorithms for learning vector quantization (FALVQ). These algorithms perform vector quantization by updating all prototypes of a competitive network through an unsupervised learning process. Segmentation of MR images is formulated as an unsupervised vector quantization process, where the local values of different relaxation parameters form the feature vectors which are represented by a relatively small set of prototypes. The experiments evaluate a variety of FALVQ algorithms in terms of their ability to identify different tissues and discriminate between normal tissues and abnormalities.
Budget Online Learning Algorithm for Least Squares SVM.
Jian, Ling; Shen, Shuqian; Li, Jundong; Liang, Xijun; Li, Lei
2017-09-01
Batch-mode least squares support vector machine (LSSVM) is often associated with unbounded number of support vectors (SVs'), making it unsuitable for applications involving large-scale streaming data. Limited-scale LSSVM, which allows efficient updating, seems to be a good solution to tackle this issue. In this paper, to train the limited-scale LSSVM dynamically, we present a budget online LSSVM (BOLSSVM) algorithm. Methodologically, by setting a fixed budget for SVs', we are able to update the LSSVM model according to the updated SVs' set dynamically without retraining from scratch. In particular, when a new small chunk of SVs' substitute for the old ones, the proposed algorithm employs a low rank correction technology and the Sherman-Morrison-Woodbury formula to compute the inverse of saddle point matrix derived from the LSSVM's Karush-Kuhn-Tucker (KKT) system, which, in turn, updates the LSSVM model efficiently. In this way, the proposed BOLSSVM algorithm is especially useful for online prediction tasks. Another merit of the proposed BOLSSVM is that it can be used for k -fold cross validation. Specifically, compared with batch-mode learning methods, the computational complexity of the proposed BOLSSVM method is significantly reduced from O(n 4 ) to O(n 3 ) for leave-one-out cross validation with n training samples. The experimental results of classification and regression on benchmark data sets and real-world applications show the validity and effectiveness of the proposed BOLSSVM algorithm.
Application of XGBoost algorithm in hourly PM2.5 concentration prediction
NASA Astrophysics Data System (ADS)
Pan, Bingyue
2018-02-01
In view of prediction techniques of hourly PM2.5 concentration in China, this paper applied the XGBoost(Extreme Gradient Boosting) algorithm to predict hourly PM2.5 concentration. The monitoring data of air quality in Tianjin city was analyzed by using XGBoost algorithm. The prediction performance of the XGBoost method is evaluated by comparing observed and predicted PM2.5 concentration using three measures of forecast accuracy. The XGBoost method is also compared with the random forest algorithm, multiple linear regression, decision tree regression and support vector machines for regression models using computational results. The results demonstrate that the XGBoost algorithm outperforms other data mining methods.
Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment
NASA Astrophysics Data System (ADS)
Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty
2017-12-01
Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can solve the problem of classification and linear regression or nonlinear kernel which can be a learning algorithm for the ability of classification and regression. However, SVM also has a weakness that is difficult to determine the optimal parameter value. SVM calculates the best linear separator on the input feature space according to the training data. To classify data which are non-linearly separable, SVM uses kernel tricks to transform the data into a linearly separable data on a higher dimension feature space. The kernel trick using various kinds of kernel functions, such as : linear kernel, polynomial, radial base function (RBF) and sigmoid. Each function has parameters which affect the accuracy of SVM classification. To solve the problem genetic algorithms are proposed to be applied as the optimal parameter value search algorithm thus increasing the best classification accuracy on SVM. Data taken from UCI repository of machine learning database: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy. Genetic algorithms has been shown to be effective in systematically finding optimal kernel parameters for SVM, instead of randomly selected kernel parameters. The best accuracy for data has been upgraded from kernel Linear: 85.12%, polynomial: 81.76%, RBF: 77.22% Sigmoid: 78.70%. However, for bigger data sizes, this method is not practical because it takes a lot of time.
Assessment of various supervised learning algorithms using different performance metrics
NASA Astrophysics Data System (ADS)
Susheel Kumar, S. M.; Laxkar, Deepak; Adhikari, Sourav; Vijayarajan, V.
2017-11-01
Our work brings out comparison based on the performance of supervised machine learning algorithms on a binary classification task. The supervised machine learning algorithms which are taken into consideration in the following work are namely Support Vector Machine(SVM), Decision Tree(DT), K Nearest Neighbour (KNN), Naïve Bayes(NB) and Random Forest(RF). This paper mostly focuses on comparing the performance of above mentioned algorithms on one binary classification task by analysing the Metrics such as Accuracy, F-Measure, G-Measure, Precision, Misclassification Rate, False Positive Rate, True Positive Rate, Specificity, Prevalence.
Miranian, A; Abdollahzade, M
2013-02-01
Local modeling approaches, owing to their ability to model different operating regimes of nonlinear systems and processes by independent local models, seem appealing for modeling, identification, and prediction applications. In this paper, we propose a local neuro-fuzzy (LNF) approach based on the least-squares support vector machines (LSSVMs). The proposed LNF approach employs LSSVMs, which are powerful in modeling and predicting time series, as local models and uses hierarchical binary tree (HBT) learning algorithm for fast and efficient estimation of its parameters. The HBT algorithm heuristically partitions the input space into smaller subdomains by axis-orthogonal splits. In each partitioning, the validity functions automatically form a unity partition and therefore normalization side effects, e.g., reactivation, are prevented. Integration of LSSVMs into the LNF network as local models, along with the HBT learning algorithm, yield a high-performance approach for modeling and prediction of complex nonlinear time series. The proposed approach is applied to modeling and predictions of different nonlinear and chaotic real-world and hand-designed systems and time series. Analysis of the prediction results and comparisons with recent and old studies demonstrate the promising performance of the proposed LNF approach with the HBT learning algorithm for modeling and prediction of nonlinear and chaotic systems and time series.
NASA Astrophysics Data System (ADS)
Pasquato, Mario; Chung, Chul
2016-05-01
Context. Machine-learning (ML) solves problems by learning patterns from data with limited or no human guidance. In astronomy, ML is mainly applied to large observational datasets, e.g. for morphological galaxy classification. Aims: We apply ML to gravitational N-body simulations of star clusters that are either formed by merging two progenitors or evolved in isolation, planning to later identify globular clusters (GCs) that may have a history of merging from observational data. Methods: We create mock-observations from simulated GCs, from which we measure a set of parameters (also called features in the machine-learning field). After carrying out dimensionality reduction on the feature space, the resulting datapoints are fed in to various classification algorithms. Using repeated random subsampling validation, we check whether the groups identified by the algorithms correspond to the underlying physical distinction between mergers and monolithically evolved simulations. Results: The three algorithms we considered (C5.0 trees, k-nearest neighbour, and support-vector machines) all achieve a test misclassification rate of about 10% without parameter tuning, with support-vector machines slightly outperforming the others. The first principal component of feature space correlates with cluster concentration. If we exclude it from the regression, the performance of the algorithms is only slightly reduced.
A novel dynamical community detection algorithm based on weighting scheme
NASA Astrophysics Data System (ADS)
Li, Ju; Yu, Kai; Hu, Ke
2015-12-01
Network dynamics plays an important role in analyzing the correlation between the function properties and the topological structure. In this paper, we propose a novel dynamical iteration (DI) algorithm, which incorporates the iterative process of membership vector with weighting scheme, i.e. weighting W and tightness T. These new elements can be used to adjust the link strength and the node compactness for improving the speed and accuracy of community structure detection. To estimate the optimal stop time of iteration, we utilize a new stability measure which is defined as the Markov random walk auto-covariance. We do not need to specify the number of communities in advance. It naturally supports the overlapping communities by associating each node with a membership vector describing the node's involvement in each community. Theoretical analysis and experiments show that the algorithm can uncover communities effectively and efficiently.
Prediction of Drug-Plasma Protein Binding Using Artificial Intelligence Based Algorithms.
Kumar, Rajnish; Sharma, Anju; Siddiqui, Mohammed Haris; Tiwari, Rajesh Kumar
2018-01-01
Plasma protein binding (PPB) has vital importance in the characterization of drug distribution in the systemic circulation. Unfavorable PPB can pose a negative effect on clinical development of promising drug candidates. The drug distribution properties should be considered at the initial phases of the drug design and development. Therefore, PPB prediction models are receiving an increased attention. In the current study, we present a systematic approach using Support vector machine, Artificial neural network, k- nearest neighbor, Probabilistic neural network, Partial least square and Linear discriminant analysis to relate various in vitro and in silico molecular descriptors to a diverse dataset of 736 drugs/drug-like compounds. The overall accuracy of Support vector machine with Radial basis function kernel came out to be comparatively better than the rest of the applied algorithms. The training set accuracy, validation set accuracy, precision, sensitivity, specificity and F1 score for the Suprort vector machine was found to be 89.73%, 89.97%, 92.56%, 87.26%, 91.97% and 0.898, respectively. This model can potentially be useful in screening of relevant drug candidates at the preliminary stages of drug design and development. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
[Orthogonal Vector Projection Algorithm for Spectral Unmixing].
Song, Mei-ping; Xu, Xing-wei; Chang, Chein-I; An, Ju-bai; Yao, Li
2015-12-01
Spectrum unmixing is an important part of hyperspectral technologies, which is essential for material quantity analysis in hyperspectral imagery. Most linear unmixing algorithms require computations of matrix multiplication and matrix inversion or matrix determination. These are difficult for programming, especially hard for realization on hardware. At the same time, the computation costs of the algorithms increase significantly as the number of endmembers grows. Here, based on the traditional algorithm Orthogonal Subspace Projection, a new method called. Orthogonal Vector Projection is prompted using orthogonal principle. It simplifies this process by avoiding matrix multiplication and inversion. It firstly computes the final orthogonal vector via Gram-Schmidt process for each endmember spectrum. And then, these orthogonal vectors are used as projection vector for the pixel signature. The unconstrained abundance can be obtained directly by projecting the signature to the projection vectors, and computing the ratio of projected vector length and orthogonal vector length. Compared to the Orthogonal Subspace Projection and Least Squares Error algorithms, this method does not need matrix inversion, which is much computation costing and hard to implement on hardware. It just completes the orthogonalization process by repeated vector operations, easy for application on both parallel computation and hardware. The reasonability of the algorithm is proved by its relationship with Orthogonal Sub-space Projection and Least Squares Error algorithms. And its computational complexity is also compared with the other two algorithms', which is the lowest one. At last, the experimental results on synthetic image and real image are also provided, giving another evidence for effectiveness of the method.
Improving Vector Evaluated Particle Swarm Optimisation by Incorporating Nondominated Solutions
Lim, Kian Sheng; Ibrahim, Zuwairie; Buyamin, Salinda; Ahmad, Anita; Naim, Faradila; Ghazali, Kamarul Hawari; Mokhtar, Norrima
2013-01-01
The Vector Evaluated Particle Swarm Optimisation algorithm is widely used to solve multiobjective optimisation problems. This algorithm optimises one objective using a swarm of particles where their movements are guided by the best solution found by another swarm. However, the best solution of a swarm is only updated when a newly generated solution has better fitness than the best solution at the objective function optimised by that swarm, yielding poor solutions for the multiobjective optimisation problems. Thus, an improved Vector Evaluated Particle Swarm Optimisation algorithm is introduced by incorporating the nondominated solutions as the guidance for a swarm rather than using the best solution from another swarm. In this paper, the performance of improved Vector Evaluated Particle Swarm Optimisation algorithm is investigated using performance measures such as the number of nondominated solutions found, the generational distance, the spread, and the hypervolume. The results suggest that the improved Vector Evaluated Particle Swarm Optimisation algorithm has impressive performance compared with the conventional Vector Evaluated Particle Swarm Optimisation algorithm. PMID:23737718
Improving Vector Evaluated Particle Swarm Optimisation by incorporating nondominated solutions.
Lim, Kian Sheng; Ibrahim, Zuwairie; Buyamin, Salinda; Ahmad, Anita; Naim, Faradila; Ghazali, Kamarul Hawari; Mokhtar, Norrima
2013-01-01
The Vector Evaluated Particle Swarm Optimisation algorithm is widely used to solve multiobjective optimisation problems. This algorithm optimises one objective using a swarm of particles where their movements are guided by the best solution found by another swarm. However, the best solution of a swarm is only updated when a newly generated solution has better fitness than the best solution at the objective function optimised by that swarm, yielding poor solutions for the multiobjective optimisation problems. Thus, an improved Vector Evaluated Particle Swarm Optimisation algorithm is introduced by incorporating the nondominated solutions as the guidance for a swarm rather than using the best solution from another swarm. In this paper, the performance of improved Vector Evaluated Particle Swarm Optimisation algorithm is investigated using performance measures such as the number of nondominated solutions found, the generational distance, the spread, and the hypervolume. The results suggest that the improved Vector Evaluated Particle Swarm Optimisation algorithm has impressive performance compared with the conventional Vector Evaluated Particle Swarm Optimisation algorithm.
On A Nonlinear Generalization of Sparse Coding and Dictionary Learning.
Xie, Yuchen; Ho, Jeffrey; Vemuri, Baba
2013-01-01
Existing dictionary learning algorithms are based on the assumption that the data are vectors in an Euclidean vector space ℝ d , and the dictionary is learned from the training data using the vector space structure of ℝ d and its Euclidean L 2 -metric. However, in many applications, features and data often originated from a Riemannian manifold that does not support a global linear (vector space) structure. Furthermore, the extrinsic viewpoint of existing dictionary learning algorithms becomes inappropriate for modeling and incorporating the intrinsic geometry of the manifold that is potentially important and critical to the application. This paper proposes a novel framework for sparse coding and dictionary learning for data on a Riemannian manifold, and it shows that the existing sparse coding and dictionary learning methods can be considered as special (Euclidean) cases of the more general framework proposed here. We show that both the dictionary and sparse coding can be effectively computed for several important classes of Riemannian manifolds, and we validate the proposed method using two well-known classification problems in computer vision and medical imaging analysis.
On A Nonlinear Generalization of Sparse Coding and Dictionary Learning
Xie, Yuchen; Ho, Jeffrey; Vemuri, Baba
2013-01-01
Existing dictionary learning algorithms are based on the assumption that the data are vectors in an Euclidean vector space ℝd, and the dictionary is learned from the training data using the vector space structure of ℝd and its Euclidean L2-metric. However, in many applications, features and data often originated from a Riemannian manifold that does not support a global linear (vector space) structure. Furthermore, the extrinsic viewpoint of existing dictionary learning algorithms becomes inappropriate for modeling and incorporating the intrinsic geometry of the manifold that is potentially important and critical to the application. This paper proposes a novel framework for sparse coding and dictionary learning for data on a Riemannian manifold, and it shows that the existing sparse coding and dictionary learning methods can be considered as special (Euclidean) cases of the more general framework proposed here. We show that both the dictionary and sparse coding can be effectively computed for several important classes of Riemannian manifolds, and we validate the proposed method using two well-known classification problems in computer vision and medical imaging analysis. PMID:24129583
NASA Astrophysics Data System (ADS)
Attia, Khalid A. M.; Nassar, Mohammed W. I.; El-Zeiny, Mohamed B.; Serag, Ahmed
2017-01-01
For the first time, a new variable selection method based on swarm intelligence namely firefly algorithm is coupled with three different multivariate calibration models namely, concentration residual augmented classical least squares, artificial neural network and support vector regression in UV spectral data. A comparative study between the firefly algorithm and the well-known genetic algorithm was developed. The discussion revealed the superiority of using this new powerful algorithm over the well-known genetic algorithm. Moreover, different statistical tests were performed and no significant differences were found between all the models regarding their predictabilities. This ensures that simpler and faster models were obtained without any deterioration of the quality of the calibration.
Research on Classification of Chinese Text Data Based on SVM
NASA Astrophysics Data System (ADS)
Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao
2017-09-01
Data Mining has important application value in today’s industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification. KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. Support Vector Machine’ (SVM) classification method is a good classifier in machine learning research. This paper will study the classification effect based on the SVM method in the Chinese text data, and use the support vector machine method in the chinese text to achieve the classify chinese text, and to able to combination of academia and practical application.
Yan, Jianjun; Shen, Xiaojing; Wang, Yiqin; Li, Fufeng; Xia, Chunming; Guo, Rui; Chen, Chunfeng; Shen, Qingwei
2010-01-01
This study aims at utilising Wavelet Packet Transform (WPT) and Support Vector Machine (SVM) algorithm to make objective analysis and quantitative research for the auscultation in Traditional Chinese Medicine (TCM) diagnosis. First, Wavelet Packet Decomposition (WPD) at level 6 was employed to split more elaborate frequency bands of the auscultation signals. Then statistic analysis was made based on the extracted Wavelet Packet Energy (WPE) features from WPD coefficients. Furthermore, the pattern recognition was used to distinguish mixed subjects' statistical feature values of sample groups through SVM. Finally, the experimental results showed that the classification accuracies were at a high level.
Applications of Support Vector Machines In Chemo And Bioinformatics
NASA Astrophysics Data System (ADS)
Jayaraman, V. K.; Sundararajan, V.
2010-10-01
Conventional linear & nonlinear tools for classification, regression & data driven modeling are being replaced on a rapid scale by newer techniques & tools based on artificial intelligence and machine learning. While the linear techniques are not applicable for inherently nonlinear problems, newer methods serve as attractive alternatives for solving real life problems. Support Vector Machine (SVM) classifiers are a set of universal feed-forward network based classification algorithms that have been formulated from statistical learning theory and structural risk minimization principle. SVM regression closely follows the classification methodology. In this work recent applications of SVM in Chemo & Bioinformatics will be described with suitable illustrative examples.
Stoean, Ruxandra; Stoean, Catalin; Lupsor, Monica; Stefanescu, Horia; Badea, Radu
2011-01-01
Hepatic fibrosis, the principal pointer to the development of a liver disease within chronic hepatitis C, can be measured through several stages. The correct evaluation of its degree, based on recent different non-invasive procedures, is of current major concern. The latest methodology for assessing it is the Fibroscan and the effect of its employment is impressive. However, the complex interaction between its stiffness indicator and the other biochemical and clinical examinations towards a respective degree of liver fibrosis is hard to be manually discovered. In this respect, the novel, well-performing evolutionary-powered support vector machines are proposed towards an automated learning of the relationship between medical attributes and fibrosis levels. The traditional support vector machines have been an often choice for addressing hepatic fibrosis, while the evolutionary option has been validated on many real-world tasks and proven flexibility and good performance. The evolutionary approach is simple and direct, resulting from the hybridization of the learning component within support vector machines and the optimization engine of evolutionary algorithms. It discovers the optimal coefficients of surfaces that separate instances of distinct classes. Apart from a detached manner of establishing the fibrosis degree for new cases, a resulting formula also offers insight upon the correspondence between the medical factors and the respective outcome. What is more, a feature selection genetic algorithm can be further embedded into the method structure, in order to dynamically concentrate search only on the most relevant attributes. The data set refers 722 patients with chronic hepatitis C infection and 24 indicators. The five possible degrees of fibrosis range from F0 (no fibrosis) to F4 (cirrhosis). Since the standard support vector machines are among the most frequently used methods in recent artificial intelligence studies for hepatic fibrosis staging, the evolutionary method is viewed in comparison to the traditional one. The multifaceted discrimination into all five degrees of fibrosis and the slightly less difficult common separation into solely three related stages are both investigated. The resulting performance proves the superiority over the standard support vector classification and the attained formula is helpful in providing an immediate calculation of the liver stage for new cases, while establishing the presence/absence and comprehending the weight of each medical factor with respect to a certain fibrosis level. The use of the evolutionary technique for fibrosis degree prediction triggers simplicity and offers a direct expression of the influence of dynamically selected indicators on the corresponding stage. Perhaps most importantly, it significantly surpasses the classical support vector machines, which are both widely used and technically sound. All these therefore confirm the promise of the new methodology towards a dependable support within the medical decision-making. Copyright © 2010 Elsevier B.V. All rights reserved.
Amin, Morteza Moradi; Kermani, Saeed; Talebi, Ardeshir; Oghli, Mostafa Ghelich
2015-01-01
Acute lymphoblastic leukemia is the most common form of pediatric cancer which is categorized into three L1, L2, and L3 and could be detected through screening of blood and bone marrow smears by pathologists. Due to being time-consuming and tediousness of the procedure, a computer-based system is acquired for convenient detection of Acute lymphoblastic leukemia. Microscopic images are acquired from blood and bone marrow smears of patients with Acute lymphoblastic leukemia and normal cases. After applying image preprocessing, cells nuclei are segmented by k-means algorithm. Then geometric and statistical features are extracted from nuclei and finally these cells are classified to cancerous and noncancerous cells by means of support vector machine classifier with 10-fold cross validation. These cells are also classified into their sub-types by multi-Support vector machine classifier. Classifier is evaluated by these parameters: Sensitivity, specificity, and accuracy which values for cancerous and noncancerous cells 98%, 95%, and 97%, respectively. These parameters are also used for evaluation of cell sub-types which values in mean 84.3%, 97.3%, and 95.6%, respectively. The results show that proposed algorithm could achieve an acceptable performance for the diagnosis of Acute lymphoblastic leukemia and its sub-types and can be used as an assistant diagnostic tool for pathologists.
Monthly evaporation forecasting using artificial neural networks and support vector machines
NASA Astrophysics Data System (ADS)
Tezel, Gulay; Buyukyildiz, Meral
2016-04-01
Evaporation is one of the most important components of the hydrological cycle, but is relatively difficult to estimate, due to its complexity, as it can be influenced by numerous factors. Estimation of evaporation is important for the design of reservoirs, especially in arid and semi-arid areas. Artificial neural network methods and support vector machines (SVM) are frequently utilized to estimate evaporation and other hydrological variables. In this study, usability of artificial neural networks (ANNs) (multilayer perceptron (MLP) and radial basis function network (RBFN)) and ɛ-support vector regression (SVR) artificial intelligence methods was investigated to estimate monthly pan evaporation. For this aim, temperature, relative humidity, wind speed, and precipitation data for the period 1972 to 2005 from Beysehir meteorology station were used as input variables while pan evaporation values were used as output. The Romanenko and Meyer method was also considered for the comparison. The results were compared with observed class A pan evaporation data. In MLP method, four different training algorithms, gradient descent with momentum and adaptive learning rule backpropagation (GDX), Levenberg-Marquardt (LVM), scaled conjugate gradient (SCG), and resilient backpropagation (RBP), were used. Also, ɛ-SVR model was used as SVR model. The models were designed via 10-fold cross-validation (CV); algorithm performance was assessed via mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R 2). According to the performance criteria, the ANN algorithms and ɛ-SVR had similar results. The ANNs and ɛ-SVR methods were found to perform better than the Romanenko and Meyer methods. Consequently, the best performance using the test data was obtained using SCG(4,2,2,1) with R 2 = 0.905.
Blood glucose level prediction based on support vector regression using mobile platforms.
Reymann, Maximilian P; Dorschky, Eva; Groh, Benjamin H; Martindale, Christine; Blank, Peter; Eskofier, Bjoern M
2016-08-01
The correct treatment of diabetes is vital to a patient's health: Staying within defined blood glucose levels prevents dangerous short- and long-term effects on the body. Mobile devices informing patients about their future blood glucose levels could enable them to take counter-measures to prevent hypo or hyper periods. Previous work addressed this challenge by predicting the blood glucose levels using regression models. However, these approaches required a physiological model, representing the human body's response to insulin and glucose intake, or are not directly applicable to mobile platforms (smart phones, tablets). In this paper, we propose an algorithm for mobile platforms to predict blood glucose levels without the need for a physiological model. Using an online software simulator program, we trained a Support Vector Regression (SVR) model and exported the parameter settings to our mobile platform. The prediction accuracy of our mobile platform was evaluated with pre-recorded data of a type 1 diabetes patient. The blood glucose level was predicted with an error of 19 % compared to the true value. Considering the permitted error of commercially used devices of 15 %, our algorithm is the basis for further development of mobile prediction algorithms.
Reduced Order Model Basis Vector Generation: Generates Basis Vectors fro ROMs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arrighi, Bill
2016-03-03
libROM is a library that implements order reduction via singular value decomposition (SVD) of sampled state vectors. It implements 2 parallel, incremental SVD algorithms and one serial, non-incremental algorithm. It also provides a mechanism for adaptive sampling of basis vectors.
Sakumura, Yuichi; Koyama, Yutaro; Tokutake, Hiroaki; Hida, Toyoaki; Sato, Kazuo; Itoh, Toshio; Akamatsu, Takafumi; Shin, Woosuck
2017-01-01
Monitoring exhaled breath is a very attractive, noninvasive screening technique for early diagnosis of diseases, especially lung cancer. However, the technique provides insufficient accuracy because the exhaled air has many crucial volatile organic compounds (VOCs) at very low concentrations (ppb level). We analyzed the breath exhaled by lung cancer patients and healthy subjects (controls) using gas chromatography/mass spectrometry (GC/MS), and performed a subsequent statistical analysis to diagnose lung cancer based on the combination of multiple lung cancer-related VOCs. We detected 68 VOCs as marker species using GC/MS analysis. We reduced the number of VOCs and used support vector machine (SVM) algorithm to classify the samples. We observed that a combination of five VOCs (CHN, methanol, CH3CN, isoprene, 1-propanol) is sufficient for 89.0% screening accuracy, and hence, it can be used for the design and development of a desktop GC-sensor analysis system for lung cancer. PMID:28165388
Support Vector Machine algorithm for regression and classification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yu, Chenggang; Zavaljevski, Nela
2001-08-01
The software is an implementation of the Support Vector Machine (SVM) algorithm that was invented and developed by Vladimir Vapnik and his co-workers at AT&T Bell Laboratories. The specific implementation reported here is an Active Set method for solving a quadratic optimization problem that forms the major part of any SVM program. The implementation is tuned to specific constraints generated in the SVM learning. Thus, it is more efficient than general-purpose quadratic optimization programs. A decomposition method has been implemented in the software that enables processing large data sets. The size of the learning data is virtually unlimited by themore » capacity of the computer physical memory. The software is flexible and extensible. Two upper bounds are implemented to regulate the SVM learning for classification, which allow users to adjust the false positive and false negative rates. The software can be used either as a standalone, general-purpose SVM regression or classification program, or be embedded into a larger software system.« less
Sakumura, Yuichi; Koyama, Yutaro; Tokutake, Hiroaki; Hida, Toyoaki; Sato, Kazuo; Itoh, Toshio; Akamatsu, Takafumi; Shin, Woosuck
2017-02-04
Monitoring exhaled breath is a very attractive, noninvasive screening technique for early diagnosis of diseases, especially lung cancer. However, the technique provides insufficient accuracy because the exhaled air has many crucial volatile organic compounds (VOCs) at very low concentrations (ppb level). We analyzed the breath exhaled by lung cancer patients and healthy subjects (controls) using gas chromatography/mass spectrometry (GC/MS), and performed a subsequent statistical analysis to diagnose lung cancer based on the combination of multiple lung cancer-related VOCs. We detected 68 VOCs as marker species using GC/MS analysis. We reduced the number of VOCs and used support vector machine (SVM) algorithm to classify the samples. We observed that a combination of five VOCs (CHN, methanol, CH₃CN, isoprene, 1-propanol) is sufficient for 89.0% screening accuracy, and hence, it can be used for the design and development of a desktop GC-sensor analysis system for lung cancer.
Support Vector Data Descriptions and k-Means Clustering: One Class?
Gornitz, Nico; Lima, Luiz Alberto; Muller, Klaus-Robert; Kloft, Marius; Nakajima, Shinichi
2017-09-27
We present ClusterSVDD, a methodology that unifies support vector data descriptions (SVDDs) and k-means clustering into a single formulation. This allows both methods to benefit from one another, i.e., by adding flexibility using multiple spheres for SVDDs and increasing anomaly resistance and flexibility through kernels to k-means. In particular, our approach leads to a new interpretation of k-means as a regularized mode seeking algorithm. The unifying formulation further allows for deriving new algorithms by transferring knowledge from one-class learning settings to clustering settings and vice versa. As a showcase, we derive a clustering method for structured data based on a one-class learning scenario. Additionally, our formulation can be solved via a particularly simple optimization scheme. We evaluate our approach empirically to highlight some of the proposed benefits on artificially generated data, as well as on real-world problems, and provide a Python software package comprising various implementations of primal and dual SVDD as well as our proposed ClusterSVDD.
Hepworth, Philip J.; Nefedov, Alexey V.; Muchnik, Ilya B.; Morgan, Kenton L.
2012-01-01
Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide. PMID:22319115
Cao, Qi; Leung, K M
2014-09-22
Reliable computer models for the prediction of chemical biodegradability from molecular descriptors and fingerprints are very important for making health and environmental decisions. Coupling of the differential evolution (DE) algorithm with the support vector classifier (SVC) in order to optimize the main parameters of the classifier resulted in an improved classifier called the DE-SVC, which is introduced in this paper for use in chemical biodegradability studies. The DE-SVC was applied to predict the biodegradation of chemicals on the basis of extensive sample data sets and known structural features of molecules. Our optimization experiments showed that DE can efficiently find the proper parameters of the SVC. The resulting classifier possesses strong robustness and reliability compared with grid search, genetic algorithm, and particle swarm optimization methods. The classification experiments conducted here showed that the DE-SVC exhibits better classification performance than models previously used for such studies. It is a more effective and efficient prediction model for chemical biodegradability.
Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L
2012-08-07
Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.
Parodi, Stefano; Manneschi, Chiara; Verda, Damiano; Ferrari, Enrico; Muselli, Marco
2018-03-01
This study evaluates the performance of a set of machine learning techniques in predicting the prognosis of Hodgkin's lymphoma using clinical factors and gene expression data. Analysed samples from 130 Hodgkin's lymphoma patients included a small set of clinical variables and more than 54,000 gene features. Machine learning classifiers included three black-box algorithms ( k-nearest neighbour, Artificial Neural Network, and Support Vector Machine) and two methods based on intelligible rules (Decision Tree and the innovative Logic Learning Machine method). Support Vector Machine clearly outperformed any of the other methods. Among the two rule-based algorithms, Logic Learning Machine performed better and identified a set of simple intelligible rules based on a combination of clinical variables and gene expressions. Decision Tree identified a non-coding gene ( XIST) involved in the early phases of X chromosome inactivation that was overexpressed in females and in non-relapsed patients. XIST expression might be responsible for the better prognosis of female Hodgkin's lymphoma patients.
Margin based ontology sparse vector learning algorithm and applied in biology science.
Gao, Wei; Qudair Baig, Abdul; Ali, Haidar; Sajjad, Wasim; Reza Farahani, Mohammad
2017-01-01
In biology field, the ontology application relates to a large amount of genetic information and chemical information of molecular structure, which makes knowledge of ontology concepts convey much information. Therefore, in mathematical notation, the dimension of vector which corresponds to the ontology concept is often very large, and thus improves the higher requirements of ontology algorithm. Under this background, we consider the designing of ontology sparse vector algorithm and application in biology. In this paper, using knowledge of marginal likelihood and marginal distribution, the optimized strategy of marginal based ontology sparse vector learning algorithm is presented. Finally, the new algorithm is applied to gene ontology and plant ontology to verify its efficiency.
Fast and Accurate Support Vector Machines on Large Scale Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vishnu, Abhinav; Narasimhan, Jayenthi; Holder, Larry
Support Vector Machines (SVM) is a supervised Machine Learning and Data Mining (MLDM) algorithm, which has become ubiquitous largely due to its high accuracy and obliviousness to dimensionality. The objective of SVM is to find an optimal boundary --- also known as hyperplane --- which separates the samples (examples in a dataset) of different classes by a maximum margin. Usually, very few samples contribute to the definition of the boundary. However, existing parallel algorithms use the entire dataset for finding the boundary, which is sub-optimal for performance reasons. In this paper, we propose a novel distributed memory algorithm to eliminatemore » the samples which do not contribute to the boundary definition in SVM. We propose several heuristics, which range from early (aggressive) to late (conservative) elimination of the samples, such that the overall time for generating the boundary is reduced considerably. In a few cases, a sample may be eliminated (shrunk) pre-emptively --- potentially resulting in an incorrect boundary. We propose a scalable approach to synchronize the necessary data structures such that the proposed algorithm maintains its accuracy. We consider the necessary trade-offs of single/multiple synchronization using in-depth time-space complexity analysis. We implement the proposed algorithm using MPI and compare it with libsvm--- de facto sequential SVM software --- which we enhance with OpenMP for multi-core/many-core parallelism. Our proposed approach shows excellent efficiency using up to 4096 processes on several large datasets such as UCI HIGGS Boson dataset and Offending URL dataset.« less
Evaluation of machine learning algorithms for improved risk assessment for Down's syndrome.
Koivu, Aki; Korpimäki, Teemu; Kivelä, Petri; Pahikkala, Tapio; Sairanen, Mikko
2018-05-04
Prenatal screening generates a great amount of data that is used for predicting risk of various disorders. Prenatal risk assessment is based on multiple clinical variables and overall performance is defined by how well the risk algorithm is optimized for the population in question. This article evaluates machine learning algorithms to improve performance of first trimester screening of Down syndrome. Machine learning algorithms pose an adaptive alternative to develop better risk assessment models using the existing clinical variables. Two real-world data sets were used to experiment with multiple classification algorithms. Implemented models were tested with a third, real-world, data set and performance was compared to a predicate method, a commercial risk assessment software. Best performing deep neural network model gave an area under the curve of 0.96 and detection rate of 78% with 1% false positive rate with the test data. Support vector machine model gave area under the curve of 0.95 and detection rate of 61% with 1% false positive rate with the same test data. When compared with the predicate method, the best support vector machine model was slightly inferior, but an optimized deep neural network model was able to give higher detection rates with same false positive rate or similar detection rate but with markedly lower false positive rate. This finding could further improve the first trimester screening for Down syndrome, by using existing clinical variables and a large training data derived from a specific population. Copyright © 2018 Elsevier Ltd. All rights reserved.
Automatic Recognition of Fetal Facial Standard Plane in Ultrasound Image via Fisher Vector.
Lei, Baiying; Tan, Ee-Leng; Chen, Siping; Zhuo, Liu; Li, Shengli; Ni, Dong; Wang, Tianfu
2015-01-01
Acquisition of the standard plane is the prerequisite of biometric measurement and diagnosis during the ultrasound (US) examination. In this paper, a new algorithm is developed for the automatic recognition of the fetal facial standard planes (FFSPs) such as the axial, coronal, and sagittal planes. Specifically, densely sampled root scale invariant feature transform (RootSIFT) features are extracted and then encoded by Fisher vector (FV). The Fisher network with multi-layer design is also developed to extract spatial information to boost the classification performance. Finally, automatic recognition of the FFSPs is implemented by support vector machine (SVM) classifier based on the stochastic dual coordinate ascent (SDCA) algorithm. Experimental results using our dataset demonstrate that the proposed method achieves an accuracy of 93.27% and a mean average precision (mAP) of 99.19% in recognizing different FFSPs. Furthermore, the comparative analyses reveal the superiority of the proposed method based on FV over the traditional methods.
A Simple Deep Learning Method for Neuronal Spike Sorting
NASA Astrophysics Data System (ADS)
Yang, Kai; Wu, Haifeng; Zeng, Yu
2017-10-01
Spike sorting is one of key technique to understand brain activity. With the development of modern electrophysiology technology, some recent multi-electrode technologies have been able to record the activity of thousands of neuronal spikes simultaneously. The spike sorting in this case will increase the computational complexity of conventional sorting algorithms. In this paper, we will focus spike sorting on how to reduce the complexity, and introduce a deep learning algorithm, principal component analysis network (PCANet) to spike sorting. The introduced method starts from a conventional model and establish a Toeplitz matrix. Through the column vectors in the matrix, we trains a PCANet, where some eigenvalue vectors of spikes could be extracted. Finally, support vector machine (SVM) is used to sort spikes. In experiments, we choose two groups of simulated data from public databases availably and compare this introduced method with conventional methods. The results indicate that the introduced method indeed has lower complexity with the same sorting errors as the conventional methods.
Attia, Khalid A M; Nassar, Mohammed W I; El-Zeiny, Mohamed B; Serag, Ahmed
2017-01-05
For the first time, a new variable selection method based on swarm intelligence namely firefly algorithm is coupled with three different multivariate calibration models namely, concentration residual augmented classical least squares, artificial neural network and support vector regression in UV spectral data. A comparative study between the firefly algorithm and the well-known genetic algorithm was developed. The discussion revealed the superiority of using this new powerful algorithm over the well-known genetic algorithm. Moreover, different statistical tests were performed and no significant differences were found between all the models regarding their predictabilities. This ensures that simpler and faster models were obtained without any deterioration of the quality of the calibration. Copyright © 2016 Elsevier B.V. All rights reserved.
Face recognition using tridiagonal matrix enhanced multivariance products representation
NASA Astrophysics Data System (ADS)
Ã-zay, Evrim Korkmaz
2017-01-01
This study aims to retrieve face images from a database according to a target face image. For this purpose, Tridiagonal Matrix Enhanced Multivariance Products Representation (TMEMPR) is taken into consideration. TMEMPR is a recursive algorithm based on Enhanced Multivariance Products Representation (EMPR). TMEMPR decomposes a matrix into three components which are a matrix of left support terms, a tridiagonal matrix of weight parameters for each recursion, and a matrix of right support terms, respectively. In this sense, there is an analogy between Singular Value Decomposition (SVD) and TMEMPR. However TMEMPR is a more flexible algorithm since its initial support terms (or vectors) can be chosen as desired. Low computational complexity is another advantage of TMEMPR because the algorithm has been constructed with recursions of certain arithmetic operations without requiring any iteration. The algorithm has been trained and tested with ORL face image database with 400 different grayscale images of 40 different people. TMEMPR's performance has been compared with SVD's performance as a result.
NASA Astrophysics Data System (ADS)
Dobeck, Gerald J.; Cobb, J. Tory
2002-08-01
The high-resolution sonar is one of the principal sensors used by the Navy to detect and classify sea mines in minehunting operations. For such sonar systems, substantial effort has been devoted to the development of automated detection and classification (D/C) algorithms. These have been spurred by several factors including (1) aids for operators to reduce work overload, (2) more optimal use of all available data, and (3) the introduction of unmanned minehunting systems. The environments where sea mines are typically laid (harbor areas, shipping lanes, and the littorals) give rise to many false alarms caused by natural, biologic, and man-made clutter. The objective of the automated D/C algorithms is to eliminate most of these false alarms while still maintaining a very high probability of mine detection and classification (PdPc). In recent years, the benefits of fusing the outputs of multiple D/C algorithms have been studied. We refer to this as Algorithm Fusion. The results have been remarkable, including reliable robustness to new environments. The Quadratic Penalty Function Support Vector Machine (QPFSVM) algorithm to aid in the automated detection and classification of sea mines is introduced in this paper. The QPFSVM algorithm is easy to train, simple to implement, and robust to feature space dimension. Outputs of successive SVM algorithms are cascaded in stages (fused) to improve the Probability of Classification (Pc) and reduce the number of false alarms. Even though our experience has been gained in the area of sea mine detection and classification, the principles described herein are general and can be applied to fusion of any D/C problem (e.g., automated medical diagnosis or automatic target recognition for ballistic missile defense).
Wave-Based Algorithms and Bounds for Target Support Estimation
2015-05-15
vector electromagnetic formalism in [5]. This theory leads to three main variants of the optical theorem detector, in particular, three alternative...further expands the applicability for transient pulse change detection of ar- bitrary nonlinear-media and time-varying targets [9]. This report... electromagnetic methods a new methodology to estimate the minimum convex source region and the (possibly nonconvex) support of a scattering target from knowledge of
Si, Lei; Wang, Zhongbin; Liu, Xinhua; Tan, Chao; Liu, Ze; Xu, Jing
2016-01-01
Shearers play an important role in fully mechanized coal mining face and accurately identifying their cutting pattern is very helpful for improving the automation level of shearers and ensuring the safety of coal mining. The least squares support vector machine (LSSVM) has been proven to offer strong potential in prediction and classification issues, particularly by employing an appropriate meta-heuristic algorithm to determine the values of its two parameters. However, these meta-heuristic algorithms have the drawbacks of being hard to understand and reaching the global optimal solution slowly. In this paper, an improved fly optimization algorithm (IFOA) to optimize the parameters of LSSVM was presented and the LSSVM coupled with IFOA (IFOA-LSSVM) was used to identify the shearer cutting pattern. The vibration acceleration signals of five cutting patterns were collected and the special state features were extracted based on the ensemble empirical mode decomposition (EEMD) and the kernel function. Some examples on the IFOA-LSSVM model were further presented and the results were compared with LSSVM, PSO-LSSVM, GA-LSSVM and FOA-LSSVM models in detail. The comparison results indicate that the proposed approach was feasible, efficient and outperformed the others. Finally, an industrial application example at the coal mining face was demonstrated to specify the effect of the proposed system. PMID:26771615
Kim, Jongin; Park, Hyeong-jun
2016-01-01
The purpose of this study is to classify EEG data on imagined speech in a single trial. We recorded EEG data while five subjects imagined different vowels, /a/, /e/, /i/, /o/, and /u/. We divided each single trial dataset into thirty segments and extracted features (mean, variance, standard deviation, and skewness) from all segments. To reduce the dimension of the feature vector, we applied a feature selection algorithm based on the sparse regression model. These features were classified using a support vector machine with a radial basis function kernel, an extreme learning machine, and two variants of an extreme learning machine with different kernels. Because each single trial consisted of thirty segments, our algorithm decided the label of the single trial by selecting the most frequent output among the outputs of the thirty segments. As a result, we observed that the extreme learning machine and its variants achieved better classification rates than the support vector machine with a radial basis function kernel and linear discrimination analysis. Thus, our results suggested that EEG responses to imagined speech could be successfully classified in a single trial using an extreme learning machine with a radial basis function and linear kernel. This study with classification of imagined speech might contribute to the development of silent speech BCI systems. PMID:28097128
Rate determination from vector observations
NASA Technical Reports Server (NTRS)
Weiss, Jerold L.
1993-01-01
Vector observations are a common class of attitude data provided by a wide variety of attitude sensors. Attitude determination from vector observations is a well-understood process and numerous algorithms such as the TRIAD algorithm exist. These algorithms require measurement of the line of site (LOS) vector to reference objects and knowledge of the LOS directions in some predetermined reference frame. Once attitude is determined, it is a simple matter to synthesize vehicle rate using some form of lead-lag filter, and then, use it for vehicle stabilization. Many situations arise, however, in which rate knowledge is required but knowledge of the nominal LOS directions are not available. This paper presents two methods for determining spacecraft angular rates from vector observations without a priori knowledge of the vector directions. The first approach uses an extended Kalman filter with a spacecraft dynamic model and a kinematic model representing the motion of the observed LOS vectors. The second approach uses a 'differential' TRIAD algorithm to compute the incremental direction cosine matrix, from which vehicle rate is then derived.
A k-Vector Approach to Sampling, Interpolation, and Approximation
NASA Astrophysics Data System (ADS)
Mortari, Daniele; Rogers, Jonathan
2013-12-01
The k-vector search technique is a method designed to perform extremely fast range searching of large databases at computational cost independent of the size of the database. k-vector search algorithms have historically found application in satellite star-tracker navigation systems which index very large star catalogues repeatedly in the process of attitude estimation. Recently, the k-vector search algorithm has been applied to numerous other problem areas including non-uniform random variate sampling, interpolation of 1-D or 2-D tables, nonlinear function inversion, and solution of systems of nonlinear equations. This paper presents algorithms in which the k-vector search technique is used to solve each of these problems in a computationally-efficient manner. In instances where these tasks must be performed repeatedly on a static (or nearly-static) data set, the proposed k-vector-based algorithms offer an extremely fast solution technique that outperforms standard methods.
Guidi, G; Pettenati, M C; Miniati, R; Iadanza, E
2012-01-01
In this paper we describe an Heart Failure analysis Dashboard that, combined with a handy device for the automatic acquisition of a set of patient's clinical parameters, allows to support telemonitoring functions. The Dashboard's intelligent core is a Computer Decision Support System designed to assist the clinical decision of non-specialist caring personnel, and it is based on three functional parts: Diagnosis, Prognosis, and Follow-up management. Four Artificial Intelligence-based techniques are compared for providing diagnosis function: a Neural Network, a Support Vector Machine, a Classification Tree and a Fuzzy Expert System whose rules are produced by a Genetic Algorithm. State of the art algorithms are used to support a score-based prognosis function. The patient's Follow-up is used to refine the diagnosis.
Welikala, R A; Fraz, M M; Dehmeshki, J; Hoppe, A; Tah, V; Mann, S; Williamson, T H; Barman, S A
2015-07-01
Proliferative diabetic retinopathy (PDR) is a condition that carries a high risk of severe visual impairment. The hallmark of PDR is the growth of abnormal new vessels. In this paper, an automated method for the detection of new vessels from retinal images is presented. This method is based on a dual classification approach. Two vessel segmentation approaches are applied to create two separate binary vessel map which each hold vital information. Local morphology features are measured from each binary vessel map to produce two separate 4-D feature vectors. Independent classification is performed for each feature vector using a support vector machine (SVM) classifier. The system then combines these individual outcomes to produce a final decision. This is followed by the creation of additional features to generate 21-D feature vectors, which feed into a genetic algorithm based feature selection approach with the objective of finding feature subsets that improve the performance of the classification. Sensitivity and specificity results using a dataset of 60 images are 0.9138 and 0.9600, respectively, on a per patch basis and 1.000 and 0.975, respectively, on a per image basis. Copyright © 2015 Elsevier Ltd. All rights reserved.
USDA-ARS?s Scientific Manuscript database
Support Vector Machine (SVM) was used in the Genetic Algorithms (GA) process to select and classify a subset of hyperspectral image bands. The method was applied to fluorescence hyperspectral data for the detection of aflatoxin contamination in Aspergillus flavus infected single corn kernels. In the...
Matching algorithm of missile tail flame based on back-propagation neural network
NASA Astrophysics Data System (ADS)
Huang, Da; Huang, Shucai; Tang, Yidong; Zhao, Wei; Cao, Wenhuan
2018-02-01
This work presents a spectral matching algorithm of missile plume detection that based on neural network. The radiation value of the characteristic spectrum of the missile tail flame is taken as the input of the network. The network's structure including the number of nodes and layers is determined according to the number of characteristic spectral bands and missile types. We can get the network weight matrixes and threshold vectors through training the network using training samples, and we can determine the performance of the network through testing the network using the test samples. A small amount of data cause the network has the advantages of simple structure and practicality. Network structure composed of weight matrix and threshold vector can complete task of spectrum matching without large database support. Network can achieve real-time requirements with a small quantity of data. Experiment results show that the algorithm has the ability to match the precise spectrum and strong robustness.
Weighted K-means support vector machine for cancer prediction.
Kim, SungHwan
2016-01-01
To date, the support vector machine (SVM) has been widely applied to diverse bio-medical fields to address disease subtype identification and pathogenicity of genetic variants. In this paper, I propose the weighted K-means support vector machine (wKM-SVM) and weighted support vector machine (wSVM), for which I allow the SVM to impose weights to the loss term. Besides, I demonstrate the numerical relations between the objective function of the SVM and weights. Motivated by general ensemble techniques, which are known to improve accuracy, I directly adopt the boosting algorithm to the newly proposed weighted KM-SVM (and wSVM). For predictive performance, a range of simulation studies demonstrate that the weighted KM-SVM (and wSVM) with boosting outperforms the standard KM-SVM (and SVM) including but not limited to many popular classification rules. I applied the proposed methods to simulated data and two large-scale real applications in the TCGA pan-cancer methylation data of breast and kidney cancer. In conclusion, the weighted KM-SVM (and wSVM) increases accuracy of the classification model, and will facilitate disease diagnosis and clinical treatment decisions to benefit patients. A software package (wSVM) is publicly available at the R-project webpage (https://www.r-project.org).
Interpreting support vector machine models for multivariate group wise analysis in neuroimaging
Gaonkar, Bilwaj; Shinohara, Russell T; Davatzikos, Christos
2015-01-01
Machine learning based classification algorithms like support vector machines (SVMs) have shown great promise for turning a high dimensional neuroimaging data into clinically useful decision criteria. However, tracing imaging based patterns that contribute significantly to classifier decisions remains an open problem. This is an issue of critical importance in imaging studies seeking to determine which anatomical or physiological imaging features contribute to the classifier’s decision, thereby allowing users to critically evaluate the findings of such machine learning methods and to understand disease mechanisms. The majority of published work addresses the question of statistical inference for support vector classification using permutation tests based on SVM weight vectors. Such permutation testing ignores the SVM margin, which is critical in SVM theory. In this work we emphasize the use of a statistic that explicitly accounts for the SVM margin and show that the null distributions associated with this statistic are asymptotically normal. Further, our experiments show that this statistic is a lot less conservative as compared to weight based permutation tests and yet specific enough to tease out multivariate patterns in the data. Thus, we can better understand the multivariate patterns that the SVM uses for neuroimaging based classification. PMID:26210913
Barua, Shaibal; Begum, Shahina; Ahmed, Mobyen Uddin
2015-01-01
Machine learning algorithms play an important role in computer science research. Recent advancement in sensor data collection in clinical sciences lead to a complex, heterogeneous data processing, and analysis for patient diagnosis and prognosis. Diagnosis and treatment of patients based on manual analysis of these sensor data are difficult and time consuming. Therefore, development of Knowledge-based systems to support clinicians in decision-making is important. However, it is necessary to perform experimental work to compare performances of different machine learning methods to help to select appropriate method for a specific characteristic of data sets. This paper compares classification performance of three popular machine learning methods i.e., case-based reasoning, neutral networks and support vector machine to diagnose stress of vehicle drivers using finger temperature and heart rate variability. The experimental results show that case-based reasoning outperforms other two methods in terms of classification accuracy. Case-based reasoning has achieved 80% and 86% accuracy to classify stress using finger temperature and heart rate variability. On contrary, both neural network and support vector machine have achieved less than 80% accuracy by using both physiological signals.
Vector Quantization Algorithm Based on Associative Memories
NASA Astrophysics Data System (ADS)
Guzmán, Enrique; Pogrebnyak, Oleksiy; Yáñez, Cornelio; Manrique, Pablo
This paper presents a vector quantization algorithm for image compression based on extended associative memories. The proposed algorithm is divided in two stages. First, an associative network is generated applying the learning phase of the extended associative memories between a codebook generated by the LBG algorithm and a training set. This associative network is named EAM-codebook and represents a new codebook which is used in the next stage. The EAM-codebook establishes a relation between training set and the LBG codebook. Second, the vector quantization process is performed by means of the recalling stage of EAM using as associative memory the EAM-codebook. This process generates a set of the class indices to which each input vector belongs. With respect to the LBG algorithm, the main advantages offered by the proposed algorithm is high processing speed and low demand of resources (system memory); results of image compression and quality are presented.
NASA Astrophysics Data System (ADS)
Nishizuka, N.; Sugiura, K.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.
2017-02-01
We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010-2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite. We detected active regions (ARs) from the full-disk magnetogram, from which ˜60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nishizuka, N.; Kubo, Y.; Den, M.
We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010–2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite . We detected active regions (ARs) from the full-disk magnetogram, from which ∼60 features were extracted with their time differentials, including magnetic neutralmore » lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.« less
Discrete Data Transfer Technique for Fluid-Structure Interaction
NASA Technical Reports Server (NTRS)
Samareh, Jamshid A.
2007-01-01
This paper presents a general three-dimensional algorithm for data transfer between dissimilar meshes. The algorithm is suitable for applications of fluid-structure interaction and other high-fidelity multidisciplinary analysis and optimization. Because the algorithm is independent of the mesh topology, we can treat structured and unstructured meshes in the same manner. The algorithm is fast and accurate for transfer of scalar or vector fields between dissimilar surface meshes. The algorithm is also applicable for the integration of a scalar field (e.g., coefficients of pressure) on one mesh and injection of the resulting vectors (e.g., force vectors) onto another mesh. The author has implemented the algorithm in a C++ computer code. This paper contains a complete formulation of the algorithm with a few selected results.
Moteghaed, Niloofar Yousefi; Maghooli, Keivan; Garshasbi, Masoud
2018-01-01
Background: Gene expression data are characteristically high dimensional with a small sample size in contrast to the feature size and variability inherent in biological processes that contribute to difficulties in analysis. Selection of highly discriminative features decreases the computational cost and complexity of the classifier and improves its reliability for prediction of a new class of samples. Methods: The present study used hybrid particle swarm optimization and genetic algorithms for gene selection and a fuzzy support vector machine (SVM) as the classifier. Fuzzy logic is used to infer the importance of each sample in the training phase and decrease the outlier sensitivity of the system to increase the ability to generalize the classifier. A decision-tree algorithm was applied to the most frequent genes to develop a set of rules for each type of cancer. This improved the abilities of the algorithm by finding the best parameters for the classifier during the training phase without the need for trial-and-error by the user. The proposed approach was tested on four benchmark gene expression profiles. Results: Good results have been demonstrated for the proposed algorithm. The classification accuracy for leukemia data is 100%, for colon cancer is 96.67% and for breast cancer is 98%. The results show that the best kernel used in training the SVM classifier is the radial basis function. Conclusions: The experimental results show that the proposed algorithm can decrease the dimensionality of the dataset, determine the most informative gene subset, and improve classification accuracy using the optimal parameters of the classifier with no user interface. PMID:29535919
NASA Astrophysics Data System (ADS)
Khan, F.; Enzmann, F.; Kersten, M.
2015-12-01
In X-ray computed microtomography (μXCT) image processing is the most important operation prior to image analysis. Such processing mainly involves artefact reduction and image segmentation. We propose a new two-stage post-reconstruction procedure of an image of a geological rock core obtained by polychromatic cone-beam μXCT technology. In the first stage, the beam-hardening (BH) is removed applying a best-fit quadratic surface algorithm to a given image data set (reconstructed slice), which minimizes the BH offsets of the attenuation data points from that surface. The final BH-corrected image is extracted from the residual data, or the difference between the surface elevation values and the original grey-scale values. For the second stage, we propose using a least square support vector machine (a non-linear classifier algorithm) to segment the BH-corrected data as a pixel-based multi-classification task. A combination of the two approaches was used to classify a complex multi-mineral rock sample. The Matlab code for this approach is provided in the Appendix. A minor drawback is that the proposed segmentation algorithm may become computationally demanding in the case of a high dimensional training data set.
Gao, Xiang-Ming; Yang, Shi-Feng; Pan, San-Bo
2017-01-01
Predicting the output power of photovoltaic system with nonstationarity and randomness, an output power prediction model for grid-connected PV systems is proposed based on empirical mode decomposition (EMD) and support vector machine (SVM) optimized with an artificial bee colony (ABC) algorithm. First, according to the weather forecast data sets on the prediction date, the time series data of output power on a similar day with 15-minute intervals are built. Second, the time series data of the output power are decomposed into a series of components, including some intrinsic mode components IMFn and a trend component Res, at different scales using EMD. The corresponding SVM prediction model is established for each IMF component and trend component, and the SVM model parameters are optimized with the artificial bee colony algorithm. Finally, the prediction results of each model are reconstructed, and the predicted values of the output power of the grid-connected PV system can be obtained. The prediction model is tested with actual data, and the results show that the power prediction model based on the EMD and ABC-SVM has a faster calculation speed and higher prediction accuracy than do the single SVM prediction model and the EMD-SVM prediction model without optimization.
Zhang, Daqing; Xiao, Jianfeng; Zhou, Nannan; Luo, Xiaomin; Jiang, Hualiang; Chen, Kaixian
2015-01-01
Blood-brain barrier (BBB) is a highly complex physical barrier determining what substances are allowed to enter the brain. Support vector machine (SVM) is a kernel-based machine learning method that is widely used in QSAR study. For a successful SVM model, the kernel parameters for SVM and feature subset selection are the most important factors affecting prediction accuracy. In most studies, they are treated as two independent problems, but it has been proven that they could affect each other. We designed and implemented genetic algorithm (GA) to optimize kernel parameters and feature subset selection for SVM regression and applied it to the BBB penetration prediction. The results show that our GA/SVM model is more accurate than other currently available log BB models. Therefore, to optimize both SVM parameters and feature subset simultaneously with genetic algorithm is a better approach than other methods that treat the two problems separately. Analysis of our log BB model suggests that carboxylic acid group, polar surface area (PSA)/hydrogen-bonding ability, lipophilicity, and molecular charge play important role in BBB penetration. Among those properties relevant to BBB penetration, lipophilicity could enhance the BBB penetration while all the others are negatively correlated with BBB penetration. PMID:26504797
2017-01-01
Predicting the output power of photovoltaic system with nonstationarity and randomness, an output power prediction model for grid-connected PV systems is proposed based on empirical mode decomposition (EMD) and support vector machine (SVM) optimized with an artificial bee colony (ABC) algorithm. First, according to the weather forecast data sets on the prediction date, the time series data of output power on a similar day with 15-minute intervals are built. Second, the time series data of the output power are decomposed into a series of components, including some intrinsic mode components IMFn and a trend component Res, at different scales using EMD. The corresponding SVM prediction model is established for each IMF component and trend component, and the SVM model parameters are optimized with the artificial bee colony algorithm. Finally, the prediction results of each model are reconstructed, and the predicted values of the output power of the grid-connected PV system can be obtained. The prediction model is tested with actual data, and the results show that the power prediction model based on the EMD and ABC-SVM has a faster calculation speed and higher prediction accuracy than do the single SVM prediction model and the EMD-SVM prediction model without optimization. PMID:28912803
Jiang, Huaiguang; Zhang, Yingchen; Muljadi, Eduard; ...
2016-01-01
This paper proposes an approach for distribution system load forecasting, which aims to provide highly accurate short-term load forecasting with high resolution utilizing a support vector regression (SVR) based forecaster and a two-step hybrid parameters optimization method. Specifically, because the load profiles in distribution systems contain abrupt deviations, a data normalization is designed as the pretreatment for the collected historical load data. Then an SVR model is trained by the load data to forecast the future load. For better performance of SVR, a two-step hybrid optimization algorithm is proposed to determine the best parameters. In the first step of themore » hybrid optimization algorithm, a designed grid traverse algorithm (GTA) is used to narrow the parameters searching area from a global to local space. In the second step, based on the result of the GTA, particle swarm optimization (PSO) is used to determine the best parameters in the local parameter space. After the best parameters are determined, the SVR model is used to forecast the short-term load deviation in the distribution system. The performance of the proposed approach is compared to some classic methods in later sections of the paper.« less
Parallel-vector unsymmetric Eigen-Solver on high performance computers
NASA Technical Reports Server (NTRS)
Nguyen, Duc T.; Jiangning, Qin
1993-01-01
The popular QR algorithm for solving all eigenvalues of an unsymmetric matrix is reviewed. Among the basic components in the QR algorithm, it was concluded from this study, that the reduction of an unsymmetric matrix to a Hessenberg form (before applying the QR algorithm itself) can be done effectively by exploiting the vector speed and multiple processors offered by modern high-performance computers. Numerical examples of several test cases have indicated that the proposed parallel-vector algorithm for converting a given unsymmetric matrix to a Hessenberg form offers computational advantages over the existing algorithm. The time saving obtained by the proposed methods is increased as the problem size increased.
USDA-ARS?s Scientific Manuscript database
Segmentation is the first step in image analysis to subdivide an image into meaningful regions. The segmentation result directly affects the subsequent image analysis. The objective of the research was to develop an automatic adjustable algorithm for segmentation of color images, using linear suppor...
Attitude guidance and simulation with animation of a land-survey satellite motion
NASA Astrophysics Data System (ADS)
Somova, Tatyana
2017-01-01
We consider problems of synthesis of the vector spline attitude guidance laws for a land-survey satellite and an in-flight support of the satellite attitude control system with the use of computer animation of its motion. We have presented the results on the efficiency of the developed algorithms.
NASA Astrophysics Data System (ADS)
Khan, Faisal; Enzmann, Frieder; Kersten, Michael
2016-03-01
Image processing of X-ray-computed polychromatic cone-beam micro-tomography (μXCT) data of geological samples mainly involves artefact reduction and phase segmentation. For the former, the main beam-hardening (BH) artefact is removed by applying a best-fit quadratic surface algorithm to a given image data set (reconstructed slice), which minimizes the BH offsets of the attenuation data points from that surface. A Matlab code for this approach is provided in the Appendix. The final BH-corrected image is extracted from the residual data or from the difference between the surface elevation values and the original grey-scale values. For the segmentation, we propose a novel least-squares support vector machine (LS-SVM, an algorithm for pixel-based multi-phase classification) approach. A receiver operating characteristic (ROC) analysis was performed on BH-corrected and uncorrected samples to show that BH correction is in fact an important prerequisite for accurate multi-phase classification. The combination of the two approaches was thus used to classify successfully three different more or less complex multi-phase rock core samples.
NASA Astrophysics Data System (ADS)
Wong, Pak-kin; Vong, Chi-man; Wong, Hang-cheong; Li, Ke
2010-05-01
Modern automotive spark-ignition (SI) power performance usually refers to output power and torque, and they are significantly affected by the setup of control parameters in the engine management system (EMS). EMS calibration is done empirically through tests on the dynamometer (dyno) because no exact mathematical engine model is yet available. With an emerging nonlinear function estimation technique of Least squares support vector machines (LS-SVM), the approximate power performance model of a SI engine can be determined by training the sample data acquired from the dyno. A novel incremental algorithm based on typical LS-SVM is also proposed in this paper, so the power performance models built from the incremental LS-SVM can be updated whenever new training data arrives. With updating the models, the model accuracies can be continuously increased. The predicted results using the estimated models from the incremental LS-SVM are good agreement with the actual test results and with the almost same average accuracy of retraining the models from scratch, but the incremental algorithm can significantly shorten the model construction time when new training data arrives.
The construction of support vector machine classifier using the firefly algorithm.
Chao, Chih-Feng; Horng, Ming-Huwi
2015-01-01
The setting of parameters in the support vector machines (SVMs) is very important with regard to its accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM). This tool is not considered the feature selection, because the SVM, together with feature selection, is not suitable for the application in a multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI), machine learning repository are used; additionally the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM). The experimental results advocate the use of firefly-SVM to classify pattern classifications for maximum accuracy.
The Construction of Support Vector Machine Classifier Using the Firefly Algorithm
Chao, Chih-Feng; Horng, Ming-Huwi
2015-01-01
The setting of parameters in the support vector machines (SVMs) is very important with regard to its accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM). This tool is not considered the feature selection, because the SVM, together with feature selection, is not suitable for the application in a multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI), machine learning repository are used; additionally the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM). The experimental results advocate the use of firefly-SVM to classify pattern classifications for maximum accuracy. PMID:25802511
On the classification techniques in data mining for microarray data classification
NASA Astrophysics Data System (ADS)
Aydadenta, Husna; Adiwijaya
2018-03-01
Cancer is one of the deadly diseases, according to data from WHO by 2015 there are 8.8 million more deaths caused by cancer, and this will increase every year if not resolved earlier. Microarray data has become one of the most popular cancer-identification studies in the field of health, since microarray data can be used to look at levels of gene expression in certain cell samples that serve to analyze thousands of genes simultaneously. By using data mining technique, we can classify the sample of microarray data thus it can be identified with cancer or not. In this paper we will discuss some research using some data mining techniques using microarray data, such as Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5, and simulation of Random Forest algorithm with technique of reduction dimension using Relief. The result of this paper show performance measure (accuracy) from classification algorithm (SVM, ANN, Naive Bayes, kNN, C4.5, and Random Forets).The results in this paper show the accuracy of Random Forest algorithm higher than other classification algorithms (Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5). It is hoped that this paper can provide some information about the speed, accuracy, performance and computational cost generated from each Data Mining Classification Technique based on microarray data.
Quantum optimization for training support vector machines.
Anguita, Davide; Ridella, Sandro; Rivieccio, Fabio; Zunino, Rodolfo
2003-01-01
Refined concepts, such as Rademacher estimates of model complexity and nonlinear criteria for weighting empirical classification errors, represent recent and promising approaches to characterize the generalization ability of Support Vector Machines (SVMs). The advantages of those techniques lie in both improving the SVM representation ability and yielding tighter generalization bounds. On the other hand, they often make Quadratic-Programming algorithms no longer applicable, and SVM training cannot benefit from efficient, specialized optimization techniques. The paper considers the application of Quantum Computing to solve the problem of effective SVM training, especially in the case of digital implementations. The presented research compares the behavioral aspects of conventional and enhanced SVMs; experiments in both a synthetic and real-world problems support the theoretical analysis. At the same time, the related differences between Quadratic-Programming and Quantum-based optimization techniques are considered.
Application of Support Vector Machine to Forex Monitoring
NASA Astrophysics Data System (ADS)
Kamruzzaman, Joarder; Sarker, Ruhul A.
Previous studies have demonstrated superior performance of artificial neural network (ANN) based forex forecasting models over traditional regression models. This paper applies support vector machines to build a forecasting model from the historical data using six simple technical indicators and presents a comparison with an ANN based model trained by scaled conjugate gradient (SCG) learning algorithm. The models are evaluated and compared on the basis of five commonly used performance metrics that measure closeness of prediction as well as correctness in directional change. Forecasting results of six different currencies against Australian dollar reveal superior performance of SVM model using simple linear kernel over ANN-SCG model in terms of all the evaluation metrics. The effect of SVM parameter selection on prediction performance is also investigated and analyzed.
NASA Astrophysics Data System (ADS)
Xian, Guangming
2018-03-01
In this paper, the vibration flow field parameters of polymer melts in a visual slit die are optimized by using intelligent algorithm. Experimental small angle light scattering (SALS) patterns are shown to characterize the processing process. In order to capture the scattered light, a polarizer and an analyzer are placed before and after the polymer melts. The results reported in this study are obtained using high-density polyethylene (HDPE) with rotation speed at 28 rpm. In addition, support vector regression (SVR) analytical method is introduced for optimization the parameters of vibration flow field. This work establishes the general applicability of SVR for predicting the optimal parameters of vibration flow field.
Using support vector machines to identify literacy skills: Evidence from eye movements.
Lou, Ya; Liu, Yanping; Kaakinen, Johanna K; Li, Xingshan
2017-06-01
Is inferring readers' literacy skills possible by analyzing their eye movements during text reading? This study used Support Vector Machines (SVM) to analyze eye movement data from 61 undergraduate students who read a multiple-paragraph, multiple-topic expository text. Forward fixation time, first-pass rereading time, second-pass fixation time, and regression path reading time on different regions of the text were provided as features. The SVM classification algorithm assisted in distinguishing high-literacy-skilled readers from low-literacy-skilled readers with 80.3 % accuracy. Results demonstrate the effectiveness of combining eye tracking and machine learning techniques to detect readers with low literacy skills, and suggest that such approaches can be potentially used in predicting other cognitive abilities.
NASA Astrophysics Data System (ADS)
Zhang, Yanjiao; Lai, Xiaoping; Zeng, Qiuyao; Li, Linfang; Lin, Lin; Li, Shaoxin; Liu, Zhiming; Su, Chengkang; Qi, Minni; Guo, Zhouyi
2018-03-01
This study aims to classify low-grade and high-grade bladder cancer (BC) patients using serum surface-enhanced Raman scattering (SERS) spectra and support vector machine (SVM) algorithms. Serum SERS spectra are acquired from 88 serum samples with silver nanoparticles as the SERS-active substrate. Diagnostic accuracies of 96.4% and 95.4% are obtained when differentiating the serum SERS spectra of all BC patients versus normal subjects and low-grade versus high-grade BC patients, respectively, with optimal SVM classifier models. This study demonstrates that the serum SERS technique combined with SVM has great potential to noninvasively detect and classify high-grade and low-grade BC patients.
NASA Astrophysics Data System (ADS)
Balbin, Jessie R.; Padilla, Dionis A.; Fausto, Janette C.; Vergara, Ernesto M.; Garcia, Ramon G.; Delos Angeles, Bethsedea Joy S.; Dizon, Neil John A.; Mardo, Mark Kevin N.
2017-02-01
This research is about translating series of hand gesture to form a word and produce its equivalent sound on how it is read and said in Filipino accent using Support Vector Machine and Mel Frequency Cepstral Coefficient analysis. The concept is to detect Filipino speech input and translate the spoken words to their text form in Filipino. This study is trying to help the Filipino deaf community to impart their thoughts through the use of hand gestures and be able to communicate to people who do not know how to read hand gestures. This also helps literate deaf to simply read the spoken words relayed to them using the Filipino speech to text system.
Facial Expression Recognition using Multiclass Ensemble Least-Square Support Vector Machine
NASA Astrophysics Data System (ADS)
Lawi, Armin; Sya'Rani Machrizzandi, M.
2018-03-01
Facial expression is one of behavior characteristics of human-being. The use of biometrics technology system with facial expression characteristics makes it possible to recognize a person’s mood or emotion. The basic components of facial expression analysis system are face detection, face image extraction, facial classification and facial expressions recognition. This paper uses Principal Component Analysis (PCA) algorithm to extract facial features with expression parameters, i.e., happy, sad, neutral, angry, fear, and disgusted. Then Multiclass Ensemble Least-Squares Support Vector Machine (MELS-SVM) is used for the classification process of facial expression. The result of MELS-SVM model obtained from our 185 different expression images of 10 persons showed high accuracy level of 99.998% using RBF kernel.
Multiple-Point Temperature Gradient Algorithm for Ring Laser Gyroscope Bias Compensation
Li, Geng; Zhang, Pengfei; Wei, Guo; Xie, Yuanping; Yu, Xudong; Long, Xingwu
2015-01-01
To further improve ring laser gyroscope (RLG) bias stability, a multiple-point temperature gradient algorithm is proposed for RLG bias compensation in this paper. Based on the multiple-point temperature measurement system, a complete thermo-image of the RLG block is developed. Combined with the multiple-point temperature gradients between different points of the RLG block, the particle swarm optimization algorithm is used to tune the support vector machine (SVM) parameters, and an optimized design for selecting the thermometer locations is also discussed. The experimental results validate the superiority of the introduced method and enhance the precision and generalizability in the RLG bias compensation model. PMID:26633401
High-speed cell recognition algorithm for ultrafast flow cytometer imaging system.
Zhao, Wanyue; Wang, Chao; Chen, Hongwei; Chen, Minghua; Yang, Sigang
2018-04-01
An optical time-stretch flow imaging system enables high-throughput examination of cells/particles with unprecedented high speed and resolution. A significant amount of raw image data is produced. A high-speed cell recognition algorithm is, therefore, highly demanded to analyze large amounts of data efficiently. A high-speed cell recognition algorithm consisting of two-stage cascaded detection and Gaussian mixture model (GMM) classification is proposed. The first stage of detection extracts cell regions. The second stage integrates distance transform and the watershed algorithm to separate clustered cells. Finally, the cells detected are classified by GMM. We compared the performance of our algorithm with support vector machine. Results show that our algorithm increases the running speed by over 150% without sacrificing the recognition accuracy. This algorithm provides a promising solution for high-throughput and automated cell imaging and classification in the ultrafast flow cytometer imaging platform. (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
High-speed cell recognition algorithm for ultrafast flow cytometer imaging system
NASA Astrophysics Data System (ADS)
Zhao, Wanyue; Wang, Chao; Chen, Hongwei; Chen, Minghua; Yang, Sigang
2018-04-01
An optical time-stretch flow imaging system enables high-throughput examination of cells/particles with unprecedented high speed and resolution. A significant amount of raw image data is produced. A high-speed cell recognition algorithm is, therefore, highly demanded to analyze large amounts of data efficiently. A high-speed cell recognition algorithm consisting of two-stage cascaded detection and Gaussian mixture model (GMM) classification is proposed. The first stage of detection extracts cell regions. The second stage integrates distance transform and the watershed algorithm to separate clustered cells. Finally, the cells detected are classified by GMM. We compared the performance of our algorithm with support vector machine. Results show that our algorithm increases the running speed by over 150% without sacrificing the recognition accuracy. This algorithm provides a promising solution for high-throughput and automated cell imaging and classification in the ultrafast flow cytometer imaging platform.
LDA boost classification: boosting by topics
NASA Astrophysics Data System (ADS)
Lei, La; Qiao, Guo; Qimin, Cao; Qitao, Li
2012-12-01
AdaBoost is an efficacious classification algorithm especially in text categorization (TC) tasks. The methodology of setting up a classifier committee and voting on the documents for classification can achieve high categorization precision. However, traditional Vector Space Model can easily lead to the curse of dimensionality and feature sparsity problems; so it affects classification performance seriously. This article proposed a novel classification algorithm called LDABoost based on boosting ideology which uses Latent Dirichlet Allocation (LDA) to modeling the feature space. Instead of using words or phrase, LDABoost use latent topics as the features. In this way, the feature dimension is significantly reduced. Improved Naïve Bayes (NB) is designed as the weaker classifier which keeps the efficiency advantage of classic NB algorithm and has higher precision. Moreover, a two-stage iterative weighted method called Cute Integration in this article is proposed for improving the accuracy by integrating weak classifiers into strong classifier in a more rational way. Mutual Information is used as metrics of weights allocation. The voting information and the categorization decision made by basis classifiers are fully utilized for generating the strong classifier. Experimental results reveals LDABoost making categorization in a low-dimensional space, it has higher accuracy than traditional AdaBoost algorithms and many other classic classification algorithms. Moreover, its runtime consumption is lower than different versions of AdaBoost, TC algorithms based on support vector machine and Neural Networks.
Fruit fly optimization based least square support vector regression for blind image restoration
NASA Astrophysics Data System (ADS)
Zhang, Jiao; Wang, Rui; Li, Junshan; Yang, Yawei
2014-11-01
The goal of image restoration is to reconstruct the original scene from a degraded observation. It is a critical and challenging task in image processing. Classical restorations require explicit knowledge of the point spread function and a description of the noise as priors. However, it is not practical for many real image processing. The recovery processing needs to be a blind image restoration scenario. Since blind deconvolution is an ill-posed problem, many blind restoration methods need to make additional assumptions to construct restrictions. Due to the differences of PSF and noise energy, blurring images can be quite different. It is difficult to achieve a good balance between proper assumption and high restoration quality in blind deconvolution. Recently, machine learning techniques have been applied to blind image restoration. The least square support vector regression (LSSVR) has been proven to offer strong potential in estimating and forecasting issues. Therefore, this paper proposes a LSSVR-based image restoration method. However, selecting the optimal parameters for support vector machine is essential to the training result. As a novel meta-heuristic algorithm, the fruit fly optimization algorithm (FOA) can be used to handle optimization problems, and has the advantages of fast convergence to the global optimal solution. In the proposed method, the training samples are created from a neighborhood in the degraded image to the central pixel in the original image. The mapping between the degraded image and the original image is learned by training LSSVR. The two parameters of LSSVR are optimized though FOA. The fitness function of FOA is calculated by the restoration error function. With the acquired mapping, the degraded image can be recovered. Experimental results show the proposed method can obtain satisfactory restoration effect. Compared with BP neural network regression, SVR method and Lucy-Richardson algorithm, it speeds up the restoration rate and performs better. Both objective and subjective restoration performances are studied in the comparison experiments.
LANDMARK-BASED SPEECH RECOGNITION: REPORT OF THE 2004 JOHNS HOPKINS SUMMER WORKSHOP.
Hasegawa-Johnson, Mark; Baker, James; Borys, Sarah; Chen, Ken; Coogan, Emily; Greenberg, Steven; Juneja, Amit; Kirchhoff, Katrin; Livescu, Karen; Mohan, Srividya; Muller, Jennifer; Sonmez, Kemal; Wang, Tianyu
2005-01-01
Three research prototype speech recognition systems are described, all of which use recently developed methods from artificial intelligence (specifically support vector machines, dynamic Bayesian networks, and maximum entropy classification) in order to implement, in the form of an automatic speech recognizer, current theories of human speech perception and phonology (specifically landmark-based speech perception, nonlinear phonology, and articulatory phonology). All three systems begin with a high-dimensional multiframe acoustic-to-distinctive feature transformation, implemented using support vector machines trained to detect and classify acoustic phonetic landmarks. Distinctive feature probabilities estimated by the support vector machines are then integrated using one of three pronunciation models: a dynamic programming algorithm that assumes canonical pronunciation of each word, a dynamic Bayesian network implementation of articulatory phonology, or a discriminative pronunciation model trained using the methods of maximum entropy classification. Log probability scores computed by these models are then combined, using log-linear combination, with other word scores available in the lattice output of a first-pass recognizer, and the resulting combination score is used to compute a second-pass speech recognition output.
NASA Astrophysics Data System (ADS)
Su, Lihong
In remote sensing communities, support vector machine (SVM) learning has recently received increasing attention. SVM learning usually requires large memory and enormous amounts of computation time on large training sets. According to SVM algorithms, the SVM classification decision function is fully determined by support vectors, which compose a subset of the training sets. In this regard, a solution to optimize SVM learning is to efficiently reduce training sets. In this paper, a data reduction method based on agglomerative hierarchical clustering is proposed to obtain smaller training sets for SVM learning. Using a multiple angle remote sensing dataset of a semi-arid region, the effectiveness of the proposed method is evaluated by classification experiments with a series of reduced training sets. The experiments show that there is no loss of SVM accuracy when the original training set is reduced to 34% using the proposed approach. Maximum likelihood classification (MLC) also is applied on the reduced training sets. The results show that MLC can also maintain the classification accuracy. This implies that the most informative data instances can be retained by this approach.
NASA Astrophysics Data System (ADS)
Dheeba, J.; Jaya, T.; Singh, N. Albert
2017-09-01
Classification of cancerous masses is a challenging task in many computerised detection systems. Cancerous masses are difficult to detect because these masses are obscured and subtle in mammograms. This paper investigates an intelligent classifier - fuzzy support vector machine (FSVM) applied to classify the tissues containing masses on mammograms for breast cancer diagnosis. The algorithm utilises texture features extracted using Laws texture energy measures and a FSVM to classify the suspicious masses. The new FSVM treats every feature as both normal and abnormal samples, but with different membership. By this way, the new FSVM have more generalisation ability to classify the masses in mammograms. The classifier analysed 219 clinical mammograms collected from breast cancer screening laboratory. The tests made on the real clinical mammograms shows that the proposed detection system has better discriminating power than the conventional support vector machine. With the best combination of FSVM and Laws texture features, the area under the Receiver operating characteristic curve reached .95, which corresponds to a sensitivity of 93.27% with a specificity of 87.17%. The results suggest that detecting masses using FSVM contribute to computer-aided detection of breast cancer and as a decision support system for radiologists.
An implementation of the QMR method based on coupled two-term recurrences
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Nachtigal, Noeel M.
1992-01-01
The authors have proposed a new Krylov subspace iteration, the quasi-minimal residual algorithm (QMR), for solving non-Hermitian linear systems. In the original implementation of the QMR method, the Lanczos process with look-ahead is used to generate basis vectors for the underlying Krylov subspaces. In the Lanczos algorithm, these basis vectors are computed by means of three-term recurrences. It has been observed that, in finite precision arithmetic, vector iterations based on three-term recursions are usually less robust than mathematically equivalent coupled two-term vector recurrences. This paper presents a look-ahead algorithm that constructs the Lanczos basis vectors by means of coupled two-term recursions. Implementation details are given, and the look-ahead strategy is described. A new implementation of the QMR method, based on this coupled two-term algorithm, is described. A simplified version of the QMR algorithm without look-ahead is also presented, and the special case of QMR for complex symmetric linear systems is considered. Results of numerical experiments comparing the original and the new implementations of the QMR method are reported.
Design of thrust vectoring exhaust nozzles for real-time applications using neural networks
NASA Technical Reports Server (NTRS)
Prasanth, Ravi K.; Markin, Robert E.; Whitaker, Kevin W.
1991-01-01
Thrust vectoring continues to be an important issue in military aircraft system designs. A recently developed concept of vectoring aircraft thrust makes use of flexible exhaust nozzles. Subtle modifications in the nozzle wall contours produce a non-uniform flow field containing a complex pattern of shock and expansion waves. The end result, due to the asymmetric velocity and pressure distributions, is vectored thrust. Specification of the nozzle contours required for a desired thrust vector angle (an inverse design problem) has been achieved with genetic algorithms. This approach is computationally intensive and prevents the nozzles from being designed in real-time, which is necessary for an operational aircraft system. An investigation was conducted into using genetic algorithms to train a neural network in an attempt to obtain, in real-time, two-dimensional nozzle contours. Results show that genetic algorithm trained neural networks provide a viable, real-time alternative for designing thrust vectoring nozzles contours. Thrust vector angles up to 20 deg were obtained within an average error of 0.0914 deg. The error surfaces encountered were highly degenerate and thus the robustness of genetic algorithms was well suited for minimizing global errors.
Evaluation of Algorithms for a Miles-in-Trail Decision Support Tool
NASA Technical Reports Server (NTRS)
Bloem, Michael; Hattaway, David; Bambos, Nicholas
2012-01-01
Four machine learning algorithms were prototyped and evaluated for use in a proposed decision support tool that would assist air traffic managers as they set Miles-in-Trail restrictions. The tool would display probabilities that each possible Miles-in-Trail value should be used in a given situation. The algorithms were evaluated with an expected Miles-in-Trail cost that assumes traffic managers set restrictions based on the tool-suggested probabilities. Basic Support Vector Machine, random forest, and decision tree algorithms were evaluated, as was a softmax regression algorithm that was modified to explicitly reduce the expected Miles-in-Trail cost. The algorithms were evaluated with data from the summer of 2011 for air traffic flows bound to the Newark Liberty International Airport (EWR) over the ARD, PENNS, and SHAFF fixes. The algorithms were provided with 18 input features that describe the weather at EWR, the runway configuration at EWR, the scheduled traffic demand at EWR and the fixes, and other traffic management initiatives in place at EWR. Features describing other traffic management initiatives at EWR and the weather at EWR achieved relatively high information gain scores, indicating that they are the most useful for estimating Miles-in-Trail. In spite of a high variance or over-fitting problem, the decision tree algorithm achieved the lowest expected Miles-in-Trail costs when the algorithms were evaluated using 10-fold cross validation with the summer 2011 data for these air traffic flows.
a Gsa-Svm Hybrid System for Classification of Binary Problems
NASA Astrophysics Data System (ADS)
Sarafrazi, Soroor; Nezamabadi-pour, Hossein; Barahman, Mojgan
2011-06-01
This paperhybridizesgravitational search algorithm (GSA) with support vector machine (SVM) and made a novel GSA-SVM hybrid system to improve the classification accuracy in binary problems. GSA is an optimization heuristic toolused to optimize the value of SVM kernel parameter (in this paper, radial basis function (RBF) is chosen as the kernel function). The experimental results show that this newapproach can achieve high classification accuracy and is comparable to or better than the particle swarm optimization (PSO)-SVM and genetic algorithm (GA)-SVM, which are two hybrid systems for classification.
Ventricular repolarization variability for hypoglycemia detection.
Ling, Steve; Nguyen, H T
2011-01-01
Hypoglycemia is the most acute and common complication of Type 1 diabetes and is a limiting factor in a glycemic management of diabetes. In this paper, two main contributions are presented; firstly, ventricular repolarization variabilities are introduced for hypoglycemia detection, and secondly, a swarm-based support vector machine (SVM) algorithm with the inputs of the repolarization variabilities is developed to detect hypoglycemia. By using the algorithm and including several repolarization variabilities as inputs, the best hypoglycemia detection performance is found with sensitivity and specificity of 82.14% and 60.19%, respectively.
Data Mining Methods for Recommender Systems
NASA Astrophysics Data System (ADS)
Amatriain, Xavier; Jaimes*, Alejandro; Oliver, Nuria; Pujol, Josep M.
In this chapter, we give an overview of the main Data Mining techniques used in the context of Recommender Systems. We first describe common preprocessing methods such as sampling or dimensionality reduction. Next, we review the most important classification techniques, including Bayesian Networks and Support Vector Machines. We describe the k-means clustering algorithm and discuss several alternatives. We also present association rules and related algorithms for an efficient training process. In addition to introducing these techniques, we survey their uses in Recommender Systems and present cases where they have been successfully applied.
About decomposition approach for solving the classification problem
NASA Astrophysics Data System (ADS)
Andrianova, A. A.
2016-11-01
This article describes the features of the application of an algorithm with using of decomposition methods for solving the binary classification problem of constructing a linear classifier based on Support Vector Machine method. Application of decomposition reduces the volume of calculations, in particular, due to the emerging possibilities to build parallel versions of the algorithm, which is a very important advantage for the solution of problems with big data. The analysis of the results of computational experiments conducted using the decomposition approach. The experiment use known data set for binary classification problem.
View-Dependent Streamline Deformation and Exploration
Tong, Xin; Edwards, John; Chen, Chun-Ming; Shen, Han-Wei; Johnson, Chris R.; Wong, Pak Chung
2016-01-01
Occlusion presents a major challenge in visualizing 3D flow and tensor fields using streamlines. Displaying too many streamlines creates a dense visualization filled with occluded structures, but displaying too few streams risks losing important features. We propose a new streamline exploration approach by visually manipulating the cluttered streamlines by pulling visible layers apart and revealing the hidden structures underneath. This paper presents a customized view-dependent deformation algorithm and an interactive visualization tool to minimize visual clutter in 3D vector and tensor fields. The algorithm is able to maintain the overall integrity of the fields and expose previously hidden structures. Our system supports both mouse and direct-touch interactions to manipulate the viewing perspectives and visualize the streamlines in depth. By using a lens metaphor of different shapes to select the transition zone of the targeted area interactively, the users can move their focus and examine the vector or tensor field freely. PMID:26600061
NASA Astrophysics Data System (ADS)
Han, Sheng; Xi, Shi-qiong; Geng, Wei-dong
2017-11-01
In order to solve the problem of low recognition rate of traditional feature extraction operators under low-resolution images, a novel algorithm of expression recognition is proposed, named central oblique average center-symmetric local binary pattern (CS-LBP) with adaptive threshold (ATCS-LBP). Firstly, the features of face images can be extracted by the proposed operator after pretreatment. Secondly, the obtained feature image is divided into blocks. Thirdly, the histogram of each block is computed independently and all histograms can be connected serially to create a final feature vector. Finally, expression classification is achieved by using support vector machine (SVM) classifier. Experimental results on Japanese female facial expression (JAFFE) database show that the proposed algorithm can achieve a recognition rate of 81.9% when the resolution is as low as 16×16, which is much better than that of the traditional feature extraction operators.
View-Dependent Streamline Deformation and Exploration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tong, Xin; Edwards, John; Chen, Chun-Ming
Occlusion presents a major challenge in visualizing 3D flow and tensor fields using streamlines. Displaying too many streamlines creates a dense visualization filled with occluded structures, but displaying too few streams risks losing important features. We propose a new streamline exploration approach by visually manipulating the cluttered streamlines by pulling visible layers apart and revealing the hidden structures underneath. This paper presents a customized view-dependent deformation algorithm and an interactive visualization tool to minimize visual cluttering for visualizing 3D vector and tensor fields. The algorithm is able to maintain the overall integrity of the fields and expose previously hidden structures.more » Our system supports both mouse and direct-touch interactions to manipulate the viewing perspectives and visualize the streamlines in depth. By using a lens metaphor of different shapes to select the transition zone of the targeted area interactively, the users can move their focus and examine the vector or tensor field freely.« less
View-Dependent Streamline Deformation and Exploration.
Tong, Xin; Edwards, John; Chen, Chun-Ming; Shen, Han-Wei; Johnson, Chris R; Wong, Pak Chung
2016-07-01
Occlusion presents a major challenge in visualizing 3D flow and tensor fields using streamlines. Displaying too many streamlines creates a dense visualization filled with occluded structures, but displaying too few streams risks losing important features. We propose a new streamline exploration approach by visually manipulating the cluttered streamlines by pulling visible layers apart and revealing the hidden structures underneath. This paper presents a customized view-dependent deformation algorithm and an interactive visualization tool to minimize visual clutter in 3D vector and tensor fields. The algorithm is able to maintain the overall integrity of the fields and expose previously hidden structures. Our system supports both mouse and direct-touch interactions to manipulate the viewing perspectives and visualize the streamlines in depth. By using a lens metaphor of different shapes to select the transition zone of the targeted area interactively, the users can move their focus and examine the vector or tensor field freely.
Fast-Solving Quasi-Optimal LS-S3VM Based on an Extended Candidate Set.
Ma, Yuefeng; Liang, Xun; Kwok, James T; Li, Jianping; Zhou, Xiaoping; Zhang, Haiyan
2018-04-01
The semisupervised least squares support vector machine (LS-S 3 VM) is an important enhancement of least squares support vector machines in semisupervised learning. Given that most data collected from the real world are without labels, semisupervised approaches are more applicable than standard supervised approaches. Although a few training methods for LS-S 3 VM exist, the problem of deriving the optimal decision hyperplane efficiently and effectually has not been solved. In this paper, a fully weighted model of LS-S 3 VM is proposed, and a simple integer programming (IP) model is introduced through an equivalent transformation to solve the model. Based on the distances between the unlabeled data and the decision hyperplane, a new indicator is designed to represent the possibility that the label of an unlabeled datum should be reversed in each iteration during training. Using the indicator, we construct an extended candidate set consisting of the indices of unlabeled data with high possibilities, which integrates more information from unlabeled data. Our algorithm is degenerated into a special scenario of the previous algorithm when the extended candidate set is reduced into a set with only one element. Two strategies are utilized to determine the descent directions based on the extended candidate set. Furthermore, we developed a novel method for locating a good starting point based on the properties of the equivalent IP model. Combined with the extended candidate set and the carefully computed starting point, a fast algorithm to solve LS-S 3 VM quasi-optimally is proposed. The choice of quasi-optimal solutions results in low computational cost and avoidance of overfitting. Experiments show that our algorithm equipped with the two designed strategies is more effective than other algorithms in at least one of the following three aspects: 1) computational complexity; 2) generalization ability; and 3) flexibility. However, our algorithm and other algorithms have similar levels of performance in the remaining aspects.
Implicit, nonswitching, vector-oriented algorithm for steady transonic flow
NASA Technical Reports Server (NTRS)
Lottati, I.
1983-01-01
A rapid computation of a sequence of transonic flow solutions has to be performed in many areas of aerodynamic technology. The employment of low-cost vector array processors makes the conduction of such calculations economically feasible. However, for a full utilization of the new hardware, the developed algorithms must take advantage of the special characteristics of the vector array processor. The present investigation has the objective to develop an efficient algorithm for solving transonic flow problems governed by mixed partial differential equations on an array processor.
Zhang, Yu-xin; Cheng, Zhi-feng; Xu, Zheng-ping; Bai, Jing
2015-01-01
In order to solve the problems such as complex operation, consumption for the carrier gas and long test period in traditional power transformer fault diagnosis approach based on dissolved gas analysis (DGA), this paper proposes a new method which is detecting 5 types of characteristic gas content in transformer oil such as CH4, C2H2, C2H4, C2H6 and H2 based on photoacoustic Spectroscopy and C2H2/C2H4, CH4/H2, C2H4/C2H6 three-ratios data are calculated. The support vector machine model was constructed using cross validation method under five support vector machine functions and four kernel functions, heuristic algorithms were used in parameter optimization for penalty factor c and g, which to establish the best SVM model for the highest fault diagnosis accuracy and the fast computing speed. Particles swarm optimization and genetic algorithm two types of heuristic algorithms were comparative studied in this paper for accuracy and speed in optimization. The simulation result shows that SVM model composed of C-SVC, RBF kernel functions and genetic algorithm obtain 97. 5% accuracy in test sample set and 98. 333 3% accuracy in train sample set, and genetic algorithm was about two times faster than particles swarm optimization in computing speed. The methods described in this paper has many advantages such as simple operation, non-contact measurement, no consumption for the carrier gas, long test period, high stability and sensitivity, the result shows that the methods described in this paper can instead of the traditional transformer fault diagnosis by gas chromatography and meets the actual project needs in transformer fault diagnosis.
VizieR Online Data Catalog: OCSVM anomalies (Solarz+, 2017)
NASA Astrophysics Data System (ADS)
Solarz, A.; Bilicki, M.; Gromadzki, M.; Pollo, A.; Durkalec, A.; Wypych, M.
2017-07-01
One table containing 642,353 sources selected as anomalous with one-class support vector machine algorithm in AllWISE data release. Data have AllWISE photometry in W1, W2 and W3 passband and include W3 flux correction described in Krakowski et al. (2016A&A...596A..39K). (1 data file).
NASA Astrophysics Data System (ADS)
Ramalingam, V. V.; Pandian, A.; Jaiswal, Abhijeet; Bhatia, Nikhar
2018-04-01
This paper presents a novel method based on concept of Machine Learning for Emotion Detection using various algorithms of Support Vector Machine and major emotions described are linked to the Word-Net for enhanced accuracy. The approach proposed plays a promising role to augment the Artificial Intelligence in the near future and could be vital in optimization of Human-Machine Interface.
REQUEST: A Recursive QUEST Algorithm for Sequential Attitude Determination
NASA Technical Reports Server (NTRS)
Bar-Itzhack, Itzhack Y.
1996-01-01
In order to find the attitude of a spacecraft with respect to a reference coordinate system, vector measurements are taken. The vectors are pairs of measurements of the same generalized vector, taken in the spacecraft body coordinates, as well as in the reference coordinate system. We are interested in finding the best estimate of the transformation between these coordinate system.s The algorithm called QUEST yields that estimate where attitude is expressed by a quarternion. Quest is an efficient algorithm which provides a least squares fit of the quaternion of rotation to the vector measurements. Quest however, is a single time point (single frame) batch algorithm, thus measurements that were taken at previous time points are discarded. The algorithm presented in this work provides a recursive routine which considers all past measurements. The algorithm is based on on the fact that the, so called, K matrix, one of whose eigenvectors is the sought quaternion, is linerly related to the measured pairs, and on the ability to propagate K. The extraction of the appropriate eigenvector is done according to the classical QUEST algorithm. This stage, however, can be eliminated, and the computation simplified, if a standard eigenvalue-eigenvector solver algorithm is used. The development of the recursive algorithm is presented and illustrated via a numerical example.
Three learning phases for radial-basis-function networks.
Schwenker, F; Kestler, H A; Palm, G
2001-05-01
In this paper, learning algorithms for radial basis function (RBF) networks are discussed. Whereas multilayer perceptrons (MLP) are typically trained with backpropagation algorithms, starting the training procedure with a random initialization of the MLP's parameters, an RBF network may be trained in many different ways. We categorize these RBF training methods into one-, two-, and three-phase learning schemes. Two-phase RBF learning is a very common learning scheme. The two layers of an RBF network are learnt separately; first the RBF layer is trained, including the adaptation of centers and scaling parameters, and then the weights of the output layer are adapted. RBF centers may be trained by clustering, vector quantization and classification tree algorithms, and the output layer by supervised learning (through gradient descent or pseudo inverse solution). Results from numerical experiments of RBF classifiers trained by two-phase learning are presented in three completely different pattern recognition applications: (a) the classification of 3D visual objects; (b) the recognition hand-written digits (2D objects); and (c) the categorization of high-resolution electrocardiograms given as a time series (ID objects) and as a set of features extracted from these time series. In these applications, it can be observed that the performance of RBF classifiers trained with two-phase learning can be improved through a third backpropagation-like training phase of the RBF network, adapting the whole set of parameters (RBF centers, scaling parameters, and output layer weights) simultaneously. This, we call three-phase learning in RBF networks. A practical advantage of two- and three-phase learning in RBF networks is the possibility to use unlabeled training data for the first training phase. Support vector (SV) learning in RBF networks is a different learning approach. SV learning can be considered, in this context of learning, as a special type of one-phase learning, where only the output layer weights of the RBF network are calculated, and the RBF centers are restricted to be a subset of the training data. Numerical experiments with several classifier schemes including k-nearest-neighbor, learning vector quantization and RBF classifiers trained through two-phase, three-phase and support vector learning are given. The performance of the RBF classifiers trained through SV learning and three-phase learning are superior to the results of two-phase learning, but SV learning often leads to complex network structures, since the number of support vectors is not a small fraction of the total number of data points.
Zhang, Yanjun; Zhang, Xiangmin; Liu, Wenhui; Luo, Yuxi; Yu, Enjia; Zou, Keju; Liu, Xiaoliang
2014-01-01
This paper employed the clinical Polysomnographic (PSG) data, mainly including all-night Electroencephalogram (EEG), Electrooculogram (EOG) and Electromyogram (EMG) signals of subjects, and adopted the American Academy of Sleep Medicine (AASM) clinical staging manual as standards to realize automatic sleep staging. Authors extracted eighteen different features of EEG, EOG and EMG in time domains and frequency domains to construct the vectors according to the existing literatures as well as clinical experience. By adopting sleep samples self-learning, the linear combination of weights and parameters of multiple kernels of the fuzzy support vector machine (FSVM) were learned and the multi-kernel FSVM (MK-FSVM) was constructed. The overall agreement between the experts' scores and the results presented was 82.53%. Compared with previous results, the accuracy of N1 was improved to some extent while the accuracies of other stages were approximate, which well reflected the sleep structure. The staging algorithm proposed in this paper is transparent, and worth further investigation.
Interpreting linear support vector machine models with heat map molecule coloring
2011-01-01
Background Model-based virtual screening plays an important role in the early drug discovery stage. The outcomes of high-throughput screenings are a valuable source for machine learning algorithms to infer such models. Besides a strong performance, the interpretability of a machine learning model is a desired property to guide the optimization of a compound in later drug discovery stages. Linear support vector machines showed to have a convincing performance on large-scale data sets. The goal of this study is to present a heat map molecule coloring technique to interpret linear support vector machine models. Based on the weights of a linear model, the visualization approach colors each atom and bond of a compound according to its importance for activity. Results We evaluated our approach on a toxicity data set, a chromosome aberration data set, and the maximum unbiased validation data sets. The experiments show that our method sensibly visualizes structure-property and structure-activity relationships of a linear support vector machine model. The coloring of ligands in the binding pocket of several crystal structures of a maximum unbiased validation data set target indicates that our approach assists to determine the correct ligand orientation in the binding pocket. Additionally, the heat map coloring enables the identification of substructures important for the binding of an inhibitor. Conclusions In combination with heat map coloring, linear support vector machine models can help to guide the modification of a compound in later stages of drug discovery. Particularly substructures identified as important by our method might be a starting point for optimization of a lead compound. The heat map coloring should be considered as complementary to structure based modeling approaches. As such, it helps to get a better understanding of the binding mode of an inhibitor. PMID:21439031
Quantum machine learning for quantum anomaly detection
NASA Astrophysics Data System (ADS)
Liu, Nana; Rebentrost, Patrick
2018-04-01
Anomaly detection is used for identifying data that deviate from "normal" data patterns. Its usage on classical data finds diverse applications in many important areas such as finance, fraud detection, medical diagnoses, data cleaning, and surveillance. With the advent of quantum technologies, anomaly detection of quantum data, in the form of quantum states, may become an important component of quantum applications. Machine-learning algorithms are playing pivotal roles in anomaly detection using classical data. Two widely used algorithms are the kernel principal component analysis and the one-class support vector machine. We find corresponding quantum algorithms to detect anomalies in quantum states. We show that these two quantum algorithms can be performed using resources that are logarithmic in the dimensionality of quantum states. For pure quantum states, these resources can also be logarithmic in the number of quantum states used for training the machine-learning algorithm. This makes these algorithms potentially applicable to big quantum data applications.
Support vector machines for nuclear reactor state estimation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zavaljevski, N.; Gross, K. C.
2000-02-14
Validation of nuclear power reactor signals is often performed by comparing signal prototypes with the actual reactor signals. The signal prototypes are often computed based on empirical data. The implementation of an estimation algorithm which can make predictions on limited data is an important issue. A new machine learning algorithm called support vector machines (SVMS) recently developed by Vladimir Vapnik and his coworkers enables a high level of generalization with finite high-dimensional data. The improved generalization in comparison with standard methods like neural networks is due mainly to the following characteristics of the method. The input data space is transformedmore » into a high-dimensional feature space using a kernel function, and the learning problem is formulated as a convex quadratic programming problem with a unique solution. In this paper the authors have applied the SVM method for data-based state estimation in nuclear power reactors. In particular, they implemented and tested kernels developed at Argonne National Laboratory for the Multivariate State Estimation Technique (MSET), a nonlinear, nonparametric estimation technique with a wide range of applications in nuclear reactors. The methodology has been applied to three data sets from experimental and commercial nuclear power reactor applications. The results are promising. The combination of MSET kernels with the SVM method has better noise reduction and generalization properties than the standard MSET algorithm.« less
Modelling soil water retention using support vector machines with genetic algorithm optimisation.
Lamorski, Krzysztof; Sławiński, Cezary; Moreno, Felix; Barna, Gyöngyi; Skierucha, Wojciech; Arrue, José L
2014-01-01
This work presents point pedotransfer function (PTF) models of the soil water retention curve. The developed models allowed for estimation of the soil water content for the specified soil water potentials: -0.98, -3.10, -9.81, -31.02, -491.66, and -1554.78 kPa, based on the following soil characteristics: soil granulometric composition, total porosity, and bulk density. Support Vector Machines (SVM) methodology was used for model development. A new methodology for elaboration of retention function models is proposed. Alternative to previous attempts known from literature, the ν-SVM method was used for model development and the results were compared with the formerly used the C-SVM method. For the purpose of models' parameters search, genetic algorithms were used as an optimisation framework. A new form of the aim function used for models parameters search is proposed which allowed for development of models with better prediction capabilities. This new aim function avoids overestimation of models which is typically encountered when root mean squared error is used as an aim function. Elaborated models showed good agreement with measured soil water retention data. Achieved coefficients of determination values were in the range 0.67-0.92. Studies demonstrated usability of ν-SVM methodology together with genetic algorithm optimisation for retention modelling which gave better performing models than other tested approaches.
Li, Ji; Hu, Guoqing; Zhou, Yonghong; Zou, Chong; Peng, Wei; Alam SM, Jahangir
2017-01-01
As a high performance-cost ratio solution for differential pressure measurement, piezo-resistive differential pressure sensors are widely used in engineering processes. However, their performance is severely affected by the environmental temperature and the static pressure applied to them. In order to modify the non-linear measuring characteristics of the piezo-resistive differential pressure sensor, compensation actions should synthetically consider these two aspects. Advantages such as nonlinear approximation capability, highly desirable generalization ability and computational efficiency make the kernel extreme learning machine (KELM) a practical approach for this critical task. Since the KELM model is intrinsically sensitive to the regularization parameter and the kernel parameter, a searching scheme combining the coupled simulated annealing (CSA) algorithm and the Nelder-Mead simplex algorithm is adopted to find an optimal KLEM parameter set. A calibration experiment at different working pressure levels was conducted within the temperature range to assess the proposed method. In comparison with other compensation models such as the back-propagation neural network (BP), radius basis neural network (RBF), particle swarm optimization optimized support vector machine (PSO-SVM), particle swarm optimization optimized least squares support vector machine (PSO-LSSVM) and extreme learning machine (ELM), the compensation results show that the presented compensation algorithm exhibits a more satisfactory performance with respect to temperature compensation and synthetic compensation problems. PMID:28422080
Li, Ji; Hu, Guoqing; Zhou, Yonghong; Zou, Chong; Peng, Wei; Alam Sm, Jahangir
2017-04-19
As a high performance-cost ratio solution for differential pressure measurement, piezo-resistive differential pressure sensors are widely used in engineering processes. However, their performance is severely affected by the environmental temperature and the static pressure applied to them. In order to modify the non-linear measuring characteristics of the piezo-resistive differential pressure sensor, compensation actions should synthetically consider these two aspects. Advantages such as nonlinear approximation capability, highly desirable generalization ability and computational efficiency make the kernel extreme learning machine (KELM) a practical approach for this critical task. Since the KELM model is intrinsically sensitive to the regularization parameter and the kernel parameter, a searching scheme combining the coupled simulated annealing (CSA) algorithm and the Nelder-Mead simplex algorithm is adopted to find an optimal KLEM parameter set. A calibration experiment at different working pressure levels was conducted within the temperature range to assess the proposed method. In comparison with other compensation models such as the back-propagation neural network (BP), radius basis neural network (RBF), particle swarm optimization optimized support vector machine (PSO-SVM), particle swarm optimization optimized least squares support vector machine (PSO-LSSVM) and extreme learning machine (ELM), the compensation results show that the presented compensation algorithm exhibits a more satisfactory performance with respect to temperature compensation and synthetic compensation problems.
Gene/protein name recognition based on support vector machine using dictionary as features.
Mitsumori, Tomohiro; Fation, Sevrani; Murata, Masaki; Doi, Kouichi; Doi, Hirohumi
2005-01-01
Automated information extraction from biomedical literature is important because a vast amount of biomedical literature has been published. Recognition of the biomedical named entities is the first step in information extraction. We developed an automated recognition system based on the SVM algorithm and evaluated it in Task 1.A of BioCreAtIvE, a competition for automated gene/protein name recognition. In the work presented here, our recognition system uses the feature set of the word, the part-of-speech (POS), the orthography, the prefix, the suffix, and the preceding class. We call these features "internal resource features", i.e., features that can be found in the training data. Additionally, we consider the features of matching against dictionaries to be external resource features. We investigated and evaluated the effect of these features as well as the effect of tuning the parameters of the SVM algorithm. We found that the dictionary matching features contributed slightly to the improvement in the performance of the f-score. We attribute this to the possibility that the dictionary matching features might overlap with other features in the current multiple feature setting. During SVM learning, each feature alone had a marginally positive effect on system performance. This supports the fact that the SVM algorithm is robust on the high dimensionality of the feature vector space and means that feature selection is not required.
Attitude Determination Using Two Vector Measurements
NASA Technical Reports Server (NTRS)
Markley, F. Landis
1998-01-01
Many spacecraft attitude determination methods use exactly two vector measurements. The two vectors are typically the unit vector to the Sun and the Earth's magnetic field vector for coarse "sun-mag" attitude determination or unit vectors to two stars tracked by two star trackers for fine attitude determination. TRIAD, the earliest published algorithm for determining spacecraft attitude from two vector measurements, has been widely used in both ground-based and onboard attitude determination. Later attitude determination methods have been based on Wahba's optimality criterion for n arbitrarily weighted observations. The solution of Wahba's problem is somewhat difficult in the general case, but there is a simple closed-form solution in the two-observation case. This solution reduces to the TRIAD solution for certain choices of measurement weights. This paper presents and compares these algorithms as well as sub-optimal algorithms proposed by Bar-Itzhack, Harman, and Reynolds. Some new results will be presented, but the paper is primarily a review and tutorial.
On the Impact of Widening Vector Registers on Sequence Alignment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daily, Jeffrey A.; Kalyanaraman, Anantharaman; Krishnamoorthy, Sriram
2016-09-22
Vector extensions, such as SSE, have been part of the x86 since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. In this paper, we demonstrate that the trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based onmore » striped data layouts. We present a practically efficient SIMD implementation of a parallel scan based sequence alignment algorithm that can better exploit wider SIMD units. We conduct comprehensive workload and use case analyses to characterize the relative behavior of the striped and scan approaches and identify the best choice of algorithm based on input length and SIMD width.« less
NASA Technical Reports Server (NTRS)
Shaffer, Scott; Dunbar, R. Scott; Hsiao, S. Vincent; Long, David G.
1989-01-01
The NASA Scatterometer, NSCAT, is an active spaceborne radar designed to measure the normalized radar backscatter coefficient (sigma0) of the ocean surface. These measurements can, in turn, be used to infer the surface vector wind over the ocean using a geophysical model function. Several ambiguous wind vectors result because of the nature of the model function. A median-filter-based ambiguity removal algorithm will be used by the NSCAT ground data processor to select the best wind vector from the set of ambiguous wind vectors. This process is commonly known as dealiasing or ambiguity removal. The baseline NSCAT ambiguity removal algorithm and the method used to select the set of optimum parameter values are described. An extensive simulation of the NSCAT instrument and ground data processor provides a means of testing the resulting tuned algorithm. This simulation generates the ambiguous wind-field vectors expected from the instrument as it orbits over a set of realistic meoscale wind fields. The ambiguous wind field is then dealiased using the median-based ambiguity removal algorithm. Performance is measured by comparison of the unambiguous wind fields with the true wind fields. Results have shown that the median-filter-based ambiguity removal algorithm satisfies NSCAT mission requirements.
Transportation Modes Classification Using Sensors on Smartphones.
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-08-19
This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user's transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes.
Transportation Modes Classification Using Sensors on Smartphones
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-01-01
This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user’s transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes. PMID:27548182
A Scatter-Based Prototype Framework and Multi-Class Extension of Support Vector Machines
Jenssen, Robert; Kloft, Marius; Zien, Alexander; Sonnenburg, Sören; Müller, Klaus-Robert
2012-01-01
We provide a novel interpretation of the dual of support vector machines (SVMs) in terms of scatter with respect to class prototypes and their mean. As a key contribution, we extend this framework to multiple classes, providing a new joint Scatter SVM algorithm, at the level of its binary counterpart in the number of optimization variables. This enables us to implement computationally efficient solvers based on sequential minimal and chunking optimization. As a further contribution, the primal problem formulation is developed in terms of regularized risk minimization and the hinge loss, revealing the score function to be used in the actual classification of test patterns. We investigate Scatter SVM properties related to generalization ability, computational efficiency, sparsity and sensitivity maps, and report promising results. PMID:23118845
Segmentation of mosaicism in cervicographic images using support vector machines
NASA Astrophysics Data System (ADS)
Xue, Zhiyun; Long, L. Rodney; Antani, Sameer; Jeronimo, Jose; Thoma, George R.
2009-02-01
The National Library of Medicine (NLM), in collaboration with the National Cancer Institute (NCI), is creating a large digital repository of cervicographic images for the study of uterine cervix cancer prevention. One of the research goals is to automatically detect diagnostic bio-markers in these images. Reliable bio-marker segmentation in large biomedical image collections is a challenging task due to the large variation in image appearance. Methods described in this paper focus on segmenting mosaicism, which is an important vascular feature used to visually assess the degree of cervical intraepithelial neoplasia. The proposed approach uses support vector machines (SVM) trained on a ground truth dataset annotated by medical experts (which circumvents the need for vascular structure extraction). We have evaluated the performance of the proposed algorithm and experimentally demonstrated its feasibility.
Introduction to Vector Field Visualization
NASA Technical Reports Server (NTRS)
Kao, David; Shen, Han-Wei
2010-01-01
Vector field visualization techniques are essential to help us understand the complex dynamics of flow fields. These can be found in a wide range of applications such as study of flows around an aircraft, the blood flow in our heart chambers, ocean circulation models, and severe weather predictions. The vector fields from these various applications can be visually depicted using a number of techniques such as particle traces and advecting textures. In this tutorial, we present several fundamental algorithms in flow visualization including particle integration, particle tracking in time-dependent flows, and seeding strategies. For flows near surfaces, a wide variety of synthetic texture-based algorithms have been developed to depict near-body flow features. The most common approach is based on the Line Integral Convolution (LIC) algorithm. There also exist extensions of LIC to support more flexible texture generations for 3D flow data. This tutorial reviews these algorithms. Tensor fields are found in several real-world applications and also require the aid of visualization to help users understand their data sets. Examples where one can find tensor fields include mechanics to see how material respond to external forces, civil engineering and geomechanics of roads and bridges, and the study of neural pathway via diffusion tensor imaging. This tutorial will provide an overview of the different tensor field visualization techniques, discuss basic tensor decompositions, and go into detail on glyph based methods, deformation based methods, and streamline based methods. Practical examples will be used when presenting the methods; and applications from some case studies will be used as part of the motivation.
Comparison of algorithms for computing the two-dimensional discrete Hartley transform
NASA Technical Reports Server (NTRS)
Reichenbach, Stephen E.; Burton, John C.; Miller, Keith W.
1989-01-01
Three methods have been described for computing the two-dimensional discrete Hartley transform. Two of these employ a separable transform, the third method, the vector-radix algorithm, does not require separability. In-place computation of the vector-radix method is described. Operation counts and execution times indicate that the vector-radix method is fastest.
NASA Astrophysics Data System (ADS)
Bai, Chen; Han, Dongjuan
2018-04-01
MUSIC is widely used on DOA estimation. Triangle grid is a common kind of the arrangement of array, but it is more complicated than rectangular array in calculation of steering vector. In this paper, the quaternions algorithm can reduce dimension of vector and make the calculation easier.
A Turn-Projected State-Based Conflict Resolution Algorithm
NASA Technical Reports Server (NTRS)
Butler, Ricky W.; Lewis, Timothy A.
2013-01-01
State-based conflict detection and resolution (CD&R) algorithms detect conflicts and resolve them on the basis on current state information without the use of additional intent information from aircraft flight plans. Therefore, the prediction of the trajectory of aircraft is based solely upon the position and velocity vectors of the traffic aircraft. Most CD&R algorithms project the traffic state using only the current state vectors. However, the past state vectors can be used to make a better prediction of the future trajectory of the traffic aircraft. This paper explores the idea of using past state vectors to detect traffic turns and resolve conflicts caused by these turns using a non-linear projection of the traffic state. A new algorithm based on this idea is presented and validated using a fast-time simulator developed for this study.
Polar decomposition for attitude determination from vector observations
NASA Technical Reports Server (NTRS)
Bar-Itzhack, Itzhack Y.
1993-01-01
This work treats the problem of weighted least squares fitting of a 3D Euclidean-coordinate transformation matrix to a set of unit vectors measured in the reference and transformed coordinates. A closed-form analytic solution to the problem is re-derived. The fact that the solution is the closest orthogonal matrix to some matrix defined on the measured vectors and their weights is clearly demonstrated. Several known algorithms for computing the analytic closed form solution are considered. An algorithm is discussed which is based on the polar decomposition of matrices into the closest unitary matrix to the decomposed matrix and a Hermitian matrix. A somewhat longer improved algorithm is suggested too. A comparison of several algorithms is carried out using simulated data as well as real data from the Upper Atmosphere Research Satellite. The comparison is based on accuracy and time consumption. It is concluded that the algorithms based on polar decomposition yield a simple although somewhat less accurate solution. The precision of the latter algorithms increase with the number of the measured vectors and with the accuracy of their measurement.
Detection of Abnormal Events via Optical Flow Feature Analysis
Wang, Tian; Snoussi, Hichem
2015-01-01
In this paper, a novel algorithm is proposed to detect abnormal events in video streams. The algorithm is based on the histogram of the optical flow orientation descriptor and the classification method. The details of the histogram of the optical flow orientation descriptor are illustrated for describing movement information of the global video frame or foreground frame. By combining one-class support vector machine and kernel principal component analysis methods, the abnormal events in the current frame can be detected after a learning period characterizing normal behaviors. The difference abnormal detection results are analyzed and explained. The proposed detection method is tested on benchmark datasets, then the experimental results show the effectiveness of the algorithm. PMID:25811227
Classification of fMRI resting-state maps using machine learning techniques: A comparative study
NASA Astrophysics Data System (ADS)
Gallos, Ioannis; Siettos, Constantinos
2017-11-01
We compare the efficiency of Principal Component Analysis (PCA) and nonlinear learning manifold algorithms (ISOMAP and Diffusion maps) for classifying brain maps between groups of schizophrenia patients and healthy from fMRI scans during a resting-state experiment. After a standard pre-processing pipeline, we applied spatial Independent component analysis (ICA) to reduce (a) noise and (b) spatial-temporal dimensionality of fMRI maps. On the cross-correlation matrix of the ICA components, we applied PCA, ISOMAP and Diffusion Maps to find an embedded low-dimensional space. Finally, support-vector-machines (SVM) and k-NN algorithms were used to evaluate the performance of the algorithms in classifying between the two groups.
NASA Astrophysics Data System (ADS)
Borodinov, A. A.; Myasnikov, V. V.
2018-04-01
The present work is devoted to comparing the accuracy of the known qualification algorithms in the task of recognizing local objects on radar images for various image preprocessing methods. Preprocessing involves speckle noise filtering and normalization of the object orientation in the image by the method of image moments and by a method based on the Hough transform. In comparison, the following classification algorithms are used: Decision tree; Support vector machine, AdaBoost, Random forest. The principal component analysis is used to reduce the dimension. The research is carried out on the objects from the base of radar images MSTAR. The paper presents the results of the conducted studies.
[The application of gene expression programming in the diagnosis of heart disease].
Dai, Wenbin; Zhang, Yuntao; Gao, Xingyu
2009-02-01
GEP (Gene expression programming) is a new genetic algorithm, and it has been proved to be excellent in function finding. In this paper, for the purpose of setting up a diagnostic model, GEP is used to deal with the data of heart disease. Eight variables, Sex, Chest pain, Blood pressure, Angina, Peak, Slope, Colored vessels and Thal, are picked out of thirteen variables to form a classified function. This function is used to predict a forecasting set of 100 samples, and the accuracy is 87%. Other algorithms such as SVM (Support vector machine) are applied to the same data and the forecasting results show that GEP is better than other algorithms.
NASA Astrophysics Data System (ADS)
Luo, Bin; Lin, Lin; Zhong, ShiSheng
2018-02-01
In this research, we propose a preference-guided optimisation algorithm for multi-criteria decision-making (MCDM) problems with interval-valued fuzzy preferences. The interval-valued fuzzy preferences are decomposed into a series of precise and evenly distributed preference-vectors (reference directions) regarding the objectives to be optimised on the basis of uniform design strategy firstly. Then the preference information is further incorporated into the preference-vectors based on the boundary intersection approach, meanwhile, the MCDM problem with interval-valued fuzzy preferences is reformulated into a series of single-objective optimisation sub-problems (each sub-problem corresponds to a decomposed preference-vector). Finally, a preference-guided optimisation algorithm based on MOEA/D (multi-objective evolutionary algorithm based on decomposition) is proposed to solve the sub-problems in a single run. The proposed algorithm incorporates the preference-vectors within the optimisation process for guiding the search procedure towards a more promising subset of the efficient solutions matching the interval-valued fuzzy preferences. In particular, lots of test instances and an engineering application are employed to validate the performance of the proposed algorithm, and the results demonstrate the effectiveness and feasibility of the algorithm.
An efficient parallel algorithm for matrix-vector multiplication
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hendrickson, B.; Leland, R.; Plimpton, S.
The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is O(n/[radical]p + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in themore » well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.« less
NASA Technical Reports Server (NTRS)
Jaggi, S.
1993-01-01
A study is conducted to investigate the effects and advantages of data compression techniques on multispectral imagery data acquired by NASA's airborne scanners at the Stennis Space Center. The first technique used was vector quantization. The vector is defined in the multispectral imagery context as an array of pixels from the same location from each channel. The error obtained in substituting the reconstructed images for the original set is compared for different compression ratios. Also, the eigenvalues of the covariance matrix obtained from the reconstructed data set are compared with the eigenvalues of the original set. The effects of varying the size of the vector codebook on the quality of the compression and on subsequent classification are also presented. The output data from the Vector Quantization algorithm was further compressed by a lossless technique called Difference-mapped Shift-extended Huffman coding. The overall compression for 7 channels of data acquired by the Calibrated Airborne Multispectral Scanner (CAMS), with an RMS error of 15.8 pixels was 195:1 (0.41 bpp) and with an RMS error of 3.6 pixels was 18:1 (.447 bpp). The algorithms were implemented in software and interfaced with the help of dedicated image processing boards to an 80386 PC compatible computer. Modules were developed for the task of image compression and image analysis. Also, supporting software to perform image processing for visual display and interpretation of the compressed/classified images was developed.
Back analysis of geomechanical parameters in underground engineering using artificial bee colony.
Zhu, Changxing; Zhao, Hongbo; Zhao, Ming
2014-01-01
Accurate geomechanical parameters are critical in tunneling excavation, design, and supporting. In this paper, a displacements back analysis based on artificial bee colony (ABC) algorithm is proposed to identify geomechanical parameters from monitored displacements. ABC was used as global optimal algorithm to search the unknown geomechanical parameters for the problem with analytical solution. To the problem without analytical solution, optimal back analysis is time-consuming, and least square support vector machine (LSSVM) was used to build the relationship between unknown geomechanical parameters and displacement and improve the efficiency of back analysis. The proposed method was applied to a tunnel with analytical solution and a tunnel without analytical solution. The results show the proposed method is feasible.
Automated assessment of cognitive health using smart home technologies.
Dawadi, Prafulla N; Cook, Diane J; Schmitter-Edgecombe, Maureen; Parsey, Carolyn
2013-01-01
The goal of this work is to develop intelligent systems to monitor the wellbeing of individuals in their home environments. This paper introduces a machine learning-based method to automatically predict activity quality in smart homes and automatically assess cognitive health based on activity quality. This paper describes an automated framework to extract set of features from smart home sensors data that reflects the activity performance or ability of an individual to complete an activity which can be input to machine learning algorithms. Output from learning algorithms including principal component analysis, support vector machine, and logistic regression algorithms are used to quantify activity quality for a complex set of smart home activities and predict cognitive health of participants. Smart home activity data was gathered from volunteer participants (n=263) who performed a complex set of activities in our smart home testbed. We compare our automated activity quality prediction and cognitive health prediction with direct observation scores and health assessment obtained from neuropsychologists. With all samples included, we obtained statistically significant correlation (r=0.54) between direct observation scores and predicted activity quality. Similarly, using a support vector machine classifier, we obtained reasonable classification accuracy (area under the ROC curve=0.80, g-mean=0.73) in classifying participants into two different cognitive classes, dementia and cognitive healthy. The results suggest that it is possible to automatically quantify the task quality of smart home activities and perform limited assessment of the cognitive health of individual if smart home activities are properly chosen and learning algorithms are appropriately trained.
Research on Remote Sensing Image Classification Based on Feature Level Fusion
NASA Astrophysics Data System (ADS)
Yuan, L.; Zhu, G.
2018-04-01
Remote sensing image classification, as an important direction of remote sensing image processing and application, has been widely studied. However, in the process of existing classification algorithms, there still exists the phenomenon of misclassification and missing points, which leads to the final classification accuracy is not high. In this paper, we selected Sentinel-1A and Landsat8 OLI images as data sources, and propose a classification method based on feature level fusion. Compare three kind of feature level fusion algorithms (i.e., Gram-Schmidt spectral sharpening, Principal Component Analysis transform and Brovey transform), and then select the best fused image for the classification experimental. In the classification process, we choose four kinds of image classification algorithms (i.e. Minimum distance, Mahalanobis distance, Support Vector Machine and ISODATA) to do contrast experiment. We use overall classification precision and Kappa coefficient as the classification accuracy evaluation criteria, and the four classification results of fused image are analysed. The experimental results show that the fusion effect of Gram-Schmidt spectral sharpening is better than other methods. In four kinds of classification algorithms, the fused image has the best applicability to Support Vector Machine classification, the overall classification precision is 94.01 % and the Kappa coefficients is 0.91. The fused image with Sentinel-1A and Landsat8 OLI is not only have more spatial information and spectral texture characteristics, but also enhances the distinguishing features of the images. The proposed method is beneficial to improve the accuracy and stability of remote sensing image classification.
Automated Assessment of Cognitive Health Using Smart Home Technologies
Dawadi, Prafulla N.; Cook, Diane J.; Schmitter-Edgecombe, Maureen; Parsey, Carolyn
2014-01-01
BACKGROUND The goal of this work is to develop intelligent systems to monitor the well being of individuals in their home environments. OBJECTIVE This paper introduces a machine learning-based method to automatically predict activity quality in smart homes and automatically assess cognitive health based on activity quality. METHODS This paper describes an automated framework to extract set of features from smart home sensors data that reflects the activity performance or ability of an individual to complete an activity which can be input to machine learning algorithms. Output from learning algorithms including principal component analysis, support vector machine, and logistic regression algorithms are used to quantify activity quality for a complex set of smart home activities and predict cognitive health of participants. RESULTS Smart home activity data was gathered from volunteer participants (n=263) who performed a complex set of activities in our smart home testbed. We compare our automated activity quality prediction and cognitive health prediction with direct observation scores and health assessment obtained from neuropsychologists. With all samples included, we obtained statistically significant correlation (r=0.54) between direct observation scores and predicted activity quality. Similarly, using a support vector machine classifier, we obtained reasonable classification accuracy (area under the ROC curve = 0.80, g-mean = 0.73) in classifying participants into two different cognitive classes, dementia and cognitive healthy. CONCLUSIONS The results suggest that it is possible to automatically quantify the task quality of smart home activities and perform limited assessment of the cognitive health of individual if smart home activities are properly chosen and learning algorithms are appropriately trained. PMID:23949177
A new implementation of the CMRH method for solving dense linear systems
NASA Astrophysics Data System (ADS)
Heyouni, M.; Sadok, H.
2008-04-01
The CMRH method [H. Sadok, Methodes de projections pour les systemes lineaires et non lineaires, Habilitation thesis, University of Lille1, Lille, France, 1994; H. Sadok, CMRH: A new method for solving nonsymmetric linear systems based on the Hessenberg reduction algorithm, Numer. Algorithms 20 (1999) 303-321] is an algorithm for solving nonsymmetric linear systems in which the Arnoldi component of GMRES is replaced by the Hessenberg process, which generates Krylov basis vectors which are orthogonal to standard unit basis vectors rather than mutually orthogonal. The iterate is formed from these vectors by solving a small least squares problem involving a Hessenberg matrix. Like GMRES, this method requires one matrix-vector product per iteration. However, it can be implemented to require half as much arithmetic work and less storage. Moreover, numerical experiments show that this method performs accurately and reduces the residual about as fast as GMRES. With this new implementation, we show that the CMRH method is the only method with long-term recurrence which requires not storing at the same time the entire Krylov vectors basis and the original matrix as in the GMRES algorithmE A comparison with Gaussian elimination is provided.
The generalization ability of online SVM classification based on Markov sampling.
Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang
2015-03-01
In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish the bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present the numerical studies on the learning ability of online SVM classification based on Markov sampling for benchmark repository. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling as the size of training samples is larger.
Structural Analysis of Biodiversity
Sirovich, Lawrence; Stoeckle, Mark Y.; Zhang, Yu
2010-01-01
Large, recently-available genomic databases cover a wide range of life forms, suggesting opportunity for insights into genetic structure of biodiversity. In this study we refine our recently-described technique using indicator vectors to analyze and visualize nucleotide sequences. The indicator vector approach generates correlation matrices, dubbed Klee diagrams, which represent a novel way of assembling and viewing large genomic datasets. To explore its potential utility, here we apply the improved algorithm to a collection of almost 17000 DNA barcode sequences covering 12 widely-separated animal taxa, demonstrating that indicator vectors for classification gave correct assignment in all 11000 test cases. Indicator vector analysis revealed discontinuities corresponding to species- and higher-level taxonomic divisions, suggesting an efficient approach to classification of organisms from poorly-studied groups. As compared to standard distance metrics, indicator vectors preserve diagnostic character probabilities, enable automated classification of test sequences, and generate high-information density single-page displays. These results support application of indicator vectors for comparative analysis of large nucleotide data sets and raise prospect of gaining insight into broad-scale patterns in the genetic structure of biodiversity. PMID:20195371
Multi-Parent Clustering Algorithms from Stochastic Grammar Data Models
NASA Technical Reports Server (NTRS)
Mjoisness, Eric; Castano, Rebecca; Gray, Alexander
1999-01-01
We introduce a statistical data model and an associated optimization-based clustering algorithm which allows data vectors to belong to zero, one or several "parent" clusters. For each data vector the algorithm makes a discrete decision among these alternatives. Thus, a recursive version of this algorithm would place data clusters in a Directed Acyclic Graph rather than a tree. We test the algorithm with synthetic data generated according to the statistical data model. We also illustrate the algorithm using real data from large-scale gene expression assays.
Video data compression using artificial neural network differential vector quantization
NASA Technical Reports Server (NTRS)
Krishnamurthy, Ashok K.; Bibyk, Steven B.; Ahalt, Stanley C.
1991-01-01
An artificial neural network vector quantizer is developed for use in data compression applications such as Digital Video. Differential Vector Quantization is used to preserve edge features, and a new adaptive algorithm, known as Frequency-Sensitive Competitive Learning, is used to develop the vector quantizer codebook. To develop real time performance, a custom Very Large Scale Integration Application Specific Integrated Circuit (VLSI ASIC) is being developed to realize the associative memory functions needed in the vector quantization algorithm. By using vector quantization, the need for Huffman coding can be eliminated, resulting in superior performance against channel bit errors than methods that use variable length codes.
Vectorized algorithms for spiking neural network simulation.
Brette, Romain; Goodman, Dan F M
2011-06-01
High-level languages (Matlab, Python) are popular in neuroscience because they are flexible and accelerate development. However, for simulating spiking neural networks, the cost of interpretation is a bottleneck. We describe a set of algorithms to simulate large spiking neural networks efficiently with high-level languages using vector-based operations. These algorithms constitute the core of Brian, a spiking neural network simulator written in the Python language. Vectorized simulation makes it possible to combine the flexibility of high-level languages with the computational efficiency usually associated with compiled languages.
Wavelet subband coding of computer simulation output using the A++ array class library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bradley, J.N.; Brislawn, C.M.; Quinlan, D.J.
1995-07-01
The goal of the project is to produce utility software for off-line compression of existing data and library code that can be called from a simulation program for on-line compression of data dumps as the simulation proceeds. Naturally, we would like the amount of CPU time required by the compression algorithm to be small in comparison to the requirements of typical simulation codes. We also want the algorithm to accomodate a wide variety of smooth, multidimensional data types. For these reasons, the subband vector quantization (VQ) approach employed in has been replaced by a scalar quantization (SQ) strategy using amore » bank of almost-uniform scalar subband quantizers in a scheme similar to that used in the FBI fingerprint image compression standard. This eliminates the considerable computational burdens of training VQ codebooks for each new type of data and performing nearest-vector searches to encode the data. The comparison of subband VQ and SQ algorithms in indicated that, in practice, there is relatively little additional gain from using vector as opposed to scalar quantization on DWT subbands, even when the source imagery is from a very homogeneous population, and our subjective experience with synthetic computer-generated data supports this stance. It appears that a careful study is needed of the tradeoffs involved in selecting scalar vs. vector subband quantization, but such an analysis is beyond the scope of this paper. Our present work is focused on the problem of generating wavelet transform/scalar quantization (WSQ) implementations that can be ported easily between different hardware environments. This is an extremely important consideration given the great profusion of different high-performance computing architectures available, the high cost associated with learning how to map algorithms effectively onto a new architecture, and the rapid rate of evolution in the world of high-performance computing.« less
Kianmehr, Keivan; Alhajj, Reda
2008-09-01
In this study, we aim at building a classification framework, namely the CARSVM model, which integrates association rule mining and support vector machine (SVM). The goal is to benefit from advantages of both, the discriminative knowledge represented by class association rules and the classification power of the SVM algorithm, to construct an efficient and accurate classifier model that improves the interpretability problem of SVM as a traditional machine learning technique and overcomes the efficiency issues of associative classification algorithms. In our proposed framework: instead of using the original training set, a set of rule-based feature vectors, which are generated based on the discriminative ability of class association rules over the training samples, are presented to the learning component of the SVM algorithm. We show that rule-based feature vectors present a high-qualified source of discrimination knowledge that can impact substantially the prediction power of SVM and associative classification techniques. They provide users with more conveniences in terms of understandability and interpretability as well. We have used four datasets from UCI ML repository to evaluate the performance of the developed system in comparison with five well-known existing classification methods. Because of the importance and popularity of gene expression analysis as real world application of the classification model, we present an extension of CARSVM combined with feature selection to be applied to gene expression data. Then, we describe how this combination will provide biologists with an efficient and understandable classifier model. The reported test results and their biological interpretation demonstrate the applicability, efficiency and effectiveness of the proposed model. From the results, it can be concluded that a considerable increase in classification accuracy can be obtained when the rule-based feature vectors are integrated in the learning process of the SVM algorithm. In the context of applicability, according to the results obtained from gene expression analysis, we can conclude that the CARSVM system can be utilized in a variety of real world applications with some adjustments.
NASA Astrophysics Data System (ADS)
Gao, Wei; Zhu, Linli; Wang, Kaiyun
2015-12-01
Ontology, a model of knowledge representation and storage, has had extensive applications in pharmaceutics, social science, chemistry and biology. In the age of “big data”, the constructed concepts are often represented as higher-dimensional data by scholars, and thus the sparse learning techniques are introduced into ontology algorithms. In this paper, based on the alternating direction augmented Lagrangian method, we present an ontology optimization algorithm for ontological sparse vector learning, and a fast version of such ontology technologies. The optimal sparse vector is obtained by an iterative procedure, and the ontology function is then obtained from the sparse vector. Four simulation experiments show that our ontological sparse vector learning model has a higher precision ratio on plant ontology, humanoid robotics ontology, biology ontology and physics education ontology data for similarity measuring and ontology mapping applications.
Keohane, Bernie M; Mason, Steve M; Baguley, David M
2004-02-01
A novel auditory brainstem response (ABR) detection and scoring algorithm, entitled the Vector algorithm is described. An independent clinical evaluation of the algorithm using 464 tests (120 non-stimulated and 344 stimulated tests) on 60 infants, with a mean age of approximately 6.5 weeks, estimated test sensitivity greater than 0.99 and test specificity at 0.87 for one test. Specificity was estimated to be greater than 0.95 for a two stage screen. Test times were of the order of 1.5 minutes per ear for detection of an ABR and 4.5 minutes per ear in the absence of a clear response. The Vector algorithm is commercially available for both automated screening and threshold estimation in hearing screening devices.
NASA Technical Reports Server (NTRS)
Samba, A. S.
1985-01-01
The problem of solving banded linear systems by direct (non-iterative) techniques on the Vector Processor System (VPS) 32 supercomputer is considered. Two efficient direct methods for solving banded linear systems on the VPS 32 are described. The vector cyclic reduction (VCR) algorithm is discussed in detail. The performance of the VCR on a three parameter model problem is also illustrated. The VCR is an adaptation of the conventional point cyclic reduction algorithm. The second direct method is the Customized Reduction of Augmented Triangles' (CRAT). CRAT has the dominant characteristics of an efficient VPS 32 algorithm. CRAT is tailored to the pipeline architecture of the VPS 32 and as a consequence the algorithm is implicitly vectorizable.
NASA Astrophysics Data System (ADS)
Zhu, Maohu; Jie, Nanfeng; Jiang, Tianzi
2014-03-01
A reliable and precise classification of schizophrenia is significant for its diagnosis and treatment of schizophrenia. Functional magnetic resonance imaging (fMRI) is a novel tool increasingly used in schizophrenia research. Recent advances in statistical learning theory have led to applying pattern classification algorithms to access the diagnostic value of functional brain networks, discovered from resting state fMRI data. The aim of this study was to propose an adaptive learning algorithm to distinguish schizophrenia patients from normal controls using resting-state functional language network. Furthermore, here the classification of schizophrenia was regarded as a sample selection problem where a sparse subset of samples was chosen from the labeled training set. Using these selected samples, which we call informative vectors, a classifier for the clinic diagnosis of schizophrenia was established. We experimentally demonstrated that the proposed algorithm incorporating resting-state functional language network achieved 83.6% leaveone- out accuracy on resting-state fMRI data of 27 schizophrenia patients and 28 normal controls. In contrast with KNearest- Neighbor (KNN), Support Vector Machine (SVM) and l1-norm, our method yielded better classification performance. Moreover, our results suggested that a dysfunction of resting-state functional language network plays an important role in the clinic diagnosis of schizophrenia.
Semisupervised Support Vector Machines With Tangent Space Intrinsic Manifold Regularization.
Sun, Shiliang; Xie, Xijiong
2016-09-01
Semisupervised learning has been an active research topic in machine learning and data mining. One main reason is that labeling examples is expensive and time-consuming, while there are large numbers of unlabeled examples available in many practical problems. So far, Laplacian regularization has been widely used in semisupervised learning. In this paper, we propose a new regularization method called tangent space intrinsic manifold regularization. It is intrinsic to data manifold and favors linear functions on the manifold. Fundamental elements involved in the formulation of the regularization are local tangent space representations, which are estimated by local principal component analysis, and the connections that relate adjacent tangent spaces. Simultaneously, we explore its application to semisupervised classification and propose two new learning algorithms called tangent space intrinsic manifold regularized support vector machines (TiSVMs) and tangent space intrinsic manifold regularized twin SVMs (TiTSVMs). They effectively integrate the tangent space intrinsic manifold regularization consideration. The optimization of TiSVMs can be solved by a standard quadratic programming, while the optimization of TiTSVMs can be solved by a pair of standard quadratic programmings. The experimental results of semisupervised classification problems show the effectiveness of the proposed semisupervised learning algorithms.
2018-01-01
Early detection of power transformer fault is important because it can reduce the maintenance cost of the transformer and it can ensure continuous electricity supply in power systems. Dissolved Gas Analysis (DGA) technique is commonly used to identify oil-filled power transformer fault type but utilisation of artificial intelligence method with optimisation methods has shown convincing results. In this work, a hybrid support vector machine (SVM) with modified evolutionary particle swarm optimisation (EPSO) algorithm was proposed to determine the transformer fault type. The superiority of the modified PSO technique with SVM was evaluated by comparing the results with the actual fault diagnosis, unoptimised SVM and previous reported works. Data reduction was also applied using stepwise regression prior to the training process of SVM to reduce the training time. It was found that the proposed hybrid SVM-Modified EPSO (MEPSO)-Time Varying Acceleration Coefficient (TVAC) technique results in the highest correct identification percentage of faults in a power transformer compared to other PSO algorithms. Thus, the proposed technique can be one of the potential solutions to identify the transformer fault type based on DGA data on site. PMID:29370230
Illias, Hazlee Azil; Zhao Liang, Wee
2018-01-01
Early detection of power transformer fault is important because it can reduce the maintenance cost of the transformer and it can ensure continuous electricity supply in power systems. Dissolved Gas Analysis (DGA) technique is commonly used to identify oil-filled power transformer fault type but utilisation of artificial intelligence method with optimisation methods has shown convincing results. In this work, a hybrid support vector machine (SVM) with modified evolutionary particle swarm optimisation (EPSO) algorithm was proposed to determine the transformer fault type. The superiority of the modified PSO technique with SVM was evaluated by comparing the results with the actual fault diagnosis, unoptimised SVM and previous reported works. Data reduction was also applied using stepwise regression prior to the training process of SVM to reduce the training time. It was found that the proposed hybrid SVM-Modified EPSO (MEPSO)-Time Varying Acceleration Coefficient (TVAC) technique results in the highest correct identification percentage of faults in a power transformer compared to other PSO algorithms. Thus, the proposed technique can be one of the potential solutions to identify the transformer fault type based on DGA data on site.
NASA Astrophysics Data System (ADS)
Wu, Peilin; Zhang, Qunying; Fei, Chunjiao; Fang, Guangyou
2017-04-01
Aeromagnetic gradients are typically measured by optically pumped magnetometers mounted on an aircraft. Any aircraft, particularly helicopters, produces significant levels of magnetic interference. Therefore, aeromagnetic compensation is essential, and least square (LS) is the conventional method used for reducing interference levels. However, the LSs approach to solving the aeromagnetic interference model has a few difficulties, one of which is in handling multicollinearity. Therefore, we propose an aeromagnetic gradient compensation method, specifically targeted for helicopter use but applicable on any airborne platform, which is based on the ɛ-support vector regression algorithm. The structural risk minimization criterion intrinsic to the method avoids multicollinearity altogether. Local aeromagnetic anomalies can be retained, and platform-generated fields are suppressed simultaneously by constructing an appropriate loss function and kernel function. The method was tested using an unmanned helicopter and obtained improvement ratios of 12.7 and 3.5 in the vertical and horizontal gradient data, respectively. Both of these values are probably better than those that would have been obtained from the conventional method applied to the same data, had it been possible to do so in a suitable comparative context. The validity of the proposed method is demonstrated by the experimental result.
NASA Astrophysics Data System (ADS)
Li, S. X.; Zhang, Y. J.; Zeng, Q. Y.; Li, L. F.; Guo, Z. Y.; Liu, Z. M.; Xiong, H. L.; Liu, S. H.
2014-06-01
Cancer is the most common disease to threaten human health. The ability to screen individuals with malignant tumours with only a blood sample would be greatly advantageous to early diagnosis and intervention. This study explores the possibility of discriminating between cancer patients and normal subjects with serum surface-enhanced Raman spectroscopy (SERS) and a support vector machine (SVM) through a peripheral blood sample. A total of 130 blood samples were obtained from patients with liver cancer, colonic cancer, esophageal cancer, nasopharyngeal cancer, gastric cancer, as well as 113 blood samples from normal volunteers. Several diagnostic models were built with the serum SERS spectra using SVM and principal component analysis (PCA) techniques. The results show that a diagnostic accuracy of 85.5% is acquired with a PCA algorithm, while a diagnostic accuracy of 95.8% is obtained using radial basis function (RBF), PCA-SVM methods. The results prove that a RBF kernel PCA-SVM technique is superior to PCA and conventional SVM (C-SVM) algorithms in classification serum SERS spectra. The study demonstrates that serum SERS, in combination with SVM techniques, has great potential for screening cancerous patients with any solid malignant tumour through a peripheral blood sample.
Heist, E Kevin; Herre, John M; Binkley, Philip F; Van Bakel, Adrian B; Porterfield, James G; Porterfield, Linda M; Qu, Fujian; Turkel, Melanie; Pavri, Behzad B
2014-10-15
Detect Fluid Early from Intrathoracic Impedance Monitoring (DEFEAT-PE) is a prospective, multicenter study of multiple intrathoracic impedance vectors to detect pulmonary congestion (PC) events. Changes in intrathoracic impedance between the right ventricular (RV) coil and device can (RVcoil→Can) of implantable cardioverter-defibrillators (ICDs) and cardiac resynchronization therapy ICDs (CRT-Ds) are used clinically for the detection of PC events, but other impedance vectors and algorithms have not been studied prospectively. An initial 75-patient study was used to derive optimal impedance vectors to detect PC events, with 2 vector combinations selected for prospective analysis in DEFEAT-PE (ICD vectors: RVring→Can + RVcoil→Can, detection threshold 13 days; CRT-D vectors: left ventricular ring→Can + RVcoil→Can, detection threshold 14 days). Impedance changes were considered true positive if detected <30 days before an adjudicated PC event. One hundred sixty-two patients were enrolled (80 with ICDs and 82 with CRT-Ds), all with ≥1 previous PC event. One hundred forty-four patients provided study data, with 214 patient-years of follow-up and 139 PC events. Sensitivity for PC events of the prespecified algorithms was as follows: ICD: sensitivity 32.3%, false-positive rate 1.28 per patient-year; CRT-D: sensitivity 32.4%, false-positive rate 1.66 per patient-year. An alternative algorithm, ultimately approved by the US Food and Drug Administration (RVring→Can + RVcoil→Can, detection threshold 14 days), resulted in (for all patients) sensitivity of 21.6% and a false-positive rate of 0.9 per patient-year. The CRT-D thoracic impedance vector algorithm selected in the derivation study was not superior to the ICD algorithm RVring→Can + RVcoil→Can when studied prospectively. In conclusion, to achieve an acceptably low false-positive rate, the intrathoracic impedance algorithms studied in DEFEAT-PE resulted in low sensitivity for the prediction of heart failure events. Copyright © 2014 Elsevier Inc. All rights reserved.
Banno, Masaki; Komiyama, Yusuke; Cao, Wei; Oku, Yuya; Ueki, Kokoro; Sumikoshi, Kazuya; Nakamura, Shugo; Terada, Tohru; Shimizu, Kentaro
2017-02-01
Several methods have been proposed for protein-sugar binding site prediction using machine learning algorithms. However, they are not effective to learn various properties of binding site residues caused by various interactions between proteins and sugars. In this study, we classified sugars into acidic and nonacidic sugars and showed that their binding sites have different amino acid occurrence frequencies. By using this result, we developed sugar-binding residue predictors dedicated to the two classes of sugars: an acid sugar binding predictor and a nonacidic sugar binding predictor. We also developed a combination predictor which combines the results of the two predictors. We showed that when a sugar is known to be an acidic sugar, the acidic sugar binding predictor achieves the best performance, and showed that when a sugar is known to be a nonacidic sugar or is not known to be either of the two classes, the combination predictor achieves the best performance. Our method uses only amino acid sequences for prediction. Support vector machine was used as a machine learning algorithm and the position-specific scoring matrix created by the position-specific iterative basic local alignment search tool was used as the feature vector. We evaluated the performance of the predictors using five-fold cross-validation. We have launched our system, as an open source freeware tool on the GitHub repository (https://doi.org/10.5281/zenodo.61513). Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.
A sparse matrix algorithm on the Boolean vector machine
NASA Technical Reports Server (NTRS)
Wagner, Robert A.; Patrick, Merrell L.
1988-01-01
VLSI technology is being used to implement a prototype Boolean Vector Machine (BVM), which is a large network of very small processors with equally small memories that operate in SIMD mode; these use bit-serial arithmetic, and communicate via cube-connected cycles network. The BVM's bit-serial arithmetic and the small memories of individual processors are noted to compromise the system's effectiveness in large numerical problem applications. Attention is presently given to the implementation of a basic matrix-vector iteration algorithm for space matrices of the BVM, in order to generate over 1 billion useful floating-point operations/sec for this iteration algorithm. The algorithm is expressed in a novel language designated 'BVM'.
Accelerating Families of Fuzzy K-Means Algorithms for Vector Quantization Codebook Design
Mata, Edson; Bandeira, Silvio; de Mattos Neto, Paulo; Lopes, Waslon; Madeiro, Francisco
2016-01-01
The performance of signal processing systems based on vector quantization depends on codebook design. In the image compression scenario, the quality of the reconstructed images depends on the codebooks used. In this paper, alternatives are proposed for accelerating families of fuzzy K-means algorithms for codebook design. The acceleration is obtained by reducing the number of iterations of the algorithms and applying efficient nearest neighbor search techniques. Simulation results concerning image vector quantization have shown that the acceleration obtained so far does not decrease the quality of the reconstructed images. Codebook design time savings up to about 40% are obtained by the accelerated versions with respect to the original versions of the algorithms. PMID:27886061
Accelerating Families of Fuzzy K-Means Algorithms for Vector Quantization Codebook Design.
Mata, Edson; Bandeira, Silvio; de Mattos Neto, Paulo; Lopes, Waslon; Madeiro, Francisco
2016-11-23
The performance of signal processing systems based on vector quantization depends on codebook design. In the image compression scenario, the quality of the reconstructed images depends on the codebooks used. In this paper, alternatives are proposed for accelerating families of fuzzy K-means algorithms for codebook design. The acceleration is obtained by reducing the number of iterations of the algorithms and applying efficient nearest neighbor search techniques. Simulation results concerning image vector quantization have shown that the acceleration obtained so far does not decrease the quality of the reconstructed images. Codebook design time savings up to about 40% are obtained by the accelerated versions with respect to the original versions of the algorithms.
Algorithms for solving large sparse systems of simultaneous linear equations on vector processors
NASA Technical Reports Server (NTRS)
David, R. E.
1984-01-01
Very efficient algorithms for solving large sparse systems of simultaneous linear equations have been developed for serial processing computers. These involve a reordering of matrix rows and columns in order to obtain a near triangular pattern of nonzero elements. Then an LU factorization is developed to represent the matrix inverse in terms of a sequence of elementary Gaussian eliminations, or pivots. In this paper it is shown how these algorithms are adapted for efficient implementation on vector processors. Results obtained on the CYBER 200 Model 205 are presented for a series of large test problems which show the comparative advantages of the triangularization and vector processing algorithms.
Effective data compaction algorithm for vector scan EB writing system
NASA Astrophysics Data System (ADS)
Ueki, Shinichi; Ashida, Isao; Kawahira, Hiroichi
2001-01-01
We have developed a new mask data compaction algorithm dedicated to vector scan electron beam (EB) writing systems for 0.13 μm device generation. Large mask data size has become a significant problem at mask data processing for which data compaction is an important technique. In our new mask data compaction, 'array' representation and 'cell' representation are used. The mask data format for the EB writing system with vector scan supports these representations. The array representation has a pitch and a number of repetitions in both X and Y direction. The cell representation has a definition of figure group and its reference. The new data compaction method has the following three steps. (1) Search arrays of figures by selecting pitches of array so that a number of figures are included. (2) Find out same arrays that have same repetitive pitch and number of figures. (3) Search cells of figures, where the figures in each cell take identical positional relationship. By this new method for the mask data of a 4M-DRAM block gate layer with peripheral circuits, 202 Mbytes without compaction was highly compacted to 6.7 Mbytes in 20 minutes on a 500 MHz PC.
Naive Bayes Bearing Fault Diagnosis Based on Enhanced Independence of Data
Zhang, Nannan; Wu, Lifeng; Yang, Jing; Guan, Yong
2018-01-01
The bearing is the key component of rotating machinery, and its performance directly determines the reliability and safety of the system. Data-based bearing fault diagnosis has become a research hotspot. Naive Bayes (NB), which is based on independent presumption, is widely used in fault diagnosis. However, the bearing data are not completely independent, which reduces the performance of NB algorithms. In order to solve this problem, we propose a NB bearing fault diagnosis method based on enhanced independence of data. The method deals with data vector from two aspects: the attribute feature and the sample dimension. After processing, the classification limitation of NB is reduced by the independence hypothesis. First, we extract the statistical characteristics of the original signal of the bearings effectively. Then, the Decision Tree algorithm is used to select the important features of the time domain signal, and the low correlation features is selected. Next, the Selective Support Vector Machine (SSVM) is used to prune the dimension data and remove redundant vectors. Finally, we use NB to diagnose the fault with the low correlation data. The experimental results show that the independent enhancement of data is effective for bearing fault diagnosis. PMID:29401730
Prediction of Baseflow Index of Catchments using Machine Learning Algorithms
NASA Astrophysics Data System (ADS)
Yadav, B.; Hatfield, K.
2017-12-01
We present the results of eight machine learning techniques for predicting the baseflow index (BFI) of ungauged basins using a surrogate of catchment scale climate and physiographic data. The tested algorithms include ordinary least squares, ridge regression, least absolute shrinkage and selection operator (lasso), elasticnet, support vector machine, gradient boosted regression trees, random forests, and extremely randomized trees. Our work seeks to identify the dominant controls of BFI that can be readily obtained from ancillary geospatial databases and remote sensing measurements, such that the developed techniques can be extended to ungauged catchments. More than 800 gauged catchments spanning the continental United States were selected to develop the general methodology. The BFI calculation was based on the baseflow separated from daily streamflow hydrograph using HYSEP filter. The surrogate catchment attributes were compiled from multiple sources including digital elevation model, soil, landuse, climate data, other publicly available ancillary and geospatial data. 80% catchments were used to train the ML algorithms, and the remaining 20% of the catchments were used as an independent test set to measure the generalization performance of fitted models. A k-fold cross-validation using exhaustive grid search was used to fit the hyperparameters of each model. Initial model development was based on 19 independent variables, but after variable selection and feature ranking, we generated revised sparse models of BFI prediction that are based on only six catchment attributes. These key predictive variables selected after the careful evaluation of bias-variance tradeoff include average catchment elevation, slope, fraction of sand, permeability, temperature, and precipitation. The most promising algorithms exceeding an accuracy score (r-square) of 0.7 on test data include support vector machine, gradient boosted regression trees, random forests, and extremely randomized trees. Considering both the accuracy and the computational complexity of these algorithms, we identify the extremely randomized trees as the best performing algorithm for BFI prediction in ungauged basins.
Computational mechanics analysis tools for parallel-vector supercomputers
NASA Technical Reports Server (NTRS)
Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.; Qin, J.
1993-01-01
Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigen-solution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization algorithm and domain decomposition. The source code for many of these algorithms is available from NASA Langley.
Modeling node bandwidth limits and their effects on vector combining algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Littlefield, R.J.
Each node in a message-passing multicomputer typically has several communication links. However, the maximum aggregate communication speed of a node is often less than the sum of its individual link speeds. Such computers are called node bandwidth limited (NBL). The NBL constraint is important when choosing algorithms because it can change the relative performance of different algorithms that accomplish the same task. This paper introduces a model of communication performance for NBL computers and uses the model to analyze the overall performance of three algorithms for vector combining (global sum) on the Intel Touchstone DELTA computer. Each of the threemore » algorithms is found to be at least 33% faster than the other two for some combinations of machine size and vector length. The NBL constraint is shown to significantly affect the conditions under which each algorithm is fastest.« less
Fast Quaternion Attitude Estimation from Two Vector Measurements
NASA Technical Reports Server (NTRS)
Markley, F. Landis; Bauer, Frank H. (Technical Monitor)
2001-01-01
Many spacecraft attitude determination methods use exactly two vector measurements. The two vectors are typically the unit vector to the Sun and the Earth's magnetic field vector for coarse "sun-mag" attitude determination or unit vectors to two stars tracked by two star trackers for fine attitude determination. Existing closed-form attitude estimates based on Wahba's optimality criterion for two arbitrarily weighted observations are somewhat slow to evaluate. This paper presents two new fast quaternion attitude estimation algorithms using two vector observations, one optimal and one suboptimal. The suboptimal method gives the same estimate as the TRIAD algorithm, at reduced computational cost. Simulations show that the TRIAD estimate is almost as accurate as the optimal estimate in representative test scenarios.
Multidirectional Scanning Model, MUSCLE, to Vectorize Raster Images with Straight Lines
Karas, Ismail Rakip; Bayram, Bulent; Batuk, Fatmagul; Akay, Abdullah Emin; Baz, Ibrahim
2008-01-01
This paper presents a new model, MUSCLE (Multidirectional Scanning for Line Extraction), for automatic vectorization of raster images with straight lines. The algorithm of the model implements the line thinning and the simple neighborhood methods to perform vectorization. The model allows users to define specified criteria which are crucial for acquiring the vectorization process. In this model, various raster images can be vectorized such as township plans, maps, architectural drawings, and machine plans. The algorithm of the model was developed by implementing an appropriate computer programming and tested on a basic application. Results, verified by using two well known vectorization programs (WinTopo and Scan2CAD), indicated that the model can successfully vectorize the specified raster data quickly and accurately. PMID:27879843
Multidirectional Scanning Model, MUSCLE, to Vectorize Raster Images with Straight Lines.
Karas, Ismail Rakip; Bayram, Bulent; Batuk, Fatmagul; Akay, Abdullah Emin; Baz, Ibrahim
2008-04-15
This paper presents a new model, MUSCLE (Multidirectional Scanning for Line Extraction), for automatic vectorization of raster images with straight lines. The algorithm of the model implements the line thinning and the simple neighborhood methods to perform vectorization. The model allows users to define specified criteria which are crucial for acquiring the vectorization process. In this model, various raster images can be vectorized such as township plans, maps, architectural drawings, and machine plans. The algorithm of the model was developed by implementing an appropriate computer programming and tested on a basic application. Results, verified by using two well known vectorization programs (WinTopo and Scan2CAD), indicated that the model can successfully vectorize the specified raster data quickly and accurately.
Alves, Julio Cesar L; Henriques, Claudete B; Poppi, Ronei J
2014-01-03
The use of near infrared (NIR) spectroscopy combined with chemometric methods have been widely used in petroleum and petrochemical industry and provides suitable methods for process control and quality control. The algorithm support vector machines (SVM) has demonstrated to be a powerful chemometric tool for development of classification models due to its ability to nonlinear modeling and with high generalization capability and these characteristics can be especially important for treating near infrared (NIR) spectroscopy data of complex mixtures such as petroleum refinery streams. In this work, a study on the performance of the support vector machines algorithm for classification was carried out, using C-SVC and ν-SVC, applied to near infrared (NIR) spectroscopy data of different types of streams that make up the diesel pool in a petroleum refinery: light gas oil, heavy gas oil, hydrotreated diesel, kerosene, heavy naphtha and external diesel. In addition to these six streams, the diesel final blend produced in the refinery was added to complete the data set. C-SVC and ν-SVC classification models with 2, 4, 6 and 7 classes were developed for comparison between its results and also for comparison with the soft independent modeling of class analogy (SIMCA) models results. It is demonstrated the superior performance of SVC models especially using ν-SVC for development of classification models for 6 and 7 classes leading to an improvement of sensitivity on validation sample sets of 24% and 15%, respectively, when compared to SIMCA models, providing better identification of chemical compositions of different diesel pool refinery streams. Copyright © 2013 Elsevier B.V. All rights reserved.
Two novel motion-based algorithms for surveillance video analysis on embedded platforms
NASA Astrophysics Data System (ADS)
Vijverberg, Julien A.; Loomans, Marijn J. H.; Koeleman, Cornelis J.; de With, Peter H. N.
2010-05-01
This paper proposes two novel motion-vector based techniques for target detection and target tracking in surveillance videos. The algorithms are designed to operate on a resource-constrained device, such as a surveillance camera, and to reuse the motion vectors generated by the video encoder. The first novel algorithm for target detection uses motion vectors to construct a consistent motion mask, which is combined with a simple background segmentation technique to obtain a segmentation mask. The second proposed algorithm aims at multi-target tracking and uses motion vectors to assign blocks to targets employing five features. The weights of these features are adapted based on the interaction between targets. These algorithms are combined in one complete analysis application. The performance of this application for target detection has been evaluated for the i-LIDS sterile zone dataset and achieves an F1-score of 0.40-0.69. The performance of the analysis algorithm for multi-target tracking has been evaluated using the CAVIAR dataset and achieves an MOTP of around 9.7 and MOTA of 0.17-0.25. On a selection of targets in videos from other datasets, the achieved MOTP and MOTA are 8.8-10.5 and 0.32-0.49 respectively. The execution time on a PC-based platform is 36 ms. This includes the 20 ms for generating motion vectors, which are also required by the video encoder.
NASA Astrophysics Data System (ADS)
Heidari, Morteza; Zargari Khuzani, Abolfazl; Danala, Gopichandh; Mirniaharikandehei, Seyedehnafiseh; Qian, Wei; Zheng, Bin
2018-03-01
Both conventional and deep machine learning has been used to develop decision-support tools applied in medical imaging informatics. In order to take advantages of both conventional and deep learning approach, this study aims to investigate feasibility of applying a locally preserving projection (LPP) based feature regeneration algorithm to build a new machine learning classifier model to predict short-term breast cancer risk. First, a computer-aided image processing scheme was used to segment and quantify breast fibro-glandular tissue volume. Next, initially computed 44 image features related to the bilateral mammographic tissue density asymmetry were extracted. Then, an LLP-based feature combination method was applied to regenerate a new operational feature vector using a maximal variance approach. Last, a k-nearest neighborhood (KNN) algorithm based machine learning classifier using the LPP-generated new feature vectors was developed to predict breast cancer risk. A testing dataset involving negative mammograms acquired from 500 women was used. Among them, 250 were positive and 250 remained negative in the next subsequent mammography screening. Applying to this dataset, LLP-generated feature vector reduced the number of features from 44 to 4. Using a leave-onecase-out validation method, area under ROC curve produced by the KNN classifier significantly increased from 0.62 to 0.68 (p < 0.05) and odds ratio was 4.60 with a 95% confidence interval of [3.16, 6.70]. Study demonstrated that this new LPP-based feature regeneration approach enabled to produce an optimal feature vector and yield improved performance in assisting to predict risk of women having breast cancer detected in the next subsequent mammography screening.
Monthly prediction of air temperature in Australia and New Zealand with machine learning algorithms
NASA Astrophysics Data System (ADS)
Salcedo-Sanz, S.; Deo, R. C.; Carro-Calvo, L.; Saavedra-Moreno, B.
2016-07-01
Long-term air temperature prediction is of major importance in a large number of applications, including climate-related studies, energy, agricultural, or medical. This paper examines the performance of two Machine Learning algorithms (Support Vector Regression (SVR) and Multi-layer Perceptron (MLP)) in a problem of monthly mean air temperature prediction, from the previous measured values in observational stations of Australia and New Zealand, and climate indices of importance in the region. The performance of the two considered algorithms is discussed in the paper and compared to alternative approaches. The results indicate that the SVR algorithm is able to obtain the best prediction performance among all the algorithms compared in the paper. Moreover, the results obtained have shown that the mean absolute error made by the two algorithms considered is significantly larger for the last 20 years than in the previous decades, in what can be interpreted as a change in the relationship among the prediction variables involved in the training of the algorithms.
Lin, Kuan-Cheng; Hsieh, Yi-Hsiu
2015-10-01
The classification and analysis of data is an important issue in today's research. Selecting a suitable set of features makes it possible to classify an enormous quantity of data quickly and efficiently. Feature selection is generally viewed as a problem of feature subset selection, such as combination optimization problems. Evolutionary algorithms using random search methods have proven highly effective in obtaining solutions to problems of optimization in a diversity of applications. In this study, we developed a hybrid evolutionary algorithm based on endocrine-based particle swarm optimization (EPSO) and artificial bee colony (ABC) algorithms in conjunction with a support vector machine (SVM) for the selection of optimal feature subsets for the classification of datasets. The results of experiments using specific UCI medical datasets demonstrate that the accuracy of the proposed hybrid evolutionary algorithm is superior to that of basic PSO, EPSO and ABC algorithms, with regard to classification accuracy using subsets with a reduced number of features.
Classifying epileptic EEG signals with delay permutation entropy and Multi-Scale K-means.
Zhu, Guohun; Li, Yan; Wen, Peng Paul; Wang, Shuaifang
2015-01-01
Most epileptic EEG classification algorithms are supervised and require large training datasets, that hinder their use in real time applications. This chapter proposes an unsupervised Multi-Scale K-means (MSK-means) MSK-means algorithm to distinguish epileptic EEG signals and identify epileptic zones. The random initialization of the K-means algorithm can lead to wrong clusters. Based on the characteristics of EEGs, the MSK-means MSK-means algorithm initializes the coarse-scale centroid of a cluster with a suitable scale factor. In this chapter, the MSK-means algorithm is proved theoretically superior to the K-means algorithm on efficiency. In addition, three classifiers: the K-means, MSK-means MSK-means and support vector machine (SVM), are used to identify seizure and localize epileptogenic zone using delay permutation entropy features. The experimental results demonstrate that identifying seizure with the MSK-means algorithm and delay permutation entropy achieves 4. 7 % higher accuracy than that of K-means, and 0. 7 % higher accuracy than that of the SVM.
Metaheuristic Optimization and its Applications in Earth Sciences
NASA Astrophysics Data System (ADS)
Yang, Xin-She
2010-05-01
A common but challenging task in modelling geophysical and geological processes is to handle massive data and to minimize certain objectives. This can essentially be considered as an optimization problem, and thus many new efficient metaheuristic optimization algorithms can be used. In this paper, we will introduce some modern metaheuristic optimization algorithms such as genetic algorithms, harmony search, firefly algorithm, particle swarm optimization and simulated annealing. We will also discuss how these algorithms can be applied to various applications in earth sciences, including nonlinear least-squares, support vector machine, Kriging, inverse finite element analysis, and data-mining. We will present a few examples to show how different problems can be reformulated as optimization. Finally, we will make some recommendations for choosing various algorithms to suit various problems. References 1) D. H. Wolpert and W. G. Macready, No free lunch theorems for optimization, IEEE Trans. Evolutionary Computation, Vol. 1, 67-82 (1997). 2) X. S. Yang, Nature-Inspired Metaheuristic Algorithms, Luniver Press, (2008). 3) X. S. Yang, Mathematical Modelling for Earth Sciences, Dunedin Academic Press, (2008).
Quick fuzzy backpropagation algorithm.
Nikov, A; Stoeva, S
2001-03-01
A modification of the fuzzy backpropagation (FBP) algorithm called QuickFBP algorithm is proposed, where the computation of the net function is significantly quicker. It is proved that the FBP algorithm is of exponential time complexity, while the QuickFBP algorithm is of polynomial time complexity. Convergence conditions of the QuickFBP, resp. the FBP algorithm are defined and proved for: (1) single output neural networks in case of training patterns with different targets; and (2) multiple output neural networks in case of training patterns with equivalued target vector. They support the automation of the weights training process (quasi-unsupervised learning) establishing the target value(s) depending on the network's input values. In these cases the simulation results confirm the convergence of both algorithms. An example with a large-sized neural network illustrates the significantly greater training speed of the QuickFBP rather than the FBP algorithm. The adaptation of an interactive web system to users on the basis of the QuickFBP algorithm is presented. Since the QuickFBP algorithm ensures quasi-unsupervised learning, this implies its broad applicability in areas of adaptive and adaptable interactive systems, data mining, etc. applications.
Human action recognition with group lasso regularized-support vector machine
NASA Astrophysics Data System (ADS)
Luo, Huiwu; Lu, Huanzhang; Wu, Yabei; Zhao, Fei
2016-05-01
The bag-of-visual-words (BOVW) and Fisher kernel are two popular models in human action recognition, and support vector machine (SVM) is the most commonly used classifier for the two models. We show two kinds of group structures in the feature representation constructed by BOVW and Fisher kernel, respectively, since the structural information of feature representation can be seen as a prior for the classifier and can improve the performance of the classifier, which has been verified in several areas. However, the standard SVM employs L2-norm regularization in its learning procedure, which penalizes each variable individually and cannot express the structural information of feature representation. We replace the L2-norm regularization with group lasso regularization in standard SVM, and a group lasso regularized-support vector machine (GLRSVM) is proposed. Then, we embed the group structural information of feature representation into GLRSVM. Finally, we introduce an algorithm to solve the optimization problem of GLRSVM by alternating directions method of multipliers. The experiments evaluated on KTH, YouTube, and Hollywood2 datasets show that our method achieves promising results and improves the state-of-the-art methods on KTH and YouTube datasets.
Li, Wutao; Huang, Zhigang; Lang, Rongling; Qin, Honglei; Zhou, Kai; Cao, Yongbin
2016-03-04
Interferences can severely degrade the performance of Global Navigation Satellite System (GNSS) receivers. As the first step of GNSS any anti-interference measures, interference monitoring for GNSS is extremely essential and necessary. Since interference monitoring can be considered as a classification problem, a real-time interference monitoring technique based on Twin Support Vector Machine (TWSVM) is proposed in this paper. A TWSVM model is established, and TWSVM is solved by the Least Squares Twin Support Vector Machine (LSTWSVM) algorithm. The interference monitoring indicators are analyzed to extract features from the interfered GNSS signals. The experimental results show that the chosen observations can be used as the interference monitoring indicators. The interference monitoring performance of the proposed method is verified by using GPS L1 C/A code signal and being compared with that of standard SVM. The experimental results indicate that the TWSVM-based interference monitoring is much faster than the conventional SVM. Furthermore, the training time of TWSVM is on millisecond (ms) level and the monitoring time is on microsecond (μs) level, which make the proposed approach usable in practical interference monitoring applications.
Recursive feature selection with significant variables of support vectors.
Tsai, Chen-An; Huang, Chien-Hsun; Chang, Ching-Wei; Chen, Chun-Houh
2012-01-01
The development of DNA microarray makes researchers screen thousands of genes simultaneously and it also helps determine high- and low-expression level genes in normal and disease tissues. Selecting relevant genes for cancer classification is an important issue. Most of the gene selection methods use univariate ranking criteria and arbitrarily choose a threshold to choose genes. However, the parameter setting may not be compatible to the selected classification algorithms. In this paper, we propose a new gene selection method (SVM-t) based on the use of t-statistics embedded in support vector machine. We compared the performance to two similar SVM-based methods: SVM recursive feature elimination (SVMRFE) and recursive support vector machine (RSVM). The three methods were compared based on extensive simulation experiments and analyses of two published microarray datasets. In the simulation experiments, we found that the proposed method is more robust in selecting informative genes than SVMRFE and RSVM and capable to attain good classification performance when the variations of informative and noninformative genes are different. In the analysis of two microarray datasets, the proposed method yields better performance in identifying fewer genes with good prediction accuracy, compared to SVMRFE and RSVM.
A Real-Time Interference Monitoring Technique for GNSS Based on a Twin Support Vector Machine Method
Li, Wutao; Huang, Zhigang; Lang, Rongling; Qin, Honglei; Zhou, Kai; Cao, Yongbin
2016-01-01
Interferences can severely degrade the performance of Global Navigation Satellite System (GNSS) receivers. As the first step of GNSS any anti-interference measures, interference monitoring for GNSS is extremely essential and necessary. Since interference monitoring can be considered as a classification problem, a real-time interference monitoring technique based on Twin Support Vector Machine (TWSVM) is proposed in this paper. A TWSVM model is established, and TWSVM is solved by the Least Squares Twin Support Vector Machine (LSTWSVM) algorithm. The interference monitoring indicators are analyzed to extract features from the interfered GNSS signals. The experimental results show that the chosen observations can be used as the interference monitoring indicators. The interference monitoring performance of the proposed method is verified by using GPS L1 C/A code signal and being compared with that of standard SVM. The experimental results indicate that the TWSVM-based interference monitoring is much faster than the conventional SVM. Furthermore, the training time of TWSVM is on millisecond (ms) level and the monitoring time is on microsecond (μs) level, which make the proposed approach usable in practical interference monitoring applications. PMID:26959020
Jaya, T; Dheeba, J; Singh, N Albert
2015-12-01
Diabetic retinopathy is a major cause of vision loss in diabetic patients. Currently, there is a need for making decisions using intelligent computer algorithms when screening a large volume of data. This paper presents an expert decision-making system designed using a fuzzy support vector machine (FSVM) classifier to detect hard exudates in fundus images. The optic discs in the colour fundus images are segmented to avoid false alarms using morphological operations and based on circular Hough transform. To discriminate between the exudates and the non-exudates pixels, colour and texture features are extracted from the images. These features are given as input to the FSVM classifier. The classifier analysed 200 retinal images collected from diabetic retinopathy screening programmes. The tests made on the retinal images show that the proposed detection system has better discriminating power than the conventional support vector machine. With the best combination of FSVM and features sets, the area under the receiver operating characteristic curve reached 0.9606, which corresponds to a sensitivity of 94.1% with a specificity of 90.0%. The results suggest that detecting hard exudates using FSVM contribute to computer-assisted detection of diabetic retinopathy and as a decision support system for ophthalmologists.
An improved conjugate gradient scheme to the solution of least squares SVM.
Chu, Wei; Ong, Chong Jin; Keerthi, S Sathiya
2005-03-01
The least square support vector machines (LS-SVM) formulation corresponds to the solution of a linear system of equations. Several approaches to its numerical solutions have been proposed in the literature. In this letter, we propose an improved method to the numerical solution of LS-SVM and show that the problem can be solved using one reduced system of linear equations. Compared with the existing algorithm for LS-SVM, the approach used in this letter is about twice as efficient. Numerical results using the proposed method are provided for comparisons with other existing algorithms.
Is it worth changing pattern recognition methods for structural health monitoring?
NASA Astrophysics Data System (ADS)
Bull, L. A.; Worden, K.; Cross, E. J.; Dervilis, N.
2017-05-01
The key element of this work is to demonstrate alternative strategies for using pattern recognition algorithms whilst investigating structural health monitoring. This paper looks to determine if it makes any difference in choosing from a range of established classification techniques: from decision trees and support vector machines, to Gaussian processes. Classification algorithms are tested on adjustable synthetic data to establish performance metrics, then all techniques are applied to real SHM data. To aid the selection of training data, an informative chain of artificial intelligence tools is used to explore an active learning interaction between meaningful clusters of data.
Security authentication using phase-encoded nanoparticle structures and polarized light.
Carnicer, Artur; Hassanfiroozi, Amir; Latorre-Carmona, Pedro; Huang, Yi-Pai; Javidi, Bahram
2015-01-15
Phase-encoded nanostructures such as quick response (QR) codes made of metallic nanoparticles are suggested to be used in security and authentication applications. We present a polarimetric optical method able to authenticate random phase-encoded QR codes. The system is illuminated using polarized light, and the QR code is encoded using a phase-only random mask. Using classification algorithms, it is possible to validate the QR code from the examination of the polarimetric signature of the speckle pattern. We used Kolmogorov-Smirnov statistical test and Support Vector Machine algorithms to authenticate the phase-encoded QR codes using polarimetric signatures.
NASA Astrophysics Data System (ADS)
Liu, Jianjun; Kan, Jianquan
2018-04-01
In this paper, based on the terahertz spectrum, a new identification method of genetically modified material by support vector machine (SVM) based on affinity propagation clustering is proposed. This algorithm mainly uses affinity propagation clustering algorithm to make cluster analysis and labeling on unlabeled training samples, and in the iterative process, the existing SVM training data are continuously updated, when establishing the identification model, it does not need to manually label the training samples, thus, the error caused by the human labeled samples is reduced, and the identification accuracy of the model is greatly improved.
An improved CS-LSSVM algorithm-based fault pattern recognition of ship power equipments.
Yang, Yifei; Tan, Minjia; Dai, Yuewei
2017-01-01
A ship power equipments' fault monitoring signal usually provides few samples and the data's feature is non-linear in practical situation. This paper adopts the method of the least squares support vector machine (LSSVM) to deal with the problem of fault pattern identification in the case of small sample data. Meanwhile, in order to avoid involving a local extremum and poor convergence precision which are induced by optimizing the kernel function parameter and penalty factor of LSSVM, an improved Cuckoo Search (CS) algorithm is proposed for the purpose of parameter optimization. Based on the dynamic adaptive strategy, the newly proposed algorithm improves the recognition probability and the searching step length, which can effectively solve the problems of slow searching speed and low calculation accuracy of the CS algorithm. A benchmark example demonstrates that the CS-LSSVM algorithm can accurately and effectively identify the fault pattern types of ship power equipments.
Pairwise Sequence Alignment Library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeff Daily, PNNL
2015-05-20
Vector extensions, such as SSE, have been part of the x86 CPU since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. The trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. Therefore, amore » novel SIMD implementation of a parallel scan-based sequence alignment algorithm that can better exploit wider SIMD units was implemented as part of the Parallel Sequence Alignment Library (parasail). Parasail features: Reference implementations of all known vectorized sequence alignment approaches. Implementations of Smith Waterman (SW), semi-global (SG), and Needleman Wunsch (NW) sequence alignment algorithms. Implementations across all modern CPU instruction sets including AVX2 and KNC. Language interfaces for C/C++ and Python.« less
Tensor Fukunaga-Koontz transform for small target detection in infrared images
NASA Astrophysics Data System (ADS)
Liu, Ruiming; Wang, Jingzhuo; Yang, Huizhen; Gong, Chenglong; Zhou, Yuanshen; Liu, Lipeng; Zhang, Zhen; Shen, Shuli
2016-09-01
Infrared small targets detection plays a crucial role in warning and tracking systems. Some novel methods based on pattern recognition technology catch much attention from researchers. However, those classic methods must reshape images into vectors with the high dimensionality. Moreover, vectorizing breaks the natural structure and correlations in the image data. Image representation based on tensor treats images as matrices and can hold the natural structure and correlation information. So tensor algorithms have better classification performance than vector algorithms. Fukunaga-Koontz transform is one of classification algorithms and it is a vector version method with the disadvantage of all vector algorithms. In this paper, we first extended the Fukunaga-Koontz transform into its tensor version, tensor Fukunaga-Koontz transform. Then we designed a method based on tensor Fukunaga-Koontz transform for detecting targets and used it to detect small targets in infrared images. The experimental results, comparison through signal-to-clutter, signal-to-clutter gain and background suppression factor, have validated the advantage of the target detection based on the tensor Fukunaga-Koontz transform over that based on the Fukunaga-Koontz transform.
Generalized sidelobe canceller beamforming method for ultrasound imaging.
Wang, Ping; Li, Na; Luo, Han-Wu; Zhu, Yong-Kun; Cui, Shi-Gang
2017-03-01
A modified generalized sidelobe canceller (IGSC) algorithm is proposed to enhance the resolution and robustness against the noise of the traditional generalized sidelobe canceller (GSC) and coherence factor combined method (GSC-CF). In the GSC algorithm, weighting vector is divided into adaptive and non-adaptive parts, while the non-adaptive part does not block all the desired signal. A modified steer vector of the IGSC algorithm is generated by the projection of the non-adaptive vector on the signal space constructed by the covariance matrix of received data. The blocking matrix is generated based on the orthogonal complementary space of the modified steer vector and the weighting vector is updated subsequently. The performance of IGSC was investigated by simulations and experiments. Through simulations, IGSC outperformed GSC-CF in terms of spatial resolution by 0.1 mm regardless there is noise or not, as well as the contrast ratio respect. The proposed IGSC can be further improved by combining with CF. The experimental results also validated the effectiveness of the proposed algorithm with dataset provided by the University of Michigan.
An efficient and portable SIMD algorithm for charge/current deposition in Particle-In-Cell codes
Vincenti, H.; Lobet, M.; Lehe, R.; ...
2016-09-19
In current computer architectures, data movement (from die to network) is by far the most energy consuming part of an algorithm (≈20pJ/word on-die to ≈10,000 pJ/word on the network). To increase memory locality at the hardware level and reduce energy consumption related to data movement, future exascale computers tend to use many-core processors on each compute nodes that will have a reduced clock speed to allow for efficient cooling. To compensate for frequency decrease, machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD registermore » length is expected to double every four years. As a consequence, Particle-In-Cell (PIC) codes will have to achieve good vectorization to fully take advantage of these upcoming architectures. In this paper, we present a new algorithm that allows for efficient and portable SIMD vectorization of current/charge deposition routines that are, along with the field gathering routines, among the most time consuming parts of the PIC algorithm. Our new algorithm uses a particular data structure that takes into account memory alignment constraints and avoids gather/scat;ter instructions that can significantly affect vectorization performances on current CPUs. The new algorithm was successfully implemented in the 3D skeleton PIC code PICSAR and tested on Haswell Xeon processors (AVX2-256 bits wide data registers). Results show a factor of ×2 to ×2.5 speed-up in double precision for particle shape factor of orders 1–3. The new algorithm can be applied as is on future KNL (Knights Landing) architectures that will include AVX-512 instruction sets with 512 bits register lengths (8 doubles/16 singles). Program summary Program Title: vec_deposition Program Files doi:http://dx.doi.org/10.17632/nh77fv9k8c.1 Licensing provisions: BSD 3-Clause Programming language: Fortran 90 External routines/libraries: OpenMP > 4.0 Nature of problem: Exascale architectures will have many-core processors per node with long vector data registers capable of performing one single instruction on multiple data during one clock cycle. Data register lengths are expected to double every four years and this pushes for new portable solutions for efficiently vectorizing Particle-In-Cell codes on these future many-core architectures. One of the main hotspot routines of the PIC algorithm is the current/charge deposition for which there is no efficient and portable vector algorithm. Solution method: Here we provide an efficient and portable vector algorithm of current/charge deposition routines that uses a new data structure, which significantly reduces gather/scatter operations. Vectorization is controlled using OpenMP 4.0 compiler directives for vectorization which ensures portability across different architectures. Restrictions: Here we do not provide the full PIC algorithm with an executable but only vector routines for current/charge deposition. These scalar/vector routines can be used as library routines in your 3D Particle-In-Cell code. However, to get the best performances out of vector routines you have to satisfy the two following requirements: (1) Your code should implement particle tiling (as explained in the manuscript) to allow for maximized cache reuse and reduce memory accesses that can hinder vector performances. The routines can be used directly on each particle tile. (2) You should compile your code with a Fortran 90 compiler (e.g Intel, gnu or cray) and provide proper alignment flags and compiler alignment directives (more details in README file).« less
An efficient and portable SIMD algorithm for charge/current deposition in Particle-In-Cell codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vincenti, H.; Lobet, M.; Lehe, R.
In current computer architectures, data movement (from die to network) is by far the most energy consuming part of an algorithm (≈20pJ/word on-die to ≈10,000 pJ/word on the network). To increase memory locality at the hardware level and reduce energy consumption related to data movement, future exascale computers tend to use many-core processors on each compute nodes that will have a reduced clock speed to allow for efficient cooling. To compensate for frequency decrease, machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD registermore » length is expected to double every four years. As a consequence, Particle-In-Cell (PIC) codes will have to achieve good vectorization to fully take advantage of these upcoming architectures. In this paper, we present a new algorithm that allows for efficient and portable SIMD vectorization of current/charge deposition routines that are, along with the field gathering routines, among the most time consuming parts of the PIC algorithm. Our new algorithm uses a particular data structure that takes into account memory alignment constraints and avoids gather/scat;ter instructions that can significantly affect vectorization performances on current CPUs. The new algorithm was successfully implemented in the 3D skeleton PIC code PICSAR and tested on Haswell Xeon processors (AVX2-256 bits wide data registers). Results show a factor of ×2 to ×2.5 speed-up in double precision for particle shape factor of orders 1–3. The new algorithm can be applied as is on future KNL (Knights Landing) architectures that will include AVX-512 instruction sets with 512 bits register lengths (8 doubles/16 singles). Program summary Program Title: vec_deposition Program Files doi:http://dx.doi.org/10.17632/nh77fv9k8c.1 Licensing provisions: BSD 3-Clause Programming language: Fortran 90 External routines/libraries: OpenMP > 4.0 Nature of problem: Exascale architectures will have many-core processors per node with long vector data registers capable of performing one single instruction on multiple data during one clock cycle. Data register lengths are expected to double every four years and this pushes for new portable solutions for efficiently vectorizing Particle-In-Cell codes on these future many-core architectures. One of the main hotspot routines of the PIC algorithm is the current/charge deposition for which there is no efficient and portable vector algorithm. Solution method: Here we provide an efficient and portable vector algorithm of current/charge deposition routines that uses a new data structure, which significantly reduces gather/scatter operations. Vectorization is controlled using OpenMP 4.0 compiler directives for vectorization which ensures portability across different architectures. Restrictions: Here we do not provide the full PIC algorithm with an executable but only vector routines for current/charge deposition. These scalar/vector routines can be used as library routines in your 3D Particle-In-Cell code. However, to get the best performances out of vector routines you have to satisfy the two following requirements: (1) Your code should implement particle tiling (as explained in the manuscript) to allow for maximized cache reuse and reduce memory accesses that can hinder vector performances. The routines can be used directly on each particle tile. (2) You should compile your code with a Fortran 90 compiler (e.g Intel, gnu or cray) and provide proper alignment flags and compiler alignment directives (more details in README file).« less
Autofocus algorithm using one-dimensional Fourier transform and Pearson correlation
NASA Astrophysics Data System (ADS)
Bueno Mario, A.; Alvarez-Borrego, Josue; Acho, L.
2004-10-01
A new autofocus algorithm based on one-dimensional Fourier transform and Pearson correlation for Z automatized microscope is proposed. Our goal is to determine in fast response time and accuracy, the best focused plane through an algorithm. We capture in bright and dark field several images set at different Z distances from biological organism sample. The algorithm uses the one-dimensional Fourier transform to obtain the image frequency content of a vectors pattern previously defined comparing the Pearson correlation of these frequency vectors versus the reference image frequency vector, the most out of focus image, we find the best focusing. Experimental results showed the algorithm has fast response time and accuracy in getting the best focus plane from captured images. In conclusions, the algorithm can be implemented in real time systems due fast response time, accuracy and robustness. The algorithm can be used to get focused images in bright and dark field and it can be extended to include fusion techniques to construct multifocus final images beyond of this paper.
Multi-robot task allocation based on two dimensional artificial fish swarm algorithm
NASA Astrophysics Data System (ADS)
Zheng, Taixiong; Li, Xueqin; Yang, Liangyi
2007-12-01
The problem of task allocation for multiple robots is to allocate more relative-tasks to less relative-robots so as to minimize the processing time of these tasks. In order to get optimal multi-robot task allocation scheme, a twodimensional artificial swarm algorithm based approach is proposed in this paper. In this approach, the normal artificial fish is extended to be two dimension artificial fish. In the two dimension artificial fish, each vector of primary artificial fish is extended to be an m-dimensional vector. Thus, each vector can express a group of tasks. By redefining the distance between artificial fish and the center of artificial fish, the behavior of two dimension fish is designed and the task allocation algorithm based on two dimension artificial swarm algorithm is put forward. At last, the proposed algorithm is applied to the problem of multi-robot task allocation and comparer with GA and SA based algorithm is done. Simulation and compare result shows the proposed algorithm is effective.
Joint Smoothed l₀-Norm DOA Estimation Algorithm for Multiple Measurement Vectors in MIMO Radar.
Liu, Jing; Zhou, Weidong; Juwono, Filbert H
2017-05-08
Direction-of-arrival (DOA) estimation is usually confronted with a multiple measurement vector (MMV) case. In this paper, a novel fast sparse DOA estimation algorithm, named the joint smoothed l 0 -norm algorithm, is proposed for multiple measurement vectors in multiple-input multiple-output (MIMO) radar. To eliminate the white or colored Gaussian noises, the new method first obtains a low-complexity high-order cumulants based data matrix. Then, the proposed algorithm designs a joint smoothed function tailored for the MMV case, based on which joint smoothed l 0 -norm sparse representation framework is constructed. Finally, for the MMV-based joint smoothed function, the corresponding gradient-based sparse signal reconstruction is designed, thus the DOA estimation can be achieved. The proposed method is a fast sparse representation algorithm, which can solve the MMV problem and perform well for both white and colored Gaussian noises. The proposed joint algorithm is about two orders of magnitude faster than the l 1 -norm minimization based methods, such as l 1 -SVD (singular value decomposition), RV (real-valued) l 1 -SVD and RV l 1 -SRACV (sparse representation array covariance vectors), and achieves better DOA estimation performance.
Classification of ECG signal with Support Vector Machine Method for Arrhythmia Detection
NASA Astrophysics Data System (ADS)
Turnip, Arjon; Ilham Rizqywan, M.; Kusumandari, Dwi E.; Turnip, Mardi; Sihombing, Poltak
2018-03-01
An electrocardiogram is a potential bioelectric record that occurs as a result of cardiac activity. QRS Detection with zero crossing calculation is one method that can precisely determine peak R of QRS wave as part of arrhythmia detection. In this paper, two experimental scheme (2 minutes duration with different activities: relaxed and, typing) were conducted. From the two experiments it were obtained: accuracy, sensitivity, and positive predictivity about 100% each for the first experiment and about 79%, 93%, 83% for the second experiment, respectively. Furthermore, the feature set of MIT-BIH arrhythmia using the support vector machine (SVM) method on the WEKA software is evaluated. By combining the available attributes on the WEKA algorithm, the result is constant since all classes of SVM goes to the normal class with average 88.49% accuracy.
NASA Astrophysics Data System (ADS)
Wei, ZHANG; Tongyu, WU; Bowen, ZHENG; Shiping, LI; Yipo, ZHANG; Zejie, YIN
2018-04-01
A new neutron-gamma discriminator based on the support vector machine (SVM) method is proposed to improve the performance of the time-of-flight neutron spectrometer. The neutron detector is an EJ-299-33 plastic scintillator with pulse-shape discrimination (PSD) property. The SVM algorithm is implemented in field programmable gate array (FPGA) to carry out the real-time sifting of neutrons in neutron-gamma mixed radiation fields. This study compares the ability of the pulse gradient analysis method and the SVM method. The results show that this SVM discriminator can provide a better discrimination accuracy of 99.1%. The accuracy and performance of the SVM discriminator based on FPGA have been evaluated in the experiments. It can get a figure of merit of 1.30.
A Support Vector Machine-Based Gender Identification Using Speech Signal
NASA Astrophysics Data System (ADS)
Lee, Kye-Hwan; Kang, Sang-Ick; Kim, Deok-Hwan; Chang, Joon-Hyuk
We propose an effective voice-based gender identification method using a support vector machine (SVM). The SVM is a binary classification algorithm that classifies two groups by finding the voluntary nonlinear boundary in a feature space and is known to yield high classification performance. In the present work, we compare the identification performance of the SVM with that of a Gaussian mixture model (GMM)-based method using the mel frequency cepstral coefficients (MFCC). A novel approach of incorporating a features fusion scheme based on a combination of the MFCC and the fundamental frequency is proposed with the aim of improving the performance of gender identification. Experimental results demonstrate that the gender identification performance using the SVM is significantly better than that of the GMM-based scheme. Moreover, the performance is substantially improved when the proposed features fusion technique is applied.
NASA Astrophysics Data System (ADS)
Hao, Xuejun; An, Xaioran; Wu, Bo; He, Shaoping
2018-02-01
In the gas pipeline system, safe operation of a gas regulator determines the stability of the fuel gas supply, and the medium-low pressure gas regulator of the safety precaution system is not perfect at the present stage in the Beijing Gas Group; therefore, safety precaution technique optimization has important social and economic significance. In this paper, according to the running status of the medium-low pressure gas regulator in the SCADA system, a new method for gas regulator safety precaution based on the support vector machine (SVM) is presented. This method takes the gas regulator outlet pressure data as input variables of the SVM model, the fault categories and degree as output variables, which will effectively enhance the precaution accuracy as well as save significant manpower and material resources.
Support Vector Machines for Hyperspectral Remote Sensing Classification
NASA Technical Reports Server (NTRS)
Gualtieri, J. Anthony; Cromp, R. F.
1998-01-01
The Support Vector Machine provides a new way to design classification algorithms which learn from examples (supervised learning) and generalize when applied to new data. We demonstrate its success on a difficult classification problem from hyperspectral remote sensing, where we obtain performances of 96%, and 87% correct for a 4 class problem, and a 16 class problem respectively. These results are somewhat better than other recent results on the same data. A key feature of this classifier is its ability to use high-dimensional data without the usual recourse to a feature selection step to reduce the dimensionality of the data. For this application, this is important, as hyperspectral data consists of several hundred contiguous spectral channels for each exemplar. We provide an introduction to this new approach, and demonstrate its application to classification of an agriculture scene.
Supercomputer implementation of finite element algorithms for high speed compressible flows
NASA Technical Reports Server (NTRS)
Thornton, E. A.; Ramakrishnan, R.
1986-01-01
Prediction of compressible flow phenomena using the finite element method is of recent origin and considerable interest. Two shock capturing finite element formulations for high speed compressible flows are described. A Taylor-Galerkin formulation uses a Taylor series expansion in time coupled with a Galerkin weighted residual statement. The Taylor-Galerkin algorithms use explicit artificial dissipation, and the performance of three dissipation models are compared. A Petrov-Galerkin algorithm has as its basis the concepts of streamline upwinding. Vectorization strategies are developed to implement the finite element formulations on the NASA Langley VPS-32. The vectorization scheme results in finite element programs that use vectors of length of the order of the number of nodes or elements. The use of the vectorization procedure speeds up processing rates by over two orders of magnitude. The Taylor-Galerkin and Petrov-Galerkin algorithms are evaluated for 2D inviscid flows on criteria such as solution accuracy, shock resolution, computational speed and storage requirements. The convergence rates for both algorithms are enhanced by local time-stepping schemes. Extension of the vectorization procedure for predicting 2D viscous and 3D inviscid flows are demonstrated. Conclusions are drawn regarding the applicability of the finite element procedures for realistic problems that require hundreds of thousands of nodes.
Optimal integer resolution for attitude determination using global positioning system signals
NASA Technical Reports Server (NTRS)
Crassidis, John L.; Markley, F. Landis; Lightsey, E. Glenn
1998-01-01
In this paper, a new motion-based algorithm for GPS integer ambiguity resolution is derived. The first step of this algorithm converts the reference sightline vectors into body frame vectors. This is accomplished by an optimal vectorized transformation of the phase difference measurements. The result of this transformation leads to the conversion of the integer ambiguities to vectorized biases. This essentially converts the problem to the familiar magnetometer-bias determination problem, for which an optimal and efficient solution exists. Also, the formulation in this paper is re-derived to provide a sequential estimate, so that a suitable stopping condition can be found during the vehicle motion. The advantages of the new algorithm include: it does not require an a-priori estimate of the vehicle's attitude; it provides an inherent integrity check using a covariance-type expression; and it can sequentially estimate the ambiguities during the vehicle motion. The only disadvantage of the new algorithm is that it requires at least three non-coplanar baselines. The performance of the new algorithm is tested on a dynamic hardware simulator.
A novel retinal vessel extraction algorithm based on matched filtering and gradient vector flow
NASA Astrophysics Data System (ADS)
Yu, Lei; Xia, Mingliang; Xuan, Li
2013-10-01
The microvasculature network of retina plays an important role in the study and diagnosis of retinal diseases (age-related macular degeneration and diabetic retinopathy for example). Although it is possible to noninvasively acquire high-resolution retinal images with modern retinal imaging technologies, non-uniform illumination, the low contrast of thin vessels and the background noises all make it difficult for diagnosis. In this paper, we introduce a novel retinal vessel extraction algorithm based on gradient vector flow and matched filtering to segment retinal vessels with different likelihood. Firstly, we use isotropic Gaussian kernel and adaptive histogram equalization to smooth and enhance the retinal images respectively. Secondly, a multi-scale matched filtering method is adopted to extract the retinal vessels. Then, the gradient vector flow algorithm is introduced to locate the edge of the retinal vessels. Finally, we combine the results of matched filtering method and gradient vector flow algorithm to extract the vessels at different likelihood levels. The experiments demonstrate that our algorithm is efficient and the intensities of vessel images exactly represent the likelihood of the vessels.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Spotz, William F.
PyTrilinos is a set of Python interfaces to compiled Trilinos packages. This collection supports serial and parallel dense linear algebra, serial and parallel sparse linear algebra, direct and iterative linear solution techniques, algebraic and multilevel preconditioners, nonlinear solvers and continuation algorithms, eigensolvers and partitioning algorithms. Also included are a variety of related utility functions and classes, including distributed I/O, coloring algorithms and matrix generation. PyTrilinos vector objects are compatible with the popular NumPy Python package. As a Python front end to compiled libraries, PyTrilinos takes advantage of the flexibility and ease of use of Python, and the efficiency of themore » underlying C++, C and Fortran numerical kernels. This paper covers recent, previously unpublished advances in the PyTrilinos package.« less
Implementation and performance evaluation of acoustic denoising algorithms for UAV
NASA Astrophysics Data System (ADS)
Chowdhury, Ahmed Sony Kamal
Unmanned Aerial Vehicles (UAVs) have become popular alternative for wildlife monitoring and border surveillance applications. Elimination of the UAV's background noise and classifying the target audio signal effectively are still a major challenge. The main goal of this thesis is to remove UAV's background noise by means of acoustic denoising techniques. Existing denoising algorithms, such as Adaptive Least Mean Square (LMS), Wavelet Denoising, Time-Frequency Block Thresholding, and Wiener Filter, were implemented and their performance evaluated. The denoising algorithms were evaluated for average Signal to Noise Ratio (SNR), Segmental SNR (SSNR), Log Likelihood Ratio (LLR), and Log Spectral Distance (LSD) metrics. To evaluate the effectiveness of the denoising algorithms on classification of target audio, we implemented Support Vector Machine (SVM) and Naive Bayes classification algorithms. Simulation results demonstrate that LMS and Discrete Wavelet Transform (DWT) denoising algorithm offered superior performance than other algorithms. Finally, we implemented the LMS and DWT algorithms on a DSP board for hardware evaluation. Experimental results showed that LMS algorithm's performance is robust compared to DWT for various noise types to classify target audio signals.
Cross-modal face recognition using multi-matcher face scores
NASA Astrophysics Data System (ADS)
Zheng, Yufeng; Blasch, Erik
2015-05-01
The performance of face recognition can be improved using information fusion of multimodal images and/or multiple algorithms. When multimodal face images are available, cross-modal recognition is meaningful for security and surveillance applications. For example, a probe face is a thermal image (especially at nighttime), while only visible face images are available in the gallery database. Matching a thermal probe face onto the visible gallery faces requires crossmodal matching approaches. A few such studies were implemented in facial feature space with medium recognition performance. In this paper, we propose a cross-modal recognition approach, where multimodal faces are cross-matched in feature space and the recognition performance is enhanced with stereo fusion at image, feature and/or score level. In the proposed scenario, there are two cameras for stereo imaging, two face imagers (visible and thermal images) in each camera, and three recognition algorithms (circular Gaussian filter, face pattern byte, linear discriminant analysis). A score vector is formed with three cross-matched face scores from the aforementioned three algorithms. A classifier (e.g., k-nearest neighbor, support vector machine, binomial logical regression [BLR]) is trained then tested with the score vectors by using 10-fold cross validations. The proposed approach was validated with a multispectral stereo face dataset from 105 subjects. Our experiments show very promising results: ACR (accuracy rate) = 97.84%, FAR (false accept rate) = 0.84% when cross-matching the fused thermal faces onto the fused visible faces by using three face scores and the BLR classifier.
Meiring, Gys Albertus Marthinus; Myburgh, Hermanus Carel
2015-01-01
In this paper the various driving style analysis solutions are investigated. An in-depth investigation is performed to identify the relevant machine learning and artificial intelligence algorithms utilised in current driver behaviour and driving style analysis systems. This review therefore serves as a trove of information, and will inform the specialist and the student regarding the current state of the art in driver style analysis systems, the application of these systems and the underlying artificial intelligence algorithms applied to these applications. The aim of the investigation is to evaluate the possibilities for unique driver identification utilizing the approaches identified in other driver behaviour studies. It was found that Fuzzy Logic inference systems, Hidden Markov Models and Support Vector Machines consist of promising capabilities to address unique driver identification algorithms if model complexity can be reduced. PMID:26690164
State-Based Implicit Coordination and Applications
NASA Technical Reports Server (NTRS)
Narkawicz, Anthony J.; Munoz, Cesar A.
2011-01-01
In air traffic management, pairwise coordination is the ability to achieve separation requirements when conflicting aircraft simultaneously maneuver to solve a conflict. Resolution algorithms are implicitly coordinated if they provide coordinated resolution maneuvers to conflicting aircraft when only surveillance data, e.g., position and velocity vectors, is periodically broadcast by the aircraft. This paper proposes an abstract framework for reasoning about state-based implicit coordination. The framework consists of a formalized mathematical development that enables and simplifies the design and verification of implicitly coordinated state-based resolution algorithms. The use of the framework is illustrated with several examples of algorithms and formal proofs of their coordination properties. The work presented here supports the safety case for a distributed self-separation air traffic management concept where different aircraft may use different conflict resolution algorithms and be assured that separation will be maintained.
Meiring, Gys Albertus Marthinus; Myburgh, Hermanus Carel
2015-12-04
In this paper the various driving style analysis solutions are investigated. An in-depth investigation is performed to identify the relevant machine learning and artificial intelligence algorithms utilised in current driver behaviour and driving style analysis systems. This review therefore serves as a trove of information, and will inform the specialist and the student regarding the current state of the art in driver style analysis systems, the application of these systems and the underlying artificial intelligence algorithms applied to these applications. The aim of the investigation is to evaluate the possibilities for unique driver identification utilizing the approaches identified in other driver behaviour studies. It was found that Fuzzy Logic inference systems, Hidden Markov Models and Support Vector Machines consist of promising capabilities to address unique driver identification algorithms if model complexity can be reduced.
Advanced Techniques for Scene Analysis
2010-06-01
robustness prefers a bigger intergration window to handle larger motions. The advantage of pyramidal implementation is that, while each motion vector dL...labeled SAR images. Now the previous algorithm leads to a more dedicated classifier for the particular target; however, our algorithm trades generality for...accuracy is traded for generality. 7.3.2 I-RELIEF Feature weighting transforms the original feature vector x into a new feature vector x′ by assigning each
A Semi-Vectorization Algorithm to Synthesis of Gravitational Anomaly Quantities on the Earth
NASA Astrophysics Data System (ADS)
Abdollahzadeh, M.; Eshagh, M.; Najafi Alamdari, M.
2009-04-01
The Earth's gravitational potential can be expressed by the well-known spherical harmonic expansion. The computational time of summing up this expansion is an important practical issue which can be reduced by an efficient numerical algorithm. This paper proposes such a method for block-wise synthesizing the anomaly quantities on the Earth surface using vectorization. Fully-vectorization means transformation of the summations to the simple matrix and vector products. It is not a practical for the matrices with large dimensions. Here a semi-vectorization algorithm is proposed to avoid working with large vectors and matrices. It speeds up the computations by using one loop for the summation either on degrees or on orders. The former is a good option to synthesize the anomaly quantities on the Earth surface considering a digital elevation model (DEM). This approach is more efficient than the two-step method which computes the quantities on the reference ellipsoid and continues them upward to the Earth surface. The algorithm has been coded in MATLAB which synthesizes a global grid of 5â²Ã- 5â² (corresponding 9 million points) of gravity anomaly or geoid height using a geopotential model to degree 360 in 10000 seconds by an ordinary computer with 2G RAM.
2016-12-01
Evaluated Genetic Algorithm prepared by Justin L Paul Academy of Applied Science 24 Warren Street Concord, NH 03301 under contract W911SR...Supersonic Bending Body Projectile by a Vector-Evaluated Genetic Algorithm prepared by Justin L Paul Academy of Applied Science 24 Warren Street... Genetic Algorithm 5a. CONTRACT NUMBER W199SR-15-2-001 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) Justin L Paul 5d. PROJECT
Gatos, Ilias; Tsantis, Stavros; Spiliopoulos, Stavros; Karnabatidis, Dimitris; Theotokas, Ioannis; Zoumpoulis, Pavlos; Loupas, Thanasis; Hazle, John D; Kagadis, George C
2017-09-01
The purpose of the present study was to employ a computer-aided diagnosis system that classifies chronic liver disease (CLD) using ultrasound shear wave elastography (SWE) imaging, with a stiffness value-clustering and machine-learning algorithm. A clinical data set of 126 patients (56 healthy controls, 70 with CLD) was analyzed. First, an RGB-to-stiffness inverse mapping technique was employed. A five-cluster segmentation was then performed associating corresponding different-color regions with certain stiffness value ranges acquired from the SWE manufacturer-provided color bar. Subsequently, 35 features (7 for each cluster), indicative of physical characteristics existing within the SWE image, were extracted. A stepwise regression analysis toward feature reduction was used to derive a reduced feature subset that was fed into the support vector machine classification algorithm to classify CLD from healthy cases. The highest accuracy in classification of healthy to CLD subject discrimination from the support vector machine model was 87.3% with sensitivity and specificity values of 93.5% and 81.2%, respectively. Receiver operating characteristic curve analysis gave an area under the curve value of 0.87 (confidence interval: 0.77-0.92). A machine-learning algorithm that quantifies color information in terms of stiffness values from SWE images and discriminates CLD from healthy cases is introduced. New objective parameters and criteria for CLD diagnosis employing SWE images provided by the present study can be considered an important step toward color-based interpretation, and could assist radiologists' diagnostic performance on a daily basis after being installed in a PC and employed retrospectively, immediately after the examination. Copyright © 2017 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
Painting with polygons: a procedural watercolor engine.
DiVerdi, Stephen; Krishnaswamy, Aravind; Měch, Radomír; Ito, Daichi
2013-05-01
Existing natural media painting simulations have produced high-quality results, but have required powerful compute hardware and have been limited to screen resolutions. Digital artists would like to be able to use watercolor-like painting tools, but at print resolutions and on lower end hardware such as laptops or even slates. We present a procedural algorithm for generating watercolor-like dynamic paint behaviors in a lightweight manner. Our goal is not to exactly duplicate watercolor painting, but to create a range of dynamic behaviors that allow users to achieve a similar style of process and result, while at the same time having a unique character of its own. Our stroke representation is vector based, allowing for rendering at arbitrary resolutions, and our procedural pigment advection algorithm is fast enough to support painting on slate devices. We demonstrate our technique in a commercially available slate application used by professional artists. Finally, we present a detailed analysis of the different vector-rendering technologies available.
Motion Estimation Using the Firefly Algorithm in Ultrasonic Image Sequence of Soft Tissue
Chao, Chih-Feng; Horng, Ming-Huwi; Chen, Yu-Chan
2015-01-01
Ultrasonic image sequence of the soft tissue is widely used in disease diagnosis; however, the speckle noises usually influenced the image quality. These images usually have a low signal-to-noise ratio presentation. The phenomenon gives rise to traditional motion estimation algorithms that are not suitable to measure the motion vectors. In this paper, a new motion estimation algorithm is developed for assessing the velocity field of soft tissue in a sequence of ultrasonic B-mode images. The proposed iterative firefly algorithm (IFA) searches for few candidate points to obtain the optimal motion vector, and then compares it to the traditional iterative full search algorithm (IFSA) via a series of experiments of in vivo ultrasonic image sequences. The experimental results show that the IFA can assess the vector with better efficiency and almost equal estimation quality compared to the traditional IFSA method. PMID:25873987
Motion estimation using the firefly algorithm in ultrasonic image sequence of soft tissue.
Chao, Chih-Feng; Horng, Ming-Huwi; Chen, Yu-Chan
2015-01-01
Ultrasonic image sequence of the soft tissue is widely used in disease diagnosis; however, the speckle noises usually influenced the image quality. These images usually have a low signal-to-noise ratio presentation. The phenomenon gives rise to traditional motion estimation algorithms that are not suitable to measure the motion vectors. In this paper, a new motion estimation algorithm is developed for assessing the velocity field of soft tissue in a sequence of ultrasonic B-mode images. The proposed iterative firefly algorithm (IFA) searches for few candidate points to obtain the optimal motion vector, and then compares it to the traditional iterative full search algorithm (IFSA) via a series of experiments of in vivo ultrasonic image sequences. The experimental results show that the IFA can assess the vector with better efficiency and almost equal estimation quality compared to the traditional IFSA method.
Heterogeneous Tensor Decomposition for Clustering via Manifold Optimization.
Sun, Yanfeng; Gao, Junbin; Hong, Xia; Mishra, Bamdev; Yin, Baocai
2016-03-01
Tensor clustering is an important tool that exploits intrinsically rich structures in real-world multiarray or Tensor datasets. Often in dealing with those datasets, standard practice is to use subspace clustering that is based on vectorizing multiarray data. However, vectorization of tensorial data does not exploit complete structure information. In this paper, we propose a subspace clustering algorithm without adopting any vectorization process. Our approach is based on a novel heterogeneous Tucker decomposition model taking into account cluster membership information. We propose a new clustering algorithm that alternates between different modes of the proposed heterogeneous tensor model. All but the last mode have closed-form updates. Updating the last mode reduces to optimizing over the multinomial manifold for which we investigate second order Riemannian geometry and propose a trust-region algorithm. Numerical experiments show that our proposed algorithm compete effectively with state-of-the-art clustering algorithms that are based on tensor factorization.
Vectorized Rebinning Algorithm for Fast Data Down-Sampling
NASA Technical Reports Server (NTRS)
Dean, Bruce; Aronstein, David; Smith, Jeffrey
2013-01-01
A vectorized rebinning (down-sampling) algorithm, applicable to N-dimensional data sets, has been developed that offers a significant reduction in computer run time when compared to conventional rebinning algorithms. For clarity, a two-dimensional version of the algorithm is discussed to illustrate some specific details of the algorithm content, and using the language of image processing, 2D data will be referred to as "images," and each value in an image as a "pixel." The new approach is fully vectorized, i.e., the down-sampling procedure is done as a single step over all image rows, and then as a single step over all image columns. Data rebinning (or down-sampling) is a procedure that uses a discretely sampled N-dimensional data set to create a representation of the same data, but with fewer discrete samples. Such data down-sampling is fundamental to digital signal processing, e.g., for data compression applications.
NASA Astrophysics Data System (ADS)
Wang, Bingjie; Pi, Shaohua; Sun, Qi; Jia, Bo
2015-05-01
An improved classification algorithm that considers multiscale wavelet packet Shannon entropy is proposed. Decomposition coefficients at all levels are obtained to build the initial Shannon entropy feature vector. After subtracting the Shannon entropy map of the background signal, components of the strongest discriminating power in the initial feature vector are picked out to rebuild the Shannon entropy feature vector, which is transferred to radial basis function (RBF) neural network for classification. Four types of man-made vibrational intrusion signals are recorded based on a modified Sagnac interferometer. The performance of the improved classification algorithm has been evaluated by the classification experiments via RBF neural network under different diffusion coefficients. An 85% classification accuracy rate is achieved, which is higher than the other common algorithms. The classification results show that this improved classification algorithm can be used to classify vibrational intrusion signals in an automatic real-time monitoring system.
An index-based algorithm for fast on-line query processing of latent semantic analysis
Li, Pohan; Wang, Wei
2017-01-01
Latent Semantic Analysis (LSA) is widely used for finding the documents whose semantic is similar to the query of keywords. Although LSA yield promising similar results, the existing LSA algorithms involve lots of unnecessary operations in similarity computation and candidate check during on-line query processing, which is expensive in terms of time cost and cannot efficiently response the query request especially when the dataset becomes large. In this paper, we study the efficiency problem of on-line query processing for LSA towards efficiently searching the similar documents to a given query. We rewrite the similarity equation of LSA combined with an intermediate value called partial similarity that is stored in a designed index called partial index. For reducing the searching space, we give an approximate form of similarity equation, and then develop an efficient algorithm for building partial index, which skips the partial similarities lower than a given threshold θ. Based on partial index, we develop an efficient algorithm called ILSA for supporting fast on-line query processing. The given query is transformed into a pseudo document vector, and the similarities between query and candidate documents are computed by accumulating the partial similarities obtained from the index nodes corresponds to non-zero entries in the pseudo document vector. Compared to the LSA algorithm, ILSA reduces the time cost of on-line query processing by pruning the candidate documents that are not promising and skipping the operations that make little contribution to similarity scores. Extensive experiments through comparison with LSA have been done, which demonstrate the efficiency and effectiveness of our proposed algorithm. PMID:28520747
NASA Astrophysics Data System (ADS)
Paino, A.; Keller, J.; Popescu, M.; Stone, K.
2014-06-01
In this paper we present an approach that uses Genetic Programming (GP) to evolve novel feature extraction algorithms for greyscale images. Our motivation is to create an automated method of building new feature extraction algorithms for images that are competitive with commonly used human-engineered features, such as Local Binary Pattern (LBP) and Histogram of Oriented Gradients (HOG). The evolved feature extraction algorithms are functions defined over the image space, and each produces a real-valued feature vector of variable length. Each evolved feature extractor breaks up the given image into a set of cells centered on every pixel, performs evolved operations on each cell, and then combines the results of those operations for every cell using an evolved operator. Using this method, the algorithm is flexible enough to reproduce both LBP and HOG features. The dataset we use to train and test our approach consists of a large number of pre-segmented image "chips" taken from a Forward Looking Infrared Imagery (FLIR) camera mounted on the hood of a moving vehicle. The goal is to classify each image chip as either containing or not containing a buried object. To this end, we define the fitness of a candidate solution as the cross-fold validation accuracy of the features generated by said candidate solution when used in conjunction with a Support Vector Machine (SVM) classifier. In order to validate our approach, we compare the classification accuracy of an SVM trained using our evolved features with the accuracy of an SVM trained using mainstream feature extraction algorithms, including LBP and HOG.
An index-based algorithm for fast on-line query processing of latent semantic analysis.
Zhang, Mingxi; Li, Pohan; Wang, Wei
2017-01-01
Latent Semantic Analysis (LSA) is widely used for finding the documents whose semantic is similar to the query of keywords. Although LSA yield promising similar results, the existing LSA algorithms involve lots of unnecessary operations in similarity computation and candidate check during on-line query processing, which is expensive in terms of time cost and cannot efficiently response the query request especially when the dataset becomes large. In this paper, we study the efficiency problem of on-line query processing for LSA towards efficiently searching the similar documents to a given query. We rewrite the similarity equation of LSA combined with an intermediate value called partial similarity that is stored in a designed index called partial index. For reducing the searching space, we give an approximate form of similarity equation, and then develop an efficient algorithm for building partial index, which skips the partial similarities lower than a given threshold θ. Based on partial index, we develop an efficient algorithm called ILSA for supporting fast on-line query processing. The given query is transformed into a pseudo document vector, and the similarities between query and candidate documents are computed by accumulating the partial similarities obtained from the index nodes corresponds to non-zero entries in the pseudo document vector. Compared to the LSA algorithm, ILSA reduces the time cost of on-line query processing by pruning the candidate documents that are not promising and skipping the operations that make little contribution to similarity scores. Extensive experiments through comparison with LSA have been done, which demonstrate the efficiency and effectiveness of our proposed algorithm.
O'Boyle, Noel M; Palmer, David S; Nigsch, Florian; Mitchell, John Bo
2008-10-29
We present a novel feature selection algorithm, Winnowing Artificial Ant Colony (WAAC), that performs simultaneous feature selection and model parameter optimisation for the development of predictive quantitative structure-property relationship (QSPR) models. The WAAC algorithm is an extension of the modified ant colony algorithm of Shen et al. (J Chem Inf Model 2005, 45: 1024-1029). We test the ability of the algorithm to develop a predictive partial least squares model for the Karthikeyan dataset (J Chem Inf Model 2005, 45: 581-590) of melting point values. We also test its ability to perform feature selection on a support vector machine model for the same dataset. Starting from an initial set of 203 descriptors, the WAAC algorithm selected a PLS model with 68 descriptors which has an RMSE on an external test set of 46.6 degrees C and R2 of 0.51. The number of components chosen for the model was 49, which was close to optimal for this feature selection. The selected SVM model has 28 descriptors (cost of 5, epsilon of 0.21) and an RMSE of 45.1 degrees C and R2 of 0.54. This model outperforms a kNN model (RMSE of 48.3 degrees C, R2 of 0.47) for the same data and has similar performance to a Random Forest model (RMSE of 44.5 degrees C, R2 of 0.55). However it is much less prone to bias at the extremes of the range of melting points as shown by the slope of the line through the residuals: -0.43 for WAAC/SVM, -0.53 for Random Forest. With a careful choice of objective function, the WAAC algorithm can be used to optimise machine learning and regression models that suffer from overfitting. Where model parameters also need to be tuned, as is the case with support vector machine and partial least squares models, it can optimise these simultaneously. The moving probabilities used by the algorithm are easily interpreted in terms of the best and current models of the ants, and the winnowing procedure promotes the removal of irrelevant descriptors.
Marucci-Wellman, Helen R; Corns, Helen L; Lehto, Mark R
2017-01-01
Injury narratives are now available real time and include useful information for injury surveillance and prevention. However, manual classification of the cause or events leading to injury found in large batches of narratives, such as workers compensation claims databases, can be prohibitive. In this study we compare the utility of four machine learning algorithms (Naïve Bayes, Single word and Bi-gram models, Support Vector Machine and Logistic Regression) for classifying narratives into Bureau of Labor Statistics Occupational Injury and Illness event leading to injury classifications for a large workers compensation database. These algorithms are known to do well classifying narrative text and are fairly easy to implement with off-the-shelf software packages such as Python. We propose human-machine learning ensemble approaches which maximize the power and accuracy of the algorithms for machine-assigned codes and allow for strategic filtering of rare, emerging or ambiguous narratives for manual review. We compare human-machine approaches based on filtering on the prediction strength of the classifier vs. agreement between algorithms. Regularized Logistic Regression (LR) was the best performing algorithm alone. Using this algorithm and filtering out the bottom 30% of predictions for manual review resulted in high accuracy (overall sensitivity/positive predictive value of 0.89) of the final machine-human coded dataset. The best pairings of algorithms included Naïve Bayes with Support Vector Machine whereby the triple ensemble NB SW =NB BI-GRAM =SVM had very high performance (0.93 overall sensitivity/positive predictive value and high accuracy (i.e. high sensitivity and positive predictive values)) across both large and small categories leaving 41% of the narratives for manual review. Integrating LR into this ensemble mix improved performance only slightly. For large administrative datasets we propose incorporation of methods based on human-machine pairings such as we have done here, utilizing readily-available off-the-shelf machine learning techniques and resulting in only a fraction of narratives that require manual review. Human-machine ensemble methods are likely to improve performance over total manual coding. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.
Hipp, Jason D; Cheng, Jerome Y; Toner, Mehmet; Tompkins, Ronald G; Balis, Ulysses J
2011-02-26
HISTORICALLY, EFFECTIVE CLINICAL UTILIZATION OF IMAGE ANALYSIS AND PATTERN RECOGNITION ALGORITHMS IN PATHOLOGY HAS BEEN HAMPERED BY TWO CRITICAL LIMITATIONS: 1) the availability of digital whole slide imagery data sets and 2) a relative domain knowledge deficit in terms of application of such algorithms, on the part of practicing pathologists. With the advent of the recent and rapid adoption of whole slide imaging solutions, the former limitation has been largely resolved. However, with the expectation that it is unlikely for the general cohort of contemporary pathologists to gain advanced image analysis skills in the short term, the latter problem remains, thus underscoring the need for a class of algorithm that has the concurrent properties of image domain (or organ system) independence and extreme ease of use, without the need for specialized training or expertise. In this report, we present a novel, general case pattern recognition algorithm, Spatially Invariant Vector Quantization (SIVQ), that overcomes the aforementioned knowledge deficit. Fundamentally based on conventional Vector Quantization (VQ) pattern recognition approaches, SIVQ gains its superior performance and essentially zero-training workflow model from its use of ring vectors, which exhibit continuous symmetry, as opposed to square or rectangular vectors, which do not. By use of the stochastic matching properties inherent in continuous symmetry, a single ring vector can exhibit as much as a millionfold improvement in matching possibilities, as opposed to conventional VQ vectors. SIVQ was utilized to demonstrate rapid and highly precise pattern recognition capability in a broad range of gross and microscopic use-case settings. With the performance of SIVQ observed thus far, we find evidence that indeed there exist classes of image analysis/pattern recognition algorithms suitable for deployment in settings where pathologists alone can effectively incorporate their use into clinical workflow, as a turnkey solution. We anticipate that SIVQ, and other related class-independent pattern recognition algorithms, will become part of the overall armamentarium of digital image analysis approaches that are immediately available to practicing pathologists, without the need for the immediate availability of an image analysis expert.
Variable Selection for Support Vector Machines in Moderately High Dimensions
Zhang, Xiang; Wu, Yichao; Wang, Lan; Li, Runze
2015-01-01
Summary The support vector machine (SVM) is a powerful binary classification tool with high accuracy and great flexibility. It has achieved great success, but its performance can be seriously impaired if many redundant covariates are included. Some efforts have been devoted to studying variable selection for SVMs, but asymptotic properties, such as variable selection consistency, are largely unknown when the number of predictors diverges to infinity. In this work, we establish a unified theory for a general class of nonconvex penalized SVMs. We first prove that in ultra-high dimensions, there exists one local minimizer to the objective function of nonconvex penalized SVMs possessing the desired oracle property. We further address the problem of nonunique local minimizers by showing that the local linear approximation algorithm is guaranteed to converge to the oracle estimator even in the ultra-high dimensional setting if an appropriate initial estimator is available. This condition on initial estimator is verified to be automatically valid as long as the dimensions are moderately high. Numerical examples provide supportive evidence. PMID:26778916
Predicting asthma exacerbations using artificial intelligence.
Finkelstein, Joseph; Wood, Jeffrey
2013-01-01
Modern telemonitoring systems identify a serious patient deterioration when it already occurred. It would be much more beneficial if the upcoming clinical deterioration were identified ahead of time even before a patient actually experiences it. The goal of this study was to assess artificial intelligence approaches which potentially can be used in telemonitoring systems for advance prediction of changes in disease severity before they actually occur. The study dataset was based on daily self-reports submitted by 26 adult asthma patients during home telemonitoring consisting of 7001 records. Two classification algorithms were employed for building predictive models: naïve Bayesian classifier and support vector machines. Using a 7-day window, a support vector machine was able to predict asthma exacerbation to occur on the day 8 with the accuracy of 0.80, sensitivity of 0.84 and specificity of 0.80. Our study showed that methods of artificial intelligence have significant potential in developing individualized decision support for chronic disease telemonitoring systems.
New Syndrome Decoding Techniques for the (n, K) Convolutional Codes
NASA Technical Reports Server (NTRS)
Reed, I. S.; Truong, T. K.
1983-01-01
This paper presents a new syndrome decoding algorithm for the (n,k) convolutional codes (CC) which differs completely from an earlier syndrome decoding algorithm of Schalkwijk and Vinck. The new algorithm is based on the general solution of the syndrome equation, a linear Diophantine equation for the error polynomial vector E(D). The set of Diophantine solutions is a coset of the CC. In this error coset a recursive, Viterbi-like algorithm is developed to find the minimum weight error vector (circumflex)E(D). An example, illustrating the new decoding algorithm, is given for the binary nonsystemmatic (3,1)CC.
Simplified Syndrome Decoding of (n, 1) Convolutional Codes
NASA Technical Reports Server (NTRS)
Reed, I. S.; Truong, T. K.
1983-01-01
A new syndrome decoding algorithm for the (n, 1) convolutional codes (CC) that is different and simpler than the previous syndrome decoding algorithm of Schalkwijk and Vinck is presented. The new algorithm uses the general solution of the polynomial linear Diophantine equation for the error polynomial vector E(D). This set of Diophantine solutions is a coset of the CC space. A recursive or Viterbi-like algorithm is developed to find the minimum weight error vector cirumflex E(D) in this error coset. An example illustrating the new decoding algorithm is given for the binary nonsymmetric (2,1)CC.
Quantized Overcomplete Expansions: Analysis, Synthesis and Algorithms
1995-07-01
would be in the spirit of the Lempel - Ziv algorithm . The decoder would have to be aware of changes in the dictionary, but depending on the nature of the...37 3.4 A General Vector Compression Algorithm Based on Frames : : : : : : : : : : 40 ii 3.4.1 Design Considerations...x3.3. Along with exploring general properties of matching pursuit, we are interested in its application to compressing data vectors in RN. A general
Computational mechanics analysis tools for parallel-vector supercomputers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.; Nguyen, Duc T.; Baddourah, Majdi; Qin, Jiangning
1993-01-01
Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigensolution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization search analysis and domain decomposition. The source code for many of these algorithms is available.
Distributed Matrix Completion: Application to Cooperative Positioning in Noisy Environments
2013-12-11
positioning, and a gossip version of low-rank approximation were developed. A convex relaxation for positioning in the presence of noise was shown to...of a large data matrix through gossip algorithms. A new algorithm is proposed that amounts to iteratively multiplying a vector by independent random...sparsification of the original matrix and averaging the resulting normalized vectors. This can be viewed as a generalization of gossip algorithms for
Alcaide-Leon, P; Dufort, P; Geraldo, A F; Alshafai, L; Maralani, P J; Spears, J; Bharatha, A
2017-06-01
Accurate preoperative differentiation of primary central nervous system lymphoma and enhancing glioma is essential to avoid unnecessary neurosurgical resection in patients with primary central nervous system lymphoma. The purpose of the study was to evaluate the diagnostic performance of a machine-learning algorithm by using texture analysis of contrast-enhanced T1-weighted images for differentiation of primary central nervous system lymphoma and enhancing glioma. Seventy-one adult patients with enhancing gliomas and 35 adult patients with primary central nervous system lymphomas were included. The tumors were manually contoured on contrast-enhanced T1WI, and the resulting volumes of interest were mined for textural features and subjected to a support vector machine-based machine-learning protocol. Three readers classified the tumors independently on contrast-enhanced T1WI. Areas under the receiver operating characteristic curves were estimated for each reader and for the support vector machine classifier. A noninferiority test for diagnostic accuracy based on paired areas under the receiver operating characteristic curve was performed with a noninferiority margin of 0.15. The mean areas under the receiver operating characteristic curve were 0.877 (95% CI, 0.798-0.955) for the support vector machine classifier; 0.878 (95% CI, 0.807-0.949) for reader 1; 0.899 (95% CI, 0.833-0.966) for reader 2; and 0.845 (95% CI, 0.757-0.933) for reader 3. The mean area under the receiver operating characteristic curve of the support vector machine classifier was significantly noninferior to the mean area under the curve of reader 1 ( P = .021), reader 2 ( P = .035), and reader 3 ( P = .007). Support vector machine classification based on textural features of contrast-enhanced T1WI is noninferior to expert human evaluation in the differentiation of primary central nervous system lymphoma and enhancing glioma. © 2017 by American Journal of Neuroradiology.
Cui, Zaixu; Gong, Gaolang
2018-06-02
Individualized behavioral/cognitive prediction using machine learning (ML) regression approaches is becoming increasingly applied. The specific ML regression algorithm and sample size are two key factors that non-trivially influence prediction accuracies. However, the effects of the ML regression algorithm and sample size on individualized behavioral/cognitive prediction performance have not been comprehensively assessed. To address this issue, the present study included six commonly used ML regression algorithms: ordinary least squares (OLS) regression, least absolute shrinkage and selection operator (LASSO) regression, ridge regression, elastic-net regression, linear support vector regression (LSVR), and relevance vector regression (RVR), to perform specific behavioral/cognitive predictions based on different sample sizes. Specifically, the publicly available resting-state functional MRI (rs-fMRI) dataset from the Human Connectome Project (HCP) was used, and whole-brain resting-state functional connectivity (rsFC) or rsFC strength (rsFCS) were extracted as prediction features. Twenty-five sample sizes (ranged from 20 to 700) were studied by sub-sampling from the entire HCP cohort. The analyses showed that rsFC-based LASSO regression performed remarkably worse than the other algorithms, and rsFCS-based OLS regression performed markedly worse than the other algorithms. Regardless of the algorithm and feature type, both the prediction accuracy and its stability exponentially increased with increasing sample size. The specific patterns of the observed algorithm and sample size effects were well replicated in the prediction using re-testing fMRI data, data processed by different imaging preprocessing schemes, and different behavioral/cognitive scores, thus indicating excellent robustness/generalization of the effects. The current findings provide critical insight into how the selected ML regression algorithm and sample size influence individualized predictions of behavior/cognition and offer important guidance for choosing the ML regression algorithm or sample size in relevant investigations. Copyright © 2018 Elsevier Inc. All rights reserved.
Spacebased Estimation of Moisture Transport in Marine Atmosphere Using Support Vector Regression
NASA Technical Reports Server (NTRS)
Xie, Xiaosu; Liu, W. Timothy; Tang, Benyang
2007-01-01
An improved algorithm is developed based on support vector regression (SVR) to estimate horizonal water vapor transport integrated through the depth of the atmosphere ((Theta)) over the global ocean from observations of surface wind-stress vector by QuikSCAT, cloud drift wind vector derived from the Multi-angle Imaging SpectroRadiometer (MISR) and geostationary satellites, and precipitable water from the Special Sensor Microwave/Imager (SSM/I). The statistical relation is established between the input parameters (the surface wind stress, the 850 mb wind, the precipitable water, time and location) and the target data ((Theta) calculated from rawinsondes and reanalysis of numerical weather prediction model). The results are validated with independent daily rawinsonde observations, monthly mean reanalysis data, and through regional water balance. This study clearly demonstrates the improvement of (Theta) derived from satellite data using SVR over previous data sets based on linear regression and neural network. The SVR methodology reduces both mean bias and standard deviation comparedwith rawinsonde observations. It agrees better with observations from synoptic to seasonal time scales, and compare more favorably with the reanalysis data on seasonal variations. Only the SVR result can achieve the water balance over South America. The rationale of the advantage by SVR method and the impact of adding the upper level wind will also be discussed.
Stacked Denoising Autoencoders Applied to Star/Galaxy Classification
NASA Astrophysics Data System (ADS)
Qin, Hao-ran; Lin, Ji-ming; Wang, Jun-yi
2017-04-01
In recent years, the deep learning algorithm, with the characteristics of strong adaptability, high accuracy, and structural complexity, has become more and more popular, but it has not yet been used in astronomy. In order to solve the problem that the star/galaxy classification accuracy is high for the bright source set, but low for the faint source set of the Sloan Digital Sky Survey (SDSS) data, we introduced the new deep learning algorithm, namely the SDA (stacked denoising autoencoder) neural network and the dropout fine-tuning technique, which can greatly improve the robustness and antinoise performance. We randomly selected respectively the bright source sets and faint source sets from the SDSS DR12 and DR7 data with spectroscopic measurements, and made preprocessing on them. Then, we randomly selected respectively the training sets and testing sets without replacement from the bright source sets and faint source sets. At last, using these training sets we made the training to obtain the SDA models of the bright sources and faint sources in the SDSS DR7 and DR12, respectively. We compared the test result of the SDA model on the DR12 testing set with the test results of the Library for Support Vector Machines (LibSVM), J48 decision tree, Logistic Model Tree (LMT), Support Vector Machine (SVM), Logistic Regression, and Decision Stump algorithm, and compared the test result of the SDA model on the DR7 testing set with the test results of six kinds of decision trees. The experiments show that the SDA has a better classification accuracy than other machine learning algorithms for the faint source sets of DR7 and DR12. Especially, when the completeness function is used as the evaluation index, compared with the decision tree algorithms, the correctness rate of SDA has improved about 15% for the faint source set of SDSS-DR7.
Xia, Jiaqi; Peng, Zhenling; Qi, Dawei; Mu, Hongbo; Yang, Jianyi
2017-03-15
Protein fold classification is a critical step in protein structure prediction. There are two possible ways to classify protein folds. One is through template-based fold assignment and the other is ab-initio prediction using machine learning algorithms. Combination of both solutions to improve the prediction accuracy was never explored before. We developed two algorithms, HH-fold and SVM-fold for protein fold classification. HH-fold is a template-based fold assignment algorithm using the HHsearch program. SVM-fold is a support vector machine-based ab-initio classification algorithm, in which a comprehensive set of features are extracted from three complementary sequence profiles. These two algorithms are then combined, resulting to the ensemble approach TA-fold. We performed a comprehensive assessment for the proposed methods by comparing with ab-initio methods and template-based threading methods on six benchmark datasets. An accuracy of 0.799 was achieved by TA-fold on the DD dataset that consists of proteins from 27 folds. This represents improvement of 5.4-11.7% over ab-initio methods. After updating this dataset to include more proteins in the same folds, the accuracy increased to 0.971. In addition, TA-fold achieved >0.9 accuracy on a large dataset consisting of 6451 proteins from 184 folds. Experiments on the LE dataset show that TA-fold consistently outperforms other threading methods at the family, superfamily and fold levels. The success of TA-fold is attributed to the combination of template-based fold assignment and ab-initio classification using features from complementary sequence profiles that contain rich evolution information. http://yanglab.nankai.edu.cn/TA-fold/. yangjy@nankai.edu.cn or mhb-506@163.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
A vectorized algorithm for 3D dynamics of a tethered satellite
NASA Technical Reports Server (NTRS)
Wilson, Howard B.
1989-01-01
Equations of motion characterizing the three dimensional motion of a tethered satellite during the retrieval phase are studied. The mathematical model involves an arbitrary number of point masses connected by weightless cords. Motion occurs in a gravity gradient field. The formulation presented accounts for general functions describing support point motion, rate of tether retrieval, and arbitrary forces applied to the point masses. The matrix oriented program language MATLAB is used to produce an efficient vectorized formulation for computing natural frequencies and mode shapes for small oscillations about the static equilibrium configuration; and for integrating the nonlinear differential equations governing large amplitude motions. An example of time response pertaining to the skip rope effect is investigated.
Unresolved Galaxy Classifier for ESA/Gaia mission: Support Vector Machines approach
NASA Astrophysics Data System (ADS)
Bellas-Velidis, Ioannis; Kontizas, Mary; Dapergolas, Anastasios; Livanou, Evdokia; Kontizas, Evangelos; Karampelas, Antonios
A software package Unresolved Galaxy Classifier (UGC) is being developed for the ground-based pipeline of ESA's Gaia mission. It aims to provide an automated taxonomic classification and specific parameters estimation analyzing Gaia BP/RP instrument low-dispersion spectra of unresolved galaxies. The UGC algorithm is based on a supervised learning technique, the Support Vector Machines (SVM). The software is implemented in Java as two separate modules. An offline learning module provides functions for SVM-models training. Once trained, the set of models can be repeatedly applied to unknown galaxy spectra by the pipeline's application module. A library of galaxy models synthetic spectra, simulated for the BP/RP instrument, is used to train and test the modules. Science tests show a very good classification performance of UGC and relatively good regression performance, except for some of the parameters. Possible approaches to improve the performance are discussed.
Support vector machines and generalisation in HEP
NASA Astrophysics Data System (ADS)
Bevan, Adrian; Gamboa Goñi, Rodrigo; Hays, Jon; Stevenson, Tom
2017-10-01
We review the concept of Support Vector Machines (SVMs) and discuss examples of their use in a number of scenarios. Several SVM implementations have been used in HEP and we exemplify this algorithm using the Toolkit for Multivariate Analysis (TMVA) implementation. We discuss examples relevant to HEP including background suppression for H → τ + τ - at the LHC with several different kernel functions. Performance benchmarking leads to the issue of generalisation of hyper-parameter selection. The avoidance of fine tuning (over training or over fitting) in MVA hyper-parameter optimisation, i.e. the ability to ensure generalised performance of an MVA that is independent of the training, validation and test samples, is of utmost importance. We discuss this issue and compare and contrast performance of hold-out and k-fold cross-validation. We have extended the SVM functionality and introduced tools to facilitate cross validation in TMVA and present results based on these improvements.
Teutsch, T; Mesch, M; Giessen, H; Tarin, C
2015-01-01
In this contribution, a method to select discrete wavelengths that allow an accurate estimation of the glucose concentration in a biosensing system based on metamaterials is presented. The sensing concept is adapted to the particular application of ophthalmic glucose sensing by covering the metamaterial with a glucose-sensitive hydrogel and the sensor readout is performed optically. Due to the fact that in a mobile context a spectrometer is not suitable, few discrete wavelengths must be selected to estimate the glucose concentration. The developed selection methods are based on nonlinear support vector regression (SVR) models. Two selection methods are compared and it is shown that wavelengths selected by a sequential forward feature selection algorithm achieves an estimation improvement. The presented method can be easily applied to different metamaterial layouts and hydrogel configurations.
Density-based penalty parameter optimization on C-SVM.
Liu, Yun; Lian, Jie; Bartolacci, Michael R; Zeng, Qing-An
2014-01-01
The support vector machine (SVM) is one of the most widely used approaches for data classification and regression. SVM achieves the largest distance between the positive and negative support vectors, which neglects the remote instances away from the SVM interface. In order to avoid a position change of the SVM interface as the result of an error system outlier, C-SVM was implemented to decrease the influences of the system's outliers. Traditional C-SVM holds a uniform parameter C for both positive and negative instances; however, according to the different number proportions and the data distribution, positive and negative instances should be set with different weights for the penalty parameter of the error terms. Therefore, in this paper, we propose density-based penalty parameter optimization of C-SVM. The experiential results indicated that our proposed algorithm has outstanding performance with respect to both precision and recall.
A support vector machine based control application to the experimental three-tank system.
Iplikci, Serdar
2010-07-01
This paper presents a support vector machine (SVM) approach to generalized predictive control (GPC) of multiple-input multiple-output (MIMO) nonlinear systems. The possession of higher generalization potential and at the same time avoidance of getting stuck into the local minima have motivated us to employ SVM algorithms for modeling MIMO systems. Based on the SVM model, detailed and compact formulations for calculating predictions and gradient information, which are used in the computation of the optimal control action, are given in the paper. The proposed MIMO SVM-based GPC method has been verified on an experimental three-tank liquid level control system. Experimental results have shown that the proposed method can handle the control task successfully for different reference trajectories. Moreover, a detailed discussion on data gathering, model selection and effects of the control parameters have been given in this paper. 2010 ISA. Published by Elsevier Ltd. All rights reserved.
Breast Cancer Recognition Using a Novel Hybrid Intelligent Method
Addeh, Jalil; Ebrahimzadeh, Ata
2012-01-01
Breast cancer is the second largest cause of cancer deaths among women. At the same time, it is also among the most curable cancer types if it can be diagnosed early. This paper presents a novel hybrid intelligent method for recognition of breast cancer tumors. The proposed method includes three main modules: the feature extraction module, the classifier module, and the optimization module. In the feature extraction module, fuzzy features are proposed as the efficient characteristic of the patterns. In the classifier module, because of the promising generalization capability of support vector machines (SVM), a SVM-based classifier is proposed. In support vector machine training, the hyperparameters have very important roles for its recognition accuracy. Therefore, in the optimization module, the bees algorithm (BA) is proposed for selecting appropriate parameters of the classifier. The proposed system is tested on Wisconsin Breast Cancer database and simulation results show that the recommended system has a high accuracy. PMID:23626945
Uncertainty principles for inverse source problems for electromagnetic and elastic waves
NASA Astrophysics Data System (ADS)
Griesmaier, Roland; Sylvester, John
2018-06-01
In isotropic homogeneous media, far fields of time-harmonic electromagnetic waves radiated by compactly supported volume currents, and elastic waves radiated by compactly supported body force densities can be modelled in very similar fashions. Both are projected restricted Fourier transforms of vector-valued source terms. In this work we generalize two types of uncertainty principles recently developed for far fields of scalar-valued time-harmonic waves in Griesmaier and Sylvester (2017 SIAM J. Appl. Math. 77 154–80) to this vector-valued setting. These uncertainty principles yield stability criteria and algorithms for splitting far fields radiated by collections of well-separated sources into the far fields radiated by individual source components, and for the restoration of missing data segments. We discuss proper regularization strategies for these inverse problems, provide stability estimates based on the new uncertainty principles, and comment on reconstruction schemes. A numerical example illustrates our theoretical findings.
Support vector machines-based fault diagnosis for turbo-pump rotor
NASA Astrophysics Data System (ADS)
Yuan, Sheng-Fa; Chu, Fu-Lei
2006-05-01
Most artificial intelligence methods used in fault diagnosis are based on empirical risk minimisation principle and have poor generalisation when fault samples are few. Support vector machines (SVM) is a new general machine-learning tool based on structural risk minimisation principle that exhibits good generalisation even when fault samples are few. Fault diagnosis based on SVM is discussed. Since basic SVM is originally designed for two-class classification, while most of fault diagnosis problems are multi-class cases, a new multi-class classification of SVM named 'one to others' algorithm is presented to solve the multi-class recognition problems. It is a binary tree classifier composed of several two-class classifiers organised by fault priority, which is simple, and has little repeated training amount, and the rate of training and recognition is expedited. The effectiveness of the method is verified by the application to the fault diagnosis for turbo pump rotor.
Visualizing Vector Fields Using Line Integral Convolution and Dye Advection
NASA Technical Reports Server (NTRS)
Shen, Han-Wei; Johnson, Christopher R.; Ma, Kwan-Liu
1996-01-01
We present local and global techniques to visualize three-dimensional vector field data. Using the Line Integral Convolution (LIC) method to image the global vector field, our new algorithm allows the user to introduce colored 'dye' into the vector field to highlight local flow features. A fast algorithm is proposed that quickly recomputes the dyed LIC images. In addition, we introduce volume rendering methods that can map the LIC texture on any contour surface and/or translucent region defined by additional scalar quantities, and can follow the advection of colored dye throughout the volume.
Ye, Fei; Lou, Xin Yuan; Sun, Lin Fu
2017-01-01
This paper proposes a new support vector machine (SVM) optimization scheme based on an improved chaotic fly optimization algorithm (FOA) with a mutation strategy to simultaneously perform parameter setting turning for the SVM and feature selection. In the improved FOA, the chaotic particle initializes the fruit fly swarm location and replaces the expression of distance for the fruit fly to find the food source. However, the proposed mutation strategy uses two distinct generative mechanisms for new food sources at the osphresis phase, allowing the algorithm procedure to search for the optimal solution in both the whole solution space and within the local solution space containing the fruit fly swarm location. In an evaluation based on a group of ten benchmark problems, the proposed algorithm's performance is compared with that of other well-known algorithms, and the results support the superiority of the proposed algorithm. Moreover, this algorithm is successfully applied in a SVM to perform both parameter setting turning for the SVM and feature selection to solve real-world classification problems. This method is called chaotic fruit fly optimization algorithm (CIFOA)-SVM and has been shown to be a more robust and effective optimization method than other well-known methods, particularly in terms of solving the medical diagnosis problem and the credit card problem.
Fault Detection of Bearing Systems through EEMD and Optimization Algorithm
Lee, Dong-Han; Ahn, Jong-Hyo; Koh, Bong-Hwan
2017-01-01
This study proposes a fault detection and diagnosis method for bearing systems using ensemble empirical mode decomposition (EEMD) based feature extraction, in conjunction with particle swarm optimization (PSO), principal component analysis (PCA), and Isomap. First, a mathematical model is assumed to generate vibration signals from damaged bearing components, such as the inner-race, outer-race, and rolling elements. The process of decomposing vibration signals into intrinsic mode functions (IMFs) and extracting statistical features is introduced to develop a damage-sensitive parameter vector. Finally, PCA and Isomap algorithm are used to classify and visualize this parameter vector, to separate damage characteristics from healthy bearing components. Moreover, the PSO-based optimization algorithm improves the classification performance by selecting proper weightings for the parameter vector, to maximize the visualization effect of separating and grouping of parameter vectors in three-dimensional space. PMID:29143772
VizieR Online Data Catalog: Gamma-ray AGN type determination (Hassan+, 2013)
NASA Astrophysics Data System (ADS)
Hassan, T.; Mirabal, N.; Contreras, J. L.; Oya, I.
2013-11-01
In this paper, we employ Support Vector Machines (SVMs) and Random Forest (RF) that embody two of the most robust supervised learning algorithms available today. We are interested in building classifiers that can distinguish between two AGN classes: BL Lacs and FSRQs. In the 2FGL, there is a total set of 1074 identified/associated AGN objects with the following labels: 'bzb' (BL Lacs), 'bzq' (FSRQs), 'agn' (other non-blazar AGN) and 'agu' (active galaxies of uncertain type). From this global set, we group the identified/associated blazars ('bzb' and 'bzq' labels) as the training/testing set of our algorithms. (2 data files).
The method for froth floatation condition recognition based on adaptive feature weighted
NASA Astrophysics Data System (ADS)
Wang, Jieran; Zhang, Jun; Tian, Jinwen; Zhang, Daimeng; Liu, Xiaomao
2018-03-01
The fusion of foam characteristics can play a complementary role in expressing the content of foam image. The weight of foam characteristics is the key to make full use of the relationship between the different features. In this paper, an Adaptive Feature Weighted Method For Froth Floatation Condition Recognition is proposed. Foam features without and with weights are both classified by using support vector machine (SVM).The classification accuracy and optimal equaling algorithm under the each ore grade are regarded as the result of the adaptive feature weighting algorithm. At the same time the effectiveness of adaptive weighted method is demonstrated.
Remote sensing of suspended sediment water research: principles, methods, and progress
NASA Astrophysics Data System (ADS)
Shen, Ping; Zhang, Jing
2011-12-01
In this paper, we reviewed the principle, data, methods and steps in suspended sediment research by using remote sensing, summed up some representative models and methods, and analyzes the deficiencies of existing methods. Combined with the recent progress of remote sensing theory and application in water suspended sediment research, we introduced in some data processing methods such as atmospheric correction method, adjacent effect correction, and some intelligence algorithms such as neural networks, genetic algorithms, support vector machines into the suspended sediment inversion research, combined with other geographic information, based on Bayesian theory, we improved the suspended sediment inversion precision, and aim to give references to the related researchers.
Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan
2017-10-03
Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors, one is composed of 45 elements, characterizing the information entropies of the disease genes distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements with the first one to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes.
Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan
2017-01-01
Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors, one is composed of 45 elements, characterizing the information entropies of the disease genes distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements with the first one to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes. PMID:29108274
Using Impact Modulation to Identify Loose Bolts on a Satellite
2011-10-21
for public release; distribution is unlimited the literature to be an effective damage detection method for cracks, delamination, and fatigue in...to identify loose bolts and fatigue damage using optimized sensor locations using a Support Vector Machines algorithm to classify the dam- age. Finally...48] did preliminary work which showed that VM is effective in detecting fatigue cracks in engineering components despite changes in actuator location
Supervised Time Series Event Detector for Building Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
2016-04-13
A machine learning based approach is developed to detect events that have rarely been seen in the historical data. The data can include building energy consumption, sensor data, environmental data and any data that may affect the building's energy consumption. The algorithm is a modified nonlinear Bayesian support vector machine, which examines daily energy consumption profile, detect the days with abnormal events, and diagnose the cause of the events.
GaAs Supercomputing: Architecture, Language, And Algorithms For Image Processing
NASA Astrophysics Data System (ADS)
Johl, John T.; Baker, Nick C.
1988-10-01
The application of high-speed GaAs processors in a parallel system matches the demanding computational requirements of image processing. The architecture of the McDonnell Douglas Astronautics Company (MDAC) vector processor is described along with the algorithms and language translator. Most image and signal processing algorithms can utilize parallel processing and show a significant performance improvement over sequential versions. The parallelization performed by this system is within each vector instruction. Since each vector has many elements, each requiring some computation, useful concurrent arithmetic operations can easily be performed. Balancing the memory bandwidth with the computation rate of the processors is an important design consideration for high efficiency and utilization. The architecture features a bus-based execution unit consisting of four to eight 32-bit GaAs RISC microprocessors running at a 200 MHz clock rate for a peak performance of 1.6 BOPS. The execution unit is connected to a vector memory with three buses capable of transferring two input words and one output word every 10 nsec. The address generators inside the vector memory perform different vector addressing modes and feed the data to the execution unit. The functions discussed in this paper include basic MATRIX OPERATIONS, 2-D SPATIAL CONVOLUTION, HISTOGRAM, and FFT. For each of these algorithms, assembly language programs were run on a behavioral model of the system to obtain performance figures.
NASA Astrophysics Data System (ADS)
Ansari, Hamid Reza
2014-09-01
In this paper we propose a new method for predicting rock porosity based on a combination of several artificial intelligence systems. The method focuses on one of the Iranian carbonate fields in the Persian Gulf. Because there is strong heterogeneity in carbonate formations, estimation of rock properties experiences more challenge than sandstone. For this purpose, seismic colored inversion (SCI) and a new approach of committee machine are used in order to improve porosity estimation. The study comprises three major steps. First, a series of sample-based attributes is calculated from 3D seismic volume. Acoustic impedance is an important attribute that is obtained by the SCI method in this study. Second, porosity log is predicted from seismic attributes using common intelligent computation systems including: probabilistic neural network (PNN), radial basis function network (RBFN), multi-layer feed forward network (MLFN), ε-support vector regression (ε-SVR) and adaptive neuro-fuzzy inference system (ANFIS). Finally, a power law committee machine (PLCM) is constructed based on imperial competitive algorithm (ICA) to combine the results of all previous predictions in a single solution. This technique is called PLCM-ICA in this paper. The results show that PLCM-ICA model improved the results of neural networks, support vector machine and neuro-fuzzy system.
The Effect of Personalization on Smartphone-Based Fall Detectors
Medrano, Carlos; Plaza, Inmaculada; Igual, Raúl; Sánchez, Ángel; Castro, Manuel
2016-01-01
The risk of falling is high among different groups of people, such as older people, individuals with Parkinson's disease or patients in neuro-rehabilitation units. Developing robust fall detectors is important for acting promptly in case of a fall. Therefore, in this study we propose to personalize smartphone-based detectors to boost their performance as compared to a non-personalized system. Four algorithms were investigated using a public dataset: three novelty detection algorithms—Nearest Neighbor (NN), Local Outlier Factor (LOF) and One-Class Support Vector Machine (OneClass-SVM)—and a traditional supervised algorithm, Support Vector Machine (SVM). The effect of personalization was studied for each subject by considering two different training conditions: data coming only from that subject or data coming from the remaining subjects. The area under the receiver operating characteristic curve (AUC) was selected as the primary figure of merit. The results show that there is a general trend towards the increase in performance by personalizing the detector, but the effect depends on the individual being considered. A personalized NN can reach the performance of a non-personalized SVM (average AUC of 0.9861 and 0.9795, respectively), which is remarkable since NN only uses activities of daily living for training. PMID:26797614
NASA Astrophysics Data System (ADS)
Pullanagari, Reddy; Kereszturi, Gábor; Yule, Ian J.; Ghamisi, Pedram
2017-04-01
Accurate and spatially detailed mapping of complex urban environments is essential for land managers. Classifying high spectral and spatial resolution hyperspectral images is a challenging task because of its data abundance and computational complexity. Approaches with a combination of spectral and spatial information in a single classification framework have attracted special attention because of their potential to improve the classification accuracy. We extracted multiple features from spectral and spatial domains of hyperspectral images and evaluated them with two supervised classification algorithms; support vector machines (SVM) and an artificial neural network. The spatial features considered are produced by a gray level co-occurrence matrix and extended multiattribute profiles. All of these features were stacked, and the most informative features were selected using a genetic algorithm-based SVM. After selecting the most informative features, the classification model was integrated with a segmentation map derived using a hidden Markov random field. We tested the proposed method on a real application of a hyperspectral image acquired from AisaFENIX and on widely used hyperspectral images. From the results, it can be concluded that the proposed framework significantly improves the results with different spectral and spatial resolutions over different instrumentation.
NASA Astrophysics Data System (ADS)
Delgado, Juan A.; Altuve, Miguel; Nabhan Homsi, Masun
2015-12-01
This paper introduces a robust method based on the Support Vector Machine (SVM) algorithm to detect the presence of Fetal QRS (fQRS) complexes in electrocardiogram (ECG) recordings provided by the PhysioNet/CinC challenge 2013. ECG signals are first segmented into contiguous frames of 250 ms duration and then labeled in six classes. Fetal segments are tagged according to the position of fQRS complex within each one. Next, segment features extraction and dimensionality reduction are obtained by applying principal component analysis on Haar-wavelet transform. After that, two sub-datasets are generated to separate representative segments from atypical ones. Imbalanced class problem is dealt by applying sampling without replacement on each sub-dataset. Finally, two SVMs are trained and cross-validated using the two balanced sub-datasets separately. Experimental results show that the proposed approach achieves high performance rates in fetal heartbeats detection that reach up to 90.95% of accuracy, 92.16% of sensitivity, 88.51% of specificity, 94.13% of positive predictive value and 84.96% of negative predictive value. A comparative study is also carried out to show the performance of other two machine learning algorithms for fQRS complex estimation, which are K-nearest neighborhood and Bayesian network.
NASA Astrophysics Data System (ADS)
Duan, Libin; Xiao, Ning-cong; Li, Guangyao; Cheng, Aiguo; Chen, Tao
2017-07-01
Tailor-rolled blank thin-walled (TRB-TH) structures have become important vehicle components owing to their advantages of light weight and crashworthiness. The purpose of this article is to provide an efficient lightweight design for improving the energy-absorbing capability of TRB-TH structures under dynamic loading. A finite element (FE) model for TRB-TH structures is established and validated by performing a dynamic axial crash test. Different material properties for individual parts with different thicknesses are considered in the FE model. Then, a multi-objective crashworthiness design of the TRB-TH structure is constructed based on the ɛ-support vector regression (ɛ-SVR) technique and non-dominated sorting genetic algorithm-II. The key parameters (C, ɛ and σ) are optimized to further improve the predictive accuracy of ɛ-SVR under limited sample points. Finally, the technique for order preference by similarity to the ideal solution method is used to rank the solutions in Pareto-optimal frontiers and find the best compromise optima. The results demonstrate that the light weight and crashworthiness performance of the optimized TRB-TH structures are superior to their uniform thickness counterparts. The proposed approach provides useful guidance for designing TRB-TH energy absorbers for vehicle bodies.
NASA Astrophysics Data System (ADS)
Uslu, Faruk Sukru
2017-07-01
Oil spills on the ocean surface cause serious environmental, political, and economic problems. Therefore, these catastrophic threats to marine ecosystems require detection and monitoring. Hyperspectral sensors are powerful optical sensors used for oil spill detection with the help of detailed spectral information of materials. However, huge amounts of data in hyperspectral imaging (HSI) require fast and accurate computation methods for detection problems. Support vector data description (SVDD) is one of the most suitable methods for detection, especially for large data sets. Nevertheless, the selection of kernel parameters is one of the main problems in SVDD. This paper presents a method, inspired by ensemble learning, for improving performance of SVDD without tuning its kernel parameters. Additionally, a classifier selection technique is proposed to get more gain. The proposed approach also aims to solve the small sample size problem, which is very important for processing high-dimensional data in HSI. The algorithm is applied to two HSI data sets for detection problems. In the first HSI data set, various targets are detected; in the second HSI data set, oil spill detection in situ is realized. The experimental results demonstrate the feasibility and performance improvement of the proposed algorithm for oil spill detection problems.
Mwangi, Benson; Wu, Mon-Ju; Bauer, Isabelle E; Modi, Haina; Zeni, Cristian P; Zunta-Soares, Giovana B; Hasan, Khader M; Soares, Jair C
2015-11-30
Previous studies have reported abnormalities of white-matter diffusivity in pediatric bipolar disorder. However, it has not been established whether these abnormalities are able to distinguish individual subjects with pediatric bipolar disorder from healthy controls with a high specificity and sensitivity. Diffusion-weighted imaging scans were acquired from 16 youths diagnosed with DSM-IV bipolar disorder and 16 demographically matched healthy controls. Regional white matter tissue microstructural measurements such as fractional anisotropy, axial diffusivity and radial diffusivity were computed using an atlas-based approach. These measurements were used to 'train' a support vector machine (SVM) algorithm to predict new or 'unseen' subjects' diagnostic labels. The SVM algorithm predicted individual subjects with specificity=87.5%, sensitivity=68.75%, accuracy=78.12%, positive predictive value=84.62%, negative predictive value=73.68%, area under receiver operating characteristic curve (AUROC)=0.7812 and chi-square p-value=0.0012. A pattern of reduced regional white matter fractional anisotropy was observed in pediatric bipolar disorder patients. These results suggest that atlas-based diffusion weighted imaging measurements can distinguish individual pediatric bipolar disorder patients from healthy controls. Notably, from a clinical perspective these findings will contribute to the pathophysiological understanding of pediatric bipolar disorder. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Mikhchi, Abbas; Honarvar, Mahmood; Kashan, Nasser Emam Jomeh; Aminafshar, Mehdi
2016-06-21
Genotype imputation is an important tool for prediction of unknown genotypes for both unrelated individuals and parent-offspring trios. Several imputation methods are available and can either employ universal machine learning methods, or deploy algorithms dedicated to infer missing genotypes. In this research the performance of eight machine learning methods: Support Vector Machine, K-Nearest Neighbors, Extreme Learning Machine, Radial Basis Function, Random Forest, AdaBoost, LogitBoost, and TotalBoost compared in terms of the imputation accuracy, computation time and the factors affecting imputation accuracy. The methods employed using real and simulated datasets to impute the un-typed SNPs in parent-offspring trios. The tested methods show that imputation of parent-offspring trios can be accurate. The Random Forest and Support Vector Machine were more accurate than the other machine learning methods. The TotalBoost performed slightly worse than the other methods.The running times were different between methods. The ELM was always most fast algorithm. In case of increasing the sample size, the RBF requires long imputation time.The tested methods in this research can be an alternative for imputation of un-typed SNPs in low missing rate of data. However, it is recommended that other machine learning methods to be used for imputation. Copyright © 2016 Elsevier Ltd. All rights reserved.
Simple algorithm for improved security in the FDDI protocol
NASA Astrophysics Data System (ADS)
Lundy, G. M.; Jones, Benjamin
1993-02-01
We propose a modification to the Fiber Distributed Data Interface (FDDI) protocol based on a simple algorithm which will improve confidential communication capability. This proposed modification provides a simple and reliable system which exploits some of the inherent security properties in a fiber optic ring network. This method differs from conventional methods in that end to end encryption can be facilitated at the media access control sublayer of the data link layer in the OSI network model. Our method is based on a variation of the bit stream cipher method. The transmitting station takes the intended confidential message and uses a simple modulo two addition operation against an initialization vector. The encrypted message is virtually unbreakable without the initialization vector. None of the stations on the ring will have access to both the encrypted message and the initialization vector except the transmitting and receiving stations. The generation of the initialization vector is unique for each confidential transmission and thus provides a unique approach to the key distribution problem. The FDDI protocol is of particular interest to the military in terms of LAN/MAN implementations. Both the Army and the Navy are considering the standard as the basis for future network systems. A simple and reliable security mechanism with the potential to support realtime communications is a necessary consideration in the implementation of these systems. The proposed method offers several advantages over traditional methods in terms of speed, reliability, and standardization.
A Cancer Gene Selection Algorithm Based on the K-S Test and CFS.
Su, Qiang; Wang, Yina; Jiang, Xiaobing; Chen, Fuxue; Lu, Wen-Cong
2017-01-01
To address the challenging problem of selecting distinguished genes from cancer gene expression datasets, this paper presents a gene subset selection algorithm based on the Kolmogorov-Smirnov (K-S) test and correlation-based feature selection (CFS) principles. The algorithm selects distinguished genes first using the K-S test, and then, it uses CFS to select genes from those selected by the K-S test. We adopted support vector machines (SVM) as the classification tool and used the criteria of accuracy to evaluate the performance of the classifiers on the selected gene subsets. This approach compared the proposed gene subset selection algorithm with the K-S test, CFS, minimum-redundancy maximum-relevancy (mRMR), and ReliefF algorithms. The average experimental results of the aforementioned gene selection algorithms for 5 gene expression datasets demonstrate that, based on accuracy, the performance of the new K-S and CFS-based algorithm is better than those of the K-S test, CFS, mRMR, and ReliefF algorithms. The experimental results show that the K-S test-CFS gene selection algorithm is a very effective and promising approach compared to the K-S test, CFS, mRMR, and ReliefF algorithms.
NASA Astrophysics Data System (ADS)
Bastidas, L. A.; Pande, S.
2009-12-01
Pattern analysis deals with the automatic detection of patterns in the data and there are a variety of algorithms available for the purpose. These algorithms are commonly called Artificial Intelligence (AI) or data driven algorithms, and have been applied lately to a variety of problems in hydrology and are becoming extremely popular. When confronting such a range of algorithms, the question of which one is the “best” arises. Some algorithms may be preferred because of the lower computational complexity; others take into account prior knowledge of the form and the amount of the data; others are chosen based on a version of the Occam’s razor principle that a simple classifier performs better. Popper has argued, however, that Occam’s razor is without operational value because there is no clear measure or criterion for simplicity. An example of measures that can be used for this purpose are: the so called algorithmic complexity - also known as Kolmogorov complexity or Kolmogorov (algorithmic) entropy; the Bayesian information criterion; or the Vapnik-Chervonenkis dimension. On the other hand, the No Free Lunch Theorem states that there is no best general algorithm, and that specific algorithms are superior only for specific problems. It should be noted also that the appropriate algorithm and the appropriate complexity are constrained by the finiteness of the available data and the uncertainties associated with it. Thus, there is compromise between the complexity of the algorithm, the data properties, and the robustness of the predictions. We discuss the above topics; briefly review the historical development of applications with particular emphasis on statistical learning theory (SLT), also known as machine learning (ML) of which support vector machines and relevant vector machines are the most commonly known algorithms. We present some applications of such algorithms for distributed hydrologic modeling; and introduce an example of how the complexity measure can be applied for appropriate model choice within the context of applications in hydrologic modeling intended for use in studies about water resources and water resources management and their direct relation to extreme conditions or natural hazards.
Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach
NASA Astrophysics Data System (ADS)
Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew
2017-05-01
This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. The CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia using rainfall data with five input meteorological variables over the period of 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared with the CLR using the maximum likelihood framework by the expectation-maximization algorithm, multiple linear regression, artificial neural networks and the support vector machines for regression models using computational results. The results demonstrate that the proposed algorithm outperforms other methods in most locations.
An intelligent identification algorithm for the monoclonal picking instrument
NASA Astrophysics Data System (ADS)
Yan, Hua; Zhang, Rongfu; Yuan, Xujun; Wang, Qun
2017-11-01
The traditional colony selection is mainly operated by manual mode, which takes on low efficiency and strong subjectivity. Therefore, it is important to develop an automatic monoclonal-picking instrument. The critical stage of the automatic monoclonal-picking and intelligent optimal selection is intelligent identification algorithm. An auto-screening algorithm based on Support Vector Machine (SVM) is proposed in this paper, which uses the supervised learning method, which combined with the colony morphological characteristics to classify the colony accurately. Furthermore, through the basic morphological features of the colony, system can figure out a series of morphological parameters step by step. Through the establishment of maximal margin classifier, and based on the analysis of the growth trend of the colony, the selection of the monoclonal colony was carried out. The experimental results showed that the auto-screening algorithm could screen out the regular colony from the other, which meets the requirement of various parameters.
Modified Mahalanobis Taguchi System for Imbalance Data Classification
2017-01-01
The Mahalanobis Taguchi System (MTS) is considered one of the most promising binary classification algorithms to handle imbalance data. Unfortunately, MTS lacks a method for determining an efficient threshold for the binary classification. In this paper, a nonlinear optimization model is formulated based on minimizing the distance between MTS Receiver Operating Characteristics (ROC) curve and the theoretical optimal point named Modified Mahalanobis Taguchi System (MMTS). To validate the MMTS classification efficacy, it has been benchmarked with Support Vector Machines (SVMs), Naive Bayes (NB), Probabilistic Mahalanobis Taguchi Systems (PTM), Synthetic Minority Oversampling Technique (SMOTE), Adaptive Conformal Transformation (ACT), Kernel Boundary Alignment (KBA), Hidden Naive Bayes (HNB), and other improved Naive Bayes algorithms. MMTS outperforms the benchmarked algorithms especially when the imbalance ratio is greater than 400. A real life case study on manufacturing sector is used to demonstrate the applicability of the proposed model and to compare its performance with Mahalanobis Genetic Algorithm (MGA). PMID:28811820
Karayiannis, N B
2000-01-01
This paper presents the development and investigates the properties of ordered weighted learning vector quantization (LVQ) and clustering algorithms. These algorithms are developed by using gradient descent to minimize reformulation functions based on aggregation operators. An axiomatic approach provides conditions for selecting aggregation operators that lead to admissible reformulation functions. Minimization of admissible reformulation functions based on ordered weighted aggregation operators produces a family of soft LVQ and clustering algorithms, which includes fuzzy LVQ and clustering algorithms as special cases. The proposed LVQ and clustering algorithms are used to perform segmentation of magnetic resonance (MR) images of the brain. The diagnostic value of the segmented MR images provides the basis for evaluating a variety of ordered weighted LVQ and clustering algorithms.
Statistical analysis and machine learning algorithms for optical biopsy
NASA Astrophysics Data System (ADS)
Wu, Binlin; Liu, Cheng-hui; Boydston-White, Susie; Beckman, Hugh; Sriramoju, Vidyasagar; Sordillo, Laura; Zhang, Chunyuan; Zhang, Lin; Shi, Lingyan; Smith, Jason; Bailin, Jacob; Alfano, Robert R.
2018-02-01
Analyzing spectral or imaging data collected with various optical biopsy methods is often times difficult due to the complexity of the biological basis. Robust methods that can utilize the spectral or imaging data and detect the characteristic spectral or spatial signatures for different types of tissue is challenging but highly desired. In this study, we used various machine learning algorithms to analyze a spectral dataset acquired from human skin normal and cancerous tissue samples using resonance Raman spectroscopy with 532nm excitation. The algorithms including principal component analysis, nonnegative matrix factorization, and autoencoder artificial neural network are used to reduce dimension of the dataset and detect features. A support vector machine with a linear kernel is used to classify the normal tissue and cancerous tissue samples. The efficacies of the methods are compared.
A Demons algorithm for image registration with locally adaptive regularization.
Cahill, Nathan D; Noble, J Alison; Hawkes, David J
2009-01-01
Thirion's Demons is a popular algorithm for nonrigid image registration because of its linear computational complexity and ease of implementation. It approximately solves the diffusion registration problem by successively estimating force vectors that drive the deformation toward alignment and smoothing the force vectors by Gaussian convolution. In this article, we show how the Demons algorithm can be generalized to allow image-driven locally adaptive regularization in a manner that preserves both the linear complexity and ease of implementation of the original Demons algorithm. We show that the proposed algorithm exhibits lower target registration error and requires less computational effort than the original Demons algorithm on the registration of serial chest CT scans of patients with lung nodules.
Prediction of Skin Sensitization with a Particle Swarm Optimized Support Vector Machine
Yuan, Hua; Huang, Jianping; Cao, Chenzhong
2009-01-01
Skin sensitization is the most commonly reported occupational illness, causing much suffering to a wide range of people. Identification and labeling of environmental allergens is urgently required to protect people from skin sensitization. The guinea pig maximization test (GPMT) and murine local lymph node assay (LLNA) are the two most important in vivo models for identification of skin sensitizers. In order to reduce the number of animal tests, quantitative structure-activity relationships (QSARs) are strongly encouraged in the assessment of skin sensitization of chemicals. This paper has investigated the skin sensitization potential of 162 compounds with LLNA results and 92 compounds with GPMT results using a support vector machine. A particle swarm optimization algorithm was implemented for feature selection from a large number of molecular descriptors calculated by Dragon. For the LLNA data set, the classification accuracies are 95.37% and 88.89% for the training and the test sets, respectively. For the GPMT data set, the classification accuracies are 91.80% and 90.32% for the training and the test sets, respectively. The classification performances were greatly improved compared to those reported in the literature, indicating that the support vector machine optimized by particle swarm in this paper is competent for the identification of skin sensitizers. PMID:19742136
NASA Astrophysics Data System (ADS)
Mustapha, S.; Braytee, A.; Ye, L.
2017-04-01
In this study, we focused at the development and verification of a robust framework for surface crack detection in steel pipes using measured vibration responses; with the presence of multiple progressive damage occurring in different locations within the structure. Feature selection, dimensionality reduction, and multi-class support vector machine were established for this purpose. Nine damage cases, at different locations, orientations and length, were introduced into the pipe structure. The pipe was impacted 300 times using an impact hammer, after each damage case, the vibration data were collected using 3 PZT wafers which were installed on the outer surface of the pipe. At first, damage sensitive features were extracted using the frequency response function approach followed by recursive feature elimination for dimensionality reduction. Then, a multi-class support vector machine learning algorithm was employed to train the data and generate a statistical model. Once the model is established, decision values and distances from the hyper-plane were generated for the new collected data using the trained model. This process was repeated on the data collected from each sensor. Overall, using a single sensor for training and testing led to a very high accuracy reaching 98% in the assessment of the 9 damage cases used in this study.
Wang, Xue; Wang, Sheng; Ma, Jun-Jie
2007-01-01
The effectiveness of wireless sensor networks (WSNs) depends on the coverage and target detection probability provided by dynamic deployment, which is usually supported by the virtual force (VF) algorithm. However, in the VF algorithm, the virtual force exerted by stationary sensor nodes will hinder the movement of mobile sensor nodes. Particle swarm optimization (PSO) is introduced as another dynamic deployment algorithm, but in this case the computation time required is the big bottleneck. This paper proposes a dynamic deployment algorithm which is named “virtual force directed co-evolutionary particle swarm optimization” (VFCPSO), since this algorithm combines the co-evolutionary particle swarm optimization (CPSO) with the VF algorithm, whereby the CPSO uses multiple swarms to optimize different components of the solution vectors for dynamic deployment cooperatively and the velocity of each particle is updated according to not only the historical local and global optimal solutions, but also the virtual forces of sensor nodes. Simulation results demonstrate that the proposed VFCPSO is competent for dynamic deployment in WSNs and has better performance with respect to computation time and effectiveness than the VF, PSO and VFPSO algorithms.
Voice based gender classification using machine learning
NASA Astrophysics Data System (ADS)
Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.
2017-11-01
Gender identification is one of the major problem speech analysis today. Tracing the gender from acoustic data i.e., pitch, median, frequency etc. Machine learning gives promising results for classification problem in all the research domains. There are several performance metrics to evaluate algorithms of an area. Our Comparative model algorithm for evaluating 5 different machine learning algorithms based on eight different metrics in gender classification from acoustic data. Agenda is to identify gender, with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM) on basis of eight different metrics. The main parameter in evaluating any algorithms is its performance. Misclassification rate must be less in classification problems, which says that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. Here with this comparative model algorithm, we are trying to assess the different ML algorithms and find the best fit for gender classification of acoustic data.
Flux-vector splitting algorithm for chain-rule conservation-law form
NASA Technical Reports Server (NTRS)
Shih, T. I.-P.; Nguyen, H. L.; Willis, E. A.; Steinthorsson, E.; Li, Z.
1991-01-01
A flux-vector splitting algorithm with Newton-Raphson iteration was developed for the 'full compressible' Navier-Stokes equations cast in chain-rule conservation-law form. The algorithm is intended for problems with deforming spatial domains and for problems whose governing equations cannot be cast in strong conservation-law form. The usefulness of the algorithm for such problems was demonstrated by applying it to analyze the unsteady, two- and three-dimensional flows inside one combustion chamber of a Wankel engine under nonfiring conditions. Solutions were obtained to examine the algorithm in terms of conservation error, robustness, and ability to handle complex flows on time-dependent grid systems.
A parallel Jacobson-Oksman optimization algorithm. [parallel processing (computers)
NASA Technical Reports Server (NTRS)
Straeter, T. A.; Markos, A. T.
1975-01-01
A gradient-dependent optimization technique which exploits the vector-streaming or parallel-computing capabilities of some modern computers is presented. The algorithm, derived by assuming that the function to be minimized is homogeneous, is a modification of the Jacobson-Oksman serial minimization method. In addition to describing the algorithm, conditions insuring the convergence of the iterates of the algorithm and the results of numerical experiments on a group of sample test functions are presented. The results of these experiments indicate that this algorithm will solve optimization problems in less computing time than conventional serial methods on machines having vector-streaming or parallel-computing capabilities.
New syndrome decoding techniques for the (n, k) convolutional codes
NASA Technical Reports Server (NTRS)
Reed, I. S.; Truong, T. K.
1984-01-01
This paper presents a new syndrome decoding algorithm for the (n, k) convolutional codes (CC) which differs completely from an earlier syndrome decoding algorithm of Schalkwijk and Vinck. The new algorithm is based on the general solution of the syndrome equation, a linear Diophantine equation for the error polynomial vector E(D). The set of Diophantine solutions is a coset of the CC. In this error coset a recursive, Viterbi-like algorithm is developed to find the minimum weight error vector (circumflex)E(D). An example, illustrating the new decoding algorithm, is given for the binary nonsystemmatic (3, 1)CC. Previously announced in STAR as N83-34964
Chinese Text Summarization Algorithm Based on Word2vec
NASA Astrophysics Data System (ADS)
Chengzhang, Xu; Dan, Liu
2018-02-01
In order to extract some sentences that can cover the topic of a Chinese article, a Chinese text summarization algorithm based on Word2vec is used in this paper. Words in an article are represented as vectors trained by Word2vec, the weight of each word, the sentence vector and the weight of each sentence are calculated by combining word-sentence relationship with graph-based ranking model. Finally the summary is generated on the basis of the final sentence vector and the final weight of the sentence. The experimental results on real datasets show that the proposed algorithm has a better summarization quality compared with TF-IDF and TextRank.
Robust Vision-Based Pose Estimation Algorithm for AN Uav with Known Gravity Vector
NASA Astrophysics Data System (ADS)
Kniaz, V. V.
2016-06-01
Accurate estimation of camera external orientation with respect to a known object is one of the central problems in photogrammetry and computer vision. In recent years this problem is gaining an increasing attention in the field of UAV autonomous flight. Such application requires a real-time performance and robustness of the external orientation estimation algorithm. The accuracy of the solution is strongly dependent on the number of reference points visible on the given image. The problem only has an analytical solution if 3 or more reference points are visible. However, in limited visibility conditions it is often needed to perform external orientation with only 2 visible reference points. In such case the solution could be found if the gravity vector direction in the camera coordinate system is known. A number of algorithms for external orientation estimation for the case of 2 known reference points and a gravity vector were developed to date. Most of these algorithms provide analytical solution in the form of polynomial equation that is subject to large errors in the case of complex reference points configurations. This paper is focused on the development of a new computationally effective and robust algorithm for external orientation based on positions of 2 known reference points and a gravity vector. The algorithm implementation for guidance of a Parrot AR.Drone 2.0 micro-UAV is discussed. The experimental evaluation of the algorithm proved its computational efficiency and robustness against errors in reference points positions and complex configurations.
Lu, Huijuan; Wei, Shasha; Zhou, Zili; Miao, Yanzi; Lu, Yi
2015-01-01
The main purpose of traditional classification algorithms on bioinformatics application is to acquire better classification accuracy. However, these algorithms cannot meet the requirement that minimises the average misclassification cost. In this paper, a new algorithm of cost-sensitive regularised extreme learning machine (CS-RELM) was proposed by using probability estimation and misclassification cost to reconstruct the classification results. By improving the classification accuracy of a group of small sample which higher misclassification cost, the new CS-RELM can minimise the classification cost. The 'rejection cost' was integrated into CS-RELM algorithm to further reduce the average misclassification cost. By using Colon Tumour dataset and SRBCT (Small Round Blue Cells Tumour) dataset, CS-RELM was compared with other cost-sensitive algorithms such as extreme learning machine (ELM), cost-sensitive extreme learning machine, regularised extreme learning machine, cost-sensitive support vector machine (SVM). The results of experiments show that CS-RELM with embedded rejection cost could reduce the average cost of misclassification and made more credible classification decision than others.
Hong, Xia
2006-07-01
In this letter, a Box-Cox transformation-based radial basis function (RBF) neural network is introduced using the RBF neural network to represent the transformed system output. Initially a fixed and moderate sized RBF model base is derived based on a rank revealing orthogonal matrix triangularization (QR decomposition). Then a new fast identification algorithm is introduced using Gauss-Newton algorithm to derive the required Box-Cox transformation, based on a maximum likelihood estimator. The main contribution of this letter is to explore the special structure of the proposed RBF neural network for computational efficiency by utilizing the inverse of matrix block decomposition lemma. Finally, the Box-Cox transformation-based RBF neural network, with good generalization and sparsity, is identified based on the derived optimal Box-Cox transformation and a D-optimality-based orthogonal forward regression algorithm. The proposed algorithm and its efficacy are demonstrated with an illustrative example in comparison with support vector machine regression.
Limits on the Efficiency of Event-Based Algorithms for Monte Carlo Neutron Transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
Romano, Paul K.; Siegel, Andrew R.
The traditional form of parallelism in Monte Carlo particle transport simulations, wherein each individual particle history is considered a unit of work, does not lend itself well to data-level parallelism. Event-based algorithms, which were originally used for simulations on vector processors, may offer a path toward better utilizing data-level parallelism in modern computer architectures. In this study, a simple model is developed for estimating the efficiency of the event-based particle transport algorithm under two sets of assumptions. Data collected from simulations of four reactor problems using OpenMC was then used in conjunction with the models to calculate the speedup duemore » to vectorization as a function of two parameters: the size of the particle bank and the vector width. When each event type is assumed to have constant execution time, the achievable speedup is directly related to the particle bank size. We observed that the bank size generally needs to be at least 20 times greater than vector size in order to achieve vector efficiency greater than 90%. When the execution times for events are allowed to vary, however, the vector speedup is also limited by differences in execution time for events being carried out in a single event-iteration. For some problems, this implies that vector effciencies over 50% may not be attainable. While there are many factors impacting performance of an event-based algorithm that are not captured by our model, it nevertheless provides insights into factors that may be limiting in a real implementation.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, X; Yang, D
Purpose: To investigate the method to automatically recognize the treatment site in the X-Ray portal images. It could be useful to detect potential treatment errors, and to provide guidance to sequential tasks, e.g. automatically verify the patient daily setup. Methods: The portal images were exported from MOSAIQ as DICOM files, and were 1) processed with a threshold based intensity transformation algorithm to enhance contrast, and 2) where then down-sampled (from 1024×768 to 128×96) by using bi-cubic interpolation algorithm. An appearance-based vector space model (VSM) was used to rearrange the images into vectors. A principal component analysis (PCA) method was usedmore » to reduce the vector dimensions. A multi-class support vector machine (SVM), with radial basis function kernel, was used to build the treatment site recognition models. These models were then used to recognize the treatment sites in the portal image. Portal images of 120 patients were included in the study. The images were selected to cover six treatment sites: brain, head and neck, breast, lung, abdomen and pelvis. Each site had images of the twenty patients. Cross-validation experiments were performed to evaluate the performance. Results: MATLAB image processing Toolbox and scikit-learn (a machine learning library in python) were used to implement the proposed method. The average accuracies using the AP and RT images separately were 95% and 94% respectively. The average accuracy using AP and RT images together was 98%. Computation time was ∼0.16 seconds per patient with AP or RT image, ∼0.33 seconds per patient with both of AP and RT images. Conclusion: The proposed method of treatment site recognition is efficient and accurate. It is not sensitive to the differences of image intensity, size and positions of patients in the portal images. It could be useful for the patient safety assurance. The work was partially supported by a research grant from Varian Medical System.« less
Boosting with Averaged Weight Vectors
NASA Technical Reports Server (NTRS)
Oza, Nikunj C.; Clancy, Daniel (Technical Monitor)
2002-01-01
AdaBoost is a well-known ensemble learning algorithm that constructs its constituent or base models in sequence. A key step in AdaBoost is constructing a distribution over the training examples to create each base model. This distribution, represented as a vector, is constructed to be orthogonal to the vector of mistakes made by the previous base model in the sequence. The idea is to make the next base model's errors uncorrelated with those of the previous model. Some researchers have pointed out the intuition that it is probably better to construct a distribution that is orthogonal to the mistake vectors of all the previous base models, but that this is not always possible. We present an algorithm that attempts to come as close as possible to this goal in an efficient manner. We present experimental results demonstrating significant improvement over AdaBoost and the Totally Corrective boosting algorithm, which also attempts to satisfy this goal.
New syndrome decoder for (n, 1) convolutional codes
NASA Technical Reports Server (NTRS)
Reed, I. S.; Truong, T. K.
1983-01-01
The letter presents a new syndrome decoding algorithm for the (n, 1) convolutional codes (CC) that is different and simpler than the previous syndrome decoding algorithm of Schalkwijk and Vinck. The new technique uses the general solution of the polynomial linear Diophantine equation for the error polynomial vector E(D). A recursive, Viterbi-like, algorithm is developed to find the minimum weight error vector E(D). An example is given for the binary nonsystematic (2, 1) CC.
Massively Parallel Solution of Poisson Equation on Coarse Grain MIMD Architectures
NASA Technical Reports Server (NTRS)
Fijany, A.; Weinberger, D.; Roosta, R.; Gulati, S.
1998-01-01
In this paper a new algorithm, designated as Fast Invariant Imbedding algorithm, for solution of Poisson equation on vector and massively parallel MIMD architectures is presented. This algorithm achieves the same optimal computational efficiency as other Fast Poisson solvers while offering a much better structure for vector and parallel implementation. Our implementation on the Intel Delta and Paragon shows that a speedup of over two orders of magnitude can be achieved even for moderate size problems.
NASA Astrophysics Data System (ADS)
Selva Bhuvaneswari, K.; Geetha, P.
2017-05-01
Magnetic resonance imaging segmentation refers to a process of assigning labels to set of pixels or multiple regions. It plays a major role in the field of biomedical applications as it is widely used by the radiologists to segment the medical images input into meaningful regions. In recent years, various brain tumour detection techniques are presented in the literature. The entire segmentation process of our proposed work comprises three phases: threshold generation with dynamic modified region growing phase, texture feature generation phase and region merging phase. by dynamically changing two thresholds in the modified region growing approach, the first phase of the given input image can be performed as dynamic modified region growing process, in which the optimisation algorithm, firefly algorithm help to optimise the two thresholds in modified region growing. After obtaining the region growth segmented image using modified region growing, the edges can be detected with edge detection algorithm. In the second phase, the texture feature can be extracted using entropy-based operation from the input image. In region merging phase, the results obtained from the texture feature-generation phase are combined with the results of dynamic modified region growing phase and similar regions are merged using a distance comparison between regions. After identifying the abnormal tissues, the classification can be done by hybrid kernel-based SVM (Support Vector Machine). The performance analysis of the proposed method will be carried by K-cross fold validation method. The proposed method will be implemented in MATLAB with various images.
Kavitha, Muthu Subash; Asano, Akira; Taguchi, Akira; Heo, Min-Suk
2013-09-01
To prevent low bone mineral density (BMD), that is, osteoporosis, in postmenopausal women, it is essential to diagnose osteoporosis more precisely. This study presented an automatic approach utilizing a histogram-based automatic clustering (HAC) algorithm with a support vector machine (SVM) to analyse dental panoramic radiographs (DPRs) and thus improve diagnostic accuracy by identifying postmenopausal women with low BMD or osteoporosis. We integrated our newly-proposed histogram-based automatic clustering (HAC) algorithm with our previously-designed computer-aided diagnosis system. The extracted moment-based features (mean, variance, skewness, and kurtosis) of the mandibular cortical width for the radial basis function (RBF) SVM classifier were employed. We also compared the diagnostic efficacy of the SVM model with the back propagation (BP) neural network model. In this study, DPRs and BMD measurements of 100 postmenopausal women patients (aged >50 years), with no previous record of osteoporosis, were randomly selected for inclusion. The accuracy, sensitivity, and specificity of the BMD measurements using our HAC-SVM model to identify women with low BMD were 93.0% (88.0%-98.0%), 95.8% (91.9%-99.7%) and 86.6% (79.9%-93.3%), respectively, at the lumbar spine; and 89.0% (82.9%-95.1%), 96.0% (92.2%-99.8%) and 84.0% (76.8%-91.2%), respectively, at the femoral neck. Our experimental results predict that the proposed HAC-SVM model combination applied on DPRs could be useful to assist dentists in early diagnosis and help to reduce the morbidity and mortality associated with low BMD and osteoporosis.
[Near infrared spectroscopy study on water content in turbine oil].
Chen, Bin; Liu, Ge; Zhang, Xian-Ming
2013-11-01
Near infrared (NIR) spectroscopy combined with successive projections algorithm (SPA) was investigated for determination of water content in turbine oil. Through the 57 samples of different water content in turbine oil scanned applying near infrared (NIR) spectroscopy, with the water content in the turbine oil of 0-0.156%, different pretreatment methods such as the original spectra, first derivative spectra and differential polynomial least squares fitting algorithm Savitzky-Golay (SG), and successive projections algorithm (SPA) were applied for the extraction of effective wavelengths, the correlation coefficient (R) and root mean square error (RMSE) were used as the model evaluation indices, accordingly water content in turbine oil was investigated. The results indicated that the original spectra with different water content in turbine oil were pretreated by the performance of first derivative + SG pretreatments, then the selected effective wavelengths were used as the inputs of least square support vector machine (LS-SVM). A total of 16 variables selected by SPA were employed to construct the model of SPA and least square support vector machine (SPA-LS-SVM). There is 9 as The correlation coefficient was 0.975 9 and the root of mean square error of validation set was 2.655 8 x 10(-3) using the model, and it is feasible to determine the water content in oil using near infrared spectroscopy and SPA-LS-SVM, and an excellent prediction precision was obtained. This study supplied a new and alternative approach to the further application of near infrared spectroscopy in on-line monitoring of contamination such as water content in oil.
Static vs. dynamic decoding algorithms in a non-invasive body-machine interface
Seáñez-González, Ismael; Pierella, Camilla; Farshchiansadegh, Ali; Thorp, Elias B.; Abdollahi, Farnaz; Pedersen, Jessica; Mussa-Ivaldi, Ferdinando A.
2017-01-01
In this study, we consider a non-invasive body-machine interface that captures body motions still available to people with spinal cord injury (SCI) and maps them into a set of signals for controlling a computer user interface while engaging in a sustained level of mobility and exercise. We compare the effectiveness of two decoding algorithms that transform a high-dimensional body-signal vector into a lower dimensional control vector on 6 subjects with high-level SCI and 8 controls. One algorithm is based on a static map from current body signals to the current value of the control vector set through principal component analysis (PCA), the other on dynamic mapping a segment of body signals to the value and the temporal derivatives of the control vector set through a Kalman filter. SCI and control participants performed straighter and smoother cursor movements with the Kalman algorithm during center-out reaching, but their movements were faster and more precise when using PCA. All participants were able to use the BMI’s continuous, two-dimensional control to type on a virtual keyboard and play pong, and performance with both algorithms was comparable. However, seven of eight control participants preferred PCA as their method of virtual wheelchair control. The unsupervised PCA algorithm was easier to train and seemed sufficient to achieve a higher degree of learnability and perceived ease of use. PMID:28092564
NASA Astrophysics Data System (ADS)
Wen, Hongwei; Liu, Yue; Wang, Jieqiong; Zhang, Jishui; Peng, Yun; He, Huiguang
2016-03-01
Tourette syndrome (TS) is a developmental neuropsychiatric disorder with the cardinal symptoms of motor and vocal tics which emerges in early childhood and fluctuates in severity in later years. To date, the neural basis of TS is not fully understood yet and TS has a long-term prognosis that is difficult to accurately estimate. Few studies have looked at the potential of using diffusion tensor imaging (DTI) in conjunction with machine learning algorithms in order to automate the classification of healthy children and TS children. Here we apply Tract-Based Spatial Statistics (TBSS) method to 44 TS children and 48 age and gender matched healthy children in order to extract the diffusion values from each voxel in the white matter (WM) skeleton, and a feature selection algorithm (ReliefF) was used to select the most salient voxels for subsequent classification with support vector machine (SVM). We use a nested cross validation to yield an unbiased assessment of the classification method and prevent overestimation. The accuracy (88.04%), sensitivity (88.64%) and specificity (87.50%) were achieved in our method as peak performance of the SVM classifier was achieved using the axial diffusion (AD) metric, demonstrating the potential of a joint TBSS and SVM pipeline for fast, objective classification of healthy and TS children. These results support that our methods may be useful for the early identification of subjects with TS, and hold promise for predicting prognosis and treatment outcome for individuals with TS.
Robust vector quantization for noisy channels
NASA Technical Reports Server (NTRS)
Demarca, J. R. B.; Farvardin, N.; Jayant, N. S.; Shoham, Y.
1988-01-01
The paper briefly discusses techniques for making vector quantizers more tolerant to tranmsission errors. Two algorithms are presented for obtaining an efficient binary word assignment to the vector quantizer codewords without increasing the transmission rate. It is shown that about 4.5 dB gain over random assignment can be achieved with these algorithms. It is also proposed to reduce the effects of error propagation in vector-predictive quantizers by appropriately constraining the response of the predictive loop. The constrained system is shown to have about 4 dB of SNR gain over an unconstrained system in a noisy channel, with a small loss of clean-channel performance.
Fast Query-Optimized Kernel-Machine Classification
NASA Technical Reports Server (NTRS)
Mazzoni, Dominic; DeCoste, Dennis
2004-01-01
A recently developed algorithm performs kernel-machine classification via incremental approximate nearest support vectors. The algorithm implements support-vector machines (SVMs) at speeds 10 to 100 times those attainable by use of conventional SVM algorithms. The algorithm offers potential benefits for classification of images, recognition of speech, recognition of handwriting, and diverse other applications in which there are requirements to discern patterns in large sets of data. SVMs constitute a subset of kernel machines (KMs), which have become popular as models for machine learning and, more specifically, for automated classification of input data on the basis of labeled training data. While similar in many ways to k-nearest-neighbors (k-NN) models and artificial neural networks (ANNs), SVMs tend to be more accurate. Using representations that scale only linearly in the numbers of training examples, while exploring nonlinear (kernelized) feature spaces that are exponentially larger than the original input dimensionality, KMs elegantly and practically overcome the classic curse of dimensionality. However, the price that one must pay for the power of KMs is that query-time complexity scales linearly with the number of training examples, making KMs often orders of magnitude more computationally expensive than are ANNs, decision trees, and other popular machine learning alternatives. The present algorithm treats an SVM classifier as a special form of a k-NN. The algorithm is based partly on an empirical observation that one can often achieve the same classification as that of an exact KM by using only small fraction of the nearest support vectors (SVs) of a query. The exact KM output is a weighted sum over the kernel values between the query and the SVs. In this algorithm, the KM output is approximated with a k-NN classifier, the output of which is a weighted sum only over the kernel values involving k selected SVs. Before query time, there are gathered statistics about how misleading the output of the k-NN model can be, relative to the outputs of the exact KM for a representative set of examples, for each possible k from 1 to the total number of SVs. From these statistics, there are derived upper and lower thresholds for each step k. These thresholds identify output levels for which the particular variant of the k-NN model already leans so strongly positively or negatively that a reversal in sign is unlikely, given the weaker SV neighbors still remaining. At query time, the partial output of each query is incrementally updated, stopping as soon as it exceeds the predetermined statistical thresholds of the current step. For an easy query, stopping can occur as early as step k = 1. For more difficult queries, stopping might not occur until nearly all SVs are touched. A key empirical observation is that this approach can tolerate very approximate nearest-neighbor orderings. In experiments, SVs and queries were projected to a subspace comprising the top few principal- component dimensions and neighbor orderings were computed in that subspace. This approach ensured that the overhead of the nearest-neighbor computations was insignificant, relative to that of the exact KM computation.
Spectral analysis of two-signed microarray expression data.
Higham, Desmond J; Kalna, Gabriela; Vass, J Keith
2007-06-01
We give a simple and informative derivation of a spectral algorithm for clustering and reordering complementary DNA microarray expression data. Here, expression levels of a set of genes are recorded simultaneously across a number of samples, with a positive weight reflecting up-regulation and a negative weight reflecting down-regulation. We give theoretical support for the algorithm based on a biologically justified hypothesis about the structure of the data, and illustrate its use on public domain data in the context of unsupervised tumour classification. The algorithm is derived by considering a discrete optimization problem and then relaxing to the continuous realm. We prove that in the case where the data have an inherent 'checkerboard' sign pattern, the algorithm will automatically reveal that pattern. Further, our derivation shows that the algorithm may be regarded as imposing a random graph model on the expression levels and then clustering from a maximum likelihood perspective. This indicates that the output will be tolerant to perturbations and will reveal 'near-checkerboard' patterns when these are present in the data. It is interesting to note that the checkerboard structure is revealed by the first (dominant) singular vectors--previous work on spectral methods has focussed on the case of nonnegative edge weights, where only the second and higher singular vectors are relevant. We illustrate the algorithm on real and synthetic data, and then use it in a tumour classification context on three different cancer data sets. Our results show that respecting the two-signed nature of the data (thereby distinguishing between up-regulation and down-regulation) reveals structures that cannot be gleaned from the absolute value data (where up- and down-regulation are both regarded as 'changes').
Quantum Algorithm for K-Nearest Neighbors Classification Based on the Metric of Hamming Distance
NASA Astrophysics Data System (ADS)
Ruan, Yue; Xue, Xiling; Liu, Heng; Tan, Jianing; Li, Xi
2017-11-01
K-nearest neighbors (KNN) algorithm is a common algorithm used for classification, and also a sub-routine in various complicated machine learning tasks. In this paper, we presented a quantum algorithm (QKNN) for implementing this algorithm based on the metric of Hamming distance. We put forward a quantum circuit for computing Hamming distance between testing sample and each feature vector in the training set. Taking advantage of this method, we realized a good analog for classical KNN algorithm by setting a distance threshold value t to select k - n e a r e s t neighbors. As a result, QKNN achieves O( n 3) performance which is only relevant to the dimension of feature vectors and high classification accuracy, outperforms Llyod's algorithm (Lloyd et al. 2013) and Wiebe's algorithm (Wiebe et al. 2014).
Locally adaptive vector quantization: Data compression with feature preservation
NASA Technical Reports Server (NTRS)
Cheung, K. M.; Sayano, M.
1992-01-01
A study of a locally adaptive vector quantization (LAVQ) algorithm for data compression is presented. This algorithm provides high-speed one-pass compression and is fully adaptable to any data source and does not require a priori knowledge of the source statistics. Therefore, LAVQ is a universal data compression algorithm. The basic algorithm and several modifications to improve performance are discussed. These modifications are nonlinear quantization, coarse quantization of the codebook, and lossless compression of the output. Performance of LAVQ on various images using irreversible (lossy) coding is comparable to that of the Linde-Buzo-Gray algorithm, but LAVQ has a much higher speed; thus this algorithm has potential for real-time video compression. Unlike most other image compression algorithms, LAVQ preserves fine detail in images. LAVQ's performance as a lossless data compression algorithm is comparable to that of Lempel-Ziv-based algorithms, but LAVQ uses far less memory during the coding process.
Lou, Xin Yuan; Sun, Lin Fu
2017-01-01
This paper proposes a new support vector machine (SVM) optimization scheme based on an improved chaotic fly optimization algorithm (FOA) with a mutation strategy to simultaneously perform parameter setting turning for the SVM and feature selection. In the improved FOA, the chaotic particle initializes the fruit fly swarm location and replaces the expression of distance for the fruit fly to find the food source. However, the proposed mutation strategy uses two distinct generative mechanisms for new food sources at the osphresis phase, allowing the algorithm procedure to search for the optimal solution in both the whole solution space and within the local solution space containing the fruit fly swarm location. In an evaluation based on a group of ten benchmark problems, the proposed algorithm’s performance is compared with that of other well-known algorithms, and the results support the superiority of the proposed algorithm. Moreover, this algorithm is successfully applied in a SVM to perform both parameter setting turning for the SVM and feature selection to solve real-world classification problems. This method is called chaotic fruit fly optimization algorithm (CIFOA)-SVM and has been shown to be a more robust and effective optimization method than other well-known methods, particularly in terms of solving the medical diagnosis problem and the credit card problem. PMID:28369096
Efficient enumeration of monocyclic chemical graphs with given path frequencies
2014-01-01
Background The enumeration of chemical graphs (molecular graphs) satisfying given constraints is one of the fundamental problems in chemoinformatics and bioinformatics because it leads to a variety of useful applications including structure determination and development of novel chemical compounds. Results We consider the problem of enumerating chemical graphs with monocyclic structure (a graph structure that contains exactly one cycle) from a given set of feature vectors, where a feature vector represents the frequency of the prescribed paths in a chemical compound to be constructed and the set is specified by a pair of upper and lower feature vectors. To enumerate all tree-like (acyclic) chemical graphs from a given set of feature vectors, Shimizu et al. and Suzuki et al. proposed efficient branch-and-bound algorithms based on a fast tree enumeration algorithm. In this study, we devise a novel method for extending these algorithms to enumeration of chemical graphs with monocyclic structure by designing a fast algorithm for testing uniqueness. The results of computational experiments reveal that the computational efficiency of the new algorithm is as good as those for enumeration of tree-like chemical compounds. Conclusions We succeed in expanding the class of chemical graphs that are able to be enumerated efficiently. PMID:24955135
NASA Astrophysics Data System (ADS)
Dougherty, Andrew W.
Metal oxides are a staple of the sensor industry. The combination of their sensitivity to a number of gases, and the electrical nature of their sensing mechanism, make the particularly attractive in solid state devices. The high temperature stability of the ceramic material also make them ideal for detecting combustion byproducts where exhaust temperatures can be high. However, problems do exist with metal oxide sensors. They are not very selective as they all tend to be sensitive to a number of reduction and oxidation reactions on the oxide's surface. This makes sensors with large numbers of sensors interesting to study as a method for introducing orthogonality to the system. Also, the sensors tend to suffer from long term drift for a number of reasons. In this thesis I will develop a system for intelligently modeling metal oxide sensors and determining their suitability for use in large arrays designed to analyze exhaust gas streams. It will introduce prior knowledge of the metal oxide sensors' response mechanisms in order to produce a response function for each sensor from sparse training data. The system will use the same technique to model and remove any long term drift from the sensor response. It will also provide an efficient means for determining the orthogonality of the sensor to determine whether they are useful in gas sensing arrays. The system is based on least squares support vector regression using the reciprocal kernel. The reciprocal kernel is introduced along with a method of optimizing the free parameters of the reciprocal kernel support vector machine. The reciprocal kernel is shown to be simpler and to perform better than an earlier kernel, the modified reciprocal kernel. Least squares support vector regression is chosen as it uses all of the training points and an emphasis was placed throughout this research for extracting the maximum information from very sparse data. The reciprocal kernel is shown to be effective in modeling the sensor responses in the time, gas and temperature domains, and the dual representation of the support vector regression solution is shown to provide insight into the sensor's sensitivity and potential orthogonality. Finally, the dual weights of the support vector regression solution to the sensor's response are suggested as a fitness function for a genetic algorithm, or some other method for efficiently searching large parameter spaces.
Zou, Lingyun; Wang, Zhengzhi; Huang, Jiaomin
2007-12-01
Subcellular location is one of the key biological characteristics of proteins. Position-specific profiles (PSP) have been introduced as important characteristics of proteins in this article. In this study, to obtain position-specific profiles, the Position Specific Iterative-Basic Local Alignment Search Tool (PSI-BLAST) has been used to search for protein sequences in a database. Position-specific scoring matrices are extracted from the profiles as one class of characteristics. Four-part amino acid compositions and 1st-7th order dipeptide compositions have also been calculated as the other two classes of characteristics. Therefore, twelve characteristic vectors are extracted from each of the protein sequences. Next, the characteristic vectors are weighed by a simple weighing function and inputted into a BP neural network predictor named PSP-Weighted Neural Network (PSP-WNN). The Levenberg-Marquardt algorithm is employed to adjust the weight matrices and thresholds during the network training instead of the error back propagation algorithm. With a jackknife test on the RH2427 dataset, PSP-WNN has achieved a higher overall prediction accuracy of 88.4% rather than the prediction results by the general BP neural network, Markov model, and fuzzy k-nearest neighbors algorithm on this dataset. In addition, the prediction performance of PSP-WNN has been evaluated with a five-fold cross validation test on the PK7579 dataset and the prediction results have been consistently better than those of the previous method on the basis of several support vector machines, using compositions of both amino acids and amino acid pairs. These results indicate that PSP-WNN is a powerful tool for subcellular localization prediction. At the end of the article, influences on prediction accuracy using different weighting proportions among three characteristic vector categories have been discussed. An appropriate proportion is considered by increasing the prediction accuracy.
Reasoning with Vectors: A Continuous Model for Fast Robust Inference.
Widdows, Dominic; Cohen, Trevor
2015-10-01
This paper describes the use of continuous vector space models for reasoning with a formal knowledge base. The practical significance of these models is that they support fast, approximate but robust inference and hypothesis generation, which is complementary to the slow, exact, but sometimes brittle behavior of more traditional deduction engines such as theorem provers. The paper explains the way logical connectives can be used in semantic vector models, and summarizes the development of Predication-based Semantic Indexing, which involves the use of Vector Symbolic Architectures to represent the concepts and relationships from a knowledge base of subject-predicate-object triples. Experiments show that the use of continuous models for formal reasoning is not only possible, but already demonstrably effective for some recognized informatics tasks, and showing promise in other traditional problem areas. Examples described in this paper include: predicting new uses for existing drugs in biomedical informatics; removing unwanted meanings from search results in information retrieval and concept navigation; type-inference from attributes; comparing words based on their orthography; and representing tabular data, including modelling numerical values. The algorithms and techniques described in this paper are all publicly released and freely available in the Semantic Vectors open-source software package.
Reasoning with Vectors: A Continuous Model for Fast Robust Inference
Widdows, Dominic; Cohen, Trevor
2015-01-01
This paper describes the use of continuous vector space models for reasoning with a formal knowledge base. The practical significance of these models is that they support fast, approximate but robust inference and hypothesis generation, which is complementary to the slow, exact, but sometimes brittle behavior of more traditional deduction engines such as theorem provers. The paper explains the way logical connectives can be used in semantic vector models, and summarizes the development of Predication-based Semantic Indexing, which involves the use of Vector Symbolic Architectures to represent the concepts and relationships from a knowledge base of subject-predicate-object triples. Experiments show that the use of continuous models for formal reasoning is not only possible, but already demonstrably effective for some recognized informatics tasks, and showing promise in other traditional problem areas. Examples described in this paper include: predicting new uses for existing drugs in biomedical informatics; removing unwanted meanings from search results in information retrieval and concept navigation; type-inference from attributes; comparing words based on their orthography; and representing tabular data, including modelling numerical values. The algorithms and techniques described in this paper are all publicly released and freely available in the Semantic Vectors open-source software package.1 PMID:26582967
An Evaluation of an Algorithm for Linear Inequalities and Its Applications
NASA Technical Reports Server (NTRS)
Jurgensen, J.
1973-01-01
An algorithm is presented for obtaining a solution alpha to a set of inequalities (A alpha) 0 where A is an N x m-matrix and alpha is an m-vector. If the set of inequalities is consistant, then the algorithm is guaranteed to arrive at a solution in a finite number of steps. Also, if in the iteration, a negative vector is obtained, then the initial set of inequalities is inconsistant, and the iteration is terminated.
A Vectorized ’Nearest-Neighbors’ Algorithm of Order N Using a Monotonic Logical Grid
1985-05-29
Computational Phy’sics 0 4 May 29 , 1985 This work was supported by the Office of Naval Research. . ~ Q~JUN 1719851- * NAVAL RESEARCH LABORATORY * lit...YE ’.ARK:NGS UNCLASSIFIED_______________ _____ K -. A R, ~ CA7,ON 4, 71CO1, 3 :)-S7R,9U-ON AdA,.A3:L ’Y OF REPOR7Io - EC..ASi.’ CA27 ON., DOWNGAZING...Year. Month. Day) 5 PAGE COUNT Interim FROM -____ Toi__ 1985 May 29 50 SuPPILSMENTARY NOTATION This work was supported by the Office of Naval
Hyperspectral recognition of processing tomato early blight based on GA and SVM
NASA Astrophysics Data System (ADS)
Yin, Xiaojun; Zhao, SiFeng
2013-03-01
Processing tomato early blight seriously affect the yield and quality of its.Determine the leaves spectrum of different disease severity level of processing tomato early blight.We take the sensitive bands of processing tomato early blight as support vector machine input vector.Through the genetic algorithm(GA) to optimize the parameters of SVM, We could recognize different disease severity level of processing tomato early blight.The result show:the sensitive bands of different disease severity levels of processing tomato early blight is 628-643nm and 689-692nm.The sensitive bands are as the GA and SVM input vector.We get the best penalty parameters is 0.129 and kernel function parameters is 3.479.We make classification training and testing by polynomial nuclear,radial basis function nuclear,Sigmoid nuclear.The best classification model is the radial basis function nuclear of SVM. Training accuracy is 84.615%,Testing accuracy is 80.681%.It is combined GA and SVM to achieve multi-classification of processing tomato early blight.It is provided the technical support of prediction processing tomato early blight occurrence, development and diffusion rule in large areas.
Reduction from cost-sensitive ordinal ranking to weighted binary classification.
Lin, Hsuan-Tien; Li, Ling
2012-05-01
We present a reduction framework from ordinal ranking to binary classification. The framework consists of three steps: extracting extended examples from the original examples, learning a binary classifier on the extended examples with any binary classification algorithm, and constructing a ranker from the binary classifier. Based on the framework, we show that a weighted 0/1 loss of the binary classifier upper-bounds the mislabeling cost of the ranker, both error-wise and regret-wise. Our framework allows not only the design of good ordinal ranking algorithms based on well-tuned binary classification approaches, but also the derivation of new generalization bounds for ordinal ranking from known bounds for binary classification. In addition, our framework unifies many existing ordinal ranking algorithms, such as perceptron ranking and support vector ordinal regression. When compared empirically on benchmark data sets, some of our newly designed algorithms enjoy advantages in terms of both training speed and generalization performance over existing algorithms. In addition, the newly designed algorithms lead to better cost-sensitive ordinal ranking performance, as well as improved listwise ranking performance.
Horizontal vectorization of electron repulsion integrals.
Pritchard, Benjamin P; Chow, Edmond
2016-10-30
We present an efficient implementation of the Obara-Saika algorithm for the computation of electron repulsion integrals that utilizes vector intrinsics to calculate several primitive integrals concurrently in a SIMD vector. Initial benchmarks display a 2-4 times speedup with AVX instructions over comparable scalar code, depending on the basis set. Speedup over scalar code is found to be sensitive to the level of contraction of the basis set, and is best for (lAlB|lClD) quartets when lD = 0 or lB=lD=0, which makes such a vectorization scheme particularly suitable for density fitting. The basic Obara-Saika algorithm, how it is vectorized, and the performance bottlenecks are analyzed and discussed. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
A RLS-SVM Aided Fusion Methodology for INS during GPS Outages
Yao, Yiqing; Xu, Xiaosu
2017-01-01
In order to maintain a relatively high accuracy of navigation performance during global positioning system (GPS) outages, a novel robust least squares support vector machine (LS-SVM)-aided fusion methodology is explored to provide the pseudo-GPS position information for the inertial navigation system (INS). The relationship between the yaw, specific force, velocity, and the position increment is modeled. Rather than share the same weight in the traditional LS-SVM, the proposed algorithm allocates various weights for different data, which makes the system immune to the outliers. Field test data was collected to evaluate the proposed algorithm. The comparison results indicate that the proposed algorithm can effectively provide position corrections for standalone INS during the 300 s GPS outage, which outperforms the traditional LS-SVM method. Historical information is also involved to better represent the vehicle dynamics. PMID:28245549
A RLS-SVM Aided Fusion Methodology for INS during GPS Outages.
Yao, Yiqing; Xu, Xiaosu
2017-02-24
In order to maintain a relatively high accuracy of navigation performance during global positioning system (GPS) outages, a novel robust least squares support vector machine (LS-SVM)-aided fusion methodology is explored to provide the pseudo-GPS position information for the inertial navigation system (INS). The relationship between the yaw, specific force, velocity, and the position increment is modeled. Rather than share the same weight in the traditional LS-SVM, the proposed algorithm allocates various weights for different data, which makes the system immune to the outliers. Field test data was collected to evaluate the proposed algorithm. The comparison results indicate that the proposed algorithm can effectively provide position corrections for standalone INS during the 300 s GPS outage, which outperforms the traditional LS-SVM method. Historical information is also involved to better represent the vehicle dynamics.
Detection of Partial Discharge Sources Using UHF Sensors and Blind Signal Separation
Boya, Carlos; Parrado-Hernández, Emilio
2017-01-01
The measurement of the emitted electromagnetic energy in the UHF region of the spectrum allows the detection of partial discharges and, thus, the on-line monitoring of the condition of the insulation of electrical equipment. Unfortunately, determining the affected asset is difficult when there are several simultaneous insulation defects. This paper proposes the use of an independent component analysis (ICA) algorithm to separate the signals coming from different partial discharge (PD) sources. The performance of the algorithm has been tested using UHF signals generated by test objects. The results are validated by two automatic classification techniques: support vector machines and similarity with class mean. Both methods corroborate the suitability of the algorithm to separate the signals emitted by each PD source even when they are generated by the same type of insulation defect. PMID:29140267
Semisupervised learning using Bayesian interpretation: application to LS-SVM.
Adankon, Mathias M; Cheriet, Mohamed; Biem, Alain
2011-04-01
Bayesian reasoning provides an ideal basis for representing and manipulating uncertain knowledge, with the result that many interesting algorithms in machine learning are based on Bayesian inference. In this paper, we use the Bayesian approach with one and two levels of inference to model the semisupervised learning problem and give its application to the successful kernel classifier support vector machine (SVM) and its variant least-squares SVM (LS-SVM). Taking advantage of Bayesian interpretation of LS-SVM, we develop a semisupervised learning algorithm for Bayesian LS-SVM using our approach based on two levels of inference. Experimental results on both artificial and real pattern recognition problems show the utility of our method.
A study of speech emotion recognition based on hybrid algorithm
NASA Astrophysics Data System (ADS)
Zhu, Ju-xia; Zhang, Chao; Lv, Zhao; Rao, Yao-quan; Wu, Xiao-pei
2011-10-01
To effectively improve the recognition accuracy of the speech emotion recognition system, a hybrid algorithm which combines Continuous Hidden Markov Model (CHMM), All-Class-in-One Neural Network (ACON) and Support Vector Machine (SVM) is proposed. In SVM and ACON methods, some global statistics are used as emotional features, while in CHMM method, instantaneous features are employed. The recognition rate by the proposed method is 92.25%, with the rejection rate to be 0.78%. Furthermore, it obtains the relative increasing of 8.53%, 4.69% and 0.78% compared with ACON, CHMM and SVM methods respectively. The experiment result confirms the efficiency of distinguishing anger, happiness, neutral and sadness emotional states.
Classification of large-sized hyperspectral imagery using fast machine learning algorithms
NASA Astrophysics Data System (ADS)
Xia, Junshi; Yokoya, Naoto; Iwasaki, Akira
2017-07-01
We present a framework of fast machine learning algorithms in the context of large-sized hyperspectral images classification from the theoretical to a practical viewpoint. In particular, we assess the performance of random forest (RF), rotation forest (RoF), and extreme learning machine (ELM) and the ensembles of RF and ELM. These classifiers are applied to two large-sized hyperspectral images and compared to the support vector machines. To give the quantitative analysis, we pay attention to comparing these methods when working with high input dimensions and a limited/sufficient training set. Moreover, other important issues such as the computational cost and robustness against the noise are also discussed.
A Distributed Wireless Camera System for the Management of Parking Spaces.
Vítek, Stanislav; Melničuk, Petr
2017-12-28
The importance of detection of parking space availability is still growing, particularly in major cities. This paper deals with the design of a distributed wireless camera system for the management of parking spaces, which can determine occupancy of the parking space based on the information from multiple cameras. The proposed system uses small camera modules based on Raspberry Pi Zero and computationally efficient algorithm for the occupancy detection based on the histogram of oriented gradients (HOG) feature descriptor and support vector machine (SVM) classifier. We have included information about the orientation of the vehicle as a supporting feature, which has enabled us to achieve better accuracy. The described solution can deliver occupancy information at the rate of 10 parking spaces per second with more than 90% accuracy in a wide range of conditions. Reliability of the implemented algorithm is evaluated with three different test sets which altogether contain over 700,000 samples of parking spaces.
An expert support system for breast cancer diagnosis using color wavelet features.
Issac Niwas, S; Palanisamy, P; Chibbar, Rajni; Zhang, W J
2012-10-01
Breast cancer diagnosis can be done through the pathologic assessments of breast tissue samples such as core needle biopsy technique. The result of analysis on this sample by pathologist is crucial for breast cancer patient. In this paper, nucleus of tissue samples are investigated after decomposition by means of the Log-Gabor wavelet on HSV color domain and an algorithm is developed to compute the color wavelet features. These features are used for breast cancer diagnosis using Support Vector Machine (SVM) classifier algorithm. The ability of properly trained SVM is to correctly classify patterns and make them particularly suitable for use in an expert system that aids in the diagnosis of cancer tissue samples. The results are compared with other multivariate classifiers such as Naïves Bayes classifier and Artificial Neural Network. The overall accuracy of the proposed method using SVM classifier will be further useful for automation in cancer diagnosis.
Multiple Ordinal Regression by Maximizing the Sum of Margins
Hamsici, Onur C.; Martinez, Aleix M.
2016-01-01
Human preferences are usually measured using ordinal variables. A system whose goal is to estimate the preferences of humans and their underlying decision mechanisms requires to learn the ordering of any given sample set. We consider the solution of this ordinal regression problem using a Support Vector Machine algorithm. Specifically, the goal is to learn a set of classifiers with common direction vectors and different biases correctly separating the ordered classes. Current algorithms are either required to solve a quadratic optimization problem, which is computationally expensive, or are based on maximizing the minimum margin (i.e., a fixed margin strategy) between a set of hyperplanes, which biases the solution to the closest margin. Another drawback of these strategies is that they are limited to order the classes using a single ranking variable (e.g., perceived length). In this paper, we define a multiple ordinal regression algorithm based on maximizing the sum of the margins between every consecutive class with respect to one or more rankings (e.g., perceived length and weight). We provide derivations of an efficient, easy-to-implement iterative solution using a Sequential Minimal Optimization procedure. We demonstrate the accuracy of our solutions in several datasets. In addition, we provide a key application of our algorithms in estimating human subjects’ ordinal classification of attribute associations to object categories. We show that these ordinal associations perform better than the binary one typically employed in the literature. PMID:26529784
Artificial Potential Field Controllers for Robust Communications in a Network of Swarm Robots
2005-05-18
vectors are less than 90◦ apart. Algorithm 1 The Algorithm for generating a feasible set of vectors P ← set of high priority vectors Csum ← [( LOS1 +R1...the 46 C program was finished reading and writing the values to the serial line it would delete the timing file. Only after the timing file had been... deleted would the base station write new values for the wheel velocities. The timing file kept both the Linux PC and the base station synchronized so
NASA Technical Reports Server (NTRS)
Mullenmeister, Paul
1988-01-01
The quasi-geostrophic omega-equation in flux form is developed as an example of a Poisson problem over a spherical shell. Solutions of this equation are obtained by applying a two-parameter Chebyshev solver in vector layout for CDC 200 series computers. The performance of this vectorized algorithm greatly exceeds the performance of its scalar analog. The algorithm generates solutions of the omega-equation which are compared with the omega fields calculated with the aid of the mass continuity equation.
Arbitrary norm support vector machines.
Huang, Kaizhu; Zheng, Danian; King, Irwin; Lyu, Michael R
2009-02-01
Support vector machines (SVM) are state-of-the-art classifiers. Typically L2-norm or L1-norm is adopted as a regularization term in SVMs, while other norm-based SVMs, for example, the L0-norm SVM or even the L(infinity)-norm SVM, are rarely seen in the literature. The major reason is that L0-norm describes a discontinuous and nonconvex term, leading to a combinatorially NP-hard optimization problem. In this letter, motivated by Bayesian learning, we propose a novel framework that can implement arbitrary norm-based SVMs in polynomial time. One significant feature of this framework is that only a sequence of sequential minimal optimization problems needs to be solved, thus making it practical in many real applications. The proposed framework is important in the sense that Bayesian priors can be efficiently plugged into most learning methods without knowing the explicit form. Hence, this builds a connection between Bayesian learning and the kernel machines. We derive the theoretical framework, demonstrate how our approach works on the L0-norm SVM as a typical example, and perform a series of experiments to validate its advantages. Experimental results on nine benchmark data sets are very encouraging. The implemented L0-norm is competitive with or even better than the standard L2-norm SVM in terms of accuracy but with a reduced number of support vectors, -9.46% of the number on average. When compared with another sparse model, the relevance vector machine, our proposed algorithm also demonstrates better sparse properties with a training speed over seven times faster.
Yu, Yang; Niederleithinger, Ernst; Li, Jianchun; Wiggenhauser, Herbert
2017-01-01
This paper presents a novel non-destructive testing and health monitoring system using a network of tactile transducers and accelerometers for the condition assessment and damage classification of foundation piles and utility poles. While in traditional pile integrity testing an impact hammer with broadband frequency excitation is typically used, the proposed testing system utilizes an innovative excitation system based on a network of tactile transducers to induce controlled narrow-band frequency stress waves. Thereby, the simultaneous excitation of multiple stress wave types and modes is avoided (or at least reduced), and targeted wave forms can be generated. The new testing system enables the testing and monitoring of foundation piles and utility poles where the top is inaccessible, making the new testing system suitable, for example, for the condition assessment of pile structures with obstructed heads and of poles with live wires. For system validation, the new system was experimentally tested on nine timber and concrete poles that were inflicted with several types of damage. The tactile transducers were excited with continuous sine wave signals of 1 kHz frequency. Support vector machines were employed together with advanced signal processing algorithms to distinguish recorded stress wave signals from pole structures with different types of damage. The results show that using fast Fourier transform signals, combined with principal component analysis as the input feature vector for support vector machine (SVM) classifiers with different kernel functions, can achieve damage classification with accuracies of 92.5% ± 7.5%. PMID:29258274
Automated detection of pulmonary nodules in CT images with support vector machines
NASA Astrophysics Data System (ADS)
Liu, Lu; Liu, Wanyu; Sun, Xiaoming
2008-10-01
Many methods have been proposed to avoid radiologists fail to diagnose small pulmonary nodules. Recently, support vector machines (SVMs) had received an increasing attention for pattern recognition. In this paper, we present a computerized system aimed at pulmonary nodules detection; it identifies the lung field, extracts a set of candidate regions with a high sensitivity ratio and then classifies candidates by the use of SVMs. The Computer Aided Diagnosis (CAD) system presented in this paper supports the diagnosis of pulmonary nodules from Computed Tomography (CT) images as inflammation, tuberculoma, granuloma..sclerosing hemangioma, and malignant tumor. Five texture feature sets were extracted for each lesion, while a genetic algorithm based feature selection method was applied to identify the most robust features. The selected feature set was fed into an ensemble of SVMs classifiers. The achieved classification performance was 100%, 92.75% and 90.23% in the training, validation and testing set, respectively. It is concluded that computerized analysis of medical images in combination with artificial intelligence can be used in clinical practice and may contribute to more efficient diagnosis.
Maximum Margin Clustering of Hyperspectral Data
NASA Astrophysics Data System (ADS)
Niazmardi, S.; Safari, A.; Homayouni, S.
2013-09-01
In recent decades, large margin methods such as Support Vector Machines (SVMs) are supposed to be the state-of-the-art of supervised learning methods for classification of hyperspectral data. However, the results of these algorithms mainly depend on the quality and quantity of available training data. To tackle down the problems associated with the training data, the researcher put effort into extending the capability of large margin algorithms for unsupervised learning. One of the recent proposed algorithms is Maximum Margin Clustering (MMC). The MMC is an unsupervised SVMs algorithm that simultaneously estimates both the labels and the hyperplane parameters. Nevertheless, the optimization of the MMC algorithm is a non-convex problem. Most of the existing MMC methods rely on the reformulating and the relaxing of the non-convex optimization problem as semi-definite programs (SDP), which are computationally very expensive and only can handle small data sets. Moreover, most of these algorithms are two-class classification, which cannot be used for classification of remotely sensed data. In this paper, a new MMC algorithm is used that solve the original non-convex problem using Alternative Optimization method. This algorithm is also extended for multi-class classification and its performance is evaluated. The results of the proposed algorithm show that the algorithm has acceptable results for hyperspectral data clustering.
A nowcasting technique based on application of the particle filter blending algorithm
NASA Astrophysics Data System (ADS)
Chen, Yuanzhao; Lan, Hongping; Chen, Xunlai; Zhang, Wenhai
2017-10-01
To improve the accuracy of nowcasting, a new extrapolation technique called particle filter blending was configured in this study and applied to experimental nowcasting. Radar echo extrapolation was performed by using the radar mosaic at an altitude of 2.5 km obtained from the radar images of 12 S-band radars in Guangdong Province, China. The first bilateral filter was applied in the quality control of the radar data; an optical flow method based on the Lucas-Kanade algorithm and the Harris corner detection algorithm were used to track radar echoes and retrieve the echo motion vectors; then, the motion vectors were blended with the particle filter blending algorithm to estimate the optimal motion vector of the true echo motions; finally, semi-Lagrangian extrapolation was used for radar echo extrapolation based on the obtained motion vector field. A comparative study of the extrapolated forecasts of four precipitation events in 2016 in Guangdong was conducted. The results indicate that the particle filter blending algorithm could realistically reproduce the spatial pattern, echo intensity, and echo location at 30- and 60-min forecast lead times. The forecasts agreed well with observations, and the results were of operational significance. Quantitative evaluation of the forecasts indicates that the particle filter blending algorithm performed better than the cross-correlation method and the optical flow method. Therefore, the particle filter blending method is proved to be superior to the traditional forecasting methods and it can be used to enhance the ability of nowcasting in operational weather forecasts.
Parallel and fault-tolerant algorithms for hypercube multiprocessors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aykanat, C.
1988-01-01
Several techniques for increasing the performance of parallel algorithms on distributed-memory message-passing multi-processor systems are investigated. These techniques are effectively implemented for the parallelization of the Scaled Conjugate Gradient (SCG) algorithm on a hypercube connected message-passing multi-processor. Significant performance improvement is achieved by using these techniques. The SCG algorithm is used for the solution phase of an FE modeling system. Almost linear speed-up is achieved, and it is shown that hypercube topology is scalable for an FE class of problem. The SCG algorithm is also shown to be suitable for vectorization, and near supercomputer performance is achieved on a vectormore » hypercube multiprocessor by exploiting both parallelization and vectorization. Fault-tolerance issues for the parallel SCG algorithm and for the hypercube topology are also addressed.« less
Quantum and electromagnetic propagation with the conjugate symmetric Lanczos method.
Acevedo, Ramiro; Lombardini, Richard; Turner, Matthew A; Kinsey, James L; Johnson, Bruce R
2008-02-14
The conjugate symmetric Lanczos (CSL) method is introduced for the solution of the time-dependent Schrodinger equation. This remarkably simple and efficient time-domain algorithm is a low-order polynomial expansion of the quantum propagator for time-independent Hamiltonians and derives from the time-reversal symmetry of the Schrodinger equation. The CSL algorithm gives forward solutions by simply complex conjugating backward polynomial expansion coefficients. Interestingly, the expansion coefficients are the same for each uniform time step, a fact that is only spoiled by basis incompleteness and finite precision. This is true for the Krylov basis and, with further investigation, is also found to be true for the Lanczos basis, important for efficient orthogonal projection-based algorithms. The CSL method errors roughly track those of the short iterative Lanczos method while requiring fewer matrix-vector products than the Chebyshev method. With the CSL method, only a few vectors need to be stored at a time, there is no need to estimate the Hamiltonian spectral range, and only matrix-vector and vector-vector products are required. Applications using localized wavelet bases are made to harmonic oscillator and anharmonic Morse oscillator systems as well as electrodynamic pulse propagation using the Hamiltonian form of Maxwell's equations. For gold with a Drude dielectric function, the latter is non-Hermitian, requiring consideration of corrections to the CSL algorithm.
O'Boyle, Noel M; Palmer, David S; Nigsch, Florian; Mitchell, John BO
2008-01-01
Background We present a novel feature selection algorithm, Winnowing Artificial Ant Colony (WAAC), that performs simultaneous feature selection and model parameter optimisation for the development of predictive quantitative structure-property relationship (QSPR) models. The WAAC algorithm is an extension of the modified ant colony algorithm of Shen et al. (J Chem Inf Model 2005, 45: 1024–1029). We test the ability of the algorithm to develop a predictive partial least squares model for the Karthikeyan dataset (J Chem Inf Model 2005, 45: 581–590) of melting point values. We also test its ability to perform feature selection on a support vector machine model for the same dataset. Results Starting from an initial set of 203 descriptors, the WAAC algorithm selected a PLS model with 68 descriptors which has an RMSE on an external test set of 46.6°C and R2 of 0.51. The number of components chosen for the model was 49, which was close to optimal for this feature selection. The selected SVM model has 28 descriptors (cost of 5, ε of 0.21) and an RMSE of 45.1°C and R2 of 0.54. This model outperforms a kNN model (RMSE of 48.3°C, R2 of 0.47) for the same data and has similar performance to a Random Forest model (RMSE of 44.5°C, R2 of 0.55). However it is much less prone to bias at the extremes of the range of melting points as shown by the slope of the line through the residuals: -0.43 for WAAC/SVM, -0.53 for Random Forest. Conclusion With a careful choice of objective function, the WAAC algorithm can be used to optimise machine learning and regression models that suffer from overfitting. Where model parameters also need to be tuned, as is the case with support vector machine and partial least squares models, it can optimise these simultaneously. The moving probabilities used by the algorithm are easily interpreted in terms of the best and current models of the ants, and the winnowing procedure promotes the removal of irrelevant descriptors. PMID:18959785
Attitude determination using vector observations: A fast optimal matrix algorithm
NASA Technical Reports Server (NTRS)
Markley, F. Landis
1993-01-01
The attitude matrix minimizing Wahba's loss function is computed directly by a method that is competitive with the fastest known algorithm for finding this optimal estimate. The method also provides an estimate of the attitude error covariance matrix. Analysis of the special case of two vector observations identifies those cases for which the TRIAD or algebraic method minimizes Wahba's loss function.
Applying an efficient K-nearest neighbor search to forest attribute imputation
Andrew O. Finley; Ronald E. McRoberts; Alan R. Ek
2006-01-01
This paper explores the utility of an efficient nearest neighbor (NN) search algorithm for applications in multi-source kNN forest attribute imputation. The search algorithm reduces the number of distance calculations between a given target vector and each reference vector, thereby, decreasing the time needed to discover the NN subset. Results of five trials show gains...
Vectorial mask optimization methods for robust optical lithography
NASA Astrophysics Data System (ADS)
Ma, Xu; Li, Yanqiu; Guo, Xuejia; Dong, Lisong; Arce, Gonzalo R.
2012-10-01
Continuous shrinkage of critical dimension in an integrated circuit impels the development of resolution enhancement techniques for low k1 lithography. Recently, several pixelated optical proximity correction (OPC) and phase-shifting mask (PSM) approaches were developed under scalar imaging models to account for the process variations. However, the lithography systems with larger-NA (NA>0.6) are predominant for current technology nodes, rendering the scalar models inadequate to describe the vector nature of the electromagnetic field that propagates through the optical lithography system. In addition, OPC and PSM algorithms based on scalar models can compensate for wavefront aberrations, but are incapable of mitigating polarization aberrations in practical lithography systems, which can only be dealt with under the vector model. To this end, we focus on developing robust pixelated gradient-based OPC and PSM optimization algorithms aimed at canceling defocus, dose variation, wavefront and polarization aberrations under a vector model. First, an integrative and analytic vector imaging model is applied to formulate the optimization problem, where the effects of process variations are explicitly incorporated in the optimization framework. A steepest descent algorithm is then used to iteratively optimize the mask patterns. Simulations show that the proposed algorithms can effectively improve the process windows of the optical lithography systems.
Limits on the Efficiency of Event-Based Algorithms for Monte Carlo Neutron Transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
Romano, Paul K.; Siegel, Andrew R.
The traditional form of parallelism in Monte Carlo particle transport simulations, wherein each individual particle history is considered a unit of work, does not lend itself well to data-level parallelism. Event-based algorithms, which were originally used for simulations on vector processors, may offer a path toward better utilizing data-level parallelism in modern computer architectures. In this study, a simple model is developed for estimating the efficiency of the event-based particle transport algorithm under two sets of assumptions. Data collected from simulations of four reactor problems using OpenMC was then used in conjunction with the models to calculate the speedup duemore » to vectorization as a function of the size of the particle bank and the vector width. When each event type is assumed to have constant execution time, the achievable speedup is directly related to the particle bank size. We observed that the bank size generally needs to be at least 20 times greater than vector size to achieve vector efficiency greater than 90%. Lastly, when the execution times for events are allowed to vary, the vector speedup is also limited by differences in execution time for events being carried out in a single event-iteration.« less
Limits on the Efficiency of Event-Based Algorithms for Monte Carlo Neutron Transport
Romano, Paul K.; Siegel, Andrew R.
2017-07-01
The traditional form of parallelism in Monte Carlo particle transport simulations, wherein each individual particle history is considered a unit of work, does not lend itself well to data-level parallelism. Event-based algorithms, which were originally used for simulations on vector processors, may offer a path toward better utilizing data-level parallelism in modern computer architectures. In this study, a simple model is developed for estimating the efficiency of the event-based particle transport algorithm under two sets of assumptions. Data collected from simulations of four reactor problems using OpenMC was then used in conjunction with the models to calculate the speedup duemore » to vectorization as a function of the size of the particle bank and the vector width. When each event type is assumed to have constant execution time, the achievable speedup is directly related to the particle bank size. We observed that the bank size generally needs to be at least 20 times greater than vector size to achieve vector efficiency greater than 90%. Lastly, when the execution times for events are allowed to vary, the vector speedup is also limited by differences in execution time for events being carried out in a single event-iteration.« less
Duan, Li; Guo, Long; Liu, Ke; Liu, E-Hu; Li, Ping
2014-04-25
Citrus herbs have been widely used in traditional medicine and cuisine in China and other countries since the ancient time. However, the authentication and quality control of Citrus herbs has always been a challenging task due to their similar morphological characteristics and the diversity of the multi-components existed in the complicated matrix. In the present investigation, we developed a novel strategy to characterize and classify seven Citrus herbs based on chromatographic analysis and chemometric methods. Firstly, the chemical constituents in seven Citrus herbs were globally characterized by liquid chromatography combined with quadrupole time-of-flight mass spectrometry (LC-QTOF-MS). Based on their retention time, UV spectra and MS fragmentation behavior, a total of 75 compounds were identified or tentatively characterized in these herbal medicines. Secondly, a segmental monitoring method based on LC-variable wavelength detection was developed for simultaneous quantification of ten marker compounds in these Citrus herbs. Thirdly, based on the contents of the ten analytes, genetic algorithm optimized support vector machines (GA-SVM) was employed to differentiate and classify the 64 samples covering these seven herbs. The obtained classifier showed good prediction performance and the overall prediction accuracy reached 96.88%. The proposed strategy is expected to provide new insight for authentication and quality control of traditional herbs. Copyright © 2014 Elsevier B.V. All rights reserved.
Liang, Ja-Der; Ping, Xiao-Ou; Tseng, Yi-Ju; Huang, Guan-Tarn; Lai, Feipei; Yang, Pei-Ming
2014-12-01
Recurrence of hepatocellular carcinoma (HCC) is an important issue despite effective treatments with tumor eradication. Identification of patients who are at high risk for recurrence may provide more efficacious screening and detection of tumor recurrence. The aim of this study was to develop recurrence predictive models for HCC patients who received radiofrequency ablation (RFA) treatment. From January 2007 to December 2009, 83 newly diagnosed HCC patients receiving RFA as their first treatment were enrolled. Five feature selection methods including genetic algorithm (GA), simulated annealing (SA) algorithm, random forests (RF) and hybrid methods (GA+RF and SA+RF) were utilized for selecting an important subset of features from a total of 16 clinical features. These feature selection methods were combined with support vector machine (SVM) for developing predictive models with better performance. Five-fold cross-validation was used to train and test SVM models. The developed SVM-based predictive models with hybrid feature selection methods and 5-fold cross-validation had averages of the sensitivity, specificity, accuracy, positive predictive value, negative predictive value, and area under the ROC curve as 67%, 86%, 82%, 69%, 90%, and 0.69, respectively. The SVM derived predictive model can provide suggestive high-risk recurrent patients, who should be closely followed up after complete RFA treatment. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Improved RMR Rock Mass Classification Using Artificial Intelligence Algorithms
NASA Astrophysics Data System (ADS)
Gholami, Raoof; Rasouli, Vamegh; Alimoradi, Andisheh
2013-09-01
Rock mass classification systems such as rock mass rating (RMR) are very reliable means to provide information about the quality of rocks surrounding a structure as well as to propose suitable support systems for unstable regions. Many correlations have been proposed to relate measured quantities such as wave velocity to rock mass classification systems to limit the associated time and cost of conducting the sampling and mechanical tests conventionally used to calculate RMR values. However, these empirical correlations have been found to be unreliable, as they usually overestimate or underestimate the RMR value. The aim of this paper is to compare the results of RMR classification obtained from the use of empirical correlations versus machine-learning methodologies based on artificial intelligence algorithms. The proposed methods were verified based on two case studies located in northern Iran. Relevance vector regression (RVR) and support vector regression (SVR), as two robust machine-learning methodologies, were used to predict the RMR for tunnel host rocks. RMR values already obtained by sampling and site investigation at one tunnel were taken into account as the output of the artificial networks during training and testing phases. The results reveal that use of empirical correlations overestimates the predicted RMR values. RVR and SVR, however, showed more reliable results, and are therefore suggested for use in RMR classification for design purposes of rock structures.
Pre-operative prediction of surgical morbidity in children: comparison of five statistical models.
Cooper, Jennifer N; Wei, Lai; Fernandez, Soledad A; Minneci, Peter C; Deans, Katherine J
2015-02-01
The accurate prediction of surgical risk is important to patients and physicians. Logistic regression (LR) models are typically used to estimate these risks. However, in the fields of data mining and machine-learning, many alternative classification and prediction algorithms have been developed. This study aimed to compare the performance of LR to several data mining algorithms for predicting 30-day surgical morbidity in children. We used the 2012 National Surgical Quality Improvement Program-Pediatric dataset to compare the performance of (1) a LR model that assumed linearity and additivity (simple LR model) (2) a LR model incorporating restricted cubic splines and interactions (flexible LR model) (3) a support vector machine, (4) a random forest and (5) boosted classification trees for predicting surgical morbidity. The ensemble-based methods showed significantly higher accuracy, sensitivity, specificity, PPV, and NPV than the simple LR model. However, none of the models performed better than the flexible LR model in terms of the aforementioned measures or in model calibration or discrimination. Support vector machines, random forests, and boosted classification trees do not show better performance than LR for predicting pediatric surgical morbidity. After further validation, the flexible LR model derived in this study could be used to assist with clinical decision-making based on patient-specific surgical risks. Copyright © 2014 Elsevier Ltd. All rights reserved.
Yu, Jessica S; Pertusi, Dante A; Adeniran, Adebola V; Tyo, Keith E J
2017-03-15
High throughput screening by fluorescence activated cell sorting (FACS) is a common task in protein engineering and directed evolution. It can also be a rate-limiting step if high false positive or negative rates necessitate multiple rounds of enrichment. Current FACS software requires the user to define sorting gates by intuition and is practically limited to two dimensions. In cases when multiple rounds of enrichment are required, the software cannot forecast the enrichment effort required. We have developed CellSort, a support vector machine (SVM) algorithm that identifies optimal sorting gates based on machine learning using positive and negative control populations. CellSort can take advantage of more than two dimensions to enhance the ability to distinguish between populations. We also present a Bayesian approach to predict the number of sorting rounds required to enrich a population from a given library size. This Bayesian approach allowed us to determine strategies for biasing the sorting gates in order to reduce the required number of enrichment rounds. This algorithm should be generally useful for improve sorting outcomes and reducing effort when using FACS. Source code available at http://tyolab.northwestern.edu/tools/ . k-tyo@northwestern.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
A communication-avoiding, hybrid-parallel, rank-revealing orthogonalization method.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hoemmen, Mark
2010-11-01
Orthogonalization consumes much of the run time of many iterative methods for solving sparse linear systems and eigenvalue problems. Commonly used algorithms, such as variants of Gram-Schmidt or Householder QR, have performance dominated by communication. Here, 'communication' includes both data movement between the CPU and memory, and messages between processors in parallel. Our Tall Skinny QR (TSQR) family of algorithms requires asymptotically fewer messages between processors and data movement between CPU and memory than typical orthogonalization methods, yet achieves the same accuracy as Householder QR factorization. Furthermore, in block orthogonalizations, TSQR is faster and more accurate than existing approaches formore » orthogonalizing the vectors within each block ('normalization'). TSQR's rank-revealing capability also makes it useful for detecting deflation in block iterative methods, for which existing approaches sacrifice performance, accuracy, or both. We have implemented a version of TSQR that exploits both distributed-memory and shared-memory parallelism, and supports real and complex arithmetic. Our implementation is optimized for the case of orthogonalizing a small number (5-20) of very long vectors. The shared-memory parallel component uses Intel's Threading Building Blocks, though its modular design supports other shared-memory programming models as well, including computation on the GPU. Our implementation achieves speedups of 2 times or more over competing orthogonalizations. It is available now in the development branch of the Trilinos software package, and will be included in the 10.8 release.« less
Torija, Antonio J; Ruiz, Diego P; Ramos-Ridao, Angel F
2014-06-01
To ensure appropriate soundscape management in urban environments, the urban-planning authorities need a range of tools that enable such a task to be performed. An essential step during the management of urban areas from a sound standpoint should be the evaluation of the soundscape in such an area. In this sense, it has been widely acknowledged that a subjective and acoustical categorization of a soundscape is the first step to evaluate it, providing a basis for designing or adapting it to match people's expectations as well. In this sense, this work proposes a model for automatic classification of urban soundscapes. This model is intended for the automatic classification of urban soundscapes based on underlying acoustical and perceptual criteria. Thus, this classification model is proposed to be used as a tool for a comprehensive urban soundscape evaluation. Because of the great complexity associated with the problem, two machine learning techniques, Support Vector Machines (SVM) and Support Vector Machines trained with Sequential Minimal Optimization (SMO), are implemented in developing model classification. The results indicate that the SMO model outperforms the SVM model in the specific task of soundscape classification. With the implementation of the SMO algorithm, the classification model achieves an outstanding performance (91.3% of instances correctly classified). © 2013 Elsevier B.V. All rights reserved.
Jorge-Botana, Guillermo; Olmos, Ricardo; León, José Antonio
2009-11-01
There is currently a widespread interest in indexing and extracting taxonomic information from large text collections. An example is the automatic categorization of informally written medical or psychological diagnoses, followed by the extraction of epidemiological information or even terms and structures needed to formulate guiding questions as an heuristic tool for helping doctors. Vector space models have been successfully used to this end (Lee, Cimino, Zhu, Sable, Shanker, Ely & Yu, 2006; Pakhomov, Buntrock & Chute, 2006). In this study we use a computational model known as Latent Semantic Analysis (LSA) on a diagnostic corpus with the aim of retrieving definitions (in the form of lists of semantic neighbors) of common structures it contains (e.g. "storm phobia", "dog phobia") or less common structures that might be formed by logical combinations of categories and diagnostic symptoms (e.g. "gun personality" or "germ personality"). In the quest to bring definitions into line with the meaning of structures and make them in some way representative, various problems commonly arise while recovering content using vector space models. We propose some approaches which bypass these problems, such as Kintsch's (2001) predication algorithm and some corrections to the way lists of neighbors are obtained, which have already been tested on semantic spaces in a non-specific domain (Jorge-Botana, León, Olmos & Hassan-Montero, under review). The results support the idea that the predication algorithm may also be useful for extracting more precise meanings of certain structures from scientific corpora, and that the introduction of some corrections based on vector length may increases its efficiency on non-representative terms.
NASA Astrophysics Data System (ADS)
Wang, L.; Wang, T. G.; Wu, J. H.; Cheng, G. P.
2016-09-01
A novel multi-objective optimization algorithm incorporating evolution strategies and vector mechanisms, referred as VD-MOEA, is proposed and applied in aerodynamic- structural integrated design of wind turbine blade. In the algorithm, a set of uniformly distributed vectors is constructed to guide population in moving forward to the Pareto front rapidly and maintain population diversity with high efficiency. For example, two- and three- objective designs of 1.5MW wind turbine blade are subsequently carried out for the optimization objectives of maximum annual energy production, minimum blade mass, and minimum extreme root thrust. The results show that the Pareto optimal solutions can be obtained in one single simulation run and uniformly distributed in the objective space, maximally maintaining the population diversity. In comparison to conventional evolution algorithms, VD-MOEA displays dramatic improvement of algorithm performance in both convergence and diversity preservation for handling complex problems of multi-variables, multi-objectives and multi-constraints. This provides a reliable high-performance optimization approach for the aerodynamic-structural integrated design of wind turbine blade.
Hazardous gas detection for FTIR-based hyperspectral imaging system using DNN and CNN
NASA Astrophysics Data System (ADS)
Kim, Yong Chan; Yu, Hyeong-Geun; Lee, Jae-Hoon; Park, Dong-Jo; Nam, Hyun-Woo
2017-10-01
Recently, a hyperspectral imaging system (HIS) with a Fourier Transform InfraRed (FTIR) spectrometer has been widely used due to its strengths in detecting gaseous fumes. Even though numerous algorithms for detecting gaseous fumes have already been studied, it is still difficult to detect target gases properly because of atmospheric interference substances and unclear characteristics of low concentration gases. In this paper, we propose detection algorithms for classifying hazardous gases using a deep neural network (DNN) and a convolutional neural network (CNN). In both the DNN and CNN, spectral signal preprocessing, e.g., offset, noise, and baseline removal, are carried out. In the DNN algorithm, the preprocessed spectral signals are used as feature maps of the DNN with five layers, and it is trained by a stochastic gradient descent (SGD) algorithm (50 batch size) and dropout regularization (0.7 ratio). In the CNN algorithm, preprocessed spectral signals are trained with 1 × 3 convolution layers and 1 × 2 max-pooling layers. As a result, the proposed algorithms improve the classification accuracy rate by 1.5% over the existing support vector machine (SVM) algorithm for detecting and classifying hazardous gases.
Vatsa, Mayank; Singh, Richa; Noore, Afzel
2008-08-01
This paper proposes algorithms for iris segmentation, quality enhancement, match score fusion, and indexing to improve both the accuracy and the speed of iris recognition. A curve evolution approach is proposed to effectively segment a nonideal iris image using the modified Mumford-Shah functional. Different enhancement algorithms are concurrently applied on the segmented iris image to produce multiple enhanced versions of the iris image. A support-vector-machine-based learning algorithm selects locally enhanced regions from each globally enhanced image and combines these good-quality regions to create a single high-quality iris image. Two distinct features are extracted from the high-quality iris image. The global textural feature is extracted using the 1-D log polar Gabor transform, and the local topological feature is extracted using Euler numbers. An intelligent fusion algorithm combines the textural and topological matching scores to further improve the iris recognition performance and reduce the false rejection rate, whereas an indexing algorithm enables fast and accurate iris identification. The verification and identification performance of the proposed algorithms is validated and compared with other algorithms using the CASIA Version 3, ICE 2005, and UBIRIS iris databases.
Machine learning for outcome prediction of acute ischemic stroke post intra-arterial therapy.
Asadi, Hamed; Dowling, Richard; Yan, Bernard; Mitchell, Peter
2014-01-01
Stroke is a major cause of death and disability. Accurately predicting stroke outcome from a set of predictive variables may identify high-risk patients and guide treatment approaches, leading to decreased morbidity. Logistic regression models allow for the identification and validation of predictive variables. However, advanced machine learning algorithms offer an alternative, in particular, for large-scale multi-institutional data, with the advantage of easily incorporating newly available data to improve prediction performance. Our aim was to design and compare different machine learning methods, capable of predicting the outcome of endovascular intervention in acute anterior circulation ischaemic stroke. We conducted a retrospective study of a prospectively collected database of acute ischaemic stroke treated by endovascular intervention. Using SPSS®, MATLAB®, and Rapidminer®, classical statistics as well as artificial neural network and support vector algorithms were applied to design a supervised machine capable of classifying these predictors into potential good and poor outcomes. These algorithms were trained, validated and tested using randomly divided data. We included 107 consecutive acute anterior circulation ischaemic stroke patients treated by endovascular technique. Sixty-six were male and the mean age of 65.3. All the available demographic, procedural and clinical factors were included into the models. The final confusion matrix of the neural network, demonstrated an overall congruency of ∼ 80% between the target and output classes, with favourable receiving operative characteristics. However, after optimisation, the support vector machine had a relatively better performance, with a root mean squared error of 2.064 (SD: ± 0.408). We showed promising accuracy of outcome prediction, using supervised machine learning algorithms, with potential for incorporation of larger multicenter datasets, likely further improving prediction. Finally, we propose that a robust machine learning system can potentially optimise the selection process for endovascular versus medical treatment in the management of acute stroke.
Carré, Clément; Mas, André; Krouk, Gabriel
2017-01-01
Inferring transcriptional gene regulatory networks from transcriptomic datasets is a key challenge of systems biology, with potential impacts ranging from medicine to agronomy. There are several techniques used presently to experimentally assay transcription factors to target relationships, defining important information about real gene regulatory networks connections. These techniques include classical ChIP-seq, yeast one-hybrid, or more recently, DAP-seq or target technologies. These techniques are usually used to validate algorithm predictions. Here, we developed a reverse engineering approach based on mathematical and computer simulation to evaluate the impact that this prior knowledge on gene regulatory networks may have on training machine learning algorithms. First, we developed a gene regulatory networks-simulating engine called FRANK (Fast Randomizing Algorithm for Network Knowledge) that is able to simulate large gene regulatory networks (containing 10 4 genes) with characteristics of gene regulatory networks observed in vivo. FRANK also generates stable or oscillatory gene expression directly produced by the simulated gene regulatory networks. The development of FRANK leads to important general conclusions concerning the design of large and stable gene regulatory networks harboring scale free properties (built ex nihilo). In combination with supervised (accepting prior knowledge) support vector machine algorithm we (i) address biologically oriented questions concerning our capacity to accurately reconstruct gene regulatory networks and in particular we demonstrate that prior-knowledge structure is crucial for accurate learning, and (ii) draw conclusions to inform experimental design to performed learning able to solve gene regulatory networks in the future. By demonstrating that our predictions concerning the influence of the prior-knowledge structure on support vector machine learning capacity holds true on real data ( Escherichia coli K14 network reconstruction using network and transcriptomic data), we show that the formalism used to build FRANK can to some extent be a reasonable model for gene regulatory networks in real cells.
Cloud Detection of Optical Satellite Images Using Support Vector Machine
NASA Astrophysics Data System (ADS)
Lee, Kuan-Yi; Lin, Chao-Hung
2016-06-01
Cloud covers are generally present in optical remote-sensing images, which limit the usage of acquired images and increase the difficulty of data analysis, such as image compositing, correction of atmosphere effects, calculations of vegetation induces, land cover classification, and land cover change detection. In previous studies, thresholding is a common and useful method in cloud detection. However, a selected threshold is usually suitable for certain cases or local study areas, and it may be failed in other cases. In other words, thresholding-based methods are data-sensitive. Besides, there are many exceptions to control, and the environment is changed dynamically. Using the same threshold value on various data is not effective. In this study, a threshold-free method based on Support Vector Machine (SVM) is proposed, which can avoid the abovementioned problems. A statistical model is adopted to detect clouds instead of a subjective thresholding-based method, which is the main idea of this study. The features used in a classifier is the key to a successful classification. As a result, Automatic Cloud Cover Assessment (ACCA) algorithm, which is based on physical characteristics of clouds, is used to distinguish the clouds and other objects. In the same way, the algorithm called Fmask (Zhu et al., 2012) uses a lot of thresholds and criteria to screen clouds, cloud shadows, and snow. Therefore, the algorithm of feature extraction is based on the ACCA algorithm and Fmask. Spatial and temporal information are also important for satellite images. Consequently, co-occurrence matrix and temporal variance with uniformity of the major principal axis are used in proposed method. We aim to classify images into three groups: cloud, non-cloud and the others. In experiments, images acquired by the Landsat 7 Enhanced Thematic Mapper Plus (ETM+) and images containing the landscapes of agriculture, snow area, and island are tested. Experiment results demonstrate the detection accuracy of the proposed method is better than related methods.
NASA Astrophysics Data System (ADS)
Ma, Xu; Li, Yanqiu; Guo, Xuejia; Dong, Lisong
2012-03-01
Optical proximity correction (OPC) and phase shifting mask (PSM) are the most widely used resolution enhancement techniques (RET) in the semiconductor industry. Recently, a set of OPC and PSM optimization algorithms have been developed to solve for the inverse lithography problem, which are only designed for the nominal imaging parameters without giving sufficient attention to the process variations due to the aberrations, defocus and dose variation. However, the effects of process variations existing in the practical optical lithography systems become more pronounced as the critical dimension (CD) continuously shrinks. On the other hand, the lithography systems with larger NA (NA>0.6) are now extensively used, rendering the scalar imaging models inadequate to describe the vector nature of the electromagnetic field in the current optical lithography systems. In order to tackle the above problems, this paper focuses on developing robust gradient-based OPC and PSM optimization algorithms to the process variations under a vector imaging model. To achieve this goal, an integrative and analytic vector imaging model is applied to formulate the optimization problem, where the effects of process variations are explicitly incorporated in the optimization framework. The steepest descent algorithm is used to optimize the mask iteratively. In order to improve the efficiency of the proposed algorithms, a set of algorithm acceleration techniques (AAT) are exploited during the optimization procedure.
NASA Astrophysics Data System (ADS)
Mahvash Mohammadi, Neda; Hezarkhani, Ardeshir
2018-07-01
Classification of mineralised zones is an important factor for the analysis of economic deposits. In this paper, the support vector machine (SVM), a supervised learning algorithm, based on subsurface data is proposed for classification of mineralised zones in the Takht-e-Gonbad porphyry Cu-deposit (SE Iran). The effects of the input features are evaluated via calculating the accuracy rates on the SVM performance. Ultimately, the SVM model, is developed based on input features namely lithology, alteration, mineralisation, the level and, radial basis function (RBF) as a kernel function. Moreover, the optimal amount of parameters λ and C, using n-fold cross-validation method, are calculated at level 0.001 and 0.01 respectively. The accuracy of this model is 0.931 for classification of mineralised zones in the Takht-e-Gonbad porphyry deposit. The results of the study confirm the efficiency of SVM method for classification the mineralised zones.
NASA Astrophysics Data System (ADS)
Yashvantrai Vyas, Bhargav; Maheshwari, Rudra Prakash; Das, Biswarup
2016-06-01
Application of series compensation in extra high voltage (EHV) transmission line makes the protection job difficult for engineers, due to alteration in system parameters and measurements. The problem amplifies with inclusion of electronically controlled compensation like thyristor controlled series compensation (TCSC) as it produce harmonics and rapid change in system parameters during fault associated with TCSC control. This paper presents a pattern recognition based fault type identification approach with support vector machine. The scheme uses only half cycle post fault data of three phase currents to accomplish the task. The change in current signal features during fault has been considered as discriminatory measure. The developed scheme in this paper is tested over a large set of fault data with variation in system and fault parameters. These fault cases have been generated with PSCAD/EMTDC on a 400 kV, 300 km transmission line model. The developed algorithm has proved better for implementation on TCSC compensated line with its improved accuracy and speed.
Lu, Chi-Jie; Chang, Chi-Chang
2014-01-01
Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue of operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses the ICA to extract hidden information from the observed sales data. The extracted features are then applied to K-means algorithm for clustering the sales data into several disjoined clusters. Finally, the SVR forecasting models are applied to each group to generate final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting.
NASA Astrophysics Data System (ADS)
Imani, Moslem; You, Rey-Jer; Kuo, Chung-Yen
2014-10-01
Sea level forecasting at various time intervals is of great importance in water supply management. Evolutionary artificial intelligence (AI) approaches have been accepted as an appropriate tool for modeling complex nonlinear phenomena in water bodies. In the study, we investigated the ability of two AI techniques: support vector machine (SVM), which is mathematically well-founded and provides new insights into function approximation, and gene expression programming (GEP), which is used to forecast Caspian Sea level anomalies using satellite altimetry observations from June 1992 to December 2013. SVM demonstrates the best performance in predicting Caspian Sea level anomalies, given the minimum root mean square error (RMSE = 0.035) and maximum coefficient of determination (R2 = 0.96) during the prediction periods. A comparison between the proposed AI approaches and the cascade correlation neural network (CCNN) model also shows the superiority of the GEP and SVM models over the CCNN.
An IPSO-SVM algorithm for security state prediction of mine production logistics system
NASA Astrophysics Data System (ADS)
Zhang, Yanliang; Lei, Junhui; Ma, Qiuli; Chen, Xin; Bi, Runfang
2017-06-01
A theoretical basis for the regulation of corporate security warning and resources was provided in order to reveal the laws behind the security state in mine production logistics. Considering complex mine production logistics system and the variable is difficult to acquire, a superior security status predicting model of mine production logistics system based on the improved particle swarm optimization and support vector machine (IPSO-SVM) is proposed in this paper. Firstly, through the linear adjustments of inertia weight and learning weights, the convergence speed and search accuracy are enhanced with the aim to deal with situations associated with the changeable complexity and the data acquisition difficulty. The improved particle swarm optimization (IPSO) is then introduced to resolve the problem of parameter settings in traditional support vector machines (SVM). At the same time, security status index system is built to determine the classification standards of safety status. The feasibility and effectiveness of this method is finally verified using the experimental results.
NASA Astrophysics Data System (ADS)
Chen, Jing; Qiu, Xiaojie; Yin, Cunyi; Jiang, Hao
2018-02-01
An efficient method to design the broadband gain-flattened Raman fiber amplifier with multiple pumps is proposed based on least squares support vector regression (LS-SVR). A multi-input multi-output LS-SVR model is introduced to replace the complicated solving process of the nonlinear coupled Raman amplification equation. The proposed approach contains two stages: offline training stage and online optimization stage. During the offline stage, the LS-SVR model is trained. Owing to the good generalization capability of LS-SVR, the net gain spectrum can be directly and accurately obtained when inputting any combination of the pump wavelength and power to the well-trained model. During the online stage, we incorporate the LS-SVR model into the particle swarm optimization algorithm to find the optimal pump configuration. The design results demonstrate that the proposed method greatly shortens the computation time and enhances the efficiency of the pump parameter optimization for Raman fiber amplifier design.
Application of biomonitoring and support vector machine in water quality assessment*
Liao, Yue; Xu, Jian-yu; Wang, Zhu-wei
2012-01-01
The behavior of schools of zebrafish (Danio rerio) was studied in acute toxicity environments. Behavioral features were extracted and a method for water quality assessment using support vector machine (SVM) was developed. The behavioral parameters of fish were recorded and analyzed during one hour in an environment of a 24-h half-lethal concentration (LC50) of a pollutant. The data were used to develop a method to evaluate water quality, so as to give an early indication of toxicity. Four kinds of metal ions (Cu2+, Hg2+, Cr6+, and Cd2+) were used for toxicity testing. To enhance the efficiency and accuracy of assessment, a method combining SVM and a genetic algorithm (GA) was used. The results showed that the average prediction accuracy of the method was over 80% and the time cost was acceptable. The method gave satisfactory results for a variety of metal pollutants, demonstrating that this is an effective approach to the classification of water quality. PMID:22467374
2014-01-01
Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue of operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses the ICA to extract hidden information from the observed sales data. The extracted features are then applied to K-means algorithm for clustering the sales data into several disjoined clusters. Finally, the SVR forecasting models are applied to each group to generate final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting. PMID:25045738
Zhou, Pei-pei; Shan, Jin-feng; Jiang, Jian-lan
2015-12-01
To optimize the optimal microwave-assisted extraction method of curcuminoids from Curcuma longa. On the base of single factor experiment, the ethanol concentration, the ratio of liquid to solid and the microwave time were selected for further optimization. Support Vector Regression (SVR) and Central Composite Design-Response Surface Methodology (CCD) algorithm were utilized to design and establish models respectively, while Particle Swarm Optimization (PSO) was introduced to optimize the parameters of SVR models and to search optimal points of models. The evaluation indicator, the sum of curcumin, demethoxycurcumin and bisdemethoxycurcumin by HPLC, were used. The optimal parameters of microwave-assisted extraction were as follows: ethanol concentration of 69%, ratio of liquid to solid of 21 : 1, microwave time of 55 s. On those conditions, the sum of three curcuminoids was 28.97 mg/g (per gram of rhizomes powder). Both the CCD model and the SVR model were credible, for they have predicted the similar process condition and the deviation of yield were less than 1.2%.